Article

Smartphone IMU Sensors for Human Identification through Hip Joint Angle Analysis

1 Department of Electrical Engineering, Mathematics and Science, University of Gävle, 801 76 Gävle, Sweden
2 Departamento de Ingeniería Mecánica, Energética y de los Materiales, Escuela de Ingenierías Industriales, Universidad de Extremadura, 06006 Badajoz, Spain
* Author to whom correspondence should be addressed.
Sensors 2024, 24(15), 4769; https://doi.org/10.3390/s24154769
Submission received: 30 April 2024 / Revised: 19 July 2024 / Accepted: 20 July 2024 / Published: 23 July 2024

Abstract

Gait monitoring using hip joint angles offers a promising approach for person identification, leveraging the capabilities of smartphone inertial measurement units (IMUs). This study investigates the use of smartphone IMUs to extract hip joint angles for distinguishing individuals based on their gait patterns. The data were collected from 10 healthy subjects (8 males, 2 females) walking on a treadmill at 4 km/h for 10 min. A sensor fusion technique that combined accelerometer, gyroscope, and magnetometer data was used to derive meaningful hip joint angles. We employed various machine learning algorithms within the WEKA environment to classify subjects based on their hip joint pattern and achieved a classification accuracy of 88.9%. Our findings demonstrate the feasibility of using hip joint angles for person identification, providing a baseline for future research in gait analysis for biometric applications. This work underscores the potential of smartphone-based gait analysis in personal identification systems.

1. Introduction

The past few decades have witnessed unprecedented advancements in smartphone technologies. These handheld devices, equipped with a diverse array of sensors, grow more sophisticated each year [1]. The sensors range from accelerometers, magnetometers, and gyroscopes to environmental sensors such as ambient light and temperature sensors [2,3].
Smartphone sensors have already found valuable applications across multiple fields, including health and rehabilitation as well as fitness, where they track physical activity, monitor heart rate, and measure sleep patterns [4,5]. In the automotive industry, they are employed for vehicle navigation and accident detection [6]. Environmental monitoring and augmented reality also benefit from the data collection and processing capabilities of smartphone sensors [7]. Building upon these technological advancements, smartphones have also emerged as a robust means of human identification through gait recognition [8,9,10].
Gait monitoring using sensory data has gained increasing research attention over the years, as gait is a unique biometric pattern of human locomotion. Every locomotion pattern is unique due to variations in magnitude and timing among people, yet identifiable because it is performed naturally and habitually every day and involves many muscles and joints [8]. Like other biometric data such as iris scans, voice samples, and fingerprints, gait analysis has appeared in applications for security and identification purposes. Unlike those biometrics, however, gait patterns have a multidimensional and convoluted nature, making them extremely difficult to mimic or steal [11].
Gait can be captured by foot pressure sensors, wearable sensors, or a vision-based system using multiple cameras or stereo vision [12,13]. However, foot pressure sensors and vision-based systems require a spacious work area, are costly, and remain restricted to laboratory research that depends on complex multi-sensor settings and specialized personnel, which makes them impractical for many applications [14,15]. Therefore, the integration of smartphone sensors for gait analysis and person recognition offers numerous potential benefits [16,17,18]. First, the availability and affordability of smartphone sensors make them accessible to a broad demographic; in fact, 85.74% of the world population used smartphones in 2023, as reported in [19]. Second, their portability and user-friendly interfaces render them ideal for delivering gait recognition in various settings, from hospitals to home environments [20].
Numerous studies have investigated person recognition through gait analysis, employing various techniques and sensor placements. For example, Hoang et al. demonstrated the feasibility of gait recognition using smartphone accelerometers, achieving up to 92.7% accuracy [21]. Connor and Ross provided a comprehensive survey on gait recognition modalities, highlighting the effectiveness of full-body motion capture and foot pressure patterns [12]. Makihara et al. discussed the use of multiple joints and their combined movement patterns for accurate identification [22]. While these studies used multiple sensor inputs, our focus on hip movement is justified by the work of Derawi et al., which showed that hip rotation patterns are highly individualistic and can be effectively captured by waist-mounted accelerometers. Concentrating on hip movement reduces the impact of variations in arm swing or upper body movements, making this approach more user-friendly [23].
Hip joint analysis offers several advantages: it is central to gait biomechanics as the primary connection between the lower limbs and the trunk [24]; requires fewer sensors than full-body analysis [25]; preserves privacy better than facial recognition or full-body gait analysis [26]; is sensitive to pathological, neurological, and musculoskeletal conditions [27]; applies directly to many rehabilitation protocols [28]; and correlates strongly with energy expenditure during walking, as 45% of the mechanical energy comes from the hip joint [29,30]. These benefits, combined with smartphone ubiquity, make hip joint analysis a promising approach to study for gait movement monitoring and person recognition.
However, gait movement data can be hard to interpret or analyze [15,31]; thus, researchers apply machine learning techniques to recognize gaits and analyze data captured by cameras, force-based systems, or force-sensitive resistors. For this reason, multiple machine learning algorithms, such as the support vector machine (SVM), neural networks (NNs), long short-term memory neural networks (LSTM NNs), recurrent neural networks (RNNs), naive Bayes (NB), linear discriminant analysis (LDA), and hybrid convolutional neural networks (HCNNs), were utilized in previous studies for gait recognition and classification to extract a comprehensive understanding of biometric data based on human movements [9,32,33]. Many ML techniques have shown superior performance and powerful use in various fields, including human recognition, manufacturing, robotics, quality inspection, sports performance analysis, and medical diagnosis [31,34].
Therefore, this study explores the use of smartphone IMU sensor data with ML classification techniques, including the perceptron, logistic regression, the nearest neighbor rule, naive Bayes, and random forests, to recognize persons based on their hip joint angle. We validate the smartphone gait measurements by comparing them with a motion capture system (MCS) using a pendulum test bench. Our future goal is to utilize this identification approach in a rehabilitation hip joint exoskeleton system within a healthcare environment, where data can be shared anonymously with a healthcare system to recognize patients before giving or following up on their treatments, avoiding the exchange of sensitive personal information.
The paper is structured as follows: Section 2 describes the theory behind the steps conducted in the study, while Section 3 discusses the methods used throughout this research, with subsections that shed light on the measurement comparison, the test bench with the MCS and smartphone configuration, and the machine learning analysis. Section 4 is devoted to presenting the results and discussion. Finally, Section 5 concludes with the paper’s findings and analysis.

2. Theory

To conduct a hip joint analysis using smartphone sensors, we outline the theoretical foundations of our research, beginning with the rationale for human gait analysis, followed by the role of inertial measurement units (IMUs) in smartphones and the signal-processing techniques used.

2.1. Human Gait Analysis

Human walking is fundamentally a periodic activity characterized by repetitive motions of body segments. Human gait analysis is pivotal in rehabilitation for the precise identification of walking abnormalities and biomechanical inefficiencies. By analyzing gait patterns, clinicians can tailor rehabilitation programs to address specific deficits, thereby enhancing the efficacy of interventions and improving patient outcomes, or combine them with robotic exoskeleton rehabilitation sessions [35]. Since exoskeleton devices comprise sensors, actuators, and electronic circuits operating in close contact with patients, understanding the intricacies of gait mechanics is pivotal for ensuring that these devices operate safely and reliably, for identifying walking abnormalities in orthotics or prosthetic leg users, and for optimizing movements to mitigate posture-related issues [30].
Gait analysis can include kinematic, kinetic, or EMG measurements. Gait kinematics describes the motion of the major joints and segments of the lower extremity, such as the hip, knee, ankle, and foot, while gait kinetics studies the forces that result from the movement of human gait segments, including the ground reaction force, joint reaction force, and joint torque. EMG sensors measure the electrical activity of the muscles that control the movement of these segments, such as the quadriceps, hamstrings, gastrocnemius, and tibialis anterior [35]. Among these techniques, the measurement of gait kinematics is pivotal for recognizing the gait phases, joint angles, and segment movements.
To systematically evaluate human gait in a kinematics-based manner, an MCS, with the use of computer vision techniques, stands out for its unparalleled accuracy, utilizing either marker-based or markerless methods to achieve results precise to within 1 mm [36]. These systems use reflective markers placed on the person’s body or markerless image processing techniques to capture human gait. However, these systems are expensive and out of reach for many clinicians, especially in developing countries [37].
Thus, alternative kinematic measurements can be obtained using wearable sensors such as inertial measurement units (IMUs) or magnetic, angular rate, and gravity (MARG) sensors fused in the micro-electro-mechanical systems (MEMSs) of smartphones, which work both indoors and outdoors, in contrast with MCSs, which are confined to laboratories.

2.2. IMUs and MARG in Smartphones

Smartphones have multiple embedded motion sensors, such as accelerometers, gyroscopes, and magnetometers, that can offer alternative solutions to external IMUs for gait recognition. In addition, IMU and MARG sensors possess multiple advantages, such as low power consumption, light weight, and ease of use [38,39]. These are three-axial sensors that capture the acceleration $\mathbf{a}_i$ along the X, Y, and Z axes, corresponding to the roll ($\phi$), pitch ($\theta$), and yaw ($\psi$) axes, as shown in Figure 1 and represented in (1):

$$\mathbf{a}_i = \begin{bmatrix} a_x & a_y & a_z \end{bmatrix}^T. \quad (1)$$
The gyroscope and magnetometer are also three-axial sensors. The accelerometer gauges three-axial linear acceleration ($\mathbf{a}_i$), the gyroscope quantifies angular velocity ($\boldsymbol{\omega}_i$), and the magnetometer assesses the magnetic field ($\mathbf{m}_i$). These sensors' data can be used to analyze human gait movements and to find the joint trajectories by fusing them to obtain the orientation angles [35].

2.3. The Sensor Fusion and Signal Processing

The sensor measurements of MARG and IMU units usually exhibit intrinsic drift and a high degree of noise, making it challenging to reconstruct trajectories and estimate orientation directly [41]. Estimating the orientation of the smartphone ($\phi$, $\theta$, $\psi$) from the angular velocities ($\boldsymbol{\omega}$) of the gyroscope alone gives inaccurate results, as the measurements include a bias ($\mathbf{b}_\omega$, low-frequency noise) and Gaussian noise ($\mathbf{W}_{\text{Noise}}$) on top of the true angular velocity ($\boldsymbol{\omega}_{\text{true}}$); integrating these errors causes the orientation estimate to drift over time. Similarly, using only accelerometer measurements ($\mathbf{a}$) is inadequate for orientation estimation, as they include a bias ($\mathbf{b}_a$), noise ($\mathbf{W}_{\text{Noise}}$), gravitational acceleration ($\mathbf{a}_{\text{gravity}}$), and non-gravitational acceleration ($\mathbf{a}_{\text{true}}$), which lead to an estimate contaminated by high-frequency noise. The accumulated integration errors of the gyroscope can be corrected by fusing its measurements with those of the accelerometer and magnetometer in one of the sensor fusion techniques [38,42]. However, the magnetometer measurements ($\mathbf{m}$) suffer from magnetic field interference ($\mathbf{m}_{\text{init}}$), bias ($\mathbf{b}_m$), and noise ($\mathbf{W}_{\text{Noise}}$) in addition to the true magnetometer measurement ($\mathbf{m}_{\text{true}}$), as shown in Equations (2)-(4) [43].

$$\boldsymbol{\omega} = \boldsymbol{\omega}_{\text{true}} + \mathbf{b}_\omega + \mathbf{W}_{\text{Noise}}, \quad (2)$$
$$\mathbf{a} = \mathbf{a}_{\text{true}} + \mathbf{a}_{\text{gravity}} + \mathbf{b}_a + \mathbf{W}_{\text{Noise}}, \quad (3)$$
$$\mathbf{m} = \mathbf{m}_{\text{true}} + \mathbf{m}_{\text{init}} + \mathbf{b}_m + \mathbf{W}_{\text{Noise}}. \quad (4)$$
Therefore, sensor fusion is an essential procedure for determining an accurate orientation. This technique combines data from multiple sensors to provide a more reliable and comprehensive, drift-free and noise-free estimate of spatial orientation [37].
In the literature, sensor fusion algorithms (SFAs) are predominantly categorized into deterministic and stochastic frameworks. Within the deterministic paradigm, algorithms such as linear complementary filters (LCFs) and nonlinear complementary filters (NCFs) are commonly employed. In contrast, the stochastic domain encompasses a diverse array of algorithms, including linear Kalman filters (LKFs), extended Kalman filters (EKFs), complementary Kalman filters (CKFs), square root unscented Kalman filters (SRUKFs), and square root cubature Kalman filters (SRCKFs) [44].
It is noteworthy that LKF has been utilized for orientation estimation in MARG and IMU sensor arrays, but it has limitations in adequately addressing the inherent non-linearities present in real-time systems [37]. Thus, many studies have proposed advanced SFAs such as the EKF used in the attitude heading reference system (AHRS) [45].
AHRS is an indirect, quaternion-based EKF algorithm with the ability to estimate magnetic disturbances, which mitigates the effect of the interference $\mathbf{m}_{\text{init}}$ and makes it robust in various applications [42]. It consists of a two-step process, prediction and correction, to determine the orientation $q(\phi, \theta, \psi)$, as illustrated in [46] and shown in Figure 2 [47]. The prediction phase integrates the angular velocity ($\boldsymbol{\omega}$) from the gyroscope, while the correction phase utilizes the acceleration signals ($\mathbf{a}$) and magnetometer readings ($\mathbf{m}$). The AHRS algorithm updates the orientation estimate $q$ by comparing it with the predicted orientation, minimizing the error between the estimated and actual orientation through iterative corrections based on the accelerometer and the magnetometer [48].
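As a concrete illustration, this fusion step can be reproduced with the Sensor Fusion and Tracking Toolbox mentioned in Section 3.3.1. The MATLAB sketch below is a minimal example rather than the exact study code; acc, gyr, and mag are assumed names for N-by-3 arrays of raw smartphone samples.

% Minimal AHRS fusion sketch, assuming the Sensor Fusion and Tracking
% Toolbox. acc (m/s^2), gyr (rad/s), and mag (uT) are N-by-3 arrays of
% raw smartphone samples recorded at 100 Hz (illustrative names).
fs = 100;                                    % sampling rate (Hz)
fuse = ahrsfilter('SampleRate', fs);         % indirect, quaternion-based EKF
[orientation, ~] = fuse(acc, gyr, mag);      % prediction: gyr; correction: acc, mag
eul = eulerd(orientation, 'ZYX', 'frame');   % [yaw pitch roll] in degrees
pitch = eul(:, 2);                           % pitch, used for sagittal-plane angles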

3. Methods

3.1. Participants

In this study, 10 subjects (8 males and 2 females) were asked to walk on a treadmill (BH F2W Dual) for 10 min at a constant normal walking speed of 4 km/h. Summary statistics (mean ± standard deviation (SD)) were as follows: age 33.7 ± 7.65 years (range 22–45), weight 63.75 ± 10.33 kg, height 1.74 ± 0.09 m, and body mass index 21.75 ± 2.24 kg/m².
The subjects were in good health and free of any visible walking impairments. All subjects were instructed to participate in a preliminary warm-up session for 5 min of walking on a treadmill to guarantee the safety of the participants and familiarize them with the treadmill environment before the experimental measurements. All experimental procedures were approved by the Local Ethics Committee at the University of Extremadura.

3.2. Comparison of Measurements

To compare the angle estimation using a smartphone with a motion capture system (MCS), a pendulum was set on a kinematic test bench, located in the mechanical engineering laboratory at the University of Extremadura, as shown in Figure 3a. It consisted of a smartphone placed at one end of a link, while the other end articulated at a fixed point (referred to as point O), allowing it to rotate 360°. This same point coincided with the center of a goniometer, enabling the user to select the initial amplitude ($\theta_0$) at which the oscillation began at time $t = 0$.
The idea was to compare the angle measurements of a pendulum using a smartphone and an MCS. The MCS consisted of 8 cameras (Optitrack, Natural Point, Corvallis, OR, USA). The cameras’ frame rate was set at 100 Hz. Before starting the recordings, a calibration of the space was performed using a calibration square on the floor and a T-wand. To measure the angle, three markers were placed on the test bench, one on the test bench corner, another marker at the tip of the pendulum axis, and the last one at the base of the pendulum. Figure 3a shows the configuration of the camera system and the position of the markers on the test bench for the pendulum measurements.
The comparison was conducted by moving the pendulum at small angles (around 10°). Given the forces acting on the pendulum, as shown in Figure 3b, the pendulum angle can also be calculated theoretically from the differential equation

$$\ddot{\theta} + \frac{g}{l}\,\theta = 0, \quad (5)$$

where $\ddot{\theta}$ represents the pendulum's angular acceleration, $g$ is the acceleration of gravity, $l$ is the length of the pendulum, and $\theta$ is the angular position.
Consider the equation for simple harmonic motion,

$$\theta(t) = A \sin(\omega t + \phi), \quad (6)$$

where $A$ is the amplitude of motion, $\omega$ denotes the angular frequency, $t$ is the time variable, and $\phi$ is the phase angle. Additionally, the term $g/l$ multiplying $\theta$ in Equation (5) is the square of the angular frequency, $\omega^2$. Therefore, with the initial conditions $\theta(0) = \theta_0$ and $\dot{\theta}(0) = 0$, and for small angles (around 10°), the equations that describe the kinematics of the pendulum are

$$\theta(t) = \theta_0 \sin\left(\omega t + \frac{\pi}{2}\right), \quad (7)$$
$$\dot{\theta}(t) = \omega\,\theta_0 \cos\left(\omega t + \frac{\pi}{2}\right), \quad (8)$$
$$\ddot{\theta}(t) = -\omega^2\,\theta_0 \sin\left(\omega t + \frac{\pi}{2}\right), \quad (9)$$

where $\theta(t)$, $\dot{\theta}(t)$, and $\ddot{\theta}(t)$ are the angular position, angular velocity, and angular acceleration, all three depending on time and the initial angular position $\theta_0$.
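For reference, the theoretical trajectories of Equations (7)-(9) can be generated directly. The sketch below uses illustrative values for the pendulum length and initial amplitude, not the actual bench dimensions.

g = 9.81;                       % gravitational acceleration (m/s^2)
l = 0.5;                        % pendulum length (m), illustrative value
theta0 = deg2rad(10);           % initial amplitude, around 10 degrees
w = sqrt(g / l);                % angular frequency, from Equation (5)
t = 0:0.01:10;                  % 100 Hz time vector
theta   = theta0 * sin(w*t + pi/2);          % Equation (7): angular position
dtheta  = w * theta0 * cos(w*t + pi/2);      % Equation (8): angular velocity
ddtheta = -w^2 * theta0 * sin(w*t + pi/2);   % Equation (9): angular acceleration
plot(t, rad2deg(theta)); xlabel('t (s)'); ylabel('\theta (deg)');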

3.3. Hip Joint Identification

The identification of hip joint angles was conducted in two parts: data acquisition, followed by feature extraction and ML classification techniques.

3.3.1. Data Acquisition

For this study, two smartphones were used to measure the hip joint angle based on three-axial (X, Y, and Z) measurements: the acceleration in meters per second squared (m/s²), the angular velocity in radians per second (rad/s), and the magnetic field in microteslas (μT). One smartphone was mounted on the subject's torso, while the other was attached to the thigh in a pendulum-like configuration, as shown in Figure 4. The sampling rate for recording the data was 100 Hz.
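The raw streams can be logged into MATLAB in several ways; the sketch below shows one possibility using the mobiledev interface of MATLAB's smartphone sensor support package, offered as an assumption rather than the study's exact acquisition code (which is referenced in [51,52]).

% Sketch of smartphone sensor logging via MATLAB's mobiledev interface
% (assumes the MATLAB Mobile app and its sensor support package).
m = mobiledev;                       % connect to the phone
m.SampleRate = 100;                  % 100 Hz, as in the study
m.AccelerationSensorEnabled = 1;     % accelerometer (m/s^2)
m.AngularVelocitySensorEnabled = 1;  % gyroscope (rad/s)
m.MagneticSensorEnabled = 1;         % magnetometer (uT)
m.Logging = 1; pause(600); m.Logging = 0;   % record a 10 min trial
[acc, tA] = accellog(m);
[gyr, tG] = angvellog(m);
[mag, tM] = magfieldlog(m);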
After mounting the smartphones on the subject’s torso and thigh, an initial calibration was conducted, where the subject was standing in an upright position for 5 s. This calibration is needed for high-accuracy measurements as MEMS sensors are manufacturer-calibrated, but some errors can arise over time [43]. In addition to the calibration procedure and to mitigate the impact of wearing error, we implemented a standardized protocol for smartphone placement to minimize variability. The device was securely fastened to the lateral side of the thigh and torso using a specially designed adjustable strap, ensuring consistent positioning across participants. This approach is supported by Jayasinghe et al., who demonstrated a strong correlation (above 0.9) between loose clothing-mounted sensors and body-mounted sensors when placed on the thigh and shank. This correlation indicates that our methodology of placing smartphones on the thigh is robust against variations in wearing conditions, thus minimizing potential biases [49].
Then, with the sensor fusion technique, the sensors provide the Euler angles $\phi_1$, $\theta_1$, and $\psi_1$ for the torso from the orientation around the X, Y, and Z axes of smartphone 1, and $\phi_2$, $\theta_2$, and $\psi_2$ for the thigh from the orientation around the X, Y, and Z axes of smartphone 2. Therefore, the relative hip joint angle ($\phi_h$, $\theta_h$, $\psi_h$) was calculated as the difference between the thigh and torso angles, as illustrated in Figure 4 and the following equations:

$$\phi_h = \phi_2 - \phi_1, \quad (10)$$
$$\theta_h = \theta_2 - \theta_1, \quad (11)$$
$$\psi_h = \psi_2 - \psi_1, \quad (12)$$
where $\phi_h$ is the roll angle that represents the abduction/adduction of the hip joint, while $\theta_h$ represents the flexion/extension and $\psi_h$ the internal/external rotation, because the hip joint is mechanically modeled as a ball-and-socket joint. However, our angle of interest for this study was the hip joint flexion/extension, as shown in Figure 4, as the data will serve a planned hip rehabilitation exoskeleton that moves in the sagittal plane, like the one described in [30,50]. The functions to connect the smartphones with MATLAB version R2024a in the cloud and the AHRS filters were called using MATLAB functions and the Sensor Fusion and Tracking Toolbox, as illustrated in [51,52]. The flowchart for acquiring and processing data from the two smartphones is illustrated in Figure 5; this was the pre-stage for the feature extraction and classification techniques.
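A minimal sketch of Equations (10)-(12), assuming eul1 and eul2 are N-by-3 [yaw pitch roll] arrays (in degrees) produced by the torso and thigh AHRS filters, respectively (the variable names are illustrative):

hipRotation  = eul2(:,1) - eul1(:,1);   % psi_h: internal/external rotation
hipFlexion   = eul2(:,2) - eul1(:,2);   % theta_h: flexion/extension (angle of interest)
hipAbduction = eul2(:,3) - eul1(:,3);   % phi_h: abduction/adduction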

3.3.2. Feature Extraction and Classification Techniques

For this study, the duration of data acquisition for each participant was consistently fixed at 600 s, which was divided into 100 intervals, with each interval lasting 6 s. Each interval was considered a separate walking trial. Therefore, 100 trials were used to capture the normal walking patterns of each participant to facilitate the feature extraction process and, subsequently, the training models using ML. Furthermore, 10 classes were assigned to represent 10 subjects, and each class contained 100 feature vectors.
As people have various walking styles, we employed statistical calculations focusing on the hip joint angles, expressed in degrees, where 0° represents the upright position and positive and negative values represent flexion and extension, respectively, as shown in Figure 4 [53]. Time-domain features are extensively used in biological systems due to their lower computational complexity and ease of implementation [54]. This led us to extract nine distinct time-domain features from each trial per subject (class). The features capture the dynamic characteristics of the hip trajectory, providing critical insights into the variability and overall patterns of movement. The extracted features were the mean value (MV), median (M), maximum angle, minimum angle, covariance (COV), variance (VAR), standard deviation (SD), kurtosis (KUR), and skewness (SKE) of the 100 datasets for each subject. The detailed calculations of the nine extracted features are shown in Table 1 and discussed in [40,55].
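A sketch of the windowing and feature extraction, assuming hip is the 60,000-sample (600 s at 100 Hz) flexion/extension angle vector of one subject; kurtosis and skewness require the Statistics and Machine Learning Toolbox, and the exact covariance feature follows the definition in Table 1.

winLen = 600;                                     % 6 s x 100 Hz per trial
trials = reshape(hip(1:100*winLen), winLen, []);  % 600 x 100, one column per trial
features = [mean(trials); median(trials); max(trials); min(trials); ...
            var(trials); std(trials); kurtosis(trials); skewness(trials)]';
% features is 100 trials x 8 features; the ninth (COV) is added per Table 1.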
For hip joint angle analysis and classification, principal component analysis (PCA) was applied to the dataset with nine extracted features of the hip joint angles for 10 subjects. The purpose of PCA is to visualize the class regions in the space of predictor variables (features) and to reduce the dimensionality of the data while retaining most of the variance present in the original features [56,57,58]. Based on the explained variance, the first three principal components were selected for further analysis, as they collectively account for 99.86% of the total variance. Further details and the rationale for this selection are discussed in Section 4.3.
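A corresponding PCA sketch, assuming X is the 1000-by-9 matrix of feature vectors (10 subjects x 100 trials) and labels is a 1000-by-1 vector of subject indices:

[coeff, score, ~, ~, explained] = pca(zscore(X));  % standardize, then project
cumsum(explained)                 % first three components: ~99.86% in Table 3
scatter3(score(:,1), score(:,2), score(:,3), 15, labels, 'filled');
xlabel('PC1'); ylabel('PC2'); zlabel('PC3');       % visualization as in Figure 9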
After the PCA representation was performed, we employed a variety of ML methods from an open-source data-mining tool called the Waikato Environment for Knowledge Analysis (WEKA) software, version 3.8.6, to classify the extracted features [59,60,61,62]. The classification methods were chosen from five well-known classification categories, namely, Bayesian, function, lazy, rule, and tree classifiers, and to train the models, we chose 16 various classifier algorithms, as detailed in Table 2. The results were evaluated using classification accuracy and receiver operating characteristic (ROC). Classification accuracy measures the overall correctness of a model. The ROC curve, along with its area under the curve, evaluates a classifier’s ability to distinguish between classes across various thresholds [63,64,65].
In the context of WEKA, Bayesian classifiers, such as the Bayesian network (BayesNet) and naive Bayes (NB), use Bayes' theorem to generate probabilistic outputs. BayesNet represents a set of variables via a directed acyclic graph (DAG), where each node is a random variable and the edges are the probabilistic dependencies between variables. NB applies Bayes' theorem with strong (naive) independence assumptions among the extracted features [60,66,67]. Still, determining the optimal sample size for the training data is a crucial factor for achieving accurate classification performance, as it enables a closer approximation of the true data distribution. A novel approach has recently been introduced to estimate the minimum training sample size required for a Bayes classifier, detailed in [68]; it employs a proxy learning curve, providing a practical framework for researchers to gauge the quantity of data necessary for their models to perform effectively. For the NB classifier in particular, it is important to note the simplifying assumption of feature independence given the class label, which, despite potentially misrepresenting feature interdependencies, often yields a robust baseline for classification tasks due to its computational efficiency and surprisingly effective performance in high-dimensional settings.
The function classifiers utilize neural network and regression procedures [61]. Function classifiers such as logistic regression, the multilayer perceptron (MultiPerceptron), sequential minimal optimization (SMO), SimpleLogistic, and classification via regression (ClassViRegression) employ mathematical functions to represent relationships in the data. Both the MultiPerceptron and logistic regression are supervised classifiers, but the multilayer perceptron can model more complex feature relationships than logistic regression, which natively predicts binary outcomes [69]. SMO utilizes the support vector machine (SVM) algorithm in its training procedure, whereas SimpleLogistic is a condensed version of logistic regression. Classification problems can also be handled conveniently through regression, as in ClassViRegression.
Lazy classifiers such as KStar calculate the distance between instances by employing a probabilistic measure based on the potential transformation of one instance into another, whereas locally weighted learning (LWL) modifies the weight of each neighbor according to a distance function. The instance-based k (IBk) classifier is WEKA's k-nearest neighbor (kNN) implementation. All lazy classifiers defer model building until prediction time, making them efficient for certain datasets [70].
Rule classifiers such as Java repeated incremental pruning (JRip) implement the repeated incremental pruning to produce error reduction (RIPPER) algorithm [71]. RIPPER builds a set of classification rules by repeatedly adding rules that cover many instances while minimizing overfitting error [72]. A partial decision tree (PART) combines decision trees with rule-based learning. Instead of building a full decision tree, it derives rules by building and pruning partial decision trees, hence the name PART. PART combines partial C4.5 trees with the RIPPER algorithm in learning [73]. Both JRip and PART are known for producing models that are relatively easy to interpret.
Lastly, tree classifiers were used, as they are among the most widely used classification techniques because of their ease of implementation [74]. Among them is J48, which implements the C4.5 algorithm developed by Ross Quinlan for generating decision trees from a set of training data using the concept of information entropy [75]. Another algorithm is the logistic model tree (LMT), which builds a decision tree with simple class values at the leaves and a logistic regression model at each node; the LMT thus captures both linear and non-linear relationships in the data using logistic regression methods. Meanwhile, random forest (RF) can classify large amounts of data accurately, as it uses a multitude of decision trees and outputs the mode of the classes predicted by the individual trees. RF is robust for a large number of extracted features due to its capability to deal with overfitting [74]. For a fast decision tree learning procedure, a reduced error pruning tree (REPTree) was used. The algorithm builds a decision tree based on information gain/variance and prunes it using reduced error pruning with backfitting. REPTree uses methods from C4.5 and the REP concept in its procedures [74]. Both RF and REPTree are known for their efficiency on large datasets [76].
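The classifiers above were run in WEKA. Purely as an illustration of the analogous model families, a few counterparts can be fit in MATLAB on the PCA scores; X3 (the first three components) and y (the subject labels) are assumed names.

mdlNB   = fitcnb(X3, y);                          % naive Bayes
mdlKNN  = fitcknn(X3, y, 'NumNeighbors', 9);      % kNN, analogous to IBk with k = 9
mdlTree = fitctree(X3, y);                        % decision tree, C4.5-like
mdlRF   = fitcensemble(X3, y, 'Method', 'Bag', ...
                       'NumLearningCycles', 100); % bagged trees, RF-like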

4. Results and Discussion

4.1. Comparison of the Measuring Systems

In this initial step before data acquisition, we examined the accuracy of measurements using smartphone sensors and motion-capture systems, benchmarked against a specially designed pendulum test bed. The focus was on assessing the precision and reliability of smartphone angle measurement compared with a well-known and precise measurement system represented by the MCS, thereby establishing confidence in smartphone measurements utilizing sensor fusion algorithms. The results are shown in Figure 6.
The obtained results reveal a high degree of congruence between the two systems across the oscillatory motion of the pendulum. Notably, consistent amplitude and aligned frequency between the smartphone and MCS data were observed, with small differences in the amplitude of the two measurements. The measurement performance of the smartphone relative to the MCS can therefore be quantified using the root mean square error (RMSE), given by

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( M_{\mathrm{MCS}} - M_{\mathrm{smartphone}} \right)^2}{N}}, \quad (13)$$

where $N$ is the number of observation samples over time, which for these measurements is $N = 6500$; $M_{\mathrm{MCS}}$ represents the reference measurement from the MCS; and $M_{\mathrm{smartphone}}$ is the measurement conducted by the smartphone.
The calculated RMSE of 0.34 indicates the efficacy of the smartphone in capturing the pendulum's motion. Meanwhile, the attenuation in amplitude over time was consistent across both measurement modalities, suggesting a linear damping characteristic likely attributable to aerodynamic drag and mechanical friction at the pivot point. Furthermore, we noticed that the amplitude from the smartphone was slightly lower than that of the MCS, but this did not affect the ability to identify subjects based on their gait patterns.
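For completeness, Equation (13) amounts to a one-line computation; angleMCS and anglePhone are assumed names for the time-aligned angle traces.

N = numel(angleMCS);                                   % 6500 samples here
rmseVal = sqrt(sum((angleMCS - anglePhone).^2) / N);   % 0.34 in this study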

4.2. Hip Joint Angles

Through sensor fusion procedures in MATLAB, we transform raw data from accelerometer, gyroscope, and magnetometer sensors into hip joint angle measurements. The sensor fusion technique is detailed in the flowchart shown in Figure 5.
This study pays special attention to movements within the sagittal plane, underpinning its relevance to rehabilitation scenarios. Therefore, we focus on the pitch angle output by the AHRS algorithm, as shown in Figure 2. The acceleration, angular velocity, and magnetic field measurements involved in the sensor fusion for a subject under the experimental test can be seen in Figure 7, together with the resulting hip joint angles produced by the sensor fusion algorithm.
Although the graph shows 20 s, large differences between steps are not expected, as the subject walked on a treadmill at a constant speed on a level surface (a controlled environment); a human walking style should not change much within 20 s. However, the measurement data showed that there were differences in hip joint angles among the 10 subjects (people have various walking styles) and slight changes (a few degrees) within a subject's steps when the subject became tired [53]. Within this context, some of the features extracted from the hip joint angle, such as the mean value and the median, are shown for all subjects in Figure 8.

4.3. Classification Analysis

Principal component analysis (PCA) was applied to the dataset with nine features to visualize their relationships, reduce the dimensionality of the data, and capture the variance in the data. For our study, we initially considered all nine components, as each component captures a part of the total variance. However, after analyzing the explained variance that is shown in Table 3, we determined that the first three principal components collectively account for 99.86% of the total variance, which is substantial and sufficient for our analysis. Therefore, we decided to use the first three PCA components for visualization and further analysis, ensuring that most of the information is retained while simplifying the dataset by reducing the dimensionality from nine to three.
The choice of the first three components is visually supported by the 3D PCA plot in Figure 9, where each data point represents a feature vector and each color represents one of the 10 subjects. The three axes ($PC_1$, $PC_2$, $PC_3$) represent the directions of maximum variance in the data: $PC_1$ accounts for the most variance, followed by $PC_2$ and then $PC_3$. Each subject, demarcated by a unique color, forms a cluster of data points, and the distinct separation between subjects' clusters demonstrates the efficacy of the dimensionality reduction. The proximal clustering within each subject suggests high intra-subject consistency in gait features, while the spatial segregation between subjects underscores the inter-subject variability. Well-separated clusters indicate that the PCA distinguishes effectively between the different subjects' gait patterns, and any points far from the main clusters could be considered outliers. We do not see any significant outliers, which suggests that the gait patterns are relatively consistent within each subject.
The trajectories formed by the points (from one end of the graph to the other) can indicate the progression of the gait cycle for each subject. Subjects 1, 3, and 9 form tight clusters, indicative of consistent and stable gait patterns with little variation. Subjects 2, 4, and 5 are more spread along the principal component axes, which may imply variability in specific gait characteristics; however, the spread is controlled, suggesting that these variations are systematic and could relate to individual walking styles or physiological differences. Subject 6 shows a distinct distribution, potentially indicating unique gait features that differ significantly from the other subjects, while subjects 7 and 8 show a spread in the PCA space that suggests unique gait patterns. These subjects may have gait features that are less common among the cohort, which could be indicative of unique biomechanical traits. Lastly, subject 10 has its data points isolated from the rest, particularly along PC3. Such separation suggests that this subject has a gait pattern with distinct characteristics not shared with the other subjects, which could be of particular interest for specific gait analysis.
Sixteen algorithms from five machine learning classifier categories were trained and tested on the hip joint angles based on the nine extracted features. The data were trained by running each algorithm 10 times using 10-fold cross-validation in WEKA. Cross-validation involves splitting the data into 10 subsets, with 1 subset used for testing and the rest for training in each iteration. Additionally, stratified cross-validation is used to maintain the class distribution across folds. The final estimate is the average of the 10 iterations, with an optional standard deviation. Ten folds are preferred due to their proven accuracy and theoretical support, and repeated stratified cross-validation further enhances reliability.
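A sketch of this protocol, assuming X3 and y as before and using kNN as the example model (WEKA was the actual environment):

accs = zeros(10, 1);
for r = 1:10
    cv = cvpartition(y, 'KFold', 10);        % stratified by class label
    mdl = fitcknn(X3, y, 'NumNeighbors', 9, 'CVPartition', cv);
    accs(r) = 1 - kfoldLoss(mdl);            % mean accuracy over the 10 folds
end
fprintf('CA = %.3f +/- %.3f\n', mean(accs), std(accs));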
When investigating classification accuracy (CA) and other metrics for the various classification algorithms, certain tuning parameters were adjusted within the WEKA environment, whose graphical user interface provides a user-friendly platform for this purpose. For example, in the case of BayesNet, a combination of the SimpleEstimator with an alpha range of 0.5–0.8 and the LAGDHillClimber search algorithm was used. This configuration showed superior CA compared with alternatives such as K2, SimulatedAnnealing, and TAN, which are also available in WEKA. As for the naive Bayes (NB) classifier, the experiments were performed by switching the KernelEstimator between true and false while keeping the other WEKA default parameters constant. The MultiPerceptron was tested with both 5 and 6 hidden layers; to do this, we switched between 'a' and 't' in the WEKA object editor. The sequential minimal optimization (SMO) algorithm was configured with the PolyKernel, which yielded the highest CA compared with other kernels such as Puk, StringKernel, and RBFKernel, all with their default parameters. The other classifiers, SimpleLogistic, ClassViRegression, KStar, PART, Logistic-R, and LMT, were run with their default parameters in WEKA.
Locally weighted learning (LWL) was selected with its default settings, but the LinearNNSearch algorithm was preferred over other options such as KDTree, CoverTree, and BallTree due to its higher CA. The instance-based k-nearest neighbor (IBk) classifier was tested with a kNN value of 9, which outperformed the other kNN values from 1 to 13 in terms of CA. JRip was run with 9 folds for pruning, as this was found to be more effective than the default of 1 fold. The J48 classifier was utilized with the default confidence factor of 0.25; adjusting this value resulted in no significant differences in CA. The random forest (RF) classifier was configured with MaxDepth set to 0, 100 trees in the forest, and numFeatures set to 0, which determines the number of randomly selected attributes. Finally, REPTree was selected with its default settings, including a minNum of 2.0, which refers to the minimum total weight of instances in a leaf; numFolds was set to 3, and maxDepth was set to −1, indicating no restriction on the tree depth. This systematic approach to tuning the parameters and selecting the algorithms was crucial for optimal classification performance.
The evaluation of these algorithms was conducted based on several key metrics: CA, the receiver operating characteristic (ROC), and the confidence interval (CI). The ROC area is a single measure of the overall performance of a classification model, with a higher area under the curve (AUC) indicating better performance; values range from 0 to 1, where 0.5 denotes random guessing and 1 signifies perfect performance. A CI serves as a quantitative measure of uncertainty in estimation, wherein the interval's width is inversely related to the level of certainty: a broader confidence interval signifies higher uncertainty, whereas a narrower interval suggests increased confidence in the estimation [77]. To calculate the lower and upper limits $p$ of the CI, we used the Wilson score interval method in Equation (14) [78,79].
$$p = \left( f + \frac{z^2}{2N} \pm z\sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}} \right) \Bigg/ \left( 1 + \frac{z^2}{N} \right) \quad (14)$$
Here, $N$ is the number of instances in the test set; $f$ is the observed sample proportion ($f = S/N$), where $S$ is the number of successes (the number of correct guesses made by the model); and $z$ is the z-score corresponding to the desired confidence level. In our study, we use a confidence interval of 80% (with a wider interval indicating more uncertainty and a narrower one indicating higher confidence), for which $z = 1.28$. The term inside the square root is the adjusted standard error of the proportion, while the denominator is a correction factor that adjusts the interval's width. The ± symbol indicates that the term following it is added for the upper limit of the confidence interval and subtracted for the lower limit. The detailed CA, ROC, and CI values of all the classification models, obtained by running the algorithms 10 times in WEKA, can be seen in Table 4. Additionally, the boxplot in Figure 10 provides a graphical comparison among the classification models.
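As a worked example of Equation (14), assume N = 1000 test instances with S = 889 correct (matching the study's 88.9% accuracy) and z = 1.28 for the 80% level; the specific values are illustrative.

N = 1000; S = 889; f = S / N; z = 1.28;
halfWidth = z * sqrt(f/N - f^2/N + z^2/(4*N^2));
p = (f + z^2/(2*N) + [-1 1] * halfWidth) / (1 + z^2/N);  % [lower, upper]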
Table 4 and Figure 10 provide insight into the predictive capabilities of the machine learning models. The MultiPerceptron algorithm exhibited the highest classification accuracy (CA), indicating its effectiveness in handling the complex relationships within the gait data. Other models, like SimpleLogistic and LMT, also showed high accuracy and receiver operating characteristic (ROC) values.
The evaluation metrics CA, ROC, and the confidence intervals (CIs) served as critical indicators of model performance. The LMT algorithm demonstrated high CA and ROC values, suggesting its strength in class probability estimation. The CI calculations from Equation (14) helped quantify the uncertainty in the model estimates, offering a comprehensive assessment of model reliability. These results suggest that a combination of PCA for feature reduction and a suitable selection of classifiers can yield robust and reliable insights into gait analysis for both research and clinical practice. For visualization purposes, the ROC areas for three subjects under the classifiers with the highest CA (MultiPerceptron, SimpleLogistic, and LMT) are shown in Figure 11.
The ROC curves show that, as the false positive rate (FPR, 1 − specificity) approaches zero, the true positive rates (TPRs) of the respective classifiers remain quite high, indicating strong performance in the low-false-alarm regime. The curves rise steeply toward the top-left corner, which suggests that the classifiers can identify a significant number of true positives without incurring many false positives. However, as the FPR increases, the rate at which the TPR increases differs among the classifiers. A steep initial slope in this region is desirable, as it indicates that the classifier can achieve a high TPR without significantly increasing the FPR. It is also crucial to consider the area under the ROC curve (AUC): the closer the AUC is to 1, the better the classifier's overall ability to distinguish between the positive and negative classes across all thresholds. The AUC values for the classifiers are substantially high (0.973, 0.985, and 0.995), indicating good overall performance.
These curves suggest that the classifiers perform well, particularly at low false alarm rates, which is often a critical area in many applications where the cost of a false alarm is high. By dissecting these results, we provide a nuanced understanding of each algorithm’s performance, offering valuable insights into their potential application in similar research contexts. These results can be a guide for future algorithmic choices in similar studies regarding their classification accuracy.
Additionally, the confusion matrices for the three classifiers with the highest classification accuracy (close to 89%), namely, MultiPerceptron, SimpleLogistic, and LMT, are depicted in Figure 12. The confusion matrices show that class 2 has variable CA, with the most confusion with classes 1, 4, and 8 across the three classifiers. The same occurs with class 8, which is confused with classes 1, 2, and 5 (12 instances) in SimpleLogistic. Nevertheless, the three classifiers show sufficient CA to distinguish the various classes.
Furthermore, our study provides several contributions to smartphone applications and gait recognition by demonstrating the effectiveness of various classification algorithms for human recognition based on hip joint angles. However, other studies of gait recognition using sensors on ankle or wrist joints have shown promising results in gait analysis. For instance, Talha et al. [14] utilized a smartphone motion sensor on an ankle joint, achieving a classification accuracy of 87% with IMU raw data and 94% when using the gender and height feature in a training set. Similarly, Deb et al. [11] employed a time-warped similarity metric with accelerometer sensor data on wrist and ankle joints, resulting in a classification accuracy of 89.7% and 82.3%, respectively.
Our study utilizes conventional classifiers available in WEKA for their balance between performance and computational efficiency. Therefore, future research aims to explore advanced classifiers like LSTM NN, RNN, and HCNN to improve performance and capture complex gait patterns, building upon the solid foundation established by our conventional machine learning approaches. However, implementing these modern classifiers would require substantial changes to our setup, including deep learning frameworks like TensorFlow or PyTorch, which involve high computational demands and are beyond our current scope in this study.
Another future research direction may utilize our methods with diverse demographics of participants who have various hip or walking pathologies to compare the classification effectiveness with our current results. Further optimization of parameters and settings is worth investigating and could achieve the best possible performance and accuracy for the classification task. Therefore, we intend to investigate the integration of late fusion techniques, such as score fusion and majority voting strategies, to improve the accuracy and stability of the proposed models [91].

5. Conclusions

Our study investigates the utilization of smartphone IMU sensors to discriminate subjects based on their walking styles by analyzing hip joint angles. Our findings confirm the reliability of these sensors in measuring hip joint angles and effectively distinguishing between individuals using classification techniques. Through sensor fusion, which integrates accelerometer, gyroscope, and magnetometer data, we have achieved accuracy levels comparable with a reference system of angle measurements obtained from a camera array. By employing statistical methodologies for feature extraction and machine learning algorithms, we achieve an 88.9% classification accuracy. This underscores the immense potential of smartphones in facilitating comprehensive human walking analysis and proficiently classifying sensor data.

Author Contributions

Conceptualization, R.A. (Rabé Andersson) and J.C.; methodology, R.A. (Rabé Andersson), J.B.-G. and J.C.; software, R.A. (Rabé Andersson); validation, R.A. (Rabé Andersson), J.B.-G. and R.A. (Rafael Agujetas); formal analysis, R.A. (Rabé Andersson); investigation, R.A. (Rabé Andersson), J.B.-G., R.A. (Rafael Agujetas) and J.C.; resources, R.A. (Rabé Andersson); data curation, R.A. (Rabé Andersson) and J.B.-G.; writing—original draft preparation, R.A. (Rabé Andersson), M.C. and J.C.; writing—review and editing, R.A. (Rabé Andersson), J.B.-G. and M.C.; visualization, R.A. (Rabé Andersson); supervision, R.A. (Rafael Agujetas), M.C. and J.C.; project administration, R.A. (Rabé Andersson); funding acquisition, R.A. (Rabé Andersson). All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from the University of Gävle and was partially supported by the Ministry of Science and Innovation—Spanish Agency of Research (MCIN/AEI/10.13039/501100011033), through the project PID2022-1375250B-C21.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The measurement data are available on request from the corresponding author. The data are not publicly available due to data privacy protection regulations.

Acknowledgments

The authors gratefully acknowledge the support provided for this research by the University of Gävle, the University of Extremadura, and the Ministry of Science and Innovation—Spanish Agency of Research. The authors would also like to gratefully acknowledge the participants in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MCS: motion capture system
EMG: electromyography
ML: machine learning
SVM: support vector machine
NN: neural network
LSTM NN: long short-term memory neural network
RNN: recurrent neural network
NB: naive Bayes
LDA: linear discriminant analysis
HCNN: hybrid convolutional neural network
IMU: inertial measurement unit
MARG: magnetic, angular rate, and gravity
MEMS: micro-electro-mechanical system
SFA: sensor fusion algorithm
LCF: linear complementary filter
NCF: nonlinear complementary filter
LKF: linear Kalman filter
EKF: extended Kalman filter
CKF: complementary Kalman filter
SRUKF: square root unscented Kalman filter
SRCKF: square root cubature Kalman filter
AHRS: attitude heading reference system
SD: standard deviation
RMSE: root mean square error
MV: mean value
M: median
COV: covariance
VAR: variance
KUR: kurtosis
SKE: skewness
PCA: principal component analysis
WEKA: Waikato Environment for Knowledge Analysis
BayesNet: Bayesian network
SMO: sequential minimal optimization
LWL: locally weighted learning
kNN: k-nearest neighbor
IBk: instance-based k
JRip: Java repeated incremental pruning
RIPPER: repeated incremental pruning to produce error reduction
PART: partial decision tree
LMT: logistic model tree
REPTree: reduced error pruning tree
CA: classification accuracy
CI: confidence interval
ROC: receiver operating characteristic
AUC: area under the curve

References

  1. Richter, F. Smartphone Sales Worldwide 2007–2021|Statista. Available online: https://www.statista.com/statistics/263437/global-smartphone-sales-to-end-users-since-2007/ (accessed on 17 October 2023).
  2. Majumder, S.; Deen, M.J. Smartphone sensors for health monitoring and diagnosis. Sensors 2019, 19, 2164. [Google Scholar] [CrossRef] [PubMed]
  3. Sprager, S.; Juric, M.B. Inertial Sensor-Based Gait Recognition: A Review. Sensors 2015, 15, 22089–22127. [Google Scholar] [CrossRef] [PubMed]
  4. Drake, J.; Schulz, K.; Bukowski, R.; Gaither, K. Collecting and analyzing smartphone sensor data for health. In Proceedings of the PEARC ’21: Practice and Experience in Advanced Research Computing, Boston, MA, USA, 18–22 July 2021; pp. 1–4. [Google Scholar] [CrossRef]
  5. Moral-Munoz, J.A.; Zhang, W.; Cobo, M.J.; Herrera-Viedma, E.; Kaber, D.B. Smartphone-based systems for physical rehabilitation applications: A systematic review. Assist. Technol. 2021, 33, 223–236. [Google Scholar] [CrossRef] [PubMed]
  6. Faiz, A.B.; Imteaj, A.; Chowdhury, M. Smart vehicle accident detection and alarming system using a smartphone. In Proceedings of the 2015 International Conference on Computer and Information Engineering (ICCIE), Rajshahi, Bangladesh, 26–27 November 2015; pp. 66–69. [Google Scholar] [CrossRef]
  7. Kashevnik, A.; Ponomarev, A.; Shilov, N.; Chechulin, A. In-Vehicle Situation Monitoring for Potential Threats Detection Based on Smartphone Sensors. Sensors 2020, 20, 5049. [Google Scholar] [CrossRef] [PubMed]
  8. Zhong, Y.; Deng, Y. Sensor orientation invariant mobile gait biometrics. In Proceedings of the IEEE International Joint Conference on Biometrics, Clearwater, FL, USA, 29 September–2 October 2014; pp. 1–8. [Google Scholar] [CrossRef]
  9. Ramakrishna, M.V.; Harika, S.; Chowdary, S.M.; Kumar, T.P.; Vamsi, T.K.; Adilakshmi, M. Machine Learning based Gait Recognition for Human Authentication. In Proceedings of the 2nd International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), Erode, India, 23–25 March 2023; pp. 1316–1322. [Google Scholar] [CrossRef]
  10. Damaševičius, R.; Maskeliunas, R.; Venčkauskas, A.; Woźniak, M. Smartphone User Identity Verification Using Gait Characteristics. Symmetry 2016, 8, 100. [Google Scholar] [CrossRef]
  11. Deb, S.; Ouyang, Y.; Chua, M.C.H.; Tian, J. Gait identification using a new time-warped similarity metric based on smartphone inertial signals. J. Ambient Intell. Humaniz. Comput. 2020, 11, 4041–4053. [Google Scholar] [CrossRef]
  12. Connor, P.; Ross, A. Biometric recognition by gait: A survey of modalities and features. Comput. Vis. Image Underst. 2018, 167, 1–27. [Google Scholar] [CrossRef]
  13. Wan, C.; Wang, L.; Phoha, V.V. A Survey on Gait Recognition. ACM Comput. Surv. 2018, 51, 89. [Google Scholar] [CrossRef]
  14. Talha, M.; Soomro, H.A.; Naeem, N.; Ali, E.; Kyrarini, M. Human Identification Using a Smartphone Motion Sensor and Gait Analysis. In Proceedings of the PETRA ’22: Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece, 29 June–1 July 2022; pp. 197–202. [Google Scholar] [CrossRef]
  15. Martins, M.; Elias, A.; Cifuentes, C.; Alfonso, M.; Frizera, A.; Santos, C.; Ceres, R. Assessment of walker-assisted gait based on Principal Component Analysis and wireless inertial sensors. Rev. Bras. De Eng. Biomédica 2014, 30, 220–231. [Google Scholar] [CrossRef]
  16. Zhang, M.W.; Chew, P.Y.; Yeo, L.L.; Ho, R.C. The untapped potential of smartphone sensors for stroke rehabilitation and after-care. Technol. Health Care 2016, 24, 139–143. [Google Scholar] [CrossRef]
  17. Kong, P.W. Editorial–Special Issue on “Sensor Technology for Enhancing Training and Performance in Sport”. Sensors 2023, 23, 2847. [Google Scholar] [CrossRef] [PubMed]
  18. Wang, S.; Chan, P.P.; Lam, B.M.; Chan, Z.Y.; Zhang, J.H.; Wang, C.; Lam, W.K.; Ho, K.K.W.; Chan, R.H.; Cheung, R.T. Sensor-based gait retraining lowers knee adduction moment and improves symptoms in patients with knee osteoarthritis: A randomized controlled trial. Sensors 2021, 21, 5596. [Google Scholar] [CrossRef] [PubMed]
  19. Turner, A. How Many Smartphones Are In The World? 2021. Available online: https://www.bankmycell.com/blog/how-many-phones-are-in-the-world/ (accessed on 2 January 2024).
  20. Bhattacharjya, S.; Cavuoto, L.A.; Reilly, B.; Xu, W.; Subryan, H.; Langan, J. Usability, Usefulness, and Acceptance of a Novel, Portable Rehabilitation System (mRehab) Using Smartphone and 3D Printing Technology: Mixed Methods Study. JMIR Hum. Factors 2021, 8, e21312. [Google Scholar] [CrossRef] [PubMed]
  21. Thang, H.M.; Viet, V.Q.; Dinh Thuc, N.; Choi, D. Gait identification using accelerometer on mobile phone. In Proceedings of the 2012 International Conference on Control, Automation and Information Sciences (ICCAIS), Saigon, Vietnam, 26–29 November 2012; pp. 344–348. [Google Scholar] [CrossRef]
  22. Makihara, Y.; Matovski, D.S.; Nixon, M.S.; Carter, J.N.; Yagi, Y. Gait Recognition: Databases, Representations, and Applications. In Wiley Encyclopedia of Electrical and Electronics Engineering; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar] [CrossRef]
  23. Derawi, M.O.; Nickely, C.; Bours, P.; Busch, C. Unobtrusive user-authentication on mobile phones using biometric gait recognition. In Proceedings of the 2010 6th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIHMSP 2010, Darmstadt, Germany, 15–17 October 2010; pp. 306–311. [Google Scholar] [CrossRef]
  24. Neumann, D.A. Kinesiology of the hip: A focus on muscular actions. J. Orthop. Sport. Phys. Ther. 2010, 40, 82–94. [Google Scholar] [CrossRef] [PubMed]
  25. Muro-de-la Herran, A.; García-Zapirain, B.; Méndez-Zorrilla, A. Gait Analysis Methods: An Overview of Wearable and Non-Wearable Systems, Highlighting Clinical Applications. Sensors 2014, 14, 3362. [Google Scholar] [CrossRef] [PubMed]
  26. Bouchrika, I.; Goffredo, M.; Carter, J.; Nixon, M. On using gait in forensic biometrics. J. Forensic Sci. 2011, 56, 882–889. [Google Scholar] [CrossRef] [PubMed]
  27. Baker, R. The history of gait analysis before the advent of modern computers. Gait Posture 2007, 26, 331–342. [Google Scholar] [CrossRef] [PubMed]
28. Fleury, A.; Mourcou, Q.; Franco, C.; Diot, B.; Demongeot, J.; Vuillerme, N. Evaluation of a Smartphone-based audio-biofeedback system for improving balance in older adults–a pilot study. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2013, 2013, 1198–1201. [Google Scholar] [CrossRef]
  29. Giandolini, M.; Poupard, T.; Gimenez, P.; Horvais, N.; Millet, G.Y.; Morin, J.B.; Samozino, P. A simple field method to identify foot strike pattern during running. J. Biomech. 2014, 47, 1588–1593. [Google Scholar] [CrossRef]
  30. Andersson, R.; Björsell, N. The Energy Consumption and Robust Case Torque Control of a Rehabilitation Hip Exoskeleton. Appl. Sci. 2022, 12, 11104. [Google Scholar] [CrossRef]
  31. Taniguchi, H.; Sato, H.; Shirakawa, T. A machine learning model with human cognitive biases capable of learning from small and biased datasets. Sci. Rep. 2018, 8, 7397. [Google Scholar] [CrossRef] [PubMed]
32. Ordóñez, F.J.; Roggen, D. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors 2016, 16, 115. [Google Scholar] [CrossRef] [PubMed]
  33. Jiang, X.; Chu, K.H.; Khoshnam, M.; Menon, C. A Wearable Gait Phase Detection System Based on Force Myography Techniques. Sensors 2018, 18, 1279. [Google Scholar] [CrossRef] [PubMed]
  34. Goh, G.L.; Goh, G.D.; Pan, J.W.; Teng, P.S.P.; Kong, P.W. Automated Service Height Fault Detection Using Computer Vision and Machine Learning for Badminton Matches. Sensors 2023, 23, 9759. [Google Scholar] [CrossRef]
  35. Tao, W.; Liu, T.; Zheng, R.; Feng, H. Gait Analysis Using Wearable Sensors. Sensors 2012, 12, 2255. [Google Scholar] [CrossRef]
  36. Lihinikaduarachchi, I.; Rajapaksha, S.A.; Saumya, C.; Senevirathne, V.; Silva, P. Inertial Measurement units based wireless sensor network for real time gait analysis. In Proceedings of the TENCON 2015—2015 IEEE Region 10 Conference, Macao, China, 1–4 November 2015; pp. 1–6. [Google Scholar] [CrossRef]
  37. Olivares, A.; Górriz, J.M.; Ramírez, J.; Olivares, G. Sensor fusion adaptive filtering for position monitoring in intense activities. In Hybrid Artificial Intelligence Systems, Part I; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6076, pp. 484–491. [Google Scholar] [CrossRef]
  38. Ding, W.; Gao, Y. Attitude Estimation Using Low-Cost MARG Sensors with Disturbances Reduction. IEEE Trans. Instrum. Meas. 2021, 70. [Google Scholar] [CrossRef]
  39. Ferrari, A.; Micucci, D.; Mobilio, M.; Napoletano, P. Trends in human activity recognition using smartphones. J. Reliab. Intell. Environ. 2021, 7, 189–213. [Google Scholar] [CrossRef]
  40. Pinto, B.; Correia, M.V.; Paredes, H.; Silva, I. Detection of Intermittent Claudication from Smartphone Inertial Data in Community Walks Using Machine Learning Classifiers. Sensors 2023, 23, 1581. [Google Scholar] [CrossRef] [PubMed]
  41. Pan, T.Y.; Kuo, C.H.; Hu, M.C. A noise reduction method for IMU and its application on handwriting trajectory reconstruction. In Proceedings of the 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, Seattle, WA, USA, 11–15 July 2016. [Google Scholar] [CrossRef]
  42. Sun, W.; Wu, J.; Ding, W.; Duan, S. A robust indirect Kalman filter based on the gradient descent algorithm for attitude estimation during dynamic conditions. IEEE Access 2020, 8, 96487–96494. [Google Scholar] [CrossRef]
  43. Olsson, F.; Kok, M.; Halvorsen, K.; Schön, T.B. Accelerometer calibration using sensor fusion with a gyroscope. In Proceedings of the 2016 IEEE Statistical Signal Processing Workshop (SSP), Palma de Mallorca, Spain, 26–29 June 2016; pp. 1–5. [Google Scholar] [CrossRef]
  44. Nazarahari, M.; Rouhani, H. Sensor fusion algorithms for orientation tracking via magnetic and inertial measurement units: An experimental comparison survey. Inf. Fusion 2021, 76, 8–23. [Google Scholar] [CrossRef]
  45. Nazarahari, M.; Rouhani, H. 40 years of sensor fusion for orientation tracking via magnetic and inertial measurement units: Methods, lessons learned, and future challenges. Inf. Fusion 2021, 68, 67–84. [Google Scholar] [CrossRef]
  46. Diaz, E.M.; De Ponte Muller, F.; Jimenez, A.R.; Zampella, F. Evaluation of AHRS algorithms for inertial personal localization in industrial environments. IEEE Int. Conf. Ind. Technol. 2015, 2015, 3412–3417. [Google Scholar] [CrossRef]
  47. Yadav, N.; Bleakley, C. Accurate Orientation Estimation Using AHRS under Conditions of Magnetic Distortion. Sensors 2014, 14, 20008–20024. [Google Scholar] [CrossRef] [PubMed]
48. Tomaszewski, D.; Rapiński, J.; Pelc-Mieczkowska, R. Concept of AHRS Algorithm Designed for Platform Independent IMU Attitude Alignment. Rep. Geod. Geoinform. 2017, 104, 33–47. [Google Scholar] [CrossRef]
  49. Jayasinghe, U.; Hwang, F.; Harwin, W.S. Comparing Loose Clothing-Mounted Sensors with Body-Mounted Sensors in the Analysis of Walking. Sensors 2022, 22, 6605. [Google Scholar] [CrossRef]
  50. Andersson, R.; Björsell, N. The MATLAB Simulation and the Linear Quadratic Regulator Torque Control of a Series Elastic Actuator for a Rehabilitation Hip Exoskeleton. In Proceedings of the 2022 5th International Conference on Intelligent Robotics and Control Engineering (IRCE), Tianjin, China, 23–25 September 2022; pp. 25–31. [Google Scholar] [CrossRef]
  51. Sensor Fusion and Tracking Toolbox—MATLAB. Available online: https://se.mathworks.com/products/sensor-fusion-and-tracking.html (accessed on 24 January 2023).
  52. Orientation from accelerometer, gyroscope, and magnetometer readings—MATLAB—MathWorks Nordic. Available online: https://se.mathworks.com/help/fusion/ref/ahrsfilter-system-object.html (accessed on 24 January 2023).
  53. Pandey, N.; Abdulla, W.; Salcic, Z. Gait-based person identification using multi-view sub-vector quantisation technique. In Proceedings of the 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Sharjah, United Arab Emirates, 12–15 February 2007. [Google Scholar] [CrossRef]
54. Cao, Y.; Gao, F.; Yu, L.; She, Q. Gait recognition based on EMG information with multiple features. IFIP Adv. Inf. Commun. Technol. 2018, 538, 402–411. [Google Scholar] [CrossRef]
  55. Chen, J.; Sun, Y.; Sun, S. Improving Human Activity Recognition Performance by Data Fusion and Feature Engineering. Sensors 2021, 21, 692. [Google Scholar] [CrossRef]
  56. Yang, M.J.; Zheng, H.R.; Wang, H.Y.; Mcclean, S.; Harris, N. Combining feature ranking with PCA: An application to gait analysis. In Proceedings of the 2010 International Conference on Machine Learning and Cybernetics, Qingdao, China, 11–14 July 2010; Volume 1, pp. 494–499. [Google Scholar] [CrossRef]
  57. Jolliffe, I.T. Principal Component Analysis; Springer Series in Statistics; Springer: New York, NY, USA, 1986. [Google Scholar] [CrossRef]
  58. Márquez, F.P.G. Advances in Principal Component Analysis; IntechOpen: Rijeka, Croatia, 2022. [Google Scholar] [CrossRef]
  59. Weka. Weka 3—Data Mining with Open Source Machine Learning Software in Java. Available online: https://ml.cms.waikato.ac.nz/weka/ (accessed on 1 January 2024).
  60. Kotak, P.; Modi, H. Enhancing the Data Mining Tool WEKA. In Proceedings of the 2020 5th International Conference on Computing, Communication and Security (ICCCS), Patna, India, 14–16 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
61. Dash, R.K. Selection of the best classifier from different datasets using WEKA. Int. J. Eng. Res. Technol. (IJERT) 2013, 2. [Google Scholar] [CrossRef]
  62. Alshammari, M.; Mezher, M. A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA Toolbox. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 224–229. [Google Scholar] [CrossRef]
  63. Eligo, W.M.; Leng, C.; Kurika, A.E.; Basu, A. Comparing Supervised Machine Learning Algorithms on Classification Efficiency of multiclass classifications problem. Int. J. Emerg. Trends Eng. Res. 2022, 10, 346–360. [Google Scholar] [CrossRef]
  64. Ong, M.S.; Magrabi, F.; Coiera, E. Automated categorisation of clinical incident reports using statistical text classification. Qual. Saf. Health Care 2010, 19, e55. [Google Scholar] [CrossRef]
  65. Davis, J.; Goadrich, M. The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 233–240. [Google Scholar] [CrossRef]
  66. Area, S.; Mesra, R. Analysis of Bayes, neural network and tree classifier of classification technique in data mining using WEKA. Comput. Sci. Inf. Technol. 2012, 2, 359–369. [Google Scholar] [CrossRef]
  67. Bouckaert, R. Bayesian Network Classifiers in Weka; Working Paper Series; University of Waikato, Department of Computer Science: Hamilton, New Zealand, 2004; pp. 1–23. [Google Scholar]
  68. Salazar, A.; Vergara, L.; Vidal, E. A proxy learning curve for the Bayes classifier. Pattern Recognit. 2023, 136, 109240. [Google Scholar] [CrossRef]
  69. Sahoo, G.; Kumar, Y. Analysis of parametric & non parametric classifiers for classification technique using WEKA. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 2012, 4, 43. [Google Scholar] [CrossRef]
  70. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: Burlington, MA, USA, 2016. [Google Scholar]
  71. Shahzad, W.; Asad, S.; Khan, M.A. Feature subset selection using association rule mining and JRip classifier. Int. J. Phys. Sci. 2013, 8, 885–896. [Google Scholar] [CrossRef]
  72. Thakur, S.; Meenakshi, E.; Priya, A. Detection of malicious URLs in big data using RIPPER algorithm. In Proceedings of the 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT), Bangalore, India, 19–20 May 2017; pp. 1296–1301. [Google Scholar] [CrossRef]
  73. Mohamed, W.N.H.W.; Salleh, M.N.M.; Omar, A.H. A comparative study of Reduced Error Pruning method in decision tree algorithms. In Proceedings of the 2012 IEEE International Conference on Control System, Computing and Engineering, Penang, Malaysia, 23–25 November 2012; pp. 392–397. [Google Scholar] [CrossRef]
  74. Rajesh, P.; Karthikeyan, M. A comparative study of data mining algorithms for decision tree approaches using weka tool. Adv. Nat. Appl. Sci. 2017, 11, 230–243. [Google Scholar]
75. Salzberg, S.L. Book review: C4.5: Programs for Machine Learning by J. Ross Quinlan (Morgan Kaufmann Publishers, Inc., 1993). Mach. Learn. 1994, 16, 235–240. [Google Scholar] [CrossRef]
  76. Frank, E.; Hall, M.; Witten, I. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, 4th ed.; Morgan Kaufmann: Burlington, MA, USA, 2016. [Google Scholar]
  77. Hazra, A. Using the confidence interval confidently. J. Thorac. Dis. 2017, 9, 4125–4130. [Google Scholar] [CrossRef]
78. Rossi, R.J. Confidence Intervals. In Applied Biostatistics for the Health Sciences, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2022; pp. 235–271. [Google Scholar] [CrossRef]
  79. Wallis, S. Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative Methods. J. Quant. Linguist. 2013, 20, 178–208. [Google Scholar] [CrossRef]
  80. Kozlow, P.; Abid, N.; Yanushkevich, S. Gait Type Analysis Using Dynamic Bayesian Networks. Sensors 2018, 18, 3329. [Google Scholar] [CrossRef] [PubMed]
  81. Manap, H.H.; Tahir, N.M.; Abdullah, R. Anomalous gait detection using Naive Bayes classifier. In Proceedings of the ISIEA 2012—2012 IEEE Symposium on Industrial Electronics and Applications, Bandung, Indonesia, 23–26 September 2012; pp. 378–381. [Google Scholar] [CrossRef]
  82. Yang, J.H.; Park, J.H.; Jang, S.H.; Cho, J. Novel Method of Classification in Knee Osteoarthritis: Machine Learning Application Versus Logistic Regression Model. Ann. Rehabil. Med. 2020, 44, 415–427. [Google Scholar] [CrossRef] [PubMed]
  83. Szczepanski, D. Multilayer perceptron for gait type classification based on inertial sensors data. In Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, Gdańsk, Poland, 11–14 September 2016; pp. 947–950. [Google Scholar] [CrossRef]
  84. Platt, J.C. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines; Microsoft: Redmond, WA, USA, 1998. [Google Scholar]
  85. Seo, J.; Kim, T.; Lee, J.; Kim, J.; Choi, J.; Tack, G. Fall prediction of the elderly with a logistic regression model based on instrumented timed up & go. J. Mech. Sci. Technol. 2019, 33, 3813–3818. [Google Scholar] [CrossRef]
  86. Ng, Y.L.; Jiang, X.; Zhang, Y.; Shin, S.B.; Ning, R. Automated Activity Recognition with Gait Positions Using Machine Learning Algorithms. Eng. Technol. Appl. Sci. Res. 2019, 9, 4554–4560. [Google Scholar] [CrossRef]
  87. Atkeson, C.G.; Moore, A.W.; Schaal, S. Locally Weighted Learning. Artif. Intell. Rev. 1997, 11, 11–73. [Google Scholar] [CrossRef]
  88. Frank, E.; Witten, I.H. Generating Accurate Rule Sets Without Global Optimization. In Proceedings of the Fifteenth International Conference on Machine Learning, Madison, WI, USA, 24–27 July 1998; pp. 144–151. [Google Scholar]
  89. Shi, L.F.; Qiu, C.X.; Xin, D.J.; Liu, G.X. Gait recognition via random forests based on wearable inertial measurement unit. J. Ambient Intell. Humaniz. Comput. 2020, 11, 5329–5340. [Google Scholar] [CrossRef]
  90. NH, W. Classification of control and neurodegenerative disease subjects using tree based classifiers. J. Pharm. Res. Int. 2020, 32, 63–73. [Google Scholar]
  91. Kuncheva, L.I. Combining Pattern Classifiers: Methods and Algorithms, 2nd ed.; Wiley Blackwell: Hoboken, NJ, USA, 2014; Volume 9781118315, pp. 1–357. [Google Scholar] [CrossRef]
Figure 1. (a) The three-axial sensor measurements of the smartphone; (b) the fixed reference frame of the earth [40].
Figure 2. The attitude heading reference system (AHRS) algorithm.
Figure 3. (a) The test bench configuration to compare the MCS and the smartphone measurements; (b) a theoretical sketch of the pendulum configuration.
Figure 4. The smartphone setup on a subject during the test.
Figure 5. The flowchart for acquiring and processing sensor data.
Figure 6. Comparison of the MCS and smartphone measurements in the pendulum test bench.
Figure 7. The acceleration, angular velocity, and magnetic field signals, together with the hip joint angle, of a test subject.
Figure 8. The boxplot of the mean value and median features for all experimental subjects.
Figure 9. The 3D principal component analysis (PCA) for all subjects with color coding to differentiate each subject.
Figure 10. The boxplot of the classification accuracy across multiple models.
Figure 11. The ROC curve of the optimal classifier with the highest classification accuracy across different subjects: (a) the MultiPerceptron classifier for subject 2, (b) the SimpleLogistic classifier for subject 4, and (c) the LMT classifier for subject 6.
Figure 12. The confusion matrices for the classifier with the highest classification accuracy (the classes are from 1 = subject 1 to 10 = subject 10): (a) the confusion matrix for the MultiPerceptron classifier, (b) the confusion matrix for the SimpleLogistic classifier, and (c) the confusion matrix for the LMT classifier.
Table 1. The mathematical representations and descriptions of the features [40,55].

| Feature | Description | Mathematical Definition |
| --- | --- | --- |
| Mean Value (MV) | The average of all angles in the sequence. | $\frac{1}{N}\sum_{i=1}^{N} x_i$ |
| Median (M) | The middle angle value in the ordered sequence. | $M_{\mathrm{odd}} = x_{(N+1)/2}$; $M_{\mathrm{even}} = \frac{1}{2}\left(x_{N/2} + x_{N/2+1}\right)$ |
| Maximum Angle | The largest angle observed. | $x_{\max}$ |
| Covariance (COV) | Indicates how two angle variables vary together. | $\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})$ |
| Minimum Angle | The smallest angle observed. | $x_{\min}$ |
| Variance (VAR) | Measures the dispersion around the mean angle. | $\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$ |
| Standard Deviation (SD) | Shows the amount of variation or dispersion of angle values. | $\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$ |
| Kurtosis (KUR) | Describes the sharpness or flatness of the angle distribution. | $\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \bar{x}}{\sigma}\right)^4$ |
| Skewness (SKE) | Shows the asymmetry in the angle distribution. | $\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i - \bar{x}}{\sigma}\right)^3$ |

N: number of samples for each trial; σ: the standard deviation of the sequence.
Table 2. The classification algorithms in WEKA.

| No. | Category | Algorithms |
| --- | --- | --- |
| 1 | Bayesian Classifiers | BayesNet, NB |
| 2 | Function Classifiers | Logistic-R, MultiPerceptron, SMO, SimpleLogistic, ClassViRegression |
| 3 | Lazy Classifiers | KStar, LWL, IBk |
| 4 | Rule Classifiers | JRip, PART |
| 5 | Tree Classifiers | J48, LMT, RF, REPTree |
Table 3. Explained variance of principal components.

| Principal Component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Explained Variance (%) | 91.1337 | 8.4542 | 0.2717 | 0.1325 | 0.0073 | 0.0005 | 0.0001 | 0.0000 | 0.0000 |
Table 4. The classification accuracy of the multiple classifiers, with rows marked by an asterisk (*) indicating the highest classification accuracy (CA) percentage values for 10 subjects' data [55].

| No. | Classification Model | CA % | Av. ROC | Av. CI |
| --- | --- | --- | --- | --- |
| 1 | BayesNet [80] | 84.26 ± 1.2 | 0.975 ± 0.002 | [0.829, 0.859] |
| 2 | NB [81] | 85.5 ± 2.1 | 0.988 ± 0.003 | [0.856, 0.884] |
| 3 | Logistic-R [82] | 87.1 ± 1.6 | 0.986 ± 0.001 | [0.865, 0.892] |
| 4 | MultiPerceptron [83] * | 88.9 ± 1.3 | 0.965 ± 0.003 | [0.874, 0.899] |
| 5 | SMO [84] | 84.9 ± 1.2 | 0.976 ± 0.002 | [0.847, 0.875] |
| 6 | SimpleLogistic [85] * | 88.4 ± 2.3 | 0.989 ± 0.002 | [0.861, 0.907] |
| 7 | ClassViRegression [82] | 85.4 ± 1.8 | 0.985 ± 0.004 | [0.847, 0.875] |
| 8 | KStar [86] | 86.1 ± 2.6 | 0.987 ± 0.003 | [0.849, 0.887] |
| 9 | LWL [87] | 63.4 ± 1.6 | 0.937 ± 0.001 | [0.622, 0.661] |
| 10 | IBk [86] | 84.1 ± 1.1 | 0.917 ± 0.001 | [0.830, 0.852] |
| 11 | JRip [86] | 80.1 ± 1.1 | 0.947 ± 0.003 | [0.786, 0.818] |
| 12 | PART [88] | 83.4 ± 1.5 | 0.932 ± 0.001 | [0.819, 0.849] |
| 13 | J48 [86] | 84.9 ± 1.4 | 0.937 ± 0.005 | [0.814, 0.844] |
| 14 | LMT [74] * | 88.2 ± 1.4 | 0.989 ± 0.001 | [0.868, 0.896] |
| 15 | RF [89] | 86.9 ± 1.1 | 0.903 ± 0.001 | [0.863, 0.889] |
| 16 | REPTree [90] | 82.9 ± 1.2 | 0.960 ± 0.001 | [0.813, 0.843] |

CA: classification accuracy; Av. ROC: average area under the receiver operating characteristic (ROC) curve; Av. CI: average confidence interval. All values under CA are given as mean ± standard deviation (M ± SD). * Highest classification accuracy.