1. Introduction
Step length (or stride length) plays an important role in assessing human health conditions, especially for seniors. It is an indicator that predicts accidental falls and fall-related injuries in the elderly [1], which may be fatal [2]. A reduced step length has been found to be associated with increased dependence, mortality, and institutionalization of older people [3]. The variability of the step length also indicates the integrity of gray matter, which is closely related to personal memory and executive functions [4]. Furthermore, step length is one of the significant components of gait patterns. It can be converted to gait speed, which is useful in predicting life expectancy [5]. Therefore, monitoring the human step length is a topic worthy of investigation.
The estimation of the step length can be traced back to the problem of distance estimation. Although distance estimation has been intensively researched for general communication systems, few papers explicitly study the human step length in daily activities, such as walking and jogging, in both indoor and outdoor environments. Moreover, as discussed in more detail in the next section of this paper, the existing publications that address the estimation of step length either have modest accuracy or follow privacy-invasive, health-concerning, or strictly space-confined approaches. Specifically, camera-based technologies [6,7] are privacy-invasive because they may record images or video footage of the participants. The camera-based methods also require a specific experimental setting because any obstacle appearing between the camera and the person under test can cause measurement errors. Meanwhile, laser-based methods [8] may raise health concerns because long-term exposure to lasers may be hazardous. On the other hand, sensing mats [9,10,11] have been well adopted to improve the safety of patients, especially the disabled and those with disorders. However, the sensing mat approach is confined to particular spaces, such as clinics, hospitals, or a specific laboratory setting where the sensing mat is laid, because the person under test must walk or run on this mat. Therefore, a more accurate, less invasive, less health-concerning, less space-confined, and also cost-efficient technique for step length estimation in daily activities is still missing.
Thus, this paper aimed to estimate the step length based on the received signal strength indicator (RSSI) method in both walking and jogging activities in indoor and outdoor scenarios. The RSSI has been widely employed in distance estimation, and it may provide reliable performance [12,13,14,15,16,17], especially for measurements in line-of-sight (LOS) paths over short distances, such as the step length measurements in this paper. The step length in this paper refers to the average distance between the two ankles of the person under test when the person is walking or jogging at a normal and even pace. Unlike our previous work in [18], which only considered a static environment, this paper undertook experiments in actual moving activities. In particular, we propose a novel filtering technique to be applied along with the empirical path loss model proposed in [18] to estimate the step length in walking and jogging situations.
The main contributions of this paper are summarized as follows:
A novel filtering technique is proposed to eliminate on-ankle path loss outliers and keep the remaining as a reliable range with a pair of upper and lower thresholds. This range of path loss values was used to estimate the human step length in daily activities, such as walking and jogging;
The distribution of the on-ankle path loss was revealed to follow a two-term Gaussian distribution, and the two thresholds lay on either side of its second hump;
The thresholds can be determined mathematically. The upper threshold relates to the fitting equation of the second hump of the two-term Gaussian distribution and takes a different form for the outdoor and the indoor environments. The lower threshold relates to the survival rate: it is located at the point where the survival rate of the measured data is 0.68;
The proposed filtering technique resulted in an accurate estimation of the step length, with errors of only 10.15 mm and 4.40 mm for walking and jogging in an indoor environment, respectively, and only 4.81 mm and 10.84 mm for the same activities in an outdoor environment.
The rest of the paper is organized as follows. Section 2 reviews the related works. Section 3 describes the proposed system model. In Section 4, the experimental procedures are detailed. Section 5 presents the experimental results and analyses of the step length estimation accuracy in the indoor walking, indoor jogging, outdoor walking, and outdoor jogging situations. Section 6 concludes the paper. Finally, Section 7 states the limitations and the future works.
2. Related Works
Accurate estimation of the human step length is a challenging task, especially in human daily activities, due to the randomness of these activities. As a result, there are few research papers that explicitly address the problem of step length estimation, although the overarching topic of distance measurements has been intensively researched for general communication systems. These research papers are briefly reviewed as follows.
The researchers in [6] used cameras as additional sensors in pedestrian dead reckoning (PDR) to analyze step length and step frequency. Currently, PDR is a popular indoor localization method [19,20] due to the wide availability of smart devices. Cameras were also employed in [7] to track the motions of the person under test. The stride length can be estimated by detecting and extracting several pieces of perspective information related to predefined markers and edges. The experimental results implied that the camera-based method was a promising way to detect all steps when the user was moving slowly, especially in an indoor environment. Recently, the researchers in [21] proposed a machine-learning-based step length estimation algorithm using cameras and smartphones. This research used a systematic feature selection algorithm to choose user-specific parameters from a large collection. The mean absolute errors of the step length estimations were 3.48 cm and 4.19 cm for a known test person and an unknown test person, respectively. However, the above camera-based techniques are flexibility-constrained because the camera must be placed at a fixed location and has a limited field of view. Moreover, their accuracy may be reduced in fast-moving situations or by obstacles appearing between the cameras and the person under test.
The gait patterns can also be detected by infrared thermography, such as in [8], where the best accuracy was found to be 91%. However, the drawback is that lasers are not common in daily usage because of the training requirements, the costly equipment, and the potential health concerns from long-term exposure.
Inertial sensors, packaged as an inertial measurement unit (IMU), can be used to collect gait-related parameters, which then help to estimate the human step length. An IMU generally consists of an accelerometer, a compass, and a gyroscope. Currently, most smart devices have built-in inertial sensors. The smart device can be held in the hand [22] or attached to the body, such as at the pelvis [23], which provides useful information and helps position the point of interest. References [19,20] estimated the human stride length based on inertial sensor measurements collected from a smartphone. The experimental results demonstrated that the step length can be estimated with an error rate of 4.63% for indoor scenarios. Considering a general step length of 0.7 m, the corresponding absolute error would be 3.24 cm. The error of step length estimation was reduced to 2% in [24] based on a back-propagation artificial neural network using an IMU placed on the foot. The research in [25] compared the estimation accuracy for different placements of the IMUs. Firstly, that work utilized only one inertial sensor on each shank, called the integrator-based method, providing an average accuracy of 91.21%. The accuracy was improved to 95.37% when two sensors were employed on each leg, namely the angle-based method. The maximum errors were 11.26 cm and 5.51 cm for the integrator-based and angle-based methods, respectively. Although the integrator-based method was simpler, the angle-based method achieved better accuracy in terms of step length estimation since it was not sensitive to the initial conditions and errors caused by double integration. However, experiments and analyses in outdoor environments are still missing. Moreover, a major disadvantage of using IMUs is that they typically suffer from accumulated error, which means the accuracy degrades over time.
Deep learning has been adopted to estimate human step length because it can learn the features of the data automatically and has shown excellent performance in different application domains, at the cost of powerful computing facilities. The deep-learning-based algorithm proposed in [26] can adapt to different ways of carrying the phone and does not require individual stature information or spatial constraints. The average error of this method was 3.01%, which means that, if the actual step length was 0.7 m, the corresponding error was within 2.1 cm. Paper [27] defined a deep-learning-based framework with an activity recognition model to regress the user's change in distance and step length. The average error of the proposed method was 2.1%, which was about 1.47 cm if the step length was 0.7 m. It is worth noting that the position of the smartphone (e.g., handheld position or pocket position) also influenced the estimation by around 5% [28]. The researchers in [29] investigated human step length and step width using wearable sensors in a computer-assisted rehabilitation environment. The results showed that, in a specific experimental environment, gait patterns could be detected, and the mean absolute errors were 0.2396 cm and 1.92 cm, respectively. However, the data in that paper were collected using specific equipment in a specific environment, rather than in normal indoor and outdoor propagation environments during daily human activities.
Therefore, in this paper, we aimed to propose a step length estimation technique that has high accuracy and is less privacy-invasive, less health-concerning, and less space-confined than the aforementioned techniques, without requiring powerful computing facilities as the deep-learning-based ones do.
Our previous work presented in [18] proposed an empirical path loss model to estimate the human step length in both indoor and outdoor scenarios in a static context rather than a dynamic one. Therefore, this paper aimed to estimate human step length in daily activities. In particular, a novel filtering technique is proposed in this paper, which was used along with the hardware transceivers and the empirical path loss model developed in our previous work [18] to estimate human step length accurately in both walking and jogging activities in both indoor and outdoor environments.
4. Experiment Setups
In this section, we detail our experimental settings. Similar to our previous work in [18], the Arduino Integrated Development Environment (IDE), XBee Configuration & Test Unit (X-CTU), Arduino UNO microcontroller boards, and XBee-PRO S2C wireless transceivers were employed in this experiment. The core communication technology used in the XBee-PRO S2C modules is the spread spectrum technique specified by the IEEE 802.15.4 standard for low-rate wireless personal area networks (LR-WPANs) [31]. In particular, each group of four data bits is mapped into one of 16 nearly orthogonal spreading sequences, each of which is 32 chips long. The resulting chip sequence is modulated onto the radio-frequency carrier in the 2.4 GHz band by the offset quadrature phase shift keying (O-QPSK) modulation scheme. The components of the transceivers are depicted in Figure 1. The parameters were configured as follows: transmission power (in dBm) and data rate (9600 bps). This is the most suitable configuration of the developed transceivers for measuring the distance between two ankles, as discovered from our previous experiments in [18].
The transceivers were attached to the inner side of the human ankles at the same height h, as shown in Figure 2. The distance between the two antennas was regarded as the real step length (in m). In our experiments, the transmitter and the receiver were placed on the medial side of the ankles of the subject under test so that the antennas faced each other, as shown in Figure 2b. This means that there existed an LOS path between the transceivers, even when the person under test was walking or jogging, and that no human body part appeared between them. As a result, this placement of the equipment can eliminate the shadowing effect caused by body parts. This intuitive prediction was confirmed in our previous work [18], where experiments were performed both off-body and on the ankles to compare the shadowing effect. The results in [18] showed that the shadowing effect caused by the human body was negligible in our experiments.
The two transceivers exchanged data packets continuously, and the micro-SD card assembled in the receiver recorded the RSSI values continuously. From the RSSI values, the on-ankle path loss can be calculated from the transmitted power (in dB) and the measured RSSI (cf. (5) in [18]). From (4) and (5), the distance between the two transceivers can then be obtained.
The following trial experiment in the indoor walking situation explores the relationship between the measured path loss values and the positions of the two ankles. Figure 3a plots the on-ankle path loss over time. During the first 0.64 s, the transmitter and the receiver initialize themselves and synchronize with each other. Once they are synchronized, it takes around 0.02 s for the hardware to measure and record each RSSI value onto the micro-SD card, as shown in Figure 3b.
After the initial synchronization phase, the transceivers may encounter erroneous transmissions from time to time when they temporarily fall out of synchronization. To cope with this, the Arduino UNO hardware was programmed so that, if an erroneous transmission occurs (i.e., the receiver does not receive the packet successfully), a very large path loss value (120 dB was chosen in our experiments) is recorded to the data file on the micro-SD card to flag this erroneous transmission. Thereby, in the later analysis, any erroneous transmission can be easily detected and omitted. As shown in Figure 3a, the temporary out-of-synchronization status was normally very short, and the Arduino UNOs could quickly synchronize again with each other. Hence, in general, the Arduino UNO transceivers were relatively stable and reliable.
Because of the modest computation capability of the Arduino UNO, the transceivers in our experiments were programmed to only transmit and receive data packets to record the RSSI values in order to avoid any unnecessary delay. Processing of the raw data was performed offline on a computer instead. It is also noted that we aimed to estimate the average step length over a certain period, rather than outputting the instant estimated step length values, to mitigate the randomness in the measurement process. As a result, the processing time of our algorithm had a negligible effect on the RSSI calculations.
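The offline clean-up of the flagged transmissions can be sketched as follows. This is a minimal illustration rather than the scripts used in the experiments: the function name `drop_flagged` and the sample values are hypothetical, and only the 120 dB flag value is taken from the text.

```python
# Sentinel path loss written to the micro-SD card when a packet is not received
# (value taken from the text; everything else below is illustrative).
ERROR_FLAG_DB = 120.0


def drop_flagged(path_loss_db):
    """Return the path loss samples with the error-flag entries removed."""
    return [pl for pl in path_loss_db if pl < ERROR_FLAG_DB]


# Hypothetical excerpt of a recorded data file (dB), not measured data.
recorded = [46.0, 120.0, 30.0, 54.0, 120.0, 47.5]
clean = drop_flagged(recorded)
print(clean)
```

Because the flag lies far above any plausible on-ankle path loss, a simple comparison is enough to separate failed receptions from valid samples.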
It was observed that the measured path losses had a periodic pattern. To explore the meaning of the peaks and troughs of the path loss, let us consider two points on the plot, (2.08 s, 30 dB) and (2.32 s, 54 dB), where the first is at a trough and the second is the following peak. Video footage was captured in tandem with the path loss measurements. Based on the time stamps, we obtained the video frames corresponding to these two points, as shown in Figure 3b,c. In Figure 3b, the two feet are aligned with each other. In other words, at the trough point, the distance between the two ankles is the shortest, which indicates that the pedestrian has moved the left leg from behind to the middle position and is about to step forward. Hence, a step is half-finished at the bottom points of Figure 3a. The step is fully finished in Figure 3c, where the peak point is located and the ankles are at the largest distance from each other. This means that the peak path loss value of 54 dB in the time duration [2.08 s, 2.32 s] corresponds to the step length. Note that 54 dB is not the globally largest path loss value in Figure 3a. For example, the peak path loss values at the five points at the time instants 0.92 s, 0.94 s, 2.88 s, 3.58 s, and 3.78 s were even bigger than 54 dB. In other words, the path loss corresponding to the step length is expected to be in a high range of the path loss values, but is not necessarily the largest value. Hence, to find the step length, it was necessary to examine the histogram of the experimental data.
The bar chart in Figure 4 depicts the probability histogram of the on-ankle path loss in this trial experiment. Clearly, the histogram shows a two-hump shape with the most likely path loss occurring at the peak density of 46 dB. The first, smaller hump corresponds to the half-finished steps, i.e., when the two feet are about to cross each other. The second, bigger hump corresponds to the events when the two feet are likely most separated from each other. The step length (i.e., the maximum distance between the two transceivers) may occur somewhere around the peak density rather than always at the peak density in the histogram. To demonstrate this point, let us consider two different moments when a path loss of 46 dB took place (cf. Figure 5a,b). These two figures suggest that, although the on-ankle path losses at these time instants were the same and both corresponded to the peak density in the histogram, the feet of the person under test were not in the same posture. This means that the path loss corresponding to the peak density did not always correspond to the step length due to the randomness of the propagation channel. This observation is confirmed again in Figure 5c,d, which show two maximum-distance events at two other time instants. The path loss at these instants corresponds to the second-highest density, rather than the peak one, in Figure 4.
From the aforementioned observations, we conjectured that the human step length can be estimated within a certain range around the peak density of the histogram. This is because the actual step length may occur before or after the peak density due to the randomness of the propagation channel caused by the dynamic motions of the person under test. Therefore, in the following experiments, we propose a filtering technique to discard outlier data to form a range of reliable path loss values for estimating the step length. The accuracy analyses are also mentioned in the next section.
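As a rough illustration of how such a probability histogram and its peak density can be obtained from the recorded samples, consider the sketch below. The 1 dB bin width, the helper names, and the sample values are assumptions made for this example only and are not taken from the experiments.

```python
import math
from collections import Counter


def probability_histogram(path_loss_db, bin_width_db=1.0):
    """Map each bin's lower edge (dB) to the fraction of samples in that bin."""
    counts = Counter(
        math.floor(pl / bin_width_db) * bin_width_db for pl in path_loss_db
    )
    total = len(path_loss_db)
    return {edge: counts[edge] / total for edge in sorted(counts)}


def peak_density_bin(histogram):
    """Return the lower edge of the bin with the highest probability."""
    return max(histogram, key=histogram.get)


# Hypothetical path loss samples (dB), not measured data.
samples = [30.0, 30.5, 31.0, 45.0, 46.0, 46.1, 46.2, 46.7, 47.0, 54.0]
hist = probability_histogram(samples)
peak = peak_density_bin(hist)
print(peak, hist[peak])
```

The filtering technique then keeps a window of bins around the second hump instead of relying on the single peak bin.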
5. Experimental Results and Analysis
In this section, experiments were conducted in four dynamic scenarios: indoor walking, outdoor walking, indoor jogging, and outdoor jogging. The indoor experiments were carried out in a corridor of a building, while the outdoor ones were conducted along a pavement, which can be regarded as an open area, as shown in Figure 6. The participant walked or jogged along a straight path with a length of 35.7 m. There were 50 steps and 38 steps in the walking and jogging scenarios, respectively. Therefore, the real average step length was 35.7/50 = 0.714 m for walking and 35.7/38 ≈ 0.939 m for jogging. In each scenario, the experiments were carried out 10 times with over 1500 data points in each dataset. Altogether, there were more than 15,000 data points for each scenario. In our previous work [18], we derived the empirical path loss model for the wireless channel between the two ankles in a static situation, as shown in (1). As mentioned above, the path loss is random in dynamic situations where the person under test is walking or jogging. Thus, we propose a filtering technique to apply along with the empirical model in (1) in order to eliminate the on-ankle path loss outliers. The resulting ranges of on-ankle path loss were then used to estimate the step length in the four motion scenarios. The following subsections present the experimental results and analyses for the four motion scenarios.
5.1. Empirical Threshold Pair
We propose a novel filtering technique to filter out the path loss outliers by setting a threshold pair, which consists of an upper threshold and a lower threshold. As these two thresholds work together, we searched for both thresholds jointly. As shown in Figure 3 and Figure 5, the path loss for the step length may be neither the maximum path loss value nor the path loss value corresponding to the peak density of the histogram. This is because of the randomness of the propagation channel caused by the dynamic movements of the person under test. Thus, it is important to consider a suitable range of the path loss values that might correspond to the maximum distance between the two ankles. To this end, based on the collected datasets, we first examined different combinations of the lower bound and the upper bound of this range to find the pair of boundaries that minimized the error between the average estimated step length and the true step length. The path loss values higher than the upper threshold or lower than the lower threshold were considered as outlier values.
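The exhaustive search over threshold pairs described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the reported results: `log_distance_inverse` is a generic log-distance model with made-up parameters that merely stands in for the empirical model of [18], the estimate is taken as the mean of the per-sample distances inside the window, and all sample values and candidate grids are hypothetical.

```python
def log_distance_inverse(pl_db, pl0_db=40.0, exponent=2.0, d0_m=0.5):
    """Hypothetical stand-in for the empirical model of [18]: path loss (dB) to distance (m)."""
    return d0_m * 10 ** ((pl_db - pl0_db) / (10.0 * exponent))


def mean_step_estimate(path_loss_db, lower_db, upper_db, to_distance):
    """Average distance over the samples kept by the (lower, upper) window."""
    kept = [pl for pl in path_loss_db if lower_db <= pl <= upper_db]
    if not kept:
        return None
    return sum(to_distance(pl) for pl in kept) / len(kept)


def best_threshold_pair(path_loss_db, true_step_m, lowers_db, uppers_db, to_distance):
    """Search all (lower, upper) pairs for the smallest relative error in percent."""
    best = None  # (relative error in %, lower threshold, upper threshold)
    for lower in lowers_db:
        for upper in uppers_db:
            if upper <= lower:
                continue
            estimate = mean_step_estimate(path_loss_db, lower, upper, to_distance)
            if estimate is None:
                continue
            error = abs(estimate - true_step_m) / true_step_m * 100.0
            if best is None or error < best[0]:
                best = (error, lower, upper)
    return best


# Hypothetical samples (dB) and candidate grids, for illustration only.
samples = [30.0, 38.0, 41.0, 44.0, 46.0, 47.0, 49.0, 52.0, 58.0, 61.0]
result = best_threshold_pair(
    samples, 0.714, range(36, 45), range(48, 60), log_distance_inverse
)
print(result)
```

Samples equal to a threshold are kept here, which matches the rule that only values higher than the upper threshold or lower than the lower threshold are treated as outliers.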
Figure 7 demonstrates the normalized errors of the step length estimations in the indoor walking and indoor jogging scenarios for different lower and upper thresholds. The relative (or normalized) error $\varepsilon_i$ in percentage is defined as:

$$\varepsilon_i = \frac{\left|\bar{d}_i - d_i\right|}{d_i} \times 100\%,$$

where $\bar{d}_i$ is the average estimated distance between two ankles under a certain experimental scenario, which involves 10 datasets, $d_i$ is the real step length, and $i$ is either $w$ for the walking scenario or $j$ for the jogging scenario.
Figure 7 shows that the (lower, upper) threshold pairs of (40 dB, 52 dB) and (40 dB, 56 dB) resulted in the average estimated step lengths being the closest to the true step lengths (i.e., the smallest normalized error) in the indoor walking and indoor jogging scenarios, respectively. Along with Figure 7, Table 1 and Table 2 show in more detail the estimated step length (cf. (4)), averaged over all ten datasets, for several pairs of (lower, upper) thresholds in the indoor walking and indoor jogging scenarios. Each cell of the tables contains three numbers. The average estimated step length in meters, which is the average result based on 10 experimental datasets, is located outside the brackets. Inside the brackets are the average absolute error in millimeters and the average relative error in percentage, respectively.

The average absolute error was calculated as the absolute difference between the average estimated step length and the real step length. Table 1 and Table 2 further confirm the observation gained from Figure 7 that the best pairs of (lower, upper) thresholds of the path losses were (40 dB, 52 dB) and (40 dB, 56 dB) for the indoor walking and indoor jogging cases, respectively. The average absolute and normalized estimation errors were just 10.15 mm and 1.42% for the indoor walking case, while these numbers were 4.40 mm and 0.47% for the indoor jogging case.
Similarly, Figure 8, Table 3, and Table 4 clearly show that the best (lower, upper) thresholds of the path losses used for estimating the average step length in the outdoor walking and jogging scenarios were (39 dB, 51 dB) and (42 dB, 54 dB), respectively. The average absolute and relative estimation errors for the former case were just 4.81 mm and 0.67%, while they were 10.84 mm and 1.15% for the latter one.
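The reported absolute and relative errors can be cross-checked against the real average step lengths implied by the 35.7 m path with 50 walking steps and 38 jogging steps. The short sketch below does this arithmetic; the variable and function names are chosen for this example only.

```python
PATH_LENGTH_M = 35.7
STEPS = {"walking": 50, "jogging": 38}

# Real average step length in millimeters for each activity.
real_step_mm = {
    activity: PATH_LENGTH_M / count * 1000.0 for activity, count in STEPS.items()
}

# (environment, activity) -> (absolute error in mm, relative error in %), as reported.
reported = {
    ("indoor", "walking"): (10.15, 1.42),
    ("indoor", "jogging"): (4.40, 0.47),
    ("outdoor", "walking"): (4.81, 0.67),
    ("outdoor", "jogging"): (10.84, 1.15),
}


def relative_error_percent(abs_error_mm, real_mm):
    """Relative error in percent for a given absolute error and real step length."""
    return abs_error_mm / real_mm * 100.0


# Relative errors recomputed from the absolute errors, rounded to two decimals.
recomputed = {
    key: round(relative_error_percent(abs_mm, real_step_mm[key[1]]), 2)
    for key, (abs_mm, _) in reported.items()
}

for key in reported:
    print(key, reported[key][1], recomputed[key])
```

Each recomputed percentage agrees with the reported one to two decimal places, so the absolute and relative figures are mutually consistent.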
It is noted that the estimation error in the indoor walking scenario was higher than that in the indoor jogging one. This can be explained as follows. In general, one might expect the error of the walking scenarios to be smaller than that of the jogging ones, as walking is a slower and more stable activity than jogging. This expectation was confirmed by the experimental results of the outdoor scenarios, where the errors for outdoor walking and jogging were 4.81 mm and 10.84 mm, respectively. However, this expectation may not always hold for an indoor environment since there are more multipaths indoors than outdoors. Because walking takes a longer time than jogging to complete a step, when multipath propagation occurred, more affected RSSI (thus path loss) values during that step were recorded to the dataset in the walking scenario than in the jogging one. As a result, the histogram of the path loss dataset collected for the indoor walking scenario may have some (local) peaks that are far more distinct from the remaining non-peak values, compared to the indoor jogging case. This phenomenon can be observed in Figure 9a (discussed later in Section 5.2), where the density of the path loss value of 46 dB was much more prominent than other non-peak values, while the local peaks in Figure 9b are less prominent compared to their surrounding values. This led to a slightly worse accuracy in the average step length estimation for indoor walking compared to indoor jogging.
5.2. Upper Threshold Analysis
The data analyses presented in Section 5.1 are critical as they allowed us to devise the novel filtering technique, which is detailed below.
In order to formulate the thresholds mathematically, we first depict the probability histogram for all the datasets (around 15,000 data points) collected in each experimental scenario, as shown in Figure 9. The probability histogram of the measured on-ankle path loss is represented by blue bars. It is noted that the plotted histogram has two humps, which correspond to the half-finished step, where the two feet are about to pass each other, and the fully finished step, when the two feet are most apart from each other, respectively. The plotted histogram can be well approximated by the probability density function (PDF) of a two-term Gaussian distribution model via the curve-fitting process, indicated by the solid green curve in Figure 9, with the general PDF equation:

$$f(x) = \sum_{i=1}^{2} a_i \exp\left[-\left(\frac{x - b_i}{c_i}\right)^{2}\right],$$

where $x$ is the path loss value and, for $i = 1, 2$, $a_i$ is the amplitude, $b_i$ is the centroid, and $c_i$ relates to the peak width of this Gaussian distribution (2). These coefficients can be found from the curve fitting of the two-term Gaussian distribution model. Ideally, the step length is related to the maximum on-ankle path loss. However, due to the randomness of the propagation channel, the actual step length may correspond to a non-peak path loss around the peak of the second hump. This means that the pair of (lower, upper) thresholds should capture a suitable range of the path loss values around the peak of the second hump. The values bigger than the upper threshold or smaller than the lower threshold were considered as outliers for estimating the path loss that corresponds to the step length. To capture a suitable window of the possible path loss values for estimating the step length, intuitively, the upper threshold should be located somewhere on the right slope of the second hump, while the lower threshold lies somewhere on the left slope of the second hump, i.e., in between the first hump and the second hump.
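From Figure 9, it is observed that the impact of the first hump on the right slope of the second hump was negligible. Thus, we can extract the second hump and approximate its right slope by the Gaussian distribution:

$$f_2(x) = a_2 \exp\left[-\left(\frac{x - b_2}{c_2}\right)^{2}\right], \quad (9)$$

where $a_2$, $b_2$, and $c_2$ are the amplitude, the centroid, and the width parameter of the second term of the fitted two-term Gaussian model. This observation is confirmed in Figure 9a, where the bell-shaped red dashed curve representing the Gaussian distribution in (9) coincides with the right slope of the second hump of the two-term Gaussian distribution. As a result, we can obtain the mean $\mu$ and the standard deviation $\sigma$ of the second hump based on the above Gaussian distribution in (9) as:

$$\mu = b_2, \qquad \sigma = \frac{c_2}{\sqrt{2}}.$$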
The above observations and analyses hold for all indoor/outdoor walking and indoor/outdoor jogging cases, as shown in Figure 9a–d.
The curve fitting parameters $a_2$, $b_2$, $c_2$, $\mu$, and $\sigma$ (amplitude, centroid, width parameter, mean, and standard deviation, respectively) for the second hump in the four scenarios can be found in Table 5. Since the path loss that corresponds to the step length is a random variable, its upper threshold should be determined as a function of both the mean value $\mu$ and the standard deviation value $\sigma$ of the second term of the two-term Gaussian distribution in (9). This philosophy is similar to the well-known calculation of the retransmission timeout (RTO) on the Internet, where the RTO is a function of both the mean value of the round-trip time (RTT) and its deviation.
Table 6 presents the values of this function of the mean and the standard deviation of the second hump, and the corresponding difference (in dB) between these values and the upper thresholds, which were worked out empirically from the actual measured data in Section 5.1. Table 6 clearly shows that the empirical upper thresholds in the indoor walking and jogging scenarios were both very well approximated by the mean plus one standard deviation, $\mu + \sigma$, with differences of only 0.3319 dB and 1.1391 dB, respectively. This finding makes sense because the upper threshold is equal to the mean path loss value $\mu$ plus a margin, which is equal to the standard deviation $\sigma$ in this case.
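For the indoor case, the step from the fitted second hump to the upper threshold can be sketched as below. The sketch assumes that each Gaussian term has the form a·exp(−((x − b)/c)²), so that the mean equals the centroid b and the standard deviation equals c/√2; this form, the function names, and the numerical values are assumptions of this example, not values from Table 5.

```python
import math


def gaussian_term(x, a, b, c):
    """One Gaussian term in the assumed form a * exp(-((x - b) / c) ** 2)."""
    return a * math.exp(-(((x - b) / c) ** 2))


def hump_mean_std(b, c):
    """Mean and standard deviation implied by centroid b and width parameter c."""
    return b, c / math.sqrt(2.0)


def indoor_upper_threshold(b, c):
    """Upper threshold as mean plus one standard deviation (indoor case in the text)."""
    mu, sigma = hump_mean_std(b, c)
    return mu + sigma


# Hypothetical fit parameters of the second hump, not values from Table 5.
a2, b2, c2 = 0.08, 46.0, 6.0 * math.sqrt(2.0)
mu, sigma = hump_mean_std(b2, c2)
upper = indoor_upper_threshold(b2, c2)
print(mu, sigma, upper)
```

With this parameterization, the fitted curve at the upper threshold has dropped to exp(−1/2), roughly 61%, of its peak amplitude.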
Similarly, the empirical upper thresholds in the outdoor walking and jogging scenarios were both very close to the corresponding function of the mean and the standard deviation of the second hump, with differences of merely 0.3541 dB and 0.3208 dB, respectively. The upper thresholds in the two indoor cases were higher than those in the outdoor scenarios because there were more multipaths indoors than outdoors; thus, the actual path loss that corresponds to the step lengths might vary more widely around its mean value.
5.3. Lower Threshold Analysis
As mentioned in Section 5.2, the lower threshold was located between the first hump and the second hump of the two-term Gaussian distribution, which means its value would be affected by both humps. Therefore, it was impossible to analyze the lower threshold based on a single hump, as was done for the upper threshold above. Thus, other techniques should be used to analyze the lower threshold. One possible technique is based on the cumulative distribution function (CDF) or the survival function. The survival function is complementary to the CDF. It indicates the probability that the path loss value is greater than or equal to a certain value. Figure 10 depicts the probability histogram, the CDF (the red curves), and the survival function (the bold green curves) of the measured path loss data for all four scenarios together with the lower thresholds (the black dashed lines), which were empirically found to be 39 dB and 42 dB in the outdoor walking and jogging cases, respectively, and 40 dB in the indoor cases, as detailed in Section 5.1.
Figure 10 reveals an interesting fact that the intersections between the empirical lower thresholds and the survival curves were around 0.68 in all four cases. In other words, the measured path loss value was bigger than or at least equal to the value of the lower threshold 68% of the time in all four scenarios. Path loss values between the lower threshold and the upper one should be considered as the potential path losses corresponding to the step lengths. Based on the above empirical measurements and statistical analyses, we deduced that the lower threshold can be numerically found as the corresponding path loss value when the survival rate reaches 0.68.
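The rule of reading the lower threshold off the empirical survival curve can be sketched as follows. The 1 dB search grid, the function names, and the sample data are assumptions made for this illustration; only the survival rate of 0.68 comes from the analysis above.

```python
import math


def survival_rate(samples, x):
    """Empirical probability that a sample is greater than or equal to x."""
    return sum(1 for s in samples if s >= x) / len(samples)


def lower_threshold(samples, target=0.68, step_db=1.0):
    """Largest grid value (dB) whose survival rate is still at least `target`."""
    x = float(math.floor(min(samples)))
    threshold = x
    # The survival rate never increases with x, so stop at the first failure.
    while x <= max(samples) and survival_rate(samples, x) >= target:
        threshold = x
        x += step_db
    return threshold


# Hypothetical path loss samples (dB): one sample at each integer from 26 to 75.
samples = list(range(26, 76))
print(lower_threshold(samples))
```

Because the empirical survival curve is a step function, a finer grid or interpolation between neighboring samples could be used when sub-decibel resolution is needed.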