#### **1. Introduction**

The Global Navigation Satellite System (GNSS) dominates outdoor positioning services because of its wide coverage and high accuracy. However, according to incomplete statistics, people spend about 80% of their time indoors. During the epidemic, medical staff and volunteers needed to track the positions of personnel in isolation hotels and isolation wards in real time. When an indoor fire occurs, rescuers need to know the exact location of trapped people in time. Indoor positioning is a basic technology in the construction of smart cities [1–3] and the tracking of pandemics [4]. In short, indoor positioning has broad application prospects, but GNSS cannot provide service indoors. Researchers have proposed using pedestrian dead reckoning (PDR), or the fusion of PDR and wireless sensors, to achieve indoor positioning [5]. Smartphone-based step detection is necessary for PDR to determine pedestrian trajectory information [6,7]. In addition, smartphones bring many conveniences to people's lives with their rich functions and applications. Among these functions, step detection plays a role in health care for obese patients, has become part of physical therapy to control chronic low back pain [8], helps monitor falls of the elderly [9], and can also be applied to daily fitness training [10]. Commercial products such as the Mi Band and Huawei Band record users' daily steps and then give health tips. Commonly used step detection algorithms include zero velocity update (ZUPT) [11,12], autocorrelation analysis [13], and peak detection [14].

**Citation:** Xu, Y.; Li, G.; Li, Z.; Yu, H.; Cui, J.; Wang, J.; Chen, Y. Smartphone-Based Unconstrained Step Detection Fusing a Variable Sliding Window and an Adaptive Threshold. *Remote Sens.* **2022**, *14*, 2926. https://doi.org/10.3390/rs14122926

Academic Editor: Liang Chen

Received: 27 April 2022 Accepted: 17 June 2022 Published: 19 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

ZUPT exploits the fact that the lower limb is momentarily static at a certain point in the walking process: when the foot is in contact with the ground, the walking speed is zero and the output values of the acceleration sensor and the angular velocity sensor are approximately zero. The ZUPT method generally requires the sensor to be fixed at a specific position on the lower limbs, such as the calf or the foot [11,12]. A freely carried smartphone clearly does not meet this basic condition.

When pedestrians walk continuously, there is a high correlation between consecutive gait cycles. Autocorrelation analysis counts steps by evaluating the correlation coefficient of two adjacent cycles. Pan [13] used autocorrelation analysis to count steps. The experimental results show that the average step counting accuracy can reach 97.8% when pedestrians dynamically switch the position in which the smartphone is carried, but the experiment did not consider how changes in the pedestrian's movement state at any time affect the step counting accuracy. Additionally, computing the correlation coefficient is computationally expensive, which affects the timeliness of the algorithm.
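The idea of comparing adjacent gait cycles can be illustrated with a minimal sketch. It is a generic illustration, not the method of Pan [13]: the function names, the use of the Pearson coefficient, and the assumption that the cycle length in samples is already known are all choices made here for the example.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat segment carries no gait information
    return cov / math.sqrt(var_a * var_b)

def adjacent_cycle_correlation(signal, start, period):
    """Correlate the cycle starting at `start` with the cycle that follows it.

    `period` is a hypothetical, already-known cycle length in samples.
    """
    first = signal[start:start + period]
    second = signal[start + period:start + 2 * period]
    return pearson(first, second)

# Synthetic periodic signal with a cycle of 25 samples (illustrative only).
walk = [math.sin(2 * math.pi * n / 25) for n in range(100)]
matched = adjacent_cycle_correlation(walk, 0, 25)     # correct cycle length
mismatched = adjacent_cycle_correlation(walk, 0, 12)  # wrong cycle length
```

With the correct cycle length the coefficient is close to 1, whereas a wrong candidate length gives a low or negative value. Searching over candidate lengths in this way is what makes the approach costly.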

Peak detection also counts steps based on the periodicity of continuous walking. Unlike autocorrelation analysis, peak detection takes the number of peaks (or valleys) in the acceleration sensor or gyroscope sensor signal as the number of steps. However, pseudo-peaks limit the accuracy of the peak detection method. At present, pseudo-peaks are mainly eliminated by setting a threshold. Xu et al. [7] used a fixed threshold to remove pseudo-peaks, which achieves high accuracy when the pedestrian's motion and the smartphone carrying mode are constrained. Cho et al. [14] used the sign-of-slope method and an average threshold method to realize peak detection; Zhang et al. [15] used the mean acceleration amplitude of the previous window to dynamically update the acceleration threshold; Wang [16] adaptively selected the acceleration threshold according to the average difference between peaks and valleys per unit time. Ryu et al. [17] proposed an adaptive threshold method that uses the average value of the first five consecutive peaks as the adaptive threshold. However, these algorithms are prone to misjudgment when multiple motion modes occur. Dirican et al. [18] proposed a threshold-based unconstrained step counting algorithm. Unlike approaches that set and update thresholds directly from the acceleration data, this method applies separate thresholds to the real and imaginary parts of the data transformed by the fast Fourier transform (FFT) and updates the thresholds by averaging the current and previous thresholds. This method achieves an adaptive update of the thresholds and can adapt to a variety of unconstrained states, but its step counting accuracy in the running state is only 41.7%.
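A minimal sketch of threshold-based peak detection may help to show how a threshold suppresses pseudo-peaks. It is a generic illustration and does not reproduce any of the cited algorithms; the function name, the sample values and the threshold of 10.5 m/s² are assumptions chosen for the example.

```python
def detect_peaks(signal, threshold):
    """Return indices of local maxima whose value exceeds `threshold`.

    A sample is a local maximum when the slope changes from rising to
    non-rising (a simple sign-of-slope test).
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        rising_before = signal[i - 1] < signal[i]
        not_rising_after = signal[i] >= signal[i + 1]
        if rising_before and not_rising_after and signal[i] > threshold:
            peaks.append(i)
    return peaks

# Illustrative overall acceleration samples in m/s^2: two real step peaks
# (11.5 and 11.8) and two small pseudo-peaks (10.1 and 9.9).
samples = [9.8, 10.1, 9.9, 11.5, 9.6, 9.9, 9.7, 11.8, 9.5]

all_maxima = detect_peaks(samples, threshold=0.0)    # pseudo-peaks included
step_peaks = detect_peaks(samples, threshold=10.5)   # pseudo-peaks rejected
```

Without a threshold all four local maxima would be counted as steps. A fixed threshold works for this signal but would fail if the motion state changed and the real peaks became smaller, which is the motivation for adaptive thresholds.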

Besides pseudo-peaks, the choice of sliding window also affects step counting accuracy. Currently, the sliding window size is mainly based on the time required for a single step, so that the step count can be output continuously without delay for each step; however, this requires the window size to closely match the duration of a single step. Ning et al. [19] set three sliding window sizes according to three states of motion (going up and down, walking and running), but the sizes of these windows are still fixed, so it is difficult to adapt them to different users. Kang et al. [20] proposed a variable sliding window method, which determines the window size adaptively according to the walking frequency, but errors in the estimated walking frequency affect the choice of window size. It can be seen that the sliding window size is difficult to determine when continuous single-step counting is required. As long as the influence of the window size on real-time performance stays within an acceptable range, an alternative is to use a sliding window that is not based on single-step detection, provided that the loss of connection between neighboring windows caused by a large window is resolved.

In summary, under complex unconstrained conditions, where the motion environment, the motion state and the smartphone carrying mode change randomly and interference is present, step detection based on peak detection faces two problems: the adaptive threshold is difficult to update dynamically and accurately, and the sliding window size is difficult to determine accurately. Both restrict the accuracy of peak detection. In view of this, this paper proposes a smartphone-based unconstrained step detection method fusing a variable sliding window and an adaptive threshold. The algorithm uses the minimum peak filtered by the sliding window as the adaptive threshold, which solves the problem that the peak threshold is difficult to update adaptively when the pedestrian's state changes. The sliding window is variable: rather than being based on the commonly used single-step window, it is built on a fixed sliding window of 1 s, which preserves the close connection between windows. At the same time, a cooperating time threshold solves the problem that the authenticity of the first and last peaks in a fixed sliding window is difficult to judge. With this algorithm, step counting accuracy can be maintained under complex unconstrained conditions. The proposed method makes three contributions. First, it allows users to carry smartphones in multiple positions for indoor positioning. Second, users can freely switch the carrying mode and the state of motion during indoor positioning. Third, it achieves high step detection accuracy on smartphones in different price ranges.

Section 1 introduces the research background and existing problems of step detection algorithms. Section 2 describes the preprocessing and motion state recognition stages of the step detection algorithm. Section 3 proposes the smartphone-based unconstrained step detection method fusing a variable sliding window and an adaptive threshold. Section 4 evaluates the step counting performance of the proposed method in constrained and unconstrained states through 50 groups of experiments. Section 5 summarizes the work of this paper.

#### **2. Step Detection Preprocessing and Motion State Recognition**

Because the sensors in a smartphone can capture the periodic changes in a pedestrian's motion, a smartphone can be used to count steps. Both acceleration sensors and gyroscope sensors can detect the periodic change during walking. However, the gyroscope sensor is less sensitive to this change, so the acceleration sensor is mostly used for step detection [21].

With each step forward, the pedestrian produces a vertical motion and a forward motion. The vertical axis of the three-axis accelerometer produces an approximately sinusoidal wave, and the number of peaks (or valleys) detected can be used as the number of steps. However, the position in which the smartphone is carried varies, so it is difficult to identify which single axis is vertical. The influence of sensor attitude can be reduced by calculating the overall acceleration, which is given by Formula (1).

$$a_c(t) = \sqrt{a_x^2 + a_y^2 + a_z^2} \tag{1}$$

In the formula, *ax*, *ay* and *az* represent the accelerometer output values of the *X*-axis, *Y*-axis and *Z*-axis at time *t*, and *ac*(*t*) represents the overall acceleration.
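Formula (1) can be sketched in a few lines. The function names and the sample values are illustrative assumptions; the example only shows that the overall acceleration does not depend on which axis happens to be vertical.

```python
import math

def overall_acceleration(ax, ay, az):
    """Overall acceleration of one three-axis sample, as in Formula (1)."""
    return math.sqrt(ax * ax + ay * ay + az * az)

def overall_acceleration_series(samples):
    """Apply Formula (1) to a sequence of (ax, ay, az) tuples."""
    return [overall_acceleration(ax, ay, az) for ax, ay, az in samples]

# The same gravity reading (about 9.81 m/s^2) seen in three hypothetical
# phone orientations: flat, upright and on its side.
orientations = [(0.0, 0.0, 9.81), (0.0, 9.81, 0.0), (9.81, 0.0, 0.0)]
magnitudes = overall_acceleration_series(orientations)
```

All three orientations yield the same magnitude, which is why the overall acceleration is less sensitive to sensor attitude than any single axis.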

The signal characteristics of the triaxial acceleration and the overall acceleration were compared by experiment. Figure 1 shows the triaxial acceleration signals and the overall acceleration signal during unconstrained walking. The experiment covers three states: walking with the smartphone held flat, normal walking with the smartphone held next to the ear, and running with the smartphone in a swinging hand. It can be seen from Figure 1 that the most sensitive axis differs across the three states, being the *Z*-axis, *Y*-axis and *X*-axis in turn, whereas the overall acceleration signal shows significant periodic changes throughout. Therefore, the overall acceleration is adopted in the step detection algorithm in this paper.

**Figure 1.** Triaxial acceleration and overall acceleration when walking at will.

The accuracy of a smartphone's acceleration sensor is low, which leads to many burrs in the original overall acceleration signal. The signal therefore needs to be filtered to reduce the interference of these burrs before step detection. In addition, pedestrians pass through multiple motion states while moving, and different motion states often have different peak thresholds. Therefore, it is necessary to identify the motion state before step detection. The following subsections describe the methods of data preprocessing and motion state recognition for step detection.

#### *2.1. Data Preprocessing of Step Detection*

To remove Gaussian white noise, Guo et al. [22] preprocessed the original overall acceleration data with the weighted moving average method and the Kalman filter, and then used a Butterworth filter to refine the denoising. However, such extensive and complicated filtering increases the data processing time. Zhang et al. [15] adopted a sliding window filter to weaken the multi-peak phenomenon; this is a common and effective smoothing method, but it loses characteristic values of the data. Alabadleh et al. [23] used the Kalman filter and a high-pass filter to smooth the data, which removed gravity and outliers, but the algorithm is relatively complex. Liu et al. [24] adopted a low-pass filter to eliminate signal noise, and the low-pass filter preserves the characteristics of the data well. In this paper, a Finite Impulse Response (FIR) low-pass filter based on a Hamming window [25] is used to preprocess the overall acceleration signal, where the order of the filter is 10 and the length of the Hamming window is 11. As the actual output frequency of some smartphones does not match the nominal sampling frequency, the sampling frequency used in this paper is the actual output frequency, and the pass-band frequency is 5 Hz. Figure 2 shows the original overall acceleration signal, and Figure 3 compares a sliding window filter (with a window size of 15 samples) and the FIR low-pass filter. The FIR low-pass filter retains the large and small peaks of the original overall acceleration signal (shown by the black arrows in Figure 3), which reflect the different characteristics of the left and right footsteps during walking.

**Figure 2.** Raw overall acceleration signal.

**Figure 3.** FIR low-pass filter and sliding window filter.
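The filter described above (order 10, Hamming window of length 11) can be sketched with the standard windowed-sinc design. This is a minimal sketch under stated assumptions, not the authors' implementation: the 5 Hz pass-band frequency is treated as the cutoff, a 50 Hz output rate is assumed for illustration, the taps are normalized to unity gain at DC, and the function names are hypothetical.

```python
import math

def hamming_lowpass_taps(order, cutoff_hz, fs_hz):
    """Design FIR low-pass taps by the windowed-sinc method (Hamming window)."""
    fc = cutoff_hz / fs_hz  # normalized cutoff in cycles per sample
    taps = []
    for n in range(order + 1):
        k = n - order / 2  # offset from the centre tap
        if k == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / order)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # unity gain at 0 Hz

def fir_filter(signal, taps):
    """Causal FIR filtering; samples before the start repeat the first sample."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            acc += h * signal[max(n - k, 0)]
        out.append(acc)
    return out

# Assumed sampling rate of 50 Hz; the paper uses each phone's actual rate.
taps = hamming_lowpass_taps(order=10, cutoff_hz=5.0, fs_hz=50.0)
```

A step-rate component at about 2 Hz passes through this filter almost unchanged, while a 20 Hz burr component is strongly attenuated. The output is delayed by half the filter order (5 samples), which a peak detector needs to account for when timestamps matter.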

#### *2.2. Motion State Recognition*

A single step is the action in which a pedestrian lifts one foot off the ground, moves it to a new position and puts it back on the ground [17]. Figure 4 shows the decomposition of a pedestrian's single-step action. Since the step detection algorithm in this paper is mainly applied to PDR, the forward-motion states of stroll walking, normal walking and running are considered, which is a common way of dividing motion states.

**Figure 4.** Single step action decomposition.

When pedestrians are in different states of motion, the duration and the peak acceleration of a single step are different. If a single fixed time threshold and a single peak threshold are used for step counting, it is difficult to ensure accuracy. Identifying the state of motion and setting or updating the thresholds accordingly can effectively improve the step counting accuracy. Zhang [26] used a finite state machine to distinguish whether the pedestrian is at rest or in motion, but did not divide the motion state further. Chen et al. [27] set a maximum acceleration threshold to identify the state of motion, based on the inherent correlation between the state of motion and the maximum acceleration. With this method, it is easy to misjudge the state of motion under more complex unconstrained conditions. For example, the acceleration caused by the arm swing of a hand-held smartphone during stroll walking is similar to that of normal walking with the smartphone held flat. Generally speaking, the step frequency increases from stroll walking to normal walking to running, so the state of motion can be identified from the step frequency. Using FFT, the time domain information can be converted into frequency domain information, and, excluding the first (DC) point, the point with the largest amplitude is taken as the step frequency.
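The step frequency estimate described above can be sketched as follows. For a self-contained example, a direct discrete Fourier transform stands in for the FFT (both give the same spectrum; an FFT routine would be used in practice for speed). The function name, the 50 Hz sampling rate and the synthetic 1.8 Hz signal are assumptions made for illustration.

```python
import cmath
import math

def dominant_frequency(samples, fs_hz):
    """Return the frequency (Hz) of the largest non-DC spectral component."""
    n = len(samples)
    best_bin, best_mag = 0, -1.0
    # Bin 0 is the DC point and is skipped; bins above n/2 mirror the lower ones.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * fs_hz / n

# Synthetic overall acceleration: gravity offset plus a 1.8 Hz step component.
fs = 50.0
window = [9.81 + 2.0 * math.sin(2 * math.pi * 1.8 * i / fs) for i in range(256)]
step_frequency = dominant_frequency(window, fs)
```

With 256 samples at 50 Hz the frequency resolution is about 0.2 Hz, so the estimate lands on the spectral bin nearest to the true step frequency rather than on the exact value.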

In this paper, 50 experimental tests with 25 people show that the step frequency of continuous stroll walking is below 1.6 Hz, that of continuous normal walking is below 2 Hz, and that of continuous running is 2–3.5 Hz. Given the navigation context and the non-competitive state of the pedestrian, frequencies above 3.5 Hz are mostly caused by interference such as body shaking, typing or browsing videos. Therefore, frequencies above 3.5 Hz are regarded as an interference state in this paper. When the state of motion is stable, the frequency calculated by FFT is close to the real step frequency. To identify the step frequency accurately, the FFT requires at least 256 samples. In this paper, FFT judgment is not enabled while fewer than 256 samples are available. Once enough samples are available, the FFT window size is set to 256 samples and the window slides by 1 s (the actual number of samples output in 1 s). Because the FFT window spans 256 samples (about 5 s), multiple switches of the state of motion within the window reduce the accuracy of the FFT-based step frequency estimate. In order to prevent large step counting errors caused by misjudgment, the frequency threshold should be as small as possible. Theoretically, the time threshold should correspond to the step frequency, but it is found that in the walking states (stroll walking and normal walking) the time required for a single step is similar to that of running. Therefore, to avoid rejecting true peaks because of the time threshold, the time thresholds of all motion states are set to small values in this paper. Based on many experimental tests and references, the spectra of stroll walking, normal walking and running are shown in Figure 5, and the corresponding step frequency thresholds, time thresholds and peak thresholds are listed in Table 1.
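The frequency ranges and the windowing rule stated above can be sketched as follows. The function names and the handling of values exactly on a boundary (for example, 2 Hz is assigned to running) are assumptions made here; the time and peak thresholds of Table 1 are not reproduced.

```python
def classify_motion_state(step_frequency_hz):
    """Map a step frequency to a motion state using the ranges in the text."""
    if step_frequency_hz < 1.6:
        return "stroll walking"
    if step_frequency_hz < 2.0:
        return "normal walking"
    if step_frequency_hz <= 3.5:
        return "running"
    return "interference"

def fft_windows(num_samples, samples_per_second, window=256):
    """Yield (start, end) index pairs of successive FFT windows.

    No window is produced until `window` samples exist; afterwards the
    window advances by one second of samples at a time.
    """
    end = window
    while end <= num_samples:
        yield (end - window, end)
        end += samples_per_second

# Example: 400 samples recorded at an assumed rate of 50 samples per second.
windows = list(fft_windows(400, 50))
states = [classify_motion_state(f) for f in (1.2, 1.8, 2.5, 4.0)]
```

In this example three windows are produced, each overlapping the previous one by 206 samples, so a change of motion state only gradually comes to dominate the spectrum. This overlap is the reason a window containing several state switches gives a less reliable frequency estimate.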

**Figure 5.** Spectrum of pedestrian stroll walking, normal walking and running: (**a**) Stroll walking (**b**) Normal walking (**c**) Running.


