**Iterative High-Accuracy Parameter Estimation of Uncooperative OFDM-LFM Radar Signals Based on FrFT and Fractional Autocorrelation Interpolation**

### **Yifei Liu \* , Yuan Zhao , Jun Zhu, Ying Xiong and Bin Tang**

School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; zy\_uestc@outlook.com (Y.Z.); uestczhujun@163.com (J.Z.); Xiongy@uestc.edu.cn (Y.X.); BinT@uestc.edu.cn (B.T.)

**\*** Correspondence: flyliu97@foxmail.com; Tel.: +86-159-2874-2900

Received: 3 September 2018; Accepted: 17 October 2018; Published: 19 October 2018

**Abstract:** To improve the parameter estimation performance of uncooperative Orthogonal Frequency Division Multiplexing (OFDM) Linear Frequency Modulation (LFM) radar signals, this paper proposes an iterative high-accuracy method based on the Fractional Fourier Transform (FrFT) and Fractional Autocorrelation (FA) interpolation. Two iterative estimators, for the rotation angle and for the center frequencies, are derived from the analytical formulations of the OFDM-LFM signal. Both estimators are designed by measuring the residual terms between the quasi peak and the real peak in the fractional spectrum, which arise from the finite sampling data. Spectral leakage caused by the multiple components of the OFDM-LFM signal is also eliminated by sequentially removing the strong coefficients in the fractional spectrum through an iterative process. The method flow is given, and its superior performance is demonstrated by the simulation results.

**Keywords:** uncooperative sensor signal processing; MIMO radar; fractional Fourier transform; fractional autocorrelation interpolation

### **1. Introduction**

As a novel radar system, the Multiple-Input Multiple-Output (MIMO) radar employs multiple transmitting antennas to emit mutually orthogonal waveforms and uses multiple receiving antennas to process the echo signals simultaneously [1]. Subject to current technical conditions, the coherent MIMO radar technique is commonly used in modern MIMO radar systems [2].

In this paper, we focus on the high-accuracy parameter estimation of uncooperative Orthogonal Frequency Division Multiplexing (OFDM) Linear Frequency Modulation (LFM) signals, which have been widely used in coherent MIMO radar systems. In the past decades, much research has been conducted on OFDM-LFM waveform design [1–3]. However, only a few studies have discussed parameter estimation for uncooperative OFDM-LFM signals in electronic warfare systems. The signal model of intercepted MIMO signals based on a single-channel receiver has been analyzed in the literature [4,5]. Moreover, an improved Multiple Wigner–Hough Transform (MWHT) [6] was proposed to enhance the performance of signal detection and parameter estimation for OFDM-LFM signals. Other estimation algorithms based on likelihood estimators or optimization methods for multicomponent LFM signals were applied in [7–9]. However, these earlier algorithms have limited estimation accuracy and efficiency due to the cross-term and picket fence effects. Besides, most of these algorithms also lack computational efficiency, making them more difficult and expensive to realize [10].

Based on the fast algorithm of the Fractional Fourier Transform (FrFT) proposed by Ozaktas [11], related research [12–17] has brought the application of the digital FrFT (DFrFT) to maturity. On the one hand, fractional derivatives and calculus in the complex plane were studied in [13–16], which is beneficial for establishing fractional models in engineering. Furthermore, the contributions in [17] proposed fractional geometric calculus and extended fractional calculus to any dimension. On the other hand, the analytical FrFT formulations of multicomponent LFM signals were introduced in [10]. The conventional DFrFT utilizes a coarse-fine search strategy to improve the estimation accuracy: a crude estimate is first obtained by searching for the maximum FrFT coefficient of the received data, and the result is then refined by modified methods such as Newton-type methods [8] and interpolation methods [10,18]. However, these methods require numerous extra calculations and have difficulty handling OFDM-LFM signals.

Inspired by the recently-developed fast iterative interpolated beamforming estimation method [19], we propose a fast, high-accuracy estimator for uncooperative OFDM-LFM signals based on the DFrFT and Fractional Autocorrelation (FA). We refer to the proposed method as the Fast Iterative Interpolated DFrFT (FII-DFrFT); it exhibits desirable convergence properties and the same order of computational complexity as the Discrete Fourier Transform (DFT).

The rest of this paper is organized as follows. In Section 2, the signal model of the intercepted uncooperative OFDM-LFM signal and the analytical formulations of DFrFT for this signal are given. In Section 3, the proposed FII-DFrFT is described. Section 4 gives the numerical simulation of the proposed algorithm, and some conclusions are drawn in Section 5.

### **2. Signal Model and FrFT**

Let us consider a single-channel reconnaissance receiver and an adversary MIMO radar system with *M* transmitters. Assume that this radar employs OFDM-LFM waveforms, which were first introduced into the design of MIMO radar systems by F. Cheng [1]. The signal of the *m*-th transmitter is given as [1]:

$$s_m\left(t\right) = u_m\left(t\right)e^{j2\pi f_0 t}, \quad 1 \le m \le M \tag{1}$$

$$u_m\left(t\right) = \frac{1}{\sqrt{T_p}}\,\mathrm{rect}\!\left(\frac{t}{T_p}\right) e^{j2\pi\left(m f_\Delta t + \frac{1}{2}\mu_0 t^2\right)} e^{j\varphi_m}, \quad 1 \le m \le M \tag{2}$$

where *f*<sub>0</sub> denotes the carrier frequency of the victim radar system, *T<sub>p</sub>* is the pulse duration, *f*<sub>Δ</sub> is the frequency step between two adjacent transmitters, *μ*<sub>0</sub> is the chirp rate and *φ<sub>m</sub>* is the initial phase of the *m*-th transmitting signal. Here, we also assume $\mu_0 T_p \ll f_0$ [1].
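The waveform family in Equations (1) and (2) is straightforward to synthesize numerically. The sketch below (Python with NumPy; all parameter values are illustrative choices, not the paper's) generates the baseband sum of *M* chirps that share one chirp rate and are offset in center frequency by *f*<sub>Δ</sub>, with the initial phases set to zero for simplicity:

```python
import numpy as np

def ofdm_lfm_pulse(f_i, f_delta, mu_0, n_tx, f_s, t_p):
    """Baseband sketch of Eqs. (1)-(2): n_tx chirps sharing the chirp rate mu_0,
    with center frequencies f_i + m*f_delta, sampled at f_s over one pulse T_p.
    Initial phases are set to zero here for simplicity."""
    t = np.arange(int(t_p * f_s)) / f_s          # sample instants within the pulse
    return sum(np.exp(2j * np.pi * ((f_i + m * f_delta) * t + 0.5 * mu_0 * t**2))
               for m in range(n_tx))

# Illustrative values: 3 transmitters, f_s = 50 MHz, T_p = 10 us
x = ofdm_lfm_pulse(f_i=5e6, f_delta=2e6, mu_0=1e12, n_tx=3, f_s=50e6, t_p=10e-6)
```

At *t* = 0 all three components add coherently, so the first sample has magnitude 3.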

Therefore, the MIMO radar signal intercepted by the reconnaissance receiver can be written as:

$$r\left(t\right) = \sum_{m=0}^{M-1} A_m s_m\left(t\right) + \omega\left(t\right) \tag{3}$$

where *A<sub>m</sub>* is the complex constant amplitude of the *m*-th subpulse and *ω*(*t*) represents zero-mean white Gaussian noise with variance *σ*<sup>2</sup>. The receiver first detects the observed signal energy and estimates the carrier frequency; here, it is assumed that these steps have been accomplished [20,21]. The detected pulse observations are then demodulated to an intermediate frequency and sampled at an appropriate frequency *f<sub>s</sub>* that satisfies the bandpass sampling theorem. Thus, we can collect *N* successive samples of the signal pulse, represented as:

$$r\left[n\right] = \sum_{m=0}^{M-1} A_m e^{j2\pi\left[f_m n T_s + \frac{1}{2}\mu_0 (n T_s)^2\right]} e^{j\varphi_m} + \omega\left[n\right] \tag{4}$$

where *T<sub>s</sub>* = 1/*f<sub>s</sub>*, *f<sub>m</sub>* = *f<sub>I</sub>* + *m f*<sub>Δ</sub>, *f<sub>I</sub>* denotes the demodulated intermediate frequency and *n* = 0, 1, ⋯, *N* − 1 (*N* = *T<sub>p</sub> f<sub>s</sub>*). The classical definition of the FrFT [11] is:

$$X_\alpha\left(u\right) = \int_{-\infty}^{\infty} K_\alpha\left(t, u\right) x\left(t\right) dt \tag{5}$$

where *K<sup>α</sup>* (*t*, *u*) is the kernel function with:

$$K_\alpha\left(t,u\right) = \begin{cases} B_\alpha \exp\left[j\pi\left(\left(u^{2}+t^{2}\right)\cot\alpha - 2ut\csc\alpha\right)\right], & \alpha \neq k\pi \\ \delta\left(t-u\right), & \alpha = 2k\pi \\ \delta\left(t+u\right), & \alpha = \left(2k+1\right)\pi \end{cases} \tag{6}$$

and $B_\alpha = \sqrt{1 - j\cot\alpha}$. Here, *α* = *pπ*/2 is called the rotation angle, *p* is the order of the FrFT and *u* is the spectral parameter. We employ the fast digital algorithm of the FrFT [11], which is represented as:

$$X_\alpha\!\left(\frac{U}{2\Delta x}\right) = \frac{B_\alpha}{2\Delta x} e^{-j\pi \tan\left(\frac{\alpha}{2}\right)\left(\frac{U}{2\Delta x}\right)^2} \sum_{n=-N}^{N} e^{j\pi \csc\alpha\left(\frac{U-n}{2\Delta x}\right)^2} e^{-j\pi \tan\left(\frac{\alpha}{2}\right)\left(\frac{n}{2\Delta x}\right)^2} x\!\left(\frac{n}{2\Delta x}\right) \tag{7}$$

where $U = u \cdot 2\Delta x$ and $\Delta x = \sqrt{N}$.
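The sampled transform can be checked with a direct evaluation of the kernel in Equation (6) on the grid $t_n = n/(2\Delta x)$. The sketch below is for verification only (it is $O(N^2)$, not the fast $O(N\log N)$ algorithm of [11]), and the rotation angle, grid size and peak bin are illustrative choices; it confirms that a chirp whose rate matches the rotation angle concentrates at a single *u* bin:

```python
import numpy as np

def dfrft(x, alpha):
    """Direct sampled evaluation of Eqs. (5)-(6): x holds 2N+1 samples taken at
    t_n = n/(2*dx), n = -N..N, and the spectrum is returned on the same u grid,
    with dx = sqrt(N). O(N^2) sketch, not the fast chirp-convolution algorithm."""
    N = (len(x) - 1) // 2
    dx = np.sqrt(N)
    g = np.arange(-N, N + 1) / (2 * dx)          # shared t and u grid
    B = np.sqrt(1 - 1j / np.tan(alpha))
    # kernel exponent: j*pi*[(u^2 + t^2)cot(alpha) - 2*u*t*csc(alpha)]
    K = np.exp(1j * np.pi * ((g[:, None]**2 + g[None, :]**2) / np.tan(alpha)
                             - 2 * g[:, None] * g[None, :] / np.sin(alpha)))
    return B / (2 * dx) * K @ x

# A chirp with rate -cot(alpha0) and center frequency u0*csc(alpha0)
# concentrates at u0 when transformed at angle alpha0:
N, alpha0, U0 = 128, 2.0, 20                     # U0: target peak index in -N..N
dx = np.sqrt(N)
t = np.arange(-N, N + 1) / (2 * dx)
f0 = (U0 / (2 * dx)) / np.sin(alpha0)
x = np.exp(1j * np.pi * (-1 / np.tan(alpha0)) * t**2 + 2j * np.pi * f0 * t)
peak = np.argmax(np.abs(dfrft(x, alpha0))) - N   # back to a U index
```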

The OFDM-LFM signal is reformulated into multiple impulses only for a particular *p* in the FrFT domain, while the white Gaussian noise is distributed evenly over the (*α*, *U*) plane. After peak searching, the estimated coordinates $(\hat{\alpha}_0, \hat{U}_m)$ can be used to obtain the estimators of the OFDM-LFM signal parameters as [10]:

$$\begin{cases} \hat{\mu}_0 = -\cot\left(\hat{\alpha}_0\right)\dfrac{f_s^2}{N} \\[4pt] \hat{f}_m = \hat{U}_m \csc\left(\hat{\alpha}_0\right)\dfrac{f_s}{N} \\[4pt] \hat{f}_\Delta = \dfrac{1}{M-1}\sum\limits_{m=1}^{M-1}\left(\hat{f}_m - \hat{f}_{m-1}\right) \\[4pt] \hat{A}_m = \dfrac{\left|X_{\hat{\alpha}_0}(\hat{U}_m)\right|}{\Delta x \left|B_{\hat{\alpha}_0}\right|} \end{cases} \tag{8}$$

However, this estimation performance depends on the grid size used for searching, while ideal impulses require that Equation (7) be computed on an infinite number of grid points. In practice, due to the finite sampling data and the leakage of the other components' energy, there always exist residual terms between the estimated quasi peaks $(\alpha_B, U_{Bm})$ and the real peaks $(\alpha_0, U_m)$, where $\alpha_B$ and $U_{Bm}$ denote the biased estimates. We write the residual terms as $\delta_0$ and $\varepsilon_m$, with $\alpha_0 = \alpha_B + \delta_0\Delta\alpha$ and $U_m = U_{Bm} + \varepsilon_m$, where $\Delta\alpha$ is the coarse searching interval of the rotation angle $\alpha$. In addition, it is reasonable to assume that $\delta_0, \varepsilon_m \in [-0.5, 0.5]$. Therefore, the residual terms are the decisive factor affecting the parameter estimation precision in Equation (8). Through conventional algorithms such as Newton-type methods [8] and interpolation [18], the residual terms can be estimated. However, the former suffers from a huge computational cost, and the latter has only been developed for monocomponent signals.
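The mapping in Equation (8) between a fractional-domain peak and the physical parameters can be sanity-checked with a quick round trip (the sampling rate, pulse length, chirp rate and center frequency below are illustrative assumptions):

```python
import numpy as np

# Round trip through Eq. (8): mu0 = -cot(alpha0)*fs^2/N and f_m = U_m*csc(alpha0)*fs/N.
fs, N = 50e6, 500                      # illustrative sampling rate / pulse length
mu0_true, fm_true = 1e12, 8e6          # assumed chirp rate and center frequency

alpha0 = np.arctan2(1.0, -mu0_true * N / fs**2)   # cot(alpha0) = -mu0*N/fs^2, alpha0 in (0, pi)
U_m = fm_true * N * np.sin(alpha0) / fs           # invert f_m = U_m*csc(alpha0)*fs/N

mu0_est = -fs**2 / (N * np.tan(alpha0))           # Eq. (8), first line
fm_est = U_m * fs / (N * np.sin(alpha0))          # Eq. (8), second line
```

Using `arctan2(1, cot)` keeps the recovered rotation angle on the branch (0, π), so the chirp-rate sign survives the round trip.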

### **3. The Proposed Method**

Our proposed estimation method is inspired by the multiple-component estimator in [19], which was designed for direction-of-arrival estimation and implemented with the DFT. However, to use that idea for OFDM-LFM radar signal parameter estimation, some improvements to the DFrFT are required.

Substituting Equation (3) into Equation (5) and ignoring the noise term, the energy of the OFDM-LFM signal concentrates in the DFrFT domain:

$$X_{\alpha_0}\left(U\right) = B_{\alpha_0} e^{j\pi U^2 \cot\alpha_0} \sum_{m=0}^{M-1} A_m e^{j\varphi_m}\, \delta\!\left[2\pi\left(m f_\Delta - U\csc\alpha_0\right)\right] \tag{9}$$

After peak searching with a sufficient grid interval $\Delta\alpha$, we can obtain *M* peak coordinates $(\alpha_B, U_{Bm})$. Then, the true chirp rate and the true center frequency of the *m*-th component are given by:

$$\begin{cases} \mu_0 = -\cot\left(\alpha_B + \delta_0\Delta\alpha\right)\dfrac{f_s^2}{N} \\[4pt] f_m = f_I + m f_\Delta = \left(U_{Bm} + \varepsilon_m\right)\csc\left(\alpha_B + \delta_0\Delta\alpha\right)\dfrac{f_s}{N} \end{cases} \tag{10}$$

In the following subsections, we derive the estimator for the chirp rate $\mu_0$ based on $\delta_0$ and the estimator for the center frequency $f_m$ based on $\varepsilon_m$.

### *3.1. Estimator for Chirp Rate*

Because the analytical formulation of the quasi-peak amplitude $|X_{\alpha_B}(U_{Bm})|$ in the FrFT domain involves the Fresnel integral [10], it is hard to construct the estimator for $\delta_0$ directly through the iterative method. Therefore, we introduce the FA algorithm to remove the Fresnel term, which is given as [22]:

$$\left(x_\alpha^\star x\right)\left(\tau\right) = \int x\!\left(t + \frac{\tau}{2}\sin\alpha\right) x^*\!\left(t - \frac{\tau}{2}\sin\alpha\right) e^{j2\pi t\tau\cos\alpha}\, dt \tag{11}$$

where *τ* represents the delay factor. Then, the FA envelope statistic is also given as:

$$L\left(\alpha\right) = \int_{-\infty}^{\infty} \left|\left(x_\alpha^\star x\right)\left(\tau\right)\right| d\tau \tag{12}$$
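The envelope statistic just defined can be evaluated numerically. The sketch below works in dimensionless units, and the pulse length, grids and chirp rate are all illustrative choices; it confirms that, for a single LFM pulse, $L(\alpha)$ peaks at the angle $\alpha_0$ satisfying $\mu_0 = -\cot\alpha_0$:

```python
import numpy as np

T_p = 100.0
alpha_true = 2.0                       # true rotation angle alpha_0
mu_0 = -1.0 / np.tan(alpha_true)       # chirp rate satisfying mu_0 = -cot(alpha_0)

def x(t):                              # analytic unit-amplitude LFM pulse
    return np.where(np.abs(t) <= T_p / 2, np.exp(1j * np.pi * mu_0 * t**2), 0)

dt = 0.005
t = np.arange(-0.75 * T_p, 0.75 * T_p, dt)
taus = np.linspace(-1.4 * T_p, 1.4 * T_p, 101)

def envelope(alpha):                   # Eq. (12): sum |(x_a* x)(tau)| over a tau grid
    total = 0.0
    for tau in taus:
        s = 0.5 * tau * np.sin(alpha)
        frac_ac = np.sum(x(t + s) * np.conj(x(t - s))      # Eq. (11), Riemann sum
                         * np.exp(2j * np.pi * t * tau * np.cos(alpha))) * dt
        total += np.abs(frac_ac)
    return total

alphas = np.array([1.8, 1.9, 2.0, 2.1, 2.2])
best = alphas[np.argmax([envelope(a) for a in alphas])]
```

At the matched angle the integrand phase $2\pi t\tau(\mu_0\sin\alpha + \cos\alpha)$ vanishes, so the autocorrelation adds coherently for every delay $\tau$; at the other trial angles it is sinc-suppressed.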

Substituting Equation (3) into Equations (11) and (12) and ignoring the noise term, we can derive the FA envelope of the OFDM-LFM signal:

$$\begin{split} \left(x_\alpha^\star x\right)\left(\tau\right) &= \int_{-\infty}^{\infty} \sum_{m=0}^{M-1} A_m s_m\!\left(t + \frac{\tau}{2}\sin\alpha\right) \sum_{m=0}^{M-1} A_m^* s_m^*\!\left(t - \frac{\tau}{2}\sin\alpha\right) e^{j2\pi t\tau\cos\alpha}\, dt \\ &= \int_{-\infty}^{\infty} \gamma\left(t\right) e^{j2\pi t\tau\left(\mu_0\sin\alpha + \cos\alpha\right)} \sum_{m_i=0}^{M-1}\sum_{m_j=0}^{M-1} A_{m_i} A_{m_j}^* e^{j\pi\tau f_\Delta\sin\alpha\left(m_i - m_j\right)} e^{j2\pi t f_\Delta\left(m_i - m_j\right)}\, dt \end{split} \tag{13}$$

$$L\left(\alpha\right) = \int_{-\infty}^{\infty} \left|T_p\,\mathrm{sinc}\left[\pi\tau\left(\mu_0\sin\alpha + \cos\alpha\right)\right]\right| d\tau + \Gamma\left(\alpha\right) \tag{14}$$

where $\gamma\left(t\right) = \frac{1}{T_p}\,\mathrm{rect}\!\left(\frac{t}{T_p}\right)$ and:

$$\Gamma\left(\alpha\right) = \int_{-\infty}^{\infty}\int_{-\frac{T_p}{2}}^{\frac{T_p}{2}} \sum_{m_i=0}^{M-1}\sum_{\substack{m_j=0 \\ m_j\neq m_i}}^{M-1} A_{m_i} A_{m_j}^* e^{j\pi\tau f_\Delta\sin\alpha\left(m_i - m_j\right)} e^{j2\pi t f_\Delta\left(m_i - m_j\right)}\, dt\, d\tau \tag{15}$$

It is noticed that $\Gamma(\alpha)$ does not involve $\mu_0$; therefore, we can ignore it in the following analysis of this subsection. The real peak $L(\alpha_0)$ satisfies $\mu_0 = -\cot\alpha_0$. Substituting $\alpha_B = \alpha_0 - \delta_0\Delta\alpha$ into Equation (14), we can obtain:

$$\begin{split} L\left(\alpha_B\right) &= \int_{-\infty}^{\infty}\left|T_p\,\mathrm{sinc}\left[\pi\tau\left(-\cot\alpha_0\sin\left(\alpha_0 - \delta_0\Delta\alpha\right) + \cos\left(\alpha_0 - \delta_0\Delta\alpha\right)\right)\right]\right| d\tau \\ &= \int_{-\infty}^{\infty}\left|T_p\,\mathrm{sinc}\left[\pi\tau\csc\alpha_0\sin\left(\delta_0\Delta\alpha\right)\right]\right| d\tau \end{split} \tag{16}$$

When the searching interval $\Delta\alpha$ is small enough, it is reasonable to use the approximations $\sin\left(\delta_0\Delta\alpha\right) \approx \delta_0\Delta\alpha$ and $\sin\left[\pi\tau\csc\alpha_0\left(0.5-\delta_0\right)\Delta\alpha\right] \approx \sin\left[\pi\tau\csc\alpha_0\left(0.5+\delta_0\right)\Delta\alpha\right]$ in Equation (16). Then, we can construct the error mapping as:

$$\beta = \frac{L\left(\alpha_B + 0.5\Delta\alpha\right) + L\left(\alpha_B - 0.5\Delta\alpha\right)}{L\left(\alpha_B + 0.5\Delta\alpha\right) - L\left(\alpha_B - 0.5\Delta\alpha\right)} \approx \frac{\frac{1}{0.5-\delta_0} + \frac{1}{0.5+\delta_0}}{\frac{1}{0.5-\delta_0} - \frac{1}{0.5+\delta_0}} = \frac{1}{2\delta_0} \tag{17}$$

Hence, $\hat{\delta}_0 = 1/(2\beta)$ can be used as an estimator for $\delta_0$.

Finally, the new estimate of the rotation angle $\alpha_0$ is $\hat{\alpha}_0 = \alpha_B + \hat{\delta}_0\Delta\alpha$. Then, by substituting $\alpha_B = \hat{\alpha}_0$ and renewing $\hat{\alpha}_0$, an iterative procedure can be applied to further improve the estimation accuracy.
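The mechanics of this iteration can be seen with the idealized $1/x$ envelope used to derive Equation (17): near the true angle, $L(\alpha) \propto 1/|\alpha - \alpha_0|$, so $\beta = 1/(2\delta_0)$ holds exactly and a single update recovers $\alpha_0$. In the sketch below, `L_model` and all numbers are illustrative stand-ins for the true statistic:

```python
# Iterative rotation-angle refinement under the idealized FA envelope model.
alpha_0 = 1.234                                 # true rotation angle (assumed)
d_alpha = 0.001                                 # coarse search interval
L_model = lambda a: 1.0 / abs(a - alpha_0)      # idealized 1/x envelope

alpha_B = alpha_0 - 0.31 * d_alpha              # coarse peak, residual delta_0 = 0.31
for _ in range(3):                              # Q = 3 iterations, as in Section 4
    Lp = L_model(alpha_B + 0.5 * d_alpha)
    Lm = L_model(alpha_B - 0.5 * d_alpha)
    if Lp == Lm:                                # residual already zero: converged
        break
    beta = (Lp + Lm) / (Lp - Lm)                # error mapping, Eq. (17)
    alpha_B += d_alpha / (2.0 * beta)           # delta_hat = 1/(2*beta)
```

With a real, finite-data $L(\alpha)$ the mapping is only approximate, which is why the iterations are needed at all.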

### *3.2. Estimator for Center Frequency*

Firstly, we consider the DFrFT for a monocomponent LFM signal. Substituting Equation (2) into Equation (7), we can obtain:

$$X_\alpha\!\left(\frac{U}{2\Delta x}\right) = \frac{B_\alpha}{2\Delta x} e^{j\pi\cot\alpha\left(\frac{U}{2\Delta x}\right)^2} \sum_{n=-N}^{N} e^{j2\pi n\left(\frac{f_m}{2f_s} - \frac{U\csc\alpha}{(2\Delta x)^2}\right) + j\pi n^2\left(\frac{\mu_0}{(2f_s)^2} + \frac{\cot\alpha}{(2\Delta x)^2}\right)} \tag{18}$$

According to the analysis in Section 3.1, we assume that $\hat{\alpha}_0 \approx \alpha_0$. Hence, at the quasi peak $(\hat{\alpha}_0, U_{Bm})$, Equation (18) can be approximated by:

$$X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}}{2\Delta x}\right) = \frac{B_{\hat{\alpha}_0}}{2\Delta x} e^{j\pi\cot\hat{\alpha}_0\left(\frac{U_{Bm}}{2\Delta x}\right)^2} \sum_{n=-N}^{N} e^{j\pi n\left(\frac{f_m}{f_s} - \frac{U_{Bm}\csc\hat{\alpha}_0}{2N}\right)} \tag{19}$$

Using $f_m = U_m f_s\csc\alpha_0/(2N)$ and $U_m = U_{Bm} + \varepsilon_0$, we can rewrite Equation (19) as:

$$X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}}{2\Delta x}\right) = \frac{B_{\hat{\alpha}_0}}{2\Delta x} e^{j\pi\cot\hat{\alpha}_0\left(\frac{U_{Bm}}{2\Delta x}\right)^2} \sum_{n=-N}^{N} e^{j\pi n\frac{\varepsilon_0\csc\hat{\alpha}_0}{2N}} \tag{20}$$

Similar to the approach in Section 3.1, we can obtain $X_{\hat{\alpha}_0}\left(\frac{U_{Bm}\pm P}{2\Delta x}\right)$ as:

$$X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}\pm P}{2\Delta x}\right) = \Gamma'\left(\hat{\alpha}_0, U_{Bm}\pm P\right)\left[\frac{e^{-j\pi\frac{\left(\varepsilon_0\mp P\right)\csc\hat{\alpha}_0}{2}}\left(1 - e^{j\pi\left(\varepsilon_0\mp P\right)\csc\hat{\alpha}_0}\right)}{1 - e^{j\pi\frac{\left(\varepsilon_0\mp P\right)\csc\hat{\alpha}_0}{2N}}}\right] \tag{21}$$

where:

$$\Gamma'\left(\hat{\alpha}_0, U_{Bm}\pm P\right) = \frac{B_{\hat{\alpha}_0}}{2\Delta x} e^{j\pi\cot\hat{\alpha}_0\left(\frac{U_{Bm}\pm P}{2\Delta x}\right)^2} \tag{22}$$

When $\left(\varepsilon_0 \mp P\right) \ll N$, it is reasonable to use the approximation $1 - e^{x} \approx -x\ \left(x \to 0\right)$ in the denominator of Equation (21). Then, by setting $P = 1/\csc\hat{\alpha}_0$, we can construct the error mapping as:

$$h = \frac{\left|X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}+P}{2\Delta x}\right)\right| + \left|X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}-P}{2\Delta x}\right)\right|}{\left|X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}+P}{2\Delta x}\right)\right| - \left|X_{\hat{\alpha}_0}\!\left(\frac{U_{Bm}-P}{2\Delta x}\right)\right|} = \frac{\varepsilon_0}{\csc\alpha_0} \tag{23}$$

Hence, $\hat{\varepsilon}_0 = h\csc\hat{\alpha}_0$ can be used as an estimator for $\varepsilon_0$. The fine estimate of $U_m$ is $\hat{U}_m = U_{Bm} + \hat{\varepsilon}_0$. Then, by substituting $U_{Bm} = \hat{U}_m$ and renewing $\hat{U}_m$, an iterative procedure can again be applied to improve the estimation accuracy.
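In the special case $\alpha = \pi/2$ the DFrFT reduces to an ordinary DFT ($\csc\hat{\alpha}_0 = 1$, so $P = 1$ bin), and the two-point scheme can be tried on a plain tone. For a DFT, the magnitudes at the $\pm 1$-bin points scale as $1/|\varepsilon \mp 1|$, so the working form of the magnitude-ratio mapping in that case is $\hat{\varepsilon} = (|X_+| - |X_-|)/(|X_+| + |X_-|)$. The sketch below (tone parameters are illustrative) applies it iteratively:

```python
import numpy as np

# Two-point interpolation in the DFT special case (alpha = pi/2, P = 1 bin).
N, k0, eps_true = 1024, 200, 0.3
n = np.arange(N)
x = np.exp(2j * np.pi * (k0 + eps_true) * n / N)   # tone at fractional bin k0 + eps

def dtft_bin(sig, k):              # spectrum evaluated at a (possibly fractional) bin
    return np.sum(sig * np.exp(-2j * np.pi * k * n / N))

kB = np.argmax(np.abs(np.fft.fft(x)))              # coarse peak (quasi peak)
eps_hat = 0.0
for _ in range(2):                                 # iterative refinement, Q = 2
    Xp = abs(dtft_bin(x, kB + eps_hat + 1))
    Xm = abs(dtft_bin(x, kB + eps_hat - 1))
    eps_hat += (Xp - Xm) / (Xp + Xm)               # residual update
```

After the first pass the interpolation points are re-centered on the refined peak, which is what drives the residual toward zero.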

### *3.3. Iterative Estimation for OFDM-LFM*

In this subsection, we extend the proposed center frequency estimation method to OFDM-LFM signals. The major difference between multiple and single center frequency estimation is the estimation error caused by the leakage of the multiple components of the OFDM-LFM signal. This error biases the interpolated DFrFT coefficients, deviating them from their true values. Denoting the noise-free actual coefficients of the *m*-th component by $\tilde{X}_{\hat{\alpha}_0,m}\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right)$, we obtain:

$$\tilde{X}_{\hat{\alpha}_0,m}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) = \mathrm{DFrFT}_{\left(\hat{\alpha}_0, \hat{U}_m \pm P\right)}\left(x\left[n\right]\right) = X_{\hat{\alpha}_0,m}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) + \sum_{l=1,\,l\neq m}^{M} \tilde{X}_{\hat{\alpha}_0,l}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) \tag{24}$$

where $\tilde{X}_{\hat{\alpha}_0,l}\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right)$ represents the leakage terms introduced by the other $M-1$ OFDM-LFM components, which can be calculated by:

$$\tilde{X}_{\hat{\alpha}_0,l}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) = A_l\,\mathrm{DFrFT}_{\left(\hat{\alpha}_0, \hat{U}_m \pm P\right)}\left(\hat{s}_l\left[n\right]\right) = A_l\frac{B_{\hat{\alpha}_0}}{2\Delta x} e^{j\pi\cot\hat{\alpha}_0\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right)^2}\sum_{n=-N}^{N} e^{j\pi n\frac{\left(\hat{U}_l - \hat{U}_m \mp P\right)\csc\hat{\alpha}_0}{2N}} \tag{25}$$

where $A_l$ represents the complex amplitude of the $l$-th ($l = 1, \ldots, M$) component in the fractional domain. Therefore, the estimation error of the interpolated coefficients can be reduced by subtracting the sum of the leakage from the other components, that is:

$$\hat{X}_{\hat{\alpha}_0,m}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) = \tilde{X}_{\hat{\alpha}_0,m}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) - \sum_{l=1,\,l\neq m}^{M} \tilde{X}_{\hat{\alpha}_0,l}\!\left(\frac{\hat{U}_m \pm P}{2\Delta x}\right) \tag{26}$$

Based on the above analysis, we propose an iteration-based algorithm to accomplish the parameter estimation of the OFDM-LFM signal, which is given in Algorithm 1.

**Algorithm 1:** Proposed fast iterative interpolated digital fractional Fourier transform method

**Initialization:** Set $q = 0$, $\hat{\delta}_0 = 0$, $\hat{\varepsilon}_m = 0$ and $\hat{A}_m = 0$ ($m = 1, \cdots, M$). Calculate $L(\alpha) = \int_{-T_p}^{T_p} \left|\left(x_\alpha^\star x\right)(\tau)\right| d\tau$. Find $\alpha_B = \arg\max_\alpha \left\{L(\alpha)\right\}$, where $\alpha \in [0 : \Delta\alpha : \pi]$ and $u \in [0 : 1 : N-1]$.

1. **Repeat**
2. Calculate $L\left(\alpha_B \pm 0.5\Delta\alpha\right)$ and $\beta$ using Equations (14) and (17).
3. Renew $\alpha_B = \alpha_B + \hat{\delta}_0\Delta\alpha$, where $\hat{\delta}_0 = 1/(2\beta)$.
4. **Until** $q = Q$
5. Let $X_{\hat{\alpha}_0}[u] = \mathrm{DFrFT}_{(\hat{\alpha}_0, u)}\left(x[n]\right)$ and $P = \left(\csc\hat{\alpha}_0\right)^{-1}$.
6. **Repeat**
7. **for** $m = 1$ **to** $M$ **do**
8. **if** $q == 1$ **then**
9. $\tilde{X}_{\hat{\alpha}_0}[u] = X_{\hat{\alpha}_0}[u] - \sum_{l=1,\,l\neq m}^{M} \hat{A}_l\,\mathrm{DFrFT}_{(\hat{\alpha}_0, u)}\left(\hat{s}_l[n]\right)$
10. $\hat{U}_{Bm} = \arg\max_u \left|\tilde{X}_{\hat{\alpha}_0}[u]\right|^2$
11. **end**
12. Calculate $\hat{X}_{\hat{\alpha}_0,m}\left(\hat{U}_m \pm P\right)$ and $h$ by Equations (23), (25) and (26).
13. Renew $\hat{U}_{Bm} = \hat{U}_{Bm} + \hat{\varepsilon}_m$, where $\hat{\varepsilon}_m = h\csc\hat{\alpha}_0$.
14. $\hat{A}_m = \left|X_{\hat{\alpha}_0}\left(\hat{U}_m\right) - \sum_{l=1,\,l\neq m}^{M} \tilde{X}_{\hat{\alpha}_0,l}\left(\frac{\hat{U}_m}{2\Delta x}\right)\right| \Big/ \left(\Delta x\left|B_{\hat{\alpha}_0}\right|\right)$
15. **end**
16. **Until** $q = Q$

**Result:** $\hat{U}_m = \hat{U}_{Bm} + \hat{\varepsilon}_m$ and $\hat{\alpha}_0 = \alpha_B + \hat{\delta}_0\Delta\alpha$; calculate $\hat{\mu}_0$, $\hat{f}_m$ and $\hat{f}_\Delta$ by Equation (8).
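The structure of Algorithm 1 (interpolate each component only after subtracting the modeled leakage of the others, Equations (24)–(26)) can be illustrated in the same DFT special case ($\alpha = \pi/2$, $P = 1$ bin), with two plain tones standing in for the *M* chirps. All frequencies, amplitudes and iteration counts below are illustrative:

```python
import numpy as np

N = 1024
n = np.arange(N)
f = np.array([200.3, 209.8])               # true fractional-bin frequencies
x = sum(np.exp(2j * np.pi * fk * n / N) for fk in f)

def dtft(sig, k):                          # spectrum of sig at fractional bin k
    return np.sum(sig * np.exp(-2j * np.pi * k * n / N))

def leak(amp, fk, k):                      # modeled leakage of tone (amp, fk) at bin k
    return amp * dtft(np.exp(2j * np.pi * fk * n / N), k)

def interp(sig, kB, model):                # one two-point step, leakage removed first
    Xp = abs(dtft(sig, kB + 1) - model(kB + 1))
    Xm = abs(dtft(sig, kB - 1) - model(kB - 1))
    return kB + (Xp - Xm) / (Xp + Xm)      # DFT form of the ratio mapping

est = np.array([200.0, 210.0])             # coarse integer-bin peaks
amp = np.array([1.0, 1.0], dtype=complex)
for _ in range(3):                         # Q = 3 joint refinement passes
    for m in range(2):
        other = 1 - m
        model = lambda k: leak(amp[other], est[other], k)
        est[m] = interp(x, est[m], model)
        amp[m] = (dtft(x, est[m]) - leak(amp[other], est[other], est[m])) / N
```

Each pass sharpens both the frequency and amplitude estimates, so the leakage model subtracted on the next pass becomes more accurate, mirroring the successive-cancellation loop of Algorithm 1.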

Next, we discuss the computational complexity of this method. The proposed method consists of two parts: the estimation of $\alpha_0$ and the estimation of $U_m$. The major computational load of the first part is the FA, which is about $O\left[G\left(2N\log N + N\right)\right]$ [22], where $G = \lfloor \pi/\Delta\alpha \rfloor$ and $\lfloor\cdot\rfloor$ denotes the floor operator. The major computational load of the second part is the DFrFT during coarse searching, which is about $O\left(N\log N\right)$ [11]. In addition, there are $M$ DFrFT coefficient calculations, whose computational complexity is about $O\left(2MN\right)$ during each iteration. Consequently, the overall complexity of the method is approximately $O\left[GN\log N + N\log N + MN\right]$, which is more efficient than the methods in [6] (requiring $O\left(GN^2\log N\right)$), [7] (requiring $O\left(N^3\right)$) and [10] (requiring $O\left(GN^2\log N\right)$), but less efficient than the method in [18] (requiring $O\left(8N + N\log N\right)$).
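For concreteness, these expressions can be evaluated at the Section 4 setting $\Delta\alpha = 0.001$; the pulse length *N* and component count *M* below are assumed values (Table 1's exact parameters are not repeated here), and base-2 logarithms are used:

```python
import math

N, d_alpha, M = 500, 0.001, 3
G = math.floor(math.pi / d_alpha)          # number of coarse rotation-angle bins

proposed = G * N * math.log2(N) + N * math.log2(N) + M * N
mwht_ref6 = G * N**2 * math.log2(N)        # methods of [6] and [10]
ml_ref7 = N**3                             # method of [7]
interp_ref18 = 8 * N + N * math.log2(N)    # monocomponent method of [18]
```

Under these assumptions the claimed ordering holds: the proposed cost sits between the monocomponent interpolation of [18] and the search-heavy methods of [6,7,10].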

### **4. Simulations**

The goal of this section is to evaluate the estimation performance of the proposed method through Monte Carlo simulations. We consider two kinds of OFDM-LFM radar waveforms applied in different radar modes (searching and tracking). The employed simulation parameters are listed in Table 1 and are consistent with the settings in [1]. As defined in [1], the signal amplitude $A_m$ of each subpulse is equal to $A_0$. As analyzed in Section 2, it is assumed that signal detection and carrier frequency estimation have been accomplished; here, we ignore the influence of the signal detection probability and of the carrier frequency estimation accuracy on the results. The demodulated baseband pulse observations are sampled at $f_s = 50$ MHz. In this context, the Normalized Mean Squared Error (NMSE) is used to evaluate the estimation accuracy. Furthermore, we define the Signal-to-Noise Ratio (SNR) as $\rho = 10\lg\left(A_0^2/\sigma^2\right)$ and set the searching interval of the rotation angle to $\Delta\alpha = 0.001$. The algorithms reported in [6,7,10,18] and the Cramér–Rao lower Bound (CRB) [23] are also included for comparison.


**Table 1.** Parameters of Orthogonal Frequency Division Multiplexing Linear Frequency Modulation (OFDM-LFM) signals.

Figure 1 gives the NMSE of the chirp rate estimation, *μ*ˆ0, versus different SNRs. In this simulation, Monte Carlo experiments were repeated 500 times for each SNR from −18 dB to 2 dB. It is obvious from Figure 1 that most NMSE curves of estimation algorithms approach or achieve the CRB at specific SNRs. Among them, the performance of the proposed method coincides with the CRB at the lowest SNR, which is −11 dB. Moreover, the simulation result from Figure 1 confirms that the estimation performance of the proposed method slightly outperforms the other algorithms at all SNRs. Here, the iteration number is set to *Q* = 3, which is demonstrated in Figure 2.

In Figure 2, we study the effect of the iteration number, *Q*, on the convergence characteristics of the proposed method. In this simulation, the NMSE curves of frequency step (*f*Δ) estimation versus the iteration number (*Q*) when the SNR is set to [−10, −7, −4, −1] dB are depicted. Here, both signals with parameters from Ω<sup>1</sup> and Ω<sup>2</sup> are used for the simulation. As can be seen, the parameter estimation performance converges after three iterations for almost all of the SNRs. Therefore, through the simulation in Figure 2, the iteration number *Q* is suggested to be chosen as three.

**Figure 1.** Normalized Mean Squared Error (NMSE) of *μ*<sup>0</sup> for signal Ω<sup>1</sup> versus the signal-to-noise ratio. FII, Fast Iterative Interpolated.

**Figure 2.** NMSE of *f*<sup>Δ</sup> for signals Ω<sup>1</sup> and Ω<sup>2</sup> versus the number of iterations.

### **5. Conclusions**

In this study, we have derived the analytical FA and DFrFT approximations of OFDM-LFM radar signals. A new method called FII-DFrFT was proposed for uncooperative OFDM-LFM parameter estimation, formulated by locating the bottleneck issue that limits estimation performance. The analytical formulas were derived, together with a performance evaluation. Numerical simulations showed the validity and superiority of the proposed method through comparisons with existing algorithms at different SNRs. Nevertheless, in uncooperative scenarios, especially against hostile MIMO radars, estimation in the presence of clutter and other structured interference remains a challenge in most cases. Hence, in future research, we would like to focus on the derivation and evaluation under the uncertainty of the key factors, as well as a clutter background, before the proposed scheme is employed in practical applications.

**Author Contributions:** Y.L. designed and wrote this paper under the supervision of B.T., Y.Z. assisted with the methodology and formal analysis. J.Z. and Y.X. provided the support of the entire study. Y.Z. and B.T. reviewed and edited the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (Grant No. 61571088).

**Acknowledgments:** Thanks to the University of Electronic Science and Technology of China. Thanks to my supervisor Bin Tang. Thank you to my dear love Leilei and all the people who support us.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Abbreviations**

The following abbreviations are used in this manuscript:


### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Differential Run-Length Encryption in Sensor Networks**

### **Chiratheep Chianphatthanakit , Anuparp Boonsongsrikul \* and Somjet Suppharangsan**

Department of Electrical Engineering, Faculty of Engineering, Burapha University Chonburi Campus, Chonburi 20131, Thailand

**\*** Correspondence: anuparp@eng.buu.ac.th; Tel.: +66-3810-2222 (ext. 3380)

Received: 28 May 2019; Accepted: 17 July 2019; Published: 19 July 2019

**Abstract:** Energy is a main concern in the design and deployment of Wireless Sensor Networks because sensor nodes are constrained by limitations of battery, memory, and processing. A number of techniques have been presented to solve this power problem; among them, data compression can be used to reduce the volume of data for transmission. This article presents a data compression algorithm called Differential Run-Length Encryption (D-RLE), which consists of three steps. First, sensor readings are divided into groups using a threshold based on Chauvenet's criterion. Second, each group is subdivided into subgroups whose consecutive member values are determined by a subtraction scheme under a K-RLE-based threshold. Third, the member values are encoded to binary with our ad hoc scheme to compress the data. The experimental results show that the data rate savings of D-RLE can be up to 90%, and energy usage can be reduced by more than 90% compared with data transmission without compression.

**Keywords:** data compression; wireless sensor networks; energy consumption

### **1. Introduction**

Wireless sensor networks (WSNs) consist of smart wireless sensors working together to monitor areas and to collect data, such as temperature and humidity, from the environment. However, sensor nodes face resource constraints in terms of energy, memory, and processing [1]. Many works [2–4] have proposed a variety of solutions to overcome these restrictions. A challenge is how to prolong sensor lifetime during data delivery from sensor nodes to a base station. Energy consumption is a hard problem in the design and deployment of WSNs because sensor nodes may be deployed in harsh environments where it is not easy to replace the batteries [5]. Energy consumption mostly occurs in either data computing or data transmission. Wang et al. [6] reported that the ratio of energy consumption between computing and communication is about 1:3000; therefore, sensor nodes should focus on efficient data communication. If sensor nodes reduce the number of data transmissions, this can obviously save energy across the entire network. The end-to-end energy cost and network lifetime are greatly restricted if the cooperative transmission model is not designed properly [7]. The most common technique for saving energy is a sleep-wake scheduling scheme [8], in which a significant part of the sensor's transceiver is switched off. However, this solution induces a time synchronization problem [9] and the possibility of retransmitting data. Sensor network topologies also have a massive impact on the energy used in data transmission. In tree-based topologies [10], data aggregation approaches are often employed to reduce data redundancy, resulting in fewer data transmissions; however, this merely gives an approximate data value for a local area [11]. In cluster-based topologies [7,12], a cluster head plays the important role of collecting and forwarding all data from neighboring nodes to the base station.
The cluster head consumes higher energy than other neighboring nodes and results in failures if it

is out of energy more quickly. If the number of failures exceeds the tolerance level, a system may collapse [13]. To disregard this problem, a cluster head [14] is assumed as a special node having more sufficient energy than its neighboring nodes. On the other hand, our proposed solution does not require a special node. It simply can be applied for all sensor nodes including a cluster head that can be exhausted when it works hard. Residual energy is a criterion in selection of a cluster head [15].

In this paper, data compression is the proposed solution: it reduces the data packet size and the amount of data transmitted, thereby prolonging the battery life of wireless sensor nodes. The proposed concept, shown in Figure 1, can apply either lossless or lossy data compression. Furthermore, the proposed data compression does not require extra RAM. Data compression can be divided into lossless and lossy algorithms. Lossless compression preserves data accuracy but normally requires extensive memory for building a lookup table. The sensor LZW (S-LZW) algorithm [16] is an extension of the lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch (LZW) [17,18]. Capo-Chichi et al. [19] and Roy et al. [20] describe S-LZW as a dictionary-based algorithm whose dictionary is initialized with the 256 standard ASCII characters. However, each new string in the input stream creates a new entry, which runs up against the memory limitations of a sensor node. The schemes in [21–24] are based on lossless entropy compression (LEC), which compresses data using Huffman variable-length codes; the difference between consecutive data values is the input to the entropy encoder. LEC is one of the most efficient schemes in data compression; therefore, it has been applied to reliable data transmission for structural health monitoring in wireless sensor networks [25]. On the other hand, lossy compression [26] is appropriate for sending approximate or repeated data [27]. The *K*-Run-Length Encoding (*K*-RLE) algorithm [28] is a lossy adaptation of RLE [17]. *K*-RLE's data accuracy and compression ratio depend on the precision parameter *K*.

**Figure 1.** Data encoding and decoding process.

### **2. Related Work**

In [26], researchers presented a comparison of data compression schemes with different sensor data types and sensor data sets in WSNs. In [24], the data considered for compression are divided into smooth temperature and relative humidity data and dynamic volcanic data, which exhibit dramatically different characteristics. Since power consumption is one of the main concerns, Koc et al. [29] studied and measured power consumption during data compression using the MSP432 family of microcontrollers. With the same fixed parameters of the wireless environment, the energy used to deliver a fixed-size packet is the same for every packet, so reducing the number of data packets directly reduces energy consumption. Therefore, data compression plays a significant role in WSNs. Two types of data compression are generally distinguished, referred to as lossy and lossless compression. The former permanently removes a certain amount of data, shrinking the data to much smaller than the original, but degrades data quality; the latter reduces the data size without any loss of quality, but its compression rate is lower. In our experiments, we compared our results with three lossless algorithms, namely LEC [21], Lempel-Ziv-Welch (LZW) [17,18], and run-length encoding (RLE), and with *K*-RLE for the lossy scheme. LEC is a lossless algorithm based on the Huffman concept, in which the entropy is used to define the Huffman codes. Table 1 shows the prefix and suffix codes used in LEC. LEC computes the differences between consecutive data values and then replaces each difference with the corresponding codes from Table 1. The LEC algorithm is shown in Algorithm 1.


### **Algorithm 1** LEC Pseudocode

```
Require: di, Table of LEC codes
Ensure: bsi
  if (di == 0) then
    ni = 0
  else
    ni = floor(log2(|di|)) + 1
  end if
  si = Table(ni)            extract prefix si from the LEC table
  if (ni == 0) then
    bsi = si
    return bsi
  end if
  if (di > 0) then
    ai = (di)|ni            (v)|ni denotes the ni low-order bits of v
  else
    ai = (di − 1)|ni
  end if
  bsi = (si, ai)            concatenate prefix and suffix
  return bsi
```
When *di* is negative, the low-order bits of the two's complement representation of (*di* − 1) are used for the suffix code. For example, suppose we have a data set <19, 18, 20, 21>. Starting with the first datum 19, we compute the value difference between each pair of consecutive data, resulting in <19, −1, 2, 1>. By using Table 1 and Algorithm 1 above, we obtain the following encoded *bsi*: (0001 0011), (010,0), (011,10), (010,1).
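The per-difference encoding of Algorithm 1 can be sketched in Python. The prefix table below is an assumption based on the commonly published LEC code table (the paper's Table 1 is not reproduced here); its entries for ni = 1 and ni = 2 are consistent with the worked example above:

```python
import math

# Assumed prefix codes for group ni; the ni = 1 and ni = 2 entries match the
# worked example above, the remaining entries follow the usual LEC table.
PREFIX = {0: "00", 1: "010", 2: "011", 3: "100",
          4: "101", 5: "110", 6: "1110", 7: "11110"}

def lec_encode(di):
    """Encode one difference di as prefix si plus suffix ai (Algorithm 1)."""
    if di == 0:
        return PREFIX[0]                      # ni = 0: prefix only, no suffix
    ni = int(math.floor(math.log2(abs(di)))) + 1
    v = di if di > 0 else di - 1              # negatives use bits of (di - 1)
    ai = format(v & ((1 << ni) - 1), "0{}b".format(ni))
    return PREFIX[ni] + ai
```

For the differences −1, 2, and 1 above this yields 0100, 01110, and 0101, matching the (prefix, suffix) pairs (010,0), (011,10), and (010,1).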

LZW is a dictionary-based lossless compression. The LZW used in our experiments assigns new codes from the value 256 onwards to avoid colliding with the first 256 ASCII codes. The algorithm repeatedly reads input symbols to extend a string as long as the extended string is in the dictionary. Once a string is found that is not in the dictionary, the code for the string without its last symbol (the longest match in the dictionary) is output, and the newly found string is added to the dictionary under the next available code. Table 2 shows an example for the data input string AAAABAAAABCC. Applying Algorithm 2, seven new codes (codes 256 to 262) are added into the dictionary and the output strings are <A, AA, A, B, AAA, AB, C, C>. The output codes for those strings are <65, 256, 65, 66, 257, 258, 67, 67>, where each output code uses nine bits, so in total the encoding uses 72 bits compared to the 96-bit original input. Nevertheless, a larger dictionary requires larger memory.


**Table 2.** LZW codes.

**Algorithm 2** LZW Pseudocode

```
STRING ← get input symbol
while there are still input symbols do
  SYMBOL ← get input symbol
  if (STRING+SYMBOL is in Dictonary) then
    STRING = STRING+SYMBOL
  else
    output the code for STRING
    add STRING+SYMBOL to Dictionary
    STRING = SYMBOL
  end if
end while
output the code for STRING
```
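The dictionary-building loop above can be realized as a short Python function (a textbook LZW sketch over single-character symbols, not the exact S-LZW implementation used in the experiments):

```python
def lzw_encode(data):
    """LZW encoder: dictionary preloaded with the 256 ASCII codes."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    string, out = "", []
    for symbol in data:
        if string + symbol in dictionary:
            string += symbol                  # keep extending the match
        else:
            out.append(dictionary[string])    # emit longest match in dictionary
            dictionary[string + symbol] = next_code
            next_code += 1
            string = symbol
    if string:
        out.append(dictionary[string])        # flush the final match
    return out
```

Running it on the example input AAAABAAAABCC reproduces the eight nine-bit output codes.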
RLE is the simplest compression, working by counting runs of repeating consecutive identical data. Each run is replaced by the run length followed by the data symbol. For example, the data AAABBCEEFFFFFFFFAA are compressed to 3A2B1C2E8F2A, which means there are 3 A's, 2 B's, 1 C, 2 E's, 8 F's, and 2 A's next to each other in series. RLE pseudocode is shown in Algorithm 3.

**Algorithm 3** RLE Pseudocode

```
while there are still input symbols do
  count = 0
  repeat
    get input symbol
    count = count + 1
  until symbol unequal to next symbol
  output count and symbol
end while
```
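The counting loop of Algorithm 3 can be sketched in Python as:

```python
def rle_encode(data):
    """Run-length encoding: collapse runs into (count, symbol) pairs."""
    out = []
    for symbol in data:
        if out and out[-1][1] == symbol:
            out[-1] = (out[-1][0] + 1, symbol)   # extend the current run
        else:
            out.append((1, symbol))              # start a new run
    return out
```

For AAABBCEEFFFFFFFFAA this returns (3, A), (2, B), (1, C), (2, E), (8, F), (2, A), i.e., 3A2B1C2E8F2A.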
*K*-RLE is based on the RLE algorithm but allows a bounded amount of quality loss. The value of *K* indicates the tolerated range of data values within a run. If *K* = 1, for example, the data <19, 18, 20, 21> will be encoded as <(3, 19), (1, 21)>, because the first three pieces of data fall within 19 ± 1, while the last piece, 21, starts a new run. The encoded data <(3, 19), (1, 21)> will then be decoded as <19, 19, 19, 21>. Obviously, the decoded data differ from the original data due to the lossy scheme. *K*-RLE pseudocode is shown in Algorithm 4.

### **Algorithm 4** *K*-RLE Pseudocode

```
v1 ← read input value
count = 1
while there are still input values do
  v2 ← read next input value
  if (|v1 − v2| ≤ K) then
    count = count + 1
  else
    output (count, v1)
    v1 ← v2
    count = 1
  end if
end while
output (count, v1)
```
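A direct Python transcription of Algorithm 4 might look like:

```python
def k_rle_encode(values, K):
    """K-RLE: a run continues while values stay within K of the run's first value."""
    out = []
    v1, count = values[0], 1
    for v2 in values[1:]:
        if abs(v1 - v2) <= K:
            count += 1                 # v2 joins the current run
        else:
            out.append((count, v1))    # close the run, start a new one at v2
            v1, count = v2, 1
    out.append((count, v1))            # flush the last run
    return out
```

With K = 1, the data <19, 18, 20, 21> encode to <(3, 19), (1, 21)> as in the example above; K = 0 degenerates to plain lossless RLE.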
LZW compresses identical data with the same encoding even when those data occur at different positions in the input stream. In contrast, RLE requires that identical data stand next to each other in a row. Both LEC and LZW apply a similar concept in terms of prefix codes; however, LEC also has suffix codes representing the difference value, so each encoded difference in the input stream consists of a prefix and a suffix code. If the input stream has many runs of consecutive identical data, RLE performs very well. LZW is preferable to RLE when the input stream contains many repeating patterns that can be given short output codes. LEC works even better when the input stream has many difference values at the same level, which receive short prefix and suffix codes. All of the aforementioned algorithms have linear time complexity O(*n*) and perform best on data matching their own characteristics, not on general datasets. This gap motivates combining the strong points of those algorithms. To this end, we have developed Differential Run Length Encryption (D-RLE), which also has linear time complexity O(*n*), and whose concept we explain in the next section.

### **3. Differential Run Length Encryption**

This section presents the proposed algorithm, called Differential Run Length Encryption or D-RLE, which consists of three steps. First, raw data are collected and divided into several groups using a threshold based on Chauvenet's criterion. Second, consecutive data are subtracted and arranged into multiple subgroups of differential values based on a *K*-RLE-style threshold. Third, our adaptive data compression, a DSC-based scheme [30], is applied to each subgroup. Data formats in D-RLE are shown in Table 3. As a result, D-RLE significantly reduces the amount of data delivered, saves energy, and prolongs the battery life of the sensor nodes.



### *3.1. Group Division by Chauvenet's Criterion*

The raw data can be considered a random sample <*r*0, *r*1, ... , *rm*>, and they are grouped using a pre-defined Chauvenet's criterion [31,32] *Dmax*, which is set to 1.96 according to the 0.05 significance level from statistics. Each datum *ri* is passed to Equation (1) to calculate *Di*, in which *μ* and *σ* are the mean and standard deviation, respectively. The value of *Di* is then compared with *Dmax* to decide whether *ri* belongs to the current group. We keep *ri* in the same group if *Dmax* is greater than or equal to *Di*; otherwise, *ri* starts the next group. For example, suppose we have the following raw data <21,25,28,30,31,35,37,42,47,49,50,55,62,76,82,95,105,103,92,86,71,63,59,52,41,34,30,26,25,21>. Feeding the raw data into Equation (1), we find that *D*<sup>15</sup> = |95 − 52.43|/25.34 = 1.68, whereas *D*<sup>16</sup> = |105 − 52.43|/25.34 = 2.07. Since *Dmax* = 1.96 is less than *D*<sup>16</sup> = 2.07, datum *r*<sup>16</sup> starts a different group. Therefore, in this data sample set, there are two groups, where the first group is <21,25,28,30,31,35,37,42,47,49,50,55,62,76,82,95>, and the other group is <105,103,92,86,71,63,59,52,41,34,30,26,25,21>.

$$D_i = \frac{|r_i - \mu|}{\sigma} \tag{1}$$

Once a group is split, the mean and standard deviation are recalculated and updated with the remaining data and used in Equation (1) for the next round of group divisions. Algorithm 5 shows the algorithm for the group division step.

### **Algorithm 5** Group Division
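A minimal Python sketch of this group-division step (an illustration of the procedure described above, not the paper's exact pseudocode; treating a first value that itself violates the criterion as the start of its own group is an added assumption):

```python
import statistics

D_MAX = 1.96  # Chauvenet threshold at the 0.05 significance level

def group_division(data, d_max=D_MAX):
    """Split data into groups; mean/std are recomputed over the remaining data after each split."""
    groups, remaining = [], list(data)
    while remaining:
        mu = statistics.mean(remaining)
        sigma = statistics.pstdev(remaining)        # population standard deviation
        split_at = len(remaining)                   # default: no split in this pass
        for idx, r in enumerate(remaining):
            if sigma > 0 and abs(r - mu) / sigma > d_max:
                split_at = idx                      # first value exceeding D_max starts a new group
                break
        split_at = max(split_at, 1)                 # assumed guard against empty groups
        groups.append(remaining[:split_at])
        remaining = remaining[split_at:]
    return groups
```

On the 30-value example above, this returns two groups, split at the value 105.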


### *3.2. Subgroups Division*

Each group from the first step will be sub-divided based on the *K*-RLE scheme. The value of *K* implies the degree of tolerance. It is lossless if *K* equals zero. The result of the subgroup division is written in the compact format: <*rb*, |*c*1, *d*1|1, |*c*2, *d*2|2, ... , |*cn*, *dn*|*n*>, where the symbol |*ci*, *di*|*<sup>i</sup>* is used to separate *i*th subgroup and *rb* is the first value of the group. The number of elements in *i*th subgroup is denoted by *ci* and the value difference between the last raw data of the present subgroup *i*th and the last raw data in the previous subgroup (*i* − 1)th is denoted by *di*. The algorithm of discovering the *ci* and *di* is shown in Algorithm 6. As we start off the index from zero, *m* + 1 is the number of data members in the group. The index *n* is the number of subgroups that is also the number of members in set *Ci* and *Di*. The relationship between *m* and *ci* is shown by Equation (2):

$$m = \sum_{i=1}^{n} c_i. \tag{2}$$

*Sensors* **2019**, *19*, 3190

**Algorithm 6** Computing |*ci*, *di*|

```
Require: K, r = {r0, r1, ..., rm}
Ensure: Ci = {c1, c2, ..., cn}, Di = {d1, d2, ..., dn}
  initialize i = 1, j = 1, count = 1, f = 1, s = 0
  while j < (m + 1) do
    s = r[f] − r[j + 1]
    if |s| ≤ K then
      count = count + 1
    else
      C[i] = count
      D[i] = r[j] − r[f − 1]
      i = i + 1
      f = j + 1
      count = 1
    end if
    j = j + 1
  end while
```

For the first group <21, 25, 28, 30, 31, 35, 37, 42, 47, 49, 50, 55, 62, 76, 82, 95>, running Algorithm 6 gives the following subgroup division result: <21, |1, 4|, |1, 3|, |2, 3|, |1, 4|, |1, 2|, |1, 5|, |1, 5|, |2, 3|, |1, 5|, |1, 7|, |1, 14|, |1, 6|, |1, 13|>. The value 21 is the first raw datum *rb* of the group, followed by |*ci*, *di*| for *i* = 1 to 13. The variable *ci* is a counter, starting from one, of how many consecutive raw data differ from the first raw datum of the subgroup by no more than the pre-defined value *K*. The variable *di* roughly indicates how different the data are between the present and previous subgroups. For the second group from the first step, we obtain the following subgroup division result: <105, |1,−2|, |1,−11|, |1,−6|, |1,−15|, |1,−8|, |1,−4|, |1,−7|, |1,−11|, |1,−7|, |1,−4|, |2,−5|, |1,−4|>.
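The |ci, di| computation can be sketched in Python (an equivalent reformulation of Algorithm 6 that closes the final subgroup explicitly at the end of the group):

```python
def subgroup_division(r, K):
    """Return (C, D) for one group r = [r0, ..., rm] (cf. Algorithm 6)."""
    C, D = [], []
    f, count = 1, 1                       # f: index of the first datum of the current subgroup
    for j in range(1, len(r)):
        last = (j == len(r) - 1)
        if not last and abs(r[f] - r[j + 1]) <= K:
            count += 1                    # r[j + 1] stays in the current subgroup
        else:
            C.append(count)               # close the subgroup: c_i members,
            D.append(r[j] - r[f - 1])     # d_i = its last datum minus the previous subgroup's last
            f, count = j + 1, 1
    return C, D
```

With K = 1 this reproduces both subgroup division results quoted above.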

### *3.3. Adaptive Data Encoding*

The last step is the process of adaptively encoding each subgroup division result. Shortened opcodes for *ci* and *di* are used to compress the raw data via the encoding process. The number of bits to represent *ci* and *di* is determined from the sets *Ci* = {*c*1, *c*2, ... , *cn*} and *Di* = {*d*1, *d*2, ... , *dn*}, respectively. We shall use the first subgroup division result, <21, |1, 4|, |1, 3|, |2, 3|, |1, 4|, |1, 2|, |1, 5|, |1, 5|, |2, 3|, |1, 5|, |1, 7|, |1, 14|, |1, 6|, |1, 13|>, as an illustration of the encoding process. To begin with, set *Ci* is {1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1} and set *Di* is {4, 3, 3, 4, 2, 5, 5, 3, 5, 7, 14, 6, 13}. Then, Equation (3) is used to compute the number of bits for *ci*, and Equations (4) and (5) the number of bits for *di*:

$$\#c = \left\lceil \log_2\left(\max_{i=1}^{n} c_i\right) \right\rceil, \tag{3}$$

$$d_L = d_{\hat{i}}, \text{ where } \hat{i} = \operatorname*{argmax}_{i} |d_i|, \tag{4}$$

$$\#d = \begin{cases} k+1 & \text{if } -(2^k)+1 \le d_L \le 2^k, \\ k+2 & \text{if } d_L = -(2^k), \end{cases} \text{ where } k \in I^+. \tag{5}$$

Because the maximum value of C is 2, the number of bits for *ci* equals ⌈log2 2⌉ = 1 bit. The index maximizing |*di*| is *i* = 11; hence, *dL* = *d*11 = 14, and 14 ≤ 2*k* holds for *k* = 4; then, the number of bits for *di* equals *k* + 1 = 4 + 1 = 5 bits. In binary code, 1 and 5 are represented by 0001 and 0101, respectively. We combine 0001 with 0101 to form a byte, 0001 0101, referred to as <|#*c*, #*d*|>; this byte is the length field that notifies the decoder of the bit sizes of *ci* and *di*. We use these bit sizes to limit the bit length of the binary codes of *ci* and *di*. In addition, each value of *ci* is decreased by one before being changed to binary code, as doing so helps reduce the bit size for *ci*. For example, the binary code of *ci* = 8 is 1000 (4 bits), but we obtain the shortened binary code 111 (3 bits) if we decrease 8 to 7. We cannot make the same decrease for *di*, as *di* can be either positive or negative, whereas *ci* is always a positive integer. Subsequently, the data field is set to the format <*rb*, |*c*1, *d*1|1, |*c*2, *d*2|2, ... , |*cn*, *dn*|*n*>, in which *rb* is represented by its corresponding 8-bit binary code, and |*ci*, *di*| for *i* = 1 to *n* are opcodes for *ci* and *di* denoted by shortened binary codes with each *ci* replaced by *ci* − 1. Two's complement is used for negative values of *di*. The length and data fields join together to make a data payload. Finally, the data payload <|#*c*, #*d*|, *rb*, |*c*1, *d*1|1, |*c*2, *d*2|2, ... , |*cn*, *dn*|*n*> of our example is <|1, 5|, 21, |0, 4|1, |0, 3|2, |1, 3|3, |0, 4|4, |0, 2|5, |0, 5|6, |0, 5|7, |1, 3|8, |0, 5|9, |0, 7|10, |0, 14|11, |0, 6|12, |0, 13|13> and will be encoded as <|0001, 0101|, |0001 0101|, |0, 00100|1, |0, 00011|2, |1, 00011|3, |0, 00100|4, |0, 00010|5, |0, 00101|6, |0, 00101|7, |1, 00011|8, |0, 00101|9, |0, 00111|10, |0, 01110|11, |0, 00110|12, |0, 01101|13>. The total number of encoded bits for each subgroup division result can be found by Equation (6). In our example case *K* = 1, the total number of encoded bits is 16 + [13 × (1 + 5)] = 94 bits, compared to 2 × 16 × 8 = 256 bits without compression. This means we have saved approximately 63.28% of data delivery:

$$Size\_{bit} = 16 + [n \times (\#c + \#d)].\tag{6}$$

For the second subgroup division result, the numbers of bits for *ci* and *di* equal 1 and 5 bits, respectively. Therefore, the data payload of the second subgroup is <|1, 5|, 105, |0, −2|, |0, −11|, |0, −6|, |0, −15|, |0, −8|, |0, −4|, |0, −7|, |0, −11|, |0, −7|, |0, −4|, |1, −5|, |0, −4|> and is encoded as <|0001, 0101|, |0110 1001|, |0, 11110|1, |0, 10101|2, |0, 11010|3, |0, 10001|4, |0, 11000|5, |0, 11100|6, |0, 11001|7, |0, 10101|8, |0, 11001|9, |0, 11100|10, |1, 11011|11, |0, 11100|12>. The total number of encoded bits is 16 + [12 × (1 + 5)] = 88 bits, compared to 240 bits without compression for the second subgroup. This means we have saved approximately 63.33% of data delivery.
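Equations (3)–(6) can be sketched in Python (a minimal illustration; the search for *k* follows the case analysis of Equation (5)):

```python
import math

def code_widths(C, D):
    """Bit widths (#c, #d) for the c_i and d_i opcodes, per Equations (3)-(5)."""
    num_c = math.ceil(math.log2(max(C)))       # 0 bits when every c_i equals 1
    d_L = max(D, key=abs)                      # difference with the largest magnitude
    k = 1
    while not (-(2 ** k) + 1 <= d_L <= 2 ** k):
        k += 1                                 # smallest k whose range covers d_L
    return num_c, k + 1

def payload_bits(C, D):
    """Total encoded size of one group, per Equation (6)."""
    num_c, num_d = code_widths(C, D)
    return 16 + len(C) * (num_c + num_d)       # 16-bit header: <|#c,#d|> plus 8-bit rb
```

For the two groups above, this gives the 94-bit and 88-bit payload sizes computed in the text.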

This example shows that, from a raw data size of 256 + 240 = 496 bits, our D-RLE has compressed the raw data to 94 + 88 = 182 bits, which means we have reduced the bits sent by approximately 63.31%. Note that we have demonstrated only 30 pieces of raw data for illustration purposes; in practice, each sensor node would collect more data in the long run before delivering them. In the next section, we will show that D-RLE performs even better on longer data deliveries.

### **4. Performance Evaluation**

In our testbed, we assumed that data sets are already stored in a sensor node. The sensor node could be a cluster head that collects data either from itself or from neighboring nodes. Later, the sensor node performs data compression and wirelessly sends compressed data to a receiver. This section analyzes the performance of our algorithm and compares the performance with four benchmark algorithms on five datasets.

### *4.1. Effect of K Value*

We have evaluated the performance of our algorithm in terms of data rate savings (DRS) [30], as defined in Equation (7). In particular, the case of lossless compression (*K* = 0) and the case of lossy compression (*K* = 1) were evaluated. The lossy case *K* = 1 was shown in the previous section and its DRS is 63.31%. We repeated the same procedure with *K* = 0 for the lossless case on the same raw data <21, 25, 28, 30, 31, 35, 37, 42, 47, 49, 50, 55, 62, 76, 82, 95, 105, 103, 92, 86, 71, 63, 59, 52, 41, 34, 30, 26, 25, 21>, and received the corresponding encoded data payload <|0, 5|, *rb*, *d*1, *d*2, ... , *dn*>. Notice that the number of bits for the opcode *ci* is 0; hence, there is no need to send *ci* for each subgroup. The total number of encoded bits used is 172. Therefore, the DRS of the lossless case is (1 − 172/496) × 100% = 65.32%, compared to 63.31% for the lossy case above:

$$DRS = \left(1 - \frac{\text{size of compressed data}}{\text{size of raw data}}\right) \times 100\%. \tag{7}$$

To achieve more data rate savings, we could allow more data loss or distortion to some extent. How much quality loss is acceptable depends on the application, and it is controlled via the value of *K* in the algorithm. Table 4 shows the effect of varying *K*. We obtained more data rate savings when *K* ≥ 4 for the same raw data previously illustrated.


**Table 4.** Comparison results of varying *K* values.

### *4.2. Evaluation Results*

We have split our experiments into two sets. In the first set, the dataset sizes vary (roughly small, medium, and large), while, in the second set, every dataset has the same fixed size. In both sets, the benchmark algorithms as well as our proposed D-RLE algorithm were applied to compress those datasets, and the energy consumption for data compression and transmission was then measured.

Starting with the first set, we simulated a 100-byte temperature dataset, shown in Figure 2c; its shape resembles Figure 2d, a real dataset collected by [33]. In addition to these datasets, three more datasets, referred to as sine-like, chaotic, and temperatureMin and illustrated in Figure 2, were used to evaluate the effectiveness of the compression algorithms in our experiments. The simulated temperature dataset was created by Algorithm 7, while the temperatureHr dataset contains the actual hourly recorded temperature data for 48 h. The temperatureMin dataset was retrieved from the same source as the temperatureHr dataset, but with minutely recorded data. For the sine-like dataset, the minimum and maximum data values are 2 and 19, respectively. The data value next to 2 is 3, and the value is then increased by 3 until reaching the maximum. The data value after the maximum is set to 18 and then decreased by 3 until touching the minimum. Repeating this for two cycles, the sine-like dataset has 30 pieces of data. We fixed the data range from 2 to 19 for the sine-like dataset, whereas we randomly selected data ranging from 0 to 20 for the chaotic dataset. The chaotic and simulated temperature datasets each consist of 100 pieces of raw data. The temperatureHr dataset has only 48 pieces of data, as the data were hourly recorded for 48 h, whereas the temperatureMin dataset has 2880 pieces of data, since the data were minutely recorded over the same 48 h period. The raw data in each dataset were recorded as a series of strings; hence, the bit sizes of the data values 2, 12, and 102 are 8, 16, and 24 bits, respectively.

### **Algorithm 7** Creating simulated temperature data

```
Require: g, uppertemp, lowertemp
Ensure: t[g] = {t1, t2,..., tg}
  initialize min = 0, max = 10, temp = 20, i = 1, up = true
  while i ≤ g do
    while (temp ≤ uppertemp) AND (up is true) do
      temp = temp + Random(min, max)
      if (uppertemp < temp) then
         up = false
         temp = uppertemp
      end if
      t[i] = temp
      i = i + 1
    end while
    while (lowertemp ≤ temp) AND (up is false) do
      temp = temp − Random(min, max)
      if (temp < lowertemp) then
         up = true
         temp = lowertemp
      end if
      t[i] = temp
      i = i + 1
    end while
  end while
```

**Figure 2.** Five datasets used in the experiment. (**a**) sine-like; (**b**) chaotic; (**c**) simulated temperature; (**d**) temperatureHr; (**e**) temperatureMin.

For the second set of experiments, we extended the size of each dataset, except for the temperatureMin dataset, to 46,080 bits. The reason for this experiment is to investigate the energy used by lengthy data transmission in one shot compared to multishot transmission of small amounts of data. We expected the one-shot delivery to use energy more efficiently than the multishot delivery, since the sensor node in the single-shot case stays longer in a silent or power-saving mode, whereas the multishot case wakes the sensor node up more frequently. For both sets of experiments, four selected benchmark compression algorithms, namely RLE, *K*-RLE, LEC, and LZW, were compared with our D-RLE algorithm. For *K*-RLE, we set *K* = 1. A sensor node was used in the experiments and ran these algorithms to compress the datasets before transmitting the data to a base station. The DRS obtained and the energy consumed by each algorithm were then recorded for comparison. The algorithms were implemented on a LAUNCHXL-CC1310 board [34] acting as the sensor node equipped with an RF module. To measure the power and energy used in data compression and transmission, an MSP430FR5969 board [35] and the Code Composer Studio (CCS) program (version 8.0.0, Texas Instruments Inc., Dallas, TX, USA) [36] were used. The MSP430FR5969 board was connected to the LAUNCHXL-CC1310 board as shown in Figure 3, and the amount of energy used was measured by the EnergyTrace function in the CCS software. We removed the jumper connecting the microcontroller and debug sections of the CC1310 board to ensure that the power source came from the MSP430FR5969 board. The corresponding 3.3 V Vcc and ground pins of the two boards were wired up, as illustrated by the black and white lines in Figure 3.

**Figure 3.** Board configuration for measuring energy use.

Tables 5 and 6 show the comparison results and total energy consumption for the datasets with different sizes, while Tables 7 and 8 show the comparison results and total energy consumption for the lengthy datasets with the same size of 46,080 bits, respectively. In the matter of DRS, the sine-like data change values gradually and there are no repeating values in adjacent data, so RLE and *K*-RLE obviously perform poorly, while D-RLE works much better than the others. On the other hand, the temperatureMin dataset has many repeating values, and this characteristic helps RLE, *K*-RLE, and D-RLE achieve higher DRS than LEC and LZW. The temperatureMin dataset also has many nearby identical repeating data, making it possible for LZW to create a dictionary with shorter encoding bits than LEC; hence, LZW compresses these data better than LEC. For the simulated temperature dataset, LZW, RLE, and *K*-RLE perform worse than LEC and D-RLE, since there are no repeating data values. LEC and D-RLE share a similar concept in the way they encode difference values: while LEC considers all of the data as one group for data encoding, D-RLE divides the data into several subgroups in which the members within the same subgroup are not much different, leading to fewer encoding bits. For the temperatureHr dataset, D-RLE performs very well, *K*-RLE, LEC, and LZW work comparably, and RLE is the worst; RLE cannot cope with datasets that have no repeating values, and its negative DRS means that it could not compress the data at all and even added extra overhead to the raw data. For the chaotic dataset, LEC, LZW, and D-RLE perform better than the RLE family, due to the data fluctuation and the rarity of repeating data. We found that, in terms of DRS on those datasets, our D-RLE is the winner in both single-shot and multishot patterns.

Though the compression time of D-RLE is longer than the others except for LZW, the compression energy used by D-RLE is not much different from the others in multishot patterns. For the single-shot pattern, D-RLE spends compression energy similar to LEC and consumes more compression energy than RLE and *K*-RLE, but less than LZW. We suggest using D-RLE for datasets with long sequences of data; it works best for repeating data or gradual changes in data values.




**Table 6.** Total energy use for the datasets with different sizes.



**Table 8.** Total energy use for the datasets with the same size of 46,080 bits.


As a result of its highest DRS performance, the number of packets for data delivery by D-RLE is smaller than the number of packets by the other algorithms, leading to less transmission energy. In terms of energy use, D-RLE uses the least total energy compared to the other algorithms on most of the datasets, as shown in Tables 6 and 8. For example, for the temperatureHr dataset of 46,080 bits, D-RLE sends only about 12 packets with a total energy use of 18.82 mJ, while the others use more than 30 packets with a total energy greater than 45 mJ. The total power use, *Ptotal*, consists of two parts from the compression and transmission steps, referred to as compression power, *Pc*, and transmission power, *Pt*, respectively. *Ptotal* is determined by Equation (8), in which the subscript *i* indicates the *i*th group number when we compress the data, and *j* the *j*th payload or packet number that we deliver. The values of *g* and *p* are the number of groups and the number of packets, respectively. To calculate the corresponding energy used in each step, we use the relationships between energy and power in Equations (9) and (10), where *Tci* and *Ttj* are the compression time spent compressing the *i*th group and the transmission time spent delivering the *j*th packet, respectively. Lastly, the total energy consumption, *Etotal*, is computed by Equation (11), adding the transmission and compression energy. The experimental results show that D-RLE takes minimal transmission energy in exchange for slightly more compression energy, a trade worth considering, as D-RLE significantly reduces the total energy use while the other benchmark algorithms consume much higher total energy on the same datasets:

$$P_{total} = \sum_{i=1}^{g} P_{c_i} + \sum_{j=1}^{p} P_{t_j}, \tag{8}$$

$$E_c = \sum_{i=1}^{g} P_{c_i} \times T_{c_i}, \tag{9}$$

$$E_t = \sum_{j=1}^{p} P_{t_j} \times T_{t_j}, \tag{10}$$

$$E_{total} = E_t + E_c. \tag{11}$$

Figure 4 compares the power and energy consumed by D-RLE during data compression and transmission in the single-shot and multishot cases on the temperatureMin dataset. Both cases have the same data length of 46,080 bits. The single shot receives all of the data before starting compression and transmission, while the multishot receives several data portions, each of the same length. According to the graph, the transmission period clearly demands higher power than the compression period. However, the energy consumed during transmission, indicated as T in the graph, is less than the energy used during compression, indicated as C. The smaller size of the compressed data gives a shorter data transmission time in the single-shot case. The multishot case, on the other hand, has to repeat many compression and transmission cycles and takes more time. In each cycle of the multishot, the compression step is reinitiated, which lowers compression efficiency, and hence the accumulated energy used by the multishot is greater than that used by the single shot. Therefore, the single shot is more efficient and energy saving than the multishot. The other datasets yield graphs of the same shape as the temperatureMin dataset.

### *4.3. Performance Visualization*

We have plotted radar charts, shown in Figure 5, over five categories to provide a simple way to visualize the performance comparison among the algorithms. The first two categories are DRS and data accuracy (in the sense of how much the decoded data differ from the original data). The last three ad hoc categories are called compression time efficiency (CTE), compression energy efficiency (CEE), and transmission energy efficiency (TEE), defined by Equations (12)–(14), respectively. The parameter *A* in those equations is the number of algorithms used in the experiments, i.e., *A* = 5. *Tci*, *Eci*, and *Eti* are the compression time, compression energy, and transmission energy of the *i*th algorithm, respectively. Each category has a score from 0 to 100; the higher the score, the better the performance. The left panel of Figure 5 shows comparisons among RLE, *K*-RLE, and D-RLE, while the right panel shows comparisons among LEC, LZW, and D-RLE. In the left panel, D-RLE performs better than the others in most categories except CTE, given that D-RLE takes more time in the compression step. In the right panel, D-RLE performs equally to or slightly better than the others in terms of CEE, TEE, and accuracy. D-RLE is located between LEC and LZW on CTE, whereas D-RLE mostly achieves better DRS (the DRS results on the radar charts in Figure 5a–c may be hard to distinguish; the exact numbers are given in Table 7). On average, D-RLE gets a high, well-balanced score, reaching toward the vertices of the pentagon in the graph when compared to the other algorithms in each category.

**Figure 4.** Power and energy comparison between single shot and multishot on the TemperatureMin dataset.


**Figure 5.** Radar chart comparison. (**a**) sine-like; (**b**) chaotic; (**c**) simulated temperature; (**d**) TemperatureHr; (**e**) TemperatureMin.

$$CTE_i = \left(1 - \frac{T_{c_i}}{\sum_{a=1}^{A} T_{c_a}}\right) \times 100\tag{12}$$

$$CEE_i = \left(1 - \frac{E_{c_i}}{\sum_{a=1}^{A} E_{c_a}}\right) \times 100\tag{13}$$

$$TEE_i = \left(1 - \frac{E_{t_i}}{\sum_{a=1}^{A} E_{t_a}}\right) \times 100\tag{14}$$
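The three scores share one form: each algorithm's share of the total time or energy, inverted and scaled to 0–100, so a smaller share earns a higher score. A minimal sketch, with made-up per-algorithm measurements standing in for the experimental values:

```python
# Scores per Equations (12)-(14): score_i = (1 - v_i / sum(v)) * 100.
# The measurement lists below are hypothetical placeholders, not the
# paper's experimental data; only the formula follows the text.

def scores(values):
    """Higher score = smaller share of the total (better efficiency)."""
    total = sum(values)
    return [(1 - v / total) * 100 for v in values]

# Hypothetical measurements for A = 5 algorithms
# (e.g. RLE, K-RLE, LEC, LZW, D-RLE, in that order).
t_c = [1.2, 1.5, 0.9, 2.0, 1.8]   # compression time, s
e_c = [40.0, 45.0, 35.0, 60.0, 30.0]  # compression energy, mJ
e_t = [90.0, 85.0, 80.0, 75.0, 50.0]  # transmission energy, mJ

CTE, CEE, TEE = scores(t_c), scores(e_c), scores(e_t)
print([round(s, 1) for s in TEE])
```

Note that the *A* scores in each category always sum to (*A* − 1) × 100, so the radar chart compares relative shares rather than absolute costs.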

### **5. Conclusions**

We have presented a compression algorithm, D-RLE, for wireless sensor nodes, a domain in which energy use is one of the most important concerns. D-RLE first divides the data into groups based on Chauvenet's criterion; each group is then further split into subgroups, to which an adaptive encoding is applied. The experimental results demonstrate that D-RLE performs very well: it achieves the highest data savings rate while spending less energy than the benchmark algorithms. In particular, D-RLE is suited to large volumes of data with repeating or gradually changing values and to a single shot delivery mode. Owing to its high compression rate, the amount of data to transmit is significantly reduced, and hence less energy is demanded, which prolongs the battery life of the sensor nodes. This work thus offers an alternative way to improve the energy performance of sensor nodes.

**Author Contributions:** C.C. designed the core architecture and performed the hardware/software implementation and experiments; A.B. provided supervision to the project and has the responsibility as the main corresponding author; S.S. co-supervised the project and contributed to the experimental design, partial software development and result analysis.

**Funding:** This research was supported by the Faculty of Engineering, Burapha University, Thailand under the contract number 15/2555. The authors gratefully acknowledge the contributions of all committee members for their valuable suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
