Adaptive Dynamic Programming-Based Multi-Sensor Scheduling for Collaborative Target Tracking in Energy Harvesting Wireless Sensor Networks

Liu, Fen; Xiao, Wendong; Chen, Shuai; Jiang, Chengpeng

doi:10.3390/s18124090


by Fen Liu ^1,2, Wendong Xiao ^1,2,*, Shuai Chen ^1,2 and Chengpeng Jiang ^1,2

¹ School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China

² Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, China

^* Author to whom correspondence should be addressed.

Sensors 2018, 18(12), 4090; https://doi.org/10.3390/s18124090

Submission received: 20 October 2018 / Revised: 18 November 2018 / Accepted: 19 November 2018 / Published: 22 November 2018

(This article belongs to the Special Issue Advanced Technologies on Green Radio Networks)


Abstract:

Collaborative target tracking is one of the most important applications of wireless sensor networks (WSNs), in which the network must rely on sensor scheduling to balance tracking accuracy against energy consumption because the network resources for sensing, communication, and computation are limited. With the recent development of energy acquisition technologies, it has become possible to build WSNs based on energy harvesting and so overcome the limitation of battery energy; in theory, the lifetime of such a network could be extended indefinitely. However, energy-harvesting WSNs pose a new technical challenge for collaborative target tracking: how to schedule sensors over an infinite horizon when the energy harvesting capability of each sensor is limited. In this paper, we propose a novel adaptive dynamic programming (ADP)-based multi-sensor scheduling algorithm (ADP-MSS) for collaborative target tracking in energy-harvesting WSNs. ADP-MSS can schedule multiple sensors for each time step over an infinite horizon to achieve high tracking accuracy, based on the extended Kalman filter (EKF) for target state prediction and estimation. Theoretical analysis shows the optimality of ADP-MSS, and simulation results demonstrate its superior tracking accuracy compared with an ADP-based single-sensor scheduling scheme and a simulated-annealing-based multi-sensor scheduling scheme.

Keywords:

energy harvesting; wireless sensor networks; adaptive dynamic programming; target tracking; extended Kalman filter; sensor scheduling


1. Introduction

A wireless sensor network (WSN) is usually deployed to monitor the physical phenomena in the geographic area covered by a large number of sensor nodes. It has the advantages of low cost, rapid deployment, self-organization, and fault tolerance, with wide applications such as environmental monitoring [1,2], medical care [3], pension service [4], and intelligent transportation [5].

Target tracking is a typical research problem in WSNs for studying collaborative signal and information processing where the sensors are scheduled by considering the tracking performance and the constrained network resources (e.g., the wireless bandwidth and limited sensor energy).

Usually, sensor scheduling relates to tasking the appropriate sensors at the right time to achieve satisfactory performance by considering the limited sensing, computing, and communication resources of the sensors. Effective sensor scheduling algorithms are needed for collaborative target tracking in order to get accurate estimates and effective resource utilization. For example, sensors are scheduled by using dynamic clusters and duty cycling technology to economize the limited energy of the sensors in [6]. An adaptive sensor scheduling scheme was proposed in [7] to improve tracking accuracy on the premise of employing the same number of sensors based on extended Kalman filter (EKF). However, these existing sensor scheduling methods are based on the local optimization of performance for limited time steps, and may be sub-optimal from a global perspective (especially regarding the performance for all the time steps from the current one to the end or infinity).

Additionally, with the development of new technologies such as new materials, microelectronics, energy storage, and conversion, energy harvesting technologies that can obtain energy from a variety of energy sources in the environment (e.g., light [8], wind [9], thermoelectric energy [10,11], electromagnetic radiation [12]) have recently been developed. Energy harvesting can be used to charge batteries from time to time, and can help to avoid battery replacement. Energy harvesting technologies make it possible for the sensor nodes to obtain energy from the environment at a low cost. Accordingly, an energy harvesting-based WSN was introduced in which the sensor nodes could harvest energy from the environment to power the sensors [13].

Theoretically, the lifetime of an energy harvesting-enabled network is infinite. Subsequently, the key problem for sensor scheduling is no longer to maximize the network lifetime, but to optimize the network performance under the given energy harvesting capability. Hence, the development of energy harvesting technologies has provided a new challenge of infinite-horizon sensor scheduling with finite energy harvesting capability for high-performance target tracking. In this paper, we apply adaptive dynamic programming (ADP) to sensor scheduling for collaborative target tracking in an energy-harvesting WSN over an infinite horizon.

ADP was proposed by Werbos [14]. The typical structure of ADP consists of a critic module, a state transition module, and an action module. Each module can be realized by a neural network (NN) [15,16,17]. Characterized by strong abilities of self-learning and adaptation, ADP has demonstrated a strong capability to find the optimal control policy and solve the discrete system dynamic programming (DP) problem [18], including the adaptive critic design, reinforcement learning [19], and so on, obtaining the approximate optimal performance and the optimal control to satisfy the Bellman optimal principle through the function approximation structure. An ADP-based sensor scheduling scheme for target tracking in an energy-harvesting WSN was proposed in [20], which made the sensor energy consumption and tracking accuracy optimal over the system operational horizon for WSNs. However, only one sensor was scheduled for each time step, and therefore the tracking accuracy improvement was limited. In contrast, due to the energy harvesting capabilities, the motivation of this paper is to present a novel multi-sensor scheduling scheme for global performance optimization over an infinite horizon.

The rest of this paper is organized as follows: In Section 2, the related work is introduced. In Section 3, the energy harvesting model is established based on solar energy harvesting, and the EKF-based target motion model and measurement model are introduced. In Section 4, the details of the target tracking system and the predicted energy consumption model are given. In Section 5, the optimal multi-sensor scheduling problem is abstracted to a mathematical model and the ADP-based multi-sensor scheduling algorithm (ADP-MSS) is proposed. In Section 6, the performance of the proposed algorithm is examined by theoretical analysis and simulations. In Section 7, conclusions are drawn and some suggested future work is discussed.

2. Related Work

In the existing work for target tracking in WSNs, various sensor scheduling schemes have been proposed. By considering the number of scheduled sensors for each time step, they can be classified into one-sensor scheduling schemes and multi-sensor scheduling schemes. Meanwhile, by considering the scheduling mechanisms, they can be classified into non-adaptive sensor scheduling schemes and adaptive sensor scheduling schemes.

For example, a periodic sensor scheduling (PSS) scheme in which sensors sense the target alternately within predefined time slots was presented to avoid the inter-sensor interference (ISI) problem and utilize the sensors more effectively [21]. The drawback of PSS is the existence of an empty detection when a scheduled sensor cannot generate an effective measurement, which results in lower tracking accuracy and wasted sensor power. A distributed sensor scheduling scheme was proposed in [22] where the tasking sensor is elected spontaneously from the sensors with pending sensing tasks via random competition based on carrier sense multiple access. Each node does not need to know the locations of the other nodes, which reduces the memory required. However, the computation burden of the scheduling is shared among all the active nodes. Aiming to optimize the tradeoff between the tracking accuracy and the energy cost for collaborative target tracking in WSNs, a dynamic sensor selection scheme based on genetic algorithms was proposed in [23].

In the above approaches, only one sensor node is scheduled for performing the measurement at each time step. Generally, tracking performance can be further improved by multi-sensor scheduling. For example, a distributed multi-sensor target tracking algorithm was proposed in [24] by using a cluster-based Kalman filter (KF). At each time step, one sensor is selected as the head to fuse the measurements from the other sensors, estimate the target state using EKF, and send the results to the base station. A distributed-saturation-degree-based algorithm was proposed for target tracking with multiple ultrasonic sensors, where the ISI avoidance problem is converted to the problem of multiple access in a shared channel and the scheduling problem is transformed to a coloring problem [25]. In [26], probability-based prediction and sleep scheduling protocols were presented to improve energy efficiency with limited tracking performance loss. The above work does not analyze the energy consumption of the sensor node, and lacks the adaptation mechanism in response to energy changes in the network.

Some adaptive sensor scheduling solutions can be found in the literature. For example, an adaptive sensor scheduling scheme was introduced that selects the tasking sensor for the next time step according to the predicted tracking accuracy derived from the trace of the covariance matrix of the state estimation [21]. In [27], an energy-efficient target tracking method was proposed, where the KF is used to predict the target location for the next time step, then the sensor node and the cluster are selected to minimize the energy consumption. A multi-step sensor scheduling scheme based on the adaptive sampling interval approach was adopted in [28] to achieve fast tracking speed and superior energy efficiency without degrading the tracking accuracy. To improve energy efficiency and tracking accuracy, the authors in [29] proposed a multi-step sensor scheduling approach using the branch-and-bound algorithm. It could achieve the optimal multi-step sensor scheduling solution, but it easily leads to the “curse of dimensionality” problem.

Nevertheless, the above adaptive sensor scheduling solutions only choose one sensor node for each time step. Similarly, a flexible mechanism to improve the tracking performance can be obtained by adaptive multi-sensor scheduling. For example, in [30], a distributed adaptive multi-sensor scheduling was presented to implement the target tracking with the cooperation of the sensor nodes. In the adaptive sampling interval approach for single target tracking, the sensors are scheduled in alternative tracking mode to implement energy-efficient tracking according to the predicted tracking accuracy based on EKF [31]. To minimize the estimation error over multiple time steps in a computationally tractable fashion, Huber [32] proposed an information-based pruning algorithm for multi-step sensor scheduling by using the information matrices of the sensors and the monotonicity of the Riccati equation. In [33], several suboptimal scheduling algorithms were proposed with the performance expressed by the weighted sum of the estimated error covariance matrix in KF and the energy consumption. The posterior Cramer–Rao lower bound was proposed as a sensor selection metric, which put a constraint on the total number of selected sensors to observe the target over a time window [7,34].

However, all these methods dispatch the sensors based on the optimization of local performance, instead of global performance. In energy-harvesting WSNs, novel design criteria are required to achieve an overall performance optimization over an infinite horizon.

There are some studies in the literature on energy harvesting-based WSNs. For example, the Markov decision process (MDP) was presented to maximize the long-term expected throughput to derive the optimal power level [35]. However, the computational complexity of the MDP-based approaches is generally high due to the large volume of the state and action space. To optimize the transmission performance, Lyapunov optimization theory was used in an energy harvesting wireless communication system [36]. DP was proposed for the optimization of the task scheduling to maximize the quality of service in a solar energy-harvesting Internet of Things [37]. However, DP easily leads to the “curse of dimensionality”.

Recently, data-driven approaches have been widely used in the control field to realize a variety of data-based linear and nonlinear systems, for prediction, evaluation, scheduling, monitoring, diagnosis, decision-making, and optimization [38,39,40,41,42,43,44,45,46]. ADP is a typical data-driven approach for control over an infinite horizon, which can avoid the “curse of dimensionality” problem and the reverse solving problem existing in DP [47]. Up to now, ADP has been applied to nonlinear zero-/nonzero-sum differential games [48,49], optimal tracking control problems [50], optimal control of intelligent grid [51,52], and optimal time slot scheduling of MAC protocol [53]. Recently ADP was also proposed as an optimal sensor scheduling scheme for target tracking in an energy-harvesting WSN [20], by scheduling one sensor for each time step over an infinite horizon considering the global tracking accuracy and energy consumption. However, ADP-based multi-sensor scheduling for collaborative target tracking in energy-harvesting WSNs remains as an open and challenging problem.

3. Basic Models

A sensor node usually consists of a sensing unit, a processing unit, a transceiver unit, and a power unit. The energy is finite in a power unit, while the energy of an energy-harvesting sensor node can be collected by the energy harvesting device through ambient energy from time to time and stored as electric energy.

In this paper, we assume that solar energy-based harvesting technology is adopted by the sensor nodes. We also assume that the WSN of this paper is composed of one sink node and M energy-harvesting sensor nodes. Each sensor node with enough energy can sense the target in its sensing region and transmit the perceived information to the sink node. The sink node fuses the received measurements, predicts the target states, and performs sensor scheduling.

3.1. Solar Energy Harvesting Model of the Sensor Nodes

In this paper, we assume that solar energy harvesting is used by the sensor.

If the sensor’s energy storage capacity is unlimited, the harvested energy of a sensor node can be modeled as [54]:

$$E_{h}(t_{0}, \Delta T) = \int_{t_{0}}^{t_{0} + \Delta T} \eta_{e} f_{e}(t)\, dt = \eta_{e} \int_{t_{0}}^{t_{0} + \Delta T} f_{e}(t)\, dt, \tag{1}$$

where $t_{0}$ is the starting time for energy harvesting, $\Delta T$ is the time duration, $f_{e}(t)$ is the statistical distribution of the solar energy, and $\eta_{e}$ is the conversion efficiency of the solar panel.

However, unlimited storage capacity is impractical. Suppose that the maximal energy storage capacity of sensor $i$ is $H_{i}^{\max}$ $(0 < H_{i}^{\max} < \infty)$. Then, the energy stored in sensor $i$ after harvesting, starting from the residual energy $E_{left}$, is $\min(E_{left} + E_{h}(t_{0}, \Delta T), H_{i}^{\max})$.
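As a minimal sketch of Eq. (1) and the capacity limit, the integral can be approximated numerically and the result clipped at $H_{i}^{\max}$. The function names, the trapezoidal rule, and the constant solar profile used in the example are illustrative assumptions, not part of the original model.

```python
from typing import Callable


def harvested_energy(f_e: Callable[[float], float], eta_e: float,
                     t0: float, delta_t: float, steps: int = 1000) -> float:
    """Approximate Eq. (1) with the composite trapezoidal rule."""
    h = delta_t / steps
    total = 0.5 * (f_e(t0) + f_e(t0 + delta_t))
    for n in range(1, steps):
        total += f_e(t0 + n * h)
    return eta_e * total * h


def stored_energy(e_left: float, e_h: float, h_max: float) -> float:
    """Residual energy plus harvested energy, clipped at the storage capacity."""
    return min(e_left + e_h, h_max)


# Hypothetical constant solar profile (arbitrary units), for illustration only.
e_h = harvested_energy(lambda t: 2.0, eta_e=0.5, t0=0.0, delta_t=10.0)
capped = stored_energy(e_left=95.0, e_h=e_h, h_max=100.0)
```

With these illustrative numbers the harvested amount would overfill the storage, so the stored energy saturates at the capacity.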

3.2. EKF-Based Prediction and Estimation Model for Target State

In this paper, we will apply EKF to the target tracking problem. The basic idea is to use minimum mean square error as the best estimation criterion and update the current estimated state with the previous prediction and the current measurements [55]. In this paper, we adopt a linear target motion model and a non-linear measurement model, both with Gaussian noise distributions.

The state of the target at the k-th time step at $t_{k}$ is

$$X(k) = \begin{bmatrix} x(k) & v_{x}(k) & y(k) & v_{y}(k) \end{bmatrix}^{T}, \tag{2}$$

where $(x(k), y(k))$ are the location coordinates of the target and $(v_{x}(k), v_{y}(k))$ are the velocities of the target along the x-axis and the y-axis at $t_{k}$. The target motion is modeled by the following constant velocity motion model

$$X(k + 1) = A(\Delta t_{k}) X(k) + w(\Delta t_{k}), \tag{3}$$

$$A(\Delta t_{k}) = \begin{bmatrix} 1 & \Delta t_{k} & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta t_{k} \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad Q(\Delta t_{k}) = q \begin{bmatrix} \frac{\Delta t_{k}^{3}}{3} & \frac{\Delta t_{k}^{2}}{2} & 0 & 0 \\ \frac{\Delta t_{k}^{2}}{2} & \Delta t_{k} & 0 & 0 \\ 0 & 0 & \frac{\Delta t_{k}^{3}}{3} & \frac{\Delta t_{k}^{2}}{2} \\ 0 & 0 & \frac{\Delta t_{k}^{2}}{2} & \Delta t_{k} \end{bmatrix}. \tag{4}$$
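The matrices in Eq. (4) can be built directly from the sampling interval. The following sketch uses plain Python lists and propagates only the noise-free part of Eq. (3); the function names and the sample numbers are assumptions for illustration.

```python
from typing import List

Matrix = List[List[float]]


def transition_matrix(dt: float) -> Matrix:
    """A(dt) from Eq. (4) for the state ordering [x, v_x, y, v_y]."""
    return [[1.0, dt, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, dt],
            [0.0, 0.0, 0.0, 1.0]]


def process_noise_cov(dt: float, q: float) -> Matrix:
    """Q(dt) from Eq. (4), scaled by the noise intensity q."""
    a, b, c = q * dt ** 3 / 3.0, q * dt ** 2 / 2.0, q * dt
    return [[a, b, 0.0, 0.0],
            [b, c, 0.0, 0.0],
            [0.0, 0.0, a, b],
            [0.0, 0.0, b, c]]


def propagate(state: List[float], dt: float) -> List[float]:
    """Noise-free part of Eq. (3): A(dt) X(k)."""
    a = transition_matrix(dt)
    return [sum(a[r][c] * state[c] for c in range(4)) for r in range(4)]


# Hypothetical target at the origin moving with velocity (1, 2).
x_next = propagate([0.0, 1.0, 0.0, 2.0], dt=0.5)
q_mat = process_noise_cov(dt=2.0, q=3.0)
```

The positions advance by velocity times $\Delta t_{k}$ while the velocities stay constant, which is what the constant velocity model prescribes.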

If the target is detected by n sensors, then the sink will obtain n measurements $z_{j}(k)$ $(j = 1, \dots, n)$. Let $Z(k) = [z_{1}(k), z_{2}(k), \dots, z_{n}(k)]^{T}$; then the measurement model is given by

$$Z(k) = \begin{bmatrix} h_{1}(X(k)) \\ h_{2}(X(k)) \\ \vdots \\ h_{n}(X(k)) \end{bmatrix} + \begin{bmatrix} v_{1}(k) \\ v_{2}(k) \\ \vdots \\ v_{n}(k) \end{bmatrix} = h(X(k)) + v(k). \tag{5}$$

Some notations used in EKF are listed as follows:

$\hat{X} (k + 1 | k)$ : Step prediction for the (k + 1)-th time step using state estimation at the k-th time step.
$\hat{X} (k + 1 | k + 1)$ : State estimation at the (k + 1)-th time step.
$Δ t_{k}$ : Sampling time interval between two successive time steps.
$w (Δ t_{k})$ : Process noise at the k-th time step.
$Q (Δ t_{k})$ : Covariance matrix of the process noise at the k-th time step.
$q$ : Given scalar that represents the intensity of the process noise.
$v_{i} (k)$ : Measurement noise of sensor i at the k-th time step.
$v (k)$ : Measurement noise at the k-th time step.
$R_{i} (k)$ , $R (k)$ : Covariance matrix of the measurement noise at the k-th time step.
$h_{i} (X (k))$ , $h (X (k))$ : Measurement function at the k-th time step.
$H (k + 1)$ : Jacobian matrix of h at $t_{k + 1}$ with respect to $\hat{X} (k + 1 | k)$ .
$P (k + 1 | k)$ : Error covariance matrix of the state prediction for the (k + 1)-th time step.
$P (k + 1 | k + 1)$ : Error covariance matrix of the state estimation at the (k + 1)-th time step.
$I$ : Unit matrix.
$K (k + 1)$ : Kalman gain at the (k + 1)-th time step.

Both $w(\Delta t_{k})$ and $v_{i}$ are independent and assumed to have zero-mean, white, Gaussian probability distributions. $h_{i}$ is generally non-linear, depending on $X(k)$, the measurement characteristic, and the parameters (e.g., the location) of sensor i.

In EKF, the prediction is operated as

$$\hat{X}(k + 1 | k) = A(\Delta t_{k}) \hat{X}(k | k), \tag{6}$$

$$P(k + 1 | k) = A(\Delta t_{k}) P(k | k) A^{T}(\Delta t_{k}) + Q(\Delta t_{k}). \tag{7}$$

The estimation is operated as

$$K(k + 1) = P(k + 1 | k) H^{T}(k + 1) [H(k + 1) P(k + 1 | k) H^{T}(k + 1) + R(k + 1)]^{-1}, \tag{8}$$

$$\hat{X}(k + 1 | k + 1) = \hat{X}(k + 1 | k) + K(k + 1) (Z(k + 1) - h(\hat{X}(k + 1 | k))), \tag{9}$$

$$P(k + 1 | k + 1) = (I - K(k + 1) H(k + 1)) P(k + 1 | k). \tag{10}$$
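A minimal sketch of one prediction and estimation cycle, Eqs. (6)-(10), is given here for the single-measurement case (n = 1), where the bracketed term in Eq. (8) is a scalar and no matrix inversion is needed. The range-only measurement function, the helper names, and all numbers are assumptions for illustration; the paper does not fix a particular $h_{i}$ in this section.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def trace(a: Matrix) -> float:
    return sum(a[i][i] for i in range(len(a)))


def predict(x: Vector, p: Matrix, a: Matrix, q: Matrix) -> Tuple[Vector, Matrix]:
    """Prediction step, Eqs. (6)-(7)."""
    n = len(x)
    x_pred = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    apat = matmul(matmul(a, p), transpose(a))
    p_pred = [[apat[i][j] + q[i][j] for j in range(n)] for i in range(n)]
    return x_pred, p_pred


def update_range(x_pred: Vector, p_pred: Matrix, z: float,
                 sensor_xy: Sequence[float], r_var: float) -> Tuple[Vector, Matrix]:
    """Estimation step, Eqs. (8)-(10), for one assumed range measurement."""
    n = len(x_pred)
    dx, dy = x_pred[0] - sensor_xy[0], x_pred[2] - sensor_xy[1]
    rng = math.hypot(dx, dy)                 # h(x_pred); must be non-zero
    h_row = [dx / rng, 0.0, dy / rng, 0.0]   # Jacobian H(k+1), a 1 x 4 row
    ph = [sum(p_pred[i][j] * h_row[j] for j in range(n)) for i in range(n)]  # P H^T
    s = sum(h_row[i] * ph[i] for i in range(n)) + r_var   # scalar H P H^T + R
    gain = [ph[i] / s for i in range(n)]                  # K(k+1), Eq. (8)
    x_est = [x_pred[i] + gain[i] * (z - rng) for i in range(n)]   # Eq. (9)
    hp = [sum(h_row[k] * p_pred[k][j] for k in range(n)) for j in range(n)]  # H P
    p_est = [[p_pred[i][j] - gain[i] * hp[j] for j in range(n)]
             for i in range(n)]                            # Eq. (10)
    return x_est, p_est


# Hypothetical numbers: unit sampling interval, identity initial covariance.
dt, q = 1.0, 0.1
A = [[1.0, dt, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, dt], [0.0, 0.0, 0.0, 1.0]]
Q = [[q * dt ** 3 / 3, q * dt ** 2 / 2, 0.0, 0.0], [q * dt ** 2 / 2, q * dt, 0.0, 0.0],
     [0.0, 0.0, q * dt ** 3 / 3, q * dt ** 2 / 2], [0.0, 0.0, q * dt ** 2 / 2, q * dt]]
P0 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x_pred, p_pred = predict([0.0, 1.0, 0.0, 0.0], P0, A, Q)
x_est, p_est = update_range(x_pred, p_pred, z=9.0, sensor_xy=(10.0, 0.0), r_var=0.25)
```

In this example the measurement equals the predicted range, so the state estimate is unchanged while the trace of the covariance shrinks. With several sensors, the scalar division would be replaced by the inverse of the n-by-n matrix in Eq. (8).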

3.3. Tracking Accuracy

At the k-th step, the sink node schedules the sensors to minimize the global performance index, which is composed of the energy consumption and tracking accuracy, under the limited energy harvesting capabilities. It is impractical to calculate the error as the difference between the real state and the estimated state, because the measurement is not available before the sink schedules the sensors. The EKF error covariance, however, is available before measuring, and it quantifies how far the estimate is expected to deviate. Hence, (11) can be used to evaluate the tracking accuracy:

$$T(k) = \operatorname{trace}(P(k | k)). \tag{11}$$

4. The Optimal Sensor Scheduling Problem

4.1. System Assumptions

In our proposed algorithm, the assumptions made about the network model are as follows.

The sensors and sink node are stationary.
The sink node has strong computing ability and energy storage capacity with enough memory.
The sink is aware of the locations of the sensor nodes.
All sensor nodes are homogeneous (i.e., having the same sensing, processing, and communication capabilities).
A sensor node and the sink node can communicate directly with each other via a single-hop link.

4.2. Target Tracking Mechanism

Figure 1 illustrates a target tracking scenario in an energy-harvesting WSN. After observing the target, the tasking sensors transmit their measurements to the sink node. The sink node then fuses the received measurements, predicts the target state for the next time step based on EKF, schedules the next tasking sensor nodes by ADP-MSS, and notifies them via the low-power paging channel.

The general tracking system in the energy-harvesting WSN works as follows.

Initialization. When the target enters the sensor field, the energy-harvesting sensor with enough energy that detects the target for the first time becomes the first tasking sensor. It sends the measurement to the sink node.
State estimation and prediction. When the sink gets the new measurements, it estimates and predicts the state and error covariance by EKF.
Sensor scheduling. Based on the above solar energy harvesting model, the sink performs the sensor scheduling by ADP-MSS to minimize the performance index, which consists of the predicted tracking accuracy and energy consumption.
Mode swapping. The sink wakes up the tasking sensors for the current time step and switches the others to the sleeping mode via the low-power paging channel.
Monitoring and transmitting. The tasking sensors monitor the target and transmit the measurements to the sink.

4.3. Energy Consumption Analysis

At the k-th time step, the detection model of sensor i is described as

$$D_{i}(k) = \begin{cases} 0, & E_{i}(k) < E^{h} \\ 1, & E_{i}(k) \geq E^{h} \end{cases}, \tag{12}$$

where $E^{h}$ is a threshold value for sensing the target, $E_{i}(k)$ represents the received signal level, and $E_{i}(k) = E_{i}^{0} \exp(-\beta d_{(x, i)})$, in which $E_{i}^{0}$ and $\beta$ are constants and $d_{(x, i)}$ is the Euclidean distance between the target and sensor $i$. The set of tasking sensors scheduled to track the target, $\Omega_{T}(k)$, is a subset of $\Omega_{D}(k) = \{i \mid D_{i}(k) = 1\}$, which denotes the set of all candidate sensors that possibly detect the target. At $t_{k}$, the energy consumption of sensor $i$ is

$$E_{con}(i) = \begin{cases} E_{r} + E_{t}(i) + E_{p}, & u_{i}(k) = 1 \\ E_{s}, & u_{i}(k) = 0 \end{cases}. \tag{13}$$

If $u_{i}(k) = 1$, sensor i is scheduled; otherwise, sensor i is sleeping. $E_{r} = e_{r} b_{r}$ represents the energy consumed to receive $b_{r}$ bits of data. $E_{t}(i) = (e_{t} + e_{d} d_{(s, i)}^{2}) b_{t}$ represents the energy consumption due to transmitting $b_{t}$ bits of data to the sink node s, where $d_{(s, i)}$ is the distance between the sink and sensor i. $E_{p} = e_{p} b_{p}$ represents the energy consumption due to sensing and data processing of $b_{p}$ bits, and $E_{s}$ represents the energy required for sleeping. $e_{r}$, $e_{t}$, $e_{d}$, and $e_{p}$ are decided by the specifications of the sensor.
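The detection rule in Eq. (12) and the per-step energy cost in Eq. (13) translate into a few short functions. This is a sketch only; the function names and every numeric value in the example are hypothetical.

```python
import math


def detects(e0: float, beta: float, distance: float, threshold: float) -> int:
    """D_i(k) from Eq. (12): 1 if the received level reaches the threshold."""
    received = e0 * math.exp(-beta * distance)
    return 1 if received >= threshold else 0


def transmit_energy(e_t: float, e_d: float, dist_to_sink: float, bits: float) -> float:
    """E_t(i) = (e_t + e_d * d_(s,i)^2) * b_t."""
    return (e_t + e_d * dist_to_sink ** 2) * bits


def energy_consumption(scheduled: bool, e_r: float, e_t_i: float,
                       e_p: float, e_s: float) -> float:
    """E_con(i) from Eq. (13): full cost when scheduled, sleep cost otherwise."""
    return e_r + e_t_i + e_p if scheduled else e_s


# Hypothetical parameters, arbitrary units.
near = detects(e0=1.0, beta=0.1, distance=10.0, threshold=0.3)   # exp(-1) ~ 0.37
far = detects(e0=1.0, beta=0.1, distance=20.0, threshold=0.3)    # exp(-2) ~ 0.14
e_t_i = transmit_energy(e_t=2.0, e_d=0.5, dist_to_sink=2.0, bits=10.0)
active = energy_consumption(True, e_r=1.0, e_t_i=e_t_i, e_p=3.0, e_s=0.5)
asleep = energy_consumption(False, e_r=1.0, e_t_i=e_t_i, e_p=3.0, e_s=0.5)
```

Because the transmission term grows with the squared distance to the sink, sensors far from the sink pay more for each scheduled step under this model.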

In this paper, the design objective is to schedule the sensors for high tracking performance over an infinite horizon. Set the system state as the residual energy of the energy-harvesting sensors and the system control as the sensor scheduling solution. At $t_{k}$, before being scheduled, the residual energy of sensor $i$ is

$$E_{i}^{bf}(k) = \min(E_{i}^{af}(k - 1, u_{i}(k - 1)) + E_{h}(t_{k}, \Delta t_{k}), H_{i}^{\max}) = f(E_{i}^{af}(k - 1, u_{i}(k - 1))), \tag{14}$$

where $E_{i}^{af}(k - 1, u_{i}(k - 1))$ is the residual energy of sensor $i$ after being scheduled at the (k − 1)-th time step. If scheduled, sensor i must satisfy the restriction

$$E_{i}^{bf}(k) \geq E_{r} + E_{t}(i) + E_{p}. \tag{15}$$

Let $\Omega_{u}(k)$ be the subset of $\Omega_{D}(k)$ whose sensors satisfy (15). For time step k, the scheduled sensors must be a subset of $\Omega_{u}(k)$. After being scheduled at the k-th time step, the consumed energy and residual energy of sensor i are

$$E_{i}^{c}(k, u_{i}(k)) = (E_{r} + E_{t}(i) + E_{p}) u_{i}(k) + E_{s} (1 - u_{i}(k)), \tag{16}$$

$$E_{i}^{af}(k, u_{i}(k)) = E_{i}^{bf}(k) - E_{i}^{c}(k, u_{i}(k)). \tag{17}$$

According to (14), (16), and (17), we can obtain

$$E_{i}^{af}(k, u_{i}(k)) = f(E_{i}^{af}(k - 1)) + g_{i} u_{i}(k) - E_{s}, \tag{18}$$

where $g_{i} = E_{s} - (E_{r} + E_{t}(i) + E_{p})$. Let $g = [g_{1}, g_{2}, \dots, g_{M}]$; the system state of the k-th time step is $E_{af}(k) = [E_{1}^{af}(k, u_{1}(k)), E_{2}^{af}(k, u_{2}(k)), \dots, E_{M}^{af}(k, u_{M}(k))]$, and the control of the k-th time step is $u(k) = [u_{1}(k), u_{2}(k), \dots, u_{M}(k)]$. Then, the system model is

$$E_{af}(k) = f(E_{af}(k - 1)) + g \times u(k) - E_{s}, \tag{19}$$

where $g \times u(k)$ means the Hadamard product (i.e., element-wise product) between $g$ and $u(k)$.
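To make the state transition concrete, the sketch below computes the residual energy in two ways: step by step through Eqs. (14), (16), and (17), and in the compact form of Eq. (19). Both should agree. The function names, the per-sensor harvest list, and the numbers are assumptions for illustration.

```python
from typing import List, Sequence


def step_energy(e_af_prev: Sequence[float], u: Sequence[int],
                harvest: Sequence[float], h_max: Sequence[float],
                active_cost: Sequence[float], e_sleep: float) -> List[float]:
    """Residual energy after step k via Eqs. (14), (16), (17)."""
    result = []
    for e_prev, u_i, e_h, cap, cost in zip(e_af_prev, u, harvest, h_max, active_cost):
        e_bf = min(e_prev + e_h, cap)                      # Eq. (14)
        if u_i == 1 and e_bf < cost:                       # restriction, Eq. (15)
            raise ValueError("scheduled sensor does not have enough energy")
        consumed = cost * u_i + e_sleep * (1 - u_i)        # Eq. (16)
        result.append(e_bf - consumed)                     # Eq. (17)
    return result


def step_energy_compact(e_af_prev: Sequence[float], u: Sequence[int],
                        harvest: Sequence[float], h_max: Sequence[float],
                        active_cost: Sequence[float], e_sleep: float) -> List[float]:
    """Same update in the form of Eq. (19): f(E) + g * u - E_s, element-wise."""
    return [min(e_prev + e_h, cap) + (e_sleep - cost) * u_i - e_sleep
            for e_prev, u_i, e_h, cap, cost
            in zip(e_af_prev, u, harvest, h_max, active_cost)]


# Two hypothetical sensors: the first is scheduled, the second sleeps.
args = ([5.0, 9.0], [1, 0], [2.0, 2.0], [10.0, 10.0], [4.0, 4.0], 0.5)
explicit = step_energy(*args)
compact = step_energy_compact(*args)
```

Here `active_cost[i]` stands for $E_{r} + E_{t}(i) + E_{p}$, so `e_sleep - cost` plays the role of $g_{i}$.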

5. ADP-Based Optimal Multi-Sensor Scheduling Algorithm

5.1. The Proposed Algorithm

The predicted tracking accuracy and the energy consumption for time step k were analyzed above. To trade off the potentially infinite network lifetime against the tracking accuracy, we define the utility function at time step k as

$$U(k) = \beta_{1} T(k) + \sum_{i = 1}^{M} E_{i}^{c}(k, u_{i}(k)), \tag{20}$$

in which $\beta_{1} > 0$ is a coefficient to adjust the weight of the tracking accuracy [7]. It is obvious that $U(k)$ is finite. Define the global performance index as the weighted sum of the utility function from time step k to infinity:

$$J(k) = \sum_{j = k}^{\infty} \gamma^{j - k} U(j), \tag{21}$$

where $0 < \gamma \leq 1$ is a discount factor. Then, we can derive a Hamilton–Jacobi–Bellman (HJB) equation:

$$J(k) = U(k) + \gamma J(k + 1). \tag{22}$$
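The recursion in Eq. (22) follows from Eq. (21) by splitting off the first term of the sum. A small numeric check, assuming a finite (truncated) list of utilities in place of the infinite sum, illustrates this; the function name and the sample values are hypothetical.

```python
from typing import Sequence


def discounted_cost(utilities: Sequence[float], gamma: float) -> float:
    """Truncated Eq. (21), evaluated backwards with the recursion in Eq. (22)."""
    total = 0.0
    for u in reversed(utilities):
        total = u + gamma * total
    return total


# Hypothetical utilities U(k), U(k+1), U(k+2) and discount factor.
utilities = [4.0, 2.0, 1.0]
gamma = 0.5
j_k = discounted_cost(utilities, gamma)        # 4 + 0.5 * 2 + 0.25 * 1
j_k1 = discounted_cost(utilities[1:], gamma)   # 2 + 0.5 * 1
```

The value `j_k` equals `utilities[0] + gamma * j_k1`, which is Eq. (22) for this truncated example.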

Hence, the optimal multi-sensor scheduling problem for target tracking in an energy-harvesting WSN is

$$\begin{array}{ll} \min_{u(k)} & J(k) \\ \text{s.t.} & D_{i}(k) = 1, \quad \forall i \in \{i \mid u_{i}(k) = 1\}, \\ & E_{af}(k) \geq 0. \end{array} \tag{23}$$

Let $J^{*}(k) = \min_{u(k)} J(k)$. Then, we can get the following HJB equation

$$J^{*}(k) = \min_{u(k)} \{U(k) + \gamma J^{*}(k + 1)\} \tag{24}$$

and the optimal control $u^{*}(k)$ by

$$u^{*}(k) = \arg\min_{u(k)} \{U(k) + \gamma J^{*}(k + 1)\}. \tag{25}$$
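For a small number of eligible sensors, the arg-min in Eq. (25) can be found by exhaustive search over sensor subsets. The sketch below is only an illustration of that search: the cap on the number of active sensors, the requirement of at least one active sensor, and the two callables that supply $U(k)$ and the next-step value are all assumptions, not the authors' implementation.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple

Subset = Tuple[int, ...]


def best_schedule(eligible: Sequence[int], max_active: int,
                  utility: Callable[[Subset], float],
                  next_value: Callable[[Subset], float],
                  gamma: float) -> Tuple[Subset, float]:
    """Exhaustive arg-min of U(k) + gamma * J(k+1) over non-empty sensor subsets.

    `eligible` is assumed to hold only sensors that detect the target and have
    enough energy, so the constraints in (23) are already satisfied.
    """
    best_set: Subset = ()
    best_cost = float("inf")
    for size in range(1, max_active + 1):
        for subset in combinations(eligible, size):
            cost = utility(subset) + gamma * next_value(subset)
            if cost < best_cost:
                best_set, best_cost = subset, cost
    return best_set, best_cost


# Hypothetical utility: tracking term shrinks with more sensors, energy term grows.
def toy_utility(subset: Subset) -> float:
    return 6.0 / len(subset) + 1.0 * len(subset)


chosen, cost = best_schedule([0, 1, 2], max_active=3, utility=toy_utility,
                             next_value=lambda subset: 0.0, gamma=0.9)
```

The number of subsets grows combinatorially with the number of eligible sensors, which is one reason an approximate method is attractive for larger networks.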

Generally, the optimal performance index function $J^{*}(k)$ is nonlinear, and it is difficult to obtain the optimal control by directly solving (24). To overcome the above problem, the ADP-MSS is proposed to get the approximate optimal solution in this paper.

A diagram of the proposed ADP-MSS is shown in Figure 2. It comprises three modules: model, critic network, and action. The model describes the relationship between the next system state $E_{af}(k + 1)$, the current system state $E_{af}(k)$, and the system control $u(k)$ (i.e., the model in (19)). The critic network, which evaluates the infinite-horizon performance, is realized by a neural network whose input is the system state and whose output is the evaluated performance index $\Phi(k)$, which tends to satisfy the HJB equation defined in (22). The action module finds the optimal control for the performance evaluated by the critic network.

It runs as follows.

At first, let $\Phi^{[0]}(k) = 0$ for any $k$; then we can obtain the optimal performance index at the first iteration step

$$\Phi^{[1]}(k) = \min_{u(k)} \{U(k) + \gamma \Phi^{[0]}(k + 1)\} \tag{26}$$

and the optimal control strategy

$$u^{[0]}(k) = \arg\min_{u(k)} \{U(k) + \gamma \Phi^{[0]}(k + 1)\}. \tag{27}$$

Next, when the iteration step $i = 1, 2, \cdots$, we can obtain

$$\Phi^{[i + 1]}(k) = \min_{u(k)} \{U(k) + \gamma \Phi^{[i]}(k + 1)\}, \tag{28}$$

$$u^{[i]}(k) = \arg\min_{u(k)} \{U(k) + \gamma \Phi^{[i]}(k + 1)\}. \tag{29}$$
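The iteration in (26)-(29) has the structure of value iteration. The following toy sketch runs it on a deliberately tiny, hypothetical problem (one sensor, four discrete energy levels, tabular values indexed by state instead of a neural network) to show the update and the way successive iterates grow from zero. None of the numbers come from the paper.

```python
from typing import Dict, List, Tuple

GAMMA = 0.9
LEVELS = range(4)                     # hypothetical discrete energy levels 0..3
CAP, HARVEST, SENSE_COST = 3, 1, 2    # storage cap, harvest per step, sensing cost


def options(level: int) -> List[Tuple[int, float, int]]:
    """Feasible (action, utility, next_level) triples for one energy level."""
    charged = min(level + HARVEST, CAP)
    result = [(0, 5.0, charged)]                          # sleep: poor tracking
    if charged >= SENSE_COST:
        result.append((1, 2.0, charged - SENSE_COST))     # sense: costs energy
    return result


def value_iteration(iterations: int) -> List[Dict[int, float]]:
    """Apply the update of Eq. (28) repeatedly, starting from Phi^[0] = 0."""
    history = [{s: 0.0 for s in LEVELS}]
    for _ in range(iterations):
        prev = history[-1]
        history.append({s: min(cost + GAMMA * prev[nxt]
                               for _action, cost, nxt in options(s))
                        for s in LEVELS})
    return history


history = value_iteration(50)
```

In this toy run every iterate is at least as large as the previous one and stays below `5.0 / (1 - GAMMA)`, the discounted sum of the largest utility.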

The critic network is designed to approximate $\Phi^{[i + 1]}$. The input is $E_{af}(k) \in \mathbb{R}^{1 \times M}$, where $\mathbb{R}$ is the set of real numbers, and the output is

$$\Phi^{[i + 1]}(k) = w_{c}^{T}(k) \sigma(v_{c}^{T}(k) E_{af}(k)), \tag{30}$$

where $w_{c}(k)$ and $v_{c}(k)$ are the weights of the critic network and $\sigma(\cdot)$ is the activation function. The training target can be expressed as

$$\tilde{\Phi}^{[i + 1]}(k) = U(k) + \gamma \Phi^{[i]}(k + 1). \tag{31}$$

Hence, we can define the error of the network as

$$e_{c}^{[i + 1]}(k) = \Phi^{[i + 1]}(k) - \tilde{\Phi}^{[i + 1]}(k). \tag{32}$$

Therefore, the objective function to be minimized in the critic network is $E_{c}^{[i + 1]} = (e_{c}^{[i + 1]}(k))^{2} / 2$. The steepest descent method is used for the weight update:

$$w_{c}^{'}(k) = w_{c}(k) - \alpha_{c}\, \partial E_{c}^{[i + 1]}(k) / \partial w_{c}(k), \tag{33}$$

$$v_{c}^{'}(k) = v_{c}(k) - \alpha_{c}\, \partial E_{c}^{[i + 1]}(k) / \partial v_{c}(k), \tag{34}$$

in which $0 < \alpha_{c} < 1$ is the learning rate and the updated weights are $w_{c}^{'}(k)$ and $v_{c}^{'}(k)$.
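A single steepest-descent update of the critic, Eqs. (30)-(34), can be sketched as follows. The choice of tanh as the activation function, the weight shapes (an M-by-H input layer and H output weights), and all numeric values are assumptions made for this illustration; the source does not specify them here.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def critic_forward(e_af: Vector, v: Matrix, w: Vector) -> Tuple[float, Vector]:
    """Eq. (30) with an assumed tanh activation: Phi = w^T tanh(v^T E)."""
    hidden = [math.tanh(sum(e_af[m] * v[m][h] for m in range(len(e_af))))
              for h in range(len(w))]
    return sum(w[h] * hidden[h] for h in range(len(w))), hidden


def critic_loss(e_af: Vector, v: Matrix, w: Vector, target: float) -> float:
    """E_c = e_c^2 / 2 with e_c from Eq. (32)."""
    phi, _ = critic_forward(e_af, v, w)
    return 0.5 * (phi - target) ** 2


def critic_step(e_af: Vector, v: Matrix, w: Vector, target: float,
                alpha: float) -> Tuple[Vector, Matrix, Vector, Matrix]:
    """One steepest-descent update, Eqs. (33)-(34); also returns the gradients."""
    phi, hidden = critic_forward(e_af, v, w)
    err = phi - target
    grad_w = [err * hidden[h] for h in range(len(w))]
    grad_v = [[err * w[h] * (1.0 - hidden[h] ** 2) * e_af[m]
               for h in range(len(w))] for m in range(len(e_af))]
    new_w = [w[h] - alpha * grad_w[h] for h in range(len(w))]
    new_v = [[v[m][h] - alpha * grad_v[m][h] for h in range(len(w))]
             for m in range(len(e_af))]
    return new_w, new_v, grad_w, grad_v


# Hypothetical state (M = 2 sensors), hidden size H = 2, and training target.
e_af = [0.5, 0.2]
v0 = [[0.1, -0.2], [0.3, 0.4]]
w0 = [0.5, -0.3]
target = 1.0
w1, v1, grad_w, grad_v = critic_step(e_af, v0, w0, target, alpha=0.1)
loss_before = critic_loss(e_af, v0, w0, target)
loss_after = critic_loss(e_af, v1, w1, target)
```

In this example the loss after the update is smaller than before. In practice the target from Eq. (31) changes between iterations, so the loss is not expected to decrease monotonically across the whole training run.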

5.2. The ADP-MSS Implementation Process

The pseudocode for ADP-MSS at time step k is given in Algorithm 1. Here, the system state $E_{af}(k)$ is known, $\delta$ is a very small positive value defined by the user, and $\Phi^{[i]}(k)$ denotes the iterative global performance index from time step k to infinity at iteration step i. The iteration procedure is also terminated once a predefined maximum number of iteration steps (MI) is reached.

Algorithm 1 ADP-MSS

1: set the values of $R_{i}$, $q$, $\beta_{1}$, $\gamma$, $\alpha_{c}$, $X(1)$, $P(1)$
2: select the initial values of $w_{c}(k)$, $v_{c}(k)$ randomly from a given region
3: set $i = 0$; $\Phi^{[i]}(k) = 0$, $\forall k > 0$; and termination = false
4: while (termination = false) do
5:   $u^{[i]}(k) = \arg\min_{u(k)} \{U(k) + \gamma \Phi^{[i]}(k + 1)\}$ (Action)
6:   $E_{af}^{[i]}(k + 1) = f(E_{af}(k)) + g \times u^{[i]}(k) - E_{s}$ (Model)
7:   $\Phi^{[i]}(k + 1) = w_{c}^{T}(k) \sigma(v_{c}^{T}(k) E_{af}^{[i]}(k + 1))$ (Critic Network)
8:   $\tilde{\Phi}^{[i + 1]}(k) = U(k) + \gamma \Phi^{[i]}(k + 1)$ (HJB equation)
9:   $\Phi^{[i + 1]}(k) = w_{c}^{T}(k) \sigma(v_{c}^{T}(k) E_{af}(k))$ (Critic Network)
10:  $e_{c}^{[i + 1]}(k) = \Phi^{[i + 1]}(k) - \tilde{\Phi}^{[i + 1]}(k)$
11:  $i = i + 1$
12:  if $\| e_{c}^{[i + 1]}(k) \| < \delta$ or $i > MI$ then
13:    termination = true
14:  else
15:    update the weights $w_{c}(k)$, $v_{c}(k)$ by the steepest descent method
16:  end if
17: end while
18: return $u^{[i]}(k)$

6. Performance Analysis

6.1. Theoretical Analysis

Now we will prove the convergence of ADP-MSS. That is, when $i \to \infty$, $\Phi^{[i]}(k) \to J^{*}(k)$.

Theorem 1.

Let the proposed ADP-MSS be implemented according to (26)–(29), then

{Φ^{[i]} (k), i = 0, 1, 2, \dots}

is a bounded sequence.

Proof:

Define a new sequence as follows:

Ψ^{[i + 1]} (k) = U^{[i]} (k) + γ Ψ^{[i]} (k + 1),

(35)

in which U^{[i]} (k) is the utility under the control u^{[i]} (k). U^{[i]} (k) is bounded because the number of scheduled sensors is finite. Set Ψ^{[0]} (k) = 0 for any k. Then we can obtain

\begin{array}{rl} Ψ^{[i + 1]} (k) & = U^{[i]} (k) + γ Ψ^{[i]} (k + 1) \\ & = U^{[i]} (k) + γ [U^{[i - 1]} (k + 1) + γ Ψ^{[i - 1]} (k + 2)] \\ & = U^{[i]} (k) + γ U^{[i - 1]} (k + 1) + γ^{2} [U^{[i - 2]} (k + 2) + γ Ψ^{[i - 2]} (k + 3)] \\ & = \dots \\ & = \sum_{j = 0}^{i} γ^{j} U^{[i - j]} (k + j) + γ^{i + 1} Ψ^{[0]} (k + i + 1) \\ & = \sum_{j = 0}^{i} γ^{j} U^{[i - j]} (k + j) \end{array}

(36)

Each term U^{[i - j]} (k + j) is bounded, so for a discount factor 0 < γ < 1 (γ = 0.70 in Table 2) the discounted sum in (36) is bounded uniformly in i. Hence, Ψ^{[i + 1]} (k) is bounded. According to (28), we can conclude that Φ^{[i + 1]} (k) \leq Ψ^{[i + 1]} (k), so {Φ^{[i]} (k), i = 0, 1, 2, \dots} is a bounded sequence. □
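As a numerical illustration of the boundedness argument, the discounted sum in (36) can be compared with the geometric-series bound U_max / (1 − γ). The sketch below assumes every utility equals a hypothetical upper bound U_max = 2.0, which is not a value from the paper; γ = 0.70 is taken from Table 2.

```python
def discounted_sum(utilities, gamma):
    """Evaluate sum_j gamma^j * U_j, the form of the last line of (36)."""
    return sum((gamma ** j) * u for j, u in enumerate(utilities))

gamma = 0.70                    # discount factor (Table 2)
u_max = 2.0                     # hypothetical upper bound on the utility
bound = u_max / (1.0 - gamma)   # limit of the geometric series

# Worst case: every utility equals u_max; i runs from 0 to 49.
partial_sums = [discounted_sum([u_max] * (i + 1), gamma) for i in range(50)]
```

The partial sums increase with i but never exceed the bound, which is the property the proof relies on.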

Theorem 2.

Let the proposed ADP-MSS be implemented according to (26)–(29). Then {Φ^{[i]} (k), i = 0, 1, 2, \dots} is a monotone non-decreasing sequence, that is,

Φ^{[i + 1]} (k) \geq Φ^{[i]} (k) .

(37)

Proof:

Mathematical induction is used in the proof.

First, for i = 0, Φ^{[0]} (k) = 0 for all k. Then Φ^{[1]} (k) = \min_{u (k)} U (k), which is non-negative when the utility is non-negative, so

Φ^{[1]} (k) \geq Φ^{[0]} (k) .

(38)

Assume that Φ^{[l]} (k) \geq Φ^{[l - 1]} (k) holds for i = l - 1, l = 1, 2, \dots, and all k. Then, for i = l,

\begin{array}{rl} Φ^{[l + 1]} (k) & = \min_{u (k)} {U (k) + γ Φ^{[l]} (k + 1)} \\ & \geq \min_{u (k)} {U (k) + γ Φ^{[l - 1]} (k + 1)} \\ & = Φ^{[l]} (k) . \end{array}

(39)

Therefore, {Φ^{[i]} (k), i = 0, 1, 2, \dots} is a monotone non-decreasing sequence. □

From Theorem 1 and Theorem 2, {Φ^{[i]} (k)} is a bounded, monotone non-decreasing sequence, so it is convergent. Denote

Φ^{\infty} (k) = \lim_{i \to \infty} Φ^{[i]} (k) .

(40)

Theorem 3.

For any k, Φ^{\infty} (k) is the optimal performance index, that is, Φ^{\infty} (k) satisfies the HJB equation

Φ^{\infty} (k) = \min_{u (k)} {U (k) + γ Φ^{\infty} (k + 1)} .

(41)

Proof:

From Theorem 2, we have Φ^{\infty} (k) \geq Φ^{[i + 1]} (k) = \min_{u (k)} {U (k) + γ Φ^{[i]} (k + 1)} for every i. Letting i \to \infty, we obtain

Φ^{\infty} (k) \geq \min_{u (k)} {U (k) + γ Φ^{\infty} (k + 1)} .

(42)

By the definition of Φ^{\infty} (k), \forall ε > 0, \exists Φ^{[p]} (k) such that

Φ^{[p]} (k) \leq Φ^{\infty} (k) \leq Φ^{[p]} (k) + ε .

(43)

Since Φ^{[p]} (k) = \min_{u (k)} {U (k) + γ Φ^{[p - 1]} (k + 1)}, we have

Φ^{\infty} (k) \leq \min_{u (k)} {U (k) + γ Φ^{[p - 1]} (k + 1)} + ε .

(44)

Because ε is an arbitrary positive value, it can be made arbitrarily small. Letting p \to \infty, we obtain

Φ^{\infty} (k) \leq \min_{u (k)} {U (k) + γ Φ^{\infty} (k + 1)} .

(45)

From (42) and (45), we obtain (41), which is exactly the definition of J^{*} (k) after replacing Φ^{\infty} (\cdot) by J^{*} (\cdot). Hence, we can conclude that Φ^{\infty} (k) = J^{*} (k), which means that the sequence of iterative performance indexes in the proposed ADP-MSS converges to the optimal solution. □
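The behavior established by Theorems 1 to 3 can be observed on a toy problem. The sketch below runs the same iteration, Φ^{[i + 1]} = \min {U + γ Φ^{[i]}} starting from Φ^{[0]} = 0, on a hypothetical three-state model in which the value is indexed by a discrete energy level rather than by the time step k. The transition table and utilities are invented for illustration and are not the paper's WSN model; only γ = 0.70 comes from Table 2.

```python
GAMMA = 0.70  # discount factor (Table 2)

# Hypothetical 3-level energy model: state 0 = low, 1 = medium, 2 = high.
# Action 0 = sleep (battery recharges), action 1 = sense (battery drains).
NEXT_STATE = {(0, 0): 1, (0, 1): 0, (1, 0): 2, (1, 1): 0, (2, 0): 2, (2, 1): 1}
# Non-negative utilities (costs), also hypothetical.
UTILITY = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.0, (1, 1): 0.3, (2, 0): 1.0, (2, 1): 0.2}
STATES, ACTIONS = (0, 1, 2), (0, 1)

def bellman_backup(phi):
    """Apply phi_new(s) = min_a { U(s, a) + gamma * phi(next(s, a)) }."""
    return [min(UTILITY[s, a] + GAMMA * phi[NEXT_STATE[s, a]] for a in ACTIONS)
            for s in STATES]

def value_iteration(num_iters):
    """Return the list of iterates, starting from the zero function."""
    history = [[0.0 for _ in STATES]]
    for _ in range(num_iters):
        history.append(bellman_backup(history[-1]))
    return history

history = value_iteration(200)
phi_inf = history[-1]
# Residual of the fixed-point equation corresponding to (41).
residual = max(abs(a - b) for a, b in zip(phi_inf, bellman_backup(phi_inf)))
```

On this example the iterates never decrease, stay below U_max / (1 − γ), and the final iterate satisfies the fixed-point equation up to floating-point precision.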

6.2. Simulation Results

In this paper we used Matlab 2014 as the simulation tool and considered a numerical example in which a WSN was deployed to monitor a moving target in a closed 10 m × 10 m square region. The WSN contained 24 sensor nodes and one sink located at the center, as shown in Figure 3. For each sensor node, the sensing region was a circle centered on its own location with a radius of 3 m.

In the simulations, ranging sensors were used to measure the distance between the sensor and the target. For sensor i located at (x_{i}, y_{i}), the measurement function h_{i} is

h_{i} (X (k)) = \sqrt{{(x (k) - x_{i})}^{2} + {(y (k) - y_{i})}^{2}} .

(46)

The Jacobian matrix for the measurement function is

H (k + 1) = [\begin{matrix} \frac{x (k) - x_{1}}{\sqrt{{(x (k) - x_{1})}^{2} + {(y (k) - y_{1})}^{2}}} & 0 & \frac{y (k) - y_{1}}{\sqrt{{(x (k) - x_{1})}^{2} + {(y (k) - y_{1})}^{2}}} & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ \\ \frac{x (k) - x_{n}}{\sqrt{{(x (k) - x_{n})}^{2} + {(y (k) - y_{n})}^{2}}} & 0 & \frac{y (k) - y_{n}}{\sqrt{{(x (k) - x_{n})}^{2} + {(y (k) - y_{n})}^{2}}} & 0 \end{matrix}] .

(47)
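A minimal Python sketch of (46) and (47) follows. It assumes the state ordering [x, v_x, y, v_y], which is inferred from the zero second and fourth columns of (47) and is not stated in this section; the function names and the sensor coordinates are hypothetical.

```python
import math

def range_measurement(state, sensor_xy):
    """Distance from the target to one sensor, as in (46)."""
    x, _vx, y, _vy = state
    return math.hypot(x - sensor_xy[0], y - sensor_xy[1])

def jacobian_row(state, sensor_xy):
    """One row of H in (47): partial derivatives of the range w.r.t. [x, vx, y, vy]."""
    x, _vx, y, _vy = state
    d = range_measurement(state, sensor_xy)
    if d == 0.0:
        raise ValueError("Jacobian is undefined when the target sits on the sensor")
    return [(x - sensor_xy[0]) / d, 0.0, (y - sensor_xy[1]) / d, 0.0]

def jacobian(state, sensors):
    """Stack one row per scheduled sensor."""
    return [jacobian_row(state, s) for s in sensors]

state = [3.0, 1.0, 4.0, -1.0]                   # hypothetical target state
H = jacobian(state, [(0.0, 0.0), (3.0, 0.0)])   # two hypothetical sensors
```

The velocity columns are zero because a range measurement does not depend on the target velocity.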

We assumed the solar panel’s area was 5 cm × 5 cm. The harvested energy rate was 0.1 W/cm², and the energy conversion efficiency was 15%. The maximum capacity of each battery was 5 \times 10^{- 2} J, with an initial energy of 2.5 \times 10^{- 3} J, and the batteries had infinite recharge cycles. Meanwhile, the variance of the measurement noise of the sensor nodes varied from 0.01 to 0.1. Except for E_{s}, the energy consumption parameters were borrowed from [31], as shown in Table 1, and the other constant parameters are given in Table 2.
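To get a feel for the magnitudes involved, the following back-of-the-envelope sketch combines the panel area, harvesting rate, and conversion efficiency with the 0.05 s sampling interval from Table 2. It assumes the stated rate is applied constantly and that no energy is consumed, which is a simplification and not the harvesting model of Section 3.1; the constant and function names are hypothetical.

```python
import math

PANEL_AREA_CM2 = 5.0 * 5.0          # 5 cm x 5 cm panel
HARVEST_RATE_W_PER_CM2 = 0.1        # harvested energy rate
CONVERSION_EFFICIENCY = 0.15        # 15 %
BATTERY_CAPACITY_J = 5e-2           # maximum battery capacity
INITIAL_ENERGY_J = 2.5e-3           # initial battery energy
SAMPLING_INTERVAL_S = 0.05          # sampling interval (Table 2)

def harvested_power_w():
    """Electrical power delivered by the panel under the constant-rate assumption."""
    return PANEL_AREA_CM2 * HARVEST_RATE_W_PER_CM2 * CONVERSION_EFFICIENCY

def energy_per_interval_j():
    """Energy harvested during one sampling interval."""
    return harvested_power_w() * SAMPLING_INTERVAL_S

def intervals_to_full(initial_j=INITIAL_ENERGY_J):
    """Sampling intervals needed to fill the battery if nothing is consumed."""
    return math.ceil((BATTERY_CAPACITY_J - initial_j) / energy_per_interval_j())
```

Under these assumptions the panel delivers 0.375 W, or 0.01875 J per sampling interval, so an idle node would reach full capacity within three intervals.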

In the simulations, the true trajectory of the target was a circle with a radius of 4 m centered at the center of the WSN. The residual energy of the 24 sensors at time step k, E_{a f} (k), was used as the system state, and could be obtained from the previous system state estimate and the control according to (19). The control was the sensor scheduling scheme u (k) = [u_{1} (k), u_{2} (k), \dots, u_{24} (k)], where u_{i} (k) \in {0, 1} and u_{i} (k) = 1 means that sensor i was scheduled as one of the tasking sensors at time step k; otherwise sensor i was not scheduled and could remain in the sleeping mode. While the target was moving in the monitoring area, the tracking system iteratively performed target detection by the scheduled tasking sensors, transmission of the measurements from the tasking sensors to the sink, target state estimation and prediction by the sink, and sensor scheduling by the sink. If the sensors are not properly scheduled, tracking may fail or the overall tracking performance may degrade.
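The circular trajectory and the binary scheduling vector u (k) can be illustrated as follows. Several details are assumptions made only for this sketch: the region is taken to span [0, 10] m on both axes so that the center is (5, 5), the target moves counter-clockwise, the 24 sensors sit on a 5 × 5 grid with the sink in the middle cell (the actual layout is given in Figure 3), and the scheduling rule is a nearest-sensor placeholder, not the ADP policy. The speed of 5 m/s matches the case shown in Figure 5, and the sampling interval and initial location come from Table 2.

```python
import math

CENTER = (5.0, 5.0)                    # assumed center of the 10 m x 10 m region
RADIUS, SPEED, DT = 4.0, 5.0, 0.05     # radius (m), speed (m/s), sampling interval (s)
SENSING_RANGE = 3.0                    # sensing radius (m)
START = (8.81, 6.23)                   # initial target location (Table 2)

# Hypothetical layout: 5 x 5 grid with 2 m spacing, sink in the middle cell.
SENSORS = [(float(x), float(y)) for x in (1, 3, 5, 7, 9) for y in (1, 3, 5, 7, 9)
           if (x, y) != (5, 5)]

def true_position(step):
    """Target position on the circle after `step` sampling intervals."""
    theta0 = math.atan2(START[1] - CENTER[1], START[0] - CENTER[0])
    theta = theta0 + step * SPEED * DT / RADIUS   # counter-clockwise, assumed
    return (CENTER[0] + RADIUS * math.cos(theta),
            CENTER[1] + RADIUS * math.sin(theta))

def nearest_sensor_schedule(target_xy, max_sensors):
    """Placeholder rule: set u_i = 1 for the closest in-range sensors."""
    in_range = [(math.dist(s, target_xy), i) for i, s in enumerate(SENSORS)
                if math.dist(s, target_xy) <= SENSING_RANGE]
    u = [0] * len(SENSORS)
    for _, i in sorted(in_range)[:max_sensors]:
        u[i] = 1
    return u

u0 = nearest_sensor_schedule(true_position(0), max_sensors=3)
```

The initial location in Table 2 lies about 4 m from (5, 5), which is consistent with the assumed center. In ADP-MSS the vector u (k) is chosen by the action step of Algorithm 1 rather than by distance alone.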

The structure of the adopted critic network was 24–30–1 with 24 inputs, 30 nodes in the hidden layer, and 1 output. Its initial weight values were set randomly from the range (0, 0.5). Figure 4 shows the changes of the performance indexes for the first time-step of ADP-MSS, initialized at 0. It can be found that the change of the performance indexes was monotone non-decreasing as analyzed in Theorem 2, and the curve converged after about 600 iterations.

The true and estimated trajectories of the target are shown in Figure 5 for a measurement noise variance of 0.05 and a target speed of 5 m/s. The corresponding tracking error, defined as the Euclidean distance between the true and estimated coordinates of the target at time step k, is shown in Figure 6. To evaluate the tracking accuracy of ADP-MSS, the tracking errors of an ADP-based single-sensor scheduling algorithm (ADP-SSS) and a simulated annealing algorithm-based multi-sensor scheduling scheme (SAA-MSS) are shown in the same figure; the proposed ADP-MSS was more stable and accurate than both.
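The tracking error metric described above is straightforward to compute. The sketch below evaluates the per-step Euclidean error and its mean over a trajectory; averaging over time steps is an assumption about how the average errors in Figures 7 and 8 were obtained, and the sample coordinates are made up.

```python
import math

def tracking_error(true_xy, est_xy):
    """Euclidean distance between the true and estimated target positions."""
    return math.dist(true_xy, est_xy)

def average_tracking_error(true_track, est_track):
    """Mean per-step tracking error over a trajectory (assumed averaging rule)."""
    errors = [tracking_error(t, e) for t, e in zip(true_track, est_track)]
    return sum(errors) / len(errors)

true_track = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical true positions
est_track = [(3.0, 4.0), (1.0, 0.0)]    # hypothetical estimates
avg = average_tracking_error(true_track, est_track)
```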

Figure 7 and Figure 8 show the tracking errors as the target speed increased from 1 to 10 m/s and as the variance of the measurement noise changed from 0.01 to 0.1, respectively. The curves in these two figures show that the two multi-sensor scheduling schemes (ADP-MSS and SAA-MSS) were more stable and accurate than the single-sensor one (ADP-SSS). This is because multiple sensors can provide more information to improve the tracking accuracy through data fusion. In addition, the results from the ADP-MSS scheme were better than those from SAA-MSS. The main reason is that SAA-MSS only takes the local optimization of the performance into account. In fact, a node’s state depends on its previous state and may influence its states at the following steps. Hence, from the overall performance perspective, locally optimal solutions are not necessarily the most reasonable decisions and may have a negative impact on the global performance.

From Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8, the following conclusions can be drawn:

The performance index of ADP-MSS was monotonically non-decreasing and converged.
The multi-sensor scheduling schemes were more stable and reliable than the single-sensor scheme.
The proposed ADP-MSS could achieve global performance optimality.

7. Conclusions

ADP is an efficient method to solve the dynamic programming problems of discrete systems. This paper introduces the ADP approach (ADP-MSS) to the optimal multi-sensor scheduling problem for target tracking in energy-harvesting WSNs. We present an adaptive scheme to schedule the tasking sensors by considering the global optimization of the performance composed of the energy consumption and tracking accuracy over an infinite time horizon. Theoretical analysis proved that the iterative control by ADP-MSS will converge to the optimal solution. Through simulation results, we found that the multi-sensor scheduling schemes were more stable and reliable than the single sensor scheduling scheme and the proposed ADP-MSS was superior to an SAA-based multi-sensor scheduling scheme from a global perspective. As future work, more advanced ADP based cross-layer sensor network design schemes can be studied by jointly designing the network protocols with the sensor scheduling.

Author Contributions

F.L. proposed the ADP-MSS scheme for target tracking in the energy-harvesting WSN and conducted the experiments and analysis. W.X. supervised the work. S.C. and C.J. were involved in the discussions on ADP theory and its applications.

Funding

This work is supported in part by the National Natural Science Foundation of China (Grants No.61673055 and No.61773056).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ma, J.J.; Meng, F.S.; Zhou, Y.X.; Wang, Y.Y.; Shi, P. Distributed water pollution source localization with mobile UV-visible spectrometer probes in wireless sensor networks. Sensors 2018, 18, 606.
2. Mortazavi, S.H.; Salehe, M.; MacGregor, M.H. Maximum WSN coverage in environments of heterogeneous path loss. Int. J. Sens. Netw. 2014, 16, 185–198.
3. Touati, F.; Mnaouer, A.B.; Erdene-Ochir, O.; Mehmood, W.; Hassan, A.; Gaabab, B. Feasibility and performance evaluation of a 6LoWPAN-enabled platform for ubiquitous healthcare monitoring. Wirel. Commun. Mob. Comput. 2016, 16, 1271–1281.
4. Tang, D.; Yusuf, B.; Botzheim, J.; Kubota, N.; Chan, C.S. A novel multimodal communication framework using robot partner for aging population. Expert Syst. Appl. 2015, 42, 4540–4555.
5. Odat, E.; Shamma, J.S.; Claudel, C. Vehicle classification and speed estimation using combined passive infrared/ultrasonic sensors. IEEE Trans. Intell. Transp. Syst. 2018, 19, 1593–1606.
6. Yan, D.M.; Wang, J.K. Sensor scheduling algorithm target tracking-oriented. Wirel. Sens. Netw. 2011, 3, 295–299.
7. Chen, H.; Zhang, S.; Liu, M.; Zhang, Q. An artificial measurements-based adaptive filter for energy-efficient target tracking via underwater wireless sensor networks. Sensors 2017, 17, 971.
8. Reisi, A.R.; Moradi, M.H.; Jamasb, S. Classification and comparison of maximum power point tracking techniques for photovoltaic system: A review. Renew. Sustain. Energy Rev. 2013, 19, 433–443.
9. Wang, J.L.; Zhao, G.F.; Zhang, M.; Zhang, Z.E. Efficient study of a coarse structure number on the bluff body during the harvesting of wind energy. Energy Sources Part A Recov. Util. Environ. Eff. 2018, 40, 1788–1797.
10. Prijic, A.; Vracar, L.; Vuckovic, D.; Prijic, Z. Thermal energy harvesting wireless sensor node in aluminum core PCB technology. IEEE Sens. J. 2015, 15, 337–345.
11. Hou, L.Q.; Tan, S.D.; Zhang, Z.J.; Bergmann, N.W. Thermal energy harvesting WSNs node for temperature monitoring in IIoT. IEEE Access 2018, 6, 35243–35249.
12. Liu, J.X.; Xiong, K.; Fan, P.Y.; Zhang, Z.D. RF energy harvesting wireless powered sensor networks for smart cities. IEEE Access 2017, 5, 9348–9358.
13. Kausar, A.S.M.Z.; Reza, A.W.; Saleh, M.U.; Ramiah, H. Energizing wireless sensor networks by energy harvesting systems: Scopes, challenges and approaches. Renew. Sustain. Energy Rev. 2014, 38, 973–989.
14. Werbos, P.J. Advanced forecasting methods for global crisis warning and models of intelligence. Gen. Syst. Yearb. 1977, 22, 25–38.
15. Song, R.Z.; Xiao, W.D.; Zhang, H.G.; Sun, C.Y. Adaptive dynamic programming for a class of complex-valued nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 1733–1739.
16. Song, R.Z.; Xiao, W.D.; Sun, C.Y. A new self-learning optimal control laws for a class of discrete-time nonlinear systems based on ESN architecture. Sci. China Inf. Sci. 2014, 57, 1–10.
17. Song, R.Z.; Xiao, W.D.; Wei, Q.L.; Sun, C.Y. Neural-network-based approach to finite-time optimal control for a class of unknown nonlinear systems. Soft Comput. 2014, 18, 1645–1653.
18. Wei, Q.L.; Liu, D.R.; Lin, H.Q. Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems. IEEE Trans. Cybern. 2016, 46, 840–853.
19. Zhang, H.G.; Zhang, X.; Luo, Y.H.; Yang, J. An overview of research on adaptive dynamic programming. Acta Autom. Sin. 2013, 39, 303–311.
20. Song, R.Z.; Wei, Q.L.; Xiao, W.D. ADP-based optimal sensor scheduling for target tracking in energy harvesting wireless sensor networks. Neural Comput. Appl. 2016, 27, 1543–1551.
21. Xiao, W.D.; Wu, J.K.; Xie, L.H.; Dong, L. Sensor scheduling for target tracking in networks of active sensors. Acta Autom. Sin. 2006, 32, 922–928.
22. Zhang, F.; Chen, J.M.; Li, H.B.; Sun, Y.X.; Shen, X.M.(S.). Distributed active sensor scheduling for target tracking in ultrasonic sensor networks. Mob. Netw. Appl. 2012, 17, 582–593.
23. Wang, Y.; Wand, D. Energy-efficient node selection for target tracking in wireless sensor networks. Int. J. Distrib. Sens. Netw. 2013, 2013, 1–6.
24. Medeiros, H.; Park, J.; Kak, A. Distributed object tracking using a cluster-based Kalman filter in wireless camera networks. IEEE J. Sel. Top. Signal Process. 2008, 2, 448–463.
25. Cheng, P.; Zhang, F.; Chen, J.M.; Sun, Y.X.; Shen, X.M. A distributed TDMA scheduling algorithm for target tracking in ultrasonic sensor networks. IEEE Trans. Ind. Electron. 2013, 60, 3836–3845.
26. Jiang, B.; Ravindran, B.; Hyeonjoong, C. Probability-based prediction and sleep scheduling for energy-efficient target tracking in sensor networks. IEEE. Trans. Mob. Comput. 2013, 12, 735–747.
27. Madaan, A.; Makki, S.K.; Osborne, L.; Sun, B. An Intelligent Energy Efficient Target Tracking Scheme for Wireless Sensor Environment. In Proceedings of the IEEE International Symposium on Wireless Pervasive Computing, Modena, Italy, 5–7 May 2010.
28. Xiao, W.D.; Xie, L.H.; Chen, J.F.; Shue, L. Multi-Step Adaptive Sensor Scheduling for Target Tracking in Wireless Sensor Networks. In Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, Toulouse, France, 14–19 May 2006.
29. Song, B.; Xiao, W.D.; Zhang, Z.H. Multi-Step Sensor Scheduling for Energy-Efficient High-Accuracy Collaborative Target Tracking in Wireless Sensor Networks. Green Computing and Communications (GreenCom). In Proceedings of the 2013 IEEE and Internet of Things (iThings/CPSCom), IEEE International Conference on and IEEE Cyber, Physical and Social Computing, Beijing, China, 20–23 August 2013.
30. Lin, J.Y.; Xiao, W.D.; Lewis, F.L.; Xie, L.H. Energy-efficient Distributed Adaptive Multisensor Scheduling for Target Tracking in Wireless Sensor Networks. IEEE Trans. Instrum. Meas. 2009, 58, 1886–1896.
31. Xiao, W.D.; Zhang, S.; Lin, J.Y.; Tham, C.K. Energy-efficient adaptive sensor scheduling for target tracking in wireless sensor networks. J. Control Theor. Appl. 2010, 8, 86–92.
32. Huber, M.F. Optimal pruning for multi-step sensor scheduling. IEEE Trans. Autom. Control 2012, 57, 1338–1343.
33. Maheswararajah, S.; Halgamuge, S.K.; Premaratne, M. Sensor scheduling for target tracking by suboptimal algorithms. IEEE Trans. Veh. Technol. 2009, 58, 1467–1479.
34. Zhang, Q.; Liu, M.; Zhang, S. Node topology effect on target tracking based on UWSNs using quantized measurements. IEEE Trans. Cybern. 2015, 45, 2323–2335.
35. Blasco, P.; Gunduz, D.; Dohler, M. A learning theoretic approach to energy harvesting communication system optimization. IEEE Trans. Wirel. Commun. 2013, 12, 1872–1882.
36. Qiu, C.R.; Hu, Y.; Chen, Y.; Zeng, B. Lyapunov optimization for energy harvesting wireless sensor communications. IEEE Internet Things J. 2018, 5, 1947–1956.
37. Caruso, A.; Chessa, S.; Escolar, S.; Toro, X.D.; López, J.C. A dynamic programming algorithm for high-level task scheduling in energy harvesting IoT. IEEE Internet Things J. 2018, 5, 2234–2248.
38. Hou, Z.S.; Xu, J.X. On data-driven control theory: The state of the art and perspective. Acta Autom. Sin. 2009, 35, 650–667.
39. Zhao, P.; Nagamune, R. Switching LPV controller design under uncertain scheduling parameters. Automatica 2017, 76, 243–250.
40. Sato, M.; Peaucelle, D. A new method for gain-scheduled output feedback controller design using inexact scheduling parameters. In Proceedings of the 2018 IEEE Conference on Control Technology and Applications, Copenhagen, Denmark, 21–24 August 2018.
41. Lee, J.M.; Lee, J.H. Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes. Automatica 2005, 41, 1281–1288.
42. Hou, Z.S.; Jin, S.T. A novel data-driven control approach for a class of discrete-time nonlinear systems. IEEE Trans. Control Syst. Technol. 2011, 19, 1549–1558.
43. Ji, H.H.; Hou, Z.S.; Fan, L.L.; Lewis, F.L. Adaptive iterative learning reliable control for a class of non-linearly parameterised systems with unknown state delays and input saturation. IET Control. Theor. Appl. 2016, 10, 2160–2174.
44. Rosas, A.D.; Velazquez, V.K.; Olivares, F.L.; Camacho, T.A.; Williams, I. Methodology to assess quality of estimated disturbances in active disturbance rejection control structure for mechanical system. ISA Trans. 2017, 70, 238–247.
45. Roman, R.C.; Precup, R.E.; David, R.C. Second order intelligent proportional-integral fuzzy control of twin rotor aerodynamic systems. Procedia Comput. Sci. 2018, 139, 372–380.
46. Vrkalovic, S.; Lunca, E.C.; Borlea, I.D. Model-free sliding mode and fuzzy controllers for reverse osmosis desalination plants. Int. J. Artif. Intell. 2018, 16, 208–222.
47. Zhang, H.G.; Cui, L.L.; Zhang, X.; Luo, Y.H. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Trans. Neural Netw. 2011, 22, 2226–2236.
48. Wei, Q.L.; Song, R.Z.; Yan, P.F. Data-driven zero-sum neuro-optimal control for a class of continuous-time unknown nonlinear systems with disturbance using ADP. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 444–458.
49. Jiang, H.; Zhang, H.G.; Zhang, K.; Cui, X.H. Data-driven adaptive dynamic programming schemes for non-zero-sum games of unknown discrete-time nonlinear systems. Neurocomputing 2018, 275, 649–658.
50. Jiang, H.; Luo, Y.H. Data-driven approximate optimal tracking control schemes for unknown non-affine non-linear multi-player systems via adaptive dynamic programming. IET Electron. Lett. 2017, 53, 465–467.
51. Wei, Q.L.; Shi, G.; Song, R.Z.; Liu, Y. Adaptive dynamic programming-based optimal control scheme for energy storage systems with solar renewable energy. IEEE Trans. Ind. Electron. 2017, 64, 5468–5478.
52. Wei, Q.L.; Liu, D.R.; Lewis, F.L.; Liu, Y.; Zhang, J. Mixed iterative adaptive dynamic programming for optimal battery energy control in smart residential microgrids. IEEE Trans. Ind. Electron. 2017, 64, 4110–4120.
53. Xiao, W.D.; Liu, F.; Zhang, J.J. Adaptive Dynamic Programming for Multi-Point Scheduling in Energy Harvesting Wireless Sensor Networks. In Proceedings of the 2015 IEEE 12th International Conference on Ubiquitous Intelligence and Computing and 2015 IEEE 12th International Conference on Autonomic and Trusted Computing and 2015 IEEE 15th International Conference on Scalable Computing and Communications and Its Associated Workshops, Beijing, China, 10–14 August 2015.
54. Chen, H.B.; Zeng, Q.; Zhao, F. Efficient sleep scheduling algorithm for target tracking in double-storage energy harvesting sensor networks. Int. J. Distrib. Sens. Netw. 2016, 2016, 1–8.
55. Kalman, R.E.; Bucy, R.S. New results in linear filtering and prediction theory. J. Basic Eng. 1961, 83, 95–108.

Figure 1. The target tracking system in an energy-harvesting wireless sensor network (WSN).

Figure 2. Structure of the adaptive dynamic programming based multi-sensor scheduling algorithm (ADP-MSS). EKF: extended Kalman filter.

Figure 3. The layout of an energy-harvesting WSN.

Figure 4. The changes of performance indexes with the iterations.

Figure 5. The true trace and estimated trajectories of the target when the variance of the measurement noise was 0.05 and the target speed was 5 m/s. ADP-MSS: ADP-based multi-sensor scheduling algorithm; ADP-SSS: ADP-based single-sensor scheduling algorithm; SAA-MSS: simulated annealing algorithm-based multi-sensor scheduling.

Figure 6. The tracking error when the variance of the measurement noise was 0.05 and the target speed was 5 m/s.

Figure 7. The average tracking error when the target speed changed from 1 m/s to 10 m/s.

Figure 8. The average tracking error when the variance of the measurement noise changed from 0.01 to 0.1.

Table 1. Parameters in the energy consumption model.

Parameters	Value
$e_{t}$	$4.5 \times 10^{- 5} J / bit$
$e_{d}$	$1 \times 10^{- 11} J / bit \cdot m^{2}$
$e_{r}$	$1.35 \times 10^{- 4} J / bit$
$e_{p}$	$5 \times 10^{- 5} J / bit$
$E_{s}$	$1 \times 10^{- 6} J$

Table 2. Simulation parameters.

Parameters	Value
process noise parameter q	1
coefficient $β_{1}$	0.10
discount factor $γ$	0.70
learning rate $α_{c}$	0.20
sampling interval	0.05 s
Packet size in each transmission	10 bits
number of nodes in the NN hidden layer	30
initial location of the target	(8.81, 6.23)
computation precision $δ$	1 × 10⁻³
max iteration step MI in ADP algorithm	1000

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
