Article

Average Throughput Performance of Myopic Policy in Energy Harvesting Wireless Sensor Networks

by Omer Melih Gul * and Mubeccel Demirekler
Department of Electrical and Electronics Engineering, Middle East Technical University (METU), 06531 Cankaya, Ankara, Turkey
* Author to whom correspondence should be addressed.
Sensors 2017, 17(10), 2206; https://doi.org/10.3390/s17102206
Submission received: 1 August 2017 / Revised: 16 September 2017 / Accepted: 20 September 2017 / Published: 26 September 2017
(This article belongs to the Special Issue Energy Harvesting Sensors for Long Term Applications in the IoT Era)

Abstract

This paper considers a single-hop wireless sensor network where a fusion center collects data from M energy harvesting wireless sensors. The harvested energy is stored losslessly in an infinite-capacity battery at each sensor. In each time slot, the fusion center schedules K sensors for data transmission over K orthogonal channels. The fusion center has no direct knowledge of the battery states of the sensors or of the statistics of their energy harvesting processes; it only knows the outcomes of previous transmission attempts. It is assumed that the sensors are data backlogged, there is no battery leakage, and the communication is error-free. A scheduled energy harvesting sensor transmits data to the fusion center only if it has enough energy for the transmission. We investigate the average throughput of the Round-Robin type myopic policy both analytically and numerically under an average reward (throughput) criterion. We show that the Round-Robin type myopic policy achieves optimality for some classes of energy harvesting processes, although it is suboptimal for a broad class of energy harvesting processes.

1. Introduction

1.1. Motivation

The Internet of Things (IoT) is an intelligent large-scale communication infrastructure of uniquely identifiable devices capable of communicating with each other wirelessly through the Internet [1]. The devices in an IoT structure are typically equipped with wireless sensors [2]. Wireless Sensor Networks (WSNs) provide the opportunity of efficient data collection and transmission anywhere [3]. Thus, WSNs have various applications, such as agriculture [4], ambient air monitoring [5,6], frost monitoring [7], structural health monitoring [8,9,10], remote assistance for elderly people [11], home monitoring [3,11,12,13] and smart cities [14,15]. Being frugal with energy consumption is important in many WSN deployments. Energy harvesting (EH) [16] can particularly facilitate WSN applications where replacing batteries is impractical. Therefore, energy harvesting is a promising approach for the emerging IoT technology [17]. Energy may be harvested from the environment in several different ways (solar, piezoelectric, wind, etc.) [17]. As energy harvesters generally depend on uncontrollable energy resources and the amount of harvested energy is generally low [17,18], WSNs need robust, self-adaptive, energy-efficient policies to optimize their reliable operation lifetime [19,20].
In this paper, we consider a fusion center (FC) collecting data from M EH wireless sensors. In each time slot (TS), K sensors are scheduled for data transmission by the FC, which has no direct knowledge of the battery states of the sensors or of the statistics of their EH processes. It is assumed that the communication is error-free and that the sensors are data backlogged but limited in available energy. Each sensor has an infinite-capacity battery to store the harvested energy, and battery leakage is ignored. When a sensor is scheduled in TS t, it sends data to the FC in TS t as long as it has enough energy in that TS. Sending one packet occupies one TS. The objective of the FC is to maximize the average throughput over a time horizon.
In fact, battery states can be made available to the FC at some additional cost (i.e., feedback) and complexity in some WSNs. However, sending information about the battery state causes extra time and energy consumption, which we avoid. Assume that the header carrying only the battery state is H bytes and the remaining part (payload plus other headers) of the data packet is P bytes; then reporting the battery state costs H/P times more time and energy than not reporting it. The extra time consumption can be avoided only by spending significantly more energy. As is well known in the communication field, the data transmission rate is a concave function of the transmission power. In fact, the well-known Shannon capacity formula [21] (Shannon's capacity formula is $C = B \log_2(1 + S/N)$, where C is the maximum capacity of the channel in bits/second, also called the Shannon capacity limit for the given channel, B is the bandwidth of the channel in Hertz, S is the signal power in Watts and N is the noise power, also in Watts; the ratio S/N is called the Signal-to-Noise Ratio (SNR)) indicates that this concave function is logarithmic. Therefore, when reporting the battery state, the extra time consumption can be avoided only by consuming considerably more extra energy than a factor of H/P. For example, assume that the overhead carrying the battery state is one fourth of the actual data, i.e., H/P = 1/4. Then, sending both the overhead and the actual data in the same time duration, instead of sending only the actual data, may cause roughly two times more energy consumption. Thus, although energy consumption and network lifetime are not performance metrics in the problem at hand, the problem definition (with no battery-state information being sent) helps the sensors decrease the energy consumption per data packet transmission, and thus the network lifetime can be increased. Therefore, it is more relevant from a practical perspective that the FC makes scheduling decisions without any knowledge of the battery states or of the statistics of the EH processes [22].
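To put rough numbers on this trade-off, the following minimal sketch (illustrative only; the 10 dB SNR operating point and the function name are assumptions, not values from the system model) inverts the Shannon formula to estimate how much extra transmit power would be needed to fit the battery-state header into the same slot duration.

```python
import math

def extra_power_factor(snr_db: float, overhead_ratio: float) -> float:
    """Power multiplier needed to raise the Shannon rate by a factor (1 + overhead_ratio),
    so that payload plus battery-state header fit into the same slot duration."""
    snr = 10 ** (snr_db / 10.0)
    rate = math.log2(1 + snr)                  # C = B log2(1 + S/N), with B normalized to 1
    target_rate = (1 + overhead_ratio) * rate  # rate needed to also carry the header
    required_snr = 2 ** target_rate - 1        # invert the capacity formula
    return required_snr / snr                  # ratio of required power to original power

# Hypothetical operating point: H/P = 1/4 at 10 dB SNR.
print(extra_power_factor(10.0, 0.25))          # ~1.9, i.e., close to double the power
```

At this assumed operating point the sketch returns roughly 1.9, which is consistent with the factor-of-two extra energy consumption mentioned above.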
To set up the problem, a model of the generation and usage of energy is needed. Each sensor accesses the energy state of its own battery only at the beginning of the time slots in which it is scheduled by the FC. Moreover, independently of the functional form (linear or other) and the type of energy harvesting resource (solar, wind, piezoelectric, RF, etc.), the net harvested energy (harvested minus used) is stored losslessly. This assumption is consistent with typical batteries in use today, for which leakage is negligibly small over several minutes: according to the results in [23], self-discharge over 24 h is below 10% for Nickel-based batteries and below 5% for Lithium-ion batteries. Based on these mild assumptions about the EH processes, an appropriate performance criterion for the problem at hand is the average throughput (reward) over a time horizon rather than the expected discounted throughput (reward) [24].

1.2. Related Work

Although the EH processes are not limited to be Markovian in this work, under a Markovian assumption the problem could be formulated as a partially observable Markov decision process (POMDP) [25]. In this case, dynamic programming (DP) [26] may be employed to obtain an optimal solution. However, DP has exponential complexity, which limits its scalability [27].
A second approach is reinforcement learning, treating the problem as a POMDP. Q-learning [28], one of the most effective model-free reinforcement learning algorithms, would guarantee convergence to an optimal solution for this problem. However, its very slow convergence [29] makes it unsuitable for a problem with a sizeable state space, especially as the discount factor approaches 1. R-learning [30], which maximizes the average reward, may be considered; however, there is no guarantee on the convergence of R-learning. Therefore, reinforcement learning does not seem suitable for obtaining an efficient solution to this problem. There are other approaches that can, in the long run, guarantee convergence to optimal behavior. However, in many practical applications, a policy that achieves near optimality very quickly is preferable to one that converges too slowly to exact optimality [29].
Another approach to this problem is to set it up as a restless multi-armed bandit (RMAB) problem. An optimal solution was proposed by Whittle for the RMAB problem under certain assumptions [31]. It has been shown that finding the optimal solution to a general RMAB problem is PSPACE-hard [32] (in complexity theory, PSPACE is the set of all decision problems that can be solved by a Turing machine using a polynomial amount of space). As a policy with reasonable complexity, the myopic policy (MP) has been suggested for various RMAB problems. While the MP is not optimal in general, since it focuses only on the present state [33], it can be proven to be optimal in certain special cases.
A problem very similar to the problem at hand is investigated in [34,35]. In fact, we pose the same problem as [34,35] with the exception that we assume infinite-capacity batteries without leakage at the sensors, in contrast to [34,35], where either no battery or unit-capacity batteries with leakage are assumed. Both [34,35] formulate the problem as a POMDP and, due to the myopic approach in these works, the focus is on the immediate reward instead of future rewards. In [35], a single-hop WSN consisting of EH sensors with unit-capacity batteries (i.e., able to store only one transmission's worth of energy) and a fusion center is posed as an RMAB problem. The optimality of a round-robin (RR) based MP is proved under certain specific assumptions. Then, it is shown that this RR based MP coincides with the Whittle index policy, which is generally suboptimal for RMAB problems [36], for a specific case. In [34], the problem is formulated as a POMDP and the optimality of an MP is proven for two cases: (1) the sensors are unable to harvest and transmit simultaneously, and the transition probabilities of the EH processes are affected by the scheduling decisions; and (2) the sensors have no batteries.
In [37,38,39], we investigate problems quite similar to the problem at hand, although the problem in [38] has some differences due to its system model. In this paper, we consider a more general class of energy harvesting processes than [37,39] do: as explained in the rest of this paper, we consider energy harvesting processes with intensities both ρ ≤ 1 and ρ > 1, whereas only energy harvesting processes with intensities ρ ≤ 1 are considered in [37,38,39]. Besides this, in this paper, we also consider the cases for which finding the exact throughput performance of the myopic policy is not possible using only the intensities. For these cases, we find an upper bound on the throughput performance of the myopic policy.

1.3. Our Contributions

Main contributions of the paper are summarized as follows:
  • The EH WSN problem is studied under average throughput (reward) criterion and no battery leakage assumption for the most general class of EH processes whereas the problem is studied under discounted throughput (reward) criterion and battery leakage assumption for certain specific cases in [34,35].
  • This paper considers an infinite battery capacity for the EH WSN problem, which is larger than the unit capacity that is the maximum battery capacity considered in [34,35].
  • We show that under average throughput criterion and infinite-capacity battery assumption, RR policies including the MP in [34,35] achieve optimality for some class of EH processes although they are suboptimal for a broad class of EH processes.
  • We obtain an upper bound for throughput performance of the RR policies under average throughput criterion for quite general (Markov, i.i.d., nonuniform, uniform, etc.) EH processes. Furthermore, we show that all RR policies including the myopic policy achieve almost the same throughput performance under an average throughput criterion.
  • Compared with [34,35], we also consider the more realistic finite-capacity battery case in the numerical results and show that there is only a slight difference in throughput performance between the finite-capacity and infinite-capacity battery cases.

1.4. Organization of the Paper

The rest of this paper is organized as follows. The system model and problem formulation are given in Section 2. In Section 3, we show that RR based MP in [34,35] cannot achieve 100 % throughput for a broad class of EH processes under average throughput (reward) criterion. Moreover, we obtain an upper bound for throughput performance of RR policies including the myopic policy under average throughput criterion. Furthermore, we show that RR policies including the myopic policy achieve almost the same throughput as each other. In Section 4, numerical results show that the myopic policy is suboptimal for a broad class of EH processes, which supports the results found in Section 3. Section 5 concludes the paper and provides some future directions.

2. System Model and Problem Formulation

We consider a single-hop WSN where a fusion center (FC) collects data from M EH-capable sensors (please see Figure 1). The index set of all sensors is denoted by $S = \{1, 2, \ldots, M\}$. The WSN operates in a time-slotted fashion with slots indexed as $t = 1, 2, \ldots, T$. At the beginning of each TS, the FC schedules K sensors for data transmission by assigning each of them to one of its K mutually orthogonal channels. As the research community working on multi-channel protocols generally either assumes that channels are perfectly orthogonal (interference-free) or considers the use of only orthogonal channels [40], we assume that the channels are mutually orthogonal, i.e., there is no interference. If the sensors send data at a low data transmission rate and interference management is applied, transmission with a very low error rate can be achieved. Therefore, we assume error-free transmission in the WSN. We assume that the sensors always have data to send, as is assumed in [34,35]. In a single-hop wireless sensor network deployed in a wide lowland (flat cropland), there are no obstacles such as buildings or hills that may cause shadowing, reflection, refraction, absorption or diffraction. In a single-hop WSN with a central scheduler, the sensors are expected to send the same type of data, such as humidity, temperature or pressure. Considering the applications of WSNs in agriculture and frost monitoring [41,42,43,44,45], the sensors have nearly the same propagation conditions and send the same type of data in large croplands. Therefore, we assume that data packets have equal size and that sending one packet occupies one TS. A unit energy is defined as the energy required for a sensor to send one packet in one TS.
The energy harvested by sensor i in TSs 1 through t is denoted by $E_i(t)$, and the energy harvested in TS t is denoted by $E_i^h(t)$, i.e., $E_i^h(t) = E_i(t+1) - E_i(t)$. For $t = 1, \ldots, T$, we define the activation set, denoted by $\pi(t)$, as the set of sensors scheduled in TS t under a policy π.
As is assumed in [34,35], if a sensor has sufficient energy and is scheduled in TS t, it sends one data packet to the FC in TS t. The number of data packets sent by sensor i in TS t under a policy π can be written as $I_{\{i \in \pi(t)\}} \cdot I_{\{B_i^\pi(t) \geq 1\}} \in \{0, 1\}$, where $I_{\{\cdot\}}$ is the indicator function and $B_i^\pi(t)$ is the energy stored in the infinite-capacity battery of sensor i in TS t under policy π. Under the policy π, $B_i^\pi(t)$ evolves as
$$B_i^\pi(t+1) = B_i^\pi(t) + E_i^h(t) - I_{\{i \in \pi(t)\}} \cdot I_{\{B_i^\pi(t) \geq 1\}}, \quad \forall i, t. \qquad (1)$$
The number of data packets sent by all sensors to the FC within the first t TSs under a policy π is denoted by $V^\pi(t) = \sum_{i=1}^{M} V_i^\pi(t)$, where the number of packets sent by sensor i in TSs 1 through t under a policy π is $V_i^\pi(t) = \sum_{\tau=1}^{t} I_{\{i \in \pi(\tau)\}} \cdot I_{\{B_i^\pi(\tau) \geq 1\}}$.
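For concreteness, a minimal simulation sketch of this bookkeeping is given below; the function and argument names are hypothetical, and any scheduling rule can be plugged in as `schedule` to evaluate $V^\pi(T)$ under the battery recursion (1).

```python
from typing import Callable, Sequence

def simulate_throughput(T: int, M: int,
                        harvest: Callable[[int, int], float],
                        schedule: Callable[[int], Sequence[int]]) -> int:
    """Total throughput V^pi(T) under the battery recursion (1): no leakage,
    infinite capacity, one unit of energy per transmitted packet."""
    battery = [0.0] * M                     # B_i^pi(1) = 0 (arbitrary initial condition)
    packets = 0                             # running V^pi(t)
    for t in range(1, T + 1):
        active = set(schedule(t))           # activation set pi(t), |pi(t)| = K
        for i in range(M):
            sent = 1 if (i in active and battery[i] >= 1.0) else 0
            packets += sent
            battery[i] += harvest(i, t) - sent   # B_i(t+1) = B_i(t) + E_i^h(t) - sent
    return packets
```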
In [34,35], the objective is to find a policy that maximizes the total throughput over the time horizon under an expected discounted reward criterion, where the discount factor accounts for battery leakage. (Ref. [34] considers the problem under a discounted throughput (reward) criterion, since [34] assumes battery leakage with discount factor 0.9, i.e., the stored energy decreases to 90% within a time slot that is generally shorter than 1 ms. However, this is not realistic with recent battery technology.) In practice, battery leakage in typical batteries causes less than a 10% decrease in the stored energy in 24 h [23]. This implies that the leakage in a 1 ms-long time slot is below about $1.2 \times 10^{-9}$ of the stored energy, so the corresponding discount factor satisfies $0.9999999988 \leq \beta < 1$ (twenty-four hours equal 86,400,000 ms; if the length of a time slot is chosen as 1 ms, then $0.90 \leq \beta^{86400000} < 1$, which implies $0.9999999988 \leq \beta < 1$). Therefore, we neglect battery leakage in our problem formulation, which is practical from an engineering point of view. As the problem at hand assumes infinite data backlog and no battery leakage from [19,23], it is delay insensitive by nature. Hence, following [19,23,24], we formulate the scheduling problem as follows.
Problem 1.
Average throughput (reward) maximization over a time horizon, T
$$\max_{\{\pi(t)\}_{t=1}^{T}} \ \frac{1}{T} V^\pi(T), \quad \text{s.t.} \ (1).$$
The following notions are used in the rest of the paper.
Definition 1.
For a given sequence of energy harvests, an optimal policy, $\pi^*$, is a policy that maximizes the total throughput of all sensors (up to KT) over a time horizon T, i.e., $\pi^* \in \arg\max_{\pi \in G} \frac{V^\pi(T)}{T}$, where G is the set of feasible policies.
Definition 2.
A fully efficient policy, $\pi^{FE}$, is a policy under which the sensors use up all of their harvested energy by the end of the time horizon, which yields $V^{FE}(T) = \sum_{i=1}^{M} V_i^{FE}(T)$ with $V_i^{FE}(T) = E_i(T)$. (Although we use $V^\pi(t)$ to denote the total throughput achieved in the first t TSs under a policy π, the total throughput achieved under a policy $\pi^{FE}$ is denoted by $V^{FE}(T)$ instead of $V^{\pi^{FE}}(T)$ for simplicity. Moreover, $V^*(T)$ and $V^{RR}(T)$ are used instead of $V^{\pi^*}(T)$ and $V^{\pi^{RR}}(T)$, respectively. Similarly, $V_i^{FE}(T)$, $V_i^*(T)$ and $V_i^{RR}(T)$ are used instead of $V_i^{\pi^{FE}}(T)$, $V_i^{\pi^*}(T)$ and $V_i^{\pi^{RR}}(T)$, respectively.)
For certain EH processes, an optimal policy may not be a fully efficient policy, as it is explained in Remark 1.
Definition 3.
Efficiency of a policy π, denoted by $\eta(\pi)$, is defined as the ratio of the throughput of the policy π to the throughput of a fully efficient policy, $\pi^{FE}$, over the time horizon T. It can be expressed as
$$\eta(\pi) \triangleq \frac{V^\pi(T)}{V^{FE}(T)},$$
where $V^\pi(T)$ and $V^{FE}(T)$ are the numbers of collected data packets (throughput) over a time horizon T under the policy π and the fully efficient policy $\pi^{FE}$, respectively. (When K and T are in the order of tens and thousands, respectively, the throughput of an optimal policy is expected to be in the order of ten thousands. The efficiency notion lets us deal with small numbers less than or equal to 1 instead of large throughput values. The efficiency of a policy also gives the throughput of that policy relative to the throughput of a fully efficient policy, which is convenient in the numerical results.)
The efficiency of a policy can also be interpreted as the energy consumed by the system relative to the total energy harvested by the system.
The number of data packets which can be sent by all sensors from TS t + 1 to TS T is denoted by Y ( t ) , i.e.,
$$Y(t) \triangleq V^{FE}(T) - V^*(t).$$
The number of data packets which can be sent by sensor i from TS t + 1 to TS T is denoted by Y i ( t ) , i.e.,
$$Y_i(t) \triangleq V_i^{FE}(T) - \max_{\pi^* \in G^*} V_i^*(t),$$
where $G^*$ is the set of all throughput-optimal policies (under different throughput-optimal policies, the throughput of a sensor i in the first t TSs may differ, since sensor i may be scheduled by the FC a different number of times under different throughput-optimal policies).
Notice that $Y(0) = \sum_{i=1}^{M} E_i(T)$ and $Y_i(0) = E_i(T)$.
Definition 4.
Intensity of sensor i, $\rho_i$, is defined as the integer part of the total energy harvested by sensor i over the time horizon T, normalized by $\frac{KT}{M}$, i.e.,
$$\rho_i \triangleq \frac{M \lfloor E_i(T) \rfloor}{KT}.$$
Definition 5.
Intensity, ρ, is defined as the sum of the integer parts of the total energies harvested by all sensors over the time horizon T, normalized by KT, i.e.,
$$\rho \triangleq \frac{\sum_{i=1}^{M} \lfloor E_i(T) \rfloor}{KT} = \frac{\sum_{i=1}^{M} \rho_i}{M}.$$
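As a small worked example of Definitions 4 and 5 (using a sensor mix of the kind simulated in Section 4), suppose 95 of M = 100 sensors have $\rho_i = 0.3$ and the remaining 5 have $\rho_i = 3.0$; then
$$\rho = \frac{\sum_{i=1}^{M} \rho_i}{M} = \frac{95 \times 0.3 + 5 \times 3.0}{100} = \frac{28.5 + 15}{100} = 0.435.$$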
Remark 1.
If both of the following conditions, $Y_i(t) \leq (T - t)\ \forall i \in S, \forall t$ and $Y(t) \leq K(T - t)\ \forall t$, are satisfied, then an optimal policy becomes a fully efficient policy, i.e., $V^*(T) = V^{FE}(T)$. Otherwise, an optimal policy cannot achieve the throughput $V^{FE}(T) = \sum_{i=1}^{M} E_i(T)$, i.e., $V^*(T) < V^{FE}(T)$. In the cases violating at least one of these conditions, comparing a policy with a fully efficient policy is much simpler than comparing it with an optimal policy. Therefore, we also introduce the notion of a fully efficient policy.
For ease of reference, our commonly used notation is summarized in Table 1.

3. Efficiency of Myopic and Round Robin Policies

A problem similar to the problem at hand is studied in [34,35] for certain specific cases under a discounted reward criterion. An RR based MP is proposed in both papers, and its optimality is proven for certain specific cases. We apply this RR based myopic policy to the problem at hand. As the MP in [34,35] is an RR policy with quantum = 1 TS, we investigate only RR policies with quantum = 1 TS, denoted by $\pi^{RR}$, in this paper.
Definition 6.
For the network that consists of M sensors and an FC with K channels, a Round Robin (RR) policy with quantum = 1 TS is an RR policy under which the FC schedules the sensors by allocating one TS to each sensor for data transmission in a period of $\frac{M}{K}$ TSs. (Quantum is defined as the number of TSs allocated to each sensor in a period (round) by an RR policy. An RR policy with quantum = n TSs is an RR policy that allocates n TSs to each sensor in a period (round) of $\frac{Mn}{K}$ TSs, and so on. For applicability of RR policies with quantum = $n \geq 1$ TSs, $\frac{M}{K}$ must be an integer.)
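A minimal sketch of such a schedule is given below (the function name is hypothetical; sensors are indexed from 0 and taken in a fixed initial order):

```python
def rr_activation_set(t: int, M: int, K: int, t0: int = 1) -> list:
    """Activation set pi(t) of the RR policy with quantum = 1 TS that starts in TS t0.
    Requires M/K to be an integer; each sensor gets exactly one TS per round of M/K TSs."""
    assert M % K == 0, "M/K must be an integer for RR with quantum = 1 TS"
    offset = (t - t0) % (M // K)                 # position within the current round
    return [offset * K + j for j in range(K)]    # the K sensors (0-indexed) served in TS t

# Example: M = 6, K = 2 gives a round of 3 TSs: {0,1}, {2,3}, {4,5}, then the pattern repeats.
```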
In this section, we show that RR policies with quantum = 1 TS are generally suboptimal by Theorem 1. Next, we study their efficiencies more precisely and obtain an upper bound for their efficiencies by Theorem 2. Then, we show that an RR policy with quantum = 1 TS achieves almost the same efficiency as another RR policy with quantum = 1 TS by Theorem 3, which implies that the MP in [34,35] is generally suboptimal and the upper bound obtained for RR policies with quantum = 1 TS is also valid for the MP in [34,35].
Theorem 1.
If there exist some sensors $i \in S$ such that $Y_i(t) > \lceil \frac{K(T-t)}{M} \rceil$ for some $t < T$, all RR policies with quantum = 1 TS have efficiency lower than 100% even if a fully efficient policy exists for Problem 1 (i.e., they are suboptimal).
Proof. 
Under an RR policy with quantum = 1 TS, each sensor is visited by the FC either $\lfloor \frac{K(T-t)}{M} \rfloor$ or $\lceil \frac{K(T-t)}{M} \rceil$ times if $\frac{K(T-t)}{M}$ is not an integer. If $\frac{K(T-t)}{M}$ is an integer, then the FC allocates $\frac{K(T-t)}{M}$ TSs to each sensor. This means that the total number of transmissions from any sensor cannot exceed $\lceil \frac{K(T-t)}{M} \rceil$. $Y_i(t) > \lceil \frac{K(T-t)}{M} \rceil$ implies that sensor i needs to send more than $\lceil \frac{K(T-t)}{M} \rceil$ data packets in TSs t + 1 through T so as to send $V_i^{FE}(T) = E_i(T)$ packets, which must be sent by each sensor for full efficiency. Hence, even if a fully efficient policy exists, no RR policy with quantum = 1 TS is fully efficient for Problem 1 (they are suboptimal). ☐

3.1. Efficiency Bounds of RR Policies with Quantum = 1 TS

In this subsection, efficiency bounds of RR policies with quantum = 1 TS are studied precisely for general EH processes.
Lemma 1.
There exists a class of EH processes with intensity $\rho_i$ such that, for these EH processes, some sensor i transmits fewer than $\min\{Y_i(0), q_i\}$ data packets over a time horizon T under an RR policy with quantum = 1 TS, where $q_i$ is the number of TSs allocated to sensor i over the time horizon, i.e., $q_i \in \{\lfloor \frac{KT}{M} \rfloor, \lfloor \frac{KT}{M} \rfloor + 1\}$.
Proof. 
Please see Appendix A. ☐
Now, we introduce two sets, $H_1$ and $H_2$, which are used in Theorem 2 and Corollaries 1 and 2. $H_1$ denotes the index set of the sensors i to which $\lfloor \frac{KT}{M} \rfloor + 1$ TSs are allocated and for which $Y_i(0) > \lfloor \frac{KT}{M} \rfloor + 1$. Moreover, $H_2$ denotes the index set of the sensors i to which $\lfloor \frac{KT}{M} \rfloor$ TSs are allocated and for which $Y_i(0) > \lfloor \frac{KT}{M} \rfloor$. By definition, $\rho_i > 1$ for sensors $i \in H \triangleq H_1 \cup H_2$.
Theorem 2.
For general EH processes,
(i) 
If $\frac{KT}{M} \notin \mathbb{Z}$, the efficiency of an RR policy with quantum = 1 TS satisfies
$$\eta(\pi^{RR}) \leq 1 - \frac{\sum_{i \in H} (\rho_i - 1) - \frac{M}{KT} |H_1|}{M \rho}.$$
(ii) 
If $\frac{KT}{M} \in \mathbb{Z}$, the efficiency of an RR policy with quantum = 1 TS satisfies
$$\eta(\pi^{RR}) \leq 1 - \frac{\sum_{i \in H} (\rho_i - 1)}{M \rho}.$$
Proof. 
Please see Appendix B. ☐
From Theorem 2, we derive the following corollaries.
Corollary 1.
For sufficiently large T, we have $M \ll KT$. By definition, $|H_1| < \sum_{i \in H_1} \rho_i \leq M\rho$, so $\frac{|H_1|}{KT\rho} \ll 1$. Hence, the upper bound on the efficiency of an RR policy with quantum = 1 TS can approximately be written as $1 - \frac{\sum_{i \in H} (\rho_i - 1)}{M\rho}$.
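The approximate bound of Corollary 1 is easy to evaluate numerically; a minimal sketch (hypothetical function name) follows, and with the sensor mixes used in Section 4 it reproduces the values 0.770, 0.574, 0.487, 0.438 and 0.406 quoted there.

```python
def corollary1_bound(intensities: list) -> float:
    """Approximate upper bound 1 - sum_{i in H} (rho_i - 1) / (M * rho) from Corollary 1,
    where H is the set of sensors with intensity rho_i > 1."""
    M = len(intensities)
    rho = sum(intensities) / M
    excess = sum(r - 1.0 for r in intensities if r > 1.0)
    return min(1.0, 1.0 - excess / (M * rho))

# Example: 95 sensors with rho_i = 0.3 and 5 sensors with rho_i = 3.0 (so rho = 0.435).
print(corollary1_bound([0.3] * 95 + [3.0] * 5))   # ~0.770
```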
Corollary 2.
For sufficiently large T, $|H_1| < M\rho \ll KT\rho$. If the term $\frac{|H_1|}{KT\rho}$ in part (i) of Theorem 2 is neglected, then:
(i) 
If $H = \emptyset$, then $\eta(\pi^{RR}) \leq 1$.
(ii) 
If $\rho_i = c,\ \forall i \in H$ and $\rho_i = 0,\ \forall i \in S \setminus H$, then $\eta(\pi^{RR}) \leq \min\{1, \frac{1}{c}\}$, where $c \in \mathbb{R}^+$.

3.2. Throughput Difference of RR Policies with Quantum = 1 TS

We will prove that the throughput difference between any two RR policies with quantum = 1 TS cannot be greater than $M - K$ over a time horizon. Recall that, for all RR policies with quantum = 1 TS, the scheduling is periodic with a period of $\frac{M}{K}$ TSs. The only difference between any two RR policies with quantum = 1 TS is their initial scheduling time, $t_0$; therefore, an RR policy with quantum = 1 TS that sends its first packet in TS $t_0$ can be labeled as $RR_{t_0}$, where $1 \leq t_0 \leq \frac{M}{K}$.
Lemma 2.
The number of transmissions of sensor i can vary by at most one under two different RR policies with quantum = 1 TS over the time horizon T, i.e.,
$$\max_{1 \leq t_0 \leq \frac{M}{K}} V_i^{RR_{t_0}}(T) - \min_{1 \leq t_0 \leq \frac{M}{K}} V_i^{RR_{t_0}}(T) \leq 1.$$
Proof. 
Please see Appendix C. ☐
The following example is given to illustrate Lemma 2.
Example 1.
Assume that T = 14 , M = 3 and K = 1 . There are three RR policies with quantum=1 TS which schedule the same sensor node i in different TSs as follows.
$RR_1$: Sensor i is scheduled in TSs 1, 4, 7, 10, 13;
$RR_2$: Sensor i is scheduled in TSs 2, 5, 8, 11, 14;
$RR_3$: Sensor i is scheduled in TSs 3, 6, 9, 12.
Notice that sensor i has the chance to transmit five times under $RR_1$ and $RR_2$, while it has the chance to transmit only four times under $RR_3$. Denote the transmission times of sensor i under the different RR policies by $t_j^n$, where $n = 1, 2, 3$ are the indices (initial scheduling times) of the RR policies and $j = 1, \ldots, 4$ or 5 indexes the transmission opportunities of sensor i under the applied RR policy in this example. We have $E_i(t_j^2) \geq E_i(t_j^1)$ for $j = 1, \ldots, 5$, $E_i(t_{j+1}^2) \geq E_i(t_j^3)$ for $j = 1, \ldots, 4$ and $E_i(t_{j+1}^1) \geq E_i(t_j^3)$ for $j = 1, \ldots, 4$; therefore, $V_i^{RR_2}(14) \geq V_i^{RR_1}(14) = V_i^{RR_1}(13) \geq V_i^{RR_3}(14) = V_i^{RR_3}(12)$. On the other hand, $E_i(t_j^2) \leq E_i(t_j^3)$ for $j = 1, \ldots, 4$; therefore, $V_i^{RR_2}(11) \leq V_i^{RR_3}(12)$. The difference between the most and the least efficient RR policies with quantum = 1 TS is
$$V_i^{RR_2}(14) - V_i^{RR_3}(14) = V_i^{RR_2}(14) - V_i^{RR_3}(12) \leq V_i^{RR_2}(14) - V_i^{RR_2}(11) \leq 1.$$
Hence, it is observed that the throughput of a sensor i can vary by at most 1 over all RR policies with quantum = 1 TS.
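The scheduling slots in this example can also be enumerated directly; a minimal sketch (hypothetical helper name) follows, which also evaluates the most favorable start offset used later in the proof of Theorem 3.

```python
def rr_slots(t0: int, T: int, period: int) -> list:
    """TSs in which a given sensor is scheduled when its RR start offset is t0 (period = M/K)."""
    return list(range(t0, T + 1, period))

T, period = 14, 3                        # T = 14, M = 3, K = 1, as in Example 1
for t0 in (1, 2, 3):
    print(t0, rr_slots(t0, T, period))   # [1,4,7,10,13], [2,5,8,11,14], [3,6,9,12]

best_t0 = (T % period) or period         # most favorable offset (see the proof of Theorem 3)
print(best_t0)                           # 2, i.e., RR_2
```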
The following theorem is based on the extension of Lemma 2 for the whole network.
Theorem 3.
An RR policy with quantum = 1 TS achieves at most $M - K$ more throughput than another RR policy with quantum = 1 TS over the time horizon T, i.e.,
$$\max_{\pi^{RR} \in G^{RR}} V^{RR}(T) - \min_{\pi^{RR} \in G^{RR}} V^{RR}(T) \leq M - K,$$
where $G^{RR}$ is the set of all RR policies with quantum = 1 TS.
Proof. 
In this proof, we first consider the transmissions of the $\frac{M}{K}$ sensors sharing a single channel under an RR policy with quantum = 1 TS. Then, we extend the result of Lemma 2 to the case of multiple (K) channels. Notice that
$$t_0 = \left(T \bmod \tfrac{M}{K}\right) I_{\{T \bmod \frac{M}{K} \neq 0\}} + \tfrac{M}{K}\, I_{\{T \bmod \frac{M}{K} = 0\}}$$
is the best choice of initial scheduling time regardless of the sensor i, and this most efficient start offset must be assigned to one of the $\frac{M}{K}$ sensors on a single channel. When the K channels of the FC are considered, this most efficient start offset must be assigned to K of the M sensors. By combining this fact with Lemma 2, an RR policy with quantum = 1 TS transmits at most $M - K$ more data packets than another RR policy with quantum = 1 TS. ☐
Remark 2.
From Theorem 3 and Definition 2,
$$\max_{\pi^{RR} \in G^{RR}} \eta(\pi^{RR}) - \min_{\pi^{RR} \in G^{RR}} \eta(\pi^{RR}) \leq \frac{M - K}{V^{FE}(T)} = \frac{M - K}{\rho K T},$$
where $G^{RR}$ is the set of all RR policies with quantum = 1 TS.
As the MP in [34,35] is also an RR policy with quantum = 1 TS, it follows from Remark 2 that it has almost the same efficiency as any other RR policy with quantum = 1 TS for a sufficiently large time horizon T, i.e., $\eta(\pi^{MP}) \approx \eta(\pi^{RR})$ if $K, M \ll T$. For sufficiently large time horizons, these results can be extended to RR policies with quantum = n > 1 TSs.

4. Numerical Results

In this section, the efficiency of the myopic policy (MP) is evaluated via simulations for the cases of an infinite-capacity battery and a finite-capacity battery with B = 50 (B = 50 implies that the battery of a sensor can store enough energy to send 50 data packets, since we assume that each data packet transmission requires one unit of energy), at time horizons varying from 0 to 2000 TSs. (As the efficiency of a policy is defined only for the time horizon T, we plot efficiency versus time horizon in this section. Notice that the efficiency of the MP at T = 0 is taken as 0 in these simulations.)
For each node i, the Markovian EH process is modelled by the state space $\{0, 1, 2\}$ and the $3 \times 3$ transition matrix
$$P = \begin{bmatrix} 0.90 & 0.05 & 0.05 \\ 0.05 & 0.90 & 0.05 \\ 0.05 & 0.05 & 0.90 \end{bmatrix}.$$
The harvested energy for sensor i in TS t is $E_i^h(t) = \frac{K}{M} \times \rho_i \times M_i(t)$, where the EH state of sensor i in TS t is denoted by $M_i(t) \in \{0, 1, 2\}$.
Notice that the efficiency of the MP in [34,35] is almost the same as that of an RR policy with quantum = 1 TS, since $\eta(\pi^{RR}) \approx \eta(\pi^{MP})$ for sufficiently large T from Theorem 3 and Remark 2. We observe the efficiency of the MP under nonuniform EH processes. (It is obvious that the MP achieves efficiencies close to that of an optimal policy for uniform EH processes. As this case is trivial, we do not show its results.)
The simulations are run with M = 100 and K = 10 under Markovian and i.i.d. EH processes with various intensities, which are adjusted by choosing the intensities of some sensors as 3.0 and those of the others as 0.3, as explained in Table 2. (In WSNs, it is quite possible that some EH sensors harvest energy much more efficiently than others due to their energy harvesting resource (solar, piezoelectric, RF, wind, etc.) and environmental conditions. For example, solar energy harvesting is generally more efficient than the others. Therefore, we choose the intensities of some sensors much larger than those of the others (3.0 for some sensors and 0.3 for the remaining ones) in order to represent the difference between the amounts of energy harvested by the sensors.)
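A minimal sketch of how such an EH trace can be generated (hypothetical function names; the seed and initial state are arbitrary choices, not part of the reported setup):

```python
import random

P = [[0.90, 0.05, 0.05],
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]                 # transition matrix of the EH state M_i(t) in {0,1,2}

def markov_harvest_trace(T: int, rho_i: float, M: int = 100, K: int = 10,
                         seed: int = 0) -> list:
    """Per-slot harvested energy E_i^h(t) = (K/M) * rho_i * M_i(t) for one sensor."""
    rng = random.Random(seed)
    state = rng.choice([0, 1, 2])                        # arbitrary initial EH state
    trace = []
    for _ in range(T):
        trace.append((K / M) * rho_i * state)
        state = rng.choices([0, 1, 2], weights=P[state])[0]
    return trace

# Example: one sensor with rho_i = 3.0 over T = 2000 TSs; the per-slot mean is about K/M * rho_i.
print(sum(markov_harvest_trace(2000, 3.0)) / 2000)
```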

4.1. Infinite Capacity Battery

In Figure 2, for i.i.d. EH processes with ρ ≤ 1, the efficiencies of the MP at T = 2000 are 0.758, 0.564 and 0.469 for ρ = 0.435, ρ = 0.705 and ρ = 0.975, respectively. In Figure 3, for Markov EH processes with ρ ≤ 1, at T = 2000, the MP achieves efficiencies of 0.758, 0.552 and 0.467 for ρ = 0.435, ρ = 0.705 and ρ = 0.975, respectively. The marked difference between the efficiencies of the MP at these three intensities is expected, since Theorem 2 and Corollary 1 state that, as the number of nodes with intensity $\rho_i > 1$ increases, the efficiency of RR policies with quantum = 1 TS decreases. Notice that, from Theorem 2 and Corollary 1, the efficiencies of an RR policy with quantum = 1 TS at T = 2000 are expected to satisfy $\eta(\pi^{RR}) \leq 0.770$, $\eta(\pi^{RR}) \leq 0.574$ and $\eta(\pi^{RR}) \leq 0.487$ for ρ = 0.435, ρ = 0.705 and ρ = 0.975, respectively. When the EH processes have memory (Markov processes), we observe results similar to those for memoryless (i.i.d.) EH processes with the same intensity.
In Figure 2, for i.i.d. EH processes with ρ > 1, the efficiencies of the MP at T = 2000 are 0.417 and 0.380 for ρ = 1.245 and ρ = 1.515, respectively. In Figure 3, for Markov EH processes with ρ > 1, at T = 2000, the MP achieves efficiencies of 0.415 and 0.381 for ρ = 1.245 and ρ = 1.515, respectively. From Corollary 1, the efficiencies of an RR policy with quantum = 1 TS at T = 2000 are expected to satisfy $\eta(\pi^{RR}) \leq 0.438$ and $\eta(\pi^{RR}) \leq 0.406$ for ρ = 1.245 and ρ = 1.515, respectively. For EH processes with ρ > 1, from Definitions 1–3, the efficiency of an optimal policy is $\eta(\pi^*) = \frac{V^*(T)}{V^{FE}(T)} \leq \frac{1}{\rho}$. Hence, at T = 2000, $\eta(\pi^*) \leq 0.803$ and $\eta(\pi^*) \leq 0.660$ for ρ = 1.245 and ρ = 1.515, respectively. Markov EH processes yield results similar to those for i.i.d. EH processes with the same intensity.

4.2. Finite Capacity Battery

In this subsection, the simulations are run with a finite battery capacity of B = 50 under Markovian and i.i.d. EH processes with the intensities in Table 2.
In Figure 4 (i.i.d. EH process), the efficiencies of the MP at T = 2000 are 0.757, 0.555, 0.464, 0.415 and 0.380 for ρ = 0.435, ρ = 0.705, ρ = 0.975, ρ = 1.245 and ρ = 1.515, respectively. In Figure 5 (Markov EH process), at T = 2000, the MP achieves efficiencies of 0.756, 0.564, 0.462, 0.414 and 0.380 for ρ = 0.435, ρ = 0.705, ρ = 0.975, ρ = 1.245 and ρ = 1.515, respectively. From Corollary 1, the efficiencies of an RR policy with quantum = 1 TS at T = 2000 are expected to be 0.758, 0.564, 0.469, 0.438 and 0.406 for ρ = 0.435, ρ = 0.705, ρ = 0.975, ρ = 1.245 and ρ = 1.515, respectively.
Markov EH processes have similar results to the results for i.i.d. EH processes with the same intensity.

4.3. Discussion

In this subsection, the efficiencies of the myopic policy in the infinite and finite capacity battery cases are compared with each other based on the numerical results in Table 3. Besides this, these numerical results are compared with the expected upper bounds on the efficiency of the myopic policy. Finally, the complexity of the Round-Robin based myopic policy is discussed.
From Table 3, it can be observed that the maximum efficiency difference (0.009) occurs between B = ∞ (0.564) and B = 50 (0.555) for i.i.d. EH processes with intensity ρ = 0.705. For this intensity, the efficiency in the finite battery case is 1.596% less than the efficiency in the infinite battery case. Besides this, the minimum difference (0.000) occurs between B = ∞ (0.380) and B = 50 (0.380) for i.i.d. EH processes with ρ = 1.515. Therefore, the efficiency of the MP for B = 50 is at most 1.596% less than that for B = ∞. Hence, we can conclude that the MP achieves almost the same throughput performance with a reasonable finite-capacity (B = 50) battery as it does with an infinite-capacity battery.
Moreover, from Table 3, it can be observed that the maximum deviation (difference) occurs between the upper bound (0.406) and the efficiency of the MP for i.i.d. EH processes (0.380) with intensity ρ = 1.515. For this intensity, the efficiency is 6.40% less than the upper bound. Besides this, the minimum deviation occurs between the upper bound (0.770) and the efficiency of the MP for i.i.d. EH processes (0.758) with ρ = 0.435. For this intensity, the efficiency is only 1.56% less than the upper bound. Based on these results, we can say that the upper bounds for the efficiency of the MP are generally tight.
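These percentages follow directly from the entries of Table 3; for instance,
$$\frac{0.564 - 0.555}{0.564} \approx 1.596\%, \qquad \frac{0.770 - 0.758}{0.770} \approx 1.56\%, \qquad \frac{0.406 - 0.380}{0.406} \approx 6.40\%.$$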
In addition, the Round Robin based myopic policy is a simple policy for this problem. There is an initial ordering, and this order is kept during the time interval in which Round Robin based scheduling is performed. The sorting required for the initial ordering has a worst-case complexity of $O(M^2)$, and the Round Robin algorithm has a complexity of $O(1)$ per scheduling decision. Therefore, the myopic policy is a low-complexity solution for the problem at hand.

5. Conclusions

This paper investigates a scheduling problem arising in a single-hop WSN where an FC schedules a set of EH sensors and collects data from them. The FC knows neither the instantaneous battery states nor the statistics of the EH processes at the sensors; the sensors are data backlogged and the communication is error-free. There is no leakage from the infinite-capacity batteries. The problem at hand is set up as an average throughput (reward) maximization problem. The myopic policy in [34,35], which has an RR structure, is applied to this problem as a solution. It is shown that RR policies with quantum = 1 TS are suboptimal for a broad class of EH processes. Next, an upper bound is obtained for the efficiencies of RR policies with quantum = 1 TS. Then, it is shown that the myopic policy has almost the same efficiency as any other RR policy with quantum = 1 TS. Furthermore, numerical results show that the myopic policy is suboptimal for a broad class of EH processes, although it achieves optimality for certain specific cases.
As future work, we will search for a simple, optimal solution to this problem for quite general EH processes. We also plan to extend the single-hop problem to the multi-hop case. Moreover, we plan to investigate the same problem under the finite-capacity battery case and to analyze the throughput performance of the myopic policy in that case. In addition, we will extend the problem so that the network lifetime can be considered as a performance metric. We believe that the approaches and concepts in this paper will give insight to researchers who study similar scheduling problems.

Author Contributions

Omer Melih Gul conceived and designed the system model; Omer Melih Gul and Mubeccel Demirekler analyzed the throughput performance of the Round Robin based myopic policy proposed to the problem; Omer Melih Gul performed the simulations of the model; Omer Melih Gul and Mubeccel Demirekler wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IoT      Internet of Things
WSN      Wireless sensor network
EH       Energy harvesting
FC       Fusion center
TS       Time slot
SNR      Signal-to-Noise Ratio
RF       Radio frequency
POMDP    Partially Observable Markov Decision Process
DP       Dynamic Programming
RMAB     Restless Multi-Armed Bandit
PSPACE   Polynomial Space
MP       Myopic policy
RR       Round-Robin
IID      Independent and identically distributed

Appendix A. Proof of Lemma 1

Define a new function, ρ i ( t ) , for a sensor i as
$$\rho_i(t) \triangleq \frac{V_i^{FE}(T) - \max_{\pi^{RR} \in G^{RR}} V_i^{RR}(t)}{\frac{K(T-t)}{M}}, \qquad (A1)$$
where $G^{RR}$ is the set of all RR policies with quantum = 1 TS.
Assume $\rho_i(t) > 1 > \rho_i$ for some sensor i and some t. From (A1),
$$\frac{V_i^{FE}(T) - \max_{\pi^{RR} \in G^{RR}} V_i^{RR}(t)}{\frac{K(T-t)}{M}} > 1, \quad \tau \leq t < T, \qquad \text{i.e.,} \qquad V_i^{FE}(T) - \max_{\pi^{RR} \in G^{RR}} V_i^{RR}(t) > \frac{K(T-t)}{M}, \quad \tau \leq t < T, \qquad (A2)$$
where $\tau = \min\{t : \rho_i(t) > 1\}$.
In the interval $\tau \leq t < T$, sensor i can send at most $\frac{K(T-t)}{M}$ data packets under an RR policy with quantum = 1 TS. Therefore, (A2) implies that $\min\{Y_i(0), q_i\} = Y_i(0)$ packets cannot be sent by sensor i over the time horizon T, where $q_i \in \{\lfloor \frac{KT}{M} \rfloor, \lfloor \frac{KT}{M} \rfloor + 1\}$. If the EH process at sensor i were a constant EH process with intensity $\rho_i < 1$ (for such EH processes, $\rho_i(t) = \rho_i$ for all $1 \leq t < T$), then sensor i would send $\min\{Y_i(0), q_i\} = Y_i(0)$ packets over the time horizon T, by Definition 5 and Remark 1. Hence, the efficiencies of RR policies with quantum = 1 TS under an EH process with $\rho_i(t) > 1 > \rho_i$ for some sensor i and some t are lower than they are under a constant EH process with intensity $\rho_i$. This implies that there exists a class of EH processes for which some sensor i cannot achieve the throughput of $\min\{Y_i(0), q_i\}$ packets over the time horizon T.

Appendix B. Proof of Theorem 2

Define
$$q \triangleq \frac{KT - m}{M}, \qquad (A3)$$
where q and m are integers and $0 \leq m \leq M - K$. Recall that $\frac{M}{K} \in \mathbb{Z}$ for applicability of RR policies with quantum = 1 TS.
Case i: Assume $\frac{KT}{M} \notin \mathbb{Z}$ and, without loss of generality, that the RR policy with quantum = 1 TS starts scheduling with sensor 1. Notice that $K \leq m \leq M - K$ if $\frac{KT}{M} \notin \mathbb{Z}$. If the order $1, 2, \ldots, M$ is followed by the RR policy with quantum = 1 TS for the scheduling, sensors $1, \ldots, m$ are scheduled $q + 1$ times and sensors $m + 1, \ldots, M$ are scheduled q times over the time horizon T. From Lemma 1, for some EH processes,
$$V^{RR}(T) < \sum_{i=1}^{m} \min\{Y_i(0), q + 1\} + \sum_{i=m+1}^{M} \min\{Y_i(0), q\}; \qquad (A4)$$
for the other EH processes, (A4) holds with equality. Hence,
$$V^{RR}(T) \leq \sum_{i=1}^{m} \min\{Y_i(0), q + 1\} + \sum_{i=m+1}^{M} \min\{Y_i(0), q\}. \qquad (A5)$$
Let $S = H_1 \cup H_2 \cup \big(S \setminus (H_1 \cup H_2)\big)$, where $H_1 \subseteq \{1, \ldots, m\}$ and $H_2 \subseteq \{m+1, \ldots, M\}$ are the index sets of sensors that have enough energy to send more than $q + 1$ and q data packets, respectively. With this specification, the total throughput satisfies
$$V^{RR}(T) \leq \sum_{i \in H_1} (q + 1) + \sum_{i \in H_2} q + \sum_{i \in S \setminus (H_1 \cup H_2)} Y_i(0). \qquad (A6)$$
From (A6) and Definition 3,
$$\eta(\pi^{RR}) = \frac{V^{RR}(T)}{Y(0)} \leq \frac{\sum_{i \in H_1} (q+1) + \sum_{i \in H_2} q + \sum_{i \in S} Y_i(0) - \sum_{i \in H_1 \cup H_2} Y_i(0)}{\sum_{i \in S} Y_i(0)} = 1 - \frac{\sum_{i \in H_1 \cup H_2} Y_i(0) - \sum_{i \in H_1} (q+1) - \sum_{i \in H_2} q}{\sum_{i \in S} Y_i(0)}. \qquad (A7)$$
If the numerator and denominator of the second term on the right-hand side of (A7) are normalized by $\frac{KT}{M}$,
$$\eta(\pi^{RR}) \leq 1 - \frac{\sum_{i \in H_1 \cup H_2} \frac{M Y_i(0)}{KT} - \sum_{i \in H_1} \frac{M(q+1)}{KT} - \sum_{i \in H_2} \frac{M q}{KT}}{\sum_{i \in S} \frac{M Y_i(0)}{KT}}. \qquad (A8)$$
From Definition 4, $\rho_i = \frac{M Y_i(0)}{KT}$. Then, (A3) and (A8) yield
$$\eta(\pi^{RR}) \leq 1 - \frac{A}{\sum_{i \in S} \rho_i}, \qquad (A9)$$
where
$$A = \sum_{i \in H_1 \cup H_2} \rho_i - \sum_{i \in H_1} \Big(1 + \frac{M - m}{KT}\Big) - \sum_{i \in H_2} \Big(1 - \frac{m}{KT}\Big) = \sum_{i \in H_1 \cup H_2} (\rho_i - 1) - \sum_{i \in H_1} \frac{M}{KT} + \sum_{i \in H_1 \cup H_2} \frac{m}{KT} \geq \sum_{i \in H_1 \cup H_2} (\rho_i - 1) - \frac{M}{KT} |H_1|.$$
Hence, the efficiencies of RR policies with quantum = 1 TS satisfy
$$\eta(\pi^{RR}) \leq 1 - \frac{\sum_{i \in H} (\rho_i - 1) - \frac{M}{KT} |H_1|}{M \rho}.$$
Case ii: If $\frac{KT}{M} \in \mathbb{Z}$, then $q = \frac{KT}{M}$, since $m = 0$ in (A3) in this case. Let $S = H_2 \cup (S \setminus H_2)$, where $H_2$ is the index set of sensors that have enough energy to send more than q data packets. (Notice that $H_1 = \emptyset$ and $H = H_1 \cup H_2 = H_2$ if $\frac{KT}{M} \in \mathbb{Z}$.) By following steps similar to those in part (i), we obtain
$$V^{RR}(T) \leq \sum_{i=1}^{M} \min\{Y_i(0), q\} \leq \sum_{i \in H_2} q + \sum_{i \in S \setminus H_2} Y_i(0). \qquad (A10)$$
From (A10) and Definition 3, the efficiency of RR policies with quantum = 1 TS can be bounded as
$$\eta(\pi^{RR}) = \frac{V^{RR}(T)}{Y(0)} \leq \frac{\sum_{i \in H_2} q + \sum_{i \in S} Y_i(0) - \sum_{i \in H_2} Y_i(0)}{\sum_{i \in S} Y_i(0)} = 1 - \frac{\sum_{i \in H_2} \frac{M Y_i(0)}{KT} - \sum_{i \in H_2} \frac{M q}{KT}}{\sum_{i \in S} \frac{M Y_i(0)}{KT}}. \qquad (A11)$$
From (A3) and Definitions 4 and 5, we obtain
$$\eta(\pi^{RR}) \leq 1 - \frac{\sum_{i \in H} (\rho_i - 1)}{M \rho}.$$

Appendix C. Proof of Lemma 2

Recall that $\frac{M}{K} \in \mathbb{Z}$ for applicability of RR policies with quantum = 1 TS. There are $\frac{M}{K}$ RR policies with quantum = 1 TS, which start to schedule a sensor i in different TSs. Recall that $1 \leq t_0 \leq \frac{M}{K}$, where $t_0$ is the initial TS in which the $RR_{t_0}$ policy schedules sensor i. The proof is divided into two cases.
Case i: Assume that $\frac{KT}{M} \notin \mathbb{Z}$. Then, $\frac{KT}{M} = q + \frac{Kr}{M}$, where q and r are integers such that $1 \leq r \leq \frac{M}{K} - 1$. Some of the RR policies with quantum = 1 TS schedule sensor i $q + 1$ times, whereas the other RR policies schedule sensor i q times. The set of initial times of the RR policies that schedule sensor i $q + 1$ times is denoted by $P_{q+1}^{RR} = \{1, \ldots, r\}$, whereas the set of the others is denoted by $P_q^{RR} = \{r + 1, \ldots, \frac{M}{K}\}$. $t_j^n$ denotes the j-th transmission time of sensor i under an RR policy $RR_n$, which starts to schedule sensor i in TS n, where $j = 1, \ldots, q$ or $q + 1$ indexes the transmission opportunities of sensor i under the applied RR policy $RR_n$. By definition,
$$E_i(t_{j+1}^a) \geq E_i(t_j^b), \quad \forall a \in P_{q+1}^{RR},\ \forall b \in P_q^{RR},\ j \in \{1, \ldots, q\},$$
which yields
$$V_i^{RR_a}(T) \geq V_i^{RR_b}(T), \quad \forall a \in P_{q+1}^{RR},\ \forall b \in P_q^{RR}. \qquad (A12)$$
Notice that $r = T \bmod \frac{M}{K} \in P_{q+1}^{RR}$ (in fact, $r = \max_{n \in P_{q+1}^{RR}} n$). Therefore,
$$E_i(t_j^r) \geq E_i(t_j^a), \quad \forall a \in P_{q+1}^{RR},\ j \in \{1, \ldots, q+1\},$$
which yields
$$V_i^{RR_r}(T) \geq V_i^{RR_a}(T), \quad \forall a \in P_{q+1}^{RR}. \qquad (A13)$$
Similarly, $r + 1 = \min_{n \in P_q^{RR}} n$. Therefore,
$$E_i(t_j^{r+1}) \leq E_i(t_j^b), \quad \forall b \in P_q^{RR},\ j \in \{1, \ldots, q\},$$
which yields
$$V_i^{RR_{r+1}}(T) \leq V_i^{RR_b}(T), \quad \forall b \in P_q^{RR}. \qquad (A14)$$
From (A12)–(A14),
$$V_i^{RR_r}(T) \geq V_i^{RR_a}(T) \geq V_i^{RR_b}(T) \geq V_i^{RR_{r+1}}(T), \quad \forall a \in P_{q+1}^{RR},\ \forall b \in P_q^{RR}. \qquad (A15)$$
Furthermore,
$$E_i(t_j^{r+1}) \geq E_i(t_j^r), \quad j \in \{1, \ldots, q\},$$
which yields
$$V_i^{RR_{r+1}}(t_j^{r+1}) \geq V_i^{RR_r}(t_j^r), \quad j \in \{1, \ldots, q\}. \qquad (A16)$$
From (A16),
$$V_i^{RR_{r+1}}(T) = V_i^{RR_{r+1}}\!\left(T - \tfrac{M}{K} + 1\right) \geq V_i^{RR_r}\!\left(T - \tfrac{M}{K}\right). \qquad (A17)$$
From (A15) and (A17),
$$\max_{n} V_i^{RR_n}(T) - \min_{n} V_i^{RR_n}(T) = V_i^{RR_r}(T) - V_i^{RR_{r+1}}(T) \leq V_i^{RR_r}(T) - V_i^{RR_r}\!\left(T - \tfrac{M}{K}\right) \leq 1. \qquad (A18)$$
Hence, the lemma is proved for the first case, $\frac{KT}{M} \notin \mathbb{Z}$.
Case ii: If $\frac{KT}{M} \in \mathbb{Z}$, each of the $\frac{M}{K}$ RR policies with quantum = 1 TS schedules sensor i $q = \frac{KT}{M}$ times. The set of initial times of RR policies with quantum = 1 TS is denoted by $P_q^{RR} = \{1, 2, \ldots, \frac{M}{K}\}$. As before, $t_j^n$ denotes the j-th transmission time of sensor i under an RR policy $RR_n$, where $j = 1, \ldots, q$ indexes the transmission opportunities of sensor i under the applied policy $RR_n$. In this case,
$$E_i(t_j^{M/K}) \geq E_i(t_j^n) \geq E_i(t_j^1), \quad \forall n \in P_q^{RR},\ j \in \{1, \ldots, q\},$$
which yields
$$V_i^{RR_{M/K}}(T) \geq V_i^{RR_n}(T) \geq V_i^{RR_1}(T), \quad \forall n \in P_q^{RR}. \qquad (A19)$$
Moreover,
$$E_i(t_{j-1}^{M/K}) \leq E_i(t_j^n), \quad \forall n \in P_q^{RR},\ j \in \{2, \ldots, q\},$$
which yields
$$V_i^{RR_{M/K}}\!\left(T - \tfrac{M}{K}\right) \leq V_i^{RR_n}(T), \quad \forall n \in P_q^{RR}. \qquad (A20)$$
From (A19) and (A20),
$$\max_{n} V_i^{RR_n}(T) - \min_{n} V_i^{RR_n}(T) = V_i^{RR_{M/K}}(T) - V_i^{RR_1}(T) \leq V_i^{RR_{M/K}}(T) - V_i^{RR_{M/K}}\!\left(T - \tfrac{M}{K}\right) \leq 1. \qquad (A21)$$
Hence, the lemma is proved for the second case, $\frac{KT}{M} \in \mathbb{Z}$.

References

  1. Sheng, Z.; Yang, S.; Yu, Y.; Vasilakos, A.; Mccann, J.; Leung, K. Survey on the IETF Protocol Suite for the Internet of Things: Standards, Challenges, and Opportunities. IEEE Wirel. Commun. 2013, 20, 91–98. [Google Scholar] [CrossRef]
  2. Kamalinejad, P.; Mahapatra, C.; Sheng, Z.; Mirabbasi, S.; Leung, V.C.M.; Guan, Y.L. Wireless Energy Harvesting for the Internet of Things. IEEE Commun. Mag. 2015, 53, 102–108. [Google Scholar] [CrossRef]
  3. Tsai, C.W.; Hong, T.P.; Shiu, G.N. Metaheuristics for the lifetime of WSN: A review. IEEE Sens. J. 2016, 16, 2812–2831. [Google Scholar] [CrossRef]
  4. Wark, T.; Corke, P.; Sikka, P.; Klingbeil, L.; Guo, Y.; Crossman, C.; Valencia, P.; Swain, D.; Bishop-Hurley, G. Transforming Agriculture through Pervasive Wireless Sensor Networks. IEEE Pervasive Comput. 2007, 6. [Google Scholar] [CrossRef] [Green Version]
  5. Chaiwatpongsakorn, C.; Lu, M.; Keener, T.C.; Khang, S.-J. The deployment of carbon monoxide wireless sensor network (CO-WSN) for ambient air monitoring. Int. J. Environ. Res. Public Health 2014, 11, 6246–6264. [Google Scholar] [CrossRef] [PubMed]
  6. Liu, X.-Y.; Zhu, Y.; Kong, L.; Liu, C.; Gu, Y.; Vasilakos, A.V.; Wu, M.-Y. CDC: Compressive data collection for wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. 2015, 26, 2188–2197. [Google Scholar] [CrossRef]
  7. Valente, J.; Sanz, D.; Barrientos, A.; del Cerro, J.; Ribeiro, A.; Rossi, C. An Air-Ground Wireless Sensor Network for Crop Monitoring. Sensors 2011, 11, 6088–6108. [Google Scholar] [CrossRef]
  8. Fu, T.; Ghosh, A.; Johnson, E.A.; Krishnamachari, B. Energy-Efficient Deployment Strategies in Structural Health Monitoring using Wireless Sensor Networks. J. Struct. Control Health Monit. 2012, 20, 971–986. [Google Scholar] [CrossRef]
  9. Aderohunmu, F.; Balsamo, D.; Paci, G.; Brunelli, D. Long Term WSN Monitoring for Energy Efficiency in EU Cultural Heritage Buildings. Lect. Notes Electr. Eng. 2013, 281, 253–261. [Google Scholar]
  10. Balsamo, D.; Paci, G.; Benini, L.; Davide, B. Long Term, Low Cost, Passive Environmental Monitoring of Heritage Buildings for Energy Efficiency Retrofitting. In Proceedings of the 2013 IEEE Workshop on Environmental Energy and Structural Monitoring Systems (EESMS), Trento, Italy, 11–12 September 2013; pp. 1–6. [Google Scholar]
  11. Suryadevara, N.K.; Mukhopadhyay, S.C. Wireless sensor network based home monitoring system for wellness determination of elderly. IEEE Sens. J. 2012, 12, 1965–1972. [Google Scholar] [CrossRef]
  12. Yetgin, H.; Cheung, K.T.K.; El-Hajjar, M.; Hanzo, L. Network lifetime maximization of wireless sensor networks. IEEE Access 2015, 3, 2191–2226. [Google Scholar] [CrossRef]
  13. Zhou, F.; Chen, Z.; Guo, S.; Li, J. Maximizing Lifetime of Data-Gathering Trees With Different Aggregation Modes in WSNs. IEEE Sens. J. 2016, 16, 8167–8177. [Google Scholar] [CrossRef]
  14. Hancke, G.P.; de Carvalho e Silva, B.; Hancke, G.P., Jr. The role of advanced sensing in smart cities. Sensors 2013, 13, 393–425. [Google Scholar] [CrossRef] [PubMed]
  15. Gomez, C.; Paradells, J. Urban Automation Networks: Current and Emerging Solutions for Sensed Data Collection and Actuation in Smart Cities. Sensors 2015, 15, 22874–22898. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  16. Paradiso, J.A.; Starner, T. Energy scavenging for mobile and wireless electronics. IEEE Pervasive Comput. 2005, 4, 18–27. [Google Scholar] [CrossRef]
  17. Sudevalayam, S.; Kulkarni, P. Energy Harvesting Sensor Nodes: Survey and Implications. IEEE Commun. Surv. Tutor. 2011, 13, 443–461. [Google Scholar] [CrossRef]
  18. Akyildiz, I.F.; Su, W.; Sankarasubramaniam, Y.; Cayirci, E. A survey on sensor networks. IEEE Commun. Mag. 2002, 40, 102–114. [Google Scholar] [CrossRef]
  19. Kansal, A.; Hsu, J.; Zahedi, S.; Srivastava, M.B. Power management in energy harvesting sensor networks. ACM Trans. Embed. Comput. Syst. 2007, 6, 32. [Google Scholar] [CrossRef]
  20. Garcia-Hernandez, C.F.; Ibargengoytia-Gonzalez, P.H.; Garcia-Hernandez, J.; Perez-Diaz, J.A. Wireless Sensor Networks and Applications: A Survey. IJCSNS Int. J. Comput. Sci. Netw. Secur. 2007, 7, 264–273. [Google Scholar]
  21. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Technol. J. 1948, 27, 379–426. [Google Scholar] [CrossRef]
  22. Alsheikh, M.A.; Hoang, D.T.; Niyato, D.; Tan, H.; Lin, S. Markov Decision Processes With Applications in Wireless Sensor Networks: A Survey. IEEE Commun. Surv. Tutor. 2015, 17, 1239–1267. [Google Scholar] [CrossRef]
  23. Battery University (Cadex Electronics). BU-802b: What Does Elevated Self-discharge Do? Available online: http://batteryuniversity.com/learn/article/elevatingselfdischarge (accessed on 4 April 2017).
  24. Arapostathis, A.; Borkar, V.S.; Fernández-gaucherand, E.; Ghosh, M.K.; Marcus, S.I. Discrete-time controlled Markov processes with average cost criterion: A survey. SIAM J. Control Optim. 1993, 31, 282–334. [Google Scholar] [CrossRef]
  25. Monahan, G.E. State of the art-A survey of partially observable Markov decision processes: Theory, models, and algorithms. Manag. Sci. 1982, 28, 1–16. [Google Scholar] [CrossRef]
  26. Bellman, R.E. Dynamic Programming; Princeton University Press: Princeton, NJ, USA, 1957. [Google Scholar]
  27. Littman, M.L.; Dean, T.L.; Kaelbling, L.P. In the complexity of solving Markov decision problems. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–20 August 1995; pp. 394–402. [Google Scholar]
  28. Watkins, C.J. Learning from Delayed Rewards. Ph.D. Thesis, University of Cambridge, Cambridge, UK, 1989. [Google Scholar]
  29. Kaelbling, L.P.; Littman, M.L.; Moore, A.W. Reinforcement learning: A survey. J. Artif. Intell. Res. 1996, 4, 237–285. [Google Scholar]
  30. Mahadevan, S. Average reward reinforcement learning: Foundations, algorithms, and empirical results. Mach. Learn. Spec. Issue Reinf. Learn. 1996, 22, 159–196. [Google Scholar]
  31. Whittle, P. Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 1988, 25, 287–298. [Google Scholar] [CrossRef]
  32. Papadimitriou, C.H.; Tsitsiklis, J.N. The complexity of optimal queueing network control. Math. Oper. Res. 1999, 24, 293–305. [Google Scholar] [CrossRef]
  33. Hero, A.; Castanon, D.; Cochran, D.; Kastella, K. Foundations and Applications of Sensor Management; Springer: New York, NY, USA, 2007. [Google Scholar]
  34. Blasco, P.; Gunduz, D.; Dohler, M. Low-Complexity Scheduling Policies for Energy Harvesting Communication Networks. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Istanbul, Turkey, 7–12 July 2013; pp. 1601–1605. [Google Scholar]
  35. Iannello, F.; Simeone, O.; Spagnolini, U. Optimality of myopic scheduling and whittle indexability for energy harvesting sensors. In Proceedings of the 46th Annual Conference on Information Sciences and Systems(CISS), Princeton, NJ, USA, 21–23 March 2012; pp. 1–6. [Google Scholar]
  36. Gittins, J.; Glazerbrook, K.; Weber, R. Multi-Armed Bandit Allocation Indices; Wiley: West Sussex, UK, 2011. [Google Scholar]
  37. Gul, O.M.; Uysal-Biyikoglu, E. A randomized scheduling algorithm for energy harvesting wireless sensor networks achieving nearly 100% throughput. In Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC), Istanbul, Turkey, 6–9 April 2014; pp. 2456–2461. [Google Scholar]
  38. Gul, O.M.; Uysal-Biyikoglu, E. Achieving nearly 100% throughput without feedback in energy harvesting wireless networks. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, 29 June–4 July 2014; pp. 1171–1175. [Google Scholar]
  39. Gul, O.M.; Uysal-Biyikoglu, E. UROP: A Simple, Near-Optimal Scheduling Policy for Energy Harvesting Sensors. arXiv, 2014; arXiv:1401.0437. [Google Scholar]
  40. Durmaz Incel, O. A survey on multi-channel communication in wireless sensor networks. Comput. Netw. 2011, 55, 3081–3099. [Google Scholar] [CrossRef]
  41. Beckwith, R.; Teibel, D.; Bowen, P. Unwired Wine: Sensor Networks in Vineyards. In Proceedings of the 2014 IEEE Sensors, Vienna, Austria, 24–27 October 2004; pp. 1–4. [Google Scholar]
  42. Chaudhary, D.D.; Nayse, S.P.; Waghmare, L.M. Application of Wireless Sensor Networks for Greenhouse Parameter Control in Precision Agriculture. Int. J. Wirel. Mob. Netw. IJWMN 2011, 3, 140–149. [Google Scholar]
  43. Srbinovska, M.; Gavrovski, C.; Dimcev, V.; Krkoleva, A.; Borozan, V. Environmental parameters monitoring in precision agriculture using wireless sensor networks. J. Clean. Prod. 2015, 88, 297–307. [Google Scholar] [CrossRef]
  44. Project RHEA. Robot Fleets For Highly Effective Agriculture And Forestry Management. From 2010-08-01 to 2014-07-31, Closed Project. Available online: http://cordis.europa.eu/project/rcn/95055en.html (accessed on 29 May 2017).
  45. Project CROPS. Intelligent Sensing and Manipulation for Sustainable Production and Harvesting of High Value Crops, Clever Robots for Crops. From 2010-10-01 to 2014-09-30, Closed Project. Available online: http://cordis.europa.eu/project/rcn/96216_en.html (accessed on 25 May 2017).
Figure 1. An example single-hop WSN where an FC collects data from 10 EH sensors.
Figure 2. Efficiency of the myopic policy (MP) for i.i.d. EH processes under the infinite capacity battery assumption.
Figure 3. Efficiency of the myopic policy (MP) for Markov EH processes under the infinite capacity battery assumption.
Figure 4. Efficiency of the myopic policy (MP) for Markov EH processes under the finite capacity ($B_i = 50$) battery assumption.
Figure 5. Efficiency of the myopic policy (MP) for i.i.d. EH processes under the finite capacity ($B_i = 50$) battery assumption.
Table 1. Summary of commonly used symbols and notation.

Symbol      Definition
M           The number of energy harvesting nodes
K           The number of mutually orthogonal channels of the FC
S           The index set of all nodes
T           The time horizon
V^π(t)      Throughput of all nodes in TSs 1 through t under a policy π
V_i^π(t)    Throughput of node i in TSs 1 through t under a policy π
η(π)        Efficiency of a policy π
Y_i(t)      The number of packets which can be sent by node i in (t, T]
ρ_i         Intensity of node i
ρ           Intensity
Table 2. W and L denote the numbers of sensors with ρ_i = 0.3 and ρ_i = 3.0, respectively. ρ denotes the resultant intensity.

W    95      85      75      65      55
L    5       15      25      35      45
ρ    0.435   0.705   0.975   1.245   1.515
Table 3. Efficiency of the MP for IID and Markov EH processes under both infinite and finite capacity battery assumptions. B = ∞ and B = 50 stand for infinite and finite capacity batteries, respectively. ρ denotes the intensity. Max. efficiency difference between B = ∞ and B = 50 is the efficiency difference between the B = ∞ and B = 50 cases for the same intensity. Max. efficiency difference (%) btw. B = ∞ and B = 50 is the percentage of that difference relative to the efficiency in the B = ∞ case for the same intensity. Max. deviation between the bound and efficiency of MP is the difference between the upper bound on the efficiency of the MP and the minimum efficiency result of the MP for the same intensity.

ρ                                                        0.435   0.705   0.975   1.245   1.515
Efficiency of MP for Markov EH process, B = ∞            0.758   0.562   0.467   0.415   0.381
Efficiency of MP for Markov EH process, B = 50           0.756   0.561   0.462   0.414   0.380
Efficiency of MP for IID EH process, B = ∞               0.758   0.564   0.469   0.417   0.380
Efficiency of MP for IID EH process, B = 50              0.757   0.555   0.464   0.415   0.380
Max. efficiency difference between B = ∞ and B = 50      0.002   0.009   0.005   0.002   0.001
Max. efficiency difference (%) btw. B = ∞ and B = 50     0.264   1.596   1.071   0.480   0.262
Upper bound for efficiency of MP                         0.770   0.574   0.487   0.438   0.406
Max. deviation between the bound and efficiency of MP    0.014   0.019   0.025   0.024   0.026
