Resource Allocation in Wireless Powered IoT System: A Mean Field Stackelberg Game-Based Approach

Su, Jingtao; Xu, Haitao; Xin, Ning; Cao, Guixing; Zhou, Xianwei

doi:10.3390/s18103173


¹ Department of Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China

² Institute of Telecommunication Satellite, China Academy of Space Technology, Beijing 100081, China

* Author to whom correspondence should be addressed.

Sensors 2018, 18(10), 3173; https://doi.org/10.3390/s18103173

Submission received: 11 August 2018 / Revised: 16 September 2018 / Accepted: 18 September 2018 / Published: 20 September 2018

(This article belongs to the Special Issue Green Communications and Networking for IoT)


Abstract:

The IoT system has become a significant component of next-generation networks and has drawn considerable research interest in academia and industry. Because the sensor nodes in the IoT system are typically battery-limited devices, power control is a serious problem that needs to be solved. In this paper, we study resource allocation in a wireless powered IoT system, which includes one hybrid access point (HAP) and many wireless sensor nodes, to obtain the optimal power levels for information transmission and energy transfer simultaneously. The relationship between the HAP and the sensor nodes is formulated as a Stackelberg game, and the dynamic variations of the energy of both the HAP and the IoT devices are formulated through a dynamic game with mean field control. The power control problem in the wireless powered IoT system is then formulated as a mean field Stackelberg game model. We aim to minimize the transmission cost of each sensor node through optimal power allocation, and to minimize the energy transfer cost of the HAP through power control. The optimal solutions based on the mean field control of the sensor nodes and the HAP are obtained through dynamic programming theory and the law of large numbers, and $\varepsilon$-Nash equilibriums are obtained. The energy variations of both the sensor nodes and the HAP under the proposed resource allocation approach are verified by simulation results.

Keywords:

power control; wireless energy transfer; IoT system; mean field Stackelberg game

1. Introduction

The Internet of Things (IoT), as the main paradigm for linking people and things, has been employed in logistics for tracking, in building industrial IoT environments, and in academic research [1,2]. Through IoT techniques [3], it is easy for people to access and control the data generated by the sensors that make up the IoT system. Because the sensors in the IoT system are typically battery-limited devices [4], one of the main concerns for massive numbers of sensors is energy consumption. Although NB-IoT technology was introduced by 3GPP to achieve low energy consumption [5], efficient energy utilization in the IoT system is still a key challenge under active research.

The development of techniques for energy harvesting and wireless power transfer provides a paradigm for solving the energy efficiency and consumption problems in the IoT system [3,6]. Through an energy harvesting circuit, the sensors can harvest energy from different energy sources [7], such as sunlight and wind. Radio frequency (RF) based wireless power transfer in the IoT system [8,9], which is more reliable and controllable, has also drawn considerable research interest [10].

In this paper, we focus on the energy consumption problem in an IoT system with RF-based wireless power transfer, with the goal of optimal resource allocation. We use mean field Stackelberg game theory [11] to solve the resource allocation problem in the wireless powered IoT system, which consists of one HAP and a large number of sensor nodes. The mean field game was first introduced in [12,13] to solve resource allocation problems with a large number of game players. By combining the Stackelberg game and the mean field game, we aim to find the optimal power control strategies when there is a large number of sensor nodes. The dynamic characteristics of the battery’s energy variation are also considered in the proposed game model. We focus on the power control problem, minimizing the utility functions of both the HAP and the sensor nodes.

In summary, the key contributions of this paper are as follows:

Firstly, we study a wireless powered IoT system, which consists of one hybrid access point (HAP) and $N$ sensor nodes. The HAP is both the information collection center and the energy source for the sensor nodes.
Secondly, a mean field Stackelberg game model is proposed to formulate the resource allocation problem in the proposed IoT system. The Stackelberg game is a one-leader-many-followers game: the HAP is the leader, while the sensor nodes are the followers. For the mean field game, we use the energy variations as the system state. The objective of each sensor node is to minimize its transmission cost during energy transfer and information transmission. The objective of the HAP is to control the power level for energy transfer to minimize its utility function.
Finally, the mean field controls for both the sensor nodes and the HAP are derived based on dynamic programming and the law of large numbers. The $\varepsilon$-Nash equilibriums are also obtained and discussed.

The paper is organized as follows: Section 2 summarizes the related works. Section 3 gives the system model and problem formulation for the power control problem. Section 4 provides the mean field control of the sensor nodes with an $\varepsilon$-Nash equilibrium and the mean field control of the HAP with an $\varepsilon$-Nash equilibrium. The implementation algorithm is also given in this section. Section 5 presents the performance evaluation for both the sensor nodes and the HAP. Finally, the paper is concluded in Section 6.

2. Related Works

Although the battery limitation can be solved through the wireless power transfer technique [14], resource allocation in the wireless powered IoT system, especially power control, is still an urgent problem. A large number of works have been done in this area [15,16,17,18]. In [15], the authors solve the resource allocation problem in cyber-physical IoT to maximize the energy efficiency. The proposed resource allocation scheme is based on mixed-integer non-convex programming and is divided into two sub-problems, the power allocation problem and the channel allocation problem; Dinkelbach’s algorithm is used to solve the resulting optimal allocation problems.

In [16], a green resource allocation method is proposed that considers the QoE as the main influencing factor. The authors then use deep reinforcement learning to solve the QoE-based resource allocation problem.

In [17], a utility-lifetime maximization problem is considered for resource allocation. The authors apply the Lagrange multiplier method and solve the problem with a distributed dual subgradient algorithm. Wireless energy harvesting, wake-up radio, and error control coding are all considered in the model formulation.

In [18], the authors formulate a distributed power control problem in wireless powered communication networks as a utility maximization problem, to guarantee the QoS demand and to achieve efficient energy management. In that work, the authors propose an energy-efficient communication approach considering both the WET and WIT phases, from which the optimal charging power of the energy source can be determined.

The resource allocation problem in the wireless powered IoT system has been studied by many researchers, but most previous works do not consider the size of the IoT system. When there is a massive number of sensors in the IoT system, it is difficult to obtain the optimal power control strategy for each sensor. Meanwhile, the dynamic variation of the battery energy of each sensor node also affects the resource allocation strategies, which is likewise not considered in previous works.

3. System Model and Problem Formulation

3.1. System Model

In this paper, we consider a wireless powered IoT system with one dedicated hybrid access point (HAP) and $N$ sensor nodes (SNs). The system model is shown in Figure 1. Located at an appropriate place, the HAP acts both as an aggregation point that collects information from the sensor nodes and as an energy source that supplies the sensor nodes through RF-based wireless energy transfer. Each SN uploads its information to the HAP and harvests energy from the HAP using its energy harvesting circuit. As each SN’s energy is limited by its battery capacity, it mainly uses the energy from the HAP for information transmission. We assume that energy transfer and information transmission can be performed simultaneously. For the sensor nodes, we assume that wireless energy transfer and information transmission operate at the same frequency, based on the “harvest-then-transmit” protocol [19,20], as shown in Figure 2. Based on the system model, we seek the optimal power levels for the resource allocation problems in the proposed system. For the HAP, the optimal power strategy for energy transfer must be determined. For the SNs, the power strategies for information transmission are required.

3.2. Stackelberg Game Framework

In the proposed wireless powered IoT system, there are one dedicated HAP and $N$ SNs. In the downlink, the HAP controls its power level for energy transfer. In the uplink, each SN controls its power level for information transmission. As each SN uses the energy from the HAP for information transmission, the power level for energy transfer can significantly affect the performance of the SNs. The relationship between the HAP and the SNs can therefore be modeled as a Stackelberg game, more specifically a one-leader-many-followers Stackelberg game. The HAP acts as the leader, while the SNs are the followers. The Stackelberg game is composed of two parts, the leader-level game and the follower-level game, as shown in Figure 3.

(1) Leader-level game: As the HAP can significantly affect the performance of the SNs, it is considered the leader. The HAP transfers energy to the SNs based on its own objective and announces its power level strategy for energy transfer to the SNs. In this way, the HAP influences the SNs’ strategies for information transmission. Once the SNs decide on their power levels for information transmission, the HAP can re-adjust its energy transfer strategy to improve its utility.

(2) Follower-level game: As the SNs are affected by the HAP, they are considered the followers of the game. The SNs control their power for information transmission under the HAP’s energy transfer strategy, by playing a Stackelberg game.

3.3. System State

In this paper, since we concentrate on controlling the power of both the HAP and the SNs, we use energy as the system state for constructing the mean field game (MFG). There are two kinds of energy state variables in the proposed IoT system: the energy level of the HAP, denoted by $x_0(t)$, and the energy level of SN $i$, denoted by $x_i(t)$, $1 \leq i \leq N$.

(1) Energy of the HAP: the energy of the HAP is mainly determined by the power level for energy transfer. We assume that the HAP transfers energy on a unique frequency, to avoid interference with information transmission, and denote the power level for energy transfer at time instant $t$ by $p_0(t)$. The energy level of the HAP can then be described by the following differential equation:

$$d x_0(t) = [\alpha_0 x_0(t) + \beta_0 p_0(t)]\, dt \tag{1}$$

where $x_0(t)$ is the energy level of the HAP, with initial energy state $x_0(0)$; $\alpha_0$ is a random coefficient of energy degradation caused by system consumption, and $\alpha_0 x_0(t)$ denotes the energy consumed by the system. Generally, $\alpha_0 x_0(t)$ can be represented as [21,22]:

$$\alpha_0 x_0(t) = (P_{HC} + P_{RF} + P_{RP})\, \delta_0$$

where $P_{HC}$ is the power consumption of the hardware circuit, $P_{RF}$ is the power consumption of the RF module, $P_{RP}$ is the power consumption of the packets exchanged between the HAP and the controller, and $\delta_0$ is the duration (slot) of the energy transfer. $\beta_0$ is a random efficiency coefficient of energy transfer, which depends on the energy transfer circuit. The energy transfer process is a broadcast process. The initial state of the HAP is independent of those of the SNs, with mean $\mathbb{E} x_0(0) = \bar{x}_0$.

(2) Energy of the SNs: the energy of each SN is determined by the energy harvested from the HAP and the power used for information transmission. Let the power for information transmission be denoted by $p_i(t)$, $1 \leq i \leq N$. For any specific SN, the evolution of the energy is described by:

$$d x_i(t) = [\alpha_i x_i(t) + \beta_i p_i(t) + \rho_i h_i p_0(t)]\, dt \tag{2}$$

where $x_i(t)$, $1 \leq i \leq N$, is the energy level of SN $i$. Each SN has an initial energy state $x_i(0)$, $1 \leq i \leq N$; these initial states are independent of each other, with mean $\mathbb{E} x_i(0) = \bar{x}$, $1 \leq i \leq N$. In the current analysis, $N$ is taken to be large so that MFG analysis can be applied. $\alpha_i$ is a random coefficient of energy degradation caused by system consumption, which includes the power consumption of the hardware circuit and the RF module [23]. The power consumed in sensing and processing is also included in this coefficient [24]. $\beta_i$ is a random efficiency coefficient of information transmission, which depends on the information transmission circuit. $\rho_i$ is the conversion efficiency coefficient of energy transfer, and $h_i$ is the channel power gain from the HAP to SN $i$.
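To make the state dynamics concrete, the following is a minimal sketch that integrates Equations (1) and (2) with a forward-Euler scheme. It is an illustration only: the function name `simulate_energy`, the step count, the deterministic coefficients (the paper treats them as random), the shared SN parameters, and all numerical values are assumptions, not settings from the paper.

```python
def simulate_energy(x0_init, xi_init, p0, p_i, alpha0, beta0,
                    alpha, beta, rho, h, T=10.0, steps=1000):
    """Forward-Euler integration of Eq. (1) (HAP) and Eq. (2) (SNs).

    p0 and p_i are callables t -> power. All SNs share the same
    (alpha, beta, rho, h) here, which is a simplifying assumption.
    Returns (HAP energy trajectory, mean SN energy trajectory x^N).
    """
    dt = T / steps
    x0 = x0_init
    xs = list(xi_init)
    hap_traj = [x0]
    mean_traj = [sum(xs) / len(xs)]
    for k in range(steps):
        t = k * dt
        p0_t = p0(t)
        pi_t = p_i(t)
        # Eq. (1): dx0 = [alpha0 * x0 + beta0 * p0] dt
        x0 = x0 + (alpha0 * x0 + beta0 * p0_t) * dt
        # Eq. (2): dxi = [alpha * xi + beta * pi + rho * h * p0] dt
        xs = [x + (alpha * x + beta * pi_t + rho * h * p0_t) * dt for x in xs]
        hap_traj.append(x0)
        mean_traj.append(sum(xs) / len(xs))
    return hap_traj, mean_traj


if __name__ == "__main__":
    # Illustrative values only; negative alpha/beta model energy drain.
    hap, mean_sn = simulate_energy(
        x0_init=100.0, xi_init=[4.0, 5.0, 6.0],
        p0=lambda t: 2.0, p_i=lambda t: 0.5,
        alpha0=-0.05, beta0=-1.0,
        alpha=-0.05, beta=-1.0,
        rho=0.8, h=0.5, T=10.0, steps=1000)
    print(round(hap[-1], 3), round(mean_sn[-1], 3))
```

The second returned trajectory is the empirical mean $x^N(t)$ of the SN energies, which is the quantity the mean field analysis later replaces by its large-population limit.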

3.4. Problem Formulation

In this subsection, we formulate the optimal power control problems for the HAP and the SNs. We want to find the optimal power levels for both the HAP and the SNs based on the proposed model. For the HAP, the optimal power strategy for energy transfer is obtained by minimizing the following utility function:

$$J_0^N(p_0, p^N) = \int_0^T \left\{ \mu_0 (x_0(t) - H_0 x^N(t))^2 + \nu_0 (p_0(t))^2 \right\} dt \tag{3}$$

where $\mu_0 \geq 0$ and $\nu_0 > 0$ are weighting factors representing the relative importance of the objective components. The objective of the HAP is a linear combination of two components. The first component, $\mu_0 (x_0(t) - H_0 x^N(t))^2$, represents the energy available for transfer compared with the mass behavior of the energy of the SNs. Here, $x^N(t) = (1/N) \sum_{i=1}^{N} x_i(t)$ denotes the mean field term that captures the mass behavior of the SNs. The second component, $\nu_0 (p_0(t))^2$, is the payment earned from the SNs for energy transfer. Minimizing the utility function of the HAP therefore gives the following objective:

$$p_0^*(t) = \underset{p_0(t)}{\arg\min}\; J_0^N(p_0, p^N) \tag{4}$$

For the SNs, we want to find the optimal power strategies for information transmission in a large population. For any specific SN, the cost function is given as follows:

$$J_i^N(p_i, p_{-i}, p_0) = \int_0^T \left\{ \mu_i (x_i(t) - H_i x^N(t))^2 + \nu_i (p_i(t))^2 + \eta_i p_0(t) p_i(t) \right\} dt \tag{5}$$

where $\mu_i \geq 0$, $\nu_i > 0$ and $\eta_i > 0$ are weighting factors. The objective of any specific SN is composed of three parts. The first part, $\mu_i (x_i(t) - H_i x^N(t))^2$, represents the energy available for information transmission compared with the mass behavior of the SNs. The second part, $\nu_i (p_i(t))^2$, is the power cost of information transmission. The third part, $\eta_i p_0(t) p_i(t)$, is the payment for energy harvesting, which depends on both the harvested energy and the power for information transmission. Minimizing the objective of any specific SN therefore gives the following objective:

$$p_i^*(t) = \underset{p_i(t)}{\arg\min}\; J_i^N(p_i, p_{-i}, p_0) \tag{6}$$

For both the HAP and the SNs, the objective functions are formulated within the mean field game framework through the mean field term $x^N(t)$. Based on the mean field term, we can analyze the IoT system with a large population. Both the HAP and the SNs can obtain their distributed equilibriums by estimating the mass response.
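As an illustration of how the cost functionals in Equations (3) and (5) can be evaluated on sampled trajectories, the sketch below applies the trapezoidal rule to the two integrands. The function names (`trapezoid`, `hap_cost`, `sn_cost`) and the uniform time grid are assumptions made for this example; the paper does not specify a discretization.

```python
def trapezoid(values, dt):
    """Trapezoidal rule on uniformly spaced samples."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def hap_cost(x0, x_mean, p0, mu0, nu0, H0, dt):
    """Discretized Eq. (3): tracking term plus quadratic power term.

    x0, x_mean and p0 are equal-length lists sampled every dt seconds.
    """
    integrand = [mu0 * (a - H0 * m) ** 2 + nu0 * p ** 2
                 for a, m, p in zip(x0, x_mean, p0)]
    return trapezoid(integrand, dt)


def sn_cost(x_i, x_mean, p_i, p0, mu, nu, eta, H, dt):
    """Discretized Eq. (5): tracking, power cost, and payment terms."""
    integrand = [mu * (x - H * m) ** 2 + nu * pi ** 2 + eta * p * pi
                 for x, m, pi, p in zip(x_i, x_mean, p_i, p0)]
    return trapezoid(integrand, dt)


if __name__ == "__main__":
    n = 11  # samples on [0, 1] with dt = 0.1; values are illustrative
    print(hap_cost([2.0] * n, [1.0] * n, [2.0] * n,
                   mu0=1.0, nu0=0.5, H0=1.0, dt=0.1))
```

With constant trajectories the integrands are constant, so the result is simply the integrand value times the horizon, which gives a quick sanity check of the weights.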

4. Game Analysis and Implementation Algorithm

4.1. Mean Field Control of Sensor Nodes

In this section, we derive the mean field control solutions for the SNs, given an energy transfer power strategy of the HAP. First, the local optimal control of each sensor node is treated as a dynamic game, and the open-loop and state feedback solutions are given based on Bellman’s dynamic programming principle. Then we let the size of the IoT system grow and use the strong law of large numbers (SLLN) to obtain the mean field control solution for all the SNs, so that each sensor node can obtain its distributed equilibrium solution based on the mean field control solution.

For any control strategy of the HAP, the strategies of the SNs constitute an $\varepsilon$-Nash equilibrium, which characterizes the optimality of the control problem for each SN and is defined as follows.

Definition 1 ($\varepsilon$-Nash equilibrium). Given an energy transfer power strategy of the HAP, denoted by $p_0$, the strategies of the sensor nodes $i$, $1 \leq i \leq N$, constitute an $\varepsilon$-Nash equilibrium if there exists $\varepsilon \geq 0$ such that

$$J_i^N(\bar{p}_i, \bar{p}_{-i}, p_0) \leq \inf_{p_i \in U_i(p_0)} J_i^N(p_i, \bar{p}_{-i}, p_0) + \varepsilon$$

for all $i$, $1 \leq i \leq N$.

Proposition 1. For sensor node $i$, $1 \leq i \leq N$, the pair $(x_i^*, p_i^*)$ constitutes an open-loop equilibrium of the power control problem in Equations (2) and (6), and the optimal control is given by:

$$p_i^*(t) = -\nu_i^{-1} \beta_i \lambda_i(t) - \nu_i^{-1} \eta_i p_0(t) \tag{7}$$

subject to:

$$\begin{aligned} d x_i^*(t) &= [\alpha_i x_i^*(t) + \beta_i (-\nu_i^{-1} \beta_i \lambda_i(t) - \nu_i^{-1} \eta_i p_0(t)) + \rho_i h_i p_0(t)]\, dt \\ &= [\alpha_i x_i^*(t) - \nu_i^{-1} \beta_i^2 \lambda_i(t) - \nu_i^{-1} \beta_i \eta_i p_0(t) + \rho_i h_i p_0(t)]\, dt \end{aligned} \tag{8}$$

where $x_i^*(0) = x_i(0)$, and $\lambda_i(t)$ is a costate function with $\lambda_i(T) = 0$ that satisfies the following differential equation:

$$d \lambda_i(t) = [-\mu_i (x_i^*(t) - H_i z(t)) - \alpha_i \lambda_i(t)]\, dt \tag{9}$$

and:

$$z(t) = x^N(t) = (1/N) \sum_{i=1}^{N} x_i(t)$$

Theorem 1. The optimal control problem has a unique solution.

Proof. The optimal solution for SN $i$, $1 \leq i \leq N$, in Equation (7) is obtained from the Hamilton-Jacobi-Bellman (HJB) equation, based on the following function:

$$L_i(p_i, x_i) = \mu_i (x_i(t) - H_i x^N(t))^2 + \nu_i (p_i(t))^2 + \eta_i p_0(t) p_i(t) + \lambda_i(t) [\alpha_i x_i(t) + \beta_i p_i(t) + \rho_i h_i p_0(t)] \tag{10}$$

Then the optimal control problem has a unique solution, which is determined by the first-order condition:

$$\left. \frac{\partial L_i(p_i, x_i)}{\partial p_i(t)} \right|_{p_i = p_i^*} = 0 \tag{11}$$

Because $\nu_i > 0$, $L_i$ is strictly convex in $p_i(t)$, so this stationary point is the unique minimizer. From Equations (7)–(9), we can see that, under the mean field game framework, the optimal solution of each sensor node is affected by the mass behavior of all the sensors. The corresponding optimal solutions can be regarded as the mean field game Nash equilibrium control strategies. □

Proposition 2. For each sensor node, the state feedback control equilibrium is given by:

$$p_i^*(t) = \nu_i^{-1} \beta_i V_i(t) x_i^*(t) - \nu_i^{-1} \beta_i \phi_i(t) - \nu_i^{-1} \eta_i p_0(t) \tag{12}$$

where $V_i(t)$ is the value function, specified below. We call Equation (7) or (12) the optimal localized power strategy of sensor node $i$ for information transmission, because the optimal power strategy is a function of local information and the strategy of the HAP. In (12), the optimal power strategy $p_i^*(t)$ is a function of the energy state $x_i^*(t)$ and the value function $V_i(t)$, where $V_i(t)$ satisfies the following relations:

$$\lambda_i(t) = -V_i(t) x_i^*(t) + \phi_i(t) \tag{13}$$

and:

$$-\frac{d V_i(t)}{d t} = \nu_i^{-1} \beta_i^2 V_i^2(t) + 2 \alpha_i V_i(t) - \mu_i \tag{14}$$

where $\phi_i(T) = 0$ and $V_i(T) = 0$. Based on Proposition 2, we can obtain the state feedback equilibrium of the optimal control strategy in Equation (7). Meanwhile, the corresponding optimal state trajectory and energy variations in Equations (8) and (9) can be rewritten as follows:

$$d x_i^*(t) = [(\alpha_i + \beta_i^2 \nu_i^{-1} V_i(t)) x_i^*(t) - \beta_i^2 \nu_i^{-1} \phi_i(t) - \beta_i \nu_i^{-1} \eta_i p_0(t) + \rho_i h_i p_0(t)]\, dt \tag{15}$$

$$d \phi_i(t) = [-(\alpha_i + \beta_i^2 \nu_i^{-1} V_i(t)) \phi_i(t) + \mu_i H_i z(t) - V_i(t) \beta_i \nu_i^{-1} \eta_i p_0(t) - \rho_i h_i p_0(t)]\, dt \tag{16}$$
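As a numerical illustration of Proposition 2, the sketch below integrates the Riccati-type Equation (14) backward from the terminal condition $V_i(T) = 0$ and evaluates the feedback law of Equation (12). The explicit Euler scheme, the function names `solve_riccati` and `feedback_power`, and the parameter values are assumptions for this example; the paper does not state which numerical method was used.

```python
def solve_riccati(alpha, beta, mu, nu, T=1.0, steps=1000):
    """Backward Euler-in-time sweep for Eq. (14) with V(T) = 0.

    Eq. (14): -dV/dt = (beta^2 / nu) V^2 + 2 alpha V - mu,
    so stepping from t to t - dt adds dt times the right-hand side.
    Returns the list [V(0), V(dt), ..., V(T)].
    """
    dt = T / steps
    V = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):
        rhs = (beta ** 2 / nu) * V[k] ** 2 + 2.0 * alpha * V[k] - mu
        V[k - 1] = V[k] + dt * rhs
    return V


def feedback_power(x, V_t, phi_t, p0_t, beta, nu, eta):
    """State feedback power of Eq. (12) at one time instant."""
    return (beta * V_t * x - beta * phi_t - eta * p0_t) / nu


if __name__ == "__main__":
    V = solve_riccati(alpha=-0.5, beta=-1.0, mu=1.0, nu=2.0)
    # Power at t = 0 for an assumed state, offset phi and HAP power.
    print(feedback_power(x=4.0, V_t=V[0], phi_t=0.0, p0_t=1.0,
                         beta=-1.0, nu=2.0, eta=0.1))
```

Substituting Equation (13) into Equation (7) reproduces Equation (12), so `feedback_power` returns the same value as the open-loop expression whenever $\lambda_i = -V_i x_i^* + \phi_i$.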

Next, to obtain the mean field estimation, we apply the strong law of large numbers (SLLN) to the control strategies given above. For each sensor node, the optimal power strategy is given by Equation (7), and the associated energy state is given by Equation (8). Let $\lambda^N(t) = (1/N) \sum_{i=1}^{N} \lambda_i(t)$; then $z(t) = \lim_{N \to \infty} x^N(t)$ and $\lambda(t) = \lim_{N \to \infty} \lambda^N(t)$ are given by:

$$d z(t) = [\alpha_i z(t) + \beta_i (-\beta_i \nu_i^{-1} \lambda(t) - \nu_i^{-1} \eta_i p_0(t)) + \rho_i h_i p_0(t)]\, dt \tag{17}$$

$$d \lambda(t) = [-\mu_i (z(t) - H_i z(t)) - \alpha_i \lambda(t)]\, dt \tag{18}$$

Using Equations (15) and (16), Equations (17) and (18) can also be written as:

$$d z(t) = [(\alpha_i + \nu_i^{-1} \beta_i^2 V(t)) z(t) - \beta_i^2 \nu_i^{-1} \phi(t) - \beta_i \nu_i^{-1} \eta_i p_0(t) + \rho_i h_i p_0(t)]\, dt \tag{19}$$

$$d \phi(t) = [-(\alpha_i + \beta_i^2 \nu_i^{-1} V(t)) \phi(t) + \mu_i H_i z(t) - V(t) \beta_i \nu_i^{-1} \eta_i p_0(t) - \rho_i h_i p_0(t)]\, dt \tag{20}$$

where $\phi(t) = \lim_{N \to \infty} (1/N) \sum_{i=1}^{N} \phi_i(t)$ and $\phi(T) = 0$. When the number of sensor nodes $N$ is arbitrarily large, we can find the mean field estimation based on Equations (19) and (20). Additionally, the mean field estimation depends on the HAP’s power control strategy.

Proposition 3. For any power strategy of the HAP, we have:

$$\mathbb{E} \int_0^T \| x^N(t) - H_i z(t) \|^2 \, dt = O\left(\frac{1}{N}\right) \tag{21}$$

Recall the optimal power strategy for sensor node $i$, $1 \leq i \leq N$, in Proposition 1:

$$p_i^*(t) = -\nu_i^{-1} \beta_i \lambda_i(t) - \nu_i^{-1} \eta_i p_0(t) \tag{22}$$

where $(x_i^*, \lambda_i)$ is determined by Equations (8) and (9). The optimal control strategy given in Equation (22) is an open-loop solution that depends on the power control strategy of the HAP.

Theorem 2. For any power strategy of the HAP for energy transfer, the information transmission strategies of the sensor nodes $i$, $1 \leq i \leq N$, constitute an $\varepsilon$-Nash equilibrium; that is, for any $i$, $1 \leq i \leq N$, we have:

$$J_i^N(\bar{p}_i, \bar{p}_{-i}, p_0) \leq \inf_{p_i \in U_i(p_0)} J_i^N(p_i, \bar{p}_{-i}, p_0) + \varepsilon \tag{23}$$

where $\varepsilon = O(1/\sqrt{N})$.
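The $O(1/N)$ and $O(1/\sqrt{N})$ rates in Proposition 3 and Theorem 2 come from the law of large numbers. As a rough illustration of that scaling only (not of the game itself), the sketch below estimates the root-mean-square gap between the empirical mean of $N$ independent initial energies and their expectation. The uniform distribution, the function name `rms_mean_error`, and all numbers are assumptions chosen for this example.

```python
import math
import random


def rms_mean_error(n_nodes, trials, mean=5.0, spread=1.0, seed=0):
    """RMS deviation of the empirical mean of n_nodes i.i.d. samples.

    Samples are uniform on [mean - spread, mean + spread]; theory gives
    an RMS deviation of (spread / sqrt(3)) / sqrt(n_nodes).
    """
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(trials):
        sample_mean = sum(rng.uniform(mean - spread, mean + spread)
                          for _ in range(n_nodes)) / n_nodes
        total_sq += (sample_mean - mean) ** 2
    return math.sqrt(total_sq / trials)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        # Quadrupling N should roughly halve the RMS error.
        print(n, rms_mean_error(n, trials=200))
```

The squared error therefore shrinks like $1/N$, while the error itself shrinks like $1/\sqrt{N}$, matching the orders stated above.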

4.2. Mean Field Control of HAP

In this section, we will analyze the mean field control problem for the HAP, and try to obtain the optimal control strategy. The open-loop solution will be given and the mean field control solution can be obtained.

Definition 2. For the HAP, the control strategy $p_0^*(t)$ is optimal if the following inequality holds for all feasible controls $p_0(t) \neq p_0^*(t)$:

$$J_0^N(p_0^*, p^N) \leq J_0^N(p_0, p^N) \tag{24}$$

Proposition 4. The HAP’s optimal control problem is to minimize the following function:

$$J_0^N(p_0, p^N) = \mathbb{E} \left( \int_0^T \left\{ \mu_0 (x_0(t) - H_0 z(t))^2 + \nu_0 (p_0(t))^2 \right\} dt \right) \tag{25}$$

subject to:

$$d x_0(t) = [\alpha_0 x_0(t) + \beta_0 p_0(t)]\, dt \tag{26}$$

$$d z(t) = [\alpha_i z(t) + \beta_i (-\nu_i^{-1} \beta_i \lambda(t) - \nu_i^{-1} \eta_i p_0(t)) + \rho_i h_i p_0(t)]\, dt \tag{27}$$

$$d \lambda(t) = [-\mu_i (z(t) - H_i z(t)) - \alpha_i \lambda(t)]\, dt \tag{28}$$

As the HAP is the leader in the proposed game model and the Stackelberg game analysis applies, the control problem of the HAP has two more constraints than that of the sensor nodes, given by Equations (27) and (28). Based on the mean field control solutions given in Section 4.1, the constraints in Equations (27) and (28) can be replaced by:

$$d z(t) = [(\alpha_i + \nu_i^{-1} \beta_i^2 V(t)) z(t) - \beta_i^2 \nu_i^{-1} \phi(t) - \beta_i \nu_i^{-1} \eta_i p_0(t) + \rho_i h_i p_0(t)]\, dt \tag{29}$$

$$d \phi(t) = [-(\alpha_i + \beta_i^2 \nu_i^{-1} V(t)) \phi(t) + \mu_i H_i z(t) - V(t) \beta_i \nu_i^{-1} \eta_i p_0(t) - \rho_i h_i p_0(t)]\, dt \tag{30}$$

Proposition 5. For the HAP, there exists an optimal control solution given by the pair $(x_0^*, p_0^*)$ if and only if

$$p_0^*(t) = -\nu_0^{-1} \beta_0 \lambda_0(t) + \nu_0^{-1} \Lambda_1(t) \sum_{i=1}^{N} \eta_i \nu_i^{-1} \beta_i - \nu_0^{-1} \Lambda_2(t) \sum_{i=1}^{N} \rho_i h_i \tag{31}$$

where:

$$d x_0(t) = \Big[ \alpha_0 x_0(t) - \nu_0^{-1} \beta_0^2 \lambda_0(t) + \nu_0^{-1} \beta_0 \Lambda_1(t) \sum_{i=1}^{N} \eta_i \nu_i^{-1} \beta_i - \nu_0^{-1} \beta_0 \Lambda_2(t) \sum_{i=1}^{N} \rho_i h_i \Big]\, dt \tag{32}$$

$$d \lambda_0(t) = [-\mu_0 (x_0^*(t) - H_0 z(t)) - \alpha_0 \lambda_0(t)]\, dt \tag{33}$$

$$d \Lambda_1(t) = \Big[ \eta_i H_0 (x_0^*(t) - H_0 z(t)) - \Lambda_1(t) \sum_{i=1}^{N} \alpha_i - \Lambda_2(t) \sum_{i=1}^{N} \eta_i (H_i - 1) \Big]\, dt \tag{34}$$

$$d \Lambda_2(t) = \Big[ \Lambda_1(t) \sum_{i=1}^{N} \beta_i^2 \nu_i^{-1} + \Lambda_2(t) \sum_{i=1}^{N} \alpha_i \Big]\, dt \tag{35}$$

$$\begin{aligned} d z(t) = \Big[ & \alpha_i z(t) - \nu_i^{-1} \beta_i^2 \lambda(t) \\ & - \nu_i^{-1} \beta_i \eta_i \Big( -\nu_0^{-1} \beta_0 \lambda_0(t) + \nu_0^{-1} \Lambda_1(t) \sum_{i=1}^{N} \eta_i \nu_i^{-1} \beta_i - \nu_0^{-1} \Lambda_2(t) \sum_{i=1}^{N} \rho_i h_i \Big) \\ & + \rho_i h_i \Big( -\nu_0^{-1} \beta_0 \lambda_0(t) + \nu_0^{-1} \Lambda_1(t) \sum_{i=1}^{N} \eta_i \nu_i^{-1} \beta_i - \nu_0^{-1} \Lambda_2(t) \sum_{i=1}^{N} \rho_i h_i \Big) \Big]\, dt \end{aligned} \tag{36}$$

$$d \lambda(t) = [-\mu_i (z(t) - H_i z(t)) - \alpha_i \lambda(t)]\, dt \tag{37}$$

4.3. Mean Field Control Algorithm

In this subsection, we discuss the implementation algorithm for the proposed model. As shown in Figure 4, each cycle of the algorithm can be divided into two parts. One is the “mean field control of sensor nodes” part, which calculates the equilibrium for the sensor nodes. The other is the “mean field control of HAP” part, which decides the power level for energy transfer. As all the objective functions in the mean field control process are linear quadratic functions, and the solutions are obtained within the Stackelberg game framework, the complexity of the algorithm is $O(n^2)$. The procedure can be described as follows.

Algorithm 1 Mean field control algorithm for the HAP and sensor nodes.

1. Set up the parameters for the HAP and sensor nodes.
2. The HAP announces the power strategy for energy transfer to the sensor nodes.
3. Start the mean field game of the HAP and sensor nodes.
4. Calculate the mean field control solutions for the sensor nodes first.
5. Set up the objective function and state function for the sensor nodes.
6. Calculate the solutions for the sensor nodes based on Equations (12)–(18).
7. Get the mean field estimation of the sensor nodes for the HAP.
8. Calculate the mean field control solutions for the HAP.
9. Set up the objective function and state function for the HAP.
10. Calculate the solutions for the HAP based on Equations (31)–(37).
11. End.
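The two-part structure of Algorithm 1 can be sketched in code as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: all SNs are identical, the HAP power is constant in time, the follower step solves Equations (14), (19) and (20) by explicit Euler sweeps repeated a fixed number of times (with no convergence guarantee), and the leader step replaces the costate system of Equations (31)–(37) with a plain search over a few candidate power levels. The names `follower_mean_field`, `hap_cost` and `stackelberg_grid`, and every parameter value, are hypothetical.

```python
def follower_mean_field(p0, z_init, alpha, beta, mu, nu, eta, rho, h, H,
                        T, steps, sweeps=50):
    """Follower step: mean field response of identical SNs to constant p0."""
    dt = T / steps
    b2 = beta * beta / nu
    # Eq. (14): Riccati equation, integrated backward from V(T) = 0.
    V = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):
        V[k - 1] = V[k] + dt * (b2 * V[k] ** 2 + 2.0 * alpha * V[k] - mu)
    z = [z_init] * (steps + 1)
    phi = [0.0] * (steps + 1)
    for _ in range(sweeps):
        # Eq. (20): backward sweep from phi(T) = 0 using the current z.
        for k in range(steps, 0, -1):
            rate = (-(alpha + b2 * V[k]) * phi[k] + mu * H * z[k]
                    - V[k] * beta / nu * eta * p0 - rho * h * p0)
            phi[k - 1] = phi[k] - dt * rate
        # Eq. (19): forward sweep from z(0) = z_init using the new phi.
        for k in range(steps):
            rate = ((alpha + b2 * V[k]) * z[k] - b2 * phi[k]
                    - beta / nu * eta * p0 + rho * h * p0)
            z[k + 1] = z[k] + dt * rate
    return z, phi, V


def hap_cost(p0, z, x0_init, alpha0, beta0, mu0, nu0, H0, T, steps):
    """Leader cost of Eq. (25) along the HAP dynamics of Eq. (26)."""
    dt = T / steps
    x0 = x0_init
    cost = 0.0
    for k in range(steps):
        cost += (mu0 * (x0 - H0 * z[k]) ** 2 + nu0 * p0 ** 2) * dt
        x0 += (alpha0 * x0 + beta0 * p0) * dt
    return cost


def stackelberg_grid(candidates, sn, hap, z_init, x0_init, T=1.0, steps=200):
    """Leader step (assumption): pick the cheapest constant p0 by search."""
    best = None
    for p0 in candidates:
        z, _, _ = follower_mean_field(p0, z_init, T=T, steps=steps, **sn)
        cost = hap_cost(p0, z, x0_init, T=T, steps=steps, **hap)
        if best is None or cost < best[1]:
            best = (p0, cost)
    return best


if __name__ == "__main__":
    sn = dict(alpha=-0.1, beta=-0.5, mu=0.5, nu=1.0, eta=0.1,
              rho=0.8, h=0.5, H=1.0)
    hap = dict(alpha0=-0.05, beta0=-1.0, mu0=1.0, nu0=0.1, H0=1.0)
    print(stackelberg_grid([0.0, 1.0, 2.0, 4.0], sn, hap,
                           z_init=5.0, x0_init=10.0))
```

The ordering mirrors the leader-follower logic: for each announced power level the followers' mean field response is computed first, and only then is the leader's cost evaluated against that response.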

5. Performance Evaluation

In this section, we provide simulation results to illustrate the convergence and effectiveness of the proposed model. We assume that all the sensor nodes are uniform, that is, they have the same parameter settings. Each sensor controls its power level for information transmission to minimize the cost given in Equation (5). The mean field control solutions introduced in Section 3 and Section 4 are simulated.

Figure 5 shows the optimal variations of the energy state of the sensor node when the power level for energy transfer is set to 50 W, 100 W, 150 W, and 200 W, respectively. In Figure 5a, the power level for energy transfer is set to 50 W, and the energy state of the sensor node increases from the initial energy state to a higher energy state through energy transfer. When we increase the power level of the HAP for energy transfer in Figure 5b–d, the final energy state increases. The higher the transferred energy, the higher the achieved energy state: the sensor node can store more energy in its battery when it receives more energy from the HAP. The corresponding power level for information transmission of each sensor node is given in Figure 6. As the transferred energy increases, more power is available for the sensor node to transmit information. When the power level of the HAP for energy transfer is 50 W, as shown in Figure 6a, the sensor node increases its power level for information transmission during the first 6 s and then decreases it during the next 4 s to keep more energy available. The power level for information transmission converges when the power level for energy transfer is larger than 50 W.

The variation of the mean field term, i.e., the mass behavior of the sensor nodes, is given in Figure 7. Figure 8 shows the variation of the energy state of the HAP. As the HAP is the energy source for the sensor nodes, its energy decreases over time.

6. Conclusions

In this paper, we have proposed a Stackelberg mean field game-based model to solve the power control problems in the wireless powered IoT system, minimizing the cost of information transmission for the sensor nodes and the cost of the HAP. In the proposed game model, the relationship between the HAP and the sensors is analyzed based on the Stackelberg game, and the objective functions are constructed using the mean field game. We consider the energy variations of the sensors and the HAP as the system state to construct the mean field game model. The mean field controls for both the sensor nodes and the HAP are then analyzed, and $\varepsilon$-Nash equilibriums are obtained. The simulation results show that the proposed model can achieve optimal power control for both the sensor nodes and the HAP. In future work, we will attempt to extend the proposed mean field Stackelberg game-based algorithm to other kinds of networks, such as smart grid networks [25], M2M networks [26], 5G networks [27,28], and so on [29].

Author Contributions

J.S. drafted the manuscript. H.X. conceived the system model, the game analysis, and the performance evaluation, and also organized the manuscript; N.X. and G.C. contributed to the simulations; X.Z. contributed to the project management; and all authors contributed to data analysis, simulations, and the writing of this paper.

Funding

This work is supported by the National Natural Science Foundation of China, No. 61501026, and the National Key R&D Program of China, No.2018YFB1003905.

Acknowledgments

The authors would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions that improved the quality of this paper.

Conflicts of Interest

The authors declare no conflicts of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Lv, T.; Lin, Z.; Huang, P.; Zeng, J. Optimization of the Energy-Efficient Relay-Based massive IoT Network. IEEE IOT J. 2018, 5, 3043–3058.
2. Whitmore, A.; Agarwal, A.; Li, X. The Internet of Things—A Survey of Topics and Trends. Inf. Syst. Front. 2015, 17, 261–274.
3. Zhong, S.; Wang, X. Energy Allocation and Utilization for Wirelessly Powered IoT Networks. IEEE IOT J. 2018, 5, 2781–2792.
4. Sarwesh, P.; Shet, N.S.V.; Chandrasekaran, K. Energy efficient network architecture for IoT applications. In Proceedings of the International Conference on Green Computing and Internet of Things, Noida, India, 8–10 October 2015; pp. 784–789.
5. Malik, H.; Pervaiz, H.; Alam, M.M.; Moullec, Y.L.; Kuusik, A.; Imran, M.A. Radio Resource Management Scheme in NB-IoT Systems. IEEE Access 2018, 6, 15051–15064.
6. Afghan, S.A.; Adila, A.H.; Husi, G. Towards the self-powered Internet of Things (IoT) by energy harvesting: Trends and technologies for green IoT. In Proceedings of the 2nd International Symposium on Small-scale Intelligent Manufacturing Systems (SIMS), Cavan, Ireland, 16–18 April 2018; pp. 1–5.
7. Sudevalayam, S.; Kulkarni, P. Energy Harvesting Sensor Nodes: Survey and Implications. IEEE Commun. Surv. Tutor. 2011, 13, 443–461.
8. Wu, Q.; Chen, W.; Ng, D.W.K.; Schober, R. Spectral and Energy Efficient Wireless Powered IoT Networks: NOMA or TDMA? IEEE Trans. Veh. Technol. 2018, 67, 6663–6667.
9. Yanagawa, S.; Shimizu, R.; Hamada, M.; Shimizu, T.; Kuroda, T. Wireless power transfer to stacked modules for IoT sensor nodes. In Proceedings of the International SoC Design Conference (ISOCC), Seoul, Korea, 5–8 November 2017; pp. 59–60.
10. Bi, S.; Zeng, Y.; Zhang, R. Wireless powered communication networks: An overview. IEEE Wirel. Commun. 2016, 23, 10–18.
11. Moon, J.; Basar, T. Linear-quadratic stochastic differential Stackelberg games with a high population of followers. In Proceedings of the IEEE Conference on Decision and Control, Osaka, Japan, 15–18 December 2015; pp. 2270–2275.
12. Huang, M.; Caines, P.E.; Malhamé, R.P. Large-Population Cost-Coupled LQG Problems with Nonuniform Agents: Individual-Mass Behavior and Decentralized ε-Nash Equilibria. IEEE Trans. Autom. Control. 2007, 52, 1560–1571.
13. Aziz, M.; Caines, P.E. A Mean Field Game Computational Methodology for Decentralized Cellular Network Optimization. IEEE Trans. Control. Syst. Technol. 2017, 25, 563–576.
14. Prawiro, S.Y.; Murti, M.A. Wireless power transfer solution for smart charger with RF energy harvesting in public area. In Proceedings of the 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Singapore, 5–8 February 2018; pp. 103–106.
15. Li, S.; Ni, Q.; Sun, Y.; Min, G.; AI-Ruibaye, S. Energy-Efficient Resource Allocation for Industrial Cyber-Physical IoT Systems in 5G Era. IEEE Trans. Ind. Inf. 2018, 14, 2618–2628.
16. He, X.; Wang, K.; Huang, H.; Miyazaki, T.; Wang, Y.; Guo, S. Green Resource Allocation based on Deep Reinforcement Learning in Content-Centric IoT. IEEE Trans. Emerg. Top. Comput. 2018, 99, 1.
17. Mahapatra, C.; Sheng, Z.; Kamalinejad, P.; Leung, V.C.M.; Mirabbasi, S. Optimal Power control in Green Wireless Sensor Networks with Wireless Energy Harvesting, Wake-up Radio and Transmission control. IEEE Access 2016, 5, 501–518.
18. Vamvakas, P.; Tsiropoulou, E.E.; Vomvas, M.; Papavassiliou, S. Adaptive Power Management in Wireless Powered Communication Networks: A User-Centric Approach. In Proceedings of the 38th IEEE Sarnoff Symposium, Newark, NJ, USA, 18–20 September 2017; pp. 1–6.
19. Wang, M.; Xu, H.; Zhou, X. Cooperative Dynamic Game based Optimal Power Control in Wireless Sensor Network Powered by RF Energy. Sensors 2018, 18, 2393.
20. Ju, H.; Zhang, R. Throughput maximization in wireless powered communication networks. IEEE Trans. Wirel. Commun. 2014, 13, 418–428.
21. Ni, W.; Dong, X. Energy Harvesting Wireless Communications With Energy Cooperation Between Transmitter and Receiver. IEEE Trans. Commun. 2015, 63, 1457–1469.
22. Wu, Q.; Chen, W.; Ng, D.W.K. User-Centric Energy Efficiency Maximization for Wireless Powered Communications. IEEE Trans. Wirel. Commun. 2016, 15, 6898–6912.
23. Ejaz, W.; Naeem, M.; Basharat, M. Efficient Wireless Power Transfer in Software-Defined Wireless Sensor Networks. IEEE Sens. J. 2016, 16, 7409–7420.
24. Ejaz, W.; Shah, G.A.; Hasan, N.U. Energy and throughput efficient cooperative spectrum sensing in cognitive radio sensor networks. Emerg. Telecommun. Technol. 2015, 26, 1019–1030.
25. Cacciapuoti, A.S.; Caleffi, M.; Marino, F. Mobile Smart Grids: Exploiting the TV White Space in Urban Scenarios. IEEE Access 2017, 4, 7199–7211.
26. Tsiropoulou, E.E.; Mitsis, G.; Papavassiliou, S. Interest-aware Energy Collection & Resource Management in Machine to Machine Communications. Ad Hoc Netw. 2018, 68, 48–57.
27. Zhang, S.; Wu, Q.; Xu, S. Fundamental Green Tradeoffs: Progresses, Challenges, and Impacts on 5G Networks. IEEE Commun. Surv. Tutor. 2017, 19, 33–56.
28. Wu, Q.; Li, G.Y.; Chen, W. An Overview of Sustainable Green 5G Networks. IEEE Wirel. Commun. 2017, 24, 72–80.
29. Sara, C.A.; Marcello, C.; Luigi, P. On the Probabilistic Deployment of Smart Grid Networks in TV White Space. Sensors 2016, 16, 671.

Figure 1. Wireless powered IoT system.

Figure 2. Time-switching protocol.

Figure 3. Stackelberg game.

Figure 4. Implementation algorithm.

Figure 5. Variations of energy state with different transfer power. (a) Power transfer = 50 W; (b) power transfer = 100 W; (c) power transfer = 150 W; and (d) power transfer = 200 W.

Figure 6. The power level for information transmission. (a) Power transfer = 50 W; (b) power transfer = 100 W; (c) power transfer = 150 W; and (d) power transfer = 200 W.

Figure 7. Variations of the mean field term. (a) Power transfer = 50 W; (b) power transfer = 100 W; (c) power transfer = 150 W; and (d) power transfer = 200 W.

Figure 8. Variations of the HAP’s energy.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

