Article

Utility Maximization for the IRS-Assisted Wireless Powered Communication Network with Multiple Service Providers

School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
*
Authors to whom correspondence should be addressed.
Electronics 2024, 13(17), 3352; https://doi.org/10.3390/electronics13173352
Submission received: 18 June 2024 / Revised: 17 August 2024 / Accepted: 18 August 2024 / Published: 23 August 2024

Abstract

We consider a multi-device wireless-powered communication network, where an intelligent reflecting surface (IRS) is deployed to assist the wireless energy transfer (WET) from the power station (PS) to the information transmitter (IT) and the wireless information transfer (WIT) from the IT to the devices. The IRS, IT, and PS belong to different service providers, where the PS receives revenue from the IT for WET and pays fees to its energy source and the IRS, and the IT receives revenue from the devices for WIT and pays fees to the PS and the IRS. We model the interactions between the IT and the PS through a Stackelberg game and aim to achieve a win-win situation between them in terms of utility. Specifically, we solve a follower problem that maximizes the utility of the PS by jointly optimizing the IRS reflection coefficients and the transmit power of the PS, and a leader problem that maximizes the utility of the IT by jointly optimizing the energy price, the transmit power allocation of the IT, and the time allocation as well as the IRS reflection coefficients in the WET and WIT phases. The results show that, by deploying IRS, it is possible to effectively enhance the signal strength and thus increase the revenue of the system, and by using Time Division Multiple Access (TDMA), the performance of the system is greatly improved by coordinating the timing of the reception of the signal by each device.

1. Introduction

With the advent of large-scale deployment of battery-powered devices, it may be critical to extend device lifetime in energy-constrained Internet of Things (IoT) networks. Wireless energy transfer (WET) technology has been widely explored for its ability to provide power to energy-constrained devices in a wireless and flexible manner [1]. Therefore, a wireless-powered communication network (WPCN) can be established using WET technology. In a WPCN, the information transmitter (IT) usually adopts a harvest-then-send transmission mode; i.e., it first harvests energy during the WET phase and then uses the harvested energy for information transmission. However, the communication performance of a device in a WPCN can be degraded by the double path loss effect caused by channel path losses in both the WET and wireless information transmission (WIT) phases. Therefore, it is essential for a WPCN to eliminate the double path loss effect and ensure the communication performance of the devices. An intelligent reflecting surface (IRS) consists of a large number of passive reflecting units, each of which can independently adjust its reflection amplitude and phase for the incident signal. Based on this advantage, the wireless communication channel can be intelligently reconfigured by an IRS. In [2], by optimizing the IRS reflection phase shifts as well as the precoding vectors, the secure communication rate of the system with an IRS was significantly increased compared to that of the system without one. Reference [3] focuses on the phase estimation of individual channels in IRS-assisted communication systems with single-antenna transceivers under the correlated Rayleigh fading. In [4], an IRS assisted the simultaneous wireless information and power transfer system by optimizing the transmit precoders at the access point and the phase shifts at the IRS, and the transmit power required at the AP was significantly reduced. 
Thus, use of the IRS is expected to provide an effective solution for eliminating the double-path loss effect in a WPCN.
Existing works have considered the use of an IRS to assist a WPCN [5,6]. In [5], the results indicated that dynamic IRS reflection coefficients have practical significance in improving the performance of WPCN systems. In [6], the authors considered an IRS-empowered WPCN in an IoT network. In both cases, the average sum rate of a system with an IRS is greater than that of a system without one. Therefore, the use of an IRS can create a favorable propagation environment, effectively reducing the double-path loss and improving the performance of a WPCN. However, most of the current research on IRS-assisted WPCNs considers only a single service provider, and the resulting optimization problems can be solved using centralized optimization algorithms.
In a WPCN, there may be multiple service providers, such as an energy service provider and an information service provider. Unlike single-service-provider systems, the service providers interact with one another, and each pursues its own utility maximization, so it is desirable to optimize the parameters of the different service providers to achieve a win-win situation for all of them. Game theory can effectively achieve this goal through sequential strategic optimization interactions among different service providers. In [7], a multi-leader single-follower (MLSF) Stackelberg game was formulated to study the resource allocation problem in a WPCN with cooperative jamming. The cooperative and non-cooperative relationships between the wireless nodes and the hybrid access points were studied in [8]. Reference [9] focused on the energy game model between the base station and the D2D transmitter, and the results showed that the use of game theory can effectively improve the system utility. The results of these works demonstrated that game theory can provide a framework for sequential strategic optimization interaction among service providers with different optimization goals in a WPCN. However, these existing works did not consider the use of an IRS to assist different service providers.
A few studies have attempted to apply game theory to IRS-aided communication systems. Reference [10] studied an evolutionary game model where users interact with dynamic network service providers to adjust their utility. In [11], monetary incentives were considered to study the energy interaction between an IT and a power station (PS) belonging to different operators in an IRS-assisted WPCN. Deploying an IRS can effectively improve system performance, but in practice, the use of an IRS is not free of charge. If there is no revenue, the service provider of the IRS should stop providing the IRS service [12]. Therefore, the IRS should be considered as an independent service provider. To the best of our knowledge, there are few studies on IRS-assisted WPCNs considering the IRS as an independent service provider, which motivated the present study.
In this paper, we consider an IRS-assisted multi-device WPCN. The network first performs WET and then performs WIT. In the WET phase, the IT harvests energy from the PS, and in the WIT phase, the IT transmits information to the devices in the manner of time division multiple access (TDMA). The IRS is used for signal quality improvement during the WET and WIT phases. The PS, the IT, and the IRS are different service providers, and the use of the IRS is not free. The IT purchases energy from PS, so it can decide how much energy to purchase and can act as the leader of the system. The PS, as a follower, provides energy to the IT to earn profits. The IT pays the cost of obtaining energy and obtains profits through transmitting information. The network optimizes the transmit power of the PS, the energy price, the transmit power allocation of the IT, and the time allocation as well as the IRS reflection coefficients in the WET and WIT phases to maximize the utilities of the PS and the IT, respectively. We use the Stackelberg game to model the interactions of different service providers and propose efficient algorithms to achieve a win-win situation for multiple service providers. The main contributions of this paper are outlined as follows.
To the best of our knowledge, this is the first paper to consider the IRS as an independent service provider. In contrast to [13], we deploy an IRS to improve the signal strength of the wireless communication system and to avoid possible obstruction of the signal. The IRS in [11] serves the device for free, which is unrealistic: without revenue, the IRS would refuse to provide the service. This paper therefore treats the IRS as an independent service provider that charges for its service, and models the cost of using the IRS as proportional to the number of activated IRS reflection units. Leaving the signal that the IT transmits to each user without a minimum requirement would be unreasonable, so we constrain the signal that the IT transmits to each user through TDMA. We also limit the maximum transmit power of the PS. In addition, we assume that the IT and the PS are different service providers. We use the Stackelberg game to provide a hierarchical framework for the complex optimization interactions among multiple service providers, and to formulate the utility maximization problems for the IT and the PS as leader and follower problems, respectively.
To solve the non-convex follower problem, we propose an efficient alternating optimization-based algorithm that decomposes the considered problem into a transmit power optimization sub-problem and an IRS reflection coefficient optimization sub-problem and solves them alternately and iteratively. To solve the IRS reflection coefficient optimization sub-problem, we use a transformation that converts an intractable equality constraint into an inequality constraint, and use a penalty-based technique to ensure the equivalence of such a transformation. We also apply the successive convex approximation (SCA) technique to find a locally optimal solution to this sub-problem.
We also propose a similar alternating optimization-based algorithm for the leader problem, which decomposes it into an IRS reflection coefficient optimization sub-problem, a WET time portion optimization sub-problem, a WIT time portion optimization sub-problem, a transmit power allocation sub-problem, and an energy price optimization sub-problem. The difficulty in solving this problem lies in solving the IRS reflection coefficient optimization sub-problem. We apply the closed-form fractional programming (FP) approach, the penalty-based equivalent transformation, and the SCA technique to obtain a locally optimal solution to this sub-problem. In addition, we obtain the optimal (and closed-form) solution to the other subproblems.
We show that the Stackelberg equilibrium between the PS and IT can be achieved through theoretical analysis and simulations. Simulation results also show that our proposed algorithm is able to maximize the utilities of the PS and IT, and that the use of the IRS significantly improves the utilities of both the PS and IT.

2. System Model

As shown in Figure 1, we consider an IRS-assisted WPCN that includes a PS, an IT, $K$ devices, and an IRS with $N$ reflection units. The PS, IT, and devices are all equipped with a single antenna. In order to fully investigate the role of IRS in wireless communication systems, we consider the IRS as an independent service provider, and the results provided by the computations can be directly used by other service providers. The entire transmission block of the considered network can be divided into a WET phase and a WIT phase. In the WET phase, the PS transmits energy to power the IT; in the WIT phase, the IT transmits information to the devices. The IRS enhances the signal quality of the WET and WIT phases through intelligent signal reflection. These two phases operate in a TDMA manner. Specifically, the entire transmission block is divided into $K+1$ time slots, with the WET and WIT phases occupying the first and remaining $K$ time slots, respectively [14]. $T$ denotes the total duration of the transmission block, and the durations of the WET and WIT phases can be written as $t_0 T$ and $\sum_{k=1}^{K} t_k T$, respectively, where $t_0$ and $t_k$ denote the time portions of the energy transfer from the PS to the IT and the information transfer from the IT to device $k$, respectively, and they should be non-negative, i.e.,
$$t_k \ge 0, \quad \forall k. \tag{1}$$
For simplicity, we assume a normalized transmission block duration, i.e., $T = 1$ s. Thus, $t_k$ and $t_0$ should also satisfy the total transmission block duration constraint:
$$\sum_{k=1}^{K} t_k + t_0 \le 1. \tag{2}$$
The PS, IT, and IRS are different service providers. In particular, the IT is an information service provider, the PS is an energy service provider, and the IRS is a signal enhancement service provider. The PS transfers energy to the IT and receives payment for the energy from the IT. The IT transmits information to the devices and receives payment for the information from the devices [15]. The IRS receives payment from the PS and IT in the WET and WIT phases, respectively, for its signal reflection, which improves the channel quality of both phases. The energy and information service providers try to allocate their resources to achieve their own utility maximization, and using conventional independent optimization approaches that ignore the interactions of their resource allocation strategies may cause one service provider’s utility maximizing resource allocation strategy to reduce the utility of the others. Therefore, we use the Stackelberg game to provide a hierarchical framework for the sequential strategy interactions between the information service provider (the IT) and the energy service provider (the PS). We consider a buyer-dominated energy market, where the IT is the leader who can determine the price of energy transfer and the PS is the follower [16].
The channel coefficients from the PS to the IRS, from the IRS to the IT, from the PS to the IT, from the IT to the IRS, from the IRS to the $k$-th device, and from the IT to the $k$-th device are denoted by $\mathbf{h}_r \in \mathbb{C}^{N \times 1}$, $\mathbf{h}_{it}^{H} \in \mathbb{C}^{1 \times N}$, $h_{pt}^{H} \in \mathbb{C}^{1 \times 1}$, $\mathbf{g}_r \in \mathbb{C}^{N \times 1}$, $\mathbf{g}_{id,k}^{H} \in \mathbb{C}^{1 \times N}$, and $h_k^{H} \in \mathbb{C}^{1 \times 1}$, respectively. We use the quasi-static block flat-fading channel model for all channels, and thus assume that all channel coefficients are constant within each transmission block and may vary across blocks. To obtain an upper bound on the performance of the considered network, we assume that the channel state information (CSI) of all channels is perfectly known [17,18,19].
In the WET phase, the PS transfers energy with transmit power $P_p$ subject to the minimum and maximum limits:
$$0 \le P_p \le P_{\max}, \tag{3}$$
where $P_{\max}$ is the maximum transmit power of the PS. The PS needs to pay the energy cost to its energy source according to the value of $P_p$, and the cost is $t_0\left(a P_p^{2} + b P_p\right)$, where $a > 0$ and $b > 0$ are predetermined cost parameters [13].
The IRS reflects the energy signal with a reflect beamforming matrix $\boldsymbol{\Phi} \triangleq \mathrm{diag}(\boldsymbol{\varphi})$, where $\boldsymbol{\varphi} \triangleq [w_{E,1}, \ldots, w_{E,N}]^{T} \in \mathbb{C}^{N \times 1}$, and the operator $\mathrm{diag}(\mathbf{a})$ forms a diagonal matrix with its diagonal elements equal to the vector $\mathbf{a}$. For the $n$-th reflection unit of the IRS, if it is not activated, then the $n$-th diagonal element of $\boldsymbol{\Phi}$ is equal to zero, i.e., $w_{E,n} = 0$, and if it is activated, then $w_{E,n} = e^{j\theta_{E,n}}$, where $\theta_{E,n} \in [0, 2\pi)$ is the reflection phase shift of the $n$-th reflection unit of the IRS during the WET phase. The IRS charges the PS according to the number of activated IRS reflection units during the WET phase. The fee charged by the IRS can be written as $r\|\boldsymbol{\Phi}\|_0$, where $r$ is the price per activated reflection unit of the IRS, and $\|\boldsymbol{\Phi}\|_0$ denotes the $\ell_0$-norm of $\boldsymbol{\Phi}$, representing the number of non-zero diagonal elements in $\boldsymbol{\Phi}$. Let $\boldsymbol{\Phi}(n,n)$ denote the $n$-th diagonal element of $\boldsymbol{\Phi}$. If the $n$-th reflection unit of the IRS is activated, then $|\boldsymbol{\Phi}(n,n)| = 1$, and if it is not activated, then $|\boldsymbol{\Phi}(n,n)| = 0$. This condition is equivalent to the following equality constraint:
$$\left|\boldsymbol{\Phi}(n,n)\right|\left(1 - \left|\boldsymbol{\Phi}(n,n)\right|\right) = 0, \quad n = 1, 2, \ldots, N. \tag{4}$$
With (4), $\|\boldsymbol{\Phi}\|_0$ is equal to the number of activated IRS reflection units.
Thus, the energy harvested by the IT is
$$E = t_0\, \eta \min\left(P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2},\ P_{\mathrm{sat}}\right), \tag{5}$$
where $\eta \in (0, 1]$ is the energy harvesting efficiency, $P_{\mathrm{sat}}$ is the saturation power threshold of the IT's energy harvester [20], and the noise power at the energy harvester has been ignored [21]. The IT pays the PS according to the energy it receives. Let $\lambda$ be the price of the power received by the IT per Watt. The revenue received by the PS from the IT is $\lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2}$.
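The harvested-energy model can be illustrated with a minimal Python sketch. The function names `effective_gain` and `harvested_energy` and all numerical values are hypothetical, chosen only for illustration; channels are passed as plain lists of complex numbers, with `h_it_H` holding the entries of the row vector from the IRS to the IT.

```python
from typing import Sequence


def effective_gain(h_it_H: Sequence[complex], phi: Sequence[complex],
                   h_r: Sequence[complex], h_pt_H: complex) -> float:
    """Return |h_it^H diag(phi) h_r + h_pt^H|^2 for a single-antenna PS and IT."""
    # diag(phi) is diagonal, so the cascaded link is a sum of per-unit products.
    reflected = sum(a * w * b for a, w, b in zip(h_it_H, phi, h_r))
    return abs(reflected + h_pt_H) ** 2


def harvested_energy(t0: float, eta: float, p_p: float,
                     gain: float, p_sat: float) -> float:
    """Energy harvested in the WET phase, with the received power capped at p_sat."""
    return t0 * eta * min(p_p * gain, p_sat)


# Illustrative values only: two reflection units, the second one switched off.
gain = effective_gain([1, 1], [1, 0], [0.5, 0.5], 0.5)
print(gain, harvested_energy(0.5, 0.8, 2.0, gain, 1.5))
```

An inactive unit simply contributes a zero entry in `phi`, so the same function covers any activation pattern.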
Let $U_p(P_p, \boldsymbol{\Phi})$ denote the utility of the PS, which is defined as the difference between the revenue of the PS and the payments of the PS for the IRS's signal enhancement and the energy cost. Therefore, the utility of the PS can be expressed as
$$U_p(P_p, \boldsymbol{\Phi}) = \lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - r\|\boldsymbol{\Phi}\|_0 - t_0\left(a P_p^{2} + b P_p\right). \tag{6}$$
In the WIT phase, the IT transmits information with the harvested energy to the devices to obtain revenue. Let $p_k$ denote the transmit power allocated for the information transmission of the $k$-th device. The total transmit energy of the IT should not be greater than its harvested energy, i.e.,
$$\sum_{k=1}^{K} t_k p_k \le t_0\, \eta \min\left(P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2},\ P_{\mathrm{sat}}\right), \tag{7}$$
and
$$p_k \ge 0, \quad \forall k. \tag{8}$$
The IRS reflects the signal for the $k$-th device with a reflect beamforming matrix $\boldsymbol{\Psi}_k \triangleq \mathrm{diag}(\boldsymbol{\psi}_k)$, where $\boldsymbol{\psi}_k \triangleq [w_{k,1}, \ldots, w_{k,N}]^{T} \in \mathbb{C}^{N \times 1}$. For the $n$-th reflection unit of the IRS, if it is not activated for the $k$-th device, then the $n$-th diagonal element of $\boldsymbol{\Psi}_k$ is equal to zero, i.e., $w_{k,n} = 0$, and if it is activated, then $w_{k,n} = e^{j\theta_{k,n}}$, where $\theta_{k,n} \in [0, 2\pi)$ is the reflection phase shift of the $n$-th reflection unit of the IRS for the $k$-th device in the WIT phase. The IRS charges the IT for the number of activated IRS reflection units in the WIT phase. The fee charged by the IRS at time slot $k$ is $r\|\boldsymbol{\Psi}_k\|_0$. Similar to $\boldsymbol{\Phi}$, we introduce the following constraints to ensure that $\|\boldsymbol{\Psi}_k\|_0$ is equal to the number of activated IRS reflection units in time slot $k$:
$$\left|\boldsymbol{\Psi}_k(n,n)\right|\left(1 - \left|\boldsymbol{\Psi}_k(n,n)\right|\right) = 0, \quad n = 1, 2, \ldots, N, \ k = 1, 2, \ldots, K. \tag{9}$$
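Thus, the achievable rate of the $k$-th device in bits/Hertz (bits/Hz) in time slot $k$ can be written as
$$R_k = t_k \log_2\left(1 + \frac{p_k \left|\mathbf{g}_{id,k}^{H}\boldsymbol{\Psi}_k\mathbf{g}_r + h_k^{H}\right|^{2}}{\sigma^{2}}\right), \tag{10}$$
where $\sigma^{2}$ is the noise power at each device's receiver. Let $R_{\min}$ be the minimum rate requirement of each device. The rates of the devices should satisfy the following minimum rate constraint:
$$t_k \log_2\left(1 + \frac{p_k \left|\mathbf{g}_{id,k}^{H}\boldsymbol{\Psi}_k\mathbf{g}_r + h_k^{H}\right|^{2}}{\sigma^{2}}\right) \ge R_{\min}, \quad \forall k. \tag{11}$$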
The sum rate of all devices is
$$R_{\mathrm{sum}} = \sum_{k=1}^{K} R_k. \tag{12}$$
Let $\beta$ be the revenue per bit/Hz of achievable rate. The revenue of the IT during the WIT phase is $\beta R_{\mathrm{sum}}$ [22].
Let $U_t(\lambda, t_k, p_k, \boldsymbol{\Psi}_k, t_0)$ denote the utility of the IT, which is defined as the difference between the revenue earned from providing the information transmission service to the devices and the payments made to the IRS for signal enhancement and to the PS for power transfer. Therefore, the utility of the IT can be expressed as
$$U_t(\lambda, t_k, p_k, \boldsymbol{\Psi}_k, t_0) = \beta \sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k \left|\mathbf{g}_{id,k}^{H}\boldsymbol{\Psi}_k\mathbf{g}_r + h_k^{H}\right|^{2}}{\sigma^{2}}\right) - \lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - r\sum_{k=1}^{K}\|\boldsymbol{\Psi}_k\|_0. \tag{13}$$
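As a rough illustration of how the rate and the IT utility fit together, the following Python sketch evaluates them for given channel gains. The names `rate` and `it_utility`, the argument layout, and the numbers in the example are assumptions made for this sketch; the effective gains are assumed to be precomputed squared magnitudes.

```python
import math
from typing import Sequence


def rate(t_k: float, p_k: float, gain_k: float, sigma2: float) -> float:
    """Achievable rate of one device in its TDMA slot (bits/Hz)."""
    return t_k * math.log2(1.0 + p_k * gain_k / sigma2)


def it_utility(beta: float, lam: float, r: float, t0: float, p_p: float,
               wet_gain: float, t: Sequence[float], p: Sequence[float],
               wit_gains: Sequence[float], active_counts: Sequence[int],
               sigma2: float) -> float:
    """Rate revenue minus the energy payment to the PS and the IRS fee."""
    revenue = beta * sum(rate(tk, pk, gk, sigma2)
                         for tk, pk, gk in zip(t, p, wit_gains))
    energy_payment = lam * t0 * p_p * wet_gain
    irs_fee = r * sum(active_counts)  # one fee term per WIT time slot
    return revenue - energy_payment - irs_fee


# Illustrative single-device example.
print(it_utility(beta=2.0, lam=1.0, r=0.1, t0=0.5, p_p=2.0, wet_gain=1.0,
                 t=[0.5], p=[3.0], wit_gains=[1.0], active_counts=[2],
                 sigma2=1.0))
```

Passing the activation counts separately keeps the sketch independent of how the reflection vectors are stored.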
The rest of this paper is organized as follows. Section 2 describes the system model and the relevant parameters. Section 3 introduces the leader problem and the follower problem. Section 4 introduces the Stackelberg Game-Based Optimization Algorithm. Section 5 presents the experimental results of the proposed system model. Section 6 provides the conclusions of our work.

3. Problem Formulation

To achieve a win-win situation for multiple service providers, the transmit power of the PS and the IRS reflection coefficients for the WET phase are jointly optimized to maximize the utility of the PS. The energy price, the time portions of all time slots, the transmit powers of the IT for all devices, and the IRS reflection coefficients for the WIT phase are jointly optimized to maximize the utility of the IT. The optimization is performed in the Stackelberg game framework, where a leader problem and a follower problem are to be solved, respectively.
The leader problem aims to maximize the utility of the IT by jointly optimizing the IRS reflection coefficients for all devices in the WIT phase $\{\boldsymbol{\Psi}_k\}_1^K$, the energy price $\lambda$, the time portions of the WET and WIT phases $\{t_k\}_0^K$, and the IT's transmit power allocated to the devices $\{p_k\}_1^K$, subject to each device's minimum rate constraint (11), the IRS's reflection coefficient constraint (9), the non-negative energy price constraint, the total transmission duration constraint (2), the IT's maximum transmit energy constraint (7), the non-negative energy allocation constraint (8), and the non-negative time portion constraint (1). Thus, the leader problem can be formulated as
$$\max_{\lambda, \{\boldsymbol{\Psi}_k\}_1^K, \{t_k\}_0^K, \{p_k\}_1^K} \ \beta \sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k \left|\mathbf{g}_{id,k}^{H}\boldsymbol{\Psi}_k\mathbf{g}_r + h_k^{H}\right|^{2}}{\sigma^{2}}\right) - \lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - r\sum_{k=1}^{K}\|\boldsymbol{\Psi}_k\|_0 \tag{14a}$$
$$\text{s.t.} \quad (1), (2), (7), (8), (9), (11), \quad \lambda \ge 0. \tag{14b}$$
The follower problem aims to maximize the utility of the PS by jointly optimizing the PS's transmit power $P_p$ and the IRS reflection coefficients for the WET phase $\boldsymbol{\Phi}$, subject to the constraints of the maximum transmit power of the PS in (3) and the IRS's reflection coefficients in (4). Thus, the follower problem can be formulated as
$$\max_{P_p, \boldsymbol{\Phi}} \ \lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - t_0\left(a P_p^{2} + b P_p\right) - r\|\boldsymbol{\Phi}\|_0 \tag{15a}$$
$$\text{s.t.} \quad (3), (4). \tag{15b}$$
The leader problem and the follower problem together form a Stackelberg game problem. We consider using the backward induction method to solve this problem [21]. Specifically, we first solve the follower problem (15). Then we substitute the obtained solution $\boldsymbol{\Phi}$ and $P_p$ into the leader problem (14) and solve it.

4. Stackelberg Game-Based Optimization Algorithm

In this section, we first solve the follower problem (15) and the leader problem (14), respectively. We then prove that the Stackelberg equilibrium can be achieved, resulting in a win-win situation for the multiple service providers in the considered network.

4.1. Proposed Algorithm for the Follower Problem (15)

In Problem (15), the optimization variables $P_p$ and $\boldsymbol{\Phi}$ are coupled in the objective function but not in the constraints. This motivates us to solve Problem (15) by decomposing it into two sub-problems and solving them alternately. Specifically, Sub-Problem 1 optimizes $\boldsymbol{\Phi}$ with fixed $P_p$, and Sub-Problem 2 optimizes $P_p$ with fixed $\boldsymbol{\Phi}$. The procedures for solving the sub-problems and the overall algorithm are presented below.

4.1.1. Solution to Sub-Problem 1

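Sub-Problem 1 optimizes $\boldsymbol{\Phi}$ with fixed $P_p$. Since $\|\boldsymbol{\Phi}\|_0$ in (15a) is non-convex, $\left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2}$ in (15a) is difficult to handle, and the left-hand side (LHS) of (4) is non-linear, Sub-Problem 1 is a non-convex optimization problem and is difficult to solve. We tackle these difficulties as follows. First, the modulus of each diagonal element of $\boldsymbol{\Phi}$ is either 0 or 1, so $\|\boldsymbol{\Phi}\|_0 = \|\boldsymbol{\Phi}\|_F^{2} = \|\boldsymbol{\varphi}\|_2^{2}$, where $\|\boldsymbol{\Phi}\|_F$ denotes the Frobenius norm of $\boldsymbol{\Phi}$, and $\|\boldsymbol{\varphi}\|_2$ denotes the $\ell_2$-norm of $\boldsymbol{\varphi}$. Second, we define $\mathbf{m} \triangleq \mathrm{diag}(\mathbf{h}_{it}^{H})\mathbf{h}_r \in \mathbb{C}^{N \times 1}$, and $\left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2}$ in (15a) can be transformed to $\left|\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right|^{2}$. Third, $|\boldsymbol{\Phi}(n,n)|\left(1 - |\boldsymbol{\Phi}(n,n)|\right)$ in (4) can be written as $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right)$, where $\mathbf{e}_i$ represents a vector whose $i$-th element is 1 and whose other elements are 0. With the above transformations, we further introduce auxiliary variables $\boldsymbol{\omega} \triangleq [\omega_1, \ldots, \omega_N]^{T}$ and a penalty factor $\mu$ to resolve the non-linear issue of (4), and then we can write Sub-Problem 1 in a more tractable form:
$$\max_{\boldsymbol{\varphi}, \{\omega_i\}_1^N} \ \lambda t_0 P_p \left|\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right|^{2} - r\|\boldsymbol{\varphi}\|_2^{2} - \mu\sum_{i=1}^{N}\omega_i \tag{16a}$$
$$\text{s.t.} \quad 0 \le \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right) \le \omega_i, \quad i = 1, 2, \ldots, N. \tag{16b}$$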
Note that the penalty factor $\mu$ can guarantee that the solution to Problem (16) satisfies constraint (4). This is because, when $\mu \to \infty$, the optimal $\omega_i \to 0$, and in this case, (16b) is equivalent to (4). Therefore, we can solve Problem (16) to obtain the solution to Sub-Problem 1.
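The role of the binary-modulus condition can be checked numerically. The Python sketch below (function names `modulus_residuals` and `active_units` are hypothetical) evaluates the per-unit quantity $|w_i|^2(1-|w_i|^2)$, which is what $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\,(1-\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi})$ reduces to, and confirms that the squared $\ell_2$-norm counts the activated units when every residual is zero.

```python
import cmath
from typing import List, Sequence


def modulus_residuals(phi: Sequence[complex]) -> List[float]:
    """Per-unit value of |w_i|^2 (1 - |w_i|^2); zero iff |w_i| is 0 or 1."""
    return [abs(w) ** 2 * (1.0 - abs(w) ** 2) for w in phi]


def active_units(phi: Sequence[complex]) -> float:
    """Squared l2-norm of phi; equals the number of active units when all residuals vanish."""
    return sum(abs(w) ** 2 for w in phi)


# Two active units with arbitrary phases and one inactive unit.
phi = [cmath.exp(1j * 0.3), 0j, cmath.exp(1j * 2.0)]
print(modulus_residuals(phi), active_units(phi))
```

In the penalized problem, each residual is the smallest feasible value of the corresponding auxiliary variable, so a large penalty factor drives the solution toward entries with modulus 0 or 1.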
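Since the term $\left|\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right|^{2}$ in (16a) is non-concave with respect to $\boldsymbol{\varphi}$, and (16b) is a non-convex constraint, Problem (16) is still difficult to solve. We use the SCA technique to solve Problem (16) iteratively. We suppose that $\boldsymbol{\varphi}^{(u)}$ is the solution of (16) obtained at the $u$-th iteration. The first-order Taylor expansions of $\left|\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right|^{2}$ and $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right)$ at $\boldsymbol{\varphi}^{(u)}$ can be respectively written as
$$-\left|\boldsymbol{\varphi}^{(u)H}\mathbf{m} + h_{pt}^{H}\right|^{2} + 2\,\mathrm{Re}\left\{\left(\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right)^{H}\left(\boldsymbol{\varphi}^{(u)H}\mathbf{m} + h_{pt}^{H}\right)\right\}, \tag{17}$$
and
$$\boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\left(1 - \boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\right) + \left(2\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)} - 4\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\right)^{T}\left(\boldsymbol{\varphi} - \boldsymbol{\varphi}^{(u)}\right), \tag{18}$$
where $\mathrm{Re}\{\cdot\}$ denotes the real part operator. By replacing $\left|\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right|^{2}$ and $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right)$ with (17) and (18) in Problem (16), respectively, the problem to be solved in the $(u+1)$-th iteration can be constructed as
$$\max_{\boldsymbol{\varphi}, \{\omega_i\}_1^N} \ -\lambda t_0 P_p \left|\boldsymbol{\varphi}^{(u)H}\mathbf{m} + h_{pt}^{H}\right|^{2} + 2\lambda t_0 P_p\,\mathrm{Re}\left\{\left(\boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}\right)^{H}\left(\boldsymbol{\varphi}^{(u)H}\mathbf{m} + h_{pt}^{H}\right)\right\} - r\|\boldsymbol{\varphi}\|_2^{2} - \mu\sum_{i=1}^{N}\omega_i \tag{19a}$$
$$\text{s.t.} \quad 0 \le \boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\left(1 - \boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\right) + \left(2\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)} - 4\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\boldsymbol{\varphi}^{(u)H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}^{(u)}\right)^{T}\left(\boldsymbol{\varphi} - \boldsymbol{\varphi}^{(u)}\right) \le \omega_i, \quad i = 1, 2, \ldots, N. \tag{19b}$$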
Problem (19) is a convex optimization problem and can be solved by a standard solver like CVX.
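The key property that SCA relies on is that the linearized gain term never exceeds the true gain and matches it at the expansion point. The Python sketch below checks this for the standard first-order lower bound of a squared modulus, $-|a_u|^2 + 2\,\mathrm{Re}\{a^{H}a_u\}$ with $a = \boldsymbol{\varphi}^{H}\mathbf{m} + h_{pt}^{H}$. The function names and test vectors are hypothetical, and the sketch covers only this surrogate, not the full convex program that a solver would handle.

```python
from typing import Sequence


def affine_term(phi: Sequence[complex], m: Sequence[complex], h: complex) -> complex:
    """Return phi^H m + h."""
    return sum(w.conjugate() * mi for w, mi in zip(phi, m)) + h


def gain(phi: Sequence[complex], m: Sequence[complex], h: complex) -> float:
    """True (convex, hence non-concave) gain |phi^H m + h|^2."""
    return abs(affine_term(phi, m, h)) ** 2


def sca_lower_bound(phi: Sequence[complex], phi_u: Sequence[complex],
                    m: Sequence[complex], h: complex) -> float:
    """First-order surrogate of the gain around the previous iterate phi_u."""
    a = affine_term(phi, m, h)
    a_u = affine_term(phi_u, m, h)
    return -abs(a_u) ** 2 + 2.0 * (a.conjugate() * a_u).real


m = [1 + 1j, 0.5 - 0.2j]
h = 0.3 + 0.1j
phi_u = [1j, 1 + 0j]
phi = [0.8 + 0j, -0.6j]
# The gap equals |a - a_u|^2, so it is never negative.
print(gain(phi, m, h) - sca_lower_bound(phi, phi_u, m, h))
```

Because the surrogate is affine in `phi`, maximizing it together with the concave penalty terms gives a convex problem at each iteration.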

4.1.2. Solution to Sub-Problem 2

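Sub-Problem 2 optimizes $P_p$ with fixed $\boldsymbol{\Phi}$ in Problem (15) and can be written as
$$\max_{P_p} \ \lambda t_0 P_p \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - t_0\left(a P_p^{2} + b P_p\right) \tag{20a}$$
$$\text{s.t.} \quad (3). \tag{20b}$$
The objective function (20a) is a quadratic concave function of $P_p$, and the constraint (3) is linear, so Problem (20) is a convex optimization problem. By setting the first-order derivative of (20a) to zero, we have the following equation:
$$\lambda t_0 \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2} - 2 a t_0 P_p - b t_0 = 0. \tag{21}$$
By solving (21), we obtain its solution $P_p = \frac{\lambda\gamma_e - b}{2a}$, where $\gamma_e = \left|\mathbf{h}_{it}^{H}\boldsymbol{\Phi}\mathbf{h}_r + h_{pt}^{H}\right|^{2}$. Let $P_p^{*}$ denote the optimal solution to Problem (20), and according to the constraint (3), we can obtain $P_p^{*}$ as
$$P_p^{*} = \begin{cases} \dfrac{\lambda\gamma_e - b}{2a}, & 0 \le \dfrac{\lambda\gamma_e - b}{2a} \le P_{\max}, \\[2mm] 0, & \dfrac{\lambda\gamma_e - b}{2a} < 0, \\[2mm] P_{\max}, & \dfrac{\lambda\gamma_e - b}{2a} > P_{\max}. \end{cases} \tag{22}$$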
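The closed-form power update amounts to clipping the stationary point of the concave quadratic to the feasible interval. A minimal Python sketch follows; the function name `optimal_ps_power` and the example values are illustrative assumptions.

```python
def optimal_ps_power(lam: float, gamma_e: float, a: float, b: float,
                     p_max: float) -> float:
    """Clip the stationary point (lam * gamma_e - b) / (2a) to [0, p_max]."""
    stationary = (lam * gamma_e - b) / (2.0 * a)
    return min(max(stationary, 0.0), p_max)


# Price too low to cover the linear cost: the PS stays silent.
print(optimal_ps_power(lam=0.5, gamma_e=1.0, a=0.5, b=1.0, p_max=3.0))
# Moderate price: interior solution.
print(optimal_ps_power(lam=2.0, gamma_e=1.0, a=0.5, b=1.0, p_max=3.0))
# High price: the power limit becomes active.
print(optimal_ps_power(lam=10.0, gamma_e=1.0, a=0.5, b=1.0, p_max=3.0))
```

The three calls exercise the three branches of the piecewise solution.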

4.1.3. Overall Algorithm for Problem (15)

The overall algorithm for Problem (15) is summarized in Algorithm 1, where $U_p^{t} = g(P_p^{t}, \boldsymbol{\varphi}^{t})$ denotes the objective value of Problem (15) with the variables $P_p^{t}$ and $\boldsymbol{\varphi}^{t}$. The algorithm solves Sub-Problems 1 and 2 in Steps 3–7 and Step 8, respectively. Since the locally optimal solution to Sub-Problem 1 and the optimal solution to Sub-Problem 2 can be obtained at each iteration, the objective function value of Problem (15) is monotonically non-decreasing. Furthermore, since $P_p$ is upper bounded by $P_{\max}$, there must be an upper bound on the objective function value of Problem (15). Therefore, Algorithm 1 is guaranteed to converge (Lemma 2, [23]). The flowchart of Algorithm 1 is shown in Figure 2.
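Algorithm 1 Proposed Algorithm for Problem (15).
1: Initialization: Set initial value for $P_p^{0}$. Set $U_p^{0} = 0$, $t = 0$, $\varepsilon_0 = 1 \times 10^{-3}$, and $\chi_0 = 1 \times 10^{-8}$.
2: repeat
3:    repeat
4:       Set initial value for $\mu$.
5:       Given $P_p^{t}$, solve Problem (16) and denote the obtained solution as $\boldsymbol{\varphi}^{t+1}$.
6:       If $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right) > \chi_0$, set $\mu = 2\mu$.
7:    until $\boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\left(1 - \boldsymbol{\varphi}^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\varphi}\right) \le \chi_0$.
8:    Given $\boldsymbol{\varphi}^{t+1}$, obtain $P_p^{t+1}$ according to (22).
9:    Set $t = t + 1$.
10:   Set $U_p^{t} = g(P_p^{t}, \boldsymbol{\varphi}^{t})$.
11: until $\left|U_p^{t} - U_p^{t-1}\right| / U_p^{t} \le \varepsilon_0$.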
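The alternating structure of Algorithm 1 can be sketched in a few lines of Python. This is not the penalty-based SCA procedure: as a deliberately simplified stand-in for Sub-Problem 1, the sketch searches a small, fixed set of candidate configurations (the $c$ strongest units switched on with phases aligned to the direct link, $c = 0, \ldots, N$). All function names and numerical values are hypothetical, and the stopping rule is written in multiplied-out form to avoid dividing by a zero utility.

```python
import cmath
from typing import List, Sequence, Tuple


def gain(phi: Sequence[complex], m: Sequence[complex], h: complex) -> float:
    """|phi^H m + h|^2, the effective PS-to-IT channel gain."""
    return abs(sum(w.conjugate() * mi for w, mi in zip(phi, m)) + h) ** 2


def ps_utility(lam: float, t0: float, p_p: float, phi: Sequence[complex],
               m: Sequence[complex], h: complex, r: float, a: float, b: float) -> float:
    """Energy revenue minus IRS fee minus energy cost of the PS."""
    n_active = sum(1 for w in phi if w != 0)
    return lam * t0 * p_p * gain(phi, m, h) - r * n_active - t0 * (a * p_p ** 2 + b * p_p)


def candidates(m: Sequence[complex], h: complex) -> List[List[complex]]:
    """Heuristic stand-in for Sub-Problem 1: phase-aligned top-c configurations."""
    order = sorted(range(len(m)), key=lambda n: -abs(m[n]))
    configs = []
    for c in range(len(m) + 1):
        phi = [0j] * len(m)
        for n in order[:c]:
            # Makes conj(w_n) * m_n share the phase of the direct link h.
            phi[n] = cmath.exp(1j * (cmath.phase(m[n]) - cmath.phase(h)))
        configs.append(phi)
    return configs


def alternate(m: Sequence[complex], h: complex, lam: float, t0: float, r: float,
              a: float, b: float, p_max: float, p0: float,
              eps: float = 1e-3, max_iter: int = 50) -> Tuple[float, List[complex], List[float]]:
    """Alternate between the reflection step and the closed-form power step."""
    p_p, phi, history = p0, [0j] * len(m), []
    configs = candidates(m, h)
    for _ in range(max_iter):
        # Reflection step: best candidate for the current transmit power.
        phi = max(configs, key=lambda c: ps_utility(lam, t0, p_p, c, m, h, r, a, b))
        # Power step: clipped stationary point of the concave quadratic.
        p_p = min(max((lam * gain(phi, m, h) - b) / (2.0 * a), 0.0), p_max)
        history.append(ps_utility(lam, t0, p_p, phi, m, h, r, a, b))
        if len(history) > 1 and abs(history[-1] - history[-2]) <= eps * abs(history[-1]):
            break
    return p_p, phi, history


p_p, phi, hist = alternate([0.6 + 0.2j, 0.1 - 0.3j, 0.05j], 0.2 + 0.1j,
                           lam=5.0, t0=0.5, r=0.05, a=0.5, b=0.5, p_max=3.0, p0=1.0)
print(p_p, hist)
```

Because the candidate set does not change between iterations and the power step is exact, the recorded utilities are non-decreasing, which mirrors the convergence argument given for Algorithm 1.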

4.2. Proposed Algorithm for the Leader Problem (14)

We substitute the obtained solution to the follower problem (15), i.e., $P_p^{*}$ and $\boldsymbol{\varphi}$, into the leader problem (14), and solve it as follows. In Problem (14), the optimization variables $\lambda$, $\{t_k\}_0^K$, $\{p_k\}_1^K$, and $\{\boldsymbol{\Psi}_k\}_1^K$ are intricately coupled in the objective function, and the objective function is not jointly concave with respect to them. Furthermore, the LHS of the equality constraint (9) is non-linear, so Problem (14) is non-convex and difficult to solve optimally. To solve this problem efficiently, we use the alternating optimization algorithm to decouple the optimization variables and decompose Problem (14) into five sub-problems that are solved alternately. Specifically, Sub-Problem 1 optimizes $\{\boldsymbol{\Psi}_k\}_1^K$ with fixed $\lambda$, $\{t_k\}_0^K$, and $\{p_k\}_1^K$; Sub-Problem 2 optimizes $t_0$ with fixed $\lambda$, $\{\boldsymbol{\Psi}_k\}_1^K$, $\{t_k\}_1^K$, and $\{p_k\}_1^K$; Sub-Problem 3 optimizes $\{t_k\}_1^K$ with fixed $\lambda$, $t_0$, $\{\boldsymbol{\Psi}_k\}_1^K$, and $\{p_k\}_1^K$; Sub-Problem 4 optimizes $\{p_k\}_1^K$ with fixed $\lambda$, $\{t_k\}_0^K$, and $\{\boldsymbol{\Psi}_k\}_1^K$; Sub-Problem 5 optimizes $\lambda$ with fixed $\{\boldsymbol{\Psi}_k\}_1^K$, $\{t_k\}_0^K$, and $\{p_k\}_1^K$. The procedures for solving all sub-problems and the overall algorithm are presented below.

4.2.1. Solution to Sub-Problem 1

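Sub-Problem 1 optimizes $\{\boldsymbol{\Psi}_k\}_1^K$ in Problem (14) with fixed $\lambda$, $\{t_k\}_0^K$, and $\{p_k\}_1^K$. Similar to (15), by setting $\mathbf{g}_k \triangleq \mathrm{diag}(\mathbf{g}_{id,k}^{H})\mathbf{g}_r$ and replacing $|\boldsymbol{\Psi}_k(n,n)|\left(1 - |\boldsymbol{\Psi}_k(n,n)|\right)$ with $\boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\left(1 - \boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\right)$ in (9), we can write Sub-Problem 1 as
$$\max_{\{\boldsymbol{\psi}_k\}_1^K} \ \beta\sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2}}{\sigma^{2}}\right) - r\sum_{k=1}^{K}\|\boldsymbol{\psi}_k\|_2^{2} \quad \text{s.t.} \ (11), \tag{23a}$$
$$\boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\left(1 - \boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\right) = 0, \quad i = 1, 2, \ldots, N, \ k = 1, 2, \ldots, K. \tag{23b}$$
The logarithmic functions in (23a) are non-concave with respect to $\boldsymbol{\psi}_k$, so Problem (23) is still difficult to solve. We apply the closed-form FP approach [24] to solve it efficiently, which involves the following two steps.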
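Step 1: Lagrangian Dual Transform. By introducing auxiliary variables $\mathbf{s} \triangleq [s_1, \ldots, s_K]^{T}$ and after some manipulations, Problem (23) can be equivalently transformed into the following form [24]:
$$\max_{\{\boldsymbol{\psi}_k\}_1^K, \{s_k\}_1^K} \ \beta\sum_{k=1}^{K} t_k \log_2(1 + s_k) - \beta\sum_{k=1}^{K} t_k s_k + \beta\sum_{k=1}^{K} \frac{t_k (1 + s_k)\, p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2} / \sigma^{2}}{1 + p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2} / \sigma^{2}} - r\sum_{k=1}^{K}\|\boldsymbol{\psi}_k\|_2^{2} \tag{24a}$$
$$\text{s.t.} \quad (11), (23\text{b}). \tag{24b}$$
Since $\{\boldsymbol{\psi}_k\}_1^K$ and $\{s_k\}_1^K$ are coupled in Problem (24), we solve it by iteratively updating $\{\boldsymbol{\psi}_k\}_1^K$ and $\{s_k\}_1^K$. When $\{\boldsymbol{\psi}_k\}_1^K$ is fixed, Problem (24) is a convex optimization problem. Therefore, by setting the first-order derivative of (24a) to zero, the optimal solution of $\{s_k\}_1^K$ can be obtained:
$$s_k^{*} = \frac{p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2}}{\sigma^{2}}, \quad \forall k. \tag{25}$$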
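When $\{s_k\}_1^K$ are fixed, the first two sums in (24a) are constants. By defining $\hat{s}_k \triangleq \beta t_k (1 + s_k^{*})\, p_k / \sigma^{2}$, where the factor $\beta$ carries over from the third sum in (24a), Problem (24) can be reduced to
$$\max_{\{\boldsymbol{\psi}_k\}_1^K} \ \sum_{k=1}^{K} \frac{\hat{s}_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2}}{1 + p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2} / \sigma^{2}} - r\sum_{k=1}^{K}\|\boldsymbol{\psi}_k\|_2^{2} \quad \text{s.t.} \ (11), (23\text{b}). \tag{26}$$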
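Step 2: Quadratic Transform. Problem (26) is a multiple-ratio FP problem. To solve this problem, we introduce auxiliary variables $\boldsymbol{\rho} \triangleq [\rho_1, \ldots, \rho_K]^{T}$ and transform it equivalently into the following form [24]:
$$\max_{\{\boldsymbol{\psi}_k\}_1^K, \boldsymbol{\rho}} \ 2\sum_{k=1}^{K} \sqrt{\hat{s}_k}\,\mathrm{Re}\left\{\rho_k^{H}\left(\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right)\right\} - \sum_{k=1}^{K} |\rho_k|^{2}\left(1 + \frac{p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2}}{\sigma^{2}}\right) - r\sum_{k=1}^{K}\|\boldsymbol{\psi}_k\|_2^{2} \quad \text{s.t.} \ (11), (23\text{b}). \tag{27}$$
Since $\{\boldsymbol{\psi}_k\}_1^K$ and $\boldsymbol{\rho}$ are coupled in Problem (27), we solve it by iteratively updating $\{\boldsymbol{\psi}_k\}_1^K$ and $\boldsymbol{\rho}$. When $\{\boldsymbol{\psi}_k\}_1^K$ are fixed, the optimal solution for $\boldsymbol{\rho}$ is
$$\rho_k^{*} = \frac{\sqrt{\hat{s}_k}\left(\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right)}{1 + p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2} / \sigma^{2}}, \quad \forall k. \tag{28}$$
By substituting (28) into Problem (27), we can find that Problems (27) and (26) are equivalent, so we will focus on solving Problem (27) in the following.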
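The equivalence claimed for the quadratic transform can be verified numerically for a single ratio term. The Python sketch below assumes the standard form of the transform with a square-root weight, i.e., the surrogate $2\sqrt{\hat{s}}\,\mathrm{Re}\{\rho^{H}a\} - |\rho|^{2}B$ for the ratio $\hat{s}|a|^{2}/B$, where $a$ stands for $\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}$ and $B$ for the denominator. Function names and values are hypothetical.

```python
import math


def ratio(s_hat: float, a: complex, big_b: float) -> float:
    """Original ratio term s_hat * |a|^2 / B."""
    return s_hat * abs(a) ** 2 / big_b


def optimal_rho(s_hat: float, a: complex, big_b: float) -> complex:
    """Maximizer of the quadratic surrogate for fixed a and B."""
    return math.sqrt(s_hat) * a / big_b


def quadratic_surrogate(rho: complex, s_hat: float, a: complex, big_b: float) -> float:
    """2 sqrt(s_hat) Re{conj(rho) a} - |rho|^2 B."""
    return 2.0 * math.sqrt(s_hat) * (rho.conjugate() * a).real - abs(rho) ** 2 * big_b


s_hat = 1.7
a = 0.4 - 0.9j
big_b = 1.0 + 2.0 * abs(a) ** 2 / 0.5   # 1 + p_k |a|^2 / sigma^2 with p_k = 2, sigma^2 = 0.5
rho_star = optimal_rho(s_hat, a, big_b)
print(ratio(s_hat, a, big_b), quadratic_surrogate(rho_star, s_hat, a, big_b))
```

At the optimal auxiliary variable the surrogate equals the ratio, and for any other value it is smaller by $B\,|\rho - \rho^{*}|^{2}$, which is why alternating between the two updates does not decrease the objective.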
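With given $\rho_k^{*}$, similar to Problem (15), we introduce auxiliary variables $\boldsymbol{\kappa} \triangleq [\kappa_1, \ldots, \kappa_N]^{T}$ and a penalty factor $\nu$ into Problem (27) to resolve the non-linear issue of (23b), and we transform Problem (27) into
$$\max_{\{\boldsymbol{\psi}_k\}_1^K, \{\kappa_i\}_1^N} \ 2\sum_{k=1}^{K} \sqrt{\hat{s}_k}\,\mathrm{Re}\left\{\rho_k^{H}\left(\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right)\right\} - \sum_{k=1}^{K} |\rho_k|^{2}\left(1 + \frac{p_k \left|\boldsymbol{\psi}_k^{H}\mathbf{g}_k + h_k^{H}\right|^{2}}{\sigma^{2}}\right) - r\sum_{k=1}^{K}\|\boldsymbol{\psi}_k\|_2^{2} - \nu\sum_{i=1}^{N}\kappa_i \quad \text{s.t.} \ (11), \tag{29a}$$
$$0 \le \boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\left(1 - \boldsymbol{\psi}_k^{H}\mathbf{e}_i\mathbf{e}_i^{H}\boldsymbol{\psi}_k\right) \le \kappa_i, \quad i = 1, 2, \ldots, N, \ k = 1, 2, \ldots, K. \tag{29b}$$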
Problem (29) is difficult to solve since the constraints (11) and (29b) are non-convex. We use the SCA technique to overcome this difficulty. Similar to (15), let $\psi_k^{(q)}$ denote the solution obtained at the $q$-th iteration. By replacing the terms $|\psi_k^H g_k + h_k^H|^2$ in (11) and $\psi_k^H e_i e_i^H \psi_k (1 - \psi_k^H e_i e_i^H \psi_k)$ in (29b) with their first-order Taylor expansions at $\psi_k^{(q)}$, respectively, the problem to be solved in the $(q+1)$-th iteration can be constructed as
$$\max_{\{\psi_k\}_1^K,\,\{\kappa_i\}_1^N}\ 2\sum_{k=1}^{K} \sqrt{\hat{s}_k}\,\mathrm{Re}\{\rho_k^H (\psi_k^H g_k + h_k^H)\} - \sum_{k=1}^{K} |\rho_k|^2 \left(1 + \frac{p_k |\psi_k^H g_k + h_k^H|^2}{\sigma^2}\right) - r\sum_{k=1}^{K} \|\psi_k\|_2^2 - \nu\sum_{i=1}^{N} \kappa_i \tag{30a}$$
$$\text{s.t.}\ -\left|\psi_k^{(q)H} g_k + h_k^H\right|^2 + 2\,\mathrm{Re}\left\{(\psi_k^H g_k + h_k^H)^H (\psi_k^{(q)H} g_k + h_k^H)\right\} \ge \left(2^{R_{\min}/t_k} - 1\right)\frac{\sigma^2}{p_k},\quad k = 1, 2, \ldots, K, \tag{30b}$$
$$0 \le \psi_k^{(q)H} e_i e_i^H \psi_k^{(q)} \left(1 - \psi_k^{(q)H} e_i e_i^H \psi_k^{(q)}\right) + \left(2 e_i e_i^H \psi_k^{(q)} - 4 e_i e_i^H \psi_k^{(q)} \psi_k^{(q)H} e_i e_i^H \psi_k^{(q)}\right)^T \left(\psi_k - \psi_k^{(q)}\right) \le \kappa_i,\quad i = 1, 2, \ldots, N,\ k = 1, 2, \ldots, K. \tag{30c}$$
Problem (30) is a convex optimization problem, and we can use a standard solver like CVX to solve it efficiently.
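The SCA step for the rate constraint rests on the fact that $|x|^2 \ge 2\,\mathrm{Re}\{x^H x_q\} - |x_q|^2$ for any complex $x$ and expansion point $x_q$, with equality at $x = x_q$. A minimal sketch of this surrogate follows, assuming a scalar effective channel $x = \psi_k^H g_k + h_k^H$; the function names and numbers are hypothetical and only illustrate why a point that satisfies the linearized constraint also satisfies the original one.

```python
def abs2_lower_bound(x, x_q):
    """First-order Taylor lower bound of |x|^2 around x_q."""
    return 2.0 * (x.conjugate() * x_q).real - abs(x_q) ** 2

def rate_threshold(R_min, t_k, sigma2, p_k):
    """Smallest |x|^2 that meets t_k * log2(1 + p_k |x|^2 / sigma2) >= R_min."""
    return (2.0 ** (R_min / t_k) - 1.0) * sigma2 / p_k

def surrogate_rate_ok(x, x_q, R_min, t_k, sigma2, p_k):
    """Check the linearized (convex) version of the minimum-rate constraint."""
    return abs2_lower_bound(x, x_q) >= rate_threshold(R_min, t_k, sigma2, p_k)

# Expansion point from a previous iteration (illustrative value).
x_q = 0.6 - 0.25j
candidates = [0.6 - 0.25j, 0.5 - 0.2j, 0.1 + 0.3j, -0.4 + 0.0j]
bounds = [abs2_lower_bound(x, x_q) for x in candidates]
```

Because the surrogate never exceeds $|x|^2$, every iterate produced by the convexified problem remains feasible for the original constraint.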

4.2.2. Solution to Sub-Problem 2

Sub-Problem 2 optimizes $t_0$ with fixed $\{\Psi_k\}_1^K$, $\lambda$, $\{p_k\}_1^K$, and $\{t_k\}_1^K$, and it can be written as
$$\max_{t_0}\ -\lambda t_0 P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2 \tag{31a}$$
$$\text{s.t.}\ \sum_{k=1}^{K} t_k + t_0 \le 1, \tag{31b}$$
$$t_0 \ge 0, \tag{31c}$$
$$\sum_{k=1}^{K} t_k p_k \le t_0\, \eta \min\left\{P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2,\ P_{\mathrm{sat}}\right\}. \tag{31d}$$
According to (31b–d), we obtain $t_0 \le 1 - \sum_{k=1}^{K} t_k$, $t_0 \ge 0$, and $t_0 \ge \epsilon$, respectively, where $\epsilon = \frac{\sum_{k=1}^{K} t_k p_k}{\eta \min\{P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2,\ P_{\mathrm{sat}}\}}$. Based on these inequalities and the fact that (31a) is decreasing in $t_0$, the optimal solution to Problem (31), denoted by $t_0^*$, is given by
$$t_0^* = \begin{cases} \epsilon, & \epsilon > 0, \\ 0, & \text{otherwise}. \end{cases} \tag{32}$$
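The closed-form rule for $t_0^*$ is short enough to state as code. The following sketch assumes the channel gain $|h_{it}^H \Phi h_r + h_{pt}^H|^2$ has already been computed and passed in as `gain`; the function name, argument names, and numbers are hypothetical.

```python
def optimal_t0(t, p, eta, P_p, gain, P_sat):
    """Smallest WET duration that covers the energy used in the WIT phase.

    t, p : per-device time portions and transmit powers (k = 1..K)
    gain : |h_it^H Phi h_r + h_pt^H|^2, assumed precomputed
    Returns (t0, feasible), where feasible reports whether t0 also
    respects the total-time constraint sum(t) + t0 <= 1.
    """
    harvest_rate = eta * min(P_p * gain, P_sat)
    if harvest_rate <= 0:
        raise ValueError("no energy can be harvested with these inputs")
    eps = sum(tk * pk for tk, pk in zip(t, p)) / harvest_rate
    t0 = eps if eps > 0 else 0.0
    return t0, t0 <= 1.0 - sum(t)

t0, feasible = optimal_t0(t=[0.2, 0.3], p=[1.0, 2.0], eta=0.8,
                          P_p=10.0, gain=0.5, P_sat=4.0)
```

In this example the harvester saturates ($P_p \cdot \text{gain} = 5 > P_{\mathrm{sat}} = 4$), so the harvest rate is $0.8 \times 4 = 3.2$ and the required energy $0.8$ is collected in a quarter of the frame.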

4.2.3. Solution to Sub-Problem 3

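Sub-Problem 3 optimizes $\{t_k\}_1^K$ with fixed $\{\Psi_k\}_1^K$, $\lambda$, $t_0$, and $\{p_k\}_1^K$, and can be written as
$$\max_{\{t_k\}_1^K}\ \beta\sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k |g_{id_k}^H \Psi_k g_r + h_k^H|^2}{\sigma^2}\right) \tag{33a}$$
$$\text{s.t.}\ t_k \log_2\left(1 + \frac{p_k |g_{id_k}^H \Psi_k g_r + h_k^H|^2}{\sigma^2}\right) \ge R_{\min},\quad k = 1, 2, \ldots, K, \tag{33b}$$
$$\sum_{k=1}^{K} t_k + t_0 \le 1,\quad t_k \ge 0,\quad k = 1, 2, \ldots, K, \tag{33c}$$
$$\sum_{k=1}^{K} t_k p_k \le t_0\, \eta \min\left\{P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2,\ P_{\mathrm{sat}}\right\}. \tag{33d}$$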
Problem (33) is a linear programming problem and can be solved with a standard solver such as CVX.

4.2.4. Solution to Sub-Problem 4

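Sub-Problem 4 optimizes $\{p_k\}_1^K$ with fixed $\{\Psi_k\}_1^K$, $\lambda$, and $\{t_k\}_0^K$, and can be written as
$$\max_{\{p_k\}_1^K}\ \beta\sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k |g_{id_k}^H \Psi_k g_r + h_k^H|^2}{\sigma^2}\right) \tag{34a}$$
$$\text{s.t.}\ t_k \log_2\left(1 + \frac{p_k |g_{id_k}^H \Psi_k g_r + h_k^H|^2}{\sigma^2}\right) \ge R_{\min},\quad k = 1, 2, \ldots, K, \tag{34b}$$
$$p_k \ge 0,\quad k = 1, 2, \ldots, K, \tag{34c}$$
$$\sum_{k=1}^{K} t_k p_k \le t_0\, \eta \min\left\{P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2,\ P_{\mathrm{sat}}\right\}. \tag{34d}$$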
Problem (34) is a convex optimization problem and can be solved with a standard solver such as CVX.
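For intuition, Problem (34) has the structure of water-filling with per-device power floors. The sketch below is not the CVX-based procedure used in the paper; it is a swapped-in bisection on the water level that follows from the KKT conditions, under the assumptions that every $t_k > 0$ and that the normalized gains $c_k = |g_{id_k}^H \Psi_k g_r + h_k^H|^2/\sigma^2$ and the energy budget on the right-hand side of (34d) are given. All names and numbers are hypothetical.

```python
def allocate_power(t, c, R_min, energy):
    """Maximize sum_k t_k*log2(1 + p_k*c_k) s.t. rate floors and an energy budget.

    t      : time portions t_k > 0
    c      : normalized channel gains c_k = |.|^2 / sigma^2
    energy : right-hand side of the energy constraint (34d)
    """
    # Minimum power that meets the per-device rate constraint (34b).
    floor = [(2.0 ** (R_min / tk) - 1.0) / ck for tk, ck in zip(t, c)]
    if sum(tk * f for tk, f in zip(t, floor)) > energy:
        raise ValueError("rate floors alone exceed the energy budget")

    def spent(level):
        # Energy used when every device gets max(floor, level - 1/c_k).
        return sum(tk * max(f, level - 1.0 / ck)
                   for tk, f, ck in zip(t, floor, c))

    lo = min(1.0 / ck for ck in c)                     # spent(lo) = floors only
    hi = max(1.0 / ck for ck in c) + energy / sum(t)   # spent(hi) >= energy
    for _ in range(200):                               # bisection on the water level
        mid = 0.5 * (lo + hi)
        if spent(mid) > energy:
            hi = mid
        else:
            lo = mid
    return [max(f, lo - 1.0 / ck) for f, ck in zip(floor, c)], floor

powers, floors = allocate_power(t=[0.3, 0.3], c=[100.0, 50.0],
                                R_min=0.01, energy=1.0)
```

The budget is used in full because the objective increases with every $p_k$, and devices with better channels sit higher above their $1/c_k$ "ground level", as in classical water-filling.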

4.2.5. Solution to Sub-Problem 5

Sub-Problem 5 optimizes $\lambda$ with fixed $\{\Psi_k\}_1^K$, $\{p_k\}_1^K$, and $\{t_k\}_0^K$, and can be written as
$$\max_{\lambda}\ -\lambda t_0 P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2 \tag{35a}$$
$$\text{s.t.}\ \sum_{k=1}^{K} t_k p_k \le t_0\, \eta \min\left\{P_p^* \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2,\ P_{\mathrm{sat}}\right\}, \tag{35b}$$
$$\lambda \ge 0. \tag{35c}$$
Note that, according to (22), $P_p^*$ in (35a,b) is a function of $\lambda$. According to the constraint (35b), Problem (35) can be solved separately in the cases of $P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2 \le P_{\mathrm{sat}}$ and $P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2 > P_{\mathrm{sat}}$.
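When $P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2 \le P_{\mathrm{sat}}$, according to the expression of $P_p^*$ given in (22), there are three cases to consider. Case 1: if $P_p^* = \frac{\lambda \gamma_e - b}{2a}$, Problem (35) can be written as
$$\max_{\lambda}\ -\frac{t_0 \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}{2a}\left(\gamma_e \lambda^2 - b\lambda\right) \tag{36a}$$
$$\text{s.t.}\ \sum_{k=1}^{K} t_k p_k \le \frac{t_0\, \eta \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}{2a}\left(\lambda \gamma_e - b\right),\quad (35\text{c}). \tag{36b}$$
By analyzing the first-order derivative of (36a) and according to the constraints (36b) and (35c), the optimal solution of Problem (36), denoted by $\lambda^*$, should be $\lambda^* = \frac{2a \sum_{k=1}^{K} t_k p_k + b\, \eta\, t_0 |h_{it}^H \Phi h_r + h_{pt}^H|^2}{\eta\, t_0 |h_{it}^H \Phi h_r + h_{pt}^H|^4}$. Case 2: if $P_p^* = 0$, then $\lambda^* = 0$. Case 3: if $P_p^* = P_{\max}$, then according to the condition $\frac{\lambda \gamma_e - b}{2a} \ge P_{\max}$ in (22), $\lambda^* = \frac{2a P_{\max} + b}{|h_{it}^H \Phi h_r + h_{pt}^H|^2}$. Therefore, when $P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2 \le P_{\mathrm{sat}}$, the expression for $\lambda^*$ is
$$\lambda^* = \begin{cases} \dfrac{2a \sum_{k=1}^{K} t_k p_k + b\, \eta\, t_0 \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}{\eta\, t_0 \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^4}, & P_p^* = \dfrac{\lambda \gamma_e - b}{2a}, \\ 0, & P_p^* = 0, \\ \dfrac{2a P_{\max} + b}{\left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}, & P_p^* = P_{\max}. \end{cases} \tag{37}$$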
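Similarly, when $P_p^* |h_{it}^H \Phi h_r + h_{pt}^H|^2 > P_{\mathrm{sat}}$, $\lambda^*$ can be obtained as
$$\lambda^* = \begin{cases} \dfrac{2a P_{\mathrm{sat}} + b \left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}{\left|h_{it}^H \Phi h_r + h_{pt}^H\right|^4}, & P_p^* = \dfrac{\lambda \gamma_e - b}{2a}, \\ \dfrac{2a P_{\max} + b}{\left|h_{it}^H \Phi h_r + h_{pt}^H\right|^2}, & P_p^* = P_{\max}. \end{cases} \tag{38}$$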
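The case analysis for the energy price can be collected into one small function. This is a sketch under stated assumptions: `gain` stands for $|h_{it}^H \Phi h_r + h_{pt}^H|^2$, `energy_needed` stands for $\sum_k t_k p_k$, the caller already knows which branch of $P_p^*$ in (22) is active, and the function name and the `regime` labels are hypothetical.

```python
def optimal_price(regime, saturated, a, b, gain, P_max,
                  energy_needed=0.0, eta=1.0, t0=1.0, P_sat=0.0):
    """Closed-form energy price for the two saturation cases.

    regime    : "interior" (P_p* = (lambda*gamma_e - b)/(2a)), "zero", or "max"
    saturated : True when P_p* * gain > P_sat
    """
    if regime == "max":
        return (2.0 * a * P_max + b) / gain
    if regime == "zero":
        if saturated:
            raise ValueError("P_p* = 0 cannot saturate the harvester")
        return 0.0
    if regime == "interior":
        if saturated:
            return (2.0 * a * P_sat + b * gain) / gain ** 2
        return ((2.0 * a * energy_needed + b * eta * t0 * gain)
                / (eta * t0 * gain ** 2))
    raise ValueError("unknown regime: " + str(regime))

price_interior = optimal_price("interior", False, a=1.0, b=1.0, gain=0.5,
                               P_max=2.0, energy_needed=0.8, eta=0.8, t0=0.25)
price_max = optimal_price("max", False, a=1.0, b=1.0, gain=0.5, P_max=2.0)
```

With these illustrative numbers the interior price is 34, which induces $P_p^* = (34 \times 0.5 - 1)/2 = 8$ and a harvested energy of $0.25 \times 0.8 \times 8 \times 0.5 = 0.8$, exactly the energy the IT needs.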

4.2.6. Overall Algorithm for Problem (14)

The overall algorithm for Problem (14) is presented in Algorithm 2, where $R^{(p)} = f(\{\psi_k^{(p)}\}_1^K)$ denotes the objective value of Problem (29) with variables $\{\psi_k^{(p)}\}_1^K$ in iteration $p$, and $U_t^{(q)} = z(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, \{t_k^{(q)}\}_0^K)$ denotes the objective value of Problem (14) with variables $\{p_k^{(q)}\}_1^K$, $\lambda^{(q)}$, $\{\psi_k^{(q)}\}_1^K$, and $\{t_k^{(q)}\}_0^K$ in iteration $q$. In each iteration of Algorithm 2, the following inequalities hold:
$$\begin{aligned}
& z\big(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q-1)}\}_1^K, t_0^{(q)}, \{t_k^{(q)}\}_1^K\big) \\
\overset{(a)}{\le}\ & z\big(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, t_0^{(q)}, \{t_k^{(q)}\}_1^K\big) \\
\overset{(b)}{\le}\ & z\big(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, t_0^{(q+1)}, \{t_k^{(q)}\}_1^K\big) \\
\overset{(c)}{\le}\ & z\big(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, t_0^{(q+1)}, \{t_k^{(q+1)}\}_1^K\big) \\
\overset{(d)}{\le}\ & z\big(\{p_k^{(q+1)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, t_0^{(q+1)}, \{t_k^{(q+1)}\}_1^K\big) \\
\overset{(e)}{\le}\ & z\big(\{p_k^{(q+1)}\}_1^K, \lambda^{(q+1)}, \{\psi_k^{(q)}\}_1^K, t_0^{(q+1)}, \{t_k^{(q+1)}\}_1^K\big),
\end{aligned} \tag{39}$$
where (39a) holds because $\{\psi_k^{(q)}\}_1^K$ is the locally optimal solution to Problem (29); (39b) holds because $t_0^{(q+1)}$ is the optimal solution to Problem (31); (39c) holds because $\{t_k^{(q+1)}\}_1^K$ is the optimal solution to Problem (33); (39d) holds because $\{p_k^{(q+1)}\}_1^K$ is the optimal solution to Problem (34); (39e) holds because $\lambda^{(q+1)}$ is the optimal solution to Problem (35). Since the objective value is non-decreasing over iterations, as shown in (39), and is bounded from above, Algorithm 2 is guaranteed to converge (Lemma 2, [23]). During each iteration, the gains of the leader and the follower are increased, and the optimal solutions to the leader and follower problems are obtained when the Stackelberg equilibrium is reached. If one party deviates from the equilibrium out of self-interest, the gains of the other party are reduced; the equilibrium is then broken and the win-win situation is destroyed.
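The convergence argument above is the usual one for alternating (block-coordinate) maximization: each block update can only raise the objective, and the objective is bounded. A toy sketch with two scalar blocks is given below. The concave function, the closed-form block updates, and all names are illustrative stand-ins for the five sub-problems, not part of the paper; the stopping rule mirrors the relative-change test used in Algorithm 2.

```python
def objective(x, y):
    """Toy concave objective standing in for the leader utility."""
    return -(x - 1.0) ** 2 - (y - 2.0) ** 2 - x * y

def alternating_maximization(x, y, eps=1e-6, max_iter=100):
    """Update one block at a time; record the objective after every sweep."""
    history = [objective(x, y)]
    for _ in range(max_iter):
        x = 1.0 - y / 2.0   # argmax over x with y fixed
        y = 2.0 - x / 2.0   # argmax over y with x fixed
        history.append(objective(x, y))
        # Relative-change stopping rule.
        if abs(history[-1] - history[-2]) / abs(history[-1]) <= eps:
            break
    return x, y, history

x_opt, y_opt, history = alternating_maximization(3.0, -1.0)
```

The recorded `history` is non-decreasing, which is the scalar analogue of the chain of inequalities in (39).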
The main complexity of Algorithm 1 comes from Step 5 in the inner layer, whose complexity is $O(L_{outer} L_{inner} N^{3.5})$, where $L_{outer}$ and $L_{inner}$ denote the numbers of iterations required for reaching convergence in the outer and inner layers, respectively. The complexity of Algorithm 2 is $O\big(N_{int}\big((N^2K + 2K + 2)(3K + 5)^3 + (N^2K + 2K + 2)^2 (3K + 5)^2 + (N^2K + 2K + 2)^2\big)\big)$, where $N_{int}$, $N^2K + 2K + 2$, and $3K + 5$ denote the number of iterations required for reaching convergence, the number of variables, and the iteration number, respectively. The flowchart of Algorithm 2 is shown in Figure 3.
Algorithm 2 Algorithm for the Leader Problem (14).
1: Initialization: Set initial values for $\{\psi_k^{(0)}\}_1^K$, $\{s_k^{(0)}\}_1^K$, $\{\rho_k^{(0)}\}_1^K$, $\lambda^{(0)}$, $\{p_k^{(0)}\}_1^K$, and $\{t_k^{(0)}\}_0^K$. Set $q = 0$, $U_t^{(0)} = 0$, $\varepsilon_0 = 1 \times 10^{-3}$, and $\chi_0 = 1 \times 10^{-8}$.
2: repeat
3:     Set $p = 0$ and $R^{(0)} = f(\{\psi_k^{(0)}\}_1^K)$.
4:     repeat
5:         Set initial value for $\nu$.
6:         repeat
7:             Given $\{s_k^{(p)}\}_1^K$ and $\{\rho_k^{(p)}\}_1^K$, solve Problem (23) and denote the obtained solution as $\{\psi_k^{(p+1)}\}_1^K$.
8:             If there exists $k$ such that $\psi_k^H e_i e_i^H \psi_k (1 - \psi_k^H e_i e_i^H \psi_k) > \chi_0$, then set $\nu = 2\nu$.
9:         until $\psi_k^H e_i e_i^H \psi_k (1 - \psi_k^H e_i e_i^H \psi_k) \le \chi_0$, $\forall k$.
10:        Given $\{\rho_k^{(p)}\}_1^K$ and $\{\psi_k^{(p+1)}\}_1^K$, obtain $\{s_k^{(p+1)}\}_1^K$ according to (25).
11:        Given $\{s_k^{(p+1)}\}_1^K$ and $\{\psi_k^{(p+1)}\}_1^K$, obtain $\{\rho_k^{(p+1)}\}_1^K$ according to (28).
12:        Set $p = p + 1$.
13:        Set $R^{(p)} = f(\{\psi_k^{(p)}\}_1^K)$.
14:    until $|R^{(p)} - R^{(p-1)}| / R^{(p)} \le \varepsilon_0$.
15:    Set $\psi_k^{(q+1)} = \psi_k^{(p)}$, $\forall k$.
16:    Given $\{p_k^{(q)}\}_1^K$, $\lambda^{(q)}$, $\{\psi_k^{(q+1)}\}_1^K$, and $\{t_k^{(q)}\}_1^K$, obtain $t_0^{(q+1)}$ according to (32).
17:    Given $\{p_k^{(q)}\}_1^K$, $\lambda^{(q)}$, $\{\psi_k^{(q+1)}\}_1^K$, and $t_0^{(q+1)}$, solve Problem (33) and denote the obtained solution as $\{t_k^{(q+1)}\}_1^K$.
18:    Given $\{t_k^{(q+1)}\}_0^K$, $\lambda^{(q)}$, and $\{\psi_k^{(q+1)}\}_1^K$, solve Problem (34) and denote the obtained solution as $\{p_k^{(q+1)}\}_1^K$.
19:    Given $\{p_k^{(q+1)}\}_1^K$, $\{t_k^{(q+1)}\}_0^K$, and $\{\psi_k^{(q+1)}\}_1^K$, if $P_p |h_{it}^H \Phi h_r + h_{pt}^H|^2 \le P_{\mathrm{sat}}$, then obtain $\lambda^{(q+1)}$ according to (37); otherwise, obtain $\lambda^{(q+1)}$ according to (38).
20:    Set $q = q + 1$.
21:    Set $U_t^{(q)} = z(\{p_k^{(q)}\}_1^K, \lambda^{(q)}, \{\psi_k^{(q)}\}_1^K, \{t_k^{(q)}\}_0^K)$.
22: until $|U_t^{(q)} - U_t^{(q-1)}| / U_t^{(q)} \le \varepsilon_0$.
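The innermost loop of Algorithm 2 (Steps 5 to 9) doubles the penalty factor $\nu$ until every reflection unit is effectively on or off. The sketch below illustrates this mechanism on a deliberately simplified problem: a single relaxed activation level $u \in [0, 1]$, a toy benefit $-(u - a)^2$ whose unpenalized optimum is fractional, and a grid search in place of the convex solver. None of these simplifications come from the paper, and all names are hypothetical.

```python
def penalized_activation(a, nu=0.1, chi=1e-8, grid=1000):
    """Double nu until the relaxed activation level u is (nearly) binary.

    Maximizes -(u - a)^2 - nu * u * (1 - u) over a grid on [0, 1];
    u * (1 - u) plays the role of the binary-violation term.
    """
    candidates = [i / grid for i in range(grid + 1)]
    while True:
        u = max(candidates,
                key=lambda v: -(v - a) ** 2 - nu * v * (1.0 - v))
        if u * (1.0 - u) <= chi:   # binary within tolerance: stop
            return u, nu
        nu *= 2.0                  # otherwise penalize harder and re-solve

u_low, nu_low = penalized_activation(0.3)
u_high, nu_high = penalized_activation(0.8)
```

For $\nu > 1$ the toy objective becomes convex in $u$, so its maximum lies at an endpoint and the loop is guaranteed to terminate; a target of 0.3 is pushed to "off" and a target of 0.8 to "on".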

4.2.7. Stackelberg Equilibrium Analysis

After solving Problem (15) and Problem (14), the Stackelberg equilibrium between the PS and IT can be achieved [11]. The reason is given below. Let $(P_p^*, \Phi^*)$ and $(P_p, \Phi)$ denote the obtained locally optimal solution and a feasible solution to the follower problem (15), respectively. In addition, let $(\lambda^*, \{t_k^*\}_0^K, \{\Psi_k^*\}_1^K, \{p_k^*\}_1^K)$ and $(\lambda, \{t_k\}_0^K, \{p_k\}_1^K, \{\Psi_k\}_1^K)$ denote the obtained locally optimal solution and a feasible solution to the leader problem (14), respectively. For the follower problem (15), the following inequality holds:
$$U_p\big(\lambda^*, \{t_k^*\}_0^K, \{p_k^*\}_1^K, \{\Psi_k^*\}_1^K, P_p^*, \Phi^*\big) \ge U_p\big(\lambda^*, \{t_k^*\}_0^K, \{p_k^*\}_1^K, \{\Psi_k^*\}_1^K, P_p, \Phi\big). \tag{40}$$
For the leader problem (14), the following inequality holds:
$$U_t\big(\lambda^*, \{t_k^*\}_0^K, \{p_k^*\}_1^K, \{\Psi_k^*\}_1^K, P_p^*, \Phi^*\big) \ge U_t\big(\lambda, \{t_k\}_0^K, \{p_k\}_1^K, \{\Psi_k\}_1^K, P_p^*, \Phi^*\big). \tag{41}$$
According to the definition of the Stackelberg equilibrium in [25], the Stackelberg equilibrium is reached as long as the inequalities (40) and (41) are satisfied. Therefore, by solving Problems (15) and (14), the obtained solution $(\lambda^*, \{t_k^*\}_0^K, \{p_k^*\}_1^K, \{\Psi_k^*\}_1^K, P_p^*, \Phi^*)$ is the Stackelberg equilibrium point for the considered Stackelberg game.
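The two equilibrium inequalities can be tested numerically against sampled deviations. The sketch below uses a heavily simplified one-dimensional game: the follower picks a power $P$ with utility $\lambda \gamma P - (aP^2 + bP)$, whose maximizer $(\lambda\gamma - b)/(2a)$ matches the interior branch quoted from (22), while the leader utility and the assumption that feasible prices are those at or above $\lambda^*$ are hypothetical choices made only for this illustration.

```python
import math

def is_equilibrium(U_p, U_t, leader_star, follower_star,
                   leader_feasible, follower_feasible, tol=1e-12):
    """Check (40) and (41) against finite sets of feasible deviations."""
    follower_ok = all(
        U_p(leader_star, follower_star) >= U_p(leader_star, f) - tol
        for f in follower_feasible)
    leader_ok = all(
        U_t(leader_star, follower_star) >= U_t(l, follower_star) - tol
        for l in leader_feasible)
    return follower_ok and leader_ok

gamma, a, b = 0.5, 1.0, 1.0

def U_p(lam, P):
    # Follower (PS): energy revenue minus quadratic generation cost.
    return lam * gamma * P - (a * P ** 2 + b * P)

def U_t(lam, P):
    # Leader (IT): toy rate revenue minus payment for energy (assumption).
    return 10.0 * math.log2(1.0 + P) - lam * gamma * P

lam_star = 5.0
P_star = (lam_star * gamma - b) / (2.0 * a)          # best response: 0.75
powers = [i / 100 for i in range(201)]               # follower deviations
prices = [lam_star + i / 10 for i in range(51)]      # assumed feasible prices

ok = is_equilibrium(U_p, U_t, lam_star, P_star, prices, powers)
broken = is_equilibrium(U_p, U_t, 6.0, P_star, prices, powers)
```

Pairing a different price with the same power breaks the follower inequality, which is the situation the text describes as one party leaving the equilibrium.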

5. Simulation Results

We evaluate the performance of the proposed Stackelberg game-based optimization algorithm through computer simulations. In the simulations, the proposed algorithm is compared to the following benchmark schemes.
(1) W/o IRS: This scheme does not use an IRS. It maximizes the utility of the PS by optimizing $P_p$ using an algorithm similar to Algorithm 1, and it maximizes the utility of the IT by optimizing $\{p_k\}_1^K$, $\{t_k\}_0^K$, and $\lambda$ using an algorithm similar to Algorithm 2.
(2) Fixed time: This scheme fixes the time allocation $\{t_k\}_0^K$. It maximizes the utility of the PS by optimizing $P_p$ and $\Phi$ using an algorithm similar to Algorithm 1, and maximizes the utility of the IT by optimizing $\{p_k\}_1^K$, $t_0$, $\lambda$, and $\{\Psi_k\}_1^K$ using an algorithm similar to Algorithm 2.
(3) Fixed time & w/o IRS: This scheme does not use an IRS and fixes the values of $\{t_k\}_0^K$. It maximizes the utility of the PS by optimizing $P_p$ using an algorithm similar to Algorithm 1, and maximizes the utility of the IT by optimizing $\{p_k\}_1^K$, $t_0$, and $\lambda$ using an algorithm similar to Algorithm 2.
(4) Rate maximization: This scheme ignores the cost of using the IRS and uses all IRS reflection units. It maximizes the sum rate of all devices by jointly optimizing the IRS reflection coefficients in the WET phase $\Phi$, the transmit power of the PS $P_p$, the IRS reflection coefficients for all devices in the WIT phase $\{\Psi_k\}_1^K$, the IT's transmit power allocated to the devices $\{p_k\}_1^K$, and the time portions of the WET and WIT phases $\{t_k\}_0^K$ under the constraints (1), (2), (3), (7), (8), and (11). The considered optimization problem is
$$\max_{\{t_k\}_0^K,\,\{p_k\}_1^K,\,\{\Psi_k\}_1^K,\,P_p,\,\Phi}\ \sum_{k=1}^{K} t_k \log_2\left(1 + \frac{p_k |g_{id_k}^H \Psi_k g_r + h_k^H|^2}{\sigma^2}\right) \tag{42}$$
$$\text{s.t.}\ (1),\ (2),\ (3),\ (7),\ (8),\ (11),$$
$$w_{E,n} = 1,\quad \forall n,$$
$$w_{k,n} = 1,\quad \forall n, k.$$
This scheme uses the alternating optimization algorithm to solve this problem, and the procedure is similar to Algorithm 2.
In a two-dimensional coordinate system, the IRS, the IT, and the PS are located at $(5, 1)$ m, $(5, 0)$ m, and $(0, 0)$ m, respectively. The devices are randomly distributed in a circle centered at $(10, 0)$ m with a radius of 0.5 m. The large-scale fading path loss model for all channels is given by $L(d) = C_0 (d/d_0)^{-\alpha}$, where $C_0$, $d$, and $\alpha$ denote the path loss at the reference distance $d_0 = 1$ m, the propagation distance, and the path loss exponent, respectively. The path loss exponents of the channels from the PS to the IRS, from the IT to the IRS, from the IRS to the devices, and from the IRS to the IT are set to 2.5, and the path loss exponents of the channels from the PS to the IT and from the IT to the devices are set to 3.5. The small-scale fading of all channels is modeled by the Rayleigh fading model. Unless otherwise specified, the other simulation parameters are set as follows: $N = 30$, $\eta = 0.8$, $\beta = 10$, $r = 0.01$, $a = 1$, $b = 1$, $\sigma^2 = -80$ dBm, $C_0 = -30$ dB, and $R_{\min} = 0.01$ bits/Hz.
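The large-scale path loss for the stated geometry can be computed directly. The sketch below assumes $C_0 = -30$ dB (reading the reference path loss as a loss, so the sign is negative), $d_0 = 1$ m, and an exponent of 2.5 for the IRS-related links and 3.5 for the direct PS-to-IT link; the function and variable names are hypothetical.

```python
import math

def path_loss(d, alpha, C0_dB=-30.0, d0=1.0):
    """Linear-scale path loss L(d) = C0 * (d / d0)^(-alpha)."""
    return 10.0 ** (C0_dB / 10.0) * (d / d0) ** (-alpha)

# Node coordinates in meters, as given in the simulation setup.
IRS, IT, PS = (5.0, 1.0), (5.0, 0.0), (0.0, 0.0)

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

loss_ps_it = path_loss(distance(PS, IT), alpha=3.5)    # direct link
loss_ps_irs = path_loss(distance(PS, IRS), alpha=2.5)  # IRS-related link (assumed 2.5)
loss_irs_it = path_loss(distance(IRS, IT), alpha=2.5)  # IRS-related link
```

The IRS sits 1 m from the IT, so the IRS-to-IT link experiences only the reference loss, while the 5 m direct link with the larger exponent is attenuated far more strongly.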

5.1. Impact of the Number of Devices

Figure 4 shows the utilities of the IT achieved by different schemes versus the number of devices K. The utilities of the IT achieved by all schemes except the “rate maximization” scheme increase with the number of devices, because the more devices there are, the more revenue the IT earns. In the “rate maximization” scheme, the utility is basically unchanged, because the IT drives the transmit power of the PS to its maximum, so the maximum achievable rate, which is what this scheme maximizes, is already reached when there is only one device. The proposed algorithm has a lower utility than the “rate maximization” scheme, but achieves a higher utility than the other benchmark schemes. This is because the IRS enhances the signal strength in the WET and WIT phases, and TDMA is used in the WIT phase to coordinate when each device receives its signal, which gives the system a higher degree of flexibility. In particular, the proposed algorithm outperforms the “w/o IRS” scheme due to the signal enhancement of the IRS in the WET and WIT phases. The “fixed time” scheme has an even lower utility than the “w/o IRS” scheme, demonstrating the effectiveness of time allocation in improving the utility of the IT.
Figure 5 shows the utilities of the PS achieved by different schemes versus the number of devices K. It is observed that the utilities of the PS achieved by all schemes increase as K increases. The proposed algorithm has the highest utility. The proposed algorithm achieves higher utility than the “w/o IRS” scheme, demonstrating that the use of an IRS can achieve a higher utility. In addition, the “rate maximization” scheme achieves a negative utility, because it only considers maximizing the sum rate of the devices and ignores the utility of the PS. Additionally, the “rate maximization” scheme activates all IRS reflection units in the WET phase, resulting in a high cost. The results of Figure 4 and Figure 5 demonstrate that the proposed algorithm can achieve a win-win situation between the PS and IT.
Figure 6 shows the sum rates of all devices achieved by different schemes versus the number of devices K. It is observed that the sum rates achieved by all schemes increase as K increases. The “rate maximization” scheme has the highest sum rate because it only considers maximizing the sum rate, ignores the utility of the IT, and treats the IRS as a free service by activating all reflection units of the IRS. In addition, although the proposed algorithm does not aim to maximize the sum rate, it achieves a significantly higher sum rate than the “w/o IRS”, “fixed time”, and “fixed time & w/o IRS” schemes, demonstrating that the use of an IRS can not only improve the utilities of the IT and PS, but also increase the sum rate of the devices. Furthermore, the difference in sum rate between the proposed algorithm and the “fixed time” scheme shows that the time allocation in the WET and WIT phases has a decisive influence on the performance of the system. Therefore, in practice, we need to consider the time for each device to receive the signal and deploy the IRS to improve the signal strength and thus increase the utility of the system.

5.2. Impact of the Maximum Transmit Power of the PS

Figure 7 shows the utilities of the IT achieved by different schemes versus the maximum transmit power of the PS $P_{\max}$ for $K = 5$. It can be observed that the proposed algorithm achieves the highest utility: for the same parameters, the proposed scheme yields a larger benefit for the system than the benchmark schemes. Therefore, in order to increase the utility of the system, the parameters can be set slightly larger in practice. Furthermore, it can be observed that the utility of the IT under the proposed algorithm first increases and then remains unchanged as $P_{\max}$ increases. This is because, when $P_{\max}$ is not large, the energy harvested by the IT increases with $P_{\max}$, resulting in an increase in the revenue earned by the IT. On the other hand, when $P_{\max}$ is sufficiently large, the Stackelberg equilibrium is reached at a certain value of $P_{\max}$ (about 33 dBm), and the utility of the IT is maximized at that point. The utilities of the “w/o IRS” and “fixed time & w/o IRS” schemes do not increase much with $P_{\max}$ due to the absence of the IRS.
Figure 8 shows the utilities of the PS achieved by different schemes versus the maximum transmit power of the PS $P_{\max}$ for $K = 5$. It can be observed that, as $P_{\max}$ increases, the utilities of the PS achieved by all schemes first increase and then tend to remain unchanged. This is because, when $P_{\max}$ is not large, an increase in $P_{\max}$ increases the energy sold by the PS, thus increasing its utility. On the other hand, when $P_{\max}$ is sufficiently large, the Stackelberg equilibrium is reached at about $P_{\max} = 33$ dBm, and no party has an incentive to change its resource allocation after the equilibrium is reached, since doing so could reduce its utility.

5.3. Impact of the Price of the Activated IRS Reflection Unit

The numbers of activated IRS reflection units in different schemes versus the price of the activated IRS reflection units $r$ for $K = 5$ are shown in Figure 9, where “proposed-WIT” and “proposed-WET” stand for the proposed algorithm in the WIT and WET phases, respectively, and “fixed time-WIT” and “fixed time-WET” stand for the “fixed time” scheme in the WIT and WET phases, respectively. In Figure 9, it can be observed that the number of activated IRS reflection units decreases in all schemes as the price $r$ increases. This is because, as the price of activated IRS reflection units increases, the IT and PS use fewer IRS reflection units to reduce costs and increase their utilities. For the PS, as $r$ increases, the number of activated IRS reflection units used in the WET phase decreases more slowly than that in the WIT phase to allow the IT to harvest more energy. For the “fixed time” scheme, the time allocation of the WIT phase is not optimized, resulting in lower revenues for the IT and PS compared to the proposed algorithm. Therefore, to increase the utility, the system with the “fixed time” scheme needs to reduce costs and thus uses fewer activated IRS reflection units in the WET and WIT phases than the system with the proposed algorithm.

6. Conclusions

We have studied a multi-device IRS-assisted WPCN, where the PS, the IT, and the IRS belong to different service providers. We set up a Stackelberg game framework to model the interactions between the resource allocation strategies of different service providers (the PS and the IT). For the PS, we formulate a follower problem, where its utility is maximized by jointly optimizing the IRS’s reflect beamforming and its transmit power. For the IT, we formulate a leader problem, where its utility is maximized by jointly optimizing the energy price, the time allocation, the IRS’s reflect beamforming, and its transmit power allocation. To solve these problems, we have proposed efficient algorithms based on the alternating optimization technique. The SCA technique has been used to solve the follower problem, and the closed-form FP approach and the SCA technique have been used to solve the leader problem. Simulation results show that the proposed algorithm can effectively improve the utilities of the PS and the IT compared to several baseline algorithms, and can achieve the Stackelberg equilibrium between different service providers, thus achieving a win-win situation for the PS and IT. Simulation results also show that the IRS can significantly improve all service providers’ utilities by enhancing the signal quality, even when the cost of the IRS is taken into account.

Author Contributions

The authors confirm their contributions to this article as follows: research conception design: X.S. and G.Z.; data collection: X.S.; analysis and interpretation research results: X.S., G.Z. and K.L.; manuscript preparation: X.S., G.Z., K.L. and M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bi, S.; Ho, C.K.; Zhang, R. Wireless powered communication: Opportunities and challenges. IEEE Commun. Mag. 2015, 53, 117–125.
  2. Cui, M.; Zhang, G.; Zhang, R. Secure Wireless Communication via Intelligent Reflecting Surface. IEEE Wirel. Commun. Lett. 2019, 8, 1410–1414.
  3. Sun, Z.; Jing, Y. On the Performance of Training-Based IRS-Assisted Communications under Correlated Rayleigh Fading. IEEE Trans. Commun. 2023, 71, 3117–3131.
  4. Wu, Q.; Zhang, R. Joint Active and Passive Beamforming Optimization for Intelligent Reflecting Surface Assisted SWIPT Under QoS Constraints. IEEE J. Sel. Areas Commun. 2020, 38, 1735–1748.
  5. Wu, Q.; Zhou, X.; Chen, W.; Li, J.; Zhang, X. IRS-Aided WPCNs: A New Optimization Framework for Dynamic IRS Beamforming. IEEE Trans. Wirel. Commun. 2022, 21, 4725–4739.
  6. Chu, Z.; Xiao, P.; Mi, D.; Yin, C.; Hao, W.; Liu, W.; Sodre, A.C. Jointly Active and Passive Beamforming Designs for IRS-Empowered WPCN. IEEE Internet Things J. 2024, 11, 11579–11592.
  7. Zhong, Y.; Zhou, F.; Wang, Y.; Deng, X.; Al-Dhahir, N. Cooperative Jamming-Aided Secure Wireless Powered Communication Networks: A Game Theoretical Formulation. IEEE Commun. Lett. 2020, 24, 1081–1085.
  8. Zheng, H.; Xiong, K.; Fan, P.; Zhong, Z.; Letaief, K.B. Age of Information-Based Wireless Powered Communication Networks with Selfish Charging Nodes. IEEE J. Sel. Areas Commun. 2021, 39, 1393–1411.
  9. Chu, Z.; Nguyen, H.X.; Le, T.A.; Karamanoglu, M.; Ever, E.; Yazici, A. Secure Wireless Powered and Cooperative Jamming D2D Communications. IEEE Trans. Green Commun. Netw. 2018, 2, 1–13.
  10. Luong, N.C.; Van, N.T.T.; Feng, S.; Nguyen, H.T.; Niyato, D.; Kim, D.I. Dynamic Network Service Selection in IRS-Assisted Wireless Networks: A Game Theory Approach. IEEE Trans. Veh. Technol. 2021, 70, 5160–5165.
  11. Zhai, L.; Zou, Y.; Zhu, J.; Guo, H. A Stackelberg Game Approach for IRS-Aided WPCN Multicast Systems. IEEE Trans. Wirel. Commun. 2022, 21, 3249–3262.
  12. Zhang, Y.; Xu, Y.; Xu, Y.; Yang, Y.; Luo, Y.; Wu, Q.; Liu, X. A Multi-Leader One-Follower Stackelberg Game Approach for Cooperative Anti-Jamming: No Pains, No Gains. IEEE Commun. Lett. 2018, 22, 1680–1683.
  13. Chu, Z.; Nguyen, H.X.; Caire, G. Game Theory-Based Resource Allocation for Secure WPCN Multiantenna Multicasting Systems. IEEE Trans. Inf. Forensics Secur. 2018, 13, 926–939.
  14. Zhang, D.; Wu, Q.; Cui, M.; Zhang, G.; Niyato, D. Throughput Maximization for IRS-Assisted Wireless Powered Hybrid NOMA and TDMA. IEEE Wirel. Commun. Lett. 2021, 10, 1944–1948.
  15. Kang, X.; Zhang, R.; Motani, M. Price-Based Resource Allocation for Spectrum-Sharing Femtocell Networks: A Stackelberg Game Approach. IEEE J. Sel. Areas Commun. 2012, 30, 538–549.
  16. Zhai, L.; Zou, Y.; Zhu, J.; Li, B. Improving Physical Layer Security in IRS-Aided WPCN Multicast Systems via Stackelberg Game. IEEE Trans. Commun. 2022, 70, 1957–1970.
  17. Kang, Z.; You, C.; Zhang, R. IRS-Aided Wireless Relaying: Deployment Strategy and Capacity Scaling. IEEE Wirel. Commun. Lett. 2022, 11, 215–219.
  18. Zhou, F.; You, C.; Zhang, R. Delay-Optimal Scheduling for IRS-Aided Mobile Edge Computing. IEEE Wirel. Commun. Lett. 2021, 10, 740–744.
  19. Chu, Z.; Zhu, Z.; Li, X.; Zhou, F.; Zhen, L.; Al-Dhahir, N. Resource Allocation for IRS-Assisted Wireless-Powered FDMA IoT Networks. IEEE Internet Things J. 2022, 9, 8774–8785.
  20. Dong, Y.; Hossain, M.J.; Cheng, J. Performance of Wireless Powered Amplify and Forward Relaying Over Nakagami-m Fading Channels with Nonlinear Energy Harvester. IEEE Commun. Lett. 2016, 20, 672–675.
  21. Zhai, L.; Zou, Y.; Zhu, J.; Jiang, Y. Stackelberg Game-Based Multiple Access Design for Intelligent Reflecting Surface Assisted Wireless Powered IoT Networks. IEEE Trans. Wirel. Commun. 2023, 22, 6883–6897.
  22. Gao, Y.; Yong, C.; Xiong, Z.; Niyato, D.; Xiao, Y.; Zhao, J. A Stackelberg Game Approach to Resource Allocation for IRS-aided Communications. In Proceedings of the GLOBECOM 2020 - 2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–6.
  23. Hua, M.; Wu, Q.; Yang, L.; Schober, R.; Poor, H.V. A Novel Wireless Communication Paradigm for Intelligent Reflecting Surface Based Symbiotic Radio Systems. IEEE Trans. Signal Process. 2022, 70, 550–565.
  24. Guo, H.; Liang, Y.C.; Chen, J.; Larsson, E.G. Weighted Sum-Rate Maximization for Reconfigurable Intelligent Surface Aided Wireless Networks. IEEE Trans. Wirel. Commun. 2020, 19, 3064–3076.
  25. Yao, F.; Jia, L.; Sun, Y.; Xu, Y.; Feng, S.; Zhu, Y. A Hierarchical Learning Approach to Anti-jamming Channel Selection Strategies. Wirel. Netw. 2019, 25, 201–213.
Figure 1. An IRS-assisted WPCN with multiple service providers.
Figure 2. Flowchart of Algorithm 1.
Figure 3. Flowchart of Algorithm 2.
Figure 4. Utilities of the IT achieved by different schemes versus the number of devices K.
Figure 5. Utilities of the PS achieved by different schemes versus the number of devices K.
Figure 6. Sum rates of all devices achieved by different schemes versus the number of devices (N = 30, r = 0.01).
Figure 7. Utilities of the IT achieved by different schemes versus the maximum transmit power of the PS for K = 5.
Figure 8. Utilities of the PS achieved by different schemes versus the maximum transmit power of the PS for K = 5.
Figure 9. Numbers of activated IRS reflection units in different schemes versus the price of the activated IRS reflection units r for K = 5.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sun, X.; Li, K.; Cui, M.; Zhang, G. Utility Maximization for the IRS-Assisted Wireless Powered Communication Network with Multiple Service Providers. Electronics 2024, 13, 3352. https://doi.org/10.3390/electronics13173352
