Article

Intelligent Energy Efficiency Maximization for Wirelessly-Powered UAV-Assisted Secure Sensor Network

College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
*
Author to whom correspondence should be addressed.
Current address: College of Electronic and Information Engineering, Southwest University, No. 2, Tiansheng Road, Beibei District, Chongqing 400700, China.
Sensors 2025, 25(5), 1534; https://doi.org/10.3390/s25051534
Submission received: 17 December 2024 / Revised: 26 January 2025 / Accepted: 31 January 2025 / Published: 1 March 2025
(This article belongs to the Special Issue Advances in Security for Emerging Intelligent Systems)

Abstract

The rapid proliferation of Internet of Things (IoT) devices and applications has led to an increasing demand for energy-efficient and secure communication in wireless sensor networks. In this article, we first propose an intelligent approach to maximize the energy efficiency of the UAV in a secure sensor network with wireless power transfer (WPT). All sensors harvest energy from the downlink signal and use it to transmit uplink information to the UAV. To ensure secure data transmission, the UAV needs to optimize the transmission parameters so that it can decode the received information under malicious interference from an attacker. Code Division Multiple Access (CDMA) is adopted to improve uplink communication robustness. To maximize the UAV's energy efficiency in data collection tasks, we formulate a constrained optimization problem that jointly optimizes charging power, charging duration, and data transmission duration. Applying the Deep Deterministic Policy Gradient (DDPG) algorithm, we train an action policy to dynamically determine near-optimal transmission parameters in real time. Numerical results validate the superiority of the proposed intelligent approach over exhaustive search and gradient ascent techniques. This work provides some important guidelines for the design of green secure wireless-powered sensor networks.

1. Introduction

The future Internet of Things (IoT) is anticipated to enable massive deployments of intelligent sensors that can securely and efficiently connect to the internet and interact with one another. However, powering these large-scale sensor networks sustainably remains a critical challenge, as relying on grid electricity for a vast number of devices is often impractical. Wireless power transfer (WPT) has emerged as a promising solution, allowing sensors to harvest energy wirelessly while eliminating the need for frequent battery replacements. Despite its advantages, WPT systems face significant wireless path loss and heightened security risks, making the development of secure and energy-efficient transmission schemes essential for real-world applications. Recent research has investigated approaches for optimizing secrecy performance in wireless communication networks, e.g., [1,2,3,4,5,6]. For instance, ref. [1] proposed a secure communication paradigm at the physical layer for wireless-powered sensor networks, where the corresponding challenges, countermeasures, and road ahead were discussed in detail. Ref. [2] proposed a certificateless linearly homomorphic signature scheme for data transmission and authentication to ensure the authenticity, integrity and non-repudiation of data. Ref. [3] analyzed the secrecy outage probability (SOP) for dual-hop relaying systems with hybrid MIMO RF/FSO links and proposed transmit antenna selection (TAS) schemes to enhance secrecy under imperfect channel state information (CSI). Similarly, ref. [4] examined the SOP for FSO systems with a distinct eavesdropper (ED) near the relay, deriving closed-form expressions for various scenarios. Additionally, ref. [5] proposed a learning-based secure transmission framework using deep neural networks to optimize transmission parameters for maximizing secrecy throughput, while [6] focused on robust secure transmission by optimizing the worst-case secrecy rate under energy-harvesting and power constraints.
However, these studies primarily address secrecy issues rather than energy efficiency issues, leaving a significant gap in developing integrated frameworks that ensure energy-efficient and secure communication for intelligent sensor networks in IoT applications.
Refs. [7,8,9] further investigated how to improve energy efficiency in wireless communication networks with eavesdroppers. Ref. [7] proposed a novel secure and energy-efficient message communication system, called MComIoV, and evaluated it through security proofs and analysis against various attacks to verify its robustness. Ref. [8] proposed a beamformer design approach to optimize the energy efficiency, in terms of secrecy bits per Joule under secrecy quality-of-service (QoS) constraints, for a multi-user MIMO communication network with an eavesdropper. Ref. [9] jointly optimized the resource allocation strategy, the UAV's trajectory, and the jamming policy of the jammer UAV to maximize the overall energy efficiency of a secure unmanned aerial vehicle (UAV) communication system. However, these works did not focus on secure communication networks with wireless power transfer.
Refs. [10,11,12,13,14,15,16,17] further investigated energy-efficient simultaneous wireless information and power transfer (SWIPT) systems. Ref. [10] realized highly energy-efficient transmission by optimizing power allocation and energy harvesting (EH) relay selection in a clustered wireless sensor network. Ref. [11] investigated energy efficiency maximization for coordinated multi-point (CoMP) SWIPT heterogeneous networks (HetNets), where joint beamforming and power allocation are optimized under intra-cell and inter-cell interference. Ref. [12] studied the optimal resource allocation and data forwarding in integrated data and energy transmission. Ref. [13] focused on energy-efficient SWIPT in MIMO systems, where optimal power allocation and precoding were realized. Ref. [14] developed a tractable model of joint downlink and uplink transmission in K-tier heterogeneous cellular networks with SWIPT, in which the authors derived the outage probability and the average ergodic rate of a random mobile user under different cell associations. This provides a guideline for energy-efficient SWIPT systems. Ref. [15] proposed a heuristic algorithm to optimize the transfer time allocation based on the energy consumption distribution of nodes in a SWIPT scenario. Ref. [16] proposed a mode switching scheme for SWIPT based on the random beamforming technique, which can generate artificial channel fading to increase the energy harvesting efficiency of the receiver. Ref. [17] achieved an energy-efficient secure transmission system by jointly optimizing the energy beamformers, information beamformers and transmit time switching ratio, subject to constraints on the users' harvested energy as well as the power budget of the BS. Refs. [18,19,20,21] mainly investigated wirelessly powered networks with separate information transmission. Refs. [22,23,24] aimed at improving the energy efficiency of wireless powered networks by utilizing renewable energy. Ref. [22] proposed a hybrid framework that combines two technologies, namely solar energy harvesting and wireless charging, in which the cluster head placement was optimized to minimize the corresponding energy consumption. In [23], an M/M/1 make-to-stock queuing model is proposed to investigate decentralized decisions on how much renewable energy should be supplied to the BS for minimizing the individual cost of the renewable source. Ref. [24] investigated the tradeoff between energy consumption and QoS, where the wireless network is powered by both grid and harvested renewable energy. However, these works did not consider the security issue and mainly used iterative algorithms to optimize transmission parameters. Since online iterative calculation takes time, these approaches cannot be used for determining optimal transmission parameters under a strict latency constraint.
Recently, deep reinforcement learning (DRL) has become a powerful tool for handling complicated optimization issues in communication networks [25]. Refs. [26,27,28] introduced deep reinforcement learning algorithms to achieve energy-efficient sensing, navigation and wireless transmission, respectively. Ref. [26] utilized distributed DDPG to enhance energy efficiency (EE) in UAV mobile sensing (MCS). Ref. [27] proposed a decentralized DRL framework to achieve energy-efficient navigation for distributed UAVs. Ref. [28] proposed a reinforcement learning based energy management strategy to achieve a good long-term average net bit rate. However, a DRL approach applicable to green secure data collection in IoT has not been investigated yet.
In summary, green secure transmission schemes for wirelessly powered sensor networks have not been well investigated in the above literature. In this article, we design a DDPG approach to arrive at a near-optimal secure transmission scheme for maximizing the energy efficiency of data collection in a wireless sensor network with malicious attackers and WPT. With a well-trained policy model, near-optimal transmission parameters can be determined in real time. As such, our proposed approach can effectively reduce the online latency of wireless transmissions. To the best of our knowledge, this is the first work to investigate an adaptive scheme for green secure IoT transmissions under a strict latency constraint and WPT.

2. System and Channel Model

We consider a wirelessly-powered sensor network with a UAV, an attacker and multiple sensors, as illustrated in Figure 1. More specifically, the UAV first transfers energy to all sensors through the downlink, and then all sensors send the collected data back to the UAV through the uplink using the harvested energy. Note also that the attacker constantly transmits an interference signal to the UAV to degrade its decoding performance. To achieve secure data collection, the UAV needs to be capable of collecting sensor data normally under the interference from the attacker. We assume that all channels experience slow flat fading, where $g_i$ denotes the channel power gain from the UAV to sensor $i$. Here, $g_i$ consists of the fading channel gain $h_i$ and the path loss $d_i^{-\alpha}$, $i = 1, 2, 3, \ldots, M$. As such, the channel power gain is a random variable depending upon the distance $d_i$ and the path loss exponent $\alpha$. Applying the linear energy harvesting (EH) model, the harvested energy at sensor $i$, $i = 1, 2, 3, \ldots, M$, is given by
$$E_{H_i} = P_S\, g_i\, \tau_{wp}\, \eta,$$
where $P_S$ is the charging power level of the UAV, $\tau_{wp}$ is the charging duration and $\eta$ is the DC energy transfer efficiency. All sensors' positions are assumed to follow a 2D Poisson point process (PPP) with density $\lambda$. During uplink data collection, both TDMA and CDMA are applied to facilitate the data transmission.
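As a minimal sketch of the linear EH model above, the snippet below evaluates the channel power gain and the harvested energy for one sensor. It assumes the path loss enters as $d^{-\alpha}$; the function names and the numerical values are hypothetical and are not taken from the paper.

```python
def channel_power_gain(h: float, d: float, alpha: float) -> float:
    """Channel power gain g = h * d**(-alpha): fading gain times path loss."""
    if d <= 0:
        raise ValueError("distance must be positive")
    return h * d ** (-alpha)


def harvested_energy(p_s: float, g: float, tau_wp: float, eta: float) -> float:
    """Linear EH model: E_H = P_S * g * tau_wp * eta."""
    return p_s * g * tau_wp * eta


# Hypothetical example values, chosen only for illustration.
g_example = channel_power_gain(h=1.0, d=10.0, alpha=2.0)
e_example = harvested_energy(p_s=1.0, g=g_example, tau_wp=0.5, eta=0.8)
```

Under this model the harvested energy scales linearly with each of the charging power, the charging duration and the channel gain, which is what makes the later change of variables tractable.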

2.1. TDMA

In TDMA mode, all sensors send the collected data to the UAV in turn through uplink transmission, where $\tau_i$ is the transmission duration for sensor $i$, $i = 1, 2, 3, \ldots, M$, and $\tau$ is the charging duration. As such, the effective data rate can be denoted by
$$R_{TDMA} = \sum_{i=1}^{M} \tau_i B \log_2\left(1 + \frac{P_S g_i^2 \eta \tau}{\tau_i (P_{at} g_{at} + \sigma^2)}\right) \Big/ T.$$
Here, $B$ is the effective bandwidth of the wirelessly-powered network, $\sigma^2$ is the average noise power, $T$ is the duration of data collection and $\sum_{i=1}^{M} \tau_i$ is equal to $T - \tau$. $P_{at}$ is the transmit power level of the attacker and $g_{at}$ is the channel power gain from the attacker to the UAV.
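The TDMA rate expression can be evaluated directly. The sketch below is one way to do so in Python; the function name, argument names and example values are hypothetical, and the received signal power follows the $P_S g_i^2 \eta \tau / \tau_i$ form used in the expression above.

```python
import math


def tdma_rate(p_s, gains, taus, tau, eta, bandwidth, t_total, p_at, g_at, sigma2):
    """Effective TDMA data rate: sum_i tau_i * B * log2(1 + SNR_i) / T."""
    if abs(sum(taus) + tau - t_total) > 1e-9:
        raise ValueError("uplink slots plus charging time must equal T")
    noise = p_at * g_at + sigma2  # attacker interference plus noise power
    total_bits = 0.0
    for g, tau_i in zip(gains, taus):
        snr = p_s * g ** 2 * eta * tau / (tau_i * noise)
        total_bits += tau_i * bandwidth * math.log2(1.0 + snr)
    return total_bits / t_total


# Hypothetical single-sensor example in which the SNR equals exactly 1.
rate_example = tdma_rate(p_s=1.0, gains=[1.0], taus=[0.5], tau=0.5, eta=1.0,
                         bandwidth=2.0, t_total=1.0, p_at=0.0, g_at=0.0,
                         sigma2=1.0)
```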

2.2. CDMA

In CDMA mode, all sensors employ orthogonal codes to simultaneously transmit the collected data to the UAV. Since CDMA can considerably reduce the inter-user interference [29], we ignore the decoding interference from all normal sensors. Accordingly, the effective data rate can be denoted by
$$R_{CDMA} = \frac{\sum_{i=1}^{M} B \tau_{up} \log_2\left(1 + \frac{P_S g_i^2 \eta \tau_{wp}}{\tau_{up} (P_{at} g_{at} + \sigma^2)}\right)}{T}.$$
Here, $B$ is the effective bandwidth, $\sigma^2$ is the average noise power and $\tau_{up}$ is the uplink transmission duration, equal to $T - \tau_{wp}$.
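For comparison with TDMA, the CDMA rate can be sketched in the same way. All sensors share the single uplink window $\tau_{up} = T - \tau_{wp}$, and inter-sensor interference is ignored as stated above. The function name and example values are hypothetical.

```python
import math


def cdma_rate(p_s, gains, tau_wp, eta, bandwidth, t_total, p_at, g_at, sigma2):
    """Effective CDMA data rate: sum_i B * tau_up * log2(1 + SNR_i) / T."""
    tau_up = t_total - tau_wp
    if not 0.0 < tau_wp < t_total:
        raise ValueError("charging duration must lie strictly between 0 and T")
    noise = p_at * g_at + sigma2  # attacker interference plus noise power
    total_bits = sum(
        bandwidth * tau_up
        * math.log2(1.0 + p_s * g ** 2 * eta * tau_wp / (tau_up * noise))
        for g in gains
    )
    return total_bits / t_total


# Hypothetical two-sensor example in which each SNR equals exactly 1.
cdma_example = cdma_rate(p_s=1.0, gains=[1.0, 1.0], tau_wp=0.5, eta=1.0,
                         bandwidth=2.0, t_total=1.0, p_at=0.0, g_at=0.0,
                         sigma2=1.0)
```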

3. Energy Efficiency Maximization

In this section, we maximize the energy efficiency of the wirelessly-powered sensor network under TDMA mode and CDMA mode. Under TDMA mode, closed-form expressions for the optimal transmission parameters are derived. Under CDMA mode, an action policy for determining near-optimal transmission parameters is obtained.

3.1. TDMA

3.1.1. Optimization Without QoS Constraint

We maximize the energy efficiency of data collection under an attacker's interference, where the energy efficiency is the ratio of the effective throughput to the effective transmit power of the UAV. The general optimization problem is formulated as follows
$$\max_{P_S, \tau, \tau_1, \tau_2, \ldots, \tau_M} E_{ff} = \frac{\sum_{i=1}^{M} \frac{\tau_i}{T} B \log_2\left(1 + \frac{P_S g_i^2 \eta \tau}{\tau_i (P_{at} g_{at} + \sigma^2)}\right)}{P_S \frac{\tau}{T} + P_C}, \quad \text{s.t.} \ \sum_{i=1}^{M} \tau_i = T - \tau, \ 0 < P_S \le P_{\max},$$
where $\sum_{i=1}^{M} \tau_i B \log_2\left(1 + \frac{P_S g_i^2 \eta \tau}{\tau_i (P_{at} g_{at} + \sigma^2)}\right) / T$ is the effective throughput and $P_S \tau / T + P_C$ is the effective transmit power of the UAV. Note that iterative algorithms have high online computational complexity and, as such, they are not suitable for adaptive transmission subject to a strict latency constraint. Accordingly, we derive closed-form analytical expressions for the optimal transmission parameters, so that these parameters can be determined in real time without any iterative calculation. First of all, we present a lemma stating the condition for energy efficiency maximization.
Lemma 1.
The optimal $\tau_1, \tau_2, \ldots, \tau_M$ need to satisfy the condition $P_S g_1^2 \eta \tau / \tau_1 = P_S g_2^2 \eta \tau / \tau_2 = \cdots = P_S g_M^2 \eta \tau / \tau_M = C$.
The corresponding proof is provided below.
Proof. 
We assume that $P_S$ and $\tau$ are given and, as such, the optimization problem can be updated to
$$\max_{\tau_1, \tau_2, \ldots, \tau_M} E_{ff} = \frac{\sum_{i=1}^{M} \tau_i B \log_2\left(1 + \frac{P_S g_i^2 \eta \tau}{\tau_i (P_{at} g_{at} + \sigma^2)}\right)}{P_S \tau + P_C T}, \quad \text{s.t.} \ \sum_{i=1}^{M} \tau_i = T - \tau.$$
Note that the second derivative of this objective function with respect to $\tau_i$ is always less than zero, $i = 1, 2, \ldots, M$. Hence, the objective function is jointly concave with respect to $\tau_1, \tau_2, \ldots, \tau_M$, and the constraint is linear. According to Ref. [30], we apply the KKT conditions to the Lagrangian $L = E_{ff} + \lambda\left(T - \tau - \sum_{i=1}^{M} \tau_i\right)$. As such, we can further transform $\partial L / \partial \tau_i = 0$ to
$$\frac{B\left(\ln\left(1 + \frac{P_S g_i^2 \eta \tau}{\tau_i (P_{at} g_{at} + \sigma^2)}\right) - \frac{P_S g_i^2 \eta \tau / (\tau_i (P_{at} g_{at} + \sigma^2))}{1 + P_S g_i^2 \eta \tau / (\tau_i (P_{at} g_{at} + \sigma^2))}\right)}{(P_S \tau + P_C T) \ln 2} - \lambda = 0,$$
where $i = 1, \ldots, M$. Since the function $u \mapsto \ln(1+u) - u/(1+u)$ is strictly increasing for $u > 0$, Equation (4) can hold for every $i$ with the same $\lambda$ if and only if $P_S g_1^2 \eta \tau / \tau_1 = P_S g_2^2 \eta \tau / \tau_2 = \cdots = P_S g_M^2 \eta \tau / \tau_M = C$, where $C$ is a constant.
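Lemma 1 can be sanity-checked numerically. The sketch below fixes $P_S$ and $\tau$, lumps the constant factor $P_S \eta \tau / (P_{at} g_{at} + \sigma^2)$ into a single hypothetical parameter `k`, and compares the allocation $\tau_i \propto g_i^2$ implied by the lemma against randomly drawn feasible allocations. The gains and constants are illustrative assumptions only.

```python
import math
import random


def tdma_sum(gains, taus, k):
    """sum_i tau_i * log2(1 + k * g_i^2 / tau_i); k lumps P_S*eta*tau/noise."""
    return sum(t * math.log2(1.0 + k * g ** 2 / t) for g, t in zip(gains, taus))


def lemma_allocation(gains, uplink_time):
    """Allocation with equal g_i^2 / tau_i for all sensors (Lemma 1)."""
    total = sum(g ** 2 for g in gains)
    return [uplink_time * g ** 2 / total for g in gains]


def random_allocation(m, uplink_time, rng):
    """Random positive slot lengths that sum to the uplink time."""
    weights = [rng.random() + 1e-9 for _ in range(m)]
    scale = uplink_time / sum(weights)
    return [w * scale for w in weights]


rng = random.Random(0)
gains = [0.2, 0.5, 0.9, 1.3]   # hypothetical channel power gains
uplink_time, k = 0.6, 5.0      # hypothetical T - tau and lumped constant
lemma_value = tdma_sum(gains, lemma_allocation(gains, uplink_time), k)
best_random = max(
    tdma_sum(gains, random_allocation(len(gains), uplink_time, rng), k)
    for _ in range(2000)
)
```

Because the objective is concave in the slot lengths, no random allocation should exceed the value reached by the lemma's allocation.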
Accordingly, by substituting the condition of Lemma 1 into the original optimization problem, we can rewrite the problem as
$$\max_{P_S, \tau, \tau_1, \ldots, \tau_M} E_{ff} = \frac{B \left(\sum_{i=1}^{M} \tau_i\right) \log_2\left(1 + \frac{C}{P_{at} g_{at} + \sigma^2}\right)}{P_S \tau + P_C T}, \quad \text{s.t.} \ \sum_{i=1}^{M} \tau_i = T - \tau, \ 0 < P_S \le P_{\max}, \ P_S g_1^2 \eta \tau / \tau_1 = \cdots = P_S g_M^2 \eta \tau / \tau_M = C.$$
According to $\sum_{i=1}^{M} \tau_i = T - \tau$ and $\frac{P_S g_1^2 \eta \tau}{\tau_1} = \cdots = \frac{P_S g_M^2 \eta \tau}{\tau_M} = C$, we have $C = \frac{\sum_{i=1}^{M} P_S g_i^2 \eta \tau}{T - \tau}$ and $\tau_i = \frac{(T - \tau) g_i^2}{\sum_{i=1}^{M} g_i^2}$, $i = 1, 2, \ldots, M$. Then, we further simplify the optimization problem to
$$\max_{P_S, \tau} E_{ff} = \frac{B (T - \tau) \log_2\left(1 + \frac{P_S \left(\sum_{i=1}^{M} g_i^2\right) \eta \tau}{(T - \tau)(P_{at} g_{at} + \sigma^2)}\right)}{P_S \tau + P_C T}, \quad \text{s.t.} \ \tau > 0, \ 0 < P_S \le P_{\max}.$$
This objective function is still not jointly concave with respect to $P_S$ and $\tau$. As such, we define a variable $x$ as $P_S \left(\tau / (T - \tau)\right)$. Then, this optimization problem can be rewritten as
$$\max_{P_S, x} E_{ff} = \frac{B \log_2\left(1 + x \left(\sum_{i=1}^{M} g_i^2\right) \eta / (P_{at} g_{at} + \sigma^2)\right)}{x + P_C \left(1 + \frac{x}{P_S}\right)}, \quad \text{s.t.} \ x > 0, \ 0 < P_S \le P_{\max}.$$
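The change of variables can be verified in one line by dividing the numerator and the denominator of the previous objective by $T - \tau$ and using $x = P_S \tau / (T - \tau)$. The worked step below uses only symbols already defined in this section.

```latex
\frac{P_S \tau + P_C T}{T - \tau}
  = P_S \frac{\tau}{T - \tau} + P_C \frac{(T - \tau) + \tau}{T - \tau}
  = x + P_C \left( 1 + \frac{\tau}{T - \tau} \right)
  = x + P_C \left( 1 + \frac{x}{P_S} \right),
\qquad
\frac{P_S \left( \sum_{i=1}^{M} g_i^2 \right) \eta \tau}{(T - \tau)(P_{at} g_{at} + \sigma^2)}
  = \frac{x \left( \sum_{i=1}^{M} g_i^2 \right) \eta}{P_{at} g_{at} + \sigma^2}.
```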
Since $\tau / (T - \tau)$ ranges from $0$ to $\infty$, $x$ and $P_S$ are independent of each other, because $x$ can take any positive value regardless of the value of $P_S$. Accordingly, the objective function is monotonically increasing with $P_S$, so the optimal $P_S$ is equal to its peak value $P_{\max}$. Subsequently, the optimization problem can be transformed into a univariate optimization problem, shown as follows
$$\max_{x} E_{ff} = \frac{B \log_2\left(1 + x \left(\sum_{i=1}^{M} g_i^2\right) \eta / (P_{at} g_{at} + \sigma^2)\right)}{x \left(1 + \frac{P_C}{P_{\max}}\right) + P_C}, \quad \text{s.t.} \ x > 0.$$
By setting $\partial E_{ff} / \partial x = 0$, we arrive at the optimal solution as below
$$x^* = \frac{\frac{P_C}{1 + \frac{P_C}{P_{\max}}} - \frac{(P_{at} g_{at} + \sigma^2)}{\left(\sum_{i=1}^{M} g_i^2\right) \eta}}{W_0\left[e^{-1}\left(\frac{P_C}{1 + \frac{P_C}{P_{\max}}} \cdot \frac{\left(\sum_{i=1}^{M} g_i^2\right) \eta}{(P_{at} g_{at} + \sigma^2)} - 1\right)\right]} - \frac{(P_{at} g_{at} + \sigma^2)}{\left(\sum_{i=1}^{M} g_i^2\right) \eta},$$
where $W_0[\cdot]$ denotes the principal branch of the Lambert W function [31]. Accordingly, the optimal charging duration $\tau$ can be calculated by
$$\tau^* = T - \frac{T}{1 + x^* / P_{\max}}.$$
Accordingly, on the basis of the derived condition $\tau_i = (T - \tau) g_i^2 / \sum_{i=1}^{M} g_i^2$, the optimal data transmission duration $\tau_i$, $i = 1, 2, \ldots, M$, can be denoted by
$$\tau_i^* = \frac{(T - \tau^*) g_i^2}{\sum_{i=1}^{M} g_i^2}.$$
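The closed-form TDMA solution can be evaluated without any optimization loop. The sketch below is a standard-library-only illustration: the Newton-based `lambert_w0` helper, the function names and the example values are assumptions made here and are not part of the paper. It computes $x^*$ through the identity $e^{W+1} = z e / W$, which is algebraically the same as the printed expression but avoids a division by zero when $W_0 = 0$.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=200):
    """Principal branch W0(z), z >= -1/e, via Newton's method on w*exp(w) = z.

    Convergence becomes slow when z is very close to -1/e.
    """
    if z < -1.0 / math.e:
        raise ValueError("W0 is real only for z >= -1/e")
    # Start at or to the right of the root, where Newton converges monotonically.
    w = math.log1p(z) if z > 0 else 0.0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def optimal_tdma_parameters(gains, p_max, p_c, eta, t_total, p_at, g_at, sigma2):
    """Return (x*, tau*, [tau_i*]) from the closed-form TDMA solution."""
    gain_sq = sum(g ** 2 for g in gains)
    a = gain_sq * eta / (p_at * g_at + sigma2)   # received SNR per unit of x
    c_over_b = p_c / (1.0 + p_c / p_max)
    w = lambert_w0((a * c_over_b - 1.0) / math.e)
    # exp(w + 1) equals (a*c_over_b - 1)/w because w*exp(w) = (a*c_over_b - 1)/e.
    x_star = (math.exp(w + 1.0) - 1.0) / a
    tau_star = t_total - t_total / (1.0 + x_star / p_max)
    taus = [(t_total - tau_star) * g ** 2 / gain_sq for g in gains]
    return x_star, tau_star, taus


def efficiency_in_x(x, a, p_c, p_max, bandwidth=1.0):
    """Univariate objective B*log2(1 + a*x) / (x*(1 + P_C/P_max) + P_C)."""
    return bandwidth * math.log2(1.0 + a * x) / (x * (1.0 + p_c / p_max) + p_c)


# Hypothetical example values, chosen only for illustration.
gains = [0.3, 0.5]
x_star, tau_star, taus_star = optimal_tdma_parameters(
    gains, p_max=2.0, p_c=0.5, eta=0.8, t_total=1.0,
    p_at=0.01, g_at=0.1, sigma2=1e-3)
```

A quick way to gain confidence in the result is to evaluate `efficiency_in_x` slightly below and slightly above `x_star` and confirm that neither neighbor does better.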

3.1.2. Optimization with QoS Constraint

In real application scenarios, we usually need to ensure that the effective throughput is greater than a threshold to satisfy the QoS requirement. The updated optimization problem is then
$$\max_{P_S, \tau} E_{ff} = \frac{B (T - \tau) \log_2\left(1 + \frac{P_S \left(\sum_{i=1}^{M} g_i^2\right) \eta \tau}{(T - \tau)(P_{at} g_{at} + \sigma^2)}\right)}{P_S \tau + P_C T}, \quad \text{s.t.} \ \tau > 0, \ 0 < P_S \le P_{\max}, \ \frac{B (T - \tau)}{T} \log_2\left(1 + \frac{P_S \left(\sum_{i=1}^{M} g_i^2\right) \eta \tau}{(T - \tau)(P_{at} g_{at} + \sigma^2)}\right) \ge R_{\min},$$
where $R_{\min}$ is the lower bound of the effective throughput. Following a process similar to that in Section 3.1.1, the optimal $P_S$ is equal to $P_{\max}$. Accordingly, the above constraint on the effective throughput can be transformed to $\frac{B (T - \tau)}{T} \log_2\left(1 + \frac{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta \tau}{(T - \tau)(P_{at} g_{at} + \sigma^2)}\right) \ge R_{\min}$. Through some mathematical manipulations, the effective throughput constraint can be further updated to
$$-\frac{B}{R_{\min} \ln 2}\left(W_0\left[-\frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{B P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta} \exp\left(\frac{R_{\min} \ln 2}{B}\left(1 - \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right)\right)\right] + \frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{P_{\max} B \eta \left(\sum_{i=1}^{M} g_i^2\right)}\right) \le \frac{\tau}{T - \tau} \le -\frac{B}{R_{\min} \ln 2}\left(W_{-1}\left[-\frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{B P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta} \exp\left(\frac{R_{\min} \ln 2}{B}\left(1 - \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right)\right)\right] + \frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{P_{\max} B \eta \left(\sum_{i=1}^{M} g_i^2\right)}\right).$$
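The corresponding proof is presented as follows.
Proof. 
First of all, we use simple mathematical manipulations to transform the effective throughput constraint to
$$-\frac{R_{\min} \ln 2}{B}\left(\frac{\tau}{T - \tau} + \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right) \exp\left(-\frac{R_{\min} \ln 2}{B}\left(\frac{\tau}{T - \tau} + \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right)\right) \le -\frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{B P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta} \exp\left(-\frac{R_{\min} \ln 2}{B}\left(\frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta} - 1\right)\right).$$
Using the property of the Lambert W function, Equation (8) can be derived.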
Also note that $\frac{\tau}{T - \tau}$ is equal to or greater than zero. For brevity, let $k = \frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{P_{\max} B \eta \left(\sum_{i=1}^{M} g_i^2\right)}$ and $A = -k \exp\left(\frac{R_{\min} \ln 2}{B}\left(1 - \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right)\right)$, so that the upper bound on $\frac{\tau}{T - \tau}$ in Equation (8) reads $-\frac{B}{R_{\min} \ln 2}\left(W_{-1}[A] + k\right)$. This upper bound has to be greater than zero to ensure the existence of a feasible solution. Accordingly, we have $W_{-1}[A] < -k$. When $k \ge 1$, we have $-k = W_{-1}[-k e^{-k}]$ and this inequality is equivalent to $\frac{R_{\min} \ln 2}{B}\left(1 - \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta}\right) < -k$, since $W_{-1}[x]$ is a monotonically decreasing function. Because $\frac{R_{\min} \ln 2}{B}$ is greater than zero and $\frac{R_{\min} \ln 2}{B} \cdot \frac{(P_{at} g_{at} + \sigma^2)}{P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) \eta} = k$, this inequality reduces to $\frac{R_{\min} \ln 2}{B} < 0$ and cannot hold. When $k < 1$, the upper bound $-\frac{B}{R_{\min} \ln 2}\left(W_{-1}[A] + k\right)$ must be greater than zero because $W_{-1}[A]$ is not greater than $-1$. Accordingly, $k$ needs to be less than 1 to satisfy the minimum effective throughput constraint, which means $P_{\max} \left(\sum_{i=1}^{M} g_i^2\right) > \frac{R_{\min} (P_{at} g_{at} + \sigma^2) \ln 2}{B \eta}$. Meanwhile, $\sum_{i=1}^{M} g_i^2$ is a random variable, which may approach zero, so $P_{\max}$ has to be large enough to make this condition satisfiable. Combining Equation (6) and the derivation process in Section 3.1.1, the optimal charging duration $\tau$ should approach zero and, as such, the effective throughput actually approaches $B \log_2\left(1 + x \left(\sum_{i=1}^{M} g_i^2\right) \eta / (P_{at} g_{at} + \sigma^2)\right)$. Here, $x$ was defined in Section 3.1.1. Accordingly, the minimum effective throughput constraint is equivalent to the condition that the effective received signal-to-noise ratio (SNR) is greater than $2^{R_{\min}/B} - 1$, denoted by $\gamma(R_{\min})$. According to the above analysis, with optimal parameters, the UAV's received SNR is equal to $\frac{x \left(\sum_{i=1}^{M} g_i^2\right) \eta}{(P_{at} g_{at} + \sigma^2)}$.
Accordingly, $x$ needs to be not less than $\frac{\gamma(R_{\min})}{\left(\sum_{i=1}^{M} g_i^2\right) \eta / (P_{at} g_{at} + \sigma^2)}$. Then, we can rewrite the optimal $x$ as
$$\overline{x^*} = \max\left\{x^*, \ \frac{\gamma(R_{\min})}{\left(\sum_{i=1}^{M} g_i^2\right) \eta / (P_{at} g_{at} + \sigma^2)}\right\}.$$
On the basis of Equation (7), the optimal $\tau_1, \tau_2, \ldots, \tau_M$ can be updated accordingly.
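The QoS adjustment amounts to clamping $x^*$ from below so that the received SNR reaches $\gamma(R_{\min}) = 2^{R_{\min}/B} - 1$. A minimal sketch follows, assuming the received SNR is $x (\sum_i g_i^2) \eta / (P_{at} g_{at} + \sigma^2)$ as stated above; the function names and numerical values are hypothetical.

```python
def snr_threshold(r_min, bandwidth):
    """gamma(R_min) = 2**(R_min / B) - 1, the SNR needed to reach R_min."""
    return 2.0 ** (r_min / bandwidth) - 1.0


def qos_adjusted_x(x_star, r_min, bandwidth, gains, eta, p_at, g_at, sigma2):
    """Clamp x* from below so that the received SNR is at least gamma(R_min)."""
    snr_per_x = sum(g ** 2 for g in gains) * eta / (p_at * g_at + sigma2)
    x_floor = snr_threshold(r_min, bandwidth) / snr_per_x
    return max(x_star, x_floor)


# Hypothetical example values: R_min = B, so the SNR threshold is exactly 1.
common = dict(r_min=1.0, bandwidth=1.0, gains=[0.3, 0.5], eta=0.8,
              p_at=0.01, g_at=0.1, sigma2=1e-3)
x_unchanged = qos_adjusted_x(0.17, **common)   # unconstrained optimum already meets QoS
x_raised = qos_adjusted_x(0.001, **common)     # unconstrained optimum violates QoS
```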

3.2. CDMA

In CDMA mode, to maximize the overall energy efficiency of the UAV, we formulate an optimization problem as follows
$$\max_{P_S, \tau_{wp}} E_{ff} = \frac{\sum_{i=1}^{M} B (T - \tau_{wp}) \log_2\left(1 + \frac{P_S \eta g_i^2 \tau_{wp}}{(P_{at} g_{at} + \sigma^2)(T - \tau_{wp})}\right)}{P_S \tau_{wp} + P_C T}, \quad \text{s.t.} \ 0 < P_S \le P_{\max}, \ 0 \le \tau_{wp} \le T.$$
It is not possible to derive a closed-form optimal solution in this case. As mentioned above, iterative algorithms have high online computational complexity and cannot determine near-optimal transmission parameters in real time, e.g., [7,8,9,18,19,20,21]. Since machine learning can train a DNN model offline to perform adaptive data collection, it can offer lower online latency than conventional iterative algorithms. As such, we design a DDPG approach to arrive at a near-optimal solution.
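To make the CDMA objective concrete, the sketch below evaluates it and runs a coarse exhaustive grid search over $(P_S, \tau_{wp})$, in the spirit of the exhaustive-search baseline mentioned in the paper. The grid resolution, function names and example values are assumptions chosen here for illustration; the paper does not specify how its baseline is discretized.

```python
import math


def cdma_energy_efficiency(p_s, tau_wp, gains, eta, bandwidth, t_total,
                           p_c, p_at, g_at, sigma2):
    """CDMA objective: collected bits divided by the energy spent by the UAV."""
    if not 0.0 < tau_wp < t_total:
        return 0.0  # no charging time or no uplink time: nothing is collected
    tau_up = t_total - tau_wp
    noise = p_at * g_at + sigma2
    bits = sum(
        bandwidth * tau_up
        * math.log2(1.0 + p_s * eta * g ** 2 * tau_wp / (noise * tau_up))
        for g in gains
    )
    return bits / (p_s * tau_wp + p_c * t_total)


def exhaustive_search(gains, p_max, t_total, n_power=50, n_tau=200, **params):
    """Coarse grid search; returns (best efficiency, best P_S, best tau_wp)."""
    best = (0.0, None, None)
    for i in range(1, n_power + 1):
        p_s = p_max * i / n_power
        for j in range(1, n_tau):
            tau_wp = t_total * j / n_tau
            ee = cdma_energy_efficiency(p_s, tau_wp, gains,
                                        t_total=t_total, **params)
            if ee > best[0]:
                best = (ee, p_s, tau_wp)
    return best


# Hypothetical example values, chosen only for illustration.
params = dict(eta=0.8, bandwidth=1.0, p_c=0.5, p_at=0.01, g_at=0.1, sigma2=1e-3)
best_ee, best_p, best_tau = exhaustive_search([0.3, 0.5, 0.8], p_max=2.0,
                                              t_total=1.0, **params)
```

The cost of such a search grows with the product of the two grid sizes and must be repeated for every new channel realization, which is the online burden the learned policy is meant to avoid.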

DDPG Solution

Applying the DDPG approach, we construct one critic network and one actor network, as illustrated in Figure 2, where $\theta^Q$ denotes the parameter set of the critic network and $\theta^\mu$ denotes the parameter set of the actor network. The input of the critic network is the state vector and the action vector, and its output is the estimated Q value. The input of the actor network is the state vector, and its output is the action vector. During the training process, the critic network feeds back an estimated Q value for the current action to the actor network, whose parameter set is then updated using chain-rule-based gradient ascent. The adopted critic-actor configuration is presented in Figure 2, where $s_t$ is the state vector at time instant $t$ and $a_t$ is the action vector at time instant $t$. In this case, we define the state vector $s_t$ as $[h_1, h_2, \ldots, h_M]^T$, which consists of the channel gains from all sensors to the UAV at time instant $t$. The action vector $a_t$ is defined as $[P_S, \tau_{wp}]^T$. Following the DDPG approach, we use the expected reward function to evaluate the action policy [27], shown as follows
$$y_t = R_t(s_t, a_t) + \gamma\, Q\left(s_{t+1}, \mu(s_{t+1} | \theta^\mu) \,|\, \theta^Q\right).$$
Here, $y_t$ is the expected reward for the adopted action $a_t$, $\gamma$ is the discount factor, $R_t(s_t, a_t)$ is the instantaneous reward with given state $s_t$ and action $a_t$, $s_{t+1}$ is the state at time instant $t + 1$, and $\mu(s_{t+1} | \theta^\mu)$ is the output of the actor network. Prior to offline training, state transition tuples $\{s_i, a_i, R_i, s_{i+1}\}$, $i = 1, 2, \ldots, N$, need to be randomly generated first and then put into a memory buffer of size $K$. Note that the expected reward can be estimated using the critic network. During each training iteration, we need to obtain an action vector from the current actor network and then calculate the resulting reward as well as the state vector at the next time instant. Subsequently, one new experience tuple is obtained and randomly replaces one existing experience tuple in the memory buffer. Note that within each training iteration, the action vector is the actor output with random exploration, shown as below
$$a_t = \mu(s_t | \theta^\mu) + N_t.$$
$N_t$ denotes a random variable following a normal distribution with mean zero and variance $v$. At the end of each training iteration, $v$ is updated to $\beta v$, where $\beta$ is a constant ranging from 0 to 1. After that, we can obtain the renewed critic network parameter $\tilde{\theta}^Q$ by minimizing the loss function, shown as follows
$$\tilde{\theta}^Q = \arg\min_{\theta^Q} \frac{1}{N} \sum_{i=1}^{N} \left(y_i - Q(s_i, a_i | \theta^Q)\right)^2.$$
The state transition tuples $\{s_i, a_i, R_i, s_{i+1}\}$, $i = 1, 2, \ldots, N$, are directly extracted from the memory buffer. In this case, $R_t(s_t, a_t)$ is defined as the instantaneous energy efficiency of the UAV, denoted by
$$R_t(s_t, a_t) = \frac{\sum_{i=1}^{M} B (T - \tau_{wp}) \log_2\left(1 + \frac{P_S \eta g_i^2 \tau_{wp}}{(P_{at} g_{at} + \sigma^2)(T - \tau_{wp})}\right)}{(P_S \tau_{wp} + P_C T) H}.$$
Applying chain-rule-based gradient ascent, by randomly extracting $M$ tuples from the memory buffer, we further calculate the renewed parameter set of the actor network as follows
$$\tilde{\theta}^\mu = \theta^\mu + \frac{1}{M} \sum_{i=1}^{M} \nabla_a Q(s, a | \theta^Q)\big|_{s = s_i,\, a = \mu(s_i)} \nabla_{\theta^\mu} \mu(s | \theta^\mu)\big|_{s = s_i}.$$
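Then, we arrive at the updated parameter set of the critic network as follows
$$\theta^Q \leftarrow k_1 \tilde{\theta}^Q + (1 - k_1)\, \theta^Q,$$
and the updated parameter set of the actor network as follows
$$\theta^\mu \leftarrow k_1 \tilde{\theta}^\mu + (1 - k_1)\, \theta^\mu,$$
where $\tilde{\theta}^Q$ and $\tilde{\theta}^\mu$ denote the renewed parameter sets obtained from the loss minimization and the gradient-ascent step, the unmarked symbols on the right-hand side are the parameter sets held before those steps, and $k_1$ is the exploration constant during the process of learning. The pseudo-code is shown in Algorithm 1.
The above training process is repeated in each training iteration. After a sufficient number of training iterations, the actor network is well trained and can determine near-optimal transmission parameters in real time.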
Algorithm 1 Pseudo-code for action policy training
  • Initialize the critic network $Q(s, a | \theta^Q)$ and the actor network $\mu(s | \theta^\mu)$.
  • Initialize the network parameters $\theta^Q$ and $\theta^\mu$.
  • for episode $\in [1, 2, \ldots, M]$ do
  •    Initialize a Rayleigh fading random process and generate the initial state $s_t$.
  •    for $t \in [1, 2, \ldots, T]$ do
  •      Select action $a_t = \mu(s_t | \theta^\mu) + N_t$.
  •      Randomly generate the new state $s_{t+1}$.
  •      Use $a_t$ and $s_{t+1}$ to calculate the resulting reward $R_t$.
  •      Save the state transition tuple $\{s_t, a_t, R_t, s_{t+1}\}$ into the memory buffer.
  •      Extract a random minibatch of $N$ state transition tuples $\{s_i, a_i, R_i, s_{i+1}\}$ from the memory buffer, $i = 1, 2, \ldots, N$.
  •      Set the expected value function $y_i$ to $R_i + \gamma Q(s_{i+1}, \mu(s_{i+1} | \theta^\mu) | \theta^Q)$.
  •      Update the critic network parameter to $\tilde{\theta}^Q$ by minimizing the loss function as in Equation (13).
  •      Update the actor network parameter by performing gradient ascent, shown as $\tilde{\theta}^\mu = \theta^\mu + \nabla_a Q(s, a | \theta^Q)\big|_{s = s_i,\, a = \mu(s_i)} \nabla_{\theta^\mu} \mu(s | \theta^\mu)$.
  •      Update the corresponding network parameters as $\theta^Q \leftarrow k_1 \tilde{\theta}^Q + (1 - k_1) \theta^Q$ and $\theta^\mu \leftarrow k_1 \tilde{\theta}^\mu + (1 - k_1) \theta^\mu$.
  •    end for
  • end for
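The bookkeeping around the networks in Algorithm 1 can be sketched without any deep learning library. The snippet below illustrates three pieces described in this section: a fixed-size memory buffer in which a new tuple randomly replaces an old one, Gaussian exploration noise whose variance decays as $v \leftarrow \beta v$, and the blending step with constant $k_1$. It is an illustrative sketch, not the authors' implementation: the class and function names are hypothetical, parameters are represented as plain lists of floats, clipping the action to its bounds is an added assumption, and the blending step follows one reading of the update rule (new parameters weighted by $k_1$, previous ones by $1 - k_1$).

```python
import random


class ReplayBuffer:
    """Fixed-size memory buffer; when full, a new tuple replaces a random one."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.data = []
        self.rng = rng or random.Random()

    def add(self, transition):
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.rng.randrange(self.capacity)] = transition

    def sample(self, n):
        """Draw a random minibatch of n distinct stored transitions."""
        return self.rng.sample(self.data, n)


def explore(action, variance, rng, low, high):
    """a_t = mu(s_t) + N_t with N_t ~ Normal(0, variance), clipped to bounds."""
    std = variance ** 0.5
    return [min(max(a + rng.gauss(0.0, std), lo), hi)
            for a, lo, hi in zip(action, low, high)]


def blend(previous, renewed, k1):
    """theta <- k1 * renewed + (1 - k1) * previous, element by element."""
    return [k1 * n + (1.0 - k1) * p for p, n in zip(previous, renewed)]


# Hypothetical usage with placeholder transitions (state, action, reward, next).
rng = random.Random(0)
buffer = ReplayBuffer(capacity=4, rng=rng)
for t in range(10):
    buffer.add((t, [1.0, 0.5], 0.0, t + 1))
batch = buffer.sample(2)

variance, beta = 1.0, 0.9
for _ in range(3):               # decay the exploration variance: v <- beta * v
    variance *= beta
noisy_action = explore([1.0, 0.5], variance, rng, low=[0.0, 0.0], high=[2.0, 1.0])
blended = blend([0.0, 0.0], [1.0, 2.0], k1=0.1)
```

In a full implementation the lists passed to `blend` would be the weight tensors of the critic and actor networks, and the minibatch returned by `sample` would feed the loss minimization and the gradient-ascent step.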

4. Numerical Results

To test the validity of the theoretical analysis and the proposed optimization approach, we present some numerical examples on important quantities, such as the maximum energy efficiency and the effective throughput. All the parameters used in the simulation are given in Table 1. To simplify the simulation process, $P_{at}$ is set to a very small value. All channel gains are assumed to follow the same Rician distribution, all sensors' positions are assumed to follow a normal distribution in a circular area, and the UAV is assumed to be in a stationary hovering state. Since the noise at the sensor is usually very small, the noise power is set to $1 \times 10^{-8}$ W.
Figure 3 presents the minimum energy consumption with respect to channel power gain under TDMA mode. Note that under TDMA mode, the minimum energy consumption can be calculated by substituting the optimal charging power and the optimal charging duration into the objective function in Section 3.1. As expected, energy efficiency monotonically increases with channel power gain. This is because, when the channel condition deteriorates, the charging power level needs to be higher to keep the amount of collected data unchanged.
Figure 4 presents the effective transmit power $x$ with respect to channel power gain under TDMA mode. As expected, the optimal power level generally decreases as channel power gain increases. We also see that the rate of decrease slows slightly as channel power gain increases. This is because, when the channel condition improves, less charging power is generally needed to collect a fixed amount of data.
Figure 5 presents the effective throughput with respect to channel power gain under TDMA mode. The resulting throughput increases as channel power gain increases, because the received SNR increases with channel power gain. Note that since the optimal charging power and charging duration vary with channel power gain, the effective throughput increases with channel power gain in an approximately linear manner.
Figure 6 presents the numerical results on the average energy efficiency per sensor under CDMA mode, obtained from gradient ascent, the DDPG-based approach and exhaustive search. It shows that the result from exhaustive search is very close to that from the DDPG approach and, as such, our proposed DDPG approach achieves near-optimal performance. Furthermore, we compare the performance of the DDPG approach with that of gradient ascent and show that the DDPG approach performs much better. Note that because the step size is relatively large, the result from gradient ascent is not good.
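The sensitivity of gradient ascent to its step size can be illustrated with a generic sketch. The paper does not state which gradient-ascent variant it uses, so the projected, finite-difference version below, the toy concave objective and all numerical values are assumptions made purely for illustration.

```python
def numerical_gradient(f, point, eps=1e-6):
    """Central-difference gradient of f at the given point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def projected_gradient_ascent(f, start, lower, upper, step, iters):
    """Gradient ascent with each iterate clipped back into the box [lower, upper]."""
    x = list(start)
    for _ in range(iters):
        g = numerical_gradient(f, x)
        x = [min(max(xi + step * gi, lo), hi)
             for xi, gi, lo, hi in zip(x, g, lower, upper)]
    return x


def toy_objective(v):
    """Hypothetical concave stand-in for the efficiency surface, peak at (1.0, 0.3)."""
    p_s, tau_wp = v
    return -(p_s - 1.0) ** 2 - (tau_wp - 0.3) ** 2


box_low, box_high = [0.0, 0.0], [2.0, 1.0]
small_step = projected_gradient_ascent(toy_objective, [0.2, 0.8],
                                       box_low, box_high, step=0.1, iters=200)
large_step = projected_gradient_ascent(toy_objective, [0.2, 0.8],
                                       box_low, box_high, step=1.0, iters=200)
```

With the small step the iterate settles at the peak, while with the large step it keeps overshooting and ends far from it, which matches the qualitative remark above about a relatively large step size.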

5. Conclusions

In this paper, we investigated energy efficiency maximization in a wirelessly-powered sensor network with an attacker. In the case of TDMA, we maximized the energy efficiency by deriving the closed-form optimal charging power, charging duration and transmission durations. In the case of CDMA, we proposed a DDPG-based approach to obtain an action policy that determines near-optimal transmission parameters in real time and reduces the online computational complexity. These results are valuable for the design of energy-efficient wirelessly-powered sensor networks.

Author Contributions

Algorithm and System design, F.X.; Simulation and Analysis, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lu, X.; Luong, N.C.; Hoang, D.T.; Niyato, D.; Xiao, Y.; Wang, P. Secure Wirelessly Powered Networks at the Physical Layer: Challenges, Countermeasures, and Road Ahead. Proc. IEEE 2013, 110, 1410–1423. [Google Scholar] [CrossRef]
  2. Zhu, T.; Shen, L. Secure and Efficient Certificateless Linearly Homomorphic Signature Scheme for Wireless Sensor Networks. In Proceedings of the International Conference on Big Data and Privacy Computing, Macau, China, 10–12 January 2024; pp. 113–121. [Google Scholar]
  3. Lei, H.; Luo, H.; Park, K.H.; Ansari, I.S.; Lei, W.; Pan, G.; Alouini, M.S. On Secure Mixed RF-FSO Systems with TAS and Imperfect CSI. IEEE Wirel. Commun. Lett. 2020, 68, 4461–4475. [Google Scholar] [CrossRef]
  4. Saxena, V.N.; Gupta, J.; Dwivedi, V.K. Secured End-to-End FSO-VLC-Based IoT Network with Randomly Positioned VLC: Known and Unknown CSI. IEEE Internet Things J. 2023, 10, 1347–1357. [Google Scholar] [CrossRef]
  5. He, D.; Liu, C.; Wang, H.; Quek, T.Q.S. Learning-Based Wireless Powered Secure Transmission. IEEE Wirel. Commun. Lett. 2019, 8, 600–603. [Google Scholar] [CrossRef]
  6. Feng, R.; Li, Q.; Zhang, Q.; Qin, J. Robust Secure Transmission in MISO Simultaneous Wireless Information and Power Transfer System. IEEE Trans. Veh. Technol. 2015, 64, 400–405. [Google Scholar] [CrossRef]
  7. Limbasiya, T.; Das, D.; Das, S.K. MComIoV: Secure and Energy-Efficient Message Communication Protocols for Internet of Vehicles. IEEE/ACM Trans. Netw. 2021, 29, 1349–1361. [Google Scholar] [CrossRef]
  8. Nghia, N.T.; Tuan, H.D.; Duong, T.Q.; Poor, H.V. MIMO Beamforming for Secure and Energy-Efficient Wireless Communication. IEEE Signal Process. Lett. 2017, 24, 236–239. [Google Scholar] [CrossRef]
  9. Cai, Y.; Wei, Z.; Li, R.; Ng, D.W.K.; Yuan, J. Joint Trajectory and Resource Allocation Design for Energy-Efficient Secure UAV Communication Systems. IEEE Trans. Commun. 2020, 68, 4536–4553. [Google Scholar] [CrossRef]
  10. Guo, S.; Wang, F.; Yang, Y.; Xiao, B. Energy-Efficient Cooperative Transmission for Simultaneous Wireless Information and Power Transfer in Clustered Wireless Sensor Networks. IEEE Trans. Commun. 2015, 63, 4405–4417. [Google Scholar] [CrossRef]
  11. Tang, J.; Shojaeifard, A.; So, D.K.C.; Wong, K.K.; Zhao, N. Energy Efficiency Optimization for CoMP-SWIPT Heterogeneous Networks. IEEE Trans. Commun. 2018, 66, 6368–6383. [Google Scholar] [CrossRef]
  12. Yang, K.; Yu, Q.; Leng, S.; Fan, B.; Wu, F. Data and Energy Integrated Communication Networks for Wireless Big Data. IEEE Access 2016, 4, 713–723. [Google Scholar] [CrossRef]
  13. Sun, Q.; Li, L.; Mao, J. Simultaneous Information and Power Transfer Scheme for Energy Efficient MIMO Systems. IEEE Trans. Veh. Technol. 2014, 18, 600–603. [Google Scholar] [CrossRef]
  14. Akbar, S.; Deng, Y.; Nallanathan, A.; Elkashlan, M.; Aghvami, A. Simultaneous Wireless Information and Power Transfer in K-Tier Heterogeneous Cellular Networks. IEEE Trans. Wireless Commun. 2016, 15, 5804–5818. [Google Scholar] [CrossRef]
  15. Huang, H.; Li, C.; Wu, C.; Wu, F.; Li, L. Energy Efficiency Optimization Method under the Scene of Wireless Information and Energy Simultaneous Transmission. In Proceedings of the International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, Zhengzhou, China, 18–20 October 2018; pp. 321–328. [Google Scholar]
  16. Ju, H.; Zhang, R. A Novel Mode Switching Scheme Utilizing Random Beamforming for Opportunistic Energy Harvesting. IEEE Trans. Wirel. Commun. 2014, 13, 2150–2162. [Google Scholar] [CrossRef]
  17. Nasir, A.A.; Tuan, H.D.; Duong, T.Q.; Poor, H.V. Secure and Energy-Efficient Beamforming for Simultaneous Information and Energy Transfer. IEEE Trans. Wirel. Commun. 2017, 16, 7523–7537. [Google Scholar] [CrossRef]
  18. Zhang, X.; Zhang, X.; Han, L. An Energy Efficient Internet of Things Network Using Restart Artificial Bee Colony and Wireless Power Transfer. IEEE Access 2019, 7, 12686–12695. [Google Scholar] [CrossRef]
  19. Sheng, M.; Wang, L.; Wang, X.; Zhang, Y.; Xu, C.; Li, J. Energy Efficient Beamforming in MISO Heterogeneous Cellular Networks with Wireless Information and Power Transfer. IEEE J. Sel. Areas Commun. 2016, 34, 954–968. [Google Scholar] [CrossRef]
  20. Chen, X.; Wang, X.; Chen, X. Energy-Efficient Optimization for Wireless Information and Power Transfer in Large-Scale MIMO Systems Employing Energy Beamforming. IEEE Wirel. Commun. Lett. 2013, 2, 667–670. [Google Scholar] [CrossRef]
  21. He, S.; Huang, Y.; Jin, S.; Yu, F.; Yang, L. Max-Min Energy Efficient Beamforming for Multicell Multiuser Joint Transmission Systems. IEEE Commun. Lett. 2013, 17, 1956–1959. [Google Scholar] [CrossRef]
  22. Wang, C.; Li, J.; Yang, Y.; Ye, F. Combining Solar Energy Harvesting with Wireless Charging for Hybrid Wireless Sensor Networks. IEEE Trans. Mob. Comput. 2018, 17, 560–576. [Google Scholar] [CrossRef]
  23. Li, D.; Saad, W.; Guvenc, I.; Mehbodniya, A.; Adachi, F. Decentralized Energy Allocation for Wireless Networks with Renewable Energy Powered Base Stations. IEEE Trans. Commun. 2015, 63, 2126–2142. [Google Scholar] [CrossRef]
  24. Mao, Y.; Zhang, J.; Letaief, K.B. Grid Energy Consumption and QoS Tradeoff in Hybrid Energy Supply Wireless Networks. IEEE Trans. Wirel. Commun. 2016, 15, 3573–3586. [Google Scholar] [CrossRef]
  25. Mnih, V.; Badia, A.P.; Mirza, M.; Graves, A.; Harley, T.; Lillicrap, T.P.; Silver, D.; Kavukcuoglu, K. Asynchronous Methods for Deep Reinforcement Learning. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; pp. 1–10. [Google Scholar]
  26. Liu, C.H.; Chen, Z.; Zhan, Y. Energy-Efficient Distributed Mobile Crowd Sensing: A Deep Learning Approach. IEEE J. Sel. Areas Commun. 2019, 37, 1262–1276. [Google Scholar] [CrossRef]
  27. Liu, C.H.; Ma, X.; Gao, X.; Tang, J. Distributed Energy-Efficient Multi-UAV Navigation for Long-Term Communication Coverage by Deep Reinforcement Learning. IEEE Trans. Mob. Comput. 2019, 19, 1274–1285. [Google Scholar] [CrossRef]
  28. Qiu, C.; Hu, Y.; Chen, Y.; Zeng, B. Deep Deterministic Policy Gradient (DDPG)-Based Energy Harvesting Wireless Communications. IEEE Internet Things J. 2019, 6, 8577–8588. [Google Scholar] [CrossRef]
  29. Smida, B.; Affes, S.; Jamaoui, K.; Mermelstein, P. A Multicarrier-CDMA Space–Time Receiver with Full-Interference-Suppression Capabilities. IEEE Trans. Veh. Technol. 2008, 57, 363–379. [Google Scholar] [CrossRef]
  30. Xu, D.; Li, Q. Joint Power Control and Time Allocation for Wireless Powered Underlay Cognitive Radio Networks. IEEE Commun. Lett. 2017, 6, 294–297. [Google Scholar] [CrossRef]
  31. Blondeau, F.; Monir, A. Evaluation of the Lambert W function and application to generation of generalized Gaussian noise with exponent 1/2. IEEE Trans. Signal Process. 2002, 50, 2610–2615. [Google Scholar]
Figure 1. Wirelessly-powered secure sensor network using CDMA mode.
Figure 2. Critic and Actor network in DDPG algorithm.
Figure 3. Maximum energy efficiency, TDMA mode.
Figure 4. Optimal effective transmit power x with respect to channel power gain, TDMA mode.
Figure 5. Effective throughput with respect to channel power gain, TDMA mode.
Figure 6. Effective energy efficiency per sensor, CDMA mode.
Table 1. Parameters used in the simulation.

| Parameter | Meaning | Value |
|---|---|---|
| σ² | Noise power | 1 × 10⁻⁸ W |
| B | Available bandwidth | 10 kHz |
| ρ | Path loss exponent | 2 |
| P_C | Circuit power of source node | 0.01 W |
| η | DC conversion efficiency | 0.8 |

Share and Cite

MDPI and ACS Style

Xu, F.; Zhang, X. Intelligent Energy Efficiency Maximization for Wirelessly-Powered UAV-Assisted Secure Sensor Network. Sensors 2025, 25, 1534. https://doi.org/10.3390/s25051534
