1. Introduction
The cyber-physical-social system (CPSS), which integrates social systems with cyber-physical systems (CPS), is attracting increasing attention. As a novel paradigm, the CPSS enables deep interaction between the social space and the CPS or Internet of Things (IoT), which brings significant changes to the service and management of complex physical systems. Architectures, methods, and schemes of CPSS for different application areas have been proposed [1,2,3,4,5,6,7,8,9]. The wireless sensor network (WSN), as the sensing network, is one of the most important parts of the CPSS. Compared with CPS or IoT, the CPSS contains a large number of social sensors, such as smart phones and tablets, which have a great impact on data collection, transmission, and processing. The novel WSN composed of social and physical sensors undertakes the task of collecting sensing information and plays an important role in CPSS. Since the sensor nodes are battery-powered and their battery capacities are limited, improving the energy efficiency of transmission for the novel WSN in CPSS remains one of the most important research topics. In recent years, the emerging collaborative beamforming (CB) technique has provided a novel solution for improving the energy efficiency of information transmission in traditional WSN. However, because social sensors exhibit dynamic social mobility, the existing studies of CB in traditional WSN composed of physical sensors are not applicable to the CPSS. Our objective is to integrate the social dynamic model with the collaborative beamforming technique to solve the problem of coordinated transmission in a WSN with social and physical sensors for the CPSS.
In CPS, the traditional WSN, which considers only fixed physical sensors, is widely used to sense or collect large-scale information owing to its low cost, low power, and small devices. Energy-efficient schemes for the traditional WSN have received great attention in the past few decades. In recent years, the emergence of the collaborative beamforming (CB) technique has brought a novel solution for improving the energy efficiency of WSNs [10]. The CB technique requires that all fixed sensor nodes in the WSN be divided into clusters. All fixed sensor nodes in a cluster collaboratively transmit the sensing information to the intended base station (BS). During transmission, phase synchronization is achieved by existing synchronization methods. CB can enhance the transmission gain in the intended direction and reduce the interference power in other directions. In other words, by cooperating, these fixed sensor nodes can obtain a beampattern with a stable mainlobe and low sidelobes. Since the sensor nodes are randomly deployed in the given area, their random distribution makes the sidelobe amplitude of the beampattern unpredictable [11,12,13]. This means that the node locations have an important impact on the sidelobe amplitude [14]. However, a high sidelobe level can cause strong interference to the other, unintended BSs. Therefore, achieving the required sidelobe level can effectively improve the service performance of CPS.
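To make the dependence of the sidelobes on node locations concrete, the following minimal sketch evaluates a far-field beampattern for nodes dropped uniformly in a disk. It assumes ideal phase synchronization, unit-amplitude transmissions, and a unit wavelength; the function names and all numeric values are illustrative choices, not taken from this paper.

```python
import cmath
import math
import random


def random_nodes(n, radius, rng):
    """Draw n node positions uniformly over a disk, as polar pairs (r, psi)."""
    return [(radius * math.sqrt(rng.random()), rng.uniform(-math.pi, math.pi))
            for _ in range(n)]


def array_factor(nodes, phi, phi0, wavelength=1.0):
    """Normalized far-field magnitude at azimuth phi (radians) when every
    node pre-compensates its phase so that signals add up at azimuth phi0."""
    k = 2.0 * math.pi / wavelength
    total = sum(
        cmath.exp(1j * k * r * (math.cos(phi - psi) - math.cos(phi0 - psi)))
        for r, psi in nodes
    )
    return abs(total) / len(nodes)


rng = random.Random(0)                       # fixed seed for repeatability
nodes = random_nodes(n=30, radius=5.0, rng=rng)
phi0 = 0.0                                   # assumed direction of the intended BS
angles = [math.radians(d) for d in range(-180, 180)]
pattern = [array_factor(nodes, phi, phi0) for phi in angles]

mainlobe = array_factor(nodes, phi0, phi0)   # all phases cancel, so this is 1
# Peak sidelobe, measured outside an (assumed) 10-degree window around phi0.
peak_sidelobe = max(p for phi, p in zip(angles, pattern)
                    if abs(phi - phi0) > math.radians(10))
```

Re-running the sketch with a different seed changes `peak_sidelobe` while `mainlobe` stays at 1, which is the unpredictability of the sidelobe amplitude described above.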
On the other hand, the existing CB optimization methods for sidelobe control in WSN mainly consider fixed and static sensor nodes. These methods generally include transmission coefficient optimization [15,16,17,18,19,20] and sensor node selection [21,22,23]. Some of these methods integrate coefficient optimization into node selection: they first use an intelligent algorithm to optimize the peak sidelobe level and then select an optimal sensor node set based on a regular antenna array. However, these works have not analyzed the computational complexity or presented an implementation scheme for information transmission.
As mentioned above, although the existing algorithms substantially reduce the peak sidelobe or control the sidelobe in the direction of the unintended BS, they assume that the sensor nodes in the traditional WSN are static. Moreover, the optimization results may cause some sensor nodes to be selected more often than others, which reduces the network lifetime. Because the social sensors of the novel WSN in CPSS are socially dynamic, the existing CB optimization methods are not applicable to a WSN that includes both mobile social sensors and static physical sensors. In this paper, we integrate the social dynamic model into CB to solve this problem. A corresponding transmission optimization method based on reinforcement learning is developed to improve the transmission efficiency of the WSN in CPSS. Moreover, the communication mechanism for implementing the transmission scheme and a detailed performance analysis are presented.
The remainder of this paper is organized as follows. The related work is presented in Section 2. Section 3 presents the WSN sensing scenario based on CB in CPSS, including the model of social dynamic mobility and the model of CB transmission, and formulates the CB transmission optimization problem. Section 4 presents a CB transmission scheme based on reinforcement learning that maximizes the SNR performance while constraining the interference power in the direction of the unintended BSs in CPSS. Section 5 evaluates the proposed methods via extensive simulations. Finally, we conclude this work in Section 6.
2. Related Work
In recent years, the cyber-physical-social system (CPSS) has attracted a great amount of attention because it integrates human activity with CPS or IoT. Many applications based on CPSS have been developed, such as intelligent transportation [2] and smart city [3]. The corresponding architecture, methodology, control, and command schemes have also been proposed [1,4,5]. In particular, the sensing network architecture based on WSN, as an important part of CPSS, plays a significant role in sensing, transmitting, and collecting information. Since a WSN composed of social and physical sensor nodes differs significantly from the traditional WSN, the research on traditional WSN is not applicable to CPSS because of the dynamic social mobility of social sensors. Therefore, some novel methods have been proposed to solve the emerging problems in CPSS. For example, a moving-centroid-based routing protocol is presented to deal with the incompletely predictable cyber devices in cyber-physical-social distributed systems [6]. In [7], the authors studied a fusion scheme of cellular network and wireless sensor for CPSS. To prolong the lifetime of the IoT, a novel rendezvous data routing scheme based on lower sensors is proposed to achieve data transmission for scalable CPS networking infrastructure [8]. In addition, a new sensor trajectory planning method is proposed to solve the trajectory planning problem for robotic CPSS [9]. However, none of the above-mentioned methods considers improving the transmission efficiency of the novel WSN in CPSS. In [24], the authors presented a framework and infrastructure of collaborative CPS to improve the management efficiency of the increasing number of devices, taking into account the importance of collaboration among components.
On the other hand, the collaborative beamforming technique has been studied extensively in recent years, as it can increase the transmission range and improve the energy efficiency. The related research on CB transmission in WSN can be divided into three classes: beampattern analysis, sidelobe level optimization, and synchronization schemes. Beampattern analysis mainly focuses on the performance of the mainlobe, sidelobe, and directivity for different sensor node distributions [10]. The sensor node distributions studied mainly include the uniform distribution [11], the Gaussian distribution [12], and arbitrary distributions [13]. These analyses characterize the performance of different node distributions and establish the feasibility of the CB technique in WSN. However, for WSN with small-size nodes, the beampattern performance with random node distribution can hardly meet the transmission requirement in practical applications. To address this problem, researchers mainly focus on minimizing the peak sidelobe level and controlling the sidelobe in the directions of unintended receivers. These two optimization objectives can be achieved by transmission coefficient optimization and sensor node selection, which can be differentiated according to the characteristic of the transmission coefficient. When the transmission coefficient is continuous, the corresponding methods mainly optimize the transmission power coefficient. If each sensor node in the WSN has only two states, i.e., on or off, node selection is generally used to solve the optimization problem. For transmission coefficient optimization, the minimum peak sidelobe or maximum energy efficiency problem for CB in WSN is non-convex, so intelligent heuristic algorithms are often employed. For example, the authors in [17,21] utilized particle swarm optimization (PSO) and the firefly algorithm (FA), respectively, to optimize the amplitude coefficient of each collaborative node. Then, a collaborative node set is selected from the candidate nodes based on a circular antenna array. Similarly, the authors in [20] utilized the genetic algorithm (GA) to minimize the peak sidelobe amplitude. Furthermore, several coefficient optimization methods have been proposed to extend network lifetime. A multi-objective beampattern optimization problem is formulated in [18], and a metaheuristic method is proposed to calculate the transmission coefficient. This method can ensure a low peak sidelobe level and low energy consumption in comparison with the existing heuristic algorithms. In [19], a beampattern optimization method based on the non-dominated sorting genetic algorithm II (NSGA-II) is proposed to prolong network lifetime for CB in WSN, which effectively reduces the energy consumption and improves the peak sidelobe performance. The optimization objectives involve the peak sidelobe and beampattern directivity. Since the transmission coefficient is a continuous variable, these intelligent algorithms generally have high computational complexity and cause redundant energy consumption. In addition, the corresponding implementation schemes are not described in detail.
To provide another solution, the situation in which the transmitter of each node has only two power levels (i.e., zero and maximum power) is considered. The optimization objective is mainly to minimize the interference power in the direction of the unintended BSs by determining whether each node is in the active or the sleep state. The beampattern optimization problem is then typically regarded as a node selection problem. As the problem is non-convex and combinatorial in nature, it is difficult to solve in polynomial time. Heuristic and combinatorial optimization methods are utilized to minimize the sidelobe level. In [22], the authors proposed a random node selection method. This method randomly selects L nodes from N nodes and evaluates whether the sidelobe level of the selected L nodes meets the given requirement in the direction of the unintended BS. This process is repeated until the requirement is met. However, the computational complexity of this method is very high, which may significantly increase the energy consumption of sensor nodes. To reduce the computational complexity, a node selection algorithm based on cross-entropy optimization (CEO) is proposed in [23]. The evaluation results show that the CEO algorithm has better sidelobe performance and lower computational complexity than the method proposed in [22]. However, it only considers static physical sensor nodes and can hardly be extended to mobile networks such as the CPSS.
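The random node selection idea described above can be sketched in a few lines. This is only an illustration of the draw-and-test loop, not the implementation of [22]: the gain model (ideal phase alignment, unit wavelength), the threshold value, and the names `gain` and `random_node_selection` are assumptions made for the example.

```python
import cmath
import math
import random


def gain(nodes, phi, phi0, wavelength=1.0):
    """Normalized far-field magnitude at azimuth phi for nodes (r, psi)
    that are phase-aligned toward azimuth phi0."""
    k = 2.0 * math.pi / wavelength
    total = sum(
        cmath.exp(1j * k * r * (math.cos(phi - psi) - math.cos(phi0 - psi)))
        for r, psi in nodes
    )
    return abs(total) / len(nodes)


def random_node_selection(candidates, num_selected, phi0, phi_unintended,
                          threshold, rng, max_trials=10_000):
    """Draw random subsets of size num_selected until the sidelobe toward
    phi_unintended is at most threshold. Returns (subset, trials used),
    or (None, max_trials) if no draw satisfied the requirement."""
    for trial in range(1, max_trials + 1):
        subset = rng.sample(candidates, num_selected)
        if gain(subset, phi_unintended, phi0) <= threshold:
            return subset, trial
    return None, max_trials


rng = random.Random(1)
# N = 30 candidate nodes, uniformly placed in a disk of radius 5 wavelengths.
candidates = [(5.0 * math.sqrt(rng.random()), rng.uniform(-math.pi, math.pi))
              for _ in range(30)]
# Select L = 10 nodes; intended BS at 0 rad, one unintended BS at 60 degrees.
subset, trials = random_node_selection(
    candidates, 10, 0.0, math.radians(60), threshold=0.3, rng=rng)
```

Each trial costs one beampattern evaluation, and the number of trials grows quickly as the threshold tightens, which is the complexity concern raised above.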
In summary, most of the above-mentioned related works use intelligent algorithms to solve the CB optimization problems for WSN in CPS. These methods assume that the physical sensors in WSN are fixed. However, social sensors have the property of social dynamic mobility, which significantly affects the performance of CPSS. Therefore, in this work, we focus on the integration of social and physical sensors and propose a corresponding CB optimization method to improve the transmission performance in CPSS.
4. The Proposed Transmission Optimization Algorithm
In this section, we solve the above-mentioned transmission optimization problem based on a dynamic learning algorithm. Then, the corresponding implementation scheme is presented.
4.1. Dynamic Learning Algorithm
A game-theoretic approach is used to solve this problem. Each sensor node in the system acts as a game player that interacts with the other sensors. In a dynamic environment, reacting to the observed outcomes of the game can achieve better performance than a deterministic method. For the problem under investigation, the location of each social sensor node follows a dynamic process. Hence, we use a dynamic learning algorithm to solve this problem. In this work, we first present the game as
, where
M denotes the number of players,
represents the state of dynamic environment, and
is the utility function of the player. In this work, the action set of sensor node includes the two transmission power levels
, which refer to the sleep and active states of each sensor node, respectively. For the utility function
, we first analyze the optimization objective and the constraints. In particular, the objective is to maximize the SNR at the intended BS, which is proportional to the transmission rate. The action of each sensor node, i.e.,
or
, determines the value of the SNR. When
, the transmission power is
, otherwise, it is 0. We consider the INR for the unintended BSs as a penalty term of the objective function. Then, the corresponding utility function
can be derived as follows:
Here, since all M sensor nodes use the collaborative beamforming technique to cooperatively achieve the same optimization objective, every node has the same utility function.
For the game-theoretic approach, reinforcement learning methods are generally used to solve the problem. Although a Markov decision process (MDP) can also be used, it requires accurate state information, which is difficult to obtain in practical applications, and its implementation is centralized. In the CPSS, the state of each social sensor node is determined by human activity, so accurate prior knowledge can hardly be obtained in a dynamic environment. Centralized optimization is also hard to implement because of the difficulty of collecting the global state information of all sensor nodes. For a WSN with a distributed architecture in CPSS, a distributed scheme needs to be considered. Therefore, we present a distributed reinforcement learning algorithm based on regret matching to improve the transmission performance of the WSN with social and physical sensor nodes in CPSS. The regret matching scheme focuses on the regret associated with the actions of each player. At each time step, each player (node) calculates its regret value and adjusts its strategy accordingly. The specific definition is given by
where
denotes the received utility for each player at time
, and
is the utility when the player switches to action
while the other players keep theirs unchanged. Then, each player updates its strategy
as follows:
where
.
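As a hedged illustration of the regret and strategy updates just described, the sketch below uses the textbook regret-matching rule: the regret of an action is the running average of the utility that would have been gained by switching to it, and the next strategy is proportional to the positive part of the regrets. Whether this matches the exact form of the paper's update equations is an assumption, and the function names are hypothetical.

```python
def update_regret(avg_regret, t, realized_utility, counterfactual_utilities):
    """Return the running-average regret of every action after round t (t >= 1).

    counterfactual_utilities[a] is the utility the player would have received
    by playing action a while all other players kept their actions unchanged.
    """
    return [
        ((t - 1) * r + (u_alt - realized_utility)) / t
        for r, u_alt in zip(avg_regret, counterfactual_utilities)
    ]


def regret_matching_strategy(avg_regret):
    """Play each action with probability proportional to its positive regret;
    fall back to the uniform strategy when no action has positive regret."""
    positive = [max(r, 0.0) for r in avg_regret]
    total = sum(positive)
    if total == 0.0:
        return [1.0 / len(avg_regret)] * len(avg_regret)
    return [p / total for p in positive]


# Two actions per node: index 0 = sleep (zero power), index 1 = active.
regret = update_regret([0.0, 0.0], t=1, realized_utility=1.0,
                       counterfactual_utilities=[1.0, 3.0])
strategy = regret_matching_strategy(regret)
```

In this toy round the node slept and received utility 1.0, whereas being active would have yielded 3.0, so all probability mass moves to the active state for the next round.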
The proposed algorithm is presented in Algorithm 1, where the regret matching method is used to maximize the utility in Equation (23). In the initialization stage, each social or physical sensor node initializes its location, state, and probability of action selection. Then, each sensor node selects a specific action (corresponding to a transmission power level) according to the initial probability. After that, the learning function in Equation (24) is used to update the regret value based on the utility function, and the probability of action selection is recalculated based on Equation (25). Then, the locations of the social sensor nodes are changed according to the model of social dynamic mobility in Section 3. Finally, when the number of iterations meets the stopping condition, the result is returned and the algorithm stops. Otherwise, the above procedure is repeated.
Algorithm 1 The proposed algorithm based on regret matching.
Input: The number of sensor nodes M; the coordinates of the nodes and BSs; the number of iterations T
Output: The probability of action selection
1: Initialization: the regret value of each social sensor node, where n is the location label of the social sensor node; the regret value of each physical sensor node; the probability of action selection
2: Calculate the transition probability based on Equations (3)–(5) of the model of dynamic mobility in Section 3.2
3: Repeat
4:   For each sensor node do
5:     Select an action based on the probability of action selection
6:   EndFor
7:   Calculate the utility based on Equation (23)
8:   For each sensor node do
9:     If the node is a social node
10:      Calculate the regret value of each action at location n according to Equation (24)
11:    else
12:      Calculate the regret value of each action using Equation (24)
13:    EndIf
14:    Update the probability of action selection using Equation (25)
15:  EndFor
16:  Update the location information of each social node based on the transition probability of social dynamic mobility
17:  Increment the iteration counter
18: Until the iteration counter reaches T
19: return the probability of action selection
In Algorithm 1, the variables are initialized on Line 1, and the transition probability is calculated on Line 2. On Lines 3–6, each node selects an action from the action set based on the probability of action selection, i.e., it selects a transmission power level (zero or the maximum power). Then, the corresponding utility is calculated based on Equation (23) on Line 7. On Lines 8–13, the regret value of each social or physical sensor node is calculated based on Equation (24). Based on the regret value, the probability of action selection of each node is recalculated according to Equation (25) on Line 14. Next, on Line 16, the location of each social sensor node is updated based on the transition probability of social dynamic mobility. Note that the location of a physical sensor node is static and therefore remains unchanged. After this, the algorithm returns to Line 3 to repeat the procedure. On Lines 18–19, when the number of iterations reaches T, the algorithm stops and returns the probability of action selection of each node.
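The control flow of Algorithm 1 can be sketched as a small simulation. Everything numeric here is a stand-in: the utility (mainlobe power minus a penalty on interference above a limit) only mimics the structure of Equation (23), the mobility rule (stay with probability 0.7, otherwise jump uniformly) is a placeholder for the transition probabilities of Section 3.2, and the regret update is the textbook running average rather than necessarily the exact form of Equation (24).

```python
import cmath
import math
import random

K = 2.0 * math.pi  # wavenumber for an assumed unit wavelength


def field(positions, actions, phi, phi0):
    """Magnitude of the coherent sum of all active nodes toward azimuth phi."""
    return abs(sum(
        cmath.exp(1j * K * r * (math.cos(phi - psi) - math.cos(phi0 - psi)))
        for (r, psi), a in zip(positions, actions) if a == 1))


def utility(positions, actions, phi0, unintended, limit, penalty=10.0):
    """Hypothetical utility: signal power minus penalized excess interference."""
    signal = field(positions, actions, phi0, phi0) ** 2
    excess = sum(max(0.0, field(positions, actions, phi, phi0) ** 2 - limit)
                 for phi in unintended)
    return signal - penalty * excess


def run(num_social=3, num_physical=9, num_locations=4, iterations=200, seed=0):
    rng = random.Random(seed)

    def point():
        return (5.0 * math.sqrt(rng.random()), rng.uniform(-math.pi, math.pi))

    m = num_social + num_physical
    # Social nodes own several candidate locations; physical nodes own one.
    places = [[point() for _ in range(num_locations if i < num_social else 1)]
              for i in range(m)]
    where = [0] * m
    # Line 1: zero regret and uniform action probabilities, kept per location.
    regret = [[[0.0, 0.0] for _ in p] for p in places]
    visits = [[0 for _ in p] for p in places]
    prob = [[[0.5, 0.5] for _ in p] for p in places]
    phi0, limit = 0.0, 4.0
    unintended = [math.radians(d) for d in (-90, 60, 120)]
    for _ in range(iterations):                      # Lines 3 and 17-18
        positions = [places[i][where[i]] for i in range(m)]
        # Lines 4-6: action 0 = sleep, action 1 = active.
        actions = [0 if rng.random() < prob[i][where[i]][0] else 1
                   for i in range(m)]
        u = utility(positions, actions, phi0, unintended, limit)   # Line 7
        for i in range(m):                           # Lines 8-15
            n = where[i]
            visits[i][n] += 1
            for a in (0, 1):
                trial = actions[:i] + [a] + actions[i + 1:]
                u_alt = u if a == actions[i] else utility(
                    positions, trial, phi0, unintended, limit)
                # Running average of the gain from switching to action a.
                regret[i][n][a] += ((u_alt - u) - regret[i][n][a]) / visits[i][n]
            positive = [max(x, 0.0) for x in regret[i][n]]
            s = sum(positive)
            prob[i][n] = [p / s for p in positive] if s > 0 else [0.5, 0.5]
        for i in range(num_social):                  # Line 16 (placeholder)
            if rng.random() > 0.7:
                where[i] = rng.randrange(num_locations)
    return prob                                      # Line 19


probabilities = run()
```

Keeping a separate regret vector per location for each social node is what lets the learned strategy follow the node as it moves, whereas a physical node needs only a single vector.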
4.2. The Implementation Scheme
To implement the proposed CB transmission scheme of sensor nodes in CPSS, we present the message exchange process for communications, so that the proposed algorithm can be implemented in a decentralized way. The collaborative node set consists of O social sensor nodes together with physical sensor nodes; our objective is to select this collaborative node set from M candidate nodes, including social and physical sensor nodes. The detailed implementation procedure is given as follows:
Step 1: Initialization. Source node s sends the Initialization Messages to each node in the deployment area. Each node initializes its regret value (as a social or a physical sensor node) as well as its probability of action selection. The iteration index is initialized.
Step 2: Collect Node Information. Source node s sends an Information Collection Message to each node in the deployment area. When a node receives the Information Collection Message, it sends a State Information Message, including its node location, node ID, and current strategy, as a response to the source node. The source node saves the state information of all candidate nodes, broadcasts the information to each sensor node, and then carries out the next step.
Step 3: Establish Collaborative Node Set. The source node sends Node Selection Messages to all social and physical sensor nodes within its transmission range. When a sensor node receives the message, it selects an action according to the probability of action selection and responds with an Action Selection Message. This step repeats until all neighbouring sensor nodes of the source node have responded. Then, the selected nodes use CB to transmit the State and Strategy Information to the intended BS.
Step 4: Calculate the Utility. The intended BS calculates the utility according to Equation (23). Then, the intended BS returns the result through a Utility Information Message to each collaborative node.
Step 5: Calculate Regret Value and Update the Probability of Action Selection. Each node calculates the regret value based on the received utility using Equation (24). Then, the probability of action selection is updated using Equation (25).
Step 6: Update the Social Sensor Node Location. The social sensor nodes update their current location information based on the social dynamic mobility.
Step 7: Repeat Iteration. If the number of iterations is less than the maximum number, the iteration index is incremented and the procedure returns to Step 2. Otherwise, the learning process stops and the procedure goes to the next step.
Step 8: Data Transmission. Source node s broadcasts the Sensing Information Message to all collaborative nodes. Then, each of these nodes responds with a Confirmation Message. Next, phase synchronization and the CB technique are used to send the data symbol to the intended BS. When the intended BS successfully receives the signal and the interference power at the unintended BS is lower than the required value, each collaborative node transmits the data symbol based on the current strategy.
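The message exchange of Steps 2 to 4 can be traced with a small sketch that lists who sends what to whom in one learning round. The message names follow the steps above; the `Msg` enum, the `learning_round` function, the endpoint labels, and the simplification that messages are unicast one by one (the broadcast at the end of Step 2 is omitted) are assumptions for illustration only.

```python
from enum import Enum


class Msg(Enum):
    """Message types named in the implementation steps."""
    INFORMATION_COLLECTION = "Information Collection Message"
    STATE_INFORMATION = "State Information Message"
    NODE_SELECTION = "Node Selection Message"
    ACTION_SELECTION = "Action Selection Message"
    STATE_AND_STRATEGY = "State and Strategy Information"
    UTILITY_INFORMATION = "Utility Information Message"


def learning_round(node_ids, active_ids):
    """Return the (sender, receiver, message) triples of one pass through
    Steps 2-4, given the candidate nodes and those that chose to be active."""
    log = []
    for n in node_ids:                      # Step 2: collect node information
        log.append(("source", n, Msg.INFORMATION_COLLECTION))
        log.append((n, "source", Msg.STATE_INFORMATION))
    for n in node_ids:                      # Step 3: establish collaborative set
        log.append(("source", n, Msg.NODE_SELECTION))
        log.append((n, "source", Msg.ACTION_SELECTION))
    # The selected nodes transmit jointly to the intended BS using CB.
    log.append((tuple(active_ids), "intended_bs", Msg.STATE_AND_STRATEGY))
    for n in active_ids:                    # Step 4: utility feedback
        log.append(("intended_bs", n, Msg.UTILITY_INFORMATION))
    return log


log = learning_round(node_ids=[1, 2, 3, 4], active_ids=[2, 4])
```

Under these simplifications, a round with M candidate nodes and K active nodes exchanges 4M + K + 1 messages, only one of which is the joint CB transmission.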
4.3. The Complexity of the Proposed Algorithm
Here, we analyze the complexity of the proposed algorithm. According to the description of the implementation of the proposed algorithm, each sensor node needs to execute the following operations. First, the transition probability needs to be calculated; we denote the complexity of this calculation by U. Moreover, the main computations in Algorithm 1 are the utility in Equation (23) and the regret value in Equation (24). The corresponding complexities are denoted by
and
. We know the iteration number is
T. Thus, the total computational complexity is
. In terms of energy consumption, we assume the consumed energy of the computation is
. In the implementation procedure, the exchanged messages are mainly
Information and State Collection Messages,
Node and Action Selection Messages,
State and Strategy Information Messages,
Utility Information Message,
Sensing Information Message, and
Confirmation Message. We assume the energy consumptions of these messages are denoted by
,
,
,
,
, and
. The total energy consumption is
. It is known that CB itself is a technique for improving energy efficiency, and the energy consumed by each node can be reduced by an order of
relative to the energy consumed without CB. From the implementation scheme, the
State and Strategy Information Messages and
Sensing Information Message are transmitted to the sink node by CB. Therefore, the proposed algorithm can reduce the energy consumption of
.
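One way to read the complexity accounting above is that the transition probability is computed once (Line 2 of Algorithm 1), while the utility and the regret value are recomputed in each of the T iterations. Under that reading, and writing $C_u$ and $C_r$ as placeholder symbols for the complexities of Equations (23) and (24) (these two names are assumptions, not the paper's notation), the per-node total would take the form:

```latex
% C_u and C_r are placeholder symbols introduced for this sketch only.
\begin{equation*}
  C_{\mathrm{total}} \;=\; U + T\,\bigl(C_u + C_r\bigr)
\end{equation*}
```

The linear dependence on T is the point of the expression: the one-off cost U becomes negligible once the number of iterations is large.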
5. Performance Evaluation
In this section, we present the performance evaluation of the proposed algorithm for CB in WSN. According to [10], real experimentation with collaborative beamforming in WSN requires the support of carrier, phase, and time synchronization techniques. Although various phase synchronization approaches for CB have been presented in the literature [10], synchronization is still under study. Therefore, we assume that synchronization can be performed and use simulation experiments to evaluate the CB transmission performance of the proposed algorithm in CPSS. We first consider a WSN with 5 social sensor nodes and 25 physical sensor nodes, which are randomly deployed in a circular area with
radius. We assume that each social sensor node has
locations in this area. The movement is based on the model of social dynamic mobility. The location of each physical sensor node remains unchanged throughout the process. We first consider the INR threshold
dB. The transmit power of each node has two levels, i.e., zero and
, where
is normalized by
. The value of
is the same as that in [
11,
23], which is set as 20 dB. The channel coefficient follows a distribution with a zero-mean and
. Moreover, the direction of the intended BS is
, and the other three unintended BSs are located at
,
, and
, respectively.
We first analyze the beam pattern performance. Based on the above setting, we present the beam pattern performance for 10 time slots in
Figure 3. As we can see, the interference powers for all three unintended BSs are less than the required
dB. The result shows the proposed algorithm can adapt to the location change of social sensor nodes.
To further analyze the performance, we compare the proposed algorithm with log linear learning (LLL) method [
29] and the random node selection algorithm [
22]. The LLL is often used to optimize the game theoretical problem [
30,
31]. The method focuses on the updating of the action selection probability based on the received utility. The random node selection is to choose
L nodes to meet the requirement of transmission gain in the direction of intended BS or low interference power for the directions of the unintended BSs. The comparison result of average beampattern performance is shown in
Figure 4. When the INR threshold is set to 4, the INR of the proposed algorithm at the three unintended BSs is much lower than that of the log-linear learning method and the random node selection algorithm. In addition, the mainlobe SNR of the proposed algorithm in the direction of the intended BS is much higher than that of the other algorithms. This is because the proposed algorithm, being based on regret matching, considers the utility at all positions over all previous time slots. The LLL method, in contrast, calculates the utilities of different actions and then updates the action selection probability based only on the current utility. Since the locations of the social sensor nodes are dynamic, the LLL method can hardly capture such location changes.
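For contrast with regret matching, the action-selection rule usually associated with log-linear learning can be sketched as a Boltzmann (softmax) choice over the utilities of the current round. The temperature parameter and the function name are assumptions for this example; the configuration used for the LLL baseline in the simulations is not specified here.

```python
import math


def log_linear_probabilities(utilities, temperature):
    """Boltzmann action probabilities: P(a) is proportional to
    exp(u(a) / temperature), using only the current-round utilities."""
    peak = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((u - peak) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Two actions (sleep, active) with utilities 1.0 and 2.0 in the current round.
probs = log_linear_probabilities([1.0, 2.0], temperature=0.5)
```

Because the rule depends only on the utilities of the current round, it keeps no memory of how an action performed at locations visited earlier, which is consistent with the explanation given above for its weaker performance under mobility.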
Figure 5 shows the average SNR performance for different thresholds of
. It can be seen that the proposed algorithm has higher SNR performance than the other two algorithms. When the
is set to 8, the proposed algorithm obtains the maximum SNR amplitude. The LLL method achieves the best performance only when
. In addition, when the
is in the range 3–9, the proposed algorithm has a better SNR than the LLL method. Because the method of random node selection can hardly meet the requirement of
, the SNR performance for the situation without considering
falls in the area between 11.5 dB and 12 dB. It can be seen in
Figure 5 that the proposed algorithm can effectively improve the SNR performance under the given different
.
The above analysis mainly focuses on a fixed number of social and physical sensor nodes. To further analyze the performance with respect to different number of social sensor nodes, we consider 4–12 social sensor nodes. Then, the SNR performance under
dB is presented in
Figure 6. The SNR performance of the proposed algorithm and the LLL method decreases with the number of social sensor nodes. Specifically, when the number of social sensor nodes is greater than 9, the SNR amplitude of the proposed algorithm drops significantly, and the same happens for the LLL method. This shows that, although integrating some social sensor nodes with the physical sensor nodes benefits the CB transmission in CPSS, an excessive number of social sensor nodes has a negative impact on it. This is because, with the movements of too many social sensor nodes, it is hard to meet the condition of
. When the
is set as 7 dB, the impact on the transmission of CB is weakened. As shown in
Figure 7, the SNR performance reduces slowly with the number of social sensor nodes. It is evident in
Figure 6 and
Figure 7 that the number of social sensor nodes has an impact on the CB transmission. However, as the value of
increases, the impact is weakened gradually.
Next, we analyze the influence of different values of
and
for the dynamic mobility model. According to the model of dynamic mobility in [
25], we consider two groups of model parameters: (a)
and
; and (b)
and
. The values of
and
can determine the probability of preferential return. Based on the two groups of parameters, we can obtain different transition probabilities of social sensor nodes. The corresponding SNR performance for different mobility parameters is shown in
Figure 8. It can be seen that the SNR performance when
is better than that of
, for both the proposed algorithm and the LLL method. As the threshold
increases, the difference between the two groups of parameters decreases gradually. This is due to the following reasons. First,
means that each social sensor node has a greater probability of returning to previously visited locations than with
. It reduces the difficulty of capturing the dynamic mobility for the proposed learning method. Second, as the value of the threshold
increases, the constraint is weakened. The proposed algorithm can adapt to the change such that the difference of the SNR performance is reduced.
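To illustrate how two mobility parameters can control the probability of preferential return, the sketch below uses a form that is common in the human-mobility literature: with S distinct locations visited so far, a node explores a new location with probability rho * S^(-gamma) and otherwise returns to a visited location with probability proportional to its visit count. Whether the model of [25] and Section 3 takes exactly this form is an assumption, and the names `rho`, `gamma`, and `next_location` are hypothetical.

```python
import random


def next_location(visit_counts, rho, gamma, rng):
    """Pick the next location index for one social sensor node.

    visit_counts[i] is how often candidate location i has been visited;
    the list is assumed to be non-empty.
    """
    visited = [i for i, c in enumerate(visit_counts) if c > 0]
    unvisited = [i for i, c in enumerate(visit_counts) if c == 0]
    # Exploration becomes less likely as more distinct locations are known.
    p_explore = rho * len(visited) ** (-gamma) if visited else 1.0
    if unvisited and (not visited or rng.random() < p_explore):
        return rng.choice(unvisited)
    # Preferential return: frequently visited locations are favored.
    weights = [visit_counts[i] for i in visited]
    return rng.choices(visited, weights=weights, k=1)[0]


rng = random.Random(0)
# With rho = 0 the node never explores and only returns to known locations.
draws = [next_location([9, 1, 0], rho=0.0, gamma=0.2, rng=rng)
         for _ in range(1000)]
```

In this form, a smaller exploration probability means more returns to familiar locations, so the per-location regret values are refreshed more often, in line with the first reason given above.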