1. Introduction
Owing to the rapid development of modern wireless communication technology, new wireless mobile terminals continue to emerge, and the demand for electromagnetic spectrum resources is increasing rapidly. Wireless transmission services have now been allocated in all frequency bands, and spectrum resources are almost exhausted [1]. However, while spectrum resources are becoming scarcer, static spectrum management schemes leave much of the allocated spectrum idle or inefficiently used [2]. Maximizing the utilization of spectrum resources is therefore an urgent problem.
Cognitive radio networks (CRNs), which include key technologies such as dynamic spectrum access (DSA) and opportunistic spectrum access (OSA), are recognized as effective tools for improving the utilization of limited spectrum resources. CRNs perceive, recognize, and utilize the available spectrum in specific task spaces through self-learning and interaction with the surrounding environment, adapting to the constantly changing radio environment. Acquiring spectrum status information through spectrum sensing by secondary users (SUs) is not only the first step in implementing CRNs but also the foundation for the subsequent analysis and utilization of idle spectrum resources [3]. In practice, however, SUs often encounter long delays, high energy consumption, and limited capture range when scanning and sensing the entire spectrum (especially in wideband spectrum sensing tasks), which inevitably hinders the efficient operation of the CRN system [4].
To address these problems, researchers have proposed spectrum prediction techniques. By mining and analyzing historical spectrum sensing data, SUs can predict the received spectrum power in future slots and then sense only those spatiotemporal spectrum resources whose predicted values fall below the access power threshold, which reduces the time delay and energy consumption of subsequent processing. Early spectrum prediction research focused mainly on time-domain methods, including linear regression models [5], time series prediction models [6], Markov prediction models [7], and neural network models [8], and paid little attention to the joint use of multiple dimensions such as time, frequency, and space. The authors of [9] used multiple attribute decision-making (MADM) methods and an artificial neural network architecture to determine the best candidate channel for spectrum switching decisions. References [10,11] proposed using fuzzy decision-making principles to estimate the spectrum handoff probability, which effectively improved switching efficiency. In recent years, deep learning models have become a powerful tool for spectrum prediction because they can leverage the potential correlations of spectrum data across multiple dimensions such as time, frequency, and space [12,13,14].
In response to the spatial, temporal, or spectral dependence of spectrum data, composite neural networks such as long short-term memory (LSTM) models and convolutional neural networks (CNNs) [15] have been used for joint spectrum prediction in multiple dimensions. Yu et al. [16] proposed a hierarchical dual-CNN and GRU (DCG) model for predicting the local spectrum availability of SUs in CR communication, which can explore the spectral and temporal correlations between spectrum occupancy data. However, simply connecting an RNN and a CNN is not sufficient to comprehensively discover correlations across spatiotemporal multidimensional input data. Reference [17] used transfer learning models for spectrum prediction, but because the data differ between frequency bands, prediction models cannot be directly reused across bands. STS-PredNet [18] assumes that the received signal strength (RSS) at a specific spatial location is determined by a weighted linear combination over multiple SUs. The weighting coefficients are obtained using the inverse distance weighting method based on the distance between the SUs and that location, which ignores the inherent spatial correlation between different observation locations. TF²AN [19] preprocesses the spectrum map and then introduces a weighted transfer learning model to share spectrum knowledge among multiple locations and frequency bands, improving the performance of the spectrum prediction model. SAE-TSS [20] takes images as input: the spectrum sequence is converted into image format for offline training. The above models operate on spectral data in the space, time, and frequency domains and exploit the complex correlations between cross-domain knowledge. However, they do not remove the influence of the regular data structure and cannot adequately extract the inherent correlations of non-Euclidean space. Recently, tensor analysis has been adopted as a framework [21,22] to leverage multidimensional correlations for spectrum prediction. However, tensor decomposition of high-dimensional data requires a long computation time, and achieving the highest possible prediction accuracy also requires transmitting as much information as possible from the base station.
To address the above issues, this paper proposes a TensorGCN-LSTM hybrid network model that mines the implicit rules among electromagnetic data in the spatial, frequency, and temporal domains of a cognitive radio task area. More specifically, the proposed approach treats SUs at different spatial locations, and the spectrum states of SUs at different frequencies, as nodes and constructs two categories of graph structures accordingly. Tensor graph convolution (TensorGCN) [23] is an effective structure for processing tensor data; we introduce it into the field of spectrum prediction to handle tensor graphs consisting of these two graph structures. The essence of the TensorGCN-LSTM model is to use graph convolution operations to sequentially discover the correlation rules of spectrum data in the spatial and frequency domains and to use an LSTM to explore the correlation rules in the temporal domain, thereby improving the accuracy of predicting how the spectrum state changes over time and providing a basis for spectrum resource planning and scheduling. Comparative experimental results show that the TensorGCN-LSTM model provides stable and accurate predictions.
In summary, our core contributions are three-fold:
We abstract SUs as nodes and transform the spectrum prediction task into a supervised learning task based on graph-tensor-structured data. To the best of our knowledge of the existing literature, we are the first to introduce the concept of graph tensor data structures into the field of spectrum prediction;
To extract the correlation features of the data received by SUs at different frequencies over a period of time, we regard the SU’s receiving state at each frequency as a node (called a virtual node) and design an inter-frequency graph network structure to extract the frequency-domain correlation features of the spectrum;
We propose TensorGCN-LSTM, a new joint prediction model in the time, space, and frequency domains, which integrates multidimensional features of task area spectrum data to predict the spectrum state. Ablation experimental results show that, compared with single time-series prediction methods and spatiotemporal prediction methods, the TensorGCN-LSTM model predicts more accurately.
The rest of this paper is organized as follows. Section 2 presents the preliminaries, including the construction of the tensor graph and the definition of the spectrum prediction problem. Section 3 describes the methodology of the deep learning model for forecasting spectrum evolution. Section 4 introduces the experiment settings and dataset. Section 5 discusses the evaluation of the results. Concluding remarks and future research directions are given in Section 6, and Section 7 introduces a patent resulting from the work reported in this paper.
3. Methodology
In this section, we elaborate on the implementation of the prediction method based on the TensorGCN-LSTM hybrid model, shown in detail in Figure 4. The model first performs graph convolution on the node features in the spatial-domain graph structure to generate node embeddings. It then combines the node embeddings with the frequency-domain graph structure and performs a second graph convolution to extract features that integrate the spatial and spectral information of the secondary users. We refer to these two graph convolution operations as intra-frequency graph convolution and inter-frequency graph convolution, respectively; both are shown in Figure 4 (upper right). Afterward, the spatial–spectral embeddings are fed into the LSTM model to generate fused feature information across the spatial, spectral, and temporal domains. Finally, the fused features are passed through a fully connected layer to output the predicted RSS.
According to the processing method adopted by the graph convolutional neural network [24], the forward propagation formula of the graph convolution for SU nodes in the spatial-domain graph structure is as follows:
where
is the parameter matrix of the filter for intra-frequency graph convolution that needs to be learned and updated.
is the
r-th order Chebyshev polynomial and the standardized Laplace matrix
of adjacency matrix
refers to:
where
represents the maximum eigenvalue of the Laplacian matrix
.
and, respectively, refer to the identity matrix and degree matrix of the matrix
.
in Equation (10) is the spatial embedding vector extracted by graph convolution. Therefore,
serves as the feature matrix for input inter-frequency graph convolution, and the input vector
to the LSTM module can be obtained through the following:
Similar to Equation (10), represents the normalized Laplacian matrix corresponding to the adjacency matrix . is the filter parameter matrix that needs to be learned and updated through inter-frequency graph convolution. is the Chebyshev polynomial of order. It should be noted that due to the filtering operation being an approximation of the R-th order Laplacian operator, it is localized to R-order neighboring nodes. In our experiments, we set .
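The description above matches the standard Chebyshev-polynomial graph convolution, in which the scaled Laplacian is 2L/λ_max − I with L = I − D^(−1/2) A D^(−1/2), and the output is the sum over orders r of T_r applied to the node features and multiplied by a learnable matrix. The following is a minimal NumPy sketch of that standard form, not the authors' implementation; the function names, the symmetric adjacency matrix, and the list `thetas` of per-order parameter matrices are assumptions introduced for illustration.

```python
import numpy as np

def scaled_laplacian(adj):
    """Return 2 L / lambda_max - I, where L = I - D^{-1/2} A D^{-1/2}.

    Assumes `adj` is a symmetric, non-negative adjacency matrix.
    """
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of a division by zero.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    n = adj.shape[0]
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(n)

def cheb_graph_conv(x, adj, thetas):
    """Chebyshev graph convolution: sum_r T_r(L_tilde) @ x @ thetas[r].

    x      : (n_nodes, in_dim) node feature matrix
    thetas : list of (in_dim, out_dim) matrices, one per polynomial order
    """
    l_tilde = scaled_laplacian(adj)
    t_prev = x                    # T_0(L_tilde) x = x
    out = t_prev @ thetas[0]
    if len(thetas) > 1:
        t_curr = l_tilde @ x      # T_1(L_tilde) x = L_tilde x
        out = out + t_curr @ thetas[1]
        for theta in thetas[2:]:
            # Recurrence: T_r = 2 L_tilde T_{r-1} - T_{r-2}
            t_prev, t_curr = t_curr, 2.0 * l_tilde @ t_curr - t_prev
            out = out + t_curr @ theta
    return out
```

Under these assumptions, calling `cheb_graph_conv` once with the spatial adjacency matrix and once more with the frequency-domain adjacency matrix mirrors the intra-frequency and inter-frequency steps described in the text. Because the polynomial has finite order, each output row depends only on nodes within that many hops.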
To learn the temporal evolution characteristics of electromagnetic waves, we input the fused spatial- and frequency-domain embedding of each secondary user node into an LSTM model [25]. This operation is shown in Figure 5.
At each time slot, the LSTM unit takes the fused embedding of the node as input, which enables the LSTM model to describe the temporal evolution of electromagnetic waves more comprehensively, based on the integrated frequency and spatial propagation characteristics. We describe the entire process of the LSTM using Equation (13):
Based on the output of the proposed TensorGCN-LSTM model, we finally predict RSS by:
where
denote the RSS of SU
corresponding to
frequencies at the time
in the future and
is a fully connected layer.
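The temporal stage described above, an LSTM run over the fused embeddings and followed by a fully connected layer, can be sketched with the standard LSTM gate equations. This is an illustrative NumPy sketch and not the authors' code: the gate ordering, the stacked weight layout (`w`, `u`, `b`), the zero initial state, and the readout parameters `w_fc` and `b_fc` are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, u, b):
    """One standard LSTM step.

    w: (in_dim, 4*hidden), u: (hidden, 4*hidden), b: (4*hidden,)
    The four blocks are stacked as input, forget, candidate, output (assumed order).
    """
    hidden = h_prev.shape[-1]
    gates = x_t @ w + h_prev @ u + b
    i = sigmoid(gates[..., 0 * hidden:1 * hidden])   # input gate
    f = sigmoid(gates[..., 1 * hidden:2 * hidden])   # forget gate
    g = np.tanh(gates[..., 2 * hidden:3 * hidden])   # candidate cell state
    o = sigmoid(gates[..., 3 * hidden:4 * hidden])   # output gate
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t

def predict_rss(z_seq, w, u, b, w_fc, b_fc):
    """Run the LSTM over a (T, n_nodes, in_dim) embedding sequence, then a linear readout."""
    n_nodes, hidden = z_seq.shape[1], u.shape[0]
    h = np.zeros((n_nodes, hidden))
    c = np.zeros((n_nodes, hidden))
    for z_t in z_seq:                 # iterate over time slots
        h, c = lstm_step(z_t, h, c, w, u, b)
    return h @ w_fc + b_fc            # fully connected layer on the last hidden state
```

Here the readout uses only the final hidden state; whether the original model reads out from the last state or from all states is not stated in the text.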
4. Numerical Experiments
The datasets for the simulation experiment were generated by adding mobile transmitters with random emission frequencies. The position coordinates of the transmitters varied randomly and uniformly with time. The log-normal shadowing followed the Gudmundson model [26], which provides the correlation between the primary user (PU) and the SUs. Multiple mobile primary users working on different frequencies were added to the cognitive radio task region, forming experimental data of power spectral density with spatial- and frequency-domain characteristics that varied continuously over time.
For simplicity, the transmission power of each primary user was set to 1 W. In addition, to represent temporal data, we uniformly divided the time axis into windows and aggregated the spectral data within each time window into one time step. In this way, discretized time steps represent the continuous temporal data.
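The windowing step can be illustrated as follows. The text does not say how samples inside a window are aggregated, so this sketch assumes a simple mean; the function name `aggregate_windows` and the choice to drop an incomplete trailing window are likewise assumptions.

```python
import numpy as np

def aggregate_windows(samples, window):
    """Average consecutive groups of `window` samples along the time axis (axis 0)."""
    samples = np.asarray(samples, dtype=float)
    n_steps = samples.shape[0] // window
    trimmed = samples[:n_steps * window]          # drop an incomplete trailing window
    grouped = trimmed.reshape(n_steps, window, *samples.shape[1:])
    return grouped.mean(axis=1)
```

Any trailing dimensions (for example, secondary users and frequencies) are preserved, so the same call works for a single series or a full spectrum tensor.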
4.1. Experiment Settings
According to the spatial resolution requirements of the spectrum prediction task, we divided the cognitive radio task region into a 200 × 200 grid and distributed 174 secondary users randomly and uniformly over the grid, as shown in Figure 6. The monitored frequency range was 800–900 MHz; with a frequency resolution and spectrum sensing sweep span of 200 kHz for the spectrum sensor, this yields a total of 500 frequency bands.
In the experiment, because the aim was to examine whether the prediction model works rather than to model radio wave propagation attenuation precisely, we considered only path loss and shadow fading when computing the attenuation of the spectrum sensor’s received power.
For the simulation experiment, the log-normal shadow fading model (
,
) was used to model the shadow fading of the task area. The path propagation loss in the task area was modeled using a logarithmic distance path loss model, which is shown in Equation (15):
where
is a constant coefficient related to the gain of the transmitting antenna, which is generally represented by the measured power value at
. Here,
represents the far field distance of the antenna and is a constant reference distance.
is the distance between the receiver of the secondary user and the transmitter of the primary user. In the simulation experiment, we gridded the target area and set
to represent the path loss of radio wave propagation attenuated by each grid.
is the path loss exponent, which typically ranges from 3.7 to 6.5 for urban macrocells. In the experiment,
was set to 5.
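The propagation model described above can be sketched in decibel form as P_r = P_t + K − 10·γ·log10(d/d0) plus a zero-mean Gaussian shadowing term. This is a hedged sketch of the standard log-distance model: the symbol names K, d0, and γ are assumed from the surrounding definitions, the values of `k_db`, `d0`, and `sigma_db` in any call are hypothetical, and the shadowing here is drawn independently per receiver rather than with the spatial correlation of the Gudmundson model.

```python
import numpy as np

def received_power_dbm(p_tx_dbm, dist, k_db, d0, gamma, sigma_db=0.0, rng=None):
    """Log-distance path loss with optional log-normal shadowing (all powers in dB/dBm).

    p_tx_dbm : transmit power in dBm (1 W corresponds to 30 dBm)
    dist     : transmitter-receiver distance(s), same unit as d0
    k_db     : constant gain term K in dB (hypothetical value supplied by the caller)
    gamma    : path loss exponent (the text uses 5)
    sigma_db : standard deviation of the shadowing in dB (0 disables shadowing)
    """
    d = np.maximum(np.asarray(dist, dtype=float), d0)   # the model applies for d >= d0
    p_rx = p_tx_dbm + k_db - 10.0 * gamma * np.log10(d / d0)
    if sigma_db > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        # Independent shadowing per receiver; spatial correlation is NOT modeled here.
        p_rx = p_rx + rng.normal(0.0, sigma_db, size=d.shape)
    return p_rx
```

With γ = 5, each tenfold increase in distance lowers the received power by 50 dB, which is why the received power falls off quickly across the grid.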
4.2. Dataset Preparation
Each secondary user collected spectrum data for each frequency band in the task area over 17,280 time slots. We then divided the dataset into training, validation, and test sets in a ratio of 6:2:2. Figure 7 shows the RSS distribution in the task area, as received by the 174 secondary users over 6 consecutive time slots at 800 MHz.
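As a worked example of the 6:2:2 split, the sketch below computes index ranges for the 17,280 time slots. It assumes the split is chronological (earliest slots for training), which the text does not state; the function name `chronological_split` is hypothetical.

```python
def chronological_split(n_slots, ratios=(6, 2, 2)):
    """Return (train, val, test) slices over the time axis in the given ratio."""
    total = sum(ratios)
    n_train = n_slots * ratios[0] // total
    n_val = n_slots * ratios[1] // total
    train = slice(0, n_train)
    val = slice(n_train, n_train + n_val)
    test = slice(n_train + n_val, n_slots)   # the remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = chronological_split(17280)
```

Under this assumption, 17,280 slots yield 10,368 training slots and 3,456 slots each for validation and testing.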
Following the description in Section 2.2, we constructed the spatial- and frequency-domain graph structures. As shown in Figure 8a, the spatial-domain structure was constructed for 174 secondary user nodes at different frequencies. The coordinate position of each node in the diagram corresponds to the spatial coordinates of that secondary user in the task area. To represent the frequency-domain structure concisely, we show a schematic frequency-domain graph for 10 frequencies (801 MHz, …, 810 MHz) within the 800–810 MHz range. Each frequency state was treated as a node, and the absolute value of the correlation coefficient between the spectrum data of each pair of frequencies was used as the weight of the corresponding edge. This allowed us to construct a graph structure with frequencies as nodes, as shown in Figure 8b.
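The frequency-domain graph construction can be sketched as follows: each frequency is a node, and the edge weight is the absolute Pearson correlation between the time series of two frequencies. The input layout (time slots by frequencies, for a single secondary user) and the removal of self-loops are assumptions made for this illustration.

```python
import numpy as np

def frequency_graph(series):
    """Build a weighted adjacency matrix over frequencies.

    series : (n_slots, n_freqs) array of RSS values over time (assumed layout).
    Returns an (n_freqs, n_freqs) matrix of |Pearson correlation| with a zero diagonal.
    """
    corr = np.corrcoef(series, rowvar=False)   # columns are treated as variables
    adj = np.abs(corr)
    np.fill_diagonal(adj, 0.0)                 # no self-loops (assumption)
    return adj
```

Note that a frequency whose series is constant has an undefined correlation, so such columns would need to be handled before calling this function.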
5. Discussion
To validate the feasibility of conducting power spectral data prediction experiments on simulated datasets, we calculated the data correlations of individual secondary users in the time, frequency, and spatial domains, as shown in Figure 9. Specifically, Figure 9a illustrates the spatial correlation structure among secondary users, indicating a strong spatial correlation among them. Moreover, secondary users with nearby indices showed stronger correlation, reflecting the stronger spatial correlation between closely located secondary users. Figure 9b depicts the time-domain correlation of power spectral data for the same secondary user node across different frequency bands. The time-domain correlation values were generally large, and the correlation distribution demonstrated the regularity of tidal effects in the spatial activity of the primary user. Figure 9c presents the distribution of frequency-domain correlations between any two time slots of the spectral state on a sensing node in the simulated dataset. Although the frequency-domain correlation values were not as close to 1 as the spatial and time-domain correlations, some frequency bands still showed significant correlation. The windowing effects that occurred at the 48 frequency points within the 800–810 MHz range indicated a highly correlated spectral state evolution between low-frequency and high-frequency bands.
To validate the effectiveness of the proposed TensorGCN-LSTM model, we conducted experimental comparisons with three other models: LSTM, GCN, and GC-LSTM. We evaluated the generalization ability of each model by analyzing the loss values on the training, validation, and test sets. Additionally, we examined the prediction accuracy of the models using the MAE (mean absolute error), RMSE (root mean square error), and R² (coefficient of determination). The calculations were performed according to Equation (16):
where
and
represent the true values and predicted values of RSS, respectively.
represents the number of received data samples from secondary users, and
represents the sample mean.
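The three metrics named above have standard definitions, which the following sketch computes with NumPy. The function name `regression_metrics` is hypothetical, and the code follows the textbook formulas rather than any implementation detail from the paper.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for true and predicted RSS values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2
```

For example, with true values [1, 2, 3, 4] and predictions [1, 2, 3, 6], the MAE is 0.5, the RMSE is 1.0, and R² is 0.2, which shows how a single large error affects RMSE and R² more than MAE.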
The reported loss metrics for each model are averages over the predictions for the 174 secondary user nodes. The models used 24 historical samples to predict the data for the next 30 time slots. Table 1 presents the average cumulative losses of the four prediction models on the training, test, and validation sets at a frequency of 810 MHz.
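The input and target pairs implied by "24 historical samples to predict the next 30 time slots" can be built with a sliding window, as sketched below. The stride of one slot between consecutive samples and the function name `make_windows` are assumptions; the text gives only the history length and the prediction horizon.

```python
import numpy as np

def make_windows(series, n_hist=24, n_pred=30):
    """Slice a time series (time on axis 0) into (history, future) training pairs."""
    series = np.asarray(series)
    n_samples = series.shape[0] - n_hist - n_pred + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than n_hist + n_pred")
    x = np.stack([series[i:i + n_hist] for i in range(n_samples)])
    y = np.stack([series[i + n_hist:i + n_hist + n_pred] for i in range(n_samples)])
    return x, y
```

Each sample spans 54 consecutive slots, so a series of length T yields T − 53 pairs at stride one.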
When evaluating the prediction error metrics, we conducted experimental comparisons using the RSS data received at 810 MHz by the secondary user with index 0 in the spatial-domain graph structure shown in Figure 8a. To further explore the temporal variation in the ground-truth RSS and the four model predictions, we randomly selected the predicted results of 580 consecutive time slots for this secondary user. Figure 10 displays the predicted power spectral density values of the four models against the true power spectral density values. In general, the prediction curve of the TensorGCN-LSTM model followed the actual trend more closely and was closer to the real data. It can be seen that the LSTM model captured the changing trend of the data in the time domain well. Meanwhile, the spatial prediction model (GCN) tended to overestimate the ground-truth values, and the spatiotemporal prediction model (GC-LSTM) underestimated the high values.
In Figure 11, the Pearson linear correlation between the predicted and actual values revealed that, as the spatial, temporal, and frequency features were fused, the predicted results exhibited a more concentrated numerical distribution with reduced variance. The slope corresponding to the TensorGCN-LSTM model (0.88) was below one but the largest among the compared models, indicating that our proposed model achieves a more balanced distribution trend between underestimation of low values and overestimation of high values. This indicates that fusing multiple feature attributes contributes to the overall smoothness of the model’s predictions. Additionally, the R-value and MAE value of the TensorGCN-LSTM model indicated a strong consistency between the predicted values and the actual values.
Table 2 presents a comparison of prediction errors for four prediction models under different prediction horizons. From the table, it can be observed that the TensorGCN-LSTM model exhibited varying degrees of reduction in prediction errors compared to the other three models, as indicated by RMSE, MAE, and MAPE metrics. The results demonstrated that considering the spatial and frequency distribution characteristics of radio propagation improves the prediction accuracy of the TensorGCN-LSTM model. Looking at the prediction error results of the GCN and LSTM models, it was evident that a neural network structure solely focusing on spatial correlations cannot effectively enhance the predictive accuracy of temporal data. With an increase in the prediction horizon, the uncertainty of all four models’ predictions increased, leading to gradually larger prediction errors. However, based on the comparison results for the 20th–30th horizons, our proposed TensorGCN-LSTM model exhibited better long-term prediction capability. This finding validates the beneficial effects of effectively integrating temporal-, spatial-, and frequency-domain features to enhance the prediction performance of the model.
The purpose of our experiment was to validate the effectiveness of the proposed model. The shadow fading component in the synthetic data generation followed the Gudmundson model, whereas real data are more complex. As a result, the simulated data may be less complex than real measurement data, and the variations in the spectrum data may not be significant in the spatial and frequency domains. Consequently, the overall difference in error metrics among the four prediction models is not substantial. Nevertheless, the experiments on the simulated dataset indicate that TensorGCN-LSTM has considerable potential for exploiting the multidimensional interactions in spectrum prediction.
6. Conclusions
In this paper, we proposed a novel graph neural network deep learning framework called TensorGCN-LSTM for spectrum prediction. First, based on the global spatial distribution map of secondary users in the task area, we used the “spatial domain graph structure” to capture the characteristics of electromagnetic wave propagation in space. Additionally, we employed the “frequency domain graph structure” to capture the frequency-domain correlation between spectrum states in different service frequency bands. Subsequently, the LSTM model was used to summarize the temporal variation features of the received power in the secondary users’ network. Finally, by integrating the interaction information of the spatial, frequency, and temporal domains through fully connected layers, we achieved the prediction of spectrum trends under multidimensional information fusion. Experiments on a simulated dataset that explored the multidimensional interactions of spectrum prediction supported the effectiveness of our approach. In future work, we plan to use real measurement data and to incorporate additional domain knowledge, such as terrain structures and weather information, to further enhance the accuracy and robustness of the model in spectrum prediction.