2. System Model
A downlink multi-cell NOMA-CoMP wireless network is illustrated in Figure 1. For simplicity of discussion, we only consider a three-cell NOMA-CoMP network model. The analysis and simulation results under this model also apply to networks with more cells, since the formulas and algorithms we propose are derived from a discussion of B cells, and the conclusions we draw are not affected by the number of cells.
There are three CoMP clusters in Figure 1 (i.e., one JT-NOMA-CoMP cluster and two DPS-NOMA-CoMP clusters). In the JT-NOMA-CoMP cluster, the cell-center users are served by their respective base stations. The edge user with a high rate requirement, identified by the prediction of user behavior, is expected to receive the best service. Meanwhile, the other edge users are low-rate-requirement users according to the user behavior prediction, which means they only need to be served at their target rates. In the $t_s$-th time slot, these low-rate-requirement edge users form NOMA-CoMP clusters with their respective serving base stations through the DPS-CoMP mode. According to the principle of NOMA, each subchannel of a cell is superimposed with multiple users for simultaneous transmission. Therefore, interference occurs on the subchannel, which is called intra-cell interference. In order to recover its desired signal accurately, the receiver of each user adopts SIC, so that a user can eliminate the intra-cell interference caused by the other users on the subchannel whose channel gains are smaller than its own.
First, we consider the DPS-NOMA-CoMP cluster. Each CoMP user is scheduled by only one CoMP-BS, and the other CoMP-BSs do not cause interference to this user (i.e., there is no ICI). Consider a group of users multiplexed on a subchannel, where each user has a distinct channel gain. Moreover, suppose that the SIC decoding order follows the users' indices, i.e., the signal of the first user is decoded first, then the signal of the second user, and so on. Therefore, according to SIC, the first user decodes its desired signal by treating the signals of all other users in the system as intra-cell interference. In this way, each subsequent user can decode its desired signal after eliminating, by applying SIC, the signals of all users whose channel gains are smaller than its own. In this DPS-NOMA-CoMP system, the achievable rate of any user l on subchannel k can be written as Equation (1), where the channel is assumed to be a Rayleigh fading channel over bandwidth B. Equation (1) involves the channel power gain of user l at its receiver, the transmit power allocated to user l, and a binary channel selection parameter whose value is 1 when subchannel k is allocated to user l and 0 otherwise; the receiver noise is additive white Gaussian noise (AWGN) at user l with a given variance.
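As a hedged illustration only, the standard NOMA rate under SIC that matches these definitions can be sketched with assumed symbols ($s_{l,k}$ for the channel selection parameter, $p_{l,k}$ for the transmit power, $h_{l,k}$ for the channel power gain, $\sigma^2$ for the noise variance; the notation of the original formulation may differ):

$$R_{l,k} = s_{l,k}\, B \log_{2}\!\left(1 + \frac{p_{l,k}\, h_{l,k}}{h_{l,k} \sum_{m>l} s_{m,k}\, p_{m,k} + \sigma^{2}}\right),$$

where the sum in the denominator runs over the users on subchannel k whose signals user l cannot cancel, i.e., those with channel gains larger than its own under the assumed SIC ordering.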
Then we consider the JT-NOMA-CoMP cluster. We assume that the users in this cluster are scheduled by at most two CoMP-BSs. The number of cell-center users in each of the two cells is defined accordingly, and the number of edge users with a high rate requirement is defined as well. In this two-BS CoMP-set, the set of non-CoMP-UEs in the first cell, the set of non-CoMP-UEs in the second cell, and the set of CoMP-UEs are all assumed to follow the SIC ordering according to their subscripts. Hence, the achievable data rate of a CCU i served by the first CoMP-BS can be expressed as Equation (2); in order to keep the formula clear, we omit the terms in which the channel selection parameter is 0.
In Equation (2), the two channel coefficients denote CCU i's channel gains with its serving BS (the desired channel) and with the other CoMP-BS (the ICI channel), respectively. One power term represents the transmit power of CCU i from its serving BS, while another represents the power of the serving BS's other cell-center users that are matched onto the same JT-CoMP subchannel as user i but have a higher SIC ordering. The last power term represents the transmit power of the other CoMP-BS's cell-center users, which form a NOMA-CoMP cluster at that BS with the same edge users (i.e., the transmit power causing ICI).
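For illustration, under assumed symbols ($h^{k}_{i,1}$ and $h^{k}_{i,2}$ for CCU i's channel power gains toward its serving BS and the other CoMP-BS, $p^{k}_{i,1}$ for its own transmit power; these names are not taken from the original text), Equation (2) can be sketched as

$$R_{i}^{k} = B \log_{2}\!\left(1 + \frac{p^{k}_{i,1}\, h^{k}_{i,1}}{h^{k}_{i,1} \sum_{m>i} p^{k}_{m,1} + h^{k}_{i,2} \sum_{n} p^{k}_{n,2} + \sigma^{2}}\right),$$

where the first sum in the denominator collects the intra-cell interference from the serving BS's cell-center users with higher SIC ordering and the second collects the ICI from the other CoMP-BS's cell-center users sharing the subchannel.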
The achievable data rate of a CCU j served by the second CoMP-BS can be expressed as Equation (3). In Equation (3), the two channel coefficients represent CCU j's channel gains with its serving BS (the desired channel) and with the other CoMP-BS (the ICI channel), respectively. The intra-cell power terms are analogous to those in Equation (2) but are defined for CCU j in the second cell. Likewise, the ICI term denotes the power of the first CoMP-BS's cell-center users, which form a NOMA-CoMP cluster at that BS with the same edge users.
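Symmetrically, with the roles of the two CoMP-BSs exchanged, Equation (3) can be sketched under the same assumed notation as

$$R_{j}^{k} = B \log_{2}\!\left(1 + \frac{p^{k}_{j,2}\, h^{k}_{j,2}}{h^{k}_{j,2} \sum_{m>j} p^{k}_{m,2} + h^{k}_{j,1} \sum_{n} p^{k}_{n,1} + \sigma^{2}}\right).$$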
The achievable data rate of the edge user e, belonging to the same JT-NOMA-CoMP cluster as in Equations (2) and (3), can be expressed as Equation (4). In Equation (4), the desired-signal term is transmitted jointly from both CoMP-BSs and is written with the transpose of the stacked channel vector of user e toward the two BSs. A second term represents the intra-cell interference caused by the other users in the cluster that have a higher SIC ordering, and the remaining two terms are the ICI from the two jointly scheduled CoMP-BSs, respectively.
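As a hedged sketch of Equation (4), writing $\mathbf{h}_{e}^{k} = [h^{k}_{e,1},\, h^{k}_{e,2}]^{T}$ for the edge user's channel power gains toward the two CoMP-BSs and $\mathbf{p}_{e}^{k} = [p^{k}_{e,1},\, p^{k}_{e,2}]^{T}$ for the corresponding transmit powers (symbols assumed here for illustration), the jointly transmitted desired term and the achievable rate may take the form

$$R_{e}^{k} = B \log_{2}\!\left(1 + \frac{(\mathbf{h}_{e}^{k})^{T}\,\mathbf{p}_{e}^{k}}{I_{\mathrm{intra}} + I_{\mathrm{ICI},1} + I_{\mathrm{ICI},2} + \sigma^{2}}\right),$$

where $I_{\mathrm{intra}}$ gathers the intra-cell interference from cluster users with higher SIC ordering, and $I_{\mathrm{ICI},1}$, $I_{\mathrm{ICI},2}$ gather the ICI from the two jointly scheduled CoMP-BSs.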
In this system model, we need to maximize the sum data rate of the CoMP users in the JT-NOMA-CoMP cluster, while ensuring that the CoMP users in the DPS-NOMA-CoMP cluster and the non-CoMP users in the two clusters reach their target rates. We define a power budget on the k-th subchannel and a total system power budget, and we assume that the number of users multiplexed on each subchannel is M, bounded between a lower bound and an upper bound. The optimization problem is expressed as follows.
Since the transmit power of the BS and the power allocated to each subchannel are limited, the power allocation variables must satisfy constraints C1 and C2. Constraints C3–C6, respectively, represent the minimum rate requirements of the non-CoMP user i served by the first CoMP-BS, the non-CoMP user j served by the second CoMP-BS, the CoMP user e jointly scheduled by both CoMP-BSs, and the user l in the DPS-NOMA-CoMP cluster. Constraint C7 restricts the subchannel allocation indicator to be a discrete binary variable. Constraint C8 ensures that each subchannel is allocated to at most the maximum allowed number of users.
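To make the structure of the problem explicit, a sketch of the formulation under assumed symbols ($P_{k}$ and $P_{\mathrm{tot}}$ for the per-subchannel and total power budgets, $s_{l,k}$ and $p_{l,k}$ for the allocation indicator and power, $R^{\min}$ for the target rates, $M_{\max}$ for the upper bound on multiplexed users; the exact notation of the original problem may differ) is

$$\begin{aligned}
\max_{\{s_{l,k}\},\{p_{l,k}\}}\quad & \sum_{k} R_{e}^{k} \\
\text{s.t.}\quad & \mathrm{C1:}\ \sum_{l} s_{l,k}\, p_{l,k} \le P_{k}\ \ \forall k, \qquad \mathrm{C2:}\ \sum_{k}\sum_{l} s_{l,k}\, p_{l,k} \le P_{\mathrm{tot}}, \\
& \mathrm{C3\text{--}C6:}\ R_{i}^{k} \ge R_{i}^{\min},\ \ R_{j}^{k} \ge R_{j}^{\min},\ \ R_{e}^{k} \ge R_{e}^{\min},\ \ R_{l,k} \ge R_{l}^{\min}, \\
& \mathrm{C7:}\ s_{l,k} \in \{0,1\}, \qquad \mathrm{C8:}\ \sum_{l} s_{l,k} \le M_{\max}\ \ \forall k.
\end{aligned}$$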
Due to the structure of the objective function and the coexistence of discrete subchannel allocation variables and continuous power allocation variables, the entire optimization problem is a mixed integer nonlinear programming (MINLP) problem, whose optimal solution is difficult to determine. However, in practical systems the transmit power is typically set in discrete steps. We therefore discretize the entire optimization problem to facilitate the following work. In Section 3, we predict the requirement rates of users to ensure that the edge users can be divided into two categories.
3. Predict and Leverage Users’ Rate Requirements
In this section, we predict the user requirement rate based on the China Family Panel Studies (CFPS) dataset. Assuming that the base station can obtain the user information completely, we classify the edge users into high-rate-requirement and low-rate-requirement users according to their predicted rates.
3.1. Introduction of Machine Learning Algorithms
Random forest constructs a forest of many decision trees in a random way, and the decision trees in the forest are independent of each other. After the forest has been built, each decision tree makes a separate judgment whenever a new input sample arrives, and the class chosen by the most trees is taken as the predicted class of the sample.
The linear discriminant analysis algorithm tries to find a line such that the projections of points of the same class onto the line are as close as possible, while the projections of points of different classes are as far apart as possible. When a new sample point needs to be classified, its projection onto the line is computed, and its class is determined according to the position of the projection.
Naive Bayes classification is a method based on Bayes' theorem together with the assumption that the features are conditionally independent. First, the joint probability distribution from input to output is learned from the given training set under this independence assumption. Then, based on the learned model, for an input X the output Y that maximizes the posterior probability is selected.
Support vector machine (SVM) is a binary classification model whose basic form is a linear classifier with the largest margin in the feature space. SVMs also include kernel tricks, which make them essentially non-linear classifiers. The learning strategy of SVM is margin maximization, which can be formalized as a convex quadratic programming problem, and the learning algorithm of SVM is an optimization algorithm for solving this convex quadratic program.
K-nearest neighbor (KNN) is one of the simplest machine learning algorithms; it can be used for classification and regression and is a supervised learning algorithm. Its main idea is that if most of the K nearest neighbors of a sample in the feature space belong to a certain class, then the sample also belongs to that class and shares the characteristics of the samples in that class. The KNN method determines the class of a sample to be classified based only on the classes of its one or more nearest samples.
RUS refers to random undersampling, which randomly selects a certain number of majority-class samples and minority-class samples from the dataset to form a training set with a balanced class distribution. Boost refers to the AdaBoost algorithm, meaning that RUS is added to AdaBoost. AdaBoost is an ensemble learning algorithm whose core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (strong classifier). The RUSBoost algorithm replaces the random sampling of AdaBoost with a sampling technique for imbalanced data, which greatly improves the classification accuracy on imbalanced datasets.
3.2. User Requirement Rate Prediction
We use the latest data released by CFPS and select 17,498 samples related to mobile phones (the other samples are unrelated to mobile phone use or come from respondents who refused to answer questions about their mobile phone usage habits). Using the users' profiles as features, we predict the users' communication habits as labels. These features are the independent variables in the user's profile that may affect the user's communication habits. From the data released by CFPS, we screen out about 45 features from 1371 variables.
In order to predict the rate requirements of users from these features, we use "the frequency of using the Internet for entertainment" and "the frequency of using the Internet for contact" to express the rate requirements of video services and of text services (e.g., WeChat or email), respectively. According to the quality of the dataset, two labels are selected to represent the user's requirement rate: (1) the frequency of using the Internet for entertainment and (2) the frequency of using the Internet for contact. Correspondingly, each label is represented by five levels (approximate frequencies) in CFPS: (1) almost every day, (2) several times a week, (3) several times a month, (4) sometimes, and (5) never or no answer. Specifically, "almost every day" is denoted by 1, and "never or no answer" is denoted by 5.
Next, we use machine learning methods to predict the user's communication habits. Specifically, we use random forest, linear discriminant analysis, naive Bayes, linear support vector machine (SVM), and k-nearest neighbor (KNN). In addition, we design a learning method based on the RUSBoost algorithm to predict the communication habits of users. RUSBoost is an algorithm for dealing with class imbalance in data with discrete class labels; it combines RUS (random undersampling) with the standard boosting procedure AdaBoost to model the minority classes better.
We provide confusion matrices for predicting "the frequency of using the Internet for entertainment" and "the frequency of using the Internet for contact" in Figure 2 and Figure 3, respectively, to evaluate the performance of the different learning algorithms. In Figure 2 and Figure 3, the colored digits mark correct predictions. During the simulation, we use cross-validation to prevent overfitting by dividing the dataset into 10 groups and estimating the accuracy on each group. Each row shows the number of users with a given true value, and each column shows the number of users predicted to belong to that value.
We observe that the accuracy of our algorithm for users in classes 1 and 5 is higher. Some users in class 2 are misclassified as class 1 users and are thus given a higher predicted rate requirement; however, the cost of this misclassification is small. For the small-sample users in classes 3 and 4, the learning method based on the RUSBoost algorithm models the minority categories better, so its accuracy outperforms that of the other learning algorithms.
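As a minimal sketch of the evaluation pipeline described above (assuming the roughly 45 CFPS profile features are collected in a matrix X and the five-level frequency label in a vector y; the preprocessing and hyperparameters shown here are illustrative, not those of the paper), the classifiers and the 10-fold cross-validated confusion matrices could be obtained as follows:

# Sketch only: X (CFPS profile features) and y (five frequency levels) are
# assumed to be prepared beforehand from the selected 17,498 samples.
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from imblearn.ensemble import RUSBoostClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

classifiers = {
    "Random forest": RandomForestClassifier(n_estimators=100),
    "Linear discriminant": LinearDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "Linear SVM": LinearSVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RUSBoost": RUSBoostClassifier(n_estimators=50),
}

for name, clf in classifiers.items():
    # 10-fold cross-validation, matching the evaluation described above.
    y_pred = cross_val_predict(clf, X, y, cv=10)
    print(name)
    print(confusion_matrix(y, y_pred))  # rows: true class, columns: predicted class

The class-imbalance handling is what differentiates RUSBoostClassifier from the other models: it undersamples the majority classes before each boosting round, which is consistent with the better performance reported for the minority classes 3 and 4.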
5. Discrete Power Allocation Algorithm Based on Group Search in NOMA-CoMP Systems
In order to achieve the maximum sum rate of the high-rate-requirement users, we need to optimize the users' power allocation variables. Nevertheless, the power allocation values of the paired users are continuous, so they cannot be enumerated by an exhaustive search. However, the power is usually set in discrete steps in existing systems. Therefore, we can discretize the total power into L uniform power levels, so that each candidate power value is an integer multiple of the basic power step. After the power has been discretized, the original optimization problem can be solved based on the idea of group search.
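A hedged way to write this discretization (with $P$ assumed to denote the total power budget and $\ell$ an integer level index; the original symbols may differ) is

$$p \in \left\{\ell \cdot \frac{P}{L} \;\middle|\; \ell = 1, 2, \ldots, L\right\},$$

so that each user's power is restricted to one of L uniform levels, and the group search only needs to enumerate combinations of level indices instead of searching over a continuous power space.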