1. Introduction
Massive multiple-input multiple-output (Massive MIMO), a key technique for fifth-generation (5G) wireless communication systems, is a highly promising solution for meeting demanding requirements on spectrum efficiency and energy efficiency [1, 2, 3]. The full capacity gain rests mostly on the assumption of asymptotic orthogonality of the subchannels, which requires the channel fading coefficients to be independent and identically distributed. However, a very large number of antenna elements is double-edged: it is a great opportunity, but it also challenges the next generation of communication systems. The dense deployment of so many antenna elements inevitably brings new propagation characteristics into wireless channel modeling. Both the number of elements and the physical size of the antenna array give rise to sub-channel correlation, which strongly affects how the Massive MIMO antenna elements work together [4].
In recent years, research has focused on state-of-the-art channel models, among which non-stationary modeling has drawn the most attention. Lund University conducted measurements at 2.6 GHz, and the received power showed discontinuous jumps along the antenna array, indicating inconsistency among the elements [5]. In an outdoor macro-cell, the three-dimensional spatial parameters of azimuth and elevation angle were extracted from field data, and the correlation between scatterer clusters was demonstrated [6]. In high-speed railway channel modeling, non-stationarity is pointed out as a significant requirement, and an angular-spread-dependent model is proposed to describe the relationship between angular dispersion and spatial correlation [7, 8]. In an outdoor measurement with a 64-antenna array at 3.3 GHz, the angular power spectrum and the power delay profile show that the channel is stationary neither in the spatial domain nor in the frequency domain [9]. In another measurement, with a 128-antenna array, the non-stationary characteristic of the channel is further confirmed through the influence of the multipath distribution [10]. In a series of measurements at 2, 4 and 6 GHz bandwidth, the non-stationary characteristic of the channel is also confirmed by the correlation bandwidth [11]. Some works cast light on visible clusters and visible regions, based on which a novel Massive MIMO model is proposed. Moreover, the contribution of non-stationarity to the sum rate of uplink channels is shown by numerical analysis [12]. In these measurements, most multipath components appear in clusters. Here, a cluster is a set of multipath components with similar power, delay and spatial characteristics. Many traditional models are based on clusters, such as Saleh-Valenzuela [13], COST2100 [14], 3GPP SCM [15] and WINNER [16].
The traditional geometric model is mathematically derived, which limits its generality to some extent. Because the channel parameters depend on the measurement data, the model applies only to cases that are statistically similar to the measurement environment. Once the environment changes, the original model degrades or even fails. Therefore, with the development of information technology, a multipath clustering algorithm based on big data technology is introduced into Massive MIMO wireless channel modeling. It determines the random channel parameters through model training, so that the channel model can be constructed quickly and effectively from real measurement data.
It is an inevitable trend that statistics-based big data technology will be widely adopted in wireless channel modeling. With larger antenna arrays, wider channel bandwidths, finer subdivision of transmission scenarios and a growing scale of information, the volume of in-field data shows “exponential growth”. It is impossible to mine such huge amounts of channel measurement data by human labor alone. Big data technology deals with massive and complex data and extracts valuable information quickly. Applying mathematical algorithms to massive data in order to classify data types, analyze regularities and extract relationships helps people explore correlations in complex data quickly and effectively. Machine learning, as a branch of artificial intelligence, is a key method for clustering big data, and the theory of artificial intelligence has developed rapidly in recent years. Machine learning helps to enhance the adaptability of the channel model: given the framework of the model, the only remaining work is to train it with massive in-field data. Before that, a training criterion such as the minimum mean square error (MMSE) can be chosen. Good performance and practicability are important in channel modeling, because they reduce the cost of re-modeling.
Massive MIMO wireless channel modeling takes a step forward by applying the methods of big data and machine learning. Compared with logical approaches such as deduction and reasoning, the big data method pays more attention to the interconnections within the data than to causal analysis. It also differs from the traditional modeling method based on geometric statistics. The geometric model usually starts from a certain physical framework, and the model parameters are then estimated from a large amount of measurement data. Big data technology is a sequence of searching, comparison and clustering operations that aims to reveal the inner connections among massive data sets. Taking the spatial dimension into account makes the channel model more complex. However, if we capture the key correlation terms, which have been verified to exist among the entries of the channel matrix, it becomes possible to reduce the dimensionality. Model parameters are usually determined by statistical and theoretical deduction, whereas the big data method estimates them through model training; that is, a large amount of in-field measurement data is used to train the model within a given framework. The optimal parameter set can then be calculated, and the model is determined. The innovative application of big data technology in channel modeling has attracted wide attention in academia.
The main contributions of this paper are realistic channel measurements, parameter estimation based on big data technology and state-of-the-art modeling. We focus on correlation and non-stationary characteristics. First, the measurement campaign is described. The platform is equipped with a scalable virtual 128-element antenna array; the raw measurement data are recorded, and the channel impulse response (CIR) is illustrated. Next, a series of high-resolution estimation algorithms is employed, and the data are analyzed using big data technology. Then, a state-of-the-art channel model is presented based on the antenna correlations over the array. Last, we draw conclusions.
The paper is organized as follows. Section 2 introduces the Massive MIMO channel measurement campaign and the primary signal processing. Section 3 outlines the clustering algorithm, presents the channel parameters extracted from the measured snapshots and establishes the non-stationary Massive MIMO channel model. Finally, Section 5 summarizes our contributions and draws conclusions.
3. Channel Model
The following sections present the results of data measured on Massive MIMO radio channels. These investigations aim to reveal the critical correlation features in the experimental data.
According to the measured results, the correlated fading characteristics differ across antenna elements, so the channel is globally non-stationary over the Massive MIMO antenna array. For closely placed elements, however, the fading is similar because they lie in the same shadow region. Thus, the sub-channels covered by the same shadow region can be considered stationary.
Generally, there are two methods to map out the stationary intervals. The first is the averaging method. According to an empirical model, the antenna elements are divided into several subsets, each composed of a fixed number of elements. This method is easy to apply, but because it ignores the impact of the specific propagation scenario, its accuracy is low. The second is the method of inspection. The parameters of the multipath clusters are estimated from the measurement results [21], the visible region of each cluster is extracted, and the antenna elements are then grouped into subsets so that the channel is stationary within each group. Here, a group of antenna elements is covered by one visible region in terms of radio propagation. The boundaries of the visible regions are thereby formed, and it should be noted that the groups are irregular in size. In the method of inspection, the division of the array into stationary subintervals depends greatly on the operator’s observation and judgement. In addition, the larger the physical size of the array, the more difficult the operation becomes.
To overcome the limitations of these methods, we introduce an automatic clustering algorithm named the ECD (element channel distance) algorithm. It builds on the traditional MCD (multipath component distance) algorithm applied in mathematics [22, 23, 24] and on clustering algorithms applied in pattern recognition [25, 26]. Clustering classifies data objects into multiple subsets according to certain rules: objects in the same subset have higher similarity, while objects in different subsets have lower coupling. Considering the special characteristics of radio channels, we extend the traditional algorithms by introducing the continuous coverage of the cluster visible region along the dimension of the extended antenna array. Distance is the most popular criterion in multipath clustering for radio channel modeling: multipath components separated by small distances are assigned to the same cluster. In practice, the distance is a compound distance over the multipath parameters, in which the contribution of each parameter is taken into account through a weighting coefficient. This is a highly efficient clustering method based on machine learning and artificial intelligence.
First, the goal of clustering is to divide all measurement data into several subsets with high correlation within each subset and low coupling between subsets. The resulting subsets, or clusters, are composed of multipath components with similar attributes, including energy, delay and angle.
Second, clustering relies on a rich supply of data. This includes not only the raw measurement data but also the channel parameters of each antenna element, such as power, delay and angle, estimated by algorithms such as SAGE. All of these are called preprocessed data.
Finally, the stationary subinterval programming is carried out; that is, the original set of preprocessed data is divided into classes with similar attributes according to certain criteria. In this process, traditional human observation is replaced by machine searching, and potential links between the data are uncovered with the aid of the algorithm.
The steps of the programming algorithm are shown in Figure 6.
In the first step, the KPowerMeans algorithm is used to cluster the multipath components in each sub-channel of the antenna array. It is one of the most commonly used clustering algorithms in the field of machine learning, and it is an iterative algorithm that is simple and scalable.
The algorithm takes the PDP of the multipath as the initial object, searches for the peaks on the curve as the initial cluster core, and determines the number of clusters
. Then the MCD can be calculated as
where
is the parameter set of multipath components. The parameters of multipath components and cluster cores include power, relative delay, departure angle, arrival angle, etc. Considering three-dimensional space, the angle information should include azimuth angle and elevation angle, so we have
Then, the distance in the time-delay domain can be expressed as
where
, and
means the weight coefficient of time-delay.
and
are the delay of the current cluster core and the current multipath component, respectively.
The distance in the angle domain is
In order to reduce the complexity, (9) can be simplified as
It should be noted that the distance in the angle domain should be calculated for the departure angle and arrival angle, respectively.
The final MCD distance is the RMS model of the weighted sum of time-delay and angle domain distance as follows:
At the same time, we can calculate the matrix of distance as
The minimum distance method is used to search for the optimal grouping. The process can be summed up as a linear programming problem expressed as
where
is a two-dimensional power matrix.
According to the programming results, each multipath component is assigned to the nearest cluster, and the cluster cores are recalculated. The procedure is iterated until all the multipath components are assigned to their corresponding clusters.
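The iteration described above can be sketched as follows. This is a minimal illustration and not the implementation used in the measurements: it assumes the widely used form of the MCD, in which the delay distance is scaled by the delay spread and a weight `zeta`, the angular distance is half the Euclidean distance between direction unit vectors, and the total distance is the root of the summed squares. The column layout of `paths`, the elevation convention and all function names are assumptions made for this example.

```python
import numpy as np

def direction(az, el):
    """Unit vector for azimuth/elevation in radians.
    Assumption: elevation is measured from the horizon."""
    return np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)], axis=-1)

def mcd(paths, cores, zeta=1.0):
    """Distance between every path and every cluster core.

    paths: (L, 5) array, columns [delay, AoD az, AoD el, AoA az, AoA el]
    cores: (K, 5) array with the same columns
    returns an (L, K) array of distances
    """
    tau = paths[:, 0]
    span = max(tau.max() - tau.min(), np.finfo(float).tiny)  # avoid 0/0
    p, c = paths[:, None, :], cores[None, :, :]
    # Delay-domain distance, weighted by zeta and scaled by the delay spread.
    d_tau = zeta * np.abs(p[..., 0] - c[..., 0]) / span * tau.std() / span
    # Angle-domain distances, computed separately for departure and arrival.
    d_aod = 0.5 * np.linalg.norm(
        direction(p[..., 1], p[..., 2]) - direction(c[..., 1], c[..., 2]), axis=-1)
    d_aoa = 0.5 * np.linalg.norm(
        direction(p[..., 3], p[..., 4]) - direction(c[..., 3], c[..., 4]), axis=-1)
    return np.sqrt(d_tau**2 + d_aod**2 + d_aoa**2)

def kpowermeans(paths, powers, init_cores, n_iter=50, zeta=1.0):
    """Assign paths to the nearest core, then move each core to the
    power-weighted mean of its members; stop when labels stop changing."""
    cores = np.array(init_cores, dtype=float)
    labels = None
    for _ in range(n_iter):
        new_labels = np.argmin(mcd(paths, cores, zeta), axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for k in range(len(cores)):
            members = labels == k
            if members.any():
                # Simplification: angles are averaged arithmetically,
                # which ignores wrap-around at +/- pi.
                cores[k] = np.average(paths[members], axis=0,
                                      weights=powers[members])
    return labels, cores
```

In this sketch the path power enters only through the power-weighted update of the cluster cores, because a per-path power factor does not change which core is nearest.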
Taking the outdoor LOS scenario as an example, the measured PDP of an antenna sub-channel is plotted as the curve in Figure 7a. The average noise power and the signal threshold are obtained from the noise floor estimation algorithm and are indicated by the horizontal solid line segment in the figure. The signal fluctuates, and peaks and valleys can be seen on the curve. According to the theory of radio wave propagation, the peaks represent the arrival of multipath components with strong energy. Among all the multipath components produced by spatial scattering, we assume that the LOS path exists; it evidently has the smallest delay and the strongest power. Therefore, based on the traditional multipath peak search algorithm, we can reasonably set the number of clusters to the number of peaks on the PDP and define the cluster cores as the multipath components corresponding to these peaks. The core locations are marked in Figure 7a.
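A minimal sketch of this peak search is given below, assuming the PDP is available in dB on a uniform delay grid. The function name and the `margin_db` offset above the noise floor are hypothetical; the source only states that a threshold is derived from the noise floor estimate.

```python
import numpy as np

def initial_cluster_cores(pdp_db, delays, noise_floor_db, margin_db=6.0):
    """Return indices and delays of PDP peaks above the signal threshold.

    A sample counts as a peak when it exceeds its left neighbour, is not
    lower than its right neighbour, and lies above noise floor + margin.
    margin_db is an assumed value, not taken from the measurements.
    """
    pdp_db = np.asarray(pdp_db, dtype=float)
    threshold = noise_floor_db + margin_db
    interior = pdp_db[1:-1]
    is_peak = ((interior > pdp_db[:-2])
               & (interior >= pdp_db[2:])
               & (interior > threshold))
    idx = np.flatnonzero(is_peak) + 1  # shift back to full-array indices
    return idx, np.asarray(delays)[idx]

# Toy PDP with two peaks above the threshold and one small ripple below it.
pdp = [-100, -60, -90, -95, -70, -98, -99, -97, -100]
delays_ns = np.arange(len(pdp)) * 10.0
peak_idx, peak_delays = initial_cluster_cores(pdp, delays_ns, noise_floor_db=-100)
n_clusters = len(peak_idx)
```

The number of detected peaks gives the number of clusters, and the peak positions provide the initial cluster cores in the delay domain.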
The inherent flaw of the KPowerMeans algorithm is that a random choice of the initial cluster cores may cause the classification result to converge to a local optimum rather than the global optimum. This problem can be overcome by peak searching on the PDP: determining the number of clusters and the initial cluster cores from measurement data greatly improves the effectiveness and reliability of the results. However, the peak search considers only the distribution of multipath components in the delay domain, so on its own it is necessary but not sufficient for a good classification. On the other hand, the main advantage of the KPowerMeans algorithm is that the contributions of the spatial parameters are fully considered. By analyzing the multipath components in the spatial domain, the AOA and AOD parameter pairs are obtained, and the MCD in the delay domain and in the angle domain are combined to cluster the multipath components.
Taking the outdoor LOS scenario as an example, the results of the final multipath clustering are shown in Figure 7b. In the figure, the blue solid dots represent the spatial distribution of the multipath components in the sub-channel of an array element, and the red solid dots represent the clustering result, namely the statistical distribution of the cluster cores. The MCD algorithm is simple to implement, and its clustering performance is improved compared with the traditional algorithm.
In the second step, the ECD algorithm is used to program the stationary subintervals of the antenna array based on the theory of cluster visible regions. Because of obstacles, the radio wave from a given scatterer reaches only some elements of the antenna array; that is, only some elements are covered by that scatterer. It can be assumed that radio waves coming from the same scatterer behave consistently, so a stationary region can be defined. When programming the stationary subintervals along the array dimension, the key is to determine similarity. The distance criterion is used again, now in the form of the distance between correlation matrices, which measures the similarity between two correlation matrices. The accuracy of the measurements directly impacts the model. The correlation matrix distance is expressed as [18]
Here is the autocorrelation expression of sub-channel matrix. is the trace of matrix, and is the Frobenius norm.
The search for stationary subintervals is in fact the process of evaluating the similarity of correlation matrices. The algorithm assumes that different stationary subintervals do not overlap. We mainly focus on the influence of local scatterers on the link from the user to the base station, where the signals from the user arrive at some or all elements of the antenna array after scattering. When only some elements are under the coverage of a scatterer, the visible region of the scatterer covers only those elements, as shown in Figure 3.
According to the Cauchy-Schwarz inequality, for matrices
and
, we have
Here
is the necessary and sufficient condition for the equation to hold.
means the inner product, which can be decomposed by the singular value as below:
and represent the eigenvector matrix and the eigenvalue diagonal matrix of , respectively. If the value tends to zero, the two matrices differ only by a scalar factor. Conversely, if the value tends to 1, the matrices differ to the greatest possible extent.
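As an illustration of this behaviour, the sketch below evaluates the distance in its commonly used form, one minus the trace of the matrix product divided by the product of the Frobenius norms. The symbols and function names are assumptions made for the example, and the sample-average estimate of the correlation matrix is only one possible choice.

```python
import numpy as np

def correlation_matrix(h):
    """Sample estimate of R = E[h h^H].

    h: (n_snapshots, n_elements) complex channel samples, one row per snapshot.
    """
    h = np.asarray(h)
    return h.T @ h.conj() / h.shape[0]

def correlation_matrix_distance(r1, r2):
    """1 - tr(R1 R2) / (||R1||_F ||R2||_F); the result lies in [0, 1] for
    Hermitian positive semidefinite inputs."""
    num = np.trace(r1 @ r2).real
    den = np.linalg.norm(r1, "fro") * np.linalg.norm(r2, "fro")
    return 1.0 - num / den

r = np.array([[2.0, 1.0], [1.0, 2.0]])
same_up_to_scale = correlation_matrix_distance(r, 3.0 * r)       # close to 0
orthogonal = correlation_matrix_distance(np.diag([1.0, 0.0]),
                                         np.diag([0.0, 1.0]))    # equal to 1
```

Scaling one matrix leaves the distance at zero, while two matrices with orthogonal signal subspaces reach the maximum distance of one.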
Equation (14) can be rewritten in the following equivalent forms
Therefore, it can be assumed that the fading of radio clusters from the same scatterer is stationary, so that the stationary region
q can be defined, including the number of antenna elements
. Here, the threshold of the correlation coefficient is 0.75. It is considered that the two correlation matrices are nonstationary if
. Note that we suppose there is no overlap between regions, so there are
Then, the non-stationary Kronecker correlation matrix is
Where is the operator of matrix diagonalization, which is realized by arranging the vector element to the diagonal position of the target matrix. is the square matrix.
The cluster correlation distance can be calculated by a stepping rate of 1 on the array, and the M-element distance vector can be obtained.
If is larger than the preset threshold, and there are values within , then the correlation matrix is divided into matrix blocks with number . In each matrix block with dimension , the cluster correlation distance is recalculated with stepping rate of 2, and a element vector is obtained. The entry of matrix block larger than the threshold is searched, and the block is further split. This is repeated until the matrix dimension is smaller than the stepping rate. Finally, every splitting block is the stationary subinterval of the correlation matrix.
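The search for subinterval boundaries can be illustrated with a deliberately simplified, single-pass sketch. It assumes one local correlation matrix per array position, compares neighbouring positions with a stepping rate of 1, and opens a new subinterval wherever the distance exceeds a threshold. The recursive refinement with larger stepping rates is not reproduced here, and the default threshold of 0.25 (taken as one minus the 0.75 correlation threshold) is an assumption.

```python
import numpy as np

def correlation_matrix_distance(r1, r2):
    """1 - tr(R1 R2) / (||R1||_F ||R2||_F)."""
    num = np.trace(r1 @ r2).real
    den = np.linalg.norm(r1, "fro") * np.linalg.norm(r2, "fro")
    return 1.0 - num / den

def stationary_subintervals(r_list, threshold=0.25):
    """Split array positions into non-overlapping contiguous subintervals.

    r_list: sequence of local correlation matrices, one per array position.
    Returns (intervals, distances), where intervals are half-open index
    pairs (start, stop) and distances[m] compares positions m and m + 1.
    """
    distances = np.array([
        correlation_matrix_distance(r_list[m], r_list[m + 1])
        for m in range(len(r_list) - 1)
    ])
    # A boundary is placed after every position whose neighbour is too far.
    boundaries = (np.flatnonzero(distances > threshold) + 1).tolist()
    edges = [0, *boundaries, len(r_list)]
    intervals = [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]
    return intervals, distances

# Toy example: three positions see scatterer A, the next two see scatterer B.
a, b = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
intervals, distances = stationary_subintervals([a, a, a, b, b])
```

Here the single jump in distance between the third and fourth positions yields two stationary subintervals, covering positions 0 to 2 and 3 to 4.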
In fact, the ECD algorithm is a linear projection algorithm that aims to reduce the dimensionality of a high-dimensional data space, such as the vector set of power, delay and angle in our application. The channel parameters are combined by linear weighting and converted into a one-dimensional object. The correlation matrix is then constructed using classification and clustering methods that maximize the distance between classes. The transformation can easily be performed both forward and backward. In the mapping space, the algorithm maximizes the variance of the objects, so that the projections of all samples are separated as much as possible to complete the subset programming. Therefore, the ECD algorithm retains the contribution of every parameter while simplifying the calculation. The method shows that the whole can be assembled from its parts.
In the above example, the multipath distribution on the array domain is shown in Figure 8a. By employing the ECD algorithm, the visible regions mapped on the array domain are shown in Figure 8b.
In measurements, a cluster contains many multipath components, and clustering consumes computing resources. In terms of the channel matrix, the correlated multipath components within a cluster consume array degrees of freedom to some degree. Because the multipath components in a cluster are highly correlated, the loss in channel gain caused by the clustering method is limited and acceptable. Therefore, when modeling the Massive MIMO channel, the multipath components within a cluster can be treated as a single path to simplify the process. Here, the statistical method for initializing the cluster cores is replaced by a search technique, and the influence of dispersion is ignored. This greatly simplifies the extraction of channel parameters without losing the accuracy of the model and, most importantly, greatly improves the efficiency of modeling. Although the correlation-based channel model ignores the small-scale dispersion characteristics of the channel in the time, space and frequency domains, it captures the key elements of the modeling process. In practical applications, such as wireless communication system simulations at both the link level and the system level, the Monte Carlo method is used to randomly generate multiple groups of model parameter samples at the user end. The randomness of the sample distribution can be used to simulate the small-scale dispersion characteristics and thus compensate for this shortcoming.
We again take the outdoor LOS scenario as an example. Following the above steps, we extract the multipath cluster parameters to model the channel. The resulting channel correlation matrix is shown in Figure 9, where the block structure of the correlation matrix can be clearly observed.
In the above analysis, we only consider the case in which the transmitter is equipped with multiple antennas and the receiver has a single antenna, so only the correlation matrix of the transmitter is obtained. However, because the transmitter and the receiver are assumed to be independent, the correlation matrix of the receiver can be obtained by the same method. Finally, according to the Kronecker modeling method, the system matrix of the Massive MIMO wireless channel is obtained by combining the transmit-side and receive-side correlation matrices.
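For illustration, the sketch below generates one channel realization with the standard Kronecker form, in which an i.i.d. complex Gaussian matrix is multiplied by the square root of the receive correlation matrix on the left and by the transposed square root of the transmit correlation matrix on the right. The block-diagonal transmit correlation stands in for the non-overlapping stationary subintervals; the block contents and all function names are assumptions for this example and are not measured values.

```python
import numpy as np

def block_diag_correlation(blocks):
    """Place per-subinterval correlation blocks on the diagonal of one matrix."""
    n = sum(b.shape[0] for b in blocks)
    r = np.zeros((n, n), dtype=complex)
    start = 0
    for b in blocks:
        k = b.shape[0]
        r[start:start + k, start:start + k] = b
        start += k
    return r

def matrix_sqrt(r):
    """Hermitian square root of a positive semidefinite matrix."""
    w, v = np.linalg.eigh(r)
    w = np.clip(w, 0.0, None)  # remove tiny negative eigenvalues from round-off
    return (v * np.sqrt(w)) @ v.conj().T

def kronecker_channel(r_rx, r_tx, rng):
    """One realization H = R_rx^(1/2) G (R_tx^(1/2))^T with i.i.d. CN(0, 1) G."""
    n_rx, n_tx = r_rx.shape[0], r_tx.shape[0]
    g = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2.0)
    return matrix_sqrt(r_rx) @ g @ matrix_sqrt(r_tx).T

# Two fully correlated subintervals of 2 and 3 elements, single-antenna receiver.
r_tx = block_diag_correlation([np.ones((2, 2)), np.ones((3, 3))])
h = kronecker_channel(np.eye(1), r_tx, np.random.default_rng(0))
```

With fully correlated blocks, the entries of `h` are identical within each subinterval and independent across subintervals, which mirrors the block structure of the correlation matrix.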