1. Introduction
Multiple-input multiple-output (MIMO) technology has been widely applied in wireless communications to improve spectrum efficiency [1]. The main advantage of MIMO systems is their ability to use spatial multiplexing techniques to greatly improve the spectral efficiency of communication systems under bandwidth and power constraints [2]. Modern communication systems therefore often combine MIMO with high-order quadrature amplitude modulation (QAM) to improve spectral efficiency.
However, the widespread deployment of high-dimensional MIMO systems is hampered by a number of limitations. In practical scenarios, communication systems are configured with a large number of radio frequency units, and the receiver's signal-processing complexity becomes very high [3]. Moreover, the complexity of high-performance maximum likelihood (ML) detection or sphere decoding (SD) increases exponentially with the number of antennas and the modulation order [4]. Therefore, low-complexity, high-performance MIMO detection techniques are of great research importance.
For MIMO communication systems, the ML detector must traverse all possible transmitted vectors, provided that the receiver has channel state information (CSI) and that the transmitted symbols obey a uniform distribution [5]. ML detection is an NP-hard problem [6] and is too complex to apply in practice when the number of transmitting antennas is large and high-order modulation is used. Therefore, simplifying signal detection while approximating the performance of ML detection has become a key research topic in the communication community.
The expectation propagation (EP) detection algorithm is a low-complexity algorithm with high performance. EP is a Bayesian machine learning algorithm that efficiently handles continuous distributions through moment matching [7]. Therefore, it can handle more complex and general approximating functions [8].
The performance of EP detection is very close to that of ML detection. Even for very large MIMO configurations, such as 250 transmitting antennas, it takes only 7∼8 iterations to approximate the original distribution, and the complexity of each iteration is dominated by a matrix inversion [9]. Simulation results show that the number of iterations required by the EP algorithm is independent of the modulation order of the system, and the algorithm complexity remains polynomial in the number of antennas [10]. Moreover, EP detection is a soft-output algorithm that provides a posterior probability estimate for each received symbol, which can naturally be used for further channel decoding [11].
The integration of artificial intelligence in communication will be a mainstream direction of sixth-generation mobile communication (6G) development [12]. Coupled with improvements in hardware and computing power, machine learning technology has become widely used.
Machine learning is a type of artificial intelligence that requires large amounts of data to train models that ideally meet the given requirements [13]. A neural network is a machine learning algorithm that belongs to supervised learning. Supervised learning means that the output labels of the training data are real and certain, so the training error can be computed accurately and used to guide the model's training [14]. A neural network is composed of a large number of interconnected nodes. During iterative training, the parameter values of the nodes are gradually optimized and the loss function is steadily reduced, so that the network accurately maps the input data to the output data [15].
Deep learning is a branch of machine learning that has attracted great attention in recent years, with a large number of experts conducting in-depth research. Deep learning can be understood as a longitudinal extension of neural network algorithms [16]. Building on the basic neural network, the number of network layers and the network's inference process are studied in detail, so that the model has more powerful learning ability and can learn extremely complex mappings after extensive iterative training [17]. In recent years, deep learning has been used in various fields, such as natural language processing, computer vision, and recommendation algorithms.
In the past, most mainstream approaches relied on the powerful learning ability of neural networks to improve algorithm performance. The essence of the network's inference process, namely prediction on new data, is several matrix multiplication operations. Therefore, a novel idea is to learn some complex steps of the MIMO detection algorithm with a neural network, converting the complex calculation process into a few matrix multiplications and thereby greatly improving the efficiency of the detection algorithm. This strategy not only effectively reduces the complexity of the detection algorithm through matrix products, but also approximates the original detection performance by virtue of the fitting ability of the neural network.
The EP detection algorithm has good performance, but each iteration still contains some complicated operations, such as matrix inversion and moment matching. These calculations are not conducive to a low-complexity hardware implementation. Therefore, this paper uses an artificial intelligence approach to optimize some of the complex steps of the EP algorithm, further reducing its complexity while approximating its performance. The key contributions of this work are as follows:
- We analyze the steps with high complexity, then use a neural network model to learn and map some time-consuming steps in the EP detection algorithm, converting the complex operation process into several matrix multiplication operations to reduce the complexity of the detection algorithm.
- The improved method is tested with different MIMO channel models and verified to have strong robustness and adaptability. The performance and applicability of the method are verified in different scenarios.
- Considering the problem of excessive differences among data with different SNRs, the SNR range is divided into intervals, and a separate neural network model is trained on the data of each interval. This guarantees the accuracy of the model's mapping process.
- A simple structure of 4 × 32 × 16 × 2 is chosen for the neural network model to minimize the prediction complexity and the space occupied while maintaining sufficient learning capability.
The rest of this paper is organized as follows. In Section 2, we present the MIMO system model and briefly summarize and compare common detection algorithms. In Section 3, we detail the EP detection algorithm. In Section 4, we first analyze the high-complexity steps of the EP algorithm and determine the application strategy of the neural network, then detail the model building and training process, and finally embed the neural network into the EP detection algorithm. Simulation results and the corresponding analysis are shown in Section 5. A conclusion is provided in Section 6.
Notation: Column vectors and matrices are denoted by bold lowercase and bold uppercase letters, respectively. $\mathbf{I}$ is an identity matrix. $(\cdot)^H$ and $(\cdot)^T$ are the conjugate transpose and transpose operators, respectively. $\mathbb{C}^{m \times n}$ denotes the complex matrix (or vector) with a scale of $m \times n$. $\mathbb{E}[\cdot]$ and $\mathrm{Cov}[\cdot]$ represent the expectation operator and covariance operator, respectively. $\mathcal{CN}(0, \sigma^2)$ denotes a circularly symmetric complex Gaussian (CSCG) distribution with a mean of 0 and a variance of $\sigma^2$.
2. MIMO Symbol Detection
In a MIMO system, an important problem is how to recover the signal in the presence of noise; this is the process of MIMO detection. Consider a MIMO communication system with $n$ transmitting antennas and $m$ receiving antennas, in which the transmitted symbol vector $\mathbf{x}$ passes through the complex channel matrix $\mathbf{H}$ and the Gaussian white noise $\mathbf{w}$ is added to obtain the output symbol vector $\mathbf{y}$. It can be expressed as:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w},$$
where $\mathbf{x} \in \mathcal{A}^n$ is the i.i.d. transmitted symbol vector drawn from the constellation set $\mathcal{A}$; $\mathbf{H} \in \mathbb{C}^{m \times n}$ is the complex MIMO channel matrix; $\mathbf{w}$ is independent additive complex Gaussian white noise with a mean of 0 and a variance of $\sigma^2$, that is, $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$; and $\mathbf{y}$ is the output symbol vector.
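The system model can be simulated directly. This is a minimal numpy sketch; the unit-energy 4-QAM alphabet and the 4 × 8 antenna configuration are illustrative choices, not taken from the paper's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 8                                             # transmit / receive antennas
A = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)  # unit-energy 4-QAM

x = rng.choice(A, size=n)                               # i.i.d. transmitted symbols
H = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
sigma2 = 0.01                                           # noise variance
w = np.sqrt(sigma2 / 2) * (rng.standard_normal(m) + 1j * rng.standard_normal(m))

y = H @ x + w                                           # received symbol vector
```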
The signal-to-noise ratio (SNR) is an important parameter of a communication system [18,19]. A higher SNR means better communication conditions. The SNR of the MIMO communication system can be expressed as:
$$\mathrm{SNR} = \frac{n E_s}{\sigma^2},$$
where $E_s$ represents the average energy of the transmitted symbols.
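Given this definition, the noise variance for a target SNR follows directly. A small sketch, assuming the convention $\mathrm{SNR} = n E_s / \sigma^2$:

```python
import numpy as np

def noise_variance(snr_db: float, n: int, Es: float = 1.0) -> float:
    """Noise variance sigma^2 for a target SNR = n * Es / sigma^2 (assumed convention)."""
    return n * Es / (10 ** (snr_db / 10))

sigma2 = noise_variance(20.0, n=8)   # 20 dB with 8 transmit antennas
```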
Signal detection means that the receiver obtains the output symbol vector $\mathbf{y}$ and restores the transmitted symbol vector $\mathbf{x}$ from it. MIMO detection often adopts Bayesian estimation to estimate the expectation of the posterior probability. It can be expressed as:
$$\hat{\mathbf{x}} = \mathbb{E}\left[ \mathbf{x} \mid \mathbf{y} \right].$$
The posterior probability of the transmitted symbol vector can be expressed as:
$$p(\mathbf{x} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x}) = \mathcal{N}\!\left( \mathbf{y}; \mathbf{H}\mathbf{x}, \sigma^2 \mathbf{I} \right) \prod_{k=1}^{n} \mathbb{I}\left( x_k \in \mathcal{A} \right).$$
This is a typical variation of the Bayes formula, where $\mathcal{N}(\mathbf{y}; \mathbf{H}\mathbf{x}, \sigma^2 \mathbf{I})$ represents a normal distribution with mean $\mathbf{H}\mathbf{x}$ and covariance matrix $\sigma^2 \mathbf{I}$. The indicator $\mathbb{I}(x_k \in \mathcal{A})$ takes the value 1 if $x_k \in \mathcal{A}$; otherwise, it is 0. ML detection can be expressed as:
$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{A}^n} \left\| \mathbf{y} - \mathbf{H}\mathbf{x} \right\|^2,$$
where $\hat{\mathbf{x}}_{\mathrm{ML}}$ is the result of ML detection, $\mathbf{x}$ is the complex transmitted symbol vector taken from the $n$-dimensional complex symbol set $\mathcal{A}^n$, $\mathbf{H}$ is the $m \times n$ MIMO complex channel matrix, and $\mathbf{y}$ is the channel observation vector.
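The exhaustive search in ML detection can be written in a few lines, which also makes the $|\mathcal{A}|^n$ cost visible. An illustrative sketch with a small real-valued BPSK example:

```python
import numpy as np
from itertools import product

def ml_detect(y, H, alphabet):
    """Exhaustive ML detection: argmin over all |A|^n candidate vectors of ||y - Hx||^2."""
    n = H.shape[1]
    best, best_metric = None, np.inf
    for cand in product(alphabet, repeat=n):   # |A|^n candidates -- exponential in n
        x = np.array(cand)
        metric = np.linalg.norm(y - H @ x) ** 2
        if metric < best_metric:
            best, best_metric = x, metric
    return best

# Noiseless sanity check: ML recovers the transmitted vector exactly.
A = np.array([-1.0, 1.0])                      # BPSK for brevity
H = np.array([[1.0, 0.3], [0.2, 1.0]])
x_true = np.array([1.0, -1.0])
x_hat = ml_detect(H @ x_true, H, A)
```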
Equation (1) is a complex system model. It can be converted to an equivalent real system model [20]. After transforming the matrix and signals by
$$\tilde{\mathbf{H}} = \begin{bmatrix} \Re(\mathbf{H}) & -\Im(\mathbf{H}) \\ \Im(\mathbf{H}) & \Re(\mathbf{H}) \end{bmatrix}, \quad \tilde{\mathbf{x}} = \begin{bmatrix} \Re(\mathbf{x}) \\ \Im(\mathbf{x}) \end{bmatrix}, \quad \tilde{\mathbf{y}} = \begin{bmatrix} \Re(\mathbf{y}) \\ \Im(\mathbf{y}) \end{bmatrix}, \quad \tilde{\mathbf{w}} = \begin{bmatrix} \Re(\mathbf{w}) \\ \Im(\mathbf{w}) \end{bmatrix},$$
the equivalent real system model can be expressed as:
$$\tilde{\mathbf{y}} = \tilde{\mathbf{H}}\tilde{\mathbf{x}} + \tilde{\mathbf{w}}.$$
In the real system model, the noise power is $\sigma^2/2$ and the average energy of the transmitted symbols is $E_s/2$. In the rest of this paper, we adopt the real-valued channel model formulation in (10). We will use the EP detection algorithm to obtain the detected transmitted signal from the received signal. Then we use the neural network to optimize the EP detection algorithm and compare the performance of both in this real-valued channel model.
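The complex-to-real conversion is a standard transform and can be verified numerically (a minimal sketch with a 1 × 1 channel):

```python
import numpy as np

def to_real(H, y):
    """Equivalent real-valued model: stack real and imaginary parts (standard transform)."""
    Hr = np.block([[H.real, -H.imag],
                   [H.imag,  H.real]])
    yr = np.concatenate([y.real, y.imag])
    return Hr, yr

H = np.array([[1 + 2j]])
x = np.array([3 - 1j])
Hr, yr = to_real(H, H @ x)
xr = np.concatenate([x.real, x.imag])
# The real-valued model reproduces the complex product: yr == Hr @ xr.
```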
The detection algorithms commonly used in communication systems fall into the following categories: linear detection, nonlinear detection, optimal detection, and so on [21]. A summary of common MIMO detection algorithms is shown in Table 1.
3. Expectation Propagation Detection Algorithm
The EP detection algorithm uses a Gaussian approximation $q(\mathbf{x})$ to approximate the signal posterior distribution $p(\mathbf{x} \mid \mathbf{y})$ [22]. Therefore, the optimal EP solution is that the mean $\boldsymbol{\mu}$ and variance $\boldsymbol{\Sigma}$ of the Gaussian approximation $q(\mathbf{x})$ agree with those of the posterior distribution $p(\mathbf{x} \mid \mathbf{y})$. The mean and variance can be expressed as:
$$\boldsymbol{\mu} = \mathbb{E}\left[ \mathbf{x} \mid \mathbf{y} \right], \qquad \boldsymbol{\Sigma} = \mathrm{Cov}\left[ \mathbf{x} \mid \mathbf{y} \right].$$
It takes $\mathcal{O}(|\mathcal{A}|^n)$ operations to directly calculate the moments of the signal's posterior distribution $p(\mathbf{x} \mid \mathbf{y})$, where $\mathcal{A}$ represents the constellation symbol set and $n$ is the number of transmitting antennas. EP iteration is a polynomial-complexity way to approximate (11) and (12). When the iteration is over, EP detection produces the hard-decision output by slicing each posterior mean to the nearest constellation symbol. It can be expressed as:
$$\hat{x}_k = \arg\min_{a \in \mathcal{A}} \left| \mu_k - a \right|.$$
The procedure of the EP detection algorithm is as follows:
First, each non-Gaussian factor is replaced with a non-normalized Gaussian factor $\tilde{t}_k(x_k)$, which can be expressed as:
$$\tilde{t}_k(x_k) \propto \exp\left( \gamma_k x_k - \frac{1}{2} \lambda_k x_k^2 \right),$$
where $\gamma_k$ and $\lambda_k$ are real constants with $\lambda_k > 0$. The resulting approximation $q(\mathbf{x})$ is a Gaussian distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, which can be expressed as:
$$\boldsymbol{\Sigma} = \left( \sigma^{-2} \mathbf{H}^{H} \mathbf{H} + \mathrm{diag}(\boldsymbol{\lambda}) \right)^{-1}, \qquad \boldsymbol{\mu} = \boldsymbol{\Sigma}\left( \sigma^{-2} \mathbf{H}^{H} \mathbf{y} + \boldsymbol{\gamma} \right),$$
where $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_n)^T$ and $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_n)^T$. $(\cdot)^{H}$ represents the conjugate transpose operation. $\mathrm{diag}(\cdot)$ takes the diagonal elements of a square-matrix input as a vector and, for a vector input, expands it into a diagonal square matrix.
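Computing the moments of $q(\mathbf{x})$ thus amounts to one matrix inversion plus matrix–vector products. A numpy sketch following the standard EP-for-MIMO expressions, with illustrative sizes:

```python
import numpy as np

def gaussian_approx(H, y, sigma2, gamma, lam):
    """Mean and covariance of the unnormalized Gaussian approximation q(x).
    One matrix inversion per call -- the dominant cost of an EP iteration."""
    Sigma = np.linalg.inv(H.T @ H / sigma2 + np.diag(lam))
    mu = Sigma @ (H.T @ y / sigma2 + gamma)
    return mu, Sigma

rng = np.random.default_rng(1)
H = rng.standard_normal((8, 4))
y = rng.standard_normal(8)
mu, Sigma = gaussian_approx(H, y, sigma2=0.1, gamma=np.zeros(4), lam=np.ones(4))
```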
EP detection updates the pairs $(\gamma_k, \lambda_k)$ iteratively, and the mean and variance of the non-normalized Gaussian factor $\tilde{t}_k(x_k)$ gradually approximate the mean and variance of the original signal posterior probability distribution $p(\mathbf{x} \mid \mathbf{y})$ with polynomial complexity; that is, the optimal EP solution is obtained [23]. All parameters related to the index $k$ of the sent symbol are initialized before the iterations: $\gamma_k^{(0)} = 0$ and $\lambda_k^{(0)} = 1/E_s$, where $E_s$ is the symbol average energy (J/symbol). During each EP iteration, all pairs $(\gamma_k^{(l)}, \lambda_k^{(l)})$ are updated in parallel, where $l$ represents the current EP iteration round and $L$ is the maximum number of internal EP iterations. Given the marginal distribution of $q^{(l)}(\mathbf{x})$, namely $q_k^{(l)}(x_k)$, the pair $(\gamma_k^{(l)}, \lambda_k^{(l)})$ is updated iteratively as follows [24]:
- (1)
Obtain the cavity marginal distribution $q^{\setminus k}(x_k) \propto q_k^{(l)}(x_k) / \tilde{t}_k(x_k)$ based on $q_k^{(l)}(x_k)$, namely the approximate log-likelihood function. It follows a Gaussian distribution with mean $t_k$ and variance $h_k^2$.
- (2)
Compute the mean $t_k$ and variance $h_k^2$ of the cavity marginal distribution $q^{\setminus k}(x_k)$:
$$h_k^2 = \frac{\Sigma_k}{1 - \Sigma_k \lambda_k^{(l)}}, \qquad t_k = h_k^2 \left( \frac{\mu_k}{\Sigma_k} - \gamma_k^{(l)} \right),$$
where $\mu_k$ and $\Sigma_k$ denote the $k$-th entry of $\boldsymbol{\mu}$ and the $k$-th diagonal entry of $\boldsymbol{\Sigma}$, respectively.
- (3)
Compute the mean $\mu_{p_k}$ and variance $\sigma_{p_k}^2$ of the approximate posterior probability distribution $\hat{p}_k(x_k)$. The distribution can be expressed as:
$$\hat{p}_k(x_k) \propto q^{\setminus k}(x_k)\, \mathbb{I}\left( x_k \in \mathcal{A} \right).$$
In order to ensure the stability of the algorithm, we set a constraint on $\sigma_{p_k}^2$, namely $\sigma_{p_k}^2 = \max\left( \epsilon, \sigma_{p_k}^2 \right)$, according to [10], where $\epsilon$ is a small positive constant.
- (4)
Update the pair $(\gamma_k, \lambda_k)$:
$$\lambda_k^{\mathrm{new}} = \frac{1}{\sigma_{p_k}^2} - \frac{1}{h_k^2}, \qquad \gamma_k^{\mathrm{new}} = \frac{\mu_{p_k}}{\sigma_{p_k}^2} - \frac{t_k}{h_k^2}.$$
According to the moment-matching principle, the second moment of the product of the cavity marginal distribution and the updated factor is set equal to the second moment of the approximate posterior probability distribution $\hat{p}_k(x_k)$. That is, the unnormalized Gaussian distribution shown in (23) is made to have mean $\mu_{p_k}$ and variance $\sigma_{p_k}^2$.
During the update process, it may occur that $\lambda_k^{\mathrm{new}}$ is negative; however, this parameter is a precision term and should be positive. The reason for this situation is that no pair $(\gamma_k, \lambda_k)$ can make the variance of $\tilde{t}_k(x_k)$ approximate the variance of the original distribution, that is, the moment-matching condition cannot be satisfied. In this case, $\gamma_k$ and $\lambda_k$ are not updated; the parameter values are simply kept the same as before the iteration, that is, $\gamma_k^{(l+1)} = \gamma_k^{(l)}$ and $\lambda_k^{(l+1)} = \lambda_k^{(l)}$.
- (5)
In order to improve the robustness of the algorithm, the pair $(\gamma_k, \lambda_k)$ is smoothed. It can be expressed as:
$$\lambda_k^{(l+1)} = \beta \lambda_k^{\mathrm{new}} + (1 - \beta) \lambda_k^{(l)}, \qquad \gamma_k^{(l+1)} = \beta \gamma_k^{\mathrm{new}} + (1 - \beta) \gamma_k^{(l)},$$
where $\beta$ is the smoothness coefficient.
When the change in the mean and covariance of the posterior probability distribution between two iterations falls below a threshold, or the maximum number of iterations has been reached, the iteration is stopped.
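The five steps above can be sketched end-to-end for the real-valued model. This is a minimal illustrative implementation assuming the standard EP-for-MIMO formulation; the variable names (`gamma`, `lam`, `beta`, `eps`) and the fixed iteration count are our own choices, not the paper's code:

```python
import numpy as np

def ep_detect(H, y, sigma2, alphabet, n_iter=8, beta=0.9, eps=1e-8):
    """Sketch of the EP detection loop (real-valued model, standard formulation)."""
    n = H.shape[1]
    Es = np.mean(alphabet ** 2)
    gamma, lam = np.zeros(n), np.ones(n) / Es             # initialization
    for _ in range(n_iter):
        # Gaussian approximation q(x): one matrix inversion per iteration.
        Sigma = np.linalg.inv(H.T @ H / sigma2 + np.diag(lam))
        mu = Sigma @ (H.T @ y / sigma2 + gamma)
        d = np.diag(Sigma)
        h2 = d / (1.0 - d * lam)                          # cavity variance
        t = h2 * (mu / d - gamma)                         # cavity mean
        # Moments of the discrete tilted distribution over the alphabet.
        logw = -(alphabet[None, :] - t[:, None]) ** 2 / (2 * h2[:, None])
        w = np.exp(logw - logw.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        mu_p = w @ alphabet
        var_p = np.maximum((w * (alphabet[None, :] - mu_p[:, None]) ** 2).sum(axis=1), eps)
        # Moment matching; keep the old pair if the new precision is negative.
        lam_new = 1.0 / var_p - 1.0 / h2
        gam_new = mu_p / var_p - t / h2
        ok = lam_new > 0
        lam = np.where(ok, beta * lam_new + (1 - beta) * lam, lam)
        gamma = np.where(ok, beta * gam_new + (1 - beta) * gamma, gamma)
    # Hard decision: slice the posterior mean to the nearest symbol.
    return alphabet[np.abs(mu[:, None] - alphabet[None, :]).argmin(axis=1)]

# Noiseless BPSK sanity check: EP recovers the transmitted vector.
A = np.array([-1.0, 1.0])
rng = np.random.default_rng(0)
H = rng.standard_normal((8, 4))
x_true = A[rng.integers(0, 2, size=4)]
x_hat = ep_detect(H, H @ x_true, sigma2=0.05, alphabet=A, n_iter=6)
```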
4. Artificial Intelligence-Assisted Expectation Propagation Detection Algorithm
4.1. Application Strategy of Neural Network
In the EP detection algorithm, the most complex and time-consuming step is calculating the mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ of the Gaussian approximation $q(\mathbf{x})$. This process contains an inversion operation, which consumes considerable computation resources, as shown in (15) and (16). Secondly, in the EP inner-loop iteration, it is relatively computation-consuming to calculate the mean $t_k$ and variance $h_k^2$ of the cavity marginal distribution $q^{\setminus k}(x_k)$ and to update the pair $(\gamma_k, \lambda_k)$, as shown in (18), (19), (21), and (22).
Therefore, we try to simplify the above three steps using a neural network. We take four variables, namely the mean and variance of the approximate posterior distribution and the mean and variance of the cavity marginal distribution, as the input of the neural network, and the mean and variance of the cavity marginal distribution after the iteration as its output. The relatively complicated steps in the EP detection process can then be skipped, and the probability density function of the normal distribution can be generated directly from the mean and variance output by the neural network, so as to obtain the detection result.
The essence of a neural network is that the input data are calculated through network nodes to get the output and the intermediate calculation process can be equivalent to a matrix multiplication operation. So, the iteration process of EP detection can actually be equivalent to several matrix multiplication operations with the aid of neural networks, and the operation complexity can be greatly reduced compared with the complex formula calculation directly. Another advantage of neural networks is that no matter how big the data size is, the calculation process is still several steps of matrix operation, which can be effectively calculated in parallel. In the case of large amounts of data, the operation efficiency can be greatly improved.
4.2. Training Data Collection and Model Structure Design
The first step is to collect the required training data and design the model structure on the basis of comprehensive consideration of computational complexity and learning ability.
In the data collection stage, six arrays are initialized before the simulation to store the six variables to be collected. During the simulation, the data to be mapped in each EP iteration are saved. However, in scenarios with different SNRs, the magnitudes of the data to be mapped by the neural network differ considerably. For example, the order of magnitude of the variance of the cavity marginal distribution, which the neural network needs to output, is very different when the SNR is 10 than when the SNR is 30. In order to obtain better training results, the SNR range is divided into intervals, and a separate neural network model is trained on the data of each interval. We train a total of four models, one per SNR interval.
Finally, there are 307,200 pieces of data in each SNR interval, and each piece of data has 6 features, that is, the data scale is 307,200 × 6. The first four features are inputs and the last two features are outputs.
The essence of forward propagation in a neural network model is matrix multiplication, and the matrix parameters are the weights that the model needs to train. The number of hidden layers determines the number of matrix multiplications required per prediction, and to a large extent determines the complexity of a single model prediction. In addition, the number of nodes in each layer affects the learning ability of the model. The more nodes the model has, the stronger its learning and fitting ability, and the larger the amount of data required; otherwise, overfitting will occur. If the number of nodes is too small, the learning ability of the model is limited, it is difficult to learn complex distributions, and underfitting is likely.
After many attempts and comprehensive consideration of computational complexity and learning ability, the final structure of the model has four nodes in the input layer, corresponding to the four input variables (the means and variances of the approximate posterior and cavity marginal distributions), two hidden layers of thirty-two and sixteen nodes, and two nodes in the output layer, corresponding to the two output variables (the updated cavity mean and variance).
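The 4 × 32 × 16 × 2 structure can be expressed directly in PyTorch. A minimal sketch; the class name `EPNet` is our own label:

```python
import torch
import torch.nn as nn

class EPNet(nn.Module):
    """The 4 x 32 x 16 x 2 MLP described above: two ReLU hidden layers, linear output."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 2),
        )

    def forward(self, x):
        return self.net(x)

model = EPNet()
out = model(torch.zeros(1, 4))   # one 4-feature input -> one 2-feature output
```

With this structure the model stores only 722 parameters (weights plus biases), which keeps both prediction cost and storage small.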
4.3. Model Training and Application
The neural network model designed above is adopted to learn and map the steps requiring optimization, and the trained model parameters are obtained. The process is shown in Figure 1.
First, the collected training data are imported and divided into input, output, training, and test data. The divided test set accounts for 20% of the total dataset to ensure the generalization ability of the model.
Then, we adopt the z-score method for data standardization to alleviate the impact of data dimension. Z-score standardization normalizes the data into a distribution with a mean of 0 and a standard deviation of 1. It can be expressed as:
$$z = \frac{x - \mu}{\sigma},$$
where $\mu$ and $\sigma$ are the mean and standard deviation of the input data of the training set, respectively. We use them to standardize the input data of both the training and test sets, and $z$ is the standardized result.
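The standardization step can be sketched as follows; the mean and standard deviation are fitted on the training inputs only and then reused for the test set, as described above:

```python
import numpy as np

def zscore_fit(X_train):
    """Fit standardization statistics on the training inputs only."""
    return X_train.mean(axis=0), X_train.std(axis=0)

def zscore_apply(X, mu, sigma):
    """Apply the training-set statistics to any split (train or test)."""
    return (X - mu) / sigma

X_train = np.array([[1.0, 10.0], [3.0, 30.0]])
mu, sigma = zscore_fit(X_train)
Z = zscore_apply(X_train, mu, sigma)
```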
Then the data format is processed. The PyTorch implementation of the neural network requires data in tensor format. Data in tensor format can be computed faster in a neural network, and higher-dimensional matrices and vectors can be created. Therefore, the dataset needs to be converted to the tensor format. In addition, we convert the data to float32 to match the default parameter precision of the PyTorch framework, since double-precision data cannot be fed directly to float32 models. Next, a data loader is created to pass the data in batches in an iterative manner during training.
Finally, the neural network models are built. We adopt a multilayer perceptron (MLP) with three fully connected layers, 4 × 32, 32 × 16, and 16 × 2, to realize the learning of the data while keeping inference simple. We use ReLU as the activation function to maximize the speed of computation and training convergence. It can be expressed as:
$$\mathrm{ReLU}(x) = \max(0, x).$$
We use the Adam optimizer, and the initial learning rate is set to 0.001. The L1 loss function, namely the least absolute deviation, is adopted as the error loss function. It represents the average of the absolute values of the differences between the true values and the predicted values:
$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|.$$
A model trained with this loss function is, to some extent, more robust.
Environment parameters and neural network training parameters are shown in Table 2.
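The training setup described above (Adam with a 0.001 learning rate, L1 loss, batched loader) can be sketched as follows. The random data here is a hypothetical stand-in for the collected dataset, and the epoch count and batch size are illustrative:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data in place of the collected 307,200 x 6 dataset.
X = torch.rand(256, 4, dtype=torch.float32)
Y = X[:, :2] * 0.5                      # toy target, just to exercise the loop

model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                      nn.Linear(32, 16), nn.ReLU(),
                      nn.Linear(16, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.L1Loss()                 # least absolute deviation

loader = DataLoader(TensorDataset(X, Y), batch_size=64, shuffle=True)
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```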
After the trained model is obtained, it is necessary to further transform the complex calculation process of the EP detection algorithm into the forward propagation process of the neural network. According to the SNR, the corresponding model parameters and the corresponding mean and standard deviation for standardization are imported. In the EP detection iteration process, when the mean and variance of the approximate posterior probability distribution and the mean and variance of the last round's cavity marginal distribution are obtained, these four variables are formatted into the neural network input format and then standardized. After that, the input is multiplied by the weight matrices of the neural network, that is, the prediction process of the neural network is reproduced:
$$\mathbf{h}_1 = \mathrm{ReLU}\left( \mathbf{W}_1 \mathbf{z} + \mathbf{b}_1 \right), \qquad \mathbf{h}_2 = \mathrm{ReLU}\left( \mathbf{W}_2 \mathbf{h}_1 + \mathbf{b}_2 \right), \qquad \mathbf{o} = \mathbf{W}_3 \mathbf{h}_2 + \mathbf{b}_3,$$
where the $\mathbf{W}_i$ and $\mathbf{b}_i$ are the trained weight matrices and bias vectors, $\mathbf{z}$ is the standardized input, and the prediction is performed for each of the $n$ transmitting antennas. Finally, the output results are the mean and variance of the cavity marginal distribution after one iteration.
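Reproducing the prediction as plain matrix operations amounts to two ReLU layers followed by a linear output layer. A numpy sketch with randomly initialized stand-in weights; in the actual algorithm, `W1` through `b3` would hold the trained parameters:

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2, W3, b3):
    """Forward propagation as three matrix multiplications plus maximizing operations."""
    h1 = np.maximum(W1 @ x + b1, 0.0)   # first hidden layer (ReLU)
    h2 = np.maximum(W2 @ h1 + b2, 0.0)  # second hidden layer (ReLU)
    return W3 @ h2 + b3                 # linear output layer

rng = np.random.default_rng(2)
W1, b1 = rng.standard_normal((32, 4)), np.zeros(32)
W2, b2 = rng.standard_normal((16, 32)), np.zeros(16)
W3, b3 = rng.standard_normal((2, 16)), np.zeros(2)
out = mlp_forward(rng.standard_normal(4), W1, b1, W2, b2, W3, b3)
```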
5. Simulation Results
In this section, we compare and analyze the proposed detection algorithm and the original EP detection algorithm in a Rayleigh fading MIMO channel.
In the scenario, there are eight transmitting antennas and eight receiving antennas, and a Rayleigh fading MIMO channel with a mean of 0 and variance of 1 is adopted. Cyclic redundancy check attachment, code block segmentation, LDPC coding, and rate matching are performed on the transmitted signals in turn. The modulation modes are 64-QAM and 256-QAM for comparison. At the receiving end, rate dematching, LDPC decoding, code block concatenation, and the cyclic redundancy check are performed in order. We use the original EP detection algorithm and the neural network-assisted EP detection algorithm, respectively, to test and compare the simulated bit error rate (BER) of the soft decision.
Figure 2 and Figure 3 show the comparison of BER detection results after four iterations in the Rayleigh fading MIMO channel scenario with eight transmitting antennas and eight receiving antennas, with 64-QAM and 256-QAM adopted, respectively. "EP" represents the original EP detection. "MLP" represents the neural network-assisted EP detection algorithm that we propose. When the SNR is less than 20 dB, the neural network-assisted EP detection algorithm approximates the performance of the original EP algorithm. The results of the fourth iteration of the two algorithms are very close, indicating that the neural network can learn the core steps of the EP algorithm well. However, in the case of high SNR, the magnitude of the data that the neural network needs to learn is very small, which requires higher precision to fit the mapping data. In addition, the neural network must retain the generalization ability of the model, and the characteristics of the model lead to its own errors. As the accuracy requirements of the mapping data increase, the tolerance for the errors inherent in neural networks decreases. Therefore, the error of the final detection performance is slightly larger, but it remains within an acceptable range, and the performance of two iterations of the original EP algorithm can still be achieved. The results show that the neural network-assisted EP algorithm is a feasible strategy.
Since the simulation data of the eight-input, eight-output Rayleigh fading MIMO channel scenario with the 64-QAM constellation are used as training data, in order to verify the applicability of the neural network-assisted EP detection algorithm in other scenarios, we test with a correlated Rayleigh fading MIMO channel scenario with four inputs and eight outputs, which is completely different from the training data. The correlation coefficient of adjacent antennas is 0.5. The 64-QAM and 256-QAM modulation modes are adopted, respectively. The simulation results are shown in Figure 4 and Figure 5.
When the SNR is less than 20 dB, the method proposed in this paper can still approximate the original EP algorithm, and the results of the fourth iteration of the two algorithms are very close, indicating that the neural network learns the core process of EP detection well and is applicable to different scenarios, with strong generalization ability. In the case of high SNR, the iterative process shows that the performance of the fourth iteration is occasionally worse than that of the third. The reason is that the third iteration has already reached good performance, and the next iteration requires more rigorous calculations to improve it further. However, the matrices multiplied in each iteration of the proposed algorithm are the same, which lacks a fine-tuning effect. In addition, in high-SNR scenarios, it is difficult for the accuracy of the neural network to meet the requirements, so further iterations may degrade performance. From the performance perspective, due to the model characteristics of the neural network and the increased data accuracy requirements, the performance error is slightly larger than that of the original EP algorithm, but a good detection effect is maintained, which fully verifies the applicability of the model in different scenarios.
Next, the complexity is analyzed and compared. For one iteration, the original EP algorithm first calculates (21) and (22) to update the pair $(\gamma_k, \lambda_k)$. Then, (24) and (25) are calculated to smooth the pair and to handle the case where $\lambda_k^{\mathrm{new}}$ is negative. After that, (15) and (16) are calculated to find the mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ of the non-normalized Gaussian approximation $q(\mathbf{x})$. This process involves matrix inversion and matrix multiplication, whose complexity is $\mathcal{O}(n^3)$. Finally, according to (18) and (19), the mean $t_k$ and variance $h_k^2$ of the cavity marginal distribution are obtained. Therefore, the computational complexity of one iteration of the original EP algorithm is $\mathcal{O}(n^3)$.
The iterative process of the neural network-assisted EP algorithm only needs three matrix multiplications, matrix additions, and maximizing operations. According to (29)–(31), the main computational complexity of this process is determined by the fixed sizes of the three weight matrices, applied once per symbol. After simplification, the final computational complexity is $\mathcal{O}(n)$ per iteration. The complexity is greatly reduced compared to MAP and the original EP algorithm. The computational complexity comparison of the three methods is shown in Table 3.
However, the reduction in computational complexity inevitably costs additional space. The main additional space occupied by our proposed algorithm comes from the parameters of the neural network models. The space required by one neural network model is three weight matrices with dimensions 4 × 32, 32 × 16, and 16 × 2, and three bias vectors with dimensions 32, 16, and 2.
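For concreteness, the storage cost of one such model can be counted directly (assuming float32 parameters, which PyTorch uses by default):

```python
# Parameters of one 4 x 32 x 16 x 2 model: weight matrices plus bias vectors.
weights = 4 * 32 + 32 * 16 + 16 * 2   # 128 + 512 + 32 = 672
biases = 32 + 16 + 2                  # 50
total_params = weights + biases       # 722 parameters per model
bytes_float32 = total_params * 4      # float32 storage per model, in bytes
```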
6. Conclusions
In this paper, an EP detection algorithm with outstanding performance and low complexity is optimized. We analyze the steps with high complexity, then use a neural network model to learn and map some time-consuming steps in the EP detection algorithm, converting the complex operation process into several matrix multiplication operations to reduce the complexity of the detection algorithm. Finally, the performance and applicability of the method are verified in different scenarios. The simulation results show that the neural network-assisted EP detection algorithm approximates the performance of the original EP algorithm when the SNR is less than 20 dB and learns the core steps of the EP algorithm well. However, in the case of high SNR, the magnitude of the data that the neural network needs to learn is very small, which requires higher precision to fit the mapping data. In addition, the neural network must retain generalization ability, and the characteristics of the model lead to its own errors. As the accuracy requirements of the mapping data increase, the tolerance for the errors inherent in neural networks decreases. Therefore, the error of the final detection performance is slightly larger, but it remains within an acceptable range, and the detection performance remains at a good level. The results show that the neural network-assisted EP algorithm is a feasible strategy. It guarantees performance close to the original EP detection algorithm, while the computational complexity per iteration is reduced from the original $\mathcal{O}(n^3)$ to $\mathcal{O}(n)$.
However, performance can still be improved in high-SNR scenarios. In the future, we can try other models to achieve a better mapping effect, or train a different model for each iteration separately, so as to mitigate the performance loss in high-SNR scenarios.