1. Introduction
Digital beamforming (DBF) is a key requirement in modern wireless systems. Cost and power consumption are the main obstacles to the wide application of DBF [1], owing to the large number of radio frequency (RF) channels and analog-to-digital converters (ADCs) that are required. Therefore, it is critical to develop a novel, low-cost, power-efficient DBF system with few RF channels.
To reduce the number of RF channels, the received array signals have to be compressed/encoded at the beginning of the front-end. However, the decrease in sampled signal dimension leads to unacceptable DBF performance. Thus, it is necessary to recover/decode the original received full-array signals so that the signal dimension for DBF is consistent with that of the traditional full-channel array. Considerable efforts [2,3,4,5,6,7,8,9,10,11] have been made to reduce the cost of the digital receiving array (DRA) based on this encoding and decoding architecture. According to the codec used, these approaches can be classified into the compressed-sensing-based coding DRA (CS-CDRA) [2,3,4,5,6] and the orthogonal-coding-based coding DRA (OC-CDRA) [7,8,9,10,11,12], as shown in Table 1.
For a typical CS-CDRA, the received full-array signals are encoded by a measurement matrix that complies with the restricted isometry property (RIP) [13], such as a random Gaussian matrix, random Bernoulli matrix, partial Hadamard matrix, or partial Fourier matrix. Since the received array signal is compressible, it can be reconstructed by solving a convex optimization problem [14]. However, it is impossible to recover the received array signal in real time due to the iteration requirements of the existing sparse reconstruction algorithms [15,16,17]. In addition, the sparse reconstruction algorithms cannot work stably at a low signal-to-noise ratio (SNR).
The OC-CDRA introduces a code division multiplexing (CDM) [18] technique to identify the signal path associated with each antenna element. Specifically, the received signal from each antenna element is mixed with a high-speed spread spectrum code, such as a Walsh–Hadamard or Gold code, and decorrelated with the same spread spectrum code after digitization. According to the mixer position, the OC-CDRA can be further divided into the on-site coding (OSC) architecture [7,8,9] and the time sequence phase weighting (TSPW) architecture [10,11,12]. The advantage of the OSC architecture is that the received full-array signals are encoded at an intermediate frequency (IF) after down-conversion, which simplifies the implementation of the encoding network. The requirements for ADCs and parallel input/output (I/O) channels at the digital back-end are significantly reduced, but the number of RF channels is not. Therefore, the OSC architecture is not efficient in terms of cost reduction. In contrast, the TSPW architecture encodes the received array signal at RF before down-conversion. The encoded signals are sampled by a few RF channels and ADCs, which significantly reduces the system budget. However, both TSPW and OSC face problems in code synchronization (in both encoding and decoding) and moving target compensation [19]. More importantly, each information bit is decoded from L chips, where L is the code length. The recovered signal in an OC-based architecture therefore contains only 1/L of the data volume of the full-channel array, which inevitably leads to data loss after decoding and decreases the system-accumulated gain.
Different from the above coding DRA (CDRA), there is a generalized coding array called the antenna selection (AS) architecture [20,21,22]. The fundamental idea behind AS is to reduce the number of RF chains by judiciously activating a subset of antennas. This realizes reconfigurable array beamforming through element selection (not combination) without decoding. However, the array accumulated gain decreases according to the number of discarded array elements, which is also a form of data loss.
Artificial intelligence has shown great potential in various fields, and beamforming systems are no exception. A beamforming neural network (BFNN) is proposed in [23] to optimize the beamformer so as to maximize the spectral efficiency under hardware limitations. It is strongly robust to imperfect channel state information (CSI). Ref. [24] proposes a beamforming prediction network (BFNet) to jointly optimize the power allocation and virtual uplink beamforming of multiple-input multiple-output (MIMO) systems. This avoids the complexity caused by excessive iteration and enables real-time calculation. For the local scattering caused by the sea surface, Ref. [25] introduces a convolutional neural network (CNN) framework to estimate the transmitter's incident angle.
This paper proposes a novel, low-cost, power-efficient DBF system framework called machine-learning-based coding DRA (ML-CDRA). The received full-array signals are encoded into a few RF channels and decoded by an artificial neural network (ANN) in real time without any loss of data volume. The proposed ML-CDRA works stably at a low SNR, as verified by the simulations in Section 3.1. Since the recovered signals are decoded from a single snapshot, the moving target compensation problem does not arise in ML-CDRA. Moreover, we present a generalized loss function for the decoding network, which is constructed from three aspects: maximum likelihood, signal sparsity, and noise (including noise power suppression, equal power constraint, and noise whitening). A feasible loss function is given as an example, and the gradients required for back propagation are derived. Finally, the implementation of ML-CDRA is discussed based on existing technologies and devices, and a real-time processing architecture for ML-CDRA, with the decoding network implemented on a field-programmable gate array (FPGA), is presented.
It should be emphasized that the proposed ML-CDRA is a novel, low-cost DBF system framework. The forms of the encoding and decoding networks are not fixed, which leads to a systematic trade-off between cost, resources, and performance. The loss function of the decoding network should also be modified according to the application.
2. Materials and Methods
As shown in Figure 1, the proposed ML-CDRA is composed of two networks: an encoding network and a decoding network. The signals received from each antenna are selected and combined into a few RF channels through the encoding network. After digitization, the sampled signals are decoded by a decoding network based on machine learning to recover the originally received full-array signals.
2.1. Signal Model
The encoded signals sampled by ADCs are:
where
,
G is the gain of receiver module,
is the encoding matrix (
,
M is the number of few RF channels,
N is the number of antenna elements),
is the received array signal,
is the independent and identically distributed Gaussian noise in wireless channels, and
is the random thermal noise in RF channels.
The traditional array signal model supposes that the noise in each channel is independent and identically distributed. However, this assumption does not hold in CDRA, where the noise is correlated across channels due to the encoding: the same wireless channel noise may be mixed into two different RF channels.
Thus, it is necessary to clarify the composition of the noise in sampled signals.
where
is the noise in the sampled signal.
The power of is determined by the noise figure F of the digital receiver module. For example, considering the typical value , the introduced thermal noise has equal power with the input noise after being amplified. In addition, the power of in each RF channel may be different, which is controlled by the encoding matrix .
The covariance matrix of noise
is
where
,
denotes the conjugate transpose, ∘ denotes the Hadamard product, and
is the identity matrix.
Obviously, the noise in the sampled signal is correlated.
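The correlation can be checked numerically. The following is a minimal sketch rather than the authors' code: it assumes the sampled noise has the form G·Φ·n_w + n_t, and the names `Phi`, `n_w`, `n_t`, the 0/1 Bernoulli encoding matrix, and all numeric values are illustrative assumptions.

```python
import numpy as np

def sampled_noise_covariance(Phi, G=1.0, sigma_w2=1.0, sigma_t2=1.0):
    """Covariance of G*Phi@n_w + n_t for white n_w and n_t (assumed noise model)."""
    M = Phi.shape[0]
    return (G ** 2) * sigma_w2 * (Phi @ Phi.conj().T) + sigma_t2 * np.eye(M)

rng = np.random.default_rng(0)
M, N, L = 12, 48, 20000   # RF channels, antenna elements, snapshots

# Assumed random 0/1 encoding: one antenna may feed several RF channels.
Phi = rng.integers(0, 2, size=(M, N)).astype(float)

# Unit-variance circular complex Gaussian noise in the wireless and RF channels.
n_w = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
n_t = (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))) / np.sqrt(2)
n = Phi @ n_w + n_t

R_emp = n @ n.conj().T / L              # sample covariance over L snapshots
R_th = sampled_noise_covariance(Phi)    # model covariance

off_diag = R_th - np.diag(np.diag(R_th))
rel_err = np.linalg.norm(R_emp - R_th) / np.linalg.norm(R_th)

# Non-overlapping 4-in-1 subarrays share no elements, so their channels stay uncorrelated.
Phi_sub = np.kron(np.eye(M), np.ones((1, N // M)))
R_sub = sampled_noise_covariance(Phi_sub)
```

Under these assumptions, the off-diagonal entries of `R_th` are nonzero whenever two rows of `Phi` share an antenna element, whereas `R_sub` is diagonal.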
2.2. Encoding Network
The encoding network can significantly reduce the number of RF channels required. The received array signals are compressed into a few RF channels according to the encoding matrix, which describes the connection between antenna elements and RF channels. The encoding matrix can take many forms, and different encoding matrices give the CDRA different spatial sensitivities.
Consider the single far-field target case; the received array signal
can be expressed as
where
is the steering vector, and
is the direction of arrival.
The encoded signal of
channel is
where
is the
row vector of
, and
is the thermal noise of the
channel.
Therefore, the SNR of
channel is
where
is the signal power,
is the thermal noise power of
mth channel,
denotes the magnitude, and
denotes the variance.
It is easy to find that the signal power is a function of the direction. This means that the signal from different directions may suffer losses after the encoding network compared with the phased coherent combination. This loss is determined by the array arrangement, direction of arrival, and encoding matrix. It will eventually be reflected in the maximum SNR after DBF according to the weight of the corresponding direction, namely, spatial sensitivity. Thus, the encoding network should be carefully designed in different applications.
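The direction dependence can be illustrated with a short sketch. It assumes a uniform linear array with half-wavelength spacing, a unit-power source, and a 4-in-1 subarray row of the encoding matrix; the function names and the phase convention of the steering vector are illustrative choices, not taken from the paper.

```python
import numpy as np

def steering_vector(n_elements, theta_deg):
    """ULA steering vector for half-wavelength spacing (assumed phase convention)."""
    n = np.arange(n_elements)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg)))

def encoded_signal_power(phi_row, theta_deg):
    """Signal power in one RF channel for a unit-power source, noise ignored."""
    a = steering_vector(phi_row.size, theta_deg)
    return np.abs(phi_row @ a) ** 2

N = 48
phi_0 = np.zeros(N)
phi_0[:4] = 1.0   # first RF channel of a 4-in-1 subarray encoder

p_broadside = encoded_signal_power(phi_0, 0.0)   # four elements add coherently
p_30deg = encoded_signal_power(phi_0, 30.0)      # element phases 0, pi/2, pi, 3pi/2 cancel
```

At broadside the four elements add in phase, while at 30° their contributions cancel, so a signal from that direction is lost in this channel before any decoding takes place.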
According to the difference in implementation, the encoding network can be divided into the fixed encoding network and the tunable encoding network. As shown in Figure 2a, the fixed encoding network is implemented by a multiple input–multiple output (MIMO) feed network without any additional equipment. The tunable encoding network introduces additional RF switches or phase shifters into the feed network, as shown in Figure 2b,c. An RF switch or phase shifter provides more degrees of freedom for signal processing. Therefore, by changing the encoding matrix, the received array signals can be encoded in various schemes to obtain optimal performance in different applications.
The main difficulty of the encoding network is the overlapping topology of the feed network. The overlapping wiring can be realized by a multi-layer printed circuit board (PCB) with through holes. Nevertheless, the design of an overlapping feed network is still a challenge when the topology is too complex. The RF micro-electromechanical systems (RF-MEMS) switch matrix [26] is one possible scheme to realize the overlapping feed network. However, the RF-MEMS switch matrix still needs further development, especially regarding large-scale integration and RF performance loss.
The most common feed network without overlap is the adjacent subarray structure. A typical passive array antenna structure is given in Figure 3a. Each group of four adjacent elements is combined into a single RF channel by a fixed feed network. The subarray pattern of this structure is immutable, and the main lobe is aligned with the broadside of the array. Figure 3b shows a typical phased array antenna with a subarray structure, in which each group of four adjacent elements is combined into a single RF channel after the phase shifters. Unlike the passive structure, its subarray pattern can be changed by adjusting the phase shifters.
2.3. Decoding Network
Since (1) is an under-determined equation with an infinite number of solutions, it is impossible to recover the original received array signal from the few-channel signal by a linear transformation with a single snapshot. Therefore, non-linear processing is an inevitable choice for the decoding network.
Considering the computational complexity and the real-time processing requirements in engineering, the existing sparse reconstruction algorithms are unsuitable for CDRA [15,16,17]. ANNs, which have developed rapidly in this decade, are able to cope with these difficulties.
The decoding network of the proposed ML-CDRA is implemented by an ANN in this paper. The forward propagation of the ANN requires only a limited number of multipliers and adders and has low latency, so a pipeline architecture can achieve single-snapshot, real-time decoding. It should be noted that the specific network structure of the ANN is not fixed for ML-CDRA; it is a trade-off between resources and performance.
Here, a generalized loss function of the decoding network based on the maximum a posteriori estimation (MAP) is given as
where
is derived from the maximum likelihood estimation.
and
are the constraints for signal and noise, respectively.
and
are hyperparameters. The details of the generalized loss function are presented in the next section.
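For orientation, one weighted-sum form consistent with this description is sketched below. The symbols are assumptions introduced here for illustration and may differ from the paper's own notation: a maximum-likelihood term, a signal constraint, and a noise constraint, with the two hyperparameters acting as weights.

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{ML}} \;+\; \lambda_{1}\,\mathcal{L}_{\mathrm{s}} \;+\; \lambda_{2}\,\mathcal{L}_{\mathrm{n}}
```

A sum of this kind arises naturally from a MAP formulation, since the negative logarithm of a posterior splits into a likelihood term plus prior (constraint) terms.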
The decoding network is trained based on back propagation, which is carried out by gradient descent as
where
is the complex network parameter
in
iteration,
is the iteration step-size,
is the partial derivative of
to
,
is the loss function, and
denotes the conjugate.
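The update rule for complex parameters can be sketched on a toy problem. The example below is only an illustration under assumed names (`A`, `b`, `mu`): it minimizes a least-squares loss rather than the decoding-network loss, and steps along the derivative with respect to the conjugate parameter, as described above.

```python
import numpy as np

def complex_gradient_descent(A, b, mu=0.01, n_iter=2000):
    """Minimize ||A w - b||^2 over complex w using w <- w - mu * dL/d(conj(w))."""
    w = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        residual = A @ w - b
        grad_conj = A.conj().T @ residual   # derivative of the loss w.r.t. conj(w)
        w = w - mu * grad_conj
    return w

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 4)) + 1j * rng.standard_normal((20, 4))
w_true = rng.standard_normal(4) + 1j * rng.standard_normal(4)
b = A @ w_true                              # noiseless observations of the toy model

w_hat = complex_gradient_descent(A, b)      # converges to w_true for a small enough step
```

In a real decoding network the same update would be applied to every weight, with the derivative obtained by back propagation instead of the closed form used here.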
Figure 4 describes the training of the decoding network based on the proposed generalized loss function. The forward propagation of the network is performed on a single snapshot. Therefore, the reconstructed noise is independent in the time dimension, which ensures that the time accumulation gain of ML-CDRA does not deteriorate compared with the full-channel DRA. The back propagation of the network is performed by batch processing. According to the authors' experience, the decoding network can be trained within 1000 batches. Moreover, the training time can be shortened if offline pre-training is performed on simulated or limited measured data. Because non-cooperative targets are common, an online training ability is essential for the decoding network.
2.4. An Example for Generalized Loss Function
It should first be emphasized that the loss function of the decoding network can take many forms. The basic idea of the proposed generalized loss function is to consider the signal and the noise simultaneously, with particular attention to the noise.
A feasible loss function is given below as an example of the proposed generalized loss function. The back propagation of the decoding network is carried out by gradient descent, which is derived in Appendix A.
where
where
,
denotes the inverse of matrix,
L is the number of snapshots,
is the covariance of the decoded signal
,
denotes the
-norm,
is the
eigenvalue of
,
K is the sparsity,
is the shape parameter,
is the covariance of the decoded noise, and
is the noise power-suppression coefficient.
2.4.1. Maximum Likelihood Estimation
According to
Section 2.1, the sampled signals are
where noise
obeys the normal distribution with mean
and covariance matrix
.
The maximum likelihood estimation is obtained by maximizing the Gaussian density function
where
denotes the determinant.
Therefore, the maximum likelihood estimation can be regarded as
2.4.2. Sparsity Constraint for Signal
The eigenvalues of the received array signal covariance are sparse, since the targets are limited. Therefore, the sparsity of signal can be constrained as
where $\|\cdot\|_0$ denotes the $\ell_0$-norm.
As $\ell_0$-norm minimization is an NP-hard problem, it should be replaced by other models, such as the $\ell_1$-norm and the smoothed $\ell_0$-norm (SL0) [27]. The SL0 approximates the $\ell_0$-norm by a smooth Gaussian function, which is differentiable during the back propagation of the decoding network. (13) can be rewritten by SL0 as
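As a hedged illustration of the idea, and not the paper's exact expression, the sketch below uses the common Gaussian surrogate in which each entry contributes 1 − exp(−|v|²/(2σ²)) to the count. The parameter `sigma` plays the role of a shape parameter, and the eigenvalue vector is made up for the example.

```python
import numpy as np

def smoothed_l0(values, sigma):
    """Gaussian-smoothed count of nonzero entries (SL0-style surrogate)."""
    v = np.abs(np.asarray(values))
    return np.sum(1.0 - np.exp(-(v ** 2) / (2.0 * sigma ** 2)))

# Hypothetical covariance eigenvalues for two targets in a six-channel array.
eigenvalues = np.array([5.0, 3.0, 0.0, 0.0, 0.0, 0.0])

exact_l0 = np.count_nonzero(eigenvalues)
approx_small_sigma = smoothed_l0(eigenvalues, sigma=0.1)    # close to the exact count
approx_large_sigma = smoothed_l0(eigenvalues, sigma=10.0)   # smoother but looser
```

A small `sigma` tracks the true count closely but gives nearly flat gradients, whereas a large `sigma` is smoother and easier to optimize, which is why the shape parameter matters during training.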
2.4.3. Noise Reduction and Whitening
Suppose the received array signals are accurately reconstructed by the constraints of
and
. As we know, the SNR is determined by signal and noise. Thus, the noise constraint can be designed as
Noise reduction and whitening are carried out by and , respectively. In addition, the also implies an equal-power constraint for different channels.
The existing reconstruction algorithms [15,16,17] often neglect the statistical characteristics of the reconstructed noise, which are crucial to signal accumulation (including array signal processing and time-domain accumulation). The power and correlation of the reconstructed noise directly determine the SNR after DBF. The achievable performance of array signal reconstruction is therefore bounded by the noise constraint.
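The effect of noise correlation on the SNR after DBF can be seen in a small sketch. It assumes conventional matched weights, a broadside source, and two extreme noise covariances with the same per-channel power; all names and values are illustrative.

```python
import numpy as np

def snr_after_dbf(w, a, signal_power, R_n):
    """Output SNR Ps * |w^H a|^2 / (w^H R_n w) for weights w and noise covariance R_n."""
    signal = signal_power * np.abs(np.vdot(w, a)) ** 2
    noise = np.real(np.vdot(w, R_n @ w))
    return signal / noise

N = 48
a = np.ones(N, dtype=complex)   # broadside steering vector
w = a.copy()                    # conventional (matched) weights

R_white = np.eye(N)             # equal-power, uncorrelated noise
R_corr = np.ones((N, N))        # fully correlated noise, same per-channel power

gain_white = snr_after_dbf(w, a, 1.0, R_white)   # array gain equal to N
gain_corr = snr_after_dbf(w, a, 1.0, R_corr)     # no gain over a single element
```

With white, equal-power noise the array gain equals the number of elements, while fully correlated noise adds coherently with the signal and removes the gain, which motivates the whitening and equal-power terms.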
3. Results
To evaluate the performance of the proposed ML-CDRA, a 48-element, 12-channel ML-CDRA is studied and compared with the different DBF systems shown in Figure 5, including a 48-element, traditional, full-channel DRA (48-DRA, which has the same number of antenna elements), a 12-element, traditional, full-channel DRA (12-DRA, which has the same number of RF channels), and a 48-element, single-channel TSPW array [10] with code length
.
The receiving antenna of each scheme is a uniform linear array with half-wavelength inter-element spacing, and each antenna element is omnidirectional. The encoding network of ML-CDRA is the 4-in-1 subarray structure without phase shifters, as given in Figure 3a. The decoding network is a three-layer, fully connected, multi-layer perceptron (MLP) with 12 neurons in the input layer, 1024 neurons in the hidden layer, and 48 neurons in the output layer. The loss function is given in Section 2.4. The other details of the decoding network, including the activation function and the gradient descent, are given in Appendix A.
3.1. SNR after DBF
Assume that a far-field target at with single frequency. Both wireless channel noise and RF channel noise are Gaussian noise. The noise figure of the digital receiver module . The beamforming weight , where .
As shown in Figure 6, the SNR after DBF of ML-CDRA is almost consistent with that of the 48-DRA, although only 12 RF channels are required. This is nearly
higher than 12-DRA, which has the same RF channels. In addition, the simulation results also reflect that the system-accumulated gain decreases in TSPW caused by data loss are
. More importantly, the ML-CDRA works stably even if the input SNR is as low as
. According to the simulation results, it can be reasonably speculated that the proposed ML-CDRA will still work effectively with a lower SNR.
Furthermore, the spectrum of the decoded signal after DBF is given in Figure 7 to demonstrate that the proposed ML-CDRA recovers the received array signal. In the normalized spectrum diagram, the noise power of ML-CDRA is basically the same as that of the 48-DRA and lower than that of the 12-DRA. It should be noted that the noise of ML-CDRA is approximately uniformly distributed across the spectrum.
3.2. Beamforming Performance
The normalized array patterns of the different DBF systems are given in Figure 8. The proposed ML-CDRA has the same beamforming performance as the 48-DRA and a narrower beamwidth than the 12-DRA, which means better directivity. The traditional subarray, which has the same 4-in-1 structure without phase shifters as shown in Figure 3a, can also achieve a narrow beam similar to that of ML-CDRA. However, it cannot realize effective spatial filtering when undesired targets appear at a grating lobe. The proposed ML-CDRA solves this problem by recovering the received array signal.
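The grating-lobe issue of the traditional subarray can be reproduced with a short sketch. It assumes conventional matched weights, a steering direction of 10° chosen only for illustration, and idealized full-array signals standing in for the recovered ones; the function names are hypothetical.

```python
import numpy as np

def steering_vector(n_elements, theta_deg):
    """ULA steering vector, half-wavelength spacing (assumed phase convention)."""
    n = np.arange(n_elements)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg)))

N, M = 48, 12
Phi = np.kron(np.eye(M), np.ones((1, N // M)))   # 4-in-1 adjacent subarrays, no phase shifters

def matched_response(B, theta_deg, steer_deg):
    """Normalized |w^H B a(theta)| with w = B a(steer); B = identity gives the full array."""
    w = B @ steering_vector(N, steer_deg)
    v = B @ steering_vector(N, theta_deg)
    return np.abs(np.vdot(w, v)) / np.vdot(w, w).real

steer = 10.0
# Subarray phase centres are two wavelengths apart, so a grating lobe appears
# where sin(theta) = sin(steer) - 1/2.
theta_g = np.rad2deg(np.arcsin(np.sin(np.deg2rad(steer)) - 0.5))

sub_at_grating = matched_response(Phi, theta_g, steer)          # roughly half the main-lobe level
full_at_grating = matched_response(np.eye(N), theta_g, steer)   # essentially zero
```

In this sketch the subarray beamformer passes an interferer at the grating-lobe direction at roughly half the main-lobe amplitude, whereas a beamformer operating on all 48 element signals places a null there.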
3.3. Spatial Sensitivity
Different from the antenna pattern, the spatial sensitivity describes the relationship between the maximum SNR after DBF and the directions. Signals from different directions with the same input SNR are fed into ML-CDRA in turn. After decoding, the recovered array signals are combined according to the DBF weight of the corresponding direction, e.g., , where is the direction of arrival.
Figure 9 shows several subarray-structure encoding networks, and the spatial sensitivities of these schemes are given in Figure 10a. Wherever the signal comes from, the full-channel DRA can achieve the same maximum DBF gain by phased coherent combination. Therefore, both 48-DRA and 12-DRA appear as flat lines. It is easy to see that the subarray-structure ML-CDRA performs better than the 12-DRA around the broadside direction. The SNR after the DBF of the 4-in-1 structure is profitable in about
compared with 12-DRA. Moreover, if phase shifters are introduced into the encoding network, as shown in Figure 3b, the ML-CDRA can obtain the maximum SNR after DBF in any direction. As shown in Figure 10b, the maximum gain of ML-CDRA is achieved at
by adjusting the phase shifters.
For the ML-CDRA with a random encoding structure, as in Figure 2a, the received array signals are combined into a few RF channels according to a random Bernoulli matrix. Figure 11 shows that the SNR of this structure after DBF is similar to that of the 48-DRA in the broadside direction and fluctuates around that of the 12-DRA in other directions, while obtaining a narrower beamwidth, as discussed in Section 3.2. The random-encoding ML-CDRA can cover the entire space simultaneously, and its SNR is more robust to changes in direction than that of the subarray structure.
A different encoding structure means a different spatial sensitivity, which may be an advantage in some scenarios. For example, in target tracking, the subarray-structure ML-CDRA can obtain the maximum gain in the desired direction using phase shifters, as in Figure 10b. Signals from other directions are attenuated due to the spatial sensitivity. Therefore, the ML-CDRA can naturally realize spatial anti-jamming.
4. Implementation
To further prove the “low-cost” property, the implementation of the proposed ML-CDRA is discussed based on the commercial chips in this section. The complexity of the decoding network is also studied.
A real-time processing implementation architecture for ML-CDRA is presented in Figure 12. The forward propagation of the decoding network is performed by multipliers and adders in the FPGA. The back propagation is carried out by gradient descent in a digital signal processor (DSP), and the network parameters are updated once per coherent processing interval (CPI). For the full-channel DRA, the DBF weight calculation is also performed in the DSP based on the array signals, as shown in Figure 12a. Therefore, the gradient descent of ML-CDRA can be integrated into the same DSP chip without any additional data-transfer overhead.
To realize real-time decoding, each multiplication and addition operation of the decoding network needs to be assigned independent resources. The pipeline architecture shown in Figure 13, which can execute all the operations of each layer of the decoding network in every clock cycle, is more suitable for real-time decoding than an instruction set architecture.
To be more specific, the resources consumed to implement the decoding network can be divided into registers, adders, and multipliers. For a network of limited scale, the logic cells and configurable logic blocks (CLBs) in the FPGA can supply the registers and adders required by the decoding network. The activation functions of the hidden layer neurons (such as Sigmoid, ReLU, sech, etc.) can be implemented by look-up tables (LUTs). In addition, the multiplication capability can be evaluated in terms of multiply–accumulate (MAC) operations. Many operations (such as convolution, dot product, and matrix operations) can be converted into multiple MAC operations. Therefore, networks requiring convolution, such as the convolutional neural network (CNN), can also potentially be used in the proposed ML-CDRA.
Consider an N-element, M-channel ML-CDRA realized by a three-layer, fully connected MLP with M neurons in the input layer, K neurons in the hidden layer, and N neurons in the output layer. The system clock is $f_{\mathrm{clk}}$. For the super-heterodyne receiver, the received signal is divided into I&Q branches after down-conversion. Therefore, the number of multiply–accumulate operations performed per second is
$$N_{\mathrm{MAC}} = 2\,(MK + KN)\,f_{\mathrm{clk}}.$$
Take a 48-element 12-channel ML-CDRA as an example, which is studied in the simulation. The decoding network is realized by a three-layer, fully connected MLP with 12 neurons in the input layer, 1024 neurons in the hidden layer, and 48 neurons in the output layer. Suppose the system clock is 100 MHz. Therefore, the number of multiply accumulate operations performed per second is 2 * (12 * 1024 + 1024 * 48) * 100 MHz = 12,288 GMAC/s.
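The arithmetic of this example can be reproduced with a few lines. The helper name `mac_rate` and its parameters are illustrative; the factor of two corresponds to the I and Q branches, and one full forward pass per clock cycle is assumed.

```python
def mac_rate(m_inputs, k_hidden, n_outputs, clock_hz, branches=2):
    """MAC operations per second for a fully connected M-K-N MLP run once per clock cycle."""
    macs_per_pass = m_inputs * k_hidden + k_hidden * n_outputs
    return branches * macs_per_pass * clock_hz

rate = mac_rate(12, 1024, 48, 100e6)   # 48-element, 12-channel example at 100 MHz
rate_gmacs = rate / 1e9                # 12288.0 GMAC/s
```

Because the count is linear in the hidden-layer width and the clock rate, halving either one halves the required MAC throughput, which is the lever behind the resource and performance trade-off.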
Table 2 gives several available commercial FPGAs [28,29]. The Virtex UltraScale+ FPGA VU11P and VU13P are powerful enough to implement the above decoding network using only one chip. The Virtex-7 XC7VX690T can also be used to implement a decoding network with lower complexity. In addition, these FPGAs have enough I/O to assign a dedicated pin to each ADC without time division multiplexing. It should be noted that the decoding network mainly consumes the DSP Slice resources. The remaining resources in the FPGA can still be used for other functions.
Hence, the proposed ML-CDRA can be implemented by adding one FPGA to a full-channel DRA design. As shown in Figure 12, the required RF channels, variable gain amplifiers (VGAs), ADCs, and I/Os of ML-CDRA are significantly reduced compared with the full-channel DRA. Moreover, the space requirement, heat dissipation, and crosstalk of RF channels are also alleviated. Since the additional cost of the decoding network is lower than the cost saved by removing RF channels, the proposed ML-CDRA is attractive.
Furthermore, for a large-scale decoding network, a single additional chip may not be sufficient to implement the ML-CDRA. Cascading multiple FPGAs is one possible solution. In addition, sparsely connected networks are a potential way to reduce redundant computation and resource consumption. Resource reuse is also a good choice in scenarios with a low sampling rate. New hardware architectures and devices designed for artificial intelligence, such as the Adaptive Compute Acceleration Platform (ACAP) from Xilinx, may address the real-time decoding challenge at the hardware level.