1. Introduction
Interest in short block length codes has been rising recently, mainly due to emerging applications that require high reliability and low latency data transmission. Examples of such applications are machine-type communications in 5G/6G, the Internet of Things, smart metering networks, remote command links, and messaging services [1,2,3,4,5,6,7,8]. Short block length codes play a crucial role in satisfying the low latency requirements of 5G communication. However, for short block length codes, the necessary trade-off between good error correction performance and decoding complexity leads, on the one hand, to a significant gap between the achieved error correction performance and the theoretical bound and, on the other hand, to a dearth of practical decoding solutions [9,10].
Cyclic codes, first proposed in the 1960s [11], can be soft decision decoded based on their code structure, such as the trellis graph or the bit reliability [12,13,14,15,16]. Decoding algorithms based on trellis graphs can approach maximum likelihood (ML) soft decision decoding performance, but the decoding complexity is the main concern, especially when the code length $n$ is not so small and the number of states grows exponentially with $k$ or $n-k$, where $k$ is the number of information bits [12,13]. Generalized minimum distance decoding has a relatively low decoding complexity; however, its gap to the performance of ML decoding cannot be neglected [14,15]. Decoding schemes based on ordered statistics can approach ML performance and have recently become a hot topic, showing great potential for short and moderate length codes in future wireless communications [16,17]. The ordered statistic decoding (OSD) algorithm was proposed by Fossorier and Lin in [16]. Traditional algorithms based on ordered statistics need to find the most reliable basis (MRB), which involves sorting the soft information of all the bits and finding $k$ independent columns of the generator matrix associated with the $k$ most reliable bits [16]. To eliminate the burden of identifying the MRB, much effort has been devoted to reducing the complexity, and significant progress has been achieved [17]. The main ideas of the existing improved OSD algorithms include skipping and/or discarding unlikely test patterns to avoid unnecessary calculations [18,19,20,21,22,23,24,25,26]. A double re-encoding technique was proposed in [27], in which two groups of independent candidates are chosen for a particular class of rate-1/2 systematic block codes. Without using the MRB, this method only needs a few candidate codewords, which is practical for hardware implementation [28,29]. The application of OSD algorithms to short codes in wireless communications, especially ultra-reliable and low latency communication (URLLC), has also been investigated [30,31,32].
Decoder architectures for all kinds of decoding algorithms, including soft decision decoding algorithms, have also been comprehensively evaluated. As a classical example, the Golay code is popular for its perfect structure and good error correction capability, and many important decoding algorithms have been proposed for it in various application scenarios [33,34,35,36,37,38]. Besides traditional URLLC applications, the code has also been applied in some newer scenarios [39,40]. There are many decoder examples for Golay codes [41,42,43,44]. For example, Sarangi et al. proposed a decoder for extended binary Golay codes based on an incomplete ML decoding scheme with low latency and low complexity [41]. Reviriego et al. exploited the properties of the Golay code to implement a parallel decoder that corrects single and double-adjacent errors [42]. Short BCH codes are another important type of block code suitable for soft decision decoding, and implementation architectures for them have also been proposed [45,46,47,48,49]. In [50,51], implementations of soft decision decoding of a Golay code fully based on channel observation values were presented, which can achieve performance close to that of ML decoding; however, their complexity is much higher than that of decoders using hard decision results. Therefore, in this paper, we choose the Golay code and several BCH codes as examples to evaluate the proposed decoding algorithm and decoder architecture.
Inspired by permutation decoding algorithms [52,53,54], a new soft decision decoding algorithm that utilizes a cyclic information set is proposed in this paper. The proposed method exploits the cyclic property of cyclic block codes to generate the information set, which is suitable for hardware implementation, so that the MRB is not needed. Furthermore, an efficient hardware implementation architecture is designed to reduce the complexity of generating candidate codewords and selecting the most probable member. Decoders for several cyclic codes are implemented on a hardware platform and the performance of the proposed algorithm is verified. Experimental results demonstrate that the proposed decoding algorithm achieves error correction performance close to that of ML decoding. It achieves a good trade-off between complexity and performance, which makes the new decoding algorithm quite practical. The key parameters of the decoding algorithm are also optimized based on the hardware platform. The proposed soft decision decoding algorithm makes full use of channel information and provides a significant performance improvement over algebraic decoding. In summary, our contributions lie in two points. First, the proposed algorithm employs the cyclic property of cyclic block codes, enabling the utilization of a uniform systematic generator matrix in re-encoding, which eliminates the complex transformations needed to obtain new generator matrices. Second, the hardware implementation is based on a systolic array architecture, which has low complexity and short critical paths. It offers the ability to arbitrarily set the number of information sequences and candidate codewords, supporting a flexible range of applications.
The rest of this paper is organized as follows. Section 2 introduces the calculation method of the metric used in the proposed soft decision decoding of cyclic codes. The proposed soft decision decoding algorithm based on the cyclic information set is illustrated in Section 3. Section 4 presents the efficient hardware implementation architecture of the proposed soft decision decoding algorithm. The verification and parameter optimization of the algorithm based on the hardware platform are presented in Section 5. Finally, conclusions are drawn in Section 6.
2. Metrics for the Proposed Soft Decision Decoding
In this paper, for cyclic codes, a binary phase shift keying (BPSK) constellation is transmitted over an additive white Gaussian noise (AWGN) channel. Commonly used soft decision decoding metrics include the likelihood function, the Euclidean distance, the correlation, and the correlation discrepancy. In the proposed decoding algorithm, the correlation discrepancy is adopted as the decoding metric [12].
Let $\mathbf{r} = (r_1, r_2, \ldots, r_n)$ denote the received soft information sequence and $\mathbf{c} = (c_1, c_2, \ldots, c_n)$ represent an arbitrary codeword, where $n$ is the code length. The log-likelihood function of $\mathbf{c}$ is expressed as

$$\log P(\mathbf{r} \mid \mathbf{c}) = \sum_{i=1}^{n} \log P(r_i \mid c_i). \quad (1)$$

If the log-likelihood function is used as the decoding metric, the soft decision ML decoding algorithm selects the codeword $\mathbf{c}$ with the largest $\log P(\mathbf{r} \mid \mathbf{c})$.

For the AWGN channel with a two-sided power spectral density of $N_0/2$, the conditional probability $P(\mathbf{r} \mid \mathbf{c})$ is given by

$$P(\mathbf{r} \mid \mathbf{c}) = \left(\frac{1}{\sqrt{\pi N_0}}\right)^{n} \exp\!\left(-\frac{1}{N_0} \sum_{i=1}^{n} (r_i - x_i)^2\right), \quad (2)$$

where $x_i$ is the BPSK symbol mapped by $c_i$, satisfying $x_i \in \{+1, -1\}$ [12], i.e.,

$$x_i = 2c_i - 1, \quad i = 1, 2, \ldots, n, \quad (3)$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is the symbol sequence for BPSK modulation mapped by $\mathbf{c}$. The summation term in (2) is called the squared Euclidean distance between the received sequence $\mathbf{r}$ and the codeword $\mathbf{c}$. It is denoted as $d_E^2(\mathbf{r}, \mathbf{x})$, i.e.,

$$d_E^2(\mathbf{r}, \mathbf{x}) = \sum_{i=1}^{n} (r_i - x_i)^2. \quad (4)$$

Obviously, maximizing $\log P(\mathbf{r} \mid \mathbf{c})$ is equivalent to minimizing $d_E^2(\mathbf{r}, \mathbf{x})$. Expanding the expression on the right side of (4) gives

$$d_E^2(\mathbf{r}, \mathbf{x}) = \sum_{i=1}^{n} r_i^2 - 2 \sum_{i=1}^{n} r_i x_i + n, \quad (5)$$

since $x_i^2 = 1$. Define the summation term $\sum_{i=1}^{n} r_i x_i$ as the correlation between $\mathbf{r}$ and $\mathbf{x}$, denoted as $m(\mathbf{r}, \mathbf{x})$ [12], that is,

$$m(\mathbf{r}, \mathbf{x}) = \sum_{i=1}^{n} r_i x_i. \quad (6)$$

Then it can be rewritten as [12]

$$m(\mathbf{r}, \mathbf{x}) = \sum_{i=1}^{n} |r_i| - 2 \sum_{i:\, z_i \neq c_i} |r_i|. \quad (7)$$

Thus, finding the maximum value of $m(\mathbf{r}, \mathbf{x})$ is equivalent to finding the minimum value of the second summation term in (7). Define that summation term as the correlation discrepancy between $\mathbf{r}$ and $\mathbf{c}$, denoted as $\lambda(\mathbf{r}, \mathbf{c})$ [12], i.e.,

$$\lambda(\mathbf{r}, \mathbf{c}) = \sum_{i:\, z_i \neq c_i} |r_i|, \quad (8)$$

where $z_i$ is the hard decision result of $r_i$, satisfying $z_i = 1$ if $r_i > 0$ and $z_i = 0$ otherwise, and $\mathbf{z} = (z_1, z_2, \ldots, z_n)$ is the binary sequence obtained after hard decision for $\mathbf{r}$.
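To make the metric concrete, the following Python sketch (illustrative, not from the paper) computes the correlation discrepancy $\lambda(\mathbf{r}, \mathbf{c}) = \sum_{i:\, z_i \neq c_i} |r_i|$ defined above and checks on a toy code that ranking codewords by it agrees with ranking by squared Euclidean distance, as the derivation implies:

```python
def hard_decision(r):
    # z_i = 1 if r_i > 0, else 0 (BPSK mapping x_i = 2c_i - 1 is assumed)
    return [1 if ri > 0 else 0 for ri in r]

def correlation_discrepancy(r, c):
    # lambda(r, c): sum of |r_i| over positions where hard decision != c_i
    z = hard_decision(r)
    return sum(abs(ri) for ri, zi, ci in zip(r, z, c) if zi != ci)

def squared_euclidean(r, c):
    # d_E^2(r, x) with the BPSK symbols x_i = 2c_i - 1
    return sum((ri - (2 * ci - 1)) ** 2 for ri, ci in zip(r, c))

# Toy check with the (3, 1) repetition code {000, 111}: both metrics
# rank the codewords identically.
r = [0.9, -0.2, 1.1]
codewords = [[0, 0, 0], [1, 1, 1]]
by_lambda = min(codewords, key=lambda c: correlation_discrepancy(r, c))
by_euclid = min(codewords, key=lambda c: squared_euclidean(r, c))
assert by_lambda == by_euclid == [1, 1, 1]
```

The correlation discrepancy only touches the positions where the hard decision disagrees with the codeword, which is what makes it cheap to update incrementally in the architectures of Section 4.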
3. Soft Decision Decoding Algorithm Based on Cyclic Information Set
A low complexity soft decision decoding algorithm based on a series of cyclic information sequences, collectively called the cyclic information set, is proposed in this paper. The proposed decoding procedure is decomposed into two steps. First, using the cyclic property of cyclic codes, a predetermined number of cyclic information sequences are generated, which form the cyclic information set. Then, a group of $C$ candidates for each information sequence is tested to find the most probable codeword.
3.1. The Proposed Soft Decision Decoding Algorithm
In this paper, multiple information sequences are used for re-encoding to generate the candidate codewords. We generate the information set using the cyclic property of the cyclic codes. The proposed soft decision decoding algorithm is depicted in Figure 1. Assume that the received soft information sequence is $\mathbf{r}$. First, $I-1$ circular shifts are performed on $\mathbf{r}$ to obtain $I$ $n$-dimensional soft information sequences $\mathbf{r}^{(0)}, \mathbf{r}^{(1)}, \ldots, \mathbf{r}^{(I-1)}$, where $i$ ($0 \le i \le I-1$) denotes the number of circular shifts. Then, the first $k$ components in each $\mathbf{r}^{(i)}$ are selected to form the $k$-dimensional sequences $\mathbf{y}^{(i)}$, namely the information sequences. The information sequences represent the equivalent information bits in the codewords. The process of generating multiple information sequences by circular shifting is shown in Figure 2. For each information sequence $\mathbf{y}^{(i)}$, a group of $C$ binary $k$-dimensional sequences with the minimum correlation discrepancy to $\mathbf{y}^{(i)}$ is obtained, denoted as $\hat{\mathbf{y}}^{(i,1)}, \ldots, \hat{\mathbf{y}}^{(i,C)}$, where the correlation discrepancy is calculated using the hard decision of the corresponding information sequence. All candidate information sequences are encoded to yield $I \times C$ candidate codewords via the cyclic encoding method. In essence, the codewords of a cyclic code remain legitimate codewords after circular shifting. The information sequence corresponding to each candidate information sequence is obtained by the circular shifting of $\mathbf{r}$, so the simple shift-register-based encoding circuits of cyclic codes can be used to encode the candidate information sequences [12]. This method saves the burden of computing new generator matrices. Finally, the candidate codeword with the minimum correlation discrepancy to the corresponding shifted soft information sequence is selected as the optimal candidate codeword. If this candidate codeword is generated by the information sequence taken directly from $\mathbf{r}$ (i.e., zero shifts), it is directly selected as the optimal decoding result; otherwise, the candidate codeword needs to be circularly shifted back to obtain the optimal decoding result. The four main steps of the algorithm are summarized in Algorithm 1.
Algorithm 1: The soft decision decoding algorithm based on the cyclic information set.

Input: the received soft information sequence $\mathbf{r}$.
Output: the optimal decoding result.
1. Generate the $I$ information sequences.
   1.1 Circularly shift $\mathbf{r}$ uniformly, and obtain $\mathbf{r}^{(0)}, \mathbf{r}^{(1)}, \ldots, \mathbf{r}^{(I-1)}$.
   1.2 Take the first $k$ components of each $\mathbf{r}^{(i)}$ to generate the information sequences $\mathbf{y}^{(i)}$.
2. For each information sequence $\mathbf{y}^{(i)}$, generate the $C$ candidate information sequences with the minimum correlation discrepancy, i.e., $\hat{\mathbf{y}}^{(i,1)}, \ldots, \hat{\mathbf{y}}^{(i,C)}$.
3. Encode the candidate sequences to obtain the $I \times C$ candidate codewords.
4. Select the optimal candidate codeword and make a decision.
   4.1 Select the candidate codeword with the minimum correlation discrepancy to the associated shifted soft information sequence.
   4.2 If the codeword is generated by $\mathbf{r}^{(0)}$, it is directly used as the decoding result; otherwise, it is circularly shifted back to obtain the decoding result.
In the proposed soft decision decoding algorithm, the correlation discrepancy is used as the metric both to generate the candidate sequences and to select the optimal candidate codeword. First, $I$ information sequences are generated using cyclic shifting, and for each information sequence, the $C$ nearest binary sequences in correlation discrepancy are selected as candidate information sequences. Then, the list of candidate codewords is established using the concise cyclic encoding method. Finally, the candidate codeword having the minimum correlation discrepancy with the received soft information sequence is selected as the decoding result. Based on the minimum correlation discrepancy, the size of the candidate codeword list is reduced from $2^k$ to $I \times C$, which effectively narrows the search range of the decoding results. Meanwhile, for each information sequence, it only needs to calculate the correlation discrepancy between two $k$-dimensional sequences once, and to calculate the $(n-k)$-dimensional correlation discrepancy $C$ times. Compared with ML decoding, the proposed algorithm avoids the complex calculation and sorting of the correlation discrepancies of all $2^k$ binary $k$-dimensional sequences.
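As a self-contained illustration of Algorithm 1 (a sketch under assumed bit-ordering conventions, not the paper's implementation), the snippet below decodes the (7, 4) cyclic Hamming code with generator polynomial $g(x) = x^3 + x + 1$. For brevity, the candidate information sequences are found by exhaustive sorting rather than by the iterative method of Section 3.2:

```python
from itertools import product

n, k = 7, 4
G = [1, 0, 1, 1]  # g(x) = x^3 + x + 1, highest-degree coefficient first

def hard_bit(v):
    return 1 if v > 0 else 0

def discrepancy(r, c):
    # correlation discrepancy: sum |r_i| where the hard decision differs from c_i
    return sum(abs(v) for v, b in zip(r, c) if hard_bit(v) != b)

def encode(m):
    # systematic cyclic encoding: divide m(x) * x^(n-k) by g(x) over GF(2)
    reg = list(m) + [0] * (n - k)
    for i in range(k):
        if reg[i]:
            for j, gj in enumerate(G):
                reg[i + j] ^= gj
    return list(m) + reg[k:]  # [information bits | parity bits]

def decode(r, I=7, C=4):
    best, best_lam = None, float("inf")
    for i in range(I):
        ri = r[i:] + r[:i]           # i-th cyclic shift of the observations
        y = ri[:k]                   # equivalent information positions
        # C binary sequences closest to y in correlation discrepancy
        cands = sorted(product([0, 1], repeat=k),
                       key=lambda m: discrepancy(y, m))[:C]
        for m in cands:
            cw = encode(list(m))
            lam = discrepancy(ri, cw)
            if lam < best_lam:
                # shift the codeword back into the original position
                best, best_lam = cw[n - i:] + cw[:n - i], lam
    return best
```

For example, transmitting `encode([1, 0, 1, 1]) = [1, 0, 1, 1, 0, 0, 0]` over BPSK and corrupting one low-reliability parity observation, `decode([1, -1, 1, 1, 0.1, -1, -1])` recovers the transmitted codeword, since the received shift itself supplies an error-free information sequence for one of the $I$ shifts.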
3.2. Efficient Iterative Generating Method for Adjacent Candidate Sequences
The generation of the candidate information sequences and the metric calculation for the candidate codewords are the main steps with high computational complexity in the proposed soft decision decoding algorithm. For each information sequence, only a small number, denoted as $C$, of candidate information sequences needs to be generated to produce candidate codewords. Among all binary $k$-dimensional sequences, the candidate information sequences are the $C$ members with the smallest correlation discrepancy to the information sequence.
Aiming at reducing the complexity of generating candidate information sequences, an iterative algorithm is adopted in this paper. This method avoids calculating the correlation discrepancies of all $2^k$ binary sequences and sorting all of these discrepancies. The algorithm was first employed to calculate the log-likelihood ratios of non-binary symbols for non-binary low density parity check (LDPC) codes [55]. In this paper, the method is extended, with some modifications, to the generation of candidate information sequences.

The algorithm works in an iterative mode with a total of $k$ iterations. In the $d$-th iteration, the subsequence consisting of the first $d$ components of the information sequence is considered. The set $L_d$ of two-dimensional variables represents the generated partial binary sequences and their correlation discrepancies with this subsequence. If $d = 1$, $L_1 = \{(z_1, 0), (\bar{z}_1, |y_1|)\}$, where $z_1$ represents the hard decision result of the first component $y_1$ and $\bar{z}_1$ represents the negation of $z_1$. The calculation in the $d$-th iteration is divided into two steps. Take the $l$-th element of $L_{d-1}$, the pair $(s_l, \lambda_l)$, as an example. Each element is expanded into two elements according to the $d$-th component $y_d$ of the information sequence and its hard decision result $z_d$. Extending the $l$-th element $(s_l, \lambda_l)$ in $L_{d-1}$ to the two elements $a_l$ and $b_l$ can be specifically expressed as

$$a_l = (s_l \,\&\, z_d,\; \lambda_l), \quad (9)$$

$$b_l = (s_l \,\&\, \bar{z}_d,\; \lambda_l + |y_d|), \quad (10)$$

where ‘&’ is the concatenation operation of two binary sequences. The two-dimensional variable sets $A_d$ and $B_d$ extended from $L_{d-1}$ can be obtained by calculating all $a_l$ and $b_l$.
Then, the two sets $A_d$ and $B_d$ are merged to construct the new set $L_d$. It should be ensured that the elements in $L_d$ are arranged in ascending order of the correlation discrepancy value. Considering that the algorithm is iterative, the elements in $A_d$ and $B_d$ are also in ascending order. Therefore, only the first elements of the two sets that have not yet been placed into $L_d$ need to be compared, and the element with the smaller correlation discrepancy is put into $L_d$. When the number of elements in $L_d$ reaches the required number of candidate codewords $C$, the construction of $L_d$ is stopped. After the $k$-th iteration, the first-dimension components of the elements in the obtained set $L_k$ constitute the $C$ candidate sequences. The two main steps in the iterative algorithm are described in Algorithm 2.
The candidate information sequences are the sequences with the minimum distance (correlation discrepancy) to the information sequence. In the extending process, the correlation discrepancy of an extended sequence can only grow, so a partial sequence whose metric already exceeds those of the $C$ best can never become a candidate sequence. Thus, discarding these sequences does not affect the results of the candidate information sequence generation. The discarding process significantly reduces the complexity of generating the candidate information sequences, because the number of calculations and comparisons of the correlation discrepancy is reduced, especially for cyclic codes of long code length.
Algorithm 2: The $d$-th iteration of the candidate sequence generating algorithm.

Input: the set $L_{d-1}$ of the previous iteration, the absolute value $|y_d|$ of the $d$-th component of the information sequence, and its hard decision result $z_d$.
Output: the set $L_d$.
1. Extend the set $L_{d-1}$ to $A_d$ and $B_d$.
   1.1 Append $z_d$ to the binary sequence of each element in $L_{d-1}$ and obtain the set $A_d$.
   1.2 Append the negation of $z_d$, i.e., $\bar{z}_d$, to the binary sequence of each element in $L_{d-1}$. Update the correlation discrepancy of each element by adding $|y_d|$ and obtain the set $B_d$.
2. Construct the set $L_d$.
   2.1 Combine $A_d$ and $B_d$, i.e., $U_d = A_d \cup B_d$.
   2.2 Compare the number of elements in $U_d$ with the required number $C$; if $|U_d| \le C$, all the elements form $L_d$; otherwise, go to step 2.3.
   2.3 Compare the elements of the two ordered sets $A_d$ and $B_d$, and output the $C$ elements with the smallest correlation discrepancy, obtaining the new set $L_d$.
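A software sketch of Algorithm 2's extend-and-merge iteration (illustrative names; the hardware operates on FIFOs rather than Python lists), cross-checked against exhaustive enumeration:

```python
from itertools import product

def hard_bit(v):
    return 1 if v > 0 else 0

def candidate_sequences(y, C):
    """Return the C binary sequences closest to y in correlation discrepancy,
    as (sequence, discrepancy) pairs in ascending order of the metric."""
    L = [((), 0.0)]  # the empty partial sequence with zero discrepancy
    for yd in y:
        zd = hard_bit(yd)
        # Step 1: extend each element with z_d (cost 0) or its negation (+|y_d|)
        A = [(s + (zd,), lam) for s, lam in L]
        B = [(s + (1 - zd,), lam + abs(yd)) for s, lam in L]
        # Step 2: merge the two ordered lists, keeping at most C elements
        merged, a, b = [], 0, 0
        while len(merged) < C and (a < len(A) or b < len(B)):
            if b == len(B) or (a < len(A) and A[a][1] <= B[b][1]):
                merged.append(A[a]); a += 1
            else:
                merged.append(B[b]); b += 1
        L = merged
    return L

# Cross-check against exhaustive enumeration of all 2^k sequences
y = [0.3, -1.2, 0.05, 0.7]
def disc(m):
    return sum(abs(v) for v, b in zip(y, m) if hard_bit(v) != b)
brute = sorted(disc(m) for m in product([0, 1], repeat=len(y)))[:4]
assert [round(lam, 9) for _, lam in candidate_sequences(y, 4)] == \
       [round(v, 9) for v in brute]
```

Truncating to $C$ elements per iteration is exact here because the discrepancy is nondecreasing along extensions: every prefix of one of the $C$ best full sequences is itself among the $C$ best partial sequences at its stage.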
4. The Architecture and Complexity Analysis of the Proposed Decoder
In this section, based on the proposed soft decision decoding algorithm, the decoder implementation for hardware evaluation is designed in a serial mode. The two key components of the decoder, the candidate sequence generator and the correlation discrepancy calculation unit, are both implemented using systolic arrays, which lowers the complexity and shortens the critical path. Furthermore, the implementation complexity of the proposed decoder is analyzed.
4.1. Proposed Decoder Architecture
The overall architecture of the proposed soft decision decoder is shown in
Figure 3. The decoder includes five modules. The soft information register is used to store soft information from the demodulator and perform cyclic shifting to obtain the different information sequences, that is, the information set. The following decoding operations are repeated according to the cardinality of the information set, and finally the candidate codeword with the minimum correlation discrepancy is selected as the decoding output. As described in
Section 3, the candidate sequence generator, the re-encoding unit, the correlation discrepancy calculation unit, and the decoding result update unit work in the sequential manner.
The decoding process is explained as follows. The soft information sequence $\mathbf{r}$ from the demodulator is first buffered in the soft information register. Then, the first $k$ components, i.e., the soft information corresponding to the information bits, are output to the candidate sequence generator to construct a series of candidate information sequences. Next, the re-encoding unit encodes the candidate information sequences to obtain the candidate codewords. The residual $n-k$ components of $\mathbf{r}$ are output to the correlation discrepancy calculation unit for calculating the correlation discrepancy of each candidate codeword. The candidate codewords and their correlation discrepancies are computed serially. After the candidate codewords and the corresponding correlation discrepancies are input into the decoding result update unit, a register is used to circularly shift the buffered candidate codeword. When all the operations for a single information sequence are completed, the soft information register performs a cyclic shift on the buffered data to generate a new information sequence. In this way, the operations for each information sequence are repeated. After all $I$ information sequences have been processed, the final decoding output is obtained in the decoding result update unit.
Among all the modules of the decoder, the candidate sequence generator and the correlation discrepancy calculation unit need to be carefully designed. For the candidate sequence generator, the iterative algorithm can significantly reduce the number of comparison operations and thus reduce the overall implementation complexity. For the correlation discrepancy computation unit, the complexity of the direct summation computation is high. In our design, we consider the systolic array architecture for the two modules. The systolic array architecture is composed of several modules working in pipeline mode, and each module is called a Processing Element (PE).
4.2. Architecture of the Candidate Sequence Generator
The candidate sequence generator produces the candidate information sequences and the corresponding $k$-dimensional correlation discrepancies for the subsequent modules, according to the channel observation values forming the information sequence $\mathbf{y} = (y_1, y_2, \ldots, y_k)$. As shown in Figure 4, the systolic array architecture consists of $k$ cascaded PEs working in a sequential manner based on Algorithm 2. In the candidate sequence generator architecture, the $j$-th component of the received soft information sequence is connected to the $j$-th PE, and the overall generation starts with the first PE, corresponding to $y_1$. The input of the first PE is the preset initial value, and its output is connected to the next PE. For the $d$-th iteration ($2 \le d \le k$), the $d$-th PE has two inputs: the absolute value $|y_d|$ of the $d$-th component of the soft information sequence together with its hard decision result $z_d$, and the list $L_{d-1}$ containing the intermediate $(d-1)$-dimensional candidate sequences with their correlation discrepancies from the $(d-1)$-th PE. For the output, the $d$-th PE generates the list $L_d$ and transfers it to the $(d+1)$-th PE of the next stage, until the output of the $k$-th PE constitutes the generator output, where $k$ is the information sequence length.
The PE is the core of the candidate sequence generator. It consists of two expansion modules, two First-In-First-Out (FIFO) memories, and one comparator, which selects the minimum of the two FIFO outputs. The $d$-th PE is shown in Figure 4, where the input and output represent the intermediate candidate sequence generation. The $d$-th PE serially receives the ordered list $L_{d-1}$ from the $(d-1)$-th stage one element at a time. The first step is the expansion of $L_{d-1}$ into $A_d$ and $B_d$, as described in the previous section. Once the first elements of $A_d$ and $B_d$ are stored in the FIFOs, the merging process starts. In each clock cycle, the outputs of the two FIFOs are compared, and the one with the smaller correlation discrepancy value is selected and fed into the list $L_d$. Simultaneously, the pull signal is set to 1 to allow a new couple at the output of the corresponding FIFO.
The depth of the two FIFOs per stage is set considering the worst case, that is, all couples of $L_d$ are retrieved from one FIFO. Therefore, the depth of each FIFO in the $d$-th PE is equal to the number of couples output by the $(d-1)$-th stage. According to the previous section, if the stage label $d$ satisfies $2^{d-1} \ge C$, the depth of the FIFO is set to $C$; otherwise, the depth of the FIFO is $2^{d-1}$. After the last PE outputs the $C$ candidate information sequences, all of the FIFOs are cleared and the candidate sequence generation is completed.
4.3. Architecture of the Correlation Discrepancy Calculation Unit
The correlation discrepancy calculation unit computes the correlation discrepancy between each candidate codeword $\hat{\mathbf{c}}$ and the received soft information sequence $\mathbf{r}$. To simplify the hardware implementation, we split the calculation into the part over the first $k$ bits and the part over the last $n-k$ bits. The formula for calculating the correlation discrepancy can be expressed as

$$\lambda(\mathbf{r}, \hat{\mathbf{c}}) = \sum_{i \le k:\, z_i \neq \hat{c}_i} |r_i| + \sum_{i > k:\, z_i \neq \hat{c}_i} |r_i|, \quad (11)$$

or concisely denoted as

$$\lambda(\mathbf{r}, \hat{\mathbf{c}}) = \lambda(\mathbf{r}_I, \hat{\mathbf{c}}_I) + \lambda(\mathbf{r}_P, \hat{\mathbf{c}}_P). \quad (12)$$

Here, $z_i$ is the hard decision result of $r_i$. The soft information sequence associated with the information bits is denoted as $\mathbf{r}_I$, i.e., $\mathbf{r}_I = (r_1, \ldots, r_k)$, while $\mathbf{r}_P = (r_{k+1}, \ldots, r_n)$ is the part of the soft information sequence associated with the parity-check bits. Similarly, the candidate codeword is denoted as $\hat{\mathbf{c}} = (\hat{\mathbf{c}}_I, \hat{\mathbf{c}}_P)$, where $\hat{\mathbf{c}}_I = (\hat{c}_1, \ldots, \hat{c}_k)$ denotes the candidate information sequence associated with the information bits, while $\hat{\mathbf{c}}_P = (\hat{c}_{k+1}, \ldots, \hat{c}_n)$.

It can be observed that the first term in Equation (12) is the correlation discrepancy between the information sequence and the corresponding candidate sequence, which is already available from the candidate sequence generator. The correlation discrepancy calculation unit therefore only needs to calculate the second term $\lambda(\mathbf{r}_P, \hat{\mathbf{c}}_P)$.
Considering the high computational complexity of direct summation, the proposed correlation discrepancy calculation unit uses a systolic array architecture for sequential calculation. The architecture is shown in Figure 5 and consists of $n-k$ identical PEs. Taking the $d$-th PE as an example, the inputs are the soft information $|r_{k+d}|$ with its corresponding hard decision result $z_{k+d}$, and the list $Q_{d-1}$ from the previous PE. The $d$-th PE generates the list $Q_d$ as an output, which is serially sent to the $(d+1)$-th PE of the next stage. $Q_d$ ($1 \le d \le n-k$) is a set containing $C$ two-dimensional variables. The $l$-th two-dimensional variable of the set can be expressed as

$$q_l^{(d)} = \left(\hat{\mathbf{c}}_l,\; \lambda(\mathbf{r}_I, \hat{\mathbf{c}}_{l,I}) + \sum_{k < i \le k+d:\, z_i \neq \hat{c}_{l,i}} |r_i|\right). \quad (13)$$

In Formula (13), the first-dimension variable of $q_l^{(d)}$ is the candidate codeword $\hat{\mathbf{c}}_l$, and the second-dimension variable is the correlation discrepancy between the first $k+d$ components of $\mathbf{r}$ and the first $k+d$ bits of $\hat{\mathbf{c}}_l$.
As shown in Figure 5, the architecture of the $d$-th PE is composed of an expansion unit, an XOR gate, and a data selector. The XOR operation on $z_{k+d}$ and $\hat{c}_{l,k+d}$ generates the selection signal for the data selector. If $z_{k+d} = \hat{c}_{l,k+d}$, the accumulated correlation discrepancy is kept unchanged to expand $q_l^{(d-1)}$; otherwise, if $z_{k+d} \neq \hat{c}_{l,k+d}$, $|r_{k+d}|$ is added to the accumulated correlation discrepancy to obtain $q_l^{(d)}$. The last PE serially outputs $Q_{n-k}$, which contains all candidate codewords and their correlation discrepancies with $\mathbf{r}$.
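The split of Equation (12) and the per-stage update can be mimicked in software as follows (a sketch with illustrative names; each loop iteration plays the role of one PE):

```python
def hard_bit(v):
    return 1 if v > 0 else 0

def parity_discrepancy_chain(r_parity, candidates):
    """candidates: list of (parity_bits, lambda_info) pairs, where lambda_info
    is the k-bit discrepancy already computed by the candidate sequence
    generator. Each loop iteration models one of the n-k PEs: it adds |r_i|
    to a candidate's running metric only where the hard decision mismatches."""
    Q = list(candidates)
    for d, rd in enumerate(r_parity):
        zd = hard_bit(rd)
        Q = [(p, lam + (abs(rd) if p[d] != zd else 0.0)) for p, lam in Q]
    return Q  # (parity bits, total correlation discrepancy) per candidate

# Consistency check against the direct single-sum computation:
r_parity = [0.5, -0.8, 1.3]
cands = [((0, 0, 1), 0.0), ((1, 0, 0), 0.2)]
out = parity_discrepancy_chain(r_parity, cands)

def direct(p, lam0):
    return lam0 + sum(abs(v) for v, b in zip(r_parity, p)
                      if hard_bit(v) != b)
assert all(abs(lam - direct(p, lam0)) < 1e-12
           for (p, lam), (_, lam0) in zip(out, cands))
```

Each stage touches only one received component and performs, per candidate, a single compare-and-add, which is exactly what keeps the per-PE hardware down to an XOR gate, a selector, and an adder.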
4.4. Implementation Complexity of the Proposed Decoder
Considering that the OSD algorithm is the most efficient and classical algorithm nearly achieving the ML decoding performance, the complexity evaluation is based on a comparison between the proposed algorithm and the OSD algorithm. Compared with the order-$t$ OSD algorithm over $k$-dimensional sequences in [16], the proposed algorithm effectively reduces the computational complexity, as the number of tested candidate codewords, $I \times C$, is smaller than $\sum_{i=0}^{t} \binom{k}{i}$. We utilize the (31, 21) BCH code as an example. Table 1 shows the size of the candidate codeword list for different decoding algorithms. The (31, 21) BCH code only requires 80 candidate codewords under the appropriate parameters, and the corresponding parameter optimization will be shown in Section 5.2. The proposed decoding algorithm thus effectively reduces the number of candidate codewords compared with the OSD algorithm.
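The comparison can be reproduced numerically. The order-$t$ OSD test-pattern count $\sum_{i=0}^{t} \binom{k}{i}$ is standard [16]; the figure of 80 candidates for the (31, 21) code is taken from the text above, and the choice of order 2 as the comparably performing OSD configuration is an assumption for illustration:

```python
from math import comb

def osd_list_size(k, t):
    # number of test patterns re-encoded by an order-t OSD decoder
    return sum(comb(k, i) for i in range(t + 1))

k = 21                 # (31, 21) BCH code
proposed = 80          # I * C at the operating point reported above
assert osd_list_size(k, 1) == 22
assert osd_list_size(k, 2) == 232
assert proposed < osd_list_size(k, 2)
```

Under this assumption, the proposed list of 80 candidates is roughly a third of the 232 test patterns of order-2 OSD for this code.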
For the re-encoding, the OSD algorithm needs to obtain a new generator matrix through complex matrix operations. The proposed decoding algorithm employs the cyclic property of cyclic block codes, enabling the utilization of a uniform systematic generator matrix in re-encoding, so the cyclic codes can be encoded using very simple shift-register encoding circuits.
All our implementations are on the Virtex-5 LX110T Field Programmable Gate Array (FPGA) (xc5vlx110t-1ff1136), and the synthesis software is the XST program of ISE 13.2, both by Xilinx Inc. This FPGA chip is mainly used for high-performance general logic applications, containing 17,280 configurable slices and 128 block RAMs (BRAMs), where each slice has 4 LUTs and 4 flip-flops. The proposed hardware architecture is implemented on the FPGA platform using different parameters. The proposed soft decision decoding algorithm contains two very important parameters, namely the number of information sequences, $I$, and the number of candidate codewords, $C$. As $I$ increases, the number of computation operations required to generate the list of candidate codewords increases linearly. Another advantage is that the systolic arrays operate in parallel, and the operations are only limited by the number of candidate codewords, $C$.
To illustrate the required hardware resources of the decoder, Table 2 presents the main units of the key modules. Here, $w_g$ is the weight of the generator polynomial of the cyclic code. The adders and comparators in the $d$-th PE have a width of $w$ bits. Furthermore, each processing element of the candidate sequence generator requires two FIFOs of the same size. In the $d$-th processing element ($1 \le d \le k$), each FIFO has a width of $(d + w)$ bits and a depth of $\min(2^{d-1}, C)$, where $w$ is the quantization bit width of the correlation discrepancy. The total storage space of the FIFOs (in bits) can be expressed as

$$S_{\mathrm{FIFO}} = \sum_{d=1}^{k} 2\,(d + w)\,\min(2^{d-1}, C). \quad (14)$$
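The total FIFO storage implied by these widths and depths can be tallied directly (a sketch; the parameter values in the example call are illustrative, not taken from the paper):

```python
def fifo_storage_bits(k, C, w):
    """Two FIFOs per PE; in the d-th PE each FIFO is (d + w) bits wide
    with depth min(2^(d-1), C), summed over the k PEs of the generator."""
    return sum(2 * (d + w) * min(2 ** (d - 1), C) for d in range(1, k + 1))

# e.g., a k = 21 code with C = 16 candidates and w = 6 quantization bits
print(fifo_storage_bits(21, 16, 6))
```

The early stages are cheap because their depth is capped by $2^{d-1}$ rather than $C$, so the storage grows roughly linearly in $k$ once $2^{d-1} \ge C$.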
6. Conclusions
A soft decision decoding algorithm based on a cyclic information set is proposed, and its efficient hardware implementation architecture is also presented. The algorithm employs the correlation discrepancy as the decoding metric and adopts an iterative computation method. First, the cyclically generated equivalent information sequences with the minimum correlation discrepancy to the received sequence of channel observation values are selected as the candidate information sequences. Then, the list of candidate codewords is generated using the very concise cyclic encoder. Finally, the optimal decoding result is obtained by selecting the optimal candidate codeword using the correlation discrepancy metric. The method reduces the search range of the decoding results by establishing the list of candidate codewords in a cyclic and iterative manner, and its complexity is lower than that of the ML algorithm. Furthermore, in order to verify the performance of the decoding algorithm, the implementation architecture of the algorithm is completed on the FPGA platform. The BER performances of the (15, 5) BCH code, (31, 21) BCH code, (63, 30) BCH code, and (23, 12) Golay code are evaluated on this platform. The experimental results show that the performance of the proposed decoding algorithm is close to that of the ML algorithm, and the decoder achieves a good trade-off between decoding complexity and error correction performance. The effect of the number of information sequences and the number of candidate codewords on the decoding performance is also evaluated on the FPGA platform. The complexity of the decoder can be reduced by optimizing the number of information sequences and candidate codewords.
In this paper, we only evaluated the BER performance of the cyclic codes over the AWGN channel. In the future, considering that short cyclic codes have been widely employed in 5G communications, it would be valuable to evaluate the proposed decoding algorithm over wireless fading channels, combined with advanced modulations or waveforms. In addition, a serial hardware implementation architecture has been used to verify the BER performance of the proposed algorithm; efficient parallel architectures will be developed in future work to meet the higher throughput requirements of 5G, enabling practical applications.