1. Introduction
The source-channel coding theorem [1] states that a source can be reliably transmitted over a channel as long as its entropy is less than the channel capacity, assuming that latency, complexity, and block length are unconstrained. This theorem suggests that source and channel coding can be designed separately without loss of optimality. However, in practical implementations, separate source and channel coding (SSCC) is suboptimal due to residual redundancy and finite block lengths. As a potential remedy, joint source and channel coding (JSCC) has been proposed to provide improvements by exploiting residual redundancy and avoiding capacity loss; it can at most double the error exponent vis-a-vis tandem coding [2]. A theoretical study of finite-blocklength bounds on the best achievable lossy JSCC rate demonstrates that JSCC designs bring considerable performance advantages over SSCC in the nonasymptotic regime [3].
One of the main JSCC methods is the joint decoding of a given source compression format (e.g., JPEG) with a channel decoder. For example, rate-compatible low-density parity-check (RC-LDPC) codes can provide unequal error protection for JPEG2000-compressed images [4]. If JPEG2000 performs source coding with certain error-resilience (ER) modes, the source decoder can use the ER modes to identify corrupt sections of the codestream for the channel decoder [5]. Iterative decoding of variable-length codes (VLCs) in tandem with a channel code can provide remarkable error correction performance [6]. Low-density parity-check (LDPC) codes can also be combined with Huffman-coded sources to exploit the a priori bit-source probability [7].
Another type of JSCC approach is to assume a Markov model of the source and decode it jointly with the factor graph of the channel code. Joint source-channel Turbo coding for binary Markov sources was investigated in [8]. The drawback of such methods is that standard VLCs are not suitable for forming a single graph model with channel codes (e.g., LDPC codes), as they require long sequences to approach the entropy of the source. Therefore, a double LDPC (D-LDPC) code for JSCC has been proposed; one LDPC code is used to compress the source first, followed by another LDPC code to protect the compressed source against noise [9]. D-LDPC codes can be represented as a single bipartite graph, allowing the belief propagation (BP) algorithm to decode them on the receiver side. Double protograph LDPC (DP-LDPC) codes are a variant of D-LDPC codes that introduce a P-LDPC code to replace the conventional LDPC code [10]. Optimizing the base matrix of DP-LDPC codes can effectively improve the performance of the DP-LDPC JSCC system [11,12,13].
Polar codes, proposed by Arıkan, provably achieve the capacity of any symmetric binary-input discrete memoryless channel (B-DMC) with efficient encoding and decoding algorithms [14]. The successive cancellation (SC) algorithm and the belief propagation (BP) algorithm [15] are two commonly used decoding approaches. With the aid of list decoding, polar codes can outperform LDPC codes at short code lengths [16,17,18,19]. Arıkan presented source polarization as the complement of channel polarization [20], which is the theoretical basis for applying polar codes to source compression. Polar codes can achieve the rate-distortion bound for a symmetric binary source in lossy source coding [21]. Moreover, polar codes can asymptotically achieve the optimal compression rate in lossless source coding [22]. Meanwhile, polar codes are good candidates for exploiting source redundancy in JSCC systems [23]. JSCC systems using polar codes can significantly improve decoding performance for language-based sources and distributed sources [24,25]. A quasi-uniform systematic polar code (qu-SPC) has been constructed for the JSCC system with side information by introducing an additional bit-swap coding step that modifies the original polar coding [26,27]. Our previous work proposed a JSCC scheme with double polar codes that combines source polarization and channel polarization, showing performance improvements at short code lengths [28].
Although polar codes have been studied in the fields of channel coding and source coding, there is a lack of systematic research on a JSCC framework based on polar codes. In our previous work [28], we introduced the basic double polar code structure cascading a source polar code and a channel SPC. However, double polar codes have a high error floor at short blocklengths, and performance analysis tools for biased sources are lacking.
In this paper, we establish a systematic framework for JSCC with double polar codes and provide a complete theoretical analysis.
The contributions of this paper are summarized as follows:
Double Polar Coding Framework: Guided by source polarization and channel polarization, we propose a JSCC framework based on polar codes, composed of source polar coding and channel polar coding. First, a group of source bits is transformed into a series of polarized bit sources by source polarization, where the polarized sources with higher entropy, namely, high-entropy bits, are regarded as the compressed source, while the other parts are treated as redundancy. After the source is polarized, a source check decoding is performed to construct an error set containing the indices of all erroneous decisions that occur in source decoding. Second, another polar code (PC) or SPC is used to protect the compressed source bits and the bits in the error set against noise. Moreover, a mapping is inserted between the source polar code and the PC, where the mapping design affects the performance of the JSCC system. Depending on the channel code, the proposed JSCC scheme is called double polar code (D-PC) or systematic D-PC (SD-PC), respectively. The proposed JSCC framework facilitates the exploitation of residual redundancy in the source code on the decoder side.
Joint Belief Propagation Decoding: To represent the D-PC and SD-PC, we introduce two factor graphs: the joint factor graph (J-FG) and the systematic joint factor graph (SJ-FG). Correspondingly, the proposed joint source and channel decoder can be divided into two types: the J-FG-based joint belief propagation (J-BP) decoder and the SJ-FG-based systematic joint belief propagation (SJ-BP) decoder. On the one hand, for the J-FG, the mapping within the D-PC determines the connection between the factor graph of the source code and that of the channel code. The J-BP decoder is iteratively applied over the J-FG, where the source decoder and the channel decoder exchange soft information through the mapping. On the other hand, for the SJ-FG, because the systematic bits directly carry the high-entropy bits, the factor graph of the source code and that of the channel code are combined into a single factor graph. The SD-PC can thus be decoded on a single factor graph using the SJ-BP decoder.
B-EXIT Evaluation: For the proposed JSCC system, we present a biased extrinsic information transfer (B-EXIT) convergence analysis for the J-BP decoder and the SJ-BP decoder. The EXIT method is extended to account for the non-uniform source. We design a method to calculate the distribution of each high-entropy bit, which permits the B-EXIT algorithm to track the mutual information transfer for each high-entropy bit. This approach reveals the superiority of the proposed JSCC system from a theoretical perspective.
The rest of this paper is organized as follows. Section 2 briefly introduces the background of polar codes and BP decoding. Section 3 presents the proposed JSCC framework. Section 4 describes the J-BP decoder for the D-PC, and Section 5 provides the SJ-BP decoder for the SD-PC. Section 6 introduces the B-EXIT chart. Section 7 shows the performance evaluation results with B-EXIT analysis and simulations. Finally, Section 8 concludes the paper.
2. Preliminaries
2.1. Notational Conventions
In this paper, we use calligraphic characters, such as $\mathcal{X}$, to denote sets. For any finite set of integers $\mathcal{A}$, $|\mathcal{A}|$ denotes its cardinality. We denote random variables (RVs) by upper case letters, such as $X$, and their realizations by the corresponding lower case letters, such as $x$. For an RV $X$, $P_X$ denotes the probability assignment on $X$. We use the notation $x_1^N$ as shorthand for denoting a row vector $(x_1, \ldots, x_N)$. We use bold letters (e.g., $\mathbf{G}$) to denote matrices. Given $x_1^N$ and a set $\mathcal{A} \subseteq \{1, \ldots, N\}$, we write $x_{\mathcal{A}}$ to denote the subvector $(x_i : i \in \mathcal{A})$.
2.2. Polar Codes
A polar code can be identified by a parameter vector $(N, K, \mathcal{A}, u_{\mathcal{A}^c})$, where $N = 2^n$ is the code length, $K$ is the code dimension and specifies the size of the information set $\mathcal{A}$, and $u_{\mathcal{A}^c}$ is the frozen bits. The frozen bits are known to both the transmitter and the receiver, and are usually set to all zeros. Let $x_1^N$ denote the codeword of a polar code; then, the encoding process can be expressed as $x_1^N = u_1^N \mathbf{G}_N$, where $u_1^N$ consists of the information bits $u_{\mathcal{A}}$ and the frozen bits $u_{\mathcal{A}^c}$, while $\mathbf{G}_N = \mathbf{F}^{\otimes n}$ is called the generation matrix of size $N$ and is the $n$-th Kronecker power of the polarizing kernel $\mathbf{F} = \bigl[\begin{smallmatrix} 1 & 0 \\ 1 & 1 \end{smallmatrix}\bigr]$. Then, the codeword is transmitted over an AWGN channel. The modulation is binary phase shift keying (BPSK). The signals received at the destination are given by $y_i = (1 - 2x_i) + z_i$, where $z_1^N$ is an i.i.d. Gaussian noise sequence and each noise sample has zero mean and variance $\sigma^2$. Furthermore, the LLR of $y_i$ can be written as $L(y_i) = 2y_i/\sigma^2$.
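To make the encoding and channel model concrete, the following minimal Python sketch implements the butterfly form of $x_1^N = u_1^N \mathbf{G}_N$ and the BPSK/AWGN transmission with its channel LLRs. The information set in the usage lines is an arbitrary placeholder, not a construction from this paper.

```python
import numpy as np

def polar_transform(u):
    """Multiply by G_N = F^{(x)n} over GF(2) with the O(N log N) butterfly
    recursion instead of an explicit matrix product."""
    x = u.copy()
    N, step = len(x), 1
    while step < N:
        for i in range(0, N, 2 * step):
            x[i:i + step] ^= x[i + step:i + 2 * step]  # (u1, u2) -> (u1^u2, u2)
        step *= 2
    return x

def bpsk_awgn(x, sigma, rng):
    """BPSK-modulate, add N(0, sigma^2) noise, return samples and LLRs 2y/sigma^2."""
    y = (1.0 - 2.0 * x) + rng.normal(0.0, sigma, size=len(x))
    return y, 2.0 * y / sigma ** 2

# Usage: N = 8, a placeholder information set, frozen bits fixed to zero.
rng = np.random.default_rng(0)
u = np.zeros(8, dtype=np.uint8)
u[[3, 5, 6, 7]] = rng.integers(0, 2, size=4, dtype=np.uint8)
y, llr = bpsk_awgn(polar_transform(u), sigma=0.8, rng=rng)
```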
2.3. BP Decoding
An $(N, K)$ polar code can be represented by an $n$-stage factor graph which contains $N(n+1)$ nodes. Each node, indexed $(i, j)$, is associated with left-to-right and right-to-left likelihood messages. Let $R_{i,j}^{(t)}$ and $L_{i,j}^{(t)}$ denote the logarithmic likelihood ratio (LLR)-based left-to-right message and right-to-left message of node $(i, j)$ in the $t$-th iteration. The BP decoder iteratively updates and propagates these soft messages between neighboring nodes over the factor graph. Before decoding starts, $R_{i,1}^{(1)}$ is initialized to 0 for $i \in \mathcal{A}$; otherwise, $R_{i,1}^{(1)} = +\infty$, and $L_{i,n+1}^{(1)}$ is initialized to the channel LLR $L(y_i)$. The soft messages of the remaining nodes are initialized to 0. Then, the BP decoder updates the soft message of each node over the whole factor graph. The BP decoder terminates when the number of iterations reaches the preset maximum. After the iterations finish, the estimation of the information bits can be obtained by $\hat{u}_i = h(L_{i,1})$, where the hard-decision function $h(x) = 1$ when $x < 0$, and is otherwise 0.
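The message schedule is built from identical $2 \times 2$ processing elements. As a hedged sketch (using the common min-sum approximation of the exact boxplus rule, rather than any update rule stated in this paper), each butterfly with left ports $(u_1, u_2)$ and right ports $(v_1, v_2) = (u_1 \oplus u_2, u_2)$ updates its four outgoing messages as follows:

```python
import numpy as np

def boxplus(a, b):
    """Min-sum approximation of the boxplus combination of two LLRs."""
    return np.sign(a) * np.sign(b) * np.minimum(np.abs(a), np.abs(b))

def butterfly_update(Lv1, Lv2, Ru1, Ru2):
    """One processing element of the polar factor graph.
    Inputs: right-to-left messages (Lv1, Lv2) and left-to-right (Ru1, Ru2).
    Outputs: updated (Lu1, Lu2) toward the left and (Rv1, Rv2) toward the right."""
    Lu1 = boxplus(Lv1, Lv2 + Ru2)
    Lu2 = boxplus(Ru1, Lv1) + Lv2
    Rv1 = boxplus(Ru1, Lv2 + Ru2)
    Rv2 = boxplus(Ru1, Lv1) + Ru2
    return Lu1, Lu2, Rv1, Rv2
```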
3. Joint Source and Channel Coding
In this section, we present details of the proposed JSCC framework based on the polar code. First, we provide a brief description of the proposed JSCC framework. Then, the coding process of the D-PC scheme and the SD-PC scheme is elaborated.
3.1. Joint Source and Channel Coding Framework
As shown in Figure 1, the proposed JSCC framework mainly comprises two phases: the source polar coding and the channel polar coding. The source polar coding phase involves the source polarization and the check decoding. The source vector $s_1^M$ is first transformed into a polarized source vector $u_1^M$ by source polarization. Then, the polarized source vector $u_1^M$, which is composed of the high-entropy vector $u_{\mathcal{H}}$ and the low-entropy vector $u_{\mathcal{H}^c}$, is fed into a source polar decoder for check decoding. The check decoding generates an error set $\mathcal{E}$ by collecting the indices of the source decoding errors produced during source decoding. This error set $\mathcal{E}$ is converted into a binary sequence $e$ and appended to the high-entropy vector $u_{\mathcal{H}}$ to form the vector $d = (u_{\mathcal{H}}, e)$.
In the channel polar coding phase, there are two alternative schemes. Let $\pi$ represent a mapping function. For the D-PC scheme, the vector $d$ is first mapped to the vector $v_{\mathcal{A}} = \pi(d)$, which is then protected against noise by a PC. For the SD-PC scheme, we directly adopt an SPC to protect the vector $d$. From an overall perspective, in the transmitter, the source vector $s_1^M$ is coded into the codeword $x_1^N$ and then transmitted to the receiver through a channel. In the receiver, the received signal $y_1^N$ is decoded by a joint source and channel decoder to obtain the source estimation $\hat{s}_1^M$.
3.2. Joint Source and Channel Coding for D-PC
The D-PC scheme in the proposed JSCC framework can be implemented by the three procedures described below.
3.2.1. Source Polar Coding
First, the source vector is compressed using a PC. This paper considers an independent and identically distributed (i.i.d.) binary Bernoulli source $S$ with $p = P(S = 1)$. The $M = 2^m$ source bits from $S$ are represented by the vector $s_1^M$. The source polarization of the source vector $s_1^M$ is performed as follows:
$$u_1^M = s_1^M \mathbf{G}_M,$$
where $\mathbf{G}_M = \mathbf{F}^{\otimes m}$ is the $m$-order Kronecker power of $\mathbf{F}$. Let $\mathcal{H}$ denote the high-entropy set with $|\mathcal{H}| = K_s$. The high-entropy vector is designated as $u_{\mathcal{H}}$. To avoid errors in recovering $u_1^M$ at the receiver, we perform a source check decoding to find all indices for which the decision of the corresponding compressed source bit would be incorrect. Let $\hat{u}_1^M$ denote the estimation of $u_1^M$. Then, a set of indices $\mathcal{E}$ can be defined as
$$\mathcal{E} = \{\, i : \hat{u}_i \neq u_i \,\}.$$
The final output of the source polar encoding is given by $d = (u_{\mathcal{H}}, e)$. Because the size of $\mathcal{E}$ is at most $M = 2^m$, each element in $\mathcal{E}$ can be represented by an $m$-bit binary sequence. Let $e$ denote the binary vector in which all elements of $\mathcal{E}$ are cascaded in binary form, and let $L_E$ denote the length of the binary vector $e$, i.e., $L_E = m|\mathcal{E}|$.
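The following sketch illustrates this step, reusing `polar_transform` from the Section 2.2 sketch. The `check_decoder` callback is hypothetical; it stands in for whatever SC or BP check decoding reconstructs $u_1^M$ from the high-entropy bits alone, and the MSB-first cascading of indices is one plausible layout rather than the paper's exact convention.

```python
import numpy as np

def encode_error_set(E, m):
    """Cascade the indices in E, each written as an m-bit binary word."""
    bits = [(idx >> k) & 1 for idx in sorted(E) for k in reversed(range(m))]
    return np.array(bits, dtype=np.uint8)

def source_polar_encode(s, H, m, check_decoder):
    """Polarize the source, keep the high-entropy bits u[H], and record the
    indices on which the (receiver-mimicking) check decoder errs."""
    u = polar_transform(s)           # u = s * G_M over GF(2)
    u_hat = check_decoder(u[H])      # hypothetical: reconstruct u from u[H] only
    E = [i for i in range(len(u)) if u_hat[i] != u[i]]
    e = encode_error_set(E, m)
    return np.concatenate([u[H], e]), E
```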
The construction of the set $\mathcal{E}$ has been introduced in [22] based on the SC decoding algorithm. Here, we consider the construction of $\mathcal{E}$ based on the BP decoding algorithm. In a BP decoder, the estimation $\hat{u}_1^M$ is updated in each iteration. Let $\hat{u}_i^{(t)}$ denote the estimation of $u_i$ in the $t$-th iteration and let $T$ denote a preset positive integer. If $\hat{u}_i^{(t)}$ satisfies the convergence condition
$$\hat{u}_i^{(t)} = \hat{u}_i^{(t+1)} = \cdots = \hat{u}_i^{(t+T)},$$
we can assume that the bit $\hat{u}_i$ has converged [29]. For each index $i$, if the converged bit $\hat{u}_i$ is not equal to $u_i$, then the index of $u_i$ belongs to $\mathcal{E}$. However, if $\hat{u}_i^{(t)}$ keeps changing without converging, i.e., it remains oscillating, the source BP decoder cannot obtain a converged $\hat{u}_i$. To tackle this issue, we propose an oscillation check criterion that takes the $\hat{u}_i^{(t)}$ corresponding to the maximum LLR amplitude recorded during the oscillation as the converged $\hat{u}_i$. Let $L_i^{(t)}$ denote the LLR corresponding to the estimation $\hat{u}_i^{(t)}$ and let $L_i^{\max}$ represent the $L_i^{(t)}$ with maximum amplitude $|L_i^{(t)}|$ during the oscillation. Moreover, the variable $c$ counts the number of oscillating iterations and $T_{\mathrm{osc}}$ is the preset maximum number of oscillations. When $c = T_{\mathrm{osc}}$, the converged $\hat{u}_i$ can be obtained by
$$\hat{u}_i = h\bigl(L_i^{\max}\bigr),$$
where $h(x) = 1$ when $x < 0$ and is otherwise $0$; the set $\mathcal{E}$ then includes the index $i$ if $\hat{u}_i \neq u_i$.
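A minimal sketch of this convergence/oscillation check for a single bit follows; the trajectory-based interface (`decisions`, `llrs`) is an illustrative assumption, since the paper tracks these quantities inside the BP decoder itself.

```python
def decide_bit(decisions, llrs, T, T_osc):
    """Convergence/oscillation check for one bit along a BP trajectory.
    decisions[t] and llrs[t] are the hard decision and its LLR at iteration t.
    Returns the converged decision, or None if more iterations are needed."""
    if len(decisions) > T and len(set(decisions[-(T + 1):])) == 1:
        return decisions[-1]  # unchanged over T consecutive iterations
    if len(decisions) >= T_osc:
        # Oscillating: take the decision whose LLR magnitude was largest.
        t_best = max(range(len(llrs)), key=lambda t: abs(llrs[t]))
        return decisions[t_best]
    return None
```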
3.2.2. Optimal Mapping
For the D-PC JSCC system, the polarized sub-channels with the highest capacities carry the vector $d$, where the high-entropy vector $u_{\mathcal{H}}$ consists of polarized source bits. Because the block length is finite, the polarization of both the source and the channel is insufficient. When one polarized source bit is transmitted over a polarized channel, if the entropy of the former exceeds the mutual information of the latter, the transmission is unreliable. Therefore, we need to optimize the mapping between the source and channel codes to improve system performance. From the perspective of mutual information maximization, we provide Theorem 1, which states a necessary condition for the optimal mapping.
Theorem 1. For a double polar code identified by its parameter vector, there exists an optimal mapping $\pi$ between the source polar code and the channel polar code that maximizes the mutual information $I(u_{\mathcal{H}}; \hat{u}_{\mathcal{H}})$. The optimal mapping must satisfy the following condition, referred to as (5), for all information indices $i$ and $j$: if $I(W_N^{(i)}) \geq I(W_N^{(j)})$, then the entropy of the polarized source bit carried by sub-channel $i$ is no less than that of the polarized source bit carried by sub-channel $j$.

Proof of Theorem 1. Consider a mapping $\pi$. In the coding process, the high-entropy bits $u_{\mathcal{H}}$ are carried by the information bits $v_{\mathcal{A}}$; correspondingly, the estimation $\hat{u}_{\mathcal{H}}$ is obtained from $\hat{v}_{\mathcal{A}}$ in the decoding process, and the mutual information $I(u_{\mathcal{H}}; \hat{u}_{\mathcal{H}})$ can be expanded accordingly. The information bit $v_i$ is transmitted over the polarized channel $W_N^{(i)}$, so the mutual information $I(v_i; \hat{v}_i)$ cannot exceed the channel capacity $I(W_N^{(i)})$. The vector $v_{\mathcal{A}}$ is transmitted over the polarized channels $\{W_N^{(i)} : i \in \mathcal{A}\}$. Let $a_j$ denote the $j$-th element of the set $\mathcal{A}$. Based on the chain rule for entropy, $I(u_{\mathcal{H}}; \hat{u}_{\mathcal{H}})$ decomposes into per-index terms. If the index $i$ corresponds to the $j$-th element $a_j$ of the set $\mathcal{A}$, the $j$-th term can be rewritten in terms of the sub-channel $W_N^{(a_j)}$. Bounding each term by the corresponding channel capacity yields the inequality (11). The compressed bit assigned to $v_{a_j}$ is determined by the mapping; accordingly, (11) can be rewritten as (12). From (12), $I(u_{\mathcal{H}}; \hat{u}_{\mathcal{H}})$ reaches its maximum when (5) holds. □
Based on Theorem 1, we propose Algorithm 1 to construct an optimal mapping. Let $q_i$ and $s_i$ denote the $i$th elements of the sequences $\mathcal{Q}$ and $\mathcal{S}$, respectively. We use the sequence $\mathcal{Q}$ to denote the information indices sorted in descending order of sub-channel reliability. Moreover, the sequence $\mathcal{S}$ is used to denote the indices of the polarized source bits sorted in descending order of entropy. Let a sequence $\Pi$ denote the mapping $\pi$, where $\Pi_{s_i} = q_i$ indicates that the polarized source bit $u_{s_i}$ is mapped to $v_{q_i}$, i.e., $u_{s_i}$ is transmitted by the sub-channel $W_N^{(q_i)}$. In Algorithm 1, the most reliable polarized sub-channel transmits the polarized source bit with the highest entropy, while the least reliable polarized sub-channel transmits the polarized source bit with the lowest entropy.
Algorithm 1: Construct the mapping $\pi$.
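A minimal sketch of this sort-and-pair construction is given below, assuming per-sub-channel reliabilities and per-bit entropies are already available (the paper's Algorithm 1 pseudocode is not reproduced here):

```python
import numpy as np

def construct_mapping(reliability, entropy):
    """Pair the most reliable sub-channels with the highest-entropy bits.
    reliability[i]: reliability of the i-th usable sub-channel.
    entropy[i]: entropy of the i-th polarized source bit.
    Returns pi with pi[i] = index of the sub-channel carrying bit i."""
    q = np.argsort(-np.asarray(reliability))  # most reliable first
    s = np.argsort(-np.asarray(entropy))      # highest entropy first
    pi = np.empty(len(q), dtype=int)
    pi[s] = q                                 # bit s[k] -> sub-channel q[k]
    return pi
```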
3.2.3. Channel Polar Coding
Finally, we protect the compressed source bits and the bits in the error set with another PC. Let $\mathcal{A}$ denote the information set of the channel PC with $|\mathcal{A}| = K = K_s + L_E$. The encoding of the PC has been described in Section 2.2. We can write the encoding process as
$$x_1^N = v_1^N \mathbf{G}_N. \quad (13)$$
Because $v_{\mathcal{A}^c} = 0$, (13) can be rewritten as
$$x_1^N = v_{\mathcal{A}} \mathbf{G}_N(\mathcal{A}), \quad (14)$$
where $\mathbf{G}_N(\mathcal{A})$ denotes the submatrix of $\mathbf{G}_N$ formed by the rows with indices in $\mathcal{A}$.
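As an illustration of (14), the row-submatrix form can be checked numerically with the `polar_transform` sketch from Section 2.2. Building $\mathbf{G}_N$ by transforming unit vectors is for clarity only; the values of `N`, `A`, and `v_A` are placeholders.

```python
import numpy as np

# Assumed small example: N = 8, information positions A, frozen bits zero.
N = 8
A = np.array([3, 5, 6, 7])                     # placeholder information set
G = np.array([polar_transform(np.eye(N, dtype=np.uint8)[i]) for i in range(N)])
v_A = np.array([1, 0, 1, 1], dtype=np.uint8)   # mapped vector d placed on A

x = (v_A @ G[A]) % 2                           # eq. (14): x = v_A * G_N(A)
```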
3.3. Joint Source and Channel Coding for SD-PC
The difference between the SD-PC and the D-PC is that the SD-PC employs an SPC to protect the vector $d$. We can split the codeword into two parts by writing $x_1^N = (x_{\mathcal{B}}, x_{\mathcal{B}^c})$, where $N$ is the codeword length and the systematic bits $x_{\mathcal{B}}$ are assigned to the vector $d$. For the SPC, the parity bits $x_{\mathcal{B}^c}$ are given by
$$x_{\mathcal{B}^c} = u_{\mathcal{A}} \mathbf{G}_{\mathcal{A}\mathcal{B}^c},$$
where $u_{\mathcal{A}^c}$ is an all-zero subvector and $\mathbf{G}_{\mathcal{A}\mathcal{B}}$ denotes the submatrix of $\mathbf{G}_N$ consisting of the array of elements $G_{i,j}$ with $i \in \mathcal{A}$ and $j \in \mathcal{B}$; the submatrix $\mathbf{G}_{\mathcal{A}\mathcal{B}^c}$ can be similarly defined. Then, $u_{\mathcal{A}}$ is calculated as follows:
$$u_{\mathcal{A}} = x_{\mathcal{B}} \left( \mathbf{G}_{\mathcal{A}\mathcal{B}} \right)^{-1}.$$
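Below is a hedged sketch of systematic encoding that replaces the explicit matrix inverse with the well-known two-transform shortcut; it assumes $\mathcal{B} = \mathcal{A}$ and a domination-contiguous information set (which standard polar constructions satisfy), and reuses `polar_transform` from the Section 2.2 sketch.

```python
import numpy as np

def systematic_polar_encode(d, A_mask):
    """Systematic polar encoding via two polar transforms.
    A_mask: boolean array of length N marking the information set A.
    Returns a codeword x with x[A_mask] == d (frozen subvector all zero)."""
    v = np.zeros(len(A_mask), dtype=np.uint8)
    v[A_mask] = d
    t = polar_transform(v)
    t[~A_mask] = 0          # re-impose the all-zero frozen positions
    return polar_transform(t)
```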
In summary, the source polar coding and the channel polar coding form a JSCC framework. For the D-PC JSCC system, the optimal mapping $\pi$ maximizes the mutual information $I(u_{\mathcal{H}}; \hat{u}_{\mathcal{H}})$ between the compressed source bits and the corresponding estimation.
6. B-EXIT Convergence Analysis
The EXIT chart [31] is an efficient tool for analyzing the convergence of iterative decoders. It provides an excellent visual representation of the iterative decoder by tracking the mutual information (MI) at each iteration. In this section, we perform a biased-EXIT (B-EXIT) convergence analysis for the proposed JSCC system with a binary Bernoulli source. To evaluate the convergence performance of the J-BP decoder and the SJ-BP decoder under a binary Bernoulli source, we track the average MI (AMI) transfer process between the SFG and the CFG, thereby providing a visual representation of the iterative decoding process. The EXIT method is named B-EXIT due to the biased source; it is shown in Algorithm 4.
Algorithm 4: B-EXIT analysis.
For a uniform source, it can be assumed without loss of generality that the all-zero codeword is sent; this assumption is not reasonable for a biased source. Therefore, as shown in lines 2–3 of Algorithm 4, we actually generate a codeword for the D-PC or SD-PC with a biased source and record the compressed bits $u_{\mathcal{H}}$. Then, the receiver obtains the signal $y_1^N$ by sending the codeword over the binary phase shift keying (BPSK) modulated additive white Gaussian noise (AWGN) channel. Based on the received signal $y_1^N$ and the source distribution $P_S$, the joint source-channel decoder, i.e., the J-BP decoder or the SJ-BP decoder, can be initialized.
For the $t$-th iteration of the J-BP decoder, let $I_A^{(t)}$ denote the a priori AMI between the input LLRs and the compressed bits. Similarly, $I_E^{(t)}$ is used to denote the extrinsic AMI between the output LLRs and the compressed bits. For the J-BP decoder and the SJ-BP decoder, the input LLRs refer to the input LLRs of the CFG (SFG), and the output LLRs refer to the output LLRs of the CFG (SFG). To track the decoding trajectory, we collect all the input and output LLRs at each iteration by using a number of simulated code blocks (Line 6, Algorithm 4). For each compressed bit index, the collected LLRs are separated into two groups according to whether the transmitted bit is 0 or 1. In line 7 of Algorithm 4, the PDFs of the two groups are then estimated with the histogram method.
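A sketch of the per-bit histogram MI estimator used conceptually in lines 6–7 is shown below; the bin count and the function interface are illustrative choices, not the paper's.

```python
import numpy as np

def mutual_info_histogram(llrs0, llrs1, p1, nbins=64):
    """Estimate I(U; L) for one biased bit from LLR samples collected
    separately for U = 0 (llrs0) and U = 1 (llrs1); p1 = P(U = 1)."""
    edges = np.linspace(min(llrs0.min(), llrs1.min()),
                        max(llrs0.max(), llrs1.max()), nbins + 1)
    f0, _ = np.histogram(llrs0, bins=edges, density=True)
    f1, _ = np.histogram(llrs1, bins=edges, density=True)
    w, p0 = np.diff(edges), 1.0 - p1
    f = p0 * f0 + p1 * f1                      # marginal density of L
    mi = 0.0
    for pu, fu in ((p0, f0), (p1, f1)):        # I = sum_u p(u) * KL(f(l|u) || f(l))
        ok = (fu > 0) & (f > 0)
        mi += pu * np.sum(w[ok] * fu[ok] * np.log2(fu[ok] / f[ok]))
    return mi
```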
After the probability density functions (PDFs) of these LLRs have been estimated with the histogram method, we can calculate the AMI. The compressed bits $u_{\mathcal{H}}$ are not uniform because the source $S$ is biased. Therefore, we first need to calculate the mutual information between each compressed bit and its corresponding LLR independently. Let $I(U_i; L_i)$ denote the mutual information between the compressed bit $u_i$ and its LLR $L_i$; we can obtain $I(U_i; L_i)$ by the histogram method, which requires the probability distribution of the compressed bit $u_i$. The probability distribution of each bit in $u_{\mathcal{H}}$ is different. The compressed bit $u_i$ is obtained by multiplying the source sequence $s_1^M$ by the $i$th column of the generation matrix. Let the set $\mathcal{W}_i$ denote the indices of the ones in the $i$th column of the generation matrix and let $w_i = |\mathcal{W}_i|$ denote the Hamming weight of that column. Then, we have
$$u_i = \bigoplus_{j \in \mathcal{W}_i} s_j$$
for $i \in \mathcal{H}$. Here, $P(U_i = 1)$ is the probability that the number of ones in the subsequence $s_{\mathcal{W}_i}$ is odd. Thus, we can obtain the distribution of the compressed bits as follows [32]:
$$P(U_i = 1) = \frac{1 - (1 - 2p)^{w_i}}{2},$$
where $p$ is $P(S = 1)$ of the Bernoulli source $S$. After obtaining the probability distribution of the compressed bits, we can calculate $I(U_i; L_i)$. Then, the AMI is given by
$$I = \frac{1}{|\mathcal{H}|} \sum_{i \in \mathcal{H}} I(U_i; L_i).$$
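A short sketch computes these biases for all polarized bits at once. It combines the formula above with the fact that column $i$ of $\mathbf{F}^{\otimes m}$ has Hamming weight $2^{m - \mathrm{popcount}(i)}$ under zero-based indexing.

```python
def compressed_bit_bias(p, m):
    """P(U_i = 1) for every polarized bit of an i.i.d. Bernoulli(p) source.
    Column i of G_M = F^{(x)m} has Hamming weight 2^(m - popcount(i)), so
    U_i is the XOR of that many source bits (piling-up formula)."""
    weights = [1 << (m - bin(i).count("1")) for i in range(1 << m)]
    return [(1.0 - (1.0 - 2.0 * p) ** w) / 2.0 for w in weights]

# Example: p = 0.1, M = 8. Bit 7 (weight-1 column) keeps P(U=1) = 0.1,
# while bit 0 (weight-8 column) is nearly uniform.
print(compressed_bit_bias(0.1, 3))
```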
The computed AMI values can be plotted as a zigzag path, which reflects the decoding trajectory. On the EXIT plane, the AMI pairs can be written as coordinate points of the form
$$\bigl(I_A^{(t)}, I_E^{(t)}\bigr),$$
where $t = 1, 2, \ldots$ is the iteration index. By connecting the vertices at the two ends of the zigzag path separately, we can obtain two curves, corresponding to the two constituent decoders. We then connect the adjacent coordinate points in order of increasing $t$, which yields the extrinsic AMI transfer trajectory. From this trajectory, the convergence speed of the proposed joint source-channel decoder with finite coding length and a limited number of iterations can be observed.