Article

Design and Analysis of Systematic Batched Network Codes

1 School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, China
2 Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, China
3 Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117597, Singapore
* Author to whom correspondence should be addressed.
Entropy 2023, 25(7), 1055; https://doi.org/10.3390/e25071055
Submission received: 31 May 2023 / Revised: 25 June 2023 / Accepted: 28 June 2023 / Published: 13 July 2023
(This article belongs to the Special Issue Information Theory and Network Coding II)

Abstract:
Systematic codes are of important practical interest for communications. Network coding, however, seems to conflict with systematic codes: although the source node can transmit message packets, network coding at the intermediate network nodes may significantly reduce the number of message packets received by the destination node. Is it possible to obtain the benefits of network coding while preserving some properties of systematic codes? In this paper, we study the systematic design of batched network coding, which is a general network coding framework that includes random linear network coding as a special case. A batched network code has an outer code and an inner code, where the latter is formed by linear network coding. A systematic batched network code must take both the outer code and the inner code into consideration. Based on the outer code of a BATS code, which is a matrix generalization of a fountain code, we propose a general systematic outer code construction that achieves a low encoding/decoding computation cost. To further reduce the number of random trials required to find a code with close-to-optimal coding overhead, a triangular embedding approach is proposed for the construction of the systematic batches. We introduce new inner codes that protect the systematic batches during transmission and show that it is possible to significantly increase the expected number of message packets in a received batch at the destination node without harming the expected rank of the batch transfer matrix generated by network coding.

1. Introduction

Network coding has significant advantages over traditional store-and-forward in network communications [1,2,3]. Random linear network coding (RLNC) provides a decentralized approach to network coding and achieves the multicast capacity of networks with packet loss in a broad setting [4,5,6,7,8,9,10]. In the past twenty years, extensive studies have been performed towards resolving the implementation issues of RLNC, such as the computational complexity and the coefficient overhead [11,12,13,14]. Batched network coding extends RLNC by introducing an inner code–outer code structure [15,16,17,18,19,20,21]. In particular, the outer code of a batched network code encodes the message packets into a sequence of batches, each of which is a set of coded packets, and the inner code is formed by linear network coding applied to the coded packets belonging to the same batch. The designs of the outer code and the inner code can be separated, where the outer code achieves end-to-end reliability and the inner code maximizes the network efficiency [22]. The number of packets in a batch (called the batch size) affects the coefficient overhead and the computational complexity. To obtain the benefits of network coding while constraining the overhead/complexity, the batch size is usually a small integer larger than 1, e.g., 8 or 16 [23]. Batched network coding allows joint batch encoding/decoding, while the original RLNC schemes can be regarded as special batched network codes where the outer code has a single batch or multiple batches encoded/decoded separately.
In coding theory, a code is said to be systematic if all message symbols form a subset of the coded symbols [24]. Many practical codes can be designed to be systematic, for example, Reed–Solomon codes [25], fountain codes [26], and polar codes [27]. The standardized LDPC codes in both IEEE 802.11 and 5G NR are systematic. For network communications, the retransmission-based end-to-end reliability scheme can be regarded as a systematic code. The benefits of systematic codes are also attractive for practical applications of batched network codes, especially for latency-sensitive applications [28,29,30].
Unlike systematic channel coding, systematic batched network coding needs to take both the outer code and the inner code into consideration. Though not optimal in general, an overlapping outer code with batches formed by subsets of the message packets, as proposed in [15,16,17,20], is already systematic in the sense that the union of some batches includes all the message packets. However, even with a systematic outer code, the benefits of systematic codes cannot be obtained under network coding: random linear coding at the intermediate nodes prevents the reception of the message packets at the destination nodes. The problem cannot be solved by simply excluding the message packets from network coding, which would reduce the benefits of network coding. Most existing works on systematic RLNC focus on encoding and decoding at the source node and destination nodes, respectively, without considering network coding at the intermediate nodes [31,32,33,34,35,36].
In this paper, we study systematic batched network codes that have a systematic outer code and an inner code that can preserve the benefits of the systematic outer code. Our contributions are summarized as follows.

1.1. Contributions Regarding Systematic Outer Codes

The outer code of a batched network code can be designed by extending fountain codes or LDPC codes [19,21], which achieve higher rates than the overlapping outer codes for the same inner code. The existing outer codes obtained by coding are not designed to be systematic. In principle, any linear code can be made systematic by transforming its generator matrix into the reduced echelon form. As the existing outer codes obtained by coding are linear, they can also be made systematic. The main issue, however, is how to preserve the low encoding/decoding computation cost: a general transformation of the generator matrix by Gaussian elimination affects the structure of the code and hence may increase the computation cost.
In this paper, we design a systematic outer code based on the BATched Sparse (BATS) outer code, which is a matrix generalization of a fountain code [19]. When the batch size is 1, the BATS outer code becomes a fountain code. The BATS outer code preserves the rateless property of fountain codes, i.e., the number of batches that can be generated is unlimited, and it can achieve a nearly optimal outer code rate with low encoding/decoding complexity. To preserve these salient features of the BATS outer code, the systematic outer code is also expected to be rateless, where the first $n_s$ batches (called systematic batches) form a partition of the message packets. In addition to the systematic batches, the outer code can further generate more batches, called non-systematic batches. Fountain codes have a low-complexity systematic design that benefits from their universal degree distribution [37]. When the batch size is larger than 1, the degree distribution of the BATS outer code depends on the rank distribution of the batch transfer matrices and hence is not universal. For this reason, the systematic design of BATS codes has to consider some new issues that do not appear in the systematic fountain code design.
In this paper, we generalize the fountain code approach to design a systematic outer code, which uses a (non-systematic) BATS outer code that satisfies a consistency requirement. In particular, a consistent outer code generates the first $n_s$ batches deterministically, and these batches can recover all the message packets. To ensure a small coding overhead, $n_s$ is expected to be as small as possible. For a fountain code, the minimum value of $n_s$ equals the number of message packets, and a consistent code with the minimum value of $n_s$ can be found using a number of trials of the random encoding procedure of the fountain code. As fountain codes are universal, a consistent code can be designed once for each number of message packets and used forever. However, BATS outer codes are not universal, and, even for the same number of message packets, the consistent code is different for different rank distributions. Our experiments show that when the number of message packets is larger, many more random trials are required to find a consistent outer code with a small coding overhead.
To design a systematic outer code with a small value of $n_s$ more efficiently, we propose a structured encoding approach for the first $n_s$ batches, called triangular embedding. Using triangular embedding, zero-coding-overhead outer codes can be designed with one or two random trials for a large range of the number of message packets. Triangular embedding increases the computation cost of neither encoding nor decoding. Moreover, we verify in experiments that the batches generated by triangular embedding can be used together with the batches generated by the BATS outer code, and we demonstrate superior decoding performance compared with the BATS outer code.
We also analyze the encoding and decoding computation costs of the proposed systematic outer code. For encoding, the systematic outer code has a lower computation cost than the corresponding BATS outer code. The decoding computation cost of the systematic outer code depends on the number of message packets received at the destination node. When all the message packets are received, no computation is required for decoding. When some of the message packets are not received, the decoding computation cost of the systematic outer code increases with the number of message packets that are not received and is at most twice the computation cost of BATS outer code decoding.

1.2. Contributions Regarding Inner Codes

We further study the inner code that can protect the message packets in the systematic batches. For line networks, systematic inner coding has been discussed for batched network coding [23], where an intermediate node transmits both the received packets and the recoded packets generated by linear combinations of the received packets. For a line network without packet loss, the destination node can receive all the message packets generated by the systematic outer code when systematic recoding is used. However, if the packet loss rate of each communication link is bounded below by a positive number, the number of message packets that can be received by the destination node decreases exponentially as the network length increases. For systematic RLNC, a decode–recode network coding approach has been proposed to protect the message packets [38], where an intermediate node first tries to decode the message packets and then transmits the decoded message packets together with some recoded packets. Systematic RLNC is a special systematic batched network code with only the systematic batches, and the decode–recode approach is mainly discussed for extended window recoding.
In this paper, we extend and refine the decode–recode approach for the inner code of batched network coding. For a general batched network code, the received packets of a batch at an intermediate node do not necessarily suffice to decode all the original packets of the batch. In other words, the batch transfer matrix formed by the coefficient vectors of all the received packets of a batch at an intermediate node may have a rank lower than the batch size. We instead study how to uniquely decode some of the message packets at an intermediate node. We say that a message packet in a systematic batch is recoverable at an intermediate node if it can be uniquely solved from the received packets of the batch at that node. We give a necessary and sufficient condition for a message packet in a batch to be recoverable, and we show that all the recoverable message packets in a batch can be found using Gauss–Jordan elimination. We also analyze the recovery of the message packets at the next hop subject to packet loss and side information. Our analysis shows that generating all recoded packets using random linear coding is not preferable, and that knowing more information about recoding than the coefficient vectors does not aid the recovery of message packets.
Based on our analysis, we improve systematic inner coding to protect the message packets in a batch, where the level of protection can be tuned by a parameter. Our inner codes can achieve the same network coding gain as the existing inner codes while significantly increasing the number of received message packets. By tuning the parameter, the number of received message packets can be further increased at the cost of a lower coding rate. Both the recovery of the message packets and the message protection recoding are linear operations on a batch, and hence our inner code does not increase the coefficient overhead for decoding at the destination node.

1.3. Paper Organization

The remainder of this paper is organized as follows. Section 2 is a self-contained introduction to batched network coding with the BATS outer code. In Section 3, we propose a general approach to systematic outer codes based on the BATS outer code. In Section 4, we introduce the triangular embedding approach to improve the design efficiency of the systematic outer code. In Section 5, we discuss inner coding schemes that can protect the message packets in systematic batches. Section 6 presents the concluding remarks.

2. Ordinary Batched Network Coding

We briefly introduce ordinary (non-systematic) batched network coding to assist the further discussion of the systematic design. A batched network code is formed by an outer code and an inner code. Here, we focus on a specific outer code called the BATS outer code, which was originally introduced as part of the BATS code. Readers are referred to [23] for more information about BATS codes.

2.1. BATS Outer Code

The outer code introduced here is also called the ordinary outer code, in contrast to the systematic outer code, to be discussed in the next section.
A finite field of size $q$, denoted by $\mathbb{F}_q$, is called the base field. A packet of length $T$ is a column vector in $\mathbb{F}_q^T$, and a set of packets of the same length is identified with the matrix formed by juxtaposing the packets in the set. We consider the transmission of $K$ message packets, which form the $T \times K$ matrix $B$, from the source node to the destination node in a network.
The (ordinary) outer code encodes the $K$ message packets in two steps. The first step uses a systematic precode to generate a number of redundant packets, which are also called parity check packets. Let $K' \geq K$ be the total number of packets, comprising the message packets and the parity check packets. Denote by $B_p$ the $K' - K$ parity check packets. Let $P$ be the $K' \times (K' - K)$ parity check matrix of the precode, i.e.,
$[B \ B_p] P = 0.$
The parity check packets can include both low-density parity check (LDPC) and high-density parity check (HDPC) packets to balance the computation cost and the decoding performance. Refer to [37] for such a design of $P$.
Let $B' = [B \ B_p]$, whose columns are called the precoded packets. The second encoding step of the outer code generates batches of coded packets. Let $M$ be a positive integer called the batch size, which is usually less than a hundred. For $i = 1, 2, \ldots$, the $i$th batch $X_i$ includes $M$ packets generated from a subset $B_i \subseteq B'$ as follows:
$X_i = B_i G_i,$
where $G_i$ is a matrix of $M$ columns called the batch generator matrix. The number of packets in $B_i$, which is also the number of rows of $G_i$, will be specified later. When $M = 1$, the outer code becomes a fountain code. The design of $B_i$ is discussed as follows.
Here, we discuss general batch encoding that can be used for various decoding approaches, including inactivation decoding. The precoded packets are further separated into two parts:
  • active packets, which include a subset of the message packets and all the LDPC packets; and
  • inactive packets, which include all the other message packets and all the HDPC packets.
Denote by $A$ the number of active packets. Then, the number of inactive packets is $K' - A$. We require $A \geq K$. As a special case, when there are no HDPC packets or inactive packets during encoding, we have $A = K'$. The encoding of a batch uses both active and inactive packets.
The number of active packets used in a batch is determined using a degree distribution $\Psi = (\Psi_1, \ldots, \Psi_{D_{\max}})$, and it affects the decoding performance of both belief propagation decoding and inactivation decoding. The degree distribution $\Psi$ is designed based on the rank distribution of the batch transfer matrices induced by the inner code. It suffices for the maximum degree $D_{\max}$ of the active packets to be a small multiple of $M$, as proven in [19]. For the encoding of each batch $X_i$,
  • independently sample $\Psi$ to obtain an integer $d_i^A$, which is called the active degree of the batch;
  • choose $d_i^A$ active packets uniformly at random to be included in $B_i$.
The inactive packets can help to further improve the inactivation decoding performance. When $M = 1$, on average, each batch may involve 2 or 3 inactive packets [26]. When $M > 1$, the number of inactive packets in a batch can be $3(K' - A)/n$, where $n$ is the number of batches expected to be used for decoding. Denote by $d_i^B$ the number of inactive packets used in the $i$th batch.
Considering both active and inactive packets, $B_i$ has $d_i = d_i^A + d_i^B$ packets, where $d_i$ is called the total degree of the batch. $G_i$ is a $d_i \times M$ uniformly random matrix with entries from the base field. In practice, random encoding can be implemented by a pseudorandom number generator, and the random values used in encoding can be reproduced for decoding if the source node and the destination node share the same pseudorandom number generator.
Denote by ENC the encoder that implements the above encoding process of the BATS outer code. The pseudocode of ENC is given in Appendix C for reference.
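To make the batch generation step concrete, the following is a minimal sketch in Python. It is written over GF(2) with numpy purely for readability (the paper uses $q = 256$), and the packet length, batch size, number of precoded packets and degree distribution are illustrative placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

T, M = 16, 4                 # packet length, batch size
K_prime = 32                 # number of precoded packets
B_prime = rng.integers(0, 2, size=(T, K_prime))   # precoded packets B'

# Toy degree distribution: value degrees[j] with probability Psi[j].
degrees = np.array([4, 6, 8])
Psi = np.array([0.5, 0.3, 0.2])

def encode_batch():
    """Generate one batch X_i = B_i G_i over GF(2)."""
    d = rng.choice(degrees, p=Psi)                     # sample a degree
    idx = rng.choice(K_prime, size=d, replace=False)   # pick d precoded packets
    B_i = B_prime[:, idx]                              # T x d
    G_i = rng.integers(0, 2, size=(d, M))              # d x M random generator
    return idx, G_i, (B_i @ G_i) % 2                   # T x M batch

idx, G_i, X_i = encode_batch()
print("degree:", len(idx), "batch shape:", X_i.shape)
```

In an actual implementation, replaying the pseudorandom number generator seed reproduces idx and G_i at the decoder, as noted above.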

2.2. General Inner Code Formulation

We use a line network as an example to introduce the inner code; the inner code can be extended to other network topologies, as discussed in [23]. A line network of length $L$ is formed by a sequence of network nodes labeled $0, 1, \ldots, L$, where node 0 is the source node and node $L$ is the destination node. All the other nodes are called intermediate nodes. Network links exist only between two consecutive network nodes and are modeled by packet erasure channels, i.e., a packet transmitted on a network link is either correctly received or erased. Figure 1 illustrates the line network.
The inner code is the composition of the recoding operations performed on each batch separately. The recoding at the source node takes the batches generated by the outer code as the input, and the recoding at an intermediate node takes the received packets of a batch as the input. For each batch, recoding generates a number of linear combinations of the packets belonging to the batch, and the packets generated by recoding are supposed to belong to the same batch. There are various approaches to the recoding operation, which is determined by the linear combination coefficients. The original RLNC schemes use coefficients chosen uniformly at random from the base field [4,6,7], and extensive research has been carried out towards recoding with lower complexity and latency [39,40,41,42,43,44]. In this paper, we study the recoding schemes that can fulfil the systematic coding requirement.
Without specifying a recoding scheme, we give a general formulation of recoding. Fix a certain network node $u$. Let $Y_i^{(u)}$ be the received packets of the $i$th batch at the node $u$. At the source node, $Y_i^{(0)} = X_i$. As recoding is linear, for $u = 1, \ldots, L$,
$Y_i^{(u)} = X_i H_i^{(u)} = B_i G_i H_i^{(u)},$  (2)
where $H_i^{(u)}$ is called the (batch) transfer matrix of the $i$th batch at the node $u$. The number of rows of $H_i^{(u)}$ is $M$. The number of columns of $H_i^{(u)}$ is the number of packets received for the $i$th batch at the node $u$, which may vary across batches and is finite. If no packets are received for a batch, $Y_i^{(u)}$ (resp. $H_i^{(u)}$) is an empty matrix with 0 columns.
Note that the transfer matrices are determined not only by the recoding scheme but also by the network packet loss pattern. Due to the randomness in both recoding and packet loss, the transfer matrices cannot be derived from the recoding design alone. To obtain the transfer matrices, RLNC embeds coefficient vectors in the packet headers immediately after $X_i$ is generated, so that the matrix formed by the coefficient vectors is initially the identity matrix. Every linear operation performed on a batch is performed on the coefficient vectors as well, so that $H_i^{(u)}$ can be read at each node $u$ from the headers of the received packets of batch $i$.
We say that a set of packets of a batch is linearly independent/dependent if the corresponding coefficient vectors in the packet headers are linearly independent/dependent. We call $\operatorname{rank}(H_i^{(u)})$ the rank of the $i$th batch at node $u$.
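The following sketch illustrates the coefficient-vector bookkeeping, again over GF(2) with illustrative sizes: the identity header is attached when the batch is created, every linear operation on the packets acts on the headers too, and a receiver reads the transfer matrix directly from the headers of whatever arrives.

```python
import numpy as np

rng = np.random.default_rng(2)
T, M = 16, 4

X = rng.integers(0, 2, size=(T, M))      # a fresh batch at the source
header = np.eye(M, dtype=int)            # initial coefficient vectors
packets = np.vstack([header, X])         # each column: header on top of payload

# One recoding step: 5 random linear combinations of the M packets.
Phi = rng.integers(0, 2, size=(M, 5))
recoded = (packets @ Phi) % 2

# Suppose the 2nd and 4th packets are lost; the receiver reads H from
# the headers of whatever arrives, and Y = X H holds automatically.
received = recoded[:, [0, 2, 4]]
H = received[:M, :]                      # transfer matrix from the headers
Y = received[M:, :]
assert np.array_equal(Y, (X @ H) % 2)
print("transfer matrix H:\n", H)
```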

2.3. Decoding Algorithms

Suppose that $n$ batches $Y_i^{(L)}$, $i = 1, \ldots, n$, are received at the destination node $L$. A decoder is expected to recover $B$ from $Y_i^{(L)}$, $i = 1, \ldots, n$, which are related to $B$ by a linear system. From this perspective, we obtain an upper bound on the decoding performance [23]:
$K \leq \sum_{i=1}^{n} \operatorname{rank}(H_i^{(L)}).$
When used as a block code with a fixed number $n$ of batches, the (outer) coding rate, defined as $K/n$, together with the decoding success probability, is used to measure the outer code performance. When used as a rateless code, the decoder is allowed to use more batches until all the message packets are decoded, and the (outer) coding overhead, defined as $\sum_{i=1}^{n} \operatorname{rank}(H_i^{(L)}) - K$, is used to measure the decoding performance.
As $B$ and $Y_i^{(L)}$, $i = 1, \ldots, n$, are related by a linear system, Gaussian elimination is the optimal algorithm to solve for $B$. However, Gaussian elimination incurs a computational complexity linear in $K$ for decoding one message packet on average, which is intolerable even for moderately large $K$. In the remainder of this section, we introduce several approaches that achieve $O(1)$ complexity for decoding one message packet. In the following, we first discuss two decoding algorithms without inactive packets and then discuss inactivation decoding.

2.3.1. Two-Step Decoding

Suppose that the number of inactive packets during encoding is 0, so that $d_i^B = 0$ for all batches. We first discuss the two-step decoding approach. The first step recovers a fraction $\eta$ of the $K'$ precoded packets using a belief propagation (BP) algorithm, which repeats the following operations:
  • A batch $i$ is said to be decodable if $d_i^A = \operatorname{rank}(G_i H_i^{(L)})$; solve each decodable batch by Gaussian elimination;
  • Substitute the decoded (precoded) packets into the other undecoded batches and update the corresponding batch degrees and generator matrices.
The BP decoding algorithm has a low computation cost that does not depend on the total number of message packets $K$. The second step decodes the precode to recover the message packets, which is expected to be successful if the first step recovers at least a fraction $\eta$ of all the precoded packets.
Assume that the ranks of the batch transfer matrices at the destination node, $\operatorname{rank}(H_i^{(L)})$, are i.i.d. and follow the distribution $h = (h_0, h_1, \ldots, h_M)$. We call $E[h] = \sum_{i=1}^{M} i h_i$ the expected rank. According to the theory of BATS codes [23], for a given rank distribution $h$, it is possible to design a degree distribution $\Psi$ such that when $K$ is large, BP decoding can recover a given fraction $\eta$ of the precoded packets with high probability when the coding rate $K/n$ is less than, but very close to, $E[h]$. In other words, we only need slightly more than $K/E[h]$ batches to recover the $K$ message packets.
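The sketch below illustrates the decodability test for a single batch, assuming GF(2) and toy sizes; gf2_solve is a helper written for this sketch, not part of any BATS implementation. A batch is decodable when its degree equals $\operatorname{rank}(G_i H_i^{(L)})$, in which case Gauss-Jordan elimination uniquely recovers the $d_i$ precoded packets it involves.

```python
import numpy as np

rng = np.random.default_rng(3)

def gf2_solve(A, Y):
    """Solve A Z = Y over GF(2); raises StopIteration if rank(A) < A.shape[1]."""
    A, Y = A.copy() % 2, Y.copy() % 2
    m, n = A.shape
    for col in range(n):
        piv = next(i for i in range(col, m) if A[i, col])   # pivot search
        A[[col, piv]], Y[[col, piv]] = A[[piv, col]], Y[[piv, col]]
        for i in range(m):
            if i != col and A[i, col]:
                A[i] = (A[i] + A[col]) % 2
                Y[i] = (Y[i] + Y[col]) % 2
    return Y[:n]                           # A has been reduced to [I; 0]

T, M, d = 16, 4, 4
B_i = rng.integers(0, 2, size=(T, d))     # the d precoded packets of the batch
G_i = rng.integers(0, 2, size=(d, M))     # generator matrix
H_i = rng.integers(0, 2, size=(M, M))     # transfer matrix at the destination
Y_i = (B_i @ G_i @ H_i) % 2               # received packets of the batch

# Decodable iff d = rank(G_i H_i): then (G_i H_i)^T B_i^T = Y_i^T has a
# unique solution, found here by Gauss-Jordan elimination.
try:
    B_hat = gf2_solve(((G_i @ H_i) % 2).T, Y_i.T).T
    print("decodable; packets recovered:", np.array_equal(B_hat, B_i))
except StopIteration:
    print("not decodable: rank(G_i H_i) <", d)
```

In the full BP loop, the recovered packets would then be substituted into the other batches, reducing their degrees, as described in the two bullet points above.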

2.3.2. Joint Decoding

The above two-step decoding algorithm can be improved by combining the two steps when the precoding includes LDPC. For LDPC precoding, each parity check constraint can be regarded as a batch with batch size 1 and only one all-zero received packet. Then, the BP decoding of the batches in the first step of the two-step approach can also include the parity checks.
In practice, the decoding of the LDPC precode and the decoding of the batches can thus be combined to improve the performance. The joint decoding algorithm improves the decoding success rate and reduces the coding overhead of the two-step decoding algorithm without increasing its computation cost.

2.3.3. Inactivation Decoding

When K is relatively small or the coding overhead is small, BP decoding tends to stop before decoding all the message packets. Although we can continue decoding by Gaussian elimination, the computational complexity is high.
A better approach is to use inactivation decoding: when BP decoding stops, an undecoded message packet is marked as inactive and substituted into the batches as a decoded packet to resume the BP decoding procedure. The decoding of batches with inactive packets also induces linear constraints on the inactive packets. Eventually, all the message packets are either decoded or inactive. The inactive packets are then solved by the linear constraints induced by decoding batches and the precodes. Inactivation decoding has the same decoding performance as Gaussian elimination, but can have a much lower computation cost if the number of inactive packets is small.
Moreover, when using inactivation decoding, we can use the inactive packets during encoding. Inactive packets during encoding are treated as inactive from the beginning of inactivation decoding and hence are also called pre-inactive packets. The extra inactive packets added during decoding are called the dynamic inactive packets. See [23] for a detailed discussion of inactivation decoding for BATS codes.
Denote by DEC the decoder that implements one of the above decoding processes of the BATS outer code. The pseudocode of DEC for two-step decoding is given in Appendix C for reference.

3. Systematic Outer Codes

In this section, we design a systematic outer code that can preserve the salient features of the ordinary BATS outer code. We call the batches that are designed to include all the message packets the systematic batches.

3.1. Naive Approaches

Before introducing our approach, we first discuss some naive approaches and their limitations. For a fixed number $n$ of batches, the outer code is a linear block code, and hence the encoding process can be described as
$[X_1 \ \cdots \ X_n] = B \tilde{G},$
where $\tilde{G}$ is the $K \times nM$ generator matrix of the first $n$ batches. Suppose that $nM \geq K$. If $\tilde{G}$ has $K$ columns forming the identity matrix, the outer code is systematic.
First, we show that the random encoding of the ordinary BATS outer code is, with high probability, not a systematic code. For a batch of total degree $d$, the probability that a coded packet is equal to a precoded packet is $d q^{-d}$: the coefficient vector of a coded packet is uniform over $\mathbb{F}_q^d$, and it must equal one of the $d$ unit vectors. As not all precoded packets are message packets, the probability that a coded packet is equal to a message packet is no greater than $d q^{-d}$. Typically, $d \geq M \geq 2$ and $q = 256$; for example, $d = 16$ and $q = 256$ give $d q^{-d} = 2^{-124}$. Thus, it is very unlikely that a message packet appears in a batch generated by the ordinary outer code.
When $n$ is slightly larger than $K/M$, the matrix $\tilde{G}$ obtained from the ordinary BATS outer code has rank $K$ with a high probability. The general procedure to make a linear code systematic is to transform $\tilde{G}$ by elementary row operations into the reduced row echelon form. Although a systematic code can be obtained in this way, the drawback of this approach is that the low encoding/decoding computation cost of the BATS outer code cannot be preserved.
Now, we discuss another naive approach that seems to solve the computation cost issue. To simplify the discussion, suppose that the number of message packets $K$ is a multiple of the batch size $M$. In this naive approach, the first $K/M$ batches form a partition of all the message packets, and more (non-systematic) batches are generated according to the ordinary outer code encoding discussed in Section 2.1. However, to guarantee good decoding performance using this naive approach, a high degree must be applied to all the non-systematic batches.
We show two cases wherein a high degree of the non-systematic batches is necessary. In the first case, one systematic batch is completely erased during the communication, and all the other systematic batches are received by the destination node, together with one non-systematic batch. Suppose that the erased batch is chosen uniformly at random. For all the received batches, the batch transfer matrix is the $M \times M$ identity matrix, so the decoding problem becomes one of traditional erasure coding. The total number of received packets is $K$. To guarantee the decoding of all the message packets, it is necessary that the degree of the received non-systematic batch is $K$.
In the second case, we consider that each of $M$ systematic batches loses exactly one packet during communication, while all the other packets are received correctly. In other words, the batch transfer matrix of each of these $M$ systematic batches is the $M \times M$ identity matrix with one column, chosen uniformly at random, removed. The destination node also receives all the other systematic batches, together with one non-systematic batch, all with the identity batch transfer matrix. The total number of received packets is $K$. To guarantee the decoding of all the message packets, it is again necessary that the degree of the received non-systematic batch is $K$.
From these cases, we see that to achieve a high coding rate using the naive approach, the degree of the non-systematic batches must be high, and hence the encoding/decoding complexity is high. In the remainder of this section, we derive an approach to obtain a systematic outer code that has encoding/decoding complexity similar to the ordinary BATS outer code.

3.2. General Approach to Systematic Outer Codes

We give a general approach to systematic outer codes, which extends the idea of systematic fountain codes [37]. Suppose that we have $K$ message packets $B$ for encoding using a systematic outer code with batch size $M$, where $K$ is not necessarily a multiple of $M$. Let $n_s$ be an integer larger than or equal to $K/M$, to be decided later. We wish to design an outer code such that the first $n_s$ batches are systematic batches that include all the message packets.
Our approach to a systematic outer code uses an ordinary outer code $(\mathrm{ENC}, \mathrm{DEC})$, where ENC is the encoder and DEC is the decoder, as described in Section 2.1 and Section 2.3, respectively. The encoder ENC has two parts, $\mathrm{ENC}_{n_s}$ and $\mathrm{ENC}_{n_s}^+$, where $\mathrm{ENC}_{n_s}$ generates the first $n_s$ batches and $\mathrm{ENC}_{n_s}^+$ generates all the further batches. The decoder DEC in general applies to all the batches subject to any batch transfer matrices. We denote by $\mathrm{DEC}_{n_s}$ the case of DEC applied to the first $n_s$ batches with rank-$M$ batch transfer matrices.
To construct the systematic outer code, we require $(\mathrm{ENC}, \mathrm{DEC})$ to satisfy some additional requirements. The pair $(\mathrm{ENC}, \mathrm{DEC})$ is said to be consistent if the following conditions are satisfied:
  • $\mathrm{ENC}_{n_s}$ and $\mathrm{DEC}_{n_s}$ are deterministic; and
  • for any $K$ packets $B$,
$B = \mathrm{DEC}_{n_s} \circ \mathrm{ENC}_{n_s}(B).$  (3)
For the second consistency requirement, it is possible to verify (3) without checking specific values of the $K$ packets $B$, i.e., it is not necessary to check all choices of $K$ packets. The reason is that both $\mathrm{ENC}_{n_s}$ and $\mathrm{DEC}_{n_s}$ are linear operations and, if the decoding is successful, their composition is multiplication by the $K \times K$ identity matrix. We discuss how to design a consistent outer code later. Here, we focus on how to use it to construct a systematic outer code.
For a consistent $(\mathrm{ENC}, \mathrm{DEC})$, the decoder $\mathrm{DEC}_{n_s}$ solves the $K$ message packets from the $n_s M$ coded packets generated by $\mathrm{ENC}_{n_s}$. Among the $n_s M$ coded packets, $n_s M - K$ coded packets are redundant and can be removed without affecting the decoding performance. (The decoding of a BATS code requires solving a system of linear equations by elementary equation operations. Each coded packet corresponds to an equation of the system, and each equation can solve at most one message packet. Therefore, exactly $K$ equations are eventually transformed into the solutions of the message packets; the other equations are redundant.) All the redundant packets can be identified by a trial of $\mathrm{DEC}_{n_s}$. For $i = 1, \ldots, n_s$, let $M_i$ be the number of non-redundant coded packets in the $i$th batch, so that $\sum_{i=1}^{n_s} M_i = K$. Denote by $\mathrm{DEC}_{n_s}^*$ the same decoder as $\mathrm{DEC}_{n_s}$ except that the redundant coded packets are removed from the decoder input.
Now, we can construct the systematic outer code. For the systematic outer code, the encoding at the source node works as follows:
  • Partition the message packets $B$ into $n_s$ subsets $\tilde{X}_i$, $i = 1, \ldots, n_s$, where the number of packets in the $i$th subset $\tilde{X}_i$ is $M_i$;
  • Calculate $\tilde{B} = \mathrm{DEC}_{n_s}^*(\tilde{X}_1, \ldots, \tilde{X}_{n_s}) = \mathrm{DEC}_{n_s}^*(B)$;
  • Generate the first $n_s$ batches $\mathrm{ENC}_{n_s}(\tilde{B})$;
  • Generate more batches by applying $\mathrm{ENC}_{n_s}^+$ to $\tilde{B}$.
See Figure 2b for an illustration of the above encoding process.
We justify that the above encoding process is systematic by showing that the first $n_s$ batches include all the message packets. Denote by $\mathrm{ENC}_{n_s}^*$ the encoder that generates only the $M_i$ non-redundant coded packets of the $i$th batch, $i = 1, \ldots, n_s$. For any $K$ packets $B$, $\mathrm{DEC}_{n_s}^* \circ \mathrm{ENC}_{n_s}^*(B) = \mathrm{DEC}_{n_s} \circ \mathrm{ENC}_{n_s}(B) = B$. Note that $\mathrm{ENC}_{n_s}^*$ and $\mathrm{DEC}_{n_s}^*$ can be expressed as square matrices that are inverse to each other, and hence their order can be interchanged without changing the output, i.e., $\mathrm{ENC}_{n_s}^* \circ \mathrm{DEC}_{n_s}^*(B) = \mathrm{ENC}_{n_s}^*(\tilde{B}) = B$.
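The algebra behind this justification can be demonstrated with a toy stand-in, assuming the whole consistent outer code is abstracted as multiplication by a fixed $K \times N$ matrix $G$ over GF(2) whose first $K$ columns play the role of the non-redundant packets; real BATS decoding is BP-based, so this only illustrates why $\mathrm{ENC}^* \circ \mathrm{DEC}^*$ is the identity.

```python
import numpy as np

rng = np.random.default_rng(4)
T, K, N = 8, 5, 7        # toy sizes: K message packets, N coded packets

def gf2_inv(A):
    """Invert a square 0/1 matrix over GF(2); raises StopIteration if singular."""
    n = A.shape[0]
    W = np.concatenate([A % 2, np.eye(n, dtype=int)], axis=1)
    for c in range(n):
        piv = next(i for i in range(c, n) if W[i, c])
        W[[c, piv]] = W[[piv, c]]
        for i in range(n):
            if i != c and W[i, c]:
                W[i] = (W[i] + W[c]) % 2
    return W[:, n:]

# A fixed "consistent" outer code: ENC is multiplication by G, and the
# first K coded packets are the non-redundant ones (G[:, :K] invertible).
while True:
    G = rng.integers(0, 2, size=(K, N))
    try:
        G_inv = gf2_inv(G[:, :K])
        break
    except StopIteration:
        continue

B = rng.integers(0, 2, size=(T, K))       # message packets

B_tilde = (B @ G_inv) % 2                 # step 2: B~ = DEC*(B)
X = (B_tilde @ G) % 2                     # steps 3-4: encode B~

# The non-redundant positions of the batches now carry B itself.
assert np.array_equal(X[:, :K], B)
print("first K coded packets equal the message packets:", True)
```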
The computation cost of the third step of encoding can be reduced, as not all the packets in the systematic batches need to be regenerated. Let $(X_1, \ldots, X_{n_s}) = \mathrm{ENC}_{n_s}(\tilde{B})$. We have $\tilde{X}_i \subseteq X_i$, and $X_i \setminus \tilde{X}_i$ includes only the packets that are redundant for $\mathrm{DEC}_{n_s}$ in the $i$th batch. As $\tilde{X}_i$ is a subset of the message packets, it is not necessary to generate it again. Denote by $\overline{\mathrm{ENC}}_{n_s}$ the encoder of $\tilde{B}$ that generates only $X_i \setminus \tilde{X}_i$ for $i = 1, \ldots, n_s$. Let $(\bar{X}_1, \ldots, \bar{X}_{n_s}) = \overline{\mathrm{ENC}}_{n_s}(\tilde{B})$. Then, the $n_s$ systematic batches are $X_i = \tilde{X}_i \cup \bar{X}_i$.
The batches generated by the above systematic encoding process are further transmitted through the network and processed by the inner code. Let $Y$ be the coded packets received by the destination node. To decode, DEC is first applied to $Y$ to output $\tilde{B}$. Then, we apply $\mathrm{ENC}_{n_s}$ to $\tilde{B}$ to recover $B$. See Figure 2c for an illustration of the decoding process.

3.3. Computation Cost

At first sight, the systematic outer code seems to increase the encoding and decoding computation costs, because an additional decoding step is employed in the systematic encoding and an additional ordinary encoding step is employed in the systematic decoding. However, after careful evaluation, we see that the encoding computation cost of the systematic outer code is lower than that of the ordinary outer code. The decoding computation cost of the systematic outer code depends on the number of message packets received at the destination node. In the worst case, where no message packets are received, the decoding computation cost is doubled.
To assist our discussion, we denote by $b$ the average computation cost of encoding one packet using the ordinary outer code, and by $c$ the computation cost of decoding the ordinary outer code using $K$ coded packets. Here, we assume that the decoding is successful with zero coding overhead. Suppose that the packet length $T$ is much larger than $M$, which means that the coefficient vector length is much less than $T$. According to the analysis in [23], $b = O(M)$ and $c = O(KM)$ linear combination operations (LCOs). (A linear combination operation (LCO) refers to the computation of a linear combination $x + \alpha y$, where $x$ and $y$ are two packets of $T$ field elements and $\alpha$ is an element of the base field.) Moreover, for the two-step decoding and the joint decoding, $Kb \approx c$. For inactivation decoding, if the number of inactive packets is bounded by a constant, $Kb \approx c$ as well.

3.3.1. Encoding Computation Cost

The encoding computation cost depends on the number of coded packets generated. For the ordinary outer code, the computation cost of encoding $k$ packets is $kb$, where $k = 1, 2, \ldots$. For the systematic outer code, we assume $n_s M = K$ (we will discuss how to design such a code). As the first $K$ coded packets are the message packets themselves, the encoding of the first $K$ packets requires no computation. To encode more packets, the systematic outer code needs to execute $\mathrm{DEC}_{n_s}^*$, which has a computation cost $c$, and $\mathrm{ENC}_{n_s}^+$, which takes computation cost $b$ on average to generate a packet. Therefore, when $k > K$, the computation cost of generating the first $k$ coded packets using the systematic outer code is $(k - K)b + c \approx kb$. See the illustration in Figure 3a regarding the computation cost of generating the first $k$ packets.
To further understand how the encoding computation cost affects the operation at the source node, we consider two models of message packet arrival at the source node. In the first model, the message packets arrive one-by-one with a unit time interval between two consecutive packets. When a precode with HDPC is employed, the ordinary outer code encoding can only start to generate coded packets from time $K$. Let $\Delta$ be the time taken by the ordinary encoder to generate $K$ coded packets, where $\Delta = Kb \approx c$. The systematic outer code can generate a coded packet upon the arrival of each message packet. At time $K$, the systematic outer code executes $\mathrm{DEC}_{n_s}^*$, which also takes $\Delta$ time. In the second model, all the $K$ message packets arrive together at the same time, e.g., time $K$. For this model, the ordinary outer code behaves in the same way as in the previous model, and the systematic outer code can output the first $K$ coded packets at time $K$.
We see that for both message packet arrival models, the systematic outer code generates the first K packets earlier than the ordinary outer code. When k > K , both encoders generate the kth packet at the same time. See an illustration of this in Figure 3b.

3.3.2. Decoding Computation Cost

For the systematic outer code, the decoding computation cost depends on the number $K_m$ of message packets received by the destination node. When $K_m = K$, i.e., all the message packets are received, no computation is required for decoding. When $K_m < K$, the systematic code decoder needs to execute DEC, which has a computation cost $c$, and $\mathrm{ENC}_{n_s}^*$, which takes computation cost $b$ on average to generate a packet. As $K_m$ message packets have been received, we only need to use $\mathrm{ENC}_{n_s}^*$ to generate the remaining $K - K_m$ message packets. Therefore, the overall decoding computation cost is $(K - K_m)b + c \approx 2c - K_m b$. When $K_m$ is close to $K$, the systematic outer code decoding computation cost is close to that of the ordinary outer code decoding. In the worst case, i.e., $K_m = 0$, the systematic outer code decoding computation cost is doubled compared with the ordinary outer code decoding. See the illustration in Figure 4a.
To illustrate how the decoding computation cost affects the operation at the destination node, we consider that coded packets are received one-by-one with a unit time interval between two consecutive packets. We assume that the ordinary outer code decoder starts decoding at time $K$ and takes an additional $\Delta$ time to decode all the message packets. When $K_m = K$, all the message packets are available at time $K$. When $K_m < K$, the systematic outer code decoder executes DEC at time $K$ and starts to use $\mathrm{ENC}_{n_s}^*$ from time $K + \Delta$ to generate the $K - K_m$ message packets that have not been received. In the worst case, where $K_m = 0$, the systematic outer code decodes all the message packets at time $K + 2\Delta$. See the illustration in Figure 4b.

3.4. Random Design

To implement the general approach to the systematic outer code, we only need to design a consistent pair $(\mathrm{ENC}, \mathrm{DEC})$. In the remainder of this section, we discuss the traditional random approach to designing a consistent $(\mathrm{ENC}, \mathrm{DEC})$. In the next section, we discuss a new approach that can design a consistent $(\mathrm{ENC}, \mathrm{DEC})$ more efficiently.
Denote by $h$ the rank distribution of the batches, and let $\Psi^A$ be the degree distribution optimized for $h$ as in ([23], Chapter 6), which asymptotically achieves the near-optimal rate of the ordinary outer code described in Section 2.1. We can use the ordinary encoder and decoder introduced in Section 2.1 with the degree distribution $\Psi^A$ to design $\mathrm{ENC}_{n_s}^+$ and DEC for a consistent outer code $(\mathrm{ENC}, \mathrm{DEC})$.
The ordinary outer code is random, but we need a deterministic encoder-decoder pair $(\mathrm{ENC}_{n_s}, \mathrm{DEC}_{n_s})$ to satisfy the consistency properties. For a given $n_s \geq K/E[h]$, we can perform random trials of the ordinary outer code using the degree distribution $\Psi^A$ until an instance $(\mathrm{ENC}_{n_s}, \mathrm{DEC}_{n_s})$ is found such that (3) is satisfied. Note that it is sufficient for us to find only one such instance. As both $\mathrm{ENC}_{n_s}$ and $\mathrm{ENC}_{n_s}^+$ generate batch instances following the random outer code encoder with the degree distribution $\Psi^A$, which is optimized for $h$, DEC can guarantee a high decoding success probability for a sufficiently large number of received batches [23].
If such an instance cannot be found for a certain value of $n_s$, we can increase the value of $n_s$ by 1 and try again. The ordinary outer code is expected to decode correctly with a high probability when the number of batches is sufficiently large, and we expect to design a systematic code with the expected coding overhead $n_s E[h] - K$ as small as possible.
When $M = 1$ and $E[h] = 1$, i.e., the case of fountain codes, a consistent outer code exists for a range of values of $K$ with $n_s = K$ using this approach [26]. For fountain codes, the random design works well because fountain codes have a universal design that can handle all packet loss patterns, so the random design is performed only once for each value of the number of message packets $K$. Therefore, the efficiency of the random design is not an issue: a large number of random trials can be afforded to find a consistent outer code with a small or zero coding overhead.
Although the random design is suitable for fountain codes, it can be less efficient when $M > 1$. BATS codes are not universal in the sense that the optimal degree distribution depends on the rank distribution $h$. Therefore, even for the same value of $K$, the random design needs to be repeated for each $h$, and this may need to be carried out for $h$ obtained online. Hence, the efficiency of the random design becomes an issue for BATS codes with batch size $M \geq 2$. For $M = 16$, we perform some experiments using the BATS code implementation in [45] with the parameters in Appendix B. Inactivation decoding is applied to achieve a lower coding overhead. To limit the computation cost of inactivation decoding, the number of inactive packets is limited to 150. In the experiments, we use the rank distribution $h$ with $E[h] = M$, which is also called the rank-$M$ distribution. The experimental results are summarized in Table 1. We observe that when $K$ is up to $400M$, a consistent instance with $n_s = K/M$ can be found. However, the larger the value of $K$, the lower the probability of finding a code with zero coding overhead. For example, when $K = 10M$, most instances have zero overhead, whereas when $K = 400M$, only four instances have zero coding overhead. When $K = 600M$, no instance with zero coding overhead is found.
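The random design can be pictured as the trial loop below, in an idealized GF(2) setting where the decoder is plain Gaussian elimination, so "consistent with zero coding overhead" reduces to a randomly generated $K \times n_s M$ generator matrix having rank $K$; actual BATS decoding is BP/inactivation decoding, and the sparse column weight used here is an illustrative stand-in for sampling the degree distribution.

```python
import numpy as np

rng = np.random.default_rng(5)

def gf2_rank(A):
    """Rank of a 0/1 matrix over GF(2) by Gaussian elimination."""
    A = A.copy() % 2
    r = 0
    for c in range(A.shape[1]):
        piv = [i for i in range(r, A.shape[0]) if A[i, c]]
        if not piv:
            continue
        A[[r, piv[0]]] = A[[piv[0], r]]
        for i in range(A.shape[0]):
            if i != r and A[i, c]:
                A[i] = (A[i] + A[r]) % 2
        r += 1
        if r == A.shape[0]:
            break
    return r

K, M = 32, 4
n_s = K // M                         # aim for zero coding overhead

trials = 0
while True:
    trials += 1
    # one random trial: a sparse K x (n_s * M) generator with weight-3 columns
    Gt = np.zeros((K, n_s * M), dtype=int)
    for j in range(n_s * M):
        Gt[rng.choice(K, size=3, replace=False), j] = 1
    if gf2_rank(Gt) == K:            # idealized decoder succeeds
        break
print(f"consistent instance found after {trials} trial(s)")
```

As the text observes, the number of trials needed grows with $K$, which motivates the structured design of the next section.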

4. Triangular Embedding: A Structured Systematic Outer Code Design

We propose a structured design of consistent outer codes with a general batch size $M \geq 1$. Our approach is based on the following observation: for a consistent instance found by the random design, $\mathrm{DEC}_{n_s}$ induces an order of the batches such that the $i$th batch is solvable once all the previous batches are solved. Our approach, called triangular embedding, designs $\mathrm{ENC}_{n_s}$ so that this solving order is predefined. When $M = 1$, our approach also gives a new design of systematic fountain codes.

4.1. Triangular Embedding Design

Consider the encoding of $K$ message packets with respect to a general rank distribution $h$. We discuss how to generate the first $n_s$ batches, where $n_s \geq K/E[h]$. The precode is the same as in the ordinary outer code. Let $K$ and $K'$ be the number of message packets and the number of precoded packets, respectively. The precoded packets are also separated into active and inactive packets. Let $A$ be the number of active packets, where $A \geq K$.
For the degree distribution $\Psi$ optimized for $h$, we assume that the degree probability is zero for degrees from 1 to $M - 1$. (This assumption does not affect the generality of our design, as it is asymptotically optimal to use such a degree distribution when the probability that the batch transfer matrix has rank $M$ is positive. When this probability is zero, we should reduce the batch size to improve the network's communication efficiency.) Generate the active degree values $d_1^A, \ldots, d_{n_s}^A$ for the first $n_s$ batches by sampling $\Psi$. To simplify the discussion, we assume that the degree values are sorted so that $M \leq d_1^A \leq d_2^A \leq \cdots \leq d_{n_s}^A$. The inactive degree $d_i^B$ of the $i$th batch is obtained in the same way as in the ordinary outer code.
Fix positive integers $M_1, M_2, \ldots, M_{n_s}$ such that $\sum_{i=1}^{n_s} M_i = K$ and $M_i \leq M \leq d_i^A$. For example, when $n_s$ divides $K$, we may choose $M_i = K/n_s$. When $n_s$ does not divide $K$, there exist unique non-negative integers $a$ and $b < n_s$ such that $K = a n_s + b$. We may let $M_i = a + 1$ for $i = 1, \ldots, b$ and $M_i = a$ for $i = b + 1, \ldots, n_s$.
Let $N_{\mathrm{inac}}$ be the maximum number of dynamic inactive packets allowed during inactivation decoding. We should determine the total number of inactive packets, $N_{\mathrm{inac}} + K' - A$, according to the decoding computation cost constraint. For example, $N_{\mathrm{inac}} + K' - A = 2\sqrt{K}$. We further assume that for $i = 1, \ldots, n_s$,
$d_i^A \leq \min\{A - K, N_{\mathrm{inac}}\} + \sum_{j=1}^{i} M_j.$  (4)
This assumption is usually satisfied by the sampled degrees because (1) $A - K$ is linear in $K$ and (2) the average degree of a BATS code degree distribution is only around two times the batch size $M$, and even the maximum degree is $O(M)$. If $d_i^A$ does not satisfy (4), which should occur rarely, we can reduce $d_i^A$ to this upper bound or re-sample the active degrees.
Let $m_1 = 0$ and, for $i \geq 2$, let $m_i = m_{i-1} + M_{i-1}$. For $i \geq 1$, the $d_i^A$ active packets in $B_i$ include
  • the $(m_i + 1)$th to the $(m_i + M_i)$th active packets, and
  • a set of $d_i^A - M_i$ packets chosen from the first $m_i$ active packets and the last $\min\{A - K, N_{\mathrm{inac}}\}$ active packets.
The inactive packets in $B_i$ are obtained in the same way as in the ordinary outer code. The $i$th batch is generated as $B_i G_i$, where $G_i$ is a $d_i \times M$ matrix different from that of the ordinary outer code. The rows of $G_i$ corresponding to the $(m_i + 1)$th to the $(m_i + M_i)$th active packets have the form $[I_{M_i} \ U]$, where $I_{M_i}$ is the $M_i \times M_i$ identity matrix and $U$ is an $M_i \times (M - M_i)$ uniformly random matrix. The other rows of $G_i$ are uniformly random. The generated batches can be transmitted in an arbitrary order.
Define $\tilde{G}_i$ as the $K' \times M$ matrix obtained by inserting zero rows into $G_i$ so that $[B \ B_p] \tilde{G}_i = B_i G_i$. The overall generator matrix of $\mathrm{ENC}_{n_s}$ can be written as $\tilde{G} = [\tilde{G}_1 \ \cdots \ \tilde{G}_{n_s}]$. According to the design of $\mathrm{ENC}_{n_s}$, $\tilde{G}$ has the form shown in Figure 5.
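The construction of $\tilde{G}$ can be sketched as follows, assuming GF(2) and modeling only the active packets: the partition sizes $M_i$ are computed from $K$ and $n_s$, each batch embeds an $M_i \times M_i$ identity block for its own packets, and the remaining rows are drawn from earlier packets plus a small stand-in region for the last $\min\{A - K, N_{\mathrm{inac}}\}$ active packets; all sizes and degrees are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

K, M, n_s = 22, 4, 6                 # n_s does not divide K here
L_extra = 4                          # stand-in for the last min{A-K, N_inac} packets
A = K + L_extra                      # active packets: K message-like + L_extra

a, b = divmod(K, n_s)                # K = a * n_s + b
M_i = [a + 1] * b + [a] * (n_s - b)  # partition sizes; sum(M_i) == K
m = np.cumsum([0] + M_i[:-1])        # m_i: offset of the i-th batch's own packets

G_tilde = np.zeros((A, n_s * M), dtype=int)   # overall generator of ENC_{n_s}
for i in range(n_s):
    Mi, mi = M_i[i], int(m[i])
    d = Mi + 3                       # toy active degree (>= M for these sizes)
    cols = slice(i * M, (i + 1) * M)
    # rows of the batch's own M_i packets: [I_{M_i} | U]
    G_tilde[mi:mi + Mi, cols] = np.concatenate(
        [np.eye(Mi, dtype=int), rng.integers(0, 2, size=(Mi, M - Mi))], axis=1)
    # d - M_i extra rows from earlier packets and the extra region
    pool = list(range(mi)) + list(range(K, A))
    picks = rng.choice(pool, size=d - Mi, replace=False)
    G_tilde[picks, cols] = rng.integers(0, 2, size=(d - Mi, M))

assert sum(M_i) == K
print("nonzero pattern of G_tilde:\n", (G_tilde != 0).astype(int))
```

The printed pattern shows the staircase of identity blocks that makes batch $i$ solvable once batches $1, \ldots, i-1$ are solved and the extra-region packets are inactivated.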

4.2. Decoder Design

It is possible to use the decoders discussed in Section 2.3 to decode the batches generated by triangular embedding. However, due to the structure of the triangular embedding encoding, the decoder can be simplified.
We design a decoder $\mathrm{DEC}_{n_s}$ using only the first $M_i$ packets of the $i$th batch, $i = 1, \ldots, n_s$. The overall generator matrix $\tilde{G}^*$ of $\mathrm{ENC}_{n_s}^*$ has the form shown in Figure 6.
An inactivation decoder can be applied to decode the message packets:
  • First, inactivate all the packets used by the first $n_s$ batches among the last $A - K$ active packets;
  • Second, apply belief propagation decoding to solve all the batches;
  • Last, solve the inactive packets.
Note that, as at most $\min\{A - K, N_{\mathrm{inac}}\}$ packets among the last $A - K$ active packets are used during encoding, the total number of inactive packets is no more than $N_{\mathrm{inac}} + K' - A$.

4.3. Design Verification

We verify the triangular embedding design in two respects. First, it helps to generate zero-coding-overhead consistent outer codes using a small number of random trials. Second, when jointly decoded with batches generated by the ordinary outer code, the decoding performance is similar to the case of decoding only batches generated by the ordinary outer code.
We perform the experiments using the batch size $M = 16$ and the rank-$M$ rank distribution $h$. As with the experiments in Section 3.4, we use the BATS code implementation in [45] with the parameters in Appendix B. The experimental results of the triangular embedding outer code are shown in Table 2. We see that, using triangular embedding, for $K$ up to $1000M$, more than $99.5\%$ of the instances have zero coding overhead. In fact, for the remaining instances, the coding overhead is only 1 packet (generated using the ordinary outer code). The last row in Table 2 gives the maximum number of inactive packets over all the instances tested for each value of $K$. We see that the number of inactivations is lower than 150, the limit imposed in the random design. Therefore, triangular embedding also reduces the decoding computation cost.
As $\mathrm{ENC}_{n_s}$ uses a different encoding approach from the ordinary outer code $\mathrm{ENC}_{n_s}^+$, we consider whether the batches generated by triangular embedding and the batches generated by the ordinary outer code together form a good outer code. We perform numerical experiments to verify the joint decoding performance of these two types of batches. Each batch generated by triangular embedding is discarded with probability $\epsilon = 0.1, 0.3, 0.5$, and the remaining batches are sent to the decoder. After the first $n_s$ batches, the ordinary outer code is applied to generate more batches for the decoder. We adopt the same degree distribution $\Psi$ optimized for the rank-$M$ distribution. The results are shown in Table 3. We see that for all the cases of $\epsilon$ and for $K = 10M, 100M, 200M$, the number of zero-coding-overhead instances is higher than that in Table 1, and the number of instances with a coding overhead larger than $2M$ is lower than that in Table 1. For $K = 400M, 600M$, the decoding performance is similar to that in Table 1 in terms of both the ratio of zero coding overhead and the ratio of coding overhead larger than $2M$.

5. Inner Code for Systematic Batches

In this section, we study the design of the inner code for systematic batches. Based on the discussion in Section 3.3, the decoding complexity at the destination node depends on the number of message packets received. However, using the existing inner coding schemes, the number of message packets in a systematic batch decreases significantly during communication. In the worst case, when no message packets are received at the destination node, the decoding computation cost at the destination node is doubled compared with that of the ordinary BATS outer code. To resolve this issue, we discuss how to design the inner code to preserve the message packets in systematic batches.

5.1. Detailed Inner Code Formulation

We first formulate in detail how each network node performs the inner code. We also discuss the existing inner coding schemes for systematic batches.
We consider the inner code on a line network as described in Section 2. As the inner code is performed on each batch individually, we consider a generic systematic batch $X$ without the subscripts. We assume that the packets in $X$ are all message packets. By (2), the received packets $Y^{(u)}$ of the batch $X$ at node $u$ satisfy
$Y^{(u)} = X H^{(u)},$  (5)
where $H^{(u)}$ is the transfer matrix of the batch at the node $u$.
Let $N_u$ be the number of columns of $Y^{(u)}$ (or $H^{(u)}$), i.e., the number of received packets of the batch at node $u$. For a non-destination node $u$, we use $u^+$ to denote the receiver of the outgoing link of $u$ in the line network. Suppose that the node $u$ needs to transmit $N'_u$ packets of the batch $X$ to the node $u^+$. The transmitted packets, called recoded packets, are generated by linear combinations as $Y^{(u)} \Phi^{(u)} = X H^{(u)} \Phi^{(u)}$, where $\Phi^{(u)}$ is an $N_u \times N'_u$ matrix over the base field $\mathbb{F}_q$. Due to packet loss, the set of received packets at $u^+$ is a subset of the columns of $Y^{(u)} \Phi^{(u)}$. Let $E^{(u)}$ be an $N'_u \times N_{u^+}$ matrix obtained by removing columns of an identity matrix according to the packet erasures. We can write
$Y^{(u^+)} = X H^{(u)} \Phi^{(u)} E^{(u)} = X H^{(u^+)},$  (6)
where $H^{(u^+)} = H^{(u)} \Phi^{(u)} E^{(u)}$.
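A single hop of (6) can be simulated as below, assuming GF(2) and illustrative sizes: node $u$ applies a recoding matrix $\Phi$ to its received packets, the link erases some of the recoded packets, and the transfer matrix updates as $H^{(u^+)} = H^{(u)} \Phi E$.

```python
import numpy as np

rng = np.random.default_rng(7)
T, M = 16, 4
N_u, N_tx = 3, 5                 # packets held at u, packets transmitted

X = rng.integers(0, 2, size=(T, M))            # systematic batch
H_u = rng.integers(0, 2, size=(M, N_u))        # transfer matrix at u
Y_u = (X @ H_u) % 2                            # received packets at u

Phi = rng.integers(0, 2, size=(N_u, N_tx))     # RLIC-style random recoding
sent = (Y_u @ Phi) % 2

survive = [0, 1, 3]                            # erasure pattern on the link
E = np.eye(N_tx, dtype=int)[:, survive]        # identity with columns removed

H_next = (H_u @ Phi @ E) % 2                   # transfer matrix at u+
Y_next = sent[:, survive]
assert np.array_equal(Y_next, (X @ H_next) % 2)
print("H(u+) =\n", H_next)
```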
There are many designs of the recoding matrix $\Phi^{(u)}$ in the literature. One common choice for RLNC is a uniformly random matrix over the base field, which is also called the random linear inner code (RLIC). For multicast communications, it has been shown that the RLIC achieves the multicast capacity of networks with packet loss [5,8,9,10]. For the line network discussed here, the systematic inner code (SIC) has been proposed [23], where all the linearly independent received packets are directly used as recoded packets. We first discuss the performance of these two existing inner code schemes for systematic batches.
  • When using the RLIC for systematic batches, the probability that a recoded packet (a column of $X \Phi^{(u)}$) is a message packet is $q^{-M}$.
  • When using the SIC for systematic batches, if the network links have no packet loss, the destination node receives all the message packets without any decoding. If each link independently erases a packet with probability $\epsilon > 0$, the number of received message packets at the destination node drops exponentially as $L$ increases.
In other words, for both RLIC and SIC, the destination node cannot benefit from the systematic outer code.
We are thus motivated to study recoding matrices $\Phi^{(u)}$ such that a non-source node $u$ can receive more message packets of a systematic batch even when there are packet losses.

5.2. Recovery of Individual Message Packets

Although $Y^{(u)}$ does not in general include any message packets, it may be possible to decode some message packets from (5). When $\operatorname{rank}(H^{(u)}) = M$, all the message packets of a batch can be decoded at node $u$. Note that for batched network coding, $H^{(u)}$ does not necessarily have rank $M$. We say that a message packet, i.e., a column of $X$, can be recovered at node $u$ if it can be uniquely solved from the system (5). Even when $\operatorname{rank}(H^{(u)}) < M$, some of the message packets may be recovered by operations within a systematic batch.
Denote by $\operatorname{Col}(H^{(u)})$ the column space of the matrix $H^{(u)}$. Let $e_i$ be the length-$M$ column vector with its $i$th entry 1 and all the other entries 0. A necessary and sufficient condition for a message packet to be recoverable from $Y^{(u)}$ is as follows.
Lemma 1. 
Under the condition that $Y^{(u)} = X H^{(u)}$ is consistent, the $i$th packet in $X$ has a unique solution if and only if $e_i \in \operatorname{Col}(H^{(u)})$.
Proof. 
The lemma can be proven by the equivalence of the following statements:
  • The $i$th packet in $X$ has a unique solution;
  • All the row vectors $x \in \mathbb{F}_q^M$ such that $x H^{(u)} = 0$ (which form the left nullspace of $H^{(u)}$) have the $i$th entry 0;
  • $e_i$ is orthogonal to the left nullspace of $H^{(u)}$;
  • $e_i$ is in the column space of $H^{(u)}$.   □
The following proposition shows that we can test the recoverability of all the message packets in a systematic batch using the reduced column echelon form of $H^{(u)}$, which can be obtained by (column-wise) Gauss–Jordan elimination.
Proposition 1. 
Let $L$ be the reduced column echelon form of a matrix $H^{(u)}$. Then, $e_i \in \operatorname{Col}(H^{(u)})$ if and only if $e_i$ is a column of $L$.
Proof. 
See Appendix A.    □
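The test of Proposition 1 is straightforward to implement. The sketch below (ours, reusing the prime-field convention F_q with q = 257 so that arithmetic modulo q is field arithmetic) computes the reduced column echelon form by column-wise Gauss–Jordan elimination and returns the indices of the recoverable message packets.

import numpy as np

q = 257

def rcef(H):
    # Reduced column echelon form of H over F_q, using column operations only.
    L = np.array(H) % q
    m, n = L.shape
    pc = 0                                    # next pivot column
    for row in range(m):
        if pc >= n:
            break
        piv = next((j for j in range(pc, n) if L[row, j]), None)
        if piv is None:
            continue                          # no pivot in this row
        L[:, [pc, piv]] = L[:, [piv, pc]]     # move the pivot column into place
        inv = pow(int(L[row, pc]), q - 2, q)  # field inverse via Fermat
        L[:, pc] = (L[:, pc] * inv) % q       # scale the leading entry to 1
        for j in range(n):                    # zero out the rest of this row
            if j != pc and L[row, j]:
                L[:, j] = (L[:, j] - L[row, j] * L[:, pc]) % q
        pc += 1
    return L

def recoverable(H):
    # Indices i such that e_i is a column of rcef(H), i.e., the message
    # packets recoverable from the batch (Proposition 1).
    Lf = rcef(H)
    m = Lf.shape[0]
    cols = {tuple(int(v) for v in Lf[:, j]) for j in range(Lf.shape[1])}
    return [i for i in range(m)
            if tuple(1 if k == i else 0 for k in range(m)) in cols]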
The next proposition shows that if a message packet cannot be recovered at a node, it cannot be recovered at any of the following nodes. Equivalently, if a message packet can be recovered at a node, it can be recovered at all the previous nodes.
Proposition 2. 
If a message packet cannot be recovered at the node u, then it cannot be recovered at the node u + on the next hop.
Proof. 
If the ith message packet cannot be recovered at the node u, by Lemma 1, we have e_i ∉ Col(H(u)). Since Col(H(u+)) = Col(H(u)Φ(u)E(u)) ⊆ Col(H(u)), we have e_i ∉ Col(H(u+)), and hence the ith message packet cannot be recovered at the node u+.    □
In general, performing elementary column operations (as used in Gauss–Jordan elimination) on the received packets of a batch does not affect the rank of the batch, and hence does not affect the decoding performance. Recovering message packets at the intermediate nodes, however, helps to increase the number of message packets received/recovered at the next hop. We use an example to illustrate this fact.
Example 1. 
Consider a line network with L = 2, M = 3 and N_1 = M at node 1. Suppose that
$$H^{(1)} = \begin{pmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & c \end{pmatrix},$$
where a, b, c ≠ 0 are elements of the base field. Using systematic recoding on H(1), no additional packets are generated and Y(1) is transmitted by node 1. When the second packet is lost on the link from node 1 to node 2, we obtain
$$H^{(2)} = \begin{pmatrix} 1 & a \\ 0 & b \\ 0 & c \end{pmatrix}.$$
At the destination node 2, we can recover only one message packet (the first). On the other hand, suppose that node 1 first applies the Gauss–Jordan elimination step, i.e., computes a matrix D such that H(1)D = I, and transmits Y(1)D instead of Y(1). If the second packet is again erased, node 2 can recover two message packets (the first and the third). Moreover, since the elimination step preserves the column space of the batch transfer matrix (Col(H(u)) = Col(H(u)D)), the rank and the number of recoverable message packets at the destination node are at least as good as in the recoding schemes without this step.
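Running the recoverability sketch above (the functions rcef and recoverable defined after Proposition 1) on Example 1, with q = 257 and a = b = c = 1 (any non-zero values behave the same way), reproduces the counts in the example:

H1 = np.array([[1, 0, 1],
               [0, 1, 1],
               [0, 0, 1]])

H2_plain = H1[:, [0, 2]]          # node 1 forwards Y(1) as-is; packet 2 lost
print(recoverable(H2_plain))      # -> [0]: only the first message packet

H1_elim = rcef(H1)                # node 1 first reduces the batch: H(1) D = I
H2_elim = H1_elim[:, [0, 2]]      # the second packet is lost again
print(recoverable(H2_elim))       # -> [0, 2]: two message packets recovered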
Note that the recovery of the message packets at an intermediate node is a linear operation on a batch and hence can be regarded as a part of the inner code. The effect of the recovery of the message packets can be captured by the coefficient vectors: the same operation applied on the received packets of a batch is applied on the coefficient vectors as well. The destination node does not need to know the exact operations at each intermediate node, but only the coefficient vectors of the received packets.

5.3. Side Information for Message Packet Recovery

We discuss some general properties involved in the recovery of message packets at the node u + , which provide guidance for the design of new inner codes. The recoverability of a message packet depends on the knowledge of H ( u ) , which is delivered by the coefficient vectors. Note that the original purpose of the coefficient vectors is for the destination node to decode the batches. A natural question to consider is the following: if more information is delivered from node u to u + , could more message packets be recovered at node u + ?
Proposition 3. 
Suppose that X, H(u), Φ(u) and E(u) in (6) are mutually independent. Then, Φ(u) and X are conditionally independent given H(u+) and Y(u+).
Proof. 
See Appendix A.    □
The above proposition states that Φ(u) and X are conditionally independent given the knowledge available at the node u+. The next proposition further shows that knowing Φ(u) at the node u+ does not help to recover more message packets there. It actually shows a stronger result: knowing any variable that is conditionally independent of X given H(u+) and Y(u+) does not help in recovering more message packets at the node u+.
Proposition 4. 
Suppose that X, H(u), Φ(u) and E(u) in (6) are mutually independent. Let S be any random variable that is conditionally independent of X given H(u+) and Y(u+). Given the instances of H(u+) and Y(u+) at the node u+, further knowing the instance of S does not help to recover more message packets at u+.
Proof. 
See Appendix A.    □
Based on the above analysis, we know that the existing coefficient vectors are sufficient for the recovery of message packets at the intermediate nodes. In other words, it is not necessary for a network node to add further information to assist the recovery of the message packets in the following nodes.

5.4. Recoding with Message Packet Protection

Let r = rank ( H ( u ) ) and V be an N u × N u matrix such that H ( u ) V is in reduced column echelon form. To recover message packets, we perform the same column operations on Y ( u ) and obtain Y ( u ) V = X H ( u ) V . If e i is the jth column of H ( u ) V , then the jth column of Y ( u ) V is equal to the ith message packet.
Let s be the number of message packets that can be recovered from Y(u). By Proposition 1, exactly s distinct columns of H(u)V are unit vectors e_i. Therefore, by proper row and column permutations, H(u)V can be brought into the form
$$\begin{pmatrix} I_s & 0 & 0 \\ 0 & I_{r-s} & 0 \\ 0 & T & 0 \end{pmatrix},$$
where I_k is the k × k identity matrix, 0 is an all-zero matrix of the proper size, and T is an (M − r) × (r − s) matrix in which no column is zero.
Denote by P the N_u × r matrix formed by the first r columns of the corresponding column permutation matrix. Then each of the first s columns of H(u)VP has exactly one non-zero entry.
We have discussed the decoding step, which is represented by V. To generate a recoded batch, however, some redundant packets also need to be generated. The following proposition states that if the random linear inner code is used at node u, then node u+ can recover almost no message packets when the number of packets it receives is smaller than rank(H(u)). Denote by ζ_k^{m,n} the probability that an m × n uniformly random matrix over F_q has rank k; see, e.g., ([23], Section 3.3.2) for a formula for ζ_k^{m,n}.
Proposition 5. 
Suppose that the random linear inner code over F_q is used at the node u and N_{u+} < r = rank(H(u)). Under the condition that e_i ∈ Col(H(u)), the probability that e_i ∈ Col(H(u+)) is
$$1 - \sum_{k=0}^{N_{u^+}} \zeta_k^{r-1,\,N_{u^+}}\, q^{k - N_{u^+}},$$
and it converges to zero as q → ∞.
Proof. 
See Appendix A.    □
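Proposition 5 can also be sanity-checked numerically. The sketch below (ours) estimates the probability that e_1 lies in the column space of an r × N_{u+} uniformly random matrix over a prime field F_q, which is the quantity that the proof in Appendix A reduces the proposition to; the estimate shrinks towards zero as q grows, as predicted.

import numpy as np

def rank_fq(A, q):
    # Rank of A over F_q by row reduction on a copy.
    A = np.array(A) % q
    m, n = A.shape
    r = 0
    for c in range(n):
        piv = next((i for i in range(r, m) if A[i, c]), None)
        if piv is None:
            continue
        A[[r, piv]] = A[[piv, r]]
        A[r] = (A[r] * pow(int(A[r, c]), q - 2, q)) % q
        for i in range(m):
            if i != r and A[i, c]:
                A[i] = (A[i] - A[i, c] * A[r]) % q
        r += 1
    return r

def prob_e1_in_colspace(q, r, n_next, trials=2000, seed=0):
    # Monte Carlo estimate of Pr(e_1 in Col(M)) for M uniform in F_q^{r x n_next}.
    rng = np.random.default_rng(seed)
    e1 = np.zeros((r, 1), dtype=int)
    e1[0, 0] = 1
    hits = 0
    for _ in range(trials):
        Mx = rng.integers(0, q, (r, n_next))
        # e_1 is in Col(Mx) iff appending it does not increase the rank.
        hits += rank_fq(np.hstack([Mx, e1]), q) == rank_fq(Mx, q)
    return hits / trials

for q in (2, 5, 17):                     # small primes; N_{u+} = 6 < r = 8
    print(q, prob_e1_in_colspace(q, r=8, n_next=6))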
Due to packet loss, it is unavoidable that the number of packets received at u+ occasionally falls below rank(H(u)). Together with Proposition 2, Proposition 5 implies that once the event N_{u+} < rank(H(u)) occurs at some node u, the destination node receives almost no message packets from a systematic batch. Random linear recoding is therefore not suitable for the recovery of message packets, and we are motivated to extend systematic inner codes for this purpose.
We propose two recoding designs that protect the message packets during recoding. We first define two recoding matrices. Suppose that s message packets are recoverable at the node u and that the rank of the batch is r.
  • Message Protection Recoding
For an integer w with 0 ≤ w ≤ N_u′ − s, let R be an r × N_u′ matrix of the form
$$R = \begin{pmatrix} I_s & U_{s,w} & U_{s,\,N_u'-s-w} \\ 0 & 0 & U_{r-s,\,N_u'-s-w} \end{pmatrix},$$
where U_{m,n} denotes an m × n uniformly random matrix (the two random blocks in the last block column jointly form the uniformly random matrix U_{r, N_u′−s−w}). The first s recoded packets are the recovered message packets themselves, the next w are random combinations of these message packets, and the remaining N_u′ − s − w are random combinations of all r packets.
  • Systematic Message Protection Recoding
For an integer w with 0 ≤ w ≤ N_u′ − s, let R_sys be an r × N_u′ matrix of the following form. When w < N_u′ − r,
$$R_{\mathrm{sys}} = \begin{pmatrix} I_s & U_{s,w} & 0 & U_{s,\,N_u'-r-w} \\ 0 & 0 & I_{r-s} & U_{r-s,\,N_u'-r-w} \end{pmatrix};$$
when w ≥ N_u′ − r,
$$R_{\mathrm{sys}} = \begin{pmatrix} I_s & U_{s,w} & 0 \\ 0 & 0 & J \end{pmatrix},$$
where J consists of the first N_u′ − w − s columns of I_{r−s}.
The inner code operations at node u consist of (i) the Gauss–Jordan elimination represented by the matrix V, (ii) the column permutation and the removal of the all-zero columns represented by the matrix P, and (iii) the (systematic) message protection recoding R (resp. R_sys). When the overall recoding matrix at node u is VPR, the inner code is called the message protection inner code (MPIC); when it is VPR_sys, the inner code is called the systematic message protection inner code (SMPIC). A small construction sketch for R_sys follows.
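As a concrete illustration of the block structure of R_sys, here is a construction sketch (ours; n_out stands for N_u′, and a prime field F_q is assumed so that uniform entries are integers modulo q):

import numpy as np

def build_R_sys(q, r, s, n_out, w, rng):
    # r x n_out recoding matrix: the s recovered message packets are sent
    # as-is, w packets protect them, and the rest carry the remaining rank.
    assert 0 <= w <= n_out - s
    R = np.zeros((r, n_out), dtype=int)
    R[:s, :s] = np.eye(s, dtype=int)                    # I_s: message packets
    R[:s, s:s + w] = rng.integers(0, q, (s, w))         # U_{s,w}: protection
    if w < n_out - r:
        R[s:r, s + w:r + w] = np.eye(r - s, dtype=int)  # I_{r-s}: remaining packets
        R[:, r + w:] = rng.integers(0, q, (r, n_out - r - w))  # U_{r, n_out-r-w}
    else:
        k = n_out - w - s                # J: the first k columns of I_{r-s}
        R[s:s + k, s + w:] = np.eye(k, dtype=int)
    return R

rng = np.random.default_rng(0)
print(build_R_sys(q=257, r=5, s=2, n_out=8, w=1, rng=rng))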
The value of w controls the level of protection of the message packets. When w = 0, no additional protection is provided for the message packets, and one can check that SMPIC has the same rank performance as the systematic inner code. When w = N_u′ − s, all the recoded packets generated by linear combinations are used to protect the message packets.
The computation cost of the proposed message protection recoding at a network node is mainly determined by (1) the Gauss–Jordan elimination for the recovery of the message packets, and (2) the generation of the recoded packets. To simplify the discussion, we only consider the case w = 0. At node u, Gauss–Jordan elimination is applied to the N_u received packets. As the packet length T is much larger than the batch size M, the cost of processing the transfer matrix H(u) itself can be ignored. Hence, when H(u) has rank r, the computation cost of recovering the message packets is about r(N_u − 1) LCOs. If the previous node also uses message protection recoding, the cost at node u is reduced, as the message packets received directly simplify the Gauss–Jordan elimination. Let s_0 be the number of message packets received directly at node u, so that s_0 ≤ s ≤ r. In this case, the computation cost of the Gauss–Jordan elimination is about (r − s_0)(N_u − 1) LCOs. For a batch of rank r, the cost of generating the recoded packets using R or R_sys is linear in the number of entries of the uniformly random sub-matrices. Therefore, the overall recoding cost of SMPIC with w = 0 is about (r − s_0)(N_u − 1) + r(N_u′ − r) LCOs. In contrast, the cost of RLIC is N_u N_u′ LCOs, and the cost of SIC is (N_u′ − N_u) N_u LCOs.

5.5. Numerical Evaluations

We perform numerical evaluations to verify the performance of the new inner codes in terms of both the average rank and the average number of recoverable message packets, and compare them with the random linear inner code (RLIC) and the systematic inner code (SIC). We use line networks of length up to 50 hops, where each link independently erases packets with probability 0.2. The batch size is M = 16 and the number of packets transmitted per batch is N_u′ = 20 for all nodes u. Since the performance of SMPIC and MPIC shows negligible differences in simulation, we only show the results for SMPIC, where we evaluate w = 0 and w = N_u′ − s as representatives.
Our numerical evaluation results are shown in Figure 7. We plot the average number of recoverable message packets and the average rank at nodes 0 to 50 for SIC, RLIC, SMPIC with w = 0 (denoted by SMPIC0) and SMPIC with w = N_u′ − s (denoted by SMPIC1), each with 500 trials. We have the following observations.
  • SIC and RLIC have almost the same rank performance. SIC has a larger number of recoverable message packets than RLIC. However, for both SIC and RLIC, the number of recoverable message packets drops quickly.
  • SMPIC0 has similar rank performance to SIC and RLIC and has a much higher average number of recoverable message packets than that of SIC and RLIC.
  • SMPIC1 has the highest average number of recoverable message packets among the four inner codes, at the cost of a reduced average rank.
The recoding computation costs at each network node are also measured in the experiments and are illustrated in Figure 8. For RLIC, as N_u′ = 20 and the expectation of N_u is 16, the recoding computation cost is about 20 × 16 = 320 LCOs. For SIC, the recoding computation cost is about (20 − 16) × 16 = 64 LCOs. The recoding computation cost of SMPIC0 also matches the formula derived above, where the expectation of s_0 is 1 − ϵ = 0.8 times the number of message packets recovered at the previous hop. Figure 8 also shows the computation cost of SMPIC1, which is close to that of SMPIC0.

6. Concluding Remarks

In this paper, we propose a design for systematic batched network codes, where the outer code is systematic and the inner code protects the systematic property during network coding. Our design of the systematic code preserves the most salient features of the BATS code. The triangular embedding approach is proposed to improve the design efficiency of the systematic outer code, and it can also be used for non-systematic outer coding to reduce the coding overhead and the computation cost.
The discussion in this paper can help to decide when and how to adopt systematic batched network codes. When the computation cost and the encoding latency are the major concerns, systematic outer codes are preferred due to their lower computation cost compared with the ordinary outer code. Whether to use message protection recoding depends on both the computation constraints and the application scenario. When the decoding computation is a sensitive resource and the intermediate nodes have spare computation capability, message protection recoding is beneficial. It is also preferred in application scenarios where part of the content can be consumed as soon as it is ready. Another useful scenario for systematic codes is a network with dynamic link qualities, where communication is reliable most of the time and serious packet loss occurs only in a small fraction of the time.
There remain many possible refinements of systematic batched network codes. This paper focused on the inner code design for unicast communications; the current inner codes designed to protect the message packets may not be suitable for achieving the multicast gain of network coding. Further study of inner code designs for multicast communication is desirable.

7. Patents

Patents resulting from this work are listed in the following:
CN115811381A 
The design framework of the systematic BATS code (including the outer code and inner code), invented by the authors of this paper, published on 17 March 2023.
CN2023105394085 
The design of the triangular embedded outer code, invented by L.M. and S.Y., filed on 15 May 2023.

Author Contributions

Conceptualization, L.M. and S.Y.; methodology, L.M. and S.Y.; implementation, L.M.; validation, L.M.; formal analysis, L.M., Y.D., X.H. and S.Y.; writing—original draft preparation, L.M. and S.Y.; writing—review and editing, L.M., Y.D., X.H. and S.Y.; visualization, L.M. and S.Y.; supervision, S.Y.; project administration, S.Y.; funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by NSFC under Grants 62171399 and 12141108.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors wish to acknowledge that the research conducted by Dong in this paper was performed while she was affiliated with CUHK-Shenzhen.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

    The following abbreviations are used in this manuscript:
BATS code   BATched Sparse code
HDPC        High-Density Parity Check
LDPC        Low-Density Parity Check
LCO         Linear Combination Operation
MPIC        Message Protection Inner Code
RLIC        Random Linear Inner Code
RLNC        Random Linear Network Codes/Random Linear Network Coding
SIC         Systematic Inner Code
SMPIC       Systematic Message Protection Inner Code

Appendix A. Proofs Regarding Recoverability of Message Packets

Proof of Proposition 1. 
For the sufficiency, if e_i is a column of L, then e_i ∈ Col(L) = Col(H(u)).
Now we prove the necessity. If e_i ∈ Col(H(u)), then e_i ∈ Col(L). Let r = rank(H(u)). By the property of the reduced column echelon form, all the zero columns are to the right of the non-zero columns of L, so the first r columns are exactly the non-zero columns of L. Then we can write e_i = Σ_{j=1}^{r} c_j l_j, where c_j is a constant and l_j is the jth non-zero column of L.
For j = 1, …, r, denote by i_j the row index of the leading 1 of l_j, which exists by the property of the reduced column echelon form. Since the (i_j, j) entry is the only non-zero entry in the i_j th row of L, we must have c_j = 0 for every j with i_j ≠ i. If there were no j with i_j = i, then e_i = 0, a contradiction. Therefore, there must be a unique j* with i_{j*} = i, and hence e_i = l_{j*}.    □
Proof of Proposition 3. 
Denote by x, y+, h+ and ϕ the instances of X, Y(u+), H(u+) and Φ(u), respectively. As P(x, ϕ | y+, h+) = P(ϕ | y+, h+) P(x | y+, h+, ϕ), to prove the claim it suffices to show that
$$P(x \mid y^+, h^+, \phi) = P(x \mid y^+, h^+) \qquad \text{(A1)}$$
for all instances x, y+, h+ and ϕ. If y+ ≠ x h+, (A1) holds as both sides are 0.
Suppose that y+ = x h+. As X, H(u), Φ(u) and E(u) are independent and H(u+) = H(u) Φ(u) E(u), we obtain
$$P(x \mid y^+, h^+, \phi) = \frac{P(x)\,P(h^+,\phi)}{P(y^+,h^+,\phi)} = \frac{P(x)\,P(h^+,\phi)}{\sum_{x': x' h^+ = y^+} P(x')\,P(h^+,\phi)} = \frac{P(x)}{\sum_{x': x' h^+ = y^+} P(x')}.$$
Similarly,
$$P(x \mid y^+, h^+) = \frac{P(x)\,P(h^+)}{P(y^+,h^+)} = \frac{P(x)\,P(h^+)}{\sum_{x': x' h^+ = y^+} P(x')\,P(h^+)} = \frac{P(x)}{\sum_{x': x' h^+ = y^+} P(x')}.$$
Therefore, (A1) holds when y+ = x h+. The proof is completed.    □
Proof of Proposition 4. 
Denote by x, y+, h+ and s the instances of X, Y(u+), H(u+) and S, respectively. It suffices to show that P(x | y+, h+, s) = P(x | y+, h+) for all instances x, s, y+ and h+. If y+ ≠ x h+, the equality holds as both sides are 0. Suppose that y+ = x h+. As X and S are conditionally independent given H(u+) and Y(u+), we obtain
$$P(x \mid y^+, h^+, s) = \frac{P(x, s \mid y^+, h^+)}{P(s \mid y^+, h^+)} = \frac{P(s \mid y^+, h^+)\,P(x \mid y^+, h^+)}{P(s \mid y^+, h^+)} = P(x \mid y^+, h^+).$$
  □
Proof of Proposition 5. 
For convenience, we omit the superscripts of H(u), Φ(u) and E(u).
Assume that H ∈ F_q^{M × N_u} is fixed with rank(H) = r and e_i ∈ Col(H). Since rank(H) = r and e_i ∈ Col(H), we can extend {e_i} to a basis of Col(H), arranged as the columns of an M × r matrix W whose first column is e_i. Then there exists a full-row-rank matrix C ∈ F_q^{r × N_u} such that H = WC and H(u+) = W C Φ E. Let Φ* = Φ E; then Φ* is an N_u × N_{u+} uniformly random matrix.
Notice that C has full row rank, so it can be written as C = K C̄, where C̄ is an invertible N_u × N_u matrix whose first r rows are C, and K consists of the first r rows of the N_u × N_u identity matrix. Since C̄ Φ* is still an N_u × N_{u+} uniformly random matrix, C Φ* = K C̄ Φ* is an r × N_{u+} uniformly random matrix. In the following, let M = C Φ*. Since H(u+) = W M and W has full column rank with e_i as its first column, we have e_i ∈ Col(H(u+)) if and only if e_1 ∈ Col(M). Let m^T be the first row of M and let M̃ be the submatrix of M with the first row removed. Then e_1 ∈ Col(M) is equivalent to the existence of x such that M̃ x = 0 and m^T x ≠ 0; in other words, m is not orthogonal to Null(M̃).
When M̃ has rank k, Null(M̃) has dimension N_{u+} − k and its orthogonal complement has dimension k, so the probability that m is not orthogonal to Null(M̃) is 1 − q^k / q^{N_{u+}}.
Therefore, the probability of e_1 ∈ Col(M) is
$$\Pr(e_1 \in \mathrm{Col}(M)) = \sum_{k=0}^{N_{u^+}} \zeta_k^{r-1,\,N_{u^+}} \left(1 - q^{k - N_{u^+}}\right) = 1 - \sum_{k=0}^{N_{u^+}} \zeta_k^{r-1,\,N_{u^+}}\, q^{k - N_{u^+}}.$$
Observe that
$$\Pr(e_1 \in \mathrm{Col}(M)) = \sum_{k=0}^{N_{u^+}-1} \zeta_k^{r-1,\,N_{u^+}} \left(1 - q^{k - N_{u^+}}\right) \le \sum_{k=0}^{N_{u^+}-1} \zeta_k^{r-1,\,N_{u^+}} = 1 - \zeta_{N_{u^+}}^{r-1,\,N_{u^+}} = 1 - \prod_{i=0}^{N_{u^+}-1}\left(1 - q^{-r+1+i}\right).$$
Since ∏_{i=0}^{N_{u+}−1} (1 − q^{−r+1+i}) → 1 as q → ∞, we conclude that Pr(e_1 ∈ Col(M)) → 0 as q → ∞.    □

Appendix B. BATS Code Parameters Used in Numerical Experiments

For the numerical experiments of the BATS outer code in Section 3.4 and Section 4.3, we use the BATS code with the following parameters.
  • The batch size M is 16.
  • We use the degree distribution Ψ asymptotically optimized for the rank-M rank distribution. The non-zero entries of Ψ are listed in Table A1.
  • The following formula determines the number of LDPC packets:
    $$\begin{cases} 0.0101K + 3\sqrt{K}, & K < 20{,}000, \\ 0.0101K + 4\sqrt{K}, & \text{otherwise}. \end{cases}$$
  • The number of HDPC packets is max ( ln ( K ) , 5 ) .
  • The decoder has a limit on the number of inactivated packets and the limit is 150.
Table A1. The non-zero entries of the degree distribution used in the numerical experiments.

Ψ_17 = 0.0588, Ψ_18 = 0.0571, Ψ_19 = 0.0245, Ψ_20 = 0.0899, Ψ_23 = 0.1170, Ψ_27 = 0.0921, Ψ_31 = 0.0678, Ψ_35 = 0.0679,
Ψ_43 = 0.0608, Ψ_45 = 0.0604, Ψ_63 = 0.0671, Ψ_73 = 0.0671, Ψ_123 = 0.0599, Ψ_126 = 0.0222, Ψ_239 = 0.0457.

Appendix C. Pseudocodes for BATS Outer Encoding and Decoding

Algorithm A1 is the pseudocode for the encoding of the BATS outer code, and Algorithm A2 is the pseudocode for the two-step decoding of the BATS outer code.
Algorithm A1: The encoding process of the BATS outer code.
     Input:
  • B: all the input packets.
  • range: a range of batch indices.
     Output:
  • X: an array of the generated batches.
 1: procedure ENC(B, range)
 2:     B_p ← solve the precode constraint [B  B_p] P = 0
 3:     B ← [B  B_p]
 4:     B_A, B_I ← split B into active packets and inactive packets
 5:     X ← [ ]                            ▹ initialize an array for the generated batches
 6:     for i in range do                  ▹ lines 7 to 13 generate the batch with key i
 7:         use i as the seed of the pseudorandom number generator
 8:         d_i^A ← sample a degree from the degree distribution Ψ
 9:         d_i^B ← randomly choose an inactive degree
10:         B_i ← randomly choose d_i^A packets from B_A and d_i^B packets from B_I
11:         G_i ← a (d_i^A + d_i^B) × M matrix with independently uniform entries
12:         X_i ← B_i G_i
13:         append X_i to the end of X
14:     return X
Algorithm A2: The two-step decoding process of the BATS outer code.
     Input:
  • Y_1, Y_2, …, Y_n: an array of n batches.
  • H_1, H_2, …, H_n: an array of the batch transfer matrices of the n batches.
     Output:
  • B: the recovered input packets.
 1: procedure DEC(Y_1, …, Y_n, H_1, …, H_n)
 2:     Ω ← {1, 2, …, n}                   ▹ the index set of unsolved batches
 3:     for i in 1, …, n do
 4:         use i as the seed of the pseudorandom number generator
 5:         d_i^A ← sample a degree from the degree distribution Ψ
 6:         G_i ← a d_i^A × M matrix with independently uniform entries
 7:     while ∃ i ∈ Ω such that d_i^A = rank(G_i H_i) do
 8:         B_i ← solve the system B_i G_i H_i = Y_i
 9:         for b in B_i do
10:             mark b as decoded
11:             for j in the indices of the other batches with b ∈ B_j do
12:                 update the batch equation B_j G_j H_j = Y_j by canceling b from it
13:         remove i from Ω
14:     B, B_p ← decode the precode using the decoded packets
15:     return B

References

  1. Ahlswede, R.; Cai, N.; Li, S.Y.R.; Yeung, R.W. Network information flow. IEEE Trans. Inform. Theory 2000, 46, 1204–1216.
  2. Li, S.Y.R.; Yeung, R.W.; Cai, N. Linear network coding. IEEE Trans. Inform. Theory 2003, 49, 371–381.
  3. Koetter, R.; Medard, M. An Algebraic Approach to Network Coding. IEEE/ACM Trans. Netw. 2003, 11, 782–795.
  4. Ho, T.; Leong, B.; Medard, M.; Koetter, R.; Chang, Y.; Effros, M. The benefits of coding over routing in a randomized setting. In Proceedings of the IEEE International Symposium on Information Theory (ISIT ’03), Yokohama, Japan, 29 June–4 July 2003.
  5. Ho, T.; Médard, M.; Koetter, R.; Karger, D.R.; Effros, M.; Shi, J.; Leong, B. A Random Linear Network Coding Approach to Multicast. IEEE Trans. Inform. Theory 2006, 52, 4413–4430.
  6. Jaggi, S.; Chou, P.A.; Jain, K. Low complexity optimal algebraic multicast codes. In Proceedings of the International Symposium on Information Theory (ISIT ’03), Yokohama, Japan, 29 June–4 July 2003.
  7. Sanders, P.; Egner, S.; Tolhuizen, L. Polynomial time algorithms for network information flow. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA ’03), San Diego, CA, USA, 7–9 June 2003; pp. 286–294.
  8. Wu, Y. A Trellis Connectivity Analysis of Random Linear Network Coding with Buffering. In Proceedings of the 2006 IEEE International Symposium on Information Theory, Seattle, WA, USA, 9–14 July 2006; pp. 768–772.
  9. Dana, A.F.; Gowaikar, R.; Palanki, R.; Hassibi, B.; Effros, M. Capacity of wireless erasure networks. IEEE Trans. Inform. Theory 2006, 52, 789–804.
  10. Yeung, R.W. Avalanche: A network coding analysis. Commun. Inf. Syst. 2007, 7, 353–358.
  11. Chou, P.A.; Wu, Y.; Jain, K. Practical Network Coding. In Proceedings of the Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 29 September–1 October 2004. Invited paper.
  12. de Alwis, C.; Kodikara Arachchi, H.; Fernando, A.; Kondoz, A. Towards minimising the coefficient vector overhead in random linear Network Coding. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 5127–5131.
  13. Silva, D. Minimum-overhead network coding in the short packet regime. In Proceedings of the 2012 International Symposium on Network Coding (NetCod), Cambridge, MA, USA, 29–30 June 2012; pp. 173–178.
  14. Gligoroski, D.; Kralevska, K.; Øverby, H. Minimal header overhead for random linear network coding. In Proceedings of the 2015 IEEE International Conference on Communication Workshop (ICCW), London, UK, 8–12 June 2015; pp. 680–685.
  15. Silva, D.; Zeng, W.; Kschischang, F.R. Sparse Network Coding with Overlapping Classes. In Proceedings of the 2009 Workshop on Network Coding, Theory, and Applications (NetCod ’09), Lausanne, Switzerland, 15–16 June 2009; pp. 74–79.
  16. Heidarzadeh, A.; Banihashemi, A.H. Overlapped Chunked network coding. In Proceedings of the 2010 IEEE Information Theory Workshop on Information Theory (ITW ’10), Cairo, Egypt, 6–8 January 2010; pp. 1–5.
  17. Li, Y.; Soljanin, E.; Spasojevic, P. Effects of the Generation Size and Overlap on Throughput and Complexity in Randomized Linear Network Coding. IEEE Trans. Inform. Theory 2011, 57, 1111–1123.
  18. Mahdaviani, K.; Ardakani, M.; Bagheri, H.; Tellambura, C. Gamma Codes: A low-overhead linear-complexity network coding solution. In Proceedings of the 2012 International Symposium on Network Coding (NetCod), Cambridge, MA, USA, 29–30 June 2012; pp. 125–130.
  19. Yang, S.; Yeung, R.W. Batched Sparse Codes. IEEE Trans. Inform. Theory 2014, 60, 5322–5346.
  20. Tang, B.; Yang, S.; Yin, Y.; Ye, B.; Lu, S. Expander Chunked Codes. EURASIP J. Adv. Signal Process. 2015, 2015, 106.
  21. Tang, B.; Yang, S. An LDPC Approach for Chunked Network Codes. IEEE/ACM Trans. Netw. 2018, 26, 605–617.
  22. Yang, S.; Yeung, R.W. Network Communication Protocol Design from the Perspective of Batched Network Coding. IEEE Commun. Mag. 2022, 60, 89–93.
  23. Yang, S.; Yeung, R.W. BATS Codes: Theory and Practice; Synthesis Lectures on Communication Networks; Morgan & Claypool Publishers: Williston, VT, USA, 2017.
  24. MacWilliams, F.; Sloane, N. The Theory of Error-Correcting Codes; North-Holland Publishing: Amsterdam, The Netherlands, 1978.
  25. Versfeld, D.J.; Ridley, J.N.; Ferreira, H.C.; Helberg, A.S. On systematic generator matrices for Reed–Solomon codes. IEEE Trans. Inform. Theory 2010, 56, 2549–2550.
  26. Luby, M.; Shokrollahi, A.; Watson, M.; Stockhammer, T.; Minder, L. RaptorQ Forward Error Correction Scheme for Object Delivery–RFC 6330; Internet Engineering Task Force: Fremont, CA, USA, 2011.
  27. Arikan, E. Systematic polar coding. IEEE Commun. Lett. 2011, 15, 860–862.
  28. Badr, A.; Khisti, A.; Tan, W.T.; Apostolopoulos, J. Perfecting Protection for Interactive Multimedia: A survey of forward error correction for low-delay interactive applications. IEEE Signal Process. Mag. 2017, 34, 95–113.
  29. Garcia-Saavedra, A.; Karzand, M.; Leith, D.J. Low Delay Random Linear Coding and Scheduling Over Multiple Interfaces. IEEE Trans. Mob. Comput. 2017, 16, 3100–3114.
  30. Li, Y.; Zhang, F.; Wang, J.; Quek, T.Q.S.; Wang, J. On Streaming Coding for Low-Latency Packet Transmissions Over Highly Lossy Links. IEEE Commun. Lett. 2020, 24, 1885–1889.
  31. Prior, R.; Rodrigues, A. Systematic network coding for packet loss concealment in broadcast distribution. In Proceedings of the International Conference on Information Networking 2011 (ICOIN2011), Kuala Lumpur, Malaysia, 26–28 January 2011; pp. 245–250.
  32. Lucani, D.E.; Medard, M.; Stojanovic, M. On Coding for Delay—Network Coding for Time-Division Duplexing. IEEE Trans. Inform. Theory 2012, 58, 2330–2348.
  33. Yu, M.; Aboutorab, N.; Sadeghi, P. From Instantly Decodable to Random Linear Network Coded Broadcast. IEEE Trans. Commun. 2014, 62, 3943–3955.
  34. Gabriel, F.; Wunderlich, S.; Pandi, S.; Fitzek, F.H.P.; Reisslein, M. Caterpillar RLNC With Feedback (CRLNC-FB): Reducing Delay in Selective Repeat ARQ Through Coding. IEEE Access 2018, 6, 44787–44802.
  35. Phung, C.V.; Engelmann, A.; Jukan, A. Error Correction with Systematic RLNC in Multi-Channel THz Communication Systems. In Proceedings of the 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO), Opatija, Croatia, 28 September–2 October 2020; pp. 512–517.
  36. Karetsi, F.; Papapetrou, E. A Low Complexity Network-Coded ARQ protocol for Ultra-Reliable Low Latency Communication. In Proceedings of the 2021 IEEE 22nd International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), Pisa, Italy, 7–11 June 2021; pp. 11–20.
  37. Shokrollahi, A.; Luby, M. Raptor Codes. Found. Trends Commun. Inf. Theory 2011, 6, 213–322.
  38. Tasdemir, E.; Tömösközi, M.; Cabrera, J.A.; Gabriel, F.; You, D.; Fitzek, F.H.P.; Reisslein, M. SpaRec: Sparse Systematic RLNC Recoding in Multi-Hop Networks. IEEE Access 2021, 9, 168567–168586.
  39. Pandi, S.; Gabriel, F.; Cabrera, J.A.; Wunderlich, S.; Reisslein, M.; Fitzek, F.H.P. PACE: Redundancy Engineering in RLNC for Low-Latency Communication. IEEE Access 2017, 5, 20477–20493.
  40. Wunderlich, S.; Gabriel, F.; Pandi, S.; Fitzek, F.H.P.; Reisslein, M. Caterpillar RLNC (CRLNC): A Practical Finite Sliding Window RLNC Approach. IEEE Access 2017, 5, 20183–20197.
  41. Lucani, D.E.; Pedersen, M.V.; Ruano, D.; Sørensen, C.W.; Fitzek, F.H.P.; Heide, J.; Geil, O.; Nguyen, V.; Reisslein, M. Fulcrum: Flexible Network Coding for Heterogeneous Devices. IEEE Access 2018, 6, 77890–77910.
  42. Nguyen, V.; Tasdemir, E.; Nguyen, G.T.; Lucani, D.E.; Fitzek, F.H.P.; Reisslein, M. DSEP Fulcrum: Dynamic Sparsity and Expansion Packets for Fulcrum Network Coding. IEEE Access 2020, 8, 78293–78314.
  43. Tasdemir, E.; Nguyen, V.; Nguyen, G.T.; Fitzek, F.H.P.; Reisslein, M. FSW: Fulcrum sliding window coding for low-latency communication. IEEE Access 2022, 10, 54276–54290.
  44. Compta, P.T.; Fitzek, F.H.P.; Lucani, D.E. Network Coding is the 5G Key Enabling Technology: Effects and Strategies to Manage Heterogeneous Packet Lengths. Trans. Emerg. Telecommun. Technol. 2015, 6, 46–55.
  45. Yang, S. Simbats. 2015. Available online: https://github.com/shhyang/simbats (accessed on 10 June 2022).
Figure 1. A line network of length L. Node 0 is the source node, and node L is the destination node. The directed edge from node i to node i + 1 (i = 0, 1, …, L − 1) represents a network link.
Figure 2. Illustration of the approach to systematic outer codes. (a) shows the normal use of a consistent pair of the outer code encoder ENC n s and decoder DEC n s . (b) shows the encoding of the systematic code, where ENC n s + is the outer code encoder that generates the coded packets beyond the first n s batches. (c) shows the decoding of the systematic code, where Y is the received coded packets generated by inner coding.
Figure 3. Illustration of the encoding computation cost for the ordinary outer code and the systematic outer code. (a) shows the encoding computation cost of generating the first k coded packets. For the ordinary outer code, the computation cost increases linearly with k. For the systematic outer code, the computation cost is 0 when k K . The jump in the computation cost after time K is used to execute DEC n s * . (b) illustrates the number of encoded packets generated over time. The curve “systematic-1” is for the systematic outer code encoder when the message packets arrive one-by-one in each unit time. The curve “systematic-2” is for the systematic outer code encoder when the message packets arrive all at time K. From time K, these two curves overlap. The ordinary outer code behaves in the same way for both message packet arrival models.
Figure 4. Illustration of the decoding computation costs of the ordinary outer code and the systematic outer code. (a) illustrates the decoding computation cost for different numbers K m of message packets received. For the ordinary outer code, the decoding computation cost is c. For the systematic outer code, the decoding computation cost is approximately 2 c K m b when K m < K and 0 when K m = K . (b) shows the number of decoded packets over time. The three curves labeled K m = K , K / 2 , 0 are for the systematic outer code decoder with K m message packets received.
Figure 5. Illustration of encoding using triangular embedding. The gray part contains non-zero entries and the white part contains only zero.
Figure 6. Illustration of decoding using triangular embedding. The gray part contains non-zero entries and the white part contains only zero.
Figure 7. The average number of recovered message packets and the average rank at nodes 0 to 50 for SIC, RLIC, SMPIC with w = 0 (denoted by SMPIC0) and SMPIC with w = N_u′ − s (denoted by SMPIC1), each with 500 trials.
Figure 8. The average number of linear combination operations at node 0 to node 49 for SIC, RLIC, SMPIC with w = 0 (denoted by SMPIC0) and SMPIC with w = N_u′ − s (denoted by SMPIC1), each with 500 trials.
Table 1. Experiments of random design. Here, M = 16 and h has rank M. For each value of K = 10M, 100M, 200M, 400M, 600M, 5000 instances of the ordinary outer code are sampled. As a BATS code is rateless, for each instance, we can try a range of values of n_s. The table gives the number of consistent instances when n_s = K/M, K/M + 1, K/M + 2, and n_s ≥ K/M + 3.

n_s − K/M   K = 10M   K = 100M   K = 200M   K = 400M   K = 600M
0           4784      3552       836        4          0
1           173       175        50         0          0
2           34        113        53         0          0
≥3          9         1160       4061       4996       5000
Table 2. Experiments using triangular embedding for consistent outer codes. Here, M = 16 and h has rank M with probability 1. For each value of K = 10M, 100M, 1000M, in total, 5000 instances of the triangular embedded outer code are tested. The table gives the number of consistent instances.

                   K = 10M   K = 100M   K = 1000M
zero overhead      4978      4983       4977
max total inact.   27        91         149
Table 3. Joint decoding of batches generated by triangular embedding and the ordinary outer code. Here, M = 16 and h has rank M with probability 1. In our experiments, each batch generated by triangular embedding has a probability ϵ of being discarded, and the remaining batches are sent to the decoder. Following the batches generated by triangular embedding, batches generated by the ordinary outer code are also sent to the decoder. For each value of K = 10M, 100M, 200M, 400M, 600M and ϵ = 0.1, 0.3, 0.5, in total, 5000 instances are tested.

Coding Overhead   K = 10M   K = 100M   K = 200M   K = 400M   K = 600M
(a) ϵ = 0.1
0                 4982      4955       3446       3          0
1 – M             18        26         69         0          0
M+1 – 2M          0         3          51         0          0
> 2M              0         16         1434       4997       5000
(b) ϵ = 0.3
0                 4983      4511       1037       17         0
1 – M             17        73         31         0          0
M+1 – 2M          0         39         25         1          0
> 2M              0         377        3907       4982       5000
(c) ϵ = 0.5
0                 4987      4155       1823       5          0
1 – M             13        107        101        1          0
M+1 – 2M          0         61         72         1          0
> 2M              0         677        3004       4993       5000