1. Introduction
Information is one of the most valuable resources of our time. Its transmission has always been subject to precision problems: besides the obstacles between the transmitter and the receiver, disruptions can occur at any point in the process. The physical media and channels involved in the exchange are imperfect and subject to errors that may result in the loss of essential data.
Among the different kinds of channels, there is the erasure channel, in which each symbol either arrives correctly or is erased, so the receiver knows whether a received symbol is correct. In electronic communication, especially over the Internet, messages are transmitted in packets, and each packet comes with a checksum. The receiver knows that a packet is correct when its checksum is correct. Otherwise, the packet has been corrupted or lost in the course of transmission.
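The checksum mechanism described above can be illustrated with a minimal sketch. It assumes CRC-32 as the checksum (the text does not name one), and the function names `make_packet`, `receive`, and `erasure_channel` are hypothetical helpers introduced only for this illustration.

```python
import random
import zlib

def make_packet(payload: bytes) -> bytes:
    """Append a CRC-32 checksum (4 bytes, big-endian) to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def receive(packet):
    """Return the payload if the checksum matches, otherwise None (an erasure)."""
    if packet is None or len(packet) < 4:
        return None
    payload, tag = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == tag:
        return payload
    return None

def erasure_channel(packets, loss_prob, rng):
    """Drop each packet independently with probability loss_prob."""
    return [None if rng.random() < loss_prob else p for p in packets]
```

From the decoder's point of view, a packet that fails the check and a packet that never arrives are treated alike: both are erasures at a known position.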
Convolutional codes are error-correcting codes that generate redundant bits through the convolution of input data with a series of generator polynomials. This process protects the encoded data against errors during transmission over noisy channels. They possess a certain degree of flexibility due to their “sliding window” feature: the received information can be grouped into blocks or windows in many ways. Over erasure channels, convolutional codes can correct more erasures than classical block codes.
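As a minimal sketch of the convolution with generator polynomials, the following binary encoder assumes the generators $1 + z + z^2$ and $1 + z^2$ (a common textbook choice, not one taken from this paper). The function name `conv_encode` is hypothetical.

```python
def conv_encode(bits, generators=((1, 1, 1), (1, 0, 1))):
    """Encode a bit list with a rate 1/len(generators) binary convolutional code.

    Each generator is a tuple of taps applied to the current input bit and
    the previous ones, i.e. a sliding window over the input.
    """
    memory = max(len(g) for g in generators) - 1
    padded = [0] * memory + list(bits)
    out = []
    for t in range(len(bits)):
        # Current bit first, then the older bits held in memory.
        window = padded[t:t + memory + 1][::-1]
        for g in generators:
            out.append(sum(gi * wi for gi, wi in zip(g, window)) % 2)
    return out
```

Each input bit produces two output bits, so the rate of this example is 1/2, and every output depends on the current bit and the two preceding ones.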
Convolutional codes, with the advantage of being treated by the theory of linear systems, provide attractive opportunities in regulating the transmission of information. In particular, decoding a convolutional code can be viewed as finding the trajectory of the corresponding linear system that is, in some sense, closest to the received data. Garcia-Planas, Souidi, and Um presented in [1] a decoding algorithm for convolutional codes using linear systems theory. Recently, Lieb and Rosenthal in [2] presented a new algorithm using the output observability character introduced in [1]; this new algorithm has the advantage that it can minimize both the decoding delay and the computational effort in the erasure recovery process.
Among convolutional codes, there is a specific class called MDP convolutional codes introduced in [3], which are constructed to achieve the maximum possible minimum distance at every time step. This means that the distance properties of the code are optimized to ensure the highest error correction capability. Due to their optimal distance properties, these kinds of codes offer superior error correction capabilities compared to non-MDP convolutional codes with the same parameters. MDP convolutional codes are particularly well-suited for use in sequential decoding algorithms and widely used in communication systems that demand high error resilience, including deep-space communication, satellite communication, and wireless networks.
Compared to other erasure decoding algorithms for convolutional codes available in the literature (see, for example, [4,5,6]), our proposed systems-theoretic approach offers the advantage of reduced computational effort and decoding delay. This is due to the treatment through linear systems theory using an input–state–output description, which enables the use of the concept of output observability.
While surveying the literature on MDP convolutional code decoding over erasure channels, the authors of this work found only a few recent articles on decoding convolutional codes. Among them, it is worth highlighting the work of Martín and Plaza in [7], which presented an explicit implementation of a complex decoding algorithm for a class of convolutional codes based on algebraic techniques, together with an analysis of the complexity of the resulting algorithm; the work of Pinto et al. [8], which proposed the construction of two-dimensional convolutional codes based on the existing decoding algorithms for one-dimensional codes (so it may be interesting to improve the decoding criteria); and the work of Nowosielski et al. [9], which proposed an algorithm for the automatic recognition of convolutional codes based on the operation of the Viterbi decoder; the existence of new algorithms can facilitate such recognition.
This paper is structured as follows. A preliminary section introduces the concept of the convolutional code from the linear system theory point of view. The following section presents the control properties of controllability, observability, and output observability, which help to describe an iterative decoding algorithm for convolutional codes. Next, a case of convolutional codes, the so-called MDP convolutional codes, is introduced. Finally, a decoding procedure for this kind of code is analyzed, and a conclusion is presented.
2. Preliminaries
We consider a finite set of symbols $\mathbb{F}$, called the alphabet, with q elements. The information to be processed and the codewords will be expressed with symbols from this alphabet. The set $\mathbb{F}$ is structured as a finite field (in particular, the size q of the alphabet is a power of a prime number).
A convolutional code is a type of error-correcting code in which each k-bit information symbol (each k-bit string) to be encoded is transformed into an n-bit symbol, where the transformation is a function of the most recent information symbols held in the memory of the physical encoder.
Following Rosenthal and York [10], a convolutional code is defined as a submodule of $\mathbb{F}^n[z]$.
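Definition 1. Let $\mathcal{C} \subseteq \mathbb{F}^n[z]$ be a code. Then, $\mathcal{C}$ is a convolutional code if and only if $\mathcal{C}$ is an $\mathbb{F}[z]$-submodule of $\mathbb{F}^n[z]$.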
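Corollary 1. There exists an injective morphism of modules
$$\varphi : \mathbb{F}^k[z] \longrightarrow \mathbb{F}^n[z], \qquad u(z) \longmapsto v(z),$$
whose image is $\mathcal{C}$. Equivalently, there exists a polynomial matrix $G(z)$ (called encoder) of order $n \times k$ and having maximal rank such that
$$\mathcal{C} = \{ v(z) \in \mathbb{F}^n[z] \mid \exists\, u(z) \in \mathbb{F}^k[z] : v(z) = G(z)u(z) \}.$$
The ratio $k/n$ between the number k of input bits and the number n of output bits per encoding operation is known as the rate of the convolutional code, and it is a crucial parameter in communication systems because it directly impacts the efficiency and reliability of data transmission.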
We denote by $\nu_i$ the maximum of the degrees of the polynomials in the $i$-th line of $G(z)$. We define the complexity of the encoder as $\delta = \sum_i \nu_i$, which measures the influence of past inputs on the present output of the encoder. Finally, we define the complexity of the convolutional code as the maximum of the degrees of the largest minors of $G(z)$.
The representation of a code using a polynomial matrix is not unique; however, we have the following proposition.
Proposition 1. Two rational encoders $G_1(z)$, $G_2(z)$ define the same convolutional code if and only if there is a unimodular matrix $U(z)$ such that $G_2(z) = G_1(z)U(z)$.
After a suitable permutation of the rows, we can assume that the generator matrix $G(z)$ is of the form
$$G(z) = \begin{pmatrix} P(z) \\ Q(z) \end{pmatrix}$$
with right coprime polynomial factors $P(z)$ and $Q(z)$, respectively.
A description of convolutional codes can be provided by a time-invariant discrete linear system, called a discrete-time state-space system in control theory (see [11]). We want to note that linear systems theory is quite general and permits all kinds of time axes and signal spaces.
The linear dynamical system representation of the convolutional code is easily obtained from the following results.
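Theorem 1. Let $\mathcal{C} \subseteq \mathbb{F}^n[z]$ be an $(n,k)$-convolutional code of complexity δ. Then, there exist matrices K, L of size $(\delta + n - k) \times \delta$ and a matrix M of size $(\delta + n - k) \times n$ having their coefficients in $\mathbb{F}$ such that the code $\mathcal{C}$ is defined as
$$\mathcal{C} = \{ v(z) \in \mathbb{F}^n[z] \mid \exists\, x(z) \in \mathbb{F}^{\delta}[z] : zKx(z) + Lx(z) + Mv(z) = 0 \}.$$
Moreover, K is a column full-rank matrix, $(K \mid M)$ is a row full-rank matrix, and $\operatorname{rank}(z_0K + L \mid M) = \delta + n - k$, $\forall z_0 \in \overline{\mathbb{F}}$. A triple $(K, L, M)$ satisfying the above is called a minimal representation of $\mathcal{C}$.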
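Proposition 2. Let $(\bar{K}, \bar{L}, \bar{M})$ be another representation of the convolutional code $\mathcal{C}$. Then, there exist invertible matrices T and S of adequate sizes such that
$$(\bar{K}, \bar{L}, \bar{M}) = (TKS^{-1}, TLS^{-1}, TM).$$
We have the following corollary.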
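Corollary 2. The triple $(K, L, M)$ can be written as
$$K = \begin{pmatrix} -I_{\delta} \\ 0 \end{pmatrix}, \qquad L = \begin{pmatrix} A \\ C \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & B \\ -I_{n-k} & D \end{pmatrix}.$$
From this form, we deduce the following.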
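Now, if we divide the vector $v(z)$ into two parts, $v(z) = \begin{pmatrix} y(z) \\ u(z) \end{pmatrix}$, the equality
$$zKx(z) + Lx(z) + Mv(z) = 0$$
can be expressed as
$$zx(z) = Ax(z) + Bu(z), \qquad y(z) = Cx(z) + Du(z).$$
Finally, applying the inverse Z-transform, we obtain the system
$$x_{t+1} = Ax_t + Bu_t, \qquad y_t = Cx_t + Du_t, \qquad v_t = \begin{pmatrix} y_t \\ u_t \end{pmatrix}, \qquad x_0 = 0. \qquad (4)$$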
Remark 1. The vectors $u_t$, $x_t$, $y_t$, and $v_t$ are known as the information vector, state vector, parity vector, and the code vector transmitted via the communication channel, respectively.
We identify this system with the matrix quadruple $(A, B, C, D)$. The function $T(z) = C(zI - A)^{-1}B + D$ is called the transfer function of the linear system.
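To make the input–state–output description concrete, here is a minimal sketch that runs the recursion $x_{t+1} = Ax_t + Bu_t$, $y_t = Cx_t + Du_t$ from $x_0 = 0$ over a prime field GF(p). The restriction to a prime field, the list-of-lists matrix format, and the names `mat_vec`, `vec_add`, and `encode_iso` are assumptions made for this illustration.

```python
def mat_vec(M, v, p):
    """Matrix-vector product over the prime field GF(p)."""
    return [sum(a * b for a, b in zip(row, v)) % p for row in M]

def vec_add(a, b, p):
    """Componentwise sum of two vectors over GF(p)."""
    return [(x + y) % p for x, y in zip(a, b)]

def encode_iso(A, B, C, D, inputs, p):
    """Run x_{t+1} = A x_t + B u_t, y_t = C x_t + D u_t starting at x_0 = 0.

    Returns the code vectors v_t = (y_t, u_t) and the final state.
    """
    x = [0] * len(A)
    code = []
    for u in inputs:
        y = vec_add(mat_vec(C, x, p), mat_vec(D, u, p), p)
        code.append(y + list(u))          # parity part first, then information
        x = vec_add(mat_vec(A, x, p), mat_vec(B, u, p), p)
    return code, x
```

For example, with `A = [[2]]`, `B = [[1]]`, `C = [[3]]`, `D = [[1]]` over GF(7), the inputs `[[1], [0], [0]]` yield the code vectors `[[1, 1], [3, 0], [6, 0]]` and the final state `[4]`.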
The distance concepts are fundamental in coding theory.
Definition 2. Let $x, y \in \mathbb{F}^n$ be vectors. The Hamming distance $d_H(x, y)$ is defined to be the number of components in which x and y differ. The weight $\operatorname{wt}(x)$ of x is defined to be the number of nonzero components of x.
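Obviously, we have that $d_H(x, y) = \operatorname{wt}(x - y)$. When $v(z) = \sum_{t} v_t z^t$ is a codeword, its weight is defined as $\operatorname{wt}(v(z)) = \sum_{t} \operatorname{wt}(v_t)$.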
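The distance notions above translate directly into code. The following sketch uses hypothetical helper names (`weight`, `hamming_distance`, `codeword_weight`) and represents field elements as integers, with a codeword given as the list of its coefficient vectors $v_t$.

```python
def weight(x):
    """Number of nonzero components of the vector x."""
    return sum(1 for c in x if c != 0)

def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

def codeword_weight(blocks):
    """Weight of a codeword given as the list of its vectors v_t."""
    return sum(weight(v) for v in blocks)
```

For instance, over GF(5) the vectors `[1, 0, 3, 4]` and `[1, 2, 3, 0]` are at distance 2, which equals the weight of their difference `[0, 3, 0, 4]`.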
For convolutional codes, several distance measures are suitable for assessing the error correction capability in various scenarios. In particular, the free distance and the column distances are two of the most notable distance measures for convolutional codes [12].
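Definition 3. The j-th column distance of the code C is defined as
$$d_j^c = \min \Big\{ \sum_{t=0}^{j} \operatorname{wt}(v_t) \;\Big|\; u_0 \neq 0 \Big\},$$
where the minimum is taken over all trajectories of the system (4) with initial vector $x_0 = 0$. It is easy to observe that $d_0^c \leq d_1^c \leq d_2^c \leq \cdots$ and hence there is an integer r such that $d_j^c = d_r^c$ for all $j \geq r$. This largest possible column distance is called free distance: $d_{\mathrm{free}} = \lim_{j \to \infty} d_j^c$.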
Proposition 3. For every $j \in \mathbb{N}_0$, it is verified that
$$d_j^c \leq (n - k)(j + 1) + 1.$$
For the proof, it suffices to consider the definition of the column distance.
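For very small codes, the column distances can be computed by exhaustive search, which gives a way to check the upper bound $(n-k)(j+1)+1$ numerically. The sketch below assumes a prime field GF(p), complexity $\delta \geq 1$, and trajectories starting at $x_0 = 0$ with $u_0 \neq 0$; the names `step` and `column_distance` are hypothetical.

```python
from itertools import product

def step(A, B, C, D, x, u, p):
    """One step of the system over GF(p): returns (y_t, x_{t+1})."""
    y = [(sum(c * s for c, s in zip(rc, x)) + sum(d * s for d, s in zip(rd, u))) % p
         for rc, rd in zip(C, D)]
    nxt = [(sum(a * s for a, s in zip(ra, x)) + sum(b * s for b, s in zip(rb, u))) % p
           for ra, rb in zip(A, B)]
    return y, nxt

def column_distance(A, B, C, D, j, p):
    """Brute-force j-th column distance: minimal weight of v_0..v_j with u_0 != 0."""
    k = len(D[0])
    best = None
    for flat in product(range(p), repeat=k * (j + 1)):
        inputs = [list(flat[i * k:(i + 1) * k]) for i in range(j + 1)]
        if not any(inputs[0]):
            continue  # only trajectories with u_0 != 0 are considered
        x, w = [0] * len(A), 0
        for u in inputs:
            y, x = step(A, B, C, D, x, u, p)
            w += sum(1 for s in y + u if s != 0)
        best = w if best is None else min(best, w)
    return best
```

With `A = [[2]]`, `B = [[1]]`, `C = [[3]]`, `D = [[1]]` over GF(7) (so $n = 2$, $k = 1$), the first three column distances are 2, 3, and 4, each meeting the bound. The search is exponential in $j$ and is only meant for toy examples.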
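Definition 5. A sequence $\{ v_t = (y_t, u_t) \}_{t \geq 0}$ represents a finite-weight codeword if
- Equation (4) is satisfied for all $t \in \mathbb{N}_0$;
- There is a $\gamma \in \mathbb{N}_0$ such that $x_{\gamma+1} = 0$ and $u_t = 0$ for $t \geq \gamma + 1$.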
3. Control Properties
In linear systems theory, the major concepts are controllability and observability. These concepts were introduced by R. Kalman in 1960 [13]. Roughly speaking, controllability means the possibility of steering the system from any initial state to any final one by means of a control signal at the input, whereas observability means the possibility of identifying the internal state of a system from measurements of the outputs. We introduce these concepts as well as the notion of output observability [11].
Definition 6. A linear system $(A, B, C, D)$ is a controllable system if the controllability matrix associated with the system
$$\Phi_{\delta}(A, B) = \begin{pmatrix} B & AB & \cdots & A^{\delta-1}B \end{pmatrix}$$
has full rank δ, where δ is the complexity of the code. If $(A, B, C, D)$ is controllable, it is possible to drive a given state vector to any other state vector in a finite number of steps.
Definition 7. A linear system $(A, B, C, D)$ is said to be observable if the observability matrix associated with the system
$$\Omega_{\delta}(A, C) = \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{\delta-1} \end{pmatrix}$$
has full rank δ. The observability character of a code means that one can be sure that a message has been completed once a sufficiently long string of zeros has been received.
These properties of a linear system are related to properties of the corresponding convolutional code in the following manner.
Theorem 2 ([10], Theorems 2.9 and 2.10). $(A, B, C, D)$ is a minimal representation of the convolutional code if and only if it is controllable.
Theorem 3 ([10], Lemma 2.11). Assume that $(A, B, C, D)$ is controllable. Then, the code $\mathcal{C}$ is non-catastrophic if and only if $(A, B, C, D)$ is observable.
Non-catastrophic convolutional codes are characterized by the fact that a finite number of transmission errors will result in only a finite number of decoding errors.
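The rank conditions for controllability and observability can be checked numerically. The sketch below assumes a prime field GF(p), builds the matrices $(B \; AB \; \cdots \; A^{\delta-1}B)$ and $(C; CA; \ldots; CA^{\delta-1})$, and computes ranks by Gaussian elimination modulo p. All function names are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def mat_mul(X, Y, p):
    """Matrix product over GF(p)."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*Y)] for row in X]

def rank_mod_p(M, p):
    """Rank of an integer matrix over GF(p) by Gaussian elimination."""
    M = [[c % p for c in row] for row in M]
    rows = len(M)
    cols = len(M[0]) if M else 0
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r][c]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][c], -1, p)           # modular inverse of the pivot
        M[rank] = [(v * inv) % p for v in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c]:
                f = M[r][c]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def controllability_matrix(A, B, p):
    """Blocks B, AB, ..., A^(delta-1) B placed side by side."""
    blocks, cur = [], B
    for _ in range(len(A)):
        blocks.append(cur)
        cur = mat_mul(A, cur, p)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(A))]

def observability_matrix(A, C, p):
    """Blocks C, CA, ..., C A^(delta-1) stacked vertically."""
    rows, cur = [], C
    for _ in range(len(A)):
        rows.extend(cur)
        cur = mat_mul(cur, A, p)
    return rows
```

A system with $\delta$ states is then controllable (respectively observable) when `rank_mod_p` of the corresponding matrix equals `len(A)`.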
Another important property related to the decoding process is the output observability.
Definition 8. A linear system $(A, B, C, D)$ is said to be output-observable if the output observability matrix associated with the system has full row rank for all $\ell \geq 0$. Output observability represents the possibility for an internal state to be determined by a finite set of outputs over a finite number of steps.
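Fixing the initial state $x_0$, the output observability matrix allows us to describe a sequence of trajectories in the following manner.
Theorem 4. Let $(A, B, C, D)$ be a representation of a convolutional code. Suppose that the initial state of the system is $x_0 = 0$. Then, the vector of outputs satisfies
$$\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_\ell \end{pmatrix} = T_\ell \begin{pmatrix} u_0 \\ u_1 \\ \vdots \\ u_\ell \end{pmatrix}, \qquad \text{where} \qquad T_\ell = \begin{pmatrix} D & & & \\ CB & D & & \\ \vdots & \ddots & \ddots & \\ CA^{\ell-1}B & \cdots & CB & D \end{pmatrix}.$$
Proof. Iterating the system (4) expresses each $y_t$ in terms of $x_0$ and $u_0, \dots, u_t$. It then suffices to make column block elementary transformations to the system matrix and consider the condition $x_0 = 0$. □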
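The block Toeplitz structure can be built explicitly and compared with a direct simulation of the system. This sketch assumes a prime field GF(p) and the lower triangular form with $D$ on the diagonal and $CA^{i-1}B$ on the $i$-th subdiagonal; the names `mat_mul` and `toeplitz_io_matrix` are hypothetical.

```python
def mat_mul(X, Y, p):
    """Matrix product over GF(p)."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*Y)] for row in X]

def toeplitz_io_matrix(A, B, C, D, ell, p):
    """Block lower triangular Toeplitz matrix mapping (u_0..u_ell) to (y_0..y_ell).

    Diagonal blocks are D; the i-th subdiagonal carries C A^(i-1) B.
    """
    r, k = len(D), len(D[0])
    markov = [[[c % p for c in row] for row in D]]
    AiB = B
    for _ in range(ell):
        markov.append(mat_mul(C, AiB, p))   # C A^(i-1) B
        AiB = mat_mul(A, AiB, p)
    T = [[0] * (k * (ell + 1)) for _ in range(r * (ell + 1))]
    for bi in range(ell + 1):
        for bj in range(bi + 1):
            blk = markov[bi - bj]
            for i in range(r):
                for j in range(k):
                    T[bi * r + i][bj * k + j] = blk[i][j]
    return T
```

For `A = [[2]]`, `B = [[1]]`, `C = [[3]]`, `D = [[1]]` over GF(7) and `ell = 2`, the result is `[[1, 0, 0], [3, 1, 0], [6, 3, 1]]`; multiplying it by the inputs `(1, 2, 3)` gives the outputs `(1, 5, 1)`, the same values obtained by running the recursion from $x_0 = 0$.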
The output observability matrix provides information about the ℓth column distance in the following manner: a convolutional code attains the maximal ℓth column distance if and only if all minors of this matrix that are not trivially zero are nonzero.
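Remark 2. Remember that a minor is called not trivially zero if it is related to proper submatrices. Consider a minor of a lower triangular block Toeplitz matrix (as in our case), with blocks of size $(n-k) \times k$, formed from the rows with indices $i_1 < \cdots < i_r$ and the columns with indices $j_1 < \cdots < j_r$; the sets of row and column indices are said to be proper if for each $\nu \in \{1, \dots, r\}$ the inequality $j_\nu \leq \lceil i_\nu / (n-k) \rceil \, k$ holds.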
Iterative Decoding Algorithm
With the linear systems approach for convolutional codes, the output observability matrix allows us to build decoding algorithms.
As an example, we present the following algorithm, which can be found in [1] and is divided into several steps.
This algorithm focuses first on detecting errors before moving on to the correction process. Indeed, we first need to check whether the received sequence is an encoded one that has not been compromised or modified during transmission. If it has been modified, we first correct the error and then deduce the original input message that was encoded. We do so using the initial conditions and the output observability matrix.
The first objective of this decoding algorithm is to bring the received word as close as possible to the list of codewords.
- Step 1: Set the initial conditions $x_0 = 0$.
- Step 2: With D, we compute the list of codewords. Then, iteratively, generate the list of codewords with the output observability matrix, and store them in a set.
- Step 3: Compute the distance between the received $y_0$ and the set of $Du_0$ in the codewords. Compute $d_H(y_0, Du_0)$ until it is minimal; then, settle for the closest codeword $\tilde{y}_0$ in the list; thus, the system $Du_0 = \tilde{y}_0$ becomes solvable; deduce the corresponding input $u_0$. Iteratively, compute the distance between the received sequence and the set of corresponding codewords. Detect the minimum distance between them and settle for the closest codeword in the list; thus, system (9) becomes compatible; deduce the corresponding input sequence.
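The steps above can be sketched as a greedy, step-by-step search. This is only an illustration under several assumptions: the field is a prime field GF(p), the received data are the parity vectors $y_t$, the initial state is $x_0 = 0$, candidates are enumerated by brute force, and ties are broken by taking the first candidate found. The names `closest_output` and `decode_nearest` are hypothetical and are not taken from [1].

```python
from itertools import product

def closest_output(D, target, p):
    """Return (u, D u, distance) minimising the Hamming distance to target."""
    k = len(D[0])
    best = None
    for u in product(range(p), repeat=k):
        cand = [sum(d * s for d, s in zip(row, u)) % p for row in D]
        dist = sum(1 for a, b in zip(cand, target) if a != b)
        if best is None or dist < best[2]:
            best = (list(u), cand, dist)
    return best

def decode_nearest(A, B, C, D, received, p):
    """Greedy decoding: at each step pick the input whose output is closest."""
    x = [0] * len(A)                       # Step 1: x_0 = 0
    inputs = []
    for y in received:
        # Remove the contribution C x_t of the current state from y_t.
        drift = [sum(c * s for c, s in zip(row, x)) % p for row in C]
        target = [(a - b) % p for a, b in zip(y, drift)]
        u, _, _ = closest_output(D, target, p)   # Steps 2 and 3
        inputs.append(u)
        x = [(sum(a * s for a, s in zip(ra, x)) + sum(b * s for b, s in zip(rb, u))) % p
             for ra, rb in zip(A, B)]
    return inputs
```

The enumeration costs $p^k$ candidates per step, so the sketch is only practical for small alphabets and small $k$.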
We have the following proposition:
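Proposition 4. Let $(A, B, C, D)$ be a representation of a convolutional code over a field $\mathbb{F}$ with $q = p^m$ elements, $p$ being prime. Given the sequence $y_0, y_1, \dots, y_{\ell-1}$, the decoded sequence $u_0, u_1, \dots, u_{\ell-1}$ is obtained recursively as follows:
- (a) Setting $x_0 = 0$.
- (b)
- (i) Computing $\operatorname{rank}(D \mid y_0)$; if $\operatorname{rank} D = \operatorname{rank}(D \mid y_0)$, we solve $Du_0 = y_0$; else, for some $\tilde{y}_0$ closest to $y_0$ such that $\operatorname{rank} D = \operatorname{rank}(D \mid \tilde{y}_0)$, we solve $Du_0 = \tilde{y}_0$.
- (ii) Computing $\operatorname{rank}(D \mid y_1 - CBu_0)$; if $\operatorname{rank} D = \operatorname{rank}(D \mid y_1 - CBu_0)$, we solve $Du_1 = y_1 - CBu_0$, with $u_0$ obtained in (i); else, for some $\tilde{y}_1$ such that $\operatorname{rank} D = \operatorname{rank}(D \mid \tilde{y}_1 - CBu_0)$, we solve $Du_1 = \tilde{y}_1 - CBu_0$, with $u_0$ obtained in (i).
⋮
- (ℓ) Computing $\operatorname{rank}\big(D \mid y_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i\big)$; if it equals $\operatorname{rank} D$, we solve $Du_{\ell-1} = y_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i$, with $u_0$, $u_1$, …, $u_{\ell-2}$ obtained in (i), (ii), …, $(\ell-1)$; else, for some $\tilde{y}_{\ell-1}$ such that $\operatorname{rank} D = \operatorname{rank}\big(D \mid \tilde{y}_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i\big)$, we solve $Du_{\ell-1} = \tilde{y}_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i$, with $u_0$, $u_1$, …, $u_{\ell-2}$ obtained in (i), (ii), …, $(\ell-1)$.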
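Corollary 4. Let $(A, B, C, D)$ be a representation of a convolutional code over a field $\mathbb{F}$ with $q = p^m$ elements, $p$ being prime. If D has full column rank and $n - k = k$, given the sequence $y_0, y_1, \dots, y_{\ell-1}$, the decoded sequence $u_0, u_1, \dots, u_{\ell-1}$ is obtained recursively as follows:
- (a) Setting $x_0 = 0$.
- (b)
- (i) $u_0$ is a solution of $Du_0 = y_0$.
- (ii) $u_1$ is a solution of $Du_1 = y_1 - CBu_0$, with $u_0$ obtained in (i);
⋮
- (ℓ) $u_{\ell-1}$ is a solution of $Du_{\ell-1} = y_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i$, with $u_0$, $u_1$, …, $u_{\ell-2}$ obtained in (i), (ii), …, $(\ell-1)$.
Proof. If D has full column rank and $n - k = k$, then $\operatorname{rank} D = k$ and the matrix D is invertible, so, for all $y_0$, there exists $u_0$ such that $Du_0 = y_0$. Similarly, for all $\ell$ and for all $y_{\ell-1}$, there exists $u_{\ell-1}$ such that $Du_{\ell-1} = y_{\ell-1} - \sum_{i=0}^{\ell-2} CA^{\ell-2-i}Bu_i$. □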
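When D is square and invertible, the recursion reduces to solving one small linear system per time step. The sketch below assumes a prime field GF(p) and tracks the state $x_t$ instead of the explicit sum $\sum_i CA^{t-1-i}Bu_i$, which equals $Cx_t$ when $x_0 = 0$. The names `solve_mod_p` and `decode_invertible_D` are hypothetical.

```python
def solve_mod_p(M, b, p):
    """Solve the square system M z = b over GF(p); M is assumed invertible."""
    n = len(M)
    aug = [[c % p for c in row] + [bi % p] for row, bi in zip(M, b)]
    for c in range(n):
        # Raises StopIteration if M is singular (outside the assumed setting).
        pivot = next(r for r in range(c, n) if aug[r][c])
        aug[c], aug[pivot] = aug[pivot], aug[c]
        inv = pow(aug[c][c], -1, p)
        aug[c] = [(v * inv) % p for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c]:
                f = aug[r][c]
                aug[r] = [(a - f * v) % p for a, v in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def decode_invertible_D(A, B, C, D, ys, p):
    """Recover u_0, u_1, ... from y_0, y_1, ... when D is invertible and x_0 = 0."""
    x = [0] * len(A)
    inputs = []
    for y in ys:
        # D u_t = y_t - C x_t
        rhs = [(yi - sum(c * s for c, s in zip(row, x))) % p for yi, row in zip(y, C)]
        u = solve_mod_p(D, rhs, p)
        inputs.append(u)
        x = [(sum(a * s for a, s in zip(ra, x)) + sum(b * s for b, s in zip(rb, u))) % p
             for ra, rb in zip(A, B)]
    return inputs
```

For example, with `A = [[2]]`, `B = [[1, 1]]`, `C = [[1], [3]]`, `D = [[1, 2], [3, 4]]` over GF(5), the parity sequence `[[0, 1], [1, 3]]` decodes to the inputs `[[1, 2], [3, 0]]`.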
Remark 3. This decoding is both a detection and a correction method: first, we detect the error, and then we try to correct it.
Remark 4. The resolution with this method depends heavily on matrix D.
Remark 5. Suppose that we have a code $\mathcal{C}$ that is not output-observable. Then, we know that D does not have full row rank, which can potentially increase the decoding time.
4. MDP Convolutional Codes
Maximum distance profile (MDP) convolutional codes are a specific class of convolutional codes characterized by their optimal distance properties: their column distances increase as rapidly as possible for as long as possible.
This is explained more concretely below.
Definition 9. An $(n, k, \delta)$ convolutional code C is maximum distance profile (MDP) [6] if
$$d_L^c = (n - k)(L + 1) + 1,$$
where $L = \left\lfloor \frac{\delta}{k} \right\rfloor + \left\lfloor \frac{\delta}{n-k} \right\rfloor$.
These codes are designed to maximize the minimum distance between code sequences, thereby enhancing error detection and correction capabilities. Notice that, when transmitting over an erasure channel, each symbol is either received correctly or is not received at all.
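As a small worked illustration, the index $L = \lfloor \delta/k \rfloor + \lfloor \delta/(n-k) \rfloor$ and the maximal column distances $(n-k)(j+1)+1$ for $j = 0, \dots, L$ can be tabulated for given parameters. The function name `mdp_parameters` is hypothetical.

```python
def mdp_parameters(n, k, delta):
    """Return L and the maximal column distances (n-k)(j+1)+1 for j = 0..L."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    L = delta // k + delta // (n - k)
    return L, [(n - k) * (j + 1) + 1 for j in range(L + 1)]
```

For $(n, k, \delta) = (2, 1, 1)$ this gives $L = 2$ and the profile 2, 3, 4; for $(3, 1, 2)$ it gives $L = 3$ and the profile 3, 5, 7, 9.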
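Theorem 5. Let $(A, B, C, D)$ be a representation of an $(n, k, \delta)$-convolutional code C. This code has j-th column distance $d_j^c = (n - k)(j + 1) + 1$ if and only if there are no zero entries in $D$, $CA^{i}B$, $0 \leq i \leq j - 1$, and every minor of
$$T_j = \begin{pmatrix} D & & & \\ CB & D & & \\ \vdots & \ddots & \ddots & \\ CA^{j-1}B & \cdots & CB & D \end{pmatrix}$$
that is not trivially zero is nonzero.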
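Suppose now that the finite sequence $\{ v_t \}_{t=0}^{\gamma}$ represents a finite-weight codeword, and let $\gamma$ be such that $x_{\gamma+1} = 0$ and $u_t = 0$ for $t \geq \gamma + 1$. Then, taking into account that $x_{t+1} = Ax_t + Bu_t$ and $y_t = Cx_t + Du_t$, we have that $x_t = 0$ and $y_t = 0$ for $t \geq \gamma + 1$.
Consequently, we have
$$\begin{pmatrix} A^{\gamma}B & A^{\gamma-1}B & \cdots & B \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \\ \vdots \\ u_\gamma \end{pmatrix} = 0.$$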
Proof. Theorem follows from the fact that
for
,
□
This result provides us with some algebraic restrictions that will be exploited to achieve some advantages in the decoding algorithm.
5. Decoding over an Erasure Channel
Suppose that a convolutional code is used for transmission over an erasure channel. Then, we can state the following result.
Proposition 5. Let $(A, B, C, D)$ be a representation of a convolutional code with column distance $d_j^c$. If in any sliding window of length $(j + 1)n$ at most $d_j^c - 1$ erasures occur, then we can completely recover the transmitted sequence.
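Assume we corrected or received the information correctly for the sequence up to instant $t - 1$. To recover the erased components in $v_t, \dots, v_{t+j}$, we consider the following equation, in which the erased elements are treated as unknowns:
$$\begin{pmatrix} y_t \\ y_{t+1} \\ \vdots \\ y_{t+j} \end{pmatrix} = \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{j} \end{pmatrix} x_t + \begin{pmatrix} D & & & \\ CB & D & & \\ \vdots & \ddots & \ddots & \\ CA^{j-1}B & \cdots & CB & D \end{pmatrix} \begin{pmatrix} u_t \\ u_{t+1} \\ \vdots \\ u_{t+j} \end{pmatrix},$$
where $x_t$ is the state reached from the correctly recovered symbols up to instant $t - 1$.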
Indeed, we know that a solution to the system exists because we have assumed that only erasures occur. On the other hand, Theorem 5 ensures that each of the not trivially zero full-size minors of the matrix of the system is nonzero, which guarantees the uniqueness of the solution.
When an excessive number of erasures occur at a certain point in the sequence, the system lacks a unique solution, preventing the recovery of the erased information.
The knowledge of the erased elements permits us to calculate the next state of the system.
The fact that the code’s ℓ-th column distance is maximal implies that all preceding column distances are also maximal. Consequently, the window size in the process does not need to be fixed and can be adjusted according to the distribution of erasures in the sequence. This means that smaller windows can be framed, allowing for the resolution of smaller systems. As a result, decoding can commence before an entire block has been received, demonstrating once again that convolutional codes offer more flexibility than block codes in this regard.
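The recovery of erased symbols inside one window amounts to solving a linear system in the erased components. The following sketch assumes a prime field GF(p), a known state at the start of the window, and erased symbols marked with `None`; it treats every erased component as an unknown, propagates the state symbolically, and returns `None` unless the whole window is uniquely determined. The name `recover_window` and the data layout are hypothetical.

```python
def recover_window(A, B, C, D, x0, window, p):
    """Fill erased symbols (None) in a window of (y_t, u_t) pairs over GF(p).

    x0 is the known state at the start of the window. Returns the completed
    window, or None if the erased symbols are not uniquely determined.
    """
    unknowns = [(t, part, i) for t, (y, u) in enumerate(window)
                for part, vec in (("y", y), ("u", u))
                for i, s in enumerate(vec) if s is None]
    m = len(unknowns)
    index = {key: j for j, key in enumerate(unknowns)}

    def affine(t, part, vec):
        # Each symbol becomes [coefficients of the unknowns..., constant].
        forms = []
        for i, s in enumerate(vec):
            f = [0] * (m + 1)
            if s is None:
                f[index[(t, part, i)]] = 1
            else:
                f[m] = s % p
            forms.append(f)
        return forms

    def combine(M, forms):
        # Matrix times a vector of affine forms.
        return [[sum(a * f[c] for a, f in zip(row, forms)) % p for c in range(m + 1)]
                for row in M]

    def add(F, G, sign=1):
        return [[(a + sign * b) % p for a, b in zip(f, g)] for f, g in zip(F, G)]

    x = [[0] * m + [s % p] for s in x0]
    equations = []
    for t, (y, u) in enumerate(window):
        yf, uf = affine(t, "y", y), affine(t, "u", u)
        equations += add(add(yf, combine(C, x), -1), combine(D, uf), -1)  # y - Cx - Du = 0
        x = add(combine(A, x), combine(B, uf))                             # next state

    rows = [eq[:m] + [(-eq[m]) % p] for eq in equations]
    rank = 0
    for c in range(m):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][c]), None)
        if pivot is None:
            return None  # this unknown is not determined: too many erasures
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][c], -1, p)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][c]:
                f = rows[r][c]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    if any(row[m] for row in rows[rank:]):
        return None  # inconsistent data: not a pure erasure pattern
    values = [rows[j][m] for j in range(m)]

    def fill(t, part, vec):
        return [values[index[(t, part, i)]] if s is None else s for i, s in enumerate(vec)]

    return [(fill(t, "y", y), fill(t, "u", u)) for t, (y, u) in enumerate(window)]
```

Because the function takes the window as an argument, a caller can try a short window first and enlarge it only when `None` is returned, which mirrors the adjustable window size discussed above. Requiring the whole window to be determined is a simplification; a sliding-window decoder only needs the first erased vector to be determined before moving on.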
6. Discussion
Each of the existing decoding methods has its strengths and weaknesses, and the choice of which to use depends on the specific application requirements, such as the expected erasure patterns, computational resources, and latency constraints. For instance, window decoding is particularly useful in streaming applications where low latency is crucial. In contrast, algebraic and syndrome-based methods might be preferred in environments where computational resources are abundant and exact decoding is paramount. Iterative and graph-based methods balance performance and complexity, making them suitable for various practical applications.
The input–state–output representation approach models the system in terms of its inputs, internal states, and outputs. This method allows for decoding using a structural approach that leverages the dynamics of the convolutional system. Moreover, it allows for a structured and detailed analysis of the system and can quickly adapt to changes in channel conditions. Similar to window decoding, the decoding can start before the entire block is received, and the approach facilitates the integration of control and state estimation techniques, enhancing robustness. The methodology employed can be beneficial in addressing the challenge of designing MDP convolutional codes for communication over an erasure channel.
In summary, the choice of the decoding method depends on the specific application requirements, such as latency, computational complexity, adaptability, and robustness to different erasure patterns. The input–state–output representation is particularly useful in systems where the dynamic structure of the convolutional code can be leveraged to improve the decoding performance.
7. Conclusions
In this paper, we conduct an in-depth investigation into the behavior of maximum distance profile (MDP) convolutional codes when transmitted over an erasure channel. Our analysis employs the input–state–output representation of these codes, which serves as a foundational framework for understanding their operational dynamics in the presence of erasures. By leveraging this representation, we aim to provide a comprehensive description of an erasure decoding procedure specifically tailored for MDP convolutional codes.
The possibility of representing convolutional codes as linear systems enables the use of linear algebra, significantly contributing to the advancement of the state of the art, in particular regarding the decoding of maximum distance profile (MDP) convolutional codes over erasure channels. For instance, this method translates the problem of finding the original message into solving a set of linear equations, which are efficiently handled using linear algebra. In addition, the presented decoding method benefits from linear algebra through the output observability matrix. It relies on matrix operations to optimize the pathfinding process, enhancing the efficiency and accuracy of the decoding.