Article

On Linear Coding over Finite Rings and Applications to Computing

Communication Theory Lab, School of Electrical Engineering, KTH Royal Institute of Technology, Stockholm 10044, Sweden
* Author to whom correspondence should be addressed.
Entropy 2017, 19(5), 233; https://doi.org/10.3390/e19050233
Submission received: 6 January 2017 / Revised: 24 April 2017 / Accepted: 15 May 2017 / Published: 20 May 2017
(This article belongs to the Special Issue Network Information Theory)

Abstract
This paper presents a coding theorem for linear coding over finite rings, in the setting of the Slepian–Wolf source coding problem. This theorem covers the corresponding achievability theorems of Elias (IRE Conv. Rec. 1955, 3, 37–46) and Csiszár (IEEE Trans. Inf. Theory 1982, 28, 585–592) for linear coding over finite fields as special cases. In addition, it is shown that, for any finite set of correlated discrete memoryless sources, there always exists a sequence of linear encoders over some finite non-field rings which achieves the data compression limit, the Slepian–Wolf region. Hence, the question of whether linear coding over finite non-field rings can be optimal for data compression is settled in the affirmative with respect to existence. As an application, we address the problem of source coding for computing, where the decoder is interested in recovering a discrete function of the data generated and independently encoded by several correlated i.i.d. random sources. We propose linear coding over finite rings as an alternative solution to this problem. The results of Körner–Marton (IEEE Trans. Inf. Theory 1979, 25, 219–221) and Ahlswede–Han (IEEE Trans. Inf. Theory 1983, 29, 396–411, Theorem 10) are generalized to the encoding of (pseudo) nomographic functions (over rings). Since a discrete function with a finite domain always admits a nomographic presentation, we conclude that both generalizations apply universally to encoding all discrete functions of finite domains. Based on these results, we demonstrate that linear coding over finite rings strictly outperforms its field counterpart in terms of achieving better coding rates and reducing the required alphabet sizes of the encoders for encoding infinitely many discrete functions.

1. Introduction

The problem of source coding for computing considers the scenario where a decoder is interested in recovering a function of the messages, rather than the original messages themselves, which are i.i.d. generated and independently encoded by the sources. In rigorous terms:
Problem 1 (Source Coding for Computing).
Given $S = \{1, 2, \ldots, s\}$ and $(X_1, X_2, \ldots, X_s) \sim p$. For each $i \in S$, consider a discrete memoryless source that randomly generates i.i.d. discrete data $X_i(1), X_i(2), \ldots, X_i(n), \ldots$, where $X_i(n)$ has a finite sample space $\mathcal{X}_i$ and $\bigl(X_1(n), X_2(n), \ldots, X_s(n)\bigr) \sim p$, $\forall\, n \in \mathbb{N}^+$. For a discrete function $g : \prod_{i \in S} \mathcal{X}_i \to \Omega$, what is the largest region $\mathcal{R}[g] \subseteq \mathbb{R}^s$ such that, $\forall\, (R_1, R_2, \ldots, R_s) \in \mathcal{R}[g]$ and $\forall\, \epsilon > 0$, there exists an $N_0 \in \mathbb{N}^+$, such that for all $n > N_0$, there exist $s$ encoders $\phi_i : \mathcal{X}_i^n \to \{1, 2, \ldots, 2^{nR_i}\}$, $i \in S$, and one decoder $\psi : \prod_{i \in S} \{1, 2, \ldots, 2^{nR_i}\} \to \Omega^n$, with
$$\Pr\Bigl\{ g\bigl(X_1^n, \ldots, X_s^n\bigr) \neq \psi\bigl(\phi_1(X_1^n), \ldots, \phi_s(X_s^n)\bigr) \Bigr\} < \epsilon,$$
where $X_i^n = \bigl[X_i(1), X_i(2), \ldots, X_i(n)\bigr]$ and
$$g\bigl(X_1^n, \ldots, X_s^n\bigr) = \bigl[ g\bigl(X_1(1), \ldots, X_s(1)\bigr), \ldots, g\bigl(X_1(n), \ldots, X_s(n)\bigr) \bigr] \in \Omega^n\,?$$
The region $\mathcal{R}[g]$ is called the achievable coding rate region for computing $g$. A rate tuple $R \in \mathbb{R}^s$ is said to be achievable for computing $g$ (or simply achievable) if and only if $R \in \mathcal{R}[g]$. A region $\mathcal{R} \subseteq \mathbb{R}^s$ is said to be achievable for computing $g$ (or simply achievable) if and only if $\mathcal{R} \subseteq \mathcal{R}[g]$.
If $g$ is an identity function, the computing problem, Problem 1, is known as the Slepian–Wolf (SW) source coding problem. $\mathcal{R}[g]$ is then the SW region [1],
$$\mathcal{R}[X_1, X_2, \ldots, X_s] = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{j \in T} R_j > H(X_T \mid X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S \Bigr\},$$
where $T^c$ is the complement of $T$ in $S$, and $X_T$ ($X_{T^c}$, resp.) is the random variable array $[X_j]_{j \in T}$ ($[X_j]_{j \in T^c}$, resp.). However, from [1] it is hard to draw conclusions regarding the structure (linear or not) of the encoders, since the corresponding mappings are chosen randomly among all feasible mappings. This limits the scope of their potential applications. As a consequence, linear coding over finite fields (LCoF), namely mapping the $\mathcal{X}_i$'s injectively into some finite fields and choosing the $\phi_i$'s as linear mappings over these fields, is considered. It is shown that LCoF achieves the same encoding limit, the SW region [2,3]. Although it seems straightforward to study linear mappings over rings (non-field rings in particular), it has never been proved (nor disproved) that linear encoding over non-field rings can be equally optimal.
For an arbitrary discrete function $g$, Problem 1 remains open in general, while obviously $\mathcal{R}[X_1, X_2, \ldots, X_s] \subseteq \mathcal{R}[g]$. Making use of Elias' theorem on binary linear codes [2], Körner–Marton [4] shows that $\mathcal{R}[\oplus_2]$ ("$\oplus_2$" denotes the modulo-two sum) contains the region
$$\tilde{\mathcal{R}} = \bigl\{ (R_1, R_2) \in \mathbb{R}^2 \;\big|\; R_1, R_2 > H(X_1 \oplus_2 X_2) \bigr\}.$$
This region is not contained in the SW region for certain distributions. In other words, $\mathcal{R}[\oplus_2] \supsetneq \mathcal{R}[X_1, X_2]$ in those cases. Combining the standard random coding technique and Elias' result, [5] shows that $\mathcal{R}[\oplus_2]$ can be strictly larger than the convex hull of the union $\mathcal{R}[X_1, X_2] \cup \tilde{\mathcal{R}}$. However, the functions considered in these works are relatively simple. With a polynomial approach, [6,7] generalize the result of Ahlswede–Han ([5], Theorem 10) to the scenario of an arbitrary $g$. Making use of the fact that a discrete function is essentially a polynomial function (see Definition 2) over some finite field, an achievable region is given for computing an arbitrary discrete function. Such a region contains, and can be strictly larger than (depending on the precise function and distribution under consideration), the SW region. Conditions under which $\mathcal{R}[g]$ is strictly larger than the SW region are presented in [6,8] from different perspectives. The cases regarding Abelian group codes are covered in [9,10,11].
The present work proposes replacing the linear encoders over finite fields from Elias [2] and Csiszár [3] with linear encoders over finite rings in the problems accounted for above. Achievability theorems related to linear coding over finite rings (LCoR) for SW data compression are presented, covering the results in [2,3] as special cases in the sense of characterizing the achievable region. In addition, it is proved that there always exists a sequence of linear encoders over some finite non-field rings that achieves the SW region for any SW scenario. Therefore, the question of the optimality of linear coding over finite non-field rings for data compression is settled in the affirmative with respect to existence. Furthermore, we also consider LCoR as an alternative technique for the general computing problem, Problem 1. Results from Körner–Marton [4], Ahlswede–Han ([5], Theorem 10) and [7] are generalized to corresponding ring versions for encoding (pseudo) nomographic functions (over rings). Since any discrete function with a finite domain admits a nomographic presentation, we conclude that our results universally apply for encoding all discrete functions of finite domains. Finally, it is shown that our ring approach dominates its field counterpart in terms of achieving better coding rates and reducing the alphabet sizes of the encoders for encoding certain discrete functions. The proof takes advantage of the fact that the characteristic of a ring can be any positive integer, while the characteristic of a field must be a prime. From this observation, it is seen that there are in fact infinitely many such functions.

2. Rings, Ideals and Linear Mappings

We start by introducing some fundamental algebraic concepts and related properties. Readers already familiar with this material may still wish to skim through it to become acquainted with our notation.
Definition 1.
The tuple $[\mathcal{R}, +, \cdot]$ is called a ring if the following criteria are met:
1. $[\mathcal{R}, +]$ is an Abelian group;
2. There exists a multiplicative identity $1 \in \mathcal{R}$, namely, $1 \cdot a = a \cdot 1 = a$, $\forall\, a \in \mathcal{R}$;
3. $\forall\, a, b, c \in \mathcal{R}$, $a \cdot b \in \mathcal{R}$ and $(a \cdot b) \cdot c = a \cdot (b \cdot c)$;
4. $\forall\, a, b, c \in \mathcal{R}$, $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$ and $(b + c) \cdot a = (b \cdot a) + (c \cdot a)$.
We often write $\mathcal{R}$ for $[\mathcal{R}, +, \cdot]$ when the operations considered are clear from the context. The operation "·" is usually written by juxtaposition, $ab$ for $a \cdot b$, for all $a, b \in \mathcal{R}$.
A ring $[\mathcal{R}, +, \cdot]$ is said to be commutative if $a \cdot b = b \cdot a$, $\forall\, a, b \in \mathcal{R}$. In Definition 1, the identity of the group $[\mathcal{R}, +]$, denoted by 0, is called the zero. A ring $[\mathcal{R}, +, \cdot]$ is said to be finite if the cardinality $|\mathcal{R}|$ is finite, and $|\mathcal{R}|$ is called the order of $\mathcal{R}$. The set $\mathbb{Z}_q$ of integers modulo $q$ is a commutative finite ring with respect to modular arithmetic. For any ring $\mathcal{R}$, the set of all polynomials in $s$ indeterminates over $\mathcal{R}$ is an infinite ring.
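Definition 1 can be checked exhaustively for a small concrete instance. The following sketch (an illustration we add here, not part of the paper) verifies all four ring axioms for $\mathbb{Z}_6$, the non-field ring used repeatedly in later sections:

```python
# Exhaustively verify the ring axioms of Definition 1 for Z_q = {0, ..., q-1}
# under addition and multiplication modulo q (here q = 6, a non-field ring).
q = 6
R = range(q)
add = lambda a, b: (a + b) % q
mul = lambda a, b: (a * b) % q

# 1. [R, +] is an Abelian group: associative, commutative, identity 0, inverses.
assert all(add(add(a, b), c) == add(a, add(b, c)) for a in R for b in R for c in R)
assert all(add(a, b) == add(b, a) for a in R for b in R)
assert all(any(add(a, b) == 0 for b in R) for a in R)

# 2. Multiplicative identity 1.
assert all(mul(1, a) == a == mul(a, 1) for a in R)

# 3. Multiplication is closed and associative.
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in R for b in R for c in R)

# 4. Both distributive laws.
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a in R for b in R for c in R)
assert all(mul(add(b, c), a) == add(mul(b, a), mul(c, a)) for a in R for b in R for c in R)
```

The same check passes for any modulus $q$, since $\mathbb{Z}_q$ is always a commutative finite ring.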
Definition 2.
A polynomial function (polynomial and polynomial function are distinct concepts) of $k$ variables over a finite ring $\mathcal{R}$ is a function $g : \mathcal{R}^k \to \mathcal{R}$ of the form
$$g(x_1, x_2, \ldots, x_k) = \sum_{j=0}^{m} a_j\, x_1^{m_{1j}} x_2^{m_{2j}} \cdots x_k^{m_{kj}},$$
where $a_j \in \mathcal{R}$ and $m$ and the $m_{ij}$'s are non-negative integers. The set of all polynomial functions of $k$ variables over ring $\mathcal{R}$ is designated by $\mathcal{R}[k]$.
Remark 1.
Polynomials and polynomial functions are sometimes only defined over a commutative ring [12,13]. It is a very delicate matter to define them over a non-commutative ring [14,15], due to the fact that $x_1 x_2$ and $x_2 x_1$ can become different objects. We choose to define "polynomial functions" by Formula (5) because those functions are within the scope of this paper's interest.
Proposition 1.
Given $s$ rings $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$, for any non-empty set $T \subseteq \{1, 2, \ldots, s\}$, the Cartesian product (see [12]) $\mathcal{R}_T = \prod_{i \in T} \mathcal{R}_i$ forms a new ring $[\mathcal{R}_T, +, \cdot]$ with respect to the component-wise operations defined as follows:
$$\mathbf{a} + \mathbf{a}' = \bigl(a_1 + a'_1,\ a_2 + a'_2,\ \ldots,\ a_{|T|} + a'_{|T|}\bigr),$$
$$\mathbf{a} \cdot \mathbf{a}' = \bigl(a_1 a'_1,\ a_2 a'_2,\ \ldots,\ a_{|T|} a'_{|T|}\bigr),$$
$$\forall\, \mathbf{a} = \bigl(a_1, a_2, \ldots, a_{|T|}\bigr),\ \mathbf{a}' = \bigl(a'_1, a'_2, \ldots, a'_{|T|}\bigr) \in \mathcal{R}_T.$$
Remark 2.
In Proposition 1, $[\mathcal{R}_T, +, \cdot]$ is called the direct product of $\{\mathcal{R}_i \mid i \in T\}$. It is easily seen that $(0, 0, \ldots, 0)$ and $(1, 1, \ldots, 1)$ are the zero and the multiplicative identity of $[\mathcal{R}_T, +, \cdot]$, respectively.
Definition 3.
A non-zero element $a$ of a ring $\mathcal{R}$ is said to be invertible if and only if there exists $b \in \mathcal{R}$ such that $ab = ba = 1$; $b$ is called the inverse of $a$, denoted by $a^{-1}$. An invertible element of a ring is called a unit.
Remark 3.
It can be proved that the inverse of a unit is unique. By definition, the multiplicative identity is the inverse of itself.
Let $\mathcal{R}^* = \mathcal{R} \setminus \{0\}$. The ring $[\mathcal{R}, +, \cdot]$ is a field if and only if $[\mathcal{R}^*, \cdot]$ is an Abelian group; in other words, all non-zero elements of $\mathcal{R}$ are invertible. All fields are commutative rings. $\mathbb{Z}_q$ is a field if and only if $q$ is a prime. All finite fields of the same order are isomorphic to each other ([16], p. 549). This "unique" field of order $q$ is denoted by $\mathbb{F}_q$; necessarily, $q$ is a power of a prime. More details regarding finite fields can be found in ([16], Chapter 14.3).
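The field criterion is easy to test computationally. The sketch below (our illustration; the helper names `units` and `is_field` are ours) lists the units of $\mathbb{Z}_q$ and confirms that $\mathbb{Z}_q$ is a field exactly when $q$ is prime:

```python
def units(q):
    """Return the invertible elements (units) of Z_q."""
    return [a for a in range(1, q) if any(a * b % q == 1 for b in range(1, q))]

def is_field(q):
    """Z_q is a field iff every non-zero element is invertible."""
    return len(units(q)) == q - 1

# Z_5 is a field (5 is prime); Z_6 is not: only 1 and 5 are units.
assert is_field(5)
assert not is_field(6)
assert units(6) == [1, 5]
```

In general the units of $\mathbb{Z}_q$ are exactly the residues coprime to $q$, which is why every non-zero element is a unit precisely when $q$ is prime.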
Theorem 1
(Wedderburn’s little theorem [12]). Let R be a finite ring. R is a field if and only if all non-zero elements of R are invertible.
Remark 4.
Wedderburn’s little theorem guarantees commutativity for a finite ring if all of its non-zero elements are invertible. Hence, a finite ring is either a field or at least one of its elements has no inverse. However, a finite commutative ring is not necessarily a field, e.g., $\mathbb{Z}_q$ is not a field if $q$ is not a prime.
Definition 4
([16]). The characteristic of a finite ring $\mathcal{R}$ is defined to be the smallest positive integer $m$ such that $\sum_{j=1}^{m} 1 = 0$, where 0 and 1 are the zero and the multiplicative identity of $\mathcal{R}$, respectively. The characteristic of $\mathcal{R}$ is often denoted by $\mathrm{Char}(\mathcal{R})$.
Remark 5.
Clearly, $\mathrm{Char}(\mathbb{Z}_q) = q$. For a finite field $\mathbb{F}_q$, $\mathrm{Char}(\mathbb{F}_q)$ is always the prime $q_0$ such that $q = q_0^n$ for some positive integer $n$ ([12], Proposition 2.137).
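Definition 4 translates directly into a small computation; a sketch (assuming, for illustration, that the ring is $\mathbb{Z}_q$ represented by its modulus):

```python
def char_Zq(q):
    """Smallest positive m with 1 + 1 + ... + 1 (m times) = 0 in Z_q."""
    total, m = 0, 0
    while True:
        m += 1
        total = (total + 1) % q
        if total == 0:
            return m

# Char(Z_q) = q, as stated in Remark 5.
assert char_Zq(6) == 6
assert char_Zq(7) == 7
```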
Proposition 2.
Let $\mathbb{F}_q$ be a finite field. For any $0 \neq a \in \mathbb{F}_q$, $m = \mathrm{Char}(\mathbb{F}_q)$ if and only if $m$ is the smallest positive integer such that $\sum_{j=1}^{m} a = 0$.
Proof. 
Since $a \neq 0$,
$$\sum_{j=1}^{m} a = 0 \;\Longleftrightarrow\; a^{-1} \sum_{j=1}^{m} a = a^{-1} \cdot 0 \;\Longleftrightarrow\; \sum_{j=1}^{m} 1 = 0.$$
The statement is proved. ☐
Definition 5.
A subset $\mathcal{I}$ of a ring $[\mathcal{R}, +, \cdot]$ is said to be a left ideal of $\mathcal{R}$, denoted by $\mathcal{I} \leq_l \mathcal{R}$, if and only if
1. $[\mathcal{I}, +]$ is a subgroup of $[\mathcal{R}, +]$;
2. $\forall\, x \in \mathcal{I}$ and $\forall\, a \in \mathcal{R}$, $a \cdot x \in \mathcal{I}$.
If condition 2 is replaced by
3. $\forall\, x \in \mathcal{I}$ and $\forall\, a \in \mathcal{R}$, $x \cdot a \in \mathcal{I}$,
then $\mathcal{I}$ is called a right ideal of $\mathcal{R}$, denoted by $\mathcal{I} \leq_r \mathcal{R}$. $\{0\}$ is a trivial left (right) ideal, usually denoted by 0.
The cardinality $|\mathcal{I}|$ is called the order of a finite left (right) ideal $\mathcal{I}$.
Remark 6.
Let $\{a_1, a_2, \ldots, a_n\}$ be a non-empty set of elements of some ring $\mathcal{R}$. It is easy to verify that $\langle a_1, a_2, \ldots, a_n \rangle_r = \bigl\{ \sum_{i=1}^{n} a_i b_i \;\big|\; b_i \in \mathcal{R}, 1 \leq i \leq n \bigr\}$ is a right ideal and $\langle a_1, a_2, \ldots, a_n \rangle_l = \bigl\{ \sum_{i=1}^{n} b_i a_i \;\big|\; b_i \in \mathcal{R}, 1 \leq i \leq n \bigr\}$ is a left ideal. Furthermore, $\langle a_1, a_2, \ldots, a_n \rangle_r = \mathcal{R}$ and $\langle a_1, a_2, \ldots, a_n \rangle_l = \mathcal{R}$ if $a_i$ is a unit for some $1 \leq i \leq n$.
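In $\mathbb{Z}_6$ these generated ideals are easy to enumerate explicitly; a sketch (illustrative; since $\mathbb{Z}_6$ is commutative, left- and right-generated ideals coincide):

```python
from itertools import product

def ideal(generators, q):
    """Ideal of Z_q generated by `generators`: all sums sum_i b_i * a_i
    with coefficients b_i ranging over Z_q."""
    return sorted({sum(b * a for b, a in zip(coeffs, generators)) % q
                   for coeffs in product(range(q), repeat=len(generators))})

# The ideals of Z_6: 0, <3> = {0, 3}, <2> = {0, 2, 4}, and <1> = Z_6.
assert ideal([3], 6) == [0, 3]
assert ideal([2], 6) == [0, 2, 4]
assert ideal([5], 6) == [0, 1, 2, 3, 4, 5]  # 5 is a unit, so <5> = Z_6
```

The last line illustrates the final claim of Remark 6: a generating set containing a unit generates the whole ring.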
It is well-known that if $\mathcal{I} \leq_l \mathcal{R}$, then $\mathcal{R}$ is partitioned into disjoint cosets of equal size (cardinality). For any coset $\mathcal{J}$, $\mathcal{J} = x + \mathcal{I} = \{x + y \mid y \in \mathcal{I}\}$, $\forall\, x \in \mathcal{J}$. The set of all cosets forms a left module over $\mathcal{R}$, denoted by $\mathcal{R}/\mathcal{I}$. Similarly, $\mathcal{R}/\mathcal{I}$ becomes a right module over $\mathcal{R}$ if $\mathcal{I} \leq_r \mathcal{R}$ [17]. Of course, $\mathcal{R}/\mathcal{I}$ can also be considered as a quotient group [12]. However, its structure is considerably richer than that of a quotient group alone.
Proposition 3.
Let $\mathcal{R}_i$ ($1 \leq i \leq s$) be a ring and $\mathcal{R} = \prod_{i=1}^{s} \mathcal{R}_i$. For any $\mathcal{A} \subseteq \mathcal{R}$, $\mathcal{A} \leq_l \mathcal{R}$ (or $\mathcal{A} \leq_r \mathcal{R}$) if and only if $\mathcal{A} = \prod_{i=1}^{s} \mathcal{A}_i$ and $\mathcal{A}_i \leq_l \mathcal{R}_i$ (or $\mathcal{A}_i \leq_r \mathcal{R}_i$), $1 \leq i \leq s$.
Proof. 
We prove the $\leq_l$ case only; the $\leq_r$ case follows from a similar argument. Let $\pi_i$ ($1 \leq i \leq s$) be the coordinate function assigning to every element of $\mathcal{R}$ its $i$th component. Then $\mathcal{A} \subseteq \prod_{i=1}^{s} \mathcal{A}_i$, where $\mathcal{A}_i = \pi_i(\mathcal{A})$. Moreover, for any
$$\mathbf{x} = \bigl( \pi_1(\mathbf{x}_1), \pi_2(\mathbf{x}_2), \ldots, \pi_s(\mathbf{x}_s) \bigr) \in \prod_{i=1}^{s} \mathcal{A}_i,$$
where $\mathbf{x}_i \in \mathcal{A}$ for all feasible $i$, we have that
$$\mathbf{x} = \sum_{i=1}^{s} e_i \mathbf{x}_i,$$
where $e_i \in \mathcal{R}$ has $i$th coordinate 1 and all other coordinates 0. If $\mathcal{A} \leq_l \mathcal{R}$, then $\mathbf{x} \in \mathcal{A}$ by definition. Therefore, $\prod_{i=1}^{s} \mathcal{A}_i \subseteq \mathcal{A}$. Consequently, $\mathcal{A} = \prod_{i=1}^{s} \mathcal{A}_i$. Since $\pi_i$ is a homomorphism, we also have $\mathcal{A}_i \leq_l \mathcal{R}_i$ for all feasible $i$. The other direction is easily verified by definition. ☐
Remark 7.
It is worthwhile to point out that Proposition 3 does not hold for an infinite index set, namely $\mathcal{R} = \prod_{i \in I} \mathcal{R}_i$, where $I$ is not finite.
For any $\emptyset \neq T \subseteq S$, Proposition 3 states that any left (right) ideal of $\mathcal{R}_T$ is a Cartesian product of left (right) ideals of the $\mathcal{R}_i$'s, $i \in T$. Let $\mathcal{I}_i$ be a left (right) ideal of ring $\mathcal{R}_i$ ($1 \leq i \leq s$). We define $\mathcal{I}_T$ to be the left (right) ideal $\prod_{i \in T} \mathcal{I}_i$ of $\mathcal{R}_T$.
Let x t be the transpose of a vector (or matrix) x .
Definition 6.
A mapping $f : \mathcal{R}^n \to \mathcal{R}^m$ given as
$$f(x_1, x_2, \ldots, x_n) = \Bigl[ \sum_{j=1}^{n} a_{1,j} x_j,\ \ldots,\ \sum_{j=1}^{n} a_{m,j} x_j \Bigr]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
where $t$ stands for transposition and $a_{i,j} \in \mathcal{R}$ for all feasible $i$ and $j$, is called a left linear mapping over ring $\mathcal{R}$. Similarly,
$$f(x_1, x_2, \ldots, x_n) = \Bigl[ \sum_{j=1}^{n} x_j a_{1,j},\ \ldots,\ \sum_{j=1}^{n} x_j a_{m,j} \Bigr]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
defines a right linear mapping over ring $\mathcal{R}$. If $m = 1$, then $f$ is called a left (right) linear function over $\mathcal{R}$.
From now on, left linear mapping (function) or right linear mapping (function) are simply called linear mapping (function). This will not lead to any confusion since the intended use can usually be clearly distinguished from the context.
Remark 8.
The mapping $f$ in Definition 6 is called linear in accordance with the definition of a linear mapping (function) over a field. In fact, the two structures have several similar properties. Moreover, (11) is equivalent to
$$f(x_1, x_2, \ldots, x_n) = A\, [x_1, x_2, \ldots, x_n]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
where $A$ is an $m \times n$ matrix over $\mathcal{R}$ and $[A]_{i,j} = a_{i,j}$ for all feasible $i$ and $j$. $A$ is named the coefficient matrix. It is easy to prove that a linear mapping is uniquely determined by its coefficient matrix, and vice versa. The linear mapping $f$ is said to be trivial, denoted by 0, if $A$ is the zero matrix, i.e., $[A]_{i,j} = 0$ for all feasible $i$ and $j$.
It should be noted that an interesting approach to coding over an Abelian group was presented in [9,10,11]. However, we emphasize that even though group, field and ring are closely related algebraic structures, the definition of the group encoder in [11] and the linear encoder in [3] and in the present work are in general fundamentally different (although there is an overlap in special cases). To highlight in more detail the difference between linear encoding (this work and [3]) and encoding over a group, as in [11], which is a nonlinear operation in general, take the Abelian group $G = \mathbb{Z}_2 \oplus \mathbb{Z}_2$, the field $\mathbb{F}_4$ of order 4 and the matrix ring $M_{L,2} = \Bigl\{ \begin{bmatrix} a & 0 \\ b & a \end{bmatrix} \;\Big|\; a, b \in \mathbb{Z}_2 \Bigr\}$ as examples.
  • By ([11], Example 2), the Abelian group encoder encodes the source $\hat{Z} = (X, Y) \in G$ based on a Slepian–Wolf-like scheme. Namely, two binary linear encoders encode $X^n$ and $Y^n$ separately as two binary sources. Therefore, the lengths of the codewords from encoding $X^n$ and $Y^n$ can even be different, and the encoder is in general a highly nonlinear device.
  • On the other hand, the linear encoder over either $\mathbb{F}_4$ or $M_{L,2}$ simply outputs a linear combination of the vector $\hat{Z}^n$, namely $A \hat{Z}^n$ for some matrix $A$ over $\mathbb{F}_4$ or $M_{L,2}$.
  • However, if one requires that the codewords from encoding $X^n$ and $Y^n$ be of the same length in (1), then the output from encoding $\hat{Z}^n$ is the same as $\tilde{A} \hat{Z}^n$ for some matrix $\tilde{A}$ over the ring $\mathbb{Z}_2 \times \mathbb{Z}_2$ (a specific product ring whose multiplication is significantly different from those of $\mathbb{F}_4$ or $M_{L,2}$). In other words, in this quite specific special case, the encoder becomes linear over a product ring of modulo integers, which is a sub-class of the completely general ring structures considered in this paper.
We also note that in some source network problems, linear codes appear superior to others [3]. For instance, for encoding the modulo-two sum of binary symmetric sources, linear coding over F 4 or M L , 2 achieves the optimal Körner–Marton region [4] (the M L , 2 case will be established in later sections), while coding over G achieves the sub-optimal Slepian–Wolf region ([11], p. 1509). To avoid any remaining confusion, we in Appendix D present additional details regarding the differences between linear coding, as in the present work and in [3], and coding over an Abelian group, as in [11].
Let $A$ be an $m \times n$ matrix over ring $\mathcal{R}$ and $f(\mathbf{x}) = A\mathbf{x}$, $\forall\, \mathbf{x} \in \mathcal{R}^n$. For the system of linear equations
$$f(\mathbf{x}) = A\mathbf{x} = \mathbf{0}, \quad \text{where } \mathbf{0} = (0, 0, \ldots, 0)^t \in \mathcal{R}^m,$$
let $S(f)$ be the set of all solutions, namely $S(f) = \{\mathbf{x} \in \mathcal{R}^n \mid f(\mathbf{x}) = \mathbf{0}\}$. It is obvious that $S(f) = \mathcal{R}^n$ if $f$ is trivial, i.e., $A$ is the zero matrix. If $\mathcal{R}$ is a field, then $S(f)$ is a subspace of $\mathcal{R}^n$. We conclude this section with a lemma regarding the cardinalities of $\mathcal{R}^n$ and $S(f)$.
Lemma 1.
For a finite ring $\mathcal{R}$ and a linear function
$$f : \mathbf{x} \mapsto (a_1, a_2, \ldots, a_n)\, \mathbf{x} \quad \bigl( f : \mathbf{x} \mapsto \mathbf{x}^t (a_1, a_2, \ldots, a_n)^t \bigr), \quad \forall\, \mathbf{x} \in \mathcal{R}^n,$$
we have
$$\frac{|S(f)|}{|\mathcal{R}^n|} = \frac{1}{|\mathcal{I}|},$$
where $\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_r$ ($\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_l$, resp.). In particular, if $a_i$ is invertible for some $1 \leq i \leq n$, then $|S(f)| = |\mathcal{R}|^{n-1}$.
Proof. 
It is obvious that the image $f(\mathcal{R}^n) = \mathcal{I}$ by definition. Moreover, $\forall\, x \neq y \in \mathcal{I}$, the pre-images satisfy $f^{-1}(x) \cap f^{-1}(y) = \emptyset$ and $|f^{-1}(x)| = |f^{-1}(y)| = |S(f)|$. Therefore, $|\mathcal{I}| \cdot |S(f)| = |\mathcal{R}^n|$, i.e., $\frac{|S(f)|}{|\mathcal{R}^n|} = \frac{1}{|\mathcal{I}|}$. Moreover, if $a_i$ is a unit, then $\mathcal{I} = \mathcal{R}$; thus, $|S(f)| = |\mathcal{R}^n| / |\mathcal{R}| = |\mathcal{R}|^{n-1}$. ☐
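Lemma 1 can be verified exhaustively for a small non-field ring. The sketch below (our illustration) enumerates the solution set $S(f)$ of every linear function over $\mathbb{Z}_6$ with $n = 2$ and checks $|S(f)| \cdot |\mathcal{I}| = |\mathcal{R}|^n$:

```python
from itertools import product

q, n = 6, 2  # the ring Z_6 and vector length n = 2

def ideal(gens):
    """Ideal of Z_6 generated by `gens` (left = right, Z_6 is commutative)."""
    return {sum(b * a for b, a in zip(bs, gens)) % q
            for bs in product(range(q), repeat=len(gens))}

for coeffs in product(range(q), repeat=n):
    # S(f) for f(x) = a_1 x_1 + a_2 x_2 over Z_6.
    S_f = [x for x in product(range(q), repeat=n)
           if sum(a * xi for a, xi in zip(coeffs, x)) % q == 0]
    I = ideal(coeffs)
    # Lemma 1: |S(f)| / |R^n| = 1 / |I|, i.e. |S(f)| * |I| = q^n.
    assert len(S_f) * len(I) == q ** n
```

For instance, $f(x_1, x_2) = 3x_1$ has $\mathcal{I} = \{0, 3\}$ of order 2 and exactly $36 / 2 = 18$ solutions in $\mathbb{Z}_6^2$.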

3. Linear Coding over Finite Rings

In this section, we present a coding rate region achieved with LCoR for the SW source coding problem, i.e., for $g$ an identity function in Problem 1. This region is exactly the SW region if all the rings considered are fields. However, being a field is not necessary, as seen in Section 5, where the issue of optimality is addressed.
Before proceeding, a subtlety needs to be cleared up. It is assumed that a source generates data taking values in a finite sample space $\mathcal{X}_i$, but $\mathcal{X}_i$ does not necessarily admit any algebraic structure. We must either assume that $\mathcal{X}_i$ carries a certain algebraic structure, for instance that $\mathcal{X}_i$ is a ring, or injectively map the elements of $\mathcal{X}_i$ into some algebraic structure. In our subsequent discussions, we assume that $\mathcal{X}_i$ is mapped into a finite ring $\mathcal{R}_i$ of order at least $|\mathcal{X}_i|$ by some injection $\Phi_i$. Hence, $\mathcal{X}_i$ can simply be treated as the subset $\Phi_i(\mathcal{X}_i) \subseteq \mathcal{R}_i$ for a fixed $\Phi_i$. When required, $\Phi_i$ can also be selected to obtain desired outcomes.
To facilitate our discussion, the following notation is used. For $T \subseteq S$, $\mathcal{X}_T$ ($x_T$ and $X_T$, resp.) is defined to be the Cartesian product
$$\prod_{i \in T} \mathcal{X}_i \quad \Bigl( \prod_{i \in T} x_i \ \text{and}\ \prod_{i \in T} X_i,\ \text{resp.} \Bigr),$$
where $x_i \in \mathcal{X}_i$ is a realization of $X_i$. If $(X_1, X_2, \ldots, X_s) \sim p$, we denote the marginal of $p$ with respect to $X_T$ by $p_{X_T}$, i.e., $X_T \sim p_{X_T}$, and define the support
$$\mathrm{supp}(p_{X_T}) = \bigl\{ x_T \in \mathcal{X}_T \;\big|\; p_{X_T}(x_T) > 0 \bigr\}$$
and $H(p_{X_T}) = H(X_T)$. For simplicity, $M_{\mathcal{X}_S, \mathcal{R}_S}$ is defined to be
$$\bigl\{ (\Phi_1, \Phi_2, \ldots, \Phi_s) \;\big|\; \Phi_i : \mathcal{X}_i \to \mathcal{R}_i \ \text{is injective},\ \forall\, i \in S \bigr\}$$
($|\mathcal{R}_i| \geq |\mathcal{X}_i|$ is implicitly assumed), and $\Phi(x_T) = \prod_{i \in T} \Phi_i(x_i)$ for any $\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}$ and $x_T \in \mathcal{X}_T$. For any $\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}$, let
$$\mathcal{R}_\Phi = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > r_{T, \mathcal{I}_T},\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, 0 \neq \mathcal{I}_i \leq_l \mathcal{R}_i \Bigr\},$$
where $r_{T, \mathcal{I}_T} = H(X_T \mid X_{T^c}) - H(Y_{\mathcal{R}_T/\mathcal{I}_T} \mid X_{T^c}) = H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c})$ and $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi(X_T) + \mathcal{I}_T$ is a random variable with sample space $\mathcal{R}_T/\mathcal{I}_T$.
Theorem 2.
$\mathcal{R}_\Phi$ is achievable with linear coding over the finite rings $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$. In exact terms, $\forall\, \epsilon > 0$, there exists $N_0 \in \mathbb{N}^+$ such that, for all $n > N_0$, there exist linear encoders (left linear mappings, to be more precise) $\phi_i : \Phi(\mathcal{X}_i)^n \to \mathcal{R}_i^{k_i}$ ($i \in S$) and a decoder $\psi$, such that
$$\Pr\Bigl\{ \psi\bigl( [\phi_i(\mathbf{X}_i)]_{i \in S} \bigr) \neq [\mathbf{X}_i]_{i \in S} \Bigr\} < \epsilon,$$
where $\mathbf{X}_i = \bigl[ \Phi_i(X_i(1)), \Phi_i(X_i(2)), \ldots, \Phi_i(X_i(n)) \bigr]^t$, as long as
$$\Bigl( \frac{k_1 \log |\mathcal{R}_1|}{n},\ \frac{k_2 \log |\mathcal{R}_2|}{n},\ \ldots,\ \frac{k_s \log |\mathcal{R}_s|}{n} \Bigr) \in \mathcal{R}_\Phi.$$
Proof. 
The proof is given in Section 4. ☐
The following is a concrete example providing some insight into this theorem.
Example 1.
Consider the single source scenario, where $X_1 \sim p$ and $\mathcal{X}_1 = \mathbb{Z}_6$, specified as follows:

$X_1$:      0     1     2     3     4     5
$p(X_1)$:  0.05  0.1   0.15  0.2   0.2   0.3

Obviously, $\mathbb{Z}_6$ contains 3 non-trivial ideals, $\mathcal{I}_1 = \{0, 3\}$, $\mathcal{I}_2 = \{0, 2, 4\}$ and $\mathbb{Z}_6$ itself, and $Y_{\mathbb{Z}_6/\mathcal{I}_1}$ and $Y_{\mathbb{Z}_6/\mathcal{I}_2}$ admit the distributions

$Y_{\mathbb{Z}_6/\mathcal{I}_1}$: $\{0,3\} \mapsto 0.25$, $\{1,4\} \mapsto 0.3$, $\{2,5\} \mapsto 0.45$;
$Y_{\mathbb{Z}_6/\mathcal{I}_2}$: $\{0,2,4\} \mapsto 0.4$, $\{1,3,5\} \mapsto 0.6$,

respectively. In addition, $Y_{\mathbb{Z}_6/\mathbb{Z}_6}$ is a constant. Thus, by Theorem 2, rate $R_1$ is achievable if
$$R_1 \frac{\log |\mathcal{I}_1|}{\log |\mathbb{Z}_6|} = R_1 \frac{\log 2}{\log 6} > H(X_1) - H(Y_{\mathbb{Z}_6/\mathcal{I}_1}) = 2.40869 - 1.53949 = 0.86920,$$
$$R_1 \frac{\log |\mathcal{I}_2|}{\log |\mathbb{Z}_6|} = R_1 \frac{\log 3}{\log 6} > H(X_1) - H(Y_{\mathbb{Z}_6/\mathcal{I}_2}) = 2.40869 - 0.97095 = 1.43774, \quad \text{and}$$
$$R_1 \frac{\log |\mathbb{Z}_6|}{\log |\mathbb{Z}_6|} = R_1 > H(X_1) - H(Y_{\mathbb{Z}_6/\mathbb{Z}_6}) = H(X_1) = 2.40869.$$
In other words,
$$\mathcal{R} = \bigl\{ R_1 \in \mathbb{R} \;\big|\; R_1 > \max\{2.24685, 2.34485, 2.40869\} \bigr\} = \bigl\{ R_1 \in \mathbb{R} \;\big|\; R_1 > 2.40869 = H(X_1) \bigr\}$$
is achievable with linear coding over the ring $\mathbb{Z}_6$. Obviously, $\mathcal{R}$ is exactly the region $\mathcal{R}[X_1]$. Optimality is claimed.
Additionally, we would like to point out that some of the inequalities defining (22) are not active in specific scenarios. Two classes of such scenarios are discussed in the following theorems. The first, Theorem 3, concerns scenarios where the rings considered are product rings, while the second, Theorem 4, concerns lower triangular matrix rings (readers may similarly consider usual matrix rings, which are often non-commutative, if interested).
Theorem 3.
Suppose $\mathcal{R}_i$ ($1 \leq i \leq s$) is a (finite) product ring $\prod_{l=1}^{k_i} \mathcal{R}_{l,i}$ of finite rings $\mathcal{R}_{l,i}$, and the sample space $\mathcal{X}_i$ satisfies $|\mathcal{X}_i| \leq |\mathcal{R}_{l,i}|$ for all feasible $i$ and $l$. Given injections $\Phi_{l,i} : \mathcal{X}_i \to \mathcal{R}_{l,i}$, let
$$\Phi = (\Phi_1, \Phi_2, \ldots, \Phi_s),$$
where $\Phi_i = \prod_{l=1}^{k_i} \Phi_{l,i}$ is defined as
$$\Phi_i : x_i \mapsto \bigl( \Phi_{1,i}(x_i), \Phi_{2,i}(x_i), \ldots, \Phi_{k_i,i}(x_i) \bigr) \in \mathcal{R}_i, \quad \forall\, x_i \in \mathcal{X}_i.$$
We have that
$$\mathcal{R}_{\Phi,\mathrm{prod}} = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, \mathcal{I}_i = \prod_{l=1}^{k_i} \mathcal{I}_{l,i} \ \text{with}\ 0 \neq \mathcal{I}_{l,i} \leq_l \mathcal{R}_{l,i} \Bigr\},$$
where $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi(X_T) + \mathcal{I}_T$, is achievable with linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$. Moreover, $\mathcal{R}_\Phi \subseteq \mathcal{R}_{\Phi,\mathrm{prod}}$.
Proof. 
The proof is found in Section 4. ☐
Let $\mathcal{R}$ be a finite ring and
$$M_{L,\mathcal{R},m} = \left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_1, a_2, \ldots, a_m \in \mathcal{R} \right\},$$
where $m$ is a positive integer. It is easy to verify that $M_{L,\mathcal{R},m}$ is a ring with respect to matrix operations. Moreover, $\mathcal{I}$ is a left ideal of $M_{L,\mathcal{R},m}$ if and only if
$$\mathcal{I} = \left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_j \in \mathcal{I}_j \leq_l \mathcal{R},\ 1 \leq j \leq m;\ \mathcal{I}_j \subseteq \mathcal{I}_{j+1},\ 1 \leq j < m \right\}.$$
Let $O(M_{L,\mathcal{R},m})$ be the set of all left ideals of the form
$$\left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_j \in \mathcal{I}_j \leq_l \mathcal{R},\ 1 \leq j \leq m;\ \mathcal{I}_j \subseteq \mathcal{I}_{j+1},\ 1 \leq j < m;\ \mathcal{I}_i = 0 \ \text{for some}\ 1 \leq i \leq m \right\}.$$
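That $M_{L,\mathcal{R},m}$ is closed under matrix multiplication can be checked exhaustively for a small case. A sketch for $\mathcal{R} = \mathbb{Z}_2$ and $m = 2$ (our illustration; each element $\bigl[\begin{smallmatrix} a & 0 \\ b & a \end{smallmatrix}\bigr]$ is determined by the pair $(a, b)$):

```python
from itertools import product

def mat(a, b):
    """Element of M_{L,Z_2,2}: the lower triangular matrix [[a, 0], [b, a]]."""
    return ((a, 0), (b, a))

def mat_mul(X, Y, q=2):
    """Matrix product with entries reduced modulo q."""
    n = len(X)
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % q
                       for j in range(n)) for i in range(n))

ring = {mat(a, b) for a, b in product(range(2), repeat=2)}
# Closure: the product of two such matrices again has the form [[a, 0], [b, a]].
assert all(mat_mul(X, Y) in ring for X in ring for Y in ring)
```

Indeed, $\bigl[\begin{smallmatrix} a & 0 \\ b & a \end{smallmatrix}\bigr] \bigl[\begin{smallmatrix} a' & 0 \\ b' & a' \end{smallmatrix}\bigr] = \bigl[\begin{smallmatrix} aa' & 0 \\ ba' + ab' & aa' \end{smallmatrix}\bigr]$, which has the same banded shape.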
Theorem 4.
Let $\mathcal{R}_i$ ($1 \leq i \leq s$) be a finite ring such that $|\mathcal{X}_i| \leq |\mathcal{R}_i|$. For any injections $\Phi_i : \mathcal{X}_i \to \mathcal{R}_i$, let
$$\Phi' = (\Phi'_1, \Phi'_2, \ldots, \Phi'_s),$$
where $\Phi'_i : \mathcal{X}_i \to M_{L,\mathcal{R}_i,m_i}$ is defined from $\Phi_i$ as
$$\Phi'_i : x_i \mapsto \begin{bmatrix} \Phi_i(x_i) & 0 & \cdots & 0 \\ \Phi_i(x_i) & \Phi_i(x_i) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \Phi_i(x_i) & \Phi_i(x_i) & \cdots & \Phi_i(x_i) \end{bmatrix}, \quad \forall\, x_i \in \mathcal{X}_i.$$
We have that
$$\mathcal{R}_{\Phi,m} = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, \mathcal{I}_i \leq_l M_{L,\mathcal{R}_i,m_i} \ \text{and}\ \mathcal{I}_i \notin O(M_{L,\mathcal{R}_i,m_i}) \Bigr\},$$
where $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi'(X_T) + \mathcal{I}_T$, is achievable with linear coding over $M_{L,\mathcal{R}_1,m_1}, M_{L,\mathcal{R}_2,m_2}, \ldots, M_{L,\mathcal{R}_s,m_s}$. Moreover, $\mathcal{R}_\Phi \subseteq \mathcal{R}_{\Phi,m}$.
Proof. 
The proof is found in Section 4. ☐
Remark 9.
The difference between (22), (29) and (35) lies in their restrictions defining I i ’s, respectively, as highlighted in the proofs given in Section 4.
Remark 10.
Without much effort, one can see that $\mathcal{R}_\Phi$ ($\mathcal{R}_{\Phi,\mathrm{prod}}$ and $\mathcal{R}_{\Phi,m}$, respectively) in Theorem 2 (Theorems 3 and 4, respectively) depends on $\Phi$ via the random variables $Y_{\mathcal{R}_T/\mathcal{I}_T}$, whose distributions are determined by $\Phi$. For each $i \in S$, there exist $\frac{|\mathcal{R}_i|!}{(|\mathcal{R}_i| - |\mathcal{X}_i|)!}$ distinct injections from $\mathcal{X}_i$ to a ring $\mathcal{R}_i$ of order at least $|\mathcal{X}_i|$. Let $\mathrm{cov}(\mathcal{A})$ be the convex hull of a set $\mathcal{A} \subseteq \mathbb{R}^s$. By a straightforward time sharing argument, we have that
$$\mathcal{R}_l = \mathrm{cov}\Bigl( \bigcup_{\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}} \mathcal{R}_\Phi \Bigr)$$
is achievable with linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$.
Remark 11.
From Theorem 5, one will see that (22) and (36) coincide when all the rings are fields; in fact, both are identical to the SW region. However, (36) can be strictly larger than (22) (see Section 5) when not all the rings are fields. This implies that, in order to achieve a desired rate, a suitable injection is required. However, bear in mind that taking the convex hull in (36) is not always needed for optimality, as shown in Example 1. A more detailed elaboration on this issue is found in Section 5.
The rest of this section provides key supporting lemmata and concepts used to prove Theorems 2–4. The final proofs are presented in Section 4.
Lemma 2.
Let $\mathbf{x}, \mathbf{y} \in \mathcal{R}^n$ be two distinct sequences, where $\mathcal{R}$ is a finite ring, and assume that $\mathbf{y} - \mathbf{x} = (a_1, a_2, \ldots, a_n)^t$. If $f : \mathcal{R}^n \to \mathcal{R}^k$ is a linear mapping chosen uniformly at random, i.e., the $k \times n$ coefficient matrix $A$ of $f$ is generated by independently choosing each entry of $A$ from $\mathcal{R}$ uniformly at random, then
$$\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = |\mathcal{I}|^{-k},$$
where $\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_l$.
Proof. 
Let $f = (f_1, f_2, \ldots, f_k)^t$, where $f_i : \mathcal{R}^n \to \mathcal{R}$ is a random linear function. Then
$$\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = \Pr\Bigl\{ \bigcap_{i=1}^{k} \{ f_i(\mathbf{x}) = f_i(\mathbf{y}) \} \Bigr\} = \prod_{i=1}^{k} \Pr\{ f_i(\mathbf{x} - \mathbf{y}) = 0 \},$$
since the $f_i$'s are independent of each other. The statement follows from Lemma 1, which ensures that $\Pr\{ f_i(\mathbf{x} - \mathbf{y}) = 0 \} = |\mathcal{I}|^{-1}$. ☐
Remark 12.
In Lemma 2, if $\mathcal{R}$ is a field and $\mathbf{x} \neq \mathbf{y}$, then $\mathcal{I} = \mathcal{R}$ because every non-zero $a_i$ is a unit. Thus, $\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = |\mathcal{R}|^{-k}$.
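The collision probability of Lemma 2 can be verified by exact enumeration over all coefficient rows. A sketch for $\mathcal{R} = \mathbb{Z}_6$, $n = 2$, $k = 1$ (our illustration; `diff` plays the role of $\mathbf{y} - \mathbf{x}$):

```python
from itertools import product
from fractions import Fraction

q, n = 6, 2

def ideal(gens):
    """Ideal of Z_6 generated by `gens` (left = right, Z_6 is commutative)."""
    return {sum(b * a for b, a in zip(bs, gens)) % q
            for bs in product(range(q), repeat=len(gens))}

def collision_prob(diff):
    """Exact Pr{f(x) = f(y)} over a uniformly random 1-by-n coefficient row,
    where diff = y - x componentwise in Z_6; f(x) = f(y) iff f(diff) = 0."""
    hits = sum(1 for row in product(range(q), repeat=n)
               if sum(a * d for a, d in zip(row, diff)) % q == 0)
    return Fraction(hits, q ** n)

# Lemma 2 with k = 1: the collision probability is exactly 1 / |<diff>|.
for diff in product(range(q), repeat=n):
    if any(diff):  # x != y
        assert collision_prob(diff) == Fraction(1, len(ideal(diff)))
```

For $k > 1$ the rows of the coefficient matrix are independent, so the probability is the $k$th power of the $k = 1$ value, as in the proof above.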
Definition 7
([18]). Let $X \sim p_X$ be a discrete random variable with sample space $\mathcal{X}$. The set $T_\epsilon(n, X)$ of strongly $\epsilon$-typical sequences of length $n$ with respect to $X$ is defined to be
$$\Bigl\{ \mathbf{x} \in \mathcal{X}^n \;\Big|\; \Bigl| \frac{N(x; \mathbf{x})}{n} - p_X(x) \Bigr| \leq \epsilon,\ \forall\, x \in \mathcal{X} \Bigr\},$$
where $N(x; \mathbf{x})$ is the number of occurrences of $x$ in the sequence $\mathbf{x}$.
The notation T ϵ ( n , X ) is sometimes replaced by T ϵ when the length n and the random variable X referred to are clear from the context.
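Definition 7 translates into a direct membership test on empirical frequencies; a sketch (our illustration, with a toy binary alphabet):

```python
from collections import Counter

def is_strongly_typical(seq, p, eps):
    """Check seq in T_eps(n, X): |N(x; seq)/n - p(x)| <= eps for every x."""
    n = len(seq)
    counts = Counter(seq)
    return all(abs(counts.get(x, 0) / n - px) <= eps for x, px in p.items())

p = {'a': 0.5, 'b': 0.5}
assert is_strongly_typical('abab', p, 0.1)       # empirical frequencies match p
assert not is_strongly_typical('aaaa', p, 0.1)   # frequency of 'a' is off by 0.5
```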
We now conclude this section with the following lemma, which is a crucial ingredient in our proofs of the achievability theorems. It generalizes the classic conditional typicality lemma ([19], Theorem 15.2.2), and at the same time it is what distinguishes our argument from the one for the field version.
Lemma 3.
Let $(X_1, X_2) \sim p$ be a jointly distributed random variable whose sample space is a finite ring $\mathcal{R} = \mathcal{R}_1 \times \mathcal{R}_2$. For any $\eta > 0$, there exists $\epsilon > 0$ such that, $\forall\, (\mathbf{x}_1, \mathbf{x}_2)^t \in T_\epsilon(n, (X_1, X_2))$ and $\forall\, \mathcal{I} \leq_l \mathcal{R}_1$,
$$\bigl| D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) \bigr| < 2^{n \bigl[ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta \bigr]},$$
where
$$D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) = \bigl\{ (\mathbf{y}, \mathbf{x}_2)^t \in T_\epsilon \;\big|\; \mathbf{y} - \mathbf{x}_1 \in \mathcal{I}^n \bigr\}$$
and $Y_{\mathcal{R}_1/\mathcal{I}} = X_1 + \mathcal{I}$ is a random variable with sample space $\mathcal{R}_1/\mathcal{I}$.
Proof. 
Define the mapping $\Gamma : \mathcal{R}_1 \to \mathcal{R}_1/\mathcal{I}$ by
$$\Gamma : x_1 \mapsto x_1 + \mathcal{I}, \quad \forall\, x_1 \in \mathcal{R}_1.$$
Assume that $\mathbf{x}_1 = \bigl[ x_1(1), x_1(2), \ldots, x_1(n) \bigr]$, and let
$$\bar{\mathbf{y}} = \bigl[ \Gamma(x_1(1)), \Gamma(x_1(2)), \ldots, \Gamma(x_1(n)) \bigr].$$
By definition, $\forall\, (\mathbf{y}, \mathbf{x}_2)^t \in D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2)$, where $\mathbf{y} = \bigl[ y(1), y(2), \ldots, y(n) \bigr]$,
$$\bigl[ \Gamma(y(1)), \Gamma(y(2)), \ldots, \Gamma(y(n)) \bigr] = \bar{\mathbf{y}}.$$
Moreover,
$$(\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon(n, (X_1, Y_{\mathcal{R}_1/\mathcal{I}}, X_2)), \quad \text{and}$$
$$D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) = \bigl\{ (\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon \;\big|\; \mathbf{y} - \mathbf{x}_1 \in \mathcal{I}^n \bigr\}.$$
For fixed $(\bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon$, the number of strongly $\epsilon$-typical sequences $\mathbf{y}$ such that $(\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t$ is strongly $\epsilon$-typical is strictly upper bounded by $2^{n [ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta ]}$ if $n$ is large enough and $\epsilon$ is small. Therefore,
$$\bigl| D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) \bigr| < 2^{n \bigl[ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta \bigr]}. \qquad \square$$
Remark 13.
We acknowledge an anonymous reviewer of our paper [20] for suggesting the proof of Lemma 3 given above. Our original proof was presented as a special case of a more general result in [21]. The techniques behind the two proofs are quite different; however, the full generality of our original proof is better appreciated in non-i.i.d. scenarios, as in [21].
Remark 14.
Assume that $\mathbf{y} - \mathbf{x} = (a_1, a_2, \ldots, a_n)^t$; then $\mathbf{y} - \mathbf{x} \in \mathcal{I}^n$ is equivalent to $\langle a_1, a_2, \ldots, a_n \rangle_l \subseteq \mathcal{I}$.

4. Proof of the Achievability Theorems

4.1. Proof of Theorem 2

As mentioned, $\mathcal{X}_i$ can be seen as a subset of $\mathcal{R}_i$ for a fixed $\Phi = (\Phi_1, \ldots, \Phi_s)$. In this section, we assume that $X_i$ has sample space $\mathcal{R}_i$, which is legitimate since $\Phi_i$ is injective.
Let $R = (R_1, R_2, \ldots, R_s)$ and $k_i = \bigl\lceil \frac{n R_i}{\log |\mathcal{R}_i|} \bigr\rceil$, $\forall\, i \in S$, where $n$ is the length of the data sequences. If $R \in \mathcal{R}_\Phi$, then $\sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > r_{T, \mathcal{I}_T}$ (this implies that $\frac{1}{n} \sum_{i \in T} k_i \log |\mathcal{I}_i| - r_{T, \mathcal{I}_T} > 2\eta$ for some small constant $\eta > 0$ and large enough $n$), $\forall\, \emptyset \neq T \subseteq S$, $\forall\, 0 \neq \mathcal{I}_i \leq_l \mathcal{R}_i$. We claim that $R$ is achievable by linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$.
Encoding:
For every $i \in S$, randomly generate a $k_i \times n$ matrix $A_i$ based on the uniform distribution, i.e., independently choose each entry of $A_i$ uniformly at random from $\mathcal{R}_i$. Define a linear encoder $\phi_i : \mathcal{R}_i^n \to \mathcal{R}_i^{k_i}$ such that
$$\phi_i : \mathbf{x} \mapsto A_i \mathbf{x}, \quad \forall\, \mathbf{x} \in \mathcal{R}_i^n.$$
Obviously, the coding rate of this encoder is
$$\frac{1}{n} \log \bigl| \phi_i(\mathcal{R}_i^n) \bigr| \leq \frac{1}{n} \log |\mathcal{R}_i|^{k_i} = \frac{\log |\mathcal{R}_i|}{n} \left\lceil \frac{n R_i}{\log |\mathcal{R}_i|} \right\rceil \approx R_i.$$
Decoding:
Upon observing $\mathbf{y}_i \in \mathcal{R}_i^{k_i}$ ($i \in S$) from the $i$th encoder, the decoder claims that $\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_s)^t \in \prod_{i=1}^{s} \mathcal{R}_i^n$ is the array of encoded data sequences if and only if:
  • $\mathbf{x} \in T_\epsilon$; and
  • $\forall\, \mathbf{x}' = (\mathbf{x}'_1, \mathbf{x}'_2, \ldots, \mathbf{x}'_s)^t \in T_\epsilon$, if $\mathbf{x}' \neq \mathbf{x}$, then $\phi_j(\mathbf{x}'_j) \neq \mathbf{y}_j$ for some $j$.
Error:
Assume that $\mathbf{X}_i = \mathbf{x}_i \in \mathcal{R}_i^n$ ($i \in S$) is the original data sequence generated by the $i$th source. It is readily seen that an error occurs if and only if one of the following events occurs:
E1: $\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_s)^t \notin T_\epsilon$;
E2: There exists $\mathbf{x}' = (\mathbf{x}'_1, \mathbf{x}'_2, \ldots, \mathbf{x}'_s)^t \in T_\epsilon$ with $\mathbf{x}' \neq \mathbf{x}$, such that $\phi_i(\mathbf{x}'_i) = \phi_i(\mathbf{x}_i)$, $\forall\, i \in S$.
Error Probability:
By the joint asymptotic equipartition principle (AEP) ([18], Theorem 6.9), Pr{E_1} → 0 as n → ∞.
Additionally, for T ⊆ S, let
D_ϵ(x; T) = { (x′_1, x′_2, …, x′_s)^t ∈ T_ϵ | x′_i ≠ x_i, ∀ i ∈ T and x′_i = x_i, ∀ i ∈ T^c }.
We have
D_ϵ(x; T) ⊆ ∪_{ 0 ≠ I_i ≤_l R_i, i ∈ T } D_ϵ( x_T, I_T | x_{T^c} ) \ { x },    (51)
where x_T = (x_i)_{i ∈ T} and x_{T^c} = (x_i)_{i ∈ T^c}, since I_i runs over all possible non-trivial left ideals. Consequently,
Pr{ E_2 | E_1^c } ≤ Σ_{ x′ = (x′_1, …, x′_s)^t ∈ T_ϵ \ {x} } Π_{i∈S} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }
= Σ_{T⊆S} Σ_{ x′ ∈ D_ϵ(x; T) } Π_{i∈T} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }    (52)
≤ Σ_{T⊆S} Σ_{ 0 ≠ I_i ≤_l R_i, i ∈ T } Σ_{ x′ ∈ D_ϵ(x_T, I_T | x_{T^c}) \ {x} } Π_{i∈T} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }    (53)
< Σ_{T⊆S} Σ_{ 0 ≠ I_i ≤_l R_i, i ∈ T } 2^{ n [ r(T, I_T) + η ] } Π_{i∈T} |I_i|^{−k_i}    (54)
< (2^s − 1)(2^{|R_S|} − 1) × max_{ T ⊆ S, 0 ≠ I_i ≤_l R_i, i ∈ T } 2^{ −n [ (1/n) Σ_{i∈T} k_i log|I_i| − r(T, I_T) − η ] },    (55)
where
  • (52) is from the fact that T_ϵ \ {x} = ∪_{T⊆S} D_ϵ(x; T) (a disjoint union);
  • (53) follows from (51) by the union bound (Boole's inequality);
  • (54) is from Lemmas 2 and 3, as well as the fact that every left ideal of R_T is a Cartesian product of left ideals I_i of R_i, i ∈ T (see Proposition 3); at the same time, ϵ is required to be sufficiently small;
  • (55) is due to the facts that the number of non-empty subsets of S is 2^s − 1 and the number of non-trivial left ideals of the finite ring R_T is less than 2^{|R_S|} − 1, which is the number of non-empty subsets of R_S ⊇ R_T.
Thus, Pr{ E_2 | E_1^c } → 0 as n → ∞, from (55), since for sufficiently large n and small ϵ, (1/n) Σ_{i∈T} k_i log|I_i| − r(T, I_T) − η > η > 0.
Therefore, Pr{ E_1 ∪ E_2 } ≤ Pr{ E_1 } + Pr{ E_2 | E_1^c } → 0 as ϵ → 0 and n → ∞.

4.2. Proof of Theorem 3

The proof follows almost the same steps as that of Theorem 2, except that the performance analysis only focuses on sequences (a_{i,1}, a_{i,2}, …, a_{i,n}) ∈ R_i^n (1 ≤ i ≤ s) such that
a_{i,j} = ( Φ_{1,i}(x_i(j)), Φ_{2,i}(x_i(j)), …, Φ_{k_i,i}(x_i(j)) ) ∈ Π_{l=1}^{k_i} R_{l,i}
for some x_i(j) ∈ X_i. Let X_i, Y_i be any two such sequences satisfying X_i − Y_i ∈ I_i^n for some I_i ≤_l R_i. Based on the special structure of X_i and Y_i, it is easy to verify that I_i ≠ 0 implies I_i = Π_{l=1}^{k_i} I_{l,i} with 0 ≠ I_{l,i} ≤_l R_{l,i} for all 1 ≤ l ≤ k_i. (This causes the difference between (22) and (29).) In addition, it is obvious that R_Φ ⊆ R_{Φ,prod} by their definitions.

4.3. Proof of Theorem 4

The proof is similar to that of Theorem 2, except that it only focuses on sequences (a_{i,1}, a_{i,2}, …, a_{i,n}) ∈ M_{L,R_i,m_i}^n (1 ≤ i ≤ s) such that a_{i,j} ∈ M_{L,R_i,m_i} satisfies [a_{i,j}]_{u,v} = a if u ≥ v and [a_{i,j}]_{u,v} = 0 otherwise, for some a ∈ R_i. Let X_i, Y_i be any two such sequences such that X_i − Y_i ∈ I_i^n for some I_i ≤_l M_{L,R_i,m_i}. It is easily seen that I_i ≠ 0 if and only if I_i ∈ O(M_{L,R_i,m_i}). (This causes the difference between (22) and (35).) In addition, it is obvious that R_Φ ⊆ R_{Φ,m} by their definitions.

5. Optimality

Obviously, Theorem 2 specializes to its field counterpart if all rings considered are fields, as summarized in the following theorem.
Theorem 5.
Region (22) is the SW region if R_i contains no proper non-trivial left ideal or, equivalently, if R_i is a field, for all i ∈ S. As a consequence, region (36) is the SW region.
Proof. 
In Theorem 2, the random variable Y_{R_T/I_T} admits a sample space of cardinality 1 for all T ⊆ S, since the only non-trivial left ideal of R_i is R_i itself for all feasible i. Thus, 0 = H(Y_{R_T/I_T}) ≥ H(Y_{R_T/I_T} | X_{T^c}) ≥ 0. Consequently,
R_Φ = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i > H(X_T | X_{T^c}), ∀ T ⊆ S },
which is the SW region R[X_1, X_2, …, X_s]. Therefore, region (36) is also the SW region.
If R_i is a field, then obviously it has no proper non-trivial left (right) ideal. Conversely, assume that R_i has no proper non-trivial left ideal and let 0 ≠ a ∈ R_i. Then the left ideal generated by a equals R_i, which implies that there exists 0 ≠ b ∈ R_i such that ba = 1. Similarly, there exists 0 ≠ c ∈ R_i such that cb = 1. Moreover, c = c · 1 = c(ba) = (cb)a = 1 · a = a. Hence, ab = cb = 1, i.e., b is a two-sided inverse of a. Thus R_i is a division ring, and by Wedderburn's little theorem, R_i is a field. ☐
One important question to address is whether linear coding over finite non-field rings can be equally optimal for data compression. Hereby, we claim that, for any SW scenario, there always exist linear encoders over some finite non-field rings which achieve the data compression limit. Therefore, optimality of linear coding over finite non-field rings for data compression is established in the sense of existence.

5.1. Existence Theorem I: Single Source

For any single source scenario, the assertion that there always exists a finite ring R_1 such that R_l is in fact the SW region
R[X_1] = { R_1 ∈ ℝ | R_1 > H(X_1) },
is equivalent to the existence of a finite ring R_1 and an injection Φ_1: X_1 → R_1 such that
max_{ 0 ≠ I_1 ≤_l R_1 } ( log|R_1| / log|I_1| ) [ H(X_1) − H(Y_{R_1/I_1}) ] = H(X_1),    (59)
where Y_{R_1/I_1} = Φ_1(X_1) + I_1.
Theorem 6.
Let R_1 be a finite ring of order |R_1| ≥ |X_1|. If R_1 contains one and only one proper non-trivial left ideal I_0, and |I_0| = √|R_1|, then region (36) coincides with the SW region, i.e., there exists an injection Φ_1: X_1 → R_1 such that (59) holds.
Remark 15.
Examples of such a non-field ring R_1 in the above theorem include
M_{L,p} = { [ x 0 ; y x ] | x, y ∈ Z_p }
(the 2 × 2 matrices with rows (x, 0) and (y, x); M_{L,p} is a ring with respect to matrix addition and multiplication) and Z_{p²}, where p is any prime. For any single source scenario, one can always choose R_1 to be either M_{L,p} or Z_{p²}. Consequently, optimality is attained.
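Both structural claims can be checked by brute force for small p. The sketch below (our own illustration, with p = 2) enumerates all left ideals of Z_4 and of M_{L,2} — the latter encoded as pairs (x, y) standing for the matrix with rows (x, 0) and (y, x) — and confirms that each ring has exactly one proper non-trivial left ideal, of order √|R| = 2:

```python
from itertools import combinations, product

def left_ideals(elems, add, mul):
    """All left ideals of a finite ring, given its elements and operations."""
    zero = next(e for e in elems if all(add(e, f) == f for f in elems))
    rest = [e for e in elems if e != zero]
    ideals = []
    for size in range(len(rest) + 1):
        for comb in combinations(rest, size):
            I = {zero, *comb}
            # closed under addition and under left multiplication by the ring
            if all(add(a, b) in I for a in I for b in I) and \
               all(mul(r, a) in I for r in elems for a in I):
                ideals.append(frozenset(I))
    return ideals

# Z_4: unique proper non-trivial ideal {0, 2}
z4_ideals = left_ideals(list(range(4)),
                        lambda a, b: (a + b) % 4, lambda a, b: (a * b) % 4)
proper = [I for I in z4_ideals if 1 < len(I) < 4]
assert proper == [frozenset({0, 2})] and len(proper[0]) ** 2 == 4

# M_{L,2}: element (x, y) = matrix [[x, 0], [y, x]] over Z_2
ml2 = list(product(range(2), repeat=2))
add = lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)
mul = lambda a, b: ((a[0] * b[0]) % 2, (a[1] * b[0] + a[0] * b[1]) % 2)
proper_ml2 = [I for I in left_ideals(ml2, add, mul) if 1 < len(I) < 4]
assert len(proper_ml2) == 1 and len(next(iter(proper_ml2))) ** 2 == 4
```

The multiplication rule for M_{L,2} is read off from the matrix product [[x,0],[y,x]]·[[x′,0],[y′,x′]] = [[xx′,0],[yx′+xy′,xx′]].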
Proof of Theorem 6.
Notice that the random variable Y_{R_1/I_0} depends on the injection Φ_1, and so does its entropy H(Y_{R_1/I_0}). Obviously, H(Y_{R_1/R_1}) = 0, since the sample space of the random variable Y_{R_1/R_1} contains only one element. Therefore,
( log|R_1| / log|R_1| ) [ H(X_1) − H(Y_{R_1/R_1}) ] = H(X_1).
Consequently, (59) is equivalent to
( log|R_1| / log|I_0| ) [ H(X_1) − H(Y_{R_1/I_0}) ] ≤ H(X_1) ⟺ H(X_1) ≤ 2 H(Y_{R_1/I_0}),    (62)
since |I_0| = √|R_1|. By Lemma A1, there exists an injection Φ̃_1: X_1 → R_1 such that (62) holds when Φ_1 = Φ̃_1. The statement follows. ☐
Up to isomorphism, there are exactly 4 distinct rings of order p² for a given prime p: the three non-field rings Z_p × Z_p, M_{L,p} and Z_{p²}, together with the field F_{p²}. It has been proved that, using linear encoders over any of the last three, optimality can always be achieved in the single source scenario. Actually, the same holds true for all multiple source scenarios.

5.2. Existence Theorem II: Multiple Sources

Theorem 7.
Let R_1, R_2, …, R_s be s finite rings with |R_i| ≥ |X_i|. If R_i is isomorphic to either
1. 
a field, i.e., R i contains no proper non-trivial left (right) ideal; or
2. 
a ring containing one and only one proper non-trivial left ideal I_{0i} with |I_{0i}| = √|R_i|,
for all feasible i, then (36) coincides with the SW region R [ X 1 , X 2 , , X s ] .
Remark 16.
It is obvious that Theorem 7 includes Theorem 6 as a special case. In fact, its proof resembles that of Theorem 6. Examples of the R_i's include all finite fields, M_{L,p} and Z_{p²}, where p is a prime. However, Theorem 7 does not guarantee that all rates, except the vertices, in the polytope of the SW region are “directly” achievable in the multiple source case. A time sharing scheme is required in our current proof. Nevertheless, all rates are “directly” achievable if the R_i's are fields or if s = 1. This is partially the reason that the two theorems are stated separately.
Remark 17.
Theorem 7 also includes Theorem 5 as a special case. However, Theorem 5 admits a simpler proof compared to the one for Theorem 7.
Proof of Theorem 7.
It suffices to prove that, for any R = (R_1, R_2, …, R_s) ∈ ℝ^s satisfying
R_i > H(X_i | X_{i−1}, X_{i−2}, …, X_1), ∀ 1 ≤ i ≤ s,
we have R ∈ R_Φ for some set of injections Φ = (Φ_1, Φ_2, …, Φ_s), where Φ_i: X_i → R_i. Let Φ̃ = (Φ̃_1, Φ̃_2, …, Φ̃_s) be the set of injections, where, if
(i)
R i is a field, Φ ˜ i is any injection;
(ii)
R i satisfies 2, Φ ˜ i is the injection such that
H(X_i | X_{i−1}, X_{i−2}, …, X_1) ≤ 2 H(Y_{R_i/I_{0i}} | X_{i−1}, X_{i−2}, …, X_1),
when Φ i = Φ ˜ i . The existence of Φ ˜ i is guaranteed by Lemma A1.
If Φ = Φ ˜ , then
( log|I_i| / log|R_i| ) H(X_i | X_{i−1}, X_{i−2}, …, X_1) ≥ H(X_i | X_{i−1}, X_{i−2}, …, X_1) − H(Y_{R_i/I_i} | X_{i−1}, X_{i−2}, …, X_1)
= H(X_i | Y_{R_i/I_i}, X_{i−1}, X_{i−2}, …, X_1),
for all 1 ≤ i ≤ s and 0 ≠ I_i ≤_l R_i. As a consequence,
Σ_{i∈T} R_i log|I_i| / log|R_i| > Σ_{i∈T} ( log|I_i| / log|R_i| ) H(X_i | X_{i−1}, X_{i−2}, …, X_1)
≥ Σ_{i∈T} H(X_i | Y_{R_i/I_i}, X_{i−1}, X_{i−2}, …, X_1)
≥ Σ_{i∈T} H(X_i | Y_{R_T/I_T}, X_{T^c}, X_{i−1}, X_{i−2}, …, X_1)
≥ H( X_T | Y_{R_T/I_T}, X_{T^c} )
= H( X_T | X_{T^c} ) − H( Y_{R_T/I_T} | X_{T^c} ),
for all T ⊆ {1, 2, …, s}. Thus, R ∈ R_Φ̃. ☐
By Theorems 5–7, we draw the conclusion that
Corollary 1.
For any SW scenario, there always exists a sequence of linear encoders over some finite rings (fields or non-field rings) which achieves the data compression limit, the SW region.
In fact, LCoR can be optimal even for rings beyond those stated in the above theorems (see Example 1). We classify some of these scenarios in the remaining parts of this section.

5.3. Product Rings

Theorem 8.
Let R_{l,1}, R_{l,2}, …, R_{l,s} (l = 1, 2) be two sets of finite rings, the rings within each set being of equal size, and let R_i = R_{1,i} × R_{2,i} for all feasible i. If the coding rate R ∈ ℝ^s is achievable with linear encoders over R_{l,1}, R_{l,2}, …, R_{l,s} for both l = 1 and l = 2, then R is achievable with linear encoders over R_1, R_2, …, R_s.
Proof. 
By definition, R is a convex combination of coding rates which are achieved by different linear encoding schemes over R_{l,1}, R_{l,2}, …, R_{l,s} (l = 1, 2), respectively. To be more precise, there exist R_1, R_2, …, R_m ∈ ℝ^s and positive numbers w_1, w_2, …, w_m with Σ_{j=1}^m w_j = 1, such that R = Σ_{j=1}^m w_j R_j. Moreover, there exist injections Φ_l = (Φ_{l,1}, Φ_{l,2}, …, Φ_{l,s}) (l = 1, 2), where Φ_{l,i}: X_i → R_{l,i}, such that
R_j ∈ R_{Φ_l} = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i log|I_{l,i}| / log|R_{l,i}| > H(X_T | X_{T^c}) − H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}), ∀ T ⊆ S, ∀ 0 ≠ I_{l,i} ≤_l R_{l,i} },    (72)
where R_{l,T} = Π_{i∈T} R_{l,i}, I_{l,T} = Π_{i∈T} I_{l,i} and Y_{R_{l,T}/I_{l,T}} = Φ_l(X_T) + I_{l,T} is a random variable with sample space R_{l,T}/I_{l,T}. To show that R is achievable with linear encoders over R_1, R_2, …, R_s, it suffices to prove that R_j is achievable with linear encoders over R_1, R_2, …, R_s for all feasible j. Let R_j = (R_{j,1}, R_{j,2}, …, R_{j,s}). For all T ⊆ S and 0 ≠ I_i = I_{1,i} × I_{2,i} ≤_l R_i with 0 ≠ I_{l,i} ≤_l R_{l,i} (l = 1, 2), we have
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| = ( Σ_{i∈T} R_{j,i} log|I_{1,i}| / log|R_{1,i}| ) c_1 / (c_1 + c_2) + ( Σ_{i∈T} R_{j,i} log|I_{2,i}| / log|R_{2,i}| ) c_2 / (c_1 + c_2),
where c_l = log|R_{l,1}|. By (72), it can easily be seen that
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| > H(X_T | X_{T^c}) − ( 1 / (c_1 + c_2) ) Σ_{l=1}^2 c_l H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}).
Meanwhile, let R_T = Π_{i∈T} R_i, I_T = Π_{i∈T} I_i, Φ = (Φ_{1,1} × Φ_{2,1}, Φ_{1,2} × Φ_{2,2}, …, Φ_{1,s} × Φ_{2,s}) (note:
Φ_{1,i} × Φ_{2,i}: x_i ↦ ( Φ_{1,i}(x_i), Φ_{2,i}(x_i) ) ∈ R_i
for all x_i ∈ X_i) and Y_{R_T/I_T} = Φ(X_T) + I_T. It can be verified that Y_{R_{l,T}/I_{l,T}} (l = 1, 2) is a function of Y_{R_T/I_T}; hence, H(Y_{R_T/I_T} | X_{T^c}) ≥ H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}). Consequently,
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| > H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}),
which implies that R_j ∈ R_{Φ,prod} by Theorem 3. We therefore conclude that R_j is achievable with linear encoders over R_1, R_2, …, R_s for all feasible j, and so is R. ☐
Obviously, R 1 , R 2 , , R s in Theorem 8 are of the same size. Inductively, one can verify the following without any difficulty.
Theorem 9.
Let L be any finite index set, and let R_{l,1}, R_{l,2}, …, R_{l,s} (l ∈ L) be sets of finite rings, the rings within each set being of equal size, with R_i = Π_{l∈L} R_{l,i} for all feasible i. If the coding rate R ∈ ℝ^s is achievable with linear encoders over R_{l,1}, R_{l,2}, …, R_{l,s} for every l ∈ L, then R is achievable with linear encoders over R_1, R_2, …, R_s.
Remark 18.
There are delicate issues in the situation Theorem 9 (Theorem 8) illustrates. Let X_i (1 ≤ i ≤ s) be the set of all symbols generated by the ith source. The hypothesis of Theorem 9 (Theorem 8) implicitly imposes the alphabet constraint |X_i| ≤ |R_{l,i}| for all feasible i and l.
Let R_1, R_2, …, R_s be s finite rings, each of which is isomorphic to either
  • a ring R containing one and only one proper non-trivial left ideal, whose order is √|R|, e.g., M_{L,p} and Z_{p²} (p a prime); or
  • a finite product of finite field(s) and/or ring(s) satisfying the previous condition, e.g., M_{L,p} × Π_{j=1}^m Z_{p_j} (p and the p_j's are prime) and Π_{i=1}^m M_{L,p_i} × Π_{j=1}^{m′} F_{q_j} (m and m′ are non-negative integers, the p_i's are prime and the q_j's are powers of primes).
Theorems 7 and 9 ensure that linear encoders over the rings R_1, R_2, …, R_s are always optimal in any applicable (subject to the condition specified in the corresponding theorem) SW coding scenario. As a very special case, Z_p × Z_p, where p is a prime, is always optimal in any (single source or multiple source) scenario with alphabet size less than or equal to p. However, using a field or a product ring is not necessary: as shown in Theorem 6, neither M_{L,p} nor Z_{p²} is (isomorphic to) a product of rings or a field. Nor is a restriction on the alphabet size always required (see Theorem 7), even for product rings (see Example 1 for a case of Z_2 × Z_3).

5.4. Trivial Case: Uniform Distributions

The following theorem is trivial, however we include it for completeness.
Theorem 10.
Regardless of which set of rings R_1, R_2, …, R_s is chosen, as long as |R_i| = |X_i| for all feasible i, region (22) is the SW region if (X_1, X_2, …, X_s) ∼ p is uniformly distributed.
Proof. 
If p is uniform, then, for any T ⊆ S and 0 ≠ I_T ≤_l R_T, Y_{R_T/I_T} is uniformly distributed on R_T/I_T. Moreover, X_T and X_{T^c} are independent, and so are Y_{R_T/I_T} and X_{T^c}. Therefore, H(X_T | X_{T^c}) = H(X_T) = log|R_T| and H(Y_{R_T/I_T} | X_{T^c}) = H(Y_{R_T/I_T}) = log( |R_T| / |I_T| ). Consequently,
r(T, I_T) = H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}) = log|I_T|.
Region (22) is the SW region. ☐
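For a single uniform source, these identities are easy to check numerically. An illustrative sketch (our own, over R = Z_4 with ideal I = {0, 2}):

```python
from math import log2

# X uniform on R = Z_4; ideal I = {0, 2}; Y = X + I lives on the cosets.
R, I = [0, 1, 2, 3], [0, 2]
pX = {x: 1 / len(R) for x in R}

def H(p):
    """Shannon entropy (bits) of a pmf given as a dict of probabilities."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

# Distribution of the coset variable Y = X + I.
pY = {}
for x, px in pX.items():
    coset = frozenset((x + a) % 4 for a in I)
    pY[coset] = pY.get(coset, 0) + px

r = H(pX) - H(pY)   # r(T, I_T) in the single-source case

assert abs(H(pX) - log2(len(R))) < 1e-12          # H(X)  = log|R|
assert abs(H(pY) - log2(len(R) / len(I))) < 1e-12  # H(Y)  = log(|R|/|I|)
assert abs(r - log2(len(I))) < 1e-12               # r     = log|I|
```

The same computation goes through verbatim for any finite ring and any of its ideals, with cosets replacing the explicit set {0, 2}.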
Remark 19.
When p is uniform, it is obvious that the uncoded strategy (all encoders are one-to-one mappings) is optimal in the SW source coding problem. However, optimality stated in Theorem 10 does not come from deliberately fixing the linear encoding mappings, but generating them randomly.
So far, we have only shown that there exist linear encoders over finite non-field rings that are as good as their field counterparts. In the next section, Problem 1 is considered for an arbitrary g. It will be demonstrated that linear coding over finite non-field rings can strictly outperform its field counterpart for encoding some discrete functions, and that there are infinitely many such functions.

6. Application: Source Coding for Computing

The problem of source coding for computing, Problem 1, with an arbitrary g is addressed in this section. Some advantages of LCoR (compared to LCoF) will be demonstrated. We begin by establishing the following theorem, which can be recognized as a generalization of Körner–Marton [4].
Theorem 11.
Let R be a finite ring, and
ĝ = h ∘ k, where k(x_1, x_2, …, x_s) = Σ_{i=1}^s k_i(x_i),    (78)
and h, k_i's are functions mapping R to R. Then
R_ĝ = { (r, r, …, r) ∈ ℝ^s | r > max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] } ⊆ R[ĝ],    (79)
where X = k(X_1, X_2, …, X_s) and Y_{R/I} = X + I.
Proof. 
By Theorem 2, ∀ ϵ > 0, there exist a large enough n, an m × n matrix A ∈ R^{m×n} and a decoder ψ such that Pr{ X^n ≠ ψ(A X^n) } < ϵ, if m > max_{ 0 ≠ I ≤_l R } n [ H(X) − H(Y_{R/I}) ] / log|I|. Let ϕ_i = A ∘ k_i (1 ≤ i ≤ s) be the encoder of the ith source. Upon receiving ϕ_i(X_i^n) from the ith source, the decoder claims that h(X̂^n), where X̂^n = ψ( Σ_{i=1}^s ϕ_i(X_i^n) ), is the value of the function ĝ subject to computation. The probability of decoding error is
Pr{ h(k(X_1^n, X_2^n, …, X_s^n)) ≠ h(X̂^n) } ≤ Pr{ X^n ≠ X̂^n }
= Pr{ X^n ≠ ψ( Σ_{i=1}^s ϕ_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( Σ_{i=1}^s A k_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( A Σ_{i=1}^s k_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( A k(X_1^n, X_2^n, …, X_s^n) ) }
= Pr{ X^n ≠ ψ(A X^n) } < ϵ.
Therefore, every (r, r, …, r) ∈ ℝ^s with r = m log|R| / n > max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] is achievable, i.e., R_ĝ ⊆ R[ĝ]. ☐
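The only step that uses linearity is interchanging A with the sum of the inner functions k_i. A small numerical sketch over Z_4 (the matrix, block length and the particular k_i's are hypothetical choices; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
q, n, m, s = 4, 12, 6, 3                 # ring Z_4, block length, rows, sources

A = rng.integers(0, q, size=(m, n))      # common encoding matrix A
k = [lambda t: t % q,                    # illustrative inner functions k_i
     lambda t: (2 * t) % q,
     lambda t: (3 * t) % q]

xs = [rng.integers(0, q, size=n) for _ in range(s)]

# Each source transmits phi_i(x_i) = A k_i(x_i); the decoder adds the words.
received = sum(A @ k[i](xs[i]) for i in range(s)) % q

# Since A is linear over Z_4, this equals A applied to X = sum_i k_i(x_i).
X = sum(k[i](xs[i]) for i in range(s)) % q
assert np.array_equal(received, A @ X % q)
```

This identity is what lets the decoder recover X^n (and hence ĝ = h(X^n)) from the sum of the individually encoded streams, exactly as in the Körner–Marton scheme.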
Corollary 2.
In Theorem 11, let X = k(X_1, X_2, …, X_s) ∼ p_X. We have
R_ĝ = { (r, r, …, r) ∈ ℝ^s | r > H(X) } ⊆ R[ĝ],
if either of the following conditions holds:
1. 
R is isomorphic to a finite field;
2. 
R is isomorphic to a ring containing one and only one proper non-trivial left ideal I_0 with |I_0| = √|R|, and
H(X) ≤ 2 H(X + I_0).
Proof. 
If either (1) or (2) holds, then it is guaranteed that
max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] = H(X)
in Theorem 11. The statement follows. ☐
Remark 20.
By Lemma A2, examples of non-field rings satisfying (2) in Corollary 2 include
(1) 
Z_4 with p_X(0) = p_1, p_X(1) = p_2, p_X(3) = p_3 and p_X(2) = p_4 satisfying
0 max { p 2 , p 3 } min { p 1 , p 4 } 1   a n d   0 max { p 1 , p 4 } min { p 2 , p 3 } 1 ,
(2) 
M L , 2 with
p_X( [ 0 0 ; 0 0 ] ) = p_1, p_X( [ 1 0 ; 0 1 ] ) = p_2, p_X( [ 1 0 ; 1 1 ] ) = p_3 and p_X( [ 0 0 ; 1 0 ] ) = p_4
satisfying (89), etc.
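As a numerical illustration of condition (2) of Corollary 2 over Z_4: the pmf below is our own choice (not taken from Lemma A2), picked so that the two cosets of I_0 = {0, 2} are equally likely; the inequality H(X) ≤ 2 H(X + I_0) then holds.

```python
from math import log2

def H(p):
    """Shannon entropy (bits) of a pmf given as a dict."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

# Hypothetical pmf on Z_4 with each coset of I_0 = {0, 2} carrying mass 1/2.
pX = {0: 0.45, 1: 0.25, 2: 0.05, 3: 0.25}

# Y = X + I_0 lives on the two cosets {0, 2} and {1, 3}.
pY = {'0+I0': pX[0] + pX[2], '1+I0': pX[1] + pX[3]}

# Condition (2) of Corollary 2: H(X) <= 2 H(X + I_0).
assert H(pX) <= 2 * H(pY) + 1e-12
```

With balanced cosets, H(X + I_0) = 1 bit, so the condition reduces to H(X) ≤ 2, which always holds on Z_4.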
Interested readers can figure out even more explicit examples deduced from Lemma A1.
Remark 21.
If R is isomorphic to Z_2 and ĝ is the modulo-two sum, then Corollary 2 recovers the theorem of Körner–Marton [4]; if R is (isomorphic to) a field, it becomes a special case of ([7], Theorem III.1). Actually, almost all the results in [6,7] can be reproved in the setting of rings in a parallel fashion.
We claim that there are functions g for which LCoR outperforms LCoF; in fact, there are infinitely many such g's. To prove this, some definitions are required for the mechanics of our argument.
Definition 8.
Let g_1: Π_{i=1}^s X_i → Ω_1 and g_2: Π_{i=1}^s Y_i → Ω_2 be two functions. If there exist bijections μ_i: X_i → Y_i, 1 ≤ i ≤ s, and ν: Ω_1 → Ω_2, such that
g_1(x_1, x_2, …, x_s) = ν^{−1}( g_2( μ_1(x_1), μ_2(x_2), …, μ_s(x_s) ) ),
then g_1 and g_2 are said to be equivalent (via μ_1, μ_2, …, μ_s and ν).
Definition 9.
Given a function g: D → Ω and a subset S ⊆ D, the restriction of g to S is defined to be the function g|_S: S → Ω such that g|_S: x ↦ g(x), ∀ x ∈ S.
Lemma 4.
Let X_1, X_2, …, X_k and Ω be finite sets. For any discrete function g: Π_{i=1}^k X_i → Ω, there always exist a finite ring (field) R and a polynomial function ĝ ∈ R[k] such that
ν( g(x_1, x_2, …, x_k) ) = ĝ( μ_1(x_1), μ_2(x_2), …, μ_k(x_k) )
for some injections μ_i: X_i → R (1 ≤ i ≤ k) and ν: Ω → R.
Proof. 
There are several possible proofs of this lemma. One is provided in Appendix B. ☐
Remark 22.
Up to equivalence, a function can be presented in many different formats. For example, the function min{x, y} defined on {0, 1} × {0, 1} (with ordering 0 ≤ 1) can either be seen as F_1(x, y) = xy on Z_2^2 or be treated as the restriction of F_2(x, y) = x + y − (x + y)^2, defined on Z_3^2, to the domain {0, 1} × {0, 1} ⊆ Z_3^2.
Lemma 4 implies that any discrete function defined on a finite domain is equivalent to a restriction of some polynomial function over some finite ring (field). As a consequence, we can restrict Problem 1 to all polynomial functions. This polynomial approach offers valuable insight into the general problem, because the algebraic structure of a polynomial function is clearer than that of an arbitrary function. We often call g ^ in Lemma 4 a polynomial presentation of g. On the other hand, the g ^ given by (78) is named a nomographic function over R (by terminology borrowed from [22]), it is said to be a nomographic presentation of g if g is equivalent to a restriction of it.
Lemma 5.
Let X_1, X_2, …, X_s and Ω be finite sets. For any discrete function g: Π_{i=1}^s X_i → Ω, there exists a nomographic function ĝ over some finite ring (field) R such that
ν( g(x_1, x_2, …, x_s) ) = ĝ( μ_1(x_1), μ_2(x_2), …, μ_s(x_s) )
for some injections μ_i: X_i → R (1 ≤ i ≤ s) and ν: Ω → R.
Proof. 
There are several proofs of this lemma. One is provided in Appendix B. ☐
Lemma 5 advances Lemma 4 by claiming that a discrete function with a finite domain is always equivalent to a restriction of some nomographic function. From this, it is seen that Theorem 11 and Corollary 2 have presented a universal solution to Problem 1.
Given some finite ring R , let g ^ of format (78) be a nomographic presentation of g. We say that the region R g ^ given by (79) is achievable for computing g in the sense of Körner–Marton. From Theorem 13 given later, we know that R g ^ might not be the largest achievable region one can obtain for computing g. However, R g ^ still captures the ability of linear coding over R when used for computing g. In other words, R g ^ is the region purely achieved with linear coding over R for computing g. On the other hand, regions from Theorem 13 are achieved by combining the linear coding and the standard random coding techniques. Therefore, it is reasonable to compare LCoR with LCoF in the sense of Körner–Marton.
We show that linear coding over finite rings, non-field rings in particular, strictly outperforms its field counterpart, LCoF, in the following example.
Example 2
([23]). Let g: {α_0, α_1}^3 → {β_0, β_1, β_2, β_3} (Figure 1) be the function such that
g: (α_0, α_0, α_0) ↦ β_0; g: (α_0, α_0, α_1) ↦ β_3; g: (α_0, α_1, α_0) ↦ β_2; g: (α_0, α_1, α_1) ↦ β_1; g: (α_1, α_0, α_0) ↦ β_1; g: (α_1, α_0, α_1) ↦ β_0; g: (α_1, α_1, α_0) ↦ β_3; g: (α_1, α_1, α_1) ↦ β_2.    (94)
Define μ: {α_0, α_1} → Z_4 and ν: {β_0, β_1, β_2, β_3} → Z_4 by
μ: α_j ↦ j, j ∈ {0, 1}, and ν: β_j ↦ j, j ∈ {0, 1, 2, 3},    (95)
respectively. Obviously, g is equivalent to x + 2y + 3z ∈ Z_4[3] (Figure 2) via μ_1 = μ_2 = μ_3 = μ and ν. However, by Proposition 4, there exists no ĝ ∈ F_4[3] of format (78) such that g is equivalent to any restriction of ĝ. Although Lemma 5 ensures that there always exists a bigger field F_q such that g admits a presentation ĝ ∈ F_q[3] of format (78), the size q must be strictly bigger than 4. For instance, let
ĥ(x) = Σ_{a ∈ Z_5} a [ 1 − (x − a)^4 ] − [ 1 − (x − 4)^4 ] ∈ Z_5[1].
Then, g has presentation ĥ(x + 2y + 4z) ∈ Z_5[3] (Figure 3) via μ_1 = μ_2 = μ_3 = μ: {α_0, α_1} → Z_5 and ν: {β_0, β_1, β_2, β_3} → Z_5 defined (symbol-wise) by (95).
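Both presentations can be verified mechanically. The sketch below encodes the table (94) with α_j ↦ j and β_j ↦ j and checks the Z_4 presentation x + 2y + 3z and the Z_5 presentation ĥ(x + 2y + 4z) on all eight inputs:

```python
# g from (94), with alpha_j -> j on inputs and beta_j -> j on outputs.
g = {(0, 0, 0): 0, (0, 0, 1): 3, (0, 1, 0): 2, (0, 1, 1): 1,
     (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 3, (1, 1, 1): 2}

def h_hat(x):
    """h_hat(x) = sum_a a [1 - (x - a)^4] - [1 - (x - 4)^4] over Z_5."""
    return (sum(a * (1 - pow(x - a, 4, 5)) for a in range(5))
            - (1 - pow(x - 4, 4, 5))) % 5

for (x, y, z), v in g.items():
    assert (x + 2 * y + 3 * z) % 4 == v            # Z_4 presentation
    assert h_hat((x + 2 * y + 4 * z) % 5) == v     # Z_5 presentation
```

By Fermat's little theorem, (x − a)^4 = 1 in Z_5 unless x = a, so ĥ acts as the identity on {0, 1, 2, 3} and maps 4 to 3, which is exactly what the Z_5 presentation needs.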
Proposition 4.
There exists no polynomial function g ^ F 4 [ 3 ] of format (78), such that a restriction of g ^ is equivalent to the function g defined by (94).
Proof. 
Suppose ν ∘ g = ĝ(μ_1, μ_2, μ_3), where μ_1, μ_2, μ_3: {α_0, α_1} → F_4 and ν: {β_0, …, β_3} → F_4 are injections, and ĝ = h ∘ (k_1 + k_2 + k_3) with h, k_i ∈ F_4[1] for all feasible i. We claim that ĝ and h are both surjective, since |g({α_0, α_1}^3)| = |{β_0, β_1, β_2, β_3}| = 4 = |F_4|. In particular, h is bijective. Therefore, h^{−1} ∘ ν ∘ g = k_1 ∘ μ_1 + k_2 ∘ μ_2 + k_3 ∘ μ_3, i.e., g admits a presentation k_1(x) + k_2(y) + k_3(z) ∈ F_4[3]. A contradiction to Lemma A3. ☐
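The non-existence claim can also be confirmed by exhaustive search. Addition in F_4 is componentwise over F_2, i.e., XOR on a 2-bit encoding of the elements, so the search below runs over all values of the k_i's on the two input symbols and all injective ν (a brute-force check of the consequence of Lemma A3, not the paper's proof):

```python
from itertools import product, permutations

# g from (94), with alpha_j -> j and beta_j -> j.
g = {(0, 0, 0): 0, (0, 0, 1): 3, (0, 1, 0): 2, (0, 1, 1): 1,
     (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 3, (1, 1, 1): 2}

# Search for injective nu and k_i with nu(g(x,y,z)) = k_1(x) + k_2(y) + k_3(z)
# in F_4, whose addition is XOR on the 2-bit encoding {0, 1, 2, 3}.
found = False
for k_vals in product(range(4), repeat=6):   # (k_i(0), k_i(1)), i = 1, 2, 3
    k = [k_vals[0:2], k_vals[2:4], k_vals[4:6]]
    for nu in permutations(range(4)):        # injective nu: 4 symbols -> F_4
        if all(nu[v] == k[0][x] ^ k[1][y] ^ k[2][z]
               for (x, y, z), v in g.items()):
            found = True
assert not found   # no additive F_4 presentation of g exists
```

A quick sanity argument for the outcome: any XOR-decomposable function satisfies f(0,0,0) ⊕ f(0,1,1) ⊕ f(1,0,1) ⊕ f(1,1,0) = 0, while the corresponding g-values are β_0, β_1, β_0, β_3, whose ν-images XOR to ν(β_1) ⊕ ν(β_3) ≠ 0 for any injection ν.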
As a consequence of Proposition 4, in the sense of Körner–Marton, in order to use LCoF to encode the function g, the alphabet sizes of the three encoders must be at least 5. However, LCoR offers a solution in which the alphabet sizes are 4, strictly smaller than with LCoF. Most importantly, the region achieved with linear coding over any finite field F_q is always a subset of the one achieved with linear coding over Z_4. This is proved in the following proposition.
Proposition 5.
Let g be the function defined by (94), let {α_0, α_1}^3 be the sample space of (X_1, X_2, X_3) ∼ p, and let p_X be the distribution of X = g(X_1, X_2, X_3). If
p_X(β_0) = p_1, p_X(β_1) = p_2, p_X(β_3) = p_3 and p_X(β_2) = p_4
satisfy (89), then, in the sense of Körner–Marton, the region R_1 achieved with linear coding over Z_4 contains the region R_2 obtained with linear coding over any finite field F_q for computing g. Moreover, if supp(p) is the whole domain of g, then R_1 ⊋ R_2.
Proof. 
Let ĝ = h ∘ k ∈ F_q[3] be a polynomial presentation of g with format (78). By Corollary 2 and Remark 20, we have
R_1 = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 3X_3) },
R_2 = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(k(X_1, X_2, X_3)) }.
Assume that ν ∘ g = h ∘ k ∘ (μ_1, μ_2, μ_3), where μ_1, μ_2, μ_3: {α_0, α_1} → F_q and ν: {β_0, …, β_3} → F_q are injections. Obviously, g(X_1, X_2, X_3) is a function of k(X_1, X_2, X_3). Hence,
H(k(X_1, X_2, X_3)) ≥ H(g(X_1, X_2, X_3)).    (100)
On the other hand, H(X_1 + 2X_2 + 3X_3) = H(g(X_1, X_2, X_3)). Therefore,
H(k(X_1, X_2, X_3)) ≥ H(X_1 + 2X_2 + 3X_3),    (101)
and R_1 ⊇ R_2. In addition, we claim that h|_S, where S = k( Π_{j=1}^3 μ_j({α_0, α_1}) ), is not injective. Otherwise, h: S → S′, where S′ = h(S), is bijective; hence, h|_S^{−1} ∘ ν ∘ g = k ∘ (μ_1, μ_2, μ_3) = k_1 ∘ μ_1 + k_2 ∘ μ_2 + k_3 ∘ μ_3. A contradiction to Lemma A3. Consequently, |S| > |S′| = |ν({β_0, …, β_3})| = 4. If supp(p) = {α_0, α_1}^3, then (100) as well as (101) hold strictly; thus, R_1 ⊋ R_2. ☐
A more intuitive comparison (though not as conclusive as Proposition 5) can be drawn from the presentations of g given in Figure 2 and Figure 3. According to Corollary 2, linear encoders over the field Z_5 achieve
R_{Z_5} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 4X_3) }.
The region achieved by linear encoders over the ring Z_4 is
R_{Z_4} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 3X_3) }.
Clearly, H(X_1 + 2X_2 + 3X_3) ≤ H(X_1 + 2X_2 + 4X_3); thus, R_{Z_4} contains R_{Z_5}. Furthermore, as long as
0 < Pr{ (α_0, α_0, α_1) }, Pr{ (α_1, α_1, α_0) } < 1,
R_{Z_4} is strictly larger than R_{Z_5}, since then H(X_1 + 2X_2 + 3X_3) < H(X_1 + 2X_2 + 4X_3). To be specific, assume that (X_1, X_2, X_3) ∼ p satisfies Table 1; we have
R[X_1, X_2, X_3] ⊊ R_{Z_5} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > 0.4812 }
⊊ R_{Z_4} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > 0.4590 }.
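Table 1 is not reproduced here, but the entropy gap is already visible for three i.i.d. uniform binary sources (a hypothetical distribution used purely for illustration): X_1 + 2X_2 + 3X_3 is uniform on Z_4, while the Z_5 sum separates the two inputs (α_0, α_0, α_1) and (α_1, α_1, α_0) that g itself does not distinguish.

```python
from math import log2
from itertools import product

def entropy(counts):
    """Entropy (bits) of an empirical distribution given as value -> count."""
    tot = sum(counts.values())
    return -sum(c / tot * log2(c / tot) for c in counts.values())

# Three i.i.d. uniform bits: every triple has probability 1/8.
c4, c5 = {}, {}
for x1, x2, x3 in product(range(2), repeat=3):
    s4 = (x1 + 2 * x2 + 3 * x3) % 4     # Z_4 presentation
    s5 = (x1 + 2 * x2 + 4 * x3) % 5     # Z_5 presentation
    c4[s4] = c4.get(s4, 0) + 1
    c5[s5] = c5.get(s5, 0) + 1

H4, H5 = entropy(c4), entropy(c5)
assert abs(H4 - 2.0) < 1e-12   # uniform on Z_4
assert H4 < H5                 # the Z_5 sum takes 5 values, two redundantly
```

Under this uniform law H(X_1 + 2X_2 + 3X_3) = 2 bits while H(X_1 + 2X_2 + 4X_3) = 2.25 bits, so R_{Z_4} is strictly larger.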
Based on Propositions 4 and 5, we conclude that LCoR dominates LCoF, in terms of achieving better coding rates with smaller alphabet sizes of the encoders for computing g. As a direct conclusion, we have:
Theorem 12.
In the sense of Körner–Marton, LCoF is not optimal.
Remark 23.
The key property underlying the proof of Proposition 5 is that the characteristic of a finite field must be a prime, while the characteristic of a finite ring can be any integer greater than or equal to 2. This implies that it is possible to construct infinitely many discrete functions for which LCoF always leads to a suboptimal achievable region compared to linear coding over finite non-field rings. Examples include Σ_{i=1}^s x_i ∈ Z_{2p}[s] for s ≥ 2 and prime p > 2 (note: the characteristic of Z_{2p} is 2p, which is not a prime). One can always find an explicit distribution of the sources for which linear coding over Z_{2p} strictly dominates linear coding over each and every finite field.
As mentioned, R g ^ given by (79) is sometimes strictly smaller than R [ g ] . This was first shown by Ahlswede–Han [5] for the case of g being the modulo-two sum. Their approach combines the linear coding technique over a binary field with the standard random coding technique. In the following, we generalize the result of Ahlswede–Han ([5], Theorem 10) to the settings, where g is arbitrary, and, at the same time, LCoF is replaced by its generalized version, LCoR.
Consider functions ĝ admitting
ĝ(x_1, x_2, …, x_s) = h( k_0(x_1, x_2, …, x_{s_0}), Σ_{j=s_0+1}^s k_j(x_j) ), 0 ≤ s_0 < s,    (107)
where k_0: R^{s_0} → R, h: R × R → R and the k_j's are functions mapping R to R. By Lemma 5, a discrete function with a finite domain is always equivalent to a restriction of some function of format (107). We call ĝ from (107) a pseudo nomographic function over the ring R.
Theorem 13.
Let S_0 = {1, 2, …, s_0} ⊆ S = {1, 2, …, s}. If ĝ is of format (107), and R = (R_1, R_2, …, R_s) ∈ ℝ^s satisfies
Σ_{j∈T} R_j > |T \ S_0| max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X | V_S) − H(Y_{R/I} | V_S) ] + I(Y_T; V_T | V_{T^c}), ∀ T ⊆ S,    (108)
where, ∀ j ∈ S_0, V_j = Y_j = X_j; ∀ j ∈ S \ S_0, Y_j = k_j(X_j) and the V_j's are discrete random variables such that
p(y_1, y_2, …, y_s, v_1, v_2, …, v_s) = p(y_1, y_2, …, y_s) Π_{j=s_0+1}^s p(v_j | y_j),    (109)
and X = Σ_{j=s_0+1}^s Y_j, Y_{R/I} = X + I, then R ∈ R[ĝ].
Proof. 
The proof can be completed by applying the tricks from Lemmas 2 and 3 to the approach generalized from Ahlswede–Han ([5], Theorem 10). Details are found in Appendix C. ☐
Remark 24.
The achievable region given by (108) always contains the SW region. Moreover, it is in general larger than the R g ^ from (79). If g ^ is the modulo-two sum, namely s 0 = 0 and h , k j ’s are identity functions for all s 0 < j s , then (108) resumes the region of Ahlswede–Han ([5], Theorem 10).

7. Conclusions

7.1. Right Linearity

Careful readers might have noticed that the encoders we used so far are actually left linear mappings. By symmetry, almost all related statements can be easily reproved for right linear mappings (encoders). As an example, the following corresponds to Theorem 2.
Theorem 14.
For any Φ ∈ M(X_S, R_S),
R_Φ = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i log|I_i| / log|R_i| > r(T, I_T), ∀ T ⊆ S, ∀ 0 ≠ I_i ≤_r R_i },    (110)
where r(T, I_T) = H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}) and Y_{R_T/I_T} = Φ(X_T) + I_T, is achievable with (right) linear coding over the finite rings R_1, R_2, …, R_s.
By time sharing,
R_r = cov( ∪_{Φ ∈ M(X_S, R_S)} R_Φ ),
where R_Φ is given by (110), is achievable with (right) LCoR.

7.2. Field, Ring, Rng and Group

Conceptually speaking, LCoR is in fact a generalization of the linear coding technique proposed by Elias [2] and Csiszár [3] (LCoF), since a field is always a ring. However, as seen in Section 4, analyzing the decoding error for the ring version is in general substantially more challenging than for the field version. Our approach crucially relies on the concept of ideals. A field contains no non-trivial ideal but itself; because of this special property of fields, our general argument for finite rings reduces to a simple one when only finite fields are considered.
Even though our analysis for the ring scenario is more complicated than that for finite fields, linear encoders working over some finite rings are in general considerably easier to implement in practice, because the implementation of finite field arithmetic can be quite demanding. Normally, a finite field is given by its polynomial representation, and operations are carried out as polynomial operations (addition and multiplication) followed by polynomial long division. In contrast, implementing the arithmetic of many finite rings is straightforward. For instance, the arithmetic of the ring of integers modulo q, Z_q, for any positive integer q, is simply integer arithmetic modulo q.
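To make the contrast concrete, here is a sketch (our own, using the standard representation F_4 = GF(2)[x]/(x² + x + 1) on a 2-bit encoding 1 ↦ 1, 2 ↦ x, 3 ↦ x + 1) of multiplication in Z_4 versus F_4:

```python
def f4_mul(a, b):
    """Multiply 2-bit encodings of F_4 elements (1 = 1, 2 = x, 3 = x + 1)."""
    r = 0
    for i in range(2):          # carry-less (GF(2)) polynomial product
        if (b >> i) & 1:
            r ^= a << i
    for i in (3, 2):            # reduce modulo x^2 + x + 1 (bit pattern 0b111)
        if (r >> i) & 1:
            r ^= 0b111 << (i - 2)
    return r

z4_mul = lambda a, b: (a * b) % 4   # ring Z_4: plain modular arithmetic

assert f4_mul(2, 2) == 3            # x * x = x + 1: no zero divisors in F_4
assert z4_mul(2, 2) == 0            # 2 * 2 = 0: Z_4 has zero divisors
```

The field multiplier needs the polynomial-reduction machinery even in this tiny case, whereas the ring multiplier is a single modulo operation; the last two lines also exhibit the structural difference (zero divisors) discussed above.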
In addition, it is also very interesting to consider instead linear coding over rngs. It will be even more intriguing should it turn out that the rng version outperforms the ring version in the computing problem (Problem 1), in the same manner that the ring version outperforms its field counterpart. It will also be interesting to see whether the idea of using rng provides more understanding of the problems from [6,8].
Some works, including [24,25,26], have proposed to implement coding over a simpler algebraic structure, that of a group. Seemingly, this corresponds to a more universal approach since both fields and rings are also groups. However, one subtle issue is often overlooked in this context. Namely, the set of rings (or rngs) is not a subset of the set of groups, since several non-isomorphic rings (or rngs) can be defined on one and the same group. For instance, given two distinct primes p and q, up to isomorphism,
  • there are 2 finite rngs of order $p$, while there is only one group of order $p$;
  • there are 4 finite rngs of order $pq$;
  • there are 11 finite rngs of order $p^2$ (if $p = 2$, then 4 of them are rings, namely $\mathbb{F}_4$, $\mathbb{Z}_4$, $\mathbb{Z}_2 \times \mathbb{Z}_2$ and $\mathrm{M}_{L,2}$ [27]), while there are only 2 groups of order $p^2$, both of which are Abelian;
  • there are 22 finite rngs of order $p^2 q$;
  • there are 52 finite rngs of order 8;
  • there are $3p + 50$ finite rngs of order $p^3$ ($p > 2$), while there are 5 groups of order $p^3$, 3 of which are Abelian.
Therefore, there is no one-to-one correspondence between rings (fields or rngs) and groups, in either direction. Furthermore, from the point of view of formulating a multivariate function, one is highly restricted when using groups, compared to rings (rngs or fields). Specifically, it is well-known that every discrete function defined on a finite domain is essentially a restriction of some polynomial function over a finite ring (rng or field). Although non-Abelian structures (non-Abelian groups) have the potential to lead to important non-trivial results [28], they are very difficult to handle both theoretically and in practice. The performance of non-Abelian group block codes can be quite bad [29].

7.3. Final Remarks

This paper establishes achievability theorems regarding linear coding over finite rings for Slepian–Wolf data compression. Our results include related work from Elias [2] and Csiszár [3] regarding linear coding over finite fields as special cases in the sense of characterizing the achievable region. We have also proved that, for any Slepian–Wolf scenario, there always exists a sequence of linear encoders over some finite rings (non-field rings in particular) that achieves the data compression limit, the Slepian–Wolf region. Thus, with regard to existence, the optimality issue of linear coding over finite non-field rings for data compression is confirmed positively.
In addition, we also address the problem of source coding for computing, Problem 1. Results of Körner–Marton [4], Ahlswede–Han ([5], Theorem 10) and [7] are generalized to corresponding ring versions. Based on these, it is demonstrated that LCoR dominates its field counterpart for encoding (infinitely) many discrete functions.

Appendix A. Supporting Lemmata

Lemma A1.
Let $R$ be a finite ring, $X$ and $Y$ be two correlated discrete random variables, and $\mathcal{X}$ be the sample space of $X$ with $|\mathcal{X}| \le |R|$. If $R$ contains one and only one proper non-trivial left ideal $I$ and $|I| = \sqrt{|R|}$, then there exists an injection $\tilde{\Phi}: \mathcal{X} \to R$ such that
$$H(X|Y) \le 2 H\left( \tilde{\Phi}(X) + I \,\middle|\, Y \right). \tag{A1}$$
Proof. 
Let
$$\tilde{\Phi} \in \arg\max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y),$$
where $\mathcal{M}$ is the set of all possible $\Phi$'s (the maximum can always be attained because $|\mathcal{M}| = \frac{|R|!}{(|R| - |\mathcal{X}|)!}$ is finite, but it is not uniquely attained by $\tilde{\Phi}$ in general). Assume that $\mathcal{Y}$ is the sample space (not necessarily finite) of $Y$. Let $q = |I|$, $I = \{r_1, r_2, \ldots, r_q\}$ and $R/I = \{a_1 + I, a_2 + I, \ldots, a_q + I\}$. We have that
$$H(X|Y) = -\sum_{y \in \mathcal{Y}} \sum_{i,j=1}^{q} p_{i,j,y} \log \frac{p_{i,j,y}}{p_y} \quad \text{and} \quad H\left( \tilde{\Phi}(X) + I \,\middle|\, Y \right) = -\sum_{y \in \mathcal{Y}} \sum_{i=1}^{q} p_{i,y} \log \frac{p_{i,y}}{p_y},$$
where
$$p_{i,j,y} = \Pr\left\{ \tilde{\Phi}(X) = a_i + r_j, Y = y \right\}, \quad p_y = \sum_{i,j=1}^{q} p_{i,j,y}, \quad p_{i,y} = \sum_{j=1}^{q} p_{i,j,y}.$$
(Note: $\Pr\{\tilde{\Phi}(X) = r\} = 0$ if $r \in R \setminus \tilde{\Phi}(\mathcal{X})$. In addition, every element in $R$ can be uniquely expressed as $a_i + r_j$.) Therefore, (A1) is equivalent to
$$-\sum_{y \in \mathcal{Y}} \sum_{i,j=1}^{q} p_{i,j,y} \log \frac{p_{i,j,y}}{p_y} \le -2 \sum_{y \in \mathcal{Y}} \sum_{i=1}^{q} p_{i,y} \log \frac{p_{i,y}}{p_y}$$
$$\iff \sum_{y \in \mathcal{Y}} p_y \sum_{i=1}^{q} \frac{p_{i,y}}{p_y} H\!\left( \frac{p_{i,1,y}}{p_{i,y}}, \frac{p_{i,2,y}}{p_{i,y}}, \ldots, \frac{p_{i,q,y}}{p_{i,y}} \right) \le \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \frac{p_{2,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right), \tag{A8}$$
where $H(v_1, v_2, \ldots, v_q) = -\sum_{j=1}^{q} v_j \log v_j$, by the grouping rule for entropy ([19], p. 49). Let
$$A = \sum_{y \in \mathcal{Y}} p_y H\!\left( \sum_{i=1}^{q} \frac{p_{i,1,y}}{p_y}, \sum_{i=1}^{q} \frac{p_{i,2,y}}{p_y}, \ldots, \sum_{i=1}^{q} \frac{p_{i,q,y}}{p_y} \right).$$
The concavity of the function $H$ implies that
$$\sum_{y \in \mathcal{Y}} p_y \sum_{i=1}^{q} \frac{p_{i,y}}{p_y} H\!\left( \frac{p_{i,1,y}}{p_{i,y}}, \frac{p_{i,2,y}}{p_{i,y}}, \ldots, \frac{p_{i,q,y}}{p_{i,y}} \right) \le A. \tag{A10}$$
At the same time,
$$\sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \frac{p_{2,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right) = \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y)$$
by the definition of $\tilde{\Phi}$. We now claim that
$$A \le \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y). \tag{A12}$$
Suppose otherwise, i.e., $A > \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right)$. Let $\Phi': \mathcal{X} \to R$ be defined as
$$\Phi': x \mapsto a_j + r_i \quad \text{if } \tilde{\Phi}(x) = a_i + r_j.$$
(Note: $\tilde{\Phi}(x)$ is an element of $R$. It can be uniquely presented as $a_i + r_j$ for some $i$ and $j$.) We have that
$$H(\Phi'(X) + I \mid Y) = \sum_{y \in \mathcal{Y}} p_y H\!\left( \sum_{i=1}^{q} \frac{p_{i,1,y}}{p_y}, \ldots, \sum_{i=1}^{q} \frac{p_{i,q,y}}{p_y} \right) = A > \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right) = \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y).$$
This is absurd, since $H(\Phi'(X) + I \mid Y) \le \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y)$ by definition. Therefore, (A8) is valid by (A10) and (A12), and so is (A1). ☐
Lemma A2.
If both
$$0 \le \min\{p_1, p_4\} \le \max\{p_2, p_3\} \le 1 \quad \text{and} \quad 0 \le \min\{p_2, p_3\} \le \max\{p_1, p_4\} \le 1$$
are valid, and $\sum_{j=1}^{4} p_j = 1$, then
$$-\sum_{j=1}^{4} p_j \log p_j \le -2\left[ (p_2 + p_3)\log(p_2 + p_3) + (p_1 + p_4)\log(p_1 + p_4) \right]. \tag{A17}$$
Proof [30].
Without loss of generality, we assume that $0 \le \max\{p_4, p_3\} \le \min\{p_2, p_1\} \le 1$, which implies that $p_1 + p_2 \ge 1/2$ and $|p_1 + p_4 - 1/2| \le p_1 + p_2 - 1/2$. Let $H_2(c) = -c \log c - (1-c)\log(1-c)$, $0 \le c \le 1$, be the binary entropy function. By the grouping rule for entropy ([19], p. 49), the left-hand side of (A17) equals
$$(p_1 + p_4)\left[ \frac{p_1}{p_1 + p_4} \log \frac{p_1 + p_4}{p_1} + \frac{p_4}{p_1 + p_4} \log \frac{p_1 + p_4}{p_4} \right] + (p_2 + p_3)\left[ \frac{p_2}{p_2 + p_3} \log \frac{p_2 + p_3}{p_2} + \frac{p_3}{p_2 + p_3} \log \frac{p_2 + p_3}{p_3} \right] - (p_2 + p_3)\log(p_2 + p_3) - (p_1 + p_4)\log(p_1 + p_4)$$
$$= \underbrace{(p_1 + p_4)\, H_2\!\left( \frac{p_1}{p_1 + p_4} \right) + (p_2 + p_3)\, H_2\!\left( \frac{p_2}{p_2 + p_3} \right)}_{A} + H_2(p_1 + p_4).$$
Since $H_2$ is a concave function and $\sum_{j=1}^{4} p_j = 1$, then
$$A \le H_2(p_1 + p_2).$$
Moreover, $|p_1 + p_4 - 1/2| \le p_1 + p_2 - 1/2$ guarantees that
$$H_2(p_1 + p_2) \le H_2(p_1 + p_4),$$
because $H_2(c) = H_2(1 - c)$, $0 \le c \le 1$, and $H_2(c') \le H_2(c)$ if $0 \le c' \le c \le 1/2$. Therefore, $A \le H_2(p_1 + p_4)$ and (A17) holds. ☐
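Lemma A2 can also be checked numerically. The sketch below is an illustration, not part of the proof: it samples random distributions $(p_1, p_2, p_3, p_4)$, discards those violating the hypotheses of the lemma, and verifies that $H(p_1, p_2, p_3, p_4) \le 2 H_2(p_1 + p_4)$.

```python
import math
import random

def H(probs):
    # Shannon entropy in bits; terms with p = 0 contribute 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

random.seed(1)
checked = 0
while checked < 1000:
    raw = [random.random() for _ in range(4)]
    s = sum(raw)
    p1, p2, p3, p4 = (r / s for r in raw)
    # Hypotheses of Lemma A2 (the upper bounds by 1 hold automatically).
    if not (min(p1, p4) <= max(p2, p3) and min(p2, p3) <= max(p1, p4)):
        continue
    lhs = H([p1, p2, p3, p4])
    rhs = 2 * H([p1 + p4, p2 + p3])   # = 2 * H_2(p1 + p4)
    assert lhs <= rhs + 1e-9
    checked += 1
```

Distributions such as $(0.45, 0.05, 0.05, 0.45)$, for which the inequality fails, are exactly those rejected by the hypothesis check.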
Lemma A3.
No matter which finite field $\mathbb{F}_q$ is chosen, $g$ given by (94) admits no presentation $k_1(x) + k_2(y) + k_3(z)$, where $k_i \in \mathbb{F}_q[1]$ for all feasible $i$.
Proof. 
Suppose otherwise, i.e., $k_1 \circ \mu_1 + k_2 \circ \mu_2 + k_3 \circ \mu_3 = \nu \circ g$ for some injections $\mu_1, \mu_2, \mu_3: \{\alpha_0, \alpha_1\} \to \mathbb{F}_q$ and $\nu: \{\beta_0, \ldots, \beta_3\} \to \mathbb{F}_q$. By (94), we have
$$\nu(\beta_1) = (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_1),$$
$$\nu(\beta_3) = (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_1),$$
$$\nu(\beta_1) - \nu(\beta_3) = \tau = -\tau \implies \tau + \tau = 0, \tag{A26}$$
where $\tau = k_2(\mu_2(\alpha_0)) - k_2(\mu_2(\alpha_1))$. Since $\mu_2$ is injective, (A26) implies that either $\tau = 0$ or $\mathrm{Char}(\mathbb{F}_q) = 2$ by Proposition 2. Note that $k_2(\mu_2(\alpha_0)) \ne k_2(\mu_2(\alpha_1))$, i.e., $\tau \ne 0$; otherwise $\nu(\beta_1) = \nu(\beta_3)$, which contradicts the assumption that $\nu$ is injective. Thus, $\mathrm{Char}(\mathbb{F}_q) = 2$. Let $\rho = (k_3 \circ \mu_3)(\alpha_0) - (k_3 \circ \mu_3)(\alpha_1)$. Obviously, $\rho \ne 0$ for the same reason that $\tau \ne 0$, and $\rho + \rho = 0$ since $\mathrm{Char}(\mathbb{F}_q) = 2$. Therefore,
$$\nu(\beta_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_0)$$
$$= (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_1) + \rho$$
$$= \nu(\beta_3) + \rho$$
$$= (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_0) + \rho$$
$$= (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_1) + \rho + \rho$$
$$= \nu(\beta_2) + 0 = \nu(\beta_2).$$
This contradicts the assumption that ν is injective. ☐
Remark A1.
As a special case, this lemma implies that no matter which finite field $\mathbb{F}_q$ is chosen, $g$ defined by (94) has no polynomial presentation that is linear over $\mathbb{F}_q$. In contrast, $g$ admits the presentation $x + 2y + 3z \in \mathbb{Z}_4[3]$, which is a linear function over $\mathbb{Z}_4$.
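This can be verified directly. Assuming the identification $\alpha_0 \mapsto 0$, $\alpha_1 \mapsto 1$, the constraints on $g$ used in the proof of Lemma A3 are indeed met by $x + 2y + 3z$ over $\mathbb{Z}_4$:

```python
def f(x, y, z):
    # The linear polynomial x + 2y + 3z over Z_4 (alpha_0 -> 0, alpha_1 -> 1).
    return (x + 2 * y + 3 * z) % 4

# Constraints read off from the proof of Lemma A3:
assert f(1, 0, 0) == f(0, 1, 1)   # both inputs map to beta_1
assert f(1, 1, 0) == f(0, 0, 1)   # both inputs map to beta_3
# beta_0, beta_1, beta_2, beta_3 remain distinguishable:
values = {f(0, 0, 0), f(1, 0, 0), f(1, 1, 1), f(1, 1, 0)}
assert len(values) == 4
```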

Appendix B. Proofs of Lemmas 4 and 5

Appendix B.1. Proof of Lemma 4

Let $p$ be a prime such that $p^m \ge \max\{|\Omega|, |\mathcal{X}_i| : 1 \le i \le k\}$ for some integer $m$, and choose $R$ to be a finite field of order $p^m$. By ([31], Lemma 7.40), the number of polynomial functions in $R[k]$ is $p^{m p^{mk}}$. Moreover, the number of distinct functions with domain $R^k$ and codomain $R$ is also $|R|^{|R|^k} = p^{m p^{mk}}$. Hence, any function $g: R^k \to R$ is a polynomial function.
In the meanwhile, any injections $\mu_i: \mathcal{X}_i \to R$ ($1 \le i \le k$) and $\nu: \Omega \to R$ give rise to a function
$$\hat{g} = \nu \circ g \circ \left( \mu_1^{-1}, \mu_2^{-1}, \ldots, \mu_k^{-1} \right): R^k \to R,$$
where $\mu_i^{-1}$ is the inverse mapping of $\mu_i: \mathcal{X}_i \to \mu_i(\mathcal{X}_i)$. Since $\hat{g}$ must be a polynomial function as shown above, the statement is established.
Remark A2.
Another proof involving Fermat’s little theorem can be found in [6].
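The counting argument can also be made constructive over a prime field: any function table $\mathbb{Z}_p \to \mathbb{Z}_p$ is realized explicitly by Lagrange interpolation. The univariate sketch below is an illustration only (the lemma's multivariate statement rests on the same counting), and the chosen function table is arbitrary.

```python
def poly_mul_linear(poly, b, p):
    # Multiply the polynomial `poly` (coefficient list, index = degree)
    # by (x - b) over Z_p.
    res = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        res[i + 1] = (res[i + 1] + c) % p
        res[i] = (res[i] - b * c) % p
    return res

def interpolate(table, p):
    # Coefficients of a polynomial over Z_p realizing the given function
    # table (Lagrange interpolation; p must be prime).
    coeffs = [0] * p
    for a in range(p):
        basis, denom = [1], 1
        for b in range(p):
            if b != a:
                basis = poly_mul_linear(basis, b, p)
                denom = (denom * (a - b)) % p
        scale = (table[a] * pow(denom, p - 2, p)) % p  # division mod p
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % p
    return coeffs

def evaluate(coeffs, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

p = 5
table = [3, 1, 4, 1, 0]          # an arbitrary function Z_5 -> Z_5
coeffs = interpolate(table, p)
assert all(evaluate(coeffs, x, p) == table[x] for x in range(p))
```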

Appendix B.2. Proof of Lemma 5

Let $\mathbb{F}$ be a finite field such that $|\mathbb{F}| \ge |\mathcal{X}_i|$ for all $1 \le i \le s$ and $|\mathbb{F}|^s \ge |\Omega|$, and let $R$ be the extension field of $\mathbb{F}$ of order $|\mathbb{F}|^s$ (one example of the pair $\mathbb{F}$ and $R$ is $\mathbb{Z}_p$, where $p$ is some prime, and its Galois extension of degree $s$). It is easily seen that $R$ is an $s$-dimensional vector space over $\mathbb{F}$. Hence, there exist $s$ vectors $v_1, v_2, \ldots, v_s \in R$ that are linearly independent. Let $\mu_i$ be an injection from $\mathcal{X}_i$ to the subspace generated by the vector $v_i$. It is easy to verify that $k = \sum_{i=1}^{s} \mu_i$, i.e., $k(x_1, \ldots, x_s) = \sum_{i=1}^{s} \mu_i(x_i)$, is injective since $v_1, v_2, \ldots, v_s$ are linearly independent. Let $k^{-1}$ be the inverse mapping of $k: \prod_{i=1}^{s} \mathcal{X}_i \to k\left( \prod_{i=1}^{s} \mathcal{X}_i \right)$ and let $\nu: \Omega \to R$ be any injection. By ([31], Lemma 7.40), there exists a polynomial function $h$ over $R$ such that $h = \nu \circ g \circ k^{-1}$. Let $\hat{g}(x_1, x_2, \ldots, x_s) = h\left( \sum_{i=1}^{s} x_i \right)$. The statement is proved.
Remark A3.
In the proof, $k$ is chosen to be injective because the statement covers the case where $g$ is an identity function. In general, $k$ need not be injective.

Appendix C. Proof of Theorem 13

Choose $\delta > 6\epsilon > 0$ such that $R_j = R_j' + R_j''$ for $j \in S$,
$$\sum_{j \in T} R_j' > I(Y_T; V_T \mid V_{T^c}) + 2|T|\delta, \quad \forall\, T \subseteq S,$$
and $R_j'' > r + 2\delta$ for all $j \in S \setminus S_0$, where
$$r = \max_{0 \ne I \le_l R} \frac{\log |R|}{\log |I|} \left[ H(X \mid V_S) - H(Y_{R/I} \mid V_S) \right].$$

Appendix C.3. Encoding:

Fix the joint distribution $p$ satisfying (109). For all $j \in S_0$, let $\mathcal{V}_{j,\epsilon} = T_\epsilon(n, X_j)$. For all $j \in S \setminus S_0$, randomly generate $2^{n[I(Y_j; V_j) + \delta]}$ strongly $\epsilon$-typical sequences according to the distribution $p_{V_j^n}$, and let $\mathcal{V}_{j,\epsilon}$ be the set of these generated sequences. Define the mapping $\phi_j': R^n \to \mathcal{V}_{j,\epsilon}$ as follows:
  • If $j \in S_0$, then, for all $x \in R^n$, $\phi_j'(x) = x$ if $x \in T_\epsilon$, and $\phi_j'(x) = x_0$ otherwise, where $x_0 \in \mathcal{V}_{j,\epsilon}$ is fixed.
  • If $j \in S \setminus S_0$, then, for every $x \in R^n$, let $L_x = \{ v \in \mathcal{V}_{j,\epsilon} \mid (k_j(x), v) \in T_\epsilon \}$. If $x \in T_\epsilon$ and $L_x \ne \emptyset$, then $\phi_j'(x)$ is set to be some element of $L_x$; otherwise, $\phi_j'(x)$ is some fixed $v_0 \in \mathcal{V}_{j,\epsilon}$.
Define the mapping $\eta_j: \mathcal{V}_{j,\epsilon} \to [1, 2^{n R_j'}]$ by choosing the value of $\eta_j(v)$ for each $v \in \mathcal{V}_{j,\epsilon}$ randomly according to a uniform distribution.
Let $k = \min_{j \in S \setminus S_0} \lfloor n R_j'' / \log |R| \rfloor$. When $n$ is big enough, we have $k > n[r + \delta] / \log |R|$. Randomly generate a $k \times n$ matrix $M \in R^{k \times n}$, and let $\theta_j: R^n \to R^k$ ($j \in S \setminus S_0$) be the function $\theta_j: x \mapsto M k_j(x)$, $x \in R^n$.
Define the encoder $\phi_j$ as follows:
$$\phi_j = \begin{cases} \eta_j \circ \phi_j', & j \in S_0; \\ (\eta_j \circ \phi_j', \theta_j), & \text{otherwise}. \end{cases}$$

Appendix C.4. Decoding:

Upon observing $a_1, a_2, \ldots, a_{s_0}, (a_{s_0+1}, b_{s_0+1}), \ldots, (a_s, b_s)$, the decoder claims that
$$h \circ k_0\left( \hat{V}_1^n, \hat{V}_2^n, \ldots, \hat{V}_{s_0}^n, \hat{X}^n \right)$$
is the function of the generated data if and only if there exists one and only one
$$\hat{V} = \left( \hat{V}_1^n, \hat{V}_2^n, \ldots, \hat{V}_s^n \right) \in \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon}$$
such that $a_j = \eta_j(\hat{V}_j^n)$ for all $j \in S$, and $\hat{X}^n$ is the only element of the set
$$L_{\hat{V}} = \left\{ x \in R^n \,\middle|\, (x, \hat{V}) \in T_\epsilon,\ M x = \sum_{j=s_0+1}^{s} b_j \right\}.$$

Appendix C.5. Error:

Assume that $X_j^n$ is the data generated by the $j$th source and let $X^n = \sum_{j=s_0+1}^{s} k_j(X_j^n)$. An error happens if and only if one of the following events happens.
E1:
$(X_1^n, X_2^n, \ldots, X_s^n, Y_1^n, Y_2^n, \ldots, Y_s^n, X^n) \notin T_\epsilon$;
E2:
There exists some $j_0 \in S \setminus S_0$ such that $L_{X_{j_0}^n} = \emptyset$;
E3:
$(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n, V) \notin T_\epsilon$, where $V = (V_1^n, V_2^n, \ldots, V_s^n)$ and $V_j^n = \phi_j'(X_j^n)$ for all $j \in S$;
E4:
There exists $V' = (v_1, v_2, \ldots, v_s) \in T_\epsilon \cap \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon}$, $V' \ne V$, such that $\eta_j(v_j) = \eta_j(V_j^n)$ for all $j \in S$;
E5:
$X^n \notin L_V$ or $|L_V| > 1$, i.e., there exists $X_0^n \in R^n$, $X_0^n \ne X^n$, such that $M X_0^n = M X^n$ and $(X_0^n, V) \in T_\epsilon$.
Let $\gamma = \Pr\left\{ \bigcup_{l=1}^{5} E_l \right\}$; then $\gamma \le \sum_{l=1}^{5} \Pr\{E_l \mid E_{l,c}\}$, where $E_{1,c}$ is the sure event and $E_{l,c} = \bigcap_{\tau=1}^{l-1} E_\tau^c$ for $1 < l \le 5$. In the following, we show that $\gamma \to 0$ as $n \to \infty$.
(a). By the joint AEP ([18], Theorem 6.9), $\Pr\{E_1\} \to 0$ as $n \to \infty$.
(b). Let $E_{2,j} = \{L_{X_j^n} = \emptyset\}$, $j \in S \setminus S_0$. Then
$$\Pr\{E_2 \mid E_{2,c}\} \le \sum_{j \in S \setminus S_0} \Pr\{E_{2,j} \mid E_{2,c}\}. \tag{A38}$$
For any $j \in S \setminus S_0$, because each sequence $v \in \mathcal{V}_{j,\epsilon}$ and $Y_j^n = k_j(X_j^n)$ are drawn independently, we have
$$\Pr\{(Y_j^n, v) \in T_\epsilon\} \ge (1 - \epsilon)\, 2^{-n[I(Y_j; V_j) + 3\epsilon]} = (1 - \epsilon)\, 2^{-n[I(Y_j; V_j) + \delta/2] + n(\delta/2 - 3\epsilon)} > 2^{-n[I(Y_j; V_j) + \delta/2]}$$
when $n$ is big enough (recall that $\delta > 6\epsilon$). Thus,
$$\Pr\{E_{2,j} \mid E_{2,c}\} = \Pr\{L_{X_j^n} = \emptyset \mid E_{2,c}\} = \prod_{v \in \mathcal{V}_{j,\epsilon}} \Pr\left\{ (k_j(X_j^n), v) \notin T_\epsilon \right\} < \left( 1 - 2^{-n[I(Y_j; V_j) + \delta/2]} \right)^{2^{n[I(Y_j; V_j) + \delta]}} \to 0,\ n \to \infty, \tag{A42}$$
where the inequality (A42) holds true for all big enough $n$, and the limit follows from the fact that $(1 - 1/a)^a \to e^{-1}$ as $a \to \infty$. Therefore, $\Pr\{E_2 \mid E_{2,c}\} \to 0$ as $n \to \infty$ by (A38).
(c). By (109), it is obvious that $V_{J_1} - Y_{J_1} - Y_{J_2} - V_{J_2}$ forms a Markov chain for any two disjoint nonempty sets $J_1, J_2 \subseteq S$. Thus, if $(Y_j^n, V_j^n) \in T_\epsilon$ for all $j \in S$ and $(Y_1^n, Y_2^n, \ldots, Y_s^n) \in T_\epsilon$, then $(Y_1^n, Y_2^n, \ldots, Y_s^n, V) \in T_\epsilon$. In the meantime, $X - (Y_1, Y_2, \ldots, Y_s) - (V_1, V_2, \ldots, V_s)$ is also a Markov chain. Hence, $(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n, V) \in T_\epsilon$ if $(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n) \in T_\epsilon$. Therefore, $\Pr\{E_3 \mid E_{3,c}\} = 0$.
(d). For all $\emptyset \ne J \subseteq S$, let $J = \{j_1, j_2, \ldots, j_{|J|}\}$ and
$$\Gamma_J = \left\{ V' = (v_1, v_2, \ldots, v_s) \in \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon} \,\middle|\, v_j = V_j^n \text{ if and only if } j \in S \setminus J \right\}.$$
By definition, $|\Gamma_J| = \prod_{j \in J} \left( |\mathcal{V}_{j,\epsilon}| - 1 \right) \le 2^{n\left[ \sum_{j \in J} I(Y_j; V_j) + |J|\delta \right]}$ and
$$\Pr\{E_4 \mid E_{4,c}\} = \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} \Pr\left\{ \eta_j(v_j) = \eta_j(V_j^n)\ \forall\, j \in J,\ V' \in T_\epsilon \,\middle|\, E_{4,c} \right\} = \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} \Pr\left\{ \eta_j(v_j) = \eta_j(V_j^n)\ \forall\, j \in J \right\} \Pr\left\{ V' \in T_\epsilon \,\middle|\, E_{4,c} \right\} \tag{A44}$$
$$< \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} 2^{-n \sum_{j \in J} R_j'} \cdot 2^{-n\left[ \sum_{i=1}^{|J|} I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) - |J|\delta \right]} \le \sum_{\emptyset \ne J \subseteq S} 2^{n\left[ \sum_{j \in J} I(Y_j; V_j) + |J|\delta \right]} \cdot 2^{-n \sum_{j \in J} R_j'} \cdot 2^{-n\left[ \sum_{i=1}^{|J|} I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) - |J|\delta \right]} \tag{A45}$$
$$\le C \max_{\emptyset \ne J \subseteq S} 2^{-n\left[ \sum_{j \in J} R_j' - I(Y_J; V_J \mid V_{J^c}) - 2|J|\delta \right]} \to 0,\ n \to \infty, \tag{A46}$$
where $C = 2^s - 1$. Equality (A44) holds because the choices of the $\eta_j$'s and the generation of $V'$ are performed independently. (A45) follows from Lemma A4 and the definitions of the $\eta_j$'s. (A46) follows from Lemma A5.
Lemma A4.
Let $(X_1, X_2, \ldots, X_l, Y) \sim q$. For any $\epsilon > 0$ and positive integer $n$, choose a sequence $\tilde{X}_j^n$ ($1 \le j \le l$) randomly from $T_\epsilon(n, X_j)$ based on a uniform distribution. If $y \in \mathcal{Y}^n$ is an $\epsilon$-typical sequence with respect to $Y$, then
$$\Pr\left\{ (\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n, Y^n) \in T_\epsilon \,\middle|\, Y^n = y \right\} \le 2^{-n\left[ \sum_{j=1}^{l} I(X_j; Y, X_1, X_2, \ldots, X_{j-1}) - 3l\epsilon \right]}.$$
Proof. 
Let $F_j$ be the event $\{(\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_j^n, Y^n) \in T_\epsilon\}$, $1 \le j \le l$, and let $F_0$ be the sure event. We have
$$\Pr\left\{ (\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n, Y^n) \in T_\epsilon \,\middle|\, Y^n = y \right\} = \prod_{j=1}^{l} \Pr\{F_j \mid Y^n = y, F_{j-1}\} \le \prod_{j=1}^{l} 2^{-n\left[ I(X_j; Y, X_1, \ldots, X_{j-1}) - 3\epsilon \right]} = 2^{-n\left[ \sum_{j=1}^{l} I(X_j; Y, X_1, \ldots, X_{j-1}) - 3l\epsilon \right]},$$
since $\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n$ and $y$ are generated independently. ☐
Lemma A5.
If $(Y_1, V_1, Y_2, V_2, \ldots, Y_s, V_s) \sim q$ and
$$q(y_1, v_1, y_2, v_2, \ldots, y_s, v_s) = q(y_1, y_2, \ldots, y_s) \prod_{i=1}^{s} q(v_i \mid y_i),$$
then, for all $J = \{j_1, j_2, \ldots, j_{|J|}\} \subseteq \{1, 2, \ldots, s\}$,
$$I(Y_J; V_J \mid V_{J^c}) = \sum_{i=1}^{|J|} \left[ I(Y_{j_i}; V_{j_i}) - I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) \right].$$
(e). Let $E_{5,1} = \{X^n \notin L_V\}$ and $E_{5,2} = \{|L_V| > 1\}$. We have $\Pr\{E_{5,1} \mid E_{5,c}\} = 0$, because $E_{5,c}$ guarantees that $(X^n, V) \in T_\epsilon$ and $M X^n = \sum_{j=s_0+1}^{s} b_j$, i.e., $X^n \in L_V$, and that $V$ is unique. Therefore,
$$\Pr\{E_5 \mid E_{5,c}\} = \Pr\{E_{5,2} \mid E_{5,c}\} \le \sum_{\substack{X_0^n:\ (X_0^n, V) \in T_\epsilon \\ X_0^n \ne X^n}} \Pr\{M X_0^n = M X^n\} \le \sum_{0 \ne I \le_l R}\ \sum_{X_0^n \in D_\epsilon(X^n, I \mid V) \setminus \{X^n\}} \Pr\{M X_0^n = M X^n\}.$$
Choose a small $\eta > 0$ such that $\eta < \frac{\delta}{2 \log |R|}$. Then
$$\Pr\{E_5 \mid E_{5,c}\} < 2|R|^2 \max_{0 \ne I \le_l R} 2^{n\left[ H(X \mid V_S) - H(Y_{R/I} \mid V_S) + \eta \right]} \cdot 2^{-k \log |I|} = 2|R|^2 \max_{0 \ne I \le_l R} 2^{-n\left[ \frac{k \log |I|}{n} - H(X \mid V_S) + H(Y_{R/I} \mid V_S) - \eta \right]} \tag{A53}$$
$$< 2|R|^2 \max_{0 \ne I \le_l R} 2^{-n\left[ \frac{\delta \log |I|}{\log |R|} - \eta \right]} < 2|R|^2 \cdot 2^{-n\delta / (2 \log |R|)} \to 0,\ n \to \infty, \tag{A54}$$
where (A53) follows from Lemmas 2 and 3 (for all large enough $n$ and small enough $\epsilon$), and (A54) holds because $|I| \ge 2$ for all $I \ne 0$.
To summarize, by (a)–(e), we have $\gamma \to 0$ as $n \to \infty$. The theorem is established. ☐

Appendix D. On Coding over Abelian Groups

As discussed in Section 2, since in this paper we focus on linear encoding, we need to work over a field or a ring. In general, most of the existing coding literature assumes coding over fields, especially when the focus is on linear encoding. Some both traditional and recent work, including [9,10,11], has however also considered (Abelian) groups, while significantly fewer results are available for coding over rings. In this appendix we elaborate on the relation between coding over fields, rings and groups in order to clearly show that our results in this paper are not subsumed by previous work on coding over groups. To highlight this fact even further, the following constitutes a counterexample illustrating that "linear" operations over groups are not well-defined: in the case of the Abelian group $G = \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p$ ($p$ is a prime), there are at least three distinct definitions of multiplication that define rings over $G$. These rings are isomorphic to either
  • the field F p 4 which is commutative; or
  • the non-field ring
$$\mathrm{M}_p = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{Z}_p \right\},$$
    which is not commutative; or
  • the product ring Z p × Z p × Z p × Z p which is commutative.
Suppose that a "linear operation over the group $G$" were defined with respect to some multiplicative operation "$*$", and that, at the same time, this linear scheme over $G$ subsumed the three distinct linear coding schemes defined over $\mathbb{F}_{p^4}$, $\mathrm{M}_p$ and $\mathbb{Z}_p \times \mathbb{Z}_p \times \mathbb{Z}_p \times \mathbb{Z}_p$ simultaneously. We would then conclude that the operation "$*$" is commutative and non-commutative at the same time, a contradiction.
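A small sanity check of this counterexample (the field structure $\mathbb{F}_{p^4}$ is omitted here, since constructing it requires an irreducible quartic polynomial): the sketch below equips one and the same additive group $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ with both the componentwise (product ring) multiplication and $2 \times 2$ matrix multiplication, and confirms that the former is commutative while the latter is not.

```python
from itertools import product

p = 2
# Elements encoded as tuples (a, b, c, d); additively this is exactly
# the group Z_p + Z_p + Z_p + Z_p (here p = 2, so 16 elements).
elements = list(product(range(p), repeat=4))

def mat_mul(m, n):
    # Multiplication of M_p: read (a, b, c, d) as the 2x2 matrix [[a, b], [c, d]].
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def comp_mul(m, n):
    # Multiplication of the product ring Z_p x Z_p x Z_p x Z_p: componentwise.
    return tuple((x * y) % p for x, y in zip(m, n))

# Componentwise multiplication is commutative on all of G...
assert all(comp_mul(m, n) == comp_mul(n, m) for m in elements for n in elements)
# ...while matrix multiplication on the very same group is not.
assert any(mat_mul(m, n) != mat_mul(n, m) for m in elements for n in elements)
```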
To be more specific about the fundamental differences, beyond linearity, between coding over groups, as in e.g., [11], and coding over fields or rings we also provide the following list of additional remarks.
(R1)
Consider the example given in ([11], Section VIII.B.1) for reconstruction of the modulo-two sum of binary symmetric sources [4]. On ([11], p. 1509), it reads "Rate points achieved by embedding the function in the Abelian groups $\mathbb{Z}_3$, $\mathbb{Z}_4$ are strictly worse than that achieved by embedding the function in $\mathbb{Z}_2$ while embedding in $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ gives the Slepian–Wolf rate region for the lossless reconstruction of $(X, Y)$". (Here "$(X, Y)$" should be "$F(X, Y) = X \oplus_2 Y$" from the context, because coding over $\mathbb{Z}_3$ is not strictly worse than coding over $\mathbb{Z}_2$ for losslessly reconstructing the original data $(X, Y)$ [3].)
Ref. [11] thus clearly states that group coding over $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ for encoding the modulo-two sum of symmetric sources gives only the Slepian–Wolf region. On the contrary, consider either the finite field $\mathbb{F}_4$ or the non-field ring
$$\mathrm{M}_{L,2} = \left\{ \begin{bmatrix} a & 0 \\ b & a \end{bmatrix} \,\middle|\, a, b \in \mathbb{Z}_2 \right\}$$
(note: the underlying Abelian group defining both $\mathbb{F}_4$ and $\mathrm{M}_{L,2}$ is $\mathbb{Z}_2 \oplus \mathbb{Z}_2$). We claim that linear coding over either $\mathbb{F}_4$ or $\mathrm{M}_{L,2}$ for encoding the modulo-two sum of symmetric sources gives the Körner–Marton region [4]. This is because linear coding over a finite field, e.g., $\mathbb{F}_4$, is always optimal for the Slepian–Wolf problem, and so is linear coding over the non-field ring $\mathrm{M}_{L,2}$ by Theorem 7. However, group coding over $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is not.
It is well-known that the Körner–Marton region is often strictly larger than the Slepian–Wolf region. Yet linear coding over the non-field ring $\mathrm{M}_{L,2}$ (or the field $\mathbb{F}_4$), being a special case of (nonlinear) coding over the Abelian group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$, could then not achieve a region larger than the Slepian–Wolf region, leading to a contradiction.
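The algebra behind the Körner–Marton scheme referenced here is the identity $Mx \oplus My = M(x \oplus y)$: both encoders apply one common matrix, so the decoder obtains the syndrome of the modulo-two sum $z = x \oplus y$ from the two messages alone. A minimal sketch over $\mathbb{Z}_2$ (the matrix and block length are arbitrary; the subsequent typicality decoding of $z$ from its syndrome is omitted):

```python
import random

random.seed(2)
n, k = 8, 4
M = [[random.randrange(2) for _ in range(n)] for _ in range(k)]

def encode(M, x):
    # Syndrome over Z_2: the same linear map is used at both encoders.
    return [sum(m * xi for m, xi in zip(row, x)) % 2 for row in M]

x = [random.randrange(2) for _ in range(n)]
y = [random.randrange(2) for _ in range(n)]
z = [(a + b) % 2 for a, b in zip(x, y)]   # the modulo-two sum to be computed

# The decoder forms the syndrome of z from the two messages alone:
sx, sy = encode(M, x), encode(M, y)
assert [(a + b) % 2 for a, b in zip(sx, sy)] == encode(M, z)
```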
(R2)
Row 2 of Table III in [11] states that group coding over $\mathbb{Z}_4 \oplus \mathbb{Z}_4$ (achieving sum rate 3.5) is strictly worse than coding over the group $\mathbb{Z}_4$ (achieving sum rate 3) for lossless encoding of a quaternary function ([11], Section VIII.A). On the contrary, linear coding over the ring $\mathbb{Z}_4 \times \mathbb{Z}_4$ (with underlying Abelian group $\mathbb{Z}_4 \oplus \mathbb{Z}_4$) always achieves a region containing the one achieved by linear coding over the ring $\mathbb{Z}_4$. This is implied by Theorem 3. By direct calculation, we have that linear coding over the ring $\mathbb{Z}_4 \times \mathbb{Z}_4$ (achieving sum rate 3) is strictly better than coding over the Abelian group $\mathbb{Z}_4 \oplus \mathbb{Z}_4$ (achieving sum rate 3.5).
(R3)
Finally, we emphasize that according to the Fundamental Theorem of Finite Abelian Groups ([12], Theorem 5.25), up to isomorphism, every finite Abelian group is a direct sum of cyclic groups of prime-power order ([12], Proposition 5.27). This implies that every finite Abelian group can be represented as a direct sum of rings of modulo integers. However, many finite rings are not (isomorphic to) direct products of modulo integers, e.g., finite fields $\mathbb{F}_q$ (when $q$ is a power of a prime but is not a prime), matrix rings $\mathrm{M}_{L,q}$ (where $q \ge 2$ is any integer) and all non-commutative rings. For a fixed order (e.g., $p^2$ with $p$ being a prime), the number of finite rings is often significantly bigger than the number of finite Abelian groups. For instance, there are 4 rings of order 4, while there are only 2 groups of order 4.

Acknowledgments

The authors would like to thank their colleagues Jinfeng Du and Mattias Andersson for assistance in proving Lemma A2. They are also very grateful to an anonymous reviewer of the paper [20] for suggesting an alternative proof of Lemma 3. This work was funded in part by the Swedish Research Council.

Author Contributions

Sheng Huang contributed to the original idea, the analysis and proofs, and wrote the paper. Mikael Skoglund helped to polish the idea and the analysis, and wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Slepian, D.; Wolf, J.K. Noiseless Coding of Correlated Information Sources. IEEE Trans. Inf. Theory 1973, 19, 471–480. [Google Scholar] [CrossRef]
  2. Elias, P. Coding for Noisy Channels. IRE Conv. Rec. 1955, 3, 37–46. [Google Scholar]
  3. Csiszár, I. Linear Codes for Sources and Source Networks: Error Exponents, Universal Coding. IEEE Trans. Inf. Theory 1982, 28, 585–592. [Google Scholar] [CrossRef]
  4. Körner, J.; Marton, K. How to Encode The Modulo-Two Sum of Binary Sources. IEEE Trans. Inf. Theory 1979, 25, 219–221. [Google Scholar] [CrossRef]
  5. Ahlswede, R.; Han, T.S. On Source Coding with Side Information via a Multiple-Access Channel and Related Problems in Multi-User Information Theory. IEEE Trans. Inf. Theory 1983, 29, 396–411. [Google Scholar] [CrossRef]
  6. Huang, S.; Skoglund, M. Polynomials and Computing Functions of Correlated Sources. In Proceedings of the 2012 IEEE International Symposium on Information Theory, Cambridge, MA, USA, 1–6 July 2012; pp. 771–775. [Google Scholar]
  7. Huang, S.; Skoglund, M. Computing Polynomial Functions of Correlated Sources: Inner Bounds. In Proceedings of the International Symposium on Information Theory and Its Applications, Honolulu, HI, USA, 28–31 October 2012; pp. 160–164. [Google Scholar]
  8. Han, T.S.; Kobayashi, K. A Dichotomy of Functions F(X, Y) of Correlated Sources (X, Y) from the Viewpoint of the Achievable Rate Region. IEEE Trans. Inf. Theory 1987, 33, 69–76. [Google Scholar] [CrossRef]
  9. Como, G.; Fagnani, F. The Capacity of Finite Abelian Group Codes over Symmetric Memoryless Channels. IEEE Trans. Inf. Theory 2009, 55, 2037–2054. [Google Scholar] [CrossRef]
  10. Como, G. Group codes outperform binary-coset codes on nonbinary symmetric memoryless channels. IEEE Trans. Inf. Theory 2010, 56, 4321–4334. [Google Scholar] [CrossRef]
  11. Krithivasan, D.; Pradhan, S. Distributed Source Coding Using Abelian Group Codes: A New Achievable Rate-Distortion Region. IEEE Trans. Inf. Theory 2011, 57, 1495–1519. [Google Scholar] [CrossRef]
  12. Rotman, J.J. Advanced Modern Algebra, 2nd ed.; American Mathematical Society: Providence, RI, USA, 2010. [Google Scholar]
  13. Mullen, G.; Stevens, H. Polynomial functions (modm). Acta Math. Hung. 1984, 44, 237–241. [Google Scholar] [CrossRef]
  14. Hungerford, T.W. Algebra (Graduate Texts in Mathematics); Springer: New York, NY, USA, 1980. [Google Scholar]
  15. Lam, T.Y. A First Course in Noncommutative Rings, 2nd ed.; Springer: New York, NY, USA, 2001. [Google Scholar]
  16. Dummit, D.S.; Foote, R.M. Abstract Algebra, 3rd ed.; Wiley: New York, NY, USA, 2003. [Google Scholar]
  17. Anderson, F.W.; Fuller, K.R. Rings and Categories of Modules, 2nd ed.; Springer: New York, NY, USA, 1992. [Google Scholar]
  18. Yeung, R.W. Information Theory and Network Coding, 1st ed.; Springer: New York, NY, USA, 2008. [Google Scholar]
  19. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: New York, NY, USA, 2006. [Google Scholar]
  20. Huang, S.; Skoglund, M. On achievability of linear source coding over finite rings. In Proceedings of the 2013 IEEE International Symposium on Information Theory Proceedings (ISIT), Istanbul, Turkey, 7–12 July 2013; pp. 1984–1988. [Google Scholar]
  21. Huang, S.; Skoglund, M. Encoding Irreducible Markovian Functions of Sources: An Application of Supremus Typicality; KTH Royal Institute of Technology: Stockholm, Sweden, 2013. [Google Scholar]
  22. Buck, R.C. Nomographic Functions are Nowhere Dense. Proc. Am. Math. Soc. 1982, 85, 195–199. [Google Scholar] [CrossRef]
  23. Huang, S.; Skoglund, M. Linear Source Coding over Rings and Applications. In Proceedings of the IEEE Swedish Communication Technologies Workshop, Lund, Sweden, 24–26 October 2012; pp. 1–6. [Google Scholar]
  24. Slepian, D. Group Codes for the Gaussian Channel. Bell Syst. Tech. J. 1968, 47, 575–602. [Google Scholar] [CrossRef]
  25. Ahlswede, R. Group Codes do not Achieve Shannon’s Channel Capacity for General Discrete Channels. Ann. Math. Stat. 1971, 42, 224–240. [Google Scholar] [CrossRef]
  26. Forney, G.D., Jr. On the Hamming distance properties of group codes. IEEE Trans. Inf. Theory 1992, 38, 1797–1801. [Google Scholar] [CrossRef]
  27. Singmaster, D.; Bloom, D.M. Rings of Order Four. Am. Math. Mon. 1964, 71, 918–920. [Google Scholar] [CrossRef]
  28. Chan, T.H.; Grant, A. Entropy vector and network codes. In Proceedings of the IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007. [Google Scholar]
  29. Interlando, J.C.; Palazzo, R., Jr.; Elia, M. Group Block Codes Over Nonabelian Groups are Asymptotically Bad. IEEE Trans. Inf. Theory 1996, 42, 1277–1280. [Google Scholar] [CrossRef]
  30. Du, J.; KTH Royal Institute of Technology, Stockholm, Sweden; Andersson, M.; KTH Royal Institute of Technology, Stockholm, Sweden. Personal Communication, 2012.
  31. Lidl, R.; Niederreiter, H. Finite Fields, 2nd ed.; Cambridge University Press: New York, NY, USA, 1997. [Google Scholar]
Figure 1. $g: \{\alpha_0, \alpha_1\}^3 \to \{\beta_0, \beta_1, \beta_2, \beta_3\}$.
Figure 2. $x + 2y + 3z \in \mathbb{Z}_4[3]$.
Figure 3. $\hat{h}(x + 2y + 4z) \in \mathbb{Z}_5[3]$.
Table 1. Distribution p.
( X 1 , X 2 , X 3 ) p ( X 1 , X 2 , X 3 ) p
( α 0 , α 0 , α 0 ) 1 / 90 ( α 0 , α 1 , α 0 ) 1 / 90
( α 1 , α 0 , α 1 ) 1 / 90 ( α 1 , α 1 , α 1 ) 1 / 90
( α 1 , α 0 , α 0 ) 42 / 90 ( α 0 , α 0 , α 1 ) 1 / 90
( α 0 , α 1 , α 1 ) 42 / 90 ( α 1 , α 1 , α 0 ) 1 / 90
