Article

On Linear Coding over Finite Rings and Applications to Computing

Communication Theory Lab, School of Electrical Engineering, KTH Royal Institute of Technology, Stockholm 10044, Sweden
* Author to whom correspondence should be addressed.
Entropy 2017, 19(5), 233; https://doi.org/10.3390/e19050233
Submission received: 6 January 2017 / Revised: 24 April 2017 / Accepted: 15 May 2017 / Published: 20 May 2017
(This article belongs to the Special Issue Network Information Theory)

Abstract
This paper presents a coding theorem for linear coding over finite rings, in the setting of the Slepian–Wolf source coding problem. This theorem covers the corresponding achievability theorems of Elias (IRE Conv. Rec. 1955, 3, 37–46) and Csiszár (IEEE Trans. Inf. Theory 1982, 28, 585–592) for linear coding over finite fields as special cases. In addition, it is shown that, for any finite set of correlated discrete memoryless sources, there always exists a sequence of linear encoders over some finite non-field rings which achieves the data compression limit, the Slepian–Wolf region. Hence, the question of whether linear coding over finite non-field rings can be optimal for data compression is settled in the affirmative with respect to existence. As an application, we address the problem of source coding for computing, where the decoder is interested in recovering a discrete function of the data generated and independently encoded by several correlated i.i.d. random sources. We propose linear coding over finite rings as an alternative solution to this problem. The results of Körner–Marton (IEEE Trans. Inf. Theory 1979, 25, 219–221) and Ahlswede–Han (IEEE Trans. Inf. Theory 1983, 29, 396–411, Theorem 10) are generalized to the encoding of (pseudo) nomographic functions (over rings). Since a discrete function with a finite domain always admits a nomographic presentation, we conclude that both generalizations apply universally to encoding all discrete functions of finite domains. Based on these results, we demonstrate that linear coding over finite rings strictly outperforms its field counterpart in terms of achieving better coding rates and reducing the required alphabet sizes of the encoders for encoding infinitely many discrete functions.

1. Introduction

The problem of source coding for computing considers the scenario where a decoder is interested in recovering a function of the messages, rather than the original messages themselves, which are i.i.d. generated and independently encoded by the sources. In rigorous terms:
Problem 1 (Source Coding for Computing).
Given $S = \{1, 2, \ldots, s\}$ and $(X_1, X_2, \ldots, X_s) \sim p$. For each $i \in S$, consider a discrete memoryless source that randomly generates i.i.d. discrete data $X_i(1), X_i(2), \ldots, X_i(n), \ldots$, where $X_i(n)$ has a finite sample space $\mathcal{X}_i$ and $\bigl(X_1(n), X_2(n), \ldots, X_s(n)\bigr) \sim p$, $\forall\, n \in \mathbb{N}^+$. For a discrete function $g : \prod_{i \in S} \mathcal{X}_i \to \Omega$, what is the largest region $\mathcal{R}[g] \subseteq \mathbb{R}^s$ such that, $\forall\, (R_1, R_2, \ldots, R_s) \in \mathcal{R}[g]$ and $\forall\, \epsilon > 0$, there exists an $N_0 \in \mathbb{N}^+$, such that for all $n > N_0$, there exist $s$ encoders $\phi_i : \mathcal{X}_i^n \to \{1, 2, \ldots, 2^{nR_i}\}$, $i \in S$, and one decoder $\psi : \prod_{i \in S} \{1, 2, \ldots, 2^{nR_i}\} \to \Omega^n$, with
$$\Pr\Bigl\{ g\bigl(X_1^n, \ldots, X_s^n\bigr) \neq \psi\bigl(\phi_1(X_1^n), \ldots, \phi_s(X_s^n)\bigr) \Bigr\} < \epsilon,$$
where $X_i^n = \bigl[X_i(1), X_i(2), \ldots, X_i(n)\bigr]$ and
$$g\bigl(X_1^n, \ldots, X_s^n\bigr) = \bigl[ g\bigl(X_1(1), \ldots, X_s(1)\bigr), \ldots, g\bigl(X_1(n), \ldots, X_s(n)\bigr) \bigr] \in \Omega^n\,?$$
The region $\mathcal{R}[g]$ is called the achievable coding rate region for computing $g$. A rate tuple $R \in \mathbb{R}^s$ is said to be achievable for computing $g$ (or simply achievable) if and only if $R \in \mathcal{R}[g]$. A region $\mathcal{R} \subseteq \mathbb{R}^s$ is said to be achievable for computing $g$ (or simply achievable) if and only if $\mathcal{R} \subseteq \mathcal{R}[g]$.
If $g$ is an identity function, the computing problem, Problem 1, is known as the Slepian–Wolf (SW) source coding problem. $\mathcal{R}[g]$ is then the SW region [1],
$$\mathcal{R}[X_1, X_2, \ldots, X_s] = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{j \in T} R_j > H(X_T \mid X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S \Bigr\},$$
where $T^c$ is the complement of $T$ in $S$, and $X_T$ ($X_{T^c}$, resp.) is the random variable array $[X_j]_{j \in T}$ ($[X_j]_{j \in T^c}$, resp.). However, from [1] it is hard to draw conclusions regarding the structure (linear or not) of the encoders, since the corresponding mappings are chosen randomly among all feasible mappings. This limits the scope of their potential applications. As a consequence, linear coding over finite fields (LCoF), namely mapping the $\mathcal{X}_i$'s injectively into some finite fields and choosing the $\phi_i$'s as linear mappings over these fields, is considered. It is shown that LCoF achieves the same encoding limit, the SW region [2,3]. Although it seems straightforward to study linear mappings over rings (non-field rings in particular), it has never been proved (nor disproved) that linear encoding over non-field rings can be equally optimal.
For an arbitrary discrete function $g$, Problem 1 remains open in general, while obviously $\mathcal{R}[X_1, X_2, \ldots, X_s] \subseteq \mathcal{R}[g]$. Making use of Elias' theorem on binary linear codes [2], Körner–Marton [4] shows that $\mathcal{R}[\oplus_2]$ ("$\oplus_2$" denotes the modulo-two sum) contains the region
$$\tilde{\mathcal{R}} = \bigl\{ (R_1, R_2) \in \mathbb{R}^2 \;\big|\; R_1, R_2 > H(X_1 \oplus_2 X_2) \bigr\}.$$
This region is not contained in the SW region for certain distributions. In other words, $\mathcal{R}[\oplus_2] \supsetneq \mathcal{R}[X_1, X_2]$ in those cases. Combining the standard random coding technique and Elias' result, [5] shows that $\mathcal{R}[\oplus_2]$ can be strictly larger than the convex hull of the union $\mathcal{R}[X_1, X_2] \cup \tilde{\mathcal{R}}$. However, the functions considered in these works are relatively simple. With a polynomial approach, [6,7] generalize the result of Ahlswede–Han ([5], Theorem 10) to the scenario of an arbitrary $g$. Making use of the fact that a discrete function is essentially a polynomial function (see Definition 2) over some finite field, an achievable region is given for computing an arbitrary discrete function. Such a region contains, and can be strictly larger than (depending on the precise function and distribution under consideration), the SW region. Conditions under which $\mathcal{R}[g]$ is strictly larger than the SW region are presented in [6,8] from different perspectives. The cases regarding Abelian group codes are covered in [9,10,11].
The present work proposes replacing the linear encoders over finite fields from Elias [2] and Csiszár [3] with linear encoders over finite rings in the problems accounted for above. Achievability theorems related to linear coding over finite rings (LCoR) for SW data compression are presented, covering the results in [2,3] as special cases in the sense of characterizing the achievable region. In addition, it is proved that there always exists a sequence of linear encoders over some finite non-field rings that achieves the SW region for any SW scenario. Therefore, the question of the optimality of linear coding over finite non-field rings for data compression is settled in the affirmative with respect to existence. Furthermore, we also consider LCoR as an alternative technique for the general computing problem, Problem 1. Results from Körner–Marton [4], Ahlswede–Han ([5], Theorem 10) and [7] are generalized to corresponding ring versions for encoding (pseudo) nomographic functions (over rings). Since any discrete function with a finite domain admits a nomographic presentation, we conclude that our results universally apply for encoding all discrete functions of finite domains. Finally, it is shown that our ring approach dominates its field counterpart in terms of achieving better coding rates and reducing the alphabet sizes of the encoders for encoding certain discrete functions. The proof takes advantage of the fact that the characteristic of a ring can be any positive integer, while the characteristic of a field must be a prime. From this observation, it is seen that there are in fact infinitely many such functions.

2. Rings, Ideals and Linear Mappings

We start by introducing some fundamental algebraic concepts and related properties. Readers already familiar with this material may still wish to skim through it to become acquainted with our notation.
Definition 1.
The tuple $[\mathcal{R}, +, \cdot]$ is called a ring if the following criteria are met:
1. $[\mathcal{R}, +]$ is an Abelian group;
2. There exists a multiplicative identity $1 \in \mathcal{R}$, namely, $1 \cdot a = a \cdot 1 = a$, $\forall\, a \in \mathcal{R}$;
3. $\forall\, a, b, c \in \mathcal{R}$, $a \cdot b \in \mathcal{R}$ and $(a \cdot b) \cdot c = a \cdot (b \cdot c)$;
4. $\forall\, a, b, c \in \mathcal{R}$, $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$ and $(b + c) \cdot a = (b \cdot a) + (c \cdot a)$.
We often write $\mathcal{R}$ for $[\mathcal{R}, +, \cdot]$ when the operations considered are clear from the context. The operation "·" is usually written by juxtaposition, $ab$ for $a \cdot b$, for all $a, b \in \mathcal{R}$.
A ring $[\mathcal{R}, +, \cdot]$ is said to be commutative if $a \cdot b = b \cdot a$, $\forall\, a, b \in \mathcal{R}$. In Definition 1, the identity of the group $[\mathcal{R}, +]$, denoted by 0, is called the zero. A ring $[\mathcal{R}, +, \cdot]$ is said to be finite if the cardinality $|\mathcal{R}|$ is finite, and $|\mathcal{R}|$ is called the order of $\mathcal{R}$. The set $\mathbb{Z}_q$ of integers modulo $q$ is a commutative finite ring with respect to modular arithmetic. For any ring $\mathcal{R}$, the set of all polynomials in $s$ indeterminates over $\mathcal{R}$ is an infinite ring.
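Definition 1 can be checked exhaustively for a small concrete instance. The following sketch (an illustration we add here, not part of the paper) verifies all four ring axioms for $\mathbb{Z}_6$, the non-field ring used repeatedly in later sections:

```python
# Exhaustively verify the ring axioms of Definition 1 for Z_q = {0, ..., q-1}
# under addition and multiplication modulo q (here q = 6, a non-field ring).
q = 6
R = range(q)
add = lambda a, b: (a + b) % q
mul = lambda a, b: (a * b) % q

# 1. [R, +] is an Abelian group: associative, commutative, identity 0, inverses.
assert all(add(add(a, b), c) == add(a, add(b, c)) for a in R for b in R for c in R)
assert all(add(a, b) == add(b, a) for a in R for b in R)
assert all(any(add(a, b) == 0 for b in R) for a in R)

# 2. Multiplicative identity 1.
assert all(mul(1, a) == a == mul(a, 1) for a in R)

# 3. Multiplication is closed and associative.
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in R for b in R for c in R)

# 4. Both distributive laws.
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a in R for b in R for c in R)
assert all(mul(add(b, c), a) == add(mul(b, a), mul(c, a)) for a in R for b in R for c in R)
```

The same check passes for any modulus $q$, since $\mathbb{Z}_q$ is always a commutative finite ring.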
Definition 2.
A polynomial function (polynomial and polynomial function are distinct concepts) of $k$ variables over a finite ring $\mathcal{R}$ is a function $g : \mathcal{R}^k \to \mathcal{R}$ of the form
$$g(x_1, x_2, \ldots, x_k) = \sum_{j=0}^{m} a_j\, x_1^{m_{1j}} x_2^{m_{2j}} \cdots x_k^{m_{kj}},$$
where $a_j \in \mathcal{R}$ and $m$ and the $m_{ij}$'s are non-negative integers. The set of all polynomial functions of $k$ variables over ring $\mathcal{R}$ is designated by $\mathcal{R}[k]$.
Remark 1.
Polynomials and polynomial functions are sometimes only defined over a commutative ring [12,13]. It is a very delicate matter to define them over a non-commutative ring [14,15], due to the fact that $x_1 x_2$ and $x_2 x_1$ can become different objects. We choose to define "polynomial functions" by Formula (5) because those functions are within the scope of this paper's interest.
Proposition 1.
Given $s$ rings $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$, for any non-empty set $T \subseteq \{1, 2, \ldots, s\}$, the Cartesian product (see [12]) $\mathcal{R}_T = \prod_{i \in T} \mathcal{R}_i$ forms a new ring $[\mathcal{R}_T, +, \cdot]$ with respect to the component-wise operations defined as follows:
$$\mathbf{a} + \mathbf{a}' = \bigl(a_1 + a'_1,\ a_2 + a'_2,\ \ldots,\ a_{|T|} + a'_{|T|}\bigr),$$
$$\mathbf{a} \cdot \mathbf{a}' = \bigl(a_1 a'_1,\ a_2 a'_2,\ \ldots,\ a_{|T|} a'_{|T|}\bigr),$$
$$\forall\, \mathbf{a} = \bigl(a_1, a_2, \ldots, a_{|T|}\bigr),\ \mathbf{a}' = \bigl(a'_1, a'_2, \ldots, a'_{|T|}\bigr) \in \mathcal{R}_T.$$
Remark 2.
In Proposition 1, $[\mathcal{R}_T, +, \cdot]$ is called the direct product of $\{\mathcal{R}_i \mid i \in T\}$. It is easily seen that $(0, 0, \ldots, 0)$ and $(1, 1, \ldots, 1)$ are the zero and the multiplicative identity of $[\mathcal{R}_T, +, \cdot]$, respectively.
Definition 3.
A non-zero element $a$ of a ring $\mathcal{R}$ is said to be invertible if and only if there exists $b \in \mathcal{R}$ such that $ab = ba = 1$; $b$ is called the inverse of $a$, denoted by $a^{-1}$. An invertible element of a ring is called a unit.
Remark 3.
It can be proved that the inverse of a unit is unique. By definition, the multiplicative identity is the inverse of itself.
Let $\mathcal{R}^* = \mathcal{R} \setminus \{0\}$. The ring $[\mathcal{R}, +, \cdot]$ is a field if and only if $[\mathcal{R}^*, \cdot]$ is an Abelian group; in other words, all non-zero elements of $\mathcal{R}$ are invertible. All fields are commutative rings. $\mathbb{Z}_q$ is a field if and only if $q$ is a prime. All finite fields of the same order are isomorphic to each other ([16], p. 549). This "unique" field of order $q$ is denoted by $\mathbb{F}_q$; necessarily, $q$ is a power of a prime. More details regarding finite fields can be found in ([16], Chapter 14.3).
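The field criterion is easy to test computationally. The sketch below (our illustration; the helper names `units` and `is_field` are ours) lists the units of $\mathbb{Z}_q$ and confirms that $\mathbb{Z}_q$ is a field exactly when $q$ is prime:

```python
def units(q):
    """Return the invertible elements (units) of Z_q."""
    return [a for a in range(1, q) if any(a * b % q == 1 for b in range(1, q))]

def is_field(q):
    """Z_q is a field iff every non-zero element is invertible."""
    return len(units(q)) == q - 1

# Z_5 is a field (5 is prime); Z_6 is not: only 1 and 5 are units.
assert is_field(5)
assert not is_field(6)
assert units(6) == [1, 5]
```

In general the units of $\mathbb{Z}_q$ are exactly the residues coprime to $q$, which is why every non-zero element is a unit precisely when $q$ is prime.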
Theorem 1
(Wedderburn’s little theorem [12]). Let R be a finite ring. R is a field if and only if all non-zero elements of R are invertible.
Remark 4.
Wedderburn’s little theorem guarantees commutativity for a finite ring if all of its non-zero elements are invertible. Hence, a finite ring is either a field or at least one of its elements has no inverse. However, a finite commutative ring is not necessarily a field, e.g., $\mathbb{Z}_q$ is not a field if $q$ is not a prime.
Definition 4
([16]). The characteristic of a finite ring $\mathcal{R}$ is defined to be the smallest positive integer $m$ such that $\sum_{j=1}^{m} 1 = 0$, where 0 and 1 are the zero and the multiplicative identity of $\mathcal{R}$, respectively. The characteristic of $\mathcal{R}$ is often denoted by $\mathrm{Char}(\mathcal{R})$.
Remark 5.
Clearly, $\mathrm{Char}(\mathbb{Z}_q) = q$. For a finite field $\mathbb{F}_q$, $\mathrm{Char}(\mathbb{F}_q)$ is always the prime $q_0$ such that $q = q_0^n$ for some positive integer $n$ ([12], Proposition 2.137).
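Definition 4 translates directly into a small computation; a sketch (assuming, for illustration, that the ring is $\mathbb{Z}_q$ represented by its modulus):

```python
def char_Zq(q):
    """Smallest positive m with 1 + 1 + ... + 1 (m times) = 0 in Z_q."""
    total, m = 0, 0
    while True:
        m += 1
        total = (total + 1) % q
        if total == 0:
            return m

# Char(Z_q) = q, as stated in Remark 5.
assert char_Zq(6) == 6
assert char_Zq(7) == 7
```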
Proposition 2.
Let $\mathbb{F}_q$ be a finite field. For any $0 \neq a \in \mathbb{F}_q$, $m = \mathrm{Char}(\mathbb{F}_q)$ if and only if $m$ is the smallest positive integer such that $\sum_{j=1}^{m} a = 0$.
Proof. 
Since $a \neq 0$,
$$\sum_{j=1}^{m} a = 0 \;\Longleftrightarrow\; a^{-1} \sum_{j=1}^{m} a = a^{-1} \cdot 0 \;\Longleftrightarrow\; \sum_{j=1}^{m} 1 = 0.$$
The statement is proved. ☐
Definition 5.
A subset $\mathcal{I}$ of a ring $[\mathcal{R}, +, \cdot]$ is said to be a left ideal of $\mathcal{R}$, denoted by $\mathcal{I} \leq_l \mathcal{R}$, if and only if
1. $[\mathcal{I}, +]$ is a subgroup of $[\mathcal{R}, +]$;
2. $\forall\, x \in \mathcal{I}$ and $\forall\, a \in \mathcal{R}$, $a \cdot x \in \mathcal{I}$.
If condition 2 is replaced by
3. $\forall\, x \in \mathcal{I}$ and $\forall\, a \in \mathcal{R}$, $x \cdot a \in \mathcal{I}$,
then $\mathcal{I}$ is called a right ideal of $\mathcal{R}$, denoted by $\mathcal{I} \leq_r \mathcal{R}$. $\{0\}$ is a trivial left (right) ideal, usually denoted by 0.
The cardinality $|\mathcal{I}|$ is called the order of a finite left (right) ideal $\mathcal{I}$.
Remark 6.
Let $\{a_1, a_2, \ldots, a_n\}$ be a non-empty set of elements of some ring $\mathcal{R}$. It is easy to verify that $\langle a_1, a_2, \ldots, a_n \rangle_r = \bigl\{ \sum_{i=1}^{n} a_i b_i \;\big|\; b_i \in \mathcal{R}, 1 \leq i \leq n \bigr\}$ is a right ideal and $\langle a_1, a_2, \ldots, a_n \rangle_l = \bigl\{ \sum_{i=1}^{n} b_i a_i \;\big|\; b_i \in \mathcal{R}, 1 \leq i \leq n \bigr\}$ is a left ideal. Furthermore, $\langle a_1, a_2, \ldots, a_n \rangle_r = \mathcal{R}$ and $\langle a_1, a_2, \ldots, a_n \rangle_l = \mathcal{R}$ if $a_i$ is a unit for some $1 \leq i \leq n$.
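In $\mathbb{Z}_6$ these generated ideals are easy to enumerate explicitly; a sketch (illustrative; since $\mathbb{Z}_6$ is commutative, left- and right-generated ideals coincide):

```python
from itertools import product

def ideal(generators, q):
    """Ideal of Z_q generated by `generators`: all sums sum_i b_i * a_i
    with coefficients b_i ranging over Z_q."""
    return sorted({sum(b * a for b, a in zip(coeffs, generators)) % q
                   for coeffs in product(range(q), repeat=len(generators))})

# The ideals of Z_6: 0, <3> = {0, 3}, <2> = {0, 2, 4}, and <1> = Z_6.
assert ideal([3], 6) == [0, 3]
assert ideal([2], 6) == [0, 2, 4]
assert ideal([5], 6) == [0, 1, 2, 3, 4, 5]  # 5 is a unit, so <5> = Z_6
```

The last line illustrates the final claim of Remark 6: a generating set containing a unit generates the whole ring.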
It is well-known that if $\mathcal{I} \leq_l \mathcal{R}$, then $\mathcal{R}$ is partitioned into disjoint cosets of equal size (cardinality). For any coset $\mathcal{J}$, $\mathcal{J} = x + \mathcal{I} = \{x + y \mid y \in \mathcal{I}\}$, $\forall\, x \in \mathcal{J}$. The set of all cosets forms a left module over $\mathcal{R}$, denoted by $\mathcal{R}/\mathcal{I}$. Similarly, $\mathcal{R}/\mathcal{I}$ becomes a right module over $\mathcal{R}$ if $\mathcal{I} \leq_r \mathcal{R}$ [17]. Of course, $\mathcal{R}/\mathcal{I}$ can also be considered as a quotient group [12]. However, its structure is considerably richer than that of a quotient group alone.
Proposition 3.
Let $\mathcal{R}_i$ ($1 \leq i \leq s$) be a ring and $\mathcal{R} = \prod_{i=1}^{s} \mathcal{R}_i$. For any $\mathcal{A} \subseteq \mathcal{R}$, $\mathcal{A} \leq_l \mathcal{R}$ (or $\mathcal{A} \leq_r \mathcal{R}$) if and only if $\mathcal{A} = \prod_{i=1}^{s} \mathcal{A}_i$ and $\mathcal{A}_i \leq_l \mathcal{R}_i$ (or $\mathcal{A}_i \leq_r \mathcal{R}_i$), $1 \leq i \leq s$.
Proof. 
We prove the $\leq_l$ case only; the $\leq_r$ case follows from a similar argument. Let $\pi_i$ ($1 \leq i \leq s$) be the coordinate function assigning to every element of $\mathcal{R}$ its $i$th component. Then $\mathcal{A} \subseteq \prod_{i=1}^{s} \mathcal{A}_i$, where $\mathcal{A}_i = \pi_i(\mathcal{A})$. Moreover, for any
$$\mathbf{x} = \bigl( \pi_1(\mathbf{x}_1), \pi_2(\mathbf{x}_2), \ldots, \pi_s(\mathbf{x}_s) \bigr) \in \prod_{i=1}^{s} \mathcal{A}_i,$$
where $\mathbf{x}_i \in \mathcal{A}$ for all feasible $i$, we have that
$$\mathbf{x} = \sum_{i=1}^{s} e_i \mathbf{x}_i,$$
where $e_i \in \mathcal{R}$ has $i$th coordinate 1 and all other coordinates 0. If $\mathcal{A} \leq_l \mathcal{R}$, then $\mathbf{x} \in \mathcal{A}$ by definition. Therefore, $\prod_{i=1}^{s} \mathcal{A}_i \subseteq \mathcal{A}$. Consequently, $\mathcal{A} = \prod_{i=1}^{s} \mathcal{A}_i$. Since $\pi_i$ is a homomorphism, we also have $\mathcal{A}_i \leq_l \mathcal{R}_i$ for all feasible $i$. The other direction is easily verified by definition. ☐
Remark 7.
It is worthwhile to point out that Proposition 3 does not hold for an infinite index set, namely $\mathcal{R} = \prod_{i \in I} \mathcal{R}_i$, where $I$ is not finite.
For any $\emptyset \neq T \subseteq S$, Proposition 3 states that any left (right) ideal of $\mathcal{R}_T$ is a Cartesian product of left (right) ideals of the $\mathcal{R}_i$'s, $i \in T$. Let $\mathcal{I}_i$ be a left (right) ideal of ring $\mathcal{R}_i$ ($1 \leq i \leq s$). We define $\mathcal{I}_T$ to be the left (right) ideal $\prod_{i \in T} \mathcal{I}_i$ of $\mathcal{R}_T$.
Let x t be the transpose of a vector (or matrix) x .
Definition 6.
A mapping $f : \mathcal{R}^n \to \mathcal{R}^m$ given as
$$f(x_1, x_2, \ldots, x_n) = \Bigl[ \sum_{j=1}^{n} a_{1,j} x_j,\ \ldots,\ \sum_{j=1}^{n} a_{m,j} x_j \Bigr]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
where $t$ stands for transposition and $a_{i,j} \in \mathcal{R}$ for all feasible $i$ and $j$, is called a left linear mapping over ring $\mathcal{R}$. Similarly,
$$f(x_1, x_2, \ldots, x_n) = \Bigl[ \sum_{j=1}^{n} x_j a_{1,j},\ \ldots,\ \sum_{j=1}^{n} x_j a_{m,j} \Bigr]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
defines a right linear mapping over ring $\mathcal{R}$. If $m = 1$, then $f$ is called a left (right) linear function over $\mathcal{R}$.
From now on, left linear mapping (function) or right linear mapping (function) are simply called linear mapping (function). This will not lead to any confusion since the intended use can usually be clearly distinguished from the context.
Remark 8.
The mapping $f$ in Definition 6 is called linear in accordance with the definition of a linear mapping (function) over a field. In fact, the two structures have several similar properties. Moreover, (11) is equivalent to
$$f(x_1, x_2, \ldots, x_n) = A\, [x_1, x_2, \ldots, x_n]^t, \quad \forall\, (x_1, x_2, \ldots, x_n) \in \mathcal{R}^n,$$
where $A$ is an $m \times n$ matrix over $\mathcal{R}$ and $[A]_{i,j} = a_{i,j}$ for all feasible $i$ and $j$. $A$ is named the coefficient matrix. It is easy to prove that a linear mapping is uniquely determined by its coefficient matrix, and vice versa. The linear mapping $f$ is said to be trivial, denoted by 0, if $A$ is the zero matrix, i.e., $[A]_{i,j} = 0$ for all feasible $i$ and $j$.
It should be noted that an interesting approach to coding over an Abelian group was presented in [9,10,11]. However, we emphasize that even though group, field and ring are closely related algebraic structures, the definition of the group encoder in [11] and the linear encoder in [3] and in the present work are in general fundamentally different (although there is an overlap in special cases). To highlight in more detail the difference between linear encoding (this work and [3]) and encoding over a group, as in [11], which is a nonlinear operation in general, take the Abelian group $G = \mathbb{Z}_2 \oplus \mathbb{Z}_2$, the field $\mathbb{F}_4$ of order 4 and the matrix ring $M_{L,2} = \Bigl\{ \begin{bmatrix} a & 0 \\ b & a \end{bmatrix} \;\Big|\; a, b \in \mathbb{Z}_2 \Bigr\}$ as examples.
  • By ([11], Example 2), the Abelian group encoder encodes the source $\hat{Z} = (X, Y) \in G$ based on a Slepian–Wolf-like scheme. Namely, two binary linear encoders encode $X^n$ and $Y^n$ separately as two binary sources. Therefore, the lengths of the codewords from encoding $X^n$ and $Y^n$ can even be different, and the encoder is in general a highly nonlinear device.
  • On the other hand, the linear encoder over either $\mathbb{F}_4$ or $M_{L,2}$ simply outputs a linear combination of the vector $\hat{Z}^n$, namely $A \hat{Z}^n$ for some matrix $A$ over $\mathbb{F}_4$ or $M_{L,2}$.
  • However, if one requires that the codewords from encoding $X^n$ and $Y^n$ be of the same length in (1), then the output from encoding $\hat{Z}^n$ is the same as $\tilde{A} \hat{Z}^n$ for some matrix $\tilde{A}$ over the ring $\mathbb{Z}_2 \times \mathbb{Z}_2$ (a specific product ring whose multiplication is significantly different from those of $\mathbb{F}_4$ or $M_{L,2}$). In other words, in this quite specific special case, the encoder becomes linear over a product ring of modulo integers, which is a sub-class of the completely general ring structures considered in this paper.
We also note that in some source network problems, linear codes appear superior to others [3]. For instance, for encoding the modulo-two sum of binary symmetric sources, linear coding over F 4 or M L , 2 achieves the optimal Körner–Marton region [4] (the M L , 2 case will be established in later sections), while coding over G achieves the sub-optimal Slepian–Wolf region ([11], p. 1509). To avoid any remaining confusion, we in Appendix D present additional details regarding the differences between linear coding, as in the present work and in [3], and coding over an Abelian group, as in [11].
Let $A$ be an $m \times n$ matrix over ring $\mathcal{R}$ and $f(\mathbf{x}) = A\mathbf{x}$, $\forall\, \mathbf{x} \in \mathcal{R}^n$. For the system of linear equations
$$f(\mathbf{x}) = A\mathbf{x} = \mathbf{0}, \quad \text{where } \mathbf{0} = (0, 0, \ldots, 0)^t \in \mathcal{R}^m,$$
let $S(f)$ be the set of all solutions, namely $S(f) = \{\mathbf{x} \in \mathcal{R}^n \mid f(\mathbf{x}) = \mathbf{0}\}$. It is obvious that $S(f) = \mathcal{R}^n$ if $f$ is trivial, i.e., $A$ is the zero matrix. If $\mathcal{R}$ is a field, then $S(f)$ is a subspace of $\mathcal{R}^n$. We conclude this section with a lemma regarding the cardinalities of $\mathcal{R}^n$ and $S(f)$.
Lemma 1.
For a finite ring $\mathcal{R}$ and a linear function
$$f : \mathbf{x} \mapsto (a_1, a_2, \ldots, a_n)\, \mathbf{x} \quad \bigl( f : \mathbf{x} \mapsto \mathbf{x}^t (a_1, a_2, \ldots, a_n)^t \bigr), \quad \forall\, \mathbf{x} \in \mathcal{R}^n,$$
we have
$$\frac{|S(f)|}{|\mathcal{R}^n|} = \frac{1}{|\mathcal{I}|},$$
where $\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_r$ ($\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_l$, resp.). In particular, if $a_i$ is invertible for some $1 \leq i \leq n$, then $|S(f)| = |\mathcal{R}|^{n-1}$.
Proof. 
It is obvious that the image $f(\mathcal{R}^n) = \mathcal{I}$ by definition. Moreover, $\forall\, x \neq y \in \mathcal{I}$, the pre-images satisfy $f^{-1}(x) \cap f^{-1}(y) = \emptyset$ and $|f^{-1}(x)| = |f^{-1}(y)| = |S(f)|$. Therefore, $|\mathcal{I}| \cdot |S(f)| = |\mathcal{R}^n|$, i.e., $\frac{|S(f)|}{|\mathcal{R}^n|} = \frac{1}{|\mathcal{I}|}$. Moreover, if $a_i$ is a unit, then $\mathcal{I} = \mathcal{R}$; thus, $|S(f)| = |\mathcal{R}^n| / |\mathcal{R}| = |\mathcal{R}|^{n-1}$. ☐
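Lemma 1 can be verified exhaustively for a small non-field ring. The sketch below (our illustration) enumerates the solution set $S(f)$ of every linear function over $\mathbb{Z}_6$ with $n = 2$ and checks $|S(f)| \cdot |\mathcal{I}| = |\mathcal{R}|^n$:

```python
from itertools import product

q, n = 6, 2  # the ring Z_6 and vector length n = 2

def ideal(gens):
    """Ideal of Z_6 generated by `gens` (left = right, Z_6 is commutative)."""
    return {sum(b * a for b, a in zip(bs, gens)) % q
            for bs in product(range(q), repeat=len(gens))}

for coeffs in product(range(q), repeat=n):
    # S(f) for f(x) = a_1 x_1 + a_2 x_2 over Z_6.
    S_f = [x for x in product(range(q), repeat=n)
           if sum(a * xi for a, xi in zip(coeffs, x)) % q == 0]
    I = ideal(coeffs)
    # Lemma 1: |S(f)| / |R^n| = 1 / |I|, i.e. |S(f)| * |I| = q^n.
    assert len(S_f) * len(I) == q ** n
```

For instance, $f(x_1, x_2) = 3x_1$ has $\mathcal{I} = \{0, 3\}$ of order 2 and exactly $36 / 2 = 18$ solutions in $\mathbb{Z}_6^2$.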

3. Linear Coding over Finite Rings

In this section, we present a coding rate region achieved with LCoR for the SW source coding problem, i.e., for $g$ an identity function in Problem 1. This region is exactly the SW region if all the rings considered are fields. However, being a field is not necessary, as seen in Section 5, where the issue of optimality is addressed.
Before proceeding, a subtlety needs to be cleared up. It is assumed that a source generates data taking values in a finite sample space $\mathcal{X}_i$, but $\mathcal{X}_i$ does not necessarily admit any algebraic structure. We must either assume that $\mathcal{X}_i$ carries a certain algebraic structure, for instance that $\mathcal{X}_i$ is a ring, or injectively map the elements of $\mathcal{X}_i$ into some algebraic structure. In our subsequent discussions, we assume that $\mathcal{X}_i$ is mapped into a finite ring $\mathcal{R}_i$ of order at least $|\mathcal{X}_i|$ by some injection $\Phi_i$. Hence, $\mathcal{X}_i$ can simply be treated as the subset $\Phi_i(\mathcal{X}_i) \subseteq \mathcal{R}_i$ for a fixed $\Phi_i$. When required, $\Phi_i$ can also be selected to obtain desired outcomes.
To facilitate our discussion, the following notation is used. For $T \subseteq S$, $\mathcal{X}_T$ ($x_T$ and $X_T$, resp.) is defined to be the Cartesian product
$$\prod_{i \in T} \mathcal{X}_i \quad \Bigl( \prod_{i \in T} x_i \ \text{and}\ \prod_{i \in T} X_i,\ \text{resp.} \Bigr),$$
where $x_i \in \mathcal{X}_i$ is a realization of $X_i$. If $(X_1, X_2, \ldots, X_s) \sim p$, we denote the marginal of $p$ with respect to $X_T$ by $p_{X_T}$, i.e., $X_T \sim p_{X_T}$, and define the support
$$\mathrm{supp}(p_{X_T}) = \bigl\{ x_T \in \mathcal{X}_T \;\big|\; p_{X_T}(x_T) > 0 \bigr\}$$
and $H(p_{X_T}) = H(X_T)$. For simplicity, $M_{\mathcal{X}_S, \mathcal{R}_S}$ is defined to be
$$\bigl\{ (\Phi_1, \Phi_2, \ldots, \Phi_s) \;\big|\; \Phi_i : \mathcal{X}_i \to \mathcal{R}_i \ \text{is injective},\ \forall\, i \in S \bigr\}$$
($|\mathcal{R}_i| \geq |\mathcal{X}_i|$ is implicitly assumed), and $\Phi(x_T) = \prod_{i \in T} \Phi_i(x_i)$ for any $\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}$ and $x_T \in \mathcal{X}_T$. For any $\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}$, let
$$\mathcal{R}_\Phi = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > r_{T, \mathcal{I}_T},\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, 0 \neq \mathcal{I}_i \leq_l \mathcal{R}_i \Bigr\},$$
where $r_{T, \mathcal{I}_T} = H(X_T \mid X_{T^c}) - H(Y_{\mathcal{R}_T/\mathcal{I}_T} \mid X_{T^c}) = H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c})$ and $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi(X_T) + \mathcal{I}_T$ is a random variable with sample space $\mathcal{R}_T/\mathcal{I}_T$.
Theorem 2.
$\mathcal{R}_\Phi$ is achievable with linear coding over the finite rings $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$. In exact terms, $\forall\, \epsilon > 0$, there exists $N_0 \in \mathbb{N}^+$ such that, for all $n > N_0$, there exist linear encoders (left linear mappings, to be more precise) $\phi_i : \Phi(\mathcal{X}_i)^n \to \mathcal{R}_i^{k_i}$ ($i \in S$) and a decoder $\psi$, such that
$$\Pr\Bigl\{ \psi\bigl( [\phi_i(\mathbf{X}_i)]_{i \in S} \bigr) \neq [\mathbf{X}_i]_{i \in S} \Bigr\} < \epsilon,$$
where $\mathbf{X}_i = \bigl[ \Phi_i(X_i(1)), \Phi_i(X_i(2)), \ldots, \Phi_i(X_i(n)) \bigr]^t$, as long as
$$\Bigl( \frac{k_1 \log |\mathcal{R}_1|}{n},\ \frac{k_2 \log |\mathcal{R}_2|}{n},\ \ldots,\ \frac{k_s \log |\mathcal{R}_s|}{n} \Bigr) \in \mathcal{R}_\Phi.$$
Proof. 
The proof is given in Section 4. ☐
The following is a concrete example providing some insight into this theorem.
Example 1.
Consider the single source scenario, where $X_1 \sim p$ and $\mathcal{X}_1 = \mathbb{Z}_6$, specified as follows:

$X_1$:      0     1     2     3     4     5
$p(X_1)$:  0.05  0.1   0.15  0.2   0.2   0.3

Obviously, $\mathbb{Z}_6$ contains 3 non-trivial ideals, $\mathcal{I}_1 = \{0, 3\}$, $\mathcal{I}_2 = \{0, 2, 4\}$ and $\mathbb{Z}_6$ itself, and $Y_{\mathbb{Z}_6/\mathcal{I}_1}$ and $Y_{\mathbb{Z}_6/\mathcal{I}_2}$ admit the distributions

$Y_{\mathbb{Z}_6/\mathcal{I}_1}$: $\{0,3\} \mapsto 0.25$, $\{1,4\} \mapsto 0.3$, $\{2,5\} \mapsto 0.45$;
$Y_{\mathbb{Z}_6/\mathcal{I}_2}$: $\{0,2,4\} \mapsto 0.4$, $\{1,3,5\} \mapsto 0.6$,

respectively. In addition, $Y_{\mathbb{Z}_6/\mathbb{Z}_6}$ is a constant. Thus, by Theorem 2, rate $R_1$ is achievable if
$$R_1 \frac{\log |\mathcal{I}_1|}{\log |\mathbb{Z}_6|} = R_1 \frac{\log 2}{\log 6} > H(X_1) - H(Y_{\mathbb{Z}_6/\mathcal{I}_1}) = 2.40869 - 1.53949 = 0.86920,$$
$$R_1 \frac{\log |\mathcal{I}_2|}{\log |\mathbb{Z}_6|} = R_1 \frac{\log 3}{\log 6} > H(X_1) - H(Y_{\mathbb{Z}_6/\mathcal{I}_2}) = 2.40869 - 0.97095 = 1.43774, \quad \text{and}$$
$$R_1 \frac{\log |\mathbb{Z}_6|}{\log |\mathbb{Z}_6|} = R_1 > H(X_1) - H(Y_{\mathbb{Z}_6/\mathbb{Z}_6}) = H(X_1) = 2.40869.$$
In other words,
$$\mathcal{R} = \bigl\{ R_1 \in \mathbb{R} \;\big|\; R_1 > \max\{2.24685, 2.34485, 2.40869\} \bigr\} = \bigl\{ R_1 \in \mathbb{R} \;\big|\; R_1 > 2.40869 = H(X_1) \bigr\}$$
is achievable with linear coding over the ring $\mathbb{Z}_6$. Obviously, $\mathcal{R}$ is exactly the region $\mathcal{R}[X_1]$. Optimality is claimed.
Additionally, we would like to point out that some of the inequalities defining (22) are not active in specific scenarios. Two classes of such scenarios are discussed in the following theorems. The first, Theorem 3, concerns scenarios where the rings considered are product rings, while the second, Theorem 4, concerns lower triangular matrix rings (readers may similarly consider usual matrix rings, which are often non-commutative, if interested).
Theorem 3.
Suppose $\mathcal{R}_i$ ($1 \leq i \leq s$) is a (finite) product ring $\prod_{l=1}^{k_i} \mathcal{R}_{l,i}$ of finite rings $\mathcal{R}_{l,i}$, and the sample space $\mathcal{X}_i$ satisfies $|\mathcal{X}_i| \leq |\mathcal{R}_{l,i}|$ for all feasible $i$ and $l$. Given injections $\Phi_{l,i} : \mathcal{X}_i \to \mathcal{R}_{l,i}$, let
$$\Phi = (\Phi_1, \Phi_2, \ldots, \Phi_s),$$
where $\Phi_i = \prod_{l=1}^{k_i} \Phi_{l,i}$ is defined as
$$\Phi_i : x_i \mapsto \bigl( \Phi_{1,i}(x_i), \Phi_{2,i}(x_i), \ldots, \Phi_{k_i,i}(x_i) \bigr) \in \mathcal{R}_i, \quad \forall\, x_i \in \mathcal{X}_i.$$
We have that
$$\mathcal{R}_{\Phi,\mathrm{prod}} = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, \mathcal{I}_i = \prod_{l=1}^{k_i} \mathcal{I}_{l,i} \ \text{with}\ 0 \neq \mathcal{I}_{l,i} \leq_l \mathcal{R}_{l,i} \Bigr\},$$
where $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi(X_T) + \mathcal{I}_T$, is achievable with linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$. Moreover, $\mathcal{R}_\Phi \subseteq \mathcal{R}_{\Phi,\mathrm{prod}}$.
Proof. 
The proof is found in Section 4. ☐
Let $\mathcal{R}$ be a finite ring and
$$M_{L,\mathcal{R},m} = \left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_1, a_2, \ldots, a_m \in \mathcal{R} \right\},$$
where $m$ is a positive integer. It is easy to verify that $M_{L,\mathcal{R},m}$ is a ring with respect to matrix operations. Moreover, $\mathcal{I}$ is a left ideal of $M_{L,\mathcal{R},m}$ if and only if
$$\mathcal{I} = \left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_j \in \mathcal{I}_j \leq_l \mathcal{R},\ 1 \leq j \leq m;\ \mathcal{I}_j \subseteq \mathcal{I}_{j+1},\ 1 \leq j < m \right\}.$$
Let $O(M_{L,\mathcal{R},m})$ be the set of all left ideals of the form
$$\left\{ \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & a_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_m & a_{m-1} & \cdots & a_1 \end{bmatrix} \;\middle|\; a_j \in \mathcal{I}_j \leq_l \mathcal{R},\ 1 \leq j \leq m;\ \mathcal{I}_j \subseteq \mathcal{I}_{j+1},\ 1 \leq j < m;\ \mathcal{I}_i = 0 \ \text{for some}\ 1 \leq i \leq m \right\}.$$
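That $M_{L,\mathcal{R},m}$ is closed under matrix multiplication can be checked exhaustively for a small case. A sketch for $\mathcal{R} = \mathbb{Z}_2$ and $m = 2$ (our illustration; each element $\bigl[\begin{smallmatrix} a & 0 \\ b & a \end{smallmatrix}\bigr]$ is determined by the pair $(a, b)$):

```python
from itertools import product

def mat(a, b):
    """Element of M_{L,Z_2,2}: the lower triangular matrix [[a, 0], [b, a]]."""
    return ((a, 0), (b, a))

def mat_mul(X, Y, q=2):
    """Matrix product with entries reduced modulo q."""
    n = len(X)
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % q
                       for j in range(n)) for i in range(n))

ring = {mat(a, b) for a, b in product(range(2), repeat=2)}
# Closure: the product of two such matrices again has the form [[a, 0], [b, a]].
assert all(mat_mul(X, Y) in ring for X in ring for Y in ring)
```

Indeed, $\bigl[\begin{smallmatrix} a & 0 \\ b & a \end{smallmatrix}\bigr] \bigl[\begin{smallmatrix} a' & 0 \\ b' & a' \end{smallmatrix}\bigr] = \bigl[\begin{smallmatrix} aa' & 0 \\ ba' + ab' & aa' \end{smallmatrix}\bigr]$, which has the same banded shape.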
Theorem 4.
Let $\mathcal{R}_i$ ($1 \leq i \leq s$) be a finite ring such that $|\mathcal{X}_i| \leq |\mathcal{R}_i|$. For any injections $\Phi_i : \mathcal{X}_i \to \mathcal{R}_i$, let
$$\Phi' = (\Phi'_1, \Phi'_2, \ldots, \Phi'_s),$$
where $\Phi'_i : \mathcal{X}_i \to M_{L,\mathcal{R}_i,m_i}$ is defined from $\Phi_i$ as
$$\Phi'_i : x_i \mapsto \begin{bmatrix} \Phi_i(x_i) & 0 & \cdots & 0 \\ \Phi_i(x_i) & \Phi_i(x_i) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \Phi_i(x_i) & \Phi_i(x_i) & \cdots & \Phi_i(x_i) \end{bmatrix}, \quad \forall\, x_i \in \mathcal{X}_i.$$
We have that
$$\mathcal{R}_{\Phi,m} = \Bigl\{ (R_1, R_2, \ldots, R_s) \in \mathbb{R}^s \;\Big|\; \sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > H(X_T \mid Y_{\mathcal{R}_T/\mathcal{I}_T}, X_{T^c}),\ \forall\, \emptyset \neq T \subseteq S,\ \forall\, \mathcal{I}_i \leq_l M_{L,\mathcal{R}_i,m_i} \ \text{and}\ \mathcal{I}_i \notin O(M_{L,\mathcal{R}_i,m_i}) \Bigr\},$$
where $Y_{\mathcal{R}_T/\mathcal{I}_T} = \Phi'(X_T) + \mathcal{I}_T$, is achievable with linear coding over $M_{L,\mathcal{R}_1,m_1}, M_{L,\mathcal{R}_2,m_2}, \ldots, M_{L,\mathcal{R}_s,m_s}$. Moreover, $\mathcal{R}_\Phi \subseteq \mathcal{R}_{\Phi,m}$.
Proof. 
The proof is found in Section 4. ☐
Remark 9.
The difference between (22), (29) and (35) lies in their restrictions defining I i ’s, respectively, as highlighted in the proofs given in Section 4.
Remark 10.
Without much effort, one can see that $\mathcal{R}_\Phi$ ($\mathcal{R}_{\Phi,\mathrm{prod}}$ and $\mathcal{R}_{\Phi,m}$, respectively) in Theorem 2 (Theorems 3 and 4, respectively) depends on $\Phi$ via the random variables $Y_{\mathcal{R}_T/\mathcal{I}_T}$, whose distributions are determined by $\Phi$. For each $i \in S$, there exist $\frac{|\mathcal{R}_i|!}{(|\mathcal{R}_i| - |\mathcal{X}_i|)!}$ distinct injections from $\mathcal{X}_i$ to a ring $\mathcal{R}_i$ of order at least $|\mathcal{X}_i|$. Let $\mathrm{cov}(\mathcal{A})$ be the convex hull of a set $\mathcal{A} \subseteq \mathbb{R}^s$. By a straightforward time sharing argument, we have that
$$\mathcal{R}_l = \mathrm{cov}\Bigl( \bigcup_{\Phi \in M_{\mathcal{X}_S, \mathcal{R}_S}} \mathcal{R}_\Phi \Bigr)$$
is achievable with linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$.
Remark 11.
From Theorem 5, one will see that (22) and (36) coincide when all the rings are fields; in fact, both are identical to the SW region. However, (36) can be strictly larger than (22) (see Section 5) when not all the rings are fields. This implies that, in order to achieve a desired rate, a suitable injection is required. However, bear in mind that taking the convex hull in (36) is not always needed for optimality, as shown in Example 1. A more detailed elaboration on this issue is found in Section 5.
The rest of this section provides key supporting lemmata and concepts used to prove Theorems 2–4. The final proofs are presented in Section 4.
Lemma 2.
Let $\mathbf{x}, \mathbf{y} \in \mathcal{R}^n$ be two distinct sequences, where $\mathcal{R}$ is a finite ring, and assume that $\mathbf{y} - \mathbf{x} = (a_1, a_2, \ldots, a_n)^t$. If $f : \mathcal{R}^n \to \mathcal{R}^k$ is a linear mapping chosen uniformly at random, i.e., the $k \times n$ coefficient matrix $A$ of $f$ is generated by independently choosing each entry of $A$ from $\mathcal{R}$ uniformly at random, then
$$\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = |\mathcal{I}|^{-k},$$
where $\mathcal{I} = \langle a_1, a_2, \ldots, a_n \rangle_l$.
Proof. 
Let $f = (f_1, f_2, \ldots, f_k)^t$, where $f_i : \mathcal{R}^n \to \mathcal{R}$ is a random linear function. Then
$$\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = \Pr\Bigl\{ \bigcap_{i=1}^{k} \{ f_i(\mathbf{x}) = f_i(\mathbf{y}) \} \Bigr\} = \prod_{i=1}^{k} \Pr\{ f_i(\mathbf{x} - \mathbf{y}) = 0 \},$$
since the $f_i$'s are independent of each other. The statement follows from Lemma 1, which ensures that $\Pr\{ f_i(\mathbf{x} - \mathbf{y}) = 0 \} = |\mathcal{I}|^{-1}$. ☐
Remark 12.
In Lemma 2, if $\mathcal{R}$ is a field and $\mathbf{x} \neq \mathbf{y}$, then $\mathcal{I} = \mathcal{R}$ because every non-zero $a_i$ is a unit. Thus, $\Pr\{ f(\mathbf{x}) = f(\mathbf{y}) \} = |\mathcal{R}|^{-k}$.
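The collision probability of Lemma 2 can be verified by exact enumeration over all coefficient rows. A sketch for $\mathcal{R} = \mathbb{Z}_6$, $n = 2$, $k = 1$ (our illustration; `diff` plays the role of $\mathbf{y} - \mathbf{x}$):

```python
from itertools import product
from fractions import Fraction

q, n = 6, 2

def ideal(gens):
    """Ideal of Z_6 generated by `gens` (left = right, Z_6 is commutative)."""
    return {sum(b * a for b, a in zip(bs, gens)) % q
            for bs in product(range(q), repeat=len(gens))}

def collision_prob(diff):
    """Exact Pr{f(x) = f(y)} over a uniformly random 1-by-n coefficient row,
    where diff = y - x componentwise in Z_6; f(x) = f(y) iff f(diff) = 0."""
    hits = sum(1 for row in product(range(q), repeat=n)
               if sum(a * d for a, d in zip(row, diff)) % q == 0)
    return Fraction(hits, q ** n)

# Lemma 2 with k = 1: the collision probability is exactly 1 / |<diff>|.
for diff in product(range(q), repeat=n):
    if any(diff):  # x != y
        assert collision_prob(diff) == Fraction(1, len(ideal(diff)))
```

For $k > 1$ the rows of the coefficient matrix are independent, so the probability is the $k$th power of the $k = 1$ value, as in the proof above.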
Definition 7
([18]). Let $X \sim p_X$ be a discrete random variable with sample space $\mathcal{X}$. The set $T_\epsilon(n, X)$ of strongly $\epsilon$-typical sequences of length $n$ with respect to $X$ is defined to be
$$\Bigl\{ \mathbf{x} \in \mathcal{X}^n \;\Big|\; \Bigl| \frac{N(x; \mathbf{x})}{n} - p_X(x) \Bigr| \leq \epsilon,\ \forall\, x \in \mathcal{X} \Bigr\},$$
where $N(x; \mathbf{x})$ is the number of occurrences of $x$ in the sequence $\mathbf{x}$.
The notation T ϵ ( n , X ) is sometimes replaced by T ϵ when the length n and the random variable X referred to are clear from the context.
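Definition 7 translates into a direct membership test on empirical frequencies; a sketch (our illustration, with a toy binary alphabet):

```python
from collections import Counter

def is_strongly_typical(seq, p, eps):
    """Check seq in T_eps(n, X): |N(x; seq)/n - p(x)| <= eps for every x."""
    n = len(seq)
    counts = Counter(seq)
    return all(abs(counts.get(x, 0) / n - px) <= eps for x, px in p.items())

p = {'a': 0.5, 'b': 0.5}
assert is_strongly_typical('abab', p, 0.1)       # empirical frequencies match p
assert not is_strongly_typical('aaaa', p, 0.1)   # frequency of 'a' is off by 0.5
```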
We now conclude this section with the following lemma, which is a crucial ingredient in our proofs of the achievability theorems. It generalizes the classic conditional typicality lemma ([19], Theorem 15.2.2), and at the same time it is what distinguishes our argument from the one for the field version.
Lemma 3.
Let $(X_1, X_2) \sim p$ be a jointly distributed random variable whose sample space is a finite ring $\mathcal{R} = \mathcal{R}_1 \times \mathcal{R}_2$. For any $\eta > 0$, there exists $\epsilon > 0$ such that, $\forall\, (\mathbf{x}_1, \mathbf{x}_2)^t \in T_\epsilon(n, (X_1, X_2))$ and $\forall\, \mathcal{I} \leq_l \mathcal{R}_1$,
$$\bigl| D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) \bigr| < 2^{n \bigl[ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta \bigr]},$$
where
$$D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) = \bigl\{ (\mathbf{y}, \mathbf{x}_2)^t \in T_\epsilon \;\big|\; \mathbf{y} - \mathbf{x}_1 \in \mathcal{I}^n \bigr\}$$
and $Y_{\mathcal{R}_1/\mathcal{I}} = X_1 + \mathcal{I}$ is a random variable with sample space $\mathcal{R}_1/\mathcal{I}$.
Proof. 
Define the mapping $\Gamma : \mathcal{R}_1 \to \mathcal{R}_1/\mathcal{I}$ by
$$\Gamma : x_1 \mapsto x_1 + \mathcal{I}, \quad \forall\, x_1 \in \mathcal{R}_1.$$
Assume that $\mathbf{x}_1 = \bigl[ x_1(1), x_1(2), \ldots, x_1(n) \bigr]$, and let
$$\bar{\mathbf{y}} = \bigl[ \Gamma(x_1(1)), \Gamma(x_1(2)), \ldots, \Gamma(x_1(n)) \bigr].$$
By definition, $\forall\, (\mathbf{y}, \mathbf{x}_2)^t \in D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2)$, where $\mathbf{y} = \bigl[ y(1), y(2), \ldots, y(n) \bigr]$,
$$\bigl[ \Gamma(y(1)), \Gamma(y(2)), \ldots, \Gamma(y(n)) \bigr] = \bar{\mathbf{y}}.$$
Moreover,
$$(\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon(n, (X_1, Y_{\mathcal{R}_1/\mathcal{I}}, X_2)), \quad \text{and}$$
$$D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) = \bigl\{ (\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon \;\big|\; \mathbf{y} - \mathbf{x}_1 \in \mathcal{I}^n \bigr\}.$$
For fixed $(\bar{\mathbf{y}}, \mathbf{x}_2)^t \in T_\epsilon$, the number of strongly $\epsilon$-typical sequences $\mathbf{y}$ such that $(\mathbf{y}, \bar{\mathbf{y}}, \mathbf{x}_2)^t$ is strongly $\epsilon$-typical is strictly upper bounded by $2^{n [ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta ]}$ if $n$ is large enough and $\epsilon$ is small. Therefore,
$$\bigl| D_\epsilon(\mathbf{x}_1, \mathcal{I} \mid \mathbf{x}_2) \bigr| < 2^{n \bigl[ H(X_1 \mid Y_{\mathcal{R}_1/\mathcal{I}}, X_2) + \eta \bigr]}. \qquad \square$$
Remark 13.
We acknowledge an anonymous reviewer of our paper [20] for suggesting the proof of Lemma 3 given above. Our original proof was presented as a special case of a more general result in [21]. The techniques behind the two proofs are quite different; however, the full generality of our original proof is better appreciated in non-i.i.d. scenarios, as in [21].
Remark 14.
Assume that $\mathbf{y} - \mathbf{x} = (a_1, a_2, \ldots, a_n)^t$; then $\mathbf{y} - \mathbf{x} \in \mathcal{I}^n$ is equivalent to $\langle a_1, a_2, \ldots, a_n \rangle_l \subseteq \mathcal{I}$.

4. Proof of the Achievability Theorems

4.1. Proof of Theorem 2

As mentioned, $\mathcal{X}_i$ can be seen as a subset of $\mathcal{R}_i$ for a fixed $\Phi = (\Phi_1, \ldots, \Phi_s)$. In this section, we assume that $X_i$ has sample space $\mathcal{R}_i$, which is legitimate since $\Phi_i$ is injective.
Let $R = (R_1, R_2, \ldots, R_s)$ and $k_i = \bigl\lceil \frac{n R_i}{\log |\mathcal{R}_i|} \bigr\rceil$, $\forall\, i \in S$, where $n$ is the length of the data sequences. If $R \in \mathcal{R}_\Phi$, then $\sum_{i \in T} R_i \frac{\log |\mathcal{I}_i|}{\log |\mathcal{R}_i|} > r_{T, \mathcal{I}_T}$ (this implies that $\frac{1}{n} \sum_{i \in T} k_i \log |\mathcal{I}_i| - r_{T, \mathcal{I}_T} > 2\eta$ for some small constant $\eta > 0$ and large enough $n$), $\forall\, \emptyset \neq T \subseteq S$, $\forall\, 0 \neq \mathcal{I}_i \leq_l \mathcal{R}_i$. We claim that $R$ is achievable by linear coding over $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_s$.
Encoding:
For every $i \in S$, randomly generate a $k_i \times n$ matrix $A_i$ based on the uniform distribution, i.e., independently choose each entry of $A_i$ uniformly at random from $\mathcal{R}_i$. Define a linear encoder $\phi_i : \mathcal{R}_i^n \to \mathcal{R}_i^{k_i}$ such that
$$\phi_i : \mathbf{x} \mapsto A_i \mathbf{x}, \quad \forall\, \mathbf{x} \in \mathcal{R}_i^n.$$
Obviously, the coding rate of this encoder is
$$\frac{1}{n} \log \bigl| \phi_i(\mathcal{R}_i^n) \bigr| \leq \frac{1}{n} \log |\mathcal{R}_i|^{k_i} = \frac{\log |\mathcal{R}_i|}{n} \left\lceil \frac{n R_i}{\log |\mathcal{R}_i|} \right\rceil \approx R_i.$$
Decoding:
Upon observing $\mathbf{y}_i \in \mathcal{R}_i^{k_i}$ ($i \in S$) from the $i$th encoder, the decoder claims that $\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_s)^t \in \prod_{i=1}^{s} \mathcal{R}_i^n$ is the array of encoded data sequences if and only if:
  • $\mathbf{x} \in T_\epsilon$; and
  • $\forall\, \mathbf{x}' = (\mathbf{x}'_1, \mathbf{x}'_2, \ldots, \mathbf{x}'_s)^t \in T_\epsilon$, if $\mathbf{x}' \neq \mathbf{x}$, then $\phi_j(\mathbf{x}'_j) \neq \mathbf{y}_j$ for some $j$.
Error:
Assume that $\mathbf{X}_i = \mathbf{x}_i \in \mathcal{R}_i^n$ ($i \in S$) is the original data sequence generated by the $i$th source. It is readily seen that an error occurs if and only if one of the following events occurs:
E1: $\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_s)^t \notin T_\epsilon$;
E2: There exists $\mathbf{x}' = (\mathbf{x}'_1, \mathbf{x}'_2, \ldots, \mathbf{x}'_s)^t \in T_\epsilon$ with $\mathbf{x}' \neq \mathbf{x}$, such that $\phi_i(\mathbf{x}'_i) = \phi_i(\mathbf{x}_i)$, $\forall\, i \in S$.
Error Probability:
By the joint asymptotic equipartition principle (AEP) ([18], Theorem 6.9), Pr{E_1} → 0 as n → ∞.
Additionally, for T ⊆ S, let
D_ϵ(x; T) = { (x′_1, x′_2, …, x′_s)^t ∈ T_ϵ | x′_i ≠ x_i, ∀ i ∈ T and x′_i = x_i, ∀ i ∈ T^c }.
We have
D_ϵ(x; T) ⊆ ∪_{ 0 ≠ I_i ≤_l R_i, i ∈ T } D_ϵ( x_T, I_T | x_{T^c} ) \ { x },    (51)
where x_T = (x_i)_{i ∈ T} and x_{T^c} = (x_i)_{i ∈ T^c}, since I_i runs over all possible non-trivial left ideals. Consequently,
Pr{ E_2 | E_1^c } ≤ Σ_{ x′ = (x′_1, …, x′_s)^t ∈ T_ϵ \ {x} } Π_{i∈S} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }
= Σ_{T⊆S} Σ_{ x′ ∈ D_ϵ(x; T) } Π_{i∈T} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }    (52)
≤ Σ_{T⊆S} Σ_{ 0 ≠ I_i ≤_l R_i, i ∈ T } Σ_{ x′ ∈ D_ϵ(x_T, I_T | x_{T^c}) \ {x} } Π_{i∈T} Pr{ ϕ_i(x′_i) = ϕ_i(x_i) | E_1^c }    (53)
< Σ_{T⊆S} Σ_{ 0 ≠ I_i ≤_l R_i, i ∈ T } 2^{ n [ r(T, I_T) + η ] } Π_{i∈T} |I_i|^{−k_i}    (54)
< (2^s − 1)(2^{|R_S|} − 1) × max_{ T ⊆ S, 0 ≠ I_i ≤_l R_i, i ∈ T } 2^{ −n [ (1/n) Σ_{i∈T} k_i log|I_i| − r(T, I_T) − η ] },    (55)
where
  • (52) is from the fact that T_ϵ \ {x} = ∪_{T⊆S} D_ϵ(x; T) (a disjoint union);
  • (53) follows from (51) by the union bound (Boole's inequality);
  • (54) is from Lemmas 2 and 3, as well as the fact that every left ideal of R_T is a Cartesian product of left ideals I_i of R_i, i ∈ T (see Proposition 3); at the same time, ϵ is required to be sufficiently small;
  • (55) is due to the facts that the number of non-empty subsets of S is 2^s − 1 and the number of non-trivial left ideals of the finite ring R_T is less than 2^{|R_S|} − 1, which is the number of non-empty subsets of R_S ⊇ R_T.
Thus, Pr{ E_2 | E_1^c } → 0 as n → ∞, from (55), since for sufficiently large n and small ϵ, (1/n) Σ_{i∈T} k_i log|I_i| − r(T, I_T) − η > η > 0.
Therefore, Pr{ E_1 ∪ E_2 } ≤ Pr{ E_1 } + Pr{ E_2 | E_1^c } → 0 as ϵ → 0 and n → ∞.

4.2. Proof of Theorem 3

The proof follows almost the same steps as that of Theorem 2, except that the performance analysis only focuses on sequences (a_{i,1}, a_{i,2}, …, a_{i,n}) ∈ R_i^n (1 ≤ i ≤ s) such that
a_{i,j} = ( Φ_{1,i}(x_i(j)), Φ_{2,i}(x_i(j)), …, Φ_{k_i,i}(x_i(j)) ) ∈ Π_{l=1}^{k_i} R_{l,i}
for some x_i(j) ∈ X_i. Let X_i, Y_i be any two such sequences satisfying X_i − Y_i ∈ I_i^n for some I_i ≤_l R_i. Based on the special structure of X_i and Y_i, it is easy to verify that I_i ≠ 0 implies I_i = Π_{l=1}^{k_i} I_{l,i} with 0 ≠ I_{l,i} ≤_l R_{l,i} for all 1 ≤ l ≤ k_i. (This causes the difference between (22) and (29).) In addition, it is obvious that R_Φ ⊆ R_{Φ,prod} by their definitions.

4.3. Proof of Theorem 4

The proof is similar to that of Theorem 2, except that it only focuses on sequences (a_{i,1}, a_{i,2}, …, a_{i,n}) ∈ M_{L,R_i,m_i}^n (1 ≤ i ≤ s) such that a_{i,j} ∈ M_{L,R_i,m_i} satisfies [a_{i,j}]_{u,v} = a if u ≥ v and [a_{i,j}]_{u,v} = 0 otherwise, for some a ∈ R_i. Let X_i, Y_i be any two such sequences such that X_i − Y_i ∈ I_i^n for some I_i ≤_l M_{L,R_i,m_i}. It is easily seen that I_i ≠ 0 if and only if I_i ∈ O(M_{L,R_i,m_i}). (This causes the difference between (22) and (35).) In addition, it is obvious that R_Φ ⊆ R_{Φ,m} by their definitions.

5. Optimality

Obviously, Theorem 2 specializes to its field counterpart if all rings considered are fields, as summarized in the following theorem.
Theorem 5.
Region (22) is the SW region if R_i contains no proper non-trivial left ideal or, equivalently, if R_i is a field, for all i ∈ S. As a consequence, region (36) is the SW region.
Proof. 
In Theorem 2, the random variable Y_{R_T/I_T} admits a sample space of cardinality 1 for all T ⊆ S, since the only non-trivial left ideal of R_i is R_i itself for all feasible i. Thus, 0 = H(Y_{R_T/I_T}) ≥ H(Y_{R_T/I_T} | X_{T^c}) ≥ 0. Consequently,
R_Φ = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i > H(X_T | X_{T^c}), ∀ T ⊆ S },
which is the SW region R[X_1, X_2, …, X_s]. Therefore, region (36) is also the SW region.
If R_i is a field, then obviously it has no proper non-trivial left (right) ideal. Conversely, assume that R_i has no proper non-trivial left ideal and let 0 ≠ a ∈ R_i. Then the left ideal generated by a equals R_i, which implies that there exists 0 ≠ b ∈ R_i such that ba = 1. Similarly, there exists 0 ≠ c ∈ R_i such that cb = 1. Moreover, c = c · 1 = c(ba) = (cb)a = 1 · a = a. Hence, ab = cb = 1, i.e., b is a two-sided inverse of a. Thus R_i is a division ring, and by Wedderburn's little theorem, R_i is a field. ☐
One important question to address is whether linear coding over finite non-field rings can be equally optimal for data compression. Hereby, we claim that, for any SW scenario, there always exist linear encoders over some finite non-field rings which achieve the data compression limit. Therefore, optimality of linear coding over finite non-field rings for data compression is established in the sense of existence.

5.1. Existence Theorem I: Single Source

For any single source scenario, the assertion that there always exists a finite ring R_1 such that R_l is in fact the SW region
R[X_1] = { R_1 ∈ ℝ | R_1 > H(X_1) },
is equivalent to the existence of a finite ring R_1 and an injection Φ_1: X_1 → R_1 such that
max_{ 0 ≠ I_1 ≤_l R_1 } ( log|R_1| / log|I_1| ) [ H(X_1) − H(Y_{R_1/I_1}) ] = H(X_1),    (59)
where Y_{R_1/I_1} = Φ_1(X_1) + I_1.
Theorem 6.
Let R_1 be a finite ring of order |R_1| ≥ |X_1|. If R_1 contains one and only one proper non-trivial left ideal I_0, and |I_0| = √|R_1|, then region (36) coincides with the SW region, i.e., there exists an injection Φ_1: X_1 → R_1 such that (59) holds.
Remark 15.
Examples of such a non-field ring R_1 in the above theorem include
M_{L,p} = { [ x 0 ; y x ] | x, y ∈ Z_p }
(the 2 × 2 matrices with rows (x, 0) and (y, x); M_{L,p} is a ring with respect to matrix addition and multiplication) and Z_{p²}, where p is any prime. For any single source scenario, one can always choose R_1 to be either M_{L,p} or Z_{p²}. Consequently, optimality is attained.
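Both structural claims can be checked by brute force for small p. The sketch below (our own illustration, with p = 2) enumerates all left ideals of Z_4 and of M_{L,2} — the latter encoded as pairs (x, y) standing for the matrix with rows (x, 0) and (y, x) — and confirms that each ring has exactly one proper non-trivial left ideal, of order √|R| = 2:

```python
from itertools import combinations, product

def left_ideals(elems, add, mul):
    """All left ideals of a finite ring, given its elements and operations."""
    zero = next(e for e in elems if all(add(e, f) == f for f in elems))
    rest = [e for e in elems if e != zero]
    ideals = []
    for size in range(len(rest) + 1):
        for comb in combinations(rest, size):
            I = {zero, *comb}
            # closed under addition and under left multiplication by the ring
            if all(add(a, b) in I for a in I for b in I) and \
               all(mul(r, a) in I for r in elems for a in I):
                ideals.append(frozenset(I))
    return ideals

# Z_4: unique proper non-trivial ideal {0, 2}
z4_ideals = left_ideals(list(range(4)),
                        lambda a, b: (a + b) % 4, lambda a, b: (a * b) % 4)
proper = [I for I in z4_ideals if 1 < len(I) < 4]
assert proper == [frozenset({0, 2})] and len(proper[0]) ** 2 == 4

# M_{L,2}: element (x, y) = matrix [[x, 0], [y, x]] over Z_2
ml2 = list(product(range(2), repeat=2))
add = lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)
mul = lambda a, b: ((a[0] * b[0]) % 2, (a[1] * b[0] + a[0] * b[1]) % 2)
proper_ml2 = [I for I in left_ideals(ml2, add, mul) if 1 < len(I) < 4]
assert len(proper_ml2) == 1 and len(next(iter(proper_ml2))) ** 2 == 4
```

The multiplication rule for M_{L,2} is read off from the matrix product [[x,0],[y,x]]·[[x′,0],[y′,x′]] = [[xx′,0],[yx′+xy′,xx′]].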
Proof of Theorem 6.
Notice that the random variable Y_{R_1/I_0} depends on the injection Φ_1, and so does its entropy H(Y_{R_1/I_0}). Obviously, H(Y_{R_1/R_1}) = 0, since the sample space of the random variable Y_{R_1/R_1} contains only one element. Therefore,
( log|R_1| / log|R_1| ) [ H(X_1) − H(Y_{R_1/R_1}) ] = H(X_1).
Consequently, (59) is equivalent to
( log|R_1| / log|I_0| ) [ H(X_1) − H(Y_{R_1/I_0}) ] ≤ H(X_1) ⟺ H(X_1) ≤ 2 H(Y_{R_1/I_0}),    (62)
since |I_0| = √|R_1|. By Lemma A1, there exists an injection Φ̃_1: X_1 → R_1 such that (62) holds when Φ_1 = Φ̃_1. The statement follows. ☐
Up to isomorphism, there are exactly 4 distinct rings of order p² for a given prime p: the three non-field rings Z_p × Z_p, M_{L,p} and Z_{p²}, together with the field F_{p²}. It has been proved that, using linear encoders over any of the last three, optimality can always be achieved in the single source scenario. Actually, the same holds true for all multiple source scenarios.

5.2. Existence Theorem II: Multiple Sources

Theorem 7.
Let R_1, R_2, …, R_s be s finite rings with |R_i| ≥ |X_i|. If R_i is isomorphic to either
1. 
a field, i.e., R i contains no proper non-trivial left (right) ideal; or
2. 
a ring containing one and only one proper non-trivial left ideal I_{0i} with |I_{0i}| = √|R_i|,
for all feasible i, then (36) coincides with the SW region R [ X 1 , X 2 , , X s ] .
Remark 16.
It is obvious that Theorem 7 includes Theorem 6 as a special case. In fact, its proof resembles that of Theorem 6. Examples of the R_i's include all finite fields, M_{L,p} and Z_{p²}, where p is a prime. However, Theorem 7 does not guarantee that all rates, except the vertices, in the polytope of the SW region are “directly” achievable in the multiple source case. A time sharing scheme is required in our current proof. Nevertheless, all rates are “directly” achievable if the R_i's are fields or if s = 1. This is partially the reason that the two theorems are stated separately.
Remark 17.
Theorem 7 also includes Theorem 5 as a special case. However, Theorem 5 admits a simpler proof compared to the one for Theorem 7.
Proof of Theorem 7.
It suffices to prove that, for any R = (R_1, R_2, …, R_s) ∈ ℝ^s satisfying
R_i > H(X_i | X_{i−1}, X_{i−2}, …, X_1), ∀ 1 ≤ i ≤ s,
we have R ∈ R_Φ for some set of injections Φ = (Φ_1, Φ_2, …, Φ_s), where Φ_i: X_i → R_i. Let Φ̃ = (Φ̃_1, Φ̃_2, …, Φ̃_s) be the set of injections, where, if
(i)
R i is a field, Φ ˜ i is any injection;
(ii)
R i satisfies 2, Φ ˜ i is the injection such that
H(X_i | X_{i−1}, X_{i−2}, …, X_1) ≤ 2 H(Y_{R_i/I_{0i}} | X_{i−1}, X_{i−2}, …, X_1),
when Φ i = Φ ˜ i . The existence of Φ ˜ i is guaranteed by Lemma A1.
If Φ = Φ ˜ , then
( log|I_i| / log|R_i| ) H(X_i | X_{i−1}, X_{i−2}, …, X_1) ≥ H(X_i | X_{i−1}, X_{i−2}, …, X_1) − H(Y_{R_i/I_i} | X_{i−1}, X_{i−2}, …, X_1)
= H(X_i | Y_{R_i/I_i}, X_{i−1}, X_{i−2}, …, X_1),
for all 1 ≤ i ≤ s and 0 ≠ I_i ≤_l R_i. As a consequence,
Σ_{i∈T} R_i log|I_i| / log|R_i| > Σ_{i∈T} ( log|I_i| / log|R_i| ) H(X_i | X_{i−1}, X_{i−2}, …, X_1)
≥ Σ_{i∈T} H(X_i | Y_{R_i/I_i}, X_{i−1}, X_{i−2}, …, X_1)
≥ Σ_{i∈T} H(X_i | Y_{R_T/I_T}, X_{T^c}, X_{i−1}, X_{i−2}, …, X_1)
≥ H( X_T | Y_{R_T/I_T}, X_{T^c} )
= H( X_T | X_{T^c} ) − H( Y_{R_T/I_T} | X_{T^c} ),
for all T ⊆ {1, 2, …, s}. Thus, R ∈ R_Φ̃. ☐
By Theorems 5–7, we draw the conclusion that
Corollary 1.
For any SW scenario, there always exists a sequence of linear encoders over some finite rings (fields or non-field rings) which achieves the data compression limit, the SW region.
In fact, LCoR can be optimal even for rings beyond those stated in the above theorems (see Example 1). We classify some of these scenarios in the remaining parts of this section.

5.3. Product Rings

Theorem 8.
Let R_{l,1}, R_{l,2}, …, R_{l,s} (l = 1, 2) be two sets of finite rings, the rings within each set being of equal size, and let R_i = R_{1,i} × R_{2,i} for all feasible i. If the coding rate R ∈ ℝ^s is achievable with linear encoders over R_{l,1}, R_{l,2}, …, R_{l,s} for both l = 1 and l = 2, then R is achievable with linear encoders over R_1, R_2, …, R_s.
Proof. 
By definition, R is a convex combination of coding rates which are achieved by different linear encoding schemes over R_{l,1}, R_{l,2}, …, R_{l,s} (l = 1, 2), respectively. To be more precise, there exist R_1, R_2, …, R_m ∈ ℝ^s and positive numbers w_1, w_2, …, w_m with Σ_{j=1}^m w_j = 1, such that R = Σ_{j=1}^m w_j R_j. Moreover, there exist injections Φ_l = (Φ_{l,1}, Φ_{l,2}, …, Φ_{l,s}) (l = 1, 2), where Φ_{l,i}: X_i → R_{l,i}, such that
R_j ∈ R_{Φ_l} = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i log|I_{l,i}| / log|R_{l,i}| > H(X_T | X_{T^c}) − H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}), ∀ T ⊆ S, ∀ 0 ≠ I_{l,i} ≤_l R_{l,i} },    (72)
where R_{l,T} = Π_{i∈T} R_{l,i}, I_{l,T} = Π_{i∈T} I_{l,i} and Y_{R_{l,T}/I_{l,T}} = Φ_l(X_T) + I_{l,T} is a random variable with sample space R_{l,T}/I_{l,T}. To show that R is achievable with linear encoders over R_1, R_2, …, R_s, it suffices to prove that R_j is achievable with linear encoders over R_1, R_2, …, R_s for all feasible j. Let R_j = (R_{j,1}, R_{j,2}, …, R_{j,s}). For all T ⊆ S and 0 ≠ I_i = I_{1,i} × I_{2,i} ≤_l R_i with 0 ≠ I_{l,i} ≤_l R_{l,i} (l = 1, 2), we have
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| = ( Σ_{i∈T} R_{j,i} log|I_{1,i}| / log|R_{1,i}| ) c_1 / (c_1 + c_2) + ( Σ_{i∈T} R_{j,i} log|I_{2,i}| / log|R_{2,i}| ) c_2 / (c_1 + c_2),
where c_l = log|R_{l,1}|. By (72), it can easily be seen that
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| > H(X_T | X_{T^c}) − ( 1 / (c_1 + c_2) ) Σ_{l=1}^2 c_l H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}).
Meanwhile, let R_T = Π_{i∈T} R_i, I_T = Π_{i∈T} I_i, Φ = (Φ_{1,1} × Φ_{2,1}, Φ_{1,2} × Φ_{2,2}, …, Φ_{1,s} × Φ_{2,s}) (note:
Φ_{1,i} × Φ_{2,i}: x_i ↦ ( Φ_{1,i}(x_i), Φ_{2,i}(x_i) ) ∈ R_i
for all x_i ∈ X_i) and Y_{R_T/I_T} = Φ(X_T) + I_T. It can be verified that Y_{R_{l,T}/I_{l,T}} (l = 1, 2) is a function of Y_{R_T/I_T}; hence, H(Y_{R_T/I_T} | X_{T^c}) ≥ H(Y_{R_{l,T}/I_{l,T}} | X_{T^c}). Consequently,
Σ_{i∈T} R_{j,i} log|I_i| / log|R_i| > H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}),
which implies that R_j ∈ R_{Φ,prod} by Theorem 3. We therefore conclude that R_j is achievable with linear encoders over R_1, R_2, …, R_s for all feasible j, and so is R. ☐
Obviously, R 1 , R 2 , , R s in Theorem 8 are of the same size. Inductively, one can verify the following without any difficulty.
Theorem 9.
Let L be any finite index set, and let R_{l,1}, R_{l,2}, …, R_{l,s} (l ∈ L) be sets of finite rings, the rings within each set being of equal size, with R_i = Π_{l∈L} R_{l,i} for all feasible i. If the coding rate R ∈ ℝ^s is achievable with linear encoders over R_{l,1}, R_{l,2}, …, R_{l,s} for every l ∈ L, then R is achievable with linear encoders over R_1, R_2, …, R_s.
Remark 18.
There are delicate issues in the situation Theorem 9 (Theorem 8) illustrates. Let X_i (1 ≤ i ≤ s) be the set of all symbols generated by the ith source. The hypothesis of Theorem 9 (Theorem 8) implicitly imposes the alphabet constraint |X_i| ≤ |R_{l,i}| for all feasible i and l.
Let R_1, R_2, …, R_s be s finite rings, each of which is isomorphic to either
  • a ring R containing one and only one proper non-trivial left ideal, whose order is √|R|, e.g., M_{L,p} and Z_{p²} (p a prime); or
  • a finite product of finite field(s) and/or ring(s) satisfying the previous condition, e.g., M_{L,p} × Π_{j=1}^m Z_{p_j} (p and the p_j's are prime) and Π_{i=1}^m M_{L,p_i} × Π_{j=1}^{m′} F_{q_j} (m and m′ are non-negative integers, the p_i's are prime and the q_j's are powers of primes).
Theorems 7 and 9 ensure that linear encoders over the rings R_1, R_2, …, R_s are always optimal in any applicable (subject to the condition specified in the corresponding theorem) SW coding scenario. As a very special case, Z_p × Z_p, where p is a prime, is always optimal in any (single source or multiple source) scenario with alphabet size less than or equal to p. However, using a field or a product ring is not necessary: as shown in Theorem 6, neither M_{L,p} nor Z_{p²} is (isomorphic to) a product of rings or a field. Nor is a restriction on the alphabet size always required (see Theorem 7), even for product rings (see Example 1 for a case of Z_2 × Z_3).

5.4. Trivial Case: Uniform Distributions

The following theorem is trivial, however we include it for completeness.
Theorem 10.
Regardless of which set of rings R_1, R_2, …, R_s is chosen, as long as |R_i| = |X_i| for all feasible i, region (22) is the SW region if (X_1, X_2, …, X_s) ∼ p is uniformly distributed.
Proof. 
If p is uniform, then, for any T ⊆ S and 0 ≠ I_T ≤_l R_T, Y_{R_T/I_T} is uniformly distributed on R_T/I_T. Moreover, X_T and X_{T^c} are independent, and so are Y_{R_T/I_T} and X_{T^c}. Therefore, H(X_T | X_{T^c}) = H(X_T) = log|R_T| and H(Y_{R_T/I_T} | X_{T^c}) = H(Y_{R_T/I_T}) = log( |R_T| / |I_T| ). Consequently,
r(T, I_T) = H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}) = log|I_T|.
Region (22) is the SW region. ☐
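For a single uniform source, these identities are easy to check numerically. An illustrative sketch (our own, over R = Z_4 with ideal I = {0, 2}):

```python
from math import log2

# X uniform on R = Z_4; ideal I = {0, 2}; Y = X + I lives on the cosets.
R, I = [0, 1, 2, 3], [0, 2]
pX = {x: 1 / len(R) for x in R}

def H(p):
    """Shannon entropy (bits) of a pmf given as a dict of probabilities."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

# Distribution of the coset variable Y = X + I.
pY = {}
for x, px in pX.items():
    coset = frozenset((x + a) % 4 for a in I)
    pY[coset] = pY.get(coset, 0) + px

r = H(pX) - H(pY)   # r(T, I_T) in the single-source case

assert abs(H(pX) - log2(len(R))) < 1e-12          # H(X)  = log|R|
assert abs(H(pY) - log2(len(R) / len(I))) < 1e-12  # H(Y)  = log(|R|/|I|)
assert abs(r - log2(len(I))) < 1e-12               # r     = log|I|
```

The same computation goes through verbatim for any finite ring and any of its ideals, with cosets replacing the explicit set {0, 2}.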
Remark 19.
When p is uniform, it is obvious that the uncoded strategy (all encoders are one-to-one mappings) is optimal in the SW source coding problem. However, optimality stated in Theorem 10 does not come from deliberately fixing the linear encoding mappings, but generating them randomly.
So far, we have only shown that there exist linear encoders over finite non-field rings that are as good as their field counterparts. In the next section, Problem 1 is considered for an arbitrary g. It will be demonstrated that linear coding over finite non-field rings can strictly outperform its field counterpart for encoding some discrete functions, and that there are infinitely many such functions.

6. Application: Source Coding for Computing

The problem of source coding for computing, Problem 1, with an arbitrary g is addressed in this section. Some advantages of LCoR (compared to LCoF) will be demonstrated. We begin by establishing the following theorem, which can be recognized as a generalization of Körner–Marton [4].
Theorem 11.
Let R be a finite ring, and
ĝ = h ∘ k, where k(x_1, x_2, …, x_s) = Σ_{i=1}^s k_i(x_i),    (78)
and h, k_i's are functions mapping R to R. Then
R_ĝ = { (r, r, …, r) ∈ ℝ^s | r > max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] } ⊆ R[ĝ],    (79)
where X = k(X_1, X_2, …, X_s) and Y_{R/I} = X + I.
Proof. 
By Theorem 2, ∀ ϵ > 0, there exist a large enough n, an m × n matrix A ∈ R^{m×n} and a decoder ψ such that Pr{ X^n ≠ ψ(A X^n) } < ϵ, if m > max_{ 0 ≠ I ≤_l R } n [ H(X) − H(Y_{R/I}) ] / log|I|. Let ϕ_i = A ∘ k_i (1 ≤ i ≤ s) be the encoder of the ith source. Upon receiving ϕ_i(X_i^n) from the ith source, the decoder claims that h(X̂^n), where X̂^n = ψ( Σ_{i=1}^s ϕ_i(X_i^n) ), is the value of the function ĝ subject to computation. The probability of decoding error is
Pr{ h(k(X_1^n, X_2^n, …, X_s^n)) ≠ h(X̂^n) } ≤ Pr{ X^n ≠ X̂^n }
= Pr{ X^n ≠ ψ( Σ_{i=1}^s ϕ_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( Σ_{i=1}^s A k_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( A Σ_{i=1}^s k_i(X_i^n) ) }
= Pr{ X^n ≠ ψ( A k(X_1^n, X_2^n, …, X_s^n) ) }
= Pr{ X^n ≠ ψ(A X^n) } < ϵ.
Therefore, every (r, r, …, r) ∈ ℝ^s with r = m log|R| / n > max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] is achievable, i.e., R_ĝ ⊆ R[ĝ]. ☐
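The only step that uses linearity is interchanging A with the sum of the inner functions k_i. A small numerical sketch over Z_4 (the matrix, block length and the particular k_i's are hypothetical choices; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
q, n, m, s = 4, 12, 6, 3                 # ring Z_4, block length, rows, sources

A = rng.integers(0, q, size=(m, n))      # common encoding matrix A
k = [lambda t: t % q,                    # illustrative inner functions k_i
     lambda t: (2 * t) % q,
     lambda t: (3 * t) % q]

xs = [rng.integers(0, q, size=n) for _ in range(s)]

# Each source transmits phi_i(x_i) = A k_i(x_i); the decoder adds the words.
received = sum(A @ k[i](xs[i]) for i in range(s)) % q

# Since A is linear over Z_4, this equals A applied to X = sum_i k_i(x_i).
X = sum(k[i](xs[i]) for i in range(s)) % q
assert np.array_equal(received, A @ X % q)
```

This identity is what lets the decoder recover X^n (and hence ĝ = h(X^n)) from the sum of the individually encoded streams, exactly as in the Körner–Marton scheme.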
Corollary 2.
In Theorem 11, let X = k(X_1, X_2, …, X_s) ∼ p_X. We have
R_ĝ = { (r, r, …, r) ∈ ℝ^s | r > H(X) } ⊆ R[ĝ],
if either of the following conditions holds:
1. 
R is isomorphic to a finite field;
2. 
R is isomorphic to a ring containing one and only one proper non-trivial left ideal I_0 with |I_0| = √|R|, and
H(X) ≤ 2 H(X + I_0).
Proof. 
If either (1) or (2) holds, then it is guaranteed that
max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X) − H(Y_{R/I}) ] = H(X)
in Theorem 11. The statement follows. ☐
Remark 20.
By Lemma A2, examples of non-field rings satisfying (2) in Corollary 2 include
(1) 
Z_4 with p_X(0) = p_1, p_X(1) = p_2, p_X(3) = p_3 and p_X(2) = p_4 satisfying
0 max { p 2 , p 3 } min { p 1 , p 4 } 1   a n d   0 max { p 1 , p 4 } min { p 2 , p 3 } 1 ,
(2) 
M L , 2 with
p_X( [ 0 0 ; 0 0 ] ) = p_1, p_X( [ 1 0 ; 0 1 ] ) = p_2, p_X( [ 1 0 ; 1 1 ] ) = p_3 and p_X( [ 0 0 ; 1 0 ] ) = p_4
satisfying (89), etc.
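As a numerical illustration of condition (2) of Corollary 2 over Z_4: the pmf below is our own choice (not taken from Lemma A2), picked so that the two cosets of I_0 = {0, 2} are equally likely; the inequality H(X) ≤ 2 H(X + I_0) then holds.

```python
from math import log2

def H(p):
    """Shannon entropy (bits) of a pmf given as a dict."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

# Hypothetical pmf on Z_4 with each coset of I_0 = {0, 2} carrying mass 1/2.
pX = {0: 0.45, 1: 0.25, 2: 0.05, 3: 0.25}

# Y = X + I_0 lives on the two cosets {0, 2} and {1, 3}.
pY = {'0+I0': pX[0] + pX[2], '1+I0': pX[1] + pX[3]}

# Condition (2) of Corollary 2: H(X) <= 2 H(X + I_0).
assert H(pX) <= 2 * H(pY) + 1e-12
```

With balanced cosets, H(X + I_0) = 1 bit, so the condition reduces to H(X) ≤ 2, which always holds on Z_4.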
Interested readers can figure out even more explicit examples deduced from Lemma A1.
Remark 21.
If R is isomorphic to Z_2 and ĝ is the modulo-two sum, then Corollary 2 recovers the theorem of Körner–Marton [4]; if R is (isomorphic to) a field, it becomes a special case of ([7], Theorem III.1). Actually, almost all the results in [6,7] can be reproved in the setting of rings in a parallel fashion.
We claim that there are functions g for which LCoR outperforms LCoF; in fact, there are infinitely many such g's. To prove this, some definitions are required for the mechanics of our argument.
Definition 8.
Let g_1: Π_{i=1}^s X_i → Ω_1 and g_2: Π_{i=1}^s Y_i → Ω_2 be two functions. If there exist bijections μ_i: X_i → Y_i, 1 ≤ i ≤ s, and ν: Ω_1 → Ω_2, such that
g_1(x_1, x_2, …, x_s) = ν^{−1}( g_2( μ_1(x_1), μ_2(x_2), …, μ_s(x_s) ) ),
then g_1 and g_2 are said to be equivalent (via μ_1, μ_2, …, μ_s and ν).
Definition 9.
Given a function g: D → Ω and a subset S ⊆ D, the restriction of g to S is defined to be the function g|_S: S → Ω such that g|_S: x ↦ g(x), ∀ x ∈ S.
Lemma 4.
Let X_1, X_2, …, X_k and Ω be finite sets. For any discrete function g: Π_{i=1}^k X_i → Ω, there always exist a finite ring (field) R and a polynomial function ĝ ∈ R[k] such that
ν( g(x_1, x_2, …, x_k) ) = ĝ( μ_1(x_1), μ_2(x_2), …, μ_k(x_k) )
for some injections μ_i: X_i → R (1 ≤ i ≤ k) and ν: Ω → R.
Proof. 
There are several possible proofs of this lemma. One is provided in Appendix B. ☐
Remark 22.
Up to equivalence, a function can be presented in many different formats. For example, the function min{x, y} defined on {0, 1} × {0, 1} (with ordering 0 ≤ 1) can either be seen as F_1(x, y) = xy on Z_2^2 or be treated as the restriction of F_2(x, y) = x + y − (x + y)^2, defined on Z_3^2, to the domain {0, 1} × {0, 1} ⊆ Z_3^2.
Lemma 4 implies that any discrete function defined on a finite domain is equivalent to a restriction of some polynomial function over some finite ring (field). As a consequence, we can restrict Problem 1 to all polynomial functions. This polynomial approach offers valuable insight into the general problem, because the algebraic structure of a polynomial function is clearer than that of an arbitrary function. We often call g ^ in Lemma 4 a polynomial presentation of g. On the other hand, the g ^ given by (78) is named a nomographic function over R (by terminology borrowed from [22]), it is said to be a nomographic presentation of g if g is equivalent to a restriction of it.
Lemma 5.
Let X_1, X_2, …, X_s and Ω be finite sets. For any discrete function g: Π_{i=1}^s X_i → Ω, there exists a nomographic function ĝ over some finite ring (field) R such that
ν( g(x_1, x_2, …, x_s) ) = ĝ( μ_1(x_1), μ_2(x_2), …, μ_s(x_s) )
for some injections μ_i: X_i → R (1 ≤ i ≤ s) and ν: Ω → R.
Proof. 
There are several proofs of this lemma. One is provided in Appendix B. ☐
Lemma 5 advances Lemma 4 by claiming that a discrete function with a finite domain is always equivalent to a restriction of some nomographic function. From this, it is seen that Theorem 11 and Corollary 2 have presented a universal solution to Problem 1.
Given some finite ring R , let g ^ of format (78) be a nomographic presentation of g. We say that the region R g ^ given by (79) is achievable for computing g in the sense of Körner–Marton. From Theorem 13 given later, we know that R g ^ might not be the largest achievable region one can obtain for computing g. However, R g ^ still captures the ability of linear coding over R when used for computing g. In other words, R g ^ is the region purely achieved with linear coding over R for computing g. On the other hand, regions from Theorem 13 are achieved by combining the linear coding and the standard random coding techniques. Therefore, it is reasonable to compare LCoR with LCoF in the sense of Körner–Marton.
We show that linear coding over finite rings, non-field rings in particular, strictly outperforms its field counterpart, LCoF, in the following example.
Example 2
([23]). Let g: {α_0, α_1}^3 → {β_0, β_1, β_2, β_3} (Figure 1) be the function such that
g: (α_0, α_0, α_0) ↦ β_0; g: (α_0, α_0, α_1) ↦ β_3; g: (α_0, α_1, α_0) ↦ β_2; g: (α_0, α_1, α_1) ↦ β_1; g: (α_1, α_0, α_0) ↦ β_1; g: (α_1, α_0, α_1) ↦ β_0; g: (α_1, α_1, α_0) ↦ β_3; g: (α_1, α_1, α_1) ↦ β_2.    (94)
Define μ: {α_0, α_1} → Z_4 and ν: {β_0, β_1, β_2, β_3} → Z_4 by
μ: α_j ↦ j, j ∈ {0, 1}, and ν: β_j ↦ j, j ∈ {0, 1, 2, 3},    (95)
respectively. Obviously, g is equivalent to x + 2y + 3z ∈ Z_4[3] (Figure 2) via μ_1 = μ_2 = μ_3 = μ and ν. However, by Proposition 4, there exists no ĝ ∈ F_4[3] of format (78) such that g is equivalent to any restriction of ĝ. Although Lemma 5 ensures that there always exists a bigger field F_q such that g admits a presentation ĝ ∈ F_q[3] of format (78), the size q must be strictly bigger than 4. For instance, let
ĥ(x) = Σ_{a ∈ Z_5} a [ 1 − (x − a)^4 ] − [ 1 − (x − 4)^4 ] ∈ Z_5[1].
Then, g has presentation ĥ(x + 2y + 4z) ∈ Z_5[3] (Figure 3) via μ_1 = μ_2 = μ_3 = μ: {α_0, α_1} → Z_5 and ν: {β_0, β_1, β_2, β_3} → Z_5 defined (symbol-wise) by (95).
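Both presentations can be verified mechanically. The sketch below encodes the table (94) with α_j ↦ j and β_j ↦ j and checks the Z_4 presentation x + 2y + 3z and the Z_5 presentation ĥ(x + 2y + 4z) on all eight inputs:

```python
# g from (94), with alpha_j -> j on inputs and beta_j -> j on outputs.
g = {(0, 0, 0): 0, (0, 0, 1): 3, (0, 1, 0): 2, (0, 1, 1): 1,
     (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 3, (1, 1, 1): 2}

def h_hat(x):
    """h_hat(x) = sum_a a [1 - (x - a)^4] - [1 - (x - 4)^4] over Z_5."""
    return (sum(a * (1 - pow(x - a, 4, 5)) for a in range(5))
            - (1 - pow(x - 4, 4, 5))) % 5

for (x, y, z), v in g.items():
    assert (x + 2 * y + 3 * z) % 4 == v            # Z_4 presentation
    assert h_hat((x + 2 * y + 4 * z) % 5) == v     # Z_5 presentation
```

By Fermat's little theorem, (x − a)^4 = 1 in Z_5 unless x = a, so ĥ acts as the identity on {0, 1, 2, 3} and maps 4 to 3, which is exactly what the Z_5 presentation needs.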
Proposition 4.
There exists no polynomial function g ^ F 4 [ 3 ] of format (78), such that a restriction of g ^ is equivalent to the function g defined by (94).
Proof. 
Suppose ν ∘ g = ĝ(μ_1, μ_2, μ_3), where μ_1, μ_2, μ_3: {α_0, α_1} → F_4 and ν: {β_0, …, β_3} → F_4 are injections, and ĝ = h ∘ (k_1 + k_2 + k_3) with h, k_i ∈ F_4[1] for all feasible i. We claim that ĝ and h are both surjective, since |g({α_0, α_1}^3)| = |{β_0, β_1, β_2, β_3}| = 4 = |F_4|. In particular, h is bijective. Therefore, h^{−1} ∘ ν ∘ g = k_1 ∘ μ_1 + k_2 ∘ μ_2 + k_3 ∘ μ_3, i.e., g admits a presentation k_1(x) + k_2(y) + k_3(z) ∈ F_4[3]. A contradiction to Lemma A3. ☐
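The non-existence claim can also be confirmed by exhaustive search. Addition in F_4 is componentwise over F_2, i.e., XOR on a 2-bit encoding of the elements, so the search below runs over all values of the k_i's on the two input symbols and all injective ν (a brute-force check of the consequence of Lemma A3, not the paper's proof):

```python
from itertools import product, permutations

# g from (94), with alpha_j -> j and beta_j -> j.
g = {(0, 0, 0): 0, (0, 0, 1): 3, (0, 1, 0): 2, (0, 1, 1): 1,
     (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 3, (1, 1, 1): 2}

# Search for injective nu and k_i with nu(g(x,y,z)) = k_1(x) + k_2(y) + k_3(z)
# in F_4, whose addition is XOR on the 2-bit encoding {0, 1, 2, 3}.
found = False
for k_vals in product(range(4), repeat=6):   # (k_i(0), k_i(1)), i = 1, 2, 3
    k = [k_vals[0:2], k_vals[2:4], k_vals[4:6]]
    for nu in permutations(range(4)):        # injective nu: 4 symbols -> F_4
        if all(nu[v] == k[0][x] ^ k[1][y] ^ k[2][z]
               for (x, y, z), v in g.items()):
            found = True
assert not found   # no additive F_4 presentation of g exists
```

A quick sanity argument for the outcome: any XOR-decomposable function satisfies f(0,0,0) ⊕ f(0,1,1) ⊕ f(1,0,1) ⊕ f(1,1,0) = 0, while the corresponding g-values are β_0, β_1, β_0, β_3, whose ν-images XOR to ν(β_1) ⊕ ν(β_3) ≠ 0 for any injection ν.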
As a consequence of Proposition 4, in the sense of Körner–Marton, in order to use LCoF to encode the function g, the alphabet sizes of the three encoders must be at least 5. However, LCoR offers a solution in which the alphabet sizes are 4, strictly smaller than with LCoF. Most importantly, the region achieved with linear coding over any finite field F_q is always a subset of the one achieved with linear coding over Z_4. This is proved in the following proposition.
Proposition 5.
Let g be the function defined by (94), let {α_0, α_1}^3 be the sample space of (X_1, X_2, X_3) ∼ p, and let p_X be the distribution of X = g(X_1, X_2, X_3). If
p_X(β_0) = p_1, p_X(β_1) = p_2, p_X(β_3) = p_3 and p_X(β_2) = p_4
satisfy (89), then, in the sense of Körner–Marton, the region R_1 achieved with linear coding over Z_4 contains the region R_2 obtained with linear coding over any finite field F_q for computing g. Moreover, if supp(p) is the whole domain of g, then R_1 ⊋ R_2.
Proof. 
Let ĝ = h ∘ k ∈ F_q[3] be a polynomial presentation of g with format (78). By Corollary 2 and Remark 20, we have
R_1 = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 3X_3) },
R_2 = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(k(X_1, X_2, X_3)) }.
Assume that ν ∘ g = h ∘ k ∘ (μ_1, μ_2, μ_3), where μ_1, μ_2, μ_3: {α_0, α_1} → F_q and ν: {β_0, …, β_3} → F_q are injections. Obviously, g(X_1, X_2, X_3) is a function of k(X_1, X_2, X_3). Hence,
H(k(X_1, X_2, X_3)) ≥ H(g(X_1, X_2, X_3)).    (100)
On the other hand, H(X_1 + 2X_2 + 3X_3) = H(g(X_1, X_2, X_3)). Therefore,
H(k(X_1, X_2, X_3)) ≥ H(X_1 + 2X_2 + 3X_3),    (101)
and R_1 ⊇ R_2. In addition, we claim that h|_S, where S = k( Π_{j=1}^3 μ_j({α_0, α_1}) ), is not injective. Otherwise, h: S → S′, where S′ = h(S), is bijective; hence, h|_S^{−1} ∘ ν ∘ g = k ∘ (μ_1, μ_2, μ_3) = k_1 ∘ μ_1 + k_2 ∘ μ_2 + k_3 ∘ μ_3. A contradiction to Lemma A3. Consequently, |S| > |S′| = |ν({β_0, …, β_3})| = 4. If supp(p) = {α_0, α_1}^3, then (100) as well as (101) hold strictly; thus, R_1 ⊋ R_2. ☐
A more intuitive comparison (though not as conclusive as Proposition 5) can be drawn from the presentations of g given in Figure 2 and Figure 3. According to Corollary 2, linear encoders over the field Z_5 achieve
R_{Z_5} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 4X_3) }.
The region achieved by linear encoders over the ring Z_4 is
R_{Z_4} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > H(X_1 + 2X_2 + 3X_3) }.
Clearly, H(X_1 + 2X_2 + 3X_3) ≤ H(X_1 + 2X_2 + 4X_3); thus, R_{Z_4} contains R_{Z_5}. Furthermore, as long as
0 < Pr{ (α_0, α_0, α_1) }, Pr{ (α_1, α_1, α_0) } < 1,
R_{Z_4} is strictly larger than R_{Z_5}, since then H(X_1 + 2X_2 + 3X_3) < H(X_1 + 2X_2 + 4X_3). To be specific, assume that (X_1, X_2, X_3) ∼ p satisfies Table 1; we have
R[X_1, X_2, X_3] ⊊ R_{Z_5} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > 0.4812 }
⊊ R_{Z_4} = { (R_1, R_2, R_3) ∈ ℝ^3 | R_i > 0.4590 }.
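Table 1 is not reproduced here, but the entropy gap is already visible for three i.i.d. uniform binary sources (a hypothetical distribution used purely for illustration): X_1 + 2X_2 + 3X_3 is uniform on Z_4, while the Z_5 sum separates the two inputs (α_0, α_0, α_1) and (α_1, α_1, α_0) that g itself does not distinguish.

```python
from math import log2
from itertools import product

def entropy(counts):
    """Entropy (bits) of an empirical distribution given as value -> count."""
    tot = sum(counts.values())
    return -sum(c / tot * log2(c / tot) for c in counts.values())

# Three i.i.d. uniform bits: every triple has probability 1/8.
c4, c5 = {}, {}
for x1, x2, x3 in product(range(2), repeat=3):
    s4 = (x1 + 2 * x2 + 3 * x3) % 4     # Z_4 presentation
    s5 = (x1 + 2 * x2 + 4 * x3) % 5     # Z_5 presentation
    c4[s4] = c4.get(s4, 0) + 1
    c5[s5] = c5.get(s5, 0) + 1

H4, H5 = entropy(c4), entropy(c5)
assert abs(H4 - 2.0) < 1e-12   # uniform on Z_4
assert H4 < H5                 # the Z_5 sum takes 5 values, two redundantly
```

Under this uniform law H(X_1 + 2X_2 + 3X_3) = 2 bits while H(X_1 + 2X_2 + 4X_3) = 2.25 bits, so R_{Z_4} is strictly larger.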
Based on Propositions 4 and 5, we conclude that LCoR dominates LCoF, in terms of achieving better coding rates with smaller alphabet sizes of the encoders for computing g. As a direct conclusion, we have:
Theorem 12.
In the sense of Körner–Marton, LCoF is not optimal.
Remark 23.
The key property underlying the proof of Proposition 5 is that the characteristic of a finite field must be a prime, while the characteristic of a finite ring can be any integer greater than or equal to 2. This implies that it is possible to construct infinitely many discrete functions for which LCoF always leads to a suboptimal achievable region compared to linear coding over finite non-field rings. Examples include Σ_{i=1}^s x_i ∈ Z_{2p}[s] for s ≥ 2 and prime p > 2 (note: the characteristic of Z_{2p} is 2p, which is not a prime). One can always find an explicit distribution of the sources for which linear coding over Z_{2p} strictly dominates linear coding over each and every finite field.
As mentioned, R g ^ given by (79) is sometimes strictly smaller than R [ g ] . This was first shown by Ahlswede–Han [5] for the case of g being the modulo-two sum. Their approach combines the linear coding technique over a binary field with the standard random coding technique. In the following, we generalize the result of Ahlswede–Han ([5], Theorem 10) to the settings, where g is arbitrary, and, at the same time, LCoF is replaced by its generalized version, LCoR.
Consider functions ĝ admitting
ĝ(x_1, x_2, …, x_s) = h( k_0(x_1, x_2, …, x_{s_0}), Σ_{j=s_0+1}^s k_j(x_j) ), 0 ≤ s_0 < s,    (107)
where k_0: R^{s_0} → R, h: R × R → R and the k_j's are functions mapping R to R. By Lemma 5, a discrete function with a finite domain is always equivalent to a restriction of some function of format (107). We call ĝ from (107) a pseudo nomographic function over the ring R.
Theorem 13.
Let S_0 = {1, 2, …, s_0} ⊆ S = {1, 2, …, s}. If ĝ is of format (107), and R = (R_1, R_2, …, R_s) ∈ ℝ^s satisfies
Σ_{j∈T} R_j > |T \ S_0| max_{ 0 ≠ I ≤_l R } ( log|R| / log|I| ) [ H(X | V_S) − H(Y_{R/I} | V_S) ] + I(Y_T; V_T | V_{T^c}), ∀ T ⊆ S,    (108)
where, ∀ j ∈ S_0, V_j = Y_j = X_j; ∀ j ∈ S \ S_0, Y_j = k_j(X_j) and the V_j's are discrete random variables such that
p(y_1, y_2, …, y_s, v_1, v_2, …, v_s) = p(y_1, y_2, …, y_s) Π_{j=s_0+1}^s p(v_j | y_j),    (109)
and X = Σ_{j=s_0+1}^s Y_j, Y_{R/I} = X + I, then R ∈ R[ĝ].
Proof. 
The proof can be completed by applying the tricks from Lemmas 2 and 3 to the approach generalized from Ahlswede–Han ([5], Theorem 10). Details are found in Appendix C. ☐
Remark 24.
The achievable region given by (108) always contains the SW region. Moreover, it is in general larger than the R g ^ from (79). If g ^ is the modulo-two sum, namely s 0 = 0 and h , k j ’s are identity functions for all s 0 < j s , then (108) resumes the region of Ahlswede–Han ([5], Theorem 10).

7. Conclusions

7.1. Right Linearity

Careful readers might have noticed that the encoders we used so far are actually left linear mappings. By symmetry, almost all related statements can be easily reproved for right linear mappings (encoders). As an example, the following corresponds to Theorem 2.
Theorem 14.
For any Φ ∈ M(X_S, R_S),
R_Φ = { (R_1, R_2, …, R_s) ∈ ℝ^s | Σ_{i∈T} R_i log|I_i| / log|R_i| > r(T, I_T), ∀ T ⊆ S, ∀ 0 ≠ I_i ≤_r R_i },    (110)
where r(T, I_T) = H(X_T | X_{T^c}) − H(Y_{R_T/I_T} | X_{T^c}) and Y_{R_T/I_T} = Φ(X_T) + I_T, is achievable with (right) linear coding over the finite rings R_1, R_2, …, R_s.
By time sharing,
R_r = cov( ∪_{Φ ∈ M(X_S, R_S)} R_Φ ),
where R_Φ is given by (110), is achievable with (right) LCoR.

7.2. Field, Ring, Rng and Group

Conceptually speaking, LCoR is in fact a generalization of the linear coding technique proposed by Elias [2] and Csiszár [3] (LCoF), since a field is always a ring. However, as seen in Section 4, analyzing the decoding error for the ring version is in general substantially more challenging than for the field version. Our approach crucially relies on the concept of ideals. A field contains no non-trivial ideal but itself; because of this special property of fields, our general argument for finite rings reduces to a simple one when only finite fields are considered.
Even though our analysis for the ring scenario is more complicated than that for finite fields, linear encoders working over some finite rings are in general considerably easier to implement in practice, because the implementation of finite field arithmetic can be quite demanding. Normally, a finite field is given by its polynomial representation, and operations are carried out as polynomial operations (addition and multiplication) followed by polynomial long division. In contrast, implementing the arithmetic of many finite rings is straightforward. For instance, the arithmetic of the ring of integers modulo q, Z_q, for any positive integer q, is simply integer arithmetic modulo q.
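To make the contrast concrete, here is a sketch (our own, using the standard representation F_4 = GF(2)[x]/(x² + x + 1) on a 2-bit encoding 1 ↦ 1, 2 ↦ x, 3 ↦ x + 1) of multiplication in Z_4 versus F_4:

```python
def f4_mul(a, b):
    """Multiply 2-bit encodings of F_4 elements (1 = 1, 2 = x, 3 = x + 1)."""
    r = 0
    for i in range(2):          # carry-less (GF(2)) polynomial product
        if (b >> i) & 1:
            r ^= a << i
    for i in (3, 2):            # reduce modulo x^2 + x + 1 (bit pattern 0b111)
        if (r >> i) & 1:
            r ^= 0b111 << (i - 2)
    return r

z4_mul = lambda a, b: (a * b) % 4   # ring Z_4: plain modular arithmetic

assert f4_mul(2, 2) == 3            # x * x = x + 1: no zero divisors in F_4
assert z4_mul(2, 2) == 0            # 2 * 2 = 0: Z_4 has zero divisors
```

The field multiplier needs the polynomial-reduction machinery even in this tiny case, whereas the ring multiplier is a single modulo operation; the last two lines also exhibit the structural difference (zero divisors) discussed above.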
In addition, it is also very interesting to consider instead linear coding over rngs. It will be even more intriguing should it turn out that the rng version outperforms the ring version in the computing problem (Problem 1), in the same manner that the ring version outperforms its field counterpart. It will also be interesting to see whether the idea of using rng provides more understanding of the problems from [6,8].
Some works, including [24,25,26], have proposed to implement coding over a simpler algebraic structure, that of a group. Seemingly, this corresponds to a more universal approach since both fields and rings are also groups. However, one subtle issue is often overlooked in this context. Namely, the set of rings (or rngs) is not a subset of the set of groups, since several non-isomorphic rings (or rngs) can be defined on one and the same group. For instance, given two distinct primes p and q, up to isomorphism,
  • there are 2 finite rngs of order $p$, while there is only one group of order $p$;
  • there are 4 finite rngs of order $pq$;
  • there are 11 finite rngs of order $p^2$ (if $p = 2$, then 4 of them are rings, namely $\mathbb{F}_4$, $\mathbb{Z}_4$, $\mathbb{Z}_2 \times \mathbb{Z}_2$ and $\mathrm{M}_{L,2}$ [27]), while there are only 2 groups of order $p^2$, both of which are Abelian;
  • there are 22 finite rngs of order $p^2 q$;
  • there are 52 finite rngs of order 8;
  • there are $3p + 50$ finite rngs of order $p^3$ ($p > 2$), while there are 5 groups of order $p^3$, 3 of which are Abelian.
Therefore, there is no one-to-one correspondence between rings (fields or rngs) and groups, in either direction. Furthermore, from the point of view of formulating a multivariate function, one is highly restricted when using groups, compared to rings (rngs or fields). Specifically, it is well-known that every discrete function defined on a finite domain is essentially a restriction of some polynomial function over a finite ring (rng or field). Although non-Abelian structures (non-Abelian groups) have the potential to lead to important non-trivial results [28], they are very difficult to handle both theoretically and in practice. The performance of non-Abelian group block codes can be quite bad [29].

7.3. Final Remarks

This paper establishes achievability theorems regarding linear coding over finite rings for Slepian–Wolf data compression. Our results include related work from Elias [2] and Csiszár [3] regarding linear coding over finite fields as special cases in the sense of characterizing the achievable region. We have also proved that, for any Slepian–Wolf scenario, there always exists a sequence of linear encoders over some finite rings (non-field rings in particular) that achieves the data compression limit, the Slepian–Wolf region. Thus, with regard to existence, the optimality issue of linear coding over finite non-field rings for data compression is confirmed positively.
In addition, we also address the problem of source coding for computing, Problem 1. Results of Körner–Marton [4], Ahlswede–Han ([5], Theorem 10) and [7] are generalized to corresponding ring versions. Based on these, it is demonstrated that LCoR dominates its field counterpart for encoding (infinitely) many discrete functions.

Appendix A. Supporting Lemmata

Lemma A1.
Let $R$ be a finite ring, $X$ and $Y$ be two correlated discrete random variables, and $\mathcal{X}$ be the sample space of $X$ with $|\mathcal{X}| \le |R|$. If $R$ contains one and only one proper non-trivial left ideal $I$ and $|I| = \sqrt{|R|}$, then there exists an injection $\tilde{\Phi}: \mathcal{X} \to R$ such that
$$H(X|Y) \le 2 H\left( \tilde{\Phi}(X) + I \,\middle|\, Y \right). \tag{A1}$$
Proof. 
Let
$$\tilde{\Phi} \in \arg\max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y),$$
where $\mathcal{M}$ is the set of all possible $\Phi$'s (the maximum can always be attained because $|\mathcal{M}| = \frac{|R|!}{(|R| - |\mathcal{X}|)!}$ is finite, but it is not uniquely attained by $\tilde{\Phi}$ in general). Assume that $\mathcal{Y}$ is the sample space (not necessarily finite) of $Y$. Let $q = |I|$, $I = \{r_1, r_2, \ldots, r_q\}$ and $R/I = \{a_1 + I, a_2 + I, \ldots, a_q + I\}$. We have that
$$H(X|Y) = -\sum_{y \in \mathcal{Y}} \sum_{i,j=1}^{q} p_{i,j,y} \log \frac{p_{i,j,y}}{p_y} \quad \text{and} \quad H\left( \tilde{\Phi}(X) + I \,\middle|\, Y \right) = -\sum_{y \in \mathcal{Y}} \sum_{i=1}^{q} p_{i,y} \log \frac{p_{i,y}}{p_y},$$
where
$$p_{i,j,y} = \Pr\left\{ \tilde{\Phi}(X) = a_i + r_j, Y = y \right\}, \quad p_y = \sum_{i,j=1}^{q} p_{i,j,y}, \quad p_{i,y} = \sum_{j=1}^{q} p_{i,j,y}.$$
(Note: $\Pr\{\tilde{\Phi}(X) = r\} = 0$ if $r \in R \setminus \tilde{\Phi}(\mathcal{X})$. In addition, every element in $R$ can be uniquely expressed as $a_i + r_j$.) Therefore, (A1) is equivalent to
$$-\sum_{y \in \mathcal{Y}} \sum_{i,j=1}^{q} p_{i,j,y} \log \frac{p_{i,j,y}}{p_y} \le -2 \sum_{y \in \mathcal{Y}} \sum_{i=1}^{q} p_{i,y} \log \frac{p_{i,y}}{p_y}$$
$$\iff \sum_{y \in \mathcal{Y}} p_y \sum_{i=1}^{q} \frac{p_{i,y}}{p_y} H\!\left( \frac{p_{i,1,y}}{p_{i,y}}, \frac{p_{i,2,y}}{p_{i,y}}, \ldots, \frac{p_{i,q,y}}{p_{i,y}} \right) \le \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \frac{p_{2,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right), \tag{A8}$$
where $H(v_1, v_2, \ldots, v_q) = -\sum_{j=1}^{q} v_j \log v_j$, by the grouping rule for entropy ([19], p. 49). Let
$$A = \sum_{y \in \mathcal{Y}} p_y H\!\left( \sum_{i=1}^{q} \frac{p_{i,1,y}}{p_y}, \sum_{i=1}^{q} \frac{p_{i,2,y}}{p_y}, \ldots, \sum_{i=1}^{q} \frac{p_{i,q,y}}{p_y} \right).$$
The concavity of the function $H$ implies that
$$\sum_{y \in \mathcal{Y}} p_y \sum_{i=1}^{q} \frac{p_{i,y}}{p_y} H\!\left( \frac{p_{i,1,y}}{p_{i,y}}, \frac{p_{i,2,y}}{p_{i,y}}, \ldots, \frac{p_{i,q,y}}{p_{i,y}} \right) \le A. \tag{A10}$$
At the same time,
$$\sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \frac{p_{2,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right) = \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y)$$
by the definition of $\tilde{\Phi}$. We now claim that
$$A \le \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y). \tag{A12}$$
Suppose otherwise, i.e., $A > \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right)$. Let $\Phi': \mathcal{X} \to R$ be defined as
$$\Phi': x \mapsto a_j + r_i \quad \text{if } \tilde{\Phi}(x) = a_i + r_j.$$
(Note: $\tilde{\Phi}(x)$ is an element of $R$. It can be uniquely presented as $a_i + r_j$ for some $i$ and $j$.) We have that
$$H(\Phi'(X) + I \mid Y) = \sum_{y \in \mathcal{Y}} p_y H\!\left( \sum_{i=1}^{q} \frac{p_{i,1,y}}{p_y}, \ldots, \sum_{i=1}^{q} \frac{p_{i,q,y}}{p_y} \right) = A > \sum_{y \in \mathcal{Y}} p_y H\!\left( \frac{p_{1,y}}{p_y}, \ldots, \frac{p_{q,y}}{p_y} \right) = \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y).$$
This is absurd, since $H(\Phi'(X) + I \mid Y) \le \max_{\Phi \in \mathcal{M}} H(\Phi(X) + I \mid Y)$ by definition. Therefore, (A8) is valid by (A10) and (A12), and so is (A1). ☐
Lemma A2.
If both
$$0 \le \min\{p_1, p_4\} \le \max\{p_2, p_3\} \le 1 \quad \text{and} \quad 0 \le \min\{p_2, p_3\} \le \max\{p_1, p_4\} \le 1$$
are valid, and $\sum_{j=1}^{4} p_j = 1$, then
$$-\sum_{j=1}^{4} p_j \log p_j \le -2\left[ (p_2 + p_3)\log(p_2 + p_3) + (p_1 + p_4)\log(p_1 + p_4) \right]. \tag{A17}$$
Proof [30].
Without loss of generality, we assume that $0 \le \max\{p_4, p_3\} \le \min\{p_2, p_1\} \le 1$, which implies that $p_1 + p_2 \ge 1/2$ and $|p_1 + p_4 - 1/2| \le p_1 + p_2 - 1/2$. Let $H_2(c) = -c \log c - (1-c)\log(1-c)$, $0 \le c \le 1$, be the binary entropy function. By the grouping rule for entropy ([19], p. 49), the left-hand side of (A17) equals
$$(p_1 + p_4)\left[ \frac{p_1}{p_1 + p_4} \log \frac{p_1 + p_4}{p_1} + \frac{p_4}{p_1 + p_4} \log \frac{p_1 + p_4}{p_4} \right] + (p_2 + p_3)\left[ \frac{p_2}{p_2 + p_3} \log \frac{p_2 + p_3}{p_2} + \frac{p_3}{p_2 + p_3} \log \frac{p_2 + p_3}{p_3} \right] - (p_2 + p_3)\log(p_2 + p_3) - (p_1 + p_4)\log(p_1 + p_4)$$
$$= \underbrace{(p_1 + p_4)\, H_2\!\left( \frac{p_1}{p_1 + p_4} \right) + (p_2 + p_3)\, H_2\!\left( \frac{p_2}{p_2 + p_3} \right)}_{A} + H_2(p_1 + p_4).$$
Since $H_2$ is a concave function and $\sum_{j=1}^{4} p_j = 1$, then
$$A \le H_2(p_1 + p_2).$$
Moreover, $|p_1 + p_4 - 1/2| \le p_1 + p_2 - 1/2$ guarantees that
$$H_2(p_1 + p_2) \le H_2(p_1 + p_4),$$
because $H_2(c) = H_2(1 - c)$, $0 \le c \le 1$, and $H_2(c') \le H_2(c)$ if $0 \le c' \le c \le 1/2$. Therefore, $A \le H_2(p_1 + p_4)$ and (A17) holds. ☐
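Lemma A2 can also be checked numerically. The sketch below is an illustration, not part of the proof: it samples random distributions $(p_1, p_2, p_3, p_4)$, discards those violating the hypotheses of the lemma, and verifies that $H(p_1, p_2, p_3, p_4) \le 2 H_2(p_1 + p_4)$.

```python
import math
import random

def H(probs):
    # Shannon entropy in bits; terms with p = 0 contribute 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

random.seed(1)
checked = 0
while checked < 1000:
    raw = [random.random() for _ in range(4)]
    s = sum(raw)
    p1, p2, p3, p4 = (r / s for r in raw)
    # Hypotheses of Lemma A2 (the upper bounds by 1 hold automatically).
    if not (min(p1, p4) <= max(p2, p3) and min(p2, p3) <= max(p1, p4)):
        continue
    lhs = H([p1, p2, p3, p4])
    rhs = 2 * H([p1 + p4, p2 + p3])   # = 2 * H_2(p1 + p4)
    assert lhs <= rhs + 1e-9
    checked += 1
```

Distributions such as $(0.45, 0.05, 0.05, 0.45)$, for which the inequality fails, are exactly those rejected by the hypothesis check.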
Lemma A3.
No matter which finite field $\mathbb{F}_q$ is chosen, $g$ given by (94) admits no presentation $k_1(x) + k_2(y) + k_3(z)$, where $k_i \in \mathbb{F}_q[1]$ for all feasible $i$.
Proof. 
Suppose otherwise, i.e., $k_1 \circ \mu_1 + k_2 \circ \mu_2 + k_3 \circ \mu_3 = \nu \circ g$ for some injections $\mu_1, \mu_2, \mu_3: \{\alpha_0, \alpha_1\} \to \mathbb{F}_q$ and $\nu: \{\beta_0, \ldots, \beta_3\} \to \mathbb{F}_q$. By (94), we have
$$\nu(\beta_1) = (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_1),$$
$$\nu(\beta_3) = (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_1),$$
$$\nu(\beta_1) - \nu(\beta_3) = \tau = -\tau \implies \tau + \tau = 0, \tag{A26}$$
where $\tau = k_2(\mu_2(\alpha_0)) - k_2(\mu_2(\alpha_1))$. Since $\mu_2$ is injective, (A26) implies that either $\tau = 0$ or $\mathrm{Char}(\mathbb{F}_q) = 2$ by Proposition 2. Note that $k_2(\mu_2(\alpha_0)) \ne k_2(\mu_2(\alpha_1))$, i.e., $\tau \ne 0$; otherwise $\nu(\beta_1) = \nu(\beta_3)$, which contradicts the assumption that $\nu$ is injective. Thus, $\mathrm{Char}(\mathbb{F}_q) = 2$. Let $\rho = (k_3 \circ \mu_3)(\alpha_0) - (k_3 \circ \mu_3)(\alpha_1)$. Obviously, $\rho \ne 0$ for the same reason that $\tau \ne 0$, and $\rho + \rho = 0$ since $\mathrm{Char}(\mathbb{F}_q) = 2$. Therefore,
$$\nu(\beta_0) = (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_0)$$
$$= (k_1 \circ \mu_1)(\alpha_0) + (k_2 \circ \mu_2)(\alpha_0) + (k_3 \circ \mu_3)(\alpha_1) + \rho$$
$$= \nu(\beta_3) + \rho$$
$$= (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_0) + \rho$$
$$= (k_1 \circ \mu_1)(\alpha_1) + (k_2 \circ \mu_2)(\alpha_1) + (k_3 \circ \mu_3)(\alpha_1) + \rho + \rho$$
$$= \nu(\beta_2) + 0 = \nu(\beta_2).$$
This contradicts the assumption that ν is injective. ☐
Remark A1.
As a special case, this lemma implies that no matter which finite field $\mathbb{F}_q$ is chosen, $g$ defined by (94) has no polynomial presentation that is linear over $\mathbb{F}_q$. In contrast, $g$ admits the presentation $x + 2y + 3z \in \mathbb{Z}_4[3]$, which is a linear function over $\mathbb{Z}_4$.
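This can be verified directly. Assuming the identification $\alpha_0 \mapsto 0$, $\alpha_1 \mapsto 1$, the constraints on $g$ used in the proof of Lemma A3 are indeed met by $x + 2y + 3z$ over $\mathbb{Z}_4$:

```python
def f(x, y, z):
    # The linear polynomial x + 2y + 3z over Z_4 (alpha_0 -> 0, alpha_1 -> 1).
    return (x + 2 * y + 3 * z) % 4

# Constraints read off from the proof of Lemma A3:
assert f(1, 0, 0) == f(0, 1, 1)   # both inputs map to beta_1
assert f(1, 1, 0) == f(0, 0, 1)   # both inputs map to beta_3
# beta_0, beta_1, beta_2, beta_3 remain distinguishable:
values = {f(0, 0, 0), f(1, 0, 0), f(1, 1, 1), f(1, 1, 0)}
assert len(values) == 4
```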

Appendix B. Proofs of Lemmas 4 and 5

Appendix B.1. Proof of Lemma 4

Let $p$ be a prime such that $p^m \ge \max\{|\Omega|, |\mathcal{X}_i| : 1 \le i \le k\}$ for some integer $m$, and choose $R$ to be a finite field of order $p^m$. By ([31], Lemma 7.40), the number of polynomial functions in $R[k]$ is $p^{m p^{mk}}$. Moreover, the number of distinct functions with domain $R^k$ and codomain $R$ is also $|R|^{|R|^k} = p^{m p^{mk}}$. Hence, any function $g: R^k \to R$ is a polynomial function.
In the meanwhile, any injections $\mu_i: \mathcal{X}_i \to R$ ($1 \le i \le k$) and $\nu: \Omega \to R$ give rise to a function
$$\hat{g} = \nu \circ g \circ \left( \mu_1^{-1}, \mu_2^{-1}, \ldots, \mu_k^{-1} \right): R^k \to R,$$
where $\mu_i^{-1}$ is the inverse mapping of $\mu_i: \mathcal{X}_i \to \mu_i(\mathcal{X}_i)$. Since $\hat{g}$ must be a polynomial function as shown above, the statement is established.
Remark A2.
Another proof involving Fermat’s little theorem can be found in [6].
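The counting argument can also be made constructive over a prime field: any function table $\mathbb{Z}_p \to \mathbb{Z}_p$ is realized explicitly by Lagrange interpolation. The univariate sketch below is an illustration only (the lemma's multivariate statement rests on the same counting), and the chosen function table is arbitrary.

```python
def poly_mul_linear(poly, b, p):
    # Multiply the polynomial `poly` (coefficient list, index = degree)
    # by (x - b) over Z_p.
    res = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        res[i + 1] = (res[i + 1] + c) % p
        res[i] = (res[i] - b * c) % p
    return res

def interpolate(table, p):
    # Coefficients of a polynomial over Z_p realizing the given function
    # table (Lagrange interpolation; p must be prime).
    coeffs = [0] * p
    for a in range(p):
        basis, denom = [1], 1
        for b in range(p):
            if b != a:
                basis = poly_mul_linear(basis, b, p)
                denom = (denom * (a - b)) % p
        scale = (table[a] * pow(denom, p - 2, p)) % p  # division mod p
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % p
    return coeffs

def evaluate(coeffs, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

p = 5
table = [3, 1, 4, 1, 0]          # an arbitrary function Z_5 -> Z_5
coeffs = interpolate(table, p)
assert all(evaluate(coeffs, x, p) == table[x] for x in range(p))
```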

Appendix B.2. Proof of Lemma 5

Let $\mathbb{F}$ be a finite field such that $|\mathbb{F}| \ge |\mathcal{X}_i|$ for all $1 \le i \le s$ and $|\mathbb{F}|^s \ge |\Omega|$, and let $R$ be the extension field of $\mathbb{F}$ of order $|\mathbb{F}|^s$ (one example of the pair $\mathbb{F}$ and $R$ is $\mathbb{Z}_p$, where $p$ is some prime, and its Galois extension of degree $s$). It is easily seen that $R$ is an $s$-dimensional vector space over $\mathbb{F}$. Hence, there exist $s$ vectors $v_1, v_2, \ldots, v_s \in R$ that are linearly independent. Let $\mu_i$ be an injection from $\mathcal{X}_i$ to the subspace generated by the vector $v_i$. It is easy to verify that $k = \sum_{i=1}^{s} \mu_i$, i.e., $k(x_1, \ldots, x_s) = \sum_{i=1}^{s} \mu_i(x_i)$, is injective since $v_1, v_2, \ldots, v_s$ are linearly independent. Let $k^{-1}$ be the inverse mapping of $k: \prod_{i=1}^{s} \mathcal{X}_i \to k\left( \prod_{i=1}^{s} \mathcal{X}_i \right)$ and let $\nu: \Omega \to R$ be any injection. By ([31], Lemma 7.40), there exists a polynomial function $h$ over $R$ such that $h = \nu \circ g \circ k^{-1}$. Let $\hat{g}(x_1, x_2, \ldots, x_s) = h\left( \sum_{i=1}^{s} x_i \right)$. The statement is proved.
Remark A3.
In the proof, $k$ is chosen to be injective because the statement covers the case where $g$ is an identity function. In general, $k$ need not be injective.

Appendix C. Proof of Theorem 13

Choose $\delta > 6\epsilon > 0$ such that $R_j = R_j' + R_j''$ for $j \in S$,
$$\sum_{j \in T} R_j' > I(Y_T; V_T \mid V_{T^c}) + 2|T|\delta, \quad \forall\, T \subseteq S,$$
and $R_j'' > r + 2\delta$ for all $j \in S \setminus S_0$, where
$$r = \max_{0 \ne I \le_l R} \frac{\log |R|}{\log |I|} \left[ H(X \mid V_S) - H(Y_{R/I} \mid V_S) \right].$$

Appendix C.3. Encoding:

Fix the joint distribution $p$ satisfying (109). For all $j \in S_0$, let $\mathcal{V}_{j,\epsilon} = T_\epsilon(n, X_j)$. For all $j \in S \setminus S_0$, randomly generate $2^{n[I(Y_j; V_j) + \delta]}$ strongly $\epsilon$-typical sequences according to the distribution $p_{V_j^n}$, and let $\mathcal{V}_{j,\epsilon}$ be the set of these generated sequences. Define the mapping $\phi_j': R^n \to \mathcal{V}_{j,\epsilon}$ as follows:
  • If $j \in S_0$, then, for all $x \in R^n$, $\phi_j'(x) = x$ if $x \in T_\epsilon$, and $\phi_j'(x) = x_0$ otherwise, where $x_0 \in \mathcal{V}_{j,\epsilon}$ is fixed.
  • If $j \in S \setminus S_0$, then, for every $x \in R^n$, let $L_x = \{ v \in \mathcal{V}_{j,\epsilon} \mid (k_j(x), v) \in T_\epsilon \}$. If $x \in T_\epsilon$ and $L_x \ne \emptyset$, then $\phi_j'(x)$ is set to be some element of $L_x$; otherwise, $\phi_j'(x)$ is some fixed $v_0 \in \mathcal{V}_{j,\epsilon}$.
Define the mapping $\eta_j: \mathcal{V}_{j,\epsilon} \to [1, 2^{n R_j'}]$ by choosing the value of $\eta_j(v)$ for each $v \in \mathcal{V}_{j,\epsilon}$ randomly according to a uniform distribution.
Let $k = \min_{j \in S \setminus S_0} \lfloor n R_j'' / \log |R| \rfloor$. When $n$ is big enough, we have $k > n[r + \delta] / \log |R|$. Randomly generate a $k \times n$ matrix $M \in R^{k \times n}$, and let $\theta_j: R^n \to R^k$ ($j \in S \setminus S_0$) be the function $\theta_j: x \mapsto M k_j(x)$, $x \in R^n$.
Define the encoder $\phi_j$ as follows:
$$\phi_j = \begin{cases} \eta_j \circ \phi_j', & j \in S_0; \\ (\eta_j \circ \phi_j', \theta_j), & \text{otherwise}. \end{cases}$$

Appendix C.4. Decoding:

Upon observing $a_1, a_2, \ldots, a_{s_0}, (a_{s_0+1}, b_{s_0+1}), \ldots, (a_s, b_s)$, the decoder claims that
$$h \circ k_0\left( \hat{V}_1^n, \hat{V}_2^n, \ldots, \hat{V}_{s_0}^n, \hat{X}^n \right)$$
is the function of the generated data if and only if there exists one and only one
$$\hat{V} = \left( \hat{V}_1^n, \hat{V}_2^n, \ldots, \hat{V}_s^n \right) \in \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon}$$
such that $a_j = \eta_j(\hat{V}_j^n)$ for all $j \in S$, and $\hat{X}^n$ is the only element of the set
$$L_{\hat{V}} = \left\{ x \in R^n \,\middle|\, (x, \hat{V}) \in T_\epsilon,\ M x = \sum_{j=s_0+1}^{s} b_j \right\}.$$

Appendix C.5. Error:

Assume that $X_j^n$ is the data generated by the $j$th source and let $X^n = \sum_{j=s_0+1}^{s} k_j(X_j^n)$. An error happens if and only if one of the following events happens.
E1:
$(X_1^n, X_2^n, \ldots, X_s^n, Y_1^n, Y_2^n, \ldots, Y_s^n, X^n) \notin T_\epsilon$;
E2:
There exists some $j_0 \in S \setminus S_0$ such that $L_{X_{j_0}^n} = \emptyset$;
E3:
$(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n, V) \notin T_\epsilon$, where $V = (V_1^n, V_2^n, \ldots, V_s^n)$ and $V_j^n = \phi_j'(X_j^n)$ for all $j \in S$;
E4:
There exists $V' = (v_1, v_2, \ldots, v_s) \in T_\epsilon \cap \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon}$, $V' \ne V$, such that $\eta_j(v_j) = \eta_j(V_j^n)$ for all $j \in S$;
E5:
$X^n \notin L_V$ or $|L_V| > 1$, i.e., there exists $X_0^n \in R^n$, $X_0^n \ne X^n$, such that $M X_0^n = M X^n$ and $(X_0^n, V) \in T_\epsilon$.
Let $\gamma = \Pr\left\{ \bigcup_{l=1}^{5} E_l \right\}$; then $\gamma \le \sum_{l=1}^{5} \Pr\{E_l \mid E_{l,c}\}$, where $E_{1,c}$ is the sure event and $E_{l,c} = \bigcap_{\tau=1}^{l-1} E_\tau^c$ for $1 < l \le 5$. In the following, we show that $\gamma \to 0$ as $n \to \infty$.
(a). By the joint AEP ([18], Theorem 6.9), $\Pr\{E_1\} \to 0$ as $n \to \infty$.
(b). Let $E_{2,j} = \{L_{X_j^n} = \emptyset\}$, $j \in S \setminus S_0$. Then
$$\Pr\{E_2 \mid E_{2,c}\} \le \sum_{j \in S \setminus S_0} \Pr\{E_{2,j} \mid E_{2,c}\}. \tag{A38}$$
For any $j \in S \setminus S_0$, because each sequence $v \in \mathcal{V}_{j,\epsilon}$ and $Y_j^n = k_j(X_j^n)$ are drawn independently, we have
$$\Pr\{(Y_j^n, v) \in T_\epsilon\} \ge (1 - \epsilon)\, 2^{-n[I(Y_j; V_j) + 3\epsilon]} = (1 - \epsilon)\, 2^{-n[I(Y_j; V_j) + \delta/2] + n(\delta/2 - 3\epsilon)} > 2^{-n[I(Y_j; V_j) + \delta/2]}$$
when $n$ is big enough (recall that $\delta > 6\epsilon$). Thus,
$$\Pr\{E_{2,j} \mid E_{2,c}\} = \Pr\{L_{X_j^n} = \emptyset \mid E_{2,c}\} = \prod_{v \in \mathcal{V}_{j,\epsilon}} \Pr\left\{ (k_j(X_j^n), v) \notin T_\epsilon \right\} < \left( 1 - 2^{-n[I(Y_j; V_j) + \delta/2]} \right)^{2^{n[I(Y_j; V_j) + \delta]}} \to 0,\ n \to \infty, \tag{A42}$$
where the inequality (A42) holds true for all big enough $n$, and the limit follows from the fact that $(1 - 1/a)^a \to e^{-1}$ as $a \to \infty$. Therefore, $\Pr\{E_2 \mid E_{2,c}\} \to 0$ as $n \to \infty$ by (A38).
(c). By (109), it is obvious that $V_{J_1} - Y_{J_1} - Y_{J_2} - V_{J_2}$ forms a Markov chain for any two disjoint nonempty sets $J_1, J_2 \subseteq S$. Thus, if $(Y_j^n, V_j^n) \in T_\epsilon$ for all $j \in S$ and $(Y_1^n, Y_2^n, \ldots, Y_s^n) \in T_\epsilon$, then $(Y_1^n, Y_2^n, \ldots, Y_s^n, V) \in T_\epsilon$. In the meantime, $X - (Y_1, Y_2, \ldots, Y_s) - (V_1, V_2, \ldots, V_s)$ is also a Markov chain. Hence, $(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n, V) \in T_\epsilon$ if $(Y_1^n, Y_2^n, \ldots, Y_s^n, X^n) \in T_\epsilon$. Therefore, $\Pr\{E_3 \mid E_{3,c}\} = 0$.
(d). For all $\emptyset \ne J \subseteq S$, let $J = \{j_1, j_2, \ldots, j_{|J|}\}$ and
$$\Gamma_J = \left\{ V' = (v_1, v_2, \ldots, v_s) \in \prod_{j=1}^{s} \mathcal{V}_{j,\epsilon} \,\middle|\, v_j = V_j^n \text{ if and only if } j \in S \setminus J \right\}.$$
By definition, $|\Gamma_J| = \prod_{j \in J} \left( |\mathcal{V}_{j,\epsilon}| - 1 \right) \le 2^{n\left[ \sum_{j \in J} I(Y_j; V_j) + |J|\delta \right]}$ and
$$\Pr\{E_4 \mid E_{4,c}\} = \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} \Pr\left\{ \eta_j(v_j) = \eta_j(V_j^n)\ \forall\, j \in J,\ V' \in T_\epsilon \,\middle|\, E_{4,c} \right\} = \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} \Pr\left\{ \eta_j(v_j) = \eta_j(V_j^n)\ \forall\, j \in J \right\} \Pr\left\{ V' \in T_\epsilon \,\middle|\, E_{4,c} \right\} \tag{A44}$$
$$< \sum_{\emptyset \ne J \subseteq S} \sum_{V' \in \Gamma_J} 2^{-n \sum_{j \in J} R_j'} \cdot 2^{-n\left[ \sum_{i=1}^{|J|} I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) - |J|\delta \right]} \le \sum_{\emptyset \ne J \subseteq S} 2^{n\left[ \sum_{j \in J} I(Y_j; V_j) + |J|\delta \right]} \cdot 2^{-n \sum_{j \in J} R_j'} \cdot 2^{-n\left[ \sum_{i=1}^{|J|} I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) - |J|\delta \right]} \tag{A45}$$
$$\le C \max_{\emptyset \ne J \subseteq S} 2^{-n\left[ \sum_{j \in J} R_j' - I(Y_J; V_J \mid V_{J^c}) - 2|J|\delta \right]} \to 0,\ n \to \infty, \tag{A46}$$
where $C = 2^s - 1$. Equality (A44) holds because the choices of the $\eta_j$'s and the generation of $V'$ are performed independently. (A45) follows from Lemma A4 and the definitions of the $\eta_j$'s. (A46) follows from Lemma A5.
Lemma A4.
Let $(X_1, X_2, \ldots, X_l, Y) \sim q$. For any $\epsilon > 0$ and positive integer $n$, choose a sequence $\tilde{X}_j^n$ ($1 \le j \le l$) randomly from $T_\epsilon(n, X_j)$ based on a uniform distribution. If $y \in \mathcal{Y}^n$ is an $\epsilon$-typical sequence with respect to $Y$, then
$$\Pr\left\{ (\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n, Y^n) \in T_\epsilon \,\middle|\, Y^n = y \right\} \le 2^{-n\left[ \sum_{j=1}^{l} I(X_j; Y, X_1, X_2, \ldots, X_{j-1}) - 3l\epsilon \right]}.$$
Proof. 
Let $F_j$ be the event $\{(\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_j^n, Y^n) \in T_\epsilon\}$, $1 \le j \le l$, and let $F_0$ be the sure event. We have
$$\Pr\left\{ (\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n, Y^n) \in T_\epsilon \,\middle|\, Y^n = y \right\} = \prod_{j=1}^{l} \Pr\{F_j \mid Y^n = y, F_{j-1}\} \le \prod_{j=1}^{l} 2^{-n\left[ I(X_j; Y, X_1, \ldots, X_{j-1}) - 3\epsilon \right]} = 2^{-n\left[ \sum_{j=1}^{l} I(X_j; Y, X_1, \ldots, X_{j-1}) - 3l\epsilon \right]},$$
since $\tilde{X}_1^n, \tilde{X}_2^n, \ldots, \tilde{X}_l^n$ and $y$ are generated independently. ☐
Lemma A5.
If $(Y_1, V_1, Y_2, V_2, \ldots, Y_s, V_s) \sim q$ and
$$q(y_1, v_1, y_2, v_2, \ldots, y_s, v_s) = q(y_1, y_2, \ldots, y_s) \prod_{i=1}^{s} q(v_i \mid y_i),$$
then, for all $J = \{j_1, j_2, \ldots, j_{|J|}\} \subseteq \{1, 2, \ldots, s\}$,
$$I(Y_J; V_J \mid V_{J^c}) = \sum_{i=1}^{|J|} \left[ I(Y_{j_i}; V_{j_i}) - I(V_{j_i}; V_{J^c}, V_{j_1}, \ldots, V_{j_{i-1}}) \right].$$
(e). Let $E_{5,1} = \{X^n \notin L_V\}$ and $E_{5,2} = \{|L_V| > 1\}$. We have $\Pr\{E_{5,1} \mid E_{5,c}\} = 0$, because $E_{5,c}$ guarantees that $(X^n, V) \in T_\epsilon$ and $M X^n = \sum_{j=s_0+1}^{s} b_j$, i.e., $X^n \in L_V$, and that $V$ is unique. Therefore,
$$\Pr\{E_5 \mid E_{5,c}\} = \Pr\{E_{5,2} \mid E_{5,c}\} \le \sum_{\substack{X_0^n:\ (X_0^n, V) \in T_\epsilon \\ X_0^n \ne X^n}} \Pr\{M X_0^n = M X^n\} \le \sum_{0 \ne I \le_l R}\ \sum_{X_0^n \in D_\epsilon(X^n, I \mid V) \setminus \{X^n\}} \Pr\{M X_0^n = M X^n\}.$$
Choose a small $\eta > 0$ such that $\eta < \frac{\delta}{2 \log |R|}$. Then
$$\Pr\{E_5 \mid E_{5,c}\} < 2|R|^2 \max_{0 \ne I \le_l R} 2^{n\left[ H(X \mid V_S) - H(Y_{R/I} \mid V_S) + \eta \right]} \cdot 2^{-k \log |I|} = 2|R|^2 \max_{0 \ne I \le_l R} 2^{-n\left[ \frac{k \log |I|}{n} - H(X \mid V_S) + H(Y_{R/I} \mid V_S) - \eta \right]} \tag{A53}$$
$$< 2|R|^2 \max_{0 \ne I \le_l R} 2^{-n\left[ \frac{\delta \log |I|}{\log |R|} - \eta \right]} < 2|R|^2 \cdot 2^{-n\delta / (2 \log |R|)} \to 0,\ n \to \infty, \tag{A54}$$
where (A53) follows from Lemmas 2 and 3 (for all large enough $n$ and small enough $\epsilon$), and (A54) holds because $|I| \ge 2$ for all $I \ne 0$.
To summarize, by (a)–(e), we have $\gamma \to 0$ as $n \to \infty$. The theorem is established. ☐

Appendix D. On Coding over Abelian Groups

As discussed in Section 2, since in this paper we focus on linear encoding, we need to work over a field or a ring. In general, most of the existing coding literature assumes coding over fields, especially when the focus is on linear encoding. Some both traditional and recent work, including [9,10,11], has however also considered (Abelian) groups, while significantly fewer results are available for coding over rings. In this appendix we elaborate on the relation between coding over fields, rings and groups in order to clearly show that our results in this paper are not subsumed by previous work on coding over groups. To highlight this fact even further, the following constitutes a counterexample illustrating that "linear" operations over groups are not well-defined: in the case of the Abelian group $G = \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p$ ($p$ is a prime), there are at least three distinct definitions of multiplication that define rings over $G$. These rings are isomorphic to either
  • the field F p 4 which is commutative; or
  • the non-field ring
$$\mathrm{M}_p = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{Z}_p \right\},$$
    which is not commutative; or
  • the product ring Z p × Z p × Z p × Z p which is commutative.
Suppose that a "linear operation over the group $G$" were defined with respect to some multiplicative operation "$*$", and that, at the same time, this linear scheme over $G$ subsumed the three distinct linear coding schemes defined over $\mathbb{F}_{p^4}$, $\mathrm{M}_p$ and $\mathbb{Z}_p \times \mathbb{Z}_p \times \mathbb{Z}_p \times \mathbb{Z}_p$ simultaneously. We would then conclude that the operation "$*$" is commutative and non-commutative at the same time, a contradiction.
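A small sanity check of this counterexample (the field structure $\mathbb{F}_{p^4}$ is omitted here, since constructing it requires an irreducible quartic polynomial): the sketch below equips one and the same additive group $\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_2$ with both the componentwise (product ring) multiplication and $2 \times 2$ matrix multiplication, and confirms that the former is commutative while the latter is not.

```python
from itertools import product

p = 2
# Elements encoded as tuples (a, b, c, d); additively this is exactly
# the group Z_p + Z_p + Z_p + Z_p (here p = 2, so 16 elements).
elements = list(product(range(p), repeat=4))

def mat_mul(m, n):
    # Multiplication of M_p: read (a, b, c, d) as the 2x2 matrix [[a, b], [c, d]].
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)

def comp_mul(m, n):
    # Multiplication of the product ring Z_p x Z_p x Z_p x Z_p: componentwise.
    return tuple((x * y) % p for x, y in zip(m, n))

# Componentwise multiplication is commutative on all of G...
assert all(comp_mul(m, n) == comp_mul(n, m) for m in elements for n in elements)
# ...while matrix multiplication on the very same group is not.
assert any(mat_mul(m, n) != mat_mul(n, m) for m in elements for n in elements)
```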
To be more specific about the fundamental differences, beyond linearity, between coding over groups, as in e.g., [11], and coding over fields or rings we also provide the following list of additional remarks.
(R1)
Consider the example given in ([11], Section VIII.B.1) for reconstruction of the modulo-two sum of binary symmetric sources [4]. On ([11], p. 1509), it reads "Rate points achieved by embedding the function in the Abelian groups $\mathbb{Z}_3$, $\mathbb{Z}_4$ are strictly worse than that achieved by embedding the function in $\mathbb{Z}_2$ while embedding in $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ gives the Slepian–Wolf rate region for the lossless reconstruction of $(X, Y)$". (Here "$(X, Y)$" should be "$F(X, Y) = X \oplus_2 Y$" from the context, because coding over $\mathbb{Z}_3$ is not strictly worse than coding over $\mathbb{Z}_2$ for losslessly reconstructing the original data $(X, Y)$ [3].)
Ref. [11] thus clearly states that group coding over $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ for encoding the modulo-two sum of symmetric sources gives only the Slepian–Wolf region. On the contrary, consider either the finite field $\mathbb{F}_4$ or the non-field ring
$$\mathrm{M}_{L,2} = \left\{ \begin{bmatrix} a & 0 \\ b & a \end{bmatrix} \,\middle|\, a, b \in \mathbb{Z}_2 \right\}$$
(note: the underlying Abelian group defining both $\mathbb{F}_4$ and $\mathrm{M}_{L,2}$ is $\mathbb{Z}_2 \oplus \mathbb{Z}_2$). We claim that linear coding over either $\mathbb{F}_4$ or $\mathrm{M}_{L,2}$ for encoding the modulo-two sum of symmetric sources gives the Körner–Marton region [4]. This is because linear coding over a finite field, e.g., $\mathbb{F}_4$, is always optimal for the Slepian–Wolf problem, and so is linear coding over the non-field ring $\mathrm{M}_{L,2}$ by Theorem 7. However, group coding over $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is not.
It is well-known that the Körner–Marton region is often strictly larger than the Slepian–Wolf region. Yet linear coding over the non-field ring $\mathrm{M}_{L,2}$ (or the field $\mathbb{F}_4$), being a special case of (nonlinear) coding over the Abelian group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$, could then not achieve a region larger than the Slepian–Wolf region, leading to a contradiction.
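The algebra behind the Körner–Marton scheme referenced here is the identity $Mx \oplus My = M(x \oplus y)$: both encoders apply one common matrix, so the decoder obtains the syndrome of the modulo-two sum $z = x \oplus y$ from the two messages alone. A minimal sketch over $\mathbb{Z}_2$ (the matrix and block length are arbitrary; the subsequent typicality decoding of $z$ from its syndrome is omitted):

```python
import random

random.seed(2)
n, k = 8, 4
M = [[random.randrange(2) for _ in range(n)] for _ in range(k)]

def encode(M, x):
    # Syndrome over Z_2: the same linear map is used at both encoders.
    return [sum(m * xi for m, xi in zip(row, x)) % 2 for row in M]

x = [random.randrange(2) for _ in range(n)]
y = [random.randrange(2) for _ in range(n)]
z = [(a + b) % 2 for a, b in zip(x, y)]   # the modulo-two sum to be computed

# The decoder forms the syndrome of z from the two messages alone:
sx, sy = encode(M, x), encode(M, y)
assert [(a + b) % 2 for a, b in zip(sx, sy)] == encode(M, z)
```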
(R2)
Row 2 of Table III in [11] states that group coding over $\mathbb{Z}_4 \oplus \mathbb{Z}_4$ (achieving sum rate 3.5) is strictly worse than coding over the group $\mathbb{Z}_4$ (achieving sum rate 3) for lossless encoding of a quaternary function ([11], Section VIII.A). On the contrary, linear coding over the ring $\mathbb{Z}_4 \times \mathbb{Z}_4$ (with underlying Abelian group $\mathbb{Z}_4 \oplus \mathbb{Z}_4$) always achieves a region containing the one achieved by linear coding over the ring $\mathbb{Z}_4$. This is implied by Theorem 3. By direct calculation, we have that linear coding over the ring $\mathbb{Z}_4 \times \mathbb{Z}_4$ (achieving sum rate 3) is strictly better than coding over the Abelian group $\mathbb{Z}_4 \oplus \mathbb{Z}_4$ (achieving sum rate 3.5).
(R3)
Finally, we emphasize that according to the Fundamental Theorem of Finite Abelian Groups ([12], Theorem 5.25), up to isomorphism, every finite Abelian group is a direct sum of cyclic groups of prime-power order ([12], Proposition 5.27). This implies that every finite Abelian group can be represented as a direct sum of rings of modulo integers. However, many finite rings are not (isomorphic to) direct products of modulo integers, e.g., finite fields $\mathbb{F}_q$ (when $q$ is a power of a prime but is not a prime), matrix rings $\mathrm{M}_{L,q}$ (where $q \ge 2$ is any integer) and all non-commutative rings. For a fixed order (e.g., $p^2$ with $p$ being a prime), the number of finite rings is often significantly bigger than the number of finite Abelian groups. For instance, there are 4 rings of order 4, while there are only 2 groups of order 4.

Acknowledgments

The authors would like to thank their colleagues Jinfeng Du and Mattias Andersson for assistance in proving Lemma A2. They are also very grateful to an anonymous reviewer of the paper [20] for suggesting an alternative proof of Lemma 3. This work was funded in part by the Swedish Research Council.

Author Contributions

Sheng Huang contributed to the original idea, the analysis and proofs, and wrote the paper. Mikael Skoglund helped to polish the idea and the analysis, and wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Slepian, D.; Wolf, J.K. Noiseless Coding of Correlated Information Sources. IEEE Trans. Inf. Theory 1973, 19, 471–480. [Google Scholar] [CrossRef]
  2. Elias, P. Coding for Noisy Channels. IRE Conv. Rec. 1955, 3, 37–46. [Google Scholar]
  3. Csiszár, I. Linear Codes for Sources and Source Networks: Error Exponents, Universal Coding. IEEE Trans. Inf. Theory 1982, 28, 585–592. [Google Scholar] [CrossRef]
  4. Körner, J.; Marton, K. How to Encode The Modulo-Two Sum of Binary Sources. IEEE Trans. Inf. Theory 1979, 25, 219–221. [Google Scholar] [CrossRef]
  5. Ahlswede, R.; Han, T.S. On Source Coding with Side Information via a Multiple-Access Channel and Related Problems in Multi-User Information Theory. IEEE Trans. Inf. Theory 1983, 29, 396–411. [Google Scholar] [CrossRef]
  6. Huang, S.; Skoglund, M. Polynomials and Computing Functions of Correlated Sources. In Proceedings of the 2012 IEEE International Symposium on Information Theory, Cambridge, MA, USA, 1–6 July 2012; pp. 771–775. [Google Scholar]
  7. Huang, S.; Skoglund, M. Computing Polynomial Functions of Correlated Sources: Inner Bounds. In Proceedings of the International Symposium on Information Theory and Its Applications, Honolulu, HI, USA, 28–31 October 2012; pp. 160–164. [Google Scholar]
  8. Han, T.S.; Kobayashi, K. A Dichotomy of Functions F(X, Y) of Correlated Sources (X, Y) from the Viewpoint of the Achievable Rate Region. IEEE Trans. Inf. Theory 1987, 33, 69–76. [Google Scholar] [CrossRef]
  9. Como, G.; Fagnani, F. The Capacity of Finite Abelian Group Codes over Symmetric Memoryless Channels. IEEE Trans. Inf. Theory 2009, 55, 2037–2054. [Google Scholar] [CrossRef]
  10. Como, G. Group codes outperform binary-coset codes on nonbinary symmetric memoryless channels. IEEE Trans. Inf. Theory 2010, 56, 4321–4334. [Google Scholar] [CrossRef]
  11. Krithivasan, D.; Pradhan, S. Distributed Source Coding Using Abelian Group Codes: A New Achievable Rate-Distortion Region. IEEE Trans. Inf. Theory 2011, 57, 1495–1519. [Google Scholar] [CrossRef]
  12. Rotman, J.J. Advanced Modern Algebra, 2nd ed.; American Mathematical Society: Providence, RI, USA, 2010. [Google Scholar]
  13. Mullen, G.; Stevens, H. Polynomial functions (modm). Acta Math. Hung. 1984, 44, 237–241. [Google Scholar] [CrossRef]
  14. Hungerford, T.W. Algebra (Graduate Texts in Mathematics); Springer: New York, NY, USA, 1980. [Google Scholar]
  15. Lam, T.Y. A First Course in Noncommutative Rings, 2nd ed.; Springer: New York, NY, USA, 2001. [Google Scholar]
  16. Dummit, D.S.; Foote, R.M. Abstract Algebra, 3rd ed.; Wiley: New York, NY, USA, 2003. [Google Scholar]
  17. Anderson, F.W.; Fuller, K.R. Rings and Categories of Modules, 2nd ed.; Springer: New York, NY, USA, 1992. [Google Scholar]
  18. Yeung, R.W. Information Theory and Network Coding, 1st ed.; Springer: New York, NY, USA, 2008. [Google Scholar]
  19. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: New York, NY, USA, 2006. [Google Scholar]
  20. Huang, S.; Skoglund, M. On achievability of linear source coding over finite rings. In Proceedings of the 2013 IEEE International Symposium on Information Theory Proceedings (ISIT), Istanbul, Turkey, 7–12 July 2013; pp. 1984–1988. [Google Scholar]
  21. Huang, S.; Skoglund, M. Encoding Irreducible Markovian Functions of Sources: An Application of Supremus Typicality; KTH Royal Institute of Technology: Stockholm, Sweden, 2013. [Google Scholar]
  22. Buck, R.C. Nomographic Functions are Nowhere Dense. Proc. Am. Math. Soc. 1982, 85, 195–199. [Google Scholar] [CrossRef]
  23. Huang, S.; Skoglund, M. Linear Source Coding over Rings and Applications. In Proceedings of the IEEE Swedish Communication Technologies Workshop, Lund, Sweden, 24–26 October 2012; pp. 1–6. [Google Scholar]
  24. Slepian, D. Group Codes for the Gaussian Channel. Bell Syst. Tech. J. 1968, 47, 575–602. [Google Scholar] [CrossRef]
  25. Ahlswede, R. Group Codes do not Achieve Shannon’s Channel Capacity for General Discrete Channels. Ann. Math. Stat. 1971, 42, 224–240. [Google Scholar] [CrossRef]
  26. Forney, G.D., Jr. On the Hamming distance properties of group codes. IEEE Trans. Inf. Theory 1992, 38, 1797–1801. [Google Scholar] [CrossRef]
  27. Singmaster, D.; Bloom, D.M. Rings of Order Four. Am. Math. Mon. 1964, 71, 918–920. [Google Scholar] [CrossRef]
  28. Chan, T.H.; Grant, A. Entropy vector and network codes. In Proceedings of the IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007. [Google Scholar]
  29. Interlando, J.C.; Palazzo, R., Jr.; Elia, M. Group Block Codes Over Nonabelian Groups are Asymptotically Bad. IEEE Trans. Inf. Theory 1996, 42, 1277–1280. [Google Scholar] [CrossRef]
  30. Du, J.; KTH Royal Institute of Technology, Stockholm, Sweden; Andersson, M.; KTH Royal Institute of Technology, Stockholm, Sweden. Personal Communication, 2012.
  31. Lidl, R.; Niederreiter, H. Finite Fields, 2nd ed.; Cambridge University Press: New York, NY, USA, 1997. [Google Scholar]
Figure 1. $g: \{\alpha_0, \alpha_1\}^3 \to \{\beta_0, \beta_1, \beta_2, \beta_3\}$.
Figure 2. $x + 2y + 3z \in \mathbb{Z}_4[3]$.
Figure 3. $\hat{h}(x + 2y + 4z) \in \mathbb{Z}_5[3]$.
Table 1. Distribution p.
( X 1 , X 2 , X 3 ) p ( X 1 , X 2 , X 3 ) p
( α 0 , α 0 , α 0 ) 1 / 90 ( α 0 , α 1 , α 0 ) 1 / 90
( α 1 , α 0 , α 1 ) 1 / 90 ( α 1 , α 1 , α 1 ) 1 / 90
( α 1 , α 0 , α 0 ) 42 / 90 ( α 0 , α 0 , α 1 ) 1 / 90
( α 0 , α 1 , α 1 ) 42 / 90 ( α 1 , α 1 , α 0 ) 1 / 90
