1. Introduction
A fundamental idea in data compression is to assign shorter codewords to symbols or groups of symbols that appear with higher probability in the source data [1]. This results in what is called a variable-length code, meaning that codewords may differ in length. Among the earliest practical codes of this kind are the Huffman codes [2]. Huffman codes approach the theoretical compression limit under certain scenarios [1] and have more recently been adapted into Tagged Huffman Codes [3] and End-Tagged Dense Codes [4] to accommodate database searches.
Popular alternatives to these codes are codes with suffix delimiters, where certain patterns mark the end of a codeword [5]. A prominent example in this class of codes is the Fibonacci code, which results from a binary numeration system for the integers known as the Zeckendorf representation. The scheme was introduced in [6] and generalized to a higher order in [7]. In terms of data compression, Fibonacci codes are optimal under some distributions and can be used as alternatives to Huffman codes [1,8]. Fibonacci codes are naturally suitable for integer data, where fast encoding and decoding algorithms can be implemented [9,10,11]. In addition, compared with other variable-length codes for the integers such as logarithmic ramp [12] and Elias codes [13], Fibonacci codes provide much better resistance against insertion and deletion errors. Other applications of the Fibonacci sequence in information science include the study of Morse code as a monoid [14,15], wavelet trees [16], and zero-knowledge cryptography [17].
While suffix codes have been shown to offer adjustability to some extent [5], a great burden for a code with suffix delimiters is its own suffix. A longer suffix gives more flexibility in the coded data, hence yielding denser codes, but every codeword then has to carry a longer appendage. This is particularly the case for the Fibonacci codes of higher order [18]. In this paper, we introduce Fibonacci coding for a sequential string of integers. In the proposed scheme, a group of integers is encoded together with a single delimiter, thus making Fibonacci codes of higher order more practical.
To generalize Fibonacci codes to the full extent, we re-examine their origin, which is the theorem of Zeckendorf. This much-celebrated theorem states that every positive integer $n$ can be written uniquely as:
$$n = F_{c_1} + F_{c_2} + \cdots + F_{c_r},$$
where the $F_{c_i}$s are Fibonacci numbers and $c_1 \geq 2$, $c_{i+1} > c_i + 1$. In other words, one may represent a positive integer as a sum of nonconsecutive Fibonacci numbers. This theorem has been generalized and investigated in many directions; similar results can be stated for the negaFibonacci sequence [19,20], the generalized Fibonacci sequence of order $k$ [21,22], and linear recurrence sequences [23,24]. The distribution of the number of terms in a Zeckendorf decomposition is studied in [25,26,27], where a probabilistic approach has recently been applied [28].
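As a concrete illustration of the classical theorem, the following minimal sketch computes a Zeckendorf decomposition greedily (always subtracting the largest Fibonacci number that still fits). The function name `zeckendorf` is ours and is not taken from the paper; the Fibonacci numbers are taken as 1, 2, 3, 5, 8, and so on.

```python
def zeckendorf(n):
    """Return the Fibonacci numbers summing to n, largest first (greedy choice)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    fibs = [1, 2]                      # F_2, F_3 in the usual indexing
    while fibs[-1] <= n:               # grow until the last entry exceeds n
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):
        if f <= n:                     # greedy step: take the largest term that fits
            parts.append(f)
            n -= f
    return parts


print(zeckendorf(11))   # 11 = 8 + 3
print(zeckendorf(100))  # 100 = 89 + 8 + 3
```

Because the remainder after taking a Fibonacci number is always smaller than the next smaller Fibonacci number, the greedy choice never selects two consecutive terms.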
Meanwhile, the Fibonacci sequence itself has also been generalized in many directions. Here, noninteger Fibonacci sequences have been studied mostly on the complex numbers [29,30]. In particular, Jordan [31] and Harman [32] consider Fibonacci sequences whose initial terms and indices are Gaussian integers. Other generalizations usually involve an identity whose integer terms yield the classical Fibonacci numbers. For example, Binet’s formula is exploited in [33,34], where a Fibonacci function is defined over the real and complex numbers.
In this paper, we are interested in a generalization of Zeckendorf’s theorem in its purest form: We consider any mathematical sequences that satisfy the generalized Fibonacci recurrence relation and examine elements that can be written as a sum of terms from the sequence. As a result, the mathematical spaces with minimal structure that meet our requirement are the free $\mathbb{Z}$-modules. Loosely speaking, they are spaces of vectors with integer components. This choice of spaces allows our generalizations to encompass, for example, integer vectors, arrays, or blocks of integers of fixed length; Gaussian and Eisenstein integers; lattices; and the ring $\mathbb{Z}[\alpha]$, where $\alpha$ is algebraic.
We summarize the contributions of this paper as follows:
We introduce the notion of $k$-equivalent sequences. This approach treats the elements that can be generated from a Fibonacci sequence as a number system that uses the sequence elements as a basis. This key mechanism allows us to manipulate the “digits” of a representation of a number without explicit knowledge of the underlying Fibonacci sequence. We prove that an element has a Zeckendorf decomposition if and only if it can be written as a finite sum (with multiplicities) of sequence elements.
We provide a necessary and sufficient condition for Zeckendorf’s theorem to hold in a free $\mathbb{Z}$-module. The classical Zeckendorf’s theorem can be viewed as a special case of our generalization. We also give a sufficient condition that makes the representation unique.
We propose multidimensional Fibonacci coding for free $\mathbb{Z}$-modules along with a natural and explicit encoding and decoding algorithm. This is the first work that makes native Fibonacci coding possible for non-integers. The resulting codes inherit their robustness property from the standard Fibonacci codes, and they are competitive in compression efficiency. We demonstrate both theoretically and numerically that the compression performance of multidimensional Fibonacci coding is comparable to that of the classical Fibonacci coding. The proposed scheme has an advantage in that the suffix now appears less frequently, thus cutting down on extraneous bits. However, it can be disadvantageous when a large integer appears together with a small integer in the source vector data. In such a case, the larger integer could disproportionately drive up the length of the codeword.
The remainder of this paper is organized as follows. Section 2 sets the definitions that will be used throughout the paper. In Section 3, we prove several useful results concerning $k$-equivalent sequences. Generalizations of Zeckendorf’s theorem are given in Section 4. We establish multidimensional Fibonacci coding in Section 5, discuss its efficiency and demonstrate simulation results in Section 6, and conclude in Section 7.
2. Definitions and Terminologies
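We first give a definition for free modules and Fibonacci sequences of higher order. A free $\mathbb{Z}$-module is a module over $\mathbb{Z}$ with a basis. Namely, $M$ is a free $\mathbb{Z}$-module of rank $l$ if it is a group under addition and
$$M = \mathbb{Z}e_1 \oplus \mathbb{Z}e_2 \oplus \cdots \oplus \mathbb{Z}e_l$$
for some $e_1, \dots, e_l \in M$ that are integrally independent, meaning that the only integer solution to the equation $a_1 e_1 + a_2 e_2 + \cdots + a_l e_l = 0$ is $a_1 = a_2 = \cdots = a_l = 0$. The elements $e_1, \dots, e_l$ are called a basis for $M$, and every element of $M$ can be written uniquely as $a_1 e_1 + a_2 e_2 + \cdots + a_l e_l$, where $a_1, \dots, a_l \in \mathbb{Z}$. For a positive integer $a$ and $m \in M$, we write $am$ to mean the sum $m + m + \cdots + m$ of $a$ copies of $m$. Note that multiplication between module elements may or may not be defined.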
We are interested in an extremely broad version of a Fibonacci sequence, so we adopt only the fundamental recurrence relation and a corresponding generalization of a Zeckendorf representation.
Definition 1. A Fibonacci sequence of order $k$ is a doubly infinite sequence:
$$\mathbf{F} = (\ldots, F_{-2}, F_{-1}, F_0, F_1, F_2, \ldots),$$
where
$$F_n = F_{n-1} + F_{n-2} + \cdots + F_{n-k}$$
for all integers $n$. Here, a doubly infinite sequence (also called a two-way infinite sequence) is a sequence that extends in both directions.
Definition 2. Let $\mathbf{F}$ be a Fibonacci sequence of order $k$. An element $m$ is Zeckendorf in $\mathbf{F}$ if it can be written as a finite sum of terms in $\mathbf{F}$ with no $k$ consecutive terms. A set $M$ is Zeckendorf in $\mathbf{F}$ if every element of $M$ is.
Note that we do not specify the initial condition for a Fibonacci sequence of order $k$. In fact, we will reverse engineer the theorem of Zeckendorf and identify all initial conditions for which the theorem holds. In other words, instead of proving a version of Zeckendorf’s theorem for a particular sequence, we set out to classify all sequences for which the theorem holds.
A Fibonacci sequence of order $k$ has $k$ degrees of freedom; knowing any $k$ consecutive terms is sufficient to generate the entire sequence. For $k \geq 2$, the characteristic polynomial of a Fibonacci sequence of order $k$ is $x^k - x^{k-1} - \cdots - x - 1$. It is known [35] that this polynomial has $k$ distinct roots $\phi_1, \phi_2, \dots, \phi_k$, where $\phi_1 > 1$ is real and $\phi_2, \dots, \phi_k$ lie inside the unit circle. Note that the sequence $(\phi_1^n)_{n \in \mathbb{Z}}$ is a special case of a Fibonacci sequence of order $k$. We call this sequence primitive.
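The remark that any $k$ consecutive terms determine the whole two-way sequence can be sketched in a few lines. The function name `fibonacci_window` and the seeds used below are illustrative assumptions rather than the paper's own notation: the seed gives $F_0, \dots, F_{k-1}$, later terms follow from the recurrence, and earlier terms follow from the recurrence solved for its oldest term, $F_n = F_{n+k} - F_{n+k-1} - \cdots - F_{n+1}$.

```python
def fibonacci_window(seed, lo, hi):
    """Terms F_lo, ..., F_hi of the order-k sequence with F_0..F_{k-1} = seed."""
    k = len(seed)
    terms = dict(enumerate(seed))
    for n in range(k, hi + 1):                 # forward: F_n = F_{n-1} + ... + F_{n-k}
        terms[n] = sum(terms[n - j] for j in range(1, k + 1))
    for n in range(-1, lo - 1, -1):            # backward: F_n = F_{n+k} - F_{n+k-1} - ... - F_{n+1}
        terms[n] = terms[n + k] - sum(terms[n + j] for j in range(1, k))
    return [terms[n] for n in range(lo, hi + 1)]


# Order 2 with seed (0, 1): the classical Fibonacci numbers, extended to the left.
print(fibonacci_window((0, 1), -4, 4))
# Order 3 with an assumed seed (0, 0, 1).
print(fibonacci_window((0, 0, 1), -4, 6))
```

The seed entries only need to support addition and subtraction, so Python complex numbers with integer parts can stand in for Gaussian integers.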
Example 1. In this example, we consider several Fibonacci sequences of order 3 (with respect to different initial conditions for , , and ):
The usual Tribonacci sequence is defined using , , and . The first few terms of this sequence are:The doubly infinite generalization of this sequence is given by: The three roots of a polynomial are , , and . The primitive Fibonacci sequence of order 3 is the sequence , which is approximately: A Fibonacci sequence of order 3, where , , and is given by:Here, the element of this sequence are Gaussian integers, i.e., complex numbers of the form , where . The element , for example, is Zeckendorf in this sequence since .
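In order to study elements that are Zeckendorf in $\mathbf{F}$, it is natural to consider elements of the form $\mathbf{c} \cdot \mathbf{F}$, where $\mathbf{c} = (c_i)_{i \in \mathbb{Z}}$ is a doubly infinite sequence of integers and $\cdot$ is an infinite dot product, i.e.:
$$\mathbf{c} \cdot \mathbf{F} = \sum_{i=-\infty}^{\infty} c_i F_i.$$
Roughly speaking, the collection of all elements of the form $\mathbf{c} \cdot \mathbf{F}$ is a number system with the base $\mathbf{F}$ and the digit set $\mathbb{Z}$. Here, the entries of $\mathbf{c}$ act like digits or coefficients for the basis $\mathbf{F}$. These so-called coefficients can be manipulated using the following methods.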
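Definition 3. Let $k \geq 2$ be an integer. Doubly infinite sequences $\mathbf{a}$ and $\mathbf{b}$ are $k$-equivalent, denoted as $\mathbf{a} \sim \mathbf{b}$, if $\mathbf{b}$ can be obtained from $\mathbf{a}$ using finite applications of the following two operations:
Operation 1: For some $i$, subtract 1 from $a_i$ and add 1 to all of $a_{i-1}, a_{i-2}, \dots, a_{i-k}$.
Operation 2: For some $i$, add 1 to $a_i$ and subtract 1 from all of $a_{i-1}, a_{i-2}, \dots, a_{i-k}$.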
It is not hard to see that Operations 1 and 2 “cancel out”, and $\sim$ is an equivalence relation. These operations also preserve the value of the dot product: If $\mathbf{F}$ is a Fibonacci sequence of order $k$ and $\mathbf{a} \sim \mathbf{b}$, then $\mathbf{a} \cdot \mathbf{F} = \mathbf{b} \cdot \mathbf{F}$. As we will see later, this observation is an important ingredient of our results. Finally, we give a lexicographical order $\succ$ to doubly infinite sequences that are zero almost everywhere, i.e., $\mathbf{a} \succ \mathbf{b}$ if the term of $\mathbf{a}$ is larger than that of $\mathbf{b}$ at the rightmost index where they are not equal. Here, a sequence is zero almost everywhere if the number of its nonzero terms is finite.
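The two operations and the invariance of the dot product can be checked mechanically. The sketch below is only an illustration under our own conventions: a sequence that is zero almost everywhere is stored as a dictionary from index to coefficient, and the helper names (`two_way`, `operation1`, `operation2`, `dot`) and the seed (0, 0, 1) are assumptions, not notation from the paper.

```python
def two_way(seed, lo, hi):
    """Dictionary {n: F_n} for lo <= n <= hi, with F_0..F_{k-1} = seed."""
    k = len(seed)
    F = dict(enumerate(seed))
    for n in range(k, hi + 1):
        F[n] = sum(F[n - j] for j in range(1, k + 1))
    for n in range(-1, lo - 1, -1):
        F[n] = F[n + k] - sum(F[n + j] for j in range(1, k))
    return F


def operation1(c, i, k):
    """Subtract 1 at index i and add 1 at i-1, ..., i-k (returns a new dict)."""
    d = dict(c)
    d[i] = d.get(i, 0) - 1
    for j in range(1, k + 1):
        d[i - j] = d.get(i - j, 0) + 1
    return d


def operation2(c, i, k):
    """Add 1 at index i and subtract 1 at i-1, ..., i-k (returns a new dict)."""
    d = dict(c)
    d[i] = d.get(i, 0) + 1
    for j in range(1, k + 1):
        d[i - j] = d.get(i - j, 0) - 1
    return d


def dot(c, F):
    """The (finite) dot product of a coefficient dictionary with a sequence."""
    return sum(v * F[i] for i, v in c.items())


F = two_way((0, 0, 1), -12, 12)     # an order-3 sequence with an assumed seed
c = {4: 2, 1: -1, 0: 3}
print(dot(c, F), dot(operation1(c, 4, 3), F), dot(operation2(c, 0, 3), F))
```

All three printed values agree, because each operation trades one term $F_i$ for the sum $F_{i-1} + \cdots + F_{i-k}$ or vice versa.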
3. Properties of k-Equivalent Sequences
In a way, our approach in proving a generalized theorem of Zeckendorf is to ask the following converse question: Given a Fibonacci sequence $\mathbf{F}$ of order $k$, which elements can be written as a finite sum of elements from $\mathbf{F}$ using no $k$ consecutive terms? It is not hard to see that they are elements of the form $\mathbf{c} \cdot \mathbf{F}$, where $\mathbf{c}$ is binary, is 0 almost everywhere, and has no $k$ consecutive 1’s. Now, instead of characterizing these elements directly, we will focus on manipulating the coefficients using Operations 1 and 2 from Definition 3. These coefficient manipulations operate independently of the underlying Fibonacci sequence of order $k$, and so this approach frees us from having to worry about the choice of $\mathbf{F}$. As we will see, our results hold in a broad sense and are not limited to any specific Fibonacci sequence. Propositions 1 and 2 given in this section will provide an effective mechanism for identifying sequences that are $k$-equivalent to a binary sequence with the required properties.
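Proposition 1. For any sequence $\mathbf{c}$ of integers that is 0 almost everywhere, there exists a sequence $\mathbf{c}'$ such that $\mathbf{c} \sim \mathbf{c}'$, and $\mathbf{c}'$ is either zero everywhere, or zero everywhere except at some $k$ consecutive terms, which are either all positive or all negative.
In addition, if $\mathbf{c}$ is non-negative, then the $k$ consecutive nonzero terms of $\mathbf{c}'$ are positive.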
Proof. A finite contiguous subsequence of is called a cluster if the terms in its complement subsequence are all 0. In other words, is a cluster of if for all and all . The number of terms in a cluster is called the size of the cluster. In addition, note that a cluster may contain, begin, or end with a 0.
A sequence $\mathbf{c}$ must have a cluster since it is 0 almost everywhere. Now, if the size of the cluster is greater than $k$, then one may apply either Operation 1 or 2 from Definition 3 to eliminate the rightmost nonzero term of $\mathbf{c}$, hence reducing the size of the cluster. Therefore, $\mathbf{c}$ is $k$-equivalent to a sequence whose cluster size is $k$. If the terms of the cluster of this sequence are all positive, negative, or zero, then we are done. Otherwise, one may continue eliminating the rightmost nonzero term of the sequence and, as a result, shift the index of the cluster by 1. We formalize this process as follows.
Initialize , and let be a cluster of :
If
, we apply
Operation 1 to eliminate
and obtain another sequence
. Here,
. The cluster of
has size
k, but its index has shifted from that of
by 1. Namely, the cluster of
is:
If
, we apply
Operation 2 to eliminate
and obtain another sequence
. Incidentally, since
is negative, the cluster of
also follows from the Equation (
1).
If
, let
and:
where the last equality follows from the fact that
.
It follows that, in all cases:
where:
and
are taken as a
vector. This process can be repeated indefinitely, resulting in a collection of
k-equivalent sequences
. The cluster size of these sequences is
k, but the index of the cluster is continually shifted. Here, the clusters are collected as
. It follows that
, and it remains to show that there is an
n for which
is either all positive, negative, or zero.
It should not be surprising that
A is the transpose of the usual Fibonacci matrix. Thus,
A has a characteristic equation
and eigenvalues
, where
and the complex norm of
is less than 1 [
35]. We denote by
an eigenvector corresponding to the eigenvalue
. Note that:
is real and positive. Now, one may diagonalize
A as
, where
and
. It follows that:
where
. Here,
is the dominant term in the sum. Specifically, let
and
. Thus,
It follows that if n is a positive integer such that , then is the integer vector that is closest to . Since is real and positive, all entries of will have the same sign as . This completes the proof that can be made all positive, negative, or zero.
For the last part of the proposition, if is non-negative, then we only need to use Operation 1, and so the cluster of cannot be negative or zero. □
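Proposition 2. For any sequence $\mathbf{c}$ of non-negative integers that is 0 almost everywhere, there exists a sequence $\mathbf{b}$ of 0 and 1 with no $k$ consecutive 1’s such that $\mathbf{c} \sim \mathbf{b}$.
Proof. Consider all sequences of non-negative integers that are $k$-equivalent to $\mathbf{c}$. Recall that if $\mathbf{a} \sim \mathbf{c}$, then $\mathbf{a} \cdot \boldsymbol{\Phi} = \mathbf{c} \cdot \boldsymbol{\Phi}$, where $\boldsymbol{\Phi} = (\phi_1^n)_{n \in \mathbb{Z}}$ is the primitive Fibonacci sequence of order $k$. Let $N$ be an integer such that $\phi_1^N > \mathbf{c} \cdot \boldsymbol{\Phi}$. Now, if a sequence $\mathbf{a}$ of non-negative integers is $k$-equivalent to $\mathbf{c}$, then we must have $\mathbf{a} \cdot \boldsymbol{\Phi} < \phi_1^N$. Since $\boldsymbol{\Phi}$ is strictly positive, it follows that $a_i = 0$ for all $i \geq N$. This allows us to conclude that, among all the sequences of non-negative integers that are $k$-equivalent to $\mathbf{c}$, there must be one with the highest lexicographical order. We call this sequence $\mathbf{b}$ and claim that it satisfies the conditions required.
Suppose that $b_i \geq 2$ for some $i$. We perform Operation 1 at $i$ and Operation 2 at $i+1$. The net effect is to subtract 2 from $b_i$ and add 1 to $b_{i+1}$ and $b_{i-k}$, which results in a non-negative sequence with higher lexicographical order than $\mathbf{b}$, contradicting our choice of $\mathbf{b}$. Suppose now that $b_{i-1}, b_{i-2}, \dots, b_{i-k}$ are all equal to 1 for some $i$. Performing Operation 2 at $i$ yields a sequence with higher lexicographical order, and once again that contradicts the choice of $\mathbf{b}$. We conclude that $\mathbf{b}$ consists only of 0 and 1 with no $k$ consecutive 1’s. □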
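The argument above also suggests a direct procedure: keep applying the two lexicographically increasing moves until neither applies. The following sketch does exactly that for a non-negative coefficient dictionary. It is an illustration of the idea rather than the authors' procedure; the names `normalize`, `has_run`, and `value` are ours, and the numerical check uses the golden ratio, i.e., the primitive sequence of order 2.

```python
def _bump(d, i, delta):
    """Add delta to d[i], dropping the key when the coefficient reaches 0."""
    d[i] = d.get(i, 0) + delta
    if d[i] == 0:
        del d[i]


def has_run(d, k):
    """True if k consecutive indices all carry a positive coefficient."""
    return any(all(d.get(t - j, 0) > 0 for j in range(k)) for t in d)


def normalize(c, k):
    """Rewrite a non-negative coefficient dict as 0/1 digits with no k consecutive 1's."""
    if any(v < 0 for v in c.values()):
        raise ValueError("coefficients must be non-negative")
    d = {i: v for i, v in c.items() if v > 0}
    while True:
        runs = [t for t in d if all(d.get(t - j, 0) > 0 for j in range(k))]
        if runs:                       # Operation 2 at t + 1: fold k terms into one
            t = max(runs)
            for j in range(k):
                _bump(d, t - j, -1)
            _bump(d, t + 1, +1)
            continue
        big = [i for i, v in d.items() if v >= 2]
        if big:                        # 2 F_i = F_{i+1} + F_{i-k}
            i = max(big)
            _bump(d, i, -2)
            _bump(d, i + 1, +1)
            _bump(d, i - k, +1)
            continue
        return d


def value(c, base):
    """Dot product with the primitive sequence (base ** i)."""
    return sum(v * base ** i for i, v in c.items())


phi = (1 + 5 ** 0.5) / 2
digits = normalize({2: 5}, 2)
print(sorted(digits), abs(value(digits, phi) - value({2: 5}, phi)) < 1e-9)
```

The second move is the combination of Operation 1 at $i$ and Operation 2 at $i+1$ used in the proof; both moves preserve the dot product with any Fibonacci sequence of order $k$.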
We illustrate the transformations given in Propositions 1 and 2 in an example below.
Example 2. We let and consider a doubly infinite sequence that is zero everywhere except , , , and , i.e.: Here, and throughout the example, indices are shown as subscript for improved readability. We now transform into a sequence that is positive at some three consecutive terms and zero elsewhere: Here, Operation 1 is performed at indices 4,3,1,0,0 in order. Next, we transform
into a binary sequence with no three consecutive ones: Here, Operations 2,1,2 are performed in order at indices , respectively. Consider now a Fibonacci sequence of order 3 of Gaussian integers:from Example 1. Again, parentheses and subscript are added for readability. Note that all the above three-equivalent sequences represent the same quantity when multiplied by . In particular, if and , then Indeed, the fact that would hold for any other Fibonacci sequences of order 3.
The last example illustrates a major strength of our approach: each equivalence class represents the same quantity under a given Fibonacci sequence. If one can find a legitimate Zeckendorf representation in an equivalence class, then the element represented by that class has a Zeckendorf decomposition. We finish this section with a powerful corollary to Proposition 2 and another example. The next result characterizes all elements that are Zeckendorf in a Fibonacci sequence of order $k$.
Corollary 1. Let $\mathbf{F}$ be a Fibonacci sequence of order $k$. An element $m$ is Zeckendorf in $\mathbf{F}$ if and only if $m$ can be written as a finite sum (with multiplicities) of elements from $\mathbf{F}$.
Proof. An element $m$ is a finite sum of elements from $\mathbf{F}$ if it is Zeckendorf in $\mathbf{F}$. The converse of this statement follows from Proposition 2. □
Example 3. Consider the primitive Fibonacci sequence of order 2:
$$(\ldots, \phi^{-2}, \phi^{-1}, 1, \phi, \phi^{2}, \ldots),$$
where $\phi = (1+\sqrt{5})/2$ is the golden ratio. The elements that are Zeckendorf in this sequence are precisely the positive elements in the ring $\mathbb{Z}[\phi]$, and this numeration system is known as the golden ratio base.
4. Zeckendorf’s Theorem for Free $\mathbb{Z}$-Modules
Intuitively, if an element $m$ is Zeckendorf in a Fibonacci sequence $\mathbf{F}$ of order $k$, then it is an integer combination of the initial terms (or, in fact, any $k$ consecutive terms) of $\mathbf{F}$. The integer span of these terms forms a module, so it is natural to attempt to represent every element of a module using sequence elements. Section 4.1 below deals with doubly infinite Fibonacci sequences using tools developed in the previous section. The result is then specialized to one-way Fibonacci sequences in Section 4.2. This paves the way for the Zeckendorf decomposition and Fibonacci coding for modules.
4.1. Two-Sided Sequences
Corollary 1 lays a strong foundation for our work. While it characterizes elements that are Zeckendorf in a sequence, it gives insufficient information on the algebraic structure of the set of all elements that have a legitimate representation. The following theorem now reverses the process. It gives a necessary and sufficient condition for every element of a free $\mathbb{Z}$-module $M$ to be Zeckendorf in a Fibonacci sequence $\mathbf{F}$.
Theorem 1. Let $M$ be a free $\mathbb{Z}$-module of rank $l$ and $\mathbf{F}$ be a Fibonacci sequence of order $k$. Then, $M$ is Zeckendorf in $\mathbf{F}$ if and only if both of the following conditions are satisfied:
1. $F_1, F_2, \dots, F_k$ span $M$.
2. $l < k$.
Proof. We first remark that the integral span of $F_1, \dots, F_k$ is the same as the integral span of $F_2, \dots, F_{k+1}$ since $F_{k+1} = F_1 + F_2 + \cdots + F_k$. Thus, $F_{i+1}, \dots, F_{i+k}$ span $M$ for all $i$ if and only if $F_{i+1}, \dots, F_{i+k}$ span $M$ for some $i$.
It is easy to see that the first condition is necessary. Since the elements of $\mathbf{F}$ are integer combinations of $F_1, \dots, F_k$, so must be any sums of the elements from this sequence. It follows that $F_1, \dots, F_k$ span $M$. This readily implies $l \leq k$. We will now show that $l$ cannot be equal to $k$, thus establishing the necessity of the second condition.
Suppose that
, and let
m be any nonzero element of
M. Since both
m and
are Zeckendorf in
, we have:
for some sequences
of 0 and 1 with no
k consecutive 1’s. Hence,
. By Proposition 1,
is
k-equivalent to a sequence that is positive at some
k consecutive terms and zero elsewhere. Thus, we have:
for some
i, in which
. This, however, means that
are integrally dependent, and so they cannot span a module of rank
.
We are left to show that the two conditions are sufficient. Since
spans
M and
,
must be integrally dependent. That is, we may write:
where
are not all zeros. In other words, we have:
where
is zero everywhere except for some
k consecutive terms. We apply Proposition 1 and obtain a sequence
that is
k-equivalent to
and contains
k consecutive terms of the same sign and zero elsewhere. Note that
cannot be zero everywhere since that would imply that
.
Suppose now that
is nonzero at the terms
. This means:
where either
or
. Since
spans
M, we can write any element
as:
where
are integers. Now, we have:
for all integers
N. For a sufficiently large (and possibly negative)
N, the terms
will all be positive, and it follows from Corollary 1 that
m is Zeckendorf in
. This completes the proof of the theorem. □
Corollary 2. Let be a Fibonacci sequence of order k. If are integrally dependent, then every element in the integral span of is Zeckendorf in .
In view of Theorem 1, every integer has a (classical) Zeckendorf representation since is a module over itself with rank 1, and the negaFibonacci sequence is of order 2 with and . In fact, the same result would hold as long as and are relatively prime (and hence span integrally). This is technically the case for the Lucas sequence where and . Next, we give an example for the case of Gaussian integers.
Example 4. Consider a Fibonacci sequence:of order 3 from Example 1. It follows from Theorem 1 that every Gaussian integer can be written as a sum of elements from this sequence with no 3 consecutive terms. For instance: One may observe that , and so the Zeckendorf representation for a Gaussian integer in this sequence is not unique. Nonetheless, we will see in the next subsection how certain adjustments can make unique representations possible.
On the contrary, no Fibonacci sequence of order 2 can generate all Gaussian integers. For example, cannot be written as a sum of elements from the sequence: Note that Corollary 2 does not apply here since 1 and i are not integrally dependent. However, it follows from Corollary 1 that every element of the form , , is Zeckendorf in this sequence.
4.2. One-Sided Sequences and Unique Representation
For a dense code, we need a one-to-one mapping. However, we see in Example 4 that a two-sided sequence does not provide such a luxury. Note that this is observed in the classical Fibonacci sequence as well. It is only when the negaFibonacci sequence is extracted from the two-sided infinite sequence:
that every integer has a unique Zeckendorf representation. In this subsection, we establish analogous results for Fibonacci sequences of higher order. The key ingredient of this development is Corollary 3, which states that binary sequences with no
k consecutive 1’s cannot be
k-equivalent.
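Lemma 1. Let a be an integer, and let be the primitive Fibonacci sequence of order k. Then:
Corollary 3. Let $\mathbf{a}$ and $\mathbf{b}$ be doubly infinite binary sequences with no $k$ consecutive 1’s that are 0 almost everywhere:
1. If $\mathbf{a} \succ \mathbf{b}$, then $\mathbf{a} \cdot \boldsymbol{\Phi} > \mathbf{b} \cdot \boldsymbol{\Phi}$, where $\boldsymbol{\Phi}$ is the primitive Fibonacci sequence of order $k$.
2. If $\mathbf{a} \sim \mathbf{b}$, then $\mathbf{a} = \mathbf{b}$.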
Proof. Suppose that
, and let
a be the largest index where
and
are not equal. It follows that
,
, and
Now, if , then we may assume without loss of generality that . This means , and so . This readily establishes part 2. □
Typically, ∼ means that , and so and are two representations of the same element. Corollary 3 now allows us to identify the Fibonacci sequences (or parts of them) that permit unique Zeckendorf representation: If a sequence has the property that implies ∼, then any element that is Zeckendorf in will have a unique representation. Since we are interested in generating every element in a -module, we look into generalizing negaFibonacci sequence, which is technically a one-sided Fibonacci sequence to the left whose first excluded term is 0. It turns out that this observation generalizes well to modules.
Theorem 2. Let $k \geq 2$ be an integer and $M$ be a free $\mathbb{Z}$-module of rank $k-1$. Let $\mathbf{F}$ be a Fibonacci sequence of order $k$, where $F_0 = 0$ and $F_{-1}, F_{-2}, \dots, F_{-(k-1)}$ span $M$. Then, every element $m \in M$ can be uniquely written as a finite sum of elements from the one-way sequence:
$$F_{-1}, F_{-2}, F_{-3}, \ldots$$
with no $k$ consecutive terms.
Proof. We first prove the existence. Let
and write:
where
are integers. We extend
to a sequence
that is zero everywhere except
. Now, consider all sequences of integers that are
k-equivalent to
such that the terms with positive indices are zero, and the terms with negative indices are non-negative. Such a sequence exists since one may apply
Operation 1 from Definition 3 to
at
for
times. We denote by
the sequence that has the aforementioned properties with highest lexicographical order. We claim that
gives a Zeckendorf representation for
m in
.
We first note that for and , and so . If for some , then we perform Operation 1 at and Operation 2 at to obtain a sequence with higher lexicographical order than . If for some , then performing Operation 2 at results in a sequence with higher lexicographical order. Thus, is binary with no k consecutive 1’s, and is a Zeckendorf representation for m in .
Suppose now for the sake of contradiction that there exists an element
that has two representations as a finite sum of elements from the sequence
with no
k consecutive terms. That is, we have
where
are doubly infinite binary sequences which are zero for
. Now, we keep applying either
Operation 1 or
2 to eliminate the leftmost nonzero term of
and
and obtain sequences
and
, which are zero everywhere except for
. This means:
and
are both equal to
m. Since
span
M of rank
, we must have
for
. If
, then
, and so
and
must be the same from Corollary 3. Otherwise, we assume without loss of generality that
. We now have:
Since
is binary, it follows that
. However, we see the following from Lemma 1:
which is a contradiction. This completes the proof of the theorem. □
Obviously, the generalized Zeckendorf theorem over the negaFibonacci sequence [19] serves as an instance of Theorem 2. We give another example using Gaussian integers.
Example 5. Consider the following Fibonacci sequence: of order 3 from Examples 1 and 4. One can see that every representation given in Example 4 only involves terms from the sequence: In fact, every Gaussian integer can be uniquely written as a sum of elements from this sequence with no three consecutive terms. This property will be exploited when we develop multidimensional Fibonacci coding for Gaussian integers in Example 6 in the next section.
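The integer instance of this uniqueness property (the negaFibonacci case mentioned before Example 5) is small enough to check by brute force. The sketch below enumerates every binary digit string of a given length with no two adjacent 1's over the terms $F_{-1}, F_{-2}, \dots$ of the classical sequence and records which integer each one represents. The function names are ours, and the check is an empirical illustration for a fixed length, not a proof.

```python
from itertools import product


def nega_fibonacci(n):
    """The terms F_{-1}, ..., F_{-n} of the classical sequence: 1, -1, 2, -3, 5, ..."""
    terms, a, b = [], 0, 1                 # a = F_0, b = F_{-1}
    for _ in range(n):
        terms.append(b)
        a, b = b, a - b                    # F_{-(j+1)} = F_{-(j-1)} - F_{-j}
    return terms


def representable(n):
    """Map each represented integer to the list of digit tuples that produce it."""
    terms = nega_fibonacci(n)
    table = {}
    for digits in product((0, 1), repeat=n):
        if any(x and y for x, y in zip(digits, digits[1:])):
            continue                       # skip strings with two adjacent 1's
        total = sum(d * t for d, t in zip(digits, terms))
        table.setdefault(total, []).append(digits)
    return table


table = representable(10)
print(len(table), min(table), max(table))
print(all(len(reps) == 1 for reps in table.values()))
```

With ten terms, the admissible strings hit every integer in a contiguous range exactly once, which is the behaviour the uniqueness statement predicts for this instance.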
5. Multidimensional Fibonacci Coding
The classical Fibonacci code of order 2 is quite simple. By virtue of Zeckendorf’s theorem, integers are written in base Fibonacci and encoded as a string of 0’s and 1’s in reverse order together with a suffix 1. For example, 11 is encoded as 001011 since it is the sum of the third and the fifth Fibonacci numbers, which are 3 and 8, respectively. The widely accepted Fibonacci code of higher order is not so straightforward. Note that simply appending a run of consecutive 1’s no longer makes the code uniquely decodable [7]. The integers are instead mapped to a lexicographically ordered string of 0’s and 1’s with a single run of $k$ consecutive 1’s at the end. This encoding, though artificial, provides dense codes.
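The order-2 construction described above is short enough to write out. The sketch below is a plain illustration with function names of our choosing: the Zeckendorf digits are written least significant first over the Fibonacci numbers 1, 2, 3, 5, 8, ..., and a final 1 is appended so that the codeword ends in the only occurrence of "11".

```python
def fib_encode(n):
    """Classical order-2 Fibonacci codeword of a positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()                              # keep only Fibonacci numbers <= n
    bits = ["0"] * len(fibs)
    for idx in range(len(fibs) - 1, -1, -1):
        if fibs[idx] <= n:                  # greedy Zeckendorf digit
            bits[idx] = "1"
            n -= fibs[idx]
    return "".join(bits) + "1"              # suffix 1 closes the codeword with "11"


def fib_decode(code):
    """Inverse of fib_encode: sum the Fibonacci numbers selected by the digits."""
    total, a, b = 0, 1, 2
    for bit in code[:-1]:                   # the last character is the suffix
        if bit == "1":
            total += a
        a, b = b, a + b
    return total


print(fib_encode(11), fib_decode("001011"))
```

For 11 the digits select 3 and 8, giving the codeword 001011 quoted in the text.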
We see from Theorem 2 that it is possible to write any element of a module as a sum of entries from a sequence using no $k$ consecutive terms. This suggests a multidimensional Fibonacci coding for modules with the use of $k$ consecutive 1’s as a separator. This intuition formalizes into Algorithm 1. Here, we denote $k$ consecutive 1’s, $\underbrace{11\cdots1}_{k}$, by $1_k$.
Algorithm 1 Multidimensional Fibonacci Encoding
Input: A Fibonacci sequence of order k where and span a free $\mathbb{Z}$-module M of rank , and an element .
Output: multidimensional Fibonacci code for m.
1: If , output . END.
2: Find the module coordinate of m under the basis . We will have (2). Set and for .
3: If , subtract from .
4: Find the largest index such that . If there is none, go to Step 5. Otherwise, subtract 1 from , add 1 to , and repeat this step.
5: Find the largest index such that . If there is none, go to Step 6. Otherwise, subtract 2 from , add 1 to and , and go to Step 4.
6: Find the smallest index such that . Output . END.
Note that the coefficients are encoded in reverse order, and Step 2 may depend on the initial condition of the Fibonacci sequence of order $k$. If the initial terms are chosen to be the standard basis of the module, then the coefficients will simply be the coordinates of $m$ in the usual sense. Generally, one may also find the coefficients by solving (2) as a system of linear equations. In the final step, we replace the last 1 with $1_k$ to make our code uniquely decodable. This frees us from having to keep a lookup table but makes the code suboptimal in terms of density.
In addition, note that the algorithm simply mimics the arguments given in the proof of Theorem 2. This guarantees that the algorithm terminates since the operations performed in Steps 4 and 5 increase the lexicographical order of the sequence. At every step, the value of the dot product of the coefficient sequence with the Fibonacci sequence remains unchanged. This fact makes the decoding, which is outlined as Algorithm 2, straightforward.
Algorithm 2 Multidimensional Fibonacci Decoding
Input: A Fibonacci sequence of order k, and a binary string .
Output: An element m.
1: If is empty, output 0. END.
2: Output . END.
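To make the two algorithms more tangible, the sketch below computes the digit sequence for integer vectors. It is written in the spirit of Algorithms 1 and 2 and of the proof of Theorem 2, but it is not the authors' exact procedure, and several choices are assumptions on our part: $M = \mathbb{Z}^{k-1}$, $F_0 = 0$, the terms $F_{-1}, \dots, F_{-(k-1)}$ are the standard basis, digits are stored by depth $j$ (the digit at depth $j$ multiplies $F_{-j}$), and the final suffix step is omitted so that only the digits are returned. The moves rely on $F_{-1} + \cdots + F_{-k} = F_0 = 0$ and on the identity $2F_i = F_{i+1} + F_{i-k}$.

```python
def one_way_terms(k, n):
    """F_{-1}, ..., F_{-n} with F_0 = 0 and a standard-basis start (assumed choice)."""
    dim = k - 1
    terms = [(0,) * dim]                                   # terms[j] holds F_{-j}
    for j in range(1, k):
        terms.append(tuple(int(i == j - 1) for i in range(dim)))
    while len(terms) <= n:                                 # F_{-j} = F_{-(j-k)} - sum of the k-1 terms between
        j = len(terms)
        terms.append(tuple(terms[j - k][i] - sum(terms[j - s][i] for s in range(1, k))
                           for i in range(dim)))
    return terms[1:n + 1]


def _bump(d, j, delta):
    d[j] = d.get(j, 0) + delta
    if d[j] == 0:
        del d[j]


def encode_digits(v, k):
    """Digits at depths 1, 2, ... (0/1, no k consecutive 1's) representing the vector v."""
    if len(v) != k - 1:
        raise ValueError("v must have k - 1 coordinates")
    shift = max(0, -min(v))                                # add shift * (F_{-1} + ... + F_{-k}) = 0
    d = {j + 1: x + shift for j, x in enumerate(v)}
    d[k] = shift
    d = {j: x for j, x in d.items() if x}
    while True:
        top = max(d, default=0)
        runs = [j for j in range(top) if all(d.get(j + s, 0) > 0 for s in range(1, k + 1))]
        if runs:                                           # k positive digits below depth j fold into depth j
            j = min(runs)
            for s in range(1, k + 1):
                _bump(d, j + s, -1)
            if j > 0:                                      # depth 0 is F_0 = 0, so nothing is carried there
                _bump(d, j, +1)
            continue
        big = [j for j, x in d.items() if x >= 2]
        if big:                                            # 2 F_{-j} = F_{-(j-1)} + F_{-(j+k)}
            j = min(big)
            _bump(d, j, -2)
            if j > 1:
                _bump(d, j - 1, +1)
            _bump(d, j + k, +1)
            continue
        return [d.get(j, 0) for j in range(1, top + 1)]


def decode_digits(digits, k):
    """Sum of the selected terms; the empty digit list decodes to the zero vector."""
    terms = one_way_terms(k, len(digits))
    return tuple(sum(b * t[i] for b, t in zip(digits, terms)) for i in range(k - 1))


print(encode_digits((1, -2), 3), decode_digits(encode_digits((1, -2), 3), 3))
```

Every move keeps the represented vector unchanged, so decoding is just the weighted sum of the sequence terms, which mirrors the remark that decoding is straightforward.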
To illustrate, we consider once again the Gaussian Fibonacci sequence of order 3 used throughout Examples 1, 4, and 5.
Example 6. Consider the following Fibonacci sequence: of order 3, and let . To encode, we initialize , and obtain , , after Step 3. We shorthand this sequence as . The iterations of Steps 4 and 5 now yield: Thus, is encoded as 10110000111. This string can then be decoded as . The length of the encoded Gaussian integer where under this initial condition is illustrated in Figure 1. Here, the region of the same color represents the Gaussian integers of the same encoded length using the basis given in this example.
Clearly, one may replace i in Example 6 with any other quadratic integer and obtain a multidimensional Fibonacci coding for the corresponding ring of quadratic integers. Next, we give an example of multidimensional Fibonacci coding for a lattice.
Example 7. Consider the lattice given by: This lattice provides optimal sphere packing and kissing number in eight dimensions and has many other interesting properties [36]. We set the basis for as:
and consider a Fibonacci sequence of order 9 given by: It follows from Theorem 2 that every element of can be written as a sum of elements from this sequence with no nine consecutive terms. For example, let . Then, and can be encoded as 00100001010111111111.
6. Compression Efficiency and Simulation Results
In this section, we give an overview of the compression efficiency of the proposed scheme, both theoretically and numerically. We refer interested readers to [1] for a comprehensive survey on the topic of data compression.
Let
be the dominating root of
, i.e., the
-order golden ratio. It takes approximately
bits to encode a single integer
A using the Fibonacci code of order
k. Unfortunately, due to the erratic nature of the higher-order number system, a simple formula cannot be made to approximate the number of bits needed to encode the string
using our proposed algorithm (see also
Figure 1). Here, we shall attempt to give a very crude estimate. Theorem 2 guarantees a one-to-one correspondence between module elements and sequences of 0’s and 1’s with no
k consecutive 1’s. Consider a
k-dimensional hypercube with sides parallel to the axes, and the origin and
are opposite vertices. It takes at least
bits to assign each integer point in this hypercube a unique binary string under multidimensional Fibonacci coding. Thus, if
are sufficiently large, one may estimate the number of bits needed to encode them altogether as
. If
A is the geometric mean of
, then the per-element average of the number of bits used to write each of the
’s is
. This can be interpreted as the burden of the
k-bit suffix being shared across the string entries.
While there exist several modern compression techniques [3,4,5,37], each with different strengths and trade-offs, we compare a multidimensional Fibonacci code with the classical one, using the well-established Huffman code as a benchmark. Figure 2 plots the average length of the codewords as a function of the number of elements in the source data on a logarithmic scale for Huffman, classical Fibonacci (Fib2, Fib3, Fib4), and multidimensional Fibonacci (MultiFib3, MultiFib4) coding. Next, we define the normalized compression ratio as the average codeword length divided by the entropy of the source data and present the comparison of this ratio in Figure 3. Here, we see that multidimensional Fibonacci codes outperform their classical counterparts across all orders when $n$ is sufficiently large (≈100). This is not surprising, since the burden of the suffix is shared across several symbols in the proposed scheme.
Next, Figure 4 and Figure 5 consider the case when the distribution of the source data follows Zipf’s law, which is known to typically govern the distribution of letters and words in a natural language [38]. Here, it turns out that the performance of multidimensional Fibonacci codes is slightly behind that of their classical counterparts. This phenomenon could probably be explained by the fact that the multidimensional encoder is held back when “frequent” and “infrequent” symbols are encoded together in the same group. We give a quick example to illustrate this circumstance.
Example 8. Suppose that three strings 2 2, 10 10, and 10 2 are to be encoded with the classical and multidimensional Fibonacci coding of order 3 (Fib3 and MultiFib3, respectively), where Fib3 encodes each number separately and MultiFib3 encodes the two numbers together. The results are given in Table 1. Here, MultiFib3 performs better when the input integers are equally large but gains no reduction in codeword length when one of the inputs is significantly reduced.
In summary, multidimensional Fibonacci coding presents a trade-off: encoding a group of data as a whole reduces the use of the delimiter, thus improving the compression rate. However, as the encoding is performed collectively rather than individually, the system suffers a disadvantage when numbers that are considerably different in magnitude are encoded together. This could be particularly troublesome in a skewed distribution such as Zipf, where an input with high frequency is not mapped to the short codeword it deserves and may not bring much benefit to the block it belongs to. Finally, we remark here that while the Huffman code outperforms the Fibonacci code numerically, it works well only when the alphabet is finite and the distribution is known a priori. The Fibonacci code is also advantageous in that it provides robustness against errors and works naturally on the set of integers. In this sense, the multidimensional Fibonacci code could be of particular interest when integers of unknown sign and magnitude are to be encoded.
7. Conclusions
In this paper, we generalize Zeckendorf’s theorem to modules. The notion of equivalent sequences allows us to identify elements that can be represented as a sum of entries from a Fibonacci sequence of order $k$. This results in necessary and sufficient conditions for a Fibonacci sequence of higher order to generate a module. In addition, under certain circumstances the representation is unique, allowing us to establish multidimensional Fibonacci coding for modules. Future work involves identifying other conditions under which the representation remains unique. Under such conditions, one can view the Zeckendorf representation as a number system and develop a generalized Zeckendorf arithmetic. It would also be interesting to study the proposed coding algorithm from the perspective of data compression and computational complexity.