Enhanced Binary MQ Arithmetic Coder with Look-Up Table

Department of Electronics and Communication Engineering, Kwangwoon University, Seoul 01897, Korea
Information 2021, 12(4), 143
Submission received: 12 February 2021 / Revised: 20 March 2021 / Accepted: 22 March 2021 / Published: 26 March 2021
(This article belongs to the Section Information Theory and Methodology)


Binary MQ arithmetic coding is widely used as a basic entropy coder in multimedia coding systems. The MQ coder is valued for its high compression efficiency and is used in JBIG2 and JPEG2000. The importance of arithmetic coding has grown since it was adopted as the sole entropy coder in the HEVC standard. In the binary MQ coder, a multiplication-free arithmetic approximation is used in the recursive subdivision of the range interval. Because of the MPS/LPS exchange that this approximation causes in the MQ coder, the number of output bytes tends to increase. This paper proposes an enhanced binary MQ arithmetic coder that uses a look-up table (LUT) for (A × Qe), built by quantization, to improve the coding efficiency. Multi-level quantization using 2-level, 4-level and 8-level look-up tables is proposed. Experimental results on binary documents show about 3% improvement for basic context-free binary arithmetic coding. For the JBIG2 bi-level image compression standard, compression efficiency improved by about 0.9%. For lossless JPEG2000 compression, the compressed size decreased by 1.5% using the 8-level LUT. For lossy JPEG2000 coding, the gain is a little lower, about 0.3% improvement in PSNR at the same rate.

1. Introduction

Since Shannon showed in 1948 that a message can be expressed with the fewest bits based on the probability of occurrence of its symbols, many studies on entropy coding have been conducted. A typical example is Huffman coding, which is simple and performs relatively well, so it has been widely used as an entropy coding method in document and image compression [1,2,3].
In theory, a close-to-optimal compression rate can be obtained using Huffman coding. To do this, we need to enlarge the alphabet, i.e., encode the whole message as one letter of a huge alphabet. However, this cannot be achieved in practice because building such a Huffman code tree is far too complex. This was the reason why arithmetic coding (AC) was developed. AC can build the codeword for a whole message with acceptable complexity. The AC procedure has its roots in the Shannon–Fano code developed by Elias (unpublished), which was analyzed by Jelinek [4]. The procedure for the construction of a prefix-free code is due to Gilbert and Moore [5]. The extension of the Shannon–Fano–Elias code to sequences is based on the enumerative methods in Cover [6] and was described with finite-precision arithmetic by Pasco [7] and Rissanen [1,8].
Arithmetic coding has the disadvantage that the required precision grows as the probability interval [0, 1] is repeatedly subdivided and expressed as a fractional value. To circumvent the precision problem posed by partitioning the real interval [0, 1], one uses instead the set of integers [0, N]. To handle the precision issue, renormalization is performed, and output bits are used to indicate that renormalization has occurred.
The multi-symbol arithmetic coding was implemented in hardware and software by Witten et al. [9] and Langdon [10] in the 1980s. Afterwards, the arithmetic code was applied to binary data to reduce computational complexity. In the 1990s, binary arithmetic coding began to be applied to still-image coding such as JPEG and JBIG.
Arithmetic coders have been studied extensively, and many adaptive methods have been published [10,11,12,13]. Many researchers have pursued probability estimation techniques that provide a simple yet robust mechanism for adaptive estimation of probabilities during the coding process [14,15,16]. An early successful adaptive binary arithmetic coder (BAC) was the Q-coder proposed by IBM [17]. In the BAC, the binary symbols are classified into the Most Probable Symbol (MPS) and the Least Probable Symbol (LPS), each of which is assigned to 0 or 1. The Q-coder has a mechanism for adaptive encoding based on the probability estimated from the input sequence during the encoding process. The size of the hardware was reduced by limiting the precision of the probability interval to 16 bits. However, various problems may occur due to the limited register size. Among them, the carry propagation problem in the renormalization process is solved by bit stuffing. That is, when eight consecutive 1s occur, a 0 bit is inserted. Above all, in the process of continuously dividing the probability interval, renormalization is performed to keep the value of the current interval within [0.75, 1.50), and under this condition multiplication by the LPS probability can be avoided.
BAC has the advantages of high accuracy and good compression performance. However, interval subdivision requires a multiplication operation, which poses a severe implementation challenge. Moreover, the probabilities of input symbols are often unknown, so estimating them accurately is another important issue. During the standardization of multimedia coding, the MQ-coder [18,19] and M-coder [20,21,22,23] showed the best performance. These are table-based coders, in which multiplication is avoided by restricting the interval to a certain range.
As another multiplication-free arithmetic coding approach, Mitchell et al. [24] developed a log arithmetic encoder/decoder that substitutes addition for multiplication by working in the logarithmic domain. However, the scheme fails to address the relationship between the original domain and the logarithmic domain. The domain-switching and probability-estimation processes require large amounts of memory and more computation for only a minor improvement in coding efficiency [14].
To address this problem of the logarithmic encoder, a new logarithmic BAC with an adaptive probability estimation process was developed to improve coding performance compared with the existing MQ-coder and M-coder. However, the encoder–decoder complexity increases considerably for only small savings [14,25].
As another adaptive BAC, the adaptive binary range coder (ABRC) [15] uses a virtual sliding window (VSW) [26] for probability estimation, which does not require look-up tables. The VSW estimation provides faster probability adaptation at the initial encoding/decoding stage and, in particular, more precise probability estimation for very low entropy binary sources. However, it needs multiplication operations and more complex encoder–decoder processing.
In the 1990s, standardization of image coding was established. The arithmetic encoder was adopted as entropy encoding along with Huffman encoding in the compression method of still images known as JPEG [27]. Huffman coding was adopted as the baseline profile and arithmetic coding was selected as the extended profile. The JPEG committee chose the QM arithmetic coding method jointly proposed by IBM, Lucent, and Mitsubishi [2]. The QM arithmetic coding method applied MPS/LPS conditional exchange to increase the coding efficiency of Q-coder. The probability estimation was improved to enable fast initial learning. Q-coder is suitable for hardware implementation, while QM coder is suitable for implementation in software. Q-coder solved carry propagation in the decoder; however, the QM coder solved it in the encoder. Probability estimation is performed using the next estimated probability table and when the probability of the MPS is less than 0.5, MPS/LPS exchange is performed.
Subsequently, a compression standard named JBIG was established as a new standard for compressing binary documents [28]. In the JBIG compression standard, unlike JPEG, arithmetic coding was adopted as the baseline profile. JBIG is based on the Q-coder with a relatively minor refinement by Mitsubishi, resulting in what became known as the QM coder. In 2000, the JBIG2 binary coding standard was established and the MQ arithmetic coder was adopted instead of the QM arithmetic coder [18]. The MQ coder changed the carry propagation method: one 0 bit is inserted only when the byte 0xFF is output. The bit stuffing of the MQ coder is efficient in bandwidth and execution time. In 2004, a new still image coding method, the so-called JPEG2000, was established as a lossless/lossy image compression standard [29]. In the JPEG2000 standard, MQ arithmetic coding was selected as the baseline profile, replacing Huffman coding. H.264/AVC, a video compression standard, adopted the so-called context adaptive binary arithmetic coding (CABAC) method as a standard entropy coding method. High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as a successor to the widely used H.264/AVC. In comparison to H.264/AVC, HEVC offers considerably better data compression at the same level of video quality. In addition, HEVC chose the M-coder as its sole entropy coder [30].
Recently, to increase the coding efficiency of the M-coder, a convolutional neural network (CNN), a technique that has been successful in object classification, was applied to compress the syntax elements of the intra-predicted residues in HEVC. Instead of manually designing the binarization process and context models, the probability distributions of the syntax elements are estimated directly by training the CNN. BD-rate reductions are obtained in almost all test sequences. However, the encoding and decoding complexity is quite high, and an AC structure suitable for implementation still needs to be developed [31]. Probability estimation issues such as VSW (virtual sliding window) and CNN-based estimation are not dealt with in this paper.
In this paper, an enhanced MQ-coder is proposed to obtain better compression performance while maintaining the multiplication-free property. The existing binary MQ arithmetic coder approximates (A × Qe) by Qe in order to remove multiplication. When the Q-coder was proposed, computing power was limited, and this approximation was reasonable at that time. Because computing power has increased enormously since then, there is considerable room to improve the approximation. With this motivation, using the existing probability estimation table for Qe, multi-level quantization tables of (A × Qe) are proposed for use as a look-up table (LUT). The technical details are described in Section 3, and the performance is verified by coding experiments with JBIG2 and JPEG2000.

2. Related Works

2.1. Shannon–Fano–Elias Arithmetic Coder

When trying to efficiently compress an input sequence composed of the symbols 0 and 1, the probability of occurrence of each symbol is used. According to Shannon’s entropy theory and the lossless coding theorem, a shorter codeword is assigned to a symbol with a high probability of occurrence and a longer codeword is assigned to a symbol with a low probability of occurrence. Real documents can be compressed very efficiently in this way.
Let the current range be represented by the interval [Lower, Upper), with the initial range [Lower, Upper) = [0, 1). When one symbol is input, the current range is subdivided by the following Equations (1) and (2), where both right-hand sides use the values of Lower and Upper from before the update.
Lower ← Lower + (Upper − Lower) × CDF(n − 1)    (1)
Upper ← Lower + (Upper − Lower) × CDF(n)    (2)
Here, CDF(n) denotes the cumulative probability up to and including the input symbol n, so that CDF(n) − CDF(n − 1) is the probability of that symbol. The output is produced after the whole sequence has been input. The length of the final range is equal to the product of the probabilities of occurrence of all symbols. As each input bit enters, the range continuously decreases, and a fractional value inside the final range, expressed in binary, represents the sequence. As the input sequence grows, the divided probability interval (range) keeps shrinking, and the number of bits required to represent it grows rapidly, so it becomes difficult to maintain the required precision.
To solve this precision problem of the arithmetic encoder, whenever the range value falls below 1/2, the probability interval is doubled through renormalization (scaling), and output bits are generated during the renormalization process.
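As a rough illustration of the interval update in Equations (1) and (2), the following Python sketch narrows the interval for a short binary sequence using exact fractions. The function name `narrow_interval` and the fixed probability `p0` of symbol 0 are assumptions made for this example, and renormalization is left out.

```python
from fractions import Fraction

def narrow_interval(bits, p0):
    """Apply the interval update of Equations (1) and (2) to each bit.

    p0 is the (fixed, assumed) probability of symbol 0, so the CDF
    boundaries for the binary alphabet are 0, p0 and 1.
    """
    cdf = [Fraction(0), Fraction(p0), Fraction(1)]
    lower, upper = Fraction(0), Fraction(1)
    for s in bits:
        width = upper - lower               # width before the update
        upper = lower + width * cdf[s + 1]  # uses the old Lower
        lower = lower + width * cdf[s]
    return lower, upper

lower, upper = narrow_interval([0, 0, 1], Fraction(3, 4))
print(lower, upper, upper - lower)  # 27/64 9/16 9/64
```

The final width 9/64 equals 3/4 × 3/4 × 1/4, the product of the symbol probabilities, which is the property stated in the text.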

2.2. Binary MQ Arithmetic Coder (ITU-T Recommendation T.88, 2000)

As shown in Figure 1, the binary MQ arithmetic encoder takes the sequence of input bits (D) and the corresponding context (CX) and generates the compressed binary output (CD). Each input is a bit, 0 or 1. The MQ arithmetic coder first determines which of 0 and 1 is the MPS and which is the LPS, and compares the current input bit with the MPS value to determine whether it is an MPS. Encoding is performed using a probabilistic model for each context of the symbol.
The current range is divided into two sub-ranges, and one is selected according to the input bit value. In dividing the range, the MPS sub-range is positioned above the LPS sub-range. The coding state is represented by a base value (stored in the C register) and an interval value (stored in the A register). The output code string is formed from the base (C) value as encoding proceeds. At the beginning of encoding, the base C is initialized to 0 and the range A is initialized to 0.75 in decimal, i.e., 0x8000 in hex. Renormalization is performed when necessary to keep the range A within 0.75 ≤ A < 1.5.
That is, if the value of A is less than 0.75, the value of A is doubled, and the value of C is also doubled each time A is doubled. This operation is repeated until A is at least 0.75. The A register has 16 bits and the C register has 32 bits.
The lower 16 bits of the C register are aligned with the A register. If the count value becomes 0 while performing renormalization, the 1-byte value at the b position of the C register is output: it is removed from the C register and placed in the output data buffer. If the estimated LPS occurrence probability is Qe, the divided range A is obtained by the following Equations (3) and (4).
Split interval for MPS = A − (A × Qe)    (3)
Split interval for LPS = A × Qe    (4)
In Equations (3) and (4), one multiplication is required for each input bit, so the computational cost is high. Therefore, it is assumed that the value of A is close to 1, so that (A × Qe) can be approximated by Qe [14]. With the multiplication removed, Equations (3) and (4) are approximated as Equations (5) and (6).
Split interval for MPS = A − Qe    (5)
Split interval for LPS = Qe    (6)
Whenever the MPS is coded, the value of Qe is added to the C register and the interval is reduced to A − Qe. Whenever the LPS is coded, the C register is left unchanged and the interval is reduced to Qe. The precision range required for A is then restored, if necessary, by renormalization of both A and C. With the process illustrated above, the approximations in the interval subdivision process can sometimes make the LPS sub-interval larger than the MPS sub-interval. If, for example, the value of Qe is 0.5 and A is at the minimum allowed value of 0.75, the approximate scaling gives 1/3 of the interval to the MPS and 2/3 to the LPS.
To avoid this size inversion, the MPS and LPS intervals are exchanged whenever the LPS interval is larger than the MPS interval. This MPS/LPS conditional exchange can only occur when a renormalization is needed. Whenever renormalization occurs, a probability estimation process is invoked which determines a new probability estimate for the context currently being coded. No explicit symbol counts are needed for the estimation. The relative probabilities of renormalization after coding an LPS and MPS provide an approximate symbol counting mechanism which is used to directly estimate the probabilities.
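The size inversion described above can be checked with a few lines of exact arithmetic. The sketch below compares the approximate split of Equations (5) and (6) with the exact split of Equations (3) and (4) for A = 0.75 and Qe = 0.5; the helper names `approx_split` and `exact_split` are chosen for this example only.

```python
from fractions import Fraction

def approx_split(a, qe):
    """(MPS, LPS) sub-interval sizes under the approximation A*Qe ~ Qe."""
    return a - qe, qe

def exact_split(a, qe):
    """(MPS, LPS) sub-interval sizes with the true product A*Qe."""
    return a - a * qe, a * qe

a, qe = Fraction(3, 4), Fraction(1, 2)
mps, lps = approx_split(a, qe)
print(mps / a, lps / a)    # 1/3 2/3
print(lps > mps)           # True: the LPS sub-interval is the larger one
print(exact_split(a, qe))  # with the exact product both halves are 3/8
```

With the exact product the two sub-intervals are equal for Qe = 0.5, so the inversion in this case comes entirely from the approximation.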

2.2.1. MPS Encoder in Binary MQ Coder (CODEMPS)

The CODEMPS procedure shown in Figure 2a usually reduces the interval A to the MPS sub-interval A − Qe and adjusts the code register C so that it points to the base of the MPS sub-interval.
If, after the subtraction, the A register is still at least 0.75 (=0x8000), the base C is increased by Qe and no renormalization is needed. If the reduced A value is less than 0.75, the two sub-intervals are compared: if A − Qe is at least Qe, the base C is increased by Qe; otherwise, if A − Qe is less than Qe, the interval sizes are inverted, so the LPS sub-interval is coded instead and the A value is changed to Qe. Note that the size inversion cannot occur unless a renormalization (RENORME) is required after the coding of the symbol. When A has fallen below 0.75, the probability estimate update changes the Index according to the next MPS (NMPS) column in Table 1, and renormalization is performed.
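The CODEMPS decision logic can be sketched in Python as follows. This is a minimal sketch written from the description in this section rather than from Figure 2a: the function name `code_mps` and the returned `needs_renorm` flag are assumptions, and the probability update and RENORME themselves are left to the caller.

```python
A_MIN = 0x8000  # register value that corresponds to A = 0.75

def code_mps(a, c, qe):
    """Code one MPS; return (a, c, needs_renorm).

    a and c are the interval and code registers, qe is the current
    LPS estimate, all as integers.
    """
    a -= qe                      # tentative MPS sub-interval A - Qe
    if a >= A_MIN:
        return a, c + qe, False  # usual case: no renormalization
    if a < qe:
        a = qe                   # sizes inverted: code the lower (LPS-sized) part
    else:
        c += qe                  # MPS sub-interval sits above the LPS one
    return a, c, True            # caller updates Index via NMPS, then renormalizes

print(code_mps(0xF000, 0, 0x1801))  # (55295, 6145, False)
print(code_mps(0x8000, 0, 0x5601))  # (22017, 0, True): conditional exchange
```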

2.2.2. LPS Encoder in Binary MQ Coder (CODELPS)

The CODELPS procedure shown in Figure 2b usually consists of scaling the interval to Qe, the probability estimate of the LPS determined from the Index. The upper interval is calculated first so that it can be compared with the lower interval to confirm that Qe is the smaller one. If the interval sizes are inverted, however, the conditional MPS/LPS exchange occurs and the upper interval is coded. In either case, the probability estimate is updated: if the SWITCH flag for the Index is set, the MPS is inverted, and a new Index is saved as determined from the next LPS (NLPS) column in Table 1. The procedure is always followed by a renormalization (RENORME).
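The interval part of CODELPS can be sketched in the same way. The function name `code_lps` is an assumption for this example; the SWITCH handling, the NLPS update and the renormalization that always follows are not shown.

```python
def code_lps(a, c, qe):
    """Code one LPS; return the new (a, c). Renormalization always follows."""
    upper = a - qe         # size of the upper (MPS) sub-interval
    if upper < qe:
        return upper, c + qe   # sizes inverted: code the upper sub-interval
    return qe, c               # usual case: interval shrinks to Qe, C unchanged

print(code_lps(0xF000, 0, 0x1801))  # (6145, 0)
print(code_lps(0x8000, 0, 0x5601))  # (10751, 22017): conditional exchange
```

In both branches the new A is below 0x8000, which is why the text says the LPS is always followed by RENORME.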

2.2.3. Renormalization (RENORME) and Byte-Out (BYTEOUT) in Binary MQ Coder

The RENORME procedure for the encoder is illustrated in Figure 3a. Both the interval register A and the code register C are shifted, one bit at a time. The number of shifts is counted in the counter CT and when CT is counted down to zero, a byte of compressed data is removed from C by the procedure BYTEOUT shown in Figure 3b. Renormalization continues until A is no longer less than 0x8000.
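A simplified view of the shift loop is sketched below. It is not the full procedure of Figure 3: the name `renorm` is hypothetical, and instead of calling BYTEOUT the sketch only counts how often CT reaches zero and reloads CT with 8, ignoring the bit stuffing that the real byte-out step performs.

```python
def renorm(a, c, ct):
    """Shift A and C left until A >= 0x8000; return (a, c, ct, byte_events)."""
    byte_events = 0
    while a < 0x8000:
        a <<= 1
        c <<= 1
        ct -= 1
        if ct == 0:
            byte_events += 1  # the real coder would call BYTEOUT here
            ct = 8            # assumption: bit stuffing after 0xFF is ignored
    return a, c, ct, byte_events

print(renorm(0x29FF, 0x10, 12))  # (43004, 64, 10, 0): two shifts
print(renorm(0x5601, 0, 1))      # (44034, 0, 8, 1): one shift, one byte event
```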

2.2.4. Adaptive Coding with Probability Estimation Table

The probability estimation state machine consists of several sequences of probability estimates. These sequences are interlinked in a manner which provides probability estimates based on approximate symbol counts derived from the arithmetic coder renormalization [16].
Table 1 shows the Qe value associated with each index. The Qe values are expressed as hexadecimal integers and decimal fractions. To convert the 15-bit integer representation of Qe to the decimal probability, the Qe values are divided by (4/3) × (0x8000).
The estimator can be envisioned as a finite-state machine—a table of Qe indexes and associated next states for each type of renormalization (i.e., new table positions). The change in state occurs only when the arithmetic coder interval register is renormalized. This is always done after coding the LPS and whenever the interval register is less than 0x8000 (0.75 in decimal notation) after coding the MPS. After an LPS renormalization, NLPS gives the new index for the LPS probability estimate. After an MPS renormalization, NMPS gives the new index for the LPS probability estimate. If Switch is 1, the MPS symbol sense is reversed [18].
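The state update can be sketched as a table look-up. The six rows below are the first entries of the estimation table as commonly reproduced for the MQ coder; they are transcribed here as an assumption and should be checked against Table 1. The function name `update_state` is hypothetical.

```python
# (Qe, NMPS, NLPS, SWITCH) for indices 0..5; assumed values, verify against Table 1.
TABLE = [
    (0x5601, 1, 1, 1),
    (0x3401, 2, 6, 0),
    (0x1801, 3, 9, 0),
    (0x0AC1, 4, 12, 0),
    (0x0521, 5, 29, 0),
    (0x0221, 38, 33, 0),
]

def update_state(index, mps, coded_lps):
    """Return the new (index, mps) after a renormalization.

    coded_lps is True for an LPS renormalization and False for an
    MPS renormalization.
    """
    _qe, nmps, nlps, switch = TABLE[index]
    if coded_lps:
        if switch:
            mps = 1 - mps  # the MPS symbol sense is reversed
        return nlps, mps
    return nmps, mps

print(update_state(0, 0, True))   # (1, 1): SWITCH is set at index 0
print(update_state(1, 1, False))  # (2, 1)
```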
The LPS probability estimation table of the MQ arithmetic coder differs from that of the QM arithmetic coder [16] adopted in JPEG: the QM coder has 112 probability estimation values, whereas the MQ coder has a much smaller table of 47 entries (indices 0 to 46). The table contains three probability model sequences that start from the same initial value (0x5601) at indices 0, 6 and 14. They differ in how quickly the estimated probability of LPS occurrence converges toward zero when 0 or 1 occurs continuously.

2.3. M-Coder (Marpe, D)

A new design of a family of multiplication-free binary arithmetic coders has been proposed for H.264/AVC coding [22,23]. Its main innovative features are given by a table-based interval subdivision coupled with probability estimation based on a finite-state machine (FSM) as well as a fast bypass coding mode. This so-called modulo (M) coder family of binary arithmetic coding schemes offers a parameterizable trade-off between coding efficiency and memory requirements for the underlying lookup tables.
The basic idea of the low-complexity M-coder approach of interval subdivision is to quantize the admissible domain D = [2^(b−2), 2^(b−1)) for the range register R induced by renormalization into a small number of K different cells. To further simplify matters, we assume a uniform quantization of D to be applied, resulting in a set of representative equispaced range values Q = {Q_0, Q_1, …, Q_(K−1)}, where K is further constrained to be a power of 2, i.e., K = 2^k for a given integer k ≥ 0. By a suitable discretization of the range of LPS-related probability values p_LPS ∈ (0, 0.5], a representative set P = {p_0, p_1, …, p_(M−1)} of probabilities can be constructed together with a set of corresponding transition rules for FSM-based probability estimation.
Both discrete sets P and Q together enable an approximation of the multiplication operation p_LPS × R for interval subdivision by means of a 2-D table RTAB that contains all M × K pre-calculated product values {p_m × Q_k | 0 ≤ m < M; 0 ≤ k < K} in a suitably chosen integer precision. The entries of the RTAB table can be easily addressed by using the probability state index m and the quantization cell index k related to the given value of R. The RTAB example is partially shown in Table 2 when M equals 64 and K is 4 with b being 10.
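The table-based subdivision can be sketched as follows for b = 10 and K = 4. The probability set `P` with four entries and the choice of cell midpoints as representative values Q_k are assumptions for illustration only; they are not the 64-state table of the standard shown in Table 2.

```python
B = 10           # parameter b from the text, so R lies in [256, 512)
K_BITS = 2       # k, giving K = 2**k = 4 quantization cells
K = 1 << K_BITS
CELL = 1 << (B - 2 - K_BITS)  # width of one cell (64)

def cell_index(r):
    """Quantization cell of R within D = [2**(B-2), 2**(B-1))."""
    return (r >> (B - 2 - K_BITS)) & (K - 1)

P = [0.5, 0.25, 0.125, 0.0625]  # hypothetical LPS probabilities
Q = [(1 << (B - 2)) + k * CELL + CELL // 2 for k in range(K)]  # cell midpoints
RTAB = [[int(p * q) for q in Q] for p in P]  # M x K pre-calculated products

def lps_range(m, r):
    """Approximate p_m * R by a table look-up, with no multiplication."""
    return RTAB[m][cell_index(r)]

print(Q)                  # [288, 352, 416, 480]
print(lps_range(1, 511))  # 120
```

Only a shift, a mask and an index operation are needed at coding time, which is the trade-off between table size and accuracy described in the text.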
H.264/AVC is standardized to use M-coder [22] and HEVC adopted it as a default entropy coding to remove Variable length codes (VLC) based on Golomb coding [23].

3. Proposal of Enhanced Binary MQ Coder

3.1. Problem of Existing Binary MQ Coder

As explained in Section 2, the binary MQ coder classifies symbols into MPS and LPS and performs interval subdivision by approximating (A × Qe) by Qe under the assumption that A is close to 1. The disadvantage of this method is that if Qe becomes larger than A/2, the MPS and LPS sub-intervals are inverted. The error decreases as A gets closer to 1 through the renormalization process. Focusing on this problem, this paper devises a method that increases the coding efficiency while maintaining multiplication-free operation by using table look-up for the (A × Qe) value. This approach gives a more accurate estimate of (A × Qe) than simply approximating it by Qe. The improvement is shown through experiments on compression systems in which the MQ coder is the standard entropy coder.

3.2. Proposed LPS Probability Estimation Look Up Table Method

We propose a table look-up method that improves coding efficiency without direct multiplication in the current MQ coder. Coding arithmetically in the manner of Elias coding, as in Equations (3) and (4), requires a multiplication to encode each binary symbol. In this paper, quantization is applied to approach the efficiency of the original Elias method while keeping the structure of the current MQ coder. In the current MQ arithmetic coder, (A × Qe) in Equations (3) and (4) is replaced with Qe, as in Equations (5) and (6), on the assumption that A is close to 1. Because this is an approximation, some coding efficiency may be lost. To reduce this shortcoming, as shown in Equations (7) and (8), we propose selecting the (A × Qe) value, instead of Qe, from a look-up table (LUT) that quantizes (A × Qe) into 2, 4 or 8 levels according to the range of A values.
MPS: A = A − LUT(A × Qe), C = C + LUT(A × Qe)    (7)
LPS: A = LUT(A × Qe)    (8)
Before a new (MPS or LPS) symbol is encoded, renormalization ensures that the value of A lies between 0.75 and 1.5. When the index of the MQ arithmetic coder is 0, the LPS probability Qe is 0.50395 (=0x5601), as shown in the original probability estimation table (Table 1), so the (A × Qe) value lies between 0.75 × 0.50395 (≈0x4082) and 1.5 × 0.50395 (≈0x8104). In this paper, we propose a LUT method with 2, 4 and 8 quantization levels.
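The interval update of Equations (7) and (8) can be sketched as below. This is only an illustration under stated assumptions: A is split into uniform cells over [0x8000, 0x10000), each cell is represented by its midpoint, and the parameters α and β are not modeled, so the entries are not those of Tables 4, 7 or 10. The names `cell_index`, `build_lut` and `code_symbol` are hypothetical, and conditional exchange and renormalization are omitted.

```python
A_MIN = 0x8000  # register value for A = 0.75

def cell_index(a, levels):
    """Uniform cell of A within [0x8000, 0x10000); levels is 2, 4 or 8."""
    shift = 15 - (levels.bit_length() - 1)
    return (a - A_MIN) >> shift

def build_lut(qe_values, levels):
    """Precompute approximate A*Qe: one row per Qe, one column per cell."""
    width = 0x8000 // levels
    reps = [A_MIN + k * width + width // 2 for k in range(levels)]  # midpoints (assumed)
    # (x * 3) >> 17 divides by (4/3) * 0x8000, the register value of 1.0
    return [[(rep * qe * 3) >> 17 for rep in reps] for qe in qe_values]

def code_symbol(a, c, index, is_mps, lut, levels):
    """Equations (7) and (8): table look-up only, no multiplication."""
    q = lut[index][cell_index(a, levels)]
    if is_mps:
        return a - q, c + q
    return q, c

lut2 = build_lut([0x5601], levels=2)
print([hex(v) for v in lut2[0]])
print(code_symbol(0x9000, 0, 0, True, lut2, 2))
```

The multiplications occur only once, when the table is built; the coding step itself uses a subtraction, a shift and an index operation.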

3.2.1. 2-Level LUT

As shown in Figure 4, when there are two quantization levels, the entire range of A is divided into two: a value of A from 0x8000 to 0xC000 is quantized to A1, whereas a value from 0xC000 to 0x10000 is quantized to A2. The quantization value is set according to two modes, as shown in Table 3. Mode-1 sets each quantization level to the midpoint of its interval. In mode-2, the quantization values are chosen to divide the entire interval equally. The parameters α and β are used to find the optimal quantization values, which are determined experimentally for each compression method: non-contextual MQ, context-based JBIG2 and JPEG2000. We seek the best compression over the different modes and, where possible, select the optimum LPS look-up table for the MQ coder. The 2-level look-up tables for (Ai × Qe) are shown in Table 4. The compression ratio is expected to change as the α and β values are adjusted.

3.2.2. 4-Level LUT

When the quantization level is set to 4, the interval A is divided into four sections, as shown in Table 5 and Figure 5.
The values A1–A4 are set variably according to four modes; the quantization values for each divided interval are determined as shown in Table 6. The quantization LUT values are assigned to {A1 × Qe, A2 × Qe, A3 × Qe, A4 × Qe}, and the parameters α and β are determined experimentally to find the optimal quantization values. Table 7 shows LUT examples for the 4-level quantized values of Ai × Qe.

3.2.3. 8-Level LUT

When the quantization level is set to 8, the interval A is divided into eight sections, as shown in Table 8 and Figure 6, and the look-up table is obtained by using the center value of each divided interval as its representative value (A1–A8). The quantization values of A1–A8 for each divided interval are determined as shown in Table 9.
The quantization values are assigned to {A1 × Qe, A2 × Qe, A3 × Qe, A4 × Qe, A5 × Qe, A6 × Qe, A7 × Qe, A8 × Qe}, as shown in Table 10, and the parameters α and β are set experimentally to find the optimal quantization values. For example, in Mode-1, assuming that the Qe value is 0x5601 (index 0) and that α and β are equal to 1, (A1 × Qe) is 0x47AC, (A2 × Qe) is 0x4ED7, (A3 × Qe) is 0x5602, (A4 × Qe) is 0x5D2D, (A5 × Qe) is 0x6457, (A6 × Qe) is 0x6B82, (A7 × Qe) is 0x72AD and (A8 × Qe) is 0x79D8. Table 10 shows part of the (Ai × Qe) look-up table of size 47 × 8 when α and β are 1.
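The eight entries quoted above for Qe = 0x5601 can be inspected numerically. The short sketch below divides each entry by Qe to recover the implied representative value Ai. The comparison with (9 + i)/12 is only an observation made from these eight numbers, not a definition taken from Table 9.

```python
QE = 0x5601
QUOTED = [0x47AC, 0x4ED7, 0x5602, 0x5D2D, 0x6457, 0x6B82, 0x72AD, 0x79D8]

# Implied representative value of A for each level
ratios = [entry / QE for entry in QUOTED]
for i, ratio in enumerate(ratios, start=1):
    print(f"A{i} ~ {ratio:.4f}   (9 + i)/12 = {(9 + i) / 12:.4f}")

# Spacing between neighbouring LUT entries, compared with Qe / 12
steps = [hi - lo for lo, hi in zip(QUOTED, QUOTED[1:])]
print(steps, QE / 12)
```

The entries are almost equally spaced, with a step close to Qe/12, and the third entry is within one unit of Qe itself, which corresponds to A3 ≈ 1.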

4. Experimental Results

4.1. Experimental Environment

To verify the proposed algorithm, experiments were carried out in Visual Studio 2019 with binary document images and with gray and color images at various resolutions. For binary document compression, a total of 12 documents were used. Eight 200 dpi and 300 dpi standard binary documents presented by the JBIG committee are shown in Figure 7. Two Korean documents scanned at 200 dpi and 300 dpi were also used in the experiment, as shown in Figure 8. The color and gray images used for the JPEG2000 experiment are shown in Figure 9. The images in Figure 9a–f are widely used in still image compression at SD resolution. Two additional HD images, captured from test video for H.264, are shown in Figure 9g–h.
Three experiments were carried out to evaluate the compression performance of the proposed method. The first compresses standard binary documents with context-free MQ arithmetic coding. The second is JBIG2 lossless compression of binary documents with a two-dimensional context. The third applies the proposed method to JPEG2000 lossless and lossy compression using the Jasper program [32].

4.2. Basic Context-Free MQ Coding Experimental Results

Using the binary document images shown in Figure 7 and Figure 8, we executed binary MQ encoding without context. The degree of improvement of the proposed method with 2-, 4- and 8-level LUTs was measured relative to standard MQ coding. The parameters α and β were varied over values such as 0.9, 0.95, 0.97, 1.0, 1.02, 1.03, 1.05 and 1.1, and the optimal parameter values were sought experimentally.
The experimental results of the 2-level MQ encoder are shown in Table 11. For the 12 documents, after varying the parameters α and β, the best result in mode-1 was obtained when α and β both equal 1.02. In this case, the average improvement was 2.99%. The best overall improvement in mode-2 was 2.36%. Mode-1 thus gives better results than mode-2. Comparing the 200 dpi and 300 dpi resolutions, the lower the resolution, the greater the improvement.
In the 4-level MQ coding experiment, four modes were simulated, and the results are shown in Table 12. Although the experimental data for α and β equal to 1 are not included, in that case there are improvements of 1.6~1.8% compared with the original MQ coder. After various settings of α and β were tried, mode-4 showed the best improvement, 2.16%, when α and β are 1.03. In terms of resolution, the 200 dpi documents showed better performance than the 300 dpi documents. The performance was lower than that of 2-level MQ encoding, which suggests that the 2-level LUT MQ coder may be sufficient if context information is not considered.
In the 8-level MQ coding experiment, mode-1 showed the best performance when α and β are 1.0, with an average improvement of 2.13% over all documents, as shown in Table 13. In mode-2, the best improvement was 2.08% when α and β are 1.03. Again, the lower the resolution, the greater the improvement.
Comparing the 2-level, 4-level and 8-level LUT MQ coders, the 2-level MQ coder showed the best results. However, the JBIG2 and JPEG2000 experiments described below give different results, which are examined together later.

4.3. JBIG2 Experimental Results

To apply the proposed method to JBIG2 standard compression of binary documents, the lossless refinement method was used. First, in the case of the 2-level LUT MQ method, as shown in Table 14, there was no significant difference in performance between 200 dpi and 300 dpi documents. In mode-1 with α and β equal to 1, the average improvement was 0.58%. For the 300 dpi documents, the best performance occurs when α and β equal 1.03. In mode-2, the best improvement of 0.52% is obtained when α and β equal 1.
As shown in Table 15, when the 4-level LUT MQ coder is applied to JBIG2, mode-3 provided the best performance among the modes. Compared with the original MQ coding, the best improvement was about 0.8%. Varying α and β gave no significant further improvement, and there was no significant difference in performance between resolutions. The result is 0.22% better than that of the 2-level LUT MQ coder.
As shown in Table 16, with the 8-level LUT MQ coder, slightly better results are obtained in mode-1. The improvement in compression ratio was 0.9% compared with the original JBIG2 coder. As the quantization level increased, the results became less sensitive to the parameters α and β.

4.4. JPEG2000 Experimental Results

JPEG2000 coding, which is the new still image compression standard, is applied on SD and HD color and gray images shown in Figure 9. Experimental results of lossless coding about 2-level, 4-level and 8-level LUT MQ coder are presented in Table 17 and Table 18. This experiment was carried out by modifying Jasper program [32]. In the case of the color image, only the results of green color are shown because the results of red and blue color are similar and in the case of MAN image, the result of the gray values is given.
Compared to the existing JPEG2000 lossless compression, performance improved by about 1.2% for the 2-level LUT, 1.38~1.45% for the 4-level LUT and 1.5% for the 8-level LUT MQ coder. With the 2-level LUT, the best result, 1.21%, is obtained in mode-2 when α = β = 1.05. With the 4-level LUT, mode-2 showed the lowest performance and the other modes gave similar improvements. With the 8-level LUT, there is no significant difference between the two modes, and the maximum improvement was 1.51%. In all modes, the best improvements are obtained when α = β = 1.05.
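The improvement percentages in the lossless tables can be read as relative byte savings. The following is a minimal sketch, assuming that improvement is defined as (original − proposed) / original × 100; this definition and the function names are assumptions made here for illustration, not taken from the modified JasPer code.

```python
def improvement_percent(original_bytes: int, proposed_bytes: int) -> float:
    """Relative reduction in compressed size, in percent (assumed definition)."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100.0 * (original_bytes - proposed_bytes) / original_bytes


def bytes_saved(original_bytes: int, improvement_pct: float) -> float:
    """Inverse view: bytes saved for a given improvement percentage."""
    return original_bytes * improvement_pct / 100.0


# 'Four people': original JPEG2000 size from Table 17 and the 2-level mode-1 gain of 1.25%.
saved = bytes_saved(823_889, 1.25)
```

Under this reading, the 1.25% gain reported for the Four people image corresponds to roughly 10.3 kB out of 823,889 bytes.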
Table 19 and Table 20 show the results of lossy JPEG2000 compression at a compression ratio of 20:1. The performance improvement in lossy compression is lower than in lossless compression. PSNR is compared at the same compression ratio: the tables list the PSNR of the original JPEG2000 algorithm with the existing MQ coder at a rate of 0.05 (20:1), and the remaining values are the PSNR improvement ratios at the same rate. The 2-level LUT gives an improvement of 0.26~0.28%, and the 4-level and 8-level LUTs give 0.30~0.34%. For the 2-level LUT, the best case is mode-2 with α = β = 1. Because the differences in lossy compression are not significant, the 4-level LUT in mode-1 is adequate.
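To give a sense of scale, a relative PSNR gain can be converted to decibels. The sketch below assumes that the reported percentages are relative to the original PSNR value in dB, and it uses the standard PSNR definition with an 8-bit peak of 255; both points are assumptions, since the bit depth and the exact definition are not restated in this section.

```python
import math


def psnr_db(mse: float, peak: float = 255.0) -> float:
    """Standard PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    if mse <= 0:
        raise ValueError("mse must be positive (PSNR is unbounded for identical images)")
    return 10.0 * math.log10(peak * peak / mse)


def psnr_gain_db(original_psnr_db: float, improvement_pct: float) -> float:
    """Absolute gain in dB implied by a relative improvement (assumed relative to the dB value)."""
    return original_psnr_db * improvement_pct / 100.0


# 'Four people' (G channel) at 20:1: 44.364 dB with a 0.28% gain (Table 19).
gain = psnr_gain_db(44.364, 0.28)
```

With these assumptions, a gain of about 0.3% at roughly 44 dB is on the order of 0.1 dB.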
Table 21 and Table 22 show the PSNR comparison when the compression ratio is set to 50:1. The improvement was 0.25~0.27% for the 2-level LUT, 0.26~0.31% for the 4-level LUT and 0.28~0.31% for the 8-level LUT. The 4-level and 8-level LUTs perform similarly when their best modes are compared.

5. Discussion and Conclusions

A problem of the existing MQ coder, which uses only addition and subtraction and no multiplication, is that coding efficiency is degraded by frequent MPS/LPS conditional exchange. This paper proposes a method that reduces the loss caused by the approximation while retaining the probability estimation table of the binary MQ arithmetic coder. Coding performance is improved by using a pre-calculated 2-level, 4-level or 8-level quantized (A × Qe) look-up table instead of approximating (A × Qe) by Qe. Rather than using only uniformly quantized values of (A × Qe), experiments were run with the parameters α and β varied at each of the 2, 4 and 8 levels, which yields a nonuniform quantization of (A × Qe). Applying the method to the JBIG2 and JPEG2000 coding standards gave positive results, and the optimal LUTs were derived from extensive experiments. For JBIG2, more quantization levels gave better compression. At 4 or 8 levels, the best parameters are α = β = 1.0. For JPEG2000, the best compression was obtained with a value of 1.05 at most quantization levels.
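The effect of the approximation can be illustrated numerically. The sketch below compares, for a single A value, the exact product A × Qe, the multiplication-free approximation Qe, and a 2-level LUT value built from the mode-1 representatives of Table 3 with α = β = 1. It assumes that the register value 0x8000 represents 0.75, that the 2-level split lies at 0xC000 (by analogy with Tables 5 and 8), and it uses Qe = 0x5601, the first entry of the standard MQ table; the function names are hypothetical.

```python
ONE = 0x8000 / 0.75          # register value that represents 1.0 (assumed scale)
QE = 0x5601                  # first Qe entry of the standard MQ table


def exact_product(a_reg: int, qe: int) -> float:
    """A x Qe in register units, with A taken at its true value."""
    return (a_reg / ONE) * qe


def lut2_product(a_reg: int, qe: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """2-level LUT value (Table 3, mode-1): A is replaced by A1 or A2."""
    if a_reg < 0xC000:       # assumed split point between the two levels
        a_rep = alpha * 0.75 * (1 + 1 / 4)
    else:
        a_rep = beta * 0.75 * (1 + 3 / 4)
    return a_rep * qe


a = 0xF000                   # A = 1.40625, near the top of the register range
errors = {
    "approximation (Qe)": abs(exact_product(a, QE) - QE),
    "2-level LUT": abs(exact_product(a, QE) - lut2_product(a, QE)),
}
```

For this A value the LUT entry is much closer to the exact product than Qe alone. For A near 1.0 the plain approximation is already accurate, so the benefit depends on how the A-register values are distributed.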
When compressing a binary document without context, the compression rate improved by up to 3% compared to the existing method. For JBIG2 coding with a 2-dimensional context, the proposed method improved the compression rate by 0.9% compared to the conventional method. For JPEG2000 lossless still image coding, the improvement was around 1.0~1.5% depending on the number of levels. For JPEG2000 lossy image compression, the PSNR improvement was 0.3%, which is modest compared to the lossless case. From the experimental results, the 2-level LUT performs slightly worse and the 8-level LUT slightly better than the 4-level LUT, but the gain diminishes as the number of levels grows.
It is not easy to select a single level and mode that suits every coding method; rather, the quantization level could be selected adaptively within rate-distortion optimization. This optimization is difficult because binary input sources vary widely in their probability distributions. If optimization is needed, a sample binary sequence can be pre-coded and the best quantization level and parameters α and β found experimentally. If optimization is not necessary, the experimental results suggest a 4-level LUT with α = β = 1.05 for still image coding and 8-level quantization with α = β = 1.0 for binary document compression.
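The experimental selection described above amounts to a small exhaustive search. The following is a sketch only: `encode_size` is a hypothetical callable that stands in for running the modified coder on the sample sequence and returning the coded size in bytes, and α and β are tied to one scale value, as in the reported tables.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Sequence, Tuple

Config = Tuple[int, int, float]  # (level, mode, scale used for both alpha and beta)


def select_lut_config(
    sample_bits: Sequence[int],
    encode_size: Callable[[Sequence[int], int, int, float, float], int],
    modes_per_level: Dict[int, int],
    scales: Iterable[float],
) -> Tuple[Config, int]:
    """Try every (level, mode, scale) and keep the smallest coded size."""
    best_cfg, best_size = None, None
    scales = list(scales)
    for level, n_modes in modes_per_level.items():
        for mode, s in product(range(1, n_modes + 1), scales):
            # encode_size is a placeholder for the real pre-coding pass.
            size = encode_size(sample_bits, level, mode, s, s)
            if best_size is None or size < best_size:
                best_cfg, best_size = (level, mode, s), size
    return best_cfg, best_size
```

A candidate grid consistent with the experiments would be `{2: 2, 4: 4, 8: 2}` for the number of modes per level and scale values such as 1.0, 1.02, 1.03 and 1.05.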
Finally, because the probability estimation table of the MQ coder was developed in the 1980s, there is reason to revise it for high-resolution images such as HD and super HD images. Machine learning or deep learning approaches that train on large data sets of binary sources are candidates for this purpose.


The research was conducted starting with the 2013 sabbatical support at Kwangwoon University.

The author would like to thank the reviewers for their helpful comments and suggestions.

Figure 1. Arithmetic coder inputs and outputs.
Figure 2. (a) CODEMPS procedure, (b) CODELPS procedure.
Figure 3. (a) RENORME procedure, (b) BYTEOUT procedure.
Figure 4. 2-level quantization method.
Figure 5. 4-level quantization method.
Figure 6. 8-level quantization method.
Figure 7. Standard binary images used for simulation. (a) CCITT1 document, (b) CCITT4 document, (c) CCITT5 document, (d) CCITT7 document.
Figure 8. Binary Korean text used for simulation. (a) Hangeul-1 document, (b) Hangeul-2 document.
Figure 9. Still images used for simulation. (a) Baboon (500 × 480), (b) Lenna (512 × 512), (c) Monarch (768 × 512), (d) Barbara (720 × 576), (e) Zelda (780 × 576), (f) Man (1024 × 1024), (g) Four people (1280 × 720), (h) Jockey (1920 × 1080).
Table 1. LPS probability estimation table in MQ coder.
Index | Qe (Hex) | Qe (Decimal) | NMPS | NLPS | Switch
Table 2. M-coder RTAB (b = 10).
Table 3. Two modes of 2-level quantization of A-register value.
Mode | A1 | A2
Mode-1 | α × 0.75 × (1 + 1/4) | β × 0.75 × (1 + 3/4)
Mode-2 | α × 0.75 × (1 + 1/3) | β × 0.75 × (1 + 2/3)
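The representative values in Table 3 can be evaluated directly. The sketch below uses exact fractions; the conversion to register units assumes that 0x8000 represents 0.75, and the function names are hypothetical.

```python
from fractions import Fraction

# Offsets k in A_i = scale * 0.75 * (1 + k), taken from Table 3.
TWO_LEVEL_OFFSETS = {
    1: (Fraction(1, 4), Fraction(3, 4)),   # mode-1
    2: (Fraction(1, 3), Fraction(2, 3)),   # mode-2
}


def representatives(mode: int, alpha: Fraction = Fraction(1), beta: Fraction = Fraction(1)):
    """Return (A1, A2) as exact fractions for the chosen mode."""
    k1, k2 = TWO_LEVEL_OFFSETS[mode]
    base = Fraction(3, 4)
    return alpha * base * (1 + k1), beta * base * (1 + k2)


def to_register(a: Fraction) -> Fraction:
    """Map a decimal A value to 16-bit register units (0x8000 represents 0.75, assumed)."""
    return a / Fraction(3, 4) * 0x8000


a1, a2 = representatives(1)
```

With α = β = 1, mode-1 gives A1 = 0.9375 and A2 = 1.3125, which are 0xA000 and 0xE000 in register units, the midpoints of [0x8000, 0xC000) and [0xC000, 0x10000). Mode-2 gives A1 = 1.0 and A2 = 1.25.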
Table 4. Two modes of 2-level lookup tables for (Ai × Qe) (47 × 2, α = β = 1).
Index | Mode-1: A1 × Qe | Mode-1: A2 × Qe | Mode-2: A1 × Qe | Mode-2: A2 × Qe
Table 5. MPS and LPS interval subdivision quantization level in case of 4-level.
if (0x8000 ≤ A < 0xA000) then A = A1 (shown in Table 6)
else if (0xA000 ≤ A < 0xC000) then A = A2 (shown in Table 6)
else if (0xC000 ≤ A < 0xE000) then A = A3 (shown in Table 6)
else then A = A4 (shown in Table 6)
Table 6. Four modes of 4-level quantization.
Mode | A1 | A2 | A3 | A4
Mode-1 | α × 0.75 × (1 + 1/8) | β × 0.75 × (1 + 3/8) | β × 0.75 × (1 + 5/8) | α × 0.75 × (1 + 7/8)
Mode-2 | α × 0.75 × (1 + 1/5) | β × 0.75 × (1 + 2/5) | β × 0.75 × (1 + 3/5) | α × 0.75 × (1 + 4/5)
Mode-3 | α × 0.75 × (1 + 1/5) | β × 0.75 × (1 + 3/8) | β × 0.75 × (1 + 5/8) | α × 0.75 × (1 + 4/5)
Mode-4 | α × 0.75 × (1 + 1/8) | β × 0.75 × (1 + 2/5) | β × 0.75 × (1 + 3/5) | α × 0.75 × (1 + 7/8)
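One row of the 4-level look-up table can be derived from Table 6 by multiplying each representative A value by Qe. The sketch below is illustrative: rounding to the nearest integer is an assumption, so the published table entries may differ in the last digit, and Qe = 0x5601 is the first entry of the standard MQ table.

```python
# Offsets k in A_i = scale * 0.75 * (1 + k) for A1..A4, taken from Table 6.
FOUR_LEVEL_OFFSETS = {
    1: (1 / 8, 3 / 8, 5 / 8, 7 / 8),
    2: (1 / 5, 2 / 5, 3 / 5, 4 / 5),
    3: (1 / 5, 3 / 8, 5 / 8, 4 / 5),
    4: (1 / 8, 2 / 5, 3 / 5, 7 / 8),
}


def lut_row(qe: int, mode: int, alpha: float = 1.0, beta: float = 1.0) -> list:
    """Return [A1*Qe, A2*Qe, A3*Qe, A4*Qe] rounded to integers (assumed rounding)."""
    k1, k2, k3, k4 = FOUR_LEVEL_OFFSETS[mode]
    # In Table 6, alpha scales the outer levels (A1, A4) and beta the inner ones (A2, A3).
    a_values = (
        alpha * 0.75 * (1 + k1),
        beta * 0.75 * (1 + k2),
        beta * 0.75 * (1 + k3),
        alpha * 0.75 * (1 + k4),
    )
    return [round(a * qe) for a in a_values]


row = lut_row(0x5601, mode=1)
```

Because Qe is already expressed in register units, multiplying it by the decimal A value gives the LUT entry in the same units, and no multiplication is needed at coding time.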
Table 7. (Ai × Qe) lookup table (47 × 4, α = 1, β = 1).
Mode | Index | A1 × Qe | A2 × Qe | A3 × Qe | A4 × Qe | NMPS | NLPS | Switch
Table 8. MPS and LPS interval subdivision quantization level in case of 8-level.
if (0x8000 ≤ A < 0x9000) then A = A1
else if (0x9000 ≤ A < 0xA000) then A = A2
else if (0xA000 ≤ A < 0xB000) then A = A3
else if (0xB000 ≤ A < 0xC000) then A = A4
else if (0xC000 ≤ A < 0xD000) then A = A5
else if (0xD000 ≤ A < 0xE000) then A = A6
else if (0xE000 ≤ A < 0xF000) then A = A7
else then A = A8
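The comparison chains in Tables 5 and 8 split the range [0x8000, 0x10000) into equal-width intervals, so the interval index can also be obtained with a single shift. The sketch below assumes that A is a renormalized 16-bit value and that the 2-level case splits the range at 0xC000; the function name is hypothetical.

```python
def level_index(a_reg: int, levels: int) -> int:
    """Zero-based interval index of A for a 2-, 4- or 8-level LUT (0 selects A1)."""
    if levels not in (2, 4, 8):
        raise ValueError("levels must be 2, 4 or 8")
    if not 0x8000 <= a_reg <= 0xFFFF:
        raise ValueError("A must be renormalized to [0x8000, 0xFFFF]")
    bits = levels.bit_length() - 1          # log2(levels)
    return (a_reg - 0x8000) >> (15 - bits)  # equal-width intervals of 0x8000 / levels
```

For example, `level_index(0xA000, 4)` selects the second interval (A2) of Table 5 and `level_index(0xF000, 8)` selects the last interval (A8) of Table 8.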
Table 9. Two modes of 8-level quantization.
Mode | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8
1 | α × 0.75 × (1 + 1/9) | α × 0.75 × (1 + 2/9) | α × 0.75 × (1 + 3/9) | α × 0.75 × (1 + 4/9) | β × 0.75 × (1 + 5/9) | β × 0.75 × (1 + 6/9) | β × 0.75 × (1 + 7/9) | β × 0.75 × (1 + 8/9)
2 | α × 0.75 × (1 + 1/9) | α × 0.75 × (1 + 2/9) | α × 0.75 × (1 + 5/16) | α × 0.75 × (1 + 7/16) | β × 0.75 × (1 + 9/16) | β × 0.75 × (1 + 11/16) | β × 0.75 × (1 + 7/9) | β × 0.75 × (1 + 8/9)
Table 10. AiQe lookup table (Mode-1,2, 47 × 8, α = β = 1).
Mode | Idx | A1 × Qe | A2 × Qe | A3 × Qe | A4 × Qe | A5 × Qe | A6 × Qe | A7 × Qe | A8 × Qe
Table 11. Experimental results for each mode using 2-level LUT for binary images.
Document | Resolution (dpi) | Original MQ Byte | Proposed MQ Byte, mode-1 | Proposed MQ Byte, mode-2 | Improvement (%), mode-1 | Improvement (%), mode-2
(α, β) | | | (1.02, 1.02) | (1.0, 1.0) | (1.02, 1.02) | (1.0, 1.0)
Avg improvement in 200 dpi | | | | | 2.99% | 2.40%
Avg improvement in 300 dpi | | | | | 2.98% | 2.31%
Overall improvement | | | | | 2.99% | 2.36%
Table 12. Experimental results for each mode using 4-level LUT for binary images.
Document | Resol. (dpi) | Original MQ Byte | Improvement (%), mode-1 | Improvement (%), mode-2 | Improvement (%), mode-3 | Improvement (%), mode-4
(α, β) | | | (1.02, 1.02) | (1.05, 1.05) | (1.05, 1.05) | (1.03, 1.03)
Avg improvement in 200 dpi | | | 2.22% | 2.17% | 2.16% | 2.23%
Avg improvement in 300 dpi | | | 2.04% | 1.98% | 2.01% | 2.09%
Overall improvement | | | 2.13% | 2.08% | 2.08% | 2.16%
Table 13. Experimental results for each mode using 8-level LUT for binary images.
Document | Resol. (dpi) | Original MQ Byte | Proposed MQ Byte, mode-1 | Proposed MQ Byte, mode-2 | Improvement (%), mode-1 | Improvement (%), mode-2
(α, β) | | | (1.0, 1.0) | (1.03, 1.03) | (1.0, 1.0) | (1.03, 1.03)
Avg improvement in 200 dpi | | | | | 2.18% | 2.14%
Avg improvement in 300 dpi | | | | | 2.08% | 2.02%
Overall improvement | | | | | 2.13% | 2.08%
Table 14. 2-level LUT experimental results for JBIG2 (α = β = 1.0).
Document | Resol. (dpi) | Original JB2 Byte | Proposed JB2 Byte, mode-1 | Proposed JB2 Byte, mode-2 | Improvement (%), mode-1 | Improvement (%), mode-2
Avg improvement in 200 dpi | | | | | 0.56% | 0.52%
Avg improvement in 300 dpi | | | | | 0.60% | 0.52%
Overall improvement | | | | | 0.58% | 0.52%
Table 15. 4-level LUT experimental results for JBIG2 (α = β = 1.0).
Document | Resol. (dpi) | Original JB2 Byte | Improvement (%), mode-1 | Improvement (%), mode-2 | Improvement (%), mode-3 | Improvement (%), mode-4
Avg improvement in 200 dpi | | | 0.76% | 0.78% | 0.81% | 0.76%
Avg improvement in 300 dpi | | | 0.76% | 0.76% | 0.79% | 0.77%
Overall improvement | | | 0.76% | 0.77% | 0.80% | 0.77%
Table 16. 8-level LUT experimental results for JBIG2 (α = β = 1.0).
Document | Resolution (dpi) | Original JB2 Byte | Proposed JB2 Byte, mode-1 | Proposed JB2 Byte, mode-2 | Improvement (%), mode-1 | Improvement (%), mode-2
Avg improvement in 200 dpi | | | | | 0.90% | 0.89%
Avg improvement in 300 dpi | | | | | 0.90% | 0.87%
Overall improvement | | | | | 0.90% | 0.88%
Table 17. Improvement percentage of compressed byte with proposed algorithm applying multi-level MQ in JPEG2000 (lossless compression, 2-level and 8-level LUT MQ coder) (α = β = 1.05).
Image | W | H | Original JP2k Byte | 2-level, mode-1 | 2-level, mode-2 | 8-level, mode-1 | 8-level, mode-2
Four people | 1280 | 720 | 823,889 | 1.25% | 1.19% | 1.58% | 1.57%
Average improvement | | | | 1.19% | 1.21% | 1.51% | 1.50%
Table 18. Improvement percentage of compressed byte with proposed algorithm applying multi-level MQ in JPEG2000 (lossless compression, 4-level LUT MQ coder) (α = β = 1.05).
Image | W | H | Original JP2k Byte | 4-level, mode-1 | 4-level, mode-2 | 4-level, mode-3 | 4-level, mode-4
Four people | 1280 | 720 | 823,889 | 1.51% | 1.43% | 1.51% | 1.54%
Average improvement | | | | 1.44% | 1.38% | 1.44% | 1.45%
Table 19. Improvement percentage of PSNR with proposed algorithm applying multi-level MQ in JPEG2000 (lossy compression (compression ratio = 20:1), 2-level and 8-level LUT MQ coder).
Image | Color | Original PSNR | 2-level, mode-1 | 2-level, mode-2 | 8-level, mode-1 | 8-level, mode-2
(α, β) | | | (1.0, 1.0) | (1.0, 1.0) | (1.05, 1.05) | (1.05, 1.05)
Four people | G | 44.364 | 0.28% | 0.28% | 0.29% | 0.31%
Average improvement | | | 0.28% | 0.29% | 0.31% | 0.34%
Table 20. Improvement percentage of PSNR (peak signal-to-noise ratio) with proposed algorithm applying multi-level MQ in JPEG2000 (lossy compression (compression ratio = 20:1), 4-level LUT MQ coder).
Image | Color | Original PSNR | 4-level, mode-1 | 4-level, mode-2 | 4-level, mode-3 | 4-level, mode-4
(α, β) | | | (1.0, 1.0) | (1.0, 1.0) | (1.0, 1.0) | (1.05, 1.05)
Four people | G | 44.364 | 0.29% | 0.29% | 0.29% | 0.31%
Average improvement | | | 0.32% | 0.31% | 0.31% | 0.34%
Table 21. Improvement percentage of PSNR with proposed algorithm applying multi-level MQ in JPEG2000 (lossy compression (compression ratio = 50:1), 2-level and 8-level LUT MQ coder).
Image | Color | Original PSNR | 2-level, mode-1 | 2-level, mode-2 | 8-level, mode-1 | 8-level, mode-2
(α, β) | | | (1.0, 1.0) | (1.0, 1.0) | (1.05, 1.05) | (1.0, 1.0)
Four people | G | 39.398 | 0.25% | 0.22% | 0.20% | 0.28%
Average improvement | | | 0.27% | 0.26% | 0.30% | 0.30%
Table 22. Improvement percentage of PSNR with proposed algorithm applying multi-level MQ in JPEG2000 (lossy compression, compression ratio = 50:1, 4-level LUT MQ coder).
Image | Color | Original PSNR | 4-level, mode-1 | 4-level, mode-2 | 4-level, mode-3 | 4-level, mode-4
(α, β) | | | (1.0, 1.0) | (1.0, 1.0) | (1.05, 1.05) | (1.0, 1.0)
Four people | G | 39.398 | 0.27% | 0.24% | 0.27% | 0.23%
Average improvement | | | 0.28% | 0.27% | 0.31% | 0.29%
