
Stochastic Model of Block Segmentation Based on Improper Quadtree and Optimal Code under the Bayes Criterion †

by Yuta Nakahara 1,* and Toshiyasu Matsushima 2
1 Center for Data Science, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
2 Department of Pure and Applied Mathematics, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in Nakahara, Y.; Matsushima, T. Stochastic Model of Block Segmentation Based on Improper Quadtree and Optimal Code under the Bayes Criterion. In Proceedings of the 2022 Data Compression Conference (DCC), Snowbird, UT, USA, 22–25 March 2022; pp. 153–162.
Entropy 2022, 24(8), 1152; https://doi.org/10.3390/e24081152
Submission received: 25 July 2022 / Revised: 16 August 2022 / Accepted: 17 August 2022 / Published: 19 August 2022
(This article belongs to the Special Issue Information Theory in Signal Processing and Image Processing)

Abstract: Most previous studies on lossless image compression have focused on improving preprocessing functions to reduce the redundancy of pixel values in real images. In contrast, we assume stochastic generative models directly on pixel values and focus on achieving the theoretical limit of the assumed models. In this study, we propose a stochastic model based on improper quadtrees, and we theoretically derive the optimal code for the proposed model under the Bayes criterion. In general, Bayes-optimal codes require an exponential order of calculation with respect to the data length. However, we propose an algorithm that requires only a polynomial order of calculation without losing optimality by assuming a novel prior distribution.

1. Introduction

There are two approaches to lossless image compression. (These two approaches are detailed in Section 1 of our previous study [1].) Most previous studies (e.g., [2,3,4]) adopted an approach in which they constructed a preprocessing function $f: v^{t-1} \mapsto \boldsymbol{p}$ that outputs a code length assignment vector $\boldsymbol{p}$ from the past pixel values $v^{t-1}$. $\boldsymbol{p}$ determines the code length of the next pixel value $v_t$, or typically, of a value $v'_t$ equivalent to $v_t$ in the sense that there exists a one-to-one mapping $(v'_1, v'_2, \dots, v'_t) = g(v_1, v_2, \dots, v_t)$ computable for both the encoder and the decoder. Then, $v'_t$ and $\boldsymbol{p}$ are passed to a subsequent entropy coding process such as [5,6]. In this approach, the elements $p_i$ of the code length assignment vector $\boldsymbol{p}$ satisfy $\sum_i p_i = 1$. Therefore, it superficially resembles a probability distribution. However, it does not directly govern the stochastic generation of the original pixel value $v_t$. Hence, we cannot define the entropy of the source of the pixel value $v_t$, and we cannot discuss the theoretical optimality of the preprocessing function $f(v^{t-1})$ or the one-to-one mapping $g(v_1, v_2, \dots, v_t)$.
In contrast, we adopted an approach in which we estimate a stochastic generative model $p(v_t \mid v^{t-1}, \theta_m, m)$ with an unknown parameter $\theta_m$ and a model variable $m$, which is directly and explicitly assumed on the original pixel value $v_t$ [1,7,8,9]. Therefore, we can discuss the theoretical optimality of the entire algorithm with respect to the entropy defined from the assumed stochastic model $p(v_t \mid v^{t-1}, \theta_m, m)$. In particular, we can achieve the theoretically optimal coding under the Bayes criterion in statistical decision theory (see, e.g., [10]) by assuming prior distributions $p(\theta_m \mid m)$ and $p(m)$ on the unknown parameter $\theta_m$ and the model variable $m$. Such codes are known as Bayes codes [11] in information theory. It is known that the Bayes code asymptotically achieves the entropy of the true stochastic model, and its convergence speed achieves the theoretical limit [12]. Bayes codes have also shown remarkable performance in text compression (e.g., [13]). Therefore, we adopt this approach.
We assume that the target image herein has non-stationarity, that is, the properties of the pixel values differ among positions in the image. For such images, researchers have performed quadtree block segmentation as a component of the preprocessing $f(v^{t-1})$ and the one-to-one mapping $g(v_1, v_2, \dots, v_t)$ in the former approach, and its practical efficiency has been reported in many previous studies (e.g., [4,14]). In the latter approach, we proposed a stochastic generative model $p(v_t \mid v^{t-1}, \theta_m, m)$ that contains a quadtree as a model variable $m$. By assuming a prior distribution $p(m)$ on it, we derived the optimal code under the Bayes criterion, and we constructed a polynomial order algorithm to calculate it without loss of optimality [1]. However, in all these studies [1,4,14], the class of quadtrees is restricted to that of proper trees, whose inner nodes have exactly four children.
In this paper, we propose a stochastic generative model $p(v_t \mid v^{t-1}, \theta_m, m)$ based on an improper quadtree $m$ and derive the optimal code under the Bayes criterion. In general, codes optimal under the Bayes criterion require a summation whose calculation is of exponential order with respect to the data length. However, we construct an algorithm that requires only a polynomial order of calculation without losing optimality by applying a theory of probability distributions on general rooted trees [15] to the improper quadtree representing the block segmentation.

2. Proposed Stochastic Generative Model

Let $\mathcal{V}$ denote the set of possible values of a pixel. For example, we have $\mathcal{V} = \{0, 1\}$ for binary images and $\mathcal{V} = \{0, 1, \dots, 255\}$ for grayscale images. Let $h \in \mathbb{N}$ and $w \in \mathbb{N}$ denote the height and the width of an image, respectively. Although our model can represent any rectangular image, we assume $h = w = 2^{d_{\max}}$ for some $d_{\max} \in \mathbb{N}$ in the following for simplicity of notation. Then, let $V_t$ denote the random variable of the $t$-th pixel value in raster-scan order, and let $v_t \in \mathcal{V}$ denote its realization. Note that $V_t$ is at the $x(t)$-th row and the $y(t)$-th column, where $x(t)$ is the quotient and $y(t)$ the remainder of $t$ divided by $w$. In addition, let $V^t$ denote the sequence of pixel values $V_0, V_1, \dots, V_t$. Note that all indices start from zero herein.
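As a concrete illustration of the raster-scan indexing above, $x(t)$ and $y(t)$ are simply the quotient and the remainder of $t$ divided by $w$; a minimal sketch (the function name is ours, not from the paper):

```python
def raster_position(t: int, w: int):
    """Return (row x(t), column y(t)) of the t-th pixel in raster-scan order.

    Indices start from zero, matching the paper's convention:
    t = x(t) * w + y(t) with 0 <= y(t) < w.
    """
    x, y = divmod(t, w)
    return x, y
```

For example, with $w = 4$, the pixel $t = 6$ lies at row 1, column 2.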
We assume that $V_t$ is generated from a probability distribution $p(v_t \mid v^{t-1}, \theta_m, m)$ depending on an unknown model $m \in \mathcal{M}$ and unknown parameters $\theta_m \in \Theta_m$. (For $t = 0$, we assume $V_0$ follows $p(v_0 \mid \theta_m, m)$.) We define $m$ and $\theta_m$ in the following.
Definition 1
([1]). Let $s_{(x_1 y_1)(x_2 y_2) \cdots (x_{d'} y_{d'})}$ denote the following index set, called a "block":
$$s_{(x_1 y_1)(x_2 y_2) \cdots (x_{d'} y_{d'})} := \left\{ (i,j) \in \mathbb{Z}^2 \,\middle|\, \sum_{d=1}^{d'} \frac{x_d}{2^d} \le \frac{i}{2^{d_{\max}}} < \sum_{d=1}^{d'} \frac{x_d}{2^d} + \frac{1}{2^{d'}},\ \sum_{d=1}^{d'} \frac{y_d}{2^d} \le \frac{j}{2^{d_{\max}}} < \sum_{d=1}^{d'} \frac{y_d}{2^d} + \frac{1}{2^{d'}} \right\}, \tag{1}$$
where $(x_d, y_d) \in \{0,1\}^2$ for $d \in \{1, 2, \dots, d'\}$ and $d' \le d_{\max}$. In addition, let $s_\lambda$ be the set of all indices, $s_\lambda := \{0, 1, \dots, h-1\} \times \{0, 1, \dots, w-1\}$. Then, let $\mathcal{S}$ denote the set that consists of all the above index sets, that is, $\mathcal{S} := \{s_\lambda, s_{(00)}, \dots, s_{(11)}, s_{(00)(00)}, \dots, s_{(11)(11)}, \dots, s_{(11) \cdots (11)}\}$.
Example 1
([1]). For $d_{\max} = 2$,
$$s_{(01)} = \{ (i,j) \in \mathbb{Z}^2 \mid 0 \le i < 2,\ 2 \le j < 4 \} = \{ (0,2), (0,3), (1,2), (1,3) \}. \tag{2}$$
Therefore, it represents the indices of the upper right region. In a similar manner, $s_{(01)(11)} = \{ (i,j) \in \mathbb{Z}^2 \mid 1 \le i < 2,\ 3 \le j < 4 \} = \{ (1,3) \}$. It should be noted that the cardinality $|s|$ of each $s \in \mathcal{S}$ equals the number of pixels in the block.
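The index sets of Definition 1 can be computed directly from the bit pairs $(x_1 y_1) \cdots (x_{d'} y_{d'})$; the following sketch is our own illustrative code (not from the paper) and reproduces Example 1:

```python
def block(pairs, d_max):
    """Index set of the block s_{(x1 y1)...(xd' yd')} from Definition 1.

    `pairs` is a list of (x_d, y_d) with x_d, y_d in {0, 1}; the empty list
    gives the whole-image block s_lambda for an image of side 2**d_max.
    """
    d_prime = len(pairs)
    side = 2 ** (d_max - d_prime)  # block height and width in pixels
    # Top-left corner: each bit pair selects a half of the remaining range.
    i0 = sum(x << (d_max - d) for d, (x, _) in enumerate(pairs, start=1))
    j0 = sum(y << (d_max - d) for d, (_, y) in enumerate(pairs, start=1))
    return {(i, j) for i in range(i0, i0 + side) for j in range(j0, j0 + side)}
```

For $d_{\max} = 2$, `block([(0, 1)], 2)` yields the four upper-right indices of Example 1, and `block([(0, 1), (1, 1)], 2)` yields the single pixel $(1, 3)$.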
Definition 2.
We define the model $m$ as a quadtree whose nodes are elements of $\mathcal{S}$. Let $\mathcal{M}$ denote the set of models. Let $S_m \subseteq \mathcal{S}$, $\mathcal{L}_m \subseteq \mathcal{S}$, and $\mathcal{I}_m \subseteq \mathcal{S}$ denote the sets of nodes, leaf nodes, and inner nodes of $m \in \mathcal{M}$, respectively. Let $U_m \subseteq S_m$ denote the set of nodes that have fewer than four children. Then, $U_m$ corresponds to a pattern of variable block size segmentation, as shown in Figure 1.
Definition 3.
Each node $s \in U_m$ of the model $m$ has a parameter $\theta_s^m$ whose parameter space is $\Theta_s^m$. We define $\theta_m$ as the tuple of parameters $\{\theta_s^m\}_{s \in U_m}$, and let $\Theta_m$ denote its space.
Notably, we can reduce the number of parameters compared with an equivalent model represented by a proper tree with dummy child nodes added. See the following example.
Example 2.
For $d_{\max} = 2$, consider the model represented by the left-hand image in Figure 2. It has three parameters: $\theta_{s_\lambda}$, $\theta_{s_{(00)}}$, and $\theta_{s_{(10)}}$. An equivalent model can be represented by the proper quadtree shown on the right-hand side of Figure 2 if $\theta_{s_{(01)}} = \theta_{s_{(11)}}$ happens to hold. However, it requires four parameters: $\theta_{s_{(00)}}$, $\theta_{s_{(01)}}$, $\theta_{s_{(10)}}$, and $\theta_{s_{(11)}}$. Therefore, it causes inefficient learning.
Under the model $m \in \mathcal{M}$ and the parameters $\theta_m \in \Theta_m$, we assume that the $t$-th pixel value $V_t$ is generated as follows.
Assumption A1.
We assume that
$$p(v_t \mid v^{t-1}, \theta_m, m) = p(v_t \mid v^{t-1}, \theta_s^m), \tag{3}$$
where $s$ is the minimal block that satisfies $(x(t), y(t)) \in s$ and $s \in U_m$ (in other words, $s$ is the deepest node of $m$ that contains $(x(t), y(t))$). For $t = 0$, we assume the similar condition $p(v_0 \mid \theta_m, m) = p(v_0 \mid \theta_s^m)$.
Thus, the pixel value $V_t$ given the past sequence $V^{t-1}$ depends only on the parameter of the minimal block $s$ that contains it. Note that we do not assume a specific form of $p(v_t \mid v^{t-1}, \theta_s^m)$ at this point. For example, we can assume the Bernoulli distribution for $\mathcal{V} = \{0, 1\}$, or the Gaussian distribution (with appropriate normalization and quantization) for $\mathcal{V} = \{0, 1, \dots, 255\}$.
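Assumption 1's rule of locating the minimal (deepest) block of $m$ that contains a pixel can be sketched as a root-to-leaf walk; the nested-dictionary encoding of $m$ below is an assumption of this illustration, not the paper's notation:

```python
def minimal_block(m, x, y, d_max):
    """Walk from the root to the deepest node of model `m` containing pixel
    (x, y). `m` is assumed to be a nested dict {"z": ..., "children": {...}}
    whose children are keyed 0..3 for (x_d, y_d) = (0,0), (0,1), (1,0), (1,1);
    returns the list of (x_d, y_d) pairs naming the block (empty = s_lambda).
    """
    pairs, node, depth = [], m, 0
    while depth < d_max:
        # Most significant remaining bit of each coordinate picks a quadrant.
        xd = (x >> (d_max - depth - 1)) & 1
        yd = (y >> (d_max - depth - 1)) & 1
        k = 2 * xd + yd
        if k not in node["children"]:
            break  # this quadrant is not divided further: stop here
        pairs.append((xd, yd))
        node = node["children"][k]
        depth += 1
    return pairs
```

For instance, if only the quadrant $s_{(00)}$ and, inside it, $s_{(00)(11)}$ are divided out, pixel $(1, 1)$ is governed by $s_{(00)(11)}$ while pixel $(3, 0)$ falls back to the root block $s_\lambda$.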

3. The Bayes Code for Proposed Model

Since the true $m$ and $\theta_m$ are unknown, we assume prior distributions $p(m)$ and $p(\theta_m \mid m)$. Then, we estimate the true generative probability $p(v_t \mid v^{t-1}, \theta_m, m)$ by $q(v_t \mid v^{t-1})$ under the Bayes criterion in statistical decision theory (see, e.g., [10]). Subsequently, we use $q(v_t \mid v^{t-1})$ as the coding probability of an entropy code such as [16]. Such codes are known as Bayes codes [11] in information theory. The expected code length of the Bayes code converges to the entropy of $p(v_t \mid v^{t-1}, \theta_m, m)$ for sufficiently large data lengths, and its convergence speed achieves the theoretical limit [12]. The Bayes code has shown remarkable performance in text compression (e.g., [13]).
The optimal coding probability of the Bayes code for v t is derived as follows, according to the general formula in [11].
Proposition 1.
The optimal coding probability $q^*(v_t \mid v^{t-1})$ under the Bayes criterion is given by
$$q^*(v_t \mid v^{t-1}) = \sum_{m \in \mathcal{M}} p(m \mid v^{t-1}) \int p(v_t \mid v^{t-1}, \theta_m, m)\, p(\theta_m \mid v^{t-1}, m)\, \mathrm{d}\theta_m. \tag{4}$$
We call $q^*(v_t \mid v^{t-1})$ the Bayes-optimal coding probability.
Proposition 1 implies that we should use, as the coding probability, a weighted mixture of $p(v_t \mid v^{t-1}, \theta_m, m)$ over every block segmentation pattern $m$ and the parameters $\theta_m$ according to the posteriors $p(m \mid v^{t-1})$ and $p(\theta_m \mid v^{t-1}, m)$. (For $t = 0$, $p(v_0 \mid \theta_m, m)$ is mixed with weights according to the priors $p(m)$ and $p(\theta_m \mid m)$, which corresponds to the initialization of the algorithm.) Notably, although (4) has a similar form to Formula (5) in [1], $\mathcal{M}$ is generalized here from the set of proper quadtrees to the set of improper quadtrees.

4. Polynomial Order Algorithm to Calculate Bayes-Optimal Coding Probability

Unfortunately, the Bayes-optimal coding probability (4) contains a computationally hard calculation. (Herein, we assume that $\int p(v_t \mid v^{t-1}, \theta_m, m)\, p(\theta_m \mid v^{t-1}, m)\, \mathrm{d}\theta_m$ is feasible. Examples of feasible settings are described in the next section.) The cost of the summation over $m$ increases exponentially with respect to $d_{\max}$. Therefore, we propose a polynomial order algorithm to calculate (4) without loss of optimality by applying a theory of probability distributions on general rooted trees [15] to the improper quadtree $m$. In this section, we focus on the procedure of the constructed algorithm. Its validity is described in Appendix A.
Definition 4.
Let $\mathrm{Ch}(s) := \{s_{(00)}, s_{(01)}, s_{(10)}, s_{(11)}\}$ be the set of child nodes of $s$. We define a vector $z_s^m \in \{0,1\}^4$ representing the block division pattern of $s$ in $S_m$ as $z_s^m := (z_{s,s'}^m)_{s' \in \mathrm{Ch}(s)} := (\mathbb{I}\{s_{(00)} \in S_m\}, \mathbb{I}\{s_{(01)} \in S_m\}, \mathbb{I}\{s_{(10)} \in S_m\}, \mathbb{I}\{s_{(11)} \in S_m\})$, where $\mathbb{I}\{\cdot\}$ denotes the indicator function. Examples of $z_s^m$ are shown in Figure 3. For leaf nodes, $z_s^m = \mathbf{0}$.
First, we assume the following prior distributions as p ( m ) and p ( θ m | m ) .
Assumption A2.
Let $\eta_s(z) \in [0,1]$ be a given hyperparameter of a block $s \in \mathcal{S}$, which satisfies $\sum_{z \in \{0,1\}^4} \eta_s(z) = 1$. Then, we assume that the prior on $\mathcal{M}$ is represented as follows:
$$p(m) = \prod_{s \in S_m} \eta_s(z_s^m), \tag{5}$$
where $\eta_s(\mathbf{0}) = 1$ for any $s$ whose cardinality $|s|$ is equal to 1.
Intuitively, $\eta_s(z_s^m)$ represents the conditional probability that $s$ has the block division pattern $z_s^m$ given that $s \in S_m$. The above prior actually satisfies the condition $\sum_{m \in \mathcal{M}} p(m) = 1$. Although this is proved for any rooted tree in [15], we briefly describe a proof restricted to our model in Appendix A to make this paper self-contained. Note that the above assumption does not restrict the expressive capability of the general prior, in the sense that every model $m$ can still be assigned a non-zero probability $p(m) > 0$.
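To make the model prior (5) concrete, the following sketch draws a model $m$ by sampling a division pattern $z_s$ at each visited node; for simplicity of illustration we assume $\eta_s$ depends only on the node's depth, which is an assumption of ours, not of the paper:

```python
import random

def sample_model(depth, d_max, eta):
    """Draw one division pattern z_s per visited node from the prior (5).

    `eta` maps a depth to a probability table over the 16 patterns
    z in {0,1}^4 (illustrative simplification: eta depends only on depth).
    Nodes at depth d_max are single pixels, so z = (0,0,0,0) surely.
    Returns a nested dict {"z": ..., "children": {child index: subtree}}.
    """
    if depth == d_max:  # |s| = 1: always a leaf
        return {"z": (0, 0, 0, 0), "children": {}}
    patterns = [(a, b, c, d) for a in (0, 1) for b in (0, 1)
                for c in (0, 1) for d in (0, 1)]
    z = random.choices(patterns, weights=[eta[depth][p] for p in patterns])[0]
    # Recurse only into the quadrants that the drawn pattern divides out.
    children = {k: sample_model(depth + 1, d_max, eta)
                for k, bit in enumerate(z) if bit == 1}
    return {"z": z, "children": children}
```

Sampling proceeds top-down from $s_\lambda$, exactly mirroring the product over $s \in S_m$ in (5).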
Assumption A3.
For each model $m \in \mathcal{M}$, we assume that
$$p(\theta_m \mid m) = \prod_{s \in U_m} p(\theta_s^m \mid m). \tag{6}$$
Moreover, for any $m, m' \in \mathcal{M}$, $s \in U_m \cap U_{m'}$, and $\theta_s \in \Theta_s$, we assume that
$$p(\theta_s \mid m) = p(\theta_s \mid m') =: p_s(\theta_s). \tag{7}$$
Therefore, each element $\theta_s^m$ of the parameters $\theta_m$ depends only on $s$, and it is independent of both the other elements and the model $m$.
From Assumptions 1 and 3, the following lemma holds.
Lemma 1.
For any $m, m' \in \mathcal{M}$, let $s_t \in U_m$ and $s'_t \in U_{m'}$ denote the minimal nodes that satisfy $(x(t), y(t)) \in s_t$ and $(x(t), y(t)) \in s'_t$, respectively. If $s_t = s'_t =: s$ and $z_{s_t}^m = z_{s'_t}^{m'} =: z_s$, that is, they are the same block and their division patterns are also the same, then
$$p(v_t \mid v^{t-1}, m) = p(v_t \mid v^{t-1}, m'). \tag{8}$$
Hence, we represent this quantity by $\tilde{q}(v_t \mid v^{t-1}, s, z_s)$ because it depends not on $m$ but on $(s, z_s)$. For $t = 0$, let $\tilde{q}(v_0 \mid s, z_s) := p(v_0 \mid m) = p(v_0 \mid m')$.
Lemma 1 means that the optimal coding probability for $v_t$ depends on the minimal block $s$ that contains $(x(t), y(t))$ and on its division pattern $z_s$. Therefore, it could be calculated as $\tilde{q}(v_t \mid v^{t-1}, s, z_s)$ if $(s, z_s)$ were known.
Finally, the Bayes-optimal coding probability $q^*(v_t \mid v^{t-1})$ can be calculated by a recursive function over the nodes on a path of the perfect quadtree on $\mathcal{S}$. The definition of the path is the same as in [1].
Definition 5
([1]). Let $\mathcal{S}_t$ denote the set of nodes that contain $(x(t), y(t))$. These nodes form a path from the leaf node $s_{(x_1 y_1)(x_2 y_2) \cdots (x_{d_{\max}} y_{d_{\max}})} = \{(x(t), y(t))\}$ to the root node $s_\lambda$ on the perfect quadtree of depth $d_{\max}$ on $\mathcal{S}$, as shown in Figure 4. In addition, let $s_{\mathrm{ch}} \in \mathcal{S}_t$ denote the child node of $s \in \mathcal{S}_t$ on that path.
Definition 6.
We define the following recursive function $q(v_t \mid v^{t-1}, s)$ for $s \in \mathcal{S}_t$:
$$q(v_t \mid v^{t-1}, s) := \begin{cases} \tilde{q}(v_t \mid v^{t-1}, s, \mathbf{0}), & |s| = 1, \\ \displaystyle\sum_{z_s : z_{s, s_{\mathrm{ch}}} = 0} \eta_s(z_s \mid v^{t-1})\, \tilde{q}(v_t \mid v^{t-1}, s, z_s) + \sum_{z_s : z_{s, s_{\mathrm{ch}}} = 1} \eta_s(z_s \mid v^{t-1})\, q(v_t \mid v^{t-1}, s_{\mathrm{ch}}), & \text{otherwise}, \end{cases} \tag{9}$$
where $\eta_s(z_s \mid v^t)$ is also recursively updated for $s \in \mathcal{S}_t$ as follows:
$$\eta_s(z_s \mid v^t) := \begin{cases} \eta_s(z_s), & t = -1, \\ \dfrac{\eta_s(z_s \mid v^{t-1})\, \tilde{q}(v_t \mid v^{t-1}, s, z_s)}{q(v_t \mid v^{t-1}, s)}, & t \ge 0 \wedge z_{s, s_{\mathrm{ch}}} = 0, \\ \dfrac{\eta_s(z_s \mid v^{t-1})\, q(v_t \mid v^{t-1}, s_{\mathrm{ch}})}{q(v_t \mid v^{t-1}, s)}, & t \ge 0 \wedge z_{s, s_{\mathrm{ch}}} = 1. \end{cases} \tag{10}$$
Consequently, the following theorem holds.
Theorem 1.
The Bayes-optimal coding probability $q^*(v_t \mid v^{t-1})$ for the proposed model is calculated by
$$q^*(v_t \mid v^{t-1}) = q(v_t \mid v^{t-1}, s_\lambda). \tag{11}$$
Although Theorem 1 is proved by applying Corollary 2 of Theorem 7 in [15], we briefly describe a proof restricted to our model in Appendix A to make this paper self-contained. Theorem 1 means that the summation with respect to $m \in \mathcal{M}$ in (4) can be replaced by a summation with respect to $s \in \mathcal{S}_t$ and $z_s \in \{0,1\}^4$, which costs only $O(2^4 d_{\max})$. The proposed algorithm recursively calculates a weighted mixture of the coding probability $\tilde{q}(v_t \mid v^{t-1}, s, z_s)$ for the case where the block $s$ is not divided at $s_{\mathrm{ch}}$ (i.e., $z_{s, s_{\mathrm{ch}}} = 0$) and the coding probability $q(v_t \mid v^{t-1}, s_{\mathrm{ch}})$ for the case where the block $s$ is divided at $s_{\mathrm{ch}}$ (i.e., $z_{s, s_{\mathrm{ch}}} = 1$).
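The bottom-up evaluation of the recursion (9) along the path $\mathcal{S}_t$ can be sketched as follows; the data layout (dictionaries keyed by node label and division pattern) is our own illustrative choice, and the quantities $\tilde{q}$ and $\eta_s(\cdot \mid v^{t-1})$ are assumed to be precomputed:

```python
def coding_probability(path, q_tilde, eta_post):
    """Evaluate the recursion (9) bottom-up along the path S_t.

    `path` runs from the single-pixel leaf up to the root s_lambda; each
    entry is (s, k), where s is a hashable node label and k in {0,1,2,3}
    is the index of the path child s_ch among Ch(s), or None for the leaf.
    `q_tilde[s][z]` stands for q~(v_t | v^{t-1}, s, z_s) and `eta_post[s][z]`
    for eta_s(z_s | v^{t-1}); both are assumed precomputed in this sketch.
    Returns q(v_t | v^{t-1}, s_lambda).
    """
    q_child = None
    for s, k in path:
        if k is None:                      # |s| = 1: first case of (9)
            q_child = q_tilde[s][(0, 0, 0, 0)]
            continue
        q = 0.0
        for z, weight in eta_post[s].items():
            if z[k] == 0:                  # s is not divided at s_ch
                q += weight * q_tilde[s][z]
            else:                          # s is divided at s_ch: recurse
                q += weight * q_child
        q_child = q
    return q_child
```

With $d_{\max} = 1$ and only two candidate patterns at the root, for example, the mixture reduces to a two-term weighted average of $\tilde{q}$ at the root and $\tilde{q}$ at the leaf.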

5. Experiments

In this section, we perform four experiments. Three of them are similar to the experiments in [1]; the fourth is newly added. In Experiments 1, 2, and 3, we assume $\mathcal{V} = \{0, 1\}$, the simplest setting, to focus on the effect of the improper quadtrees. In Experiment 4, we assume $\mathcal{V} = \{0, 1, \dots, 255\}$ to show that our method is also applicable to grayscale images. The purpose of the first experiment is to confirm the Bayes optimality of $q(v_t \mid v^{t-1}, s_\lambda)$ for synthetic images generated from the proposed model. The purpose of the second experiment is to show an example image suited to our model. The purpose of the third experiment is to compare the average coding rates of our proposed algorithm with those of current image coding procedures on real images. The purpose of the fourth experiment is to show that our method is applicable to grayscale images.
In Experiments 1 and 2, $p(v_t \mid v^{t-1}, \theta_m, m)$ is the Bernoulli distribution $\mathrm{Bern}(v_t \mid \theta_s^m)$ for the minimal $s$ that satisfies $(x(t), y(t)) \in s$ and $s \in U_m$. Each element of $\theta_m$ is i.i.d. with the beta distribution $\mathrm{Beta}(\theta \mid \alpha, \beta)$, the conjugate prior of the Bernoulli distribution. Therefore, the integral in (4) has a closed form. The hyperparameter $\eta_s(z)$ of the model prior is $\eta_s(z) = 1/2^4$ for every $s \in \mathcal{S}$ and $z \in \{0,1\}^4$, and the hyperparameters of the beta distribution are $\alpha = \beta = 1/2$. For comparison, we used the previous method based on proper quadtrees, whose hyperparameters are the same as in the experiments in [1], and the standard methods known as JBIG [17] and JBIG2 [18].
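For this beta–Bernoulli setting, the per-block posterior predictive probability that makes the integral in (4) tractable has the familiar closed form $(n_1 + \alpha)/(n_0 + n_1 + \alpha + \beta)$, where $n_0$ and $n_1$ count the past zeros and ones governed by the block's parameter; a minimal sketch (the function name is ours):

```python
def bernoulli_beta_predictive(n_ones, n_zeros, alpha=0.5, beta=0.5):
    """Posterior predictive P(v_t = 1) for a Bernoulli likelihood with a
    conjugate Beta(alpha, beta) prior, given counts of past pixels in the
    block. The defaults alpha = beta = 1/2 match the text's hyperparameters.
    """
    return (n_ones + alpha) / (n_ones + n_zeros + alpha + beta)
```

Before any pixel is observed, the predictive is $\alpha/(\alpha+\beta) = 1/2$; after observing three ones and one zero, it rises to $3.5/5 = 0.7$.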

5.1. Experiment 1

The setting of Experiment 1 is as follows. The width and height of the images are $w = h = 2^{d_{\max}} = 64$. We generate 1000 images according to the following procedure.
1. Generate $m$ according to (5).
2. Generate $\theta_s^m$ according to $p(\theta_s^m \mid m)$ for each $s \in U_m$.
3. Generate the pixel values $v_t$ according to $p(v_t \mid v^{t-1}, \theta_m, m)$ for $t \in \{0, 1, \dots, hw - 1\}$.
4. Repeat Steps 1 to 3 for 1000 times.
Examples of the generated images are shown in Figure 5. Subsequently, we compress these 1000 images. The size of the image is saved in the header of the compressed file using 4 bytes. The coding probability calculated by the proposed algorithm is quantized into $2^{16}$ levels and fed to the range coder [16]. Table 1 shows the coding rates (bit/pel) averaged over all the images. Our proposed code has the minimum coding rate, as expected from its Bayes optimality.

5.2. Experiment 2

In Experiment 2, we compress camera.tif from [19], binarized with a threshold of 128. The settings of the header and the range coder are the same as those of Experiment 1. Figure 6 visualizes the maximum a posteriori (MAP) estimates $m_{\mathrm{MAP}} = \arg\max_m p(m \mid v^{hw-1})$ based on the improper quadtree model and the proper quadtree model [1], which are by-products of the compression. They are obtained by applying Theorem 3 in [15] and the algorithm in Appendix B of the preprint of the full version of [15], which is available on arXiv. The improper quadtree represents the non-stationarity with fewer regions (i.e., fewer parameters) than the proper quadtree [1]. Table 2 shows that the coding rate of our proposed model for camera.tif is lower than those of the previous model based on the proper quadtree [1] and JBIG [17] without any special tuning. However, JBIG2 [18] showed the lowest coding rate. An improvement of our method for real images is described in the next experiment.

5.3. Experiment 3

In Experiment 3, we compare the proposed algorithm with the proper-quadtree-based algorithm [1], JBIG [17], and JBIG2 [18] on real images from [19]. They are binarized in a similar manner to Experiment 2. The settings of the header and the range coder are the same as those of Experiments 1 and 2. A difference from Experiments 1 and 2 lies in the stochastic generative model $p(v_t \mid v^{t-1}, \theta_m, m)$ assumed on each block $s$. We assume another model represented by the Bernoulli distribution $\mathrm{Bern}(v_t \mid \theta_{s; v_{t-w-1} v_{t-w} v_{t-w+1} v_{t-1}}^m)$ that depends on the four neighboring pixels. (If an index goes outside the image, we use the nearest past pixel in Manhattan distance.) Therefore, $p(v_t \mid v^{t-1}, \theta_m, m)$ has a kind of Markov property. In other words, there are 16 parameters $\theta_{s;0000}^m, \theta_{s;0001}^m, \dots, \theta_{s;1111}^m$ for each block $s$ of the model $m$, and one of them is chosen by the values $v_{t-w-1}$, $v_{t-w}$, $v_{t-w+1}$, and $v_{t-1}$ observed in the past. Each parameter is i.i.d. with the beta distribution whose parameters are $\alpha = \beta = 1/2$. The results are shown in Table 3. The algorithms labeled Improper-i.i.d. and Proper-i.i.d. are the same as those in Experiments 1 and 2. The algorithms labeled Improper-Markov and Proper-Markov are the aforementioned ones.
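The 16 context parameters can be indexed by the four causal neighbors read as a 4-bit number; the sketch below is our own illustration and, unlike the paper's nearest-past-pixel rule, simply substitutes 0 for neighbors outside the image (an assumption of this sketch):

```python
def context_index(v, t, w):
    """Index in {0, ..., 15} selecting one of the 16 block parameters
    theta_{s;0000}, ..., theta_{s;1111} from the four causal neighbors
    v_{t-w-1}, v_{t-w}, v_{t-w+1}, v_{t-1} of a binary image stored as a
    flat list `v` in raster order. Out-of-image neighbors are read as 0
    here, which differs from the paper's Manhattan-distance fallback.
    """
    x, y = divmod(t, w)

    def pix(dx, dy):
        i, j = x + dx, y + dy
        if i < 0 or j < 0 or j >= w:
            return 0
        return v[i * w + j]

    # Order: upper-left, upper, upper-right, left (all strictly in the past).
    bits = (pix(-1, -1), pix(-1, 0), pix(-1, 1), pix(0, -1))
    return bits[0] * 8 + bits[1] * 4 + bits[2] * 2 + bits[3]
```

For a 3-pixel-wide image whose first row is 1, 0, 1, the pixel at row 1, column 1 sees the context bits (1, 0, 1, 0), i.e., parameter index 10.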
Improper-Markov outperforms the other methods in terms of average coding rates. The effect of the improper quadtree is probably amplified because the number of parameters for each block is increased. However, only for text does JBIG2 [18] still outperform our algorithms. We consider this is because JBIG2 [18] is designed for text images such as faxes, in contrast to our general-purpose algorithm. Note that our algorithm has room for improvement by tuning the hyperparameters $\alpha$ and $\beta$ of the beta distribution for each of $\theta_{s;0000}^m, \theta_{s;0001}^m, \dots, \theta_{s;1111}^m$.

5.4. Experiment 4

Through Experiment 4, we show that our method is applicable to grayscale images. Herein, we assume two types of stochastic generative models $p(v_t \mid v^{t-1}, \theta_m, m)$ for the blocks of the proper quadtree and the improper quadtree. The first is the i.i.d. Gaussian distribution $\mathcal{N}(v_t \mid \mu_s^m, (\lambda_s^m)^{-1})$. In this case, $\theta_s^m$ can be regarded as $\{\mu_s^m, \lambda_s^m\} \in \mathbb{R} \times \mathbb{R}_{>0}$. The second is the two-dimensional autoregressive (AR) model [7] of the four neighboring pixels, i.e., $\mathcal{N}(v_t \mid \tilde{v}_{t-1}^\top w_s^m, (\tau_s^m)^{-1})$, where $\tilde{v}_{t-1} = (v_{t-w-1}, v_{t-w}, v_{t-w+1}, v_{t-1})^\top$. (If an index goes outside the image, we use the nearest past pixel in Manhattan distance.) In this case, $\theta_s^m$ can be regarded as $\{w_s^m, \tau_s^m\} \in \mathbb{R}^4 \times \mathbb{R}_{>0}$. For both models, $v_t$ is normalized and quantized into $\mathcal{V} = \{0, 1, \dots, 255\}$ in a similar manner to [7]. The prior distributions for the two models are assumed to be the Gauss–gamma distributions $\mathcal{N}(\mu_s^m \mid \mu_0, (\kappa_0 \lambda_s^m)^{-1})\, \mathrm{Gam}(\lambda_s^m \mid \alpha_0, \beta_0)$ and $\mathcal{N}(w_s^m \mid \boldsymbol{\mu}_0, (\tau_s^m \Lambda_0)^{-1})\, \mathrm{Gam}(\tau_s^m \mid \alpha_0, \beta_0)$, where $\mu_0 = 0$, $\boldsymbol{\mu}_0 = \mathbf{0}$, $\kappa_0 = 0.01$, $\Lambda_0 = 0.01 I$, $\alpha_0 = 1.0$, and $\beta_0 = 0.0001$. Here, $I$ is the identity matrix. The results are shown in Table 4. (The values for the previous studies [2,4,20,21] are cited from [21].)
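The conjugate updates behind these priors follow the standard Gauss–gamma form; the sketch below (our own illustration, with the Experiment 4 hyperparameter values as defaults) shows the posterior update for the i.i.d. Gaussian block model:

```python
def gauss_gamma_update(data, mu0=0.0, kappa0=0.01, alpha0=1.0, beta0=0.0001):
    """Posterior hyperparameters of the Gauss-gamma prior
    N(mu | mu0, (kappa0 * lam)^-1) Gam(lam | alpha0, beta0) after observing
    `data` from the i.i.d. Gaussian block model (standard conjugate update;
    defaults match the hyperparameters stated for Experiment 4).
    """
    n = len(data)
    if n == 0:
        return mu0, kappa0, alpha0, beta0
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)  # within-sample sum of squares
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * mean) / kappa_n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (mean - mu0) ** 2 / (2 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n
```

With a weak prior ($\kappa_0 = 0.01$), the posterior mean $\mu_n$ is pulled almost entirely toward the sample mean after only a few observations.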
The coding rates of the proper-quadtree-based algorithm are improved by our proposed method for all the images in this data set and for both settings of the stochastic generative model assumed within blocks. This indicates the superiority of the improper-quadtree-based model over the proper-quadtree-based model. The method labeled Improper-AR showed an average coding rate lower than that of JPEG2000 averaged over all the images, and an average coding rate lower than that of JPEG-LS averaged over the natural images. Although it does not outperform recent methods such as MRP and Vanilc, we consider this is because of the suitability of the stochastic generative model within blocks, which is outside the scope of this paper.

6. Conclusions

We proposed a novel stochastic model based on the improper quadtree, so that our model effectively represents variable block size segmentation of images. Then, we constructed a Bayes code for the proposed stochastic model. Moreover, we introduced an algorithm that implements it in polynomial order of the data size without loss of optimality. Experiments on both synthetic and real images demonstrated the flexibility of our stochastic model and the efficiency of our algorithm. As a result, the derived algorithm showed a better average coding rate than that of JBIG2 [18].

Author Contributions

Conceptualization, Y.N. and T.M.; methodology, Y.N.; software, Y.N.; validation, Y.N. and T.M.; formal analysis, Y.N. and T.M.; investigation, Y.N. and T.M.; resources, Y.N.; data curation, Y.N.; writing—original draft preparation, Y.N.; writing—review and editing, Y.N. and T.M.; visualization, Y.N.; supervision, T.M.; project administration, T.M.; funding acquisition, T.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by JSPS KAKENHI Grant Numbers 17K06446, JP19K04914 and 22K02811.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://links.uwaterloo.ca/Repository.html (accessed on 18 August 2022).

Acknowledgments

We would like to thank the members of Matsushima laboratory for their meaningful discussions.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Validity of Proposed Algorithm

Validity of Prior Distribution for Models

Although a general proof for any rooted tree is described in [15] (please see also the preprint of the full version of [15], which is available on arXiv), in the following, we briefly describe a proof restricted to our model to make this paper self-contained.
$$\sum_{m \in \mathcal{M}} p(m) = \underbrace{\sum_{m \in \mathcal{M}} \prod_{s \in S_m} \eta_s(z_s^m)}_{(a)} = \sum_{z_{s_\lambda} \in \{0,1\}^4}\ \sum_{m \in \mathcal{M} : z_{s_\lambda}^m = z_{s_\lambda}} \prod_{s \in S_m} \eta_s(z_s^m) \tag{A1}$$
$$= \sum_{z_{s_\lambda} \in \{0,1\}^4} \eta_{s_\lambda}(z_{s_\lambda}) \sum_{m \in \mathcal{M} : z_{s_\lambda}^m = z_{s_\lambda}} \prod_{s \in S_m \setminus \{s_\lambda\}} \eta_s(z_s^m) \tag{A2}$$
$$= \sum_{z_{s_\lambda} \in \{0,1\}^4} \eta_{s_\lambda}(z_{s_\lambda}) \prod_{s' \in \mathrm{Ch}(s_\lambda)} \Biggl( \underbrace{\sum_{m' \in \mathcal{M}_{s'}} \prod_{s \in S_{m'}} \eta_s(z_s^{m'})}_{(b)} \Biggr)^{z_{s_\lambda, s'}} \tag{A3}$$
In (A3), $\mathcal{M}_{s'}$ denotes the set of subtrees whose root node is $s'$. The factorization from (A2) to (A3) holds because $m$ in (A2) is determined by the subtrees $m'$ whose root nodes are in $\mathrm{Ch}(s_\lambda)$. The same idea is also detailed in Figure 4 of the preprint of the full version of [15], which is available on arXiv. The underbraced parts $(a)$ and $(b)$ have the same structure except for the depth of the root node. We represent them by $\phi(s)$, a function of the root node $s$ of the subtree.
Subsequently, we have
$$\phi(s) = \begin{cases} \displaystyle\sum_{z \in \{0,1\}^4} \eta_s(z) = 1, & |s| = 1, \\ \displaystyle\sum_{z \in \{0,1\}^4} \eta_s(z) \prod_{s' \in \mathrm{Ch}(s)} \phi(s')^{z_{s,s'}}, & \text{otherwise}. \end{cases} \tag{A4}$$
Therefore, the following holds by recursively substituting $\phi(s')$ from the leaf nodes:
$$\sum_{m \in \mathcal{M}} p(m) = \phi(s_\lambda) = 1. \tag{A5}$$
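The recursive function $\phi(s)$ above can be checked numerically: for any normalized $\eta$, $\phi(s_\lambda) = 1$, i.e., the prior sums to one. A sketch under the simplifying assumption (ours, for illustration) that $\eta$ depends only on depth:

```python
def phi(depth, d_max, eta):
    """The function phi(s), with eta depending only on depth (illustrative
    simplification). For a single-pixel block, phi = sum_z eta(z) = 1;
    otherwise phi(s) = sum_z eta(z) * prod over children with z-bit 1 of
    phi(child), mirroring the recursion in the text.
    """
    if depth == d_max:
        return 1.0
    total = 0.0
    for z, weight in eta[depth].items():
        p = weight
        for bit in z:
            if bit:
                p *= phi(depth + 1, d_max, eta)
        total += p
    return total
```

Because every child value is 1 by induction, the outer sum collapses to the normalization of $\eta$, so the result is 1 for any valid choice of hyperparameters.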
Proof of Lemma 1. 
Let $R(s, z_s)$ denote $\bigcup_{s' \in \mathrm{Ch}(s) : z_{s,s'} = 0} s'$, which is the region whose pixels are generated according to $\theta_s^m$. Then,
$$p(v_t \mid v^{t-1}, m) \propto \iint p(v_t \mid v^{t-1}, \theta_s^m)\, p(\theta_m \mid m)\, p(v^{t-1} \mid \theta_m, m)\, \mathrm{d}\theta_s^m\, \mathrm{d}\theta_{\setminus s}^m \tag{A6}$$
$$\propto \int p(v_t \mid v^{t-1}, \theta_s^m)\, p_s(\theta_s^m) \prod_{i \in \{i < t \,\mid\, (x(i), y(i)) \in R(s, z_s)\}} p(v_i \mid v^{i-1}, \theta_s^m)\, \mathrm{d}\theta_s^m, \tag{A7}$$
where $\propto$ means that the left-hand side is proportional to the right-hand side, regarding all variables except $v_t$ as constants, and $\theta_{\setminus s}^m$ denotes the parameters $\theta_m$ except for $\theta_s^m$. Formula (A7) depends not on $m$ but on $(s, z_s)$. □
Proof of Theorem 1. 
Although Theorem 1 is proved by applying Corollary 2 of Theorem 7 in [15] (please see also the preprint for the full version of [15] uploaded on arXiv), in the following, we briefly describe a proof restricted to our model to make this paper self-contained.
Theorem 1 is proved by induction. First, we assume that
$$p(m \mid v^{t-1}) = \prod_{s \in S_m} \eta_s(z_s^m \mid v^{t-1}), \tag{A8}$$
which is true for $t = 0$ because of Assumption 2 and will be proved later for $t > 0$. In addition, we define the following function to simplify the notation:
$$f(v_t \mid v^{t-1}, s, z_s) := \begin{cases} \tilde{q}(v_t \mid v^{t-1}, s, \mathbf{0}), & s = \{(x(t), y(t))\}, \\ \tilde{q}(v_t \mid v^{t-1}, s, z_s), & \exists s' \in \mathrm{Ch}(s) \text{ s.t. } ((x(t), y(t)) \in s') \wedge (z_{s,s'} = 0), \\ 1, & \text{otherwise}. \end{cases} \tag{A9}$$
Using this notation, we can represent $p(v_t \mid v^{t-1}, m)$ as follows:
$$p(v_t \mid v^{t-1}, m) = \prod_{s \in S_m} f(v_t \mid v^{t-1}, s, z_s^m). \tag{A10}$$
(Equations (A9) and (A10) correspond to Conditions 4 and 3 in [15], respectively. If we accept this fact, Theorem 1 is immediately proved by applying Corollary 2 in [15].) By using (A10), we have
$$p(v_t \mid v^{t-1}) = \sum_{m \in \mathcal{M}} p(m \mid v^{t-1})\, p(v_t \mid v^{t-1}, m) = \sum_{m \in \mathcal{M}} \prod_{s \in S_m} \eta_s(z_s^m \mid v^{t-1})\, f(v_t \mid v^{t-1}, s, z_s^m). \tag{A11}$$
Since the right-hand side of (A11) has a similar form to the underbraced part $(a)$ in (A1), we can define a recursive function $q(v_t \mid v^{t-1}, s)$ that satisfies
$$p(v_t \mid v^{t-1}) = q(v_t \mid v^{t-1}, s_\lambda), \tag{A12}$$
where
$$q(v_t \mid v^{t-1}, s) := \begin{cases} f(v_t \mid v^{t-1}, s, \mathbf{0}), & |s| = 1, \\ \displaystyle\sum_{z_s \in \{0,1\}^4} \eta_s(z_s \mid v^{t-1})\, f(v_t \mid v^{t-1}, s, z_s) \prod_{s' \in \mathrm{Ch}(s)} q(v_t \mid v^{t-1}, s')^{z_{s,s'}}, & \text{otherwise}. \end{cases} \tag{A13}$$
By substituting (A9), $q(v_t \mid v^{t-1}, s) = 1$ holds for any $s$ that does not contain $(x(t), y(t))$ (or equivalently, for $s \notin \mathcal{S}_t$). Therefore, we need not calculate (A13) for $s \notin \mathcal{S}_t$, and (9) is derived by substituting (A9) again for $s \in \mathcal{S}_t$.
Lastly, we prove (A8). Using (A9), the updating Formula (10) can be generally represented as follows:
$$\eta_s(z_s \mid v^t) = \begin{cases} \eta_s(z_s), & t = -1, \\ \eta_s(z_s \mid v^{t-1}), & t \ge 0 \wedge |s| = 1, \\ \dfrac{\eta_s(z_s \mid v^{t-1})\, f(v_t \mid v^{t-1}, s, z_s) \prod_{s' \in \mathrm{Ch}(s)} q(v_t \mid v^{t-1}, s')^{z_{s,s'}}}{q(v_t \mid v^{t-1}, s)}, & \text{otherwise}. \end{cases} \tag{A14}$$
By substituting the above general updating formula, we obtain
$$\prod_{s \in S_m} \eta_s(z_s^m \mid v^t) = \prod_{s \in \mathcal{I}_m} \frac{\eta_s(z_s^m \mid v^{t-1})\, f(v_t \mid v^{t-1}, s, z_s^m) \prod_{s' \in \mathrm{Ch}(s)} q(v_t \mid v^{t-1}, s')^{z_{s,s'}^m}}{q(v_t \mid v^{t-1}, s)} \times \prod_{s \in \mathcal{L}_m} \frac{\eta_s(z_s^m \mid v^{t-1})\, f(v_t \mid v^{t-1}, s, z_s^m) \prod_{s' \in \mathrm{Ch}(s)} q(v_t \mid v^{t-1}, s')^0}{q(v_t \mid v^{t-1}, s)} \tag{A15}$$
$$= \frac{1}{q(v_t \mid v^{t-1}, s_\lambda)} \prod_{s \in S_m} \eta_s(z_s^m \mid v^{t-1}) \prod_{s \in S_m} f(v_t \mid v^{t-1}, s, z_s^m) \tag{A16}$$
$$= \frac{p(m \mid v^{t-1})\, p(v_t \mid v^{t-1}, m)}{p(v_t \mid v^{t-1})} = p(m \mid v^t).$$
In the above operation, (A15) is a telescoping product; that is, each $q(v_t \mid v^{t-1}, s)$ appears exactly once in the numerator and once in the denominator, and canceling these factors except for $q(v_t \mid v^{t-1}, s_\lambda)$ yields (A16). The last line follows from (A8), (A10), and (A12), where (A8) is the induction hypothesis. This establishes (A8) for the next time step and completes the induction. □

References

1. Nakahara, Y.; Matsushima, T. A Stochastic Model for Block Segmentation of Images Based on the Quadtree and the Bayes Code for It. Entropy 2021, 23, 991.
2. Weinberger, M.J.; Seroussi, G.; Sapiro, G. The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS. IEEE Trans. Image Process. 2000, 9, 1309–1324.
3. Wu, X.; Memon, N. Context-based, adaptive, lossless image coding. IEEE Trans. Commun. 1997, 45, 437–444.
4. Matsuda, I.; Ozaki, N.; Umezu, Y.; Itoh, S. Lossless coding using variable block-size adaptive prediction optimized for each image. In Proceedings of the 2005 13th European Signal Processing Conference, Antalya, Turkey, 4–8 September 2005; pp. 1–4.
5. Huffman, D.A. A Method for the Construction of Minimum-Redundancy Codes. Proc. IRE 1952, 40, 1098–1101.
6. Rissanen, J.; Langdon, G. Universal modeling and coding. IEEE Trans. Inf. Theory 1981, 27, 12–23.
7. Nakahara, Y.; Matsushima, T. Autoregressive Image Generative Models with Normal and t-distributed Noise and the Bayes Codes for Them. In Proceedings of the 2020 International Symposium on Information Theory and Its Applications (ISITA), Kapolei, HI, USA, 24–27 October 2020; pp. 81–85.
8. Nakahara, Y.; Matsushima, T. Hyperparameter Learning of Stochastic Image Generative Models with Bayesian Hierarchical Modeling and Its Effect on Lossless Image Coding. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW), Kanazawa, Japan, 17–21 October 2021.
9. Nakahara, Y.; Matsushima, T. Bayes code for two-dimensional auto-regressive hidden Markov model and its application to lossless image compression. In Proceedings of the International Workshop on Advanced Imaging Technology (IWAIT) 2020, Yogyakarta, Indonesia, 1 June 2020; SPIE: Bellingham, WA, USA, 2020; Volume 11515, pp. 330–335.
10. Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
11. Matsushima, T.; Inazumi, H.; Hirasawa, S. A class of distortionless codes designed by Bayes decision theory. IEEE Trans. Inf. Theory 1991, 37, 1288–1293.
12. Clarke, B.S.; Barron, A.R. Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inf. Theory 1990, 36, 453–471.
13. Matsushima, T.; Hirasawa, S. Reducing the space complexity of a Bayes coding algorithm using an expanded context tree. In Proceedings of the 2009 IEEE International Symposium on Information Theory, Seoul, Korea, 28 June–3 July 2009; pp. 719–723.
14. Sullivan, G.J.; Ohm, J.; Han, W.; Wiegand, T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
15. Nakahara, Y.; Saito, S.; Kamatsuka, A.; Matsushima, T. Probability Distribution on Rooted Trees. In Proceedings of the 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 26 June–1 July 2022.
16. Martin, G. Range encoding: An algorithm for removing redundancy from a digitised message. In Proceedings of the Video and Data Recording Conference, Southampton, UK, 24–27 July 1979; pp. 24–27.
17. Kuhn, M. JBIG-KIT. Available online: https://www.cl.cam.ac.uk/~mgk25/jbigkit/ (accessed on 24 July 2022).
18. Langley, A. jbig2enc. Available online: https://github.com/agl/jbig2enc (accessed on 24 July 2022).
19. Image Repository of the University of Waterloo. Available online: http://links.uwaterloo.ca/Repository.html (accessed on 8 November 2021).
20. Skodras, A.; Christopoulos, C.; Ebrahimi, T. The JPEG 2000 still image compression standard. IEEE Signal Process. Mag. 2001, 18, 36–58.
21. Weinlich, A.; Amon, P.; Hutter, A.; Kaup, A. Probability Distribution Estimation for Autoregressive Pixel-Predictive Image Coding. IEEE Trans. Image Process. 2016, 25, 1382–1395.
Figure 1. An example of node set $S$ and models $m$. The set of blocks shown in gray corresponds to $U_m$, which covers the whole region of the image and represents a block segmentation pattern.
Figure 2. A model with three parameters (left) and a model with four parameters (right).
Figure 3. Examples of block division patterns and the corresponding $z_s^m$.
Figure 4. An example of a path constructed from $S_t$.
Figure 5. Examples of the generated images in Experiment 1.
Figure 6. The original image (left), the MAP estimated model $m_{\mathrm{MAP}}$ based on the proper quadtree [1] (middle), and that based on the improper quadtree (right).
Table 1. The average coding rates (bit/pel).

| Improper Quadtree (Proposal) | Proper Quadtree [1] | JBIG [17] | JBIG2 [18] |
| --- | --- | --- | --- |
| 0.619 | 0.624 | 1.811 | 0.962 |
Table 2. The coding rates for the camera.tif in [19] (bit/pel).

| Improper Quadtree (Proposal) | Proper Quadtree [1] | JBIG [17] | JBIG2 [18] |
| --- | --- | --- | --- |
| 0.318 | 0.323 | 0.348 | 0.293 |
Table 3. The coding rates for the binarized images from [19] (bit/pel).

| Images | Proper-i.i.d. | Improper-i.i.d. | JBIG [17] | Proper-Markov | JBIG2 [18] | Improper-Markov |
| --- | --- | --- | --- | --- | --- | --- |
| bird | 0.121 | 0.113 | 0.149 | 0.099 | 0.090 | 0.067 |
| bridge | 0.390 | 0.382 | 0.386 | 0.373 | 0.353 | 0.300 |
| camera | 0.323 | 0.318 | 0.348 | 0.310 | 0.293 | 0.255 |
| circles | 0.100 | 0.090 | 0.102 | 0.060 | 0.045 | 0.030 |
| crosses | 0.140 | 0.132 | 0.083 | 0.110 | 0.027 | 0.027 |
| goldhill1 | 0.371 | 0.364 | 0.359 | 0.353 | 0.321 | 0.280 |
| horiz | 0.075 | 0.070 | 0.078 | 0.022 | 0.018 | 0.004 |
| lena1 | 0.254 | 0.243 | 0.217 | 0.216 | 0.169 | 0.141 |
| montage | 0.176 | 0.165 | 0.164 | 0.163 | 0.114 | 0.087 |
| slope | 0.091 | 0.083 | 0.096 | 0.056 | 0.038 | 0.021 |
| squares | 0.005 | 0.004 | 0.076 | 0.010 | 0.016 | 0.003 |
| text | 0.468 | 0.465 | 0.301 | 0.468 | 0.229 | 0.280 |
| avg. | 0.209 | 0.202 | 0.197 | 0.187 | 0.143 | 0.125 |
Table 4. The coding rates for the grayscale images from [19] (bit/pel).

| Images | JPEG2000 [20] | JPEG-LS [2] | MRP [4] | Vanilc [21] | Proper-Gaussian | Improper-Gaussian | Proper-AR | Improper-AR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bird | 3.630 | 3.471 | 3.238 | 2.749 | 4.086 | 4.055 | 3.461 | 3.422 |
| bridge | 6.012 | 5.790 | 5.584 | 5.596 | 6.353 | 6.294 | 5.696 | 5.678 |
| camera | 4.570 | 4.314 | 3.998 | 3.995 | 4.651 | 4.589 | 4.163 | 4.121 |
| circles | 0.928 | 0.153 | 0.132 | 0.043 | 1.190 | 0.915 | 1.030 | 0.826 |
| crosses | 1.066 | 0.386 | 0.051 | 0.016 | 1.603 | 1.240 | 0.898 | 0.625 |
| goldhill1 | 5.516 | 5.281 | 5.098 | 5.090 | 5.796 | 5.738 | 5.220 | 5.196 |
| horiz | 0.231 | 0.094 | 0.016 | 0.015 | 1.091 | 0.922 | 0.279 | 0.216 |
| lena1 | 4.755 | 4.581 | 4.189 | 4.123 | 5.312 | 5.259 | 4.433 | 4.394 |
| montage | 2.983 | 2.723 | 2.353 | 2.363 | 3.818 | 3.734 | 2.940 | 2.850 |
| slope | 1.342 | 1.571 | 0.859 | 0.960 | 3.721 | 3.683 | 1.728 | 1.602 |
| squares | 0.163 | 0.077 | 0.013 | 0.007 | 0.335 | 0.205 | 0.323 | 0.202 |
| text | 4.215 | 1.632 | 3.175 | 0.621 | 4.310 | 3.691 | 4.176 | 3.732 |
| Whole avg. | 2.951 | 2.506 | 2.392 | 2.132 | 3.522 | 3.360 | 2.862 | 2.739 |
| Natural avg. | 4.897 | 4.687 | 4.421 | 4.311 | 5.240 | 5.187 | 4.595 | 4.562 |
| Artificial avg. | 1.561 | 0.948 | 0.943 | 0.575 | 2.295 | 2.056 | 1.625 | 1.436 |