Article

Quantization for Infinite Affine Transformations

by Doğan Çömez 1 and Mrinal Kanti Roychowdhury 2,*
1 Department of Mathematics, 408E24 Minard Hall, North Dakota State University, Fargo, ND 58108-6050, USA
2 School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, 1201 West University Drive, Edinburg, TX 78539-2999, USA
* Author to whom correspondence should be addressed.
Fractal Fract. 2022, 6(5), 239; https://doi.org/10.3390/fractalfract6050239
Submission received: 4 April 2022 / Revised: 22 April 2022 / Accepted: 24 April 2022 / Published: 25 April 2022

Abstract: Quantization for a probability distribution refers to the idea of estimating a given probability measure by a discrete probability measure supported on a finite set. In this article, we consider a probability distribution generated by an infinite system of affine transformations $\{S_{ij}\}$ on $\mathbb{R}^2$ with associated probabilities $\{p_{ij}\}$ such that $p_{ij}>0$ for all $i,j\in\mathbb{N}$ and $\sum_{i,j=1}^{\infty}p_{ij}=1$. For such a probability measure P, the optimal sets of n-means and the nth quantization error are calculated for every natural number n. It is shown that the distribution of such a probability measure is the same as that of the direct product of the Cantor distribution with itself. In addition, it is proved that the quantization dimension $D(P)$ exists and is finite, whereas the $D(P)$-dimensional quantization coefficient does not exist, and the $D(P)$-dimensional lower and upper quantization coefficients lie in the closed interval $[\frac{1}{12},\frac{5}{4}]$.

1. Introduction

The quantization problem for probability measures is concerned with approximating a given measure by discrete measures of finite support in $L_r$-metrics. This problem has roots in information theory and engineering technology, in particular in signal processing and pattern recognition [1,2]. For a Borel probability measure P on $\mathbb{R}^d$, a quantizer is a function q mapping d-dimensional vectors in the domain $\Omega\subseteq\mathbb{R}^d$ into a finite set of vectors $\alpha\subset\mathbb{R}^d$. In this case, the error $\int\min_{a\in\alpha}\|x-a\|^2\,dP(x)$, where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$, is often referred to as the variance, cost, or distortion error for α with respect to the measure P, and is denoted by $V(\alpha):=V(P;\alpha)$. The value $\inf\{V(P;\alpha):\alpha\subset\mathbb{R}^d,\ \mathrm{card}(\alpha)\le n\}$ is called the nth quantization error for P, and is denoted by $V_n:=V_n(P)$. A set α on which this infimum is attained and which contains no more than n points is called an optimal set of n-means; the elements of an optimal set are called optimal quantizers. It is known that for a Borel probability measure P, if its support contains infinitely many elements and $\int\|x\|^2\,dP(x)$ is finite, then an optimal set of n-means always has exactly n elements [3,4,5,6]. The number $\lim_{n\to\infty}\frac{2\log n}{-\log V_n(P)}$, if it exists, is called the quantization dimension of the measure P, and is denoted by $D(P)$; likewise, for any $s\in(0,+\infty)$, the number $\lim_{n\to\infty}n^{2/s}V_n(P)$, if it exists, is called the s-dimensional quantization coefficient for P.
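As a quick numerical illustration of these definitions (not part of the paper's argument), the following sketch estimates the distortion error $V(P;\alpha)$ of a candidate set α by Monte Carlo. The uniform distribution on $[0,1]$ is used as a stand-in measure, and all names are ours.

```python
import random

def distortion(samples, alpha):
    """Monte Carlo estimate of V(P; alpha) = E[min_{a in alpha} |X - a|^2]."""
    return sum(min((x - a) ** 2 for a in alpha) for x in samples) / len(samples)

random.seed(1)
samples = [random.random() for _ in range(100_000)]  # X ~ Uniform[0,1]

# For Uniform[0,1], the optimal set of two-means is {1/4, 3/4} and
# V_2 = 1/48 ~ 0.0208 (for the uniform law, V_n = 1/(12 n^2)).
v2 = distortion(samples, [0.25, 0.75])
```

The estimate `v2` should land near $1/48$, and any one-point quantizer gives a strictly larger distortion, illustrating that $V_n$ decreases in n.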
For a finite set $\alpha\subset\mathbb{R}^d$, the Voronoi region generated by $a\in\alpha$, denoted by $M(a|\alpha)$, is the set of all points in $\mathbb{R}^d$ which are closer to a than to all other elements of α. For a probability distribution P on $\mathbb{R}^d$, the centroids of the regions $M(a|\alpha)$ are given by $a^*=\frac{1}{P(M(a|\alpha))}\int_{M(a|\alpha)}x\,dP$. A Voronoi tessellation is called a centroidal Voronoi tessellation (CVT) if $a^*=a$, i.e., if the generators are also the centroids of their own Voronoi regions. For a Borel probability measure P on $\mathbb{R}^d$, an optimal set of n-means forms a CVT; however, the converse is not true in general [7,8]. The following fact is known [6,9]:
Proposition 1.
Let α be an optimal set of n-means and a α . Then,
(i)
$P(M(a|\alpha))>0$ and $P(\partial M(a|\alpha))=0$,
(ii)
$a=E(X:X\in M(a|\alpha))$, where X is a random variable with distribution P,
(iii)
P-almost surely, the set $\{M(a|\alpha):a\in\alpha\}$ forms a Voronoi partition of $\mathbb{R}^d$.
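Condition (ii) of Proposition 1 is the fixed-point property behind Lloyd's algorithm: alternately recompute the Voronoi regions and move each generator to the centroid of its own region. The sketch below is our illustration on a one-dimensional empirical sample (uniform on $[0,1]$ as a stand-in measure); it produces a CVT, which, as noted above, need not be an optimal set in general.

```python
import random

def lloyd(samples, alpha, iters=30):
    """Lloyd iteration: assign samples to nearest generator (Voronoi step),
    then replace each generator by the centroid of its cell (condition (ii))."""
    alpha = list(alpha)
    for _ in range(iters):
        cells = [[] for _ in alpha]
        for x in samples:
            i = min(range(len(alpha)), key=lambda k: (x - alpha[k]) ** 2)
            cells[i].append(x)
        # keep a generator in place if its cell happens to be empty
        alpha = [sum(c) / len(c) if c else a for c, a in zip(cells, alpha)]
    return sorted(alpha)

random.seed(2)
samples = [random.random() for _ in range(20_000)]  # Uniform[0,1] stand-in
alpha = lloyd(samples, [0.1, 0.9])
```

Starting from $\{0.1, 0.9\}$, the iteration settles near $\{1/4, 3/4\}$, which for the uniform distribution happens to be the optimal set of two-means as well.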
Let $X=\mathbb{R}$ and consider the probability distribution $P_c:=\frac12 P_c\circ U_1^{-1}+\frac12 P_c\circ U_2^{-1}$, where $U_1(x)=\frac13 x$ and $U_2(x)=\frac13 x+\frac23$ for all $x\in\mathbb{R}$. Because its support is the standard Cantor set generated by $U_1$ and $U_2$, $P_c$ is called the Cantor distribution. S. Graf and H. Luschgy determined the optimal sets of n-means and the nth quantization errors for the Cantor distribution for all $n\ge1$, completing its quantization program [10]. This result has been extended to the setting of a nonuniform Cantor distribution by L. Roychowdhury [11]. Analogously, the Cantor dust is generated by the contractive mappings $\{S_i\}_{i=1}^4$ on $\mathbb{R}^2$, where $S_1(x_1,x_2)=\frac13(x_1,x_2)$, $S_2(x_1,x_2)=\frac13(x_1,x_2)+(\frac23,0)$, $S_3(x_1,x_2)=\frac13(x_1,x_2)+(0,\frac23)$, and $S_4(x_1,x_2)=\frac13(x_1,x_2)+(\frac23,\frac23)$. If P is a Borel probability measure on $\mathbb{R}^2$ such that $P=\frac14 P\circ S_1^{-1}+\frac14 P\circ S_2^{-1}+\frac14 P\circ S_3^{-1}+\frac14 P\circ S_4^{-1}$, then the support of P is the Cantor dust. For this measure, D. Çömez and M.K. Roychowdhury determined the optimal sets of n-means and the nth quantization errors [12]. Let P be a probability measure on $\mathbb{R}$ generated by an infinite collection of similitudes $\{S_j\}_{j=1}^{\infty}$, where $S_j(x)=\frac{1}{3^j}x+1-\frac{1}{3^{j-1}}$ for all $x\in\mathbb{R}$, and P is given by $P=\sum_{j=1}^{\infty}\frac{1}{2^j}P\circ S_j^{-1}$. For this measure, M.K. Roychowdhury determined the optimal sets of n-means and the nth quantization errors [13], an infinite extension of the result of S. Graf and H. Luschgy in [10]. The quantization dimension for probability distributions generated by an infinite collection of similitudes was determined by E. Mihailescu and M.K. Roychowdhury in [14], an infinite extension of the result of S. Graf and H. Luschgy in [15]. In this article, we study the extension of the result of D. Çömez and M.K. Roychowdhury in [12] to the setting of countably infinite affine maps on $\mathbb{R}^2$, which will also complete the program initiated in [14].
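The Cantor distribution is easy to simulate: a $P_c$-distributed point has independent base-3 digits equal to 0 or 2 with probability $\frac12$ each. The sketch below (ours, truncated at 40 digits) checks the known moments $E(X)=\frac12$ and $V(X)=\frac18$ of the Cantor distribution by Monte Carlo.

```python
import random

def cantor_sample(depth=40):
    """Draw from the Cantor distribution: base-3 digits chosen from {0, 2}
    with probability 1/2 each, truncated after `depth` digits."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

random.seed(3)
xs = [cantor_sample() for _ in range(100_000)]
mean = sum(xs) / len(xs)                           # should be near E(X) = 1/2
var = sum((x - mean) ** 2 for x in xs) / len(xs)   # should be near V(X) = 1/8
```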
Let $\{S_{(i,j)}:i,j\in\mathbb{N}\}$ be a collection of countably infinite affine transformations on $\mathbb{R}^2$, where $S_{(i,j)}(x_1,x_2)=(r^i x_1+1-r^{i-1},\ r^j x_2+1-r^{j-1})$ with $0<r\le\frac13$. Clearly, these affine transformations are all contractive, but they are not similarity mappings. Associate with the mappings $S_{(i,j)}$ the probabilities $p_{(i,j)}$ such that $p_{(i,j)}=\frac{1}{2^{i+j}}$ for all $i,j\in\mathbb{N}$, where $\mathbb{N}:=\{1,2,3,\dots\}$. Then, there exists a unique Borel probability measure P on $\mathbb{R}^2$ ([16,17,18], etc.) such that
$$P=\sum_{i,j=1}^{\infty}p_{(i,j)}\,P\circ S_{(i,j)}^{-1}.$$
The support of such a probability measure lies in the unit square $[0,1]^2$. We call such a measure an affine measure on $\mathbb{R}^2$ or, more specifically, an infinitely generated affine measure on $\mathbb{R}^2$. This article deals with the quantization of this measure P. The arrangement of the paper is as follows: in Section 2, we discuss the basic definitions and lemmas about the optimal sets of n-means and the nth quantization errors. The arguments in this section point out that determining the optimal sets of n-means and the nth quantization errors for all $n\ge3$ and for arbitrary $r\in(0,\frac13]$ requires very intricate and complicated analysis; hence, for clarity, in the remaining sections the focus will be on the case $r=\frac13$. Section 3 is devoted to determining the optimal sets of n-means for $n=2$ and $n=3$. In Section 4, we define a mapping F which enables us to convert the infinitely generated affine measure P into the finitely generated product measure $P_c\times P_c$, where each factor $P_c$ is the Cantor distribution. Using this connection between P and $P_c$, together with the optimal sets of n-means for $n=1,2,3$, in Section 5 we utilize the dynamics of the affine maps to obtain the main results of the paper: closed formulas determining the optimal sets of n-means and the corresponding quantization errors for all $n\ge4$. For clarity of exposition, we also provide some examples and figures to illustrate the constructions. Lastly, having a closed form for the quantization error for each n, we prove that the quantization dimension $D(P)$ exists, and we show that the $D(P)$-dimensional quantization coefficient for P does not exist, while the $D(P)$-dimensional lower and upper quantization coefficients are finite and lie in the closed interval $[\frac{1}{12},\frac54]$.
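For the case $r=\frac13$ treated in the later sections, the measure P can be sampled by a chaos game: pick indices $(i,j)$ with probability $\frac{1}{2^{i+j}}$ (two independent Geometric(1/2) draws) and apply $S_{(i,j)}$ repeatedly. The sketch below is our illustration; since every map contracts by at least $\frac13$ in each coordinate, composing 40 random maps approximates a draw from P to within $3^{-40}$.

```python
import random

R = 1 / 3   # the contraction ratio focused on in the later sections

def S(i, j, x):
    """S_(i,j)(x1, x2) = (R^i x1 + 1 - R^(i-1), R^j x2 + 1 - R^(j-1))."""
    x1, x2 = x
    return (R ** i * x1 + 1 - R ** (i - 1), R ** j * x2 + 1 - R ** (j - 1))

def geometric():
    """Index i = 1, 2, ... drawn with probability 1/2^i."""
    i = 1
    while random.random() < 0.5:
        i += 1
    return i

def sample_P(depth=40):
    """Approximate draw from P: compose `depth` random maps (chaos game)."""
    x = (0.5, 0.5)
    for _ in range(depth):
        x = S(geometric(), geometric(), x)
    return x

random.seed(4)
pts = [sample_P() for _ in range(50_000)]
mean = tuple(sum(c) / len(pts) for c in zip(*pts))  # should be near (1/2, 1/2)
```

All sampled points stay in the unit square, and the sample mean is close to $(\frac12,\frac12)$, consistent with the symmetry of the construction.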
The results and the arguments in this article are not straightforward generalizations of those in [13]; in particular, this is the case for optimal sets. By the nature of the affine transformations considered in this paper, the optimal sets of order $n=k^2$, $k\ge1$, are the same as the cross products of the optimal sets of order k obtained in [13]; however, the same cannot be said for other $n\ge3$. Clearly, for n a prime number, optimal sets of n-means cannot be obtained this way. Furthermore, as will be seen from the main theorem, even for $n=kl$, optimal sets of n-means need not be cross products of the optimal sets of k- and l-means in [13]. For example, the optimal sets of 2- and 3-means in [13] are $\{\frac16,\frac56\}$ and $\{\frac16,\frac{13}{18},\frac{17}{18}\}$ (or $\{\frac{1}{18},\frac{5}{18},\frac56\}$), respectively; hence, the cross products of these sets produce some of the optimal sets of 6-means. On the other hand, one of the optimal sets of 6-means is $\{(\frac{1}{18},\frac16),(\frac56,\frac16),(\frac{13}{18},\frac16),(\frac{1}{18},\frac56),(\frac{13}{18},\frac56),(\frac56,\frac56)\}$, which cannot be obtained as a cross product of optimal sets of 2- and 3-means in [13].

2. Preliminaries

Let P be the affine measure on R 2 generated by the affine maps { S ( i , j ) : i , j N } defined above. Consider the alphabet I = N 2 = { ( i , j ) : i , j N } . By a "string" or a "word" ω over I , it is meant a finite sequence ω : = ω 1 ω 2 ω k of symbols from the alphabet, k 1 , where k is called the length of the word ω . A word of length zero is called the "empty word", and is denoted by ∅. By I * we denote the set of all words over the alphabet I of some finite length k , including the empty word ∅. By | ω | , we denote the length of a word ω I * . For any two words ω : = ω 1 ω 2 ω k and τ : = τ 1 τ 2 τ in I * , by ω τ : = ω 1 ω k τ 1 τ we mean the word obtained from the concatenation of ω and τ . For n 1 and ω = ω 1 ω 2 ω n I * we define ω : = ω 1 ω 2 ω n 1 . Note that ω is the empty word if the length of ω is one. Analogously, by N * we denote the set of all words over the alphabet N , and for any τ N * , | τ | , τ , etc. are defined similarly. Let ω I k , k 1 , be such that ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) , then ω ( 1 ) and ω ( 2 ) will denote the “coordinate words"; i.e., ω ( 1 ) : = i 1 i 2 i k and ω ( 2 ) : = j 1 j 2 j k . Thus, ω | ω | ( 1 ) = i k and ω | ω | ( 2 ) = j k . These lead us to define the following notations: For ω I * , by ω ( , ) it is meant the set of all words ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) obtained by concatenating the word ω with the word ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) for j N , i.e.,
ω ( , ) : = { ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) : j N } .
Similarly, ω ( , ) and ω ( , ) represent the sets
ω ( , ) : = { ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) : i N } and ω ( , ) : = { ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) : i , j N } ,
respectively. Analogously, for any τ N * , by ( τ , ) it is meant the set ( τ , ) : = { τ + i : i N } , and ( τ , ) represents the set ( τ , ) : = { τ } . Thus, if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = ( i 1 i 2 i k , ) and ω ( 2 ) : = j 1 j 2 j k ; if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = i 1 i 2 i k and ω ( 2 ) : = ( j 1 j 2 j k , ) ; and if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = ( i 1 i 2 i k , ) and ω ( 2 ) : = ( j 1 j 2 j k , ) . For ω = ω 1 ω 2 ω k I k , k 1 , let us write
S ω : = S ω 1 S ω k , p ω : = p ω 1 p ω 2 p ω k and J ω : = S ω ( [ 0 , 1 ] × [ 0 , 1 ] ) .
In particular, $S_{\emptyset}$ is the identity mapping on $\mathbb{R}^2$, and $J:=J_{\emptyset}=S_{\emptyset}([0,1]\times[0,1])$. Then, the support of the probability measure P is the closure of the limit set $S$, where $S=\bigcap_{k\in\mathbb{N}}\bigcup_{\omega\in I^k}J_\omega$. The limit set S is called the affine set or the infinitely generated affine set. For $\omega\in I^k$ and $i,j\in\mathbb{N}$, the rectangles $J_{\omega(i,j)}$, into which $J_\omega$ is split up at the $(k+1)$th level, are called the children or the basic rectangles of $J_\omega$ (see Figure 1). For $\omega\in I^*$, we write
J ω ( , ) : = j = 1 J ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) , J ω ( , ) : = i = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) , J ω ( , ) : = i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ;
p ω ( , ) : = P ( J ω ( , ) ) = j = 1 p ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) , p ω ( , ) : = P ( J ω ( , ) ) = i = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) , and p ω ( , ) : = P ( J ω ( , ) ) = i , j = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) .
Notice that for any ω I * , p ω ( , ) = p ω j = 1 1 2 ω | ω | ( 1 ) + ω | ω | ( 2 ) + j = p ω p ω | ω | j = 1 1 2 j = p ω p ω | ω | = p ω ; and similarly, p ω ( , ) = p ω ( , ) = p ω .
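The identities above reduce to geometric series in the probabilities $p_{(i,j)}=\frac{1}{2^{i+j}}$: summing over all shifts of one or both coordinates of the last symbol returns the probability of the word itself. A quick numerical check (our sketch; the last symbol $(a,b)=(2,3)$ is an arbitrary choice):

```python
# Probabilities p_(i,j) = 1/2^(i+j); check the geometric-series fact behind
# the identities p_omega(...) = p_omega: summing p over all shifts of one or
# both coordinates of the last symbol reproduces that symbol's probability.
def p(i, j):
    return 1 / 2 ** (i + j)

a, b = 2, 3   # arbitrary last symbol (our choice)
row  = sum(p(a, b + j) for j in range(1, 60))                            # shift j only
col  = sum(p(a + i, b) for i in range(1, 60))                            # shift i only
full = sum(p(a + i, b + j) for i in range(1, 60) for j in range(1, 60))  # shift both
```

Each of the three sums agrees with $p_{(a,b)}$ up to the truncation error $2^{-59}$.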
Because $P=\sum_{i,j=1}^{\infty}p_{(i,j)}\,P\circ S_{(i,j)}^{-1}$, by induction, $P=\sum_{\omega\in I^k}p_\omega\,P\circ S_\omega^{-1}$ for any $k\in\mathbb{N}$. Hence, we have the following statement:
Lemma 1.
Let $f:\mathbb{R}^2\to\mathbb{R}_+$ be Borel measurable and $k\in\mathbb{N}$. Then,
$$\int f\,dP=\sum_{\omega\in I^k}p_\omega\int f\circ S_\omega\,dP.$$
Let S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) be the horizontal and vertical components of the transformations S ( i , j ) . Then, for all ( x 1 , x 2 ) R 2 , we have S ( i , j ) ( 1 ) ( x 1 ) = r i x 1 + 1 r i 1 and S ( i , j ) ( 2 ) ( x 2 ) = r j x 2 + 1 r j 1 ; hence, S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) are similarity mappings on R with similarity ratios s ( i , j ) ( 1 ) : = r i and s ( i , j ) ( 2 ) : = r j , respectively. Similarly, for ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) I k , k 1 , let S ω ( 1 ) and S ω ( 2 ) represent the horizontal and vertical components of the transformation S ω on R 2 . Then, S ω ( 1 ) and S ω ( 2 ) are similarity mappings on R with similarity ratios s ω ( 1 ) and s ω ( 2 ) , respectively, such that S ω ( 1 ) = S ( i 1 , j 1 ) ( 1 ) S ( i k , j k ) ( 1 ) and S ω ( 2 ) = S ( i 1 , j 1 ) ( 2 ) S ( i k , j k ) ( 2 ) . Thus, it follows that
s ω ( 1 ) = s ( i 1 , j 1 ) ( 1 ) s ( i 2 , j 2 ) ( 1 ) s ( i k , j k ) ( 1 ) = r i 1 + i 2 + + i k and s ω ( 2 ) = s ( i 1 , j 1 ) ( 2 ) s ( i 2 , j 2 ) ( 2 ) s ( i k , j k ) ( 2 ) = r j 1 + j 2 + + j k .
Moreover, we have P ( J ω ) = p ω = p ( i 1 , j 1 ) p ( i 2 , j 2 ) p ( i k , j k ) = 1 2 i 1 + i 2 + + i k + j 1 + j 2 + + j k . Let X : = ( X 1 , X 2 ) be a bivariate random variable with distribution P. Let P 1 , P 2 be the marginal distributions of P, i.e., P 1 ( A ) = P ( A × R ) = P π 1 1 ( A ) for all A B , and P 2 ( B ) = P ( R × B ) = P π 2 1 ( B ) for all B B , where π 1 , π 2 are projections given by π 1 ( x 1 , x 2 ) = x 1 and π 2 ( x 1 , x 2 ) = x 2 for all ( x 1 , x 2 ) R 2 . Here B is the Borel σ -algebra on R . Then, X 1 has distribution P 1 and X 2 has distribution P 2 . Let S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) denote respectively the inverse images of the horizontal and vertical components of the transformations S ( i , j ) for all i , j N . Then, the following lemma is known [16,17,18]:
Lemma 2.
Let P 1 and P 2 be the marginal distributions of the probability measure P. Then,
$$P_1=\sum_{i=1}^{\infty}\frac{1}{2^i}\,P_1\circ\big(S_{(i,j)}^{(1)}\big)^{-1}\quad\text{and}\quad P_2=\sum_{j=1}^{\infty}\frac{1}{2^j}\,P_2\circ\big(S_{(i,j)}^{(2)}\big)^{-1}.$$
Remark 1.
Since S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) are similarity mappings, from Lemma 2, one can see that both the marginal distributions P 1 and P 2 are self-similar measures on R generated by an infinite collection of similarities associated with the probability vector ( 1 2 , 1 2 2 , ) .
Lemma 3.
Let E ( X ) and V ( X ) denote the expectation and the variance of the random variable X. Then,
$$E(X)=(E(X_1),E(X_2))=\Big(\frac12,\frac12\Big)\quad\text{and}\quad V:=V(X)=E\Big\|X-\Big(\frac12,\frac12\Big)\Big\|^2=\frac14.$$
Proof. 
By Lemma 2, $P_1=P_2=\mu$, where μ is the unique Borel probability measure on $\mathbb{R}$ such that
$$\mu=\sum_{k=1}^{\infty}\frac{1}{2^k}\,\mu\circ\big(S_{(k,j)}^{(1)}\big)^{-1}=\sum_{k=1}^{\infty}\frac{1}{2^k}\,\mu\circ\big(S_{(i,k)}^{(2)}\big)^{-1}.$$
Hence, $X_1$ and $X_2$ are identically distributed, and by ([11], Lemma 2.2), $E(X_1)=E(X_2)=\frac12$ and $V(X_1)=V(X_2)=\frac18$, which implies that $E\|X-(\frac12,\frac12)\|^2=E(X_1-\frac12)^2+E(X_2-\frac12)^2=V(X_1)+V(X_2)=\frac14$. □
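Lemma 3 can be checked by solving the first- and second-moment fixed-point equations of $\mu=\sum_k 2^{-k}\mu\circ S_k^{-1}$, $S_k(x)=r^kx+1-r^{k-1}$, numerically. The sketch below is ours; it uses exact rational arithmetic with $r=\frac13$ and truncates the series at 60 terms (the tails are of order $(r/2)^{60}$ and negligible).

```python
from fractions import Fraction

r = Fraction(1, 3)
N = 60   # truncation level; tail terms are O((r/2)^N)

# Moment equations for mu = sum_k 2^{-k} mu o S_k^{-1}, S_k(x) = r^k x + c_k,
# with c_k = 1 - r^(k-1):
#   m1 = sum_k 2^{-k} (r^k  m1 + c_k)
#   m2 = sum_k 2^{-k} (r^{2k} m2 + 2 r^k c_k m1 + c_k^2)
a1 = sum(Fraction(1, 2 ** k) * r ** k for k in range(1, N))
b1 = sum(Fraction(1, 2 ** k) * (1 - r ** (k - 1)) for k in range(1, N))
m1 = b1 / (1 - a1)                     # first moment, should be 1/2
a2 = sum(Fraction(1, 2 ** k) * r ** (2 * k) for k in range(1, N))
b2 = sum(Fraction(1, 2 ** k) * (2 * r ** k * (1 - r ** (k - 1)) * m1
                                + (1 - r ** (k - 1)) ** 2)
         for k in range(1, N))
m2 = b2 / (1 - a2)                     # second moment
var1 = m2 - m1 ** 2                    # V(X_1) = V(X_2), should be 1/8
V = 2 * var1                           # V(X) = V(X_1) + V(X_2), should be 1/4
```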
Remark 2.
By using the standard rule of probability, for any $(a,b)\in\mathbb{R}^2$, we have $E\|X-(a,b)\|^2=V+\|(a,b)-(\frac12,\frac12)\|^2$, which yields that the optimal set of one-mean consists of the expected value, and the corresponding quantization error is the variance V of the random variable X.
Lemma 4.
Let ω I * . Then,
( i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) 1 2 ( 1 r ) , s ω ( 2 ) 1 2 ( 1 r ) ) ;
( i i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( 0 , s ω ( 2 ) 1 2 ( 1 r ) ) , and
( i i i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) 1 2 ( 1 r ) , 0 ) .
Proof. 
First prove ( i ) . Because P ( J ω ( , ) ) = p ω ( , ) = p ω and p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) = p ω 1 2 i + j ,
E ( X | X J ω ( , ) ) = E ( X | X i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ) = 1 P ( J ω ( , ) ) i , j = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) = i , j = 1 1 2 i + j S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) .
Notice that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) .
Because
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) r ω | ω | ( 1 ) + i ( 1 2 ) r ω | ω | ( 1 ) + i 1 r ω | ω | ( 1 ) + 1 ( 1 2 ) + r ω | ω | ( 1 ) + 1 1 = s ω ( 1 ) 1 2 r i r i 1 r 2 + 1 = s ω ( 1 ) ( 1 r 2 ) ( 1 r i 1 ) ,
and similarly
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) = s ω ( 2 ) ( 1 r 2 ) ( 1 r j 1 ) .
Hence, we have that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) ( u ) ) , s ω ( 2 ) ( v ) ) , where u = ( 2 r 2 ) ( 1 r i 1 ) and v = ( 2 r 2 ) ( 1 r j 1 ) . Therefore,
E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + i , j = 1 1 2 i + j ( s ω ( 1 ) ( u ) , s ω ( 2 ) ( v ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) ( 1 r 2 ) , s ω ( 2 ) ( 1 r 2 ) ) .
Proofs of (ii) and (iii) are similar. □
Note 1. 
For words β , γ , , δ in I * , by a ( β , γ , , δ ) we denote the conditional expectation of the random variable X given J β J γ J δ , i.e.,
a ( β , γ , , δ ) = E ( X | X J β J γ J δ ) = 1 P ( J β J δ ) J β J δ ( x 1 , x 2 ) d P .
Then, for ω I * ,
a ( ω ) = S ω ( E ( X ) ) = S ω ( 1 2 , 1 2 ) , a ( ω ( , ) ) = E ( X | X J ω ( , ) ) , a ( ω ( , ) ) = E ( X | X J ω ( , ) ) , and a ( ω ( , ) ) = E ( X | X J ω ( , ) ) .
Thus, by Lemma 4, if ω = ( 1 , 1 ) , then a ( ( 1 , 1 ) ) = ( r 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) ) = ( 1 r 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) ) = ( r 2 , 1 r 2 ) , and a ( ( 1 , 1 ) ( , ) ) = ( 1 r 2 , 1 r 2 ) . In addition,
a ( ( 1 , 1 ) , ( 1 , 1 ) ( , ) ) = ( 1 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) , ( 1 , 1 ) ( , ) ) = ( 1 2 , 1 r 2 ) , a ( ( 1 , 1 ) , ( 1 , 1 ) ( , ) ) = ( r 2 , 1 2 ) , a ( ( 1 , 1 ) ( , ) , ( 1 , 1 ) ( , ) ) = ( 1 r 2 , 1 2 ) .
Moreover, for $\omega\in I^k$, $k\ge1$, it is easy to see that
$$\int_{J_\omega}\|x-(a,b)\|^2\,dP=p_\omega\int\|(x_1,x_2)-(a,b)\|^2\,dP\circ S_\omega^{-1}=p_\omega\Big(s_\omega^{(1)2}V(X_1)+s_\omega^{(2)2}V(X_2)+\big\|S_\omega\big(\tfrac12,\tfrac12\big)-(a,b)\big\|^2\Big),$$
where $s_\omega^{(k)2}:=(s_\omega^{(k)})^2$ for $k=1,2$. The expressions (2) and (4) are useful for obtaining the optimal sets and the corresponding quantization errors with respect to the probability distribution P.
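Formula (4) can be sanity-checked by Monte Carlo for a concrete word, say $\omega=(1,1)$, for which $S_{(1,1)}(x_1,x_2)=(\frac{x_1}{3},\frac{x_2}{3})$, $p_\omega=\frac14$, and $s_\omega^{(1)}=s_\omega^{(2)}=\frac13$. The sketch below is ours; it samples P as the product $P_c\times P_c$ (the identification established later in the paper), uses $V(X_1)=V(X_2)=\frac18$, and takes the arbitrary test point $(a,b)=(0,0)$.

```python
import random

def cantor(depth=40):
    """Cantor-distributed coordinate: random base-3 digits in {0, 2}."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

random.seed(7)
a, b = 0.0, 0.0   # arbitrary test point (our choice)
n = 100_000
# Left side of (4): p_omega * E|| S_omega(X) - (a,b) ||^2 with S_(1,1)(x) = x/3.
lhs = 0.25 * sum((x1 / 3 - a) ** 2 + (x2 / 3 - b) ** 2
                 for x1, x2 in ((cantor(), cantor()) for _ in range(n))) / n
# Right side: p_omega ( s^(1)2 V(X1) + s^(2)2 V(X2) + ||S_omega(1/2,1/2)-(a,b)||^2 ),
# where S_(1,1)(1/2, 1/2) = (1/6, 1/6).
rhs = 0.25 * ((1/9) * (1/8) + (1/9) * (1/8) + (1/6 - a) ** 2 + (1/6 - b) ** 2)
```

For this choice of ω and $(a,b)$, both sides equal $\frac{1}{48}$.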
For the rest of the article, $r=\frac13$ is assumed; this is the most important case due to its intimate connection with the standard Cantor system.

3. Optimal Sets of n-Means for n = 2, 3

In this section, we determine the optimal sets of two- and three-means and their quantization errors.
Lemma 5.
Let P be the affine measure on R 2 and let ω I * . Then,
J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω x a ( ω ) 2 d P = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 .
Proof. 
Let us first prove J ω ( , ) x a ( ω ( , ) ) 2 d P = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 . By Lemma 4, we have
J ω ( , ) x a ( ω ( , ) ) 2 d P = i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) x a ( ω ( , ) ) 2 d P = p ω i , j = 1 1 2 i + j S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( x 1 , x 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) ( s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) , s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ) 2 d P .
Note that S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( x 1 , x 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( x 2 ) and
S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) . Moreover, we have
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ω ( 1 ) 2 S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ω ( 1 ) 2 ( S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) + ( S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ) ) 2 .
Now break the above expression by using the square formula and note the fact that
S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) 2 d P 1 = s ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) 2 V ( X 1 ) = s ( ω | ω | ( 1 ) , ω | ω | ( 2 ) ) ( 1 ) 2 1 9 i 1 8 , and S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) d P 1 = 0 , and after some simplification we have S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ( ω | ω | ( 1 ) , ω | ω | ( 2 ) ) ( 1 ) 2 1 4 ( 1 5 3 i ) 2 .
Thus, it follows that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 d P 1 = s ω ( 1 ) 2 1 9 i 1 8 + 1 4 ( 1 5 3 i ) 2 , and similarly S ω ( ω | ω | ( 2 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( x 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) 2 d P 2 = s ω ( 2 ) 2 1 9 j 1 8 + 1 4 ( 1 5 3 j ) 2 .
Therefore, (5) implies that
J ω ( , ) x a ( ω ( , ) ) 2 d P = p ω i , j = 1 1 2 i + j s ω ( 1 ) 2 1 9 i 1 8 + 1 4 ( 1 5 3 i ) 2 + s ω ( 2 ) 2 1 9 j 1 8 + 1 4 ( 1 5 3 j ) 2 = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 .
Other equalities of the statement are proved similarly. □
Lemma 6.
Let P be the affine measure on $\mathbb{R}^2$ and let $\{(a,p),(b,p)\}$ be a set of two points lying on the line $x_2=p$ for which the distortion error is smallest. Then, $a=\frac16$, $b=\frac56$, $p=\frac12$, and the distortion error is $\frac{5}{36}$.
Proof. 
Let β = { ( a , p ) , ( b , p ) } . Because the points for which the distortion error is smallest are the centroids of their own Voronoi regions, by the properties of centroids, we have
( a , p ) P ( M ( ( a , p ) | β ) ) + ( b , p ) P ( M ( ( b , p ) | β ) ) = ( 1 2 , 1 2 ) ,
which implies $p\,P(M((a,p)|\beta))+p\,P(M((b,p)|\beta))=\frac12$, i.e., $p=\frac12$. Thus, the boundary of the Voronoi regions is the line $x_1=\frac12$. Now, using the definition of conditional expectation,
( a , 1 2 ) = E ( X : X M ( ( a , 1 2 ) | β ) ) = E ( X : X j = 1 J ( 1 , j ) ) = 1 j = 1 p ( 1 , j ) j = 1 p ( 1 , j ) S ( 1 , j ) ( 1 2 , 1 2 ) ,
which implies ( a , 1 2 ) = ( 1 6 , 1 2 ) yielding a = 1 6 . Similarly, b = 5 6 . Then, the distortion error is
min c β x c 2 d P = j = 1 J ( 1 , j ) x ( 1 6 , 1 2 ) 2 d P + i = 2 , j = 1 J ( i , j ) x ( 5 6 , 1 2 ) 2 d P = 5 72 + 5 72 = 5 36 .
This completes the proof of the lemma. □
The following lemma provides us information on where to look for points of an optimal set of two-means.
Lemma 7.
Let P be the affine measure on $\mathbb{R}^2$. The points in an optimal set of two-means cannot lie on an oblique line of the affine set.
Proof. 
In the affine set, among all the oblique lines passing through the point $(\frac12,\frac12)$, the line $x_2=x_1$ has the maximum symmetry; i.e., the affine set is geometrically symmetric with respect to the line $x_2=x_1$. Observe also that if two basic rectangles of similar geometric shape lie on opposite sides of the line $x_2=x_1$ and are equidistant from it, then they have the same probability (see Figure 1); hence, they are symmetric with respect to the probability distribution P. Consequently, among all pairs of points whose Voronoi regions are separated by an oblique line through $(\frac12,\frac12)$, the pair whose Voronoi boundary is the line $x_2=x_1$ gives the smallest distortion error. Again, we know that the two points giving the smallest distortion error are the centroids of their own Voronoi regions. Let $(a_1,b_1)$ and $(a_2,b_2)$ be the centroids of the left half and the right half of the affine set with respect to the line $x_2=x_1$, respectively. Then, from the definition of conditional expectation, we have
( a 1 , b 1 ) = 2 [ i = 1 , j = i + 1 1 2 i + j S ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 i = 1 j = i + 1 1 2 2 k 1 + i + j S ( k 1 , k 1 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 i = 1 j = i + 1 1 2 2 k 1 + 2 k 2 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 k 3 = 1 i = 1 j = i + 1 1 2 2 k 1 + 2 k 2 + 2 k 3 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( k 3 , k 3 ) ( i , j ) ( 1 2 , 1 2 ) + ] = ( 3 10 , 7 10 ) ,
and
( a 2 , b 2 ) = 2 ( i = 1 j = 1 i 1 1 2 i + j S ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + i + j S ( k 1 , k 1 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + 2 k 2 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 k 3 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + 2 k 2 + 2 k 3 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( k 3 , k 3 ) ( i , j ) ( 1 2 , 1 2 ) + ) = ( 7 10 , 3 10 ) .
Let β = { ( 3 10 , 7 10 ) , ( 7 10 , 3 10 ) } . Then, due to symmetry,
min c β x c 2 d P = 2 M ( ( 3 10 , 7 10 ) | β ) x ( 3 10 , 7 10 ) 2 d P .
Write
A : = ( j = 2 4 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 2 6 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 3 5 J ( ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 2 , j ) ) ( j = 2 8 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 3 6 J ( 1 , 1 ) ( 1 , 1 ) ( 2 , j ) ) J ( 1 , 1 ) ( 1 , 1 ) ( 3 , 4 ) ( j = 2 8 J ( 1 , 1 ) ( 1 , j ) ) ( j = 3 7 J ( 1 , 1 ) ( 2 , j ) ) ( j = 4 6 J ( 1 , 1 ) ( 3 , j ) ) ( j = 2 10 J ( 1 , j ) ) ( j = 3 10 J ( 2 , j ) ) ( j = 4 10 J ( 3 , j ) ) ( j = 5 9 J ( 4 , j ) ) ( j = 6 7 J ( 5 , j ) ) .
Because A is a proper subset of M ( ( 3 10 , 7 10 ) | β ) , we have min c β x c 2 d P > 2 A x ( 3 10 , 7 10 ) 2 d P . Now using (4), and then upon simplification, it follows that
min c β x c 2 d P > 2 A x ( 3 10 , 7 10 ) 2 d P = 0.13899 ,
which is larger than the distortion error $\frac{5}{36}\approx0.13889$ obtained in Lemma 6. Hence, the points in an optimal set of two-means cannot lie on an oblique line of the affine set. Thus, the assertion of the lemma follows. □
Proposition 2.
Let P be the affine measure on R 2 . Then, the sets { ( 1 6 , 1 2 ) , ( 5 6 , 1 2 ) } and { ( 1 2 , 1 6 ) , ( 1 2 , 5 6 ) } form two different optimal sets of two-means with quantization error 5 36 .
Proof. 
By Lemma 7, it is known that the points in an optimal set of two-means cannot lie on an oblique line of the affine set. Thus, by Lemma 6, we see that { ( 1 6 , 1 2 ) , ( 5 6 , 1 2 ) } forms an optimal set of two-means with quantization error 5 36 . Due to symmetry, { ( 1 2 , 1 6 ) , ( 1 2 , 5 6 ) } forms another optimal set of two-means (see Figure 2); thus, the assertion follows. □
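Proposition 2 can be corroborated numerically: sampling P as the product $P_c\times P_c$ (the identification stated in the abstract) and averaging the squared distance to the nearest point of $\{(\frac16,\frac12),(\frac56,\frac12)\}$ should reproduce the quantization error $\frac{5}{36}\approx0.1389$. The sketch and names below are ours.

```python
import random

def cantor(depth=40):
    """Cantor-distributed coordinate: random base-3 digits in {0, 2}."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

def distortion(alpha, n=100_000):
    """Monte Carlo distortion of alpha with respect to P sampled as P_c x P_c."""
    total = 0.0
    for _ in range(n):
        x1, x2 = cantor(), cantor()
        total += min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in alpha)
    return total / n

random.seed(8)
v2 = distortion([(1/6, 1/2), (5/6, 1/2)])   # expect about 5/36 ~ 0.1389
```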
Proposition 3.
Let P be the affine measure on R 2 . Then, the set { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } forms an optimal set of three-means with quantization error 1 12 .
Proof. 
Let us first consider a three-point set β given by β = { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } . Then, by using Lemma 5 and Equation (4), we have
min a β x a 2 d P = J ( 1 , 1 ) x ( 1 6 , 1 6 ) 2 d P + J ( 1 , 1 ) ( , ) x ( 5 6 , 1 6 ) 2 d P + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) x ( 1 2 , 5 6 ) 2 d P = 1 12 .
Because V 3 is the quantization error for an optimal set of three-means, we have 1 12 V 3 . Let α = { ( a i , b i ) : 1 i 3 } be an optimal set of three-means. Because the optimal points are the centroids of their own Voronoi regions, we have α [ 0 , 1 ] × [ 0 , 1 ] . Let A 1 = [ 0 , 1 3 ] × [ 0 , 1 3 ] , A 2 = [ 2 3 , 1 ] × [ 0 , 1 3 ] , A 3 = [ 0 , 1 3 ] × [ 2 3 , 1 ] , and A 4 = [ 2 3 , 1 ] × [ 2 3 , 1 ] . Note that the centroids of A 1 , A 2 , A 3 and A 4 with respect to the probability distribution P are respectively ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) and ( 5 6 , 5 6 ) . Suppose that α does not contain any point from i = 1 4 A i . Then, we can assume that all the points of α are on the line x 2 = 1 2 , i.e., α = { ( a i , 1 2 ) : 1 i 3 } with a 1 < a 2 < a 3 . If a 1 > 1 3 , quantization error can be strictly reduced by moving the point ( a 1 , 1 2 ) to ( 1 3 , 1 2 ) . So, we can assume that a 1 1 3 . Similarly, we can show that a 3 2 3 . Now, if a 2 < 1 3 , then A 3 A 4 M ( ( a 3 , 1 2 ) | α ) . Moreover, for any x = ( x 1 , x 2 ) J ( 1 , 1 ) ( 1 , 1 ) J ( 1 , 3 ) , we have m ( x ) : = min c α ( x 1 , x 2 ) c 2 ( 7 18 ) 2 and so by (4) and Lemma 5, we obtain
m ( x ) 2 d P = J ( 1 , 1 ) ( 1 , 1 ) J ( 1 , 3 ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) m ( x ) 2 d P 1 16 ( 1 81 + 1 81 ) 1 8 + ( 7 18 ) 2 + 1 16 ( 1 9 + 1 27 2 ) 1 8 + ( 7 18 ) 2 + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) x ( 5 6 , 1 2 ) 2 d P = 1 16 ( 1 81 + 1 81 ) 1 8 + ( 7 18 ) 2 + 1 16 ( 1 9 + 1 27 2 ) 1 8 + ( 7 18 ) 2 + 5 72 = 1043 11664 > V 3 ,
which is a contradiction, and so a 2 1 3 must be true. If a 2 > 2 3 , similarly we can show that a contradiction arises. So, 1 3 < a 2 < 2 3 . Next, suppose that 1 2 a 2 < 2 3 . Then, we have 1 2 ( a 1 + a 2 ) 1 3 which implies a 1 1 6 , for otherwise quantization error can be strictly reduced by moving a 2 to ( 2 3 , 1 2 ) , contradicting the fact that α is an optimal set. Then, j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) M ( ( a 1 , 1 2 ) | α ) and E ( X : X j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) ) = ( 1 18 , 1 2 ) . So, for any ( x 1 , x 2 ) i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) , min c α ( x 1 , x 2 ) c 2 ( x 1 , x 2 ) ( 1 6 , 1 2 ) 2 . If A = j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) , B = i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) ,   A = j = 1 J ( 1 , 1 ) ( 1 , j ) and B = k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) , then
m ( x ) 2 d P > A ( x 1 , x 2 ) ( 1 18 , 1 2 ) 2 d P + B ( x 1 , x 2 ) ( 1 6 , 1 2 ) 2 d P = 2 A x ( 1 18 , 1 2 ) 2 d P + i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) x ( 1 6 , 1 2 ) 2 d P + B x ( 1 6 , 1 2 ) 2 d P = 2 · 41 2592 + 5 288 + 551 14688 = 953 11016 > V 3 ,
which is a contradiction. Similarly, if we assume 1 3 a 2 < 1 2 , a contradiction will arise. Therefore, all the points in α cannot lie on the line x 2 = 1 2 . Let ( a 1 , b 1 ) and ( a 3 , b 3 ) lie on the line x 2 = 1 2 , and let ( a 2 , b 2 ) lie above or below the horizontal line x 2 = 1 2 . If ( a 2 , b 2 ) is above the horizontal line, then the quantization error can be strictly reduced by moving ( a 1 , b 1 ) to A 1 and ( a 3 , b 3 ) to A 2 , contradicting the fact that α is an optimal set. Similarly, if ( a 2 , b 2 ) is below the horizontal line, a contradiction will arise. All these contradictions arise due to our assumption that α does not contain any point from i = 1 4 A i . Hence, α contains at least one point from i = 1 4 A i . To complete the proof of the proposition, we first prove the following claim:
Claim 1. 
card ( { i : α A i , 1 i 4 } ) = 2 .
For the sake of contradiction, assume that card ( { i : α A i , 1 i 4 } ) = 1 . Then, without any loss of generality, we assume that ( a 1 , b 1 ) A 1 and ( a i , b i ) A 2 A 3 A 4 for i = 2 , 3 . Due to the symmetry of the affine set with respect to the diagonal x 2 = x 1 , we can assume that ( a 1 , b 1 ) A 1 lies on the diagonal x 2 = x 1 ; ( a 2 , b 2 ) and ( a 3 , b 3 ) are equidistant from the diagonal x 2 = x 1 and lie on opposite sides of the diagonal x 2 = x 1 . Now, consider the following cases:
Case 1. Assume that both ( a 2 , b 2 ) and ( a 3 , b 3 ) are below the diagonal x 2 = 1 x 1 , but not in A 1 A 2 A 3 . Let ( a 2 , b 2 ) be above the diagonal x 2 = x 1 and ( a 3 , b 3 ) be below the diagonal x 2 = x 1 . In that case, the quantization error can be strictly reduced by moving ( a 2 , b 2 ) to A 3 and ( a 3 , b 3 ) to A 2 which contradicts the optimality of α .
Case 2. Assume that both ( a 2 , b 2 ) and ( a 3 , b 3 ) are above the diagonal x 2 = 1 x 1 . Let ( a 2 , b 2 ) lie above the diagonal x 2 = x 1 and ( a 3 , b 3 ) lie below the diagonal x 2 = x 1 . Then, due to symmetry we can assume that ( a 1 , b 1 ) = ( 1 6 , 1 6 ) which is the centroid of A 1 , ( a 2 , b 2 ) = ( 1 2 , 5 6 ) which is the midpoint of the line segment joining the centroids of A 3 and A 4 , ( a 3 , b 3 ) = ( 5 6 , 1 2 ) which is the midpoint of the line segment joining the centroids of A 2 and A 4 . Then,
m ( x ) 2 d P = J ( 1 , 1 ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P 1 144 + J ( 1 , 1 ) ( , ) x ( 1 2 , 5 6 ) 2 d P + J ( 1 , 1 ) ( , ) x ( 5 6 , 1 2 ) 2 d P + i = 2 j = i + 1 J ( i , j ) x ( 1 2 , 5 6 ) 2 d P = 1 144 + 5 144 + 5 144 + 1381 166320 = 7043 83160 > V 3 ,
which is a contradiction. Thus, card ( { i : α A i , 1 i 4 } ) = 1 cannot hold.
Next, for the sake of contradiction, assume that card ( { i : α A i , 1 i 4 } ) = 3 . Then, without any loss of generality we assume that ( a 1 , b 1 ) A 3 , ( a 2 , b 2 ) A 2 and ( a 3 , b 3 ) A 4 . Let A 11 and A 12 be the regions of A 1 which are respectively above and below the diagonal of A 1 passing through ( 0 , 0 ) . Due to symmetry, we must have A 3 A 11 M ( ( a 1 , b 1 ) | α ) and A 2 A 12 M ( ( a 2 , b 2 ) | α ) . Notice that A 3 A 11 M ( ( a 1 , b 1 ) | α ) implies
A 3 i = 1 , j = i + 1 J ( 1 , 1 ) ( i , j ) j = i + 1 k = 1 , i = 1 J ( 1 , 1 ) ( k , k ) ( i , j ) M ( ( a 1 , b 1 ) | α ) ,
and by using (1), we have
E ( X : X A 3 i = 1 , j = i + 1 J ( 1 , 1 ) ( i , j ) j = i + 1 k = 1 , i = 1 J ( 1 , 1 ) ( k , k ) ( i , j ) ) = ( 1385 9438 , 6173 9438 ) ,
which shows that the point ( a 1 , b 1 ) falls below the line x 2 = 2 3 , which is a contradiction, as we assumed that ( a 1 , b 1 ) A 3 . This contradiction arises due to our assumption that card ( { i : α A i , 1 i 4 } ) = 3 . Hence, we conclude that card ( { i : α A i , 1 i 4 } ) = 2 , which proves the claim.
By the claim, we assume that ( a 1 , b 1 ) A 1 and ( a 3 , b 3 ) A 2 . Notice that A 1 , A 2 , A 3 , A 4 are geometrically symmetric, and their centroids are symmetrically distributed over the square [ 0 , 1 ] × [ 0 , 1 ] . Without any loss of generality, we can assume that the optimal point ( a 1 , b 1 ) is the centroid of A 1 , i.e., ( a 1 , b 1 ) = ( 1 6 , 1 6 ) . Then, due to symmetry with respect to the line x 1 = 1 2 , it follows that ( a 3 , b 3 ) = centroid of A 2 = ( 5 6 , 1 6 ) , and ( a 2 , b 2 ) lies on x 1 = 1 2 but above the line x 2 = 1 2 . Now, notice that
min ( a 2 , b 2 ) [ 1 3 , 2 3 ] × [ 2 3 , 1 ] { ( 1 6 , 5 6 ) ( a 2 , b 2 ) 2 + ( 5 6 , 5 6 ) ( a 2 , b 2 ) 2 } = 2 9 ,
which occurs when ( a 2 , b 2 ) = center of [ 1 3 , 2 3 ] × [ 2 3 , 1 ] = ( 1 2 , 5 6 ) . Moreover, the three points ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) and ( 1 2 , 5 6 ) are the centroids of their own Voronoi regions. Thus, { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } forms an optimal set of three-means with quantization error V 3 = 1 12 . Hence, the proposition follows. □
Remark 3.
Due to symmetry, in addition to the optimal set given in Proposition 3, there are three more optimal sets of three-means with quantization error V 3 = 1 12 (see Figure 3).
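Writing the affine measure as the product of two Cantor distributions (this identification is established in Section 4 below), the values V 3 = 1 12 and V 4 = 1 36 can be re-derived by a short exact computation. The sketch below is illustrative only; the helper name and its bias-variance framing are ours. It uses the mean 1 2 and variance 1 8 of the Cantor distribution in each coordinate:

```python
from fractions import Fraction as Fr

VAR = Fr(1, 8)  # variance of the Cantor distribution P_c; its mean is 1/2

def cell_distortion(k1, k2, m1, m2, c):
    """Exact distortion of the point c over a product cell A_sigma x A_tau with
    |sigma| = k1, |tau| = k2 and centroid (m1, m2), for P = P_c x P_c.
    Uses the split E|X - c|^2 = Var(X) + |E(X) - c|^2 in each coordinate."""
    weight = Fr(1, 2) ** (k1 + k2)                  # P-measure of the cell
    var = (Fr(1, 9) ** k1 + Fr(1, 9) ** k2) * VAR   # conditional variance
    bias = (m1 - c[0]) ** 2 + (m2 - c[1]) ** 2
    return weight * (var + bias)

# V_4: the four level-1 cells, each quantized at its own centroid
centroids = [(Fr(1, 6), Fr(1, 6)), (Fr(5, 6), Fr(1, 6)),
             (Fr(1, 6), Fr(5, 6)), (Fr(5, 6), Fr(5, 6))]
V4 = sum(cell_distortion(1, 1, m1, m2, (m1, m2)) for m1, m2 in centroids)

# V_3: the two bottom cells at their centroids, the whole top strip at (1/2, 5/6)
V3 = (cell_distortion(1, 1, Fr(1, 6), Fr(1, 6), (Fr(1, 6), Fr(1, 6)))
      + cell_distortion(1, 1, Fr(5, 6), Fr(1, 6), (Fr(5, 6), Fr(1, 6)))
      + cell_distortion(0, 1, Fr(1, 2), Fr(5, 6), (Fr(1, 2), Fr(5, 6))))

assert V3 == Fr(1, 12) and V4 == Fr(1, 36)
```

The same routine reproduces the contribution 5 72 of the top strip used in the proof of Proposition 3.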

4. Affine Measures

In this section, we show that the affine measure P under consideration is the direct product of the Cantor distribution P c with itself.
For the rest of the article, by a word σ of length k over the alphabet { 1 , 2 } we mean σ : = σ 1 σ 2 σ k { 1 , 2 } k , k 1 . The word of length zero is the empty word ∅. { 1 , 2 } * denotes the set of all words over the alphabet { 1 , 2 } including the empty word ∅. The length of a word σ { 1 , 2 } * is denoted by | σ | . If σ = σ 1 σ 2 σ k , we write U σ : = U σ 1 U σ 2 U σ k , and U ∅ denotes the identity mapping on R . By u σ we denote the similarity ratio of U σ . If X c is the random variable with distribution P c , then E ( X c ) = 1 2 and V ( X c ) = 1 8 [10]. For σ { 1 , 2 } * , write A ( σ ) : = U σ ( 1 2 ) . Notice that for σ { 1 , 2 } * , we have 1 2 ( A ( σ 1 ) + A ( σ 2 ) ) = A ( σ ) , u σ = 1 3 | σ | , the contractive factor of U σ , and for the empty word ∅, A ( ∅ ) = 1 2 . For σ { 1 , 2 } * define A σ : = U σ [ 0 , 1 ] . For any positive integer n, by 2 * n we mean the concatenation of the symbol 2 with itself n times, i.e., 2 * n = 222 ( n times ) , with the convention that 2 * 0 is the empty word. For any positive integer k, by { 1 , 2 } k * 2 we mean the direct product of the set { 1 , 2 } k with itself, and by { 1 , 2 } 0 * 2 the set { ( ∅ , ∅ ) } . Also, recall the notations defined in Section 2. Let us now introduce the map F : N * { ( σ , ∞ ) : σ N * } { 1 , 2 } * such that
F ( x ) = f ( σ 1 ) f ( σ 2 ) f ( σ | σ | ) if x = σ = σ 1 σ 2 σ | σ | , f ( σ 1 ) f ( σ 2 ) f ( σ | σ | , ∞ ) if x = ( σ 1 σ 2 σ | σ | , ∞ ) , ∅ if x = ∅ ,
where f : N { ( n , ∞ ) : n N } { 1 , 2 } * { ∅ } is such that
f ( x ) = 2 * ( n 1 ) 1 if x = n for some n N , 2 * n if x = ( n , ∞ ) for some n N .
The function f is one-to-one and onto, and consequently, F is also one-to-one and onto. For any σ N * , write A F ( σ ) : = A ( F ( σ ) ) and A F ( σ , ) : = A ( F ( σ , ) ) .
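The maps f and F are easy to sketch in code. In the following illustration (the encoding of a letter ( n , ∞ ) as a pair with the marker "inf", and the use of strings and tuples, are our own representation choices):

```python
def f(x):
    """The letter map f of (6): an integer n stands for n in N, and a pair
    (n, "inf") stands for (n, infinity). Returns a word over {1, 2}."""
    if isinstance(x, tuple):          # x = (n, infinity)
        return "2" * x[0]             # f((n, infinity)) = 2*n
    return "2" * (x - 1) + "1"        # f(n) = 2*(n-1) followed by 1

def F(word):
    """F concatenates f over the letters of a word in N* (here, a tuple)."""
    return "".join(f(letter) for letter in word)

assert (F((1,)), F((2,)), F((1, 2)), F(((3, "inf"),))) == ("1", "21", "121", "222")
# |F(sigma)| equals the sum of the letters of sigma, the fact used in Lemma 10:
assert len(F((2, 3, 1))) == 2 + 3 + 1
```

Since each f-image either ends in 1 or consists entirely of 2s, distinct inputs concatenate to distinct words, which is one way to see that F is one-to-one.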
The map F is instrumental in converting the infinitely generated affine measure P to a finitely generated affine measure P c × P c . Furthermore, to improve the clarity of the arguments, we will write T i for S ( i , j ) ( 1 ) , and T j for S ( i , j ) ( 2 ) , where T k for all k 1 form an infinite collection of similarity mappings on R such that T k ( x ) = 1 3 k x + 1 1 3 k 1 for all x R . Thus, if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i n , j n ) , then S ω ( 1 ) = T i 1 T i n = T i 1 i 2 i n and S ω ( 2 ) = T j 1 T j n = T j 1 j 2 j n for all n 1 . Again, T is the identity mapping on R .
Lemma 8.
Let T k for k 1 be the infinite collection of similitudes defined above, and U 1 and U 2 be the similitudes generating the Cantor set. Then, for any σ N * and x R , we have T σ ( x ) = U F ( σ ) ( x ) .
Proof. 
If σ = 1 , then T 1 ( x ) = 1 3 x = U 1 ( x ) = U F ( 1 ) ( x ) for any x R . Assume that the lemma is true if σ = k for some positive integer k, i.e., T k ( x ) = U F ( k ) ( x ) . Then,
U F ( k + 1 ) ( x ) = U 2 * k 1 ( x ) = U 2 * ( k 1 ) 21 ( x ) = U 2 * ( k 1 ) U 21 ( x ) = U 2 * ( k 1 ) ( 1 9 x + 2 3 ) = U 2 * ( k 1 ) 1 ( 3 ( 1 9 x + 2 3 ) ) = U F ( k ) ( 1 3 x + 2 ) = T k ( 1 3 x + 2 ) = 1 3 k ( 1 3 x + 2 ) + 1 1 3 k 1 = 1 3 k + 1 x + 1 1 3 k = T k + 1 ( x ) .
Thus, by the Principle of Mathematical Induction, T k ( x ) = U F ( k ) ( x ) for all k N . Again, for any τ , δ N * , by (6), it follows that F ( σ δ ) = F ( σ ) F ( δ ) . Hence, for any σ = σ 1 σ 2 σ n N * , n 1 , we have
T σ ( x ) = T σ 1 T σ 2 T σ n ( x ) = U F ( σ 1 ) U F ( σ 2 ) U F ( σ n ) ( x ) = U F ( σ ) ( x ) ,
which completes the proof. □
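Lemma 8 can be spot-checked numerically. In the sketch below (words over { 1 , 2 } are strings and words over N are tuples; these representation choices are ours), both sides are evaluated with exact rational arithmetic:

```python
from fractions import Fraction as Fr

def U(word, x):
    """Apply U_word, the composition U_w1 ... U_wk, where
    U_1(x) = x/3 and U_2(x) = x/3 + 2/3."""
    for w in reversed(word):                     # the innermost map acts first
        x = x / 3 if w == "1" else x / 3 + Fr(2, 3)
    return x

def T(sigma, x):
    """Apply T_sigma, where T_k(x) = x/3^k + 1 - 1/3^(k-1)."""
    for k in reversed(sigma):
        x = x / 3 ** k + 1 - Fr(1, 3 ** (k - 1))
    return x

def F(sigma):
    """F restricted to N*: each letter k maps to 2*(k-1) followed by 1."""
    return "".join("2" * (k - 1) + "1" for k in sigma)

x = Fr(7, 5)  # an arbitrary rational test point
assert all(T(s, x) == U(F(s), x) for s in [(1,), (3,), (2, 3), (1, 2, 4)])
```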
Lemma 9.
Let ω I * , and F be the function as defined in (6). Then for r = 1 , 2 , we have A F ( ω ( r ) ) = S ω ( r ) ( 1 2 ) , and A F ( ω ( r ) , ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( r ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( r ) .
Proof. 
By Lemma 8, we have
A F ( ω ( 1 ) ) = U F ( ω ( 1 ) ) ( 1 2 ) = T ω ( 1 ) ( 1 2 ) = S ω ( 1 ) ( 1 2 ) , and similarly A F ( ω ( 2 ) ) = S ω ( 2 ) ( 1 2 ) .
Without any loss of generality, we can assume ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) for k 1 . Then,
A F ( ω ( 1 ) , ) = U F ( i 1 i 2 i k , ) ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U F ( i k , ) ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U 2 * i k ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U 2 * i k 1 ( U 1 1 ( 1 2 ) ) = U F ( i 1 i 2 i k 1 ) U F ( i k + 1 ) ( 3 2 ) = U F ( i 1 i 2 i k 1 ( i k + 1 ) ) ( 3 2 ) = T i 1 i 2 i k 1 ( i k + 1 ) ( 3 2 ) = S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) .
Because S ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) = 1 3 i k + 1 3 2 + 1 1 3 i k 1 3 i k + 1 1 2 1 + 1 3 i k = 1 3 i k + 1 , we have
S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) ( S ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) ) = s ω ( 1 ) 1 3 i k + 1 = s ω ( i k + 1 , j k + 1 ) ( 1 ) = s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) , which yields
A F ( ω ( 1 ) , ) = S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) . Similarly, A F ( ω ( 2 ) , ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) . □
Remark 4.
By Lemmas 4 and 9, for any ω I * , we have
a ( ω ) = ( A F ( ω ( 1 ) ) , A F ( ω ( 2 ) ) ) , a ( ω ( , ) ) = ( A F ( ω ( 1 ) , ) , A F ( ω ( 2 ) , ) ) , a ( ω ( , ) ) = ( A F ( ω ( 1 ) , ) , A F ( ω ( 2 ) ) ) , and a ( ω ( , ) ) = ( A F ( ω ( 1 ) ) , A F ( ω ( 2 ) , ) ) .
The following example illustrates the outcome of the lemma above.
Example 1.
a ( ( 1 , 1 ) ) = ( A F ( 1 ) , A F ( 1 ) ) = ( A ( 1 ) , A ( 1 ) ) = ( 1 6 , 1 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 , ) , A F ( 1 ) ) = ( A ( 2 ) , A ( 1 ) ) = ( 5 6 , 1 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 ) , A F ( 1 , ) ) = ( A ( 1 ) , A ( 2 ) ) = ( 1 6 , 5 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 , ) , A F ( 1 , ) ) = ( A ( 2 ) , A ( 2 ) ) = ( 5 6 , 5 6 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ) = ( A F ( 11 ) , A F ( 11 ) ) = ( A ( 11 ) , A ( 11 ) ) = ( 1 18 , 1 18 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 , ) , A F ( 11 ) ) = ( A ( 12 ) , A ( 11 ) ) = ( 5 18 , 1 18 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 ) , A F ( 11 , ) ) = ( A ( 11 ) , A ( 12 ) ) = ( 1 18 , 5 18 ) , and
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 , ) , A F ( 11 , ) ) = ( A ( 12 ) , A ( 12 ) ) = ( 5 18 , 5 18 ) , etc.
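The coordinates in Example 1 come from A ( σ ) = U σ ( 1 2 ) , which a few lines of exact arithmetic confirm (an illustrative sketch; the helper name is ours):

```python
from fractions import Fraction as Fr

def A(word):
    """A(sigma) = U_sigma(1/2), with U_1(x) = x/3 and U_2(x) = x/3 + 2/3."""
    x = Fr(1, 2)
    for w in reversed(word):        # the innermost map acts first
        x = x / 3 if w == "1" else x / 3 + Fr(2, 3)
    return x

assert [A(w) for w in ("1", "2", "11", "12")] == \
       [Fr(1, 6), Fr(5, 6), Fr(1, 18), Fr(5, 18)]
assert (A("11") + A("12")) / 2 == A("1")   # the midpoint identity noted above
```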
Lemma 10.
Let μ = k = 1 1 2 k μ T k 1 . Then, for any σ N * , we have μ ( T σ [ 0 , 1 ] ) = P c ( A F ( σ ) ) , where P c : = 1 2 P c U 1 1 + 1 2 P c U 2 1 .
Proof. 
Without any loss of generality, let σ = i 1 i 2 i k for any k 1 . Observe that F ( σ ) = F ( i 1 ) F ( i 2 ) F ( i k ) , and thus | F ( σ ) | = | F ( i 1 ) | + | F ( i 2 ) | + + | F ( i k ) | = i 1 + i 2 + + i k . Consequently,
μ ( T σ [ 0 , 1 ] ) = 1 2 i 1 + i 2 + + i k = 1 2 | F ( σ ) | = P c ( A F ( σ ) ) ,
which proves the lemma. □
Proposition 4.
Let P be the affine measure. Then, P = P c × P c , where P c is the Cantor distribution.
Proof. 
The Borel σ -algebra on the affine set is generated by all sets of the form J ( δ , τ ) for ( δ , τ ) I * , where J ( δ , τ ) = S ( δ , τ ) ( [ 0 , 1 ] × [ 0 , 1 ] ) . Notice that
J ( δ , τ ) = T δ [ 0 , 1 ] × T τ [ 0 , 1 ] = U F ( δ ) [ 0 , 1 ] × U F ( τ ) [ 0 , 1 ] = A F ( δ ) × A F ( τ ) .
Again, the sets of the form A α , where α { 1 , 2 } * , generate the Borel σ -algebra on the Cantor set C. Thus, we see that the Borel σ -algebra of the affine set is the same as the product of the Borel σ -algebras on the Cantor set. Moreover, for any ( δ , τ ) I * , by Remark 1 and Lemma 10, we have
P ( J ( δ , τ ) ) = μ ( T δ [ 0 , 1 ] ) μ ( T τ [ 0 , 1 ] ) = P c ( A F ( δ ) ) P c ( A F ( τ ) ) = ( P c × P c ) ( A F ( δ ) × A F ( τ ) ) .
Hence, the proposition follows. □
Remark 5.
By Proposition 4, it follows that the optimal sets of n-means for P are the same as the optimal sets of n-means for the product measure P c × P c on the affine set. Moreover, for k 1 we can write
P = P c × P c = ( σ , τ ) { 1 , 2 } k * 2 1 4 k ( P c × P c ) ( U σ , U τ ) 1 ,
where for ( x 1 , x 2 ) R 2 , ( U σ , U τ ) 1 ( x 1 , x 2 ) = ( U σ 1 ( x 1 ) , U τ 1 ( x 2 ) ) .

5. Optimal Sets of n-Means for all n 4

In this section, we will derive closed formulas to determine the optimal sets of n-means and the nth quantization error for all n 4 . For ( σ , τ ) { 1 , 2 } k * 2 , write A ( σ , τ ) : = A σ × A τ and U ( σ , τ ) : = ( U σ , U τ ) .
Lemma 11.
Let α be an optimal set of n-means with n 4 . Then, α A ( i , j ) for all 1 i , j 2 .
Proof. 
Let α be an optimal set of n-means for n 4 . As the optimal points are the centroids of their own Voronoi regions, we have α A × A : = [ 0 , 1 ] × [ 0 , 1 ] .
Consider the four-point set β given by β = { ( A ( i ) , A ( j ) ) : 1 i , j 2 } . Then,
min c β x c 2 d P = i , j = 1 2 A ( i , j ) x ( A ( i ) , A ( j ) ) 2 d ( P c × P c ) = i , j = 1 2 1 4 ( 1 9 + 1 9 ) 1 8 = 1 36 .
Because V 4 is the quantization error for four-means, we have 1 36 V 4 V n .
Assume that α does not contain any point from i , j = 1 2 A ( i , j ) . We know that
( a , b ) α ( a , b ) P ( M ( ( a , b ) | α ) ) = ( 1 2 , 1 2 ) .
If all the points of α are below the line x 2 = 1 2 , i.e., if b < 1 2 for all ( a , b ) α , then by (7), we see that 1 2 = ( a , b ) α b P ( M ( ( a , b ) | α ) ) < ( a , b ) α 1 2 P ( M ( ( a , b ) | α ) ) = 1 2 , which is a contradiction. Similarly, a contradiction arises if all the points of α are above the line x 2 = 1 2 , or to the left of the line x 1 = 1 2 , or to the right of the line x 1 = 1 2 .
Next, suppose that all the points of α are on the line x 2 = 1 2 . We will consider two cases: n = 4 and n > 4 . When n = 4 , let α = { ( a i , 1 2 ) : 1 i 4 } with a i < a j for i < j . Due to symmetry, we can assume that the boundaries between the Voronoi regions of the consecutive points ( a 1 , 1 2 ) , ( a 2 , 1 2 ) , ( a 3 , 1 2 ) , and ( a 4 , 1 2 ) are, respectively, the lines x 1 = 1 6 , x 1 = 1 2 , and x 1 = 5 6 , yielding α = { ( 1 18 , 1 2 ) , ( 5 18 , 1 2 ) , ( 13 18 , 1 2 ) , ( 17 18 , 1 2 ) } . Then, writing B : = A ( 11 , 11 ) A ( 11 , 12 ) A ( 11 , 21 ) A ( 11 , 22 ) , by symmetry we have
min c α x c 2 d P = 4 B x ( 1 18 , 1 2 ) 2 d ( P c × P c ) = 8 A ( 11 , 11 ) x ( 1 18 , 1 2 ) 2 d ( P c × P c ) + 8 A ( 11 , 12 ) x ( 1 18 , 1 2 ) 2 d ( P c × P c ) = 8 ( 65 5184 + 17 5184 ) = 41 324 > V 4 ,
which is a contradiction. We consider the case n > 4 . Because for any ( x 1 , x 2 ) i , j = 1 2 A i j , min c α ( x 1 , x 2 ) c 2 1 36 , we have
min c α x c 2 d P = i , j = 1 2 A ( i , j ) min c α x c 2 d ( P c × P c ) i , j = 1 2 A ( i , j ) 1 36 d ( P c × P c ) = 1 36 ,
which implies 1 36 V 4 > V n , a contradiction. Thus, we see that the points of α cannot all lie on x 2 = 1 2 . Similarly, they cannot all lie on x 1 = 1 2 .
Notice that the lines x 1 = 1 2 and x 2 = 1 2 partition the square [ 0 , 1 ] × [ 0 , 1 ] into four quadrants with center ( 1 2 , 1 2 ) . If n = 4 k for some positive integer k, due to symmetry, we can assume that each quadrant contains k points from the set α . But then, any of the k points in the quadrant containing a basic rectangle A ( i , j ) can be moved to A ( i , j ) , which strictly reduces the quantization error; this gives a contradiction, as we assumed that α is an optimal set of n-means that does not contain any point from A ( i , j ) for 1 i , j 2 .
If n = 4 k + 1 , 4 k + 2 , or n = 4 k + 3 , then, again due to symmetry, each quadrant contains at least k points. Then, as in the case n = 4 k , one can strictly reduce the quantization error by moving a point in the quadrant containing a basic rectangle A ( i , j ) to A ( i , j ) for 1 i , j 2 , which is a contradiction.
Thus, we have proved that α A ( i , j ) for all 1 i , j 2 . □
Lemma 12.
Let α be an optimal set of n-means with n 4 . Then, α i , j = 1 2 A ( i , j ) .
Proof. 
By Lemma 11, we know that α A ( i , j ) for all 1 i , j 2 . Now, we will prove the statement by considering four distinct cases:
Case 1: n = 4 k for some integer k 1 .
In this case, due to symmetry, we can assume that α contains k points from each of A ( i , j ) ; otherwise, the quantization error can be reduced by redistributing the points of α equally among A ( i , j ) for 1 i , j 2 , and so α i , j = 1 2 A ( i , j ) .
Case 2: n = 4 k + 1 for some integer k 1 .
In this case, again due to symmetry, we can assume that α contains k points from each of A ( i , j ) , and if possible, one point, say ( a , b ) , from A ( , ) i , j = 1 2 A ( i , j ) . By symmetry, one can assume that ( a , b ) is the midpoint of the line segment joining any two centroids of the basic rectangles A ( i , j ) for 1 i , j 2 . Let us first take ( a , b ) = ( 1 2 , 1 2 ) which is the center of the affine set. For simplicity, we first assume k = 1 , i.e., n = 5 . Then, α contains only one point from each of A ( i , j ) . Let ( a 1 , b 1 ) be the point that α takes from A ( 1 , 1 ) . As ( 1 2 , 1 2 ) lies on the diagonal x 2 = x 1 , due to symmetry we can also assume that ( a 1 , b 1 ) lies on the diagonal x 2 = x 1 . By Proposition 1, we have P ( M ( ( 1 2 , 1 2 ) | α ) ) > 0 . This yields that 1 2 ( ( a 1 , b 1 ) + ( 1 2 , 1 2 ) ) < ( 1 3 , 1 3 ) which implies a 1 < 1 6 and b 1 < 1 6 . Then, we see that
1 36 = V 4 V 5 = 4 A ( 1 , 1 ) min c { ( a 1 , b 1 ) , ( 1 2 , 1 2 ) } x c 2 d P > min c β x c 2 d P = 2 81 V 5 ,
where β = { ( 1 18 , 1 18 ) , ( 1 18 , 5 18 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) , ( 5 6 , 5 6 ) } , which is a contradiction. Similarly, if we take ( a , b ) as the midpoint of the line segment joining the centroids of any two adjacent basic rectangles A ( i , j ) for 1 i , j 2 , a contradiction arises. Proceeding in a similar way for k = 2 , 3 , , we see that a contradiction arises for each value of k. Therefore, α i , j = 1 2 A ( i , j ) .
Case 3: n = 4 k + 2 for some integer k 1 .
In this case, due to symmetry, we can assume that α contains k points from each of A ( i , j ) , and if possible, two points, say ( a 1 , b 1 ) and ( a 2 , b 2 ) , from A ( , ) i , j = 1 2 A ( i , j ) . Then, by symmetry, we can assume that ( a 1 , b 1 ) lies at the midpoint of the line segment joining the centroids of A ( 1 , 1 ) and A ( 2 , 1 ) , and ( a 2 , b 2 ) lies at the midpoint of the line segment joining the centroids of A ( 1 , 2 ) and A ( 2 , 2 ) . As in Case 2, this leads to a contradiction. Thus, α i , j = 1 2 A ( i , j ) .
Case 4: n = 4 k + 3 for some integer k 1 . Due to symmetry, in this case, we can assume that each of A ( 1 , 1 ) and A ( 2 , 1 ) contains k + 1 points and each of A ( 1 , 2 ) and A ( 2 , 2 ) contains k points, while the remaining point lies at the midpoint of the line segment joining the centroids of A ( 1 , 2 ) and A ( 2 , 2 ) . But in that case, proceeding as in Case 2, we can show that a contradiction arises. Thus, α i , j = 1 2 A ( i , j ) .
We have shown that in all possible cases α i , j = 1 2 A ( i , j ) ; hence, the lemma follows. □
Corollary 1.
The set { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) , ( 5 6 , 5 6 ) } is the unique optimal set of four-means of the affine measure P with quantization error V 4 = 1 36 (see Figure 4).
Remark 6.
Let α be an optimal set of n-means, and n i j = card ( β i j ) where β i j = α A ( i , j ) for 1 i , j 2 . Then, 0 | n i j n p q | 1 for 1 i , j , p , q 2 .
Lemma 13.
Let n 4 and α be an optimal set of n-means for the product measure P c × P c . For 1 i , j 2 , set β i j : = α A ( i , j ) , and let n i j = card ( β i j ) . Then, U ( i , j ) 1 ( β i j ) is an optimal set of n i j -means, and V n = i , j = 1 2 1 36 V n i j .
Proof. 
For n 4 , by Lemma 11, we have α = i , j = 1 2 β i j , n = i , j = 1 2 n i j , and so
V n = i , j = 1 2 A ( i , j ) min a β i j x a 2 d ( P c × P c ) .
If U ( 1 , 1 ) 1 ( β 11 ) is not an optimal set of n 11 -means for P c × P c , then there exists a set γ 11 R 2 with card ( γ 11 ) = n 11 such that min a γ 11 x a 2 d ( P c × P c ) < min a U ( 1 , 1 ) 1 ( β 11 ) x a 2 d ( P c × P c ) . But then, δ : = U ( 1 , 1 ) ( γ 11 ) β 12 β 21 β 22 is a set of cardinality n and it satisfies min a δ x a 2 d ( P c × P c ) < min a α x a 2 d ( P c × P c ) , contradicting the fact that α is an optimal set of n-means for P c × P c . Similarly, it can be proved that U ( 1 , 2 ) 1 ( β 12 ) , U ( 2 , 1 ) 1 ( β 21 ) , and U ( 2 , 2 ) 1 ( β 22 ) are optimal sets of n 12 -, n 21 -, and n 22 -means respectively. Thus,
V n = i , j = 1 2 1 4 min a β i j x a 2 d ( ( P c × P c ) U ( i , j ) 1 ) = i , j = 1 2 1 36 min a U ( i , j ) 1 ( β i j ) x a 2 d P = i , j = 1 2 1 36 V n i j ,
which gives the lemma. □
Proposition 5.
Let n N be such that n = 4 ( n ) for some positive integer ( n ) . Then, the set
α 4 ( n ) : = ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( A ( σ ) , A ( τ ) ) }
forms a unique optimal set of n-means for the affine measure P with quantization error
V 4 ( n ) = 1 4 1 9 ( n ) .
Proof. 
We will prove the statement by induction. By Corollary 1, it is true if ( n ) = 1 . Let us assume that it is true for n = 4 k for some positive integer k. We now show that it is also true if n = 4 k + 1 . Let β be an optimal set of 4 k + 1 -means. Set β i j : = β A ( i , j ) for 1 i , j 2 . Then, by Lemmas 11 and 13, U ( i , j ) 1 ( β i j ) is an optimal set of 4 k -means, and so U ( i , j ) 1 ( β i j ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } k * 2 } which implies β i j = { ( A ( i σ ) , A ( j τ ) ) : ( σ , τ ) { 1 , 2 } k * 2 } . Thus, β = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( k + 1 ) * 2 } is an optimal set of 4 k + 1 -means. Because ( A ( σ ) , A ( τ ) ) is the centroid of A ( σ , τ ) for each ( σ , τ ) I k + 1 , the set β is unique. Now, by Lemma 13, we have the quantization error as
V 4 k + 1 = i , j = 1 2 1 36 V 4 k = 1 9 · 1 4 · 1 9 k = 1 4 1 9 k + 1 .
Thus, by induction, the proof of the proposition is complete. □
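Proposition 5 can be sanity-checked by iterating the recursion behind Lemma 13 (an illustrative sketch; the function name is ours):

```python
from fractions import Fraction as Fr

def V_power_of_4(ell):
    """V_{4^ell}: start from V_4 = 1/36 (Corollary 1) and apply the recursion
    of Lemma 13, V_{4n} = 4 * (1/36) * V_n, ell - 1 times."""
    V = Fr(1, 36)
    for _ in range(ell - 1):
        V = 4 * Fr(1, 36) * V
    return V

assert all(V_power_of_4(l) == Fr(1, 4) * Fr(1, 9) ** l for l in range(1, 9))
```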
Definition 1.
For n N with n 4 let ( n ) be the unique natural number with 4 ( n ) < n 2 · 4 ( n ) . For I { 1 , 2 } ( n ) * 2 with card ( I ) = n 4 ( n ) let α n ( I ) be the set defined as follows:
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) .
Remark 7.
In Definition 1, instead of choosing the set { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose { ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } , i.e., the set associated with each ( σ , τ ) I can be chosen in two different ways. Moreover, the subset I can be chosen from { 1 , 2 } ( n ) * 2 in 4 ( n ) C n 4 ( n ) ways. Hence, the number of the sets α n ( I ) is 2 card ( I ) · 4 ( n ) C n 4 ( n ) .
The following example illustrates Definition 1.
Example 2.
Let n = 5 . Then, ( n ) = 1 , I { 1 , 2 } * 2 with card ( I ) = 1 , and so
α 5 ( { ( 1 , 1 ) } ) = { ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( 1 ) ) , ( A ( 2 ) , A ( 2 ) ) } { ( A ( 11 ) , A ( 1 ) ) , ( A ( 12 ) , A ( 1 ) ) } = { ( 1 6 , 5 6 ) , ( 5 6 , 1 6 ) , ( 5 6 , 5 6 ) } { ( 1 18 , 1 6 ) , ( 5 18 , 1 6 ) } ,
or,
α 5 ( { ( 1 , 1 ) } ) = { ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( 1 ) ) , ( A ( 2 ) , A ( 2 ) ) } { ( A ( 1 ) , A ( 11 ) ) , ( A ( 1 ) , A ( 12 ) ) } = { ( 1 6 , 5 6 ) , ( 5 6 , 1 6 ) , ( 5 6 , 5 6 ) } { ( 1 6 , 1 18 ) , ( 1 6 , 5 18 ) } .
Similarly, one can get six more sets by taking I = { ( 1 , 2 ) } , { ( 2 , 1 ) } , or { ( 2 , 2 ) } , i.e., the number of the sets α n ( I ) in this case is 2 card ( I ) · 4 ( n ) C n 4 ( n ) = 8 .
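The count in Remark 7 is easy to tabulate (an illustrative sketch; the function name is ours, and the input is assumed to lie in the regime of Definition 1):

```python
from math import comb

def num_sets_def1(n):
    """Number of the sets alpha_n(I) of Definition 1, namely
    2^(card I) * C(4^l, n - 4^l) with card(I) = n - 4^l,
    for the unique l with 4^l < n <= 2 * 4^l."""
    l = 1
    while 2 * 4 ** l < n:
        l += 1
    assert 4 ** l < n <= 2 * 4 ** l, "n is outside the regime of Definition 1"
    k = n - 4 ** l              # k = card(I)
    return 2 ** k * comb(4 ** l, k)

assert num_sets_def1(5) == 8                     # as computed in Example 2
assert num_sets_def1(6) == 2 ** 2 * comb(4, 2)   # = 24
```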
Proposition 6.
Let n 4 and α n ( I ) be the set as defined in Definition 1. Then, α n ( I ) forms an optimal set of n-means with quantization error
V n = 1 4 1 36 ( n ) 2 · 4 ( n ) n + 5 9 ( n 4 ( n ) ) .
Proof. 
We have n = 4 ( n ) + k where 1 k 4 ( n ) . Set β i j = α A i j with n i j = card ( β i j ) for 1 i , j 2 . Let us prove it by induction. We first assume k = 1 . By Lemmas 11 and 13, we can assume that each of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) is an optimal set of 4 ( n ) 1 -means and U ( 1 , 1 ) 1 ( β 11 ) is an optimal set of ( 4 ( n ) 1 + 1 ) -means. Thus, for ( i , j ) ≠ ( 1 , 1 ) , we can write
U ( i , j ) 1 ( β i j ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 } , and U ( 1 , 1 ) 1 ( β 11 ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 { τ } } U τ ( α 2 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 2 is an optimal set of two-means. Thus,
α n ( { ( 1 , 1 ) τ } ) = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( 1 , 1 ) τ } } U ( 1 , 1 ) τ ( α 2 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 2 is an optimal set of two-means. Notice that instead of choosing U ( 1 , 1 ) 1 ( β 11 ) as an optimal set of ( 4 ( n ) 1 + 1 ) -means, one can choose any one of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) as an optimal set of ( 4 ( n ) 1 + 1 ) -means. Hence, for n = 4 ( n ) + 1 , one can write
α n ( I ) = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { τ } } U τ ( α 2 ) ,
where I = { τ } for some τ { 1 , 2 } ( n ) * 2 , as an optimal set of n-means. Thus, we see that the proposition is true if n = 4 ( n ) + 1 . Similarly, one can prove that the proposition is true for any 1 k 4 ( n ) . Then, the quantization error is
V n = min ( a , b ) α n ( I ) x ( a , b ) 2 d P = ( σ , τ ) { 1 , 2 } ( n ) * 2 I A σ × A τ x ( A ( σ ) , A ( τ ) ) 2 d ( P c × P c ) + ( σ , τ ) I i = 1 2 A σ i × A τ x ( A ( σ i ) , A ( τ ) ) 2 d ( P c × P c ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I i = 1 2 1 4 ( n ) 1 2 ( u σ i 2 + u τ 2 ) 1 8 = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) ( 1 9 u σ 2 + u τ 2 ) 1 8 .
Because card ( { 1 , 2 } ( n ) * 2 I ) = 2 · 4 ( n ) n , card ( I ) = n 4 ( n ) , u σ = u τ = 1 3 ( n ) , upon simplification, we have V n = 1 4 1 36 ( n ) 2 · 4 ( n ) n + 5 9 ( n 4 ( n ) ) . Thus, the proof of the proposition is complete. □
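The closed formula of Proposition 6 can be cross-checked against the cell-by-cell sum from the proof (an illustrative sketch with exact rationals; the function names are ours):

```python
from fractions import Fraction as Fr

def V_closed(n, l):
    """The closed formula of Proposition 6, for 4^l <= n <= 2 * 4^l."""
    return Fr(1, 4) * Fr(1, 36) ** l * (2 * 4 ** l - n + Fr(5, 9) * (n - 4 ** l))

def V_cellsum(n, l):
    """The last sum in the proof: 2*4^l - n unsplit cells each contribute
    (1/4^l)(u^2 + u^2)/8 and n - 4^l split cells each contribute
    (1/4^l)(u^2/9 + u^2)/8, where u^2 = 1/9^l."""
    u2 = Fr(1, 9) ** l
    unsplit = (2 * 4 ** l - n) * Fr(1, 4) ** l * (u2 + u2) / 8
    split = (n - 4 ** l) * Fr(1, 4) ** l * (u2 / 9 + u2) / 8
    return unsplit + split

assert all(V_closed(n, l) == V_cellsum(n, l)
           for l in (1, 2, 3) for n in range(4 ** l, 2 * 4 ** l + 1))
assert V_closed(5, 1) == Fr(2, 81)   # the value of V_5 used in Lemma 12
```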
Definition 2.
For n N with n 4 let ( n ) be the unique natural number with 2 · 4 ( n ) < n < 4 ( n ) + 1 . For I { 1 , 2 } ( n ) * 2 with card ( I ) = n 2 · 4 ( n ) let α n ( I ) be the set defined as follows:
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) .
Remark 8.
In Definition 2, instead of choosing the set { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose { ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } . Instead of choosing the set
{ ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose either the set
{ ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ 1 ) ) , ( A ( σ 2 ) , A ( τ 2 ) ) } , or
{ ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 2 ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } , or
{ ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ 2 ) ) } , i.e., the set corresponding to each ( σ , τ ) { 1 , 2 } ( n ) * 2 I can be chosen in two different ways, and the set corresponding to each ( σ , τ ) I can be chosen in four different ways. Because card ( { 1 , 2 } ( n ) * 2 I ) = 4 ( n ) ( n 2 · 4 ( n ) ) = 3 · 4 ( n ) n and the subset I can be chosen from { 1 , 2 } ( n ) * 2 in 4 ( n ) C n 2 · 4 ( n ) ways, the number of the sets α n ( I ) is 2 3 · 4 ( n ) n · 4 card ( I ) · 4 ( n ) C n 2 · 4 ( n ) .
We now give an example illustrating Definition 2.
Example 3.
Let n = 9 . Then, ( n ) = 1 , I { 1 , 2 } * 2 with card ( I ) = 1 . Take I = { ( 1 , 1 ) } . Then,
α 9 ( { ( 1 , 1 ) } ) = { ( A ( 11 ) , A ( 2 ) ) , ( A ( 12 ) , A ( 2 ) ) , ( A ( 21 ) , A ( 2 ) ) , ( A ( 22 ) , A ( 2 ) ) , ( A ( 21 ) , A ( 1 ) ) , ( A ( 22 ) , A ( 1 ) ) } { ( A ( 11 ) , A ( 1 ) ) , ( A ( 12 ) , A ( 11 ) ) , ( A ( 12 ) , A ( 12 ) ) } = { ( 1 18 , 5 6 ) , ( 5 18 , 5 6 ) , ( 13 18 , 5 6 ) , ( 17 18 , 5 6 ) , ( 13 18 , 1 6 ) , ( 17 18 , 1 6 ) } { ( 1 18 , 1 6 ) , ( 5 18 , 1 18 ) , ( 5 18 , 5 18 ) } .
Note that each of α 9 ( { ( 1 , 1 ) } ) , α 9 ( { ( 1 , 2 ) } ) , α 9 ( { ( 2 , 1 ) } ) , α 9 ( { ( 2 , 2 ) } ) can be chosen in 32 ways, i.e., the number of the sets α 9 ( I ) in this case is 4 · 32 = 128 . Moreover, by using the formula in Remark 8, we have
2 3 · 4 ( n ) n · 4 card ( I ) · 4 ( n ) C n 2 · 4 ( n ) = 128 .
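The count in Remark 8 can likewise be scripted (an illustrative sketch; the function name is ours, and the input is assumed to lie in the regime of Definition 2):

```python
from math import comb

def num_sets_def2(n):
    """Number of the sets alpha_n(I) of Definition 2, namely
    2^(3*4^l - n) * 4^(card I) * C(4^l, n - 2*4^l) with card(I) = n - 2*4^l."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    assert 2 * 4 ** l < n < 4 ** (l + 1), "n is outside the regime of Definition 2"
    k = n - 2 * 4 ** l          # k = card(I)
    return 2 ** (3 * 4 ** l - n) * 4 ** k * comb(4 ** l, k)

assert num_sets_def2(9) == 128    # as computed in Example 3
```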
Proposition 7.
Let n 4 and α n ( I ) be the set as defined in Definition 2. Then, α n ( I ) forms an optimal set of n-means with quantization error
V n = 1 36 ( n ) + 1 ( 9 · 4 ( n ) 2 n ) .
Proof. 
We have n = 2 · 4 ( n ) + k where 1 k < 2 · 4 ( n ) . Set β i j = α A i j with n i j = card ( β i j ) for 1 i , j 2 . Let us prove it by induction. We first assume k = 1 . By Lemmas 11 and 13, we can assume that each of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) is an optimal set of 2 · 4 ( n ) 1 -means and U ( 1 , 1 ) 1 ( β 11 ) is an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means. Thus, for ( i , j ) ≠ ( 1 , 1 ) , we can write
U ( i , j ) 1 ( β i j ) = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 } , and U ( 1 , 1 ) 1 ( β 11 ) = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 { τ } } U τ ( α 3 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 3 is an optimal set of three-means. Thus
α n ( { ( 1 , 1 ) τ } ) = i , j = 1 2 β i j = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( 1 , 1 ) τ } } U ( 1 , 1 ) τ ( α 3 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 3 is an optimal set of three-means. Notice that instead of choosing U ( 1 , 1 ) 1 ( β 11 ) as an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means, one can choose any one of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) as an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means. Hence, for n = 2 · 4 ( n ) + 1 , one can write
α n ( I ) = i , j = 1 2 β i j = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { τ } } U τ ( α 3 ) ,
where I = { τ } for some τ { 1 , 2 } ( n ) * 2 as an optimal set of n-means. Thus, we see that the proposition is true if n = 2 · 4 ( n ) + 1 . Similarly, one can prove that the proposition is true for any 1 k < 2 · 4 ( n ) . Thus, writing α 2 = { ( A ( 1 ) , A ( ) ) , ( A ( 2 ) , A ( ) ) } , and α 3 = { ( A ( 1 ) , A ( 1 ) ) , ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( ) ) } , we have, in general,
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) ,
where I { 1 , 2 } ( n ) * 2 with card ( I ) = k for some 1 k < 2 · 4 ( n ) . Then, we obtain the quantization error as
V n = min ( a , b ) α n ( I ) x ( a , b ) 2 d P = ( σ , τ ) { 1 , 2 } ( n ) * 2 I i = 1 2 A σ i × A τ x ( A ( σ i ) , A ( τ ) ) 2 d ( P c × P c ) + ( σ , τ ) I ( j = 1 2 A σ 1 × A τ j x ( A ( σ 1 ) , A ( τ j ) ) 2 d ( P c × P c ) + A σ 2 × A τ x ( A ( σ 2 ) , A ( τ ) ) 2 d ( P c × P c ) ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I i = 1 2 1 4 ( n ) 1 2 ( u σ i 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) j = 1 2 1 4 ( u σ 1 2 + u τ j 2 ) 1 8 + 1 2 ( u σ 2 2 + u τ 2 ) 1 8 = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( 1 9 u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) ( u σ 2 + 5 u τ 2 ) 1 72 .
Because $\mathrm{card}(\{1,2\}^{\ell(n)*2}\setminus I) = 3\cdot 4^{\ell(n)} - n$, $\mathrm{card}(I) = n - 2\cdot 4^{\ell(n)}$, and $u_{\sigma} = u_{\tau} = \frac{1}{3^{\ell(n)}}$, upon simplification we have $V_n = \frac{1}{36^{\ell(n)+1}}\big(9\cdot 4^{\ell(n)} - 2n\big)$. Thus, the proof of the proposition is complete. □
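The simplification in the last step can be double-checked mechanically: with $u_\sigma = u_\tau = 3^{-\ell}$, each pair outside $I$ contributes $\frac{5}{36^{\ell+1}}$ and each pair in $I$ contributes $\frac{3}{36^{\ell+1}}$ to $V_n$. The following short Python check (our own verification sketch, not part of the original proof) confirms that the weighted count collapses to the stated closed form:

```python
# Algebraic check: the two sums contribute 5/36**(l+1) per pair outside I and
# 3/36**(l+1) per pair in I, so V_n * 36**(l+1) = 5*(3*4**l - n) + 3*(n - 2*4**l),
# which should simplify to the closed-form numerator 9*4**l - 2*n.
for l in range(1, 6):
    for n in range(2 * 4 ** l + 1, 4 ** (l + 1)):
        assert 5 * (3 * 4 ** l - n) + 3 * (n - 2 * 4 ** l) == 9 * 4 ** l - 2 * n
```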

6. Quantization Dimension and Quantization Coefficient for P

The techniques employed in the previous sections also yield closed formulas for the quantization error at each step. These closed formulas lend themselves to a direct calculation of the quantization dimension and the quantization coefficient of the underlying probability distribution. Hence, in this section we calculate the quantization dimension $D(P)$ of the probability distribution $P$, together with the accumulation points of the $D(P)$-dimensional quantization coefficients. By Propositions 5–7, the $n$th quantization error $V_n$ is given by
$$V_n = \begin{cases} \dfrac{1}{4}\cdot\dfrac{1}{36^{\ell(n)}}\Big(2\cdot 4^{\ell(n)} - n + \dfrac{5}{9}\big(n - 4^{\ell(n)}\big)\Big) & \text{if } 4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)},\\[2ex] \dfrac{1}{36^{\ell(n)+1}}\big(9\cdot 4^{\ell(n)} - 2n\big) & \text{if } 2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}. \end{cases} \tag{8}$$
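For readers who wish to experiment numerically, the closed formula above can be implemented directly. The following Python sketch (function names are ours) evaluates $\ell(n)$, the unique integer with $4^{\ell(n)} \le n < 4^{\ell(n)+1}$, and then $V_n$:

```python
def ell(n):
    """Return the unique l >= 1 with 4**l <= n < 4**(l + 1) (assumes n >= 4)."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    return l

def V(n):
    """The nth quantization error V_n from the closed formula above (n >= 4)."""
    l = ell(n)
    if n <= 2 * 4 ** l:
        return (2 * 4 ** l - n + (5 / 9) * (n - 4 ** l)) / (4 * 36 ** l)
    return (9 * 4 ** l - 2 * n) / 36 ** (l + 1)
```

For instance, `V(4)` evaluates to $\frac{1}{36}$ and `V(8)` to $\frac{5}{324}$, matching the two bounding values $\frac{1}{4}\cdot\frac{1}{9^{\ell}}$ and $\frac{5}{36}\cdot\frac{1}{9^{\ell}}$ at $\ell = 1$.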
Proposition 8.
The quantization dimension $D(P)$ of the probability distribution $P$ exists and equals $\frac{\log 4}{\log 3}$.
Proof. 
By (8), for $4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)}$, it follows that $V_{2\cdot 4^{\ell(n)}} \le V_n \le V_{4^{\ell(n)}}$, i.e.,
$$\frac{5}{36}\cdot\frac{1}{9^{\ell(n)}} \le V_n \le \frac{1}{4}\cdot\frac{1}{9^{\ell(n)}},$$
and so
$$\frac{2\,\ell(n)\log 4}{-\log\frac{5}{36} + \ell(n)\log 9} \le \frac{2\log n}{-\log V_n} \le \frac{2\log 2 + 2\,\ell(n)\log 4}{-\log\frac{1}{4} + \ell(n)\log 9}.$$
Thus, we deduce that
$$\lim_{n\to\infty} \frac{2\log n}{-\log V_n} = \frac{\log 4}{\log 3}.$$
Similarly, for $2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}$, we obtain the same limit. Hence,
$$D(P) = \lim_{n\to\infty} \frac{2\log n}{-\log V_n} = \frac{\log 4}{\log 3}.$$
Thus, the proof of the proposition is complete. □
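The squeeze used in this proof can also be observed numerically. The sketch below (ours, not part of the paper) evaluates the two bounding expressions from the displayed inequality and shows both approaching $\log 4/\log 3 \approx 1.26186$ as $\ell(n)$ grows:

```python
import math

target = math.log(4) / math.log(3)  # the claimed quantization dimension

for l in (5, 20, 80):
    # Bounds on 2*log(n) / (-log(V_n)) valid for 4**l <= n <= 2*4**l.
    lower = 2 * l * math.log(4) / (math.log(36 / 5) + l * math.log(9))
    upper = (2 * math.log(2) + 2 * l * math.log(4)) / (math.log(4) + l * math.log(9))
    print(f"l = {l}: {lower:.5f} <= 2 log n / (-log V_n) <= {upper:.5f}")
```

At $\ell = 80$ both bounds already agree with $\log 4/\log 3$ to about two decimal places.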
Proposition 9.
Let $\beta := D(P)$ be the quantization dimension of P. Then, the β-dimensional quantization coefficient for P does not exist, and the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$.
Proof. 
Recall the sequence of quantization errors $\{V_n\}_{n\ge 4}$ given by (8). Again, notice that $4^{1/\beta} = 3$. Along the subsequence $\{4^{\ell}\}_{\ell\in\mathbb{N}}$, we have $\lim_{\ell\to\infty} (4^{\ell})^{2/\beta} V_{4^{\ell}} = \frac{1}{4}$. Similarly, along the subsequence $\{2\cdot 4^{\ell}\}_{\ell\in\mathbb{N}}$, we have $\lim_{\ell\to\infty} (2\cdot 4^{\ell})^{2/\beta} V_{2\cdot 4^{\ell}} = \frac{5}{12}$. Consequently, $\lim_{n\to\infty} n^{2/\beta} V_n$ does not exist. Now, we determine the range in which the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie. The following two cases can arise:
Case 1. $4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)}$.
In this case, we have $V_{2\cdot 4^{\ell(n)}} \le V_n \le V_{4^{\ell(n)}}$, implying $(4^{\ell(n)})^{2/\beta} V_{2\cdot 4^{\ell(n)}} \le n^{2/\beta} V_n \le (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)}}$. Because
$$\lim_{n\to\infty} (4^{\ell(n)})^{2/\beta} V_{2\cdot 4^{\ell(n)}} = \frac{5}{36} \quad\text{and}\quad \lim_{n\to\infty} (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)}} = \frac{3}{4},$$
it follows that the accumulation points of $\{n^{2/\beta} V_n\}$ along such $n$ lie in the closed interval $\big[\frac{5}{36}, \frac{3}{4}\big]$.
Case 2. $2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}$.
In this case, we have $V_{4^{\ell(n)+1}} < V_n < V_{2\cdot 4^{\ell(n)}}$, implying
$$(2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)+1}} < n^{2/\beta} V_n < (4^{\ell(n)+1})^{2/\beta} V_{2\cdot 4^{\ell(n)}}.$$
Because
$$\lim_{n\to\infty} (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)+1}} = \frac{1}{12} \quad\text{and}\quad \lim_{n\to\infty} (4^{\ell(n)+1})^{2/\beta} V_{2\cdot 4^{\ell(n)}} = \frac{5}{4},$$
it follows that the accumulation points of $\{n^{2/\beta} V_n\}$ along such $n$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$.
Combining Case 1 and Case 2, we see that

$$\frac{1}{12} \le \liminf_{n\to\infty} n^{2/\beta} V_n \le \limsup_{n\to\infty} n^{2/\beta} V_n \le \frac{5}{4},$$

which yields the fact that the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$. Thus, the proof of the proposition is complete. □
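The non-existence of the limit can also be seen numerically. The sketch below (our own check; the helper `V` implements the closed-form quantization error of Propositions 5–7) evaluates $n^{2/\beta}V_n$ along the two subsequences used in the proof and recovers the distinct limits $\frac{1}{4}$ and $\frac{5}{12}$:

```python
import math

beta = math.log(4) / math.log(3)  # quantization dimension D(P)

def V(n):
    """Closed-form nth quantization error (Propositions 5-7), n >= 4."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    if n <= 2 * 4 ** l:
        return (2 * 4 ** l - n + (5 / 9) * (n - 4 ** l)) / (4 * 36 ** l)
    return (9 * 4 ** l - 2 * n) / 36 ** (l + 1)

coeff = lambda n: n ** (2 / beta) * V(n)
a = coeff(4 ** 12)       # along n = 4**l the coefficient tends to 1/4
b = coeff(2 * 4 ** 12)   # along n = 2*4**l it tends to 5/12
```

Since the two subsequential limits differ, $\lim_{n\to\infty} n^{2/\beta}V_n$ cannot exist, exactly as the proposition asserts.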

7. Discussion and Concluding Remarks

Motivation. As mentioned in the Introduction, the main motivation for this article is the completion of the programme initiated in [14]. At the same time, we extend the results of [12] to the setting of infinite affine transformations. Analogously to [10], this completes the programme of providing a complete quantization theory for affine measures on $\mathbb{R}^2$.
Observations and Remarks. Quantization of continuous random signals (or random variables and processes) is an important part of the digital representation of analog signals for various coding techniques (e.g., source coding, data compression, archiving, restoration). The oldest example of quantization in statistics is rounding off. Sheppard (see [19]) was the first to analyze rounding off for estimating densities by histograms. Any real number x can be rounded off (or quantized) to the nearest integer, say q(x) = [x], with a resulting quantization error e(x) = x − q(x). Hence, the restored signal may differ from the original one, and some information may be lost. Thus, in quantizing a continuous set of values there is always a distortion (also known as noise or error) between the original set of values and the quantized one. The main goal in quantization theory is to find a set of quantizers with minimum distortion, a problem that has been extensively investigated by numerous authors [2,20,21,22,23,24]. A different approach for uniform scalar quantization is developed in [25], where the correlation properties of a Gaussian process are exploited to evaluate the asymptotic behavior of the random quantization rate for uniform quantizers. General quantization problems for Gaussian processes in infinite-dimensional functional spaces are considered in [26]. In estimating weighted integrals of time series with no quadratic mean derivatives by means of samples at discrete times, it is known that the rate of convergence of the mean-square error is reduced from $n^{-2}$ to $n^{-1.5}$ when the samples are quantized (see [27]). For smoother time series, with $k = 1, 2, \dots$ quadratic mean derivatives, the rate of convergence is reduced from $n^{-2k-2}$ to $n^{-2}$ when the samples are quantized, which is a very significant reduction (see [28]).
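To make the rounding-off example concrete: for a source that is uniform over a wide interval, the error $e(x) = x - q(x)$ is essentially uniform on $[-\frac12, \frac12]$, so its mean-squared value is close to $\frac{1}{12}$. A quick simulation (ours, not drawn from the cited references) illustrates this:

```python
import random

random.seed(0)
# Quantize uniform samples by rounding to the nearest integer and measure the
# mean-squared quantization error; e(x) = x - round(x) is (up to ties of
# probability zero) uniform on [-1/2, 1/2], whose second moment is 1/12.
samples = [random.uniform(-10.0, 10.0) for _ in range(200_000)]
mse = sum((x - round(x)) ** 2 for x in samples) / len(samples)
print(f"empirical MSE = {mse:.5f}  (1/12 = {1 / 12:.5f})")
```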
The interplay between sampling and quantization is also studied in [28], leading asymptotically to an optimal allocation between the number of samples and the number of quantization levels. Quantization also appears to be a promising tool in recent developments in numerical probability (see, e.g., [29]).
By Proposition 1, the points in an optimal set are the centroids of their own Voronoi regions. Consequently, the points in an optimal set form an evenly spread distribution of sites in the domain with minimum distortion error with respect to a given probability measure, which is very useful in many fields, such as clustering, data compression, optimal mesh generation, cellular biology, optimal quadrature, coverage control, and geographical optimization; for more details, see [7,30]. In addition, it has applications in the energy-efficient distribution of base stations in a cellular network [31,32,33]. In both geographical and cellular applications, the distribution of users is highly complex and often modeled by a fractal [34,35].
Future Directions. $k$-means clustering is a method of vector quantization, originally from signal processing, that aims to partition $n$ observations (the underlying data set) into $k$ clusters in which each observation belongs to the cluster with the nearest mean, also known as the cluster center or cluster centroid. For a given $k$ and a given probability distribution on a data set, there can be two or more different sets of $k$-means clusters: for example, with respect to the uniform distribution, the square $\{(x_1, x_2) : |x_1| \le 1, |x_2| \le 1\}$ has four different sets of two-means clusters, with cluster centers $\{(\frac12, \frac12), (-\frac12, -\frac12)\}$, $\{(-\frac12, \frac12), (\frac12, -\frac12)\}$, $\{(\frac12, 0), (-\frac12, 0)\}$, and $\{(0, \frac12), (0, -\frac12)\}$. Among these, only $\{(\frac12, 0), (-\frac12, 0)\}$ and $\{(0, \frac12), (0, -\frac12)\}$ form two different optimal sets of two-means. In other words, for a given $k$, among the multiple sets of $k$-means clusters, the centers of a set with the smallest distortion error form an optimal set of $k$-means. Thus, it is much more difficult to calculate an optimal set of $k$-means than to calculate a set of $k$-means clusters. A good deal of work has been done on $k$-means clustering; on the other hand, there is not much work on finding optimal sets of $k$-means, and the present paper is a contribution in this direction.
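The two-means example on the square can be checked by simulation. The sketch below (our illustration; the function name and sample counts are ours) estimates the distortion of an axis-aligned pair of centers and of a diagonal pair by Monte Carlo; the axis-aligned pair attains the smaller distortion, about $\frac{5}{12} = \int_0^1 (x - \frac12)^2\,dx + \frac12\int_{-1}^{1} y^2\,dy$:

```python
import random

random.seed(1)

def distortion(centers, trials=200_000):
    """Monte Carlo distortion error of a finite set of centers with respect
    to the uniform distribution on the square [-1, 1] x [-1, 1]."""
    total = 0.0
    for _ in range(trials):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        total += min((x - a) ** 2 + (y - b) ** 2 for a, b in centers)
    return total / trials

axis_pair = [(0.5, 0.0), (-0.5, 0.0)]    # an optimal set of two-means
diag_pair = [(0.5, 0.5), (-0.5, -0.5)]   # a two-means cluster set, not optimal
d_axis, d_diag = distortion(axis_pair), distortion(diag_pair)
```

The estimate for the axis-aligned pair is close to $5/12 \approx 0.4167$ and is strictly smaller than that of the diagonal pair, consistent with the statement above that only the axis-aligned pairs are optimal.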
The probability measure $P$ considered in this study has identical marginal distributions, which is instrumental in determining the optimal sets of 2-, 3-, and 4-means accurately. Moreover, it enables us to bridge infinitely generated affine measures with finitely generated ones and, consequently, to connect optimal sets of $n$-means for $P$ and $P_c \times P_c$. It would be interesting to investigate whether similar results can be obtained when $P$ is induced by infinite probability vectors $\{p_{ij}\}$ different from the one considered in this article.

Author Contributions

The work in this paper is completely new. Both authors contributed equally to writing the draft of this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bucklew, J.A.; Wise, G.L. Multidimensional asymptotic quantization with rth power distortion measures. IEEE Trans. Inf. Theory 1982, 28, 239–247. [Google Scholar] [CrossRef]
  2. Gray, R.; Neuhoff, D. Quantization. IEEE Trans. Inf. Theory 1998, 44, 2325–2383. [Google Scholar] [CrossRef]
  3. Abaya, E.F.; Wise, G.L. Some remarks on the existence of optimal quantizers. Stat. Probab. Lett. 1984, 2, 349–351. [Google Scholar] [CrossRef]
  4. Gray, R.M.; Kieffer, J.C.; Linde, Y. Locally optimal block quantizer design. Inf. Control. 1980, 45, 178–198. [Google Scholar] [CrossRef] [Green Version]
  5. György, A.; Linder, T. On the structure of optimal entropy-constrained scalar quantizers. IEEE Trans. Inf. Theory 2002, 48, 416–427. [Google Scholar] [CrossRef]
  6. Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1730. [Google Scholar]
  7. Du, Q.; Faber, V.; Gunzburger, M. Centroidal Voronoi Tessellations: Applications and Algorithms. Siam Rev. 1999, 41, 637–676. [Google Scholar] [CrossRef] [Green Version]
  8. Roychowdhury, M.K. Quantization and centroidal Voronoi tessellations for probability measures on dyadic Cantor sets. J. Fractal Geom. 2017, 4, 127–146. [Google Scholar] [CrossRef] [Green Version]
  9. Gersho, A.; Gray, R.M. Vector Quantization and Signal Compression; Kluwer Academy Publishers: Boston, MA, USA, 1992. [Google Scholar]
  10. Graf, S.; Luschgy, H. The Quantization of the Cantor Distribution. Math. Nachr. 1997, 183, 113–133. [Google Scholar] [CrossRef]
  11. Roychowdhury, L. Optimal quantization for nonuniform Cantor distributions. J. Interdiscip. Math. 2019, 22, 1325–1348. [Google Scholar] [CrossRef]
  12. Çömez, D.; Roychowdhury, M.K. Quantization for uniform distributions of Cantor dusts on R2. Topol. Proc. 2020, 56, 195–218. [Google Scholar]
  13. Roychowdhury, M.K. Optimal quantization for the Cantor distribution generated by infinite similitudes. Isr. J. Math. 2019, 231, 437–466. [Google Scholar] [CrossRef] [Green Version]
  14. Mihailescu, E.; Roychowdhury, M.K. Quantization coefficients in infinite systems. Kyoto J. Math. 2015, 55, 857–873. [Google Scholar] [CrossRef] [Green Version]
  15. Graf, S.; Luschgy, H. The quantization dimension of self-similar probabilities. Math. Nachr. 2002, 241, 103–109. [Google Scholar] [CrossRef]
  16. Hutchinson, J. Fractals and self-similarity. Indiana Univ. Math. J. 1981, 30, 713–747. [Google Scholar] [CrossRef]
  17. Moran, M. Hausdorff measure of infinitely generated self-similar sets. Monatsh. Math. 1996, 122, 387–399. [Google Scholar] [CrossRef]
  18. Mauldin, D.; Urbański, M. Dimensions and measures in infinite iterated function systems. Proc. Lond. Math. Soc. 1996, 73, 105–154. [Google Scholar] [CrossRef]
  19. Sheppard, W.F. On the calculation of the most probable values of frequency constants for data arranged according to equidistant divisions of a scale. Proc. Lond. Math. Soc. 1897, 1, 353–380. [Google Scholar] [CrossRef]
  20. Cambanis, S.; Gerr, N. A simple class of asymptotically optimal quantizers. IEEE Trans. Inf. Theory 1983, 29, 664–676. [Google Scholar] [CrossRef]
  21. Gray, R.M.; Linder, T. Mismatch in high rate entropy constrained vector quantization. IEEE Trans. Inf. Theory 2003, 49, 1204–1217. [Google Scholar] [CrossRef]
  22. Li, J.; Chaddha, N.; Gray, R.M. Asymptotic performance of vector quantizers with a perceptual distortion measure. IEEE Trans. Inf. Theory 1999, 45, 1082–1091. [Google Scholar]
  23. Shykula, M.; Seleznjev, O. Stochastic structure of asymptotic quantization errors. Stat. Probab. Lett. 2006, 76, 453–464. [Google Scholar] [CrossRef]
  24. Zador, P.L. Asymptotic quantization error of continuous signals and the quantization dimensions. IEEE Trans. Inf. Theory 1982, 28, 139–148. [Google Scholar] [CrossRef]
  25. Shykula, M.; Seleznjev, O. Uniform Quantization of Random Processes; Univ. Umeå Research Report; Umeå University: Umeå, Sweden, 2004; pp. 1–16. [Google Scholar]
  26. Luschgy, H.; Pagès, G. Functional quantization of Gaussian processes. J. Funct. Anal. 2002, 196, 486–531. [Google Scholar] [CrossRef] [Green Version]
  27. Bucklew, J.A.; Cambanis, S. Estimating random integrals from noisy observations: Sampling designs and their performance. IEEE Trans. Inf. Theory 1988, 34, 111–127. [Google Scholar] [CrossRef]
  28. Benhenni, K.; Cambanis, S. The effect of quantization on the performance of sampling designs. IEEE Trans. Inf. Theory 1998, 44, 1981–1992. [Google Scholar] [CrossRef]
  29. Pagès, G.; Pham, H.; Printemps, J. Optimal quantization methods and applications to numerical problems in finance. In Handbook of Computational and Numerical Methods in Finance; Rachev, S., Ed.; Birkhäuser Boston: Boston, MA, USA, 2004; pp. 253–297. [Google Scholar]
  30. Okabe, A.; Boots, B.; Sugihara, K.; Chiu, S.N. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, 2nd ed.; Wiley: Hoboken, NJ, USA, 2000. [Google Scholar]
  31. Hao, Y.; Chen, M.; Hu, L.; Song, J.; Volk, M.; Humar, I. Wireless Fractal Ultra-Dense Cellular Networks. Sensors 2017, 17, 841. [Google Scholar] [CrossRef]
  32. Kaza, K.R.; Kshirsagar, K.; Rajan, K.S. A bi-objective algorithm for dynamic reconfiguration of mobile networks. In Proceedings of the IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 5741–5745. [Google Scholar]
  33. Song, Y. Cost-Effective Algorithms for Deployment and Sensing in Mobile Sensor Networks. Ph.D. Thesis, University of Connecticut, Storrs, CT, USA, 2014. [Google Scholar]
  34. Abundo, C.; Bodnar, T.; Driscoll, J.; Hatton, I.; Wright, J. City population dynamics and fractal transport networks. In Proceedings of the Santa Fe Institute’s CSSS2013; Santa Fe Institute: Santa Fe, NM, USA, 2013. [Google Scholar]
  35. Lu, Z.; Zhang, H.; Southworth, F.; Crittenden, J. Fractal dimensions of metropolitan area road networks and the impacts on the urban built environment. Ecol. Indic. 2016, 70, 285–296. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Basic rectangles of the infinite affine transformations.
Figure 2. Optimal sets of two-means.
Figure 3. Optimal sets of three-means.
Figure 4. Optimal sets of n-means for 4 ≤ n ≤ 7. The optimal set of 4-means is unique; on the other hand, optimal sets of n-means for n = 5, 6, 7 are not unique.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

