Article

Quantization for Infinite Affine Transformations

by Doğan Çömez 1 and Mrinal Kanti Roychowdhury 2,*
1 Department of Mathematics, 408E24 Minard Hall, North Dakota State University, Fargo, ND 58108-6050, USA
2 School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, 1201 West University Drive, Edinburg, TX 78539-2999, USA
* Author to whom correspondence should be addressed.
Fractal Fract. 2022, 6(5), 239; https://doi.org/10.3390/fractalfract6050239
Submission received: 4 April 2022 / Revised: 22 April 2022 / Accepted: 24 April 2022 / Published: 25 April 2022

Abstract: Quantization for a probability distribution refers to the idea of estimating a given probability measure by a discrete probability measure supported on a finite set. In this article, we consider a probability distribution generated by an infinite system of affine transformations $\{S_{ij}\}$ on $\mathbb{R}^2$ with associated probabilities $\{p_{ij}\}$ such that $p_{ij}>0$ for all $i,j\in\mathbb{N}$ and $\sum_{i,j=1}^{\infty}p_{ij}=1$. For such a probability measure P, the optimal sets of n-means and the nth quantization error are calculated for every natural number n. It is shown that the distribution of such a probability measure is the same as that of the direct product of the Cantor distribution with itself. In addition, it is proved that the quantization dimension $D(P)$ exists and is finite, whereas the $D(P)$-dimensional quantization coefficient does not exist, and the $D(P)$-dimensional lower and upper quantization coefficients lie in the closed interval $[\frac{1}{12},\frac{5}{4}]$.

1. Introduction

The quantization problem for probability measures is concerned with approximating a given measure by discrete measures of finite support in $L_r$-metrics. This problem has roots in information theory and engineering technology, in particular in signal processing and pattern recognition [1,2]. For a Borel probability measure P on $\mathbb{R}^d$, a quantizer is a function q mapping d-dimensional vectors in the domain $\Omega\subseteq\mathbb{R}^d$ into a finite set of vectors $\alpha\subset\mathbb{R}^d$. In this case, the error $\int\min_{a\in\alpha}\|x-a\|^2\,dP(x)$, where $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$, is often referred to as the variance, cost, or distortion error for α with respect to the measure P, and is denoted by $V(\alpha):=V(P;\alpha)$. The value $\inf\{V(P;\alpha):\alpha\subset\mathbb{R}^d,\ \mathrm{card}(\alpha)\le n\}$ is called the nth quantization error for P, and is denoted by $V_n:=V_n(P)$. A set α on which this infimum is attained and which contains no more than n points is called an optimal set of n-means; the elements of an optimal set are called optimal quantizers. It is known that for a Borel probability measure P, if its support contains infinitely many elements and $\int\|x\|^2\,dP(x)$ is finite, then an optimal set of n-means always has exactly n elements [3,4,5,6]. The number $\lim_{n\to\infty}\frac{2\log n}{-\log V_n(P)}$, if it exists, is called the quantization dimension of the measure P, and is denoted by $D(P)$; likewise, for any $s\in(0,+\infty)$, the number $\lim_{n\to\infty}n^{2/s}V_n(P)$, if it exists, is called the s-dimensional quantization coefficient for P.
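As a quick numerical illustration of these definitions (not part of the paper's argument), the following sketch estimates the distortion error $V(P;\alpha)$ of a candidate set α by Monte Carlo. The uniform distribution on $[0,1]$ is used as a stand-in measure, and all names are ours.

```python
import random

def distortion(samples, alpha):
    """Monte Carlo estimate of V(P; alpha) = E[min_{a in alpha} |X - a|^2]."""
    return sum(min((x - a) ** 2 for a in alpha) for x in samples) / len(samples)

random.seed(1)
samples = [random.random() for _ in range(100_000)]  # X ~ Uniform[0,1]

# For Uniform[0,1], the optimal set of two-means is {1/4, 3/4} and
# V_2 = 1/48 ~ 0.0208 (for the uniform law, V_n = 1/(12 n^2)).
v2 = distortion(samples, [0.25, 0.75])
```

The estimate `v2` should land near $1/48$, and any one-point quantizer gives a strictly larger distortion, illustrating that $V_n$ decreases in n.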
For a finite set $\alpha\subset\mathbb{R}^d$, the Voronoi region generated by $a\in\alpha$, denoted by $M(a|\alpha)$, is the set of all points in $\mathbb{R}^d$ which are closer to a than to all other elements of α. For a probability distribution P on $\mathbb{R}^d$, the centroids of the regions $M(a|\alpha)$ are given by $a^*=\frac{1}{P(M(a|\alpha))}\int_{M(a|\alpha)}x\,dP$. A Voronoi tessellation is called a centroidal Voronoi tessellation (CVT) if $a^*=a$, i.e., if the generators are also the centroids of their own Voronoi regions. For a Borel probability measure P on $\mathbb{R}^d$, an optimal set of n-means forms a CVT; however, the converse is not true in general [7,8]. The following fact is known [6,9]:
Proposition 1.
Let α be an optimal set of n-means and a α . Then,
(i)
$P(M(a|\alpha))>0$ and $P(\partial M(a|\alpha))=0$,
(ii)
$a=E(X:X\in M(a|\alpha))$, where X is a random variable with distribution P,
(iii)
P-almost surely, the set $\{M(a|\alpha):a\in\alpha\}$ forms a Voronoi partition of $\mathbb{R}^d$.
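Condition (ii) of Proposition 1 is the fixed-point property behind Lloyd's algorithm: alternately recompute the Voronoi regions and move each generator to the centroid of its own region. The sketch below is our illustration on a one-dimensional empirical sample (uniform on $[0,1]$ as a stand-in measure); it produces a CVT, which, as noted above, need not be an optimal set in general.

```python
import random

def lloyd(samples, alpha, iters=30):
    """Lloyd iteration: assign samples to nearest generator (Voronoi step),
    then replace each generator by the centroid of its cell (condition (ii))."""
    alpha = list(alpha)
    for _ in range(iters):
        cells = [[] for _ in alpha]
        for x in samples:
            i = min(range(len(alpha)), key=lambda k: (x - alpha[k]) ** 2)
            cells[i].append(x)
        # keep a generator in place if its cell happens to be empty
        alpha = [sum(c) / len(c) if c else a for c, a in zip(cells, alpha)]
    return sorted(alpha)

random.seed(2)
samples = [random.random() for _ in range(20_000)]  # Uniform[0,1] stand-in
alpha = lloyd(samples, [0.1, 0.9])
```

Starting from $\{0.1, 0.9\}$, the iteration settles near $\{1/4, 3/4\}$, which for the uniform distribution happens to be the optimal set of two-means as well.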
Let $X=\mathbb{R}$ and consider the probability distribution $P_c:=\frac12 P_c\circ U_1^{-1}+\frac12 P_c\circ U_2^{-1}$, where $U_1(x)=\frac13 x$ and $U_2(x)=\frac13 x+\frac23$ for all $x\in\mathbb{R}$. Because its support is the standard Cantor set generated by $U_1$ and $U_2$, $P_c$ is called the Cantor distribution. S. Graf and H. Luschgy determined the optimal sets of n-means and the nth quantization errors for the Cantor distribution for all $n\ge1$, completing its quantization program [10]. This result has been extended to the setting of a nonuniform Cantor distribution by L. Roychowdhury [11]. Analogously, the Cantor dust is generated by the contractive mappings $\{S_i\}_{i=1}^4$ on $\mathbb{R}^2$, where $S_1(x_1,x_2)=\frac13(x_1,x_2)$, $S_2(x_1,x_2)=\frac13(x_1,x_2)+(\frac23,0)$, $S_3(x_1,x_2)=\frac13(x_1,x_2)+(0,\frac23)$, and $S_4(x_1,x_2)=\frac13(x_1,x_2)+(\frac23,\frac23)$. If P is a Borel probability measure on $\mathbb{R}^2$ such that $P=\frac14 P\circ S_1^{-1}+\frac14 P\circ S_2^{-1}+\frac14 P\circ S_3^{-1}+\frac14 P\circ S_4^{-1}$, then the support of P is the Cantor dust. For this measure, D. Çömez and M.K. Roychowdhury determined the optimal sets of n-means and the nth quantization errors [12]. Let P be a probability measure on $\mathbb{R}$ generated by an infinite collection of similitudes $\{S_j\}_{j=1}^{\infty}$, where $S_j(x)=\frac{1}{3^j}x+1-\frac{1}{3^{j-1}}$ for all $x\in\mathbb{R}$, and P is given by $P=\sum_{j=1}^{\infty}\frac{1}{2^j}P\circ S_j^{-1}$. For this measure, M.K. Roychowdhury determined the optimal sets of n-means and the nth quantization errors [13], an infinite extension of the result of S. Graf and H. Luschgy in [10]. The quantization dimension for probability distributions generated by an infinite collection of similitudes was determined by E. Mihailescu and M.K. Roychowdhury in [14], an infinite extension of the result of S. Graf and H. Luschgy in [15]. In this article, we study the extension of the result of D. Çömez and M.K. Roychowdhury in [12] to the setting of countably infinite affine maps on $\mathbb{R}^2$, which will also complete the program initiated in [14].
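The Cantor distribution is easy to simulate: a $P_c$-distributed point has independent base-3 digits equal to 0 or 2 with probability $\frac12$ each. The sketch below (ours, truncated at 40 digits) checks the known moments $E(X)=\frac12$ and $V(X)=\frac18$ of the Cantor distribution by Monte Carlo.

```python
import random

def cantor_sample(depth=40):
    """Draw from the Cantor distribution: base-3 digits chosen from {0, 2}
    with probability 1/2 each, truncated after `depth` digits."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

random.seed(3)
xs = [cantor_sample() for _ in range(100_000)]
mean = sum(xs) / len(xs)                           # should be near E(X) = 1/2
var = sum((x - mean) ** 2 for x in xs) / len(xs)   # should be near V(X) = 1/8
```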
Let $\{S_{(i,j)}:i,j\in\mathbb{N}\}$ be a collection of countably infinite affine transformations on $\mathbb{R}^2$, where $S_{(i,j)}(x_1,x_2)=(r^i x_1+1-r^{i-1},\ r^j x_2+1-r^{j-1})$ with $0<r\le\frac13$. Clearly, these affine transformations are all contractive, but they are not similarity mappings. Associate with the mappings $S_{(i,j)}$ the probabilities $p_{(i,j)}$ such that $p_{(i,j)}=\frac{1}{2^{i+j}}$ for all $i,j\in\mathbb{N}$, where $\mathbb{N}:=\{1,2,3,\dots\}$. Then, there exists a unique Borel probability measure P on $\mathbb{R}^2$ ([16,17,18], etc.) such that
$$P=\sum_{i,j=1}^{\infty}p_{(i,j)}\,P\circ S_{(i,j)}^{-1}.$$
The support of such a probability measure lies in the unit square $[0,1]^2$. We call such a measure an affine measure on $\mathbb{R}^2$ or, more specifically, an infinitely generated affine measure on $\mathbb{R}^2$. This article deals with the quantization of this measure P. The arrangement of the paper is as follows: in Section 2, we discuss the basic definitions and lemmas about the optimal sets of n-means and the nth quantization errors. The arguments in this section point out that determining the optimal sets of n-means and the nth quantization errors for all $n\ge3$ and for arbitrary $r\in(0,\frac13]$ requires very intricate and complicated analysis; hence, for clarity, in the remaining sections the focus will be on the case $r=\frac13$. Section 3 is devoted to determining the optimal sets of n-means for $n=2$ and $n=3$. In Section 4, we define a mapping F which enables us to convert the infinitely generated affine measure P into the finitely generated product measure $P_c\times P_c$, where each factor $P_c$ is the Cantor distribution. Using this connection between P and $P_c$, together with the optimal sets of n-means for $n=1,2,3$, in Section 5 we utilize the dynamics of the affine maps to obtain the main results of the paper: closed formulas determining the optimal sets of n-means and the corresponding quantization errors for all $n\ge4$. For clarity of exposition, we also provide some examples and figures to illustrate the constructions. Lastly, having a closed form for the quantization error for each n, we prove that the quantization dimension $D(P)$ exists, and we show that the $D(P)$-dimensional quantization coefficient for P does not exist, while the $D(P)$-dimensional lower and upper quantization coefficients are finite and lie in the closed interval $[\frac{1}{12},\frac54]$.
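For the case $r=\frac13$ treated in the later sections, the measure P can be sampled by a chaos game: pick indices $(i,j)$ with probability $\frac{1}{2^{i+j}}$ (two independent Geometric(1/2) draws) and apply $S_{(i,j)}$ repeatedly. The sketch below is our illustration; since every map contracts by at least $\frac13$ in each coordinate, composing 40 random maps approximates a draw from P to within $3^{-40}$.

```python
import random

R = 1 / 3   # the contraction ratio focused on in the later sections

def S(i, j, x):
    """S_(i,j)(x1, x2) = (R^i x1 + 1 - R^(i-1), R^j x2 + 1 - R^(j-1))."""
    x1, x2 = x
    return (R ** i * x1 + 1 - R ** (i - 1), R ** j * x2 + 1 - R ** (j - 1))

def geometric():
    """Index i = 1, 2, ... drawn with probability 1/2^i."""
    i = 1
    while random.random() < 0.5:
        i += 1
    return i

def sample_P(depth=40):
    """Approximate draw from P: compose `depth` random maps (chaos game)."""
    x = (0.5, 0.5)
    for _ in range(depth):
        x = S(geometric(), geometric(), x)
    return x

random.seed(4)
pts = [sample_P() for _ in range(50_000)]
mean = tuple(sum(c) / len(pts) for c in zip(*pts))  # should be near (1/2, 1/2)
```

All sampled points stay in the unit square, and the sample mean is close to $(\frac12,\frac12)$, consistent with the symmetry of the construction.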
The results and the arguments in this article are not straightforward generalizations of those in [13]; in particular, this is the case for optimal sets. By the nature of the affine transformations considered in this paper, the optimal sets of order $n=k^2$, $k\ge1$, are the same as the cross products of the optimal sets of order k obtained in [13]; however, the same cannot be said for other $n\ge3$. Clearly, for n a prime number, optimal sets of n-means cannot be obtained this way. Furthermore, as will be seen from the main theorem, even for $n=kl$, optimal sets of n-means need not be cross products of the optimal sets of k- and l-means in [13]. For example, the optimal sets of 2- and 3-means in [13] are $\{\frac16,\frac56\}$ and $\{\frac16,\frac{13}{18},\frac{17}{18}\}$ (or $\{\frac{1}{18},\frac{5}{18},\frac56\}$), respectively; hence, the cross products of these sets produce some of the optimal sets of 6-means. On the other hand, one of the optimal sets of 6-means is $\{(\frac{1}{18},\frac16),(\frac56,\frac16),(\frac{13}{18},\frac16),(\frac{1}{18},\frac56),(\frac{13}{18},\frac56),(\frac56,\frac56)\}$, which cannot be obtained as a cross product of optimal sets of 2- and 3-means in [13].

2. Preliminaries

Let P be the affine measure on R 2 generated by the affine maps { S ( i , j ) : i , j N } defined above. Consider the alphabet I = N 2 = { ( i , j ) : i , j N } . By a "string" or a "word" ω over I , it is meant a finite sequence ω : = ω 1 ω 2 ω k of symbols from the alphabet, k 1 , where k is called the length of the word ω . A word of length zero is called the "empty word", and is denoted by ∅. By I * we denote the set of all words over the alphabet I of some finite length k , including the empty word ∅. By | ω | , we denote the length of a word ω I * . For any two words ω : = ω 1 ω 2 ω k and τ : = τ 1 τ 2 τ in I * , by ω τ : = ω 1 ω k τ 1 τ we mean the word obtained from the concatenation of ω and τ . For n 1 and ω = ω 1 ω 2 ω n I * we define ω : = ω 1 ω 2 ω n 1 . Note that ω is the empty word if the length of ω is one. Analogously, by N * we denote the set of all words over the alphabet N , and for any τ N * , | τ | , τ , etc. are defined similarly. Let ω I k , k 1 , be such that ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) , then ω ( 1 ) and ω ( 2 ) will denote the “coordinate words"; i.e., ω ( 1 ) : = i 1 i 2 i k and ω ( 2 ) : = j 1 j 2 j k . Thus, ω | ω | ( 1 ) = i k and ω | ω | ( 2 ) = j k . These lead us to define the following notations: For ω I * , by ω ( , ) it is meant the set of all words ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) obtained by concatenating the word ω with the word ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) for j N , i.e.,
ω ( , ) : = { ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) : j N } .
Similarly, ω ( , ) and ω ( , ) represent the sets
ω ( , ) : = { ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) : i N } and ω ( , ) : = { ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) : i , j N } ,
respectively. Analogously, for any τ N * , by ( τ , ) it is meant the set ( τ , ) : = { τ + i : i N } , and ( τ , ) represents the set ( τ , ) : = { τ } . Thus, if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = ( i 1 i 2 i k , ) and ω ( 2 ) : = j 1 j 2 j k ; if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = i 1 i 2 i k and ω ( 2 ) : = ( j 1 j 2 j k , ) ; and if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) ( , ) , then we write ω ( 1 ) : = ( i 1 i 2 i k , ) and ω ( 2 ) : = ( j 1 j 2 j k , ) . For ω = ω 1 ω 2 ω k I k , k 1 , let us write
S ω : = S ω 1 S ω k , p ω : = p ω 1 p ω 2 p ω k and J ω : = S ω ( [ 0 , 1 ] × [ 0 , 1 ] ) .
In particular, $S_{\emptyset}$ is the identity mapping on $\mathbb{R}^2$, and $J:=J_{\emptyset}=S_{\emptyset}([0,1]\times[0,1])$. Then, the support of the probability measure P is the closure of the limit set $S$, where $S=\bigcap_{k\in\mathbb{N}}\bigcup_{\omega\in I^k}J_\omega$. The limit set S is called the affine set or the infinitely generated affine set. For $\omega\in I^k$ and $i,j\in\mathbb{N}$, the rectangles $J_{\omega(i,j)}$, into which $J_\omega$ is split up at the $(k+1)$th level, are called the children or the basic rectangles of $J_\omega$ (see Figure 1). For $\omega\in I^*$, we write
J ω ( , ) : = j = 1 J ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) , J ω ( , ) : = i = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) , J ω ( , ) : = i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ;
p ω ( , ) : = P ( J ω ( , ) ) = j = 1 p ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + j ) , p ω ( , ) : = P ( J ω ( , ) ) = i = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) ) , and p ω ( , ) : = P ( J ω ( , ) ) = i , j = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) .
Notice that for any ω I * , p ω ( , ) = p ω j = 1 1 2 ω | ω | ( 1 ) + ω | ω | ( 2 ) + j = p ω p ω | ω | j = 1 1 2 j = p ω p ω | ω | = p ω ; and similarly, p ω ( , ) = p ω ( , ) = p ω .
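The identities above reduce to geometric series in the probabilities $p_{(i,j)}=\frac{1}{2^{i+j}}$: summing over all shifts of one or both coordinates of the last symbol returns the probability of the word itself. A quick numerical check (our sketch; the last symbol $(a,b)=(2,3)$ is an arbitrary choice):

```python
# Probabilities p_(i,j) = 1/2^(i+j); check the geometric-series fact behind
# the identities p_omega(...) = p_omega: summing p over all shifts of one or
# both coordinates of the last symbol reproduces that symbol's probability.
def p(i, j):
    return 1 / 2 ** (i + j)

a, b = 2, 3   # arbitrary last symbol (our choice)
row  = sum(p(a, b + j) for j in range(1, 60))                            # shift j only
col  = sum(p(a + i, b) for i in range(1, 60))                            # shift i only
full = sum(p(a + i, b + j) for i in range(1, 60) for j in range(1, 60))  # shift both
```

Each of the three sums agrees with $p_{(a,b)}$ up to the truncation error $2^{-59}$.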
Because $P=\sum_{i,j=1}^{\infty}p_{(i,j)}\,P\circ S_{(i,j)}^{-1}$, by induction, $P=\sum_{\omega\in I^k}p_\omega\,P\circ S_\omega^{-1}$ for any $k\in\mathbb{N}$. Hence, we have the following statement:
Lemma 1.
Let $f:\mathbb{R}^2\to\mathbb{R}_+$ be Borel measurable and $k\in\mathbb{N}$. Then,
$$\int f\,dP=\sum_{\omega\in I^k}p_\omega\int f\circ S_\omega\,dP.$$
Let S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) be the horizontal and vertical components of the transformations S ( i , j ) . Then, for all ( x 1 , x 2 ) R 2 , we have S ( i , j ) ( 1 ) ( x 1 ) = r i x 1 + 1 r i 1 and S ( i , j ) ( 2 ) ( x 2 ) = r j x 2 + 1 r j 1 ; hence, S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) are similarity mappings on R with similarity ratios s ( i , j ) ( 1 ) : = r i and s ( i , j ) ( 2 ) : = r j , respectively. Similarly, for ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) I k , k 1 , let S ω ( 1 ) and S ω ( 2 ) represent the horizontal and vertical components of the transformation S ω on R 2 . Then, S ω ( 1 ) and S ω ( 2 ) are similarity mappings on R with similarity ratios s ω ( 1 ) and s ω ( 2 ) , respectively, such that S ω ( 1 ) = S ( i 1 , j 1 ) ( 1 ) S ( i k , j k ) ( 1 ) and S ω ( 2 ) = S ( i 1 , j 1 ) ( 2 ) S ( i k , j k ) ( 2 ) . Thus, it follows that
s ω ( 1 ) = s ( i 1 , j 1 ) ( 1 ) s ( i 2 , j 2 ) ( 1 ) s ( i k , j k ) ( 1 ) = r i 1 + i 2 + + i k and s ω ( 2 ) = s ( i 1 , j 1 ) ( 2 ) s ( i 2 , j 2 ) ( 2 ) s ( i k , j k ) ( 2 ) = r j 1 + j 2 + + j k .
Moreover, we have P ( J ω ) = p ω = p ( i 1 , j 1 ) p ( i 2 , j 2 ) p ( i k , j k ) = 1 2 i 1 + i 2 + + i k + j 1 + j 2 + + j k . Let X : = ( X 1 , X 2 ) be a bivariate random variable with distribution P. Let P 1 , P 2 be the marginal distributions of P, i.e., P 1 ( A ) = P ( A × R ) = P π 1 1 ( A ) for all A B , and P 2 ( B ) = P ( R × B ) = P π 2 1 ( B ) for all B B , where π 1 , π 2 are projections given by π 1 ( x 1 , x 2 ) = x 1 and π 2 ( x 1 , x 2 ) = x 2 for all ( x 1 , x 2 ) R 2 . Here B is the Borel σ -algebra on R . Then, X 1 has distribution P 1 and X 2 has distribution P 2 . Let S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) denote respectively the inverse images of the horizontal and vertical components of the transformations S ( i , j ) for all i , j N . Then, the following lemma is known [16,17,18]:
Lemma 2.
Let P 1 and P 2 be the marginal distributions of the probability measure P. Then,
$$P_1=\sum_{i=1}^{\infty}\frac{1}{2^i}\,P_1\circ\big(S_{(i,j)}^{(1)}\big)^{-1}\quad\text{and}\quad P_2=\sum_{j=1}^{\infty}\frac{1}{2^j}\,P_2\circ\big(S_{(i,j)}^{(2)}\big)^{-1}.$$
Remark 1.
Since S ( i , j ) ( 1 ) and S ( i , j ) ( 2 ) are similarity mappings, from Lemma 2, one can see that both the marginal distributions P 1 and P 2 are self-similar measures on R generated by an infinite collection of similarities associated with the probability vector ( 1 2 , 1 2 2 , ) .
Lemma 3.
Let E ( X ) and V ( X ) denote the expectation and the variance of the random variable X. Then,
$$E(X)=(E(X_1),E(X_2))=\Big(\frac12,\frac12\Big)\quad\text{and}\quad V:=V(X)=E\Big\|X-\Big(\frac12,\frac12\Big)\Big\|^2=\frac14.$$
Proof. 
By Lemma 2, $P_1=P_2=\mu$, where μ is the unique Borel probability measure on $\mathbb{R}$ such that
$$\mu=\sum_{k=1}^{\infty}\frac{1}{2^k}\,\mu\circ\big(S_{(k,j)}^{(1)}\big)^{-1}=\sum_{k=1}^{\infty}\frac{1}{2^k}\,\mu\circ\big(S_{(i,k)}^{(2)}\big)^{-1}.$$
Hence, $X_1$ and $X_2$ are identically distributed, and by ([11], Lemma 2.2), $E(X_1)=E(X_2)=\frac12$ and $V(X_1)=V(X_2)=\frac18$, which implies that $E\|X-(\frac12,\frac12)\|^2=E(X_1-\frac12)^2+E(X_2-\frac12)^2=V(X_1)+V(X_2)=\frac14$. □
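Lemma 3 can be checked by solving the first- and second-moment fixed-point equations of $\mu=\sum_k 2^{-k}\mu\circ S_k^{-1}$, $S_k(x)=r^kx+1-r^{k-1}$, numerically. The sketch below is ours; it uses exact rational arithmetic with $r=\frac13$ and truncates the series at 60 terms (the tails are of order $(r/2)^{60}$ and negligible).

```python
from fractions import Fraction

r = Fraction(1, 3)
N = 60   # truncation level; tail terms are O((r/2)^N)

# Moment equations for mu = sum_k 2^{-k} mu o S_k^{-1}, S_k(x) = r^k x + c_k,
# with c_k = 1 - r^(k-1):
#   m1 = sum_k 2^{-k} (r^k  m1 + c_k)
#   m2 = sum_k 2^{-k} (r^{2k} m2 + 2 r^k c_k m1 + c_k^2)
a1 = sum(Fraction(1, 2 ** k) * r ** k for k in range(1, N))
b1 = sum(Fraction(1, 2 ** k) * (1 - r ** (k - 1)) for k in range(1, N))
m1 = b1 / (1 - a1)                     # first moment, should be 1/2
a2 = sum(Fraction(1, 2 ** k) * r ** (2 * k) for k in range(1, N))
b2 = sum(Fraction(1, 2 ** k) * (2 * r ** k * (1 - r ** (k - 1)) * m1
                                + (1 - r ** (k - 1)) ** 2)
         for k in range(1, N))
m2 = b2 / (1 - a2)                     # second moment
var1 = m2 - m1 ** 2                    # V(X_1) = V(X_2), should be 1/8
V = 2 * var1                           # V(X) = V(X_1) + V(X_2), should be 1/4
```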
Remark 2.
By using the standard rule of probability, for any $(a,b)\in\mathbb{R}^2$, we have $E\|X-(a,b)\|^2=V+\|(a,b)-(\frac12,\frac12)\|^2$, which yields that the optimal set of one-mean consists of the expected value, and the corresponding quantization error is the variance V of the random variable X.
Lemma 4.
Let ω I * . Then,
( i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) 1 2 ( 1 r ) , s ω ( 2 ) 1 2 ( 1 r ) ) ;
( i i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( 0 , s ω ( 2 ) 1 2 ( 1 r ) ) , and
( i i i ) E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) 1 2 ( 1 r ) , 0 ) .
Proof. 
First prove ( i ) . Because P ( J ω ( , ) ) = p ω ( , ) = p ω and p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) = p ω 1 2 i + j ,
E ( X | X J ω ( , ) ) = E ( X | X i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ) = 1 P ( J ω ( , ) ) i , j = 1 p ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) = i , j = 1 1 2 i + j S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) .
Notice that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) .
Because
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) r ω | ω | ( 1 ) + i ( 1 2 ) r ω | ω | ( 1 ) + i 1 r ω | ω | ( 1 ) + 1 ( 1 2 ) + r ω | ω | ( 1 ) + 1 1 = s ω ( 1 ) 1 2 r i r i 1 r 2 + 1 = s ω ( 1 ) ( 1 r 2 ) ( 1 r i 1 ) ,
and similarly
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( 1 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) = s ω ( 2 ) ( 1 r 2 ) ( 1 r j 1 ) .
Hence, we have that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) ( u ) ) , s ω ( 2 ) ( v ) ) , where u = ( 2 r 2 ) ( 1 r i 1 ) and v = ( 2 r 2 ) ( 1 r j 1 ) . Therefore,
E ( X | X J ω ( , ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + i , j = 1 1 2 i + j ( s ω ( 1 ) ( u ) , s ω ( 2 ) ( v ) ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) + ( s ω ( 1 ) ( 1 r 2 ) , s ω ( 2 ) ( 1 r 2 ) ) .
Proofs of (ii) and (iii) are similar. □
Note 1. 
For words β , γ , , δ in I * , by a ( β , γ , , δ ) we denote the conditional expectation of the random variable X given J β J γ J δ , i.e.,
a ( β , γ , , δ ) = E ( X | X J β J γ J δ ) = 1 P ( J β J δ ) J β J δ ( x 1 , x 2 ) d P .
Then, for ω I * ,
a ( ω ) = S ω ( E ( X ) ) = S ω ( 1 2 , 1 2 ) , a ( ω ( , ) ) = E ( X | X J ω ( , ) ) , a ( ω ( , ) ) = E ( X | X J ω ( , ) ) , and a ( ω ( , ) ) = E ( X | X J ω ( , ) ) .
Thus, by Lemma 4, if ω = ( 1 , 1 ) , then a ( ( 1 , 1 ) ) = ( r 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) ) = ( 1 r 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) ) = ( r 2 , 1 r 2 ) , and a ( ( 1 , 1 ) ( , ) ) = ( 1 r 2 , 1 r 2 ) . In addition,
a ( ( 1 , 1 ) , ( 1 , 1 ) ( , ) ) = ( 1 2 , r 2 ) , a ( ( 1 , 1 ) ( , ) , ( 1 , 1 ) ( , ) ) = ( 1 2 , 1 r 2 ) , a ( ( 1 , 1 ) , ( 1 , 1 ) ( , ) ) = ( r 2 , 1 2 ) , a ( ( 1 , 1 ) ( , ) , ( 1 , 1 ) ( , ) ) = ( 1 r 2 , 1 2 ) .
Moreover, for $\omega\in I^k$, $k\ge1$, it is easy to see that
$$\int_{J_\omega}\|x-(a,b)\|^2\,dP=p_\omega\int\|(x_1,x_2)-(a,b)\|^2\,dP\circ S_\omega^{-1}=p_\omega\Big(s_\omega^{(1)2}V(X_1)+s_\omega^{(2)2}V(X_2)+\big\|S_\omega\big(\tfrac12,\tfrac12\big)-(a,b)\big\|^2\Big),$$
where $s_\omega^{(k)2}:=(s_\omega^{(k)})^2$ for $k=1,2$. The expressions (2) and (4) are useful for obtaining the optimal sets and the corresponding quantization errors with respect to the probability distribution P.
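Formula (4) can be sanity-checked by Monte Carlo for a concrete word, say $\omega=(1,1)$, for which $S_{(1,1)}(x_1,x_2)=(\frac{x_1}{3},\frac{x_2}{3})$, $p_\omega=\frac14$, and $s_\omega^{(1)}=s_\omega^{(2)}=\frac13$. The sketch below is ours; it samples P as the product $P_c\times P_c$ (the identification established later in the paper), uses $V(X_1)=V(X_2)=\frac18$, and takes the arbitrary test point $(a,b)=(0,0)$.

```python
import random

def cantor(depth=40):
    """Cantor-distributed coordinate: random base-3 digits in {0, 2}."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

random.seed(7)
a, b = 0.0, 0.0   # arbitrary test point (our choice)
n = 100_000
# Left side of (4): p_omega * E|| S_omega(X) - (a,b) ||^2 with S_(1,1)(x) = x/3.
lhs = 0.25 * sum((x1 / 3 - a) ** 2 + (x2 / 3 - b) ** 2
                 for x1, x2 in ((cantor(), cantor()) for _ in range(n))) / n
# Right side: p_omega ( s^(1)2 V(X1) + s^(2)2 V(X2) + ||S_omega(1/2,1/2)-(a,b)||^2 ),
# where S_(1,1)(1/2, 1/2) = (1/6, 1/6).
rhs = 0.25 * ((1/9) * (1/8) + (1/9) * (1/8) + (1/6 - a) ** 2 + (1/6 - b) ** 2)
```

For this choice of ω and $(a,b)$, both sides equal $\frac{1}{48}$.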
For the rest of the article, $r=\frac13$ is assumed; this is the most important case due to its intimate connection with the standard Cantor system.

3. Optimal Sets of n-Means for n = 2, 3

In this section, we determine the optimal sets of two- and three-means and their quantization errors.
Lemma 5.
Let P be the affine measure on R 2 and let ω I * . Then,
J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω ( , ) x a ( ω ( , ) ) 2 d P = J ω x a ( ω ) 2 d P = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 .
Proof. 
Let us first prove J ω ( , ) x a ( ω ( , ) ) 2 d P = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 . By Lemma 4, we have
J ω ( , ) x a ( ω ( , ) ) 2 d P = i , j = 1 J ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) x a ( ω ( , ) ) 2 d P = p ω i , j = 1 1 2 i + j S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( x 1 , x 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) ( s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) , s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ) 2 d P .
Note that S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( x 1 , x 2 ) = S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) , S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( x 2 ) and
S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 2 , 1 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) , S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) . Moreover, we have
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ω ( 1 ) 2 S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ω ( 1 ) 2 ( S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) + ( S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ) ) 2 .
Now break the above expression by using the square formula and note the fact that
S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) 2 d P 1 = s ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) 2 V ( X 1 ) = s ( ω | ω | ( 1 ) , ω | ω | ( 2 ) ) ( 1 ) 2 1 9 i 1 8 , and S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) d P 1 = 0 , and after some simplification we have S ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( 1 2 ) S ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 = s ( ω | ω | ( 1 ) , ω | ω | ( 2 ) ) ( 1 ) 2 1 4 ( 1 5 3 i ) 2 .
Thus, it follows that
S ω ( ω | ω | ( 1 ) + i , ω | ω | ( 2 ) + j ) ( 1 ) ( x 1 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) 2 d P 1 = s ω ( 1 ) 2 1 9 i 1 8 + 1 4 ( 1 5 3 i ) 2 , and similarly S ω ( ω | ω | ( 2 ) + i , ω | ω | ( 2 ) + j ) ( 2 ) ( x 2 ) S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) 2 d P 2 = s ω ( 2 ) 2 1 9 j 1 8 + 1 4 ( 1 5 3 j ) 2 .
Therefore, (5) implies that
J ω ( , ) x a ( ω ( , ) ) 2 d P = p ω i , j = 1 1 2 i + j s ω ( 1 ) 2 1 9 i 1 8 + 1 4 ( 1 5 3 i ) 2 + s ω ( 2 ) 2 1 9 j 1 8 + 1 4 ( 1 5 3 j ) 2 = p ω ( s ω ( 1 ) 2 + s ω ( 2 ) 2 ) 1 8 .
Other equalities of the statement are proved similarly. □
Lemma 6.
Let P be the affine measure on $\mathbb{R}^2$ and let $\{(a,p),(b,p)\}$ be a set of two points lying on the line $x_2=p$ for which the distortion error is smallest. Then, $a=\frac16$, $b=\frac56$, $p=\frac12$, and the distortion error is $\frac{5}{36}$.
Proof. 
Let β = { ( a , p ) , ( b , p ) } . Because the points for which the distortion error is smallest are the centroids of their own Voronoi regions, by the properties of centroids, we have
( a , p ) P ( M ( ( a , p ) | β ) ) + ( b , p ) P ( M ( ( b , p ) | β ) ) = ( 1 2 , 1 2 ) ,
which implies $p\,P(M((a,p)|\beta))+p\,P(M((b,p)|\beta))=\frac12$, i.e., $p=\frac12$. Thus, the boundary of the Voronoi regions is the line $x_1=\frac12$. Now, using the definition of conditional expectation,
( a , 1 2 ) = E ( X : X M ( ( a , 1 2 ) | β ) ) = E ( X : X j = 1 J ( 1 , j ) ) = 1 j = 1 p ( 1 , j ) j = 1 p ( 1 , j ) S ( 1 , j ) ( 1 2 , 1 2 ) ,
which implies ( a , 1 2 ) = ( 1 6 , 1 2 ) yielding a = 1 6 . Similarly, b = 5 6 . Then, the distortion error is
min c β x c 2 d P = j = 1 J ( 1 , j ) x ( 1 6 , 1 2 ) 2 d P + i = 2 , j = 1 J ( i , j ) x ( 5 6 , 1 2 ) 2 d P = 5 72 + 5 72 = 5 36 .
This completes the proof of the lemma. □
The following lemma provides us information on where to look for points of an optimal set of two-means.
Lemma 7.
Let P be the affine measure on $\mathbb{R}^2$. The points in an optimal set of two-means cannot lie on an oblique line of the affine set.
Proof. 
In the affine set, among all the oblique lines passing through the point $(\frac12,\frac12)$, the line $x_2=x_1$ has the maximum symmetry; i.e., the affine set is geometrically symmetric with respect to the line $x_2=x_1$. Observe also that if two basic rectangles of similar geometric shape lie on opposite sides of the line $x_2=x_1$ and are equidistant from it, then they have the same probability (see Figure 1); hence, they are symmetric with respect to the probability distribution P. Consequently, among all pairs of points whose Voronoi regions are separated by an oblique line through $(\frac12,\frac12)$, the pair whose Voronoi boundary is the line $x_2=x_1$ gives the smallest distortion error. Again, we know that the two points giving the smallest distortion error are the centroids of their own Voronoi regions. Let $(a_1,b_1)$ and $(a_2,b_2)$ be the centroids of the left half and the right half of the affine set with respect to the line $x_2=x_1$, respectively. Then, from the definition of conditional expectation, we have
( a 1 , b 1 ) = 2 [ i = 1 , j = i + 1 1 2 i + j S ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 i = 1 j = i + 1 1 2 2 k 1 + i + j S ( k 1 , k 1 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 i = 1 j = i + 1 1 2 2 k 1 + 2 k 2 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 k 3 = 1 i = 1 j = i + 1 1 2 2 k 1 + 2 k 2 + 2 k 3 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( k 3 , k 3 ) ( i , j ) ( 1 2 , 1 2 ) + ] = ( 3 10 , 7 10 ) ,
and
( a 2 , b 2 ) = 2 ( i = 1 j = 1 i 1 1 2 i + j S ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + i + j S ( k 1 , k 1 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + 2 k 2 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( i , j ) ( 1 2 , 1 2 ) + k 1 = 1 k 2 = 1 k 3 = 1 i = 1 j = 1 i 1 1 2 2 k 1 + 2 k 2 + 2 k 3 + i + j S ( k 1 , k 1 ) ( k 2 , k 2 ) ( k 3 , k 3 ) ( i , j ) ( 1 2 , 1 2 ) + ) = ( 7 10 , 3 10 ) .
Let β = { ( 3 10 , 7 10 ) , ( 7 10 , 3 10 ) } . Then, due to symmetry,
min c β x c 2 d P = 2 M ( ( 3 10 , 7 10 ) | β ) x ( 3 10 , 7 10 ) 2 d P .
Write
A : = ( j = 2 4 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 2 6 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 3 5 J ( ( 1 , 1 ) ( 1 , 1 ) ( 1 , 1 ) ( 2 , j ) ) ( j = 2 8 J ( 1 , 1 ) ( 1 , 1 ) ( 1 , j ) ) ( j = 3 6 J ( 1 , 1 ) ( 1 , 1 ) ( 2 , j ) ) J ( 1 , 1 ) ( 1 , 1 ) ( 3 , 4 ) ( j = 2 8 J ( 1 , 1 ) ( 1 , j ) ) ( j = 3 7 J ( 1 , 1 ) ( 2 , j ) ) ( j = 4 6 J ( 1 , 1 ) ( 3 , j ) ) ( j = 2 10 J ( 1 , j ) ) ( j = 3 10 J ( 2 , j ) ) ( j = 4 10 J ( 3 , j ) ) ( j = 5 9 J ( 4 , j ) ) ( j = 6 7 J ( 5 , j ) ) .
Because A is a proper subset of M ( ( 3 10 , 7 10 ) | β ) , we have min c β x c 2 d P > 2 A x ( 3 10 , 7 10 ) 2 d P . Now using (4), and then upon simplification, it follows that
min c β x c 2 d P > 2 A x ( 3 10 , 7 10 ) 2 d P = 0.13899 ,
which is larger than the distortion error $\frac{5}{36}\approx0.13889$ obtained in Lemma 6. Hence, the points in an optimal set of two-means cannot lie on an oblique line of the affine set. Thus, the assertion of the lemma follows. □
Proposition 2.
Let P be the affine measure on R 2 . Then, the sets { ( 1 6 , 1 2 ) , ( 5 6 , 1 2 ) } and { ( 1 2 , 1 6 ) , ( 1 2 , 5 6 ) } form two different optimal sets of two-means with quantization error 5 36 .
Proof. 
By Lemma 7, it is known that the points in an optimal set of two-means cannot lie on an oblique line of the affine set. Thus, by Lemma 6, we see that { ( 1 6 , 1 2 ) , ( 5 6 , 1 2 ) } forms an optimal set of two-means with quantization error 5 36 . Due to symmetry, { ( 1 2 , 1 6 ) , ( 1 2 , 5 6 ) } forms another optimal set of two-means (see Figure 2); thus, the assertion follows. □
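Proposition 2 can be corroborated numerically: sampling P as the product $P_c\times P_c$ (the identification stated in the abstract) and averaging the squared distance to the nearest point of $\{(\frac16,\frac12),(\frac56,\frac12)\}$ should reproduce the quantization error $\frac{5}{36}\approx0.1389$. The sketch and names below are ours.

```python
import random

def cantor(depth=40):
    """Cantor-distributed coordinate: random base-3 digits in {0, 2}."""
    return sum(2 * random.randint(0, 1) / 3 ** k for k in range(1, depth + 1))

def distortion(alpha, n=100_000):
    """Monte Carlo distortion of alpha with respect to P sampled as P_c x P_c."""
    total = 0.0
    for _ in range(n):
        x1, x2 = cantor(), cantor()
        total += min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in alpha)
    return total / n

random.seed(8)
v2 = distortion([(1/6, 1/2), (5/6, 1/2)])   # expect about 5/36 ~ 0.1389
```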
Proposition 3.
Let P be the affine measure on R 2 . Then, the set { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } forms an optimal set of three-means with quantization error 1 12 .
Proof. 
Let us first consider a three-point set β given by β = { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } . Then, by using Lemma 5 and Equation (4), we have
min a β x a 2 d P = J ( 1 , 1 ) x ( 1 6 , 1 6 ) 2 d P + J ( 1 , 1 ) ( , ) x ( 5 6 , 1 6 ) 2 d P + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) x ( 1 2 , 5 6 ) 2 d P = 1 12 .
Because V 3 is the quantization error for an optimal set of three-means, we have 1 12 V 3 . Let α = { ( a i , b i ) : 1 i 3 } be an optimal set of three-means. Because the optimal points are the centroids of their own Voronoi regions, we have α [ 0 , 1 ] × [ 0 , 1 ] . Let A 1 = [ 0 , 1 3 ] × [ 0 , 1 3 ] , A 2 = [ 2 3 , 1 ] × [ 0 , 1 3 ] , A 3 = [ 0 , 1 3 ] × [ 2 3 , 1 ] , and A 4 = [ 2 3 , 1 ] × [ 2 3 , 1 ] . Note that the centroids of A 1 , A 2 , A 3 and A 4 with respect to the probability distribution P are respectively ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) and ( 5 6 , 5 6 ) . Suppose that α does not contain any point from i = 1 4 A i . Then, we can assume that all the points of α are on the line x 2 = 1 2 , i.e., α = { ( a i , 1 2 ) : 1 i 3 } with a 1 < a 2 < a 3 . If a 1 > 1 3 , quantization error can be strictly reduced by moving the point ( a 1 , 1 2 ) to ( 1 3 , 1 2 ) . So, we can assume that a 1 1 3 . Similarly, we can show that a 3 2 3 . Now, if a 2 < 1 3 , then A 3 A 4 M ( ( a 3 , 1 2 ) | α ) . Moreover, for any x = ( x 1 , x 2 ) J ( 1 , 1 ) ( 1 , 1 ) J ( 1 , 3 ) , we have m ( x ) : = min c α ( x 1 , x 2 ) c 2 ( 7 18 ) 2 and so by (4) and Lemma 5, we obtain
m ( x ) 2 d P = J ( 1 , 1 ) ( 1 , 1 ) J ( 1 , 3 ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) m ( x ) 2 d P 1 16 ( 1 81 + 1 81 ) 1 8 + ( 7 18 ) 2 + 1 16 ( 1 9 + 1 27 2 ) 1 8 + ( 7 18 ) 2 + J ( 1 , 1 ) ( , ) J ( 1 , 1 ) ( , ) x ( 5 6 , 1 2 ) 2 d P = 1 16 ( 1 81 + 1 81 ) 1 8 + ( 7 18 ) 2 + 1 16 ( 1 9 + 1 27 2 ) 1 8 + ( 7 18 ) 2 + 5 72 = 1043 11664 > V 3 ,
which is a contradiction, and so a 2 1 3 must be true. If a 2 > 2 3 , similarly we can show that a contradiction arises. So, 1 3 < a 2 < 2 3 . Next, suppose that 1 2 a 2 < 2 3 . Then, we have 1 2 ( a 1 + a 2 ) 1 3 which implies a 1 1 6 , for otherwise quantization error can be strictly reduced by moving a 2 to ( 2 3 , 1 2 ) , contradicting the fact that α is an optimal set. Then, j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) M ( ( a 1 , 1 2 ) | α ) and E ( X : X j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) ) = ( 1 18 , 1 2 ) . So, for any ( x 1 , x 2 ) i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) , min c α ( x 1 , x 2 ) c 2 ( x 1 , x 2 ) ( 1 6 , 1 2 ) 2 . If A = j = 1 J ( 1 , 1 ) ( 1 , j ) i = 2 , j = 1 J ( 1 , i ) ( 1 , j ) , B = i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) ,   A = j = 1 J ( 1 , 1 ) ( 1 , j ) and B = k = 1 , i = 2 , j = 1 J ( k , 2 ) ( i , j ) , then
m ( x ) 2 d P > A ( x 1 , x 2 ) ( 1 18 , 1 2 ) 2 d P + B ( x 1 , x 2 ) ( 1 6 , 1 2 ) 2 d P = 2 A x ( 1 18 , 1 2 ) 2 d P + i = 2 , j = 1 J ( 1 , 1 ) ( i , j ) x ( 1 6 , 1 2 ) 2 d P + B x ( 1 6 , 1 2 ) 2 d P = 2 · 41 2592 + 5 288 + 551 14688 = 953 11016 > V 3 ,
which is a contradiction. Similarly, if we assume 1 3 a 2 < 1 2 , a contradiction will arise. Therefore, all the points in α cannot lie on the line x 2 = 1 2 . Let ( a 1 , b 1 ) and ( a 3 , b 3 ) lie on the line x 2 = 1 2 , and let ( a 2 , b 2 ) lie above or below the horizontal line x 2 = 1 2 . If ( a 2 , b 2 ) is above the horizontal line, then the quantization error can be strictly reduced by moving ( a 1 , b 1 ) to A 1 and ( a 3 , b 3 ) to A 2 , contradicting the fact that α is an optimal set. Similarly, if ( a 2 , b 2 ) is below the horizontal line, a contradiction will arise. All these contradictions arise due to our assumption that α does not contain any point from i = 1 4 A i . Hence, α contains at least one point from i = 1 4 A i . To complete the proof of the proposition, we first prove the following claim:
Claim 1. 
card ( { i : α A i , 1 i 4 } ) = 2 .
For the sake of contradiction, assume that card ( { i : α A i , 1 i 4 } ) = 1 . Then, without any loss of generality, we assume that ( a 1 , b 1 ) A 1 and ( a i , b i ) A 2 A 3 A 4 for i = 2 , 3 . Due to the symmetry of the affine set with respect to the diagonal x 2 = x 1 , we can assume that ( a 1 , b 1 ) A 1 lies on the diagonal x 2 = x 1 ; ( a 2 , b 2 ) and ( a 3 , b 3 ) are equidistant from the diagonal x 2 = x 1 and lie on opposite sides of the diagonal x 2 = x 1 . Now, consider the following cases:
Case 1. Assume that both ( a 2 , b 2 ) and ( a 3 , b 3 ) are below the diagonal x 2 = 1 x 1 , but not in A 1 A 2 A 3 . Let ( a 2 , b 2 ) be above the diagonal x 2 = x 1 and ( a 3 , b 3 ) be below the diagonal x 2 = x 1 . In that case, the quantization error can be strictly reduced by moving ( a 2 , b 2 ) to A 3 and ( a 3 , b 3 ) to A 2 which contradicts the optimality of α .
Case 2. Assume that both ( a 2 , b 2 ) and ( a 3 , b 3 ) are above the diagonal x 2 = 1 x 1 . Let ( a 2 , b 2 ) lie above the diagonal x 2 = x 1 and ( a 3 , b 3 ) lie below the diagonal x 2 = x 1 . Then, due to symmetry we can assume that ( a 1 , b 1 ) = ( 1 6 , 1 6 ) which is the centroid of A 1 , ( a 2 , b 2 ) = ( 1 2 , 5 6 ) which is the midpoint of the line segment joining the centroids of A 3 and A 4 , ( a 3 , b 3 ) = ( 5 6 , 1 2 ) which is the midpoint of the line segment joining the centroids of A 2 and A 4 . Then,
m ( x ) 2 d P = J ( 1 , 1 ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P + J ( 1 , 1 ) ( , ) m ( x ) 2 d P 1 144 + J ( 1 , 1 ) ( , ) x ( 1 2 , 5 6 ) 2 d P + J ( 1 , 1 ) ( , ) x ( 5 6 , 1 2 ) 2 d P + i = 2 j = i + 1 J ( i , j ) x ( 1 2 , 5 6 ) 2 d P = 1 144 + 5 144 + 5 144 + 1381 166320 = 7043 83160 > V 3 ,
which is a contradiction. Thus, card ( { i : α A i , 1 i 4 } ) = 1 cannot hold.
Next, for the sake of contradiction, assume that card ( { i : α A i , 1 i 4 } ) = 3 . Then, without any loss of generality we assume that ( a 1 , b 1 ) A 3 , ( a 2 , b 2 ) A 2 and ( a 3 , b 3 ) A 4 . Let A 11 and A 12 be the regions of A 1 which are respectively above and below the diagonal of A 1 passing through ( 0 , 0 ) . Due to symmetry, we must have A 3 A 11 M ( ( a 1 , b 1 ) | α ) and A 2 A 12 M ( ( a 2 , b 2 ) | α ) . Notice that A 3 A 11 M ( ( a 1 , b 1 ) | α ) implies
A 3 i = 1 , j = i + 1 J ( 1 , 1 ) ( i , j ) j = i + 1 k = 1 , i = 1 J ( 1 , 1 ) ( k , k ) ( i , j ) M ( ( a 1 , b 1 ) | α ) ,
and by using (1), we have
E ( X : X A 3 i = 1 , j = i + 1 J ( 1 , 1 ) ( i , j ) j = i + 1 k = 1 , i = 1 J ( 1 , 1 ) ( k , k ) ( i , j ) ) = ( 1385 9438 , 6173 9438 ) ,
which shows that the point ( a 1 , b 1 ) falls below the line x 2 = 2 3 , which is a contradiction, as we assumed that ( a 1 , b 1 ) A 3 . This contradiction arises due to our assumption that card ( { i : α A i , 1 i 4 } ) = 3 . Hence, we conclude that card ( { i : α A i , 1 i 4 } ) = 2 , which proves the claim.
By the claim, we assume that ( a 1 , b 1 ) A 1 and ( a 3 , b 3 ) A 2 . Notice that A 1 , A 2 , A 3 , A 4 are geometrically symmetric, and their centroids are symmetrically distributed over the square [ 0 , 1 ] × [ 0 , 1 ] . Without any loss of generality, we can assume that the optimal point ( a 1 , b 1 ) is the centroid of A 1 , i.e., ( a 1 , b 1 ) = ( 1 6 , 1 6 ) . Then, due to symmetry with respect to the line x 1 = 1 2 , it follows that ( a 3 , b 3 ) = centroid of A 2 = ( 5 6 , 1 6 ) , and ( a 2 , b 2 ) lies on x 1 = 1 2 but above the line x 2 = 1 2 . Now, notice that
min ( a 2 , b 2 ) [ 1 3 , 2 3 ] × [ 2 3 , 1 ] { ( 1 6 , 5 6 ) ( a 2 , b 2 ) 2 + ( 5 6 , 5 6 ) ( a 2 , b 2 ) 2 } = 2 9 ,
which occurs when ( a 2 , b 2 ) = center of [ 1 3 , 2 3 ] × [ 2 3 , 1 ] = ( 1 2 , 5 6 ) . Moreover, the three points ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) and ( 1 2 , 5 6 ) are the centroids of their own Voronoi regions. Thus, { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 2 , 5 6 ) } forms an optimal set of three-means with quantization error V 3 = 1 12 . Hence, the proposition follows. □
Remark 3.
Due to symmetry, in addition to the optimal set given in Proposition 3, there are three more optimal sets of three-means with quantization error V 3 = 1 12 (see Figure 3).
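Writing the affine measure as the product of two Cantor distributions (this identification is established in Section 4 below), the values V 3 = 1 12 and V 4 = 1 36 can be re-derived by a short exact computation. The sketch below is illustrative only; the helper name and its bias-variance framing are ours. It uses the mean 1 2 and variance 1 8 of the Cantor distribution in each coordinate:

```python
from fractions import Fraction as Fr

VAR = Fr(1, 8)  # variance of the Cantor distribution P_c; its mean is 1/2

def cell_distortion(k1, k2, m1, m2, c):
    """Exact distortion of the point c over a product cell A_sigma x A_tau with
    |sigma| = k1, |tau| = k2 and centroid (m1, m2), for P = P_c x P_c.
    Uses the split E|X - c|^2 = Var(X) + |E(X) - c|^2 in each coordinate."""
    weight = Fr(1, 2) ** (k1 + k2)                  # P-measure of the cell
    var = (Fr(1, 9) ** k1 + Fr(1, 9) ** k2) * VAR   # conditional variance
    bias = (m1 - c[0]) ** 2 + (m2 - c[1]) ** 2
    return weight * (var + bias)

# V_4: the four level-1 cells, each quantized at its own centroid
centroids = [(Fr(1, 6), Fr(1, 6)), (Fr(5, 6), Fr(1, 6)),
             (Fr(1, 6), Fr(5, 6)), (Fr(5, 6), Fr(5, 6))]
V4 = sum(cell_distortion(1, 1, m1, m2, (m1, m2)) for m1, m2 in centroids)

# V_3: the two bottom cells at their centroids, the whole top strip at (1/2, 5/6)
V3 = (cell_distortion(1, 1, Fr(1, 6), Fr(1, 6), (Fr(1, 6), Fr(1, 6)))
      + cell_distortion(1, 1, Fr(5, 6), Fr(1, 6), (Fr(5, 6), Fr(1, 6)))
      + cell_distortion(0, 1, Fr(1, 2), Fr(5, 6), (Fr(1, 2), Fr(5, 6))))

assert V3 == Fr(1, 12) and V4 == Fr(1, 36)
```

The same routine reproduces the contribution 5 72 of the top strip used in the proof of Proposition 3.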

4. Affine Measures

In this section, we show that the affine measure P under consideration is the direct product of the Cantor distribution P c with itself.
For the rest of the article, by a word σ of length k over the alphabet { 1 , 2 } we mean σ : = σ 1 σ 2 σ k { 1 , 2 } k , k 1 . The word of length zero is the empty word ∅. { 1 , 2 } * denotes the set of all words over the alphabet { 1 , 2 } including the empty word ∅. The length of a word σ { 1 , 2 } * is denoted by | σ | . If σ = σ 1 σ 2 σ k , we write U σ : = U σ 1 U σ 2 U σ k , and U ∅ denotes the identity mapping on R . By u σ we denote the similarity ratio of U σ . If X c is the random variable with distribution P c , then E ( X c ) = 1 2 and V ( X c ) = 1 8 [10]. For σ { 1 , 2 } * , write A ( σ ) : = U σ ( 1 2 ) . Notice that for σ { 1 , 2 } * , we have 1 2 ( A ( σ 1 ) + A ( σ 2 ) ) = A ( σ ) , u σ = 1 3 | σ | , the contractive factor of U σ , and for the empty word ∅, A ( ∅ ) = 1 2 . For σ { 1 , 2 } * define A σ : = U σ [ 0 , 1 ] . For any positive integer n, by 2 * n we mean the concatenation of the symbol 2 with itself n times, i.e., 2 * n = 222 ( n times ) , with the convention that 2 * 0 is the empty word. For any positive integer k, by { 1 , 2 } k * 2 we mean the direct product of the set { 1 , 2 } k with itself, and by { 1 , 2 } 0 * 2 the set { ( ∅ , ∅ ) } . Also, recall the notations defined in Section 2. Let us now introduce the map F : N * { ( σ , ∞ ) : σ N * } { 1 , 2 } * such that
F ( x ) = f ( σ 1 ) f ( σ 2 ) f ( σ | σ | ) if x = σ = σ 1 σ 2 σ | σ | , f ( σ 1 ) f ( σ 2 ) f ( σ | σ | , ∞ ) if x = ( σ 1 σ 2 σ | σ | , ∞ ) , ∅ if x = ∅ ,
where f : N { ( n , ∞ ) : n N } { 1 , 2 } * { ∅ } is such that
f ( x ) = 2 * ( n 1 ) 1 if x = n for some n N , 2 * n if x = ( n , ∞ ) for some n N .
The function f is one-to-one and onto, and consequently, F is also one-to-one and onto. For any σ N * , write A F ( σ ) : = A ( F ( σ ) ) and A F ( σ , ) : = A ( F ( σ , ) ) .
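The maps f and F are easy to sketch in code. In the following illustration (the encoding of a letter ( n , ∞ ) as a pair with the marker "inf", and the use of strings and tuples, are our own representation choices):

```python
def f(x):
    """The letter map f of (6): an integer n stands for n in N, and a pair
    (n, "inf") stands for (n, infinity). Returns a word over {1, 2}."""
    if isinstance(x, tuple):          # x = (n, infinity)
        return "2" * x[0]             # f((n, infinity)) = 2*n
    return "2" * (x - 1) + "1"        # f(n) = 2*(n-1) followed by 1

def F(word):
    """F concatenates f over the letters of a word in N* (here, a tuple)."""
    return "".join(f(letter) for letter in word)

assert (F((1,)), F((2,)), F((1, 2)), F(((3, "inf"),))) == ("1", "21", "121", "222")
# |F(sigma)| equals the sum of the letters of sigma, the fact used in Lemma 10:
assert len(F((2, 3, 1))) == 2 + 3 + 1
```

Since each f-image either ends in 1 or consists entirely of 2s, distinct inputs concatenate to distinct words, which is one way to see that F is one-to-one.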
The map F is instrumental in converting the infinitely generated affine measure P to a finitely generated affine measure P c × P c . Furthermore, to improve the clarity of the arguments, we will write T i for S ( i , j ) ( 1 ) , and T j for S ( i , j ) ( 2 ) , where T k for all k 1 form an infinite collection of similarity mappings on R such that T k ( x ) = 1 3 k x + 1 1 3 k 1 for all x R . Thus, if ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i n , j n ) , then S ω ( 1 ) = T i 1 T i n = T i 1 i 2 i n and S ω ( 2 ) = T j 1 T j n = T j 1 j 2 j n for all n 1 . Again, T is the identity mapping on R .
Lemma 8.
Let T k for k 1 be the infinite collection of similitudes defined above, and U 1 and U 2 be the similitudes generating the Cantor set. Then, for any σ N * and x R , we have T σ ( x ) = U F ( σ ) ( x ) .
Proof. 
If σ = 1 , then T 1 ( x ) = 1 3 x = U 1 ( x ) = U F ( 1 ) ( x ) for any x R . Assume that the lemma is true if σ = k for some positive integer k, i.e., T k ( x ) = U F ( k ) ( x ) . Then,
U F ( k + 1 ) ( x ) = U 2 * k 1 ( x ) = U 2 * ( k 1 ) 21 ( x ) = U 2 * ( k 1 ) U 21 ( x ) = U 2 * ( k 1 ) ( 1 9 x + 2 3 ) = U 2 * ( k 1 ) 1 ( 3 ( 1 9 x + 2 3 ) ) = U F ( k ) ( 1 3 x + 2 ) = T k ( 1 3 x + 2 ) = 1 3 k ( 1 3 x + 2 ) + 1 1 3 k 1 = 1 3 k + 1 x + 1 1 3 k = T k + 1 ( x ) .
Thus, by the Principle of Mathematical Induction, T k ( x ) = U F ( k ) ( x ) for all k N . Again, for any τ , δ N * , by (6), it follows that F ( σ δ ) = F ( σ ) F ( δ ) . Hence, for any σ = σ 1 σ 2 σ n N * , n 1 , we have
T σ ( x ) = T σ 1 T σ 2 T σ n ( x ) = U F ( σ 1 ) U F ( σ 2 ) U F ( σ n ) ( x ) = U F ( σ ) ( x ) ,
which completes the proof. □
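Lemma 8 can be spot-checked numerically. In the sketch below (words over { 1 , 2 } are strings and words over N are tuples; these representation choices are ours), both sides are evaluated with exact rational arithmetic:

```python
from fractions import Fraction as Fr

def U(word, x):
    """Apply U_word, the composition U_w1 ... U_wk, where
    U_1(x) = x/3 and U_2(x) = x/3 + 2/3."""
    for w in reversed(word):                     # the innermost map acts first
        x = x / 3 if w == "1" else x / 3 + Fr(2, 3)
    return x

def T(sigma, x):
    """Apply T_sigma, where T_k(x) = x/3^k + 1 - 1/3^(k-1)."""
    for k in reversed(sigma):
        x = x / 3 ** k + 1 - Fr(1, 3 ** (k - 1))
    return x

def F(sigma):
    """F restricted to N*: each letter k maps to 2*(k-1) followed by 1."""
    return "".join("2" * (k - 1) + "1" for k in sigma)

x = Fr(7, 5)  # an arbitrary rational test point
assert all(T(s, x) == U(F(s), x) for s in [(1,), (3,), (2, 3), (1, 2, 4)])
```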
Lemma 9.
Let ω I * , and F be the function as defined in (6). Then for r = 1 , 2 , we have A F ( ω ( r ) ) = S ω ( r ) ( 1 2 ) , and A F ( ω ( r ) , ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( r ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( r ) .
Proof. 
By Lemma 8, we have
A F ( ω ( 1 ) ) = U F ( ω ( 1 ) ) ( 1 2 ) = T ω ( 1 ) ( 1 2 ) = S ω ( 1 ) ( 1 2 ) , and similarly A F ( ω ( 2 ) ) = S ω ( 2 ) ( 1 2 ) .
Without any loss of generality, we can assume ω = ( i 1 , j 1 ) ( i 2 , j 2 ) ( i k , j k ) for k 1 . Then,
A F ( ω ( 1 ) , ) = U F ( i 1 i 2 i k , ) ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U F ( i k , ) ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U 2 * i k ( 1 2 ) = U F ( i 1 i 2 i k 1 ) U 2 * i k 1 ( U 1 1 ( 1 2 ) ) = U F ( i 1 i 2 i k 1 ) U F ( i k + 1 ) ( 3 2 ) = U F ( i 1 i 2 i k 1 ( i k + 1 ) ) ( 3 2 ) = T i 1 i 2 i k 1 ( i k + 1 ) ( 3 2 ) = S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) .
Because S ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) = 1 3 i k + 1 3 2 + 1 1 3 i k 1 3 i k + 1 1 2 1 + 1 3 i k = 1 3 i k + 1 , we have
S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) = s ω ( 1 ) ( S ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) S ( i k + 1 , j k + 1 ) ( 1 ) ( 1 2 ) ) = s ω ( 1 ) 1 3 i k + 1 = s ω ( i k + 1 , j k + 1 ) ( 1 ) = s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) , which yields
A F ( ω ( 1 ) , ) = S ω ( i k + 1 , j k + 1 ) ( 1 ) ( 3 2 ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 1 ) . Similarly, A F ( ω ( 2 ) , ) = S ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) ( 1 2 ) + s ω ( ω | ω | ( 1 ) + 1 , ω | ω | ( 2 ) + 1 ) ( 2 ) . □
Remark 4.
By Lemmas 4 and 9, for any ω I * , we have
a ( ω ) = ( A F ( ω ( 1 ) ) , A F ( ω ( 2 ) ) ) , a ( ω ( , ) ) = ( A F ( ω ( 1 ) , ) , A F ( ω ( 2 ) , ) ) , a ( ω ( , ) ) = ( A F ( ω ( 1 ) , ) , A F ( ω ( 2 ) ) ) , and a ( ω ( , ) ) = ( A F ( ω ( 1 ) ) , A F ( ω ( 2 ) , ) ) .
The following example illustrates the outcome of the lemma above.
Example 1.
a ( ( 1 , 1 ) ) = ( A F ( 1 ) , A F ( 1 ) ) = ( A ( 1 ) , A ( 1 ) ) = ( 1 6 , 1 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 , ) , A F ( 1 ) ) = ( A ( 2 ) , A ( 1 ) ) = ( 5 6 , 1 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 ) , A F ( 1 , ) ) = ( A ( 1 ) , A ( 2 ) ) = ( 1 6 , 5 6 ) ,
a ( ( 1 , 1 ) ( , ) ) = ( A F ( 1 , ) , A F ( 1 , ) ) = ( A ( 2 ) , A ( 2 ) ) = ( 5 6 , 5 6 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ) = ( A F ( 11 ) , A F ( 11 ) ) = ( A ( 11 ) , A ( 11 ) ) = ( 1 18 , 1 18 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 , ) , A F ( 11 ) ) = ( A ( 12 ) , A ( 11 ) ) = ( 5 18 , 1 18 ) ,
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 ) , A F ( 11 , ) ) = ( A ( 11 ) , A ( 12 ) ) = ( 1 18 , 5 18 ) , and
a ( ( 1 , 1 ) ( 1 , 1 ) ( , ) ) = ( A F ( 11 , ) , A F ( 11 , ) ) = ( A ( 12 ) , A ( 12 ) ) = ( 5 18 , 5 18 ) , etc.
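The coordinates in Example 1 come from A ( σ ) = U σ ( 1 2 ) , which a few lines of exact arithmetic confirm (an illustrative sketch; the helper name is ours):

```python
from fractions import Fraction as Fr

def A(word):
    """A(sigma) = U_sigma(1/2), with U_1(x) = x/3 and U_2(x) = x/3 + 2/3."""
    x = Fr(1, 2)
    for w in reversed(word):        # the innermost map acts first
        x = x / 3 if w == "1" else x / 3 + Fr(2, 3)
    return x

assert [A(w) for w in ("1", "2", "11", "12")] == \
       [Fr(1, 6), Fr(5, 6), Fr(1, 18), Fr(5, 18)]
assert (A("11") + A("12")) / 2 == A("1")   # the midpoint identity noted above
```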
Lemma 10.
Let μ = k = 1 1 2 k μ T k 1 . Then, for any σ N * , we have μ ( T σ [ 0 , 1 ] ) = P c ( A F ( σ ) ) , where P c : = 1 2 P c U 1 1 + 1 2 P c U 2 1 .
Proof. 
Without any loss of generality, let σ = i 1 i 2 i k for any k 1 . Observe that F ( σ ) = F ( i 1 ) F ( i 2 ) F ( i k ) , and thus | F ( σ ) | = | F ( i 1 ) | + | F ( i 2 ) | + + | F ( i k ) | = i 1 + i 2 + + i k . Consequently,
μ ( T σ [ 0 , 1 ] ) = 1 2 i 1 + i 2 + + i k = 1 2 | F ( σ ) | = P c ( A F ( σ ) ) ,
which proves the lemma. □
Proposition 4.
Let P be the affine measure. Then, P = P c × P c , where P c is the Cantor distribution.
Proof. 
The Borel σ -algebra on the affine set is generated by all sets of the form J ( δ , τ ) for ( δ , τ ) I * , where J ( δ , τ ) = S ( δ , τ ) ( [ 0 , 1 ] × [ 0 , 1 ] ) . Notice that
J ( δ , τ ) = T δ [ 0 , 1 ] × T τ [ 0 , 1 ] = U F ( δ ) [ 0 , 1 ] × U F ( τ ) [ 0 , 1 ] = A F ( δ ) × A F ( τ ) .
Again, the sets of the form A α , where α { 1 , 2 } * , generate the Borel σ -algebra on the Cantor set C. Thus, we see that the Borel σ -algebra of the affine set is the same as the product of the Borel σ -algebras on the Cantor set. Moreover, for any ( δ , τ ) I * , by Remark 1 and Lemma 10, we have
P ( J ( δ , τ ) ) = μ ( T δ [ 0 , 1 ] ) μ ( T τ [ 0 , 1 ] ) = P c ( A F ( δ ) ) P c ( A F ( τ ) ) = ( P c × P c ) ( A F ( δ ) × A F ( τ ) ) .
Hence, the proposition follows. □
Remark 5.
By Proposition 4, it follows that the optimal sets of n-means for P are the same as the optimal sets of n-means for the product measure P c × P c on the affine set. Moreover, for k 1 we can write
P = P c × P c = ( σ , τ ) { 1 , 2 } k * 2 1 4 k ( P c × P c ) ( U σ , U τ ) 1 ,
where for ( x 1 , x 2 ) R 2 , ( U σ , U τ ) 1 ( x 1 , x 2 ) = ( U σ 1 ( x 1 ) , U τ 1 ( x 2 ) ) .

5. Optimal Sets of n-Means for all n 4

In this section, we will derive closed formulas to determine the optimal sets of n-means and the nth quantization error for all n 4 . For ( σ , τ ) { 1 , 2 } k * 2 , write A ( σ , τ ) : = A σ × A τ and U ( σ , τ ) : = ( U σ , U τ ) .
Lemma 11.
Let α be an optimal set of n-means with n 4 . Then, α A ( i , j ) for all 1 i , j 2 .
Proof. 
Let α be an optimal set of n-means for n 4 . As the optimal points are the centroids of their own Voronoi regions, we have α A × A : = [ 0 , 1 ] × [ 0 , 1 ] .
Consider the four-point set β given by β = { ( A ( i ) , A ( j ) ) : 1 i , j 2 } . Then,
min c β x c 2 d P = i , j = 1 2 A ( i , j ) x ( A ( i ) , A ( j ) ) 2 d ( P c × P c ) = i , j = 1 2 1 4 ( 1 9 + 1 9 ) 1 8 = 1 36 .
Because V 4 is the quantization error for four-means, we have 1 36 V 4 V n .
Assume that α does not contain any point from i , j = 1 2 A ( i , j ) . We know that
( a , b ) α ( a , b ) P ( M ( ( a , b ) | α ) ) = ( 1 2 , 1 2 ) .
If all the points of α are below the line x 2 = 1 2 , i.e., if b < 1 2 for all ( a , b ) α , then by (7), we see that 1 2 = ( a , b ) α b P ( M ( ( a , b ) | α ) ) < ( a , b ) α 1 2 P ( M ( ( a , b ) | α ) ) = 1 2 , which is a contradiction. Similarly, a contradiction arises if all the points of α are above the line x 2 = 1 2 , or to the left of the line x 1 = 1 2 , or to the right of the line x 1 = 1 2 .
Next, suppose that all the points of α are on the line x 2 = 1 2 . We will consider two cases: n = 4 and n > 4 . When n = 4 , let α = { ( a i , 1 2 ) : 1 i 4 } with a i < a j for i < j . Due to symmetry, we can assume that the boundaries between the Voronoi regions of the consecutive points ( a 1 , 1 2 ) , ( a 2 , 1 2 ) , ( a 3 , 1 2 ) , and ( a 4 , 1 2 ) are, respectively, the lines x 1 = 1 6 , x 1 = 1 2 , and x 1 = 5 6 , yielding α = { ( 1 18 , 1 2 ) , ( 5 18 , 1 2 ) , ( 13 18 , 1 2 ) , ( 17 18 , 1 2 ) } . Then, writing B : = A ( 11 , 11 ) A ( 11 , 12 ) A ( 11 , 21 ) A ( 11 , 22 ) , by symmetry we have
min c α x c 2 d P = 4 B x ( 1 18 , 1 2 ) 2 d ( P c × P c ) = 8 A ( 11 , 11 ) x ( 1 18 , 1 2 ) 2 d ( P c × P c ) + 8 A ( 11 , 12 ) x ( 1 18 , 1 2 ) 2 d ( P c × P c ) = 8 ( 65 5184 + 17 5184 ) = 41 324 > V 4 ,
which is a contradiction. We consider the case n > 4 . Because for any ( x 1 , x 2 ) i , j = 1 2 A i j , min c α ( x 1 , x 2 ) c 2 1 36 , we have
min c α x c 2 d P = i , j = 1 2 A ( i , j ) min c α x c 2 d ( P c × P c ) i , j = 1 2 A ( i , j ) 1 36 d ( P c × P c ) = 1 36 ,
which implies 1 36 V 4 > V n , a contradiction. Thus, we see that the points of α cannot all lie on x 2 = 1 2 . Similarly, they cannot all lie on x 1 = 1 2 .
Notice that the lines x 1 = 1 2 and x 2 = 1 2 partition the square [ 0 , 1 ] × [ 0 , 1 ] into four quadrants with center ( 1 2 , 1 2 ) . If n = 4 k for some positive integer k, due to symmetry, we can assume that each quadrant contains k points from the set α . But then, any of the k points in the quadrant containing a basic rectangle A ( i , j ) can be moved to A ( i , j ) , which strictly reduces the quantization error; this gives a contradiction, as we assumed that α is an optimal set of n-means that does not contain any point from A ( i , j ) for 1 i , j 2 .
If n = 4 k + 1 , 4 k + 2 , or n = 4 k + 3 , then, again due to symmetry, each quadrant contains at least k points. Then, as in the case n = 4 k , one can strictly reduce the quantization error by moving a point in the quadrant containing a basic rectangle A ( i , j ) to A ( i , j ) for 1 i , j 2 , which is a contradiction.
Thus, we have proved that α A ( i , j ) for all 1 i , j 2 . □
Lemma 12.
Let α be an optimal set of n-means with n 4 . Then, α i , j = 1 2 A ( i , j ) .
Proof. 
By Lemma 11, we know that α A ( i , j ) for all 1 i , j 2 . Now, we will prove the statement by considering four distinct cases:
Case 1: n = 4 k for some integer k 1 .
In this case, due to symmetry, we can assume that α contains k points from each of A ( i , j ) ; otherwise, the quantization error can be reduced by redistributing the points of α equally among A ( i , j ) for 1 i , j 2 , and so α i , j = 1 2 A ( i , j ) .
Case 2: n = 4 k + 1 for some integer k 1 .
In this case, again due to symmetry, we can assume that α contains k points from each of A ( i , j ) , and if possible, one point, say ( a , b ) , from A ( , ) i , j = 1 2 A ( i , j ) . By symmetry, one can assume that ( a , b ) is the midpoint of the line segment joining any two centroids of the basic rectangles A ( i , j ) for 1 i , j 2 . Let us first take ( a , b ) = ( 1 2 , 1 2 ) which is the center of the affine set. For simplicity, we first assume k = 1 , i.e., n = 5 . Then, α contains only one point from each of A ( i , j ) . Let ( a 1 , b 1 ) be the point that α takes from A ( 1 , 1 ) . As ( 1 2 , 1 2 ) lies on the diagonal x 2 = x 1 , due to symmetry we can also assume that ( a 1 , b 1 ) lies on the diagonal x 2 = x 1 . By Proposition 1, we have P ( M ( ( 1 2 , 1 2 ) | α ) ) > 0 . This yields that 1 2 ( ( a 1 , b 1 ) + ( 1 2 , 1 2 ) ) < ( 1 3 , 1 3 ) which implies a 1 < 1 6 and b 1 < 1 6 . Then, we see that
1 36 = V 4 V 5 = 4 A ( 1 , 1 ) min c { ( a 1 , b 1 ) , ( 1 2 , 1 2 ) } x c 2 d P > min c β x c 2 d P = 2 81 V 5 ,
where β = { ( 1 18 , 1 18 ) , ( 1 18 , 5 18 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) , ( 5 6 , 5 6 ) } , which is a contradiction. Similarly, if we take ( a , b ) as the midpoint of the line segment joining the centroids of any two adjacent basic rectangles A ( i , j ) for 1 i , j 2 , a contradiction arises. Proceeding in a similar way for k = 2 , 3 , , we see that a contradiction arises for each value of k. Therefore, α i , j = 1 2 A ( i , j ) .
Case 3: n = 4 k + 2 for some integer k 1 .
In this case, due to symmetry, we can assume that α contains k points from each of A ( i , j ) , and if possible, two points, say ( a 1 , b 1 ) and ( a 2 , b 2 ) , from A ( , ) i , j = 1 2 A ( i , j ) . Then, by symmetry, we can assume that ( a 1 , b 1 ) lies at the midpoint of the line segment joining the centroids of A ( 1 , 1 ) and A ( 2 , 1 ) , and ( a 2 , b 2 ) lies at the midpoint of the line segment joining the centroids of A ( 1 , 2 ) and A ( 2 , 2 ) . As in Case 2, this leads to a contradiction. Thus, α i , j = 1 2 A ( i , j ) .
Case 4: n = 4 k + 3 for some integer k 1 . Due to symmetry, in this case, we can assume that each of A ( 1 , 1 ) and A ( 2 , 1 ) contains k + 1 points and each of A ( 1 , 2 ) and A ( 2 , 2 ) contains k points, while the remaining point lies at the midpoint of the line segment joining the centroids of A ( 1 , 2 ) and A ( 2 , 2 ) . But in that case, proceeding as in Case 2, we can show that a contradiction arises. Thus, α i , j = 1 2 A ( i , j ) .
We have shown that in all possible cases α i , j = 1 2 A ( i , j ) ; hence, the lemma follows. □
Corollary 1.
The set { ( 1 6 , 1 6 ) , ( 5 6 , 1 6 ) , ( 1 6 , 5 6 ) , ( 5 6 , 5 6 ) } is the unique optimal set of four-means of the affine measure P with quantization error V 4 = 1 36 (see Figure 4).
Remark 6.
Let α be an optimal set of n-means, and n i j = card ( β i j ) where β i j = α A ( i , j ) for 1 i , j 2 . Then, 0 | n i j n p q | 1 for 1 i , j , p , q 2 .
Lemma 13.
Let n 4 and α be an optimal set of n-means for the product measure P c × P c . For 1 i , j 2 , set β i j : = α A ( i , j ) , and let n i j = card ( β i j ) . Then, U ( i , j ) 1 ( β i j ) is an optimal set of n i j -means, and V n = i , j = 1 2 1 36 V n i j .
Proof. 
For n 4 , by Lemma 11, we have α = i , j = 1 2 β i j , n = i , j = 1 2 n i j , and so
V n = i , j = 1 2 A ( i , j ) min a β i j x a 2 d ( P c × P c ) .
If U ( 1 , 1 ) 1 ( β 11 ) is not an optimal set of n 11 -means for P c × P c , then there exists a set γ 11 R 2 with card ( γ 11 ) = n 11 such that min a γ 11 x a 2 d ( P c × P c ) < min a U ( 1 , 1 ) 1 ( β 11 ) x a 2 d ( P c × P c ) . But then, δ : = U ( 1 , 1 ) ( γ 11 ) β 12 β 21 β 22 is a set of cardinality n and it satisfies min a δ x a 2 d ( P c × P c ) < min a α x a 2 d ( P c × P c ) , contradicting the fact that α is an optimal set of n-means for P c × P c . Similarly, it can be proved that U ( 1 , 2 ) 1 ( β 12 ) , U ( 2 , 1 ) 1 ( β 21 ) , and U ( 2 , 2 ) 1 ( β 22 ) are optimal sets of n 12 -, n 21 -, and n 22 -means respectively. Thus,
V n = i , j = 1 2 1 4 min a β i j x a 2 d ( ( P c × P c ) U ( i , j ) 1 ) = i , j = 1 2 1 36 min a U ( i , j ) 1 ( β i j ) x a 2 d P = i , j = 1 2 1 36 V n i j ,
which gives the lemma. □
Proposition 5.
Let n N be such that n = 4 ( n ) for some positive integer ( n ) . Then, the set
α 4 ( n ) : = ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( A ( σ ) , A ( τ ) ) }
forms a unique optimal set of n-means for the affine measure P with quantization error
V 4 ( n ) = 1 4 1 9 ( n ) .
Proof. 
We will prove the statement by induction. By Corollary 1, it is true if ( n ) = 1 . Let us assume that it is true for n = 4 k for some positive integer k. We now show that it is also true if n = 4 k + 1 . Let β be an optimal set of 4 k + 1 -means. Set β i j : = β A ( i , j ) for 1 i , j 2 . Then, by Lemmas 11 and 13, U ( i , j ) 1 ( β i j ) is an optimal set of 4 k -means, and so U ( i , j ) 1 ( β i j ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } k * 2 } which implies β i j = { ( A ( i σ ) , A ( j τ ) ) : ( σ , τ ) { 1 , 2 } k * 2 } . Thus, β = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( k + 1 ) * 2 } is an optimal set of 4 k + 1 -means. Because ( A ( σ ) , A ( τ ) ) is the centroid of A ( σ , τ ) for each ( σ , τ ) I k + 1 , the set β is unique. Now, by Lemma 13, we have the quantization error as
V 4 k + 1 = i , j = 1 2 1 36 V 4 k = 1 9 · 1 4 · 1 9 k = 1 4 1 9 k + 1 .
Thus, by induction, the proof of the proposition is complete. □
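Proposition 5 can be sanity-checked by iterating the recursion behind Lemma 13 (an illustrative sketch; the function name is ours):

```python
from fractions import Fraction as Fr

def V_power_of_4(ell):
    """V_{4^ell}: start from V_4 = 1/36 (Corollary 1) and apply the recursion
    of Lemma 13, V_{4n} = 4 * (1/36) * V_n, ell - 1 times."""
    V = Fr(1, 36)
    for _ in range(ell - 1):
        V = 4 * Fr(1, 36) * V
    return V

assert all(V_power_of_4(l) == Fr(1, 4) * Fr(1, 9) ** l for l in range(1, 9))
```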
Definition 1.
For n N with n 4 let ( n ) be the unique natural number with 4 ( n ) < n 2 · 4 ( n ) . For I { 1 , 2 } ( n ) * 2 with card ( I ) = n 4 ( n ) let α n ( I ) be the set defined as follows:
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) .
Remark 7.
In Definition 1, instead of choosing the set { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose { ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } , i.e., the set associated with each ( σ , τ ) I can be chosen in two different ways. Moreover, the subset I can be chosen from { 1 , 2 } ( n ) * 2 in 4 ( n ) C n 4 ( n ) ways. Hence, the number of the sets α n ( I ) is 2 card ( I ) · 4 ( n ) C n 4 ( n ) .
The following example illustrates Definition 1.
Example 2.
Let n = 5 . Then, ( n ) = 1 , I { 1 , 2 } * 2 with card ( I ) = 1 , and so
α 5 ( { ( 1 , 1 ) } ) = { ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( 1 ) ) , ( A ( 2 ) , A ( 2 ) ) } { ( A ( 11 ) , A ( 1 ) ) , ( A ( 12 ) , A ( 1 ) ) } = { ( 1 6 , 5 6 ) , ( 5 6 , 1 6 ) , ( 5 6 , 5 6 ) } { ( 1 18 , 1 6 ) , ( 5 18 , 1 6 ) } ,
or,
α 5 ( { ( 1 , 1 ) } ) = { ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( 1 ) ) , ( A ( 2 ) , A ( 2 ) ) } { ( A ( 1 ) , A ( 11 ) ) , ( A ( 1 ) , A ( 12 ) ) } = { ( 1 6 , 5 6 ) , ( 5 6 , 1 6 ) , ( 5 6 , 5 6 ) } { ( 1 6 , 1 18 ) , ( 1 6 , 5 18 ) } .
Similarly, one can get six more sets by taking I = { ( 1 , 2 ) } , { ( 2 , 1 ) } , or { ( 2 , 2 ) } , i.e., the number of the sets α n ( I ) in this case is 2 card ( I ) · 4 ( n ) C n 4 ( n ) = 8 .
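The count in Remark 7 is easy to tabulate (an illustrative sketch; the function name is ours, and the input is assumed to lie in the regime of Definition 1):

```python
from math import comb

def num_sets_def1(n):
    """Number of the sets alpha_n(I) of Definition 1, namely
    2^(card I) * C(4^l, n - 4^l) with card(I) = n - 4^l,
    for the unique l with 4^l < n <= 2 * 4^l."""
    l = 1
    while 2 * 4 ** l < n:
        l += 1
    assert 4 ** l < n <= 2 * 4 ** l, "n is outside the regime of Definition 1"
    k = n - 4 ** l              # k = card(I)
    return 2 ** k * comb(4 ** l, k)

assert num_sets_def1(5) == 8                     # as computed in Example 2
assert num_sets_def1(6) == 2 ** 2 * comb(4, 2)   # = 24
```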
Proposition 6.
Let n 4 and α n ( I ) be the set as defined in Definition 1. Then, α n ( I ) forms an optimal set of n-means with quantization error
V n = 1 4 1 36 ( n ) 2 · 4 ( n ) n + 5 9 ( n 4 ( n ) ) .
Proof. 
We have n = 4 ( n ) + k where 1 k 4 ( n ) . Set β i j = α A i j with n i j = card ( β i j ) for 1 i , j 2 . Let us prove it by induction. We first assume k = 1 . By Lemmas 11 and 13, we can assume that each of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) is an optimal set of 4 ( n ) 1 -means and U ( 1 , 1 ) 1 ( β 11 ) is an optimal set of ( 4 ( n ) 1 + 1 ) -means. Thus, for ( i , j ) ≠ ( 1 , 1 ) , we can write
U ( i , j ) 1 ( β i j ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 } , and U ( 1 , 1 ) 1 ( β 11 ) = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 { τ } } U τ ( α 2 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 2 is an optimal set of two-means. Thus,
α n ( { ( 1 , 1 ) τ } ) = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( 1 , 1 ) τ } } U ( 1 , 1 ) τ ( α 2 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 2 is an optimal set of two-means. Notice that instead of choosing U ( 1 , 1 ) 1 ( β 11 ) as an optimal set of ( 4 ( n ) 1 + 1 ) -means, one can choose any one of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) as an optimal set of ( 4 ( n ) 1 + 1 ) -means. Hence, for n = 4 ( n ) + 1 , one can write
α n ( I ) = i , j = 1 2 β i j = { ( A ( σ ) , A ( τ ) ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { τ } } U τ ( α 2 ) ,
where I = { τ } for some τ { 1 , 2 } ( n ) * 2 , as an optimal set of n-means. Thus, we see that the proposition is true if n = 4 ( n ) + 1 . Similarly, one can prove that the proposition is true for any 1 k 4 ( n ) . Then, the quantization error is
V n = min ( a , b ) α n ( I ) x ( a , b ) 2 d P = ( σ , τ ) { 1 , 2 } ( n ) * 2 I A σ × A τ x ( A ( σ ) , A ( τ ) ) 2 d ( P c × P c ) + ( σ , τ ) I i = 1 2 A σ i × A τ x ( A ( σ i ) , A ( τ ) ) 2 d ( P c × P c ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I i = 1 2 1 4 ( n ) 1 2 ( u σ i 2 + u τ 2 ) 1 8 = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) ( 1 9 u σ 2 + u τ 2 ) 1 8 .
Because card ( { 1 , 2 } ( n ) * 2 I ) = 2 · 4 ( n ) n , card ( I ) = n 4 ( n ) , u σ = u τ = 1 3 ( n ) , upon simplification, we have V n = 1 4 1 36 ( n ) 2 · 4 ( n ) n + 5 9 ( n 4 ( n ) ) . Thus, the proof of the proposition is complete. □
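The closed formula of Proposition 6 can be cross-checked against the cell-by-cell sum from the proof (an illustrative sketch with exact rationals; the function names are ours):

```python
from fractions import Fraction as Fr

def V_closed(n, l):
    """The closed formula of Proposition 6, for 4^l <= n <= 2 * 4^l."""
    return Fr(1, 4) * Fr(1, 36) ** l * (2 * 4 ** l - n + Fr(5, 9) * (n - 4 ** l))

def V_cellsum(n, l):
    """The last sum in the proof: 2*4^l - n unsplit cells each contribute
    (1/4^l)(u^2 + u^2)/8 and n - 4^l split cells each contribute
    (1/4^l)(u^2/9 + u^2)/8, where u^2 = 1/9^l."""
    u2 = Fr(1, 9) ** l
    unsplit = (2 * 4 ** l - n) * Fr(1, 4) ** l * (u2 + u2) / 8
    split = (n - 4 ** l) * Fr(1, 4) ** l * (u2 / 9 + u2) / 8
    return unsplit + split

assert all(V_closed(n, l) == V_cellsum(n, l)
           for l in (1, 2, 3) for n in range(4 ** l, 2 * 4 ** l + 1))
assert V_closed(5, 1) == Fr(2, 81)   # the value of V_5 used in Lemma 12
```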
Definition 2.
For n N with n 4 let ( n ) be the unique natural number with 2 · 4 ( n ) < n < 4 ( n ) + 1 . For I { 1 , 2 } ( n ) * 2 with card ( I ) = n 2 · 4 ( n ) let α n ( I ) be the set defined as follows:
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) .
Remark 8.
In Definition 2, instead of choosing the set { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose { ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } . Instead of choosing the set
{ ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } , one can choose either the set
{ ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ 1 ) ) , ( A ( σ 2 ) , A ( τ 2 ) ) } , or
{ ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 2 ) , A ( τ 1 ) ) , ( A ( σ ) , A ( τ 2 ) ) } , or
{ ( A ( σ ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ 2 ) ) } , i.e., the set corresponding to each ( σ , τ ) { 1 , 2 } ( n ) * 2 I can be chosen in two different ways, and the set corresponding to each ( σ , τ ) I can be chosen in four different ways. Because card ( { 1 , 2 } ( n ) * 2 I ) = 4 ( n ) ( n 2 · 4 ( n ) ) = 3 · 4 ( n ) n and the subset I can be chosen from { 1 , 2 } ( n ) * 2 in 4 ( n ) C n 2 · 4 ( n ) ways, the number of the sets α n ( I ) is 2 3 · 4 ( n ) n · 4 card ( I ) · 4 ( n ) C n 2 · 4 ( n ) .
We now give an example illustrating Definition 2.
Example 3.
Let n = 9 . Then, ( n ) = 1 , I { 1 , 2 } * 2 with card ( I ) = 1 . Take I = { ( 1 , 1 ) } . Then,
α 9 ( { ( 1 , 1 ) } ) = { ( A ( 11 ) , A ( 2 ) ) , ( A ( 12 ) , A ( 2 ) ) , ( A ( 21 ) , A ( 2 ) ) , ( A ( 22 ) , A ( 2 ) ) , ( A ( 21 ) , A ( 1 ) ) , ( A ( 22 ) , A ( 1 ) ) } { ( A ( 11 ) , A ( 1 ) ) , ( A ( 12 ) , A ( 11 ) ) , ( A ( 12 ) , A ( 12 ) ) } = { ( 1 18 , 5 6 ) , ( 5 18 , 5 6 ) , ( 13 18 , 5 6 ) , ( 17 18 , 5 6 ) , ( 13 18 , 1 6 ) , ( 17 18 , 1 6 ) } { ( 1 18 , 1 6 ) , ( 5 18 , 1 18 ) , ( 5 18 , 5 18 ) } .
Note that each of α 9 ( { ( 1 , 1 ) } ) , α 9 ( { ( 1 , 2 ) } ) , α 9 ( { ( 2 , 1 ) } ) , α 9 ( { ( 2 , 2 ) } ) can be chosen in 32 ways, i.e., the number of the sets α 9 ( I ) in this case is 4 · 32 = 128 . Moreover, by using the formula in Remark 8, we have
2 3 · 4 ( n ) n · 4 card ( I ) · 4 ( n ) C n 2 · 4 ( n ) = 128 .
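The count in Remark 8 can likewise be scripted (an illustrative sketch; the function name is ours, and the input is assumed to lie in the regime of Definition 2):

```python
from math import comb

def num_sets_def2(n):
    """Number of the sets alpha_n(I) of Definition 2, namely
    2^(3*4^l - n) * 4^(card I) * C(4^l, n - 2*4^l) with card(I) = n - 2*4^l."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    assert 2 * 4 ** l < n < 4 ** (l + 1), "n is outside the regime of Definition 2"
    k = n - 2 * 4 ** l          # k = card(I)
    return 2 ** (3 * 4 ** l - n) * 4 ** k * comb(4 ** l, k)

assert num_sets_def2(9) == 128    # as computed in Example 3
```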
Proposition 7.
Let n 4 and α n ( I ) be the set as defined in Definition 2. Then, α n ( I ) forms an optimal set of n-means with quantization error
V n = 1 36 ( n ) + 1 ( 9 · 4 ( n ) 2 n ) .
Proof. 
We have n = 2 · 4 ( n ) + k where 1 k < 2 · 4 ( n ) . Set β i j = α A i j with n i j = card ( β i j ) for 1 i , j 2 . Let us prove it by induction. We first assume k = 1 . By Lemmas 11 and 13, we can assume that each of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) is an optimal set of 2 · 4 ( n ) 1 -means and U ( 1 , 1 ) 1 ( β 11 ) is an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means. Thus, for ( i , j ) ≠ ( 1 , 1 ) , we can write
U ( i , j ) 1 ( β i j ) = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 } , and U ( 1 , 1 ) 1 ( β 11 ) = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( ( n ) 1 ) * 2 { τ } } U τ ( α 3 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 3 is an optimal set of three-means. Thus
α n ( { ( 1 , 1 ) τ } ) = i , j = 1 2 β i j = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { ( 1 , 1 ) τ } } U ( 1 , 1 ) τ ( α 3 ) ,
for some τ { 1 , 2 } ( ( n ) 1 ) * 2 , where α 3 is an optimal set of three-means. Notice that instead of choosing U ( 1 , 1 ) 1 ( β 11 ) as an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means, one can choose any one of U ( i , j ) 1 ( β i j ) for ( i , j ) ≠ ( 1 , 1 ) as an optimal set of ( 2 · 4 ( n ) 1 + 1 ) -means. Hence, for n = 2 · 4 ( n ) + 1 , one can write
α n ( I ) = i , j = 1 2 β i j = { U ( σ , τ ) ( α 2 ) : ( σ , τ ) { 1 , 2 } ( n ) * 2 { τ } } U τ ( α 3 ) ,
where I = { τ } for some τ { 1 , 2 } ( n ) * 2 as an optimal set of n-means. Thus, we see that the proposition is true if n = 2 · 4 ( n ) + 1 . Similarly, one can prove that the proposition is true for any 1 k < 2 · 4 ( n ) . Thus, writing α 2 = { ( A ( 1 ) , A ( ) ) , ( A ( 2 ) , A ( ) ) } , and α 3 = { ( A ( 1 ) , A ( 1 ) ) , ( A ( 1 ) , A ( 2 ) ) , ( A ( 2 ) , A ( ) ) } , we have, in general,
α n ( I ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I { ( A ( σ 1 ) , A ( τ ) ) , ( A ( σ 2 ) , A ( τ ) ) } ( ( σ , τ ) I { ( A ( σ 1 ) , A ( τ 1 ) ) , ( A ( σ 1 ) , A ( τ 2 ) ) , ( A ( σ 2 ) , A ( τ ) ) } ) ,
where I { 1 , 2 } ( n ) * 2 with card ( I ) = k for some 1 k < 2 · 4 ( n ) . Then, we obtain the quantization error as
V n = min ( a , b ) α n ( I ) x ( a , b ) 2 d P = ( σ , τ ) { 1 , 2 } ( n ) * 2 I i = 1 2 A σ i × A τ x ( A ( σ i ) , A ( τ ) ) 2 d ( P c × P c ) + ( σ , τ ) I ( j = 1 2 A σ 1 × A τ j x ( A ( σ 1 ) , A ( τ j ) ) 2 d ( P c × P c ) + A σ 2 × A τ x ( A ( σ 2 ) , A ( τ ) ) 2 d ( P c × P c ) ) = ( σ , τ ) { 1 , 2 } ( n ) * 2 I i = 1 2 1 4 ( n ) 1 2 ( u σ i 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) j = 1 2 1 4 ( u σ 1 2 + u τ j 2 ) 1 8 + 1 2 ( u σ 2 2 + u τ 2 ) 1 8 = ( σ , τ ) { 1 , 2 } ( n ) * 2 I 1 4 ( n ) ( 1 9 u σ 2 + u τ 2 ) 1 8 + ( σ , τ ) I 1 4 ( n ) ( u σ 2 + 5 u τ 2 ) 1 72 .
Because $\mathrm{card}(\{1,2\}^{\ell(n)*2}\setminus I) = 3\cdot 4^{\ell(n)} - n$, $\mathrm{card}(I) = n - 2\cdot 4^{\ell(n)}$, and $u_{\sigma} = u_{\tau} = \frac{1}{3^{\ell(n)}}$, upon simplification we have $V_n = \frac{1}{36^{\ell(n)+1}}\big(9\cdot 4^{\ell(n)} - 2n\big)$. Thus, the proof of the proposition is complete. □
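The simplification in the last step can be double-checked mechanically: with $u_\sigma = u_\tau = 3^{-\ell}$, each pair outside $I$ contributes $\frac{5}{36^{\ell+1}}$ and each pair in $I$ contributes $\frac{3}{36^{\ell+1}}$ to $V_n$. The following short Python check (our own verification sketch, not part of the original proof) confirms that the weighted count collapses to the stated closed form:

```python
# Algebraic check: the two sums contribute 5/36**(l+1) per pair outside I and
# 3/36**(l+1) per pair in I, so V_n * 36**(l+1) = 5*(3*4**l - n) + 3*(n - 2*4**l),
# which should simplify to the closed-form numerator 9*4**l - 2*n.
for l in range(1, 6):
    for n in range(2 * 4 ** l + 1, 4 ** (l + 1)):
        assert 5 * (3 * 4 ** l - n) + 3 * (n - 2 * 4 ** l) == 9 * 4 ** l - 2 * n
```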

6. Quantization Dimension and Quantization Coefficient for P

The techniques employed in the previous sections also yield closed formulas for the quantization error at each step. These closed formulas lend themselves to a direct calculation of the quantization dimension and the quantization coefficient of the underlying probability distribution. Hence, in this section we calculate the quantization dimension $D(P)$ of the probability distribution $P$, together with the accumulation points of the $D(P)$-dimensional quantization coefficients. By Propositions 5–7, the $n$th quantization error $V_n$ is given by
$$V_n = \begin{cases} \dfrac{1}{4}\cdot\dfrac{1}{36^{\ell(n)}}\Big(2\cdot 4^{\ell(n)} - n + \dfrac{5}{9}\big(n - 4^{\ell(n)}\big)\Big) & \text{if } 4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)},\\[2ex] \dfrac{1}{36^{\ell(n)+1}}\big(9\cdot 4^{\ell(n)} - 2n\big) & \text{if } 2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}. \end{cases} \tag{8}$$
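For readers who wish to experiment numerically, the closed formula above can be implemented directly. The following Python sketch (function names are ours) evaluates $\ell(n)$, the unique integer with $4^{\ell(n)} \le n < 4^{\ell(n)+1}$, and then $V_n$:

```python
def ell(n):
    """Return the unique l >= 1 with 4**l <= n < 4**(l + 1) (assumes n >= 4)."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    return l

def V(n):
    """The nth quantization error V_n from the closed formula above (n >= 4)."""
    l = ell(n)
    if n <= 2 * 4 ** l:
        return (2 * 4 ** l - n + (5 / 9) * (n - 4 ** l)) / (4 * 36 ** l)
    return (9 * 4 ** l - 2 * n) / 36 ** (l + 1)
```

For instance, `V(4)` evaluates to $\frac{1}{36}$ and `V(8)` to $\frac{5}{324}$, matching the two bounding values $\frac{1}{4}\cdot\frac{1}{9^{\ell}}$ and $\frac{5}{36}\cdot\frac{1}{9^{\ell}}$ at $\ell = 1$.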
Proposition 8.
The quantization dimension $D(P)$ of the probability distribution $P$ exists and equals $\frac{\log 4}{\log 3}$.
Proof. 
By (8), for $4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)}$, it follows that $V_{2\cdot 4^{\ell(n)}} \le V_n \le V_{4^{\ell(n)}}$, i.e.,
$$\frac{5}{36}\cdot\frac{1}{9^{\ell(n)}} \le V_n \le \frac{1}{4}\cdot\frac{1}{9^{\ell(n)}},$$
and so
$$\frac{2\,\ell(n)\log 4}{-\log\frac{5}{36} + \ell(n)\log 9} \le \frac{2\log n}{-\log V_n} \le \frac{2\log 2 + 2\,\ell(n)\log 4}{-\log\frac{1}{4} + \ell(n)\log 9}.$$
Thus, we deduce that
$$\lim_{n\to\infty} \frac{2\log n}{-\log V_n} = \frac{\log 4}{\log 3}.$$
Similarly, for $2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}$, we obtain the same limit. Hence,
$$D(P) = \lim_{n\to\infty} \frac{2\log n}{-\log V_n} = \frac{\log 4}{\log 3}.$$
Thus, the proof of the proposition is complete. □
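The squeeze used in this proof can also be observed numerically. The sketch below (ours, not part of the paper) evaluates the two bounding expressions from the displayed inequality and shows both approaching $\log 4/\log 3 \approx 1.26186$ as $\ell(n)$ grows:

```python
import math

target = math.log(4) / math.log(3)  # the claimed quantization dimension

for l in (5, 20, 80):
    # Bounds on 2*log(n) / (-log(V_n)) valid for 4**l <= n <= 2*4**l.
    lower = 2 * l * math.log(4) / (math.log(36 / 5) + l * math.log(9))
    upper = (2 * math.log(2) + 2 * l * math.log(4)) / (math.log(4) + l * math.log(9))
    print(f"l = {l}: {lower:.5f} <= 2 log n / (-log V_n) <= {upper:.5f}")
```

At $\ell = 80$ both bounds already agree with $\log 4/\log 3$ to about two decimal places.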
Proposition 9.
Let $\beta := D(P)$ be the quantization dimension of P. Then, the β-dimensional quantization coefficient for P does not exist, and the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$.
Proof. 
Recall the sequence of quantization errors $\{V_n\}_{n\ge 4}$ given by (8). Again, notice that $4^{1/\beta} = 3$. Along the subsequence $\{4^{\ell}\}_{\ell\in\mathbb{N}}$, we have $\lim_{\ell\to\infty} (4^{\ell})^{2/\beta} V_{4^{\ell}} = \frac{1}{4}$. Similarly, along the subsequence $\{2\cdot 4^{\ell}\}_{\ell\in\mathbb{N}}$, we have $\lim_{\ell\to\infty} (2\cdot 4^{\ell})^{2/\beta} V_{2\cdot 4^{\ell}} = \frac{5}{12}$. Consequently, $\lim_{n\to\infty} n^{2/\beta} V_n$ does not exist. Now, we determine the range in which the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie. The following two cases can arise:
Case 1. $4^{\ell(n)} \le n \le 2\cdot 4^{\ell(n)}$.
In this case, we have $V_{2\cdot 4^{\ell(n)}} \le V_n \le V_{4^{\ell(n)}}$, implying $(4^{\ell(n)})^{2/\beta} V_{2\cdot 4^{\ell(n)}} \le n^{2/\beta} V_n \le (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)}}$. Because
$$\lim_{n\to\infty} (4^{\ell(n)})^{2/\beta} V_{2\cdot 4^{\ell(n)}} = \frac{5}{36} \quad\text{and}\quad \lim_{n\to\infty} (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)}} = \frac{3}{4},$$
it follows that the accumulation points of $\{n^{2/\beta} V_n\}$ along such $n$ lie in the closed interval $\big[\frac{5}{36}, \frac{3}{4}\big]$.
Case 2. $2\cdot 4^{\ell(n)} < n < 4^{\ell(n)+1}$.
In this case, we have $V_{4^{\ell(n)+1}} < V_n < V_{2\cdot 4^{\ell(n)}}$, implying
$$(2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)+1}} < n^{2/\beta} V_n < (4^{\ell(n)+1})^{2/\beta} V_{2\cdot 4^{\ell(n)}}.$$
Because
$$\lim_{n\to\infty} (2\cdot 4^{\ell(n)})^{2/\beta} V_{4^{\ell(n)+1}} = \frac{1}{12} \quad\text{and}\quad \lim_{n\to\infty} (4^{\ell(n)+1})^{2/\beta} V_{2\cdot 4^{\ell(n)}} = \frac{5}{4},$$
it follows that the accumulation points of $\{n^{2/\beta} V_n\}$ along such $n$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$.
Combining Case 1 and Case 2, we see that

$$\frac{1}{12} \le \liminf_{n\to\infty} n^{2/\beta} V_n \le \limsup_{n\to\infty} n^{2/\beta} V_n \le \frac{5}{4},$$

which yields the fact that the accumulation points of $\{n^{2/\beta} V_n\}_{n\in\mathbb{N}}$ lie in the closed interval $\big[\frac{1}{12}, \frac{5}{4}\big]$. Thus, the proof of the proposition is complete. □
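The non-existence of the limit can also be seen numerically. The sketch below (our own check; the helper `V` implements the closed-form quantization error of Propositions 5–7) evaluates $n^{2/\beta}V_n$ along the two subsequences used in the proof and recovers the distinct limits $\frac{1}{4}$ and $\frac{5}{12}$:

```python
import math

beta = math.log(4) / math.log(3)  # quantization dimension D(P)

def V(n):
    """Closed-form nth quantization error (Propositions 5-7), n >= 4."""
    l = 1
    while 4 ** (l + 1) <= n:
        l += 1
    if n <= 2 * 4 ** l:
        return (2 * 4 ** l - n + (5 / 9) * (n - 4 ** l)) / (4 * 36 ** l)
    return (9 * 4 ** l - 2 * n) / 36 ** (l + 1)

coeff = lambda n: n ** (2 / beta) * V(n)
a = coeff(4 ** 12)       # along n = 4**l the coefficient tends to 1/4
b = coeff(2 * 4 ** 12)   # along n = 2*4**l it tends to 5/12
```

Since the two subsequential limits differ, $\lim_{n\to\infty} n^{2/\beta}V_n$ cannot exist, exactly as the proposition asserts.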

7. Discussion and Concluding Remarks

Motivation. As mentioned in the Introduction, the main motivation for this article is the completion of the programme initiated in [14]. At the same time, we extend the results of [12] to the setting of infinite affine transformations. Analogously to [10], this completes the programme of providing a complete quantization theory for affine measures on $\mathbb{R}^2$.
Observations and Remarks. Quantization of continuous random signals (or random variables and processes) is an important part of the digital representation of analog signals for various coding techniques (e.g., source coding, data compression, archiving, restoration). The oldest example of quantization in statistics is rounding off. Sheppard (see [19]) was the first to analyze rounding off for estimating densities by histograms. Any real number x can be rounded off (or quantized) to the nearest integer, say q(x) = [x], with a resulting quantization error e(x) = x − q(x). Hence, the restored signal may differ from the original one, and some information may be lost. Thus, in quantizing a continuous set of values there is always a distortion (also known as noise or error) between the original set of values and the quantized one. The main goal in quantization theory is to find a set of quantizers with minimum distortion, a problem that has been extensively investigated by numerous authors [2,20,21,22,23,24]. A different approach for uniform scalar quantization is developed in [25], where the correlation properties of a Gaussian process are exploited to evaluate the asymptotic behavior of the random quantization rate for uniform quantizers. General quantization problems for Gaussian processes in infinite-dimensional functional spaces are considered in [26]. In estimating weighted integrals of time series with no quadratic mean derivatives by means of samples at discrete times, it is known that the rate of convergence of the mean-square error is reduced from $n^{-2}$ to $n^{-1.5}$ when the samples are quantized (see [27]). For smoother time series, with $k = 1, 2, \dots$ quadratic mean derivatives, the rate of convergence is reduced from $n^{-2k-2}$ to $n^{-2}$ when the samples are quantized, which is a very significant reduction (see [28]).
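To make the rounding-off example concrete: for a source that is uniform over a wide interval, the error $e(x) = x - q(x)$ is essentially uniform on $[-\frac12, \frac12]$, so its mean-squared value is close to $\frac{1}{12}$. A quick simulation (ours, not drawn from the cited references) illustrates this:

```python
import random

random.seed(0)
# Quantize uniform samples by rounding to the nearest integer and measure the
# mean-squared quantization error; e(x) = x - round(x) is (up to ties of
# probability zero) uniform on [-1/2, 1/2], whose second moment is 1/12.
samples = [random.uniform(-10.0, 10.0) for _ in range(200_000)]
mse = sum((x - round(x)) ** 2 for x in samples) / len(samples)
print(f"empirical MSE = {mse:.5f}  (1/12 = {1 / 12:.5f})")
```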
The interplay between sampling and quantization is also studied in [28], leading asymptotically to an optimal allocation between the number of samples and the number of quantization levels. Quantization also appears to be a promising tool in recent developments in numerical probability (see, e.g., [29]).
By Proposition 1, the points in an optimal set are the centroids of their own Voronoi regions. Consequently, the points in an optimal set form an evenly spread distribution of sites in the domain with minimum distortion error with respect to a given probability measure, which is very useful in many fields, such as clustering, data compression, optimal mesh generation, cellular biology, optimal quadrature, coverage control, and geographical optimization; for more details, see [7,30]. In addition, it has applications in the energy-efficient distribution of base stations in a cellular network [31,32,33]. In both geographical and cellular applications, the distribution of users is highly complex and often modeled by a fractal [34,35].
Future Directions. $k$-means clustering is a method of vector quantization, originally from signal processing, that aims to partition $n$ observations (the underlying data set) into $k$ clusters in which each observation belongs to the cluster with the nearest mean, also known as the cluster center or cluster centroid. For a given $k$ and a given probability distribution on a data set, there can be two or more different sets of $k$-means clusters: for example, with respect to the uniform distribution, the square $\{(x_1, x_2) : |x_1| \le 1, |x_2| \le 1\}$ has four different sets of two-means clusters, with cluster centers $\{(\frac12, \frac12), (-\frac12, -\frac12)\}$, $\{(-\frac12, \frac12), (\frac12, -\frac12)\}$, $\{(\frac12, 0), (-\frac12, 0)\}$, and $\{(0, \frac12), (0, -\frac12)\}$. Among these, only $\{(\frac12, 0), (-\frac12, 0)\}$ and $\{(0, \frac12), (0, -\frac12)\}$ form two different optimal sets of two-means. In other words, for a given $k$, among the multiple sets of $k$-means clusters, the centers of a set with the smallest distortion error form an optimal set of $k$-means. Thus, it is much more difficult to calculate an optimal set of $k$-means than to calculate a set of $k$-means clusters. A good deal of work has been done on $k$-means clustering; on the other hand, there is not much work on finding optimal sets of $k$-means, and the present paper is a contribution in this direction.
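The two-means example on the square can be checked by simulation. The sketch below (our illustration; the function name and sample counts are ours) estimates the distortion of an axis-aligned pair of centers and of a diagonal pair by Monte Carlo; the axis-aligned pair attains the smaller distortion, about $\frac{5}{12} = \int_0^1 (x - \frac12)^2\,dx + \frac12\int_{-1}^{1} y^2\,dy$:

```python
import random

random.seed(1)

def distortion(centers, trials=200_000):
    """Monte Carlo distortion error of a finite set of centers with respect
    to the uniform distribution on the square [-1, 1] x [-1, 1]."""
    total = 0.0
    for _ in range(trials):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        total += min((x - a) ** 2 + (y - b) ** 2 for a, b in centers)
    return total / trials

axis_pair = [(0.5, 0.0), (-0.5, 0.0)]    # an optimal set of two-means
diag_pair = [(0.5, 0.5), (-0.5, -0.5)]   # a two-means cluster set, not optimal
d_axis, d_diag = distortion(axis_pair), distortion(diag_pair)
```

The estimate for the axis-aligned pair is close to $5/12 \approx 0.4167$ and is strictly smaller than that of the diagonal pair, consistent with the statement above that only the axis-aligned pairs are optimal.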
The probability measure $P$ considered in this study has identical marginal distributions, which is instrumental in determining the optimal sets of 2-, 3-, and 4-means accurately. Moreover, it enables us to bridge infinitely generated affine measures with finitely generated ones and, consequently, to connect optimal sets of $n$-means for $P$ and $P_c \times P_c$. It would be interesting to investigate whether similar results can be obtained when $P$ is induced by infinite probability vectors $\{p_{ij}\}$ different from the one considered in this article.

Author Contributions

The work in this paper is completely new. Both authors contributed equally to writing the draft of this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bucklew, J.A.; Wise, G.L. Multidimensional asymptotic quantization with rth power distortion measures. IEEE Trans. Inf. Theory 1982, 28, 239–247. [Google Scholar] [CrossRef]
  2. Gray, R.; Neuhoff, D. Quantization. IEEE Trans. Inf. Theory 1998, 44, 2325–2383. [Google Scholar] [CrossRef]
  3. Abaya, E.F.; Wise, G.L. Some remarks on the existence of optimal quantizers. Stat. Probab. Lett. 1984, 2, 349–351. [Google Scholar] [CrossRef]
  4. Gray, R.M.; Kieffer, J.C.; Linde, Y. Locally optimal block quantizer design. Inf. Control. 1980, 45, 178–198. [Google Scholar] [CrossRef] [Green Version]
  5. György, A.; Linder, T. On the structure of optimal entropy-constrained scalar quantizers. IEEE Trans. Inf. Theory 2002, 48, 416–427. [Google Scholar] [CrossRef]
  6. Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1730. [Google Scholar]
  7. Du, Q.; Faber, V.; Gunzburger, M. Centroidal Voronoi Tessellations: Applications and Algorithms. Siam Rev. 1999, 41, 637–676. [Google Scholar] [CrossRef] [Green Version]
  8. Roychowdhury, M.K. Quantization and centroidal Voronoi tessellations for probability measures on dyadic Cantor sets. J. Fractal Geom. 2017, 4, 127–146. [Google Scholar] [CrossRef] [Green Version]
  9. Gersho, A.; Gray, R.M. Vector Quantization and Signal Compression; Kluwer Academy Publishers: Boston, MA, USA, 1992. [Google Scholar]
  10. Graf, S.; Luschgy, H. The Quantization of the Cantor Distribution. Math. Nachr. 1997, 183, 113–133. [Google Scholar] [CrossRef]
  11. Roychowdhury, L. Optimal quantization for nonuniform Cantor distributions. J. Interdiscip. Math. 2019, 22, 1325–1348. [Google Scholar] [CrossRef]
  12. Çömez, D.; Roychowdhury, M.K. Quantization for uniform distributions of Cantor dusts on R2. Topol. Proc. 2020, 56, 195–218. [Google Scholar]
  13. Roychowdhury, M.K. Optimal quantization for the Cantor distribution generated by infinite similitudes. Isr. J. Math. 2019, 231, 437–466. [Google Scholar] [CrossRef] [Green Version]
  14. Mihailescu, E.; Roychowdhury, M.K. Quantization coefficients in infinite systems. Kyoto J. Math. 2015, 55, 857–873. [Google Scholar] [CrossRef] [Green Version]
  15. Graf, S.; Luschgy, H. The quantization dimension of self-similar probabilities. Math. Nachr. 2002, 241, 103–109. [Google Scholar] [CrossRef]
  16. Hutchinson, J. Fractals and self-similarity. Indiana Univ. Math. J. 1981, 30, 713–747. [Google Scholar] [CrossRef]
  17. Moran, M. Hausdorff measure of infinitely generated self-similar sets. Monatsh. Math. 1996, 122, 387–399. [Google Scholar] [CrossRef]
  18. Mauldin, D.; Urbański, M. Dimensions and measures in infinite iterated function systems. Proc. Lond. Math. Soc. 1996, 73, 105–154. [Google Scholar] [CrossRef]
  19. Sheppard, W.F. On the calculation of the most probable values of frequency constants for data arranged according to equidistant divisions of a scale. Proc. Lond. Math. Soc. 1897, 1, 353–380. [Google Scholar] [CrossRef]
  20. Cambanis, S.; Gerr, N. A simple class of asymptotically optimal quantizers. IEEE Trans. Inf. Theory 1983, 29, 664–676. [Google Scholar] [CrossRef]
  21. Gray, R.M.; Linder, T. Mismatch in high rate entropy constrained vector quantization. IEEE Trans. Inf. Theory 2003, 49, 1204–1217. [Google Scholar] [CrossRef]
  22. Li, J.; Chaddha, N.; Gray, R.M. Asymptotic performance of vector quantizers with a perceptual distortion measure. IEEE Trans. Inf. Theory 1999, 45, 1082–1091. [Google Scholar]
  23. Shykula, M.; Seleznjev, O. Stochastic structure of asymptotic quantization errors. Stat. Probab. Lett. 2006, 76, 453–464. [Google Scholar] [CrossRef]
  24. Zador, P.L. Asymptotic quantization error of continuous signals and the quantization dimensions. IEEE Trans. Inf. Theory 1982, 28, 139–148. [Google Scholar] [CrossRef]
  25. Shykula, M.; Seleznjev, O. Uniform Quantization of Random Processes; Univ. Umeå Research Report; Umeå University: Umeå, Sweden, 2004; pp. 1–16. [Google Scholar]
  26. Luschgy, H.; Pagès, G. Functional quantization of Gaussian processes. J. Funct. Anal. 2002, 196, 486–531. [Google Scholar] [CrossRef] [Green Version]
  27. Bucklew, J.A.; Cambanis, S. Estimating random integrals from noisy observations: Sampling designs and their performance. IEEE Trans. Inf. Theory 1988, 34, 111–127. [Google Scholar] [CrossRef]
  28. Benhenni, K.; Cambanis, S. The effect of quantization on the performance of sampling designs. IEEE Trans. Inf. Theory 1998, 44, 1981–1992. [Google Scholar] [CrossRef]
  29. Pagès, G.; Pham, H.; Printemps, J. Optimal quantization methods and applications to numerical problems in finance. In Handbook of Computational and Numerical Methods in Finance; Rachev, S., Ed.; Birkhäuser Boston: Boston, MA, USA, 2004; pp. 253–297. [Google Scholar]
  30. Okabe, A.; Boots, B.; Sugihara, K.; Chiu, S.N. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, 2nd ed.; Wiley: Hoboken, NJ, USA, 2000. [Google Scholar]
  31. Hao, Y.; Chen, M.; Hu, L.; Song, J.; Volk, M.; Humar, I. Wireless Fractal Ultra-Dense Cellular Networks. Sensors 2017, 17, 841. [Google Scholar] [CrossRef]
  32. Kaza, K.R.; Kshirsagar, K.; Rajan, K.S. A bi-objective algorithm for dynamic reconfiguration of mobile networks. In Proceedings of the IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 5741–5745. [Google Scholar]
  33. Song, Y. Cost-Effective Algorithms for Deployment and Sensing in Mobile Sensor Networks. Ph.D. Thesis, University of Connecticut, Storrs, CT, USA, 2014. [Google Scholar]
  34. Abundo, C.; Bodnar, T.; Driscoll, J.; Hatton, I.; Wright, J. City population dynamics and fractal transport networks. In Proceedings of the Santa Fe Institute’s CSSS2013; Santa Fe Institute: Santa Fe, NM, USA, 2013. [Google Scholar]
  35. Lu, Z.; Zhang, H.; Southworth, F.; Crittenden, J. Fractal dimensions of metropolitan area road networks and the impacts on the urban built environment. Ecol. Indic. 2016, 70, 285–296. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Basic rectangles of the infinite affine transformations.
Figure 2. Optimal sets of two-means.
Figure 3. Optimal sets of three-means.
Figure 4. Optimal sets of n-means for 4 ≤ n ≤ 7. The optimal set of 4-means is unique; on the other hand, optimal sets of n-means for n = 5, 6, 7 are not unique.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

