Edge-Preserving Denoising of Image Sequences

Yi, Fan; Qiu, Peihua

doi:10.3390/e23101332


by Fan Yi * and Peihua Qiu

Department of Biostatistics, University of Florida, Gainesville, FL 32603, USA

* Author to whom correspondence should be addressed.

Entropy 2021, 23(10), 1332; https://doi.org/10.3390/e23101332

Submission received: 2 September 2021 / Revised: 7 October 2021 / Accepted: 7 October 2021 / Published: 12 October 2021

(This article belongs to the Special Issue Big Data Analytics and Information Science for Business and Biomedical Applications II)


Abstract:

To monitor the Earth’s surface, the satellite of the NASA Landsat program provides us with image sequences of any region on the Earth constantly over time. These image sequences give us a unique resource to study the Earth’s surface, changes in the Earth’s resources over time, and their implications in agriculture, geology, forestry, and more. Besides natural sciences, image sequences are also commonly used in functional magnetic resonance imaging (fMRI) in medical studies for understanding the functioning of brains and other organs. In practice, observed images almost always contain noise and other contaminations. For a reliable subsequent image analysis, it is important to remove such contaminations in advance. This paper focuses on image sequence denoising, which has not been well-discussed in the literature yet. To this end, an edge-preserving image denoising procedure is suggested. The suggested method is based on a jump-preserving local smoothing procedure, in which the bandwidths are chosen such that the possible spatio-temporal correlations in the observed image intensities are accommodated properly. Both theoretical arguments and numerical studies show that this method works well in the various cases considered.

Keywords:

bandwidth selection; correlation; edge-preserving image denoising; image sequence; jump regression analysis; local smoothing; nonparametric regression; spatio-temporal data

1. Introduction

The Landsat project, led by the US Geological Survey (USGS) and NASA, has launched eight satellites since 1972 to continuously provide scientifically valuable images of the Earth’s surface. These images can be freely accessed by researchers around the world (cf., Zanter [1]). This rich archive of Landsat images has become a major resource for scientific research about the Earth’s surface and its resources in different scientific disciplines, including forest science, climate science, agriculture, ecology, fire science, and many more. As an example, Figure 1 shows two images of the Las Vegas area in Nevada taken in 1984 and 2007, respectively. These two images clearly show the increasing urban sprawl of Las Vegas during the 23-year period, and consequently, the environment in that region has changed dramatically. The current satellite (i.e., the Landsat 8) can deliver an image of a given region roughly every 16 days. So, we have a sequence of images of that region collected sequentially over time, stored in the Landsat database, which is increasing all the time. Image sequences are commonly used in many other applications, including functional magnetic resonance imaging (fMRI) in neuroscience and quality control in manufacturing industries (Qiu [2]). In practice, observed images usually contain noise and other contaminations (Gonzalez and Woods [3]). For reliable subsequent image analyses, such contaminations should be removed in advance. In the image processing literature, the removal of noise from an observed image is referred to as image denoising. This paper focuses on image denoising for analyzing observed image sequences.

In the literature, there has been extensive discussion on image denoising (Qiu [4]). Many early methods in the computer science literature are based on the Markov random field (MRF) framework, in which observed image intensities of an image are assumed to have the Markov property that the observed intensity at a given pixel depends only on the observed intensities in a neighborhood of the given pixel (Geman and Geman [5]). Then, if the true image is assumed to have a prior distribution which is also an MRF, its posterior distribution would be an MRF too, and consequently, the true image can be estimated by the maximum a posteriori (MAP) estimator (e.g., Geman and Geman [5], Besag [6], Fessler et al. [7]). Other popular image denoising methods include those based on diffusion equations (e.g., Perona and Malik [8], Weickert [9]), total variation (Beck and Teboulle [10], Rudin et al. [11], Yuan et al. [12]), wavelet transformations (e.g., Chang et al. [13], Mrázek [14]), jump regression analysis (e.g., Gijbels et al. [15], Qiu [16], Qiu [17], Qiu and Mukherjee [18]), adaptive weights smoothing (e.g., Polzehl and Spokoiny [19]), spatial adaption (e.g., Kervrann and Boulanger [20]) and more. Besides noise removal, edge-preserving is important for image denoising because edges are important structures of the images. Some of the methods mentioned above can preserve edges well, such as the ones based on jump regression analysis, total variation, and wavelet transformations. Thorough surveys of popular edge-preserving image denoising methods can be found in Jain and Tyagi [21] and Qiu [4].

Although there are already some existing methods for edge-preserving image denoising, almost all of them handle observed images taken at a single time point. So far, we have not found much discussion about denoising image sequences, which is the focus of the current paper. A given image sequence often describes a gradual change in appearance over time, subject to the underlying process. For instance, the sequence of images of the Las Vegas area acquired by the Landsat satellite (cf., Figure 1) describes the gradual change of the Earth’s surface in that area over time. As mentioned above, two consecutive images in the sequence acquired by the current Landsat satellite are only about 16 days apart. So, their difference should be very small. However, the images could be substantially different after a long period of time, as shown in Figure 1. In such applications, it should be reasonable to assume that edge locations in different images either do not change or change gradually over time. To handle such image sequences, the neighboring images should be useful when denoising the image at a given time point; that is, information in neighboring images should be shared during image denoising. Based on these features of image sequences, we propose an edge-preserving image denoising procedure for analyzing image sequences in this paper. Our proposed method is based on the jump regression analysis (JRA) used for regression modeling when the underlying regression function has jumps or other singularities (Qiu [22]). It is a local smoothing procedure, and the possible spatio-temporal correlation in the observed image data has been accommodated properly in its construction. Both theoretical arguments and numerical studies show that this method works well in a variety of cases.

The remaining parts of the article are organized as follows. The proposed method is described in detail in Section 2. Its statistical properties and the numerical studies about its performance in different finite-sample cases are presented in Section 3. Several concluding remarks are provided in Section 4. Some technical details are given in Appendix A.

2. Materials and Methods

This section describes our proposed method in two parts. A JRA model for describing an image sequence and the model estimation are discussed in Section 2.1. Selection of several parameters used in model estimation is discussed in Section 2.2.

2.1. JRA Model and Its Estimation

To describe an image sequence, let us consider the following JRA model:

$$Z_{ijk} = f(x_i, y_j; t_k) + \varepsilon_{ijk}, \quad i = 1, 2, \dots, n_x, \; j = 1, 2, \dots, n_y, \; k = 1, 2, \dots, n_t, \tag{1}$$

where $Z_{ijk}$ is the observed image intensity level at the $(i, j)$-th pixel $(x_i, y_j)$ and at the $k$-th time point $t_k$, $f(x_i, y_j; t_k)$ is the true image intensity level, and $\varepsilon_{ijk}$ is the pointwise random noise with mean 0 and variance $\sigma^2$. In model (1), spatio-temporal data correlation is allowed, namely, $\{\varepsilon_{ijk}\}$ could be correlated over $i$, $j$ and $k$. For image data, the pixel locations are usually regularly spaced. Without loss of generality, it is assumed that they are equally spaced in the design space $\Omega = [0, 1] \times [0, 1]$, namely, $(x_i, y_j) = (i / n_x, j / n_y)$, for all $i$ and $j$, where $n_x$ and $n_y$ are the numbers of rows and columns, respectively. The observation times $\{t_k, k = 1, 2, \dots, n_t\}$ are also assumed to be equally spaced in the time interval $[0, 1]$. The true image intensity function $f(x, y; t)$, for $(x, y) \in \Omega$, is continuous in the design space $\Omega$ at each $t \in [0, 1]$, except on the edges where it has jumps.

To estimate the unknown image intensity function $f(x, y; t)$ in model (1), we consider using a local smoothing method, instead of a global smoothing method (e.g., the smoothing spline method), because of the large amount of data involved in the current problem. Moreover, it has been well-discussed in the JRA literature that conventional smoothing methods (e.g., conventional local kernel smoothing methods) would not work well for estimating models like (1) where the true image intensity function $f(x, y; t)$ has jumps at the edges, because the jumps would be blurred by such conventional methods (cf., Qiu [22]). In this paper, we suggest a jump-preserving local smoothing method for estimating (1), described in detail below. For a given point $(x, y; t) \in \Omega \times [0, 1]$, define a local neighborhood

$$O(x, y; t) = \Big\{ (x', y'; t') : (x', y'; t') \in \Omega \times [0, 1], \ \sqrt{\frac{(x' - x)^2}{h_x^2} + \frac{(y' - y)^2}{h_y^2}} \leq 1, \ |t' - t| / h_t \leq 1 \Big\},$$

where $h_x$, $h_y$ and $h_t$ are the bandwidths in the $x$-, $y$-, and $t$-axis, respectively. In $O(x, y; t)$, we first consider the following local linear kernel (LLK) smoothing procedure (Fan and Gijbels [23]):

$$\min_{a, b, c, d} \sum_{i = 1}^{n_x} \sum_{j = 1}^{n_y} \sum_{k = 1}^{n_t} \big\{ Z_{ijk} - [a + b (x_i - x) + c (y_j - y) + d (t_k - t)] \big\}^2 K\Big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\Big) K\Big(\frac{t_k - t}{h_t}\Big), \tag{2}$$

where $K(v)$ is a density kernel function with the support $\{v : |v| \leq 1\}$. The solutions to $(a, b, c, d)$ of the minimization problem (2) are denoted as $\hat{a}(x, y; t)$, $\hat{b}(x, y; t)$, $\hat{c}(x, y; t)$, and $\hat{d}(x, y; t)$, respectively. It can be checked that they have the following expressions:

$$\begin{bmatrix} \hat{a}(x, y; t) \\ \hat{b}(x, y; t) \\ \hat{c}(x, y; t) \\ \hat{d}(x, y; t) \end{bmatrix} = \begin{bmatrix} m_{000} & m_{100} & m_{010} & m_{001} \\ m_{100} & m_{200} & m_{110} & m_{101} \\ m_{010} & m_{110} & m_{020} & m_{011} \\ m_{001} & m_{101} & m_{011} & m_{002} \end{bmatrix}^{-1} \begin{bmatrix} \sum_{ijk} Z_{ijk} K_{ijk} \\ \sum_{ijk} (x_i - x) Z_{ijk} K_{ijk} \\ \sum_{ijk} (y_j - y) Z_{ijk} K_{ijk} \\ \sum_{ijk} (t_k - t) Z_{ijk} K_{ijk} \end{bmatrix}, \tag{3}$$

where $\sum_{ijk}$ denotes $\sum_{i = 1}^{n_x} \sum_{j = 1}^{n_y} \sum_{k = 1}^{n_t}$, $K_{ijk}$ denotes $K\big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\big) K\big(\frac{t_k - t}{h_t}\big)$, and $m_{rsl} = \sum_{ijk} (x_i - x)^r (y_j - y)^s (t_k - t)^l K_{ijk}$, for $r, s, l = 0, 1, 2$. The LLK estimator of $f(x, y; t)$ is defined to be $\hat{a}(x, y; t)$. The estimated gradient direction of $f(x, y; t)$ at $(x, y; t)$ is $\hat{G}(x, y; t) = (\hat{b}(x, y; t), \hat{c}(x, y; t), \hat{d}(x, y; t))'$, which indicates the direction in which the estimated plane in $O(x, y; t)$ by the LLK procedure (2) increases the fastest. If there is an edge surface in $O(x, y; t)$, then $\hat{G}(x, y; t)$ would be (approximately) orthogonal to that surface.
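The closed form (3) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the kernels $K(u, v) = \max(0, 1 - u^2 - v^2)$ and $K(w) = \max(0, 1 - w^2)$, the spacing $t_k = k / n_t$, and the function names `kernel_weights` and `llk_estimate` are all assumptions made for this example. Normalizing constants of the kernels are omitted because they cancel in (3).

```python
import numpy as np

def kernel_weights(dx, dy, dt, hx, hy, ht):
    # Assumed kernels (not from the paper): K(u, v) = max(0, 1 - u^2 - v^2)
    # and K(w) = max(0, 1 - w^2). Constant factors cancel in (3).
    k_space = np.clip(1.0 - (dx / hx) ** 2 - (dy / hy) ** 2, 0.0, None)
    k_time = np.clip(1.0 - (dt / ht) ** 2, 0.0, None)
    return k_space * k_time

def llk_estimate(Z, x, y, t, hx, hy, ht):
    """Solve the 4x4 system (3) at (x, y; t); returns (a_hat, b_hat, c_hat, d_hat)."""
    nx, ny, nt = Z.shape
    # Design points x_i = i / n_x and y_j = j / n_y; t_k = k / n_t is assumed.
    xs = np.arange(1, nx + 1) / nx
    ys = np.arange(1, ny + 1) / ny
    ts = np.arange(1, nt + 1) / nt
    X, Y, T = np.meshgrid(xs, ys, ts, indexing="ij")
    dx, dy, dt = X - x, Y - y, T - t
    K = kernel_weights(dx, dy, dt, hx, hy, ht)
    # Local linear basis: 1, (x_i - x), (y_j - y), (t_k - t).
    basis = [np.ones_like(dx), dx, dy, dt]
    # Moment matrix with entries m_{rsl} and the right-hand side of (3).
    M = np.array([[np.sum(p * q * K) for q in basis] for p in basis])
    rhs = np.array([np.sum(p * Z * K) for p in basis])
    return np.linalg.solve(M, rhs)
```

When the data lie exactly on a plane, the fit recovers the plane's value and slopes at the chosen point, which gives a quick sanity check of the moment matrix layout.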

In cases when there are no edges in the neighborhood $O(x, y; t)$, $\hat{a}(x, y; t)$ would be a good estimate of $f(x, y; t)$. Otherwise, it cannot be a good estimate: because $\hat{a}(x, y; t)$ is a weighted average of all observed image intensities in $O(x, y; t)$, the jumps in the image intensity surface would be smoothed out in the weighted average, and the estimate $\hat{a}(x, y; t)$ would be biased for estimating $f(x, y; t)$. To overcome that limitation, we consider the following one-sided smoothing idea. Let $O(x, y; t)$ be divided into two parts $O^{(1)}(x, y; t)$ and $O^{(2)}(x, y; t)$ by a plane that passes through $(x, y; t)$ and is perpendicular to $\hat{G}(x, y; t)$. See Figure 2 for an example.

Then, in cases when there is an edge surface in $O(x, y; t)$, that plane would be (approximately) parallel to the edge surface. Consequently, at least one of $O^{(1)}(x, y; t)$ and $O^{(2)}(x, y; t)$ would be (mostly) located on a single side of the edge surface in such cases. Now, let us consider the following one-sided LLK smoothing procedure: for $l = 1, 2$,

$$\min_{a, b, c, d} \sum_{(x_i, y_j; t_k) \in O^{(l)}(x, y; t)} \big\{ Z_{ijk} - [a + b (x_i - x) + c (y_j - y) + d (t_k - t)] \big\}^2 K\Big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\Big) K\Big(\frac{t_k - t}{h_t}\Big). \tag{4}$$

The solutions of (4) to $(a, b, c, d)$ are denoted as $(\hat{a}^{(l)}(x, y; t), \hat{b}^{(l)}(x, y; t), \hat{c}^{(l)}(x, y; t), \hat{d}^{(l)}(x, y; t))$, for $l = 1, 2$. Intuitively, when there are no edges in $O(x, y; t)$, $\hat{a}(x, y; t)$, $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$ are all consistent estimates of $f(x, y; t)$ under some regularity conditions. In such cases, $\hat{a}(x, y; t)$ would be preferred since it averages more observations and consequently would have a smaller variance. When there are edges in $O(x, y; t)$, $\hat{a}(x, y; t)$ would not be a good estimate of $f(x, y; t)$ as explained above, but one of $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$ should estimate $f(x, y; t)$ well. Therefore, in all cases, at least one of the three estimators $\hat{a}(x, y; t)$, $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$ should estimate $f(x, y; t)$ well.

Next, we need to choose a good estimator from $\hat{a}(x, y; t)$, $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$ based on the observed data, which is not straightforward, partly because we do not know in advance whether there are edges in the neighborhood $O(x, y; t)$ and whether the edges are mostly contained in $O^{(1)}(x, y; t)$ or $O^{(2)}(x, y; t)$ if the answer to the first question is positive. To overcome this difficulty, let us consider the following weighted residual mean squares (WRMS) of the fitted local plane by the LLK procedure (2):

$$e(x, y; t) = \frac{\sum_{ijk} \big[ Z_{ijk} - \hat{a}(x, y; t) - \hat{b}(x, y; t) (x_i - x) - \hat{c}(x, y; t) (y_j - y) - \hat{d}(x, y; t) (t_k - t) \big]^2 K_{ijk}}{\sum_{ijk} K_{ijk}}. \tag{5}$$

The above WRMS measures how well the fitted local plane describes the observed data in $O(x, y; t)$. If there are edges in $O(x, y; t)$, this quantity would be relatively large, due mainly to the jumps in the image intensity surface. Otherwise, it would be relatively small. So, the quantity $e(x, y; t)$ contains useful information about the existence of edges in $O(x, y; t)$. Similarly, we can define WRMS values for the two one-sided local planes fitted in $O^{(1)}(x, y; t)$ and $O^{(2)}(x, y; t)$. They are denoted as $e^{(1)}(x, y; t)$ and $e^{(2)}(x, y; t)$. Based on these WRMS values, we define our edge-preserving estimator of $f(x, y; t)$ to be

$$\begin{aligned} \hat{f}(x, y; t) = {} & \hat{a}(x, y; t) \, I(D(x, y; t) \leq u) \\ & + \hat{a}^{(1)}(x, y; t) \, I(D(x, y; t) > u) \, I(e^{(1)}(x, y; t) < e^{(2)}(x, y; t)) \\ & + \hat{a}^{(2)}(x, y; t) \, I(D(x, y; t) > u) \, I(e^{(1)}(x, y; t) > e^{(2)}(x, y; t)) \\ & + \frac{\hat{a}^{(1)}(x, y; t) + \hat{a}^{(2)}(x, y; t)}{2} \, I(D(x, y; t) > u) \, I(e^{(1)}(x, y; t) = e^{(2)}(x, y; t)), \end{aligned} \tag{6}$$

where $D(x, y; t) = \max\big(e(x, y; t) - e^{(1)}(x, y; t), \, e(x, y; t) - e^{(2)}(x, y; t)\big)$, $I(\cdot)$ is the indicator function, and $u > 0$ is a threshold parameter. By (6), it is obvious that $\hat{f}(x, y; t)$ is defined to be one of $\hat{a}(x, y; t)$, $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$. The quantity $\hat{a}(x, y; t)$, which is obtained from the entire neighborhood $O(x, y; t)$, is chosen if the observed data indicate no edges in $O(x, y; t)$, supported by the event $D(x, y; t) \leq u$. Otherwise, one of the two one-sided quantities, $\hat{a}^{(1)}(x, y; t)$ and $\hat{a}^{(2)}(x, y; t)$, with a smaller WRMS value is chosen. Although, theoretically, the event $(e^{(1)}(x, y; t) = e^{(2)}(x, y; t))$ would have probability zero of happening, the last term on the right-hand side of (6) is still included for completeness of the definition of $\hat{f}(x, y; t)$ and for the consideration that $e^{(1)}(x, y; t)$ and $e^{(2)}(x, y; t)$ could be considered the same in certain algorithms when their values are close.
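The steps (2)–(6) at a single point can be put together in a short sketch. It is illustrative only and rests on several assumptions that are not in the text: the kernels are the unnormalized $\max(0, 1 - u^2 - v^2)$ and $\max(0, 1 - w^2)$, the time points are $t_k = k / n_t$, $O^{(1)}$ is labelled as the half-neighborhood on the side the estimated gradient points to, `tol` is a hypothetical tolerance for treating $e^{(1)}$ and $e^{(2)}$ as equal, and a fallback to $\hat{a}$ is added when one half has too few points to fit a plane.

```python
import numpy as np

def plane_fit(D, z, w):
    """Weighted least-squares plane fit; returns (coefficients, WRMS as in (5))."""
    A = np.column_stack([np.ones(len(z)), D])
    AtW = A.T * w
    coef = np.linalg.solve(AtW @ A, AtW @ z)
    resid = z - A @ coef
    return coef, np.sum(w * resid ** 2) / np.sum(w)

def edge_preserving_estimate(Z, x, y, t, hx, hy, ht, u, tol=1e-12):
    """Pointwise estimate following rule (6); tol is a hypothetical tie tolerance."""
    nx, ny, nt = Z.shape
    xs, ys = np.arange(1, nx + 1) / nx, np.arange(1, ny + 1) / ny
    ts = np.arange(1, nt + 1) / nt  # assumed spacing t_k = k / n_t
    X, Y, T = np.meshgrid(xs, ys, ts, indexing="ij")
    dx, dy, dt = X - x, Y - y, T - t
    # Assumed (unnormalized) kernels supported on the neighborhood O(x, y; t).
    K = (np.clip(1 - (dx / hx) ** 2 - (dy / hy) ** 2, 0, None)
         * np.clip(1 - (dt / ht) ** 2, 0, None))
    inside = K > 0
    D = np.column_stack([dx[inside], dy[inside], dt[inside]])
    z, w = Z[inside], K[inside]

    # Full-neighborhood fit (2) and its WRMS (5).
    coef, e = plane_fit(D, z, w)
    a_hat, grad = coef[0], coef[1:]

    # Split by the plane through (x, y; t) perpendicular to the gradient.
    side1 = D @ grad >= 0  # assumed labelling of O^(1) versus O^(2)
    if side1.sum() < 4 or (~side1).sum() < 4:
        return a_hat  # safeguard added for this sketch only
    c1, e1 = plane_fit(D[side1], z[side1], w[side1])
    c2, e2 = plane_fit(D[~side1], z[~side1], w[~side1])

    # Selection rule (6).
    if max(e - e1, e - e2) <= u:
        return a_hat
    if abs(e1 - e2) <= tol:
        return 0.5 * (c1[0] + c2[0])
    return c1[0] if e1 < e2 else c2[0]
```

On a noiseless step image with the edge at $x = 0.5$, evaluating at a point just on the low side of the edge with a small $u$ returns the one-sided estimate from the flat side, whereas a very large $u$ forces the conventional LLK estimate, which is pulled toward the other side of the jump.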

2.2. Parameter Selection

In our proposed method described in Section 2.1, there are four parameters, $h_x$, $h_y$, $h_t$ and $u$, that need to be chosen properly in advance. For that purpose, it is natural to consider the cross validation (CV) procedure, especially in the current research problem where the observed data are quite large in size. However, it has been well-demonstrated in the literature that the conventional CV procedure would not work well in cases when the observed data are autocorrelated, because it cannot effectively distinguish the data correlation structure from the mean structure (cf., Altman [24], Opsomer et al. [25]). In the current problem, spatio-temporal data correlation is possible in almost all applications. Thus, the conventional CV procedure is not suitable in such cases. In the univariate regression setup, Brabanter et al. [26] suggested a modified CV procedure for choosing smoothing parameters in cases with correlated data. This procedure is generalized here for choosing the parameters $h_x$, $h_y$, $h_t$ and $u$ used in the proposed method, as described below. Let the modified CV score for choosing $h_x$, $h_y$, $h_t$ and $u$ be defined as

$$CV(h_x, h_y, h_t, u) = \frac{1}{n_x n_y n_t} \sum_{ijk} \big[ \hat{f}_{-(ijk)}(x_i, y_j; t_k) - Z_{ijk} \big]^2, \tag{7}$$

where $\hat{f}_{-(ijk)}(x_i, y_j; t_k)$ is the leave-one-out estimate of $f(x_i, y_j; t_k)$ by (2)–(6) after the observation $Z_{ijk}$ is removed from the estimation process and after the kernel function is replaced by the so-called $\epsilon$-optimal bimodal kernel function $K_{\epsilon}(v)$ defined to be

$$K_{\epsilon}(v) = \frac{4}{4 - 3\epsilon - \epsilon^3} \times \begin{cases} \frac{3}{4} (1 - v^2) \, I(|v| \leq 1), & \text{if } |v| \geq \epsilon, \\ \frac{3 (1 - \epsilon^2)}{4 \epsilon} |v|, & \text{if } |v| < \epsilon, \end{cases} \tag{8}$$

where $0 < \epsilon < 1$ is a parameter. Based on a large simulation study, Brabanter et al. [26] suggested choosing $\epsilon$ to be 0.1, which is adopted in this paper. Then, by the above modified CV procedure, (7) and (8), the parameters $h_x$, $h_y$, $h_t$ and $u$ can be chosen by minimizing the modified CV score $CV(h_x, h_y, h_t, u)$.
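As a quick check of (8), the univariate kernel can be coded directly. The sketch below, with the hypothetical function name `bimodal_kernel`, shows two features that follow from the formula itself: the kernel vanishes at $v = 0$, so an observation receives no weight at its own location, and the factor $4 / (4 - 3\epsilon - \epsilon^3)$ makes the kernel integrate to one over $[-1, 1]$.

```python
import numpy as np

def bimodal_kernel(v, eps=0.1):
    """Epsilon-optimal bimodal kernel of (8), evaluated elementwise."""
    v = np.abs(np.asarray(v, dtype=float))
    norm = 4.0 / (4.0 - 3.0 * eps - eps ** 3)
    # Epanechnikov-shaped part, used for eps <= |v| <= 1 and zero beyond 1.
    outer = 0.75 * (1.0 - v ** 2) * (v <= 1.0)
    # Linear part that pulls the kernel down to zero at v = 0.
    inner = 3.0 * (1.0 - eps ** 2) / (4.0 * eps) * v
    return norm * np.where(v >= eps, outer, inner)

# Numerical check of the normalization with a simple trapezoidal rule.
grid = np.linspace(-1.0, 1.0, 200001)
vals = bimodal_kernel(grid)
total_mass = np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(grid))
```

Here `total_mass` should be numerically close to 1 for any $0 < \epsilon < 1$; the default `eps=0.1` matches the value adopted in the text.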

3. Results

3.1. Statistical Properties

In this part, we discuss some statistical properties of the proposed edge-preserving image sequence denoising method (2)–(6). First, we have the following proposition.

Proposition 1.

Assume that (i) the kernel function $K(v)$ used in (2) is a Lipschitz-1 continuous density function, (ii) the noise terms $\{\varepsilon_{ijk}, i = 1, 2, \dots, n_x, j = 1, 2, \dots, n_y, k = 1, 2, \dots, n_t\}$ in model (1) form a strong mixing stochastic process with the following strong mixing coefficients:

$$\alpha(d) = \sup_{(ijk), (i'j'k')} \sup_{A, B} \big\{ |P(A \cap B) - P(A) P(B)| : A \in \sigma(\varepsilon_{ijk}), \ B \in \sigma(\varepsilon_{i'j'k'}), \ \max\{|i - i'|, |j - j'|, |k - k'|\} > d \big\},$$

which have the property that $\alpha(d) \leq c_1 \sigma^2 \rho^{c_2 d}$, where $c_1, c_2 > 0$ and $0 < \rho < 1$ are constants, and (iii) $E(\varepsilon_{111}^6) < \infty$. Let $N = n_x n_y n_t$, $H = h_x h_y h_t$, $n_{min} = \min(n_x, n_y, n_t)$, and $h_{min} = \min(h_x, h_y, h_t)$. Then, for any $(x, y; t) \in \Omega_h = [h_x, 1 - h_x] \times [h_y, 1 - h_y] \times [h_t, 1 - h_t]$, we have

$$\Big| \frac{1}{N H} \sum_{ijk} K\Big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\Big) K\Big(\frac{t_k - t}{h_t}\Big) - 1 \Big| = O\Big(\frac{1}{n_{min} h_{min}}\Big),$$

$$E\Big[ \Big| \frac{1}{N H} \sum_{ijk} \varepsilon_{ijk} K\Big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\Big) K\Big(\frac{t_k - t}{h_t}\Big) \Big|^2 \Big] = O\Big(\frac{1}{N H}\Big),$$

$$E\Big[ \Big| \frac{1}{N H} \sum_{ijk} (\varepsilon_{ijk}^2 - \sigma^2) K\Big(\frac{x_i - x}{h_x}, \frac{y_j - y}{h_y}\Big) K\Big(\frac{t_k - t}{h_t}\Big) \Big|^2 \Big] = O\Big(\frac{1}{N H}\Big).$$

Based on the results in Proposition 1, we can derive the following properties of the LLK estimates defined in (3).

Theorem 1.

Besides the conditions in Proposition 1, we further assume that the true image intensity function $f(x, y; t)$ has continuous first-order partial derivatives with respect to $x$, $y$ and $t$ in the design space $\Omega$ except at the edge curves. Then, for any $(x, y; t) \in \Omega_h \setminus J_h$, we have

$$\begin{bmatrix} \hat{a}(x, y; t) \\ \hat{b}(x, y; t) \\ \hat{c}(x, y; t) \\ \hat{d}(x, y; t) \end{bmatrix} = \begin{bmatrix} f(x, y; t) \\ f'_x(x, y; t) \\ f'_y(x, y; t) \\ f'_t(x, y; t) \end{bmatrix} + \begin{bmatrix} O(h_x^2 + h_y^2 + h_t^2) \\ O\big(\frac{h_x^2 + h_y^2 + h_t^2}{h_x}\big) \\ O\big(\frac{h_x^2 + h_y^2 + h_t^2}{h_y}\big) \\ O\big(\frac{h_x^2 + h_y^2 + h_t^2}{h_t}\big) \end{bmatrix} + \begin{bmatrix} O_p\big(\frac{1}{\sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_x \sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_y \sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_t \sqrt{N H}}\big) \end{bmatrix}.$$

For any $(x, y; t) \in J_h \setminus S_h$, we have

$$\begin{bmatrix} \hat{a}(x, y; t) \\ \hat{b}(x, y; t) \\ \hat{c}(x, y; t) \\ \hat{d}(x, y; t) \end{bmatrix} = \begin{bmatrix} f_{-}(x_\tau, y_\tau; t_\tau) + d_\tau \xi_{000}^{(2)} \\ \frac{d_\tau}{\xi_{200} h_x} \xi_{100}^{(2)} \\ \frac{d_\tau}{\xi_{020} h_y} \xi_{010}^{(2)} \\ \frac{d_\tau}{\xi_{002} h_t} \xi_{001}^{(2)} \end{bmatrix} + \begin{bmatrix} O\big(\sqrt{h_x^2 + h_y^2 + h_t^2}\big) \\ O\big(\frac{\sqrt{h_x^2 + h_y^2 + h_t^2}}{h_x}\big) \\ O\big(\frac{\sqrt{h_x^2 + h_y^2 + h_t^2}}{h_y}\big) \\ O\big(\frac{\sqrt{h_x^2 + h_y^2 + h_t^2}}{h_t}\big) \end{bmatrix} + \begin{bmatrix} O_p\big(\frac{1}{\sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_x \sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_y \sqrt{N H}}\big) \\ O_p\big(\frac{1}{h_t \sqrt{N H}}\big) \end{bmatrix}, \tag{9}$$

where $\xi_{rsl} = \int_{\Omega \times [0, 1]} u^r v^s w^l K(u, v) K(w) \, du \, dv \, dw$, $\xi_{rsl}^{(2)} = \int_{Q^{(2)}} u^r v^s w^l K(u, v) K(w) \, du \, dv \, dw$, for $r, s, l = 0, 1, 2$, $J$ is the closure of the set of all jump points of $f(x, y; t)$, $J_h = \{(x, y; t) : (x, y; t) \in \Omega_h, \sqrt{(x - x^*)^2 / h_x^2 + (y - y^*)^2 / h_y^2} \leq 1, |t - t^*| / h_t \leq 1, \text{ for any } (x^*, y^*, t^*) \in J\}$, $S$ is the set of singular points in $J$, including the crossing points of two or more edges, points on an edge surface at which the edge surface does not have a unique tangent surface, and points in $J$ at which the jump sizes in $f(x, y; t)$ are zero, $S_h = \{(x, y; t) : (x, y; t) \in \Omega_h, \sqrt{(x - x^*)^2 / h_x^2 + (y - y^*)^2 / h_y^2} \leq 1, |t - t^*| / h_t \leq 1, \text{ for any } (x^*, y^*, t^*) \in S\}$, $(x_\tau, y_\tau; t_\tau) \in J \setminus S$ is the projection of $(x, y; t)$ to $J$ with the Euclidean distance between the two points being $c \sqrt{h_x^2 + h_y^2 + h_t^2}$, for a constant $0 < c < 1$, and $f_{-}(x_\tau, y_\tau; t_\tau)$ is the smaller one of the two one-sided limits of $f(x, y; t)$ at $(x_\tau, y_\tau; t_\tau)$. In cases when $O(x, y; t)$ contains jumps, without loss of generality, it is assumed that $O(x, y; t)$ is divided by the edge surface into two parts $I_1$ and $I_2$ with a positive jump size $d_\tau$ from $I_1$ to $I_2$ at $(x_\tau, y_\tau; t_\tau)$, and $Q^{(1)}$ and $Q^{(2)}$ are the two corresponding parts in the support of $K(u, v) K(w)$.

The next two theorems establish the consistency of the proposed edge-preserving image denoising procedure (2)–(6). First, we have the following theorem about the WRMS values defined in (5).

Theorem 2.

Assume that the conditions in Theorem 1 are satisfied, $h_x^2 + h_y^2 + h_t^2 = o(1)$, $(h_x^2 + h_y^2 + h_t^2) / h_{min} = o(1)$, $1 / (N H) = o(1)$ and $1 / (N H h_{min}^2) = o(1)$. Then, we have the following results: for any $(x, y; t) \in \Omega_h \setminus J_h$,

$$\begin{aligned} e(x, y; t) & = \sigma^2 + o_p(1), \\ e^{(l)}(x, y; t) & = \sigma^2 + o_p(1), \quad \text{for } l = 1, 2; \end{aligned} \tag{10}$$

for any $(x, y; t) \in J_h \setminus S_h$,

$$\begin{aligned} e(x, y; t) & = \sigma^2 + d_\tau C_\tau^2 + o_p(1), \\ e^{(l)}(x, y; t) & = \sigma^2 + d_\tau [C_\tau^{(l)}]^2 + o_p(1), \quad \text{for } l = 1, 2, \end{aligned} \tag{11}$$

where

$$\begin{aligned} C_\tau = \Big( & \iiint_{Q^{(1)}} \Big[ \xi_{000}^{(2)} + \frac{\xi_{100}^{(2)}}{\xi_{200}} u + \frac{\xi_{010}^{(2)}}{\xi_{020}} v + \frac{\xi_{001}^{(2)}}{\xi_{002}} w \Big]^2 K(u, v) K(w) \, du \, dv \, dw \\ & + \iiint_{Q^{(2)}} \Big[ 1 - \xi_{000}^{(2)} - \frac{\xi_{100}^{(2)}}{\xi_{200}} u - \frac{\xi_{010}^{(2)}}{\xi_{020}} v - \frac{\xi_{001}^{(2)}}{\xi_{002}} w \Big]^2 K(u, v) K(w) \, du \, dv \, dw \Big)^{1/2} \end{aligned}$$

and

$$\begin{aligned} C_\tau^{(l)} = \Big( & 2 \iiint_{Q^{(1l)}} \Big[ B_{0l} + \frac{B_{1l}}{\xi_{200}} u + \frac{B_{2l}}{\xi_{020}} v + \frac{B_{3l}}{\xi_{002}} w \Big]^2 K(u, v) K(w) \, du \, dv \, dw \\ & + 2 \iiint_{Q^{(2l)}} \Big[ 1 - B_{0l} - \frac{B_{1l}}{\xi_{200}} u - \frac{B_{2l}}{\xi_{020}} v - \frac{B_{3l}}{\xi_{002}} w \Big]^2 K(u, v) K(w) \, du \, dv \, dw \Big)^{1/2}, \end{aligned}$$

with the quantities $Q^{(1l)}$, $Q^{(2l)}$, $B_{0l}$, $B_{1l}$, $B_{2l}$ and $B_{3l}$ defined as follows. Let $\vec{g} = \big( \frac{d_\tau}{\xi_{200} h_x} \xi_{100}^{(2)}, \frac{d_\tau}{\xi_{020} h_y} \xi_{010}^{(2)}, \frac{d_\tau}{\xi_{002} h_t} \xi_{001}^{(2)} \big)$. Then, from (9), $\vec{g}$ is actually the asymptotic direction of the gradient vector $\hat{G}(x, y; t)$. Let $\tilde{O}^{(l)}(x, y; t)$, for $l = 1, 2$, be two halves of the neighborhood $O(x, y; t)$ separated by a plane passing the point $(x, y; t)$ in the direction perpendicular to $\vec{g}$ and $\tilde{Q}^{(l)}$ be the two corresponding parts in the support of $K(u, v) K(w)$. Then, $Q^{(1l)} = Q^{(1)} \cap \tilde{Q}^{(l)}$, $Q^{(2l)} = Q^{(2)} \cap \tilde{Q}^{(l)}$, $B_{0l} = \iiint_{Q^{(2l)}} K(u, v) K(w) \, du \, dv \, dw$, $B_{1l} = \iiint_{Q^{(2l)}} u K(u, v) K(w) \, du \, dv \, dw$, $B_{2l} = \iiint_{Q^{(2l)}} v K(u, v) K(w) \, du \, dv \, dw$, and $B_{3l} = \iiint_{Q^{(2l)}} w K(u, v) K(w) \, du \, dv \, dw$, for $l = 1, 2$.

Theorem 3.

Under the conditions in Theorem 2 and the extra assumption that the threshold parameter $u = u_N \to 0$ as $N \to \infty$, we have, for any $(x, y; t) \in \Omega_h$,

$$\hat{f}(x, y; t) = f(x, y; t) + o_p(1).$$

The proofs of these theoretical results are given in Appendix A.

3.2. Numerical Studies

In this part, we study the numerical performance of our proposed method for denoising an image sequence. First, we consider a simulation example in which the true image intensity function in model (1) has the following expression:

$$f(x, y; t) = \begin{cases} -2 (x - 0.5)^2 - 2 (y - 0.5)^2 - 0.1 \sin(2 \pi t) + 1, & \text{if } r(x, y; t) \leq 0.25^2, \\ -2 (x - 0.5)^2 - 2 (y - 0.5)^2 - 0.1 \sin(2 \pi t), & \text{otherwise}, \end{cases}$$

where $r(x, y; t) = (x - 0.5)^2 + (y - 0.5)^2 + 0.01 \sin(2 \pi t)$, $(x, y) \in \Omega = [0, 1] \times [0, 1]$, and $t \in [0, 1]$. At a given value of $t$, $f(x, y; t)$ has a circular edge curve $r(x, y; t) = 0.25^2$ with a constant jump size 1 in $f(x, y; t)$ at the edges. The radius of the circular edge curve, $\sqrt{0.25^2 - 0.01 \sin(2 \pi t)}$, changes periodically over $t \in [0, 1]$. The image intensity function $f(x, y; t)$ at $t = 0.01$ and 0.25 and its temporal profile $f(0.25, 0.25; t)$ are shown in Figure 3. It can be seen that both the image intensity level at a given pixel and the edge curve change gradually when $t$ changes in $[0, 1]$.
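For readers who want to reproduce the test surface, the function above translates directly into code. The sketch below only evaluates the noiseless $f(x, y; t)$; the function name `true_intensity` is hypothetical, and the grid convention $(x_i, y_j) = (i / n_x, j / n_y)$ with $t_k = k / n_t$ in the usage lines is an assumption carried over from Section 2.1.

```python
import numpy as np

def true_intensity(x, y, t):
    """Simulated true image intensity f(x, y; t) with a circular edge."""
    smooth = (-2.0 * (x - 0.5) ** 2 - 2.0 * (y - 0.5) ** 2
              - 0.1 * np.sin(2.0 * np.pi * t))
    r = (x - 0.5) ** 2 + (y - 0.5) ** 2 + 0.01 * np.sin(2.0 * np.pi * t)
    # Jump of size 1 inside the circle r(x, y; t) <= 0.25^2.
    return smooth + np.where(r <= 0.25 ** 2, 1.0, 0.0)

# Evaluate on a 64 x 64 x 50 grid (one of the settings used in the text).
nx, ny, nt = 64, 64, 50
xs = np.arange(1, nx + 1) / nx
ys = np.arange(1, ny + 1) / ny
ts = np.arange(1, nt + 1) / nt  # assumed spacing
X, Y, T = np.meshgrid(xs, ys, ts, indexing="ij")
F = true_intensity(X, Y, T)
```

Because the edge radius is $\sqrt{0.25^2 - 0.01 \sin(2 \pi t)}$, a pixel at distance 0.24 from the center lies outside the circle at $t = 0.25$ and inside it at $t = 0.75$, which is the gradual edge movement described above.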

In model (1), the random errors $\{\varepsilon_{ijk}, i = 1, 2, \dots, n_x, j = 1, 2, \dots, n_y, k = 1, 2, \dots, n_t\}$ are generated by the function spatialnoise() in the R-package neuRosim (cf., Welvaert et al. [27]). In that R function, there are two parameters $\rho$ and $\sigma$ to specify in advance, where $\rho$ controls the data autocorrelation in all three dimensions and $\sigma$ is the common standard deviation of the random errors. In all our examples, $\sigma$ is fixed at 0.1, 0.2 or 0.3, and $\rho$ is fixed at 0.1, 0.3 or 0.5, to study the possible impact of data noise level and data correlation on the performance of the proposed method. Without loss of generality, we set $n_x = n_y$ in all examples. In the model estimation procedure (2)–(6), we set $h_x = h_y$, and the kernel function $K(v)$ is chosen to be the following truncated Gaussian density function:

$$K(v) = \begin{cases} \frac{\exp(-v^2 / 2) - \exp(-0.5)}{2 \pi - 3 \pi \exp(-0.5)}, & \text{if } |v| \leq 1, \\ 0, & \text{otherwise}. \end{cases}$$
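The normalizing constant $2\pi - 3\pi \exp(-0.5)$ can be checked numerically. The sketch below reads $|v|$ as the Euclidean norm of a two-dimensional argument, which is an interpretation on our part rather than something stated in the text; under that reading, integrating the kernel over the unit disk in polar coordinates should give one. The function name `truncated_gaussian` is hypothetical.

```python
import numpy as np

def truncated_gaussian(r):
    """Truncated Gaussian kernel as a function of the radius r = |v|."""
    r = np.asarray(r, dtype=float)
    const = 2.0 * np.pi - 3.0 * np.pi * np.exp(-0.5)
    inside = (np.exp(-r ** 2 / 2.0) - np.exp(-0.5)) / const
    return np.where(r <= 1.0, inside, 0.0)

# Integrate over the unit disk: int_0^1 K(r) * 2 * pi * r dr (trapezoidal rule).
radius = np.linspace(0.0, 1.0, 100001)
integrand = truncated_gaussian(radius) * 2.0 * np.pi * radius
disk_mass = np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(radius))
```

A value of `disk_mass` close to 1 supports the bivariate reading of the constant. The same formula applied to a one-dimensional argument would integrate to a different value, although constant factors in the kernel do not affect the fits in (2) and (4).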

In cases when $\sigma = 0.1$, 0.2 or 0.3, $n_x = 64$ or 128, $n_t = 50$ or 100, and $\rho = 0.1$, 0.3 or 0.5, the MSE values of the estimator $\hat{f}(x, y; t)$ defined in (6) are presented in Table 1, along with the corresponding parameters $h_x$, $h_t$ and $u$ selected by the modified CV procedure (7) and (8). In each case considered, the MSE value is computed based on 10 replicated simulations. For comparison purposes, the optimal MSE value of the estimator $\hat{f}(x, y; t)$, when its parameters ($h_x$, $h_t$ and $u$) are chosen such that the MSE value reaches the minimum in each case considered, is also presented in the table, along with the corresponding parameter values. From the table, we can draw the following conclusions. (i) The MSE values are smaller when either $n_x$ or $n_t$ is larger, which confirms the consistency results discussed in Section 3.1. (ii) When $\rho$ is larger (i.e., the spatio-temporal data correlation is stronger), the MSE values are larger. So, data correlation does have an impact on the performance of the proposed method, which is intuitively reasonable. (iii) By comparing the MSE and the optimal MSE values, we can see that the MSE values are usually larger than their optimal values, but their differences are small in almost all cases considered. This conclusion indicates that the modified CV procedure (7) and (8) for determining the values of the parameters $(h_x, h_t, u)$ is quite effective. (iv) The parameter values chosen by the modified CV procedure (7) and (8) are quite close to the optimal parameter values in most cases considered.

Next, we compare our proposed method, denoted as NEW, with some alternative methods described below. The first alternative method is the conventional LLK procedure (2), by which

f (x, y; t)

is estimated by

\hat{a} (x, y; t)

defined in (3). Its bandwidths are chosen by the conventional CV procedure, without considering any possible spatio-temporal data correlation. As explained in Section 2.1, this estimator would blur edges while removing noise. The second alternative method is to use

\hat{a} (x, y; t)

for estimating

f (x, y; t)

, but its bandwidths are chosen by the modified CV procedure (7) and (8). The above two alternative methods are denoted as LLK-C and LLK, respectively, where LLK-C denotes the first conventional LLK procedure that does not accommodate data correlation. The third alternative method is the one by Gijbels et al. [15], which is designed for edge-preserving denoising of a single image. To apply this method to the current problem, individual images collected at different time points are denoised by it separately. This method assumes that the observed image intensities at different pixels are independent of each other, and thus its bandwidths can be chosen by the conventional CV procedure. This method is denoted as GLQ. The fourth alternative method is to use

\hat{f} (x, y; t)

in (6) to estimate

f (x, y; t)

, but the parameters

(h_{x}, h_{t}, u)

are chosen by the conventional CV procedure. This method is denoted as NEW-C. By considering all four alternative methods (i.e., LLK-C, LLK, GLQ and NEW-C), we can check whether the current problem of denoising an image sequence can be handled properly by the conventional LLK procedure with or without the modified CV procedure, by an existing edge-preserving image denoising method designed for a single image, or by the proposed method without considering the possible spatio-temporal data correlation. To evaluate their performance, in addition to the regular MSE criterion, we also consider the following edge-preservation (EP) criterion originally discussed in Hall and Qiu [28]:

E P (\hat{f}) = | J S (\hat{f}) - J S (f) | / J S (f),

where

\begin{matrix} J S (f) = & \frac{1}{(n_{x} - 2) (n_{y} - 2) (n_{t} - 2)} \sum_{i = 2}^{n_{x} - 1} \sum_{j = 2}^{n_{y} - 1} \sum_{k = 2}^{n_{t} - 1} ({[f (x_{i + 1}, y_{j}; t_{k}) - f (x_{i - 1}, y_{j}; t_{k})]}^{2} + \\ {[f (x_{i}, y_{j + 1}; t_{k}) - f (x_{i}, y_{j - 1}; t_{k})]}^{2} + {[f (x_{i}, y_{j}; t_{k + 1}) - f (x_{i}, y_{j}; t_{k - 1})]}^{2})^{1 / 2}, \end{matrix}

and JS(

\hat{f}

) is defined similarly. According to Hall and Qiu [28], JS(f) is a reasonable measure of the cumulative jump magnitude of f at the edge locations. So,

E P (\hat{f})

provides a measure of the percentage of the cumulative jump magnitude of f that has been lost during data smoothing by using the estimator

\hat{f}

. By this explanation, the smaller its value, the better. In cases when

σ = 0.1

, 0.2 or 0.3,

n_{x} = 128

,

n_{t} = 100

, and

ρ = 0.1

, 0.3 or 0.5, the MSE and EP values of the related methods are presented in Table 2. From the table, it can be seen that the proposed method NEW has the smallest MSE values with quite large margins among all five methods in all cases considered, except the case when

σ = 0.1

and

ρ = 0.1

where NEW-C has a slightly smaller MSE value than that of NEW due to the weak data correlation in that case. Likewise, NEW has much smaller EP values in all cases considered, compared to the four competing methods. This example confirms that it is necessary to consider edge-preserving procedures when denoising image sequences and that the possible spatio-temporal data correlation should be taken into account during the denoising process. It also confirms the benefit of sharing useful information among neighboring images when denoising an image sequence.
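The EP criterion defined above is straightforward to compute for image sequences stored as three-dimensional arrays. The following is a minimal sketch, assuming the sequence is held in a NumPy array of shape (n_x, n_y, n_t); the function names `jump_size` and `edge_preservation` are illustrative only.

```python
import numpy as np

def jump_size(f):
    """JS(f): average norm of the central differences over interior points.

    `f` has shape (n_x, n_y, n_t); boundary points are excluded, which gives
    the (n_x - 2)(n_y - 2)(n_t - 2) terms in the definition.
    """
    f = np.asarray(f, dtype=float)
    dx = f[2:, 1:-1, 1:-1] - f[:-2, 1:-1, 1:-1]
    dy = f[1:-1, 2:, 1:-1] - f[1:-1, :-2, 1:-1]
    dt = f[1:-1, 1:-1, 2:] - f[1:-1, 1:-1, :-2]
    return float(np.mean(np.sqrt(dx**2 + dy**2 + dt**2)))

def edge_preservation(f_hat, f_true):
    """EP(f_hat) = |JS(f_hat) - JS(f)| / JS(f)."""
    js_true = jump_size(f_true)
    return abs(jump_size(f_hat) - js_true) / js_true

# Toy sequence with one vertical step edge that does not change over time.
step = np.zeros((8, 8, 4))
step[4:, :, :] = 1.0
flat = np.full_like(step, step.mean())   # an estimate that removes the edge entirely

print(edge_preservation(step, step))     # perfect estimate
print(edge_preservation(flat, step))     # completely flattened estimate
```

Because of the absolute value, EP is also large when JS(f̂) exceeds JS(f), for example when substantial noise remains in the estimate; a value above 100% can occur only when JS(f̂) is more than twice JS(f).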

In the cases when

σ = 0.2

and

ρ = 0.1

, 0.3 or 0.5, Figure 4 shows the observed images at

t = 0.5

in the first column, and the denoised images by the methods LLK-C, LLK, GLQ, NEW-C and NEW in columns 2–6. From the figure, it can be seen that the denoised images by NEW are the best in removing noise and preserving edges. In comparison, the denoised images by LLK-C and NEW-C are quite noisy: the bandwidths selected by the conventional CV procedure are relatively small because that procedure cannot distinguish the data correlation from the mean structure, as discussed in Section 2.2. The denoised images by LLK are quite blurry because the method does not take the edges into account when denoising the images. The denoised images by GLQ are quite blurry as well since GLQ denoises individual images at different time points separately and ignores the serial data correlation.

Next, we apply the proposed method NEW and the four alternative methods LLK-C, LLK, GLQ and NEW-C to a sequence of cell images that records the vasculogenesis process. The sequence has 100 images, and each image has

128 \times 128

pixels. A detailed description of the data can be found in Svoboda et al. [29]. The 1st, 50th and 100th images of the sequence are shown in Figure 5.

In the image denoising literature, to test the noise removal ability of an image denoising method, it is common practice to add random noise at a certain level to the test images and then apply the image denoising method to the noisy test images (cf., Gijbels et al. [15]). To follow this convention, spatio-temporally correlated noise is first generated using the R package neuRosim and then added to the sequence of 100 cell images described above. When generating the noise,

σ

is chosen to be 0.1, 0.2 or 0.3 and

ρ

is chosen to be 0.1, 0.3 or 0.5, as in the simulation examples presented above. The MSE and EP values of the five image denoising methods based on 10 replicated simulations are presented in Table 3. From the table, it can be seen that NEW still has smaller MSE and EP values in this example, compared to the four competing methods, except in a small number of cases when

σ

and

ρ

are relatively small.
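For readers who want to reproduce a comparable noise-contamination step without R, the sketch below generates spatio-temporally correlated Gaussian noise in Python. It is not the neuRosim generator used in the paper: it is a stand-in that applies a stationary AR(1) filter with coefficient ρ along each of the three axes, which yields marginal standard deviation σ and lag-one correlation ρ along every axis. All function names are hypothetical.

```python
import numpy as np

def ar1_filter(white, rho, axis):
    """Turn unit-variance white noise into a stationary AR(1) process along `axis`."""
    x = np.moveaxis(np.array(white, dtype=float), axis, 0)  # copy, then put `axis` first
    scale = np.sqrt(1.0 - rho**2)                           # keeps the variance equal to 1
    for i in range(1, x.shape[0]):
        x[i] = rho * x[i - 1] + scale * x[i]
    return np.moveaxis(x, 0, axis)

def correlated_noise(shape, sigma, rho, seed=0):
    """Gaussian noise with standard deviation sigma and separable AR(1) correlation."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(shape)
    for axis in range(len(shape)):
        noise = ar1_filter(noise, rho, axis)
    return sigma * noise

def lag1_correlation(a, axis):
    """Sample correlation between neighboring values along one axis."""
    lead = np.take(a, np.arange(1, a.shape[axis]), axis=axis)
    lag = np.take(a, np.arange(0, a.shape[axis] - 1), axis=axis)
    return float(np.corrcoef(lead.ravel(), lag.ravel())[0, 1])

noise = correlated_noise((64, 64, 50), sigma=0.2, rho=0.3)
print(noise.std(), lag1_correlation(noise, axis=0), lag1_correlation(noise, axis=2))
```

A noisy test sequence would then be obtained by adding `noise` to a clean image array of the same shape. The separable correlation structure is an assumption made for simplicity; other correlation models could be substituted.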

The 50th observed test image after the spatio-temporally correlated noise with

ρ = 0.1

, 0.3 or 0.5 being added is shown in the first column of Figure 6. The denoised images by the five methods LLK-C, LLK, GLQ, NEW-C and NEW are shown in columns 2–6 of the figure. It can be seen that similar conclusions to those from Figure 4 can be made here, and the denoised images by NEW look reasonably good, as the algorithm works well in removing noise and preserving edges.

Finally, we apply the five methods considered in the above examples to a sequence of Landsat images of the Salton Sea region. The Salton Sea is the largest inland lake located at the southern border of California, US, and has a great impact on the local ecosystem (Shuford et al. [30]). The Landsat images used here were taken between 27 May 2000 and 24 December 2001. There are a total of 20 images collected at roughly equally spaced time points, and each image has

100 \times 100

pixels. In this example, we consider the case when

σ = 0.3

and

ρ = 0.3

. The MSE values of the five methods LLK-C, LLK, GLQ, NEW-C, and NEW calculated in the same way as before are

9.70

,

4.78

,

12.03

,

9.77

, and

4.82

, respectively. Their EP values are respectively

85.54 %

,

20.18 %

,

109.91 %

,

86.15 %

, and

19.14 %

. So, we can see that the NEW method has the best edge-preserving performance among the five methods in this example, and NEW and LLK have the best overall noise removal performance. The 10th noisy observed test image taken on 28 April 2001 and its denoised versions by the five methods are shown in Figure 7. It can be seen from the figure that the denoised images by the methods LLK-C, GLQ, and NEW-C are still quite noisy, whereas the noise in the images generated by NEW and LLK is mostly removed while the edges are preserved reasonably well.

4. Conclusions

In this paper, we have described our proposed edge-preserving image denoising method for handling image sequences. Its major features include the following: (i) helpful information in neighboring images is shared during image denoising, (ii) edge structures in the observed images can be preserved when removing noise, and (iii) possible spatio-temporal data correlation can be accommodated in the related local smoothing procedure. Theoretical arguments given in Section 3.1 and numerical studies presented in Section 3.2 show that the proposed method works well in the various cases considered. Some issues about the proposed method remain for future research. For instance, in the proposed local smoothing procedure (2)–(6), each of the bandwidths

(h_{x}, h_{y}, h_{t})

is chosen by the modified CV procedure (7) and (8) to be the same in the entire design space

Ω \times [0, 1]

. Intuitively, relatively small bandwidths are preferred at places where the image intensity surface

f (x, y; t)

has large curvature and relatively large bandwidths are preferred at places where the curvature of

f (x, y; t)

is small. Thus, in some applications where the curvature of

f (x, y; t)

could change quite dramatically in the design space, variable bandwidths might be helpful. Such issues will be studied carefully in our future research.

Author Contributions

Methodology, P.Q.; Formal analysis, F.Y.; Writing—original draft preparation, F.Y.; Writing—review and editing, P.Q.; Funding acquisition, P.Q.; Supervision, P.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science Foundation grant DMS-1914639.

Data Availability Statement

Publicly available datasets were analyzed in this study. They can be found from the links: https://cbia.fi.muni.cz/datasets/ and https://earthexplorer.usgs.gov.

Acknowledgments

We thank the four referees for many constructive comments and suggestions about the paper which greatly improved its quality. This research is supported in part by the National Science Foundation grant DMS-1914639.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Proof of Proposition 1

Define

B_{h} (x, y, t) = \{(x^{\prime}, y^{\prime}; t^{\prime}) : \sqrt{(| x^{\prime} - x | / h_{x})^{2} + (| y^{\prime} - y | / h_{y})^{2}} \leq 1, | t - t^{\prime} | \leq h_{t}, (x^{\prime}, y^{\prime}; t^{\prime}) \in [0, 1] \times [0, 1] \times [0, 1]\}

,

Δ_{i j k} = [x_{i - 1}, x_{i}] \times [y_{j - 1}, y_{j}] \times [t_{k - 1}, t_{k}]

,

x_{0} = y_{0} = t_{0} = 0

. Then it can be seen that

\begin{matrix} | \frac{1}{N H} \sum_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) - 1 | \\ = & | \frac{1}{H} \sum_{i j k} \int \int \int_{Δ_{i j k}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) d u d v d w - 1 | \\ = & | \frac{1}{H} \sum_{i j k} \int \int \int_{Δ_{i j k}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) d u d v d w - \\ \frac{1}{H} \int \int \int_{B_{h} (x, y, t)} K (\frac{u - x}{h_{x}}, \frac{v - y}{h_{y}}) K (\frac{w - t}{h_{t}}) d u d v d w | \\ = & | \frac{1}{H} \sum_{i j k} \int \int \int_{Δ_{i j k}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) d u d v d w - \\ \frac{1}{H} \sum_{i j k} \int \int \int_{B_{h} (x, y, t) \cap Δ_{i j k}} K (\frac{u - x}{h_{x}}, \frac{v - y}{h_{y}}) K (\frac{w - t}{h_{t}}) d u d v d w | \\ = & | \frac{1}{H} \sum_{i j k} \int \int \int_{{B_{h} (x, y, t)}^{c} \cap Δ_{i j k}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) d u d v d w + \\ \frac{1}{H} \sum_{i j k} \int \int \int_{B_{h} (x, y, t) \cap Δ_{i j k}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) d u d v d w - \\ \frac{1}{H} \sum_{i j k} \int \int \int_{B_{h} (x, y, t) \cap Δ_{i j k}} K (\frac{u - x}{h_{x}}, \frac{v - y}{h_{y}}) K (\frac{w - t}{h_{t}}) d u d v d w | \\ \leq & O (\frac{1}{n_{\min} h_{\min}}) + \frac{1}{H} \sum_{i j k} \int \int \int_{B_{h} (x, y, t) \cap Δ_{i j k}} | K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) - \\ K (\frac{u - x}{h_{x}}, \frac{v - y}{h_{y}}) K (\frac{w - t}{h_{t}}) | d u d v d w \\ \leq & O (\frac{1}{n_{\min} h_{\min}}) + \frac{1}{H} \sum_{i j k} \int \int \int_{B_{h} (x, y, t) \cap Δ_{i j k}} \frac{(1 + \sqrt{2}) C}{n_{\min} h_{\min}} d u d v d w \\ = & O (\frac{1}{n_{\min} h_{\min}}) + \frac{1}{H} \frac{(1 + \sqrt{2}) C}{n_{\min} h_{\min}} \int \int \int_{B_{h} (x, y, t)} 1 d u d v d w \\ = & O (\frac{1}{n_{\min} h_{\min}}), \end{matrix}

where

C \geq 0

is the Lipschitz constant that satisfies the condition

| K (u) - K (u^{^{'}}) | \leq C | u - u^{^{'}} |

. So, the first result in Proposition 1 is valid.
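The first result can be illustrated numerically. The sketch below assumes N = n_x n_y n_t, H = h_x h_y h_t and design points x_i = i/n_x (and similarly for y and t), which is how the notation reads here. It uses the truncated Gaussian kernel of Section 3.2 for the spatial part and, as an assumption made only for this illustration, an Epanechnikov density for the temporal kernel, since the temporal kernel is not restated in this appendix.

```python
import numpy as np

def kernel_2d(u, v):
    """Truncated Gaussian density on the unit disk."""
    r2 = u**2 + v**2
    norm = 2.0 * np.pi - 3.0 * np.pi * np.exp(-0.5)
    return np.where(r2 <= 1.0, (np.exp(-r2 / 2.0) - np.exp(-0.5)) / norm, 0.0)

def kernel_1d(w):
    """Epanechnikov density on [-1, 1]; an assumed temporal kernel."""
    return np.where(np.abs(w) <= 1.0, 0.75 * (1.0 - w**2), 0.0)

def normalized_kernel_sum(n_x, n_y, n_t, h_x, h_y, h_t, point=(0.5, 0.5, 0.5)):
    """Compute (1 / (N H)) * sum_{ijk} K(., .) K(.) at an interior point."""
    x, y, t = point
    xs = np.arange(1, n_x + 1) / n_x
    ys = np.arange(1, n_y + 1) / n_y
    ts = np.arange(1, n_t + 1) / n_t
    # The triple sum factorizes into a spatial sum times a temporal sum.
    spatial = kernel_2d((xs[:, None] - x) / h_x, (ys[None, :] - y) / h_y).sum()
    temporal = kernel_1d((ts - t) / h_t).sum()
    return spatial * temporal / (n_x * n_y * n_t * h_x * h_y * h_t)

for n in (16, 32, 64, 128):
    error = abs(normalized_kernel_sum(n, n, n, 0.2, 0.2, 0.2) - 1.0)
    print(n, error)
```

The printed errors should shrink as the grid becomes finer with the bandwidths held fixed, in line with the O(1/(n_min h_min)) bound, which is a worst-case rate.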

To prove the second result, it can be checked that

\begin{matrix} E | \frac{1}{N H} \sum_{i j k} ε_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) |^{2} \\ = & \mathrm{Var} (\frac{1}{N H} \sum_{i j k} ε_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})) \\ = & \frac{1}{N^{2} H^{2}} \sum_{i j k} \sum_{i^{\prime} j^{\prime} k^{\prime}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) \\ K (\frac{x_{i^{\prime}} - x}{h_{x}}, \frac{y_{j^{\prime}} - y}{h_{y}}) K (\frac{t_{k^{\prime}} - t}{h_{t}}) \mathrm{Cov} (ε_{i j k}, ε_{i^{\prime} j^{\prime} k^{\prime}}) \\ \leq & \frac{1}{N^{2} H^{2}} \sum_{i j k} \sum_{i^{\prime} j^{\prime} k^{\prime}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) \\ K (\frac{x_{i^{\prime}} - x}{h_{x}}, \frac{y_{j^{\prime}} - y}{h_{y}}) K (\frac{t_{k^{\prime}} - t}{h_{t}}) c_{1} σ^{2} ρ^{c_{2} \max \{| i - i^{\prime} |, | j - j^{\prime} |, | k - k^{\prime} |\}} \\ \leq & \frac{1}{N^{2} H^{2}} \sum_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) c_{1} σ^{2} 24 \int_{0}^{\infty} τ^{2} ρ^{τ} d τ \\ = & O (\frac{1}{N H}) . \end{matrix}
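As a side note that is not part of the original proof, one way to see where the factor 24 in the last inequality comes from is a lattice-point count. The sketch below assumes 0 < ρ < 1, c_2 > 0 and a bounded kernel K.

```latex
% Number of index triples at max-distance exactly m >= 1 from (i, j, k):
\#\{(i', j', k') : \max\{|i - i'|, |j - j'|, |k - k'|\} = m\}
  \leq (2m + 1)^{3} - (2m - 1)^{3} = 24 m^{2} + 2 .
% Hence the inner sum over (i', j', k') is bounded uniformly in (i, j, k):
\sum_{i' j' k'} \rho^{c_{2} \max\{|i - i'|, |j - j'|, |k - k'|\}}
  \leq 1 + \sum_{m = 1}^{\infty} (24 m^{2} + 2) \rho^{c_{2} m} < \infty .
```

With the inner sum bounded by a constant and K bounded, the remaining sum over (i, j, k) of the kernel weights is of order NH by the first result, so the whole expression is of order NH / (N^2 H^2) = 1/(NH).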

Similarly, it can be checked that

\begin{matrix} E | \frac{1}{N H} \sum_{i j k} (ε_{i j k}^{2} - σ^{2}) K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) |^{2} \\ = & \mathrm{Var} (\frac{1}{N H} \sum_{i j k} ε_{i j k}^{2} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})) \\ = & \frac{1}{N^{2} H^{2}} \sum_{i j k} \sum_{i^{\prime} j^{\prime} k^{\prime}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) \\ K (\frac{x_{i^{\prime}} - x}{h_{x}}, \frac{y_{j^{\prime}} - y}{h_{y}}) K (\frac{t_{k^{\prime}} - t}{h_{t}}) \mathrm{Cov} (ε_{i j k}^{2}, ε_{i^{\prime} j^{\prime} k^{\prime}}^{2}) \\ \leq & \frac{1}{N^{2} H^{2}} \sum_{i j k} \sum_{i^{\prime} j^{\prime} k^{\prime}} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) \\ K (\frac{x_{i^{\prime}} - x}{h_{x}}, \frac{y_{j^{\prime}} - y}{h_{y}}) K (\frac{t_{k^{\prime}} - t}{h_{t}}) 12 {(c_{1} σ^{2} ρ^{c_{2} \max \{| i - i^{\prime} |, | j - j^{\prime} |, | k - k^{\prime} |\}})}^{1 / 4} E (ε_{111}^{4}) \\ \leq & \frac{1}{N^{2} H^{2}} \sum_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}}) 12 {(c_{1} σ^{2} 24 \int_{0}^{\infty} τ^{2} ρ^{τ} d τ)}^{1 / 3} {(E (ε_{111}^{6}))}^{2 / 3} \\ = & O (\frac{1}{N H}) . \end{matrix}

The first inequality in the above expression is based on the result in Davydov [31]. So, the third result is valid.

Appendix A.2. Proof of Theorem 1

We first consider the case when

(x, y; t) \in Ω_{h} ∖ J_{h}

. By Taylor expansion, we have

\begin{matrix} Z_{i j k} = & f (x_{i}, y_{j}; t_{k}) + ϵ_{i j k} \\ = & f (x, y; t) + (x_{i} - x) f_{x}^{^{'}} (x, y; t) + (y_{j} - y) f_{y}^{^{'}} (x, y; t) + (t_{k} - t) f_{t}^{^{'}} (x, y; t) + \\ O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) + ϵ_{i j k} . \end{matrix}

So, it can be checked that

\begin{matrix} [\begin{matrix} \sum_{i j k} Z_{i j k} K_{i j k} \\ \sum_{i j k} (x_{i} - x) Z_{i j k} K_{i j k} \\ \sum_{i j k} (y_{j} - y) Z_{i j k} K_{i j k} \\ \sum_{i j k} (t_{k} - t) Z_{i j k} K_{i j k} \end{matrix}] = & M [\begin{matrix} f (x, y; t) \\ f_{x}^{^{'}} (x, y; t) \\ f_{y}^{^{'}} (x, y; t) \\ f_{t}^{^{'}} (x, y; t) \end{matrix}] + [\begin{matrix} \sum_{i j k} O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (x_{i} - x) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (y_{j} - y) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (t_{k} - t) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \end{matrix}] + \\ [\begin{matrix} \sum_{i j k} ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (x_{i} - x) ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (y_{j} - y) ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (t_{k} - t) ϵ_{i j k} K_{i j k} \end{matrix}], \end{matrix}

where

\begin{matrix} M = [\begin{matrix} m_{000} & m_{100} & m_{010} & m_{001} \\ m_{100} & m_{200} & m_{110} & m_{101} \\ m_{010} & m_{110} & m_{020} & m_{011} \\ m_{001} & m_{101} & m_{011} & m_{002} \end{matrix}] . \end{matrix}

From Expression (3), we have

\begin{matrix} [\begin{matrix} \hat{a} (x, y; t) \\ \hat{b} (x, y; t) \\ \hat{c} (x, y; t) \\ \hat{d} (x, y; t) \end{matrix}] = & [\begin{matrix} f (x, y; t) \\ f_{x}^{^{'}} (x, y; t) \\ f_{y}^{^{'}} (x, y; t) \\ f_{t}^{^{'}} (x, y; t) \end{matrix}] + M^{- 1} [\begin{matrix} \sum_{i j k} O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (x_{i} - x) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (y_{j} - y) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \\ \sum_{i j k} (t_{k} - t) O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) K_{i j k} \end{matrix}] + \\ M^{- 1} [\begin{matrix} \sum_{i j k} ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (x_{i} - x) ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (y_{j} - y) ϵ_{i j k} K_{i j k} \\ \sum_{i j k} (t_{k} - t) ϵ_{i j k} K_{i j k} \end{matrix}] . \end{matrix}

By some simple algebraic manipulations, we have

\begin{matrix} M^{- 1} = & [\begin{matrix} O (\frac{1}{N H}) & O (\frac{1}{N H \cdot h_{x}}) & O (\frac{1}{N H \cdot h_{y}}) & O (\frac{1}{N H \cdot h_{t}}) \\ O (\frac{1}{N H \cdot h_{x}}) & O (\frac{1}{N H \cdot h_{x}^{2}}) & O (\frac{1}{N H \cdot h_{x} h_{y}}) & O (\frac{1}{N H \cdot h_{x} h_{t}}) \\ O (\frac{1}{N H \cdot h_{y}}) & O (\frac{1}{N H \cdot h_{x} h_{y}}) & O (\frac{1}{N H \cdot h_{y}^{2}}) & O (\frac{1}{N H \cdot h_{y} h_{t}}) \\ O (\frac{1}{N H \cdot h_{t}}) & O (\frac{1}{N H \cdot h_{x} h_{t}}) & O (\frac{1}{N H \cdot h_{y} h_{t}}) & O (\frac{1}{N H \cdot h_{t}^{2}}) \end{matrix}] . \end{matrix}

Then,

\begin{matrix} [\begin{matrix} \hat{a} (x, y; t) \\ \hat{b} (x, y; t) \\ \hat{c} (x, y; t) \\ \hat{d} (x, y; t) \end{matrix}] = & [\begin{matrix} f (x, y; t) \\ f_{x}^{^{'}} (x, y; t) \\ f_{y}^{^{'}} (x, y; t) \\ f_{t}^{^{'}} (x, y; t) \end{matrix}] + [\begin{matrix} O (h_{x}^{2} + h_{y}^{2} + h_{t}^{2}) \\ O (\frac{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}{h_{x}}) \\ O (\frac{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}{h_{y}}) \\ O (\frac{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}{h_{t}}) \end{matrix}] + [\begin{matrix} O_{p} (\frac{1}{\sqrt{N H}}) \\ O_{p} (\frac{1}{h_{x} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{y} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{t} \sqrt{N H}}) \end{matrix}] . \end{matrix}
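The estimator analyzed above is the solution of a 4 × 4 weighted least squares system. The following minimal sketch assumes that m_{abc} denotes the kernel-weighted moment sum of (x_i − x)^a (y_j − y)^b (t_k − t)^c K_{ijk}, which is what the displayed equations imply, and that K_{ijk} is the product of the spatial and temporal kernel weights. The temporal kernel below is an assumed Epanechnikov-type choice, and the function names are illustrative.

```python
import numpy as np

def kernel_2d(u, v):
    """Truncated Gaussian weights on the unit disk (constants cancel in the fit)."""
    r2 = u**2 + v**2
    return np.where(r2 <= 1.0, np.exp(-r2 / 2.0) - np.exp(-0.5), 0.0)

def kernel_1d(w):
    """Epanechnikov-type temporal weights; an assumption for illustration."""
    return np.where(np.abs(w) <= 1.0, 1.0 - w**2, 0.0)

def local_linear_fit(Z, xs, ys, ts, point, h_x, h_y, h_t):
    """Return (a_hat, b_hat, c_hat, d_hat) at `point` from the normal equations."""
    x, y, t = point
    X, Y, T = np.meshgrid(xs, ys, ts, indexing="ij")
    dx, dy, dt = X - x, Y - y, T - t
    K = kernel_2d(dx / h_x, dy / h_y) * kernel_1d(dt / h_t)
    design = np.stack([np.ones_like(dx), dx, dy, dt])        # shape (4, n_x, n_y, n_t)
    M = np.einsum("aijk,bijk,ijk->ab", design, design, K)    # moment matrix M
    rhs = np.einsum("aijk,ijk->a", design, Z * K)            # weighted sums of Z
    return np.linalg.solve(M, rhs)

# Noise-free linear surface: the local linear fit should reproduce it exactly.
n = 32
grid = np.arange(1, n + 1) / n
X, Y, T = np.meshgrid(grid, grid, grid, indexing="ij")
Z = 1.0 + 2.0 * X - 3.0 * Y + 0.5 * T
estimate = local_linear_fit(Z, grid, grid, grid, (0.5, 0.5, 0.5), 0.2, 0.2, 0.2)
print(estimate)  # intercept estimates f at the point; the rest estimate its gradient
```

For a surface that is exactly linear, the Taylor remainder vanishes, so the bias terms in the expansion above are zero and only the noise terms would remain in a noisy version of this example.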

Now, we consider the case when

(x, y; t) \in J_{h} ∖ S_{h}

. If

(x_{i}, y_{j}; t_{k}) \in I_{1}

, then we have

\begin{matrix} Z_{i j k} & = & f (x_{i}, y_{j}; t_{k}) + ε_{i j k} \\ = & f_{-} (x_{τ}, y_{τ}; t_{τ}) + O (\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}) + ε_{i j k}, \end{matrix}

and if

(x_{i}, y_{j}; t_{k}) \in I_{2}

, we have

\begin{matrix} Z_{i j k} & = & f (x_{i}, y_{j}; t_{k}) + ε_{i j k} \\ = & f_{-} (x_{τ}, y_{τ}; t_{τ}) + d_{τ} + O (\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}) + ε_{i j k} . \end{matrix}

By some similar arguments to those in the case considered above, we have

\begin{matrix} [\begin{matrix} \hat{a} (x, y; t) \\ \hat{b} (x, y; t) \\ \hat{c} (x, y; t) \\ \hat{d} (x, y; t) \end{matrix}] & = & [\begin{matrix} f_{-} (x_{τ}, y_{τ}; t_{τ}) + d_{τ} \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in I_{2}} K_{i j k}}{\sum_{i j k} K_{i j k}} \\ \frac{d_{τ}}{h_{x}} \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in I_{2}} [(x_{i} - x) / h_{x}] K_{i j k}}{\sum_{i j k} {[(x_{i} - x) / h_{x}]}^{2} K_{i j k}} \\ \frac{d_{τ}}{h_{y}} \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in I_{2}} [(y_{j} - y) / h_{y}] K_{i j k}}{\sum_{i j k} {[(y_{j} - y) / h_{y}]}^{2} K_{i j k}} \\ \frac{d_{τ}}{h_{t}} \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in I_{2}} [(t_{k} - t) / h_{t}] K_{i j k}}{\sum_{i j k} {[(t_{k} - t) / h_{t}]}^{2} K_{i j k}} \end{matrix}] + \\ [\begin{matrix} O (\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{x}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{y}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{t}}) \end{matrix}] + [\begin{matrix} O_{p} (\frac{1}{\sqrt{N H}}) \\ O_{p} (\frac{1}{h_{x} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{y} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{t} \sqrt{N H}}) \end{matrix}] \\ = & [\begin{matrix} f_{-} (x_{τ}, y_{τ}; t_{τ}) + d_{τ} ξ_{000}^{(2)} \\ \frac{d_{τ}}{ξ_{200} h_{x}} ξ_{100}^{(2)} \\ \frac{d_{τ}}{ξ_{020} h_{y}} ξ_{010}^{(2)} \\ \frac{d_{τ}}{ξ_{002} h_{t}} ξ_{001}^{(2)} \end{matrix}] + [\begin{matrix} O (\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{x}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{y}}) \\ O (\frac{\sqrt{h_{x}^{2} + h_{y}^{2} + h_{t}^{2}}}{h_{t}}) \end{matrix}] + [\begin{matrix} O_{p} (\frac{1}{\sqrt{N H}}) \\ O_{p} (\frac{1}{h_{x} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{y} \sqrt{N H}}) \\ O_{p} (\frac{1}{h_{t} \sqrt{N H}}) \end{matrix}] \end{matrix}

Appendix A.3. Proof of Theorem 2

We prove the second equations in (10) and (11) here. The first equations can be proved similarly. For simplicity, we write

{\hat{a}}^{(l)} (x, y; t)

,

{\hat{b}}^{(l)} (x, y; t)

,

{\hat{c}}^{(l)} (x, y; t)

,

{\hat{d}}^{(l)} (x, y; t)

,

O^{(l)} (x, y; t)

and

{\tilde{O}}^{(l)} (x, y; t)

as

{\hat{a}}^{(l)}

,

{\hat{b}}^{(l)}

,

{\hat{c}}^{(l)}

,

{\hat{d}}^{(l)}

,

O^{(l)}

and

{\tilde{O}}^{(l)}

, respectively from now on. First, by Proposition 1, it is easy to show that

\frac{\sum_{i j k} ε_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})}{\sum_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})} = O_{p} (\frac{1}{\sqrt{N H}}),

(A1)

\frac{\sum_{i j k} (ε_{i j k}^{2} - σ^{2}) K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})}{\sum_{i j k} K (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}) K (\frac{t_{k} - t}{h_{t}})} = o_{p} (1) .

(A2)

Let us first consider the case when

(x, y; t) \in Ω_{h} ∖ J_{h}

. In such a case, it can be checked that

\begin{matrix} e^{(l)} (x, y; t) & = & {\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} [ε_{i j k} + f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)]^{2} K_{i j k}} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} \\ = & \{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} ε_{i j k}^{2} K_{i j k}\} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} + \\ {2 \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} ε_{i j k} [f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)] K_{i j k}} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} + \\ {\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} [f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)]^{2} K_{i j k}} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} \\ = : & A_{1}^{(l)} (x, y; t) + A_{2}^{(l)} (x, y; t) + A_{3}^{(l)} (x, y; t) . \end{matrix}

Similar to (A2), we have

A_{1}^{(l)} (x, y; t) = σ^{2} + o_{p} (1) .

(A3)

By the Taylor expansion of

f (x_{i}, y_{j}; t_{k})

at the point

(x, y; t)

, the results in Theorem 1, and arguments similar to those for (A1), we have

\begin{matrix} A_{2}^{(l)} (x, y; t) & \leq & 2 | f (x, y; t) - {\hat{a}}^{(l)} | | \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} ε_{i j k} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} | + \\ 2 h_{x} | f_{x}^{^{'}} (x, y; t) - {\hat{b}}^{(l)} | | \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} ε_{i j k} \frac{x_{i} - x}{h_{x}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} | + \\ 2 h_{y} | f_{y}^{^{'}} (x, y; t) - {\hat{c}}^{(l)} | | \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)} (x, y; t)} ε_{i j k} \frac{y_{j} - y}{h_{y}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} | + \\ 2 h_{t} | f_{t}^{^{'}} (x, y; t) - {\hat{d}}^{(l)} | | \frac{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} ε_{i j k} \frac{t_{k} - t}{h_{t}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} | \\ = & o_{p} (1) . \end{matrix}

(A4)

Similarly, we have

\begin{matrix} A_{3}^{(l)} (x, y; t) = o_{p} (1) . \end{matrix}

(A5)

By combining (A3)–(A5), we have

e^{(l)} (x, y; t) = σ^{2} + o_{p} (1) .

Now, let us consider the case when

(x, y; t) \in J_{h} ∖ S_{h}

. Similar to the above case, let us write

e^{(l)} (x, y; t) = A_{1}^{(l)} (x, y; t) + A_{2}^{(l)} (x, y; t) + A_{3}^{(l)} (x, y; t) .

Here, we still have

A_{1}^{(l)} (x, y; t) = σ^{2} + o_{p} (1) .

(A6)

For

A_{2}^{(l)} (x, y; t)

, we have

\begin{matrix} A_{2}^{(l)} (x, y; t) & = & {2 \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} [f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)] K_{i j k}} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} + \\ {2 \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2} \cap O^{(l)}} ε_{i j k} [f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)] K_{i j k}} / \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} \\ = : & A_{21}^{(l)} (x, y; t) + A_{22}^{(l)} (x, y; t) . \end{matrix}

By the results in Theorem 1, we have

\begin{matrix} A_{21}^{(l)} (x, y; t) & = & \frac{2 \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ})] K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} - \\ \frac{(D_{1} + o_{p} (1)) \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} - \\ \frac{(D_{2} + o_{p} (1)) \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} \frac{x_{i} - x}{h_{x}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} - \\ \frac{(D_{3} + o_{p} (1)) \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} \frac{y_{j} - y}{h_{y}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}} - \\ \frac{(D_{4} + o_{p} (1)) \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} ε_{i j k} \frac{t_{k} - t}{h_{t}} K_{i j k}}{\sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k}}, \end{matrix}

where

D_{1}

,

D_{2}

,

D_{3}

and

D_{4}

are constants. By arguments similar to those for (A1), we can conclude that

A_{21}^{(l)} = o_{p} (1) .

Similarly, we have

A_{22}^{(l)} = o_{p} (1) .

So,

A_{2}^{(l)} = o_{p} (1) .

(A7)

By similar arguments to those about Proposition 1, we have

| \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} K_{i j k} - \frac{1}{2} | = o (1) .

For a function

ϕ (x, y; t)

satisfying the condition that

{sup}_{x^{2} + y^{2} + t^{2} \leq 1} | ϕ (x, y; t) | \leq b_{ϕ} < \infty

, we can have

\begin{matrix} | \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} ⋂ O^{(l)}} ϕ (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}; \frac{t_{k} - t}{h_{t}}) K_{i j k} - \\ \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} ⋂ {\tilde{O}}^{(l)}} ϕ (\frac{x_{i} - x}{h_{x}}, \frac{y_{j} - y}{h_{y}}; \frac{t_{k} - t}{h_{t}}) K_{i j k} | \\ \leq & b_{ϕ} | | K | | \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)} Δ {\tilde{O}}^{(l)}} 1 \\ = & o (1), \end{matrix}

where

O^{(l)} Δ {\tilde{O}}^{(l)} = (O^{(l)} ⋃ {\tilde{O}}^{(l)}) ∖ (O^{(l)} ⋂ {\tilde{O}}^{(l)})

. The last equation above is a direct conclusion of (9). By the above results, we have

\begin{matrix} A_{3}^{(l)} (x, y; t) & = & \frac{2}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} [f (x_{i}, y_{j}; t_{k}) - {\hat{a}}^{(l)} - {\hat{b}}^{(l)} (x_{i} - x) - \\ {\hat{c}}^{(l)} (y_{j} - y) - {\hat{d}}^{(l)} (t_{k} - t)]^{2} K_{i j k} \\ = & \frac{2}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in O^{(l)}} [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ}) - d_{τ} B_{0 l} - \frac{d_{τ} B_{1 l}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} B_{2 l}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} B_{3 l}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \\ = & \frac{2}{N H} (\sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap O^{(l)}} + \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2} \cap O^{(l)}}) \\ [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ}) - d_{τ} B_{0 l} - \frac{d_{τ} B_{1 l}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} B_{2 l}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} B_{3 l}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \\ = & \frac{2}{N H} (\sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap {\tilde{O}}^{(l)}} + \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2} \cap {\tilde{O}}^{(l)}}) \\ [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ}) - d_{τ} B_{0 l} - \frac{d_{τ} B_{1 l}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} B_{2 l}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} B_{3 l}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \\ = & \frac{2}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1} \cap {\tilde{O}}^{(l)}} [- d_{τ} B_{0 l} - \frac{d_{τ} B_{1 l}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} B_{2 l}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} B_{3 l}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + \\ \frac{2}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2} \cap {\tilde{O}}^{(l)}} [d_{τ} - d_{τ} B_{0 l} - \frac{d_{τ} B_{1 l}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} B_{2 l}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} B_{3 l}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) 
\end{matrix}

(A8)

\begin{matrix} = & 2 d_{τ}^{2} \int \int \int_{Q^{(1 l)}} {[B_{0 l} + \frac{B_{1 l}}{ξ_{200}} u + \frac{B_{2 l}}{ξ_{020}} v + \frac{B_{3 l}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w + \\ 2 d_{τ}^{2} \int \int \int_{Q^{(2 l)}} {[1 - B_{0 l} - \frac{B_{1 l}}{ξ_{200}} u - \frac{B_{2 l}}{ξ_{020}} v - \frac{B_{3 l}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w \\ + o_{p} (1) \\ = & d_{τ}^{2} {(C_{τ}^{(l)})}^{2} + o_{p} (1), \end{matrix}

where

\begin{matrix} C_{τ}^{(l)} & = & (2 \int \int \int_{Q^{(1 l)}} {[B_{0 l} + \frac{B_{1 l}}{ξ_{200}} u + \frac{B_{2 l}}{ξ_{020}} v + \frac{B_{3 l}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w + \\ 2 \int \int \int_{Q^{(2 l)}} {[1 - B_{0 l} - \frac{B_{1 l}}{ξ_{200}} u - \frac{B_{2 l}}{ξ_{020}} v - \frac{B_{3 l}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w)^{1 / 2} . \end{matrix}

Then, by Equations (A6)–(A8), we have

e^{(l)} (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ}^{(l)})}^{2} + o_{p} (1) .

Similarly, we can prove that

e (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ})}^{2} + o_{p} (1),

where

\begin{matrix} C_{τ} & = & (\int \int \int_{Q^{(1)}} {[ξ_{000}^{(2)} + \frac{ξ_{100}^{(2)}}{ξ_{200}} u + \frac{ξ_{010}^{(2)}}{ξ_{020}} v + \frac{ξ_{001}^{(2)}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w + \\ \int \int \int_{Q^{(2)}} {[1 - ξ_{000}^{(2)} - \frac{ξ_{100}^{(2)}}{ξ_{200}} u - \frac{ξ_{010}^{(2)}}{ξ_{020}} v - \frac{ξ_{001}^{(2)}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w)^{1 / 2} . \end{matrix}

The main difference between this case and the previous case in the proof is in the derivation of the result of (A8). For

e (x, y; t)

, the corresponding result is

\begin{matrix} A_{3} (x, y; t) & = & \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k})} [f (x_{i}, y_{j}; t_{k}) - \hat{a} (x, y; t) - \hat{b} (x, y; t) (x_{i} - x) - \\ \hat{c} (x, y; t) (y_{j} - y) - \hat{d} (x, y; t) (t_{k} - t)]^{2} K_{i j k} \\ = & \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k})} [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ}) - d_{τ} ξ_{000}^{(2)} - \frac{d_{τ} ξ_{100}^{(2)}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} ξ_{010}^{(2)}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} ξ_{001}^{(2)}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \end{matrix}

\begin{matrix} = & \frac{1}{N H} (\sum_{(x_{i}, y_{j}; t_{k}) \in I^{1}} + \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2}}) \\ [f (x_{i}, y_{j}; t_{k}) - f_{-} (x_{τ}, y_{τ}; t_{τ}) - d_{τ} ξ_{000}^{(2)} - \frac{d_{τ} ξ_{100}^{(2)}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} ξ_{010}^{(2)}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} ξ_{001}^{(2)}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \\ = & \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{1}} [- d_{τ} ξ_{000}^{(2)} - \frac{d_{τ} ξ_{100}^{(2)}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} ξ_{010}^{(2)}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} ξ_{001}^{(2)}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + \\ \frac{1}{N H} \sum_{(x_{i}, y_{j}; t_{k}) \in I^{2}} [d_{τ} - d_{τ} ξ_{000}^{(2)} - \frac{d_{τ} ξ_{100}^{(2)}}{ξ_{200}} \frac{x_{i} - x}{h_{x}} - \\ \frac{d_{τ} ξ_{010}^{(2)}}{ξ_{020}} \frac{y_{j} - y}{h_{y}} - \frac{d_{τ} ξ_{001}^{(2)}}{ξ_{002}} \frac{t_{k} - t}{h_{t}}]^{2} K_{i j k} + o_{p} (1) \\ = & d_{τ}^{2} \int \int \int_{Q^{(1)}} {[ξ_{000}^{(2)} + \frac{ξ_{100}^{(2)}}{ξ_{200}} u + \frac{ξ_{010}^{(2)}}{ξ_{020}} v + \frac{ξ_{001}^{(2)}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w + \\ d_{τ}^{2} \int \int \int_{Q^{(2)}} {[1 - ξ_{000}^{(2)} - \frac{ξ_{100}^{(2)}}{ξ_{200}} u - \frac{ξ_{010}^{(2)}}{ξ_{020}} v - \frac{ξ_{001}^{(2)}}{ξ_{002}} w]}^{2} K (u, v) K (w) d u d v d w \\ + o_{p} (1) \\ = & d_{τ}^{2} {(C_{τ})}^{2} + o_{p} (1) . \end{matrix}

Appendix A.4. Proof of Theorem 3

For the case when

(x, y; t) \in Ω_{h} ∖ J_{h}

, the estimator

\hat{f} (x, y; t)

is one of

\hat{a} (x, y; t)

,

{\hat{a}}^{(1)} (x, y; t)

,

{\hat{a}}^{(2)} (x, y; t)

and

({\hat{a}}^{(1)} (x, y; t) + {\hat{a}}^{(2)} (x, y; t)) / 2

, all of which are consistent estimators of

f (x, y; t)

. Therefore, the result in the theorem holds in this case.

For the case when

(x, y; t) \in J_{h} ∖ S_{h}

, it is easy to see that we have either i)

e (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ})}^{2} + o_{p} (1)

,

e^{(1)} (x, y; t) = σ^{2} + o_{p} (1)

, and

e^{(2)} (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ}^{(2)})}^{2} + o_{p} (1)

, or ii)

e (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ})}^{2} + o_{p} (1)

,

e^{(1)} (x, y; t) = σ^{2} + d_{τ}^{2} {(C_{τ}^{(1)})}^{2} + o_{p} (1)

, and

e^{(2)} (x, y; t) = σ^{2} + o_{p} (1)

. In both cases, we have

D (x, y; t) = d_{τ}^{2} {(C_{τ})}^{2} + o_{p} (1)

. Therefore, asymptotically

D (x, y; t) > u

. Since

e^{(1)} (x, y; t) < e^{(2)} (x, y; t)

in case i), the estimator

\hat{f} (x, y; t)

is

{\hat{a}}^{(1)} (x, y; t)

in this case, which is a consistent estimator of

f (x, y; t)

. A similar result follows in case ii).
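The case analysis in this proof can be illustrated by a short Python sketch of the pointwise choice among the three local fits. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `select_estimate` and its argument names are hypothetical; the quantity D is taken to be max{e − e^(1), e − e^(2)}, which agrees with the asymptotic expressions used above but is not restated in this appendix; and the branch for D ≤ u, as well as the averaging of the two one-sided fits in the case of a tie, are assumed forms of the rule in (6).

```python
import numpy as np

def select_estimate(a, a1, a2, e, e1, e2, u):
    """Hypothetical helper: pick a fitted value at each point.

    a, a1, a2 : fits from the whole neighborhood and from its two halves
    e, e1, e2 : the corresponding weighted residual mean squares
    u         : threshold that D is compared with
    """
    a, a1, a2, e, e1, e2 = (
        np.asarray(v, dtype=float) for v in (a, a1, a2, e, e1, e2)
    )
    # Assumption: D is the larger gain of a one-sided fit over the full fit.
    D = np.maximum(e - e1, e - e2)
    # Among the one-sided fits, prefer the one with the smaller residual
    # mean square; average them when the two are tied (assumed tie rule).
    one_sided = np.where(e1 < e2, a1, a2)
    one_sided = np.where(e1 == e2, 0.5 * (a1 + a2), one_sided)
    # Assumption: use the full-neighborhood fit when D does not exceed u.
    return np.where(D > u, one_sided, a)

# Case i) of the proof: e ~ sigma^2 + d^2 C^2, e1 ~ sigma^2, e2 larger.
picked = select_estimate(0.5, 0.0, 1.0, 0.29, 0.04, 0.20, u=0.05)
print(float(picked))
```

In this toy call D = 0.25 exceeds u = 0.05 and e1 < e2, so the first one-sided fit is returned, mirroring case i). Because the function is written with array operations, the same call also works on whole arrays of fitted values.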

References

Zanter, K. Landsat 8 (L8) Data Users Handbook; Version 2; LSDS-1574; Department of the Interior, U.S. Geological Survey: Washington, DC, USA, 2016. Available online: https://landsat.usgs.gov/landsat-8-l8-data-users-handbook (accessed on 1 October 2020).
Qiu, P. Jump regression, image processing and quality control (with discussions). Qual. Eng. 2018, 30, 137–153.
Gonzalez, R.C.; Woods, R.E. Digital Image Processing, 4th ed.; Pearson: New York, NY, USA, 2018.
Qiu, P. Jump surface estimation, edge detection, and image restoration. J. Am. Stat. Assoc. 2007, 102, 745–756.
Geman, S.; Geman, D. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984, 6, 721–741.
Besag, J. Spatial interaction and the statistical analysis of lattice systems (with discussions). J. R. Stat. Soc. Ser. B 1974, 36, 192–236.
Fessler, J.A.; Erdogan, H.; Wu, W.B. Exact distribution of edge-preserving MAP estimators for linear signal models with Gaussian measurement noise. IEEE Trans. Image Process. 2000, 9, 1049–1055.
Perona, P.; Malik, J. Scale space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 629–639.
Weickert, J. Anisotropic Diffusion in Image Processing; Teubner: Stuttgart, Germany, 1998.
Beck, A.; Teboulle, M. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 2009, 18, 2419–2434.
Rudin, L.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D 1992, 60, 259–268.
Yuan, Q.; Zhang, L.; Shen, H. Hyperspectral Image Denoising Employing a Spectral–Spatial Adaptive Total Variation Model. IEEE Trans. Geosci. Remote Sens. 2012, 50, 3660–3677.
Chang, G.S.; Yu, B.; Vetterli, M. Spatially adaptive wavelet thresholding with context modeling for image denoising. IEEE Trans. Image Process. 2000, 9, 1522–1531.
Mrázek, P.; Weickert, J.; Steidl, G. Correspondences between wavelet shrinkage and nonlinear diffusion. In Scale Space Methods in Computer Vision; Griffin, L.D., Lillholm, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2003.
Gijbels, I.; Lambert, A.; Qiu, P. Edge-preserving image denoising and estimation of discontinuous surfaces. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1075–1087.
Qiu, P. Discontinuous regression surfaces fitting. Ann. Stat. 1998, 26, 2218–2245.
Qiu, P. Jump-preserving surface reconstruction from noisy data. Ann. Inst. Stat. Math. 2009, 61, 715–751.
Qiu, P.; Mukherjee, P.S. Edge structure preserving 3-D image denoising by local surface approximation. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 1457–1468.
Polzehl, J.; Spokoiny, V.G. Adaptive weights smoothing with applications to image restoration. J. R. Stat. Soc. Ser. B 2000, 62, 335–354.
Kervrann, C.; Boulanger, J. Optimal Spatial Adaptation for Patch-Based Image Denoising. IEEE Trans. Image Process. 2006, 15, 2866–2878.
Jain, P.; Tyagi, V. A survey of edge-preserving image denoising methods. Inf. Syst. Front. 2016, 18, 159–170.
Qiu, P. Image Processing and Jump Regression Analysis; John Wiley & Sons: New York, NY, USA, 2005.
Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman and Hall: New York, NY, USA, 1996.
Altman, N.S. Kernel smoothing of data with correlated errors. J. Am. Stat. Assoc. 1990, 85, 749–759.
Opsomer, J.; Wang, Y.; Yang, Y. Nonparametric regression with correlated errors. Stat. Sci. 2001, 16, 134–153.
De Brabanter, K.; De Brabanter, J.; Suykens, J.A.K.; De Moor, B. Kernel regression in the presence of correlated errors. J. Mach. Learn. Res. 2011, 12, 1955–1976.
Welvaert, M.; Durnez, J.; Moerkerke, B.; Verdoolaege, G.; Rosseel, Y. neuRosim: An R package for generating fMRI data. J. Stat. Softw. 2011, 44, 1–18.
Hall, P.; Qiu, P. Blind deconvolution and deblurring in image analysis. Stat. Sin. 2007, 17, 1483–1509.
Svoboda, D.; Ulman, V.; Kováč, P.; Šalingová, B.; Tesařová, L.; Koutná, I.K.; Matula, P. Vascular network formation in silico using the extended cellular Potts model. IEEE Int. Conf. Image Process. 2016, 3180–3183.
Shuford, W.D.; Warnock, N.; Molina, K.C.; Sturm, K. The Salton Sea as critical habitat to migratory and resident waterbirds. Hydrobiologia 2002, 473, 255–274.
Davydov, Y.A. Convergence of Distributions Generated by Stationary Stochastic Process. Theory Probab. Its Appl. 1968, 13, 691–696.

Figure 1. Two Landsat images of the Las Vegas area taken in 1984 (left panel) and 2007 (right panel).

Figure 2. The neighborhood

O (x, y; t)

is divided into two parts by a plane that passes through

(x, y; t)

and is perpendicular to the estimated gradient direction

\hat{G} (x, y; t)

.

Figure 3. (a) The true image intensity function

f (x, y; t)

at

t = 0.01

(left) and

t = 0.25

(right). (b) The temporal profile

f (0.25, 0.25; t)

when t changes in

[0, 1]

.

Figure 4. The first column shows the observed images at

t = 0.5

when

σ = 0.2

and

ρ = 0.1

(1st row), 0.3 (2nd row), and 0.5 (3rd row). The second to sixth columns show the images denoised by LLK-C, LLK, GLQ, NEW-C, and NEW, respectively.

Figure 5. The 1st, 50th, and 100th cell images of the image sequence describing a vasculogenesis process.

Figure 6. The first column shows the 50th observed cell image after spatio-temporally correlated noise with

ρ = 0.1

(1st row), 0.3 (2nd row), or 0.5 (3rd row) has been added. The second to sixth columns show the images denoised by LLK-C, LLK, GLQ, NEW-C, and NEW, respectively.

Figure 7. The first image is the observed Landsat image of the Salton Sea region taken on 28 April 2001 after spatio-temporally correlated noise with

σ = 0.3

and

ρ = 0.3

has been added. The second to sixth images are its denoised versions by LLK-C, LLK, GLQ, NEW-C, and NEW, respectively.

Table 1. In each entry, MSE of

\hat{f} (x, y; t)

in (6) is presented in the first line with its standard error (in parentheses); the corresponding values of

(h_{x}, h_{t}, u)

chosen by the modified CV procedure (7) and (8) are presented in the second line; the optimal MSE is presented in the third line with its standard error (in parentheses); the optimal values of

(h_{x y}, h_{t}, u)

are presented in the fourth line. MSE in the table has been multiplied by

10^{3}

and standard error has been multiplied by

10^{5}

.

		$n_{t} = 50$		$n_{t} = 100$
$σ$	$ρ$	$n_{x} = 64$	$n_{x} = 128$	$n_{x} = 64$	$n_{x} = 128$
0.1	0.1	$0.65 (0.80)$	$0.30 (0.25)$	$0.48 (0.43)$	$0.26 (0.10)$
		(0.03, 0.10, 0.05)	(0.03, 0.08, 0.025)	(0.03, 0.10, 0.05)	(0.02, 0.07, 0.05)
		$0.32 (0.46)$	$0.20 (0.14)$	$0.37 (0.36)$	$0.19 (0.08)$
		(0.04, 0.07, 0.025)	(0.03, 0.05, 0.025)	(0.03, 0.08, 0.025)	(0.02, 0.05, 0.025)
	0.3	$0.60 (0.45)$	$0.33 (0.16)$	$0.59 (0.39)$	$0.33 (0.15)$
		(0.04, 0.10, 0.05)	(0.03, 0.07, 0.025)	(0.03, 0.10, 0.05)	(0.02, 0.07, 0.025)
		$0.49 (0.35)$	$0.30 (0.16)$	$0.50 (0.37)$	$0.29 (0.22)$
		(0.04, 0.08, 0.025)	(0.03, 0.06, 0.025)	(0.03, 0.08, 0.025)	(0.03, 0.04, 0.025)
	0.5	$1.25 (1.24)$	$0.80 (0.22)$	$0.81 (0.55)$	$0.64 (0.21)$
		(0.03, 0.10, 0.05)	(0.02, 0.07, 0.025)	(0.03, 0.10, 0.05)	(0.02, 0.04, 0.025)
		$0.77 (0.65)$	$0.49 (0.24)$	$0.74 (0.46)$	$0.45 (0.25)$
		(0.04, 0.09, 0.025)	(0.03, 0.06, 0.025)	(0.03, 0.09, 0.025)	(0.03, 0.04, 0.025)
0.2	0.1	$1.14 (1.13)$	$0.68 (0.38)$	$1.02 (0.74)$	$0.56 (0.26)$
		(0.04, 0.10, 0.025)	(0.03, 0.08, 0.025)	(0.04, 0.10, 0.025)	(0.03, 0.07, 0.025)
		$1.11 (0.86)$	$0.66 (0.33)$	$0.93 (0.71)$	$0.54 (0.31)$
		(0.04, 0.09, 0.025)	(0.03, 0.07, 0.025)	(0.04, 0.08, 0.025)	(0.03, 0.05, 0.025)
	0.3	$1.69 (0.91)$	$1.03 (0.54)$	$1.32 (1.08)$	$0.78 (0.41)$
		(0.04, 0.10, 0.025)	(0.03, 0.08, 0.025)	(0.04, 0.10, 0.025)	(0.03, 0.07, 0.025)
		$1.69 (1.24)$	$1.03 (0.54)$	$1.29 (1.12)$	$0.78 (0.41)$
		(0.04, 0.11, 0.025)	(0.03, 0.08, 0.025)	(0.04, 0.09, 0.025)	(0.03, 0.07, 0.025)
	0.5	$3.25 (1.74)$	$2.88 (0.78)$	$1.95 (1.85)$	$2.61 (0.58)$
		(0.04, 0.07, 0.025)	(0.02, 0.07, 0.025)	(0.04, 0.09, 0.025)	(0.02, 0.04, 0.025)
		$2.59 (2.23)$	$1.54 (1.32)$	$1.91 (1.78)$	$1.21 (0.43)$
		(0.05, 0.10, 0.025)	(0.04, 0.09, 0.025)	(0.04, 0.11, 0.025)	(0.03, 0.08, 0.025)
0.3	0.1	$2.32 (1.91)$	$1.26 (1.03)$	$1.59 (0.81)$	$0.92 (0.34)$
		(0.05, 0.13, 0.025)	(0.04, 0.09, 0.025)	(0.04, 0.11, 0.025)	(0.03, 0.08, 0.025)
		$2.28 (2.58)$	$1.26 (1.03)$	$1.59 (0.65)$	$0.92 (0.34)$
		(0.05, 0.11, 0.025)	(0.04, 0.09, 0.025)	(0.04, 0.10, 0.025)	(0.03, 0.08, 0.025)
	0.3	$3.15 (2.28)$	$1.72 (1.37)$	$2.26 (1.53)$	$1.36 (0.50)$
		(0.05, 0.13, 0.025)	(0.04, 0.09, 0.025)	(0.04, 0.11, 0.025)	(0.03, 0.08, 0.025)
		$3.14 (2.45)$	$1.71 (1.52)$	$2.21 (1.31)$	$1.33 (0.41)$
		(0.05, 0.14, 0.025)	(0.04, 0.10, 0.025)	(0.04, 0.13, 0.025)	(0.04, 0.09, 0.025)
	0.5	$6.78 (3.46)$	$6.81 (2.00)$	$4.18 (2.72)$	$6.33 (1.43)$
		(0.04, 0.09, 0.05)	(0.02, 0.07, 0.05)	(0.04, 0.10, 0.025)	(0.02, 0.04, 0.05)
		$4.46 (4.94)$	$2.48 (2.38)$	$3.18 (3.42)$	$1.88 (0.56)$
		(0.06, 0.16, 0.025)	(0.05, 0.11, 0.025)	(0.05, 0.14, 0.025)	(0.04, 0.10, 0.025)

Table 2. In each entry, the first line is the MSE value with its standard error (in parentheses), and the second line is the EP value. MSE values in the table are in units of

10^{-3}

and the standard error values are in units of

10^{-5}

.

$σ$	$ρ$	LLK-C	LLK	GLQ	NEW-C	NEW
0.1	0.1	$2.06 (0.08)$	$2.10 (0.06)$	$0.60 (0.18)$	$0.24 (0.11)$	$0.26 (0.10)$
		$73.68 %$	$18.43 %$	$28.24 %$	$12.32 %$	$7.48 %$
	0.3	$3.04 (0.14)$	$2.28 (0.09)$	$0.95 (0.18)$	$2.93 (0.40)$	$0.33 (0.15)$
		$124.48 %$	$34.40 %$	$43.69 %$	$131.28 %$	$10.58 %$
	0.5	$3.89 (0.24)$	$3.23 (0.21)$	$1.42 (0.42)$	$3.77 (0.48)$	$0.64 (0.21)$
		$141.47 %$	$95.86 %$	$57.40 %$	$148.17 %$	$28.86 %$
0.2	0.1	$4.16 (0.25)$	$2.93 (0.15)$	$1.51 (0.38)$	$0.86 (0.25)$	$0.56 (0.26)$
		$142.65 %$	$51.78 %$	$54.40 %$	$39.01 %$	$9.14 %$
	0.3	$9.39 (0.52)$	$3.67 (0.25)$	$2.87 (0.51)$	$9.60 (0.78)$	$0.78 (0.41)$
		$291.31 %$	$82.84 %$	$94.59 %$	$295.72 %$	$15.08 %$
	0.5	$12.80 (0.94)$	$11.21 (0.86)$	$7.75 (1.32)$	$13.12 (1.16)$	$2.61 (0.58)$
		$326.38 %$	$289.71 %$	$203.86 %$	$334.62 %$	$84.24 %$
0.3	0.1	$7.88 (0.57)$	$3.94 (0.26)$	$3.17 (0.86)$	$1.01 (0.37)$	$0.92 (0.34)$
		$235.43 %$	$82.24 %$	$73.18 %$	$23.36 %$	$15.41 %$
	0.3	$19.97 (1.15)$	$5.56 (0.50)$	$12.36 (0.63)$	$19.97 (1.16)$	$1.36 (0.50)$
		$461.12 %$	$133.33 %$	$261.31 %$	$461.13 %$	$25.78 %$
	0.5	$27.64 (2.09)$	$23.75 (1.92)$	$15.75 (1.71)$	$28.04 (2.29)$	$6.33 (1.43)$
		$514.22 %$	$458.82 %$	$292.50 %$	$518.16 %$	$144.58 %$

Table 3. Results for denoising a sequence of 100 cell images. In each entry, the first line is the MSE value with its standard error (in parentheses), and the second line is the EP value. MSE values in the table are in units of

10^{-3}

and the standard errors are in units of

10^{-5}

.

$σ$	$ρ$	LLK-C	LLK	GLQ	NEW-C	NEW
0.1	0.1	$1.69 (0.11)$	$0.97 (0.08)$	$1.67 (0.12)$	$1.69 (0.12)$	$1.35 (0.12)$
		$63.30 %$	$5.53 %$	$18.88 %$	$63.31 %$	$18.52 %$
	0.3	$2.36 (0.16)$	$1.43 (0.14)$	$1.94 (0.18)$	$2.36 (0.16)$	$1.51 (0.19)$
		$77.54 %$	$31.64 %$	$25.72 %$	$77.55 %$	$7.28 %$
	0.5	$3.21 (0.25)$	$2.82 (0.24)$	$2.28 (0.29)$	$3.21 (0.25)$	$1.92 (0.31)$
		$88.68 %$	$75.95 %$	$30.68 %$	$88.68 %$	$10.11 %$
0.2	0.1	$3.22 (17.00)$	$1.47 (5.54)$	$3.93 (0.29)$	$3.22 (17.00)$	$1.67 (0.25)$
		$85.64 %$	$13.57 %$	$76.53 %$	$85.64 %$	$16.28 %$
	0.3	$8.71 (0.56)$	$2.34 (0.35)$	$5.00 (0.43)$	$8.71 (0.56)$	$2.17 (0.45)$
		$189.74 %$	$42.07 %$	$91.44 %$	$189.75 %$	$4.88 %$
	0.5	$12.12 (0.94)$	$10.35 (0.88)$	$6.41 (0.86)$	$12.14 (0.96)$	$4.48 (0.90)$
		$213.90 %$	$187.93 %$	$102.68 %$	$214.07 %$	$59.86 %$
0.3	0.1	$3.16 (0.50)$	$2.01 (0.28)$	$5.47 (0.53)$	$3.16 (0.50)$	$1.93 (0.40)$
		$47.15 %$	$22.46 %$	$54.20 %$	$47.15 %$	$10.91 %$
	0.3	$19.30 (1.23)$	$4.29 (0.71)$	$10.11 (0.85)$	$19.30 (1.23)$	$2.82 (0.77)$
		$308.32 %$	$79.75 %$	$161.91 %$	$308.32 %$	$14.37 %$
	0.5	$26.96 (2.09)$	$22.88 (1.95)$	$13.36 (1.82)$	$27.00 (2.13)$	$8.75 (1.85)$
		$345.91 %$	$306.28 %$	$180.35 %$	$346.14 %$	$113.48 %$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
