Article

Model-Based and Model-Free Point Prediction Algorithms for Locally Stationary Random Fields

1 School of Mathematical and Data Sciences, West Virginia University, Morgantown, WV 26506, USA
2 MIT Sloan School of Management, Cambridge, MA 02142, USA
3 Department of Mathematics and Halicioglu Data Science Institute, University of California—San Diego, La Jolla, CA 92093, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(15), 8877; https://doi.org/10.3390/app13158877
Submission received: 10 June 2023 / Revised: 19 July 2023 / Accepted: 25 July 2023 / Published: 1 August 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The Model-Free Prediction Principle has been successfully applied to general regression problems, as well as to problems involving stationary and locally stationary time series. In this paper, we demonstrate how Model-Free Prediction can be applied to handle random fields that are only locally stationary, such as pixel values over an image or satellite data observed on an ocean surface, i.e., fields that can be assumed to be stationary only over a limited part of their entire region of definition. We construct novel one-step-ahead Model-Based and Model-Free point predictors and compare their performance using synthetic data as well as images from the CIFAR-10 dataset. In the latter case, we demonstrate that our best Model-Free point prediction results outperform those obtained using Model-Based prediction.

1. Introduction

Consider a real-valued random field dataset {Y_t̲, t̲ ∈ Z²} defined over a 2D index-set D, e.g., pixel values over an image or satellite data observed on an ocean surface. It may be unrealistic to assume that the stochastic structure of such a random field Y_t̲ has stayed invariant over the entire region of definition D; hence, we cannot assume that {Y_t̲} is stationary. It is therefore more realistic to assume a slowly changing stochastic structure, i.e., a locally stationary model. The theory of locally stationary time series and parametric and nonparametric methods for their estimation have been covered extensively in the literature, including references [1,2,3,4,5,6,7,8]. In [9], we proposed Model-Based and Model-Free algorithms for point prediction and prediction intervals of locally stationary time series and demonstrated their application to both synthetic and real-life datasets. Our work in this paper extends this framework to point prediction over locally stationary random fields, with applications involving synthetic and real-life image data.
In the context of random fields, two principal modeling approaches are usually followed in order to perform estimation. In the first approach, used in fields of study such as econometrics and ecology where the sampling points can be irregular, the random field data {Y_s̲, s̲ ∈ S} is defined over a continuous subset S of R^d. Modeling strategies for such non-uniformly spaced spatial data have been discussed in [10,11]. Kernel estimation for locally stationary random fields defined over such irregularly spaced locations has been proposed in [12,13], while autoregressive estimation of similarly defined locally stationary random fields has been proposed in [14]. In the second approach, the random field {Y_t̲, t̲ ∈ S} is defined over a regularly spaced grid S ⊂ Z^d. Examples of such applications arise in fields of study such as image processing and radiography. Two-dimensional (2D) autoregressive models for such random fields covering various regions of support have been proposed in [15]. Autoregressive estimation of such data regularly spaced on a lattice has also been discussed in [16]. Applications of 2D autoregressive models for the analysis and synthesis of textural images are shown in [17]. Local linear nonparametric estimators of such random fields and their theoretical properties have been discussed in [18,19]. In this paper, we assume a locally stationary model for random fields Y_t̲ ∈ R defined over t̲ ∈ S where S ⊂ Z^d, d = 2. Given data Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n}, our objective is to perform point prediction for a future unobserved data point Y_{t̲_{n+1}}. Here, t̲_1, t̲_2, …, t̲_n, t̲_{n+1} ∈ Z² denote the coordinates of the random field over the 2D index set D, and the notion of a future datapoint at a coordinate t̲ ∈ Z² of a random field for purposes of predictive inference is defined in Section 2.
The usual approach for dealing with locally stationary series is to assume that the data can be decomposed as the sum of three components:
μ(t̲) + S_t̲ + W_t̲
where μ(t̲) is a deterministic trend function, S_t̲ is a seasonal (periodic) series, and {W_t̲} is (strictly) stationary with mean zero. This type of decomposition has been proposed for time series [20] and can also be used for the decomposition of locally stationary random field data. The seasonal (periodic) component, be it random or deterministic, can be easily estimated and removed; once that is done, the ‘classical’ decomposition simplifies to the following model with an additive trend, i.e.,
Y_t̲ = μ(t̲) + W_t̲  (1)
which can be generalized to accommodate a coordinate-changing variance as well, i.e.,
Y_t̲ = μ(t̲) + σ(t̲) W_t̲.  (2)
In both models above, the series {W_t̲} is assumed to be (strictly) stationary, weakly dependent (e.g., strong mixing), and to satisfy E(W_t̲) = 0; in model (2), it is also assumed that Var(W_t̲) = 1. The deterministic functions μ(·) and σ(·) are unknown and can be assumed to belong to a class of functions that is either finite-dimensional (parametric) or not (nonparametric). In this paper, we focus on the nonparametric case and assume that μ(·) and σ(·) have some degree of smoothness, i.e., change smoothly (and slowly) with t̲.
Models (1) and (2) can be used to capture the first two moments of the locally stationary random field; however, it may be the case that the skewness and/or kurtosis of Y_t̲ changes with t̲. In addition, it may also be the case that the correlation Corr(Y_{t̲_j}, Y_{t̲_{j+1}}) changes smoothly (and slowly) with t̲_j ∈ Z². To address this more general case, we propose a methodology for point prediction of locally stationary random fields that does not rely on simple additive models such as (1) and (2). This is accomplished by using the Model-Free Prediction Principle of [21,22]. The key to Model-Free inference is to be able to construct an invertible transformation H_n : Y̲_{t̲_n} ↦ ϵ̲_n, where Y̲_{t̲_n} = (Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n}) denotes the random field data under consideration and ϵ̲_n = (ϵ_1, …, ϵ_n) is a random vector with i.i.d. components.
The rest of the paper is arranged as follows. In Section 2, we set up the framework for defining the causality of random fields in order to enable us to perform point prediction. In Section 3, we visit the problem of Model-Based inference and develop a point prediction methodology for locally stationary random fields. In Section 4, we construct the framework for point prediction of locally stationary random fields using Model-Free inference. In Section 5, we describe how cross-validation can be used to determine the optimal bandwidths for both Model-Based and Model-Free inference. Finally, in Section 6, using finite sample experiments we compare the two novel approaches namely Model-Based of Section 3 and Model-Free of Section 4 using synthetic and real-life data.

2. Causality of Random Fields

Given the random field observations Y_{t̲_1}, …, Y_{t̲_n}, our goal is to perform predictive inference for the “next” unknown datapoint Y_{t̲_{n+1}}. In this context, a definition of causality is necessary to specify the random field coordinate t̲_{n+1} where predictive inference will be performed. For this purpose, we adopt the framework proposed in [15] and consider the random fields discussed in this paper to be defined over a subset of the nonsymmetric half-plane (NSHP), denoted as H. Figure 1 shows an NSHP centered at (0, 0). The NSHP can also be centered at any other point t̲ as follows:
NSHP(t̲) = { t̲ + s̲ : s̲ ∈ NSHP((0, 0)) }  (3)
Such nonsymmetric half-planes have been used previously for specifying causal 2D AR models [15]. In such cases, a causal 2D AR model with H_p ⊂ H can be defined as in Equation (4) below, where the set H_p is termed the region of support (ROS) of the 2D AR model. Here, H_p = {(j, k) | j = 1, 2, …, p and k = 0, ±1, …, ±p} ∪ {(0, k) | k = 1, 2, …, p}, and v_{t_1,t_2} is a 2D white noise process with mean 0 and variance σ² > 0.
Y_{t_1,t_2} = Σ_{(j,k) ∈ H_p} β_{j,k} Y_{t_1−j, t_2−k} + v_{t_1,t_2}  (4)
Based on [23], a 2D AR process with ROS S is causal if there exists a subset C of Z² satisfying the following conditions:
  • The set C consists of two rays emanating from the origin together with the points lying between the rays;
  • The angle between the two rays is strictly less than 180 degrees;
  • S ⊆ C.
In this case, since H_p ⊂ H satisfies these conditions, the 2D AR process denoted by (4) is causal. We use this framework to describe a causal random field defined over the NSHP and to perform predictive inference on it. Given this, our setup for point prediction of random fields is described below.
Consider random field data {Y_t̲, t̲ ∈ E}, where E can be any finite subset of Z², e.g., the rectangular grid E_n̲ = { t̲ = (t_1, t_2) ∈ Z² : 0 ≤ t_1 ≤ n_1, 0 ≤ t_2 ≤ n_2 } with n̲ = (n_1, n_2). Our goal is predictive inference at t̲ = (t_1, t_2), where 0 < t_1 < n_1 and 0 < t_2 < n_2. This “future” value Y_{t_1,t_2} is determined using data defined over the region shown in Figure 2:
E_{t̲,n̲} = NSHP(t̲) ∩ E_n̲
Both Model-Based and Model-Free causal inference for Y_{t_1,t_2} are performed using the data specified over this region E_{t̲,n̲}. We consider predictive inference at Y_t̲ = Y_{t_1,t_2} given the data (Y_s̲ : s̲ ≺ t̲ and s̲ ∈ E_{t̲,n̲}), where the symbol ≺ denotes lexicographic ordering on the region of support of the random field, i.e., (a_k, b_k) ≺ (a_{k+1}, b_{k+1}) if and only if either a_k < a_{k+1}, or a_k = a_{k+1} and b_k < b_{k+1} [15]. In the subsequent discussion, the lexicographically ordered “past” data Y_s̲ will be denoted as Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n}, and point prediction will be performed at Y_t̲ = Y_{t̲_{n+1}}.
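To make the region and the ordering concrete, the sketch below enumerates one possible encoding of E_{t̲,n̲} in lexicographic order; the function name nshp_region and the orientation of the half-plane are illustrative assumptions, not the paper's own code.

```r
# Minimal sketch (hypothetical helper): enumerate E_{t,n} = NSHP(t) ∩ E_n
# in lexicographic order.  The orientation of the half-plane is one possible
# convention; the authors' R code may differ.
nshp_region <- function(t, n_lim) {
  # t = c(t1, t2): coordinate where prediction is performed
  # n_lim = c(n1, n2): the grid E_n is taken here as 1..n1 x 1..n2
  pts <- expand.grid(a = 1:n_lim[1], b = 1:n_lim[2])
  # NSHP(t): all rows preceding t1, plus the part of row t1 up to column t2
  keep <- (pts$a < t[1]) | (pts$a == t[1] & pts$b <= t[2])
  region <- pts[keep, ]
  # lexicographic ordering: (a_k, b_k) precedes (a_{k+1}, b_{k+1}) iff
  # a_k < a_{k+1}, or a_k = a_{k+1} and b_k < b_{k+1}
  region[order(region$a, region$b), ]
}

# Example: the last row returned is the prediction coordinate itself.
tail(nshp_region(c(3, 2), c(4, 4)), 3)
```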

3. Model-Based Point Prediction

We adopt the coordinate-changing mean and variance model given by Equation (2). The L₂-optimal predictor of Y_{t̲_{n+1}} given the data Y̲_{t̲_n} = (Y_{t̲_1}, …, Y_{t̲_n}) is the conditional expectation E(Y_{t̲_{n+1}} | Y̲_{t̲_n}). Using model (2) and assuming that W̲_{t̲_n} is weakly dependent, it can be shown that [9]:
E(Y_{t̲_{n+1}} | Y̲_{t̲_n}) = μ(t̲_{n+1}) + σ(t̲_{n+1}) E(W_{t̲_{n+1}} | W̲_{t̲_n}).  (5)
From the above equation, we can see that for Model-Based point prediction we need to estimate the conditional expectation E(W_{t̲_{n+1}} | W̲_{t̲_n}) as well as the coordinate-changing trend and variance, i.e., μ(t̲_{n+1}) and σ(t̲_{n+1}).
Estimating the conditional expectation: This is performed by fitting a (causal) AR(p, q) model to the data W̲_{t̲_n} = (W_{t̲_1}, …, W_{t̲_n}), with p, q chosen by minimizing AIC, BIC or a related criterion as described in [15]. Using this framework involves estimating the coefficients of the following 2D AR model defined over a ROS H_p as described in Section 2:
W_{r,s} = Σ_{(j,k) ∈ H_p} β_{j,k} W_{r−j, s−k} + v_{r,s}  (6)
Here, v_{r,s} is a 2D white noise process, i.e., an uncorrelated sequence, with mean 0 and variance σ² > 0. For our point prediction problem, this implies that
Ē(W_{t̲_{n+1}} | W̲_{t̲_n}) = Σ_{(j,k) ∈ H_p} β_{j,k} W_{t̲_{n+1} − (j,k)}  (7)
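As a concrete illustration of the sum over the ROS H_p, the following sketch evaluates the 2D AR prediction, assuming the coefficients β_{j,k} have already been estimated (e.g., via the 2D Yule–Walker equations); the matrix layout of beta and the boundary handling are illustrative assumptions.

```r
# Minimal sketch: one-step 2D AR prediction of W at (t1, t2) over the ROS H_p.
# W: matrix of (approximately) stationary residuals W-hat.
# beta: (p + 1) x (2p + 1) matrix with beta[j + 1, k + p + 1] = beta_{j,k}.
ar2d_predict <- function(W, t1, t2, beta, p) {
  pred <- 0
  for (j in 0:p) {
    ks <- if (j == 0) 1:p else (-p):p        # H_p as defined in Section 2
    for (k in ks) {
      r <- t1 - j
      s <- t2 - k
      # terms falling outside the observed region are simply dropped here
      if (r >= 1 && s >= 1 && s <= ncol(W)) {
        pred <- pred + beta[j + 1, k + p + 1] * W[r, s]
      }
    }
  }
  pred
}
```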
Estimating the trend and variance: This can be performed by kernel smoothing [24,25,26] using 2D kernels, i.e., Nadaraya–Watson (NW) estimation. In addition, since predicting Y_{t̲_{n+1}} is essentially a boundary problem, it is also possible to use local linear fitting, which has been reported to have smaller bias than kernel smoothing for such estimation problems [26,27,28]. For time series problems {Y_t, t ∈ Z}, local linear nonparametric estimation approximates the trend locally by a straight line, whereas for the random fields {Y_t̲, t̲ ∈ Z²} discussed in this paper, local linear estimation approximates the trend locally by a plane.
In order to estimate E(W_{t̲_{n+1}} | W̲_{t̲_n}), the stationary data W_{t̲_1}, …, W_{t̲_n} need to be estimated first. In this case, W_t̲ has to be calculated in a one-sided manner for all points, including those at the center of the dataset; otherwise, the obtained values Ŵ_{t̲_1}, …, Ŵ_{t̲_n} will not be stationary, which leads to incorrect estimation of the conditional expectation of W_{t̲_{n+1}}. The one-sided kernel-smoothed and local linear estimators used for calculating Ŵ_{t̲_1}, …, Ŵ_{t̲_n} can each be defined in two ways, as shown in the equations below for NW–Regular, NW–Predictive, LL–Regular and LL–Predictive fitting. Here, the bandwidth parameter b is assumed to satisfy
b → ∞ as n → ∞ but b/n → 0.  (8)
We will assume throughout that K(·) is a nonnegative, symmetric 2D Gaussian kernel function whose bandwidth matrix is diagonal, with the diagonal values set to the bandwidth b and the off-diagonal terms set to 0. The random field data is denoted as Y_{t̲_1}, …, Y_{t̲_k}, …, Y_{t̲_n}.
  • NW–Regular fitting: Let t̲_k ∈ [t̲_1, t̲_n], and define
    μ̂(t̲_k) = Σ_{i=1}^{k} Y_{t̲_i} K̂((t̲_k − t̲_i)/b) and M̂(t̲_k) = Σ_{i=1}^{k} Y_{t̲_i}² K̂((t̲_k − t̲_i)/b)  (9)
    where
    σ̂(t̲_k) = √( M̂(t̲_k) − μ̂(t̲_k)² ) and K̂((t̲_k − t̲_i)/b) = K((t̲_k − t̲_i)/b) / Σ_{j=1}^{k} K((t̲_k − t̲_j)/b).  (10)
    Using μ̂(t̲_k) and σ̂(t̲_k), we can now define the fitted residuals by
    Ŵ_{t̲_k} = ( Y_{t̲_k} − μ̂(t̲_k) ) / σ̂(t̲_k) for t̲_k = t̲_1, …, t̲_n.  (11)
  • NW–Predictive fitting:
    μ̃(t̲_k) = Σ_{i=1}^{k−1} Y_{t̲_i} K̃((t̲_k − t̲_i)/b) and M̃(t̲_k) = Σ_{i=1}^{k−1} Y_{t̲_i}² K̃((t̲_k − t̲_i)/b)  (12)
    where
    σ̃(t̲_k) = √( M̃(t̲_k) − μ̃(t̲_k)² ) and K̃((t̲_k − t̲_i)/b) = K((t̲_k − t̲_i)/b) / Σ_{j=1}^{k−1} K((t̲_k − t̲_j)/b).  (13)
    Using μ̃(t̲_k) and σ̃(t̲_k), we can now define the predictive residuals by
    W̃_{t̲_k} = ( Y_{t̲_k} − μ̃(t̲_k) ) / σ̃(t̲_k) for t̲_k = t̲_1, …, t̲_n.  (14)
Similarly, the one-sided local linear (LL) fitting estimators of μ ( t ̲ k ) and σ ( t ̲ k ) can be defined in two ways.
  • LL–Regular fitting: Let t̲_k ∈ [t̲_1, t̲_n], and define
    μ̂(t̲_k) = Σ_{j=1}^{k} w_j Y_{t̲_j} / ( Σ_{j=1}^{k} w_j + n⁻² ) and M̂(t̲_k) = Σ_{j=1}^{k} w_j Y_{t̲_j}² / ( Σ_{j=1}^{k} w_j + n⁻² )  (15)
    Denoting
    a̲ = (a_1, a_2) = t̲_j − t̲_k  (16)
    s_{t_1,1} = Σ_{j=1}^{k} K((t̲_j − t̲_k)/b) a_1  (17)
    s_{t_2,1} = Σ_{j=1}^{k} K((t̲_j − t̲_k)/b) a_2  (18)
    s_{t_1,2} = Σ_{j=1}^{k} K((t̲_j − t̲_k)/b) a_1²  (19)
    s_{t_2,2} = Σ_{j=1}^{k} K((t̲_j − t̲_k)/b) a_2²  (20)
    s_{t_1,t_2} = Σ_{j=1}^{k} K((t̲_j − t̲_k)/b) a_1 a_2  (21)
    w_j = K((t̲_j − t̲_k)/b) { (s_{t_1,2} s_{t_2,2} − s_{t_1,t_2}²) − a_1 (s_{t_1,1} s_{t_2,2} − s_{t_2,1} s_{t_1,t_2}) + a_2 (s_{t_1,1} s_{t_1,t_2} − s_{t_1,2} s_{t_2,1}) }  (22)
    The term n⁻² in Equation (15) is just to ensure that the denominator is not zero; see [29]. Equation (10) then yields σ̂(t̲_k), and Equation (11) yields Ŵ_{t̲_k}.
  • LL–Predictive fitting:
    μ̃(t̲_k) = Σ_{j=1}^{k−1} w_j Y_{t̲_j} / ( Σ_{j=1}^{k−1} w_j + n⁻² ) and M̃(t̲_k) = Σ_{j=1}^{k−1} w_j Y_{t̲_j}² / ( Σ_{j=1}^{k−1} w_j + n⁻² )  (23)
    where
    a̲ = (a_1, a_2) = t̲_j − t̲_k  (24)
    s_{t_1,1} = Σ_{j=1}^{k−1} K((t̲_j − t̲_k)/b) a_1  (25)
    s_{t_2,1} = Σ_{j=1}^{k−1} K((t̲_j − t̲_k)/b) a_2  (26)
    s_{t_1,2} = Σ_{j=1}^{k−1} K((t̲_j − t̲_k)/b) a_1²  (27)
    s_{t_2,2} = Σ_{j=1}^{k−1} K((t̲_j − t̲_k)/b) a_2²  (28)
    s_{t_1,t_2} = Σ_{j=1}^{k−1} K((t̲_j − t̲_k)/b) a_1 a_2  (29)
    w_j = K((t̲_j − t̲_k)/b) { (s_{t_1,2} s_{t_2,2} − s_{t_1,t_2}²) − a_1 (s_{t_1,1} s_{t_2,2} − s_{t_2,1} s_{t_1,t_2}) + a_2 (s_{t_1,1} s_{t_1,t_2} − s_{t_1,2} s_{t_2,1}) }  (30)
    Equation (13) then yields σ̃(t̲_k), and Equation (14) yields W̃_{t̲_k}.
Using one of the above four methods (NW vs. LL, regular vs. predictive) gives estimates of the quantities needed to compute the L 2 –optimal predictor (5). The bandwidth b in all four algorithms can be determined by cross-validation as described in Section 5.
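The sketch below illustrates one of the four variants, NW–Regular fitting with the one-sided 2D Gaussian kernel; the function name and the data layout (a coordinate matrix in lexicographic order) are illustrative assumptions rather than the authors' code.

```r
# Minimal sketch of NW-Regular fitting (Equations (9)-(11)) at index k.
# y: field values in lexicographic order; coords: n x 2 matrix of coordinates.
nw_regular <- function(y, coords, k, b) {
  d  <- sweep(coords[1:k, , drop = FALSE], 2, coords[k, ])  # t_i - t_k
  Kw <- exp(-rowSums(d^2) / (2 * b^2))        # 2D Gaussian kernel, diagonal bandwidth b
  Kw <- Kw / sum(Kw)                          # normalized weights K-hat
  mu_hat <- sum(Kw * y[1:k])
  M_hat  <- sum(Kw * y[1:k]^2)
  sigma_hat <- sqrt(max(M_hat - mu_hat^2, .Machine$double.eps))
  W_hat <- (y[k] - mu_hat) / sigma_hat        # fitted residual at t_k
  c(mu = mu_hat, sigma = sigma_hat, W = W_hat)
}
# The predictive version simply replaces 1:k by 1:(k - 1) in the sums above.
```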

4. Model-Free Point Prediction

For the Model-Based case, Equation (2) accounts for a spatially changing mean and variance of Y_t̲. More generally, however, it may happen that the random field {Y_t̲, t̲ ∈ Z²} has a nonstationarity in its third (or higher) moment, and/or in some other feature of its mth marginal distribution. This is addressed by the Model-Free Prediction Principle of Politis [21,22].
The key to Model-Free inference is to be able to construct an invertible transformation H_n : Y̲_{t̲_n} ↦ ϵ̲_n, where Y̲_{t̲_n} = (Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n}) denotes the random field data under consideration and ϵ̲_n = (ϵ_1, …, ϵ_n) is a random vector with i.i.d. components. To do this in our context, let m ≥ 1, and denote by L(Y_{t̲_k}, Y_{t̲_{k−1}}, …, Y_{t̲_{k−m+1}}) the mth marginal of Y_{t̲_k}, i.e., the joint probability law of the vector (Y_{t̲_k}, Y_{t̲_{k−1}}, …, Y_{t̲_{k−m+1}}). We assume that L(Y_{t̲_k}, Y_{t̲_{k−1}}, …, Y_{t̲_{k−m+1}}) changes smoothly (and slowly) with t̲_k in order to use nonparametric smoothing for estimation. Here, {Y_{t̲_k}, t̲_k ∈ Z²} is defined over a 2D index-set D, and the vector (Y_{t̲_k}, Y_{t̲_{k−1}}, …, Y_{t̲_{k−m+1}}) is taken to be lexicographically ordered as discussed in Section 2.
Similar to the framework proposed in [9], in order to ensure both the smoothness and data-based consistent estimation of L(Y_{t̲_k}, Y_{t̲_{k−1}}, …, Y_{t̲_{k−m+1}}), we assume that, for all t̲_k,
Y_{t̲_k} = f_{t̲_k}(W_{t̲_k}, W_{t̲_{k−1}}, …, W_{t̲_{k−m+1}})  (31)
for some function f_{t̲_k}(w̲) that is smooth in both arguments t̲_k and w̲, and some strictly stationary and weakly dependent, univariate series W_{t̲_k}, where without loss of generality it is assumed that W_{t̲_k} is a Gaussian series. Model (2) is a special case of Equation (31) with m = 1 and the function f_{t̲_k}(w) being affine/linear in w. Therefore, for comparison with the Model-Based case of Equation (2), in this section we focus on the case m = 1. For reference, Model-Free estimators for point prediction and prediction intervals of locally stationary time series with m = 1 have been discussed in [9]. Below, we describe the steps necessary to construct the invertible transformation H_n required to perform Model-Free point prediction for locally stationary random fields in the case m = 1.
Step 1: Transformation to uniform samples
With m = 1, let D_t̲(y) = P{Y_t̲ ≤ y} denote the first marginal distribution of the random field {Y_t̲}. Applying the probability integral transform, we have
U_t̲ = D_t̲(Y_t̲) for t̲ = t̲_1, …, t̲_n  (32)
Here, U_{t̲_1}, …, U_{t̲_n} are random variables having distribution Uniform(0, 1). It is assumed that D_t̲(y) is (absolutely) continuous in y for all t̲. We can use either local constant or local linear fitting to estimate it.
Using local constant fitting, a smooth estimator can be defined as:
D̄_{t̲_k}(y) = Σ_{i=1}^{T} Λ((y − Y_{t̲_i})/h_0) K̃((t̲_k − t̲_i)/b)  (33)
where K̃((t̲_k − t̲_i)/b) = K((t̲_k − t̲_i)/b) / Σ_{j=1}^{T} K((t̲_k − t̲_j)/b), Λ(y) is a smooth distribution function that is strictly increasing with density λ(y) > 0, i.e., Λ(y) = ∫_{−∞}^{y} λ(s) ds, and h_0 is a secondary bandwidth. Furthermore, as in Section 3, we can let T = k or T = k − 1, leading to a fitted vs. predictive way to estimate D_{t̲_k}(y) using D̄_{t̲_k}(y). Similar to the Model-Based case, we will assume throughout that K(·) is a nonnegative, symmetric 2D Gaussian kernel function whose bandwidth matrix is diagonal, with the diagonal values set to the bandwidth b and the off-diagonal terms set to 0. Note that the kernel estimator (33) is one-sided for the same reasons discussed in Section 3. Cross-validation is used to determine the bandwidths h_0 and b; the details are described in Section 5.
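A minimal sketch of the local constant estimator (33) and the resulting probability integral transform follows; taking Λ to be the standard normal cdf is an illustrative choice, and dbar_lc is a hypothetical helper name.

```r
# Minimal sketch of D-bar_{t_k}(y) from Equation (33), with Lambda = pnorm.
# y: field values in lexicographic order; coords: n x 2 coordinate matrix.
dbar_lc <- function(y0, y, coords, k, b, h0, predictive = FALSE) {
  T_ <- if (predictive) k - 1 else k
  d  <- sweep(coords[1:T_, , drop = FALSE], 2, coords[k, ])
  Kw <- exp(-rowSums(d^2) / (2 * b^2))
  Kw <- Kw / sum(Kw)                                # K-tilde weights
  sum(Kw * pnorm((y0 - y[1:T_]) / h0))              # Lambda((y0 - Y_{t_i}) / h0)
}

# Probability integral transform of Equation (32), U_{t_k} = D_{t_k}(Y_{t_k}):
# U <- sapply(seq_along(y), function(k) dbar_lc(y[k], y, coords, k, b, h0))
```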
Since point prediction is performed on the boundary of the random field, one can also consider local linear estimation as an alternative to the local constant smoothing approach. D̄_{t̲_k}(y) as defined in Equation (33) is the Nadaraya–Watson smoother of the variables v_1, …, v_n, where v_i = Λ((y − Y_{t̲_i})/h_0). It is possible to define D̄^{LL}_{t̲_k}(y), the local linear estimator of D_{t̲_k}(y) based on the smoothed variables Λ((y − Y_{t̲_i})/h_0). This estimator is expected to have a smaller bias than D̄_{t̲_k}(y). However, there is no guarantee that it will be a proper distribution function as a function of y, i.e., nondecreasing in y with a left limit of 0 and a right limit of 1, as discussed in [26]. A solution put forward by Hansen [30] involves a straightforward adjustment to the local linear estimator of a conditional distribution function that maintains its favorable asymptotic properties. The local linear version of D̄_{t̲_k}(y) adjusted via Hansen's (2004) proposal is given as follows:
D̄^{LLH}_{t̲_k}(y) = Σ_{i=1}^{T} w_i′ Λ((y − Y_{t̲_i})/h_0) / Σ_{i=1}^{T} w_i′.  (34)
The weights w_i′ are derived from the weights w_i described in Equations (22) and (30) for the fitted and predictive cases, where
w_i′ = 0 when w_i < 0, and w_i′ = w_i when w_i ≥ 0.  (35)
As with Equation (33), we can let T = k or T = k − 1 in the above, leading to fitted vs. predictive local linear estimators of D_{t̲_k}(y) using D̄^{LLH}_{t̲_k}(y).
One problem with the local linear estimator described above is that it replaces negative weights with zeros and then renormalizes the nonzero weights. However, if estimation is performed on the boundary (as is the case with one-step-ahead prediction of random fields), negative weights are crucially needed to ensure that the extrapolation takes place with minimal bias. To address this problem, we modify the original, possibly nonmonotonic local linear distribution estimator D̄^{LL}_{t̲_k}(y) to construct a monotonic version denoted by D̄^{LLM}_{t̲_k}(y). The monotone local linear distribution estimator D̄^{LLM}_{t̲_k}(y) can be constructed by Algorithm 1 as given below [31].
Algorithm 1: Monotone Local Linear Distribution Estimation
  • Recall that the derivative of D̄^{LL}_{t̲_k}(y) with respect to y is given by
    d̄^{LL}_{t̲_k}(y) = (1/h_0) Σ_{j=1}^{T} w_j λ((y − Y_{t̲_j})/h_0) / Σ_{j=1}^{T} w_j  (36)
    where λ(y) is the derivative of Λ(y) and the weights w_j can be derived from Equations (22) and (30) for the fitted and predictive cases.
  • Define a nonnegative version of d̄^{LL}_{t̲_k}(y) as d̄^{LL+}_{t̲_k}(y) = max( d̄^{LL}_{t̲_k}(y), 0 ).
  • To make the above a proper density function, renormalize it to area one, i.e., let
    d̄^{LLM}_{t̲_k}(y) = d̄^{LL+}_{t̲_k}(y) / ∫_{−∞}^{∞} d̄^{LL+}_{t̲_k}(s) ds.
  • Finally, define D̄^{LLM}_{t̲_k}(y) = ∫_{−∞}^{y} d̄^{LLM}_{t̲_k}(s) ds.
The above modification of the local linear estimator allows one to maintain monotonicity while retaining the negative weights that are helpful in problems involving estimation at the boundary. As with Equation (33), we can let T = k or T = k − 1 in the above, leading to fitted vs. predictive local linear estimators of D_{t̲_k}(y) that are monotone. A small sketch of this construction is given below.
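The sketch evaluates Algorithm 1 numerically on a grid of y-values, assuming the local linear weights w_j from Equation (22) or (30) have already been computed for the target coordinate; the Gaussian choice of λ and the grid-based integration are illustrative assumptions.

```r
# Minimal sketch of Algorithm 1: monotone local linear distribution estimation.
# ygrid: equally spaced grid of y-values; y: past field values; w: local linear
# weights (possibly negative); h0: secondary bandwidth.
dbar_llm <- function(ygrid, y, w, h0) {
  # derivative d-bar^LL(y) of the (possibly nonmonotone) local linear estimator
  dens <- sapply(ygrid, function(y0) sum(w * dnorm((y0 - y) / h0)) / (h0 * sum(w)))
  dens <- pmax(dens, 0)                       # nonnegative version d-bar^{LL+}
  dy   <- diff(ygrid)[1]
  dens <- dens / sum(dens * dy)               # renormalize to integrate to one
  D    <- cumsum(dens * dy)                   # D-bar^{LLM}(y): a proper, monotone cdf
  list(y = ygrid, d = dens, D = D)
}
```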
Step 2: Transformation to i.i.d. normal samples
Starting from the original random field data Y_{t̲_1}, …, Y_{t̲_n}, and using either the local constant, the local linear, or the monotone local linear distribution estimator of Step 1, it is possible to obtain samples U_{t̲_1}, …, U_{t̲_n} having distribution Uniform(0, 1). However, these samples are dependent, and therefore additional steps are necessary to convert them to the i.i.d. samples required for Model-Free inference. This is performed as described below.
Let Φ denote the cumulative distribution function (cdf) of the standard normal distribution, and define
Z_t̲ = Φ⁻¹(U_t̲) for t̲ = t̲_1, …, t̲_n.  (37)
Here, Z_{t̲_1}, …, Z_{t̲_n} are correlated standard normal random variables. Now let Γ_n denote the n × n covariance matrix of the random vector Z̲_{t̲_n} = (Z_{t̲_1}, …, Z_{t̲_n}). Consider the Cholesky decomposition Γ_n = C_n C_n′, where C_n is (lower) triangular, and construct the whitening transformation
ϵ̲_n = C_n⁻¹ Z̲_{t̲_n}.  (38)
It then follows that the entries of ϵ̲_n = (ϵ_1, …, ϵ_n) are uncorrelated standard normal. Assuming that the random variables Z_{t̲_1}, …, Z_{t̲_n} are jointly normal, it can then be inferred that ϵ_1, …, ϵ_n are i.i.d. N(0, 1). Joint normality can be established by assuming a generative model for the random field as given by Equation (31); for a more detailed discussion, refer to [9].
To implement the whitening transformation (38), it is necessary to estimate Γ_n, i.e., the n × n covariance matrix of the random vector Z̲_{t̲_n} = (Z_{t̲_1}, …, Z_{t̲_n}), where the Z_t̲ are the normal random variables defined in Equation (37). The problem involves positive definite estimation of Γ_n based on the sample Z_{t̲_1}, …, Z_{t̲_n}. This estimate is based on the sample autocovariance, which is defined for a 2D second-order stationary random field {y_{r,s} | r = 1, 2, …, R, s = 1, 2, …, S} as follows [15]:
γ̆(j, k) = γ̆(−j, −k) = 1/((R − j)(S − k)) Σ_{r=1}^{R−j} Σ_{s=1}^{S−k} { y_{r+j, s+k} − ȳ } { y_{r,s} − ȳ }  (39)
γ̆(j, −k) = γ̆(−j, k) = 1/((R − j)(S − k)) Σ_{r=1}^{R−j} Σ_{s=k+1}^{S} { y_{r+j, s−k} − ȳ } { y_{r,s} − ȳ }  (40)
where j, k = 0, 1, 2, ….
Now, let Γ̂_n^{AR} be the n × n covariance matrix associated with the AR(p, q) model fitted to the data Z_{t̲_1}, …, Z_{t̲_n}, with p, q chosen by minimizing AIC, BIC or a related criterion as described in [15]. Let γ̂_{|i−j|}^{AR} denote the (i, j) element of the Toeplitz matrix Γ̂_n^{AR}. Using the 2D Yule–Walker equations to fit the AR model implies that γ̂_{k,l}^{AR} = γ̆(k, l) for k = 0, 1, …, p and l = 0, 1, …, q. For the cases where k > p or l > q, γ̂_{k,l}^{AR} can be obtained by iterating the difference equation that characterizes the fitted 2D AR model. In the R software, this procedure is automated for time series via the ARMAacf() function, and here we extend the same approach to stationary data over random fields.
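The whitening step itself is a short computation once a positive definite estimate of Γ_n is available; the sketch below uses base R and assumes such an estimate (here called Gamma_hat) has already been built from the fitted 2D AR model.

```r
# Minimal sketch of the whitening transformation of Equation (38).
whiten <- function(Z, Gamma_hat) {
  Cn  <- t(chol(Gamma_hat))       # lower triangular Cholesky factor C_n (Gamma = Cn %*% t(Cn))
  eps <- forwardsolve(Cn, Z)      # eps = Cn^{-1} Z: uncorrelated standard normal entries
  list(eps = eps, Cn = Cn)
}
# The inverse map used later in Algorithm 2 is simply Z = C_{n+1} %*% eps_{n+1}.
```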
Estimating the ‘uniformizing’ transformation D_t̲(·) and the whitening transformation based on Γ_n allows us to construct the transformation H_n : Y̲_{t̲_n} ↦ ϵ̲_n. Here, ϵ̲_n is a random vector with i.i.d. components, as required by the Model-Free Prediction Principle. Since all the steps in the transformation, i.e., Equations (32), (37) and (38), are invertible, the composite transformation H_n : Y̲_{t̲_n} ↦ ϵ̲_n is also invertible. However, in order to put the Model-Free Prediction Principle to work, we also need to estimate the transformation H_{n+1} (and its inverse). To do so, we need a positive definite estimator for the matrix Γ_{n+1}; this can be accomplished by extending the covariance matrix associated with the fitted 2D AR(p, q) model to size (n + 1) × (n + 1), i.e., by calculating Γ̂_{n+1}^{AR}.
Consider the following vectors, which include the additional values Y_{t̲_{n+1}}, Z_{t̲_{n+1}} and ϵ_{n+1} that have not yet been estimated:
  • Y̲_{t̲_{n+1}} = (Y_{t̲_1}, …, Y_{t̲_n}, Y_{t̲_{n+1}});
  • Z̲_{t̲_{n+1}} = (Z_{t̲_1}, …, Z_{t̲_n}, Z_{t̲_{n+1}});
  • ϵ̲_{n+1} = (ϵ_1, …, ϵ_n, ϵ_{n+1}).
We now show how to obtain the inverse transformation H_{n+1}⁻¹ : ϵ̲_{n+1} ↦ Y̲_{t̲_{n+1}}. Since ϵ̲_n and Y̲_{t̲_n} are related in a one-to-one way via the transformation H_n, the values Y_{t̲_1}, …, Y_{t̲_n} are obtainable as Y̲_{t̲_n} = H_n⁻¹(ϵ̲_n). Similar to the framework proposed for locally stationary time series in [9], we show below how to create the unobserved Y_{t̲_{n+1}} from ϵ̲_{n+1} using the following three steps (Algorithm 2).
Algorithm 2: Generation of Unobserved Datapoint from Future Innovations
  • Let
    Z̲_{t̲_{n+1}} = C_{n+1} ϵ̲_{n+1}  (41)
    where C_{n+1} is the (lower) triangular Cholesky factor of (our positive definite estimate of) Γ_{n+1}. From the above, it follows that
    Z_{t̲_{n+1}} = c̲_{n+1} ϵ̲_{n+1}  (42)
    where c̲_{n+1} = (c_1, …, c_n, c_{n+1}) is the row vector consisting of the last row of the matrix C_{n+1}.
  • Create the uniform random variable
    U_{t̲_{n+1}} = Φ(Z_{t̲_{n+1}}).  (43)
  • Finally, define
    Y_{t̲_{n+1}} = D_{t̲_{n+1}}⁻¹(U_{t̲_{n+1}})  (44)
    where in practice, the above will be based on an estimate of D_{t̲_{n+1}}⁻¹(·).
Since Y̲_{t̲_n} has already been created using (the first n coordinates of) ϵ̲_{n+1}, the above completes the construction of Y̲_{t̲_{n+1}} based on ϵ̲_{n+1}, i.e., the mapping H_{n+1}⁻¹ : ϵ̲_{n+1} ↦ Y̲_{t̲_{n+1}}. By combining Equations (42)–(44), we can write the formula
Y_{t̲_{n+1}} = D_{t̲_{n+1}}⁻¹( Φ( c̲_{n+1} ϵ̲_{n+1} ) ).  (45)
The term c̲_{n+1} ϵ̲_{n+1} can be written as Σ_{i=1}^{n} c_i ϵ_i + c_{n+1} ϵ_{n+1}; hence, the above can be compactly denoted as
Y_{t̲_{n+1}} = g_{n+1}(ϵ_{n+1}) where g_{n+1}(x) = D_{t̲_{n+1}}⁻¹( Φ( Σ_{i=1}^{n} c_i ϵ_i + c_{n+1} x ) ).
Equation (45) is the predictive equation required by the Model-Free Prediction Principle, where Y_{t̲_{n+1}} is predicted conditionally on Y̲_{t̲_n} = (Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n}). The complete algorithm for constructing the Model-Free point predictors is described below (Algorithm 3):
Algorithm 3: Model-Free (MF) point predictors for Y_{t̲_{n+1}}
  • Construct U_{t̲_1}, …, U_{t̲_n} by Equation (32), with D_{t̲_k}(·) estimated by either D̄_{t̲_k}(·), D̄^{LLH}_{t̲_k}(·) or D̄^{LLM}_{t̲_k}(·), where t̲_k ∈ [t̲_1, t̲_n].
  • Construct Z_{t̲_1}, …, Z_{t̲_n} by Equation (37).
  • Construct ϵ_1, …, ϵ_n by Equation (38), and let F̂_n denote their empirical distribution.
  • The Model-Free L₂-optimal point predictor of Y_{t̲_{n+1}} is then
    Ŷ_{t̲_{n+1}} = ∫ g_{n+1}(x) dF̂_n(x) = (1/n) Σ_{i=1}^{n} g_{n+1}(ϵ_i)
    where the function g_{n+1} is defined in the predictive Equation (45), with D_{t̲_{n+1}}(·) again estimated by either D̄_{t̲_{n+1}}(·), D̄^{LLH}_{t̲_{n+1}}(·) or D̄^{LLM}_{t̲_{n+1}}(·).
  • The Model-Free L₁-optimal point predictor of Y_{t̲_{n+1}} is given by the median of the set { g_{n+1}(ϵ_i), i = 1, …, n }. (A small code sketch of these last two steps follows.)
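The final two bullets reduce to a small amount of code once ϵ̲_n, the last row of C_{n+1}, and an estimated inverse cdf for coordinate t̲_{n+1} are in hand; the sketch below assumes these as inputs, and Dinv_next is a hypothetical function handle.

```r
# Minimal sketch of the final step of Algorithm 3.
# eps: the i.i.d. innovations; c_row: last row of C_{n+1}; Dinv_next: estimated
# inverse cdf D_{t_{n+1}}^{-1}(.), e.g., obtained by inverting dbar_llm() on a grid.
mf_point_predictors <- function(eps, c_row, Dinv_next) {
  n    <- length(eps)
  base <- sum(c_row[1:n] * eps)                                   # sum_i c_i * eps_i
  g    <- function(x) Dinv_next(pnorm(base + c_row[n + 1] * x))   # g_{n+1}(x), Equation (45)
  vals <- sapply(eps, g)                                          # g_{n+1}(eps_i), i = 1, ..., n
  c(L2 = mean(vals), L1 = median(vals))                           # L2- and L1-optimal predictors
}
```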

5. Random Fields Cross-Validation

To choose the bandwidth b for either Model-Based or Model-Free point prediction, we perform one-step-ahead prediction at several coordinates of the given random field data. To elaborate, consider a random field Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_n} and suppose that only the subseries Y_{t̲_1}, Y_{t̲_2}, …, Y_{t̲_k} has been observed, where k < n. Let Ŷ_{t̲_{k+1}} denote the predicted value based on the data Y_{t̲_1}, …, Y_{t̲_k}; this can be obtained using either the Model-Based or the Model-Free approach of Section 3 and Section 4 for some choice of b. Since Y_{t̲_{k+1}} is actually known, the quality of the predictor can be assessed. Therefore, for each value of b over a reasonable range, we calculate the sum of squared errors
SSE(b) = Σ_{k=k_0}^{n−1} ( Ŷ_{t̲_{k+1}} − Y_{t̲_{k+1}} )²
Here, k_0 should be big enough so that the estimation is accurate; e.g., k_0 can be of the order of √n. The cross-validated bandwidth choice is then the b that minimizes SSE(b). For the problem of selecting h_0 in the case of the Model-Free point predictors, as in [21], our final choice is h_0 = h² where h = b/n. Note that an initial choice of h_0 (needed to perform cross-validation to determine the optimal bandwidth b) can be set by any plug-in rule, as the effect of the initial value of h_0 is minimal.
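A sketch of this bandwidth search is given below; predict_one stands for any of the Model-Based or Model-Free one-step predictors (a hypothetical wrapper, not a function from the paper's code).

```r
# Minimal sketch of bandwidth selection by one-step-ahead cross-validation.
# y, coords: field values and coordinates in lexicographic order; b_grid: candidate bandwidths.
cv_bandwidth <- function(y, coords, b_grid, k0, predict_one) {
  n <- length(y)
  sse <- sapply(b_grid, function(b) {
    errs <- sapply(k0:(n - 1), function(k) {
      y_hat <- predict_one(y[1:k], coords[1:k, , drop = FALSE], coords[k + 1, ], b)
      (y_hat - y[k + 1])^2
    })
    sum(errs)                      # SSE(b)
  })
  b_grid[which.min(sse)]           # cross-validated bandwidth
}
```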

6. Model-Free vs. Model-Based Inference: Empirical Comparisons

The point prediction performance of the Model-Free and Model-Based predictors described above is empirically compared using simulated as well as real-life data. The Model-Based local constant and local linear methods, described in Section 3, are denoted MB-LC and MB-LL, respectively. The Model-Free methods using local constant, local linear (Hansen) and local linear (monotone) estimation, described in Section 4, are denoted MF-LC, MF-LLH and MF-LLM. Both fitted and predictive residuals, as described in Section 3 and Section 4, are used for point prediction, and performance as measured by Mean Squared Error (MSE) is used to compare the various estimators.
Baseline comparisons: Besides the above MB and MF estimators, we also provide three baseline estimators for comparison, as described below:
  • Model-Based estimation of Y_{t̲_{n+1}} involves nonparametric mean and variance estimation followed by estimation of the conditional expectation E(W_{t̲_{n+1}} | W̲_{t̲_n}), which involves calculating the coefficients of the 2D AR model given by Equation (6). As a baseline, we have included results for both the synthetic and real-life datasets using only local linear estimation of the mean; i.e., in this case the L₂-optimal predictor of Y_{t̲_{n+1}} is given by:
    Ŷ_{t̲_{n+1}} = μ̂(t̲_{n+1})
    Here, μ̂(t̲_{n+1}) is calculated using local linear fitting based on Equations (23)–(30) (in this case, regular and predictive fitting coincide, as stated in Remark 2.2 in [9]). In Table 1 and Table 2, this estimator is shown as LL.
  • Model-Free estimation of Y_{t̲_{n+1}} involves nonparametric estimation of the first marginal distribution followed by estimation of the autocovariance matrix Γ̂_n^{AR}. As a baseline, we have included results using only local linear estimation of the uniformizing transformation, as given by Equations (34) and (35) (local linear Hansen) and by Algorithm 1 (monotone local linear distribution estimation). For the same reasons as stated above, regular and predictive fitting also coincide in this case. The L₂-optimal predictor of Y_{t̲_{n+1}} is then given by:
    Ŷ_{t̲_{n+1}} = (1/M) Σ_{i=1}^{M} D_{t̲_{n+1}}⁻¹(u_i)
    Here, u_1, …, u_M are i.i.d. draws from the uniform distribution U[0, 1], M is some large integer, and D_{t̲_{n+1}} is estimated using D̄^{LLH}_{t̲_{n+1}}(·) or D̄^{LLM}_{t̲_{n+1}}(·). In Table 1 and Table 2, these estimators are shown as LLH and LLM.
The code for all algorithms used for the synthetic and real-life datasets as discussed in this paper can be found under https://github.com/srinjoyd/randomfields_pp (accessed on 24 July 2023).

6.1. Simulation: Additive Model with Stationary 2D AR Errors

Let a random field be generated using the 2D AR process
W_t̲ = W_{t_1,t_2} = 0.25 W_{t_1−1, t_2−1} + 0.2 W_{t_1−1, t_2+1} − 0.05 W_{t_1−2, t_2} + v_{t_1,t_2}
over the region defined by 0 ≤ t_1 ≤ n_1 and 0 ≤ t_2 ≤ n_2, where n_1 = 101 and n_2 = 101. The NSHP limits are set from (101, 101) to (50, 50); this defines the region E_{t̲,n̲} shown in Figure 2. The data Y_t̲ are generated using the additive model in Equation (1) with trend μ(t̲) = μ(t_1, t_2) = sin(4π(t_2 − 1)/(n_2 − 1)) for 0 ≤ t_1 ≤ n_1 and 0 ≤ t_2 ≤ n_2. Here, the v_{t_1,t_2} are i.i.d. N(0, τ²) with τ = 0.1. Point prediction is performed at t_1 = 50, t_2 = 50. Bandwidths for all Model-Based, Model-Free, and baseline predictors are calculated using cross-validation as described in Section 5.
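A sketch of how one realization of this synthetic field can be generated is shown below; the zero-padded start-up of the recursion and the exact indexing of the trend (read here as sin(4π(t_2 − 1)/(n_2 − 1))) are simplifying assumptions, and a longer burn-in region would typically be used in practice.

```r
# Minimal sketch: one realization of the simulated locally stationary field.
set.seed(1)
n1 <- 101; n2 <- 101; tau <- 0.1
Wp <- matrix(0, n1 + 2, n2 + 2)          # zero-padded border for the start-up of the recursion
for (r in 3:(n1 + 2)) {
  for (s in 2:(n2 + 1)) {
    Wp[r, s] <- 0.25 * Wp[r - 1, s - 1] + 0.2 * Wp[r - 1, s + 1] -
                0.05 * Wp[r - 2, s] + rnorm(1, sd = tau)
  }
}
W  <- Wp[3:(n1 + 2), 2:(n2 + 1)]         # drop the padding
mu <- outer(1:n1, 1:n2, function(t1, t2) sin(4 * pi * (t2 - 1) / (n2 - 1)))
Y  <- mu + W                             # additive model of Equation (1) with no seasonal term
```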
Results for point prediction, in terms of mean squared error (MSE) over all MB and MF methods, are shown in Table 1. A total of 100 realizations of the dataset were used for measuring point prediction performance. From this table, it can be seen that MB-LL is the best point predictor. This is expected, since the data were generated by a 2D AR model of the same form used in MB-LL prediction. In addition, the estimation is performed at the boundary of the random field in the presence of a strong linear trend, as shown in Figure 3, where LL regression is expected to perform best. It can also be observed that MF-LLM performs best among all MF point predictors and approaches the performance of MB-LL. This shows that the monotonicity correction in the LLM distribution estimator has minimal effect on the center of the distribution, which is what is used for point prediction.
Table 1. Point prediction performance for the 2D AR dataset.

Prediction Method | Residual Type | MSE
MB-LC  | Predictive     | 1.488 × 10⁻²
MB-LC  | Fitted         | 1.520 × 10⁻²
MB-LL  | Predictive     | 1.393 × 10⁻²
MB-LL  | Fitted         | 1.400 × 10⁻²
MF-LC  | Predictive     | 1.530 × 10⁻²
MF-LC  | Fitted         | 1.549 × 10⁻²
MF-LLH | Predictive     | 1.471 × 10⁻²
MF-LLH | Fitted         | 1.515 × 10⁻²
MF-LLM | Predictive     | 1.414 × 10⁻²
MF-LLM | Fitted         | 1.456 × 10⁻²
LL     | Not Applicable | 1.488 × 10⁻²
LLH    | Not Applicable | 1.651 × 10⁻²
LLM    | Not Applicable | 1.455 × 10⁻²
In addition, comparing the performance of MB-LL versus its corresponding baseline LL, and that of MF-LLH and MF-LLM versus their corresponding baselines LLH and LLM, respectively, shows that the baseline estimators underperform because they do not take into account the spatial dependence present in the data, either by estimating the coefficients of the 2D AR model (Equation (6), as in the Model-Based case) or by estimating the autocovariance matrix Γ̂_n^{AR} (as in the Model-Free case).

6.2. Real-Life Example: CIFAR Images

The CIFAR-10 dataset [32] is used as a real-life example to compare the Model-Based and Model-Free prediction algorithms discussed above. The original CIFAR-10 dataset consists of 60,000 32 × 32 color images in 10 classes, with 6000 images per class. We pick 100 images from the class “dog”; the original images have 3 RGB (red, green, blue) channels with discrete pixel values. We take the R (red) channel of each image and standardize it to generate a new real-valued dataset. Our final transformed dataset thus consists of 100 random fields of size 32 × 32. The NSHP limits are set from (32, 32) to (16, 16); this defines the region E_{t̲,n̲} shown in Figure 2. The rest of each image is considered occluded, and its pixel values are not available for prediction. Sample images used for prediction are shown in Figure 4. Point prediction is performed at t_1 = 16, t_2 = 16. Bandwidths for all Model-Based, Model-Free, and baseline predictors are calculated using cross-validation as described in Section 5.
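A sketch of this preprocessing step is given below; how the CIFAR-10 binary files are read into R is omitted, and the per-image standardization shown is an assumption about the exact standardization used.

```r
# Minimal sketch: turn one CIFAR-10 image into a real-valued 32 x 32 random field.
# img is assumed to be a 32 x 32 x 3 array of pixel intensities already loaded into R.
red     <- img[, , 1]                          # keep only the R (red) channel
red_std <- (red - mean(red)) / sd(red)         # standardize to a real-valued field
```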
The results for point prediction in terms of mean squared error (MSE) over all MB and MF methods are shown in Table 2. From this table, it can be seen that MF-LLH and MF-LLM are the best point predictors. The superior performance of the Model-Free estimators compared to their Model-Based counterparts can be attributed to the fact that the CIFAR-10 image data are not compatible with the additive model given by Equation (1). It can also be seen that, unlike for the synthetic 2D AR dataset, the two best predictors MF-LLH and MF-LLM are closer in performance, owing to the lack of a linear trend at the point where prediction is performed.
In addition, comparing the performance of MB-LL versus its corresponding baseline LL, and that of MF-LLH and MF-LLM versus their corresponding baselines LLH and LLM, respectively, shows that the baseline estimators underperform because they do not take into account the spatial dependence present in the data, either by estimating the coefficients of the 2D AR model (Equation (6), as in the Model-Based case) or by estimating the autocovariance matrix Γ̂_n^{AR} (as in the Model-Free case).
Table 2. Point prediction performance for the CIFAR-10 dataset.

Prediction Method | Residual Type | MSE
MB-LC  | Predictive     | 1.98 × 10⁻¹
MB-LC  | Fitted         | 2.20 × 10⁻¹
MB-LL  | Predictive     | 1.79 × 10⁻¹
MB-LL  | Fitted         | 1.95 × 10⁻¹
MF-LC  | Predictive     | 1.79 × 10⁻¹
MF-LC  | Fitted         | 2.12 × 10⁻¹
MF-LLH | Predictive     | 1.60 × 10⁻¹
MF-LLH | Fitted         | 1.89 × 10⁻¹
MF-LLM | Predictive     | 1.64 × 10⁻¹
MF-LLM | Fitted         | 1.70 × 10⁻¹
LL     | Not Applicable | 2.12 × 10⁻¹
LLH    | Not Applicable | 2.38 × 10⁻¹
LLM    | Not Applicable | 2.14 × 10⁻¹

7. Conclusions and Future Work

In this paper, we investigate the problem of one-sided prediction for random fields that are stationary only across a limited part of their entire region of definition. For such locally stationary random fields, we develop frameworks for point prediction both using a Model-Based approach, which includes a coordinate-changing trend and/or variance, and using the Model-Free Prediction Principle proposed in [21,22]. We apply our algorithms to synthetic data as well as to a real-life dataset consisting of images from the CIFAR-10 dataset. In the latter case, we obtain the best performance using the Model-Free approach, thereby demonstrating the advantage of this technique over the Model-Based case, where an additive model is assumed arbitrarily for purposes of prediction. In future work, we plan to investigate both Model-Based and Model-Free prediction for random fields with nonuniform spacing of data, as well as to extend our algorithms to the estimation of prediction intervals.

Author Contributions

Software and experiments—S.D. and Y.Z.; conceptualization and writing—S.D. and D.N.P. All authors have read and agreed to the published version of the manuscript.

Funding

D.N.P. was partially supported by NSF grant DMS 19-14556. S.D. and Y.Z. did not receive any external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The CIFAR-10 publicly available dataset was analyzed in this study. This data can be found in [32]. R code for the algorithms discussed in this paper can be found at https://github.com/srinjoyd/randomfields_pp (accessed on 24 July 2023).

Acknowledgments

This research was partially supported by NSF grant DMS 19-14556. The authors would like to acknowledge the Pacific Research Platform, NSF Project ACI-1541349 and Larry Smarr (PI, Calit2 at UCSD) for providing the computing infrastructure used in this project.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Priestley, M.B. Evolutionary spectra and non-stationary processes. J. R. Stat. Soc. Ser. B (Methodol.) 1965, 27, 204–237.
  2. Priestley, M.B. Non-Linear and Non-Stationary Time Series Analysis; Academic Press: London, UK, 1988.
  3. Dahlhaus, R. Fitting time series models to nonstationary processes. Ann. Stat. 1997, 25, 1–37.
  4. Dahlhaus, R.; Rao, S.S. Statistical inference for time-varying ARCH processes. Ann. Stat. 2006, 34, 1075–1114.
  5. Zhou, Z.; Wu, W.B. Local linear quantile estimation for nonstationary time series. Ann. Stat. 2009, 37, 2696–2729.
  6. Dahlhaus, R. Locally stationary processes. In Handbook of Statistics; Rao, T.S., Rao, S.S., Rao, C.R., Eds.; Elsevier: Amsterdam, The Netherlands, 2012; Chapter 13; Volume 30, pp. 351–412.
  7. Zhou, Z. Nonparametric specification for non-stationary time series regression. Bernoulli 2014, 20, 78–108.
  8. Kley, T.; Preuß, P.; Fryzlewicz, P. Predictive, finite-sample model choice for time series under stationarity and non-stationarity. Electron. J. Stat. 2019, 13, 3710–3774.
  9. Das, S.; Politis, D.N. Predictive inference for locally stationary time series with an application to climate data. J. Am. Stat. Assoc. 2021, 116, 919–934.
  10. Lu, Z.; Tjøstheim, D. Nonparametric estimation of probability density functions for irregularly observed spatial data. J. Am. Stat. Assoc. 2014, 109, 1546–1564.
  11. Fuglstad, G.A.; Simpson, D.; Lindgren, F.; Rue, H. Does non-stationary spatial data always require non-stationary random fields? Spat. Stat. 2015, 14, 505–531.
  12. Kurisu, D. On nonparametric inference for spatial regression models under domain expanding and infill asymptotics. Stat. Probab. Lett. 2019, 154, 108543.
  13. Kurisu, D. Nonparametric regression for locally stationary random fields under stochastic sampling design. Bernoulli 2022, 28, 1250–1275.
  14. Matsuda, Y.; Yajima, Y. Locally stationary spatio-temporal processes. Jpn. J. Stat. Data Sci. 2018, 1, 41–57.
  15. Choi, B.; Politis, D.N. Modeling 2-D AR processes with various regions of support. IEEE Trans. Signal Process. 2007, 55, 1696–1707.
  16. Mojiri, A.; Waghei, Y.; Nili-Sani, H.; Mohtashami Borzadaran, G.R. Non-stationary spatial autoregressive modeling for the prediction of lattice data. Commun. Stat.-Simul. Comput. 2021, 1–13.
  17. Vaishali, D.; Ramesh, R.; Christaline, J.A. 2D autoregressive model for texture analysis and synthesis. In Proceedings of the 2014 International Conference on Communication and Signal Processing, Melmaruvathur, India, 3–5 April 2014; pp. 1135–1139.
  18. Hallin, M.; Lu, Z.; Tran, L.T. Local linear spatial regression. Ann. Stat. 2004, 32, 2469–2500.
  19. El Machkouri, M.; Es-Sebaiy, K.; Ouassou, I. On local linear regression for strongly mixing random fields. J. Multivar. Anal. 2017, 156, 103–115.
  20. Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods, 2nd ed.; Springer: New York, NY, USA, 1991.
  21. Politis, D.N. Model-free model-fitting and predictive distributions. Test 2013, 22, 183–221.
  22. Politis, D.N. Model-Free Prediction and Regression; Springer: New York, NY, USA, 2015.
  23. Dudgeon, D.E.; Mersereau, R.M. Multidimensional Digital Signal Processing; Prentice-Hall Signal Processing Series; Prentice-Hall: Englewood Cliffs, NJ, USA, 1984.
  24. Härdle, W.; Vieu, P. Kernel regression smoothing of time series. J. Time Ser. Anal. 1992, 13, 209–232.
  25. Kim, T.Y.; Cox, D.D. Bandwidth selection in kernel smoothing of time series. J. Time Ser. Anal. 1996, 17, 49–63.
  26. Li, Q.; Racine, J.S. Nonparametric Econometrics: Theory and Practice; Princeton University Press: Princeton, NJ, USA, 2007.
  27. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Monographs on Statistics and Applied Probability; CRC Press: Boca Raton, FL, USA, 1996; Volume 66.
  28. Fan, J.; Yao, Q. Nonlinear Time Series: Nonparametric and Parametric Methods; Springer: New York, NY, USA, 2007.
  29. Fan, J. Local linear regression smoothers and their minimax efficiencies. Ann. Stat. 1993, 21, 196–216.
  30. Hansen, B.E. Nonparametric Estimation of Smooth Conditional Distributions; Unpublished paper; Department of Economics, University of Wisconsin: Madison, WI, USA, 2004.
  31. Das, S.; Politis, D.N. Nonparametric estimation of the conditional distribution at regression boundary points. Am. Stat. 2019, 74, 233–242.
  32. Krizhevsky, A.; Nair, V.; Hinton, G. The CIFAR-10 Dataset, 2014. Available online: https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 24 July 2023).
Figure 1. Nonsymmetric Half-Plane.
Figure 2. Prediction point for the NSHP. In this drawing, NSHP(t̲) denotes the nonsymmetric half-plane centered at t̲ = (t_1, t_2), covering the hashed area. E_n̲ denotes the finite subset of Z² marked by the red boundary. The intersection of the two gives E_{t̲,n̲}. Point prediction is performed at t̲ = (t_1, t_2).
Figure 3. Linear trend for the NSHP where prediction is performed, at (50, 50). Here, the axes labeled x and y denote the coordinates of the random field, and the axis labeled z denotes the corresponding value of the random field at those coordinates.
Figure 4. Sample images from the CIFAR-10 dataset with label dog. (Note: full images are shown here, although only part of each is used for prediction.)

Share and Cite

Das, S.; Zhang, Y.; Politis, D.N. Model-Based and Model-Free Point Prediction Algorithms for Locally Stationary Random Fields. Appl. Sci. 2023, 13, 8877. https://doi.org/10.3390/app13158877
