A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets

Xu, Lili; Li, Baolin; Yuan, Yecheng; Gao, Xizhang; Zhang, Tao

doi:10.3390/rs70708906

Open AccessArticle

A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets

¹

State Key Lab of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China

²

University of Chinese Academy of Sciences, CAS, Beijing 100049, China

³

Jiangsu Center for Collaborative Innovation in Geographical Information Resource development and Application, Nanjing 210023, China

^*

Author to whom correspondence should be addressed.

Remote Sens. 2015, 7(7), 8906-8924; https://doi.org/10.3390/rs70708906

Submission received: 29 April 2015 / Revised: 29 June 2015 / Accepted: 7 July 2015 / Published: 15 July 2015

Download

Browse Figures

Versions Notes

Abstract

:

Reconstructing normalized difference vegetation index (NDVI) time series datasets is essential for monitoring long-term changes in terrestrial vegetation. Here, a temporal–spatial iteration (TSI) method was developed to estimate the NDVIs of contaminated pixels, based on reliable data. The NDVIs of contaminated pixels were first computed through linear interpolation of adjacent high-quality pixels in the time series. Then, the NDVIs of remaining contaminated pixels were determined based on the NDVI of a high-quality pixel located in the same ecological zone, showing the most similar NDVI change trajectories. These two steps were repeated iteratively, using the estimated NDVIs as high-quality pixels to predict undetermined NDVIs of contaminated pixels until the NDVIs of all contaminated pixels were estimated. A case study was conducted in Inner Mongolia, China. The accuracies of estimated NDVIs using TSI were higher than the asymmetric Gaussian, Savitzky–Golay, and window-regression methods. Root mean square error (RMSE) and mean absolute percent error (MAPE) decreased by 16.7%–86.6% and 18.3%–33.0%, respectively. The TSI method performed better over a range of environmental conditions, the variation of performance by the compared methods was 1.4–5 times that of the TSI method. The TSI method will be most applicable when large numbers of contaminated pixels exist.

Keywords:

reconstruction; time series; NDVI; trajectory distance; temporal-spatial correlation

Graphical Abstract

1. Introduction

The normalized difference vegetation index (NDVI) is an important indicator of vegetation status [1,2]. NDVI time series datasets have been used to identify phenological characteristics, for monitoring long-term trends, for detecting abrupt changes and in multi-temporal classification [3,4,5,6]. However, pixels contaminated by atmospheric conditions, sensor viewing angles, or variations in sun-surface-sensor geometry always exist in an NDVI time series dataset [7,8,9]. It is important to determine the “accurate” NDVI values of contaminated pixels because inaccurate NDVIs can result in a misinterpretation of the dynamics of terrestrial ecosystems [10,11]. Wessels et al. (2007) stated that a 15% NDVI reduction at different locations in the time series indicated a clear difference in the NDVI trend [12]. Therefore, the reconstruction of accurate time series datasets is essential for the use of Moderate Resolution Imaging Spectroradiometer (MODIS) MODIS13Q1, National Oceanic and Atmospheric Administration (NOAA) Global Inventory Modeling and Mapping Studies (GIMMS), NOAA Pathfinder (PAL), or Systeme Probatoire d’Observation de la Terre (SPOT) Vegetation (VGT) NDVI datasets.

Much research in recent years has been conducted on methods for reconstructing NDVI datasets that contain contaminated pixels. Most methods, such as the Savitzky–Golay (SG) [13,14], mean value iteration [15], best index slope extraction [16], double logistic [17], asymmetric Gaussians (AG) [18], and Fourier analysis [19] methods, have been applied to determine the values of contaminated pixels based on the integrality and consistency of the entire time series dataset, according to either the frequency or the temporal domain [20,21]. However, continuously contaminated pixels appear quite frequently in the NDVI series, and the use of a limited number of high-quality pixels to establish these values can result in poor model performance.

To overcome this problem, alternative methods have been developed to determine the values of contaminated pixels based on the correlation of data within the spatial dimension combined with the temporal dimension [22]. Cho and Suh [23] used a land cover map for 2006–2008 to define the spatial neighborhood, and estimated the NDVIs of contaminated pixels by weighting the NDVIs of high-quality pixels that had the same land cover type within a predetermined window. However, it is difficult to obtain highly accurate land cover maps: for example, the global land cover map based on Thematic Mapper/Enhanced Thematic Mapper Plus (TM/ETM+) data has accuracy of only about 65% [24,25]. Low-accuracy land cover maps can result in high uncertainty in the estimated NDVIs of contaminated pixels.

De Oliveira et al. [20] proposed a spatial-temporal window method, window-regression (WR), to estimate the NDVI of a contaminated pixel through regression analysis between the NDVIs of the contaminated pixel and others within a predetermined window. However, contaminated pixels are frequently aggregated and, if there is a lack of high-quality pixels within the predetermined window, contaminated NDVIs cannot be estimated. Furthermore, if there are many regression equations with low reliability, there will be high uncertainty regarding the integrity of the estimated NDVIs.

As discussed above, two fundamental problems need to be addressed when using a method based on spatial estimation. The first is to define a temporal–spatial neighborhood that can estimate all the contaminated NDVIs. Second, a means by which to estimate the NDVIs of the contaminated pixels without reference to existing land cover maps should be established. Here, to overcome these two problems, a temporal-spatial iteration (TSI) method is proposed that estimates the NDVIs of the contaminated pixels based on temporal-spatial neighborhood estimation without reference to land cover maps or to an arbitrary predetermined window. In this study, we took an ecological zone as the spatial neighborhood to obtain high-quality NDVIs from the same land cover types. We estimated the NDVIs of the contaminated pixels using the NDVIs of high-quality pixels that reflected similar land cover types. This was achieved using the weighted trajectory distance (WTD) algorithm based on NDVI change trajectories, which avoided the need to use existing land cover maps.

2. Materials and Methods

2.1. Study Area

The study areas were within the Inner Mongolian Autonomous Region of China (37°24′–53°23′N, 97°12′–126°04′E), which covers about 1,180,000 km² and consists of 89 banners (counties or cities). The natural environment of the study area is strongly heterogeneous. Annual precipitation varies from 50 mm/a in the northwest to 500 mm/a in the northeast. Annual average temperature varies from 0 to 8 °C. Humid, semiarid, and arid zones are found from east to west, respectively. Soil types are black soil, chernozem, chestnut soil, and sandy soil. Vegetation comprises forest, steppe, desert-steppe, and desert.

Nine sub-areas were chosen (Figure 1), three from each of the humid, semiarid, and arid zones. MODIS images of each sub-area covered 2500 km² and comprised 40,000 pixels (i.e., 200 rows × 200 columns). Sub-areas 1–3 were in a typical humid region, located in northeast of the main study area where the dominant land cover was forest. Sub-areas 4–6 were in a typical semiarid region with mainly grassland cover, located in the middle of the study area. Sub-areas 7–9 were located in the west of the study area, a typical arid region with vegetation cover of sparse forest or shrubland. The geographic coordinates of the center pixels of the nine sub-areas are listed in Table 1.

Table 1. Location of the nine selected sub-areas.

**Table 1.** Location of the nine selected sub-areas.
Study Area		Geographic Coordinates of the Center Pixel
Humid area	Sub-area 1	50°22′112′′N, 121°1′40′′E
	Sub-area 2	49°26′33′′N, 123°38′54′′E
	Sub-area 3	48°27′57′′N, 120°55′56′′E
Semi-arid area	Sub-area 4	44°51′11′′N, 118°40′11′′E
	Sub-area 5	43°18′31′′N, 118°50′25′′E
	Sub-area 6	42°24′22′′N, 115°7′4′′E
Arid area	Sub-area 7	40°45′8′′N, 106°28′8′′E
	Sub-area 8	40°11′55′′N, 104°2′52′′E
	Sub-area 9	40°54′33′′N, 103°57′8′′E

Figure 1. Location of the study area and nine selected sub-areas.

2.2. Data and Data Preprocessing

The NDVI time series data were taken from MODIS13Q1 datasets downloaded from the MODIS website (http://modis.gsfc.nasa.gov/), covering 92 dates from 2010 to 2013. Data from layer 1 and layer 12 of the MODIS13Q1 dataset were used in this study. Layer 1 reflects the NDVI value and layer 12 provides information on the reliability and quality of the pixel data for each pixel in the dataset. Both layers have the same temporal resolution (16 days) and spatial resolution (250 m). They were reprojected to Albers equal area projection with a WGS84 datum using the MODIS Reprojection Tool (MRT) and the nearest neighbor resampling method.

The 1:5,000,000 ecological map of Inner Mongolia was used to identify ecological zones. An ecological zone is determined by the climate, soil, landform and vegetation. In an ecological zone, the combination of land cover types is unique and is similar in its different sub-areas. The same kind of land cover type in an ecological zone indicates the similar combination of plants and their growing processes.

The values of pixel reliability of the MODIS13Q1 dataset were classified into five ranks [3]: “−1” indicated the corresponding pixel was not processed and that it should be treated as null; “0” indicated the corresponding pixel was good with high-quality data; “1” indicated the value of the corresponding pixel was marginal but that it could be used with reference to other quality assurance information; “2” indicated the targets in the corresponding pixel were covered with snow or ice when the data were produced; and “3” indicated the targets in the corresponding pixel were invisible (usually because of cloud cover). The NDVIs of pixels with rankings of “−1”, “2”, and “3” were considered unreliable and defined as “contaminated” pixels. The NDVI datasets were then classified into high-quality pixels (group 1) and contaminated pixels (group 2).

According to a statistical analysis of the NDVI dataset in the study area, 65.4% and 14.0% of pixels were classified as groups “0” and “1”, respectively, and 11.8% and 8.7% of pixels were classified as groups “2” and “3”, respectively. Only <0.01% of pixels were classified as group “−1” and thus, the impact of this group on the NDVI application was very limited.

The percentage of contaminated NDVIs over Inner Mongolia throughout the study period was 20.5%. However, in sub-area 1–3, there were 4,250,433 contaminated pixels (38.5%); in sub-area 4–6, there were 3,108,003 contaminated pixels (28.1%), and in sub-area 7–9, there were 1,049,249 contaminated pixels (9.5%) (Table 2).

Table 2. Data and data quality in study area.

**Table 2.** Data and data quality in study area.
Study Area		Contaminated NDVIs	Percentage (%)
Humid area	Sub-area 1	1,508,984	41.0
	Sub-area 2	1,212,144	32.9
	Sub-area 3	1,529,305	41.6
	Sub-total	4,250,433	38.5
Semi-arid area	Sub-area 4	1,281,536	34.8
	Sub-area 5	998,648	27.1
	Sub-area 6	827,819	22.5
	Sub-total	3,108,003	28.1
Arid area	Sub-area 7	282,627	7.70
	Sub-area 8	287,835	7.80
	Sub-area 9	478,787	13.0
	Sub-total	1,049,249	9.50

Using only high-quality NDVIs, the annual mean 16-day NDVIs of the pixels during the growing season (AMS-NDVI) over the study period were also computed to indicate the variations in the growth status of the land cover types over the growing season. Thus, an annual 16-day NDVI change trajectory was established for each pixel in the three sub-areas. Each trajectory consisted of 12 NDVI values corresponding to 12 time phases within the growing season.

2.3. Methodology

The estimation of the NDVIs of contaminated pixels using the TSI method was based on both temporal and spatial estimations. The NDVIs of the contaminated pixels based on temporal estimation were computed through linear interpolation of adjacent high-quality pixels in the time series. To reduce the impact of fluctuations due to vegetation growth, the contaminated NDVIs were only estimated using NDVIs of high-quality pixels from within one month forwards or backwards.

For temporal estimation, we predicted the NDVI of contaminated pixels solely based on the NDVIs of this pixel in other time phases. For spatial estimation, we predicted the NDVI of contaminated pixels from those of other pixels. However, the vegetation heterogeneity of 250 m MODIS data made it difficult to find a pixel with the same land cover as a contaminated one. Generally, the land cover types of a pixel seldom change within one month and the status of vegetation should not vary greatly within one month, meaning linear interpolation of NDVI in the temporal domain was more reliable than spatial estimation.

Figure 2. Framework of the TSI method.

Spatial estimation strategies were based on the NDVI of high-quality pixels that contained similar land cover types, as established by Gao et al. [26] and Zhu et al. [27]. In this study, determining the most similar candidate pixel was not based directly on land cover maps because it was difficult to obtain a land cover map with high spatial accuracy. We assumed that the pixels with more similar NDVI change trajectories occupied more similar land cover types. Thus, the WTD algorithm based on the shortest NDVI trajectory distance was used to determine the pixel with the most similar land cover type for a contaminated pixel (details are shown in Section 2.3.2).

Figure 2 presents the framework of the TSI method. The TSI first determines the NDVIs of the contaminated pixels based on temporal estimation and then again, based on spatial estimation. If there are pixels that fail to be reconstructed following these two steps, the TSI repeats the first two steps iteratively using the estimated NDVIs as high-quality NDVIs, until the NDVIs of all the contaminated pixels are estimated successfully.

2.3.1. Temporal Estimation

First, contaminated NDVIs including both type “101” (a contaminated pixel neighbored by two high-quality pixels) and “1001” (two adjacent contaminated pixels neighbored by two high-quality pixels) (Figure 3) were estimated, as suggested by [23]. A contaminated pixel of type 101 was estimated using Equation (1), and a contaminated pixel of type “1001” was estimated using Equations (2) and (3):

N D V I_{i_f i t} = \frac{N D V I_{i - 1} + N D V I_{i + 1}}{2}

(1)

N D V I_{i_f i t} = \frac{N D V I_{i - 1} \times 2 + N D V I_{i + 2}}{3}

(2)

N D V I_{i + 1_f i t} = \frac{N D V I_{i - 1} + N D V I_{i + 2} \times 2}{3}

(3)

where i is the time phase of the contaminated NDVI to be estimated,

N D V I_{i_f i t}

and

N D V I_{i + 1_f i t}

are the estimated NDVIs of the contaminated NDVIs, and

N D V I_{i - 1}

and

N D V I_{i + 2}

are the NDVIs of the high-quality pixels.

Figure 3. Graphical representation of temporal neighborhood estimation.

2.3.2. Spatial Estimation

Step 1: defining all the candidates used for a contaminated NDVI estimation

This step is conducted to find all the suitable candidate pixels that can be used for the estimation of the contaminated NDVIs. For a contaminated NDVI, the candidates must satisfy three conditions, namely: (1) be pixels of high quality; (2) be in the same time phase as the contaminated pixel; and (3) be in the same ecological zone as the contaminated pixel (Figure 4).

Figure 4. Graphical representation of spatial neighborhood estimation.

Step 2: calculating similarities of AMS-NDVI change trajectories between a target-contaminated NDVI and the candidates used for its estimation

For vegetation growth, the NDVIs in some key phases are important for controlling the form of the AMS-NDVI trajectory. Therefore, in this study, we selected the WTD to indicate the similarity of the AMS-NDVI change trajectory between a contaminated NDVI and the candidate used for its estimation. The WTD method assigns different weights to points within a change trajectory to measure the importance of points in affecting the form of the trajectory [28,29,30]. The WTD can be calculated using Equation (4):

D_{i, j} = \frac{1}{\sum_{k = 1}^{n} ω_{k}} \times \sum_{k = 1}^{n} ω_{k} \times | N D V I_{i, k} - N D V I_{j, k} |

(4)

where i is the ith target contaminated pixel to be estimated, j is the jth high-quality candidate pixel used for the spatial neighborhood estimation, n is the total number of time phases in the AMS-NDVI trajectory of the reconstructed NDVI series (n = 12 in this paper),

D_{i, j}

is the trajectory distance between the ith target and the jth candidate, and

ω_{k}

means the weight assigned to the k^th time phase in the growing season.

In this study, all the phases in a vegetation change trajectory were grouped into two classes: (1) key phases; and (2) non-key phases. Key phases (peak point, the fastest growth phase before peak point and the fastest decrease phase after peak point) controlled the fundamental form of the NDVI change trajectory and all other phases were considered as non-key phases. When

D_{i, j}

was calculated, the

ω_{k}

for each key phase was given additional weight. The total of the additional weights equaled 1.0, and higher NDVI slope changes indicated higher weights. The value of

ω_{k}

can be calculated using Equation (5):

ω_{k} = {\begin{matrix} 1 & (n o n - k e y p h a s e s) \\ 1 + ω_{k}^{'} & (k e y p h a s e s) \end{matrix}

(5)

where

ω_{k}

is the weight of the NDVI in the kth time phase;

ω_{k}^{'}

is the additional weight in the k^th phase; and m₁, m₂, and m₃ are the time phases of the key phases (Figure 4).

Figure 5. The AMS-NDVI change trajectories with three key phases.

In this study, we used three key phases as the controls of the fundamental form of the AMS-NDVI change trajectory. The key phases were located at the peak period (m₂ in Figure 5), start period (m₁ in Figure 5), and end period (m₃ in Figure 5), i.e., where the NDVI changed most rapidly. The key phase in the peak period was the phase with maximum NDVI among all the NDVIs. The key phases in the start and end periods were determined as follows: (1) the change of NDVI slope for every phase on the growing part of the NDVI change trajectory before the peak or on the decreasing part after the peak was computed; (2) the phases where the change of NDVI slope was the greatest on the growing part of the NDVI change trajectory before the peak and on the decreasing part after the peak were selected as key phases. The additional weights for three key phases were computed using Equations (6) and (7):

ω_{k}^{'} = {\begin{matrix} | S_{(1, m_{1})} - S_{(m_{1}, m_{2})} | / \sum^{​} S c & (k = m_{1}) \\ | S_{(m_{1}, m_{2})} - S_{(m_{2}, m_{3})} | / \sum^{​} S c & (k = m_{2}) \\ | S_{(m_{2}, m_{3})} - S_{(m_{3}, 12)} | / \sum^{​} S c & (k = m_{3}) \end{matrix}

(6)

\sum^{​} Sc = | S_{(1, m_{1})} - S_{(m_{1}, m_{2})} | + | S_{(m_{1}, m_{2})} - S_{(m_{2}, m_{3})} | + | S_{(m_{2}, m_{3})} - S_{(m_{3}, 12)} |

(7)

where

ω_{k}^{'}

is the additional weight in the kth time phase, k is the kth time phase, m1, m2, and m3 are the time phases of the key phases,

S_{(1, m_{1})}

is the NDVI change slope between the 1st time phase (the first phase in the growing season) and the

m_{1}^{t h}

time phase,

S_{(m_{1}, m_{2})}

is the NDVI change slope between the

m_{1}^{t h}

time phase and the

m_{2}^{t h}

time phase,

S_{(m_{2}, m_{3})}

is the NDVI change slope between the

m_{2}^{t h}

time phase and the

m_{3}^{t h}

time phase, and

S_{(m_{3}, 12)}

is the NDVI change slope between the

m_{3}^{t h}

time phase and the nth time phase (final phase in the growing season) in the AMS-NDVI change trajectories.

\sum^{​} Sc

is the total slope change in three key phases (k = m₁, m₂, m₃).

Step 3: choosing the best candidate and estimating the NDVI of a contaminated pixel

For an undetermined NDVI of a contaminated pixel, the most similar candidate was determined using Equation (8):

D_{s} = m i n (D_{i, j})

(8)

where

D_{i, j}

is the trajectory distance between the ith contaminated pixel and the jth candidate of the high-quality pixels for the estimation, and

D_{s}

is the shortest trajectory distance.

The undetermined NDVI was estimated based on the NDVI of the candidate shown as having the shortest trajectory distance by Equation (9):

N D V I_{E} = N D V I_{(D_{s})}

(9)

where

N D V I_{E}

is the estimated NDVI, and

N D V I_{(D_{s})}

is the NDVI of the candidate with the shortest trajectory distance among those between the estimated NDVI and all the candidates used for the estimation.

Step 4: iteration of temporal–spatial estimation of the NDVI of contaminated pixels

It is possible that some contaminated pixels will not be estimated following both the temporal and the spatial estimations. In such cases, the estimation process (steps 1–4 in Section 2.3.2) is repeated, incorporating all the successfully estimated NDVIs of contaminated pixels as high-quality pixels. This iterative process continues until all the NDVIs of the contaminated pixels have been successfully estimated.

2.4. Technique Evaluation

Evaluation of the TSI method focused on three aspects: (1) the accuracy of the estimated NDVIs of the contaminated pixels; (2) the amount by which the original values of the high-quality pixels were changed; and (3) how many contaminated pixels were estimated. Three commonly used methods, i.e., the SG and AG methods based on temporal estimation and the WR method based on both temporal and spatial estimations, were chosen for comparison.

2.4.1. Accuracies of the Estimated NDVIs of Contaminated Pixels

Both the root mean square error and mean absolute percent error were taken as indices of accuracy with which to assess the estimated NDVIs. Lower values of RMSE and MAPE indicated higher accuracies of the estimated NDVIs. They were calculated using Equations (10) and (11):

R M S E = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(\hat{Z_{i}} - Z_{i})}^{2}}

(10)

M A P E = \frac{1}{n} \sum_{i = 1}^{n} | \frac{\hat{Z_{i}} - Z_{i}}{Z_{i}} | \times 100

(11)

where

Z_{i}

is the real NDVI of sample i,

\hat{Z_{i}}

is the estimated NDVI, and n represents the size of the validation sample.

In this paper, we stated that the NDVIs of high-quality pixels were considered reliable. Thus, high-quality pixels were considered as “true” values and were used as validation samples. One thousand NDVIs of high-quality pixels were randomly selected as validation samples to assess the accuracies of the estimated NDVIs. The 1000 validation pixels were set aside and treated as “real” contaminated pixels to be estimated by other high-quality pixels. They were not used in the estimation of other contaminated NDVI pixels. The validation samples covered all the major ecological zones of the sub-areas including forest, grassland and desert.

2.4.2. Ability to Retain the Original Values of the High-Quality Pixels

Both RMSE and MAPE were also taken as indices with which to assess the capability of the method to retain the original values of the high-quality pixels. Lower values of RMSE and MAPE indicate greater ability to retain the original values of the high-quality pixels. All the high-quality pixels were used as validation samples.

2.4.3. Number of Contaminated Pixels that Can Be Estimated

The percentage of estimated NDVIs of contaminated pixels was taken as an index with which to assess how many contaminated pixels were estimated. It was calculated using Equation (12):

P_{E} = N_{E} / N_{T} \times 100

(12)

where

P_{E}

is the percentage of estimated NDVIs of contaminated pixels,

N_{E}

is the number of estimated NDVIs of contaminated pixels, and

N_{T}

is the total number of contaminated pixels.

3. Results

3.1. Accuracies of the Estimated NDVIs of Contaminated Pixels

Table 3 shows that the accuracies of the estimated NDVIs using the TSI method were clearly higher than those of the AG, SG, and WR methods. The RMSE and MAPE of the TSI method for the nine sub-areas were 0.0486% and 10.9%, respectively. The lowest values of RMSE and MAPE from the other methods were from the SG method, measuring 0.0567% and 12.9%, i.e., 16.7% and 18.3% greater, respectively.

The improvement in the accuracies of the estimated NDVIs using the TSI method varied in the different sub-areas. In sub-areas 1–6 related to forest or grassland ecological zones, both RMSE and MAPE based on the TSI method were clearly lower than the other three methods. In sub-areas 1–3, even the lowest RMSE and MAPE of the compared methods (SG method) were 0.0814% and 14.6%, respectively, which were up to 17.0% and 32.7% greater than the TSI method. In sub-areas 4–6, the best RMSE and MAPE of the compared methods (SG method) were 0.0518% and 12.7%, and higher than the TSI method by 16.9% and 25.7%, respectively. However, in sub-areas 7–9 related to desert ecological zones, the performances of all the four methods were similar with RMSEs ranging from 0.182 to 0.0199 and MAPEs ranging from 9.65% to 13.6% (Table 3, Figure 6).

Table 3. Performances of contaminated pixel reconstructions using the four methods.

**Table 3.** Performances of contaminated pixel reconstructions using the four methods.
Method		AG		SG		WR*		TSI
Method		RMSE	MAPE (%)	RMSE	MAPE (%)	RMSE	MAPE (%)	RMSE	MAPE (%)
Humid area	Sub-area 1	0.0872	15.5	0.0814	14.0	0.168	18.9	0.0717	11.4
	Sub-area2	0.0909	17.0	0.0867	16.0	0.126	15.5	0.0721	11.5
	Sub-area 3	0.0801	14.2	0.0745	12.7	0.150	18.5	0.0638	10.1
	Sub-total	0.0864	15.8	0.0814	14.6	0.149	17.6	0.0696	11.0
Semi-arid area	Sub-area 4	0.0703	18.3	0.0634	16.0	0.0978	21.2	0.0529	11.3
	Sub-area 5	0.0462	10.9	0.0446	10.5	0.0490	11.1	0.0442	10.5
	Sub-area 6	0.0498	13.0	0.0451	11.6	0.0472	14.4	0.0336	8.6
	Sub-total	0.0564	14.1	0.0518	12.7	0.0668	15.2	0.0443	10.1
Arid area	Sub-area 7	0.0254	12.0	0.0240	10.1	0.0293	11.0	0.0254	10.8
	Sub-area 8	0.0152	14.8	0.0137	12.8	0.0118	9.33	0.0139	12.9
	Sub-area 9	0.0176	14.1	0.0150	11.3	0.0110	8.58	0.0142	11.3
	Sub-total	0.0199	13.6	0.0182	11.4	0.0194	9.65	0.0186	11.7
Total		0.0607	14.5	0.0567	12.9	0.0907	13.8	0.0486	10.9

* Only those pixels successfully reconstructed were counted. Overall, 70.2%, 70.6%, 68.1%, 66.4%, 80.1%, 78.5%, 88.5%, 84.7% and 90.8% of pixels were considered in sub-areas 1–9, respectively.

Figure 6. MAPEs and RMSEs of the four methods in the nine sub-areas. *Only those pixels reconstructed by the WR method were used.

Figure 7. The ranges of MAPE of the four methods. *Only those pixels reconstructed by the WR method were used.

The TSI method showed the most robust performance of the four methods (Figure 6).The MAPE of the TSI method ranged from 10.1% to 11.7% (range of 1.6%); that of the AG method ranged from 13.6% to 15.8% (range of 2.2%), that of the SG method ranged from 11.4% to 14.6% (range of 3.2%), and that of the WR method ranged from 9.65% to 17.6% (range of 8.0%). The ranges of the three comparison methods were 1.4–5 times that of the TSI method (Figure 7).

3.2. Ability to Retain the Original Values of the High-Quality Pixels

Both the TSI and the WR methods retained the original values of the high-quality pixels because they were unaffected by the algorithms of the two methods. However, the NDVIs of the high-quality pixels were changed by the AG and SG methods. Across all nine sub-areas, the RMSEs of the AG and SG methods were 0.0458 and 0.0428, respectively, and the MAPEs were 11.8% and 9.9%, respectively (Table 4). This shows that the NDVIs of the high-quality pixels will have errors of about 10% following the reconstruction of NDVIs, which could clearly cause uncertainty with regard to applications based on these NDVIs.

Table 4. RMSEs and MAPEs of reconstructed NDVIs of high-quality pixels.

**Table 4.** RMSEs and MAPEs of reconstructed NDVIs of high-quality pixels.
Study Areas		AG		SG
Study Areas		RMSE	MAPE (%)	RMSE	MAPE (%)
Humid area	Sub-area 1	0.0692	12.4	0.0642	11.3
	Sub-area2	0.0739	14.0	0.0715	13.0
	Sub-area 3	0.0670	12.8	0.0651	11.6
	Sub-total	0.0703	13.1	0.0672	12.0
Semi-arid area	Sub-area 4	0.0586	14.5	0.0511	12.2
	Sub-area 5	0.0341	10.1	0.0321	8.71
	Sub-area 6	0.0407	10.4	0.0380	9.38
	Sub-total	0.0450	11.5	0.0407	10.0
Arid area	Sub-area 7	0.0207	15.1	0.0182	11.3
	Sub-area 8	0.0088	8.50	0.0066	6.23
	Sub-area 9	0.0122	9.91	0.0094	7.22
	Sub-total	0.0148	11.2	0.0125	8.27
Total		0.0458	11.8	0.0428	9.85

3.3. Number of Contaminated Pixels that Can Be Estimated

The AG, SG, and TSI methods were able to determine NDVIs for all the contaminated pixels, while the WR method was not able to do so. Across all nine sub-areas, estimates were generated for 77.5% of the contaminated pixels by the WR method (Table 5). This showed the low ability of the WR method to estimate the NDVIs of the contaminated pixels. In sub-areas 1–3, only about 68.1%–70.6% of the contaminated pixels were estimated, whereas about 84.7%–90.8% of the contaminated pixels were estimated in sub-area 7–9.

Table 5. Percentage of contaminated NDVIs estimated using the WR method.

**Table 5.** Percentage of contaminated NDVIs estimated using the WR method.
Study Area		Estimated NDVIs (%)
Humid area	Sub-area 1	70.2
	Sub-area2	70.6
	Sub-area 3	68.1
	Sub-total	69.6
Semi-arid area	Sub-area 4	66.4
	Sub-area 5	80.1
	Sub-area 6	78.5
	Sub-total	75.0
Arid area	Sub-area 7	88.5
	Sub-area 8	84.7
	Sub-area 9	90.8
	Sub-total	88.0
Total		77.5

4. Discussion

4.1. Limitation of Temporal Estimation of NDVIs of Contaminated Pixels

In the study area, the percentage of pixels whose contaminated status lasted one phase (type “101” in Table 6) was only 10.6% and that of pixels whose contaminated status lasted two phases (type “1001” in Table 6) was only 7.28%. This indicated that over 82.1% of the contaminated pixels would persist for three phases or more. The vegetation status cannot be expected to change linearly over such a long period, and therefore the estimation of contaminated pixels based solely on temporal estimations may not be very effective. This is why the accuracies of the estimated NDVIs using both the temporal and the spatial estimations of the TSI method are higher than those based solely on temporal estimations, such as the AG and SG methods.

Table 6. Distribution of contaminated pixels in different temporal and spatial windows.

**Table 6.** Distribution of contaminated pixels in different temporal and spatial windows.
Study Area	Window Type	Temporal			Spatial
Study Area	Window Type	101 (%)	1001 (%)	Total (%)	3 × 3 (%)	5 × 5 (%)
Humid area	Sub-area 1	0.63	0.55	1.17	95.9	92.1
	Sub-area2	2.45	1.93	4.38	92.9	87.2
	Sub-area 3	1.83	1.21	3.04	93.6	87.8
	Sub-total	1.64	1.23	2.86	94.1	89.0
Semi-arid area	Sub-area 4	1.02	0.30	1.31	95.5	91.5
	Sub-area 5	7.56	6.35	13.9	83.7	71.6
	Sub-area 6	1.28	2.68	3.96	94.9	90.5
	Sub-total	3.29	3.11	6.39	91.4	84.5
Arid area	Sub-area 7	27.8	16.6	44.4	69.0	48.7
	Sub-area 8	28.4	16.7	45.1	72.3	52.9
	Sub-area 9	24.7	19.2	43.9	73.9	55.7
	Sub-total	27.0	17.5	44.5	71.7	52.4
Total		10.6	7.28	17.9	85.7	75.3

4.2. Limitation of using Regular Spatial Neighborhood in Spatial Estimation

For spatial estimations of contaminated pixels, a regular spatial neighborhood is often set to determine all those pixels of high quality that will be used for the estimation. However, because most pixel contamination is caused by clouds, which cover a wide area, the spatial distribution of contaminated pixels is not random but is often aggregated. Therefore, under a predetermined regular spatial window, it is not possible to define suitable pixels for the estimation of many contaminated pixels.

In our study area, based on a 3 × 3 window, the percentage of contaminated pixels completely surrounded by other contaminated pixels was 85.7%. For a 5 × 5 window, this percentage only dropped to 75.3% (Table 6). Therefore, it was not surprising that the percentage of contaminated NDVIs that could be estimated using the WR method was only 77.5%. The proposed TSI method adopted two techniques for overcoming this problem. One was to use the ecological zone as the spatial neighborhood to obtain high-quality pixels that can be used for estimation. The other was to iteratively use estimation taking all the estimated NDVIs of the contaminated pixels as high-quality pixels to help in the estimation of the undetermined NDVIs of additional contaminated pixels, as discussed in Section 2.3.2. TSI can predict values for all the contaminated pixels except when the clouds cover the whole ecological zone and last for more than half a month.

4.3. Impact of the Percentage of Contaminated Pixels

As discussed in Section 3.1, the performance of the TSI method was clearly better in sub-areas 1–6, while the performance of all four methods was similar in sub-areas 7–9. This was mainly related to the number of contaminated pixels. The average percentages of contaminated pixels in sub-areas 1–3 and sub-areas 4–6 were about 38.5% and 28.1%, respectively, which were much higher than sub-areas 7–9 (9.5%) (Table 2).

The distribution of contaminated NDVIs in the temporal profile can also affect the performance of the methods. In sub-areas 1–3 and sub-areas 4–6, the average percentage of pixels with contaminated status lasting one or two phases (types “101” or “1001” in Table 6) was 2.9% and 6.4%, respectively. In sub-areas 7–9, the percentage of pixels whose contaminated status lasted one or two phases was 44.5%. This highlights why the accuracies of the estimated NDVIs in sub-areas 7–9 were higher than in sub-areas 1–6.

Sub-areas 1–3 were located within a humid region with mainly forest cover; sub-areas 4–6 were within a semi-arid region with grassland cover, and sub-areas 7–9 were located within an arid region with desert and sparse shrub cover. In the humid and semi-arid areas, rain and cloud occurred more frequently than in arid zones. The TSI method seemed slightly sensitive to the amount and distribution of noise because the number of high-quality candidate pixels decreased with the increasing amount of noise. However, we still believe that the TSI method will be more suitable than other methods for application in humid and semi-arid zones where a greater number of contaminated pixels exist because of its relatively robust performance. In arid zones, because of the limited number of contaminated pixels, the performance of most methods are similar.

4.4. Significance of High-Quality Data

As discussed in Section 3.2, the NDVIs of the high-quality pixels were changed with MAPEs of 10.8 and RMSEs of 0.04 after the reconstruction of the NDVI series using the AG and SG methods. This indicated that the AG and SG methods sacrificed the accuracy of some high-quality pixels within the dataset to estimate the contaminated NDVIs. Clearly, this could cause uncertainty in applications based on these NDVIs.

High-quality data are reliable data [3] that should not be changed. The use of the TSI method to reconstruct the dataset, based on high-quality MODIS13Q1 data, meant the NDVIs of the high-quality pixels retained their original values. Thus, TSI will be very helpful in reducing uncertainty in applications based on the reconstructed data, such as NDVI, Leaf Area Index (LAI), and Net Primary Productivity (NPP).

4.5. Weaknesses and Further Research

As discussed in Section 4.1, over 82.1% of the contaminated pixels persisted for three or more phases, and therefore the determination of contaminated NDVIs, was mainly based on spatial estimation. The principal idea behind spatial estimation in the TSI method was to substitute a contaminated NDVI value for one from a high-quality pixel that had the most similar land cover types. Therefore, the determination of this high-quality pixel was critical in the TSI method.

In spatial estimation, the iterative strategy was used to increase the opportunity to obtain the most similar pixels as possible for the contaminated pixels. However, any potential error in the first iterations will propagate to subsequent iterations. We cannot determine the error propagation yet and have not yet developed a method to evaluate it. For the time being, we can only evaluate the accuracies from the total flow of the TSI method.

The TSI method used the WTD algorithm for calculating the similarities of the NDVI change trajectories with different weights between a contaminated pixel and the candidate pixels used for the estimation. The TSI method applied special weights to three key phases to improve the description of these similarities, which meant that the TSI may be more suitable for use in a study area with a single seasonal peak. It should be noted that in some intensively-managed agricultural landscapes, especially in tropical or sub-tropical zones, the NDVI may indicate two or more peaks. Because NDVI trajectories in these areas cannot be determined precisely just based on three-phase trajectory model, TSI may offer no improvement over other methods. In future research, we can improve the WTD algorithm to describe NDVI trajectories with two or more peaks, thus, improving the accuracies of the estimated NDVIs in intensively-managed agricultural landscapes.

5. Conclusions

This paper proposed a TSI method to estimate the NDVIs of contaminated pixels in the MODIS13Q1 NDVI time series dataset, based on temporal–spatial neighborhood estimation, without reference to land cover maps or arbitrarily predetermined windows. First, the NDVIs of those contaminated pixels that persisted for only one or two phases were computed through linear interpolation from adjacent high-quality pixels. Then, the undetermined NDVIs of the contaminated pixels were determined based on the NDVI of the high-quality pixel that reflected the most similar land cover types within the same ecological zone, using the WTD algorithm based on AMS-NDVI change trajectories. Finally, these two steps were repeated iteratively using the estimated NDVIs as high-quality NDVIs to estimate the undetermined NDVIs of the remaining contaminated pixels until all the NDVIs of the contaminated pixels were estimated.

The results indicated that the accuracies of the estimated NDVIs using the TSI method were clearly higher than those estimated using the AG, SG, and WR methods. The RMSE and MAPE decreased by up to 16.7%–86.6% and 18.3%–33.0%, respectively. The performance of the TSI method for the different sub-areas was robust. The variations in MAPE using the compared methods were 1.4–5 times that of the TSI method. Furthermore, the TSI method can retain the original values of the high-quality pixels, and NDVIs for all of the contaminated pixels can be estimated using this method. Thus, the TSI method will be of considerable benefit for application in humid, sub-humid, and semi-arid zones where a greater number of contaminated pixels exist.

The selection of the high-quality pixel used to determine the NDVI of a contaminated pixel is fundamental to the process of the TSI method. In this study, we used the WTD algorithm, by calculating the similarity of the AMS-NDVI change trajectory with different weights between a contaminated pixel and each high-quality candidate pixel used for the estimation. Special weights were given to three key phases to improve the description of the similarity between the AMS-NDVI change trajectories of the contaminated and candidate pixels.

Acknowledgments

This study was supported by National Basic Research Program of China (Nos. 2015CB954103 and 2015CB954101) and the National Key Technology Support Program (2012BAH33B01). The authors are grateful to the anonymous reviewers for their constructive criticism and comments.

Author Contributions

Baolin Li conceived the research idea. Lili Xu performed the data processing, experiments designing, algorithm implementation and wrote the draft version of the manuscript. Baolin Li supervised the research including the experiments designing, writing, and revision of the manuscript. Yecheng Yuan and Xizhang Gao contributed to the editing and revision of the manuscript. Tao Zhang provided advices in algorithm optimization.

Conflicts of Interest

The authors declare no conflict of interest.

References

Anyamba, A.; Tucker, C.J. Analysis of Sahelian vegetation dynamics using NOAA-AVHRR NDVI data from 1981–2003. J. Arid Environ. 2005, 63, 596–614. [Google Scholar] [CrossRef]
Xue, Y.J.; Liu, S.G.; Zhang, L.; Hu, Y.M. Integrating fuzzy logic with piecewise linear regression for detecting vegetation greenness change in the Yukon River Basin, Alaska. Int. J. Remote Sens. 2013, 34, 4242–4263. [Google Scholar] [CrossRef]
Piao, S.L.; Fang, J.Y.; Zhou, L.M.; Ciais, P.; Zhu, B. Variations in satellite-derived phenology in China’s temperate vegetation. Glob. Chang. Biol. 2006, 12, 672–685. [Google Scholar] [CrossRef]
De Jong, R.; Verbesselt, J.; Zeileis, A.; Schaepman, M.E. Shift in global vegetation activity trends. Remote Sens. 2013, 5, 1117–1133. [Google Scholar] [CrossRef] [Green Version]
Kim, D.Y.; Thomas, V.; Olson, J.; Williams, M.; Clements, N. Statistical trend and change-point analysis of land-cover-change patterns in East Africa. Int. J. Remote Sens. 2013, 34, 6636–6650. [Google Scholar] [CrossRef]
Jamali, S.; JÖnsson, P.; Eklundh, L.; ArdÖ, J.; Seaquist, J. Detecting changes in vegetation trends using time series segmentation. Remote Sens. Environ. 2015, 156, 182–195. [Google Scholar] [CrossRef]
Roy, D.P.; Borak, S.J.; Devadiga, S.; Wolfe, R.E.; Zheng, M.; Descloitres, J. The MODIS land product quality assessment approach. Remote Sens. Environ. 2002, 83, 62–76. [Google Scholar] [CrossRef]
Chen, J.; Jonsson, P.; Tamura, M.; Gu, Z.H.; Matsushita, B.; Eklundh, L. A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter. Remote Sens. Environ. 2004, 91, 332–344. [Google Scholar] [CrossRef]
Gu, J.; Li, X.; Huang, C.L.; Okin, G.S. A simplified data assimilation method for reconstructing time-series MODIS NDVI data. Adv. Space Res. 2009, 44, 501–509. [Google Scholar] [CrossRef]
Michishita, R.; Zhen, Y.J.; Chen, J.; Xu, B. Empirical comparison of noise reduction techniques for NDVI time-series based on a new measure. ISPRS J. Photogramm. Remote Sens. 2014, 91, 17–28. [Google Scholar] [CrossRef]
Cai, B.; Yu, R. Advance and evaluation in the long time series vegetation trends research based on remote sensing. J. Remote Sens. 2009, 13, 1170–1186. [Google Scholar]
Wessels, K.J.; Prince, S.D.; Malherbe, J.; Small, J.; Frost, P.E.; Van, Z.D. Can human-induced land degradation be distinguished from the effects of rainfall variability? A case study in South Africa. J. Arid Environ. 2007, 68, 271–297. [Google Scholar] [CrossRef]
Savitzky, A.; Golay, M.J. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 1964, 36, 1627–1639. [Google Scholar] [CrossRef]
Jönsson, P.; Eklundh, L. TIMESAT—A program for analyzing time-series of satellite sensor data. Comput. Geosci. 2004, 30, 833–845. [Google Scholar] [CrossRef]
Ma, M.G.; Veroustraete, F. Reconstructing pathfinder AVHRR land NDVI time-series data for the Northwest of China. Adv. Space Res. 2006, 37, 835–840. [Google Scholar] [CrossRef]
Viovy, N.; Arino, O.; Belward, A. The Best Index Slope Extraction (BISE): A method for reducing noise in NDVI time-series. Int. J. Remote Sens. 1992, 13, 1585–1590. [Google Scholar] [CrossRef]
Beck, P.S.A.; Atzberger, C.; Hogda, K.A.; Johansen, B.; Skidmore, A.K. Improved monitoring of vegetation dynamics at very high latitudes: A new method using MODIS NDVI. Remote Sens. Environ. 2006, 100, 321–334. [Google Scholar] [CrossRef]
Jonsson, P.; Eklundh, L. Seasonality extraction by function fitting to time-series of satellite sensor data. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1824–1832. [Google Scholar] [CrossRef]
Roerink, G.J.; Menenti, M.; Verhoef, W. Reconstructing cloud free NDVI composites using Fourier analysis of time series. Int. J. Remote Sens. 2000, 21, 1911–1917. [Google Scholar] [CrossRef]
De Oliveira, J.C.; Neves Epiphanio, J.C.; Rennó, C.D. Window regression: A spatial-temporal analysis to estimate pixels classified as low-quality in MODIS NDVI time series. Remote Sens. 2014, 6, 3123–3142. [Google Scholar] [CrossRef]
Shao, Y.; Lunetta, R.S. Comparison of support vector machine, neural network, and CART algorithms for the land-cover classification using limited training data points. ISPRS J. Photogramm. Remote Sens. 2012, 70, 78–87. [Google Scholar] [CrossRef]
Poggio, L.; Gimona, A.; Brown, L. Spatio-temporal MODIS EVI gap filling under cloud cover: An example in Scotland. ISPRS J. Photogramm. Remote Sens. 2012, 72, 56–72. [Google Scholar] [CrossRef]
Cho, A.R.; Suh, M.S. Detection of contaminated pixels based on the short-term continuity of NDVI and correction using spatio-temporal continuity. Asia Pac. J. Atmos. Sci. 2013, 49, 511–525. [Google Scholar] [CrossRef]
Yu, L.; Wang, J.; Gong, P. Improving 30 m global land-cover map FROM-GLC with time series MODIS and auxiliary data sets: A segmentation-based approach. Int. J. Remote Sens. 2013, 34, 5851–5867. [Google Scholar] [CrossRef]
Gong, P.; Wang, J.; Yu, L.; Zhao, Y.C.; Zhao, Y.Y.; Liang, L.; Niu, Z.G.; Huang, X.M.; Fu, H.H.; Liu, S.; et al. Finer resolution observation and monitoring of global land cover: First mapping results with Landsat TM and ETM+ data. Int. J. Remote Sens. 2013, 34, 2607–2654. [Google Scholar] [CrossRef]
Gao, F.; Masek, J.; Schwaller, M.; Hall, F. On the blending of the Landsat and MODIS surface reflectance: Predicting daily Landsat surface reflectance. IEEE T. Geosci. Remote Sens. 2006, 44, 2207–2218. [Google Scholar]
Zhu, X.L.; Chen, J.; Gao, F.; Chen, X.H.; Masek, J.G. An enhanced spatial and temporal adaptive reflectance fusion model for complex heterogeneous regions. Remote Sens. Environ. 2010, 114, 2610–2623. [Google Scholar] [CrossRef]
Keogh, E.J.; Pazzani, M.J. An enhanced representation of time series which allows fast and accurate classification. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD’98), New York, NY, USA, 27–31 August 1998; pp. 239–241.
Wang, X.Z.; Smith, K.; Hyndman, R. Characteristic-based clustering for time series data. Data Min. Knowl. Disc. 2006, 13, 335–364. [Google Scholar] [CrossRef]
Kumar, M.; Patel, N.R.; Woo, J. Clustering seasonality patterns in the presence of errors. In Proceedings of the Eighth ACM SIGKDD International Conference: Knowledge Discovery & Data Mining, Edmonton, AB, Canada, 23 July 2002; pp. 557–563.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Xu, L.; Li, B.; Yuan, Y.; Gao, X.; Zhang, T. A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets. Remote Sens. 2015, 7, 8906-8924. https://doi.org/10.3390/rs70708906

AMA Style

Xu L, Li B, Yuan Y, Gao X, Zhang T. A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets. Remote Sensing. 2015; 7(7):8906-8924. https://doi.org/10.3390/rs70708906

Chicago/Turabian Style

Xu, Lili, Baolin Li, Yecheng Yuan, Xizhang Gao, and Tao Zhang. 2015. "A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets" Remote Sensing 7, no. 7: 8906-8924. https://doi.org/10.3390/rs70708906

Article Menu

A Temporal-Spatial Iteration Method to Reconstruct NDVI Time Series Datasets

Abstract

1. Introduction

2. Materials and Methods

2.1. Study Area

2.2. Data and Data Preprocessing

2.3. Methodology

2.3.1. Temporal Estimation

2.3.2. Spatial Estimation

2.4. Technique Evaluation

2.4.1. Accuracies of the Estimated NDVIs of Contaminated Pixels

2.4.2. Ability to Retain the Original Values of the High-Quality Pixels

2.4.3. Number of Contaminated Pixels that Can Be Estimated

3. Results

3.1. Accuracies of the Estimated NDVIs of Contaminated Pixels

3.2. Ability to Retain the Original Values of the High-Quality Pixels

3.3. Number of Contaminated Pixels that Can Be Estimated

4. Discussion

4.1. Limitation of Temporal Estimation of NDVIs of Contaminated Pixels

4.2. Limitation of using Regular Spatial Neighborhood in Spatial Estimation

4.3. Impact of the Percentage of Contaminated Pixels

4.4. Significance of High-Quality Data

4.5. Weaknesses and Further Research

5. Conclusions

Acknowledgments

Author Contributions

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI