UNVI-Based Time Series for Vegetation Discrimination Using Separability Analysis and Random Forest Classification

Liu, Hualiang; Zhang, Feizhou; Zhang, Lifu; Lin, Yukun; Wang, Siheng; Xie, Yefeng

doi:10.3390/rs12030529

by Hualiang Liu ¹, Feizhou Zhang ¹, Lifu Zhang ²,³,*, Yukun Lin ³,⁴, Siheng Wang ⁵ and Yefeng Xie ⁶

¹ Institute of Remote Sensing and Geographic Information Systems, School of Earth and Space Sciences, Peking University, Beijing 100871, China

² Key Laboratory of Oasis Eco-Agriculture, Xinjiang Production and Construction Group, Shihezi University, Shihezi 832003, China

³ Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, China

⁴ University of Chinese Academy of Sciences, Beijing 100049, China

⁵ Beijing Institute of Spacecraft System Engineering, China Academy of Space Technology, Beijing 100094, China

⁶ College of Urban and Environmental Sciences, Peking University, Beijing 100871, China

* Author to whom correspondence should be addressed.

Remote Sens. 2020, 12(3), 529; https://doi.org/10.3390/rs12030529

Submission received: 3 January 2020 / Revised: 31 January 2020 / Accepted: 5 February 2020 / Published: 6 February 2020

Abstract: Land cover data is crucial for earth system modelling, natural resources management, and conservation planning. Remotely sensed time-series data capture the dynamic behavior of vegetation and have been widely used for land cover mapping. Temporal profiles of vegetation indices (VIs), especially the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI), are the most used features derived from time-series spectral data. Whether NDVI or EVI is optimal for generating temporal profiles has not been evaluated. The universal normalized vegetation index (UNVI), a relatively new index that incorporates all spectral bands, has proved more effective than several commonly used satellite-derived VIs in some application scenarios. In this study, we explored the ability of UNVI time series to discriminate different vegetation types in Chaoyang prefecture, northeast China, in comparison with NDVI, EVI, the triangle vegetation index (TVI), and tasseled cap transformation greenness (TCG). These five indices were calculated using Landsat 8 surface reflectance data, and two comparative experiments were conducted. The first experiment analyzed class separabilities using the pairwise JM (Jeffries–Matusita) distance as the indicator, and the results showed that UNVI was superior to EVI, TVI, and TCG, and almost equivalent to NDVI, especially during the peak of the vegetation growing season and for the most indistinguishable vegetation pair, broadleaf and shrubs. The second experiment compared the vegetation classification accuracies obtained using the features of these VI temporal profiles and the corresponding phenological parameters, and the results showed that UNVI classifies the five major vegetation types in Chaoyang prefecture better than the other four indices. Therefore, we conclude that UNVI time series have considerable potential for regional land cover mapping, and we recommend that UNVI be considered in future time-series studies.

Keywords: land cover; UNVI; time series; phenology; JM distance; random forest

1. Introduction

Identifying, characterizing, and mapping land cover is essential for earth system modelling and natural resources planning [1,2]. Land cover data establishes the baseline for environmental monitoring (e.g., biogeochemical cycling, sustainable land use, biodiversity loss) and thematic mapping [3]. Satellite remote sensing has long been used as an ideal technology for land cover mapping and monitoring due to its ability to provide synoptic coverage and repetitive observations over large geographical areas in a cost-effective and nearly real-time manner [4,5]. While multispectral image data from a single date suffer from a high level of spectral confusion or spectral similarity between different cover types, multi-temporal data sets, which have become more accessible to the remote sensing community with the advent of cloud computing [6] and resources like Google Earth Engine [7], have proved to be more effective and accurate for vegetation discrimination and classification [8,9,10]. With both spectral and temporal profiles incorporated, time series and their derivatives, such as vegetation indices (VIs) and further derived vegetation phenological parameters, greatly enrich the information available for vegetation identification. Thus, recent efforts have increasingly focused on multi-temporal images for land cover mapping [11].

Among all the VIs, the most commonly used are the normalized difference vegetation index (NDVI) [12,13,14] and the enhanced vegetation index (EVI) [14,15,16,17,18,19]. They have demonstrated their effectiveness in indicating vegetation status and growing phases, and their multi-temporal data are widely applied to land cover classification [13], agricultural monitoring [9], and change detection [20]. However, with only two or three spectral bands used, NDVI and EVI may discard some valuable information or unique characteristics of a given vegetation type in some growing seasons, which means that the widely used NDVI or EVI may not be the optimal feature for multi-temporal vegetation identification, especially when multiple vegetation types are to be classified simultaneously [21]. Apart from NDVI and EVI, many other VIs have been proposed and compared in many application scenarios, such as estimation of green leaf area index (LAI) and canopy chlorophyll density [22,23]. However, no published study has compared the performance of different VI time series for land cover mapping, so there may exist other vegetation indices that outperform NDVI and EVI.

The universal normalized vegetation index (UNVI), established by Zhang et al. [24], has several advantages over other VIs. UNVI utilizes the information from all observed bands. Expressed as a function of the UPDM (universal pattern decomposition method) coefficients, which are sensor-independent, UNVI allows direct comparisons using data from various sources [24]. Liu et al. [25] used four VIs including UNVI to describe variations of urban land surface temperature (LST), and UNVI showed the best correlation with LST variations. Jiao et al. [26] demonstrated that the vegetation condition index (VCI) based on UNVI has stronger correlations with long-term in situ drought indices than the NDVI-derived VCI, which means UNVI has considerable potential for drought monitoring [26]. UNVI has also been applied to estimate the chlorophyll content of winter wheat, where it demonstrated the best accuracy and stability compared with NDVI and the triangle vegetation index (TVI) [23,27]. For LAI estimation, UNVI has a higher saturation point than NDVI and EVI, and is sensitive to a wider range of vegetation dynamics [23]. These previous studies demonstrate that UNVI is more effective than several commonly used VIs in certain application scenarios. Therefore, we believe that UNVI is an index worth exploring. However, the performance of UNVI multi-temporal profiles for vegetation discrimination remains unknown.

Therefore, the main objective of this study is to evaluate the vegetation discrimination ability of UNVI time series. Four other VIs were employed for comparison: NDVI, EVI, TVI, and tasseled cap transformation greenness (TCG). To accomplish this, we designed two comparative experiments: the first analyzes the class separabilities of specific vegetation types using the time series of UNVI and the other four VIs; the second compares the vegetation classification accuracies obtained using the features of these VI temporal profiles and the derived phenological parameters.

2. Study Area and Data

2.1. Study Area

To evaluate the performance of UNVI time series for vegetation discrimination, we conducted our study using data for the prefecture-level city of Chaoyang, located in the west of Liaoning province, Northeast China (Figure 1). The prefecture spans about 165 kilometers from west to east (118°50′ E to 121°17′ E), and 216 kilometers from south to north (40°25′ N to 42°22′ N), covering an area of 19,736 km². The terrain is characterized by hills and low mountains, and the altitude ranges from 70 m to 1256 m. The elevations are high in the north, northwest and southwest, and much lower toward the east, shaped like a dustpan that opens to the east (Figure 1). Chaoyang prefecture has a monsoon-influenced semi-arid and semi-humid continental climate, with humid, muggy summers and cold, windy, dry winters, and an average annual temperature of 8.5 °C. The average annual precipitation is about 450–500 mm, and over half of the annual rainfall is concentrated in summer from July to August. This area possesses a wide variety of vegetation types, including crops, broadleaf deciduous forests, evergreen coniferous forests, deciduous shrubs, orchards, and grass. Different vegetation types (e.g., shrubs and grass) may share similar phenological characteristics such as greenup onset date and length of growing season. The vegetation diversity and phenological similarity make Chaoyang prefecture a suitable area for our study.

2.2. Geographic Conditions Census Data

The vegetation cover type data used for class separability evaluation and for classification training and verification were from China’s first Geographic Conditions Census (GCC), carried out in 2013–2015. The GCC was proposed by the National Administration of Surveying, Mapping, and Geo-information (NASG). Making full use of high-resolution remote sensing images from space-borne sensors (Pleiades-1A, Worldview-2, etc.) and airborne sensors, together with ground data, the GCC identified the status and spatial distribution of all geographical elements at the nationwide scale, including land cover, topography, and natural and artificial features [28]. The land cover classification scheme includes 12 classes at level 1, 58 classes at level 2, and 135 classes at level 3 (NO.GDPJ 01—2013) [29]. With the involvement of local surveying and mapping departments across the country, the land cover mapping followed the basic workflow of 1) base map production, 2) remote sensing data collection and processing, 3) field survey and verification, and 4) GCC data set construction. Finally, and importantly, since the geographic conditions might have changed during the census period, the GCC data set was checked and approved against the standard time point of 30 June 2015, which means that the census data reflect the land cover status around 30 June 2015 [30].

According to the GCC data of our study area, major land cover types are listed in Table 1. For croplands, about 80% of the area was planted with corn over the last five years, according to the statistical reports of Chaoyang prefecture on national economic and social development (http://www.zgcy.gov.cn/ZGCYS/Tjgg/ [in Chinese]); all crops were harvested once per year due to low annual accumulated temperatures in the far north. Orchards were a mix of fruit- or nut-producing trees and shrubs, which made the intra-class phenological characteristics much more complex and irregular, so this vegetation class was excluded from our multi-temporal class discrimination. Broadleaf deciduous forests and shrubs both consist of broadleaf trees. The main difference between them was height, meaning that it may be difficult to discriminate deciduous forests and shrubs using remote sensing data. Figure 2 shows the land cover distribution in Chaoyang prefecture.

2.3. Time-Series Remote Sensing Data

Considering the availability of high-resolution vegetation cover data from the GCC and the strong fragmentation of the landscape in Chaoyang prefecture (see Figure 2), we used Landsat-8 multi-temporal images instead of the commonly used MODIS (Moderate Resolution Imaging Spectroradiometer) images, which have a much coarser spatial resolution. The Landsat-8 satellite, carrying the pushbroom sensor OLI (Operational Land Imager), was launched on 11 February 2013. OLI collects data in visible and near-infrared bands with a multispectral spatial resolution of 30 m and a revisit cycle of 16 days. Two OLI tiles (path/row: 121/031 and 121/032, as shown in Figure 1) covering the entire Chaoyang prefecture were used. In accordance with the standard time point of the GCC land cover data, 30 June 2015, we downloaded all the available 2015 Landsat 8 OLI surface reflectance data from the USGS data library (https://earthexplorer.usgs.gov/), assuming that vegetation classes remained unchanged throughout the year. In all, 46 scenes (2 tiles × 23 scenes/year) were collected.

2.4. Vegetation Indices

We carefully selected five vegetation indices for comparison. Table 2 displays the formulae for OLI surface reflectance bands and reference sources.

UNVI is based on the UPDM, which assumes that the surface reflectance can be expressed as a linear sum of four standard spectral patterns, i.e., the patterns of vegetation, soil, and water, plus one supplementary pattern (e.g., yellow leaves) [24]:

R_{i} = C_{w} P_{i w} + C_{v} P_{i v} + C_{s} P_{i s} + C_{4} P_{i 4},

(1)

where R_i is the reflectance of band i measured by OLI; P_iw, P_iv, P_is, and P_i4 are the normalized reflectances of the four standard objects (i.e., water, vegetation, soil, and yellow leaves, respectively) in the spectral range of band i; and C_w, C_v, C_s, and C₄ are the corresponding decomposition coefficients for the four standard objects [24,31]. The least squares criterion can be used to estimate these four coefficients by solving the overdetermined system (i.e., one with more equations than unknowns) formed by one instance of Equation (1) per band, as the number of OLI spectral bands is larger than the number of unknown coefficients (four). More details can be found in references [24] and [31]. For convenience, the coefficient matrices for Landsat-8 OLI provided by Zhang et al. [23] are used in this study to calculate C_w, C_v, C_s, and C₄. Using these four coefficients, the UNVI can be expressed as:

UNVI = \frac{C_{v} - 0.1 \times C_{s} - C_{4}}{C_{w} + C_{v} + C_{s}},

(2)

where the denominator C_w+C_v+C_s represents the sum of total reflectance, and the numerator plays a critical role in determining the value of UNVI. The numerator is higher in areas with higher vegetation density because the soil and yellow leaves are covered up. In areas with sparser vegetation, it is lower because of the negative contribution of higher C_s and C₄ due to the scattering of light by the soil and yellow leaves. Thus, the UNVI is more sensitive to a wider range of vegetation dynamics than the traditional VIs [23,24].
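Equations (1) and (2) can be illustrated with a minimal sketch, assuming a made-up pattern matrix. The study itself uses the precomputed Landsat-8 OLI coefficient matrices of Zhang et al. [23]; here the least-squares problem is instead solved directly through the normal equations, and the `PATTERNS` values, the function names, and the synthetic pixel are all hypothetical placeholders rather than published quantities.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def updm_coefficients(patterns, reflectance):
    """Least-squares fit of Equation (1) via the normal equations (P^T P) C = P^T R."""
    k = len(patterns[0])
    ptp = [[sum(row[i] * row[j] for row in patterns) for j in range(k)]
           for i in range(k)]
    ptr = [sum(row[i] * r for row, r in zip(patterns, reflectance))
           for i in range(k)]
    return solve_linear(ptp, ptr)


def unvi(c_w, c_v, c_s, c_4, a=0.1):
    """Equation (2): UNVI from the four decomposition coefficients."""
    return (c_v - a * c_s - c_4) / (c_w + c_v + c_s)


# HYPOTHETICAL standard patterns: one row per band, columns are
# (water, vegetation, soil, yellow leaves). Not the published UPDM values.
PATTERNS = [
    [0.30, 0.05, 0.10, 0.05],
    [0.28, 0.06, 0.12, 0.08],
    [0.22, 0.10, 0.15, 0.15],
    [0.12, 0.06, 0.18, 0.25],
    [0.05, 0.50, 0.22, 0.35],
    [0.02, 0.25, 0.28, 0.30],
    [0.01, 0.12, 0.26, 0.22],
]

# Synthetic pixel built from known coefficients, then decomposed again.
true_c = [0.10, 0.60, 0.20, 0.05]
pixel = [sum(p * c for p, c in zip(row, true_c)) for row in PATTERNS]
c_w, c_v, c_s, c_4 = updm_coefficients(PATTERNS, pixel)
pixel_unvi = unvi(c_w, c_v, c_s, c_4)
```

Seven bands and four unknowns give the overdetermined system described above; with real data the fit would not be exact, and the residual would carry the part of the spectrum that the four patterns cannot explain.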

There are a number of reasons for choosing the other four vegetation indices to compare with UNVI. NDVI is the best-known and most widely used indicator of vegetation canopies; it enhances the difference between chlorophyll’s strong reflection in the near-infrared (NIR) band and strong absorption in the red band. EVI is designed to improve sensitivity over dense vegetation conditions and to eliminate the influence of atmosphere and canopy background that commonly contaminate NDVI [35]. NDVI and EVI are the most widely used vegetation indices for time series analysis, so the evaluation of any other index should start with a comparison with NDVI and EVI. TVI was developed based on the reflectance difference between the NIR and red bands in conjunction with the green band [22]. TVI was chosen because its time series data performed outstandingly in land cover class separability analysis [36] and crop classification [37] compared with many other VIs. Tasseled cap transformation greenness (TCG) is also a band-transformation VI that incorporates information from all bands [34], which is quite similar to UNVI, so TCG was also selected for comparison.
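As a rough illustration of the band-ratio indices discussed above, the sketch below computes NDVI, EVI, and TVI from surface reflectance values. Table 2 is not reproduced in this text, so the expressions are the forms commonly cited in the literature and are assumed, not confirmed, to match it; the function names and sample reflectances are hypothetical. TCG is omitted because its sensor-specific coefficients are not given here.

```python
def ndvi(nir, red):
    """Normalized difference of the NIR and red reflectances."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly cited coefficients
    (gain 2.5, aerosol terms 6 and 7.5, canopy background 1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def tvi(nir, red, green):
    """Triangle vegetation index in its commonly cited form."""
    return 0.5 * (120.0 * (nir - green) - 200.0 * (red - green))


# Made-up reflectances (0 to 1 scale) for a green canopy pixel.
blue, green, red, nir = 0.05, 0.08, 0.10, 0.40
values = {
    "NDVI": ndvi(nir, red),
    "EVI": evi(nir, red, blue),
    "TVI": tvi(nir, red, green),
}
```

The large constants in TVI explain why its values span a much wider range than those of the ratio-based indices.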

3. Methods

The overall procedure for evaluation of vegetation discrimination ability of UNVI time series is shown in Figure 3. The Landsat 8 surface reflectance data were first preprocessed to obtain noise-free VI time series, and GCC land cover data was downsampled to make it consistent with the Landsat data. Then two comparative experiments, separability analysis and random forest classification, were conducted.

For the downloaded multi-temporal Landsat 8 images, in order to eliminate noisy observations, pixels contaminated by clouds, cloud shadows, and snow were first masked according to the quality assessment (QA) bands delivered along with the surface reflectance bands. UNVI and the other four VIs were then calculated for the whole year using the formulae listed in Table 2. However, the pixels masked by the QA band leave gaps in the temporal profile, which may influence the profile reconstruction, so shape-preserving piecewise cubic spline interpolation was used to fill these temporal gaps. In order to eliminate the fluctuations and noise arising primarily from the sensor system, atmospheric constituents and surface anisotropy [38,39,40], the temporal profiles of the five VIs were then reconstructed with a Savitzky–Golay (SG) filter with a window size of 4.
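The gap-filling and smoothing steps can be sketched as follows, with two loudly stated simplifications: linear interpolation stands in for the shape-preserving cubic interpolation used in the study, and a fixed 5-point quadratic Savitzky–Golay kernel stands in for the study's SG configuration. The function names and the sample profile are hypothetical.

```python
def fill_gaps(series):
    """Fill None entries by linear interpolation between valid neighbours.
    Gaps at either end take the nearest valid value.
    (Simplified stand-in for shape-preserving cubic interpolation.)"""
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        raise ValueError("series has no valid observations")
    filled = list(series)
    for i in range(len(series)):
        if filled[i] is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * series[left] + w * series[right]
    return filled


# Classic 5-point quadratic Savitzky-Golay smoothing weights.
SG_KERNEL = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savitzky_golay_5(series):
    """Smooth interior points with the 5-point kernel; the two points at
    each end are left unchanged in this sketch."""
    smoothed = list(series)
    for i in range(2, len(series) - 2):
        smoothed[i] = sum(k * series[i + o]
                          for k, o in zip(SG_KERNEL, range(-2, 3)))
    return smoothed


# Made-up NDVI-like profile; None marks composites masked by the QA band.
raw = [0.18, 0.20, None, 0.35, 0.62, None, None, 0.70, 0.41, 0.22, 0.19]
profile = savitzky_golay_5(fill_gaps(raw))
```

Filling before filtering matters because the SG kernel assumes evenly spaced samples with no missing values.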

For the land cover data from GCC, the spatial resolution was downsampled to 30 m, and the pixels were exactly snapped to the Landsat 8 raster images. Considering the widespread small patches and complex distribution of different vegetation types, we applied the morphological operator “erosion” with a structuring element of 3×3 pixels to each vegetation type to remove the outermost pixels of all patches, which are probably mixed pixels. The numbers of pixels and area percentages of the five major vegetation types are listed in Table 3.
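A minimal sketch of binary erosion on a single class mask is given below. The structuring element is a square whose side is left as a parameter; the value 3 used in the example is an assumption, chosen because a 3×3 square strips exactly one ring of boundary pixels from each patch. Pixels outside the raster are treated as not belonging to the class, and the function name is hypothetical.

```python
def erode(mask, size=3):
    """Binary erosion with a size x size square structuring element.
    A pixel survives only if every pixel under the element belongs to the
    class; positions outside the raster count as background."""
    half = size // 2
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            keep = True
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if not inside or not mask[rr][cc]:
                        keep = False
            out[r][c] = 1 if keep else 0
    return out


# Made-up 5 x 5 patch of one vegetation class (1 = class, 0 = other).
patch = [[1] * 5 for _ in range(5)]
core = erode(patch)
```

After erosion only the 3×3 core of the 5×5 patch remains, which is the intended effect of discarding likely mixed boundary pixels.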

Based on the smoothed VI time series and GCC land cover reference data, we designed two comparative experiments to evaluate the ability of multi-temporal UNVI to distinguish different vegetation types. The first was a separability analysis of UNVI and the other VIs for the five major vegetation classes. The pixels of each class used for the separability calculation were manually selected from large patches in conjunction with Google Earth images to ensure the purity, reliability and uniform distribution of these pixels. In total, the numbers of selected pixels were 1821 for crops, 1902 for broadleaf forest, 1134 for coniferous forest, 2374 for deciduous shrub, and 849 for grass, as listed in Table 3. The second experiment compared classification performance using the different VI time series and their corresponding phenological parameters as feature sets for the random forest classifier. More details of these two experiments are described below.

3.1. Separability Analysis

The purpose of the separability analysis was to evaluate how well UNVI time series separate the different vegetation classes. The separability of each pair of classes can be quantitatively measured by the average distance between the pairwise class density distributions or histograms [41] of VI values. Three indicators are commonly used: the separability index (SI) [42,43,44,45], the transformed divergence (TD) [46,47] and the Jeffries–Matusita distance (JM distance) [10,17,36,46]. SI between two classes is defined as the difference between the VI means normalized by the sum of the VI standard deviations. The difference of the means reflects the inter-class variability, while the sum of the standard deviations represents the intra-class variability. The limitation of SI is that when the means of the two classes are equal, SI will always be zero and cannot accurately reflect the separability [41]. Moreover, SI can only measure the separability in a one-dimensional feature space such as an individual time point [45,48]; thus it cannot meet the need of our study to calculate the separability of multi-temporal VIs. Compared with TD, the JM distance has been suggested to be more reliable for separability measurement [49] and more suitable for less homogeneous major classes [41]. We therefore chose the JM distance to indicate the separability between each pair of vegetation types.

The JM distance is calculated as [50]:

JM = 2 (1 - e^{- B}),

(3)

where B is the Bhattacharyya distance:

B = \frac{1}{8} {(μ_{i} - μ_{j})}^{T} {(\frac{Σ_{i} + Σ_{j}}{2})}^{- 1} (μ_{i} - μ_{j}) + \frac{1}{2} \ln (| \frac{Σ_{i} + Σ_{j}}{2} | / \sqrt{| Σ_{i} | \cdot | Σ_{j} |}),

(4)

where, for classes i and j, μ is the mean vector of reflectance values and Σ is the variance-covariance matrix. The JM distance ranges from 0 (completely inseparable) to 2 (completely separable), with a larger value indicating a higher degree of separability between the two classes. In this study, the JM distance was calculated for the ten possible pairs of the five vegetation classes, using both single-date and multi-temporal UNVI, NDVI, EVI, TVI, and TCG.
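For a single time point, the mean vector and covariance matrix in Equation (4) reduce to a scalar mean and variance, which permits a short sketch of Equations (3) and (4) next to the separability index. The function names and sample statistics are hypothetical, and the multi-temporal case would require the full matrix form.

```python
import math


def separability_index(mean_i, std_i, mean_j, std_j):
    """SI: difference of the means normalized by the sum of the standard deviations."""
    return abs(mean_i - mean_j) / (std_i + std_j)


def jm_distance_1d(mean_i, var_i, mean_j, var_j):
    """Equations (3) and (4) for one feature (a single time point)."""
    pooled = (var_i + var_j) / 2.0
    b = ((mean_i - mean_j) ** 2 / (8.0 * pooled)
         + 0.5 * math.log(pooled / math.sqrt(var_i * var_j)))
    return 2.0 * (1.0 - math.exp(-b))


# Made-up class statistics: equal means, different spreads.
si_equal_means = separability_index(0.5, 0.1, 0.5, 0.2)
jm_equal_means = jm_distance_1d(0.5, 0.1 ** 2, 0.5, 0.2 ** 2)

# Made-up class statistics: well-separated means, small spreads.
jm_far_apart = jm_distance_1d(0.2, 0.02 ** 2, 0.8, 0.02 ** 2)
```

With equal means SI is exactly zero, whereas the second term of the Bhattacharyya distance still registers the difference in spread, so the JM distance stays above zero. This is the limitation of SI noted in the text.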

3.2. Vegetation Classification

To ascertain the ability of the UNVI time series to distinguish the five major classes in Chaoyang prefecture, we further conducted a classification comparison with the other four VIs using the random forest (RF) method. RF is an ensemble learning algorithm that constructs a multitude of decision trees (without pruning) from randomly selected training samples and features, and decides the final class by the majority vote of all the trees [51]. RF is superior to many other classifiers because of its fast training speed, easy parameterization, high accuracy, and insensitivity to noise [15]. RF has been extensively used in the remote sensing community [52], especially for land cover mapping with remotely sensed time-series data [18,53,54]. Two parameters need to be set in the RF algorithm: the number of decision trees (ntree) and the number of features to be selected for each decision split (mtry). Since RF rarely overfits, ntree can be as large as possible and is usually several hundred [55]. In this study, ntree was set to 100 for the five VIs, which is sufficient for classification purposes [18,52,55]; mtry was set to the square root of the total number of input features, which is the commonly used recommended default [52].

3.2.1. Features Used for Classification

Apart from the VI temporal profiles, the further derived vegetation phenology is usually added to the feature sets to achieve better classification accuracy [16,56,57]. Vegetation phenology represents the timing of specific stages of growth and development in the annual plant cycle, including germination, greenup, canopy growth, and senescence. Although phenology is mainly driven by human activities and climatic factors such as temperature, sunlight and rainfall, it is intrinsically determined by vegetation species, which means that different vegetation types have distinct phenological characteristics that can in turn be used for vegetation discrimination. The most prevalent approach to monitoring phenology is based on tracking the temporal change of a VI. In this study, phenological features were obtained using the TIMESAT software package in the MATLAB environment [58,59]. A total of 13 phenological parameters can be extracted from TIMESAT, all of which were used to improve the accuracy of the classifier in our study. The names and definitions of these phenological parameters are displayed in Table 4. In particular, for SOS and EOS, the appropriate fractions of the seasonal amplitude are user-definable and vary from region to region [58,60,61]. The thresholds are usually determined by correlation analysis with field phenophase observations [62,63]. According to Kong et al. [12], whose study area is nearly adjacent to Chaoyang prefecture, the SOS and EOS derived using a threshold of 20% for both are consistent with other related research. Thus we set both thresholds to 20%.
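The amplitude-threshold idea behind SOS and EOS can be sketched as follows. This is a simplified illustration and not TIMESAT's implementation: each level is measured from the minimum on the corresponding side of the seasonal peak, the crossing is located by linear interpolation between composites, and the function name and sample profile are hypothetical.

```python
def threshold_season_dates(profile, fraction=0.2):
    """Return (sos, eos) as fractional composite indices, or None where the
    profile never crosses the threshold level on that side of the peak."""
    peak = max(range(len(profile)), key=lambda i: profile[i])
    peak_value = profile[peak]

    # Threshold levels: side minimum plus a fraction of the rise to the peak.
    left_min = min(profile[: peak + 1])
    right_min = min(profile[peak:])
    sos_level = left_min + fraction * (peak_value - left_min)
    eos_level = right_min + fraction * (peak_value - right_min)

    sos = None
    for i in range(peak, 0, -1):  # walk left from the peak
        if profile[i - 1] < sos_level <= profile[i]:
            step = profile[i] - profile[i - 1]
            sos = (i - 1) + (sos_level - profile[i - 1]) / step
            break

    eos = None
    for i in range(peak, len(profile) - 1):  # walk right from the peak
        if profile[i] >= eos_level > profile[i + 1]:
            step = profile[i] - profile[i + 1]
            eos = i + (profile[i] - eos_level) / step
            break
    return sos, eos


# Made-up smoothed VI profile with a single growing season.
season = [0.1, 0.1, 0.2, 0.4, 0.6, 0.8, 0.6, 0.4, 0.2, 0.1, 0.1]
sos, eos = threshold_season_dates(season, fraction=0.2)
length_of_season = eos - sos
```

Positions are expressed in composite periods, so multiplying by the 16-day interval converts them to days.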

For each VI, we thus obtained a total of 36 features (23 VI time points and 13 phenological parameters) for classification.

3.2.2. Training and Validation

For the classifier training, we randomly sampled each vegetation type in a certain proportion throughout Chaoyang prefecture. It can be seen from Table 3 that the pixel numbers of the five vegetation types varied tremendously. The coverage of crops and shrubs was much larger than that of the other classes; the area percentages of crops, shrubs, conifers, broadleaf, and grass are 40.5%, 40.0%, 10.8%, 7.6%, and 1.1%, respectively. In other words, we were facing an imbalanced classification problem. Therefore, instead of sampling the same proportion of each class, we under-sampled the majority classes and over-sampled the minority classes to balance the class sizes [64]. As a result, only 5% of the crop and shrub pixels and up to 30% of the grass pixels were selected as training samples, as shown in Table 3. However, the randomly selected pixels may be non-uniformly distributed over the study area, which could have uncertain influences on the classification results. We therefore repeated the classification experiments 30 times for all the VIs to ensure the stability and reliability of the classification results.
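The class-dependent sampling and the 30 repetitions can be sketched as below. The 5% and 30% fractions come from the text; the fractions for conifers and broadleaf and all pixel counts are hypothetical placeholders, since Table 3 is not reproduced here, and the function name is likewise an assumption.

```python
import random


def sample_training_pixels(pixels_by_class, fractions, seed=0):
    """Draw a class-specific fraction of pixel ids without replacement."""
    rng = random.Random(seed)
    training = {}
    for name, pixels in pixels_by_class.items():
        count = max(1, round(len(pixels) * fractions[name]))
        training[name] = rng.sample(pixels, count)
    return training


# HYPOTHETICAL pixel ids per class; real counts would come from Table 3.
PIXELS = {
    "crops": list(range(0, 1000)),
    "shrubs": list(range(1000, 2000)),
    "conifers": list(range(2000, 2300)),
    "broadleaf": list(range(2300, 2500)),
    "grass": list(range(2500, 2600)),
}

FRACTIONS = {
    "crops": 0.05,      # from the text
    "shrubs": 0.05,     # from the text
    "conifers": 0.15,   # hypothetical
    "broadleaf": 0.20,  # hypothetical
    "grass": 0.30,      # from the text
}

# One independent draw per repetition of the classification experiment.
runs = [sample_training_pixels(PIXELS, FRACTIONS, seed=s) for s in range(30)]
```

Seeding each repetition separately keeps every draw reproducible while still varying the spatial distribution of the training pixels between runs.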

Because of the availability of GCC land cover data, we were able to identify the vegetation class of each pixel in Chaoyang prefecture (Figure 2), so all pixels of the five major vegetation types remaining after the “erosion” operation were used as reference for validation, i.e., orchards and “other” classes (described in Table 1) and the pixels “eroded” by the morphological structuring element were excluded from the validation dataset. Evaluation measures of classification performance are usually based on a confusion matrix, each row of which represents the actual class from reference data while each column represents the predicted class. Therefore, the diagonal elements represent the numbers of pixels correctly classified, while off-diagonal elements are those that are mislabeled. For land cover classification, the most commonly used measures calculated from the confusion matrix are user’s accuracy (UA), producer’s accuracy (PA), overall accuracy (OA), and the kappa coefficient [3]. UA and PA are assessments of individual class accuracy, and they are complements of commission error and omission error, respectively. For a confusion matrix with elements M_ij,

UA_{i} = \frac{M_{i i}}{\sum_{j = 1}^{k} M_{i j}},

(5)

PA_{i} = \frac{M_{i i}}{\sum_{j = 1}^{k} M_{j i}},

(6)

where k is the size of confusion matrix and k = 5 in our study, representing the number of vegetation classes. UA and PA are also known as recall and precision respectively in binary classification problems.

OA is the percentage of correctly classified pixels among all reference pixels, and it mainly reflects the accuracy of the majority classes when data is imbalanced. OA is therefore not an appropriate measure for our study, and G-mean and macro-average F-measure were introduced to get a sense of performance on minority classes [64,65,66,67]. G-mean is considered a suitable measure since it concerns the performance of all classes [66,68]. For multi-class problems, G-mean is calculated as the geometric mean of the UA of all classes [67,68]:

\text{G-mean} = {(\prod_{i = 1}^{k} UA_{i})}^{\frac{1}{k}} .

(7)

F-measure takes into account the trade-off between UA and PA of each class, and is calculated as the harmonic mean of UA and PA. For our imbalanced multi-class data, the macro-average F-measure (Macro_F), which is computed as a simple average over all classes, gives equal weight to each class regardless of its size [65,67]:

\text{Macro\_F} = \frac{1}{k} \sum_{i = 1}^{k} \frac{2 \times UA_{i} \times PA_{i}}{UA_{i} + PA_{i}} .

(8)

The kappa coefficient indicates how much better the classification result is than random assignment. It is calculated as [69]:

κ = \frac{O A - p_{e}}{1 - p_{e}},

(9)

where p_e is the hypothetical probability of chance agreement, which can be regarded as a penalty factor to data bias. Kappa is thought to be a robust measure of how the classifier performed across all classes, thus it was also used in our study.
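Equations (5) to (9) can be computed from a confusion matrix as sketched below. The sketch follows the index convention of Equations (5) and (6) literally (UA normalizes by the sum over the second index, PA by the sum over the first). The expression for p_e is the standard chance-agreement term from the marginal totals, which is an assumption here because the text does not spell it out. The function name and the example matrix are hypothetical.

```python
import math


def accuracy_measures(m):
    """Accuracy measures for a square confusion matrix m (list of lists).
    Assumes every class has at least one pixel in each marginal total."""
    k = len(m)
    total = sum(sum(row) for row in m)
    row_sums = [sum(row) for row in m]
    col_sums = [sum(m[i][j] for i in range(k)) for j in range(k)]

    ua = [m[i][i] / row_sums[i] for i in range(k)]  # Equation (5)
    pa = [m[i][i] / col_sums[i] for i in range(k)]  # Equation (6)
    oa = sum(m[i][i] for i in range(k)) / total

    g_mean = math.prod(ua) ** (1.0 / k)  # Equation (7)
    macro_f = sum(2 * u * p / (u + p) for u, p in zip(ua, pa)) / k  # Equation (8)

    # Chance agreement from the marginal totals (standard definition, assumed).
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - p_e) / (1.0 - p_e)  # Equation (9)

    return {"UA": ua, "PA": pa, "OA": oa, "G-mean": g_mean,
            "Macro_F": macro_f, "kappa": kappa}


# Made-up two-class confusion matrix for illustration only.
example = accuracy_measures([[40, 10], [5, 45]])
```

Because G-mean is a product, a single class with zero accuracy drives it to zero, which is what makes it sensitive to poorly classified minority classes.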

To assess statistical differences between the classification accuracies obtained using UNVI and the other four VIs, Z-tests were performed to test whether two VI features produced similar accuracy [70,71]. The test statistic incorporates the overall kappa and the kappa variance:

Z = \frac{κ_{1} - κ_{2}}{\sqrt{σ_{κ_{1}}^{2} + σ_{κ_{2}}^{2}}},

(10)

where κ_{1} and κ_{2} are the kappa coefficients of the two classifications, and σ_{κ_{1}}^{2} and σ_{κ_{2}}^{2} are the corresponding variances, which can be calculated by the formula given by Fleiss et al. [72]. Assuming a normal distribution of Z, a difference is considered statistically significant at the 0.05 significance level (p \leq 0.05) if | Z | > 1.96 [71].
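Equation (10) and its decision rule amount to only a few lines. In this sketch the kappa variances are taken as given inputs, since the variance formula of Fleiss et al. [72] is not reproduced in the text; the function name and the sample values are hypothetical.

```python
import math


def kappa_z_test(kappa_1, var_1, kappa_2, var_2, critical=1.96):
    """Equation (10): Z statistic for two kappa coefficients and whether the
    difference is significant at the level implied by the critical value."""
    z = (kappa_1 - kappa_2) / math.sqrt(var_1 + var_2)
    return z, abs(z) > critical


# Made-up kappa values and variances for two classifications.
z_value, significant = kappa_z_test(0.80, 0.0004, 0.70, 0.0005)
```

The test is two-sided, so only the magnitude of Z is compared with 1.96; the sign merely shows which classification has the higher kappa.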

4. Results

4.1. VI Temporal Profiles

In order to get a general understanding of the phenological characteristics of the five major vegetation types, we calculated the averaged VIs of the selected typical pixels (see Table 3) at each time point. The temporal profiles reconstructed by the SG filter are shown in Figure 4. It should be noted that the five VIs differ in value range. NDVI of vegetation is restricted to the interval 0 to 1, and UNVI slightly exceeds that range. The values of EVI and TCG are usually smaller than NDVI, while TVI varies over a much larger range. For each vegetation type, the five indices showed similar trends over time, except for coniferous forest. UNVI and NDVI of conifers kept high values throughout the year, while EVI, TVI, and TCG were relatively low. This is because EVI, TVI, and TCG are more sensitive to the reflectance of the near-infrared (NIR) spectral band (see the formulae in Table 2), and conifers are characterized by strong absorption (because of the high dry matter content of a needle) and strong scattering (the “photon trap” caused by the structure of a coniferous shoot) in the NIR band [73]. The NIR reflectance of conifers therefore tends to be low, resulting in relatively low values of EVI, TVI, and TCG. For each VI, the five classes differed in phenology, especially crops and conifers. Crops were distinctly characterized by late SOS, early EOS, short LOS, and large increase and decrease rates. Broadleaf, shrubs, and grass shared highly similar phenological characteristics, including the same greenup and senescence time, with only subtle differences in VI magnitude. Thus, these three vegetation classes were expected to be difficult to distinguish from each other, especially the pair of broadleaf and shrub.

4.2. Pairwise Vegetation Separabilities

Based on UNVI and the other four VIs (i.e., NDVI, EVI, TVI, TCG), the JM distances between each pair of classes were calculated for each single time point. The results are shown in Figure 5, where the ten subgraphs represent the ten pairwise vegetation combinations. The horizontal and vertical axes represent the vegetation index and the time scale, respectively. The color of each grid cell represents the JM distance for the corresponding VI and time: the redder the grid cell, the higher the JM distance and the more separable the two classes. As the figure shows, the pairwise JM distances varied significantly over time due to the phenologically driven differences between the paired classes. Crops and conifers were highly distinguishable from any other vegetation. The JM distance between crops and other vegetation (except grass) was large during early spring (late March to late May), because in this period crops (mainly corn) had yet to be planted, while broadleaf, conifers, and shrubs were budding and expanding leaves and thus were turning green, so crops could be easily differentiated from those trees. Unlike broadleaf, conifers, and shrubs, grass maintained relatively low VI values throughout the growing season, so crops were more separable from grass during the heading period (July to August), when the crops reached their most luxuriant stage. The most remarkable separability between conifers and the other vegetation classes occurred during the cold seasons (from late autumn to early spring), when conifers remained green while other vegetation was either dormant or harvested. Broadleaf, shrubs, and grass were the least distinguishable from each other since they shared similar phenological characteristics, all greening up and withering at almost the same time. 
The only difference between them was the degree of lushness during their growing seasons, which was also why broadleaf and grass were relatively easier to distinguish than the other pairs among the three (see the subgraph of “broadleaf-grass”). It is particularly noteworthy that broadleaf and shrubs could not be distinguished at any time point, since they differed only in the height of the trees (see the descriptions in Table 1).
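For readers who wish to reproduce a single cell of such a chart, the sketch below computes the JM distance between two classes for one VI at one time point. It assumes that each class is normally distributed, which is the usual form of the JM distance; the function name and the sample values are hypothetical and are not taken from this study.

```python
import math
from statistics import mean, variance


def jm_distance_1d(class_a, class_b):
    """JM distance between two classes for a single feature.

    ASSUMPTION: both classes follow a normal distribution and have
    non-zero sample variance.
    """
    m1, m2 = mean(class_a), mean(class_b)
    v1, v2 = variance(class_a), variance(class_b)
    # Bhattacharyya distance between two univariate Gaussians.
    b = ((m1 - m2) ** 2 / (4 * (v1 + v2))
         + 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2))))
    # JM distance is bounded between 0 (identical) and 2 (fully separable).
    return 2 * (1 - math.exp(-b))


# Hypothetical VI samples: same class means, different intra-class spread.
tight_a, tight_b = [0.50, 0.52, 0.54], [0.60, 0.62, 0.64]
wide_a, wide_b = [0.40, 0.52, 0.64], [0.50, 0.62, 0.74]
print(jm_distance_1d(tight_a, tight_b))  # small spread
print(jm_distance_1d(wide_a, wide_b))    # large spread, same means
```

With the class means held fixed, the pair with the larger intra-class spread yields the smaller JM distance, which is the mechanism by which intra-class variability affects separability.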

Although the vegetation indices differ in their capabilities for vegetation discrimination, overall UNVI was superior to EVI, TVI, and TCG, and almost equivalent to NDVI. Specifically, we conclude the following ranking of separability: UNVI≈NDVI>TCG>EVI≈TVI. The difference between UNVI and NDVI was very subtle. During the dormant period (January to March, November to December), NDVI was slightly better than UNVI in distinguishing vegetation (except coniferous forest), such as crop-broadleaf, crop-shrub, and broadleaf-grass, because UNVI’s depiction of vegetation growth is more elaborate [23], and thus it is more sensitive to vegetation heterogeneity, resulting in a higher intra-class variability (variance) and thus a smaller JM distance than NDVI. However, during the peak of the growing season (whole year for conifers, July to August for other vegetation), UNVI performed better than NDVI. This is because UNVI is less likely to saturate when vegetation density is high. Therefore, UNVI can capture the nuances of luxuriant vegetation when NDVI might already have reached its saturation point. For the pair of broadleaf and shrubs, which could not be reliably distinguished by any VI, UNVI performed marginally better than NDVI and the other three, which was more evident in the multi-temporal JM distance, as shown in Figure 6.

In practice, multi-temporal VIs are most often used for land cover mapping. Therefore, the JM distance of the multi-temporal VI was further calculated; the result is shown in Figure 6. The horizontal axis represents the VIs, and the vertical axis represents the time-series length (in units of 16 days), i.e., the number of composite periods involved in the JM distance calculation. For each time series length i (i = 1, 2, …, 23), there are C_{23}^{i} temporal combinations, and their mean JM distances are given in Figure 6. The figure shows that the JM distance increased significantly with increasing time series length. When the time series length was greater than 5, all the pairwise classes became almost completely separable, except for “broadleaf-shrub” and “shrub-grass.” As the time series length increased, the JM distance between broadleaf and shrubs increased slowly. When roughly half of the yearly time series (13 out of 23 composites) was used, broadleaf and shrubs were only moderately separable. The differences in vegetation discrimination ability among the five VIs became smaller when multi-temporal VIs were incorporated. When the VI time series of the whole year (all 23 composites) were used, the differences between the five VIs were very subtle and the JM distances mostly reached or approached 2.0. However, it can still be seen that UNVI and NDVI performed better than EVI, TVI, and TCG. It is worth noting that for the most indistinguishable pair, “broadleaf-shrub,” UNVI outperformed the other four indices.
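The averaging over temporal combinations can be sketched as follows. To keep the example free of matrix algebra, it makes a simplifying assumption that is not part of this study: the composites are treated as statistically independent, so the Bhattacharyya distance of a subset of dates is the sum of the per-date distances. The usual multi-temporal form uses the full class covariance matrices instead. The function name and the per-date values are hypothetical.

```python
import math
from itertools import combinations


def mean_jm_over_combinations(b_per_date, length):
    """Mean JM distance over all date subsets of a given length.

    b_per_date: Bhattacharyya distance of one class pair at each date.
    ASSUMPTION: dates are independent, so the distance of a subset is
    the sum of its per-date distances (a simplification, see lead-in).
    """
    jm_values = [2 * (1 - math.exp(-sum(subset)))
                 for subset in combinations(b_per_date, length)]
    return sum(jm_values) / len(jm_values)


# Number of temporal combinations for each length i with 23 composites.
counts = {i: math.comb(23, i) for i in range(1, 24)}
print(counts[1], counts[5], counts[23])

# Hypothetical per-date Bhattacharyya distances for a 6-date series.
b_per_date = [0.05, 0.10, 0.40, 0.80, 0.30, 0.10]
for length in range(1, len(b_per_date) + 1):
    print(length, round(mean_jm_over_combinations(b_per_date, length), 3))
```

Because every added date can only increase the summed distance under this assumption, the mean JM distance rises monotonically with the time series length and levels off as it approaches 2.0.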

4.3. Classification Results and Accuracy Assessment

Based on the features of multi-temporal VIs and phenological parameters, vegetation classifications were implemented using the RF method, and confusion matrices were calculated. As the classification was repeated 30 times to eliminate the possible influence of the non-uniform distribution of training pixels, we calculated the means and standard deviations of the five metrics, i.e., UA, PA, kappa coefficient, G-mean, and macro F-measure. We found that the standard deviations of these five metrics were very small, on the order of 10⁻⁴. Therefore, only the mean values of the five metrics are presented, and the standard deviations are not considered further.

Table 5 shows the average UAs and PAs of the five vegetation types. For UNVI, crops and conifers were highly accurate in terms of both PA and UA. The PA of shrubs was excellent (93.96%), while the UA was only 82.49%. The broadleaf and grass classes were poorly classified, with quite low UAs (49.40% and 34.12%, respectively). By analyzing the confusion matrix, it was found that relatively low PA (high omission error) of shrubs corresponded to the low UA (high commission error) of broadleaf and grass. In other words, many shrub pixels were misclassified as broadleaf and grass. This is consistent with the small pairwise JM distance of these three vegetation types, as discussed in Section 4.2. A side-by-side comparison of UNVI with NDVI, EVI, TVI, and TCG shows that the differences among the five indices were small. This is easy to understand, because the JM distances of the five indices mostly approached 2.0 when all 23 composites were used (see Figure 6), not to mention the phenological features added for classification. Still, the performance of UNVI was slightly better than that of the other four VIs.
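The accuracy measures discussed here can all be derived from a confusion matrix, as the following sketch shows. The definitions used are common ones and are assumptions for this example (macro F-measure as the mean of per-class F1 scores, G-mean as the geometric mean of per-class PAs); the function name and the matrix values are hypothetical and are not results of this study.

```python
import math


def accuracy_metrics(cm):
    """UA, PA, kappa, macro F-measure, and G-mean from a confusion matrix.

    cm[i][j] = number of pixels of reference class i predicted as class j.
    ASSUMPTION: every class has at least one reference pixel and at least
    one predicted pixel, so no row or column sum is zero.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # PA (producer's accuracy) = 1 - omission error, per reference class.
    pa = [cm[i][i] / row_sums[i] for i in range(n)]
    # UA (user's accuracy) = 1 - commission error, per predicted class.
    ua = [cm[i][i] / col_sums[i] for i in range(n)]
    observed = sum(cm[i][i] for i in range(n)) / total
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    f1 = [2 * u * p / (u + p) if u + p else 0.0 for u, p in zip(ua, pa)]
    return {
        "UA": ua,
        "PA": pa,
        "kappa": kappa,
        "macro_F": sum(f1) / n,
        "G_mean": math.prod(pa) ** (1 / n),
    }


# Hypothetical 3-class matrix: class 0 pixels leak into classes 1 and 2.
cm = [[80, 12, 8],
      [2, 18, 0],
      [1, 0, 9]]
for name, value in accuracy_metrics(cm).items():
    print(name, value)
```

In this hypothetical matrix, the omission error of the first class appears as commission error in the other two, lowering their UAs even though their own PAs stay high.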

Comparisons of the averaged kappa coefficients, macro F-measures, and G-means of the five indices are shown in Figure 7. All three metrics of UNVI were higher than those of the other VIs, indicating that UNVI had a better classification result for the five vegetation types. Since broadleaf, shrubs, and grass were hard to differentiate, we additionally ran a classification restricted to these three vegetation classes. The purpose of this experiment was to eliminate the influences of crops and conifers on classification and accuracy assessment, and to explore the ability of the five indices to address “intractable problems.” The results are shown in Figure 8. It can be seen that UNVI still outperformed the other four VIs in distinguishing the less distinguishable classes.

The Z-test values between the classification results using the UNVI time series and those using the other four VIs are listed in Table 6. The Z-test value between the UNVI and NDVI results was 17.71, far above the threshold value of 1.96 at the 0.05 significance level; the corresponding p-value was extremely small (on the order of 10⁻⁷⁰), so we used 0.0001 for comparison, which is small enough to indicate significance. The differences between the results of UNVI and those of EVI, TVI, and TCG were greater than that between UNVI and NDVI. Thus, we can conclude that the UNVI time series classifies the five vegetation types significantly better than the other four VI time series, although the differences in accuracy measures seem trivial.
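The relation between a Z value and its p-value can be checked with a short sketch. It assumes the commonly used form of the test for two independent kappa coefficients, Z = |κ₁ − κ₂| / sqrt(var₁ + var₂), and a two-sided p-value from the standard normal distribution; the function names and the kappa and variance inputs are hypothetical, and only the Z values 1.96 and 17.71 come from the text.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def kappa_z_test(kappa_1, var_1, kappa_2, var_2):
    """Z statistic and p-value for the difference of two kappa values.

    ASSUMPTION: the two kappa estimates are independent and their
    variances are already known (e.g., from large-sample formulas).
    """
    z = abs(kappa_1 - kappa_2) / math.sqrt(var_1 + var_2)
    return z, two_sided_p(z)


print(two_sided_p(1.96))   # close to the 0.05 significance level
print(two_sided_p(17.71))  # the Z value reported for UNVI vs. NDVI

# Hypothetical kappa values and variances, for illustration only.
z, p = kappa_z_test(0.80, 1e-4, 0.77, 1e-4)
print(z, p)
```

Using the complementary error function rather than subtracting a cumulative probability from 1 keeps very small p-values from being rounded to zero.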

5. Discussion

We conducted two comparative experiments to investigate the vegetation discrimination ability of UNVI time series, in comparison with NDVI, EVI, TVI, and TCG.

The first experiment evaluated the class separabilities of the five VIs, with JM distance used to indicate the separability between each vegetation pair. The overall results showed that UNVI was superior to EVI, TVI, and TCG, and almost equivalent to NDVI. During the dormant periods (January to March, November to December), NDVI performed slightly better than UNVI in distinguishing vegetation, possibly due to UNVI’s high intra-class variability, which mainly results from two causes. First, during dormancy, vegetation does not completely cover the soil. The contributions of soil and yellow leaves to UNVI vary from place to place, resulting in fluctuations of UNVI even for the same vegetation type (see Equation 4). Second, UNVI depicts vegetation dynamics (mainly determined by chlorophyll content and LAI) more elaborately, e.g., a small variation in LAI may cause a larger variation in UNVI than in NDVI [23]. Thus, UNVI is more sensitive to intra-class heterogeneities caused by variations in environmental conditions across a large geographic area. Therefore, UNVI has a higher intra-class variability (variance), which makes the classes less separable during dormancy, when inter-class variability is low. However, during the peak of the growing season, UNVI outperformed the other four indices, for two main reasons. First, UNVI is more sensitive to vegetation dynamics [23]; thus UNVI is able to characterize the vegetation growth status more precisely, which makes different vegetation types more separable. Second, UNVI has a higher saturation point with respect to LAI [23,24]. When vegetation densities are high, UNVI can still capture the nuances of the dense vegetation, while other indices might already have saturated. From the above analysis, we can conclude that UNVI’s higher sensitivity to vegetation dynamics is a double-edged sword for vegetation discrimination. On the one hand, different vegetation types exhibit different macroscopic characteristics such as canopy density. 
Also, the differences in these characteristics are more pronounced in UNVI. As a result, UNVI achieves a higher inter-class variability than other VIs. In this sense, the sensitivity of UNVI makes different classes more separable. On the other hand, the same type of vegetation usually grows unevenly in a region due to environmental variations. This heterogeneity is more likely to be captured by UNVI because of its sensitivity. Thus the intra-class variability of UNVI is also high. From this perspective, the sensitivity of UNVI is not conducive to vegetation discrimination. During the growing season, UNVI’s high intra-class variability is outweighed by its high inter-class variability. Therefore, when time-series data from the growing season are available, UNVI is suggested. However, during dormancy, when inter-class variability is low, NDVI performs better than UNVI, so NDVI is recommended.

In the second experiment, RF classification was implemented to classify the five major vegetation classes in the study area. We chose RF over other classifiers mainly based on the following considerations: (1) RF is the most commonly used algorithm in land cover classification using time series remote sensing data [15,16,19,53,54,74], so the results of RF classification are the most representative and convincing. (2) Classification results are usually susceptible to parameter tuning, so our comparative study favors a classifier with fewer parameters, and RF is known for this advantage, with two free parameters (ntree and mtry) that are relatively easy to tune [52,53]. For each VI, the features of the VI time series and the further derived phenological parameters were used for classification. Although other features, such as time-series spectral bands and DEM information, may help to improve classification performance, they may also contribute to the classifier with different importance for different VIs, and thus could weaken or obscure the differences between these indices. So, for our evaluation and comparison purposes, only the features related to the VI time series were used for classification. The overall evaluation measures (kappa coefficients, macro F-measures, and G-means) showed that UNVI classified the five vegetation classes slightly better. It should be underlined that for each VI, the entire time series throughout the year and all the phenological parameters extracted from TIMESAT were used, which means that we tested the ultimate abilities of the VI time series for classification. Thus, the differences between the accuracy measures of the classification results were not very substantial, which is consistent with the fact that the JM distances of the five indices mostly approached 2.0 when all 23 time composites were used (see Figure 6). 
On the other hand, Z-test results indicated that the classification results using UNVI time series were significantly better than using the other four VI time series.

Broadleaf, shrubs, and grass were difficult to differentiate from each other and were often misclassified as one another, because they have similar seasonal behaviors and differ only in canopy density during the growing season. In particular, broadleaf forests and shrubs are both made up of broadleaf deciduous trees in our study area and differ only in the height of the trees (see the descriptions in Table 1). Consequently, we obtained low JM distances and low classification accuracy when trying to discriminate broadleaf and shrubs. However, UNVI outperformed the other four VIs in discriminating these less distinguishable classes. This is mainly due to the fact that UNVI is more sensitive to LAI and thus better captures the subtle differences in canopy density between broadleaf and shrubs.

In this study, the UNVI time series performed well in discriminating the major vegetation types in Chaoyang prefecture, mainly due to UNVI’s sensitivity to vegetation dynamics and high saturation point with respect to LAI. These advantages also indicate that UNVI has the potential to be used in studies related to vegetation dynamics, such as estimating crop yield or vegetation productivity, where a VI might be a key metric. Thus, we suggest that the potential of UNVI for characterizing and quantifying vegetation dynamics be explored in future work. Additionally, as UNVI and NDVI outperform each other in different seasons for vegetation discrimination, combining UNVI and NDVI time series may be instructive from the perspective of improving classification accuracy. We therefore also recommend that future work evaluate the effectiveness of combined VI time series for vegetation discrimination.

6. Conclusions

Time series of vegetation indices and the phenological features derived from them have been widely used for land cover mapping. In previous studies, researchers used NDVI or EVI by default to generate temporal profiles. However, whether they are the optimal VIs for multi-temporal vegetation discrimination has not been evaluated. This study explored this issue and investigated the vegetation discrimination ability of UNVI time series in comparison with NDVI, EVI, TVI, and TCG. We chose UNVI for evaluation mainly because of its full use of all spectral bands and its good performance in many application scenarios. Two comparative experiments, separability analysis and RF classification, were conducted. The results of our study indicate that:

(1) For the overall separability of different types of vegetation, UNVI is superior to EVI, TVI, and TCG, and almost equivalent to NDVI. During the dormant periods, NDVI performs better than UNVI due to the uncertain contribution of soils and yellow leaves to UNVI. However, during the peak of the vegetation growing season, UNVI outperforms NDVI, EVI, TVI, and TCG, mainly because of its sensitivity to vegetation dynamics and high saturation point with respect to LAI.
(2) The UNVI time series and its derived phenological parameters can better classify the five major vegetation classes than NDVI, EVI, TVI, and TCG, as indicated by the comparisons of kappa coefficients, macro F-measures, and G-means.
(3) For the most indistinguishable vegetation pair, broadleaf and shrub, which differ only in the height of trees, UNVI achieves a relatively larger JM distance and higher classification accuracy.

The UNVI time series therefore has considerable potential for regional land cover mapping. Thus, we recommend using UNVI in future time series studies, such as vegetation classification, change detection, and dynamic monitoring.

Author Contributions

Funding acquisition, L.Z.; investigation, H.L. and F.Z.; methodology, H.L., F.Z., and Y.L.; supervision, L.Z.; validation, H.L., F.Z., and S.W.; writing—original draft, H.L.; writing—review and editing, Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Major Science and Technology Projects of XPCC (2018AA00402), the Innovation Team of XPCC’s Key Area (2018CB004) and the Key Research Program of Frontier Sciences, CAS (ZDBS-LY-DQC012).

Acknowledgments

The authors are grateful to the anonymous reviewers and the academic editor for their constructive comments, which greatly improved the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lambin, E.F.; Turner, B.L.; Geist, H.J.; Agbola, S.B.; Angelsen, A.; Bruce, J.W.; Coomes, O.T.; Dirzo, R.; Fischer, G.; Folke, C. The causes of land-use and land-cover change: moving beyond the myths. Glob. Environ. Chang. 2001, 11, 261–269.
2. Sellers, P.; Dickinson, R.E.; Randall, D.; Betts, A.; Hall, F.; Berry, J.; Collatz, G.; Denning, A.; Mooney, H.; Nobre, C. Modeling the exchanges of energy, water, and carbon between continents and the atmosphere. Science 1997, 275, 502–509.
3. Foody, G.M. Status of land cover classification accuracy assessment. Remote Sens. Environ. 2002, 80, 185–201.
4. Franklin, S.; Wulder, M. Remote sensing methods in medium spatial resolution satellite data land cover classification of large areas. Prog. Phys. Geogr. 2002, 26, 173–205.
5. Sohl, T.; Sleeter, B. Role of remote sensing for land-use and land-cover change modeling. In Remote sensing of land use and land cover: Principles and applications, 1st ed.; Chandra, P.G., Ed.; CRC Press: Boca Raton, FL, USA, 2012; pp. 225–239.
6. Zhu, Z.; Woodcock, C.E. Object-based cloud and cloud shadow detection in Landsat imagery. Remote Sens. Environ. 2012, 118, 83–94.
7. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27.
8. Wolter, P.T.; Mladenoff, D.J.; Host, G.E.; Crow, T.R. Improved Forest Classification in the Northern Lake States Using Multi-Temporal Landsat Imagery. Photogramm. Eng. Remote Sens. 1995, 61, 1129–1143.
9. Conese, C.; Maselli, F. Use of multitemporal information to improve classification performance of TM scenes in complex terrain. ISPRS J. Photogramm. Remote Sens. 1991, 46, 187–197.
10. Wardlow, B.D.; Egbert, S.L.; Kastens, J.H. Analysis of time-series MODIS 250m vegetation index data for crop classification in the U.S. Central Great Plains. Remote Sens. Environ. 2007, 108, 290–310.
11. Carrasco, L.; O’Neil, A.W.; Morton, R.D.; Rowland, C.S. Evaluating Combinations of Temporally Aggregated Sentinel-1, Sentinel-2 and Landsat 8 for Land Cover Mapping with Google Earth Engine. Remote Sens. 2019, 11, 288.
12. Kong, F.; Li, X.; Wang, H.; Xie, D.; Li, X.; Bai, Y. Land Cover Classification Based on Fused Data from GF-1 and MODIS NDVI Time Series. Remote Sens. 2016, 8, 741.
13. Knight, J.F.; Lunetta, R.S.; Ediriwickrema, J.; Khorram, S. Regional scale land cover characterization using MODIS-NDVI 250 m multi-temporal imagery: A phenology-based approach. GIScience Remote Sens. 2006, 43, 1–23.
14. Brown, J.C.; Kastens, J.H.; Coutinho, A.C.; Victoria, D.D.C.; Bishop, C.R. Classifying multiyear agricultural land use data from Mato Grosso using time-series MODIS vegetation index data. Remote Sens. Environ. 2013, 130, 39–50.
15. Clark, M.L.; Aide, T.M.; Grau, H.R.; Riner, G. A scalable approach to mapping annual land cover at 250 m using MODIS time series data: A case study in the Dry Chaco ecoregion of South America. Remote Sens. Environ. 2010, 114, 2816–2832.
16. Senf, C.; Pflugmacher, D.; Van Der Linden, S.; Hostert, P. Mapping rubber plantations and natural forests in Xishuangbanna (Southwest China) using multi-spectral phenological metrics from MODIS time series. Remote Sens. 2013, 5, 2795–2812.
17. Arvor, D.; Jonathan, M.; Meirelles, M.S.P.; Dubreuil, V.; Durieux, L. Classification of MODIS EVI time series for crop mapping in the state of Mato Grosso, Brazil. Int. J. Remote Sens. 2011, 32, 7847–7871.
18. Gong, P.; Wang, J.; Yu, L.; Zhao, Y.; Zhao, Y.; Liang, L.; Niu, Z.; Huang, X.; Fu, H.; Liu, S. Finer resolution observation and monitoring of global land cover: First mapping results with Landsat TM and ETM+ data. Int. J. Remote Sens. 2013, 34, 2607–2654.
19. do Nascimento Bendini, H.; Garcia Fonseca, L.M.; Schwieder, M.; Sehn Körting, T.; Rufin, P.; Del Arco Sanches, I.; Leitão, P.J.; Hostert, P. Detailed agricultural land classification in the Brazilian cerrado based on phenological information from dense satellite image time series. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101872.
20. Verbesselt, J.; Hyndman, R.; Newnham, G.; Culvenor, D. Detecting trend and seasonal changes in satellite image time series. Remote Sens. Environ. 2010, 114, 106–115.
21. Hu, Q.; Wu, W.; Song, Q.; Lu, M.; Chen, D.; Yu, Q.; Tang, H. How do temporal and spectral features matter in crop classification in Heilongjiang Province, China? J. Integr. Agric. 2017, 16, 324–336.
22. Broge, N.H.; Leblanc, E. Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote Sens. Environ. 2001, 76, 156–172.
23. Zhang, L.; Qiao, N.; Baig, M.H.A.; Huang, C.; Lv, X.; Sun, X.; Zhang, Z. Monitoring vegetation dynamics using the universal normalized vegetation index (UNVI): An optimized vegetation index-VIUPD. Remote Sens. Lett. 2019, 10, 629–638.
24. Zhang, L.; Furumi, S.; Muramatsu, K.; Fujiwara, N.; Daigo, M.; Zhang, L. A new vegetation index based on the universal pattern decomposition method. Int. J. Remote Sens. 2007, 28, 107–124.
25. Liu, K.; Su, H.; Zhang, L.; Yang, H.; Zhang, R.; Li, X. Analysis of the Urban Heat Island Effect in Shijiazhuang, China Using Satellite and Airborne Data. Remote Sens. 2015, 7, 4804–4833.
26. Jiao, W.; Zhang, L.; Chang, Q.; Fu, D.; Cen, Y.; Tong, Q. Evaluating an enhanced vegetation condition index (VCI) based on VIUPD for drought monitoring in the continental United States. Remote Sens. 2016, 8, 224.
27. Jiang, H.L.; Yang, H.; Chen, X.P.; Wang, S.D.; Li, X.K.; Liu, K.; Cen, Y. Research on Accuracy and Stability of Inversing Vegetation Chlorophyll Content by Spectral Index Method. Spectrosc. Spectr. Anal. 2015, 35, 975.
28. Zhang, J.; Li, W.; Zhai, L. Understanding geographical conditions monitoring: a perspective from China. Int. J. Digit. Earth 2015, 8, 38–57.
29. The National Survey of Geographical Conditions Leading Group Office, S.C., P.R.C. General Situation and Index of Geographical Conditions (Chinese Manual, GDPJ 01-2013); The National Survey of Geographical Conditions Leading Group Office, State Council, P.R.C: Beijing, China, 2013.
30. Zhang, T.; Lei, B.; Gan, Y.; Hu, Y.; Liu, K. National satellite image coverage using overall planning technique. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 5488–5491.
31. Zhang, L.; Furumi, S.; Muramatsu, K.; Fujiwara, N.; Daigo, M.; Zhang, L. Sensor-independent analysis method for hyperspectral data based on the pattern decomposition method. Int. J. Remote Sens. 2006, 27, 4899–4910.
32. Rouse, J.W.; Haas, R.H.; Deering, D.W.; Schell, J.A.; Harlan, J.C. Monitoring the vernal advancement and retrogradation (green wave effect) of natural vegetation; E73-10693; NASA: Greenbelt, MD, USA, 1973; p. 112.
33. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
34. Baig, M.H.A.; Zhang, L.; Tong, S.; Tong, Q. Derivation of a tasselled cap transformation based on Landsat 8 at-satellite reflectance. Remote Sens. Lett. 2014, 5, 423–431.
35. Huete, A.; Liu, H.; Batchily, K.; Van Leeuwen, W. A comparison of vegetation indices over a global set of TM images for EOS-MODIS. Remote Sens. Environ. 1997, 59, 440–451.
36. Yeom, J.; Han, Y.; Kim, Y. Separability analysis and classification of rice fields using KOMPSAT-2 High Resolution Satellite Imagery. Res. J. Chem. Env. 2013, 17, 136–144.
37. Su, T.; Liu, Q.; Su, X. Study on crop remote sensing classification based on multiple vegetation index time series and machine learning. Jiangsu Agric. Sci. 2017, 45, 219–224.
38. Jonsson, P.; Eklundh, L. Seasonality extraction by function fitting to time-series of satellite sensor data. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1824–1832.
39. Shao, Y.; Lunetta, R.S.; Wheeler, B.; Iiames, J.S.; Campbell, J.B. An evaluation of time-series smoothing algorithms for land-cover classifications using MODIS-NDVI multi-temporal data. Remote Sens. Environ. 2017, 174, 258–265.
40. Cai, Z.; Jönsson, P.; Jin, H.; Eklundh, L. Performance of smoothing methods for reconstructing NDVI time-series and estimating vegetation phenology from MODIS data. Remote Sens. 2017, 9, 1271.
41. Thomas, I.L.; Ching, N.P.; Benning, V.M.; D’aguanno, J.A. A review of multi-channel indices of class separability. Int. J. Remote Sens. 1987, 8, 331–350.
42. Kaufman, Y.J.; Remer, L.A. Detection of forests using mid-IR reflectance: an application for aerosol studies. IEEE Trans. Geosci. Remote Sens. 1994, 32, 672–683.
43. Deng, C.; Wu, C. BCI: A biophysical composition index for remote sensing of urban environments. Remote Sens. Environ. 2012, 127, 247–259.
44. Somers, B.; Asner, G.P. Multi-temporal hyperspectral mixture analysis and feature selection for invasive species mapping in rainforests. Remote Sens. Environ. 2013, 136, 14–27.
45. Hu, Q.; Sulla-Menashe, D.; Xu, B.; Yin, H.; Tang, H.; Yang, P.; Wu, W. A phenology-based spectral and temporal feature selection method for crop mapping from satellite time series. Int. J. Appl. Earth Obs. Geoinf. 2019, 80, 218–229.
46. Swain, P.H.; Davis, S.M. Remote sensing: The quantitative approach. IEEE Trans. Pattern Anal. Mach. Intell. 1981, 713–714.
47. Chen, L.; Jin, Z.; Michishita, R.; Cai, J.; Yue, T.; Chen, B.; Bing, X. Dynamic monitoring of wetland cover changes using time-series remote sensing imagery. Ecol. Inform. 2014, 24, 17–26.
48. Liu, J.; Zhao, Y. Methods on Optimal Bands Selection in Hyperspectral Remote Sensing Data Interpretation. J. Grad. Sch. Acad. Sin. 1999, 12, 153–161.
49. Swain, P.; Robertson, T.; Wacker, A. Comparison of the divergence and B-distance in feature selection. LARS Inf. Note 1971, 20871, 47906-41399.
50. Richards, J.A.; Jia, X. Feature Reduction. In Remote Sensing Digital Image Analysis: An Introduction, 4th ed.; Springer Berlin Heidelberg: Berlin, Heidelberg, 2006; pp. 267–294.
51. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
52. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
53. Hao, P.; Zhan, Y.; Wang, L.; Niu, Z.; Shakir, M. Feature selection of time series MODIS data for early crop classification using random forest: A case study in Kansas, USA. Remote Sens. 2015, 7, 5347–5369.
54. Nitze, I.; Barrett, B.; Cawkwell, F. Temporal optimisation of image acquisition for land cover classification with Random Forest and MODIS time-series. Int. J. Appl. Earth Obs. Geoinf. 2015, 34, 136–146.
55. Guan, H.; Li, J.; Chapman, M.; Deng, F.; Ji, Z.; Yang, X. Integration of orthoimagery and lidar data for object-based urban thematic mapping using random forests. Int. J. Remote Sens. 2013, 34, 5166–5186.
56. Zhou, F.; Zhang, A.; Townley-Smith, L. A data mining approach for evaluation of optimal time-series of MODIS data for land cover mapping at a regional level. ISPRS J. Photogramm. Remote Sens. 2013, 84, 114–129.
57. Chen, Y.; Song, X.; Wang, S.; Huang, J.; Mansaray, L.R. Impacts of spatial heterogeneity on crop area mapping in Canada using MODIS data. ISPRS J. Photogramm. Remote Sens. 2016, 119, 451–461.
58. Jönsson, P.; Eklundh, L. TIMESAT—a program for analyzing time-series of satellite sensor data. Comput. Geosci. 2004, 30, 833–845.
59. Eklundh, L.; Jönsson, P. TIMESAT 3.3 software manual; Lund and Malmö University: Sweden, 2017; p. 82.
60. de Beurs, K.M.; Henebry, G.M. Spatio-temporal statistical methods for modelling land surface phenology. In Phenological Research; Springer: New York, NY, USA, 2010; pp. 177–208.
61. Ghosh, S.; Mishra, D. Analyzing the Long-Term Phenological Trends of Salt Marsh Ecosystem across Coastal LOUISIANA. Remote Sens. 2017, 9, 1340.
62. Ren, J.; Campbell, J.; Shao, Y. Estimation of SOS and EOS for Midwestern US corn and soybean crops. Remote Sens. 2017, 9, 722.
63. Karlsen, S.R.; Solheim, I.; Beck, P.S.; Høgda, K.A.; Wielgolaski, F.E.; Tømmervik, H. Variability of the start of the growing season in Fennoscandia, 1982–2002. Int. J. Biometeorol. 2007, 51, 513–524.
64. Chen, C.; Liaw, A.; Breiman, L. Using random forest to learn imbalanced data. Univ. Calif. Berkeley 2004, 110, 24.
65. Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to information retrieval; Online ed.; Cambridge University Press: Cambridge, UK, 2008; pp. 253–287.
66. Zarinabad, N.; Wilson, M.P.; Gill, S.K.; Manias, K.A.; Davies, N.P.; Peet, A.C. Multiclass imbalance learning: Improving classification of pediatric brain tumors from magnetic resonance spectroscopy. Magn. Reson. Med. 2017, 77, 2114–2124.
67. Liu, X.; Li, Q.; Zhou, Z. Learning Imbalanced Multi-class Data with Optimal Dichotomy Weights. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining, Washington, DC, USA, 7–10 December 2013; pp. 478–487.
68. Fiori, M.; Martino, M.D.; Fernández, A. An optimal multiclass classifier design. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 480–485.
69. Cohen, J. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 1960, 20, 37–46.
70. de Leeuw, J.; Jia, H.; Yang, L.; Liu, X.; Schmidt, K.; Skidmore, A.K. Comparing accuracy assessments to infer superiority of image classification methods. Int. J. Remote Sens. 2006, 27, 223–232.
71. Foody, G.M. Classification accuracy comparison: Hypothesis tests and the use of confidence intervals in evaluations of difference, equivalence and non-inferiority. Remote Sens. Environ. 2009, 113, 1658–1663.
72. Fleiss, J.L.; Cohen, J.; Everitt, B.S. Large sample standard errors of kappa and weighted kappa. Psychol. Bull. 1969, 72, 323–327.
73. Knyazikhin, Y.; Schull, M.A.; Stenberg, P.; Mõttus, M.; Rautiainen, M.; Yang, Y.; Marshak, A.; Carmona, P.L.; Kaufmann, R.K.; Lewis, P. Hyperspectral remote sensing of foliar nitrogen content. Proc. Natl. Acad. Sci. 2013, 110, E185–E192.
74. Zhao, Y.; Feng, D.; Yu, L.; Wang, X.; Chen, Y.; Bai, Y.; Hernández, H.J.; Galleguillos, M.; Estades, C.; Biging, G.S.; et al. Detailed dynamic land cover mapping of Chile: Accuracy improvement by integrating multi-temporal data. Remote Sens. Environ. 2016, 183, 170–185.

Figure 1. Left: Location of the study area (filled with green) and Landsat 8 footprints (red boxes). Right: Digital elevation model (DEM) of the study area.

Figure 2. Distribution of land cover around 30 June 2015 in Chaoyang prefecture from Geographic Conditions Census (GCC) data. The resolution has been downsampled from 0.5 m to 30 m. It illustrates the numerous small patches and complex distribution of different classes.

Figure 3. Flow chart of data processing and comparative experiments in our study.

Figure 4. Averaged VI temporal profiles for the five major vegetation types. Panels (a–e) refer to universal normalized vegetation index (UNVI), normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), triangle vegetation index (TVI), and tasseled cap transformation greenness (TCG), respectively.

Figure 5. Pairwise Jeffries–Matusita (JM) distance charts for each single-temporal VI.

Figure 6. Pairwise JM distance charts for all multi-temporal VI combinations.

Figure 7. Comparisons of Kappa, Macro F-measure, and G-mean from different VIs for the five major vegetation types.

Figure 8. Comparisons of Kappa, Macro F-measure, and G-mean from different VIs for the three less separable vegetation types.

Table 1. Major land cover types in Chaoyang prefecture.

Class Code	Land Cover Type	Area Percentage	Descriptions
0120	Cropland	30.5%	Cultivated lands for dry crops such as wheat, corn, bean, potatoes, and vegetables, excluding greenhouses
0211	Orchard	4.4%	Orchards that comprise fruit- or nut-producing trees and shrubs
0311	Broadleaf deciduous forest	12.8%	Forests mainly composed of dicotyledonous trees (accounting for at least 65%) with height over 5 m
0312	Evergreen coniferous forest	11.5%	Forests mainly composed of gymnospermous trees (accounting for at least 65%) with height over 5 m, e.g., pine, cypress, and fir
3021	Deciduous shrub	28.9%	Composed of broad-leaved shrubs and small trees, with height lower than 5 m
0411	Grass	4.1%	Grasslands with good water condition and grass coverage greater than 50%
-	Other classes	7.8%	Mainly buildings, greenhouses, roads, waters, mining fields, etc.

Table 2. Vegetation indices used for comparison in this study. The Landsat 8 surface reflectance bands used for vegetation index (VI) calculations are: band 2 (blue, 450–515 nm), band 3 (green, 525–600 nm), band 4 (red, 630–680 nm), band 5 (near infrared, 845–885 nm), band 6 (short wavelength infrared, 1560–1660 nm), and band 7 (short wavelength infrared, 2100–2300 nm).

Vegetation Index	Formula for OLI	Source
UNVI	$UNVI = \frac{C_{v} - 0.1 \times C_{s} - C_{4}}{C_{w} + C_{v} + C_{s}}$	[24]
NDVI	$\frac{B_{5} - B_{4}}{B_{5} + B_{4}}$	[32]
EVI	$2.5 \times \frac{B_{5} - B_{4}}{B_{5} + 6 \times B_{4} - 7.5 \times B_{2} + 1}$	[33]
TVI	$0.5 \times [120 \times (B_{5} - B_{3}) - 200 \times (B_{4} - B_{3})]$	[22]
TCG	$- 0.2941 \times B_{2} - 0.2430 \times B_{3} - 0.5424 \times B_{4} + 0.7276 \times B_{5} + 0.0713 \times B_{6} - 0.1608 \times B_{7}$	[34]
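
The formulas in Table 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing code: the function names and the sample reflectance values are hypothetical, the band inputs are assumed to be surface reflectance scaled to 0–1, and the UNVI function assumes that the coefficients C_w, C_v, C_s, and C_4 have already been obtained for the pixel (their derivation is not shown here).

```python
# Sketch of the Table 2 formulas for a single pixel.
# Band arguments follow the Landsat 8 OLI numbering used in Table 2
# (b2 = blue, b3 = green, b4 = red, b5 = NIR, b6 and b7 = SWIR).

def ndvi(b4, b5):
    return (b5 - b4) / (b5 + b4)

def evi(b2, b4, b5):
    return 2.5 * (b5 - b4) / (b5 + 6.0 * b4 - 7.5 * b2 + 1.0)

def tvi(b3, b4, b5):
    return 0.5 * (120.0 * (b5 - b3) - 200.0 * (b4 - b3))

def tcg(b2, b3, b4, b5, b6, b7):
    return (-0.2941 * b2 - 0.2430 * b3 - 0.5424 * b4
            + 0.7276 * b5 + 0.0713 * b6 - 0.1608 * b7)

def unvi(c_w, c_v, c_s, c_4):
    # Coefficients are assumed to be precomputed inputs (hypothetical here).
    return (c_v - 0.1 * c_s - c_4) / (c_w + c_v + c_s)

# Hypothetical reflectances for a densely vegetated pixel.
pixel = {"b2": 0.03, "b3": 0.06, "b4": 0.05, "b5": 0.45, "b6": 0.20, "b7": 0.10}

print("NDVI:", ndvi(pixel["b4"], pixel["b5"]))
print("EVI: ", evi(pixel["b2"], pixel["b4"], pixel["b5"]))
print("TVI: ", tvi(pixel["b3"], pixel["b4"], pixel["b5"]))
print("TCG: ", tcg(**pixel))
```

Applying the same functions to every acquisition date of a pixel yields the VI temporal profile used in the separability and classification experiments.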

Table 3. Numbers of pixels, coverage percentages, typical pixels selected, and training proportions of the five major vegetation types.

Vegetation Type	Crop	Broadleaf Forest	Coniferous Forest	Shrub	Grass
Number of pixels in Chaoyang	2,599,371	488,218	694,601	2,568,218	72,824
Coverage percentage	40.5%	7.6%	10.8%	40.0%	1.1%
Number of typical pixels selected	1821	1902	1134	2374	849
Proportion for training	5%	20%	20%	5%	30%

Table 4. The phenological parameters extracted from TIMESAT.

Phenological Parameters	Definitions
Time for the start of season (SOS)	Time for which the left edge has increased to 20% of the seasonal amplitude measured from the left minimum level
Time for the end of season (EOS)	Time for which the right edge has decreased to 20% of the seasonal amplitude measured from the right minimum level
Length of season (LOS)	Time from the start to the end of the season
Base level	Average of the left and right minimum values
Time for the mid of season	Mean value of the times for which, respectively, the left edge has increased to the 80% level and the right edge has decreased to the 80% level
Maximum VI	The largest VI value for the fitted function during the season
Seasonal amplitude	Difference between the maximum value and the base level
Increase rate	The ratio of the difference between the left 20% and 80% levels and the corresponding time difference
Decrease rate	The absolute value of the ratio of the difference between the right 20% and 80% levels and the corresponding time difference
Large seasonal integral	Integral of the function describing the season from the season start to the season end
Small seasonal integral	Integral of the difference between the function describing the season and the base level from season start to season end
Value for the start of season	Value of the function at the time of the start of the season
Value for the end of season	Value of the function at the time of the end of the season
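
To make the definitions in Table 4 concrete, the following sketch derives a few of the parameters (base level, seasonal amplitude, SOS, EOS, and LOS) from a single-season fitted curve. It is an illustrative reading of the definitions, not TIMESAT's implementation: the function names and the sample curve are hypothetical, threshold crossings are located by linear interpolation, and the 20% threshold is assumed to be added to the left (or right) minimum level.

```python
def interp_time(t0, v0, t1, v1, level):
    """Linearly interpolate the time at which the curve reaches `level`."""
    if v1 == v0:
        return t0
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)

def season_parameters(t, v, frac=0.2):
    """Return selected Table 4 parameters for one season of a fitted curve."""
    peak = max(range(len(v)), key=lambda i: v[i])
    left = min(range(0, peak + 1), key=lambda i: v[i])    # left minimum index
    right = min(range(peak, len(v)), key=lambda i: v[i])  # right minimum index
    base = 0.5 * (v[left] + v[right])
    amplitude = v[peak] - base
    sos_level = v[left] + frac * amplitude
    eos_level = v[right] + frac * amplitude

    sos = eos = None
    for i in range(left, peak):          # walk up the left edge
        if v[i] <= sos_level <= v[i + 1]:
            sos = interp_time(t[i], v[i], t[i + 1], v[i + 1], sos_level)
            break
    for i in range(peak, right):         # walk down the right edge
        if v[i] >= eos_level >= v[i + 1]:
            eos = interp_time(t[i], v[i], t[i + 1], v[i + 1], eos_level)
            break

    los = eos - sos if sos is not None and eos is not None else None
    return {"base": base, "max": v[peak], "amplitude": amplitude,
            "sos": sos, "eos": eos, "los": los}

# Hypothetical fitted profile sampled at equal time steps.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
values = [0.1, 0.1, 0.3, 0.5, 0.7, 0.5, 0.3, 0.1, 0.1]
print(season_parameters(times, values))
```

The remaining parameters in Table 4 (rates and integrals) follow the same pattern, using the 20% and 80% crossing times and a numerical integral between SOS and EOS.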

Table 5. Producer’s accuracy (PA) and user’s accuracy (UA) obtained from the different VIs for each vegetation type.

Vegetation Type	UNVI PA (%)	UNVI UA (%)	NDVI PA (%)	NDVI UA (%)	EVI PA (%)	EVI UA (%)	TVI PA (%)	TVI UA (%)	TCG PA (%)	TCG UA (%)
Crop	98.39	99.01	98.52	99.06	98.24	98.84	98.14	98.77	98.21	98.75
Broadleaf	72.71	49.40	71.61	47.71	71.99	45.52	71.14	45.81	69.16	44.03
Conifer	95.39	93.82	95.54	93.58	92.22	91.01	92.96	91.76	93.47	91.77
Shrub	82.49	93.96	81.30	93.62	79.77	92.92	80.21	92.70	79.30	92.28
Grass	75.16	34.12	74.71	32.78	73.88	33.81	73.47	34.20	73.59	34.20
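
Producer's accuracy corresponds to per-class recall and user's accuracy to per-class precision, so summary measures for imbalanced classes can be sketched directly from the UNVI columns of Table 5. The definitions below are assumptions for illustration (macro F-measure as the unweighted mean of per-class F-measures, G-mean as the geometric mean of per-class recalls); the paper's exact definitions may differ, and the function names are hypothetical.

```python
import math

def f_measure(pa, ua):
    """Harmonic mean of producer's accuracy (recall) and user's accuracy (precision)."""
    if pa + ua == 0:
        return 0.0
    return 2.0 * pa * ua / (pa + ua)

def macro_f_measure(pa_list, ua_list):
    """Unweighted mean of per-class F-measures (assumed definition)."""
    scores = [f_measure(pa, ua) for pa, ua in zip(pa_list, ua_list)]
    return sum(scores) / len(scores)

def g_mean(pa_list):
    """Geometric mean of per-class recalls (assumed definition)."""
    return math.prod(pa_list) ** (1.0 / len(pa_list))

# UNVI columns of Table 5: crop, broadleaf, conifer, shrub, grass (in %).
unvi_pa = [98.39, 72.71, 95.39, 82.49, 75.16]
unvi_ua = [99.01, 49.40, 93.82, 93.96, 34.12]

print("Macro F-measure (%):", round(macro_f_measure(unvi_pa, unvi_ua), 2))
print("G-mean (%):", round(g_mean(unvi_pa), 2))
```

Because both measures weight each class equally, the low user's accuracies for broadleaf and grass pull the scores down even though crop, which covers about 40% of the pixels, is classified almost perfectly.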

Table 6. Tests of significance between classification results using UNVI time series and other VIs.

Classification 1	Classification 2	Z-test Value	p-Value
UNVI	NDVI	17.71	<0.0001
UNVI	EVI	60.40	<0.0001
UNVI	TVI	54.99	<0.0001
UNVI	TCG	74.82	<0.0001
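
One common way to obtain a Z value like those in Table 6 is to compare two independent Kappa coefficients using their large-sample variances. The sketch below assumes that form of the test; the paper does not spell out the computation in this section, and the function names, Kappa values, and variances are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def kappa_z_test(kappa_1, var_1, kappa_2, var_2):
    """Z statistic and p-value for the difference between two independent Kappa values."""
    z = abs(kappa_1 - kappa_2) / math.sqrt(var_1 + var_2)
    return z, two_sided_p(z)

# Hypothetical Kappa estimates and variances (not taken from the paper).
z, p = kappa_z_test(0.80, 1e-4, 0.78, 1e-4)
print("Z =", round(z, 3), "p =", round(p, 4))

# Z values of the size reported in Table 6 give p-values far below 0.0001.
print(two_sided_p(17.71) < 0.0001)
```

With validation sets of millions of pixels the variances become very small, which is why even modest accuracy differences produce large Z values.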

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

