Article

Mapping an Invasive Plant Spartina alterniflora by Combining an Ensemble One-Class Classification Algorithm with a Phenological NDVI Time-Series Analysis Approach in Middle Coast of Jiangsu, China

1
Chair of Remote Sensing and Landscape Information Systems, University of Freiburg, 79106 Freiburg, Germany
2
Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing Normal University, Nanjing 210023, China
3
Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing 210023, China
4
School of Geography Science, Nanjing Normal University, Nanjing 210023, China
5
Chair of Forest Growth and Dendroecology, University of Freiburg, 79106 Freiburg, Germany
*
Author to whom correspondence should be addressed.
Remote Sens. 2020, 12(24), 4010; https://doi.org/10.3390/rs12244010
Submission received: 30 October 2020 / Revised: 4 December 2020 / Accepted: 4 December 2020 / Published: 8 December 2020
(This article belongs to the Special Issue Remote Sensing of Invasive Species)

Abstract

Spartina alterniflora (S. alterniflora) is one of the worst plant invaders in the coastal wetlands of China. Accurate and repeatable mapping of S. alterniflora invasion is essential for developing cost-effective management strategies to conserve native biodiversity. Traditional remote-sensing-based mapping methods require extensive fieldwork for sample collection. Moreover, our ability to detect this invasive species is still limited because of poor spectral separability between S. alterniflora and its co-dominant native plants. Therefore, we proposed a novel scheme that uses an ensemble one-class classifier (EOCC) in combination with phenological Normalized Difference Vegetation Index (NDVI) time-series analysis (TSA) to detect S. alterniflora. We evaluated the performance of the EOCC algorithm in two scenarios, i.e., single-scene analysis (SSA) and NDVI-TSA, in the core zones of Yancheng National Natural Reserve (YNNR). A fully supervised classifier, the support vector machine (SVM), was tested in the same two scenarios for comparison. With these scenarios, the crucial phenological stages and the advantage of phenological NDVI-TSA in S. alterniflora recognition were also investigated. Results indicated that the EOCC, using only positive training data, performed similarly to the SVM trained on complete training data in the YNNR. Moreover, the EOCC algorithm showed more robust transferability, with notably higher classification accuracy than the SVM, when transferred to a second site without retraining. Furthermore, when combined with the phenological NDVI-TSA, the EOCC algorithm produced a more balanced sensitivity–specificity result and slightly better transferability than it did in the best phenological stage (i.e., the senescence stage in November). The achieved results (overall accuracy (OA), Kappa, and true skill statistic (TSS) were 92.92%, 0.843, and 0.834 for the YNNR, and 90.94%, 0.815, and 0.825 for transferability to the non-training site) suggest that our detection scheme has high potential for mapping S. alterniflora across different areas, and that the EOCC algorithm can be a viable alternative to traditional supervised classification methods for invasive plant detection.

1. Introduction

Biological invasion has a significant impact on biodiversity, conservation, and ecological security in coastal wetlands [1]. The invasion of Spartina alterniflora (S. alterniflora) is one of the most conspicuous ecological problems in China’s eastern coastal wetland ecosystems [2]. S. alterniflora, a perennial herb native to the Atlantic coast of the Americas from Newfoundland, Canada, south to northern Argentina, was introduced to China’s eastern coast in the 1970s for beach protection, siltation promotion, and saline soil amelioration. However, with its strong adaptability and reproductive ability, S. alterniflora rapidly expanded along the eastern coast of China [3,4]. Its extensive and rapid expansion has resulted in serious ecological problems, such as water and soil pollution, reduction of bird biodiversity, and changes in estuarine sediment dynamics [4,5]. Quantitative data, in particular the up-to-date spatial distribution of S. alterniflora, are therefore crucial for conservation agencies to respond effectively to the expansion of S. alterniflora and to develop timely protection strategies [6].
Although accurate, traditional field-based approaches are time-consuming and costly, and therefore impractical for broad-scale mapping with high spatial accuracy [7]. An alternative approach for identifying S. alterniflora over the entire region employs remote sensing (RS) techniques [5]. However, effective and accurate mapping of S. alterniflora remains a challenge due to poor spectral separability between S. alterniflora and its co-dominant native species [6,7]. Recent studies have found that a phenological approach using a single RS image acquired during the senescence stage effectively increases the spectral separability between S. alterniflora and its co-existent native species [4,6,7,8,9]. Because the phenology of S. alterniflora lags behind that of the native plants, distinct differences in spectral characteristics are present during a brief period [8]. Most studies have relied on a priori information or expert knowledge to determine the periods that capture the plants’ unique phenological signature [6,8,10]. Additionally, studies have not taken advantage of the green-up stage of salt marsh plants to capture the spectral information of S. alterniflora [4]. The intra-annual time-series analysis approach may solve these problems because it can be implemented regardless of the availability of a priori information. It can also capture more phenological features due to the extended observation period [11,12,13]. As a widely accepted and easily calculated vegetation index, the Normalized Difference Vegetation Index (NDVI) has been broadly used to monitor phenological variations and biomass changes of vegetation in time-series analyses [9,11,14]. Moreover, in recent studies, the NDVI derived from the green-up and senescence stages has been effective for separating S. alterniflora from native salt marsh plants [4,7,8,9,10]. This raises the possibility of accurately discriminating S. alterniflora using NDVI time-series.
Supervised classification based on machine learning algorithms is a common option for identifying invasive plants with time-series images [11,13,15,16,17]. However, a key step in using conventional supervised classifiers such as random forest (RF), neural networks (NN), and support vector machine (SVM) is the preparation of training data, which is relatively time-consuming and labor-intensive. Normally, a labeled reference dataset covering an exhaustive set of classes is needed for training these supervised classifiers [18]. This requirement can lead to extensive work on the classes for non-invasive species, including class definition, training sample collection, and classification implementation [19]. Additionally, users are likely to miss some of the classes for non-invasive species during photo interpretation or field assessment, especially for complex scenes, which can lead to incomplete reference data [20]. Therefore, conventional supervised classifiers might be inefficient for identifying invasive species. One way to address these limitations is one-class classification (OCC), which needs labeled training data only for the positive class.
In recent years, many OCC algorithms, such as one-class support vector machines (OCSVM), support vector data description (SVDD), maximum entropy (MaxEnt), biased support vector machines (BSVM), and positive and unlabeled learning (PUL) algorithms, have been successfully applied for classifying Andean wetlands [21], global snow cover [22], raised bogs [18], and several other land-cover types [23,24,25,26,27,28,29]. These OCC methods have also been increasingly used for detecting invasive species [24,28,30,31]. However, it has been emphasized that results derived from OCCs are not equally reliable for all species [32] and that the best-performing models are not always the same for different species [20]. The detection performance and accuracy of individual OCC classifiers vary widely among methods and species. For example, Skowronek et al. (2017) compared the performance of MaxEnt and BSVM for detecting two invasive plants and found that MaxEnt produced higher overall accuracy [28]. In contrast, another study presented an “in-depth” comparison of three OCCs and showed that BSVM outperformed OCSVM and MaxEnt in terms of discriminative power [33]. Piiroinen et al. (2018) confirmed that BSVM can produce very promising accuracy for detecting two invasive tree species in the Eastern Arc Mountains biodiversity hotspot [30]. Moreover, research has indicated that the single classifier with the best performance on current data will not necessarily provide the most accurate result in a transferability analysis [34]. These uncertainties limit what can be assumed about the operational applicability and transferability of the presented OCC methods for mapping a range of invasive species. An ensemble analysis provides an established method for resolving differences between individual OCC classifiers and estimating uncertainty [35]. When several individual classifiers are combined using ensemble methods (e.g., the mean of all classifiers), they can form a more accurate prediction with better transferability, outperforming single classifiers [34]. Model ensemble methods have been widely used in modeling species distributions [32,34,35]. They have also been successfully applied to multi-class classification problems in RS. However, there has been limited empirical investigation of how well ensemble OCC (EOCC) models perform compared to individual OCCs. Moreover, to the best of our knowledge, the potential of the ensemble analysis of multiple OCC classifiers (e.g., MaxEnt, BSVM, and PUL) in invasive species detection has never been assessed.
To this end, we developed an EOCC algorithm based on multiple typical OCC classifiers, including MaxEnt, BSVM, and PUL algorithms. We reported a test of the EOCC algorithm for S. alterniflora detection under two scenarios, single-scene analysis (SSA) and NDVI time-series analysis (NDVI-TSA), in the core zone of Yancheng National Natural Reserve (YNNR). YNNR is located on the middle coast of Jiangsu, China, where S. alterniflora is widely distributed and less affected by human activities. A standard supervised classifier, SVM, was used as a baseline to compare with the EOCC algorithm. By combining phenological NDVI-TSA with the EOCC algorithm, we proposed a new scheme for S. alterniflora detection. We then investigated the transferability of our proposed detection scheme to Dafeng Milu National Nature Reserve (DMNNR). The primary objective of this study was to accurately map S. alterniflora with low cost and high efficiency. Specifically, we addressed the following three overarching research questions:
(1)
How does the EOCC algorithm perform compared to individual OCC algorithms and a standard supervised classifier (SVM) in S. alterniflora detection?
(2)
How much does phenological NDVI-TSA improve S. alterniflora detection?
(3)
Is the detection scheme transferable and robust when it is applied in different regions?

2. Materials and Methods

2.1. Study Area

The middle coast of Jiangsu ranges from the Chuandong Estuary in the south to the Xinyang Estuary in the north (Figure 1a). The region spans subtropical and warm temperate zones and has a marine monsoon climate with moderately well-defined seasonality [9]. The vegetation mainly comprises S. alterniflora, Phragmites australis (P. australis), and Suaeda salsa (S. salsa), which are predominantly located in two regions, namely the core zones of the YNNR (Figure 1b) and the DMNNR (Figure 1c). Due to continuous reclamation and agricultural activities, most of the salt marsh plants distributed between the Doulong and Chuandong estuaries have been replaced by aqua-farms and farmlands (Figure 1a) [9]. We therefore selected these two national nature reserves as the study sites. Specifically, the YNNR was chosen as the study area for evaluating the performance of our proposed method in mapping S. alterniflora, while the DMNNR was used for testing transferability. The YNNR was established in 1983 to protect red-crowned cranes and their habitat. The DMNNR was established in 1986 to protect Elaphurus davidianus and its habitat. Both nature reserves are now under national administration with strict protection implemented [36]. Human activities are strictly forbidden in the core areas of the reserves.

2.2. Remote Sensing Data

The Gaofen-1 (GF-1) satellite was launched in April 2013, carrying two panchromatic/multispectral cameras and four wide-field-of-view (WFV) cameras. Each WFV camera has a resolution of 16 m and four multispectral bands: Blue (B1: 0.45–0.52 μm), Green (B2: 0.52–0.59 μm), Red (B3: 0.63–0.69 μm), and NIR (B4: 0.77–0.89 μm). With the four cameras combined, the GF-1 WFV data have a swath width of 800 km and a revisit time of four days [37]. A total of 28 high-quality GF-1 WFV images with minimal cloud cover were available for the study area in 2015. Only the highest-quality image of each month was selected to construct the monthly time-series [11]. However, no cloud-free GF-1 WFV image (<10% cloud cover) was available for September 2015. The GF-1 WFV image acquired in September 2016 was therefore used in its place in the subsequent time-series analysis (Table 1). All the GF-1 WFV images were freely downloaded from the China Center for Resources Satellite Data and Application.

2.3. Reference Data

Due to the limited infrastructure and the strict protection rules, the central areas of the YNNR and DMNNR are difficult to reach; therefore, our field surveys were carried out only along the periphery and certain narrow roads in the core areas [38]. Two field trips were conducted to collect ground reference samples, in July 2014 and September 2015. With a handheld GPS unit, we recorded the locations of S. alterniflora, native species, and other land-cover types. In total, 206 S. alterniflora points, 105 P. australis points, 123 S. salsa points, and 226 other land-cover type points were collected in the YNNR. As the ground reference data were collected in two years, to eliminate the uncertainty caused by the time shift, we removed nine samples that were distributed at the ecotone between S. alterniflora and native plants and three samples from areas frequently flooded by tides. All the ground samples were then inspected on the Google Earth (GE) images (acquired on 13 July 2015 and 14 August 2015), and the points located in a GF-1 WFV pixel (16 m × 16 m) without clear land-cover information were excluded. The detailed spatial information provided by the GE images helped accurately distinguish S. alterniflora from other land-cover types [4]. To ensure a well-stratified reference dataset, we further supplemented the collected in situ data with visual interpretations from the GE images. We first transformed the land-cover maps of the YNNR and DMNNR in 2014 into binary maps containing two classes (S. alterniflora and non-S. alterniflora). The land-cover maps were generated based on a Landsat-8 Operational Land Imager (OLI) image with a 30 m resolution, and both had overall accuracies (OA) exceeding 95% [4]. Then, 2000 and 1000 random points for the YNNR and DMNNR, respectively, were generated from the binary maps with a minimum distance constraint of 100 m to ensure that individual pixel locations were sampled only once. Each random point was then inspected using the GE images, and the points without clear land-cover information were excluded. The sample points were cross-checked by at least one other interpreter. Finally, a total of 1721 samples (1213 samples for the YNNR and 508 samples for the DMNNR), including 650 S. alterniflora and 1071 non-S. alterniflora samples, were used to form our reference dataset.

3. Methodology

3.1. Remote Sensing Image Preprocessing

The RS images were preprocessed in Environment for Visualizing Images (ENVI); the preprocessing comprised radiometric correction, atmospheric correction, orthorectification, and accurate geographic registration. The atmospheric correction for GF-1 WFV was performed using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) software package in ENVI 5.3. For the geographic correction of the GF-1 WFV images, we used one 15-m Landsat-8 OLI panchromatic image as a reference. The correction error for each GF-1 WFV image was kept within 0.5 pixels. Several vegetation indices (VIs), including NDVI, Enhanced Vegetation Index (EVI), Ratio Vegetation Index (RVI), Difference Vegetation Index (DVI), and Soil-adjusted Vegetation Index (SAVI), were generated from each single-date WFV image (Table 2). Additionally, a principal component analysis (PCA) was conducted on the monthly GF-1 WFV data. The first three principal components of the PCA (PC1, PC2, and PC3) were selected as predictive variables for our model.
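To make the index calculation concrete, the following is a minimal per-pixel sketch in Python. It uses the common textbook definitions of NDVI, RVI, DVI, and SAVI; these are assumptions here and may differ in detail from the formulas in Table 2. The function name `vegetation_indices`, the soil adjustment factor of 0.5, and the reflectance values are illustrative only.

```python
def vegetation_indices(red, nir, soil_factor=0.5):
    """Return common vegetation indices for one pixel of surface reflectance.

    red, nir: reflectance in the Red (B3) and NIR (B4) bands.
    soil_factor: SAVI soil adjustment factor (0.5 is a common default,
    assumed here rather than taken from the article).
    """
    total = nir + red
    return {
        # Normalized difference of NIR and Red, guarded against division by zero.
        "NDVI": (nir - red) / total if total != 0 else 0.0,
        # Simple band ratio.
        "RVI": nir / red if red != 0 else float("inf"),
        # Simple band difference.
        "DVI": nir - red,
        # NDVI-like index with a soil brightness correction term.
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }


# Hypothetical reflectances for a vegetated pixel.
pixel = vegetation_indices(red=0.08, nir=0.40)
print({name: round(value, 3) for name, value in pixel.items()})
```

Applying such a function to each monthly image would give one NDVI layer per month, which is the kind of input the NDVI time-series analysis builds on.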

3.2. One-Class Classification Algorithms

3.2.1. Individual Models

The BSVM algorithm was implemented with the oneClass package in R statistical software [26]. BSVM, a variant of the binary SVM, uses unlabeled samples as negative training samples [18]. As the unlabeled class also contains samples from the positive class, two cost terms, Cp and Cu, are used to penalize misclassification and margin errors of the positive and unlabeled samples differently. We chose the radial basis function (RBF) kernel to build the models, which requires three parameters to be optimized: the inverse kernel width, γ, and the two penalty parameters, Cp and Cu [18]. In our experiment, we set the candidate values of γ, Cp, and Cu as follows: γ = 0.1, 1,…, 10, Cu = 0.1, 1.1,…, 7.1, Cp = {Cu × (1, 5, 10, 20, 100, 200, 500)}. Without negative samples, the F score cannot be calculated as a performance measure. Instead, we chose the commonly used Fpu as the model selection criterion. Detailed descriptions of Fpu can be found in Reference [44]. “Optimal” values for the tuning parameters were selected by a grid search using five repetitions of ten-fold cross-validation on the training dataset. We randomly selected 100 positive samples and 5000 unlabeled samples to train the BSVM. Since BSVM outputs a continuous value for each classified pixel, a threshold is required to convert the output values into binary classes. We used the threshold that maximizes the sum of specificity and sensitivity (MaxSSS) to classify the output.
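The MaxSSS rule can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `max_sss_threshold` and the scores are hypothetical, and the label 0 is assumed to stand for whatever plays the role of the negative class (in a positive-unlabeled setting, the unlabeled samples).

```python
def max_sss_threshold(scores, labels):
    """Return (threshold, sensitivity + specificity) for the best cutoff.

    scores: continuous classifier outputs, one per sample.
    labels: 1 for positive samples, 0 for samples treated as negative.
    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_sum = None, -1.0
    # Every distinct score is a candidate cutoff.
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        current = true_pos / positives + true_neg / negatives
        # Strict comparison keeps the lowest cutoff when several tie.
        if current > best_sum:
            best_threshold, best_sum = threshold, current
    return best_threshold, best_sum


# Hypothetical scores: four positives followed by four (assumed) negatives.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, best_sum = max_sss_threshold(scores, labels)
print(threshold, best_sum)
```

Keeping the lowest cutoff on ties favors sensitivity; a different tie-breaking rule would be equally defensible.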
MaxEnt is particularly known for modeling potential species distributions based on environmental parameters [45]. It separates the target species from the background by applying a maximum entropy approach, which compares probability densities [28]. As a general-purpose machine learning method, it has proven to be suitable to solve different one-class classification problems based on RS data in recent years [4,24,25,33]. The MaxEnt algorithm was implemented by using the dismo R package [46]. The same training dataset used for BSVM (i.e., 100 positive samples and 5000 unlabeled samples) was used for MaxEnt. We used default parameters to implement MaxEnt because it has been shown that MaxEnt with the default sets performs similarly to that using “optimal” parameters obtained by the grid search method [4,33]. The MaxSSS was used to transfer the output into binary classes.
The PUL algorithm transforms the traditional binary classifier to one-class classifier based on the Bayes principle [47]. Traditional binary classifiers are inefficient to model p(y = 1|x) due to the absence of reliable negative samples [48]. To address this problem, Elkan and Noto proposed to train a classifier p(s = 1|x) alternatively [47]. The target p(y = 1|x) can be then transformed by the equation: p(y = 1|x) = p(s = 1|x)/c. Here, y = 1 denotes positive pixels, and y = 0 denotes negative pixels, while s = 1 denotes labeled pixels, s = 0 denotes unlabeled pixels, and x denotes the covariates associated with a pixel [48]. If a binary classifier is trained by using labeled and unlabeled data, the model p(s = 1|x) can be presented to estimate the probability of a pixel x being labeled [20]. Therefore, we can obtain p(y = 1|x) if the factor c is successfully estimated. In practice, a reliable way to estimate c is to average the predicted probabilities of multiple positive pixels [29].
$$c = \frac{1}{n} \sum_{x \in V} p(s = 1 \mid x)$$
where V is a subset of the training (or validation) set that contains only the labeled (i.e., positive) pixels, and n is the cardinality of dataset V.
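As a worked illustration of the two relations above, the sketch below estimates c from the predicted probabilities of labeled positive pixels and then rescales p(s = 1|x) into p(y = 1|x). The function names and probability values are hypothetical, and clipping the result at 1 is an added assumption, since the ratio can exceed 1 for individual pixels.

```python
def estimate_c(labeled_positive_probs):
    """Average p(s = 1 | x) over the labeled positive pixels in V."""
    return sum(labeled_positive_probs) / len(labeled_positive_probs)


def positive_probability(p_labeled, c):
    """Convert p(s = 1 | x) into p(y = 1 | x) = p(s = 1 | x) / c.

    The result is clipped at 1.0 (an assumption of this sketch).
    """
    return min(p_labeled / c, 1.0)


# Hypothetical classifier outputs p(s = 1 | x) for three labeled positives.
c = estimate_c([0.6, 0.5, 0.7])

# A pixel with p(s = 1 | x) = 0.3 is rescaled to roughly 0.5.
print(round(c, 3), round(positive_probability(0.3, c), 3))
```

Because c is at most 1, the rescaling always raises the probability, which compensates for positives hidden among the unlabeled pixels.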
It is worth mentioning that PUL is not a specific classifier, but rather a general framework that can be implemented by any classifier able to correctly predict the conditional probability [20]. More details about the PUL algorithm can be found in Reference [48]. Deep learning, as a new branch of machine learning, has been successfully applied in RS classification in recent years [7]. A growing number of studies have reported achieving competitive classification results at moderate spatial resolution with deep learning [7,49]. Therefore, we used a multi-layered feedforward deep neural network (DNN), trained with stochastic gradient descent, using back-propagation. We named this PUL method a positive and unlabeled deep neural network (PUDNN).
For the PUDNN, we used the rectified linear unit (ReLU) activation function with a softmax output classification function. To improve stability for ReLU, we constrained the squared sum of incoming weights per unit to ten. For the number of hidden layers in the PUDNN, values from two to six were tested, and the number of neurons in each hidden layer was set to 200. We also tested input layer dropout ratios of 0, 0.05, 0.1, and 0.2. The relative tolerance for the metric-based stopping criterion was set to 0.01, and the maximum duty cycle fraction for scoring was set to 0.025. Parameters not mentioned here were left at their default values. The optimal parameter combination for the PUDNN was determined by a grid search with ten-fold cross-validation on the training dataset. Regarding the training data, we found that using 5000 random unlabeled samples resulted in PUDNN models with very high computational costs. Consequently, we used 300 unlabeled samples randomly drawn from the 5000 unlabeled samples, together with the same 100 positive samples used for the MaxEnt and BSVM. The constant c can be estimated by using the labeled positive data either from the training set or from an independent validation set, with both methods producing similar results [29]. To avoid reducing the sample size, the constant c was calculated here by averaging the predicted values of the positive data in the training set. The MaxSSS was then calculated to transform the outputs into binary results. The PUDNN was implemented with the h2o package.

3.2.2. Ensemble Model

The ensemble model was implemented by integrating the three individual OCC algorithms, including MaxEnt, BSVM, and PUDNN, based on a weighted voting scheme. First, the three OCC algorithms were carried out to produce different outputs. The threshold MaxSSS for each algorithm was then calculated to transform the different outputs into binary results. Next, the true skill statistic (TSS) was applied to evaluate the accuracies of the binary results. Here, TSS was calculated from the positive and unlabeled data in the training set according to the following equation:
TSS = Specificity + Sensitivity − 1
where sensitivity is the true positive rate (also called recall in other fields) and specificity is the true negative rate. The TSS values vary from −1 to 1, where negative and close-to-zero values indicate models are not different from randomly generated models, while values close to 1 indicate good models, and values above 0.5 are assumed as suitable models [50,51]. Thus, it was used as the weight for each classifier. Then, the binary outputs of the individual OCC approaches were combined by using the weighted vote approach:
$$\mathrm{EOCC} = \frac{\sum_{i=1}^{n} \mathrm{TSS}_i \times y_i}{\sum_{i=1}^{n} \mathrm{TSS}_i}$$
where TSSi and yi refer to the TSS value and predicted class (i.e., one for positive class and zero for negative class) of the model i, respectively, and n refers to the number of classifiers. The outputs of the EOCC approach fell into the range of 0 to 1. The output values above 0.5 were classified as the positive class.
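The weighted vote can be sketched in a few lines of Python. The TSS weights below are hypothetical values chosen for illustration, not results from this study, and the function name `eocc_vote` is an assumption. The sketch also assumes that all weights are positive.

```python
def eocc_vote(tss_weights, predictions, cutoff=0.5):
    """Combine binary predictions with a TSS-weighted vote.

    tss_weights: TSS value of each individual classifier (assumed positive).
    predictions: predicted class of each classifier (1 positive, 0 negative).
    Returns the ensemble score in [0, 1] and the final binary class.
    """
    weighted_sum = sum(w * y for w, y in zip(tss_weights, predictions))
    score = weighted_sum / sum(tss_weights)
    # Scores strictly above the cutoff are assigned to the positive class.
    return score, int(score > cutoff)


# Hypothetical TSS weights for MaxEnt, BSVM, and PUDNN.
weights = [0.80, 0.70, 0.60]

# MaxEnt and PUDNN vote positive, BSVM votes negative.
score_a, class_a = eocc_vote(weights, [1, 0, 1])
# Only BSVM votes positive.
score_b, class_b = eocc_vote(weights, [0, 1, 0])
print(round(score_a, 3), class_a, round(score_b, 3), class_b)
```

With three classifiers of similar skill, this behaves much like a majority vote; the weights matter most when one classifier is clearly stronger than the others.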

3.3. Standard Supervised Classification Method

To systematically investigate the ability of the proposed EOCC algorithm to identify S. alterniflora, we compared its performance with that of a state-of-the-art supervised classifier, the SVM, which is used extensively in RS analysis [20]. To ensure a comparable accuracy assessment, we used the same 100 positive samples as for the OCC methods, along with an additional 110 true negative samples, to train the SVM. The optimal parameters for the SVM were determined by a grid search, using five repetitions of ten-fold cross-validation. Two parameters were involved in the training of the classifier: the RBF kernel parameter γ and the penalty parameter C. The parameter grid for the SVM was defined by γ ∈ {2^−11, 2^−9, …, 2^3} and C ∈ {2^−3, 2^1, …, 2^13}. The SVM was implemented with the caret package [52] in R statistical software.

3.4. Feature Selection

The random forest recursive feature elimination (RF-RFE) method was used to select the important features for the models. RF-RFE is a wrapper-based feature-ranking algorithm that searches the feature space for an optimal subset [11]. The algorithm starts with all candidate variables in the model and recursively eliminates one insignificant variable at a time until only one remains as input. In each iteration, the feature with the smallest ranking score is removed. The model is then rebuilt with the retained features, and the model accuracy is recalculated. In this way, the algorithm identifies the optimal subset of discriminatory features [53]. In this study, five repetitions of ten-fold cross-validation were used within the algorithm to obtain more reliable evaluations of model performance during the selection process. We implemented RF-RFE using the caret package in R statistical software [52].
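The elimination loop described above can be outlined generically in Python. This is a simplified sketch, not the caret implementation: `rank_fn` stands in for the random forest importance ranking, `score_fn` stands in for the cross-validated accuracy, and the feature names and toy importance values are hypothetical.

```python
def recursive_feature_elimination(features, rank_fn, score_fn):
    """Backward elimination that drops the lowest-ranked feature each round.

    rank_fn(subset) -> dict mapping each feature to an importance score.
    score_fn(subset) -> model performance for that subset (higher is better).
    Returns the best-scoring subset and its score.
    """
    remaining = list(features)
    history = []
    while remaining:
        # Evaluate the model rebuilt on the currently retained features.
        history.append((score_fn(remaining), list(remaining)))
        if len(remaining) == 1:
            break
        importance = rank_fn(remaining)
        weakest = min(remaining, key=lambda name: importance[name])
        remaining.remove(weakest)
    best_score, best_subset = max(history, key=lambda item: item[0])
    return best_subset, best_score


# Hypothetical importance values for four monthly NDVI layers.
TOY_IMPORTANCE = {"NDVI_Nov": 0.9, "NDVI_May": 0.7, "NDVI_Jan": 0.2, "NDVI_Aug": 0.1}


def toy_rank(subset):
    return {name: TOY_IMPORTANCE[name] for name in subset}


def toy_score(subset):
    # Toy performance: useful signal minus a fixed penalty per feature.
    return sum(TOY_IMPORTANCE[name] for name in subset) - 0.15 * len(subset)


best_subset, best_score = recursive_feature_elimination(
    list(TOY_IMPORTANCE), toy_rank, toy_score
)
print(best_subset, round(best_score, 2))
```

In practice both callbacks would wrap a trained model, and the score would be averaged over the repeated cross-validation folds.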

3.5. Two Scenarios for S. alterniflora Detection

3.5.1. Monthly SSA for S. alterniflora Detection (Scenario 1)

We evaluated the performance of 12 monthly SSAs in identifying S. alterniflora by using four OCC algorithms (i.e., EOCC, MaxEnt, BSVM, and PUDNN), along with a standard supervised classification method (i.e., SVM), based on 12 single-date GF-1 WFV images. In addition to the original four GF-1 WFV bands, we also used five VIs and three PCs from the PCA for each SSA, because the VIs and PCs were found to improve model performance in S. alterniflora detection (see Appendix A Figure A1 for details). Variable selection for each SSA was conducted by using the RF-RFE algorithm (see Supplementary Materials Figure S1 for more details). With a complete analysis of 12 months, the discrimination abilities of the different algorithms, as well as the suitable phenological windows for S. alterniflora detection, were investigated. The optimal SSA was selected to provide a baseline for comparison with the subsequent NDVI-TSA.

3.5.2. NDVI-TSA for S. alterniflora Detection (Scenario 2)

In practice, a priori information on the crucial detection period for an invasive species is often lacking. To avoid this limitation and broaden the applicability of our proposed scheme, we used a complete monthly NDVI time-series dataset representing 12 consecutive months to conduct the NDVI-TSA. Because S. alterniflora grows in intertidal zones and is often affected by tidal fluctuations, including more multi-temporal images could introduce more interference and thereby decrease model performance [9,54]. Hence, we attempted to compress the original NDVI-TSA by using fewer, but more important, NDVI images and analyzed the corresponding classification results. The RF-RFE algorithm was used to select the best feature subset to construct the optimal NDVI-TSA. We named the optimal NDVI-TSA the phenology-based NDVI-TSA (PB-NDVI-TSA). In addition, in the coastal zone that S. alterniflora occupies, frequent cloud coverage often reduces the number of available images. To address this problem, we also evaluated the performance of a second PB-NDVI-TSA that uses only the top three features. Consequently, there were three NDVI-TSA variants (the complete 12-month series and the two PB-NDVI-TSAs), each assessed with the four OCC algorithms and the SVM.

3.6. Transferability Analysis of the EOCC Algorithm Combined with PB-NDVI-TSA for S. alterniflora Detection

To verify whether the proposed detection scheme is transferable and robust, we directly transferred the “optimal” model to the second region without new training data input. The second PB-NDVI-TSA in Scenario 2, the PB-NDVI-TSA using the top three NDVIs, was selected for the transferability analysis. It should be noted that we only tested the transferability of the second PB-NDVI-TSA because it is often a challenge to obtain enough cloud-free and high-quality images in a year with one satellite sensor for practical applications in areas like Jiangsu coastal wetlands. Clouds are present frequently from June to October over the whole Jiangsu coast, especially during hot and rainy summers. We aimed to develop a detection scheme that could be easily implemented and extended to other regions. To assess the advantage of phenological time-series, the best SSA was also considered for comparison. Moreover, to further demonstrate the applicability of our proposed detection scheme, we compared its performance with the SVM in the transferability analysis.

3.7. Accuracy Assessment

To ensure a comparable accuracy assessment, we used an independent validation dataset to evaluate the performance of the OCC and fully supervised classification approaches in Scenarios 1 and 2. A confusion matrix was constructed by using the validation data and the outputs from each classification method. Several accuracy indices, including specificity, sensitivity, OA, Kappa statistic, and TSS, were calculated from the confusion matrix according to the following equations:
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Kappa} = \frac{(TP + TN) - \left[(TP + FN)(TP + FP) + (FN + TN)(FP + TN)\right] / N}{N - \left[(TP + FN)(TP + FP) + (FN + TN)(FP + TN)\right] / N}$$
TSS = Specificity + Sensitivity − 1
where TP denotes true positives (i.e., the number of S. alterniflora pixels correctly detected), FP denotes false positives (i.e., the number of non-S. alterniflora pixels incorrectly detected as S. alterniflora), TN and FN denote true negatives and false negatives, respectively, and N is the size of the dataset. For the YNNR, 343 positive samples and 660 negative samples were used as the independent validation dataset, while 207 positive samples and 301 negative samples were used to evaluate the classification accuracy of the models for the DMNNR.
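The five indices follow directly from the four confusion-matrix counts, as the Python sketch below shows. The counts in the example are hypothetical and chosen only to give round numbers; they are not results from this study, and the function name `accuracy_metrics` is an assumption.

```python
def accuracy_metrics(tp, fp, tn, fn):
    """Compute specificity, sensitivity, OA, Kappa, and TSS from counts."""
    n = tp + fp + tn + fn
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    overall_accuracy = (tp + tn) / n
    # Agreement expected by chance, expressed as a number of samples.
    chance = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / n
    kappa = ((tp + tn) - chance) / (n - chance)
    tss = specificity + sensitivity - 1
    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "OA": overall_accuracy,
        "Kappa": kappa,
        "TSS": tss,
    }


# Hypothetical balanced confusion matrix with 100 validation samples.
metrics = accuracy_metrics(tp=40, fp=10, tn=40, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

For a balanced dataset such as this one, Kappa and TSS coincide; they diverge as the class proportions become more unequal.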

4. Results

4.1. Crucial Variables, Months, and Phenological Phases for S. alterniflora Detection

Table 3 shows the classification accuracies of the presented methods, using 12 variables and the optimal variable subsets selected by the RF-RFE method. Compared with the results of different classification methods using complete variables (i.e., 12 variables), no significant improvements for all classifiers were found after implementing feature selection. Compared with the results from other OCC methods, the PUDNN presented more pronounced accuracy changes after using the optimal variables subsets. As shown in Table 4, for each SSA, the three most important variables consisted of at least one PC or VI, demonstrating the importance of PCs and VIs for S. alterniflora detection. The optimal variable sizes varied from three to twelve among different months. It was important to note that the NDVI was selected as an important variable for ten months, except for July and September, suggesting its high importance and potential for S. alterniflora detection. In the senescence stage during November and December, the VIs showed higher rankings than other variables.
As expected, the performance of the five methods (four OCCs and the SVM) for S. alterniflora detection was strongly dependent upon the phenological window selection of the GF-1 WFV image. The best performance appeared in November, during which the classification accuracy was promising with a high average Kappa of 0.812 and an average OA of 0.918. Meanwhile, August showed the worst result with an average Kappa and OA of 0.377 and 0.714, respectively. The desired classification results were obtained based on the images acquired in November, December, April, and May (Figure 2). The mean Kappa and OA for these months were greater than the average levels of 0.658 and 0.848, respectively. Comparatively, in the dormancy period during January and February, the rapid growth season during July and August, and the early flowering stage during September, the classification performance of the OCC and the SVM algorithms was much poorer than in the senescence and green-up stages. The mean OA and Kappa for these months were less than the average level.
Figure 3 shows the changes in Kappa and OA with an increasing number of variables (in decreasing order of importance) using the RF-RFE algorithm on NDVI-TSA (12 NDVIs). When the six least important NDVI indices were excluded, the highest classification accuracy for identifying S. alterniflora was achieved (Figure 3). The NDVIs from November, December, May, June, April, and January were selected as the best feature subset. As expected, all these months showed relatively high performance in S. alterniflora recognition, indicating that the NDVI-TSA with the RF-RFE algorithm is suitable for selecting the crucial phenological stages for S. alterniflora detection. Among the selected six NDVIs, the NDVIs for November, December, and May were ranked as the top three predictors, further confirming these months as important phenological windows for S. alterniflora recognition. Moreover, it was evident that following the inclusion of the first three variables, the Kappa and OA values showed no significant change. Therefore, we also investigated the performance of the EOCC algorithm by using only these three NDVI indices in the subsequent exploration.
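The elimination loop behind this kind of analysis can be sketched in a few lines. The code below is a generic recursive feature elimination skeleton, not the RF-RFE implementation used in the study: `importance_fn` is a hypothetical stand-in for a random forest importance ranking, `score_fn` is a hypothetical stand-in for a validation metric such as Kappa, and all numbers in the toy example are made up for illustration.

```python
def recursive_feature_elimination(features, importance_fn, score_fn):
    """Drop the least important feature one at a time and score every subset.

    importance_fn(subset) -> dict mapping each feature to an importance value
    score_fn(subset)      -> float, higher is better (e.g., Kappa)
    Returns (best_subset, best_score, history).
    """
    remaining = list(features)
    history = []
    while remaining:
        history.append((list(remaining), score_fn(remaining)))
        if len(remaining) == 1:
            break
        importances = importance_fn(remaining)
        weakest = min(remaining, key=lambda f: importances[f])
        remaining.remove(weakest)
    # Prefer the highest score; on ties, prefer the smaller subset
    best_subset, best_score = max(history, key=lambda item: (item[1], -len(item[0])))
    return best_subset, best_score, history


# Toy, made-up importances for monthly NDVI layers (illustration only)
TOY_IMPORTANCE = {"Nov": 0.30, "Dec": 0.25, "May": 0.20, "Jun": 0.10,
                  "Apr": 0.08, "Jan": 0.05, "Aug": 0.02}


def toy_importance(subset):
    return {f: TOY_IMPORTANCE[f] for f in subset}


def toy_score(subset):
    # Made-up score: rewards informative layers, penalizes each extra layer
    return sum(TOY_IMPORTANCE[f] for f in subset) - 0.03 * len(subset)


best, score, history = recursive_feature_elimination(
    list(TOY_IMPORTANCE), toy_importance, toy_score)
print(best, round(score, 3))
```

In practice the importance ranking would be recomputed by refitting the model on each reduced subset, which is what makes the procedure recursive rather than a single ranking pass.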

4.2. Algorithms Comparison

Table 5 shows the classification accuracies of OCC classifiers and a fully supervised classifier in Scenarios 1 and 2. All OCC algorithms performed well and similarly for S. alterniflora detection generally. Despite returning the highest mean sensitivity value of all individual OCC methods, MaxEnt had the lowest mean specificity value. BSVM had the highest mean OA, Kappa, TSS, and sensitivity values; however, it also yielded the lowest specificity value of all individual OCC classifiers. Compared to the single OCC algorithms, the EOCC produced a very similar mean sensitivity value, but a slightly higher mean specificity value; meanwhile, it also had a slightly higher mean OA of 86.79%, mean Kappa of 0.698, and mean TSS of 0.676, indicating the EOCC generated slightly better and more balanced results. Moreover, the EOCC obtained the closest classification scores to the fully supervised classification method. Although EOCC’s mean sensitivity and TSS values were lower, its mean OA, Kappa, and specificity values were higher than the SVM. The EOCC also had a slightly lower standard deviation of the accuracies than that of the SVM in Scenarios 1 and 2.

4.3. Performance of the EOCC Algorithm in NDVI-TSA for S. alterniflora Detection

As the baseline for benchmarking the NDVI-TSA, the EOCC algorithm in the optimal time window (i.e., November) yielded an OA, a Kappa, and a TSS of 91.92%, 0.813, and 0.743 (Table 6). Compared to the best SSA, NDVI-TSA achieved slightly lower OA and Kappa values when it used all NDVI variables (i.e., 12 NDVIs). However, it produced a more balanced sensitivity–specificity result. By implementing the feature selection, the classification accuracies of NDVI-TSA were improved and higher than that of the best SSA. The PB-NDVI-TSA using the selected six variables produced the best results with the highest OA, Kappa, TSS, and sensitivity values. Although the EOCC in the best SSA yielded the highest specificity value, it produced the lowest sensitivity value. The sensitivity was significantly improved in the two phenological NDVI-TSA, which indicated that the PB-NDVI-TSA can greatly reduce the omission rate for S. alterniflora detection. Compared to the optimal PB-NDVI-TSA, the second PB-NDVI-TSA using the top three variables produced slightly lower classification accuracy. However, it still outperformed the NDVI-TSA (i.e., using 12 NDVIs) and the best SSA. These results indicate that phenological NDVI time-series analysis can improve the performance of the EOCC algorithm for S. alterniflora detection.
Based on visual inspection, the classification maps generated using the EOCC algorithm in both the best SSA and the NDVI-TSA were desirable. These maps matched with the land-cover map well. However, in the best SSA map, a few false negative pixels can be easily found inside S. alterniflora patches (Figure 4b). In the ecotone between the invasive plant and native salt marsh plants, where S. alterniflora has relatively low biomass and coverage density, S. alterniflora was underestimated by the best SSA during the senescence stage. The best SSA underestimated more S. alterniflora patches than the NDVI-TSA. The false negative pixels were significantly reduced in the maps from three NDVI-TSA models (Figure 4c–e). Compared to the NDVI-TSA using 12 NDVIs, the prediction of S. alterniflora was improved in the two PB-NDVI-TSA models (Figure 4c,d). Moreover, the false positive pixels located in the culture pond and farmland were also notably reduced in the two PB-NDVI-TSA models.

4.4. Transferability Analysis of the EOCC Algorithm in the PB-NDVI-TSA

Since the desired S. alterniflora detection result was obtained by our scheme using the EOCC algorithm in the second PB-NDVI-TSA, we attempted to apply this in a different area without new training data input to verify its applicability and transferability. To demonstrate the usefulness of the phenological NDVI-TSA, we also compared its performance with the optimal SSA in the transferability analysis. As shown in Table 7, by using the EOCC algorithm, the classification accuracies for both SSA and PB-NDVI-TSA were promising with high Kappa (0.789 in the SSA, and 0.815 in the PB-NDVI-TSA), OA (89.57% in the SSA, and 90.94% in the PB-NDVI-TSA), and TSS (0.809 in the SSA, and 0.825 in the PB-NDVI-TSA) values. However, the performances of the SVM in the two schemes (i.e., the best SSA and the PB-NDVI-TSA) were poorer and varied greatly, which indicates that the EOCC has a stronger and more stable performance in the two schemes. Although the EOCC yielded slightly lower sensitivity values in both schemes, it can produce significantly higher specificity values, which demonstrates that the EOCC also has a more balanced performance than the SVM in the transferability analysis. Compared to results in the Best SSA, both SVM and EOCC produced more balanced specificity–sensitivity accuracies in the PB-NDVI-TSA.
Based on visual examination, we found that the SVM in the SSA and PB-NDVI-TSA generally produced far more false positive pixels than the EOCC did (Figure 5). Overestimation of S. alterniflora was quite a serious problem for the SVM. As expected, the classification results for both SVM and EOCC in the PB-NDVI-TSA were slightly improved compared to the SSA results, with the false positive pixels reduced. The EOCC in PB-NDVI-TSA generated the best classification map, which was highly consistent with the manually delineated classification map (Figure 5d). Surprisingly, only a few misclassification errors were found in the best classification map, and most of the tidal channels within S. alterniflora patches were excluded (Figure 5e).

5. Discussion

5.1. Ensemble Analysis for OCC Methods

Depending on the input training data, OCC methods can be divided into P-classifiers (trained only with positive (P) data) and PU-classifiers (trained with positive data and additional unlabeled (U) data). Prominent examples of the first category are OCSVM and SVDD, while MaxEnt, BSVM, and PUL belong to the second group. Benefiting from the additional information in unlabeled data, a PU-classifier usually yields higher classification accuracy than a P-classifier [18]. We, therefore, only considered the PU-classifiers for the ensemble analysis in our study. Our results imply that all the individual PU-classifiers (i.e., the MaxEnt, BSVM, and PUL) perform well, returning OA, Kappa, and TSS values similar to those from the standard supervised SVM. However, their sensitivity values were lower than that of the SVM. Overall, the SVM produced more balanced and slightly better accuracy than the individual OCC algorithms. For the EOCC method, most accuracy metrics improved slightly compared to the single OCCs, showing a very similar classification result to the SVM. Although maps for the distribution of S. alterniflora created with traditional supervised classification methods are promising, the associated time and costs reduce their viability, especially for application in large areas and across multiple years. Comparatively, the EOCC method uses positive-only labeled data and performs well in recognizing S. alterniflora, greatly reducing the requirement for training data collection. In light of its strong performance, low cost, manageable labor intensity, and convenient operation, we offer the EOCC algorithm as a viable alternative to traditional supervised classification methods for invasive plant detection.
To integrate the outputs from different PU-classifiers, we used a weighted vote scheme, which has been shown to outperform simple majority vote and average ways in ensemble OCC applications [20]. Our result confirmed that higher accuracies can be achieved using the weighted vote approach (Appendix A Table A1). Despite the EOCC method not always delivering the best result of all the OCC methods in all scenarios, it did produce slightly more stable and balanced results. For example, overall, the BSVM yielded the best classification accuracy across all individual OCC methods. It also produced the lowest OA, Kappa, and TSS values in December for Scenario 1 and in NDVI12 for Scenario 2 (Appendix A Table A2). However, these discrepancies did not occur in the EOCC. This indicates that our ensemble scheme may feasibly reduce the predictive uncertainties of single OCC methods and offer improvements for classification accuracies. However, compared to the individual OCCs, only a limited improvement for the EOCC was presented in terms of the accuracy metrics. To further improve the detection performance of the ensemble analysis, one feasible way is to use other combination approaches (e.g., Bayesian average and fuzzy integral) that have shown acceptable performance in multi-class classification problems [55]. Besides, compared to the supervised classifier SVM, although returning higher specificity values, all the OCCs showed lower sensitivity values for Scenarios 1 and 2. As we focus on detecting a potentially invasive species to the environment, it may make more sense to minimize the number of false negatives (pixels that are actually S. alterniflora, but are classified as negatives). This would ensure that most individuals of invasive species are detected and potential countermeasures against further spreading can be efficiently implemented [30]. 
Therefore, one potential improvement for the EOCC might be to adjust the classification thresholds for each individual OCC to minimize the number of false negatives.
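To make the weighted vote and the threshold adjustment concrete, the following sketch combines per-classifier scores for one pixel into a single decision. It is only an illustration under stated assumptions: the weights, the scores, and the function name are hypothetical, and the weighting rule actually used in this study is not reproduced here.

```python
def weighted_vote(scores, weights, threshold=0.5):
    """Combine positive-class scores (0..1) from several classifiers.

    scores  -> dict mapping classifier name to its score for one pixel
    weights -> dict mapping classifier name to a non-negative weight
    Returns (ensemble_score, is_positive).
    """
    total_weight = sum(weights[name] for name in scores)
    ensemble_score = sum(weights[name] * scores[name] for name in scores) / total_weight
    return ensemble_score, ensemble_score >= threshold


# Hypothetical scores for one pixel and hypothetical weights (illustration only)
pixel_scores = {"MaxEnt": 0.60, "BSVM": 0.30, "PUL": 0.45}
classifier_weights = {"MaxEnt": 0.3, "BSVM": 0.4, "PUL": 0.3}

score, label_default = weighted_vote(pixel_scores, classifier_weights)
_, label_lenient = weighted_vote(pixel_scores, classifier_weights, threshold=0.4)
print(round(score, 3), label_default, label_lenient)
```

Lowering the threshold turns borderline pixels such as this one into detections, which trades some specificity for a lower omission rate. A simple majority vote on the same hard labels would reject the pixel, since only one of the three classifiers scores it above 0.5.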

5.2. Advantages of Phenological NDVI Time-Series for Mapping Invasive Plant S. alterniflora

Numerous studies have demonstrated that vegetation indices, particularly the commonly used NDVI, can help improve invasive species detection accuracy [6,9,11,12,24]. Our results confirmed that NDVI is of high importance for separating S. alterniflora from native P. australis and S. salsa. NDVI was deemed a crucial variable for most SSAs in our study. A recent study found that monthly NDVI time-series can achieve better classification results than the multi-temporal imagery composite and the best SSA for the classification of three salt marsh plants (S. alterniflora, P. australis, and Scirpus mariqueter) [8]. In our study, compared to the best SSA, the three NDVI-TSAs showed a better sensitivity–specificity balance, indicating that NDVI-TSAs are more efficient for detecting the invasive plant with a lower omission rate. Due to the phenological lag of S. alterniflora, most native salt marsh plants in the study area are completely withered or dormant by late November, while S. alterniflora can still maintain many green vegetative parts during this senescence stage [4]. This phenological difference makes S. alterniflora easily distinguishable from the native plants [8]. However, some mixed pixels of S. alterniflora are extremely difficult to distinguish from mudflats [7]. This explains why we obtained very high specificity but relatively low sensitivity values in the best SSA in YNNR. Based on the visual inspection, we found that many pixels of S. alterniflora located in the high tide area closest to the sea were falsely classified as the negative class. In the green period, it becomes easy to distinguish S. alterniflora from the mudflats [56]. Additionally, in late April and May, most S. alterniflora plants are just starting to germinate or grow with low biomass, while the co-dominant native marshes (e.g., P. australis) have already been growing for approximately one month [8,9], so that S. alterniflora shows a different spectral characteristic from the native plants. However, in the ecotone of S. alterniflora and S. salsa, we found many false positive pixels (pixels that are actually S. salsa, but classified as S. alterniflora) during the visual inspection. Our field surveys showed that the density of S. salsa was much lower than that of S. alterniflora in the ecotone. The scattered distribution makes it difficult for the corresponding pixels (16 m × 16 m) to represent the true spectral response of S. salsa because of the effects of both the canopy and the underlying soil [9]. Although S. salsa has been growing for approximately one month by late May, the mixed-pixel phenomenon leads to its lower NDVI values. This phenomenon also results in S. alterniflora and S. salsa showing a similar spectral trait in the ecotone. Therefore, the SSA method may be inadequate to realize the full discriminative potential. NDVI-TSA, which combines the green and senescence periods, can make full use of different phenological characteristics, thus improving the spectral separability in S. alterniflora detection.
However, although the NDVI-TSA using 12 NDVIs showed a slightly better specificity–sensitivity balance, it did not show many advantages in terms of Kappa, OA, and TSS values compared to the best SSA. Due to the differences in illumination and atmospheric conditions across different image acquisition dates, unclear observations (e.g., clouds, cloud shadows, and inherent noise) may increase as more images are included in the NDVI time-series imagery [9]. Even though each image was rigorously filtered (e.g., cloud cover less than 10%) and preprocessed (e.g., atmospheric correction and cloud mask), these unclear observations inevitably exist in the dataset and may influence the classification accuracy. The phenomenon of species succession also hinders accurate S. alterniflora detection. The boundaries and ecotone between S. alterniflora and native salt marsh plants are areas of high misclassification probability [4,9]. Therefore, it is critically important to pick out a few images that best reflect the differences between S. alterniflora and native plants and thereby reduce the span of the time-series, which helps improve the accuracy in unstable regions [9]. The PB-NDVI-TSA method could reduce the aforementioned uncertainties, as fewer but more important images are considered. Compared to the complete NDVI-TSA (i.e., models using 12 NDVIs), our results show that the classification accuracies are improved by using the PB-NDVI-TSA methods. More importantly, our second PB-NDVI-TSA model, which includes NDVI images from only three key months (May, November, and December), shows only a very small reduction in classification accuracy compared to the optimal PB-NDVI-TSA, while still yielding higher and more balanced accuracy than any SSA and the complete NDVI-TSA.
We concluded that an improved mapping result can be achieved during May, November, and December. Coincidentally, the NDVI of November, December, and May were selected as the top three variables for the NDVI-TSA with an RF-RFE algorithm. This result suggests that the NDVI-TSA can be effective in identifying critical phenological events using the RF-RFE algorithm, regardless of the availability of a priori phenological information on S. alterniflora. Moreover, our results also demonstrate the strong performance of the classification algorithms in NDVI-TSA. Hence, if a priori information on the phenological characteristics of a particular invasive plant is limited, the quantification of intra-annual phenology from NDVI time-series may offer a detailed perspective on seasonal variability of the plant’s phenology [13].

5.3. Transferability of the EOCC Combined with PB-NDVI-TSA

For operational applications, an approach must deliver comparable results that extend beyond a specific area [57]. However, except for the PUDNN, all the individual OCCs and the SVM performed relatively poorly in the DMNNR (Appendix A Table A3). Although the PUDNN generally performed well in the transferability analysis, it also produced the worst results in the YNNR. The EOCC method performed consistently well and showed very robust transferability and stability in a second test site, without a second training. Compared to the individual OCCs, although the EOCC showed no significant improvement in classification accuracy in the YNNR, a remarkable improvement was realized in the DMNNR. This is in agreement with previous research indicating that ensemble models outperform individual models, providing more robust and stable predictions in transferability analysis [34,57]. Our results also showed that the EOCC outperforms the SVM when being transferred to a second area. A recent study demonstrated that an OCC method outperforms a fully supervised method when the training set is incomplete (i.e., one of the non-target land types is not sampled) [29]. OCC methods are less likely to suffer from the problem of incomplete training data because only the target class needs to be labeled [29]. Compared to the SVM, which used only 110 negative samples, the EOCC, which used 5000 random unlabeled samples, is more likely to sample all land-cover types. Therefore, the SVM might be more susceptible to incomplete datasets, especially in extended areas without new reference data input.
Compared to the best SSA, our results show that improved performances for both SVM and EOCC can be achieved by combining with the PB-TSA method. A large number of studies have indicated that extended observations offered by phenological time-series enable a classifier to show a stronger detection performance than the SSA methods in a certain area [8,9,11,13]. However, there is a lack of studies assessing their performances when being applied in different geographic spaces. Our results somewhat provide evidence that phenological time-series-based methods could exhibit a more stable and strong transferability than the best SSA method across different geographic regions. Since the timing of leaf senescence for the invasive species varies over space and time, the performance of the best SSA method may vary widely among geographic regions. Moreover, it will be difficult to repeatedly determine the optimal phenological stage for detecting S. alterniflora across locations without expert knowledge. Hence, the best SSA method may lack the generalization capability to conduct the long-term regional-scale monitoring of S. alterniflora [11].
The promising results indicate our detection scheme’s robustness and potential for larger-scale applications, which is crucially needed for S. alterniflora study and management. However, we only transferred the detection scheme to the DMNNR, which has a similar species composition to the YNNR. Therefore, to further verify the stability and generalizability of the proposed detection scheme, more tests need to be conducted across a broader range of regions in future studies. Although our proposed method is designed for S. alterniflora detection, it should also be applicable to other invasive plants. For example, P. australis is deemed an invasive species and is widely distributed in New England, North America, and other countries [9]. Its extensive and rapid expansion threatens the native Spartina spp. Thus, it will be interesting to apply our proposed method to identify P. australis if this invasive species displays phenological differences from Spartina spp. in these regions.

5.4. Methodological Considerations

A large number of studies have demonstrated that PCs and VIs can increase the potential to discriminate between vegetation species [4]. Our results confirmed that the PCs and VIs are useful for S. alterniflora detection. However, more input variables also introduce noise into the datasets and correlations between variables, which may reduce model stability and interpretability [58]. Therefore, it is critical to find a parsimonious feature subset balancing the prediction accuracy and model interpretability [59]. As one of the most popular variable selection approaches, the RF-RFE method was used to find optimal variable subsets in our study. Although no significantly improved accuracies were found after using the RF-RFE method, smaller feature subset sizes reduced the computational demand. The best accuracy was achieved in November, which was much higher than that of the other SSAs. Since only three variables were retained for the best SSA after implementing a feature selection using RF-RFE, we did not further evaluate the impact of multicollinearity on model performance. Studies have pointed out that indices developed from remote sensing datasets often show collinearity and may contribute to overprediction [13]. We found that different VIs of most SSAs showed high collinearity (r > 0.85), which may affect the model performance. Therefore, addressing collinearity to formulate an optimal feature subset can be critical in future studies.
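One simple way to address collinearity, sketched below, is to walk through the variables in order of importance and keep a variable only if its absolute Pearson correlation with every variable already kept stays at or below a cutoff. This is a hedged illustration rather than a procedure from this study: the function names, the toy values, and the reuse of 0.85 as the cutoff are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_collinear(columns, ranked_names, r_max=0.85):
    """Keep variables in importance order, skipping those that are too
    strongly correlated (|r| > r_max) with any variable already kept."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson_r(columns[name], columns[k])) <= r_max for k in kept):
            kept.append(name)
    return kept


# Toy, made-up sample values for three variables (illustration only)
toy_columns = {
    "NDVI": [0.10, 0.30, 0.50, 0.70, 0.90],
    "SAVI": [0.15, 0.32, 0.48, 0.71, 0.88],   # nearly a copy of NDVI
    "PC2":  [0.50, -0.20, 0.40, -0.30, 0.10],
}
print(drop_collinear(toy_columns, ["NDVI", "SAVI", "PC2"]))
```

Processing the variables in importance order means that, within each correlated group, the most informative member is the one retained.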
To ensure a comparable accuracy assessment, we used 1511 independent positive and negative samples to validate the models’ results. However, we did not use these samples during the processes of model training and decision-making (determination of parameters and thresholds). This means we can certainly reduce the number of validation samples to minimize the cost. In an operational setting, when validation samples are limited, OCC results with only a few observations of the species of interest might also serve as an initial map product for directing the fieldwork. The results could be used to locate areas with a high occurrence probability of the species and hence increase the efficiency in subsequent field campaigns to collect presence data [30]. Our field vegetation plots were sampled in July 2014 and September 2015. The time mismatch between the two years’ samples may affect the discrimination results. To reduce this uncertainty, we used GE imagery to inspect the samples and removed the samples of 2014 that were collected in the ecotone between vegetation types and in the high tide zone. Unlike the functional properties of vegetation (e.g., biomass and nitrogen content), the structural composition of vegetation usually evolves more slowly (over ±2 years) [60]. Thus, the reference samples are reliable. To ensure a well-stratified reference dataset and reduce the influence of the time mismatch between field sampling and RS data, we supplemented the samples from GE imagery. As a freely available data source, it offers advantages for discerning small, newly-colonizing S. alterniflora clumps [3]. The GE imagery with sufficient spatial information could be an appropriate tool for the collection of training samples [61,62]. However, the classification accuracy of time-series models is highly correlated with the stability of the plants during the study period. Annual variation of the species niche can impede the accurate mapping of salt marsh plants, especially in the ecotones and boundaries of different salt marsh communities, which experience rapid vegetation succession [9]. Therefore, in future studies, it is necessary to collect high-quality reference data that cover several months for these sensitive regions.
The coastal zones of Jiangsu where S. alterniflora has established are inevitably affected by significant cloud coverage [8], which makes it difficult to acquire sufficient monthly cloud-free images over a year with one satellite sensor. Hence, we attempted to reduce the temporal dimension of the NDVI time-series dataset, using NDVIs from key phenological stages. The results indicated that the PB-NDVI-TSA with only three monthly NDVIs can obtain very high accuracy for S. alterniflora detection. Thus, pinpointing several months that capture the key phenological differences between plants could be a viable way to address the limitations on RS data acquisition [4,8]. However, to construct a well-refined phenological time-series model, the data gap in crucial phenological periods is another huge challenge, especially for long-term mapping tasks. With the availability of more satellite data, the combination of multi-source data, such as the ESA’s Sentinel-2A and B constellation, Landsat, and the recently available GaoFen-6 WFV, could be an alternative way to resolve this data gap [4].
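The reduced feature vector used by a three-month PB-NDVI-TSA can be sketched as follows, using the standard NDVI definition, (NIR − Red) / (NIR + Red). The function names, the per-pixel input layout, the reflectance values, and the choice to return 0.0 when both bands are zero are assumptions made for this illustration only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denominator = nir + red
    # Assumption: treat a zero denominator (no signal) as NDVI = 0.0
    return (nir - red) / denominator if denominator != 0 else 0.0


def phenology_features(reflectance_by_month, key_months=("May", "Nov", "Dec")):
    """Build one pixel's feature vector from the NDVI of the key months only.

    reflectance_by_month -> dict mapping month name to a (nir, red) tuple
    """
    return [ndvi(*reflectance_by_month[month]) for month in key_months]


# Made-up reflectance values for one pixel (illustration only)
pixel = {
    "May": (0.40, 0.10),
    "Aug": (0.55, 0.05),   # present in the series but not used
    "Nov": (0.30, 0.15),
    "Dec": (0.25, 0.15),
}
print(phenology_features(pixel))
```

Months outside `key_months` are simply ignored, so a cloudy scene in a non-critical month does not prevent the pixel from being classified.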

6. Conclusions

In this study, we developed a novel detection scheme that employs an EOCC algorithm incorporated with a phenological NDVI-TSA method to detect the invasive species S. alterniflora based on GF-1 WFV monthly time-series data in the core zones of YNNR and DMNNR. We tested the performance of the EOCC in two scenarios, namely SSA and NDVI-TSA. Within the proposed scenarios, the crucial phenological stages and the advantage of NDVI-TSA in S. alterniflora detection were also investigated. To further demonstrate the identification ability of the EOCC algorithm, we used a standard supervised classifier SVM as a baseline for comparison. Results showed that the EOCC produced slightly more stable accuracies than the individual OCC classifiers and yielded similar accuracies to the SVM in the YNNR. As for the EOCC in SSA, the best classification result was achieved in November. Compared to the best SSA, PB-NDVI-TSA appeared to produce a slightly higher classification accuracy, in which the Kappa and OA improved from 0.813 and 91.92% to 0.841 and 92.92%, respectively. The satisfactory result could also be transferred to the DMNNR area, where the accuracy of the S. alterniflora detection remained promising with an OA of 90.94% and a Kappa of 0.815. Moreover, our proposed detection scheme combining the EOCC algorithm with PB-TSA was more robust and applicable than other combinations (i.e., EOCC with the best SSA, SVM with PB-NDVI-TSA, or the best SSA) in the transferability analysis. This highlights that our proposed detection scheme is promising for S. alterniflora mapping over extended areas. We hope this study will provide a potential alternative to traditional methods for invasive-plant detection.

Supplementary Materials

The following are available online at https://www.mdpi.com/2072-4292/12/24/4010/s1. Figure S1. Results of feature selection for each SSA in Scenario 1 (Jan–Dec), using RF-RFE algorithm. Table S1. Results of the hyper-parameter optimization process showing the final values used in each algorithm.

Author Contributions

Conceptualization, X.L.; methodology, X.L.; software, X.L., P.D. and B.K.; validation, X.L., Y.Y.L. and J.F.; formal analysis, X.L.; investigation, X.L. and H.L.; resources, X.L., H.L. and B.K.; data curation, X.L.; writing—original draft preparation, X.L.; writing—review and editing, X.L., H.L., P.D. and J.F.; visualization, X.L.; supervision, H.L. and B.K.; project administration, H.L. and B.K.; funding acquisition, H.L., X.L. and B.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study is funded by the National Natural Science Foundation of China (Nos. 41971382 and 31470519), the Priority Academic Program Development of Jiangsu Higher Education Institutions (164320H116), and the China Scholarship Council (201806860050).

Acknowledgments

The authors would especially like to thank Nicole Still for the intensive proofreading as a native speaker and the numerous text improvements. We also acknowledge the support of the Chair of Remote Sensing and Landscape Information Systems, University of Freiburg. We thank Xiaojuan Xu, Haibo Gong, and Fusheng Jiao from Nanjing Normal University for their assistance in the field survey.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

S. alterniflora: Spartina alterniflora
P. australis: Phragmites australis
S. salsa: Suaeda salsa
RS: Remote sensing
GF-1 WFV: Gaofen-1 wide field of view
EOCC: Ensemble one-class classification
YNNR: Yancheng National Nature Reserve
DMNNR: Dafeng Milu National Nature Reserve
OCSVM: One-class support vector machines
SVDD: Support vector data description
MaxEnt: Maximum entropy
BSVM: Biased support vector machine
PUDNN: Positive and unlabeled deep neural network
PUL: Positive and unlabeled learning
SVM: Support vector machine
RF: Random forest
SSA: Single-scene analysis
NDVI-TSA: Normalized difference vegetation index time-series analysis
PB-NDVI-TSA: Phenology-based normalized difference vegetation index time-series analysis
GE: Google Earth
ENVI: Environment for visualizing images
FLAASH: Fast line-of-sight atmospheric analysis of spectral hypercubes
MaxSSS: Maximizing the sum of specificity and sensitivity
RFE: Recursive feature elimination

Appendix A

Figure A1. Changes in the Kappa values of different algorithms using different combinations of variables under the 12 monthly single-scene analyses. Bands include Blue, Green, Red, and NIR; VIs include NDVI, RVI, DVI, SAVI, and EVI; and PCA includes the first three principal components (PC1, PC2, and PC3).
Table A1. Mean and standard deviation of overall accuracy, Kappa coefficient, sensitivity, specificity, and TSS of the two ensemble methods (i.e., weighted vote and majority vote) in Scenarios 1 and 2. Values in parentheses are standard deviations. The best result of each accuracy metric is bolded.
Ensemble Methods | Kappa | OA (%) | Sensitivity | Specificity | TSS
Weighted vote | 0.698 (0.116) | 86.79 (5.17) | 0.745 (0.078) | 0.931 (0.048) | 0.676 (0.063)
Majority vote | 0.685 (0.123) | 85.99 (6.09) | 0.751 (0.068) | 0.916 (0.056) | 0.667 (0.124)
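The difference between the two ensemble rules compared in Table A1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three member probabilities, the weights, and the 0.5 threshold are hypothetical values chosen for illustration, and the source does not state how the weights of the weighted vote were derived.

```python
def majority_vote(labels):
    """Return 1 if more than half of the member classifiers predict 1.

    labels: list of 0/1 predictions for one pixel (ties resolve to 0).
    """
    return 1 if 2 * sum(labels) > len(labels) else 0


def weighted_vote(probabilities, weights, threshold=0.5):
    """Return 1 if the weight-averaged probability reaches the threshold.

    probabilities: member outputs in [0, 1] for one pixel.
    weights: one non-negative weight per member (hypothetical here,
             e.g., a per-member accuracy score).
    """
    score = sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)
    return 1 if score >= threshold else 0


# Hypothetical outputs of three one-class classifiers for a single pixel.
member_probabilities = [0.90, 0.40, 0.45]
member_weights = [0.8, 0.6, 0.6]  # assumed weights, not taken from the paper

# Hard labels for the majority vote, using the same 0.5 cut-off.
member_labels = [1 if p >= 0.5 else 0 for p in member_probabilities]

majority_result = majority_vote(member_labels)
weighted_result = weighted_vote(member_probabilities, member_weights)
```

In this example only one of three members votes positive, so the majority vote rejects the pixel, while the confident, highly weighted member pulls the weighted score to about 0.615 and the weighted vote accepts it.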
Table A2. Summary statistics for the OA and Kappa of OCC classifiers and the fully supervised classifier SVM in Scenarios 1 and 2. Values in the last row are the mean and standard deviation (SD) of OA and Kappa for the SVM and OCC classifiers in Scenarios 1 and 2. Values in parentheses are standard deviations. The best result for each scenario is bolded.
Scenarios | SVM Kappa | SVM OA (%) | EOCC Kappa | EOCC OA (%) | MaxEnt Kappa | MaxEnt OA (%) | PUDNN Kappa | PUDNN OA (%) | BSVM Kappa | BSVM OA (%)
Jan | 0.668 | 85.24 | 0.644 | 84.35 | 0.655 | 84.95 | 0.607 | 82.85 | 0.652 | 84.55
Feb | 0.666 | 85.04 | 0.647 | 84.55 | 0.638 | 84.05 | 0.622 | 83.65 | 0.655 | 84.55
Mar | 0.687 | 85.64 | 0.672 | 85.34 | 0.620 | 82.55 | 0.633 | 84.15 | 0.690 | 86.14
Apr | 0.729 | 87.74 | 0.750 | 88.93 | 0.750 | 88.93 | 0.712 | 87.14 | 0.755 | 89.33
May | 0.718 | 87.94 | 0.719 | 87.94 | 0.715 | 87.94 | 0.656 | 85.24 | 0.744 | 88.83
Jun | 0.680 | 85.04 | 0.736 | 88.73 | 0.700 | 87.44 | 0.672 | 85.54 | 0.723 | 88.24
Jul | 0.587 | 81.75 | 0.605 | 82.25 | 0.578 | 80.66 | 0.609 | 82.95 | 0.603 | 82.15
Aug | 0.359 | 68.79 | 0.394 | 73.38 | 0.373 | 68.99 | 0.364 | 72.78 | 0.394 | 73.38
Sep | 0.579 | 80.56 | 0.599 | 82.35 | 0.611 | 82.95 | 0.541 | 78.96 | 0.591 | 81.85
Oct | 0.646 | 83.85 | 0.650 | 85.34 | 0.647 | 85.24 | 0.632 | 84.55 | 0.660 | 85.54
Nov | 0.835 | 92.62 | 0.813 | 91.92 | 0.771 | 90.33 | 0.838 | 92.72 | 0.805 | 91.53
Dec | 0.798 | 90.93 | 0.799 | 91.23 | 0.791 | 90.83 | 0.836 | 92.72 | 0.640 | 85.24
NDVI3 | 0.815 | 91.72 | 0.824 | 92.22 | 0.814 | 91.82 | 0.804 | 91.23 | 0.820 | 92.02
NDVI6 | 0.846 | 93.12 | 0.841 | 92.92 | 0.838 | 92.82 | 0.825 | 92.12 | 0.841 | 92.92
NDVI12 | 0.858 | 93.62 | 0.778 | 90.43 | 0.773 | 90.23 | 0.811 | 91.72 | 0.769 | 90.13
Mean (SD) | 0.698 (0.130) | 86.24 (6.36) | 0.698 (0.116) | 86.79 (5.17) | 0.685 (0.118) | 85.98 (5.97) | 0.677 (0.132) | 85.89 (5.63) | 0.689 (0.113) | 86.43 (5.00)
Table A3. Average overall accuracy, Kappa coefficient, sensitivity, specificity, and TSS of the OCC methods and SVM in the transferability analysis. The best result of each accuracy metric is bolded.
Algorithms | Kappa | OA (%) | Sensitivity | Specificity | TSS
SVM | 0.621 | 80.22 | 0.995 | 0.669 | 0.664
EOCC | 0.802 | 90.26 | 0.940 | 0.877 | 0.817
MaxEnt | 0.337 | 66.34 | 0.727 | 0.620 | 0.347
PUDNN | 0.793 | 89.76 | 0.947 | 0.864 | 0.811
BSVM | 0.390 | 74.31 | 0.430 | 0.958 | 0.388

References

  1. Perrings, C.; Dehnen-Schmutz, K.; Touza, J.; Williamson, M. How to manage biological invasions under globalization. Trends Ecol. Evol. 2005, 20, 212–215.
  2. Mao, D.; Liu, M.; Wang, Z.; Li, L.; Man, W.; Jia, M.; Zhang, Y. Rapid Invasion of Spartina alterniflora in the coastal zone of mainland China: Spatiotemporal patterns and human prevention. Sensors 2019, 19, 2308.
  3. Liu, M.; Li, H.; Li, L.; Man, W.; Jia, M.; Wang, Z.; Lu, C. Monitoring the invasion of Spartina alterniflora using multi-source high-resolution imagery in the Zhangjiang estuary, China. Remote Sens. 2017, 9, 539.
  4. Liu, X.; Liu, H.; Gong, H.; Lin, Z.; Lv, S. Applying the one-class classification method of Maxent to detect an invasive plant Spartina alterniflora with time-series analysis. Remote Sens. 2017, 9, 1120.
  5. Liu, M.; Mao, D.; Wang, Z.; Li, L.; Man, W.; Jia, M.; Ren, C.; Zhang, Y. Rapid Invasion of Spartina alterniflora in the coastal zone of mainland China: New observations from Landsat OLI images. Remote Sens. 2018, 10, 1933.
  6. Zhang, X.; Xiao, X.; Wang, X.; Xu, X.; Chen, B.; Wang, J.; Ma, J.; Zhao, B.; Li, B. Quantifying expansion and removal of Spartina alterniflora on Chongming island, China, using time series Landsat images during 1995–2018. Remote Sens. Environ. 2020, 247, 111916.
  7. Tian, J.; Wang, L.; Yin, D.; Li, X.; Diao, C.; Gong, H.; Shi, C.; Menenti, M.; Ge, Y.; Nie, S.; et al. Development of spectral-phenological features for deep learning to understand Spartina alterniflora invasion. Remote Sens. Environ. 2020, 242, 111745.
  8. Ai, J.; Gao, W.; Gao, Z.; Shi, R.; Zhang, C. Phenology-based Spartina alterniflora mapping in coastal wetland of the Yangtze Estuary using time series of GaoFen satellite no. 1 wide field of view imagery. J. Appl. Remote Sens. 2017, 11, 026020.
  9. Sun, C.; Liu, Y.; Zhao, S.; Zhou, M.; Yang, Y.; Li, F. Classification mapping and species identification of salt marshes based on a short-time interval NDVI time-series from HJ-1 optical imagery. Int. J. Appl. Earth Obs. Geoinf. 2016, 45, 27–41.
  10. Tian, Y.; Jia, M.; Wang, Z.; Mao, D.; Du, B.; Wang, C. Monitoring Invasion Process of Spartina alterniflora by Seasonal Sentinel-2 Imagery and an Object-Based Random Forest Classification. Remote Sens. 2020, 12, 1383.
  11. Diao, C.; Wang, L. Incorporating plant phenological trajectory in exotic saltcedar detection with monthly time series of Landsat imagery. Remote Sens. Environ. 2016, 182, 60–71.
  12. Ji, W.; Wang, L. Phenology-guided saltcedar (Tamarix spp.) mapping using Landsat TM images in western U.S. Remote Sens. Environ. 2016, 173, 29–38.
  13. Singh, K.K.; Chen, Y.-H.; Smart, L.; Gray, J.; Meentemeyer, R.K. Intra-annual phenology for detecting understory plant invasion in urban forests. ISPRS J. Photogramm. Remote Sens. 2018, 142, 151–161.
  14. Jiang, W.; Yuan, L.; Wang, W.; Cao, R.; Zhang, Y.; Shen, W. Spatio-temporal analysis of vegetation variation in the Yellow River Basin. Ecol. Indic. 2015, 51, 117–126.
  15. Ng, W.-T.; Meroni, M.; Immitzer, M.; Böck, S.; Leonardi, U.; Rembold, F.; Gadain, H.; Atzberger, C. Mapping Prosopis spp. with Landsat 8 data in arid environments: Evaluating effectiveness of different methods and temporal imagery selection for Hargeisa, Somaliland. Int. J. Appl. Earth Obs. Geoinf. 2016, 53, 76–89.
  16. Shiferaw, H.; Bewket, W.; Eckert, S. Performances of machine learning algorithms for mapping fractional cover of an invasive plant species in a dryland ecosystem. Ecol. Evol. 2019, 9, 2562–2574.
  17. Cesar de Sa, N.; Carvalho, S.; Castro, P.; Marchante, E.; Marchante, H. Using Landsat Time Series to Understand How Management and Disturbances Influence the Expansion of an Invasive Tree. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3243–3253.
  18. Mack, B.; Roscher, R.; Stenzel, S.; Feilhauer, H.; Schmidtlein, S.; Waske, B. Mapping raised bogs with an iterative one-class classification approach. ISPRS J. Photogramm. Remote Sens. 2016, 120, 53–64.
  19. Xu, X.; Ji, X.; Jiang, J.; Yao, X.; Tian, Y.; Zhu, Y.; Cao, W.; Cao, Q.; Yang, H.; Shi, Z.; et al. Evaluation of one-class support vector classification for mapping the paddy rice planting area in Jiangsu Province of China from Landsat 8 OLI imagery. Remote Sens. 2018, 10, 546.
  20. Liu, R.; Li, W.; Liu, X.; Lu, X.; Li, T.; Guo, Q. An Ensemble of Classifiers Based on Positive and Unlabeled Data in One-Class Remote Sensing Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 572–584.
  21. Araya-López, R.A.; Lopatin, J.; Fassnacht, F.E.; Hernández, H.J. Monitoring Andean high altitude wetlands in central Chile with seasonal optical data: A comparison between Worldview-2 and Sentinel-2 imagery. ISPRS J. Photogramm. Remote Sens. 2018, 145, 213–224.
  22. Xu, X.; Liu, X.; Li, X.; Xin, Q.; Chen, Y.; Shi, Q.; Ai, B. Global snow cover estimation with Microwave Brightness Temperature measurements and one-class in situ observations. Remote Sens. Environ. 2016, 182, 227–251.
  23. Sanchez-Hernandez, C.; Boyd, D.S.; Foody, G.M. One-Class Classification for Mapping a Specific Land-Cover Class: SVDD Classification of Fenland. IEEE Trans. Geosci. Remote Sens. 2007, 45, 1061–1073.
  24. Evangelista, P.; Stohlgren, T.; Morisette, J.; Kumar, S. Mapping Invasive Tamarisk (Tamarix): A Comparison of Single-Scene and Time-Series Analyses of Remotely Sensed Data. Remote Sens. 2009, 1, 519.
  25. Li, W.; Guo, Q. A maximum entropy approach to one-class classification of remote sensing imagery. Int. J. Remote Sens. 2010, 31, 2227–2235.
  26. Mack, B.; Roscher, R.; Waske, B. Can I Trust My One-Class Classification? Remote Sens. 2014, 6, 8779.
  27. Stenzel, S.; Fassnacht, F.E.; Mack, B.; Schmidtlein, S. Identification of high nature value grassland with remote sensing and minimal field data. Ecol. Indic. 2017, 74, 28–38.
  28. Skowronek, S.; Asner, G.P.; Feilhauer, H. Performance of one-class classifiers for invasive species mapping using airborne imaging spectroscopy. Ecol. Inform. 2017, 37, 66–76.
  29. Deng, X.; Li, W.; Liu, X.; Guo, Q.; Newsam, S. One-class remote sensing classification: One-class vs. binary classifiers. Int. J. Remote Sens. 2018, 39, 1890–1910.
  30. Piiroinen, R.; Fassnacht, F.E.; Heiskanen, J.; Maeda, E.; Mack, B.; Pellikka, P. Invasive tree species detection in the Eastern Arc Mountains biodiversity hotspot using one class classification. Remote Sens. Environ. 2018, 218, 119–131.
  31. Kattenborn, T.; Lopatin, J.; Förster, M.; Braun, A.C.; Fassnacht, F.E. UAV data as alternative to field sampling to map woody invasive species based on combined Sentinel-1 and Sentinel-2 data. Remote Sens. Environ. 2019, 227, 61–73.
  32. Grenouillet, G.; Buisson, L.; Casajus, N.; Lek, S. Ensemble modelling of species distribution: The effects of geographical and environmental ranges. Ecography 2011, 34, 9–17.
  33. Mack, B.; Waske, B. In-depth comparisons of MaxEnt, biased SVM and one-class SVM for one-class classification of remote sensing data. Remote Sens. Lett. 2017, 8, 290–299.
  34. Forester, B.R.; DeChaine, E.G.; Bunn, A.G. Integrating ensemble species distribution modelling and statistical phylogeography to inform projections of climate change impacts on species distributions. Divers. Distrib. 2013, 19, 1480–1495.
  35. Woodman, S.M.; Forney, K.A.; Becker, E.A.; DeAngelis, M.L.; Hazen, E.L.; Palacios, D.M.; Redfern, J.V. esdm: A tool for creating and exploring ensembles of predictions from species distribution and abundance models. Methods Ecol. Evol. 2019, 10, 1923–1933.
  36. Yang, D.; Miao, X.-Y.; Wang, B.; Jiang, R.-P.; Wen, T.; Liu, M.-S.; Huang, C.; Xu, C. System-Specific Complex Interactions Shape Soil Organic Carbon Distribution in Coastal Salt Marshes. Int. J. Environ. Res. Public Health 2020, 17, 2037.
  37. Li, Z.; Shen, H.; Li, H.; Xia, G.; Gamba, P.; Zhang, L. Multi-feature combined cloud and cloud shadow detection in GaoFen-1 wide field of view imagery. Remote Sens. Environ. 2017, 191, 342–358.
  38. Wang, M.; Fei, X.; Zhang, Y.; Chen, Z.; Wang, X.; Tsou, J.Y.; Liu, D.; Lu, X. Assessing texture features to classify coastal wetland vegetation from high spatial resolution imagery using completed local binary patterns (CLBP). Remote Sens. 2018, 10, 778.
  39. Rouse, J.W.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the great plains with ERTS. In Proceedings of the Third ERTS Symposium, Washington, DC, USA, 10–14 December 1973; pp. 309–317.
  40. Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 127–150.
  41. Pearson; Miller. Remote mapping of standing crop biomass for estimation of the productivity of the shortgrass prairie, Pawnee National Grasslands. In Proceedings of the 8th International Symposium on Remote Sensing of the Environment, Colorado, CO, USA, 2 October 1972.
  42. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
  43. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  44. Liu, B.; Dai, Y.; Li, X.; Lee, W.S.; Yu, P.S. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, Melbourne, Australia, 19–22 November 2003; pp. 179–186.
  45. Phillips, S.J.; Anderson, R.P.; Schapire, R.E. Maximum entropy modeling of species geographic distributions. Ecol. Model. 2006, 190, 231–259.
  46. Phillips, S.J.; Anderson, R.P.; Dudík, M.; Schapire, R.E.; Blair, M.E. Opening the black box: An open-source release of Maxent. Ecography 2017, 40, 887–893.
  47. Elkan, C.; Noto, K. Learning classifiers from only positive and unlabeled data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining—KDD 08, Las Vegas, NV, USA, 14–18 August 2008; p. 213.
  48. Li, W.; Guo, Q.; Elkan, C. A Positive and Unlabeled Learning Algorithm for One-Class Classification of Remote-Sensing Data. IEEE Trans. Geosci. Remote Sens. 2011, 49, 717–725.
  49. Abdi, A.M. Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data. GIScience Remote Sens. 2019, 1–20.
  50. Allouche, O.; Tsoar, A.; Kadmon, R. Assessing the accuracy of species distribution models: Prevalence, kappa and the true skill statistic (TSS): Assessing the accuracy of distribution models. J. Appl. Ecol. 2006, 43, 1223–1232.
  51. Moraes, A.M.; Vancine, M.H.; Moraes, A.M.; de Oliveira Cordeiro, C.L.; Pinto, M.P.; Lima, A.A.; Culot, L.; Silva, T.S.F.; Collevatti, R.G.; Ribeiro, M.C.; et al. Predicting the potential hybridization zones between native and invasive marmosets within Neotropical biodiversity hotspots. Glob. Ecol. Conserv. 2019, 20, e00706.
  52. Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 2008, 28.
  53. Guyon, I.; Weston, J.; Barnhill, S. Gene Selection for Cancer Classification using Support Vector Machines. Mach. Learn. 2002, 46, 389–422.
  54. Feilhauer, H.; Thonfeld, F.; Faude, U.; He, K.S.; Rocchini, D.; Schmidtlein, S. Assessing floristic composition with multispectral sensors—A comparison based on monotemporal and multiseasonal field spectra. Int. J. Appl. Earth Obs. Geoinf. 2013, 21, 218–229.
  55. Du, P.; Xia, J.; Zhang, W.; Tan, K.; Liu, Y.; Liu, S. Multiple Classifier System for Remote Sensing Image Classification: A Review. Sensors 2012, 12, 4764.
  56. Ouyang, Z.-T.; Gao, Y.; Xie, X.; Guo, H.-Q.; Zhang, T.-T.; Zhao, B. Spectral Discrimination of the Invasive Plant Spartina alterniflora at Multiple Phenological Stages in a Saltmarsh Wetland. PLoS ONE 2013, 8, e67315.
  57. Löw, F.; Conrad, C.; Michel, U. Decision fusion and non-parametric classifiers for land use mapping using multi-temporal RapidEye data. ISPRS J. Photogramm. Remote Sens. 2015, 108, 191–204.
  58. Georganos, S.; Grippa, T.; Vanhuysse, S.; Lennert, M.; Shimoni, M.; Kalogirou, S.; Wolff, E. Less is more: Optimizing classification performance through feature selection in a very-high-resolution remote sensing object-based urban application. GIScience Remote Sens. 2018, 55, 221–242.
  59. Zhang, F.; Yang, X. Improving land cover classification in an urbanized coastal area by random forests: The role of variable selection. Remote Sens. Environ. 2020, 251, 112105.
  60. Rapinel, S.; Mony, C.; Lecoq, L.; Clément, B.; Thomas, A.; Hubert-Moy, L. Evaluation of Sentinel-2 time-series for mapping floodplain grassland plant communities. Remote Sens. Environ. 2019, 223, 115–129.
  61. DeVries, B.; Verbesselt, J.; Kooistra, L.; Herold, M. Robust monitoring of small-scale forest disturbances in a tropical montane forest using Landsat time series. Remote Sens. Environ. 2015, 161, 107–121.
  62. Suess, S.; van der Linden, S.; Okujeni, A.; Griffiths, P.; Leitão, P.J.; Schwieder, M.; Hostert, P. Characterizing 32 years of shrub cover dynamics in southern Portugal using annual Landsat composites and machine learning regression modeling. Remote Sens. Environ. 2018, 219, 353–364.
Figure 1. Location of the study area. (a) Overview of the study area in the Gaofen-1 wide-field-of-view (GF-1 WFV) RGB composite (R: 4, G: 3, B: 2) image acquired on 16 December 2015. (b) Location of the Yancheng National Natural Reserve (YNNR) with field plots and (c) location of the Dafeng Milu National Nature Reserve (DMNNR).
Figure 2. Summary statistics for the OA and Kappa coefficient of five classifiers, four OCCs, and the SVM, for different months.
Figure 3. Variability in overall accuracy and Kappa for NDVI time-series analysis (TSA) when using different numbers of NDVI variables. Whiskers indicate the standard deviation over five repetitions of 10-fold cross-validation. The green box indicates the best accuracy of the model, obtained with the selected optimal feature subset. The best accuracies for NDVI-TSA were produced when six variables were considered. In order of rank, these variables are NDVI_Nov, NDVI_Dec, NDVI_May, NDVI_Jun, NDVI_Apr, and NDVI_Jan.
Figure 4. Classification maps for the first study area. (a) Land-cover classification map generated with a visual interpretation method based on the GF-1 WFV image and Google Earth (GE) image. (b) Classification map using the EOCC algorithm in November. (c) Classification map using the EOCC algorithm in NDVI-TSA with the top three variables. (d) Classification map using the EOCC algorithm in NDVI-TSA with the optimal six variables. (e) Classification map using the EOCC algorithm in NDVI-TSA with 12 variables. The red circle shows false positive pixels (pixels that are not S. alterniflora but are classified as positives), and the yellow circle shows false negative pixels (pixels that are S. alterniflora but are classified as negatives).
Figure 5. Classification maps of S. alterniflora using the EOCC and SVM algorithms in the best SSA and the second PB-NDVI-TSA for the DMNNR. (a) RGB composite (R: NDVI_May, G: NDVI_November, B: NDVI_December) of NDVI images in the DMNNR. (b,c) Classification maps of S. alterniflora using the SVM and EOCC algorithms in the best SSA, respectively. (d) Classification map of S. alterniflora using a visual interpretation method based on the NDVI imagery. (f,e) Classification maps of S. alterniflora using the SVM and EOCC algorithms in the second PB-NDVI-TSA, respectively. The red circle shows false positive pixels (pixels that are not S. alterniflora but are classified as positives), and the yellow circle shows false negative pixels (pixels that are S. alterniflora but are classified as negatives).
Table 1. Detailed information on the GF-1 WFV data used in this study.
Sensor | Date | Center Longitude/Latitude | Sensor | Date | Center Longitude/Latitude
WFV2 | 9 January 2015 | E121.0/N32.6 | WFV3 | 13 July 2015 | E120.2/N33.9
WFV2 | 11 February 2015 | E120.1/N34.3 | WFV1 | 6 August 2015 | E119.5/N33.0
WFV3 | 12 March 2015 | E120.3/N33.9 | WFV2 | 3 September 2016 | E120.7/N34.3
WFV3 | 26 April 2015 | E120.9/N33.9 | WFV2 | 15 October 2015 | E121.2/N32.6
WFV1 | 20 May 2015 | E119.9/N33.0 | WFV2 | 29 November 2015 | E121.2/N32.6
WFV3 | 6 June 2015 | E120.9/N33.9 | WFV1 | 15 December 2015 | E119.9/N33.5
Table 2. Predictor variables derived from GF-1 WFV imagery and used as inputs in the algorithms for detection and mapping of S. alterniflora, including equations and references.
Variable Name | Equation | Citation
GF-1 WFV bands | Blue, Green, Red, NIR |
Normalized Difference Vegetation Index (NDVI) | (NIR − Red)/(NIR + Red) | [39]
Difference Vegetation Index (DVI) | NIR − Red | [40]
Ratio Vegetation Index (RVI) | NIR/Red | [41]
Enhanced Vegetation Index (EVI) | 2.5 × (NIR − Red)/(NIR + 6 × Red − 7.5 × Blue + 1) | [42]
Soil-Adjusted Vegetation Index (SAVI) | 1.5 × (NIR − Red)/(NIR + Red + 0.5) | [43]
Principal Component Analysis (PCA) | PC1, PC2, PC3 |
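As an illustration of how the vegetation indices in Table 2 are computed, the following Python sketch evaluates them for a single pixel. It is a minimal example, not the authors' processing chain: the function name and the reflectance values are hypothetical, the inputs are assumed to be surface reflectances between 0 and 1 with non-zero denominators, and the EVI line assumes the commonly published coefficients (gain 2.5, 6 for Red, 7.5 for Blue, and 1 as the canopy background term).

```python
def vegetation_indices(blue, red, nir):
    """Compute NDVI, DVI, RVI, EVI, and SAVI for one pixel.

    blue, red, nir: surface reflectance values (assumed in the 0-1 range).
    """
    return {
        "NDVI": (nir - red) / (nir + red),
        "DVI": nir - red,
        "RVI": nir / red,
        # Standard EVI coefficients (an assumption of this sketch).
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        # SAVI with soil adjustment factor L = 0.5, i.e., (1 + L) = 1.5.
        "SAVI": 1.5 * (nir - red) / (nir + red + 0.5),
    }


# Hypothetical reflectances for a densely vegetated pixel.
pixel = vegetation_indices(blue=0.04, red=0.05, nir=0.40)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

Applied band-wise to each monthly scene, the NDVI value of this function would give one layer of the 12-variable NDVI time series used in Scenario 2.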
Table 3. Classification accuracies of the support vector machine (SVM) and one-class classifier (OCC) methods, using twelve variables (All) and the optimal variable subsets selected by the random forest recursive feature elimination (RF-RFE) algorithm in Scenario 1. Values in parentheses are standard deviations. OA, overall accuracy; EOCC, ensemble OCC; MaxEnt, maximum entropy; PUDNN, positive and unlabeled deep neural network; BSVM, biased SVM; TSS, true skill statistic.
Accuracy Metrics | SVM, All | SVM, RF-RFE | EOCC, All | EOCC, RF-RFE | MaxEnt, All | MaxEnt, RF-RFE | PUDNN, All | PUDNN, RF-RFE | BSVM, All | BSVM, RF-RFE
OA (%) | 84.60 (6.05) | 84.91 (6.06) | 85.52 (4.99) | 85.56 (4.95) | 84.57 (5.85) | 84.80 (5.76) | 84.44 (5.37) | 81.00 (9.93) | 85.11 (4.69) | 85.11 (4.58)
Kappa | 0.663 (0.121) | 0.668 (0.121) | 0.669 (0.112) | 0.669 (0.111) | 0.654 (0.111) | 0.655 (0.111) | 0.643 (0.125) | 0.589 (0.187) | 0.659 (0.105) | 0.658 (0.104)
TSS | 0.665 (0.069) | 0.665 (0.073) | 0.646 (0.059) | 0.647 (0.059) | 0.637 (0.073) | 0.633 (0.071) | 0.627 (0.073) | 0.586 (0.124) | 0.639 (0.063) | 0.638 (0.067)
Sensitivity | 0.791 (0.063) | 0.780 (0.067) | 0.723 (0.067) | 0.722 (0.070) | 0.733 (0.052) | 0.717 (0.054) | 0.715 (0.104) | 0.739 (0.103) | 0.720 (0.076) | 0.712 (0.084)
Specificity | 0.874 (0.075) | 0.885 (0.078) | 0.923 (0.051) | 0.925 (0.047) | 0.904 (0.093) | 0.916 (0.088) | 0.912 (0.042) | 0.847 (0.144) | 0.919 (0.051) | 0.923 (0.049)
Table 4. Optimal variable sizes and subsets for the 12 single-scene analyses (SSAs) selected by using the RF-RFE algorithm. The top three variables and the NDVI variable are highlighted in different colors. The VIs are highlighted in green, the PCs are highlighted in yellow, and the original bands are highlighted in gray. Values in parentheses are the optimal variable sizes for each SSA. R, Red; G, Green; B, Blue; NIR, Near-Infrared band.
Rank | Jan (12) | Feb (12) | Mar (11) | Apr (8) | May (12) | Jun (10) | Jul (3) | Aug (12) | Sep (4) | Oct (6) | Nov (3) | Dec (12)
1 | B | PC3 | B | NIR | PC3 | NIR | PC2 | G | R | R | NDVI | NDVI
2 | G | G | G | PC1 | NDVI | PC1 | R | NDVI | PC3 | PC1 | RVI | RVI
3 | PC1 | B | EVI | PC2 | RVI | PC2 | NIR | RVI | PC1 | RVI | R | EVI
4 | R | EVI | NDVI | NDVI | NIR | B | \ | EVI | B | NDVI | \ | PC3
5 | NDVI | RVI | RVI | RVI | EVI | G | \ | DVI | \ | B | \ | R
6 | RVI | NDVI | PC3 | G | SAVI | R | \ | SAVI | \ | PC3 | \ | SAVI
7 | EVI | SAVI | PC1 | R | DVI | EVI | \ | PC3 | \ | \ | \ | DVI
8 | SAVI | DVI | SAVI | EVI | R | SAVI | \ | B | \ | \ | \ | G
9 | PC3 | PC1 | DVI | \ | G | NDVI | \ | PC1 | \ | \ | \ | PC2
10 | DVI | PC2 | PC2 | \ | PC1 | RVI | \ | NIR | \ | \ | \ | NIR
11 | NIR | R | NIR | \ | B | \ | \ | PC2 | \ | \ | \ | PC1
12 | PC2 | NIR | \ | \ | PC2 | \ | \ | R | \ | \ | \ | B
Table 5. Mean and standard deviation of OA, Kappa, sensitivity, specificity, and TSS of the SVM and OCC classifiers in Scenarios 1 and 2. Values in parentheses are standard deviations. The best result of each accuracy metric is bolded.
Accuracy Metrics | SVM (binary classifier) | EOCC (OCC) | MaxEnt (OCC) | PUDNN (OCC) | BSVM (OCC)
Kappa | 0.698 (0.130) | 0.698 (0.116) | 0.685 (0.118) | 0.677 (0.132) | 0.689 (0.113)
OA (%) | 86.24 (6.36) | 86.79 (5.17) | 85.98 (5.97) | 85.89 (5.63) | 86.43 (5.00)
Sensitivity | 0.811 (0.070) | 0.745 (0.078) | 0.750 (0.063) | 0.743 (0.110) | 0.740 (0.083)
Specificity | 0.889 (0.073) | 0.931 (0.048) | 0.917 (0.086) | 0.919 (0.040) | 0.929 (0.049)
TSS | 0.700 (0.072) | 0.676 (0.063) | 0.667 (0.075) | 0.662 (0.075) | 0.669 (0.132)
Table 6. Classification accuracies of the two scenarios, using the EOCC algorithm. The best result of each accuracy metric is shown in bold.
Scenarios | Input Predictors | Sensitivity | Specificity | OA (%) | Kappa | TSS
Best SSA | 12 variables (November) | 0.752 | 0.991 | 91.92 | 0.813 | 0.743
NDVI-TSA | 12 variables | 0.770 | 0.974 | 90.43 | 0.778 | 0.744
NDVI-TSA | Optimal 6 variables | 0.878 | 0.956 | 92.92 | 0.841 | 0.834
NDVI-TSA | Top 3 variables | 0.851 | 0.959 | 92.22 | 0.824 | 0.810
Table 7. Classification accuracies of the optimal SSA and the second phenology-based (PB) NDVI-TSA using the EOCC and SVM algorithms in the DMNNR. The best result of each accuracy metric for each scheme is bolded.
Schemes | Algorithms | Specificity | Sensitivity | OA (%) | Kappa | TSS
Best SSA | SVM | 0.575 | 0.995 | 74.61 | 0.520 | 0.570
Best SSA | EOCC | 0.857 | 0.952 | 89.57 | 0.789 | 0.809
PB-NDVI-TSA | SVM | 0.764 | 0.995 | 85.83 | 0.721 | 0.759
PB-NDVI-TSA | EOCC | 0.897 | 0.928 | 90.94 | 0.815 | 0.825
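The accuracy metrics reported in Tables 6 and 7 can all be derived from a binary confusion matrix. The Python sketch below shows the standard definitions of overall accuracy, Cohen's Kappa, sensitivity, specificity, and the true skill statistic (TSS = sensitivity + specificity − 1). The function name and the confusion-matrix counts are hypothetical and are not taken from the study's validation data.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy metrics from binary confusion-matrix counts.

    tp: positives classified as positive    fn: positives classified as negative
    fp: negatives classified as positive    tn: negatives classified as negative
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall_accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (overall_accuracy - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "OA": overall_accuracy,
        "kappa": kappa,
        "TSS": sensitivity + specificity - 1.0,
    }


# Hypothetical validation counts: 100 positive and 100 negative samples.
metrics = binary_metrics(tp=80, fn=20, fp=10, tn=90)

# Consistency check against Table 6 (optimal 6 variables):
# 0.878 + 0.956 - 1 = 0.834, the reported TSS.
tss_table6 = round(0.878 + 0.956 - 1.0, 3)
```

Because TSS depends only on sensitivity and specificity, it can be recomputed from any row of Tables 6 and 7 as a quick check on the reported values.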
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liu, X.; Liu, H.; Datta, P.; Frey, J.; Koch, B. Mapping an Invasive Plant Spartina alterniflora by Combining an Ensemble One-Class Classification Algorithm with a Phenological NDVI Time-Series Analysis Approach in Middle Coast of Jiangsu, China. Remote Sens. 2020, 12, 4010. https://doi.org/10.3390/rs12244010

