Article

Modelling the Spectral Uncertainty of Geographic Features in High-Resolution Remote Sensing Images: Semi-Supervising and Weighted Interval Type-2 Fuzzy C-Means Clustering

1 School of Geographic and Environmental Sciences, Tianjin Key Laboratory of Water Resources and Environment, Tianjin Normal University, Tianjin 300387, China
2 Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China
3 College of Architecture and Civil Engineering, Beijing University of Technology, Beijing 100124, China
* Author to whom correspondence should be addressed.
Remote Sens. 2019, 11(15), 1750; https://doi.org/10.3390/rs11151750
Submission received: 28 May 2019 / Revised: 18 July 2019 / Accepted: 22 July 2019 / Published: 25 July 2019
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

Spectral uncertainty refers to the diversity and variation of spectral characteristics within a single geographic object or across different objects of the same class. Existing methods usually represent spectral characteristics as precise single-valued curves, so spectral variation cannot be modeled, which restricts the analysis and classification of remote sensing images. Moreover, unsupervised methods perform poorly in classification and in modeling uncertainty, while supervised methods need a large number of high-quality samples. Fuzzy semi-supervised clustering (FSSC) methods achieve high accuracy with limited labelled samples and are therefore attracting increasing attention. This paper proposes a novel method to model the spectral uncertainty of very-high-resolution (VHR) images based on interval type-2 fuzzy sets (IT2 FSs), namely the hierarchical semi-supervising and weighted interval type-2 fuzzy c-means for objects (hierarchical SSW-IT2FCM-O) clustering method. In this method, the VHR image is segmented into image objects to reduce spectral uncertainty within objects. Spectral values, spectral indices and textures are weighted for object-based image classification. To further reduce spectral uncertainty across different objects of the same class, the spectral characteristics of land cover types are represented as banded curves with certain widths instead of precise single-valued spectral curves. The experimental results show that the banded spectral curves produced by the hierarchical SSW-IT2FCM-O can effectively model the spectral uncertainty of geographic objects. From the perspective of classification, four typical validity indices, along with the confusion matrix and kappa coefficient, were used to test the effectiveness of the hierarchical SSW-IT2FCM-O method. These indices show that the proposed method has greater classification accuracy than the existing FSSC methods and, more importantly, requires fewer training samples.

1. Introduction

Currently, determining land cover from remote sensing data is common and important. Because of the inherent fuzziness of geographical objects, and because the sensor, data acquisition, processing, conversion and transmission processes may produce or propagate errors, uncertainty exists widely in remote sensing data [1]. In the object-based image analysis of very-high-resolution (VHR) remote sensing images, this uncertainty appears in two ways. First, the spectral attributes of different pixels within a single object are often different. Second, the spectral characteristics of the same land cover type vary from object to object. Some studies have pointed out that the spectral curves of nearly all geographic objects are bands with certain ranges [1,2]. Figure 1 shows the difference between the single-valued spectrum and the banded spectrum of water, where the single-valued spectrum is the average spectrum on each band. For the banded spectrum, the lower and upper boundaries are the first and third quartiles of ground reflectance, so the spectral attribute is expressed as an interval number on each band. The single-valued spectrum does not contain the width information of the banded spectrum. As a result, the banded spectrum contains much more information than the single-valued curve and can express the uncertainty of the spectrum, whereas the single-valued curve cannot.
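The contrast between the two representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `banded_spectrum` and the reflectance values are hypothetical, and NumPy's default linear interpolation is assumed for the quartiles.

```python
import numpy as np

def banded_spectrum(reflectance):
    """Summarise one land cover type from an (n_objects, n_bands) array.

    Returns the single-valued curve (per-band mean) and the banded curve
    (per-band first and third quartiles).
    """
    reflectance = np.asarray(reflectance, dtype=float)
    mean = reflectance.mean(axis=0)                         # single-valued curve
    q1, q3 = np.percentile(reflectance, [25, 75], axis=0)   # banded curve
    return mean, q1, q3

# Made-up reflectance of four water objects in three bands (illustrative only).
water = [[0.05, 0.04, 0.02],
         [0.07, 0.05, 0.03],
         [0.06, 0.08, 0.02],
         [0.10, 0.05, 0.05]]
mean, q1, q3 = banded_spectrum(water)
for band, (m, lo, hi) in enumerate(zip(mean, q1, q3), start=1):
    print(f"band {band}: mean={m:.3f}, interval=[{lo:.3f}, {hi:.3f}]")
```

The mean alone says nothing about how far objects of the same class spread on a band, whereas the interval `[q1, q3]` keeps that width.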
Many studies aim to address the uncertainty of remote sensing data in the classification process. Among the unsupervised methods, fuzzy c-means (FCM) clustering [3] is the classical method and is based on type-1 fuzzy sets [4]. The spectral curves adopted by the FCM are still precise single-valued curves, which may deviate greatly from the real spectral curves of geographical objects. These factors lead to the low classification accuracy of FCM and its extended algorithms. Some researchers improved the FCM with interval type-2 fuzzy sets (IT2 FSs) to handle the uncertainty of the membership grade, as in interval type-2 fuzzy c-means (IT2FCM) [5] and interval-valued possibilistic fuzzy c-means (IPFCM) [6]. These methods use two fuzzifiers during the classification process, and the membership grade matrix of land cover types contains lower and upper membership values. However, the spectral curves used by these methods are still single-valued, so they still cannot effectively model the spectral uncertainty of geographic features.
Supervised methods can effectively simulate the spectral curves, but they require a large amount of sample data for training a stable and robust classifier [7,8]. Ma et al. [7] summarized that most supervised methods need 30–50% of the total data as training samples; for example, Wang et al. used approximately 30% of the total data to train the interval type-2 fuzzy set-based supervised classification method [9]. In practice, however, it is difficult to select large, high-quality samples, especially when users are unfamiliar with a study area. Labelled samples are often scarce because they are difficult or expensive to record, and sometimes the amount of data is too large to label completely [10]. Unlike supervised and unsupervised methods, fuzzy semi-supervised clustering (FSSC) methods can take advantage of limited labelled samples to improve the performance of unsupervised methods [11]. As a result, FSSC methods are well suited to remote sensing data classification, because the field investigation of a study area is usually limited. Recently, several extended FSSC methods have been developed, such as the semi-supervised fuzzy c-means (ssFCM) [12], the semi-supervised kernel-based fuzzy c-means (S2KFCM) [13], the semi-supervised fuzzy clustering algorithm with feature discrimination (SFFD) [10], the semi-supervising interval type-2 fuzzy c-means algorithm using spatial information (SIIT2-FCM) [14], the semi-supervised kernel fuzzy c-means in feature space (SKFCM-F) and the semi-supervised multiple kernel fuzzy c-means (SMKFCM) [15]. Lai and Garibaldi compared existing studies and concluded that the Mahalanobis distance (ssFCM) can provide slightly better performance than the Gaussian kernel-based distance (S2KFCM) [11]. Like the FCMs and IT2 FCMs, the FSSC methods still treat the spectral characteristics as a single-valued curve. Consequently, in our experiments, the above-mentioned unsupervised and semi-supervised methods had difficulty distinguishing water from shadows, and grass from forest.
In summary, the existing methods cannot effectively model the uncertainty of spectral curves. Therefore, how to model the spectral uncertainty of geographic objects and thereby improve classification performance is still an open problem. The objective of this study is to propose a hierarchical method to model spectral uncertainty in VHR images based on the semi-supervising and weighted interval type-2 fuzzy c-means for objects (SSW-IT2FCM-O). First, the VHR image is segmented into objects, and object-based features, including spectra, spectral indices (SIs) and textures, are extracted for classification. Second, the SSW-IT2FCM-O is used to classify the image objects. The proposed method has the following advantages: (1) The spectral curves of land cover types are bands with certain widths, which model the uncertainty of spectral characteristics. (2) The prior knowledge derived from labeled samples is of three types: the hierarchical organization of land cover types, the weights of bands and SIs, and the statistical information of the labeled samples. (3) A novel weighted distance is proposed to measure the distance between interval cluster centers and segmented objects. (4) The fuzzy membership degree matrix of the proposed method consists of lower and upper membership grade matrices, whereas that of existing FSSC methods is a single-valued matrix; thus, the proposed method can deal with the uncertainty of membership values. (5) The SSW-IT2FCM-O can effectively classify image objects and greatly reduce the salt-and-pepper noise that is widespread in the results of other FSSC methods, and it can effectively distinguish land cover types with similar spectral attributes. These advantages make the proposed method better than existing FSSC methods.
The paper is organized as follows. Section 2 reviews the concept of the IT2 FS, the weighted distance between interval number vectors, and the SIs and textures. Section 3 describes the main ideas behind the proposed algorithm. Section 4 presents an experiment that compares the proposed method with the online support vector machine (LaSVM) [16], a popular supervised method, and with other semi-supervised methods such as ssFCM, SKFCM-F and SIIT2-FCM. The confusion matrix and kappa coefficient are adopted to validate the classification results, and the spectral curves modeled by the proposed method and the effects of sample volume are discussed. Section 5 concludes the paper.

2. Prerequisites

Our study is based on the IT2 FS. In the proposed method, banded curves are expressed as interval number vectors, and each band has a different weight. Therefore, this section briefly introduces the IT2 FS and the weighted distance between interval number vectors. SIs are used to improve the classification performance in this study, so the selected SIs are introduced too.

2.1. IT2 FS

Fuzzy sets (i.e., type-1 fuzzy sets) have been applied to many domains because they can model fuzziness, as shown in Figure 2a. However, the membership function of a type-1 fuzzy set is single-valued and therefore cannot address the error in the membership value of fuzzy objects. This flaw can be overcome by type-2 fuzzy sets and type-N fuzzy sets [17]. A general type-2 fuzzy set is characterized by a non-interval secondary membership function, which is difficult to handle and makes computation extremely difficult. IT2 FSs are a special case of type-2 fuzzy sets in which the secondary membership grade is a constant equal to one [18,19,20]. As shown in Figure 2b, the uncertainty of the membership grade is expressed by the footprint of uncertainty (FOU), which is bounded by the lower membership function (LMF) and the upper membership function (UMF). Compared with the general type-2 fuzzy set, the IT2 FS needs much less computation, so it is popular in applications. In this study, the definition of the IT2 FS introduced by Mendel [20] is adopted.
Definition 1.
(IT2 FS) [20]: an IT2 FS $\tilde{A}$ on the universe $X \neq \emptyset$ is given by the following:
$$\tilde{A} = \{ ((x, u), \mu_{\tilde{A}}(x, u) = 1) \mid x \in X,\ u \in U \subseteq [0, 1] \}, \tag{1}$$
where $U$ is the universe of discourse for the secondary variable $u$. Since $U$ is a subset of [0, 1], for the sake of convenience the IT2 FS is represented as $\tilde{A}$ with $\tilde{A}(x) = [A^-(x), A^+(x)]$ $(0 \le A^-(x) \le A^+(x) \le 1)$, where $A^-(x)$ and $A^+(x)$ are the LMF and UMF of $\tilde{A}$, respectively.

2.2. Weighted Distance between Interval Number Vectors

The interval number can handle the uncertainty of data. An interval number can be expressed as $\bar{x} = [x^-, x^+]$ $(x^- \le x^+)$, where $x^-$ and $x^+$ are the infimum and supremum of $\bar{x}$, respectively, and the width of $\bar{x}$ is $x^+ - x^-$; the greater the width of $\bar{x}$, the greater the uncertainty in it. An interval number vector can be expressed as $\tilde{x} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$, where $n$ is the number of dimensions and all elements $\bar{x}_i$ $(1 \le i \le n)$ are interval numbers.
The weighted distance between interval number vectors is used to measure the distance between spectral attributes of segment features and banded curves of land cover types. Currently, there are multiple definitions of the distance between interval numbers. The Euclidean distance between interval numbers is commonly used, but this definition considers only endpoints of interval numbers [21]. The definition proposed by Li et al. [22] is more effective for remote sensing image classification [23] and adopted in this study.
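Let $\bar{a} = [a^-, a^+]$ and $\bar{b} = [b^-, b^+]$ be two interval numbers; then the interval distance between $\bar{a}$ and $\bar{b}$ is calculated as follows:
$$\tilde{d}(\bar{a}, \bar{b}) = \sqrt{\left(\frac{a^- + a^+}{2} - \frac{b^- + b^+}{2}\right)^2 + \frac{1}{3}\left[\left(\frac{a^+ - a^-}{2}\right)^2 + \left(\frac{b^+ - b^-}{2}\right)^2\right] - \frac{1}{6}\left[(\bar{a} \cap \bar{b})^+ - (\bar{a} \cap \bar{b})^-\right]^2}, \tag{2}$$
Then, for two interval number vectors $\tilde{x} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ and $\tilde{y} = (\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_n)$, the distance between them is expressed as follows:
$$\tilde{D}(\tilde{x}, \tilde{y}) = \left(\sum_{i=1}^{n} \tilde{d}^2(\bar{x}_i, \bar{y}_i)\right)^{1/2}, \tag{3}$$
Let $W = (w_1, w_2, \ldots, w_n)$ be the weight vector of the attributes of $\tilde{x}$ and $\tilde{y}$; then the weighted interval distance is expressed as:
$$w\tilde{D}(\tilde{x}, \tilde{y}) = \left(\sum_{i=1}^{n} w_i \times \tilde{d}^2(\bar{x}_i, \bar{y}_i)\right)^{1/2}. \tag{4}$$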
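The distances of Equations (2)–(4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the last term of Equation (2) is the squared width of the intersection of the two intervals (taken as zero when they do not overlap) and that the distance is the square root of the whole expression. The function names are hypothetical.

```python
import math

def interval_distance(a, b):
    """Distance between intervals a = (a_lo, a_hi) and b = (b_lo, b_hi), Eq. (2) as read above."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    midpoint_term = ((a_lo + a_hi) / 2 - (b_lo + b_hi) / 2) ** 2
    radius_term = (((a_hi - a_lo) / 2) ** 2 + ((b_hi - b_lo) / 2) ** 2) / 3
    # Assumption: width of the intersection, zero for disjoint intervals.
    overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    squared = midpoint_term + radius_term - overlap ** 2 / 6
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors

def weighted_interval_distance(x, y, weights):
    """Weighted distance between two interval number vectors, Eq. (4)."""
    return math.sqrt(sum(w * interval_distance(a, b) ** 2
                         for a, b, w in zip(x, y, weights)))

# Illustrative values only: two degenerate (zero-width) interval vectors.
x = [(0.0, 0.0), (0.0, 0.0)]
y = [(3.0, 3.0), (4.0, 4.0)]
print(weighted_interval_distance(x, y, [1.0, 1.0]))
```

With unit weights the weighted distance reduces to Equation (3), and under this reading an interval has zero distance to itself.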

2.3. Spectral Indices and Textures of Image Objects

In this study, SIs are used to improve the performance of modelling uncertainty and classification. Because different sensors have different band numbers and band widths, SIs may vary for different sensor data. In this study, we focus on the WorldView-2 sensor dataset (8 bands). Many SIs were developed for this kind of dataset [24,25,26], including the normalized difference vegetation index (NDVI) [27], NDVI6 [26], the soil adjusted vegetation index (SAVI) [28], the normalized difference water index (NDWI) [29], the forest and crop index (FCI) [30], the normalized difference bare soil index [30], the verified enhanced vegetation index (VEVI) [31], the WorldView non-homogeneous feature difference (NHFD) [32], the normalized difference soil index (NDSI) [33], the normalized differences built-up index (NDBI) [34], the shaded vegetation index (SVI) [35], and gray and average of green, yellow and red band (Table 1). Besides these SIs, the sum of all bands (SUM), brightness, and ratio of Nir2 are selected too. The chosen SIs are regarded as newly generated bands for classification. It is well known that a SI can provide useful information for a special land cover, but it may bring uncertainty for other land cover types, so two issues are encountered: which SIs should be selected and integrated into image clustering and how to calculate the weights for these SIs.
The gray level co-occurrence matrix (GLCM) [36] and its derived indicators, including the energy (ASM), contrast (CON), correlation (COR) and entropy (ENT), are useful for remote sensing image classification [37]. In this study, these four texture features, the homogeneity (HOMO) and the shape index are employed.

3. Methodology

The proposed hierarchical method is object based, and its process is described in Figure 3. In this section, we will illustrate this method in detail.

3.1. Preprocessing VHR Images

The VHR image (World View 2) contains one pan band with a spatial resolution of 0.5 m and eight multispectral bands with a spatial resolution of 2 m. The necessary preprocesses include radiometric and atmospheric correction and Gram–Schmidt sharpening for eight multispectral bands by ENVI software.

3.2. Segmentation

The sharpened image was segmented with the multi-resolution segmentation algorithm in the eCognition software (Version 8.64). The average spectral values of the pixels within a segmented object are defined as the spectral characteristics of this object, and the SIs in Table 1, the four texture features and the shape index are then calculated from the spectral characteristics of the segmented objects.

3.3. Sample Selection

As mentioned in Section 1, FSSC methods use limited samples for classification, rather than requiring a large number of samples in supervised methods. In FSSC methods, these limited samples are used to train the class centers repeatedly during the classification process. Several representative samples for each land cover type are necessary and samples should be uniformly distributed in the study area. In this study, we use the ArcGIS software to draw some sample regions of each land cover type, and then select the samples through the overlap operation between this layer and the segmented image object layer.

3.4. Hierarchical Classification

This step is the core in our study. For classical FCMs, IT2 FCMs and FSSC methods, all target land-cover types are recognized at the same level, which makes it difficult to distinguish the types with high spectral similarity, such as water bodies and shadows. From the cognitive point of view, multi-level or hierarchical methods are helpful for expressing and understanding geographical knowledge and widely used in VHR image classification [31,38]. These kinds of methods usually use different thresholds for different SIs to recognize different land-cover types in a crisp way. However, it is often difficult to find suitable threshold values. Unlike these strategies, our method uses a fuzzy method to distinguish land cover types hierarchically.

3.4.1. Hierarchical Organization of Land Cover Types

Our method organizes the land cover types (and thus the labeled samples) as a multi-level decision tree (Figure 4). In this tree, all non-leaf nodes are classification nodes, at which the proposed soft classification method is adopted to distinguish their child types. All child nodes in the decision tree are statistical nodes, for which the average and the first (Q1) and third (Q3) quartiles of the spectral, SI and texture information are calculated later.

3.4.2. Determining Subsets of Bands, SIs and Textures

For the object-based classification method, each image object contains three types of information: spectra, SIs and textures. The appropriate features are effective in distinguishing a land cover type from others; conversely, any inappropriate ones may have little effect for distinguishing it from others. As a result, we select the most useful bands, SIs and textures for each classification node by the feature selection method in the eCognition software.

3.4.3. Calculating the Weights

Once useful bands, SIs and textures have been selected, weights should be assigned to them. This is based on the hypothesis that if a land cover type can be distinguished from its nearest type, it can also be distinguished from the other types. In this step, we calculate the average and the first (Q1) and third (Q3) quartiles of the selected bands' spectra for each node from the labeled samples. Let $J$ be the number of selected bands $(1 \le j \le J)$ for the i-th land cover type, and let there be $L$ selected samples; then the average value of each band's spectrum is calculated as:
$$Mean_{ij} = \frac{\sum_{k=1}^{L} B_{jk} \times A_k}{\sum_{k=1}^{L} A_k}, \tag{5}$$
where $A_k$ is the area of the k-th feature, $B_{jk}$ is the spectrum of the j-th band of the k-th feature and $Mean_{ij}$ is the mean spectral value of the j-th band of the i-th land cover type. In the same way, the means of the selected spectral indices and textures of all land cover types are calculated. The initial spectral characteristic curve of the i-th land cover type can be expressed as $\{[Q1, Q3]_1, [Q1, Q3]_2, \ldots, [Q1, Q3]_J\}$. In Figure 3, each classification node contains several child nodes; the interval distance between any child and its sibling nodes can be calculated from the first and third quartiles by Equations (2) and (3), and the nearest sibling node is found by sorting these distances. If the t-th node is the nearest sibling of the i-th node, the weights of the selected bands $(w_{i1}, w_{i2}, \ldots, w_{iJ})$ can be determined as:
$$w_{ij} = \frac{\tilde{d}([Q1, Q3]_{ij}, [Q1, Q3]_{tj})}{\sum_{k=1}^{J} \tilde{d}([Q1, Q3]_{ik}, [Q1, Q3]_{tk})}, \quad 1 \le j \le J. \tag{6}$$
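The weighting step can be sketched as below. The sketch makes several assumptions that are not from the paper: all function names are hypothetical, the per-band interval distance is passed in as a callable, and a simple endpoint-based Euclidean distance is used as a stand-in for Equation (2) to keep the example short. The interval values are made up.

```python
import math

def area_weighted_mean(values, areas):
    """Eq. (5): mean of one band over the labeled objects, weighted by object area."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)

def endpoint_distance(a, b):
    """Stand-in interval distance (Euclidean distance between endpoints)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_sibling(node, siblings, distance):
    """Name of the sibling whose [Q1, Q3] curve is closest to `node`, following Eq. (3)."""
    def curve_distance(name):
        return math.sqrt(sum(distance(a, b) ** 2
                             for a, b in zip(node, siblings[name])))
    return min(siblings, key=curve_distance)

def band_weights(node, sibling, distance):
    """Eq. (6): each band's share of the total distance to the nearest sibling."""
    per_band = [distance(a, b) for a, b in zip(node, sibling)]
    total = sum(per_band)
    return [d / total for d in per_band]

# Made-up [Q1, Q3] intervals on two bands (illustrative only).
water = [(0.02, 0.04), (0.01, 0.03)]
siblings = {"shadow": [(0.03, 0.05), (0.05, 0.07)],
            "grass": [(0.05, 0.09), (0.30, 0.40)]}
nearest = nearest_sibling(water, siblings, endpoint_distance)
weights = band_weights(water, siblings[nearest], endpoint_distance)
print(nearest, weights)
```

Bands on which a class differs most from its nearest sibling receive the largest weights, which matches the hypothesis stated above.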

3.4.4. The Semi-Supervising and Weighted Interval Type-2 Fuzzy C-Means for Objects (SSW-IT2FCM-O)

The SSW-IT2FCM-O is executed repeatedly for each classification node in the decision tree. That is, the SSW-IT2FCM-O is first executed at the top node, and then a subset of bands, spectral indices and textures is built for each child classification node. If a child node has its own child classification nodes, the SSW-IT2FCM-O is executed at these nodes in turn. The final result is obtained by collecting the results of all classification nodes.
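The top-down execution over the decision tree can be sketched as a recursion. This is only a structural illustration: the tree layout, the function names and the `toy_classifier` are hypothetical, and the per-node classifier stands in for one run of the SSW-IT2FCM-O on the feature subset chosen for that node.

```python
def classify_hierarchically(objects, node, classify_node):
    """Assign a leaf label to every object by descending the decision tree.

    node: dict with a "name" and an optional list of "children".
    classify_node(objects, child_names): returns one child name per object
    (a placeholder for one clustering run at this classification node).
    Objects must be hashable so they can be used as dictionary keys.
    """
    children = node.get("children", [])
    if not children:                      # leaf: every object gets this label
        return {obj: node["name"] for obj in objects}
    names = [child["name"] for child in children]
    assigned = classify_node(objects, names)
    result = {}
    for child in children:
        subset = [o for o, label in zip(objects, assigned) if label == child["name"]]
        result.update(classify_hierarchically(subset, child, classify_node))
    return result

# Toy tree and toy classifier (illustrative only).
tree = {"name": "root", "children": [
    {"name": "dark objects", "children": [{"name": "water"}, {"name": "shadow"}]},
    {"name": "vegetation"},
]}

def toy_classifier(objects, child_names):
    # Stand-in rule: pick a child by object id, not a real clustering step.
    return [child_names[o % len(child_names)] for o in objects]

labels = classify_hierarchically([0, 1, 2, 3], tree, toy_classifier)
print(labels)
```

Each classification node only has to separate its own children, so a different feature subset and weight vector can be used at every level.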
For object-based classification methods, image objects have different sizes, whereas all pixels have the same area in existing pixel-based FSSC methods. Thus, pixel-based methods cannot be applied directly to image objects. To make the SSW-IT2FCM-O fit for image objects, the sizes of objects are used as weights in the objective function. Considering the uncertainty of the fuzzifier m in classical FSSC methods, the objective functions of SSW-IT2FCM-O are:
$$J_{m_1} = \sum_{i=1}^{C} \sum_{j=1}^{N} A_j (u_{ij})^{m_1} w\tilde{d}_{ij}^{\,2} + \alpha \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij} - f_{ij} b_j)^{m_1} w\tilde{d}_{ij}^{\,2}, \tag{7}$$
$$J_{m_2} = \sum_{i=1}^{C} \sum_{j=1}^{N} A_j (u_{ij})^{m_2} w\tilde{d}_{ij}^{\,2} + \alpha \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij} - f_{ij} b_j)^{m_2} w\tilde{d}_{ij}^{\,2}, \tag{8}$$
where $C$ is the number of classes and $N$ is the number of samples in each classification node, $A_j$ is the area of the j-th object, and $w\tilde{d}_{ij}$ is the weighted interval distance defined below. $\alpha$ is the scaling factor used to maintain a balance between the supervised and unsupervised components within the optimization mechanism. $F = [f_{ij}]$ is the prior knowledge matrix that indicates the membership grades of the labeled samples, and $b_j$ is an indicator that distinguishes labeled from unlabeled samples: if sample $x_j$ is labeled, $b_j = 1$; otherwise, $b_j = 0$. $m_1$ and $m_2$ are two fuzzifiers.
To handle the uncertainty of the remote sensing image, three types of information (spectra, SIs and texture values) of all class centers are expressed as interval number matrices: $\tilde{V}^X = \{\tilde{v}_1^X, \tilde{v}_2^X, \ldots, \tilde{v}_C^X\}^T$, $\tilde{V}^Y = \{\tilde{v}_1^Y, \tilde{v}_2^Y, \ldots, \tilde{v}_C^Y\}^T$ and $\tilde{V}^Z = \{\tilde{v}_1^Z, \tilde{v}_2^Z, \ldots, \tilde{v}_C^Z\}^T$, where $C$ is the class number and all elements of $\tilde{V}^X$, $\tilde{V}^Y$ and $\tilde{V}^Z$ are interval number vectors. Because each of them has a different weight for classification in SSW-IT2FCM-O, a weighted distance between image objects and land cover types must be defined. Let each classification node contain selected bands X, selected SIs Y and selected textures Z. The weighted distance $w\tilde{d}_{ij}$ between object j and centroid i can be expressed as follows:
$$w\tilde{d}_{ij} = \beta_1 \| x_j - \tilde{v}_i^X \|_{W^X} + \beta_2 \| y_j - \tilde{v}_i^Y \|_{W^Y} + \beta_3 \| z_j - \tilde{v}_i^Z \|_{W^Z}, \tag{9}$$
where $\| \cdot \|_W$ is the weighted interval distance defined by Equation (4); more specifically, $\| x_j - \tilde{v}_i^X \|_{W^X}$ is the weighted spectral attribute distance of object j to the spectral attribute center of class i, $\| y_j - \tilde{v}_i^Y \|_{W^Y}$ is the weighted spectral indices distance of object j to the spectral indices center of class i, and $\| z_j - \tilde{v}_i^Z \|_{W^Z}$ is the weighted texture value distance of object j to the texture value center of class i. The parameters $\beta_1$, $\beta_2$ and $\beta_3$ control the effects of the selected bands, SIs and texture values; for example, $\beta_3 = 0$ means that no texture information is used. The weights $W^X$, $W^Y$ and $W^Z$ are calculated by Equation (6).
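In type-1 fuzzy set-based FSSC methods, different fuzzifiers produce different membership values of samples and hence different classification results. To handle the uncertainty of membership values, two fuzzifiers $m_1$ and $m_2$ are used to construct the lower and upper membership functions, and the membership values are expressed as an interval number $\tilde{u}_{ij} = [\underline{u}_{ij}, \bar{u}_{ij}]$. The Lagrange multiplier method is used to minimize the objective functions (7) and (8), and the upper and lower membership grades of unlabeled samples are:
$$\bar{u}_{ij} = \begin{cases} \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_1 - 1)}}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} < \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_2 - 1)}}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} \ge \dfrac{1}{C} \end{cases} \tag{10}$$
$$\underline{u}_{ij} = \begin{cases} \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_1 - 1)}}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} \ge \dfrac{1}{C} \\[2ex] \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_2 - 1)}}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} < \dfrac{1}{C} \end{cases} \tag{11}$$
The upper and lower membership grades of labeled samples are:
$$\bar{u}_{ij} = \begin{cases} (1 - \alpha) \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_1 - 1)}} + \alpha f_{ij}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} < \dfrac{1}{C} \\[2ex] (1 - \alpha) \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_2 - 1)}} + \alpha f_{ij}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} \ge \dfrac{1}{C} \end{cases} \tag{12}$$
$$\underline{u}_{ij} = \begin{cases} (1 - \alpha) \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_1 - 1)}} + \alpha f_{ij}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} \ge \dfrac{1}{C} \\[2ex] (1 - \alpha) \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})^{2/(m_2 - 1)}} + \alpha f_{ij}, & \text{if } \dfrac{1}{\sum_{k=1}^{C} (w\tilde{d}_{ij} / w\tilde{d}_{kj})} < \dfrac{1}{C} \end{cases} \tag{13}$$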
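The construction of interval memberships from two fuzzifiers can be sketched as follows. This sketch deliberately swaps in a simplification that is not the case-based rule of the text: it computes the type-1 membership once per fuzzifier and takes the element-wise minimum and maximum as the lower and upper grades, which guarantees that the lower grade never exceeds the upper one. The function names, the distance values and the prior matrix are hypothetical, and all distances are assumed to be strictly positive.

```python
import numpy as np

def fcm_membership(dist, m):
    """Type-1 FCM membership for a (C, N) matrix of positive distances."""
    # ratio[i, k, j] = d_ij / d_kj
    ratio = dist[:, None, :] / dist[None, :, :]
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=1)

def interval_membership(dist, m1, m2, alpha=0.0, prior=None, labeled=None):
    """Lower and upper membership matrices (min/max simplification).

    For labeled samples the grades are blended with the prior matrix F as
    (1 - alpha) * u + alpha * f, mirroring the form used for labeled samples.
    """
    dist = np.asarray(dist, dtype=float)
    u1, u2 = fcm_membership(dist, m1), fcm_membership(dist, m2)
    lower, upper = np.minimum(u1, u2), np.maximum(u1, u2)
    if prior is not None:
        prior = np.asarray(prior, dtype=float)
        mask = np.asarray(labeled, dtype=bool)
        lower[:, mask] = (1 - alpha) * lower[:, mask] + alpha * prior[:, mask]
        upper[:, mask] = (1 - alpha) * upper[:, mask] + alpha * prior[:, mask]
    return lower, upper

# Two classes, two objects (illustrative distances only).
dist = np.array([[1.0, 3.0],
                 [3.0, 1.0]])
lower, upper = interval_membership(dist, m1=2.1, m2=5.0)
print(lower)
print(upper)
```

The gap between the two matrices is the membership uncertainty introduced by not knowing the right fuzzifier.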
When type-1 fuzzy set-based FCMs and FSSC methods are used in remote sensing classification, the single-valued curves are the fuzzy-membership-weighted averages of the spectral values of the pixels for all bands. For the interval type-2 fuzzy set-based IT2 FCMs or SIIT2-FCM, the curves are determined by the Karnik–Mendel (KM) algorithm [39], which is an effective method for determining the centers of interval type-2 fuzzy sets during each iteration, but all centers are then type-reduced by $(v_i^- + v_i^+)/2$. This means that the spectral curves obtained by IT2 FCMs or SIIT2-FCM are also single-valued rather than banded: the width information of the spectral band is lost, which is inconsistent with the phenomenon of 'same materials with different spectra' shown in Figure 1. In this study, all class centers are expressed as interval number vectors in order to handle the uncertainty of the centers.
In addition, because the sizes of image objects are different, the KM algorithm cannot be used directly in our study, so it should be modified for the SSW-IT2FCM-O. In type-1 FCM, the area-weighted centroid is expressed as:
$$v_i = \frac{\sum_{j=1}^{N} A_j \times (u_{ij})^m \times x_{ij}}{\sum_{j=1}^{N} A_j \times (u_{ij})^m}, \tag{14}$$
The area-weighted KM algorithm is then described as follows (see Algorithm 1):
Algorithm 1. The area-weighted KM algorithm for SSW-IT2FCM-O ($\underline{u}_{ij}$, $\bar{u}_{ij}$, $m_1$, $m_2$, $A$)
Step 1: Set $m = (m_1 + m_2)/2$ and $u_{ij} = (\underline{u}_{ij} + \bar{u}_{ij})/2$, and calculate $v_i$ by Equation (14).
Step 2: Find $k$ $(1 \le k \le N - 1)$ such that $x_{ik} \le v_i \le x_{i(k+1)}$.
Step 3: Calculate $v_i'$ as follows:
In the case of computing $v_i^L$:
$$v_i' = \frac{\sum_{l=1}^{k} A_l \times (\bar{u}_{il})^m x_{il} + \sum_{l=k+1}^{N} A_l \times (\underline{u}_{il})^m x_{il}}{\sum_{l=1}^{k} A_l \times (\bar{u}_{il})^m + \sum_{l=k+1}^{N} A_l \times (\underline{u}_{il})^m} \tag{15}$$
In the case of computing $v_i^R$:
$$v_i' = \frac{\sum_{l=1}^{k} A_l \times (\underline{u}_{il})^m x_{il} + \sum_{l=k+1}^{N} A_l \times (\bar{u}_{il})^m x_{il}}{\sum_{l=1}^{k} A_l \times (\underline{u}_{il})^m + \sum_{l=k+1}^{N} A_l \times (\bar{u}_{il})^m} \tag{16}$$
Step 4: If $v_i' = v_i$, go to Step 5.
   Otherwise,
   set $v_i = v_i'$ and go back to Step 2.
Step 5: Set $v_i^L = v_i'$ (or $v_i^R = v_i'$).
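The steps of Algorithm 1 can be sketched for one class and one feature dimension as follows. This is a minimal sketch under stated assumptions: the samples are sorted in ascending order before the switch point is searched, the right endpoint uses the lower memberships to the left of the switch point and the upper memberships to its right (the usual KM convention), and the function name, the `max_iter` guard and the sample values are hypothetical.

```python
import numpy as np

def area_weighted_km(x, lower, upper, area, m, side, max_iter=100):
    """Left ('L') or right ('R') endpoint of an interval centroid.

    x: feature values of the N objects; lower/upper: their membership bounds
    for one class; area: object areas; m: fuzzifier, e.g. (m1 + m2) / 2.
    """
    x, lower, upper, area = (np.asarray(v, dtype=float) for v in (x, lower, upper, area))
    order = np.argsort(x)                      # KM needs ascending samples
    x, lower, upper, area = x[order], lower[order], upper[order], area[order]

    # Step 1: start from the area-weighted centroid of the mean membership.
    w = area * ((lower + upper) / 2) ** m
    v = np.sum(w * x) / np.sum(w)
    for _ in range(max_iter):
        # Step 2: switch point k with x[k-1] <= v <= x[k], kept in [1, N-1].
        k = int(np.clip(np.searchsorted(x, v, side="right"), 1, len(x) - 1))
        # Step 3: mix upper and lower memberships around the switch point.
        if side == "L":
            u = np.concatenate([upper[:k], lower[k:]])
        else:
            u = np.concatenate([lower[:k], upper[k:]])
        w = area * u ** m
        v_new = np.sum(w * x) / np.sum(w)
        # Steps 4 and 5: stop once the estimate no longer changes.
        if np.isclose(v_new, v):
            return v_new
        v = v_new
    return v

# Illustrative values only: four objects of equal area.
x = [1.0, 2.0, 3.0, 4.0]
lower = [0.2, 0.2, 0.2, 0.2]
upper = [0.8, 0.8, 0.8, 0.8]
area = [1.0, 1.0, 1.0, 1.0]
v_left = area_weighted_km(x, lower, upper, area, m=2.0, side="L")
v_right = area_weighted_km(x, lower, upper, area, m=2.0, side="R")
print(v_left, v_right)
```

The pair of endpoints forms the interval center on that feature; keeping both, rather than averaging them, preserves the band width.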
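The iteration stops when $|J_m^{t+1} - J_m^{t}| \le \varepsilon$ is satisfied. The lower and upper membership grades of each sample with respect to each class are then determined and form an interval number vector $\{\tilde{u}_{1k}, \tilde{u}_{2k}, \ldots, \tilde{u}_{Ck}\} = \{[\underline{u}_{1k}, \bar{u}_{1k}], [\underline{u}_{2k}, \bar{u}_{2k}], \ldots, [\underline{u}_{Ck}, \bar{u}_{Ck}]\}$ $(k = 1, 2, \ldots, N)$. The possibility degree that one interval in the vector is not smaller than another can be calculated as follows:
$$P(\tilde{u}_{ik} \ge \tilde{u}_{jk}) = \begin{cases} 1, & \underline{u}_{jk} \le \bar{u}_{jk} \le \underline{u}_{ik} \le \bar{u}_{ik} \\[1ex] 1 - \dfrac{(\bar{u}_{jk} - \underline{u}_{ik})^2}{2 L(\tilde{u}_{ik}) L(\tilde{u}_{jk})}, & \underline{u}_{jk} \le \underline{u}_{ik} \le \bar{u}_{jk} \le \bar{u}_{ik} \\[2ex] \dfrac{\underline{u}_{ik} + \bar{u}_{ik} - 2\underline{u}_{jk}}{2 L(\tilde{u}_{jk})}, & \underline{u}_{jk} \le \underline{u}_{ik} \le \bar{u}_{ik} < \bar{u}_{jk} \\[2ex] \dfrac{2\bar{u}_{ik} - (\underline{u}_{jk} + \bar{u}_{jk})}{2 L(\tilde{u}_{ik})}, & \underline{u}_{ik} \le \underline{u}_{jk} \le \bar{u}_{jk} \le \bar{u}_{ik} \\[2ex] \dfrac{(\bar{u}_{ik} - \underline{u}_{jk})^2}{2 L(\tilde{u}_{ik}) L(\tilde{u}_{jk})}, & \underline{u}_{ik} < \underline{u}_{jk} \le \bar{u}_{ik} < \bar{u}_{jk} \\[2ex] 0, & \underline{u}_{ik} < \bar{u}_{ik} < \underline{u}_{jk} < \bar{u}_{jk} \end{cases} \tag{17}$$
where $L(\tilde{u}_{ik}) = \bar{u}_{ik} - \underline{u}_{ik}$ and $L(\tilde{u}_{jk}) = \bar{u}_{jk} - \underline{u}_{jk}$ are the widths of the interval numbers $\tilde{u}_{ik}$ and $\tilde{u}_{jk}$, respectively, for i, j = 1, 2, …, C and k = 1, 2, …, N.
We can then obtain a possibility matrix $P = (p_{ij,k})$. Moreover, the ranking vector $w_k = (w_{1k}, w_{2k}, \ldots, w_{Ck})^T$ can be calculated by $w_i = \frac{1}{n(n-1)} \left( \sum_{j=1}^{n} p_{ij} + \frac{n}{2} - 1 \right)$, where $n = C$ is the number of intervals being compared, and the index of the maximum value in $w_k$ is the class index of the sample.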
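The possibility degree and the ranking vector described above can be sketched for a single sample as follows. The function names and the membership intervals are hypothetical, and the sketch assumes that the intervals have positive width (two identical degenerate intervals would be reported as possibility 1 rather than 0.5).

```python
def possibility_ge(a, b):
    """Possibility degree that interval a = (a_lo, a_hi) is not smaller than b."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    width_a, width_b = a_hi - a_lo, b_hi - b_lo
    if b_hi <= a_lo:                          # b lies entirely below a
        return 1.0
    if a_hi <= b_lo:                          # a lies entirely below b
        return 0.0
    if b_lo <= a_lo and b_hi <= a_hi:         # partial overlap, a to the right
        return 1.0 - (b_hi - a_lo) ** 2 / (2 * width_a * width_b)
    if b_lo <= a_lo and a_hi < b_hi:          # a nested inside b
        return (a_lo + a_hi - 2 * b_lo) / (2 * width_b)
    if a_lo <= b_lo and b_hi <= a_hi:         # b nested inside a
        return (2 * a_hi - (b_lo + b_hi)) / (2 * width_a)
    return (a_hi - b_lo) ** 2 / (2 * width_a * width_b)  # partial overlap, a to the left

def rank_classes(intervals):
    """Ranking vector over the C membership intervals of one sample."""
    n = len(intervals)
    p = [[possibility_ge(a, b) for b in intervals] for a in intervals]
    scores = [(sum(row) + n / 2 - 1) / (n * (n - 1)) for row in p]
    return scores.index(max(scores)), scores

# Made-up membership intervals of one object for three classes.
memberships = [(0.10, 0.30), (0.20, 0.60), (0.05, 0.15)]
best, scores = rank_classes(memberships)
print(best, scores)
```

Ranking by possibility degree uses the whole interval, so a class with a slightly lower upper bound but a much higher lower bound can still win.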
It is easy to see that when the area of each sample equals 1, the centroid simplifies to $v_i = \frac{\sum_{j=1}^{N} (u_{ij})^m \times x_{ij}}{\sum_{j=1}^{N} (u_{ij})^m}$ and the area-weighted KM algorithm reduces to the KM algorithm. Therefore, the SSW-IT2FCM-O can also be used for pixel-based image classification.
The SSW-IT2FCM-O is described as follows (see Algorithm 2):
Algorithm 2. SSW-IT2FCM-O
Step 1: Initialization. Set the values of the two fuzzifiers $m_1$ and $m_2$, the termination criterion $\varepsilon$ and the class number $C$, and initialize the band centroid $\tilde{V}^B$ and the spectral indices centroid $\tilde{V}^S$ by the first (Q1) and third (Q3) quartiles, respectively. The weights of the selected bands and SIs are calculated by Equation (6).
Step 2: Calculate the new distance between object j and centroid i using Equation (9), and calculate the lower and upper membership degree matrices by Equations (10)–(13).
Step 3: Calculate all centroids of the band subset $\tilde{v}^B = [\tilde{v}_i^B]$, i = 1, 2, …, C, and of the spectral indices $\tilde{v}^S = [\tilde{v}_i^S]$, i = 1, 2, …, C, and determine their lower and upper bounds $\tilde{v}_i^L$ and $\tilde{v}_i^R$, respectively, via the area-weighted KM algorithm.
Step 4: Calculate the objective function via Equations (7) and (8). If $|O^{(t+1)} - O^{(t)}| \le \varepsilon$, go to Step 5; otherwise, go to Step 2.
Step 5: Calculate the possibility matrix using Equation (17), obtain the ranking vector, assign each sample to a cluster and return the clustering results.

3.5. Model the Banded Spectral Curves of Land Cover Types

Because not all bands are selected at each classification node, we can only get several interval centers of these selected bands in Section 3.4. In order to obtain a complete banded spectrum curve, we collect the lower and upper membership degree matrix of each target land cover type which is achieved in some classification nodes, and then execute Algorithm 1 to produce the banded spectrum curves of each land cover type.

4. Experiments

In this section, a WorldView-2 (WV-2) dataset is used to test the proposed method. The dataset is classified into six types: water bodies, shadows, buildings, bare land, grass and woods. The fuzzifiers $m_1$ and $m_2$ are set to 2.1 and 5, respectively, the maximum number of iterations is 100, and the termination criterion $\varepsilon$ is set to 0.0001. The classification accuracy is compared against that of other FSSC methods in terms of four typical validity indices: the partition coefficient (PC), the partition entropy (PE), the Xie and Beni index (XB) and the Fukuyama and Sugeno index (FS) [40,41]. In addition, the confusion matrix and kappa coefficient are used to verify the accuracy of the proposed method, and the banded spectra of the target land cover types produced by the proposed method are compared with the spectral curves produced by other FSSC methods. Finally, the effects of sample size on the performance of our method are investigated.

4.1. Study Area and Materials

The study area (Figure 5) is the campus of Tianjin Normal University, located in southwestern Tianjin, a metropolis on the northern coast of mainland China. This region was chosen mainly because it is convenient for validation and contains many typical land cover types, mainly buildings, wetlands, trees, grasslands, trails, bare soil and concrete roads.
In this paper, WV-2 multispectral images with 3988 × 2532 pixels (252 ha) are used to test the classification performance of the presented HSW-IT2FCM-O algorithm. The dataset was acquired on 13 September 2015. Radiometric and atmospheric correction and Gram–Schmidt sharpening were performed in the ENVI software. All eight bands were segmented with the multi-resolution segmentation algorithm in the eCognition software, with a segmentation scale of 120 and shape and compactness weights of 0.1 and 0.5, respectively. A total of 20,078 objects were produced.
All the SIs in Table 1, together with the sum of all bands (SUM), brightness, and ratio of Nir2, are considered. The land cover types are hierarchically organized into a tree (Figure 6). The top node contains three types: dark objects, vegetation, and impervious surfaces (ISs) and bare soils. At the second level, the dark objects are further classified into water body and shadows, the vegetation into grasslands and woodlands, and the ISs and bare soils into impervious surfaces and bare soils. At the third level, the woodlands are further refined into dense and sparse woodlands. In this study, the sparse woodlands mainly refer to shrubs, groves and emergent macrophytes. The impervious surfaces are further classified into roads, buildings and other ISs, which mainly comprise parking lots, basketball courts, etc.
The labelled dataset covers approximately 9.1% of the study area (Figure 7a) and the test dataset (Figure 7b) covers 51.6% of the study area. All these data are distributed uniformly in the study area. In this study, two datasets were used to test the proposed method and compare with other methods.

4.2. Results

We use the labeled datasets to select training samples in the segmented dataset by an intersection operation, and then use the selected training samples to calculate the mean, Q1 and Q3 for each statistic node (target class) (Figure 6). The bands, SIs and texture values used in each classification node are selected in the eCognition software and listed in Table 2. The selected SIs are shown in Figure 8, and the weights of the selected bands, SIs and texture values are calculated by Equation (6).
At the top node, we classify the study area into three types: dark objects, vegetation, and impervious surfaces and bare soil. The results are shown in Figure 9, and the user accuracy (U.A.), producer accuracy (P.A.), overall accuracy (O.A.) and kappa coefficient are provided in Table 3. The overall accuracy and kappa coefficient are 0.9859 and 0.9777, respectively, which indicates a very high accuracy at this level. In terms of U.A., the vegetation has the highest accuracy, followed by the dark objects, while the impervious surfaces and bare soil have a relatively lower accuracy.
The final result is shown in Figure 10. The confusion matrix and accuracy are listed in Table 4. At this level, the dark objects are classified into water and shadows; the accuracy (P.A.) of water is as high as 0.9986 and that of shadows is 0.7159. The vegetation node is further classified into woodlands and grasslands, which share almost the same spectral absorption features. Thus, it is difficult to separate them using spectral information alone. If different spectral indices are considered, woodlands and grasslands can be distinguished with high accuracy. The woodlands are further classified into dense and sparse woodlands, and the grasslands, dense woodlands and sparse woodlands have an accuracy (P.A.) of 0.9578, 0.8829 and 0.5357, respectively. Regarding the impervious surfaces and bare soil, the accuracy (P.A.) of bare soil is 0.8987. The impervious surfaces are further classified into roads, buildings and others, with accuracies of 0.8075, 0.7051 and 0.6249, respectively. The lowest accuracy (0.5357, for sparse woodlands) reflects the poor discrimination of sparse woodlands from dense woodlands, which is mainly due to the fuzziness of sparse woodlands. In this study area, sparse woodlands contain shrubs, groves and emergent macrophytes, and almost none of them are pure. Thus, it is difficult to distinguish between sparse woodlands and dense woodlands. The complexity of building roofs, and the high similarity between their spectra and those of roads and other ISs, increases the difficulty of building extraction; the same holds for the other impervious surfaces.

4.3. Comparison with Other Methods

In this section, we compare the results of the hierarchical SSW-IT2FCM-O with those of three FSSC methods, namely, ssFCM, SIIT2-FCM and SKFCM-F, as well as with those of LaSVM, which is a popular supervised method. All these methods use the same labelled and test datasets as described in Section 4.1.
The parameter values of SIIT2-FCM are the same as those of the hierarchical SSW-IT2FCM-O. For the ssFCM and SKFCM-F, the fuzzifier m is set to 2, the termination criterion value ε is set to 0.0001, and the maximum number of iterations is 100. For the LaSVM, the radial basis function (RBF) is selected as the kernel function, the parameter gamma is set to the average of the variances of the band values of the labeled samples, and the parameter C is set to 10. The results of these methods are shown in Figure 11, and the accuracies and kappa coefficients are reported in Table 5.
Comparing the results of ssFCM, SIIT2-FCM, SKFCM-F, LaSVM and the proposed method (Figure 10 and Figure 11), it is clear that object-based methods greatly reduce the salt-and-pepper effect, and that the proposed method performs better than the other methods. In region A of Figure 10, shadows are misclassified as water by SKFCM-F and SIIT2-FCM, which means that these methods are poor at distinguishing water from shadows. In region B, the dense woodland is partly grouped into sparse woodland by SKFCM-F and SIIT2-FCM. In region C, the sparse woodland cannot be detected by ssFCM, SIIT2-FCM, SKFCM-F or LaSVM: it is misclassified as bare soil or grass by ssFCM, as water by LaSVM, and as dense woodland by SIIT2-FCM and SKFCM-F. These methods are therefore weak at distinguishing grass, dense woodland and sparse woodland. In region D, the fuzzy methods (ssFCM, SIIT2-FCM, SKFCM-F, hierarchical SSW-IT2FCM-O) present more detailed information than LaSVM. In region E, most of the grass is classified as bare soil by ssFCM, SIIT2-FCM and SKFCM-F, some bare soil is classified as impervious surface by LaSVM, and the water body is partly grouped into shadows by SIIT2-FCM and SKFCM-F. In region F, part of a sports field is misclassified as roads by SIIT2-FCM and SKFCM-F. Region G is a wetland containing shallow water, Typha domingensis, reeds, and lotus leaves. Some vegetation in this region is misclassified as building or water by LaSVM, whose result is poorer than that of the FSSC methods. In region H, the planted shrubs are misclassified as bare soil by ssFCM and as water by LaSVM. In region I, bare soil is misclassified as impervious surface by LaSVM and SIIT2-FCM.
Like other object-based methods, the proposed method effectively avoids the negative influence of salt-and-pepper noise. The other FSSC methods are pixel-based methods and are sensitive to salt-and-pepper noise, so their results (Figure 11a–c) seem more fragmented than the result of the proposed method.
Table 5 shows that the SKFCM-F is weak in distinguishing shadows from water, achieving only 16.68% accuracy, and that it also performs weakly in impervious surface classification. Thus, the O.A. of SKFCM-F is the lowest. The ssFCM has the best performance for bare soil and other ISs, but it performs poorly in woodland classification. The SKFCM-F has the best performance for water, because almost all dark objects are classified as water. The LaSVM has the best performance for buildings, with a U.A. of 87.94%, but it also loses a lot of detailed information. Although the LaSVM performs well, its results contain some unreasonable classifications. For example, the landscape transition in region E of Figure 10 is water body → shallow water → bare soil → grass → woods, which is a typical fuzziness phenomenon. The shallow water is classified as shadows by the fuzzy methods (ssFCM, SIIT2-FCM, SKFCM-F and hierarchical SSW-IT2FCM-O); these results seem reasonable, since the spectral features of shallow water are very similar to those of shadow. However, in this region, the LaSVM classifies the bare soil as impervious surface and, in region G, it also misclassifies some vegetation as impervious surface, which is unreasonable. On the whole, the proposed method shows the best performance among these methods (Table 5), with an overall accuracy of 87.02%.
The values of the fuzzy classification validity indices PC-, PE-, XB- and FS- for ssFCM, SIIT2-FCM, SKFCM-F and hierarchical SSW-IT2FCM-O are listed in Table 6. PC- indicates the average relative amount of membership sharing between pairs of fuzzy subsets; the higher the PC- value, the better the corresponding classification result. PE- is a scalar measure of the amount of fuzziness in a set of results. FS- measures the discrepancy between fuzzy compactness and fuzzy separation. XB- measures the average within-cluster fuzzy compactness against the minimum between-cluster separation. For these three indices (PE-, FS- and XB-), smaller values indicate better clustering performance. The PC- and FS- values indicate that the proposed method performs best. Although the XB- value of SIIT2-FCM is the smallest, the hierarchical SSW-IT2FCM-O actually has the highest accuracy among these four FSSC methods (Table 5).
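As a minimal sketch of the two simplest indices, the code below computes the partition coefficient and partition entropy in their standard Bezdek form for an ordinary (type-1) membership matrix. Whether the PC- and PE- values in Table 6 were computed in exactly this form is an assumption; the function names and the toy membership matrices are hypothetical.

```python
import math
from typing import Sequence


def partition_coefficient(memberships: Sequence[Sequence[float]]) -> float:
    """PC = (1/N) * sum over samples k and clusters i of u_ik squared.

    memberships[k][i] is the membership of sample k in cluster i,
    and each row is assumed to sum to 1.
    """
    n_samples = len(memberships)
    return sum(u * u for row in memberships for u in row) / n_samples


def partition_entropy(memberships: Sequence[Sequence[float]]) -> float:
    """PE = -(1/N) * sum of u_ik * ln(u_ik), taking 0 * ln(0) as 0."""
    n_samples = len(memberships)
    total = sum(u * math.log(u) for row in memberships for u in row if u > 0.0)
    return -total / n_samples


# Hypothetical membership matrices for two samples and two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0]]      # no membership sharing
shared = [[0.5, 0.5], [0.5, 0.5]]     # maximal membership sharing

print(partition_coefficient(crisp), partition_entropy(crisp))
print(partition_coefficient(shared), partition_entropy(shared))
```

The crisp partition gives the best possible values (PC of 1, PE of 0), while the fully shared partition gives the worst ones for two clusters (PC of 0.5, PE of ln 2), which matches the reading that higher PC- and lower PE- are better.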
Regarding time consumption, the proposed method takes much less time than the other three FSSC methods (ssFCM, SIIT2-FCM and SKFCM-F) in our experiment. The main reason is that the SSW-IT2FCM-O is an object-based method that classifies 20,078 objects as input, whereas the other three FSSC methods are pixel-based and classify 3988 × 2532 pixels (at a resolution of 0.5 m) or 997 × 633 pixels (at a resolution of 2 m), so the input dataset of the proposed method is far smaller. Another reason is that the number of classes at each classification node is lower than the number of expected classes, and the number of inputs at each child classification node is lower than the full input dataset. So, the hierarchical SSW-IT2FCM-O is less time consuming than the other FSSC methods. The process of the SSW-IT2FCM-O is similar to that of IT2FCM and SIIT2-FCM, whose computational complexity has been proved to be $R \times O(N^2)$ [5,14], where $R$ is the number of required iterations. Generally, $R \ll N$; thus, the computational complexity of the SSW-IT2FCM-O, like that of the IT2FCM and SIIT2-FCM, is $O(N^2)$. As discussed above, the number of segmented objects is far lower than the number of pixels for the same study area, so the SSW-IT2FCM-O is often faster than the SIIT2-FCM.
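The size argument can be checked with simple arithmetic on the figures quoted above. The variable names are hypothetical, and the squared ratio only illustrates how an $O(N^2)$ term would scale; it is not a measured running time.

```python
# Input sizes quoted in the text.
pixels_half_metre = 3988 * 2532   # pixel-based input at 0.5 m resolution
pixels_two_metre = 997 * 633      # pixel-based input at 2 m resolution
objects = 20_078                  # object-based input after segmentation

# How many times larger the pixel-based inputs are than the object input.
ratio_half_metre = pixels_half_metre / objects
ratio_two_metre = pixels_two_metre / objects

print(f"0.5 m grid: {pixels_half_metre:,} pixels, {ratio_half_metre:.1f}x the objects")
print(f"2 m grid:   {pixels_two_metre:,} pixels, {ratio_two_metre:.1f}x the objects")

# Under an O(N^2) cost model the gap grows with the square of the ratio.
print(f"N^2 scaling factors: {ratio_half_metre ** 2:.0f} and {ratio_two_metre ** 2:.0f}")
```

Even at the coarser 2 m resolution the pixel-based methods process roughly 31 times more inputs than the object-based method, and about 503 times more at 0.5 m.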

4.4. Discussion

In the FSSC methods, the centroid (spectral curve) of a class is crucial; the closer the centroid is to the true value, the better the classification performance will be. Among the three selected typical FSSC methods (ssFCM, SIIT2-FCM and SKFCM-F), the ssFCM shows the best performance, so it is used for comparison in this section. We treat the first (Q1) and third (Q3) quartiles of the test dataset as the true values of the spectral curves of the land covers (Figure 12). Figure 12 shows that some banded curves overlap each other, such as those of water and shadows and of dense and sparse woodlands, and that the spectral curves of the three types of impervious surfaces overlap badly. The greater the similarity between two curves, the greater the uncertainty of classification; this spectral uncertainty is what makes the classification difficult. In this section, we first discuss the ability of the proposed method to model the uncertainty of the spectrum, and then discuss the uncertainty of the membership degree of six types and the effects of sample sizes.
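Taking Q1 and Q3 as the lower and upper limits of a reference band can be sketched as follows. This is a hedged illustration: the band names and reflectance values are made up, `quartile_band` is a hypothetical helper, and the quartile convention (Python's default "exclusive" method) is an assumption, since the text does not say which one was used.

```python
import statistics
from typing import Dict, List, Tuple


def quartile_band(
    samples_by_band: Dict[str, List[float]]
) -> Dict[str, Tuple[float, float]]:
    """Return (Q1, Q3) per band from the sample values of one class."""
    band_limits = {}
    for band, values in samples_by_band.items():
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _median, q3 = statistics.quantiles(values, n=4)
        band_limits[band] = (q1, q3)
    return band_limits


# Hypothetical reflectance samples (scale factor 10,000) for one class.
toy_samples = {
    "Red": [100, 200, 300, 400, 500, 600, 700],
    "Nir1": [2100, 2200, 2300, 2400, 2500, 2600, 2700],
}

band = quartile_band(toy_samples)
for name, (lower, upper) in band.items():
    print(f"{name}: lower={lower}, upper={upper}, width={upper - lower}")
```

The width `upper - lower` is the quantity a single-valued curve cannot express: it is zero by construction for a curve built from one value per band.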

4.4.1. Can Existing Methods Model Spectral Uncertainty?

The spectral curves of the nine land cover types derived from the test dataset and modeled by ssFCM and LaSVM are shown in Figure 13. The curves modeled by ssFCM and LaSVM are single-valued, so the width of the curves cannot be estimated. As a result, ssFCM and LaSVM cannot model the spectral uncertainty and have limited ability to handle this type of uncertainty. The distances between the true-value curves and the spectral curves modeled by ssFCM and LaSVM are calculated by Equation (3) and listed in Table 7. The spectral curves modeled by the ssFCM have the largest deviation from the true values. From the perspective of land cover types, the curve of the building class has the largest deviation across the methods. The curves of buildings and sparse woodlands produced by ssFCM lie almost entirely outside the true value ranges, which results in the low classification accuracy of these two types.

4.4.2. Can the Hierarchical SSW-IT2FCM-O Model Spectral Uncertainty?

The banded spectral curves produced by the hierarchical SSW-IT2FCM-O are shown in Figure 14. All of them have lower and upper boundaries, so the band width can be expressed clearly and the spectral uncertainty can be modeled by the width of these banded spectral curves. Intuitively, the banded spectral curves modeled by the SSW-IT2FCM-O are similar in shape to the true values. All spectral curves are close to the true values, especially those of water, grass, dense woodlands and shadows. The banded spectral curves of water, grass, dense woodlands and impervious surfaces modeled by the SSW-IT2FCM-O lie almost entirely within the corresponding true values. In short, the banded curves can express the spectra of geographical objects more objectively and accurately than single-valued curves. Thus, the presented method performs better than the ssFCM.

4.4.3. Effects of Sample Volume

As mentioned in Section 1, the FSSC methods need fewer training samples than supervised methods; in this section, we investigate the effects of sample volume on the hierarchical SSW-IT2FCM-O. To analyze these effects, we reduce the labelled samples so that they cover approximately 8% and 6% of the study area, respectively. Figure 15 shows the reduced samples and the results of ssFCM, SIIT2-FCM, SKFCM-F, LaSVM and hierarchical SSW-IT2FCM-O when the reduced samples cover approximately 8%, and the accuracies and Kappa coefficients are reported in Table 8. The accuracy of the ssFCM is reduced greatly, and some grasslands and woodlands are misclassified as bare soils. The hierarchical SSW-IT2FCM-O lost approximately 0.47% accuracy and the LaSVM lost approximately 0.8%.
The reduced samples and the results of ssFCM, SIIT2-FCM, SKFCM-F, LaSVM and hierarchical SSW-IT2FCM-O when the reduced samples cover approximately 6% are shown in Figure 16, and the accuracies and Kappa coefficients are reported in Table 9. The shadows are misclassified as water by ssFCM, SIIT2-FCM and SKFCM-F, while some woodlands are misclassified as grass, as shown in Figure 16b–d. Although the LaSVM achieved higher accuracy in the previous test, when the sample volume is reduced to 6%, some water regions are misclassified as buildings in Figure 16e, so that its accuracy drops sharply and its accuracy loss is more serious than that of the FSSC methods. From Table 9, we can see that the accuracies of ssFCM and LaSVM are reduced greatly, while the hierarchical SSW-IT2FCM-O lost approximately 1.45% accuracy. Therefore, the hierarchical SSW-IT2FCM-O depends less on samples and is more stable than the other methods.

5. Conclusions

This study proposed a hierarchical method to model the spectral uncertainty for VHR images based on IT2 FSs, with higher classification accuracy than existing FSSC methods, namely, ssFCM, SIIT2-FCM and SKFCM-F. For this method, the prior knowledge of labeled samples comprises three types: the hierarchical organization of land cover types, the weights of bands and spectral indices, and the statistical information of labeled samples. The input image is first classified into several subsets according to the hierarchical organization of labeled samples, and these subsets are then further divided into sub-classes.
The selection of bands, spectral indices and textures is essential for the proposed method. Although there are some feature selection methods in the machine learning domain, such as the information gain algorithm and the correlation-based feature selection algorithm [42,43,44], these methods can produce very different results, and the classification results using the features selected by these methods were poor in our experiment. Therefore, in this study we first select bands, spectral indices and textures in the eCognition software. The weight determination is another important aspect of this work; it is based on the assumption that if a land cover type can be distinguished from its nearest type, it can also be distinguished from the other types. We proposed a weight determination method based on the first (Q1) and third (Q3) quartiles and defined a weighted interval distance. The proposed method effectively utilizes the ability of interval type-2 fuzzy sets to handle uncertainties of membership degrees, and the spectral curves of land cover types are modeled as banded curves. The shortcomings of the methods based on type-1 fuzzy sets are avoided. The WorldView-2 dataset was used to test the proposed method, and the results showed that the proposed hierarchical method can effectively model the spectral uncertainties of land cover types. From the perspective of classification, it can effectively distinguish water bodies from shadows and woodlands from grasslands.
The weighted interval distance defined in this study is based on the Euclidean distance. In future, different similarity metrics, such as spectral similarity metrics [45] and the spectral angle metric [46], will be studied. As in other FSSC methods, the centroid is calculated from all samples, which increases the calculation time and affects the accuracy. However, the center of a land cover type should be determined by the samples belonging to this type, not by other samples. Thus, in future work, we will improve the centroid determination method.

Author Contributions

J.G. conceived and designed the research, implemented the HSW-IT2FCM in Java and wrote the manuscript. S.D. (Shihong Du) participated in designing the research and revised the manuscript. H.H. analyzed all test results and wrote the discussion. S.D. (Shouji Du) and X.Z. processed the test data and revised the manuscript.

Funding

The work in this paper was supported by the Key Project of Tianjin Natural Science Foundation of China (17JCZDJC39700).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cheng, J.; Guo, H.; Shi, W. Uncertainty of Remote Sensing Data; Chinese Science Press: Beijing, China, 2004.
  2. Swain, P.H.; Davis, S.M. Remote Sensing: The Quantitative Approach; McGraw-Hill: New York, NY, USA, 1978.
  3. Bezdek, J.C.; Ehrlich, R.; Full, W. FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 1984, 10, 191–203.
  4. Zadeh, L.A. Fuzzy Sets. Inf. Control 1965, 8, 338–353.
  5. Hwang, C.; Rhee, F.C.H. Uncertain fuzzy clustering: Interval type-2 fuzzy approach to C-means. IEEE Trans. Fuzzy Syst. 2007, 15, 107–120.
  6. Ji, Z.; Xia, Y.; Sun, Q.; Cao, G. Interval-valued possibilistic fuzzy C-means clustering algorithm. Fuzzy Sets Syst. 2014, 253, 138–156.
  7. Sohn, Y.; Rebello, N. Supervised and unsupervised spectral angle classifiers. Photogramm. Eng. Remote Sens. 2002, 68, 1271–1280.
  8. Ma, L.; Li, M.; Ma, X.; Cheng, L.; Du, P.; Liu, Y. A review of supervised object-based land-cover image classification. ISPRS J. Photogramm. Remote Sens. 2017, 130, 277–293.
  9. Wang, C.; Xu, A.; Li, X. Supervised Classification High-Resolution Remote-Sensing Image Based on Interval Type-2 Fuzzy Membership Function. Remote Sens. 2018, 10, 710.
  10. Li, L.; Garibaldi, J.M.; He, D.; Wang, M. Semi-Supervised Fuzzy Clustering with Feature Discrimination. PLoS ONE 2015, 10, e0131160.
  11. Lai, D.T.C.; Garibaldi, J.M. A comparison of distance-based semi-supervised fuzzy c-means clustering algorithms. In Proceedings of the 2011 IEEE International Conference on Fuzzy Systems, Taipei, China, 27–30 June 2011; pp. 1580–1586.
  12. Pedrycz, W.; Waletzky, J. Fuzzy clustering with partial supervision. IEEE Trans. Syst. Man Cybern. 1997, 27, 787–795.
  13. Zhang, D.; Tan, K.; Chen, S. Semi-supervised kernel-based fuzzy c-means. Lect. Notes Comput. Sci. Neural Inf. Process. 2004, 3316, 1229–1234.
  14. Ngo, L.T.; Mai, D.S.; Pedrycz, W. Semi-supervising Interval Type-2 Fuzzy C-Means clustering with spatial information for multi-spectral satellite image classification and change detection. Comput. Geosci. 2015, 83, 1–16.
  15. Mai, D.S.; Ngo, L.T. Multiple kernel approach to semi-supervised fuzzy clustering algorithm for land-cover classification. Eng. Appl. Artif. Intell. 2018, 68, 205–213.
  16. Bordes, A.; Ertekin, S.; Weston, J.; Bottou, L. Fast kernel classifiers with online and active learning. J. Mach. Learn. Res. 2005, 6, 1579–1619.
  17. Zadeh, L.A. The concept of a linguistic variable and its application to approximate reasoning-1. Inf. Sci. 1975, 8, 199–249.
  18. Mizumoto, M.; Tanaka, K. Some properties of fuzzy sets of type-2. Inf. Control 1976, 31, 312–340.
  19. Mendel, J.M.; Wu, H.W. Properties of the centroid of an interval type-2 fuzzy set, including the centroid of a fuzzy granule. In Proceedings of the 2005 IEEE International Conference on Fuzzy Systems, Reno, NV, USA, 22–25 May 2005; pp. 341–346.
  20. Mendel, J.M. Uncertain Rule-Based Fuzzy Systems Introduction and New Directions, 2nd ed.; Springer Publishing Company: New York, NY, USA, 2017.
  21. Bertoluzza, C.; Corral, N.; Salas, A. On a new class of distances between fuzzy numbers. Mathw. Soft Comput. 1995, 2, 71–84.
  22. Li, X.; Zhang, S. Rank of interval numbers based on a new distance measure. J. Xihua Univ. (Nat. Sci.) 2008, 27, 87–90.
  23. Guo, J.; Huo, H.; Peng, G. An interval number distance- and ranking-based method for remotely sensed image fuzzy clustering. Int. J. Remote Sens. 2018, 39, 8591–8614.
  24. Shahi, K.; Shafri, H.Z.; Taherzadeh, E.; Mansor, S.; Muniandy, R.; Shafri, H.Z.M. A novel spectral index to automatically extract road networks from WorldView-2 satellite imagery. Egypt. J. Remote Sens. Space Sci. 2015, 18, 27–33.
  25. Xie, C.; Huang, X.; Zeng, W.; Fang, X. A novel water index for urban high-resolution eight-band WorldView-2 imagery. Int. J. Digit. Earth 2016, 9, 925–941.
  26. Liu, H. Typical Urban Greening Tree Species Classification Based on WorldView-2. Ph.D. Thesis, Inner Mongolia Agricultural University, Hohhot, China, 2016.
  27. Rouse, J.W.; Haas, R.; Schell, J. Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation; NTIS No. E73-106393; Texas A&M University, Remote Sensing Center: College Station, TX, USA, 1974; 93p.
  28. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  29. McFeeters, S.K. The use of Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432.
  30. Zhou, X.; Jancsó, T.; Chen, C. Urban Land Cover Mapping Based on Object Oriented Classification Using World View 2 Satellite Remote Sensing Images. In Proceedings of the International Scientific Conference on Sustainable Development & Ecological Footprint, Sopron, Hungary, 26–27 March 2012.
  31. Wen, D.; Huang, X.; Liu, H.; Liao, W.; Zhang, L. Semantic Classification of Urban Trees Using Very High Resolution Satellite Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 1413–1424.
  32. Wolf, A. Using World View 2 Vis-NIR MSI Imagery to Support Land Mapping and Feature Extraction Using Normalized Difference Index Ratios; DigitalGlobe: Longmont, CO, USA, 2010; Unpublished Report.
  33. Ling, C.; Ju, H.; Zhang, H.; Sun, H. Research on Wetland Type Classification Based on Improved Remote Sensing Index of Worldview-2 Data. For. Res. 2014, 27, 639–643.
  34. Zha, Y.; Gao, Y.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594.
  35. Xu, Z.; Liu, J.; Yu, K.; Liu, T.; Gong, C.; Tang, M.; Xie, W.; Li, Z. Construction of Shadow Vegetation Index (SVI) and Application Effects in Four Remote Sensing Images. Spectrosc. Spectr. Anal. 2013, 33, 3359–3365.
  36. Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621.
  37. Zheng, Z.; Du, S.; Wang, Y.-C.; Wang, Q. Mining the regularity of landscape-structure heterogeneity to improve urban land-cover mapping. Remote Sens. Environ. 2018, 214, 14–32.
  38. Drăguţ, L.; Csillik, O.; Eisank, C.; Tiede, D. Automated parameterisation for multi-scale image segmentation on multiple layers. ISPRS J. Photogramm. Remote Sens. 2014, 88, 119–127.
  39. Karnik, N.N.; Mendel, J.M. Applications of type-2 fuzzy logic systems to forecasting of time-series. Inf. Sci. 1999, 120, 89–111.
  40. Xu, Y.; Brereton, R.G. A comparative study of cluster validation indices applied to genotyping data. Chemom. Intell. Lab. Syst. 2005, 78, 30–40.
  41. Wang, W.; Zhang, Y. On fuzzy cluster validity indices. Fuzzy Sets Syst. 2007, 158, 2095–2117.
  42. Frigui, H.; Nasraoui, O. Unsupervised learning of prototypes and attribute weights. Pattern Recognit. 2004, 37, 567–581.
  43. Peng, H.C.; Long, F.; Ding, C. Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
  44. Sun, Y.; Todorovic, S.; Goodison, S. Local-Learning-Based Feature Selection for High-Dimensional Data Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1610–1626.
  45. Zhao, C.; Tian, M.; Li, J. Research progress of spectral similarity metrics. J. Harbin Eng. Univ. 2017, 38, 1179–1189.
  46. Kruse, F.A.; Lefkoff, A.B.; Boardman, J.W.; Heidebrecht, K.B.; Shapiro, A.T.; Barloon, P.J.; Goetz, A.F.H. The Spectral Image Processing Systems (SIPS)-interactive visualization and analysis of imaging spectrometer data. AIP Conf. Proc. 1993, 2, 145–163.
Figure 1. Single-valued and banded spectral curves.
Figure 2. Type-1 fuzzy set and interval type-2 fuzzy set.
Figure 3. Diagram of the technical approach of this paper.
Figure 4. Hierarchical organization of land cover types.
Figure 5. The study area based on natural color composite imagery using bands 5, 3, and 2 for RGB.
Figure 6. The class tree of land cover types.
Figure 7. The training and test datasets.
Figure 8. Selected spectral indices.
Figure 9. The classification result at the top node.
Figure 10. The final class types.
Figure 11. The results of ssFCM, SIIT2-FCM, SKFCM-F and LaSVM respectively.
Figure 12. The true spectral curves of nine land covers (wavelength: nm). The x-axis is the wavelength, and the y-axis represents the reflectance for a scale factor of 10,000.
Figure 13. The spectral curves of true value and the single-valued spectral curves modeled by ssFCM and LaSVM (wavelength: nm). The x-axis is the wavelength, and the y-axis represents the reflectance for a scale factor of 10,000.
Figure 14. The banded spectral curves of true value and spectral curves modeled by hierarchical SSW-IT2FCM-O (wavelength: nm). The x-axis is the wavelength, and the y-axis represents the reflectance for a scale factor of 10,000.
Figure 15. Classification results of five methods when the sample size was reduced to 8%.
Figure 16. Classification results of five methods when the sample size was reduced to 6%.
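Table 1. Some commonly used spectral indices.

| Index | Equation | Remarks |
|---|---|---|
| NDVI | $\frac{\mathrm{Nir1} - \mathrm{Red}}{\mathrm{Nir1} + \mathrm{Red}}$ | |
| NDVI6 | $\frac{\mathrm{RedEdge} - \mathrm{Yellow}}{\mathrm{RedEdge} + \mathrm{Yellow}}$ | |
| FCI | $\frac{\mathrm{Nir1} - \mathrm{RedEdge}}{\mathrm{Nir1} + \mathrm{RedEdge}}$ | |
| SAVI | $\frac{\mathrm{Nir2} - \mathrm{Red}}{\mathrm{Nir2} + \mathrm{Red} + L}(1 + L)$ | $L = 0.5$ |
| VEVI | $\frac{\mathrm{Nir2} - \mathrm{Red}}{\mathrm{Nir2} + 6\,\mathrm{Red} - 3.5\,\mathrm{Blue} - 4\,\mathrm{Green} + 1}$ | |
| NDWI | $\frac{\mathrm{Green} - \mathrm{Nir2}}{\mathrm{Green} + \mathrm{Nir2}}$ | |
| NDBSI | $\frac{\mathrm{Red} - \mathrm{Green}}{\mathrm{Red} + \mathrm{Green}}$ | |
| NHFD | $\frac{\mathrm{RedEdge} - \mathrm{Coastal}}{\mathrm{RedEdge} + \mathrm{Coastal}}$ | |
| NDBI | $\frac{\mathrm{RedEdge} - \mathrm{Nir2}}{\mathrm{RedEdge} + \mathrm{Nir2}}$ | |
| Gray | $0.3\,\mathrm{Red} + 0.59\,\mathrm{Green} + 0.11\,\mathrm{Blue}$ | |
| NDSI | $\frac{\mathrm{Green} - \mathrm{Yellow}}{\mathrm{Green} + \mathrm{Yellow}}$ | |
| SVI | $\frac{(\mathrm{Nir2} - \mathrm{Red}) \times \mathrm{Nir2}}{\mathrm{Nir2} + \mathrm{Red}}$ | |
| AVG | $\frac{\mathrm{Green} + \mathrm{Yellow} + \mathrm{Red}}{3}$ | |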
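As a rough illustration of how a few of the Table 1 indices could be computed for one image object, the sketch below assumes the normalized-difference form (a - b)/(a + b) for NDVI, FCI, NDWI and NDBI, and takes the mean band reflectances of an object as a plain dictionary. The function names, the dictionary keys, and the choice to return 0.0 for a zero denominator are assumptions made for this example, not part of the original method.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); 0.0 for a zero denominator (an assumption)."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(bands, soil_factor=0.5):
    """Compute a subset of the Table 1 indices from per-band reflectances.

    `bands` is a hypothetical dict keyed by band name; `soil_factor` is L in SAVI.
    """
    red, green, blue = bands["Red"], bands["Green"], bands["Blue"]
    yellow, red_edge = bands["Yellow"], bands["RedEdge"]
    nir1, nir2 = bands["Nir1"], bands["Nir2"]
    return {
        "NDVI": normalized_difference(nir1, red),
        "FCI": normalized_difference(nir1, red_edge),
        "NDWI": normalized_difference(green, nir2),
        "NDBI": normalized_difference(red_edge, nir2),
        "SAVI": (nir2 - red) / (nir2 + red + soil_factor) * (1 + soil_factor),
        "Gray": 0.3 * red + 0.59 * green + 0.11 * blue,
        "AVG": (green + yellow + red) / 3,
    }


# Illustrative reflectances only (not values from the paper).
example_object = {
    "Blue": 0.05, "Green": 0.2, "Yellow": 0.3, "Red": 0.1,
    "RedEdge": 0.3, "Nir1": 0.5, "Nir2": 0.5,
}
indices = spectral_indices(example_object)
```

The remaining indices in Table 1 follow the same pattern and could be added as further dictionary entries.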
Table 2. Selected bands, SIs and textures for each classification node corresponding to Figure 6.

| Classification Node | Selected Bands | Selected SIs | Selected Texture Values |
|---|---|---|---|
| Full dataset | 1,2,3,4 | AVG, FCI, NDBSI, SAVI | |
| Dark objects | 1,2,3,4,6 | AVG, NDBSI, VEVI, GRAY | |
| Vegetation | 1,2,3,4 | FCI, NDBSI, Brightness, SUM | HOMO |
| Impervious surfaces and bare soil | 1,2,3,4,7 | AVG, FCI, SAVI, Ratio of Nir2 | HOMO, CON, Shape index |
| Woodlands | 6,7 | FCI, Ratio of Nir2 | HOMO, ENT |
| Impervious surfaces | 1,2,3,4,5 | AVG, SAVI, Ratio of Nir2, NDBI, NDSI, NHFD | HOMO, CON, ENT, Shape index |
Table 3. Confusion matrix and accuracy of the result at the top node.

| | Dark Objects | Impervious Surfaces and Bare Soil | Vegetation | User accuracy (U.A.) (%) |
|---|---|---|---|---|
| Dark objects | 565,698 | 582 | 4026 | 99.19 |
| Impervious surfaces and bare soil | 6622 | 313,587 | 3666 | 96.82 |
| Vegetation | 1261 | 341 | 277,283 | 99.43 |
| Producer accuracy (P.A.) (%) | 98.63 | 99.71 | 97.30 | |
| Overall accuracy (O.A.) (%) | 98.59 | | | |
| Kappa | 0.9777 | | | |
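The accuracy figures in Table 3 can be recomputed from the confusion matrix itself. The sketch below is a minimal illustration, assuming that rows hold the classified labels and columns the reference labels (which matches U.A. being reported per row), and using Cohen's kappa in its usual form; the function name `accuracy_report` is made up for this example, and the counts are the cell values as parsed from Table 3.

```python
def accuracy_report(matrix):
    """Return U.A., P.A., O.A. and Cohen's kappa for a square confusion matrix.

    Assumption: rows are classified labels and columns are reference labels.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    diagonal = [matrix[i][i] for i in range(size)]

    user_acc = [diagonal[i] / row_sums[i] for i in range(size)]
    producer_acc = [diagonal[j] / col_sums[j] for j in range(size)]
    overall = sum(diagonal) / total
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return user_acc, producer_acc, overall, kappa


# Counts parsed from Table 3, in the order:
# dark objects, impervious surfaces and bare soil, vegetation.
table3 = [
    [565698, 582, 4026],
    [6622, 313587, 3666],
    [1261, 341, 277283],
]
ua, pa, oa, kappa = accuracy_report(table3)
```

Under these assumptions the results should round to the U.A., P.A., O.A. and Kappa values reported in Table 3, which also serves as a consistency check on how the flattened table cells were separated.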
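Table 4. Confusion matrix and accuracy of the final results.

| | Water | Shadows | Bare Soil | Grasslands | Roads | Buildings | Other Impervious Surfaces | Dense Woodlands | Sparse Woodlands | U.A. (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Water | 473,652 | 19,422 | 0 | 1489 | 128 | 0 | 43 | 121 | 775 | 95.57 |
| Shadows | 238 | 72,447 | 24 | 74 | 166 | 241 | 224 | 98 | 847 | 97.43 |
| Bare soil | 0 | 15 | 11,067 | 1206 | 1 | 0 | 0 | 0 | 26 | 89.87 |
| Grasslands | 8 | 1 | 131 | 150,611 | 0 | 0 | 2 | 5861 | 632 | 95.78 |
| Roads | 0 | 517 | 4560 | 329 | 100,005 | 12,393 | 6039 | 0 | 6 | 80.75 |
| Buildings | 298 | 6249 | 1993 | 245 | 26,933 | 102,891 | 7246 | 40 | 38 | 70.51 |
| Other impervious surfaces | 0 | 543 | 8106 | 558 | 2967 | 11,931 | 40,206 | 1 | 26 | 62.49 |
| Dense woodlands | 0 | 964 | 94 | 9597 | 9 | 68 | 0 | 152,181 | 9447 | 88.29 |
| Sparse woodlands | 118 | 1039 | 730 | 15,258 | 85 | 5 | 25 | 8869 | 30,147 | 53.57 |
| P.A. (%) | 99.86 | 71.59 | 41.44 | 83.97 | 76.75 | 80.68 | 74.75 | 91.03 | 71.87 | |
| O.A. (%) | 87.02 | | | | | | | | | |
| Kappa | 0.8375 | | | | | | | | | |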
Table 5. The U.A. (%), O.A. (%) and Kappa coefficients of five methods.

| Metric | Class | ssFCM | SIIT2-FCM | SKFCM-F | LaSVM | HSW-IT2FCM |
|---|---|---|---|---|---|---|
| U.A. | Water | 93.16 | 92.76 | 96.18 | 94.49 | 95.57 |
| | Shadows | 90.74 | 79.87 | 16.68 | 91.85 | 97.43 |
| | Bare soil | 93.88 | 86.58 | 88.94 | 85.09 | 89.87 |
| | Grasslands | 65.54 | 74.47 | 82.74 | 91.63 | 95.78 |
| | Roads | 72.74 | 78.29 | 76.25 | 65.06 | 80.75 |
| | Buildings | 59.43 | 45.57 | 45.11 | 87.94 | 70.51 |
| | Other impervious surfaces | 67.72 | 22.77 | 15.27 | 56.68 | 62.49 |
| | Dense woodlands | 68.44 | 51.47 | 49.43 | 86.99 | 88.29 |
| | Sparse woodlands | 11.74 | 38.01 | 31.03 | 18.38 | 53.57 |
| O.A. | | 75.94 | 71.82 | 69.34 | 84.22 | 87.02 |
| Kappa | | 0.7011 | 0.6494 | 0.6103 | 0.8015 | 0.8375 |
Table 6. Typical validity indices for four FSSC methods.

| Parameter | | ssFCM | SIIT2-FCM | SKFCM-F | Hierarchical SSW-IT2FCM-O |
|---|---|---|---|---|---|
| PC | - | 0.176 | 0.165 | 0.448 | 0.466 |
| PE | - | 2.160 | 1.860 | 1.262 | 0.251 |
| XB | - | 2.048 | 1.643 | 2.584 | 3.134 |
| FS | - | $2.18 \times 10^{12}$ | $1.42 \times 10^{12}$ | $2.68 \times 10^{12}$ | $2.76 \times 10^{7}$ |
time(sec.)4378741199331
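Table 6 lists PC and PE without restating how they are defined. Assuming they denote the partition coefficient and partition entropy in their commonly used form (PC as the mean of the squared memberships, PE as the mean Shannon entropy of the memberships with the natural logarithm), they could be computed from a type-1 membership matrix as sketched below. How the interval type-2 memberships are reduced to single values before these indices are evaluated is not specified here, so the input format (one row of memberships per object) is an assumption.

```python
import math


def partition_coefficient(memberships):
    """Mean of squared memberships; 1.0 for a crisp partition."""
    count = len(memberships)
    return sum(u * u for row in memberships for u in row) / count


def partition_entropy(memberships):
    """Mean Shannon entropy (natural log) of memberships; 0.0 for a crisp partition."""
    count = len(memberships)
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0) / count


# Illustrative memberships for three objects and two clusters (not paper data).
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

pc_crisp, pe_crisp = partition_coefficient(crisp), partition_entropy(crisp)
pc_fuzzy, pe_fuzzy = partition_coefficient(fuzzy), partition_entropy(fuzzy)
```

Under these definitions a higher PC and a lower PE indicate a less ambiguous partition.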
Table 7. The distances between true value and spectral curves modeled by ssFCM, LaSVM, and the hierarchical SSW-IT2FCM-O (wavelength: nm).

| Methods | Water | Shadow | Roads | Buildings | Other ISs | Bare Soil | Dense Woodlands | Sparse Woodlands | Grass | Sum |
|---|---|---|---|---|---|---|---|---|---|---|
| The Proposed | 1097 | 438 | 764 | 1694 | 1425 | 807 | 1068 | 150 | 1045 | 8488 |
| LaSVM | 950 | 555 | 956 | 1903 | 1493 | 668 | 1196 | 499 | 1181 | 9400 |
| ssFCM | 1290 | 560 | 725 | 2409 | 1663 | 862 | 1199 | 518 | 1389 | 10,616 |
Table 8. The overall accuracies and Kappa coefficients produced by the five methods if the sample size is reduced to 8%.

| | ssFCM | SIIT2-FCM | SKFCM-F | LaSVM | Hierarchical SSW-IT2FCM-O |
|---|---|---|---|---|---|
| O.A. (%) | 63.86 | 72.76 | 70.93 | 83.42 | 86.55 |
| Kappa | 0.5663 | 0.6592 | 0.6300 | 0.7922 | 0.8317 |
Table 9. The overall accuracy and Kappa coefficients of the results of five methods if the sample size is reduced to 6%.

| | ssFCM | SIIT2-FCM | SKFCM-F | LaSVM | Hierarchical SSW-IT2FCM-O |
|---|---|---|---|---|---|
| O.A. (%) | 70.60 | 69.43 | 69.34 | 70.04 | 85.57 |
| Kappa | 0.6242 | 0.6176 | 0.6106 | 0.6397 | 0.8192 |

Share and Cite

MDPI and ACS Style

Guo, J.; Du, S.; Huo, H.; Du, S.; Zhang, X. Modelling the Spectral Uncertainty of Geographic Features in High-Resolution Remote Sensing Images: Semi-Supervising and Weighted Interval Type-2 Fuzzy C-Means Clustering. Remote Sens. 2019, 11, 1750. https://doi.org/10.3390/rs11151750

