Article

Improving the Accuracy of Land Use and Land Cover Classification of Landsat Data in an Agricultural Watershed

1 Department of Geosciences, Mississippi State University, Mississippi State, MS 39762, USA
2 Department of Agricultural and Biological Engineering, Mississippi State University, Mississippi State, MS 39762, USA
3 USDA Forest Service, Center for Bottomland Hardwoods Research, 775 Stone Blvd., Thompson Hall, Room 309, Mississippi State, MS 39762, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(16), 4020; https://doi.org/10.3390/rs15164020
Submission received: 9 July 2023 / Revised: 6 August 2023 / Accepted: 10 August 2023 / Published: 14 August 2023
(This article belongs to the Special Issue Remote Sensing Applications in Land Use and Land Cover Monitoring)

Abstract

Classification of remotely sensed imagery for reliable land use and land cover (LULC) remains a challenge in areas where spectrally similar LULC features occur. For example, bare soils of harvested crop fields in agricultural watersheds exhibit spectral characteristics similar to high-intensity developed regions and impede an accurate classification. The goal of this study is to improve the accuracy of LULC classification of satellite imagery for the Big Sunflower River Watershed, Mississippi using ancillary data, multiple classification methods, and a post-classification correction (PCC). To determine the best approach, the methodology was applied to Landsat 8 Operational Land Imager (OLI) imagery during the growing season and post-harvest. Imagery for the growing season was acquired on 25 August 2015, and post-harvest was acquired on 7 January 2018. Three classification methods were applied: maximum likelihood (ML), support vector machine (SVM), and random forest (RF). LULC imagery was classified as open water, woody wetlands, harvested crop, rangeland, cultivated crop, high-intensity developed, and mid-low intensity developed areas. Ancillary data such as normalized difference vegetation index (NDVI), thematic maps of urban areas, river networks, transportation networks, high-resolution National Agriculture Imagery Program (NAIP) imagery, Google Earth time-series data, and phenology were used to determine the training dataset. Initially none of the three classification methods performed adequately. Hence, a post-classification correction (PCC) was implemented by masking and applying a majority filter using thematic maps of urban areas. 
Once PCC was implemented, the accuracy of each classification method increased significantly. SVM performed best in both the growing season and post-harvest, with an overall classification accuracy of 93.5% and a Kappa statistic of 0.88 for the post-harvest imagery, and an overall classification accuracy of 84% and a Kappa statistic of 0.789 for the growing season imagery. SVM was found to be the best classification method, and PCC is an effective strategy when dealing with spectrally similar LULC features. The use of SVM together with PCC increased the reliability of the information extracted. Strategies from this study can help to evaluate the LULC in agricultural and other watersheds.

1. Introduction

Land cover mapping and assessment is a core area of remote sensing data application [1]. Land cover is an underlying variable that impacts and links numerous components of the human and physical environments [2]. Land use and land cover (LULC) maps provide base information for decision-making in watershed management applications if the maps are reliable and updated [3]. Furthermore, LULC change is an important measure that is used to evaluate the effect of applied watershed management measures and is regarded as one of the most important variables of global change affecting ecological systems [1,3]. However, land cover change estimates from remotely sensed data are limited by numerous factors that impact the accuracy and success of classifications and hinder the creation of a functional thematic map. Such factors include the complexity of the landscape of a study area, inadequate resolution of the selected remotely sensed data, and difficulty in finding the image processing and classification approach most suitable for a particular study area [4].
To address this issue, ancillary data are integrated with remote sensing data to improve the classification accuracy of land cover data [3,4,5,6,7,8,9,10,11,12]. The most common approach is to incorporate ancillary data before the classification and as a result infuse the spatial or nonspatial information that may be of value in the image classification process, including elevation, slope, aspect, geology, soils, phenology, hydrology, transportation networks, political boundaries, and/or vegetation maps [13]. Sometimes post-classification corrections are implemented utilizing ancillary data to improve accuracy. The majority of imagery classifications are based on remotely sensed spectral responses and due to the complexity of biophysical environments, spectral confusion is common among land cover classes [4]. Some studies have turned to masking to deal with spectral confusion and have had success [3,14]. Masking removes a spectrally similar class and then returns that class after classification.
In addition to incorporating ancillary data, it is also important to determine the appropriate classification method for a given situation. Numerous classification algorithms have been developed, and a review of the methods and techniques can be found in Lu and Weng [4]. In a broad sense, classification methods can be divided into common and advanced methods. Maximum likelihood (ML), minimum distance, and K-means are considered common classification methods [1]. Advanced classification methods include artificial neural networks, support vector machine (SVM), decision trees, and random forest (RF) [1]. The main objective of this study was to explore the capabilities of three pixel-based classification methods on Landsat 8 Operational Land Imager (OLI) imagery: maximum likelihood (ML), support vector machine (SVM), and random forest (RF). A second objective was to assess the benefits of infusing ancillary data to create accurate LULC maps of the agriculture-dominated Big Sunflower River Watershed in Mississippi, United States.

2. Literature Review

2.1. Use of Remote Sensing in Agriculture

Remotely sensed data has been utilized in agricultural applications for decades [15,16,17,18]. In agricultural settings, using appropriate methodology is important for accurate land cover classifications due to varying phenology. Each crop has specific planting and harvesting times, varying leaf structures, and different biophysical and biochemical variables. Additionally, soil moisture, soil organic matter content, and soil signatures affect the remote sensing spectra. A review of remote sensing in agriculture can be found in Mulla [19]. Applications of remote sensing in agriculture are typically based on the measurement of reflected radiation from soil or plant material. Plant pigments such as chlorophyll absorb radiation strongly in the visible spectrum, especially in blue and red wavelengths, whereas near-infrared radiation is strongly reflected due to leaf density and canopy structure [19,20]. The Normalized Difference Vegetation Index (NDVI) uses pigment absorption features in the red (~660 nm) and reflectance in the near-infrared (~860 nm) regions of the electromagnetic spectrum [21] to indicate vegetation biomass. NDVI is capable of estimating a number of plant properties such as leaf mass, chlorophyll (pigment) concentration, water content, and absorbed photosynthetic radiation (or its fraction) [21]. When examining reflectance data, it is important to consider bare soils and their respective moisture and organic matter content. These soils will vary in their specific spectral reflectance signatures [22]. Since both bare soil and crop canopy will be present in a remotely sensed image, the mixture of the two spectral signatures often confounds the interpretation of reflectance data [19].
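As a minimal sketch of the index described above, the standard definition NDVI = (NIR − Red) / (NIR + Red) can be computed per pixel as follows. The function names and the reflectance values are illustrative assumptions and are not taken from the study.

```python
def ndvi(red, nir):
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    Returns None when both reflectances are zero (index undefined).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_grid(red_band, nir_band):
    """Apply ndvi() cell by cell to two equally sized 2D lists."""
    return [
        [ndvi(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]


# Hypothetical surface reflectance values (not from the paper).
red_band = [[0.05, 0.20], [0.04, 0.10]]
nir_band = [[0.45, 0.25], [0.02, 0.10]]
ndvi_values = ndvi_grid(red_band, nir_band)
```

In this toy grid, the first pixel (low red, high near-infrared) yields a high NDVI typical of dense vegetation, while the pixel with near-infrared below red yields a negative value, as is common over water.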

2.2. Classification Methods

Maximum likelihood (ML) is the most extensively used parametric classification algorithm [13]. This is due to the robust abilities of ML as well as its availability in almost every image-processing software [4]. ML is based on Bayes’ Theorem and assumes the probability distributions of input classes to have a multivariate normal distribution. Instead of minimum distance, ML selects the largest posterior probability [23]. ML classification within ESRI ArcMap uses a probability density function instead of a probability distribution present in Bayes’ Theorem. This is conducted by examining the variances and covariances of the training data as it assigns each cell to the appropriate class. Bayes’ Theorem is explained in detail in [24]. There are several drawbacks to the parametric approach. The imagery of a study area can be complex and violate the assumption of a normal spectral distribution [4]. This is evident in classes with significant within-class variance such as global and continent-wide land cover mapping [12]. In addition, integrating spectral with ancillary data is especially challenging with parametric classifiers such as ML [4].
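For reference, the Gaussian maximum likelihood decision rule is commonly written as the discriminant function below. This is the textbook form, given here as a sketch; the notation ($\mathbf{x}$ for the pixel's spectral vector, $\boldsymbol{\mu}_i$ and $\boldsymbol{\Sigma}_i$ for the mean vector and covariance matrix estimated from the training data of class $\omega_i$, and $P(\omega_i)$ for the prior probability) is assumed and does not come from the source.

```latex
g_i(\mathbf{x}) = \ln P(\omega_i)
  - \tfrac{1}{2} \ln \lvert \boldsymbol{\Sigma}_i \rvert
  - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^{\top}
    \boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i),
\qquad
\mathbf{x} \in \omega_i \ \text{if} \ g_i(\mathbf{x}) > g_j(\mathbf{x}) \ \text{for all } j \neq i .
```

The dependence on $\boldsymbol{\Sigma}_i$ is why the variances and covariances of the training data matter, and why a class whose spectra are far from normally distributed can be poorly served by this rule.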
Non-parametric classifiers, such as support vector machine (SVM) and random forest (RF), have grown in popularity for numerous reasons. Non-parametric classifiers make no assumption of data distribution, nor do they require any statistical parameters to separate classes. This makes it easier to incorporate non-spectral data into a classification procedure [4]. SVM is based on statistical learning theory with the goal of determining the optimal separation of classes [25]. SVM has been recognized to give higher classification accuracies than traditional methods such as ML [13]. Additionally, SVM has an advantage when imagery contains heterogeneous classes and when training samples are limited [13,26]. Experiments have demonstrated SVM’s ability to interpret hyperspectral data effectively in hyper-dimensional feature space without requiring any feature reduction procedures [27,28]. SVM capitalizes on the concept of margin maximization [26]. The margin is determined by the sum of distances to the hyperplane from the closest points separating two classes [25]. The basic premise of margin maximization is to determine the optimal separating hyperplane between two classes by maximizing the margin between the classes’ closest training samples. These training samples on the margin are termed support vectors, and the boundary between the classes is known as the optimal separating hyperplane. If it is not possible to determine a linear separator, SVM can project the points into a higher-dimensional space using kernel techniques and then find a linear separator there [13]. Margin maximization and SVM are explained in greater detail in statistical terms in Premalatha et al. [29], Gualtieri and Cromp [30], as well as in Chang and Lin [31].
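Margin maximization for the linearly separable two-class case is usually stated as the optimization problem below. This is the standard hard-margin formulation, shown as a sketch; the symbols ($\mathbf{w}$ and $b$ for the hyperplane parameters, $y_i \in \{-1, +1\}$ for the class labels of the $N$ training samples $\mathbf{x}_i$, $\alpha_i$ for the dual coefficients, and $K$ for the kernel function) are assumed notation rather than notation from the source.

```latex
\min_{\mathbf{w},\, b} \ \tfrac{1}{2} \lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1, \quad i = 1, \dots, N,
\qquad
\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}

f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{N} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right)
```

Only the support vectors have $\alpha_i > 0$, so the decision function $f$ depends on the training samples that lie on the margin. Replacing the inner product with a kernel $K$ corresponds to the projection into a higher-dimensional space.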
A decision tree is a classification procedure that uses a recursive strategy to partition a dataset into smaller subsets by running the data through tests that are defined at each branch in the tree [12,13]. A decision tree can be broken down into three parts: root, split, and leaf. The root is formed from all the data, and it is where the tests begin [13]. The split (also termed branch or node) is the next stop. Here, decision rules are implemented as a splitting test as the data is continually split into smaller groups. The split can be defined as $\sum_{i=1}^{n} a_i x_i \le c$ for multivariate and $x_i > c$ for univariate decision trees, where $x_i$ are the measurements on the $n$ selected features, $a$ is the vector of linear discriminant coefficients, and $c$ is the decision threshold [1]. The leaves refer to the class label assigned [12].
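The two kinds of splitting test can be sketched in a few lines. The function names are hypothetical, and the direction of the inequality in the multivariate test (weighted sum less than or equal to the threshold) is an assumption that follows the form commonly given in the decision tree literature.

```python
def univariate_split(x, feature_index, c):
    """Univariate test: does the single feature x_i exceed threshold c?"""
    return x[feature_index] > c


def multivariate_split(x, a, c):
    """Multivariate test: is the weighted sum of all features <= threshold c?

    The '<=' direction is an assumption; a is the coefficient vector.
    """
    if len(x) != len(a):
        raise ValueError("x and a must have the same length")
    return sum(a_i * x_i for a_i, x_i in zip(a, x)) <= c


# Hypothetical pixel with two features: (red, near-infrared) reflectance.
pixel = [0.05, 0.45]
goes_right = univariate_split(pixel, feature_index=1, c=0.3)   # NIR > 0.3
goes_left = multivariate_split(pixel, a=[1.0, -1.0], c=0.0)    # red - NIR <= 0
```

A univariate tree draws boundaries parallel to the feature axes, whereas a multivariate test combines several bands in one linear rule.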
A random forest classifier (RF) is a nonparametric machine learning algorithm utilizing multiple decision trees [13]. Each decision tree is generated from different samples and subsets of the training data. The dataset is classified a number of times based on a random sub-selection of training pixels. This creates numerous decision trees. The final decision of each pixel’s classification is the result of a majority vote for that pixel. To create variation among trees, training data is projected into a randomly chosen subspace before being fitted to each tree. Additionally, to optimize the decision at each node, a randomized procedure is introduced.
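The two ingredients described above, resampling the training data for each tree and combining the trees by majority vote, can be sketched as follows. This is an illustration only: the helper names, the training pairs, and the per-tree predictions are hypothetical, and a real random forest also randomizes the features considered at each node.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement, one resample per tree."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(tree_predictions):
    """Return the label predicted by the most trees.

    Ties go to the label that appears first in tree_predictions.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical training pairs (NDVI value, class label).
training = [(0.80, "cultivated crop"), (0.10, "harvested crop"),
            (0.40, "rangeland"), (0.75, "cultivated crop")]
rng = random.Random(0)
tree_sample = bootstrap_sample(training, rng)

# Hypothetical labels returned by five trees for one pixel.
votes = ["cultivated crop", "rangeland", "cultivated crop",
         "cultivated crop", "harvested crop"]
pixel_label = majority_vote(votes)
```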

2.3. Land Use & Land Cover Classes (LULC)

An adapted version of the Anderson [32] classification scheme used for the National Land Cover Database (NLCD) is considered the standard scheme for LULC classes for agricultural watersheds [13]. The NLCD was created by a group of federal agencies, including the United States Geological Survey, known as the Multi-Resolution Land Characteristics (MRLC) consortium [13]. The NLCD is the conclusive Landsat-based, 30-m resolution, land cover database for the United States [33,34]. Thus, the NLCD classification levels were chosen to represent the LULC classes in this paper to maintain compatibility with the majority of the literature.

2.4. Post Classification Correction (PCC)

The complexity of biophysical environments may lead to spectral confusion among LULC classes and thus requires ancillary data to ‘clean up’ or improve classified maps [4]. Ancillary data used in image classification are any type of spatial or nonspatial information that is potentially valuable in the image classification process. This includes transportation networks, soils, hydrology, political boundaries, phenology, vegetation maps, geology, slope, aspect etc. [13]. Studies have found that masking and then returning the class after classification is especially beneficial in increasing thematic map accuracy [3,14]. This removes spectrally similar classes. Masks are created in a number of ways. Thakkar et al. [3] generated masks based on a 3 × 3 variance texture derived from NIR band. Additionally, the NDWI index has been used to develop a water body mask [3]. Mesev [14] utilized special census data to further classify urban areas.
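The idea of masking a class and returning it after classification can be sketched as below. This is a simplified illustration under stated assumptions, not the workflow of any cited study: `classify_with_mask`, `toy_classifier`, the brightness values, and the 0.3 threshold are all hypothetical.

```python
def classify_with_mask(pixels, mask, classify, masked_label):
    """Classify only unmasked pixels, then return the masked class.

    pixels: 2D list of per-pixel feature values.
    mask: 2D list of booleans, True where the masked class is known to occur.
    classify: function mapping one pixel's features to a class label.
    masked_label: label written back for every masked pixel.
    """
    result = []
    for pixel_row, mask_row in zip(pixels, mask):
        result.append([
            masked_label if is_masked else classify(pixel)
            for pixel, is_masked in zip(pixel_row, mask_row)
        ])
    return result


def toy_classifier(brightness):
    """Stand-in classifier that never outputs a developed class."""
    return "harvested crop" if brightness > 0.3 else "cultivated crop"


# Two bright pixels: one inside the urban mask, one bare field outside it.
brightness = [[0.45, 0.40], [0.10, 0.12]]
urban_mask = [[True, False], [False, False]]
labels = classify_with_mask(brightness, urban_mask, toy_classifier, "developed")
```

Because the classifier applied outside the mask has no developed class to choose from, the bright bare-soil pixel cannot be confused with a developed area, which is the effect the masking step is meant to achieve.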

2.5. Classification Accuracy Assessment

Classification accuracy presented as a confusion matrix provides a simple cross-tabulation of a mapped class label against what was observed in the ground or reference data and provides the basis to describe classification accuracy and characterize errors [2,35,36]. Overall accuracy is the percentage of cases correctly allocated [2]. The accuracy of individual classes is examined through the confusion matrix from two different viewpoints: the user’s and producer’s accuracy. This is achieved by relating the total cases correctly allocated to the class to the total cases of that class. Whether the result is the user’s or the producer’s accuracy depends entirely on whether it is based upon the matrix’s row or column marginals [2]. User’s accuracy corresponds to errors of commission or inclusion, and producer’s accuracy corresponds to errors of omission or exclusion. Cohen’s Kappa coefficient is a standard measure used in accuracy assessment and addresses the issue of chance agreement, that is, the allocation of the correct class by chance [2]. Many studies have recommended Cohen’s Kappa coefficient as the standard measure of classification accuracy [37,38,39,40]. Stehman [36] argues that overall accuracy, user’s accuracy, and producer’s accuracy are more applicable accuracy measures due to their direct interpretation as probabilities describing the data quality of a specific map, and thus recommends that all summary measures be used, as each measure alone obscures potentially important details.
This method of accuracy assessment comes with inherent assumptions and limitations. Generally, it is implied that each pixel belongs solely to one of the classes in a defined set of mutually exclusive classes [2]. It is argued that the Kappa coefficient is not always suitable as a chance agreement is overestimated and results in an underestimation of classification accuracy [2,41]. Each measure of accuracy assesses different components of accuracy and thus different assumptions about the data [42]. Additionally, there is no widely acceptable measure of accuracy but a variety of indices, each sensitive to different features. Thus, there is no all-purpose measure of classification accuracy [2].
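For reference, Cohen’s Kappa is conventionally computed from a $k \times k$ confusion matrix as shown below. The notation is assumed rather than taken from the source: $n_{ii}$ are the diagonal counts, $n_{i+}$ and $n_{+i}$ are the row and column totals of class $i$, and $N$ is the total number of samples.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = \frac{1}{N} \sum_{i=1}^{k} n_{ii},
\qquad
p_e = \frac{1}{N^{2}} \sum_{i=1}^{k} n_{i+} \, n_{+i}
```

Here $p_o$ is the observed agreement (the overall accuracy as a fraction) and $p_e$ is the agreement expected by chance from the marginals. A larger $p_e$ lowers $\kappa$ for the same $p_o$, which is the mechanism behind the criticism that an overestimated chance agreement understates accuracy.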
Sampling design is very important, as the confusion matrix cannot be properly interpreted otherwise. A basic sampling design, such as simple random sampling, is suitable only if the sample size is large enough to guarantee all classes are adequately represented [2]. All constraints in a particular study must be considered in the design process of an accuracy assessment. The design should be practical so as to not diminish the credibility of the derived accuracy statement [43].

3. Materials and Methods

3.1. Study Area

The Big Sunflower River watershed (BSRW) is part of the Yazoo Basin and is one of the main tributaries of the Yazoo River in Mississippi. The Yazoo Basin in northwestern Mississippi comprises an area of around 19,684 km², making it the largest in the Mississippi alluvial valley [44]. Interior drainage of this basin happens through complex and sluggish streams that eventually connect to the Big Sunflower or Bogue Phalia rivers, or Deer Creek, which flow into the Yazoo River and ultimately to the Mississippi River [44]. The BSRW is located in the humid subtropical climate region, characterized by temperate winters; long, hot summers; and rainfall that is fairly evenly distributed throughout the year. The BSRW is known as a crop-dominated watershed encompassing a substantial part of Mississippi’s agriculture-heavy region, which is commonly termed the Mississippi Delta (Figure 1). The BSRW is located within eleven Mississippi Delta counties (Bolivar, Coahoma, Humphreys, Issaquena, Leflore, Sharkey, Sunflower, Tallahatchie, Warren, Washington, and Yazoo) with a total surface area of 7660 km² [45]. The terrain is nearly flat to gently undulating, with elevations from around forty-nine to two hundred feet above sea level [46]. The BSRW is ideal for agriculture due to nutrient-rich alluvial soils from years of deposition from seasonal flooding from the Mississippi River and surrounding tributaries. The soils vary extensively in structure, texture, frequency, and depth [46]. Agriculture has been a linchpin for the economy in this area, with cotton, soybean, rice, corn, and wheat as the major crops. Typical planting and harvesting dates in the Mississippi Delta are listed in Table 1 [47]. Given the economic importance and the impact land cover has on both human and physical environments, accurate LULC maps with higher temporal resolution are required to provide base information for watershed management applications.
Because the BSRW is an agricultural watershed, its LULC varies seasonally and also shows non-seasonal temporal variation. The US Department of Agriculture (USDA) currently provides only one cropland data layer (CDL) map annually, and the US Geological Survey (USGS) provides a National Land Cover Database (NLCD) map approximately every five years for the continental United States. This temporal resolution is insufficient for watershed modeling or management applications, since LULC change often occurs more rapidly. While CDL and NLCD products are suitable for national-scale analyses, local-scale studies such as this work require more detailed classifications to improve accuracy. This study is an effort to improve upon the annually available USDA LULC compilations by using Landsat 8 Operational Land Imager (OLI) data to create a database of LULC for BSRW with higher temporal resolution.

3.2. Remote Sensing Data and Processing

Landsat 8 OLI C1 Level-2 imagery of 25 August 2015, during the growing season, and imagery of 7 January 2018, during post-harvest, were downloaded through USGS Earth Explorer. Bands 1–7 were layer-stacked, and data processing was carried out using QGIS 2.18 and ArcMap 10.4 prior to analysis. Once the imagery was mosaicked, the study area was subset from the rest of the image. A normalized difference vegetation index (NDVI) image was generated from the original data for analysis. This provided a measure of the absence or presence of vegetation and is useful for assessing the health of vegetation, with higher NDVI values indicating healthy vegetation and lower NDVI values showing stressed vegetation [1]. A flowchart showing the remote sensing data used, processing, and methodology for this study is detailed in Figure 2. First, satellite data was selected based on percent cloud cover, image quality, and radiometric and geometric correction to obtain the best possible image quality. Second, to determine the number of predictors (i.e., band combinations), class separability and band separability were determined in Erdas Imagine using transformed divergence statistics from the selected training data. Next, each scheme (ML, RF, and SVM) was implemented, and RF was optimized by modifying tree depth in increments of 10 (e.g., 40, 50, 60) and the number of trees in increments of 100 (e.g., 100, 200, 300) until performance leveled off. An accuracy assessment was performed using a stratified random sampling method. Next, PCC was applied with prior training data for each scheme (ML, RF, and SVM). Lastly, a final accuracy assessment was performed for all schemes with PCC implemented. The same number of training data for classification and testing data for accuracy assessment were used as reported in Tables 2–13.
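For context, transformed divergence between two classes $i$ and $j$ is commonly defined as shown below. This is the textbook form with the 0 to 2000 scaling, given as a sketch; the exact implementation in Erdas Imagine is not described in the source. The notation is assumed: $\boldsymbol{\mu}_i$ and $\mathbf{C}_i$ are the mean vector and covariance matrix of the training data for class $i$ over the selected bands, and $\operatorname{tr}$ denotes the matrix trace.

```latex
D_{ij} = \tfrac{1}{2} \operatorname{tr}\!\left[ (\mathbf{C}_i - \mathbf{C}_j)
         (\mathbf{C}_j^{-1} - \mathbf{C}_i^{-1}) \right]
       + \tfrac{1}{2} \operatorname{tr}\!\left[ (\mathbf{C}_i^{-1} + \mathbf{C}_j^{-1})
         (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)
         (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^{\top} \right]

TD_{ij} = 2000 \left[ 1 - \exp\!\left( -\frac{D_{ij}}{8} \right) \right]
```

Values of $TD_{ij}$ near 2000 indicate that the two classes are well separated for the chosen band combination, while low values indicate spectral overlap.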

3.3. Ancillary Data

In this study, a city mask was generated using the National Agriculture Imagery Program (NAIP) high-resolution aerial imagery, Landsat 8 OLI data, and geographic information system (GIS) ancillary data including thematic maps of urban areas, transportation networks, and shapefiles of developed areas. NAIP imagery was acquired through the United States Department of Agriculture (USDA) Natural Resources Conservation Service (NRCS) Geospatial Data Gateway [48]. NAIP acquires aerial imagery during the agricultural growing seasons in the U.S. providing high-resolution data for the study [48]. Shapefiles of developed areas were obtained from the Mississippi Automated Resource Information System (MARIS). Other ancillary data used to determine the training dataset included the Normalized Difference Vegetation Index (NDVI), river networks, Google Earth time-series data, and crop phenology.

3.4. Mask Generation

Shapefiles of all incorporated cities within the BSRW were downloaded from the Mississippi Automated Resource Information System (MARIS). The incorporated cities shapefile was last updated in 2010. Hence, each shapefile was edited to represent the boundaries of high to low-intensity developed areas more accurately while also keeping permeable surfaces outside of the shapefile. To accomplish this, NAIP imagery and the Landsat 8 OLI imagery were used interchangeably to ensure an accurate edit of each incorporated cities shapefile. The updated cities shapefile was then masked over the BSRW shapefile (Figure 3), which created a second shapefile with developed areas within the BSRW removed. Finally, this second shapefile was used together with the Landsat 8 OLI imagery to mask all developed areas contained within the imagery.

3.5. LULC Classification and Post-Classification Correction

Accuracy assessment was carried out for each classification method before and after PCC. Each classified thematic map’s accuracy was assessed using an accuracy assessment workflow. Five hundred assessment points were generated using a stratified random sampling strategy for each classified thematic map. Stratified random sampling distributes points in proportion to the area of each class. Each point was examined using the original satellite imagery, NAIP imagery, NDVI image, and Google Earth time series data to determine its actual class or ground truth. The classified points and ground truth data were then compiled into a confusion matrix. From this matrix, the user’s accuracy (Equation (1)), the producer’s accuracy (Equation (2)), and the overall accuracy (Equation (3)) were calculated. The user’s accuracy was calculated to determine how frequently the class assigned will be present on the ground. To know how often real features on the ground are correctly shown on the classified map, the producer’s accuracy was calculated. To determine the inter-rater reliability between classes, the Kappa statistic was computed.
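Proportional allocation of assessment points to classes can be sketched as follows. This illustrates only the allocation step: the function name and the class areas are hypothetical, and the largest-remainder rounding is one possible way to make the counts sum exactly to the requested total. The source does not specify how the software rounds.

```python
def allocate_points(class_areas, total_points):
    """Split total_points across classes in proportion to class area.

    Uses largest-remainder rounding so the counts sum to total_points.
    """
    total_area = sum(class_areas.values())
    exact = {name: total_points * area / total_area
             for name, area in class_areas.items()}
    counts = {name: int(value) for name, value in exact.items()}
    leftover = total_points - sum(counts.values())
    # Hand the remaining points to the classes with the largest fractions.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical class areas in arbitrary units (not from the paper).
areas = {"open water": 150, "woody wetlands": 1100, "rangeland": 900,
         "cultivated crop": 5200, "developed": 310}
points = allocate_points(areas, 500)
```

With purely proportional allocation, small classes such as open water receive very few points, which is one reason the adequacy of the sample for every class needs to be checked.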
$$\text{User's Accuracy} = \frac{\text{Number of Correctly Classified Pixels in each Category}}{\text{Total Number of Classified Pixels in that Category (Row Total)}} \times 100 \tag{1}$$

$$\text{Producer's Accuracy} = \frac{\text{Number of Correctly Classified Pixels in each Category}}{\text{Total Number of Reference Pixels in that Category (Column Total)}} \times 100 \tag{2}$$

$$\text{Overall Accuracy} = \frac{\text{Total Number of Correctly Classified Pixels (Diagonal)}}{\text{Total Number of Reference Pixels}} \times 100 \tag{3}$$
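As a worked illustration of Equations (1)–(3), the sketch below computes the measures from a confusion matrix whose rows are classified labels and whose columns are reference labels. The counts are those reported for the ML classification in Table 2. The function name is hypothetical, the Kappa computation uses the conventional observed-versus-chance agreement formula, and results are returned as fractions rather than multiplied by 100.

```python
def accuracy_report(matrix):
    """Return (user's, producer's, overall, kappa) for a square matrix.

    Rows are classified labels and columns are reference labels.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(k)) for c in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]
    users = [d / t if t else None for d, t in zip(diagonal, row_totals)]
    producers = [d / t if t else None for d, t in zip(diagonal, col_totals)]
    overall = sum(diagonal) / n
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return users, producers, overall, kappa


# Class order: open water, woody wetlands, harvested crop, rangeland,
# cultivated crop, high-intensity developed, mid-low intensity developed.
table2 = [
    [10, 0, 0, 0, 0, 0, 0],
    [0, 55, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0],
    [0, 19, 1, 37, 74, 0, 1],
    [0, 12, 0, 1, 111, 0, 0],
    [2, 0, 3, 1, 0, 3, 1],
    [0, 4, 29, 29, 20, 0, 4],
]
users, producers, overall, kappa = accuracy_report(table2)
```

For these counts, the rounded overall accuracy and Kappa agree with the values of 0.611 and 0.515 listed in Table 2.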

4. Results

4.1. Growing Season

The growing season presented a substantial number of vegetation types due to the large variety of crops present. Utilizing transformed divergence statistics, it was determined that seven bands were necessary to effectively separate each class. In terms of user’s accuracy, all three schemes struggled with the same three classes before PCC: rangeland, high-intensity developed, and medium-low intensity developed. These classes can be hard to separate due to their similar spectral signatures, which were calculated using each respective class’s samples (Figure 4). In terms of producer’s accuracy, inaccuracies varied more between classes, but all schemes struggled with the rangeland and cultivated crop classes. This can be attributed to the low classification accuracy of classes such as rangeland and the urban classes, which caused exclusion for other classes.

4.1.1. Classification of the Growing Season Imagery before Post-Classification Correction

ML scheme was the least successful prior to PCC and had an overall accuracy of 61% (Table 2). RF (Table 3) and SVM (Table 4) both performed better than ML prior to PCC with overall accuracies of 72% and 68%, respectively. Figure 5A displays the original Landsat 8 OLI imagery. Figure 5B–D represent the classified imagery for the same area with each applied scheme, ML, SVM, and RF, respectively, before the application of PCC.
ML accuracy assessment resulted in a Kappa statistic of 0.515 (Table 2). ML had trouble differentiating between cultivated crops and rangeland and thus caused lower accuracy. Some of this inaccuracy was also due to the woody wetlands class present typically on the boundaries of forests or in less dense tree cover areas. Another issue came from the two urban classes, high intensity developed and medium-low intensity developed. The bare soils in harvested fields affected both. This is attributed to the high reflectance values in components of the soils similar to that of developed areas (Figure 4). Due to the presence of vegetation in the medium-low intensity developed class, there was a mixing of classes with rangeland, harvested crops, and cultivated crops. ML did well with open water, woody wetlands, harvested crops, and cultivated crops in terms of user accuracy. However, due to the inaccuracies for the other classes (i.e., rangeland, high intensity developed, mid-low intensity developed), the producer’s accuracy indicated the classes had significant exclusion.
RF accuracy assessment shows an overall accuracy of 72% and a Kappa statistic of 0.66 (Table 3). Optimizing to a tree depth of 50 and a total of 200 trees produced the best results for RF before PCC. Before optimization, the overall accuracy was 62% with a Kappa statistic of 0.53. The RF scheme had issues similar to those of ML and SVM. Of the three schemes, RF was able to differentiate rangeland and cultivated crops most successfully, as indicated by the user’s accuracy. However, in terms of producer’s accuracy, RF excluded more of the rangeland class than ML and SVM did. RF performed well, in terms of user’s accuracy, in separating the high-intensity developed class. However, the low user’s accuracy of the mid-low intensity developed class posed a problem. This low accuracy in the urban class signifies exclusion in other classes, since a large number of pixels were misclassified as medium-low intensity or high-intensity developed.
SVM had similar problems differentiating between rangeland and cultivated crop classes. Out of 105 reference pixels for the rangeland class, 48 of them should have been cultivated crops. This issue, along with the medium-low intensity developed class, contributed to a low producer’s accuracy for the cultivated crop class. Similar to ML, SVM could not separate the two urban classes from soils and vegetation. Areas of woody wetlands were misclassified as rangeland, which affected the producer’s accuracy of the woody wetlands class the most. Similarly, areas of rangeland misclassified as medium-low intensity developed affected the producer’s accuracy of the rangeland class. Throughout the map, it is evident that parts of cultivated crop fields were misclassified as rangeland and medium-low intensity developed. In terms of user’s accuracy, SVM did well with open water, woody wetlands, harvested crops, and cultivated crops. However, the errors with other classes caused some exclusion in those same classes and hence resulted in lower producer’s accuracy. Accuracy assessment for the SVM classification showed an overall accuracy of 68% and a Kappa statistic of 0.595 (Table 4). Although none of the schemes produced sufficient results before PCC, RF performed the best, with an overall accuracy of 72% and a Kappa of 0.66 (Table 3).
Table 2. Accuracy assessment values for growing season imagery using maximum likelihood classification before post-classification correction.
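| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
|---|---|---|---|---|---|---|---|---|---|---|
| Open Water | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 1 | 0 |
| Woody Wetlands | 0 | 55 | 0 | 0 | 0 | 0 | 0 | 55 | 1 | 0 |
| Harvested Crop | 0 | 0 | 90 | 0 | 0 | 0 | 0 | 90 | 1 | 0 |
| Rangeland | 0 | 19 | 1 | 37 | 74 | 0 | 1 | 132 | 0.28 | 0 |
| Cultivated Crop | 0 | 12 | 0 | 1 | 111 | 0 | 0 | 124 | 0.895 | 0 |
| High Int. Developed | 2 | 0 | 3 | 1 | 0 | 3 | 1 | 10 | 0.3 | 0 |
| Mid-Low Int. Developed | 0 | 4 | 29 | 29 | 20 | 0 | 4 | 86 | 0.047 | 0 |
| Total | 12 | 90 | 123 | 68 | 205 | 3 | 6 | 507 | 0 | 0 |
| Producer’s Accuracy | 0.83 | 0.61 | 0.73 | 0.54 | 0.541 | 1 | 0.667 | 0 | 0.611 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.515 |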
Table 3. Accuracy assessment for growing season imagery using random forest classification before post-classification correction. Number of trees = 200 and tree depth = 50.
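| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
|---|---|---|---|---|---|---|---|---|---|---|
| Open Water | 10 | 0 | 0 | 0 | 0 | 4 | 0 | 14 | 0.71 | 0 |
| Woody Wetlands | 0 | 13 | 0 | 5 | 2 | 0 | 0 | 20 | 0.65 | 0 |
| Harvested Crop | 0 | 0 | 22 | 0 | 0 | 0 | 3 | 25 | 0.88 | 0 |
| Rangeland | 0 | 0 | 0 | 10 | 1 | 0 | 4 | 15 | 0.67 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 10 | 23 | 0 | 4 | 37 | 0.62 | 0 |
| High Int. Developed | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 6 | 1 | 0 |
| Mid-Low Int. Developed | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 10 | 13 | 22 | 25 | 26 | 10 | 11 | 117 | 0 | 0 |
| Producer’s Accuracy | 1 | 1 | 1 | 0.4 | 0.88 | 0.6 | 0 | 0 | 0.72 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.66 |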

4.1.2. Classification of the Growing Season Imagery after Post-Classification Correction

Table 5, Table 6 and Table 7 show the accuracy assessments for the growing season using ML, SVM, and RF after PCC. Figure 6A displays the original Landsat 8 OLI imagery and Figure 6B–D represents the same portion of the map with each applied scheme, ML, SVM, and RF, respectively, after the application of PCC.
The ML scheme produced an overall accuracy of 75% with a Kappa statistic of 0.67 (Table 5). The main misclassifications came from the rangeland class. The misclassification of woody wetlands into open water was a mixed pixel issue: a flooded field next to a forest produced an NDVI value too high to be representative of water, so the pixel represents the woody wetlands class. A similar issue arose with harvested crops classified as open water. A small road between fishponds produced a pixel that mixed open water and harvested crop; however, NDVI values reveal that the pixel more closely represented harvested crop. Woody wetlands were misclassified several times into four other classes, resulting in a producer’s accuracy of 63%, with rangeland being the dominant factor. This misclassification was caused by a number of factors. For example, the spatial resolution of Landsat 8 is 30 m, and thus any strip of forest between fields or other land cover types may be misclassified. Furthermore, the density of trees within a pixel can cause misclassification. The producer’s accuracy of the rangeland class was satisfactory; however, the user’s accuracy was low. The most substantial problem was cultivated crop pixels classified as rangeland. Typically, this occurred in crop fields with lower NDVI values or where the soil signal influenced the scheme’s decision-making. Additionally, late August is the beginning of harvest for farmers. Thus, crops were beginning to reach maturity, and many were at the end of their life cycle as leaves began to turn brown and yellow. Moreover, land cover boundaries, particularly those with bare soils, created mixed pixels, causing the scheme to choose rangeland. This problem contributed to cultivated crops having a low producer’s accuracy, as portions of that land cover class were excluded when those pixels were classified as rangeland.
The other classes did well with regard to user’s accuracy, and where the producer’s accuracy was low, the cause was typically misclassification as rangeland. Finally, it appears that ML tends to “overcompensate” for a class. For example, a pixel neighboring a different land cover class may be added to that class even though the values of that pixel are not representative of that class.
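NDVI, which is used above to interpret mixed pixels, is the normalized difference between near-infrared and red reflectance; for Landsat 8 OLI these are bands 5 and 4. The sketch below shows the calculation only. The reflectance values are hypothetical and were chosen to illustrate how a green canopy, senescent cover, and water differ; they are not measurements from the study area.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # convention chosen here for pixels with no signal
    return (nir - red) / denominator


# Hypothetical surface reflectance values, for illustration only.
examples = {
    "dense green canopy": (0.45, 0.05),
    "sparse or senescent cover": (0.25, 0.20),
    "open water": (0.02, 0.04),
}
for label, (nir, red) in examples.items():
    print(f"{label:26s} NDVI = {ndvi(nir, red):+.2f}")
```

A pixel that mixes water with forest or crop residue falls between these cases, which is consistent with the ambiguous pixels described above.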
Table 4. Accuracy assessment for growing season imagery using support vector machine classification before post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 9 | 0 | 0 | 1 | 0 | 0 | 0 | 10 | 0.9 | 0 |
| Woody Wetlands | 1 | 63 | 0 | 2 | 1 | 0 | 0 | 67 | 0.94 | 0 |
| Harvested Crop | 0 | 0 | 100 | 1 | 0 | 0 | 1 | 102 | 0.98 | 0 |
| Rangeland | 0 | 27 | 3 | 26 | 48 | 0 | 1 | 105 | 0.248 | 0 |
| Cultivated Crop | 0 | 5 | 0 | 3 | 136 | 0 | 0 | 144 | 0.94 | 0 |
| High Int. Developed | 2 | 0 | 2 | 0 | 0 | 5 | 1 | 10 | 0.5 | 0 |
| Mid-Low Int. Developed | 0 | 3 | 21 | 23 | 17 | 0 | 9 | 73 | 0.12 | 0 |
| Total | 12 | 98 | 126 | 56 | 202 | 5 | 12 | 511 | 0 | 0 |
| Producer’s Accuracy | 0.75 | 0.64 | 0.79 | 0.46 | 0.67 | 1 | 0.75 | 0 | 0.68 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.595 |
Table 5. Accuracy assessment values for growing season imagery using maximum likelihood classification after post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 8 | 1 | 1 | 0 | 0 | 10 | 0.8 | 0 |
| Woody Wetlands | 0 | 56 | 0 | 0 | 0 | 56 | 1 | 0 |
| Harvested Crop | 0 | 0 | 106 | 1 | 0 | 107 | 0.99 | 0 |
| Rangeland | 0 | 27 | 9 | 88 | 79 | 203 | 0.434 | 0 |
| Cultivated Crop | 0 | 4 | 0 | 2 | 119 | 125 | 0.952 | 0 |
| Total | 8 | 88 | 116 | 91 | 198 | 501 | 0 | 0 |
| Producer’s Accuracy | 1 | 0.636 | 0.913 | 0.967 | 0.60 | 0 | 0.753 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.674 |
Table 6. Accuracy assessment for growing season imagery using random forest classification after post-classification correction. Number of trees = 100 and tree depth = 40.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 9 | 0 | 0 | 0 | 0 | 9 | 1 | 0 |
| Woody Wetlands | 1 | 13 | 0 | 3 | 0 | 17 | 0.76 | 0 |
| Harvested Crop | 0 | 0 | 24 | 4 | 0 | 28 | 0.86 | 0 |
| Rangeland | 0 | 0 | 0 | 11 | 1 | 12 | 0.92 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 15 | 27 | 42 | 0.64 | 0 |
| Total | 10 | 13 | 24 | 33 | 28 | 108 | 0 | 0 |
| Producer’s Accuracy | 0.9 | 1 | 1 | 0.33 | 0.96 | 0 | 0.78 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.72 |
Table 7. Accuracy assessment for growing season imagery using support vector machine with post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 10 | 0 | 0 | 0 | 0 | 10 | 1 | 0 |
| Woody Wetlands | 0 | 64 | 0 | 0 | 4 | 68 | 0.94 | 0 |
| Harvested Crop | 1 | 0 | 114 | 5 | 1 | 121 | 0.94 | 0 |
| Rangeland | 0 | 24 | 3 | 78 | 33 | 138 | 0.565 | 0 |
| Cultivated Crop | 0 | 5 | 1 | 2 | 158 | 166 | 0.95 | 0 |
| Total | 11 | 93 | 118 | 85 | 196 | 503 | 0 | 0 |
| Producer’s Accuracy | 0.909 | 0.688 | 0.966 | 0.917 | 0.806 | 0 | 0.84 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.789 |
RF performed better than ML and improved with the implementation of PCC (Table 6). A tree depth of 40 and a total of 100 trees were found to produce the best results for RF. The same issues bedeviling SVM and ML can explain some components of misclassification in RF, but not all; the RF scheme’s choice of pixel class could not always be explained. Since RF works using a voting system of trees, this may complicate decision-making, in that too many rules are being used to determine a class choice that is relatively small.
The SVM scheme produced an overall accuracy of 84% and a Kappa statistic of 0.789 (Table 7). It had issues similar to those of ML; however, SVM handled them considerably better. For instance, the user’s accuracy for the rangeland class increased, and the producer’s accuracy of cultivated crops was not affected as severely as with ML. Nonetheless, SVM had similar issues regarding the producer’s accuracy of woody wetlands, for which the rangeland class was responsible. After examining the imagery, it was observed that the spatial resolution of Landsat 8 imagery, tree density, and mixed pixels at land cover class boundaries caused most of the errors. Overall, SVM did well with all classes, excluding the user’s accuracy of rangeland and the producer’s accuracy of woody wetlands. SVM performed well in terms of separating classes, especially along boundaries between differing land cover classes.
In conclusion, results improved dramatically with the use of PCC, that is, the majority filter and the masking of urban areas. SVM returned the best results, with an overall accuracy of 84% and a Kappa statistic of 0.789. ML and RF did not perform as well, and all schemes had issues with the rangeland class with regard to user’s accuracy. Additionally, the woody wetlands class was frequently misclassified as rangeland by all schemes. However, a number of misclassifications can be attributed to mixed land cover types in a given pixel or to boundary areas between classes.
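The overall accuracy and Kappa statistic quoted for SVM can be recomputed from the counts in Table 7. The sketch below uses the standard Cohen’s Kappa definition, in which chance agreement is estimated from the row and column totals. It assumes rows are mapped classes and columns are reference classes, and the function names are illustrative rather than taken from the study.

```python
# Error matrix for SVM after PCC (growing season), counts as reported in Table 7.
# Class order: open water, woody wetlands, harvested crop, rangeland, cultivated crop.
MATRIX = [
    [10, 0, 0, 0, 0],
    [0, 64, 0, 0, 4],
    [1, 0, 114, 5, 1],
    [0, 24, 3, 78, 33],
    [0, 5, 1, 2, 158],
]


def overall_accuracy(matrix):
    """Correctly classified samples (the diagonal) divided by all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(len(matrix))]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


print(round(overall_accuracy(MATRIX), 3), round(cohen_kappa(MATRIX), 3))
```

With these counts, 424 of 503 samples lie on the diagonal, which reproduces the reported overall accuracy of about 84% and a Kappa of about 0.789.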

4.2. Post-Harvest

In the post-harvest imagery, the number of vegetation types and the vegetated area dropped substantially due to harvested crops and winter weather, resulting in better results overall for the ML (Table 8), RF (Table 9), and SVM (Table 10) schemes. However, the rangeland, high-intensity developed, and mid-low intensity developed classes still created substantial inaccuracy. To examine class separability, spectral signatures were plotted using the samples of each class (Figure 7) acquired from the post-harvest imagery. Transformed divergence determined that seven bands were necessary and effectively separated each class.

4.2.1. Classification of the Post-Harvest Imagery before Post-Classification Correction

Table 8, Table 9 and Table 10 exhibit the accuracy assessments for the post-harvest imagery using ML, RF, and SVM, respectively, before PCC. Figure 8A displays the original Landsat 8 post-harvest imagery. Figure 8B–D represents the same portion of the imagery with each applied scheme, ML, SVM, and RF, respectively, before the application of PCC. All three schemes had trouble differentiating between the rangeland and cultivated crop land cover classes. Another issue was the misclassification of land cover types as high-intensity developed and mid-low intensity developed.
ML yielded the best results before PCC on the post-harvest imagery among all the classifications implemented (Table 8), with an overall accuracy of 77% and a Kappa statistic of 0.66. The ML scheme had considerable trouble with the two urban land cover classes, high-intensity developed and mid-low intensity developed, which in turn affected the producer’s accuracy of other classes. The post-harvest imagery has significantly more bare soil than the growing season imagery. A number of these soils have reflectance values similar to those of developed areas, and this caused misclassification. The misclassification of open water and woody wetlands is due to the fact that either developed class involves mixed pixels of woody wetlands and open water land covers. The misclassification of woody wetlands as rangeland is attributed to tree density, land cover boundaries, and other resolution-related issues. Throughout the classified imagery, harvested crop land cover was misclassified as mid-low intensity developed. This occurred with the rangeland class as well. Since mid-low intensity developed is a class containing both vegetation and impervious surfaces, this is somewhat expected. Other than issues related to the developed classes, ML performed well.
Table 8. Accuracy assessment values for post-harvest imagery using maximum likelihood classification before post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 11 | 0 | 0 | 1 | 0 | 0 | 0 | 12 | 0.92 | 0 |
| Woody Wetlands | 0 | 73 | 0 | 1 | 0 | 0 | 0 | 74 | 0.986 | 0 |
| Harvested Crop | 0 | 1 | 222 | 1 | 0 | 0 | 0 | 224 | 0.99 | 0 |
| Rangeland | 0 | 7 | 9 | 74 | 2 | 0 | 2 | 94 | 0.787 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 2 | 8 | 0 | 0 | 10 | 0.8 | 0 |
| High Int. Developed | 7 | 4 | 2 | 2 | 0 | 0 | 0 | 15 | 0 | 0 |
| Mid-Low Int. Developed | 0 | 8 | 49 | 20 | 0 | 0 | 4 | 81 | 0.049 | 0 |
| Total | 18 | 93 | 282 | 101 | 10 | 0 | 6 | 510 | 0 | 0 |
| Producer’s Accuracy | 0.61 | 0.785 | 0.787 | 0.73 | 0.8 | 0 | 0.667 | 0 | 0.768 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.665 |
RF accuracy assessment shows an overall accuracy of 72.4% with a Kappa statistic of 0.593 (Table 9). This represents a drop in user’s accuracy for almost all classes. RF misclassified several classes, resulting in very poor producer’s accuracies. Upon examination of rangeland pixels misclassified as woody wetlands, it is difficult to determine RF’s reasoning: the pixels themselves are neither mixed nor close to a land cover boundary. However, there are errors present in the classification related to resolution issues causing mixed pixels. The urban classes, high-intensity developed and mid-low intensity developed, continued to be problematic. All classes except cultivated crops had pixels misclassified as one of the urban classes.
SVM accuracy assessment resulted in 75% overall accuracy with a Kappa statistic of 0.641 (Table 10). SVM had complications with the urban land cover classes as well. Using a different scheme did not resolve the misclassification of woody wetlands and open water as high-intensity developed. A considerable amount of harvested crop land cover and rangeland was misclassified as mid-low intensity developed as well. As mentioned earlier, the increase in bare soil cover and the similarity of its signatures to those of developed areas caused this misclassification problem. Additionally, the producer’s accuracy of the rangeland class dropped significantly compared to ML because of the developed classes. Overall, SVM results were not as sound as the results with the ML scheme.
Table 9. Accuracy assessment for post-harvest imagery using random forest classification before post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 14 | 0 | 2 | 0 | 0 | 0 | 0 | 16 | 0.875 | 0 |
| Woody Wetlands | 0 | 76 | 3 | 7 | 0 | 0 | 1 | 87 | 0.874 | 0 |
| Harvested Crop | 0 | 4 | 225 | 11 | 0 | 0 | 0 | 240 | 0.94 | 0 |
| Rangeland | 0 | 8 | 1 | 34 | 4 | 0 | 2 | 49 | 0.69 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 1 | 9 | 0 | 0 | 10 | 0.9 | 0 |
| High Int. Developed | 2 | 1 | 2 | 1 | 0 | 2 | 2 | 10 | 0.2 | 0 |
| Mid-Low Int. Developed | 0 | 10 | 55 | 23 | 0 | 0 | 8 | 96 | 0.083 | 0 |
| Total | 16 | 99 | 288 | 77 | 13 | 2 | 13 | 508 | 0 | 0 |
| Producer’s Accuracy | 0.875 | 0.767 | 0.78 | 0.44 | 0.69 | 1 | 0.62 | 0 | 0.72 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.59 |
Table 10. Accuracy assessment for post-harvest imagery using support vector machine classification before post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | High Int. Dev | Mid-Low Int. Dev | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 1 | 0 |
| Woody Wetlands | 0 | 78 | 0 | 3 | 0 | 0 | 0 | 81 | 0.96 | 0 |
| Harvested Crop | 0 | 2 | 228 | 0 | 0 | 0 | 0 | 230 | 0.99 | 0 |
| Rangeland | 0 | 8 | 0 | 50 | 2 | 0 | 5 | 65 | 0.769 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 3 | 7 | 0 | 0 | 10 | 0.7 | 0 |
| High Int. Developed | 5 | 2 | 1 | 0 | 0 | 2 | 0 | 10 | 0.2 | 0 |
| Mid-Low Int. Developed | 0 | 12 | 56 | 27 | 0 | 0 | 3 | 98 | 0.03 | 0 |
| Total | 20 | 102 | 285 | 83 | 9 | 2 | 8 | 509 | 0 | 0 |
| Producer’s Accuracy | 0.75 | 0.76 | 0.8 | 0.6 | 0.778 | 1 | 0.375 | 0 | 0.75 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.64 |

4.2.2. Classification of the Post-Harvest Imagery after Post-Classification Correction

Implementing PCC into the classification methodology produced significantly better results for all classification schemes. Table 11, Table 12 and Table 13 exhibit the results for post-harvest using ML, RF, and SVM, respectively, after PCC. Additionally, Figure 9 exhibits the ability of PCC to clean up image classification for each scheme, ML, SVM, and RF, respectively.
ML produced user’s accuracies above 90% for all classes except rangeland (Table 11). The low user’s accuracy of rangeland reflects portions of woody wetlands, harvested crops, and cultivated crops being misclassified as rangeland, which lowered the producer’s accuracy of those land cover classes. All classes except cultivated crop contributed to lowering the producer’s accuracy of woody wetlands. Now that the urban areas have been masked, rangeland is the only class with any significant impact on classification accuracy.
RF (Table 12) did not perform nearly as well as either SVM or ML. No class had a user’s accuracy over 88%, whereas SVM and ML produced user’s accuracies exceeding 90% for all classes, excluding ML performance with rangeland. The producer’s accuracy of woody wetlands decreased even more with RF, and all classes affected the woody wetlands class. While SVM and ML only had issues with rangeland, RF had difficulties with cultivated crops as well. As mentioned for the RF classification of the growing season imagery, RF’s pixel class choice cannot always be explained or understood.
SVM (Table 13) handled the issue with rangeland significantly better than ML and RF. However, the producer’s accuracy of rangeland decreased as compared to ML. Also, woody wetlands were still excluded to some degree due to the rangeland and harvested crop classes. This problem was typical along the boundaries of classes where mixed pixels occurred. Overall, the increase in accuracy with SVM is due to SVM’s efficiency in the class separation of pixels with mixed land cover classes.
All in all, SVM with PCC outperformed ML and RF, with an overall accuracy of 93.5% and a Kappa statistic of 0.88. ML with PCC had an overall accuracy of 88.8% with a Kappa statistic of 0.82. Finally, RF had an overall accuracy of 84.6% with a Kappa statistic of 0.72.
Table 11. Accuracy assessment for post-harvest imagery using maximum likelihood classification after post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 17 | 1 | 0 | 0 | 0 | 18 | 0.94 | 0 |
| Woody Wetlands | 0 | 77 | 0 | 1 | 0 | 78 | 0.98 | 0 |
| Harvested Crop | 2 | 4 | 251 | 3 | 0 | 260 | 0.96 | 0 |
| Rangeland | 0 | 13 | 31 | 97 | 2 | 143 | 0.67 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 0 | 10 | 10 | 1 | 0 |
| Total | 19 | 95 | 282 | 101 | 12 | 509 | 0 | 0 |
| Producer’s Accuracy | 0.89 | 0.81 | 0.89 | 0.96 | 0.83 | 0 | 0.888 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.82 |
Table 12. Accuracy assessment for post-harvest imagery using random forest classification after post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 17 | 1 | 1 | 1 | 0 | 20 | 0.85 | 0 |
| Woody Wetlands | 0 | 81 | 2 | 10 | 0 | 93 | 0.87 | 0 |
| Harvested Crop | 1 | 12 | 281 | 26 | 0 | 320 | 0.87 | 0 |
| Rangeland | 0 | 5 | 14 | 45 | 0 | 64 | 0.70 | 0 |
| Cultivated Crop | 0 | 2 | 0 | 3 | 5 | 10 | 0.5 | 0 |
| Total | 18 | 101 | 298 | 85 | 5 | 507 | 0 | 0 |
| Producer’s Accuracy | 0.94 | 0.80 | 0.94 | 0.53 | 1 | 0 | 0.846 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.72 |
Table 13. Accuracy assessment for post-harvest imagery using support vector machine with post-classification correction.
| LULC Class | Open Water | Woody Wetlands | Harvested Crop | Rangeland | Cultivated Crop | Total | User’s Accuracy | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Open Water | 19 | 0 | 0 | 0 | 0 | 19 | 1 | 0 |
| Woody Wetlands | 0 | 80 | 0 | 3 | 0 | 83 | 0.96 | 0 |
| Harvested Crop | 3 | 11 | 292 | 12 | 0 | 318 | 0.91 | 0 |
| Rangeland | 0 | 2 | 0 | 75 | 1 | 78 | 0.96 | 0 |
| Cultivated Crop | 0 | 0 | 0 | 1 | 9 | 10 | 0.9 | 0 |
| Total | 22 | 93 | 292 | 91 | 10 | 508 | 0 | 0 |
| Producer’s Accuracy | 0.86 | 0.86 | 1 | 0.82 | 0.9 | 0 | 0.935 | 0 |
| Kappa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.88 |

5. Discussion

SVM was found to be the most robust of the three schemes implemented with post-classification correction. This finding reveals that traditional parametric classifiers are not as suitable for agricultural settings due to the fact that the assumption of normal spectral distribution is often violated [4]. Pal and Mather [49] reported similar results in their study comparing SVM to ML and artificial neural networks.
No scheme implemented was able to effectively separate urban classes from bare soils and the various vegetation classes. The BSRW presents a unique challenge due to its soil structure. The BSRW soils vary widely in texture, structure, depth, and frequency [46]. Additionally, water absorption in soil largely influences reflectance in near-infrared and shortwave infrared regions [19,21]. These factors cause spectral confusion within the classifiers and result in no scheme effectively separating urban areas from bare soils and the various vegetation classes. Herold et al. [50] and Mesev [14] found bare soil surfaces to have spectral similarities to urban material types. Additionally, the spatial resolution of remote sensing imagery poses a limitation making mixed pixels common [4]. This has a direct effect on the mid-low intensity developed class separation from vegetation classes.
The PCC method of masking is an effective method to implement and increases classification accuracy significantly. Thakkar et al. [3] and Mesev [14] found similar results. Thakkar et al. [3] applied three masks, forest, water body, and drainage network masks, to produce sound results for the Arjuni watershed in Gujarat, India. Mesev [14] applied census data in urban areas to create accurate areal estimations. PCC is a reliable method to implement when dealing with the limitations of certain spatial resolutions and spectral similarities between classes.
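The two PCC steps named in this study, a majority filter and an urban mask derived from thematic maps, can be illustrated on a toy grid. This is a minimal sketch under stated assumptions, not the implementation used by the authors: the order of the two steps, the handling of border cells, the strict-majority rule, and the class codes are all choices made here for illustration.

```python
from collections import Counter


def majority_filter_3x3(grid):
    """Replace each cell by the strict-majority label of its 3 x 3 neighbourhood.

    Border cells use only the neighbours that exist. A cell keeps its own
    label when no label covers more than half of the window.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            window = [grid[i][j]
                      for i in range(max(0, r - 1), min(n_rows, r + 2))
                      for j in range(max(0, c - 1), min(n_cols, c + 2))]
            label, count = Counter(window).most_common(1)[0]
            if count * 2 > len(window):
                out[r][c] = label
    return out


def apply_urban_mask(grid, urban_mask, urban_label):
    """Overwrite cells flagged by the thematic urban mask with urban_label."""
    return [[urban_label if urban_mask[r][c] else grid[r][c]
             for c in range(len(grid[0]))]
            for r in range(len(grid))]


# Toy 5 x 5 map: cultivated crop (code 1) with one stray "developed" pixel (code 9).
classified = [[1] * 5 for _ in range(5)]
classified[2][2] = 9
urban_mask = [[False] * 5 for _ in range(5)]
urban_mask[0][0] = True  # hypothetical cell inside a mapped urban area

corrected = apply_urban_mask(majority_filter_3x3(classified), urban_mask, urban_label=9)
for row in corrected:
    print(row)
```

In the toy map, the isolated developed pixel inside the crop field is absorbed by its neighbours, while the cell covered by the urban mask is labelled developed regardless of its spectral class.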
Confusion between natural vegetation, such as rangeland, and agriculture is a difficult problem to solve and is a major source of error in remote sensing-based global land cover maps [12]. Moreover, bare soil and crop canopies will often both be present in a remotely sensed image and this mixture of two spectral signatures will often confound the interpretation of reflectance data leading to possible misclassification [19]. In the growing season imagery, a number of fields with crops have reached maturity and are ready for harvest. Upon reaching maturity some crops, such as wheat, corn, milo, and soybeans, turn yellow or brown and soybeans lose their leaves. This phenomenon allows the soil signatures to affect the vegetation pixels. The spectral signatures of soils and senescent crop residues are highly similar and traditional classification schemes have not proven robust enough to successfully differentiate the two [51]. This research found a number of fields that appeared to be bare soils if not for a faint NDVI signature or spotting throughout the field of crops not fully mature. These fields were misclassified as rangeland in some instances.
There are inherent limitations involved with classification accuracy assessments. Pixels are implicitly assumed to belong fully to one of a defined set of mutually exclusive classes [2]. However, due to the complexity of biophysical environments and imagery resolution, this assumption is not always met. Mixed pixels have been identified as the most important cause of misclassification and a prime contributor to the underestimation of land cover change [2,4]. In this research, mixing at class boundaries was a significant problem. When dealing with measures of accuracy, it has been argued that chance agreement is overestimated in the calculation of the Kappa coefficient, which results in an underestimation of classification accuracy [2,41]. Sampling design also has major implications. This research chose a stratified random method in which points are randomly distributed within each class, and each class receives a number of points proportional to its relative area. Thus, classes covering a larger area of the map were assessed more heavily than classes covering a smaller area. Within an accuracy assessment, all errors are weighted equally. However, some errors are more critical or damaging than others and, in many instances, the errors observed are between relatively similar classes [2].
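The stratified random design described above can be sketched as a proportional allocation followed by random draws within each mapped class. The largest-remainder rounding, the function names, and the pixel counts below are assumptions made for illustration; the study does not report its class areas or any minimum sample size per class.

```python
import random


def proportional_allocation(class_pixel_counts, n_points):
    """Split n_points across classes in proportion to mapped area.

    Fractional quotas are resolved with the largest-remainder rule so that
    the allocations always sum to n_points.
    """
    total = sum(class_pixel_counts.values())
    quotas = {k: n_points * v / total for k, v in class_pixel_counts.items()}
    alloc = {k: int(q) for k, q in quotas.items()}
    leftover = n_points - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def stratified_sample(class_map, n_points, seed=0):
    """Draw random (row, col) positions within each mapped class.

    Assumes n_points does not exceed the number of cells in class_map.
    """
    rng = random.Random(seed)
    cells = {}
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            cells.setdefault(label, []).append((r, c))
    alloc = proportional_allocation({k: len(v) for k, v in cells.items()}, n_points)
    return {k: rng.sample(cells[k], alloc[k]) for k in cells}


# Hypothetical mapped areas in pixels, for illustration only.
areas = {"cultivated crop": 600, "woody wetlands": 300, "open water": 100}
print(proportional_allocation(areas, 50))
```

Under this design a class that covers 10% of the map receives about 10% of the assessment points, so small classes are evaluated with few samples and their accuracies are correspondingly less certain.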
The lack of ground truth data is an important limitation of this research. NAIP imagery and ancillary data such as NDVI were used to bridge this gap. NAIP imagery has a resolution of 0.6 m and was particularly helpful in determining land cover and other various components. Agricultural areas are highly dynamic and change constantly with crop growth and harvest. Thus, more Landsat 8 imagery acquired between the growing season and harvest would enhance the knowledge of land cover as well as of which scheme and methodology are most suitable. Additionally, there is a plethora of classification techniques and methods that could increase the accuracy of land cover maps for an agricultural watershed.

6. Conclusions

Although all classifiers have been proven to be robust in previous research, none performed satisfactorily enough to assure the desired classification accuracy for a heterogeneous agricultural landscape. In the present study, it was possible to improve the accuracy of all classifiers by incorporating PCC and other ancillary data. Additionally, the high-resolution NAIP imagery and the 3 × 3 majority filter further aided in reducing misclassification. The overall accuracies of 84.3% (for August 2015) and 93.5% (for January 2018) with SVM demonstrate that the integration of PCC and ancillary data for remote sensing imagery is an effective method for improving classification accuracy.

Author Contributions

Conceptualization, P.D. and P.P.; methodology, P.D. and S.L.S.; software, S.L.S.; writing—original draft preparation, S.L.S.; writing—review and editing, P.D., P.P. and Y.O.; visualization, S.L.S.; supervision, P.D.; project administration, P.P.; funding acquisition, P.P., P.D. and Y.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by USDA NIFA/AFRI competitive grant award # 2017-67020-26375.

Data Availability Statement

All data generated or analyzed during this study are included in this published article in the form of figures and tables. Additional information about the dataset or the dataset in a different format than what is presented in this article can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest. The statements, findings, conclusions, and recommendations are those of the author(s) and do not necessarily reflect the views of the U.S. Department of Agriculture. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

References

  1. Otukei, J.R.; Blaschke, T. Land Cover Change Assessment Using Decision Trees, Support Vector Machines and Maximum Likelihood Classification Algorithms. Int. J. Appl. Earth Obs. Geoinf. 2010, 12, 27–31. [Google Scholar] [CrossRef]
  2. Foody, G.M. Status of Land Cover Classification Accuracy Assessment. Remote Sens. Environ. 2002, 80, 185–201. [Google Scholar] [CrossRef]
  3. Thakkar, A.K.; Desai, V.R.; Patel, A.; Potdar, M.B. Post-Classification Corrections in Improving the Classification of Land Use/Land Cover of Arid Region Using RS and GIS: The Case of Arjuni Watershed, Gujarat, India. Egypt. J. Remote Sens. Space Sci. 2017, 20, 79–89. [Google Scholar] [CrossRef] [Green Version]
  4. Lu, D.; Weng, Q. A Survey of Image Classification Methods and Techniques for Improving Classification Performance. Int. J. Remote Sens. 2007, 28, 823–870. [Google Scholar] [CrossRef]
  5. Stefanov, W.L.; Ramsey, M.S.; Christensen, P.R. Monitoring Urban Land Cover Change: An Expert System Approach to Land Cover Classification of Semiarid to Arid Urban Centers. Remote Sens. Environ. 2001, 77, 173–185. [Google Scholar] [CrossRef]
  6. Xiuwan, C. Using Remote Sensing and GIS to Analyse Land Cover Change and Its Impacts on Regional Sustainable Development. Int. J. Remote Sens. 2002, 23, 107–124. [Google Scholar] [CrossRef]
  7. Currit, N. Development of a Remotely Sensed, Historical Land-Cover Change Database for Rural Chihuahua, Mexico. Int. J. Appl. Earth Obs. Geoinf. 2005, 7, 232–247. [Google Scholar] [CrossRef]
  8. Yuan, F.; Sawaya, K.E.; Loeffelholz, B.C.; Bauer, M.E. Land Cover Classification and Change Analysis of the Twin Cities (Minnesota) Metropolitan Area by Multitemporal Landsat Remote Sensing. Remote Sens. Environ. 2005, 98, 317–328. [Google Scholar] [CrossRef]
  9. Qian, Y.; Zhang, K.; Qiu, F. Spatial Contextual Noise Removal for Post Classification Smoothing of Remotely Sensed Images. Proc. ACM Symp. Appl. Comput. 2005, 1, 524–528. [Google Scholar] [CrossRef]
  10. Judex, M.; Thamm, H.; Menz, G. Improving Land-Cover Classification with a Knowledge- Based Approach and Ancillary Data. In Proceedings of the 2nd Workshop of the EARSeL SIG on Land Use and Land Cover, Bonn, Germany, 28–30 September 2006; pp. 184–191. [Google Scholar]
  11. Manandhar, R.; Odehi, I.O.A.; Ancevt, T. Improving the Accuracy of Land Use and Land Cover Classification of Landsat Data Using Post-Classification Enhancement. Remote Sens. 2009, 1, 330–344. [Google Scholar] [CrossRef] [Green Version]
  12. McIver, D.K.; Friedl, M.A. Using Prior Probabilities in Decision-Tree Classification of Remotely Sensed Data. Remote Sens. Environ. 2002, 81, 253–261. [Google Scholar] [CrossRef]
  13. Jensen, J.R. Introductory Digital Image Processing: A Remote Sensing Perspective; Pearson Education, Inc.: Glenview, IL, USA, 2015. [Google Scholar]
  14. Mesev, V. The Use of Census Data in Urban Image Classification. Photogramm. Eng. Remote Sens. 1998, 64, 431–438. [Google Scholar]
  15. Tucker, C.J.; Holben, B.N.; Elgin, J.H.; McMurtrey, J.E. Remote Sensing of Total Dry-Matter Accumulation in Winter Wheat. Remote Sens. 1980, 11, 171–189. [Google Scholar]
  16. Moran, M.S.; Clarke, T.R.; Inoue, Y.; Vidal, A. Estimating Crop Water Deficit Using the Relation between Surface-Air Temperature and Spectral Vegetation Index. Remote Sens. Environ. 1994, 49, 246–263. [Google Scholar] [CrossRef]
  17. Wardlow, B.D.; Egbert, S.L. Large-Area Crop Mapping Using Time-Series MODIS 250 m NDVI Data: An Assessment for the U.S. Central Great Plains. Remote Sens. Environ. 2008, 112, 1096–1116. [Google Scholar] [CrossRef]
  18. Bolton, D.K.; Friedl, M.A. Forecasting Crop Yield Using Remotely Sensed Vegetation Indices and Crop Phenology Metrics. Agric. For. Meteorol. 2013, 173, 74–84. [Google Scholar] [CrossRef]
  19. Mulla, D.J. Twenty Five Years of Remote Sensing in Precision Agriculture: Key Advances and Remaining Knowledge Gaps. Biosyst. Eng. 2013, 114, 358–371. [Google Scholar] [CrossRef]
  20. Pinter, P.J.; Hatfield, J.L.; Schepers, J.S.; Barnes, E.M.; Moran, M.S.; Daughtry, C.S.T.; Upchurch, D.R. Remote Sensing for Crop Management; American Society for Photogrammetry and Remote Sensing: Baton Rouge, LA, USA, 2003; Volume 69. [Google Scholar]
  21. Cheng, Y.B.; Ustin, S.L.; Riaño, D.; Vanderbilt, V.C. Water Content Estimation from Hyperspectral Images and MODIS Indexes in Southeastern Arizona. Remote Sens. Environ. 2008, 112, 363–374. [Google Scholar] [CrossRef]
  22. Ben-Dor, E.; Chabrillat, S.; Demattê, J.A.M. Characterization of Soil Properties Using Reflectance Spectroscopy. In Fundamentals, Sensor Systems, Spectral Libraries, and Data Mining for Vegetation, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2018; pp. 187–247. [Google Scholar] [CrossRef]
  23. Atkinson, P.M.; Lewis, P. Geostatistical Classification for Remote Sensing: An Introduction. Comput. Geosci. 2000, 26, 361–371. [Google Scholar] [CrossRef]
  24. Aplin, P.; Atkinson, P.M.; Curran, P.J. Fine Spatial Resolution Simulated Satellite Sensor Imagery for Land Cover Mapping in the United Kingdom. Remote Sens. Environ. 1999, 68, 206–216. [Google Scholar] [CrossRef]
  25. Cortes, C.; Vapnik, V. Support Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar]
  26. Melgani, F.; Bruzzone, L. Classification of Hyperspectral Remote Sensing. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790. [Google Scholar]
  27. Gualtieri, J.A.; Cromp, R.F. Support Vector Machine for Hypserspectral Remote Sensing Classification. Adv. Comput. Assist. Recognit. 1998, 3584, 221–232. [Google Scholar]
  28. Gualtieri, J.A.; Chettri, S.R.; Cromp, R.F.; Johnson, L.F. Support Vector Machine Classifiers as Applied to AVIRIS Data. In Proceedings of the Eighth JPL Airborne Earth Science Workshop, Pasadena, CA, USA, 10–11 February 1999. [Google Scholar]
  29. Premalatha, M.; Lakshmi, C.V. SVM Trade-off between Maximize the Margin and Minimize the Variables Used for Regression. Int. J. Pure Appl. Math. 2013, 87, 741–750. [Google Scholar] [CrossRef] [Green Version]
  30. Gualtieri, J.A.; Cromp, R. SVM for Hyperspectral Remote Sensing Classification. Proc. SPIE 1998, 3584, 221–232. [Google Scholar]
  31. Chang, C.C.; Lin, C.J. LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27. [Google Scholar] [CrossRef]
  32. Anderson, J.R.; Hardy, E.E.; Roach, J.T.; Witmer, R.E. Land Use and Land Cover Classification System for Use with Remote Sensor Data; U.S. Government Printing Office: Washington, DC, USA, 1976.
  33. Homer, C.; Fry, J. The National Land Cover Database. US Geol. Surv. Fact Sheet 2012, 3020, 1–4.
  34. Homer, C.; Dewitz, J.; Yang, L.; Jin, S.; Danielson, P.; Xian, G.; Coulston, J.; Herold, N.; Wickham, J.; Megown, K. Completion of the 2011 National Land Cover Database for the Conterminous United States—Representing a Decade of Land Cover Change Information. Photogramm. Eng. Remote Sens. 2015, 81, 345–354.
  35. Canters, F. Evaluating the Uncertainty of Area Estimates Derived from Fuzzy Land-Cover Classification. Photogramm. Eng. Remote Sens. 1997, 63, 403–414.
  36. Stehman, S.V. Selecting and Interpreting Measures of Thematic Classification Accuracy. Remote Sens. Environ. 1997, 62, 77–89.
  37. Smits, P.C.; Dellepiane, S.G.; Schowengerdt, R.A. Quality Assessment of Image Classification Algorithms for Land-Cover Mapping: A Review and a Proposal for a Cost-Based Approach. Int. J. Remote Sens. 1999, 20, 1461–1486.
  38. Rosenfield, G.H.; Fitzpatrick-Lins, K. A Coefficient of Agreement as a Measure of Thematic Classification Accuracy. Photogramm. Eng. Remote Sens. 1986, 52, 223–227.
  39. Fitzgerald, R.W.; Lees, B.G. Assessing the Classification Accuracy of Multisource Remote Sensing Data. Remote Sens. Environ. 1994, 47, 362–368.
  40. Fung, T.; Ledrew, E. The Determination of Optimal Threshold Levels for Change Detection Using Various Accuracy Indices. Photogramm. Eng. Remote Sens. 1988, 54, 1449–1454.
  41. Ma, Z.; Redmond, R.L. Tau Coefficients for Accuracy Assessment of Classification of Remote Sensing Data. Photogramm. Eng. Remote Sens. 1994, 61, 435–439.
  42. Stehman, S.V. Comparing Thematic Maps Based on Map Value. Int. J. Remote Sens. 1999, 20, 2347–2366.
  43. Stehman, S.V.; Czaplewski, R.L. Design and Analysis for Thematic Map Accuracy Assessment: Fundamental Principles. Remote Sens. Environ. 1998, 64, 331–344.
  44. Saucier, R.T. Quaternary Geology of the Lower Mississippi Valley; Arkansas Archeological Survey: Fayetteville, AR, USA, 1994; Volume I.
  45. Risal, A.; Parajuli, P.B.; Dash, P.; Ouyang, Y.; Linhoss, A. Sensitivity of Hydrology and Water Quality to Variation in Land Use and Land Cover Data. Agric. Water Manag. 2020, 241, 106366.
  46. Snipes, C.E.; Evans, L.P.; Poston, D.H.; Nichols, S.P. Agricultural Practices of the Mississippi Delta; ACS Publications: Washington, DC, USA, 2004; pp. 43–60.
  47. National Agricultural Statistics Service. Field Crops Usual Planting and Harvesting Dates. In Agricultural Handbook; NASS: Burr Ridge, IL, USA, 2010; pp. 1–51.
  48. USDA. Data Gateway; USDA: Washington, DC, USA, 2018.
  49. Pal, M.; Mather, P.M. Support Vector Machines for Classification in Remote Sensing. Int. J. Remote Sens. 2005, 26, 1007–1011.
  50. Herold, M.; Roberts, D.A.; Gardner, M.E.; Dennison, P.E. Spectrometry for Urban Area Remote Sensing—Development and Analysis of a Spectral Library from 350 to 2400 nm. Remote Sens. Environ. 2004, 91, 304–319.
  51. South, S.; Qi, J.; Lusch, D.P. Optimal Classification Methods for Mapping Agricultural Tillage Practices. Remote Sens. Environ. 2004, 91, 90–97.
Figure 1. Big Sunflower River Watershed and the surrounding Yazoo Basin.
Figure 2. Methodology flowchart.
Figure 3. Mask generation for post-classification correction.
Figure 4. Spectral signature for each class during the growing season from the Landsat OLI imagery of 25 August 2015.
Figure 5. Growing season (25 August 2015) before PCC; (A) Landsat 8 OLI imagery, (B) support vector machine classification, (C) maximum likelihood classification, (D) random forest classification.
Figure 6. Growing season (25 August 2015) after PCC; (A) Landsat 8 OLI imagery, (B) support vector machine classification, (C) maximum likelihood classification, (D) random forest classification.
Figure 7. Spectral signature for each class post-harvest from the Landsat OLI imagery of 7 January 2018.
Figure 8. Post-harvest (7 January 2018) before PCC; (A) Landsat 8 OLI imagery, (B) support vector machine classification, (C) maximum likelihood classification, and (D) random forest classification.
Figure 9. Post-harvest (7 January 2018) after PCC; (A) Landsat 8 OLI imagery, (B) support vector machine classification, (C) maximum likelihood classification, (D) random forest classification.
Table 1. Usual Planting and Harvesting Dates [47].
| Crops (USDA, 2010) | Planting: Begin | Planting: Most Active | Planting: End | Harvesting: Begin | Harvesting: Most Active | Harvesting: End |
|---|---|---|---|---|---|---|
| Barley | n/a | n/a | n/a | n/a | n/a | n/a |
| Corn | 17 March | 24 March–27 April | 4 May | 11 August | 23 August–23 September | 7 October |
| Cotton | 20 April | 27 April–19 May | 29 May | 15 September | 27 September–29 October | 12 November |
| Potatoes, Sweet | 4 May | 7 June–23 June | 7 July | 20 August | 2 September–28 October | 7 November |
| Hay, other | n/a | n/a | n/a | 10 April | n/a | 26 September |
| Oats | n/a | n/a | n/a | n/a | n/a | n/a |
| Peanuts | 25 April | 6 May–31 May | 15 June | 20 September | 29 September–31 October | 10 November |
| Rice | 6 April | 18 April–16 May | 24 May | 29 August | 5 September–6 October | 20 October |
| Rye | n/a | n/a | n/a | n/a | n/a | n/a |
| Sorghum | 8 April | 14 April–21 May | 3 June | 19 August | 29 August–27 September | 2 October |
| Soybeans | 19 April | 26 April–31 May | 17 June | 10 September | 13 September–31 October | 9 November |
| Sugarbeets | n/a | n/a | n/a | n/a | n/a | n/a |
| Tobacco | n/a | n/a | n/a | n/a | n/a | n/a |
| Wheat (Winter) | 24 September | 10 October–18 November | 30 November | 28 May | 2 June–21 June | 1 July |
MDPI and ACS Style

Dash, P.; Sanders, S.L.; Parajuli, P.; Ouyang, Y. Improving the Accuracy of Land Use and Land Cover Classification of Landsat Data in an Agricultural Watershed. Remote Sens. 2023, 15, 4020. https://doi.org/10.3390/rs15164020
