2.3.4. Iterative Classification

Once the new training data are added to the initial training set, the augmented set is used to train an SVM classifier, and this procedure is repeated until a predefined stopping criterion is satisfied. In this study, the iterative classification was stopped when the percentage of pixels whose class changed between the current and previous classification results fell below 5%.
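The stopping criterion can be illustrated with a minimal sketch, assuming the two classification results are stored as integer label arrays of equal shape. The function names `changed_fraction` and `should_stop` are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def changed_fraction(previous, current):
    """Fraction of pixels whose class label differs between two label maps."""
    previous = np.asarray(previous)
    current = np.asarray(current)
    if previous.shape != current.shape:
        raise ValueError("label maps must have the same shape")
    return float(np.mean(previous != current))

def should_stop(previous, current, threshold=0.05):
    """True when fewer than `threshold` (5% by default) of the pixels changed."""
    return changed_fraction(previous, current) < threshold
```

In an iterative loop, `should_stop` would be evaluated after each SVM classification, with the latest map compared against the one from the preceding iteration.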

#### *2.4. Experimental Setting*

When applying the heuristic approach to self-learning, several practical issues arise. The first issue concerns the optimal number of land-cover maps used to generate unique sequence rules. If too many land-cover maps are used, stable rules are identified, but superfluous complex rules are also generated. By contrast, using too few land-cover maps may result in simple but unstable sequential patterns. The effect of the number of past land-cover maps was therefore investigated in this study.

The second issue is a biased sampling problem. The pixels selected as new training data are likely to be biased toward a specific class type. This is because the new training data selected by the BT algorithm come from specific boundaries containing a large number of training samples [25,39]. Including biased pixels that favor a specific major class type in the new training data may result in the over-estimation of that class type and an overall degradation of classification accuracy. In this study, random under-sampling (i.e., restriction of the number of newly labeled pixels) of the training data assigned to specific class types was applied to obtain unbiased training data.
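Random under-sampling of this kind might look like the following sketch. The per-class cap `max_per_class` and the fixed random seed are assumptions made for illustration; the cap actually used in the study is not stated here.

```python
import numpy as np

def undersample_per_class(labels, max_per_class, seed=0):
    """Return sorted sample indices, keeping at most `max_per_class` per class.

    `max_per_class` is a hypothetical cap; classes with fewer samples
    than the cap are kept in full.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    kept = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        if idx.size > max_per_class:
            # Draw a random subset without replacement for over-represented classes.
            idx = rng.choice(idx, size=max_per_class, replace=False)
        kept.append(idx)
    return np.sort(np.concatenate(kept))
```

Only over-represented classes are reduced, so minority classes keep all of their newly labeled pixels.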

The last issue concerns the quality or reliability of class labels predicted from unique sequence rules. The self-learning approach requires no analyst intervention, so the reliability of the class labels predicted from the unique sequence rules is critical for classification performance. In this study, the unique sequence rules are built from upscaled 250 m CDLs, not from the original 30 m CDLs. Thus, it is necessary to use pixels with high confidence in the upscaled CDLs. Since the most frequent class within each 250 m pixel was assigned to that pixel during upscaling, the confidence in the class assignment at 250 m can be derived from the fraction of the pixel occupied by the assigned class. To obtain more reliable rules, only the most confident pixels in all 250 m CDLs, i.e., those with higher fraction values, were used to build the sequence rules.
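The relationship between majority-class upscaling and the fraction-based confidence can be sketched as follows. This is a simplified illustration that assumes an integer aggregation factor and integer class codes from 0 to `n_classes - 1`; the actual 30 m to 250 m resampling is not an integer ratio, so the study's procedure necessarily differs in detail. The function name `upscale_majority` is hypothetical.

```python
import numpy as np

def upscale_majority(fine, factor, n_classes):
    """Aggregate a fine label map into coarse blocks.

    Returns (majority, fraction): the most frequent class in each block and
    the share of fine pixels in that block carrying the majority class.
    """
    fine = np.asarray(fine)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("map size must be a multiple of factor")
    # Rearrange so each coarse block becomes one vector of factor*factor labels.
    blocks = (fine.reshape(rows // factor, factor, cols // factor, factor)
                  .swapaxes(1, 2)
                  .reshape(rows // factor, cols // factor, factor * factor))
    # Count the occurrences of every class inside each block.
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_classes)],
                      axis=-1)
    majority = counts.argmax(axis=-1)
    fraction = counts.max(axis=-1) / (factor * factor)
    return majority, fraction
```

A coarse pixel with a fraction close to 1 is almost pure, whereas a fraction near 1/`n_classes` indicates a highly mixed pixel whose assigned label is unreliable.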

#### **3. Results**

#### *3.1. Generation of Rule-Based Class Labels*

To use the most confident pixels in the 250 m CDLs for rule generation, we defined rule information using only pixels whose fractions of the assigned classes in the 250 m CDLs from 2010 to 2014 exceeded a specific threshold. When a threshold of 70% was applied, few pixels were extracted for most classes except for winter wheat and non-crop. Thus, the rule information was finally generated using only pixels whose fractions were greater than 60%.
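The thresholding step can be sketched as below, assuming the per-year fraction images are stacked into an array of shape (years, rows, cols) and that a pixel must exceed the threshold in every year to be retained. The latter is one reading of the text, and the name `confident_mask` is hypothetical.

```python
import numpy as np

def confident_mask(fractions, threshold=0.60):
    """Boolean mask of pixels whose majority-class fraction exceeds
    `threshold` in every yearly map (assumed interpretation)."""
    fractions = np.asarray(fractions, dtype=float)
    return np.all(fractions > threshold, axis=0)
```

Raising `threshold` shrinks the mask, which is consistent with the observation that few pixels remained for most classes at 70%.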

Overlaying many past CDLs generates too many unique sequence rules that have similar but not identical class sequences. It is very difficult to predict a single class label from such complex rules because several class labels are possible in 2015. To reduce the uncertainty attached to class label assignment, not all possible rules were considered for the generation of rule-based class labels.

After analyzing typical cropping characteristics in the study area, we selected some unique sequence rules that could provide predictable information for class label assignment. Winter wheat–fallow rotation is known as the common cropping system in Kansas [41]. This rotation system allows soil moisture to accumulate in the cultivation area during the fallow periods. Due to soil erosion potential, however, rotations of winter wheat with summer crops such as corn, sorghum, and soybean are now widely planted [41–43]. Among the summer crops, corn–soybean rotations dominate in Kansas.

A total of 21 rules were finally defined to predict class labels in 2015 (Table 3). Not all 21 rules represent frequent patterns. Some frequent patterns (e.g., rules #4 and 21 in Table 3) were selected, but other patterns that were less frequent but facilitated the prediction of class labels in 2015 (e.g., rules #6 and 9 in Table 3) were also selected. Although a simple heuristic approach was applied to generate rule information, the sequential patterns of land-covers between 2010 and 2014 in Table 3 reflect well the predominant crop rotation sequences in Kansas described above. Typical sequence rules in the study area include winter wheat–fallow rotation, winter wheat–summer crop rotations, and summer crop rotations, as well as continuously grown crops. In addition, grain/hay and non-crop classes, including water and urban, remain unchanged.

**Table 3.** Sequential patterns of land-covers between 2010 and 2014 (C: corn, S: sorghum, SB: soybean, WW: winter wheat, A: alfalfa, OH: other hay, FA: fallow, W: water, U: urban, FO: forest, G: grass). The class labels in 2015 predicted from CDLs spanning the past five years are shown in bold.


As mentioned in Section 2.3, the effectiveness of the sequential patterns of land-covers depends on the number of CDLs used. To investigate this, the following different cases were considered to generate the rules: (1) using CDLs from 2010 to 2014 (5 years), (2) using CDLs from 2011 to 2014 (4 years), (3) using CDLs from 2012 to 2014 (3 years), and (4) using CDLs from 2013 to 2014 (2 years).

For pixels whose sequence of land-covers between 2010 and 2014 matched one of the 21 rules, the class label in 2015 was predicted as the corresponding label in the rightmost column of Table 3. The rule-based class label images predicted from these four different cases are given in Figure 4. By superimposing the new training data candidates on the predicted label image, the class labels of the candidates were assigned automatically. Note that rule-based class labels were not assigned to all pixels in the study area because some sequence rules were not considered and only the most confident pixels in the CDLs were used to define rule information.
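The rule-matching step can be illustrated with the following sketch, which assumes the past CDLs are stacked oldest-first as integer label arrays and that each rule maps one exact label sequence to one predicted label. The rule dictionary, the `UNLABELED` code, and the function names are hypothetical; they do not reproduce the rules of Table 3.

```python
import numpy as np

UNLABELED = -1  # hypothetical code for pixels without a rule-based label

def predict_from_rules(stack, rules, valid=None):
    """Assign a predicted label to pixels whose past sequence matches a rule.

    stack: (years, rows, cols) integer labels, oldest map first.
    rules: dict mapping a tuple of past labels to the predicted label.
    valid: optional (rows, cols) boolean mask of confident pixels.
    """
    stack = np.asarray(stack)
    years, rows, cols = stack.shape
    out = np.full((rows, cols), UNLABELED, dtype=int)
    for sequence, label in rules.items():
        if len(sequence) != years:
            raise ValueError("rule length must equal the number of maps")
        pattern = np.asarray(sequence).reshape(years, 1, 1)
        # A pixel matches only if its label agrees with the rule in every year.
        match = np.all(stack == pattern, axis=0)
        out[match] = label
    if valid is not None:
        out[~np.asarray(valid, dtype=bool)] = UNLABELED
    return out

def labeled_percentage(predicted):
    """Percentage of pixels that received a rule-based label."""
    return 100.0 * float(np.mean(predicted != UNLABELED))
```

Under these assumptions, the four cases correspond to passing the last five, four, three, or two maps of the stack together with rules of matching length, and `labeled_percentage` gives a quantity analogous to the proportion P reported in Figure 4.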

As the number of CDLs used to define the sequence rules decreased, the proportion of pixels for which the class label in 2015 could be predicted increased accordingly (e.g., 23.38% (37,415 pixels) and 35.31% (56,496 pixels) when using the past five-year and two-year CDLs, respectively). The fewer the CDLs, the more area was assigned to certain crop types such as corn, soybean, and winter wheat. By contrast, as the number of CDLs increased, relatively few areas received a rule-based class label and many areas remained unlabeled. Note that the number of pixels with rule-based class labels is much larger than the number of initial training pixels (i.e., 37,415 versus 420). These rule images were used separately in the subsequent classification procedures, and their classification performance was compared.

**Figure 4.** Rule-based class label images predicted from past CDLs: (**a**) 5-year (2010 to 2014), (**b**) 4-year (2011 to 2014), (**c**) 3-year (2012 to 2014), and (**d**) 2-year (2013 to 2014). P (the percentage value below each predicted class label image) denotes the proportion of pixels in which the class label for 2015 could be predicted.
