#### *4.2. Generation of Sequence Rules*

The generation of reliable sequence rules from past land-cover maps is essential to the self-learning approach. This study applied a heuristic approach to predict the class labels of candidates for new training data. Recently, a machine learning approach was presented to build rules on crop rotations for early crop type mapping before the crop season [46]. A Markov logic network (MLN), which can combine learning from data with expert knowledge, was applied to model crop rotations. Assessments based on different temporal lengths and spatial coverages showed that the MLN achieved an accuracy of up to 60% and, in particular, good prediction accuracy even for a large region with heterogeneous climatic conditions and soils. In this study, the heuristic approach to rule generation was tested on a relatively small area. In our previous study in the State of Illinois [19], rule-based class labels combined with AL-based classification results improved classification accuracy over a large area, which indicates the applicability of rule information. Similar to the approach of Osman et al. [46], we will test whether rules built from data in a small area can be transferred to other landscapes or to large areas with diverse crop rotations, and we will also examine the applicability of the MLN. Since wrong label assignments greatly affect classification performance [47], the effects of class label noise and the accuracy of existing land-cover maps should also be investigated.

#### *4.3. Practical Issues*

For crop classification, annual change patterns should be used in order to properly account for various cropping systems. However, from a practical viewpoint, collecting many consecutive annual land-cover maps is a demanding task and is not always possible. If only a few land-cover maps are available, or if the time interval between sequential land-cover maps is more than two years, another approach within the self-learning framework should be developed. Instead of using deterministic hard class labels, a probabilistic approach based on transition probabilities can be applied. As temporal contextual information, the transition probabilities between the land-cover classes considered can be defined using expert knowledge. These probabilities can then be combined with the conditional probabilities based on spectral or scattering features within a probabilistic framework. If this probabilistic approach is combined with the self-learning approach, errors from rule-based class labels could be reduced. Future studies will investigate these aspects.
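As an illustration of the probabilistic combination described above, the following minimal sketch multiplies feature-based conditional probabilities by transition probabilities and renormalizes the result. It assumes a simple Bayesian-style product rule, which is only one possible probabilistic framework and is not a method evaluated in this study. The function name `combine_probabilities`, the class names, and all probability values are hypothetical.

```python
def combine_probabilities(likelihoods, transition, previous_class):
    """Combine feature-based and temporal evidence for one pixel.

    likelihoods: dict mapping class -> conditional probability of the
        observed spectral or scattering features given that class.
    transition: dict mapping previous class -> dict of next class ->
        transition probability (assumed here to come from expert knowledge).
    previous_class: class label of the pixel in the earlier land-cover map.
    Returns a dict mapping class -> normalized combined probability.
    """
    prior = transition[previous_class]
    # Product rule: feature evidence weighted by temporal context.
    unnormalized = {c: likelihoods[c] * prior.get(c, 0.0) for c in likelihoods}
    total = sum(unnormalized.values())
    if total <= 0.0:
        raise ValueError("combined probabilities sum to zero")
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical transition probabilities (illustrative values only).
transition = {
    "sorghum": {"sorghum": 0.2, "winter wheat": 0.7, "non-crop": 0.1},
    "winter wheat": {"sorghum": 0.6, "winter wheat": 0.3, "non-crop": 0.1},
    "non-crop": {"sorghum": 0.05, "winter wheat": 0.05, "non-crop": 0.9},
}
# Hypothetical feature-based conditional probabilities for one pixel.
likelihoods = {"sorghum": 0.5, "winter wheat": 0.4, "non-crop": 0.1}

posterior = combine_probabilities(likelihoods, transition, "sorghum")
predicted = max(posterior, key=posterior.get)
```

In this example the feature-based evidence alone favors sorghum, but the temporal context shifts the decision to winter wheat. This shows how soft transition information could temper a purely feature-based label without committing to a deterministic rule.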

Another practical issue concerns the selection of new training pixels from the candidates. To alleviate the bias towards majority classes, new training pixels were selected by random under-sampling. For comparison, under-sampling of the pixels with the highest uncertainty was also tested: the candidate pixels were first sorted in descending order of their uncertainty values, and the pixels with the highest uncertainty were then selected as new training pixels. When the past three CDLs were used for rule generation, the overall accuracy of this alternative under-sampling approach was 81.44%, which is slightly lower than that of random under-sampling (84.42%). This difference in classification accuracy is due to the number and class types of the newly added training pixels. Relatively many pixels were selected for sorghum, for which the rule-based class labels have lower accuracy, whereas fewer pixels were selected for winter wheat and non-crops, for which the rule-based class labels have high accuracy. This difference in the selected training pixels resulted in the lower classification accuracy. Despite the relatively higher accuracy of random under-sampling, majority classes still affected the classification accuracy, as mentioned in Section 3.4. To avoid selecting redundant pixels, the representativeness and diversity of the most uncertain pixels should be considered as criteria for identifying the most informative pixels. To this end, entropy and spatial density measures, as well as clustering, can be applied [48,49]. The application of these criteria will be tested within a self-learning framework.
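The two selection strategies compared above can be sketched as follows. This is a simplified illustration, not the implementation used in this study: it assumes that each candidate is a tuple of pixel identifier, rule-based class label, and uncertainty value, and that under-sampling caps every class at the same number of pixels. The function name `undersample`, the parameter `n_per_class`, and the sample values are hypothetical.

```python
import random
from collections import defaultdict


def undersample(candidates, n_per_class, strategy="random", seed=0):
    """Select at most n_per_class candidate pixels from each class.

    candidates: list of (pixel_id, rule_label, uncertainty) tuples.
    strategy: "random" draws pixels at random within each class;
        "uncertainty" keeps the most uncertain pixels of each class.
    """
    by_class = defaultdict(list)
    for candidate in candidates:
        by_class[candidate[1]].append(candidate)

    rng = random.Random(seed)  # fixed seed for reproducible sampling
    selected = []
    for label in sorted(by_class):
        pixels = by_class[label]
        if strategy == "random":
            k = min(n_per_class, len(pixels))
            selected.extend(rng.sample(pixels, k))
        elif strategy == "uncertainty":
            # Sort in descending order of uncertainty, then keep the top ones.
            ranked = sorted(pixels, key=lambda p: p[2], reverse=True)
            selected.extend(ranked[:n_per_class])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return selected


# Hypothetical candidates: (pixel_id, rule_label, uncertainty).
candidates = [
    (0, "non-crop", 0.10), (1, "non-crop", 0.90), (2, "non-crop", 0.40),
    (3, "non-crop", 0.75), (4, "sorghum", 0.60), (5, "sorghum", 0.95),
    (6, "winter wheat", 0.30),
]
by_random = undersample(candidates, n_per_class=2, strategy="random")
by_uncertainty = undersample(candidates, n_per_class=2, strategy="uncertainty")
```

Both calls cap the majority class (non-crop in this toy example) at two pixels, and they differ only in which pixels of each class are kept. Representativeness or diversity criteria, such as the entropy, spatial density, and clustering measures mentioned above, could be added as further strategies in the same structure.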

## **5. Conclusions**

A self-learning classification approach, which can select the most informative labeled pixels as new training data, was presented in this study. The proposed approach differs from the AL approach in that no analyst intervention was required. The class labels for new candidate pixels were predicted from representative sequence rules selected from sequential change patterns of past land-cover maps. A classification experiment in crop cultivation areas demonstrated that this method could be used to properly define the class labels of unlabeled informative pixels. By progressively adding these informative labeled pixels into the training data, misclassification based purely on spectral information from a small number of training data could be greatly reduced, and higher classification accuracy was achieved. To strengthen the advantage of the self-learning approach, more extensive classification experiments in other regions with a wide variety of land-cover types and climatic conditions and different availability of past land-cover maps will be included in future work.

**Acknowledgments:** This work was carried out with the support of "Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ009978)," Rural Development Administration, Republic of Korea. The authors thank three anonymous reviewers for providing invaluable comments on the original manuscript.

**Author Contributions:** Yeseul Kim and No-Wook Park designed this study. Yeseul Kim performed data processing and prepared the draft of this manuscript. No-Wook Park contributed to methodological developments and edited the manuscript. Kyung-Do Lee made important contributions to interpretations of crop rotations and classification results.

**Conflicts of Interest:** The authors declare no conflict of interest.
