**5. Conclusions**

Our results demonstrate that, even in diverse African agroforestry landscapes with high species diversity and imbalanced training data, the classification of some species, or groups of species, is possible when proper pre-processing, feature transformation and species grouping approaches are used. The MNF-transformed data combined with the ALS features outperformed the other feature sets, and the best results were achieved with the SVM classifier. If the aim is only to map the distribution of one species at a time, we suggest combining all the other species into one mixed group, as the highest accuracies were achieved with this approach. If many species are classified at the same time, spectral separability measures such as the JM distance can be used to find spectrally and structurally similar groups of species. With this approach, we found an ecologically and economically meaningful group of six fruit-bearing trees that can be mapped with moderate accuracy. Up-sampling improved the F1-scores for some species with fewer samples. For example, *Acacia tortilis*, with only four samples, was classified with a high mean F1-score after up-sampling. However, due to the small sample size, it is difficult to assess the performance of the model when it is applied over the whole study area. Our results also provide important insights into the spectral and structural features that differentiate the tree species in the study area, although we found notable differences in the important spectral regions compared with previous studies. The three non-native tree species (*Eucalyptus* spp., *Grevillea robusta*, and *Acacia mearnsii*) that could be mapped with the highest accuracies account for 40.1% of the samples. Thus, it is possible to map the decrease in biodiversity indirectly by mapping changes in the distribution of these species, as an increase in their distribution could mean a decrease in the number of native tree species. In addition, *Eucalyptus* spp. and *Acacia mearnsii* are highly invasive species, and mapping their distribution would be valuable for conservation planning. *Grevillea robusta* is an important agroforestry tree, and mapping its distribution would provide valuable information for the characterization of agroforestry practices in the study area. Although the number of species that were classified accurately is relatively low, better results could be achieved with more representative field data.

**Supplementary Materials:** The following are available online at www.mdpi.com/2072-4292/9/9/875/s1, Table S1: List of narrowband vegetation indices, Table S2: McNemar's score and statistical significance of difference in overall accuracy between support vector machine and random forest classifiers for different feature sets, Table S3: McNemar's score (lower triangular part) and statistical significance of difference in overall accuracy (upper triangular part) between different feature sets using support vector machine, Table S4: McNemar's score (lower triangular part) and statistical significance of change in overall accuracy (upper triangular part) between different feature sets using random forest, Table S5: Change in overall accuracy and Kappa after feature selection.

**Acknowledgments:** This research was funded by the Ministry for Foreign Affairs of Finland (CHIESA and BIODEV projects) and the Academy of Finland (TAITAWATER project). The authors want to thank Jessica Broas and Kirsi Kivistö from the University of Helsinki and Darius Kimuzi from Taita Research Station for helping to collect the field data in 2013 and 2015. Jesse Hietanen preprocessed the ALS point clouds. Elisa Schäfer, Kirsi Kivistö and Hari Adhikari helped to orthorectify the AisaEAGLE data. Pekka Hurskainen and Fabian Fassnacht offered valuable comments on the manuscript. Taita Research Station of the University of Helsinki in Kenya and its staff are warmly acknowledged for logistical support and safe accommodation during the fieldwork. Research permit NCST/RCD/17/012/33 from the National Council for Science and Technology of the Republic of Kenya is gratefully acknowledged.

**Author Contributions:** R.P., J.H., E.M. and P.P. conceived and designed the experiments. R.P. performed the experiments and analyzed the data. J.H., E.M. and A.V. contributed materials and analysis tools. R.P. wrote the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.
