**Anita Sabat-Tomala \*, Edwin Raczko and Bogdan Zagajewski**

Department of Geoinformatics, Faculty of Geography and Regional Studies, University of Warsaw, Cartography and Remote Sensing, ul. Krakowskie Przedmieście 30, 00-927 Warsaw, Poland; edwin.raczko@uw.edu.pl (E.R.); bogdan@uw.edu.pl (B.Z.)

**\*** Correspondence: anita.sabat@uw.edu.pl; Tel.: +48-225-520-654

Received: 17 December 2019; Accepted: 3 February 2020; Published: 5 February 2020

**Abstract:** Invasive and expansive plant species are considered a threat to natural biodiversity because of their high adaptability and low habitat requirements. Species investigated in this research, including *Solidago* spp., *Calamagrostis epigejos*, and *Rubus* spp., are successfully displacing native vegetation and claiming new areas; as they rapidly encroach on protected areas (e.g., Natura 2000 habitats), they severely decrease the richness of natural ecosystems. Because of the damage caused, the European Union (EU) has committed all its member countries to monitoring biodiversity. In this paper, we compared two machine learning algorithms, Support Vector Machine (SVM) and Random Forest (RF), for identifying *Solidago* spp., *Calamagrostis epigejos*, and *Rubus* spp. on HySpex hyperspectral aerial images. SVM and RF are reliable, well-known classifiers that achieve satisfactory results in the literature. Both classifiers were trained on data sets containing 30, 50, 100, 200, and 300 pixels per class. The classifications were performed on 430 spectral bands and on the 30 most informative bands extracted using the Minimum Noise Fraction (MNF) transformation. The result was a set of maps of the spatial distribution of the analyzed species; high accuracies were observed for all data sets and classifiers (average F1 score above 0.78). The highest accuracies were obtained using 30 MNF bands and 300 sample pixels per class in the training data set (average F1 score > 0.9). Smaller training data sets yielded lower average F1 scores, by up to 13 percentage points for the data sets with 30 pixels per class.

**Keywords:** Natura 2000; invasive species; expansive species; support vector machine; random forest; biodiversity
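
The experimental design summarized in the abstract (two classifiers, five training set sizes, full versus reduced feature sets, F1 scoring) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (generated with scikit-learn's `make_classification`) rather than HySpex pixels, the number of classes (4) is hypothetical, PCA is swapped in for the MNF transformation because scikit-learn does not provide MNF, the classifier hyperparameters are arbitrary, and "average F1" is assumed to mean the macro average.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_BANDS = 430                            # number of spectral bands, from the abstract
N_CLASSES = 4                            # hypothetical class count (assumption)
SAMPLE_SIZES = (30, 50, 100, 200, 300)   # training pixels per class, from the abstract
N_COMPONENTS = 30                        # reduced feature count, from the abstract


def sample_per_class(X, y, n_per_class, rng):
    """Draw n_per_class samples of every class, without replacement."""
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_per_class, replace=False)
        for c in np.unique(y)
    ])
    return X[idx], y[idx]


def make_models(reduce_features):
    """Build SVM and RF pipelines; PCA stands in for MNF (assumption)."""
    def reducer():
        return [PCA(n_components=N_COMPONENTS, random_state=0)] if reduce_features else []

    return {
        "SVM": make_pipeline(StandardScaler(), *reducer(),
                             SVC(kernel="rbf", C=10.0, gamma="scale")),
        "RF": make_pipeline(*reducer(),
                            RandomForestClassifier(n_estimators=100, random_state=0)),
    }


def run_experiment(seed=0):
    """Return {(pixels_per_class, feature_set, model): macro F1} on synthetic data."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for labeled hyperspectral pixels.
    X, y = make_classification(
        n_samples=4000, n_features=N_BANDS, n_informative=20, n_redundant=50,
        n_classes=N_CLASSES, n_clusters_per_class=1, random_state=seed,
    )
    # Hold out a fixed test set; training subsets are drawn from the remaining pool.
    X_pool, X_test, y_pool, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    results = {}
    for n in SAMPLE_SIZES:
        X_train, y_train = sample_per_class(X_pool, y_pool, n, rng)
        for label, reduce_features in (("430 bands", False),
                                       ("30 PCA components", True)):
            for name, model in make_models(reduce_features).items():
                model.fit(X_train, y_train)
                pred = model.predict(X_test)
                results[(n, label, name)] = f1_score(y_test, pred, average="macro")
    return results


if __name__ == "__main__":
    for (n, label, name), f1 in run_experiment().items():
        print(f"{n:>3} px/class | {label:<17} | {name:<3} | macro F1 = {f1:.3f}")
```

Scores from this sketch describe only the synthetic data and are not comparable with the F1 values reported in the abstract; the sketch shows the structure of the comparison only.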
