*Article* **Machine Learning Systems Detecting Illicit Drugs Based on Their ATR-FTIR Spectra**

**Iulia-Florentina Darie <sup>1,2</sup>, Stefan Razvan Anton <sup>3</sup> and Mirela Praisler <sup>4,\*</sup>**


**Abstract:** We present a comparative study aiming to determine the most efficient multivariate model for screening the main drugs of abuse based on their ATR-FTIR spectra. A preliminary statistical analysis of selected spectral data extracted from the public SWGDRUG IR Library was first performed. The results corroborated those of an exploratory analysis based on several dimensionality reduction methods, i.e., Principal Component Analysis (PCA), Independent Component Analysis (ICA), and autoencoders. Then, several machine learning methods, i.e., Support Vector Machines (SVM), eXtreme Gradient Boosting (XGB), Random Forest, Gradient Boosting, and K-Nearest Neighbors (KNN), were used to assign the drug class membership. To account for the stochastic nature of these machine learning methods, each model was evaluated 10 times on a randomly selected subset of the whole SWGDRUG IR Library, and the results were compared in detail. Finally, the performance of the models in assigning the class identity of three classes of drugs of abuse, i.e., hallucinogenic (2C-x, DOx, and NBOMe) amphetamines, cannabinoids, and opioids, was compared based on confusion matrices and various classification parameters, such as balanced accuracy, sensitivity, and specificity. The advantages of each of the illicit drug-detecting systems and their potential as forensic screening tools used in field scenarios are also discussed.

**Keywords:** amphetamines; cannabinoids; opioids; ATR-FTIR spectra; PCA; ICA; autoencoders; SVM; XGB; random forest; gradient boosting; K-Nearest Neighbors (KNN)
