4.1.6. Machine Learning Classifier Algorithms

We used the Weka machine learning toolkit [54] to identify stress levels. Weka provides several preprocessing features that can be applied before classification. Our data set was imbalanced in the number of instances per class. We addressed this by removing samples from the majority class; we selected random undersampling because it is the most commonly applied method [55]. This prevented the classifiers from being biased towards the class with more instances. We employed five machine learning classification algorithms to recognize the different stress levels: MultiLayer Perceptron (MLP), Random Forest (RF) with 100 trees, K-nearest neighbors (kNN) (*n* = 1–4), linear discriminant analysis (LDA), and principal component analysis (PCA) combined with a support vector machine (SVM) with a radial basis function kernel. These algorithms were selected because they are the most commonly applied and successful classifiers for detecting stress levels [30,48]. We applied 10-fold stratified cross-validation and fine-tuned the hyperparameters of the algorithms with grid search. We report the best-performing models.
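
To make the balancing and validation steps concrete, the following is a minimal Python sketch of random undersampling followed by stratified fold assignment. It is not the Weka pipeline used in the study. The function names `random_undersample` and `stratified_folds`, the class names, the class sizes, and the seeds are all hypothetical and chosen only for illustration. The sketch also assumes that every class is reduced to the size of the smallest class, which is one common form of random undersampling. Grid search is not covered.

```python
import random
from collections import Counter, defaultdict


def random_undersample(labels, seed=0):
    """Return sorted indices that keep an equal number of samples per class.

    Assumption: every class is reduced to the size of the smallest class
    by dropping randomly chosen samples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    target = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    return sorted(kept)


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the shuffled samples of this class round-robin over the folds.
        for pos, idx in enumerate(idxs):
            folds[pos % n_folds].append(idx)
    return folds


# Hypothetical, imbalanced toy labels (not data from the study).
labels = ["low"] * 60 + ["medium"] * 30 + ["high"] * 20

kept = random_undersample(labels, seed=42)
balanced = [labels[i] for i in kept]
class_counts = Counter(balanced)

# Fold indices refer to positions in the balanced list.
folds = stratified_folds(balanced, n_folds=10, seed=42)
fold_class_counts = [Counter(balanced[i] for i in fold) for fold in folds]
```

In each round of cross-validation, one fold would serve as the test set and the remaining nine as the training set, so every balanced sample is tested exactly once.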
