*3.4. Machine Learning Classification Models*

Ultimately, 18 questionnaire variables were considered as predictors of perceived stress. The full list of predictors, along with their descriptions, is provided in the Supplementary Materials. Of these 18 variables, correlation-based feature selection identified the following 7 as the best set of predictors: age, monthly income, COPE avoidance, COPE positive, BSCS total score, BFI-10 emotional stability, and BFI-10 agreeableness. Using these predictors, ML algorithms were trained and tested according to the procedure described in the "Data Analysis" section. Classification results for the test set are reported in Table 3, which quantifies predictive performance using the following metrics: area under the receiver operating characteristic (ROC) curve, precision, recall, and F-measure (F1 score). The ROC area of the classifiers ranged from 0.70 to 0.78 in the test set. However, the random forest algorithm showed lower sensitivity (recall) for the high perceived stress class than the other classifiers, making it a weaker model for the purposes of prediction.
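To illustrate how correlation-based feature selection scores a candidate subset, the sketch below computes a merit value that rewards correlation with the class and penalizes correlation among the features themselves. It assumes the commonly used merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff); the exact variant and software used in this study are not specified here, and the function name `cfs_merit` and all correlation values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def cfs_merit(feature_class_corrs: Sequence[float],
              feature_feature_corrs: Sequence[float]) -> float:
    """Merit of a feature subset under the assumed CFS heuristic.

    feature_class_corrs: one correlation per feature with the class label.
    feature_feature_corrs: one correlation per pair of features in the subset.
    """
    k = len(feature_class_corrs)
    if k == 0:
        raise ValueError("the subset must contain at least one feature")
    # Mean absolute feature-class correlation (relevance).
    r_cf = sum(abs(r) for r in feature_class_corrs) / k
    # Mean absolute feature-feature correlation (redundancy); 0 for a single feature.
    if feature_feature_corrs:
        r_ff = sum(abs(r) for r in feature_feature_corrs) / len(feature_feature_corrs)
    else:
        r_ff = 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Hypothetical values: two equally relevant features, uncorrelated vs. fully redundant.
independent = cfs_merit([0.4, 0.4], [0.0])
redundant = cfs_merit([0.4, 0.4], [1.0])
```

Under this heuristic, the subset with uncorrelated features receives the higher merit, which is why a selected subset tends to contain predictors that are individually informative but not redundant with one another.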


**Table 3.** Metrics of the ML Models Tested on the 407 New Participants (Test Set).

Note: ROC = receiver operating characteristic curve; SVM = support vector machine.
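For readers less familiar with the metrics in Table 3, the following minimal sketch shows how precision, recall, F1 score, and ROC area can be computed for a binary outcome (for example, high versus low perceived stress). The labels, scores, 0.5 threshold, and function names are hypothetical and are not taken from the study data; the ROC area is computed through its rank interpretation, i.e. the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int],
                        positive: int = 1) -> Tuple[float, float, float]:
    """Precision, recall (sensitivity), and F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def roc_area(y_true: Sequence[int], scores: Sequence[float],
             positive: int = 1) -> float:
    """ROC area as the fraction of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = high perceived stress) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_area(y_true, scores)
```

Because recall depends on the chosen decision threshold while ROC area does not, a classifier can show an acceptable ROC area and still have low sensitivity for one class, which is the pattern described above for the random forest.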
