**3. Methods**

In this study, classification is based on Random Forest, as it was found to be the best classification algorithm for stress detection in our previous study [7]. To find the most appropriate regression model for the purpose, however, several regression models were compared. The comparison was made using the Regression Learner application of MATLAB (version R2018b), which enables fast experimentation with multiple regression algorithms. Regression Learner offers two options for dividing data into training and testing sets: cross-validation, which randomly divides the data into a desired number of groups, and holdout, which randomly divides the data into two groups. As the aim of this study is to build user-independent models, the leave-one-subject-out method needs to be used in the final model training process, and therefore neither of the approaches provided by Regression Learner is valid for that purpose. Moreover, both approaches lead to overfitted models, as data from the same person can appear in both the training and testing sets. Nevertheless, although Regression Learner cannot be used to calculate the final outcomes of this paper, it can be used to compare the performance of different regression algorithms at predicting the amount of stress, and thus to help select the regression algorithm used to calculate the final outcomes of this article.
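The difference between random splitting and leave-one-subject-out splitting can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the subject labels and the toy layout of 4 subjects with 5 samples each are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [i for s, rows in by_subject.items() if s != held_out for i in rows]
        yield held_out, sorted(train_idx), test_idx


def random_k_fold(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for a random k-fold split that ignores subjects."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield sorted(train_idx), sorted(test_idx)


def shared_subjects(subject_ids, train_idx, test_idx):
    """Return the subjects that contribute samples to both training and test sets."""
    return {subject_ids[i] for i in train_idx} & {subject_ids[i] for i in test_idx}


# Hypothetical toy layout: 4 subjects with 5 samples each (not the study data).
subject_ids = [s for s in ("S1", "S2", "S3", "S4") for _ in range(5)]

# Leave-one-subject-out: the held-out subject never appears in the training set.
loso_folds = list(leave_one_subject_out(subject_ids))
loso_overlap = [shared_subjects(subject_ids, tr, te) for _, tr, te in loso_folds]

# Random 5-fold: samples of one subject end up on both sides of the split.
kfold_folds = list(random_k_fold(len(subject_ids), k=5))
kfold_overlap = [shared_subjects(subject_ids, tr, te) for tr, te in kfold_folds]
```

In this toy layout every entry of `loso_overlap` is empty. Every entry of `kfold_overlap` is non-empty, because a test fold of 4 samples cannot contain all 5 samples of any subject. This is the leakage that makes random splits unsuitable for evaluating user-independent models.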

In the end, 13 regression algorithms were compared by training regression models using 5-fold cross-validation (see Table 1). Based on this comparison, a bagged-tree ensemble model was selected for the experiments, as it produced the lowest root mean square error (RMSE).
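For reference, the selection criterion can be written out using the standard definition of RMSE. The symbols are assumptions introduced here for illustration: $N$ is the number of validation samples, $y_i$ the reference stress value of sample $i$, $\hat{y}_i$ the value predicted for it by a model trained on folds that do not contain sample $i$, and $m$ indexes the 13 candidate models.

$$
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2},
\qquad
m^{*} = \arg\min_{m \in \{1,\dots,13\}} \mathrm{RMSE}_m
$$

Because the error is squared before averaging, large individual prediction errors are penalised more heavily than under a mean absolute error criterion.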


**Table 1.** Comparison of regression models using 5-fold cross-validation.
