*Binning.*

In Saucier's original study [13], trait scores were divided into three bins: scores in [8, 29] were considered *low*, scores in [30, 50] *average*, and scores in [51, 72] *high*. This split gains importance here because one of the conceived ML architectures, as explained later, uses classification models, which require the labels (the personality traits) to be binned. Labels were therefore binned using the three original intervals. As depicted in Figure 4, binning the dataset with the original intervals yields imbalanced bins, with all five traits having more observations in the range [30, 50] than in either of the other two ranges. In fact, for all traits, around 60% of observations fall within the *average* bin. Regarding the other two bins, *high* contains significantly more observations than *low* for all traits except *Extroversion*, for which the *high* and *low* bins contain approximately the same number of observations. This distribution must be taken into consideration when conceiving and training the ML models; in particular, it motivates the use of error metrics that account for imbalanced bins.

**Figure 4.** Distribution of observations per bin and personality trait.
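The binning rule described above can be illustrated with a minimal sketch. It assumes integer trait scores in [8, 72]; the function names (`bin_score`, `bin_distribution`, `balanced_accuracy`) and the sample scores are hypothetical and not taken from the study. Balanced accuracy (the mean of per-class recall) is shown only as one example of an imbalance-aware metric; the text does not state at this point which metrics were actually used.

```python
from collections import Counter

# Bin intervals as stated in the text (inclusive on both ends).
BINS = (("low", 8, 29), ("average", 30, 50), ("high", 51, 72))


def bin_score(score):
    """Map an integer trait score to 'low', 'average', or 'high'."""
    for label, lower, upper in BINS:
        if lower <= score <= upper:
            return label
    raise ValueError(f"score {score} is outside the valid range [8, 72]")


def bin_distribution(scores):
    """Return the fraction of observations falling in each bin."""
    counts = Counter(bin_score(s) for s in scores)
    total = sum(counts.values())
    return {label: counts.get(label, 0) / total for label, _, _ in BINS}


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of exact label matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall; one possible imbalance-aware metric
    (an assumption for illustration, not the study's stated choice)."""
    recalls = []
    for label in set(y_true):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical scores for a single trait, for illustration only.
    scores = [10, 35, 40, 45, 60]
    labels = [bin_score(s) for s in scores]
    print(bin_distribution(scores))

    # A trivial classifier that always predicts the majority bin.
    always_average = ["average"] * len(labels)
    print(accuracy(labels, always_average))
    print(balanced_accuracy(labels, always_average))
```

In this toy sample, 60% of the scores fall in the *average* bin, so a classifier that always predicts *average* reaches a plain accuracy of 0.6 while its balanced accuracy is only 1/3. This is the kind of distortion that imbalance-aware metrics are meant to expose.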
