*Final Considerations.*

Two datasets were created. The *age*, *language*, and *genre* features, as well as the ratings of the 40 adjectives, were removed from both datasets because the models do not use them. The *No DA* dataset consists of 250 observations, while the *With DA* dataset consists of 5230 observations. Both datasets contain 50 features: the 40 one-hot encoded adjectives, the 5 personality trait scores, and the 5 binned personality traits.
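As a rough illustration of the resulting feature layout, the sketch below builds the list of 50 retained columns and filters a single record down to them. All column names (`adj_1`, `trait_1_score`, `trait_1_bin`, `adj_1_rating`, and so on) and the helper functions are hypothetical placeholders, since the actual names used in the datasets are not given here.

```python
# Minimal sketch of the final feature layout.
# All column names below are hypothetical; only the counts come from the text.

N_ADJECTIVES = 40
N_TRAITS = 5


def build_feature_columns(n_adjectives=N_ADJECTIVES, n_traits=N_TRAITS):
    """Return the retained columns: one-hot adjectives, trait scores, binned traits."""
    adjectives = [f"adj_{i}" for i in range(1, n_adjectives + 1)]
    scores = [f"trait_{i}_score" for i in range(1, n_traits + 1)]
    binned = [f"trait_{i}_bin" for i in range(1, n_traits + 1)]
    return adjectives + scores + binned


# Columns removed from both datasets: demographics and the raw adjective ratings.
DROPPED_COLUMNS = ["age", "language", "genre"] + [
    f"adj_{i}_rating" for i in range(1, N_ADJECTIVES + 1)
]

FEATURE_COLUMNS = build_feature_columns()


def select_features(row, feature_columns=FEATURE_COLUMNS):
    """Keep only the retained features of one observation (a dict)."""
    return {name: row[name] for name in feature_columns}


# Usage: a dummy observation holding both retained and dropped columns.
example_row = {name: 0 for name in FEATURE_COLUMNS + DROPPED_COLUMNS}
example_features = select_features(example_row)
```

The same column list would apply to both the *No DA* and the *With DA* dataset; only the number of observations differs.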
