Personality Classification of Social Users Based on Feature Fusion
Abstract
1. Introduction
- HMA-CNN: we embed a multi-head self-attention mechanism into the CNN architecture. The text sequence is divided into multiple regions, the local feature representation of each region is learned in a cascade computation, and the regions are then gradually merged to model global feature relationships in a hierarchical manner.
- HA-BiLSTM: we use a word-level attention mechanism to generate sentence-level feature representations. The scattered posts are then combined into multiple sequence fragments of equal length, and a Bi-LSTM with sentence-level attention captures the temporal characteristics of the text sequence and the contribution of each post to the personality traits.
- HMA-CNN, HA-BiLSTM, and the original word-embedding module perform feature fusion in parallel. This compensates for the limitations of features extracted by any single model, makes full use of the rich semantic information in the text, and preserves the integrity and diversity of the features, thereby improving the efficiency and accuracy of the personality classification task.
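The parallel fusion in the last point can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: the vector dimensions, the single dense sigmoid layer, and the helper names (`fuse_features`, `classify`) are illustrative assumptions.

```python
import numpy as np

def fuse_features(word_vec, cnn_doc_vec, lstm_doc_vec):
    # Parallel fusion: concatenate the original word-embedding representation
    # with the HMA-CNN and HA-BiLSTM document vectors.
    return np.concatenate([word_vec, cnn_doc_vec, lstm_doc_vec])

def classify(fused, W, b):
    # One dense layer with a sigmoid, for a single binary Big Five trait.
    z = fused @ W + b
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
word_vec = rng.standard_normal(300)  # e.g. averaged Word2Vec embedding
cd = rng.standard_normal(128)        # HMA-CNN document vector CD
ld = rng.standard_normal(128)        # HA-BiLSTM document vector LD

fused = fuse_features(word_vec, cd, ld)          # 300 + 128 + 128 = 556 dims
prob = classify(fused, rng.standard_normal(556) * 0.01, 0.0)
```

Concatenation keeps each module's features intact instead of averaging them away, which is what "integrity and diversity of features" refers to above.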
2. Related Work
3. Materials and Methods
3.1. Personality Classification Model
3.2. Data Preprocessing
3.3. Feature Extraction
3.3.1. HMA-CNN
- Convolutional layers
- H-MHSA
3.3.2. HA-BiLSTM
- Word-Level Attention
- Bi-LSTM Layer
- Sentence-Level Attention
3.4. Feature Fusion and Classification
Algorithm 1 HMA-CNN
Input: social post text initialized with Word2Vec
Output: document vector CD
1: for k = 1, 2, …, kernel_num do
2:
3:
4:
5: end for
6:
7: for g in g_size do // g_size = [8, 4, 2]
8:
9:
10: end for
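As a rough illustration of the cascade in Algorithm 1, the sketch below applies self-attention within regions of size g, pools each region, and repeats with the next value from g_size = [8, 4, 2], so the receptive field grows from local to global. Single-head attention with identity Q/K/V projections is an assumption made for brevity; the paper's H-MHSA uses multiple heads and learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Scaled dot-product self-attention; identity projections for brevity.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

def hierarchical_attention(seq, g_sizes=(8, 4, 2)):
    # Attend within regions of size g, mean-pool each region to one vector,
    # then repeat with the next region size over the pooled sequence.
    x = seq
    for g in g_sizes:
        pad = (-len(x)) % g                      # pad so length divides by g
        if pad:
            x = np.vstack([x, np.zeros((pad, x.shape[1]))])
        regions = x.reshape(-1, g, x.shape[1])
        x = np.stack([self_attention(r).mean(axis=0) for r in regions])
    return x.mean(axis=0)                        # document vector

rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 16))  # 64 conv feature vectors, dim 16
cd = hierarchical_attention(tokens)
```

With 64 positions, the stages attend over regions of 8, then 4, then 2 pooled vectors, which is the "gradually expand the region" behavior described in the contributions.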
Algorithm 2 HA-BiLSTM
Input: social post text initialized with Word2Vec
Output: document vector LD // output the personality representation
1: for i = 1, 2, …, post_num do
2:
3:
4:
5: end for
6: end for
7:
8: for c = 1, 2, …, C do
9:
10: end for
11:
12: LD = Sentence-Level Att(S) // calculate with Sentence-Level Attention
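The word- and sentence-level attention steps of Algorithm 2 can be sketched with one attention-pooling helper. The fixed context vectors and additive-style scoring are illustrative assumptions, and the Bi-LSTM between the two levels is omitted for brevity.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(vectors, context):
    # Score each vector against a context vector (learned in the real model,
    # fixed here for illustration) and return the attention-weighted sum.
    scores = np.tanh(vectors) @ context
    weights = softmax(scores)
    return weights @ vectors

rng = np.random.default_rng(0)
dim = 8
word_ctx = rng.standard_normal(dim)
sent_ctx = rng.standard_normal(dim)

# Word-level attention: each post's word vectors -> one sentence vector.
posts = [rng.standard_normal((n_words, dim)) for n_words in (5, 7, 3)]
sentences = np.stack([attention_pool(p, word_ctx) for p in posts])

# (A Bi-LSTM would process `sentences` here; omitted in this sketch.)

# Sentence-level attention: sentence vectors -> document vector LD.
ld = attention_pool(sentences, sent_ctx)
```

The sentence-level weights are what quantify "the contribution of different posts to personality traits" in the contribution list.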
4. Experiment and Analysis
4.1. Dataset
4.2. Evaluation Metrics and Parameter Settings
4.3. Comparative Experiment on Length of Text Sequence
4.4. Comparative Experiment of Different Model Architectures and Baseline Models
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Mehta, Y.; Majumder, N.; Gelbukh, A.; Cambria, E. Recent trends in deep learning based personality detection. Artif. Intell. Rev. 2020, 53, 2313–2339.
- Digman, J.M. Personality structure: Emergence of the five-factor model. Annu. Rev. Psychol. 1990, 41, 417–440.
- Shun, M.C.Y.; Yan, M.C.; Zhiqi, S.; Bo, A. Learning personality modeling for regulating learning feedback. In Proceedings of the 2015 IEEE 15th International Conference on Advanced Learning Technologies, Hualien, Taiwan, 6–9 July 2015; pp. 355–357.
- Park, G.; Schwartz, H.A.; Eichstaedt, J.C.; Kern, M.L.; Kosinski, M.; Stillwell, D.J.; Ungar, L.H.; Seligman, M.E.P. Automatic personality assessment through social media language. J. Pers. Soc. Psychol. 2015, 108, 934–952.
- Xue, D.; Wu, L.; Hong, Z.; Guo, S.; Gao, L.; Wu, Z.; Sun, J. Deep learning-based personality recognition from text posts of online social networks. Appl. Intell. 2018, 48, 4232–4246.
- Lynn, V.; Balasubramanian, N.; Schwartz, H.A. Hierarchical modeling for user personality prediction: The role of message-level attention. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 5306–5316.
- Tadesse, M.M.; Lin, H.; Xu, B.; Yang, L. Personality predictions based on user behavior on the Facebook social media platform. IEEE Access 2018, 6, 61959–61969.
- Amirhosseini, M.H.; Kazemian, H. Machine learning approach to personality type prediction based on the Myers–Briggs Type Indicator®. Multimodal Technol. Interact. 2020, 4, 9.
- Han, S.; Huang, H.; Tang, Y. Knowledge of words: An interpretable approach for personality recognition from social media. Knowl. Based Syst. 2020, 194, 105550.
- Majumder, N.; Poria, S.; Gelbukh, A.; Cambria, E. Deep learning based document modeling for personality detection from text. IEEE Intell. Syst. 2017, 32, 74–79.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Sun, X.; Liu, B.; Cao, J.; Luo, J.; Shen, X. Who am I? Personality detection based on deep learning for texts. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018; pp. 1–6.
- Tandera, T.; Suhartono, D.; Wongso, R.; Prasetio, Y.L. Personality prediction system from Facebook users. Procedia Comput. Sci. 2017, 116, 604–611.
- Gao, S.; Li, W.; Song, L.J.; Zhang, X.; Lin, M.; Lu, S. PersonalitySensing: A multi-view multi-task learning approach for personality detection based on smartphone usage. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12 October 2020; pp. 2862–2870.
- Mnih, V.; Heess, N.; Graves, A. Recurrent models of visual attention. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 3, pp. 2204–2212.
- Luong, M.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, 17–21 September 2015; pp. 1412–1421.
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.; Le, Q. XLNet: Generalized autoregressive pretraining for language understanding. Adv. Neural Inf. Process. Syst. 2019, 32, 5754–5764.
- Conneau, A.; Lample, G. Cross-lingual language model pretraining. Adv. Neural Inf. Process. Syst. 2019, 32, 7059–7069.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461.
- Keh, S.S.; Cheng, I. Myers–Briggs personality classification and personality-specific language generation using pre-trained language models. arXiv 2019, arXiv:1907.06333.
- Jiang, H.; Zhang, X.; Choi, J.D. Automatic text-based personality recognition on monologues and multi-party dialogues using attentive networks and contextual embeddings (Student Abstract). In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 13821–13822.
- Yang, F.; Quan, X.; Yang, Y.; Yu, J. Multi-document transformer for personality detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 2–9 February 2021; Volume 35, pp. 14221–14229.
- Połap, D.; Włodarczyk-Sielicka, M. Classification of non-conventional ships using a neural bag-of-words mechanism. Sensors 2020, 20, 1608.
- Nagaoka, Y.; Miyazaki, T.; Sugaya, Y.; Omachi, S. Text detection using multi-stage region proposal network sensitive to text scale. Sensors 2021, 21, 1232.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781v3.
- Kosinski, M.; Matz, S.C.; Gosling, S.D.; Popov, V.; Stillwell, D. Facebook as a research tool for the social sciences: Opportunities, challenges, ethical considerations, and practical guidelines. Am. Psychol. 2015, 70, 543.
Model | Dataset | Approach | Feature Type | Remarks |
---|---|---|---|---|
Majumder et al. CNN [10] | Stream-of-consciousness essays | Deep-learning technique, hierarchical modeling | Semantic features extracted by CNN, document-level stylistic features | Average accuracy: 62.68% |
Tandera et al. LSTM + CNN 1D [13] | myPersonality | Deep learning + resampling technique, hierarchical modeling | Features extracted by combining LSTM and 1D CNN | 1. Different language features and resampling techniques were used to set up different scenarios. 2. Average accuracy: 62.71%. |
Michael et al. SNA + XGBoost [7] | myPersonality | Machine-learning technique | SNA features | 1. The study showed that user personality correlates with social-network interaction behavior. 2. The XGBoost classifier with SNA features achieved the highest prediction accuracy (71.00%), outperforming linguistic features. |
Xue et al. AttRCNN [5] | myPersonality | Deep-learning technique, hierarchical modeling | Deep semantic features extracted from AttRCNN, statistical linguistic features vectors | Average mean absolute error (MAE): 0.42768. |
Lynn et al. Sequence Networks + Attn [6] | Facebook status posts of 68,687 users | Deep learning technique, hierarchical modeling | Word- and message-level attention feature representation | Model based on message-level attention achieved the best average accuracy: 54.98%. |
Han et al. Random Forest [9] | Microblogs | Machine-learning technique | Personality lexicon combined keywords of microblogs and external knowledge | 1. Personality explanation model proposed to analyze relationships between text features of user microblogs and personality scores. 2. F1 score: 0.737 |
Keh et al. BERT [23] | MBTI personality datasets | Deep-learning technique | Semantic features extracted from BERT | 1. Accuracy: 0.47. 2. A fine-tuned BERT model was used for personality-specific language generation. |
Yang et al. Transformer-MD [25] | MBTI personality datasets | Transformer, MLP | Aggregated post feature representation; dimension-specific representation | 1. Transformer-MD captures the dependencies between social text posts without introducing post-order bias. 2. A dimensional attention mechanism captures the impact of posts on each personality dimension. |
HMAttn-ECBiL | myPersonality | Deep-learning technique, hierarchical and parallel modeling | Fusion features: word vector and two kinds of document vectors | 1. Hybrid model combines the original word-embedding vector and the proposed modules including HMA-CNN, HA-BiLSTM. 2. Highest average classification accuracy: 72.01%. |
Parameter | Value |
---|---|
batch_size | 32 |
learning_rate | 0.001 |
dropout rate | 0.5 |
embedding_size | 300 |
max_length | 400 |
num_filters | 128 |
g_size in H-MHSA | [8, 4, 2] |
number of heads in H-MHSA | [1, 2, 4] |
hidden_size | 128 |
dense_unit | 256 |
hidden activation | ReLU |
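For reference, the settings above can be gathered into a plain configuration dictionary. The key names are illustrative; only the values come from the table.

```python
# Hyperparameter settings from the table above (values from the paper;
# key names and the zip below are illustrative, not the paper's code).
config = {
    "batch_size": 32,
    "learning_rate": 0.001,
    "dropout_rate": 0.5,
    "embedding_size": 300,   # Word2Vec dimension
    "max_length": 400,       # text sequence length
    "num_filters": 128,
    "g_size": [8, 4, 2],     # region sizes across H-MHSA stages
    "num_heads": [1, 2, 4],  # attention heads per H-MHSA stage
    "hidden_size": 128,
    "dense_unit": 256,
    "hidden_activation": "relu",
}

# Each H-MHSA stage pairs one region size with one head count.
stages = list(zip(config["g_size"], config["num_heads"]))
```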
Model | Sequence Length | EXT | NEU | AGR | CON | OPN | Average Accuracy |
---|---|---|---|---|---|---|---|
HMAttn-ECBiL | 200 | 62.09%/0.73 | 53.04%/0.69 | 66.03%/0.79 | 61.23%/0.72 | 73.43%/0.86 | 63.16% |
 | 400 | 73.94%/0.79 | 62.14%/0.76 | 70.74%/0.83 | 68.65%/0.81 | 84.57%/0.91 | 72.01% |
 | 600 | 65.05%/0.69 | 55.02%/0.66 | 66.11%/0.76 | 62.82%/0.75 | 79.68%/0.89 | 65.74% |
Model | Module |
---|---|
ECBiL | CNN, Bi-LSTM + original word embedding module |
HMAttn-EC | HMA-CNN + original word embedding module |
HAttn-EBiL | HA-BiLSTM + original word embedding module |
HMAttn-CBiL | HMA-CNN + HA-BiLSTM |
HMAttn-ECBiL | HMA-CNN, HA-BiLSTM + original word embedding module |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, X.; Sui, Y.; Zheng, K.; Shi, Y.; Cao, S. Personality Classification of Social Users Based on Feature Fusion. Sensors 2021, 21, 6758. https://doi.org/10.3390/s21206758