Automated Classification of Evidence of Respect in the Communication through Twitter
Abstract
1. Introduction
1.1. Background
1.2. Our Focus and Related Research
1.3. Defining Respect
- Courteous: offering courtesy and exhibiting gracious good manners;
- Humble: having modesty, lacking in arrogance and pride;
- Honorific: showing honor or respect; and
- Reverent: feeling or showing profound respect or veneration.
1.4. Relationship between Sentiment and Expression of Respect
1.5. Our Contribution
- A new data set of tweets that, to the best of our knowledge, is the first open data set annotated with a focus on the expression of respect,
- A comparison of 14 selected approaches from the fields of deep learning, natural language processing, and machine learning used for the extraction of features and classification of tweet respectfulness in the new data set,
- Analysis of the correlation between tweet sentiment and respectfulness to answer the two questions of whether positive tweets are always respectful and whether negative tweets are always disrespectful, and
- An open release of our data and code, enabling full reproducibility of our experiments.
2. Methods
2.1. Analyzed Data and Annotation Scheme
2.2. Annotation Correlations
2.3. Feature Extraction Methods
2.3.1. Configuration of Models Which Used LSTMs
2.3.2. Configuration of Fine-Tuned Models
2.4. Cross-Validation
2.5. Machine Learning Classification
Classification Metrics
2.6. Software, Code, and Computing Machine
3. Results
3.1. Relationship between Respectfulness and Sentiment
3.2. Comparison of Classification Performance
4. Discussion
4.1. Relationship between Respectfulness and Sentiment
4.2. General View of Classification Performance
4.3. Worst-Best Model Comparisons
4.4. Comparison with Results from Other Studies
4.5. Example Practical Benefits of Carrying out Respect Analysis
5. Limitations of Our Study
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Commander, U.S. Pacific Fleet. Available online: www.cpf.navy.mil/downloads/2020/02/signature-behaviors.pdf (accessed on 7 July 2020).
- Gascón, J.F.F.; Gutiérrez-Aragón, Ó.; Copeiro, M.; Villalba-Palacín, V.; López, M.P. Influence of Instagram stories in attention and emotion depending on gender. Comunications 2020, 28, 41–50. [Google Scholar] [CrossRef] [Green Version]
- Waseem, Z.; Hovy, D. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter. In Proceedings of the NAACL Student Research Workshop, San Diego, CA, USA, 12–17 June 2016; Association for Computational Linguistics: Stroudsburg, PA, USA, 2016; pp. 88–93. [Google Scholar]
- Burnap, P.; Williams, M.L. Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making. Policy Internet 2015, 7, 223–242. [Google Scholar] [CrossRef] [Green Version]
- Zhang, Z.; Robinson, D.; Tepper, J. Detecting Hate Speech on Twitter Using a Convolution-GRU Based Deep Neural Network. In The Semantic Web (ESWC 2018); Springer Nature: Heraklion, Greece, 2018; pp. 745–760. [Google Scholar]
- Waseem, Z. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science; Association for Computational Linguistics: Copenhagen, Denmark, 2016; pp. 138–142. [Google Scholar]
- Kwok, I.; Wang, Y. Locate the Hate: Detecting Tweets against Blacks. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence AAAI’13, Bellevue, WA, USA, 14–18 July 2013; AAAI Press: Washington, DC, USA, 2013; pp. 1621–1622. [Google Scholar]
- Gambäck, B.; Sikdar, U.K. Using Convolutional Neural Networks to Classify Hate-Speech. In Proceedings of the First Workshop on Abusive Language Online; Association for Computational Linguistics: Vancouver, BC, Canada, 2017; pp. 85–90. [Google Scholar]
- Jaki, S.; De Smedt, T. Right-Wing German Hate Speech on Twitter: Analysis and Automatic Detection. arXiv 2019, arXiv:1910.07518. [Google Scholar]
- Sanguinetti, M.; Poletto, F.; Bosco, C.; Patti, V.; Stranisci, M. An Italian Twitter Corpus of Hate Speech against Immigrants. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018); European Language Resources Association (ELRA): Miyazaki, Japan, 2018. [Google Scholar]
- Frenda, S. Exploration of Misogyny in Spanish and English Tweets. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), Sevilla, Spain, 18 September 2018; pp. 260–267. [Google Scholar]
- United Nations. United Nations Strategy and Plan of Action on Hate Speech. Available online: www.un.org/en/genocideprevention/hate-speech-strategy.shtml (accessed on 6 July 2020).
- European Commission against Racism and Intolerance (ECRI) Standards. Available online: www.coe.int/en/web/european-commission-against-racism-and-intolerance/ecri-standards (accessed on 16 July 2020).
- Google. Google Scholar. Available online: http://scholar.google.com (accessed on 2 December 2010).
- Hambrick, M.E.; Simmons, J.M.; Greenhalgh, G.P.; Greenwell, T.C. Understanding Professional Athletes’ Use of Twitter: A Content Analysis of Athlete Tweets. Int. J. Sport Commun. 2010, 3, 454–471. [Google Scholar] [CrossRef]
- Kassing, J.W.; Sanderson, J. Fan–Athlete Interaction and Twitter Tweeting Through the Giro: A Case Study. Int. J. Sport Commun. 2010, 3, 113–128. [Google Scholar] [CrossRef]
- Yusof, S.Y.A.M.; Tan, B. Compliments and Compliment Responses on Twitter among Male and Female Celebrities. Pertanika J. Soc. Sci. Humanit. 2014, 22, 75–96. [Google Scholar]
- Clark, M. To Tweet Our Own Cause: A Mixed-Methods Study of the Online Phenomenon “Black Twitter”; University of North Carolina: Chapel Hill, NC, USA, 2014. [Google Scholar]
- Maros, M.; Rosli, L. Politeness Strategies in Twitter Updates of Female English Language Studies Malaysian Undergraduates. Lang. Linguist. Lit. 2017, 23. [Google Scholar] [CrossRef] [Green Version]
- Xu, W. From Shakespeare to Twitter: What are Language Styles all about? In Proceedings of the Workshop on Stylistic Variation; Association for Computational Linguistics: Copenhagen, Denmark, 2017; pp. 1–9. [Google Scholar]
- Fatin, M.F. The Differences Between Men And Women Language Styles In Writing Twitter Updates. Psychology 2014, 4, 1. [Google Scholar]
- Ciot, M.; Sonderegger, M.; Ruths, D. Gender Inference of Twitter Users in Non-English Contexts. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics: Seattle, WA, USA, 2013; pp. 1136–1145. [Google Scholar]
- Voigt, R.; Camp, N.P.; Prabhakaran, V.; Hamilton, W.L.; Hetey, R.C.; Griffiths, C.M.; Jurgens, D.; Jurafsky, D.; Eberhardt, J.L. Language from police body camera footage shows racial disparities in officer respect. Proc. Natl. Acad. Sci. USA 2017, 114, 6521–6526. [Google Scholar] [CrossRef] [Green Version]
- Giorgini, G.; Irrera, E. The Roots of Respect: A Historic-Philosophical Itinerary; De Gruyter: Berlin, Germany, 2017. [Google Scholar]
- Starkey, H. Democratic Citizenship, Languages, Diversity and Human Rights: Guide for the Development of Language Education Policies in Europe from Linguistic Diversity to Plurilingual Education: Reference Study; Council of Europe: Strasbourg, France, 2002. [Google Scholar]
- Duranti, A.; Goodwin, C. (Eds.) Rethinking Context: Language as an Interactive Phenomenon; Cambridge University Press: Cambridge, UK; New York, NY, USA, 1992. [Google Scholar]
- Adams, S.M.; Bosch, E.; Balaresque, P.; Ballereau, S.; Lee, A.C.; Arroyo-Pardo, E.; López-Parra, A.M.; Aler, M.; Grifo, M.S.G.; Brion, M.; et al. The Genetic Legacy of Religious Diversity and Intolerance: Paternal Lineages of Christians, Jews, and Muslims in the Iberian Peninsula. Am. J. Hum. Genet. 2008, 83, 725–736. [Google Scholar] [CrossRef] [Green Version]
- Modood, T. Moderate secularism, religion as identity and respect for religion. In Civil Liberties, National Security and Prospects for Consensus; Cambridge University Press: Cambridge, UK, 2012; pp. 62–80. [Google Scholar]
- Helm, B.W. Communities of Respect: Grounding Responsibility, Authority, and Dignity; Oxford University Press: Oxford, UK, 2017. [Google Scholar]
- Teuber, A. Kant’s Respect for Persons. Political Theory 1983, 12, 221–242. [Google Scholar] [CrossRef]
- Fabi, R. “Respect for Persons,” Not “Respect for Citizens”. Am. J. Bioeth. 2016, 16, 69–70. [Google Scholar] [CrossRef] [PubMed]
- Dillon, R.S. Respect for persons, identity, and information technology. Ethics Inf. Technol. 2009, 12, 17–28. [Google Scholar] [CrossRef]
- Hudson, S.D. The Nature of Respect. Soc. Theory Pract. 1980, 6, 69–90. [Google Scholar] [CrossRef]
- Chapman, L.M. Respectful Language. J. Psychol. Issues Organ. Cult. 2013, 3, 115–132. [Google Scholar] [CrossRef]
- Holtgraves, T.M. Language as Social Action: Social Psychology and Language Use; Lawrence Erlbaum Associates Publishers: Mahwah, NJ, USA, 2002; p. 232. [Google Scholar]
- Thompson, M. Enough Said: What’s Gone Wrong with the Language of Politics; St. Martin’s Press: New York, NY, USA, 2016. [Google Scholar]
- Wolf, R. Respect and disrespect in international politics: The significance of status recognition. Int. Theory 2011, 3, 105–142. [Google Scholar] [CrossRef]
- Beach, M.C.; Duggan, P.S.; Cassel, C.K.; Geller, G. What Does ‘Respect’ Mean? Exploring the Moral Obligation of Health Professionals to Respect Patients. J. Gen. Intern. Med. 2007, 22, 692–695. [Google Scholar] [CrossRef] [Green Version]
- Fiok, K. Krzysztoffiok/Twitter_Sentiment. Available online: https://github.com/krzysztoffiok/twitter_sentiment (accessed on 18 October 2020).
- Ross, B.; Rist, M.; Carbonell, G.; Cabrera, B.; Kurowsky, N.; Wojatzki, M. Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis. arXiv 2016, arXiv:1701.08118. [Google Scholar] [CrossRef]
- Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
- Ho, T.K. Random Decision Forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995. [Google Scholar]
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Sharma, M.; Sharma, S.; Singh, G. Performance Analysis of Statistical and Supervised Learning Techniques in Stock Data Mining. Data 2018, 3, 54. [Google Scholar] [CrossRef] [Green Version]
- Sharma, M.; Singh, G.; Singh, R. Design of GA and Ontology based NLP Frameworks for Online Opinion Mining. Recent Pat. Eng. 2019, 13, 159–165. [Google Scholar] [CrossRef]
- Kumar, P.; Gahalawat, M.; Roy, P.P.; Dogra, D.P.; Kim, B.-G. Exploring Impact of Age and Gender on Sentiment Analysis Using Machine Learning. Electronics 2020, 9, 374. [Google Scholar] [CrossRef] [Green Version]
- Pennebaker, J.W.; Boyd, R.L.; Jordan, K.; Blackburn, K. The Development and Psychometric Properties of LIWC2015; The University of Texas: Austin, TX, USA, 2015. [Google Scholar]
- Crossley, S.A.; Kyle, K.; McNamara, D.S. Sentiment Analysis and Social Cognition Engine (SEANCE): An automatic tool for sentiment, social cognition, and social-order analysis. Behav. Res. Methods 2016, 49, 803–821. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Pennington, J.; Socher, R.; Manning, C. Glove: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781v3. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762. [Google Scholar]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
- Conneau, A.; Kiela, D.; Schwenk, H.; Barrault, L.; Bordes, A. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. arXiv 2017, arXiv:1705.02364. [Google Scholar]
- Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. HuggingFace’s Transformers: State-of-the-Art Natural Language Processing. arXiv 2020, arXiv:1910.03771. [Google Scholar]
- Akbik, A.; Blythe, D.; Vollgraf, R. Contextual String Embeddings for Sequence Labeling. In Proceedings of the 27th International Conference on Computational Linguistics; Association for Computational Linguistics: Santa Fe, NM, USA, 2018; pp. 1638–1649. [Google Scholar]
- Sklearn.Feature_Selection.Mutual_Info_Classif—Scikit-Learn 0.24.0 Documentation. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_classif.html (accessed on 16 October 2020).
- Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations. arXiv 2020, arXiv:1909.11942. [Google Scholar]
- Yim, J.; Joo, D.; Bae, J.; Kim, J. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 7130–7138. [Google Scholar] [CrossRef]
- Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv 2019, arXiv:1908.10084. [Google Scholar]
- Bojanowski, P.; Grave, E.; Joulin, A.; Mikolov, T. Enriching Word Vectors with Subword Information. Trans. Assoc. Comput. Linguist. 2017, 5, 135–146. [Google Scholar] [CrossRef] [Green Version]
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692. [Google Scholar]
- Lample, G.; Conneau, A. Cross-Lingual Language Model Pretraining. arXiv 2019, arXiv:1901.07291. [Google Scholar]
- Fiok, K.; Karwowski, W.; Gutierrez, E.; Davahli, M.R. Comparing the Quality and Speed of Sentence Classification with Modern Language Models. Appl. Sci. 2020, 10, 3386. [Google Scholar] [CrossRef]
- Fiok, K. Krzysztoffiok/Respectfulness_in_Twitter. Available online: https://github.com/krzysztoffiok/respectfulness_in_twitter (accessed on 16 October 2020).
- Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 1–13. [Google Scholar] [CrossRef] [Green Version]
- González-Carvajal, S.; Garrido-Merchán, E.C. Comparing BERT against Traditional Machine Learning Text Classification. arXiv 2021, arXiv:2005.13012. [Google Scholar]
- Kowsari, K.; Meimandi, K.J.; Heidarysafa, M.; Mendu, S.; Barnes, L.E.; Brown, D.E. Text Classification Algorithms: A Survey. Information 2019, 10, 150. [Google Scholar] [CrossRef] [Green Version]
- Peters, M.E.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep Contextualized Word Representations. arXiv 2018, arXiv:1802.05365. [Google Scholar]
- Permanent Suspension of @Realdonaldtrump. Available online: https://blog.twitter.com/en_us/topics/company/2020/suspension.html (accessed on 16 January 2021).
- Facebook. Available online: https://www.facebook.com/zuck/posts/10112681480907401 (accessed on 14 January 2021).
Label Name | Label | Tweet Description
---|---|---
Disrespectful | 0 | Is aggressive and/or strongly impolite; seems "evidently" disrespectful.
Respectful | 1 | Is certainly not disrespectful; written in "standard" language without any evidently negative or positive attitude. If it is unclear whether a tweet is respectful or very respectful, it is labeled respectful.
Very respectful | 2 | Undoubtedly exhibits respect.
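The three-level scheme above maps directly to integer classes for training. A minimal sketch of that mapping; the names and helper function here are ours, not from the authors' released code:

```python
# Hypothetical encoding of the three-level respect annotation scheme.
RESPECT_LABELS = {
    "disrespectful": 0,
    "respectful": 1,
    "very_respectful": 2,
}

def encode_labels(annotations):
    """Map annotator label names to the integer classes used for training."""
    return [RESPECT_LABELS[a.strip().lower().replace(" ", "_")] for a in annotations]
```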
Method Name Adopted in This Study | Additional Description | Data-Specific Adaptability | Method for Obtaining Tweet-Level Embeddings | Source
---|---|---|---|---
Term Frequency | Top 300 features selected according to a mutual information method implemented in the Python scikit-learn module | Data-specific training required | Native output of features for the whole text data instance | [56]
SEANCE | Lexicon-based method, "Sentiment Analysis and Social Cognition Engine" | No data-specific training | Native output of features for the whole text data instance | [48]
LIWC | Lexicon-based method, "Linguistic Inquiry and Word Count" | No data-specific training | Native output of features for the whole text data instance | [47]
Albert Pooled | Tiny version of BERT, model version "base-v2" | No data-specific training | Mean of token embeddings | [57]
Distilbert Pooled | "Distilled" [58] version of BERT pre-trained to output sentence-level embeddings, model version "base-nli-stsb-mean-tokens" | No data-specific training | Mean of token embeddings | [59]
Roberta Pooled | Robustly pre-trained BERT ready to output sentence-level embeddings, model version "roberta-large-nli-stsb-mean-tokens" | No data-specific training | Mean of token embeddings | [59]
Fasttext LSTM | Token embeddings from Fasttext, model version "en-crawl", converted by an LSTM into tweet-level embeddings | Data-specific training required | Bidirectional LSTM | [60]
RoBERTa LSTM | Token embeddings from robustly pre-trained BERT, model version "roberta-large", converted by an LSTM into tweet-level embeddings | Data-specific training required | Bidirectional LSTM | [61]
Albert | Fine-tuned tiny version of the BERT transformer model, version "base-v2" | Data-specific training required | CLS token | [57]
BERT L C | Fine-tuned "BERT large cased" transformer model | Data-specific training required | CLS token | [54]
BERT L UNC | Fine-tuned "BERT large uncased" transformer model | Data-specific training required | CLS token | [54]
XLM-MLM-EN-2048 | Fine-tuned cross-lingual transformer model "MLM-EN-2048" | Data-specific training required | CLS token | [62]
XLM-RoBERTa-L | Fine-tuned cross-lingual transformer model based on robustly pre-trained BERT large | Data-specific training required | CLS token | [62]
RoBERTa L | Fine-tuned transformer model, robustly pre-trained BERT large | Data-specific training required | CLS token | [61]
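Several of the "Pooled" models above obtain a tweet-level embedding as the mean of token embeddings. A minimal NumPy sketch of masked mean pooling, assuming padded token embeddings plus an attention mask of the kind transformer tokenizers produce (function name and shapes are our illustration, not the authors' code):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings into one vector per tweet, ignoring padding.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid divide-by-zero
    return summed / counts
```

The fine-tuned models in the table instead take the hidden state of the special CLS token as the tweet representation, so no pooling step is needed there.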
Model | F1 | MCC
---|---|---
Term Frequency | 0.5800 | 0.4049
SEANCE | 0.6041 | 0.4301
LIWC | 0.6150 | 0.4537
Albert Pooled | 0.6387 | 0.5003
Distilbert Pooled | 0.6397 | 0.5037
Fasttext LSTM | 0.6827 | 0.5271
Roberta Pooled | 0.6523 | 0.5331
Albert | 0.7136 | 0.5871
BERT L C | 0.7165 | 0.5902
XLM-MLM-EN-2048 | 0.7263 | 0.6062
BERT L UNC | 0.7253 | 0.6062
RoBERTa LSTM | 0.7240 | 0.6104
XLM-RoBERTa-L | 0.7431 | 0.6249
RoBERTa L | 0.7350 | 0.6337
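Both metrics in the table are available in scikit-learn. A minimal sketch on a toy binary example; the averaging variant for F1 (macro here) is our assumption, not confirmed by the table:

```python
from sklearn.metrics import f1_score, matthews_corrcoef

y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]

# Macro-averaged F1 weights each class equally, which matters for the
# imbalanced respect classes; MCC accounts for all confusion-matrix cells.
f1 = f1_score(y_true, y_pred, average="macro")
mcc = matthews_corrcoef(y_true, y_pred)
```

MCC is often preferred over F1 for imbalanced data [109], which is why the table's ranking by MCC differs slightly from a ranking by F1 (e.g., RoBERTa L vs. XLM-RoBERTa-L).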
Respect Class

Term Frequency | 0 | 1 | 2
---|---|---|---
0 | 258 | 585 | 6
1 | 87 | 3584 | 59
2 | 10 | 251 | 124

RoBERTa L | 0 | 1 | 2
---|---|---|---
0 | 625 | 224 | 0
1 | 167 | 3473 | 90
2 | 0 | 212 | 173
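Aggregate statistics can be recovered directly from the two confusion matrices above. A short sketch, assuming rows are true classes and columns predicted classes (the orientation is our assumption, not stated in the table):

```python
import numpy as np

# Confusion matrices from the table; rows assumed true, columns predicted.
term_frequency = np.array([[258, 585, 6], [87, 3584, 59], [10, 251, 124]])
roberta_l = np.array([[625, 224, 0], [167, 3473, 90], [0, 212, 173]])

def accuracy(cm: np.ndarray) -> float:
    """Overall accuracy: correctly classified tweets over all tweets."""
    return float(np.trace(cm) / cm.sum())

def per_class_recall(cm: np.ndarray) -> np.ndarray:
    """Recall for each respect class: diagonal over row sums."""
    return np.diag(cm) / cm.sum(axis=1)
```

Under this reading, both models classify the same 4964 tweets, and RoBERTa L is notably stronger on the minority classes 0 and 2, consistent with its much higher MCC.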
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fiok, K.; Karwowski, W.; Gutierrez, E.; Liciaga, T.; Belmonte, A.; Capobianco, R. Automated Classification of Evidence of Respect in the Communication through Twitter. Appl. Sci. 2021, 11, 1294. https://doi.org/10.3390/app11031294