A Twitter-Based Comparative Analysis of Emotions and Sentiments of Arab and Hispanic Football Fans

Alhadlaq, Aseel; Alnuaim, Abeer

doi:10.3390/app13116729

by Aseel Alhadlaq * and Abeer Alnuaim

Department of Computer Science and Engineering, College of Applied Studies and Community Service, King Saud University, P.O. Box 22459, Riyadh 11495, Saudi Arabia

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(11), 6729; https://doi.org/10.3390/app13116729

Submission received: 25 March 2023 / Revised: 5 May 2023 / Accepted: 29 May 2023 / Published: 31 May 2023

Abstract

Twitter is one of the best online platforms for social interaction, introducing unique means of story-telling through tweets and enabling multiple approaches to the analysis of their content. This study was motivated by the increasing practice of incorporating Twitter into cultural studies and the research gap in Twitter-based cultural studies between emerging nations. This research aims to examine the emotional and sentimental cultural traits of Arabic and Hispanic viewers of a specific football match, as shown through their tweets, regardless of their distinct languages, to determine whether cultural diversity can be noticed in online interaction. Hundreds of tweets from both communities were translated into English as an intermediate language and then evaluated and contrasted using machine learning (ML) models. According to the research, Arabs are more collectivistic (as opposed to individualistic) and, as a result, exhibit less emotional arousal than Hispanics, which was partially supported by the collected Twitter data. This demonstrates how Twitter could play a key part in cultural research, and, therefore, this study contributes to cross-national comparative cultural research. We demonstrate that our method can also be used to evaluate the quality of machine translation based on how effectively it captures the emotions and sentiments of original languages.

Keywords: twitter; sentiments; emotions; machine learning; Arabs; Hispanic

1. Introduction

The emergence and global use of social media are the most notable manifestations of the world’s information and communication technology (ICT)-based development. In this new worldwide paradigm, data are exchanged at lightning speed over social networks, resulting in an endless number of activities, including the manipulation of public feelings and emotions. Due to the fertile ground formed by the specific ways in which social networks dominate the internet, processes such as the exchange of jeers, insults, feelings of hatred, admiration, or other human emotions have recently become more common with international events. This is despite the fact that such processes are not novel in the annals of communication. Twitter, in particular, can offer more. Twitter messaging can aid in establishing and preserving an ethical image, which lessens the impact of ethical crises [1]. It is currently one of the most prominent platforms worldwide [2]. It is among the best online platforms for social interaction, introducing novel methods of storytelling through tweets, for which several methods for examination are possible. First and foremost, tweets allow conversations between speakers of various languages, highlighting their critical role in removing linguistic and geographic barriers. This gives researchers access to broader and more in-depth research topics. As a result, over the past five years, a body of research on emotional communication on Twitter has emerged [3].

Currently, the popularity of Twitter generates a huge amount of behavioral data that may be used in many different ways. One is to understand how Twitter reflects the emotional and sentimental cultural aspects of various nations. Another is to determine whether there are significant cross-national differences in how emotional and sentimental tweets are expressed and, if so, how these differences relate to the cultural differences noted by researchers. Although several academics have examined this topic, few have provided cross-national comparisons [4]. We can therefore anticipate even fewer cross-national comparisons between developing world regions and nations, owing to the absence of financial support for technological and cultural advancement. Additionally, this may be due to a variety of linguistic factors [5]. In a group of languages, including Arabic, Turkish, and Hebrew, each input word can be composed of multiple lexical and functional elements, making natural language processing (NLP) extremely difficult [6]. Due to issues such as the grammatical complexity of the Arabic language, its variety of dialects, and the need for reliable data sources, there is a lack of Arabic-language research in NLP, particularly in summarization [7]. Amharic is among the languages with the fewest resources in ML. Therefore, it requires cost-effective strategies and enormous quantities of annotated training data, as well as techniques distinct from those used for the English language. Consequently, in developing nations, we can expect a small number of successfully developed NLP algorithms, which are frequently built on the foundation of machine learning (ML) algorithms.

This paper utilizes a mixed-method approach, which we believe to be the most suitable technique. First, all the tweets published in Arabic and Spanish during the Argentina vs. Saudi Arabia World Cup 2022 football match were gathered. Next, they were analyzed quantitatively in terms of the number of tweets and their distribution between Arabs and Hispanics. The qualitative-content-analysis technique was then used to probe into the texts to uncover cultural differences hidden within users’ emotional and semantic traits.

The proposed model comprises successive phases in which the outcome of each phase builds upon the previous phase’s result. We collected the tweets during the specified duration and separated them into Arabic and Spanish groups. Next, we translated both sets of texts into English as an intermediary language and performed an emotional and sentimental analysis of the translated material. The final step was to classify the gathered emotions as low or high in arousal, based on the distribution of basic emotions in the arousal–valence space. Subsequently, the data were examined in terms of the collectivistic levels of Arabs and Hispanics, as shown in the literature, and the results are presented in tables and charts. Translating both sets of tweets into English was the only way to compare them emotionally and sentimentally on equal terms. No single method was available to analyze emotions in both native languages: EmoRoberta cannot analyze Arabic texts for emotions, although it can analyze Spanish texts. Applying EmoRoberta to native Spanish texts and to English translations of Arabic texts would have made the comparison unfair.

In our research, we used both a sentiment analysis and an emotion analysis to compare Arabs and Hispanics regarding their tweets during the Saudi Arabia vs. Argentina match in the Fédération Internationale de Football Association (FIFA) World Cup in Qatar, in 2022. We performed both analyses rather than choosing one over the other because they are two independent methodologies that generate unique insights. Emotion analysis provides a finer degree of granularity than sentiment analysis and can be viewed as an additional layer on top of the comparatively simple sentiment analysis. Therefore, emotion- and sentiment-analysis models may be considered mutually supportive. Both emotions and sentiments often contribute to the building of a collectivist culture within a community. Thus, in our research, we seek to expand the scope of cross-national social media research by discussing the emotional and sentimental cultural components of Arab and Hispanic tweets. Our study exemplifies the use of Twitter data to explore research subjects that have rarely been addressed due to a lack of data and technology. It reveals how cross-cultural variances affect Twitter users’ messages and makes it easier to distinguish between Twitter users with diverse cultural origins.

To the best of our knowledge, this is the first study to systematically examine and compare emotional and sentimental indicators in the content of Twitter tweets published in various languages. In addition, while occupying a major percentage of the globe (approximately one billion people), the Arab world and Latin America are underrepresented in social and cultural psychological research. This is therefore the first study to compare the emotional and sentimental cues in the content of Twitter tweets between two groups from the developing world. From the generalization perspective, since the main focus of this study is on the sentiments and emotions that language constructs contain, it can be generalized to all Arab countries. They can also be more specific to a single Arab country. The same holds for Latin American nations that speak Spanish. Moreover, this cross-national comparison between the two communities will help to reassess the current dimension of sentiments as positive, negative, or neutral, providing a more sophisticated approach to this question. It will also help to reassess emotional dimensions such as admiration, amusement, anger, etc., from a more sophisticated perspective.

This study was motivated by the growing practice of incorporating Twitter into cultural studies. This inspired us to investigate and compile information on the cultural distinctions between Arabs and Hispanics, as evidenced by their tweets. These two communities are both known to originate in emerging regions: the Arab world and Latin America. There are no Twitter-based cultural comparisons between these nations. This was a further research gap we were inspired to fill. The intense interest among Arabs and Hispanics in the World Cup match between Saudi Arabia and Argentina sparked this motivation. In addition, there is a perception that contemporary societies are complex and that these complexities have affected their cultures. Therefore, the categorization of Arabs and Hispanics as collectivistic or individualistic may be significantly altered. As a result, the comparison of Arabs and Hispanics will help researchers to reconsider current cultural dimensions.

Consequently, the purpose of this study is to investigate the emotional and sentimental cultural traits of viewers from Arabic- and Spanish-speaking communities, as observed in their tweets about a specific football match. We aimed to determine whether, even though they speak different languages, cultural distinctions can be seen in the online communication of the respective communities.

The remainder of this paper is structured as follows. In Section 2, we present the literature related to our own research. Next, in Section 3, we describe the methodology we employ. Subsequently, in Section 4, we report the implementation of the proposed methodology, through which we obtained the relevant tweets, translated them, and performed an emotion and sentiment analysis of their content. Next, in Section 5, we present a detailed discussion, in which we correlate the findings of the data analysis with the findings in the literature and highlight the results accordingly. Section 6 concludes with a summary of the results, the limitations of the work, and a discussion of our contribution.

2. Related Work

Few studies have been undertaken on the cross-national emotional and sentimental characteristics of Twitter tweets. Fewer still have been carried out on developing nations. One study [8] investigated cross-cultural emotional differences regarding urban greenspace (UGS), as disclosed in English and German tweets. A sentiment analysis was conducted on the collected tweets to determine the sentiment values and their corresponding tweet numbers. The results indicate that different emotions are elicited by different types of UGS, and that English and German demands were distinct, as evidenced by their tweets, with the highest sentiment values in gardens and parks, respectively. The activity environment contributed most to positive emotions, regardless of cultural differences. The findings of the study indicate that human emotions can show whether UGS supply satisfies human needs and that particular contextual factors can promote positive human emotions to sustain their needs in a cross-cultural context. In [9], researchers investigating how affective cultural values may influence social media use applied a novel sentiment-analysis technique that enables a comparison of the emotive contents of Twitter messages in the United States and Japan. The study revealed that Japanese users primarily produced low (vs. high)-arousal postings, whilst U.S. users mostly produced positive (vs. negative) posts, in line with their respective cultural and emotional values. However, in contrast to their affective cultural values, the Japanese users were more affected than the U.S. users by changes in others’ high-arousal positive posts (including feelings such as excitement), and more so than by changes in their high-arousal negative posts (including feelings such as fury). When accounting for variations in baseline exposure to emotive content across various themes, these trends persisted. The authors claimed that, across cultures, social media users are affected by culturally relevant content that contradicts their affective values. An empirical study on how Twitter users employ emojis was presented in [10]. The research used a comprehensive, cross-regional data set from Twitter to perform the analysis. The authors employed distributional semantic models to express emoji semantics and contrasted country-specific emoji models. It was observed that the categories and frequencies of the emojis expressed by users could serve as rich sources of data for understanding cultural differences between Twitter users from a wide variety of demographics. The study indicated that the preferred usage of emojis conforms to Hofstede’s cultural dimensions model, in which different cultural dimensions within countries demonstrate considerably diverse uses of emojis to express emotions.

From another perspective [11], researchers investigated how people’s use of emoticons on Twitter is influenced by cross-cultural variations. By combining Twitter-emoticon-usage patterns with Hofstede’s culture index, the authors found that people from collectivistic cultures favor vertical and eye-oriented emoticons, whereas people from individualistic cultures prefer horizontal and mouth-oriented emoticons. In [12], people’s emotions about the Russia–Ukraine war (RUW), captured from Twitter tweets, were examined, and a framework was proposed for automatically categorizing the many societal emotions on Twitter using a pretrained ML technique, EmoRoberta. The model extracted 27 distinct emotions exhibited by Twitter users, which were then classified using machine-learning techniques. The study found that 81% of the Twitter users who participated in the survey had a neutral opinion of the RUW; however, there were hints concerning countries other than Russia and Ukraine, including Slovakia and the United States. The majority of the tweets described the RUW with terminology more closely associated with Ukraine than with Russia. In [13], key clinical indicators of the advancement of the COVID-19 pandemic were compared with indicators of public perceptions of the pandemic revealed from 20 million related tweets in a certain period. There were signs of psychophysical characteristics: Twitter users were becoming increasingly interested in death, but their tone was shifting away from passion and towards reason. Word co-occurrences were analyzed semantically to reveal variations in the affective context of COVID-19 fatalities. Their calculated parameters agreed with the estimations from the psychological experiments. The research demonstrated that users’ tweets differ in their sensitivity to national COVID-19 mortality rates based on their country.

Few ML-based cross-national studies on the emotional and sentimental characteristics of Twitter material have been conducted. Fewer still have been conducted on emerging nations, with the majority performed on developed nations. One study [8] focused on English and German tweets. Another [9] focused on user tweets from the USA and Japan. The authors of [10] compared the usage of emojis, not the text, in tweets by users from many countries, of which four were developing countries: Indonesia, Brazil, the Philippines, and Mexico. To contribute to the closing of this research gap, we based our comparison of Arab and Hispanic tweets on text rather than emojis. Furthermore, the authors of [10,11] used Hofstede’s cultural dimensions as a basis for cultural comparison, while others did not. This is comparable to our method, in which we used only one Hofstede dimension (individualism vs. collectivism (IDV)), as we found that IDV is the most frequently studied Hofstede dimension to date. The authors of [12] assessed users’ emotions in captured tweets by using the pre-trained ML model EmoRoberta, which is identical to our strategy. The authors of [13] examined how public emotions relating to the COVID-19 pandemic were revealed by 20 million tweets on this topic during a specific time frame, and compared their findings to those of psychological experiments. This is comparable to the way in which we conducted our research, which involved mapping our findings to descriptions of various cultural traits in the literature. In addition, all the previous studies used Twitter text messages in English as a basis for their analyses and comparisons, except [10], which used the emojis in the tweets examined. In contrast, we considered textual Twitter messages in languages other than English and translated them into English before conducting the emotion analysis. While more research is required in this area, we believe that the agreement between our study’s findings and the literature regarding the collectivistic differences between Arabs and Hispanics is one indication that the translation preserved the emotions in the Twitter content.

3. Methodology

To achieve the goal of this work, our suggested method for extracting and classifying societal emotions entails many interconnected components. The proposed method’s workflow is depicted in Figure 1.

As mentioned above, in this paper, we utilize a mixed-method approach, which we believe to be the most suitable technique. First, all tweets published in Arabic and Spanish during the Argentina vs. Saudi Arabia World Cup 2022 football match were gathered. Next, they were analyzed quantitatively in terms of the number of tweets and their distribution between Arabs and Hispanics. The qualitative-content-analysis technique was then used to probe into the text to uncover cultural differences hidden within users’ emotional and semantic traits.

Within this methodology, our research adopted a novel approach. We compared two non-English-speaking communities from the developing world (Arab and Hispanic) based on their Twitter activity. We translated their tweets from their original languages into English, and then used NLP-based ML algorithms designed for English text to culturally assess these tweets and compare the sentiments and emotions of the two groups. Furthermore, in our approach, the problem of detecting users’ physical locations from Twitter-based public information was addressed through the language of the gathered tweets, which indicated whether a tweet’s author was Arab or Hispanic and was therefore used as a clue to the author’s home nation. However, since the sentimental and emotional contexts of tweets may change when they are translated from one language to another, we utilized the study’s results as a guide to establish how effectively the translation preserved sentiments and emotions at a suitable level for cultural comparisons. Following translation, the tweets were emotionally rated using an English-based machine-learning model and presented in descriptive charts. Next, the emotions collected from the tweets were classified in terms of their arousal level, high vs. low, in accordance with the fundamental emotion distribution in the arousal–valence space [14,15]. We then addressed how these data can be understood in terms of the nation-based cultural dimensions proposed by psychological researchers. Finally, where the results were consistent with the literature on the cultural distinction between Arabs and Hispanics, we viewed this as evidence of how effectively, to some extent, the machine translation preserved the sentimental and emotional cues of the context. We provide more details in the following sections.

It is important to highlight that translating both sets of tweets into English was the only way to compare Arabic and Spanish tweets emotionally and sentimentally on equal terms. No single method was available to analyze emotions in both native languages: EmoRoberta cannot analyze Arabic text for emotions, although it can analyze Spanish text. Applying EmoRoberta to native Spanish text and to an English translation of Arabic text would have made the comparison unfair.

3.1. Capturing Tweets

We captured all the tweets written in Arabic and Spanish that were posted during the period of the Argentina vs. Saudi Arabia football match in the FIFA World Cup in Qatar, in 2022. We chose 22 November 2022 as the starting date, which was the day of the Argentina vs. Saudi Arabia match (the day when the trend “Where is Messi” in the Arabic language started), and 23 November 2022 as the ending date, which was one day after the match. We developed a Python script to communicate programmatically with Twitter via its Application Programming Interface (API) for developers. The script takes as input a key phrase, a start date, and an end date, and returns all tweets containing the specified key phrase throughout the specified duration. The tweet-gathering period was marked by a high level of tension among both Arabic-speaking and Spanish-speaking users. To facilitate the cross-cultural analysis, we retrieved the Twitter data using the Python script with specific filtering. Therefore, only users watching the Argentina vs. Saudi Arabia football match from hundreds of countries and regions were represented. As the aim of this work was to study emotions and sentiments on Twitter for Arabic- and Spanish-language speakers only, we needed to rule out tweets in any other language. Therefore, we parsed all the tweets posted by unique Arabic and Spanish users in the selected period. We crawled all the Arabic and Spanish Twitter data throughout the collection period with the key phrase “Where is Messi” written in either Arabic or Spanish. Messi is an Argentinian football player known to football fans worldwide.
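The collection step can be sketched as follows. This is a minimal sketch rather than the authors' script: it assumes the Twitter API v2 full-archive search endpoint with its `query`, `start_time`, `end_time`, and `max_results` parameters, and the helper name `build_search_request` is hypothetical. It only builds the request; sending it would require valid API credentials.

```python
from datetime import datetime, timezone

# Assumed endpoint: Twitter API v2 full-archive search (needs suitable access).
SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"


def build_search_request(key_phrase, lang, start, end, max_results=100):
    """Return (url, params) for tweets containing key_phrase in one language."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    # Quoting the phrase asks for an exact match; `lang:` restricts the language.
    query = f'"{key_phrase}" lang:{lang}'
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    params = {
        "query": query,
        "start_time": start.astimezone(timezone.utc).strftime(fmt),
        "end_time": end.astimezone(timezone.utc).strftime(fmt),
        "max_results": max_results,
        "tweet.fields": "created_at,lang,author_id",
    }
    return SEARCH_URL, params


# Window from the text: 22 to 23 November 2022. Using UTC and an exclusive
# end of 24 November (to cover all of the 23rd) are assumptions of this sketch.
start = datetime(2022, 11, 22, tzinfo=timezone.utc)
end = datetime(2022, 11, 24, tzinfo=timezone.utc)
url, params = build_search_request("Donde está Messi", "es", start, end)
print(params["query"])
```

The same helper would be called a second time with the Arabic key phrase and `lang="ar"`.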

3.2. Splitting the Tweets

We divided the tweets into two classes based on the language of the terms contained within them and saved them in two separate tables: the “Messi_ar” table for tweets using the phrase “Where is Messi”, which became popular in Arabic (5186 tweets), and the “Messi_es” table for tweets including the Spanish phrase “Donde está Messi”, in order to observe the Spanish-speaking population’s use of the expression (398 tweets).
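A split of this kind might look like the sketch below. It assumes each tweet is a dictionary with `id`, `lang`, and `text` fields (an assumption about the record layout, not something stated in the paper), and the function name `split_by_language` is hypothetical. The table names follow the text.

```python
def split_by_language(tweets):
    """Split tweet records into Arabic and Spanish tables, dropping the rest."""
    tables = {"Messi_ar": [], "Messi_es": []}
    seen_ids = set()
    for tweet in tweets:
        if tweet["id"] in seen_ids:
            continue  # skip duplicate records of the same tweet
        seen_ids.add(tweet["id"])
        if tweet.get("lang") == "ar":
            tables["Messi_ar"].append(tweet)
        elif tweet.get("lang") == "es":
            tables["Messi_es"].append(tweet)
        # Tweets in any other language are ruled out.
    return tables


# Made-up records for illustration only.
sample = [
    {"id": 1, "lang": "es", "text": "Donde está Messi"},
    {"id": 2, "lang": "en", "text": "Where is Messi"},
    {"id": 1, "lang": "es", "text": "Donde está Messi"},
]
tables = split_by_language(sample)
print(len(tables["Messi_ar"]), len(tables["Messi_es"]))
```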

3.3. Translation into English

Because the collected tweets were in Arabic and Spanish, they needed to be translated into English as an intermediary language so that we could compare Arabic and Spanish tweets in emotion and sentiment dimensions. To this end, we developed a Python script that interacted with Google Translate. The software received Arabic and Spanish tweets as inputs and delivered them translated into English. This allowed us to compare the tweets using English-based ML algorithms for sentiment analysis and emotion recognition that are already available.
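The translation step can be outlined without committing to a particular client library, since the text does not say how the script reached Google Translate. In the sketch below, `translate_fn` stands in for whatever translation call is used, and `translate_tweets` and `fake_translate` are hypothetical names; the stub exists only so that the example runs offline.

```python
def translate_tweets(texts, translate_fn, target="en"):
    """Translate each text once, reusing results for repeated tweets."""
    cache = {}
    translated = []
    for text in texts:
        if text not in cache:
            cache[text] = translate_fn(text, target)
        translated.append(cache[text])
    return translated


def fake_translate(text, target):
    """Offline stand-in for a real translation service (illustration only)."""
    lookup = {"Donde está Messi": "Where is Messi"}
    return lookup.get(text, text)


english = translate_tweets(
    ["Donde está Messi", "Donde está Messi"], fake_translate
)
print(english)
```

Caching repeated texts is a design choice of this sketch: a trending key phrase tends to produce many identical tweets, so each distinct text needs to be sent to the service only once.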

It is known that translated emotion and sentiment data are likely to contain samples that are not indicative of the sentiment or emotion assigned to them in their source language [16]. Although the level of preservation may be deemed sufficient for the present cross-national emotional comparison of messages on Twitter, more research is required.

3.4. Emotion Recognition

Emotion analysis was conducted by using another NLP approach that involves the extraction and analysis of emotions from a text. For this work, we used the EmoRoberta model, which uses the cutting-edge Roberta approach with few changes to its key hyperparameters [17]. Both Roberta and EmoRoberta build on Google’s well-known Bidirectional Encoder Representations from Transformers (BERT) model [18]. Roberta subsequently surpassed the BERT model as the best pre-trained model for use in text classification tasks [19]. EmoRoberta classifies a text into 28 emotion categories (admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, and neutral) [20].

We built a Python script that used this model and applied it to our translated English texts. The output of this analysis illustrated the percentage of references to each emotion in the tweets collected during and one day after Saudi Arabia and Argentina’s 2022 FIFA World Cup match in Qatar.
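The percentage output described above can be illustrated with a small aggregation step. This sketch assumes the emotion classifier has already produced one label per translated tweet; the function name `emotion_percentages`, the choice to leave out the neutral class, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter


def emotion_percentages(labels, skip=("neutral",)):
    """Return the percentage share of each emotion among the kept labels."""
    kept = [label for label in labels if label not in skip]
    if not kept:
        return {}
    counts = Counter(kept)
    total = len(kept)
    return {
        emotion: round(100 * count / total, 1)
        for emotion, count in counts.most_common()
    }


# Made-up labels standing in for classifier output.
labels = ["curiosity", "curiosity", "neutral", "amusement", "neutral"]
print(emotion_percentages(labels))
```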

3.5. Sentiment Analysis

Sentiment analysis is used to determine whether a text is positive, negative, or neutral. For this task, we used ‘twitter-roberta-base-sentiment-latest’ [21], which is a machine-learning model trained over ~124 million tweets. We built a Python script that used this model and applied it to our translated English texts. When a text was input, the script generated categorization probabilities for whether it was positive, negative, or neutral.
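How raw model scores become categorization probabilities can be sketched as below. The mention and link masking follows the preprocessing commonly shown with this model family, and the label order (negative, neutral, positive) is an assumption that should be checked against the model configuration. The logits are made up; no model is loaded here.

```python
import math

# Assumed label order; verify against the model's own configuration.
LABELS = ("negative", "neutral", "positive")


def preprocess(text):
    """Mask user mentions and links before classification."""
    tokens = []
    for token in text.split(" "):
        if token.startswith("@") and len(token) > 1:
            token = "@user"
        elif token.startswith("http"):
            token = "http"
        tokens.append(token)
    return " ".join(tokens)


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable label and the full probability table."""
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], dict(zip(LABELS, probs))


label, probs = classify([0.1, 2.0, -1.0])  # illustrative logits only
print(preprocess("@someone Where is Messi https://example.com"), label)
```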

3.6. Defining Emotional Arousal Level (High vs. Low)

Emotional arousal level is one of the significant culturally variable elements of emotion. Consistently, cross-cultural disparities in emotional arousal intensity have been discovered. There are clear distinctions between social cultures in terms of collectivistic vs. individualistic tendencies.

The collected emotions were categorized as low- or high-arousal emotions based on the distribution of fundamental emotions in the arousal–valence space proposed in [14,15]. This emotional classification was used to distinguish between Arabs and Hispanics based on the distinction between their respective levels of collectivism.
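The arousal classification amounts to a lookup from each emotion to an arousal level, followed by an aggregation. In the sketch below, the mapping `AROUSAL` is an illustrative placeholder and not the paper's Table 1, the counts are made up, and `arousal_shares` is a hypothetical helper; `None` marks an emotion whose arousal level is undetermined.

```python
# Illustrative placeholder mapping, NOT the paper's Table 1.
AROUSAL = {
    "anger": "high",
    "excitement": "high",
    "sadness": "low",
    "confusion": None,  # arousal level undetermined
}


def arousal_shares(emotion_counts, arousal_map):
    """Return the percentage of tweets at each arousal level."""
    totals = {"high": 0, "low": 0, "undetermined": 0}
    for emotion, count in emotion_counts.items():
        level = arousal_map.get(emotion) or "undetermined"
        totals[level] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {level: 0.0 for level in totals}
    return {
        level: round(100 * count / grand_total, 1)
        for level, count in totals.items()
    }


# Made-up counts for illustration only.
counts = {"anger": 3, "excitement": 2, "sadness": 4, "confusion": 1}
print(arousal_shares(counts, AROUSAL))
```

Computing the shares separately for each language group would give the high- versus low-arousal comparison that the method relies on.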

The flowchart of the previously mentioned Python scripts is depicted in Figure 2.

A link to the scripts can be found in [22].

4. Results

This section presents the descriptive results of both the sentiments and the emotions captured from the Twitter dataset regarding the football match between Saudi Arabia and Argentina in the FIFA World Cup in Qatar, in 2022.

4.1. Emotion Recognition

The emotion analysis was conducted by using the NLP approach, which involves the extraction and analysis of emotions from a text. For this work, as mentioned above, we used the EmoRoberta model. EmoRoberta builds on the cutting-edge Roberta approach and classifies texts into 28 emotion categories. Figure 3 depicts a magnified view of the collected emotions, revealing that we were able to extract 14 emotions from the translated tweets by Arabs and Hispanics on the 2022 FIFA World Cup match between Saudi Arabia and Argentina. Among the total number of tweets collected during the specified time (5186 tweets in Arabic and 398 in Spanish), 811 tweets in Arabic and 94 tweets in Spanish were emotionally classified. These tweets contained the key phrase, “Where is Messi”, in both Arabic and Spanish after they were translated into English. The emotional contexts of the remaining tweets could not be determined or were lost owing to translation, so they were omitted from the results.
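The share of tweets that received an emotion label follows directly from the counts reported above. The short calculation below uses only those four numbers; the variable names are illustrative.

```python
# Counts reported in the text.
collected = {"Arabic": 5186, "Spanish": 398}
classified = {"Arabic": 811, "Spanish": 94}

# Percentage of collected tweets that were emotionally classified.
coverage = {
    language: round(100 * classified[language] / collected[language], 1)
    for language in collected
}
print(coverage)  # {'Arabic': 15.6, 'Spanish': 23.6}
```

Roughly one in six Arabic tweets and just under one in four Spanish tweets were therefore emotionally classified.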

4.2. Sentiment Analysis

Sentiment analysis is used to determine whether a text is positive, negative, or neutral. As mentioned above, we used “twitter-roberta-base-sentiment-latest” [21], which is a machine-learning model trained over ~124 million tweets. For each of the neutral, positive, and negative classes, we retrieved the sentiment values and their percentages of the total number of collected tweets (5186 tweets in Arabic and 398 in Spanish). These tweets contained the key phrase, “Where is Messi”, in both Arabic and Spanish, after they were translated into English. Figure 4 depicts the percentage view of the retrieved sentiments.

In the figure, three sentiment dimensions, namely neutral, positive, and negative, were used to characterize the tweets from Arabic- and Spanish-speaking users. In terms of the neutral dimension in the collected tweets, the Spanish-speaking users were much more neutral and sentimentally indifferent than the Arab users (72% vs. 50%). The users identified as Arabs were more likely to be sentimentally positive than the Spanish-speaking users (26% vs. 8%). Additionally, the users identified as Arabs were more likely to be sentimentally negative than the Spanish-speaking users (25% vs. 21%).

4.3. Emotions Classification (High Arousal vs. Low Arousal)

The fourteen emotions taken from the tweets and shown in Figure 3 are given in Table 1, where their arousal level is determined following the fundamental emotion distribution in the arousal–valence space [14], the most thorough version of which was proposed in [15]. As mentioned above, this emotional classification was utilized to distinguish between Arabs and Hispanics based on the distinction between their respective levels of collectivism.

The emotions marked with question marks in the table are those whose arousal levels could not be determined from the literature.

5. Discussion

This study aims to determine the sentiments and emotions expressed in Twitter tweets by Arab and Hispanic fans speaking different languages during the Saudi Arabia vs. Argentina match at the 2022 FIFA World Cup in Qatar to gain a better understanding of how Twitter tweets represent the cultural characteristics of both Arab and Hispanic societies. To achieve this objective, we analyzed publicly accessible data from Twitter tweets containing a specific key phrase indicating that the authors were viewers of the Saudi Arabia vs. Argentina football match. We emphasized that the key-phrase language (Arabic or Spanish) is the most important piece of information for determining to which community a person belongs. We utilized Google Translate to convert all the Arabic and Spanish tweets containing the key phrase, “Where is Messi”, to a common language, English, so that we could use prebuilt English-based machine learning (ML) techniques for the sentiment and emotion analysis. We examined the results from the following two perspectives, which complemented one another: the perspective of the individualism–collectivism dimension, IDV, proposed by Hofstede, which is one of the cultural dimensions outlined in his model; and the perspective of the affective (emotional) vs. neutral dimension, which is one of the cultural dimensions highlighted by Hampden-Turner in his model. Thus, the concept behind this study is also included in works on social networks and how to use them to assess cultural differences between nations that do not necessarily share the same language.

5.1. Emotion Analysis

5.1.1. Individualism vs. Collectivism

According to Hofstede [23], the values that differentiate nations can be statistically categorized into five clusters: power distance (PDI), individualism versus collectivism (IDV), masculinity versus femininity (MAS), uncertainty avoidance (UAI), and long-term orientation (LTO). Of these aspects, we consider only individualism vs. collectivism (IDV) to compare Arab and Hispanic football fans, as IDV is the most frequently studied Hofstede dimension to date [24], especially in development communities. Individualist (high-IDV) cultures, according to this model, are distinguished by their emphasis on personal liberties and accomplishments. People in such communities are typically socialized to be independent, autonomous, and competitive. On the other hand, individuals raised in strongly collectivist societies typically place more value on interdependence than independence, emphasizing communal harmony over individual success. Personal interest and psychological autonomy are generally put aside in favor of the cohesion and welfare of the larger social group. Note that a low IDV characterizes more collectivist societies with close ties between individuals.

According to Hofstede’s model, the Arab world has an IDV score of 38. Latin America has no IDV score of its own, since Hofstede did not calculate cultural dimensions for Latin America as a single entity, as he did for the Arab world. We therefore used Argentina’s score, 46, to represent the Hispanics, as the match in question was between Argentina and Saudi Arabia. The IDV score of the Arabic-speaking fans is thus lower than that of the Spanish-speaking fans (38 vs. 46). Since a high IDV suggests a low degree of societal interdependence, whereas a low IDV suggests a more collectivist outlook, Arabs are more collectivistic than Hispanics.

Additionally, according to [25], Arabic- and Spanish-speaking individuals both live in collectivistic cultures. Nonetheless, there may be some cultural differences between them, as Latin America has low collectivism scores while the Arab world has moderate scores. In the same vein, a comparison between nations in terms of collectivism [26] revealed that, in general, people from Arab countries, Eastern Europe, Africa, and, to a lesser extent, Latin America accept collectivist beliefs more strongly. Although the differences are moderate, they are more obvious than in the case of individualist ideas.

As a result, it appears from the literature that Arab nations and, to a lesser extent, Latin Americans accept collectivism. Arabs are therefore more collectivistic than Latin Americans. Figure 5 depicts the relationship between individualism and collectivism and how Arabs and Hispanics fit within it.

5.1.2. High-Arousal vs. Low-Arousal Emotions

From a different perspective, according to [27], high-arousal emotions are more prevalent among Westerners than low-arousal emotions. In contrast, people in Eastern or collectivist cultures experience and prefer low-arousal emotions more than high-arousal emotions. The author further claims that in Western or individualist cultures, high-arousal emotions are rewarded and encouraged more than low-arousal emotions, whereas in Eastern and collectivist cultures, low-arousal emotions are valued more highly than high-arousal emotions [27]. In summary, these arguments suggest that emotional arousal decreases as the degree of collectivism increases within the collectivistic range.

According to the global cross-cultural research discussed above, the Arab world is more collectivistic than South America; hence, high-arousal emotions are expected to be more widespread among Hispanics than low-arousal emotions, while the opposite is expected among Arabs. The data in Table 1 were examined in terms of the collectivistic level of Arabs and Hispanics, and the results are presented in Table 2. For instance, amusement is classified in the literature as a high-arousal emotion. In our results, the Spanish percentage was greater than the Arab percentage (44.44% vs. 22.28%), in line with research that found that high-arousal emotions are more common among Hispanics. Accordingly, we marked amusement in the table as consistent. The second emotion in the table, admiration, is classified in the literature as a low-arousal emotion. The Spanish percentage was lower than the Arab percentage (8.08% vs. 15.11%), in line with research that found that low-arousal emotions are less common among Hispanics, so we marked admiration as consistent. Likewise, the third emotion, sadness, is classified as a low-arousal emotion, and the Spanish percentage was again lower than the Arab percentage (6.06% vs. 14.13%), so we marked it in the table as consistent.

The results revealed that, of the 14 emotions extracted from the tweets, ten were classified according to the arousal–valence space: amusement, admiration, sadness, joy, excitement, gratitude, love, anger, surprise, and annoyance. Eight of the ten emotions were consistent with the literature, whereas two were not. These results were determined based on our prior conclusions that low-arousal emotions are more prevalent than high-arousal emotions among Arabs, and that high-arousal emotions are more common than low-arousal emotions among Hispanics. As mentioned above, the table contains question marks to indicate emotions whose arousal level could not be identified from the literature.
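The consistency rule behind Table 2 can be restated compactly. The sketch below is an illustration rather than the authors' code: it hard-codes the ten arousal-classified rows of Table 2 and marks an emotion as consistent when a high-arousal emotion has the larger Spanish share or a low-arousal emotion has the larger Arab share. The names `TABLE_2` and `is_consistent` are hypothetical.

```python
# (emotion, arousal level, Arab %, Spanish %) copied from Table 2;
# rows with an undetermined arousal level ("?") are omitted.
TABLE_2 = [
    ("amusement", "high", 22.28, 44.44),
    ("admiration", "low", 15.11, 8.08),
    ("sadness", "low", 14.13, 6.06),
    ("joy", "high", 8.80, 6.06),
    ("excitement", "high", 7.50, 0.00),
    ("gratitude", "low", 4.89, 0.00),
    ("love", "low", 3.26, 0.00),
    ("anger", "high", 2.61, 4.04),
    ("surprise", "high", 0.00, 3.03),
    ("annoyance", "high", 0.00, 5.05),
]


def is_consistent(arousal, arab_pct, spanish_pct):
    """Apply the expectation stated in the text.

    High-arousal emotions should be more frequent among Spanish speakers,
    low-arousal emotions more frequent among Arabic speakers.
    """
    if arousal == "high":
        return spanish_pct > arab_pct
    if arousal == "low":
        return arab_pct > spanish_pct
    raise ValueError(f"unknown arousal level: {arousal!r}")


consistent = [name for name, level, ar, es in TABLE_2 if is_consistent(level, ar, es)]
inconsistent = [name for name, level, ar, es in TABLE_2 if not is_consistent(level, ar, es)]

print(len(consistent), "consistent:", consistent)
print(len(inconsistent), "inconsistent:", inconsistent)
```

Applied to the Table 2 values, this rule flags joy and excitement as the two inconsistent emotions and the remaining eight as consistent, matching the Consistency column.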

Our analysis demonstrated that the emotional differences between speakers of Arabic and Spanish are consistent with their respective collectivist cultures. This can in turn be seen as a sign that the machine translation did not destroy the emotional context of the tweets, but rather preserved it to some extent.

However, it is important to mention that, of the tweets collected during the specified timeframe (5186 in Arabic and 398 in Spanish), only 811 tweets in Arabic and 94 tweets in Spanish were emotionally classified. This suggests that the effectiveness of machine translation in maintaining the emotional context of a text might vary between languages.

We noticed that 24% of the Spanish tweets were classified emotionally after translation, but only 16% of the Arabic tweets were. This may indicate that machine translation from Spanish to English, both of which use Latin characters, is more effective than translation from Arabic, which uses a different script, to English. More research is required in this regard.
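The two percentages follow directly from the tweet counts reported above. The sketch below reproduces that arithmetic and, as an addition that is not part of the original analysis, computes a pooled two-proportion z statistic as one possible way to gauge whether the gap is larger than sampling noise alone would suggest. The function names are hypothetical, and the z statistic assumes independent tweets, which may not hold for tweets about a single event.

```python
import math


def classified_rate(classified, total):
    """Fraction of collected tweets that received an emotion label."""
    return classified / total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Counts reported in the text.
ARABIC_CLASSIFIED, ARABIC_TOTAL = 811, 5186
SPANISH_CLASSIFIED, SPANISH_TOTAL = 94, 398

arabic_rate = classified_rate(ARABIC_CLASSIFIED, ARABIC_TOTAL)
spanish_rate = classified_rate(SPANISH_CLASSIFIED, SPANISH_TOTAL)
z = two_proportion_z(SPANISH_CLASSIFIED, SPANISH_TOTAL, ARABIC_CLASSIFIED, ARABIC_TOTAL)

print(f"Arabic:  {arabic_rate:.1%} of tweets emotionally classified")
print(f"Spanish: {spanish_rate:.1%} of tweets emotionally classified")
print(f"z statistic (Spanish vs. Arabic): {z:.2f}")
```

Even a large z value would say nothing about why the rates differ; translation quality, differences in tweet content, and the classifier itself are all candidate explanations.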

5.2. Sentiment Analysis

Regarding sentiment analysis, Figure 6 presents another view of Figure 4, in which sentiments are reclassified as neutral vs. emotional. According to Figure 6, Arabic speakers are more emotional than Hispanics (Arabic: 50% vs. Spanish: 28%). As shown below, this is consistent with the literature.
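The regrouping behind Figure 6 can be sketched as follows. This is a hedged illustration: it assumes the sentiment model emits the three labels `positive`, `negative`, and `neutral`, and that "emotional" means any non-neutral label, neither of which is stated explicitly in this section. The function name `emotional_share` and the toy label list are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def emotional_share(labels):
    """Percentage of sentiment labels that are non-neutral.

    Assumes each label is one of "positive", "negative", or "neutral";
    positive and negative are grouped together as "emotional".
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sentiment labels supplied")
    emotional = counts["positive"] + counts["negative"]
    return 100.0 * emotional / total


# Toy labels for illustration only.
toy_labels = ["positive", "negative", "neutral", "neutral"]
toy_share = emotional_share(toy_labels)
print(f"emotional: {toy_share:.0f}%, neutral: {100 - toy_share:.0f}%")
```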

In [28], a seven-dimensional cultural model was presented, one dimension of which is the neutral vs. affective (emotional) dimension. In a neutral culture, people tend to keep their emotions to themselves, whereas in an affective (emotional) culture, emotions are normal and public, as people tend to express them. This is not to say that people in neutral cultures lack feelings and emotions; it simply suggests differences in how emotions are expressed in public. Using a continuum ranging from neutral to affective, the author describes which cultures are willing to express negative emotions at work.

Figure 7 compares several nations along the neutral vs. emotional dimension of the Hampden-Turner model. It shows that Arabs, represented by Saudi Arabia because the majority of Arab football fans watching the match under discussion were likely to be Saudis, are more emotional than Hispanics, represented by Argentina because the majority of Spanish-speaking football fans watching the match were likely to be Argentinians. This matches our results, as shown in Figure 6, and, at the same time, it can be seen as a sign that the machine translation did not destroy the sentimental context of the tweets, but rather preserved it to some extent.

However, it is important to note that a study of emotional expressions in five different countries [29] found that people from more collectivist societies were more likely to exhibit emotion-suppressing behaviors, such as gaze aversion, facial concealing, or controlled smiling. From this viewpoint, the sentimental disparities between Arabic and Spanish speakers would not be congruent with their respective affective vs. neutral social contexts, which contradicts the picture given by the Hampden-Turner dimension model above. Methods such as ours might be applied to favor one of the two viewpoints over the other.

5.3. Discussion of Outcomes

We arrived at a general outcome when we combined the results of the sentiment analysis and the emotion analysis. We obtained strong evidence that the sentiments and emotions of both Arabs and Hispanics were revealed in their tweets regarding the Saudi Arabia vs. Argentina match at the 2022 FIFA World Cup in Qatar, and that these sentiments and emotions were consistent with their cultural characteristics, as revealed in the literature. The consistency of our results with the literature also suggests that machine translation does, to some extent, maintain the emotional contents of translated texts, and that it achieves this with Spanish more effectively than with Arabic.

6. Conclusions

In this study, we examined how Twitter users communicate their sentiments and emotions in their tweets, and how these reflect the cultural qualities of their communities as described in the literature. According to the data, there are certain disparities between Arabs and Hispanics. These findings are consistent with the IDV index of Hofstede’s cultural dimensions model and the research of several other psychologists. They are also partially consistent with Hampden-Turner’s neutral vs. emotional cultural dimension.

According to the findings of this study, users exhibit emotions, feelings, and cultural traits that, in some respects, mirror those of their cultural peers. Although this study offers some fascinating contributions, it does feature some limitations.

-: First, not all nations can benefit equally from Hofstede’s cultural model. For the sake of his research, Hofstede regarded all of the Arab countries as a single entity, and he only included a portion of the Arab countries in his model. Hofstede also treated Spanish-speaking countries as distinct countries rather than as a region. To work around this, we chose Argentina as their representative, as the match in question was between Argentina and Saudi Arabia.
-: We wish another competition between Arabs and Hispanics had taken place in the 2022 World Cup to increase the persuasiveness and representativeness of our research. However, Saudi Arabia and Argentina were the only Arab and Hispanic teams to compete against each other during the 2022 Qatar World Cup, with the exception of the match between an Arab nation, Morocco, and Spain (a Spanish-speaking country). This match, however, was not included in our research because our purpose was to compare two developing-world populations. Spain is classified as Hispanic, but not as belonging to Latin America, meaning that it does not belong to the category of emerging nations.
-: The disparity between the shares of emotionally classified tweets may indicate that machine translation from Spanish to English, both of which use Latin characters, is more effective than translation from Arabic, which uses a different script, to English. Hence, the results of the research should be interpreted with caution.
-: We sometimes could not find a score for the entire Arab world in the literature. In such cases, we deemed Saudi Arabia to be representative of the Arab world, as the competition was between Saudi Arabia and Argentina in the first place.
-: Although this study adds to the body of international cultural research on the usage of new media, its sample size was modest. We only included a portion of the users in the two groups in the data we collected because the data were associated with a specific event. Each group only consisted of football fans.
-: We used English-based NLP and ML algorithms to extract sentiments and emotions after tweets from non-English languages were translated into English, which may have resulted in the loss of sentimental and emotional context. This may mean that the emotional analysis was not as reliable as it would have been if the tweets had originally been written in English.

This research has numerous significant implications and contributions, despite its limitations:

-: The authors believe that this study is the first cross-cultural comparison of developing regions using Twitter.
-: The data from Twitter can be used to examine cross-cultural differences without requiring significant time or effort to define the geographic location or background culture by using the language in tweets as a cue to define a group culture.
-: We show that the assessment of the sentiments and emotions in tweets in non-English languages after they are translated into English using our method may enable an evaluation of the degree of success of machine translation.
-: The study also demonstrates that it is possible to recognize and classify users from various cultural origins using tweets as a basis.
-: Although something vital appears to be lost in translation, the findings of this study suggest that the contexts and implications of emotions and sentiments were preserved, to some extent, after machine translation, since they were consistent with the findings in the literature.

Author Contributions

Conceptualization, A.A. (Aseel Alhadlaq); Methodology, A.A. (Aseel Alhadlaq) and A.A. (Abeer Alnuaim); Formal analysis, A.A. (Abeer Alnuaim); Resources, A.A. (Abeer Alnuaim); Data curation, A.A. (Abeer Alnuaim); Writing—original draft, A.A. (Abeer Alnuaim); Writing—review & editing, A.A. (Aseel Alhadlaq). All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by Researchers Supporting Project number (RSP2023R314) King Saud University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors acknowledge the Researchers Supporting Project number (RSP2023R314) King Saud University, Riyadh, Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Raheja, S.; Chipulu, M. Can Twitter messaging help corporations mitigate the impact of ethical scandals? We topic-model pre-scandal tweets of 92 ‘offenders’ to investigate. Soc. Bus. Rev. 2021, 16, 420–441.
2. Appel, G.; Grewal, L.; Hadi, R.; Stephen, A.T. The future of social media in marketing. J. Acad. Mark. Sci. 2020, 48, 79–95.
3. Maheshkar, V.; Sarin, S.K. Review and Analysis of Emotion Detection from Tweets using Twitter Datasets. In Proceedings of the WAC-2022: Workshop on Applied Computing, CEUR Workshop Proceedings, Chennai, India, 27–28 January 2022; Available online: https://ceur-ws.org/Vol-3142/PAPER_04.pdf (accessed on 28 April 2023).
4. Cho, S.E.; Park, H.W. Cross-National Comparison of Twitter Use between South Korea and Japan: An Exploratory Study. Int. J. Contents 2012, 8, 50–55.
5. Ariely, M.; Nazaretsky, T.; Alexandron, G. Machine Learning and Hebrew NLP for Automated Assessment of Open-Ended Questions in Biology. Int. J. Artif. Intell. Educ. 2023, 33, 1–34.
6. Elsaid, A.; Mohammed, A.; Ibrahim, L.F.; Sakre, M.M. A Comprehensive Review of Arabic Text Summarization. IEEE Access 2022, 10, 38012–38030.
7. Neshir, G.; Rauber, A.; Atnafu, S. Meta-Learner for Amharic Sentiment Classification. Appl. Sci. 2021, 11, 8489.
8. Chen, S.; Liu, L.; Chen, C.; Haase, D. The interaction between human demand and urban greenspace supply for promoting positive emotions with sentiment analysis from twitter. Urban For. Urban Green. 2022, 78, 127763.
9. Hsu, T.W.; Niiya, Y.; Thelwall, M.; Ko, M.; Knutson, B.; Tsai, J.L. Social media users produce more affect that supports cultural values, but are more influenced by affect that violates cultural values. J. Personal. Soc. Psychol. 2021, 121, 969–983.
10. Li, M.; Ch’Ng, E.; Chong, A.Y.L.; See, S. An empirical analysis of emoji usage on Twitter. Ind. Manag. Data Syst. 2019, 119, 1748–1763.
11. Park, J.; Baek, Y.M.; Cha, M. Cross-cultural comparison of nonverbal cues in emoticons on Twitter: Evidence from big data analysis. J. Commun. 2014, 64, 333–354.
12. Vyas, P.; Vyas, G.; Dhiman, G. RUemo—The Classification Framework for Russia-Ukraine War-Related Societal Emotions on Twitter through Machine Learning. Algorithms 2023, 16, 69.
13. Dyer, J.; Kolic, B. Public risk perception and emotion on Twitter during the COVID-19 pandemic. Appl. Netw. Sci. 2020, 5, 99.
14. Andersen, P.A. An Arousal-Valence Model of Nonverbal Immediacy Exchange. In Proceedings of the Annual Meeting of the Central States Speech Association, Chicago, IL, USA, 12–14 April 1984.
15. Neviarouskaya, A.; Prendinger, H.; Ishizuka, M. Textual affect sensing for sociable and expressive online communication. In Affective Computing and Intelligent Interaction; Springer: Berlin/Heidelberg, Germany, 2007; pp. 218–229.
16. Kajava, K.; Öhman, E.; Hui, P.; Tiedemann, J. Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection. In Proceedings of the Digital Humanities in the Nordic Countries, Riga, Latvia, 24 October 2020.
17. Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. Roberta: A Robustly Optimized Bert Pretraining Approach. arXiv 2019, arXiv:1907.11692.
18. Demszky, D.; Movshovitz-Attias, D.; Ko, J.; Cowen, A.; Nemade, G.; Ravi, S. GoEmotions: A Dataset of Fine-Grained Emotions. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 4040–4054.
19. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
20. Alkaabi, N.; Zaki, N.; Ismail, H.; Khan, M. Detecting Emotions behind the Screen. AI 2022, 3, 948–960.
21. Available online: https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest (accessed on 28 April 2023).
22. Available online: https://github.com/AlihadiZd/Twitter-scraping-with-NLP.git (accessed on 28 April 2023).
23. Hofstede, G. The GLOBE debate: Back to relevance. J. Int. Bus. Stud. 2010, 41, 1339–1346.
24. Lambiase, S.; Catolino, G.; Torre, M.; Tamburri, D.A.; Palomba, F.; Ferrucci, F. Fright Not and Be Dispersed! Evaluating Cultural Dimensions Versus Software Communication and Collaboration Activities. Available online: https://ssrn.com/abstract=4210197 (accessed on 28 April 2023).
25. Hofstede, G. The Cultural Relativity of Organizational Practices and Theories. J. Int. Bus. Stud. 1983, 14, 75–89.
26. Oyserman, D.; Coon, H.M.; Kemmelmeier, M. Rethinking Individualism and Collectivism: Evaluation of Theoretical assumptions and meta-analyses. Psychol. Bull. 2002, 128, 3–72.
27. Lim, N. Cultural differences in emotion: Differences in emotional arousal level between the East and the West. Integr. Med. Res. 2016, 5, 105–109.
28. Hampden-Turner, C.; Trompenaars, F. Riding the Waves of Culture: Understanding Diversity in Global Business; Hachette: London, UK, 2020.
29. Cordaro, D.T.; Sun, R.; Keltner, D.; Kamble, S.; Huddar, N.; McNeil, G. Universals and cultural variations in 22 emotional expressions across five cultures. Emotion 2018, 18, 75–93.

Figure 1. Proposed method workflow.

Figure 2. Flowchart of Python scripts.

Figure 3. A magnified view of the collected emotions.

Figure 4. The percentage view of the retrieved sentiments.

Figure 5. The relationship between individualism and collectivism.

Figure 6. Sentiments distinction (neutral vs. emotional).

Figure 7. Affective vs. neutral classification [28].

Table 1. Emotions arousal level.

No.	Emotion	Arousal
1	amusement	High arousal
2	admiration	Low arousal
3	sadness	Low arousal
4	joy	High arousal
5	excitement	High arousal
6	gratitude	Low arousal
7	approval	?
8	caring	?
9	love	Low arousal
10	confusion	?
11	anger	High arousal
12	surprise	High arousal
13	annoyance	High arousal
14	desire	?

Table 2. Differences in emotional arousal between Arabs and Hispanics.

Emotion	Arousal	Arab	Spanish	Consistency
amusement	High arousal	22.28%	44.44%	consistent
admiration	Low arousal	15.11%	8.08%	consistent
sadness	Low arousal	14.13%	6.06%	consistent
joy	High arousal	8.80%	6.06%	no
excitement	High arousal	7.50%	0.00%	no
gratitude	Low arousal	4.89%	0.00%	consistent
approval	?	3.26%	2.02%	?
caring	?	3.26%	3.03%	?
love	Low arousal	3.26%	0.00%	consistent
confusion	?	3.04%	11.11%	?
anger	High arousal	2.61%	4.04%	consistent
surprise	High arousal	0.00%	3.03%	consistent
annoyance	High arousal	0.00%	5.05%	consistent
desire	?	0.00%	2.02%	?

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
