*Article* **Networks and Stories. Analyzing the Transmission of the Feminist Intangible Cultural Heritage on Twitter**

**Jordi Morales-i-Gras \*, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta and Simón Peña-Fernández**

Journalism Department, University of the Basque Country, 48940 Leioa, Spain; julen.orbegozo@ehu.eus (J.O.-T.); ainara.larrondo@ehu.eus (A.L.-U.); simon.pena@ehu.eus (S.P.-F.)

**\*** Correspondence: info@jordimorales.com

**Abstract:** Internet social media is a key space in which the memorial resources of social movements, including the stories and knowledge of previous generations, are organised, disseminated, and reinterpreted. This is especially important for movements such as feminism, which places great emphasis on the transmission of an intangible cultural legacy between its different generations or waves, which are themselves shaped through these cultural transmissions. In this sense, several authors have highlighted the importance of social media and hashtivism in shaping the fourth wave of feminism that has been taking place in recent years (e.g., #metoo). The aim of this article is to present to the scientific community a hybrid methodological proposal for the network and content analysis of audiences and their interactions on Twitter. We do so by describing and evaluating the results of several studies we have carried out in the field of feminist hashtivism. Structural analysis methods such as social network analysis have demonstrated their capacity to be applied to the analysis of social media interactions as a mixed methodology, that is, both quantitative and qualitative. This article shows the potential of a specific methodological process that combines inductive and inferential reasoning with hypothetico-deductive approaches. By applying the methodology developed in the case studies included in the article, it is shown that these two modes of reasoning work best when they are used together.

**Keywords:** feminism; hashtivism; Twitter; social network analysis; Machine Learning

## **1. Introduction**

This article is part of a broader research project dedicated to the analysis of social movements through digital conversations in social networks, taking as a reference certain public controversies of high impact in the online and offline public debate. Our methodological proposal is framed within new research currents in the Sociology of Communication [1], which employ the digital footprint that millions of Internet users leave at the disposal of the scientific community through their interactions and actions. It is therefore a matter of using massive data and processing them through certain methodological processes to obtain information that helps the scientific community to describe the social phenomena that take place around us and place them in their interpretative context, with the aim of better understanding the dynamics and changes in the logics of collective action. It is also a matter of understanding the consequences of these dynamics in the shaping of social movements, which are necessarily anchored in their own immaterial cultural heritage and project themselves towards a future that each generation defines based on its own aspirations.

In this context, this methodological proposal offers a research perspective to the scientific community interested in social movements, the logics of collective action, contemporary public debate, and deliberative processes, among others. It does so, moreover, using the big data provided by a microblogging network such as Twitter, and with a method that not only describes, but also explains, interprets, and helps to understand how and why social networks are used and what effects and what social and democratic transformations they promote.

**Citation:** Morales-i-Gras, J.; Orbegozo-Terradillos, J.; Larrondo-Ureta, A.; Peña-Fernández, S. Networks and Stories. Analyzing the Transmission of the Feminist Intangible Cultural Heritage on Twitter. *Big Data Cogn. Comput.* **2021**, *5*, 69. https://doi.org/10.3390/bdcc5040069

Academic Editors: Manolis Wallace, Vassilis Poulopoulos, Angeliki Antoniou and Martín López-Nores

Received: 18 October 2021 Accepted: 17 November 2021 Published: 24 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Social media are, after all, conversation tools of our society in the contemporary digital context and, as Castillo [2] argues, examining the conversation tools of a culture is an excellent way to understand it, and to understand its links with the past and with the future. The social media that have emerged alongside the web 2.0 have created spaces for communication and citizen participation that foster cooperation and mutual aid [3]. These media are one of the main open mechanisms of public conversation, fundamental for the creation of the public agenda and deserving of growing attention from the scientific community. Information technologies have given rise to what authors such as Dery [4], Joyanes [5] or Lévy [6] have baptized as "cyberculture" or "culture of connectivity" [7]. There is no doubt that the expansion of the main online platforms such as Facebook, Twitter, Flickr, YouTube, or Wikipedia reinforces the idea that contemporary society is facing a constantly evolving technocultural ecosystem and a phase of sociability that has online interaction as one of its main exponents. In such an ecosystem, meanings are permanently negotiated and reconsidered in a multilateral situation in which different generations participate and in which they reinterpret and construct themselves.

The methodological proposal contained in this article to observe and analyze public debate through digital network conversations reinforces the scientific production on the phenomenon that the sociologist Javier Toret calls the "connected multitude". This is defined as "the ability to connect, group and synchronize, through technological and communicative devices, and around objectives, the brains and bodies of a large number of subjects in sequences of time, space, emotions, behaviors and languages" [8] (p. 23). According to Toret, this would be one of the many structural conditions in the Network Society [9]. The connected multitude, then, emerges in the new paradigm of Mass Self-Communication [10] in the Network Society and as one of the main characteristics of what researchers such as Melucci [11], Candón-Mena [12] and Romero [13] call "New Social Movements". Examples in this context include the demonstrations against the World Trade Organization summit in Seattle (1999), the Black Lives Matter movement, the Arab Spring, the Spanish 15-M, the movements fighting for degrowth, and the feminist movements that are re-emerging in the new political, social, and communicative context.

In this organizational context, the concept of a "social network" ceases to be a metaphor and becomes pure metonymy. Therefore, all perspectives that take the relationship between agents as the minimum unit of social analysis emerge as privileged: we will see that Social Network Analysis (or simply, SNA) is particularly fertile in these contexts. Activist networks or networks of social movements, sometimes defined with uncomfortably cybernetic references, are bundles of interactions, communicative and action spaces where experiences of struggle and self-organization are shared, where a certain reflexivity lives and a shared sense of protests is built through current and virtual dialogues with past generations that embody the different stages or waves of the movements themselves, thus managing their immaterial cultural heritage. Beyond a social morphology, networks have become a model for emergent forms of politics [14] (p. 92). In our opinion, this also applies to the politics of collective memory and of the intangible cultural heritages of political and social movements.

In this article we take this metonymic conception of social movements as social networks as our starting point. We intend, firstly, to present the main characteristics of the research with which our epistemology is connected, and secondly, to detail the methodological proposal that we have articulated in other research and make it available to the scientific community for discussion and improvement.

We are going to present a methodological proposal designed for the study of massive conversations in social media, through which to generate knowledge about a particular object of study, which is hashtag feminism and its importance for the configuration of the so-called fourth wave of feminism, understanding such a process of self-definition as an exercise of transmission and management of an immaterial cultural heritage [15]. At the heart of the proposal lies the will to contribute to the necessary hybridization between perspectives linked to Computer Science and Social Science, between Data Engineering and content analysis, between quantitative and qualitative analysis techniques, and between inductive and hypothetico-deductive reasoning. We strongly believe, and we will try to argue, that such hybrid approaches are today more necessary than ever.

In this article we will focus on the following issues:


#### **2. Objectives**

#### *2.1. Analyzing the Shaping of the Current Feminist Wave through Twitter*

The aim of this article is twofold. On the one hand, we want to bring to the table a specific methodology for the analysis of Twitter conversations that can be applied to the study of the shaping of the contemporary feminist movement. This involves the assumption, in line with Deborah Withers' work on the politics of transmission of the feminist intangible cultural heritage in the digital age [15] (p. 5), that each feminist generation defines and generates itself through practices that involve the transmission of an intangible cultural heritage that connects and enables dialogue between generations. This is precisely what a metaphor as beautiful as that of the "waves" tells us when characterizing such generations. On the other hand, we would also like to present a series of empirical works that we have developed and reflect on them in terms of this same transmission of a feminist intangible cultural heritage.

This methodological proposal focuses especially on the best-known microblogging platform at a global level, which is Twitter. This is so, among other reasons, because Twitter is the source of information that best allows segmenting users, discovering how citizens participate in the political debate and how they are grouped by ideological affinity [16]. Likewise, Twitter has become a consolidated medium for communicating issues related to politics: since its birth in 2006 it has gained growing importance in political contexts and has been used by virtually all actors interacting in the public-political space [17]. Five years after its creation, Rodríguez and Ureña [18] already pointed to Twitter as the social network that had acquired the greatest relevance among the political and journalistic class. For Piscitelli [19] (p. 15), at that time it also constituted "one of the most powerful communication mechanisms in history".

Subsequently, from various scientific perspectives, authors such as Pariser [20], Page [21], Carr [22], Marwick [23], or Fuchs [24] tempered the most optimistic expectations around the use of social networks for social or political mobilization. As summarized by Giraldo-Luque, Fernández-García, and Pérez-Arce [25] in their research on the mobilization that emerged around the hashtag #Niunamenos, Twitter is a means of dissemination and a space for expression around certain public controversies, but its scope for building consensus scenarios or transforming preconceived imaginaries is limited.

That said, Twitter has also been defined during its decade of existence as a space for social interaction, dialectical exchange, and as a sphere of deliberation in which much of the activism of social movements and contemporary social mobilization takes place [26–28]. In fact, this social network has aroused great interest in the academic community in recent years due to the specific type of conversation that takes place on it. Twitter is undoubtedly the most popular network for discussing political issues and current news, and has had a great impact on all the political and social mobilizations that have taken place in the world in recent years: from the Arab Spring to the Black Lives Matter movement, which gained renewed momentum in the U.S. during the COVID-19 pandemic. For this reason, social and political movements have been a privileged object of analysis through Twitter data and SNA techniques.

In this sense, the scientific field attaches particular importance to the observation of the changing communication paradigm to further elucidate the dichotomy between social media and their social function. To this end, big data from social media interaction is an immense source of information with great potential to explain social processes from multidisciplinary perspectives such as sociology, communication, or politics.

In the specific case of feminism, we believe that the analysis of conversations established on Twitter allows us to understand several dynamics that are established for the transmission of the feminist intangible cultural heritage, and even for the conformation of the "waves" that characterize the extension and temporal evolution of the movement. Several authors have already pointed out the importance of feminist hashtivism in shaping the fourth wave [28–30]. In this regard, over the past few years many studies have proliferated around the #metoo movement and its aftershocks beyond the initial scandals linked to the Hollywood film industry [31–34]. In our view, all these analyses and meta-analyses pivot around a series of generational phenomena that are intimately linked to the transmission of the feminist intangible cultural heritage, and even, to the controversies that can develop between generations of activists.

#### *2.2. Related Works*

The perspective we will develop in the following section is certainly innovative. However, it should be acknowledged that our work builds on a growing scholarly interest in social movements on Twitter, and more specifically, in feminism on Twitter. Several authors have already contributed to framing fourth-wave feminism as a connected or networked feminism, which was internationally raised by the strength of protests such as #MeToo [34,35]. There has also been a strong recent interest in the particularity of Spanish feminism on Twitter [36–38], which is the subject of several of the papers that follow.

A trend that has advanced in parallel to the academic interest in feminism is the interest in the social consequences of artificial intelligence. Here, a small yet increasingly important number of feminist articles around the concepts of algorithmic injustice, data justice or data feminism are noteworthy [39,40]. In addition, there is a small body of research with which we share not only an object of study (fourth-wave feminism on Twitter) but also important methodological links. This is research that uses Social Network Analysis to investigate relationships and discourse [41,42]. Undoubtedly, our methodological proposal should be considered within this general paradigm.

#### **3. Methodology**

#### *3.1. Big Data and Interpretative Perspectives*

The challenges that have shaped the big data paradigm have largely been technical and technological. In his famous 2001 article (in which big data is not yet referred to as such), technologist Doug Laney [43] mentioned the three "Vs" (i.e., volume, velocity, and variety) that would become crucial in the field of data management over the next few years. All of Laney's Vs referred to different technical aspects of data storage and processing infrastructures. Later, other authors [44–46] would go on to add more Vs to characterize the paradigm, such as "veracity", "validity", "volatility", "virtuality", or "visualization". It is at this point that the concept of "value" is presented as central, associated with the notion that data needs to be interpreted to generate return, whether economic or otherwise.

The predictions of some overconfident observers during the first decade of the 21st century invited us to think of a "post-analytical" world [47]. Instead, if anything has become clear over the last 20 years in reference to big data, it is that the analysis and interpretation of such data is a key aspect that can compromise the most sophisticated of automatic processing systems. Over the last few years, dozens of cases have come to light in which heavily automated systems based on massive data (many of them relying on "black box" algorithms such as neural networks) have given rise to socially unacceptable situations. Notable among these perverse effects are algorithms that reinforce human cognitive biases and give rise to echo chambers [48] or filter bubbles [20], algorithms that discriminate against socially vulnerable collectives [49,50] and even chatbots that acquire racist behaviors through community "training" [51].

Nowadays, large amounts of data flow through new channels becoming a valuable source of information [52]. At the same time, as evidenced by all the cases mentioned above, the most important and socially transcendent challenges faced by the big data paradigm are those related to the analysis and interpretation of data, and not so much to the technical capacity for its storage and processing. Such is the case that some of the most authoritative voices in the world of Artificial Intelligence [53] have already urged the community to abandon the use of black box algorithms (e.g., deep neural networks). These experts propose to redesign systems based on simpler and more transparent algorithms (e.g., regression or decision trees) that facilitate analytical and interpretative work.

This epistemological shift that is taking place among researchers and practitioners of big data, artificial intelligence, and data mining in general [54] represents a great opportunity for social scientists, and for communication scientists in particular. The big data paradigm relies on enormously diverse data sources: hence the V for "variety". Leaving aside exceptional sources such as genomic and biomedical data, meteorological and environmental data, and some of the data from industry and mining, most big data is social, or has a large social component (e.g., financial, banking, GPS mobility, urban sensor, web browsing, e-commerce, or credit card consumption data). Among them, data from the so-called social media are particularly voluminous, as they come from a wide variety of user-platform and user-user interactions within the different platforms (e.g., posts, mentions, likes, swipes, or shares).

Social media data is a sociotechnological by-product generated jointly by platforms and users from the systematic recording of a series of interactions [55]. Therefore, given its interactive and relational nature, the most abundant data in social media is that which is easily computable as a matrix of relationships (e.g., mentions between users, friendship or follower relationships between users, or relationships established between users and content). This gives great centrality to structural analysis methods [56] such as Social Network Analysis (SNA). With a somewhat smaller but equally important presence, social media also includes attribute data (e.g., metadata associated with a post or a user). Unlike relational data, attribute data tend to be used in prediction and classification models using Machine Learning (ML) techniques [57].

It is around these two types of techniques (i.e., SNA and ML), that most social media data mining studies are framed, often combining aspects of both. These techniques are usually labeled as "quantitative" because of their mathematical and computational orientation. According to the view defended in this article, this is a more than questionable label, rooted in a dichotomy that is debatable to say the least (i.e., the difference between qualitative and quantitative perspectives). SNA has demonstrated on multiple and diverse occasions its ability to be applied as a mixed methodology [58,59]; on the other hand, ML is increasingly used as a supporting method in qualitative analyses, especially with data from social media [60,61].

In our view, both SNA and ML challenge the tension artificially established in Social Science between quantitative and qualitative techniques, inviting us to overcome this dichotomy. These techniques put on the table the need to articulate analytical strategies that combine the mathematical and computational rigor typical of quantitative approaches with the interpretative skills that characterize qualitative analysis. The type of perspective that we have tried to develop in the research reported in this article is intended to be a contribution to this way of understanding Social Science and big data.

#### *3.2. Social Media as Relational and Textual Big Data Sources: Possibilities and Legal Limits*

Twitter is the social media platform with the most open data policy to date, compared to other platforms such as Facebook or Instagram. Twitter has a free API (Application Programming Interface) that allows data retrieval with a maximum of seven days of retroactivity and, according to the information provided by the company on its website, real-time data capture, provided that no more than 1% of the platform's global traffic is captured.

The data that the standard, free Twitter API can retrieve is quite extensive: tweets and retweets published, relationships between users, and even their metadata (e.g., their biography, number of followers or number of accounts followed). As reported by the company itself, the standard API does not return 100% of the tweets issued, but it does return "the most important ones" since its API "is oriented towards relevance and not completeness" [62]. As such, the data we can retrieve from the free API represents an indeterminate portion of the total that, in principle, reflects the totality of the conversation very well. Twitter offers the possibility of acquiring 100% of the data and greater retroactivity in its paid plans.

Owing to these conditions of opportunity and the relational nature of the data that can be retrieved from the Twitter API, studies on Twitter using SNA techniques proliferated during the second decade of the 21st century [63,64]. In this sense, as indicated above, social and political movements have been a central object of analysis using Twitter data and SNA techniques.

It is possible to distinguish three different strategies for analyzing the conversations and digital interactions of these movements and other expressions of collective action developed on Twitter: (1) mention networks, (2) semantic networks, and (3) following relationship networks. All three types derive from a series of decisions that researchers make about the type of data to be represented in the graphs, and about the representation strategy itself. Likewise, the three types of analysis differ in how far they can be transferred to methodological processes applied to data captured in other social networks.

The first type consists of the analysis of dynamic relationships, formalized in networks of mentions, retweets, or replies between users [65–68]. This type of network tends to be conceptualized as directed (i.e., the edges of the network have direction, they are emitted by a node and received by another node) and weighted (i.e., the edges of the network have weights, being able to represent a relationship of one or several mentions), due to the type of relationship it represents.
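As a minimal sketch of how such a directed, weighted mention network could be synthesized, the following Python snippet counts mentions between users. The tweets, the user names, and the `mention_edges` helper are hypothetical and serve only as an illustration; real data would come from the Twitter API in a richer format.

```python
import re
from collections import Counter

# Hypothetical toy data: (author, text) pairs standing in for retrieved tweets.
tweets = [
    ("ana", "RT @bea: the march starts at noon #8M"),
    ("ana", "@bea see you there"),
    ("carla", "@bea @ana count me in"),
    ("bea", "thanks @ana"),
]

MENTION = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Return a Counter mapping (source, target) to the number of mentions."""
    edges = Counter()
    for author, text in tweets:
        for target in MENTION.findall(text):
            target = target.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges

edges = mention_edges(tweets)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target} (weight {weight})")
```

The ordered pair encodes the direction of each edge (who mentions whom), and the count encodes its weight, so that `("ana", "bea")` and `("bea", "ana")` are kept as two distinct edges.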

In general, these are networks with very low densities (i.e., most of the nodes in the network are not directly linked) and with very high "Modularity" figures obtained using the Louvain algorithm [69] (i.e., the communities reflect very strong intra-group association patterns, and very weak inter-group association patterns), which we will see in detail later. Because of the type of data represented, this analysis can only be carried out on Twitter or other platforms where the mention-type relationship plays an analogous role: Mastodon, Gab, or Slack. This type of analysis is not directly transferable to networks such as Facebook or Instagram. On Facebook pages, it is normal to respond to the messages posted, and there is no analogous element to the retweet that is traceable between pages. On Instagram, people like rather than comment, and likes are not provided by the API at the level of each user.
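To make the notion of density concrete, the sketch below computes it for a small directed network as the share of possible ordered pairs of nodes that are actually linked. The edge list and the `directed_density` helper are illustrative assumptions, not data from the studies discussed here.

```python
def directed_density(edges):
    """Density of a directed graph without self-loops: m / (n * (n - 1))."""
    unique_edges = set(edges)
    nodes = {node for edge in unique_edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(unique_edges) / (n * (n - 1))

# Hypothetical mention edges (source, target) forming two loosely separated groups.
toy_edges = [
    ("a", "b"), ("b", "a"), ("c", "a"),
    ("d", "e"), ("f", "e"), ("d", "f"),
]

density = directed_density(toy_edges)
print(f"density = {density:.2f}")
```

In this toy case six of the thirty possible directed links exist. In massive mention graphs the number of possible links grows quadratically with the number of users, which is one reason why observed densities tend to be very low.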

Semantic networks, as a second line of research, have been explored in a complementary or alternative way to Topic Modeling algorithms [70–72]. Word networks tend to be conceptualized in an undirected and weighted way. That means that it is assumed, as a rule, that two words will co-occur symmetrically, or that they will be symmetrically linked, and also, that the number of times two words co-occur in a discourse is usually a relevant factor in the analysis.

The morphological characteristics of semantic networks are highly variable since they can represent different types of discourse with very different levels of lexical diversity. It is very common for semantic networks to be the result of a series of data processing operations using Natural Language Processing techniques, such as the segmentation or "tokenization" of a text (i.e., its division into words, sentences or paragraphs), the filtering of stopwords (i.e., the removal of particles that do not provide relevant information, such as articles, adverbs, or conjunctions) or "lemmatization" (i.e., the transformation of the words of a text into their canonical form, according to a pre-designed dictionary). In contrast to the previous case, this type of analysis is extremely versatile and transferable to any textual data source: social media, written and digital press, blogs, books, or scientific articles, among other cases of analysis.
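The processing chain described above (tokenization, stopword filtering, lemmatization) and the resulting undirected, weighted word network can be sketched as follows. The stopword list, the lemma dictionary, and the example texts are toy assumptions; a real study would rely on a full linguistic resource for the language in question.

```python
import re
from collections import Counter
from itertools import combinations

# Toy resources, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "in", "is", "are", "for"}
LEMMAS = {"marches": "march", "marching": "march",
          "women": "woman", "rights": "right"}

def preprocess(text):
    """Tokenize, drop stopwords, and map each token to its dictionary lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in tokens]

def cooccurrence(documents):
    """Undirected weighted edges: number of documents in which two lemmas co-occur."""
    edges = Counter()
    for doc in documents:
        terms = sorted(set(preprocess(doc)))
        for pair in combinations(terms, 2):  # sorted terms give one key per pair
            edges[pair] += 1
    return edges

docs = [
    "Women are marching for the rights of women",
    "The march for rights",
    "Women and rights",
]
word_edges = cooccurrence(docs)
print(word_edges)
```

Sorting the lemmas before pairing them stores each pair under a single key regardless of word order, which is what makes the network undirected.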

Relationships between words, between sentences or between documents can be studied through SNA, which has yielded very good results in recent research [73]. To this end, several types of networks can be synthesized according to analytical needs: networks of words according to the number of times they appear together, networks of documents according to the number of words they share, networks of hashtags according to the frequency with which users have used them in their posts, and so on.
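As one illustration of these alternative projections, the sketch below links documents according to the number of words they share. The document identifiers and word sets are hypothetical and assumed to be already preprocessed.

```python
from itertools import combinations

# Hypothetical documents, each reduced to its set of preprocessed words.
documents = {
    "d1": {"march", "right", "woman"},
    "d2": {"march", "right"},
    "d3": {"strike", "woman"},
    "d4": {"election"},
}

def document_network(documents):
    """Undirected weighted edges between documents, weighted by shared words."""
    edges = {}
    for (a, terms_a), (b, terms_b) in combinations(sorted(documents.items()), 2):
        shared = len(terms_a & terms_b)
        if shared:  # documents with no word in common stay unconnected
            edges[(a, b)] = shared
    return edges

doc_edges = document_network(documents)
print(doc_edges)
```

The same pattern would apply to hashtags or users by changing what plays the role of the node and what plays the role of the shared item.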

In any case, the most common approach to this type of analytical problem has been through heuristic rule-based processes from the field of Natural Language Processing [74] or in combination with ML models [75], which usually imply a significant improvement in the predictive or classificatory capacity of such models. Due to the great complexity of human language, black box analysis techniques such as embeddings have proliferated during the last few years [76]. These rely on deep neural networks which, as we have already seen, provide very good results in exchange for a great opacity in the internal processes of the algorithms. These technologies enjoy enormous popularity among computational scientists faced with problems such as word prediction in search engines or automatic text translation.

Finally, it is worth highlighting the third type of analysis, most likely the least employed, which is the one that consists of observing networks of established relationships and their effects or consequences [77,78]. The networks synthesized from the relationships established between social media users will be directed or undirected, depending on the platform (e.g., on Twitter they will be directed, since one user can follow another without being followed back by the other; whereas on Facebook or LinkedIn they should be undirected, since if there is no agreement between two users, they will not be "friends" or "contacts" on these networks), and, typically, they will be unweighted (i.e., it is not possible to follow anyone more than once on Twitter, nor to be friends with someone more than once on Facebook or LinkedIn).
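The difference between directed and undirected, unweighted relationship networks can be illustrated with plain Python sets. The accounts and ties below are invented for the example, and the `reciprocated` helper is a hypothetical name.

```python
# Directed, unweighted: a follow is an ordered pair, and a set stores it at most once.
follows = {("ana", "bea"), ("bea", "ana"), ("carla", "bea")}
follows.add(("carla", "bea"))  # following someone "twice" changes nothing

# Undirected, unweighted: a friendship is an unordered pair.
friendships = {frozenset(("ana", "bea")), frozenset(("bea", "carla"))}

def reciprocated(follows):
    """Return the follow relationships that hold in both directions."""
    return {frozenset((u, v)) for (u, v) in follows if (v, u) in follows}

mutual = reciprocated(follows)
print(len(follows), mutual)
```

Here only the tie between `ana` and `bea` is reciprocated, which shows how a directed following network carries information that an undirected friendship network cannot represent.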

In this type of analysis, a distinction can be made between egonets and socionets, which are fundamental categories of the SNA [79]. The first type of networks (i.e., egonets), in the context of social media, are those that reflect the links between a user's followers or friends. The second type (i.e., socionets) represents the relationships established between a group of nodes, without any of them constituting the center of the network. In the second case, the population of the network will have been designed according to some criterion external to the network itself (e.g., the network of relationships among the students of a course or among the journalists of a media outlet). The formal characteristics of the network will depend on the criteria according to which they have been constituted, although they will tend to be denser than the networks of mentions because of transitivity and homophily characteristic of personal networks: it is to be expected that someone's friends will end up knowing each other and establishing friendship as well [80]. Likewise, the Modularity figures derived from the Louvain algorithm will tend to be lower than for mention networks.
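A minimal sketch of an egonet, understood as the ties among a user's contacts, is given below together with the share of possible ties among those contacts that actually exist. The friendship data and helper names are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected friendship data: each user mapped to the set of friends.
friends = {
    "ego": {"a", "b", "c", "d"},
    "a": {"ego", "b", "c"},
    "b": {"ego", "a", "c"},
    "c": {"ego", "a", "b"},
    "d": {"ego"},
}

def egonet(friends, ego):
    """Ties that exist among the contacts (alters) of `ego`."""
    alters = sorted(friends[ego])
    return {frozenset((u, v)) for u, v in combinations(alters, 2)
            if v in friends[u]}

def egonet_density(friends, ego):
    """Share of possible ties among ego's alters that are actually present."""
    k = len(friends[ego])
    if k < 2:
        return 0.0
    return len(egonet(friends, ego)) / (k * (k - 1) / 2)

ties = egonet(friends, "ego")
print(len(ties), egonet_density(friends, "ego"))
```

In this toy case three of the six possible ties among the four alters exist. A high value of this ratio is one simple way to observe the transitivity mentioned above, where someone's friends tend to be friends with each other.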

Twitter allows synthesizing egonets and socionets by retrieving data from its standard API. Other major platforms, such as Facebook or LinkedIn, allowed egonets to be synthesized through their standard APIs before the Cambridge Analytica scandal [81]. Currently, these platforms no longer provide these data, although they can be obtained through web scraping techniques (i.e., techniques for the automatic extraction of data available on websites and social networks). These techniques are increasingly popular and are used for a myriad of data mining operations (e.g., robots for indexing web content, flight, hotel, or insurance comparators, or automatic alert systems). However, their legal basis is still somewhat unclear [82].

Web scraping can be implemented with completely legal tools, but its use may contravene the regulations of the social media platforms or websites from which the data is extracted. As a rule, therefore, these are operations that cannot be implemented by a logged-in user, but which the social media company will not be able to prevent if they occur from a non-logged-in user, since nothing that a social media platform makes available to a non-logged-in surfer can contravene the provisions of the data protection laws operating in the territories in which the platform operates. Although this is a swampy terrain with many issues, the type of judicial decisions that have been made over the last few years are favorable to web scraping of information available to non-logged-in users [83].

Following this doctrine, and if we stick to data that can be accessed by a non-logged-in user, web scraping is a very good alternative to API data access for research aimed at synthesizing semantic networks, or some hybrid models, such as networks between users and words, or between users and hashtags. One way or another, it will always be possible to apply web scraping techniques to obtain semantic data from social media, as well as from other sites on the Internet. In this way, the analyst will be able to rely on complete datasets rather than an indeterminate portion of the total and will have a greater temporal margin and retroactivity. However, web scraping is not feasible for research that focuses on the mention relationships between users of the major social media platforms, let alone their follower or friendship relationships. Obtaining this data is technically feasible, but it is necessary to violate the social media regulations, and in many cases, also the data protection laws in force in each territory.

#### **4. Results**

#### *4.1. Network Analysis and Machine Learning as Assistants for the Interpretation of Dynamics in Virtual Networks*

We have previously emphasized the need to articulate analytical and interpretative perspectives to overcome the artificial distinction between quantitative and qualitative analysis that has characterized social science in recent decades. The big data paradigm (and, more specifically, techniques such as SNA and ML) exposes the obsolescence of this way of segmenting scientists based on their skill repertoires, while pointing to the need to generate new hybrid methodological frameworks that allow for simultaneous mathematical and phenomenological analyses.

In our opinion, one of the most effective ways to analyze and interpret the dynamics of virtual networks is to articulate SNA and content analysis techniques through workflows more typical of Data Engineering. The starting point is the set of mention-type interactions (i.e., nominations of one user by another) on Twitter, in the context of a series of digital conversations related to issues of public and political debate, from which networks or graphs are synthesized (see Figure 1). In the resulting massive graphs, each point or node represents a Twitter user (e.g., a personal account, a company, a media outlet, a political party, etc.) and each line or edge represents a mention from one user to another (e.g., a retweet, a reply, or a direct allusion). These are therefore directed and weighted networks, to which a series of algorithms are applied to generate value from the data.
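As a minimal sketch of how such a directed and weighted mention network can be assembled, the following Python fragment counts mentions between pairs of accounts. The record layout (`author`, `mentions`) and the account names are hypothetical and chosen only for illustration; they do not reproduce the Twitter API response format or the data used in our studies.

```python
from collections import Counter

# Hypothetical, simplified tweet records: the author and the accounts
# mentioned (retweeted, replied to, or alluded to) in each tweet.
tweets = [
    {"author": "ana", "mentions": ["party_a", "news_outlet"]},
    {"author": "ben", "mentions": ["party_a"]},
    {"author": "ana", "mentions": ["party_a"]},
    {"author": "cara", "mentions": ["ben", "party_b"]},
]


def build_mention_edges(records):
    """Return a Counter mapping (source, target) to the number of mentions."""
    edges = Counter()
    for record in records:
        for target in record["mentions"]:
            if target != record["author"]:  # ignore self-mentions
                edges[(record["author"], target)] += 1
    return edges


edges = build_mention_edges(tweets)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target} (weight {weight})")
```

Each key of `edges` is a directed edge and each value is its weight, so the resulting table can be exported as an edge list to SNA software.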

**Figure 1.** Network of mentions to political parties on Twitter during an electoral campaign (April 2019, Spain) in two different spheres of influence. Source: Own elaboration with Gephi software.

One of the most useful Machine Learning algorithms for a perspective such as the one detailed in this paper is the Louvain algorithm for community identification in massive graphs [69]. It is an unsupervised learning algorithm that performs a series of operations on the data iteratively and identifies clusters, or sets of nodes, that make up specific communities within a network. The algorithm visits nodes in random order, tentatively groups each one with its neighbors, and continuously evaluates the resulting gain or loss of Modularity (i.e., a metric that evaluates the overall quality of the community partition of a network by comparing it with a randomly constituted network of equal size) [84], implementing only those groupings that result in gains in this metric (i.e., optimizing the quality of the community partition).
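The grouping-and-evaluation step described above can be sketched in a few lines of Python. This is a simplified illustration under explicit assumptions, not the implementation found in Gephi or Pajek: it covers only the first (local moving) phase of the Louvain algorithm, omits the aggregation of communities into super-nodes that the full algorithm repeats, treats the graph as undirected, and visits nodes in a fixed rather than random order so that the result is reproducible. The function name `local_moving` is our own.

```python
from collections import defaultdict


def local_moving(adj):
    """One-level greedy modularity optimisation (first Louvain phase only).

    adj maps each node to a dict {neighbour: weight} and must be symmetric.
    Returns a dict mapping each node to a community label.
    """
    two_m = sum(sum(nbrs.values()) for nbrs in adj.values())  # 2 * total weight
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    community = {n: n for n in adj}  # start with every node on its own
    total = dict(degree)             # summed degree of each community
    improved = True
    while improved:
        improved = False
        for node in adj:             # fixed order here; Louvain shuffles it
            current = community[node]
            total[current] -= degree[node]  # take the node out of its community
            links = defaultdict(float)      # weight towards each neighbouring community
            for nbr, w in adj[node].items():
                if nbr != node:
                    links[community[nbr]] += w

            def gain(c):
                # Quantity proportional to the modularity gain of joining c.
                return links.get(c, 0.0) - total[c] * degree[node] / two_m

            best = current
            for c in links:
                if gain(c) > gain(best):
                    best = c
            total[best] += degree[node]
            community[node] = best
            improved = improved or best != current
    return community


# Toy graph: two triangles joined by a single bridge (2-3).
edge_list = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = {n: {} for n in range(6)}
for u, v in edge_list:
    adj[u][v] = adj[v][u] = 1.0

partition = local_moving(adj)
groups = defaultdict(set)
for node, label in partition.items():
    groups[label].add(node)
print(sorted(sorted(g) for g in groups.values()))
```

A node is moved only when the move increases Modularity, which is the sense in which only groupings that produce gains are implemented.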

The output of the Louvain algorithm consists of a set of communities (i.e., a community partition) and a Modularity figure that allows us to evaluate its mathematical relevance. According to the creator of the Modularity metric, Mark Newman [84], values between 0.3 and 1.0 indicate a good quality of the community partition of a Network (i.e., it is assumed that the network is significantly different from the one that could have been constituted by chance). Despite their obvious similarities, modularity should not be confused with a hypothetical validation metric such as the "*p*-value" used in inferential statistics for the acceptance or rejection of the null hypothesis. Modularity is not used by the Louvain algorithm as a metric for hypothetical validation, but as an internal optimization mechanism. In other words, the algorithm is oriented to obtain the best possible Modularity figure. This feature, far from being a problem, is what allows the researcher to work with categories based on empirical data. This is a great example of how unsupervised algorithms facilitate qualitative readings of massive quantitative data.
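To make the Modularity figure more concrete, the following sketch computes it for a given partition. It assumes an undirected, weighted graph for simplicity (the mention networks discussed here are directed, for which a directed variant of the metric exists), and the function name and toy graph are ours.

```python
def modularity(edges, community):
    """Newman modularity of a partition of an undirected, weighted graph.

    edges: list of (u, v, weight) tuples; community: dict node -> label.
    """
    m = 0.0
    internal = {}    # weight of edges inside each community
    degree_sum = {}  # summed weighted degree of each community
    for u, v, w in edges:
        m += w
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0.0) + w
        degree_sum[cv] = degree_sum.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    # Observed share of internal edges minus the share expected at random.
    return sum(internal.get(c, 0.0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Toy graph: two triangles joined by a single bridge (2-3).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 1.0)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = dict.fromkeys(range(6), "all")

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

In this toy case, splitting the graph into its two triangles yields a Modularity of 5/14 (about 0.357), whereas placing all nodes in a single community yields 0, that is, no improvement over chance.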

In community identification, being an unsupervised process, the role of the analyst is not to train the algorithm to identify one or another type of group, but to interpret the results of a node clustering process based on the patterns that the algorithm itself is able to identify in the data autonomously. Common SNA software (e.g., Gephi or Pajek) allows the analyst to establish community partitions at different resolutions [85], and thus to choose between identifying a larger number of smaller groups or a smaller number of larger groups. When it comes to analyzing social movements such as feminism, this type of approach makes it possible to conceptualize the complexity of social identity (i.e., the diversity of identifications available in the Self and its hierarchical structure) and intergroup relations in a privileged way, and allows social analysts to move away from essentialist and reductionist conceptualizations [86].

In this analytical model, the cluster is the element that provides the context for the analysis of the rest of the data: the leaders in the network and its contents. Regarding the analysis of the leaders of a network (e.g., the most mentioned users, the most active in mentioning third parties, the best intermediaries, the ones that can most easily reach any other, etc.), it will be relevant to use metrics such as the input degree or in-degree (i.e., the number of edges received by a node), the output degree or out-degree (i.e., the number of outgoing edges), or betweenness centrality (i.e., the number of shortest paths between pairs of nodes in a network that pass through each node). On the other hand, for content analysis, lists of tweets and lists of hashtags are compiled for each cluster by means of Data Engineering strategies for crossing and combining data sources (see Figure 2).
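The three leadership metrics mentioned above can be illustrated with the following sketch. It assumes that the network is stored as a dictionary from `(source, target)` pairs to mention counts; degrees are weighted by those counts, while betweenness is computed on the unweighted directed graph with Brandes' algorithm and left unnormalized. SNA packages may apply different weighting or normalization conventions, and the account names are hypothetical.

```python
from collections import defaultdict, deque


def degrees(edges):
    """Weighted in-degree and out-degree from {(source, target): weight}."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for (s, t), w in edges.items():
        outdeg[s] += w
        indeg[t] += w
    return dict(indeg), dict(outdeg)


def betweenness(edges):
    """Unnormalised betweenness on the directed, unweighted graph (Brandes)."""
    succ = defaultdict(list)
    nodes = set()
    for s, t in edges:
        succ[s].append(t)
        nodes.update((s, t))
    score = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search counting shortest paths from `source`.
        sigma = dict.fromkeys(nodes, 0)
        sigma[source] = 1
        dist = {source: 0}
        preds = defaultdict(list)
        order, queue = [], deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical mention counts: a and d mention b, and b mentions c.
mention_edges = {("a", "b"): 3, ("d", "b"): 2, ("b", "c"): 1}
indeg, outdeg = degrees(mention_edges)
between = betweenness(mention_edges)
print(indeg, outdeg, between["b"])
```

In this example, account `b` is both the most mentioned user and the only intermediary, since every shortest path towards `c` passes through it.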

**Figure 2.** Example of combining data from two different tables. Source: Own elaboration with Power Point.

The next step in this proposal is to use Business Intelligence software such as Tableau, Power BI, Google Data Studio, or Grafana to cross-reference the data. It is also desirable to be able to count on SQL and NoSQL database technologies, which make it possible to establish links between databases with different degrees of structuring, depending on specific categories, and to subsequently represent the combined fields in tables or graphs. Using this type of tool, it is possible to generate dashboards with linked visualizations that allow interactive navigation: selecting each cluster and visualizing its properties based on the indicators and key variables for each case of analysis. This type of approach therefore requires analytical profiles that are also familiar with some fundamental operations of Data Engineering, such as value transformation or table joining.
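As an illustration of the table joining operation, the following sketch uses Python's built-in `sqlite3` module to combine a table of nodes (with the cluster assigned to each account) and a table of tweets, and then ranks the tweets within each cluster. The table names, column names, and rows are hypothetical; in practice the same join can be performed inside the Business Intelligence tool itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes  (account TEXT PRIMARY KEY, cluster INTEGER);
    CREATE TABLE tweets (account TEXT, content TEXT, retweets INTEGER);
""")
# Illustrative rows only: cluster labels would come from community detection.
conn.executemany("INSERT INTO nodes VALUES (?, ?)",
                 [("ana", 0), ("ben", 0), ("cara", 1)])
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?)",
                 [("ana", "tweet A", 120), ("ben", "tweet B", 300),
                  ("cara", "tweet C", 45), ("cara", "tweet D", 80)])

# Join each tweet to the cluster of its author and rank within each cluster.
rows = conn.execute("""
    SELECT n.cluster, t.account, t.content, t.retweets
    FROM tweets AS t
    JOIN nodes AS n ON n.account = t.account
    ORDER BY n.cluster, t.retweets DESC
""").fetchall()

for cluster, account, content, retweets in rows:
    print(cluster, account, content, retweets)
```

The joined result is the kind of table that a dashboard can filter by cluster to display, for instance, the most disseminated tweets of each community.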

#### *4.2. Combining Induction and Deduction to Understand the Current Feminist Wave*

Social media constitute a sort of public sphere in which different social movements deploy their strategies and define themselves through practices that involve the transmission of an immaterial legacy [15]. In the case of feminism, this has been recurrently expressed through the metaphor of "waves," about which several authors suggest that we are currently facing the fourth [87]. The role of hashtivism, and very particularly of that which has developed on Twitter, is becoming very important in the self-definition of fourth-wave feminism, a constructor of new political subjects that pivot around particular campaigns or hashtags [88]. The fourth wave of feminism thus dialogues with analog (pre-digital) activists who have endowed it with a whole tradition of struggle and a not inconsiderable set of small victories that, taken together, have improved the living conditions of women in different parts of the world. It is precisely in this dialogue, sometimes explicit and sometimes implicit, that the practices of intangible cultural heritage transmission and political self-definition of fourth-wave feminism materialize. It is this dialogue that we wish to analyze here.

The guidelines and steps described above can be applied in various investigations to approach the object of study from different perspectives, depending on the research objectives. We will now look at four practical applications of the described methodology to the analysis of fourth-wave feminism and feminist hashtivism. In them, we start from a perspective based on inductive and inferential reasoning, which seeks theoretical synthesis from the observation of cases (see Figure 3). We will also see how this type of approach is used in a complementary way with hypothetico-deductive approaches that seek the opposite: the validation of theories and hypotheses based on the observation of cases. In fact, by putting these practical and real examples on the table, we argue that these two modes of reasoning work best when used together, and this is precisely one of the main strengths of the method of analysis described.

**Figure 3.** General workflow for the proposed methodology. Source: Own elaboration with Power Point.

In methodological terms, we argue that attributing epistemological priority to contextual analysis using unsupervised algorithms and an inductive logic (e.g., detecting communities with the Louvain algorithm) is an efficient strategy to overcome some of the most common problems in the analysis of massive social media data, such as uninformative and/or spurious sentiment analysis [89].

Below we cite four case studies, published in scientific journals, that employ this method and apply it to the observation of phenomena related to one of the most significant contemporary mobilizing currents, feminisms, and their presence in the online public debate. All these investigations are part of the general project whose methodology we are documenting in these pages. It is a project dedicated to the analysis of different conversations and controversies that occupy fourth-wave feminism in the Spanish, Spanish-speaking and, eventually, also international sphere. All of them therefore share the same epistemological and methodological approach, as well as an object of study linked to the practices that transmit the intangible cultural heritage of feminism.

#### 4.2.1. Feminisms Outraged at Justice

The first study uses SNA techniques to analyze the reaction of Spanish-speaking feminism on Twitter to the controversial sentence against the members of "la manada", a group of men who raped a woman at a local festival [90]. The sentence in question convicted the men of sexual abuse, but not of rape, despite recognizing the materiality of the crime, and also acquitted them of the crimes of recording the rape with a cell phone and robbery with intimidation.

Following network analysis and community identification, the five most disseminated tweets in each community are identified, along with their most prominent leaders by input degree (i.e., the most mentioned). From the content analysis, a total of three "macronarratives" about the Spanish judicial system are identified in the reaction to the sentence on Twitter. Two of these "macronarratives" projected a very negative evaluation of the Spanish judicial system, while the third was one of defense of the system and, simultaneously, of criticism of the feminist movement.

In this research it was possible to identify a series of practices strongly linked to the construction of the political subject of fourth-wave feminism, as well as a series of practices of identification and differentiation with respect to older generations and to the historical feminist movement.

#### 4.2.2. Digital Prospects of the Contemporary Feminist Movement for Dialogue and International Mobilization

This study analyzes the conversation in Spanish and English around the 2018 international day against violence against women on Twitter [88]. The most shared contents of each cluster of the network, both messages and images, are inspected, and the levels of intercommunity homophily (i.e., to what extent nodes tend to establish links with those who are part of the same group or with those who are part of the other group) [91] are examined.
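One simple way to quantify this tendency is the E-I index, which compares the external (E) and internal (I) ties of each community as (E - I) / (E + I): values close to -1 indicate strong homophily and values close to +1 indicate a predominance of links to other groups. The sketch below is offered only as an illustration and is not necessarily the measure used in [91]; the account names, community labels, and weights are hypothetical.

```python
def ei_index(edges, community):
    """E-I index per community: (E - I) / (E + I), ranging from -1 to +1."""
    external, internal = {}, {}
    for (source, target), weight in edges.items():
        cs, ct = community[source], community[target]
        if cs == ct:
            internal[cs] = internal.get(cs, 0) + weight
        else:
            # A cross-community mention counts as external for both sides.
            external[cs] = external.get(cs, 0) + weight
            external[ct] = external.get(ct, 0) + weight
    result = {}
    for c in set(internal) | set(external):
        e, i = external.get(c, 0), internal.get(c, 0)
        result[c] = (e - i) / (e + i)
    return result


# Hypothetical mentions between a Spanish-language and an English-language cluster.
mention_edges = {("a", "b"): 4, ("b", "a"): 2, ("c", "d"): 3, ("a", "c"): 1}
community = {"a": "es", "b": "es", "c": "en", "d": "en"}
homophily = ei_index(mention_edges, community)
print(homophily)
```

Both toy clusters obtain negative values, which would be read as a predominance of links within each language community.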

Overall, differences are observed between Latin feminist activisms (generally more contentious) and Anglo-Saxon ones (more liberal and based on the support of individual cases), together with an absence of shared transnational and translinguistic narratives, which leads the researchers to question the usefulness of feminist hashtivism, at least as far as its international coordination is concerned.

The results of this research open the door to consider the plurality of forms in which fourth-wave feminism materializes, as well as the different links established with the legacy of feminist activists of previous waves and the transmission of their immaterial cultural heritage.

#### 4.2.3. Feminist Hashtag Activism in Spain

An analysis is made of the contents of the clusters in the digital conversation that arose from the sentence of "la manada" in April 2018 and continued for a few months through a series of hashtags in solidarity with the victim [92]. On this occasion, attention is paid to the tone of the messages. An interesting correlation is detected between clusters with higher Input Degree Centralization (i.e., the extent to which the reception of mentions is centralized or decentralized in the network) [93] and greater banality in the contents (e.g., memes and other viral contents). Conversely, clusters that are more decentralized in the reception of mentions show greater seriousness, formality, and anchorage in the historical feminist movement in the contents disseminated and the codes used.
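The centralization indicator can be sketched as follows, using Freeman's general formula applied to in-degree: the sum of the differences between the highest in-degree and that of every node, divided by the maximum value this sum can take in a directed graph without loops, (n - 1)^2. The sketch assumes binary ties (repeated mentions between the same pair are counted once); the exact variant computed in the cited study [93] may differ, and the account names are hypothetical.

```python
def in_degree_centralization(edges, nodes):
    """Freeman centralization of in-degree for a directed graph without loops.

    edges: iterable of (source, target) pairs; nodes: all nodes in the cluster.
    Returns 0.0 when mentions are evenly received and 1.0 for a perfect star.
    """
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    indeg = dict.fromkeys(nodes, 0)
    for source, target in set(edges):  # binary ties: ignore repeated mentions
        if source != target:
            indeg[target] += 1
    top = max(indeg.values())
    return sum(top - d for d in indeg.values()) / ((n - 1) ** 2)


# A star: four accounts all mention a single hub.
star = [("u1", "hub"), ("u2", "hub"), ("u3", "hub"), ("u4", "hub")]
star_value = in_degree_centralization(star, ["hub", "u1", "u2", "u3", "u4"])

# A ring: every account receives exactly one mention.
ring = [("a", "b"), ("b", "c"), ("c", "a")]
ring_value = in_degree_centralization(ring, ["a", "b", "c"])
print(star_value, ring_value)
```

The star-shaped cluster reaches the maximum value of 1.0, whereas the ring, in which mentions are evenly distributed, scores 0.0.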

In addition, a link is established between a variable external to the network (i.e., the degree of politicization of the users, measured by the number of politicians' accounts followed) and membership in a cluster of high or low Input Degree Centralization by means of a Machine Learning algorithm: a logistic regression that achieved an accuracy of 67.4%. In other words, it is inferred that the number of politicians' accounts followed is an informative predictor of the mode of political participation on Twitter.
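The logic of this classification step can be illustrated with a single-feature logistic regression fitted by gradient descent. This is a didactic sketch written with the standard library only; it is not the model, the software, or the data of the cited study. The toy values are invented and perfectly separable, so the accuracy obtained here says nothing about the 67.4% reported above, and the label 1 simply marks one of the two cluster types without implying which one is associated with following more politicians.

```python
import math


def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(y=1|x) = sigmoid(w*z + b) by gradient descent, z = standardised x."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    zs = [(x - mean) / std for x in xs]
    w = b = 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for z, y in zip(zs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * z + b)))
            grad_w += (p - y) * z
            grad_b += p - y
        w -= lr * grad_w / len(zs)
        b -= lr * grad_b / len(zs)

    def predict(x):
        """Return 1 when the predicted probability exceeds 0.5, else 0."""
        return int(w * (x - mean) / std + b > 0)

    return predict


# Invented toy data: politicians' accounts followed and cluster type (0 or 1).
followed = [0, 1, 2, 3, 10, 12, 15, 20]
cluster_type = [0, 0, 0, 0, 1, 1, 1, 1]

predict = fit_logistic(followed, cluster_type)
hits = sum(predict(x) == y for x, y in zip(followed, cluster_type))
accuracy = hits / len(followed)
print(accuracy)
```

In a real analysis the accuracy should be estimated on users not seen during fitting, for example through a train and test split, rather than on the fitting data as in this toy case.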

This study invites us to consider the diversity of modes of participation and belonging to fourth-wave feminism, as well as the diversity of forms of linkage (i.e., stronger, or weaker, more conscious, or more unconscious) with previous generations and with the feminist immaterial cultural heritage.

#### 4.2.4. Influence of Gender in Electoral Debates in Spain

The type and tone of the most shared messages in different clusters are identified on the occasion of two televised electoral debates (April 2019 General Elections in Spain), one featuring only men (#ElDebateDecisivo) and the other only women (#L6Neldebate) [94].

It is observed, through the analysis of the most shared tweets in each cluster and the strategy of identifying, within the clusters of the male debate, the nodes also present in the female debate, that users who participate in the conversation around both debates do so with more serious codes, while those who participate only in the male one make use of banal and humorous resources more frequently.
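The strategy of locating, within the clusters of one network, the nodes that are also present in another can be sketched with basic set operations. The account names and cluster labels below are hypothetical.

```python
from collections import Counter

# Hypothetical inputs: cluster assignment in the first debate's network
# and the set of accounts seen in the second debate's network.
cluster_in_first = {"ana": 0, "ben": 0, "cara": 1, "dan": 1, "eva": 1}
users_in_second = {"ana", "cara", "dan", "zoe"}

both = set(cluster_in_first) & users_in_second        # took part in both debates
only_first = set(cluster_in_first) - users_in_second  # took part in the first only

# Share of each first-debate cluster that also took part in the second debate.
size = Counter(cluster_in_first.values())
shared = Counter(cluster_in_first[user] for user in both)
overlap = {cluster: shared[cluster] / size[cluster] for cluster in size}
print(sorted(both), sorted(only_first), overlap)
```

Comparing the most shared tweets of the accounts in `both` with those of the accounts in `only_first` is then a matter of filtering the tweet table by each set.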

In line with the findings of the previous article, here we can observe a certain alignment between the more self-conscious modes of feminist hashtivism and the feminist agenda of political parties, and a greater alignment of spontaneous and depoliticized feminism with the irreverent and sometimes banal standards of virtual conversations.

#### **5. Conclusions with Discussion**

Social networks such as Twitter are a constant object of analysis by the various disciplines of social science, among which we highlight in this paper the discipline that focuses on the analysis of mass communication, transformed into mass interpersonal communication [95]. This field has undergone an indisputable transformation that alters the traditional theoretical-methodological foundations of research on communicative production and reception. One of the main indications of this evolution is the growth in the use of systematic and standardized data techniques in empirical works, as well as the tendency to focus the research effort on the discursive and dialogic dimension of communication [96].

After presenting a specific workflow, which we believe is also replicable and applicable to many analytical objects, we have presented four different investigations around the same theoretical object. With variations, in the four investigations we have proceeded with the download of data from the Twitter API, with Network Analysis applied to the mention relationships between Twitter users, with community identification based on the network of mentions, and with the visualization of each community through a data modelling and visualization strategy. Not surprisingly, each investigation has led to a separate set of results. At the aggregate level, however, what we have seen is the usefulness of the methodology.

Our intention here has been to present a replicable methodology for the generation of knowledge in massive data environments that require hybrid analytical perspectives and skills that can articulate rigorous mathematical analysis and relevant phenomenological interpretations. Our object of study has been fourth-wave feminism, and more specifically, we have focused on some reflections on the processes of transmission of the feminist intangible cultural heritage that take place on Twitter through massive and often chaotic conversations, which nevertheless structure and situate the contemporary self-definition of the movement, always in dialogue with its recent and not so recent past while projecting into the future. To this end, a series of specific analytical procedures have been defined to study the processes of communication on Twitter.

The distinctiveness of this methodological procedure consists not only in the use of analysis techniques and tools that are not very common in the social-scientific field, although increasingly so, but also in deploying a type of strategy that makes it possible to combine quantitative and qualitative analysis, as well as hypothetico-deductive and inductive perspectives. The paradigm of big data and the techniques associated with the computational sciences to which we have referred (SNA and ML) make it possible, to a large extent, to overcome old technical, methodological, and even epistemological dichotomies or, at least, place us before new scenarios that provoke new metaphors and ways of looking at the classic objects of study of the social and communication sciences. These analyses not only serve to characterize the most recent research in communication 2.0, but also shape an analytical proposal based on the capacity of Twitter to weave individual messages into a collective dialogue around certain conversations or hashtags, this being a logic closely linked to feminist hashtivism, fourth-wave feminism, and the processes of transmission of immaterial cultural heritages in digital environments.

In any case, neither the proposed methodology nor the research we have used as an example obviate the existence of critical positions on the potential of the digital medium, on hashtivism in general, and on feminist hashtivism. We understand that, in addition, many of these critiques—traditionally established from the academy and from the feminist activism of previous generations—not only imply a form of real dialogue between feminists, but also imply in themselves practices of transmission of the immaterial cultural heritage: a transmission that never takes place in a linear and unidirectional way, but that necessarily passes through the self-constructive and creative filter of each generation or wave. Similarly, the proposals collected present not only the benefits, but also the methodological limitations of research on Twitter [97] based on the restrictions of the data collected, the bias of representation when making general assumptions and other problems, derived, for example, from the language of use of the users.

**Author Contributions:** Conceptualization, J.M.-i.-G. and J.O.-T.; methodology, J.M.-i.-G.; software, J.M.-i.-G.; validation, A.L.-U., J.O.-T. and S.P.-F.; formal analysis, A.L.-U.; investigation, J.O.-T.; resources, S.P.-F.; data curation, J.M.-i.-G.; writing—original draft preparation, J.M.-i.-G.; writing review and editing, J.O.-T., A.L.-U. and S.P.-F.; visualization, J.M.-i.-G.; supervision, J.O.-T. and J.M.-i.-G.; funding acquisition, A.L.-U. and S.P.-F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Basque Government grant number IT-1112.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Written informed consent was not obtained due to Twitter API access policies.

**Data Availability Statement:** Data available upon request.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

