1. Introduction
In recent years, topic modeling has emerged as a key tool for extracting latent thematic structures from large-scale textual data, particularly in domains such as news analysis, policy evaluation, and digital humanities. In big data applications, topic modeling and thematic analysis have become popular tools in natural language processing (NLP) for understanding textual data, though they approach the task from different angles [1]. Topic modeling identifies abstract topics within a corpus by analyzing co-occurring word patterns. Thematic analysis, by contrast, is a qualitative method that identifies themes, patterns, and meanings within textual data through manual coding and interpretation. While topic modeling relies on statistical algorithms to extract topics, thematic analysis involves a more nuanced exploration of the data, shaped by subjectivity and context. Despite their differences, both approaches have seen significant advancement and adoption over the last decade, contributing to a deeper understanding of large text datasets in fields such as social sciences, healthcare, and marketing [2].
To address the complexity of multilingual and morphologically rich corpora, a range of topic modeling approaches have been proposed, including probabilistic, embedding-based, and fuzzy logic-driven models. Topic modeling in deep text mining offers advantages such as semantic understanding and feature representation by identifying latent topics or themes in documents [3]. These extracted topics serve as informative features for deep learning models, aiding in learning complex patterns and relationships among words or documents. Additionally, topic modeling assists in dimensionality reduction by representing documents in terms of their topic distributions, mitigating the curse of dimensionality [4,5]. However, challenges arise in terms of granularity loss, training complexity, adaptability, and interpretability [6]. Integrating topic modeling with deep learning architectures introduces complexity during training, requiring careful optimization of hyperparameters and architectural choices. Moreover, while topic modeling enhances interpretability, the interpretation of deep learning models remains challenging, particularly in complex neural network architectures. Overall, while topic modeling enhances semantic understanding and feature representation, its integration with deep text classification requires weighing both its advantages and its challenges.
Fuzzy topic modeling, which integrates fuzzy logic principles into topic modeling algorithms, finds applications across various domains where data exhibit vagueness or uncertainty. For instance, in social media analysis, fuzzy topic modeling effectively captures the ambiguity and imprecision inherent in user-generated content [7]. In healthcare, it assists in identifying latent themes in medical records or patient feedback, considering the diverse and nuanced nature of medical terminology and patient experiences [8,9]. Moreover, in market research and customer feedback analysis, fuzzy topic modeling uncovers subtle patterns and sentiments in consumer opinions and preferences [10]. Additionally, in text summarization and information retrieval tasks, fuzzy topic modeling improves the relevance and coherence of extracted topics or summaries by considering the fuzzy nature of linguistic expressions [11,12]. Overall, fuzzy topic modeling offers versatile applications across domains where traditional topic modeling approaches may struggle to handle the inherent uncertainties present in textual data.
Despite their advancements, these models face persistent challenges in capturing semantic ambiguity, overlapping themes, and language-specific features, especially in agglutinative languages like Turkish. Current challenges in topic modeling include scalability to handle large-scale datasets efficiently, interpretability to provide meaningful insights from the identified topics, robustness to noise and outliers in the data, and adaptability to diverse domains and languages. Scalability remains a significant challenge, as traditional topic modeling algorithms struggle to process massive volumes of text data in a reasonable amount of time [13]. Interpretability is crucial for ensuring that the identified topics are coherent, semantically meaningful, and easily understandable by users. Robustness is another challenge, as topic models often encounter noisy or ambiguous text data, leading to suboptimal topic representations. Finally, adapting topic modeling techniques to different domains and languages requires overcoming issues related to data sparsity, domain-specific terminology, and linguistic variations, necessitating the development of domain-aware and language-aware topic modeling approaches [14]. Addressing these challenges is crucial for advancing topic modeling techniques and enabling their widespread adoption in various applications [15,16], including information retrieval, recommendation systems, and text summarization.
The state of the art in fuzzy text processing involves leveraging fuzzy logic and machine learning techniques to address the inherent ambiguity and uncertainty present in textual data. Researchers have explored various approaches, including fuzzy set theory, fuzzy clustering, and fuzzy logic-based classifiers, to capture the vagueness and imprecision of natural language. These methods often incorporate linguistic variables, membership functions, and fuzzy rules to model the uncertainty inherent in text classification tasks [17]. Additionally, advancements in deep learning, such as neural networks with fuzzy logic-inspired layers or architectures, have contributed to improving the accuracy and robustness of fuzzy text models. Bottlenecks in fuzzy text processing arise from the linguistic variables and their relationships within a fuzzy logic framework, as linguistic terms can be subjective and context-dependent [18]. Another challenge is devising effective fuzzy inference mechanisms to make decisions or classifications based on fuzzy rules, as determining the optimal aggregation methods for fuzzy sets and handling the computational complexity of fuzzy inference engines can be demanding. Additionally, integrating fuzzy techniques with existing text processing pipelines may introduce complexities in preprocessing, feature extraction, and model training, requiring careful optimization and tuning to achieve optimal performance. Overall, addressing these bottlenecks requires innovative approaches that balance computational efficiency with the ability to capture and interpret the nuances of fuzzy linguistic expressions in text data.
This study aims to comparatively evaluate the semantic structures produced by different topic modeling techniques through knowledge graph visualizations, with a focus on interpretability and thematic granularity in Turkish news discourse. The present study distinguishes itself through its integrative comparative framework that systematically evaluates conventional, neural, and fuzzy logic-based topic modeling techniques in the context of Turkish news discourse. While prior research has explored individual topic modeling methods—such as LDA, LDA extensions, BERTopic, or Top2Vec—few studies have examined their comparative performance in morphologically rich languages or investigated their implications through knowledge graph construction. This study contributes a novel perspective by incorporating Fuzzy Latent Semantic Analysis with weights (FLSA-W) as a soft clustering alternative capable of modeling partial topic membership and thematic ambiguity. In particular, it innovatively combines fuzzy logic with weighted term distributions and leverages neural embedding-based methods (e.g., Top2Vec and BERTopic) to contrast hard and soft clustering behaviors. Moreover, the use of knowledge graphs provides an additional layer of interpretability by mapping semantic connectivity and topic transitions across models. This multi-model and multi-representational approach fills a critical gap in the literature, offering both methodological innovation and empirical insights into the role of fuzzy reasoning in multilingual and semantically complex corpora.
The findings of this study have practical implications for a range of stakeholders, including professionals in the news industry, content analysts, and the NLP research community. By demonstrating the effectiveness of fuzzy topic modeling techniques in capturing overlapping and nuanced thematic structures within Turkish news data, this study provides a practical approach for improving automated content classification, media monitoring, and discourse analysis. These insights are particularly valuable in the context of Turkish and Turkic languages, which are morphologically rich and often considered low-resource due to the limited availability of annotated linguistic tools and datasets. The proposed methods offer scalable, unsupervised solutions that can enhance the accessibility and semantic analysis of content in these underrepresented languages. More broadly, this study contributes to the development of inclusive NLP methodologies capable of addressing the unique linguistic challenges of low-resource language families, thereby supporting more equitable advances in multilingual language technologies.
The primary objective of this study is to evaluate and compare the capacity of fuzzy topic modeling algorithms to capture thematic overlap and network complexity in multilingual news corpora, with a particular emphasis on Turkish-language datasets. Specifically, this research investigates how fuzzy topic models—characterized by their ability to assign partial membership values to topics—can more accurately reflect the semantic ambiguity, polysemy, and contextual fluidity found in real-world news narratives, as opposed to traditional crisp or probabilistic models. To achieve this goal, we apply FLSA-W and benchmark its performance against state-of-the-art models such as BERTopic, Top2Vec, and LDA. These models are assessed not only through conventional topic quality metrics (e.g., coherence and diversity) but also via knowledge graph visualizations that reveal inter-topic relationships and lexical centralities. The inclusion of fuzzy models is particularly justified by the linguistic complexity of Turkish, which involves rich morphology, agglutination, and context-dependent word meanings—factors that challenge standard topic modeling techniques. Ultimately, this study aims to answer the following research questions:
- 1. To what extent do fuzzy topic models offer more nuanced and accurate representations of overlapping thematic structures in Turkish news data compared to conventional models?
- 2. How do the resulting knowledge graphs reflect the differences in model behavior, particularly in terms of topic connectivity and semantic granularity?
- 3. Can fuzzy-based modeling approaches improve the interpretability and real-world applicability of topic modeling in complex linguistic environments?
By explicitly addressing these questions, this study advances work on low-resource, morphologically complex languages and contributes new insights into how fuzzy logic frameworks can enhance thematic extraction and knowledge graph construction. The rest of our study is organized as follows. Section 2 reviews related work on topic modeling techniques and deep topic information extraction. Section 3 details our dataset and methodology on the basis of four different topic models; the dataset, tools, and statistical learning frameworks are presented together with the corresponding computational parameters. Section 4 presents our results with knowledge-based graphing and statistical evaluation. Section 5 discusses the scope of the study in light of the obtained results. Finally, Section 6 concludes with an interpretation of fuzzy topic modeling in light of current cutting-edge and prospective trends.
2. Related Works
Despite remarkable advances, integrating topic modeling with knowledge graph construction continues to pose several methodological and practical challenges. One significant issue lies in the semantic alignment between the probabilistic nature of topic models and the structured, relational nature of knowledge graphs. Topic models like LDA, BERTopic, or Top2Vec often produce topics as soft clusters of words without explicit ontological boundaries [19], making it difficult to map these fuzzy topics onto precise graph structures without losing important contextual nuances. Additionally, ensuring consistency when updating dynamic corpora, such as adding time-evolving topics or emerging subtopics, demands robust graph evolution strategies that preserve both local and global semantic relationships. Another major challenge is the quantitative evaluation of the combined system. While coherence and diversity metrics exist for topic models, measuring how well topics integrate into a knowledge graph requires advanced metrics for graph quality, such as structural entropy, community modularity, or graph embedding similarities. Moreover, the computational complexity of generating and maintaining large-scale graphs enriched with topic information can become prohibitive, especially for real-time applications [4,20]. Bridging these gaps calls for innovative approaches that unify probabilistic topic distributions with graph-based reasoning and representation learning, potentially leveraging graph neural networks or hybrid symbolic–neural architectures to enhance interpretability, scalability, and adaptability.
Fuzzy Latent Semantic Analysis (FLSA) [21] and its variant FLSA-W [22,23] represent significant advancements in topic modeling, particularly for domains like news analytics where linguistic ambiguity and semantic drift are pervasive. Unlike traditional models such as LDA, which assign words to topics based on strict probabilistic distributions, FLSA incorporates fuzzy membership functions to allow words and documents to partially belong to multiple topics. This is particularly advantageous in news streams, where overlapping themes, polysemy, and evolving contexts demand more flexible topic boundaries. Compared to Top2Vec or BERTopic, which rely heavily on embedding spaces or clustering dense vectors, FLSA and FLSA-W explicitly encode vagueness in linguistic data, yielding richer interpretability and smoother topic transitions over time. Their ability to adjust membership degrees dynamically can better reflect real-world semantic fluidity in fast-changing news corpora. However, despite these advantages, FLSA and FLSA-W also face certain bottlenecks compared to their mainstream counterparts. FLSA models can require more complex parameter tuning and higher computational costs due to the added fuzzy clustering steps and iterative defuzzification. Embedding-based methods like Top2Vec and BERTopic excel at leveraging pre-trained contextual embeddings, often outperforming FLSA in capturing subtle semantic relationships in large datasets. Yet these models sometimes struggle with interpretability and tend to ignore the inherent uncertainty in topic boundaries, a gap FLSA-W addresses through its weighted fuzzy memberships.
In content analysis, LDA found broad application in news modeling, particularly in topic modeling and content recommendation systems. LDA was leveraged to categorize large collections of news articles into topics, providing efficient content organization and retrieval for users [24,25]. Furthermore, LDA's ability to uncover hidden thematic structures within news data enhanced sentiment analysis and opinion mining tasks, enabling a more nuanced understanding of public opinion trends [26,27]. Additionally, LDA-based models were integrated into news recommendation systems, aiding personalized content delivery by identifying relevant topics based on user preferences and historical interactions [28]. These applications underscored LDA's versatility in extracting meaningful insights from news corpora, contributing to advancements in information retrieval and user engagement in the digital news landscape. Moreover, contextual factors and data quality affect LDA performance. Previous studies highlighted the importance of considering contextual factors, such as language and cultural context, and data quality issues, such as optical character recognition errors and problematic segmentation of articles [29]. These factors should be carefully addressed when applying LDA to Turkish news corpora. Hybrid approaches and contextual embeddings can also improve topic modeling: a study using BERT-LDA [30] demonstrates that incorporating contextual embeddings can lead to more coherent topics. This suggests that exploring hybrid approaches and contextual embedding techniques may enhance the performance of LDA on Turkish news corpora.
In the field of news modeling, Top2Vec [31,32], a document embedding technique, became popular for its effectiveness in tasks like topic summarization and sentiment analysis [33]. Unlike traditional methods that rely on keyword extraction or statistical analysis, Top2Vec leverages its understanding of semantic relationships between words to uncover latent topics within news articles. This allows it not only to identify key themes but also to differentiate between positive, negative, or neutral sentiment within the news content. This empowers researchers to design algorithms that can autonomously summarize the insights of news feeds by extracting prominent topics. Essentially, Top2Vec offers a powerful tool for sifting through the vast amount of news data and extracting valuable insights that would be difficult or time-consuming to obtain through manual methods [34].
BERTopic [35] is another cutting-edge tool in neural topic modeling, well suited to handling media data and microblogs. It surpasses traditional methods that rely on keyword extraction and statistical analysis. BERTopic uses the power of pre-trained transformer models to preserve the semantic relationships within news articles [36]. This allows it to uncover latent thematic structures that transcend keyword identification. BERTopic does not just provide abstract clusters; it generates human-interpretable topic descriptions using keywords, fostering a clear comprehension of the underlying themes. Used as a pre-processing step, BERTopic enhances the accuracy of news article classification, even with limited labeled data. This makes BERTopic a pivotal tool for organizations that require automated news categorization and analysis, be it for sentiment analysis, identifying fake news, or personalizing news feeds for user preferences [37,38].
3. Materials and Methods
The proposed research pipeline follows a structured six-stage framework designed to provide comprehensive topic modeling and semantic analysis of Turkish news media, as shown in Figure 1. Data collection and corpus construction involve the compilation of global news articles from two major Turkish media agencies, ensuring internationally relevant discourse. The text preprocessing and normalization phase prepares the raw textual data through tokenization, lowercasing, stopword removal, lemmatization, and stemming, optimizing it for topic modeling algorithms. Exploratory analysis and word cloud visualization are performed to generate initial insights into term frequencies and highlight dominant lexical patterns across the corpus. Topic modeling is conducted using Top2Vec and FuzzyTM, two advanced models that leverage neural embeddings and fuzzy logic, respectively, to extract latent semantic structures without requiring a predefined number of topics. The performance of these models is evaluated against conventional approaches—specifically LDA and BERTopic—using quantitative metrics such as coherence, diversity, and an interpretability score derived from their product. Knowledge graphs are generated to visually represent topic interconnectivity, and entropy analysis is applied to quantify the structural complexity of the resulting semantic networks. This integrated framework enables both a rigorous computational comparison and a deeper understanding of how different modeling paradigms reveal thematic content in morphologically rich, multilingual settings. The complete analysis is presented with its corresponding English translations.
The dataset comprises 3307 articles published over a six-month period (March 2023–September 2023), containing a total of 12,131 unique tokens, as represented in Figure 2. The distribution of articles is roughly uniform across the period, with a modest decline during the summer months.
To ensure high-quality input for modeling, we applied a preprocessing pipeline specifically designed for Turkish natural language processing. This involved lowercasing, punctuation and stopword removal, and tokenization. Given the agglutinative nature of Turkish, we used the Zemberek NLP library [39,40] for effective lemmatization and morphological normalization, reducing lexical variation and enhancing model interpretability.
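To make this concrete, the following is a minimal sketch of such a pipeline. The stopword list is truncated for illustration, and the `lemmatize` argument is a hypothetical callable standing in for a Zemberek-backed lemmatizer (Zemberek is a Java library, so the exact binding is left open):

```python
import re

# Truncated illustrative stopword list; a complete Turkish stopword
# resource would be used in practice.
TURKISH_STOPWORDS = {"ve", "bir", "bu", "da", "de", "için", "ile", "olarak", "gibi"}

def preprocess(text, lemmatize=None):
    """Lowercase, remove punctuation, tokenize, drop stopwords, lemmatize.

    `lemmatize` is a hypothetical callable wrapping Zemberek's morphological
    analysis; it maps a surface token to its lemma.
    """
    text = text.lower()  # note: Turkish dotted/dotless 'i' casing needs care
    text = re.sub(r"[^a-zçğıöşü\s]", " ", text)  # keep Turkish letters only
    tokens = [t for t in text.split() if t not in TURKISH_STOPWORDS and len(t) > 2]
    if lemmatize is not None:
        tokens = [lemmatize(t) for t in tokens]
    return tokens

# docs = [preprocess(article, lemmatize=zemberek_lemma) for article in raw_documents]
```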
As an initial step of data analysis, we generated a word cloud based on term frequencies to visualize dominant lexical patterns within the corpus. This provided preliminary insight into the most salient topics prior to formal modeling (Figure 3).
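A frequency-based word cloud of this kind can be generated with the wordcloud package; the snippet below is a minimal sketch assuming `docs` holds the tokenized articles from the preprocessing step:

```python
from collections import Counter

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Term frequencies over the whole corpus (docs: list of token lists).
freqs = Counter(token for doc in docs for token in doc)

cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate_from_frequencies(freqs)

plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```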
The word cloud analysis of the news dataset reveals a strong focus on geopolitical issues and international relations. The most frequently occurring terms such as “region” and “country” (both with a weight of 1.0), along with high-frequency mentions of “USA” (0.9), “Russia” (0.8), “Turkey” (0.8), and the “Middle East” (0.8), indicate a heavy emphasis on global diplomacy and conflict. Countries like “Ukraine”, “China”, “Iran”, “Israel”, and “Azerbaijan” also appear frequently, suggesting ongoing international tensions, regional instability, or significant global events involving these nations. Additionally, the presence of terms such as “war” (0.6), “foreign minister” (0.6), and “important visit” (0.6) supports the notion that much of the news coverage centers on political actions, diplomatic engagements, and security concerns. Moreover, the dataset highlights environmental and humanitarian themes. The term “forest fire” (0.65) is notably prominent, indicating recent or recurring natural disasters, while “life”, “loss of life”, and “human rights” each appear with moderate frequency, suggesting coverage of human impact and social justice issues. The inclusion of “social media”, “citizen”, and “demonstration” points to civil participation and public discourse playing a role in news narratives. The diverse mixture of geographic, political, and societal terms indicates well-rounded coverage of global affairs, with particular attention to conflict zones, governmental activity, and crises affecting populations worldwide.
3.1. Topic Modeling
In this study, we employed two statistical and two neural topic modeling approaches to analyze Turkish-language news content. To conduct a meaningful comparative evaluation, four topic modeling methods were selected: LDA, BERTopic, Top2Vec, and FLSA-W. LDA serves as the baseline probabilistic model, widely recognized for its interpretability and foundational role in topic modeling, despite limitations in handling semantic ambiguity and non-English morphology. BERTopic was chosen for its ability to leverage transformer-based embeddings and class-based TF-IDF to produce contextually rich and interpretable topics, which is particularly beneficial for linguistically complex languages such as Turkish. Top2Vec was included due to its unsupervised document embedding and clustering mechanism that allows for automatic topic discovery without extensive preprocessing or predefined topic counts, making it adaptable to the nuances of Turkish corpora. Finally, FLSA-W represents a fuzzy logic-based extension that integrates word embeddings to model overlapping and imprecise thematic boundaries, aligning well with the linguistic ambiguity and morphological richness of Turkish. Together, these models provide a diverse methodological spectrum, enabling a comprehensive analysis of how probabilistic, neural, and fuzzy-based approaches differ in capturing latent themes and semantic structures in Turkish news data.
The first statistical method is FLSA [21]. We employed an extended, weighted version, FLSA-W [22], designed to address limitations in traditional fuzzy topic models. Unlike FLSA, which performs clustering at the document level, FLSA-W clusters over word distributions, enabling the detection of multiple topics within a single document. This shift from document-based to word-based clustering significantly improves interpretability, particularly in morphologically rich languages such as Turkish, where topic boundaries are often less distinct. Moreover, FLSA-W is capable of reducing topic overlaps, resulting in more semantically coherent and distinct topic representations. Our initial hypothesis in the evaluation concerns the diversity of the statistical topic model FLSA-W. Fuzzy topic modeling integrates principles of fuzzy logic into topic modeling to address the limitations of traditional probabilistic models, particularly their inability to effectively capture uncertainty and overlapping semantics in textual data. FLSA-W models documents as fuzzy sets over topics and represents topics as fuzzy sets over terms, allowing for partial membership of terms to multiple topics. This soft assignment enables nuanced topic representations that reflect the inherent vagueness in natural language. Given a corpus of $N$ documents and a vocabulary of $M$ terms:
- (1) Local Term Matrix: construct $A \in \mathbb{R}^{N \times M}$, where $a_{ij}$ is the frequency of term $j$ in document $i$.
- (2) Global Weighting (Optional): apply global weighting using schemes such as entropy, idf, or probidf. Let $W \in \mathbb{R}^{M \times M}$ be a diagonal matrix with global weights for each term. The weighted matrix becomes $A' = A W$.
- (3) Dimensionality Reduction (SVD): perform Singular Value Decomposition, $A' = U \Sigma V^{T}$, and project into a lower $k$-dimensional space: $U_k \Sigma_k$ (document representations) or $V_k \Sigma_k$ (term representations).
- (4) Fuzzy Clustering: apply fuzzy clustering to obtain the partition matrix $\mathbf{U} \in [0, 1]^{N \times C}$, where $C$ is the number of topics and $u_{ic}$ is the degree of membership of document $i$ in topic $c$.
- (5) Probability Distributions: normalize the memberships so that $\sum_{c=1}^{C} u_{ic} = 1$ for each document $i$, yielding the topic probability distributions $P(T_c \mid D_i)$.
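Our experiments use the FuzzyTM implementation of FLSA-W (see the implementation details at the end of this section). Purely to illustrate the five steps above, the sketch below re-creates them with NumPy, SciPy, and scikit-fuzzy, clustering term representations (as FLSA-W does) rather than documents; it is a sketch under these assumptions, not the FuzzyTM code path:

```python
import numpy as np
import skfuzzy as fuzz
from scipy.sparse.linalg import svds

def flsa_w_sketch(A, num_topics=10, k=20, m=2.0):
    """Illustrative FLSA-W pipeline on a document-term count matrix A (N x M)."""
    # (2) optional global weighting; idf is one of the schemes mentioned above
    df = np.maximum((A > 0).sum(axis=0), 1)   # document frequency per term
    idf = np.log(A.shape[0] / df)
    A_weighted = A * idf                      # apply diagonal weights columnwise
    # (3) truncated SVD (k must be < min(N, M)); rows of V_k * S_k give
    # k-dimensional term representations
    _, S, Vt = svds(A_weighted.astype(float), k=k)
    term_repr = Vt.T * S                      # shape (M, k)
    # (4) fuzzy c-means over terms; skfuzzy expects data as (features, samples)
    _, u, *_ = fuzz.cluster.cmeans(term_repr.T, c=num_topics, m=m,
                                   error=1e-5, maxiter=200)
    # u has shape (num_topics, M): degree of membership of each term in each topic
    # (5) normalize memberships into topic-term probability distributions
    return u / u.sum(axis=1, keepdims=True)
```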
The second statistical model employed is Latent Dirichlet Allocation (LDA) [3,41], one of the most widely used probabilistic topic models. LDA is a generative probabilistic model for collections of discrete data, such as text corpora [41]. It assumes that each document is a mixture of latent topics and each topic is a distribution over words. Despite its popularity, LDA exhibits several well-documented limitations—particularly when applied to large and semantically dense corpora. First, it suffers from sparsity issues, leading to suboptimal topic distributions. Second, LDA assumes mutually exclusive topics, making it less effective in domains where topic boundaries are superficial or overlapping, as is often the case in news discourse. These limitations can be more pronounced in Turkish, due to its agglutinative nature and frequent use of compound expressions, which can dilute the semantic clarity of topic–word associations. For a corpus with $D$ documents, each containing $N_d$ words, the generative process of LDA is as follows:
- (1) For each topic $k \in \{1, \dots, K\}$, sample a distribution over words: $\beta_k \sim \mathrm{Dirichlet}(\eta)$.
- (2) For each document $d \in \{1, \dots, D\}$:
  - (a) Sample topic proportions: $\theta_d \sim \mathrm{Dirichlet}(\alpha)$.
  - (b) For each word $n \in \{1, \dots, N_d\}$:
    - i. Sample a topic assignment: $z_{dn} \sim \mathrm{Multinomial}(\theta_d)$.
    - ii. Sample a word: $w_{dn} \sim \mathrm{Multinomial}(\beta_{z_{dn}})$.

Exact inference in LDA is intractable due to the coupling between $\theta$, $z$, and $\beta$. Therefore, variational inference is used to approximate the posterior. We introduce a variational distribution:

$$q(\theta, z \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N_d} q(z_n \mid \phi_n),$$

where $\gamma$ is a variational Dirichlet parameter and the $\phi_n$ are variational multinomial parameters over topics. Variational inference proceeds by iteratively updating $\phi$ and $\gamma$:

- 1. Initialize $\phi_{nk} = 1/K$ and $\gamma_k = \alpha_k + N_d / K$.
- 2. Repeat until convergence:

$$\phi_{nk} \propto \beta_{k, w_n} \exp\left(\Psi(\gamma_k)\right), \qquad \gamma_k = \alpha_k + \sum_{n=1}^{N_d} \phi_{nk},$$

where $\Psi$ denotes the digamma function. These updates are applied for each document in the corpus, and the topic–word distributions are updated using the expected word counts under the variational distribution.
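In our experiments, LDA is run via gensim's LdaMulticore (see the implementation details at the end of this section). A minimal sketch, assuming `docs` is the tokenized corpus and using an illustrative topic count of 10 rather than a tuned value:

```python
from gensim.corpora import Dictionary
from gensim.models import LdaMulticore

# Build the bag-of-words corpus from the tokenized documents.
dictionary = Dictionary(docs)
dictionary.filter_extremes(no_below=5, no_above=0.5)  # prune rare/ubiquitous terms
bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

lda = LdaMulticore(
    corpus=bow_corpus,
    id2word=dictionary,
    num_topics=10,   # illustrative value, not the tuned setting
    passes=10,
    workers=4,
)

# Inspect the top words of each learned topic.
for topic_id, words in lda.show_topics(num_topics=10, num_words=10, formatted=False):
    print(topic_id, [w for w, _ in words])
```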
Additionally, we consider current neural topic modeling approaches, which are equipped to capture contextual and semantic nuances in Turkish texts. BERTopic [35] and Top2Vec [31,32] are the deep learning methods used. Since they use text embeddings rather than bag-of-words representations, these models can benefit from semantic similarities between different words.
BERTopic is a neural topic modeling technique that leverages transformer-based embeddings and a class-based TF-IDF procedure to generate coherent, interpretable topics from textual data [35]. BERTopic follows four steps for topic modeling. First, it uses text embeddings to represent documents. Second, it applies dimensionality reduction to the document vectors using the Uniform Manifold Approximation and Projection (UMAP) method. Then, it clusters the document vectors for topic extraction, using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). Finally, it uses a class-based TF-IDF to obtain topic-representative words. Given a document corpus $\mathcal{D} = \{d_1, \dots, d_N\}$:
- 1. Document Embeddings: a pre-trained transformer model (BERT) maps each document to a high-dimensional vector representation that captures semantic meaning: $e_i = \mathrm{BERT}(d_i) \in \mathbb{R}^{D}$.
- 2. Dimensionality Reduction: UMAP projects the dense embeddings to a lower-dimensional space while preserving the local and global structure to facilitate effective clustering: $\tilde{e}_i = \mathrm{UMAP}(e_i)$.
- 3. Clustering: HDBSCAN identifies dense regions in the reduced space and assigns cluster labels to form coherent topic groups without requiring the number of clusters in advance: $c_i = \mathrm{HDBSCAN}(\tilde{e}_i)$, $c_i \in \{1, \dots, K\}$, where $c_i$ is the cluster (topic) label assigned to document $d_i$ and $K$ is the number of topics (clusters) discovered.
- 4. Topic Representation (c-TF-IDF): class-based TF-IDF aggregates the term frequencies within each cluster to extract representative keywords that define the semantic content of each topic:

$$W_{t,k} = \frac{f_{t,k}}{n_k} \cdot \log\frac{N}{k_t},$$

where $f_{t,k}$ is the frequency of term $t$ in documents assigned to cluster $k$, $n_k$ is the total number of terms in the aggregated document for cluster $k$, $k_t$ is the number of clusters in which term $t$ appears, and $N$ is the total number of documents in the corpus.
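With the bertopic library, the four steps above sit behind a single fit call. The sketch below assumes a multilingual SentenceTransformer checkpoint (the exact embedding model is not fixed here) and `raw_documents` as the list of article strings:

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

# Multilingual embedding checkpoint (assumed; any Turkish-capable model works).
embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

topic_model = BERTopic(embedding_model=embedder)
topics, probs = topic_model.fit_transform(raw_documents)  # raw article strings

print(topic_model.get_topic_info().head())  # topic sizes and c-TF-IDF labels
print(topic_model.get_topic(0))             # (word, score) pairs for topic 0
```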
Top2Vec differs from BERTopic in its text embeddings and topic word extraction. Top2Vec [31,32] is a semantic topic modeling approach that jointly embeds words and documents into the same vector space and finds dense areas in that space to discover topics. It removes the need for prior preprocessing such as stop word removal, and it does not require specifying the number of topics beforehand. Because word, document, and topic vectors live in the same space, they can be compared directly by vector similarity. Topic word extraction exploits this property: the representative words are those closest to a topic's vector. UMAP and HDBSCAN are used, just as in BERTopic. Given a corpus $\mathcal{D} = \{d_1, \dots, d_N\}$ and a vocabulary $\mathcal{V} = \{w_1, \dots, w_M\}$:
- 1. Word and Document Embeddings: $v_w = \phi(w) \in \mathbb{R}^{D}$ and $v_d = \phi(d) \in \mathbb{R}^{D}$, where $\phi$ is the embedding function, $v_w$ is the word embedding of word $w$, $v_d$ is the document embedding of document $d$, and $D$ is the embedding dimensionality.
- 2. Dimensionality Reduction (UMAP): $\tilde{v}_d = f_{\mathrm{UMAP}}(v_d)$, where $f_{\mathrm{UMAP}}$ is the non-linear projection preserving topological structure and $\tilde{v}_d$ is the low-dimensional representation of document $d$.
- 3. Clustering (HDBSCAN): $c_d = \mathrm{HDBSCAN}(\tilde{v}_d)$, where $c_d$ is the cluster (topic) label assigned to document $d$ and $K$ is the number of discovered topics. Each topic vector is the centroid of its cluster:

$$t_k = \frac{1}{|C_k|} \sum_{d \in C_k} v_d,$$

where $C_k$ is the set of documents in cluster $k$.
- 4. Topic Words: find representative words based on cosine similarity:

$$\mathrm{sim}(t_k, v_w) = \frac{t_k \cdot v_w}{\lVert t_k \rVert \, \lVert v_w \rVert},$$

where $t_k$ is the topic vector of cluster $k$ and $\mathrm{sim}(\cdot, \cdot)$ is the cosine similarity function.
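Using the top2vec package, the whole pipeline (joint embedding, UMAP, HDBSCAN, and topic-word extraction) is triggered by the constructor; a minimal sketch:

```python
from top2vec import Top2Vec

# Top2Vec tokenizes internally and infers the number of topics itself.
model = Top2Vec(documents=raw_documents, speed="learn", workers=4)

topic_words, word_scores, topic_nums = model.get_topics()
print(model.get_num_topics())
print(topic_words[0][:10])  # ten words closest to the first topic vector
```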
We evaluated the performance of these approaches using diversity, coherence, and interpretability scores [22,23]. The diversity score measures inter-topic quality by calculating how unique the top words are across topics. If topics share many top words, diversity is low; if they share few or none, diversity is high. This ensures that each topic represents a distinct semantic cluster.
$$\mathrm{Diversity} = \frac{U}{W}$$

Here, $U$ is the number of unique words in the top-$n$ words of all topics, and $W$ is the total number of words in the top-$n$ words across all topics. Diversity ranges from 0 (no uniqueness) to 1 (maximum uniqueness).
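Computationally, the diversity score reduces to a few lines; the sketch below assumes each topic is supplied as a ranked list of its top words:

```python
def topic_diversity(topics, top_n=25):
    """Fraction of unique words among the top-n words of all topics.

    `topics` is a list of ranked word lists, one per topic.
    """
    top_words = [w for topic in topics for w in topic[:top_n]]
    return len(set(top_words)) / len(top_words)
```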
The coherence score quantifies how strongly the words within a topic co-occur in the corpus. This is often calculated using the Normalized Pointwise Mutual Information (NPMI) metric. NPMI assumes that semantically related words appear near each other in text. This measure has been shown to correlate well with human interpretability of topics.
$$\mathrm{NPMI}(w_i, w_j) = \left( \frac{\log \dfrac{P(w_i, w_j) + \epsilon}{P(w_i)\, P(w_j)}}{-\log\left( P(w_i, w_j) + \epsilon \right)} \right)^{\gamma}$$

Here, $P(w_i, w_j)$ is the probability of co-occurrence of words $w_i$ and $w_j$ in a sliding window, $P(w_i)$ and $P(w_j)$ are their marginal probabilities, $\epsilon$ is a small constant to avoid taking the logarithm of zero, and $\gamma$ is a weighting exponent. NPMI ranges from −1 to 1, where 1 indicates perfect coherence.
The interpretability score combines intra-topic coherence and inter-topic diversity by taking the product of the coherence and diversity scores: $\mathrm{Interpretability} = \mathrm{Coherence} \times \mathrm{Diversity}$. This reflects the idea that a good topic model should produce topics that are both internally coherent and distinct across the model. This composite measure also ranges between 0 and 1; a high interpretability score means the model generates topics that are both meaningful and distinct.
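Both component scores can be computed with gensim's CoherenceModel; the sketch below uses the c_npmi coherence, which lies in [−1, 1], and rescales it to [0, 1] before multiplying by the `topic_diversity` function from the previous sketch. The rescaling is our assumption, since the exact normalization is not spelled out here:

```python
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel

def interpretability_score(topics, docs, top_n=10):
    """Product of coherence and diversity for a set of topics.

    `topics`: list of ranked word lists; `docs`: tokenized corpus.
    """
    dictionary = Dictionary(docs)
    cm = CoherenceModel(topics=[t[:top_n] for t in topics], texts=docs,
                        dictionary=dictionary, coherence="c_npmi")
    npmi = cm.get_coherence()   # mean NPMI coherence, in [-1, 1]
    coherence = (npmi + 1) / 2  # rescale to [0, 1] (assumption)
    return coherence * topic_diversity(topics, top_n=top_n)
```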
Interpretability is inherently a multifaceted and human-centered concept that cannot be fully captured through numerical proxies alone. The use of the product of topic coherence and diversity as a composite interpretability metric, while reductive, follows the methodology established in prior studies—most notably by Rijcken et al. [22,23]. Their work provides a reproducible and quantifiable basis for comparing topic models, particularly in the absence of large-scale user studies or domain expert validation. While this approach does not replace qualitative assessment, it offers a consistent framework for benchmarking interpretability in experimental settings. Future work may complement these metrics with human-in-the-loop evaluations to further ground the results in practical usability.
3.2. Exploratory Analysis: Topic Clustering
We applied a qualitative analysis to explore the meaning of topic modeling before generating knowledge graphs. Topic clustering via dimensionality reduction involves transforming high-dimensional textual data (like documents or keywords) into a lower-dimensional space using Principal Component Analysis (PCA), preserving meaningful relationships. This enables efficient grouping of semantically similar topics, improving organization, search, and content recommendation. In order to obtain an initial understanding of topic distributions and the degree of separation among the models, we conducted a PCA to project the high-dimensional topic-document matrices into a two-dimensional space. The topic scores generated by FLSA-W, LDA, BERTopic, and Top2Vec were each subjected to PCA independently. This dimensionality reduction enables visual comparison of how topics are spatially distributed across documents and helps to identify latent structure, overlap, or clustering tendencies inherent in each modeling approach. The 2D projections facilitate interpretability while preserving as much of the original variance as possible in the reduced space [42].
Following the PCA transformation, we applied the k-means clustering algorithm to the resulting 2D embeddings to assess the consistency and separability of topic groupings across models. To determine the optimal number of clusters and evaluate the quality of the clustering, silhouette analysis was conducted for each projection. Silhouette scores quantify the compactness and separation of the clusters, offering insight into how well the models produce distinct thematic groupings in reduced space. This dual-step exploratory approach—PCA for visual interpretation and k-means with silhouette analysis for quantitative validation—provides a complementary lens for comparing topic model behavior, particularly in terms of coherence, overlap, and semantic distinctiveness.
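A minimal scikit-learn version of this two-step procedure, assuming `doc_topic_matrix` holds one model's document-topic scores, is sketched below:

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

def project_and_cluster(doc_topic_matrix, k_range=range(2, 8)):
    """PCA to 2D, then pick the k with the best silhouette score."""
    coords = PCA(n_components=2).fit_transform(doc_topic_matrix)
    best_k, best_score, best_labels = None, -1.0, None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(coords)
        score = silhouette_score(coords, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return coords, best_k, best_labels
```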
3.3. Knowledge Graphs
In this study, we investigate the construction of knowledge graphs as a systematic framework for representing and analyzing the thematic structures produced by multiple topic modeling algorithms. Specifically, we generated four distinct knowledge graphs, each encapsulating the topics derived from LDA, BERTopic, Top2Vec, and FLSA-W. In these graphs, nodes correspond to individual topics, while edges encode the semantic similarity between topics based on their lexical and contextual overlap. By leveraging this graph-based representation, we are able to quantify inter-topic relationships, thus providing a robust quantitative basis for comparing how different models capture latent semantic patterns within the same corpus. Furthermore, to account for the dynamic evolution of thematic content, we extended this approach by constructing a supergraph that incorporates topics generated on a monthly basis. Each temporal subgraph is connected to the overarching graph structure, facilitating the examination of topic continuity, divergence, and emergence over time. This hierarchical integration not only enriches the qualitative interpretability of topic transitions but also enables the extraction of quantitative insights through advanced graph analytical techniques, including embedding-based similarity measures and community detection. Overall, the proposed knowledge graph framework demonstrates significant potential for enhancing topic modeling research by bridging interpretability with rigorous quantitative analysis of semantic structures.
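As one concrete instantiation of the edge definition above (lexical overlap between topics), the sketch below builds a NetworkX graph whose edge weights are Jaccard overlaps of the topics' top-word sets; the overlap threshold is an illustrative choice:

```python
import networkx as nx

def build_topic_graph(topics, top_n=25, min_overlap=1):
    """Nodes are topics; edges are weighted by top-word (Jaccard) overlap.

    `topics` is a list of ranked word lists, one per topic; Jaccard overlap
    is used here as one possible lexical similarity between topics.
    """
    G = nx.Graph()
    word_sets = [set(t[:top_n]) for t in topics]
    for i, ws in enumerate(word_sets):
        G.add_node(i, words=sorted(ws))
    for i in range(len(word_sets)):
        for j in range(i + 1, len(word_sets)):
            shared = word_sets[i] & word_sets[j]
            if len(shared) >= min_overlap:
                weight = len(shared) / len(word_sets[i] | word_sets[j])
                G.add_edge(i, j, weight=weight)
    return G
```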
Knowledge graph entropy is a quantitative measure used to capture the level of uncertainty, disorder, or information diversity present within a knowledge graph’s structure. In the context of topic modeling, graph entropy helps to characterize the complexity and richness of the semantic relationships encoded in the graph representation of topics and their connections. A higher entropy value generally indicates a more heterogeneous and complex network, suggesting diverse and less predictable connections among topics, whereas lower entropy implies a more uniform or ordered structure. Shannon graph entropy adapts Shannon’s classical entropy concept to graph structures. It can be calculated based on the probability distribution derived from node degrees or any other meaningful graph attribute. The standard formula for Shannon graph entropy
$H$ is:

$$H(G) = -\sum_{i=1}^{N} p_i \log_2 p_i, \qquad p_i = \frac{\deg(i)}{\sum_{j=1}^{N} \deg(j)},$$

where $G$ is the graph, $N$ is the total number of nodes, $p_i$ is the probability associated with node $i$, and $\deg(i)$ is the degree of node $i$.
This measure provides an interpretable scalar value that reflects the structural complexity of the knowledge graph and can be used for comparative evaluations of different topic modeling approaches.
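The degree-based entropy above translates directly into code; a minimal NetworkX version:

```python
import math

import networkx as nx

def shannon_graph_entropy(G: nx.Graph) -> float:
    """H(G) = -sum_i p_i * log2(p_i) with p_i = deg(i) / sum_j deg(j),
    matching the formula above (isolated nodes are skipped)."""
    degrees = [d for _, d in G.degree() if d > 0]
    total = sum(degrees)
    return -sum((d / total) * math.log2(d / total) for d in degrees)
```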
All experiments were conducted on a machine equipped with a 2 GHz Quad-Core Intel Core i5 processor and 16 GB of 3733 MHz RAM, running macOS Sequoia 15.5. The implementation environment utilized Python version 3.9.10. Various topic modeling libraries were employed, including FLSA-W using the FuzzyTM library, Top2Vec with the top2vec package, BERTopic using the bertopic library, and LDA implemented via gensim.models.LdaMulticore.
4. Results
The figures obtained from the silhouette analysis and density heatmap visualization, performed after projecting the topic model embeddings into two dimensions with PCA, reveal important insights about the latent thematic structures discovered by each method. The results are presented in Turkish along with their corresponding English translations. Figure 4 and Figure 5 show the PCA projections of LDA topic modeling.
Figure 6 shows the heatmap of LDA’s topic projections. Visually, a dense and slightly elongated central region is observed, with the presence of two distinct centroids represented by red markers. The color intensity in the heatmap reflects the areas of highest topic projection density. K-means analysis, enhanced by silhouette analysis, identified two clusters for LDA. The centroids are positioned within the densest regions, indicating that LDA was able to identify two primary and well-defined topic groups. The visible overlap in the central region of the heatmap may suggest some semantic proximity or shared vocabulary between these two broad topic groups. The probabilistic nature of LDA, where topics are distributions over words and documents are distributions over topics, is reflected in this PCA projection, which visualizes the relationships between these topic distributions. The elongated central region and the overlap indicate that, while two clusters are optimal, the topics identified by LDA may not be entirely orthogonal or mutually exclusive. This can be attributed to the polysemy inherent in natural language or to the probabilistic nature of LDA, where documents can belong to multiple topics, creating a semantic gradient between the identified groups.
Figure 7 and Figure 8 show the PCA projections of BERTopic modeling.
Figure 9 shows the heatmap of topic projections. Its visual appearance is remarkably similar to LDA in terms of centroid density and positioning. BERTopic also resulted in two silhouette-optimized k-means clusters. The visual similarity with the LDA projection suggests that, despite BERTopic’s distinct methodology—which relies on document embeddings, dimensionality reduction via UMAP, HDBSCAN clustering, and topic representation via c-TF-IDF—its final topic representations, when projected via PCA, converge to a similar bipartite structure. This convergence is a significant finding, as it implies that the dominant semantic divisions in the dataset are robust enough to be captured by diverse topic modeling paradigms. Both models, through their distinct paths, converge to the same topic organization at a macro level.
Figure 10 and Figure 11 show the PCA projections of Top2Vec modeling.
Figure 12 shows a heatmap of topic projections. Top2Vec is the most distinctive, displaying the most prominent feature: the presence of three centroids and clearly distinct high-density regions around them. The spatial arrangement of these three clusters shows a central region and two peripheral regions. It is crucial to note that Top2Vec is the only model for which the silhouette analysis determined an optimal K of three clusters. This is a critical distinction. Visual evidence corroborates this, showing three apparently well-defined and separate dense areas, with the centroids positioned within them. This implies that Top2Vec’s topic representations inherently possess a more granular structure that is better partitioned into three distinct groups.
Figure 13 and Figure 14 show the PCA projections of FLSA-W modeling.
Figure 15 shows the heatmap of topic projections. While it also shows two clusters, the density distribution is noteworthy. One of the centroids appears to be located in a very dense central region, while the other may be associated with a less dense, more diffuse, or possibly more isolated region. FLSA-W also resulted in two silhouette-optimized k-means clusters.
The topic modeling results for the LDA method, despite showing lower coherence (0.27) and interpretability (0.095) compared to other methods, reflect a broad thematic diversity that aligns with prominent geopolitical and regional issues in news media over the analyzed months. Across March to October, topics consistently centered on major international relations themes such as the Russia–Ukraine conflict, US foreign policy, Middle Eastern security dynamics, and domestic political developments. Notably, frequent co-occurrences of keywords like "Russia," "Ukraine," "US," "Israel," and "Turkey" indicate the salience of international conflicts and diplomatic affairs in the news discourse, while terms such as "attack," "security," "president," and "region" underscore ongoing concerns about military actions and governance. The monthly topic clusters reveal a nuanced temporal progression of news focus. For instance, in March and April, the prevalence of keywords related to geopolitical tensions and security issues—such as "Iran," "Israel," "safety," and "demonstration"—reflects a focus on regional instability and conflict outbreaks. Moving into May and June, there is a noticeable inclusion of domestic political terminology alongside ongoing war-related vocabulary, with terms like "president" and "election" emerging, indicating an intertwining of international and national political coverage. The summer months continue to emphasize "NATO", "defense", and security topics, while also introducing environmental and social issues in the Mediterranean zone such as "fire" and "heat", particularly in July and August, suggesting a diversification of news themes beyond purely political-military events. In the later months, especially September and October, the topic clusters increasingly highlight the escalating Israeli–Palestinian conflict by reflecting a shift in news focus towards "Middle East". Additionally, topics involving "Azerbaijan", "Armenia", and "Karabakh" suggest regional conflict awareness remains important. Although the LDA-generated topics may exhibit lower coherence scores, the presence of diverse yet thematically coherent clusters demonstrates that the method captures a wide array of key political, security, and social issues dominating media narratives throughout the period.
The BERTopic model demonstrates a nuanced and coherent capture of topical themes across the months, with relatively high interpretability (0.55) and diversity (0.96) scores compared to other methods. The topics reflect a rich mixture of geopolitical, social, environmental, and cultural issues relevant to news. For example, recurring geopolitical themes emerge strongly throughout the months, including conflicts involving “Russia”, “Ukraine”, and the “United States”, as well as Middle Eastern affairs. This consistent presence indicates BERTopic’s strength in capturing complex international narratives, maintaining thematic coherence while reflecting topical diversity. Environmental and disaster-related topics also appear with considerable clarity and frequency, highlighting events such as wildfires (“fire”, “forest”, “flame”, “region”), storms (“storm”, “hurricane”, “damage”), and earthquakes (“earthquake”, “magnitude”, “felt”). These topics not only capture the temporal variability of natural disasters but also their social impact, as indicated by tokens related to human casualties and responses (e.g., “victims”, “intervention”, “police”). The inclusion of culturally significant topics, such as religious observances and electoral politics, further illustrates BERTopic’s ability to identify diverse subject matters with meaningful interpretability, linking linguistic patterns to recognizable societal phenomena. Lastly, topics related to legal and administrative affairs appear regularly, showing discussions on reforms, judiciary activities, and protests. The coherence of these themes alongside political events contributes to the overall narrative consistency. Importantly, BERTopic’s high interpretability score suggests that its identified topics are not only statistically robust but also intuitively understandable to human readers, supporting its suitability for complex, multilingual news corpora.
The qualitative interpretation of the Top2Vec topic modeling output reveals a set of themes over time across domains such as disaster response, international diplomacy, domestic unrest, and geopolitical developments. For March, a dominant portion of topics revolves around the aftermath of natural disasters. These are complemented by terms referring to logistical operations (“vehicle”, “center”, “system”, “emergency”) and social response (“citizen”, “support”, “coordination”), suggesting that media coverage in March was largely shaped by a focus on disaster management and public health concerns. Simultaneously, topics present security-related themes, marked by “military”, “exercise”, “NATO”, “nuclear”, “missile”, and “Ukraine”, indicating active coverage of global defense dynamics and regional conflicts. In April and May, the model identifies a shift toward political discourse and protest movements, especially visible in topics related to “parliament”, “protest”, “reform”, “strike”, and “government”. This pattern aligns with social unrest and policy debates, particularly in relation to labor laws and judicial reforms, as evident in countries like France and Israel. By June and July, the discourse transitions toward electoral politics, international summits (e.g., “NATO”, “EU”, “Zelensky”, “Stoltenberg”), and renewed military narratives (e.g., ”Wagner Group”, “Moscow”, “coup”, “Putin”). Additionally, recurring environmental and climate-related keywords such as “wildfire”, “heat”, “flood”, “meteorology”, and “evacuation” suggest increased attention to ecological hazards. These themes continue into August and September with similar coverage of “natural disasters”, “storm”, and “casualties”.
The qualitative analysis of the FLSA-W topic distributions reveals that FLSA-W identifies granular and thematically distinct topics, reflecting the method’s strength in separating concurrent discourses. March topics range from domestic economic discourse (“production”, “food”, “sharing”) to geopolitical affairs involving Ukraine and the US (“military”, “defense”, “duty”), and localized incidents (“fire”, “rainfall”, “police”, “operation”). In later months, topics evolve toward international diplomacy and disaster events. For example, June includes topics like international inequality and media bias (“media”, “inequality”, “guardian”, “confederation”) as well as natural disasters (“evacuation”, “damage”, “fire”, “casualty”). July captures both social unrest and geopolitical tensions (“protest”, “terrorism”, “summit”, “missile”). These clusters are topically coherent within their themes and show a reasonable level of interpretability, such as public health and safety, international cooperation, and civil–military dynamics. The repetition of certain tokens (e.g., “Muhammed”, “Saim”) across months may slightly lower interpretability due to their contextual ambiguity without deeper textual embedding, partially explaining the lower interpretability score (0.33). By October, topics point out acute global issues and religious–political conflicts. Notable examples include military escalations in Gaza (“Israel”, “attack”, “security”) and domestic societal concerns (“student”, “ethics”, “lesson”, “community”). Despite lower coherence, the clear separation of themes from socio-economic discussions to international crises reflects strong topic independence. In sum, FLSA-W’s output highlights the evolution of media focus over time and provides a highly diverse thematic mapping. However, the trade-off in coherence and interpretability suggests the need for supplementary contextualization to enhance topic clarity for downstream qualitative research.
The evaluation results given in Table 1 underscore the strengths of the proposed fuzzy topic modeling framework (FLSA-W) in capturing thematic diversity within Turkish news data. Achieving a perfect diversity score (1.000), the fuzzy model demonstrates its ability to uncover a wide range of distinct topics, reflecting its capacity to handle semantic overlap and blurred topic boundaries. This is particularly valuable in the context of news articles, where thematic content often blends and shifts rapidly. Moreover, its interpretability score (0.335) surpasses that of LDA by a wide margin, illustrating the fuzzy model's advantage in producing topic representations that are more comprehensible to human analysts. While its coherence score is modest compared to Top2Vec, it remains higher than LDA's, suggesting a more balanced approach to both topic clarity and thematic coverage.
Top2Vec emerges as the overall leader in terms of topic coherence (0.808) and interpretability (0.783), indicating that its embedding-based methodology excels at producing semantically rich and easily understandable topics. This model is particularly adept at leveraging contextual similarity to form coherent clusters, making it highly effective in understanding the nuanced language of Turkish news. Its performance across all three metrics—especially its strong diversity (0.970)—positions it as a powerful alternative to more traditional models like LDA. However, while Top2Vec produces tightly formed topics, it does not explicitly address the uncertainty or thematic fuzziness present in real-world corpora, which the fuzzy model is designed to capture.
In contrast, both LDA and BERTopic demonstrate limitations in one or more evaluation dimensions. LDA shows the weakest performance overall, particularly in interpretability (0.0945), reinforcing its well-known difficulty in providing clear, actionable topics in linguistically complex domains. BERTopic performs reasonably well across all metrics, yet it is outpaced by both the fuzzy model in diversity and Top2Vec in coherence and interpretability. These findings suggest that while BERTopic serves as a competent baseline for modern topic modeling, it may fall short in capturing either the full thematic breadth or the semantic richness required for deeper insights. Taken together, the results highlight the complementary strengths of the fuzzy model and Top2Vec, with the former offering a more flexible, uncertainty-aware approach, and the latter providing highly coherent and interpretable topics—making them the most effective tools for modeling evolving digital media landscapes.
The comparison of knowledge graphs generated using four different topic modeling approaches—FLSA-W, Top2Vec, BERTopic, and LDA—reveals notable differences in structural complexity as measured by Shannon graph entropy, given in Table 2. Shannon entropy quantifies the amount of uncertainty or information contained in the graph's structure, with higher values indicating greater complexity or richness in connectivity. Among the base models, FLSA-W (Figure 16), Top2Vec (Figure 17), and BERTopic (Figure 18) exhibit similar entropy levels (around 6.2–6.5), suggesting relatively balanced and information-rich networks. In contrast, LDA (Figure 19) has the lowest entropy at 4.93, implying a sparser or more hierarchically simple graph structure. When the knowledge graphs are extended temporally using month-based topic modeling, entropy values increase across all approaches, reflecting the added complexity introduced by temporal topic expansion. Overall, these results suggest that FLSA-W and Top2Vec are better suited to constructing complex and informative knowledge graphs of news, especially when enhanced with temporal modeling.
The Shannon graph entropy scores indicate that FLSA-W (Figure 20) achieves the highest entropy at 8.98, followed closely by Top2Vec (Figure 21) at 8.37 and BERTopic (Figure 22) at 8.10, highlighting their effectiveness in capturing nuanced, time-evolving semantic relationships. Top2Vec maintains high entropy values throughout the months, with only slight fluctuations. BERTopic shows more variability, with entropy values dropping significantly in June and October, resulting in its lower total score of 8.10. LDA (Figure 23), as the classical baseline model, yields the lowest overall entropy (6.20) and exhibits stable but consistently lower values, reflecting a limited capacity to capture the diversity and nuance of topic structures in Turkish news data. The entropy values for the Total-Months graphs, which merge the individual monthly topic graphs, mirror these trends, further emphasizing the robustness of fuzzy and embedding-based models in maintaining topical complexity over time.
5. Discussion
The findings of this study highlight the critical importance of incorporating flexibility and semantic awareness into topic modeling approaches, particularly when applied to complex, unstructured domains such as Turkish news media. Traditional probabilistic models like LDA fall short in representing the evolving and overlapping nature of news topics, as evidenced by their lower coherence and interpretability scores. FLSA-W addresses these limitations by introducing a mechanism to accommodate thematic ambiguity and boundary uncertainty. Its perfect diversity score and competitive interpretability affirm its strength in uncovering a wide range of topics while maintaining human-readable outputs. The introduction of a graph-based entropy metric further adds analytical depth, allowing structural complexity of topic relationships to be quantified in a principled manner.
LDA remains a classic probabilistic model, which makes it efficient and interpretable but fundamentally constrained when dealing with semantic nuance and overlapping themes. Its relatively low coherence (0.270), diversity (0.350), and interpretability (0.0945) confirm its limited adaptability to unstructured Turkish news data where topics frequently merge and evolve. The knowledge graph entropy for LDA and monthly LDA (LDA-All) (4.93 and 6.20) further indicates that its topic networks are structurally simpler and less information-dense than those of more modern approaches. BERTopic bridges this gap by integrating contextual embeddings with density-based clustering and class-based TF-IDF representations. This combination results in balanced performance across coherence (0.571), diversity (0.960), and interpretability (0.548). Its entropy values (6.23 and 8.10) reflect an improved ability to capture more intricate topic connections compared to LDA. However, BERTopic’s reliance on hard clustering still constrains its handling of fuzzy thematic boundaries. The need for explicit representation of overlapping topics remains partially addressed, which limits its capacity for modeling high levels of semantic fluidity in evolving domains like digital news.
Top2Vec demonstrates a significant advance by jointly embedding words and documents, allowing topics to emerge naturally as dense regions in the semantic space (a minimal invocation is sketched below). This architecture yields the highest coherence (0.808) and interpretability (0.783) scores among all models, with strong diversity (0.970). Its entropy values (6.23 and 8.37) show that its topic similarity graphs contain richer structural patterns than LDA’s or BERTopic’s, although still slightly less than FLSA-W’s extended graphs. Top2Vec’s fully unsupervised nature and semantic alignment make it highly effective at capturing coherent, interpretable topics with minimal manual tuning. However, its crisp clustering approach does not explicitly address the uncertainty and thematic overlaps prevalent in Turkish news, which FLSA-W is designed to manage.

The FLSA-W framework stands out by explicitly modeling thematic uncertainty through fuzzy clustering. Its perfect diversity score (1.000) and moderate interpretability (0.335) demonstrate its capacity to handle overlapping topics while producing outputs that remain understandable. The highest entropy scores (6.49 and 8.98) among all models confirm that FLSA-W generates the most information-rich topic networks, effectively reflecting the complex interconnections in evolving news content. These findings position FLSA-W and Top2Vec as complementary models: Top2Vec excels in semantic coherence and clarity, while FLSA-W adds critical nuance and structural depth for contexts where thematic overlap is intrinsic to the data.
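The Top2Vec sketch referenced above is shown here; the embedding backbone and the hypothetical `load_turkish_news()` loader are assumptions for illustration, not the study's exact setup.

```python
from top2vec import Top2Vec

docs = load_turkish_news()  # hypothetical loader; raw document strings

# Words and documents are embedded jointly; the number of topics is
# inferred from dense regions of the embedding space, not set a priori.
model = Top2Vec(
    documents=docs,
    embedding_model="universal-sentence-encoder-multilingual",
)
print(model.get_num_topics())
topic_words, word_scores, topic_nums = model.get_topics()
```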
The diversity score of 1.000 for the FLSA-W model, which implies a complete absence of overlapping top words across topics, is an outcome that is statistically questionable for natural-language data. While high diversity is generally desirable for ensuring distinct topic boundaries, a perfect score may indicate that the model is artificially maximizing topic separation, potentially at the expense of semantic coherence and interpretability. This raises the question of whether FLSA-W produces genuinely meaningful topic structures or merely enforces disjoint clusters through its fuzzy clustering on the SVD-transformed word space. Rijcken et al. [23] likewise evaluated FLSA-W on four open English datasets and obtained a diversity of 1.0. To probe this behavior, we repeated our experiments without applying stemming, a decision that preserves more of the natural lexical variation in the corpus and would typically introduce overlap among semantically related terms across topics. Interestingly, even without stemming, FLSA-W continues to produce the same diversity score (1.00), coupled with relatively low coherence (0.42) and interpretability (0.42), suggesting that the model still enforces topic separation at the cost of semantic cohesion. In contrast, both Top2Vec and BERTopic yielded high coherence (0.78 and 0.67, respectively) and strong interpretability (0.77 and 0.65) while maintaining high, but not perfect, diversity scores (0.99 and 0.97), which are more consistent with natural language structure. These outcomes suggest that FLSA-W’s diversity score may reflect the underlying clustering mechanism rather than meaningful topic formation and may be insensitive to nuanced lexical variation.
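For reference, the standard topic diversity computation, which we assume underlies the scores reported here, is the proportion of unique words among the top-k words of all topics; the toy topics below are illustrative.

```python
def topic_diversity(topics: list[list[str]], top_k: int = 10) -> float:
    """Proportion of unique words among the top-k words of all topics.

    A score of 1.0 means no word appears in more than one topic's
    top-k list, i.e., perfectly disjoint topic vocabularies.
    """
    top_words = [w for topic in topics for w in topic[:top_k]]
    return len(set(top_words)) / len(top_words)

# Toy example: two topics sharing one term ("faiz")
topics = [["ekonomi", "enflasyon", "faiz"], ["seçim", "parti", "faiz"]]
print(topic_diversity(topics, top_k=3))  # 5 unique / 6 total = 0.833...
```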
The comparative analysis of knowledge graphs derived from the four topic modeling approaches—FLSA-W, Top2Vec, BERTopic, and LDA—demonstrates varying capacities to capture the thematic complexity and linguistic specificity of Turkish news discourse. The FLSA-W model stands out for its ability to handle thematic ambiguity and fuzzy topic boundaries, thanks to its weighted fuzzy clustering mechanism. The resulting semantic network captures overlapping narratives and partial memberships that reflect the morphological richness and polysemy inherent in Turkish. The graph highlights interlinked topics across crisis events, financial systems, and public governance, emphasizing FLSA-W’s strength in representing uncertainty and interpretability in real-world contexts.
Top2Vec offers a complementary yet distinct perspective by generating dense, semantically rich embeddings that preserve both dominant and peripheral narratives. Its co-occurrence network reveals well-defined clusters around military conflicts, environmental disasters, and legal-political developments while also uncovering subtler cross-domain relationships. This capacity to represent topics in a shared vector space allows Top2Vec to maintain contextual nuance even in the face of Turkish’s agglutinative structure and syntactic variation. The visualization confirms Top2Vec’s robustness in capturing contextually specific and thematically cohesive groupings, enriching the overall topic landscape with latent but meaningful connections.
The knowledge graphs constructed from BERTopic and LDA further highlight how model architectures influence semantic structuring. BERTopic demonstrates an advanced ability to extract context-dependent clusters, especially around event-based and named-entity-driven narratives, uncovering a diverse thematic range beyond geopolitics that spans legal accusations, economic developments, and humanitarian crises. In contrast, the LDA-based graph produces a more rigid, hierarchical structure, heavily centered on geopolitical conflict and state-centric discourse. Despite its relative simplicity, LDA effectively identifies dominant narrative patterns and key political actors.
Together, these models reveal complementary insights: while LDA excels in mapping overarching political themes, BERTopic, Top2Vec, and FLSA-W enrich the discourse with contextual depth, thematic diversity, and fuzzy relational nuance—attributes that are crucial for analyzing complex, multilingual corpora such as Turkish media texts. The models incorporating semantic embeddings or fuzzy logic principles—specifically Top2Vec and FLSA-W—are more effective in capturing the inherent ambiguity, richness, and temporal variation of Turkish-language corpora. FLSA-W’s performance can be attributed to its fuzzy set-based handling of overlapping themes and its enhanced semantic coverage via word embeddings, which lead to higher entropy and thus richer topic networks. Top2Vec, by directly embedding documents and clustering them without an a priori topic number, also retains strong entropy values, confirming its adaptability to Turkish morphology and evolving thematic structures. In contrast, BERTopic shows vulnerability to data sparsity and contextual drift in certain months, possibly because it clusters around transformer-based representations that can fluctuate with noisy inputs. LDA’s low entropy across all metrics illustrates the limitations of probabilistic count-based models in managing complex, morphologically rich languages like Turkish, particularly in dynamic, domain-specific contexts such as news.
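Although the study defines its own graph construction in the methodology, one plausible minimal construction, sketched below, links topics whose top-word sets overlap; the Jaccard threshold is an assumption, and the resulting graph can be scored with a function like the `shannon_graph_entropy` sketch above.

```python
import itertools
import networkx as nx

def topic_graph(topics: list[list[str]], threshold: float = 0.05) -> nx.Graph:
    """Connect topics whose top-word sets overlap (Jaccard similarity).

    One plausible topic-level knowledge graph construction; edge
    weights capture the strength of thematic overlap between topics.
    """
    G = nx.Graph()
    for i, words in enumerate(topics):
        G.add_node(i, top_words=words)
    for (i, a), (j, b) in itertools.combinations(enumerate(topics), 2):
        jaccard = len(set(a) & set(b)) / len(set(a) | set(b))
        if jaccard >= threshold:
            G.add_edge(i, j, weight=jaccard)
    return G
```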
In terms of computational efficiency, FLSA-W achieved the lowest runtime at approximately 5.538 s, indicating high performance with minimal resource consumption. LDA and BERTopic followed with execution times of 10.634 s and 15.072 s, respectively, reflecting moderate computational demands. Top2Vec incurred the highest runtime at 37.735 s, suggesting greater computational overhead, likely due to its reliance on document embeddings and clustering. These results highlight the trade-offs between model complexity and runtime efficiency, with the fuzzy-based approach showing a clear advantage in computational cost.
6. Conclusions
This comparative evaluation demonstrates that addressing topic overlap and network complexity requires moving beyond traditional probabilistic frameworks such as LDA, which struggle with semantic ambiguity and produce relatively simplistic topic structures. While BERTopic introduces modern embeddings and dynamic clustering, it remains limited by hard cluster boundaries, only partially capturing the fluidity present in evolving news data. In contrast, Top2Vec and the fuzzy framework (FLSA-W) emerge as robust alternatives, each excelling in complementary aspects of topic modeling. Specifically, Top2Vec achieves the highest coherence and interpretability by leveraging semantic embeddings to uncover well-defined, semantically rich clusters while maintaining strong diversity and network complexity. The FLSA-W framework, however, surpasses all other models in diversity and entropy-based structural richness by explicitly incorporating fuzzy clustering to represent thematic uncertainty. Together, these findings emphasize that combining semantic embeddings with uncertainty-aware methods holds significant promise for modeling complex, unstructured text in dynamic media contexts.
Our experiments reinforce the hypothesis that FLSA-W’s fuzzy clustering on the word-level SVD space inherently constructs orthogonal clusters, resulting in artificially non-overlapping top terms. The significantly lower coherence and interpretability scores support this interpretation: while FLSA-W successfully generates lexically distinct topic word sets, these sets may lack internal semantic relatedness and human interpretability. These findings emphasize the need for future work to examine the internal mechanisms of FLSA-W and potentially refine its clustering stage or evaluation framework to better reflect semantic quality alongside structural separation.
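To illustrate how such orthogonal clusters can arise, the sketch below reproduces the general shape of the FLSA-W pipeline: an SVD projection of a word-document matrix followed by fuzzy c-means over the word vectors. It uses scikit-fuzzy and random stand-in data, and it is a schematic of the technique rather than the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
import skfuzzy as fuzz  # scikit-fuzzy

# Stand-in word-by-document count matrix (vocab_size x n_docs);
# a real pipeline would build this from the tokenized corpus.
rng = np.random.default_rng(42)
X = rng.poisson(0.3, size=(500, 200)).astype(float)

# Project words into a low-rank SVD space, as FLSA-W does
word_vectors = TruncatedSVD(n_components=20, random_state=42).fit_transform(X)

# Fuzzy c-means over word vectors; cmeans expects (features, samples)
centers, u, *_ = fuzz.cluster.cmeans(
    word_vectors.T, c=10, m=2.0, error=1e-5, maxiter=300, seed=42
)
# u[k, i] is word i's membership in topic k; ranking each row yields
# the top terms per topic. Near-orthogonal memberships in the SVD
# space explain why top-term sets can end up fully disjoint.
top_terms_per_topic = np.argsort(u, axis=1)[:, -10:]
```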
Our findings also highlight the importance of language-specific preprocessing and modeling strategies. The morphological richness and agglutinative structure of Turkish have both qualitative and quantitative effects on topic modeling. The comparative coherence, diversity, and interpretability results suggest that standard approaches may fall short when applied to morphologically complex languages, and our results demonstrate that methods incorporating semantic embeddings or fuzzy logic adaptations are more robust in handling Turkish’s nuanced word forms and syntactic flexibility. This affirms the need to tailor topic modeling frameworks to the linguistic characteristics of the target language in order to achieve meaningful and accurate thematic representation.
Entropy, as used in this study, serves primarily as a proxy for structural complexity and topic connectivity rather than as a direct measure of semantic richness. Our interpretation aligns with prior literature that employs entropy to assess the distributional characteristics of topic relationships; however, we recognize that complexity does not always equate to interpretability or quality. Entropy should therefore be considered alongside other indicators, such as coherence, diversity, and graph modularity, when evaluating model outputs, and future iterations of this work will explore more targeted measures of semantic utility to better ground the structural metrics in human interpretability. Building on these insights, future research should also explore hybrid models that integrate the strengths of fuzzy clustering and embedding-based representations to better balance coherence, interpretability, and flexibility. Extending entropy-based graph metrics to capture topic evolution over time and across multiple domains could provide deeper insight into how thematic structures develop in response to real-world events.
The graph entropy results demonstrate that fuzzy-enhanced and neural embedding-based models yield more structurally informative topic representations than traditional probabilistic models. Future NLP applications, especially in morphologically rich or low-resource languages, may benefit from integrating fuzzy semantics and deep contextual embeddings for more nuanced and temporally stable topic modeling. Applying this framework to multilingual or cross-cultural news datasets may reveal further nuances in how fuzzy semantics and network complexity shape topic understanding in diverse information landscapes.