Article

Sustainable Agile Identification and Adaptive Risk Control of Major Disaster Online Rumors Based on LLMs and EKGs

School of Intelligent Science and Information Engineering, Shenyang University, Shenyang 110044, China
Sustainability 2025, 17(19), 8920; https://doi.org/10.3390/su17198920
Submission received: 18 August 2025 / Revised: 4 October 2025 / Accepted: 6 October 2025 / Published: 8 October 2025
(This article belongs to the Section Hazards and Sustainability)

Abstract

Amid the increasing frequency and severity of major disasters, the rapid spread of online misinformation poses substantial risks to public safety, effective crisis management, and long-term societal sustainability. Current methods for managing disaster-related rumors rely on static, rule-based approaches that lack scalability, fail to capture nuanced misinformation, and are limited to reactive responses, hindering effective disaster management. To address this gap, this study proposes a novel framework that leverages large language models (LLMs) and event knowledge graphs (EKGs) to facilitate the sustainable agile identification and adaptive control of disaster-related online rumors. The framework follows a multi-stage process, which includes the collection and preprocessing of disaster-related online data, the application of Gaussian Mixture Wasserstein Autoencoders (GMWAEs) for sentiment and rumor analysis, and the development of EKGs to enrich the understanding and reasoning of disaster events. Additionally, an enhanced model for rumor identification and risk control is introduced, utilizing Graph Attention Networks (GATs) to extract node features for accurate rumor detection and prediction of rumor propagation paths. Extensive experimental validation confirms the efficacy of the proposed methodology in improving disaster response. This study contributes novel theoretical insights and presents practical, scalable solutions for rumor control and risk management during crises.

1. Introduction

Amidst the increasing frequency and severity of major disasters, the rapid spread of misinformation poses a significant threat to public safety and crisis management. These disasters not only cause direct physical and environmental damage but also generate a large volume of false information, which can quickly incite public panic, disrupt emergency responses, and potentially lead to social instability [1]. Thus, the agile identification and adaptive risk control of disaster-related online rumors have become critical tasks in ensuring effective disaster response and maintaining social stability.
Traditional public opinion analysis methods face significant limitations when addressing disaster-related online rumors [2,3]. First, these methods typically rely on manual supervision and simple rule- or keyword-based matching, which makes it difficult to handle the large and complex volumes of social media data generated during disasters. Manual or semi-automated approaches are not only inefficient but also prone to missing critical nodes in rumor propagation due to information overload. Second, traditional methods often lack the capability to dynamically monitor and predict rumor propagation paths, leading to reactive rather than proactive responses to misinformation, which increases risks to public safety. Moreover, these methods often fall short in data analysis accuracy, making it challenging to effectively distinguish between rumors and factual information, frequently resulting in misjudgments, wasted emergency resources, and exacerbated public distrust.
To overcome these limitations, this study proposes an innovative framework based on LLMs and EKGs, aimed at achieving agile identification and adaptive risk control of disaster-related online rumors. This framework leverages advanced Artificial Intelligence (AI) technologies to efficiently and accurately process and analyze the vast amounts of social media data generated during disasters. The key stages of the framework include the collection and preprocessing of disaster-related data, followed by the construction of an LLM for sentiment analysis and rumor detection. Through in-depth text analysis, the model significantly enhances identification accuracy and efficiency. Additionally, GMWAEs are employed to handle complex data distributions, improving the model’s ability to capture anomalous information [4,5,6,7]. The construction of an Event Knowledge Graph integrates disaster-related information, providing richer contextual support and enhancing the prediction of rumor propagation paths. Unlike traditional methods, this knowledge graph can be dynamically updated, offering precise support for real-time monitoring and emergency response. Lastly, the improved model based on GAT extracts and analyzes node features, enabling early identification of rumors and prediction of propagation paths, thereby enhancing the system’s responsiveness and precision.
The practical applications of this study are of great significance. First, quickly and accurately identifying and controlling rumors during major disasters is crucial for public safety and social stability. The framework proposed in this study allows government agencies, emergency management departments, and other stakeholders to more effectively monitor online public opinion, promptly detect and respond to rumor propagation, and avoid resource wastage and social unrest. Second, the widespread use of social media has increased the speed and scope of information dissemination to unprecedented levels, rendering traditional public opinion analysis tools inadequate for modern crisis management needs. By incorporating advanced AI technologies, this study provides technical support for addressing the challenges of information dissemination, improving the accuracy of rumor identification, and enhancing the efficiency of risk control. Finally, this framework is not only applicable to major disasters but can also be extended to public health events, social crisis management, and policy communication, providing robust support for information management across various domains.
To validate the framework’s effectiveness and generalizability, this study examines two distinct case studies: the 2021 Zhengzhou flood and the 2023 Maui wildfire. The Zhengzhou flood, occurring in a Chinese urban context with a monolingual dataset, tests the framework’s baseline performance in a flood-specific scenario, while the Maui wildfire, an English-language dataset from a wildfire event, addresses gaps in multilingual data, diverse disaster types, and bot-generated misinformation. These cases were selected to reflect varying linguistic, cultural, and technological contexts, enabling a comprehensive evaluation of the framework’s adaptability and robustness across different disaster landscapes and evolving misinformation challenges. In summary, this study not only delves deeply into the theories of rumor control in the context of major disasters but also offers innovative technical solutions, particularly in agile identification and adaptive risk control, providing new ideas and methods for enhancing public safety and crisis management capabilities.

2. Related Work

The rapid dissemination of misinformation during major disasters poses significant challenges to public safety and crisis management [8,9,10]. AI technologies are increasingly crucial for agile identification and adaptive control. Related research primarily focuses on underlying technologies, categorized into three main approaches: generative topic models, which uncover latent themes and model complex data distributions; graph-based models, which leverage network structures to analyze information dissemination dynamics; and statistical classification methods, which employ supervised or probabilistic techniques for information classification and prediction. These technologies provide foundational support for addressing rumor dissemination.

2.1. Generative Topic Models

Generative topic models, such as Latent Dirichlet Allocation (LDA) and autoencoder-based variants, extract thematic patterns from text. Danny et al. [11] outlined topic modeling for large corpora. Uthirapathy et al. [12] combined LDA with Bidirectional Encoder Representations from Transformers (BERT) for Twitter rumor detection. Jaradat et al. [13] applied labeled LDA for real-time processing. Zhu et al. [14] tracked topic evolution, while Wang et al. [15] developed adaptive modeling with time decay. Virtanen et al. [16] modeled dynamic patterns in crime articles. Du et al. [17], Liu et al. [18], and Peng et al. [19] advanced topic tracking and hierarchical detection. Single-Pass clustering and WAE techniques underpin the improved models of Chen et al. [20,21], and BERT-based methods by Petrick et al. [22] enhance thematic and sentiment analysis.

2.2. Graph-Based Models

Graph-based models analyze rumor propagation via networks. Zhang et al. [23] used Graph Fusion Networks for multimodal event detection. Sagar et al. [24] applied clustering for topic tracking. Guo et al. [25] and Li et al. [26] advanced rumor detection with dynamic Graph Neural Networks (GNNs). Guan et al. [27] introduced EKGs for event modeling, while Mayank et al. [28] used DEAP-FAKED for news verification. Qudus et al. [29] enabled real-time fact-checking, and Kishore et al. [30] developed MultiCheck for multimodal verification.

2.3. Statistical Classification Methods

Statistical methods use supervised or probabilistic techniques. Liu et al. [31] proposed MLEM for multilingual event mining. Sukhwan et al. [32] modeled topical correlations. Ankita et al. [33] reviewed text categorization. Kuang et al. [34] aligned news and social media for event detection. Dai et al. [35] optimized Support Vector Machine (SVM), and Meysam et al. [36] reviewed Twitter topic detection. Technologies like Generative Pre-trained Transformer (GPT) by Zhu et al. [37] and Zerveas et al. [38], hybrid models by Zhang et al. [39] and Cho et al. [40], Reinforcement Learning (RL) by Sun et al. [41], and predictive modeling by Han et al. [42] enhance rumor detection and risk prediction.

2.4. Novel Framework for Proactive Rumor Management

Based on the foundational methods summarized in Table 1, there is currently limited literature directly addressing the identification and adaptive control of disaster-related online rumors. Relevant studies include the following: Ahsan et al. [43] reviewed rumor detection, verification, and control mechanisms in online social networks; Lian et al. [44] investigated control strategies for online misinformation during natural disasters, using Typhoon Mangkhut as a case study; Dong et al. [45] developed a model integrating adaptive rumor propagation and activity contagion within advanced networks; and Liu et al. [46] proposed a method based on zero-sum differential games and approximate dynamic programming to investigate competition on online social platforms and rumor control in disaster scenarios.
To qualitatively evaluate the performance of existing methods relative to the proposed framework, we conduct a comprehensive analysis by integrating insights from three foundational methods, generative topic models, graph-based models, and statistical classification methods, alongside studies specific to the identification and adaptive control of disaster-related online rumors. This multi-dimensional comparison spans key dimensions such as adaptability, semantic integration, proactivity, scalability, real-time capability, and generalizability, revealing the collective shortcomings of these methods in holistic disaster rumor management while underscoring the innovations of the proposed framework.
The analysis begins with the three foundational methods, starting with generative topic models, exemplified by LDA, BERT-enhanced variants, and WAE-based techniques. These models excel at extracting latent thematic patterns and tracking topic evolution through mechanisms such as drifting probability or time-decay adjustments, providing valuable insights into rumor themes like public panic during disasters. However, their static probabilistic assumptions often lead to posterior collapse in complex, multimodal disaster data, where evolving slang or emotional bursts overwhelm fixed distributions. This results in limited adaptability to real-time multilingual streams, leading to overlooked emerging rumors and delayed thematic analysis that hinders early intervention in disaster crises, where themes rapidly shift from ignition to conspiracy theories.
Among the foundational methods, graph-based models, including GNNs, EKGs, and fusion networks, effectively model propagation dynamics and multimodal interactions, for instance fusing text-image edges for event detection or clustering for topic tracking, thereby enabling visualization of rumor paths in social networks. Yet, they suffer from shallow semantic depth, struggling to handle nuanced contexts in disaster narratives, such as cultural biases or bot-amplified misinformation, which leads to incomplete relational inference and poor scalability in high-volume, cross-lingual data. In practice, this manifests as fragmented graphs that fail to predict adaptive paths in disaster events, where network contagion overlooks event causality, allowing rumors to cascade unchecked.
The statistical classification methods in the foundational approaches, including SVM optimizations, RL, GPT hybrids, and predictive modeling, provide robust supervised labeling and risk forecasting, for example by aligning news–social media for bursty events or incorporating game-theoretic competition, with strengths in handling engagement metrics such as retweets. Nevertheless, their reactive nature, focusing on post hoc verification rather than anticipation, combined with high computational demands, limits proactivity and real-time deployment. For instance, they perform well in isolated tasks like sentiment classification but lack integration for generalizable event mining, resulting in inconsistent performance across diverse disasters, where multilingual or sparse data exacerbates false negatives and erodes trust in adaptive controls.
Building on these limitations of the foundational methods, we now analyze studies specific to the identification and adaptive control of disaster-related online rumors. First, case-specific strategies for misinformation control during natural disasters typically employ rule-based approaches tailored to particular events, such as keyword matching or threshold-based flagging. While effective in isolated contexts, these methods exhibit poor scalability, struggling to generalize across diverse disaster types or platforms with varying data volumes. Moreover, they inadequately incorporate nuanced semantic analysis, overlooking subtle linguistic cues common in disaster-related rumors, such as sarcasm or cultural idioms. Consequently, such strategies lead to high false-positive rates in multicultural or multilingual environments, wasting resources on non-critical content and eroding trust in emergency communications.
Second, propagation models simulate rumor spread through network contagion dynamics, excelling at visualizing information flows and identifying key influencers. Yet, they frequently overlook linguistic variations, such as dialectal differences or evolving slang, and technical challenges like posterior collapse in probabilistic text distributions, where models underfit multimodal or noisy social media data. This oversight limits proactive risk mitigation, as predictions become unreliable in real-time floods of unverified posts, potentially allowing rumors to propagate unchecked before detection.
At the same time, game-theoretic competition models for rumor control in disaster-specific contexts frame misinformation as adversarial interactions between spreaders and verifiers, using optimization techniques like differential games to balance suppression costs. Although innovative, these models are computationally intensive, requiring extensive simulations that hinder real-time deployment, and remain reactive, responding post-detection rather than anticipating paths. They also neglect structured event reasoning, such as integrating temporal or causal event contexts, which is essential for multilingual data where rumors blend factual and fabricated elements across languages.
In contrast, the technical framework proposed in this study holistically addresses these shortcomings through synergistic integration. LLMs provide superior semantic integration and adaptability, surpassing the static themes of generative models and the rigidity of classification methods to enable real-time, culturally sensitive detection. EKGs offer scalable, dynamic relational inference, extending graph-based propagation to causal event mapping and live updates, addressing the semantic shallowness of network models and enabling generalizable path prediction across disaster types. The enhanced GMWAE ensures robust handling of complex distributions via Gaussian mixtures and Wasserstein regularization, mitigating posterior collapse and computational burdens while boosting proactivity through anomaly-aware autoencoding, outperforming statistical predictions in sparse, noisy scenarios. This holistic design promotes seamless generalizability, shifting from siloed, reactive tools to an interconnected system that anticipates risks, optimizes interventions, and enhances crisis resilience.

3. Proposed Approach

3.1. Overall Framework

The blending of rumors with real information further complicates their detection and management, necessitating a robust, integrated system. This system, detailed in Figure 1, is built around two core components—Sustainable Agile Identification and Adaptive Risk Control—supported by Data Collection and Preprocessing and Feedback and Optimization, and enhanced by LLMs and EKGs to ensure timely and effective rumor management.
Sustainable Agile Identification is essential for early rumor detection, enabling swift responses to curb their spread and mitigate harm. This component comprises three interconnected modules: Real-Time Monitoring, which leverages LLMs to scan and detect potential rumors from large-scale, unstructured online texts; Rumor Tagging, where LLMs categorize identified rumors based on sentiment and context; and Automated Reporting, which generates actionable reports to guide decision-making. These modules, as outlined in Figure 1, work synergistically to facilitate rapid analysis through advanced natural language processing, pinpointing emerging rumors for immediate intervention. Adaptive Risk Control follows, employing flexible, context-specific strategies to manage rumor impact, with three modules: Strategy Selection, which uses EKGs to assess risks and determine mitigation approaches; Dynamic Adjustment, which adapts strategies in real-time based on EKG-driven insights; and Multi-Sector Collaboration, which coordinates resources across departments for effective responses. This adaptability ensures a dynamic response to evolving rumor patterns, minimizing social harm during disasters.
Supporting these core components, Data Collection and Preprocessing gathers and cleans data from diverse sources such as social media, news, and government reports, providing a reliable foundation, while Feedback and Optimization evaluates system performance to drive continuous improvement. LLMs enhance Agile Identification by excelling in text and sentiment analysis, extracting emotional tendencies and implicit information to support accurate rumor tagging, while EKGs bolster Adaptive Risk Control by mapping event contexts and relationships, enabling advanced reasoning and risk assessment to refine strategy selection. This integrated system, as illustrated in Figure 1, connects data preprocessing to monitoring, tagging, reporting, strategy formulation, adjustment, collaboration, and optimization, ensuring a comprehensive approach to rumor identification and control validated by case studies.
The technical process begins with Data Collection and Preprocessing, which ensures accurate and consistent data input by structuring information from multiple sources, feeding into the system’s core components. Within Agile Identification, Real-Time Monitoring employs LLMs to process this data, extracting sentiment and keywords to identify potential rumors, followed by Rumor Tagging using LLM-driven contextual analysis to categorize them, and Automated Reporting to produce actionable outputs. In Adaptive Risk Control, Strategy Selection leverages EKGs to assess risks, Dynamic Adjustment adapts strategies based on EKG insights, and Multi-Sector Collaboration integrates resources for effective action. Finally, Feedback and Optimization analyzes outcomes, refining the LLM and EKG models to enhance adaptability, all orchestrated within the architecture to address real-world disaster scenarios effectively.

3.2. Data Collection and Preprocessing

In the data collection and preprocessing phase, the first step involves identifying online channels that offer valuable information related to major disasters, including social media platforms, news websites, blogs, online forums, and user-generated content platforms. Data collection focuses on not only textual information but also multimodal data, such as metadata from images and videos. Given that data often contains noise, missing information, and formatting inconsistencies, preprocessing is a critical step. The data cleaning process removes junk information, irrelevant data, duplicate content, and advertisements. Next, textual data is standardized through tokenization, stop-word removal, and lemmatization. Simultaneously, natural language processing techniques are applied to perform sentiment labeling, categorizing each text as positive, negative, or neutral to facilitate subsequent sentiment analysis. These steps ensure high data quality, providing reliable input for model training in later stages.
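The preprocessing steps described above (tokenization, stop-word removal, and sentiment labeling) can be illustrated with a minimal sketch. This is a toy example with hand-picked stop-word and sentiment lexicons, not the paper's pipeline; a production system would use full linguistic resources and a trained sentiment model.

```python
import re

# Illustrative stop-word list and sentiment lexicons (hypothetical examples);
# a real pipeline would use complete resources, e.g., from NLTK or spaCy.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "in", "on", "of", "and"}
POSITIVE = {"safe", "rescued", "recovered", "helping"}
NEGATIVE = {"flood", "collapsed", "panic", "trapped", "dead"}

def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def sentiment_label(tokens: list[str]) -> str:
    """Assign a coarse positive/negative/neutral label by lexicon counts."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

post = "The bridge collapsed and people are trapped in the flood"
tokens = preprocess(post)
print(tokens)                   # stop words removed
print(sentiment_label(tokens))  # → negative
```

Each cleaned, labeled post then serves as one training instance for the models described in the following sections.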

3.3. LLM Construction

In the large language model construction phase, the GMWAE is employed to train a deep learning model tailored for identifying disaster-related rumors, as illustrated in Figure 2. This architecture begins with input data—comprising vast corpora of disaster-related texts—modeled by a Gaussian Mixture Model to capture the complex and diverse textual features inherent in such contexts. The encoder then transforms this data into latent variables within the latent space, defined by mean (μ) and variance (σ), which represent the distribution of these features. These latent variables, denoted as Z, are sampled and passed to the decoder, which reconstructs data closely resembling the original input, optimized through Wasserstein Loss and Kullback–Leibler (KL) Divergence to ensure alignment in form and distribution, enhancing the model’s generalization ability to classify known rumors and detect emerging ones.
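The latent-space machinery described above can be sketched numerically: sampling a latent vector Z from a Gaussian mixture via the reparameterization trick, and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. The mixture parameters below are invented for illustration and are not taken from the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-component Gaussian mixture over a 4-dimensional latent space.
weights = np.array([0.6, 0.4])               # mixture weights (sum to 1)
means   = np.array([[0.0] * 4, [3.0] * 4])   # component means
log_var = np.array([[0.0] * 4, [0.5] * 4])   # component log-variances

def sample_latent(n: int) -> np.ndarray:
    """Reparameterized draw: pick a component, then z = mu + sigma * eps."""
    comp = rng.choice(len(weights), size=n, p=weights)
    eps = rng.standard_normal((n, means.shape[1]))
    return means[comp] + np.exp(0.5 * log_var[comp]) * eps

def kl_to_standard_normal(mu: np.ndarray, log_var: np.ndarray) -> float:
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

z = sample_latent(1000)
print(z.shape)  # (1000, 4)
print(kl_to_standard_normal(means[1], log_var[1]))  # grows with distance from origin
```

In the full GMWAE this KL term is combined with a Wasserstein reconstruction loss; the sketch only shows how the mixture structure lets distinct latent modes capture distinct textual regimes.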
To further refine the model’s performance, Figure 2 integrates an LLM embedding layer, leveraging the LLM’s strong semantic understanding to ensure the generated data aligns with the input’s contextual meaning. This embedding layer enhances the model’s ability to discern grammatical structures, sentiment tendencies, and underlying rumor propagation patterns, critical for assessing rumor authenticity and risk levels in disaster scenarios. The Semantic Consistency Enhancement Module, also depicted in Figure 2, further improves data generation quality by reinforcing contextual integrity, particularly for natural language processing tasks. This module specifically optimizes data generation quality through a multi-step mechanism: it employs a feedback loop that compares generated outputs against the original input embeddings, adjusting discrepancies using a similarity metric to preserve semantic coherence; it integrates a regularization technique to penalize deviations from the input’s syntactic and thematic structure; and it leverages contextual embeddings from the LLM to refine the latent space representation, ensuring that reconstructed data retains the nuanced meaning and intent of the source material. Together, these components enable the GMWAE-enhanced LLM to produce high-quality, semantically consistent outputs, equipping it to efficiently identify and classify disaster-related rumors with precision, as validated by its application in subsequent case studies.
By leveraging the architecture in Figure 2, the GMWAE-enhanced LLM not only generates accurate rumor representations but also supports real-time detection and scalability across diverse disaster scenarios, by adapting to varying data complexities. This design enhances the system’s ability to handle large-scale, dynamic social media inputs, ensuring robust performance in high-pressure environments, which is crucial for timely intervention. To assess how this adaptability translates into practical outcomes, Figure 3 provides a comparative analysis of reconstruction results from WAE and GMWAE on the same dataset, offering insights into their effectiveness in managing complex data distributions.
The WAE is primarily trained by minimizing reconstruction error. While it can capture the global features of the data to some degree, its performance often falls short when faced with complex structures, particularly in data that exhibits multiple modes or subtle features. The reconstructed data in Figure 3 reveals the WAE’s limitations in faithfully capturing such complexities. For instance, when dealing with data containing multiple Gaussian clusters, the WAE may struggle to accurately replicate these structures, leading to reconstructions that appear somewhat blurred or biased in comparison to the original data. In contrast, the GMWAE enhances the WAE by incorporating a Gaussian Mixture Model (GMM), which allows the model to better capture the intricate distribution of data within the latent space. The GMM increases the model’s expressive power by representing different modes and structures within the data through a mixture of multiple Gaussian distributions. As a result, the GMWAE generally achieves superior performance when handling complex data distributions. Its reconstructions in Figure 3 more accurately mirror the original data’s details, especially for complex patterns like Gaussian clusters, closely resembling the true structure of the original data. In summary, the GMWAE surpasses the WAE in managing complex data, as it more precisely captures and reconstructs intricate data distributions. By integrating the GMM, the GMWAE fully exploits the diversity of the latent space, thereby enhancing the quality of data reconstruction and delivering more accurate, detailed reconstruction results for complex data patterns.
In this study, the main role of LLMs is to perform sentiment classification and contextual analysis on disaster-related social media data, identifying rumors or misleading information. Through pre-training on large-scale text data, LLMs are able to recognize underlying patterns and contextual information within the text, helping to determine whether the information is a rumor. Next, the integration of LLMs with the GMWAE framework is achieved by combining sentiment analysis and anomaly detection functions. Specifically, LLMs conduct sentiment analysis and topic modeling on the input disaster-related text to help identify rumors or false information. Then, GMWAE is used to handle complex distributions in the text data, providing more precise modeling capabilities, especially in capturing anomalous data or potential rumors. The combination of Gaussian mixture models in GMWAE enables modeling of multiple complex data distributions, which is crucial for distinguishing between factual information and rumors. Additionally, the sentiment analysis results provided by the LLMs will be input into the GMWAE model, further enhancing the model’s ability to identify disaster-related information. By integrating the sentiment analysis data generated by LLMs into the GMWAE framework, the study can more efficiently capture anomalous information and rumor propagation paths from social media, thereby achieving more flexible and effective rumor detection and risk control.

3.4. EKG Construction

During the event knowledge graph construction phase, key entities related to disasters and event elements are extracted from the collected data to build a highly structured knowledge graph. This graph not only captures basic associations between entities but also represents complex causal relationships and event logic. Furthermore, the knowledge graph is enriched by incorporating information from external authoritative knowledge bases, thereby enhancing its comprehensiveness and accuracy. These enhancements equip the system with deep semantic understanding and reasoning capabilities, enabling it to swiftly identify rumors and make more precise response decisions when disaster events occur.
In this context, key entities related to disasters refer to the core components that define and drive a disaster scenario, serving as foundational nodes in the knowledge graph. These include disaster types, geographic elements, temporal markers, actors and organizations, resources, impacts, and misinformation elements. These entities are critical for mapping the disaster’s physical, social, and informational dimensions. Meanwhile, event elements encompass the dynamic components that outline the disaster’s progression and relationships, acting as attributes or connections within the graph. These include triggers and causes, processes and sequences, impacts and consequences, responses and interventions, relational logic, and contextual factors. Together, they construct a narrative framework that supports reasoning and prediction. The process of extracting these entities and elements from collected data, primarily through social media posts, employs a multi-stage AI-driven method. Initially, data is preprocessed using filtering techniques to remove noise, followed by tokenization and vectorization to create embeddings. Key entities are then extracted using Named Entity Recognition enhanced by GMWAE and BERT models, where BERT identifies entities such as locations or impacts, and GMWAE refines clustering with Gaussian mixtures, achieving high precision through multi-loss optimization. Event elements are derived via relation extraction with GATs, which map dependencies and identify themes like causal chains. The Semantic Consistency Enhancement Module ensures accuracy by comparing outputs to inputs, applying regularization to maintain syntactic integrity, and refining latent representations with LLM contextual embeddings. Finally, external knowledge bases enrich the graph through entity linking and triple extraction, supporting dynamic updates for real-time adaptability.
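The entity and relation extraction described above populates the graph with (subject, relation, object) triples over which the system can reason, for instance by tracing causal chains. A minimal in-memory sketch follows; the entity names and relations are hypothetical examples, and the paper's EKG would additionally carry temporal attributes and links to external knowledge bases.

```python
from collections import defaultdict

class EventKG:
    """Toy event knowledge graph storing (subject, relation, object) triples."""

    def __init__(self):
        self.triples = set()
        self.out = defaultdict(list)  # subject -> [(relation, object)]

    def add(self, subj: str, rel: str, obj: str) -> None:
        self.triples.add((subj, rel, obj))
        self.out[subj].append((rel, obj))

    def causal_chain(self, start: str, relation: str = "causes") -> list[str]:
        """Follow one relation transitively, e.g. to trace impact cascades."""
        chain, node = [start], start
        while True:
            nxt = [o for r, o in self.out[node] if r == relation]
            if not nxt or nxt[0] in chain:  # stop at dead ends and cycles
                break
            node = nxt[0]
            chain.append(node)
        return chain

kg = EventKG()
kg.add("heavy_rainfall", "causes", "urban_flood")
kg.add("urban_flood", "causes", "infrastructure_damage")
kg.add("urban_flood", "located_in", "Zhengzhou")
print(kg.causal_chain("heavy_rainfall"))
# → ['heavy_rainfall', 'urban_flood', 'infrastructure_damage']
```

Queries of this kind are what allow the framework to test whether a circulating claim is consistent with the known causal structure of the event.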
Figure 4 presents an integrated framework for disaster management, encompassing emergency response, public opinion dynamics, and post-disaster reconstruction, with interconnections that highlight the flow from initial triggers to long-term recovery. At the center, major disasters such as floods (leading to infrastructure damage, landslides, and building collapses), earthquakes, or tsunamis serve as the core nodes, initiating a chain of consequences including search and rescue operations, medical services, and evacuation efforts, as depicted in the diagram’s central pathways. Resource distribution and coordination nodes connect to emergency response modules like ambulance deployment, shelters, and relief resources, optimizing immediate interventions, while the “Causes” branch links disaster types to casualties, psychological support needs, and environmental recovery, emphasizing the multifaceted impacts. Simultaneously, the “Results in Rumors” and “Misinformation Spread” nodes illustrate how disasters generate rumors on social media and news outlets, influencing public opinion and amplifying panic, with “Public Opinion” and “Social Media” nodes feeding into fact-checking and data integration for enhanced awareness. The knowledge inference stage, leveraging EKGs and LLMs as shown in the diagram’s “Event Knowledge Graph” and “Large Language Model” components, facilitates real-time analysis to counter misinformation and refine response strategies through contextual risk assessment and policy updates. Post-disaster, the framework shifts to reconstruction nodes like infrastructure reconstruction, environmental recovery, and economic stimulus plans, guided by data-driven insights from “Effect Evaluation” and “System Update,” with “Policy Adjustment” and “Guides” ensuring adaptive resilience. 
This interconnected structure reinforces preparedness and mitigation strategies for future crises by integrating emergency actions, rumor control, and recovery efforts into a cohesive, dynamic system.
To further refine the analysis and optimization of emergency response, information management, and decision support systems, a comprehensive evaluation framework for knowledge graphs is employed. This framework assesses the knowledge graph across multiple dimensions to ensure its effectiveness in event-based analysis. Table 2 presents this evaluation framework, focusing on Accuracy (expressed through True Positives (TP), False Positives (FP), and False Negatives (FN)), Completeness, Scalability, Query Efficiency, and Inference Capability. Each dimension includes specific criteria and formulas to quantitatively evaluate the precision of events and relationships, the coverage of relevant entities, the scalability for large datasets and complex structures, query performance, and the accuracy and diversity of inferences.
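The Accuracy dimension in Table 2 reduces to standard precision, recall, and F1 computations over TP, FP, and FN counts, which can be sketched as follows (the counts used here are illustrative, not taken from the paper's evaluation):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard accuracy metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative counts for extracted event relationships
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=20)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```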

3.5. Rumor Identification and Risk Control

In the rumor identification and risk control stage, the system integrates the outcomes from previous stages and continuously monitors new data from various sources. This data is then analyzed using the constructed large language model in conjunction with the knowledge graph. The enhanced model for rumor identification and risk control, based on GATs, leverages graph-structured data to represent the information dissemination relationships within social networks. The structure is illustrated in Figure 5.
Figure 5 illustrates the structure of a multi-layer GAT, with each layer adhering to a consistent operational flow tailored for processing graph-structured data, as depicted in the diagram. Each layer begins by transforming input features into query (Q), key (K), and value (V) vectors through a linear transformation, a process clearly outlined in the central flow. These vectors are then fed into a multi-head attention mechanism, where their correlations are computed via scaled dot-product attention, highlighted by the attention head nodes in the diagram. The outputs from all attention heads are concatenated, aggregated, and integrated with the original input features via a residual connection, followed by a normalization step to enhance stability and mitigate the vanishing gradient problem. The aggregated features then pass through a Dropout layer, illustrated in the regularization segment of the diagram, to prevent overfitting, and a LeakyReLU activation function, depicted as the nonlinearity step, to introduce non-linear transformations. The resulting output features, enriched with contextual information as indicated in the final output node, serve either as the model’s final output or as input to the subsequent layer, enabling effective rumor propagation analysis in disaster scenarios.
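The per-layer flow described above can be sketched in NumPy. This is a simplified, dense-attention illustration of the Figure 5 pipeline (Q/K/V projection, multi-head scaled dot-product attention, concatenation, residual connection with normalization, dropout, LeakyReLU); a full GAT would additionally mask attention scores with the graph's adjacency structure, and the dimensions and head count here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    sigma = x.std(-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def attention_layer(h, n_heads=4, drop_p=0.1, train=False):
    """One layer following Figure 5's flow. Simplified: attention is dense over
    all nodes; a true GAT would restrict it to graph neighbors."""
    n, d = h.shape
    dk = d // n_heads
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = h @ Wq, h @ Wk, h @ Wv                      # linear Q/K/V projection
    heads = []
    for i in range(n_heads):
        s = slice(i * dk, (i + 1) * dk)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(dk)        # scaled dot-product
        attn = np.exp(scores - scores.max(-1, keepdims=True))
        attn /= attn.sum(-1, keepdims=True)               # softmax over nodes
        heads.append(attn @ V[:, s])
    out = layer_norm(h + np.concatenate(heads, axis=-1))  # residual + normalization
    if train:                                             # dropout regularization
        out *= rng.random(out.shape) > drop_p
    return leaky_relu(out)                                # non-linearity

h = rng.standard_normal((6, 16))   # 6 nodes, 16-dim features
out = attention_layer(h)
print(out.shape)  # (6, 16)
```

Because the output keeps the input dimensionality, layers stack directly, matching the multi-layer structure in Figure 5.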

4. Case Study of Major Disaster-Related Online Rumors

4.1. Data Collection and Preprocessing Related to the 2021 Zhengzhou Flood

Crawler technology is employed to preprocess data from 16,131 microblogs related to the 2021 Zhengzhou flood, resulting in a comprehensive corpus focused on public opinion surrounding the disaster. Given the nature of microblog data, characterized by repetitive content, colloquial language, and the frequent use of emoticons, thorough data cleaning and organization are essential to enhance the accuracy of subsequent analyses. The text preprocessing stage involves several key steps to standardize the data. First, repetitive content from reposts and low-frequency words are removed to prevent any adverse impact on topic extraction and model performance. The JIEBA word segmentation tool is then applied to segment the Chinese microblog data, effectively preparing the text for further analysis. Additionally, emoticons, which convey users’ emotions and attitudes, are translated to enrich the corpus and accurately reflect their influence on topic classification. Stop words, which are common terms with minimal semantic value, are also removed using a customized list that incorporates both general and flood disaster-specific stop words, thereby enhancing the clarity of the analysis. Finally, the text is vectorized using a count vectorizer, transforming it into a structured format suitable for further analysis.
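The cleaning steps above can be sketched as a minimal pipeline. The stop-word set, emoticon map, and sample posts are illustrative stand-ins, and whitespace splitting replaces JIEBA segmentation, which the actual pipeline uses for Chinese text:

```python
from collections import Counter

# Illustrative stand-ins; the paper uses JIEBA segmentation and a customized
# general + flood-specific stop-word list.
STOP_WORDS = {"the", "a", "of", "is"}
EMOTICON_MAP = {":(": "sad", ":)": "happy"}

def preprocess(posts):
    seen, cleaned = set(), []
    for post in posts:
        if post in seen:                              # drop repetitive reposts
            continue
        seen.add(post)
        for emo, word in EMOTICON_MAP.items():        # translate emoticons
            post = post.replace(emo, " " + word + " ")
        tokens = post.lower().split()                 # placeholder for JIEBA
        cleaned.append([t for t in tokens if t not in STOP_WORDS])
    return cleaned

def count_vectorize(docs):
    """Count vectorizer: term-frequency vectors over a shared vocabulary."""
    vocab = sorted({t for doc in docs for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for t, c in Counter(doc).items():
            vec[index[t]] += c
        vectors.append(vec)
    return vocab, vectors

docs = preprocess(["the flood is rising :(", "the flood is rising :(",
                   "rescue teams arrive :)"])
vocab, vectors = count_vectorize(docs)
print(vocab)
```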

4.2. Emotion Analysis Based on LLMs

The integration of LLMs with the GMWAE significantly enhances the accuracy and depth of sentiment analysis in online public opinion monitoring. By leveraging the advanced linguistic capabilities of LLMs, this approach captures nuanced emotions and context from unstructured text data, such as social media posts. The GMWAE, renowned for its ability to model complex data distributions, further refines this analysis by effectively capturing the underlying sentiment distributions and variability within the data. This combination enables more precise identification and classification of sentiments, improving the system’s capability to detect subtle emotional shifts and potential risks in online discourse. As a result, it becomes a powerful tool for the proactive management of public sentiment.
Figure 6 illustrates the average emotional value of daily microblogs, calculated by date. The analysis reveals that on 17 July 2021, when the rainstorm abruptly struck Zhengzhou, the average sentiment of microblogs reached its most negative point. This sharp decline reflects the widespread distress and concern during the initial stages of the disaster. In response, the government swiftly mobilized extensive resources for disaster relief. As these efforts began to take effect, the average emotional value of microblogs steadily improved. By 28 July 2021, sentiment had risen to over 0.84, indicating a significant shift towards a more positive emotional state. This improvement suggests a growing sense of relief and optimism among the public, attributed to the effective disaster management and recovery measures implemented. To further identify, quantify, and track changes in public sentiment, we utilize the emotion word distribution matrix. This matrix quantifies and analyzes the distribution of sentiment words, providing essential data support for developing effective public opinion management strategies.
Figure 7 presents the distribution of emotional words across 16,131 microblog posts related to the 2021 Zhengzhou flood, categorized into seven emotions: happiness, sadness, joy, anger, surprise, fear, and disgust. The matrix shows word frequencies in each emotion category, with a color gradient from blue (low frequency) to red (high frequency). For example, words related to “sadness” appeared 609 times in sadness posts, while “anger” words were used 588 times in anger posts, reflecting direct associations between emotions and language. Off-diagonal interactions, such as “disgust” words appearing 696 times in happiness posts, highlight the complexity of emotional expressions in disaster contexts. This matrix provides a comprehensive analysis of emotional responses during the disaster, offering insights that aid in developing effective public opinion monitoring and early warning systems to manage emotional fluctuations.
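The construction of such an emotion word distribution matrix can be sketched as follows; the three-word lexicon and the two labeled posts are hypothetical, whereas the paper's matrix spans seven emotion categories over 16,131 posts:

```python
from collections import defaultdict

# Hypothetical emotion lexicon; the paper's categories are happiness, sadness,
# joy, anger, surprise, fear, and disgust.
LEXICON = {"grief": "sadness", "furious": "anger", "relief": "happiness"}

def emotion_word_matrix(posts):
    """Rows: the post's labeled emotion; columns: the lexicon category of each
    word it contains. Off-diagonal cells capture mixed emotional expression."""
    matrix = defaultdict(lambda: defaultdict(int))
    for text, post_emotion in posts:
        for word in text.split():
            if word in LEXICON:
                matrix[post_emotion][LEXICON[word]] += 1
    return matrix

posts = [("grief and furious voices", "sadness"), ("relief at last", "happiness")]
m = emotion_word_matrix(posts)
print(m["sadness"]["sadness"], m["sadness"]["anger"], m["happiness"]["happiness"])
```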
BERT is a powerful machine learning algorithm based on the transformer framework, widely recognized for its high efficiency, scalability, and superior performance in natural language processing tasks, including classification and sentiment analysis. BERT leverages bidirectional context to construct a robust representation of text and improves model accuracy by capturing nuanced relationships within data. When combined with generative models, BERT offers the advantage of powerful emotion topic classification. Generative models are used to extract meaningful features and structure the data, while BERT provides robust classification capabilities by fine-tuning on these features. By using these combinations, BERT can efficiently capture latent topics or distributions in the data and classify the emotional tone of public opinion with high precision.
The combined GMWAE and BERT model is used to classify the emotion topics of public opinion in flood disaster microblogs and performs well. To compare the performance of different models, four experimental configurations are built: LDA + BERT, Variational Autoencoder (VAE) + BERT, WAE + BERT, and GMWAE + BERT. Precision, recall, and F1 scores are shown in Table 3. The results show that the model adopted in this paper achieves the best performance.
The GMWAE + BERT emotion topic classification method introduces the Wasserstein distance, which alleviates the gradient vanishing problem owing to its superior smoothing property relative to KL divergence and Jensen-Shannon (JS) divergence. The Wasserstein objective stabilizes training and provides a reliable process index that correlates strongly with the quality of generated samples, while BERT’s bidirectional contextual understanding and pre-trained semantic knowledge support refined classification. As a result, the topic words extracted by this method exhibit superior expressive power, and the derived topics are more detailed and nuanced. This explains why the model achieves higher scores in performance comparisons with other models.
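The smoothing property can be illustrated numerically: for two distributions with disjoint supports, KL divergence is infinite regardless of how far apart they are, providing no useful gradient, while the 1D Wasserstein distance grows smoothly with the separation. A minimal sketch using empirical samples (an illustration, not the paper's actual training objective):

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between equal-size 1D samples:
    mean absolute difference of the sorted samples."""
    return np.abs(np.sort(a) - np.sort(b)).mean()

base = np.linspace(0.0, 1.0, 100)
for shift in (0.5, 1.5, 3.0):
    # With disjoint supports, KL(base || base + shift) is infinite for every
    # shift, giving no gradient signal; W1 instead equals the separation and
    # varies smoothly with it.
    print(shift, wasserstein_1d(base, base + shift))
```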
The GMWAE + BERT model shows good results, but the following limitations were found and need further research. First, as the number of emotion topics increases, the performance of GMWAE + BERT degrades to some extent; a possible reason is that more distributions are modeled within a small region of the latent space, making overlap more likely, so that the same latent variable corresponds to multiple topic distributions and leads to posterior collapse. Additionally, model training takes longer due to BERT’s computational demands during fine-tuning, and convergence is slower. Lastly, covariance is difficult to estimate when samples are insufficient.

4.3. Research on the 2021 Zhengzhou Flood Case Study

Figure 8 illustrates a detailed network of physical factors and their consequences triggered by the flood, with the “Flood” node at the center as the initiating event, as depicted in the diagram’s central structure. This node sets off structural damage and building collapses, leading to injuries and casualties, while overwhelming drainage systems and causing water contamination, as shown by the interconnected pathways that map these physical impacts specific to the 2021 Zhengzhou event. The network highlights evacuation as a direct response to these physical threats, with connections linking it to the flood warning system, a critical element that triggers timely evacuations and reduces the disaster’s physical toll. Emergency services, such as rescue operations, first aid, and temporary shelters, expand in response to rising casualties, as illustrated by their linked nodes, reflecting immediate physical relief efforts. Overall, this structured representation vividly portrays the cascading physical effects and response measures of the 2021 Zhengzhou flood, emphasizing the complex interconnections that support disaster mitigation.
Figure 9 presents a detailed knowledge graph illustrating the interconnected elements central to disaster management, with the “DISASTER TYPE” node, such as flooding, serving as the starting point, triggering secondary impacts like “WATER POLLUTION” and “TRAFFIC DISRUPTION,” as shown by the branching connections in the diagram. These factors amplify the disaster’s reach, intensifying public panic and economic losses, which are linked to “GOVERNMENT RESPONSE” and “INTERNATIONAL AID” nodes that coordinate relief efforts, with arrows highlighting their influence. Relief funds, volunteer coordination, and public education nodes mitigate these effects, as depicted in the graph’s response pathways. Simultaneously, “SCIENTIFIC RESEARCH” and “DISASTER PREDICTION” nodes feed into future preparedness, while “TECHNOLOGICAL INNOVATION” and “POLICY ADJUSTMENT” nodes drive strategic enhancements, with their interconnections illustrated. Social elements, including “PUBLIC OPINION” and “SOCIAL MEDIA” nodes, underscore information dissemination, with their roles mapped in the diagram. During post-disaster recovery, nodes such as “POLICY ADJUSTMENT,” “ECONOMIC STIMULUS PLAN,” and “POST-DISASTER RECONSTRUCTION” guide resilience-building efforts, as shown by the recovery pathways. This societal response and recovery knowledge graph provides a multidimensional view of how these interconnected factors shape disaster response and recovery, emphasizing coordinated action to bolster resilience, particularly in the context of the 2021 Zhengzhou flood.
Figure 8 and Figure 9 offer complementary insights into disaster events, with Figure 8 detailing the causal chain of physical impacts stemming from the flood, such as structural damage, building collapses, injuries, and the subsequent demand for rescue operations, as mapped by its interconnected nodes. This diagram emphasizes the direct material losses and physical consequences, highlighting the flood’s immediate toll. In contrast, Figure 9 broadens the perspective to encompass multi-level societal reactions, illustrating the roles of media coverage, public panic, government and international aid responses, and macro-level factors like scientific research, technological innovation, and policy adjustments, with its network of connected elements. Together, these diagrams provide a comprehensive view of the disaster process, with Figure 8 focusing on the initial physical impacts and Figure 9 exploring the ensuing societal responses and recovery strategies, collectively depicting the full spectrum from occurrence to resilience-building.
To further assess the effectiveness of these knowledge graphs, Table 4 compares Figure 8 and Figure 9 across criteria such as Precision, Accuracy, Coverage, Completeness, Scalability, Response Time, Resource Utilization, Inference Accuracy, and Inference Diversity. Figure 9 consistently outperforms Figure 8, showing higher values in most categories, which reflects greater precision in entity representation, more accurate relationship modeling, broader coverage and completeness, improved scalability and efficiency, and enhanced accuracy and diversity in inference capabilities. The dataset used for evaluation was specifically constructed for this study to assess the performance of EKGs in disaster-related rumor detection and misinformation analysis. Data sources primarily include real-world social media posts, such as those from Weibo and Twitter, supplemented by textual and multimodal content from news articles and official reports to ensure diversity. For the Zhengzhou flood case, 16,131 Weibo posts were collected from 17 July to 30 July 2021, using keyword searches and API scraping, focusing on posts containing disaster-related discussions, rumors, and responses. To enhance the dataset and address limitations in rumor-specific labeling, additional samples were drawn from publicly available benchmark datasets to evaluate EKG construction and inference. Annotation rules followed a multi-step, expert-guided process to ensure reliability and alignment with disaster misinformation dynamics. First, posts were filtered for relevance using automated keyword matching and manual review to exclude off-topic content. Second, a team of domain experts labeled entities and relationships in a semi-supervised manner, with event elements annotated using multi-class labels for veracity and severity; inter-annotator disagreements were resolved via majority vote. The process incorporated weak supervision from pre-trained models to scale labeling, followed by human validation of the labeled data.
Evaluation proceeded in phases: (1) automated metric computation using TP, FP, and FN to assess Concept and Entity Precision, Event Relationship Accuracy, Event Coverage, and Entity and Attribute Completeness, quantifying the performance of entity extraction and relationship modeling through precision, recall, and F1 scores; (2) graph-specific tests, including Data Scalability, Structural Scalability, Response Time, and Resource Utilization, measuring the robustness and resource efficiency of graph construction through processing time and efficiency benchmarks on varying subgraph sizes; (3) inference simulations via GATs to evaluate Inference Accuracy and Inference Diversity, measuring the accuracy and diversity of retained rumor propagation paths; (4) cross-validation across the two cases to assess overall generalizability, ensuring the model’s robustness across multiple disaster scenarios. This rigorous construction and evaluation process validated the EKGs’ robustness.
Building on this robust foundation, the knowledge graphs’ ability to model complex event relationships enables precise predictions of disaster dynamics, a critical aspect validated through temporal analysis of specific incidents. This leads to an examination of how these insights apply to the 2021 Zhengzhou flood, where probability variations of key events are analyzed over time.
Figure 10 details the probability variations of four key events (urban flooding, infrastructure damage, rescue operations, psychological assistance) during the 2021 Zhengzhou flood across different dates. The X-axis covers the date range from 17 July 2021, to 30 July 2021, while the Y-axis represents the propagation probability, with values ranging from 0 to 1, indicating the likelihood of the events happening.
Figure 10 reveals the evolving trends of these events, with the blue curve demonstrating an early peak in urban flooding likelihood that subsides as mitigation efforts progress, while the orange curve indicates a consistent rise in infrastructure damage probability, reflecting ongoing structural challenges. The green curve highlights a rapid escalation and sustained high probability for rescue operations, adapting to the disaster’s demands, and the pink curve shows an initial surge in psychological assistance needs that tapers off as recovery support takes hold.

4.4. Supplementary Research on the 2023 Maui Wildfire Case Study

The 2021 Zhengzhou flood case study, based on a structured Weibo dataset comprising 16,131 posts, effectively validated the framework’s performance in detecting and managing urban flood-related rumors in a Chinese context. However, this case study has critical limitations, restricting the framework’s generalizability across diverse disaster scenarios and evolving misinformation environments, particularly in handling multilingual data, varied disaster types, and countering misinformation generated by bot accounts. To address these gaps and evaluate the framework’s adaptability, this study supplements the analysis with the 2023 Maui wildfire case, which occurred on 8 August 2023, in Hawaii, USA, using a dataset of approximately 11,407 English-language posts. This case was selected to specifically address three limitations of the Zhengzhou flood case:
First, the Zhengzhou dataset focused on monolingual Chinese data processed using JIEBA tokenization, which could not test the framework’s ability to handle multilingual and multicultural social media data. In contrast, the Maui wildfire dataset, consisting of English posts, contains extensive Western slang, multimedia content, and diverse expressive forms. By processing this data using vectorization techniques to mitigate higher noise interference, this study validates the framework’s cross-linguistic robustness.
Second, the Zhengzhou case centered on urban flooding, with rumors primarily related to infrastructure failures and rescue delays, confined to the dynamic characteristics specific to flood disasters. Conversely, the Maui wildfire involved rapidly spreading fires, air pollution, and evacuation challenges, giving rise to wildfire-specific misinformation, such as “laser-induced arson”. This case tests the framework’s ability to adapt to different disaster event structures within EKGs and its adaptability in propagation analysis using GATs.
Third, the 2021 Zhengzhou case predates the widespread adoption of generative AI technologies, whereas the Maui dataset includes misinformation propagated by bot accounts. By enhancing EKG nodes and leveraging GATs to detect automated propagation patterns, this study evaluates the framework’s capability to handle multimodal misinformation.
Applying the framework to the Maui dataset revealed that the data preprocessing phase must address higher noise levels from bot-generated content, necessitating strengthened filtering mechanisms. Sentiment analysis using the GMWAE + BERT model yielded an F1 score of 0.92, slightly lower than the 0.95 achieved in the Zhengzhou case. This difference primarily stems from complex English slang and emotionally driven expressions fueled by bot-generated misinformation. During EKG construction, wildfire-related nodes such as “fire spread” and “AI misinformation” were incorporated, as shown in Figure 11 and Figure 12.
Figure 11 illustrates a structured knowledge graph mapping the physical consequences and causal relationships of wildfires, with the central “Wildfire” node serving as the hub that integrates environmental triggers and direct impacts. Climate change exacerbates drought and vegetation dryness, which combined with strong winds, lead to fire ignition and rapid fire spread, as depicted by the branching arrows connecting these nodes. The core physical impacts radiate from the wildfire node, including building destruction and structural damage that result in casualties, necessitating evacuation and emergency services like rescue operations and medical aid, shown through interconnected edges emphasizing material losses and immediate threats. Air pollution emerges as a key secondary effect, influencing public health and environmental recovery, while shelter and relief supplies nodes highlight the physical response demands, linked to government and international aid coordination. This network underscores the cascading nature of physical devastation, from ignition to widespread damage, providing a visual framework for analyzing wildfire dynamics and informing targeted mitigation strategies in disaster management.
Figure 12 presents a structured knowledge graph that outlines the interconnected impacts of and responses to a wildfire event, with the central Disaster Type node serving as the key hub linking environmental triggers and societal effects. Disaster nodes such as fire and infrastructure damage initiate a cascade through core nodes such as major disaster and evacuation, and through secondary disaster nodes such as traffic disruption and air pollution, as shown by the weighted edges (ranging from 0.7 to 0.9) that indicate impact strength. Societal reaction nodes (emergency response, volunteers, and international aid) and economic and policy nodes (policy adjustment, relief fund allocation, and public safety) illustrate community and governance responses, while support nodes such as media coverage, social media, AI misinformation, and the laser fire conspiracy, with direct and indirect influences, highlight the role of information and rumors in shaping perceptions. This network underscores the progression from initial disaster to recovery efforts, emphasizing the interplay of physical, social, and informational dynamics in managing wildfire crises. Based on the evaluation methods described, the inference diversity, at 0.90, surpasses that of the Zhengzhou case (0.89). This improvement stems from the incorporation of richer digital rumor nodes, addressing the flood case’s primary focus on physical impacts.
Figure 13 displays a line graph that tracks the evolution of misinformation risk from 9 August to 19 August 2023. The red line, representing the overall misinformation propagation risk, starts low and rises sharply in the initial days, reaching a peak influenced by the “Laser Fire Theory Peak,” a surge in conspiracy-related misinformation. This is followed by a phase marked by “Emergency Response Questions,” indicating public concerns, before the risk gradually declines. A brief uptick tied to “Development Concerns” occurs mid-period, after which the risk stabilizes in the low-risk zone. The graph highlights the dynamic shifts in misinformation risk, peaking early due to rumors and tapering off as the situation evolves, underscoring the need for adaptive communication strategies during the crisis.
Overall, the case studies conducted on the 2021 Zhengzhou flood and the 2023 Maui wildfire provided detailed evidence of the framework’s effectiveness in detecting specific rumors and predicting their propagation paths with high fidelity. For the Zhengzhou flood case, the GATs-enhanced model accurately identified key rumors, with those surrounding “systemic failures in flood warning and response” being the most prominent. These rumors predominantly reflect the public’s excessive panic regarding the operation of the flood warning system and coordination of personnel evacuation. The model, through the GATs attention mechanism, conducted a propagation path analysis, successfully tracing the origins of these rumors to unverified local accounts with low credibility scores, and revealed a rapid dissemination pattern driven by retweets and community clusters. The GMWAE sentiment analysis indicated a shift from panic to skepticism, providing information support for targeted debunking efforts. For the Maui wildfire, the framework detected rumors such as the “laser-induced arson” conspiracy, with the GATs model predicting propagation paths by focusing on amplification driven by bot-generated misinformation, identifying key nodes like influential users and misinformation hubs. These insights, enriched by EKGs contextualization, facilitated precise intervention strategies, such as prioritizing high-risk user clusters, demonstrating the framework’s capability for proactive and adaptive rumor management.
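The propagation-path reasoning described above can be sketched as a max-probability path search, treating edge attention weights as propagation probabilities and running Dijkstra's algorithm over negative log-weights. The account graph and weights below are hypothetical, standing in for attention scores produced by the trained GATs:

```python
import heapq
from math import exp, log

# Hypothetical edge attention weights from a trained GAT, interpreted as
# propagation probabilities between accounts.
edges = {
    "origin_account": {"hub_1": 0.9, "user_a": 0.4},
    "hub_1": {"community_cluster": 0.8},
    "user_a": {"community_cluster": 0.3},
}

def most_probable_path(graph, src, dst):
    """Max-probability path = shortest path under -log(weight) edge costs."""
    pq, best = [(0.0, src, [src])], {}
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == dst:
            return path, exp(-cost)
        if node in best and best[node] <= cost:
            continue
        best[node] = cost
        for nxt, w in graph.get(node, {}).items():
            heapq.heappush(pq, (cost - log(w), nxt, path + [nxt]))
    return None, 0.0

path, prob = most_probable_path(edges, "origin_account", "community_cluster")
print(path, round(prob, 3))
```

Here the route through the high-credibility-weighted hub dominates, mirroring how the analysis above traces rumors back through influential accounts and community clusters.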

5. Discussion

This study introduces an advanced framework to combat the rapid spread of misinformation during disasters, a critical threat to public safety and crisis management. By integrating LLMs with EKGs, the framework enables agile detection and control of disaster-related online rumors. It comprises three key stages: collecting and preprocessing online data, building an LLM specialized in sentiment and rumor analysis using the GMWAE, and constructing an event knowledge graph for deeper event understanding. A novel integration of GMWAE and GATs further enhances the framework’s ability to model rumor dissemination. GMWAE extracts rich semantic features from disaster-related texts, which are mapped as node attributes in a graph, empowering GATs to capture complex rumor propagation patterns. This synergy improves both the accuracy of rumor detection and the prediction of dissemination paths. Experimental results demonstrate that the framework performs robustly in diverse disaster scenarios, as evidenced by its application to the 2021 Zhengzhou flood and the 2023 Maui wildfire. These case studies validate the framework’s effectiveness in handling monolingual and multilingual datasets, different disaster types, and complex propagation patterns, including bot-generated misinformation. The framework significantly outperforms existing methods, offering a novel approach to managing rumor risks in disaster scenarios and shedding light on the interplay between textual content and network dynamics.
For practical implementation, the framework is designed to seamlessly integrate into operational crisis management ecosystems, with deployment commencing from the modular encapsulation of core components, including data ingestion pipelines, LLMs and GMWAE sentiment analyzers, EKGs builders, and GATs propagation predictors, into containerized microservices using Docker and Kubernetes for scalability. Real-time data streams from social media are ingested via Kafka streams and processed on GPU-accelerated cloud infrastructures to handle peak loads during disasters. Initial setup requires baseline model fine-tuning on specific datasets, followed by iterative validation. Operational monitoring employs dashboards to track key performance indicators. Stakeholder engagement is facilitated through hybrid training programs, combining domain experts from emergency agencies with AI specialists to customize EKGs schemas for local contexts. Post-deployment, continuous auditing via A/B testing refines thresholds, while rollback mechanisms safeguard against model drift in evolving threat landscapes.
Cross-national implementation requires rigorous attention to data management and AI governance to address geopolitical and ethical variances. Data management protocols prioritize compliance with diverse regulations. Federated learning architectures support cross-border collaborative model updates without centralizing raw data, thereby enabling shared EKG enrichments while preserving data sovereignty. For multilingual scalability, biases from underrepresented languages are mitigated through diverse training corpora. AI governance aspects include rigorous ethical audits to detect and correct model biases, transparency mechanisms via explainable AI tools in EKGs, and accountability frameworks involving multidisciplinary oversight committees. These measures not only promote equitable cross-border access but also build trust, ultimately fostering resilient, inclusive disaster response ecosystems.
Despite these advances, several limitations require further attention. The most pressing challenge is the framework’s long training time and slow convergence, particularly with large-scale data, which hinders its use in time-sensitive disaster response. Optimizing the training process is critical to enable real-time rumor detection, allowing authorities to issue timely public warnings. Additionally, the GMWAE model struggles to adapt to texts of varying lengths, leading to inconsistent performance across short social media posts and longer articles. This variability could reduce reliability in diverse online environments. Another issue is the scarcity of disaster-related data, which complicates accurate rumor modeling, especially with small sample sizes. To address these challenges, future research will prioritize three improvements. First, we will streamline the training process using parallel computing and lightweight model designs to enhance real-time applicability while maintaining accuracy. Second, we will refine GMWAE’s architecture with adaptive mechanisms, such as attention-based modules, to handle diverse text lengths effectively. Third, to overcome data scarcity, we will explore data augmentation techniques, such as generative models like Variational Autoencoders, and alternative statistical methods, like Bayesian inference, to improve modeling with limited samples.
Moving forward, two research directions will guide further development. First, we will systematically study how text length and data scarcity affect performance, developing robust solutions to ensure consistent results across varied conditions. This includes testing augmentation strategies like synthetic text generation via pre-trained LLMs. Second, we will investigate advanced prior distributions, such as Dirichlet or hierarchical Bayesian models, to better capture the nuances of disaster-related texts, optimizing the rumor detection model for greater precision. Balancing training speed, convergence, and accuracy will remain a key focus to ensure operational efficiency.
In conclusion, this framework lays a strong foundation for combating disaster-related misinformation, but addressing its limitations is essential to maximize its impact. By enabling faster and more reliable rumor control, ongoing refinements will strengthen public safety systems, equipping communities to respond more effectively to future crises.

6. Conclusions

This study addresses the escalating challenge of misinformation during disasters, a growing threat that undermines public safety and effective crisis management. We developed an intelligent framework using LLMs and EKGs to swiftly identify and control online rumors. The framework involves collecting and preprocessing data, building a specialized LLM with GMWAE for sentiment and rumor analysis, and integrating an event knowledge graph for deeper event understanding. Enhanced GATs further enable accurate rumor detection and propagation prediction. Experimental results, validated through case studies on the 2021 Zhengzhou flood and the 2023 Maui wildfire, confirm the framework’s effectiveness in improving disaster response, highlighting its potential to significantly reduce misinformation’s impact in crises.
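For readers unfamiliar with the mechanism behind the enhanced GATs, the following is a minimal single-head graph attention layer in NumPy (an illustrative sketch with toy random weights, not the trained model): each node of the propagation graph aggregates neighbor features weighted by softmax-normalized attention coefficients.

```python
import numpy as np

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer.
    h: (n, d) node features; adj: (n, n) adjacency with self-loops;
    W: (d, d_out) projection; a: (2*d_out,) attention vector."""
    z = h @ W
    n = z.shape[0]
    # Attention logits e_ij = LeakyReLU(a^T [z_i || z_j]) per ordered pair.
    e = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else slope * s
    e = np.where(adj > 0, e, -np.inf)            # attend only to neighbors
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax per neighborhood
    return alpha, alpha @ z                      # coefficients, new features

# Toy propagation graph: 4 posts, edges along repost paths, plus self-loops.
rng = np.random.default_rng(0)
adj = np.eye(4) + np.array([[0, 1, 1, 0],
                            [1, 0, 0, 1],
                            [1, 0, 0, 0],
                            [0, 1, 0, 0]])
h = rng.normal(size=(4, 3))
alpha, h_new = gat_layer(h, adj, rng.normal(size=(3, 2)), rng.normal(size=4))
```

In the full framework, stacked layers of this form produce the node embeddings used for rumor classification and propagation-path prediction.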
However, challenges persist that could limit its effectiveness. The most pressing is the framework’s slow response time, which risks delaying critical actions, such as debunking false alerts that could trigger panic or disrupt evacuations. Similarly, inconsistent detection of diverse rumor types, such as those concerning health versus infrastructure, may undermine public confidence during multifaceted crises. Future efforts will focus on enabling instant, consistent rumor mitigation to meet the demands of urgent disaster scenarios.
In summary, this framework establishes a vital foundation for smarter disaster management. By addressing these challenges, it can empower responders to act swiftly, averting chaos and protecting lives. We envision it catalyzing AI-driven disaster strategies worldwide, standardizing real-time misinformation controls, and fostering joint initiatives among AI researchers, emergency teams, and policymakers. Ongoing advancements will fortify global resilience, equipping societies to confront future crises with clarity and unity.

Funding

This study was supported by a grant from the Basic Research Projects of the Education Department of Liaoning Province (Grant No. LJ212411035018).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because of the need to protect the privacy of their content.

Acknowledgments

I gratefully acknowledge the financial support provided by the funding agencies.

Conflicts of Interest

The author declares no potential conflicts of interest with respect to the research, authorship, or publication of this article.

References

1. Chen, L.; Liu, Y.; Chang, Y.D. Public opinion analysis of novel coronavirus from online data. J. Saf. Sci. Resil. 2020, 1, 120–127.
2. Alattar, F.; Shaalan, K. Emerging research topic detection using filtered-LDA. AI 2021, 2, 578–599.
3. Kosinski, M. Theory of Mind May Have Spontaneously Emerged in Large Language Models. arXiv 2024, arXiv:2302.02083v6.
4. Chen, Y.; Georgiou, T.T.; Tannenbaum, A. Optimal transport for Gaussian mixture models. IEEE Access 2018, 7, 6269–6278.
5. Delon, J.; Desolneux, A. A Wasserstein-type distance in the space of Gaussian mixture models. SIAM J. Imaging Sci. 2020, 13, 936–972.
6. Gaujac, B.; Feige, I.; Barber, D. Gaussian mixture models with Wasserstein distance. arXiv 2018, arXiv:1806.04465.
7. Kolouri, S.; Rohde, G.K.; Hoffmann, H. Sliced Wasserstein distance for learning Gaussian mixture models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3427–3436.
8. Liu, J.; Wang, P.; Shang, Z. IterDE: An Iterative Knowledge Distillation Framework for Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 4488–4496.
9. Wang, S.; Zhou, X.G. Overview of Topic Detection and Tracking Methods for Microblogs. In Proceedings of the International Conference on Education, Culture, Economic Management and Information Service, Changsha, China, 20–21 June 2020; pp. 242–248.
10. Zong, C.Q.; Xia, R.; Zhang, J.J. Topic Detection and Tracking. Text Data Min. 2021, 2, 201–225.
11. Valdez, D.; Pickett, A.C.; Goodson, P. Topic Modeling: Latent Semantic Analysis for the Social Sciences. Soc. Sci. Q. 2018, 99, 1665–1679.
12. Uthirapathy, S.E.; Sandanam, D. Topic Modelling and Opinion Analysis on Climate Change Twitter Data Using LDA And BERT Model. Procedia Comput. Sci. 2023, 218, 908–917.
13. Jaradat, S.; Matskin, M. On Dynamic Topic Models for Mining Social Media. Emerg. Res. Chall. Oppor. Comput. Soc. Netw. Anal. Min. 2019, 9, 209–230.
14. Zhu, H.M.; Qian, L.; Qin, W. Evolution analysis of online topics based on ‘word-topic’ coupling network. Scientometrics 2022, 127, 3767–3792.
15. Wang, J.M.; Wu, X.D.; Li, L. A framework for semantic connection based topic evolution with DeepWalk. Intell. Data Anal. 2018, 22, 211–237.
16. Virtanen, S. Uncovering dynamic textual topics that explain crime. R. Soc. Open Sci. 2021, 8, 1–14.
17. Du, Y.; Yi, Y.; Li, X. Extracting and tracking hot topics of micro-blogs based on improved Latent Dirichlet Allocation. Eng. Appl. Artif. Intell. 2020, 87, 103279.
18. Liu, W.; Jiang, L.; Wu, Y.S. Topic Detection and Tracking Based on Event Ontology. IEEE Access 2020, 8, 2995776.
19. Peng, X.; Han, C.; Ouyang, F. Topic tracking model for analyzing student-generated posts in SPOC discussion forums. Int. J. Educ. Technol. High. Educ. 2020, 17, 35.
20. Chen, X. Monitoring of Public Opinion on Typhoon Disaster Using Improved Clustering Model Based on Single-Pass Approach. SAGE Open 2023, 13, 1–16.
21. Chen, X.; Qiu, Z.Z. Research on Text Classification Based on WAE Improved Model. Comput. Simul. 2022, 39, 331–336.
22. Petrick, F.; Herold, C.; Petrushkov, P. Document-Level Language Models for Machine Translation. In Proceedings of the Eighth Conference on Machine Translation, Singapore, 6–7 December 2023; pp. 375–391.
23. Zhang, Y.H.; Song, K.H.; Cai, X.R. Multimodal Topic Detection in Social Networks with Graph Fusion. In Proceedings of the International Conference on Web Information Systems and Applications, Kaifeng, China, 24–26 September 2021; pp. 28–38.
24. Sagar, P.; Sanket, S.; Sandip, P. Topic Detection and Tracking in News Articles. In Proceedings of the International Conference on Information and Communication Technology for Intelligent Systems, Ahmedabad, India, 25–26 March 2017; pp. 420–426.
25. Guo, M.; Chen, C.; Hou, C.; Wu, Y.; Yuan, X. FGDGNN: Fine-Grained Dynamic Graph Neural Network for Rumor Detection on Social Media. In Findings of the Association for Computational Linguistics; Association for Computational Linguistics: Vienna, Austria, 2025; pp. 5676–5687.
26. Li, H.; Jiang, L.; Li, J. Continuous-time dynamic graph networks integrated with knowledge propagation for social media rumor detection. Mathematics 2024, 12, 3453.
27. Guan, S.; Cheng, X.; Bai, L.; Zhang, F.; Li, Z.; Zeng, Y.; Jin, X.; Guo, J. What is event knowledge graph: A survey. IEEE Trans. Knowl. Data Eng. 2022, 35, 7569–7589.
28. Mayank, M.; Sharma, S.; Sharma, R. DEAP-FAKED: Knowledge graph based approach for fake news detection. arXiv 2021, arXiv:2107.10648.
29. Qudus, U.; Röder, M.; Saleem, M.; Ngomo, A.C.N. Fact checking knowledge graphs—A survey. ACM Comput. Surv. 2025, 58, 1–36.
30. Kishore, A.; Kumar, G.; Patro, J. Multimodal fact checking with unified visual, textual, and contextual representations. arXiv 2025, arXiv:2508.05097.
31. Liu, Y.P.; Peng, H.; Li, J.X. Event detection and evolution in multi-lingual social streams. Front. Comput. Sci. 2020, 5, 213–227.
32. Sukhwan, J.; Wan, C.Y. An alternative topic model based on Common Interest Authors for topic evolution analysis. J. Informetr. 2020, 14, 101040.
33. Ankita, D.; Himadri, M.; Niladriar, S.D. Text categorization: Past and present. Artif. Intell. Rev. 2021, 54, 1–48.
34. Kuang, G.S.; Guo, Y.; Liu, Y. Bursty Event Detection via Multichannel Feature Alignment. In Proceedings of the ICBDC ‘20: The 5th International Conference on Big Data and Computing, Chengdu, China, 28–30 May 2020; pp. 39–45.
35. Dai, T.J.; Xiao, Y.P.; Liang, X. ICS-SVM: A user retweet prediction method for hot topics based on improved SVM. Digit. Commun. Netw. 2022, 8, 186–193.
36. Meysam, A.; Mohammad, R. Topic Detection and Tracking Techniques on Twitter: A Systematic Review. Complexity 2021, 4, 1–15.
37. Zhu, W.; Tian, A.; Yin, C. Instance-Aware Prompt Tuning for Large Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024; Volume 1, pp. 14285–14304.
38. Zerveas, G.; Rekabsaz, N.; Eickhoff, C. Enhancing the ranking context of dense retrieval through reciprocal nearest neighbors. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 8798–8807.
39. Zhang, X.; Yang, Q.; Xu, D. XuanYuan 2.0: A large Chinese financial chat model with hundreds of billions parameters. arXiv 2023, arXiv:2305.12002.
40. Cho, S.; Jeong, S.; Seo, J. Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker. In Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 960–971.
41. Sun, Z.; Shen, S.; Cao, S. Aligning Large Multimodal Models with Factually Augmented RLHF. In Findings of the Association for Computational Linguistics: ACL 2024; Association for Computational Linguistics: Toronto, ON, Canada, 2024; pp. 13088–13110.
42. Han, S.; Liu, J.C.; Wu, J.Y. Transforming Graphs for Enhanced Attribute Clustering: An Innovative Graph Transformer-Based Method. arXiv 2023, arXiv:2306.11307v3.
43. Ahsan, M.; Kumari, M.; Sharma, T.P. Rumors detection, verification and controlling mechanisms in online social networks: A survey. Online Soc. Netw. Media 2019, 14, 100050.
44. Lian, Y.; Liu, Y.; Dong, X. Strategies for controlling false online information during natural disasters: The case of Typhoon Mangkhut in China. Technol. Soc. 2020, 62, 101265.
45. Dong, Y.; Huo, L.; Perc, M.; Boccaletti, S. Adaptive rumor propagation and activity contagion in higher-order networks. Commun. Phys. 2025, 8, 261.
46. Liu, W.; Liu, J.; Niu, Z. Online social platform competition and rumor control in disaster scenarios: A zero-sum differential game approach with approximate dynamic programming. Complex Intell. Syst. 2025, 11, 461.
Figure 1. Flowchart of the Collaboration between Sustainable Agile Identification and Adaptive Risk Control.
Figure 2. Architecture of GMWAE Enhanced by Large Language Model.
Figure 3. Reconstruction Results of WAE and GMWAE on the Same Original Data.
Figure 4. Event Knowledge Graph Construction.
Figure 5. GATs-Based Model for Rumor Identification and Risk Control.
Figure 6. Daily average emotion value of Zhengzhou flood in 2021.
Figure 7. Emotion Word Distribution Matrix.
Figure 8. Physical Impact Knowledge Graph for the 2021 Zhengzhou Flood.
Figure 9. Societal Response Graph for the 2021 Zhengzhou Flood.
Figure 10. Event Propagation probability over time.
Figure 11. Physical Impact Knowledge Graph for the 2023 Maui wildfire.
Figure 12. Societal Response Graph for the 2023 Maui wildfire.
Figure 13. Misinformation Propagation Risk Trends for Maui Wildfire Outbreak.
Table 1. Overview of Techniques for Misinformation Analysis.

| Approach | Key Studies | Methods/Technologies | Applications |
|---|---|---|---|
| Generative Topic Models | Valdez et al. [11], Uthirapathy et al. [12], Jaradat et al. [13], Zhu et al. [14], Wang et al. [15], Virtanen et al. [16], Du et al. [17], Liu et al. [18], Peng et al. [19], Chen et al. [20,21], Petrick et al. [22] | LDA, BERT, Transformers, WAE | Thematic analysis, rumor detection, topic tracking |
| Graph-Based Models | Zhang et al. [23], Sagar et al. [24], Guo et al. [25], Li et al. [26], Guan et al. [27], Mayank et al. [28], Qudus et al. [29], Kishore et al. [30] | GNNs, EKGs, DEAP-FAKED, MultiCheck | Event detection, multimodal verification, fact-checking |
| Statistical Classification | Liu et al. [31], Sukhwan et al. [32], Ankita et al. [33], Kuang et al. [34], Dai et al. [35], Meysam et al. [36], Zhu et al. [37], Zerveas et al. [38], Zhang et al. [39], Cho et al. [40], Sun et al. [41], Han et al. [42] | GPT, CNN-RNN-LLM, RL, predictive modeling | Rumor labeling, event mining, risk prediction |
Table 2. Comprehensive Evaluation Framework for Event Knowledge Graph.

| Evaluation Dimension | Evaluation Criteria | Formula/Indicator |
|---|---|---|
| Accuracy | Concept and Entity Precision | Precision = TP/(TP + FP) |
| Accuracy | Event Relationship Accuracy | Recall = TP/(TP + FN) |
| Completeness | Event Coverage | Coverage = Number of Entities/Total Entities |
| Completeness | Entity and Attribute Completeness | Attribute Completeness = Defined Attributes/Total Required Attributes |
| Scalability | Data Scalability | Scalability Factor = Processed Data Volume/Initial Data Volume |
| Scalability | Structural Scalability | Structural Complexity = Number of Nodes/Number of Edges |
| Query Efficiency | Response Time | Average Response Time |
| Query Efficiency | Resource Utilization | Resource Utilization Rate = Used Resources/Available Resources |
| Inference Capability | Inference Accuracy | Inference Precision = Successful Inferences/Total Inferences |
| Inference Capability | Inference Diversity | Inference Diversity Index = Inference Types/Total Inference Types |
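The ratio-style indicators in Table 2 reduce to simple count arithmetic; the helper sketch below illustrates this (the counts are hypothetical, chosen only for illustration, not taken from the evaluation):

```python
def precision(tp, fp):
    """Concept and entity precision: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Event relationship accuracy (recall): TP / (TP + FN)."""
    return tp / (tp + fn)

def ratio(part, whole):
    """Generic coverage/completeness/utilization indicator."""
    return part / whole

# Hypothetical counts for a small extracted graph (illustration only).
print(precision(170, 30))   # 0.85  concept and entity precision
print(recall(150, 50))      # 0.75  event relationship accuracy
print(ratio(160, 200))      # 0.8   event coverage
print(ratio(140, 200))      # 0.7   attribute completeness
```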
Table 3. Performance comparison of different models.

| Classification Model | Precision | Recall | F1 Score |
|---|---|---|---|
| LDA + BERT | 0.81 | 0.77 | 0.82 |
| VAE + BERT | 0.86 | 0.84 | 0.88 |
| WAE + BERT | 0.91 | 0.89 | 0.92 |
| GMWAE + BERT | 0.94 | 0.93 | 0.95 |
Table 4. Comprehensive Evaluation Results for Event Knowledge Graph.

| Evaluation Criteria | Evaluation Results for Figure 8 | Evaluation Results for Figure 9 |
|---|---|---|
| Concept and Entity Precision | 0.85 | 0.88 |
| Event Relationship Accuracy | 0.75 | 0.78 |
| Event Coverage | 0.80 | 0.85 |
| Entity and Attribute Completeness | 0.70 | 0.75 |
| Data Scalability | 0.90 | 0.92 |
| Structural Scalability | 0.85 | 0.88 |
| Response Time | 0.80 | 0.83 |
| Resource Utilization | 0.82 | 0.85 |
| Inference Accuracy | 0.88 | 0.91 |
| Inference Diversity | 0.87 | 0.89 |
