MDPI - Publisher of Open Access Journals

27 pages, 1434 KB

Open Access Article

An ML-Based Approach to Leveraging Social Media for Disaster Type Classification and Analysis Across World Regions

by Mohammad Robel Miah, Lija Akter, Ahmed Abdelmoamen Ahmed, Louis Ngamassi and Thiagarajan Ramakrishnan

Computers 2026, 15(1), 16; https://doi.org/10.3390/computers15010016 - 1 Jan 2026

Viewed by 682

Abstract

Over the past decade, the frequency and impact of both natural and human-induced disasters have increased significantly, highlighting the urgent need for effective and timely relief operations. Disaster response requires efficient allocation of resources to the right locations and disaster types in a cost- and time-effective manner. However, during such events, large volumes of unverified and rapidly spreading information—especially on social media—often complicate situational awareness and decision-making. Consequently, extracting actionable insights and accurately classifying disaster-related information from social media platforms has become a critical research challenge. Machine Learning (ML) approaches have shown strong potential for categorizing disaster-related tweets, yet substantial variations in model accuracy persist across disaster types and regional contexts, suggesting that universal models may overlook linguistic and cultural nuances. This paper investigates the categorization and sub-categorization of natural disaster tweets using a labeled dataset of over 32,000 samples. Logistic Regression and Random Forest classifiers were trained and evaluated after comprehensive preprocessing to predict disaster categories and sub-categories. Furthermore, a country-specific prediction framework was implemented to assess how regional and cultural variations influence model performance. The results demonstrate strong overall classification accuracy, while revealing marked differences across countries, emphasizing the importance of context-aware, culturally adaptive ML approaches for reliable disaster information management. Full article

(This article belongs to the Special Issue Advances in Semantic Multimedia and Personalized Digital Content)

19 pages, 7359 KB

Open Access Article

An Aspect-Based Emotion Analysis Approach on Wildfire-Related Geo-Social Media Data—A Case Study of the 2020 California Wildfires

by Christina Zorenböhmer, Shaily Gandhi, Sebastian Schmidt and Bernd Resch

ISPRS Int. J. Geo-Inf. 2025, 14(8), 301; https://doi.org/10.3390/ijgi14080301 - 1 Aug 2025

Cited by 1 | Viewed by 1583

Abstract

Natural disasters like wildfires pose significant threats to communities, which necessitates timely and effective disaster response strategies. While Aspect-based Sentiment Analysis (ABSA) has been widely used to extract sentiment-related information at the sub-sentence level, the corresponding field of Aspect-based Emotion Analysis (ABEA) remains underexplored due to dataset limitations and the increased complexity of emotion classification. In this study, we used EmoGRACE, a fine-tuned BERT-based model for ABEA, which we applied to georeferenced tweets of the 2020 California wildfires. The results for this case study reveal distinct spatio-temporal emotion patterns for wildfire-related aspect terms, with fear and sadness increasing near wildfire perimeters. This study demonstrates the feasibility of tracking emotion dynamics across disaster-affected regions and highlights the potential of ABEA in real-time disaster monitoring. The results suggest that ABEA can provide a nuanced understanding of public sentiment during crises for policymakers. Full article

20 pages, 1496 KB

Open Access Article

Utilizing LLMs and ML Algorithms in Disaster-Related Social Media Content

by Vasileios Linardos, Maria Drakaki and Panagiotis Tzionas

GeoHazards 2025, 6(3), 33; https://doi.org/10.3390/geohazards6030033 - 2 Jul 2025

Cited by 4 | Viewed by 4366

Abstract

In this research, we explore the use of Large Language Models (LLMs) and clustering techniques to automate the structuring and labeling of disaster-related social media content. With a gathered dataset comprising millions of tweets related to various disasters, our approach aims to transform unstructured and unlabeled data into a structured and labeled format that can be readily used for training machine learning algorithms and enhancing disaster response efforts. We leverage LLMs to preprocess and understand the semantic content of the tweets, applying several semantic properties to the data. Subsequently, we apply clustering techniques to identify emerging themes and patterns that may not be captured by predefined categories, with these patterns surfaced through topic extraction of the clusters. We proceed with manual labeling and evaluation of 10,000 examples to evaluate the LLMs’ ability to understand tweet features. Our methodology is applied to real-world data for disaster events, with results directly applicable to actual crisis situations. Full article

19 pages, 2065 KB

Open Access Article

Do Spatial Trajectories of Social Media Users Imply the Credibility of the Users’ Tweets During Earthquake Crisis Management?

by Ayse Giz Gulnerman

Appl. Sci. 2025, 15(12), 6897; https://doi.org/10.3390/app15126897 - 18 Jun 2025

Cited by 1 | Viewed by 1356

Abstract

Earthquakes are sudden-onset disasters requiring rapid, accurate information for effective crisis response. Social media (SM) platforms provide abundant geospatial data but are often unstructured and produced by diverse users, posing challenges in filtering relevant content. Traditional content filtering methods rely on natural language processing (NLP), which underperforms with mixed-language posts or less widely spoken languages. Moreover, these approaches often neglect the spatial proximity of users to the event, a crucial factor in determining relevance during disasters. This study proposes an NLP-free model that assesses the spatial credibility of SM content by analysing users’ spatial trajectories. Using earthquake-related tweets, we developed a machine learning-based classification model that categorises posts as directly relevant, indirectly relevant, or irrelevant. The Random Forest model achieved the highest overall classification accuracy of 89%, while the k-NN model performed best for detecting directly relevant content, with an accuracy of 63%. Although promising overall, the classification accuracy for the directly relevant category indicates room for improvement. Our findings highlight the value of spatial analysis in enhancing the reliability of SM data (SMD) during crisis events. By bypassing textual analysis, this framework supports relevance classification based solely on geospatial behaviour, offering a novel method for evaluating content trustworthiness. This spatial approach can complement existing crisis informatics tools and be extended to other disaster types and event-based applications. Full article

(This article belongs to the Section Earth Sciences)

32 pages, 3367 KB

Open Access Article

Post-Disaster Recovery Assessment Using Sentiment Analysis of English-Language Tweets: A Tenth-Anniversary Case Study of the 2010 Haiti Earthquake

by Diana Contreras, Dimosthenis Antypas, Javier Hervas, Sean Wilkinson, Jose Camacho-Collados, Philippe Garnier and Cécile Cornou

Sustainability 2025, 17(11), 4967; https://doi.org/10.3390/su17114967 - 28 May 2025

Cited by 3 | Viewed by 2223

Abstract

The 2010 Haiti earthquake stands as one of the most catastrophic events in terms of loss of life and destruction. Following an earthquake, there is an urgent demand for information. Regrettably, few studies have tracked the progress of the post-disaster recovery, leaving this phase poorly understood. In previous years, data were exclusively collected through on-site missions, but today, social media (SM) has enhanced earthquake reconnaissance teams’ capacity to collect data beyond the emergency phase. However, text data from SM is unstructured, making it necessary to use natural language processing techniques to extract meaningful information. Sentiment analysis (SA), which classifies people’s opinions into positive, negative, or neutral polarity, is a promising tool for understanding earthquake recovery. In this paper, we conduct SA at the tweet level on data collected around the tenth anniversary of the earthquake, using human expertise to fine-tune automatic classification methods. We conclude that the anniversary date is the best time to collect data. In our sample, 56.3% of the tweets were classified as negative, followed by positive (27.3%), neutral (8.2%), and unrelated (8.1%). We therefore conclude that the assessment of the recovery progress based on data collected from Twitter is negative. The automatic method for SA with the highest accuracy is ‘btweet’. The assessment result must be validated by stakeholders. Full article

(This article belongs to the Section Hazards and Sustainability)

24 pages, 963 KB

Open Access Article

Multihead Average Pseudo-Margin Learning for Disaster Tweet Classification

by Iustin Sîrbu, Robert-Adrian Popovici, Traian Rebedea and Ștefan Trăușan-Matu

Information 2025, 16(6), 434; https://doi.org/10.3390/info16060434 - 24 May 2025

Viewed by 1033

Abstract

During natural disasters, social media platforms, such as X (formerly Twitter), become a valuable source of real-time information, with eyewitnesses and affected individuals posting messages about the produced damage and the victims. Although this information can be used to streamline the intervention process of local authorities and to achieve a better distribution of available resources, manually annotating these messages is often infeasible due to time and cost constraints. To address this challenge, we explore the use of semi-supervised learning, a technique that leverages both labeled and unlabeled data, to enhance neural models for disaster tweet classification. Specifically, we investigate state-of-the-art semi-supervised learning models and focus on co-training, a less-explored approach in recent years. Moreover, we propose a novel hybrid co-training architecture, Multihead Average Pseudo-Margin, which obtains state-of-the-art results on several classification tasks. Our approach extends the advantages of the voting mechanism from Multihead Co-Training by using the Average Pseudo-Margin (APM) score to improve the quality of the pseudo-labels and self-adaptive confidence thresholds for improving imbalanced classification. Our method achieves up to 7.98% accuracy improvement in low-data scenarios and 2.84% improvement when using the entire labeled dataset, reaching 89.55% accuracy on the Humanitarian task and 91.23% on the Informative task. These results demonstrate the potential of our approach in addressing the critical need for automated disaster tweet classification. We made our code publicly available for future research. Full article

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)

20 pages, 7258 KB

Open Access Article

MSBKA: A Multi-Strategy Improved Black-Winged Kite Algorithm for Feature Selection of Natural Disaster Tweets Classification

by Guangyu Mu, Jiaxue Li, Zhanhui Liu, Jiaxiu Dai, Jiayi Qu and Xiurong Li

Biomimetics 2025, 10(1), 41; https://doi.org/10.3390/biomimetics10010041 - 10 Jan 2025

Cited by 11 | Viewed by 2064

Abstract

With the advancement of the Internet, social media platforms have gradually become powerful in spreading crisis-related content. Identifying informative tweets associated with natural disasters is beneficial for the rescue operation. When faced with massive text data, choosing the pivotal features, reducing the calculation expense, and increasing the model classification performance is a significant challenge. Therefore, this study proposes a multi-strategy improved black-winged kite algorithm (MSBKA) for feature selection of natural disaster tweets classification based on the wrapper method’s principle. Firstly, BKA is improved by utilizing the enhanced Circle mapping, integrating the hierarchical reverse learning, and introducing the Nelder–Mead method. Then, MSBKA is combined with the excellent classifier SVM (RBF kernel function) to construct a hybrid model. Finally, the MSBKA-SVM model performs feature selection and tweet classification tasks. The empirical analysis of the data from four natural disasters shows that the proposed model has achieved an accuracy of 0.8822. Compared with GA, PSO, SSA, and BKA, the accuracy is increased by 4.34%, 2.13%, 2.94%, and 6.35%, respectively. This research proves that the MSBKA-SVM model can play a supporting role in reducing disaster risk. Full article

(This article belongs to the Special Issue Advances in Swarm Intelligence Optimization Algorithms and Applications)

23 pages, 4860 KB

Open Access Article

An Enhanced IDBO-CNN-BiLSTM Model for Sentiment Analysis of Natural Disaster Tweets

by Guangyu Mu, Jiaxue Li, Xiurong Li, Chuanzhi Chen, Xiaoqing Ju and Jiaxiu Dai

Biomimetics 2024, 9(9), 533; https://doi.org/10.3390/biomimetics9090533 - 4 Sep 2024

Cited by 16 | Viewed by 3332

Abstract

The Internet’s development has prompted social media to become an essential channel for disseminating disaster-related information. Increasing the accuracy of emotional polarity recognition in tweets is conducive to the government or rescue organizations understanding the public’s demands and responding appropriately. Existing sentiment analysis models have some limitations of applicability. Therefore, this research proposes an IDBO-CNN-BiLSTM model combining the swarm intelligence optimization algorithm and deep learning methods. First, the Dung Beetle Optimization (DBO) algorithm is improved by adopting the Latin hypercube sampling, integrating the Osprey Optimization Algorithm (OOA), and introducing an adaptive Gaussian–Cauchy mixture mutation disturbance. The improved DBO (IDBO) algorithm is then utilized to optimize the Convolutional Neural Network—Bidirectional Long Short-Term Memory (CNN-BiLSTM) model’s hyperparameters. Finally, the IDBO-CNN-BiLSTM model is constructed to classify the emotional tendencies of tweets associated with the Hurricane Harvey event. The empirical analysis indicates that the proposed model achieves an accuracy of 0.8033, outperforming other single and hybrid models. In contrast with the GWO, WOA, and DBO algorithms, the accuracy is enhanced by 2.89%, 2.82%, and 2.72%, respectively. This study proves that the IDBO-CNN-BiLSTM model can be applied to assist emergency decision-making in natural disasters. Full article

(This article belongs to the Special Issue Nature-Inspired Metaheuristic Optimization Algorithms 2024)

15 pages, 936 KB

Open Access Article

Drowning in the Information Flood: Machine-Learning-Based Relevance Classification of Flood-Related Tweets for Disaster Management

by Eike Blomeier, Sebastian Schmidt and Bernd Resch

Information 2024, 15(3), 149; https://doi.org/10.3390/info15030149 - 7 Mar 2024

Cited by 23 | Viewed by 3664

Abstract

In the early stages of a disaster caused by a natural hazard (e.g., flood), the amount of available and useful information is low. To fill this informational gap, emergency responders are increasingly using data from geo-social media to gain insights from eyewitnesses to build a better understanding of the situation and design effective responses. However, filtering relevant content for this purpose poses a challenge. This work thus presents a comparison of different machine learning models (Naïve Bayes, Random Forest, Support Vector Machine, Convolutional Neural Networks, BERT) for semantic relevance classification of flood-related, German-language Tweets. For this, we relied on a four-category training data set created with the help of experts from human aid organisations. We identified fine-tuned BERT as the most suitable model, averaging a precision of 71% with most of the misclassifications occurring across similar classes. We thus demonstrate that our methodology helps in identifying relevant information for more efficient disaster management. Full article
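
The abstract above reports an averaged precision over four relevance categories but does not say which averaging scheme was used. As an illustration only, the following minimal Python sketch computes per-class precision and its unweighted (macro) average; the function names and the toy labels are hypothetical and are not taken from the article.

```python
def per_class_precision(y_true, y_pred):
    """Precision for each label: correct predictions of a label / all predictions of it."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        # True labels of every item the model assigned to this class.
        assigned = [t for t, p in zip(y_true, y_pred) if p == label]
        if not assigned:
            # Assumption: a class that is never predicted scores 0.0.
            scores[label] = 0.0
        else:
            scores[label] = sum(t == label for t in assigned) / len(assigned)
    return scores


def macro_precision(y_true, y_pred):
    """Unweighted mean of the per-class precisions (one possible averaging scheme)."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy example with two hypothetical classes.
y_true = ["relevant", "relevant", "irrelevant", "irrelevant"]
y_pred = ["relevant", "irrelevant", "irrelevant", "irrelevant"]
print(per_class_precision(y_true, y_pred))
print(macro_precision(y_true, y_pred))
```

A weighted average (by class frequency) would give a different number on imbalanced data, so the scheme should be checked against the full article before comparing figures.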

16 pages, 581 KB

Open Access Article

Emotional Health and Climate-Change-Related Stressor Extraction from Social Media: A Case Study Using Hurricane Harvey

by Thanh Bui, Andrea Hannah, Sanjay Madria, Rosemary Nabaweesi, Eugene Levin, Michael Wilson and Long Nguyen

Mathematics 2023, 11(24), 4910; https://doi.org/10.3390/math11244910 - 9 Dec 2023

Cited by 7 | Viewed by 2701

Abstract

Climate change has led to a variety of disasters that have caused damage to infrastructure and the economy with societal impacts to human living. Understanding people’s emotions and stressors during disaster times will enable preparation strategies for mitigating further consequences. In this paper, we mine emotions and stressors encountered by people and shared on Twitter during Hurricane Harvey in 2017 as a showcase. In this work, we acquired a dataset of tweets from Twitter on Hurricane Harvey from 20 August 2017 to 30 August 2017. The dataset consists of around 400,000 tweets and is available on Kaggle. Next, a BERT-based model is employed to predict emotions associated with tweets posted by users. Then, natural language processing (NLP) techniques are utilized on negative-emotion tweets to explore the trends and prevalence of the topics discussed during the disaster event. Using Latent Dirichlet Allocation (LDA) topic modeling, we identified themes, enabling us to manually extract stressors termed as climate-change-related stressors. Results show that 20 climate-change-related stressors were extracted and that emotions peaked during the deadliest phase of the disaster. This indicates that tracking emotions may be a useful approach for studying environmentally determined well-being outcomes in light of understanding climate change impacts. Full article

(This article belongs to the Special Issue Healthcare Data Analytics Using AI)

26 pages, 7795 KB

Open Access Article

Temporal Relationship between Daily Reports of COVID-19 Infections and Related GDELT and Tweet Mentions

by Innocensia Owuor and Hartwig H. Hochmair

Geographies 2023, 3(3), 584-609; https://doi.org/10.3390/geographies3030031 - 16 Sep 2023

Cited by 4 | Viewed by 4172

Abstract

Social media platforms are valuable data sources in the study of public reactions to events such as natural disasters and epidemics. This research assesses for selected countries around the globe the time lag between daily reports of COVID-19 cases and GDELT (Global Database of Events, Language, and Tone) and Twitter (X) COVID-19 mentions between February 2020 and April 2021 using time series analysis. Results show that GDELT articles and tweets preceded COVID-19 infections in Australia, Brazil, France, Greece, India, Italy, the U.S., Canada, Germany, and the U.K., while for Poland and the Philippines, tweets preceded and GDELT articles lagged behind COVID-19 disease incidences, respectively. This shows that the application of social media and news data for surveillance and management of pandemics needs to be assessed on a case-by-case basis for different countries. It also points towards the applicability of time series data analysis for only a limited number of countries due to strict data requirements (e.g., stationarity). A deviation from generally observed lag patterns in a country, i.e., periods with low COVID-19 infections but unusually high numbers of COVID-19-related GDELT articles or tweets, signals an anomaly. We use the seasonal hybrid extreme Studentized deviate test to detect such anomalies. This is followed by text analysis of news headlines from NewsBank and Google on the date of these anomalies to determine the probable event causing an anomaly, which includes elections, holidays, and protests. Full article
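
The abstract above does not specify how the time lag was estimated. One simple way to illustrate the idea is a lagged Pearson correlation between a daily mentions series and a daily case series. The sketch below is a hypothetical, stdlib-only Python example: the function names, the toy series, and the choice of Pearson correlation are assumptions, and it ignores the stationarity requirements the abstract mentions.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: a constant series is treated as uncorrelated.
    return cov / (var_x * var_y) ** 0.5


def best_lag(mentions, cases, max_lag):
    """Return (lag, r) maximising the correlation of mentions[t] with cases[t + lag].

    A positive lag means mentions precede cases by that many days.
    Both series are assumed to have the same length and daily spacing.
    """
    n = len(mentions)
    best = (0, float("-inf"))
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = mentions[: n - lag], cases[lag:]
        else:
            xs, ys = mentions[-lag:], cases[: n + lag]
        if len(xs) < 3:
            continue  # Too few overlapping days to correlate.
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best


# Toy data: the case series is the mentions series delayed by two days.
mentions = [1, 3, 7, 4, 2, 1, 1, 1, 1, 1]
cases = [1, 1, 1, 3, 7, 4, 2, 1, 1, 1]
print(best_lag(mentions, cases, max_lag=3))
```

On real data the series would typically be differenced or otherwise made stationary first, which is one reason the abstract notes that the approach applies to only a limited number of countries.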

14 pages, 975 KB

Open Access Article

A Comprehensive Analysis of Transformer-Deep Neural Network Models in Twitter Disaster Detection

by Vimala Balakrishnan, Zhongliang Shi, Chuan Liang Law, Regine Lim, Lee Leng Teh, Yue Fan and Jeyarani Periasamy

Mathematics 2022, 10(24), 4664; https://doi.org/10.3390/math10244664 - 9 Dec 2022

Cited by 11 | Viewed by 4373

Abstract

Social media platforms such as Twitter are a vital source of information during major events, such as natural disasters. Studies attempting to automatically detect textual communications have mostly focused on machine learning and deep learning algorithms. Recent evidence shows improvement in disaster detection models with the use of contextual word embedding techniques (i.e., transformers) that take the context of a word into consideration, unlike the traditional context-free techniques; however, studies regarding this model are scant. To this end, this paper investigates a selection of ensemble learning models by merging transformers with deep neural network algorithms to assess their performance in detecting informative and non-informative disaster-related Twitter communications. A total of 7613 tweets were used to train and test the models. Results indicate that the ensemble models consistently yield good performance results, with F-score values ranging between 76% and 80%. Simpler transformer variants, such as ELECTRA and Talking-Heads Attention, yielded comparable and superior results compared to the computationally expensive BERT, with F-scores ranging from 80% to 84%, especially when merged with Bi-LSTM. Our findings show that the newer and simpler transformers can be used effectively, with less computational costs, in detecting disaster-related Twitter communications. Full article

(This article belongs to the Special Issue New Insights in Machine Learning and Deep Neural Networks)

23 pages, 4446 KB

Open Access Article

Detecting Natural Hazard-Related Disaster Impacts with Social Media Analytics: The Case of Australian States and Territories

by Tan Yigitcanlar, Massimo Regona, Nayomi Kankanamge, Rashid Mehmood, Justin D’Costa, Samuel Lindsay, Scott Nelson and Adiam Brhane

Sustainability 2022, 14(2), 810; https://doi.org/10.3390/su14020810 - 12 Jan 2022

Cited by 57 | Viewed by 8717

Abstract

Natural hazard-related disasters are disruptive events with significant impact on people, communities, buildings, infrastructure, animals, agriculture, and environmental assets. The exponentially increasing anthropogenic activities on the planet have aggravated climate change and consequently increased the frequency and severity of these natural hazard-related disasters, and the consequential damage in cities. Digital technological advancements in early detection, warning, and disaster response systems, such as monitoring systems based on the fusion of sensors and machine learning, are being implemented as part of disaster management practice in many countries and have produced useful results. Along with these promising technologies, crowdsourced social media disaster big data analytics has also started to be utilized. This study aims to form an understanding of how social media analytics can be utilized to assist government authorities in estimating the damages linked to natural hazard-related disaster impacts on urban centers in the age of climate change. To this end, this study analyzes crowdsourced disaster big data from Twitter users in the testbed case study of Australian states and territories. The methodological approach of this study employs the social media analytics method and conducts sentiment and content analyses of location-based Twitter messages (n = 131,673) from Australia. The study informs authorities on an innovative way to analyze the geographic distribution and occurrence frequency of various disasters and their damages based on the geo-tweets analysis. Full article

(This article belongs to the Special Issue The Adaptability of Cities to Climate Change)

12 pages, 883 KB

Open Access Article

Location Analysis for Arabic COVID-19 Twitter Data Using Enhanced Dialect Identification Models

by Nader Essam, Abdullah M. Moussa, Khaled M. Elsayed, Sherif Abdou, Mohsen Rashwan, Shaheen Khatoon, Md. Maruf Hasan, Amna Asif and Majed A. Alshamari

Appl. Sci. 2021, 11(23), 11328; https://doi.org/10.3390/app112311328 - 30 Nov 2021

Cited by 10 | Viewed by 4393

Abstract

The recent surge of social media networks has provided a channel to gather and publish vital medical and health information. The focal role of these networks has become more prominent in periods of crisis, such as the recent pandemic of COVID-19. These social networks have been the leading platform for broadcasting health news updates, precaution instructions, and governmental procedures. They also provide an effective means for gathering public opinion and tracking breaking events and stories. To achieve location-based analysis for social media input, the location information of the users must be captured. Most of the time, this information is either missing or hidden. For some languages, such as Arabic, the users’ location can be predicted from their dialects. The Arabic language has many local dialects for most Arab countries. Natural Language Processing (NLP) techniques have provided several approaches for dialect identification. The recent advanced language models using contextual-based word representations in the continuous domain, such as BERT models, have provided significant improvement for many NLP applications. In this work, we present our efforts to use BERT-based models to improve the dialect identification of Arabic text. We show the results of the developed models to recognize the source of the Arabic country, or the Arabic region, from Twitter data. Our results show 3.4% absolute enhancement in dialect identification accuracy on the regional level over the state-of-the-art result. When we excluded the Modern Standard Arabic (MSA) set, which is formal Arabic language, we achieved 3% absolute gain in accuracy between the three major Arabic dialects over the state-of-the-art level. Finally, we applied the developed models on a recently collected resource for COVID-19 Arabic tweets to recognize the source country from the users’ tweets. We achieved a weighted average accuracy of 97.36%, which proposes a tool to be used by policymakers to support country-level disaster-related activities. Full article

(This article belongs to the Special Issue Fighting COVID-19: Emerging Techniques and Aid Systems for Prevention, Forecasting and Diagnosis)

18 pages, 5474 KB

Open Access Article

Spatiotemporal Evolution of the Online Social Network after a Natural Disaster

by Shi Shen, Junwang Huang, Changxiu Cheng, Ting Zhang, Nikita Murzintcev and Peichao Gao

ISPRS Int. J. Geo-Inf. 2021, 10(11), 744; https://doi.org/10.3390/ijgi10110744 - 2 Nov 2021

Cited by 6 | Viewed by 3582

Abstract

Social media has been a vital channel for communicating and broadcasting disaster-related information. However, the global spatiotemporal patterns of social media users’ activities, interactions, and connections after a natural disaster remain unclear. Hence, we integrated geocoding, geovisualization, and complex network methods to illustrate and analyze the online social network’s spatiotemporal evolution. Taking the super typhoon Haiyan as a case, we constructed a retweeting network and mapped this network according to the tweets’ location information. The results show that (1) the distribution of in-degree and out-degree follow power-law and retweeting networks are scale-free. (2) A local catastrophe could attract significant global interest but with strong geographical heterogeneity. The super typhoon Haiyan especially attracted attention from the United States, Europe, and Australia, in which users are more active in posting and forwarding disaster-related tweets than other regions (except the Philippines). (3) The users’ interactions and connections are also significantly different between countries and regions. Connections and interactions between the Philippines and the United States, Europe, and Australia were much closer than in other regions. Therefore, the agencies and platforms should also pay attention to other countries and regions outside the disaster area to provide more valuable information for the local people. Full article
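
To make the in-degree and out-degree terminology in the abstract above concrete, here is a minimal Python sketch that builds degree counts and an empirical degree distribution from a retweet edge list. The edge direction (retweeter to original author), the function names, and the toy user names are assumptions for illustration; the article's own network construction may differ.

```python
from collections import Counter


def degree_counts(edges):
    """Count in- and out-degrees from (retweeter, original_author) pairs.

    Assumed convention: an edge points from the user who retweets
    to the user whose tweet is retweeted.
    """
    in_deg, out_deg = Counter(), Counter()
    for retweeter, author in edges:
        out_deg[retweeter] += 1
        in_deg[author] += 1
    return in_deg, out_deg


def degree_distribution(deg):
    """Map each degree k to the fraction of counted nodes with that degree.

    Nodes with degree zero in this direction are not in `deg` and are omitted.
    """
    total = len(deg)
    histogram = Counter(deg.values())
    return {k: count / total for k, count in sorted(histogram.items())}


# Toy retweet network with hypothetical user names.
edges = [("a", "x"), ("b", "x"), ("c", "x"), ("a", "y")]
in_deg, out_deg = degree_counts(edges)
print(degree_distribution(in_deg))
print(degree_distribution(out_deg))
```

A roughly straight line when such a distribution is plotted on log-log axes is consistent with a power law, although confirming one requires a proper statistical fit, which this sketch does not attempt.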

(This article belongs to the Special Issue Geovisualization and Social Media)

Search Results (28)
