Future Internet
  • Article
  • Open Access

23 September 2021

Machine Learning in Detecting COVID-19 Misinformation on Twitter

1 Computer Science and Information Systems Department, The Public Authority for Applied Education and Training, Safat 13147, Kuwait
2 Information Systems and Operations Management Department, Kuwait University, Safat 13055, Kuwait
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Digital and Social Media in the Disinformation Age

Abstract

Social media platforms such as Facebook, Instagram, and Twitter are an inevitable part of our daily lives. These platforms are effective tools for disseminating news, photos, and other types of information. Alongside their convenience, however, they are often used for propagating malicious data or information. This misinformation may misguide users and even have a dangerous impact on society’s culture, economics, and healthcare. The enormous volume of propagated misinformation is difficult to counter, and misinformation related to the COVID-19 pandemic, its treatment, and vaccination may create severe challenges for each country’s frontline workers. It is therefore essential to build effective machine-learning (ML) models for identifying misinformation regarding COVID-19. In this paper, we propose three effective misinformation-detection models: a long short-term memory (LSTM) network, a special type of recurrent neural network (RNN); a multichannel convolutional neural network (MC-CNN); and k-nearest neighbors (KNN). Simulations were conducted to evaluate the performance of the proposed models in terms of various evaluation metrics. The proposed models obtained superior results to those from the literature.

1. Introduction

The rapid growth of the Internet and related technologies has dramatically changed society in all aspects, and the popularity of these technologies increases on a daily basis. The Internet has opened up a new, effective, and powerful global communication medium without barriers of location or time. Social media are interactive, computer-based tools that facilitate sharing knowledge, thoughts, views, opinions, experiences, documents, audio, and video by forming virtual communities and networks.
The growth of the Internet and the ubiquitous use of Web technologies have created a vast amount of data. According to the latest report of the International Telecommunication Union (ITU), 93% of the global population has access to a network, and 70% of the world’s youth use the Internet [1]. This has enabled users to interact on different social media platforms and express their opinions on different issues. The amount of data generated from social media platforms has attracted both researchers and practitioners aiming to extract underlying valuable information. Since the early 1990s, researchers have used such generated text as data in the area of natural language processing (NLP). Sentiment analysis (SA), a subfield of NLP, has gained popularity as a means of understanding the attitude of the writer. SA can classify written text into predefined subjective categories (i.e., positive, negative, or neutral) or measure the strength of a sentiment [2]. Machine learning (ML) adopts algorithms and uses data to build models with minimal human intervention; ML models are also deployed in SA for various objectives [3,4,5].
SA is used in areas as diverse as tourism [6], dialects on social media [7], hotel reviews [8,9], customer reviews [10], politics and campaigns [11,12], mental health [13], stock returns [14], investment [15], climate change [16], real estate [5], movie reviews [17], and product reviews [18]. The importance of SA lies in its powerful ability to yield better insights, achieve a competitive advantage, and reach optimal decisions. For instance, Philander and Zhong [2] used SA to understand public opinion towards hospitality firms. The authors in [19] discussed the influence of SA on physical and online marketing, its implications for product sales, and its influence on pricing strategy. Deng et al. [14] examined the causality between sentiments and stock returns at an hourly level. In the area of climate change, Jost et al. [16] applied SA to measure the level of agreement between climate researchers and policymakers, and suggested collaboration between them. Another practical implication of SA is assisting customers in making purchasing decisions [20]. While the applications of SA are wide-ranging, misleading information is a serious threat in this area.
Misleading information has many definitions, for example, “news articles that are intentionally and verifiably false, and could mislead readers” [21]. It was a concern long before the advent of social media, since it can alter people’s beliefs. For example, in a 1968 letter to the editor, Abbott wrote, “while high-minded individuals debate the potential future misuse of computer-based information systems, major computer users and manufacturers presently continue, thoughtlessly and routinely, to gather personal and misleading data from all applicants for employment” [22]. In another early study, Loftus [23] performed a series of experiments to explore what happens when a person receives misinformation at different points in time. Misleading information can have significant effects on many areas such as consumer decision making [24], elections [25], education [26], and medicine [27].
Tackling online misinformation is a global concern for official agencies. For example, the European Commission (EC), a branch of the European Union (EU), addresses both online disinformation and misinformation to ensure the protection of European values and democratic systems. The EC has developed several initiatives to protect the public from intentional deception and from fake news that is believed to be true, including action plans, codes of practice, and fact checkers [28]. In August 2020, UNESCO published a handbook on fake news and disinformation in media; the handbook has been useful, especially during the COVID-19 pandemic, in minimizing the negative impact on the public’s consciousness and attitudes [29].
While past research has investigated misleading information for various objectives, the performance of models using health misinformation data has not been satisfactory. To bridge this research gap, this work focuses specifically on building better models to detect COVID-19 misinformation on Twitter. This exploratory work uses a general framework in an attempt to propose models that outperform the algorithms implemented in the literature.
As such, this work attempts to answer the following two research questions (RQs). RQ1: Which machine-learning models exist to detect healthcare misinformation on SNS? RQ2: Can we propose machine-learning models that effectively outperform the existing ones identified in RQ1?
To address the two research questions above, an extensive literature review was conducted, followed by building ML models to bridge the research gap in detecting COVID-19 misinformation on Twitter.

3. Methods

This section briefly covers the dataset, the performance metrics for the machine-learning models, and the proposed models used in this research.

3.1. CoAID Dataset

Cui and Lee [78] proposed a diverse dataset of healthcare misinformation spread through social media and online platforms, the Covid-19 heAlthcare mIsinformation Dataset (CoAID). They collected various instances of COVID-19 healthcare misinformation, labeled them, and made them publicly available. CoAID contains fake or misleading news from various websites and social media platforms, together with labeled user engagements related to such fake news; it thus supplies annotated news, claims, and their related tweet replies. Compared to other available datasets, CoAID is distinctive in that it includes both fake and valid news and claims in addition to user engagements on social media platforms, and it provides a sufficiently large, properly classified set of user engagements on Twitter. Hence, CoAID was selected for this work.
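For orientation, the sketch below shows one way to assemble a labeled tweet corpus from the public CoAID release using pandas. The file names and column layout are assumptions and may differ from the actual repository; note also that, to comply with Twitter’s terms of service, CoAID distributes tweet IDs rather than tweet text, so the text must be retrieved (hydrated) through the Twitter API before modeling.

```python
# Sketch only: assembling a labeled corpus from CoAID-style CSVs.
# File names and columns are assumptions and may differ from the
# actual repository layout.
import pandas as pd

fake = pd.read_csv("CoAID/NewsFakeCOVID-19_tweets.csv")  # engagements with fake news
real = pd.read_csv("CoAID/NewsRealCOVID-19_tweets.csv")  # engagements with real news

fake["label"] = 1  # misinformation
real["label"] = 0  # valid information

tweets = pd.concat([fake, real], ignore_index=True)
print(tweets["label"].value_counts())  # exposes the class imbalance
```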

3.2. Performance Metrics

Performance or evaluation metrics play a vital role in identifying an optimal classification model: they quantify the performance of a model while training the classifier, so selecting suitable metrics is important. Many evaluation metrics are available, and they must be chosen especially carefully when classifying imbalanced data [89]. In a two-way classification scenario, the majority of the instances in an imbalanced dataset belong to one class and the minority to the other, so a classifier may be biased towards the majority category and must be evaluated more carefully. Therefore, accuracy, precision, recall, F measure, and PR-AUC are used here for evaluating classifier performance.
A confusion matrix is a way to visualize classifier performance, and most evaluation metrics are based on the counts of correctly and incorrectly classified documents that it records. In a confusion matrix, each row represents the predicted category, whereas each column represents the actual category. Comparing the actual values with the predicted ones yields four counts: TP, TN, FP, and FN.
  • TP or true positive: classifier correctly predicted the observation as positive.
  • TN or true negative: classifier correctly predicted the observation as negative.
  • FP or false positive: classifier wrongly classified the observation as positive, but it is actually negative.
  • FN or false negative: classifier wrongly classified the observation as negative, but it is actually positive.
The confusion matrix underlies other evaluation metrics such as precision, recall, and accuracy. The various evaluation metrics are displayed in Table 1.
Table 1. Performance evaluation metrics.
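To make the relationship between these counts and the metrics concrete, the following minimal Python sketch derives accuracy, precision, recall, and F measure from the four confusion-matrix counts. The example counts are illustrative only; they show how accuracy can look high on an imbalanced corpus even when the minority (positive) class is poorly detected.

```python
# Minimal sketch: deriving the Table 1 metrics from the four
# confusion-matrix counts of a binary classifier.
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical counts for a heavily imbalanced corpus: accuracy is
# 0.91 even though recall on the minority class is only about 0.11.
print(evaluate(tp=10, tn=900, fp=5, fn=85))
```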
Area under the precision–recall curve (PR-AUC): The precision–recall curve is a performance evaluation tool similar to the ROC curve, but it is especially informative when the supplied data are heavily imbalanced. PR-AUC summarizes the precision–recall curve in a single value: a value near 0 indicates a poor classifier, while a value near 1 indicates an excellent one.
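As an illustration, the sketch below computes PR-AUC with scikit-learn; the labels and scores are toy values, and summarizing the curve with `precision_recall_curve` plus trapezoidal `auc` is one common convention, assumed here for concreteness.

```python
# Sketch: summarizing the precision-recall curve as PR-AUC,
# assuming y_score holds positive-class probabilities.
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 0])
y_score = np.array([0.1, 0.3, 0.2, 0.4, 0.8, 0.7, 0.5, 0.9, 0.2, 0.1])

precision, recall, _ = precision_recall_curve(y_true, y_score)
pr_auc = auc(recall, precision)  # a value near 1 indicates an excellent classifier
print(f"PR-AUC = {pr_auc:.3f}")
```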

3.3. Framework of Proposed Models

It is very challenging to find the most feasible solution for detecting misinformation on social media platforms such as Twitter, where misinformation or fake news is transmitted in the form of tweets. We propose efficient models for detecting such misinformation or fake news on Twitter; this section explains the proposed models. Figure 1 [90] shows the general architecture applied to the proposed models.
Figure 1. General framework of proposed models.
1. Data cleaning and preprocessing: This step eliminates unwanted or irrelevant data (noise) from the supplied dataset in order to produce a clean, understandable corpus and improve data accuracy. It involves the removal of unwanted symbols such as punctuation, special characters, URLs, hashtags, www, HTTPS, and digits. After cleaning, the data are preprocessed; typical preprocessing covers stop-word removal, stemming, and lemmatization, but here only the stop words were removed. (Steps 1–4 are illustrated in the sketch following this list.)
2. Feature extraction: After data cleaning and preprocessing, features must be extracted from the text documents. Many feature types exist, but the most important and commonly used are words. In this step, the extracted features are converted into a vector representation; here, term frequency–inverse document frequency (TF-IDF) was selected for converting text features into the corresponding word-vector representation. The generated vectors may be high-dimensional.
3. Feature selection and dimensionality reduction: Dimensionality reduction is important in text classification because it improves the performance of the proposed models. It reduces the number of features used to represent documents by selecting the most essential ones, those capturing the essence of a document, which makes feature selection central to it. Of the various feature-selection and dimensionality-reduction techniques, singular value decomposition (SVD), one of the most effective, was implemented here. After dimensionality reduction, the entire corpus is divided into training and test sets.
4. Sampling the training set: Sampling is mainly performed on an imbalanced corpus to rebalance class distributions. There are two main types: over- and undersampling. Oversampling duplicates or generates new data in the minority class to balance the corpus, whereas undersampling deletes or merges data in the majority class. Oversampling is more effective, since undersampling may delete relevant examples from the majority class. Here, oversampling was performed to rebalance the training corpus.
5. Training: The proposed model is trained using the training corpus.
6. Performance evaluation: The performance of each model is evaluated using different evaluation metrics, namely accuracy, precision, recall, F measure, and PR-AUC.
7. Hyperparameter optimization: Hyperparameters are significant because they directly shape the characteristics of the proposed model and can even control the performance of the model being trained. Hyperparameters are therefore tuned to improve the effectiveness of the proposed model.

4. Results and Discussion

The proposed models were simulated in the Python programming language on the CoAID misinformation dataset, using the Keras framework. The CoAID dataset contains fake and real claims and news, their corresponding tweets, and the replies to those tweets, all properly classified into fake and real categories, which is why it was selected for simulating our proposed models. The collected tweets from the CoAID dataset were preprocessed and split into training and test corpora, with the training corpus containing 80% of all extracted tweets. The misinformation-detection models were implemented in Google Colab and trained on the training corpus; their performance was then tested and evaluated using various metrics. Table 2 shows the results of the performance evaluation of the proposed models in terms of the various performance evaluation metrics.
Table 2. Performance evaluation metrics of proposed methods.
Most of the available fake-news detection corpora are imbalanced. In order to eliminate the biasing effects of an imbalanced corpus, a sampling method is used; in this simulation, random oversampling was employed to effectively handle the challenges of imbalanced data. Models were simulated with and without sampling: without sampling, most instances in the corpus were classified into the majority class. The simulation results of the proposed models (LSTM, MC-CNN, KNN) were compared with models already implemented in the literature [78]; the comparisons are plotted separately for precision, recall, F measure, and PR-AUC in Figure 2, Figure 3, Figure 4 and Figure 5, respectively. The results show that the misinformation-detection models proposed in this study were far more effective than all the other models in the literature. Figure 6 summarizes a comparison between the three proposed models with respect to the performance metrics. The performance of the three proposed models was within a close range; however, KNN (K = 3) performed relatively better than LSTM and MC-CNN.
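To illustrate how a proposed model is trained and scored under this setup, the sketch below fits the KNN classifier with K = 3 (the best-performing configuration above) on the corpus prepared by the Section 3.3 sketch and reports the Section 3.2 metrics. Apart from K = 3, all settings are illustrative assumptions, not the exact configuration used in our simulations, and `average_precision_score` is one common way to summarize PR-AUC.

```python
# Sketch: training the KNN model (K = 3) and scoring it with the
# Section 3.2 metrics; X_tr, X_te, y_tr, y_te come from the
# prepare() sketch in Section 3.3.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, average_precision_score)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_tr, y_tr)                       # oversampled training corpus
y_pred = knn.predict(X_te)
y_score = knn.predict_proba(X_te)[:, 1]   # positive-class probability

print("accuracy :", accuracy_score(y_te, y_pred))
print("precision:", precision_score(y_te, y_pred))
print("recall   :", recall_score(y_te, y_pred))
print("F measure:", f1_score(y_te, y_pred))
print("PR-AUC   :", average_precision_score(y_te, y_score))
```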
Figure 2. Precision comparison: proposed models vs. models in the literature [78].
Figure 3. Recall comparison: proposed models vs. models in the literature [78].
Figure 4. F measure comparison: proposed models vs. models in the literature [78].
Figure 5. PR-AUC comparison: proposed models vs. models in the literature [78].
Figure 6. Comparison between proposed models with respect to all performance evaluation metrics.

5. Conclusions

In an attempt to answer the first research question, a literature review was conducted to investigate existing ML models for detecting healthcare misinformation on SNS. As demonstrated in previous sections, a research gap exists in this area: the performance of previous ML models was unsatisfactory. To bridge the gap, this study proposed a framework for detecting COVID-19 misinformation spread through social media platforms, especially Twitter. In an attempt to answer the second research question, whether the proposed ML models outperform existing ones, three ML models were implemented. Their performance was evaluated using various metrics, namely precision, recall, F measure, and PR-AUC. The models were simulated using the CoAID misinformation dataset, a healthcare misinformation corpus, and sampling was employed to avoid a biasing effect. Our proposed models classified COVID-19-related misinformation on Twitter more accurately and effectively than existing models, and they are well suited to misinformation detection in both balanced and imbalanced corpora.
This research has important practical implications. Misinformation is a significant problem on social media, especially when it is health-related: trusted sources of health information can be a matter of life and death, as in the case of COVID-19. This work therefore introduces misinformation-detection models with greater accuracy than others proposed in the literature [78]. Social media platforms could consider our approach to improve the quality of shared online content.
Several limitations exist in the current research. First, the provided evidence is restricted to one dataset and needs to be tested on other datasets related to different areas. Next, it is important to test the generalizability of our results by using misinformation from other social media platforms. Lastly, our model was purely algorithmic, with no evidence of external validity. Future research should consider addressing the limitations of this study. This work is a starting point to further improve misinformation-detection algorithms on social media networks.

Author Contributions

Conceptualization, M.N.A.; methodology, M.N.A.; software, M.N.A.; validation, M.N.A. and Z.M.A.; formal analysis, M.N.A. and Z.M.A.; investigation, M.N.A. and Z.M.A.; resources, M.N.A. and Z.M.A.; data curation, M.N.A.; writing—original draft preparation, M.N.A. and Z.M.A.; writing—review and editing, Z.M.A. and M.N.A.; visualization, M.N.A.; supervision, M.N.A. and Z.M.A.; project administration, M.N.A. and Z.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. ITU—Facts and Figures 2020—Interactive Report. Available online: https://www.itu.int/en/ITU-D/Statistics/Pages/ff2020interactive.aspx (accessed on 20 March 2021).
  2. Philander, K.; Zhong, Y. Twitter sentiment analysis: Capturing sentiment from integrated resort tweets. Int. J. Hosp. Manag. 2016, 55, 16–24. [Google Scholar] [CrossRef]
  3. Durier, F.; Vieira, R.; Garcia, A.C. Can Machines Learn to Detect Fake News? A Survey Focused on Social Media. In Proceedings of the 52nd Hawaii International Conference on System Sciences, HICSS, Grand Wailea, Maui, HI, USA, 8–11 January 2019. [Google Scholar]
  4. Thota, A.; Tilak, P.; Ahluwalia, S.; Lohia, N. Fake News Detection: A Deep Learning Approach. SMU Data Sci. Rev. 2018, 1, 10. [Google Scholar]
  5. Sun, D.; Du, Y.; Xu, W.; Zuo, M.Y.; Zhang, C.; Zhou, J. Combining Online News Articles and Web Search to Predict the Fluctuation of Real Estate Market in Big Data Context. Pac. Asia J. Assoc. Inf. Syst. 2015, 6, 2. [Google Scholar] [CrossRef]
  6. Alaei, A.R.; Becken, S.; Stantic, B. Sentiment Analysis in Tourism: Capitalizing on Big Data. J. Travel Res. 2019, 58, 175–191. [Google Scholar] [CrossRef]
  7. Alnawas, A.; Arici, N. Sentiment Analysis of Iraqi Arabic Dialect on Facebook Based on Distributed Representations of Documents. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2019, 18. [Google Scholar] [CrossRef]
  8. Al-Smadi, M.; Qawasmeh, O.; Al-Ayyoub, M.; Jararweh, Y.; Gupta, B. Deep Recurrent neural network vs. support vector machine for aspect-based sentiment analysis of Arabic hotels’ reviews. J. Comput. Sci. 2018, 27, 386–393. [Google Scholar] [CrossRef]
  9. Hwang, S.Y.; Lai, C.; Jiang, J.J.; Chang, S. The Identification of Noteworthy Hotel Reviews for Hotel Management. Pac. Asia J. Assoc. Inf. Syst. 2014, 6. [Google Scholar] [CrossRef]
  10. Binder, M.; Heinrich, B.; Klier, M.; Obermeier, A.; Schiller, A. Explaining the Stars: Aspect-based Sentiment Analysis of Online Customer Reviews. In Proceedings of the 27th European Conference on Information Systems—Information Systems for a Sharing Society, ECIS, Stockholm and Uppsala, Sweden, 8–14 June 2019. [Google Scholar]
  11. Ceron, A.; Curini, L.; Iacus, S.M. Using Sentiment Analysis to Monitor Electoral Campaigns: Method Matters—Evidence From the United States and Italy. Soc. Sci. Comput. Rev. 2015, 33, 3–20. [Google Scholar] [CrossRef]
  12. Sandoval-Almazan, R.; Valle-Cruz, D. Sentiment Analysis of Facebook Users Reacting to Political Campaign Posts. Digit. Gov. Res. Pract. 2020, 1. [Google Scholar] [CrossRef]
  13. Davcheva, E. Text Mining Mental Health Forums-Learning From User Experiences. In Proceedings of the ECIS 2018, Portsmouth, UK, 23–28 June 2018. [Google Scholar]
  14. Deng, S.; Huang, Z.J.; Sinha, A.P.; Zhao, H. The Interaction between Microblog Sentiment and Stock Returns: An Empirical Examination. MIS Q. 2018, 42, 895–918. [Google Scholar] [CrossRef]
  15. Deng, S.; Kwak, D.H.; Wu, J.; Sinha, A.; Zhao, H. Classifying Investor Sentiment in Microblogs: A Transfer Learning Approach. In Proceedings of the International Conference on Information Systems (ICIS 2018), San Francisco, CA, USA, 13–16 December 2018. [Google Scholar]
  16. Jost, F.; Dale, A.; Schwebel, S. How positive is “change” in climate change? A sentiment analysis. Environ. Sci. Policy 2019, 96, 27–36. [Google Scholar] [CrossRef]
  17. Wollmer, M.; Weninger, F.; Knaup, T.; Schuller, B.; Sun, C.; Sagae, K.; Morency, L. YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context. IEEE Intell. Syst. 2013, 28, 46–53. [Google Scholar] [CrossRef]
  18. Yan, Z.; Xing, M.; Zhang, D.; Ma, B.; Wang, T. A Context-Dependent Sentiment Analysis of Online Product Reviews based on Dependency Relationships. In Proceedings of the 35th International Conference on Information Systems: Building a Better World Through Information Systems, ICIS, Auckland, New Zealand, 14–17 December 2014. [Google Scholar]
  19. Srivastava, D.P.; Anand, O.; Rakshit, A. Assessment, Implication, and Analysis of Online Consumer Reviews: A Literature Review. Pac. Asia J. Assoc. Inf. Syst. 2017, 9, 43–73. [Google Scholar] [CrossRef]
  20. Lak, P.; Turetken, O. The Impact of Sentiment Analysis Output on Decision Outcomes: An Empirical Evaluation. AIS Trans. Hum. Comput. Interact. 2017, 9, 1–22. [Google Scholar] [CrossRef] [Green Version]
  21. Moravec, P.L.; Kim, A.; Dennis, A.R. Flagging fake news: System 1 vs. System 2. In Proceedings of the International Conference on Information Systems (ICIS 2018), San Francisco, CA, USA, 13–16 December 2018; pp. 1–17. [Google Scholar]
  22. Abbott, R.J. Letters to the Editor: Gathering of Misleading Data with Little Regard for Privacy. Commun. ACM 1968, 11, 377–378. [Google Scholar] [CrossRef]
  23. Loftus, E. Reacting to blatantly contradictory information. Mem. Cogn. 1979, 7, 368–374. [Google Scholar] [CrossRef] [Green Version]
  24. Wessel, M.; Thies, F.; Benlian, A. A Lie Never Lives to be Old: The Effects of Fake Social Information on Consumer Decision-Making in Crowdfunding. In Proceedings of the European Conference on Information Systems, Münster, Germany, 26–29 May 2015. [Google Scholar]
  25. Allcott, H.; Gentzkow, M. Social Media and Fake News in the 2016 Election. J. Econ. Perspect. 2017, 31, 211–236. [Google Scholar] [CrossRef] [Green Version]
  26. Rosenberg, S.A.; Elbaum, B.; Rosenberg, C.R.; Kellar-Guenther, Y.; McManus, B.M. From Flawed Design to Misleading Information: The U.S. Department of Education’s Early Intervention Child Outcomes Evaluation. Am. J. Eval. 2018, 39, 350–363. [Google Scholar] [CrossRef]
  27. Bianchini, C.; Truccolo, I.; Bidoli, E.; Mazzocut, M. Avoiding misleading information: A study of complementary medicine online information for cancer patients. Libr. Inf. Sci. Res. 2019, 41, 67–77. [Google Scholar] [CrossRef]
  28. European Commission. Tackling Online Disinformation. Available online: https://digital-strategy.ec.europa.eu/en/policies/online-disinformation (accessed on 9 May 2021).
  29. UNESCO. Fake News: Disinformation in Media. Available online: https://en.unesco.org/news/unesco-published-handbook-fake-news-and-disinformation-media (accessed on 9 May 2021).
  30. Hou, R.; Pérez-Rosas, V.; Loeb, S.; Mihalcea, R. Towards Automatic Detection of Misinformation in Online Medical Videos. In Proceedings of the 2019 International Conference on Multimodal Interaction, Suzhou, China, 14–18 October 2019; pp. 235–243. [Google Scholar] [CrossRef] [Green Version]
  31. Bautista, J.R.; Zhang, Y.; Gwizdka, J. Healthcare professionals’ acts of correcting health misinformation on social media. Int. J. Med. Inform. 2021, 148, 104375. [Google Scholar] [CrossRef]
  32. Suarez-Lledo, V.; Alvarez-Galvez, J. Prevalence of Health Misinformation on Social Media: Systematic Review. J. Med. Internet Res. 2021, 23, e17187. [Google Scholar] [CrossRef] [PubMed]
  33. Van Bavel, J.; Boggio, P.; Capraro, V.; Cichocka, A.; Cikara, M.; Crockett, M.; Crum, A.; Douglas, K.; Druckman, J.; Drury, J.; et al. Using social and behavioural science to support COVID-19 pandemic response. Nat. Hum. Behav. 2020, 460–471. [Google Scholar] [CrossRef]
  34. Venkatesan, S.; Han, W.; Kisekka, V.; Sharman, R.; Kudumula, V.; Jaswal, H.S. Misinformation in Online Health Communities. In Proceedings of the Eighth Pre-ICIS Workshop on Information Security and Privacy, Milano, Italy, 14 December 2013. [Google Scholar]
  35. Chou, W.S.; Oh, A.; Klein, W.M.P. The Persistence and Peril of Misinformation. Am. Sci. 2017, 372. [Google Scholar] [CrossRef]
  36. Li, Y.J.; Cheung, C.M.; Shen, X.L.; Lee, M.K. Health Misinformation on Social Media: A Literature Review. In Proceedings of the 23rd Pacific Asia Conference on Information Systems: Secure ICT Platform for the 4th Industrial Revolution, PACIS, Xi’an, China, 8–12 July 2019; Volume 194. [Google Scholar]
  37. Coronavirus DISEASE (COVID-19) Pandemic. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019 (accessed on 20 March 2021).
  38. Munich Security Conference. Available online: https://www.who.int/director-general/speeches/detail/munich-security-conference (accessed on 20 February 2021).
  39. Fell, L. Trust and COVID-19: Implications for Interpersonal, Workplace, Institutional, and Information-Based Trust. Digit. Gov. Res. Pract. 2020, 2. [Google Scholar] [CrossRef]
  40. Bode, L.; Vraga, E. See Something, Say Something: Correction of Global Health Misinformation on Social Media. Health Commun. 2017, 33, 1–10. [Google Scholar] [CrossRef] [PubMed]
  41. Gu, R.; Li, M.X. Investigating the Psychological Mechanism of Individuals’ Health Misinformation Dissemination on Social Media. Available online: https://scholars.hkbu.edu.hk/en/publications/investigating-the-psychological-mechanism-of-individuals-health-m (accessed on 19 August 2021).
  42. Swire-Thompson, B.; Lazer, D. Public Health and Online Misinformation: Challenges and Recommendations. Annu. Rev. Public Health 2020, 41, 433–451. [Google Scholar] [CrossRef] [Green Version]
  43. Ghenai, A. Health Misinformation in Search and Social Media. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’17), Tokyo, Japan, 7–11 August 2017; Association for Computing Machinery: New York, NY, USA, 2017; p. 1371. [Google Scholar] [CrossRef] [Green Version]
  44. Khan, T.; Michalas, A.; Akhunzada, A. Fake news outbreak 2021: Can we stop the viral spread? J. Netw. Comput. Appl. 2021, 190, 103112. [Google Scholar] [CrossRef]
  45. Apuke, O.D.; Omar, B. Social media affordances and information abundance: Enabling fake news sharing during the COVID-19 health crisis. Health Inform. J. 2021, 27, 14604582211021470. [Google Scholar] [CrossRef]
  46. Southwell, B.; Niederdeppe, J.; Cappella, J.; Gaysynsky, A.; Kelley, D.; Oh, A.; Peterson, E.; Chou, W.Y. Misinformation as a Misunderstood Challenge to Public Health. Am. J. Prev. Med. 2019, 57. [Google Scholar] [CrossRef]
  47. Tasnim, S.; Hossain, M.M.; Mazumder, H. Impact of Rumors and Misinformation on COVID-19 in Social Media. J. Prev. Med. Public Health 2020, 53, 171–174. [Google Scholar] [CrossRef] [Green Version]
  48. Vraga, E.; Bode, L. Addressing COVID-19 Misinformation on Social Media Preemptively and Responsively. Emerg. Infect. Dis. 2021, 27. [Google Scholar] [CrossRef]
  49. Zhou, X.; Zafarani, R. A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities. ACM Comput. Surv. 2020, 53. [Google Scholar] [CrossRef]
  50. Obiala, J.; Obiala, K.; Mańczak, M.; Owoc, J.; Olszewski, R. COVID-19 misinformation: Accuracy of articles about coronavirus prevention mostly shared on social media. Health Policy Technol. 2021, 10, 182–186. [Google Scholar] [CrossRef]
  51. Apuke, O.D.; Omar, B. Fake news and COVID-19: Modelling the predictors of fake news sharing among social media users. Telemat. Inform. 2021, 56, 101475. [Google Scholar] [CrossRef]
  52. Jonathan, G.M. Exploring Social Media Use during a Public Health Emergency in Africa: The COVID-19 Pandemic. Available online: https://www.researchgate.net/publication/345877480_Exploring_Social_Media_Use_During_a_Public_Health_Emergency_in_Africa_The_COVID-19_Pandemic (accessed on 19 August 2021).
  53. Islam, A.N.; Laato, S.; Talukder, S.; Sutinen, E. Misinformation sharing and social media fatigue during COVID-19: An affordance and cognitive load perspective. Technol. Forecast. Soc. Chang. 2020, 159, 120201. [Google Scholar] [CrossRef] [PubMed]
  54. Bastani, P.; Bahrami, M. COVID-19 Related Misinformation on Social Media: A Qualitative Study from Iran (Preprint). J. Med. Internet Res. 2020. [Google Scholar] [CrossRef] [PubMed]
  55. Chakraborty, K.; Bhatia, S.; Bhattacharyya, S.; Platos, J.; Bag, R.; Hassanien, A.E. Sentiment Analysis of COVID-19 tweets by Deep Learning Classifiers—A study to show how popularity is affecting accuracy in social media. Appl. Soft Comput. 2020, 97, 106754. [Google Scholar] [CrossRef]
  56. Pennycook, G.; McPhetres, J.; Zhang, Y.; Lu, J.G.; Rand, D.G. Fighting COVID-19 Misinformation on Social Media: Experimental Evidence for a Scalable Accuracy-Nudge Intervention. Psychol. Sci. 2020, 31, 770–780. [Google Scholar] [CrossRef] [PubMed]
  57. Mejova, Y.; Kalimeri, K. COVID-19 on Facebook Ads: Competing Agendas around a Public Health Crisis. In Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies, Guayaquil, Ecuador, 15–17 June 2020; pp. 22–31. [Google Scholar] [CrossRef]
  58. Dimitrov, D.; Baran, E.; Fafalios, P.; Yu, R.; Zhu, X.; Zloch, M.; Dietze, S. TweetsCOV19—A Knowledge Base of Semantically Annotated Tweets about the COVID-19 Pandemic. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM’20), Online, 19–23 October 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 2991–2998. [Google Scholar] [CrossRef]
  59. Aphiwongsophon, S.; Chongstitvatana, P. Identifying misinformation on Twitter with a support vector machine. Eng. Appl. Sci. Res. 2020, 47, 306–312. [Google Scholar] [CrossRef]
  60. Deokate, S.B. Fake News Detection using Support Vector Machine learning Algorithm. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 2019. Available online: https://www.researchgate.net/publication/336465014_Fake_News_Detection_using_Support_Vector_Machine_learning_Algorithm (accessed on 19 August 2021).
  61. Ciprian-Gabriel, C.; Coca, G.; Iftene, A. Identifying Fake News on Twitter Using Naïve Bayes, Svm And Random Forest Distributed Algorithms. In Proceedings of the 13th Edition of the International Conference on Linguistic Resources and Tools for Processing Romanian Language (ConsILR-2018), Bucharest, Romania, 22–23 November 2018. [Google Scholar]
  62. Shmueli, G.; Bruce, P.C.; Gedeck, P.; Patel, N.R. Data Mining for Business Analytics: Concepts, Techniques and Applications in Python; John Wiley & Sons: Hoboken, NJ, USA, 2019. [Google Scholar]
  63. Kolbe, D.; Zhu, Q.; Pramanik, S. Efficient k-nearest neighbor searching in nonordered discrete data spaces. ACM Trans. Inf. Syst. (TOIS) 2010, 28, 1–33. [Google Scholar] [CrossRef]
  64. Cunningham, P.; Delany, S.J. k-Nearest Neighbour Classifiers-A Tutorial. ACM Comput. Surv. (CSUR) 2021, 54, 1–25. [Google Scholar] [CrossRef]
  65. Ali, M.; Jung, L.T.; Abdel-Aty, A.H.; Abubakar, M.Y.; Elhoseny, M.; Ali, I. Semantic-k-NN algorithm: An enhanced version of traditional k-NN algorithm. Expert Syst. Appl. 2020, 151, 113374. [Google Scholar] [CrossRef]
  66. Mokhtar, M.S.; Jusoh, Y.Y.; Admodisastro, N.; Pa, N.C.; Amruddin, A.Y. Fakebuster: Fake News Detection System Using Logistic Regression Technique In Machine Learning. Int. J. Eng. Adv. Technol. (IJEAT) 2019, 9, 2407–2410. [Google Scholar]
  67. Ogdol, J.M.G.; Samar, B.L.T.; Catarroja, C. Binary Logistic Regression based Classifier for Fake News. J. High. Educ. Res. Discip. 2018. Available online: http://www.nmsc.edu.ph/ojs/index.php/jherd/article/view/98 (accessed on 19 August 2021).
  68. Nada, F.; Khan, B.F.; Maryam, A.; Nooruz-Zuha; Ahmed, Z. Fake News Detection using Binary Logistic Regression. Int. Res. J. Eng. Technol. (IRJET) 2019, 8, 1705–1711. [Google Scholar]
  69. Bharti, P.; Bakshi, M.; Uthra, R. Fake News Detection Using Logistic Regression, Sentiment Analysis and Web Scraping. Int. J. Adv. Sci. Technol. 2020, 29, 1157–1167. [Google Scholar]
  70. Ghosh, A.; Sufian, A.; Sultana, F.; Chakrabarti, A.; De, D. Fundamental concepts of convolutional neural network. In Recent Trends and Advances in Artificial Intelligence and Internet of Things; Springer: Berlin/Heidelberg, Germany, 2020; pp. 519–567. [Google Scholar]
  71. Bai, L.; Yao, L.; Wang, X.; Kanhere, S.S.; Guo, B.; Yu, Z. Adversarial multi-view networks for activity recognition. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2020, 4, 1–22. [Google Scholar] [CrossRef]
  72. Chen, J.; Yang, Y.t.; Hu, K.k.; Zheng, H.b.; Wang, Z. DAD-MCNN: DDoS attack detection via multi-channel CNN. In Proceedings of the 2019 11th International Conference on Machine Learning and Computing, Zhuhai, China, 22–24 February 2019; pp. 484–488. [Google Scholar]
  73. Bahdanau, D.; Cho, K.; Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv 2015, arXiv:1409.0473. [Google Scholar]
  74. Ruchansky, N.; Seo, S.; Liu, Y. CSI: A Hybrid Deep Model for Fake News Detection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM’17), Singapore, 6–10 November 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 797–806. [Google Scholar] [CrossRef] [Green Version]
  75. Cui, L.; Wang, S.; Lee, D. SAME: Sentiment-Aware Multi-Modal Embedding for Detecting Fake News. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’19), Vancouver, BC, Canada, 27–30 August 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 41–48. [Google Scholar] [CrossRef]
  76. Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical Attention Networks for Document Classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016. [Google Scholar]
  77. Cui, L.; Shu, K.; Wang, S.; Lee, D.; Liu, H. DEFEND: A System for Explainable Fake News Detection. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM ’19), Beijing, China, 3–7 November 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 2961–2964. [Google Scholar] [CrossRef]
  78. Cui, L.; Lee, D. CoAID: COVID-19 Healthcare Misinformation Dataset. arXiv 2020, arXiv:2006.00885. [Google Scholar]
  79. Jamison, A.; Broniatowski, D.A.; Smith, M.C.; Parikh, K.S.; Malik, A.; Dredze, M.; Quinn, S.C. Adapting and extending a typology to identify vaccine misinformation on twitter. Am. J. Public Health 2020, 110, S331–S339. [Google Scholar] [CrossRef]
  80. Hartwig, K.; Reuter, C. TrustyTweet: An Indicator-based Browser-Plugin to Assist Users in Dealing with Fake News on Twitter. In Proceedings of the WI 2019, the 14th International Conference on Business Informatics, AIS eLibrary, Siegen, Germany, 23–27 February 2019; pp. 1844–1855. Available online: https://aisel.aisnet.org/wi2019/specialtrack01/papers/5/ (accessed on 10 April 2021).
  81. Memon, S.A.; Carley, K.M. Characterizing COVID-19 Misinformation Communities Using a Novel Twitter Dataset. arXiv 2020, arXiv:2008.00791. [Google Scholar]
  82. Shahi, G.; Dirkson, A.; Majchrzak, T.A. An Exploratory Study of COVID-19 Misinformation on Twitter. Online Soc. Netw. Media 2020, 22, 100104. [Google Scholar] [CrossRef] [PubMed]
  83. Singh, L.; Bansal, S.; Bode, L.; Budak, C.; Chi, G.; Kawintiranon, K.; Padden, C.; Vanarsdall, R.; Vraga, E.; Wang, Y. A first look at COVID-19 information and misinformation sharing on Twitter. arXiv 2020, arXiv:2003.13907. [Google Scholar]
  84. Alqurashi, S.; Hamawi, B.; Alashaikh, A.; Alhindi, A.; Alanazi, E. Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter. arXiv 2021, arXiv:2101.05626. [Google Scholar]
  85. Girgis, S.; Amer, E.; Gadallah, M. Deep Learning Algorithms for Detecting Fake News in Online Text. In Proceedings of the 2018 13th International Conference on Computer Engineering and Systems (ICCES), Cairo, Egypt, 18–19 December 2018; pp. 93–97. [Google Scholar] [CrossRef]
  86. Hossain, T.; Logan IV, R.L.; Ugarte, A.; Matsubara, Y.; Young, S.; Singh, S. COVIDLies: Detecting COVID-19 Misinformation on Social Media. Available online: https://openreview.net/pdf?id=FCna-s-ZaIE (accessed on 19 August 2021).
  87. Shahi, G.K.; Nandini, D. FakeCovid—A Multilingual Cross-domain Fact Check News Dataset for COVID-19. In Proceedings of the 14th International AAAI Conference on Web and Social Media, Atlanta, GA, USA, 8–11 June 2020. [Google Scholar]
  88. Zhou, X.; Mulay, A.; Ferrara, E.; Zafarani, R. ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM ’20), Online, 19–23 October 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 3205–3212. [Google Scholar] [CrossRef]
  89. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1–11. [Google Scholar] [CrossRef]
  90. Ahmad, I.; Yousaf, M.; Yousaf, S.; Ahmad, M.O. Fake News Detection Using Machine Learning Ensemble Methods. Complexity 2020, 2020, 8885861. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
