Sentiment Analysis for Fake News Detection
Abstract
1. Introduction
2. Related Work
3. Fake News
- false connection, where headlines, visuals, or captions do not support the content;
- false context, corresponding to genuine content shared with false contextual information;
- manipulated content, i.e., genuine information manipulated to deceive;
- misleading content, which involves misleading use of information to frame an issue or individual;
- imposter content, where genuine sources are impersonated;
- fabricated content, 100% false, designed to deceive and harm; and
- satire/parody, with the potential to fool but no intention to cause harm. Given the non-harmful nature of this type of news and because it is easily identifiable as parodic [30], it is not usually considered for fake news detection, although satire can be used as an excuse to avoid accusations of spreading false news [31].
4. Fake News Detection
- Knowledge-based, fact-checking approaches that try to determine whether the claims made in a news story are supported by facts. For this, knowledge bases, including the semantic web and linked open data, are used [70,71]. We could also include in this category fact-checking approaches based on the use of information retrieval techniques to find documents that support a news piece [72,73].
- Context-based approaches that try to determine the truthfulness of a news story based on its metadata, such as the credibility of its author and publisher as well as the speed and form of dissemination of the news on social networks [27].
- Content-based approaches that try to determine the veracity of a story based on its text, which includes considerations of style (for example, length, variety of words, complexity of vocabulary, and complexity of syntactic constructions) [10,24,68] as well as of the type and strength of sentiments and emotions conveyed by the news, which is the topic covered in this article.
- Stance classification [74,75], the task of determining the opinion behind some text, with the target or topic known in advance (target-specific stance classification) or not (open stance classification). It is different from fake news detection in that it is not used for assessing veracity but consistency [24].
- Truth discovery [77], the task of detecting true facts by resolving conflicts among multi-source noisy information, e.g., databases, the web, crowdsourced data, etc. To solve this task, both the credibility of the source and the truthfulness of the objects of interest need to be taken into account.
- Clickbait detection [9], the task of distinguishing headlines that fit the facts described in a news item from those designed to catch the eye and tease readers in online media.
- Bot detection [82]. Social media is now populated by small programs designed to exhibit human-like behavior called social bots [83] that automatically spread posts to give the impression that a given piece of information is highly popular and endorsed by many people. The task of detecting bots is closely related to opinion spam and fake review detection.
- Determining source credibility [84,85]. Credibility is a perceived quality associated with believability, trustworthiness, perceived reliability, expertise, and accuracy. As a result, we could say that credibility is a transversal quality for all tasks that deal with misinformation and disinformation.
- Hate speech detection [86,87,88]. Hate speech is a broad umbrella term for insulting and offensive user-created content addressed to specific targets to incite violence or hate toward a minority based on specific characteristics of groups, such as ethnic origin, religion, or other attributes. In hate speech, language is used to attack or diminish these groups. These types of messages often rest on content of doubtful credibility, including rumors and fake news.
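To make the content-based category above concrete, the following sketch extracts a few of the stylistic signals mentioned (length, variety of words) together with a crude lexicon-based sentiment score. The tiny lexicon and the feature set are illustrative assumptions, not taken from any system surveyed here.

```python
# Minimal sketch of content-based features for fake news detection:
# style signals (length, lexical variety, word length) plus a crude
# lexicon-based sentiment score. The lexicon below is illustrative only.

POS_WORDS = {"good", "great", "honest", "true", "verified"}
NEG_WORDS = {"bad", "fake", "corrupt", "scandal", "lie", "hoax"}

def content_features(text: str) -> dict:
    tokens = text.lower().split()
    n = len(tokens)
    pos = sum(t in POS_WORDS for t in tokens)
    neg = sum(t in NEG_WORDS for t in tokens)
    return {
        "length": n,
        # variety of words: distinct tokens over total tokens
        "type_token_ratio": len(set(tokens)) / n if n else 0.0,
        # rough proxy for vocabulary complexity
        "avg_word_len": sum(map(len, tokens)) / n if n else 0.0,
        # signed, length-normalized sentiment strength
        "sentiment": (pos - neg) / n if n else 0.0,
    }
```

In a real system these features would feed a classifier alongside syntactic-complexity measures and a proper sentiment model rather than a toy word list.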
4.1. Resources
- Papers that discuss experimental results usually provide some reference to the source of the data used in the experiments, ranging from simply naming the data set to providing a link or, more rarely, a bibliographic reference. Many of the links were broken, so we did our best to recover the correct ones.
- A search was performed on Google (https://www.google.com/, accessed on 3 June 2021) using “fake news dataset” and “fake news corpus” as query phrases to find additional resources.
- Manual browsing was performed for the most popular evaluation campaigns in NLP, such as SemEval, CLEF, IberLEF, etc.
- In addition, a Google search for “fake news shared task” and “fake news campaign” was performed.
4.2. Data Sets
- Fact Checking corpus (https://sites.google.com/site/andreasvlachos/resources/, accessed on 3 June 2021) by Vlachos and Riedel [96]. It contains 106 statements from Channel 4 FactCheck (https://www.channel4.com/news/factcheck/, accessed on 3 June 2021) and PolitiFact.com (https://www.politifact.com/, accessed on 3 June 2021). Both websites have large archives of fact-checked statements that cover a wide range of issues involving UK and US public life, and they provide detailed verdicts with fine-grained labels that were aligned to a five-point scale of “True”, “MostlyTrue”, “HalfTrue”, “MostlyFalse”, and “False”.
- BuzzFeed-Webis Fake News Corpus 2016 (https://zenodo.org/record/1239675, accessed on 3 June 2021) [68]. This corpus encompasses 1627 Facebook posts from 9 publishers on 7 workdays close to the US 2016 presidential election. It contains 256 posts from three left-wing publishers, 545 posts from three right-wing ones, and 826 posts from three mainstream publishers. All publishers earned Facebook’s blue checkmark, indicating authenticity and an elevated status within the network. Each post and linked news article was rated “mostly true”, “mixture of true and false”, “mostly false”, or “no factual content” by BuzzFeed journalists.
- BuzzFace (https://github.com/gsantia/BuzzFace, accessed on 3 June 2021) [97] is an extension of the previous corpus enriched with 1.6 million Facebook comments and reactions and with additional data from Twitter and Reddit.
- Craig Silverman data sets (https://github.com/BenjaminDHorne/fakenewsdata1, accessed on 3 June 2021). The first data set, Buzzfeed Political News Data, contains true news stories and malicious fake news stories from buzzfeednews.com. The other data set, Random Political News Data, contains true news from The Wall Street Journal, The Economist, BBC, NPR, ABC, CBS, USA Today, The Guardian, NBC, and The Washington Post; satirical news from The Onion, Huffington Post Satire, The Borowitz Report, The Beaverton, SatireWire, and Faking News; and fake news from Ending The Fed, True Pundit, abcnews.com.co, DC Gazette, Liberty Writers News, Before It’s News, InfoWars, and Real News Right Now.
- LIAR (https://www.cs.ucsb.edu/~william/data/liar_dataset.zip, accessed on 3 June 2021) [98], with 12.8 K human-labeled short statements from PolitiFact.com evaluated for their truthfulness using six labels: “pants-fire”, “false”, “barely-true”, “half-true”, “mostly-true”, and “true”. A rich set of metadata for the author of each statement is also provided. The statements are sampled from news releases, TV and radio interviews, campaign speeches, TV ads, Twitter messages, debates, and Facebook posts. The most discussed subjects are the economy, healthcare, taxes, the federal budget, education, jobs, state budgets, candidate biographies, elections, and immigration.
- Fact Checking data set (https://hrashkin.github.io/factcheck.html, accessed on 3 June 2021) [99], a collection of rated statements from PolitiFact.com with additional unreliable news articles from different types of unreliable sources including satire, propaganda, and hoaxes.
- BS Detector data set (https://www.kaggle.com/mrisdal/fake-news, accessed on 3 June 2021), sometimes known as Kaggle Fake News data set, contains text and metadata from 244 websites and represents 12,999 posts from 30 days collected by a browser extension called BS detector developed for checking news veracity. This extension searches all links on a given webpage for references to unreliable sources by checking against a manually compiled list of domains. Therefore, the documents in the data set were labeled by software, not by human annotators.
- Fake vs. real news project data set (https://github.com/joolsa/fake_real_news_dataset, accessed on 3 June 2021) was developed by George McIntire and contains 7.8 K news articles with an equal allocation of fake and real news. Half of the corpus comes from political news [100]. The fake news items were collected from the BS Detector data set. The real news items were collected from media organizations such as The New York Times, WSJ, Bloomberg, NPR, and The Guardian during 2015 and 2016.
- CREDBANK data set (https://github.com/compsocial/CREDBANK-data, accessed on 3 June 2021) [101] comprises more than 60 million tweets collected over 96 days, grouped into 1049 real-world events. Each tweet was annotated by 30 human annotators from Amazon Mechanical Turk.
- Snopes data set (https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/impact/web-credibility-analysis/, accessed on 3 June 2021) [102] contains 1277 true claims and 3579 false claims collected from the fact-checking website Snopes (https://www.snopes.com/, accessed on 3 June 2021). A set of 30 reporting articles is provided for each claim, which were obtained by using the text of the claim as query to Google in order to extract the first 30 results.
- Fact-Checking Facebook Politics Pages (https://github.com/BuzzFeedNews/2016-10-facebook-fact-check, https://www.buzzfeednews.com/article/craigsilverman/partisan-fb-pages-analysis, accessed on 3 June 2021) contains 666 posts from three large right-wing hyperpartisan Facebook pages, 471 from hyperpartisan left-wing pages, as well as 1145 posts from three large mainstream political news pages that were manually rated as “mostly true”, “mixture of true and false”, or “mostly false”. Satirical and opinion-driven posts or posts that lacked a factual claim were rated as “no factual content”.
- FEVER, the Fact Extraction and VERification data set (https://fever.ai/resources.html, accessed on 3 June 2021) [103], consists of 185,445 claims manually verified against the introductory sections of Wikipedia pages and classified as Supported, Refuted, or NotEnoughInfo. For the first two classes, systems and annotators must also return the combination of sentences forming the evidence that supports or refutes the claim. The claims were generated by human annotators extracting claims from Wikipedia and mutating them in a variety of ways, some of which were meaning-altering. The verification of each claim was conducted in a separate annotation process by annotators who were aware of the page but not the sentence from which the original claim was extracted. Although this data set does not contain news, we consider it relevant and include it in this list because it shows a way to convert a text with objective facts (similar in some way to a true news item) into a false text (equivalent to a fake news item).
- Fake News vs. Satire corpus (https://github.com/jgolbeck/fakenews, accessed on 3 June 2021) [31] contains 283 fake news articles and 203 satirical stories focused on American politics, posted between January 2016 and October 2017. The title, a link, and the full text ID are provided for each article. For fake news stories, a rebutting article is also provided that disproves the premise of the original story.
- Fake.Br Corpus (https://github.com/roneysco/Fake.br-Corpus, accessed on 3 June 2021) [104,105] is composed of 7200 true and fake news items written in Brazilian Portuguese: 3600 fake news items were manually collected from four Brazilian newspapers, while 3600 true news items were collected in a semi-automatic way from major news agencies in Brazil, choosing those most similar to the fake news, with manual verification to guarantee that the fake and true news were in fact subject-related.
- FakeNewsCorpus (https://github.com/architapathak/FakeNewsCorpus, accessed on 3 June 2021) by Pathak and Srihari [106] contains 704 fake and questionable articles from August to November 2016 on the topic of the 2016 US election. Each article was manually checked, and two types of labels were assigned to it: a primary label based on the assertions made (“False”, “Partial truth”, or “Opinions”) and a secondary label (“Fake” or “Questionable”).
- NELA-GT-2018 (https://doi.org/10.7910/DVN/ULHLCB, accessed on 3 June 2021) [107], a data set that does not annotate the veracity of each news story but rather the truthfulness of the news source. It is made up of 713,534 articles from 194 news and media producers including mainstream, hyper-partisan, and conspiracy sources. It includes the ground truth ratings of the sources from eight independent assessment sites (NewsGuard, Pew Research Center, Wikipedia, OpenSources, Media Bias/Fact Check MBFC, AllSides, BuzzFeed News, and Politifact) covering multiple dimensions of veracity, including reliability, bias, transparency, adherence to journalistic standards, and consumer trust.
- NELA-GT-2019 (https://doi.org/10.7910/DVN/O7FWPO, accessed on 3 June 2021) [108] is an updated version of NELA-GT-2018 with 1.12 M news articles from 260 sources. One major change is the removal of NewsGuard labels.
- FA-KES (https://zenodo.org/record/2607278#.YEpyDGhKhPY, accessed on 3 June 2021) [109], a fake news data set around the Syrian war. The data set consists of 804 news articles written in English from several media outlets representing mobilization press, loyalist press, and diverse print media, annotated as fake or credible, taking the information obtained from the Syrian Violations Documentation Center (VDC) as ground truth. For each news item, relevant information (e.g., date, location, and number of casualties) was extracted manually through the crowdsourcing platform CrowdFlower and then matched against the VDC database to deduce whether an article is fake.
- The Spanish Fake News Corpus (https://github.com/jpposadas/FakeNewsCorpusSpanish, accessed on 3 June 2021) [110] contains 971 news written in Spanish compiled from January to July of 2018 from sources from Spain, Mexico, Argentina, Colombia, the USA, and the United Kingdom including newspapers, media companies, websites dedicated to validating fake news, and websites designated by different journalists as sites that regularly publish fake news. Each article was manually labeled as true if there was evidence that it was published on reliable sites (established newspaper websites or renowned journalist websites). Conversely, an article was labeled as fake if it was contradicted by any article from reliable sites or from websites specializing in the detection of deceptive content or if no other evidence was found about the news besides the source. This corpus was used as training data set at the FakeDeS 2021 shared task.
- FakeNewsNet (https://github.com/KaiDMML/FakeNewsNet, accessed on 3 June 2021) [111] contains news from the fact-checking websites PolitiFact.com and GossipCop (https://www.gossipcop.com/, accessed on 3 June 2021). In addition to text, it also contains information on user engagement related to the fake and real news pieces in the form of social media posts and their replies, likes, and reposts. Timestamps and locations explicitly listed in user profiles are also provided.
- FakeHealth (https://github.com/EnyanDai/FakeHealth, accessed on 3 June 2021) [56] consists of two data sets: HealthStory with news stories reported by news media (e.g., Reuters Health) and HealthRelease with news releases from universities, research centers, and companies. Each data set includes news contents, news reviews covering explanations regarding ten health news evaluation criteria, social engagements from Twitter, and user networks. The news stories and releases in the data sets have been evaluated by experts on a set of reliability criteria. For the news contents, text is provided along with the source publishers, image links, and other side information. News social engagements include 500 k tweets, 29 k replies, 14 k retweets, and 27 k user profiles with timelines and friend lists.
- HWB (https://dcs.uoc.ac.in/cida/resources/hwb.html, accessed on 3 June 2021) [55], a Health and Well-Being fake news data set composed of 500 legitimate news items on health from reputable sources (CNN, The New York Times, and The New Indian Express, among others), manually double-checked for truthfulness, and 500 fake news items on health from well-reported misinformation websites (Before It’s News, Nephef, and Mad World News, among others), manually verified for misinformation presence.
- Weibo-20 [112] (https://github.com/RMSnow/WWW2021, accessed on 3 June 2021), a Chinese data set with 3161 instances of fake news (1355 are fake news pieces from the Weibo-16 rumor data set [93], and the rest are news pieces judged as misinformation officially by the Weibo Community Management Center) with their 1,132,298 comments on Weibo and 3201 real news (2351 were real news pieces from Weibo-16, and 850 are new real news) with their 851,142 comments on Weibo.
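As an illustration of how such resources are typically consumed, the sketch below parses LIAR-style TSV rows and collapses the six fine-grained labels into a binary fake/true scheme. The column order (id, label, statement, …) follows the data set description, and the binary grouping of labels is our own assumption; both should be verified against the actual distribution files before use.

```python
import csv
import io

# Sketch of loading LIAR-style TSV rows and mapping the six labels to a
# binary scheme (1 = fake-leaning, 0 = true-leaning). Assumed column
# order: id, label, statement, followed by metadata columns.

FAKE_LABELS = {"pants-fire", "false", "barely-true"}
TRUE_LABELS = {"half-true", "mostly-true", "true"}

def load_liar(tsv_text: str):
    """Return a list of (statement, binary_label) pairs."""
    rows = []
    for rec in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(rec) < 3:
            continue  # skip malformed lines
        _id, label, statement = rec[0], rec[1], rec[2]
        rows.append((statement, 1 if label in FAKE_LABELS else 0))
    return rows
```

The same pattern (read, normalize labels, keep text plus metadata) applies to most of the corpora listed above, differing mainly in column layout and label vocabulary.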
4.3. Shared Tasks
- The goal of the Fast & Furious Fact Check Challenge (https://www.herox.com/factcheck/teams, accessed on 3 June 2021), held in 2016, was to develop faster ways to check facts through automation. The challenge consisted of assigning a “truth rating” to each claim (there were 90 claims, 41 for training and 49 for testing). The small size of the set of claims prevents its use with today’s prevalent machine learning techniques.
- The Fake News Challenge FNC-1 (http://www.fakenewschallenge.org/, accessed on 3 June 2021) [113], held in 2017, aimed to determine the perspective (or stance) of a news article relative to a given headline. Therefore, despite its name, this task does not address fake news detection but stance detection. An article’s stance can either agree or disagree with the headline, discuss the same topic, or be completely unrelated. An existing data set for stance classification [91] was enhanced for this purpose, with 50 K labeled claim–article pairs combining 300 claims with 2582 articles. The claims and the articles were curated and labeled by journalists.
- WSDM—Fake News Classification (https://www.kaggle.com/c/fake-news-pair-classification-challenge/, accessed on 3 June 2021) was a shared task organized within the framework of the Twelfth ACM International Conference on Web Search and Data Mining (WSDM 2019). Rather than detecting fake news directly, this task dealt with detecting news that propagated or refuted other fake news: given the title of a fake news article A and the title of a news article of interest B, participants were asked to classify B into one of three categories: “agreed” (B talks about the same fake news as A), “disagreed” (B refutes the fake news in A), and “unrelated” (B is unrelated to A). The training data set contained 320,767 news pairs in both Chinese and English, while the test data set contained 80,126 news pairs in both languages.
- Fake News Detection Challenge KDD 2020 (https://www.kaggle.com/c/fakenewskdd2020, accessed on 3 June 2021) was a shared task organized in the context of the Second International TrueFact Workshop: Making a Credible Web for Tomorrow in conjunction with SIGKDD 2020. The data set was composed of fake and true news.
- Profiling Fake News Spreaders on Twitter (https://pan.webis.de/clef20/pan20-web/author-profiling.html, accessed on 3 June 2021) [114] was a shared task organized within CLEF 2020 in the context of PAN, a series of scientific events and shared tasks on digital text forensics and stylometry. The particularity of this task is that it does not try to detect fake news as such but rather to detect whether a Twitter user is a potential spreader of fake news. The languages considered for this task were English and Spanish. The data set consisted of the last 100 tweets of 500 users (250 fake news spreaders and 250 true news spreaders) for each of the two languages.
- UrduFake@FIRE2020 (https://www.urdufake2020.cicling.org/, accessed on 3 June 2021) [115] was a shared task organized within the framework of the 12th meeting of the Forum for Information Retrieval Evaluation (FIRE 2020). The data set used for the task was the Bend-The-Truth Urdu fake news data set, composed of 750 true news articles in the domains of technology, education, business, sports, politics, and entertainment, obtained from a variety of mainstream news websites predominantly in Pakistan, India, the United Kingdom, and the USA, and 550 fake news articles intentionally written by a group of professional journalists in the same domains and of approximately the same length as the true news.
- The CONSTRAINT 2021 shared task (https://constraint-shared-task-2021.github.io/, accessed on 3 June 2021) [116] was organized in the context of the First Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT 2021) collocated with AAAI 2021. It encompassed a task for detecting fake news in English over a data set with real and fake news on COVID-19 [117] and a second task on hostile post detection in Hindi over a data set of 8200 hostile and non-hostile texts from various social media platforms such as Twitter, Facebook, or WhatsApp [118]. This latter task is a multi-label, multi-class classification problem where each post can belong to one or more of a set of classes: fake news, hate speech, offensive, defamation, and non-hostile posts.
- FakeDeS (https://sites.google.com/view/fakedes, accessed on 3 June 2021) is the Fake News Detection in Spanish Shared Task established in 2021 under the umbrella of the Iberian Languages Evaluation Forum (IberLEF). The training data set for this task is The Spanish Fake News Corpus [110] described above. The testing corpus contains news related to COVID-19 and news from other Ibero-American countries.
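In the spirit of the FNC-1 stance setup described above, the following toy heuristic separates "unrelated" headline–article pairs by token overlap and uses a small cue-word list to flag disagreement. The threshold, the cue list, and the missing "agree" branch are simplifications for illustration; this is not the official challenge baseline.

```python
# Toy headline-article stance heuristic (FNC-1-style label set, reduced):
# low token overlap -> "unrelated"; refutation cue words -> "disagree";
# otherwise "discuss". All constants below are arbitrary choices.

REFUTE_CUES = {"hoax", "fake", "false", "debunk", "deny"}

def stance(headline: str, body: str) -> str:
    h = set(headline.lower().split())
    b = set(body.lower().split())
    overlap = len(h & b) / len(h) if h else 0.0
    if overlap < 0.2:
        return "unrelated"
    if b & REFUTE_CUES:
        return "disagree"
    return "discuss"
```

Competitive systems replace the overlap score with learned representations of the pair, but the two-stage structure (filter unrelated pairs, then classify the relation) mirrors how many participants approached the task.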
4.4. Evaluation Measures
- A true positive is counted for each article predicted as fake news that is actually annotated as fake in the test set;
- A false positive is counted for each article predicted as fake news that is actually annotated as true news in the test set;
- A true negative is counted for each article predicted as true news that is actually annotated as true in the test set; and
- A false negative is counted for each article predicted as true news that is actually annotated as fake in the test set.
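From these four counts, the usual evaluation measures follow directly; a minimal sketch:

```python
# Standard measures derived from the four counts defined above
# (fake news taken as the positive class).

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```

Note that with imbalanced test sets (true news typically outnumbering fake news), accuracy alone can be misleading, which is why precision, recall, and F1 over the fake class are usually reported as well.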
4.5. Summary
5. Sentiment Analysis as a Base Component for Text Analytics
6. Sentiment Analysis for Fake News Detection
6.1. Fake News Detection Systems Based on SA
6.2. SA as a Feature for Fake News Detection Systems
7. Discussion
- Multilingualism. Most of the work published on detecting fake news has been conducted on documents written in English. However, as this is a global issue, it is imperative to have systems that work in as many languages as possible. Only recently have approaches appeared that are able to deal with fake news in several languages [189,190]. For this purpose, effective multilingual SA methods may be applied [128,145].
- Multimedia content, particularly image and video, is becoming increasingly important on social media. Fake news are also increasingly accompanied by this type of content, so it will be necessary to enrich detection systems with multimedia SA methods [191].
- The most difficult fake news to detect are those in which falsehood has been subtly introduced, for example, expanding an authentic news piece with the addition of fake data or slightly modifying an authentic news story [192]. In this case, aspect-based SA [193] and adversarial training [194] can be of great help.
- High-performance AI systems, particularly those based on deep learning, behave like black boxes that provide good results but can hardly justify a given output in a human-understandable way. The creation of explainable AI systems [195] is becoming more and more important; it is therefore necessary to add explanation mechanisms both to the SA methods used and to the resulting fake news detection systems.
- It is known that NLP algorithms and resources may have inadvertently introduced biases [196,197]. This also applies to SA systems [198]. Adding algorithmic biases to psychological and sociological biases already present in people [199,200] could call into question the usefulness of automated fake news detection systems in the future. Therefore, we must add bias mitigation mechanisms in SA systems in order to avoid giving more or less credibility to a news item depending on the gender, race, geographic origin, religion, or any other personal circumstance of the writer or the people mentioned in the text.
8. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Shearer, E.; Mitchell, A. News Use Across Social Media Platforms in 2020. 2021. Available online: https://www.journalism.org/2021/01/12/news-use-across-social-media-platforms-in-2020/ (accessed on 3 June 2021).
- Hernon, P. Disinformation and misinformation through the internet: Findings of an exploratory study. Gov. Inf. Q. 1995, 12, 133–139. [Google Scholar] [CrossRef]
- Zubiaga, A.; Aker, A.; Bontcheva, K.; Liakata, M.; Procter, R. Detection and Resolution of Rumours in Social Media: A Survey. ACM Comput. Surv. 2018, 51, 32:1–32:36. [Google Scholar] [CrossRef] [Green Version]
- Meel, P.; Vishwakarma, D.K. Fake news, rumor, information pollution in social media and web: A contemporary survey of state-of-the-arts, challenges and opportunities. Expert Syst. Appl. 2020, 153, 112986. [Google Scholar] [CrossRef]
- Zannettou, S.; Sirivianos, M.; Blackburn, J.; Kourtellis, N. The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans. ACM J. Data Inf. Qual. 2019, 11, 10:1–10:37. [Google Scholar] [CrossRef] [Green Version]
- Sharma, K.; Qian, F.; Jiang, H.; Ruchansky, N.; Zhang, M.; Liu, Y. Combating Fake News: A Survey on Identification and Mitigation Techniques. ACM Trans. Intell. Syst. Technol. 2019, 10, 21:1–21:42. [Google Scholar] [CrossRef]
- Guerini, M.; Staiano, J. Deep Feelings: A Massive Cross-Lingual Study on the Relation between Emotions and Virality. In Proceedings of the 24th International Conference on World Wide Web Companion, WWW 2015, Florence, Italy, 18–22 May 2015; Companion Volume. Gangemi, A., Leonardi, S., Panconesi, A., Eds.; ACM: New York, NY, USA, 2015; pp. 299–305. [Google Scholar] [CrossRef]
- Dickerson, J.P.; Kagan, V.; Subrahmanian, V.S. Using sentiment to detect bots on Twitter: Are humans more opinionated than bots? In Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2014, Beijing, China, 17–20 August 2014; Wu, X., Ester, M., Xu, G., Eds.; IEEE Computer Society: Washington, DC, USA, 2014; pp. 620–627. [Google Scholar] [CrossRef]
- Chen, Y.; Conroy, N.J.; Rubin, V.L. Misleading Online Content: Recognizing Clickbait as “False News”. In Proceedings of the 2015 ACM Workshop on Multimodal Deception Detection, WMDD@ICMI 2015, Seattle, WA, USA, 13 November 2015; Abouelenien, M., Burzo, M., Mihalcea, R., Pérez-Rosas, V., Eds.; ACM: New York, NY, USA, 2015; pp. 15–19. [Google Scholar] [CrossRef]
- Horne, B.D.; Adali, S. This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News. In Proceedings of the Workshops of the Eleventh International AAAI Conference on Web and Social Media (ICWSM 2017), Montreal, QC, Canada, 15–18 May 2017; An, J., Kwak, H., Benevenuto, F., Eds.; AAAI Press: Palo Alto, CA, USA, 2017. Volume AAAI Technical Report WS-17-17: News and Public Opinion. pp. 759–766. [Google Scholar]
- Conroy, N.J.; Rubin, V.L.; Chen, Y. Automatic deception detection: Methods for finding fake news. In Information Science with Impact: Research in and for the Community—Proceedings of the 78th ASIS&T Annual Meeting, ASIST 2015, St. Louis, MO, USA, 6–10 October 2015; Wiley: Hoboken, NJ, USA, 2015; Volume 52, pp. 1–4. [Google Scholar] [CrossRef] [Green Version]
- Shu, K.; Sliva, A.; Wang, S.; Tang, J.; Liu, H. Fake News Detection on Social Media: A Data Mining Perspective. SIGKDD Explor. 2017, 19, 22–36. [Google Scholar] [CrossRef]
- Shu, K.; Wang, S.; Lee, D.; Liu, H. Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements. In Disinformation, Misinformation, and Fake News in Social Media: Emerging Research Challenges and Opportunities; Shu, K., Wang, S., Lee, D., Liu, H., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 1–19. [Google Scholar] [CrossRef]
- Shu, K.; Liu, H. Detecting Fake News on Social Media. In Synthesis Lectures on Data Mining and Knowledge Discovery; Morgan & Claypool Publishers: San Rafael, CA, USA, 2019; Volume 18. [Google Scholar]
- Hussein, D.M.E.D.M. A survey on sentiment analysis challenges. J. King Saud Univ. Eng. Sci. 2018, 30, 330–338. [Google Scholar] [CrossRef]
- Flekova, L.; Preotiuc-Pietro, D.; Ruppert, E. Analysing domain suitability of a sentiment lexicon by identifying distributionally bipolar words. In Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, WASSA@EMNLP 2015, Lisbon, Portugal, 17 September 2015; Balahur, A., der Goot, E.V., Vossen, P., Montoyo, A., Eds.; The Association for Computer Linguistics: Stroudsburg, PA, USA, 2015; pp. 77–84. [Google Scholar] [CrossRef] [Green Version]
- Thorne, J.; Vlachos, A. Automated Fact Checking: Task Formulations, Methods and Future Directions. In Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, NM, USA, 20–26 August 2018; Bender, E.M., Derczynski, L., Isabelle, P., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 3346–3359. [Google Scholar]
- Elhadad, M.K.; Li, K.F.; Gebali, F. Fake News Detection on Social Media: A Systematic Survey. In Proceedings of the IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019, Victoria, BC, Canada, 21–23 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–8. [Google Scholar] [CrossRef]
- Bondielli, A.; Marcelloni, F. A survey on fake news and rumour detection techniques. Inf. Sci. 2019, 497, 38–55. [Google Scholar] [CrossRef]
- da Silva, F.C.D.; Vieira, R.; Garcia, A.C. Can Machines Learn to Detect Fake News? A Survey Focused on Social Media. In Proceedings of the 52nd Hawaii International Conference on System Sciences, HICSS 2019, Grand Wailea, Maui, HI, USA, 8–11 January 2019; Bui, T., Ed.; ScholarSpace: Honolulu, HI, USA, 2019; pp. 1–8. [Google Scholar]
- Klyuev, V. Fake News Filtering: Semantic Approaches. In Proceedings of the 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 29–31 August 2018; pp. 9–15. [Google Scholar] [CrossRef]
- Collins, B.; Hoang, D.T.; Nguyen, N.T.; Hwang, D. Fake News Types and Detection Models on Social Media A State-of-the-Art Survey. In Proceedings of the Intelligent Information and Database Systems-12th Asian Conference, ACIIDS 2020, Phuket, Thailand, 23–26 March 2020; Companion Proceedings. Sitek, P., Pietranik, M., Krótkiewicz, M., Srinilta, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1178, pp. 562–573. [Google Scholar] [CrossRef]
- Zhou, X.; Zafarani, R. A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities. ACM Comput. Surv. 2020, 53, 109:1–109:40. [Google Scholar] [CrossRef]
- Oshikawa, R.; Qian, J.; Wang, W.Y. A Survey on Natural Language Processing for Fake News Detection. In Proceedings of the 12th Language Resources and Evaluation Conference, LREC 2020, Marseille, France, 11–16 May 2020; Calzolari, N., Béchet, F., Blache, P., Choukri, K., Cieri, C., Declerck, T., Goggi, S., Isahara, H., Maegaard, B., Mariani, J., et al., Eds.; European Language Resources Association: Paris, France, 2020; pp. 6086–6093. [Google Scholar]
- Zhang, X.; Ghorbani, A.A. An overview of online fake news: Characterization, detection, and discussion. Inf. Process. Manag. 2020, 57, 102025. [Google Scholar] [CrossRef]
- de Souza, J.V.; Gomes, J., Jr.; de Souza Filho, F.M.; de Oliveira Julio, A.M.; de Souza, J.F. A systematic mapping on automatic classification of fake news in social media. Soc. Netw. Anal. Min. 2020, 10, 48. [Google Scholar] [CrossRef]
- Antonakaki, D.; Fragopoulou, P.; Ioannidis, S. A survey of Twitter research: Data model, graph structure, sentiment analysis and attacks. Expert Syst. Appl. 2021, 164, 114006. [Google Scholar] [CrossRef]
- Allcott, H.; Gentzkow, M. Social Media and Fake News in the 2016 Election. J. Econ. Perspect. 2017, 31, 211–236. [Google Scholar] [CrossRef] [Green Version]
- Wardle, C. Fake News. It’s Complicated. 2017. Available online: https://firstdraftnews.org/articles/fake-news-complicated/ (accessed on 3 June 2021).
- Tandoc, E.C., Jr.; Lim, Z.W.; Ling, R. Defining “Fake News”. Digit. J. 2018, 6, 137–153. [Google Scholar] [CrossRef]
- Golbeck, J.; Mauriello, M.L.; Auxier, B.; Bhanushali, K.H.; Bonk, C.; Bouzaghrane, M.A.; Buntain, C.; Chanduka, R.; Cheakalos, P.; Everett, J.B.; et al. Fake News vs. Satire: A Dataset and Analysis. In Proceedings of the 10th ACM Conference on Web Science, WebSci 2018, Amsterdam, The Netherlands, 27–30 May 2018; Akkermans, H., Fontaine, K., Vermeulen, I., Houben, G., Weber, M.S., Eds.; ACM: New York, NY, USA, 2018; pp. 17–21. [Google Scholar] [CrossRef]
- Scheufele, D.A.; Krause, N.M. Science audiences, misinformation, and fake news. Proc. Natl. Acad. Sci. USA 2019, 116, 7662–7669. [Google Scholar] [CrossRef] [Green Version]
- Field-Fote, E.E. Fake News in Science. J. Neurol. Phys. Ther. 2019, 43, 139–140. [Google Scholar] [CrossRef]
- Taddicken, M.; Wolff, L. ‘Fake News’ in Science Communication: Emotions and Strategies of Coping with Dissonance Online. Media Commun. 2020, 8, 206–217. [Google Scholar] [CrossRef]
- Kedar, H.E. Fake News in Media Art: Fake News as a Media Art Practice vs. Fake News in Politics. Postdigit. Sci. Educ. 2020, 2, 132–146. [Google Scholar] [CrossRef] [Green Version]
- Ruzicka, V.; Kang, E.; Gordon, D.; Patel, A.; Fashimpaur, J.; Zaheer, M. The Myths of Our Time: Fake News. arXiv 2019, arXiv:1908.01760. [Google Scholar]
- Rapoza, K. Can ‘Fake News’ Impact The Stock Market? Forbes 2017. Available online: https://www.forbes.com/sites/kenrapoza/2017/02/26/can-fake-news-impact-the-stock-market/ (accessed on 3 June 2021).
- Clarke, J.; Chen, H.; Du, D.; Hu, Y.J. Fake News, Investor Attention, and Market Reaction. Inf. Syst. Res. 2020. Forthcoming. [Google Scholar] [CrossRef]
- Kogan, S.; Moskowitz, T.J.; Niessner, M. Fake News in Financial Markets; Social Science Research Network (SSRN): Rochester, NY, USA, 2020. [Google Scholar] [CrossRef]
- Domenico, G.D.; Sit, J.; Ishizaka, A.; Nunan, D. Fake news, social media and marketing: A systematic review. J. Bus. Res. 2021, 124, 329–341. [Google Scholar] [CrossRef]
- Visentin, M.; Pizzi, G.; Pichierri, M. Fake News, Real Problems for Brands: The Impact of Content Truthfulness and Source Credibility on consumers’ Behavioral Intentions toward the Advertised Brands. J. Interact. Mark. 2019, 45, 99–112. [Google Scholar] [CrossRef]
- Di Domenico, G.; Visentin, M. Fake news or true lies? Reflections about problematic contents in marketing. Int. J. Mark. Res. 2020. Forthcoming. [Google Scholar] [CrossRef]
- Bakir, V.; McStay, A. Fake News and The Economy of Emotions. Digit. J. 2018, 6, 154–175. [Google Scholar] [CrossRef]
- Sindermann, C.; Cooper, A.; Montag, C. A short review on susceptibility to falling for fake political news. Curr. Opin. Psychol. 2020, 36, 44–48. [Google Scholar] [CrossRef] [PubMed]
- Scardigno, R.; Mininni, G. The Rhetoric Side of Fake News: A New Weapon for Anti-Politics? World Futures 2020, 76, 81–101. [Google Scholar] [CrossRef]
- Brun, I. National Security in the Era of Post-Truth and Fake News; Institute for National Security Studies: Tel Aviv, Israel, 2020. [Google Scholar]
- Belova, G.; Georgieva, G. Fake News as a Threat to National Security. Int. Conf. Knowl. Based Organ. 2018, 24, 19–22. [Google Scholar] [CrossRef] [Green Version]
- Vasu, N.; Ang, B.; Teo, T.A.; Jayakumar, S.; Faizal, M.; Ahuja, J. Fake News: National Security in the Post-Truth Era; Technical Report; S. Rajaratnam School of International Studies, Nanyang Technological University: Singapore, 2018. [Google Scholar]
- Verrall, N.; Mason, D. The Taming of the Shrewd. How Can the Military Tackle Sophistry, ‘Fake’ News and Post-Truth in the Digital Age? RUSI J. 2018, 163, 20–28. [Google Scholar] [CrossRef]
- Gallacher, J.D.; Barash, V.; Howard, P.N.; Kelly, J. Junk News on Military Affairs and National Security: Social Media Disinformation Campaigns Against US Military Personnel and Veterans; Data Memo 2017.9; Project on Computational Propaganda; Oxford Internet Institute, University of Oxford: Oxford, UK, 2017. [Google Scholar]
- Kwanda, F.A.; Lin, T.T.C. Fake news practices in Indonesian newsrooms during and after the Palu earthquake: A hierarchy-of-influences approach. Inf. Commun. Soc. 2020, 23, 849–866. [Google Scholar] [CrossRef]
- Hunt, K.; Agarwal, P.; Aziz, R.A.; Zhuang, D.J. Fighting Fake News during Disasters. OR/MS Today 2020, 47, 34–39. [Google Scholar] [CrossRef]
- Sawano, T.; Ozaki, A.; Hori, A.; Tsubokura, M. Combating ‘fake news’ and social stigma after the Fukushima Daiichi Nuclear Power Plant incident—The importance of accurate longitudinal clinical data. QJM Int. J. Med. 2019, 112, 479–481. [Google Scholar] [CrossRef]
- Naeem, S.B.; Bhatti, R.; Khan, A. An exploration of how fake news is taking over social media and putting public health at risk. Health Inf. Libr. J. 2020. [Google Scholar] [CrossRef] [PubMed]
- Anoop, K.; Deepak, P.; Lajish, V.L. Emotion Cognizance Improves Health Fake News Identification. In Proceedings of the 24th Symposium on International Database Engineering & Applications, Seoul, Korea, 12–14 August 2020; Association for Computing Machinery: New York, NY, USA, 2020. IDEAS ’20. [Google Scholar] [CrossRef]
- Dai, E.; Sun, Y.; Wang, S. Ginger Cannot Cure Cancer: Battling Fake Health News with a Comprehensive Data Repository. In Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, ICWSM 2020, Atlanta, GA, USA, 8–11 June 2020; Choudhury, M.D., Chunara, R., Culotta, A., Welles, B.F., Eds.; AAAI Press: Palo Alto, CA, USA, 2020; pp. 853–862. [Google Scholar]
- Mesquita, C.T.; Oliveira, A.; Seixas, F.L.; Paes, A. Infodemia, Fake News and Medicine: Science and The Quest for Truth. Int. J. Cardiovasc. Sci. 2020, 33, 203–205. [Google Scholar] [CrossRef]
- Hansen, P.R.; Schmidtblaicher, M. A Dynamic Model of Vaccine Compliance: How Fake News Undermined the Danish HPV Vaccine Program. J. Bus. Econ. Stat. 2021, 39, 259–271. [Google Scholar] [CrossRef]
- Loomba, S.; de Figueiredo, A.; Piatek, S.J.; de Graaf, K.; Larson, H.J. Measuring the impact of COVID-19 vaccine misinformation on vaccination intent in the UK and USA. Nat. Hum. Behav. 2021. [Google Scholar] [CrossRef]
- The Lancet Infectious Diseases. The COVID-19 infodemic. Lancet Infect. Dis. 2020, 20, 875. [Google Scholar] [CrossRef]
- Ceron, W.; de Lima-Santos, M.F.; Quiles, M.G. Fake news agenda in the era of COVID-19: Identifying trends through fact-checking content. Online Soc. Netw. Media 2021, 21, 100116. [Google Scholar] [CrossRef]
- Englmeier, K. The Role of Text Mining in Mitigating the Threats from Fake News and Misinformation in Times of Corona. Procedia Comput. Sci. 2021, 181, 149–156. [Google Scholar] [CrossRef]
- Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151. [Google Scholar] [CrossRef] [PubMed]
- Rubin, V.L.; Conroy, N. Discerning truth from deception: Human judgments and automation efforts. First Monday 2012, 17. [Google Scholar] [CrossRef]
- Alonso García, S.; Gómez García, G.; Sanz Prieto, M.; Moreno Guerrero, A.J.; Rodríguez Jiménez, C. The Impact of Term Fake News on the Scientific Community. Scientific Performance and Mapping in Web of Science. Soc. Sci. 2020, 9, 73. [Google Scholar] [CrossRef]
- De keersmaecker, J.; Roets, A. ‘Fake news’: Incorrect, but hard to correct. The role of cognitive ability on the impact of false information on social impressions. Intelligence 2017, 65, 107–110. [Google Scholar] [CrossRef]
- Tang, J.; Chang, Y.; Liu, H. Mining social media with social theories: A survey. SIGKDD Explor. 2013, 15, 20–29. [Google Scholar] [CrossRef]
- Potthast, M.; Kiesel, J.; Reinartz, K.; Bevendorff, J.; Stein, B. A Stylometric Inquiry into Hyperpartisan and Fake News. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, 15–20 July 2018; Volume 1: Long Papers. Gurevych, I., Miyao, Y., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 231–240. [Google Scholar] [CrossRef] [Green Version]
- Parikh, S.B.; Atrey, P.K. Media-Rich Fake News Detection: A Survey. In Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, MIPR 2018, Miami, FL, USA, 10–12 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 436–441. [Google Scholar] [CrossRef]
- Wu, Y.; Agarwal, P.K.; Li, C.; Yang, J.; Yu, C. Toward Computational Fact-Checking. Proc. VLDB Endow. 2014, 7, 589–600. [Google Scholar] [CrossRef] [Green Version]
- Ciampaglia, G.L.; Shiralkar, P.; Rocha, L.M.; Bollen, J.; Menczer, F.; Flammini, A. Computational Fact Checking from Knowledge Networks. PLoS ONE 2015, 10, e0128193. [Google Scholar] [CrossRef]
- Magdy, A.; Wanas, N.M. Web-based statistical fact checking of textual documents. In Proceedings of the 2nd international workshop on Search and mining user-generated contents, SMUC@CIKM 2010, Toronto, ON, Canada, 30 October 2010; Cortizo, J.C., Carrero, F.M., Cantador, I., Jiménez, J.A.T., Rosso, P., Eds.; ACM: New York, NY, USA, 2010; pp. 103–110. [Google Scholar] [CrossRef]
- Lease, M. Fact Checking and Information Retrieval. In Proceedings of the First Biennial Conference on Design of Experimental Search & Information Retrieval Systems, Bertinoro, Italy, 28–31 August 2018; Alonso, O., Silvello, G., Eds.; CEUR-WS.org, 2018; CEUR Workshop Proceedings, Volume 2167, pp. 97–98. [Google Scholar]
- Hardalov, M.; Arora, A.; Nakov, P.; Augenstein, I. A Survey on Stance Detection for Mis- and Disinformation Identification. arXiv 2021, arXiv:2103.00242. [Google Scholar]
- Lillie, A.E.; Middelboe, E.R. Fake News Detection using Stance Classification: A Survey. arXiv 2019, arXiv:1907.00181. [Google Scholar]
- Vosoughi, S.; Mohsenvand, M.N.; Roy, D. Rumor Gauge: Predicting the Veracity of Rumors on Twitter. ACM Trans. Knowl. Discov. Data 2017, 11, 50:1–50:36. [Google Scholar] [CrossRef]
- Li, Y.; Gao, J.; Meng, C.; Li, Q.; Su, L.; Zhao, B.; Fan, W.; Han, J. A Survey on Truth Discovery. SIGKDD Explor. 2015, 17, 1–16. [Google Scholar] [CrossRef]
- Ott, M.; Cardie, C.; Hancock, J.T. Negative Deceptive Opinion Spam. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013, Atlanta, GA, USA, 9–14 June 2013; Vanderwende, L., Daumé, H., III, Kirchhoff, K., Eds.; The Association for Computational Linguistics: Stroudsburg, PA, USA, 2013; pp. 497–501. [Google Scholar]
- Peng, Q.; Zhong, M. Detecting Spam Review through Sentiment Analysis. J. Softw. 2014, 9, 2065–2072. [Google Scholar] [CrossRef]
- Wu, Y.; Ngai, E.W.T.; Wu, P.; Wu, C. Fake online reviews: Literature review, synthesis, and directions for future research. Decis. Support Syst. 2020, 132, 113280. [Google Scholar] [CrossRef]
- Elmogy, A.M.; Tariq, U.; Mohammed, A.; Ibrahim, A. Fake Reviews Detection using Supervised Machine Learning. Int. J. Adv. Comput. Sci. Appl. 2021, 12. [Google Scholar] [CrossRef]
- Latah, M. Detection of malicious social bots: A survey and a refined taxonomy. Expert Syst. Appl. 2020, 151, 113383. [Google Scholar] [CrossRef]
- Ferrara, E.; Varol, O.; Davis, C.A.; Menczer, F.; Flammini, A. The rise of social bots. Commun. ACM 2016, 59, 96–104. [Google Scholar] [CrossRef] [Green Version]
- Castillo, C.; Mendoza, M.; Poblete, B. Information credibility on twitter. In Proceedings of the 20th International Conference on World Wide Web, WWW 2011, Hyderabad, India, 28 March–1 April 2011; Srinivasan, S., Ramamritham, K., Kumar, A., Ravindra, M.P., Bertino, E., Kumar, R., Eds.; ACM: New York, NY, USA, 2011; pp. 675–684. [Google Scholar] [CrossRef]
- Viviani, M.; Pasi, G. Credibility in social media: Opinions, news, and health information—A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2017, 7. [Google Scholar] [CrossRef]
- Schmidt, A.; Wiegand, M. A Survey on Hate Speech Detection using Natural Language Processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, SocialNLP@EACL 2017, Valencia, Spain, 3 April 2017; Ku, L., Li, C., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 1–10. [Google Scholar] [CrossRef] [Green Version]
- Fortuna, P.; Nunes, S. A Survey on Automatic Detection of Hate Speech in Text. ACM Comput. Surv. 2018, 51, 85:1–85:30. [Google Scholar] [CrossRef]
- Alrehili, A. Automatic Hate Speech Detection on Social Media: A Brief Survey. In Proceedings of the 16th IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2019, Abu Dhabi, United Arab Emirates, 3–7 November 2019; IEEE Computer Society: Washington, DC, USA, 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Rubin, V.L.; Chen, Y.; Conroy, N.J. Deception detection for news: Three types of fakes. In Proceedings of the Information Science with Impact: Research in and for the Community—78th ASIS&T Annual Meeting, ASIST 2015, St. Louis, MO, USA, 6–10 October 2015; Wiley: Hoboken, NJ, USA, 2015; Volume 52, pp. 1–4. [Google Scholar] [CrossRef]
- Ott, M.; Choi, Y.; Cardie, C.; Hancock, J.T. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, Portland, OR, USA, 19–24 June 2011; Lin, D., Matsumoto, Y., Mihalcea, R., Eds.; The Association for Computer Linguistics: Stroudsburg, PA, USA, 2011; pp. 309–319. [Google Scholar]
- Ferreira, W.; Vlachos, A. Emergent: A novel data-set for stance classification. In Proceedings of the NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; Knight, K., Nenkova, A., Rambow, O., Eds.; The Association for Computational Linguistics: Stroudsburg, PA, USA, 2016; pp. 1163–1168. [Google Scholar] [CrossRef]
- Zubiaga, A.; Liakata, M.; Procter, R.; Wong Sak Hoi, G.; Tolmie, P. Analysing How People Orient to and Spread Rumours in Social Media by Looking at Conversational Threads. PLoS ONE 2016, 11, e0150989. [Google Scholar] [CrossRef] [PubMed]
- Ma, J.; Gao, W.; Mitra, P.; Kwon, S.; Jansen, B.J.; Wong, K.; Cha, M. Detecting Rumors from Microblogs with Recurrent Neural Networks. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016; Kambhampati, S., Ed.; IJCAI/AAAI Press: Palo Alto, CA, USA, 2016; pp. 3818–3824. [Google Scholar]
- Tacchini, E.; Ballarin, G.; Vedova, M.L.D.; Moret, S.; de Alfaro, L. Some Like it Hoax: Automated Fake News Detection in Social Networks. arXiv 2017, arXiv:1704.07506. [Google Scholar]
- Pérez-Rosas, V.; Mihalcea, R. Experiments in Open Domain Deception Detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, 17–21 September 2015; Màrquez, L., Callison-Burch, C., Su, J., Pighin, D., Marton, Y., Eds.; The Association for Computational Linguistics: Stroudsburg, PA, USA, 2015; pp. 1120–1125. [Google Scholar] [CrossRef]
- Vlachos, A.; Riedel, S. Fact Checking: Task definition and dataset construction. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, Baltimore, MD, USA, 26 June 2014; pp. 18–22. [Google Scholar] [CrossRef]
- Santia, G.C.; Williams, J.R. BuzzFace: A News Veracity Dataset with Facebook User Commentary and Egos. In Proceedings of the Twelfth International Conference on Web and Social Media, ICWSM 2018, Stanford, CA, USA, 25–28 June 2018; AAAI Press: Palo Alto, CA, USA, 2018; pp. 531–540. [Google Scholar]
- Wang, W.Y. “Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, BC, Canada, 30 July–4 August 2017; Volume 2: Short Papers. Barzilay, R., Kan, M., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 422–426. [Google Scholar] [CrossRef]
- Rashkin, H.; Choi, E.; Jang, J.Y.; Volkova, S.; Choi, Y. Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, 9–11 September 2017; Palmer, M., Hwa, R., Riedel, S., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 2931–2937. [Google Scholar] [CrossRef] [Green Version]
- Khan, J.Y.; Khondaker, M.T.I.; Iqbal, A.; Afroz, S. A Benchmark Study on Machine Learning Methods for Fake News Detection. arXiv 2019, arXiv:1905.04749. [Google Scholar]
- Mitra, T.; Gilbert, E. CREDBANK: A Large-Scale Social Media Corpus with Associated Credibility Annotations. In Proceedings of the Ninth International Conference on Web and Social Media, ICWSM 2015, University of Oxford, Oxford, UK, 26–29 May 2015; Cha, M., Mascolo, C., Sandvig, C., Eds.; AAAI Press: Palo Alto, CA, USA, 2015; pp. 258–267. [Google Scholar]
- Popat, K.; Mukherjee, S.; Strötgen, J.; Weikum, G. Credibility Assessment of Textual Claims on the Web. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, IN, USA, 24–28 October 2016; ACM: New York, NY, USA, 2016. CIKM ’16. pp. 2173–2178. [Google Scholar] [CrossRef]
- Thorne, J.; Vlachos, A.; Christodoulopoulos, C.; Mittal, A. FEVER: A Large-scale Dataset for Fact Extraction and VERification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, 1–6 June 2018; Association for Computational Linguistics: New Orleans, LA, USA, 2018; pp. 809–819. [Google Scholar] [CrossRef] [Green Version]
- Monteiro, R.A.; Santos, R.L.S.; Pardo, T.A.S.; de Almeida, T.A.; Ruiz, E.E.S.; Vale, O.A. Contributions to the Study of Fake News in Portuguese: New Corpus and Automatic Detection Results. In Proceedings of the Computational Processing of the Portuguese Language-13th International Conference, PROPOR 2018, Canela, Brazil, 24–26 September 2018; Villavicencio, A., Moreira, V.P., Abad, A., de Medeiros Caseli, H., Gamallo, P., Ramisch, C., Oliveira, H.G., Paetzold, G.H., Eds.; Springer: Berlin/Heidelberg, Germany, 2018; Lecture Notes in Computer Science, Volume 11122, pp. 324–334. [Google Scholar] [CrossRef]
- Silva, R.M.; Santos, R.L.; Almeida, T.A.; Pardo, T.A. Towards automatically filtering fake news in Portuguese. Expert Syst. Appl. 2020, 146, 113199. [Google Scholar] [CrossRef]
- Pathak, A.; Srihari, R.K. BREAKING! Presenting Fake News Corpus for Automated Fact Checking. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, 28 July–2 August 2019; Volume 2: Student Research Workshop. Alva-Manchego, F.E., Choi, E., Khashabi, D., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 357–362. [Google Scholar] [CrossRef]
- Nørregaard, J.; Horne, B.D.; Adali, S. NELA-GT-2018: A Large Multi-Labelled News Dataset for the Study of Misinformation in News Articles. In Proceedings of the Thirteenth International Conference on Web and Social Media, ICWSM 2019, Munich, Germany, 11–14 June 2019; Pfeffer, J., Budak, C., Lin, Y., Morstatter, F., Eds.; AAAI Press: Palo Alto, CA, USA, 2019; pp. 630–638. [Google Scholar]
- Gruppi, M.; Horne, B.D.; Adali, S. NELA-GT-2019: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles. arXiv 2020, arXiv:2003.08444. [Google Scholar]
- Salem, F.K.A.; Feel, R.A.; Elbassuoni, S.; Jaber, M.; Farah, M. FA-KES: A Fake News Dataset around the Syrian War. In Proceedings of the Thirteenth International Conference on Web and Social Media, ICWSM 2019, Munich, Germany, 11–14 June 2019; Pfeffer, J., Budak, C., Lin, Y., Morstatter, F., Eds.; AAAI Press: Palo Alto, CA, USA, 2019; pp. 573–582. [Google Scholar]
- Posadas-Durán, J.P.; Gómez-Adorno, H.; Sidorov, G.; Escobar, J.J.M. Detection of fake news in a new corpus for the Spanish language. J. Intell. Fuzzy Syst. 2019, 36, 4869–4876. [Google Scholar] [CrossRef]
- Shu, K.; Mahudeswaran, D.; Wang, S.; Lee, D.; Liu, H. FakeNewsNet: A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media. Big Data 2020, 8, 171–188. [Google Scholar] [CrossRef] [PubMed]
- Zhang, X.; Cao, J.; Li, X.; Sheng, Q.; Zhong, L.; Shu, K. Mining Dual Emotion for Fake News Detection. In Proceedings of The Web Conference 2021, WWW 2021; ACM: New York, NY, USA, 2021. [Google Scholar] [CrossRef]
- Hanselowski, A.; PVS, A.; Schiller, B.; Caspelherr, F.; Chaudhuri, D.; Meyer, C.M.; Gurevych, I. A Retrospective Analysis of the Fake News Challenge Stance-Detection Task. In Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, NM, USA, 20–26 August 2018; Bender, E.M., Derczynski, L., Isabelle, P., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 1859–1874. [Google Scholar]
- Bevendorff, J.; Ghanem, B.; Giachanou, A.; Kestemont, M.; Manjavacas, E.; Markov, I.; Mayerl, M.; Potthast, M.; Rangel, F.; Rosso, P.; et al. Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection. In Proceedings of the 11th International Conference of the CLEF Association (CLEF 2020), Thessaloniki, Greece, 22–25 September 2020; Arampatzis, A., Kanoulas, E., Tsikrika, T., Vrochidis, S., Joho, H., Lioma, C., Eickhoff, C., Névéol, A., Cappellato, L., Ferro, N., Eds.; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
- Amjad, M.; Sidorov, G.; Zhila, A.; Gelbukh, A.; Rosso, P. UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu. In Forum for Information Retrieval Evaluation; Association for Computing Machinery: New York, NY, USA, 2020; FIRE 2020; pp. 37–40. [Google Scholar] [CrossRef]
- Patwa, P.; Bhardwaj, M.; Guptha, V.; Kumari, G.; Sharma, S.; PYKL, S.; Das, A.; Ekbal, A.; Akhtar, S.; Chakraborty, T. Overview of CONSTRAINT 2021 Shared Tasks: Detecting English COVID-19 Fake News and Hindi Hostile Posts. In Proceedings of the First Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT); Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar]
- Patwa, P.; Sharma, S.; Srinivas, P.; Guptha, V.; Kumari, G.; Akhtar, M.S.; Ekbal, A.; Das, A.; Chakraborty, T. Fighting an Infodemic: COVID-19 Fake News Dataset. arXiv 2020, arXiv:2011.03327. [Google Scholar]
- Bhardwaj, M.; Akhtar, M.S.; Ekbal, A.; Das, A.; Chakraborty, T. Hostility Detection Dataset in Hindi. arXiv 2020, arXiv:2011.03588. [Google Scholar]
- Ling, C.X.; Huang, J.; Zhang, H. AUC: A Statistically Consistent and more Discriminating Measure than Accuracy. In Proceedings of the IJCAI-03, Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, Mexico, 9–15 August 2003; Gottlob, G., Walsh, T., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 2003; pp. 519–526. [Google Scholar]
- Wiebe, J.; Bruce, R.F.; O’Hara, T.P. Development and Use of a Gold-Standard Data Set for Subjectivity Classifications. In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, University of Maryland, College Park, MD, USA, 20–26 June 1999; Dale, R., Church, K.W., Eds.; ACL: Stroudsburg, PA, USA, 1999; pp. 246–253. [Google Scholar] [CrossRef] [Green Version]
- Turney, P.D. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; ACL: Stroudsburg, PA, USA, 2002; pp. 417–424. [Google Scholar] [CrossRef]
- Esuli, A.; Sebastiani, F. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, 22–28 May 2006; Calzolari, N., Choukri, K., Gangemi, A., Maegaard, B., Mariani, J., Odijk, J., Tapias, D., Eds.; European Language Resources Association (ELRA): Paris, France, 2006; pp. 417–422. [Google Scholar]
- Miller, G.A. WordNet: A Lexical Database for English. Commun. ACM 1995, 38, 39–41. [Google Scholar] [CrossRef]
- Taboada, M.; Brooke, J.; Tofiloski, M.; Voll, K.D.; Stede, M. Lexicon-Based Methods for Sentiment Analysis. Comput. Linguist. 2011, 37, 267–307. [Google Scholar] [CrossRef]
- Brooke, J.; Tofiloski, M.; Taboada, M. Cross-Linguistic Sentiment Analysis: From English to Spanish. In Proceedings of the Recent Advances in Natural Language Processing, RANLP 2009, Borovets, Bulgaria, 14–16 September 2009; Angelova, G., Bontcheva, K., Mitkov, R., Nicolov, N., Nikolov, N., Eds.; RANLP 2009 Organising Committee/ACL: Stroudsburg, PA, USA, 2009; pp. 50–54. [Google Scholar]
- Thelwall, M.; Buckley, K.; Paltoglou, G.; Cai, D.; Kappas, A. Sentiment strength detection in short informal text. J. Assoc. Inf. Sci. Technol. 2010, 61, 2544–2558. [Google Scholar] [CrossRef] [Green Version]
- Hutto, C.J.; Gilbert, E. VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. In Proceedings of the Eighth International Conference on Weblogs and Social Media, ICWSM 2014, Ann Arbor, MI, USA, 1–4 June 2014; Adar, E., Resnick, P., Choudhury, M.D., Hogan, B., Oh, A.H., Eds.; The AAAI Press: Palo Alto, CA, USA, 2014. [Google Scholar]
- Vilares, D.; Gómez-Rodríguez, C.; Alonso, M.A. Universal, unsupervised (rule-based), uncovered sentiment analysis. Knowl. Based Syst. 2017, 118, 45–55. [Google Scholar] [CrossRef] [Green Version]
- Cambria, E.; Olsher, D.; Rajagopal, D. SenticNet 3: A Common and Common-Sense Knowledge Base for Cognition-Driven Sentiment Analysis. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; Brodley, C.E., Stone, P., Eds.; AAAI Press: Palo Alto, CA, USA, 2014; pp. 1515–1521. [Google Scholar]
- Cambria, E.; Li, Y.; Xing, F.Z.; Poria, S.; Kwok, K. SenticNet 6: Ensemble Application of Symbolic and Subsymbolic AI for Sentiment Analysis. In Proceedings of the CIKM ’20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, 19–23 October 2020; d’Aquin, M., Dietze, S., Hauff, C., Curry, E., Cudré-Mauroux, P., Eds.; ACM: New York, NY, USA, 2020; pp. 105–114. [Google Scholar] [CrossRef]
- Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, EMNLP 2002, Philadelphia, PA, USA, 6–7 July 2002; pp. 79–86. [Google Scholar] [CrossRef] [Green Version]
- Mohammad, S.; Kiritchenko, S.; Zhu, X. NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets. In Proceedings of the 7th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2013, Atlanta, GA, USA, 14–15 June 2013; Diab, M.T., Baldwin, T., Baroni, M., Eds.; The Association for Computer Linguistics: Stroudsburg, PA, USA, 2013; pp. 321–327. [Google Scholar]
- Agarwal, A.; Xie, B.; Vovsha, I.; Rambow, O.; Passonneau, R. Sentiment Analysis of Twitter Data. In Proceedings of the Workshop on Languages in Social Media; Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; LSM ’11; pp. 30–38. [Google Scholar]
- Joshi, M.; Rosé, C.P. Generalizing Dependency Features for Opinion Mining. In Proceedings of the ACL 2009, 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, 2–7 August 2009; Short Papers. The Association for Computer Linguistics: Stroudsburg, PA, USA, 2009; pp. 313–316. [Google Scholar]
- Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C. On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages. J. Assoc. Inf. Sci. Technol. 2015, 66, 1799–1816. [Google Scholar] [CrossRef] [Green Version]
- Kalchbrenner, N.; Grefenstette, E.; Blunsom, P. A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, Baltimore, MD, USA, 22–27 June 2014; Volume 1: Long Papers. The Association for Computer Linguistics: Stroudsburg, PA, USA, 2014; pp. 655–665. [Google Scholar] [CrossRef] [Green Version]
- dos Santos, C.N.; Gatti, M. Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts. In Proceedings of the COLING 2014, 25th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, Dublin, Ireland, 23–29 August 2014; Hajic, J., Tsujii, J., Eds.; ACL: Stroudsburg, PA, USA, 2014; pp. 69–78. [Google Scholar]
- Severyn, A.; Moschitti, A. Twitter Sentiment Analysis with Deep Convolutional Neural Networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, 9–13 August 2015; Baeza-Yates, R., Lalmas, M., Moffat, A., Ribeiro-Neto, B.A., Eds.; ACM: New York, NY, USA, 2015; pp. 959–962. [Google Scholar] [CrossRef]
- Socher, R.; Perelygin, A.; Wu, J.; Chuang, J.; Manning, C.D.; Ng, A.Y.; Potts, C. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, Seattle, WA, USA, 18–21 October 2013; ACL: Stroudsburg, PA, USA, 2013; pp. 1631–1642. [Google Scholar]
- Tai, K.S.; Socher, R.; Manning, C.D. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, Beijing, China, 26–31 July 2015; Volume 1: Long Papers. The Association for Computer Linguistics: Stroudsburg, PA, USA, 2015; pp. 1556–1566. [Google Scholar] [CrossRef] [Green Version]
- Radford, A.; Józefowicz, R.; Sutskever, I. Learning to Generate Reviews and Discovering Sentiment. arXiv 2017, arXiv:1704.01444. [Google Scholar]
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; Volume 1 (Long and Short Papers). Burstein, J., Doran, C., Solorio, T., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
- Dang, N.C.; Moreno-García, M.N.; De la Prieta, F. Sentiment Analysis Based on Deep Learning: A Comparative Study. Electronics 2020, 9, 483. [Google Scholar] [CrossRef] [Green Version]
- Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C. A syntactic approach for opinion mining on Spanish reviews. Nat. Lang. Eng. 2015, 21, 139–163. [Google Scholar] [CrossRef] [Green Version]
- Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C. Supervised sentiment analysis in multilingual environments. Inf. Process. Manag. 2017, 53, 595–607. [Google Scholar] [CrossRef]
- Hamidian, S.; Diab, M.T. Rumor Identification and Belief Investigation on Twitter. In Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, WASSA@NAACL-HLT 2016, San Diego, CA, USA, 16 June 2016; Balahur, A., der Goot, E.V., Vossen, P., Montoyo, A., Eds.; The Association for Computer Linguistics: Stroudsburg, PA, USA, 2016; pp. 3–8. [Google Scholar] [CrossRef] [Green Version]
- Meire, M.; Ballings, M.; den Poel, D.V. The added value of auxiliary data in sentiment analysis of Facebook posts. Decis. Support Syst. 2016, 89, 98–112. [Google Scholar] [CrossRef]
- AbdelFattah, M.; Galal, D.; Hassan, N.; Elzanfaly, D.; Tallent, G. A Sentiment Analysis Tool for Determining the Promotional Success of Fashion Images on Instagram. Int. J. Interact. Mob. Technol. 2017, 11, 66–73. [Google Scholar] [CrossRef] [Green Version]
- Hu, X.; Tang, J.; Gao, H.; Liu, H. Social Spammer Detection with Sentiment Information. In Proceedings of the 2014 IEEE International Conference on Data Mining, ICDM 2014, Shenzhen, China, 14–17 December 2014; Kumar, R., Toivonen, H., Pei, J., Huang, J.Z., Wu, X., Eds.; IEEE Computer Society: Washington, DC, USA, 2014; pp. 180–189. [Google Scholar] [CrossRef]
- Rubin, V.; Conroy, N.; Chen, Y.; Cornwell, S. Fake News or Truth? Using Satirical Cues to Detect Potentially Misleading News. In Proceedings of the Second Workshop on Computational Approaches to Deception Detection; Association for Computational Linguistics: San Diego, CA, USA, 2016; pp. 7–17. [Google Scholar] [CrossRef] [Green Version]
- Zhang, S.; Zhang, X.; Chan, J.; Rosso, P. Irony detection via sentiment-based transfer learning. Inf. Process. Manag. 2019, 56, 1633–1644. [Google Scholar] [CrossRef]
- Arcila-Calderón, C.; Blanco-Herrero, D.; Frías-Vázquez, M.; Seoane-Pérez, F. Refugees Welcome? Online Hate Speech and Sentiments in Twitter in Spain during the Reception of the Boat Aquarius. Sustainability 2021, 13, 2728. [Google Scholar] [CrossRef]
- Li, S.; Li, G.; Law, R.; Paradies, Y. Racism in tourism reviews. Tour. Manag. 2020, 80, 104100. [Google Scholar] [CrossRef]
- Jha, A.; Mamidi, R. When does a compliment become sexist? Analysis and classification of ambivalent sexism using twitter data. In Proceedings of the Second Workshop on NLP and Computational Social Science, NLP+CSS@ACL 2017, Vancouver, BC, Canada, 3 August 2017; Hovy, D., Volkova, S., Bamman, D., Jurgens, D., O’Connor, B., Tsur, O., Dogruöz, A.S., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 7–16. [Google Scholar] [CrossRef]
- Aguilar, S.J.; Baek, C. Sexual harassment in academe is underreported, especially by students in the life and physical sciences. PLoS ONE 2020, 15, e0230312. [Google Scholar] [CrossRef] [PubMed]
- Dessì, D.; Recupero, D.R.; Sack, H. An Assessment of Deep Learning Models and Word Embeddings for Toxicity Detection within Online Textual Comments. Electronics 2021, 10, 779. [Google Scholar] [CrossRef]
- Vilares, D.; Hermo, M.; Alonso, M.A.; Gómez-Rodríguez, C.; Vilares, J. LyS at CLEF RepLab 2014: Creating the State of the Art in Author Influence Ranking and Reputation Classification on Twitter. In Proceedings of the Working Notes for CLEF 2014 Conference, Sheffield, UK, 15–18 September 2014; Cappellato, L., Ferro, N., Halvey, M., Kraaij, W., Eds.; CEUR-WS.org: Aachen, Germany, 2014; Volume 1180, CEUR Workshop Proceedings. pp. 1468–1478. [Google Scholar]
- Bamakan, S.M.H.; Nurgaliev, I.; Qu, Q. Opinion leader detection: A methodological review. Expert Syst. Appl. 2019, 115, 200–222. [Google Scholar] [CrossRef]
- Vilares, D.; Thelwall, M.; Alonso, M.A. The megaphone of the people? Spanish SentiStrength for real-time analysis of political tweets. J. Inf. Sci. 2015, 41, 799–813. [Google Scholar] [CrossRef] [Green Version]
- Alonso, M.A.; Vilares, D. A review on political analysis and social media. Proces. Leng. Nat. 2016, 56, 13–24. [Google Scholar]
- Jang, J.-S.; Lee, B.-I.; Choi, C.-H.; Kim, J.-H.; Seo, D.-M.; Cho, W.-S. Understanding pending issue of society and sentiment analysis using social media. In Proceedings of the 2016 Eighth International Conference on Ubiquitous and Future Networks (ICUFN), Vienna, Austria, 5–8 July 2016; pp. 981–986. [Google Scholar] [CrossRef]
- Etter, M.; Colleoni, E.; Illia, L.; Meggiorin, K.; D’Eugenio, A. Measuring Organizational Legitimacy in Social Media: Assessing Citizens’ Judgments With Sentiment Analysis. Bus. Soc. 2018, 57, 60–97. [Google Scholar] [CrossRef] [Green Version]
- Azar, P.D.; Lo, A.W. The Wisdom of Twitter Crowds: Predicting Stock Market Reactions to FOMC Meetings via Twitter Feeds. J. Portf. Manag. 2016, 42, 123–134. [Google Scholar] [CrossRef] [Green Version]
- Chen, H.; De, P.; Hu, Y.J.; Hwang, B.H. Wisdom of Crowds: The Value of Stock Opinions Transmitted Through Social Media. Rev. Financ. Stud. 2014, 27, 1367–1403. [Google Scholar] [CrossRef]
- Sharma, S.; Jain, A. Role of sentiment analysis in social media security and analytics. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2020, 10. [Google Scholar] [CrossRef]
- Zunic, A.; Corcoran, P.; Spasic, I. Sentiment Analysis in Health and Well-Being: Systematic Review. JMIR Med Inform. 2020, 8, e16023. [Google Scholar] [CrossRef]
- Alamoodi, A.H.; Zaidan, B.B.; Zaidan, A.A.; Albahri, O.S.; Mohammed, K.I.; Malik, R.Q.; Almahdi, E.M.; Chyad, M.A.; Tareq, Z.; Albahri, A.S.; et al. Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: A systematic review. Expert Syst. Appl. 2021, 167, 114155. [Google Scholar] [CrossRef]
- Diakopoulos, N.; Naaman, M.; Kivran-Swaine, F. Diamonds in the rough: Social media visual analytics for journalistic inquiry. In Proceedings of the 5th IEEE Conference on Visual Analytics Science and Technology, IEEE VAST 2010, Salt Lake City, UT, USA, 24–29 October 2010; Part of VisWeek 2010. IEEE Computer Society: Washington, DC, USA, 2010; pp. 115–122. [Google Scholar] [CrossRef]
- AlRubaian, M.A.; Al-Qurishi, M.; Al-Rakhami, M.; Rahman, S.M.M.; Alamri, A. A Multistage Credibility Analysis Model for Microblogs. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015, Paris, France, 25–28 August 2015; Pei, J., Silvestri, F., Tang, J., Eds.; ACM: New York, NY, USA, 2015; pp. 1434–1440. [Google Scholar] [CrossRef]
- Chatterjee, R.; Agarwal, S. Twitter truths: Authenticating analysis of information credibility. In Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 16–18 March 2016; pp. 2352–2357. [Google Scholar]
- Dey, A.; Rafi, R.Z.; Hasan Parash, S.; Arko, S.K.; Chakrabarty, A. Fake News Pattern Recognition using Linguistic Analysis. In Proceedings of the 2018 Joint 7th International Conference on Informatics, Electronics Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision Pattern Recognition (icIVPR), Kitakyushu, Japan, 25–29 June 2018; pp. 305–309. [Google Scholar] [CrossRef]
- Bhutani, B.; Rastogi, N.; Sehgal, P.; Purwar, A. Fake News Detection Using Sentiment Analysis. In Proceedings of the 2019 Twelfth International Conference on Contemporary Computing, IC3 2019, Noida, India, 8–10 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–5. [Google Scholar] [CrossRef]
- Ajao, O.; Bhowmik, D.; Zargari, S. Sentiment Aware Fake News Detection on Online Social Networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, UK, 12–17 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 2507–2511. [Google Scholar] [CrossRef] [Green Version]
- Zubiaga, A.; Liakata, M.; Procter, R. Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media. arXiv 2016, arXiv:1610.07363. [Google Scholar]
- Cui, L.; Wang, S.; Lee, D. SAME: Sentiment-aware multi-modal embedding for detecting fake news. In Proceedings of the ASONAM ’19: International Conference on Advances in Social Networks Analysis and Mining, Vancouver, BC, Canada, 27–30 August 2019; Spezzano, F., Chen, W., Xiao, X., Eds.; ACM: New York, NY, USA, 2019; pp. 41–48. [Google Scholar] [CrossRef]
- Vicario, M.D.; Quattrociocchi, W.; Scala, A.; Zollo, F. Polarization and Fake News: Early Warning of Potential Misinformation Targets. ACM Trans. Web 2019, 13. [Google Scholar] [CrossRef]
- Ross, J.; Thirunarayan, K. Features for Ranking Tweets Based on Credibility and Newsworthiness. In Proceedings of the 2016 International Conference on Collaboration Technologies and Systems, CTS 2016, Orlando, FL, USA, 31 October–4 November 2016; Smari, W.W., Natarian, J., Eds.; IEEE Computer Society: Washington, DC, USA, 2016; pp. 18–25. [Google Scholar] [CrossRef] [Green Version]
- Nakashole, N.; Mitchell, T.M. Language-Aware Truth Assessment of Fact Candidates. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, Baltimore, MD, USA, 22–27 June 2014; Volume 1: Long Papers. The Association for Computer Linguistics: Stroudsburg, PA, USA, 2014; pp. 1009–1019. [Google Scholar] [CrossRef]
- Popat, K.; Mukherjee, S.; Strötgen, J.; Weikum, G. Where the Truth Lies: Explaining the Credibility of Emerging Claims on the Web and Social Media. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 3–7 April 2017; International World Wide Web Conferences Steering Committee: Republic and Canton of Geneva, Switzerland, 2017. WWW ’17 Companion. pp. 1003–1012. [Google Scholar] [CrossRef] [Green Version]
- Hassan, N.; Arslan, F.; Li, C.; Tremayne, M. Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by ClaimBuster. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; ACM: New York, NY, USA, 2017; pp. 1803–1812. [Google Scholar] [CrossRef]
- Alonso, M.A.; Gómez-Rodríguez, C.; Vilares, J. On the Use of Parsing for Named Entity Recognition. Appl. Sci. 2021, 11, 1090. [Google Scholar] [CrossRef]
- Varol, O.; Ferrara, E.; Menczer, F.; Flammini, A. Early detection of promoted campaigns on social media. EPJ Data Sci. 2017, 6, 13. [Google Scholar] [CrossRef] [Green Version]
- Yang, Y.; Zheng, L.; Zhang, J.; Cui, Q.; Li, Z.; Yu, P.S. TI-CNN: Convolutional Neural Networks for Fake News Detection. arXiv 2018, arXiv:1806.00749. [Google Scholar]
- Reis, J.C.S.; Correia, A.; Murai, F.; Veloso, A.; Benevenuto, F.; Cambria, E. Supervised Learning for Fake News Detection. IEEE Intell. Syst. 2019, 34, 76–81. [Google Scholar] [CrossRef]
- Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.J.; Hovy, E.H. Hierarchical Attention Networks for Document Classification. In Proceedings of the NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; Knight, K., Nenkova, A., Rambow, O., Eds.; The Association for Computational Linguistics: Stroudsburg, PA, USA, 2016; pp. 1480–1489. [Google Scholar] [CrossRef] [Green Version]
- Enayet, O.; El-Beltagy, S.R. NileTMRG at SemEval-2017 Task 8: Determining Rumour and Veracity Support for Rumours on Twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval@ACL 2017, Vancouver, BC, Canada, 3–4 August 2017; Bethard, S., Carpuat, M., Apidianaki, M., Mohammad, S.M., Cer, D.M., Jurgens, D., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2017; pp. 470–474. [Google Scholar] [CrossRef] [Green Version]
- Guo, H.; Cao, J.; Zhang, Y.; Guo, J.; Li, J. Rumor Detection with Hierarchical Social Attention Network. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, 22–26 October 2018; Cuzzocrea, A., Allan, J., Paton, N.W., Srivastava, D., Agrawal, R., Broder, A.Z., Zaki, M.J., Candan, K.S., Labrinidis, A., Schuster, A., et al., Eds.; ACM: New York, NY, USA, 2018; pp. 943–951. [Google Scholar] [CrossRef]
- Gorrell, G.; Aker, A.; Bontcheva, K.; Derczynski, L.; Kochkina, E.; Liakata, M.; Zubiaga, A. SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours. In Proceedings of the 13th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2019, Minneapolis, MN, USA, 6–7 June 2019; May, J., Shutova, E., Herbelot, A., Zhu, X., Apidianaki, M., Mohammad, S.M., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 845–854. [Google Scholar] [CrossRef] [Green Version]
- Abonizio, H.Q.; de Morais, J.I.; Tavares, G.M.; Barbon Junior, S. Language-Independent Fake News Detection: English, Portuguese, and Spanish Mutual Features. Future Internet 2020, 12, 87. [Google Scholar] [CrossRef]
- Guibon, G.; Ermakova, L.; Seffih, H.; Firsov, A.; Noé-Bienvenu, G.L. Fake News Detection with Satire. In CICLing: International Conference on Computational Linguistics and Intelligent Text Processing; CICLing: La Rochelle, France, 2019. [Google Scholar]
- Li, Z.; Fan, Y.; Jiang, B.; Lei, T.; Liu, W. A survey on sentiment analysis and opinion mining for social multimedia. Multim. Tools Appl. 2019, 78, 6939–6967. [Google Scholar] [CrossRef]
- Zhou, Z.; Guan, H.; Bhat, M.M.; Hsu, J. Fake News Detection via NLP is Vulnerable to Adversarial Attacks. In Proceedings of the 11th International Conference on Agents and Artificial Intelligence, ICAART 2019, Prague, Czech Republic, 19–21 February 2019; Rocha, A.P., Steels, L., van den Herik, H.J., Eds.; SciTePress: Setubal, Portugal, 2019; Volume 2, pp. 794–800. [Google Scholar] [CrossRef]
- Do, H.H.; Prasad, P.W.C.; Maag, A.; Alsadoon, A. Deep Learning for Aspect-Based Sentiment Analysis: A Comparative Review. Expert Syst. Appl. 2019, 118, 272–299. [Google Scholar] [CrossRef]
- Han, J.; Zhang, Z.; Schuller, B.W. Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives [Review Article]. IEEE Comput. Intell. Mag. 2019, 14, 68–81. [Google Scholar] [CrossRef]
- Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy 2021, 23, 18. [Google Scholar] [CrossRef] [PubMed]
- Blodgett, S.L.; Barocas, S.; Daumé, H., III; Wallach, H.M. Language (Technology) is Power: A Critical Survey of “Bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, 5–10 July 2020; Jurafsky, D., Chai, J., Schluter, N., Tetreault, J.R., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 5454–5476. [Google Scholar] [CrossRef]
- Bender, E.M.; Friedman, B. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Trans. Assoc. Comput. Linguist. 2018, 6, 587–604. [Google Scholar] [CrossRef] [Green Version]
- Kiritchenko, S.; Mohammad, S. Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems. In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, *SEM@NAACL-HLT 2018, New Orleans, LA, USA, 5–6 June 2018; Nissim, M., Berant, J., Lenci, A., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 43–53. [Google Scholar] [CrossRef] [Green Version]
- van der Linden, S.; Panagopoulos, C.; Roozenbeek, J. You are fake news: Political bias in perceptions of fake news. Media Cult. Soc. 2020, 42, 460–470. [Google Scholar] [CrossRef]
- Pennycook, G.; Rand, D.G. The Psychology of Fake News. Trends Cogn. Sci. 2021, 25, 388–402. [Google Scholar] [CrossRef] [PubMed]
| Reference | Language | SA Method | Detection Method | Data Set |
|---|---|---|---|---|
| Diakopoulos et al. (2010) [168] | English | Lexicon + n-gram-based classifier | N/A | Ad-hoc from Twitter |
| Chatterjee and Agarwal (2016) [170] | English and Hinglish | SVM | Pipeline LDA + SA | Ad-hoc from Twitter |
| Ross and Thirunarayan (2016) [177] | English | Ranking SVM | N/A | Ad-hoc from Twitter |
| Hassan et al. (2017) [180] | English | Commercial tool (Alchemy) | SVM | Ad-hoc from US Presidential debates |
| Vosoughi et al. (2018) [63] | English | Lexicon-based | N/A | Ad-hoc from Twitter |
| Reference | Language | SA Method | Detection Method | Data Set | Performance |
|---|---|---|---|---|---|
| Castillo et al. (2011) [84] | English | Lexicon-based | J48 decision tree | Ad-hoc from Twitter | P = 0.861; R = 0.860; F1 = 0.860 |
| AlRubaian et al. (2015) [169] | Arabic | Lexicon-based | Naive Bayes | Ad-hoc from Twitter | P = 0.8624; R = 0.988; F1 = 0.926; Acc = 0.903 |
| Popat et al. (2016) [102] | English | Lexicon-based | Logistic regression | Snopes | Acc = 71.96; AUC = 0.80 |
| Popat et al. (2017) [179] | English | Lexicon-based | CRF | Snopes | Acc = 84.02; AUC = 0.86 |
| Horne and Adali (2017) [10] | English | Lexicon-based | SVM | Silverman’s Buzzfeed Political News | Acc = 0.77 |
| | | | | Ad-hoc (news articles) | Acc = 0.71 |
| Rashkin et al. (2017) [99] | English | Lexicon-based | MaxEnt | Fact Checking | F1 = 0.55 |
| | | | LSTM | | F1 = 0.56 |
| Varol et al. (2017) [182] | English | Lexicon-based | KNN | Ad-hoc from Twitter | Acc = 0.97; F1 = 0.81 |
| Dey et al. (2018) [171] | English | Lexicon-based | KNN | Ad-hoc from Twitter | Acc = 0.66 |
| Yang et al. (2018) [183] | English | N/A | TI-CNN | BS Detector | P = 0.9220; R = 0.9277; F1 = 0.9210 |
| Bhutani et al. (2019) [172] | English | Naive Bayes | Random Forest | LIAR | AUC = 0.63 |
| | | | | Fake vs. real news project | AUC = 0.88 |
| Ajao et al. (2019) [173] | English | Lexicon-based | SVM | Rumors [174] | Acc = 0.86; P = 0.86; R = 0.86; F1 = 0.86 |
| | | | LSTM-HAN | | Acc = 0.86; P = 0.86; R = 0.82; F1 = 0.84 |
| Cui et al. (2019) [175] | English | Rule-based | Ad-hoc deep neural network | FakeNewsNet PolitiFact | F1 = 0.7724 |
| | | | | FakeNewsNet GossipCop | F1 = 0.8042 |
| Reference | Language | SA Method | Detection Method | Data Set | Performance |
|---|---|---|---|---|---|
| Del Vicario et al. (2019) [176] | Italian | Commercial tool (Dandelion API) | Linear regression | Ad-hoc from Facebook | P = 0.90; R = 0.90; FPR = 0.11; F1 = 0.90 |
| | | | Logistic regression | | P = 0.91; R = 0.91; FPR = 0.08; F1 = 0.91 |
| | | | KNN | | P = 0.87; R = 0.87; FPR = 0.13; F1 = 0.87 |
| | | | Decision tree | | P = 0.89; R = 0.89; FPR = 0.12; F1 = 0.89 |
| Reis et al. (2019) [184] | English | Rule-based | KNN | BuzzFace | AUC = 0.80; F1 = 0.75 |
| | | | Naive Bayes | | AUC = 0.72; F1 = 0.75 |
| | | | Random Forest | | AUC = 0.85; F1 = 0.81 |
| | | | SVM | | AUC = 0.79; F1 = 0.76 |
| | | | XGBoost | | AUC = 0.86; F1 = 0.81 |
| Anoop et al. (2020) [55] | English | Lexicon-based | Naive Bayes | HWB | Acc = 0.790 |
| | | | KNN | | Acc = 0.925 |
| | | | SVM | | Acc = 0.900 |
| | | | Random Forest | | Acc = 0.840 |
| | | | Decision tree | | Acc = 0.940 |
| | | | AdaBoost | | Acc = 0.965 |
| | | | CNN | | Acc = 0.910 |
| | | | LSTM | | Acc = 0.920 |
| Zhang et al. (2021) [112] | English | Lexicon-based + Commercial (NVIDIA) | BiGRU | RumourEval-19 [188] | F1 = 0.340 |
| | | | BERT | | F1 = 0.346 |
| | | | NileTMRG | | F1 = 0.342 |
| | Chinese | Lexicon-based + Commercial (Baidu AI) | BiGRU | Weibo-20 | F1 = 0.855 |
| | | | BERT | | F1 = 0.867 |
| | | | HSA-BLSTM | | F1 = 0.908 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alonso, M.A.; Vilares, D.; Gómez-Rodríguez, C.; Vilares, J. Sentiment Analysis for Fake News Detection. Electronics 2021, 10, 1348. https://doi.org/10.3390/electronics10111348