Limitations of Large Language Models in Propaganda Detection Task
Abstract
1. Introduction
1.1. Motivation
1.2. Objective
- Confirm the reproducibility of previous studies that utilized LLMs.
- Test different LLMs on the annotated English dataset containing online news to find spans in text where propaganda techniques were used and classify them.
1.3. Contributions
- We found that propaganda detection in online news is a more difficult task for generative pre-trained transformer (GPT) models than previous research had suggested, in particular a study that used widely available OpenAI GPT models: we could not replicate its results and obtained significantly lower scores.
- We provide a thorough survey, not only of the literature regarding propaganda in general but also more particularly in online news, with a focus on Polish news outlets. We additionally provide an extensive list of Polish organizations that monitor misinformation in Polish online news outlets—such institutions may want to put more focus on automatic propaganda detection in the future.
- We showed that the newest GPT models, in particular gpt-4-0125-preview, can be used for initial propaganda detection in online news at a coarse-grained level, but they still require human supervision: about 25% of the news fragments labeled as propagandist by the LLM and checked by us did not contain any propaganda technique. Used this way, the models can reduce the human labor and time needed to generate new training data for this task.
- We discovered that GPT models can generate output in the convenient form of Python code, which enables faster processing and further analysis.
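The last contribution can be illustrated with a minimal sketch of how such output might be consumed. It assumes the model replies with a Python list of dictionaries; the keys `span` and `technique`, the helper name `parse_model_output`, and the sample reply are hypothetical and are not taken from our prompts.

```python
import ast

FENCE = "`" * 3  # Markdown code-fence marker that chat models often add


def parse_model_output(raw: str) -> list[dict]:
    """Parse a reply assumed to be a Python list of dicts; return [] on failure."""
    lines = raw.strip().splitlines()
    # Drop a surrounding Markdown fence, if the model added one.
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    text = "\n".join(lines).strip()
    try:
        # literal_eval only accepts Python literals, so no code is executed.
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        return []
    if not isinstance(parsed, list):
        return []
    return [item for item in parsed if isinstance(item, dict)]


# Hypothetical reply, for illustration only.
reply = "[{'span': 'stop those refugees; they are terrorists', 'technique': 'Appeal_to_fear-prejudice'}]"
for item in parse_model_output(reply):
    print(item["technique"], "->", item["span"])
```

Using `ast.literal_eval` rather than `eval` is a deliberate choice in this sketch: a malformed or unexpected reply yields an empty list instead of running arbitrary code.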
1.4. Paper Structure
2. Literature Review
2.1. Short Introduction to the History of Propaganda
- A society of cardinals, the overseers of foreign missions; also the College of Propaganda at Rome founded by Pope Urban VIII in 1627 for education of missionary priests.
- Any institution or scheme for propagating a doctrine or system.
- Effort directed systematically toward the gaining of public support for an opinion or a course of action.
- The principles advanced by a propaganda.
2.2. Media Bias, Fact Checking and Propaganda in Online News
- Sentence-level classification (SLC): prediction of at least one propaganda technique at the sentence level; best F1 score—multi-granularity with ReLU (60.98%).
- Fragment-level classification (FLC): identification of a span and the type of propaganda technique; best F1 score—multi-granularity with sigmoid (38.98% for the span task and 22.58% for the full task).
2.3. Propaganda, Media Bias and Fact Checking in Poland
3. Datasets
3.1. Propaganda Techniques Corpus
- Appeal to authority;
- Appeal to fear/prejudice;
- Bandwagon, reductio ad hitlerum;
- Black-and-white fallacy;
- Causal oversimplification;
- Doubt;
- Exaggeration, minimization;
- Flag-waving;
- Loaded language;
- Name calling, labeling;
- Repetition;
- Slogans;
- Thought-terminating cliches;
- Whataboutism, straw men, red herring.
3.2. Polish Online News Corpus (PONC)
4. Methods
4.1. LLM on SemEval2020—English Data, Sprenkamp et al.’s Approach
4.2. LLM on SemEval2020—English Data
- Subtask 1 (SI)—given an article, identify specific fragments that contain at least one propaganda technique.
- Subtask 2 (TC)—given a text fragment identified as propaganda and its document context, identify the applied propaganda technique [41].
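To make the span identification subtask concrete, the following sketch scores predicted spans against gold spans by character overlap. It is a simplified illustration and not the official SemEval scorer, which uses its own partial-overlap formulation; the function names and the offsets in the example are hypothetical.

```python
def char_set(spans):
    """Expand (start, end) character offsets (end exclusive) into a set of positions."""
    chars = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars


def span_scores(predicted, gold):
    """Return character-level (precision, recall, F1) for one document."""
    pred_chars = char_set(predicted)
    gold_chars = char_set(gold)
    overlap = len(pred_chars & gold_chars)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical offsets: the predicted span covers the gold span but is twice as long.
precision, recall, f1 = span_scores(predicted=[(0, 40)], gold=[(10, 30)])
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this example a span that is too long keeps full recall but loses precision, which is how an over-extended fragment is penalized under any overlap-based measure.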
4.3. Propaganda Technique Detection on the PONC Subset with the Use of an LLM
- Binary classification task: whether there was propaganda in the chosen news excerpt; if no propaganda technique was used, we marked the excerpt as “no propaganda”.
- Propaganda technique classification: whether the correct technique was chosen; if not, we added a comment with the technique we suggest instead.
5. Results
5.1. LLM on SemEval2020—English Data, Sprenkamp et al.’s Approach
5.2. LLM on SemEval2020—English Data
5.3. Propaganda Technique Detection in the PONC Subset with the Use of an LLM
- A total of 26 out of 100 fragments were marked by the annotator as not propaganda (accuracy = 74%).
- A total of 23 out of the 74 fragments that did contain propaganda were assigned the wrong propaganda technique (accuracy = 69%).
- The most popular techniques were appeal to fear/prejudice (22) and loaded language (21).
- There were no examples of repetition or of whataboutism, straw men, red herring.
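The two accuracy figures above follow directly from the reported counts, as the short calculation below shows. The variable names are ours, and the combined figure in the last line is derived here for illustration; it is not reported in the text.

```python
# Counts reported above for the manually checked PONC subset.
total_fragments = 100   # fragments labeled as propaganda by the LLM
not_propaganda = 26     # judged by the annotator to contain no propaganda
wrong_technique = 23    # propaganda present, but wrong technique assigned

propaganda = total_fragments - not_propaganda        # 74
correct_technique = propaganda - wrong_technique     # 51

binary_accuracy = propaganda / total_fragments       # 74 / 100
technique_accuracy = correct_technique / propaganda  # 51 / 74
# Derived, not reported: fragments that passed both checks.
both_correct = correct_technique / total_fragments   # 51 / 100

print(f"binary accuracy:    {binary_accuracy:.0%}")     # 74%
print(f"technique accuracy: {technique_accuracy:.0%}")  # 69%
print(f"both correct:       {both_correct:.0%}")        # 51%
```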
6. Error Analysis
- The generated technique name was not included in the provided list.
- The generated technique name was more granular than the provided label, e.g., whataboutism instead of whataboutism, straw men, red herring.
- The output was a description of the used technique instead of the label.
- “I appeal to the government, to the Prime Minister, to all those who make decisions.” (original: Apeluję do rządu, do premiera, do wszystkich, którzy podejmują decyzje.) was mistakenly marked as appeal to authority because it uses the word “appeal” and mentions authorities such as the Prime Minister and the government.
- The fragment “perhaps without the ’Boleks’, i.e., without the opposition activists who secretly collaborated with the political police, many revolutions would have had a bloodier course” was marked as bandwagon, but we think it should be labeled as name calling, labeling (original: być może bez ’Bolków’, czyli bez działaczy opozycyjnych, którzy tajnie współpracowali z policją polityczną, wiele rewolucji miałoby bardziej krwawy przebieg).
- One fragment concerned the Polish and Belarusian border crisis and included emojis of flags of both countries. It was mistakenly marked as flag-waving.
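The first three error types listed above (a name outside the list, a more granular name, or a description instead of a label) can be caught automatically before manual review. The sketch below maps a generated name onto the 14 labels from Appendix A and returns `None` when no mapping is found; the matching rules and the function name `normalize_label` are our assumptions for illustration, not the procedure used in the experiments.

```python
import re

# The 14 technique labels as listed in Appendix A.
LABELS = [
    "Appeal_to_Authority",
    "Appeal_to_fear-prejudice",
    "Bandwagon,Reductio_ad_hitlerum",
    "Black-and-White_Fallacy",
    "Causal_Oversimplification",
    "Doubt",
    "Exaggeration,Minimisation",
    "Flag-Waving",
    "Loaded_Language",
    "Name_Calling,Labeling",
    "Repetition",
    "Slogans",
    "Thought-terminating_Cliches",
    "Whataboutism,Straw_Men,Red_Herring",
]


def _key(text: str) -> str:
    """Lowercase and drop everything except letters, so spacing and case are ignored."""
    return re.sub(r"[^a-z]", "", text.lower())


def normalize_label(generated: str):
    """Return the matching label from LABELS, or None if the output needs manual review."""
    key = _key(generated)
    if not key:
        return None
    for label in LABELS:
        if _key(label) == key:
            return label
    # A more granular name: it equals one component of a grouped label.
    for label in LABELS:
        if key in [_key(part) for part in label.split(",")]:
            return label
    return None


print(normalize_label("whataboutism"))     # Whataboutism,Straw_Men,Red_Herring
print(normalize_label("loaded language"))  # Loaded_Language
print(normalize_label("The text plays on fear of refugees."))  # None
```

Spelling variants (for example, “minimization” with a z) are deliberately left unmatched in this sketch, so that they surface for manual inspection rather than being silently mapped.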
7. Discussion
- In general, gpt-4-0125-preview was often unable to output an accurate span for a propaganda technique: some selected fragments were too long, and the additional text did not provide any context that would help in understanding the detected technique.
- Although the temperature was set to “0”, which should provide more deterministic results, various GPT models were unable to generate the same results for the given prompt and task; therefore, the method should be considered not reproducible.
- The original paper by Sprenkamp et al. [58] does not mention whether the models were run several times; we can only assume that they were run once, in which case the results are not trustworthy.
- The same paper contains no error analysis and does not mention any specific mistakes that the models made when predicting the propaganda techniques.
- Reformulating the original SemEval2020 Task 3 and using the annotated development set as a test set created a risk of data contamination: the models may have been trained on this data, which could explain the significantly better results. The experiment should be conducted on the original test set, for which the gold labels were not released to the public.
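The reproducibility concern raised above can be quantified by repeating the same prompt several times and measuring how often every run assigns the same label to an item. The sketch below shows one possible measure; the function name and the toy labels are hypothetical and serve only to demonstrate the computation, they are not results from our experiments.

```python
def run_agreement(runs):
    """Fraction of items that received the same label in every run.

    `runs` is a list of label lists, one list per repeated run over the same items.
    """
    if not runs or not runs[0]:
        raise ValueError("at least one run with at least one item is required")
    n_items = len(runs[0])
    if any(len(run) != n_items for run in runs):
        raise ValueError("all runs must cover the same items")
    # zip(*runs) groups the labels that each item received across the runs.
    stable = sum(1 for labels in zip(*runs) if len(set(labels)) == 1)
    return stable / n_items


# Toy labels for three items over three hypothetical runs (illustrative only).
toy_runs = [
    ["Doubt", "Slogans", "Repetition"],
    ["Doubt", "Slogans", "Loaded_Language"],
    ["Doubt", "Flag-Waving", "Loaded_Language"],
]
print(f"items labeled identically in all runs: {run_agreement(toy_runs):.0%}")
```

A value well below 100% at temperature 0 would indicate that single-run scores should be reported together with their variation across runs.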
8. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
GPT | Generative pre-trained transformer |
LLM | Large language model |
MBFC | Media Bias/Fact Check |
PiS | Prawo i Sprawiedliwość (Law and Justice) |
PONC | Polish Online News Corpus |
PTC | Propaganda Techniques Corpus |
SemEval | Semantic Evaluation |
SI | Span identification |
SOTA | State of the art |
TC | Technique classification |
Appendix A. Prompts
Appendix A.1. Prompt for Task 2 (LLM on SemEval2020—English Data) and Task 3 (Propaganda Technique Detection in the PONC Subset with the Use of LLM, First Try)—In English
- Appeal_to_Authority
- Appeal_to_fear-prejudice
- Bandwagon,Reductio_ad_hitlerum
- Black-and-White_Fallacy
- Causal_Oversimplification
- Doubt
- Exaggeration,Minimisation
- Flag-Waving
- Loaded_Language
- Name_Calling,Labeling
- Repetition
- Slogans
- Thought-terminating_Cliches
- Whataboutism,Straw_Men,Red_Herring
Appendix A.2. Prompt for Task 2 (LLM on SemEval2020—English Data—Technique Classification)
- Loaded_Language—Uses specific phrases and words that carry strong emotional impact to affect the audience, e.g., ‘a lone lawmaker’s childish shouting.’
- Name_Calling,Labeling—Gives a label to the object of the propaganda campaign as either the audience hates or loves, e.g., ‘Bush the Lesser.’
- Repetition—Repeats the message over and over in the article so that the audience will accept it, e.g., ‘Our great leader is the epitome of wisdom. Their decisions are always wise and just’.
- Exaggeration,Minimisation—Either representing something in an excessive manner or making something seem less important than it actually is, e.g., ‘I was not fighting with her; we were just playing’.
- Appeal_to_fear-prejudice—Builds support for an idea by instilling anxiety and/or panic in the audience towards an alternative, e.g., ‘stop those refugees; they are terrorists.’
- Flag-Waving—Playing on strong national feeling (or with respect to a group, e.g., race, gender, political preference) to justify or promote an action or idea, e.g., ‘entering this war will make us have a better future in our country’.
- Causal_Oversimplification—Assumes a single reason for an issue when there are multiple causes, e.g., ‘If France had not declared war on Germany, World War II would have never happened’.
- Appeal_to_Authority—Supposes that a claim is true because a valid authority or expert on the issue supports it, ‘The World Health Organisation stated, the new medicine is the most effective treatment for the disease.’
- Slogans—A brief and striking phrase that contains labeling and stereotyping, e.g., “Make America great again”!
- Thought-terminating_Cliches—Words or phrases that discourage critical thought and useful discussion about a given topic, e.g., “it is what it is”
- Whataboutism,Straw_Men,Red_Herring—Attempts to discredit an opponent’s position by charging them with hypocrisy without directly disproving their argument, e.g., ‘They want to preserve the FBI’s reputation’.
- Black-and-White_Fallacy—Gives two alternative options as the only possibilities, when actually more options exist, e.g., ‘You must be a Republican or Democrat’.
- Bandwagon,Reductio_ad_hitlerum—Justify actions or ideas because everyone else is doing it, or reject them because it’s favored by groups despised by the target audience, e.g., “Would you vote for Clinton as president? 57% say yes.”
- Doubt—Questioning the credibility of someone or something, e.g., ‘Is he ready to be the Mayor’?
Appendix A.3. Prompt for Task 3, Second Prompt—In Polish, Techniques in English
- Appeal_to_Authority
- Appeal_to_fear-prejudice
- Bandwagon,Reductio_ad_hitlerum
- Black-and-White_Fallacy
- Causal_Oversimplification
- Doubt
- Exaggeration,Minimisation
- Flag-Waving
- Loaded_Language
- Name_Calling,Labeling
- Repetition
- Slogans
- Thought-terminating_Cliches
- Whataboutism,Straw_Men,Red_Herring
Appendix A.4. Prompts for Task 3, Third Prompt—In Polish, Techniques in English, Few-Shot—Examples of Propaganda Techniques in Polish
- Appeal_to_Authority—“Nie ‘zbawiajmy’ świata kosztem Polski, pięknie pisał Prymas Tysiąclecia”
- Appeal_to_fear-prejudice—“Według najnowszych danych agencji badawczej Inquiry, aż 47 proc. respondentów w tej grupie deklaruje, że nie będzie się szczepić. Czy naprawdę w Polsce jesteśmy gotowi ryzykować życiem i zdrowiem naszych dzieci?”
- Bandwagon, Reductio_ad_hitlerum—“Aż 65% badanych uważa, że niepełnoletność matki nie jest argumentem zezwalającym na aborcję”
- Black-and-White_Fallacy—“Była zastępczyni rzecznika praw obywatelskich w rozmowie z Interią stwierdziła, że „potrzebna jest partia, która w sposób pryncypialny podejdzie do kwestii walki z katastrofą klimatyczną i bezkompromisowo do praw zwierząt”.—Bez weganizmu taka perspektywa nie będzie możliwa—oceniła.”
- Causal_Oversimplification—“Dzis Wielki Dzień Pszczół. Ginie ich miliony przez zmiany klimatyczne. A jeśli nadal będziemy je zabijać, np. używając neonikotynoidów to wkrótce będziemy obchodzić Dzień Wspomnienia o Pszczołach.”
- Doubt—“Zadziwiające, że PiS nie potrafi sięgnęć po pieniądze z Funduszu Odbudowy, a mami nam oczy nierealnym odszkodowaniem od Berlina”
- Exaggeration,Minimisation—“Aborcja to tylko zabieg medyczny”
- Flag-Waving—“Już nigdy nie pozwolimy, by na polskiej ziemi stanęła noga rosyjskiego żołnierza”
- Loaded_Language—“Oni się chcą tylko nachapać i nakraść.”
- Name_Calling, Labeling—“Ci zaś, którzy nie pamiętają PRL, mogą sobie skojarzyć styl telewizji Jacka Kurskiego z Chinami albo innymi krajami Wschodu.”
- Repetition
- Slogans—“Stop Ukrainizacji Polski!.”
- Thought-terminating_Cliches—“Taka jest prawda i koniec.”
- Whataboutism, Straw_Men, Red_Herring
References
- Groeling, T. Media Bias by the Numbers: Challenges and Opportunities in the Empirical Study of Partisan News. Annu. Rev. Political Sci. 2013, 16, 129–151.
- Adams, Z.; Osman, M.; Bechlivanidis, C.; Meder, B. (Why) Is Misinformation a Problem? Perspect. Psychol. Sci. 2023, 18, 1436–1463.
- Levak, T. Disinformation in the New Media System—Characteristics, Forms, Reasons for its Dissemination and Potential Means of Tackling the Issue (Croatian: Dezinformacije u novomedijskom sustavu—značajke, oblici, razlozi širenja i potencijalni načini njihova suzbijanja). Med. Istraživanja 2021, 26, 29–58.
- Kotelenets, E.; Barabash, V. Propaganda and Information Warfare in Contemporary World: Definition Problems, Instruments and Historical Context. In Proceedings of the International Conference on Man-Power-Law-Governance: Interdisciplinary Approaches (MPLG-IA 2019), Moscow, Russia, 24–25 September 2019; pp. 374–377.
- Szwoch, J.; Staszkow, M.; Rzepka, R.; Araki, K. Sentiment Analysis of Polish Online News Covering Controversial Topics—Comparison Between Lexicon and Statistical Approaches. In Proceedings of the Language Technology Conference (LTC’23), Poznań, Poland, 21–23 April 2023; pp. 277–281.
- Szwoch, J.; Staszkow, M.; Rzepka, R.; Araki, K. Can LLMs Determine Political Leaning of Polish News Articles? In Proceedings of the 10th IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE’23), Yanuca Island, Fiji, 4–6 December 2023.
- Mitts, T.; Phillips, G.; Walter, B.F. Studying the Impact of ISIS Propaganda Campaigns. J. Politics 2022, 84, 1220–1225.
- Pavlíková, M.; Šenkýřová, B.; Drmola, J. Propaganda and Disinformation Go Online. In Challenging Online Propaganda and Disinformation in the 21st Century; Springer International Publishing: Cham, Switzerland, 2021; pp. 43–74.
- Khaldarova, I.; Pantti, M. Fake News. J. Pract. 2016, 10, 891–901.
- Woolley, S. Digital Propaganda: The Power of Influencers. J. Democr. 2022, 33, 115–129.
- Media Bias/Fact Check—Poland Government and Media Profile. 2023. Available online: https://mediabiasfactcheck.com/poland-media-profile/ (accessed on 20 March 2024).
- Reporters Without Borders—Poland. 2023. Available online: https://rsf.org/en/country/poland (accessed on 20 March 2024).
- Szwoch, J.; Staszkow, M.; Rzepka, R.; Araki, K. Creation of Polish Online News Corpus for Political Polarization Studies. In Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences, Marseille, France, 20–25 June 2022; Afli, H., Alam, M., Bouamor, H., Casagran, C.B., Boland, C., Ghannay, S., Eds.; European Language Resources Association: Marseille, France; pp. 86–90.
- Bernays, E. Propaganda; Horace Liveright: New York, NY, USA, 1928.
- Stanley, J. How Propaganda Works; Princeton University Press: Princeton, NJ, USA, 2015.
- Ellul, J. Propaganda: The Formation of Men’s Attitudes; A Borzoi book; Knopf Doubleday Publishing Group: New York, NY, USA, 1965.
- Chomsky, N. Media Control: The Spectacular Achievements of Propaganda; Open Media Series; Seven Stories Press: New York, NY, USA, 2002.
- Herman, E.; Chomsky, N. Manufacturing Consent: The Political Economy of the Mass Media; Pantheon Books: New York, NY, USA, 2002.
- Tversky, A.; Kahneman, D. Judgment under Uncertainty: Heuristics and Biases. In Utility, Probability, and Human Decision Making: Selected Proceedings of an Interdisciplinary Research Conference, Rome, Italy, 3–6 September 1973; Wendt, D., Vlek, C., Eds.; Springer: Dordrecht, The Netherlands, 1975; pp. 141–162.
- Hamborg, F.; Donnay, K.; Gipp, B. Automated identification of media bias in news articles: An interdisciplinary literature review. Int. J. Digit. Libr. 2018, 20, 391–415.
- Nakov, P.; Sencar, H.T.; An, J.; Kwak, H. A Survey on Predicting the Factuality and the Bias of News Media. arXiv 2021, arXiv:2103.12506.
- Guo, Z.; Schlichtkrull, M.; Vlachos, A. A Survey on Automated Fact-Checking. Trans. Assoc. Comput. Linguist. 2022, 10, 178–206.
- Rashkin, H.; Choi, E.; Jang, J.Y.; Volkova, S.; Choi, Y. Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; Palmer, M., Hwa, R., Riedel, S., Eds.; Association for Computational Linguistics: Copenhagen, Denmark, 2017; pp. 2931–2937.
- Aimeur, E.; Amri, S.; Brassard, G. Fake news, disinformation and misinformation in social media: A review. Soc. Netw. Anal. Min. 2023, 13, 30.
- Alam, F.; Cresci, S.; Chakraborty, T.; Silvestri, F.; Dimitrov, D.; Martino, G.D.S.; Shaar, S.; Firooz, H.; Nakov, P. A Survey on Multimodal Disinformation Detection. arXiv 2022, arXiv:2103.12541.
- Czym Jest Fact-Checking?—Zarys Inicjatyw na Świecie i w Polsce (English: What Is Fact-Checking?—Outline of Initiatives in the World and in Poland). 2019. Available online: https://cyberpolicy.nask.pl/czym-jest-fact-checking-zarys-inicjatyw-na-swiecie-i-w-polsce/ (accessed on 20 March 2024).
- Paul, R.; Elder, L. The Thinker’s Guide for Conscientious Citizens on How to Detect Media Bias & Propaganda in National and World News: Based on Critical Thinking Concepts & Tools; Thinker’s Guide Series; Rowman & Littlefield: Lanham, MD, USA, 2004.
- Huang, K.H.; McKeown, K.; Nakov, P.; Choi, Y.; Ji, H. Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation. arXiv 2023, arXiv:2203.05386.
- Barrón-Cedeño, A.; Jaradat, I.; Da San Martino, G.; Nakov, P. Proppy: Organizing the news based on their propagandistic content. Inf. Process. Manag. 2019, 56, 1849–1864.
- Da San Martino, G.; Yu, S.; Barrón-Cedeño, A.; Petrov, R.; Nakov, P. Fine-Grained Analysis of Propaganda in News Articles. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019.
- Yu, S.; Martino, G.D.S.; Nakov, P. Experiments in Detecting Persuasion Techniques in the News. arXiv 2019, arXiv:1911.06815.
- Weston, A. A Rulebook for Arguments; Hackett: Indianapolis, IN, USA, 2009.
- Torok, R. Symbiotic radicalisation strategies: Propaganda tools and neuro linguistic programming. In Proceedings of the 8th Australian Security and Intelligence Conference, Perth, Australia, 30 November–2 December 2015.
- Teninbaum, G. Reductio ad Hitlerum: Trumping the Judicial Nazi Card. In Michigan State Law Review; HeinOnline: New York, NY, USA, 2009; p. 541.
- Jowett, G.; O’Donnell, V. Propaganda and Persuasion; Advances in Political Science; SAGE Publications; SAGE: Thousand Oaks, CA, USA, 1986.
- Hobbs, R. Teaching about Propaganda: An Examination of the Historical Roots of Media Literacy. J. Media Lit. Educ. 2014, 6, 56–67.
- Goodwin, J.; McKerrow, R. Accounting for the Force of the Appeal to Authority; Iowa State University Press: Ames, IA, USA, 2011.
- Richter, M.L. The Kremlin’s Platform for ‘Useful Idiots’ in the West: An Overview of RT’s Editorial Strategy and Evidence of Impact; Technical report; Kremlin Watch: Politico, France, 2017.
- Hunter, J. Brainwashing in a Large Group Awareness Training? The Classical Conditioning Hypothesis of Brainwashing. Ph.D. Dissertation, University of KwaZulu-Natal, Pietermaritzburg, South Africa, 2015.
- Dan, L. Techniques for the Translation of Advertising Slogans. In Proceedings of the International Conference Literature, Discourse and Multicultural Dialogue, LDMD ’15, Tîrgu-Mureș, Mureș, 3–4 December 2015; pp. 12–23.
- Da San Martino, G.; Barrón-Cedeño, A.; Nakov, P. Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, Hong Kong, China, 4 November 2019; Feldman, A., Da San Martino, G., Barrón-Cedeño, A., Brew, C., Leberknight, C., Nakov, P., Eds.; Association for Computational Linguistics: Hong Kong, China; pp. 162–170.
- Da San Martino, G.; Barrón-Cedeño, A.; Wachsmuth, H.; Petrov, R.; Nakov, P. SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, Barcelona, Spain, 12–13 December 2020.
- Martino, G.D.S.; Cresci, S.; Barrón-Cedeño, A.; Yu, S.; Pietro, R.D.; Nakov, P. A Survey on Computational Propaganda Detection. arXiv 2020, arXiv:2007.08024.
- Morio, G.; Morishita, T.; Ozaki, H.; Miyoshi, T. Hitachi at SemEval-2020 Task 11: An Empirical Study of Pre-Trained Transformer Family for Propaganda Detection. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, Barcelona, Spain, 12–13 December 2020; Herbelot, A., Zhu, X., Palmer, A., Schneider, N., May, J., Shutova, E., Eds.; International Committee for Computational Linguistics: Barcelona, Spain, 2020; pp. 1739–1748.
- Jurkiewicz, D.; Borchmann, Ł.; Kosmala, I.; Graliński, F. ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, Barcelona, Spain, 12–13 December 2020; Herbelot, A., Zhu, X., Palmer, A., Schneider, N., May, J., Shutova, E., Eds.; pp. 1415–1424.
- Abdullah, M.; Altiti, O.; Obiedat, R. Detecting Propaganda Techniques in English News Articles using Pre-trained Transformers. In Proceedings of the 2022 13th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 21–23 June 2022; pp. 301–308.
- Piskorski, J.; Stefanovitch, N.; Da San Martino, G.; Nakov, P. SemEval-2023 Task 3: Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 2343–2361.
- Piskorski, J.; Stefanovitch, N.; Nikolaidis, N.; Da San Martino, G.; Nakov, P. Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023; Volume 1: Long Papers. Rogers, A., Boyd-Graber, J., Okazaki, N., Eds.; pp. 3001–3022.
- Piskorski, J.; Stefanovitch, N.; Bausier, V.A.; Faggiani, N.; Linge, J.; Kharazi, S.; Nikolaidis, N.; Teodori, G.; De Longueville, B.; Doherty, B.; et al. News Categorization, Framing and Persuasion Techniques: Annotation Guidelines; Technical report; European Commission Joint Research Centre: Ispra, Italy, 2023.
- Koreeda, Y.; Yokote, K.i.; Ozaki, H.; Yamaguchi, A.; Tsunokake, M.; Sogawa, Y. Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; pp. 1702–1711.
- Pauli, A.; Sarabia, R.; Derczynski, L.; Assent, I. TeamAmpa at SemEval-2023 Task 3: Exploring Multilabel and Multilingual RoBERTa Models for Persuasion and Framing Detection. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; pp. 847–855.
- Wu, B.; Razuvayevskaya, O.; Heppell, F.; Leite, J.A.; Scarton, C.; Bontcheva, K.; Song, X. SheffieldVeraAI at SemEval-2023 Task 3: Mono and Multilingual Approaches for News Genre, Topic and Persuasion Technique Classification. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 1995–2008.
- Falk, N.; Eichel, A.; Piccirilli, P. NAP at SemEval-2023 Task 3: Is Less Really More? (Back-)Translation as Data Augmentation Strategies for Detecting Persuasion Techniques. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 1433–1446.
- Hromadka, T.; Smolen, T.; Remis, T.; Pecher, B.; Srba, I. KInITVeraAI at SemEval-2023 Task 3: Simple yet Powerful Multilingual Fine-Tuning for Persuasion Techniques Detection. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 629–637.
- Rodrigo-Ginés, F.J.; Plaza, L.; Carrillo-de Albornoz, J. UnedMediaBiasTeam @ SemEval-2023 Task 3: Can We Detect Persuasive Techniques Transferring Knowledge From Media Bias Detection? In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 787–793.
- Modzelewski, A.; Sosnowski, W.; Wilczynska, M.; Wierzbicki, A. DSHacker at SemEval-2023 Task 3: Genres and Persuasion Techniques Detection with Multilingual Data Augmentation through Machine Translation and Text Generation. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada, 9–14 July 2023; Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E., Eds.; pp. 1582–1591.
- Dao, J.; Wang, J.; Zhang, X. YNU-HPCC at SemEval-2020 Task 11: LSTM Network for Detection of Propaganda Techniques in News Articles. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, Online, 12–13 December 2020; Herbelot, A., Zhu, X., Palmer, A., Schneider, N., May, J., Shutova, E., Eds.; pp. 1509–1515.
- Sprenkamp, K.; Jones, D.G.; Zavolokina, L. Large Language Models for Propaganda Detection. arXiv 2023, arXiv:2310.06422.
- Jones, D.G. Detecting Propaganda in News Articles Using Large Language Models. Eng. Open Access 2024, 2, 1–12.
- Hasanain, M.; Ahmed, F.; Alam, F. Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles. arXiv 2024, arXiv:2402.17478.
- Kutt, M. Zderzenie cywilizacji — Skuteczna propaganda czy strach? Stosunek polskich internautów do islamskich uchodźców (English: Clash of civilisations—Effective propaganda or fear? Attitudes of Polish Internet users towards Islamic refugees). Rocz. Kult. 2017, 8, 25–38. [Google Scholar] [CrossRef]
- Pogorzelski, P. Zagrożenie Rosyjską Dezinformacją w Polsce i Formy Przeciwdziałania (English: The Threat of Russian Disinformation in Poland and the Forms of Counteraction); Technical report; Kolegium Europy Wschodniej: Wrocławiu, Poland, 2017. [Google Scholar]
- Dobek-Ostrowska, B. Mediatyzacja polityki w tygodnikach opinii w Polsce—między polityzacją a komercjalizacją (English: Mediatisation of politics in weekly opinion magazines in Poland—between politicisation and commercialisation). Zesz. Prasozn. 2018, 61, 224–246. [Google Scholar] [CrossRef]
- Olechowska, P. Stopień stronniczości polskich dzienników ogólnoinformacyjnych (wybrane wyznaczniki)/Degree of bias in Polish general-information newspapers (selected determinants). Political Prefer. 2017, 61, 107–130. [Google Scholar] [CrossRef]
- Gorwa, R. Computational Propaganda in Poland: False Amplifiers and the Digital Public Sphere; Technical report, Computational Propaganda Research Project; University of Oxford: Oxford, UK, 2017. [Google Scholar]
- Treichel, P. News Propaganda in Poland: Mixed Methods Analysis of the Online News Coverage about the Media Law Proposal Lex TVN. Master’s Thesis, University of Warsaw, Warsaw, Poland, 2022. [Google Scholar]
- Malwina Popiołek, M.H.M.B. Infodemia—An Analysis of Fake News in Polish News Portals and Traditional Media during the Coronavirus Pandemic. Commun. Soc. 2021, 34, 81–89. [Google Scholar] [CrossRef]
- Media Bias/Fact Check—TVP Info—Bias and Credibility. 2023. Available online: https://mediabiasfactcheck.com/tvp-info-bias/ (accessed on 20 March 2024).
- Media Bias/Fact Check—TVN24—Bias and Credibility. 2023. Available online: https://mediabiasfactcheck.com/tvn24-bias/ (accessed on 20 March 2024).
- Gazeta Wyborcza—Bias and Credibility. 2023. Available online: https://mediabiasfactcheck.com/gazeta-wyborcza-bias/ (accessed on 20 March 2024).
- Political Critique—Bias and Credibility. 2023. Available online: https://mediabiasfactcheck.com/political-critique/ (accessed on 20 March 2024).
- FL24.net—Bias and Credibility. 2024. Available online: https://mediabiasfactcheck.com/fl24-net/ (accessed on 20 March 2024).
- Top Websites Ranking—Most Visited News & Media Publishers Websites in Poland. 2024. Available online: https://www.similarweb.com/top-websites/poland/news-and-media/ (accessed on 20 March 2024).
- “Wiadomości” Liderem Programów Informacyjnych. Wszystkie Dzienniki ze Spadkiem ogląDalności (English: “Wiadomości” the Leader of News Programmes. All Daily TV News with Falling Viewing Figures). 2024. Available online: https://www.wirtualnemedia.pl/artykul/propaganda-wiadomosci-prowadzacy-fakty-wydarzenia-wrzesien-2023-rok (accessed on 20 March 2024).
- Kowalska-Chrzanowska, M.; Krysiński, P. Polskie projekty fact-checkingowe demaskujące fałszywe informacje na temat wojny w Ukrainie (English: Polish fact-checking projects exposing false information about the war in Ukraine). Media I SpołEczeńStwo 2022, 17, 51–71. [Google Scholar]
- Mejova, Y.; Zhang, A.X.; Diakopoulos, N.; Castillo, C. Controversy and Sentiment in Online News. arXiv 2014, arXiv:1409.8152.
- Jakaza, E.; Visser, M. ‘Subjectivity’ in newspaper reports on ‘controversial’ and ‘emotional’ debates: An appraisal and controversy analysis. Lang. Matters 2016, 47, 3–21.
- Large Language Models for Propaganda Detection—Github Project. 2023. Available online: https://github.com/sprenkamp/LLM_propaganda_detection (accessed on 20 March 2024).
- The OpenAI API—Documentation—Models. 2024. Available online: https://platform.openai.com/docs/models (accessed on 26 March 2024).
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Adv. Neural. Inf. Process. Syst. 2022, 35, 24824–24837.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2023, arXiv:2303.18223.
Organization | Description | Website |
---|---|---|
Ad Fontes Media | Creator of the Interactive Media Bias Chart® | www.adfontesmedia.com (accessed on 20 March 2024) |
All Sides | Provides balanced news, media bias ratings and diverse perspectives—top news stories are presented from the left, center and right of the political spectrum. | https://www.allsides.com/unbiased-balanced-news (accessed on 20 March 2024) |
FactCheck.org | Monitors the factual accuracy of statements made by major US political figures in TV ads, debates, speeches, interviews and news releases. Recipient of a Pulitzer Prize. | www.factcheck.org (accessed on 20 March 2024) |
Media Bias/Fact Check | Promotes awareness of media bias and misinformation by rating the bias, factual accuracy and credibility of media sources. | www.mediabiasfactcheck.com (accessed on 20 March 2024) |
PolitiFact | Recipient of a Pulitzer Prize. | www.politifact.com (accessed on 20 March 2024) |
Snopes | One of the first fact-checking services; initially investigated urban legends. | www.snopes.com (accessed on 20 March 2024) |
Swiss Policy Research (SPR) | Research group investigating geopolitical propaganda. Creators of “The Media Navigator”, which classifies leading English- and German-language media outlets by political stance and establishment bias. | www.swprs.org/media-navigator (accessed on 20 March 2024) |
Media Outlet | Bias Rating | Factual Reporting | Media Type | Traffic/Popularity | MBFC Credibility Rating | Notes |
---|---|---|---|---|---|---|
TVP Info | Right | Mixed | TV station | High traffic | Medium | www.tvp.info (accessed on 20 March 2024) |
TVN24 | Left-center | High | TV station | High traffic | High | https://tvn24.pl/ (accessed on 20 March 2024) |
Gazeta Wyborcza | Left-center | High | Newspaper | High traffic | High | www.wyborcza.pl (accessed on 20 March 2024) |
FL24.net | Far-right | Mixed | Website | Minimal traffic | Low; questionable reasoning: propaganda, poor sourcing, lack of transparency | No longer exists; operated in France, owned by a Polish media outlet |
Political Critique | Left-center | High | Website | No information | No information | No longer exists; Polish version: https://krytykapolityczna.pl/ (accessed on 20 March 2024) |
Organization | Year of Creation | Website | Notes |
---|---|---|---|
Fundacja Reporterów | 2010 | www.fundacjareporterow.org (accessed on 20 March 2024) | indEX and Ukraine Monitor projects observe influence of disinformation on extremist groups |
Demagog | 2014 | www.demagog.org.pl (accessed on 20 March 2024) | First Polish fact-checking organization |
OKO.press | 2016 | www.oko.press (accessed on 20 March 2024) | Does not allow readers to report suspicious content |
Demaskator24 | 2018 | www.web.archive.org/web/20190615000000*/Demaskator24.pl (accessed on 20 March 2024) | Only Facebook account exists, no longer updated |
Konkret24.pl | 2018 | www.konkret24.tvn24.pl (accessed on 20 March 2024) | Created by the TVN group |
Sprawdzam AFP | 2019 | www.sprawdzam.afp.com/list (accessed on 20 March 2024) | Part of Agence France-Presse (AFP; a multilingual, multicultural news agency) |
AntyFAKE | 2019 | www.antyfake.pl (accessed on 20 March 2024) | No longer updated |
Odfejkuj.info | 2020 | www.odfejkuj.info (accessed on 20 March 2024) | No longer updated |
Pravda | 2020 | www.pravda.org.pl (accessed on 20 March 2024) | Fact checking of information, statements and digital content |
FakeNews.pl | 2020 | www.fakenews.pl (accessed on 20 March 2024) | International Fact-Checking Network member |
#FakeHunter | 2020 | www.fake-hunter.pap.pl (accessed on 20 March 2024) | Created to check news regarding SARS-CoV-2 |
Zgłoś Trolla | 2022 | www.web.archive.org/web/20190615000000*/zglostrolla.pl (accessed on 20 March 2024) | Created to check news regarding war in Ukraine |
Dataset | News Article Count |
---|---|
Training set | 371 |
Development set | 75 |
Test set | 90 |
Column Name | Column Description |
---|---|
id | Article identification number |
technique | Propaganda technique |
begin_offset | Beginning of the span (inclusive) |
end_offset | End of the span (exclusive) |
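
The four columns above describe one labeled span per row, and the inclusive/exclusive offset convention matches Python slicing. The following is a minimal sketch of reading such a row and recovering the span text. It assumes that the label rows are tab-separated in the column order listed above; the `SpanLabel` type, the helper names, the sample article text, the article id and the `Loaded_Language` label string are hypothetical and used only for illustration.

```python
from typing import NamedTuple


class SpanLabel(NamedTuple):
    """One annotated span (hypothetical container mirroring the columns above)."""
    article_id: str
    technique: str
    begin_offset: int  # inclusive character offset
    end_offset: int    # exclusive character offset


def parse_label_line(line: str) -> SpanLabel:
    """Parse one row, assuming tab-separated fields in the documented column order."""
    article_id, technique, begin, end = line.rstrip("\n").split("\t")
    return SpanLabel(article_id, technique, int(begin), int(end))


def extract_span(article_text: str, label: SpanLabel) -> str:
    """Return the labeled fragment; inclusive begin and exclusive end map directly to a slice."""
    return article_text[label.begin_offset:label.end_offset]


# Hypothetical article text and label row, for illustration only.
article = "They are a total disaster for the country."
label = parse_label_line("111111111\tLoaded_Language\t11\t25")
fragment = extract_span(article, label)
print(label.technique, "->", repr(fragment))
```

Because the end offset is exclusive, the span length is simply `end_offset - begin_offset`, and adjacent spans can share a boundary without overlapping.
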
Model | Precision | Recall | F1 Score |
---|---|---|---|
Baseline (gpt-4 CoT [58]) | 0.56868 | 0.57821 | 0.57340 |
Baseline—reproduce attempt | 0.46479 | 0.64525 | 0.54035 |
gpt-3.5-turbo-0125 base | 0.58923 | 0.48883 | 0.53435 |
gpt-3.5-turbo-0125 CoT | 0.60700 | 0.43575 | 0.50732 |
gpt-3.5-turbo-1106 base | 0.65104 | 0.34916 | 0.45455 |
gpt-3.5-turbo-1106 CoT | 0.63934 | 0.32682 | 0.43253 |
gpt-4-0125-preview base | 0.53292 | 0.47486 | 0.50222 |
gpt-4-0125-preview CoT | 0.58419 | 0.47486 | 0.52388 |
gpt-4-1106-preview base | 0.70833 | 0.04749 | 0.08901 |
gpt-4-1106-preview CoT | 0.81818 | 0.05028 | 0.09474 |
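
The F1 scores in the table above are consistent with the harmonic mean of the reported precision and recall. The sketch below recomputes F1 for three rows copied from the table. The function name `f1_score` and the tolerance are illustrative choices, and small differences in the last digit are expected because the published precision and recall values are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Baseline (gpt-4 CoT)": (0.56868, 0.57821, 0.57340),
    "gpt-3.5-turbo-0125 base": (0.58923, 0.48883, 0.53435),
    "gpt-4-1106-preview CoT": (0.81818, 0.05028, 0.09474),
}

for model, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    # Compare with a tolerance, since the inputs are rounded to five decimals.
    status = "ok" if abs(computed - f1) < 1e-4 else "mismatch"
    print(f"{model}: computed={computed:.5f} reported={f1:.5f} {status}")
```

The gpt-4-1106-preview rows illustrate how the harmonic mean penalizes imbalance: a precision above 0.8 combined with a recall near 0.05 still yields an F1 score below 0.1.
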
Technique\Model | Baseline (gpt-4 CoT) | Baseline (gpt-4 CoT, our attempt) | gpt-3.5-turbo-0125 base | gpt-3.5-turbo-0125 CoT | gpt-3.5-turbo-1106 base | gpt-3.5-turbo-1106 CoT | gpt-4-1106-preview base | gpt-4-1106-preview CoT | gpt-4-0125-preview base | gpt-4-0125-preview CoT |
---|---|---|---|---|---|---|---|---|---|---|
Appeal to authority | 0.19048 | 0.24000 | 0.32432 | 0.22857 | 0.11765 | 0.31579 | 0.00000 | 0.00000 | 0.40000 | 0.24000 |
Appeal to fear/prejudice | 0.00000 | 0.00000 | 0.48276 | 0.00000 | 0.54545 | 0.00000 | 0.00000 | 0.00000 | 0.53333 | 0.00000 |
Bandwagon, reductio ad hitlerum | 0.00000 | 0.12500 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
Black-and-white fallacy | 0.00000 | 0.00000 | 0.18182 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.28571 | 0.00000 |
Causal oversimplification | 0.50000 | 0.37500 | 0.29630 | 0.31579 | 0.11111 | 0.12500 | 0.00000 | 0.00000 | 0.41509 | 0.37500 |
Doubt | 0.54545 | 0.53465 | 0.24390 | 0.56757 | 0.00000 | 0.41509 | 0.06667 | 0.06897 | 0.55172 | 0.57447 |
Exaggeration, minimization | 0.64000 | 0.64706 | 0.35088 | 0.057142 | 0.05128 | 0.00000 | 0.00000 | 0.00000 | 0.56250 | 0.60274 |
Flag-waving | 0.00000 | 0.00000 | 0.15385 | 0.00000 | 0.10811 | 0.00000 | 0.00000 | 0.00000 | 0.23256 | 0.00000 |
Loaded language | 0.93617 | 0.93617 | 0.90625 | 0.89600 | 0.77876 | 0.71698 | 0.30380 | 0.34568 | 0.84034 | 0.93233 |
Name calling, labeling | 0.74286 | 0.72165 | 0.79630 | 0.73684 | 0.76636 | 0.76364 | 0.07407 | 0.00000 | 0.57895 | 0.66667 |
Repetition | 0.65789 | 0.67327 | 0.60274 | 0.68235 | 0.59459 | 0.53846 | 0.10000 | 0.15000 | 0.37500 | 0.47273 |
Slogans | 0.10000 | 0.23529 | 0.10000 | 0.09523 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
Thought-terminating cliches | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
Whataboutism, straw men, red herring | 0.33333 | 0.45283 | 0.43243 | 0.27273 | 0.08333 | 0.09091 | 0.00000 | 0.00000 | 0.16667 | 0.33333 |
Model | Precision | Recall | F1 |
---|---|---|---|
Hitachi | 0.56544 | 0.47368 | 0.51551 |
ApplicaAI | 0.59954 | 0.41650 | 0.49153 |
gpt-4-0125-preview, try 1 | 0.16059 | 0.20109 | 0.17857 |
gpt-4-0125-preview, try 2 | 0.16659 | 0.23174 | 0.19384 |
gpt-4-0125-preview, try 3 | 0.16595 | 0.23174 | 0.19340 |
gpt-3.5-turbo-0125, try 1 | 0.15132 | 0.06707 | 0.09294 |
gpt-3.5-turbo-0125, try 2 | 0.17226 | 0.06583 | 0.09525 |
gpt-3.5-turbo-0125, try 3 | 0.16172 | 0.07142 | 0.09908 |
baseline | 0.13045 | 0.00162 | 0.00320 |
Technique\Model | Baseline | gpt-4-0125-preview | gpt-3.5-turbo-0125 |
---|---|---|---|
F1 score | 0.25196 | 0.23352 | 0.30000 |
Appeal to authority | 0.00000 | 0.00000 | 0.00000 |
Appeal to fear/prejudice | 0.03681 | 0.05128 | 0.01449 |
Bandwagon, reductio ad hitlerum | 0.00000 | 0.00000 | 0.00000 |
Black-and-white fallacy | 0.00000 | 0.04211 | 0.03125 |
Causal oversimplification | 0.11561 | 0.03922 | 0.00000 |
Doubt | 0.29143 | 0.03265 | 0.00995 |
Exaggeration, minimization | 0.14420 | 0.05882 | 0.06329 |
Flag-waving | 0.06195 | 0.04878 | 0.01869 |
Loaded language | 0.46477 | 0.43468 | 0.47970 |
Name calling, labeling | 0.00000 | 0.08458 | 0.02778 |
Repetition | 0.19262 | 0.02643 | 0.03550 |
Slogans | 0.00000 | 0.00000 | 0.00000 |
Thought-terminating cliches | 0.00000 | 0.00000 | 0.00000 |
Whataboutism, straw men, red herring | 0.00000 | 0.00000 | 0.00000 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Szwoch, J.; Staszkow, M.; Rzepka, R.; Araki, K. Limitations of Large Language Models in Propaganda Detection Task. Appl. Sci. 2024, 14, 4330. https://doi.org/10.3390/app14104330