Knowledge-Driven Generative Design of Role-Playing Game Scenarios
Abstract
1. Introduction
- 1.
- The design of a modular LLM-based pipeline for RPG scenario generation that enables systematic analysis of generation factors.
- 2.
- An empirical evaluation of the influence of prompt step count, knowledge compendium variants, and LLM choice on multiple qualitative dimensions of RPG scenarios.
- 3.
- Empirical evidence that compact, non-randomly structured narrative skeletons can lead to more coherent and practically usable generated scenarios.
- 4.
- An ablation study assessing the impact of algorithmic selection of narrative elements on scenario quality.
2. Related Work
2.1. From Procedural Content Generation to Generative AI in Games: Evolution of Narrative Systems
2.2. LLM-Based Narrative Systems: Capabilities and Deployment Barriers
2.3. Quest and Scenario Generation: cRPG Methods and Transfer Limits to RPG Preparation
The emerging narrative that appears by system in the ARPGs [RPGs] sessions becomes a unique narrative artifact, which stimulates the imagination of the players within a collaborative context and creative freedom. The DRPGs [cRPGs] have failed, despite all the technological advances, to emulate this gaming experience.
2.4. Knowledge Grounding for Consistent Scenario Generation
2.5. Evaluation of Generated Narrative Artifacts
3. Materials and Methods
3.1. Game World Compendium Structure
- Entity: component of type: character, object, location, or event;
- Relation: directed connection between entities , where denotes the source entity and the target entity.
3.2. Generation Pipeline Design
- Skeletal scenario generation
- Skeletal scenario completion
- Scenario text generation
3.2.1. Skeletal Scenario Generation
3.2.2. Skeletal Scenario Completion
- Candidate representation: The candidate entity description , optionally prepended with its relation description if connected to g
- Context representation: The concatenation of the quest giver’s g description and the action textual template
3.2.3. Scenario Text Generation
- System promptYou are an RPG gamemaster creating scenarios for tabletop RPG games. Base your work strictly on the provided context. Only use the listed entities and connections, and preserve their integrity and logic.
- User promptWrite an RPG scenario, realising the following structure.
3.3. Experimental Setup
3.3.1. Scenario Generation
3.3.2. Scenario Evaluation
- Evaluation promptYou are an expert RPG scenario reviewer. Evaluate the following scenario based on the category:category with descriptionsProvide a score from 0.0 to 5.0 and a one-sentence explanation. Respond ONLY in the following format:Score: numberNote: brief explanationThe following scenario was generated in response to this user prompt:Prompt for scenario generation:Write an RPG scenario, using the following structure:scenario_promptScenario: scenario
4. Results
4.1. Ablation
4.2. Self-Bias and Human Evaluation
4.3. Statistical Significance
4.4. Popular Metrics Analysis
5. Discussion
5.1. Limitations of LLM-Based Evaluation
5.2. Design Guidelines and Future Directions
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| RPG | Role-Playing Game |
| cRPG | Computer Role-Playing Game |
| NPC | Non-Player Character |
| LLM | Large Language Model |
| PCG | Procedural Content Generation |
| GM | Game Master |
| GenAI | Generative Artificial Intelligence |
| GPT | Generative Pre-trained Transformers |
| NLG | Natural Language Generation |
| AI | Artificial Intelligence |
Appendix A
Appendix B
References
- Arenas, D.L.; Viduani, A.; Araujo, R.B. Therapeutic use of role-playing game (RPG) in mental health: A scoping review. Simul. Gaming 2022, 53, 285–311. [Google Scholar] [CrossRef]
- Yuliawati, L.; Wardhani, P.A.P.; Ng, J.H. A scoping review of tabletop role-playing game (TTRPG) as psychological intervention: Potential benefits and future directions. Psychol. Res. Behav. Manag. 2024, 17, 2885–2903. [Google Scholar] [CrossRef] [PubMed]
- Hammer, J.; To, A.; Schrier, K.; Bowman, S.L.; Kaufman, G. Learning and role-playing games. In Role-Playing Game Studies; Routledge: New York, NY, USA, 2018; pp. 283–299. [Google Scholar] [CrossRef]
- Katō, K. Employing tabletop role-playing games (TRPGs) in social communication support measures for children and youth with autism spectrum disorder (ASD) in Japan: A hands-on report on the use of leisure activities. Jpn. J. Analog. Role-Play. Game Stud. 2019, 23–28. [Google Scholar] [CrossRef]
- Merrick, A.; Li, W.W.; Miller, D.J. A study on the efficacy of the tabletop roleplaying game Dungeons & Dragons for improving mental health and self-concepts in a community sample. Games Health J. 2024, 13, 128–133. [Google Scholar] [CrossRef]
- Williams, J.P.; Kirschner, D.; Deterding, S. Sociology and role-playing games. In The Routledge Handbook of Role-Playing Game Studies; Routledge: New York, USA, 2024; pp. 243–260. [Google Scholar] [CrossRef]
- Henning, G.; de Oliveira, R.R.; de Andrade, M.T.P.; Gallo, R.V.; Benevides, R.R.; Gomes, R.A.F.; Fukue, L.E.K.; Lima, A.V.; de Oliveira, M.B.B.Z.; de Oliveira, D.A.M.; et al. Social skills training with a tabletop role-playing game, before and during the pandemic of 2020: In-person and online group sessions. Front. Psychiatry 2024, 14, 1276757. [Google Scholar] [CrossRef]
- Finch, M. Tome of Adventure Design (Revised); Mythmere Games: Katy, TX, USA, 2022. [Google Scholar]
- Sholtis, J. The Dungeon Dozen; Hydra Collective (brand name: Hydra Cooperative): Austin, TX, USA, 2014. [Google Scholar]
- Robbins, B. Donjon Random Inn Generator. 2009. Available online: https://donjon.bin.sh/fantasy/inn/ (accessed on 15 February 2026).
- Cros.land. D&D 5e Magic Item Generator. 2018. Available online: https://cros.land/dnd-5e-magic-item-generator/ (accessed on 15 February 2026).
- Kassoon. D&D NPC Generator. 2016. Available online: https://www.kassoon.com/dnd/npc-generator/ (accessed on 15 February 2026).
- Azgaar. Fantasy Map Generator. 2017. Available online: https://azgaar.github.io/Fantasy-Map-Generator/ (accessed on 15 February 2026).
- Hendrikx, M.; Meijer, S.; Van Der Velden, J.; Iosup, A. Procedural content generation for games: A survey. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2013, 9, 1–22. [Google Scholar] [CrossRef]
- Gallotta, R.; Todd, G.; Zammit, M.; Earle, S.; Liapis, A.; Togelius, J.; Yannakakis, G.N. Large language models and games: A survey and roadmap. IEEE Trans. Games 2024. early access. [Google Scholar] [CrossRef]
- Maleki, M.F.; Zhao, R. Procedural content generation in games: A survey with insights on emerging llm integration. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Lexington, KY, USA, 18–22 November 2024; Volume 20, pp. 167–178. [Google Scholar] [CrossRef]
- Young, R.M.; Ware, S.G.; Cassell, B.A.; Robertson, J. Plans and planning in narrative generation: A review of plan-based approaches to the generation of story, discourse and interactivity in narratives. Sprache Und Datenverarbeitung Spec. Issue Form. Comput. Model. Narrat. 2013, 37, 41–64. [Google Scholar]
- Hafis, M.; Tolle, H.; Supianto, A.A. A literature review of empirical evidence on procedural content generation in game-related implementation. J. Inf. Technol. Comput. Sci. 2019, 4, 308–328. [Google Scholar] [CrossRef]
- Yang, D.; Kleinman, E.; Harteveld, C. GPT for games: A scoping review (2020-2023). In Proceedings of the 2024 IEEE Conference on Games (CoG); IEEE: New York, NY, USA, 2024; pp. 1–8. [Google Scholar] [CrossRef]
- Yang, D.; Kleinman, E.; Harteveld, C. GPT for Games: An Updated Scoping Review (2020–2024). IEEE Trans. Games 2025. early access. [Google Scholar] [CrossRef]
- Petroni, F.; Rocktäschel, T.; Riedel, S.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A. Language models as knowledge bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 2463–2473. [Google Scholar] [CrossRef]
- Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.t.; Rocktäschel, T.; et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv. Neural Inf. Process. Syst. 2020, 33, 9459–9474. [Google Scholar]
- Yu, W.; Zhu, C.; Li, Z.; Hu, Z.; Wang, Q.; Ji, H.; Jiang, M. A survey of knowledge-enhanced text generation. ACM Comput. Surv. 2022, 54, 1–38. [Google Scholar] [CrossRef]
- Guan, J.; Feng, Z.; Chen, Y.; He, R.; Mao, X.; Fan, C.; Huang, M. LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation. Trans. Assoc. Comput. Linguist. 2022, 10, 434–451. [Google Scholar] [CrossRef]
- Ono, J.; Ogata, T. A design plan of a game system including an automatic narrative generation mechanism: The entire structure and the world settings. J. Robot. Netw. Artif. Life 2016, 2, 243–246. [Google Scholar] [CrossRef]
- Ryan, J. Curating Simulated Storyworlds; University of California: Santa Cruz, CA, USA, 2018. [Google Scholar]
- Gervás, P.; Méndez, G. Distributing Creative Responsibility Between a Knowledge-Based Content Determiner and a Neural Text Realizer. In Proceedings of the EPIA Conference on Artificial Intelligence; Springer: Cham, Switzerland, 2024; pp. 41–53. [Google Scholar] [CrossRef]
- Wen, Y.; Huang, C.; Zhou, H.; Zeng, Z.; Po, C.M.L.; Togelius, J.; Merino, T.; Earle, S. All stories are one story: Emotional arc guided procedural game level generation. arXiv 2025, arXiv:2508.02132. [Google Scholar] [CrossRef]
- Latitude. AI Dungeon. 2025. Available online: https://aidungeon.com/ (accessed on 15 February 2026).
- Hua, M.; Raley, R. Playing With Unicorns: AI Dungeon and Citizen NLP. DHQ Digit. Humanit. Q. 2020, 14, 4. [Google Scholar]
- Store, A.A. Hermes-3. 2025. Available online: https://aiagentstore.ai/ai-agent/hermes-3 (accessed on 15 February 2026).
- AI, M. Mistral. 2025. Available online: https://mistral.ai/ (accessed on 15 February 2026).
- Crosland, K. AI Powered Game Master Tools. 2023. Available online: https://cros.land/2023/04/ai-powered-game-master-tools/ (accessed on 15 February 2026).
- Cros.land. AI Powered DnD 5e Monster Statblock Generator. 2023. Available online: https://cros.land/ai-powered-dnd-5e-monster-statblock-generator/ (accessed on 3 September 2025).
- Sudowrite. Sudowrite. 2025. Available online: https://sudowrite.com/ (accessed on 15 February 2026).
- López, J.J.P. Procedural and Emergent Narrative: From Analog RPG to Digital RPG. In Proceedings of the Abstract Proceedings of DiGRA 2023 Conference: Limits and Margins of Games; Digital Games Research Association: Tampere, Finland, 2023. [Google Scholar] [CrossRef]
- de Lima, E.S.; Feijó, B.; Furtado, A.L. Procedural Generation of Quests for Games Using Genetic Algorithms and Automated Planning. In Proceedings of the SBGames, Rio de Janeiro, Brazil, 28–31 October 2019; pp. 144–153. [Google Scholar] [CrossRef]
- de Lima, E.S.; Feijó, B.; Furtado, A.L. Procedural generation of branching quests for games. Entertain. Comput. 2022, 43, 100491. [Google Scholar] [CrossRef]
- Prins, V.L.; Prins, J.; Preuss, M.; Gómez-Maureira, M.A. Storyworld: Procedural quest generation rooted in variety & believability. In Proceedings of the 18th International Conference on the Foundations of Digital Games; Association for Computing Machinery: New York, NY, USA, 2023; pp. 1–4. [Google Scholar] [CrossRef]
- Breault, V.; Ouellet, S.; Davies, J. Let CONAN tell you a story: Procedural quest generation. Entertain. Comput. 2021, 38, 100422. [Google Scholar] [CrossRef]
- Balint, J.T.; Bidarra, R. Procedural generation of narrative worlds. IEEE Trans. Games 2022, 15, 262–272. [Google Scholar] [CrossRef]
- Buongiorno, S.; Klinkert, L.; Zhuang, Z.; Chawla, T.; Clark, C. PANGeA: Procedural artificial narrative using generative AI for turn-based, role-playing video games. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Lexington, KY, USA, 18–22 November 2024; Volume 20, pp. 156–166. [Google Scholar] [CrossRef]
- Griffith, I. Procedural Narrative Generation Through Emotionally Interesting Non-Player Characters. Master’s Thesis, Linnaeus University, Växjö, Sweden, 2018. [Google Scholar]
- Rahman, A.; Yu, A.; Cho, K. Game Knowledge Management System: Schema-Governed LLM Pipeline for Executable Narrative Generation in RPGs. Systems 2026, 14, 175. [Google Scholar] [CrossRef]
- Jørgensen, N.H.; Tharmabalan, S. Narrative Adherence in LLM-Driven Games. Master’s Thesis, Aalborg University, Aalborg, Denmark, 2025. [Google Scholar]
- Delafuente, P.; Honraopatil, A.; Martin, L.J. Does Reasoning Help LLM Agents Play Dungeons and Dragons? A Prompt Engineering Experiment. arXiv 2025, arXiv:2510.18112. [Google Scholar] [CrossRef]
- Mumper, P. Emergent Narrative in Tabletop Role-Playing Games: An Application of Concepts; Honors Projects, 938; Bowling Green State University: Bowling Green, OH, USA, 2024. [Google Scholar]
- Gryka-Zawadzka, D. Podręczniki do TRPG: Między użytkowością a literackością. Białostockie Stud. Lit. 2024, 24, 175–187. [Google Scholar] [CrossRef]
- Svan, O.; Wuolo, A. Emergent Player-Driven Narrative in Blades in the Dark and Dungeons & Dragons: A Comparative Study. Bachelor’s Thesis, Uppsala University, Uppsala, Sweden, June 2021. [Google Scholar]
- Thoppilan, R.; De Freitas, D.; Hall, J.; Shazeer, N.; Kulshreshtha, A.; Cheng, H.T.; Jin, A.; Bos, T.; Baker, L.; Du, Y.; et al. Lamda: Language models for dialog applications. arXiv 2022, arXiv:2201.08239. [Google Scholar] [CrossRef]
- Pan, S.; Luo, L.; Wang, Y.; Chen, C.; Wang, J.; Wu, X. Unifying large language models and knowledge graphs: A roadmap. IEEE Trans. Knowl. Data Eng. 2024, 36, 3580–3599. [Google Scholar] [CrossRef]
- Park, J.S.; O’Brien, J.; Cai, C.J.; Morris, M.R.; Liang, P.; Bernstein, M.S. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, San Francisco, CA, USA, 29 October–1 November 2023; pp. 1–22. [Google Scholar] [CrossRef]
- Wizards of the Coast. Forgotten Realms Campaign Setting; Wizards of the Coast: Renton, WA, USA, 2001. [Google Scholar]
- Urbanek, J.; Fan, A.; Karamcheti, S.; Jain, S.; Humeau, S.; Dinan, E.; Rocktäschel, T.; Kiela, D.; Szlam, A.; Weston, J. Learning to Speak and Act in a Fantasy Text Adventure Game. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); Inui, K., Jiang, J., Ng, V., Wan, X., Eds.; Association for Computational Linguistics: Hong Kong, China, 2019; pp. 673–683. [Google Scholar] [CrossRef]
- Ashby, T.; Webb, B.K.; Knapp, G.; Searle, J.; Fulda, N. Personalized quest and dialogue generation in role-playing games: A knowledge graph-and language model-based approach. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; pp. 1–20. [Google Scholar] [CrossRef]
- Gottschalk, S.; Demidova, E. EventKG–the hub of event knowledge on the web–and biographical timeline generation. Semant. Web 2019, 10, 1039–1070. [Google Scholar] [CrossRef]
- Van der Lee, C.; Gatt, A.; Van Miltenburg, E.; Krahmer, E. Human evaluation of automatically generated text: Current trends and best practice guidelines. Comput. Speech Lang. 2021, 67, 101151. [Google Scholar] [CrossRef]
- Fu, J.; Ng, S.K.; Jiang, Z.; Liu, P. Gptscore: Evaluate as you desire. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers); Association for Computational Linguistics: Mexico City, Mexico, 2024; pp. 6556–6576. [Google Scholar] [CrossRef]
- Chiang, C.H.; Lee, H.Y. Can Large Language Models Be an Alternative to Human Evaluations? In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Association for Computational Linguistics: Toronto, ON, Canada, 2023; pp. 15607–15631. [Google Scholar] [CrossRef]
- Chhun, C.; Suchanek, F.M.; Clavel, C. Do language models enjoy their own stories? prompting large language models for automatic story evaluation. Trans. Assoc. Comput. Linguist. 2024, 12, 1122–1142. [Google Scholar] [CrossRef]
- Bradley, H.; Dai, A.; Teufel, H.B.; Zhang, J.; Oostermeijer, K.; Bellagente, M.; Clune, J.; Stanley, K.; Schott, G.; Lehman, J. Quality-Diversity through AI Feedback. In Proceedings of the The Twelfth International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024. [Google Scholar]
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7–12 July 2002; pp. 311–318. [Google Scholar] [CrossRef]
- Lin, C.Y. Rouge: A package for automatic evaluation of summaries. In Proceedings of the Text Summarization Branches Out, Barcelona, Spain, 25–26 July 2004; pp. 74–81. [Google Scholar]
- Banerjee, S.; Lavie, A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI, USA, 29 June 2005; pp. 65–72. [Google Scholar]
- Sai, A.B.; Mohankumar, A.K.; Khapra, M.M. A survey of evaluation metrics used for nlg systems. ACM Comput. Surv. (CSUR) 2022, 55, 1–39. [Google Scholar] [CrossRef]
- Gehrmann, S.; Clark, E.; Sellam, T. Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text. J. Artif. Intell. Res. 2023, 77, 103–166. [Google Scholar] [CrossRef]
- Zhang, T.; Kishore, V.; Wu, F.; Weinberger, K.Q.; Artzi, Y. BERTScore: Evaluating Text Generation with BERT. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
- Yuan, W.; Neubig, G.; Liu, P. Bartscore: Evaluating generated text as text generation. Adv. Neural Inf. Process. Syst. 2021, 34, 27263–27277. [Google Scholar]
- Guan, J.; Huang, M. UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP); Webber, B., Cohn, T., He, Y., Liu, Y., Eds.; Association for Computational Linguistics: Vienna, Austria, 2020; pp. 9157–9166. [Google Scholar] [CrossRef]
- Liu, Y.; Iter, D.; Xu, Y.; Wang, S.; Xu, R.; Zhu, C. G-Eval: NLG Evaluation using Gpt-4 with Better Human Alignment. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Singapore, 2023; pp. 2511–2522. [Google Scholar] [CrossRef]
- Deriu, J.; Rodrigo, A.; Otegi, A.; Echegoyen, G.; Rosset, S.; Agirre, E.; Cieliebak, M. Survey on evaluation methods for dialogue systems. Artif. Intell. Rev. 2021, 54, 755–810. [Google Scholar] [CrossRef]
- Doran, J.; Parberry, I. A prototype quest generator based on a structural analysis of quests from four MMORPGs. In Proceedings of the 2nd International Workshop on Procedural Content Generation in Games; Association for Computing Machinery: New York, NY, USA, 2011; pp. 1–8. [Google Scholar] [CrossRef]
- Nussbaum, Z.; Morris, J.X.; Duderstadt, B.; Mulyar, A. Nomic embed: Training a reproducible long context text embedder. arXiv 2024, arXiv:2402.01613. [Google Scholar] [CrossRef]
- Wang, B.; Wang, A.; Chen, F.; Wang, Y.; Kuo, C.C.J. Evaluating word embedding models: Methods and experimental results. APSIPA Trans. Signal Inf. Process. 2019, 8, e19. [Google Scholar] [CrossRef]
- Levy, O.; Goldberg, Y.; Dagan, I. Improving distributional similarity with lessons learned from word embeddings. Trans. Assoc. Comput. Linguist. 2015, 3, 211–225. [Google Scholar] [CrossRef]
- Schnabel, T.; Labutov, I.; Mimno, D.; Joachims, T. Evaluation methods for unsupervised word embeddings. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 298–307. [Google Scholar] [CrossRef]
- Faruqui, M.; Tsvetkov, Y.; Rastogi, P.; Dyer, C. Problems With Evaluation of Word Embeddings Using Word Similarity Tasks. In Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, Berlin, Germany, 12 August 2016; pp. 30–35. [Google Scholar] [CrossRef]
- Icard, B.; Zve, E.; Sainero, L.; Breton, A.; Ganascia, J.G. Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models. In Proceedings of the 31st International Conference on Computational Linguistics (COLING); Association for Computational Linguistics: Abu Dhabi, United Arab Emirates, 2025. [Google Scholar]
- Likert, R. A technique for the measurement of attitudes. Arch. Psychol. 1932, 22, 140. [Google Scholar]
- Google DeepMind. Gemini 2.5 Flash. 2025. Available online: https://deepmind.google/models/gemini/flash/ (accessed on 15 February 2026).
- OpenAI. GPT-4 Research. 2023. Available online: https://openai.com/pl-PL/index/gpt-4-research/ (accessed on 15 February 2026).


| Motivation | Description | Operationalized Description |
|---|---|---|
| Knowledge | Information known to a character | Wants to gather information |
| Comfort | Physical comfort | Wants to have physical comfort |
| Reputation | How others perceive a character | Wants to be perceived well by others |
| Serenity | Peace of mind | Wants a peace of mind |
| Protection | Security against threats | Wants security against threats |
| Conquest | Desire to prevail over enemies | Wants to prevail over enemies |
| Wealth | Economic power | Wants economic power |
| Ability | Character skills | Wants to use their skills |
| Equipment | Usable assets | Wants physical assets |
| Action | Template | Effect |
|---|---|---|
| capture | Capture {character} | Negative |
| damage | Damage {object} | Negative |
| defend | Defend {character} | Positive |
| escort | Escort {character} to {location} | Positive |
| exchange | Exchange {object} with {character} | Neutral |
| experiment | Experiment with {object} | Neutral |
| get | Obtain {object} | Neutral |
| give | Give {object} to {character} | Positive |
| goto | Go to {location} | Neutral |
| kill | Kill {character} | Negative |
| listen | Listen to {character} | Neutral |
| repair | Repair {object} | Positive |
| report | Report to {character} | Positive |
| spy | Spy on {character} | Negative |
| steal | Steal {object} from {character} | Negative |
| take | Take {object} from {character} | Neutral |
| use | Use {object} | Neutral |
| Prompt Variants | |||||
|---|---|---|---|---|---|
| Variant | Narrative Points | Description | |||
| 4 | Short | ||||
| 8 | Medium | ||||
| 12 | Long | ||||
| Compendium Variants | |||||
| Compendium | Characters | Events | Locations | Objects | Relations |
| 5 | 1 | 3 | 1 | 10 | |
| 6 | 4 | 6 | 4 | 20 | |
| 18 | 9 | 12 | 11 | 50 | |
| Category | [60] | [58] | Author | Description |
|---|---|---|---|---|
| Relevance | ✓ | ✓ | Does the scenario address the given prompt and appropriately incorporate the provided context? | |
| Coherence | ✓ | ✓ | Are the events and character actions logical, causally consistent, and internally coherent? | |
| Complexity | ✓ | Does the scenario feature a complex narrative structure, including multi-layered plots and interdependent subplots? | ||
| Informativeness | ✓ | Does the scenario provide sufficiently detailed and comprehensive information about the narrative and game world? | ||
| Interactivity | ✓ | Does the scenario offer players meaningful choices and opportunities for interaction that influence the course of events? | ||
| Structure | ✓ | Does the scenario follow an organizational structure appropriate for RPG scenarios, including clearly defined sections (e.g., locations, characters, items, and plot points)? |
| Language Models | ||
|---|---|---|
| Model | Release Date | Size |
| Mistral: 7B | 27 September 2023 | 4.1 GB |
| LLaMA3.1: 8B | 23 July 2024 | 4.9 GB |
| StableBeluga: 7B | 21 July 2023 | 4.1 GB |
| Gemma3: 4B | 12 March 2025 | 3.3 GB |
| Generator | Verifier | ||||
|---|---|---|---|---|---|
| Gemma | LLaMA | Mistral | Stable Beluga | Human | |
| Relevance | |||||
| Gemma | 4.575 | 3.070 | |||
| LLaMA | 2.500 | ||||
| Mistral | |||||
| Stable Beluga | 4.019 | 4.450 | |||
| Coherence | |||||
| Gemma | 3.213 | ||||
| LLaMA | 2.818 | ||||
| Mistral | 3.046 | ||||
| Stable Beluga | 3.917 | 4.317 | |||
| Complexity | |||||
| Gemma | 2.179 | ||||
| LLaMA | 2.500 | ||||
| Mistral | 2.500 | ||||
| Stable Beluga | 3.163 | 3.724 | 3.050 | ||
| Informativeness | |||||
| Gemma | 3.445 | ||||
| LLaMA | 2.182 | ||||
| Mistral | 3.039 | ||||
| Stable Beluga | 4.258 | 4.257 | |||
| Interactivity | |||||
| Gemma | 2.782 | ||||
| LLaMA | |||||
| Mistral | 3.075 | ||||
| Stable Beluga | 3.312 | 3.873 | 2.273 | ||
| Structure | |||||
| Gemma | 3.871 | 3.054 | |||
| LLaMA | |||||
| Mistral | |||||
| Stable Beluga | 3.310 | 4.197 | 2.409 | ||
| Factor | Source | Relevance | Coherence | Complex. | Inform. | Inter. | Structure |
|---|---|---|---|---|---|---|---|
| l | |||||||
| Levene’s | p | ||||||
| p | c | ||||||
| l | |||||||
| ANOVA | p | ||||||
| p | c | ||||||
| l | |||||||
| ANOVA | p | ||||||
| c | |||||||
| l | |||||||
| ANOVA | p | ||||||
| c |
| Criterion | Value | Gem:Lla | Gem:Mis | Gem:Bel | Lla:Mis | Lla:Bel | Mis:Bel |
|---|---|---|---|---|---|---|---|
| Relevance | |||||||
| p | |||||||
| d | |||||||
| Coherence | |||||||
| p | |||||||
| d | |||||||
| Complex. | |||||||
| p | |||||||
| d | |||||||
| Inform. | |||||||
| p | |||||||
| d | |||||||
| Inter. | |||||||
| p | |||||||
| d | |||||||
| Structure | |||||||
| p | |||||||
| d |
| Criterion | Value | |||
|---|---|---|---|---|
| Relevance | ||||
| p | ||||
| d | ||||
| Coherence | ||||
| p | ||||
| d | ||||
| Complex. | ||||
| p | ||||
| d | ||||
| Inform. | ||||
| p | ||||
| d | ||||
| Inter. | ||||
| p | ||||
| d | ||||
| Structure | ||||
| p | ||||
| d |
| Parameter | BLEU | ROUGE | BERTScore | ||||
|---|---|---|---|---|---|---|---|
| 1F | 2F | LF | P | R | F1 | ||
| 3.067 | 0.229 | ||||||
| Stable Beluga | |||||||
| Gemma | |||||||
| LLaMA | |||||||
| Mistral | |||||||
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Owczarek, W.; Wróbel, J.; Pęszor, D. Knowledge-Driven Generative Design of Role-Playing Game Scenarios. Appl. Sci. 2026, 16, 2966. https://doi.org/10.3390/app16062966
Owczarek W, Wróbel J, Pęszor D. Knowledge-Driven Generative Design of Role-Playing Game Scenarios. Applied Sciences. 2026; 16(6):2966. https://doi.org/10.3390/app16062966
Chicago/Turabian StyleOwczarek, Wojciech, Julia Wróbel, and Damian Pęszor. 2026. "Knowledge-Driven Generative Design of Role-Playing Game Scenarios" Applied Sciences 16, no. 6: 2966. https://doi.org/10.3390/app16062966
APA StyleOwczarek, W., Wróbel, J., & Pęszor, D. (2026). Knowledge-Driven Generative Design of Role-Playing Game Scenarios. Applied Sciences, 16(6), 2966. https://doi.org/10.3390/app16062966

