Multimodal Technol. Interact., Volume 10, Issue 2 (February 2026) – 10 articles

Cover Story: Skilled harvester operation is vital for safe, sustainable logging, yet traditional training is costly and hard to scale. This study validates a low‑cost VR head‑mounted display harvester simulator on Meta Quest 2 through a user‑centered evaluation with 26 participants. Cognitive workload rose from the baseline to the navigation and head/boom modules (NASA‑TLX 18.65 → 34.26 → 38.43), while usability was rated “Good” (SUS 76.63) and user experience was highly positive (UEQ‑S, top‑decile hedonic quality). Simulator sickness remained moderate (SSQ 28.91), and task success stayed high, offering practical design guidance for accessible forestry training.
31 pages, 28657 KB  
Article
Agent-Based Paradigm for the Self-Configuration of a Conceptual Mechanical Assembly Modeling Application in Virtual Reality
by Julian Conesa, Francisco José Mula and Manuel Contero
Multimodal Technol. Interact. 2026, 10(2), 21; https://doi.org/10.3390/mti10020021 - 22 Feb 2026
Abstract
The immersive, multisensory experiences offered by virtual reality have been transformative across multiple disciplines, enhancing practical and theoretical skills while increasing user motivation and learning. Multi-agent systems, in turn, have proven effective in facilitating the expansion and modularity of computer systems. This paper presents an application, developed in a virtual reality environment and based on multi-agent systems, for the conceptual design of mechanical assemblies from primitives. As its main novelty, users can define the primitives themselves from a set of models, images, and an Excel document, without any programming knowledge, taking advantage of the possibilities offered by multi-agent systems. In addition, for each primitive it is possible to define a set of geometric and dimensional modifications, as well as a set of position relations with respect to other primitives, to generate mechanical assemblies.
(This article belongs to the Topic AI-Based Interactive and Immersive Systems)
22 pages, 1737 KB  
Review
How Virtual Reality Design Reshapes Our Ecological Connection to Natural Systems
by Ivonne Angelica Castiblanco Jimenez, Santiago Parra Barrios and Ana Maria Correa Jimenez
Multimodal Technol. Interact. 2026, 10(2), 20; https://doi.org/10.3390/mti10020020 - 20 Feb 2026
Abstract
This integrative literature review examines how virtual reality (VR) design can transform environmental understanding by changing users from passive observers to active participants in ecological systems. We aimed to analyze the interaction strategies through which VR enables environmental awareness and to identify the most effective approaches for fostering ecological connection. Through systematic analysis of studies published between 2015 and 2025, we found that effective VR implementations share three core design mechanisms: progressive engagement that builds connection over time, a careful balance between interaction and reflection, and multisensory integration that creates believable immersive experiences. These design mechanisms, in turn, build ecological connection through three fundamental pillars: perspective-taking that generates empathy, the creation of authentic sensory experiences, and the development of network thinking to understand complex interconnections. This review contributes to the field by mapping the development of environmental VR applications, identifying successful implementation strategies, and highlighting research gaps. Our analysis provides a comprehensive interaction framework for designing more effective environmental experiences and advancing this emerging field at a time when innovative approaches are most needed.
30 pages, 19886 KB  
Article
MoodScape: Emotion-Informed Terrain Synthesis for Virtual Reality System
by Rahul Kumar Rai, Reshu Bansal and Shashi Shekhar Jha
Multimodal Technol. Interact. 2026, 10(2), 19; https://doi.org/10.3390/mti10020019 - 11 Feb 2026
Abstract
(1) Background: Virtual environments (VEs) significantly influence human emotions through various elements such as lighting, color, and terrain. While the effects of lighting and color on emotions within VEs have been extensively studied, the impact of the terrain remains underexplored. This paper addresses this gap by investigating the correlation between terrain characteristics in VEs and users’ emotional states. (2) Methods: We conducted a user study in which participants were exposed to various 3D terrains and used the Self-Assessment Manikin (SAM) to rate their emotional responses (valence, arousal, and dominance). Building on these insights, we propose MoodScape, an automated framework for emotion-informed terrain generation that significantly reduces the need for extensive expertise and manual effort. In the current implementation, continuous SAM valence–arousal targets are discretised into four quadrant-based affect/terrain classes, and this discrete class label conditions the terrain synthesis. MoodScape employs a generative adversarial network (GAN) architecture called DH-CVAE-GAN, which integrates a dual-head conditional variational autoencoder as the generator alongside a discriminator network to ensure effective and realistic terrain generation. The DH-CVAE-GAN is trained on a satellite-derived digital elevation model (DEM) dataset, which helps the generated terrains reflect realistic geographic patterns. (3) Results: Quantitative and qualitative evaluations on our study sample suggest that MoodScape can generate terrains whose perceived affective tone is broadly consistent with the specified affect-class inputs, indicating potential applications in gaming and exploratory therapeutic Virtual Reality, while formal clinical evaluation is left to future work.
(This article belongs to the Topic AI-Based Interactive and Immersive Systems)
16 pages, 3673 KB  
Review
Virtual Reality Learning Environments: A Review of Support for Autonomous Learning Development
by Pablo Fernández-Arias, Antonio del Bosque and Diego Vergara
Multimodal Technol. Interact. 2026, 10(2), 18; https://doi.org/10.3390/mti10020018 - 5 Feb 2026
Abstract
The rapid expansion of digital education in the 21st century has positioned Virtual Reality Learning Environments (VRLEs) as promising spaces for fostering greater learner autonomy. As immersive technologies become more accessible and pedagogically versatile, they offer students opportunities to regulate their learning processes, experiment in interactive scenarios, and progress at their own pace. This review examines how autonomous learning has been conceptualized and investigated within VRLE research through a comprehensive bibliometric analysis of studies published between 2000 and 2025. The results reveal a research field shaped by two major orientations: one focused on human and pedagogical dimensions (learner diversity, instructional design, and evidence-based strategies) and another on technological innovation (artificial intelligence, machine learning, and simulation-based systems). Topic analyses show that digital and immersive education dominate current scholarly production, while areas directly related to autonomy, personalized learning, and student-centered methodologies remain comparatively less developed. Accordingly, it is crucial to reinforce pedagogical structures that enable autonomous learning in VR environments and to integrate technological advancements in a manner that translates into tangible improvements in educational quality across different settings.
(This article belongs to the Special Issue Educational Virtual/Augmented Reality)
23 pages, 2386 KB  
Article
Beyond the Classroom: Technology-Enabled Acceleration Models for Gifted Learners in the Digital Era
by Yusra Zaki Aboud
Multimodal Technol. Interact. 2026, 10(2), 17; https://doi.org/10.3390/mti10020017 - 4 Feb 2026
Abstract
The digital era represents a paradigm shift in gifted education, moving at an accelerating pace away from traditional models toward flexible and personalized technology-based pathways. This study investigates the impact of a model implemented via the FutureX platform in Saudi Arabia on the autonomy and self-regulated learning (SRL) of 63 gifted high school students. Using a quasi-experimental design, the study integrated quantitative measures (paired t-tests) with phenomenological analysis of interviews. The quantitative results showed statistically significant improvements (p < 0.001) in the dimensions of autonomy and self-regulated learning, with large Cohen’s d effect sizes for planning (d = 1.05), monitoring (d = 1.05), and cognitive control (d = 1.30). These gains were supported by a pedagogical design intentionally embedded within the platform to scaffold self-regulation. These findings were reinforced by qualitative results, with 88% of gifted students reporting that the platform provided appropriately challenging content and promoted self-learning and goal-setting behaviors.
33 pages, 1460 KB  
Article
Systematic Analysis of Vision–Language Models for Medical Visual Question Answering
by Muhammad Haseeb Shah and Heriberto Cuayáhuitl
Multimodal Technol. Interact. 2026, 10(2), 16; https://doi.org/10.3390/mti10020016 - 3 Feb 2026
Abstract
General-purpose vision–language models (VLMs) are increasingly applied to imaging tasks, yet their reliability on medical visual question answering (Med-VQA) remains unclear. We investigate how three state-of-the-art VLMs—ViLT, BLIP, and MiniCPM-V-2—perform on radiology-focused Med-VQA when evaluated in a modality-aware manner. Using SLAKE and OmniMedVQA-Mini, we construct harmonised subsets for computed tomography (CT), magnetic resonance imaging (MRI), and X-ray, standardising schema and answer processing. We first benchmark all models in a strict zero-shot setting, then perform supervised fine-tuning on modality-specific data splits, and finally add a post-hoc semantic option-selection layer that maps free-text predictions to multiple-choice answers. Zero-shot performance is modest (exact match ≈20% for ViLT/BLIP and 0% for MiniCPM-V-2), confirming that off-the-shelf deployment is inadequate. Fine-tuning substantially improves all models, with ViLT reaching ≈80% exact match and BLIP ≈50%, while MiniCPM-V-2 lags behind. When coupled with option selection, ViLT and BLIP achieve 90–93% exact match and F1 across all modalities, corresponding to 95–97% BERTScore-F1. Our results show that (i) modality-specific supervision is essential for Med-VQA, and (ii) post-hoc option selection can transform strong but imperfect generative predictions into highly reliable discrete decisions on harmonised radiology benchmarks. The latter is useful for medical VLMs that combine generative responses with option or sentence selection.
33 pages, 24792 KB  
Article
A User-Centered Evaluation of a VR HMD-Based Harvester Training Simulator
by Pranjali Barve and Raffaele De Amicis
Multimodal Technol. Interact. 2026, 10(2), 15; https://doi.org/10.3390/mti10020015 - 2 Feb 2026
Abstract
Skilled operation of forestry harvesters is essential for ensuring safety, efficiency, and sustainability in logging practices. However, conventional training methods are often prohibitively expensive and limited by access to specialized equipment. This study delivers one of the first user-centered validations of a low-cost, VR HMD-based forestry harvester simulator, directly addressing access and scalability barriers in training. With 26 participants, we quantify cognitive load, usability, user experience, and simulator sickness using established instruments. Cognitive load increased from the baseline tutorial to each training module (NASA-TLX: 18.65 → 34.26 → 38.43; rm-ANOVA, p < 0.001). Usability was ‘Good’ (mean SUS score: 76.63), hedonic UX ranked in the top decile (UEQ-S), and simulator sickness was moderate (mean SSQ score: 28.91), while task success remained high across all modules. These results indicate early-stage feasibility and usability of a low-cost VR HMD harvester simulator for student-focused introductory instruction, and they provide actionable design guidance (e.g., managing extraneous load, comfort safeguards), advancing evidence-based VR HMD training in the forest engineering and harvesting domain. Our findings validate the potential of VR HMDs as a tool for forestry education capable of addressing training accessibility gaps and enhancing learner motivation through immersive experiential learning.
29 pages, 2740 KB  
Article
HCI-Centered Experiences of ICT Integration and Their Impact on Professional Competencies Supporting Formative Assessment in Higher Education e-Learning
by Abdelaziz Boumahdi, Fadwa Ammari and Mohammed Ammari
Multimodal Technol. Interact. 2026, 10(2), 14; https://doi.org/10.3390/mti10020014 - 2 Feb 2026
Abstract
As universities expand their e-learning systems, it becomes increasingly important to understand how the use of information and communication technologies (ICTs) changes the skills needed for effective formative assessment. This study uses the principles of human–computer interaction (HCI) to create a framework for examining how digital tools, interfaces, and modes of interaction influence the way teachers assess students in higher education. The research relies on the information provided by 115 Mohammed V University teachers, who filled out a competency-based assessment grid regarding online assessment practices. The results remain exploratory and context-dependent and do not make claims of statistical representativeness beyond the studied institutional context. The findings confirm that digital technology improves methodological and techno-pedagogical skills, while also revealing serious shortcomings in semio-ethical and evaluative skills. Leveraging feedback can help correct imperfections in evaluation practices and make them more responsive to digital interfaces. It is becoming imperative to rethink professional skills as the regulatory core of the online formative assessment system, in order to build a more synergistic framework that gives better visibility to virtual classrooms.
25 pages, 17750 KB  
Article
A Mixed Reality Tool with Automatic Speech Recognition for 3D CAD Based Visualization and Automatic Dimension Generation in the Industry 5.0 Shipyard
by Aida Vidal-Balea, Antón Valladares-Poncela, Javier Vilar-Martínez, Tiago M. Fernández-Caramés and Paula Fraga-Lamas
Multimodal Technol. Interact. 2026, 10(2), 13; https://doi.org/10.3390/mti10020013 - 1 Feb 2026
Abstract
Industry 5.0 is composed of a variety of complex tasks and challenging processes requiring specialized labor and multidisciplinary coordination. Specifically, when it comes to shipbuilding, shipyards leverage advanced technologies, seeking to replace operations that continue to rely on traditional methods, such as 2D blueprints and paper-based documentation, which can lead to inefficiencies and alignment errors in precision-dependent tasks. For this reason, this article focuses on embracing Mixed Reality (MR) technologies to address these challenges in the context of electrical outfitting tasks. The design, development, and evaluation of an MR application tailored for HoloLens 2 smart glasses aim to streamline the workflow for operators, reducing reliance on paper-based documentation and enhancing the precision of assembly processes. The proposed system allows for the precise positioning of 3D models in the real environment, ensuring accurate alignment during assembly. Additionally, it incorporates automatic dimension generation between objects in the scene. To further enhance usability, the application integrates a Galician on-device Automatic Speech Recognition (ASR) system, allowing operators to interact seamlessly with the MR interface using voice commands. The whole system has been exhaustively tested, through both usability and functionality evaluations, which validate MR as a viable tool for shipyard assembly and inspection tasks.
18 pages, 1730 KB  
Article
Design and Prototype of a Chatbot for Public Participation in Major Infrastructure Projects
by Jonathan Matthei, Johannes Maas, Maurice Wischum, Sven Mackenbach and Katharina Klemt-Albert
Multimodal Technol. Interact. 2026, 10(2), 12; https://doi.org/10.3390/mti10020012 - 30 Jan 2026
Abstract
Public participation is a central element of democratic decision-making processes, but it often faces challenges within planning approval procedures due to problems of understanding and accessibility. This paper aims to counteract these challenges through the conceptual development, prototypical implementation, and validation of a chatbot. The chatbot is designed to facilitate access to planning documents and improve the participation process as a whole. After presenting the theoretical foundations of chatbots and large language models (LLMs), three central use cases are described. The main tasks of the chatbot are to simplify the language of complex planning documents, find documents and information, and answer frequently asked questions. The underlying architecture of the prototype is based on the concept of retrieval-augmented generation (RAG) and uses a vector database in which the information is embedded and stored as vectors. To evaluate the developed prototype, four focus workshops were conducted with professionals affiliated with road and rail infrastructure administrations at both state and federal levels in Germany. During these workshops, participants tested the core functionalities and assessed the system using both quantitative and qualitative criteria. The results indicate a strong potential for improving the handling of standard inquiries. By improving access to complex planning documents, the system may also contribute to a reduction in objections. At the same time, the evaluation emphasizes the importance of limiting hallucinations through appropriate technical safeguards and clearly indicating the use of AI to users. The insights gained from this study will be incorporated into the prototype developed within the BIM4People research project, funded by the German Federal Ministry of Transport. The aim, therefore, is to implement additional use cases and continuously optimize the functionality of the system through an iterative development process.