Journal Description
Multimodal Technologies and Interaction
Multimodal Technologies and Interaction is an international, peer-reviewed, open access journal on multimodal technologies and interaction, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), Inspec, dblp Computer Science Bibliography, and other databases.
- Journal Rank: JCR - Q2 (Computer Science, Cybernetics) / CiteScore - Q1 (Neuroscience (miscellaneous))
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 21.7 days after submission; the time from acceptance to publication is 4.6 days (median values for papers published in this journal in the second half of 2025).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- Journal Cluster of Artificial Intelligence: AI, AI in Medicine, Algorithms, BDCC, MAKE, MTI, Stats, Virtual Worlds and Computers.
Impact Factor: 2.4 (2024); 5-Year Impact Factor: 2.7 (2024)
Latest Articles
Empirical Validation of Fitts’ Law in Virtual Reality: Modeling, Prediction, and Modality Comparison
Multimodal Technol. Interact. 2026, 10(5), 49; https://doi.org/10.3390/mti10050049 - 1 May 2026
Abstract
Fitts’ law is a foundational model for predicting pointing performance and has been increasingly explored in immersive virtual reality (VR) environments. This paper presents a controlled experimental framework for deriving modality-specific Fitts’ law models in VR and evaluating their predictive transfer to applied interaction tasks. The framework comprises two scenarios. The first replicates a standardized ISO 9241 pointing task in a 3D virtual environment to derive predictive movement time models by systematically varying target distance (20–50 cm), target size (2.5–5 cm), and spatial configuration. The second simulates an applied warehouse-inspired task involving tool sorting and structured placement actions to evaluate the generalizability of the derived models in more ecologically valid VR interactions. Thirty-two participants completed all tasks using the Meta Quest 3 headset and two interaction modalities: a handheld controller and hand tracking with gesture recognition. Results show that Fitts’ law remains a strong predictor of movement time for 3D pointing in VR, with high linear fits for both the controller and hand tracking. However, models derived from standardized pointing tasks showed limited transferability to applied object-manipulation scenarios, producing prediction errors of approximately 27–35% and systematically underestimating movement times. Additionally, both objective metrics and subjective evaluations indicated that controller-based interaction outperformed hand tracking in efficiency, accuracy, perceived workload, and usability. These findings highlight both the robustness and limitations of Fitts-based performance modeling in realistic VR interaction contexts.
Full article
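The movement time model referred to above is the Shannon formulation of Fitts' law, MT = a + b · log2(D/W + 1), fit by linear regression over index-of-difficulty conditions. A minimal sketch of that standard fitting procedure follows; the condition grid mirrors the abstract's ranges, but the movement times and recovered coefficients are synthetic illustrations, not the study's data.

```python
import math

def shannon_id(distance, width):
    """Index of difficulty in bits, Shannon formulation: log2(D/W + 1)."""
    return math.log2(distance / width + 1.0)

def fit_fitts(ids, times):
    """Ordinary least squares fit of MT = a + b * ID; returns (a, b)."""
    n = len(ids)
    mean_id = sum(ids) / n
    mean_mt = sum(times) / n
    b = (sum((x - mean_id) * (t - mean_mt) for x, t in zip(ids, times))
         / sum((x - mean_id) ** 2 for x in ids))
    a = mean_mt - b * mean_id
    return a, b

# Condition grid matching the abstract's ranges (cm); the movement
# times below are synthetic, generated from made-up coefficients.
conds = [(d, w) for d in (20, 35, 50) for w in (2.5, 5.0)]
ids = [shannon_id(d, w) for d, w in conds]
times = [0.3 + 0.2 * x for x in ids]  # perfectly linear toy data

a, b = fit_fitts(ids, times)  # recovers the toy intercept and slope
```

On real data the residual variance around this line is what the reported linear fit quality summarizes, and transfer error arises when coefficients fit on one task family are applied to another.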
Open Access Article
FLAG: Fatty Liver Awareness Game for Liver Health Literacy in Last-Semester Software Engineering Students
by Franklin Parrales-Bravo, José Borbor-Albay, Janio Jadán-Guerrero and Leonel Vasquez-Cevallos
Multimodal Technol. Interact. 2026, 10(5), 48; https://doi.org/10.3390/mti10050048 - 1 May 2026
Abstract
Non-alcoholic fatty liver disease affects approximately thirty percent of the global population, yet public awareness remains dangerously low among young adults facing occupational risk factors. This study introduces the Fatty Liver Awareness Game (FLAG), an educational serious game designed to improve liver health literacy among software engineering students at the University of Guayaquil. While evaluated with this specific sample, FLAG is intended for the broader target population of young adults in developing nations who face occupational sedentary risk and limited access to preventive health education. Through a controlled experiment with fifty participants randomly assigned to game-based or traditional lecture instruction, the game demonstrated superior effectiveness, with a twenty-percentage-point advantage in post-test scores and a seventy-two percent reduction in incorrect responses compared to fifty percent in the lecture group. The large effect size (Cohen’s d = 1.43) and reduced performance variability among game participants indicate that interactive, feedback-rich learning environments can outperform passive instruction for this population and content domain. While the present design does not isolate the contribution of individual game elements—such as narrative framing, explanatory feedback, or mini-game interleaving—the results establish FLAG as a replicable model for digital health interventions targeting underserved populations at critical developmental junctures. Future component analyses are needed to determine which specific design features drive the observed advantages.
Full article
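The effect size cited above (Cohen's d = 1.43) is conventionally computed with the pooled standard deviation of the two groups. A minimal sketch of that formula, using invented group statistics rather than the study's scores:

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Hypothetical post-test scores (not the study's data):
# game group vs. lecture group, 25 participants each.
d = cohens_d(80.0, 10.0, 25, 60.0, 10.0, 25)  # -> 2.0 with equal SDs
```

Values around 0.8 are conventionally read as large, which is why a d above 1 marks a substantial between-group difference.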
Open Access Article
Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification
by Padmavathi Kesavan, Miranda Lakshmi Travis, Martin Aruldoss and Martin Wynn
Multimodal Technol. Interact. 2026, 10(5), 47; https://doi.org/10.3390/mti10050047 - 30 Apr 2026
Abstract
This study presents a multimodal deep learning framework for automatic proficiency and style classification of parallel bilingual Tamil–Hindi learner data. The proposed system employs a dual-headed neural architecture to simultaneously predict proficiency levels (Basic, Advanced) and stylistic categories (Formal, Literary) using shared feature representations. A curated dataset of bilingual text samples is utilized, along with synthetic speech generated through text-to-speech (TTS) to enable controlled multimodal experimentation. Five deep learning architectures are evaluated under text-only, audio-only, and learnable fusion settings. Experimental findings indicate that text-based models consistently achieve strong performance in both proficiency and style classification tasks. In contrast, the audio-only model demonstrates limited effectiveness, highlighting the constraints of synthetic acoustic features in capturing meaningful linguistic information. The fusion models provide only marginal improvements over text-based approaches, suggesting that textual representations play a dominant role in proficiency and stylistic classification within controlled datasets. These results emphasize the importance of linguistic features over acoustic signals for automated language assessment in low-resource settings. The proposed framework provides a scalable and reproducible approach and offers a foundation for future work incorporating real speech data and more diverse linguistic inputs.
Full article
Open Access Article
AI-Enhanced Motion Capture for Multimodal Interaction in Chinese Shadow Puppetry Heritage
by Gaihua Wang, Hengchao Yun, Lixin Yang, Qingyuan Zheng and Tianmuran Liu
Multimodal Technol. Interact. 2026, 10(5), 46; https://doi.org/10.3390/mti10050046 - 28 Apr 2026
Abstract
This study examines how AI-enhanced motion capture (AI-MoCap) mediates the preservation, transmission, and re-creation of Chinese shadow puppetry as performative intangible cultural heritage. Through a state-of-the-art review and comparative analysis of three representative application models—technology-driven, culturally integrated, and entertainment-oriented—the paper explores how AI-MoCap supports the digitization of performative techniques while reshaping modes of cultural presentation and interaction. Cross-case comparison highlights recurring tensions between technical standardization and cultural authenticity while also indicating possibilities for symbolic reconstruction, contextual continuity, and ethically grounded design. Based on this comparison, the paper develops a dual-channel inheritance framework—“perception–symbol” and “design–performance”—and treats cultural resolution and digital ethics as analytical and normative principles for resisting algorithmic homogenization. Rather than functioning only as a digitization tool, AI-MoCap can be understood as a mediating mechanism whose cultural value depends on how it remains embedded in community-based performative logics, symbolic systems, and ethical boundaries. The resulting framework offers transferable guidance for future research, curation, training, and policy discussion in the digital safeguarding of performance-based heritage.
Full article
(This article belongs to the Special Issue Human-AI Collaborative Interaction Design: Rethinking Human-Computer Symbiosis in the Age of Intelligent Systems)
Open Access Systematic Review
Interactive Narratives and Serious Games in Oncology and Grief Support: A Systematic Literature Review
by João Macieira, Marco Vale, Elena Vanica and Vitor Carvalho
Multimodal Technol. Interact. 2026, 10(5), 45; https://doi.org/10.3390/mti10050045 - 27 Apr 2026
Abstract
The impact of oncological diseases extends far beyond the clinical patient, profoundly affecting the mental health of caregivers, family members, and volunteers who navigate complex emotional landscapes of grief, anxiety, and trauma. While the domain of digital health has seen a proliferation of serious games aimed at pediatric patient education and treatment adherence, the specific perspective of the “second-order patient”, the caregiver or survivor, remains significantly under-explored. The primary objective of this study is to systematically review the current state of interactive narratives in oncology, palliative care, and grief support, identifying research gaps to inform the broader design space of empathy-driven serious games. Following the PRISMA guidelines, 31 articles were selected from an initial query of 116 records. Interventions were categorized into Serious Games, Games, and Gamification. The analysis reveals a critical thematic transition: early interventions relied heavily on biological “battle” metaphors to empower patients, whereas the current literature advocates for “thanatosensitive” designs that foster empathy. However, a distinct research gap persists regarding narratives that explore post-loss meaning reconstruction and the hospital volunteer experience. Synthesizing these findings, this paper establishes an evidence-based theoretical framework demonstrating a significant opportunity for games that prioritize dialogue and emotional processing over traditional winning conditions. As a practical application of these findings, we also briefly outline the conceptualization of a prototype simulating a widower’s experience volunteering in a palliative ward, shifting the ludic focus from defeating a disease to navigating loss.
Full article
Open Access Project Report
From Tradition to Technology: A Framework for Smart Pilgrim Management on the Camino de Santiago
by Adriana Mar, Fernando Monteiro, Pedro Pereira, Jose Carlos García, João F. A. Martins and Daniel Basulto
Multimodal Technol. Interact. 2026, 10(5), 44; https://doi.org/10.3390/mti10050044 - 23 Apr 2026
Abstract
The Camino de Santiago, a UNESCO-listed pilgrimage route, has experienced sustained growth in visitor numbers, challenging municipalities to preserve cultural integrity while ensuring service quality. This study reviews people-counting technologies and proposes a smart pilgrim management framework grounded in flux measurement systems to support data-driven and sustainable decision-making. Drawing on the smart tourism literature, the conceptual framework integrates infrared counters, mobile tracking solutions, and GPS/Wi-Fi data to generate real-time insights into pilgrim flows. A pilot simulation illustrates how these data can inform operational and strategic planning. The framework enables local authorities to monitor pedestrian movements, anticipate service demands (sanitation, accommodation, and safety), and detect overcrowding in sensitive heritage areas. By incorporating technological solutions into traditionally low-tech pilgrimage settings, municipalities can transition from reactive to proactive management approaches. The paper contributes a scalable and ethically grounded framework tailored to heritage pilgrimage routes, advancing smart tourism applications in culturally significant contexts.
Full article
Open Access Systematic Review
Who, Where, What, and How to Nudge: A Systematic Review of Co-Designed Digital Nudges for Behavioral Interventions
by Alaa Ziyud, Khaled Al-Thelaya and Jens Schneider
Multimodal Technol. Interact. 2026, 10(4), 43; https://doi.org/10.3390/mti10040043 - 21 Apr 2026
Abstract
Digital nudges refer to subtle modifications in digital choice architectures that are increasingly applied across domains such as healthcare, human–computer interactions, and behavioral science. However, existing approaches often overlook users’ needs, contextual factors, and ethical considerations related to transparency and autonomy. This systematic literature review, guided by PRISMA 2020, examines the integration of co-design methodologies in digital nudging across four dimensions: participants, application domains, nudge forms, and development methods. The findings show that co-design is primarily driven by end-users, supported by domain experts and technology specialists. Applications are concentrated in health-related contexts, particularly chronic disease management and mental health. The effectiveness of priming varied across studies, with some reporting short-term benefits and others indicating user fatigue, suggesting context-dependent impact and limited long-term effectiveness.
Full article
Open Access Article
From Prompts to High-Fidelity Prototypes: A Usability Evaluation of Generative AI-Driven Prototyping Tools for Smart Mobile App Design
by John Bustamante-Orejuela, Xavier Quiñonez-Ku and Pablo Pico-Valencia
Multimodal Technol. Interact. 2026, 10(4), 42; https://doi.org/10.3390/mti10040042 - 17 Apr 2026
Abstract
The integration of Generative Artificial Intelligence (GAI) into software design tools has transformed the early stages of mobile application development, particularly prototype creation from natural-language prompts. This study evaluates the usability and effectiveness of GAI-assisted prototyping tools for generating high-fidelity mobile application prototypes. A controlled laboratory usability study was conducted in which undergraduate Information Technology Engineering students used and evaluated four widely adopted prototyping platforms: Figma, Uizard, Visily, and Stitch. Participants employed these tools to recreate mobile interfaces corresponding to the interaction model of the Duolingo application. The System Usability Scale (SUS) was used to assess perceived usability and effectiveness from the users’ perspective. The results indicate that all evaluated tools enabled rapid prototype generation; however, significant differences emerged in usability, structural fidelity, and perceived control. Figma and Stitch achieved the highest usability scores and demonstrated greater alignment with the reference prototype (82.86 and 80.36, respectively). Visily achieved a favorable usability score (78.57), while Uizard obtained a moderate score (67.14). Although Uizard and Visily exhibited strong automation capabilities and faster initial generation, their outputs required additional manual refinement to achieve higher fidelity and customization. Participant feedback emphasized the importance of output quality, responsiveness, and foundational design knowledge in achieving satisfactory results. Overall, the findings suggest that current GAI-based prototyping tools are effective and valuable in real-world software development contexts. However, their effectiveness appears closely related to the degree of user control, responsiveness, and the ability to iteratively refine AI-generated interface components.
Full article
(This article belongs to the Special Issue Intelligent Interaction Design: Innovative Models and the Future of Human–Computer Experience)
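The SUS scores compared above (82.86, 80.36, 78.57, 67.14) come from a fixed scoring rule over ten 5-point Likert items: odd-numbered (positively worded) items contribute their rating minus one, even-numbered items contribute five minus their rating, and the sum is scaled by 2.5 to a 0–100 range. A sketch of that standard computation, with an invented response vector:

```python
def sus_score(responses):
    """System Usability Scale score on a 0-100 range.

    Odd-numbered items (index 0, 2, ...) contribute (r - 1);
    even-numbered items contribute (5 - r); the sum is scaled by 2.5.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses on a 1-5 scale")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# Hypothetical respondent who agrees with every positive item and
# disagrees with every negative item: the maximum score.
score = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])  # -> 100.0
```

Per-tool scores like those reported are then means over the participants' individual SUS scores.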
Open Access Article
Introducing Brain–Computer Interfaces in Factories and Fabrication Lines for the Inclusion of Disabled Workers–Industry 5.0—A Modern Challenge and Opportunity
by Marian-Silviu Poboroniuc, Zoltán Nochta, Martin Klepal, Nina Hunter, Danut-Constantin Irimia, Alina Georgiana Baciu, Kelaja Schert, Tim Piotrowski and Alexandru Mitocaru
Multimodal Technol. Interact. 2026, 10(4), 41; https://doi.org/10.3390/mti10040041 - 17 Apr 2026
Abstract
Flexible factories and adaptive fabrication lines offer a testbed for advanced multimodal interaction concepts that can support the inclusion of disabled workers in Industry 5.0 manufacturing systems. The study synthesizes interdisciplinary data from ergonomics, industrial automation, and EU regulatory frameworks to establish a conceptual model for human-machine interaction. Building on conceptual modeling and a structured literature analysis, the study proposes a six-step integration framework that links task demands, worker capabilities, and interaction modalities within human-in-the-loop manufacturing environments. Although no empirical case study was conducted in this phase, an exemplary application is presented for a semi-automated bike wheel manufacturing process. Detailed machine-based assembly line flows and simulated process data were utilized for illustrative purposes to depict the process and validate the proposed Capability–Task Matching Matrix. The results operationalize the human-centric vision of Industry 5.0 by providing a structured methodology for the inclusion of disabled workers within fabrication environments. The findings are organized into two primary components: the conceptual development of the Integration Approach and its practical application to a semi-automated industrial use-case. Finally, a particular focus is placed on Brain–Computer Interfaces (BCIs) as an emerging interaction channel that enables non-muscular control, attention monitoring, and neuroadaptive feedback, complementing conventional interfaces rather than replacing them. The framework is illustrated through application to the same semi-automated bicycle wheel assembly line, where BCI-supported interaction, augmented interfaces, and robotic assistance are mapped to specific production tasks and assessed in terms of feasibility and technological maturity. 
Drawing on the paper’s results, an explanatory 10-year roadmap outlines the feasibility and phased deployment of BCI solutions. It aligns technological advances with European regulations and a vision for a fully inclusive manufacturing enterprise.
Full article
Open Access Article
The Discrimination Threshold on the Palm for Two Successive Rectangular Stimuli
by Mayuka Kojima and Akio Yamamoto
Multimodal Technol. Interact. 2026, 10(4), 40; https://doi.org/10.3390/mti10040040 - 15 Apr 2026
Abstract
This study investigates tactile spatial resolution on the palm using two successive rectangular stimuli. Whereas classical tactile resolution studies have focused mainly on point or circular stimulation, less is known about how spatial resolution depends on the placement and geometry of rectangular, device-relevant stimuli. We measured the successive two-stimulus discrimination threshold using three rectangular stimulators across five palm areas aligned along the proximal–distal axis. Participants compared a fixed reference stimulus with a variable comparison stimulus, and the minimum separation at which the two stimuli were perceived as occurring at different locations was recorded as the threshold. The overall average threshold across all experimental conditions was approximately 5.2 mm. The threshold varied systematically across palm regions, being smallest around the palmar digital crease and the base of the fingers. In the central palm, threshold differences were more evident for changes in stimulator width than for changes in stimulator length. These results extend tactile spatial resolution research beyond point stimulation and provide design-relevant guidance for palm-based haptic devices.
Full article
Open Access Article
Multimodal Smart-Skin for Real-Time Sitting Posture Recognition with Cross-Session Validation
by Giva Andriana Mutiara, Muhammad Rizqy Alfarisi, Paramita Mayadewi, Lisda Meisaroh and Periyadi
Multimodal Technol. Interact. 2026, 10(4), 39; https://doi.org/10.3390/mti10040039 - 9 Apr 2026
Abstract
Prolonged sitting with poor posture is associated with musculoskeletal disorders, reduced productivity, and long-term health risks. Many existing posture monitoring systems predominantly rely on single-modality sensing, such as pressure or vision-based approaches, limiting their ability to capture both static alignment and dynamic micro-movements. This study proposes a multimodal smart-skin system integrating pressure, temperature, and vibration sensors for sitting posture recognition. A total of 42 sensors distributed across 14 anatomical locations were deployed, generating 15,037 samples collected over three independent sessions to evaluate cross-session temporal generalization across nine posture classes under controlled experimental conditions. Two deep learning architectures—Temporal Convolutional Networks with Attention (TCN + Attn) and Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM)—were compared under Leave-One-Session-Out (LOSO) cross-validation. TCN + Attn achieved 85.23% LOSO accuracy, outperforming CNN–LSTM by 2.56 percentage points while reducing training time by 36.7% and inference latency by 33.9%. Ablation analysis revealed that temperature sensing was the most discriminative unimodal modality (71.5% accuracy), and full multimodal fusion improved LOSO accuracy by 22.93% compared to pressure-only configurations. These results demonstrate the feasibility of multimodal smart-skin sensing combined with temporal convolutional modeling for cross-session posture recognition and indicate potential for efficient real-time, privacy-preserving ergonomic monitoring. This study should be interpreted as a controlled, single-subject proof-of-concept, and further validation in multi-subject and real-world environments is required to establish broader generalizability.
Full article
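Leave-One-Session-Out cross-validation, as used above across the three recording sessions, holds out each session in turn as the test set while training on the others, so the reported accuracy reflects temporal generalization rather than within-session memorization. A generic splitter can be sketched as follows; the session labels are illustrative, not the study's index layout:

```python
def loso_splits(session_labels):
    """Leave-One-Session-Out: each unique session is held out once as
    the test set while all remaining sessions form the training set.

    Yields (held_out_session, train_indices, test_indices).
    """
    for held_out in sorted(set(session_labels)):
        train = [i for i, s in enumerate(session_labels) if s != held_out]
        test = [i for i, s in enumerate(session_labels) if s == held_out]
        yield held_out, train, test

# Toy label stream over three sessions (not the study's 15,037 samples).
labels = ["s1", "s1", "s2", "s2", "s3"]
splits = list(loso_splits(labels))  # three folds, one per session
```

Overall LOSO accuracy is then the accuracy pooled or averaged over the per-session folds.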
Open Access Article
PAD-Guided Multimodal Hybrid Contrastive Emotion Recognition upon STEM-E2VA Dataset
by Shufei Duan, Wenjie Zhang, Liangqi Li, Ting Zhu, Fangyu Zhao, Fujiang Li and Huizhi Liang
Multimodal Technol. Interact. 2026, 10(4), 38; https://doi.org/10.3390/mti10040038 - 2 Apr 2026
Abstract
There are still challenges in speech emotion recognition, as the representation capability of single-modal information is limited, there are difficulties in capturing continuous emotional transitions in discrete emotion annotations, and the issues of modal structural differences and cross-sample alignment in multimodal fusion methods persist. To address these, this study undertakes work from both data and model perspectives. For data, a Chinese multimodal database STEM-E2VA was constructed, synchronously collecting four modalities of data: articulatory kinematics, acoustics, glottal signals, and videos. This covers seven discrete emotion categories and employs PAD continuous annotation. By integrating discrete and continuous dimensional annotations, it better represents the distinction between strong and weak emotions under the same discrete emotion label. Concurrently, to process the biases in PAD annotations, we employed the SCL-90 psychological questionnaire to analyze annotators’ cognitive and emotional perceptions, thereby ensuring data reliability. For model, this paper proposes a multimodal supervised contrastive fusion network incorporating PAD perception. It employs a PAD-enhanced hybrid contrastive loss function to optimize intra-model and inter-modal feature alignment. Utilizing a cross-attention mechanism combined with a GRU–Transformer network for temporal feature extraction, it achieves deep fusion of multimodal information, reducing inter-modal discrepancies and cross-class confusion. Experiments demonstrate that the proposed method achieves 85.47% accuracy in discrete sentiment recognition on STEM-E2VA, with a substantial reduction in RMSE for PAD dimension prediction. It also exhibits excellent generalization capability on IEMOCAP, providing a novel framework for integrating discrete and continuous sentiment representations.
Full article
Open Access Article
Ergonomic Evaluation of Augmented Reality-Based Visualization of Scattered Radiation Distribution During Partial-Angle CT
by Hiroaki Hasegawa
Multimodal Technol. Interact. 2026, 10(4), 37; https://doi.org/10.3390/mti10040037 - 2 Apr 2026
Abstract
Computed tomography (CT)-guided procedures require close proximity to the CT gantry or patient, increasing occupational exposure to scattered radiation. Even though radiation-protective equipment is commonly used, the optimization of CT fluoroscopic techniques remains important. Partial-angle CT (PACT) employs a limited exposure angle, producing cumulative scattered radiation distributions that vary with the selected angle and are difficult to estimate in advance. I aimed to develop an augmented reality (AR)-based visualization method for cumulative scattered radiation distributions during PACT and to evaluate its ergonomic feasibility as a proof of concept for occupational exposure reduction. An AR display system was developed to overlay cumulative scattered radiation distributions onto physical space using AR glasses. Workload was assessed using the NASA Task Load Index (NASA-TLX), and usability was assessed using the System Usability Scale (SUS). Compared with non-virtual conditions using radiation-protective glasses alone, AR-assisted visualization was associated with increased perceived workload, and usability scores were lower than those reported in previous AR studies. These findings indicate that, for AR display systems to support occupational exposure reduction, perceived task demands must be comparable to conventional protection strategies. Further improvements in visualization methods, user familiarity with AR environments, and ergonomic optimization are required to facilitate clinical implementation.
Full article
Open Access Article
The Emergent Rhythms of a Robot Vacuum Cleaner—An Empirically Grounded Account of Agential Realism
by Linus de Petris, Siamak Khatibi and Yuan Zhou
Multimodal Technol. Interact. 2026, 10(4), 36; https://doi.org/10.3390/mti10040036 - 1 Apr 2026
Abstract
This article builds on the argument that design for complex interactive systems should shift from creating linear transactional interactions toward organizing relational complexity. Grounded in Karen Barad’s agential realism, we argue that a designer’s role can benefit from not predefining interactions but from curating the material-discursive conditions under which meaningful relations can emerge. To explore the empirical and temporal dimensions of this practice, we conducted an exploratory workshop setting the conditions for emergent gameplay dynamics and discussions on agential realist anticipation. Participants utilized a custom-designed game and built their own physical controllers to anticipate and adapt to shifting gameplay conditions. Our results demonstrate how alterations in relational constraints, rather than explicit pre-programmed goals, drove the emergence of non-predefined gameplay rhythms. The findings provide empirical grounding for an agential realist understanding of anticipation, showing that an interactive system’s identity lies in its unfolding processual patterns rather than a static final state. Based on these findings, we propose three design principles for further exploration: Design for Relational Emergence, Design for Re-membering, and Design for Emergent Patterns. Consequently, we conclude by outlining a conceptual approach for non-linear computational architectures, drawing on principles from Enactive AI and reservoir computing.
Full article

Open Access Article
Reading Noise: Integrating Physiological Sensing and Sound-Driven Visualization to Externalize Noise-Related Cognitive Disruption During Reading
by
Xueyi Li, Yonghong Liu, Zihui Jiang and Yangcheng Wang
Multimodal Technol. Interact. 2026, 10(4), 35; https://doi.org/10.3390/mti10040035 - 30 Mar 2026
Abstract
Environmental noise may interfere with the reading experience by increasing cognitive load and psychophysiological arousal, yet these effects are difficult to perceive and communicate in real time. This study presents Reading Noise, an interactive installation that combines physiological sensing and sound-driven visualization to externalize perceived noise-related disturbance and psychophysiological strain during reading. In a controlled experiment, 46 participants completed reading tasks under four levels of background conversational noise (0–30, 31–60, 61–90, and >90 dB) while ambient sound level, electrodermal activity (EDA), and electrocardiogram (ECG) were recorded in real time. Following data quality screening, inferential statistical analyses were performed on the analyzable physiological subset (n = 16). Based on these data, a hybrid mapping strategy combining rule-based assignment and LMM-informed exploratory calibration was developed to map acoustic and physiological changes onto dynamic text-based visual parameters, including deformation intensity, jitter, and motion instability, for real-time feedback. Within the analyzable subset, noise level was associated with significant changes in the recorded physiological indicators (all p < 0.05): skin conductance level (SCL) and skin conductance responses per minute (SCRs/min) increased (4.69 ± 2.13 to 5.93 ± 2.19 μS; 1.49 ± 1.59 to 2.51 ± 2.13), whereas the percentage of successive RR intervals differing by more than 50 ms (pNN50) and the root mean square of successive differences (RMSSD) decreased (15.84 ± 16.52% to 10.57 ± 11.35%; 36.63 ± 17.62 to 29.67 ± 16.66 ms). Subjective cognitive load also increased significantly (2.06 ± 0.29 to 6.38 ± 0.31). 
A follow-up installation study with 24 cross-disciplinary participants, with reported group interaction observations drawn from a 12-participant subset, suggested that the installation may facilitate shared interpretation of attention-related disruption and cognitive strain, indicating the potential of physiology-informed visual translation as a boundary object approach for empathetic, sound-mediated communication.
Full article
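The time-domain heart-rate-variability measures reported above (RMSSD and pNN50) follow standard definitions over successive RR intervals. A minimal sketch of that computation, assuming a simple list-based pipeline rather than the authors' actual analysis code:

```python
import math

def hrv_metrics(rr_ms):
    """Time-domain HRV metrics from a sequence of RR intervals (ms).

    RMSSD: root mean square of successive RR differences.
    pNN50: percentage of successive differences exceeding 50 ms.
    Illustrative sketch only; real pipelines also filter ectopic beats.
    """
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return rmssd, pnn50

# Hypothetical RR series containing two large successive changes:
rr = [800, 810, 780, 900, 795, 805]
rmssd, pnn50 = hrv_metrics(rr)
```

Lower RMSSD and pNN50 under louder noise, as reported in the abstract, indicate reduced parasympathetic activity, which is why the study pairs them with rising skin-conductance measures as converging stress indicators.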

Open Access Article
Distributed Teaching Agency–AI in the University: A Typology Based on Student Voice
by
Tomás Fontaines-Ruiz, Antonio Ponce-Rojo, Paolo Fabre Merchán, Walther Casimiro Urcos and Liliana Cánquiz Rincón
Multimodal Technol. Interact. 2026, 10(4), 34; https://doi.org/10.3390/mti10040034 - 27 Mar 2026
Abstract
Generative AI is reshaping university teaching and creating tension around authority, evidence, and accountability when decisions are made using algorithms. From a student perspective, this study constructed a typology of distributed teacher–AI agency (TAI) and examined the discursive mechanisms that produce the illusion of teacher autonomy. A non-experimental, cross-sectional, explanatory study was conducted: a lexicometric analysis (ALCESTE, implemented in IRAMUTEQ) of open-ended questionnaire responses from 3120 students (Mexico, n = 2051; Ecuador, n = 1069), segmented into 1077 units and analyzed using positioning theory. Co-agency was operationalized using Teacher Agency (A), Delegation to AI (D), Governance (G: disclosure, criteria, verification), and the Illusion Index (II = A/(D + G + 1)). Three configurations emerged: Immediate Customizer (28.8%) with very high A and minimal D/G (II = 25.4); Technological Literacy Facilitator (27.3%) with visible delegation and safeguards (II ≈ 2.0); and Operational Optimizer (43.9%) oriented toward accelerating tasks with moderate governance (II ≈ 2.7). The illusion was associated with the agentive erasure of AI and a rhetoric of immediacy/efficiency that replaced verifiable criteria. These findings transform the student voice into a criteria-based diagnostic tool for strengthening traceability, minimal verification, and responsible orchestration of AI in higher education.
Full article
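The Illusion Index defined in the abstract is a simple ratio, II = A/(D + G + 1). A sketch of the arithmetic, using hypothetical component scores chosen only to reproduce the reported index values (they are not the paper's data):

```python
def illusion_index(a, d, g):
    """Illusion Index as defined in the abstract: II = A / (D + G + 1).

    a: Teacher Agency, d: Delegation to AI,
    g: Governance (disclosure + criteria + verification).
    Component scales here are illustrative assumptions.
    """
    return a / (d + g + 1)

# Hypothetical scores consistent with the reported indices (not real data):
profiles = {
    "Immediate Customizer": (25.4, 0.0, 0.0),              # II = 25.4
    "Technological Literacy Facilitator": (8.0, 1.0, 2.0),  # II = 2.0
    "Operational Optimizer": (8.1, 1.0, 1.0),               # II = 2.7
}
for name, (a, d, g) in profiles.items():
    print(f"{name}: II = {illusion_index(a, d, g):.1f}")
```

The +1 in the denominator keeps the index finite when both delegation and governance are absent, which is exactly the Immediate Customizer case: high agency with no visible delegation or safeguards yields the largest index, flagging the strongest illusion of unassisted teacher autonomy.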
Open Access Article
HMI Design of Intelligent Vehicles Based on Multimodal Experiments of Driver Emotions
by
Tongyue Sun, Yongjia Li and Xihui Yang
Multimodal Technol. Interact. 2026, 10(3), 33; https://doi.org/10.3390/mti10030033 - 21 Mar 2026
Abstract
Negative driving emotions constitute a significant factor compromising road safety. Current intelligent vehicle human–machine interaction (HMI) systems predominantly focus on functional implementation, lacking the capability to perceive and adapt to the driver’s psychological state. To address this issue, this study investigates the intrinsic relationship between driving emotions and HMI through multimodal experiments. Experiment One reveals the distribution patterns of drivers’ visual attentional scope under different emotional states. Experiment Two establishes a color preference model for HMI interfaces corresponding to specific emotions. Experiment Three quantitatively analyzes the impact of emotional variations on the perceptual efficiency of auditory warnings. Based on the experimental data, an interaction design principle matching “Emotion-Scene-Modality” is formulated, guiding the design of a data-driven, emotion-adaptive HMI prototype system. This system can perceive the driver’s emotional state in real time via multimodal sensors and dynamically adjust interface color themes, information layout, warning sound effects, and voice interaction style according to predefined interaction strategies. Usability testing demonstrates that, compared to traditional static HMI, this emotion-adaptive system effectively mitigates the driver’s negative emotional load and provides alerts that are more perceptible and less likely to cause irritation during critical moments. Consequently, it offers a significant theoretical foundation and practical reference for constructing a safer and more comfortable next-generation intelligent vehicle cockpit interaction paradigm.
Full article

Open Access Article
Design and Evaluation of Interactive Radar Visualisation of Academic Performance for Parents and Students
by
Ka Ian Chan, Patrick Pang and Huiwen Zou
Multimodal Technol. Interact. 2026, 10(3), 32; https://doi.org/10.3390/mti10030032 - 20 Mar 2026
Abstract
This study investigates how parents and students interpret and form continued engagement intentions with a radar visualisation tool designed to present multi-subject academic performance. While data visualisation is increasingly used in education, limited empirical attention has been given to whether parents and students, who share the same performance information but hold distinct roles, respond to visualised reports through similar behaviours. To address this gap, an interactive radar visualisation was developed to present secondary school students’ achievement across subjects with peer reference points. Drawing on the Unified Theory of Acceptance and Use of Technology (UTAUT) as an analytical framework, this study examines the determinants of continued intention to use the visualisation tool. Questionnaire data were collected from 706 parents and 264 students in a Macao secondary school. Structural equation modelling (SEM) revealed fundamentally different patterns of continued engagement. For parents, continued intention was significantly associated with performance expectancy (PE), effort expectancy (EE), social influence (SI), and facilitating conditions (FC), suggesting the tool functioned as a decision support system for academic planning. For students, only social influence (SI) and facilitating conditions (FC) emerged as significant predictors, indicating that peer comparison and external expectations may not fit their needs. Parents also reported significantly higher continued intention than students. These findings extend UTAUT by demonstrating that core acceptance relationships are moderated by different roles, reframing technology acceptance in educational visualisation from system adoption to information interpretation. The study provides empirical evidence that visualised performance reporting functions not merely as a data display but also as a communication medium whose meaning is actively constructed by users. 
These insights highlight the need for role-sensitive design, emphasising actionable planning support for parents and personally meaningful, agency-oriented feedback for students, in order to foster productive home–school communication and sustained engagement with learning information.
Full article

Open Access Article
Navigating the Future: A Design Fiction Study on User Perceptions of Next-Gen LLM-Based Voice Interaction
by
Biju Thankachan, Deepak Akkil, Sama Rahman, Kristiina Jokinen and Markku Turunen
Multimodal Technol. Interact. 2026, 10(3), 31; https://doi.org/10.3390/mti10030031 - 20 Mar 2026
Abstract
Voice user interfaces (VUIs) have evolved from simple command-based systems to more advanced platforms capable of engaging in complex, multi-turn conversations. While current VUIs primarily perform routine tasks, their future trajectory is poised to be significantly shaped by advancements in large language models (LLMs), enhancing their language understanding and human-like interaction capabilities. This study explores user perceptions of next-generation VUIs using a design fiction approach. We crafted five plausible future scenarios, depicted in comic-style formats, showcasing diverse VUI use-cases. Results from the focus group discussions reveal valuable insights highlighting the potential and challenges of integrating advanced VUIs into everyday interactions. Our results highlight the importance of building trust, factors influencing trust, social aspects and implications of technology, preferences for interaction techniques, and various ethical considerations associated with technology. We conclude by providing design guidelines for future VUIs, emphasizing the need for designing to build trust, the importance of domain specificity, the importance of enabling social experiences mediated via VUIs, and more.
Full article

Open Access Article
Possibilities of Artificial Intelligence in Sports Refereeing: An Exploratory Study Contrasting the Literature Review with Expert-Perceived Opportunities
by
David Martín Moncunill, Domingo Sampedro Lirio and Miguel Ángel Bravo Hijón
Multimodal Technol. Interact. 2026, 10(3), 30; https://doi.org/10.3390/mti10030030 - 19 Mar 2026
Abstract
Sports have progressively incorporated technological advances, yet while the impact on performance and broadcasting is remarkable, the application of Artificial Intelligence (AI) in sports refereeing appears residual. A closer examination of prior research suggests that this limited development reflects deeper conceptual patterns within the field. While existing research on AI in sports officiating has predominantly conceptualized the field under an accuracy-optimization paradigm (focusing on decision precision, visual attention patterns, referee fatigue, and performance enhancement), there is a systematic lack of theoretical and empirical work that frames officiating as a broader socio-technical ecosystem. In particular, the literature does not provide conceptual models addressing (i) AI-assisted risk prevention and athlete safety as a core officiating function, (ii) human–AI task redistribution in cognitively overloaded and hybrid evaluative environments (e.g., disciplines such as artistic gymnastics or bodybuilding, where technical execution and aesthetic judgment are simultaneously assessed), and (iii) the redefinition of the referee’s role when AI operates as an anticipatory or real-time alert system rather than merely as a post hoc verification tool. Thus, the gap is not only one of application but of knowledge production: the dominant paradigm optimizes decision accuracy, yet it leaves the question of how AI can transform refereeing responsibilities, cognitive load distribution, and safety governance within competitive ecosystems under-theorized. This exploratory study adopts a Human–Computer Interaction (HCI) perspective to contrast existing initiatives with the practical expectations of professional referees. The methodology comprises two pillars: a systematic literature review following PRISMA guidelines and qualitative experimentation involving professional referees using focus group and affinity diagram techniques. 
From an initial total of 1251 records retrieved across five academic databases (2019–2025), 1122 articles were analyzed after applying strict inclusion/exclusion criteria. The findings provide preliminary support for our hypothesis of a significant underutilization gap, showing that research is concentrated on accuracy systems, while high-potential areas identified as critical by experts, such as athlete safety, represent only 0.6% of the analyzed literature. The study contributes a conceptual framework based on five categories established by experts, according to the identified use cases, providing guidance for future AI integration and interdisciplinary research in the sports officiating ecosystem. Based on the results, we point to future applications and lines of research aimed at integrating AI as a tool for sports refereeing.
Full article

Highly Accessed Articles
Latest Books
E-Mail Alert
News
Topics
Topic in
AI, Inventions, MTI, Robotics, Sci, Sensors, Standards, Technologies
Toward Trustworthy Human-AI Collaboration: From Interactive Intelligence to Collaborative Autonomy
Topic Editors: George Margetis, Helmut Degen, Stavroula Ntoa
Deadline: 30 November 2026
Topic in
Electronics, MTI, BDCC, AI, Virtual Worlds, Applied Sciences
AI-Based Interactive and Immersive Systems
Topic Editors: Sotiris Diplaris, Nefeli Georgakopoulou, Stefanos Vrochidis, Giuseppe Amato, Maurice Benayoun, Beatrice De Gelder
Deadline: 31 December 2026
Topic in
AI, Arts, Computers, MTI
Artificial Intelligence and the Future of Art
Topic Editors: Ahmed Elgammal, Marian Mazzone
Deadline: 31 October 2027
Conferences
Special Issues
Special Issue in
MTI
Online Learning to Multimodal Era: Interfaces, Analytics and User Experiences
Guest Editors: Nikleia Eteokleous, Rita Panaoura
Deadline: 31 May 2026
Special Issue in
MTI
uHealth Interventions and Digital Therapeutics for Better Diseases Prevention and Patient Care
Guest Editor: Silvia Gabrielli
Deadline: 30 June 2026
Special Issue in
MTI
Behavioral Cybersecurity, Deception and Secure Design
Guest Editors: Derek L. Hansen, Ben Schooley
Deadline: 30 June 2026
Special Issue in
MTI
Intelligent Interaction Design: Innovative Models and the Future of Human–Computer Experience
Guest Editor: Peng-Wei Hsiao
Deadline: 30 June 2026