Article

The Exploration of Combining Hologram-like Images and Pedagogical Agent Gesturing

1 Department of Education, Chonnam National University, Gwangju 61187, Republic of Korea
2 Department of English for International Conferences and Communication, Hankuk University of Foreign Studies, Seoul 02450, Republic of Korea
3 Department of Elementary Education, Jeju National University, Jeju 63243, Republic of Korea
4 College of Liberal Arts, Kongju National University, Cheonan 31080, Republic of Korea
* Author to whom correspondence should be addressed.
Information 2025, 16(5), 350; https://doi.org/10.3390/info16050350
Submission received: 25 February 2025 / Revised: 19 April 2025 / Accepted: 25 April 2025 / Published: 27 April 2025

Abstract

The split-attention principle suggests that separating onscreen information sources can overburden working memory and impede learning. While research has traditionally focused on the separation of images and text, relatively little is known about the impact of multiple simultaneous visual inputs. This study examined the split-attention principle in a multimedia environment featuring a pedagogical agent performing gestures, with hologram-like images either integrated centrally with the agent or spatially separated. A within-subjects design (N = 80) investigated the impact on satisfaction, cognitive load, and cued recall. The quantitative findings revealed no significant differences between the two spatial conditions. Preliminary qualitative insights from a limited sample of six individual interviews suggested that some participants may employ strategies to simplify complex designs and manage perceived cognitive load. Based on these limited qualitative observations, this research tentatively proposes the “pruning principle”, a metacognitive strategy where learners actively “prune” extraneous information to optimize cognitive resources. These findings underscore the importance of considering individual differences and metacognitive strategies in multimedia design.

1. Introduction

Designing multimedia environments presents significant challenges due to the complexities involved in information presentation. To address cognitive load, the Cognitive Theory of Multimedia Learning (CTML) has introduced fifteen principles over the past two decades [1]. For example, the multimedia principle states that combining words and pictures in presentations enhances learning more effectively than using words alone. Similarly, the signaling principle highlights the importance of visual cues, such as arrows or highlighting, to guide attention toward key information. While many CTML principles are well-established [2], certain nuances warrant further exploration to ensure their applicability across diverse multimedia designs. For instance, the redundancy principle originally suggested that presenting identical information through written text and speech alongside pictures could impede learning [3]. However, subsequent research by Mayer and Johnson [4] demonstrated that displaying key terms, rather than verbatim text, alongside narration can enhance learning. This finding underscores the importance of nuanced interpretations and adaptations of CTML principles.
Although CTML offers a robust framework for designing multimedia environments, additional design principles complement and extend its guidelines, often with subtle but important differences, particularly regarding cognitive load. For example, CTML’s spatial contiguity principle closely parallels the split-attention principle [5]. Both principles advocate for reducing extraneous cognitive load by ensuring that related information is presented in close proximity. The spatial contiguity principle states that separating text from related images increases cognitive load, as it forces users to mentally integrate the two elements in working memory [6]. A classic example of the spatial contiguity principle is a map of Europe with the country names listed below the map rather than integrated within it. In such cases, users unfamiliar with the countries must repeatedly shift their focus, increasing cognitive demand on working memory. In contrast, the split-attention principle extends beyond spatial considerations to include factors such as mismatched timing between animations and narration or presenting information through conflicting modes (Ayres & Sweller, 2014). Thus, while CTML addresses certain temporal and modality aspects through its temporal contiguity and modality principles, the split-attention principle provides a broader framework for managing cognitive load across diverse information formats. This is particularly relevant in multimedia environments that incorporate pedagogical agents, where multiple channels of information must be effectively integrated.
While the spatial contiguity and split-attention principles have a significant overlap, the split-attention principle provides deeper insight into multimedia environments with multiple graphical elements, such as pedagogical agents combined with visual aids. Although Mayer [6] argued that a speaker’s image does not significantly enhance learning, the embodiment of pedagogical agents introduces an important nuance, as their design and integration can influence learner engagement and cognitive processing. Mayer and DaPra [7] found that highly embodied agents—featuring lip synchronization, eye gaze, facial expressions, gestures, and body movements—significantly enhance social perception and learning outcomes compared to minimally embodied agents with only lip synchronization. However, incorporating highly embodied agents can increase the complexity of the multimedia environment, potentially contributing to split-attention effects that may hinder learning if not carefully managed. Learners must divide their attention between the agent’s nonverbal cues, speech, and other visual elements, potentially increasing the processing demand on working memory. While research has examined split-attention effects between text and images, there remains a gap in understanding how multiple simultaneous visual inputs—such as pedagogical agent gestures combined with visual aids—affect satisfaction, cognitive processing, learning outcomes, and viewing strategies. This gap is particularly significant for understanding how learners allocate attention and integrate complex visual information during multimedia presentations.
This study aims to investigate whether split-attention effects arise in multimedia environments when multiple visual inputs are presented simultaneously. The design features a pedagogical agent performing gestures at double frequency, with static images either positioned to the side of the agent or integrated centrally within the display. This within-subjects study investigates participants’ satisfaction, cognitive load, learning outcomes, and viewing strategies when engaging with declarative information on the topic of Australia. Furthermore, individual interviews were conducted to explore how learners allocate attention to and integrate complex visual information during multimedia presentations, offering insights into how split attention may have influenced their interactions with the visual elements.

2. Literature Review

2.1. Split-Attention Principle in Multimedia Learning

The split-attention principle suggests that the spatial separation of multimedia information sources, such as images and text, requires users to mentally integrate these elements. This process can divert essential cognitive resources, thereby increasing cognitive load [8]. Grounded in cognitive load theory (CLT), which acknowledges the limited capacity of working memory, the split-attention principle emphasizes that instructional materials should be designed to minimize extraneous cognitive load—the mental effort required to process information that is not directly relevant to the learning task [9,10,11]. Dividing attention between separate sources of information can hinder efficient processing and comprehension, particularly in multimedia learning environments.
Similarly, the split-attention principle builds upon the concept of dual-channel processing, which suggests that working memory operates through separate channels for processing auditory and visual information [12]. When learners encounter spatially separated information, they are required to continuously shift their visual attention, dividing their focus and potentially hindering learning and comprehension. This process risks exceeding the limited capacity of working memory. Such a division of attention and its impact on working memory aligns with the active processing assumption, which emphasizes the importance of meaningful engagement with material for effective learning [1]. The active processing assumption suggests that there is an active cognitive engagement, where learners select relevant information, organize it into coherent representations that can be assessed by working memory, and integrate these representations into existing schemas within long-term memory. This type of processing allows learners to experience germane cognitive load, which supports learning by fostering meaningful connections. In contrast, overloading one or both of the dual channels can lead to extraneous cognitive load, which hinders learning by consuming cognitive resources unnecessarily [13]. While the active processing assumption underscores the importance of engaging with relevant information, Schroeder and Cenkci’s [14] systematic review raises questions about the role of extraneous cognitive load. Their findings suggest that learners may employ preemptive strategies, such as selective attention, to manage the cognitive demands placed on working memory, potentially mitigating the negative effects of extraneous cognitive load. The authors suggest that integrating graphics and text is an effective design strategy, as it reduces the cognitive burden on working memory and minimizes the need for selective attention, allowing learners to focus more efficiently on the learning task.
Theories and the majority of the research on split attention have primarily focused on the integration of text and images. However, the concept of split attention extends beyond this scope. It also applies to multiple visuals, such as multiple static images presented simultaneously [15], images paired with animations [16,17], multiple animations displayed concurrently [18], or combinations of images and a gesturing pedagogical agent [19]. The findings from these studies suggest a potentially complex interaction between visual information and the spatial design of information. For example, some research [15] indicates that spatial distance alone does not necessarily hinder the integration of information. In contrast, other studies suggest that spatial separation can increase cognitive load and alter viewing strategies, potentially impacting learning outcomes [17]. Individualized strategies for managing split attention can vary widely, ranging from ignoring one source of information entirely [17] to actively altering focus between multiple sources [19]. These strategies are often influenced by the learner’s perception of each source’s usefulness and its potential contribution to learning. Furthermore, individual cognitive abilities, such as mental rotation, can affect whether split attention negatively impacts performance [18]. This indicates that processing spatially separated information requires not only the integration of different content but also the effective management of cognitive resources and the application of metacognitive strategies.

2.2. Gestures

Human-to-human communication often involves multiple strategies to convey information. This can occur verbally, through spoken or written language, or nonverbally, through human displays such as facial expressions, tone of voice, body position, and gestures, which are interpreted by the listener [20]. Although research indicates that listeners typically spend more time focusing on the speaker’s face (84.9%) than on gestures (2.1%), gestures still play an important role in communication. It is suggested that attention to gestures occurs unconsciously, enabling the concurrent processing of verbal and gestural information [21]. Similarly, Gullberg and Holmqvist [22] found that listeners are more likely to focus on gestures that extend into the peripheral vision, include holds (pauses in movement), or are accompanied by the speaker’s direct gaze toward the gesture. This suggests that gestures may be processed unconsciously as part of the visual background. This is consistent with Kendon’s [23] argument that gestures aid comprehension, even when listeners do not consciously focus on them.
However, there is nuance in how attention is directed, or not directed, toward gestures. For example, listeners with weaker verbal skills, such as children or second-language users, tend to benefit significantly from gestures during communication, as these provide additional contextual and semantic cues that support comprehension [24]. In the context of foreign language acquisition, Vandergrift [25] observed that gestures help learners conceptualize speech and support working memory. Similarly, Church et al. [26] demonstrated that both English and Spanish speakers performed better in math learning tasks when watching videos that combined speech and gestures, compared to videos with speech alone. The Spanish-speaking students had no prior knowledge of the English language, leading the authors to suggest that the gestures played a crucial role in helping these students link the visual information to their own language, thereby enhancing their comprehension of the concepts. Beyond aiding comprehension, gestures can be viewed as a form of language production. Neuroscientific evidence from fMRI and PET scans indicates that Broca’s area, the brain’s primary language center, is actively engaged in processing meaningful gestures. This suggests that gestures and speech function as integrated modalities of communication [27,28]. However, neither pure movement nor meaningless gestures activate Broca’s area [29,30], suggesting that meaningful gestures convey information that transcends basic motor skills or visual stimuli. Since gestures occur in space, they are classified as spatial movements that engage spatial working memory [31,32]. Alibali [33] suggests that gestures provide benefits not only for their producers but also for their observers. For the producer, gestures offer the ability to emphasize spatial information, support spatial descriptions, and integrate spatial details into verbal communication.
For the viewer, spatial gestures provide visual reinforcement for complex topics involving spatial relationships or convey unspoken spatial information essential for understanding. Thus, gestures can be viewed as sophisticated communication strategies that play a crucial role in enhancing both the transmission and comprehension of information, especially when spatial concepts are involved.

2.3. Gesturing Pedagogical Agents

Early researchers theorized that designing pedagogical agents with gestures could not only give the participants the perception that the agent possessed a persona but also enhance the learning process [34]. Lester et al. [35] proposed that pedagogical agents should be designed with deictic believability, meaning they should incorporate speech, movement, and pointing gestures to emulate human behavior. Similarly, social agency theory suggests that designing agents with verbal (e.g., human voice) and visual (e.g., facial expressions and gestures) social cues can activate the user’s social schema, akin to those used in human-to-human communication [36,37]. Mayer and DaPra [7] expanded upon social agency theory by proposing the embodiment principle, which suggests that pedagogical agents should be designed to exhibit common human behaviors, such as lip synchronization, eye gaze, facial expressions, body sway, and gestures. A number of meta-analyses and review studies have provided empirical evidence that embodying an agent positively impacts both learning outcomes and the perception of the agent’s persona [38,39,40]. A meta-analysis by Davis [41] examining the impact of agent gestures found that gestures significantly enhanced retention and transfer measures, as well as improving the perception of the agent’s persona. However, most studies included in this analysis primarily investigated agents using deictic gestures. Craig et al. [42] suggested that while deictic gestures are effective in directing the learner’s spatial attention to important onscreen information, researchers should investigate other gesture types that might convey information valuable for learning. Since then, only a limited number of studies have examined the use of all gesture types (iconic, metaphoric, deictic, and beat) and the frequency rates at which gestures are performed in relation to learning outcomes.
The findings from these studies suggest that increasing gesture frequency (doubling the average rate) enhances the recall of both procedural knowledge [19,43] and declarative knowledge [44]. However, these studies have exclusively focused on advanced non-native speakers, who may derive greater benefits from the additional input compared to other learner groups. More research is needed with native-speaking populations to determine whether full gesturing and increased gesture frequency provide the same benefits observed with non-native speakers.

2.4. Research Purpose

Effective multimedia learning environments require careful consideration of how users view and perceive visual inputs that convey information. This research employs a counterbalanced repeated measures design with two distinct spatial conditions: (1) gestures and hologram-like images centrally integrated and (2) gestures and hologram-like images spatially separated. Individual interviews were conducted to gather insights into participants’ perceptions and viewing patterns. This mixed-methods design investigates the concept of split attention, examining how the integration or separation of visual inputs impacts satisfaction, cognitive load, and learning outcomes.
This study addresses the following research questions:
RQ1: How do gesturing agents and image placement influence satisfaction in a multimedia learning environment?
RQ2: How do image placement and agent gesturing affect the type of cognitive load experienced by participants?
RQ3: To what extent does cued recall vary based on the use of gesturing agents and the spatial positioning of images (middle vs. side)?
RQ4: How do participants’ perceptions of spatially integrated and separated learning visuals influence their satisfaction, perceived difficulty, and viewing strategies?

3. Methods

3.1. Research Design and Participants

The research intervention employed a repeated measures design with two different scripts about Australia. Participants viewed hologram-like images presented in two spatial configurations: positioned to the side of the gesturing agent and centrally aligned with the gesturing agent. The scripts were presented in the same order, but the placement of the hologram-like images was alternated to control for order effects. Participants were randomly assigned to one of two conditions: S1M2 (side images for video one, middle images for video two) or M1S2 (middle images for video one, side images for video two). For the purposes of this exploratory study, the middle hologram-like image configuration serves as the experimental condition. Participants provided both oral and written informed consent before supplying demographic information. For the order of the intervention and data collection, please see Table 1.
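As an illustration of the counterbalancing described above, the following minimal sketch randomly assigns participants to the two order conditions. The equal split between groups, the fixed seed, and the function name `assign_counterbalanced` are assumptions made for this example and are not reported in the study.

```python
import random


def assign_counterbalanced(participant_ids, seed=0):
    """Randomly split participants into the two image-order conditions.

    Assumption: groups are of equal size; the study does not report group sizes.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "S1M2": sorted(shuffled[:half]),   # side images first, middle images second
        "M1S2": sorted(shuffled[half:]),   # middle images first, side images second
    }


# Hypothetical participant numbers 1..80
groups = assign_counterbalanced(range(1, 81))
print({name: len(members) for name, members in groups.items()})
```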
An a priori power analysis was conducted using G*Power 3.1 to determine the required sample size for a repeated measures ANOVA. The analysis was based on an alpha level of 0.05, a desired power of 0.80, and a medium effect size (f = 0.25). The results indicated that a minimum of 34 participants was needed to achieve sufficient power to detect meaningful within-subject effects. The participants in this study were 80 advanced foreign language learners at a foreign language university in Seoul, South Korea. The participants came from diverse native language backgrounds: 52 identified their native language as Korean, 8 as Chinese, and 12 as one of four other languages shared by more than one participant, namely French (4), Russian (3), Spanish (3), and English (2). The remaining 8 participants were single speakers of Romanian, Portuguese, Indonesian, Ukrainian, Uzbek, German, Afrikaans, and Slovenian. To ensure the results were not skewed, the data from the two native English speakers were analyzed separately to assess their potential influence on the study’s findings. No significant impact was observed across any of the measures, so their data were retained. The average age of the participants was 22.36 years (SD = 2.43), with 56 participants identifying as female and 24 as male. A majority of the students were seniors (27) or juniors (35), with the remaining participants being freshmen (13) and sophomores (5). The participants were either majoring or taking a double major in an English language concentration, or they were exchange students taking English-only courses offered by the university.

3.2. Instructional Content

The content for this study was based on Discovering Australia [45,46], which provides general information (declarative knowledge) about Australia through 22 individual text snippets, each ranging from 74 to 80 words in length. Two of the snippets serve as introductions, while the remaining 20 snippets each contain two questions to assess the participant’s understanding of the content. This research utilized one introduction and ten informational topics, which covered the following subjects: Northern coastal wetlands, Perth, sea life in Australian waters, wild rabbits, Wolf Creek Crater, Aboriginal people, MacDonnell Ranges, Great Sandy Desert, Timor Sea, and morning glory clouds. The content has a Flesch–Kincaid grade level of 10.7 and a Flesch reading ease score of 46.7, indicating that it corresponds to college-level reading in English.
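For readers unfamiliar with the two readability indices, the sketch below implements the standard published Flesch reading ease and Flesch–Kincaid grade level formulas from raw counts. The word, sentence, and syllable counts in the usage line are hypothetical and are not taken from the instructional content; the tool the authors used to obtain their scores is not stated.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a single short snippet (not from the study materials)
counts = {"words": 77, "sentences": 5, "syllables": 125}
print(round(flesch_reading_ease(**counts), 1))
print(round(flesch_kincaid_grade(**counts), 1))
```

Both indices depend only on average sentence length and average syllables per word, so they describe surface difficulty rather than conceptual difficulty.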

3.3. Multimedia Environment and Gesture Design

The agent was created using iClone 8.3 TM with a black background to enhance contrast for the hologram-like images, which were subsequently integrated using Camtasia 2023.3.13. To address the complexity of the images used in the previous studies, the hologram-like images used in this research were sourced from royalty-free websites and modified using an online PNG image converter and editor. The editor enabled the removal of color, adjustments to line width, and the addition of a glow to the lines to produce a hologram-like effect. The learning environment, including the hologram-like images and their locations, can be viewed in Figure 1.
The pedagogical agent’s gestures were created using the timeline of iClone 8.33. The gesture frequency was established by doubling the average number of gestures observed in human-to-human communication involving declarative information [47], where speakers typically produced 6.51 representational gestures (iconic, metaphoric, deictic) and 1.32 nonrepresentational gestures (beat) per 100 words. This approach is consistent with previous agent gesture research, which found that increasing gesture frequency (doubling) significantly improved cued recall compared to other experimental conditions [44]. Accordingly, this study employed only an enhanced gesture frequency, with the agent performing 13.44 representational gestures and 2.97 nonrepresentational gestures per 100 words.
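The relationship between the baseline and enhanced gesture rates can be checked with simple arithmetic. The sketch below uses only the four rates reported above; the helper name `gestures_for_script` and the 77-word script length are hypothetical and serve only to show how a per-100-word rate converts to a gesture count.

```python
# Rates per 100 words, as reported in the text
BASELINE = {"representational": 6.51, "nonrepresentational": 1.32}
AGENT = {"representational": 13.44, "nonrepresentational": 2.97}


def gestures_for_script(rate_per_100_words, word_count):
    """Convert a per-100-word gesture rate into an expected gesture count."""
    return rate_per_100_words * word_count / 100


# Ratio of the agent's rate to the human baseline for each gesture class
multiples = {kind: AGENT[kind] / BASELINE[kind] for kind in BASELINE}
for kind, ratio in multiples.items():
    print(f"{kind}: {ratio:.2f}x baseline")

# Hypothetical 77-word snippet
print(round(gestures_for_script(AGENT["representational"], 77), 1))
```

On these figures the agent's rates are roughly, rather than exactly, twice the human baseline.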

3.4. Instruments

Demographic data were collected from participants after they provided informed consent and before completing a prior knowledge test. Participants were asked to provide their international age (to account for Korea’s distinct age counting system), gender, university major, year classification, and native language. Participants were also asked if they had lived or had extended stays in Australia and, if so, for how long. This information was collected to assess whether any prolonged experiences could potentially influence the data. No participants reported having prior experience of living in or staying in Australia.

3.4.1. Prior Knowledge Test

Previous research using this instructional content [45,48] did not include a prior knowledge test, as it was assumed that American participants would not possess substantial knowledge about Australia. Previous research involving participants in Korea employed a 5-point Likert scale to allow participants to self-assess their knowledge of Australian history, geography, and wildlife [44]. For this experiment, prior knowledge was assessed using 10 generalized questions covering four broad areas—geography, ecology, culture and society, and the environment—related to the instructional content. Questions included “What is the most dangerous sea creature found in Australian waters?” and “What is the name of the period between November and December in Australia?” Analysis revealed no significant differences in prior knowledge between the two conditions.

3.4.2. Satisfaction

The satisfaction survey was adapted from Ritzhaupt et al.’s [48] time compression study, which measured satisfaction with the presentation of information. This survey consisted of 15 questions, each scored on a 5-point Likert scale. The first ten questions utilized bipolar adjective pairs, ranging from negative (1) to positive (5), and the final five questions were rated on a scale from strongly disagree (1) to strongly agree (5). This survey demonstrated excellent internal consistency, with a Cronbach’s alpha of 0.95.
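Cronbach’s alpha compares the sum of the individual item variances with the variance of the total score. The following minimal sketch shows the standard formula; the response matrix is invented for illustration and is not study data, and the software the authors used to compute alpha is not stated.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose to per-item columns
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert responses: 5 participants x 4 items
demo = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(demo), 2))
```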

3.4.3. Cognitive Load

Cognitive load was measured using ten questions based on the framework by Leppink et al. [49]. The questions addressed intrinsic cognitive load (3 items), extraneous cognitive load (3 items), and germane cognitive load (4 items). The questions were rated on a 9-point Likert scale, ranging from extremely disagree (1) to extremely agree (9). Intrinsic cognitive load items were negatively worded, meaning higher scores indicated greater perceived inherent difficulty. In the original survey, extraneous cognitive load items were negatively worded. However, for this study, the questions were rephrased positively to reduce potential comprehension difficulties for participants. As a result, lower scores indicate higher extraneous cognitive load. Conversely, higher scores on germane cognitive load items reflect an increase in beneficial cognitive load, which supports learning. The Cronbach’s alpha values for intrinsic cognitive load (α = 0.87), extraneous cognitive load (α = 0.83), and germane cognitive load (α = 0.85) were all considered good, indicating reliable internal consistency for each measure.
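The scoring logic for the three subscales can be sketched as follows. The item order (items 1 to 3 intrinsic, 4 to 6 extraneous, 7 to 10 germane) and the use of subscale means are assumptions for this example. The `reverse_score` helper is included only to show how positively worded extraneous items could be mapped back to the original direction; as noted in the results, the study did not reverse-score these items.

```python
from statistics import mean

# Assumed zero-based item positions for each subscale (3 + 3 + 4 items)
SUBSCALES = {
    "intrinsic": [0, 1, 2],
    "extraneous": [3, 4, 5],
    "germane": [6, 7, 8, 9],
}


def score_cognitive_load(ratings):
    """Return the mean rating per subscale for one participant's ten answers."""
    if len(ratings) != 10 or any(not 1 <= r <= 9 for r in ratings):
        raise ValueError("expected ten ratings on a 1-9 scale")
    return {name: mean([ratings[i] for i in idx]) for name, idx in SUBSCALES.items()}


def reverse_score(value, low=1, high=9):
    """Flip a rating so that high and low ends of the scale swap."""
    return low + high - value


# Hypothetical participant
scores = score_cognitive_load([7, 8, 9, 2, 3, 4, 5, 5, 6, 8])
print(scores)
```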

3.4.4. Cued Recall

Learning was assessed through two cued recall questions for each of the ten topics. The questions were identical to those used in previous research [44,45,48]. The open-ended responses were independently scored by two reviewers, and any scoring discrepancies were resolved through discussion. The weighted Cohen’s kappa interrater reliability was calculated at κ = 0.98, indicating almost perfect agreement.
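Weighted kappa penalizes disagreements between raters in proportion to how far apart their scores are. The sketch below is a minimal pure-Python version supporting linear and quadratic weights. The weighting scheme used in the study is not reported, and the 0 to 2 scoring scale and the example scores are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters over ordered categories."""
    n = len(rater_a)
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}

    # Observed joint proportions
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    return 1 - observed_dis / expected_dis


# Hypothetical scores (0 = incorrect, 1 = partial, 2 = correct)
reviewer_1 = [2, 2, 1, 0, 2, 1, 0, 2]
reviewer_2 = [2, 2, 1, 0, 2, 2, 0, 2]
print(round(weighted_kappa(reviewer_1, reviewer_2, [0, 1, 2]), 2))
```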

3.5. Interviews

Interviews were conducted to explore participants’ perceptions and viewing patterns regarding the placement of hologram-like images in the center and at the sides of the gesturing agent. At the conclusion of the study, participants were invited to volunteer for a 20 min one-on-one interview with a Korean researcher, fluent in both English and Korean, by providing their email address. Participants who completed the interview were compensated with a $10 coffee gift card.
The interviews focused on questions related to satisfaction, cognitive load (perceived difficulty), spatial placement preferences, and viewing strategies. To facilitate recall, screenshots of the two conditions were shown during the interviews. The interviews were conducted within one week of participants’ exposure to the research intervention. While ten participants initially expressed interest in being interviewed, only six completed interviews. The other four participants did not respond to either the initial or follow-up interview email requests.
A five-phase thematic analysis [50] was conducted to examine the interview responses. First, all the Zoom interviews were recorded and transcribed for a thorough review. Second, preliminary codes and sub-codes were generated from the transcripts. Third, specific coding themes were identified within broader thematic categories. Fourth, the themes were refined by analyzing similarities and differences in the participant experiences. Finally, the qualitative findings were integrated with the quantitative results to generate the final report.

4. Results

A repeated measures ANOVA was conducted to analyze the effect of image placement (S1M2: side images first, middle images second; M1S2: middle images first, side images second) on each measured outcome. Additional tests (t-tests) were conducted to compare the image conditions (S1 vs. M1; S2 vs. M2) for each video and to assess potential order effects (S1 vs. S2; M1 vs. M2). For this exploratory study, the middle hologram-like image position served as the experimental condition. Please see Table 2 for the satisfaction, cognitive load, and cued recall means and standard deviations.

4.1. Satisfaction

A repeated measures ANOVA revealed a significant difference between the two conditions (F(1,78) = 5.192, p = 0.025). However, Bonferroni post hoc tests revealed no significant pairwise differences (p > 0.05).
Between-subjects analysis for the side and middle hologram-like images based on the video order (S1 vs. M1; S2 vs. M2) revealed no significant differences (p > 0.05).
Between-subjects order effects analysis (S1 vs. S2; M1 vs. M2) found no significant order effects for the side hologram-like images. However, there were significant order effects for the middle hologram-like images. Participants reported significantly lower satisfaction with the M2 video compared to the M1 video, with a small effect size (p = 0.036, d = 0.48). See Figure 2 for the satisfaction descriptive plots.

4.2. Cognitive Load

4.2.1. Intrinsic Cognitive Load

A repeated measures ANOVA indicated that there was a significant difference between the two conditions (F(1,78) = 21.759, p < 0.001). Bonferroni post hoc tests suggested that increased intrinsic cognitive load may be attributed to order effects. In the M1S2 group, S2 scored significantly higher on intrinsic cognitive load than M1, with a medium effect size (p = 0.003, d = 0.55). Similarly, in the S1M2 group, M2 scored significantly higher on intrinsic cognitive load than S1, with a small effect size (p = 0.027, d = 0.42).
Additionally, the between-subjects analysis indicated an order effect for the middle hologram-like images. The M2 video was perceived as significantly increasing intrinsic cognitive load compared to the M1 video, with a medium effect size (p = 0.013, d = 0.70).
Between-subjects analysis comparing side and middle hologram-like images based on video order (S1 vs. M1; S2 vs. M2) revealed no significant differences (p > 0.05).
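The effect sizes reported in this section can be read against Cohen’s conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The sketch below computes d with a pooled standard deviation and attaches the conventional label. Whether the study used this pooled form or a paired-samples variant is not stated, so the formula choice is an assumption, and the example ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


def effect_size_label(d):
    """Label |d| using Cohen's conventional benchmarks."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"


# Hypothetical ratings for two videos
d = cohens_d([2, 4, 6], [1, 3, 5])
print(round(d, 2), effect_size_label(d))
```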

4.2.2. Extraneous Cognitive Load

Extraneous cognitive load was not reverse-scored, meaning that lower scores indicate higher extraneous cognitive load. A repeated measures ANOVA indicated that there was a significant difference between the two conditions (F(1,78) = 7.152, p = 0.009). However, Bonferroni post hoc tests revealed no significant pairwise differences (p > 0.05).
Between-subjects analysis comparing the side and middle hologram-like images based on video order (S1 vs. M1; S2 vs. M2) revealed no significant differences (p > 0.05).
Between-subjects order effects analysis (S1 vs. S2; M1 vs. M2) revealed no significant order effects for the side hologram-like images. However, significant order effects were found for the middle hologram-like images. Participants reported significantly higher extraneous cognitive load in the M2 video compared to the M1 video, with a medium effect size (p = 0.028, d = 0.50).
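The verbal labels attached to the effect sizes in Sections 4.1 and 4.2 are consistent with Cohen's conventional benchmarks (roughly 0.2 small, 0.5 medium, 0.8 large), although the text does not state which benchmarks or which formula for d were used. The following Python sketch assumes those benchmarks and the pooled-standard-deviation form of Cohen's d for two independent groups; the function names are illustrative.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Cohen's d for two independent groups (pooled SD; assumed formula)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def effect_size_label(d):
    """Verbal label under Cohen's benchmarks (assumed, not stated in the text)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Effect sizes reported in Sections 4.1 and 4.2, with the labels used in the text.
reported = [
    ("satisfaction, M1 vs. M2", 0.48, "small"),
    ("intrinsic load, M1 vs. S2", 0.55, "medium"),
    ("intrinsic load, S1 vs. M2", 0.42, "small"),
    ("intrinsic load, M1 vs. M2", 0.70, "medium"),
    ("extraneous load, M1 vs. M2", 0.50, "medium"),
]

for name, d, text_label in reported:
    print(f"{name}: d = {d:.2f} -> {effect_size_label(d)} (text: {text_label})")
```

Under these assumed cut-offs, a value such as d = 0.48 sits just below the medium threshold, so the small and medium labels in this section describe effects of fairly similar magnitude.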

4.2.3. Germane Cognitive Load

A repeated measures ANOVA indicated that there was no significant difference between the two conditions (F(1,78) = 2.181, p = 0.144). Between-subjects analysis also found no order effects for any of the image conditions (p > 0.05). See Figure 3 for the cognitive load descriptive plots.
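Because each omnibus test reported so far has one numerator degree of freedom, F(1, 78) equals the square of a t statistic with 78 degrees of freedom, so the reported p-values can be cross-checked from the F values alone. The sketch below does this with only the Python standard library by integrating the t density numerically; the helper names and the Simpson's rule step count are illustrative choices, not part of the authors' analysis.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def p_from_f(f_value: float, df_error: int, steps: int = 2000) -> float:
    """p-value for F(1, df_error), using F = t^2 and Simpson's rule (steps must be even)."""
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    central_area = total * h / 3  # P(0 < T < t)
    return 1.0 - 2.0 * central_area  # two-tailed area beyond |t|


# F statistics and p-values as reported in Sections 4.1 and 4.2.
reported = {
    "satisfaction": (5.192, "0.025"),
    "intrinsic load": (21.759, "< 0.001"),
    "extraneous load": (7.152, "0.009"),
    "germane load": (2.181, "0.144"),
}

for outcome, (f_value, p_text) in reported.items():
    print(f"{outcome}: F(1,78) = {f_value}, recomputed p = {p_from_f(f_value, 78):.4f}, reported p = {p_text}")
```

The same function can be applied to any other F(1, 78) statistic in the paper; it does not generalize to tests with more than one numerator degree of freedom.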

4.3. Cued Recall

A repeated measures ANOVA revealed a significant difference between the two conditions (F(1,78) = 5.478, p = 0.022). However, Bonferroni post hoc tests revealed no significant pairwise differences (p > 0.05). For the side hologram-like image data (S1 vs. S2), Levene’s test indicated a violation of the equality of variances assumption. A between-subjects Mann–Whitney U test was therefore used, which indicated that the higher score for the S2 video relative to the S1 video approached but did not reach significance (U = 620.50, p = 0.072, d = 0.224). No other pairwise comparisons were significant. See Figure 4 for the cued recall descriptive plots.
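The effect size accompanying the Mann–Whitney U test is labeled d. One plausible reading, which the text does not confirm, is that the value is a rank-biserial correlation, a common effect size for this test, rather than Cohen's d. The Python sketch below computes the rank-biserial correlation from the reported U under the assumption of two equal groups of 40 participants; the text reports N = 80 but not the group sizes, so both the group sizes and the function name are assumptions.

```python
def rank_biserial(u: float, n1: int, n2: int) -> float:
    """Rank-biserial correlation from a Mann-Whitney U statistic: r = 1 - 2U / (n1 * n2)."""
    return 1.0 - 2.0 * u / (n1 * n2)


# U as reported in the text; n1 = n2 = 40 is an assumption (N = 80 split evenly).
r = rank_biserial(620.50, 40, 40)
print(f"rank-biserial r = {r:.3f}")  # prints "rank-biserial r = 0.224"
```

Under that assumption the result matches the reported 0.224 to three decimals, which would mean the value should not be interpreted against Cohen's d benchmarks.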

4.4. Interviews

4.4.1. Data Analysis

This study utilized thematic analysis to systematically analyze the qualitative data. Thematic analysis is a method used to identify, analyze, and report patterns (themes) within the data [51]. It offers a structured approach for organizing and describing the dataset in detail. By using thematic analysis, the study was able to gain a comprehensive understanding of the underlying patterns and meanings in the qualitative data, facilitating a robust interpretation of the findings [52]. The data were first transcribed verbatim and thoroughly reviewed to ensure familiarity with the content. Open coding was then performed to generate initial codes by identifying key features within the data. These codes were then organized into potential themes based on their relationships and relevance to the research questions. The themes were systematically reviewed and refined to ensure they accurately represented the data and captured the nuances of participants’ experiences. This process involved iterative reading and re-reading of the data, accompanied by collaborative discussions with co-researchers to reach consensus on the thematic structure. The final themes were carefully defined and named to encapsulate the core essence of the data, creating a coherent narrative that effectively addresses the study’s objectives.

4.4.2. Interview Findings

Participants were asked to watch two videos, each featuring a speaker accompanied by hologram-like images. In one set of videos, the holograms were placed in front of the speaker, while in the other, they were positioned to the side. The identified themes were categorized as satisfaction and viewing strategy. The satisfaction category encompassed cognitive load, as the participants often associated their satisfaction with the condition that imposed less cognitive load. Similarly, cognitive load was also identified as a key influence on participants’ viewing strategies. See Table 3 for information on the interview participants. Please see Appendix A for the full quotes by the participants.

4.4.3. Satisfaction

The preferences for the placement of the hologram-like images were evenly divided, with three participants preferring the side placement and three preferring the middle. Those who preferred the side placement found that images placed in the middle, in front of the gesturing agent, overwhelmed them visually. Participants two and six preferred the side placement because the overlapping hologram-like images and gestures were distracting, making it difficult to focus on either the images or the gestures.
However, the other participants found the middle placement more satisfying. They mentioned that the gestures played a significant role in their understanding, and having the hologram-like images centrally positioned enhanced their learning experience. Participant one felt that having the images and gestures in one central location allowed her to attend to all the information because both elements could be seen simultaneously. Participant five preferred the central location because the gestures were important for learning, and looking towards the side meant he could potentially miss the gestures.
The level of satisfaction with the placement of the hologram-like images was influenced by the difficulty each design presented. Participants who preferred the side placement reported being distracted by the overlapping visual input in the middle position, which they found overwhelming. Conversely, those who preferred the middle placement emphasized the importance of observing the gestures during the presentation. For them, the side placement was more challenging, because it diverted their attention away from the gestures.

4.4.4. Viewing Strategy

Side Viewing Strategy

Although the students were evenly divided in their satisfaction with the hologram-like images in the middle or to the side, their viewing strategy was notably similar when the images were positioned to the side. All the participants, except one, reported briefly looking at the image first before returning their focus to the agent.
However, within the shared strategy of switching between the hologram-like images and the gesturing agent, participants differed in how they directed their attention. Participant one (middle viewing preference) would try to memorize the image and then direct attention back to the gesturing agent. Participant five (middle viewing preference) indicated that he would briefly look at the hologram-like images and then view the agent like he views humans, by watching the agent’s mouth and hands. Conversely, participant three (side viewing preference) mentioned that she would switch back and forth between the images and the agent, but that the agent was only briefly viewed and her attention was mostly focused on the images because the agent was something extra.
Regardless of their satisfaction, all the participants, except one, employed a similar strategy when the images were at the side. Most preferred to glance briefly at the hologram-like images before shifting their attention back to the agent. In contrast, participant three prioritized the images.

Middle Viewing Strategy

The participants who preferred the middle image placement reported using a similar viewing strategy. Since the hologram-like images and gestures were in a central location, it was easy for the participants to direct their attention to either visual input. For example, participant four suggested that it was easier to concentrate on the images or gestures with a slight visual adjustment.
However, the participants who were more satisfied with the side placement of the hologram-like images reported using different viewing strategies for the middle image placements. Participant six stated that she was distracted by the middle placement and had to change strategies a couple of times. At first, she tried to focus only on the images, but when that was still overwhelming, she directed her attention to a blank space and listened to the audio. However, she did not totally abandon the images, as she would briefly look at the images when they changed before returning to the blank spot.
Notably, not all the participants who preferred side placement were able to implement a useful viewing strategy. Participant two indicated he stopped concentrating on the narration because he was overwhelmed trying to mentally separate the visual inputs.
The viewing strategies suggest that, regardless of their satisfaction with the placement of the hologram-like images, the participants generally adopted similar strategies when the images were positioned to the side. This suggests that the design may present a binary choice for the viewer: either focus more on the agent or on the image, depending on which is perceived as more beneficial for retaining information. Since the static images were simple and easy to remember, most participants felt the gesturing agent provided more valuable input to help them retain the information. The lone participant who primarily focused on the static images perceived the agent as non-essential, describing the agent’s gestures as merely “extra”.
Conversely, the middle image placement prompted distinct viewing strategies that were closely tied to satisfaction. The participants who preferred the middle placement expressed satisfaction with having all the information in one location, allowing them to access it with only minor glances up or down. They did not find the combination of multiple informational inputs distracting. However, the participants who were not satisfied with the middle placement appeared to be overwhelmed, leading to negative viewing strategies. Some participants turned the presentation into a listening session by avoiding visual engagement altogether, while others tuned out of the presentation entirely. The middle placement design restricted participants’ ability to adopt alternative viewing strategies to reduce stress, because all the information was centrally concentrated, limiting flexibility in how they engaged with the content.

5. Discussion

This research investigated whether the spatial arrangement of hologram-like images and gestures provided evidence supporting the split-attention principle and examined how separating or combining multiple visual inputs influenced satisfaction, cognitive load, and learning. The quantitative findings revealed no significant differences attributable to the spatial placement of the hologram-like images and gestures. Any significant results observed were likely attributable to order effects rather than the spatial arrangement itself.
To explain these findings, it is important to acknowledge that the original concept of split attention primarily addressed the integration of images and text [8]. Over time, research expanded to examine the effects of multiple visuals presented simultaneously [15,16,17,18,19]. The foundation of split attention is that spatially separating information sources forces learners to engage in mental integration, which can overload working memory and hinder efficient processing [8,9,10,11]. This additional processing demand contributes to extraneous cognitive load [13].
The results of this study did not provide evidence supporting the split-attention principle, which is consistent with Schroeder and Cenkci’s [14] analysis of 41 studies. Their findings indicated that integrated graphic designs do not significantly impact measures of cognitive load compared to spatially distant designs. Intrinsic cognitive load was the only significant finding, with the results influenced by order effects rather than the spatial placement of visual information. This suggests that the second video required more inherent processing, which is likely due to the change in the display of visual information from the first video. Unfortunately, the quantitative data provides limited insight beyond drawing parallels to similar findings in previous research. Therefore, the qualitative data will be used for a deeper analysis to explore how these findings may relate to existing theories and research.
The individual interviews were conducted to explore how different visual presentations affected participants’ attention, focusing on whether presenting information in a central location reduced the need to split their attention between separate locations. The interviews indicated that the participants were evenly divided in their design preferences. Satisfaction was higher in conditions that imposed less viewing difficulty (cognitive load), which influenced their viewing strategies. In other words, satisfaction and viewing strategies appeared to be closely linked to the participants’ perception of extraneous cognitive load experienced during the presentation.
These interviews corroborated the findings from previous research, indicating that when participants experience increased cognitive load, they tend to either ignore one source of information [17], or actively switch between sources to focus on the one they perceive as most beneficial for learning [19]. This is consistent with the active processing principle [1], which suggests that learners actively engage with relevant information, organizing it in working memory to integrate it into long-term memory. The data suggests that Schroeder and Cenkci’s [14] assertion that students may employ preemptive strategies could also be correct. This means that the active processing principle might need to be slightly expanded to account for individual differences in cognitive processing that affect viewing choices. If multimedia presentations include numerous visuals, what is considered relevant information may be subjective and vary between individuals, depending on their prior knowledge, cognitive preferences, and learning goals. We propose the “pruning principle”, a metacognitive strategy that prompts learners to actively evaluate information when the processing demands challenge the limits of working memory. This principle suggests that learners will identify and prioritize content they deem relevant while “pruning” information they perceive as extraneous, thereby optimizing their cognitive resources. This principle is consistent with the information foraging theory [53], which suggests that information sources vary in “profitability” and assessments are based on the cost of processing the information. Thus, optimizing cognitive resources for effective learning can be observed in examples such as participant three, who ignored the agent, perceiving it as “something extra”, or participant six, who stared at a blank space on the screen and focused solely on the narration, feeling overwhelmed by having all the visuals concentrated in one place. 
This may explain why extraneous cognitive load was not significant in this research, as the participants were likely assessing their mental load after implementing their pruning strategies. The pruning principle does not suggest that participants are making optimal choices for learning, but rather that they are concentrating on the information they personally deem relevant for effective learning. Future research could examine the pruning principle using eye-tracking and think-aloud protocols with materials designed to induce varying cognitive load levels, potentially revealing how individuals prune information during the learning process.
Regarding the preference for visual inputs being separated or combined in one location, future research should examine whether satisfaction and cognitive load are influenced by individuals’ inherent spatial ability. Spatial ability is often categorized as high or low, where high spatial ability indicates sufficient working memory capacity to construct mental models, and low spatial ability suggests limited available resources for constructing such models [54]. Spatial ability encompasses multiple skills, one of which is perceptual speed. Perceptual speed refers to the ability to process multiple visual inputs simultaneously [55]. It is plausible to suggest that viewing preference may be influenced by an individual’s spatial perceptual speed. Participants with high spatial perceptual speed might feel comfortable with information presented in a central location, easily making simple up-and-down head movements to integrate the hologram-like images and gestures matching the narration. Conversely, participants with low spatial perceptual speed may have felt overwhelmed due to their reduced ability to process multiple sources of information in working memory. Further research is needed to evaluate whether spatial ability significantly impacts the choices participants make when viewing videos. Additionally, examining how specific spatial skills, such as perceptual speed, can be utilized to design multimedia environments that accommodate diverse learning preferences would provide valuable insights.

Limitations

This study has some limitations. First, in the repeated measures design, the only significant outcome was likely influenced by order effects. Future research could address this limitation by employing a between-subjects design to eliminate the potential influence of order on the results. Second, the results may not be generalizable to all contexts, as the participants were advanced second-language users of English. Factors such as language proficiency level or whether the language is native or a second language could affect the outcomes. Further exploration with different populations is encouraged.

6. Conclusions

The findings from this research suggest that learners employ metacognitive strategies, such as the proposed “pruning principle”, to manage extraneous cognitive load. This principle offers a new perspective on how learners prioritize and process visual inputs in complex multimedia designs, enabling them to focus on information they deem most relevant.
Further qualitative research, such as employing a think-aloud protocol, might be useful to provide valuable insights into how participants make decisions about what to focus on when engaging with multimedia presentations. This approach would allow researchers to capture real-time cognitive processes and decision-making strategies.
Although eye-tracking can provide detailed data on what participants view during the learning process, it does not reveal the underlying strategies guiding individual viewing behaviors. Specifically, it cannot capture conscious decision-making processes or the attention allocation strategies learners employ.
Despite the absence of significant differences in quantitative measures, the qualitative data indicated a relationship between satisfaction, cognitive load, and viewing strategies. Learners’ preferences for image placement were closely tied to their satisfaction and the perceived difficulty of the image location. This suggests that individual differences could impact experiences in multimedia environments. Consequently, further research is needed to assess spatial skills, such as perceptual speed, to better understand how these factors can inform and optimize multimedia design. For instructional designers, this study underscores the importance of creating environments that accommodate the diverse skills and capacities of learners. For example, providing options to customize the placement of visual inputs could improve both learner satisfaction and learning outcomes by addressing individual preferences and cognitive needs.

Author Contributions

Conceptualization, R.O.D., J.V. and Y.J.L.; methodology, R.O.D., Y.J.L. and E.B.Y.; validation, R.O.D., E.B.Y. and Y.J.L.; formal analysis, R.O.D., E.B.Y. and Y.J.L.; investigation, R.O.D. and J.V.; data curation, R.O.D., J.V. and Y.J.L.; writing—original draft preparation, R.O.D., J.V., E.B.Y., Y.J.L. and J.H.L.; writing—review and editing, R.O.D., E.B.Y., J.V., Y.J.L. and J.H.L.; project administration, R.O.D., E.B.Y. and J.H.L.; funding acquisition, R.O.D., E.B.Y. and J.H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Chonnam National University (grant number: 2024-0400-01).

Institutional Review Board Statement

All procedures performed in this study involving human participants were in accordance with the ethical standards of the Declaration of Helsinki and approved by the Institutional Review Board of Chonnam National University (protocol code 1040198-240603-HR-090-02).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Interview Quotes for Satisfaction and Viewing Strategies

Appendix A.1. Satisfaction

Participant two: I preferred the hologram to the side because in the middle it felt like too much information in one place. For me, it was very distracting with one picture in front of the gestures. When the holograms were on the side, it felt clean.
Participant six: It was easier for me to switch my attention, either on the image or on the gestures. When it was in the middle it was overlapping and distracting. I could neither focus on image or the gestures. It was overwhelming (mentally).
Participant one: I thought the video with the hologram in the front to be more satisfying, as I didn’t have to switch my attention from the speaker to the hologram, but could look at both simultaneously. It made it easier to focus and pay attention to the gestures of the speaker as well.
Participant five: For me, I liked when the hologram was in front of me. The gestures helped me understand more. When I had to look at the pictures on the side, I felt I would miss the gestures, so it was better to understand when the hologram was in front of me because I could see both. But the gestures were more important to me.

Appendix A.2. Viewing Strategy

Appendix A.2.1. Side Viewing Strategy

Participant one (middle viewing preference): Because it was a static image, not a moving or gif, it was easy to memorize the image and then focus on the speaker and the gestures, while thinking about the image in my head. I would look at the image for a couple seconds, to memorize it, and then look at the agent and gestures. I would look back and forth.
Participant five (middle viewing preference): When the holograms were on the side, I would quickly glance at the holograms and then go back to the gestures and the agent. I viewed the agent like I view people, I watch their mouth and their hands. I concentrated on the speaker, even though he was not a real human, just like a real talking situation.
Participant three (side viewing preference): I did look back and forth a little bit. Most of my attention was on the picture, but I would briefly look at the man and then look back at the images for a bit longer. The agent was something extra.

Appendix A.2.2. Middle Viewing Strategy

Participant four (middle viewing preference): Since the images were in the same line, I could slightly look up, so I didn’t really have to do much. I would see the image at first, but when he would move his hands, I would shift my attention to his hands and his face, and sometimes switch back to the image, but it wasn’t something that took a great deal of concentration or anything.
Participant six (side viewing preference): But in the middle, it was hard because the gestures in the background distracted me from focusing on the image. I realized that it just wasn’t going to work for me, so at first, I tried to just focus on the image, but then I switched completely to listening to the audio to avoid images and gestures. I picked a (blank) spot in the video and just focused on that spot. When the picture changed, I would briefly look at the picture and then go back to the spot.
Participant two (side viewing preference): There was so much information in one place that I stopped paying attention to what he was saying. I mean, I was looking at them (images and gestures) and trying to separate them in my mind, so I couldn’t focus on what he was saying anymore.

References

  1. Mayer, R.E. The past, present, and future of the cognitive theory of multimedia learning. Educ. Psychol. Rev. 2024, 36, 8.
  2. Mayer, R.E. Using multimedia for e-learning. J. Comput. Assist. Learn. 2017, 33, 403–423.
  3. Mayer, R.E. Principles for reducing extraneous processing in multimedia learning: Coherence, signaling, redundancy, spatial contiguity, and temporal contiguity. In The Cambridge Handbook of Multimedia Learning; Mayer, R.E., Ed.; Cambridge University Press: New York, NY, USA, 2005; pp. 183–200.
  4. Mayer, R.E.; Johnson, C.I. Revising the redundancy principle in multimedia learning. J. Educ. Psychol. 2008, 100, 380.
  5. Chandler, P.; Sweller, J. The split-attention effect as a factor in the design of instruction. Br. J. Educ. Psychol. 1992, 62, 233–246.
  6. Mayer, R.E. Introduction to multimedia learning. In The Cambridge Handbook of Multimedia Learning, 2nd ed.; Mayer, R.E., Ed.; Cambridge University Press: New York, NY, USA, 2014; pp. 1–24.
  7. Mayer, R.E.; DaPra, C.S. An embodiment effect in computer-based learning with animated pedagogical agents. J. Exp. Psychol. Appl. 2012, 18, 239.
  8. Moreno, R.; Mayer, R.E. Cognitive principles of multimedia learning: The role of modality and contiguity. J. Educ. Psychol. 1999, 91, 358.
  9. Sweller, J. Cognitive load theory, learning difficulty, and instructional design. Learn. Instr. 1994, 4, 295–312.
  10. Sweller, J.; Ayres, P.; Kalyuga, S. The Split-Attention Effect. In Cognitive Load Theory. Explorations in the Learning Sciences, Instructional Systems and Performance Technologies; Springer: New York, NY, USA, 2011; Volume 1.
  11. Ayres, P.; Sweller, J. The split-attention principle in multimedia learning. In The Cambridge Handbook of Multimedia Learning, 2nd ed.; Mayer, R.E., Ed.; Cambridge University Press: New York, NY, USA, 2014; pp. 206–226.
  12. Mayer, R.E.; Moreno, R. A split-attention effect in multimedia learning: Evidence for dual processing systems in working memory. J. Educ. Psychol. 1998, 90, 312.
  13. Mayer, R.E.; Moreno, R. Nine ways to reduce cognitive load in multimedia learning. Educ. Psychol. 2003, 38, 43–52.
  14. Schroeder, N.L.; Cenkci, A.T. Do measures of cognitive load explain the spatial split-attention principle in multimedia learning environments? A systematic review. J. Educ. Psychol. 2020, 112, 254–270.
  15. Pouw, W.; Rop, G.; De Koning, B.; Paas, F. The cognitive basis for the split-attention effect. J. Exp. Psychol. Gen. 2019, 148, 2058.
  16. Lowe, R.K. Animation and learning: Selective processing of information in dynamic graphics. Learn. Instr. 2003, 13, 157–176.
  17. Bétrancourt, M.; Dillenbourg, P.; Clavien, L. Display of Key Pictures from Animation: Effects on Learning. In Understanding Multimedia Documents; Rouet, J.F., Lowe, R., Schnotz, W., Eds.; Springer: Boston, MA, USA, 2008.
  18. Huff, M.; Schwan, S. Integrating information from two pictorial animations: Complexity and cognitive prerequisites influence performance. Appl. Cogn. Psychol. 2011, 25, 878–886.
  19. Davis, R.O.; Lee, Y.J.; Vincent, J.; Wan, L. Exploring gesture frequencies and images in multimedia environments with pedagogical agents. J. Comput. Assist. Learn. 2024, 40, 3055–3071.
  20. Hall, J.A.; Horgan, T.G.; Murphy, N.A. Nonverbal communication. Annu. Rev. Psychol. 2019, 70, 271–294.
  21. Beattie, G.; Webster, K.; Ross, J. The fixation and processing of the iconic gestures that accompany talk. J. Lang. Soc. Psychol. 2010, 29, 194–213.
  22. Gullberg, M.; Holmqvist, K. What speakers do and what addressees look at: Visual attention to gestures in human interaction live and on video. Pragmat. Cogn. 2006, 14, 53–82.
  23. Kendon, A. Do gestures communicate? A review. Res. Lang. Soc. Interact. 1994, 27, 175–200.
  24. Hostetter, A.B. When do gestures communicate? A meta-analysis. Psychol. Bull. 2011, 137, 297.
  25. Vandergrift, L. Listening to learn or learning to listen? Annu. Rev. Appl. Linguist. 2004, 24, 3–25.
  26. Church, R.B.; Ayman-Nolley, S.; Mahootian, S. The role of gesture in bilingual education: Does gesture enhance learning? Int. J. Biling. Educ. Biling. 2004, 7, 303–319.
  27. Willems, R.M.; Özyürek, A.; Hagoort, P. When language meets action: The neural integration of gesture and speech. Cereb. Cortex 2007, 17, 2322–2333.
  28. Schlaug, G.; Knorr, U.; Seitz, R.J. Inter-subject variability of cerebral activations in acquiring a motor skill: A study with positron emission tomography. Exp. Brain Res. 1994, 98, 523–534.
  29. Johnson-Frey, S.H.; Maloof, F.R.; Newman-Norlund, R.; Farrer, C.; Inati, S.; Grafton, S.T. Actions or hand-object interactions? Human inferior frontal cortex and action observation. Neuron 2003, 39, 1053–1058.
  30. Grèzes, J.; Costes, N.; Decety, J. The effects of learning and intention on the neural network involved in the perception of meaningless actions. Brain 1999, 122, 1875–1887.
  31. Morsella, E.; Krauss, R.M. The role of gestures in spatial working memory and speech. Am. J. Psychol. 2004, 117, 411–424.
  32. Emmorey, K.; Reilly, J.S. (Eds.) Language, Gesture, and Space; Psychology Press: Hove, UK, 2013.
  33. Alibali, M.W. Gesture in spatial cognition: Expressing, communicating, and thinking about spatial information. Spatial Cogn. Comput. 2005, 5, 307–331.
  34. van Mulken, S.; André, E.; Müller, J. The Persona Effect: How Substantial Is It? In People and Computers XIII; Johnson, H., Nigay, L., Roast, C., Eds.; Springer: London, UK, 1998.
  35. Lester, J.C.; Voerman, J.L.; Towns, S.G.; Callaway, C.B. Deictic believability: Coordinated gesture, locomotion, and speech in lifelike pedagogical agents. Appl. Artif. Intell. 1999, 13, 383–414.
  36. Mayer, R.E.; Sobko, K.; Mautone, P.D. Social cues in multimedia learning: Role of speaker’s voice. J. Educ. Psychol. 2003, 95, 419.
  37. Atkinson, R.K.; Mayer, R.E.; Merrill, M.M. Fostering social agency in multimedia learning: Examining the impact of an animated agent’s voice. Contemp. Educ. Psychol. 2005, 30, 117–139.
  38. Guo, Y.R.; Goh, D.H.L. Affect in embodied pedagogical agents: Meta-analytic review. J. Educ. Comput. Res. 2015, 53, 124–149.
  39. Davis, R.O.; Park, T.; Vincent, J. A systematic narrative review of agent persona on learning outcomes and design variables to enhance personification. J. Res. Technol. Educ. 2021, 53, 89–106.
  40. Davis, R.O.; Park, T.; Vincent, J. A meta-analytic review on embodied pedagogical agent design and testing formats. J. Educ. Comput. Res. 2023, 61, 30–67.
  41. Davis, R.O. The impact of pedagogical agent gesturing in multimedia learning environments: A meta-analysis. Educ. Res. Rev. 2018, 24, 193–209.
  42. Craig, S.D.; Twyford, J.; Irigoyen, N.; Zipp, S.A. A test of spatial contiguity for virtual human’s gestures in multimedia learning environments. J. Educ. Comput. Res. 2015, 53, 3–14.
  43. Davis, R.O.; Vincent, J. Sometimes more is better: Agent gestures, procedural knowledge and the foreign language learner. Br. J. Educ. Technol. 2019, 50, 3252–3263.
  44. Davis, R.O.; Vincent, J.; Wan, L. Does a pedagogical agent’s gesture frequency assist advanced foreign language users with learning declarative knowledge? Int. J. Educ. Technol. High. Educ. 2021, 18, 1–19.
  45. Ritzhaupt, A.D.; Kealy, W.A. On the utility of pictorial feedback in computer-based learning environments. Comput. Hum. Behav. 2015, 48, 525–534.
  46. Wang, J.; Dawson, K.; Saunders, K.; Ritzhaupt, A.D.; Antonenko, P.P.; Lombardino, L.; Keil, A.; Agacli-Dogan, N.; Luo, W.; Cheng, L.; et al. Investigating the effects of modality and multimedia on the learning performance of college students with dyslexia. J. Spec. Educ. Technol. 2018, 33, 182–193.
  47. Hostetter, A.B.; Skirving, C.J. The effect of visual vs. verbal stimuli on gesture production. J. Nonverbal Behav. 2011, 35, 205–223.
  48. Ritzhaupt, A.D.; Gomes, N.D.; Barron, A.E. The effects of time-compressed audio and verbal redundancy on learner performance and satisfaction. Comput. Hum. Behav. 2008, 24, 2434–2445.
  49. Leppink, J.; Paas, F.; Van Gog, T.; Van Der Vleuten, C.P.M.; Van Merrienboer, J.J.G. Effects of pairs of problems and examples on task performance and different types of cognitive load. Learn. Instr. 2014, 30, 32–42.
  50. Braun, V.; Clarke, V. What can thematic analysis offer health and well-being researchers? Int. J. Qual. Stud. Health Well-Being 2014, 9, 26152.
  51. Terry, G.; Hayfield, N.; Clarke, V.; Braun, V. Thematic analysis. In The SAGE Handbook of Qualitative Research in Psychology; SAGE Publications Ltd.: Thousand Oaks, CA, USA, 2017; Volume 2, p. 25.
  52. Clarke, V.; Braun, V. Thematic analysis. J. Posit. Psychol. 2017, 12, 297–298.
  53. Pirolli, P.; Card, S. Information foraging. Psychol. Rev. 1999, 106, 643–675. [Google Scholar] [CrossRef]
  54. Huk, T. Who benefits from learning with 3D models? The case of spatial ability. J. Comput. Assist. Learn. 2006, 22, 392–404. [Google Scholar] [CrossRef]
  55. Hostetter, A.B.; Alibali, M.W. Raise your hand if you’re spatial: Relations between verbal and spatial skills and gesture production. Gesture 2007, 7, 73–95. [Google Scholar] [CrossRef]
Figure 1. Middle and side images with agent gestures.
Figure 2. Satisfaction descriptive plots.
Figure 3. Cognitive load descriptive plots.
Figure 4. Cued recall descriptive plots.
Table 1. Order of the research intervention and data collection.
1. Prior knowledge test
2. First video with side or middle image placement
3. Satisfaction survey
4. Cognitive load survey
5. Cued recall assessment
6. Second video with the remaining image placement
7. Satisfaction survey
8. Cognitive load survey
9. Cued recall assessment
Table 2. Group means and standard deviations.
                                        Cognitive Load
Group   Satisfaction    IL              EL              GL              Cued Recall
S1      45.65 (12.39)   12.67 (5.88)    17.15 (5.59)    18.52 (7.47)    1.25 (1.74)
S2      45.88 (12.78)   14.20 (4.78)    16.75 (4.88)    18.70 (6.47)    2.18 (2.36)
M1      48.80 (11.42)   11.35 (4.50)    17.83 (4.77)    19.95 (7.16)    1.60 (1.87)
M2      43.38 (11.28)   14.95 (5.28)    15.70 (3.66)    17.73 (7.61)    1.52 (2.05)
Table 3. Interview participants.
No.   Gender   Major                                             Grade       Condition   Preferred Placement
1     Female   International Relations and English Literature    Senior      S1M2        Middle
2     Male     ELLT                                              Senior      M1S2        Side
3     Female   EICC                                              Sophomore   M1S2        Side
4     Female   EICC                                              Junior      M1S2        Middle
5     Male     ELLT                                              Senior      S1M2        Middle
6     Female   EICC                                              Sophomore   S1M2        Side
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Davis, R.O.; Vincent, J.; Yang, E.B.; Lee, Y.J.; Lee, J.H. The Exploration of Combining Hologram-like Images and Pedagogical Agent Gesturing. Information 2025, 16, 350. https://doi.org/10.3390/info16050350

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.