Article

Human–AI Co-Drawing: Studying Creative Efficacy and Eye Tracking in Observation and Cooperation

School of Mechanical Engineering, Southeast University, Nanjing 211189, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8203; https://doi.org/10.3390/app14188203
Submission received: 1 August 2024 / Revised: 7 September 2024 / Accepted: 9 September 2024 / Published: 12 September 2024

Abstract

Artificial intelligence (AI) tools are rapidly transforming traditional artistic creation, influencing both painting processes and human creativity. This study explores human–AI cooperation in real-time artistic drawing using the AIGC tool KREA.AI. Participants wore eye trackers and performed drawing tasks while adjusting the AI's parameters. The research investigates how cross-screen and non-cross-screen conditions, as well as different viewing strategies, affect cognitive load and the degree of creative stimulation during user–AI collaborative drawing. Adopting a mixed design, it examines the influence of different cooperation modes and visual search methods on creative efficacy and visual perception through eye-tracking data and a creativity performance scale. Cross-screen type and task type have a significant impact on total interval duration, number of fixation points, average fixation duration, and average pupil diameter in occlusion decision-making and occlusion hand drawing. Average gaze duration and average pupil diameter differ significantly across task types and cross-screen types. In non-cross-screen situations, occlusion and non-occlusion have a significant impact on average gaze duration and pupil diameter, suggesting that tasks in non-cross-screen environments are more sensitive to visual processing. AI involvement in hand drawing during non-cross-screen collaborative drawing has a significant impact on designers' visual perception. These results help us to gain a deeper understanding of user behaviour and cognitive load under different visual tasks and cross-screen conditions. Analysis of the creative efficacy scale data reveals significant differences in designers' ability to supplement and improve AI ideas across different modes, indicating that the extent of AI participation in the designer's hand-drawn creative process significantly impacts the designer's behaviour when negotiating design ideas with the AI.

1. Introduction

AI technologies have entered the art field over the last decade. A series of text-to-image AI systems, such as Disco Diffusion, Midjourney, Stable Diffusion, OpenAI's DALL-E 2, and Google's Imagen, have made great strides [1,2,3,4]. This progress has triggered a heated debate about human–AI cooperation and creativity in AI art. These systems generate images from natural language descriptions, which greatly enhances the efficiency and effectiveness of translating creativity into visuals [1]. Human–AI cooperative drawing refers to the process of drawing creation in which humans and AI technology participate together. In this cooperation, humans provide creativity, conceptualization, and aesthetic guidance for decision-making and selection, while the AI uses its powerful computational capabilities and algorithms to generate unique images or paintings [2,3]. In the past, authors of both traditional and digital paintings needed to be skilled in the use of tools and to have extensive technical experience to accurately map their mental imagery to the visual layer. However, in co-creation with text-to-image AI generators, artists and non-artists alike can input textual descriptions to generate many high-quality images [5].
Most current AI drawing tools generate images from text, which limits expression to what text can describe, and novices face learning costs for operations such as multiple parameters and prompt commands. However, it is now possible to generate images from sketches combined with text, which increases the level of human intervention and freedom. In the real-time AI drawing application KREA.AI, artists interact with the AI through textual descriptions coupled with simple brushstrokes and colours; the AI interprets and elaborates on the user's inputs in real-time and transforms them into complex, detailed images. This real-time feedback makes the drawing process full of unpredictability: the artist can see the AI's response to the picture while drawing and adjust the picture accordingly, which also allows the user to adopt a variety of strategies for viewing the picture and to form new ways of cooperating on the drawing. The generative mechanism uses a verbal–visual model to understand the user's "cue + hand-drawn" input and then guides the generator to produce a high-quality image; it can compose images of any style and content based on cues and hand drawings. It is worth discussing what new variations this new HCI paradigm makes possible. Existing research on the impact of human–AI collaborative drawing on visual and creative perception is scarce and lacks empirical studies [5,6]. Eye-tracking technology can record and analyse multiple metrics, such as users' attention distribution and concentration, as well as visual perception and cognitive load while cooperating with the AI; it has unique value and has rarely been applied in such studies. Therefore, this study aims to explore how different observation strategies and cooperation modes affect human visual and creative perception.
The study tests these effects through an eye-tracking experiment and a creative efficacy scale, offers reflections on different modes of cooperation for future design practitioners and AI drawing enthusiasts, and encourages both design professionals and non-professionals to experiment with new ways of cooperating. Based on the findings, people can choose more efficient human–AI collaborative drawing modes, improve cooperation and drawing efficiency, and focus their visual attention. The study improves visual search efficiency and provides new research perspectives for later researchers considering efficient, high-quality cooperation modes and interaction forms. Finally, this study finds that AI participation in designers' hand-drawing behaviour in non-cross-screen collaborative drawing has a significant impact on designers' visual perception, and that designers' ideas about supplementing and refining AI output vary significantly across modes.

2. Literature Review

2.1. AI Generation Applications

AI plays an important role in art design and drawing. Most people express their ideas in rough sketches and lack the expertise to produce pleasing drawings. Existing AI methods are somewhat capable of transforming different user sketches into artistically beautiful drawings while preserving their semantic concepts [7]. KREA.AI’s user-friendly interface is a game changer for professional artists and amateurs alike. The platform offers a range of tools including colour pickers, shape tools, and the ability to integrate images as a basis for further drawing. What sets KREA.AI apart is its real-time rendering capabilities. While the user sketches on one side of the screen, the AI simultaneously generates detailed artistic interpretations on the other. At the heart of KREA.AI is the Latent Consistency Model (LCM), which dramatically reduces the number of steps required in the image generation process, enabling the rapid creation of complex visual effects. With over 2500 AI models, it opens up a world of possibilities for generating professional visual content and exploring the boundaries of imagination.
KREA.AI is a new creative tool that uses AI to generate high-quality visuals while understanding the user's style, concept, or product. Users can upload images and train the AI engine to generate images on the canvas. KREA.AI (as shown in Figure 1) can be used in graphic design, product photography, concept art, architecture, and more. Artists interact with the AI through simple brushstrokes and colour choices, and the AI interprets and elaborates on the user's inputs in real-time, transforming them into complex, detailed images. This real-time feedback makes the painting process full of unpredictability. KREA.AI can also use the camera and Screen2img screen capture as input: users can route content from other devices and apps, such as hand drawings, 3D models, graphics, sketches, or reference drawings, and KREA.AI will instantly translate it into AI-generated images. This functionality pushes the boundaries of interactivity, enabling users to engage with artistic creations in entirely new ways. KREA.AI illustrates how AI can be a tool to augment and enhance human artistic expression and is a testament to the symbiotic relationship between human ingenuity and machine intelligence.

2.2. Creative Mapping Visual Perception

For humans, 80 per cent of information and knowledge is acquired from the human visual system. Visual perception is more important than other perceptions in the construction of knowledge systems. Visual complexity, as a fundamental aspect of visual perception, is extremely important for humans to understand and perceive visual stimuli [8]. Visual complexity is considered to be the main cue for judging visual attractiveness, and from a psychological point of view, measures of visual complexity help human viewers to analyse the impact of visual complexity on aesthetic judgments and are therefore also useful for neuroscientists and psychologists interested in the mechanisms of object perception as well as in the processes of learning and memory [9,10]. In an applied sense, computer engineers can use measures of visual complexity to build information systems and tools [10,11] for image analysis, estimation, visualization, and recognition, and to allow designers to predict the aesthetic and emotional responses of consumers and users to the complexity of a product [12]. Cardaci et al. proposed an experiment to obtain the perceptual time of a painting. The experiment aimed to establish a relationship between objective measures of complexity and perceived time. The results showed a strong correlation between the psychological and computational results (statistical properties of paintings) [13]. Tamm et al. focused on how data analytics can be applied in creative decision-making to achieve more evidence-supported decisions [14].
By analysing the case study of Rovio, they explored how to make creative decisions more evidence-supported while preserving artistic intuition and human creativity. Therefore, there is much room for exploring the principles and methods of applying data analysis in creative decision-making. Spering and other scholars have used eye movement metrics as a window into the decision-making process [9]. Specifically, researchers can measure gaze duration to understand the allocation of attention during the decision-making process. Longer gaze durations may indicate in-depth processing of specific information. Researchers can understand the cognitive load and attention level of decision-makers by measuring pupil size. Larger pupils may indicate a higher cognitive load or higher attention to certain information. These eye movement metrics can provide information about the cognitive and attentional mechanisms involved in decision-making and help researchers understand the basis of decision-making behaviour [11,15]. Researchers have used pupil size as an indicator of participants’ decision uncertainty and confidence levels. They obtained this information by measuring the participants’ pupil size. The researchers recorded the amount of time participants spent making decisions. They found that decision time was related to decision accuracy and confidence level. Utz et al. used both quantitative and qualitative data collection methods to test the research framework [10]. Xie et al. used eye-tracking technology to record participants’ attention allocation and learning performance evaluation in different drawing modes [16]. Objective experiments were assessed with the help of eye movement equipment by quantitative indicators such as gaze duration, number of gaze points, time of first entry into the area of interest, and eye-tracking graphs [17]. Alemdag found that eye movement measurements provide inference on the cognitive processes of selection, organization, and integration; that multimedia learning principles, multimedia content, individual differences, metacognition, and emotion are potential factors that may affect eye movement measurements; and that the study results can support the link between cognitive processes and academic performance inferred from eye-tracking measures [18].
Higher visual attention to relevant pictures is associated with higher academic performance. Directing learners' attention to pictures is crucial in multimedia learning: verbal cues can direct learners' attention, pictures can be embedded in the text, and more complete and appropriate transitions can be made between text and pictures. Chang noted that the human visual system is considered the most appropriate basis for assessing image quality [19]. Karran et al. used eye-tracking technology to examine the relationship between stimulus presentation and trust, finding that the order and proximity in which information is presented within the interface influences user trust in the AI, and that cognitive load per se does not mediate the confidence–trust relationship [20]. Eye tracking has particular advantages in the study of drawing processes. Compared with other methods, such as thinking aloud, in which learners provide insight into cognitive processes (e.g., action strategies) by consciously verbalizing their thoughts during task processing, eye tracking offers the added value of recording the gradual progression of problem-solving based on eye movements without interfering with the task. Triangulating the recorded eye movements with a subsequent review provides a more nuanced understanding of the learner's actual problem-solving process. In this way, problem-solving-related behaviours, such as longer dwell times, search processes, or (unconscious) attributions of meaning, become apparent during the drawing process that would otherwise go undetected. This in turn leads to new, process-related analytical perspectives on the drawing process [21]. A seven-point Likert scale was used for scoring. The purpose of the experiment was to validate the research framework and to obtain more information about the participants' perceived stimuli. Analysing the quantitative data allowed the researchers to verify the validity of the research framework, and analysing the qualitative data provided additional information for future research.

2.3. Creative Self-Efficacy

In design conceptualization, scholars have compared the traditional approach with an AIGC-integrated approach, as well as their correspondence with HOTS. Self-efficacy is an individual's subjective evaluation of his or her ability to perform a specific task, influenced by previous experiences and self-assessment. It is considered a key mediator that influences behavioural choices, effort intensity, and decision-making. Research has shown that enhanced self-efficacy has a key positive impact on creative thinking, leading to improved learning outcomes. Traditional measures of creativity include the Torrance Test of Creative Thinking, one of the most widely used instruments in creativity testing, which contains both verbal and graphical components measuring four dimensions: fluency, flexibility, originality, and elaboration [22]. It is used to assess an individual's ability to think creatively and is widely used in education, psychological research, and talent selection [22]. The Creative Behaviour Scale focuses on assessing the frequency and diversity of individuals' generation of novel ideas and their willingness to explore different possibilities. It is suitable for studying individuals' creative behaviours and creative thinking tendencies in daily life [23].
The Creative Self-Efficacy Scale assesses the degree of confidence individuals have in themselves to perform creative tasks. Based on self-efficacy theory, it focuses on an individual’s belief system and self-evaluation [24]. It is widely used in educational psychology and organizational behaviour research, especially in the context of innovation and creative performance. The Creative Achievement Questionnaire measures creativity by assessing an individual’s creative achievement in various domains (e.g., music, writing, science, etc.) [25]. It is used to identify and assess individuals with high creative potential and applies to both adults and adolescents. The Emerson Creative Stimulation Scale was developed by Teresa Amabile and focuses on assessing the factors that support and hinder individual creativity in the work environment. When assessing the degree of human creativity stimulated by four different collaborative approaches during a human–AI collaborative mapping experiment, the Creative Self-Efficacy Scale (CSES) is the appropriate scale to use given the characteristics and goals of the experiment [26].
The Creative Self-Efficacy Scale has a degree of applicability for the assessment of individual confidence. This scale focuses on assessing an individual’s confidence in their ability to complete creative tasks, which is particularly important for human–AI collaborative mapping experiments, where participants may need to adapt to new technologies and ways of cooperating [26,27]. Self-efficacy is a key factor influencing an individual’s ability to attempt and persist with creative tasks. Applicable to new environments and technologies, given that human–AI cooperation is a relatively novel field, individuals’ assessment of their confidence in their ability to successfully perform creative mapping in this new environment can provide important insights into understanding the impact of AI tools on creative stimulation. The scale is flexible in that the Creative Self-Efficacy Scale can be flexibly adapted to different experimental designs and cooperation styles. By making appropriate adjustments or additions to the scale, it can be made more relevant to specific human–AI cooperation settings. By assessing participants’ self-efficacy in different cooperation styles, researchers can gain insights into which cooperation styles are more likely to promote participants’ creative thinking and behaviours, and where participants may need more support or training. The Creative Self-Efficacy Scale has been widely used in the fields of educational psychology and organizational behaviour, illustrating its validity and reliability in assessing creative confidence in different contexts [28,29].

2.4. Human–AI Cooperation Model

Human–AI interaction can be conceptualized using scenarios for four basic levels of cooperation between AI and humans in a work environment [30]. At Level I, human–AI cooperation does not exist, as employees either compete directly with AI systems or work independently of them. This is especially true when using AI systems that substitute for human decision-making, which offer employees the possibility to interact with these systems [31]. At Level II, humans and AI systems complement each other, with AI systems handling complex calculations or processing large amounts of data, while humans use their social and emotional skills to make complex decisions. At Level III, humans and AI systems depend on each other for task performance. At Level IV, the AI system becomes a true extension of the human brain and the two intelligences engage in fully collaborative work [32], as shown in Figure 2.
Human interaction with AI systems involves the allocation of functions (i.e., meaning, decision-making, execution capabilities, and execution authority) [33]. The research method of human–AI cooperation can adopt Therbligs Analysis (TA), which analyses the main interaction elements under different interaction tasks based on natural interaction actions, collects human interaction characteristics from eye movements and other dimensions, and explores the explicit and implicit interaction intentions of human beings in human–computer cooperation tasks to machine intelligence.
Cognitive Work Analysis (CWA) is used to deconstruct typical decision-making task scenarios of human–computer interaction in complex information systems, to refine the human task objectives layer by layer, and to capture the correlation between system components and system objectives. As shown in Figure 3, following the human–AI collaborative drawing process chart, behaviours and functions are allocated between the human and the AI, the corresponding process is divided into stages, and the degree of participation of the human and the AI at each stage is summarized. The drawing process involves drawing, adjusting, and selecting. Human behaviour involves hand drawing, adjusting AI parameters, and image screening. These actions correspond to the AI behaviours of generating images in real-time, transforming the image according to the human's parameter adjustments, and exporting graphics.

2.5. Research Hypothesis

Previous studies have lacked field studies of drawing operations in real-life scenarios combined with the synchronized recording of subjects' creative behaviour [34]. Most simulation scenarios lack authenticity, and simulated collaborative scenes fail to capture the subject's real operations and perceived differences. Therefore, in this drawing experiment, real drawing scenarios were recorded with a head-mounted eye-tracking device; such eye-tracking records of collaborative work can help researchers and educators track and promote the collaborative creative process more effectively [35]. By realistically observing subjects performing drawing tasks under cooperation modes with different degrees of AI involvement and different visual strategies, we can collect real and valid eye movement data reflecting differences between cooperation modes and visual strategies. Therefore, we can investigate the cooperation modes with the support of this scenario and equipment. At the same time, visual characteristics will, to a certain extent, affect thinking and perception during the collaborative design process [36].
By changing the visual factors, we can study whether different visual observation modes in collaboration cause differences in creative perception, and we can design different observation strategies for collaboration to study these perceptual differences; different collaboration modes produce visual differences, and visual differences in turn cause differences in creative perception [21]. The differences in creative perception caused by different collaboration modes and visual differences, and their degree, need to be examined in specific design and drawing scenarios and explored through experiments [16]. Based on the research questions of design collaboration modes and visual strategies, the following research hypotheses are made.
Hypothesis 1:
Different cross-device cooperation methods have a significant effect on users’ visual cognitive load in collaborative drawing with AI.
Hypothesis 2:
Different visual observation strategies for cross-device cooperation have a significant effect on users’ visual cognitive load in collaborative drawing with AI.
Hypothesis 3:
Different visual observation strategies for cross-device cooperation have a significant effect on users’ creative self-efficacy in collaborative drawing with AI.
Hypothesis 4:
Different cross-device collaborative approaches have a significant effect on users' creative self-efficacy in collaborative drawing with AI.

3. Materials and Methods

3.1. Stimuli

3.1.1. Cross-Screen Mode Setting

Cross-screen drawing (AI-generated images and hand-drawn regions shown in split-screen) typically requires a higher cognitive load because it involves more complex hand–eye coordination. There is no direct spatial correspondence between the user’s hand movements on the drawing board and the visual feedback on the screen, meaning that the user must mentally establish a connection between the two. This separation results in the need for the user to constantly switch their visual attention between the drawing board and the display while adjusting their hand movements to match the output on the screen. Beginners, in particular, may find this process challenging as it requires the brain to process multiple streams of information simultaneously and takes time and practice to adapt.
Non-cross-screen drawing (AI-generated images and hand-drawn areas on the same screen) on the other hand, typically has a lower cognitive load because hand movements and visual feedback occur in the same location, which is more similar to drawing with a pen and paper. This direct spatial correspondence reduces the complexity of hand–eye coordination and allows users to more intuitively see how their movements translate into drawing results. By reducing the cognitive separation between vision and motion, users can focus more on the creative part of the process rather than spending a lot of energy on understanding how hand movements affect the outcome.
Cross-screen hand drawing (increased visual switching, decreased hand–eye coordination)—hand occlusion: The setup of using a tablet to draw and view the results on the same device, such as an iPad or other tablet with a touch screen, where the user views the image directly on the surface on which they are drawing, provides direct visual feedback. This approach is more intuitive because the gestures and visual feedback are in the same location, reducing the difficulty of hand–eye coordination and making it easier for beginners to get started.
Non-cross-screen hand drawing (reduced visual switching, improved hand-eye coordination)—hand–eye separation: Viewing the monitor while drawing with a digital pad, the drawing gestures are separated from the screen which the eye is viewing. This approach requires higher hand–eye coordination because the user needs to establish a mapping relationship in their head between their hand movements and the visual feedback [37]. This involves reduced hand occlusion and screen switching [22,33]. The cross-device cooperation scenarios are summarised in Table 1.

3.1.2. Masking Settings for Visual Strategies

When using the AI real-time image generation interface, the user can hand draw on the touch screen while looking up to observe the AI generation interface on the monitor. Based on the user's changes with each stroke, the AI generates a different image. In the collaborative process, the user can adopt different collaborative observation strategies.
The first collaborative observation strategy:
(According to the vertical description of the first strategy in Table 2).
Draw the entire sketch, then observe the AI image generation adjustment: The first collaborative observation strategy is to draw a complete sketch. Then, observe the AI-generated image to adjust. By adjusting the AI changes to the already drawn sketch, we obtain the final collaborative idea. In the sketch drawing stage, the user can carry out self-creative conception through drawing, without the influence of the AI.
The second collaborative observation strategy:
(According to the vertical description of the second strategy in Table 2).
Carry out hand drawing while observing the AI’s creativity: In the process of drawing, the AI participates in the drawing conception of the user, and the process of AI participation in the user’s conception enriches the user’s creativity. However, the visual cognitive load is aggravated. The user’s conception will be affected by the AI image.
Therefore, this study conducts eye-tracking drawing experiments on users' cognitive load and creativity across four collaborative drawing strategies using the AI real-time generation interface, evaluates creativity through a creativity scale, and studies the relationship between visual cognitive load and creativity. Finally, it identifies the cooperation strategy with the lowest cognitive load and the highest degree of creativity among the four collaborative drawing modes.

3.1.3. Creativity Scale Design

Creative self-efficacy refers to an individual's belief in his or her ability to generate novel, original, and appropriate ideas, solutions, or behaviours according to the task requirements in a given situation. It is part of an individual's general self-efficacy [26]. The first three questions were derived from Oldham and Cummings (1996). The internal consistency coefficient of the whole scale is 0.93, indicating good reliability. Details of the specific scales can be found in Appendix A (Table A1).

3.2. Experimental Design

The purpose of the experimental study is to investigate the effects of cross-screen, non-cross-screen, and different observation strategies on the cognitive load and the degree of creative stimulation of the users during collaborative drawing with AI using cross-device methods.

3.2.1. Experimental Setup and Subject

The experimental design was a 2 × 2 two-factor mixed design with cross-screen/non-cross-screen as the between-subjects factor (A1/A2) and the two observation strategies as the within-subjects factor (B1/B2), giving a total of four combinations. To determine the sample size, G*Power 3.1.9.7 was used to estimate the number of participants. The effect size was 0.40, alpha was 0.05 (a significance level of 5%), the number of groups was 2, four repeated measures were taken for each participant, and the correlation among repeated measures was 0.5. This calculation gave a power of 81.57%, indicating that the study required a minimum of 24 participants. Thirty-four postgraduate students majoring in industrial design at Southeast University were recruited. After removing four subjects whose eyes could not be calibrated, the remaining 30 postgraduate students (aged 23–25 years, 13 males and 17 females) were divided into two groups: the cross-screen group (A1B1, A1B2) and the non-cross-screen group (A2B1, A2B2).
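As a rough illustration only (not the authors' G*Power procedure), a between-subjects approximation of this a priori power analysis can be computed in Python with statsmodels. Because FTestAnovaPower does not account for the correlation among the four repeated measures, the resulting sample size will only approximate the reported minimum of 24 participants.

```python
# Approximate a priori power analysis; a sketch, not the G*Power calculation.
from statsmodels.stats.power import FTestAnovaPower

analysis = FTestAnovaPower()
n_total = analysis.solve_power(
    effect_size=0.40,  # Cohen's f, as entered in G*Power
    alpha=0.05,
    power=0.80,
    k_groups=2,        # cross-screen vs. non-cross-screen groups
)
print(f"Approximate total sample size: {n_total:.1f}")
```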
In addition, data from a number of subjects were drawn from an experimental study of icon search characteristics based on feature-based reasoning by Peng et al., retaining the eye-movement data of subjects whose sampling rate was greater than 80%. Four datasets with a sampling rate lower than 80% were removed, and 26 subjects with an eye movement data sampling rate of 86% or more were selected for analysis. For the drawing cue words, 20 industrial design drawing cue words were randomly generated by ChatGPT, from which four were selected by an industrial designer as drawing instructions. Table 3 describes the experimental scenarios and experimental variables.

3.2.2. Experimental Procedures

Each subject in each group randomly chose 8 cue words to draw and completed 4 drawings in each experimental condition. Each drawing was limited to around 2–3 min. Each subject randomly completed 2 visual search strategy drawing tasks. Subjects wore Tobii Glasses 2, and their eye movement data were recorded. The time required and the number of images generated after the completion of each drawing were recorded, and the hand-drawn sketch and the generated images were saved. In a comparable experimental setting, Park et al. had participants complete a short test trial in which they were asked to observe and then draw simple geometric shapes; participants were then asked to produce a series of three representative drawings [38], and the generated images were recorded. The total time for each subject to complete the experiment was controlled to around 30 min. Each subject completed the questionnaire once after each visual strategy experiment, i.e., twice in total.
Experimental equipment included the Tobii Glasses 2; two computers, a Dell computer with a 27-inch monitor connected to the Tobii Glasses 2 and used to monitor the subjects' visual images in real-time, and a LENOVO computer with a 15.6-inch monitor at a resolution of 1920 × 1080 (60 Hz) used for the subjects' collaborative drawing experiments; a Huawei MatePad Pro tablet for cross-screen hand drawing; and a Wacom touchpad (CTL-4100 KO/C) for hand drawing. The breakdown of the equipment is shown in Table 4.
The guidance process for each mapping task is as follows and shown below in Figure 4:
  • Show the introduction of the experiment.
  • After carefully reading the experimental procedure, participants were asked to perform a practice task.
  • The eye-tracker headband was fitted and then a short calibration procedure was carried out. Adjustments were made to keep the participant's line of sight perpendicular to the monitor and at a distance of 60 cm from it.
  • The cue word was displayed, the cue word was observed for 1 min, and the cue word was entered.
  • Participants started drawing for a controlled period of two and a half minutes, during which the eye tracker recorded eye movement data, gaze duration, number of gazes, eye jumps, etc.
  • Save the generated image.
  • Repeat the operation.
  • Fill in the questionnaire.
The experimental task was constructed as follows, with the subject completing the experiment recorded on video and the subject's visual field recorded by the eye tracker, as shown in Figure 5. Figure 5 mainly displays and explains the experimental scene, experimental settings, and scene construction. The left and right sides distinguish the experimental states of participants and the AI collaborative design under the different experimental settings. The middle area shows the human–AI collaborative relationship of the participants in the two experimental scene settings. Arrows, diagrams, and boxes indicate the purposes and roles of the different devices in the human–AI collaborative design relationship.

3.2.3. Selection of Eye Movement Data Metrics

User creative experience is important in a variety of fields such as design, art, and education, and eye-tracking data can provide unique insights to help researchers and designers understand how users process information and make decisions in creative tasks [11]. Eye-tracking technology can record data such as the distribution of a user’s visual attention, gaze points, and sweep paths, which reflect the cognitive processes of the user when faced with a creative task. Here are a few key points on how eye movement data can be used to explain a user’s creative experience: Gaze points and gaze durations can mark the identification of creative elements. A user’s gaze points on creative content can reveal which elements captured their attention. Longer gaze durations may indicate that users are thinking deeply or trying to understand a particular creative element.
Gaze points and gaze durations can signify indicators of thought processes, with longer durations of gaze possibly associated with deep thinking or information processing, suggesting that the user is investing more cognitive resources in interacting with specific elements of the creative task. Sweeping paths can characterize information search and integration: users’ sweeping paths show how they move their eyes between creative content, reflecting the process of information search and integration. Non-linear or exploratory scanning paths may indicate that users are exploring connections between creative elements or looking for inspiration. Sweeping paths may mark the exploration of creative solutions; in solving creative problems, users may compare and weigh multiple potential solutions. This comparison process may be reflected by complex or repetitive sweeping paths. Heat maps can characterize areas of concentration of creative focus. Heat maps can show areas of concentration of a user’s eyes during a creative experience, revealing the creative elements that capture the user’s attention.
These areas of concentration may be starting points for innovative thinking or important creative triggers. Hotspot maps can indicate the distribution and breadth of visual attention, showing how users allocate their attention during creative tasks. A dispersed distribution of attention may indicate that the user is searching widely for inspiration or information. Pupil diameter correlates with the degree of attentional focus, and pupillary response correlates with the distribution of attention. When an individual is particularly interested in an object or activity, their pupils dilate. In experiments, this can help the researcher to understand which elements or interactions attract the participants' attention the most [39]. One study investigated eye gaze patterns and found that the search strategies employed by participants depended, to some extent, on verbal fluency and ranged from strictly structured search patterns to random searches for the target word. Another showed that pupil changes reliably predicted strategy changes during hierarchical decision-making, providing evidence for improved decision-making models as well as for the development of more effective decision aids [8,40].
In conclusion, by analysing eye movement data, researchers and designers can gain insights into how users interact with creative content, and which elements most stimulate creative thinking, revealing the cognitive and visual strategies users use when faced with creative challenges. These insights can help optimize the design of creative content, enhance the user’s creative experience, and potentially promote higher levels of innovation and creative expression, hence the choice of annotation points, gaze duration, and pupil diameter as observational variables.

3.2.4. Data Processing and Analysis

Theoretical Basis for Data Processing

Peng and other scholars mainly analysed eye movement data including the overall number of gaze points and the average duration of gaze points [38]; eye movement data of subjects with a sampling rate of less than 80% were excluded, so in the end only eye movement data from 24 valid subjects were used for further analysis [40]. Therefore, the data of subjects with a sampling rate higher than 80% were also selected in this study. Pupil size is one of the most commonly used response systems in psychophysiology and can reflect changes in cognitive and emotional states [41,42]. However, preprocessing of pupil size data is a critical step, as incorrect preprocessing can lead to data contamination and misinterpretation. Eldar et al. reported on the relationship between decision-making bias and pupillary responses; they conducted experiments to explore decision bias in different decision-making situations and investigated the association between the situations and decision bias by measuring participants' pupillary responses [43]. Wu et al. used gaze duration and gaze points for data partitioning in an interactive interface cognition study [44], using repeated measures ANOVA to analyse pupil size data [45] and ANOVA to analyse eye movement data.
A repeated measures ANOVA is a statistical method used to compare the performance of the same group of participants across conditions. Differences in behaviour across conditions can be determined by analysing participants’ eye movement measurements in different languages and target word lengths [41]. Chen used an experimental approach in which participants first viewed a reference icon and then compared it to a test icon. By analysing the results of the participants’ choices, an ANOVA (analysis of variance) test was used to determine the significance of the variables [45]. Leyla used a variety of data analysis methods to investigate the relationship between decision-making processes and pupil size. The Kruskal–Wallis test was used to assess the significance of the behavioural data and was used to analyse the effect of varying stimulus intensities on behavioural accuracy. Pearson’s correlation coefficient was used to analyse the relationship between stimulus intensity and participant-reported self-confidence. The Anderson–Darling test was used to check whether the data conformed to a normal distribution. It was used by the authors to check the distribution of pupillary signals. The Wilcoxon rank-sum test was used to compare the differences between the two sample groups. It was used by the authors to compare changes in pupil size between conditions [39].

Data Analysis Methods and Ideas

Analysing eye movement data is a complex but very valuable process that can be used to understand the user’s visual attention and reading habits, and the interaction patterns between the user and the interface [43]. Data collection and processing are performed after first completing the experiment to ensure that the data are clean and have no missing values. The data for each subject should include the identification of whether they belong to the cross-screen or non-cross-screen scenario, as well as the masking condition and non-masking data for grouping. Pupil diameter, gaze duration, number of gaze points, and decision time for both viewing strategies were recorded. Descriptive statistics were then performed to calculate the mean and standard deviation for each condition to gain an initial understanding and ensure that the data met the assumptions underlying the ANOVA. Next, the ANOVA’s precondition normality is tested: checking whether the dependent variable is approximately normally distributed in each condition. This can be achieved using the Kolmogorov–Smirnov or Shapiro–Wilk tests [46]. We used Levene’s Test to test for consistency of variance across groups and to ensure independence, making sure that the individual data points were independent of each other.
A mixed ANOVA was performed in SPSS, with cross-screen versus non-cross-screen group as the between-group variable and observation strategy (presence or absence of visual interference) as the within-group variable. An interaction test was conducted to examine whether there was an interaction between the between-group and within-group variables; a significant interaction indicates that the observation strategies affect the two groups differently. Finally, follow-up tests can be performed: if the ANOVA shows a significant main effect or interaction, multiple comparisons, such as Tukey HSD or Bonferroni correction, may be needed to identify specific differences between levels. Finally, we report the results, including F-values, degrees of freedom, p-values, and effect sizes (e.g., partial eta squared) [47].
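A minimal sketch of this pipeline in Python (rather than SPSS) is shown below. The data file name and column names (subject, group, strategy, pupil_diameter) are illustrative assumptions about a long-format data frame, and pingouin's mixed_anova stands in for SPSS's mixed ANOVA; in older pingouin versions the post hoc function is pairwise_ttests rather than pairwise_tests.

```python
# Sketch of the analysis pipeline described above (assumed column names).
import pandas as pd
import pingouin as pg
from scipy import stats

df = pd.read_csv("eye_tracking_long.csv")  # hypothetical file

# 1. Normality of the dependent variable within each cell (Shapiro-Wilk).
for (group, strategy), cell in df.groupby(["group", "strategy"]):
    w, p = stats.shapiro(cell["pupil_diameter"])
    print(group, strategy, f"W={w:.3f}, p={p:.3f}")

# 2. Homogeneity of variance across the between-subjects groups (Levene).
print(stats.levene(
    df.loc[df.group == "cross_screen", "pupil_diameter"],
    df.loc[df.group == "non_cross_screen", "pupil_diameter"],
))

# 3. 2 x 2 mixed ANOVA: group (between) x observation strategy (within).
aov = pg.mixed_anova(data=df, dv="pupil_diameter", within="strategy",
                     between="group", subject="subject")
print(aov)  # F, degrees of freedom, p-values, partial eta squared (np2)

# 4. Bonferroni-corrected follow-up comparisons if effects are significant.
posthoc = pg.pairwise_tests(data=df, dv="pupil_diameter", within="strategy",
                            between="group", subject="subject",
                            padjust="bonf")
print(posthoc)
```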

4. Results

The research conclusions based on our research hypotheses are as follows:
  • Different cross-device collaboration methods have a significant impact on the visual cognition of human–AI collaborative drawing design, and tasks in non-cross-screen environments are more sensitive to visual processing.
  • There are different effects of occlusion and non-occlusion on visual attention and cognitive load under cross-screen and non-cross-screen conditions. In non-cross-screen situations, occlusion and non-occlusion have a significant impact on the average gaze duration and pupil diameter. Therefore, different cross-device collaborative visual observation strategies have a significant impact on the visual cognitive load of users and AI collaborative drawing.
  • Under different visual observation strategies, the degree to which the AI participates in the designer’s hand-drawn creative process will have a significant impact on the designer’s behaviour in negotiating design ideas with the AI.
  • The degree to which the AI participates in the designer’s hand-drawn creative process under the influence of different cross-device collaboration methods will have a significant impact on the designer’s behaviour in negotiating design ideas with AI.
The specific statistical analysis is as follows.

4.1. Analysis of Eye Movement Data

After passing the normality test and the homogeneity of variance test, a repeated measures ANOVA was performed to analyse the significance of the effect of the within-group variable, masked versus non-masked, and the between-group variable, cross-screen versus non-cross-screen, on the decision time, the mean pupil diameter at the decision time, the gaze time at the decision time, and the number of gaze points [17].

4.1.1. Data Descriptive Statistics

Descriptive statistical analyses of the data were completed and distribution plots were generated for four key variables. The total interval duration showed a wide range, with a median of approximately 28,007 milliseconds. The number of gaze points had a median of 27, varying from very few to very many. The average gaze duration had a median of 681 milliseconds, and the average pupil diameter during gaze had a median of 3.45 mm.
Occlusion decision and occlusion hand-drawing scenes were analysed for both the cross-screen and non-cross-screen conditions to compare differences in the following metrics [39]: total interval duration (duration of interval), number of whole fixations, average duration of whole fixations, and average whole fixation pupil diameter. The descriptive statistics are shown in Figure 6.
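A brief sketch of how these descriptive statistics could be reproduced with pandas is given below; the file and column names are illustrative assumptions, not the authors' actual variable names.

```python
# Descriptive statistics for the four eye-movement metrics (assumed columns).
import pandas as pd

df = pd.read_csv("eye_tracking_long.csv")  # hypothetical file
metrics = ["interval_duration_ms", "n_fixations",
           "avg_fixation_duration_ms", "avg_pupil_diameter_mm"]

# Medians reported in the text (~28,007 ms; 27 fixations; 681 ms; 3.45 mm).
print(df[metrics].median())

# Mean and standard deviation per cross-screen type and task type.
print(df.groupby(["group", "task_type"])[metrics].agg(["mean", "std"]))
```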

4.1.2. Analysis of Variance of Data

First, the data related to occlusion decisions and occlusion hand drawing were filtered and grouped by cross-screen type, and an analysis of variance (ANOVA) was performed. From the ANOVA results, we can see the effect of cross-screen type and of occlusion decision versus occlusion hand drawing on the following variables.
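The sketch below illustrates, under the same hypothetical column names used earlier, how one of these two-way ANOVAs (cross-screen type × task type on the occlusion data) could be run with statsmodels; it is a sketch under assumed data, not the authors' SPSS procedure.

```python
# Two-way factorial ANOVA: cross-screen type x task type (assumed columns).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("eye_tracking_long.csv")  # hypothetical file
occl = df[df.task_type.isin(["occlusion_decision", "occlusion_hand_drawing"])]

model = smf.ols("interval_duration_ms ~ C(group) * C(task_type)",
                data=occl).fit()
print(anova_lm(model, typ=2))  # F and p for main effects and the interaction
```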
For the total interval duration (duration of the interval), the effect of cross-screen type was F(1,157) = 2.65, p = 0.106, indicating no significant difference at the 0.05 level of significance. The effect of time type was F(1,157) = 2.86, p = 0.093, indicating no significant difference at the 0.05 level of significance, either. The interaction of cross-screen type with time type was F(1,157) = 1.48, p = 0.225, indicating no significant interaction.
For the number of whole fixations, the effect of cross-screen type was F(1,157) = 1.95, p = 0.164, indicating no significant difference. The effect of time type was F(1,157) = 10.24, p = 0.0017, showing a significant difference, indicating that different time types have a significant effect on the number of fixations. The interaction between cross-screen type and time type was F(1,157) = 1.50, p = 0.223; no significant interaction.
The ANOVA results for the remaining two variables (mean gaze duration and mean pupil diameter) showed that cross-screen type and occlusion decision versus occlusion hand drawing also had significant effects on these variables.
For the average duration of whole fixations, the effect of cross-screen type showed a significant difference: F(1,155) = 18.56, p < 0.0001. The effect of task type also showed a significant difference: F(1,155) = 18.58, p < 0.0001. The interaction between cross-screen type and task type was F(1,155) = 18.00, p < 0.0001, a significant interaction.
For the average whole fixation pupil diameter, the effect of cross-screen type was F(1,155) = 10.40, p = 0.0015, indicating a significant difference. The effect of task type also showed a significant difference: F(1,155) = 5.79, p = 0.0173. The interaction between cross-screen type and task type was F(1,155) = 2.93, p = 0.089, indicating no significant interaction at the 0.05 level of significance.
In summary, the effects of cross-screen type and task type on total interval duration, number of gaze points, mean gaze duration, and mean pupil diameter were significant in the occlusion decision versus occlusion hand-drawing situation. In particular, the mean gaze duration and mean pupil diameter variables showed significant differences across task types and screen types. These results can help to provide a deeper understanding of user behaviour and cognitive load under different visual tasks and cross-screen conditions. To make these comparisons, two steps are required.
For the cross-screen condition, compare decision and hand-drawing time differences, gaze point differences, gaze duration differences, and pupil diameter differences between the occluded and non-occluded conditions.
For the non-cross-screen case, the same comparisons were performed. The data were first filtered, and the analyses were run for the cross-screen case and then for the non-cross-screen case, using analysis of variance (ANOVA) to examine these differences.
For the cross-screen case, the results of the analysis are as follows. For the duration of interval, the effect of time type was F(1,79) = 0.162, p = 0.688, indicating that there was no significant effect of occlusion vs. non-occlusion on the total interval duration in the cross-screen condition. For the number of whole fixations, the effect of time type was F(1,79) = 2.03, p = 0.159, indicating that occlusion vs. non-occlusion had no significant effect on the number of fixations in the cross-screen condition.
Next, we view the corresponding analyses for the non-cross-screen condition. The results of the analyses for the non-cross-screen condition are as follows. For the total interval duration, the effect of time type was F(1,78) = 3.30, p = 0.073, indicating that occlusion vs. non-occlusion did not have a significant effect on total interval duration in the non-cross-screen condition, although it was close to the significant level. For the number of whole fixations, the effect of Time type was F(1,78) = 9.54, p = 0.0028, indicating that occlusion vs. non-occlusion had a significant effect on the number of fixations in the non-cross-screen condition.
The ANOVA results for mean gaze duration and mean pupil diameter in the cross-screen and non-cross-screen conditions are as follows.
For the cross-screen condition, for the average duration of whole fixations, the effect of time type was F(1,78) = 0.010, p = 0.920, indicating that there was no significant effect of masking vs. non-masking on the average gaze duration in the cross-screen condition. For the average whole fixation pupil diameter, the effect of time type was F(1,78) = 0.192, p = 0.663, indicating that there was no significant effect of masking vs. non-masking on the average pupil diameter in the cross-screen condition.
For the non-cross-screen condition, for the average duration of whole fixations, the effect of time type was F(1,77) = 23.35, p < 0.0001, indicating a significant effect of masking vs. non-masking on the average gaze duration in the non-cross-screen condition. For the average whole fixation pupil diameter, the effect of time type was F(1,77) = 12.28, p = 0.0008, indicating a significant effect of masking vs. non-masking on the average pupil diameter in the non-cross-screen condition.
These results reveal the different effects of masking versus non-masking situations on visual attention and cognitive load in the cross-screen and non-cross-screen conditions. The effects of masking versus non-masking on mean gaze duration and pupil diameter were more significant in the non-cross-screen condition, whereas these effects were not significant in the cross-screen condition. This may indicate that tasks in non-cross-screen environments are more sensitive to visual processing. The specific data are shown in Figure 7, Figure 8, Figure 9 and Figure 10.
The eye-movement hotspot maps are analyzed in the figure below, and the eye-movement analysis reveals that there are some differences in the distribution and concentration of attention in the choice decision-making process between different cooperation styles, as shown in Figure 11.

4.1.3. Summary of Eye Movement Data Analysis

The eye tracking data revealed the significant effects of observation strategies such as occlusion and non-occlusion, as well as cross-screen and non-cross-screen methods on the mean pupil diameter, fixation time, and number of fixation points. The eye-tracking hotspot maps of the four methods were analyzed, and it was found that different collaboration methods had certain differences in attention distribution and concentration during the selection decision-making process. We obtained visual characteristics and visual cognitive loads under different collaborative observation methods through eye-tracking data.

4.2. Scale Data Analysis

4.2.1. Reliability Analysis

The value of the reliability coefficient is 0.635, which is greater than 0.6, indicating that the reliability of the research data is acceptable. Regarding the "Cronbach's α if item deleted" values, deleting item 3 ("I have a knack for further refining AI ideas") would noticeably increase the reliability coefficient, so revising or deleting this item could be considered. Regarding the CITC values, the CITC of item 3 is less than 0.2, indicating that its relationship with the remaining items is very weak; this item could therefore be considered for deletion (in a pre-test analysis, the item could instead be revised before formal data collection). In summary, the reliability coefficient of the study data is higher than 0.6, which, taken together, indicates that the data reliability is acceptable, as shown in Table 5.
(CITC represents the correlation between each specific measurement item and the overall score of the scale. The CITC value is closely related to the Cronbach’s α coefficient. Cronbach’s α coefficient is a commonly used indicator to measure the internal consistency reliability of a scale, which is calculated based on the correlation between items.)
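The following sketch shows how the reliability statistics reported in Table 5 (Cronbach's α, α if item deleted, and CITC) could be computed in Python; the item data frame and file name are hypothetical, and pingouin's cronbach_alpha is used in place of the original statistical package.

```python
# Reliability statistics for the scale items (assumed item-level data frame).
import pandas as pd
import pingouin as pg

items = pd.read_csv("creative_self_efficacy_items.csv")  # hypothetical file

alpha, ci = pg.cronbach_alpha(data=items)
print(f"Cronbach's alpha = {alpha:.3f}")

for col in items.columns:
    rest = items.drop(columns=col)
    # Corrected item-total correlation: item vs. sum of the remaining items.
    citc = items[col].corr(rest.sum(axis=1))
    alpha_if_deleted, _ = pg.cronbach_alpha(data=rest)
    print(f"{col}: CITC={citc:.3f}, alpha if deleted={alpha_if_deleted:.3f}")
```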

4.2.2. Validity Analysis

The KMO and Bartlett’s tests were used to validate the validity. As can be seen from the table below, the KMO value is 0.676, between 0.6 and 0.7, and the research data are more suitable for extracting information (from the test results, the validity is considered average).
The results of the reliability and validity analysis show that the creative efficacy scale can be used for human–AI collaborative mapping for analysis, and the reliability coefficient value of the research data is higher than 0.6, which comprehensively indicates that the quality of the data’s reliability is acceptable, and the research data are more suitable for extracting information (from the test results, the validity is considered average), as shown in Table 6.
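A comparable sketch for the validity statistics in Table 6 (Bartlett's test of sphericity and the KMO measure) using the factor_analyzer package is shown below, again with a hypothetical item-level data frame.

```python
# KMO and Bartlett's test of sphericity (assumed item-level data frame).
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

items = pd.read_csv("creative_self_efficacy_items.csv")  # hypothetical file

chi_square, p_value = calculate_bartlett_sphericity(items)
kmo_per_item, kmo_total = calculate_kmo(items)
print(f"Bartlett: chi2={chi_square:.2f}, p={p_value:.4f}")
print(f"KMO (overall) = {kmo_total:.3f}")  # compare with the reported 0.676
```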

4.2.3. Non-Parametric Tests of Scales

Summarising the analysis using the Mann–Whitney test statistic: the cross-screen (01) and non-cross-screen (02) groups did not differ significantly on three items (1, "I feel that I am good at coming up with novel ideas"; 2, "I am confident in my creative problem-solving skills"; and 4, "I am good at discovering new ways to solve problems"), but differed significantly on one item (3, "I have a knack for further refining AI ideas"). That is, question 3, concerning subjects' ideas for supplementing and refining the AI's output, showed a significant difference between the two modes, as shown in Table 7.
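The sketch below shows how such a Mann–Whitney U test could be run for item 3 with SciPy; the data layout, file name, and group labels are illustrative assumptions rather than the authors' actual analysis script.

```python
# Mann-Whitney U test for one scale item between the two groups (assumed data).
import pandas as pd
from scipy.stats import mannwhitneyu

scores = pd.read_csv("scale_scores_long.csv")  # hypothetical file
item3 = scores[scores.item == "Q3_refine_AI_ideas"]

u, p = mannwhitneyu(
    item3.loc[item3.group == "cross_screen", "score"],
    item3.loc[item3.group == "non_cross_screen", "score"],
    alternative="two-sided",
)
print(f"U={u:.1f}, p={p:.4f}")
```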
By analysing the gaze duration, number of gaze points, pupil diameter, and task duration across the eye-tracking groupings, we found that cross-screen type and task type had significant effects on total interval duration, number of gaze points, mean gaze duration, and mean pupil diameter for occlusion decision-making versus occlusion hand-drawing. In particular, mean gaze duration and mean pupil diameter differed significantly across task types and screen types. These results deepen our understanding of user behaviour and cognitive load under different visual tasks and cross-screen conditions, and of the differential effects of occlusion and non-occlusion on visual attention and cognitive load in cross-screen and non-cross-screen conditions.
The effects of occlusion versus non-occlusion on mean gaze duration and pupil diameter were significant in the non-cross-screen condition but not in the cross-screen condition. This may indicate that tasks in non-cross-screen environments are more sensitive to visual processing, and suggests that AI involvement in hand-drawing behaviour during non-cross-screen collaborative drawing significantly affects designers' visual perception. Analysis of the creative efficacy scale data revealed significant differences in designers' responses regarding supplementing and refining AI ideas across the different modes, suggesting that the degree of AI involvement in the designers' hand-drawing creative process significantly affects how designers negotiate design ideas with the AI. The subjects' hand-drawn sketches and selected AI-generated images were also collected to facilitate subsequent in-depth research.

5. Discussion

5.1. Theoretical Implications

Processing images generated by AI, such as selecting, adjusting, or deciding which images to use, involves both "selection" and "decision-making". Selection is usually the process of choosing one or more of several options [48]; for example, the AI generates several images and the user needs to select the most appropriate one. This process focuses on evaluating and comparing the options and then picking the best one [49]. Decision-making is a more complex process that typically involves identifying and defining a problem, generating options for a solution, evaluating those options, and ultimately choosing a course of action. It is not just a comparison between options but also involves an understanding of the problem itself, goal setting, and evaluation of potential consequences. In the context of AI-generated images, users may need to work through such a process before settling on which image to use [50]. This process may include the following steps: 1. defining goals: determining the purpose or effect the image should achieve; 2. generating options: using the AI, or adjusting its parameters, to produce different image options; 3. evaluating options: assessing the applicability of each image against the goals and criteria; 4. selecting: choosing the image that best meets the goals; and 5. implementing the decision: using the selected image for the next steps, such as publishing [51].
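To make the distinction concrete, the following sketch models the goal–generate–evaluate–select loop described above as a small scoring routine; the criteria, weights, and ratings are purely hypothetical illustrations and not values measured in this study.

```python
# Minimal sketch of the goal -> generate -> evaluate -> select loop described above.
# The criteria, weights, and ratings are hypothetical illustrations only.
from dataclasses import dataclass

@dataclass
class Candidate:
    image_id: str
    aesthetics: float            # 0-1 rating of visual appeal
    novelty: float               # 0-1 rating of originality
    semantic_consistency: float  # 0-1 match with the stated drawing goal

def decide(candidates, weights=(0.4, 0.2, 0.4)):
    """Evaluate each generated option against the goal criteria and pick the best."""
    w_a, w_n, w_s = weights

    def score(c):
        return w_a * c.aesthetics + w_n * c.novelty + w_s * c.semantic_consistency

    return max(candidates, key=score)

# Usage: three AI-generated options for one drawing goal.
options = [
    Candidate("img_01", aesthetics=0.7, novelty=0.3, semantic_consistency=0.9),
    Candidate("img_02", aesthetics=0.9, novelty=0.8, semantic_consistency=0.4),
    Candidate("img_03", aesthetics=0.6, novelty=0.5, semantic_consistency=0.8),
]
print("Selected:", decide(options).image_id)
```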
Thus, the process combines both selection and decision-making: selection is a step within decision-making, while decision-making comprises the considerations and steps that precede selection. The two concepts are intertwined, especially in scenarios involving creative and visual content. The current study focuses on the subjective preference selection stage of novice users in the usage phase; subsequent studies can delve deeper into the decision-making aspect by refining the selection-behaviour process, separating its phases, and conducting fine-grained decision-making experiments for decision analysis [50,52]. With the emergence of AI artworks, investigating the dynamics of visual attention and drawing strategies will provide additional insights into artists' expertise. Crucially, eye movement strategies should be linked to behavioural advantages in local and global processing, grounded in an understanding of how they benefit perceptual decision-making [38]. AI has transformed research on artistic drawing creativity. Creative workers such as scientists and artists are increasingly using AI in their creative processes, and the concept of co-creation has emerged to describe the hybrid creativity of humans and AI [53]. The role of AI differs between the scientific and artistic creative processes: scientists need AI to produce accurate and trustworthy results, while artists use AI to explore and play [54]. Unlike scientists, some artists also see their cooperation with AI as co-creation [49]. Human–AI cooperation is developing toward co-creation, which can explain the contemporary creative process in the age of AI and should be the focus of future research on human–AI collaborative creativity [55,56,57].

5.2. Practical Implications

By recording eye movements, the visual processing of stimuli can be quantitatively captured, allowing conclusions to be drawn about the underlying cognitive processes and the strategies adopted by participants [58]. On this basis, the drawing process can be characterised and obstacles can be identified; the use of eye tracking also widens the observable field [59]. Studying the drawing process across collaborative modes and observation strategies illustrates the added value that eye-tracking analysis can provide, and its use in real-life operation demonstrates its applicability to investigating collaborative drawing, offering insights into cognitive processes such as action strategies. Eye tracking adds further value by recording progressive problem-solving through eye movements without interfering with task processing [21].
Because subjects differ in creativity, willingness to draw, degree of divergent thinking, equipment familiarity, and drawing skill, subjects with higher willingness [34], more divergent thinking, greater proficiency with the equipment, and stronger drawing skills spend more time drawing [16], thinking, and observing the AI during collaborative drawing. For this reason, decision time rather than drawing time was taken as the dependent variable.
There are differences in the bases of subjects' creative choices. Interviews and observation showed that 80–90% of subjects preferred to choose the generated image closest to what they intended to draw [60,61]; when the display screen was masked and separated from the hand-drawing process, subjects preferred the image closest to their own drawing. When they could observe the AI's output changing as they drew, some subjects modified their drawing based on the generated images, and a small portion chose an image because it simply looked better, or because it differed greatly from their own drawing and broke through their original fixed thinking [62]. These selection bases correspond to three sub-dimensions of creativity: aesthetics, novelty, and semantic consistency. Because the experiment had few trials and the drawing task was simple, the cognitive load of completing the task was small; subsequent experiments should increase the number of trials. The simplicity of the cue words also produced large differences in subjects' semantic understanding. In future setups, the operational task could be decomposed, the decision-making process separated from the overall drawing process, and the creative decision-making process experimentally simulated, allowing subjects to select images according to different assessment dimensions and to group and select image categories, styles, and complexities [55].

6. Conclusions

In this paper, supported by new AI real-time drawing and design software and building on the literature and previous theoretical research, an eye-tracking experiment combined with a creative efficacy scale was conducted, providing theoretical and methodological support for experiments on human–computer collaborative drawing. The experiments show that visual cognition and perception differ significantly across collaborative modes, and that creative efficacy differs to some extent between modes, giving future design practitioners and AI drawing enthusiasts different cooperation modes to consider. Based on these findings, we hope to help researchers choose more efficient human–AI collaborative drawing modes, improve the efficiency of cooperation and drawing, focus visual attention and improve the efficiency of visual search, and offer later researchers new perspectives and ideas for selecting efficient, high-quality cooperation modes and interaction forms.
The paper analyses human–AI real-time drawing collaboration using KREA.AI, eye tracking, and a creative efficacy scale. Two visual search methods and different collaboration scenarios serve as the variables in a mixed-design experiment: subjects draw with the AI interface while wearing an eye tracker for data collection and fill in the scale after each mode. Through data analysis, this combined subjective–objective method explores the impact of the different collaboration modes and drawing strategies on creative efficacy and visual perception, with the aim of understanding the collaborative process and improving human–AI interaction in drawing.

7. Future Research and Prospects

7.1. Further Research on Subjects

In future in-depth research, our methods can be further developed according to the age and gender of the participants. The younger group is imaginative and innovative in creativity and boldly tries different drawing methods and collaboration modes; they are sensitive to colour and form in visual perception, quickly accept AI-generated image styles, and focus on personalised expression and self-presentation. Young adults' strong ability to learn and adapt enables them to combine AI tools with professional knowledge to create professional and in-depth works; their creativity focuses on practicality and innovation, using AI to solve practical problems or create commercially valuable works.
The middle-aged group has a relatively low acceptance of new technology, but once they have mastered it they can draw on rich experience and professional knowledge to create works with deep connotations. Creatively, they pay attention to cultural value and historical significance and use AI to inherit and promote traditional culture; visually, they focus on the stability and atmosphere of the work, adjust images conservatively, and emphasise emotional expression and life experience. The elderly group generally has a low acceptance of new technology, although some keep an open attitude; with the help of others, they try AI tools to create commemorative works. Their creativity focuses on reminiscence value and emotional attachment; visually, they prefer simplicity and simple image adjustments, with additional attention to the family and social value of the work.
In terms of gender, male participants tend to value technical content and innovation in their creativity, experiment with complex drawing strategies and collaborative modes, focus visually on the power and impact of the work, make bold image adjustments, and attach importance to competition and achievement. Female participants tend to emphasise beauty and artistry, using soft colours and curves to create a warm and romantic atmosphere; visually, they attend to the details and delicacy of the work, fine-tune their images, and focus on sharing and social value.
In human–AI collaborative drawing, people of all ages and genders have the opportunity to express their creativity and create colourful works.

7.2. Implications for AI Developers

In terms of technical improvement, our work helps developers to understand user needs and habits by studying human–AI collaborative drawing, so as to optimise tool functionality and improve usability. Eye-tracking data can be used to improve interface design, for example by placing frequently used buttons in areas where users' gaze is concentrated. The creative efficacy and visual perception associated with different collaboration modes can also be analysed to develop more innovative collaboration modes.
The results of this research can support marketing, promote the benefits of the tool, and improve the product based on feedback. The impact of human–AI collaboration on creativity can also be explored to help formulate development strategies and increase R&D investment.
In the user experience aspect, we can optimise tool performance and stability based on user experience feedback. Finally, we can continuously improve features and services through interactive collaboration, such as opening forums to collect opinions.

Author Contributions

Conceptualization, Y.P.; methodology, Y.P.; software, Y.P.; validation, Y.P.; formal analysis, Y.P.; investigation, Y.P.; resources, Y.P.; data curation, Y.P.; writing—original draft preparation, Y.P.; writing—review and editing, Y.P. and L.W.; visualization, Y.P.; supervision, L.W.; project administration, L.W.; funding acquisition, C.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, project "Research on Human–Computer Interaction Interface Design Mechanism for Human–Computer Collaboration", grant No. 72271053.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank the experts and participants who took part in the experiments.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Creative efficacy scale design.
Number | Issue | Rating (1–7)
01 | 1. I think I am good at coming up with novel ideas | 1 2 3 4 5 6 7
02 | 2. I have confidence in my creative problem-solving skills | 1 2 3 4 5 6 7
03 | 3. I have tricks for further refining AI ideas | 1 2 3 4 5 6 7
04 | 4. I am good at finding new ways to solve problems | 1 2 3 4 5 6 7

References

  1. Lyu, Y.; Wang, X.; Lin, R.; Wu, J. Communication in Human–AI Co-Creation: Perceptual Analysis of Paintings Generated by Text-to-Image System. Appl. Sci. 2022, 12, 11312. [Google Scholar] [CrossRef]
  2. Zhao, Y. Artificial Intelligence-Based Interactive Art Design under Neural Network Vision Valve. J. Sens. 2022, 2022, 3628955. [Google Scholar] [CrossRef]
  3. Lv, Y. Artificial Intelligence-Generated Content in Intelligent Transportation Systems: Learning to Copy, Change, and Create! [Editor’s Column]. IEEE Intell. Transp. Syst. Mag. 2023, 15, 2–3. [Google Scholar] [CrossRef]
  4. AIGC: The Rise and Future Prospects of Artificial Intelligence Generated Content—Cloud Community—Huawei Cloud. Available online: https://bbs.huaweicloud.com/blogs/398164?utm_source=zhihu&utm_medium=bbs-ex&utm_campaign=other&utm_content=content (accessed on 20 January 2024).
  5. Hong, J.-W. Bias in Perception of Art Produced by Artificial Intelligence. In Human-Computer Interaction. Interaction in Context; Kurosu, M., Ed.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 10902, pp. 290–303. ISBN 978-3-319-91243-1. [Google Scholar]
  6. Wingström, R.; Hautala, J.; Lundman, R. Redefining Creativity in the Era of AI? Perspectives of Computer Scientists and New Media Artists. Creat. Res. J. 2022, 36, 177–193. [Google Scholar] [CrossRef]
  7. Sun, L.; Chen, P.; Xiang, W.; Chen, P.; Gao, W.; Zhang, K. SmartPaint: A Co-Creative Drawing System Based on Generative Adversarial Networks. Front. Inf. Technol. Electron. Eng. 2019, 20, 1644–1656. [Google Scholar] [CrossRef]
  8. Oksanen, A.; Cvetkovic, A.; Akin, N.; Latikka, R.; Bergdahl, J.; Chen, Y.; Savela, N. Artificial Intelligence in Fine Arts: A Systematic Review of Empirical Research. Comput. Hum. Behav. Artif. Hum. 2023, 1, 100004. [Google Scholar] [CrossRef]
  9. Spering, M. Eye Movements as a Window into Decision-Making. Annu. Rev. Vis. Sci. 2022, 8, 427–448. [Google Scholar] [CrossRef] [PubMed]
  10. Utz, V.; DiPaola, S. Using an AI Creativity System to Explore How Aesthetic Experiences Are Processed along the Brain’s Perceptual Neural Pathways. Cogn. Syst. Res. 2020, 59, 63–72. [Google Scholar] [CrossRef]
  11. Guo, Y.; Lin, S.; Acar, S.; Jin, S.; Xu, X.; Feng, Y.; Zeng, Y. Divergent Thinking and Evaluative Skill: A Meta-Analysis. J. Creat. Behav. 2022, 56, 432–448. [Google Scholar] [CrossRef]
  12. Déguernel, K.; Sturm, B.L.T. Bias in Favour or Against Computational Creativity: A Survey and Reflection on the Importance of Socio-Cultural Context in Its Evaluation. Available online: http://kth.diva-portal.org/smash/get/diva2:1757836/FULLTEXT01.pdf (accessed on 20 January 2024).
  13. Li, G.; Chu, R.; Tang, T. Creativity Self Assessments in Design Education: A Systematic Review. Think. Skills Creat. 2024, 52, 101494. [Google Scholar] [CrossRef]
  14. Tamm, T.; Hallikainen, P.; Tim, Y. Creative Analytics: Towards Data-Inspired Creative Decisions. Inf. Syst. J. 2022, 32, 729–753. [Google Scholar] [CrossRef]
  15. Wammes, J.D.; Roberts, B.R.T.; Fernandes, M.A. Task Preparation as a Mnemonic: The Benefits of Drawing (and Not Drawing). Psychon. Bull. Rev. 2018, 25, 2365–2372. [Google Scholar] [CrossRef]
  16. Xie, H.; Zhou, Z. Finger versus Pencil: An Eye Tracking Study of Learning by Drawing on Touchscreens. J. Comput. Assist. Learn. 2024, 40, 49–64. [Google Scholar] [CrossRef]
  17. Pei, Y.; Wang, L.; Xue, C. Research on the Efficiency and Cognition of the Combination of Front Color and Background Color and Color in the Interface of Express Cabinets on the Operation of Human Machine Interface Tasks. In Proceedings of the HCI International 2023—Late Breaking Papers, Copenhagen, Denmark, 23–28 July 2023; Kurosu, M., Hashizume, A., Marcus, A., Rosenzweig, E., Soares, M.M., Harris, D., Li, W.-C., Schmorrow, D.D., Fidopiastis, C.M., Rau, P.-L.P., Eds.; Springer Nature Switzerland: Cham, Switzerland, 2023; pp. 194–212. [Google Scholar]
  18. Alemdag, E.; Cagiltay, K. A Systematic Review of Eye Tracking Research on Multimedia Learning. Comput. Educ. 2018, 125, 413–428. [Google Scholar] [CrossRef]
  19. Chang, H.-W.; Bi, X.-D.; Kai, C. Blind Image Quality Assessment by Visual Neuron Matrix. IEEE Signal Process. Lett. 2021, 28, 1803–1807. [Google Scholar] [CrossRef]
  20. Karran, A.J.; Demazure, T.; Hudon, A.; Senecal, S.; Léger, P.-M. Designing for Confidence: The Impact of Visualizing Artificial Intelligence Decisions. Front. Neurosci. 2022, 16, 883385. [Google Scholar] [CrossRef]
  21. Braun, I.; Graulich, N. Die Zeichnung im Blick—Nutzung von Eye-Tracking zur Analyse der zeichnerischen Erschließung von Mesomerie-Aufgaben. Chemkon 2022, 29, 261–266. [Google Scholar] [CrossRef]
  22. Hahm, J.; Kim, K.K.; Park, S.-H. Cortical Correlates of Creative Thinking Assessed by the Figural Torrance Test of Creative Thinking. NeuroReport 2019, 30, 1289–1293. [Google Scholar] [CrossRef] [PubMed]
  23. Bandi, A.; Adapa, P.V.S.R.; Kuchi, Y.E.V.P.K. The Power of Generative AI: A Review of Requirements, Models, Input–Output Formats, Evaluation Metrics, and Challenges. Future Internet 2023, 15, 260. [Google Scholar] [CrossRef]
  24. Daviddi, S.; Orwig, W.; Palmiero, M.; Campolongo, P.; Schacter, D.L.; Santangelo, V. Individuals with Highly Superior Autobiographical Memory Do Not Show Enhanced Creative Thinking. Memory 2022, 30, 1148–1157. [Google Scholar] [CrossRef]
  25. Oldham, G.R.; Cummings, A. Employee Creativity: Personal and Contextual Factors at Work. Acad. Manag. J. 1996, 39, 607–634. [Google Scholar] [CrossRef]
  26. Lee, Y.K.; Park, Y.-H.; Hahn, S. A Portrait of Emotion: Empowering Self-Expression through AI-Generated Art. arXiv 2023, arXiv:2304.13324. [Google Scholar]
  27. Shi, Y.; Shang, M.; Qi, Z. Intelligent Layout Generation Based on Deep Generative Models: A Comprehensive Survey. Inf. Fusion 2023, 100, 101940. [Google Scholar] [CrossRef]
  28. Fan, M.; Yang, X.; Yu, T.T.; Liao, V.Q.; Zhao, J. Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization. arXiv 2021, arXiv:2112.12387. [Google Scholar] [CrossRef]
  29. Kuang, E.; Jahangirzadeh Soure, E.; Fan, M.; Zhao, J.; Shinohara, K. Collaboration with Conversational AI Assistants for UX Evaluation: Questions and How to Ask Them (Voice vs. Text). In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; pp. 1–15. [Google Scholar] [CrossRef]
  30. Sowa, K.; Przegalinska, A.; Ciechanowski, L. Cobots in Knowledge Work: Human—AI Collaboration in Managerial Professions. J. Bus. Res. 2021, 125, 135–142. [Google Scholar] [CrossRef]
  31. Lindebaum, D.; Vesa, M.; Den Hond, F. Insights From “The Machine Stops” to Better Understand Rational Assumptions in Algorithmic Decision Making and Its Implications for Organizations. Acad. Manag. Rev. 2020, 45, 247–263. [Google Scholar] [CrossRef]
  32. Ozmen Garibay, O.; Winslow, B.; Andolina, S.; Antona, M.; Bodenschatz, A.; Coursaris, C.; Falco, G.; Fiore, S.M.; Garibay, I.; Grieman, K.; et al. Six Human-Centered Artificial Intelligence Grand Challenges. Int. J. Hum.–Comput. Interact. 2023, 39, 391–437. [Google Scholar] [CrossRef]
  33. Abbass, H.A. Social Integration of Artificial Intelligence: Functions, Automation Allocation Logic and Human-Autonomy Trust. Cogn. Comput. 2019, 11, 159–171. [Google Scholar] [CrossRef]
  34. Zhou, Y.; Yang, P.; Xu, X.; Shao, B.; Feng, G.; Liu, J.; Luo, B. FMASketch: Freehand Mid-Air Sketching in AR. Int. J. Hum.–Comput. Interact. 2024, 40, 2142–2152. [Google Scholar] [CrossRef]
  35. Schindler, M.; Lilienthal, A.J. Students’ Collaborative Creative Process and Its Phases in Mathematics: An Explorative Study Using Dual Eye Tracking and Stimulated Recall Interviews. ZDM—Math. Educ. 2022, 54, 163–178. [Google Scholar] [CrossRef]
  36. Liu, Y.-L.E.; Lee, T.-P.; Huang, Y.-M. Exploring Students’ Continuance Intention Toward Digital Visual Collaborative Learning Technology in Design Thinking. Int. J. Hum.–Comput. Interact. 2023, 40, 2808–2821. [Google Scholar] [CrossRef]
  37. Xiang, W.; Sun, L.; Chen, S.; Yang, Z.; Liu, Z. The Role of Mental Models in Collaborative Sketching. Int. J. Technol. Des. Educ. 2015, 25, 121–136. [Google Scholar] [CrossRef]
  38. Park, S.; Wiliams, L.; Chamberlain, R. Global Saccadic Eye Movements Characterise Artists’ Visual Attention While Drawing. Empir. Stud. Arts 2022, 40, 228–244. [Google Scholar] [CrossRef]
  39. Yahyaie, L.; Ebrahimpour, R.; Koochari, A. Pupil Size Variations Reveal Information About Hierarchical Decision-Making Processes. Cogn. Comput. 2024, 16, 1049–1060. [Google Scholar] [CrossRef]
  40. Peng, N.; Xue, C. Experimental study on icon search characteristics based on feature inference. J. Southeast Univ. 2017, 47, 703–709. [Google Scholar]
  41. Beelders, T. Visual Search Patterns for Multilingual Word Search Puzzles, a Pilot Study. J. Eye Mov. Res. 2023, 16, 6. [Google Scholar] [CrossRef]
  42. Lin, C.J.; Prasetyo, Y.T.; Widyaningrum, R. Eye Movement Parameters for Performance Evaluation in Projection-Based Stereoscopic Display. J. Eye Mov. Res. 2018, 11, 3. [Google Scholar] [CrossRef]
  43. Eldar, E.; Felso, V.; Cohen, J.D.; Niv, Y. A Pupillary Index of Susceptibility to Decision Biases. Nat. Hum. Behav. 2021, 5, 653–662. [Google Scholar] [CrossRef]
  44. Wu, Z. Research on Color Emotion Evaluation of AI Images from the Perspective of Sensory Engineering. Master’s Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2022. [Google Scholar]
  45. Chen, X.; Tang, X.; Zhao, Y.; Huang, T.; Qian, R.; Zhang, J.; Chen, W.; Wang, X. Evaluating Visual Consistency of Icon Usage in Across-Devices. Int. J. Hum.–Comput. Interact. 2023, 40, 2415–2431. [Google Scholar] [CrossRef]
  46. Wu, X.; Chen, Y.; Li, J. Study on Error-Cognition Mechanism of Task Interface in Complex Information System. In Advances in Safety Management and Human Factors; Arezes, P., Ed.; Advances in Intelligent Systems and Computing; Springer International Publishing: Cham, Switzerland, 2018; Volume 604, pp. 497–506. ISBN 978-3-319-60524-1. [Google Scholar]
  47. Hershman, R.; Henik, A.; Cohen, N. CHAP: Open-Source Software for Processing and Analyzing Pupillometry Data. Behav. Res. Methods 2019, 51, 1059–1074. [Google Scholar] [CrossRef]
  48. Yadav, S.; Chakraborty, P.; Mittal, P. Designing Drawing Apps for Children: Artistic and Technological Factors. Int. J. Hum.–Comput. Interact. 2022, 38, 103–117. [Google Scholar] [CrossRef]
  49. Jin, S.V.; Youn, S. Social Presence and Imagery Processing as Predictors of Chatbot Continuance Intention in Human-AI-Interaction. Int. J. Hum.–Comput. Interact. 2023, 39, 1874–1886. [Google Scholar] [CrossRef]
  50. Zhu, M.; Bao, D.; Yu, Y.; Shen, D.; Yi, M. Differences in Thinking Flexibility between Novices and Experts Based on Eye Tracking. PLoS ONE 2022, 17, e0269363. [Google Scholar] [CrossRef]
  51. Awano, N.; Hayashi, Y. Object Categorization Capability of Psychological Potential Field in Perceptual Assessment Using Line-Drawing Images. J. Imaging 2022, 8, 90. [Google Scholar] [CrossRef]
  52. Cheng, S.; Dey, A.K. I See, You Design: User Interface Intelligent Design System with Eye Tracking and Interactive Genetic Algorithm. CCF Trans. Pervasive Comput. Interact. 2019, 1, 224–236. [Google Scholar] [CrossRef]
  53. Le Guillou, M.; Prévot, L.; Berberian, B. Bringing Together Ergonomic Concepts and Cognitive Mechanisms for Human—AI Agents Cooperation. Int. J. Hum.–Comput. Interact. 2023, 39, 1827–1840. [Google Scholar] [CrossRef]
  54. Lee, S.; Lee, M.; Lee, S. What If Artificial Intelligence Become Completely Ambient in Our Daily Lives? Exploring Future Human-AI Interaction through High Fidelity Illustrations. Int. J. Hum.–Comput. Interact. 2023, 39, 1371–1389. [Google Scholar] [CrossRef]
  55. Guo, X.; Qian, Y.; Li, L.; Asano, A. Assessment Model for Perceived Visual Complexity of Painting Images. Knowl.-Based Syst. 2018, 159, 110–119. [Google Scholar] [CrossRef]
  56. Karimi, P.; Rezwana, J.; Siddiqui, S.; Maher, M.L.; Dehbozorgi, N. Creative Sketching Partner: An Analysis of Human-AI Co-Creativity. In Proceedings of the 25th International Conference on Intelligent User Interfaces, Cagliari, Italy, 17 March 2020; ACM: New York, NY, USA; pp. 221–230. [Google Scholar]
  57. Jang, W.; Chang, Y.; Kim, B.; Lee, Y.J.; Kim, S.-C. Influence of Personal Innovativeness and Different Sequences of Data Presentation on Evaluations of Explainable Artificial Intelligence. Int. J. Hum.–Comput. Interact. 2023, 40, 4215–4226. [Google Scholar] [CrossRef]
  58. Bozkir, E.; Özdel, S.; Wang, M.; David-John, B.; Gao, H.; Butler, K.; Jain, E.; Kasneci, E. Eye-Tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges. Available online: https://arxiv.org/abs/2305.14080v1 (accessed on 1 May 2024).
  59. Chang, Y.; He, C.; Zhao, Y.; Lu, T.; Gu, N. A High-Frame-Rate Eye-Tracking Framework for Mobile Devices. In Proceedings of the ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1445–1449. [Google Scholar]
  60. Stein, I.; Jossberger, H.; Gruber, H. Investigating Visual Expertise in Sculpture: A Methodological Approach Using Eye Tracking. J. Eye Mov. Res. 2022, 15, 5. [Google Scholar] [CrossRef]
  61. Turkmen, R.; Pfeuffer, K.; Machuca, M.D.B.; Batmaz, A.U.; Gellersen, H. Exploring Discrete Drawing Guides to Assist Users in Accurate Mid-Air Sketching in VR. In Proceedings of the Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, CHI 2022, New Orleans, LA, USA, 29 April–5 May 2022; Assoc Computing Machinery: New York, NY, USA, 2022; p. LBW67. [Google Scholar]
  62. Humphreys, P.; Jones, G. The Decision Hedgehog for Creative Decision Making. Inf. Syst. E-Bus. Manag. 2008, 6, 117–136. [Google Scholar] [CrossRef]
Figure 1. KREA.AI user interface (the software is available at https://www.krea.ai, accessed on 6 March 2024).
Figure 2. Human–AI collaborative mapping mode level map.
Figure 3. Design process: human–computer function allocation and cooperation model (self-drawn by the authors).
Figure 4. The flow of experimental tasks.
Figure 5. Scene diagram of experimental recording of field subjects.
Figure 6. Histograms of task time and eye movement data (task time, number of gaze points, gaze duration, and mean pupil size).
Figure 7. Boxplots of the number of whole fixations for the four task types in non-cross-screen situations. (Red represents Mode 1, competing or working separately; dark green represents Mode 2, supplementing each other; peacock blue represents Mode 3, competing or working separately; and purple represents Mode 4, supplementing each other. The number of whole fixations is high in Modes 1 and 3 and low in Modes 2 and 4.)
Figure 8. Boxplots of the average duration of whole fixations for the four task types in non-cross-screen situations. (Colour coding as in Figure 7. The data for Modes 1 and 3 are mostly concentrated in a very small range, whereas those for Modes 2 and 4 are higher and more scattered, with significant differences between them.)
Figure 9. Boxplots of the average whole-fixation pupil diameter for the four task types in non-cross-screen situations. (Colour coding as in Figure 7. The median is higher for Modes 2 and 4 and lower for Modes 1 and 3.)
Figure 10. Boxplots of the number of saccades for the four task types in non-cross-screen situations. (Colour coding as in Figure 7. The median is low for Modes 2 and 4 and high for Modes 1 and 3.)
Figure 11. Hotspot maps of the four models.
Table 1. Classification of cross-device cooperation scenarios.
Cross-Device | Cross-Screen | Non-Cross-Screen
Hand–eye coordination | Low level | High degree
Visual switching | High degree | Low level
Table 2. Classification of cross-device observation strategies.
Observation Strategies | First Strategy (Look after Draw) | Second Strategy (Act as Go-between)
AI engagement | Not involved in conceptualisation | Involved in conceptualisation
Visual switching | Low level | High degree
Table 3. Collaborative model task scenario map.
Rank | L1 | L2 | L3 | L4
Collaborative models | Competing or working separately | Supplementing each other | Competing or working separately | Supplementing each other
Human–AI | (image) | (image) | (image) | (image)
Mode scene | (image) | (image) | (image) | (image)
Setting | Cross-screen mask | Non-cross-screen mask | Cross-screen, non-mask | Non-cross-screen, non-mask
Table 4. Experimental equipment.
Tobii Glasses 2 | Dell computer monitor | Lenovo laptop | MatePad tablet | Wacom touchpad
(image) | (image) | (image) | (image) | (image)
Table 5. Cronbach's reliability analysis.
Issue | Corrected Item-Total Correlation (CITC) | Cronbach's α if Item Deleted
1. I think I am good at coming up with novel ideas | 0.653 | 0.451
2. I am confident in my creative problem-solving skills | 0.452 | 0.539
3. I have tricks to further add to the AI ideas | 0.176 | 0.794
4. I am good at finding new ways to solve problems | 0.564 | 0.475
Overall Cronbach's α: 0.635; standardised Cronbach's α: 0.701.
Table 6. KMO and Bartlett's test.
KMO | 0.676
Bartlett's test of sphericity | Approximate chi-square | 28.888
 | df | 6
 | p | 0.000
The KMO value evaluates the partial correlations between variables and ranges from 0 to 1.
Table 7. Non-parametric test analysis results.
Item | Cross-screen 01 (n = 30), Median (P25, P75) | Non-cross-screen 02 (n = 28), Median (P25, P75) | Mann–Whitney U value | z value | p
1. I think I am good at coming up with novel ideas | 5.000 (5.0, 5.0) | 5.000 (4.3, 6.0) | 361.000 | −1.043 | 0.297
2. I am confident in my creative problem-solving skills | 5.000 (4.0, 5.3) | 5.000 (4.3, 6.0) | 339.500 | −1.318 | 0.187
3. I have tricks to further add to the AI ideas | 5.000 (4.0, 5.0) | 5.000 (5.0, 6.0) | 294.500 | −2.041 | 0.041
4. I am good at finding new ways to solve problems | 5.000 (4.8, 5.0) | 5.000 (4.3, 5.8) | 404.000 | −0.272 | 0.785
M denotes the median and (P25, P75) the quartiles; the Mann–Whitney U test is a non-parametric test for comparing whether the distributions of two independent samples differ significantly.
