#### *1.1. Gesture Recognition*

The field of gesture recognition has attracted broad interest, with potential applications ranging from playing games to medical treatment. Researchers have utilized various
devices to conduct studies in this area. In terms of gesture data collection, common methods include data gloves [1], Kinect video capture devices [2], Leap Motion capture devices [3], the device's first-person camera in AR/VR environments [4,5], and heterogeneous sensors whose combined data improve the recognition rate [6]. In terms of experiment types, image segmentation [4] and image classification [2] are common, as is research on non-contact tactile feedback in AR/VR environments [5]. In [7], a cross-label recognition system is proposed, and in [6] gesture recognition is promoted by improving large-scale intelligent data editing processes. There is also an identification method that measures the distances and angles between the fingers [1], and arm gestures were studied in [8]. Gesture recognition has many applications, such as gestures that interact with animation in shadow puppet shows [6] or with television [9]. In the area of medicine, doctors can use gestures to safely interact with computers to control images without needing to touch an operating room screen [10]. Gestures are also suitable for navigating and manipulating large amounts of data on high-resolution wall displays [11]. In the driving field, by exploiting the space in front of an in-car screen, in-car touchscreen interaction can be expanded with the careful application of a target expansion strategy, making interaction with in-car systems more convenient [12]. Gesture recognition draws on several technologies. In [1], the recognition rate is improved through a deep learning-based gesture spotting algorithm. In [4], a gesture recognition deep neural network was proposed that recognizes ego hand gestures from videos containing a single gesture by generating and recognizing embeddings of ego hands from image sequences of varying lengths. A novel deep neural network designed in [7] embeds gestures in a high-dimensional Euclidean space and tackles the spatial resolution limits imposed by RF hardware and the specular reflection effect of RF signals. In [2], a support vector machine (SVM) classifier is used to classify the data. In [5], ultrasonic haptic technology is used to develop and integrate mid-air haptics, which require no worn or held equipment, into a virtual reality game experience. This paper presents a set of spatial partitioning strategies as guidelines for designers that can improve the types of technologies described above.

#### *1.2. Interaction Based on a Spatial Region*

Gesture interaction has developed rapidly as one of the important research areas of human-computer interaction. However, a review of the existing literature shows that researchers have been more concerned with new interactive technologies built on interaction channels such as large screens, cameras, and sensors. These studies have made great contributions to improving the efficiency of human-computer interaction. Human activities are closely linked to space, so researchers must also pay attention to users' spatial controllability. Interaction techniques based on a spatial region array are novel and promising and have a wide range of applications. To achieve multi-layer interaction, a novel multi-layered gesture recognition method using Kinect was proposed that explores the essential linguistic characteristics of gestures and achieves relatively high performance [13]. Multi-layer interaction techniques divide the interaction space into multiple interaction layers; each layer has a special function, and users can access different commands by entering different layers. The overall interaction height and the minimum layer thicknesses for vertical and horizontal search tasks were experimentally explored in [14]. In [15], three target selection techniques were developed for air pointing: small angular ray-casting movements, large movements in a 2D plane, and movements in a 3D volume. Although these techniques were designed systematically to use from one to three dimensions, they were presented without common space-partitioning strategies. Many researchers have designed techniques based on spatial regions, but they have not focused on the division of space [16,17]. Some researchers have tried to divide space using angles [18–22]. However, there is a lack of basic research on common spatial partitioning. The purpose of this paper is to explore common operational spatial partitioning in the user interface.

#### *1.3. Interaction of Visually Impaired Individuals*

There is a need for computer interactions that can also be used by visually impaired individuals, and meeting this need has attracted many researchers. A framework was proposed for exploring the differences between the spatial sense abilities of visually impaired and sighted persons in three longitudinal models [23]. An exploration of the effect of spatial ability on a visually impaired person's sense of position within web pages showed that users can obtain an accurate overview of a web page with audio feedback when using a touchscreen [24]. By connecting the use of touch sensation with other multimedia design elements, it was found that touch sensation plays a critical role in improving application design for people with visual impairments [25]. Although there is a lack of systematic study on the common operational spatial region array for visually impaired individuals, gesture-free interaction by the status of thumb (GIST), a wearable gestural interface that uses a depth camera to capture users' hand gestures, can help visually impaired individuals perform everyday tasks [26]. There are also techniques based on the two-dimensional structure of a keyboard surface that explore different methods of non-visual interaction [27]. To enable blind users to read text, an affordable mobile application was proposed that converts text to speech using a text-to-speech (TTS) framework [28]. Immersive virtual reality (VR) that provides a realistic walking experience for the visually impaired is proposed in [29]: a novel immersive interaction using a walking aid, i.e., a white cane, is designed to enable users with visual impairments to perform ground recognition and inference processes realistically.

In summary, interactive technology based on spatial gestures has been integrated into people's daily lives, for both sighted and visually impaired users. Therefore, further research on interaction technology based on spatial gestures is beneficial for improving the interaction efficiency between users and computers in daily life.

#### **2. Materials and Methods**

The extensive research mentioned above has focused on design techniques. This paper, by contrast, focuses on developing a set of guidelines based on spatial partitioning strategies. To assess users' spatial controllability, we attempt to reveal the common operational region used when executing spatial gestures. We therefore investigated input modalities based on a spatial region array for hand-gesture interfaces, conducted a systematic study of human performance when selecting targets with a spatial region array, and developed two interaction techniques and four spatial partitioning strategies as design guidelines for human-computer interaction designers.

A Leap Motion M010 controller, a computer (including a keyboard and a display screen), and an experimental model built with Unity 3D in C# were used in the experiment. The Leap Motion device can detect the hand's position within a field of view from 25° to 165°, symmetrical about the device's center. The experimental program was designed in Visual Studio 2019 and the Unity 3D environment and ran on a PC with a 3.60 GHz AMD Ryzen R5-3600 CPU running Windows 10 Professional. The display resolution was set to 1000 × 800 pixels in the pilot studies and 1920 × 1080 pixels in Experiments 1 and 2.

To improve users' spatial controllability, we first focused on the height and width of a rectangle (in front of and parallel to the screen) representing the average range of hand movements when a user sits at a desk. We determined the common operation area through a pilot study implemented with Leap Motion and Unity 3D, as shown in Figure 1a. The Leap Motion system can detect and track hands, fingers, and finger-like tools. Its visual range is an inverted pyramid with its apex at the center of the device, as shown in Figure 1b. Leap Motion adopts a right-handed Cartesian coordinate system, and the returned values are in real-world millimeters. The origin is at the center of the Leap Motion controller; the *x*-axis and *z*-axis lie on the horizontal plane of the device, with the *x*-axis parallel to the long side of the device and the *z*-axis parallel to the short side, while the *y*-axis points vertically upward, as shown in Figure 1c. Leap Motion provides a stream of frame updates, and each frame contains a list of basic tracking data. When a hand is detected, it is assigned a unique ID, which it keeps for as long as the hand remains tracked, and the Leap Motion software reports motion factors for each frame based on the motion of the hand. Through the hand object, the current position of the hand can be obtained. Unity 3D is a tool for creating interactive applications; it provides a graphical development environment and can deploy projects to multiple platforms such as Windows. Unity's world coordinates are consistent with Leap Motion's, so we can accurately locate hand motion in Unity's world coordinates.

**Figure 1.** Schematic figure of experimental process and equipment: (**a**) experimental process, including the pilot study and Experiments 1 and 2; (**b**) the detectable spatial area of Leap Motion; (**c**) coordinate system of Leap Motion.
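As a rough illustration of how this per-frame tracking data can be read, the sketch below polls the most recent frame and prints each hand's palm position. It assumes the classic Leap Motion C# API (class and property names such as `Controller`, `Frame`, and `PalmPosition` vary slightly between SDK versions) and is not the authors' experimental code.

```csharp
using Leap;

// Minimal sketch (not the authors' code): poll the latest tracking
// frame and read each hand's palm position. Positions are returned in
// real-world millimeters relative to the device center, in the
// right-handed coordinate system shown in Figure 1c.
class HandPositionReader
{
    private readonly Controller controller = new Controller();

    public void PrintHands()
    {
        Frame frame = controller.Frame(); // most recent frame of tracking data
        foreach (Hand hand in frame.Hands)
        {
            // Each detected hand keeps a unique Id while it stays tracked.
            Vector palm = hand.PalmPosition;
            System.Console.WriteLine(
                "Hand {0}: x={1:F1} mm, y={2:F1} mm, z={3:F1} mm",
                hand.Id, palm.x, palm.y, palm.z);
        }
    }
}
```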

We imported a toolkit that supports Leap Motion gesture development in Unity. The toolkit contains prefabricated hands, related gesture action scripts, and case demonstrations, all of which help developers complete Leap Motion development work. The next step was to build the experimental development platform and add the "LeapHandController" prefab to the created scene. By observing whether the hand shown on the interface was within the capture range of the camera, we adjusted the hand controller to a suitable position and size. The scale parameter was set to 1 to make the virtual hand the same size as the real hand, so that moving the real hand in real time controlled the movement of the virtual hand, which is convenient for the user interaction operations described later. We imported the "Vectrosity" plug-in to meet the interface drawing requirements of our experiment. The plug-in was used to edit the experimental interface and achieve dynamic feedback (for example, green represented a random target, the target turned red when selected, and yellow indicated the movement trajectory of the hand). To collect experimental data, we recorded the acquired data in an Excel file saved to the local disk. The program logic was implemented in C# and included methods for drawing rectangles, drawing the UI, setting the timer, deleting rectangles, randomly generating non-repeated layers, setting up the data table, and writing data to the Excel file.
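To make two of these helpers concrete, the sketch below shows one plausible implementation of generating a non-repeating random layer order (a Fisher-Yates shuffle) and appending one trial record per line to a comma-separated data file; the method names, record fields, and CSV layout are our illustrative assumptions rather than the paper's actual code.

```csharp
using System;
using System.IO;

// Illustrative sketch of two of the helper methods described above:
// producing a random, non-repeating order of layer indices for the
// trials, and appending one trial record per line to a CSV file that
// Excel can open. Names and the record format are assumptions.
static class TrialHelpers
{
    private static readonly System.Random rng = new System.Random();

    // Fisher-Yates shuffle: each layer index 0..layerCount-1 appears
    // exactly once, in random order, so no layer repeats within a block.
    public static int[] RandomNonRepeatedLayers(int layerCount)
    {
        int[] order = new int[layerCount];
        for (int i = 0; i < layerCount; i++) order[i] = i;
        for (int i = layerCount - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (order[i], order[j]) = (order[j], order[i]);
        }
        return order;
    }

    // Append one trial record (participant, target layer, selection
    // time in ms, success flag) to the local data file.
    public static void WriteTrial(string path, int participant,
                                  int targetLayer, double timeMs, bool success)
    {
        string line = string.Format("{0},{1},{2:F1},{3}",
            participant, targetLayer, timeMs, success ? 1 : 0);
        File.AppendAllText(path, line + Environment.NewLine);
    }
}
```

Calling `RandomNonRepeatedLayers(n)` once per block would guarantee that each of the *n* layers is targeted exactly once before any repeats.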

#### *2.1. Pilot Study*

The pilot study focused on designing, conducting, and analyzing users' performance on a spatial region array. Many possible factors impact the interaction between users and the Leap Motion controller, for example, the size of the spatial region array, the sensitivity of the Leap Motion device, whether the task is visual or non-visual, and whether users perform the task with the left or right hand. For the study's manageability and validity, we restricted our investigation to a situation in which users sat in front of the Leap Motion device, centering it between the computer screen and the user's body, as Figure 1a shows.

#### 2.1.1. Participants

Twelve students (two female, ten male) participated in the user study. Their ages ranged from 22 to 30 years (M = 25, SD = 2.08). The average body height was 168.17 cm (SD = 8.96). All were daily computer users.
