1. Introduction
Neurological disorders are major causes of disability worldwide, affecting a large number of people each year [
1,
2]. The total burden for healthcare systems due to neurological disorders is increasing with the growing, ageing population [
3]. The common symptoms of neurological conditions include seizures, reduced sensation, pain, deteriorated consciousness and cognition, muscle weakness, and diminished coordination, among others [
1]. Although these deficits affect the independent function of the limbs, recent evidence shows that functional dependencies between the eye and hand may also be impaired, reflecting a lack of coordination in limb motor control [
4,
5,
6].
Good coordination between the sensory visual system and the musculoskeletal system is required for efficient and effective human function during interaction with the nearby environment [
7]. Poor coordination can result in the deterioration of performance in activities of daily living (ADL) in patients with neurological disorders, requiring specific rehabilitation treatments focused on increasing motor control. For this purpose, it is essential to quantify coordination in such patients in order to obtain the information needed to understand the contributing factors and shed light on the development of suitable rehabilitation [
8].
The assessment of upper-limb motor impairments in neurorehabilitation is usually performed using clinical scales and tests to obtain objective and standardised outcome measures [
9]. There is a large diversity of clinical tests available, each focused on assessing specific impairments or functional limitations caused by neurological deficits [
10]. Examples of these tests include the Fugl–Meyer Assessment (FMA) of Motor Recovery after Stroke, the Wolf Motor Function Test (WMFT), the Nine-Hole Peg Test (NHPT), the Action Research Arm Test (ARAT), and the Box and Blocks Test (BBT). The assessment of motor coordination in most of these tests is based on the performance level during point-to-point movements (e.g., finger–nose, hand–ear, reaching tasks, etc.). However, coordinated movements are multidimensional and require the coordination of multiple subsystems, including eye–hand coordination [
11].
Eye–hand coordination is the ability of our vision system to coordinate information received through the eyes to control, guide, and direct the hands in the accomplishment of a task [
12,
13]. In this context, the scientific literature describes several systems aimed at measuring the level of motor coordination using different technologies such as computer vision, sensorisation of objects, or robotic devices. For example, the study by Otten et al. [
14] presented a framework that uses low-cost sensors to collect data from patient movements and automate most of the FMA, including coordination-related items. Another instrumented system was presented in [
15], which aims to automate the ARAT. The performance of point-to-point reaching tasks is measured through the sensorisation of a 7.5 cm cube. In [
16], the automation of the BBT was presented using a Microsoft Kinect camera (Redmond, WA, USA) to count the cubes transported and track hand movements to measure the motor coordination level in cube transfers. Moreover, Pernalete et al. [
17] proposed metrics for eye–hand coordination by mapping a robotic haptic device to a virtual environment during drawing tasks and correlating this with eye-gaze data using an eye tracker.
Furthermore, serious games and virtual reality are increasingly used in healthcare because they promote physical activity in patients through specifically designed gameplay in virtual scenarios [
18]. This approach offers various advantages over traditional methods, such as adaptability to patient needs, friendly and motivating interaction, and automatic acquisition of performance-based information, showing effectiveness in improving motor functionality in patients with neurological disorders [
19,
20]. Various studies have used gaming technology to quantify upper-limb motor coordination. For example, Oña et al. [
21] automated the Fugl–Meyer Upper-Extremity (FMUE) test using a virtual reality-based platform to evaluate motor coordination via endpoint trajectories. The results were highly correlated with the standard FMUE. Virtual versions of the BBT are also available using sensors such as the Microsoft Kinect V1 [
22] or the Leap Motion Controller (LMC) [
23,
24] to detect the hand and its movements during virtual cube transfers. Gagnon et al. [
25] presented a virtual version of the NHPT that involves manipulating an instrumented handle in order to move nine pegs into nine holes displayed in a virtual environment. Movement coordination is measured by the number of zero-crossings of the hand acceleration vector. Overall, these virtual versions of motor coordination assessment methods present a high correlation with the real systems, showing a promising use of this technology for upper-extremity and eye–hand coordination applications.
Different sensors and technologies have been used to measure motor coordination, allowing the capture of different parameters related to motor performance (position, velocity, pressure, etc.). This highlights the importance of making multiple devices available to clinicians to extract as many parameters as possible during evaluation. Thus, multimodality can enhance the effectiveness of serious games through multimodal sensory integration (devices to capture human factors) [
26]. Traditional input/output devices include the mouse, keyboard, and touchpad. There are also advanced devices such as eye trackers, hand trackers, haptics, motion trackers, gesture controllers, and EEG and EMG devices to name a few [
27]. Proper multimodal integration in serious games is still challenging in neurological rehabilitation since a one-size-fits-all system cannot address all patients’ needs.
On this basis, this paper presents a system to measure eye–hand coordination ability using a virtual scenario and multimodal devices. In a previous work [
28], the development of the virtual environment was briefly presented, and the main goals of the proposed framework were established. Also, two input devices with small workspaces, the mouse and the tablet pen, were included (related to fine motor skills). The present article extends the description of the virtual environment, proposing generalisable tools to develop similar serious games. Additionally, the main novelty of this paper is the integration of a robotic arm as an input device, which promotes a higher demand for interjoint coordination when performing the drawing task due to the larger workspace (related to gross motor skills). To the best of our knowledge, there is no game-based system available that addresses the assessment of both fine and gross motor skills. Finally, a case study with healthy participants was conducted to analyse the feasibility of multimodal systems for capturing hand trajectories and the viability of using these data to compare the user’s performance when executing drawing-like tasks with input devices that demand different interjoint coordination.
The remainder of this article is organised as follows.
Section 2 presents a summary of related works. In
Section 3, the system architecture is presented.
Section 4 describes the design process and the main features of the multimodal game-based tool.
Section 5 presents the eye–hand coordination game and the integration of multimodal devices, particularly robotic ones. Subsequently, the results of the preliminary trial are presented in
Section 6. Finally, the discussion and conclusions can be found in
Section 7.
2. Background
Eye–hand coordination (EHC) is a complex process that requires the precise activation of the ocular and manual motor systems [
29]. Typically, gold standard clinical tests include EHC assessment sections involving tasks such as the placement of objects in holes, reaching, and grasping small objects. Motor coordination is rated through spatial and/or temporal variables [
30] (e.g., completion time, smoothness, and other performance-based indicators). However, these methods may be affected by subjective factors such as inter-operator variability (experimenter bias) and the use of gross hand/arm movements, which require not only EHC but also arm control. Therefore, there are several technological systems available that address the assessment of EHC and motor skills.
One early method for EHC assessment was the ‘Moore Eye–Hand Coordination and Colour-Matching Test’, which measured the time taken to place marbles in holes and match coloured marbles with colour-coded holes as quickly as possible [
31]. A similar modern assessment method is that proposed by Gagnon et al. [
25], which involves an instrumented version of the NHPT. This system involves manipulating an instrumented handle to place nine pegs into nine holes displayed in a 2D virtual environment. Movement coordination is measured by the number of zero-crossings in the hand acceleration vector. Similarly, a vision-based approach was proposed by Balaceanu et al. [
32] to automatically score hand coordination based on the NHPT. This system extracts multiple kinematic parameters (period, velocity, time) to objectively rate motor coordination using machine learning techniques.
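The zero-crossing smoothness measure described above can be sketched as follows (an illustration of the general technique, not the implementation used in the cited systems): acceleration is derived from a sampled one-dimensional position trace by finite differences, and each sign change counts as one crossing, so fewer crossings indicate a smoother, better-coordinated movement.

```python
# Illustrative sketch: counting zero-crossings in the acceleration signal
# derived from a sampled 1-D position trace (finite differences).

def acceleration_zero_crossings(positions, dt):
    """Count sign changes in the acceleration derived from a position trace."""
    # First and second finite differences: velocity, then acceleration.
    velocity = [(positions[i + 1] - positions[i]) / dt
                for i in range(len(positions) - 1)]
    accel = [(velocity[i + 1] - velocity[i]) / dt
             for i in range(len(velocity) - 1)]
    crossings = 0
    for a0, a1 in zip(accel, accel[1:]):
        if a0 * a1 < 0:  # strict sign change between consecutive samples
            crossings += 1
    return crossings
```

A uniformly accelerating (smooth) trace yields zero crossings, while a stop-and-go trace yields one crossing per hesitation.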
Other systems have been developed that use computer-based interaction to estimate EHC. For example, in a previous work by the authors of the present paper, the use of a tablet pen system was proposed to capture hand trajectories in drawing-like tasks [
28]. However, this system was not evaluated. Lee et al. developed and tested a tablet-based system that involves tracing up to 13 shapes with a stylus pen [
7]. Each shape corresponds to a different difficulty level based on the complexity and length of the outline. This system can automatically store the time taken to complete each trace and the spatial accuracy of the tracing. Their results suggest that this method is suitable for the assessment of visuomotor capacity in patients after stroke [
33]. Although the above-mentioned systems reduce subjectivity in the evaluation, they remain limited to the assessment of fine motor skills due to the small size of the NHPT device or of the tablet screen.
In this sense, other methods have increased the workspace range, allowing gross motor coordination to be measured through tasks performed with larger movements. Svendsen et al. [
34] presented a camera-based system that gathers information on the player’s motor performance with respect to the visual stimuli presented in a game. The system uses two cameras to estimate the arm pose, and based on its reaching response to random visual stimuli, eye–hand coordination is assessed. Similarly, Irawan et al. [
35] designed a sensor-based system to assess eye–hand coordination in tasks involving reaching. The system consists of a steel frame, a series of touch sensors, an LCD, and a carpet for the user to stand on. EHC is evaluated based on the execution time and the number of sensors the user has successfully turned off by touching. It should be noted that reaching tasks demand higher interjoint coordination than writing because they involve larger muscle masses [
36,
37].
Although most of the previous solutions utilise game-like interfaces, a different approach is the use of fully VR-based systems that implement virtual versions of the gold standard clinical tests. For example, the virtual BBT [
22,
24] measures hand coordination when grasping and moving small virtual cubes, while virtualised versions of the NHPT [
38,
39] require small virtual pegs to be grasped and arranged. The major advantage of these systems is that hand movements are detected during interaction in VR using cameras [
38] or the LMC [
39], facilitating natural interaction with the hands and avoiding the need for wearable sensors to measure coordination. However, the lack of haptic feedback may affect motor performance. A recent study compared eye–hand coordination from a previously validated real-world object interaction task with the same task recreated in controller-mediated VR [
40]. Their findings suggest that current VR technology does not replicate real-world experiences in terms of EHC.
Since EHC may be enhanced by hand contact with objects, more sophisticated methods of EHC assessment involve the use of haptic feedback. Arsenault et al. [
41] presented a tapping-task system with force feedback. A head tracker is used to estimate eye position, and a Phantom force feedback device allows the user to touch the targets. Similarly, Pernalete et al. [
17] proposed metrics for eye–hand coordination, using a mapping of a robotic haptic device to a virtual environment, and correlating it with the eye gaze using an eye tracker. These studies found that simulating contact with the Phantom device improved performance. However, the reduced workspace of the Phantom limits the interjoint demand to finger–wrist–elbow coordination (fine motor skills). Furthermore, Pernalete et al. found similar performance when the arm was constrained to reduce shoulder movement, suggesting that fine motor skills are independent of whether the arm is constrained [
17].
Finally, there are technical solutions that aim to improve the rehabilitation process of EHC. For example, the LMC was used as an input device for a self-adaptive game aiming to improve eye–hand coordination and provide short-term training [
42]. The game requires a player to pop specific balloons appearing on a screen, aiming to prevent them from flying. In a study by Shen et al. [
43], a novel approach for the rehabilitation of EHC and finger dexterity was developed incorporating Augmented Reality (AR) technology. This application promotes interaction with virtual piano keys while the user wears a glove to detect finger motion.
On this basis, eye–hand coordination was considered in this study because it is essential for performing ADL autonomously. Neurological diseases significantly affect eye–hand coordination, reducing the autonomy with which a large number of people can perform ADL. The proper evaluation of how neurological deficits affect the relationship between eye control and limb control is a fundamental gap in the knowledge base and an essential component for defining adequate rehabilitation strategies. Similar to other cases of functional assessment automation, gaming technology can contribute to developing useful tools for both assessment and rehabilitation. Hence, the authors hypothesise that the use of multimodal devices and serious games can help to measure fine and gross motor coordination based on strategies that demand different levels of interjoint coordination.
3. System Architecture
Drawing is a basic activity that requires the eyes and fingers to work together to accomplish a task. Hence, this paper proposes the use of a virtual environment that promotes drawing tasks to measure the degree of fine and gross motor coordination using multimodal devices. The architecture of the proposed framework is illustrated in
Figure 1, which includes the assessment mode and the workspace according to the device used.
The first multimodal device group focuses on the assessment of eye–hand coordination using a virtual scenario that shows a labyrinth-like path with a ball that can be moved along it. The user is encouraged to move the ball from the starting to the final point of the labyrinth. The motion of the ball can be controlled by a mouse (desktop mode), a stylus pen (tablet mode), or a small robotic device. The ball trajectory is automatically stored, and by comparing it with the reference trajectory, the eye–hand coordination level can be estimated. Note that the use of the small robotic device provides the user with haptic feedback during the drawing tasks in a small workspace, which obtained positive results in a previous study [
17]. However, an extended assessment involving the coordination of hand, elbow, and shoulder joints was not studied.
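One simple way to realise this trajectory comparison (a sketch of the general idea, not the system’s actual metric) is to score the recorded ball trajectory by its mean distance to the reference trajectory:

```python
# Illustrative sketch: scoring a recorded trajectory by its mean distance
# to the reference (centre-line) trajectory. Trajectories are (x, y) tuples.

import math

def mean_deviation(recorded, reference):
    """Mean distance from each recorded point to the closest reference point."""
    total = 0.0
    for rx, ry in recorded:
        total += min(math.hypot(rx - cx, ry - cy) for cx, cy in reference)
    return total / len(recorded)
```

Lower values indicate that the drawn trace stays closer to the ideal path, i.e., better eye–hand coordination under this simple measure.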
The second multimodal device group aims to measure gross motor coordination using the same virtual environment but promoting large movements that involve the synergistic motion of the arm. For this purpose, the use of a robotic arm is considered in this paper. The robotic arm allows for a large workspace and, depending on the control techniques, different strategies can be implemented, including assistance, resistance, or other force-based behaviours.
4. Game Development Methodology
The Unity 2021.3.1f1 game engine was employed in this project for the development of the eye–hand coordination game. Overall, the implementation of the virtual environments must fulfil the following specifications:
Intuitive game–user interface: The graphical interface must allow the user to launch and navigate through the serious game scenes intuitively.
User feedback: Similar to classical video games, this serious game must allow the user to earn points when certain goals or missions are achieved. Thus, this reward system will engage the user, encouraging improvements in scores and, consequently, motor performance.
Customisation options: The serious game must allow for adjustments to fit the game mechanics to the characteristics of users. This feature is essential in neurorehabilitation systems since patient needs can vary widely.
Game dynamics: The game session dynamics should be intuitive and straightforward, focusing on completing drawing tasks. Users will be able to perform freehand movements, but the path design will impose certain conditions, detecting when the user goes outside the path limits.
Clinical outcomes: In addition to displaying user results, it is crucial to obtain therapeutic information for clinical supervision, which provides additional indicators regarding the quality of exercise performance. These indicators include time spent, performance points, failures, and hand trajectories. Based on these data, the proposed system aims to contribute to developing a high-resolution metric for measuring eye–hand coordination.
Automatic data storage: All the parameters obtained in each session must be stored automatically in the patient’s record. The stored file should be in a standard text format to facilitate data management. Thus, the physician can supervise therapy progress and perform a post-analysis of the data if needed.
Screen responsiveness: Since the serious game can be executed in different environments (home or hospital), the graphical interface must not become distorted. Therefore, a self-adaptive screen resolution functionality is required.
Multimodal connectivity: The system must allow the user to interact with the serious game through different peripheral devices. This means that the serious game must implement a transparent functionality for detecting different input devices to command game actions, such as moving the ball through the labyrinths while performing drawing-like tasks.
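The automatic data storage requirement above can be sketched as follows (a minimal illustration; the field names and the JSON-lines layout are our assumptions, not the system’s actual record format):

```python
# Minimal sketch of the automatic data storage requirement: each session is
# appended to the patient's record as one JSON line, a standard text format
# that clinicians can easily post-process. Field names are hypothetical.

import json

def session_record(patient, device, time_s, points, failures, trajectory):
    """Build a serialisable summary of one game session."""
    return {
        "patient": patient,        # anonymised patient identifier
        "device": device,          # e.g. "mouse", "stylus", "phantom", "iiwa"
        "time_s": time_s,          # completion time
        "points": points,          # coins gathered
        "failures": failures,      # times the ball left the path
        "trajectory": trajectory,  # sampled (x, y) hand positions
    }

def store_session(path, record):
    """Append one session as a JSON line to the patient's record file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Appending one line per session keeps the file human-readable while still allowing straightforward post-analysis of therapy progress.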
The main elements in the game are (1) the virtual paths and (2) a ball that represents the player. Therefore, this study proposes two principal approaches that can be generalised to other projects: (1) the method for creating the paths and (2) the multimodal strategy. On one hand, a method for creating interactive paths was needed, which should be customisable (size, width, appearance) and capable of detecting different events during the drawing task. On the other hand, the ball representing the player must be controlled using a multimodal approach, with the game capable of detecting the connected device and allowing the user to choose which one to use.
4.1. Path Creation in Virtual Environment
The ProBuilder v5.2 package [
44] was used to develop the paths for this project. ProBuilder allows for building simple geometry with detailed editing capabilities. This tool can be installed from the Package Manager tab by searching for ProBuilder in the browser. Once installed, the ProBuilder window can be accessed from the Tools tab.
The path creation process is described as follows. Firstly, an image of the path is imported into the background scene as a template for the layout (
Figure 2a). Then, an editable object is created using the ProBuilder tool by indicating specific points on the template and using the “New Poly Shape” function (
Figure 2b). This function creates a virtual plane-shaped object from the union of points (
Figure 2c). From the Inspector panel, an extrusion operation can be applied to the plane-shaped figure to generate a 3D object.
The creation of paths using the ProBuilder tool allows for the inclusion of “Mesh Colliders” by default. Mesh Colliders are components that use imported mesh data, which Unity uses for environment collision (a collision occurs when the physics engine detects that the colliders of two GameObjects make contact or overlap). In our case, the mesh data are the 3D labyrinths, and the Colliders are built based on the labyrinth meshes. In this way, collisions with the labyrinth walls and surfaces are detected.
The GameObjects that represent the labyrinths have the following components: the main labyrinth, circles as start (green) and end (red) points, the central line, and lateral boundaries (see
Figure 3). The main labyrinth object represents the working area for drawing. The green and red circles denote the start and end of the maze route, respectively. The central line object represents the ideal trajectory to complete the maze. Finally, the lateral boundaries are GameObjects that detect when the player goes outside the main route. Optionally, the Collider for these boundaries can be enabled to prevent the player from going outside the main path, helping the user.
Both the main path and the centre line were created using the “New Poly Shape” function. The path boundaries were created by drawing the silhouette of the path, first on one side and then on the other, with a small width. In this way, two figures were obtained that will later be used to detect when the user goes outside the path. For creating the start and end circles, the “New Shape Tool” function offered by ProBuilder was used, which allows for the creation of predefined shapes. In this case, two cylinders were created, with the desired radius and height, resulting in the start and end cylinders of the path. Note that this section presents a general method for building paths in virtual environments designed with clinicians; this methodology can be generalised to other paths and applications.
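To illustrate what the lateral-boundary objects detect, the geometric test can be sketched as follows (our simplification, assuming the path is modelled by its centre line and a constant width; the actual game relies on Unity Mesh Colliders rather than this explicit computation):

```python
# Illustrative sketch: the ball is outside the path when its centre is
# farther from the centre-line polyline than half the path width.

import math

def dist_point_segment(p, a, b):
    """Distance from point p to segment a-b (all 2-D tuples)."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def outside_path(ball, centre_line, path_width):
    """True when the ball centre leaves the corridor around the centre line."""
    d = min(dist_point_segment(ball, a, b)
            for a, b in zip(centre_line, centre_line[1:]))
    return d > path_width / 2.0
```

This formulation also generalises to any new path a clinician designs, since only the centre-line points and width need to change.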
4.2. Multimodal Device Connectivity
One of the goals of this project is to develop a game-based multimodal framework that allows commanding game actions using different input devices. In this case, the idea is to control the ball motion using an off-the-shelf device, ranging from the simplest to the most complex. This approach is illustrated in
Figure 4. Since we are using planar labyrinths, the input devices must allow the user to perform hand drawing-like movements on a plane. Various input devices have been integrated into this approach: a mouse, a stylus pen, the Phantom Omni (Haption, Grenoble, France), and the KUKA LBR IIWA (KUKA Robotics, Augsburg, Germany) collaborative robot. The use of these devices can be analysed from the following perspectives:
These devices enable capturing the hand trajectory (position), which is transferred to the virtual environment in order to move the ball through the labyrinths. However, due to the inherent differences among devices, the position-tracking method is distinct for each one.
In the case of the mouse, a script based on the “Event System” component of Unity was implemented. This module sends events to objects in the video game based on input, be it keyboard, mouse, touch, or a custom source. Thus, the ball GameObject can be displaced when an event occurs: the user places the cursor on the ball and drags it by left-clicking. The stylus pen and touchscreen use the same script; however, in this case, no clicking is needed.
In the case of the haptic device, the Phantom Omni calculates and sends the position and rotation of its end-effector to Unity using the OpenHaptics v3.5 library [
45]. This plugin allows one to control the position of a sphere in the Unity scene and, according to the design restrictions, to provide a realistic haptic experience when drawing. This sphere contains a HapticObject component that is attached to the ball in the game.
Finally, the Robot Operating System (ROS) is used to enable the robotic arm to communicate with the Unity environment. On one hand, the IIWA robot publishes the position and orientation of its tool centre point (TCP) on the PosRotMsg topic. On the other hand, the Unity scene can send and receive ROS messages using the ROS-TCP-Connector library. Thus, the game scene contains a subscriber to the PosRotMsg topic that, upon receiving a message, executes a callback function that links the received position to the ball.
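The callback wiring itself lives in Unity (C#) via the ROS-TCP-Connector; the following Python sketch only illustrates the mapping such a callback must perform, namely projecting the robot TCP position onto the drawing plane and scaling it into game units. The scale and offset values, the message layout, and the function names are our assumptions for illustration.

```python
# Illustrative sketch of a PosRotMsg callback: project the TCP position
# onto the drawing plane and scale it into game units.

GAME_SCALE = 10.0          # game units per metre (assumed value)
PLANE_ORIGIN = (0.4, 0.0)  # TCP (x, y) mapped to the game origin (assumed)

def tcp_to_ball(tcp_xyz):
    """Map a TCP position (metres, robot frame) to 2-D ball coordinates."""
    x, y, _z = tcp_xyz  # the planar drawing task ignores the TCP height
    return ((x - PLANE_ORIGIN[0]) * GAME_SCALE,
            (y - PLANE_ORIGIN[1]) * GAME_SCALE)

def on_pos_rot_msg(msg, ball):
    """Callback body: update the ball with the latest TCP position."""
    ball["pos"] = tcp_to_ball(msg["position"])
```

In the actual system, the subscriber receives the message from the PosRotMsg topic and applies the resulting coordinates to the ball GameObject each frame.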
5. The Eye–Hand Coordination Game
The game is made up of several scenes; however, there are two main virtual environments: the first one focuses on the evaluation, and the second one focuses on the improvement of eye–hand coordination. Both environments are integrated into a module in which the physiotherapist, or the patient at home, follows instructions through scenes to initialise the video game. The first scene is a welcome environment with the game name (
Figure 5a). The second scene is an interface that presents the two modes available in the application (
Figure 5b).
The scenes described below appear once the user has chosen the mode, starting with the registration scene, in which the physician must enter the patient’s information, such as name, age, pathology, and affected arm. With this basic information, different sessions for the same patient or between patients can be distinguished. Note that this functionality is oriented to help clinicians in patient data management for the use of serious games in healthcare facilities.
5.1. Evaluation Mode
In the evaluation mode, eye–hand coordination is estimated based on different parameters derived from the performance of drawing tasks. For this purpose, the gameplay promotes the use of paths (labyrinths) where a ball must be driven from the start point to the final point. This gameplay is simple, but it requires good eye–hand coordination to carry the ball from the green circle (start point) to the red circle (final point) while minimising the expended time and avoiding going outside the central path.
Figure 6a presents the virtual scene for a path. The green circle indicates the point from which the ball begins to move; the red circle is the target point. In the middle of the path, there is a white line that indicates the “ideal” trajectory (equal distance from the side walls). When the ball collides with the side walls, a prompt message (cloud-like) appears to notify the user to correct the trajectory. Additionally, several “coins” are distributed along the ideal trajectory. Each reached coin gives a point. Finally, the game time and score are displayed in a panel in the upper-left corner. The scores are determined by the coins gathered by the user.
Figure 6b presents the full set of virtual environments implemented in this game based on the system described in [
17]. The patterns were designed by therapists at a handwriting clinic and categorised into three main classes: Straight-to-Curvy, Complex, and Width designs. Complex patterns are longer and more demanding to follow. Straight-to-Curvy patterns involve transitions from straight lines to curvy lines. Finally, width patterns increase their level of difficulty by making the path narrower.
In addition to the different path designs, the assessment scene allows configuring each path according to the patient’s characteristics.
Figure 7 presents the settings screens for the evaluation mode. The main options are as follows:
Normal options:
- Size: Using a slider, this option allows the user to scale the path size proportionally.
- Game mode: This option allows choosing between simple and awarded (coins) paths.
Advanced options:
- Ball size: Three ball sizes are available to change the accuracy level.
- Path tolerance: This option controls the threshold that activates the audio and visual warnings related to the tolerance for the ball going outside the path. Three levels are available: low (a message appears when the ball goes outside the path), medium (a message appears when half the ball is outside the path), and high (a message appears when the ball touches the walls).
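The three tolerance levels can be formalised as follows (a sketch; the signed-distance formulation is our own, not taken from the game code):

```python
# Sketch of the three path-tolerance levels. `centre_to_wall` is the signed
# distance from the ball centre to the nearest wall: positive inside the
# path, negative outside.

def warn(level, centre_to_wall, ball_radius):
    """Return True when the audio/visual warning should fire."""
    if level == "high":    # warn as soon as the ball touches the wall
        return centre_to_wall <= ball_radius
    if level == "medium":  # warn when half the ball is outside the path
        return centre_to_wall <= 0.0
    if level == "low":     # warn only when the whole ball is outside
        return centre_to_wall <= -ball_radius
    raise ValueError(f"unknown tolerance level: {level}")
```

Note how the three thresholds differ only by one ball radius, so the ball-size option and the tolerance option interact: a larger ball triggers the "high" warning earlier.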
Evaluation Mode with Robotic Devices
The video game is designed to run in desktop or tablet mode. That is, the mouse or stylus pen can be used as an input device to control the ball motion. Due to the nature of these input devices, feedback other than sound or visual prompts cannot be provided. However, in the evaluation mode with assistance, the goal is to use a device that provides haptic feedback to the user, rather than one that merely captures the hand position, as the mouse and stylus pen do. According to the results in [
17], the haptic effect helped make the traces more accurate, faster, and smoother.
On this basis, the idea of haptic feedback using a robotic system has been pursued in the present work. Thus, two robotic systems are integrated into the eye–hand coordination game: (1) the Phantom Omni device and (2) the LBR IIWA collaborative robot.
Phantom Omni
Figure 8 illustrates the assistance mode using the Phantom Omni robot. As previously mentioned, this device transmits its end-effector pose to the virtual environment via the OpenHaptics library. In the virtual scene, there is a virtual sphere (green circle with dashed line) assigned to the haptic device’s end-effector, which is attached to the player ball (blue circle). Inside the path, the player ball and the virtual sphere (corresponding to the robot’s TCP) are aligned, so no haptic effect is active.
When the user moves the robot’s TCP outside the path (
Figure 8a), the system applies haptic feedback that tries to align the robot’s TCP (point A) with the player ball (point B), similar to spring behaviour.
Figure 8b details the haptic effect. Note that the player ball cannot go through the path boundaries, and the collision point (ball boundary) is detected automatically. The attraction force exerted by the Phantom Omni towards the collision point increases as the user moves the robot’s TCP away from the path. Thus, the haptic effect increases proportionally to the distance d between the position of the robot’s TCP and the position of the player ball, with an impedance k value adjustable in the game setup scene.
An additional design consideration is that the user cannot move the pointer upwards or downwards out of the track plane: those movements are resisted by the maximum force exerted by the device, creating a virtual constraint that keeps the pointer of the Phantom in the desired drawing plane.
This method has been termed a haptic tunnel since the previously described behaviour is enabled along the complete labyrinth route, and the virtual elements are three-dimensional.
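The spring-like behaviour of the haptic tunnel can be sketched as follows (an illustration; the real effect is rendered through OpenHaptics, and the saturation at the device’s maximum force is our assumption about how such a spring is typically capped):

```python
# Illustrative sketch of the spring-like haptic effect: a force that pulls
# the robot TCP (point A) back towards the player ball (point B), with
# magnitude k*d saturated at the device's maximum force.

import math

def haptic_force(tcp, ball, k, f_max):
    """Spring force (2-D) applied at the TCP, pointing towards the ball."""
    dx, dy = ball[0] - tcp[0], ball[1] - tcp[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)  # TCP aligned with the ball: no haptic effect
    magnitude = min(k * d, f_max)  # proportional to d, capped by the device
    return (magnitude * dx / d, magnitude * dy / d)
```

Here k plays the role of the impedance value adjustable in the game setup scene: larger k values make the tunnel stiffer and the corrective pull stronger.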
Collaborative Robot
The second robotic device integrated into this project is the LBR IIWA collaborative robotic arm. Collaborative robots, or cobots, are a category of robots designed to work alongside humans rather than in secured and isolated environments [
46]. The chosen collaborative robot has a maximum reach of 800 mm, a payload capacity of 14 kg, and seven degrees of freedom (DoFs). It also has integrated sensitive torque sensors in all seven axes. These features enable the implementation of advanced force-controlled operations, contact detection capabilities, and programmable compliance.
As with the Phantom Omni, this robotic device is intended to support drawing tasks with haptic assistance, but within a much larger workspace. This distinction matters because fine motor skills may not be linked with upper-arm functionality. In a previous study [
17], the results indicated that fine motor skills are independent of constraints in the upper arm. This suggests that using a small drawing workspace can measure fine motor skills, but gross motor skills (those requiring higher interjoint coordination) could be overlooked. Therefore, the goal of this assessment mode is to promote the execution of drawing tasks that involve the synergistic motion of the upper limb using the robot’s larger workspace. The user must hold the robot’s TCP to drive the robot’s end-effector through the virtual path, as shown in
Figure 9.
For the development of this application, we implemented impedance control to govern the compliant interaction between the robotic arm and the patient. Impedance control is a method of controlling the robot’s end-effector that, rather than tracking position alone, regulates the dynamic relationship between force and position [
47]. It allows the dynamic behaviour of the interaction between the robot’s end-effector and the environment to be defined. The desired interaction behaviour follows a dynamic impedance model, usually a decoupled mass-spring-damper system, which may be linear or nonlinear. This type of control can be defined in the joint space, where it behaves like elasticity in the robot’s joints, or in the space of the end-effector, in which case it is called Cartesian impedance. On this basis, we implemented Cartesian force control, which produces the joint torques required to realise the appropriate force at the end-effector and thereby accommodate the interaction forces with the user.
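The mass-spring-damper behaviour described above can be written compactly. The following is a generic sketch of Cartesian impedance control; the symbols are illustrative and are not taken from the controller implementation:

```latex
% Desired error dynamics (x_d: desired pose, \tilde{x} = x - x_d)
\Lambda\,\ddot{\tilde{x}} + D\,\dot{\tilde{x}} + K\,\tilde{x} = F_{\mathrm{ext}}
% Joint torques realising the Cartesian behaviour (J: Jacobian, g(q): gravity term)
\tau = J^{\top}\!\left(-K\,\tilde{x} - D\,\dot{\tilde{x}}\right) + g(q)
```

Here Λ, D, and K are the desired inertia, damping, and stiffness matrices; a high stiffness normal to the virtual plane produces the stiff constraint, while a low stiffness inside the path yields compliant, free motion.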
For this application, the impedance control has two components: one restrictive component and one assistive component, which depend on the position of the robot’s TCP. To control the IIWA robot, the FRI (Fast Robot Interface) protocol is used [
48]. This is a real-time communication protocol for the online control of the positions and torques of the IIWA robot.
On one hand, the force control restricts the robot’s motion to the
virtual plane (XY plane) by applying a force with high impedance in the Z-direction (see
Figure 9b). Similarly, the orientation of the robot’s TCP is also locked to reduce the difficulty of guiding the robot. Note that a custom-made hand-held tool is used to facilitate robot mobilisation.
On the other hand, when the position of the robot’s end-effector is on the plane, the impedance control aims to assist the user in staying inside the path. This behaviour is similar to the one described for the haptic tunnel in the Phantom Omni system. This means that when the user drives the robot’s end-effector inside the path, no forces are applied, and the robot operates with tight position control and gravity compensation. However, when the user moves the robot’s TCP outside the path, interaction forces are applied along the normal to the path boundaries. The force imitates a damped spring that draws the robot’s TCP toward the contact point at the path’s boundary. The stiffness of the force control in gravity compensation mode is customisable to control the level of assistance.
The IIWA robot is also compatible with the Robot Operating System (ROS), allowing for reading information about the robot’s status. Thus, in order to establish communication between the robot and the Unity environment, the Unity scene integrates the ROS-TCP-Connector package for sending/receiving messages from the ROS. The game scene contains a subscriber to a topic with the robot’s pose that updates the position of the player ball based on the received data. It should be noted that, unlike the Phantom Omni, the received position corresponds to the robot’s coordinate system, which does not match the Unity reference system (see
Figure 9b). Therefore, it was necessary to implement a transformation method to adapt the position received from the robot to the virtual workspace in the Unity environment. Note that for this evaluation mode, the virtual path can be displayed in desktop mode or using a VR headset for a fully immersive experience.
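As an illustration of such a transformation (a hypothetical Python helper; in practice, the conversion is performed on the Unity side, e.g., via the ROSGeometry utilities of the ROS-TCP-Connector), a ROS position in the right-handed, Z-up convention can be mapped to Unity’s left-handed, Y-up convention as follows:

```python
def ros_to_unity(p):
    """Map a ROS position (x forward, y left, z up; right-handed)
    to Unity coordinates (x right, y up, z forward; left-handed)."""
    x_ros, y_ros, z_ros = p
    return (-y_ros, z_ros, x_ros)
```

Any fixed offset and scaling between the robot base frame and the virtual workspace would be applied on top of this axis remapping.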
7. Discussion
Currently, there is evidence showing that patients with neurological disorders, such as stroke, experience a lack of coordination in that the normal spatial and temporal relationships between eye and hand movements during tasks requiring coordination are impaired, despite intact visual function [
4]. These eye–hand relationships may need to be specifically targeted during the assessment stage and subsequently during rehabilitation. Hence, the development of tools to help clinicians in both stages is especially relevant.
The most relevant features of state-of-the-art systems for EHC assessment include drawing-like tasks (related to fine motor coordination) [
7,
33], reaching tasks (related to gross motor coordination) [
34,
35], game-based user interfaces [
31,
32], and haptic feedback [
25,
41], which are usually addressed separately. In contrast, the multimodal game-based system proposed in this paper integrates all of these features. Multimodality makes it possible to use different input devices to explore new EHC assessment strategies with the same tool, namely the eye–hand coordination game. In a previous conference article by the authors of this paper [
28], the development of the virtual environment was briefly presented and the main goals of the proposed framework were established. In addition, two input devices with small workspaces, the mouse and the tablet pen, were included (both related to fine motor skills). However, the multimodal framework was neither technically validated nor extended to methods for quantifying gross motor coordination. The current paper adds two robotic devices and presents trials to validate the proposed framework.
Firstly, the evaluation of fine motor coordination was addressed using a methodology similar to that in related studies, specifically the use of drawing tasks [
7] that can include haptic assistance [
17,
41]. The analysis of endpoint trajectories from drawing tasks has shown positive results in EHC assessment [
17,
33]. In this way, the assessment mode proposed in this paper involved point-to-point drawing tasks in a game-like virtual scenario. The user can draw according to different difficulty levels based on the characteristics of paths designed by expert clinicians. One relevant feature of the proposed system is its multimodality, which currently includes devices such as a mouse, a tablet pen, the Phantom Omni, and a robotic arm. The inclusion of multimodal devices can provide more performance-based information for a more detailed evaluation [
26]. For example, the tablet pen system may provide useful information about the pressure and orientation exerted when drawing. The robotic systems can offer haptic feedback and hand force detection. This performance-based information can be properly combined with the position tracking of traces in order to improve the diagnosis of motor skill problems.
Moreover, similar to our previous conference paper [
28], this paper also presents a novel method aimed at extending the assessment capabilities from fine to gross coordination by integrating a collaborative robotic arm into the game-based system. This new device promotes the use of whole-arm movements to move the ball through the labyrinths. Motor skills describe the ability to control and coordinate movements. This can include fine motor control (e.g., small movements of the fingers and hands) and gross motor control (for example, large and coordinated movements of the arms, trunk, and legs) [
53]. Gross motor skills require the coordination of large body parts for actions such as sitting, running, jumping, and throwing, while fine motor skills require the coordination of smaller movements between the fingers, hands, and feet for actions such as grasping, manipulating small objects, writing, and using scissors [
37]. In the scope of this paper, fine motor skills can be evaluated using small devices such as a mouse, a tablet pen, or the Phantom Omni since they promote drawing tasks similar to writing. Conversely, gross motor skills can be evaluated using the larger workspace provided by the robotic arm, since in this layout, the drawing task is more similar to painting a wall, which requires the coordination of larger muscle masses. Note that if the patient interacts with the robot while standing, the drawing execution would require the coordination of more body joints. Other studies have also tried to increase the workspace range to allow measuring gross motor coordination by promoting tasks in larger workspaces [
34,
35]. However, advances in force-control techniques in robotics open new opportunities in the assessment and rehabilitation of EHC due to the programming versatility of robotic devices.
In order to validate the feasibility of the proposed multimodal framework, particularly the robot-based system, preliminary experiments were carried out with healthy volunteers in the laboratory. On one hand, the results show that the proposed system stores the position trajectory of the player ball in the video game at a frequency sufficient to reproduce the underlying hand movements. Since this position information is directly related to the patients’ hand movements, it can be used to analyse motion quality. The analysis of endpoint trajectories is a common approach employed in similar studies [
7,
17,
32]. On the other hand, using a larger robotic device does not affect the expected gameplay, since all the participants had no problems performing the drawing tasks. For this experiment, the robot was operating in gravity compensation mode, reducing the resistance to motion. However, more force control-based strategies can be explored using the robotic arm, including different resistance levels to make it harder or easier for the patient to move the end-effector, even incorporating the assistance-as-needed approach. These preliminary results are useful in preparing for pilot trials with patients with UL motor problems.
Furthermore, this paper presents a method for trajectory comparison using the DTW algorithm, showing that it is possible to detect differences between right- and left-handed participants, or dominant versus non-dominant hands. The poorer performance, marked by higher DTW values for traces drawn with the non-dominant hand, suggests that this multimodal system is feasible for measuring motor skills. In addition, the DTW values show that performance differs between drawing tasks involving small and large movements.
Figure 11 shows that higher DTW values were obtained for the robot-based system, clearly distinguishing between fine and gross motor performances. These results agree with the findings of Bondi et al. [
54], who investigated the relationship between fine and gross motor coordination tasks through a network analysis linking graphomotor tasks (using tablet PC tracing), gross coordination (using items from the Körperkoordinationstest für Kinder), and strength (measured by handgrip) in school children. Their results suggest that specific parameters of tracing tasks, as well as other graphomotor tasks, should be taken into account when evaluating developmental trajectories and dysfunctions.
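For reference, the core of the DTW computation used for such trajectory comparisons can be sketched as follows. This is an illustrative Python implementation with a Euclidean point cost; the paper’s actual parameters, step pattern, and normalisation may differ:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 2-D traces
    (arrays of shape (n, 2) and (m, 2)) with Euclidean point cost."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # accumulated-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # best of insertion, deletion, and match steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A participant’s trace can be compared against the reference path of the labyrinth: a perfectly traced path yields a distance near zero, while deviations (e.g., from the non-dominant hand) accumulate a larger DTW value.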
As a final remark, most of the methods for assessing motor coordination are based on performance levels during point-to-point movements (endpoint trajectory). However, these endpoint trajectories can be achieved through an infinite number of joint configurations (degrees of freedom [DoFs]), providing abundant movement patterns to accomplish the same motor task [
55]. The “principle of abundance” implies that the numerous DoFs allow the system to adapt movements to changing environmental conditions, which often occur during the performance of ADL [
36]. Therefore, measuring interjoint coordination in relation to motor abundance and task-specific variability is important for understanding reaching deficits [
36]. This highlights the need to develop more accurate assessment systems that can identify different levels of motor coordination. It is particularly relevant in neurological rehabilitation, where motor abundance is reduced, which leads to patients finding other solutions for task accomplishment, including compensatory movements [
36]. The multimodal framework presented in this paper appears feasible for evaluating different motor performances and distinguishing between fine and gross motor coordination using endpoint trajectories. However, the full potential of this framework is yet to be explored, requiring further research and pilot studies.
7.1. Perspectives on the Proposed Multimodal Framework
The above highlights that, in the holistic context of rehabilitation, it is important to not only exercise motor function but also to assess the effects of treatments [
9]. This assessment–rehabilitation dichotomy should be considered when designing processes or tools intended for deployment in healthcare. Thus, the expected benefits of the proposed framework aim to serve a dual purpose: the assessment and rehabilitation of eye–hand coordination. Previous work advocated for the development of a framework for the assessment and rehabilitation of UE motor function using serious games [
21], with promising results [
18]. The proposed multimodal framework allows for the exploration of novel strategies for the assessment and rehabilitation of fine and gross coordination, as summarised in
Table 3.
Regarding the evaluation mode, the effect of proprioception on motor skill performance could be estimated by choosing the proper setup. The drawing tasks can be performed with or without the hand aligned with the gaze direction. This feature of the proposed multimodal framework can extend the assessment options for clinicians, allowing them to detect additional functional problems that are not solely related to motor capabilities.
The rehabilitation mode is currently under development and involves a moving path (scrolling top-down or side-to-side) with a ball in the middle. The user’s goal is to keep the ball centred in the path by controlling it with a robotic device. This method explores eye–hand coordination in a task requiring the interception of a moving target [
56]. The idea is to coordinate the hand movement according to dynamic stimuli with varying difficulty levels. Patients with neurological disorders often retain eye control (in most cases), but limb function is impaired. Thus, improving limb function may result in better eye–hand coordination. This approach requires further research and trials for consolidation.
Another issue to explore in future work is the effect of the immersive experience on motor skill performance. Factors related to 3D depth perception, such as viewing angle, stereo viewing technology, and visual cues, significantly affect movement performance [
57]. The eye–hand coordination game can be executed in a VR headset to increase the perception level. The robotic arm could then be used as an input to control the path drawing (during assessment mode) or the ball motion (during rehabilitation mode). Also, an eye-gaze tracking method can be included in this framework, since recent advances in imaging sensors and computer vision have enabled video-based gaze detection [
58], and modern VR headsets often include it [
59]. The combination of multimodal sensory information can help improve the assessment metrics, enabling patients and therapists to select an appropriate course of robotic therapy.
7.2. Limitations
This paper is limited to the development and validation of a game-based multimodal framework for the assessment of upper-limb motor coordination. Additionally, the validation experiments were conducted with healthy volunteers in the laboratory, which means the obtained results cannot be directly generalised to patients with neurological impairments. Therefore, future work will involve conducting pilot trials with patients with motor coordination dysfunctions. These pilot trials should include a control group and an appropriate sample size to enable statistical analysis. Further research and large-scale randomised controlled trials are essential before such novel rehabilitation techniques can be incorporated into clinical practice.