Review

A Survey of Augmented Reality for Human–Robot Collaboration

Department of Computer Science, University of Colorado Boulder, Boulder, CO 80309, USA
* Author to whom correspondence should be addressed.
Machines 2024, 12(8), 540; https://doi.org/10.3390/machines12080540
Submission received: 17 June 2024 / Revised: 19 July 2024 / Accepted: 2 August 2024 / Published: 7 August 2024
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)

Abstract

For nearly three decades, researchers have explored the use of augmented reality for facilitating collaboration between humans and robots. In this survey paper, we review the prominent, relevant literature published since 2008, when the last similar review article was published. We begin with a look at the various forms of the augmented reality (AR) technology itself, as utilized for human–robot collaboration (HRC). We then highlight specific application areas of AR for HRC, as well as the main technological contributions of the literature. Next, we present commonly used methods of evaluation with suggestions for implementation. We end with a look towards future research directions for this burgeoning field. This review serves as a primer and comprehensive reference for those whose work involves the combination of augmented reality with any kind of human–robot collaboration.

1. Introduction

Augmented reality (AR) has been explored as a tool for human–robot collaboration (HRC) since 1993 [1], and research related to AR for HRC has expanded further with the deployment of the Magic Leap 1 [2] and Microsoft HoloLens 2 [3], arguably the most advanced head-mounted displays for AR on the market. In 2008, Green et al. [4] presented a literature review of AR for human–robot collaboration; however, in the years since, AR for HRC has evolved immensely. The ACM/IEEE International Conference on Human–Robot Interaction hosts annual workshops on Virtual, Augmented, and Mixed Reality for Human–Robot Interaction (VAM-HRI) [5,6,7,8,9], further evidence that augmented reality and robotics are increasingly being used together. This survey is intended as a continuation and expansion of the review begun by Green et al. [4], necessitated by the significant progress and innovations in this area since then and by the lack of other survey articles on this particular topic.
Milgram et al. [1] define augmented reality as an overlay of virtual graphics and virtual objects within the real world, and this is the basic definition used throughout this paper. Green et al. add that “AR will allow the human and robot to ground their mutual understanding and intentions through the visual channel affording a person the ability to see what a robot sees” [4]. Whether the real world is viewed unobstructed, partially obstructed, or through an intermediate display, the AR features are placed over these real-world images. Technologies that enable augmented reality include mobile devices such as head-mounted displays and handheld tablets, projection-based displays, and static screen-based displays; these are detailed in Section 3. This paper focuses on augmented reality as applied specifically to human–robot collaboration and thus excludes related but different topics such as virtual reality, augmented virtuality, or augmented reality for purposes other than HRC. Because human–robot collaboration occurs across all types of robots, we include examples of this variety within every section.

2. Methodology

We conducted this literature review using IEEE Xplore, Google Scholar, and the ACM Digital Library. The keywords utilized for the search were “augmented reality” and “mixed reality”. If the conference or journal was not robotics-focused, the keyword “robot” was also used. We included works from the proceedings of highly refereed robotics, human–robot interaction, and mixed-reality conferences, as well as associated journals. Conference proceedings and journals included the ACM/IEEE International Conference on Human–Robot Interaction (HRI); Robotics: Science and Systems (RSS); International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS); IEEE International Conference on Robot and Human Interactive Communication (ROMAN); IEEE International Conference on Intelligent Robots and Systems (IROS); IEEE International Conference on Robotics and Automation (ICRA); ACM/IEEE Virtual Reality International Conference (IEEE VR); IEEE International Conference on Control, Automation, and Robotics (ICCAR); IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR); IEEE International Conference on Emerging Technologies and Factory Automation (ETFA); CIRP Annals: Journal of the International Academy for Production Engineering; IEEE International Conference on Mechatronics and Machine Vision in Practice (M2VIP); IISE Transactions; Transactions on HRI; Frontiers in Robotics and AI; Frontiers in VR; and ICAR. We recognize that this method does not yield a fully comprehensive review of all literature on HRC via AR; however, we believe that our sample size is large enough to be representative of where the field has been and where it is heading. A summary of the sections and papers included is found in Table 1. During this search, we discovered no existing reviews on augmented reality for human–robot collaboration, further solidifying the need for such an article.
We then examined the literature around augmented reality for human–robot collaboration, using the following questions to determine how to organize the discussion for each article:
  • Is the contribution primarily about helping to program, create, and/or understand a robot and/or system?
  • Is the contribution primarily about improving the collaborative aspects of a human–robot interaction?
In many cases, there is significant overlap between these contributions, and thus there are multiple valid ways to organize these works. For this article, we use the more significant area of contribution to situate each work with respect to other relevant literature. We then used these questions to help organize the structure of this review, guiding the themes of our subsections as well.
First, we explore the many different manifestations of AR as it has been used for HRC since 2008 (Section 3). We then highlight the literature within the categories defined above in Section 4 and Section 5. Section 6 reviews a representative selection of the evaluation strategies and methods utilized in the related studies. Finally, we conclude with a vision for where research on AR for HRC might be most useful in the future (Section 7).

3. Reality Augmented in Many Forms

Augmented reality can manifest in different forms. Head-mounted displays are some of the most commonly considered AR devices, frequently used in cases where the person is collocated with a robot and needs the use of both of their hands. Mobile phones and tablets offer a different experience with augmenting the real world, which is especially useful when those devices’ other capabilities or apps might be utilized or to conduct smaller-scale interactions that do not necessitate an immersive view. Projection-based displays can be ideal for tabletop collaborative work or in consistent manufacturing environments, while static screen displays might best serve remotely located users. Below, we discuss various modalities of AR, their uses, and how they have changed over time, particularly as applied to human–robot collaboration. We do this by presenting works grouped by AR modality, since each modality enables and requires different interactions.

3.1. Mobile Devices: Head-Mounted Display

Head-mounted displays (HMDs) for AR have increased in popularity for use in HRC as the technology has matured. One example of the current state of the art is the Microsoft HoloLens 2, pictured in Figure 1. Furthermore, since 2009, the research has evolved from showing basic prototypes and designs for using HMDs, as in Chestnutt et al. [10], to more recently providing detailed design frameworks [11] and conducting extensive user studies with HMDs [12,13,20,27].
Generally HMDs are used for in situ interactions with robots, whether aerial, tabletop, or ground-based. This way, the virtual images (objects and/or information) can be placed over the physical objects within the environment that the user is currently experiencing. Depending on the maturity of the technology and the desired implementation, virtual images can be either egocentric or exocentric. A helpful way to understand the difference between these two display types is to imagine a path being visualized. An exocentric display provides an external perspective of the path, such as a map, whereas an egocentric display provides a perspective from the point of view of a person actually traveling along that path. In the remainder of this subsection, we highlight literature that exemplifies the evolution of HMDs over time, while also indicating the multitude of ways in which they can be used to facilitate HRC.
In Chestnutt et al. [10], the human user draws a guide path for a humanoid robot in the HMD, and the specific left and right footsteps are then shown to the user in their HMD such that they can anticipate where the robot will step. The robot plans its specific steps (shown as virtual footprints) based on the general path provided by the human (shown as a line drawing). In this 2009 paper, all of these technologies are still relatively nascent: a full user study is not conducted, and alternatives to drawing the robot path, such as joystick control, are still under consideration. More recent research shows an increased expectation of rigor, a positive indicator of a maturing field.
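As a schematic illustration of this interaction (and not the implementation from [10]), the following Python sketch converts a hand-drawn guide path into alternating left and right footstep poses that an HMD could render as virtual footprints; the stride and foot-offset parameters are assumptions.

```python
import numpy as np

def plan_footsteps(path_xy, stride=0.25, half_width=0.10):
    """path_xy: (N, 2) points along the drawn guide path, in floor coordinates."""
    # Resample the drawn path at one sample per stride length.
    seg = np.diff(path_xy, axis=0)
    dists = np.insert(np.cumsum(np.linalg.norm(seg, axis=1)), 0, 0.0)
    samples = np.arange(0.0, dists[-1], stride)
    xs = np.interp(samples, dists, path_xy[:, 0])
    ys = np.interp(samples, dists, path_xy[:, 1])

    footsteps = []
    for i in range(len(samples) - 1):
        heading = np.arctan2(ys[i + 1] - ys[i], xs[i + 1] - xs[i])
        side = 1 if i % 2 == 0 else -1                 # alternate left/right feet
        # Offset each footprint perpendicular to the local heading of the path.
        fx = xs[i] - side * half_width * np.sin(heading)
        fy = ys[i] + side * half_width * np.cos(heading)
        footsteps.append(("left" if side == 1 else "right", fx, fy, heading))
    return footsteps  # each entry would be rendered as a virtual footprint in the HMD
```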
Also in 2009, Green et al. [14] utilize an HMD to allow a user to view virtual obstacles and plan a path for a simulated robot in AR. The HMD device used in the study, the eMagin Z800, was wired to a computer, and the work was carried out in simulation. This simulation-based work is further evidence of earlier studies finding ways to conduct AR-HRC research with still-maturing platforms.
Four years later, in 2013, Oyama et al. [15] debut a “slenderized HMD” to provide a teleoperator with the perspective of the robot. The device utilizes the same base HMD as in Green et al. [14] but augments it with stereo cameras and a camera with a wide field of view. Similarly, the HMD in Krückel et al. [16] allows for teleoperation of an unmanned guided vehicle, but in this case the operator’s view is augmented with an artificial horizon indicator and heading information. Furthermore, the operator can look around the entire environment, as they are effectively immersed in it with the use of the Oculus Rift HMD, a device intended for virtual reality more than augmented reality. This raises the question of what actually “counts” as AR: in Oyama et al. [15] and Krückel et al. [16], the human’s own surroundings are not being augmented; rather, the human is placed virtually into the environment of the robot. We claim that this is in fact augmented reality, since it is not a virtual environment that is being augmented. Despite the human not being in the same location as the robot that they are controlling, a real environment is being augmented with virtual images, all of which the human user is able to see and affect.
The Microsoft HoloLens was introduced in 2016, facilitating new research on AR for HRC using HMDs. Readers may note that the HoloLens is referenced throughout the literature mentioned in this paper, as it is relatively straightforward to work with and represents the state of the art in augmented reality technology for head-mounted devices. The HoloLens 1 places images as holograms, or virtual images overlaid on the real world, in the wearer’s field of view. This capability, along with the incorporation of sensors allowing for detection of gaze, voice, and gesture, made the HoloLens a revolutionary hardware development. In late 2019, the second version was released, HoloLens 2, with additional features and improvements including a more comfortable fit and eye tracking. The HoloLens has been mass-produced for approximately 5 years now, making it widely available for research.
Guhl et al. [17] provide a basic architecture for utilizing the HoloLens for industrial applications. Using tools such as Unity and Vuforia, robots can be modeled on the HoloLens, safety planes can be rendered to keep the human and robot safely separate, and sound can be played. These concepts and capabilities are suggested in the hopes of allowing users to foresee robots’ motions and thereby productively interfere.
Technology in Yew et al. [18] takes the AR user’s environment and “transforms” it into the remote environment of the teleoperated robot. Real objects in the user’s environment are combined with virtual objects in AR, such as the robot and the objects with which it is interacting, thereby reconstructing the actual site of the robot for the teleoperator.
A robotic wheelchair user is outfitted with a Microsoft HoloLens in Zolotas et al. [19]. A rear-view display is provided, the future paths of the wheelchair are projected onto the floor, possible obstacle collisions are highlighted, and vector arrows (showing both direction and magnitude) change with the user-provided joystick velocity commands. One outcome of this study was a deeper understanding of users’ comfort with AR feedback. The authors also further confirmed the restrictive field of view of the HoloLens and cited it as a limiting factor in the usefulness of the AR. The work in Zolotas and Demiris [12] then builds on Zolotas et al. [19] by adding “Explainable Shared Control” to the HMD. In this way, the researchers aim to make the robotic wheelchair’s reasoning more transparent to the user. The AR is classified as “environmental” (exocentric) or “embodied” (egocentric), depending on whether it is fixed to the environment or fixed to the user or robot. In another recent robotic wheelchair study using the HoloLens, Chacón-Quesada and Demiris [20] test different types of icons and display modes. The user can control the wheelchair from within the AR interface, and a choice of movement options is shown to the user in their field of view.
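As a hedged sketch of one element of such an interface (not taken from [19]), the following function rolls a simple unicycle model forward from the current joystick command to produce the short-horizon floor path that an AR layer might project ahead of the wheelchair; the motion model, horizon, and time step are assumptions.

```python
import math

def predict_floor_path(x, y, theta, v, w, horizon=2.0, dt=0.1):
    """Forward-simulate a unicycle model from the joystick command (v, w).
    Returns floor points (x, y) that the AR display could render as the predicted path."""
    points = []
    for _ in range(int(horizon / dt)):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        points.append((x, y))
    return points
```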
The HoloLens was also used to program a UR5 robot arm to conduct pick-and-place tasks in Rudorfer et al. [21]. The platform uses the built-in recognized HoloLens gestures to interact with the six-degree-of-freedom robot via a drag-and-drop type gesture. The goal of this system is to enable a user to command a robot to perform pick-and-place actions, moving Lego blocks from one location to another. In Puljiz et al. [22], a feasibility study explores a method of generating the robotic arm as a manipulable hologram within the HoloLens, using a registration algorithm and the built-in gesture recognition. The virtual robot is overlaid on the physical robot, with the goal of teleoperation. Either the end-effector can be manipulated, or the linkages can be moved to create the desired positions. In practice, issues with segmentation resulted in the hand tracking not performing well on dark backgrounds and when close to objects.
The study conducted in Elsdon and Demiris [23] uses a HoloLens in conjunction with an “actuated spray robot” for the application of specific doses of topical medication. The amount of medication dispensed is shown to the user only via AR, rendering an otherwise unobservable result for the user.
Reardon et al. [24] show how AR can aid a human who is conducting search efforts collaboratively with a mobile ground robot. In this case, the robot is providing location and navigation information to the human teammate via AR. The primary technical contribution from this study is the alignment of the frames of the human and the robot. This study also uses AR markers for targets and navigation during testing. The goal of Kästner and Lambrecht [25] is to evaluate the HoloLens’s performance under five different visualization modes: without any sensor data visualization; with laser scan visualization; with environment map visualization; with laser scan and environment map visualization; and with laser scan, environment map, and navigation visualization. The experiment uses AR to present a visual map of the space, set goal locations for the ground robot, and visualize the robot’s path along the floor. The main limitations of the technology stem from the constant visualization of real-time data, especially the laser scan data for position and obstacle tracking.
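The frame alignment problem mentioned for Reardon et al. [24] can be illustrated generically (this is not the authors’ algorithm): given corresponding 3D points observed in both the headset frame and the robot’s map frame, a least-squares rigid transform aligns the two, after which AR content can be anchored consistently in either frame.

```python
import numpy as np

def align_frames(pts_hmd, pts_robot):
    """pts_hmd, pts_robot: (N, 3) corresponding points in each frame, N >= 3.
    Returns (R, t) such that R @ p_hmd + t approximates p_robot (Kabsch algorithm)."""
    mu_h, mu_r = pts_hmd.mean(axis=0), pts_robot.mean(axis=0)
    H = (pts_hmd - mu_h).T @ (pts_robot - mu_r)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_r - R @ mu_h
    return R, t
```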
Hedayati et al. [26] explore three different design methodologies, which all prove to be improvements over the baseline. A HoloLens is again utilized as the ARHMD platform, with three classifications for interface designs: augmenting the environment (which they call the Frustrum design), augmenting the robot (the Callout design), or augmenting the user interface (the Peripherals design). These design frameworks work quite well for the situations where the robot is separate from the human and they are collocated in the environment, but may not apply as well in all situations, for example, when the robot is a wheelchair that the user is operating from a first-person perspective. In related work, Walker et al. [11] also utilizes this design framework (augmenting the environment, augmenting the robot, augmenting the user interface), and showcases four reference designs (NavPoints, Arrow, Gaze, Utilities) for designing AR for HRC.
The limitations and drawbacks of head-mounted displays are made clear in Qian et al. [27], where a HoloLens is used to assist the first assistant during robotic-assisted surgery. The weight of the device, as well as its limited field of view, are both stated as problematic in participant interviews. The intent of AR in this case was to be able to (virtually) view instruments inside the patient and to provide real-time stereo endoscopic video in a convenient location.
Similarly to Qian et al. [27], Walker et al. [13] also use a HoloLens to display a hologram robot (“virtual surrogate”) that is manipulated for teleoperation. However, in this study, the user is collocated with the robot, which is an aerial quadcopter robot instead of a tabletop robotic arm, and a handheld Xbox controller instead of hand gesture recognition is the mode of teleoperation. Two designs are tested: one that behaves like a typically teleoperated robot with the physical quadcopter immediately responding to the virtual surrogate’s movements and another where the virtual surrogate is used to set waypoints in AR that the physical quadcopter can be signaled to begin at any time. These are compared against a purely teleoperated robot, without any virtual surrogate. In the user study, both task completion time and response time are faster in the experimental conditions, and participants also preferred the experimental designs over direct teleoperation.
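The two surrogate modes compared in [13] can be summarized with a small sketch (the interface and method names here are assumed, not from the paper): in the real-time mode the physical quadcopter mirrors the hologram immediately, while in the waypoint mode poses accumulate until the user signals the robot to begin.

```python
class VirtualSurrogate:
    """Minimal sketch of the 'real-time' versus 'waypoint' surrogate designs."""

    def __init__(self, robot, mode="waypoint"):
        self.robot = robot            # assumed to expose fly_to(pose)
        self.mode = mode
        self.pending = []

    def on_surrogate_moved(self, pose):
        if self.mode == "realtime":
            self.robot.fly_to(pose)   # physical quadcopter mirrors the hologram
        else:
            self.pending.append(pose) # waypoints exist only in AR until committed

    def commit(self):
        for pose in self.pending:     # user signals the quadcopter to begin
            self.robot.fly_to(pose)
        self.pending.clear()
```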

3.2. Mobile Devices: Handheld Display

Augmented reality that uses a handheld mobile device display, such as a tablet or smartphone, is a frequent implementation of AR. These kinds of devices are ubiquitous, and creating an app that can be deployed to almost anyone is relatively straightforward and inexpensive. Since the release of the iPhone in 2007, mobile devices like it have been increasingly at people’s fingertips, and there is already a dependable baseline level of familiarity with how to interact with AR in this form. As mentioned in the introductory paragraph to this section, handheld mobile displays provide an AR experience that is less immersive than that of an HMD; furthermore, handheld devices are typically a more affordable way to implement AR for HRC.
The AR format in Fung et al. [28] uses the Sony Vaio ultra mobile PC, a handheld touchscreen device that recognizes fiducial markers (special tags) in the space to provide on-screen information to the user, enabling them to program a robot to carry out a limited set of tasks. The user takes photographs with the handheld device, enabling the recognition of objects and locations in the photograph, and actions can then be programmed using these recognized objects and locations. In this way, a robot can be programmed to operate simple home appliances, such as a hot water kettle.
The Samsung Galaxy S II smartphone is used in Lambrecht and Krüger [29] as the mobile device on which to display AR, with the goal being intuitive industrial robot programming. The mobile device displays virtual objects relevant to the robot’s motions, and the user can interact using hand gestures. Information from an external 3D motion tracking system and from the 2D camera on the mobile device is combined to interpret the hand gestures.
That same year, Bonardi et al. [30] present an iPad application for arranging robotic movable furniture either in situ with AR (“Augmented/A”) or in virtual reality (“Virtual/V”). Tables and chairs can be placed virtually into the actual environment, and different experimental conditions either allowed the participant to move freely about the space with the iPad (“Dynamic/D”) or required them to remain stationary with the iPad anchored in place (“Static/S”). Participants were also tracked with the Kinect sensor. All subjects in this 2 × 2 study were provided time to practice using the software on the iPad using the virtual, static condition, and then performed two of the four conditions (SV, SA, DV, or DA). Participants preferred dynamic over static conditions and performed better in the dynamic condition with respect to precision, and they also expressed a preference for augmented representation over virtual despite no observed performance differences. The choice of an external mobile display for the interaction is notable here, as it allows the person to manipulate objects on a tangible screen while moving around the environment with their field of view unencumbered.
A Samsung Galaxy Tab 4 is used to compare the use of AR with traditional robot programming in an industrial environment in Stadler et al. [31]. The participant completes three different tasks to program a Sphero 2.0 robot ball in either an AR or a no-AR condition. In the AR condition, “task-based support parameters” are provided, whereas these parameters are not given in the no-AR condition. Workload measures are lower in the AR condition, while task completion time increases, possibly because participants, given greater visibility into the task, sought to be more accurate in the AR condition.
More industrial robot programming is explored with mobile screen AR in Hügle et al. [32]. The user first moves around the space with a tablet, using pointing and arm movements, while the six-DOF robot arm remains stationary. Next, the user validates the robot poses and trajectories aided by the AR application, able to adjust the program as well as physically move the robot. Finally, the user leaves the area so that the robot can safely demonstrate its learned movements. Gestures are recognized using the tablet’s camera, the user receives AR feedback on the gesture interpretation, and a virtual robot is also displayed to demonstrate the current program.
The Apple iPad Pro is the mobile device of choice for Frank et al. [33]. Fiducial markers are arranged on a table surrounding a humanoid robot with two six-DOF arms. Manipulable objects, also labeled with markers, must be moved around the table. Three different interfaces, all using the iPad, are tested in a between-subjects study. The three interfaces are a conventional egocentric (to the robot) interface, where users view the area from the perspective of the robot’s on-board camera; a conventional exocentric interface, which displays an overhead camera view of the workspace; and an experimental mobile mixed-reality interface, which uses the tablet’s rear-facing camera as the point of view. The reachable space can be highlighted virtually on the tablet. Statistically, participants perform equally well with all interface modes. Because the egocentric interface requires users to move around to gain perspective of the robot, this modality is less preferred by participants than the other two modalities. Likewise, the egocentric interface users also report a higher workload. There is obvious variability among participants using the mobile interface, possibly due to the variety of movements available to those users.
In Sprute et al. [34], a Google Tango tablet with an RGB-D camera is used to define spaces that a mobile robot is allowed to occupy, using “virtual borders”. Holding the tablet, a user moves around the space and chooses points in a specified plane. These points are displayed on the screen along with the virtual borders that they define. This method is compared against two baseline methods: visual (physical) markers and a laser pointer. Ultimately, the results showed that the tablet method produced a similar accuracy as the baseline methods and resulted in a faster teaching time.
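The notion of a virtual border in [34] amounts to restricting the robot to a user-defined polygon in a plane. A minimal sketch of such a containment test (not the authors’ implementation) is shown below.

```python
def inside_virtual_border(px, py, border):
    """border: list of (x, y) vertices selected by the user in the chosen plane.
    Standard ray-casting point-in-polygon test."""
    inside = False
    n = len(border)
    for i in range(n):
        x1, y1 = border[i]
        x2, y2 = border[(i + 1) % n]
        if (y1 > py) != (y2 > py) and px < (x2 - x1) * (py - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside
```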
In Chacko and Kapila [114], a Google Pixel XL allows a user to select an object and a goal location, which are then shared with a four-DOF tabletop robot manipulator with a one-DOF gripper. The mobile AR display features two buttons (one for setting the target and another for clearing), crosshairs to assist with locating a target, shading to denote reachable regions, and virtual objects to indicate the intended final placement. Different versions of the interface are provided to allow the user to program either one pick-and-place object at a time or multiple objects together. Participants rate the workload required for this task and interface as relatively low. Chacko and Kapila [35] extend Chacko and Kapila [114] by expanding the types of objects to be manipulated, allowing for two different grasping modes (vertical and horizontal), and adjusting the AR display accordingly.
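The reachable-region shading described above can be approximated, for illustration only, by testing whether a table position falls inside the annulus spanned by the arm’s link lengths; the interfaces in [35,114] are not specified at this level of detail, so the function below is purely an assumed planar sketch.

```python
import math

def reachable(target_xy, base_xy, link_lengths):
    """Planar approximation: a point is shaded 'reachable' if its distance from the
    arm base lies between the minimum and maximum reach of the link chain."""
    dist = math.dist(target_xy, base_xy)
    r_max = sum(link_lengths)
    r_min = max(2 * max(link_lengths) - r_max, 0.0)
    return r_min <= dist <= r_max
```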
The software developed in Rotsidis et al. [36] is intended to facilitate trust between robots and users, using a mobile phone AR application to increase transparency. The AR display has modes that show a ground robot’s decision-making capabilities in tree-like formats. Subtrees can be expanded with a tap, and users can debug the program and access additional information. This kind of transparency increases the likelihood that the robot is perceived as alive, lively, and friendly by study participants.
As demonstrated by this review of mobile device AR displays, the uses are incredibly diverse and allow for a variety of functionality and information provision.

3.3. Projection-Based Display

Another commonly used mode of augmenting the real world for HRC is projection. Much of the work in this area has occurred within the past 4 or 5 years, perhaps due to the maturation of projection and motion capture technologies.
In 2016, work in Andersen et al. [37] utilized projection mapping to facilitate autonomous robotic welding. An operator uses a Wii remote to control a cursor and communicate with the robot. In the experiment, the projection is displayed on a mock-up of a shop wall. The participant completes two separate tasks, one requiring them to correct a number of incorrect locations for welding, and another to teach the welding task to the robot. The functionality of the projection system was rated relatively highly by mostly novice participants, due in part to the projection visualization of task information.
In a car door assembly task in Kalpagam Ganesan et al. [38], projections are used to dynamically indicate various cues to human collaborators working with robots. Object locations are tracked with a vision-based system, and this enables projection mapping on top of the 3D objects. Three modes of communication were tested: printed mode, in which subjects received printed instructions; mobile display mode, in which subjects received a tablet with instructions; and projection mode, providing just-in-time instructions via projection mapping with mixed reality cues. Participants had to collaborate with a robot to complete the door assembly task. The amount of time required to understand a subtask was lower in the projection mode than in the printed or mobile display modes. Furthermore, the subjective questionnaire revealed higher fluency, clarity, and feedback with the projection mode. All participants also favored the projection mode in this within-subjects test.
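Projection mapping of this kind generally reduces to projecting tracked 3D object points into the projector’s image so that cues land on the physical part. The sketch below uses OpenCV’s standard pinhole projection and assumes the projector intrinsics (K, dist) and the object pose (rvec, tvec) are available from calibration and tracking; it is a generic illustration rather than the pipeline of [38].

```python
import numpy as np
import cv2

def project_outline(outline_3d, rvec, tvec, K, dist):
    """outline_3d: (N, 3) model points of the tracked object; rvec/tvec: object pose
    expressed in the projector's frame; K, dist: projector intrinsics/distortion."""
    pixels, _ = cv2.projectPoints(outline_3d.astype(np.float32), rvec, tvec, K, dist)
    return pixels.reshape(-1, 2)   # pixel coordinates at which to draw the cues
```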
In another industrial application in Materna et al. [39], a human subject uses spatial augmented reality to program a robot to prepare parts for assembly. Projections are displayed on a touch-enabled table that is also within reach of the robotic arms. Since all work occurs on the table, the location of the projections in this same area is intended to increase focus and situational awareness, improve use by novice users, and remove the need for other devices. The tabletop system serves both as input for the robot and feedback for the human. Lists of instructions and programs, dialog boxes, and images representing objects to be manipulated are all “widgets” shown on the tabletop surface. Unfortunately, the affordances of the touch-capable table proved to be lacking, and five of the six participants agreed with the statement, “Sometimes I did not know what to do”, demonstrating once again that shortcomings in the tools can deeply affect the overall experience.
Similar to Materna et al. [39], in Bolano et al. [40], a tabletop projection system is also used. In this case, however, information is shown about robot behavior and detected parts, with the goal of clarifying the task and the robot’s intent, and the table is not touch-enabled, nor are any inputs solicited from the user. Without the hindrance of a confusing touch interface as in Materna et al. [39], the usefulness of tabletop projection can be assessed. Because in this example the user is working concurrently with the robot rather than programming it, understanding the intent and future movements is especially useful. If the robot makes an unpredictable move, the human user can see with a glance the goal location and immediately assess whether or not a collision is imminent.

3.4. Static Screen-Based Display

A mode of AR display that has declined in popularity in recent years is that of a screen-based display, generally placed on a desktop for viewing. This display is distinct from the mobile device displays discussed earlier, as it cannot be moved with the user on the fly, nor is it generally equipped with a mobile camera. Research involving static displays for HRC is largely for remote use purposes, featuring an exocentric camera view and virtual overlays for the remote user. Here, we highlight some examples of these static displays for AR, though this modality has been less common in recent years.
Work in 2009 used a screen-based display to facilitate dental drilling in Ito et al. [41]. Virtual images were projected onto teeth to perform the drilling required to prepare them for a crown. The path of the drill can be superimposed and feedback shown on the screen. The machine is teleoperated via joystick, and the AR system enables the replication of the original operation.
In 2010, a remote operator is shown a live view of a robot arm with additional information on top of and around the robot in view in Notheis et al. [42]. Both virtual and real cameras are enabled, with the virtual model showing the intended movement of the real robot. The user can validate the movements via the screen prior to the action being taken in real life.
In proof-of-concept work performed in 2012 in Domingues et al. [43], the intent is to provide users with a virtual scuba diving experience. While an underwater robot (ROV) is teleoperated, a screen-based AR display shows controls and the video feed from the ROV. The user can choose whether to use the on-board ROV camera or the virtual ROV for controlling the robot.
A stationary touchscreen AR display is used in 2013 to allow users to teleoperate a ground-based robot in another room by manipulating a 3D model on the screen in Hashimoto et al. [44]. The user draws the robot path on the screen with their finger, and various cameras are provided to augment the user’s view, including a third-person view camera. Three movement modes are tested with the touchscreen input: Movement After Touching (the robot does not move until the person is no longer touching the screen), Movement During Touching (the robot moves as soon as the user begins to manipulate the model but stops immediately when the screen is no longer being touched and the model moves to the current location of the robot), and Movement During and After Touching (the robot begins as in Movement During Touching, but when the user stops touching the screen, the robot continues to the final model position). Only 12 participants were involved in the study, which makes generalizations about the usefulness of each mode difficult, and there were participants who preferred each of the three modes.
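The three touch-driven motion modes of [44] can be captured in a compact sketch; the robot interface (go_to, stop) is assumed rather than taken from the paper.

```python
def on_touch_update(mode, touching, model_pose, robot):
    """Dispatch one touchscreen update according to the selected movement mode."""
    if mode == "after":                 # Movement After Touching
        if not touching:
            robot.go_to(model_pose)     # move only once the finger lifts
    elif mode == "during":              # Movement During Touching
        if touching:
            robot.go_to(model_pose)     # track the manipulated model while touched
        else:
            robot.stop()                # stop as soon as the touch ends
    elif mode == "during_and_after":    # Movement During and After Touching
        robot.go_to(model_pose)         # continue to the final model pose regardless
```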

3.5. Alternate Interfaces

A survey of the literature on AR for HRC would be incomplete without acknowledging the various peripheral devices developed for interacting in augmented reality. Here, we provide examples of these diverse types of peripherals.
One example of a peripheral being used with AR is in Osaki et al. [45], where a projection-based AR is combined with a drawing tool peripheral to set a path for a mobile ground-based robot. Additional commands and communication are provided by the drawing tool, including navigation by a virtual string (as if it were a leash and the robot were a dog) and the use of different colors to indicate stop or go.
To enable robot use by people with mobility impairments, a “tongue drive system” (TDS) is developed for use with an AR headset in Chu et al. [46]. Using tags and object recognition, a user is able to perform pick-and-place and manipulation tasks faster with the TDS than with manual Cartesian inputs from a keyboard.
One proposed concept, and an example of where this kind of technology might lead us in the future, is an immersive suit for the elderly: the “StillSuit” in Oota et al. [47]. The main purpose of the robotic StillSuit is to enable interaction with the environment. Using “Lucid Virtual/Augmented Reality”, the central nervous system and musculoskeletal system are modeled, providing the user with the sensations of performing a particular task.
In Gregory et al. [48], users perform gestures while wearing a Manus VR gesture glove, capable of tracking each finger’s movement. While wearing a HoloLens, users provide movement instructions to a ground-based robot via the gesture glove. A key insight learned in this pilot study is that gestures should be chosen so that they can be easily formed by all users.

3.6. AR Combinations and Comparisons

Other themes in the literature included the comparison of different AR modalities via user studies and the combining of modalities to achieve improved effects. These studies are important for those who may be deciding whether to implement AR in different modalities or how to provide AR insight to both an egocentric and an exocentric user simultaneously; thus, related works are shared below.
Augmented reality can be a combination of technologies, such as in Huy et al. [49], which combines projections using a laser writer system (or spatial augmented reality, SAR) with the Epson Moverio BT-200 AR Glass (an HMD) and a multimodal handheld device prototyped for the study. The laser writer is mounted to a ground-based mobile robot to provide directional feedback, the human can provide commands via the handheld device, and other visual feedback can be provided via the HMD. The intent of testing both versions of AR (projection and HMD) is for those cases where some of the communicated information may be sensitive, while other information may be needed by all those in the vicinity of the robot for safety purposes.
Sibirtseva et al. [50] compare different AR methods where the three conditions are HMD, projector, and a monitor. Participants claim that the HoloLens is more engaging, possibly due to the mobility that an HMD allows, but generally prefer the projection-based AR for a tabletop robot manipulator conducting a pick-and-place task because it was “natural”, “easy to understand”, and “simple”.
Similar to Huy et al. [49], in Bambušek et al. [51], a HoloLens is combined with projection AR so that an outsider can see what the HMD wearer is doing. The study indicated a high task load for the HMD and confusion when both modalities were used. Ultimately, task completion time was faster with the HMD despite the high Task Load Index rating. The unreliable touch-enabled table proved to be problematic, as seen in other studies such as Materna et al. [39].
AR (as well as VR in this instance) has also been used as a training tool for the operation of a conditionally autonomous vehicle in Sportillo et al. [52]. In a between-subjects study, three different training methods are tested: on-board video tutorial, AR training, and VR simulator. In this Wizard-of-Oz study, all participants are able to take over in the appropriate situations within the required time, regardless of their training method, but participants trained with AR or VR have a better understanding of the procedure and better performance time.

4. Programming and Understanding the Robotic System

We encountered a large subset of literature that discussed the problems of allowing a user or designer to better understand, create, or improve the human–robot collaborative system via augmented reality. Below, we discuss these in their respective subsections based on the ways in which they do so or their intended domain.

4.1. Intent Communication

The research highlighted in this subsection addresses the problem of the communication of robot intent to humans via AR. Section 4.2, Path and Motion Visualization, is related to intent, but it is differentiated in that intent is not always path- or trajectory-based. A robot might want to communicate an overall plan, a goal location, or a general intent so that the human collaborator does not duplicate efforts, alter the environment, or put themselves in danger. Thus, we share this section specifically dedicated to intent communication.
One key example of intention explanation is in Chakraborti et al. [53], where the “Augmented Workspace” is utilized both before and during task execution. The aim of this work is to keep the human collaborator informed, increase the fluency of the collaboration, increase the clarity of the plans (before and during task execution), and provide a common vocabulary. Particularly notable is the Projection-Aware Planning Algorithm, where “the robot can trade-off the ambiguity on its intentions with the cost of plans”. Similarly, algorithms for interpreting the scene and establishing and updating the virtual borders to be shown to the HMD wearer are presented in Sprute et al. [54].
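The quoted trade-off from [53] can be read schematically as a weighted plan-selection criterion; the scoring functions below are assumptions used only to make the idea concrete, not the authors’ Projection-Aware Planning Algorithm.

```python
def choose_plan(candidate_plans, plan_cost, intention_ambiguity, weight=1.0):
    """Select the plan that balances execution cost against how ambiguous its
    projected intentions would appear to the human observer."""
    return min(candidate_plans,
               key=lambda plan: plan_cost(plan) + weight * intention_ambiguity(plan))
```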
The overarching goal of Reardon et al. [55] is to provide straightforward, bidirectional communication between human and robot teammates. The human is provided information to more clearly understand the robot’s intent and perception capabilities, while the robot is provided information about the human that enables it to build a model. By enabling this bidirectional communication, the authors seek to influence human behavior and increase the efficiency of task completion. The task at hand in this experiment is the cooperative exploration of an uninstrumented building. The robot and human (wearing an AR HMD) are independently performing SLAM, and their frames of reference must first be aligned with each other. Next, the maps from both sources are composited. Finally the robot’s information is provided to the human teammate visually in their AR-HMD. The information visually communicated to the human via the AR-HMD includes the robot’s current plan; the composite map, to facilitate understanding of the current state of the exploration task; and other information to convey how the robot is evaluating future actions [55].
In cases where humans and industrial robots must work in close proximity, safety and trust can be improved by indicating the robot’s intent to the human. For example, in Bolano et al. [40], a human collaborator works in a shared space on an assembly task. Using projection-based AR, the user can immediately see whether a part is recognized by the system and can also be shown the current target, trajectory path, and/or swept volume of the robot so that they can safely move out of the way (or know that they are already working in a safe space), even if it might appear as though the robot is moving towards them.
To aid in the disambiguation of human commands, Sibirtseva et al. [50] present a system that involves natural language understanding, a vision/object recognition module, combining these two for reference disambiguation, and the provision of both a visualization in AR and an autonomous robot controller. After a pilot study to establish human language preferences for the reference disambiguation visualization system, a relatively straightforward pick-and-place task for different colors of blocks is established to compare three modalities of AR.
In a similar experiment, Williams et al. [56] perform a within-subjects study to investigate how a robot can communicate intent to a human via AR images used as deictic gestures (such as circling an object in the user’s field of view), rather than via physical deictics (such as pointing). The experimental results suggest design guidelines for “allocentric mixed reality deictic gestures”, including the suggestion to use these gestures in contexts where language may be difficult or impossible, or when the intended target may be perceived as outside the robot’s perspective, and to use them in combination with language when the situation allows.
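As a minimal illustration of such an allocentric AR deictic gesture (not code from [56]), the ring of world-anchored vertices below could be rendered around a referenced object in the user’s field of view; the radius and vertex count are assumed.

```python
import math

def circle_annotation(center, radius=0.15, segments=32):
    """center: (x, y, z) of the target object in the world frame.
    Returns ring vertices for the AR renderer to draw around the object."""
    cx, cy, cz = center
    return [(cx + radius * math.cos(2 * math.pi * i / segments),
             cy + radius * math.sin(2 * math.pi * i / segments),
             cz)
            for i in range(segments)]
```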
A key result of communicating robot intent is the calibration of a human user’s trust that results from their mental model of the system and from an understanding of its capabilities and limitations. This calibration of trust is one of the primary goals of Rotsidis et al. [36]. Using a mobile phone-based AR, a tree-like display of the robot’s plans and priorities was shown to a human for both transparency and for debugging.
Even more recently, Hamilton et al. [57] compared two different AR robot gestures for indicating desired objects: a virtual robot arm rendered on the physical robot that points to the target, and a virtual arrow. Based on the robot’s deictic gesture, the participant chose the virtual item that they believed the robot was indicating. While the arrow elicited faster and more efficient responses, the virtual arm elicited higher likability and social presence scores for the robot, presenting designers of intent communication with a trade-off between likability and efficiency. Further, AR is shown in [58] to be a promising technology for the bidirectional communication of intent and for increased task efficiency, through experiments that provide avenues for both the human and the robot to communicate intent and desires. AR-based visualizations, which include placing a virtual robot in the physical space along with sensor data and a map grid, are also tested in Ikeda and Szafir [59] for supporting debugging by roboticists.

4.2. Path and Motion Visualization and Programming

Another popular problem in human–robot collaboration is that of understanding and programming robot trajectory and motion. As clarified in Section 4.1, here we focus on paths and trajectories of the robots and how AR can be used to visualize or program these trajectories.
In a straightforward and intuitive example from Osaki et al. [45] in 2008, the human user draws lines in AR (via both the projector and HMD), using a peripheral device, for the robot to follow. The lines are then processed into trajectories that the robot can take. Similarly, in Chestnutt et al. [10], a human user directs a humanoid robot by drawing a guide path on the ground in AR. The system then plans left–right footstep sequences for the robot that are also displayed via AR, and the user is able to modify the path if necessary.
For a remote laser welding task, a similar line-following approach is taken in Reinhart et al. [60], also in 2008. First, the welding locations are denoted with the specific welding task to be completed using AR projections, and next, the robot paths are optimized for task completion. Published approximately 8 years later, Andersen et al. [37] is also related to welding, this time for stud welding in a shipbuilding environment. Projection mapping is used in this instance as well, and a lab-based user study indicates positive results for novice users in programming the robot to conduct accurate welding activities.
In Green et al. [14], the authors set three different experimental conditions for humans navigating a simulated robot through a maze with the use of AR. The three within-subject conditions tested are an Immersive Test, using an onboard camera and teleoperation without any AR; Speech and Gesture no Planning (SGnoP), providing AR interaction with speech and gestures; and Speech and Gesture with Planning, Review, and Modification (SGwPRM), adding to the prior condition the opportunity to review the plan before it is executed by the robot. While the immersive condition is preferred by test subjects and most easily executed, SGwPRM yields the most accurate results. Significant user learning had to take place in both of the AR conditions, while the pure teleoperation is a more natural mode of control. This study combines a number of different options, such as displaying the path before robot movement begins, utilizing AR tags to display virtual objects to the user, and integrating speech and gesture inputs.
A significant amount of research covers different ways to “teach” or program a robot using AR. In Hulin et al. [61], visual and haptic signals are given via AR to a human who is using Programming by Demonstration to teach a robot arm a trajectory. The signals are intended “to avoid singularities”. The following year, in Fung et al. [28], a human user takes photographs with an AR-enabled device and then provides annotations, which are transferred to a ground robot’s movement. Bonardi et al. [30] do not use a separate ground robot; instead, the furniture itself is robotic and modular, and users interact with an iPad to control the arrangement of the furniture in a shared space. While these papers covered scenarios with humans in the same space as a robot, Hashimoto et al. [44] instead deal with a robot teleoperated from another room via touchscreen. Also in 2013, Gianni et al. [62] presented a framework for remotely operating a semi-autonomous ground robot as well. Their framework includes an AR interface that allows for path planning and obstacle navigation through a handheld pen peripheral, as well as a localization system that uses dead reckoning in addition to ICP-SLAM and a trajectory tracking algorithm. This kind of remote communication is designed to be especially useful for situations that might pose greater risk to a human, such as emergency rescue or scouting. Both Lambrecht and Krüger [29] and Lambrecht et al. [63] focus on honing hand gesture recognition algorithms for the spatial programming of industrial robots. Specific contributions include the recognition of specific gestures that map to robot poses, trajectories, or task representations, and improvements in skin-color classification and hand/finger tracking. In a 2014 user study, Coovert et al. [64] demonstrate the effectiveness of projections (such as arrows) from the robot onto the floor in front of it when moving in an environment among humans. Participants feel more confident about the robot’s movement and more accurately predict its movement with projections than without. In another study the following year, Chadalavada et al. [65] suggested that a mobile ground robot that projects its intentions onto the floor with even a simple contour line is preferable to no projection at all.
Rather than use AR for directing or programming the robot, Makris et al. [66] suggest that an AR HMD can be used in a human–robot collaborative assembly environment to provide the human with robot trajectory visualizations so that they can stay safely away from those areas. However, the presented system does not offer any recourse if the user does intersect the denoted trajectory/path. In a study by Walker et al. [11], different ARHMD visualization designs are tested for communicating the intent of a quadcopter robot to a human in a shared space. Four different visualizations are tested in a between-subjects study: NavPoints, Arrow, Gaze, and Utilities. These visualization designs each have different purposes and uses.
Hügle et al. [32] present a programming method for a robot arm that involves both haptic (Programming by Demonstration) and gesture-based input. The gesture-based input is used to provide a rough definition of the poses within the space, while AR images are used to validate the poses and trajectories and alter the program. Next, the human takes turns leaving the space while the robot moves to the next pose, re-entering the space to provide hands-on feedback and alterations, and then leaving again for the next movement. Once the program is finalized, it is transferred to the controller.
In Materna et al. [39], users program a PR2 robot as an assembly assistant, using projection-based AR on a touch-enabled table. They use a block programming technique (with the blocks projected on the table) to select the appropriate steps for the robot to complete, and the target locations for parts are also highlighted virtually on the table. Templates are available offline for the users to work from, and specific parametric instructions (such as pick from feeder or place to pose) are supported. No pre-computed joint configurations or trajectories are stored, and all paths are planned after the program is set.
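A block program of the kind assembled in [39] might be represented as a simple list of parametric instructions; the data structure and the robot/planner interfaces below are assumed, with only the instruction names (“pick from feeder”, “place to pose”) taken from the text.

```python
program = [
    {"block": "pick_from_feeder", "feeder_id": 2},
    {"block": "place_to_pose",    "pose": (0.42, -0.10, 0.03, 0.0)},
    {"block": "pick_from_feeder", "feeder_id": 1},
    {"block": "place_to_pose",    "pose": (0.42,  0.10, 0.03, 1.57)},
]

def execute(program, robot, planner):
    # Consistent with [39], no joint trajectories are stored with the blocks;
    # motion is planned only once the finished program is run.
    for step in program:
        if step["block"] == "pick_from_feeder":
            robot.follow(planner.plan_pick(step["feeder_id"]))
        elif step["block"] == "place_to_pose":
            robot.follow(planner.plan_place(step["pose"]))
```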
The system in Krupke et al. [67] allows a human user to interact with a virtual robot, move it virtually, confirm the movements via speech after watching a visualization of the picking motion, and then observe the actual physical robot move according to those movements, the goal being a pick-and-place task. In another pick-and-place task, non-experts are asked to program a robot used to move printed circuit boards to and from their testing locations [68]. A form of block programming is used in which “pucks” are chosen and placed by the user to indicate actions and their sequences to the robot. Bambušek et al. [51] provide a user with a HoloLens HMD for programming a robot for a pick-and-place task but also augment it with AR projections so that others can see what the HMD-wearer is doing to avoid confusion and provide for safety. In this case, the robot need not be present for the programming to take place, as object placement occurs entirely virtually at first. Interactive Spatial Augmented Reality (ISAR) occurs along with virtual kinesthetic teaching (ISAR-HMD).
In Kästner and Lambrecht [25], a large portion of the work focuses on aligning the coordinate systems of the HoloLens and the robot, similar to Reardon et al. [55], both in 2019. After alignment is assured, then sensor data can be visualized, which include the navigation path of the robot that is extracted from the global path planner. Results show a struggle to visualize the large amounts of real-time laser scan data using the HoloLens, a limitation to be addressed in the future. To assist humans in remotely exploring unsafe or inaccessible spaces via UAV, Liu and Shen [69] use a HoloLens to display an autonomous UAV’s “perceived 3D environment” to the human collaborator, while the human can also place spatial targets for the robot. In an attempt to develop an all-inclusive AR system, Corotan and Irgen-Gioro [70] present a combined augmented reality platform for “routing, localization, and object detection” to be used in the autonomous indoor navigation of a ground robot. Other noteworthy recent research presents AR-based methods for programming waypoints and states for robot arms [71,72], as well as for programming robots through learning from demonstration [73] (see Figure 2) and for projecting the intended paths a social robot might take [74].
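One generic mitigation for the real-time visualization bottleneck reported in [25] is to decimate and throttle sensor data before it reaches the HMD; the sketch below is an assumption about how such a filter could look, not a description of the authors’ system.

```python
import time

class ScanThrottle:
    """Drop frames and thin points before forwarding laser scans to the AR view."""

    def __init__(self, keep_every=5, min_period=0.2):
        self.keep_every = keep_every      # keep one point in every `keep_every`
        self.min_period = min_period      # minimum seconds between forwarded scans
        self._last = 0.0

    def filter(self, scan_points):
        now = time.monotonic()
        if now - self._last < self.min_period:
            return None                   # skip this scan entirely
        self._last = now
        return scan_points[::self.keep_every]
```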

4.3. Adding Markers to the Environment to Accommodate AR

One method of making AR easier to implement is to change the surroundings by providing tags, markers, or other additions and alterations. While this requires that the environment can actually be prepared in this way (both that it is physically possible and temporally feasible), these kinds of features can significantly increase the ease of AR implementation. Furthermore, AR markers and tags are generally used to address the problems of placement, labeling, and recognition encountered when using AR technology and aim to increase users’ understanding of the system. Below, we share research that demonstrates these kinds of accommodations.
In Green et al. [75], a human user plans a path for a Lego Mindstorms NXT robot by combining fiducial markers, other graphics, gestures, and natural language, specifically deictics. Paddles with different markers that indicate instructions such as “stop” or “left” provide instructions for the robot, while the robot confirms the human’s plan using natural language responses. AR, specifically using the markers in the environment, allows for a common communication platform between the human and the robot. The exploration of AR for HRC using AR markers continues in Green et al. [14], where the authors set three different experimental conditions for humans navigating a simulated robot through a maze with the use of AR. AR markers are placed in the participant’s physical environment, and the virtual obstacles in the maze are modeled on them.
A similar task of programming a robot to follow a pre-set list of instructions utilizes fiducial markers in Fung et al. [28]. With this handheld AR, labels are displayed in the user’s view, allowing them to match the objects with the provided instructions and then provide direction to the robot.
The title of Hönig et al. [76], “Mixed reality for robotics”, is so generic that it reveals just how new this research area is. The authors’ goal is to show how mixed reality can be used both for simulation and for implementation. One single physical robot is used as a basis for additional virtual robots, and simulation is pitched as a research and development tool. In this study, markers are placed on the robots in the real world to make it easier for the simulation to mimic the motion directly.
AR has been explored for many uses in manufacturing environments, such as in Peake et al. [77], where AR markers are used to overlay objects on the factory floor. The images displayed virtually can be pulled from the cloud and can provide information about machine status and equipment usage.
There are many kinds of uses for AR tags and fiducial markers, and many ways in which the environment can be altered to accommodate the use of augmented reality. Fiducial markers are used in Frank et al. [33] both to denote possible goal locations and to label movable objects, which are to be recognized by the robot and the AR device. This significantly simplifies the recognition aspects, removing that process from the system. In order to locate and orient a ground-based robot in a confined space, Hashimoto et al. [44] label its corners with fiducial markers. This facilitates the control of the robot by a remote user via touchscreen.
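Marker-based setups such as those in [28,33,44] typically rely on off-the-shelf fiducial detection. The sketch below uses OpenCV’s contrib ArUco module with its pre-4.7 function names (newer releases expose an ArucoDetector class instead); it is a generic illustration rather than any of the cited systems.

```python
import cv2

def detect_markers(frame):
    """Detect ArUco fiducials in a camera frame; each returned id can be mapped to a
    goal location, object label, or robot corner as in the systems described above."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _rejected = cv2.aruco.detectMarkers(gray, dictionary)
    return corners, ids
```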

4.4. Manufacturing and Assembly

One domain in which solutions for creating and understanding the human–robot collaborative system are particularly applicable is that of manufacturing and assembly. Specific tasks performed in such environments, and which can benefit from the use of AR, include tool alignment, workspace visualization, safety precautions, procedure display, and task-level programming. Especially over the last 5 years, the manufacturing environment has become a popular research area for AR in HRC.
In a study intended to represent the tasks of a factory robot, Stadler et al. [31] task participants with using a tablet-based AR to teleoperate a Sphero robot in three different activities: tool center point teaching, trajectory teaching, and overlap teaching. The AR tablet provides “task-based support parameters” in the form of shapes, guiding lines, start and end points, and radii. Workload decreases with the tablet-based AR; however, task completion time increases. The authors suggest this could be attributed to the support parameters providing a visible comparison for exactness.
In a robot-assisted assembly scenario, AR shows potential usefulness in multiple ways, such as displaying assembly process information, visualizing robot motion and the workspace, providing real-time alerts, and showing production data [66]. The specific case study applies to the automotive industry, where a COMAU NJ 130 robot works in a cell collocated with a human. A red volume denotes the robot’s workspace, the green volume is safe for the operator, and the current task is shown at the top of a screen. This proof of concept is intended to show the additional safety and efficiency afforded with the use of AR. Also in 2016, ref. [78] applied an “object-aware projection technique” to facilitate robot-assisted manufacturing tasks like the installation of a car door. Projections such as wireframes and warning symbols aid the human in understanding robot intent. Another study intended to improve assembly operations, Materna et al. [39], uses a PR2 robot as the worker’s assistant, helping to prepare the parts for assembly. The worker is aided by AR to create a block program for the robot, see the instructions, view object outlines, and receive information about the state of the system, as well as additional information. Unfortunately, the robot itself is relatively unreliable during the experiment, and other usability issues are also apparent (participants blocking part of the table where the robot should place its parts, or participants intentionally or unintentionally ignoring errors shown via dialog boxes and audio in the system). Future studies should take into consideration these kinds of limitations.
The authors of ref. [77] also work towards implementing AR in a robot-enabled factory, using a mobile device and AR tags to display virtual objects and their expected manipulation by the robot on the factory floor. Research in Guhl et al. [17] takes this concept further by implementing multiple AR modalities that allow a worker to impose movement restrictions, change joint angles, and create programs on the fly for factory robots including the UR 5, COMAU NJ 130, and KR 6.
A seemingly common application of AR for HRC is robotic welding [18,37,60]. The dangers of welding, combined with the accuracy required for welding tasks, are perhaps what make this a potentially useful application. In Reinhart et al. [60], AR is used to assist with programming a remote laser welder, providing the user the capability to define task-level operations. In both Reinhart et al. [60] and Andersen et al. [37], projection-based AR is used to display the weld plan to the user directly on the area to be welded. In Yew et al. [18], however, an HMD displays virtual objects in the user’s field of view so that they can teleoperate a remote welder.
Puljiz et al. [79] draw on the built-in mapping and localization capabilities of the HoloLens to establish safe zones and other areas of interest within a robot cell, rather than relying on an external source. The results presented in the paper show that the mapping can aid in the setup of the robot cell, and the HMD allows for straightforward editing of the map and safety zones. In a different way, Tung et al. [80] show how adding visual workspace divisions can provide significantly more predictability in how a human and robot collaboratively manipulate objects in a tabletop scenario (see Figure 3).

5. Improving the Collaboration

The subsections that follow contain literature that addresses the problem of improving the collaboration between the robot and the human via augmented reality. Research is grouped by the domain of the collaboration, and we examine each domain in terms of its use cases and applications.

5.1. AR for Teleoperation

Beginning with [115] and continuing with [116], robot teleoperation has remained a central problem in human–robot collaboration, for which augmented reality can provide some solutions. The contributions of research using AR for teleoperation are summarized here.
Ito et al. [41] suggest visual overlays for robot-assisted, teleoperated dental work, in yet another example of the use of AR for HRC in the medical field. In this particular case, the work is not performed directly on patients but involves a dental milling machine that prepares tooth crowns. The paper presents the machine itself, with the AR concept being a virtual object superimposed over the actual object while the machine is being operated.
For UAV (unmanned aerial vehicle) control, AR has been shown to improve the situational awareness of the operators and to improve the path choice of the operators during training, as in Hing et al. [81]. (For more on situational awareness evaluation, see Section 6.1.5). Operators are provided with two different types of AR “chase views” that enable them to observe the UAV in the environment. Other teleoperated robots are those operated beneath the surface of the water (ROVs, or remotely operated vehicles, also known as UUVs or unmanned underwater vehicles). Domingues et al. [43] present a virtual diving experience that uses teleoperated ROVs and AR. Riordan et al. [82] showcase real-time mapping and display of subsea environments enabled by UUVs, providing remote teleoperators with a relatively high-resolution live experience of the environment via the combination of technologies presented in the paper.
Another way of assisting a remote operator is by placing them virtually into the environment of the robot as in Krückel et al. [16] so that they can in fact operate egocentrically. An alternative to placing the operator into the entire virtual environment is to use a combination of virtual and real objects to mimic the robot’s workspace, as in Yew et al. [18]. In this example, a maintenance robot is shown virtually in AR, along with some aspects of its surroundings, while prototypes of some of the physical features are also present in the operator’s immediate environment. In this way, tasks such as visual inspection or corrective task execution can be completed remotely via teleoperation.
With the comprehensive system presented in Huy et al. [49], a peripheral/haptic device is used to teleoperate the robot, and information and feedback are shown to the human user via an HMD and laser projection mounted to the mobile ground robot. One feature of the handheld peripheral is a laser pointer that can be used to identify a goal location for the robot; after the operator confirms the choice in AR, the robot moves to that location autonomously.
As the concept of using AR for teleoperation continues to evolve, the designs have become more advanced. In Hedayati et al. [26], three different design methodologies are presented for communicating information to an operator collocated with an aerial robot. This design framework urges the designer to consider how information is presented, whether by (1) augmenting the environment, (2) augmenting the robot, or (3) augmenting the user interface. In the experiment, each of these three interface design implementations proves to be an improvement over the baseline.
Puljiz et al. [22] present a method of generating a six-DOF robot virtually in AR with a HoloLens, and then allowing the user to manipulate the hologram as a form of teleoperation, either in situ or remotely. Similarly, Walker et al. [13] successfully demonstrate the use of “augmented reality virtual surrogates” of aerial robots that can be manipulated using an HMD as a form of teleoperation. In a shared control situation, where a human user with a remote control must grasp an object with a robot arm using an assistive controller, Brooks and Szafir [83] show that AR visualization increases acceptance of assistance and improves the predictability rating but does not affect the perceived usability. There is even evidence that humans in the remote control of robot swarms prefer trajectory information delivered via AR [84].

5.2. Pick-and-Place

While pick-and-place operations are applicable across many of the domains already discussed, such as path planning, manufacturing, and teleoperation, here we highlight problems of pick-and-place in human–robot collaboration as solved by augmented reality, for those interested in this particular body of research.
In Hashimoto et al. [44], a multi-DOF robot arm is mounted to a mobile ground robot, giving the resulting system a total of six DOF. This robot is then teleoperated through a touchscreen AR interface to perform tasks remotely (in another room), such as approaching a bottle, grasping it, and dropping it into the trash. The experiment is designed to determine subjects’ preferred type of interaction with the touchscreen. Unfortunately, these results are somewhat inconclusive, as the study was conducted on a small scale and participants did not show one clear preference.
In Frank et al. [33], a tabletop two-armed robot is controlled via an AR-enabled tablet in a shared space. Different views are provided to the user in a between-subjects study: overhead, robot egocentric, and mobile (using the rear-facing camera on the tablet). Mixed reality is enabled in all of these views to the extent possible with the cameras employed. The pick-and-place task requires users to command the robot to move tabletop objects from one location on the table to their designated bins on the table in front of the robot. Yet again, the results show a relatively equal performance level among participants, regardless of the view provided.
Sibirtseva et al. [50] use verbal commands for a YuMi robot performing object retrieval tasks and investigate the implementation of different visualizations to clarify the requests. In a within-subjects study, three visualization modalities are tested: monitor, which uses an external screen to highlight the potential object; projector, wherein the object is highlighted directly on the workspace; and head-mounted display, where a HoloLens highlights the object virtually in the real world. The system uses a wizard to perform natural language recognition for colors and shapes of the objects; the remainder of the system is designed for the experiment. The authors choose a flat workspace for the experiment, assuming that a more complex workspace or area would essentially bias the results toward an HMD being preferable, due to difficulties with projection and/or occlusions. The claim is that this experiment is intended to compare the three AR modalities as directly as possible, rather than optimize for a specific task. While participants claim that the head-mounted display is more engaging, they generally prefer the projection-based AR.
To investigate the use of “drag-and-drop” in AR to program a UR5 robot arm, Rudorfer et al. [21] test their “Holo Pick-n-Place” method. A user can virtually manipulate an object from one place to another within the HoloLens, and those instructions are then interpreted by the system and sent to the robot. The HoloLens uses object recognition to overlay the virtual CAD models of objects onto the physical objects, which the user can then drag and drop into the desired locations. A proof of concept is presented, and accuracy proves to be limited due to the HoloLens’s limitations in gaze and calibration. The system also does not allow object stacking or placement anywhere other than on one surface. With the release of the HoloLens 2, some of these issues may be resolved in future studies.
In Chacko and Kapila [35], virtual objects are created and manipulated by a human user in AR, and these virtual objects are then used by the robot to optimize a pick and place task. The system allows an estimation of position, orientation, and dimension of an object in physical space that is unknown to the robot, and this information is used by the robot to then manipulate the object. The user also dictates what type of grasping motion to use, with the options being horizontal (objects that can be grasped from above, so as to keep them oriented horizontally) and vertical (objects that can be grasped from the sides, so as to keep them oriented vertically).
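As a rough illustration of the kind of information such an interface must pass from the AR client to the robot, consider the hypothetical data structure below; the field names, units, and types are ours and are not taken from Chacko and Kapila [35].

```python
from dataclasses import dataclass
from enum import Enum

class GraspMode(Enum):
    HORIZONTAL = "grasp from above, keeping the object oriented horizontally"
    VERTICAL = "grasp from the sides, keeping the object oriented vertically"

@dataclass
class VirtualObject:
    """User-authored stand-in for a physical object that is unknown to the robot."""
    position_m: tuple[float, float, float]       # estimated position in the robot frame (m)
    orientation_rpy: tuple[float, float, float]  # estimated roll, pitch, yaw (rad)
    dimensions_m: tuple[float, float, float]     # estimated width, depth, height (m)
    grasp: GraspMode                             # user-selected grasping strategy

# Example: a small box the user outlined in AR, to be grasped from above.
target = VirtualObject(position_m=(0.42, -0.10, 0.03),
                       orientation_rpy=(0.0, 0.0, 1.57),
                       dimensions_m=(0.06, 0.06, 0.04),
                       grasp=GraspMode.HORIZONTAL)
```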
In Bambušek et al. [51], a HoloLens and a touch-enabled table with AR projection are combined to program a robot to perform tabletop pick-and-place tasks. These modalities are compared with kinesthetic teaching, i.e., physically manipulating the robot’s arms. An advantage of this system is that it removes the requirement for the robot to be present during programming, since tasks can be verified in the HoloLens.

5.3. Search and Rescue

Search and rescue operations present a natural application for using AR to facilitate and amplify human–robot collaboration. Dangerous situations can be explored by robots while a human provides guidance, oversight, and even teleoperation from a distance, using the improved situational awareness and nuanced communication enabled by AR. Specific issues that can be addressed by AR in a search and rescue HRC situation include a potentially dynamic and unknown environment, often resulting in the need for visual assistance, as well as the remote communication of essential information about safety, terrain, or location of human and robot agents.
In 2009, Martins and Ventura [85] implemented a rectification algorithm for using an HMD to teleoperate a mobile ground robot. In this application, head movements are tracked and used to tilt the camera or turn the robot; when the user’s head tilts from side to side, the rectification algorithm keeps the remote image aligned with the horizon. Gianni et al. [62] propose a framework for the planning and control of ground robots in rescue environments, in which a human operator uses an AR interface that provides path planning, obstacle avoidance, and a pen-style interaction modality. In 2014, Zalud et al. [86] demonstrated a method of combining color and thermal images in AR, especially for low-visibility use cases such as rescue situations. Four years later, Reardon et al. [24] implemented AR for search and rescue with a ground-based robot (Clearpath Robotics Jackal) using a HoloLens; the advances with this newer technology include vector-style visualization of the robot pose and trajectory and expedited communication of search results.
In Reardon et al. [55], an explorer robot and a human user communicate with each other via an AR HMD, with the key components being an unstructured, uninstrumented environment and bi-directional communication. An autonomous robot searches the environment alongside a human, with the intent of expediting the search over what could be achieved by solely robotic or solely human exploration. The human (via the HMD) and the robot are each equipped with SLAM capability and are able to share their respective information, thus creating a composite map of the area. Furthermore, the AR is used to communicate the robot’s current plan, task state, and future actions, thereby also influencing the choices that the human makes. In an extension of this work, Gregory et al. [48] demonstrate the usefulness of a gesture glove for giving commands to the robot in reconnaissance-style missions. In a pilot study, novice participants use the Manus VR gesture glove and a HoloLens to command the robot in mapping three different environments (subway platform, basement, and office building). Preliminary results show that these tasks can be completed in both line-of-sight and non-line-of-sight operations without extensive training and also highlight the importance of choosing easily articulated gestures. The researchers also note that participants use commands in unanticipated ways, such as issuing a “return” command to move the robot only partially back, in order to then issue a different command from this intermediate location. Reardon et al. [87] demonstrate that an ARHMD can be a suitable method for communicating robot-observed changes in the environment. Their experiment, conducted remotely, provided participants with video of the environment overlaid with AR-provided, circular shaded regions highlighting changed areas; participants were then asked to rate their confidence in these change indicators. While improvements could be made to this method, it proved to be a significant step in implementing this kind of visualization to aid in scene change identification. Taking these techniques a step further, Walker et al. [88] show that an ARHMD can be used to allow emergency responders to quickly visualize an area, for example during firefighting operations, particularly by augmenting images provided by a remote robot.
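As a simplified illustration of the composite-map idea (and not the actual implementation in Reardon et al. [55]), two aligned occupancy grids, one from the robot’s SLAM and one from the HMD’s spatial mapping, can be fused cell-wise; registration between the two coordinate frames is assumed to be handled elsewhere, and the cell conventions below are ROS-style placeholders.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = -1, 0, 100  # occupancy-grid cell conventions (ROS-style)

def fuse_occupancy_grids(robot_grid: np.ndarray, hmd_grid: np.ndarray) -> np.ndarray:
    """Fuse two aligned occupancy grids into a composite map.

    A cell is occupied if either agent saw an obstacle, free if at least one
    agent saw it free and neither saw an obstacle, and unknown otherwise.
    """
    composite = np.full_like(robot_grid, UNKNOWN)
    seen_free = (robot_grid == FREE) | (hmd_grid == FREE)
    seen_occupied = (robot_grid == OCCUPIED) | (hmd_grid == OCCUPIED)
    composite[seen_free] = FREE
    composite[seen_occupied] = OCCUPIED   # obstacle evidence overrides free space
    return composite

# Tiny example: the robot and HMD each cover parts of a 2 x 2 area.
robot = np.array([[FREE, OCCUPIED], [UNKNOWN, FREE]], dtype=np.int8)
hmd = np.array([[FREE, FREE], [FREE, UNKNOWN]], dtype=np.int8)
print(fuse_occupancy_grids(robot, hmd))  # -> [[  0 100] [  0   0]]
```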
Even more recently, Tabrez et al. [89] explored different types of AR communication for joint human–robot search tasks, leveraging techniques from explainable AI, in which insight into a robot’s decision-making is provided in an attempt to improve situational awareness (see Figure 4). Comparing prescriptive guidance, descriptive guidance, and a combined interface, they found that the combination of prescriptive and descriptive guidance led to the highest perceived trust and interpretability and the highest task performance, and made human collaborators act more independently.

5.4. Medical

There are a number of applications of AR for improving human–robot collaboration in robot-assisted dental work, as well as in robot-assisted surgery. Ref. [90] provides an extensive review of AR for robotic-assisted surgery, with a comprehensive list of application paradigms: surgical guidance, interactive surgery planning, port placement, advanced visualization of anatomy, supervised robot motion, sensory substitution, bedside assistance, and skill training. We highlight some of the medical applications here; however, for a full review of AR in robotic-assisted surgery, the reader should refer to Qian et al. [90].
For performing dental work, Ito et al. [41] present visual overlays in AR for a teleoperated, robot-assisted dental milling machine. Virtual objects are superimposed on physical objects, allowing the user to see the trajectory of the cutting tool path as well as a patient’s internal bones.
In situations requiring first aid, experts are often not on site to provide treatment. It is specifically cases like these that Oyama et al. [91] attempt to address with a Remote Behavior Navigation System (RBNS). This system equips a person at the site of the emergency with a camera, microphone, and HMD, while a remote expert views the camera feed and provides directions for care that are mimicked virtually in the HMD. The experiment challenges a participant to construct an arm sling using the RBNS, remotely guided by an expert.
The AR system presented in Filippeschi et al. [92] is a complete system for remote palpation (examination by touch) in the case where a patient and a doctor are not collocated. Both visual and haptic feedback are provided to the doctor, and the patient is in view of an RGBD camera.
For assistance both before and during surgery, Adagolodjo et al. [93] develop an AR system for visualizing tumors and blood vessels around the surgery site. Approximate 3D pose information is obtained from 2D silhouettes, suggesting that this method could be useful for planning surgical operations. Similarly, in Zevallos et al. [94], AR is used to show the shape and location of tumors by visually overlaying that information onto the actual organ, in an effort to assist surgeons. In this example, the surgeons use the da Vinci Research Kit (dVRK), a robotic surgery assistant. A system is presented to autonomously locate the tumor, provide stiffness and related information about it, and then overlay the information on a model of the affected organ for display to the user. Another application for surgery comes from Qian et al. [27], where the First Assistant is provided with a HoloLens that aids them with instrument insertion and tool manipulation while using the da Vinci robotic surgery assistant. Experimental results show potential improvement in efficiency, safety, and hand-eye coordination.
Elsdon and Demiris [23] use a HoloLens and a “spray robot” for the dosed application of topical medication. Because sprayed dosage is difficult to visualize, the density is visualized virtually, and the Actuated Spray Robot offers three different modes: manual (the user must pull the trigger and move the sprayer), semi-automatic (the trigger is actuated automatically, but the user must move the spray head), and autonomous (both the trigger and the head articulation are automated). A more even density (greater accuracy) is achieved with both the semi-automatic and autonomous modes than with manual spraying, although the manual mode is fastest. The experimenters speculate that because the two automated modes do not allow mistakes to be made, participants may tend toward perfection in those modes, increasing the time spent on the task. This technology could also be applicable in manufacturing, for paint and other coatings requiring a spray application.

5.5. Space

Space applications pose challenging problems, especially as work sites reach farther and farther from Earth. Any teleoperation must account for the time delays imposed by these long communication distances, a problem explored deeply by [95]. Xia et al. [96] attempt to work within these constraints by using augmented reality to help simulate the time delay for a remote operator. Via AR, different virtual fixtures are tested to aid the operator, both with and without a time delay. Virtual line fixtures are the best option with or without the delay, while virtual planes reduce the task time to less than one-third of the unassisted task under a time delay. The design of this experiment, while applied here to satellite repair, is derived from medical applications and could transfer back to that domain as well, especially as it relates to medical care during space travel.
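To give a sense of what a virtual fixture does, the sketch below projects an operator’s commanded position onto a line fixture before it is sent to the remote robot. This is a generic formulation, not the implementation in Xia et al. [96], and the “soft” fixture controlled by a stiffness parameter is our own simplification.

```python
import numpy as np

def apply_line_fixture(commanded_point, line_point, line_direction, stiffness=1.0):
    """Constrain an operator's commanded position to a virtual line fixture.

    stiffness = 1.0 snaps the command fully onto the line; values in (0, 1)
    only attract the command toward it (a 'soft' fixture).
    """
    d = np.asarray(line_direction, dtype=float)
    d = d / np.linalg.norm(d)
    p = np.asarray(commanded_point, dtype=float)
    a = np.asarray(line_point, dtype=float)
    on_line = a + np.dot(p - a, d) * d          # closest point on the fixture line
    return (1.0 - stiffness) * p + stiffness * on_line

# Example: a command at (1, 1, 0) near a fixture along the x-axis through the origin.
print(apply_line_fixture([1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # -> [1. 0. 0.]
```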
Somewhat surprisingly, the literature on AR for HRC in space applications is sparse, and most of what we found concerns remote teleoperation rather than collocation. We speculate that this could be due to a combination of factors. Most importantly, humans are currently present in space only in low Earth orbit, on the International Space Station or on brief launches in relatively small spacecraft. While some robots exist in these locations, the opportunities for incorporating AR into their use have been limited. Furthermore, the time delay in communicating with remote robotic spacecraft and rovers, such as the Mars Exploration Rovers (Spirit and Opportunity) or the Mars Science Laboratory (Curiosity), prohibits convenient real-time HRC. Thus, more of the research related to these kinds of collaboration features virtual reality or augmented virtuality instead. With upcoming missions due to land humans on the Moon, and eventually on Mars, this is an area rich for future research.

5.6. Safety and Ownership of Space

The collaboration problem of indicating to humans whether a space is safe to traverse, whether space is “owned” by the robot, or whether it is otherwise occupied or available has been explored in a number of different studies. As mentioned above in Section 4.1, the work in Bolano et al. [40] displays to users the intended goal locations, paths, and swept volumes of the robot and its end effector. The technology in Sprute et al. [34] provides a human with the ability to restrict a robot’s workspace by drawing on a tablet in AR. In Makris et al. [66], shaded rectangular prisms in a human’s AR HMD denote the “safety volume” in green and the “robot’s working area” in red. Alternately, in Frank et al. [33], red shaded areas of the working plane indicate prohibited regions for the robot, and green shaded areas indicate allowable regions that the robot can reach. Puljiz et al. [79] also highlight the ability to denote safety zones using their HMD-based mapping and interaction methods in a robot work cell in a manufacturing environment. New work in spatial ownership during collocated activities also shows that AR-delivered visualizations alone are insufficient for achieving human compliance with robot instructions, even in a high-risk environment where humans are in close proximity to potentially dangerous airborne robots [97] (see Figure 5).
Notably, the use of green and red seems mostly dependent on whether the human is teleoperating, programming, or otherwise controlling the robot (in which case green indicates areas they are allowed to move the robot into), or whether they are performing a task in parallel (in which case green indicates areas where they are safe from the robot).
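This convention can be summarized in a small, purely hypothetical helper; the mode names and wording below are ours and simply encode the pattern observed across the surveyed systems.

```python
from enum import Enum

class InteractionMode(Enum):
    CONTROLLING = "human is teleoperating or programming the robot"
    PARALLEL = "human performs a task alongside an autonomous robot"

def zone_colors(mode: InteractionMode) -> dict:
    """Map an interaction mode to the green/red zone semantics observed in the literature."""
    if mode is InteractionMode.CONTROLLING:
        return {"green": "regions the robot may be moved into",
                "red": "regions prohibited to the robot"}
    return {"green": "regions where the human is safe from the robot",
            "red": "robot workspace or swept volume"}

print(zone_colors(InteractionMode.PARALLEL))
```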

5.7. Other Applications

While somewhat unconventional, the following applications provide unique and creative perspectives on the possibilities for implementing AR for HRC. These researchers push the boundaries of what makes for a good AR/HRC combination, and we include their unconventional perspectives with the intent of inspiring future work envisioning such systems. These works ask questions like “How can we make this something that might be useful every day?” and “What do people think about incorporating robots and AR into their daily activities?”.
In Ro et al. [98], a robot is presented as a museum docent that uses projection-based AR to share information with human visitors. Applications for this technology might also expand past museums to malls and city streets, or even classrooms.
Mavridis and Hanson [99] designed the Ibn Sina (Avicenna) theater installation to integrate humans and technology and to provide a place for art, research, and education to come together. The stage is outfitted with sensors and is occupied by a humanoid robot along with humans. Though not yet fully implemented, the theater is intended to be interactive and to be equipped with a screen, lights, and audio and video systems, enabling holograms and interaction.
Anticipating future restaurant applications, Pereira et al. [100] present a fast food robot waiter system in a Wizard-of-Oz study. Participants in a within-subjects study teleoperate the robot either solo or with a partner, using a headset and joysticks.
Omidshafiei et al. [101] outline the usefulness of AR when prototyping and testing algorithms. By combining physical and virtual robots in an augmented environment via the use of projection AR, motion capture, and cameras, different systems can be tested and evaluated in full view of the researchers, and without the risks involved in deploying them in the outside world.
Another nascent research area for AR-based HRC is Socially Assistive Robot tutoring, as in Mahajan et al. [102]. In this study, the researchers assess common 2D usability metrics, such as performance, manipulation time, and gaze, and their correlation with usability scores from the System Usability Scale (SUS) survey. During an AR-assisted programming task, they find a positive correlation of usability with gaze, but not with manipulation time or performance.

6. Evaluation Strategies and Methods

In general, we are all working toward developing something “better”. What we mean by “better”, however, can have vastly different definitions based on the context and the intent. Better could be faster, more efficient, more direct, safer, with higher fluency, with greater situational awareness, or many other possibilities. In order to evaluate whether something is better, both objective and subjective measures can be made via multiple kinds of evaluations. These evaluations and measures are the subject of this section.
Because there are many aspects to evaluation, here we take a few different approaches. First, we highlight some instruments and questionnaires that have been used in evaluating AR for HRC. Then we discuss the choice to conduct extensive user studies, pilot testing, or only proof-of-concept testing, and the value of each of these options, as well as considerations for recruiting participants.

6.1. Instruments, Questionnaires, and Techniques

6.1.1. NASA Task Load Index (TLX)

The NASA Task Load Index, or NASA TLX [103], is perhaps one of the most widespread instruments for assessing AR for human–robot collaboration [23,31,33,35,39,51,67]. Created by Hart and Staveland in 1988 [117], the NASA TLX assesses workload on six scales [103]: Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, and Frustration. The instrument is now available in both paper-and-pencil and mobile app formats [103], making it easy for the experimenter to deploy and for the subject to use.
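As an illustration of how the six subscale ratings can be aggregated, the sketch below computes the unweighted “Raw TLX” score, a common simplification of the original pairwise-weighted procedure; it assumes ratings on the 0–100 scale, and the function name and dictionary keys are ours rather than part of the official instrument.

```python
TLX_SCALES = ("Mental Demand", "Physical Demand", "Temporal Demand",
              "Performance", "Effort", "Frustration")

def raw_tlx(ratings: dict[str, float]) -> float:
    """Unweighted (Raw TLX) workload score: the mean of the six subscale
    ratings, each on a 0-100 scale (higher = greater workload)."""
    missing = set(TLX_SCALES) - set(ratings)
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return sum(ratings[s] for s in TLX_SCALES) / len(TLX_SCALES)

# Example participant after an AR-assisted teleoperation task.
print(raw_tlx({"Mental Demand": 55, "Physical Demand": 20, "Temporal Demand": 40,
               "Performance": 25, "Effort": 50, "Frustration": 30}))  # -> 36.67
```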

6.1.2. Godspeed Questionnaire Series (GQS)

The Godspeed Questionnaire Series [104,105] was developed by Bartneck et al. in 2009 as a way to measure “anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots”. Each of these five areas contains three to six Likert-type scales on which to rate the robot. This questionnaire was used to measure the “perception of an artificial embodied agent” in Rotsidis et al. [36], while in Williams et al. [56], only the Likability section was utilized.

6.1.3. User Experience Questionnaire (UEQ)

Both Bambušek et al. [51] and Kapinus et al. [68] utilized the User Experience Questionnaire (UEQ) [118] as part of their evaluations. The UEQ is a 26-item assessment; each item is ranked on a seven-point scale. The results rate the product being evaluated on six separate scales: attractiveness, perspicuity, efficiency, dependability, stimulation, and novelty.

6.1.4. System Usability Scale (SUS)

One measure of usability that a number of studies [39,51,71,83,102] have utilized is the System Usability Scale (SUS) [106], which quantifies a somewhat qualitative element of a design or technology. The SUS consists of 10 statements that users rank on a scale of 1 to 5, from strongly disagree to strongly agree. Example statements include “I think that I would like to use this system frequently” and “I found the system very cumbersome to use”. To obtain the total SUS score, subtract 1 from each odd-numbered response and subtract each even-numbered response from 5, add these values together, and then multiply the total by 2.5. This yields a score in the range of 0 to 100.
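The scoring procedure above translates directly into code. Below is a minimal sketch, assuming the ten responses are recorded as integers from 1 to 5 in questionnaire item order; the function name and validation are ours.

```python
def sus_score(responses: list[int]) -> float:
    """Compute the System Usability Scale score (0-100) from ten
    responses, each an integer 1-5 in questionnaire item order."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS requires ten responses between 1 and 5")
    contributions = []
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:          # odd-numbered items: response - 1
            contributions.append(response - 1)
        else:                             # even-numbered items: 5 - response
            contributions.append(5 - response)
    return sum(contributions) * 2.5

# A participant who strongly agrees with the positive items and strongly
# disagrees with the negative ones yields the maximum score of 100.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # -> 100.0
```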

6.1.5. Situational Awareness Evaluation

A common claim is that AR lends itself to increasing the user’s situational awareness, or SA. Many papers in this survey claim to evaluate situational awareness [18,33,82,90,119,120,121], but few actually have a way to evaluate it [24,26,55,81]. Endsley [107] defines situation awareness as “the pilot’s internal model of the world around him [sic] at any point in time”, what roboticists might refer to as a mental model. Specifically, a version of the Situational Awareness Global Assessment Technique (SAGAT) developed by Endsley [107] is used in Srinivasan and Schilling [119]. The SAGAT was developed in 1988 (interestingly, coinciding with the original publication of the NASA TLX) to assess aircraft designs for pilots’ situational awareness. Scholtz et al. adapted the SAGAT in 2004 for (semi-)autonomous vehicles (“robotic vehicles”) and human–robot interaction, specifically the “supervisory role” that humans play in this situation [122,123]. In the original SAGAT, the experiment is paused at various points throughout the study, and during these pauses the pilot/subject is asked a series of questions intended to assess their awareness of aspects of the current situation. The evaluation is given via computer to allow for randomized questions as well as rapid response input, and a composite score is computed from the total responses. It is important to note that SAGAT is a technique and not a specific instrument or questionnaire: the particular questions asked during each pause are entirely dependent on the environment in which SA is being evaluated.
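To make the technique concrete, the following sketch outlines one way a SAGAT-style probe could be administered and scored; the questions, pause structure, and scoring are placeholder assumptions of ours and are not drawn from any one surveyed study.

```python
import random

def run_sagat_probe(question_bank, ground_truth, num_questions=3, rng=None):
    """Administer one SAGAT probe: freeze the task, ask a random subset of
    situation-awareness questions, and score answers against ground truth.

    question_bank: {question_id: question_text}
    ground_truth:  {question_id: correct_answer} captured at the moment of the pause
    """
    rng = rng or random.Random()
    asked = rng.sample(sorted(question_bank), k=min(num_questions, len(question_bank)))
    correct = 0
    for qid in asked:
        answer = input(question_bank[qid] + " ")   # rapid response entry
        if answer.strip().lower() == str(ground_truth[qid]).strip().lower():
            correct += 1
    # Probe-level SA score; averaging over all probes gives a composite score.
    return correct / len(asked)

# Example probe for an AR-assisted search task (placeholder questions).
questions = {1: "Which room is the robot currently in?",
             2: "How many targets have been found so far?",
             3: "Is the robot's battery above 50%? (yes/no)"}
truth = {1: "basement", 2: "2", 3: "yes"}
# score = run_sagat_probe(questions, truth)
```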

6.1.6. Task-Specific Evaluations

When planning a user study, researchers should conduct a thorough search for existing instruments suited to their technology’s particular use case.
For example, in testing the functionality of an AR design to be used by robotic wheelchair operators, Zolotas et al. [19] chose skills from the Wheelchair Skills Test, version 4.2 [124,125]. The most current version of this manual is now version 5.1 [108], and it contains the specifics of the Wheelchair Skills Test, or WST, with individual skills, a questionnaire (WST-Q), and training. Examples of the skills assessed include turn while moving forwards (90°), turn while moving backwards (90°), and get over threshold (2 cm). Because there is an established test and instrument for these kinds of skills, it follows that the WST and WST-Q would be used to evaluate an AR system intended to assist robotic wheelchair users.

6.1.7. Comprehensive Evaluation Designs

Experiments in Kalpagam Ganesan et al. [38] utilize “questionnaire items…inspired and adopted from Hoffman [126] [since updated in Hoffman [109]], Gombolay et al. [110], and Dragan et al. [111]”. Here, we discuss why these three works present ideal fodder for comprehensive questionnaires.
In Hoffman [109], Hoffman defines fluency in HRI and then presents metrics for measuring fluency. In defining fluency, he states that
when humans collaborate on a shared activity, and especially when they are accustomed to the task and to each other, they can reach a high level of coordination, resulting in a well-synchronized meshing of their actions. Their timing is precise and efficient, they alter their plans and actions appropriately and dynamically, and this behavior emerges often without exchanging much verbal information. We denote this quality of interaction the fluency of the shared activity.
Hoffman also clarifies that fluency is distinct from efficiency, and that people can perceive increased fluency even without an improvement in efficiency. These fluency measures include both objective (for example, percentage of total time that both human and robot act concurrently) and subjective metrics (for example, scale ratings of trust and improvement).
Both Gombolay et al. [110] and Dragan et al. [111] draw substantially from the measures presented in Hoffman [109]. Ref. [110] uses 13 questionnaire items from the subjective metrics in Hoffman [126] and augments this list with 8 of their own “Additional Measures of Team Fluency”, focused on the human’s satisfaction with the teamwork. Ref. [111] uses both objective and subjective measures from Hoffman [109] and adds items related to closeness, predictability, and legibility.
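As one concrete example, the objective fluency metric of concurrent activity mentioned above can be computed from logged activity intervals. The sketch below is a generic implementation under the assumption that each agent’s activity is recorded as non-overlapping (start, end) timestamps in seconds; it is not taken from the tooling in Hoffman [109].

```python
def concurrent_activity_fraction(human_intervals, robot_intervals, task_duration):
    """Fraction of total task time during which the human and robot act
    concurrently (one of the objective fluency metrics described above).

    Intervals are (start, end) timestamps in seconds, non-overlapping within
    each agent; task_duration is the total task time in seconds.
    """
    overlap = 0.0
    for h_start, h_end in human_intervals:
        for r_start, r_end in robot_intervals:
            overlap += max(0.0, min(h_end, r_end) - max(h_start, r_start))
    return overlap / task_duration

# Example: human acts 0-10 s and 20-30 s; robot acts 5-25 s; the task lasts 30 s.
print(concurrent_activity_fraction([(0, 10), (20, 30)], [(5, 25)], 30.0))  # -> 0.333...
```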
We recognize that none of the studies that Kalpagam Ganesan et al. [38] draw from are necessarily related to the use of augmented reality for human–robot collaboration. However, their relevance and appropriateness are apparent, and these measures can easily be used in combination with other metrics specific to AR.

6.2. The Choice to Conduct User/Usability Testing

Three main themes in testing and evaluation emerge from the papers reviewed. (1) Pilot testing provides a way to verify that research, technology, or evaluation is headed in the right direction, or to determine certain specifics about a subsequent evaluation. (2) Proof of concept experiments or prototypes can demonstrate that a particular technology can in fact be implemented and might also highlight additional directions to take the research. (3) User or usability testing provides the researchers with feedback and data on their current designs; the better the participant pool (again, note that “better” is a loaded word here), the more trust they can typically have in their results. We look more deeply at each of these three themes in this section.

6.2.1. Pilot Testing as Verification

Some studies use a pilot test to inform a larger-scale test described in the same paper. In Qian et al. [27], where the authors present a form of AR to assist a surgeon’s First Assistant with the da Vinci robotic manipulator, they first perform a pilot test with three surgeons. After this initial evaluation, and using feedback from the pilot subjects, they conduct an n = 20 user study. Ref. [67] briefly mentions an initial pilot study to evaluate whether pointing and head gaze were natural modes of selection for a user, before explaining their more thorough n = 16 user study. In Sibirtseva et al. [50], a human–human pilot study is conducted (n = 10), in which data are collected on the vocabulary used between human partners to describe Lego objects. Informed by this pilot, the authors decide to resort to a wizarded system for the natural language processing portion of their experimental setup.
Alternately, other studies present only a pilot test and then address how it might inform future, larger-scale testing. Ref. [112] reports on a pilot study (n = 10) that requires users to complete two tasks in two different conditions: the experimental condition of a “proposed AR-robotic interface” and a gamepad. The authors then proceed to discuss a case study in which the technology is applied to the process of carbon-fiber-reinforced-polymer production and pilot-tested on one user. To evaluate the design of an AR HMD for wheelchair users, ref. [19] runs a between-subjects pilot test with 16 participants, who each navigate a route 4 separate times, either with or without the AR visual assistance. All of these results can inform future iterations of the design. In Yew et al. [18], a pilot test of the authors’ prototype shows that combining virtual objects with in situ spaces can work for the teleoperation of robots. Tasks are completed by the novice users (n = 5) in a short amount of time, setting the stage for future evaluations and also revealing areas for improvement of the design (tracking sensors and algorithms, depth sensors for unforeseen hazards).

6.2.2. Usability Testing

Throughout this paper, there have been examples of numerous studies that conduct full usability or user testing; some highly cited examples include [11,26,53]. Commonalities among these experiments include a relatively high number of participants and a thoroughly and intentionally designed study. In all of these examples, participants take part in the study in person. Another option is to perform testing with Amazon Mechanical Turk (MTurk) users who view videos or simulations of the system. MTurk can greatly expand the number of subjects; however, its limitations include the indirect mode of interaction and the kinds of participants who can be recruited.

6.2.3. Proof-of-Concept Experiments

The two kinds of evaluation presented in Section 6.2.1 and Section 6.2.2 are both intended to gather objective data (for example, how long a task takes to complete or where there is overlap in the duties of the human and the robot) as well as subjective data (for example, whether the human user understood a command or preferred a certain type of interface). Meanwhile, other published experiments show that a technology can indeed be implemented in a certain way, with the intent of solving a particular problem. One example of this kind of experiment is Reardon et al. [55]. In this work, the authors thoroughly document how they implemented an AR display for assisting a human user while collaboratively exploring a potentially dangerous space with a ground-based robot. They combined an understanding of cooperative exploration with complete integration of the robot’s and human’s points of view and augmented this with additional data provided to the human by the robot. In the experiments described, the system successfully performed all necessary tasks.
Other examples of a proof-of-concept study include a generalized AR system that is laid out for human operators working with assembly line robots in automotive manufacturing [66], an AR/VR system in collaboration with an ROV designed to enable virtual SCUBA diving [43], virtual drag-and-drop programming of a robot arm for a pick-and-place task [21], robotic-assisted masking of areas for mechanical repairs [113], a system for the AR-enabled online programming of industrial robots including motion and hand gesture tracking [29], an architecture for implementing AR for programming robots using multiple modalities in industrial settings [17], and the use of built-in mapping functionality in a HoloLens to establish the working environment for a robot arm in a work cell [79].

6.2.4. Choosing the Type of Evaluation to Conduct

How does one choose the right kind of evaluation for a particular technology or study? Elements to consider include (a) how far along the technology is in its development, (b) how many test subjects it would take to validate or evaluate the design, (c) whether the technology is safe for human subjects, and (d) what research questions are being asked. Sometimes, a pilot study may be warranted to obtain additional details before proceeding. In other cases, only the technology needs to be showcased, and extensive user testing is not necessary. If the researchers are attempting to show increased usability, safety, or fluency, a full-scale human subjects experiment will be necessary. We recommend starting by examining the goals of the evaluation, for example, framing it in terms of one of the previous three sections (pilot testing, usability testing, or proof of concept). From there, similar studies with comparable intents can be referenced. Informed by this survey and prior work, the researcher can choose appropriate instruments or evaluation techniques for their own purposes.

6.2.5. Recruiting Participants for Human Subjects Studies

We would also like to address the issue of recruiting participants for user studies. There are multiple factors to consider, all related to diversity in the participant pool, which we enumerate here.
  • Diversity in experience. Novice participants are often recruited from the local university student population out of convenience. Researchers should consider whether recruiting experienced or trained participants (who might be experts or professionals in the tasks being performed) might benefit their study.
  • Diversity in age. Again, if the participants are mostly recruited from one age group, such as university undergraduates or employees of one group at a company, their prior experiences may prove to be somewhat uniform. As technology continues to advance rapidly, participants of different ages will inevitably have varied technological literacy. Researchers should consider the impact this might have on their results and what they are seeking to learn from the study.
  • Diversity in gender, race, and ethnicity. User study participants should be recruited to reflect the population as a whole (see Palmer and Burchard [127]). As with the prior items in this list, participant populations that are not representative can affect the usefulness of the results.
Most importantly, researchers must acknowledge in any publications the shortcomings of their participant population. Demographic and other relevant information about participants can help clarify what these gaps might be and allow for critical reflection on whether they could have affected any results.

7. Future Work

The field of augmented reality for human–robot collaboration is vast. One can examine the suitability of various AR technologies for an HRC task, the design of the AR interfaces, the user experience, the comfort, and the safety. We can ask questions about what humans are capable of, how the human and the robot can work together or separately, how much the human should be asked to do, or how they should be asked to do it. Alternately, we can ask questions about what the robot can do, how the robot should be instructed or programmed, and what levels of tasks it can perform. At a system level, we can design systems that seamlessly integrate a human, robot, and AR device; we can examine behaviors of systems in all kinds of environments, indoors and outdoors; we can evaluate how well the systems function either remotely or in situ. The 2020 Robotics Roadmap [128] assembled by a consortium of universities in the US lays out some specific current challenges for human–robot interaction, including accessible platforms, datasets, and evaluation. All of the works presented here take various perspectives on these questions and more. However, as with all research areas, there is still much to explore. Here we will touch upon a few key areas that are calling for innovation and improvement.
In many ways, the field will continue to evolve with the maturation of augmented reality technology, including the next generations of head-mounted displays, improved handheld AR, and possibly even innovations to projection-based AR. As recounted in Puljiz et al. [22], issues with segmentation demonstrate the need for improvement in AR capabilities with regard to skin color, limb, and gesture recognition; AR must be able to work in all kinds of environments, regardless of lighting, background, or the user’s skin color, in order to be effective. Furthermore, in Kästner and Lambrecht [25], the main limitations stem from the constant visualization of real-time data, especially the laser scan data for position and obstacle tracking. These difficulties demonstrate the current processing and visualization limitations of AR technology.
AR technology has also been described as bulky [38], cumbersome [129], and limited in field of view [19,27,50,130,131]. All of these issues present opportunities for improvement in the AR technology itself.
The collaboration of HRI researchers with those developing cutting-edge user interfaces should also be emphasized. To obtain accurate and meaningful results from user studies, AR interfaces must utilize established principles of design for accessibility and functionality. In Stadler et al. [31], the authors suspected that because of an excess of detailed information provided through AR, users actually took more time to complete a task that should have been faster with the help of the AR display. Questions such as “What is the appropriate level of information to provide to someone performing an AR-assisted task?” could be asked of a UI designer and incorporated into future work.

7.1. Robots and Systems Designed to Be Collaborative

The works included in this review typically utilize one robot (ground-based, robotic arm, aerial, underwater, or humanoid) in collaboration with one human. The robots are designed for a variety of purposes: to be universal manipulators, drive over smooth or rough terrain, or easily navigate in three-dimensional space. But not all of these robots are designed expressly for the purpose of working in close collaboration with humans; some were chosen based on their ease of manipulation in a programming-by-demonstration task or their safety features. However, what happens when we design from the outset for the possibility that a human might be working in close proximity? What kinds of features can we innovate to ensure the person’s safety as well as ensure that the robot completes its task? How might such a robot behave? And what might this collaboration look like in different environments?

7.2. Humans as Compliant Teammates

Much work exists that explores the role of the human as the director, manager, or overall controller. But what if we turned this idea on its head and made the human a vital component on a robot-driven team? What if AR was utilized to direct one or more humans in a collaborative task with one or more robots? What if we were able to easily expand past the currently typical robot–human dyad, which the vast majority of the works surveyed here involved?
Furthermore, we are continuing to think of these as human–robot teams. The goal is not to replace human workers altogether, but to utilize the strengths and intelligences of both humans and robots to increase productivity and efficiency. How can we make both humans and robots more productive by teaming them together? As Reardon et al. [55] point out, we want to “influence the human’s model of the robot’s knowledge and behavior, and shape the human’s performance. In this way, we treat the human and robot teammates as peer members of the cooperative team, and seek to influence each through information communication”.

7.3. Evaluation

In Section 6, we summarize different methods of evaluating a technology and measuring improvements. However, it is also obvious how much room for innovation there is in this particular area. There are very few standardized, validated, and widely used instruments. Pick-and-place and other manufacturing-related tasks are also prevalent in the literature, yet few evaluation methods are alike, making it difficult to compare across different studies. Greater collaboration among researchers could yield some semi-universally accepted evaluations for typical AR for HRC tasks, such as teleoperation (both remote and in situ), aerial robot piloting and communication, or pick-and-place tasks.

8. Conclusions

We are thinking ahead to a future when robots will be able to plan and execute even more efficiently than they can at present and when augmented reality is an unobtrusive and fluid method of interaction regardless of modality. What happens when the human is no longer omniscient and the robot is making decisions without the human in the loop? How can we ensure the human feels they are part of the system and that they simultaneously remain safe in the presence of robots? Augmented reality will only continue to mature into a more accessible technology, and its role in human–robot collaboration can become much more impactful and relevant to many different domains.

Author Contributions

Conceptualization, C.T.C. and B.H.; methodology, C.T.C. and B.H.; writing—original draft preparation, C.T.C.; writing—review and editing, C.T.C. and B.H.; supervision, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by the Draper Scholar Program. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Draper.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AR   Augmented Reality
HRI  Human–Robot Interaction
HRC  Human–Robot Collaboration

References

  1. Milgram, P.; Zhai, S.; Drascic, D.; Grodski, J. Applications of augmented reality for human-robot communication. In Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS ’93), Yokohama, Japan, 26–30 July 1993; Volume 3, pp. 1467–1472. [Google Scholar] [CrossRef]
  2. Magic Leap, Inc. Magic Leap 1. 2018. Available online: https://www.magicleap.com/devices-ml1 (accessed on 14 July 2020).
  3. Microsoft HoloLens | Mixed Reality Technology for Business. 2020. Available online: https://www.microsoft.com/zh-cn/ (accessed on 14 July 2020).
  4. Green, S.A.; Billinghurst, M.; Chen, X.; Chase, J.G. Human-Robot Collaboration: A Literature Review and Augmented Reality Approach in Design. Int. J. Adv. Robot. Syst. 2008, 5. [Google Scholar] [CrossRef]
  5. Williams, T.; Szafir, D.; Chakraborti, T.; Ben Amor, H. Virtual, Augmented, and Mixed Reality for Human-Robot Interaction. In Proceedings of the Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, Chicago, IL, USA, 5–8 March 2018; HRI ’18. pp. 403–404. [Google Scholar] [CrossRef]
  6. Williams, T.; Szafir, D.; Chakraborti, T.; Soh Khim, O.; Rosen, E.; Booth, S.; Groechel, T. Virtual, Augmented, and Mixed Reality for Human-Robot Interaction (VAM-HRI). In Proceedings of the Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, New York, NY, USA, 23–26 March 2020; HRI ’20. pp. 663–664. [Google Scholar] [CrossRef]
  7. Rosen, E.; Groechel, T.; Walker, M.E.; Chang, C.T.; Forde, J.Z. Virtual, Augmented, and Mixed Reality for Human-Robot Interaction (VAM-HRI). In Proceedings of the Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, Association for Computing Machinery, Boulder, CO, USA, 8–11 March 2021; HRI ’21 Companion. pp. 721–723. [Google Scholar] [CrossRef]
  8. Chang, C.T.; Rosen, E.; Groechel, T.R.; Walker, M.; Forde, J.Z. Virtual, Augmented, and Mixed Reality for HRI (VAM-HRI). In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, Sapporo, Japan, 7–10 March 2022; HRI ’22. pp. 1237–1240. [Google Scholar]
  9. Wozniak, M.; Chang, C.T.; Luebbers, M.B.; Ikeda, B.; Walker, M.; Rosen, E.; Groechel, T.R. Virtual, Augmented, and Mixed Reality for Human-Robot Interaction (VAM-HRI). In Proceedings of the Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, New York, NY, USA, 13–16 March 2023; HRI ’23. pp. 938–940. [Google Scholar] [CrossRef]
  10. Chestnutt, J.; Nishiwaki, K.; Kuffner, J.; Kagamiy, S. Interactive control of humanoid navigation. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 10–15 October 2009; pp. 3519–3524. [Google Scholar] [CrossRef]
  11. Walker, M.; Hedayati, H.; Lee, J.; Szafir, D. Communicating Robot Motion Intent with Augmented Reality. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, Chicago, IL, USA, 5–8 March 2018; HRI ’18. pp. 316–324. [Google Scholar] [CrossRef]
  12. Zolotas, M.; Demiris, Y. Towards Explainable Shared Control using Augmented Reality. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 3020–3026. [Google Scholar] [CrossRef]
  13. Walker, M.E.; Hedayati, H.; Szafir, D. Robot teleoperation with augmented reality virtual surrogates. In Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction, Daegu, Republic of Korea, 11–14 March 2019; HRI ’19. pp. 202–210. [Google Scholar]
  14. Green, S.A.; Chase, J.G.; Chen, X.; Billinghurst, M. Evaluating the augmented reality human-robot collaboration system. Int. J. Intell. Syst. Technol. Appl. 2009, 8, 130–143. [Google Scholar] [CrossRef]
  15. Oyama, E.; Shiroma, N.; Niwa, M.; Watanabe, N.; Shinoda, S.; Omori, T.; Suzuki, N. Hybrid head mounted/surround display for telexistence/telepresence and behavior navigation. In Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Linköping, Sweden, 21–26 October 2013; pp. 1–6. [Google Scholar] [CrossRef]
  16. Krückel, K.; Nolden, F.; Ferrein, A.; Scholl, I. Intuitive visual teleoperation for UGVs using free-look augmented reality displays. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 4412–4417. [Google Scholar] [CrossRef]
  17. Guhl, J.; Tung, S.; Kruger, J. Concept and architecture for programming industrial robots using augmented reality with mobile devices like microsoft HoloLens. In Proceedings of the 2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Limassol, Cyprus, 12–15 September 2017; pp. 1–4. [Google Scholar] [CrossRef]
  18. Yew, A.W.W.; Ong, S.K.; Nee, A.Y.C. Immersive Augmented Reality Environment for the Teleoperation of Maintenance Robots. Procedia CIRP 2017, 61, 305–310. [Google Scholar] [CrossRef]
  19. Zolotas, M.; Elsdon, J.; Demiris, Y. Head-Mounted Augmented Reality for Explainable Robotic Wheelchair Assistance. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1823–1829. [Google Scholar] [CrossRef]
  20. Chacón-Quesada, R.; Demiris, Y. Augmented Reality Controlled Smart Wheelchair Using Dynamic Signifiers for Affordance Representation. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 4812–4818. [Google Scholar] [CrossRef]
  21. Rudorfer, M.; Guhl, J.; Hoffmann, P.; Krüger, J. Holo Pick’n’Place. In Proceedings of the 2018 IEEE 23rd International Conference on Emerging Technologies and Factory Automation (ETFA), Turin, Italy, 4–7 September 2018; Volume 1, pp. 1219–1222. [Google Scholar] [CrossRef]
  22. Puljiz, D.; Stöhr, E.; Riesterer, K.S.; Hein, B.; Kröger, T. General Hand Guidance Framework using Microsoft HoloLens. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 5185–5190. [Google Scholar] [CrossRef]
  23. Elsdon, J.; Demiris, Y. Augmented Reality for Feedback in a Shared Control Spraying Task. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 1939–1946. [Google Scholar] [CrossRef]
  24. Reardon, C.; Lee, K.; Fink, J. Come See This! Augmented Reality to Enable Human-Robot Cooperative Search. In Proceedings of the 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Philadelphia, PA, USA, 6–8 August 2018; pp. 1–7. [Google Scholar] [CrossRef]
  25. Kästner, L.; Lambrecht, J. Augmented-Reality-Based Visualization of Navigation Data of Mobile Robots on the Microsoft Hololens—Possibilities and Limitations. In Proceedings of the 2019 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE Conference on Robotics, Automation and Mechatronics (RAM), Bangkok, Thailand, 18–20 November 2019; pp. 344–349. [Google Scholar] [CrossRef]
  26. Hedayati, H.; Walker, M.; Szafir, D. Improving Collocated Robot Teleoperation with Augmented Reality. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, Chicago, IL, USA, 5–8 March 2018; HRI ’18. pp. 78–86. [Google Scholar] [CrossRef]
  27. Qian, L.; Deguet, A.; Wang, Z.; Liu, Y.H.; Kazanzides, P. Augmented Reality Assisted Instrument Insertion and Tool Manipulation for the First Assistant in Robotic Surgery. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 5173–5179. [Google Scholar] [CrossRef]
  28. Fung, R.; Hashimoto, S.; Inami, M.; Igarashi, T. An augmented reality system for teaching sequential tasks to a household robot. In Proceedings of the 2011 RO-MAN, Atlanta, GA, USA, 31 July–3 August 2011; pp. 282–287. [Google Scholar] [CrossRef]
  29. Lambrecht, J.; Krüger, J. Spatial programming for industrial robots based on gestures and Augmented Reality. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 466–472. [Google Scholar] [CrossRef]
  30. Bonardi, S.; Blatter, J.; Fink, J.; Moeckel, R.; Jermann, P.; Dillenbourg, P.; Ijspeert, A.J. Design and evaluation of a graphical iPad application for arranging adaptive furniture. In Proceedings of the 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication, Paris, France, 9–12 September 2012; pp. 290–297. [Google Scholar] [CrossRef]
  31. Stadler, S.; Kain, K.; Giuliani, M.; Mirnig, N.; Stollnberger, G.; Tscheligi, M. Augmented reality for industrial robot programmers: Workload analysis for task-based, augmented reality-supported robot control. In Proceedings of the 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York, NY, USA, 26–31 August 2016; pp. 179–184. [Google Scholar] [CrossRef]
  32. Hügle, J.; Lambrecht, J.; Krüger, J. An integrated approach for industrial robot control and programming combining haptic and non-haptic gestures. In Proceedings of the 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Lisbon, Portugal, 28 August–1 September 2017; pp. 851–857. [Google Scholar] [CrossRef]
  33. Frank, J.A.; Moorhead, M.; Kapila, V. Mobile Mixed-Reality Interfaces That Enhance Human–Robot Interaction in Shared Spaces. Front. Robot. AI 2017, 4, 1–14. [Google Scholar] [CrossRef]
  34. Sprute, D.; Tönnies, K.; König, M. Virtual Borders: Accurate Definition of a Mobile Robot’s Workspace Using Augmented Reality. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 8574–8581. [Google Scholar] [CrossRef]
  35. Chacko, S.M.; Kapila, V. Augmented Reality as a Medium for Human-Robot Collaborative Tasks. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–18 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
  36. Rotsidis, A.; Theodorou, A.; Bryson, J.J.; Wortham, R.H. Improving Robot Transparency: An Investigation with Mobile Augmented Reality. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–18 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
  37. Andersen, R.S.; Bøgh, S.; Moeslund, T.B.; Madsen, O. Task space HRI for cooperative mobile robots in fit-out operations inside ship superstructures. In Proceedings of the 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York, NY, USA, 26–31 August 2016; pp. 880–887. [Google Scholar] [CrossRef]
  38. Kalpagam Ganesan, R.; Rathore, Y.K.; Ross, H.M.; Ben Amor, H. Better Teaming Through Visual Cues: How Projecting Imagery in a Workspace Can Improve Human-Robot Collaboration. IEEE Robot. Autom. Mag. 2018, 25, 59–71. [Google Scholar] [CrossRef]
  39. Materna, Z.; Kapinus, M.; Beran, V.; Smrž, P.; Zemčík, P. Interactive Spatial Augmented Reality in Collaborative Robot Programming: User Experience Evaluation. In Proceedings of the 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Nanjing, China, 27–31 August 2018; pp. 80–87. [Google Scholar] [CrossRef]
  40. Bolano, G.; Juelg, C.; Roennau, A.; Dillmann, R. Transparent Robot Behavior Using Augmented Reality in Close Human-Robot Interaction. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–18 October 2019; pp. 1–7. [Google Scholar] [CrossRef]
  41. Ito, T.; Niwa, T.; Slocum, A.H. Virtual cutter path display for dental milling machine. In Proceedings of the RO-MAN 2009—The 18th IEEE International Symposium on Robot and Human Interactive Communication, Toyama, Japan, 27 September–2 October 2009; pp. 488–493. [Google Scholar] [CrossRef]
  42. Notheis, S.; Milighetti, G.; Hein, B.; Wörn, H.; Beyerer, J. Skill-based telemanipulation by means of intelligent robots. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 5258–5263. [Google Scholar] [CrossRef]
  43. Domingues, C.; Essabbah, M.; Cheaib, N.; Otmane, S.; Dinis, A. Human-Robot-Interfaces based on Mixed Reality for Underwater Robot Teleoperation. IFAC Proc. Vol. 2012, 45, 212–215. [Google Scholar] [CrossRef]
  44. Hashimoto, S.; Ishida, A.; Inami, M.; Igarashi, T. TouchMe: An Augmented Reality Interface for Remote Robot Control. J. Robot. Mechatron. 2013, 25, 529–537. [Google Scholar] [CrossRef]
  45. Osaki, A.; Kaneko, T.; Miwa, Y. Embodied navigation for mobile robot by using direct 3D drawing in the air. In Proceedings of the RO-MAN 2008—The 17th IEEE International Symposium on Robot and Human Interactive Communication, Munich, Germany, 1–3 August 2008; pp. 671–676. [Google Scholar] [CrossRef]
  46. Chu, F.J.; Xu, R.; Zhang, Z.; Vela, P.A.; Ghovanloo, M. Hands-Free Assistive Manipulator Using Augmented Reality and Tongue Drive System. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 5463–5468. [Google Scholar] [CrossRef]
  47. Oota, S.; Murai, A.; Mochimaru, M. Lucid Virtual/Augmented Reality (LVAR) Integrated with an Endoskeletal Robot Suit: StillSuit: A new framework for cognitive and physical interventions to support the ageing society. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1556–1559. [Google Scholar] [CrossRef]
  48. Gregory, J.M.; Reardon, C.; Lee, K.; White, G.; Ng, K.; Sims, C. Enabling Intuitive Human-Robot Teaming Using Augmented Reality and Gesture Control. arXiv 2019, arXiv:1909.06415. [Google Scholar]
  49. Huy, D.Q.; Vietcheslav, I.; Seet Gim Lee, G. See-through and spatial augmented reality—A novel framework for human-robot interaction. In Proceedings of the 2017 3rd International Conference on Control, Automation and Robotics (ICCAR), Nagoya, Japan, 24–26 April 2017; pp. 719–726. [Google Scholar] [CrossRef]
  50. Sibirtseva, E.; Kontogiorgos, D.; Nykvist, O.; Karaoguz, H.; Leite, I.; Gustafson, J.; Kragic, D. A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In Proceedings of the 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Nanjing, China, 27–31 August 2018; pp. 43–50. [Google Scholar] [CrossRef]
  51. Bambušek, D.; Materna, Z.; Kapinus, M.; Beran, V.; Smrž, P. Combining Interactive Spatial Augmented Reality with Head-Mounted Display for End-User Collaborative Robot Programming. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–18 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
  52. Sportillo, D.; Paljic, A.; Ojeda, L. On-road evaluation of autonomous driving training. In Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction, Daegu, Republic of Korea, 11–14 March 2019; HRI ’19. pp. 182–190. [Google Scholar]
  53. Chakraborti, T.; Sreedharan, S.; Kulkarni, A.; Kambhampati, S. Projection-Aware Task Planning and Execution for Human-in-the-Loop Operation of Robots in a Mixed-Reality Workspace. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4476–4482. [Google Scholar] [CrossRef]
  54. Sprute, D.; Viertel, P.; Tönnies, K.; König, M. Learning Virtual Borders through Semantic Scene Understanding and Augmented Reality. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 4607–4614. [Google Scholar] [CrossRef]
  55. Reardon, C.; Lee, K.; Rogers, J.G.; Fink, J. Communicating via Augmented Reality for Human-Robot Teaming in Field Environments. In Proceedings of the 2019 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Würzburg, Germany, 2–4 September 2019; pp. 94–101. [Google Scholar] [CrossRef]
  56. Williams, T.; Bussing, M.; Cabrol, S.; Boyle, E.; Tran, N. Mixed reality deictic gesture for multi-modal robot communication. In Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction, Daegu, Republic of Korea, 11–14 March 2019; HRI ’19. pp. 191–201. [Google Scholar]
  57. Hamilton, J.; Phung, T.; Tran, N.; Williams, T. What’s The Point? Tradeoffs between Effectiveness and Social Perception When Using Mixed Reality to Enhance Gesturally Limited Robots. In Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction, New York, NY, USA, 8–11 March 2021; HRI ’21. pp. 177–186. [Google Scholar] [CrossRef]
  58. Chandan, K.; Kudalkar, V.; Li, X.; Zhang, S. ARROCH: Augmented Reality for Robots Collaborating with a Human. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 3787–3793. [Google Scholar] [CrossRef]
  59. Ikeda, B.; Szafir, D. Advancing the Design of Visual Debugging Tools for Roboticists. In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, Sapporo, Japan, 7–10 March 2022; HRI ’22. pp. 195–204. [Google Scholar]
  60. Reinhart, G.; Munzert, U.; Vogl, W. A programming system for robot-based remote-laser-welding with conventional optics. CIRP Ann. 2008, 57, 37–40. [Google Scholar] [CrossRef]
  61. Hulin, T.; Schmirgel, V.; Yechiam, E.; Zimmermann, U.E.; Preusche, C.; Pöhler, G. Evaluating exemplary training accelerators for Programming-by-Demonstration. In Proceedings of the 19th International Symposium in Robot and Human Interactive Communication, Viareggio, Italy, 13–15 September 2010; pp. 440–445. [Google Scholar] [CrossRef]
  62. Gianni, M.; Gonnelli, G.; Sinha, A.; Menna, M.; Pirri, F. An Augmented Reality approach for trajectory planning and control of tracked vehicles in rescue environments. In Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Linköping, Sweden, 21–26 October 2013; pp. 1–6. [Google Scholar] [CrossRef]
  63. Lambrecht, J.; Walzel, H.; Krüger, J. Robust finger gesture recognition on handheld devices for spatial programming of industrial robots. In Proceedings of the 2013 IEEE RO-MAN, Gyeongju, Republic of Korea, 26–29 August 2013; pp. 99–106. [Google Scholar] [CrossRef]
  64. Coovert, M.D.; Lee, T.; Shindev, I.; Sun, Y. Spatial augmented reality as a method for a mobile robot to communicate intended movement. Comput. Hum. Behav. 2014, 34, 241–248. [Google Scholar] [CrossRef]
  65. Chadalavada, R.T.; Andreasson, H.; Krug, R.; Lilienthal, A.J. That’s on my mind! Robot to human intention communication through on-board projection on shared floor space. In Proceedings of the 2015 European Conference on Mobile Robots (ECMR), Lincoln, UK, 2–4 September 2015; pp. 1–6. [Google Scholar] [CrossRef]
  66. Makris, S.; Karagiannis, P.; Koukas, S.; Matthaiakis, A.S. Augmented reality system for operator support in human–robot collaborative assembly. CIRP Ann. 2016, 65, 61–64. [Google Scholar] [CrossRef]
  67. Krupke, D.; Steinicke, F.; Lubos, P.; Jonetzko, Y.; Görner, M.; Zhang, J. Comparison of Multimodal Heading and Pointing Gestures for Co-Located Mixed Reality Human-Robot Interaction. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1–9. [Google Scholar] [CrossRef]
  68. Kapinus, M.; Beran, V.; Materna, Z.; Bambušek, D. Spatially Situated End-User Robot Programming in Augmented Reality. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–18 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
  69. Liu, C.; Shen, S. An Augmented Reality Interaction Interface for Autonomous Drone. arXiv 2020, arXiv:2008.02234. [Google Scholar]
  70. Corotan, A.; Irgen-Gioro, J.J.Z. An Indoor Navigation Robot Using Augmented Reality. In Proceedings of the 2019 5th International Conference on Control, Automation and Robotics (ICCAR), Beijing, China, 19–22 April 2019; pp. 111–116. [Google Scholar] [CrossRef]
  71. Gadre, S.Y.; Rosen, E.; Chien, G.; Phillips, E.; Tellex, S.; Konidaris, G. End-User Robot Programming Using Mixed Reality. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 2707–2713. [Google Scholar] [CrossRef]
  72. Ostanin, M.; Mikhel, S.; Evlampiev, A.; Skvortsova, V.; Klimchik, A. Human-robot interaction for robotic manipulator programming in Mixed Reality. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 2805–2811. [Google Scholar] [CrossRef]
  73. Luebbers, M.B.; Brooks, C.; Mueller, C.L.; Szafir, D.; Hayes, B. ARC-LfD: Using Augmented Reality for Interactive Long-Term Robot Skill Maintenance via Constrained Learning from Demonstration. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 3794–3800. [Google Scholar] [CrossRef]
  74. Han, Z.; Parrillo, J.; Wilkinson, A.; Yanco, H.A.; Williams, T. Projecting Robot Navigation Paths: Hardware and Software for Projected AR. In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, Sapporo, Japan, 7–10 March 2022; HRI ’22. pp. 623–628. [Google Scholar]
  75. Green, S.A.; Chen, X.Q.; Billinghurst, M.; Chase, J.G. Collaborating with a Mobile Robot: An Augmented Reality Multimodal Interface. IFAC Proc. Vol. 2008, 41, 15595–15600. [Google Scholar] [CrossRef]
  76. Hönig, W.; Milanes, C.; Scaria, L.; Phan, T.; Bolas, M.; Ayanian, N. Mixed reality for robotics. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 5382–5387. [Google Scholar] [CrossRef]
  77. Peake, I.D.; Blech, J.O.; Schembri, M. A software framework for augmented reality-based support of industrial operations. In Proceedings of the 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), Berlin, Germany, 6–9 September 2016; pp. 1–4. [Google Scholar] [CrossRef]
  78. Andersen, R.S.; Madsen, O.; Moeslund, T.B.; Amor, H.B. Projecting robot intentions into human environments. In Proceedings of the 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York, NY, USA, 26–31 August 2016; pp. 294–301. [Google Scholar] [CrossRef]
  79. Puljiz, D.; Krebs, F.; Bösing, F.; Hein, B. What the HoloLens Maps Is Your Workspace: Fast Mapping and Set-up of Robot Cells via Head Mounted Displays and Augmented Reality. arXiv 2020, arXiv:2005.12651. [Google Scholar]
  80. Tung, Y.S.; Luebbers, M.B.; Roncone, A.; Hayes, B. Workspace Optimization Techniques to Improve Prediction of Human Motion During Human-Robot Collaboration. In Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, New York, NY, USA, 11–15 March 2024; HRI ’24. pp. 743–751. [Google Scholar] [CrossRef]
  81. Hing, J.T.; Sevcik, K.W.; Oh, P.Y. Improving unmanned aerial vehicle pilot training and operation for flying in cluttered environments. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA, 10–15 October 2009; pp. 5641–5646. [Google Scholar] [CrossRef]
  82. Riordan, J.; Horgan, J.; Toal, D. A Real-Time Subsea Environment Visualisation Framework for Simulation of Vision Based UUV Control Architectures. IFAC Proc. Vol. 2008, 41, 25–30. [Google Scholar] [CrossRef]
  83. Brooks, C.; Szafir, D. Visualization of Intended Assistance for Acceptance of Shared Control. arXiv 2020, arXiv:2008.10759. [Google Scholar]
  84. Sachidanandam, S.O.; Honarvar, S.; Diaz-Mercado, Y. Effectiveness of Augmented Reality for Human Swarm Interactions. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 11258–11264. [Google Scholar] [CrossRef]
  85. Martins, H.; Ventura, R. Immersive 3-D teleoperation of a search and rescue robot using a head-mounted display. In Proceedings of the 2009 IEEE Conference on Emerging Technologies and Factory Automation, Palma de Mallorca, Spain, 22–25 September 2009; pp. 1–8. [Google Scholar] [CrossRef]
  86. Zalud, L.; Kocmanova, P.; Burian, F.; Jilek, T. Color and Thermal Image Fusion for Augmented Reality in Rescue Robotics. In Proceedings of the 8th International Conference on Robotic, Vision, Signal Processing & Power Applications, Penang, Malaysia, 10–12 November 2013; Lecture Notes in Electrical Engineering. Mat Sakim, H.A., Mustaffa, M.T., Eds.; Springer: Singapore, 2014; pp. 47–55. [Google Scholar] [CrossRef]
  87. Reardon, C.; Haring, K.; Gregory, J.M.; Rogers, J.G. Evaluating Human Understanding of a Mixed Reality Interface for Autonomous Robot-Based Change Detection. In Proceedings of the 2021 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), New York City, NY, USA, 25–27 October 2021; pp. 132–137. [Google Scholar] [CrossRef]
  88. Walker, M.; Chen, Z.; Whitlock, M.; Blair, D.; Szafir, D.A.; Heckman, C.; Szafir, D. A Mixed Reality Supervision and Telepresence Interface for Outdoor Field Robotics. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 2345–2352. [Google Scholar] [CrossRef]
  89. Tabrez, A.; Luebbers, M.B.; Hayes, B. Descriptive and Prescriptive Visual Guidance to Improve Shared Situational Awareness in Human-Robot Teaming. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Virtual, 9–13 May 2022; pp. 1256–1264. [Google Scholar]
  90. Qian, L.; Wu, J.Y.; DiMaio, S.P.; Navab, N.; Kazanzides, P. A Review of Augmented Reality in Robotic-Assisted Surgery. IEEE Trans. Med. Robot. Bionics 2020, 2, 1–16. [Google Scholar] [CrossRef]
  91. Oyama, E.; Watanabe, N.; Mikado, H.; Araoka, H.; Uchida, J.; Omori, T.; Shinoda, K.; Noda, I.; Shiroma, N.; Agah, A.; et al. A study on wearable behavior navigation system (II)—A comparative study on remote behavior navigation systems for first-aid treatment. In Proceedings of the 19th International Symposium in Robot and Human Interactive Communication, Viareggio, Italy, 13–15 September 2010; pp. 755–761. [Google Scholar] [CrossRef]
  92. Filippeschi, A.; Brizzi, F.; Ruffaldi, E.; Jacinto, J.M.; Avizzano, C.A. Encountered-type haptic interface for virtual interaction with real objects based on implicit surface haptic rendering for remote palpation. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 5904–5909. [Google Scholar] [CrossRef]
  93. Adagolodjo, Y.; Trivisonne, R.; Haouchine, N.; Cotin, S.; Courtecuisse, H. Silhouette-based pose estimation for deformable organs application to surgical augmented reality. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 539–544. [Google Scholar] [CrossRef]
  94. Zevallos, N.; Rangaprasad, A.S.; Salman, H.; Li, L.; Qian, J.; Saxena, S.; Xu, M.; Patath, K.; Choset, H. A Real-Time Augmented Reality Surgical System for Overlaying Stiffness Information. In Proceedings of Robotics: Science and Systems XIV, Pittsburgh, PA, USA, June 2018. Available online: https://www.roboticsproceedings.org/rss14/p26.pdf (accessed on 26 May 2020).
  95. Sheridan, T. Space teleoperation through time delay: Review and prognosis. IEEE Trans. Robot. Autom. 1993, 9, 592–606. [Google Scholar] [CrossRef]
  96. Xia, T.; Léonard, S.; Deguet, A.; Whitcomb, L.; Kazanzides, P. Augmented reality environment with virtual fixtures for robotic telemanipulation in space. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 7–12 October 2012; pp. 5059–5064. [Google Scholar] [CrossRef]
  97. Chang, C.T.; Luebbers, M.B.; Hebert, M.; Hayes, B. Human Non-Compliance with Robot Spatial Ownership Communicated via Augmented Reality: Implications for Human-Robot Teaming Safety. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 9785–9792. [Google Scholar] [CrossRef]
  98. Ro, H.; Byun, J.H.; Kim, I.; Park, Y.J.; Kim, K.; Han, T.D. Projection-Based Augmented Reality Robot Prototype with Human-Awareness. In Proceedings of the 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Daegu, Republic of Korea, 11–14 March 2019; pp. 598–599. [Google Scholar] [CrossRef]
  99. Mavridis, N.; Hanson, D. The IbnSina Center: An augmented reality theater with intelligent robotic and virtual characters. In Proceedings of the RO-MAN 2009—The 18th IEEE International Symposium on Robot and Human Interactive Communication, Toyama, Japan, 27 September–2 October 2009; pp. 681–686. [Google Scholar] [CrossRef]
  100. Pereira, A.; Carter, E.J.; Leite, I.; Mars, J.; Lehman, J.F. Augmented reality dialog interface for multimodal teleoperation. In Proceedings of the 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Lisbon, Portugal, 28 August–1 September 2017; pp. 764–771. [Google Scholar] [CrossRef]
  101. Omidshafiei, S.; Agha-Mohammadi, A.; Chen, Y.F.; Ure, N.K.; Liu, S.; Lopez, B.T.; Surati, R.; How, J.P.; Vian, J. Measurable Augmented Reality for Prototyping Cyberphysical Systems: A Robotics Platform to Aid the Hardware Prototyping and Performance Testing of Algorithms. IEEE Control. Syst. Mag. 2016, 36, 65–87. [Google Scholar] [CrossRef]
  102. Mahajan, K.; Groechel, T.R.; Pakkar, R.; Lee, H.J.; Cordero, J.; Matarić, M.J. Adapting Usability Metrics for a Socially Assistive, Kinesthetic, Mixed Reality Robot Tutoring Environment. In Proceedings of the International Conference on Social Robotics, Golden, CO, USA, 14–18 November 2020. [Google Scholar]
  103. NASA Ames. 2019. Available online: https://www.nasa.gov/centers-and-facilities/ames/nasa-ames-astrogram-november-2019/ (accessed on 25 August 2020).
  104. Weiss, A.; Bartneck, C. Meta analysis of the usage of the Godspeed Questionnaire Series. In Proceedings of the 2015 24th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Kobe, Japan, 31 August–4 September 2015; pp. 381–388. [Google Scholar] [CrossRef]
  105. Bartneck, C.; Kulić, D.; Croft, E.; Zoghbi, S. Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots. Int. J. Soc. Robot. 2009, 1, 71–81. [Google Scholar] [CrossRef]
  106. Brooke, J. Usability and Context. In Usability Evaluation in Industry; Jordan, P.W., Thomas, B., McClelland, I.L., Weerdmeester, B., Eds.; CRC Press: Boca Raton, FL, USA, 1996. [Google Scholar]
  107. Endsley, M. Situation awareness global assessment technique (SAGAT). In Proceedings of the IEEE 1988 National Aerospace and Electronics Conference, Dayton, OH, USA, 23–27 May 1988; Volume 3, pp. 789–795. [Google Scholar] [CrossRef]
  108. Kirby, R.; Rushton, P.; Smith, C.; Routhier, F.; Axelson, P.; Best, K.; Betz, K.; Burrola-Mendez, Y.; Contepomi, S.; Cowan, R.; et al. Wheelchair Skills Program Manual Version 5.1. 2020. Available online: https://wheelchairskillsprogram.ca/en/manual-and-form-archives/ (accessed on 31 August 2020).
  109. Hoffman, G. Evaluating Fluency in Human–Robot Collaboration. IEEE Trans. Hum.-Mach. Syst. 2019, 49, 209–218. [Google Scholar] [CrossRef]
  110. Gombolay, M.C.; Gutierrez, R.A.; Clarke, S.G.; Sturla, G.F.; Shah, J.A. Decision-making authority, team efficiency and human worker satisfaction in mixed human–robot teams. Auton. Robot. 2015, 39, 293–312. [Google Scholar] [CrossRef]
  111. Dragan, A.D.; Bauman, S.; Forlizzi, J.; Srinivasa, S.S. Effects of Robot Motion on Human-Robot Collaboration. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction, New York, NY, USA, 2–5 March 2015; HRI’ 15. pp. 51–58. [Google Scholar] [CrossRef]
  112. Quintero, C.P.; Li, S.; Pan, M.K.; Chan, W.P.; Machiel Van der Loos, H.; Croft, E. Robot Programming through Augmented Trajectories in Augmented Reality. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1838–1844. [Google Scholar] [CrossRef]
  113. Dinh, H.; Yuan, Q.; Vietcheslav, I.; Seet, G. Augmented reality interface for taping robot. In Proceedings of the 2017 18th International Conference on Advanced Robotics (ICAR), Hong Kong, China, 10–12 July 2017; pp. 275–280. [Google Scholar] [CrossRef]
  114. Chacko, S.M.; Kapila, V. An Augmented Reality Interface for Human-Robot Interaction in Unconstrained Environments. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 3222–3228. [Google Scholar] [CrossRef]
  115. Licklider, J.C.R. Man-Computer Symbiosis. IRE Trans. Hum. Factors Electron. 1960, HFE-1, 4–11. [Google Scholar] [CrossRef]
  116. Sheridan, T.B. Telerobotics, Automation, and Human Supervisory Control; MIT Press: Cambridge, MA, USA, 1992. [Google Scholar]
  117. Hart, S.G.; Staveland, L.E. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. In Advances in Psychology; Human Mental Workload; Hancock, P.A., Meshkati, N., Eds.; North-Holland: Amsterdam, The Netherlands, 1988; Volume 52, pp. 139–183. [Google Scholar] [CrossRef]
  118. Laugwitz, B.; Held, T.; Schrepp, M. Construction and Evaluation of a User Experience Questionnaire. In Proceedings of the HCI and Usability for Education and Work; Lecture Notes in Computer Science; Holzinger, A., Ed.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 63–76. [Google Scholar] [CrossRef]
  119. Srinivasan, L.; Schilling, K. Augmented Reality Exocentric Navigation Paradigm for Time Delayed Teleoperation. IFAC Proc. Vol. 2013, 46, 1–6. [Google Scholar] [CrossRef]
  120. Szafir, D.; Mutlu, B.; Fong, T. Designing planning and control interfaces to support user collaboration with flying robots. Int. J. Robot. Res. 2017, 36, 514–542. [Google Scholar] [CrossRef]
  121. Liu, H.; Zhang, Y.; Si, W.; Xie, X.; Zhu, Y.; Zhu, S.C. Interactive Robot Knowledge Patching Using Augmented Reality. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 1947–1954. [Google Scholar] [CrossRef]
  122. Scholtz, J.; Antonishek, B.; Young, J. Evaluation of a human-robot interface: Development of a situational awareness methodology. In Proceedings of the 37th Annual Hawaii International Conference on System Sciences, Big Island, HI, USA, 5–8 January 2004. [Google Scholar] [CrossRef]
  123. Scholtz, J.; Antonishek, B.; Young, J. Implementation of a situation awareness assessment tool for evaluation of human-robot interfaces. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2005, 35, 450–459. [Google Scholar] [CrossRef]
  124. Dalhousie University. Wheelchair Skills Program (WSP) Version 4.2. 2013. Available online: https://wheelchairskillsprogram.ca/en/skills-manual-forms-version-4-2/ (accessed on 31 August 2020).
  125. Kirby, R.; Swuste, J.; Dupuis, D.J.; MacLeod, D.A.; Monroe, R. The Wheelchair Skills Test: A pilot study of a new outcome measure. Arch. Phys. Med. Rehabil. 2002, 83, 10–18. [Google Scholar] [CrossRef] [PubMed]
  126. Hoffman, G. Evaluating Fluency in Human-Robot Collaboration. 2013. Available online: https://hrc2.io/assets/pdfs/papers/HoffmanTHMS19.pdf (accessed on 20 November 2023).
  127. Palmer, N.; Burchard, E. Underrepresented Populations in Research. 2020. Available online: https://recruit.ucsf.edu/underrepresented-populations-research (accessed on 25 September 2020).
  128. Christensen, H.I. A Roadmap for US Robotics: From Internet to Robotics. Technical Report. 2020. Available online: https://robotics.usc.edu/publications/media/uploads/pubs/pubdb_1147_e2f8b9b1d60c494a9a3ce31b9210b9c5.pdf (accessed on 20 November 2023).
  129. Rosen, E.; Whitney, D.; Phillips, E.; Chien, G.; Tompkin, J.; Konidaris, G.; Tellex, S. Communicating and controlling robot arm motion intent through mixed-reality head-mounted displays. Int. J. Robot. Res. 2019, 38, 1513–1526. [Google Scholar] [CrossRef]
  130. Oyama, E.; Shiroma, N. Behavior Navigation System for Use in harsh environments. In Proceedings of the 2011 IEEE International Symposium on Safety, Security, and Rescue Robotics, Kyoto, Japan, 1–5 November 2011; pp. 272–277. [Google Scholar] [CrossRef]
  131. Hietanen, A.; Pieters, R.; Lanz, M.; Latokartano, J.; Kämäräinen, J.K. AR-based interaction for human-robot collaborative manufacturing. Robot. Comput.-Integr. Manuf. 2020, 63, 101891. [Google Scholar] [CrossRef]
Figure 1. Isometric (a) and top (b) views of the Microsoft HoloLens 2, a commonly used head-mounted display for augmented reality.
Figure 2. An example of communicating constraints to a robotic system using augmented reality [73].
Figure 3. An example of adding visual divisions within a workspace via AR to improve human–robot collaboration in a manufacturing environment [80].
Figure 4. An example of using AR to improve the shared mental model in collaboration between a human and a robot during a search and rescue scenario [89].
Figure 5. An example of using augmented reality to communicate spatial ownership in a shared space environment between a human and an airborne robot [97].
Table 1. This table summarizes the categories outlined in this literature review and lists the articles associated with each category. Many papers are cited in more than one category, as the categories are not mutually exclusive; rather, they are intended to provide multiple perspectives on the relevant literature.
Contributions and Categorizations of Included Papers

Modalities
Mobile Devices: Head-Mounted Display [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]
Mobile Devices: Handheld Display [28,29,30,31,32,33,34,35,36]
Projection-based Display [37,38,39,40]
Static Screen-based Display [41,42,43,44]
Alternate Interfaces [45,46,47,48]
AR Combinations and Comparisons [39,49,50,51,52]

Creating and Understanding the System
Intent Communication [36,40,50,53,54,55,56,57,58,59]
Path and Motion Visualization and Programming [11,14,25,28,29,30,32,37,39,44,45,51,55,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74]
Adding Markers to the Environment [14,28,33,44,75,76,77]
Manufacturing and Assembly [17,18,31,37,39,60,66,77,78,79,80]

Improving the Collaboration
AR for Teleoperation [13,16,18,22,26,41,43,49,81,82,83,84]
Pick-and-Place [21,33,35,44,50,51]
Search and Rescue [24,48,55,62,85,86,87,88,89]
Medical [23,27,41,90,91,92,93,94]
Space [95,96]
Safety and Ownership of Space [33,34,40,66,79,97]
Other Applications [98,99,100,101,102]

Evaluation Strategies and Methods
Instruments, Questionnaires, and Techniques [103,104,105,106,107,108,109,110,111]
The Choice to Conduct User/Usability Testing [11,17,18,19,21,26,27,43,50,53,55,66,67,79,112,113]