Article

NRXR-ID: Two-Factor Authentication (2FA) in VR Using Near-Range Extended Reality and Smartphones

by Aiur Nanzatov 1, Lourdes Peña-Castillo 1,2 and Oscar Meruvia-Pastor 1,*

1 Department of Computer Science, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
2 Department of Biology, Memorial University of Newfoundland, St. John’s, NL A1C 5S7, Canada
* Author to whom correspondence should be addressed.
Electronics 2025, 14(17), 3368; https://doi.org/10.3390/electronics14173368
Submission received: 4 July 2025 / Revised: 21 August 2025 / Accepted: 21 August 2025 / Published: 25 August 2025

Abstract

Two-factor authentication (2FA) has become widely adopted as an efficient and secure way of validating someone’s identity online. However, 2FA is difficult to perform in virtual reality (VR) because users typically wear a head-mounted display (HMD) that prevents them from seeing their real-world surroundings. We present NRXR-ID, a technique for implementing two-factor authentication while using extended reality systems and smartphones. The proposed method allows users to complete an authentication challenge using their smartphones without removing their HMD. We conducted a user study exploring four types of challenges, including a novel checkers-style challenge. Users responded to these challenges under three different configurations, including one that uses a smartphone to support gaze-based selection without a VR controller. A 4 × 3 within-subjects design allowed us to study all of the proposed variations. We collected performance metrics along with user experience questionnaires containing subjective impressions from thirty participants. Results suggest that the checkers-style visual matching challenge was the most preferred option, followed by the challenge involving entering a digital PIN submitted via the smartphone. Participants were fastest at solving the digital PIN challenge, with an average of 12.35 ± 5 s, followed by the checkers challenge with 13.85 ± 5.29 s and the CAPTCHA-style challenge with 14.36 ± 7.5 s, whereas the alphanumeric password took almost twice as long, averaging 32.71 ± 16.44 s. The checkers-style challenge performed consistently across all conditions with no significant differences (p = 0.185), making it robust to different implementation choices.

1. Introduction

This research explores the design space of multi-device authentication methods that combine smartphones and VR headsets, with a focus on supporting two-factor authentication (2FA) in VR applications for increased security.
Authentication methods have evolved significantly, driven by the ongoing battle between security developers and cyberattackers [1,2,3,4,5,6]. As hackers and automated systems become more sophisticated, protection mechanisms must continue to evolve as well [7,8,9,10]. This dynamic highlights the need for security solutions that are both robust and user-friendly, balancing strong security and privacy protection with a seamless user experience.
While there are continuously emerging security risks particular to XR systems, including vulnerabilities in the VR devices themselves [11], gaze exploitation [12], 3D disguises [13,14], and de-anonymization attacks [15], many security risks associated with the use of extended reality systems are also present in non-VR systems [16], including identity theft and impersonation, data privacy breaches, and attack types such as man-in-the-middle, phishing, and social engineering [17]. NRXR-ID implements two-factor authentication (2FA) to prevent unauthorized access to accounts and systems by applying heightened scrutiny to a user’s identity. By preventing unauthorized access to high-stakes portions of the system, NRXR-ID blocks potential intruders from accessing both gaze data and other sensitive user data in the first place.
A majority of the concerns related to XR systems involve the uncontrolled collection of data from users in ways that may compromise their privacy [18]. Dealing with data collection vulnerabilities, including de-anonymization attacks, is outside the scope of this article; instead, our focus is on reinforcing authentication mechanisms in order to verify the identity of users in XR in real time. As noted by Acheampong et al. [19], “permission management in XR devices is often broad and persistent and lacks real-time notifications, allowing malicious applications to exploit microphone, camera, and location access without the users knowing.” Two-factor authentication is particularly useful when users are about to engage in a high-stakes event that requires increased certainty about their identity in real time. NRXR-ID supports the 2FA framework by presenting multi-device solutions in XR for real-time verification of a user’s identity in high-stakes scenarios.

1.1. Motivation

As technology progresses, safeguarding user credentials requires increasingly robust strategies [6,20], as user credentials may become compromised without the user’s fault or knowledge. In this context, two-factor authentication has become a widely used method for secure identity validation, providing enhanced protection by requiring the user to possess two separate devices or pieces of information for authentication. Our focus on multi-device two-factor authentication in VR is justified by its widespread adoption and effectiveness in improving security outside the XR domain. Traditional single password-based systems remain vulnerable to attacks such as phishing, increasingly sophisticated AI-driven malware, and data breaches, making them insufficient on their own for safeguarding sensitive data [21]. There are two stages in two-factor authentication. In the first stage, a user gains access to the system, for example by providing a secure password or personal identification number (PIN). In many cases this is considered sufficient, for example when a user regularly accesses an email account from the same web browser on a particular computer. In some cases, however, the system flags a situation that requires confirmation of identity (for example, when the user tries to access their email from a new computer); this is where the user is required to check the second device to find the code or challenge response they must provide to confirm their identity.
The combination of passwords and two-factor authentication has emerged as a key solution for safeguarding security and privacy in non-VR systems [21]. In fact, increased use of 2FA across various industries has been observed recently, including technology, banking, social media, and health organizations [22]. More particularly, one-time-use PINs sent via email and SMS have become the most commonly used two-factor authentication method in the financial sector [22]. In most cases, the second device used for authentication is a smartphone, where users receive a text message to the phone number associated with their online account. The text message typically contains a numeric code (the PIN) that they need to enter on the primary device where the need for authentication was generated in the first place. In other cases, users receive the code in an alternative email account or an app and are asked to check that email account or authenticator app to retrieve the code. There are many high-stakes situations where the need to perform two-factor authentication may occur within a VR environment. For example, a user who needs to purchase a valuable asset in a VR game could be asked to authenticate during the game. Similarly, in a tele-health scenario, a health-care provider in VR could be asked to confirm their credentials in order to access a new set of sensitive medical records. In a third scenario, a VR user using a web browser in the metaverse may be asked to authenticate their identity to access their email account or to confirm a commercial transaction flagged as suspicious by their financial institution.
In virtual reality (VR), performing two-factor authentication becomes more challenging than usual because users typically cannot see the devices in their real-world surroundings while wearing a head-mounted display (HMD). Instead, a user would typically be forced to lift their HMD to read the code or access their email account on a smartphone or nearby desktop computer, or might need to step out of the physical playing area to be able to see their smartphone using pass-through cameras mounted on the HMD. In some cases, lifting the HMD may pause the application being executed on the HMD. This could disrupt users’ VR experience or their activities within the VR application where the need for authentication likely originated.

1.2. Near-Range Extended Reality for 2FA

Near-range extended reality (NRXR) refers to a type of immersive VR experience that integrates real-world elements in close proximity to the user [23,24]. The need for users immersed in VR to remain aware of their real-world surroundings and of the actions of others nearby has existed since VR’s origins [23,24,25,26,27,28]. While substantial work has been carried out on integrating personal devices, including tablets [29], smartphones [5,30,31], smart watches [32], keyboards [23], and desktop workspaces [33], the specific topic of identity authentication using such devices in VR has been much less explored. To facilitate user authentication tasks without leaving the VR context, we propose the use of near-range extended reality and suggest how it can be used to support two-factor authentication. We refer to the use of NRXR for two-factor authentication as NRXR-ID. NRXR-ID allows users to complete authentication using their smartphones without removing their HMDs. NRXR-ID prioritizes awareness of objects in close proximity to the user, because in the context of authentication there is no need to become aware of elements of the real world that might be far from the user. In prior work, McGill et al. [23] found that selectively presenting some elements from reality as users engage with VR allows for optimal performance and maintains the user’s sense of presence. With NRXR, users can access their smartphones, nearby laptops, or personal computers while staying aware of the VR environment. This is particularly important when users need information from both the real world and the VR environment to complete the authentication procedure.
For this project, NRXR was implemented as a technique for VR systems in which a depth-sensing camera mounted on the HMD allows users to see nearby objects (i.e., those 10–120 cm from the camera) within the virtual environment, with the main goal being that users are able to see their smartphone from within VR. This implementation creates a unique form of mixed reality (MR) experience in which physical objects close to the user become visible in the virtual world in a more selective way [23,34,35], differing from the Meta Quest’s passthrough or the Vision Pro’s crown dial in how real-world imagery is blended into the VR experience. We implemented near-range extended reality with an external depth-sensing camera because the SDK for the Meta Quest 2 did not provide access to the pass-through cameras or a depth video stream, which prevented us from filtering for nearby objects in front of the user. For widespread deployment, developers with access to the built-in cameras and/or depth sensors in newer mixed-reality HMDs should be able to implement NRXR without mounting an external depth-sensing camera, as long as those HMDs can convey a depth field as seen from the user’s perspective.
With the latest generation of VR HMDs such as the Meta Quest 3, Varjo XR4, and Apple Vision Pro, which support mixed and augmented reality (MR/AR), the ability to combine the real and virtual worlds to enhance the VR user experience has become much more convenient. Recent MR/AR solutions include the “passthrough” mode in devices such as the Meta Quest and the Apple Vision Pro. In contrast to prior implementations of NRXR [23,24,35], these solutions do not adjust the amount of the real world shown to users as a function of the distance between real-world objects and the user. For example, the passthrough mode in the Meta Quest allows users to see their physical surroundings through built-in cameras by default when they step outside the virtual boundaries, or by means of mixed reality windows to the user’s surroundings using passthrough components in Unity. However, depending on the use case scenario, asking users to step out of VR could completely disrupt the VR experience. As an alternative passthrough design, a physical “crown” dial in the Apple Vision Pro lets users manually adjust the passthrough region in their field of view. In this case, the VR imagery can be extended from the center of projection to the sides, as if holding a curtain that can be manually opened or closed in front of the viewer, while a passthrough view of the real world shows up in the peripheral regions of the user’s field of view. However, this manual adjustment is independent of events happening outside of the VR world and is not necessarily meant to allow nearby objects to become visible or increase security; rather, it is meant to allow objects in the peripheral vision region to become visible in lieu of a VR background. Despite these differences in passthrough designs, it can be argued that the new higher-end HMDs are well positioned to support 2FA by offering a more convenient setup (including an HMD with no camera attached, better image and camera resolution, and hand tracking). In fact, the Apple Vision Pro already allows users to see their smartphone while wearing the HMD; we discuss its viability for 2FA, as well as that of the Meta Quest 3 and Meta Quest Pro, in the Future Work section. These latest advances in HMD technology allow us to focus on questions related to the design choices available in making an XR implementation of two-factor authentication that is useful, convenient, and preferred by users, as explored in the next sections.

1.3. Research Questions

In this section, we describe the research questions that will guide our exploration of how two-factor authentication can be supported in VR, mixed reality (MR), and augmented reality (AR) using NRXR.
Returning to the topic of authentication, it is crucial to design authentication methods that offer both robust security and practical convenience [36]. This research seeks to answer questions about the best way to help VR users perform two-factor authentication without needing to remove their HMD or lose awareness of what is happening in the VR environment. The main research questions we address in this work are:
  • RQ1: Is it possible to use NRXR to support people in the process of confirming their identity via two-factor authentication?
  • RQ2: What type of authentication challenges and modalities are most suitable for implementing a user-friendly form of two-factor authentication in the XR context?
  • RQ3: When implementing two-factor authentication using NRXR, what is most effective for users: to present a challenge using the smartphone and have the challenge answered within VR, or the other way around?
  • RQ4: What are users’ experiences, impressions, and preferences on two-factor authentication when using NRXR?
To answer these research questions, we have evaluated several forms of NRXR-ID through a user study, as described in the Methodology section.

2. Related Work

There are multiple methods for authentication tailored for AR/XR. Behavioral biometrics based on users’ unique behavioral patterns offer a non-intrusive authentication method. For instance, hand tracking in AR/VR environments can identify users based on their finger movements and gestures. Liebers et al. [37] demonstrated the effectiveness of hand tracking for implicit user identification in immersive environments.
Other novel VR authentication methods include: (1) Gesture-Based Authentication, where users authenticate through personalized gestures such as high fives or fist bumps, maintaining immersion and improving security [38]; (2) Direction-Based Authentication (DBA), where users navigate virtual environments and select directions to form a password, balancing memorability, efficiency, and security [39]; and (3) SPHinX, a method in which users paint or trace patterns on a 3D object, which offers enhanced security and reduces risks such as shoulder surfing [40]. These methods aim to integrate authentication seamlessly into the VR experience while addressing security concerns. Although they could be used to gain initial access to a system, they are not suitable for supporting multi-device authentication.
Biometric authentication provides a viable alternative in VR, where traditional methods such as PINs and passwords are difficult to implement. Biometric authentication methods include retinal scanning [41], iris scanning [42], and skull bone conductivity [43]. A study by Heruatmadja et al. [44] reviewed biometric techniques for identification in VR and highlighted the accuracy of finger vein and hand movement biometrics, using machine learning methods such as k-Nearest Neighbors (k-NN) and support vector machine (SVM). These techniques would first need to be made available in HMDs, and could then be used for the first stage of the two-factor authentication process to gain initial access to a system, whereas smartphones could be used for the second step of authentication, i.e., confirming the user’s identity.
Recent research has explored the use of traditional password-based authentication in VR and AR environments [45]. Entering passwords in these settings can be cumbersome and negatively impact the user experience [6,38], highlighting the need for more intuitive solutions. Alternatives such as virtual or touch-sensitive physical keyboards are promising for text entry in VR environments [46]. While physical keyboards are effective in VR, they require external camera-based tracking systems [46]. Touch-sensitive keyboards, which track fingertip movement directly on the keyboard’s surface, offer a potentially intuitive and accessible option for password entry in VR.
Decentralized technologies and self-sovereign identity (SSI) offer a promising alternative for enhancing VR authentication. SSI mitigates vulnerabilities in traditional methods such as predictable passwords and biometric theft by providing a decentralized framework that gives users control over their identity. It incorporates memory-based authentication, in which users create and later recall scenes stored on the InterPlanetary File System (IPFS) and blockchain, reducing risks from centralized data storage [47,48]. However, its success depends on overcoming challenges such as memory recall, technical integration, scalability, and user acceptance.
While the above solutions have explored particular forms of single authentication in XR, this paper focuses on the confirmation-of-identity stage of 2FA. This research is among the first empirical studies to compare different types of challenges involving multi-device authentication while exploring different device configurations.

Using Smartphones in VR

Several research projects have explored integrating smartphones [49,50,51,52,53,54,55] as well as other input devices such as handheld controllers [56,57] and smartwatches [58,59,60,61] in virtual, augmented, and mixed realities. For instance, [62] utilized smartphones as input devices for interacting with displays without expensive tracking devices, while [63] proposed using mobile devices for various interactions in VR. Other handheld devices such as touchpads have also been explored for VR interactions [64].
In [65], smartphones were used for selection tasks and teleport-based navigation in VR, showing that they can be comparable to VR controllers. In [54], the concept of augmented virtuality (AV) was used to access smartphones, emphasizing the importance of including users’ hands and realistic skin tones to enhance interaction. NRXR incorporates these important features. Zhu et al. [66] explored how smartphones should be spatially anchored in VR, suggesting that physically holding smartphones and using direct touch improves accuracy and speed. This concept aligns with how NRXR provides users access to their smartphones in VR.

3. Methodology

To assess the limitations, advantages, and drawbacks of our proposed system, we considered various ways in which two-factor authentication can be implemented. While the number of potential implementations is large, we have focused on four types of challenges for this initial exploration. A user study was conducted in order to compare these four strategies in terms of efficiency and user satisfaction.
Given the variety of possible approaches, we chose to focus on four specific authentication methods: solving a CAPTCHA challenge (a well-established verification method), entering a numeric PIN code (the most common form of two-factor authentication), checkers-style matching (a visual matching challenge suitable for graphical interfaces), and providing an alphanumeric password using a virtual keyboard. Figure 1 illustrates these four forms of authentication, which are described in detail in the next sections.

3.1. CAPTCHA-Style Challenge

In the CAPTCHA challenge, the user is presented with a 3 × 3 grid of images and must select three objects that match the description provided in the dialog showing the challenge request [1]. Only the challenge creator knows which items in the grid correspond to the correct answer. To implement this, the challenge creator sends a general description of the relevant icons to one device and the challenge grid to the second device used for authentication. For example, the user might be asked to select all three images containing animals on one device, as shown in Figure 2A,E, while the second device displays a grid with animals and other content, as shown in Figure 1A. Each 3 × 3 set of tiles contains tiles drawn from three “themes”, and users need to select the three tiles matching the requested theme to solve a challenge. Users must complete two rounds of CAPTCHA challenges, requiring at least six clicks in total to solve the task.
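To make the device split concrete, the sketch below shows how a challenge creator might pair the textual hint sent to one device with the shuffled grid sent to the other; the type name, theme lists, and validation logic are illustrative assumptions, not the paper’s implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: the challenge creator keeps the answer server-side,
// sends the textual hint to the first device and the shuffled grid to the second.
public class CaptchaChallenge
{
    public string Hint;       // shown on the first device
    public string[] Grid;     // nine icon IDs shown on the second device
    HashSet<int> answer;      // indices of the three matching tiles (never transmitted)

    public static CaptchaChallenge Create(Random rng)
    {
        string[] animals   = { "cat", "dog", "horse" };    // the requested theme
        string[] vehicles  = { "car", "boat", "plane" };   // distractor theme 1
        string[] furniture = { "chair", "table", "lamp" }; // distractor theme 2

        string[] tiles = animals.Concat(vehicles).Concat(furniture)
                                .OrderBy(_ => rng.Next()).ToArray(); // shuffle the 3 x 3 grid
        return new CaptchaChallenge
        {
            Hint   = "Select all three images containing animals",
            Grid   = tiles,
            answer = new HashSet<int>(Enumerable.Range(0, 9)
                        .Where(i => animals.Contains(tiles[i])))
        };
    }

    // Only the challenge creator can check the user's three selections.
    public bool Validate(IEnumerable<int> selectedTiles) => answer.SetEquals(selectedTiles);
}
```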

3.2. Numeric Code Challenge

One of the most common forms of two-factor authentication involves sending a six-digit numerical code to the user’s smartphone [21]. Typically, the user receives this code via text message and then enters it on a second device. This method is widely used by banks, email providers, and other online services. In our implementation, we replicate this process in a custom smartphone app. The user receives the numerical code on one device (as in Figure 2B) and is then prompted to enter it on the second device (as in Figure 1B).
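As a minimal illustration of this step, the sketch below generates the six-digit code on the server side; it assumes a runtime exposing `RandomNumberGenerator.GetInt32` (.NET Standard 2.1, which Unity 2020.3 can target) and is not taken from the paper’s implementation.

```csharp
using System.Security.Cryptography;

// Hypothetical sketch: draw the one-time code from a cryptographically secure
// source rather than System.Random, whose output is predictable.
public static class OneTimeCode
{
    public static string GenerateSixDigitCode() =>
        RandomNumberGenerator.GetInt32(0, 1_000_000).ToString("D6"); // zero-padded "000000".."999999"
}
```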

3.3. Checkers Matching Challenge

The checkers-style visual matching challenge was devised as a form of visual authentication suitable for graphical displays. Its tiled arrangement resembles the CAPTCHA-style challenge, and it is designed with visual simplicity in mind. The user receives a grid of checkered tiles consisting of four rows and four columns (4 × 4) on one device (as in Figure 2C) and a second 4 × 4 grid of checkered tiles on the second device (as in Figure 1C). The task consists of flipping tiles between their black and white states by tap or selection until the arrangement shown on the second device matches the arrangement shown on the first device. The two grids differ in exactly six tiles, meaning that the user can solve the challenge with as few as six flips. Compared with the images shown in the CAPTCHA-style challenge, which require recognition of the contents of each picture in the grid, the checkers-style challenge has the advantage that its tiles have the maximum contrast possible and are easily identifiable as either on or off. Finally, the tiles can be easily encoded as a compact bit sequence, encrypted/decrypted, and converted to a visual challenge (or the other way around) once a user has submitted a response.
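The sketch below illustrates this compact encoding under our own naming (not the paper’s code): one bit per tile means a 4 × 4 grid fits in a single 16-bit value that is cheap to transmit, encrypt, and compare.

```csharp
using System;

// Hypothetical sketch of the bit-sequence encoding described above.
public static class CheckersChallenge
{
    // Build a target pattern plus a starting pattern that differs in exactly six tiles.
    public static (ushort target, ushort start) Create(Random rng)
    {
        ushort target = (ushort)rng.Next(0, 1 << 16);
        ushort diff = 0;
        while (CountBits(diff) < 6)
            diff |= (ushort)(1 << rng.Next(16)); // pick six distinct tiles to flip
        return (target, (ushort)(target ^ diff));
    }

    // Toggle one tile between its black and white states.
    public static ushort Flip(ushort grid, int row, int col) =>
        (ushort)(grid ^ (1 << (row * 4 + col)));

    public static bool Solved(ushort grid, ushort target) => grid == target;

    static int CountBits(ushort v)
    {
        int n = 0;
        for (; v != 0; v >>= 1) n += v & 1;
        return n;
    }
}
```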

3.4. Alphanumeric Password Challenge

Alphanumeric passwords are often used as the first line of authentication for users of online systems. Given the inherent vulnerabilities of password-only authentication, two-factor authentication has emerged as a crucial supplementary measure [21]. In this challenge, we kept some of the established rules for creating robust passwords (use of both upper and lower case; a combination of numbers, letters, and special characters). To keep this challenge’s complexity at a level comparable to the other authentication strategies, we restricted the password to a maximum length of six characters, even though passwords are usually required to be at least eight characters long. This keeps the number of required clicks comparable to that of the other three challenges, although users still have to switch between the different character sets on their virtual keyboards to complete the challenge. This type of challenge is illustrated in Figure 1D, Figure 2D, Figure 3 and Figure 4.
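A minimal sketch of these composition rules follows; the exact rule set used in the study is not specified beyond the constraints above, so the checks here are our assumptions.

```csharp
using System.Linq;

// Hypothetical validator for the password rules described above: at most six
// characters, mixing upper case, lower case, digits, and special characters.
public static class PasswordRules
{
    public static bool IsValid(string password) =>
        password.Length <= 6 &&
        password.Any(char.IsUpper) &&
        password.Any(char.IsLower) &&
        password.Any(char.IsDigit) &&
        password.Any(c => !char.IsLetterOrDigit(c));
}
```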

3.5. Configuration Possibilities for 2FA in XR

In NRXR-ID, the smartphone can act as the device showing a 2FA challenge prompt, after which the HMD can be used as the device to provide the solution to the challenge; it is also possible to perform this process the other way around. In either case, none of the challenges can be successfully answered by someone without access to both the HMD and the smartphone during the process of authentication, ensuring that the system meets the two-factor authentication principle that the user must have two separate forms of access in order to complete a challenge. Figure 5 highlights some of the possibilities available for interaction. In this study, we explored three distinct configurations.

3.5.1. Condition 1: HMD1_Phone2

In this condition the HMD is used as the first device for the authentication challenge. Here, the HMD is the device that shows the user the passcode, i.e., the challenge to be solved, which is also described as the “expected response”. The smartphone is used as the second device for authentication, i.e., the device where the user enters and submits their solution to the challenge. The flowchart for this condition is shown in Figure 6A, where it can be seen that this condition requires a client app to be installed on the user’s smartphone in order to capture and submit the responses to the challenges posed in the VR application. Figure 7 illustrates how different challenges are communicated to the participant within the VR and how participants see the smartphone app while wearing the HMD. Figure 1 shows the smartphone app screens corresponding to each type of challenge, and Figure 4C shows the smartphone’s virtual keyboard used for answering the password challenge. A user scenario for this modality is when a bank’s client is accessing their bank account via a web browser shown in VR and needs to confirm their identity for a transaction that has been flagged by the bank, such as a stock market transaction. The bank then generates a token that the user can only enter using a bank-provided app that has been pre-installed on the client’s smartphone. The user then grabs their phone, opens their bank’s app, and enters the token shown in the web browser using the app.

3.5.2. Condition 2: Phone1_SVRP2

In this case the smartphone is used as the first device for authentication, that is, the device that shows the user the passcode with the expected response, i.e., the challenge to be solved (Figure 2, left panel), after which the HMD is used as the second device for the authentication, i.e., the device where the user enters and submits their solution to the challenge (Figure 8). The flowchart for this condition is shown in Figure 6B. Because the challenges can be solved within the VR environment with different input devices, users need a selection mechanism with which to either select tiles from grids, select digits on a virtual keypad, or select characters from virtual keyboards shown in the HMD. For the alphanumeric password, a virtual keyboard (as used by Boletsis et al. [68]) was presented to the users within the VR (shown in Figure 3). Head-based gaze selection using a clicker has been reported as the most preferred gaze-based selection method [67], and is the approach we used. Another possibility is the use of built-in eye trackers for selection [69]; however, most commercial VR HMDs do not provide built-in eye tracking capabilities. To allow users to submit their response while wearing the HMD, we used SmartVR pointer gaze-based selection (SVRP2), a method that supports head-based gaze selection in VR, using the smartphone as a clicker to confirm selections [65]. Similar to Condition 1, this condition enables users to perform the whole authentication process relying solely on the VR HMD and their smartphone. A use case for this type of interaction is when a VR user is watching videos, browsing TV channels, or selecting a movie to watch by using their gaze for pointing and selection of their entertainment choice, then needs to confirm their identity to override a parental control. The system then generates a checkers-style challenge. The user can use their smartphone to solve the challenge shown in VR using gaze-based selection, after which they would have completed the whole process using only their HMD and smartphone.

3.5.3. Condition 3: Phone1_VRC2 (a.k.a. the Baseline)

In this case, the smartphone is used as the first device for the authentication, that is, the device that shows the user the passcode with the expected response, i.e., the challenge to be solved (Figure 2, right panel), and the HMD is used as the second device for the authentication, i.e., the device where the user enters and submits their solution to the challenge (Figure 3 and Figure 9). The flowchart for this condition is shown in Figure 6C. This condition is very similar to Condition 2 above, with one major difference: the standard VR controller provided by the manufacturer of the HMD (VRC2) is used as the input mechanism within the VR. In this condition, the user holds their smartphone with one hand (to see or review the challenge) and the VR controller with the other hand (as shown in Figure 9A). In this way, the user can check their smartphone while solving the challenge using the VR controller. This is considered our baseline condition, as most commercial VR HMDs come with VR controllers used for selection, navigation, and other interactions in VR, and most people perform two-factor authentication by receiving text messages on their smartphones and entering the code in the application that originated the authentication request. In a commercial application, the CAPTCHA, numeric, and password challenges could be submitted via text messages without any app installed on the smartphone, with the checkers challenge being the exception. We chose to use a smartphone client app across all challenges for experimental consistency between conditions, in order to make sure that all tasks provided a consistent look and feel. In a user scenario, a gamer could be using the VR controller while playing as usual and receive a text message asking them to confirm their identity in order to purchase an asset that is particularly expensive or to override a budgetary control set in the gamer’s account preferences, in which case they can continue to use the VR controller to respond to the challenge presented in the VR.

3.6. Overview

Figure 6 shows the flowcharts with the key steps for 2FA using NRXR-ID for each condition. In 2FA, there is an initial login where the user provides their password or PIN (first authentication factor). After that, the system verifies that the first password provided matches the user credentials; if this is the case, then the system allows the user to access the basic features of the application. If the user arrives at a situation that generates an additional need for authentication (step 1 in the flowcharts), such as when trying to make an expensive transaction or access sensitive information, the system prompts the user for a second different authentication factor, which is illustrated in steps 2–5 in the flowcharts. If the second authentication factor is correct (steps 6–8 in the flowcharts), the user is granted full access. A video overview for the challenges and conditions evaluated for NRXR-ID is provided in the supplementary materials accompanying this publication.
The traditional two-factor authentication that most people are familiar with today is modeled by the conditions Phone1_VRC2 (the baseline condition) and Phone1_SVRP2, under which users who would normally be using a PC or another device that triggers the second-factor request are instead in VR, where the HMD hosting the VR app issues the authentication challenge request. Users in these conditions receive the expected response or passcode on the second-factor device (their smartphone), look at the validation challenge containing the expected response on the smartphone, and solve it in VR while wearing the HMD. In the reverse condition (HMD1_Phone2), the roles of the devices are swapped; the client app on the phone merely captures the user’s response to the challenge posed within the VR (which now shows the expected response) and submits it to the server application. This can only occur after the user has seen the expected response or passcode from inside the VR environment. For all conditions, the application that manages the authentication protocol and validates the input is the server application running in the VR system.
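The sketch below summarizes this role assignment in code; the interface and type names are ours (hypothetical), but the structure mirrors the flowcharts: the server application in the VR system owns the protocol, while the two devices merely display the expected response and capture the submission.

```csharp
// Hypothetical sketch of the server-side flow common to all three conditions.
public interface IAuthDevice
{
    void ShowExpectedResponse(string payload); // display-only role (step 2 in the flowcharts)
    void ShowInputUI();                        // capture-and-submit role (steps 3-5)
}

public class TwoFactorSession
{
    string expected; // kept by the server application; never sent to the second device

    public void Issue(string payload, string answer, IAuthDevice first, IAuthDevice second)
    {
        expected = answer;
        first.ShowExpectedResponse(payload); // HMD in HMD1_Phone2; smartphone in the Phone1_* conditions
        second.ShowInputUI();
    }

    // Steps 6-8: grant full access only when the submission matches.
    public bool Validate(string submitted) => submitted == expected;
}
```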
Given the challenge types and conditions described above, we designed a 4 × 3 experiment with four tasks or challenges (the CAPTCHA, checkers, numeric, and password challenges) and three conditions for NRXR authentication delivery, for a total of twelve authentication options. We then performed a user study to evaluate the feasibility and usability of each option and measure users’ performance and preferences on each challenge per condition.

3.7. System Hardware and Software

The system was implemented using the Unity 3D Game Engine, a widely recognized platform for the development of interactive applications and virtual environments. Furthermore, Unity’s compatibility with smartphone development frameworks such as Android SDK allows for seamless incorporation of smartphone capabilities, including touch inputs and mobile-specific functionalities such as tapping, drawing, or processing input from virtual keyboards.
The hardware and software utilized in this implementation included:
  • Meta Quest 2. Featuring a resolution of 1832 × 1920 pixels per eye and a refresh rate of 90 Hz, this headset provides a high-quality visual experience essential for immersion in virtual environments.
    Software Package Requirements: Integration of the Oculus PC app for Meta Quest Link establishes a seamless connection between the headset and the Unity engine. This software enables efficient data transfer and real-time rendering, minimizing latency and maximizing responsiveness in interactions.
  • Intel RealSense Technology Developer Kit (SR300). This depth-sensing camera was incorporated to enhance close-range depth perception, a critical feature for user interactions within a VR context. The camera’s specifications (color resolution of 1920 × 1080 at 30 frames per second, along with an operating range of 0.3 m to 2 m) enable precise spatial awareness and tracking of user movements. Such capabilities are particularly important in tasks requiring accurate depth recognition because they allow the system to interpret user actions in real time and respond appropriately, thereby facilitating a more intuitive interaction model [70]. Depth field of view: H = 73°, V = 59°, D = 90°. Auto-exposure: off. Brightness level: 350 (set inside Unity).
  • ZTE Z557BL Smartphone (ZTE, Shenzhen, China). This device has a touchscreen resolution of 480 × 854 pixels along with basic processing capabilities suitable for the application’s requirements. The smartphone has 1.0 GB RAM and 8 GB storage. Android version: 8.1.0. Dimensions: 14.53 × 7.19 × 0.91 cm.
  • Software requirements. We used the SteamVR Runtime for Windows along with the Unity-compatible Intel RealSense SDK 2.0 to ensure cohesive operation among all hardware components within the Unity environment. We used Unity version 2020.3.25f1.
Figure 10a illustrates the two-way interaction involving the HMD, smartphone, and human operator. In different scenarios, the request for the code may be initiated by either the HMD or the smartphone, while the second device is used to input the code. Figure 10b,c shows front and side views of an experimental setup with the depth-sensing camera attached to the headset; the camera can be easily tilted to adjust the view angle for more convenient usage depending on the user’s height, arm length, and preferred phone positioning.

3.7.1. Blending of the RGBD Camera Feed

In this implementation of NRXR, the Intel RealSense RGB-D camera is mounted on the Meta Quest 2 headset. The Intel RealSense camera combines a depth-sensing camera and a regular video camera. In Unity, the RGBD stream from the camera is projected as an RGBD texture onto a transparent window placed in front of the viewer. To blend the objects in the near range with the VR imagery, we use a fragment shader that takes the camera input as an RGBD stream and adjusts the transparency of each pixel before sending the image to the display. If the depth value for a given pixel exceeds a certain distance threshold (∼120 cm), we set that pixel to be completely transparent/passthrough so that it does not block the view of the VR imagery behind the transparent window, allowing users to see most of the VR environment. A smoothing function for the alpha value is applied to pixels near the distance threshold to ensure that objects fade in and out smoothly, while all pixels below the threshold remain opaque.
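A C# mirror of that per-pixel logic is sketched below for clarity; the real implementation runs as a fragment shader on the GPU, the ∼120 cm threshold comes from the text, and the fade-band width is our assumption.

```csharp
using UnityEngine;

// Hypothetical CPU-side mirror of the fragment shader's alpha computation.
public static class NearRangeBlend
{
    const float ThresholdMeters = 1.20f; // beyond this distance, pixels become fully transparent
    const float FadeBandMeters  = 0.10f; // smoothing region around the threshold (assumed width)

    // Returns 1 (opaque) for near pixels and 0 (transparent) for far pixels,
    // with a smooth ramp around the threshold so objects fade in and out.
    public static float AlphaForDepth(float depthMeters)
    {
        float t = Mathf.Clamp01((depthMeters - (ThresholdMeters - FadeBandMeters))
                                / (2f * FadeBandMeters));
        t = t * t * (3f - 2f * t); // same easing as the HLSL smoothstep intrinsic
        return 1f - t;
    }
}
```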

3.7.2. Preventing Overexposure During Smartphone Display Capture

To see the smartphone from within the HMD, the user has to place their smartphone in front of the HMD so that the phone is in front of the RGBD camera. Most cameras come with a dynamic auto-exposure function which automatically adjusts exposure according to the brightness levels of the captured scene; this ensures that video frame images are not too dark or bright under varying lighting conditions. This auto-exposure feature usually performs well for most applications; however, when bright screens such as that of a smartphone are present in the captured video, there is often overexposure in the regions where the bright object appears in the captured image. In the case of smartphones, the unintended consequence is that the region of each video frame that captures the smartphone screen may become so bright that it appears virtually blank due to overexposure. This tends to happen even more often when the room illumination is low and when users bring the smartphone screen closer to the HMD. Under these circumstances, users would have a hard time reading instructions and using their smartphone inside VR. To deal with this issue, we turned off auto-exposure on the RGB-D camera and set it to a low level of exposure, allowing the text and other contents of the smartphone screen to be viewed by the users. As illustrated in Figure 4, adjustments of around 350 ± 100 units of exposure for the Intel RealSense camera within the Unity profiler provided the best results for visibility, ensuring that the smartphone screen remained readable. An unintended consequence is that a small portion of the surroundings of the smartphone becomes darker. In the proposed setup this is usually not problematic, as the focus is not on capturing surrounding objects but on allowing the user to focus on the information presented on the smartphone screen.
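For reference, the configuration step might look like the sketch below when using the Intel RealSense SDK 2.0 C# wrapper; the option names follow the librealsense bindings, but exact API details vary by SDK version, so treat this as an assumption-laden sketch rather than the paper’s code.

```csharp
using Intel.RealSense;

// Hypothetical sketch: disable dynamic auto-exposure on the color sensor and
// pin a low fixed exposure so the smartphone screen stays readable in VR.
public static class CameraExposure
{
    public static void UseFixedExposure(Sensor colorSensor, float exposure = 350f)
    {
        colorSensor.Options[Option.EnableAutoExposure].Value = 0f; // turn off auto-exposure
        colorSensor.Options[Option.Exposure].Value = exposure;     // ~350 +/- 100 worked best in this setup
    }
}
```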

4. Experimental Design

After ethical approval was received from an Academic Committee on Ethics in Human Research, we performed a 4 × 3 within-subjects user study with N = 30 participants in a VR scenario emulating a walk through an old town dotted with teleport markers. A pre-test questionnaire was administered prior to the experiment to determine the demographic characteristics of the participants.
Participants were recruited through email distribution lists and postings on social media. After signing the ethics consent form for data collection and completing the demographic questionnaire, participants received a short set of instructions and proceeded to begin the experiment. Figure 1, Figure 2, Figure 3 and Figure 4 and Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 illustrate the VR scenarios and conditions that we employed in our study. Participants first went through fifteen rounds of the VR environment, using a VR controller to navigate via teleport markers. Afterwards, they completed another fifteen rounds navigating the environment in the same way using the smartphone and the SmartVR pointer technique [65]. Participants were then randomly assigned to one of six groups, each exposed to the three conditions in a different order (six orders were possible, so each group had five participants). For each condition, participants started with one trial round (not measured). After the trial round, each participant performed five measured rounds plus one round for collecting feedback, completing seven rounds per condition in total. During each round, users encountered the challenges in the following order: CAPTCHA, numeric, checkers, and password. Users moved from one challenge to the next using teleport locations directing them to the next challenge. The teleporting task provided a washout period between challenges. We measured the time it took participants to complete each challenge from the moment they arrived at the challenge until they solved it, excluding the time spent teleporting from one challenge to the next. For instance, if a participant failed to travel to the start of a challenge on the first attempt and succeeded on the second, thereby taking longer than others, this did not affect the task completion time recorded for the challenge at which the participant arrived. We also recorded the number of unsuccessful attempts for each challenge.
To capture participants’ perceptions and experiences, on the last round we collected their feedback using seven-point Likert scales, asking the following questions immediately after each challenge:
  • How much did you like this way of authenticating?
    1 (I did not like it at all) 2 3 4 (Neutral) 5 6 7 (I liked it very much)
  • On a scale from 1 to 7, how effective did you find this way of authenticating?
    1 (Not effective at all) 2 3 4 (Neutral) 5 6 7 (Very effective)
  • On a scale from 1 to 7, how easy to use did you find this way of authenticating?
    1 (Not easy at all) 2 3 4 (Neutral) 5 6 7 (Very easy)
We chose to use a small set of questions because these three questions embody key aspects of user experience involving user preferences, perceptions of system effectiveness, and ease of use, as well as to avoid user fatigue. Question 1 maps to questions 1 and 8 of the System Usability Scale (SUS) questionnaire [71] and to question 7 of the NASA Task Load Index (TLX) questionnaire [72]. Question 2 maps to questions 2, 5, 6, and 8 of the SUS. Question 3 maps to the Single Ease Question (SEQ) [73] and to questions 3, 4, 7, and 8 of the SUS questionnaire.
At the end of the experiment, we also collected post-use fatigue measurements using an adapted version of Kennedy’s Simulator Sickness Questionnaire [74] and unstructured feedback related to the participants’ experience using a paper-based questionnaire with open-ended questions. The questions included the request to provide users’ feedback on whether they preferred any particular methods, challenges, and input type conditions provided to them during the experiment, what symptoms the VR experience may have caused, and what tasks or interactions they found to be easiest and hardest. We did not collect measures such as the SUS or the TLX for each combination, as it would not have been practical to ask participants to answer full TLX or SUS questionnaires for each of the twelve combinations of task and condition. Instead, we prioritized obtaining specific information on user preferences, likes and dislikes, and perception of effectiveness and ease of use for each combination of challenge and condition.

Data Analysis

To test whether all six order groups’ composition in terms of participants’ gender, age, occupation, and previous HMD experience was as expected from a random participant group allocation, we performed Pearson’s chi-square tests with p-values calculated by Monte Carlo simulation with 100,000 replicates. To evaluate differences on participants’ task completion time and number of clicks between conditions, we performed ANOVA analyses for each of the four tasks. In the ANOVA analyses, the effects of the order and round number were considered. We used Tukey’s test as a post hoc analysis for pairwise comparison of the means between conditions, orders, and rounds. To test whether participants’ responses to the Likert-scale questions in the user experience questionnaire differed among conditions, we performed pairwise Wilcoxon tests with false discovery rate (FDR) correction for multiple testing. All statistical data analyses were conducted in R (version 4.3.1) and plots were created using the R data visualization package ggplot2 (version 3.4.3).

5. Results

5.1. Participant Demographics

Out of the 30 participants, 40% (12/30) were aged 18 to 24, 56.67% (17/30) were aged 24 to 48, and one (3.33%) was between 48 and 65. About 36.67% (11/30) were computer science students, both graduate and undergraduate, while another 40% (12/30) were non-computer science students; the remaining seven (23.33%) were staff, faculty, or alumni. One-third of participants (10/30) identified as female and two-thirds (20/30) identified as male.
Almost half of the participants (14 people, 46.67%) had never used an HMD. A third (33.33%, 10/30) had used an HMD for less than a month in total, 10% (3/30) had between 1 and 6 months of experience with HMDs, and another 6.67% (2/30) had significantly more experience (6 months to 2 years). One participant preferred not to answer the question regarding HMD usage experience.

5.2. Performance Metrics

Overall, participants took the longest to complete the password challenge and were the quickest to solve the numeric challenge (Table 1). The independent variable that most significantly affected the overall completion times was the challenge (ANOVA p-value < 2 × 10⁻¹⁶), followed by the interaction between challenge and condition (ANOVA p-value = 2.72 × 10⁻¹⁵, Table 1). The effect of the condition on the overall completion times was significant, with an ANOVA p-value of 0.0002. When looking at the effect of the condition on each challenge, we found that in three of the challenges (password, CAPTCHA, and numeric) the factor that most affected participants’ completion time was the condition (Table 2). The checkers challenge performed consistently across all conditions with no significant differences (p = 0.185), making it robust to different implementation choices. The effect of the order in which the conditions were presented to the participants and the effect of the round number both varied substantially between the challenges. For instance, these two factors (round and order) were statistically significant for the CAPTCHA and checkers challenges. For these two challenge types, participants’ speed increased after the first round. This was most evident for the CAPTCHA challenge, where participants were fastest in the last round (see Table 3). The round effect on the numeric and password challenges was not statistically significant. These results indicate that it is necessary to take the authentication challenge into consideration when selecting a configuration modality, as we discuss below.

Interactions Between Challenge Types and Conditions

As expected from the ANOVA results, there were statistically significant differences in mean completion time per condition for the CAPTCHA, numeric, and password challenges (Figure 11 and Table 1). The largest observed differences in participants’ performance were the following: in the password challenge, participants were on average 8 s faster in the Phone1_SVRP2 condition than in the Phone1_VRC2 condition; in the numeric challenge, participants were on average 3.2 s faster in the HMD1_Phone2 condition than in the Phone1_VRC2 condition; finally, in the CAPTCHA challenge, participants were 5.3 and 4.8 s slower in the HMD1_Phone2 condition than in the Phone1_VRC2 and Phone1_SVRP2 conditions, respectively. For HMD1_Phone2, users performed best in the numeric and checkers challenges. For the numeric challenge, users performed best in two out of the three conditions, with the Phone1_VRC2 condition being the slowest. Users performed well on the CAPTCHA in both Phone1 conditions, where they were on average 5 s faster than in the HMD1_Phone2 condition.
As mentioned previously, the minimum number of clicks required to complete any of the challenges was six. On average, participants required 7.96 ± 2.97, 8.3 ± 2.04, 8.8 ± 1.56, and 13.97 ± 7.99 clicks to complete the numeric, checkers, CAPTCHA, and password challenges, respectively. The number of clicks was not significantly affected by the condition, order, or round for the CAPTCHA and numeric challenges. Condition was the only significant factor for the checkers (F value 18.72, p-value 1.57 × 10⁻⁸) and password (F value 24.93, p-value 5.55 × 10⁻¹¹) challenges. For the checkers challenge, participants needed on average 1.4 and 0.85 more clicks to complete the challenge in the Phone1_SVRP2 and Phone1_VRC2 conditions, respectively, than in the HMD1_Phone2 condition (Figure 12). For the password challenge, participants needed on average 3.74 and 6.15 more clicks to complete the challenge in the Phone1_SVRP2 and Phone1_VRC2 conditions, respectively, than in the HMD1_Phone2 condition (Figure 12).
We tracked the number of unsuccessful attempts made by the participants and calculated the success rate for each combination of challenge and condition as the percentage of successful attempts over total attempts (Table 4). Overall success rates ranged from 90% to 93% across conditions. The challenge with the highest success rate (96.33%) was the numeric challenge, followed by CAPTCHA and checkers. The condition with the highest overall success rate was Phone1_SVRP2 (93%). The challenge with the lowest success rate was the password challenge (88.67%).
Unsurprisingly, there was a significant positive correlation between completion time and number of clicks. This correlation was strongest for the password challenge (Spearman’s ρ = 0.53, p-value < 2.2 × 10⁻¹⁶) and weakest for the CAPTCHA challenge (Spearman’s ρ = 0.15, p-value = 0.002).

5.3. Participant Feedback

5.3.1. Structured Feedback Analysis

In terms of participants’ structured feedback (using Likert scales), participants preferred the Phone1_SVRP2 and Phone1_VRC2 conditions (FDR-adjusted Wilcoxon p-values of 0.007 and 1.9 × 10⁻⁵, respectively) to the HMD1_Phone2 condition (see Figure 13). Similarly, participants found the Phone1_SVRP2 and Phone1_VRC2 conditions more effective than the HMD1_Phone2 condition (FDR-adjusted Wilcoxon p-value of 0.003 for both). Following the same trend, participants perceived the Phone1_SVRP2 and Phone1_VRC2 conditions (FDR-adjusted Wilcoxon p-values of 0.003 and 0.0001, respectively) as easier to use than the HMD1_Phone2 condition. Participants’ scores for the Phone1_SVRP2 and Phone1_VRC2 conditions were comparable.
Regarding the challenges, the participants disliked the password challenge (FDR-adjusted paired Wilcoxon p-values of 2.4 × 10⁻¹³) more than any of the other three challenges. Participants preferred the checkers challenge to the CAPTCHA and numeric challenges (FDR-adjusted paired Wilcoxon p-values of 0.009 and 0.023, respectively), and liked the numeric challenge slightly more than the CAPTCHA challenge (FDR-adjusted paired Wilcoxon p-value of 0.067). Similarly, participants found the password challenge to be less effective (FDR-adjusted paired Wilcoxon p-values of 1.4 × 10⁻⁷) than any of the other three challenges, and found the checkers challenge to be more effective than the CAPTCHA and numeric challenges (FDR-adjusted paired Wilcoxon p-values of 0.028 and 0.040, respectively). There was no statistically significant difference in perceived effectiveness between the numeric and CAPTCHA challenges. Finally, the participants found the password challenge to be more difficult (FDR-adjusted paired Wilcoxon p-values of 6.7 × 10⁻¹⁴) than any of the other three challenges. There was no statistically significant difference in how difficult participants found the other three challenges. In addition, participants did not report post-use fatigue or other adverse symptoms on the simulator sickness questionnaire, with the exception of one person who experienced temporary dizziness after completing the experiment.

5.3.2. Unstructured Feedback Analysis

In terms of unstructured feedback, not all participants expressed a preference for a particular combination of challenge and condition; however, the combination most frequently mentioned as best was the checkers challenge under condition 2, i.e., Phone1_SVRP2. Exemplifying this view, a participant commented: “Reading the codes from the smartphone’s screen seemed the most efficient to me due to the possibility to simultaneously look and read the codes and enter the answer with head movement. Liked the idea of authentication via “checkers” system, it was the most entertaining one.”
Participants expressed their preferences regarding the best and worst challenges provided during the experiment by answering the unstructured feedback section of the post-questionnaire. The results of the unstructured feedback regarding the challenges are summarized in Table 5. As shown in the table, comments were read and sorted into four categories per challenge. We then calculated the percentage of comments in each cell relative to the total number of comments (42 comments, 23 positive and 19 negative in total). By far the most positively commented challenge was the checkers challenge, with no negative comments and with 26% of all comments mentioning it as either a good challenge or the best. The runner-up was the numeric code, with 16.67% of comments being positive and 7.14% negative. The CAPTCHA challenge received evenly mixed replies, with 9.52% positive comments and 9.52% negative. The password challenge caused users to struggle the most, as evidenced by 28.57% of comments about it being negative and only 2.38% positive. To exemplify, one of the participants commented: “With respect to entering the code from the keypad of the phone, I almost got the feeling that my grandpa gets while he enters a text message.”
We performed a similar analysis for the three conditions. Regarding these, we registered 42 comments in total, with twice as many positive as negative entries (28 positive vs. 14 negative). Participants overwhelmingly preferred the Phone1_SVRP2 condition over the two other conditions, with 35.71% of all comments liking most or some of its features and only 2.38% disliking some aspect of it. The second-most preferred condition was Phone1_VRC2, with 23.81% comments liking some or most of it but 11.9% disliking some or most of it. The HMD1_Phone2 condition was the least preferred, with 7.14% of comments liking some or most of it but 19.05% disliking some or most of it. This overall negative balance between positive and negative comments may stem from the difficulties participants encountered when completing the CAPTCHA and password challenges. See Table 6 for the detailed results.

5.4. Summary of Results

When considering the four challenge types, users were fastest on average with the numeric PIN challenge, followed closely by the checkers and CAPTCHA challenges, with the keyboard password challenge a distant fourth. Across the structured feedback, users preferred the checkers challenge the most, followed by the numeric PIN challenge. In terms of unstructured feedback, the CAPTCHA challenge received equal amounts of positive and negative remarks, while the alphanumeric password received the most negative comments.
The effect of the condition on participants’ completion times varied across challenges. While the password challenge seems the least suitable for use in a VR environment overall, participants completed this challenge much faster in the Phone1_SVRP2 condition, highlighting the convenience of gaze-based interaction supported by a smartphone. For the checkers-style challenge, the condition did not impact users’ performance, which speaks in its favor as offering consistent performance across conditions. For the numeric PIN challenge, participants were fastest in the HMD1_Phone2 condition, and the PIN was also the fastest challenge under the Phone1_SVRP2 condition. For the CAPTCHA challenge, the Phone1_SVRP2 and Phone1_VRC2 conditions allowed participants to complete the challenge much faster than the HMD1_Phone2 condition, which may be explained by the fact that it is easier to recognize the tiles’ contents when they are shown in VR than when they are shown via NRXR. In general, users were fastest under the Phone1_SVRP2 condition, and they perceived the Phone1_SVRP2 and Phone1_VRC2 conditions more positively than the HMD1_Phone2 condition.

6. Discussion

With respect to Research Question 1 (RQ1), we observed that all participants were able to complete all challenges under all three conditions. This is encouraging given the variety of the authentication challenges presented. The success rates we observed provide strong support for the view that it is possible to use NRXR-ID to support users in authenticating their credentials via two-factor authentication, and that users are able to fully identify text, digits, images, and patterns without removing their HMDs. However, we detected significant differences between the different challenges and conditions overall, as described below.
With respect to RQ2, regarding which type of authentication challenge is most suitable for implementing two-factor authentication in the VR context, our results indicate that, in addition to the numeric PIN, the checkers challenge was among the two most suitable for deployment, as it was the most liked, rated the most effective, and the easiest to use across all conditions. While some participants excelled in some challenges under certain configurations, the checkers challenge produced good results overall and was consistent across all conditions, with no significant differences (p = 0.185) in average execution times, making it robust to different implementation choices. In addition, it was clearly the most preferred in the unstructured feedback, where it was the only challenge to receive more positive comments than the numeric PIN challenge. We believe that the checkers-style matching challenge was preferred the most by users in part because of a human factor known as the novelty effect, in which new methods are preferred over well-established ones. In addition, the unstructured feedback analysis suggests that even though some participants found the numeric PIN challenge to be the most efficient, the checkers-style challenge was found to be entertaining; we believe that this mix of novelty and entertainment value might have influenced the participants’ preference towards the checkers challenge. We also observed a round effect for the checkers challenge (see Table 3), with users taking about 2.5 s longer in the first round compared to the subsequent rounds. We believe that this was because participants were still learning how to solve the checkers challenge in the first round; for rounds 2 to 5, the results were quite similar, suggesting that by then participants were familiar enough with the challenge to solve it in a consistent amount of time.
The numeric PIN challenge was consistently a good alternative to the checkers challenge, and has the advantage of being the method most users are probably familiar with. It does not require a smartphone app in the Phone1 conditions. In addition, it allowed for the fastest average completion times among all four challenge types and showed the highest success rate under the Phone1_SVRP2 condition. Because of its familiarity, there were no statistically significant effects based on the round in which the challenge was completed, indicating that no learning effects were present.
The CAPTCHA challenge was third in performance and was ranked neutral to quite positive in user preferences, but was rated as the least liked, least effective, and least easy to use in Condition 1 (HMD1_Phone2). Despite this, it had the third-highest success rate in the Phone1_VRC2 condition and the fourth-highest in the Phone1_SVRP2 condition when considering all combinations of challenge and condition. We believe the reason for this result is that it is harder to recognize the shapes in the tiles through the NRXR technique than when they are shown directly on the HMD. The CAPTCHA challenge also exhibited the strongest round effect, with the checkers challenge showing the second-strongest (see Table 3). We believe that this was because participants were learning how to deal with the challenge over the course of the rounds, and may have learned to identify groups of three related items more quickly.
Of all four challenges, it is quite clear that the password challenge is the least suitable for the second step of authentication, i.e., confirmation of the user’s identity. Not only does it involve recognizing a larger set of letters and special characters, but each character also occupies a smaller area of the display. This is especially true in the HMD1_Phone2 condition. The password challenge is also the most complex to use, in that users need to switch between special, upper-case, and lower-case key sets; thus, even though users were asked to enter six characters, as in the other challenges, the complexity of the task was greater due to the need to switch between keyboard sets. In addition, the penalty for correcting a mistake was larger for keyboard users, as they needed to delete correctly entered characters until they reached the character that had been entered incorrectly in the first place. All of these factors may explain why, even though participants had a larger virtual keyboard in Conditions 2 and 3 (Phone1_SVRP2 and Phone1_VRC2), they still ranked the password challenge lowest among all four challenges. On the other hand, it can be argued that the password challenge is the most robust to a brute-force attack, as it allows for approximately 100 billion configurations in this experimental setup, and even more with a larger number of characters. Since it is one of the most reliable forms of authentication in non-VR environments using 2FA [21], we believe the password challenge could be used as the first line of authentication in VR, i.e., when the user first gains access to the XR environment, after which one of the three other methods could be used to confirm the user’s identity as the need arises.
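As a rough check on that figure: assuming the virtual keyboard exposes about 68 distinct characters across its lower-case, upper-case, and special-character key sets (an assumption on our part, chosen because it reproduces the cited figure), the number of six-character passwords is 68⁶ ≈ 9.9 × 10¹⁰, i.e., roughly 100 billion, whereas a six-digit PIN offers only 10⁶ = 1,000,000 combinations.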
With respect to RQ3, concerning whether it is more effective to present the challenge on the smartphone and answer it within VR (conditions Phone1_SVRP2 and Phone1_VRC2) or vice versa (condition HMD1_Phone2), our results show that participants found it less effective overall to present the challenges on the HMD first and respond using the smartphone (HMD1_Phone2). This is mostly because both the CAPTCHA and password challenges were found to be least effective under this condition. When the challenge was checkers or numeric, users found both variants equally effective. This is also why we suggest that the most suitable method for authentication is the checkers matching challenge, with the numeric challenge second, as they are more robust to changes in the delivery condition. CAPTCHA was generally ranked higher in the Phone1 conditions (Phone1_SVRP2 and Phone1_VRC2) than in the HMD1_Phone2 condition. Overall, the condition with the highest success rate was Phone1_SVRP2, followed by Phone1_VRC2.
With respect to RQ4, collecting both structured and unstructured feedback was useful for obtaining insights regarding the participants’ experiences, impressions, and preferences. Overall, it is clear that the percentage of participants who preferred the checkers challenge was comparable to the percentage of participants who disliked the password challenge; thus, these techniques are at opposite ends of the spectrum. The numeric challenge was the second-most preferred challenge, after checkers. The percentages of positive and negative views on the CAPTCHA challenge balanced each other across conditions. With the exception of the password challenge, users found the other three challenges fairly easy to use irrespective of the interaction modality. In terms of the different conditions, a much larger proportion of participants had a positive impression of the Phone1_SVRP2 condition than of the other two conditions.
In terms of the interaction between conditions and challenges, it is interesting that while the password challenge performs very differently across conditions, the checkers challenge remains stable. We believe this occurs because the checkers challenge is visually cleaner and simpler than the alphanumeric keyboard, allowing it to be used consistently across conditions, much like the numeric PIN challenge. On the other hand, the password challenge, and to some extent the CAPTCHA challenge, become harder to solve when seen through the passthrough camera. In the case of CAPTCHA, the icons portraying images of animals, vehicles, stairs, etc., become harder to make out through the passthrough video, making this challenge particularly hard under Condition 1. Similarly, for the password challenge, each key occupies a smaller region of the smartphone screen, an effect worsened by passthrough and the HMD’s native resolution (see Figure 4). This makes either of the Phone1 conditions a better candidate for the password challenge, where the answer can be provided in VR with a clear view of the virtual keyboard (see Figure 3). However, we noticed that the password challenge also performed poorly under Condition 3 (Phone1_VRC2): Table 4 shows a lower success rate for the password challenge under the VRC2 condition, and Figure 12 shows a larger average number of clicks for VRC2 than for SVRP2. This suggests that VRC2 has an inherent disadvantage with respect to SVRP2 when operating virtual keyboards, and that gaze-based interaction (used in SVRP2) outperforms the VR controller, perhaps because the laser-pointer-style interaction makes it harder to select keys from a larger set of targets.

6.1. Security Considerations and Potential Vulnerabilities

While 2FA offers increased security in non-VR settings, VR-specific vulnerabilities such as potential gaze exploitation [12] and de-anonymization attacks [15] may still remain in NRXR-ID. On the other hand, man-in-the-middle or object-in-the-middle attacks [13,14] should become more difficult, as attackers would have to interfere with the video feeds of the HMD’s built-in cameras or the user’s smartphone. With respect to the vulnerabilities associated with the proposed use of NRXR, it is important to ensure that hackers and collaborators cannot access the HMD’s built-in camera feeds over a network connection, which would allow them to see the user’s smartphone. This can be accomplished by implementing hardware encryption or a software lock such as visual cryptography [75], which allows messages to be perceived by human observers but not decoded by their devices. Ideally, this mechanism should be activated during authentication in such a way that the NRXR feed is visible only to the person wearing the HMD.
With respect to the challenges explored in the study, it is worth noting that some participants reported that the CAPTCHA challenge was “relatively easy to guess”. We believe this impression may stem from the fact that CAPTCHA tiles share information across multiple tiles, whereas the other challenges do not require any internal consistency (e.g., knowing one digit of the six-digit PIN does not help in knowing the other five, as all six digits are independent of each other). Smart CAPTCHA users may identify a “theme” (i.e., a group of images that belong together) and could make a guess even without receiving the challenge. To counter this, larger grids of tiles (e.g., 5 × 5) would help reduce the risk. However, each tile would then become visually smaller (less screen real estate), and its contents might become harder to recognize, which would make the challenge harder to solve and potentially more frustrating for users. In addition, CAPTCHA may be vulnerable to a brute-force attack combined with machine learning, where machine learning is used to identify themes in the tiles and make educated guesses as part of the brute-force search. For these reasons, we believe the CAPTCHA challenge may be the most vulnerable to this type of attack. With respect to the checkers challenge, we used a 4 × 4 grid in the experimental setup, providing 2¹⁶ (or 65,536) different combinations. For increased robustness, the grid could be expanded to a 5 × 4 grid, yielding 2²⁰ possible arrangements, a little more than a million different combinations, which would make the checkers challenge as hard to crack by brute force as the six-digit numeric code, which provides one million options. Finally, the alphanumeric password was limited to six characters, whereas most current secure password guidelines recommend at least eight characters. Because the results clearly show that this was the least preferred option and the one that took the longest, it can be expected that increasing the length of the password to eight characters would only make this option take even longer and be even less preferred by users. To address this issue, it has been suggested that such passwords be handled using password managers, which when combined with 2FA offer the most robust combination in non-VR settings [21,76]. This might also be the case when using XR systems. However, the password manager system in the XR device would then need to be protected from misuse, and the HMD would need to be reviewed for potential vulnerabilities, as suggested by Sha et al. [11].
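The search-space sizes discussed in this subsection are easy to tabulate; the following sketch does so (the 68-key password alphabet is the same assumption as above):

```python
# Back-of-the-envelope search-space sizes for the challenge types.
# Each checkers cell is binary (marked/unmarked), so a w x h grid
# yields 2**(w*h) patterns; the 68-key alphabet is an assumption.
search_space = {
    "checkers 4x4": 2 ** (4 * 4),        # 65,536
    "checkers 5x4": 2 ** (5 * 4),        # 1,048,576
    "numeric PIN (6 digits)": 10 ** 6,   # 1,000,000
    "password (6 chars, ~68 keys)": 68 ** 6,
}
for name, size in search_space.items():
    print(f"{name}: {size:,}")
```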

6.2. Ethical Considerations for Real-World Deployment

There are several ethical considerations to be taken into account when using NRXR-ID for authentication in higher-stakes real-world scenarios. First, all communications between the server and the client need to use encrypted channels. Submitting passcodes or expected responses through unencrypted means would compromise the integrity of the process, as a sophisticated intruder might be able to eavesdrop and discover the correct response. As mentioned in the introduction, another potential attack is de-anonymization, by which an intruder attempts to discover the identity of a user by combining data that are privately accessible to the user with publicly available data. To maintain the security of NRXR-ID protocols, we suggest preventing the streaming of the authentication session to other participants or users, in particular by limiting access to the audio and video streams from the built-in cameras and microphone. Otherwise, a person receiving the video stream from the HMD’s cameras could see the contents of the smartphone or hear the user speaking the password, thereby learning the password or expected response. NRXR-ID mitigates these problems by requiring the expected response to be entered through interaction with the scene displayed on the HMD, which is usually exclusive to the person wearing the HMD. In this way, a person who does not have access to the HMD would not be able to complete the authentication even if they saw the expected response through other means, such as shoulder surfing.
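To illustrate the server-side hygiene implied above, the sketch below generates the expected response with a cryptographically secure source and compares submissions in constant time. This is a minimal sketch of standard practice under the stated assumptions (TLS protecting the channel itself), not the protocol used in the study.

```python
# Minimal sketch of server-side challenge handling: generate the
# expected response with a CSPRNG and compare submissions in constant
# time to avoid timing side channels. TLS is assumed for transport.
import hmac
import secrets

def new_pin_challenge(digits: int = 6) -> str:
    """Return a random numeric PIN, e.g., '048291'."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))

def verify_response(expected: str, submitted: str) -> bool:
    """Constant-time comparison of the submitted response."""
    return hmac.compare_digest(expected, submitted)

expected = new_pin_challenge()
print(verify_response(expected, expected))   # True
print(verify_response(expected, "000000"))   # almost certainly False
```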
Regarding accessibility for users with disabilities, it is fair to say that some NRXR-ID conditions lend themselves to use by people with disabilities much better than others. In particular, users with limited mobility or limited use of their upper limbs would be able to use Condition 2 (Phone1_SVRP2), which allows people to successfully perform authentication using gaze-based interaction. In this case, tapping or selection in VR can be accomplished by alternative mechanisms, such as blinking the eyes or dwelling longer on the item to be selected, as sketched below.
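The following illustrates a frame-loop dwell selector of the kind just mentioned; the 1.2 s threshold and all names are illustrative choices, not values from the study.

```python
# Illustrative dwell-based selection: an item is "clicked" once the
# gaze has rested on it continuously for a threshold duration.
import time

class DwellSelector:
    def __init__(self, threshold_s: float = 1.2):  # assumed threshold
        self.threshold_s = threshold_s
        self._target = None
        self._since = 0.0

    def update(self, gazed_item):
        """Call once per frame with the item under the gaze ray (or
        None). Returns the item when a dwell completes, else None."""
        now = time.monotonic()
        if gazed_item is not self._target:
            self._target, self._since = gazed_item, now
            return None
        if self._target is not None and now - self._since >= self.threshold_s:
            self._since = now  # rearm so the item can be reselected
            return self._target
        return None
```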
Other ethical concerns regarding human factors in XR have been discussed by Abraham et al. [77], who identified two additional sets of key concerns on top of security, namely, privacy and influence over behavior. While these concerns are valid and relevant, they are outside the scope of this article. For general guidelines on the ethical use of XR that adequately supports privacy, interested readers can refer to the guidelines and documents from the XR Safety Initiative (XRSI) and its privacy framework [78].

6.3. Limitations

We have identified several limitations of the current setup. The first is the presence of cables. Cables are problematic for several reasons: first, they limit the mobility of users, who cannot move far from the workstation; second, users can become entangled, for example by turning around; third, they introduce mechanical stress on the HMD through repeated plugging and unplugging of the HMD and external camera, as well as accidental pulling by the user. Ideally, users should operate without a physical cable connecting them to a workstation. With some exceptions, the latest generation of HMDs almost completely liberates users from the need to remain connected to a computer or laptop. We considered the possibility of using wireless cameras, but found that the delay between the user’s actions and the time those actions were reflected in the VR world was a major limiting factor due to transmission times. This issue may be revisited as faster network protocols become the norm. In addition, the depth-sensing camera we used was not designed to be wireless; thus, our setup would require extra hardware to be made entirely wireless, adding to the current burden of wearing both the HMD and the camera itself.
Another limitation lies in the types of challenges that can be used for two-factor authentication in VR. A very large number of possibilities exist for implementing two-factor authentication challenges in the virtual world. One example is the use of gaming-style 3D puzzles, including more complex challenges in which users must rotate an object in 3D space to discover a code to be used on the second device, as well as challenges involving mixed-reality juxtaposition of objects from the real and virtual worlds. As the possibilities are vast, we focused on those that are well established and left the others for future exploration.
A limitation of the user study is that we did not collect additional user evaluation metrics such as cognitive load or disruption to immersion. It would be useful to collect such metrics in a follow-up study. However, as noted by the usability experts of the Nielsen Norman Group, it might not be practical to administer the TLX for each of the twelve combinations, as “It’s a relatively complex questionnaire that needs to be answered after every key task, and so will add a lot of time (and potential participant fatigue) to the overall test process,” “It can disrupt the study flow and make the experience quite a bit less natural for participants than if they progress smoothly through a test scenario,” and “It will often require that the facilitator explain the instrument multiple times” [79]. Still, a full SUS questionnaire could be added for the whole experiment [71], along with the NASA-TLX for the special case of complex mission-critical workflows.
In terms of participant demographics, a little less than half of the participants (46.67%) reported no prior HMD experience. In addition, the participant pool, which predominantly consisted of students, may have skewed the results towards younger and more tech-savvy users. In the future, it would be desirable to expand the participant demographics to include a wider range of ages and occupations in order to obtain more generalizable results.

6.4. Future Work

To achieve a more accessible implementation of NRXR-ID, we are exploring the possibility of using the built-in cameras in HMDs to extract the depth field without the need for an external camera. Only very recently has Meta provided developers with access to the passthrough cameras of its high-end HMDs (Meta Quest 3 and Meta Quest Pro), opening the door to an implementation of NRXR on their systems. The Apple Vision Pro already allows users to see and operate their smartphones in well-illuminated environments. However, we were unable to use it for our studies, as it was released only last year and switching to the development environment of the Apple ecosystem would have taken significant time. In addition, the Apple Vision Pro costs approximately ten times more than a Meta Quest 3 and five times more than a Meta Quest Pro, raising questions about the device’s affordability, how this would limit the availability of the solution to the enterprise segment of the XR market, and how this might reduce the impact or availability of 2FA for the wider XR community.
In a separate line of inquiry, we are working towards an implementation of NRXR that does not use a depth stream at all. Machine learning methods can perform real-time segmentation by detecting smartphones and users’ hands in RGB streams, which could then replace the need for depth estimation (a minimal starting point is sketched at the end of this subsection). These modifications would provide the crucial benefit of eliminating the cable connecting the depth-sensing camera. Another potential improvement would be the use of mid-air hand tracking (also known as hand-based interaction) to let users complete the second part of the two-factor authentication process without a VR controller. Even though this would eliminate the need for the VR controller when answering the authentication challenge in Phone1-style conditions, it would also remove the haptic feedback and the tangible nature of the setup, which has been reported as having its own advantages in the related literature [53,66]. Regarding gesture detection, it is worth noting that no single set of gestures for interacting in VR has become established as an industry standard. For instance, the Apple Vision Pro relies solely on eye tracking in combination with the pinch gesture; in contrast, the Microsoft HoloLens relies on pinching, sliding, and poking in the air for selection, while also supporting other gestures through the Mixed Reality Toolkit (MRTK). Finally, the Meta Quest 3 supports its own set of gestures through the Interaction SDK, which supports both hand tracking and VR controllers. We are planning follow-up user studies comparing the most preferred methods discussed in this article with hand tracking-based and hand gesture-based interaction techniques.
As mentioned above, using the built-in cameras already present in the latest generation of HMDs would yield another potentially significant improvement. During the production of this article, Meta Quest HMDs did not provide developer access to the RGBD or depth streams that would have allowed us to implement NRXR without the external camera. Many HMD manufacturers cannot provide, or do not allow, software developers direct access to the video and depth streams from the built-in cameras, citing potential privacy concerns. The streaming data from the HMD could be a stereo video stream from which depth could be extracted; alternatively, if the HMD has built-in depth sensors, it could be a depth video stream. Having access to the built-in hardware in the HMDs to implement NRXR would be useful for replicating the functionality of the external depth-sensing camera, providing an alternative pathway for a convenient and more accessible implementation.
The next line of improvement involves the use of high-dynamic-range (HDR) and high-definition (HD) video capture. This would improve support for capturing smartphones, smartwatches, tablets, and digital screens in general by providing higher resolution, and would reduce the darkening of unlit elements in the scene such as the user’s hands, providing a more realistic reproduction of the user’s skin tone, for instance. An alternative design that could improve screen readability in VR would be a 2D–3D hybrid setup that tracks the smartphone and shows the user’s hands, following the approach of [54]. That approach relies on a wired camera mounted on the HMD to track the device, but this could also be replaced with access to the RGB stereo streams from the HMD’s built-in cameras, as suggested above.
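One possible starting point for the RGB-only segmentation mentioned above is an off-the-shelf hand-landmark model; the sketch below runs MediaPipe’s hand solution on a webcam stream standing in for the HMD’s passthrough feed. This is only a feasibility sketch: detecting the smartphone itself would require a separate object detector, which is omitted here.

```python
# Sketch of RGB-only hand detection as a building block for depth-free
# NRXR segmentation, using MediaPipe's hand-landmark solution. A webcam
# stands in for the HMD's RGB passthrough stream.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2,
                                 min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV delivers BGR frames.
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        for hand in result.multi_hand_landmarks:
            mp.solutions.drawing_utils.draw_landmarks(
                frame, hand, mp.solutions.hands.HAND_CONNECTIONS)
    cv2.imshow("hands", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```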

Summary of Future Work

The key issues identified for the next steps are: (1) removal of cables and the external depth-sensing camera; (2) implementation of NRXR-ID in the Meta Quest 3 and other commercially available HMDs; and (3) support for hand-based interaction to eliminate the need for VR controllers.
The methods to be used here include wireless networking protocols and high-speed data transmission channels, as well as Meta’s Interaction SDK to support the family of Meta Quest HMDs and make use of Meta’s hand tracking and gesture interpretation libraries. In addition, the use of OpenXR and the Mixed Reality Toolkit (MRTK) will be essential to support other HMDs.
The expected outcomes are: (1) a wireless implementation of NRXR-ID will provide maximum user comfort and will greatly increase the accessibility of the system to the general public, as no additional external equipment would be required; (2) implementation on the Meta Quest 3 and other commercial HMDs will make this solution available for a much wider user base; and (3) hand-based interaction will facilitate the authentication process, as users will be able to use their own hands to provide the expected responses directly on the challenges’ input panels.
Once we have a prototype that incorporates these features, we will be able to perform a formal security evaluation before making this method widely available to the public. Lamsweerde [80] presented a roadmap for formal specifications that represents foundational work in software engineering. However, an outline for a formal security evaluation would need to be tailored to the specific organization implementing NRXR-ID, and would then include the discovery of priorities, data systems, processes, clients, main stakeholders, and organizational concerns. To this end, a formal roadmap for XR will need to be developed, taking into consideration issues around safety and security such as those highlighted by Emteq Labs and the XRSI group [81].
The potential impact of this work on real-world applications is that a much wider range of users will be able to make use of two-factor authentication in VR systems for a variety of applications, from confirming sensitive online banking transactions to purchasing expensive assets in video games using commercially available HMDs. Preliminary pilot testing or feasibility studies outside of the lab environment will provide critical information for real-world deployment.

6.5. Design Implications of the Findings

Despite the aforementioned limitations, the results of this study provide several insights that inform VR interface design decisions related to two-factor authentication.
First, we found that while the current implementation of NRXR used in the study could soon be implemented in a more convenient way (with the removal of the external camera having the highest priority, as highlighted in the Future Work section), near-range extended reality can already be used effectively to allow participants to access their smartphones and successfully complete two-factor authentication without removing the HMD while in VR (RQ1).
In accordance with the findings of this study, VR researchers and developers may focus on developing alternatives to or variations of the checkers challenge. VR designers can choose to adopt the checkers challenge or to continue relying on six-digit numeric PINs, as both options are suitable and among the most preferred by users (RQ2 and RQ4). In particular, if VR designers are interested in offering users a novel visual challenge that might be well received by a young and tech-savvy audience, they might prefer the checkers challenge. If the design focus is instead centered on efficiency and familiarity, or if the designer wants to avoid deploying a client app on the smartphone, the numeric PIN challenge would be the better choice.
The CAPTCHA challenge can be a third candidate, but it deserves additional consideration, as it seems easier to crack. To prevent users or AI agents from easily solving CAPTCHA challenges, several dynamic visual-matching variations have emerged online that make the validation process more complex and significantly increase the difficulty of solving a challenge. However, these might need to be evaluated formally to make sure they do not lead to increased user frustration. Furthermore, VR developers can feel confident in avoiding alphanumeric passwords as the second factor, or in limiting their use to the first time users log into the system, as the password challenge was found to be the least suitable and least preferred option (RQ2 and RQ4). Developers may also wish to consider secure password manager systems, but should be aware that these may open up access vulnerabilities.
Additionally, when faced with the design choice between a gaze pointer and a VR controller for selecting items within VR, our results suggest that the gaze pointer is the preferred option and performs slightly better (RQ4). With respect to whether the phone should be the device showing the challenge (Phone1 conditions) or the device used to respond to the challenge (HMD1_Phone2 condition), our findings suggest that most people prefer the phone as the device showing the challenge (Phone1 conditions), with the HMD then used for answering the challenge (RQ3).
More generally, this work illustrates how NRXR can be used to mix the real and virtual worlds in order to facilitate certain tasks that require information from both environments. In addition, it highlights the fact that the passthrough mode currently available in many HMDs can be refined in a way that prioritizes access to the real-world elements found in close proximity to VR users while also allowing them to remain aware of the VR environment.

7. Conclusions

In this paper, we have explored methods for implementing two-factor authentication in VR using smartphones. We have demonstrated that it is possible for users to perform such authentication successfully in many different ways. The results presented here suggest that NRXR-ID can be used to scan and select images in order to answer a CAPTCHA-style challenge, match visual patterns in a checkerboard-style challenge, operate virtual numeric keypads, and both read and type short passwords in VR for authentication purposes.
We found that the checkers-style matching challenge is the most suitable option among those considered, closely followed by entering a six-digit numeric code using a virtual numeric keypad. The latter result is significant given that the numeric code is the option most people are familiar with at this time, and it is used by the vast majority of digital service providers requiring two-factor authentication for access to their services. Other familiar methods such as CAPTCHA-style challenges are also viable if the challenge is presented within VR. We found that short but robust passwords are the most challenging to enter, and are also clearly disliked by users. There are many other variants for facilitating two-factor authentication in VR, and further research is needed to explore them.
This work shows that there are still open questions regarding the design and behavior of passthrough mode. This mode could be further refined to incorporate elements of the real world through a more selective approach in order to support the activities taking place inside the VR environment in a more targeted way.

Supplementary Materials

The accompanying video for NRXR-ID can be downloaded at: https://www.mdpi.com/article/10.3390/electronics14173368/s1.

Author Contributions

Conceptualization, A.N. and O.M.-P.; Methodology, A.N. and O.M.-P.; Software, A.N.; Validation, A.N., L.P.-C. and O.M.-P.; Formal analysis, L.P.-C. and O.M.-P.; Investigation, A.N. and L.P.-C.; Resources, A.N.; Data curation, L.P.-C.; Writing—original draft, A.N. and O.M.-P.; Writing—review & editing, L.P.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by Memorial University’s School of Graduate Studies.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Landwehr, C.E. Cybersecurity and Artificial Intelligence: From Fixing the Plumbing to Smart Water. IEEE Secur. Priv. 2008, 6, 3–4. [Google Scholar] [CrossRef]
  2. Al-Hasan, M.; Deb, K.; Rahman, M.O. User-authentication approach for data security between smartphone and cloud. In Proceedings of the 2013 8th International Forum on Strategic Technology (IFOST 2013), Ulaanbaatar, Mongolia, 28 June–1 July 2013; Volume 2, pp. 2–6. [Google Scholar] [CrossRef]
  3. Flores, P. Digital Simulation in the Virtual World: Its Effect in the Knowledge and Attitude of Students Towards Cybersecurity. In Proceedings of the 2019 Sixth HCT Information Technology Trends (ITT), Ras Al Khaimah, United Arab Emirates, 20–21 November 2019; pp. 1–5. [Google Scholar] [CrossRef]
  4. Al-Hadadi, M.; Shidhani, A.A. Smartphone security awareness: Time to act. In Proceedings of the 2013 International Conference on Current Trends in Information Technology (CTIT), Dubai, United Arab Emirates, 11–12 December 2013; pp. 166–171. [Google Scholar] [CrossRef]
  5. Henrysson, A.; Ollila, M. Augmented reality on smartphones. In Proceedings of the 2003 IEEE International Augmented Reality Toolkit Workshop, Tokyo, Japan, 7 October 2003; pp. 27–28. [Google Scholar] [CrossRef]
  6. Anastasaki, I.; Drosatos, G.; Pavlidis, G.; Rantos, K. User Authentication Mechanisms Based on Immersive Technologies: A Systematic Review. Information 2023, 14, 538. [Google Scholar] [CrossRef]
  7. Mylonas, A.; Dritsas, S.; Tsoumas, B.; Gritzalis, D. Smartphone security evaluation: The malware attack case. In Proceedings of the International Conference on Security and Cryptography, Seville, Spain, 18–21 July 2011; pp. 25–36. [Google Scholar]
  8. Dörflinger, T.; Voth, A.; Krämer, J.; Fromm, R. “My smartphone is a safe!” The user’s point of view regarding novel authentication methods and gradual security levels on smartphones. In Proceedings of the 2010 International Conference on Security and Cryptography (SECRYPT), Athens, Greece, 26–28 July 2010; pp. 1–10. [Google Scholar]
  9. Oh, T.; Stackpole, B.; Cummins, E.; Gonzalez, C.; Ramachandran, R.; Lim, S. Best security practices for android, blackberry, and iOS. In Proceedings of the 2012 The First IEEE Workshop on Enabling Technologies for Smartphone and Internet of Things (ETSIoT), Seoul, Republic of Korea, 18 June 2012; pp. 42–47. [Google Scholar] [CrossRef]
  10. Mashkina, I.V.; Guzairov, M.B.; Vasilyev, V.I.; Tuliganova, L.R.; Konovalov, A.S. Issues of information security control in virtualization segment of company information system. In Proceedings of the 2016 XIX IEEE International Conference on Soft Computing and Measurements (SCM), St. Petersburg, Russia, 25–27 May 2016; pp. 161–163. [Google Scholar] [CrossRef]
  11. Sha, L.; Chen, X.; Xiao, F.; Wang, Z.; Long, Z.; Fan, Q.; Dong, J. VRVul-Discovery: BiLSTM-based Vulnerability Discovery for Virtual Reality Devices in Metaverse. ACM Trans. Multimed. Comput. Commun. Appl. 2025, 21, 1–19. [Google Scholar] [CrossRef]
  12. Wang, H.; Zhan, Z.; Shan, H.; Dai, S.; Panoff, M.; Wang, S. GAZEploit: Remote Keystroke Inference Attack by Gaze Estimation from Avatar Views in VR/MR Devices. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (CCS’24), Salt Lake City, UT, USA, 14–18 October 2024; pp. 1731–1745. [Google Scholar] [CrossRef]
  13. Fujita, M.; Kurasaki, S.; Kanaoka, A. Securing Cross Reality: Unraveling the Risks of 3D Object Disguise on Head Mount Display. In Proceedings of the 13th International Conference on the Internet of Things (IoT’23), Nagoya, Japan, 7–10 November 2024; pp. 281–286. [Google Scholar] [CrossRef]
  14. Lebeck, K.; Ruth, K.; Kohno, T.; Roesner, F. Securing Augmented Reality Output. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 320–337. [Google Scholar] [CrossRef]
  15. Sabra, M.; Vinayaga-Sureshkanth, N.; Sharma, A.; Maiti, A.; Jadliwala, M. De-anonymizing VR Avatars using Non-VR Motion Side-channels. In Proceedings of the 17th ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec’24), Seoul, Republic of Korea, 27–29 May 2024; pp. 54–65. [Google Scholar] [CrossRef]
  16. De Guzman, J.A.; Thilakarathna, K.; Seneviratne, A. Security and Privacy Approaches in Mixed Reality: A Literature Survey. ACM Comput. Surv. 2019, 52, 1–37. [Google Scholar] [CrossRef]
  17. Peter, F.K. AR/VR Security Risks: Protecting Digital Spaces and Virtual Identities. Available online: https://www.presencesecure.com/ar-vr-security-risks-protecting-digital-spaces-and-virtual-identities/ (accessed on 20 August 2025).
  18. Schmidt, L.; Yigitbas, E. Taxonomy and Analysis of Security Vulnerabilities, Privacy Violations and Potential Mitigation Strategies to XR Systems. In Proceedings of the 18th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA’25), Corfu Island, Greece, 25–27 June 2025; pp. 368–375. [Google Scholar] [CrossRef]
  19. Acheampong, R.; Popovici, D.M.; Balan, T.C.; Rekeraho, A.; Oprea, I.A. A Cybersecurity Risk Assessment for Enhanced Security in Virtual Reality. Information 2025, 16, 430. [Google Scholar] [CrossRef]
  20. Dastgerdy, S. Virtual Reality and Augmented Reality Security: A Reconnaissance and Vulnerability Assessment Approach. arXiv 2024, arXiv:2407.15984. [Google Scholar] [CrossRef]
  21. Jubur, M.; Shrestha, P.; Saxena, N. An In-Depth Analysis of Password Managers and Two-Factor Authentication Tools. ACM Comput. Surv. 2025, 57, 1–32. [Google Scholar] [CrossRef]
  22. Bhanderi, D.; Kavathiya, M.; Bhut, T.; Kaur, H.; Mehta, M. Impact of Two-Factor Authentication on User Convenience and Security. In Proceedings of the 2023 10th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 15–17 March 2023; pp. 617–622. [Google Scholar]
  23. McGill, M.; Boland, D.; Murray-Smith, R.; Brewster, S. A Dose of Reality: Overcoming Usability Challenges in VR Head-Mounted Displays. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI’15), Seoul, Republic of Korea, 18–23 April 2015; pp. 2143–2152. [Google Scholar] [CrossRef]
  24. Budhiraja, P.; Sodhi, R.; Jones, B.R.; Karsch, K.; Bailey, B.P.; Forsyth, D.A. Where’s My Drink? Enabling Peripheral Real World Interactions While Using HMDs. arXiv 2015, arXiv:1502.04744. [Google Scholar]
  25. Foerster, K.T.; Gross, A.; Hail, N.; Uitto, J.; Wattenhofer, R. SpareEye: Enhancing the safety of inattentionally blind smartphone users. In Proceedings of the 13th International Conference on Mobile and Ubiquitous Multimedia (MUM ’14), Melbourne, VIC, Australia, 25–28 November 2014; pp. 68–72. [Google Scholar] [CrossRef]
  26. Nahon, D.; Subileau, G.; Capel, B. “Never Blind VR” enhancing the virtual reality headset experience with augmented virtuality. In Proceedings of the 2015 IEEE Virtual Reality (VR), Arles, France, 23–27 March 2015; pp. 347–348. [Google Scholar] [CrossRef]
  27. Hartmann, J.; Holz, C.; Ofek, E.; Wilson, A.D. RealityCheck: Blending Virtual Environments with Situated Physical Reality. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI’19), Glasgow, UK, 4–9 May 2019; pp. 1–12. [Google Scholar] [CrossRef]
  28. Kanamori, K.; Sakata, N.; Tominaga, T.; Hijikata, Y.; Harada, K.; Kiyokawa, K. Obstacle Avoidance Method in Real Space for Virtual Reality Immersion. In Proceedings of the 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany, 16–20 October 2018; pp. 80–89. [Google Scholar] [CrossRef]
  29. Surale, H.B.; Gupta, A.; Hancock, M.; Vogel, D. TabletInVR: Exploring the Design Space for Using a Multi-Touch Tablet in Virtual Reality. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI’19), Glasgow, UK, 4–9 May 2019; pp. 1–13. [Google Scholar] [CrossRef]
  30. Zhu, F.; Grossman, T. BISHARE: Exploring Bidirectional Interactions Between Smartphones and Head-Mounted Augmented Reality. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI’20), Honolulu, HI, USA, 25–30 April 2020; pp. 1–14. [Google Scholar] [CrossRef]
  31. Ha, T.; Woo, W. ARWand: Phone-Based 3D Object Manipulation in Augmented Reality Environment. In Proceedings of the 2011 International Symposium on Ubiquitous Virtual Reality, Jeju Island, Republic of Korea, 1–4 July 2011; pp. 44–47. [Google Scholar] [CrossRef]
  32. Siddhpuria, S.; Malacria, S.; Nancel, M.; Lank, E. Pointing at a Distance with Everyday Smart Devices. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–11. [Google Scholar] [CrossRef]
  33. Wentzel, J.; Anderson, F.; Fitzmaurice, G.; Grossman, T.; Vogel, D. SwitchSpace: Understanding Context-Aware Peeking Between VR and Desktop Interfaces. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI’24), Honolulu, HI, USA, 11–16 May 2024. [Google Scholar] [CrossRef]
  34. Lorensen, W.; Cline, H.; Nafis, C.; Kikinis, R.; Altobelli, D.; Gleason, L. Enhancing reality in the operating room. In Proceedings of the Visualization’93, San Jose, CA, USA, 25–29 October 1993; pp. 410–415. [Google Scholar] [CrossRef]
  35. Tecchia, F.; Avveduto, G.; Brondi, R.; Carrozzino, M.; Bergamasco, M.; Alem, L. I’m in VR! using your own hands in a fully immersive MR system. In Proceedings of the 20th ACM Symposium on Virtual Reality Software and Technology (VRST’14), Edinburgh, UK, 11–13 November 2014; pp. 73–76. [Google Scholar] [CrossRef]
  36. Biddle, R.; Chiasson, S.; Van Oorschot, P. Graphical passwords: Learning from the first twelve years. ACM Comput. Surv. 2012, 44, 1–41. [Google Scholar] [CrossRef]
  37. Liebers, J.; Brockel, S.; Gruenefeld, G.; Schneegass, S. Identifying Users by Their Hand Tracking Data in Augmented and Virtual Reality. Int. J. Hum.–Comput. Interact. 2024, 40, 409–424. [Google Scholar] [CrossRef]
  38. Rupp, D.; Grießer, P.; Bonsch, A.; Kuhlen, T.W. Authentication in Immersive Virtual Environments through Gesture-Based Interaction with a Virtual Agent. In Proceedings of the 2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Orlando, FL, USA, 16–21 March 2024; pp. 54–60. [Google Scholar] [CrossRef]
  39. Huang, Y.; Zhang, D.; Rosenberg, E.S. DBA: Direction-Based Authentication in Virtual Reality. In Proceedings of the 2023 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Shanghai, China, 25–29 March 2023; pp. 953–954. [Google Scholar] [CrossRef]
  40. Bologna, D.; Micciché, V.; Violo, G.; Visconti, A.; Cannavò, A.; Lamberti, F. SPHinX Authentication Technique: Secure Painting autHentication in eXtended reality. In Proceedings of the 2023 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Shanghai, China, 25–29 March 2023; pp. 941–942. [Google Scholar] [CrossRef]
  41. Vora, R.A.; Bharadi, V.A.; Kekre, H.B. Retinal scan recognition using wavelet energy entropy. In Proceedings of the 2012 International Conference on Communication, Information & Computing Technology (ICCICT), Mumbai, India, 19–20 October 2012; pp. 1–6. [Google Scholar] [CrossRef]
  42. Daugman, J. 600 million Citizens of India Are Now Enrolled with Biometric ID. Available online: https://www.spie.org/news/5449-600-million-citizens-of-india-are-now-enrolled-with-biometric-id (accessed on 20 August 2025).
  43. Schneegass, S.; Oualil, Y.; Bulling, A. SkullConduct: Biometric User Identification on Eyewear Computers Using Bone Conduction Through the Skull. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI’16), San Jose, CA, USA, 7–12 May 2016; pp. 1379–1384. [Google Scholar] [CrossRef]
  44. Heruatmadja, C.H.; Meyliana; Hidayanto, A.N.; Prabowo, H. Biometric as Secure Authentication for Virtual Reality Environment: A Systematic Literature Review. In Proceedings of the 2023 International Conference for Advancement in Technology (ICONAT), Goa, India, 24–26 January 2023; pp. 1–7. [Google Scholar] [CrossRef]
  45. Hadjidemetriou, G.; Belk, M.; Fidas, C.; Pitsillides, A. Picture Passwords in Mixed Reality: Implementation and Evaluation. In Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA’19), Glasgow, UK, 4–9 May 2019; pp. 1–6. [Google Scholar] [CrossRef]
  46. Menzner, T.; Otte, A.; Gesslein, T.; Grubert, J.; Gagel, P.; Schneider, D. A Capacitive-sensing Physical Keyboard for VR Text Entry. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1080–1081. [Google Scholar] [CrossRef]
  47. Kürtünlüoğlu, P.; Akdik, B.; Duygu, R.; Karaarslan, E. Towards More Secure Virtual Reality Authentication for the Metaverse: A Decentralized Method Proposal. In Proceedings of the 2023 16th International Conference on Information Security and Cryptology (ISCTürkiye), Ankara, Turkey, 18–19 October 2023; pp. 1–6. [Google Scholar] [CrossRef]
  48. Mchale, S.; Murr, L.; Zhang, P. Using Decentralized Identifiers and InterPlanetary File System to Create a Recoverable Rare Disease Patient Identity Framework. In Proceedings of the 2023 7th International Conference on Medical and Health Informatics (ICMHI ’23), Kyoto, Japan, 12–14 May 2023; pp. 142–149. [Google Scholar] [CrossRef]
  49. Desai, A.P.; Pena-Castillo, L.; Meruvia-Pastor, O. A Window to your Smartphone: Exploring Interaction and Communication in Immersive VR with Augmented Virtuality. In Proceedings of the 2017 Computer and Robot Vision (CRV), Edmonton, AB, Canada, 16–19 May 2017. [Google Scholar]
  50. Alaee, G.; Desai, A.P.; Pena-Castillo, L.; Brown, E.; Meruvia-Pastor, O. A User Study on Augmented Virtuality Using Depth Sensing Cameras for Near-Range Awareness in Immersive VR. In Proceedings of the IEEE VR’s 4th Workshop on Everyday Virtual Reality (WEVR 2018), Reutlingen, Germany, 18 March 2018. [Google Scholar]
  51. Mohr, P.; Tatzgern, M.; Langlotz, T.; Lang, A.; Schmalstieg, D.; Kalkofen, D. TrackCap: Enabling smartphones for 3D interaction on mobile head-mounted displays. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019. [Google Scholar] [CrossRef]
  52. Hattori, K.; Hirai, T. Inside-out Tracking Controller for VR/AR HMD using Image Recognition with Smartphones. In Proceedings of the ACM SIGGRAPH 2020 Posters, SIGGRAPH 2020, Virtual Event, USA, 17 August 2020; Association for Computing Machinery: New York, NY, USA, 2020. [Google Scholar] [CrossRef]
  53. Zhang, L.; He, W.; Bai, H.; He, J.; Qiao, Y.; Billinghurst, M. A Hybrid 2D-3D Tangible Interface for Virtual Reality. In Proceedings of the ACM SIGGRAPH 2021 Posters, Virtual Event, USA, 9–13 August 2021. [Google Scholar] [CrossRef]
  54. Bai, H.; Zhang, L.; Yang, J.; Billinghurst, M. Bringing full-featured mobile phone interaction into virtual reality. Comput. Graph. 2021, 97, 42–53. [Google Scholar] [CrossRef]
  55. Unlu, A.E.; Xiao, R. PAIR: Phone as an Augmented Immersive Reality Controller. In Proceedings of the 27th ACM Symposium on Virtual Reality Software and Technology, Osaka, Japan, 8–10 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
  56. Hincapié-Ramos, J.D.; Özacar, K.; Irani, P.P.; Kitamura, Y. GyroWand: IMU-based Raycasting for Augmented Reality Head-Mounted Displays. In Proceedings of the SUI 2015—Proceedings of the 3rd ACM Symposium on Spatial User Interaction, Los Angeles, CA, USA, 8–9 August 2015; Association for Computing Machinery, Inc.: New York, NY, USA, 2015; pp. 89–98. [Google Scholar] [CrossRef]
  57. Young, T.S.; Teather, R.J.; Mackenzie, I.S. An arm-mounted inertial controller for 6DOF input: Design and evaluation. In Proceedings of the 2017 IEEE Symposium on 3D User Interfaces (3DUI 2017), Los Angeles, CA, USA, 18–19 March 2017; pp. 26–35. [Google Scholar] [CrossRef]
  58. Kharlamov, D.; Woodard, B.; Tahai, L.; Krzysztof, P. TickTockRay: Smartwatch-based 3D pointing for smartphone-based virtual reality. In Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology, Munich, Germany, 2–4 November 2016; pp. 363–364. [Google Scholar] [CrossRef]
  59. Kim, H.I.; Woo, W. Smartwatch-assisted robust 6-DOF hand tracker for object manipulation in HMD-based augmented reality. In Proceedings of the 2016 IEEE Symposium on 3D User Interfaces (3DUI 2016), Greenville, SC, USA, 19–20 March 2016; pp. 251–252. [Google Scholar] [CrossRef]
  60. Hirzle, T.; Gugenheimer, J.; Rixen, J.; Rukzio, E. WatchVR: Exploring the Usage of a Smartwatch for Interaction in Mobile Virtual Reality. In Proceedings of the Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; Association for Computing Machinery: New York, NY, USA, 2018. [Google Scholar] [CrossRef]
  61. Park, K.B.; Lee, J.Y. New design and comparative analysis of smartwatch metaphor-based hand gestures for 3D navigation in mobile virtual reality. Multimed. Tools Appl. 2019, 78, 6211–6231. [Google Scholar] [CrossRef]
  62. Pietroszek, K.; Kuzminykh, A.; Wallace, J.R.; Lank, E. Smartcasting: A discount 3D interaction technique for public displays. In Proceedings of the 26th Australian Computer-Human Interaction Conference on Designing Futures: The Future of Design, Sydney, Australia, 2–5 December 2014; pp. 119–128. [Google Scholar] [CrossRef]
  63. Aseeri, S.A.; Acevedo-Feliz, D.; Schulze, J. Poster: Virtual reality interaction using mobile devices. In Proceedings of the IEEE Symposium on 3D User Interface 2013 (3DUI 2013), Orlando, FL, USA, 16–17 March 2013; pp. 127–128. [Google Scholar] [CrossRef]
  64. Budhiraja, R.; Lee, G.A.; Billinghurst, M. Interaction techniques for HMD-HHD hybrid AR systems. In Proceedings of the 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR 2013), Adelaide, Australia, 1–4 October 2013; pp. 243–244. [Google Scholar] [CrossRef]
  65. McDonald, B.; Zhang, Q.; Nanzatov, A.; Peña-Castillo, L.; Meruvia-Pastor, O. SmartVR Pointer: Using Smartphones and Gaze Orientation for Selection and Navigation in Virtual Reality. Sensors 2024, 24, 5168. [Google Scholar] [CrossRef] [PubMed]
  66. Zhu, F.; Sousa, M.; Sidenmark, L.; Grossman, T. PhoneInVR: An Evaluation of Spatial Anchoring and Interaction Techniques for Smartphone Usage in Virtual Reality. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI’24), Honolulu, HI, USA, 11–16 May 2024. [Google Scholar] [CrossRef]
  67. Pathmanathan, N.; Becher, M.; Rodrigues, N.; Reina, G.; Ertl, T.; Weiskopf, D.; Sedlmair, M. Eye vs. Head: Comparing Gaze Methods for Interaction in Augmented Reality. In Proceedings of the ACM Symposium on Eye Tracking Research and Applications (ETRA’20), Stuttgart, Germany, 2–5 June 2020. [Google Scholar] [CrossRef]
  68. Boletsis, C.; Kongsvik, S. Controller-based Text-input Techniques for Virtual Reality: An Empirical Comparison. Int. J. Virtual Real. 2019, 19, 2–15. [Google Scholar] [CrossRef]
  69. Blattgerste, J.; Renner, P.; Pfeiffer, T. Advantages of Eye-Gaze over Head-Gaze-Based Selection in Virtual and Augmented Reality under Varying Field of Views. In Proceedings of the Workshop on Communication by Gaze Interaction (COGAIN’18), Warsaw, Poland, 15 June 2018. [Google Scholar] [CrossRef]
  70. Ha, T.; Feiner, S.; Woo, W. WeARHand: Head-worn, RGB-D camera-based, bare-hand user interface with visually enhanced depth perception. In Proceedings of the 2014 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany, 10–12 September 2014; pp. 219–228. [Google Scholar] [CrossRef]
  71. Brooke, J. SUS—A Quick and Dirty Usability Scale. 1990. Available online: https://digital.ahrq.gov/sites/default/files/docs/survey/systemusabilityscale%2528sus%2529_comp%255B1%255D.pdf (accessed on 20 August 2025).
  72. Hart, S.G.; Staveland, L.E. NASA Task Load Index (TLX). 1990. Available online: https://humansystems.arc.nasa.gov/groups/TLX/index.php (accessed on 20 August 2025).
  73. Sauro, J.; Dumas, J.S. Comparison of three one-question, post-task usability questionnaires. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’09), Boston, MA, USA, 4–9 April 2009; pp. 1599–1608. [Google Scholar] [CrossRef]
  74. Kennedy, R.S.; Lane, N.E.; Berbaum, K.S.; Lilienthal, M.G. Simulator Sickness Questionnaire: An Enhanced Method for Quantifying Simulator Sickness. Int. J. Aviat. Psychol. 1993, 3, 203–220. [Google Scholar] [CrossRef]
  75. Andrabi, S.J.; Reiter, M.K.; Sturton, C. Usability of augmented reality for revealing secret messages to users but not their devices. In Proceedings of the Eleventh USENIX Conference on Usable Privacy and Security (SOUPS’15), Ottawa, ON, Canada, 22–24 July 2015; pp. 89–102. [Google Scholar]
  76. Tirfe, D.; Anand, V.K. A Survey on Trends of Two-Factor Authentication. In Proceedings of the Contemporary Issues in Communication, Cloud and Big Data Analytics; Sarma, H.K.D., Balas, V.E., Bhuyan, B., Dutta, N., Eds.; Springer: Singapore, 2022; pp. 285–296. [Google Scholar]
  77. Abraham, M.; Saeghe, P.; Mcgill, M.; Khamis, M. Implications of XR on Privacy, Security and Behaviour: Insights from Experts. In Proceedings of the Nordic Human-Computer Interaction Conference (NordiCHI’22), Aarhus, Denmark, 8–12 October 2022. [Google Scholar] [CrossRef]
  78. X Reality Safety Intelligence. The XRSI Privacy Framework. 2020. Available online: https://xrsi.org/wp-content/uploads/2020/09/XRSI-Privacy-Framework-v1_002.pdf (accessed on 20 August 2025).
  79. Laubheimer, P. Beyond the NPS: Measuring Perceived Usability with the SUS, NASA-TLX, and the Single Ease Question After Tasks and Usability Tests. 2018. Available online: https://www.nngroup.com/articles/measuring-perceived-usability/ (accessed on 20 August 2025).
  80. Lamsweerde, A.v. Formal specification: A roadmap. In Proceedings of the Conference on The Future of Software Engineering (ICSE’00), Limerick, Ireland, 4–11 June 2000; pp. 147–159. [Google Scholar] [CrossRef]
  81. Emteq Labs; X Reality Safety Intelligence. An Imperative: Developing Standards for Safety and Security in XR Environments. 2021. Available online: https://xrsi.org/wp-content/uploads/2021/02/An-Imperative-Emteq-XRSI-Whitepaper-on-Standards-for-safety-and-Security.pdf (accessed on 20 August 2025).
Figure 1. Overview of input panels for answering the different challenge types: (A) CAPTCHA-style selection of tiles to answer the requested challenge; (B) numeric input panel to enter a six-digit code; (C) checkers input panel for visual matching of two checkered grids; (D) six-character alphanumeric password submission dialog.
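For illustration, the sketch below shows one plausible way to generate two of the challenge payloads from Figure 1: a six-digit numeric code and a random checkered grid. This is a minimal sketch under assumed requirements, not the paper's implementation; all function names are hypothetical.

```python
# Illustrative sketch (not the paper's code) of generating challenge payloads.
import secrets

def make_pin_challenge(length: int = 6) -> str:
    """Random numeric code, as in the six-digit PIN panel (Figure 1B)."""
    return "".join(secrets.choice("0123456789") for _ in range(length))

def make_checkers_challenge(size: int = 4) -> list[list[int]]:
    """Random two-colour grid for the checkers-style match (Figure 1C)."""
    return [[secrets.randbelow(2) for _ in range(size)] for _ in range(size)]

def verify(expected, response) -> bool:
    """A response is accepted only if it reproduces the challenge exactly."""
    return expected == response

pin = make_pin_challenge()        # shown on one device
grid = make_checkers_challenge()  # rendered as a checkered pattern on both devices
print(verify(pin, pin), verify(grid, grid))
```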
Figure 2. Overview of the authentication challenges as presented during Phone1 conditions 2 and 3 (Phone1_SVRP2 & Phone1_VRC2): (A–D) exemplify the panels showing the expected response (or passcode) as presented on the smartphone when using Phone1_SVRP2, where the tap button allows users to indicate their selection within the VR environment using the gaze-based SmartVR Pointer [65,67]; (E–H) exemplify challenges showing the expected response as presented on the smartphone when using Phone1_VRC2. In this condition, participants use the VR controller's trigger button to indicate their selection within the VR environment.
Figure 3. Virtual keyboard utilized to solve the alphanumeric password challenge in Conditions 2 (Phone1_SVRP2) and 3 (Phone1_VRC2), as shown in the VR environment.
Figure 4. Exposure adjustments related to passthrough video capture: (A) illustrates how passthrough video with auto-exposure enabled often results in overexposure of the smartphone screen's image; (A,B) show that excessively high or low exposure levels brighten or darken the image so much that its contents become nearly impossible to read; (C) shows how appropriate adjustment of the exposure level allows users to discern the screen contents.
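The exposure issue in Figure 4 can be reproduced, or worked around, on ordinary cameras by disabling auto-exposure and fixing the exposure level. The sketch below uses OpenCV's generic camera properties as an assumed stand-in; the headset camera used in the study exposes its own controls, and property support varies by camera and driver.

```python
# Hedged sketch of manual exposure control with OpenCV; behaviour is
# driver-dependent and this is not the study's actual capture pipeline.
import cv2

cap = cv2.VideoCapture(0)                  # hypothetical device index
cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 0.25)  # 0.25 requests manual mode on many V4L2 drivers
cap.set(cv2.CAP_PROP_EXPOSURE, -6)         # driver-specific scale; tune until the phone screen is legible

ok, frame = cap.read()
if ok:
    cv2.imwrite("passthrough_frame.png", frame)  # inspect legibility offline
cap.release()
```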
Figure 5. Showcase of NRXR-ID use scenarios. From left to right: participant using the smartphone and a VR controller; using the smartphone only; view from within the VR environment; answering an authentication challenge with the VR controller.
Figure 6. Flowcharts of key steps for two-factor authentication using NRXR-ID: (A) Condition 1: HMD1_Phone2; (B) Condition 2: Phone1_SVRP2; (C) Condition 3: Phone1_VRC2.
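To make the flow in Figure 6A concrete, the following sketch outlines generic challenge-response plumbing for a one-time code shown in the HMD and answered on the phone. It is an assumption-laden illustration of common 2FA mechanics, not the NRXR-ID implementation; the session handling, names, and 60 s lifetime are invented.

```python
# Minimal sketch of a challenge-response loop: the server issues a
# short-lived one-time code (displayed in the HMD) and verifies the
# answer submitted from the phone. All names here are hypothetical.
import hmac
import secrets
import time

SESSIONS: dict[str, tuple[str, float]] = {}  # session_id -> (code, expiry)

def issue_challenge(session_id: str, ttl_s: float = 60.0) -> str:
    """Create a single-use six-digit code and record its expiry time."""
    code = "".join(secrets.choice("0123456789") for _ in range(6))
    SESSIONS[session_id] = (code, time.monotonic() + ttl_s)
    return code  # rendered inside the HMD

def verify_response(session_id: str, answer: str) -> bool:
    """Accept the phone's answer only once, and only before expiry."""
    code, deadline = SESSIONS.pop(session_id, ("", 0.0))
    in_time = time.monotonic() <= deadline
    return in_time and hmac.compare_digest(code.encode(), answer.encode())

sid = "demo-session"
shown_in_hmd = issue_challenge(sid)
print(verify_response(sid, shown_in_hmd))  # True when answered in time
```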
Figure 7. Overview of the authentication methods of Condition 1 (HMD1_Phone2). (A) Illustrates the experimental setup with a user holding the smartphone to complete the challenges. VR screenshots of request panels asking participants to solve a challenge using the smartphone app: (B) CAPTCHA challenge; (C) numeric challenge; (D) checkers challenge; (E) password challenge.
Figure 8. Overview of the authentication methods of Condition 2 (Phone1_SVRP2). (A) Illustrates the experimental setup, with a user holding the smartphone to read the challenge or passcode shown on the smartphone. VR screenshots of challenges to be solved using the SmartVR Pointer: (B) CAPTCHA challenge; (C) numeric challenge; (D) checkers challenge; (E) password challenge.
Figure 9. Overview of the authentication methods of Condition 3 (Phone1_VRC2). (A) Illustrates the experimental setup, with a user holding their smartphone to read the challenge or passcode shown on the smartphone. VR screenshots of challenges to be solved using the VR controller: (B) CAPTCHA challenge; (C) numeric challenge; (D) checkers challenge; (E) password challenge.
Figure 10. Experimental setup and apparatus. (a) Basic interaction modalities: the expected response can either be shown in the VR HMD and the solution be provided by the user via their smartphone, or the other way around; (b) the front view displays the camera mounted on the headset, with the camera's position relative to the headset being adjustable; (c) the side view displays the dynamic adjustment of the camera angle, oriented to facilitate a comfortable position for the user to hold their phone in front of the camera.
Figure 11. Comparison of user performance differences between conditions by completion time, showing 95% confidence intervals of the pairwise differences in mean completion time between conditions for all four challenges. Circles indicate the mean difference. The dashed vertical gray line indicates the point of no difference between the means. The farther the confidence interval is from the dashed vertical line, the more statistically significant the difference. Differences are reported in seconds.
Figure 12. Comparison of user performance differences between conditions by number of clicks, showing 95% confidence intervals of the pairwise differences in mean number of clicks between conditions for all four challenges. Circles indicate the mean difference. The dashed vertical gray line indicates the point of no difference between the means. The farther the confidence interval is from the dashed vertical line, the more statistically significant the difference. Differences are reported in number of clicks.
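Pairwise mean differences with 95% confidence intervals, as plotted in Figures 11 and 12, can be obtained with a Tukey HSD test over the per-trial measurements. The sketch below assumes a per-trial log with hypothetical file and column names; it is one plausible analysis path, not necessarily the authors' exact procedure.

```python
# Sketch: pairwise condition differences with 95% CIs per challenge type.
# "trials.csv" and its columns (condition, challenge, time_s) are assumptions.
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.read_csv("trials.csv")  # one row per trial
for challenge, grp in df.groupby("challenge"):
    res = pairwise_tukeyhsd(endog=grp["time_s"], groups=grp["condition"], alpha=0.05)
    print(challenge)
    print(res.summary())  # mean differences with lower/upper CI bounds
```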
Figure 13. Results from the user experience questionnaire, showing the distribution of Likert-scale scores assigned by participants to each condition per challenge based on how much they liked the condition, how effective they found it, and how easy to use they perceived it. On the scale, 7 is best, 4 is neutral, and 1 is worst. A horizontal line inside the box indicates the median score, while the box height indicates the inter-quartile range (IQR).
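A plot in the style of Figure 13 can be assembled from per-participant Likert scores with a standard boxplot. The data file and column names below are assumptions for illustration only.

```python
# Sketch: Likert score distributions per condition, boxplot style.
# "ux_scores.csv" with columns (condition, score) is a hypothetical layout.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("ux_scores.csv")
labels, groups = zip(*[(name, g["score"].to_numpy()) for name, g in df.groupby("condition")])
plt.boxplot(groups)  # box = IQR, inner line = median, as in Figure 13
plt.xticks(range(1, len(labels) + 1), labels)
plt.ylabel("Likert score (1 = worst, 7 = best)")
plt.show()
```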
Table 1. Mean completion time and standard deviation per condition for each of the four challenges. The lowest mean completion time per challenge and the lowest overall average completion time are highlighted in bold. The challenge with the lowest mean completion time per condition is shown in italics. All times are in seconds.

| Condition | CAPTCHA | Numeric | Checkers | Password |
|---|---|---|---|---|
| HMD1_Phone2 | 17.76 ± 10.46 | ***10.68 ± 3.27*** | 13.63 ± 5.18 | 33.45 ± 19.22 |
| Phone1_SVRP2 | 12.92 ± 4.16 | *12.47 ± 4.79* | 14.47 ± 6.50 | **28.35 ± 9.33** |
| Phone1_VRC2 | ***12.41 ± 5.02*** | 13.90 ± 5.92 | **13.44 ± 9.33** | 36.32 ± 18.05 |
| Average ± sd | 14.36 ± 7.5 | **12.35 ± 4.95** | 13.85 ± 5.29 | 32.71 ± 16.44 |
Table 2. ANOVA results of completion times per challenge type; p-values less than 0.01 are shown in bold. Each cell reports the F-value and the corresponding p-value for that factor.

| Factor | CAPTCHA | Numeric | Checkers | Password |
|---|---|---|---|---|
| Condition | F = 30.02, **p = 5.99 × 10⁻¹³** | F = 17.45, **p = 5.09 × 10⁻⁸** | F = 1.69, p = 0.185 | F = 9.69, **p = 7.62 × 10⁻⁵** |
| Round | F = 12.13, **p = 2.31 × 10⁻⁹** | F = 3.17, p = 0.014 | F = 3.49, **p = 0.008** | F = 1.90, p = 0.11 |
| Order | F = 15.28, **p = 3.83 × 10⁻⁷** | F = 1.75, p = 0.175 | F = 7.14, **p = 0.0009** | F = 6.88, **p = 0.001** |
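An analysis like Table 2 can be run as a per-challenge ANOVA over the three reported factors. The sketch below uses statsmodels with assumed file and column names (condition, round_no, order_pos, time_s); the authors' exact model specification may differ.

```python
# Hedged sketch of a three-factor ANOVA on completion times, per challenge.
# "trials.csv" and its column names are assumptions, not the study's data layout.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("trials.csv")
for challenge, grp in df.groupby("challenge"):
    model = ols("time_s ~ C(condition) + C(round_no) + C(order_pos)", data=grp).fit()
    print(challenge)
    print(sm.stats.anova_lm(model, typ=2))  # F-values and p-values per factor
```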
Table 3. Completion times by round for the CAPTCHA and checkers challenge types. The effect of round on completion times was found to be statistically significant only for CAPTCHA and checkers. All times are in seconds.

| Round | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| CAPTCHA | 18.4 ± 12.1 | 14.2 ± 5.4 | 14.2 ± 6.3 | 13.0 ± 5.2 | 12.0 ± 4.2 |
| Checkers | 15.6 ± 8.4 | 13.7 ± 4.4 | 13.2 ± 3.7 | 13.2 ± 3.8 | 13.6 ± 4.4 |
Table 4. Success rate per condition for each of the four challenges. The highest success rate per challenge and the highest overall average success rate are highlighted in bold.

| Condition | CAPTCHA | Numeric | Checkers | Password | Average |
|---|---|---|---|---|---|
| HMD1_Phone2 | 85% | 97% | **92%** | 88% | 90% |
| Phone1_SVRP2 | 94% | **99%** | 89% | **91%** | **93%** |
| Phone1_VRC2 | **96%** | 93% | 91% | 87% | 91% |
| Average ± sd | 91.67% ± 5.86% | 96.33% ± 3.06% | 90.67% ± 1.53% | 88.67% ± 2.08% | 91.83% |
Table 5. Summary of participants' comments regarding the challenges encountered during the experiment. The last column shows in bold the sum of the percentages for each row. The last row shows in bold the balance of the percentage of positive comments minus negative ones.

| Category | CAPTCHA | Numeric | Checkers | Password | Total |
|---|---|---|---|---|---|
| Best challenge | 2.38% | 4.76% | 14.29% | 0.00% | **21.43%** |
| Good challenge | 7.14% | 11.90% | 11.90% | 2.38% | **33.33%** |
| Bad challenge | 7.14% | 7.14% | 0.00% | 19.05% | **33.33%** |
| Worst challenge | 2.38% | 0.00% | 0.00% | 9.52% | **11.90%** |
| Balance | **0.00%** | **9.52%** | **26.19%** | **−26.19%** | **9.52%** |
Table 6. Summary of participants' comments regarding the conditions encountered during the experiment. The last column shows in bold the sum of the percentages for each row. The last row shows in bold the balance of the percentage of positive comments minus negative ones.

| Category | HMD1_Phone2 | Phone1_SVRP2 | Phone1_VRC2 | Total |
|---|---|---|---|---|
| Mostly Positive | 0.00% | 21.43% | 9.52% | **30.95%** |
| Positive | 7.14% | 14.29% | 14.29% | **35.71%** |
| Negative | 9.52% | 2.38% | 7.14% | **19.05%** |
| Mostly Negative | 9.52% | 0.00% | 4.76% | **14.29%** |
| Balance | **−11.90%** | **33.34%** | **11.91%** | **33.32%** |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
