1. Introduction
Wireless virtual reality (VR) offers a seamless user experience by removing the mobility restrictions of conventional wired VR headsets. The greater freedom of movement offered by wireless VR allows users to immerse themselves more conveniently in a virtual world, which in turn results in improved quality of experience (QoE). The delivery of such VR video services over networks requires data rates of ≥1.6 Gbps [1] depending on the video resolution and anticipated level of interaction with the virtual world. Furthermore, the motion-to-photon (MTP) latency, i.e., the delay between the viewer’s head movement and the head-mounted display’s (HMD’s) change reflecting this movement, must be kept at ≤20 ms [1]. In the context of home Wi-Fi, for example, network latencies of ≤10 ms for a fair experience, ≤7 ms for a comfortable experience, and ≤5 ms for an ideal experience have been recommended in [1]. As a consequence, the network contribution to this MTP latency budget also needs to be significantly improved to guarantee satisfactory QoE. To accommodate such high-performance wireless data applications in terms of throughput and latency, several standards have been developed and are being worked on regarding advanced wireless and cellular network technologies. In particular, standardization of the required wireless connectivity takes place both in the field of wireless local area networks (WLAN) and in broadband cellular networks such as fifth generation (5G), beyond fifth generation (B5G), and sixth generation (6G) networks.
Contributing to the understanding of network-induced temporal disturbances in wireless VR systems, this paper reports an experimental study that focuses on the VR video frame freeze length characteristics of a commercial TPCAST wireless VR solution [2]. Similar to other wireless systems that target ultra-reliable low latency communications, the commercial TPCAST wireless VR solution also utilizes beamforming, higher order modulation constellations, and millimeter wave technology in the 60 GHz band. As such, the results and statistical models provided in this paper may be considered as representative of the freeze characteristics in such practical settings. In the following, an overview of the related wireless technologies targeting ultra-reliable low latency communications, experimental studies, and wireless VR testbeds is provided, leading to the motivation, research questions, and contributions of the work reported in this paper.
1.1. Related Work
1.1.1. Wireless Technologies toward Ultra-Reliable Low Latency Communications
In response to emerging new applications such as 8K video, VR, gaming, and cloud computing, the IEEE 802.11 Task Group BE started to work on the IEEE 802.11be standard (Wi-Fi 7) enabling extremely high throughput at low latency [3]. IEEE 802.11be is anticipated to operate in the 2.4 GHz, 5 GHz, and 6 GHz bands offering 320 MHz channel bandwidth. Higher order modulation constellations up to 4096-ary quadrature amplitude modulation (4096-QAM) will be supported while the number of spatial streams will be increased up to 16 through the use of multi-user multiple-input multiple-output (MU-MIMO) systems. The high signal-to-noise ratio (SNR) required for reliable operation of 4096-QAM shall be achieved by beamforming. IEEE 802.11be aims to provide at least one operation mode that supports data rates greater than 30 Gbps. It is envisaged that a final version of the IEEE 802.11be standard will be available by early 2024.
Further, the WiGig specifications produced under the umbrella of the Wireless Gigabit Alliance provide a set of wireless network protocols for operation in the 60 GHz band [4]. The WiGig system, also known as 60 GHz Wi-Fi, includes IEEE 802.11ad [5] and IEEE 802.11ay [6], and supports data rates of up to 7 Gbps. It relies on beamforming, which makes it possible to exploit the signal reflections induced by the propagation environment, e.g., surfaces, ceilings, and floors. Specifically, the phased array antenna beamforming used in the WiGig system enables robust communication with transmitter-receiver distances of up to 10 m.
The WirelessHD or UltraGig proprietary standard [7] targets wireless transmission in the 60 GHz band for high-definition (HD) video content. The high data rate physical layer of WirelessHD uses quadrature phase shift keying (QPSK), 16-QAM, and 64-QAM together with equal and unequal error protection, and provides the option for spatial multiplexing. WirelessHD supports data rates of up to 28 Gbps, uses beamforming at the transmitter and receiver, and supports a communication range of up to 10 m. The WirelessHD technology is used in the TPCAST wireless VR system that was developed for the HTC Vive HMD (see Section 2).
Further, VR and augmented reality (AR) applications and services are foreseen to be delivered over broadband cellular networks such as the currently rolled out 5G networks and to be expanded to cater for digital realities offering immersive experiences in B5G and 6G mobile networks [8]. Regarding HMD technologies for such networks, the Qualcomm Snapdragon XR2 5G reference design, for instance, aims at next-generation 5G-enabled HMDs for untethered VR, AR, and mixed reality (MR). Network technologies considered to deal with the stringent performance indicators of ultra-reliable low latency communications include network slicing and edge-cloud processing architectures such as cloud VR.
Irrespective of whether a dedicated private network or a network slice over a public network is used for the delivery of untethered immersive media, the underlying infrastructure must offer: (1) high bandwidth to accommodate the required high data rates, (2) low latency to keep within the MTP budget, and (3) high reliability to avoid spatial and temporal artifacts being inflicted on the immersive media. The challenge of providing sufficiently high bandwidth has been tackled by moving toward higher frequencies, e.g., 60 GHz millimeter waves (mmWaves), and the related wider frequency bands. The requirement of low latency is being addressed by finding a balance between advanced on-device processing and moving processing to the edge. Beamforming at the transmitter and receiver along with adaptive coding and modulation schemes are increasingly engaged at the physical layer to provide ultra-reliable wireless links.
1.1.2. Experimental Studies and Wireless VR Testbeds
In [9], a multi-path multi-tier 360° video streaming technique is proposed taking into account the dynamics of the network bandwidth as well as the user viewing direction. In this work, video encoding is performed at multi-tiers, i.e., the base-tier encoded with a basic rate, enhancement-tier coded with multiple rates and assisting view prediction accuracy, and correction-tier carrying information about the predicted view direction. A 5G wireless research platform is used to collect real bandwidth traces over a WiGig testbed which, in combination with a Wi-Fi component, allows for studying the multi-path approach. In addition, real user field of view traces with diverse head movement patterns are used for the assessment of the multi-tier approach. The analytical and experimental performance assessment in terms of optimal rate allocation, throughput, latency, and network utilization demonstrated that a high level of QoE can be achieved.
The experimental study reported in [10] assesses the IEEE 802.11ad performance on a smartphone regarding throughput, coverage, mobility, and power consumption. The experimental setup focused on an indoor environment using an Asus Republic of Gamers (ROG) Android gaming smartphone which is equipped with an IEEE 802.11ad chipset. All measurements were performed with a Netgear Nighthawk X10 Smart WiFi Router which allows for 4K streaming, VR gaming, and instant downloads. In this context, the ROG smartphone was assessed as a general purpose IEEE 802.11ad smartphone. The measurement campaign revealed that the ROG smartphone was able to support Gbps throughput while power consumption was less of an issue than originally expected. However, the results suggest that the beamforming protocol on the smartphone performs suboptimally and affects the network performance. Overall, it was concluded that IEEE 802.11ad is a promising technology for upcoming bandwidth-intensive applications on smartphones.
Building on [10], a more comprehensive performance assessment of IEEE 802.11ad on smartphones is provided in [11]. The experimental study presented in this work considers the following three experimental setups. First, the IEEE 802.11ad setup operating in the 60 GHz band comprises an Asus ROG Android gaming smartphone and a Netgear Nighthawk X10 Smart WiFi Router. Physical layer data rates range from 1.65 Gbps to 2.3 Gbps depending on the scenario. Second, a Samsung Galaxy S10 and a Xiaomi Mi 10 phone together with an ASUS RT-AX88U router constitute the IEEE 802.11ax-2021 or IEEE 802.11ax setup operating in the 5 GHz band. In this setup, the smartphones constrained the maximum physical layer data rate to 1.2 Gbps. Third, the state-of-the-art IEEE 802.11ac setup operating in the 5 GHz band was used for comparison with the IEEE 802.11ad scenario, also using the Netgear Nighthawk router with the ROG phone. The physical layer data rate of this setup was limited to 866 Mbps. The performance of these setups was evaluated with respect to range, coverage, mobility, blockage, and power consumption. The measurement data made it possible to assess the impact of smartphone features such as antenna array placements and to compare performance among members of the family of IEEE 802.11 standards. Although inferior to an IEEE 802.11ad-enabled laptop, it was shown that an IEEE 802.11ad-enabled smartphone can sustain Gbps data rates given that the range to the access point is kept within a few tens of feet. Overall, it was found that the IEEE 802.11ad setup, which operates in the 60 GHz band, outperforms the 5 GHz counterparts associated with IEEE 802.11ac and IEEE 802.11ax. However, challenges faced with the IEEE 802.11ad setup include performance degradation due to the beamforming protocol, antenna array placement, and human blockage.
In [12], an extensive measurement study is reported focusing on the predictability of throughput of an IEEE 802.11ad system on the downlink to an IEEE 802.11ad-enabled mobile device. In this study, using a similar experimental setup as in [10,11], the mobility patterns and orientations of the mobile device were varied. It was shown that throughput may be predicted in real time using a small neural network model with good accuracy at varying timescales. Regarding wireless VR applications, given that the prediction error was above 10% for 40% of the time at the 10 ms timescale, this work suggests being conservative in the use of throughput prediction in this context.
Off-the-shelf WiGig and WiFi interface modules were used in [13] to establish a motion-aware WiGig-WiFi interplay system for VR (MW²IVR). This testbed comprises a local PC for processing, a Netgear Nighthawk X10 serving as access point (AP), and a Travelmate laptop with a Sparrow 11ad module. An HTC Vive headset is connected to the laptop using high definition multimedia interface (HDMI) and universal serial bus (USB) cables, which allows limited translational and rotational movements in the vicinity of the laptop. The testbed collects user motion, link signal levels, and the generated frame sizes, and opportunistically switches between WiGig and WiFi while controlling the VR frame encoding for latency regulation. The distance between AP and laptop was limited in the WiGig wireless VR scenario to 3 m offering a throughput of approximately 1.5 Gbps with the antenna embedded in the laptop facing the AP (front view). However, the throughput measured for the distance of 2 m and varying antenna direction is reported to decrease almost proportionally with the antenna angle turning away from the front view. It is conjectured that this behavior may be induced by imperfect and slow beamforming operations. The throughput in the WiFi wireless VR scenario, on the other hand, was measured to be constant for distances up to 5 m but dropped to one-third of the maximum throughput that was achieved by WiGig. The experimental results indicate that the MW²IVR outperforms the WiGig-only system in terms of latency regulation and video fidelity.
A VR gaming testbed referred to as Air Light Virtual Reality (ALVR) is presented in [14] and used to objectively and subjectively assess the performance of cloud VR gaming. The ALVR testbed consists of a Windows workstation with an Intel Core i5 CPU, 64 GB RAM, and an NVIDIA RTX 2080Ti GPU to constitute a cloud gaming server, a BSD workstation also with an Intel Core i5 CPU acting as traffic shaper, an ASUS RT-AC88U IEEE 802.11ac WiFi router offering wireless connectivity, and an Oculus Quest HMD equipped with an IEEE 802.11ac network interface. The testbed measures frame rate, latency, server load, client load, and throughput while bandwidth, delay, and packet loss rate are varied using Dummynet on the BSD-based traffic shaper. The results indicate that bandwidth limitations may have a higher influence on user experience than latency and packet loss rate. Related performance studies using cloud gaming testbeds include the work reported in [15] that adapts the open source cloud gaming platform [16] to conduct experiments on conventional mobile devices.
A wireless VR platform for commodity HMDs using a programmable WiGig interface as a potential cable replacement is presented in [17]. This platform comprises a desktop PC with an Intel Core i7-4790 CPU, 16 GB memory, and an NVIDIA TITAN X GPU which runs the VR applications and renders all graphic contents. A wireless Internet protocol (IP) network was set up using the Intel WiGig modules 17265 and W13100. A Dell XPS M3800 laptop with an NVIDIA GeForce 970 GPU is connected to an HTC Vive HMD for studying the computational power needed to decompress the received frames. The performance assessment of this experimental study focused on frame-processing time and end-to-end latency in the context of VR gaming. It is shown that GPU-based optimization techniques and parallel processing can reduce the total frame-processing time, but it still remains challenging to meet the end-to-end latency requirements of wireless high-quality VR applications. Powerful new hardware and novel software techniques are stated as being desirable to enable wireless high-quality VR applications.
In [18], an end-to-end wireless VR platform is proposed offering low latency remote rendering. An Intel Core i7 based PC equipped with an NVIDIA TITAN X GPU is used as the rendering server. The server holds a Mellanox 10 Gbps network interface card that is connected through a 10 Gbps Ethernet cable with a Netgear Nighthawk X10 WiGig AP. A ThinkPad X1 Yoga laptop, equipped with an Intel i7-7600U CPU and an HD 620 integrated GPU with H.264 hardware decoder ASIC, is used as the client. The laptop uses a Qualcomm QCA6320/QCA6310 WiGig module to wirelessly connect to the WiGig AP and connects to an HTC Vive HMD through HDMI and USB cables. The proposed parallel rendering and streaming mechanism reduces streaming latency while the proposed VSync driven rendering technique minimizes display latency. It is shown that this wireless VR platform supports 2K VR resolutions at a refresh rate of 90 Hz with less than 16 ms and 4K VR resolution with 20 ms end-to-end latency.
In [19], a TPCAST wireless VR testbed is presented that allows for studying temporal data delivery issues of video frames. In particular, the TPCAST wireless adapter and the underlying WirelessHD standard are used with the HTC Vive HMD. The testbed allows for monitoring and recording of traces from which video freezes at the wireless HMD can be detected. It also records traces that capture information about the input traffic from the HMD and hand controls to the rendering computer. The TPCAST wireless VR testbed was used in [20] to measure, analyze, and model network-induced temporal disturbances in VR applications with focus on freeze durations. The freeze length distributions deduced from the measurement campaign were modeled as geometric distributions.
1.2. Motivation and Contributions
Inspired by all of the above, we have conducted a measurement campaign on VR video frame freeze length, referred to as freeze length in the sequel for brevity, using a commercial TPCAST wireless VR solution. Apart from latency, video freezes and their durations or lengths are also important temporal performance indicators that impact on the QoE of networked VR applications and services. Based on the video traces collected in this experimental study, the following research questions are pursued in this paper:
- (1) What are the statistical characteristics of the number of freezes, freeze intensities, and freeze lengths to be dealt with in wireless VR solutions?
- (2) What are the statistical models that form a compact representation of the measured freeze events?
The research reported in this paper utilizes a wireless VR testbed which is based on the commercial TPCAST wireless adapter together with the HTC Vive HMD. This hardware has been chosen because TPCAST has been specifically developed and optimized for the HTC Vive HMD. In the meantime, a TPCAST adapter is also available for the Oculus Rift HMD. The TPCAST wireless VR testbed can be used as a measurement environment that allows obtaining real data on temporal impairments observed in a commercial network solution rather than needing to focus on optimizing the wireless VR technology itself. The related work reported above has extensively assessed throughput and latency issues of wireless VR systems while little has been reported on freezes and freeze lengths. The comprehensive experimental study reported in this paper hence focuses on measuring freeze characteristics and their statistical analysis and modeling. The statistical models of the freeze characteristics may be included in wireless VR simulators supporting the development of physical layer, medium access layer, and higher layer functionalities. They may also serve as network-disturbance models for VR QoE studies, e.g., generating realistic freeze events in wireless VR stimuli to support the development of perceptual quality measures, similar to what is done for conventional videos [21,22].
The main contributions of this paper are summarized as follows:
- A comprehensive description of the experimental setup of the wireless VR TPCAST testbed is provided including room topology and its structuring into quadrants, relevant hardware components and functions of the TPCAST-based wireless VR testbed, rotational movement procedure of the full-swivel chair adopted to realize the HMD rotational movement pattern, and the six distances imposed on the placements of the transmitter and receiver of the wireless link.
- The number of freezes and freeze lengths deduced from the VR video traces that were recorded in the measurement campaign are reported both accumulated over the four quadrants and for each of the four quadrants. This analysis is provided for the six considered distances.
- Histograms of the freeze lengths measured in frames are provided for different distances between transmitter and receiver to shed light on the density of the underlying distributions of the measured data. Measures of the central tendency of freeze lengths are provided to support the statistical modeling of the freeze lengths.
- Cumulative complementary histograms (CCHs) of the freeze length are provided, both accumulated over the four quadrants and for each quadrant. The CCHs reveal those scenarios which cause tails beyond the central part of the distributions of the freeze length.
- Exponential distributions and their equivalent geometric distributions are obtained as statistical models for the freeze length characteristics using curve fitting, and are reported along with their goodness of fit.
The remainder of this paper is organized as follows. The experimental setup is described in Section 2 including the TPCAST wireless VR testbed, VR video traffic, room topology, and measurement procedure. Section 3 conducts a statistical analysis of the VR video traces that were gathered in an extensive measurement campaign. Focus is given to histograms and CCHs of the VR video freeze data. In Section 4, the statistical models capturing the statistical characteristics of the freeze lengths are derived. Finally, conclusions and directions for future research are given in Section 5.
4. Statistical Models of Freeze Length Characteristics
In this section, we derive models of the progressions of the CCHs with increasing freeze length x. While x is a positive integer, the progressions shown in Figure 10 and Figure 11 are exponential, which can also be described equivalently by geometric progressions using a straightforward parameter transformation. We are specifically interested in the parameters of the fittings to the measured data, as they reveal properties, similarities, and differences between the CCHs for the various distances and quadrants.
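As a minimal sketch of how a CCH can be obtained from a list of freeze lengths, the following Python function computes, for each integer x, the fraction of freezes that last at least x frames. The "at least" convention (so that the CCH equals 1 at x = 1) is an assumption inferred from the text, the function name is hypothetical, and the sample values are invented for illustration rather than taken from the measurement campaign.

```python
from collections import Counter


def cumulative_complementary_histogram(freeze_lengths):
    """Return {x: fraction of freezes with length >= x} for x = 1..max length.

    freeze_lengths: iterable of positive integers (freeze lengths in frames).
    """
    lengths = list(freeze_lengths)
    if not lengths:
        raise ValueError("at least one freeze length is required")
    if any(length < 1 for length in lengths):
        raise ValueError("freeze lengths must be positive integers")

    counts = Counter(lengths)
    total = len(lengths)
    remaining = total  # number of freezes with length >= current x
    cch = {}
    for x in range(1, max(lengths) + 1):
        cch[x] = remaining / total
        remaining -= counts.get(x, 0)
    return cch


# Invented example data (frames), NOT measured values.
example_lengths = [1, 1, 1, 2, 2, 3, 5, 8]
example_cch = cumulative_complementary_histogram(example_lengths)
```

By construction the result is non-increasing in x and starts at 1, which is the property the fitting functions in this section are compared against.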
First, we focus on the domain of shorter freeze lengths, for which a steeper exponential or an equivalent geometric progression is observed. Because the minimum value of the freeze length is given as x = 1, we use the following exponential fitting function to serve as a statistical model of the freeze length characteristics:
where the argument of the operator exp
is written as an exponent of Euler’s number
e, parameter
scales the exponential function, and a negative parameter
describes a progression that decays with increasing argument. The fit function of the MATLAB Curve Fitting Toolbox was used to create the fit to the freeze length data x and to determine the model parameters
and
. Because the freeze lengths assume discrete values, the statistical model of the exponential progression in (
4) can also be described by a geometric sequence using the following relationship:
where
denotes the so-called common ratio.
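The relationship between the exponential and the geometric description can be illustrated with a short Python sketch. It assumes that the fitting function has the form a·exp(b·(x − 1)) with scale parameter a and decay parameter b, and that the common ratio is r = exp(b); these symbol names and the exact form are assumptions made here for illustration. The sketch also swaps in an ordinary least-squares fit on the logarithm of the CCH values, which is not the nonlinear fit performed by the MATLAB toolbox and will generally give somewhat different parameters on noisy data.

```python
import math


def fit_exponential_loglinear(xs, ys):
    """Fit y ~ a * exp(b * (x - 1)) by least squares on ln(y).

    Returns (a, b, r), where r = exp(b) is the common ratio of the
    equivalent geometric sequence a * r**(x - 1).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")

    us = [x - 1 for x in xs]          # shifted argument, 0 at x = 1
    vs = [math.log(y) for y in ys]    # ln(y) is linear in u for this model
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    if s_uu == 0:
        raise ValueError("x values must not all be identical")
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))

    b = s_uv / s_uu                   # slope = decay parameter
    a = math.exp(mean_v - b * mean_u) # intercept = ln(scale parameter)
    return a, b, math.exp(b)


# Synthetic, noise-free data generated from a = 1.2 and r = 0.5 (illustrative).
xs_demo = [1, 2, 3, 4, 5, 6]
ys_demo = [1.2 * 0.5 ** (x - 1) for x in xs_demo]
a_demo, b_demo, r_demo = fit_exponential_loglinear(xs_demo, ys_demo)
```

On noise-free synthetic data the sketch recovers the generating parameters; with measured CCHs, the log transform gives more weight to small CCH values than a direct nonlinear fit would.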
Table 3 shows the coefficients of the fitting functions (4) and (5) and the goodness of fit in terms of sum of squared error (SSE), root mean square error (RMSE), coefficient of determination, and adjusted coefficient of determination. The results for the different distances are shown accumulated over all quadrants and for each quadrant in which a sufficiently large number of freezes was observed to support a meaningful fitting. The adjusted coefficients of determination of the fits are in the range
, with the majority of values above 0.95 which indicates a very good fitting performance.
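For readers who wish to reproduce the goodness-of-fit measures outside MATLAB, the following Python sketch computes SSE, RMSE, the coefficient of determination, and its adjusted variant. It assumes the definitions commonly documented for the Curve Fitting Toolbox, namely that RMSE and the adjusted coefficient use the residual degrees of freedom n − m, with n data points and m fitted coefficients; the function name and the sample numbers are hypothetical.

```python
import math


def goodness_of_fit(ys, fitted, num_coefficients):
    """Return SSE, RMSE, R^2 and adjusted R^2 for a fitted curve.

    ys: observed values; fitted: model values at the same points;
    num_coefficients: number of fitted parameters (2 for scale and decay).
    """
    n = len(ys)
    if n != len(fitted):
        raise ValueError("ys and fitted must have the same length")
    dof = n - num_coefficients  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than fitted coefficients")

    mean_y = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    if sst == 0:
        raise ValueError("observed values must not all be identical")

    return {
        "sse": sse,
        "rmse": math.sqrt(sse / dof),
        "r2": 1.0 - sse / sst,
        "adj_r2": 1.0 - (sse / dof) / (sst / (n - 1)),
    }


# Invented observations and model values, for illustration only.
ys_obs = [1.0, 0.5, 0.25, 0.125]
ys_fit = [0.9, 0.55, 0.3, 0.1]
gof_demo = goodness_of_fit(ys_obs, ys_fit, num_coefficients=2)
```

Because the adjusted coefficient penalizes the number of fitted parameters, it is never larger than the unadjusted one whenever at least one coefficient is fitted, which makes it the more conservative indicator for short CCHs with few data points.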
The coefficients
and
of the exponential function (
4) for the different scenarios in the domain of short freeze lengths
are obtained in the following ranges:
| and | | for | m |
| and | | for | m |
| and | | for | m |
In all cases, the strongest variation of the decay parameter among the quadrants appears at distance d = 12.5 m. On the other hand, variations of the decay parameter
for the distances
m and
m are much narrower, indicating similar progressions. It may be conjectured that this characteristic is caused by the room topology (see Figure 2), which shows that the transmitter Tx,v at
m is positioned at the boundary between the open space multi-purpose lecture hall and the right-hand side narrower extension.
Alternatively, considering the geometric sequence models, the common ratio is obtained in the following ranges:
| for | m |
| for | m |
| for | m |
Again, the largest variation of the common ratio among the quadrants appears for d = 12.5 m, while the corresponding values for
m and
m are much closer to each other, indicating similar shapes. It should be mentioned that the common ratios over all quadrants are of a similar order as the values obtained for the pilot measurement campaign in [20], i.e.,
for
m and
for
m (results for
m were not reported).
The result that most of the scale parameters
of the models for the short freeze lengths are larger than 1 indicates a shift of the fitting functions toward larger
x values, crossing the point
. In contrast, a fit of a non-shifted exponential or geometric CCH would cross the point
representing the discrete freeze length and corresponding maximum CCH value. The actual shift
of a CCH can be obtained as the solution to the following condition:
which yields
Table 3 shows the shifts
for the different quadrants and distances which assume positive non-integer values
with the exception of Q1 for
m. To comply with the property of a CCH
, i.e.,
, the shift
inherent in the model
in (
5), causing the crossing of the point
, is accounted for by defining
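The role of the shift can be made explicit under one plausible reading of the model. The sketch below assumes that (4) has the form $a\,e^{b(x-1)}$ and (5) the form $a\,r^{x-1}$ with $r = e^{b}$; the symbols $a$, $b$, $r$, and $\Delta$ are placeholders introduced here for illustration and need not match the notation of the original equations.

```latex
\begin{align}
  a\,e^{b\,\Delta} &= 1
  && \text{(the fit reaches the maximum CCH value 1 at } x = 1 + \Delta\text{)} \\
  \Delta &= -\frac{\ln a}{b} = -\frac{\ln a}{\ln r} \\
  a\,r^{x-1} &= r^{\,x-1-\Delta}
  && \text{(the scale factor is absorbed into a shifted argument)}
\end{align}
```

Under these assumptions, a scale parameter larger than 1 combined with a negative decay parameter gives a positive and generally non-integer shift, whereas a scale parameter smaller than 1 gives a negative shift, in line with the qualitative description of the tabulated values.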
Second, we consider the domain of long freeze lengths
constituting the tails of the freeze length progressions. In particular, tails with slower decays captured within the domain
compared to domain
appear for the following cases (see also Figure 10 and Figure 11):
m: | all, Q3, Q4 |
m: | all, Q1, Q3, Q4 |
Table 4 shows the coefficients and common ratio of the exponential and geometric progressions, respectively, for the tails of domain
along with the goodness of fit performance for the following fitting functions:
We observe adjusted coefficients of determination in the range which again relates to very good fitting performance. However, in contrast to the short freeze length domain , the values of the scale and decay parameters vary widely in the ranges and , respectively. On the other hand, as , the concept of a shift is not needed for the modeling of the tails.
To reveal additional insights into the freeze length characteristics associated with the tails, the fitting function (9) may be formulated as a compound probability
where the probability
that a long freeze (LF) occurs is given by
Further, the probability
that a freeze
occurs conditioned on such a freeze being long, is given by
Let us consider quadrant Q3 for m that shows a particularly flat and low tail. The probability that the tail occurs is rather small, i.e., . However, the corresponding tail decays very slowly: , indicating a risk of extraordinarily large freeze lengths.
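The compound-probability view of a tail can be sketched in Python as follows. The sketch assumes a geometric tail of the form p_lf · r_tail^(x − x0) for x ≥ x0, where x0 is the first freeze length counted as long, p_lf is the probability that a freeze is long, and r_tail is the common ratio of the tail; these names, the parametrization, and the numbers used are assumptions for illustration and are not values from Table 4.

```python
def tail_as_compound_probability(p_lf, r_tail, x0):
    """Split a geometric tail p_lf * r_tail**(x - x0), x >= x0, into
    P(long freeze) and P(length >= x | long freeze).

    Returns (cch_tail, cch_given_long, mean_excess), where mean_excess is
    the expected number of frames beyond x0 given that the freeze is long.
    """
    if not 0.0 < p_lf <= 1.0:
        raise ValueError("p_lf must be a probability in (0, 1]")
    if not 0.0 < r_tail < 1.0:
        raise ValueError("r_tail must lie in (0, 1) for a decaying tail")

    def cch_given_long(x):
        # Conditional CCH: equals 1 at x = x0 and decays with ratio r_tail.
        if x < x0:
            raise ValueError("tail model is only defined for x >= x0")
        return r_tail ** (x - x0)

    def cch_tail(x):
        # Unconditional tail: P(long freeze) times the conditional CCH.
        return p_lf * cch_given_long(x)

    # Sum over k >= 1 of r_tail**k, the mean of the conditional excess length.
    mean_excess = r_tail / (1.0 - r_tail)
    return cch_tail, cch_given_long, mean_excess


# Hypothetical numbers: a rare (1%) but very slowly decaying tail.
tail_demo, cond_demo, excess_demo = tail_as_compound_probability(
    p_lf=0.01, r_tail=0.98, x0=20
)
```

With these illustrative numbers, long freezes are rare, yet a long freeze extends on average by several tens of frames beyond x0, which is the kind of low but flat tail the text describes.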
Finally, Figure 12 shows the fittings over all quadrants according to the models formulated in (4) and (9) for freeze lengths in the domains
and
, respectively. Given that log-linear plots are used in these figures to capture the large range of values of the exponential relationship, the divergence between data and fitting for increasing freeze length is numerically small while visual inspection may give the impression of a larger difference (see, e.g., Figure 12a for
m).