1. Introduction
Imagine sitting at a work desk, thinking about all the destinations one can visit. Imagine having the freedom and ability to visit any place, any time one requires a short break from attending to the chain of emails and worksheets at work. Vacations have been found to provide psychological and physiological respite [
1]. Interaction with nature and biomes of one’s choosing has also been found to restore directed attention, the root of chronic mental fatigue if left unrestored over a prolonged period [
2,
3,
4]. Rapid urbanisation has made interaction with unpolluted nature increasingly difficult; some countries have generations of people who have not grown up around any form of unpolluted nature. According to the United Nations, as of mid-2018, the population density of urban areas had surpassed that of rural areas by 5%, and urban areas held approximately 55% of the world’s population [
5]. This percentage was projected to increase to 68% by 2050, with 90% of the increase occurring in Asia and Africa [
5]. Cities are the most heavily populated urban areas due to the concentration of industrial, residential, and other public amenity constructions within a concentrated amount of land space. Modern civilisations have attempted to alleviate problems of overcrowding cities by spreading out into towns and suburban areas neighbouring the cities, where more land space is available.
This form of urban planning does not work for sovereign city states such as Singapore, where housing, recreational, commercial, industrial and even green spaces have to be accommodated onto the 721.5 km² island [6,7]. In the interest of housing an approximate population of 5.6 million people in the limited space, close to 98% of Singapore’s population lives in densely-built high-rise apartments. Including the commercial buildings that house offices and other amenities, Singapore has a total of 6021 high-rise structures [
8]. Singapore’s development pattern has made it challenging for untouched natural elements to be visually and physically accessible to residents [
7]. Singapore’s city planners tried to overcome these circumstances by integrating nature into its urban design: planting trees along the majority of its roads, building public plazas that incorporate green spaces and community gardens, and integrating nature directly into building designs (e.g., green roofs, vertical greenery) [
7,
9]. Following the increasing prominence of empirical research concerning the benefits of nature towards overall wellbeing, and amidst the growing evidence of benefits from incorporating nature into built design [
10], architectural designs on the island embraced the movement, resulting in hospitals, residential buildings, and commercial skyscrapers that incorporated greenery directly into the developments [
11]. Nonetheless, despite extensive efforts to create a ‘city in a garden’, limited land space means that residents have few opportunities to interact with pristine nature, so methods of restoration that do not require direct interaction with nature might aid in the prevention of directed attention fatigue among students and workers. Instead of exploring treatments for chronic mental fatigue, the purpose of this study was to look for prevention methods. To that end, we must first consider the nature of attention.
One of the first researchers to address attention in a systematic manner was William James. In his book The Principles of Psychology, he divided attention into several constructs [
12], two of which form the focus of the current study. The first is involuntary attention—a passive, instinctual form of attention that is effortless and usually a natural reaction towards a stimulus of personal interest to the individual. The second is voluntary (directed) attention—an actively maintained form of attention derived from a person’s cognitive well, activated when the person is required to concentrate on a task while actively inhibiting internal and external distractions. Other researchers concurred with James’s theory on the selective nature of directed attention, deeming it a necessity due to the human brain’s limited capacity to process information [
13,
14]. They reasoned that without selectivity, it would be difficult to maintain on-task concentration when people are faced with competing and distracting stimuli. Parasuraman [
13] and James [
12] both stressed the importance of attention as a tool of intentionality, to better distinguish, perceive and memorise information. This state of intent has been juxtaposed with a state of distractedness, described as being in a daze or a state of confusion. Such a state can be the result of directed attention fatigue.
Stimuli that are irrelevant to the task at hand can take the form of an active distraction that negatively affects performance, unless the distracting stimulus can be inhibited. The importance of inhibition was confirmed by John Ridley Stroop, when he elicited the eponymous Stroop effect, whereby a specific feature (e.g., word meaning) interferes with the processing of another feature (e.g., word colour) [
15]. An act of inhibition, however, does not sustain itself beyond a few seconds; according to James [
12], voluntary (directed) attention involves the repetition of successive efforts to keep focused on a task. Repeated efforts of inhibition would eventually cause depletion of the brain’s attentional resources and lead to directed attention fatigue (DAF). DAF is a phenomenon whereby fatigue occurs due to sustained concentration on one task, while actively inhibiting distractions over an extended period [
3,
16]. Symptoms of DAF include increased irritability not related to personal situations, significantly increased impatience, working memory deficit and a decreased ability to complete cognitive tasks, such as information processing, reasoning, planning and decision-making, without predisposed trauma to the brain [
17]. Left unrestored, DAF symptoms have been found to impact work productivity (e.g., increase in medical leave and turn-over rates) and result in academic failures [
17]. DAF is also the root cause of chronic mental fatigue, which is potentially irreversible [
3,
16].
Kaplan and Kaplan introduced the attention restoration theory (ART) in 1989 to explain the positive effects of unpolluted green spaces on directed attention. The Kaplans’ research, extending across several decades, showed that elements within natural environments facilitate restoration through soft fascination, compatibility, being away and extent [
2,
3]. When an environment fulfills all four components, it will stop demanding directed attention and the brain will get a chance to restore its directed attention capacity without distraction or undue strain [
3]. After restoration, the brain’s decision-making ability, alongside psychological and emotional processing capabilities, should improve.
Being away refers to the person being psychologically removed from daily stressors and routines and into a more relaxed environment, surrounded by components that would provide respite. However, the said situation would be less likely to support restoration if it lacked certain technical qualities. Extent refers to the level of immersion (connectedness) and quality of explorative experiences (scope) the environment can offer [
2]. According to past studies, a higher level of immersion should encourage engagement and the perceptual phenomenon of “presence” within a space [
18]. Increased perception of being present in a space would mean an increased likelihood of feeling as if one is away. The level of extent afforded to the person was found to reliably predict environmental preference. Within the Kaplans’ ART, being highly present in an environment also indicates restorative potential [
2,
19]. The environmental elements of natural spaces have been found to capture all four components of the ART, to offer attentional restoration and psychological respite [
2,
3].
2. Background to the Current Study
The question of how to maximise the level of restoration and respite remains unresolved, with researchers struggling to standardise the components needed to achieve restoration among urban dwellers consistently. If successful, a well-restored directed attention capacity should, theoretically, help to minimise performance error, while inhibiting distractors in both workplace and academic settings [
2,
3].
According to the evolutionary theory, human beings have an intrinsic positive response to environments consisting of natural elements [
20]. This was most noticeable in individuals who were experiencing stress. Unpolluted biomes near bodies of water with visible horizons are often viewed as havens that fulfil basic survival and safety needs [
2,
21,
22]. Wilson [
20] popularised the term biophilia to portray human beings’ innate need to connect with nature. Derived from evolutionary theory, the biophilia hypothesis posited that because humans had been primarily exposed to nature during large parts of our evolution, an affinity for nature-like stimuli was developed over time. Wilson established that biophilia is a “complex of learning rules” developed over thousands of years. Consequently, human beings would have little time for genetic adaptation toward non-nature settings; therefore, we have a relatively low inclination towards artificial, sterile and urban environments [
23].
Transposing Wilson’s biophilia hypothesis onto developing modern cities, Beatley [
24] emphasised the importance of integrating nature into cities. By making cities more biophilic in design, and thereby minimising the separation between humans and nature, it might be possible to address the human need for “a daily dose of nature” [
24,
25]. This idea was put forward by Rachel Kaplan [
26], when she introduced the concept of micro-restorative experiences (MRE) and suggested that engaging in activities, such as glancing out of a window during very short breaks at work (for as little as a few seconds), can provide some brief respite to an employee’s directed attention. For the benefit of urban dwellers who spend the majority of their time indoors, the presence of pseudo-natural elements can be crucial for MRE [
26,
27]. A later study provided support for the efficacy of MRE. Lee et al. [
27] studied the effects of exposure to either a meadow green roof or a bare concrete roof in short bursts of 40 s and found that participants in the green roof condition made significantly fewer omission errors in their post-exposure Sustained Attention to Response Task (SART). This study provided evidence for the efficacy of integrated urban greenspaces on directed attention restoration within urban spaces.
Nonetheless, it may well be the case that the “perfect” greenscape does not exist because every person’s vision would be different, with influencing factors including urban air and noise quality, weather conditions (e.g., humidity; heat; snow; monsoons), plant conditions (e.g., pollen season; temperate plants that lose their greenness), human-traffic and crowdedness [
28], which can affect consistent patronage of green spaces [
29]. Global weather fluctuates throughout the year, even in tropical climates, where flash floods are often recorded. Saw, Lim and Carrasco [
29] reported that high temperature coupled with intense humidity made it uncomfortable for residents to engage with outdoor urban greenspaces in Singapore. Another common phenomenon of places such as Singapore is heavy human traffic, which would make it difficult to achieve the feeling of being away and soft fascination because, even in the presence of nature, the need to navigate human traffic can potentially suppress the relaxation and directed attention restorative effects of urban greenspaces. These findings contradict those from studies completed in temperate regions of the world, where access and usage of greenspace was found to have positive effects on wellbeing [
29]. Inferences made from such findings also point to the potential efficacy of offering a widened range of nature exposure modalities, because modalities such as paintings and virtual exposures allow for human–nature interactions, without the potential effects of navigating traffic, pollutants and weather discomforts.
City dwellers reside and work primarily in manmade, concrete dominated environments (e.g., offices, classrooms, enclosed spaces) that are often a significant distance away from unpolluted greenspaces. McGeary and McGeary [
1] pointed out that relaxing activities, such as taking vacations, allowed employees to unwind and essentially remove themselves from burnout-inducing workplace stressors. Such vacations were presented as a potential preventive measure from chronic mental fatigue in the person-centered approaches. The said vacations were found to be beneficial to the normalisation of psychological and physiological systems of the person, but only when they were perceived as satisfying [
1]. The significant restorative effects that satisfying vacations have on individuals have been widely tested and accepted [
1,
30]. However, Kühnel and Sonnentag [
30] revealed in their 2011 study that beneficial effects of vacations faded out within one month; the fade was sped up by a post-vacation increase in job demands. They also found that daily efforts to maintain relaxation experiences post-vacation significantly delayed the fade-out of positive vacation effects. Many employees failed to maintain the positive effects of their vacations because the daily maintenance time had to be reduced to compensate for lost time and productivity while they were away [
30]. Given that employees in Singapore receive between seven and fourteen paid days of leave annually [
31], it would not be feasible to take vacations every four weeks to avoid the return of emotional exhaustion [
32]. Hence, it would be ideal if employees were afforded the ability to take daily out-of-office, vacation-like experiences to any location of their choosing to maintain the positive vacation effects.
In his study on the best places to take study breaks on campus amongst university students, Felsten [
33] found that manipulating interiors by painting large murals of dramatic nature can potentially provide opportunities for restoration. If we transposed Felsten’s findings onto those of Kühnel and Sonnentag [
30], then designing the workspace to give employees continued exposure to relaxing surroundings could potentially help with maintenance and delay the fade-out of vacation effects. However, the crux of the matter was employees not having enough time to maintain those vacation effects [
30]. If MRE can restore attention in exposures as brief as 40 s [
27,
34], perhaps virtual-reality exposures can accomplish restorative goals in under 5 min, a period of time affordable for many organisations and employees. Kühnel and Sonnentag [
30] noted that vacations had to be subjectively satisfying, and if that is of consequence to the maintenance process, then modalities that provide high levels of immersion and flexibility in stimuli selection could simulate short “vacations” and potentially improve maintenance. One such modality gaining popularity aligned with technological advances to improve the user experience is virtual reality (VR), for which immersion and presence are key concepts.
The ability to perform daily maintenance to prolong the positive effects of vacations has been made possible with the availability of technologically advanced VR systems. Virtual reality refers to computer-generated simulation of three-dimensional (3D) images that form interactive virtual spaces that surround users [
35]. Handheld or haptic controllers are used to control and interact with the virtual environment (VE) and other virtual elements. Immersive virtual reality (IVR) requires a system to deliver real-world surroundings convincing enough that the user’s disbelief is suspended, and they are psychologically engaged with the system. IVR systems include high quality 3D visual display, auditory, haptic feedback, and position tracking systems [
35]. One study using IVR examined the potential reduction in stress, cognitive fatigue, and negative affect through exposure to nature settings [
36]. Their results suggested that IVR immersion produced similar beneficial effects to exposures using surrogate nature (e.g., photographs, videos). Valtchanov et al. [
36] noted the endless plasticity potential of VR software, and how that can translate into freedom to create virtual environments that encompass visual elements that fit the subjective preferences of users. We now consider the matter of immersion and the associated issue of presence.
Various definitions of immersion have existed since the late 1990s, primarily dominated by two schools of thought, including that of Witmer and Singer [
37], and Slater [
38]. Witmer and Singer defined immersion as a psychological state of the individual perceiving him/herself as being enveloped by, interacting with and included in a virtual environment. Slater argued against that definition in his article, clarifying that Witmer and Singer’s notion of immersion was, in fact, a closer definition for presence, referring to the origin of the specialist use of “presence”, which was extrapolated by Sheridan in 1992 [
39] from Marvin Minsky’s [
40] “telepresence”. Sheridan [
39] defined presence as the effect of people’s reported level of interaction and immersion in virtual environments, which is similar to Witmer and Singer’s [
37] definition of immersion. As such, we draw on Slater’s [
38] definition of immersion, which is the extent that a system’s hardware and software designs can afford higher reported levels of presence.
For immersion to be successfully translated, the technical offerings of the system should be well developed and generalisable to its target users. Immersion refers to the objective level of sensory fidelity a VR system can provide [
38,
41]. The quality of the system plays a significant role in the level of presence experienced by the participant. Slater and Wilbur [
42] further dissected the concept of successful immersion into four components, inclusive, extensive, surrounding, and vivid. Inclusive indicates the extent to which the surrounding physical reality is shut out. Extensive specifies the range of sensory modalities accommodated. Surrounding signifies the virtual environment’s panoramic extent rather than limitation to a narrow field. Vivid indicates the resolution, richness and quality of the VR system’s display. In an ideal situation where the maximum level of immersion is provided, causing the user to experience absolute presence, all sensory modalities should be fulfilled (e.g., sound, temperature, pressure, smell) and the user would disregard the physical environment [
43]. Immersion is quantifiable and can be controlled by the quality of a system’s setup; many commercial systems are optimised to meet these technological standards to meet ever-increasing user demands for achieving immersive virtual reality (IVR).
The framed screen of a television or computer display forms a discontinuity between the space of one’s current reality and the reality shown through the display, that is, a discontinuity between different spatial and temporal realities [
42]. The fidelity of the screen displayed, and flat-projection forms of virtual environments are usually perceived as “not real”, because of their limited abilities to replicate the encapsulating sensation that real environments have on the person’s visuo-spatial system [
42]. There have been exceptions to the technical limitations of screen displays, where absolute presence through gradual stages of immersion was reportedly experienced by video gamers [
44]. Brown and Cairns [
44] defined perceived presence among video-gamers as a gradual experience in three stages, engagement, engrossment, and total immersion. They believed these stages to be sequential and the only way to achieve full immersion was to fulfill a combination of computerised, human and contextual factors in every stage. They found this form of immersion to be experienced only when gamers favored the specific game enough to fulfil at least the lowest stage of engagement. Slater and Usoh [
45] developed the concept of “body centered interaction”, which suggested that a close-to-perfect match between proprioception and sensory feedback at the perceptual and cognitive levels would afford a system-user “match”; an important component that could significantly increase the level of immersion.
Commercial VR systems (e.g., HTC Vive; Oculus Rift) use wide-field head-mounted devices (HMD) and hand-held controllers, along with no more than two palm-sized aerial movement tracking sensors to provide correspondence between the user’s head, upper body movements and the system’s visual display response. The level of immersion that VR systems provide can put the user in an already high level of immersion by planting their physiological sensors directly into an environment that fully envelops them on a visual level at the very least. Although the stages of immersion identified and described by Brown and Cairns [
44] take a longer time to achieve, total immersion was also reported by Ferreira and Falcão [
43], with gamers reporting decreased awareness toward their physical surroundings and selves. Directed attention was activated when there were specific missions and goals to be completed within a game, which aligns with the concept of hard fascination in the ART [
3]. While it has been reported that certain tasks can be enjoyable and would enhance the restorative effects of exposures, the findings reported by Ferreira and Falcão implied that directed attention was activated when individuals were faced with the specifics of in-game missions, which can be stress-inducing rather than affording attention restoration [
43].
Presence refers to a person’s reaction to immersion, meaning the user’s subjective evaluation of the virtual system’s technological setup, level of appreciation towards the interaction possibilities the system afforded, and the subjective experience of being in the VE [
41]. Presence arises from an appropriate conjunction of human perceptual and motor systems and immersion. When provided with the same immersive system, different persons may report different levels of presence [
41]. According to Slater, the subjective state of presence can be evaluated through the user’s reported experience of being in a VE, which would often correlate positively with the level of immersion afforded by the system. The level of presence experienced might also be affected by the individual’s innate level of immersive tendency [
41]. Immersive tendency encapsulates individual differences that would reflect one’s preference for information presented in different modalities (e.g., the absence of or mismatched auditory input might be a crucial hindrance for one person, but hardly noticeable for another) [
41].
It is important to differentiate presence from disengagement, due to disinterest towards the content presented. For example, if the user said, “Wow! This sounds like I am in a theatre, enjoying the orchestra live!”, this is a sign of presence. If the user said, “The music is really uninteresting after a few minutes and my mind started to drift and lose focus”, they were reflecting on their subjective preference based solely on the content, rather than presence experienced from the level of immersion afforded [
41]. A VE can be extremely presence-inducing, yet contain uninteresting content. On the other hand, VEs can be interesting and emotionally captivating for an individual, and yet reflect the quality of the content presented, not presence or immersion. As the level of presence experienced increases, the resulting behavior should be as though one were in a real-life situation with one’s perceptual, proprioceptive and autonomic systems activated, even though cognitively appreciating one is not in a real-life situation [
35,
41]. To successfully achieve presence, the system’s fidelity to reality should be high and matched to its target user’s perceptual system expectations, making the VE indistinguishable from reality. When the barrier of a television or computer screen is broken with the use of technologically advanced VR head-mounted displays and the formerly remote virtual environment becomes immediate, the level of immersion and perceived presence is expected to increase significantly. The user transcends the “screen-barrier” and enters into the VE, a defining feature of successful VR exposure [
35,
42,
46]. Supporting evidence was found when participants displayed more animated usage of their bodies during immersion (e.g., leaning, turning around and bending down), and reported higher perceived presence during exposures that had minimal delay in visual and headtracking feedback [
47,
48]. The impact of system components on perceived presence significantly outweighed the importance of pictorial realism (content component) [
47,
48].
Empirical research findings support the notion that individual differences pose a greater impact on presence than the technical characteristics of virtual environments [
49]. Immersive tendency refers to an individual’s predisposed tendency to experience presence, and this can potentially justify the different reactions and levels of engagement when different individuals are presented with the same VE. Researchers tend to control for this innate tendency because many believe that it can potentially confound research findings [
36,
50]. A person’s disposition to experience presence can be substantiated by individual beliefs, experiences, knowledge and even mental state [
51]. Individuals with experiences and knowledge that tie them to a specific environment (e.g., surgeons experiencing the VE of a surgical theatre, compared to a non-medical person experiencing the same VE) were found to experience increased levels of presence because the key elements of the environment made sense or were meaningful to them [
52]. Others found that people diagnosed with obsessive-compulsive disorder (OCD) reported immersive tendency scores that correlated positively to their level of anxiety experienced during anxiety-inducing VR exposures [
53]. If VR exposures can induce anxiety, a negative effect, they could potentially also elicit positive effects and restore directed attention by presenting participants with VEs aligned to their dispositional preferences, to maximise their level of immersive tendency and optimise restoration.
The Present Study. A comprehensive body of research performed in temperate regions confirms that nature exposure can help restore directed attention and increase general wellbeing (e.g., [
3,
22,
27,
54,
55,
56,
57,
58]). The generalisability of these studies has primarily been limited to temperate regions (e.g., Europe, North America). Saw, Lim and Carrasco [
29] provided some contradictory findings to indicate that such findings might not readily be generalised to tropical regions, due to climate differences. Since Singapore’s constant heat and humidity can be a deterrent to being with nature in the outdoors to restore directed attention, the current study was designed to examine if VEs that replicate real-life nature environments can produce similar restorative effects, in the comfort of the users’ air-conditioned spaces. We also sought deeper understanding of the effects from different levels of immersion on directed attention restoration.
Aims and Hypotheses. The first aim is to study the restorative levels of different virtual exposures. The second aim is to study the impact that immersive tendency might have on perceived presence and level of perceived restoration. Six hypotheses were proposed.
H1. Perceived restoration will be the highest for VR nature, compared to other exposures.
H2. Response time on a cognitively demanding task will be the fastest after exposure to VR nature, compared to other exposures.
H3. Accuracy on a cognitively demanding task will be the highest after exposure to VR nature, compared to other exposures.
H4. Perceived presence will be the highest for VR nature.
H5. Immersive tendencies will have a significant effect on perceived presence after controlling for age, hours spent doing cognitive tasks and occupation.
H6. Immersive tendencies will moderate the effect of VR nature exposure on perceived restoration.
4. Results
IBM SPSS v24 (IBM Corp., Armonk, NY, USA) was used to analyse the data.
Table 2 shows the descriptive statistics for the measures used in this study. There were no missing data from the 120 participants and all were used for analysis.
For the immersive tendencies quasi-independent variable, the highest and lowest possible scores of the ITQ were calculated and then dissected into quartiles for categorisation. Two groups emerged from the scores falling into either the second quartile (medium) or third quartile (high). They were dummy coded (medium immersive tendency = “0”; high immersive tendency = “1”) for the benefit of the data analyses.
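The categorisation step described above can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study: the score bounds (`ITQ_MIN`, `ITQ_MAX`), the function names and the example scores are hypothetical, and the handling of scores that fall exactly on a quartile boundary is an assumption.

```python
# Hypothetical bounds: the lowest and highest possible ITQ totals are not
# stated in the text, so these values are placeholders for illustration only.
ITQ_MIN = 18
ITQ_MAX = 126


def itq_quartile(score, lowest=ITQ_MIN, highest=ITQ_MAX):
    """Return 1-4: which quarter of the possible score range a score falls in."""
    if not lowest <= score <= highest:
        raise ValueError("score outside the possible range")
    width = (highest - lowest) / 4
    # Scores on a boundary go to the upper quartile (an assumption);
    # the maximum possible score is capped into the fourth quartile.
    return min(int((score - lowest) // width) + 1, 4)


def dummy_code(score, lowest=ITQ_MIN, highest=ITQ_MAX):
    """Dummy code as in the text: second quartile -> 0, third quartile -> 1."""
    quartile = itq_quartile(score, lowest, highest)
    if quartile == 2:
        return 0  # medium immersive tendency
    if quartile == 3:
        return 1  # high immersive tendency
    return None  # first or fourth quartile: not one of the two groups in the text


example_scores = [50, 71, 72, 98]  # made-up ITQ totals
codes = [dummy_code(s) for s in example_scores]
print(codes)
```

Splitting the possible score range, rather than the observed scores, means the cut-offs do not depend on the sample, which is consistent with only two of the four quartiles being populated.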
To test the six hypotheses, we applied the following statistical analyses:
One-way repeated-measures analysis of variance (ANOVA) for Hypotheses 1, 3 and 4,
Friedman two-way ANOVA for Hypothesis 2,
One-way multivariate analysis of covariance (MANCOVA) for Hypothesis 5,
Hayes PROCESS Model 1 (moderation) for Hypothesis 6.
4.1. H1: Comparison Test of Perceived Restoration across Three Types of Environmental Exposure
A one-way repeated-measures ANOVA was used to compare participants’ perceived restoration derived from three different environmental exposures. The assumption of normality in data distribution was violated, as indicated by the Shapiro–Wilk test (p < 0.05) [81,82]; hence, the data were analysed based on an adjusted p-value of 0.01. Additionally, Mauchly’s test indicated a violation in the sphericity assumption; hence, results from the more stringent Huynh–Feldt Epsilon corrections are reported.
The ANOVA results showed that participants perceived significantly different levels of restoration from the different virtual exposures, F(1.88, 223.88) = 72.60, p < 0.001, partial η² = 0.379. Pairwise comparisons revealed that the participants perceived a higher level of attention restoration from the VR-N exposure than from the VR-U and N-VR exposures. There was no significant difference between participants’ perceptions of attention restoration from the VR-U and N-VR exposures (p > 0.01).
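As a consistency check, the reported effect size can be recovered from the F ratio and its degrees of freedom, because partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to the values reported above; the function name is illustrative only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported above for perceived restoration (Huynh-Feldt corrected df).
eta_h1 = partial_eta_squared(72.60, 1.88, 223.88)
print(round(eta_h1, 3))  # 0.379
```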
4.2. H2: Comparison Test of Objective Restoration across Three Types of Environmental Exposure
The SART response time data sets severely violated the assumption of normality, and the subsequent removal of 20 outliers did not help in normalising the data. Therefore, a Friedman two-way ANOVA was used to test whether response time on a directed attention-demanding task decreased (became faster) after exposure to VR-N, compared to the VR-U and N-VR exposures. All three assumptions of the Friedman two-way ANOVA were met. The analysis output indicated that there was no statistically significant difference in mean response times across the four measurement times depending on the effects of the three virtual exposures (Figure 4a), as Friedman χ² = 1.098 (corrected for ties), df = 3, N-Ties = 120, p = 0.777.
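The reported p-value can be checked against the reported test statistic. The sketch below is an illustration, not the SPSS computation: it uses the closed-form upper-tail probability of a chi-square distribution, which in this form holds for df = 3 only, and the function name is hypothetical. Because the reported statistic is rounded to three decimals, the result should agree with the reported value only approximately.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form valid for df = 3 only:
    P(X > x) = erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(1.098)  # Friedman statistic reported above
print(f"{p_value:.4f}")
```

The computed probability lies within rounding distance of the reported p = 0.777, far above any conventional significance threshold.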
A Pearson’s bivariate correlation analysis was run to test correlations of speed and accuracy within the conditions as evidence of a potential speed-accuracy trade-off, whereby participants are less accurate due to a strategy of acting as fast as possible [83,84,85]. Statistically significant positive correlations for speed and accuracy scores indicate that this was very likely to be the case (Table 3), because higher accuracy scores correlated with higher response time values that signify slower speeds.
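The logic of the trade-off check can be illustrated with a small sketch. The data below are made up for illustration and are not the study’s data; the function is a plain implementation of the Pearson product-moment correlation and does not reproduce the significance tests reported in Table 3.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence has zero variance.
    return cross / math.sqrt(ss_x * ss_y)


# Made-up values for illustration only (not the study's data).
response_time_ms = [310, 335, 360, 402, 450, 498]
accuracy_pct = [38, 45, 44, 58, 63, 71]

r = pearson_r(response_time_ms, accuracy_pct)
print(f"r = {r:.2f}")
```

A positive r here means that longer response times (slower responding) go together with higher accuracy, which is the pattern interpreted above as a speed-accuracy trade-off.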
4.3. H3: Accuracy on a Cognitively Demanding Task Will Be Highest after Exposure to VR Nature Compared to the Other Exposures
A one-way repeated measures analysis of variance (ANOVA) was used to compare the mean accuracy scores on the SART across the baseline measure and three exposure conditions. Shapiro–Wilk statistics indicated that the assumption of normality was violated (p < 0.01). However, the z-scores of skewness (Zs) and kurtosis (Zk) were calculated for each group of scores, and the z-values all fell within the ±3.29 bracket, so the null hypothesis of normality was not rejected (α = 0.05) for medium-sized samples (50 < n < 300); hence, the assumption of normality was treated as met [86]. Moreover, skewness (Zs) scores were between −0.5 and 0.5, which meant that the distribution for each group was approximately symmetrical, as SARTbaselineACC (Zs) = 0.303; SARTvrnACC (Zs) = −0.443; SARTvruACC (Zs) = 0.290; SARTnvrACC (Zs) = −0.032.
Z-scores for kurtosis also proved to be within the acceptable range, as SARTbaselineACC (Zk) = −2.265; SARTvrnACC (Zk) = −2.984; SARTvruACC (Zk) = −2.612; SARTnvrACC (Zk) = −2.658. The boxplot visualisations also appeared to be symmetrical without an indication of outliers. Fmax was 1.196, demonstrating homogeneity of variances. However, Mauchly’s test indicated that the assumption of sphericity was violated (p < 0.001); hence, the Huynh–Feldt correction was applied for analysis.
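The z-score screening rule described above can be expressed compactly. The sketch below is illustrative only: it applies the ±3.29 cutoff for medium-sized samples to the skewness and kurtosis z-scores reported in the text, and the names `REPORTED_Z`, `Z_CUTOFF` and `within_cutoff` are hypothetical rather than taken from the authors’ analysis.

```python
# (Zs, Zk) pairs as reported in the text for each SART accuracy measure
REPORTED_Z = {
    "SARTbaselineACC": (0.303, -2.265),
    "SARTvrnACC": (-0.443, -2.984),
    "SARTvruACC": (0.290, -2.612),
    "SARTnvrACC": (-0.032, -2.658),
}

Z_CUTOFF = 3.29  # cutoff cited for medium-sized samples (50 < n < 300)


def within_cutoff(z_skew, z_kurt, cutoff=Z_CUTOFF):
    """Return True when both z-scores fall inside the +/- cutoff bracket."""
    return abs(z_skew) <= cutoff and abs(z_kurt) <= cutoff


screen = {name: within_cutoff(zs, zk) for name, (zs, zk) in REPORTED_Z.items()}
largest_abs_z = max(abs(z) for pair in REPORTED_Z.values() for z in pair)
print(screen, largest_abs_z)
```

All four measures pass the screen, with the largest absolute z-score (2.984, kurtosis for SARTvrnACC) still inside the bracket.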
The ANOVA results indicated a significant difference in accuracy for the directed attention task amongst the four measurement times, F(2.659, 316.473) = 4.238, p < 0.01, partial η2 = 0.034. Pairwise comparisons further revealed that participants showed significantly more accuracy in the SART after the VR-N exposure (M = 52.73, SD = 29.04), compared to their baseline measurement (M = 46.53, SD = 26.55). However, VR-N did not garner statistically significant differences in SART performance accuracy compared to the VR-U (M = 51.20, SD = 27.98) and N-VR (M = 51.03, SD = 28.70) exposures (Figure 4b).
4.4. H4: Perceived Presence Will Be Highest for VR Nature
A one-way repeated measures analysis of variance (ANOVA) was used to compare 120 participants’ ratings of presence in the three different virtual exposures, measured using the IPQ+. A Shapiro–Wilk test (p > 0.05) [81,82] indicated that one of three categories (N-VR) violated the assumption of normality (p = 0.003); however, the boxplots showed no outliers or extreme scores, and the Q-Q plot visualisation showed no obvious deviation from the line. A calculation of the skewness and kurtosis revealed that the assumption of normality was still supported, with a minor skew to the left. Shapiro–Wilk statistics for the other two categories (VR-N and VR-U) indicated that the assumption of normality was supported. The ANOVA is generally robust towards the violation of this assumption; hence, we moved forward with the analysis.
Fmax was 2.001, demonstrating homogeneity of variances. Mauchly’s test indicated a violation in the sphericity assumption; hence, the results from the more stringent Huynh–Feldt Epsilon correction are reported.
The ANOVA results showed that participants experienced more presence in some exposures than others, as F(1.83, 217.96) = 183.89, p < 0.001, partial η2 = 0.61. Pairwise comparisons further revealed that participants experienced a higher sense of presence in the VR-N exposure (M = 4.98, SD = 0.87) than the VR-U exposure (M = 5.40, SD = 0.83), and the N-VR exposure (M = 2.98, SD = 1.17). Finally, the VR-U exposure provided significantly better presence rating than the N-VR exposure (Figure 5b).
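The fractional degrees of freedom in the F test above arise from multiplying the uncorrected degrees of freedom by the Huynh–Feldt epsilon. A minimal sketch, assuming k = 3 exposures and n = 120 participants with complete data (the sample size stated for this analysis), recovers the epsilon implied by each reported value. The function name `implied_epsilon` is hypothetical and the epsilon itself is not reported in the text.

```python
def implied_epsilon(df_effect_corrected, df_error_corrected, k_conditions, n_participants):
    """Back out the sphericity epsilon implied by corrected degrees of freedom."""
    df_effect = k_conditions - 1                        # uncorrected effect df
    df_error = (k_conditions - 1) * (n_participants - 1)  # uncorrected error df
    return df_effect_corrected / df_effect, df_error_corrected / df_error


# Reported for presence ratings: F(1.83, 217.96); k and n are assumptions
eps_from_effect, eps_from_error = implied_epsilon(1.83, 217.96, 3, 120)
print(round(eps_from_effect, 3), round(eps_from_error, 3))
```

The two ratios come out at approximately 0.915 and 0.916, so both reported degrees of freedom are consistent with a single epsilon to within rounding.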
4.5. H5: Immersive Tendencies Would Have a Significant Effect on Perceived Presence after Controlling for Age, Hours Spent Doing Cognitive Tasks and Occupation
A one-way multivariate analysis of covariance (MANCOVA) was used to test if a person’s level of perceived presence would vary significantly based on their innate level of immersion tendency, after controlling for the potential effects of age, the number of hours they spend on cognitive tasks per week, and their occupation.
Assumption testing for normality of data groups showed no violation of the Kolmogorov–Smirnov statistics and histograms for each group, based on medium sample size n = 120 [86]. The assumptions of homogeneity of regression slopes and homogeneity of variances were supported, because all IV by covariate interactions were non-significant on all three levels of the DV (Table 4). Box’s M was also non-significant at α = 0.001, indicating that homogeneity of the variance-covariance matrices could be assumed. The assumption of no multicollinearity was also met, as the dependent variables were not excessively correlated with one another.
Furthermore, the relationships that existed between the dependent variables were roughly linear. A multivariate outlier was identified: the maximum Mahalanobis distance was 14.033, which was larger than the critical χ2 value for df = 3 at α = 0.012, although Cook’s distance for this data file was small (0.005, i.e., <1). Removing the data file lowered the maximum Mahalanobis distance to 9.085, satisfying the assumption of multivariate normality. Analysis proceeded without that data file, with n = 119 (medium immersive tendency, n = 75; high immersive tendency, n = 44).
Once the underlying assumptions were met, a repeated measures MANCOVA was conducted. The findings showed that the ITQ groups (dummy coded “0” for medium versus “1” for a high level of innate immersive tendency) significantly predicted the level of presence on the combined variables test, as F(3, 112) = 3.117, p = 0.027, partial η2 = 0.078.
However, analysis of the dependent variables when age, occupation and number of hours spent on cognitive tasks per week were held constant showed that the participants’ innate immersive tendency did not have a statistically significant effect on the level of reported presence, as ipqVR-N F(1, 114) = 3.616, p = 0.06, partial η2 = 0.031; ipqVR-U F(1, 114) = 1.837, p = 0.178, partial η2 = 0.016; ipqN-VR F(1, 114) = 0.028, p = 0.866, partial η2 = 0.000.
Although the MANCOVA is known to be robust towards uneven sample groups, another analysis was performed after randomly removing 31 data files from the medium ITQ group to match the number in the high ITQ group, giving a new n = 88, with 44 data files in each group. This was to confirm that the uneven sample was not the cause of the non-significant finding. The result was an improved Levene’s test result for all three dependent variables, p > 0.05, a slightly inflated F-output and effect size, with improved significance level in the combined variables test, as F(3, 81) = 3.777, p = 0.014, partial η2 = 0.123. The findings when other variables were controlled remained non-significant.
4.6. H6: Immersive Tendencies Will Moderate the Effect of VR Nature Exposure on Perceived Restoration
To find out whether an individual’s immersive tendency would moderate the effect of VR nature exposure on perceived directed attention restoration, a PROCESS Model 1 with 5000 bootstrap resamples analysis was conducted. The statistical model is displayed in Figure 6.
The IPQvrn (predictor variable) and SRPRSvrn (outcome variable) were centered prior to analysis to avoid potentially problematic high multicollinearity with the interaction term [87], and the ITQ (moderator variable) was dummy coded as “0” for medium immersive tendency and “1” for high immersive tendency. One of the prerequisites for moderation analyses was to have equal sample sizes between groups; 32 data files were randomly selected and removed from the immersive tendency (medium) pool to match the immersive tendency (high) pool (n = 44). The total number of participants for this analysis was n = 88, which still aligns with the rule of thumb of 30 for each subgroup [88]. The overall model was statistically significant, as R2 = 0.3270, F(3, 84) = 13.606, p < 0.001. This showed that close to 33% of the variance in SRPRS was explained by the predictors.
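The relationship between the model’s R2 and its overall F test can be made explicit with the standard identity F = (R2 / k) / ((1 − R2) / df_residual), where k is the number of predictors. The sketch below is a hedged illustration using only the reported values; the function name `f_from_r_squared` is hypothetical and this is not the PROCESS output itself.

```python
def f_from_r_squared(r_squared, n_predictors, df_residual):
    """Overall model F implied by R^2, the predictor count and residual df."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / df_residual
    return explained / unexplained


r_squared = 0.3270  # reported model R^2
f_value = f_from_r_squared(r_squared, n_predictors=3, df_residual=84)
percent_variance = 100.0 * r_squared
print(round(f_value, 2), round(percent_variance, 1))
```

Under these inputs the F value comes out at about 13.60, close to the reported F(3, 84) = 13.606, and R2 = 0.3270 corresponds to roughly 32.7% of the variance in the outcome.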
Upon closer observation, the IPQvrn variable (presence experienced during the VR-N exposure) showed a statistically significant main effect on SRPRS (perceived restoration), as unstandardised β = 0.569, t(84) = 3.535, p < 0.001. The innate immersive tendency of the participants did not have a significant impact on perceived restoration, as unstandardised β = −5.143, t(84) = −1.881, p = 0.063. The interaction between IPQvrn (predictor) and ITQ (moderator) also fell short of statistical significance, as F(1, 84) = 0.948, p = 0.333, ΔR2 = 0.008, unstandardised β = 0.213, t(84) = 0.974, p = 0.333. Participants’ level of presence experienced during the VR-N exposure enhanced their sense of directed attention restoration, and this was regardless of the level of innate immersive tendency.
5. Discussion
This study aimed to explore the potential effects that level of immersion and type of virtual environment can have on directed attention restoration and level of presence in a VR environment.
The hypothesis that the VR nature exposure will garner significantly higher perceived restoration scores was supported. The high-level immersion afforded by the VR-system surpasses the traditional “screen-saver” mode of watching a virtual presentation on a flat laptop screen [
89]. This finding supports the results of past research where it was found that an immersive system would transport the user into a new environment, “removed” from the physical one [
41,
42,
90]. Slater [
89] observed that computer screens were able to afford immersion, but it took deliberate attention and significantly longer exposures. The ability to expedite the immersion process would translate to less deliberate attention, which is assumed to be associated with improved attention restoration. This can potentially help employees maintain their positive vacation effects for longer and avoid rapid fade out of those effects once they have returned to their employment [
30]. The speed at which the person was immersed was not the sole reason for increased perceived restoration, because the urban VR did not significantly differ in restorative outcomes compared to the non-VR nature exposure. This is a step forward because it confirms that VR immersion can potentially offer mental respite and restoration to individuals who might avoid outdoor greenspaces because of Singapore’s climate (i.e., high heat and humidity) [
29].
The hypothesis that concerned the effect on response time for a cognitively demanding task was not supported. The data set severely violated the normality assumption. A common piece of participant feedback was that they had not realised they could take slightly longer to react to the trials to enhance accuracy.
The hypothesis that concerned the effect on accuracy for a cognitively demanding task was partially supported. Accuracy scores on the NOGO trials significantly improved from the baseline scores after participants were exposed to the VR-N exposure. However, there was no significant difference when the scores were compared to the other exposures. Manly et al. [
61] pointed out the importance of reflex inhibition and the SART accuracy scores directly reflect this. A diminished ability to inhibit reflex could mean depletion in directed attention, as it is an important component of cognitive tasks, such as social behavioural decisions in output and decision-making [
3,
17]. Although the scores did not differ significantly among the three exposures, it seems that high-immersive nature scenes would offer directed attention restoration. This finding could be extremely useful for a target demographic of employees and students who do not enjoy battling the heat and humidity to engage with nature, as an alternative to efficiently restore directed attention in the comfort of their indoor space [
29].
The hypothesis that concerned perceived presence in VR nature exposure was supported. The VR-N exposure garnered significantly higher scores on the IPQ compared to the other two exposures. The factors that make up the questionnaire are primarily directed towards the virtual system’s achievement of high-quality immersion output to its user [
71]. The VR-U exposure had a better mean score outcome than the non-VR exposure, confirming that systems that can provide higher immersion would translate to increased user presence. It is unclear whether it was the participants’ preference for nature that resulted in the higher score compared to the urban scene, or because the VR-N exposure had movement in it (a 360° video), which should increase the level of realness compared to the static view of the VR-U scene.
The static 360° image of the VR-U scene might have caused participants to experience incongruence between the static visual and “movement” in audio, because the urban soundscape audio had moving cars and light chatter of people. The VR-N would have offered congruity between audio and visual, because the plants and trees moved lightly with the flow of the wind and the lake had ripples of water movement, to match the chirping birds and rustling sounds. If it was solely the element of higher engagement due to scene realness, then this outcome supports past findings that found it possible to have highly immersive but extremely disengaging virtual experiences, and the IPQ+ was able to differentiate the two [
41].
Hypothesis 5, which concerned the effect of immersive tendencies on perceived presence after controlling for demographic factors, was partially supported. Participants’ immersive tendencies were found to have a significant effect on perceived presence, but this effect became non-significant when the participant’s age, occupation and number of hours spent on cognitive tasks were controlled. These results contradicted the findings of several past studies, where immersive tendency was believed to have a more significant effect on people’s perceived presence than the actual immersion qualities of the virtual setup [
49]. An individual’s beliefs, life experiences and mental state were reported to influence their immersion tendency [
51]. One can infer from this that immersive tendency might be considered an enduring trait within the person, but the level of presence felt by that person might be affected by a combination of many other things (i.e., age, experiences, level of mental fatigue and quality of VR system).
Hypothesis 6 was not supported because no interaction effect was detected between the predictor variable (presence experienced during the VR-N exposure) and immersive tendencies (ITQ). Presence during the VR-N exposure did have a statistically significant positive main effect on perceived restoration (SRPRS). Researchers have been known to control for participants’ immersive tendencies by limiting their sample to within one standard deviation from the mean, based on test results of the ITQ [
36]. Some have claimed that immersive tendency was the key to perceived presence, significantly surpassing the influence of the technical excellence of a virtual system [
49]. Perhaps, over the past few years of VR technical advancements, the influence of realism and other technical aspects of VR immersion has surpassed that of innate immersive tendency in significantly affecting directed attention restoration. The quality of a VR system’s output, now and in the future, could become so engaging that the innate ability to experience presence might matter a lot less.
5.1. Theoretical and Practical Implications
The current study has provided insights into alternative methods of respite that technologically advanced VR systems can offer to city dwellers, especially those who might prefer to avoid the discomfort of getting to and being immersed in natural environments to seek directed attention restoration [
29]. VR systems have the potential to help prevent chronic mental fatigue, an irreversible state that can cause significant loss to organisations, due to an increase in employee turn-over rates and medical leave. Kaplan and Kaplan’s ART and the efficacy of nature’s restorative abilities have been rigorously tested and found to be generalisable to people in many parts of the world [
90,
91].
While there has been an increase in the number of visitors to parks and other outdoor green and blue spaces, many Singapore residents do not appreciate having to endure the physical discomfort of heat and humidity just to interact with nature [
29]. The current study offered participants a chance to interact with nature via VR, within the comfort of an air-conditioned room with a faux grass patch to sit on, and found statistically significant effects on the level of perceived presence and significantly higher reported directed attention restoration. Significantly higher accuracy scores suggested that restored directed attention improved executive functions, such as decision-making and reflex control. A broader participant age range and the addition of working adults should improve generalisability.
5.2. Limitations and Future Research
The stimuli selection exercise was not successful and the current study’s investigators could not be certain that the exposure scenes they had selected were the best choices. Future studies can look at designing user interfaces, where participants can select and put together elements they prefer and “build their own VE”. It would be feasible to look at whether presence will increase significantly since it includes the element of individual preference, and whether that will translate into higher directed attention restoration scores. The matter of motion sickness was avoided through selection of the alternative city street environment (urban exposure); none of the participants reported any experience of motion sickness. However, future studies should also take this possibility into consideration for stimulus selection.
The second limitation also involves the selection of the urban exposure scene. Ideally, it should have been a 360° video, as was the case for the nature scene. This discrepancy might have affected the outcome of our measures, because participants might have been bored in the static urban exposure, where nothing moved for three minutes. Standardising this in future studies would give improved confidence that any significant effect still detected for the nature scene is genuine, rather than a ripple effect of higher presence or of participants’ boredom in a static urban scene.
The time of day that participation of the study took place was not standardised (earliest 9 a.m., latest 11 p.m.); therefore, the attention fatigue baseline of the participants was also not standardised. If future studies can standardise the time of participation, an optimal time during the day can be found to expose people to the restorative scenes to maximise the directed attention restoration effects. We had participants who claimed low presence because when they looked down at where their feet were supposed to be, they felt as if they were floating. The VR system used for the current study did not include a “physical body” within the VE that would have seemed to be a continuation of the participant’s real body to provide more continuity in the visual spectrum.
A large percentage of Singaporeans grew up on the island without exposure to unpolluted nature. This might mean that being immersed in unpolluted nature would have counter-restorative effects, because they might not feel safe and comfortable in such an environment. Future research can compare Singaporeans with residents of other urban locations who grew up interacting with unpolluted nature and then moved to a city to work as adults. Such a comparison would explore Wilson’s [
20] biophilia hypothesis better. With that answer, researchers would have better understanding to inform the design of restorative VEs for urbanised populations who find respite in simulated nature.