*Review* **Extended Reality for Smart Building Operation and Maintenance: A Review**

**Marco Casini**

Department of Planning, Design, and Technology of Architecture (PDTA), Sapienza University of Rome, 00185 Roma, Italy; marco.casini@uniroma1.it

**Abstract:** The operation and maintenance (O&M) of buildings and infrastructure represents a strategic activity to ensure they perform as expected over time and to reduce energy consumption and maintenance costs at the urban and building scale. With the increasing diffusion of BIM, IoT devices, and AI, the future of O&M is represented by digital twin technology. To effectively take advantage of this digital revolution, thus enabling data-driven energy control, proactive maintenance, and predictive daily operations, it is vital that smart building management exploit the opportunities offered by extended reality (XR) technologies. Nevertheless, given the novelty of XR in the AECO sector and its rapid and ongoing evolution, knowledge of the specific possibilities and of the methods of integration into the building process workflow is still piecemeal and sparse. With the goal of bridging this gap, the article presents a thorough review of virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies and applications for smart building operation and maintenance. After defining VR, AR, and MR, the article provides a detailed review that analyzes, categorizes, and summarizes state-of-the-art XR technologies and their possible applications for building O&M along with their relative advantages and disadvantages. The article concludes that the application of XR in building and city management is showing promising results in enhancing human performance in technical O&M tasks, in understanding and controlling the energy efficiency, comfort, and safety of buildings and infrastructure, and in supporting strategic decision making for the future smart city.

**Keywords:** building operation and maintenance; extended reality; virtual reality; augmented reality; mixed reality; immersive technologies; digital twins; metaverse

### **1. Introduction**

In the last five years, the more widespread adoption of digitalization and innovative technologies and systems has started making the digital and physical worlds more deeply interconnected and interrelated, thus transforming the way buildings and infrastructure are managed, with the goal of providing a more energy-efficient, comfortable, sustainable, and profitable built environment [1–3].

This transformation is enabling buildings to exchange, process, and exploit data and information, communicate with users, and share their assets with those of cities. Unprecedented possibilities are coming from the spread of key technologies such as building information modeling (BIM), artificial intelligence (AI), big data, and the Internet of Things (IoT), which are making the built environment become smarter, allowing building performance and user experiences to be greatly improved [4–10]. At the urban level, in particular, the integration of intelligent building systems with those of the city can allow for smart energy management capable of taking into account power demand and availability in real time [11–13].

**Citation:** Casini, M. Extended Reality for Smart Building Operation and Maintenance: A Review. *Energies* **2022**, *15*, 3785. https://doi.org/10.3390/en15103785

Academic Editor: Fernando Morgado-Dias

Received: 29 March 2022 Accepted: 18 May 2022 Published: 20 May 2022

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

At the center of this revolution is the possibility of overlapping (combining, mirroring) the life cycle of the building with a "digital twin building life cycle" able to interact bidirectionally with the physical world [1]. The integration of sensing technologies into the real world can generate a detailed and continuous flux of data that powers realistic and accurate digital replicas of physical objects (digital twins), which can be freely analyzed and configured, knowing that every informed decision made in the digital world will be valid, and automatically implemented, in the physical world in turn [14,15]. In this way, the huge amount of data collected in both the construction and management phases of the building can be analyzed by artificial intelligence and machine learning algorithms to identify patterns and to create a data model to make predictions and support decision making along the entire value chain (resource optimization, cost prediction, risk analyses, predictive maintenance, etc.) [16].

The transition towards this digital ecosystem has accentuated the need for a more effective visualization of such a wealth of information, highlighting the limitations of traditional visual techniques throughout the life cycle of the building and requiring more dynamic tools capable of creating immersive virtual experiences of the real world, and thus allowing a greater understanding of the built environment. As real spaces are becoming more and more connected and sensor enabled, the demand is to combine real-time 3D digital representations with information from heterogeneous data sources, including BIM models and IoT sensors, and to improve the user experience with human interfaces with clear information dashboards and intuitive controls such as vocal and gesture commands.

In this picture, extended reality (XR) technology, encompassing virtual, augmented, and mixed reality, is proving revolutionary in supporting building operation and maintenance, providing users with enhanced visualization thanks to the ability to show superimposed instructions, technical schemes, or sensor data right in their field of view, as well as allowing remote control and hands-free communication [17].

XR is the tool that sits between the digital twin and real-world information, essentially helping people to complete the marriage of the virtual and physical worlds and to improve both. The application of XR technology shows promising results in enhancing human performance in carrying out operational tasks, as well as in improving maintenance activities and supporting strategic decision making [18]. Furthermore, at a time when remote work has become standard for companies in every industry (with over 1 in 4 Americans continuing to work remotely in 2022 [19]), integrated cooperation platforms enable remotely connected colleagues and experts to see the live, first-person view captured by the XR viewer and collaborate proactively with the person in the field.

Currently, the literature about XR technologies in the AECO sector and their applications in O&M is growing but is still limited in scope and lacking in more operative aspects. In particular, Alizadehsalehi et al. [20] presented a review of the most recent VR, AR, and MR technologies in the design and construction industry and an introduction to the most commonly used wearable XRs on the market. Khan et al. [21] presented a study of 64 papers on the integration level of XR immersive technologies with BIM in the AECO industry, divided into eight domains: client/stakeholder, design exploration, design analysis, construction planning, construction monitoring, construction health/safety, facility management, and education/training. Sidani et al. [22] presented a review of 24 selected papers on the tools and techniques of BIM-based augmented reality in the AECO sector, exploring six main application fields: collaboration, construction design, construction management, construction safety, facility management, and worker performance. Delgado et al. [17] presented a study on the usage landscape of AR and VR in the AEC sectors and proposed a research agenda to address the existing gaps in the required capabilities. Cheng et al. [23] presented a state-of-the-art review on MR applications in the AECO industry in which 87 journal papers on MR applications were identified and classified into four categories: architecture and engineering, construction, operation, and applications in multiple stages. Delgado et al. [24] presented a systematic study of the factors that limit and drive the adoption of AR and VR in a construction sector-specific context. Albahbah et al. [25] reviewed seven applications of AR that may benefit the construction industry, namely, safety management, communication and data acquisition, visualization, construction management education, progress tracking, quality management, and facility management. Zhu et al. [26] presented a state-of-the-art review of the application of virtual and augmented reality technologies for emergency management in built environments, while Li et al. [27] reviewed VR and AR applications in construction safety. Coupry et al. [28] presented a study on how XR technologies combined with digital twin technology can improve maintenance operations in smart buildings. Noghabaei et al. [29] investigated the trends in AR/VR technology adoption in the AEC industry by conducting two user surveys in 2017 and 2018 involving 158 industry experts. The results showed the potential for solid growth of AR/VR technologies in the AEC industry in the following 5 to 10 years. The surveys also highlighted some limitations to adopting AR/VR in the AEC industry, such as a "lack of budget," "upper management's lack of understanding of these technologies," and "design teams' lack of knowledge". Prabhakaran et al. [30] carried out a systematic review of scientific publications between 2010 and 2019 to understand state-of-the-art immersive technology applications in AEC, revealing the following nine critical challenges: infrastructure, algorithm development, interoperability, general health and safety, virtual content modelling, cost, skills availability, multi-sensory limitations, and ethical issues.

In consideration of the speed with which this technology has appeared in the AECO sector and the equally rapid technological and application developments, in turn linked to the diffusion of other digital technologies (BIM, IoT, 3D mapping), knowledge of the specific possibilities offered by XR technology and of the methods of its integration into the building process workflow is still piecemeal and sparse. Currently, there is no granular study that analyzes how and for what purposes O&M companies are using XR technologies. In particular, the differences between VR, AR, and MR are not yet clear to industry insiders, and there is still insufficient information on which devices are most appropriate to employ and which skills are necessary for their effective use. The study presented in this paper seeks to fill these gaps.

Overall, this article presents a thorough review of virtual reality (VR), augmented reality (AR), and mixed reality (MR) technologies and applications for smart building operation and maintenance. After defining VR, AR, and MR, the article provides a detailed review that analyzes, categorizes, and summarizes the state-of-the-art XR technologies and their possible applications for building O&M along with their relative advantages and disadvantages.

The paper is structured in six sections. After the introduction, Section 2 describes the research method used. Section 3 is dedicated to the definition of extended reality (XR) technology, illustrating the main differences between virtual (VR), augmented (AR), and mixed reality (MR) and presenting an overview of the most recent applications of XR in the AEC industry. Section 4 describes XR technologies and the most commonly used wearable XR devices on the market in terms of features, ease of use, and specifications. Section 5 highlights and illustrates the main possible applications of XR in building O&M, showing the potential benefits in terms of time and cost reductions and performance improvement. Finally, the conclusions are expounded in Section 6.

### **2. Research Method**

This review is based on two main sources of knowledge, described in the following subsections: a review of academic articles and a survey of XR hardware and software available on the market.


### *2.1. Academic Articles Review*

A literature review was conducted on articles retrieved from well-recognized academic journals within the domain of the AECO industry to reflect the recent development trends and current situation of XR applications in the building sector.

Scopus, Web of Science, and Google Scholar were used as database search engines. The selected search period ranged from January 2018 to the end of April 2022. This interval was chosen because of the breakthrough innovations in immersive technologies that took place in the last five years, along with the widespread adoption of BIM and digital twins in the AECO sector and in O&M in particular. Document and source types were restricted to conference papers, research articles, and review articles in the English language. The subject area was limited to engineering, computer science, environmental science, and energy.

The research method followed two steps. First, articles were selected from the aforementioned databases by searching the title, keywords, and abstract fields for different combinations of the following keywords: extended reality, virtual reality, augmented virtuality, augmented reality, mixed reality, immersive technologies, digital twin, building, architecture, engineering, design, construction, operation, maintenance, building information modeling (BIM), civil engineering, and facility management (FM).

Based on the findings of the first search round, a second round was conducted by manually filtering the papers to remove those unrelated to XR applications in the AECO industry. The abstract of each paper was read by the author to ensure that its application fell within the AECO industry; for example, several papers from the first round concerned design in manufacturing or mechanical engineering. Duplicates were also removed. After the two rounds of filtering, 72 articles were selected from 43 journals and 4 proceedings and classified into three application categories: (i) architecture and engineering design, (ii) building construction, and (iii) building operation and management.
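The two-round screening described above can be sketched in code as follows; the keyword subsets, off-topic terms, and sample records are illustrative stand-ins, not the actual query set or dataset used in this review.

```python
# Illustrative sketch of the two-round literature screening described above.
# Keyword lists and records are hypothetical examples, not the real dataset.

XR_TERMS = {"extended reality", "virtual reality", "augmented reality", "mixed reality"}
AECO_TERMS = {"building", "construction", "architecture", "maintenance", "facility management"}

def first_round(records):
    """Keep records whose title/abstract/keywords mention an XR term and an AECO term."""
    hits = []
    for rec in records:
        text = " ".join([rec["title"], rec["abstract"], rec["keywords"]]).lower()
        if any(t in text for t in XR_TERMS) and any(t in text for t in AECO_TERMS):
            hits.append(rec)
    return hits

def second_round(records, off_topic=("manufacturing", "mechanics")):
    """Manual-filtering analogue: drop duplicates and clearly off-topic papers."""
    seen, kept = set(), []
    for rec in records:
        if rec["title"] in seen:
            continue  # duplicate of an already-kept record
        seen.add(rec["title"])
        if any(word in rec["abstract"].lower() for word in off_topic):
            continue  # outside the AECO industry
        kept.append(rec)
    return kept

papers = [
    {"title": "AR for facility management", "abstract": "Augmented reality in building maintenance.", "keywords": "augmented reality; facility management"},
    {"title": "AR for facility management", "abstract": "Augmented reality in building maintenance.", "keywords": "augmented reality; facility management"},
    {"title": "VR in gear manufacturing", "abstract": "Virtual reality for manufacturing lines.", "keywords": "virtual reality; building"},
]

selected = second_round(first_round(papers))
print(len(selected))  # the duplicate and the manufacturing paper are removed
```

The first round is a broad keyword match; the second mirrors the manual abstract check, discarding duplicates and off-domain papers.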

To trace the evolution of XR over the past years, additional papers published before 2018 were also examined, and 14 articles were added to the list for a total of 86 (see Tables 1 and 2).

**Table 1.** Selected literature for review.


**Table 2.** Number of selected articles by journal and year of publication.




Selected articles concerning XR applications in building operation and management were further divided into VR and AR/MR categories, each articulated in different domains according to their possible use in building facility management, as shown in Table 3.


### *2.2. XR Hardware and Software Research*

Along with the academic articles review, a thorough analysis of XR hardware and software technology on the market was carried out in order to understand the state of the art of the industry in terms of performance, availability, and accessibility of products, as well as the trends shown by in-development prototypes and proofs of concept.

VR and AR device databases such as VRcompare [105] and infinite.cz [106] were used to conduct a wide-ranging survey of the market offering, which identified 70 different VR headsets and 34 AR/MR headsets released in the period 2018–2022 (see Table 4).



All XR devices were then investigated by examining the technical sheets and documentation available from the manufacturers, with particular focus on display characteristics and the virtual space experience. Relevant characteristics included the typology of the device (head-mounted, smartphone, or tablet based), the type of 3D tracking, degrees of freedom (DoF), screen resolution, field of view (FoV), and, in the case of VR, whether the device was tethered to a PC or self-contained. This information allowed the most relevant products to be identified, compared, and presented to the reader in the VR, AR, and MR sections of the article.
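The specification screening described above can be sketched as a simple filter over device records; the device names and figures below are hypothetical placeholders, not entries from the actual survey.

```python
from dataclasses import dataclass

@dataclass
class Headset:
    """Minimal record of the characteristics compared in the survey."""
    name: str
    kind: str          # "VR" or "AR/MR"
    dof: int           # degrees of freedom: 3 or 6
    fov_deg: float     # horizontal field of view
    tethered: bool     # whether a PC connection is required

def shortlist(devices, min_dof=6, min_fov=90.0):
    """Keep devices meeting minimum tracking and field-of-view thresholds."""
    return [d for d in devices if d.dof >= min_dof and d.fov_deg >= min_fov]

# Hypothetical catalog entries for illustration only.
catalog = [
    Headset("Standalone VR A", "VR", 6, 100.0, tethered=False),
    Headset("Phone viewer B", "VR", 3, 90.0, tethered=False),
    Headset("MR visor C", "AR/MR", 6, 52.0, tethered=False),
]

print([d.name for d in shortlist(catalog)])  # only the 6DoF, wide-FoV device remains
```

In practice, the thresholds would vary by category (AR/MR optics typically offer narrower FoV than VR displays), but the comparison logic is the same.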


Likewise, starting from online databases such as Sourceforge [107] and Capterra [108], software on the market for developing and using XR content was investigated, studying the different possible applications in the building process. In addition to the commercial ready-to-use XR software examined in the following sections of this article, the most widely used development toolkits for AR, VR, and MR were examined to better contextualize the limits, requirements, and opportunities of current and future applications in the building process.

In particular, resources from the main AR software development kits (SDKs) for smartphone applications, such as ARCore by Google (Mountain View, CA, USA), ARKit by Apple (Cupertino, CA, USA), Vuforia (PTC, Boston, MA, USA), and Wikitude (Salzburg, Austria), were consulted to understand the means of AR visualization and interaction (environment understanding and motion tracking). Web-based AR tools such as AR.js and A-Frame were also considered.

Concerning VR, the research covered both 3D and BIM model creation tools (Autodesk Revit and 3ds Max, Graphisoft Archicad, SketchUp Pro, Cinema 4D, Rhinoceros) as well as the main graphics engines supporting VR visualization and interaction (Unity Reflect, Unreal Engine 4, Enscape, CryEngine). These engines are compatible with the majority of VR devices on the market (Meta Quest (Meta, Cambridge, MA, USA), Valve Index (Valve Corporation, Bellevue, WA, USA), HTC Vive (HTC, Taoyuan, Taiwan), PlayStation VR (Sony Interactive Entertainment Inc., San Mateo, CA, USA), etc.) out of the box, i.e., they can read input from sensors and controllers, process these data, and produce the correct instructions for visualization. Additional functions enabled by the SDKs provided by the manufacturers of VR headsets were also considered.

Regarding MR, the main resource investigated for importing 3D models and integrating MR functionalities was the Microsoft Mixed Reality Toolkit (MRTK; Microsoft, Redmond, WA, USA), available both for the Unity and Unreal Engine 4 graphics engines, which can be further customized using the OpenXR standard.

### **3. Extended Reality**

The terms extended reality, "X-reality", or XR refer to the use of different technologies to create immersive digital experiences and include various combinations of computer-generated content and reality, including virtual reality (VR), augmented reality (AR), and mixed reality (MR) [109]. XR is used as an umbrella category that encompasses all real–virtual combined environments as well as man–machine interactions through computer technology and wearables [20,21].

These different technologies within the XR domain are distinguished according to their immersivity in the virtual environment, the level of interaction between the real and virtual worlds, and the hardware required for their use (see Figure 1).

**Figure 1.** Extended reality and the virtuality continuum. Redrawn from [110].

Virtual reality (VR) includes all immersive content that is completely digital and computer-generated (CG) and experienced through a VR headset or head-mounted display (HMD). In VR, the user is isolated from the real world and its surroundings as the current reality is replaced aurally and visually with a new 3D digital environment.

Augmented reality (AR) instead superimposes CG content over the real world, which remains tangible. This CG overlay can also superficially interact with the environment in real time. AR is primarily experienced through wearable goggles or HMDs or via smartphone and tablet screens.

Finally, mixed reality (MR), also known as immersive media, hybrid reality, or spatial computing, produces a CG overlay in which virtual 3D elements can integrate with, enrich, and interact with the real-world environment beneath. MR wearable devices combine several technologies and commonly feature transparent lenses to allow the CG overlay. The definition of MR often includes the concept of augmented virtuality, another form of XR in which the scenario is a virtual world to which real contents are added (as in a virtual classroom where a lesson given by a real, physically present teacher can be attended), in contrast to conventional AR, in which the context is the physical world onto which digital contents are overlaid.

The diffusion of all XR technologies is on the rise, with an expected market of USD 60.55 billion by 2023 [22] and an expected CAGR of 57.91% over the forecast period 2022–2027 [111]. At the same time, the worldwide market for AR and VR headsets grew 92.1% year over year in 2021, with global shipments reaching 11.2 million units [112].
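For readers unfamiliar with the metric, a compound annual growth rate (CAGR) of this magnitude translates into market size as shown below; the sketch is illustrative arithmetic only, using an arbitrary base value rather than any figure from [111].

```python
def project(base, cagr, years):
    """Compound a starting market size by a constant annual growth rate (CAGR)."""
    return base * (1.0 + cagr) ** years

# With the reported CAGR of 57.91%, an assumed base of 100 (arbitrary units)
# in 2022 grows nearly tenfold over the five compounding years to 2027.
size_2027 = project(100.0, 0.5791, 5)
print(round(size_2027, 1))
```

The point of the illustration is simply that a CAGR near 58% implies roughly an order-of-magnitude expansion over a five-year forecast window.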

After being included in Gartner's top 10 strategic technology trends for 2019 [113], since 2020, XR has been rapidly approaching a much more mature state and becoming more integral to business and IT. Several companies are currently developing supporting technologies that may help commercialize XR technologies globally, and MR-based applications were among the top 10 ranked ICT technologies in 2020 [114]. The increasingly pervasive adoption of geo-localized apps, supported by extended networks and the massive digitization of both the built environment and its processes, along with the intersection with data mining, deep learning, and machine learning, is creating an ever-growing fertile ground for XR development and experimentation. Finally, the increasing adoption of gesture-based computing is also aiding market growth by opening up several applications and opportunities.

Currently, XR technology represents one of the most promising innovations within the digitalization process of the construction industry, showing remarkable potential in increasing quality, efficiency, productivity, and profitability in all phases of the AECO process—design, construction, operation, maintenance, and real estate management.

XR can support designers, facility managers, and technicians by enhancing their physical realities with digital data and making the combined content readily shareable, allowing them to improve the quality and efficiency of work, to take better decisions faster, as well as to collaborate and communicate in a much more effective way. Using XR, they can visualize, explore, and comprehend blueprints, models, and site conditions more conveniently, greatly improving workflows throughout the lifecycle of buildings and infrastructure (see Table 5). Overall, the principal uses of XR in AECO activities include stakeholder engagement, design support and review, construction planning, progress monitoring, construction safety, operative support, operation and management, as well as workmen training [1].

### *3.1. Virtual Reality*

Virtual reality (VR) refers to the use of computer hardware and software to generate realistic aural, visual, and other sensations, creating an immersive environment and simulating the user's physical presence in it by enabling real-time interactions via sensorimotor channels.

As a concept, VR first became mainstream in the 1990s as several industries were influenced by videogames and by the first immersive human–computer interaction prototype, the "Man-Machine Graphical Communication System". A second wave of VR emerged after 2005 and saw more effective applications in different fields, including engineering, design, architecture, construction, medicine, mental health, military, education and training, fine arts, entertainment, business, communication and marketing, and travel. Today, research and development on VR devices, hardware and software products, and user interfaces is progressing rapidly, and technology adoption is quickly reaching many leading players in manufacturing, including several in the AECO industry.



To provide users with effective and productive VR experiences, VR content creators carefully design and translate 3D models, 2D images, and spatial sounds into a machine-operable format that considers the intended final use, the VR device employed, and the degree of interaction expected. Specifically, VR experiences can be either static (i.e., limited to the immersive visualization of a 360° spherical image or video from a predetermined point of view, with no possibility of free movement) or dynamic (allowing the user freedom of movement and interaction within the VR environment).

Dynamic VR grants much more freedom to the viewer compared to static VR, allowing them to wander unrestricted inside the CG environments, interact with reproduced objects, and even alter the scene directly from within the virtual dimension. Static images are not a viable solution for dynamic VR: all content must be generated continuously according to the user's input and movements using real-time 3D (RT3D) rendering tools. Rendering time must be fast enough to be imperceptible to the viewer, to ensure that the virtual reproduction of reality is as authentic as its analog counterpart and to provide precise control and feedback over movement and interaction. Such a level of performance cannot be achieved by conventional 3D and BIM rendering software; hence, VR usually employs specifically optimized graphics engines that often borrow their technology from the videogame industry. These can be integrated with user–environment interactions using their own libraries and VR platform software development kits (SDKs).
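The rendering-speed constraint can be made concrete with a simple frame-budget calculation; the refresh rates below are typical headset values used purely as examples, not specifications of any particular device.

```python
def frame_budget_ms(refresh_hz):
    """Maximum time available to render one frame at a given display refresh rate."""
    return 1000.0 / refresh_hz

# At 90 Hz (common for tethered VR headsets), the whole RT3D pipeline --
# input handling, scene update, and rendering -- must complete in about
# 11.1 ms, which is why VR relies on optimized game engines rather than
# conventional 3D/BIM rendering software.
print(round(frame_budget_ms(90), 1))   # budget in ms at 90 Hz
print(round(frame_budget_ms(120), 1))  # tighter budget at 120 Hz
```

Missing this budget causes dropped frames and perceptible lag, which undermines both the sense of presence and user comfort.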

With its advanced capabilities of immersive and interactive visualization, VR has been advocated to facilitate design, engineering, construction, and management for the built environment [17,31,115].

As VR provides a fully immersive experience, it can prove a particularly effective tool for designers, contractors, and owners to review plans and fully comprehend operation sites before projects are put into motion. The VR application controls the entire environment that the user sees, creating a uniquely detailed and dynamic individual experience. VR can also improve collaboration among stakeholders [32], enable a better understanding of complex designs [33], help identify design issues [34], support outdoor [35] and indoor [36] lighting design, realistically recreate building geometry so that users can fully comprehend the project and reach a better design decision [37], and aid collaborative decision making [38]. Regarding the construction phase, evidence suggests that VR technologies can be effectively deployed to support construction safety training [47], project schedule control [48], and optimization of the construction site layout [49].

### *3.2. Augmented Reality*

Augmented reality (AR) can be defined as the technology that combines the physical and digital worlds by superimposing CG content (text, images, and interactive graphics) onto real-world objects. It is experienced through HMDs, smartphones, or tablets able to provide users with both the CG video feed and a direct view of the real world. Superimposed information can be either constructive (i.e., additive to the natural environment) or destructive (i.e., masking the natural environment). Unlike VR, therefore, AR is neither immersive, as the viewer continues to see the real world around them without any feeling of being elsewhere, nor exclusive, as the user maintains their capacity to interact with objects and people around them. As such, the main benefit of AR consists in the possibility of providing content or information associated with the context in which users actually are, for instance, integrating into the real environment objects that are not present in reality, greatly facilitating the transfer of information from abstract representations to the physical world. AR can provide relevant information right in the context where it is needed, showing enormous potential in facilitating manual tasks, enabling time savings, and improving quality and efficiency, as already shown by evaluations in numerous projects.

AR systems commonly determine the positioning of the CG content in space using geolocation (GNSS, Wi-Fi triangulation, beacons, etc.) in situations where a large approximation is acceptable or, when higher precision is required, by using tags, markers, or anchors in the real world. These usually consist of 2D prints similar to QR codes, which can be recognized by the camera, acting both as a trigger for displaying the CG content and as a reference point to define its position, size, and perspective.
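As a minimal, hypothetical illustration of marker-based anchoring: once the camera recognizes a marker, the system recovers a pose (position and orientation) that the renderer applies to the CG content as a homogeneous transform. The sketch below applies such a pose to a point of a virtual object; all values are invented for illustration and no specific SDK is implied.

```python
import math

def pose_matrix(tx, ty, tz, yaw_rad):
    """4x4 homogeneous transform (rotation about the vertical axis plus a
    translation), as might be recovered from a detected fiducial marker."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [c, 0.0, s, tx],
        [0.0, 1.0, 0.0, ty],
        [-s, 0.0, c, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(m, p):
    """Transform a 3D point by the 4x4 matrix (homogeneous coordinates)."""
    x, y, z = p
    v = (x, y, z, 1.0)
    return tuple(sum(m[r][i] * v[i] for i in range(4)) for r in range(3))

# A virtual object's local point placed relative to a marker detected
# 2 m ahead of the camera, rotated 90 degrees about the vertical axis.
marker_pose = pose_matrix(0.0, 0.0, 2.0, math.pi / 2)
print(tuple(round(c, 3) for c in apply(marker_pose, (1.0, 0.0, 0.0))))  # -> (0.0, 0.0, 1.0)
```

The same pose also supplies the scale and perspective cues mentioned above: every vertex of the CG model passes through this transform before rendering, so the overlay stays locked to the marker as the camera moves.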

Nonetheless, in all the aforementioned cases, the CG content cannot recognize any physical objects within the real-world environment, meaning that CG and real-world content are not able to respond to one another. Even when the 3D virtual object is correctly placed, scaled, and rotated in the space environment, it is solely anchored to the camera view and is always displayed in the foreground with respect to the real objects in the environment. For instance, when the user sees a CG image of a chair on the floor in front of them, the latter appears in the foreground even if a real object is introduced in between, disrupting the illusion. Indeed, AR does not allow digital objects to be occluded behind real objects, thus preventing true interaction between the two worlds: the information of objects overlaid onto the real environment cannot be interacted with, and that is where mixed reality comes into play.

Researchers have proposed several ways to use AR for architecture, engineering and construction, and facility management (AEC/FM) projects that can yield many advantages for enhancing and improving representation techniques on a job site [50,51].

Thanks to their ability to precisely overlay CG objects and information on the real world, in the AECO industry, AR and MR certainly show their most effective deployment in the phases of construction and maintenance of buildings, where they act as true human augmentation tools [52–54].

Nevertheless, AR and, in particular, MR are finding numerous and promising applications in the design phase as well [17,39,40], concerning building renovations, infrastructure design, urban planning [17], digital fabrication in architecture [41], virtual tours and in situ walkthroughs (for instance, overlaying virtual reconstructions on heritage sites to enrich cultural tours) [42–44], and augmented user experiences in art and architecture exhibitions (such as the Exploring SongEun Art Space exhibition in Seoul by Herzog and de Meuron of November 2021), engaging attendees in a combination of physical and virtual worlds in which virtual agents can be interacted with as one would with their real-world counterparts [45].

### *3.3. Mixed Reality*

Mixed reality can be described as an upgrade of AR that further bridges the virtual and real world together, providing a more connected experience where both virtual and real elements can fully interact with each other. As a term, "mixed reality" was first defined in 1994 by Paul Milgram and Fumio Kishino as "anywhere between the extrema of the virtuality continuum", with the virtuality continuum extending from completely real to completely virtual environments [110].

Mixed reality can effectively remove the boundaries between real world and virtual content by properly implementing occlusion: CG objects can be correctly obscured from the user's point of view by objects in the physical environment, effectively becoming part of the real world. A CG ball will bounce off real tables and walls, or disappear under a real couch. MR allows users to manipulate virtual elements as they would in the real world, with the digital content behaving and reacting accordingly. For instance, one could turn a virtual object around using physical gestures to inspect it from all angles.

Mixed reality can be achieved either by integrating CG content into the real world or by adding elements from the real world into the virtual environment (such as streaming video from physical spaces, as through a webcam). Therefore, MR must be experienced through semi-transparent lenses or MR headsets equipped with a camera in order to film and display the user's environment. During their operation, MR devices continuously scan the surrounding environment to update the associated 3D model: this allows digital content to be placed over the real world and enables users to interact with it seamlessly, thus enabling the mixed reality experience [114].

In the design phase, in addition to the possibilities offered by AR, MR can provide virtual collaborative environments in which several users from different places in the world can meet to inspect and interact with a CG maquette as they would with a physical one actually present in the room, communicating the design intent to stakeholders without needing expensive real models. For instance, SketchUp Viewer is an MR application designed for multiple platforms, the main features of which are a tabletop display of a 3D model and of a 1:1 scaled version of the model itself, as well as a user interface (UI) to edit or view information on the tabletop model and a navigation interface for the 1:1 model.

Carrasco et al. [46] assessed the effectiveness of design review using MR compared to traditional 2D methods. The results showed that MR-based design review can effectively communicate 85% of the information to the client, compared with the 70% provided by 2D media. At the same time, it showed the potential to enhance the client's comprehension of the aesthetic characteristics of materials, making it possible to replace physical samples or mockups during the finishing stage of construction.

Through MR, clients can see what is being constructed where before the physical elements would have been uncertain, acquiring a level of confidence in the end product that they would not traditionally have. Using programs such as Holoview MR, stakeholders can view a building as if completed in its actual location, accurately interpreting plans and their spatial relationship to the surrounding physical environment. Where works have already started, BIM/CAD models overlay the real-world construction environment, and contractors can easily see how their upcoming activities are impacted by existing works. As the model overlay is extremely accurate (to 1 cm), engineers and contractors can confidently identify design mistakes and resolve clashes and constructability issues early. The new design scope can be projected into the construction environment so that all stakeholders can assess the impact that changes would have on current and future works. Furthermore, MR allows plans to be reviewed and issues to be corrected, progress walkthroughs to be undertaken remotely, and code compliance inspections to be carried out for all elements included in the design model.

### **4. XR Technologies**

XR can be visualized through portable devices such as smartphones and tablets, or through HMDs that can be worn on the head or integrated into helmets (akin to helmet-mounted displays for aviation pilots). HMDs contain a display and lens assembly in front of either one (monocular HMD) or both eyes (binocular HMD). The employed display technologies include liquid-crystal displays (LCDs), organic light-emitting diodes (OLEDs), liquid crystal on silicon (LCoS), or multiple micro-displays to increase total resolution and field of view.

Virtual reality HMDs can only display computer-generated imagery (CGI) and feature an electronic inertial measurement unit (IMU) that uses a combination of accelerometers, gyroscopes, and sometimes magnetometers to keep track of their specific acceleration, angular rate, and orientation.
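In highly simplified form, the sensor fusion performed on these IMU signals can be sketched as a complementary filter, which combines the fast but drifting gyroscope integration with the noisy but drift-free gravity estimate from the accelerometer (illustrative Python; the blending coefficient is arbitrary, not a vendor value):

```python
import math

def complementary_pitch(gyro_rates, accel_samples, dt=0.01, alpha=0.98):
    """Estimate pitch (degrees) from gyro rate (deg/s) and accelerometer (ax, az in g).

    alpha weights the gyroscope integration; (1 - alpha) continuously corrects
    its drift with the gravity direction sensed by the accelerometer.
    """
    pitch = 0.0
    for rate, (ax, az) in zip(gyro_rates, accel_samples):
        accel_pitch = math.degrees(math.atan2(ax, az))   # gravity-based estimate
        pitch = alpha * (pitch + rate * dt) + (1 - alpha) * accel_pitch
    return pitch
```

Actual headsets use full 3D orientation filters (e.g., quaternion-based), but the principle of fusing a fast, drifting signal with a slow, absolute one is the same.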

AR and MR headsets, instead, must overlay CGI onto the view of the real world; therefore, they feature optical head-mounted displays (OHMDs). OHMDs employ optical mixers, which consist of partly silvered mirrors that let the user look through the lens while reflecting artificial imagery produced by the device.

Currently, portable XR devices still have limited capacity in terms of the size and complexity of displayed files because the performance required for continuous 3D rendering in terms of CPU, RAM, and storage is ill-suited for miniaturization and low battery consumption without resorting to an external computing unit such as a high-end PC. The current limitations of XR technology, namely the compromise between desktop or portable solutions in terms of performance, fidelity, and mobility, can be effectively addressed by resorting to cloud computing, meaning that a key role will be played by the progress in wireless communication, in both local (802.11ax or Wi-Fi 6) and mobile networks (5G).

### *4.1. VR Technologies*

VR systems normally include more than one hardware device to allow full operation. Generally speaking, the hardware components of any VR system can be classified into three categories according to their function, namely displays, controllers, and motion capture (mocap) devices. The display outputs stereoscopic images to users and is an essential element of the VR system; commonly used display types include HMDs, mobile devices, and display walls. VR platforms, in turn, can generally be classified into three categories of operation, namely head-based, stationary, and hand-based.

Head-based VR devices consist of helmets or HMDs in which CGI is displayed on the internal screen or pair of screens, one for each eye, with an embedded position-tracking sensor that keeps track of where the user is looking (see Figure 2). Conversely, stationary VR platforms are usually fixed in place and employ projectors and/or large screens to display CGI to viewers. Lastly, hand-based VR devices are held by the viewers up to their eyes with their own hands and include smartphones or tablets.

Concerning head-mounted systems, three types of VR headsets on the market are suitable for the AECO sector (see Table 6): tethered VR (also known as desktop VR or PC VR), standalone VR (also known as wireless VR or all-in-one VR), and smartphone VR headsets (also known as mobile VR or VR viewers).

**Figure 2.** Meta Quest 2 VR headset. (**a**) Front view; (**b**) Rear view.


**Table 6.** Main VR tethered and standalone headsets on the market.

| Headset | Type | Display | Optics | FoV | Resolution (per Eye) | Refresh Rate | Year |
|---|---|---|---|---|---|---|---|
| Valve Index | Tethered | 2 × LCD | Binocular | 108° | 1440 × 1600 | 144 Hz | 2019 |
| Varjo Aero | Tethered | 2 × Mini LED | Binocular | 102° | 2880 × 2720 | 90 Hz | 2022 |

Tethered VR headsets (HTC VIVE Pro 2, HP Reverb G2 (HP, Palo Alto, CA, USA), and Varjo Aero (Varjo Technologies Oy, Helsinki, Finland)) are designed to be connected to a PC, with or without wires, to exploit its resources; they require compatibility with specific operating systems and hardware. Standalone VR headsets (Meta Quest 2 (Meta, Cambridge, MA, USA), HTC VIVE Focus 3/Plus (HTC, Taoyuan, Taiwan), and Pico Neo 3 Pro (VR Expert, Utrecht, The Netherlands)) are completely independent viewers designed to operate without other peripherals and without wires; they are equipped with internal memory and integrate all functions, although Bluetooth or a smartphone connection may be required for configuration. Smartphone VR headsets are designed to be connected to a smartphone, which can be physically inserted into the viewer or connected via Bluetooth. In this regard, there are viewers designed exclusively for a smartphone brand as well as universal VR viewers that adapt to multiple models and operating systems (BNEXT VR PRO, https://www.aniwaa.com/product/vr-ar/bnext-vr-pro/ (accessed on 15 March 2022)).

Notwithstanding the type of headset, the achievable level of immersion depends on several characteristics of the device, including the field of view (FoV), the quality of the display (pixel density, color accuracy, dynamic range, and brightness), the refresh rate (or frame rate) of the CGI, the number of movements allowed to the user (degrees of freedom, DoF), the accuracy of the tracking system, the presence of controllers, and the audio system. Generally speaking, a satisfactory immersive experience requires a FoV of at least 100 degrees (for reference, the human eye has about a 220-degree FoV), a refresh rate between 90 Hz and 120 Hz, six degrees of freedom (3DoF rotational freedom for 360-degree head rotation and 3DoF positional freedom allowing up/down, left/right, and forward/backward movement), as well as an accurate motion tracking system.

Wide-FoV displays enable viewers to experience the virtual environment in a more lifelike way, i.e., focusing on what is in front of them while also perceiving peripheral objects. Higher refresh rates and low latency are instead recommended to avoid motion sickness symptoms (so-called cybersickness). 6DoF controllers allow for more advanced interactions compared to less sophisticated point-and-click controllers (limited to 3DoF).

Motion tracking is a crucial function in the VR system because it ensures user movements and orientation are effectively replicated on screen, in turn enabling a satisfactory interaction with the immersive VR environment [116]. VR mocap systems continuously capture and process the real-world motions of the user in order to track their current view and provide positioning for interaction with the virtual environment. More precise motion tracking translates to a more seamless and lifelike immersion in VR. Likewise, any perceived gap or lag between the user's actions in real life and their reproduction in VR may greatly disrupt the immersive experience. Positional tracking can be either external or internal.

External tracking (also known as outside-in tracking) employs external sensors and/or cameras to keep track of the VR headset's position and orientation within a user's defined space (room-scaling). For full room scale, more than two sensors are installed to avoid any occlusion. Internal tracking (also known as inside-out tracking) uses one or more front-facing cameras embedded into the VR headset to detect its position and may function with the support of external markers. Internal tracking generally has poorer performance and is less accurate than external tracking; however, it is much more convenient to set up.

Mocap technology commonly employs optical, inertial, mechanical, and magnetic sensors. In portable VR systems, sensors are usually embedded in the headset itself and their data feed is processed by algorithms to provide motion tracking. Other solutions may rely on more complex systems including volumetric capture (MS Azure Kinect, Microsoft, Redmond, WA, USA), hand tracking and haptics (Ultraleap, Leap Motion, San Francisco, CA, USA), eye tracking, or even full body tracking thanks to special suits.

Finally, interactive controllers (joysticks or wands, data gloves, and haptic devices) are equally important for enhancing the realism of VR environments, as they determine how users interact with objects in the virtual environment and which sensory feedback they receive, including haptic and auditory sensations.

### *4.2. AR and MR Technologies*

Similarly to VR, hardware components required for augmented and mixed reality fruition include processing units, displays, several sensors, and dedicated input devices [54].

On the other hand, both AR and MR technologies are context-aware instead of immersive: their key characteristic is the capacity to combine the reality that users see with their own eyes with CG objects that are seamlessly overlaid in their specific position, differently from VR in which users are completely isolated from the real world upon entering the virtual environment. Therefore, a crucial function of context-aware technologies is geolocation, that is, ensuring the correct alignment of the virtual environment onto the real world in order for the device to properly display the virtual content in its expected position.

In particular, AR and MR devices can calculate the coordinates of their actual 3D position in the real world by processing the spatial relationship between themselves, external markers, and key points through the method of simultaneous localization and mapping (SLAM). As soon as the AR/MR device turns on, its sensor equipment (cameras, gyroscope, and accelerometer) scans the surroundings and feeds its data to an algorithm able to reconstruct a 3D model of the real-world environment and then position the device within it. Following this process, the system understands its environment well enough to display CG objects that are realistically placed, oriented, and illuminated to feel part of the real world, with the viewer able to move close and inspect them from multiple directions. Differently from AR, after the SLAM process is complete and the CG content is properly positioned in the real space, MR additionally allows the virtual objects to be occluded from view when they would be obscured by real ones (such as walls, floors, and columns that stand between the viewer and their expected position). MR also allows this occlusion to be controlled to display, for example, the pipes beneath a floor or wall surface as an X-ray view by regulating the transparency of the related shaders in the graphical engine.
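The occlusion control described above reduces, per pixel, to a depth comparison plus an optional X-ray transparency factor. A toy sketch in Python (illustrative only; real devices perform this in the GPU shader pipeline):

```python
def composite_pixel(real_color, virtual_color, real_depth, virtual_depth, xray_alpha=0.0):
    """Blend one pixel of CG content with the camera feed.

    real_depth / virtual_depth: distance from the viewer (meters).
    xray_alpha: 0.0 = full occlusion (MR default); 1.0 = surfaces fully see-through.
    """
    if virtual_depth <= real_depth:
        return virtual_color  # the virtual object is in front: draw it
    # the virtual object is behind a real surface: show it only as an X-ray overlay
    a = xray_alpha
    return tuple(round((1 - a) * r + a * v) for r, v in zip(real_color, virtual_color))
```

With `xray_alpha=0.0`, a virtual pipe behind a wall simply disappears, as in standard MR occlusion; with an intermediate value, it shows through as a ghosted overlay, which is the principle behind X-ray-style views of concealed utilities.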

In particular, there are several methods to display concealed utilities in AR/MR, which mainly differ in how they provide the perception of depth of the virtual object in relation to the viewer, and therefore in how accurate and intuitive the AR/MR scene is to the user. Muthalif et al. [57] focused on the AR/MR visualization of underground pipe utilities and investigated six main display techniques, namely X-ray view, topographic view, shadow view, image rendering, transparent view, and cross-sectional view.

Among these, the X-ray view, which superimposes a cutout of the ground—a virtual "excavation box"—in which utilities are shown in their correct depth and position, was found to be the most accurate and intuitive for the viewer, allowing them to distinguish even multiple utilities in the same view. However, this method requires an accurate and detailed 3D model (a BIM file or even a 3D point cloud of a previous excavation), as well as pre-captured data to operate at best, and the virtual excavation box may end up covering a large part of the real-world view, which may pose hazards in working conditions.

The topographic view instead superimposes a 2D map of underground utilities directly on the ground akin to traditional paint markings on street and paved surfaces. Such a technique is very intuitive, leaves most of the real-world view non-occluded, and does not require any 3D model; however, it lacks any information on depth and may prove confusing if more utilities are shown at once. The shadow view integrates depth information into the topographic view by means of shadow lines and projections on the surface, but this adds complexity to the scene.

Image rendering improves the 3D visualization of objects in space by integrating additional reference points in context, e.g., adding virtual edges to the real world and masking virtual objects beneath real ones (occlusion). This improves the understanding of the scene at the expense of the additional computing power and 3D model accuracy needed for localization. Lastly, the transparent view and the cross-sectional view see little application in the field.

Table 7 shows the different static and dynamic methods with which virtual content can be superimposed on the real-world view.


**Table 7.** Static and dynamic methods to overlay CGI to the physical world.

To support human activities effectively, AR and MR devices should preferably be head-mounted, as this eliminates hands-on interaction with the device; should be equipped with semi-transparent lenses or optical displays to allow CGI to be superimposed on the real-world view; and should feature cameras and sensors to scan the real environment continuously and enable mixed reality experiences. AR and MR headsets can also be equipped with directional speakers and active noise-reduction microphones, with more advanced products allowing vocal control via virtual assistants.

Multiple models of AR and MR devices are currently on the market [55], designed to meet different needs (see Table 8).


**Table 8.** Main AR and MR devices on the market.

Overall, AR can be implemented according to four methods: optical see-through, video see-through, eye multiplexed, and projection based. The former two are the most widely adopted in AR headsets available on the market [56]. In optical see-through systems, AR is achieved by superimposing virtual images over the direct view of the real world, commonly by projecting CG content through half mirrors or prisms. With this method, the real-time view of the world is maintained while seeing AR content. In video see-through systems, on the other hand, the camera of the AR device continuously captures the real world in front of it, processes each frame by adding CG content, and finally displays the AR image to the viewer on the device's screen. Concerning the types of display featured in AR devices, there are three main solutions: monocular (a screen in front of one eye), binocular (the same screen in front of both eyes), or dichoptic (a different screen in front of each eye, to enable depth perception).
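The video see-through method can be summarized at the frame level: every captured camera frame is composited with the CG layer before being shown on the display. A minimal sketch over nested pixel lists (illustrative Python; a real device performs this on the GPU at the display refresh rate):

```python
def composite_frame(camera_frame, cg_layer):
    """Overlay CG pixels onto a camera frame (video see-through AR).

    Both inputs are rows of (R, G, B, A) tuples; CG pixels with A == 0 are
    transparent, letting the camera feed show through at that pixel.
    """
    out = []
    for cam_row, cg_row in zip(camera_frame, cg_layer):
        row = []
        for cam_px, cg_px in zip(cam_row, cg_row):
            row.append(cg_px[:3] if cg_px[3] > 0 else cam_px[:3])
        out.append(row)
    return out
```

Optical see-through systems skip this step entirely: the real world reaches the eye directly through the optics, and only the CG layer is rendered.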

Regarding AR HMDs, Google Glass Enterprise Edition 2 (see Figure 3) is among the most diffused optical see-through devices in the industry. It is used to access documents on the go, which may include texts, images annotated with detailed instructions, training videos, or quality assurance checklists. It can also connect with other AR devices to communicate, for instance livestreaming one's view to enable real-time collaboration and remote assistance. Control methods include voice commands to launch applications, allowing full hands-free operation.

*Energies* **2022**, *15*, 3785

**Figure 3.** Google Glass Enterprise Edition 2 (**left**) and Epson Moverio AR glasses (**right**).

Moverio BT-300FPV AR glasses by Epson (Suwa City, Japan) are optimized for drone flying: the video feed from the aircraft is displayed in first-person view (FPV) on a transparent screen, which allows the viewer to pilot the UAV, monitor its flight statistics, and keep the UAV always in sight (see Figure 3). In 2022, XYZ (London, UK) released The Atom, a next-generation engineering-grade AR headset for construction. Combining a safety-certified hard hat, augmented reality displays, and the in-built computing power of the platform HoloSite, the device can position holograms of BIM models on-site to millimeter accuracy.

In addition to headsets, smartphones or tablets can provide handheld AR experiences: the user directs the device's camera at the real-world environment and the AR app installed superimposes content such as an image, animation, or data on the main screen. Indeed, most mobile devices on the market feature high resolution cameras as well as accelerometers, a GNSS receiver, solid state compass, and even LiDAR (Measure Australia, Surry Hills, Australia), making them fully equipped for AR operation. AR applications and software development kits (SDKs) are already available from the largest consumer tech companies such as Apple, Facebook, Google, and others, and some AR headsets are explicitly designed to mount smartphones (Mira Prism, Vuzix M300, https://www.aniwaa.com/product/vr-ar/vuzix-m300/, accessed on 15 March 2022).

On the other hand, the market for MR devices is still smaller than that for AR, and currently consists of Microsoft HoloLens and Magic Leap (West Sunrise Boulevard, Plantation, FL, USA), with the recent introduction of Varjo XR-3 (see Figure 4). Indeed, MR goggles are high-performing HMDs, which must include sensors such as gyroscopes, accelerometers, Wi-Fi antennas, digital compasses, GNSS, and conventional and depth-sensing cameras [1,2] in order to scan and capture the surroundings and integrate them with fully interactive CG content.

**Figure 4.** Magic Leap 1 (**left**), MS HoloLens 2 on Trimble (Sunnyvale, CA, USA) XR10 (**center**), and Varjo XR-3 (**right**).

The flagship MR device on the market is arguably MS HoloLens, currently in its second iteration, which takes advantage of eye and hand movement tracking to support the positioning of virtual content and a user interface that does not rely on external controllers. Users can also log into HoloLens seamlessly using eye recognition. Moreover, HoloLens features smart microphones and natural language speech processing algorithms to ensure vocal controls work properly even in noisy industrial environments. Park et al. [58] conducted a literature review to investigate the current status and trends in HoloLens studies published over the past five years (2016–2020), showing a growing use of MS HoloLens in multiple fields, from medical and surgical aids and systems and medical education and simulation to industrial engineering, architecture, civil engineering, and other engineering fields.

A recent example of a video pass-through MR headset is represented by Varjo XR-3, a tethered HMD that claims the industry's highest resolution (2880 × 2720) and widest field of view (115°), and provides depth awareness powered by LiDAR for pixel-perfect real-time occlusion and 3D world reconstruction. The device additionally allows a full VR experience.

Apart from standalone MR headsets, several smartphone-based MR headsets are already available or about to enter the market, including Tesseract Holoboard Enterprise Edition, Occipital Bridge (https://www.aniwaa.com/product/vr-ar/occipital-bridge/, accessed on 15 March 2022), and Zappar ZapBox (https://www.aniwaa.com/product/vr-ar/zappar-zapbox/, accessed on 15 March 2022). These can provide MR experiences using the camera and display of the smartphone or semi-transparent mirrors to overlay the CG content onto the real world.

### **5. XR Applications in Building O&M**

Operation and Maintenance (O&M) represents a strategic activity in building and infrastructure management and includes a diverse range of services, skills, methods, and tools to ensure that they maintain the expected performance over their lifetime and to reduce energy consumption and upkeep costs at the urban and building scale [117]. Common O&M activities include the maintenance, repair, and replacement of components, energy management of the built asset, management of emergencies, management of changes or relocations of services and uses, security, management of hazardous and non-hazardous waste, ICT, and others [118,119]. Operational costs, energy consumption, indoor comfort, and emissions are some of the aspects that can be significantly affected by suboptimal efficiency and effectiveness of O&M. In particular, optimal planning of maintenance work is paramount to improve the efficiency of building management and minimize O&M costs, thus requiring the extensive adoption of advanced management tools, including building automation systems (BAS) and BIM-integrated facilities management software [59]. BIM has, in fact, numerous potential applications in O&M, either as a central part of an integrated facility management system or as a data repository for the latter [120]. BIM allows facility managers to store, maintain, and access building information, including spatial, technical, warranty, maintenance, and spare parts data, and thereby to conduct their commissioning and O&M activities more efficiently [121,122]. The planning of major repairs, retrofits, and expansions greatly benefits from an up-to-date BIM model. BIM integration can be even more effective when paired with real-time data provided by building sensors, systems, and service robots that can offer detailed information on the asset's condition.
In fact, BIM is the foundation for the digital building twin, that is, the digital replica of the built asset that is continuously updated by live data provided by sensors and systems in the real structure, a breakthrough instrument with unprecedented potential for O&M improvement and optimization [14,123,124]. AI- and ML-powered analytics hosted in the cloud can extract meaningful knowledge from data streams provided by the digital twin, and the results can be fed back into the system to allow self-learning and optimization [125]. The spread of BIM and digital twins has also paved the way for other technologies such as reality capture (3D laser scanning, point clouds, etc.) and new XR tools [60,126].

In particular, the application of VR, AR, and MR technology in building management is showing promising results in enhancing human performance in building O&M tasks, in improving maintenance operations, and in supporting strategic decision making [18]. Thanks to the possibility of benefiting from an immersive and collaborative environment and of superimposing digital models, the real environment, data, and information in an integrated way, XR allows the full potential of the digital twins of buildings and infrastructures to be exploited.

### *5.1. VR Applications*

Thanks to its advanced capabilities of immersive and interactive visualization, VR can facilitate several building O&M activities. VR applications concern the following main areas:

• maintainability design, enabling maintenance operations to be simulated on virtual prototypes and design flaws to be identified early;

• immersive building visualization and monitoring, allowing facility managers to explore and interrogate the digital building twin;

• human-centered building management, supporting the study of occupant behavior, comfort, and safety;

• personnel training, providing safe and cost-effective immersive learning environments;

• teleoperated maintenance, allowing effective interventions with robots even in dangerous or difficult to reach locations.

### 5.1.1. Maintainability Design

Regarding the design of maintainability, VR allows maintenance operations of any equipment or product to be realistically simulated in a fully replicated environment, enabling effective design reviews and identifying potential maintenance-related design flaws preemptively. Design teams can involve facility managers, technicians, and maintenance engineers in a VR simulation, allowing them to enter the project, replicate conventional operations, then return valuable feedback in the early stages of design. Such interaction with the virtual prototypes in the immersive environment enables a more complete and tactile understanding of the practical maintenance operations of any product well before it is commissioned [61].

Data observed during VR interactive simulations can in turn be collected and analyzed by embedded algorithms to allow qualitative and quantitative evaluation of maintainability. VR simulations have already proven effective in maintainability design, supporting the evaluation of ergonomics, accessibility of components, environment factors such as heat or radiation, workers' fatigue, and other human factors [62,63]. For instance, Akanmu et al. [64] presented an automated system that integrated BIM models, Microsoft Azure, and VR to engage facility managers in the design phase regarding the accessibility of building components for maintenance. The system provided a platform for mining and extracting knowledge from feedback provided by facility managers to improve building design tools. The functionality and usability of the system were presented with an example of the lighting and air-conditioning systems.
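As a concrete illustration of how such interaction logs could be scored, the following sketch aggregates hypothetical VR session records (task duration, reach distance, collision count) into a single maintainability indicator. The record fields, weights, and thresholds are invented for illustration and are not taken from the cited systems.

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:
    """One maintenance task performed by a reviewer in the VR simulation."""
    duration_s: float   # time to complete the task
    reach_cm: float     # maximum reach distance required
    collisions: int     # times the virtual tool or body clipped other components

def maintainability_score(records, max_duration_s=300.0, max_reach_cm=60.0):
    """Aggregate VR session logs into a 0-100 maintainability score.

    Each task is penalized for excessive duration, over-reach, and
    collisions; the overall score is the mean of the per-task scores.
    """
    if not records:
        raise ValueError("no task records")
    scores = []
    for r in records:
        s = 100.0
        s -= 40.0 * min(r.duration_s / max_duration_s, 1.0)  # time penalty
        s -= 30.0 * min(r.reach_cm / max_reach_cm, 1.0)      # ergonomics penalty
        s -= min(10.0 * r.collisions, 30.0)                  # accessibility penalty
        scores.append(max(s, 0.0))
    return sum(scores) / len(scores)
```

A real system would derive such records from headset and controller telemetry; the point here is only that qualitative VR feedback can be reduced to a comparable quantitative metric.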

### 5.1.2. Immersive Building Visualization and Monitoring

VR technology transforms 2D or 3D models into immersive, fully manipulable virtual environments, enabling facility managers to interact with their assets at a much more tactile level than is allowed by RT3D visualization. Several plug-ins for BIM authoring software (Revit, BIM 360, Archicad, etc.) such as Arkio, The Wild, Prospect, Sentio VR, and Unity Reflect, or dedicated software (Twinmotion, Enscape, Fuzor, BIMx, VRcollab, Lumion, VRex, Resolve, Techviz), are available on the market today, allowing 3D models to be brought into VR with a single click while preserving BIM metadata and layer visibility and enabling remote collaboration.

Unlike RT3D, VR visualization provides a realistic perception of spatial depth, enabling any project to be explored at an actual 1:1 scale. Although conventional 2D renderings and 3D digital models can allow an abstract understanding of the environment, the full experience of interaction with a physical space cannot be replicated with only a computer screen or a sheet of paper. In urban planning, for instance, VR visualization allows the user to realistically experience how a new structure will fit in and relate with the surrounding buildings and environment.

Exploiting digital building twin (DBT) technology, VR can augment both the visualization of the digital model and the operation, management, and maintenance of the actual building [127]. In fact, a digital twin consists of an accurate and continuously updated model of a given physical entity, which may be a single building component, e.g., an HVAC element, an entire building, or even a whole city or district. It allows users to keep track of the current status of its related physical twin, to simulate alternative scenarios, identify internal and external complexities, discover unusual patterns, monitor performance, and accurately predict ongoing trends to optimize operation [1,65]. The digital twin enables predictions and simulations of what-if scenarios, allowing adjustments and variations that would be inconvenient to try out in the real world to be tested, providing a larger space for experimentation and trial and error. Finally, AI and machine learning analytics can be integrated into digital twins to make them more intelligent and improve autonomous decision making.

Shahinmoghadam et al. [66] presented a method to monitor in real time the thermal comfort conditions of the digital twins of a building's closures, based on the integration of BIM and live IoT data into a game engine environment. The proposed system would allow facility operators and experts to intuitively monitor the complex and dynamic data associated with actual thermal comfort conditions. Users could navigate the virtual environment remotely and observe actual thermal sensations in real time in every place. Parameters could be changed via the user interface to implement different what-if scenarios and observe the outcomes based on the live monitoring data streaming from the IoT nodes installed within the building. A proposed application is in interior layout planning, in which designers can keep track of any spot that is flagged with discomfort signs.
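A minimal sketch of the kind of live-data comfort check such a system performs is shown below. The threshold band and the `comfort_flags` helper are illustrative assumptions, not the comfort model or data schema used in the cited work, which relies on richer IoT streams.

```python
def comfort_flags(readings, t_range=(20.0, 26.0), rh_range=(30.0, 60.0)):
    """Flag zones whose latest IoT readings fall outside a comfort band.

    `readings` maps a zone id to a dict with 'temp_c' (air temperature)
    and 'rh_pct' (relative humidity). Returns the set of zone ids that
    a VR front end would highlight with discomfort signs.
    """
    flagged = set()
    for zone, r in readings.items():
        if not (t_range[0] <= r["temp_c"] <= t_range[1]):
            flagged.add(zone)           # temperature out of band
        elif not (rh_range[0] <= r["rh_pct"] <= rh_range[1]):
            flagged.add(zone)           # humidity out of band
    return flagged
```

In a production system the simple band check would be replaced by a standard comfort index computed from several measured parameters, but the flag-and-highlight flow stays the same.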

At the city and district level, smart city digital twins can collect information from different sources on the territory in real time and transfer it to administrators, policymakers, and citizens, putting actual data at the base of more informed and participatory decisions and enabling truly knowledge-driven city management [67,128]. Acoustic engineering can take advantage of VR as well to realistically simulate sound sensory input, allowing the creation of urban and building spaces with a lower noise exposure and better acoustic comfort. Sound designers can import the BIM model into a VR engine to simulate the physics of acoustic waves passing through a structure and reflecting on different materials. With VR, users may distinguish how the sound bounces off a tree or a stone surface, or hear the difference between an open window and a closed one.

Regarding augmented visualization, VR enables exploration of the DBT even remotely, including first-person movement, the measurement of distances and angles, the interrogation of single components, querying any element by category, the clustering of objects, and the retrieval of documents (e.g., drawings, data sheets), as well as the direct modification of assets that are synced with the building BIM model. From inside the VR environment, the viewer may replace objects, track steps, take pictures, or sketch three-dimensional notes. Augmented visualization with VR can also be used for computational fluid dynamics (CFD) simulation, providing a more intuitive representation method to comprehend the airflow in a given space [68].

Concerning building management and maintenance, VR allows free navigation inside any environment and full interaction with its components. VR can therefore be used to interrogate single components of the DBT by exploring the 3D scene, facilitating monitoring of various internal and external parameters of the building (e.g., temperature, illumination, air quality), and tracking the condition of systems (e.g., on/off, consumption/production, or open/closed status, power levels, operating time). Additional uses include the remote control of systems, devices, and actuated components (e.g., windows, blinds), and the simulation of energy consumption and internal comfort in different scenarios to apply to the real building according to climate conditions, thermostat settings, operating times of the equipment, etc.
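The component interrogation described above can be pictured as a simple query over the twin's asset registry. The dictionary layout and the `query` helper below are hypothetical, chosen only to illustrate filtering by category and live state; a real DBT would expose this through a dedicated API.

```python
# Minimal illustrative twin: component id -> metadata and live state
twin = {
    "ahu-01":   {"category": "HVAC",     "state": {"status": "on", "power_kw": 4.2}},
    "win-2-03": {"category": "windows",  "state": {"status": "open"}},
    "lum-1-17": {"category": "lighting", "state": {"status": "off"}},
}

def query(twin, category=None, **state_filters):
    """Return ids of components matching a category and live-state values.

    Keyword arguments are matched against each component's 'state' dict,
    mimicking the by-category and by-status interrogation a VR user
    performs when selecting elements in the 3D scene.
    """
    hits = []
    for cid, comp in twin.items():
        if category and comp["category"] != category:
            continue
        if all(comp["state"].get(k) == v for k, v in state_filters.items()):
            hits.append(cid)
    return hits
```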

With VR, virtual meeting rooms can be set up to allow different members of the building management team to gather in the same place and time as 3D anthropomorphic avatars, even if they are physically remote. In the virtual space, issues can be discussed and solutions proposed as in any typical problem-solving meeting, with the added possibility for participants immersed in the synchronized environment to jointly manipulate and evaluate the emergence of creative solutions in real time. Compared to other forms of remote collaboration, VR co-presence was found to strengthen the sense of collaboration among team members, especially in terms of social aspects such as dependencies, encouragement, and mutual learning.

### 5.1.3. Human-Centered Building Management

A VR space can be used as a realistic, safe, and fully manipulable testing environment to observe how users would interact with the building in different scenarios and how any variation may affect the human state of mind and corresponding reactions, promoting a more user-centered building management [69]. Interactions with the virtual environment provide valuable insights to inform decision making [70]. In particular, occupant behavior was found to have a significant impact on the building's performance and energy consumption, and has therefore become a major research area in the human–building interaction field. Human–building interactions that have energy efficiency consequences, e.g., indoor lighting preferences, space layout, daylighting (operational blinds), or the use of appliances, must be fully understood. In this picture, researchers can take advantage of the VR environment to collect behavioral data to analyze the human–building interaction and to assess how human experience and behaviors affect the building's energy performance [70].

Furthermore, VR is an effective tool for behavior investigation and training in emergency situations, especially those concerning fire or earthquake occurrences [71,72]. In particular, VR allows users to construct a stressful environment in emergency evacuation simulations without any short-term or long-term harm to the participants, providing the virtual stimuli to evoke mental and behavioral responses similar to those experienced in real-life situations [73]. Lorusso et al. [74] proposed a VR platform integrated with numerical simulation tools to reproduce an evolutionary fire emergency scenario for an existing school building, including real-time simulation of the crowd dynamic during the evacuation process. The results showed that the proposed VR-based system can be used to help decision makers determine emergency plans and to help firefighters as a training tool to simulate emergency evacuation actions.

Quantitative metrics, e.g., movement speed and direction, can also be incorporated in VR-based human–building interaction studies, as well as biometric sensors such as EEG (electroencephalography), GSR (galvanic skin response), and PPG (photoplethysmography), to help researchers gain insight into how the human body functions and reacts in different built environment settings, such as those affecting spatial cognition, with more conclusive and complementary measurement results [75,76].

Risk situations and the safety of workers can be effectively assessed with VR, allowing the feasibility and convenience of different design alternatives to be gauged. Specific hazards that imperil workers can be easily identified during the VR experience and addressed accordingly, avoiding unnecessary reworks down the line.

Finally, recent global health emergencies have clearly shown that not only the quality and characteristics of spaces, but also the way in which people experience these spaces and interact with each other, can have a direct influence on people's health. The movement and interaction of people within both closed and open spaces has now become a critical safety issue linked to the transmission of diseases, equal in importance to established concepts such as systems safety and fire protection and evacuation. Therefore, understanding how to arrange spaces and how to organize the movement of people has become a central theme of planning at the building and district scale, aimed at minimizing health risks while maximizing well-being and productivity. Spaces where people congregate (offices, construction sites, schools, retail spaces, warehouses) must be designed to reduce the risk of disease transmission. Such planning requires an understanding of mathematics, biomechanics, data science, design, demography, psychology, local regulations, sociology, and geography, among other disciplines, and today it can find support in VR technology as well as in simulation software and artificial intelligence. In this sense, a study by Pavon et al. [77] exploited BIM to reduce crowding and facilitate social distancing as a COVID-19 measure in a public building. More recently, Mukhopadhyay et al. [78] presented a VR-based DT implementation of a physical office space with the goal of using it as an automatic social distancing measurement system. The VR environment was enhanced with an interactive dashboard showing information collected from physical sensors and the latest statistics on COVID-19.
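An automatic social distancing check of the kind described above essentially reduces to flagging occupant pairs below a minimum separation. The sketch below illustrates the idea under invented names and a 2 m threshold; it is not the cited authors' actual implementation.

```python
from itertools import combinations
from math import dist

def distancing_violations(positions, min_m=2.0):
    """Return pairs of tracked occupants closer than `min_m` metres.

    `positions` maps an occupant id to (x, y) floor coordinates in
    metres, e.g. as streamed from the sensors feeding an office
    digital twin. Pairs are reported in sorted-id order.
    """
    return [
        (a, b)
        for (a, pa), (b, pb) in combinations(sorted(positions.items()), 2)
        if dist(pa, pb) < min_m
    ]
```

Checking all pairs is quadratic in the number of occupants; a real-time system covering a large building would typically use a spatial index instead.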

### 5.1.4. Personnel Training

VR tools are finding widespread application in personnel training, particularly among larger contractors, as they offer numerous benefits compared to conventional training methods: faster and more effective learning, schedule flexibility and better compatibility with work activities, lower travel expenses, lower overall costs, and greater safety for both learners and equipment.

Such advantages are especially meaningful in the O&M sector, in which continuous training and skills certification is required for labor, and several environments and situations may pose a high risk for workers' safety.

In VR, multiple scenarios can be simulated, including extremely dangerous events, allowing learners to "learn by doing" (with the optional support of text or audio information) without any risk to their safety or damage to the equipment. In addition to the end results and decision-making skills, VR-assisted learning allows other aspects to be evaluated such as response times, the precision of movements, and other specific variables, providing a more comprehensive insight into the level of training and competence acquired. Furthermore, VR can considerably reduce training costs for companies and learners, as it does not require real equipment to be rented (especially equipment of large dimensions, such as cranes) nor dangerous scenarios to be replicated, while allowing training activities to be carried out on an individual level without affecting working hours or interrupting company activities (as happens instead with group courses) [79]. The VR immersive environment has been demonstrated to allow procedures and operations to be learned in a more involved and distraction-free way, with cognitive benefits including both working memory development (information encoding) and subsequent retrieval (information recalling). VR-trained subjects were observed to outperform those trained with conventional 2D and text instructions in both the rapidity and accuracy of maintenance operations [80]. In particular, haptic feedback from VR systems has proven to increase users' situational awareness and overall performance for maintenance applications. Indeed, although VR studies mostly focus on the pursuit of realistic visual effects, as vision accounts for 80% of external information processed by the brain, the possibility to accurately simulate the haptic feedback of the real maintenance process in VR is also very important, as the human–machine interaction in maintenance is mostly carried out by hand [63].
Compared to visual feedback, haptic feedback was found to be more effective in improving the performance of maintenance workers and particularly helpful in increasing situational awareness during telemanipulation tasks, in which the completion time was reduced and the quality of operation improved [81].

### 5.1.5. Vehicle and Robot Teleoperation

VR can be successfully employed to enhance teleoperated maintenance operations, including those in hazardous or hard to reach environments such as nuclear plants, underwater facilities, offshore platforms, or emergency situations. Teleoperated vehicles include UAVs and robots equipped with articulated arms and manipulators characterized by impressive dexterity and precision of movement [82].

In VR-assisted teleoperation, drivers and operators wear headsets that display a full view of the working area and reproduce any sounds on-site as well as audible warnings. VR controllers perfectly mimic those on the real machine, providing an experience equal to sitting in the machine's cab, even if it is kilometers away. Verbal-, thought-, and eye-based control inputs are also employed. VR teleoperation also allows users to rotate operators in working shifts (even in different time zones), enabling 24/7 uptime. VR can completely immerse the robot operator in the task at hand, displaying stereoscopic video feed from the machine's cameras and faithfully reproducing commands imparted by VR controllers.
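The mapping from VR controller motion to machine commands can be pictured as a small retargeting function. The deadzone, gain, and velocity limit below are illustrative assumptions, not parameters of any real teleoperation product.

```python
def controller_to_velocity(dx, dy, dz, deadzone=0.02, gain=0.5, v_max=0.25):
    """Map a VR controller displacement (metres from its neutral pose)
    to a clamped end-effector velocity command (m/s).

    A deadzone suppresses hand tremor; the gain scales operator motion
    down to machine motion; clamping bounds the commanded speed.
    """
    def axis(d):
        if abs(d) < deadzone:
            return 0.0                      # ignore tremor-sized motion
        v = gain * d
        return max(-v_max, min(v_max, v))   # saturate to the speed limit
    return (axis(dx), axis(dy), axis(dz))
```

In an actual system this command would be streamed to the robot controller at a fixed rate alongside the stereoscopic video feed returning to the headset.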

In its most advanced application, VR teleoperation involves biomorphic robots controlled using full-body suits: called VR exoskeletons, these robotic systems allow a machine placed in a separate location to perfectly sync with the movements of an operator as detected by a sensor-equipped exoskeleton [1]. Hand controllers, gloves, and similar devices allow objects of any size and weight to be manipulated with remarkable dexterity and accuracy. An example of a virtual exoskeleton is the Guardian GT system by Sarcos, consisting of a robotic system with one or two highly mobile arms mounted on a track or wheeled base, controlled by a body suit that allows various tasks (heavy lifting, intricate processes such as welding and joining, or even operating switches and levers) to be performed remotely.

### *5.2. AR Applications*

Augmented reality (AR) is the application of XR technology in real-world scenarios, and has found considerable utilization in building management activities, allowing technicians to superimpose data onto the real-world view while on site by using wearable HMDs or a camera-equipped mobile device [52,53].

With the ability to provide real-time information, AR is being used within the industry to increase efficiency, improve safety, streamline collaboration, manage costs, and boost overall project confidence, proving effective in the following building O&M applications: building monitoring and management, maintenance and repair operations, renovations and retrofit works, personnel training, and workplace safety.

AR combines both physical and digital content into one coherent environment by overlaying CGI over the user's view through a mobile device or HMD. AR systems can determine their position using GNSS positioning and cameras, then present the user with real-time context-related data, continuously updating and displaying the necessary information as they move throughout the space. Information and documents such as operational status, plans and drawings, and schedules are easily accessible, allowing fully informed decision making in the field. AR solutions can be implemented with either head-mounted displays (HMDs) such as smart goggles or handheld displays (HHS) such as tablets, smartphones, or dedicated devices. AR apps for smartphones or tablets, such as XMReality, Reflekt Remote, or XOi, can be easily installed, do not require any dedicated hardware, and directly overlay graphics and text on the video feed of the devices' cameras. However, these handheld devices are not particularly suitable for supporting manual work, as the operator has to interrupt their activities to consult the device, and remote support is equally suboptimal.

Conversely, AR systems that employ HMDs such as Epson Moverio, Google Glass, or Vuzix M-Series, which can also be integrated in PPEs, e.g., helmets (XYZ Atom), allow the operator to retrieve information and to collaborate remotely while keeping their hands on the tools and their attention on the task. Both HHS and HMD AR devices are autonomous and self-packaged, granting sufficient mobility for training, service, and maintenance operations on-site.

Several available programs allow AR functions to be created and provided (UpSkill Skylight, ScopeAR Worklink), with visualization methods ranging from displaying text bullet points to interactive 3D animations. The simplest AR interfaces superimpose virtual arrows and text onto the real-world view. Plain text contents are quick to create or update and their overlay does not occlude the view of the user; however, such solutions are less intuitive and thus more indicated for supporting already skilled maintenance workers. Support data can be presented as static 2D images and 3D models as well, overlaying relevant information on the screen to enhance inspections or other interventions. CG content can be animated to display specific processes at each stage more intuitively. Depending on the worker's skills, AR support can guide their actions step by step or just present top-level information. Another function of AR headsets is remote conferencing with coworkers and technicians, giving colleagues off-site a precise view of the field, which is much more effective than traditional vocal communication and picture sharing.

Additionally, AR allows users to erect 3D models over 2D floor or site plans, enabling more intuitive and accurate previews of the work on the field. Mobile AR programs such as GAMMA AR and Trimble's SiteVision (see Figure 5) capture the scene ahead on a smartphone or tablet and overlay it with 3D models, allowing on-site preview of utilities, buildings, and infrastructures presented with cm accuracy.

In particular, Trimble SiteVision is an X-ray view AR tool consisting of a lightweight, handheld field controller equipped with an integrated GNSS positioning system. The system combines satellite data, to correctly position itself, with measurements from the on-board rangefinder, to determine the position of target points relative to the 3D model. The user is presented with a CG model accurately superimposed on the real-world context scene captured by the camera, and can intuitively comprehend how the 3D model interacts with the environment. As the viewer moves and shifts their point of view, the device continuously adjusts the 3D model on the display to match what the camera is seeing. By previewing the 3D model on the field in its expected position, potential issues can be detected beforehand to make timely corrections and avoid clashes and unnecessary reworks down the line. In addition, the device allows users to capture photos, take measurements, and attach notes, then directly create and assign tasks to coworkers to ensure productive follow-up. Updates from the job site are transmitted to and archived in the office, allowing workers, activities, and information to be better synced.
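Anchoring a georeferenced model with centimeter accuracy requires converting GNSS fixes and target coordinates into a common local frame. A standard approach (not necessarily SiteVision's internal one) converts WGS-84 geodetic coordinates to Earth-centered Earth-fixed (ECEF) coordinates and then to a local East-North-Up frame centered on the device:

```python
from math import sin, cos, sqrt, radians

# WGS-84 ellipsoid constants
A = 6378137.0                # semi-major axis (m)
F = 1 / 298.257223563        # flattening
E2 = F * (2 - F)             # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert geodetic coordinates (degrees, metres) to ECEF (metres)."""
    lat, lon = radians(lat_deg), radians(lon_deg)
    n = A / sqrt(1 - E2 * sin(lat) ** 2)      # prime vertical radius
    x = (n + h) * cos(lat) * cos(lon)
    y = (n + h) * cos(lat) * sin(lon)
    z = (n * (1 - E2) + h) * sin(lat)
    return x, y, z

def ecef_to_enu(target, ref_lat_deg, ref_lon_deg, ref_h):
    """Express an ECEF point in the local East-North-Up frame of a
    reference geodetic position (e.g. the AR device's GNSS fix)."""
    lat, lon = radians(ref_lat_deg), radians(ref_lon_deg)
    xr, yr, zr = geodetic_to_ecef(ref_lat_deg, ref_lon_deg, ref_h)
    dx, dy, dz = target[0] - xr, target[1] - yr, target[2] - zr
    e = -sin(lon) * dx + cos(lon) * dy
    n = -sin(lat) * cos(lon) * dx - sin(lat) * sin(lon) * dy + cos(lat) * dz
    u = cos(lat) * cos(lon) * dx + cos(lat) * sin(lon) * dy + sin(lat) * dz
    return e, n, u
```

The ENU coordinates can then be handed to the rendering engine as the overlay's position relative to the camera; achieving centimeter accuracy in practice additionally requires RTK-corrected GNSS fixes.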




*Energies* **2022**, *15*, 3785 23 of 37



**Figure 5.** Trimble SiteVision AR system (**left**) and smartphone-based AR (**right**).

### 5.2.1. Augmented Building Monitoring and Management

AR allows facility managers to enhance information retrieval and visualization by presenting relevant content right in its real-world context, such as displaying real-time sensor readings, process telemetry, or aggregated performance, thus improving the monitoring and control of whole building systems. This can be particularly effective for planning maintenance interventions, using virtual performance panels and dashboards to access real-time data for monitoring, control, and analytics purposes.

AR-enhanced visualization of non-visual data can be a useful cognitive aid for identifying the information needed for decision making in FM, and can greatly improve the usability and accessibility of BIM data [83]. Chung et al. [84] presented a study in which AR-based smart FM systems demonstrated faster and easier access to information compared with existing 2D blueprint-based FM systems, with information obtained through AR providing an immediate, more visual, and easier means of expressing the information when integrated with actual objects. A study by Abramovici et al. [85] proposed an AR-based support system for collaboration among working-level stakeholders involved in the FM process, whereby AR was used not only as a tool for the visualization of maintenance data, but also for communication and alerts, or for displaying other coordination-related aspects for the whole team. Another study by Chu et al. [86] developed, tested, and evaluated a mobile system (Artifact) with cloud-based storage capabilities aimed at integrating BIM and AR to improve information-retrieving processes as well as operational efficiency during construction.

Alonso-Rosa et al. [87] presented an energy monitoring system based on mobile augmented reality (MAR) to visualize in real time the power quality parameters and the energy consumption of home appliances. Tests showed that by simply focusing the smartphone on the home appliance, the system could detect the image targets with a negligible response time and overlay the energy information on the captured frame.

AR systems can also overlay the results of building thermal or fluid dynamics simulations on the virtual model, or project those on the real environment [112,113]. Indeed, the integration of AR with numerical simulations can improve the solution of practical problems, reducing the misinterpretation in spatial and logical aspects by superimposing engineering analysis and simulation results directly on real-world objects [90].

### 5.2.2. Augmented Maintenance and Repair Operations

During actual interventions, technicians can use AR to augment their view of the physical world with overlaid digital content, which may consist of real-time data related to the task or asset of interest or even guidance on each step to replace a component. Unlike traditional instructions based on static text and images, AR presents information in a context-aware and more comprehensible manner, allowing more effective operations and greater flexibility in workers' deployment. AR is therefore especially well suited to supporting maintenance operations and training in buildings and infrastructure since it enables quick access to contextual information and seamless integration of data right into the operation, in addition to allowing remote support and assistance by colleagues and experts to improve problem solving and decision making in the field [84]. Moreover, machine vision and object recognition algorithms can be integrated in the AR experience to provide real-time feedback on the task: cameras can determine the actual position and orientation of the components, check them against the targets, and show alerts to correct the procedure on the go, preventing mistakes or reworks down the line. Traditional text and picture manuals can be transformed into digital multimedia content and seamlessly integrated into the workflow using AR headsets, allowing operators to completely focus on the task without diverting their attention. As such, AR helps to reduce unnecessary eye and head movements, improve spatial perception, and increase overall productivity [18]. Moreover, today's systems and equipment are growing in complexity and commonly embed several sensors to monitor operations and perform initial diagnoses. AR can retrieve and display valuable data and diagnostic results right beside the object to be maintained, instead of requiring this information to be accessed on a separate computer.
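The on-the-go feedback loop described above can be reduced to a tolerance check between a component's detected placement and its target placement. A minimal sketch in Python; the function name, tolerances, and poses are illustrative assumptions, not from any cited system:

```python
import numpy as np

def check_placement(detected_pos, target_pos, detected_yaw_deg, target_yaw_deg,
                    pos_tol_m=0.01, yaw_tol_deg=2.0):
    """Compare a component's detected position/orientation against the target
    and return the feedback an AR overlay would show. Tolerances illustrative."""
    pos_err = float(np.linalg.norm(np.asarray(detected_pos) - np.asarray(target_pos)))
    # wrap angular error into [-180, 180] before taking its magnitude
    yaw_err = abs((detected_yaw_deg - target_yaw_deg + 180.0) % 360.0 - 180.0)
    if pos_err > pos_tol_m:
        return f"ALERT: component off-position by {pos_err * 100:.1f} cm"
    if yaw_err > yaw_tol_deg:
        return f"ALERT: component rotated {yaw_err:.1f} deg from target"
    return "OK: placement within tolerance"

# A part mounted 5 cm away from its nominal position triggers an alert.
msg = check_placement([0.50, 1.20, 0.00], [0.50, 1.25, 0.00], 90.0, 90.0)
```

In a real pipeline the detected pose would come from the object recognition stage; here it is hard-coded to show the comparison logic only.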

Manuri et al. [91] proposed an AR-based system to help the user in detecting and avoiding errors during the maintenance process, which consisted of a computer vision algorithm able to evaluate, at each step of a maintenance procedure, if the user had correctly completed the assigned task or not. Validation occurred by comparing the image of the final status of the machinery, after the user had performed the task, and a virtual 3D representation of the expected result. In order to avoid false positives, the system could also identify both motions in the scene and changes in the camera's zoom and/or position, thus enhancing the robustness of the validation phase.
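The validation idea reported by Manuri et al. [91] can be reduced to its essentials: compare the camera frame against a rendering of the expected final state, but suspend the check while inter-frame motion is high so that a moving camera does not trigger false positives. The sketch below is an illustrative simplification with made-up thresholds; the published system uses a more sophisticated computer vision pipeline:

```python
import numpy as np

def validate_step(frame, expected_render, prev_frame,
                  motion_thresh=12.0, match_thresh=8.0):
    """Per-step check: 'unstable' while the view is moving, otherwise
    'completed'/'not completed' from the residual against the expected
    rendering. Thresholds are illustrative."""
    motion = float(np.mean(np.abs(frame - prev_frame)))
    if motion > motion_thresh:
        return "unstable"                # wait for a steady view
    residual = float(np.mean(np.abs(frame - expected_render)))
    return "completed" if residual < match_thresh else "not completed"

# Synthetic test: a steady view that closely matches the expected rendering.
rng = np.random.default_rng(0)
expected = rng.uniform(0, 255, (72, 96)).astype(np.float32)
steady = expected + rng.normal(0, 2, expected.shape).astype(np.float32)
status = validate_step(steady, expected, steady)
```

The motion guard plays the role of the paper's detection of scene motion and camera zoom/position changes during the validation phase.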

One of the most common applications of AR is certainly that of remote maintenance, also known as "collaborative maintenance" or "remote assistance", in which an expert and a technician cooperate from physically different locations. Especially in consideration of the growing technological complexity of building systems, conventional remote assistance carried out over the telephone is no longer an effective solution. Programs for AR-based collaboration such as VSight or Virtualist, on the other hand, allow the transfer of information and knowledge between the expert and the maintainer in real time in a remote "see-what-I-see" collaboration. Colleagues and experts can see the direct view captured by the operator's AR device on-site and send back voice instructions or documents, or superimpose their annotations directly on the technician's display. This enables even unskilled operators to carry out complex tasks on their own, reducing both the waiting times for more experienced teams and overall system downtime [32,92].

Finally, AR solutions can support quality assurance by allowing users to compare the resulting configuration directly with the design model, which is overlaid in the correct location, highlighting inspection-relevant features and deviations right on the single element. Using AR programs such as GAMMA AR, available for Android or iOS, contractors can validate what has been built against the BIM model and attach photos or virtual notes to specific built or designed elements. Components of the model can be hidden, isolated, and annotated with text, pictures, or voice notes. With AR tools such as VisualLive, it is possible to vary the transparency of 3D models to allow X-ray views. Users place QR codes on flat or vertical surfaces, then scan them to import and properly align the virtual models.

### 5.2.3. Augmented Renovations and Retrofit Works

The ability to merge virtual objects with the physical environment makes augmented reality particularly suited for previewing renovations and retrofit interventions. With AR, the user can use the screen of a smartphone or tablet to project a "digital window" that overlays the BIM model of an object, a building, or a road onto the real-world scene, helping the visualization of the positioning of entire structures, as well as specific design features, elements, or equipment. Using AR, a client can enter the construction site before the renovation works begin, walk around, and view the end result in a simulated built view, as the final 3D model is superimposed on the real images.

In building renovations, mobile AR apps such as AirMeasure and MeasureKit enable the measurement of the height, length, and dimensions of objects directly on the screen of the device. Some programs (Magic Plan, Room Scan, and Floor Plan Creator) allow the creation of 2D and 3D floor plans by pointing the device's cameras at the corners of the room and snapping pictures while turning around. AR solutions can also support the scheduling and planning of building renovations, allowing users to preview and virtually test alternative process sequences, such as crane trajectories or vehicle movements, in a spatial context, or evaluate manual activities such as assembly/disassembly in a realistic manner.
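The geometric core of corner-based floor plan capture is straightforward: once the app has fixed the room corners in a common coordinate frame, the outline and floor area follow from elementary geometry. The internals of the cited apps are not published; the sketch below shows only the shoelace-formula step on illustrative corner fixes:

```python
def room_area_m2(corners):
    """Floor area from the corner points captured while walking the room,
    via the shoelace formula; corners are (x, y) in meters, in order."""
    n = len(corners)
    s = 0.0
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]   # wrap around to close the polygon
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# A 4 m x 3 m rectangular room (illustrative corner fixes).
area = room_area_m2([(0, 0), (4, 0), (4, 3), (0, 3)])
```

The same corner list, extruded by a measured ceiling height, gives the 3D room shell such apps export.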

### 5.2.4. Workplace Safety

AR solutions can be effectively integrated in workplace safety procedures by using HMDs or mobile devices to scan tags or labels placed in specific areas or objects. These labels can then bring up text or 3D models to communicate specific safety or hazard information.

Hurtado et al. [93] proposed a method to support the collective protective equipment inspection process by equipping the safety advisor with an AR-based 3D viewer with an intuitive interface. The authors also provided background on the use of AR in safety inspection processes on construction sites and offered methodological recommendations for the development and evaluation of these applications.

Chen et al. [94] explored the use of AR technology to facilitate fire safety equipment inspection and maintenance using mobile devices, overcoming the constraints imposed by paper files on these tasks. The demonstration and validation results showed that the proposed AR-based system provided highly comprehensive, mobile, and effective access to fire safety equipment information, facilitating the presentation of information in an immediate, visual, and convenient manner.

Codina et al. [95] presented an implementation of AR for emergency situations in smart buildings by means of indoor localization using sub-GHz beacons. The system should help emergency services that need to move quickly to rescue trapped people in situations where there is no visibility, or in large buildings such as hospitals, by generating a three-dimensional model of the building and facilitating navigation through it.

### *5.3. MR Applications*

Mixed reality (MR) extends AR technology by enabling more direct interaction between reality and virtuality, allowing the user to manipulate virtual elements as they would in the real world with 3D digital content responding and reacting accordingly [129].

Microsoft HoloLens headsets are the most used MR devices in the construction sector as they are already certified as basic protection goggles (see Figure 6) and are available integrated into hard hats such as Trimble XR10 for personnel working in dirty, noisy, and safety-controlled environments. Trimble XR10 are also available with the Trimble HoloTint accessory visor, which is equipped with photochromic lenses for use outdoors or in brightly lit environments.

MR devices allow workers to visualize building plans, to take measurements in the field, to create 3D models from surrounding site scans, and enable technicians and designers to make modifications while on-site, proving effective in the following building O&M applications: collaborative building management, maintenance and repair operations, renovations and retrofit works, personnel training, and workplace safety.

*Energies* **2022**, *15*, 3785 27 of 37

**Figure 6.** MS HoloLens MR headset.

### 5.3.1. Collaborative Augmented Building Monitoring and Management

As for AR, MR allows facility managers to enhance data visualization by displaying information right on the field, including, for example, the visualization of real-time data from sensors, process parameters, or aggregated performance, therefore improving the monitoring, control, and maintenance of the whole building's components. The improved spatial awareness of MR allows the device to estimate the location and orientation of the user within the 3D BIM model or DT of the building, enabling first-person navigation aids (arrows, directions) inside the real-scale model with the possibility of displaying or interrogating all relevant data, including MEP systems, building components, and data from sensors and IoT devices.

Localization is performed via GNSS positioning and/or by comparing the user's perspective to BIM based on deep learning computation, commonly stream-processed in graphics processing unit (GPU)-enabled servers via transmission control protocol/Internet protocol (TCP/IP). Once the viewer is correctly positioned and oriented in the virtual space, spatial mapping visually fits the object of interest (superimposed text, elements-to-build, concealed pipes) onto the MR image [96]. MR navigation is also particularly useful to locate the target for maintenance and repair operations (electrical cabinets, air handling units, pipe manifolds) and guide the technician to the location even if they are unfamiliar with the premises.

A recent example of the application of MR in residential buildings is that of the Restart4Smart project by Sapienza University of Rome, which participated in the Solar Decathlon Middle East 2018, in which a digital twin of the building was created, and MS HoloLens was used on public tours to show visitors the structure and systems of the house, highlighting specific components such as pipes, tensioning cables, and technical devices [130].
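Once the user is localized, the first-person navigation aid boils down to computing, in the shared BIM/DT coordinate frame, the relative bearing and distance from the user's estimated pose to the target asset. A minimal sketch in Python; the coordinate convention (0 degrees along the building's +y axis) and all values are illustrative, not from any cited system:

```python
import math

def nav_arrow(user_xy, user_heading_deg, target_xy):
    """Relative bearing (degrees, clockwise-positive) from the user's
    current heading to a target asset, plus the distance to it, as a
    first-person navigation arrow would display them."""
    dx = target_xy[0] - user_xy[0]
    dy = target_xy[1] - user_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))          # 0 deg = +y axis
    # wrap the turn angle into [-180, 180] so the arrow takes the short way
    rel = (bearing - user_heading_deg + 180.0) % 360.0 - 180.0
    dist = math.hypot(dx, dy)
    return rel, dist

# Technician at (2, 2) facing +y; air handling unit at (2, 10):
# straight ahead, 8 m away.
rel, dist = nav_arrow((2.0, 2.0), 0.0, (2.0, 10.0))
```

In a headset, `rel` would drive the arrow's rotation and `dist` the distance label, both refreshed as the localization estimate updates.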

Furthermore, MR wearable devices can enable a collaborative workspace between different professionals: as managing more complex buildings and infrastructure requires a great deal of collaboration between multiple teams, including designers, managers, contractors, users, tenants, supervisors, etc., successful collaboration is needed at all times even if all the figures involved in a project are not always present at the workplace. MR allows the creation of collaborative environments where several people, who may even be in different places, can literally walk around and at the same time interact with a virtual 3D model as with a physical one actually present in the room, and show the design intent to stakeholders without needing an actual maquette (see Figure 7). Instead of waiting for all parties involved to assemble and review blueprints to make any changes, MR allows users to take notes and share video views of any suspected issues and send the information to remote teams in real time. On-site and remote team members are able to consult with each other and work with the information needed, greatly lowering the time and costs needed to make a decision among teams. MR collaborative environments such as WearCom further increase spatial faithfulness by using spatial cues such as virtual avatars to locate every observer speaking with each other in a conference.

**Figure 7.** MR collaborative environment with MS HoloLens 2.

### 5.3.2. Collaborative Augmented Maintenance and Repair Operations

Thanks to the full integration between the virtual and real world, MR represents the technology best suited to support maintenance and repair operations in the building and infrastructure sector. Naticchia et al. [97] merged information from a BIM model with an MR environment to the benefit of maintenance workers. Another MR case study was investigated by Ammari and Hammad [98], who proposed a framework to coordinate BIM and MR for supporting field tasks in the facility management domain.

MR can perform all activities that require AR integration and have been described above, including remote collaboration, with much improved precision and contextual interaction. In addition to visualizing maintenance-related data such as service requests, work order information, real-time IoT readings of the asset being serviced, documentation, or knowledge base items, MR can highlight parts and operations in their correct order (screws, switches, levers, valves, etc.) and virtually expand or explode the 3D image of the machine to visualize how it is built and all its components.

In particular, MR 3D models have been observed to be easier to understand than conventional paper documents, allowing for faster assembly and fewer errors made during construction. Furthermore, MR-assisted construction showed that participants with no previous assembly experience achieved the best times using MR, and they were also faster than the most experienced participants who used traditional paper plans [99], confirming the potential of MR to address the current shortage of experienced or skilled labor in the construction industry.

By using an MR integrated cooperation platform such as Microsoft Dynamics 365 Remote Assist, a remotely connected expert can see the live, first-person view taken by the HMD, freeze any scene of this video feed, and create annotations and 3D holograms, which are then integrated into the real-world view of the person wearing the device, anchored to their position irrespective of head movements [100].

By far, the most promising feature of MR is the ability to combine all digital data and documentation with one's physical view. Information, including actual or expected locations of walls, pipes, outlets, switches, and ventilation, is accessible directly on the site in layers that can be easily turned on and off. Having the precisely placed MEP overlays superimposed on the actual construction site can lead to subsequent work being completed with great precision. For example, contractors have been using robotic total stations and HoloLens to visualize a planned underground piping grid and mark the layout of required excavations directly on-site, thus minimizing unnecessary earth movements. Software such as BIM Holoview visualizes infrastructure components where they will be constructed to evaluate and verify their correct positioning and can also be used as a post-construction tool to view existing infrastructure behind walls. The MR device can also overlay information on the work in progress, highlighting necessary tools and components, or overlaying visual guidelines for masonry or electrical installations [101], up to entire step-by-step assembly instructions.

The ability to accurately overlay BIM model data that would otherwise be hidden, such as built-in wall ducts or overhead ducts, allows spot interventions without wasting time and resources on probing. This is particularly useful for underground infrastructures, where excavation operations often present a high risk of inadvertently damaging the existing subsurface utilities, causing financial loss or even accidental injury [18]. A workflow proposed by Bentley Systems with the ContextCapture tool leverages real-world photographic reconstruction to generate a 3D mesh onto which 2D pipe maps are projected and aligned based on visible surface features such as manholes, valve access covers, and drains. On site, the pipe-augmented 3D mesh is geolocated with the physical world by selecting common control points (doors, lampposts), then turned off to show only the pipe augmentations in the aligned position. This way, it is possible to identify pipes and manholes to service and to trace precise excavation markers, minimizing impact on the surrounding environment. At the same time, the user can visually identify discrepancies between the augmentation and the actual location of surface assets and propose changes to the pipe database by adjusting the virtual pipe's position directly on-site.
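The control-point alignment step in such a workflow can be pictured as a least-squares rigid (Procrustes) fit between the surveyed control points and their counterparts in the pipe map. The sketch below is an illustrative reconstruction of that step, not Bentley's actual implementation:

```python
import numpy as np

def fit_rigid_2d(src, dst):
    """Least-squares rigid transform (R, t) mapping src control points onto
    dst via the SVD/Procrustes solution; src/dst are (N, 2) arrays of the
    same matched features (e.g., manholes, door corners, lampposts)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)        # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

# Recover a known 90-degree rotation plus a (1, 2) shift from 4 matches.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
theta = np.pi / 2
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
dst = src @ R_true.T + np.array([1.0, 2.0])
R, t = fit_rigid_2d(src, dst)
```

With noisy real-world matches, the same solver returns the best-fit transform in the least-squares sense, which is what anchors the pipe overlay to the mesh.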

Regarding building maintenance with MR, ThyssenKrupp developed a HoloLens-based solution for elevator maintenance [131]. The application provides maintenance workers with an interactive UI and 3D holograms of the objects in need of fixing and offers live Skype calls for assistance (see Figure 8): the responder can visualize what the caller is seeing, making it possible to verify or assist the repair work in real time.

**Figure 8.** ThyssenKrupp MR solution for elevator maintenance.

### 5.3.3. Collaborative Augmented Renovations and Retrofit Works

In renovation planning, MR can be used to improve information retrieval from BIM models and thus reduce time and errors in related tasks. Indeed, most issues normally originate from poor coordination, incomplete or poor understanding by contractors, inability to detect errors or omissions early, or a disconnect between stakeholders and construction teams. MR can mitigate all these problems, greatly increasing overall productivity.

Using MR glasses, project team members can walk to the site of the future intervention and digitally mark problems they encounter, such as cracks or misplaced elements. Each "tag" is linked to a 3D model of the space and comes complete with built-in links to requests for information (RFIs), drawing details, additional images, vocal notes, or preset tags, allowing workers to use the same MR headset at a later time to locate, identify, diagnose, track, and ultimately fix issues.

Furthermore, MR can be used to measure the physical properties of a space, i.e., width, length, and height. Contractors can integrate this information into 3D models of the building, allowing them to generate even more accurate structures and have a more comprehensive view of how the project is being built. Using MR headsets, workers will be able to automatically take measurements of the built components and compare them against the dimensions specified in the as-designed models to identify any inconsistencies in the structures and quickly resolve them to prevent delays or higher costs (see Figure 9).

**Figure 9.** Example of Trimble XR10 MR application on construction site.

Thanks to the interoperability between different MR and BIM systems, it is possible to automatically update BIM models and construction schedules from the MR device itself [17]. Workers can easily display interior and exterior views of a facility and make changes to virtual floor plans while keeping one view intact, such as removing or repositioning walls or components, or modifying the layout on their headset or mobile MR devices. This allows experts to resolve any errors in a virtual view before applying changes to the physical structure. As this digital data is continuously updated, it takes the guesswork out of any design changes while improving the workflow and preventing material waste. Retrieving BIM information also allows field workers to effectively monitor a project against its construction plan and ensure its successful completion. In addition to this useful information database, MR can allow users to virtually see the building's progress against its schedule, providing an additional level of project management. A progress-monitoring interface can provide color-coded overlays to easily identify sections of the construction site that are ahead of or behind schedule [102]. Platforms such as Fologram allow users to create interactive models that guide step-by-step construction in real-world environments. Fologram can also use QR codes or ArUco markers to manipulate models in space. For example, with wireframe models, these guides specify how to trace out the dimensions for molds to create building elements and where to place each brick in a wall with millimeter accuracy.
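A color-coded progress overlay of this kind can be driven by a simple per-section comparison of planned versus actual completion. A minimal sketch; the section names, percentages, and thresholds are illustrative assumptions:

```python
def progress_color(planned_pct, actual_pct, tolerance=5.0):
    """Map a site section's schedule status to an overlay color:
    green = ahead, yellow = on track, red = behind (thresholds illustrative)."""
    delta = actual_pct - planned_pct
    if delta >= tolerance:
        return "green"
    if delta <= -tolerance:
        return "red"
    return "yellow"

# (planned %, actual %) per site section, as might come from the 4D schedule.
sections = {"level-1-slab": (100.0, 100.0),
            "level-2-mep": (60.0, 42.0),
            "facade-north": (30.0, 45.0)}
overlay = {name: progress_color(p, a) for name, (p, a) in sections.items()}
```

The headset would then tint each section's geometry with its color so that lagging areas stand out at a glance.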


MR can also be used to build mockups that combine digital and physical elements by inserting the digital model over the real one. This is particularly useful in the case of interventions on an urban scale. For example, the City Planning application by HoloPundits can help to graphically visualize a blueprint as a completed project, enabling users to take 3D walkthroughs, adjust building spaces, drag and drop props, and replace the virtual buildings. Taqtile's HoloMaps is a VR/AR app used to navigate through a geographical holographic model by superimposing real-time data, integrating Microsoft Bing 3D with over 250 cities and landmarks as a 3D model of any size. Data source integration provides contextually relevant information (weather, real-time traffic, and Twitter feeds) that can be shared with team members or stakeholders locally or remotely, synchronously or asynchronously, alongside voice and text annotations, and MR photo and video screenshots.

### 5.3.4. Personnel Training

In O&M training, MR presents the great advantage of allowing the trainee to interact with real-world objects (also having tactile feedback) and to access virtual information at the same time, easily making the mapping between the real task and the instructions without the need to use separate external training materials such as user guides or manuals. According to Fitts' skill acquisition model, MR enables the trainee to learn the basics of the task by observing augmented instructions and to develop behavioral and movement patterns during repeated execution of the instructed tasks, increasing their skills right from initial rehearsals [103].

### 5.3.5. Workplace Safety

In addition to supporting education and training, MR applications can boost hazard identification for field safety, improve risk recognition, and enhance real-time communication between managers and workers. The communication of safety information with MR can be enhanced by sharing video feeds between the on-site MR device and off-site PCs or tablets, enabling bi-directional communication between workers and safety coordinators regarding potential hazards, violations, and tips on site, with verbalized annotations and comments that can be tagged and georeferenced at their actual digital location [104]. As MR glasses such as HoloLens 2 continually create and update a 3D map of the environment, it is also possible to provide orientation and navigation within a building or construction site by overlaying arrows, indicators, and information panels that align accurately to rooms, doors, and equipment.
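To illustrate the georeferencing step described above, the following sketch (not from the article; all names and coordinates are hypothetical) converts a geotagged annotation (latitude/longitude) into local east/north offsets relative to a known site anchor, so it could be placed in an MR scene. It uses a simple equirectangular approximation, which is adequate only for the short distances found on a single site.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, meters

def geo_to_local(anchor_lat, anchor_lon, tag_lat, tag_lon):
    """Return (east_m, north_m) of a geotagged annotation relative to
    the site anchor, via a small-distance equirectangular approximation.

    All inputs are in decimal degrees (WGS84)."""
    lat0 = math.radians(anchor_lat)
    d_lat = math.radians(tag_lat - anchor_lat)
    d_lon = math.radians(tag_lon - anchor_lon)
    north = d_lat * EARTH_RADIUS_M
    east = d_lon * EARTH_RADIUS_M * math.cos(lat0)  # shrink longitude by latitude
    return east, north

if __name__ == "__main__":
    # A hazard tag roughly 111 m north of the site anchor (Rome-area coords).
    east, north = geo_to_local(41.9000, 12.5000, 41.9010, 12.5000)
    print(f"east={east:.1f} m, north={north:.1f} m")
```

In a real deployment this local offset would then be expressed in the headset's spatial-anchor frame (e.g., via the platform's anchor API); a production system would also use a proper geodesic library rather than this approximation.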

### **6. Conclusions**

Numerous studies and fieldworks are demonstrating how XR technologies represent a very important tool to best exploit the possibilities offered by the digital transition of the AECO sector. Indeed, the AECO industry is faced with a confluence of trends and technologies that brings the virtual and physical worlds together to create a truly networked world in which intelligent objects communicate and interact with each other.

XR technologies, including VR, AR, and MR, allow immersive digital experiences to be created that permit users to easily visualize, explore, and understand designs, models, and site conditions with many benefits in terms of stakeholder engagement, design support and review, construction planning, progress monitoring, construction safety, and support to operations and management, as well as personnel training.

After providing a detailed review that analyzed, categorized, and summarized state-of-the-art XR technologies and their possible applications for building O&M along with their relative advantages and disadvantages, the article shows that XR can greatly improve building management and maintenance, allowing the optimization of building performance, cost-effectiveness, employee satisfaction, and productivity. VR, AR, and MR can be revolutionary in supporting designers, technicians, and facility managers in their activities thanks to the ability to provide immersive visualization of the asset and to superimpose instructions, sensor data, or technical schemes right in their field of view, as well as allowing hands-free communication. Furthermore, integrated cooperation platforms can allow remote connected colleagues and experts to see the live, first-person view taken by the XR headset and collaborate actively with the users.

VR can provide a way to operate the facility remotely in an immersive environment and can be used to simulate alternative scenarios. AR has great potential to support building operation and management because it can provide useful information in context to site workers who operate and maintain the facilities. A combination of both technologies can support field and remote office workers at the same time and improve collaboration.

In particular, VR can support O&M activities through maintainability design, immersive building visualization and monitoring, human-centered building management, personnel training, and vehicles or robots' teleoperation. AR and MR can improve building performance and maintenance activities by enabling collaborative augmented building monitoring and management, collaborative augmented maintenance and repair operations, collaborative augmented renovations and retrofit works, personnel training, and workplace safety.

The main challenges facing the use of both AR and VR for facility management are the lack of integration with other facility management systems, the low accuracy and speed of updating information across several systems, and the difficulty of archiving and revisiting AR and VR experiences. Furthermore, XR headsets need to become less constrained by battery life, image rendering speeds need to improve, and siloed software platforms need the privacy controls and infrastructure support to integrate more seamlessly in the cloud. Interactions with the virtual space will become more intuitive, supporting gestures and eye gaze to aid in sketching, prototyping, and animation. Geospatial tags that sync real-world location data with building models will be automated and achieve greater precision.

Today's market trends relating to the diffusion of XR, and its convergence with other sectors such as gaming and video entertainment, suggest a rapid resolution of the present barriers to the application of these technologies to O&M, currently represented by the still-high costs of the devices (MR in particular), the level of digital literacy required, and the sparse adoption of BIM 7D systems in facility management. On the other hand, the type of technology and the required digital skills could attract and develop new, more skilled and qualified talent, addressing the labor shortage problems typical of the sector.

Finally, given the novelty of the subject, this study identified a lack of scientific literature on the measured costs and benefits deriving from the application of XR technologies in the field of O&M; addressing this gap is a prerequisite both for further research in this area and for a wider diffusion of immersive technologies in the AECO sector.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

### **References**

