*Article* **BiodivAR: A Cartographic Authoring Tool for the Visualization of Geolocated Media in Augmented Reality**

**Julien Mercier <sup>1,2,\*</sup>, Nicolas Chabloz <sup>1</sup>, Gregory Dozot <sup>1</sup>, Olivier Ertz <sup>1</sup>, Erwan Bocher <sup>2</sup> and Daniel Rappo <sup>1</sup>**


**\*** Correspondence: julien.mercier@heig-vd.ch

**Abstract:** Location-based augmented reality technology for real-world, outdoor experiences is rapidly gaining popularity in a variety of fields such as engineering, education, and gaming. By anchoring media to geographic coordinates, it is possible to design immersive experiences remotely, without requiring in-depth knowledge of the context. However, the creation of such experiences typically requires complex programming tools that are beyond the reach of mainstream users. We introduce *BiodivAR*, a web cartographic tool for the authoring of location-based AR experiences. Developed using a user-centered design methodology and open-source interoperable web technologies, it is the second iteration of an effort that started in 2016. It is designed to meet needs defined through use cases co-designed with end users and enables the creation of custom geolocated points of interest. This approach enabled substantial progress over the previous iteration. Its reliance on geolocation data to anchor augmented objects relative to the user's position poses a set of challenges: on mobile devices, GNSS accuracy typically lies between 1 m and 30 m. Due to its impact on the anchoring, this lack of accuracy can have deleterious effects on usability. We conducted a comparative user test using the application in combination with two different geolocation data types (GNSS versus RTK). While the test's results are undergoing analysis, we hereby present a methodology for the assessment of our system's usability based on the use of eye-tracking devices, geolocated traces and events, and usability questionnaires.

**Keywords:** location-based augmented reality; augmented reality authoring tool; cartographic authoring tool; user-centered design; user experience; usability; educational technology; nature exploration; geolocated media spatial visualization; cartographic symbols visualization; immersive cartography; open source web tools; biodiversity education

### **1. Introduction**

### *1.1. Goals and Structure of the Paper*

The goal of this paper is to describe the development of the location-based augmented reality (AR) web application *BiodivAR* and the associated user-centered design (UCD) methods [1], together with our work on geolocation data enhancement. It also proposes a methodology to further evaluate the usability of the system. As such, it does not present new research results, but a new tool and the methods used to develop it. In 2017, the first iteration, a proof of concept in educational technology (edTech), was released under the name *BioSentiers*. This paper presents the output of the second iteration of this project and the methods behind it. The first iteration both highlighted some benefits of using location-based AR for nature exploration and identified key challenges that needed to be addressed to make this pairing more meaningful. In that prototype, the augmented objects were hard-coded in the application: editing or adding points was impossible. The AR interface was also unsteady, which caused usability problems.

**Citation:** Mercier, J.; Chabloz, N.; Dozot, G.; Ertz, O.; Bocher, E.; Rappo, D. BiodivAR: A Cartographic Authoring Tool for the Visualization of Geolocated Media in Augmented Reality. *ISPRS Int. J. Geo-Inf.* **2023**, *12*, 61. https://doi.org/10.3390/ ijgi12020061

Academic Editors: Beata Medynska-Gulij, David Forrest, Thomas P. Kersten and Wolfgang Kainz

Received: 29 December 2022 Revised: 24 January 2023 Accepted: 6 February 2023 Published: 9 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Our new prototype consists of an authoring tool for custom location-based AR experiences and attempts to overcome these issues.

The paper is structured as follows: Section 1 introduces the goals of the paper and the project's research goals. Section 2 reviews existing projects leveraging AR in the field of education, as well as key findings of the *BioSentiers* project that provided the ground for this iteration; it also lists the specific challenges of using location-based AR for outdoor biodiversity education. Section 3 describes the conceptual and formal stages of designing the proposed system and their associated challenges. Section 4 details the setup and considers the possible benefits of using Real-Time Kinematic (RTK) positioning technology. Section 5 outlines the methodology that we intend to implement to assess our system's usability. Section 6 discusses the progress toward the established goals and the steps ahead.

### *1.2. Research Goals of the Project*

The *BiodivAR* project aims to explore the role location-based AR can play to support nature exploration and biodiversity education. Its research goals are therefore manifold and can be synthesized as follows:

**Create an authoring tool that is easy to use.** The application we developed features a cartographic authoring tool for the creation of custom AR learning experiences (ARLEs). It was designed based on information retrieved from interviews and co-design sessions with target users, and it is primarily intended to facilitate the organization of field trips in nature. It should be usable by anyone with little prior technological knowledge [2], which entails demanding usability requirements. It is also intended for use in the context of citizen science (CS) projects, whether by visualizing existing geolocated data or by contributing new data.

**Design a location-based AR interface that supports nature exploration.** The application features an immersive AR mode for visualizing text, photographs, graphics, 3D animations, and sounds in virtual environments. When these spatial media contain information about the geographical space they are anchored to, they have the potential to engage users in a singular way [3–8]. We aim to leverage this type of immersive interface's attractiveness and efficiency to engage users in actively exploring outdoors.

**Improve the usability of location-based AR interfaces.** Because location-based AR depends on geolocation data, the accuracy of that data directly affects the interface's stability [4,5,8–11]. We propose the use of an external module for satellite navigation with real-time kinematics (RTK) that achieves centimeter-level accuracy. We also lay down the concept for a data fusion method using simultaneous localization and mapping (SLAM) data and geolocation data to further improve heading data, which has an equally important impact on location-based AR interfaces.

**Evaluate the tool's usability.** As a UCD project, its successes and failures are to be assessed primarily on the basis of its usability, acceptance, and user satisfaction as measured by self-reported user data, and on the improvements observed with respect to the prior iteration [1]. We have conducted a comparative user test (n = 54) to assess the tool's usability in combination with two geolocation data sources. The results have not yet been fully processed and analyzed and will be presented in a subsequent paper.

### **2. Background**

### *2.1. Augmented Reality for Education*

Augmented reality (AR) describes a type of interface that overlays real and virtual objects in the field of view in real time, whether through the screen of a mobile device or a head-mounted display. It is interactive and registered in 3D [12]. Users' perception stems primarily from real-world objects, while virtual objects appear to be spatially or semantically related to the real world. In AR, virtual and real objects are combined to create the impression of an enriched environment. In an educational setting, AR can make traditional educational material more attractive by overlaying virtual information onto it, hence motivating students to learn [13]. Studies have shown that using AR in education fosters students' motivation [14]. ARLEs hold much potential for improving learning processes and are undergoing intensive development and research [15]. Their usage is not very widespread yet. EdTech researchers have suggested that this is due to the difficulty of developing augmented reality learning experiences, which usually requires programming skills, and they call for the creation of ARLE authoring tools that are easy to use for teachers and students [2]. Another possible reason is that few tools are designed specifically for education; they tend rather to be developed for entertainment, civil engineering, or general purposes [16]. However, it has been observed that the number of AR applications for education has been increasing since 2010 [17,18]. A systematic literature review covering 32 papers written between 2003 and 2013 revealed that the most reported advantages of AR in education are learning gains and motivation. It also showed that the typical target group is bachelor students, and the most common topics are mathematics and natural sciences [19]. A meta-analysis of 64 "AR in education" experiments revealed that AR had a medium effect on students' learning gains [20].

Similar to their marker-based or SLAM-based counterparts, location-based ARLEs are assumed to foster immersion and support learning [5,21–24]. Positive learning experiences and high motivation and engagement levels have been repeatedly reported [3–8]. Researchers have investigated ways to design effective and attractive immersive experiences for visualizing geographical virtual spaces based on traditional cartographic data and materials [25]. It was also reported that location-based AR supported contextualization [26,27] and ecological engagement [28], and caused users to experience a positive interdependence with nature, which fosters improved immersion and learning [29]. Because it uses geolocation data gathered through a Global Navigation Satellite System (GNSS) sensor, location-based AR only functions outdoors. Compared to marker-based or SLAM-based AR, which are the go-to technologies for indoor environments, location-based AR does not operate at the same scale and does not cater to the same needs: it is suitable for outdoor use over distances of several dozen meters. The growing popularity of outdoor education [30] makes it a go-to technology, although leveraging AR technologies for outdoor learning can be challenging because of the informal setting it often takes place in [13]. On the other hand, location-based AR in particular can draw on learning theories that promote learning in context, or inquiry-based learning [27], which is the valued teaching philosophy in the natural sciences [31]. When aligned with sound pedagogical theory, AR is more effective [32].

The use of location-based AR in an educational setting is also justified by the physical activity it is able to stimulate. In one study, researchers found that using the location-based AR application *Pokémon Go* [33] led to substantial increases in physical activity across genders, ages, weight statuses, and prior activity levels, unlike many health apps that were primarily designed to that end [34].

Despite the aforementioned reported benefits, AR in education is still at an early stage compared to other digital technologies. The majority of studies are short-term, one-time experiments, with little to no longitudinal reporting on its impact on learning outcomes [13]. In the specific case of location-based AR, jittery interfaces caused by inaccurate GNSS data make it the weak link of AR [4,5,8,10], and it has thus not been widely investigated.

### *2.2. Proof of Concept: The BioSentiers Mobile Application*

The application presented in this manuscript is the second iteration of a research project started in 2016. The Media Engineering Institute (MEI) and the Territorial Engineering Institute (INSIT) developed a location-based AR application named *BioSentiers* [9]. The goal of the project was to improve pupils' understanding of biodiversity and establish a connection with nature. The application featured geolocated points of interest (POIs) positioned adjacent to plant specimens. When users approached a specimen in real life, the POI's color changed on the graphical interface and they were able to open an informational card on the plant species. The data collection and selection of the species presented were carried out by a biologist, and the application was tested with fifteen ten-year-old pupils during field trips in a natural reserve, as visible in Figure 1.

**Figure 1.** The *BioSentiers* location-based AR application. The species presented were prepared by a biologist so as to create an educational trail for pupils in a natural reserve. A video showing the application in use during a field trip is available in a publicly accessible Zenodo repository: https://zenodo.org/record/6501843 (accessed on 28 December 2022).

The test emphasized the benefits of using AR for wayfinding in nature: users seemed more efficient when using the AR view rather than the 2D interactive map [9]. They were also more motivated and engaged with the activity, which concurs with conclusions drawn by comparable projects [3,5–8,26–28]. The test also revealed significant usability problems caused by the instability of the anchoring of the POIs in the AR user interface. Because the display of POIs is directly controlled by the device's measured position, the GNSS's lack of accuracy was held to be the cause [9]. At times, the virtual objects either disappeared completely or were so out of position that users got lost. Several other researchers have also underlined the usability problems caused by the imprecision of geolocation data, which is often regarded as location-based AR's bottleneck [4,5,8,10,11]. In consequence, some have chosen marker-based AR over location-based AR because their experience led them to consider location-based AR frustrating and distracting to users [35,36]. It was observed in the *BioSentiers* test that pupils mostly interacted with the tablet's screen (88.5% of the duration of the experience on average) rather than with the surrounding nature [9]. This imbalance is believed to be, at least in part, a collateral effect of inaccurate geolocation data: in the video captures, users are seen spending considerable time reorienting and repositioning themselves. These usability issues caused users to deviate from the application's primary goal of exploring nature, and even caused some to trip on branches and fall. In other instances of AR being used in an outdoor educational setting to promote the exploration of nature, researchers have made similar observations on how technology tends to monopolize users' attention [37,38]. Mobile devices often require constant attention and interaction from students, which leads them to focus on the device more than expected. In a wide review of mobile learning projects, the authors found that the technology dominated the experience in a problematic way in 70% (28/38) of cases [39].

Following the test, an evaluation [40] of the application offered further improvement suggestions. It recommended adding a function that would allow users to publish their observations in the form of texts or photographs, rather than being restricted to a passive viewing role. It also suggested that the application be partitioned into a "student" and a "teacher" version, and that teachers be given the possibility to create their own learning experiences.

As a primarily technical proof of concept, *BioSentiers* did not leverage a UCD methodology during its conceptual phase, and as such failed to consider several usability aspects. Moreover, as this was an initially modest project, the user tests were not governed by a strict and complete methodology. Because of this, its results and conclusions are limited in scope and generalizability. However, they sufficed to provide the basis for a new iteration, in which we attempted to address the reported challenges, which appear to be consistent with those of other edTech projects on location-based AR.

### **3. Materials and Methods**

### *3.1. User-Centered Design*

Since weaknesses and challenges had been identified in the previous iteration, we considered that a user-centered design approach would be most likely to help us overcome them. An application developed through UCD should focus on and be optimized for the needs of end users, adapting to those needs rather than requiring users to change their habits [1]. During the early phases of conception, we attempted to formulate responses to the challenges raised by the *BioSentiers* project while also updating our goals: to offer a cartographic authoring tool for the creation of location-based AR experiences. We wanted to create a lightweight and flexible tool that would be easy to maintain and evolve, and suitable for different uses. The idea was to integrate the possibility of interacting with the content, and to make the creation of AR experiences accessible to non-specialists, so that the content could be adjusted to meet the specific needs of any field trip. The research goals (Section 1.2) were set based on findings made since the beginning of the project (i.e., through interviews and co-design sessions). In order to come up with relevant solutions, we leveraged several UCD methods to gain a more thorough understanding of our target users' requirements and the context in which our system may be used. This entailed participant observation, interviews, co-design sessions, defining main use cases, rapid prototyping, ad hoc testing, and ideation, as described in Sections 3.1.1–3.1.6.

### 3.1.1. Participant Observation

Following the UCD framework, the conception began with a research phase that helped us better understand the end users' environment, their requirements, and their expectations. Our main use case is educational, for example in schools. Consequently, the system should be easy to handle for non-specialists such as teachers and students [2]. Other target groups were included in the design process, including higher-education students in nature engineering and citizen scientists. During this early stage, three participant observation sessions took place with biodiversity professionals. During these sessions, the first author of this paper followed educational field trips organized by Switzerland's main nature preservation organization for elementary and secondary school pupils (aged 8–12). These were opportunities to witness how biodiversity education is carried out outdoors, in informal learning contexts. One key observation was that tablets were already used to view photographs and videos and to listen to animal sounds that could not be witnessed on command. It was also observed that the younger audience was highly motivated by outdoor activities, even without the use of mobile technology, which may indicate that they are not the target audience that could benefit the most from mobile technologies. Slightly older pupils may be more prone to becoming unengaged in educational and outdoor activities, yet very eager to use mobile devices. This led us to focus on a slightly older age group (12–15) than the one previously targeted in the *BioSentiers* project.

### 3.1.2. Interviews

Interviews were conducted with four actors likely to be concerned by the use of the application: two education specialists from different disciplines (outdoor education/French, natural sciences); a biodiversity educator; and an elementary school teacher. The interviews were conducted in an informal setting (three over videoconference and one face-to-face) while the interviewer took notes on paper or on a computer. These interviews helped refine the research goals (Section 1.2) and define specific features that would be relevant to each end-user profile. They also gave some participants the opportunity to voice their concerns about the use of screens for outdoor education. This step reaffirmed our goal of better usability leading to a better balance between screen interaction and nature interaction. The biodiversity educator from a nature preservation organization in particular expressed that the hypothetical motivation provided by the use of AR was irrelevant in some cases: in their experience, the children were already sufficiently motivated during the activities. However, they had observed both the occasional loss of motivation and an interest in mobile technologies in teenagers, and therefore acknowledged that the use of AR could help. This concurred with findings made during the participant observation stage and encouraged us to narrow our target audience to pupils aged 12–15, a group that tends to lose interest in school and science [41] while being avid mobile technology users [42], according to some studies.

### 3.1.3. Co-Design

In addition to the interviews, three co-design workshops were organized with education and nature engineering specialists. These took place during the design development stage, with the goal of ensuring that the resulting system would meet the needs and expectations of potential end users. The first participant was a middle school teacher who had previously used the *BioSentiers* application with their pupils; the second was an outdoor learning didactician; and the third was a higher-education teaching assistant. The goal was to elaborate outdoor biodiversity learning sequences for different age groups that included the use of a mobile AR application, based on the participants' own assumptions about what features such an app might provide. We came up with pedagogical scenarios for the then-hypothetical application in order to imagine how biodiversity learning could be integrated into the curriculum of different school subjects (French, geography, science, history). From the imagined scenarios, several new features were added to the application's architecture:

	- (a) A style customization feature for the POIs (colors, shapes, volumes, fills, etc.)
	- (b) A "geofencing" feature that would allow users to draw walls visible in AR
	- (c) Data visualization features on a 2D map that would allow users to visualize the media anchored within POIs remotely, after the field trip

In another scenario, the application would be used to collect data in the field as students explore nature. They would be tasked with collecting data, creating POIs, uploading photographs, identifying and monitoring specimens, writing comments, and making observations. In this scenario, the goals would be both for the students to learn about biodiversity by applying theoretical knowledge they have previously studied, and to actively contribute biodiversity data that can be harnessed for research. Thanks to the use of interoperable GeoJSON files, participants can make further use of the data they collect by uploading it to existing CS databases such as the Global Biodiversity Information Facility https://www.gbif.org/ (accessed on 28 December 2022) or, in Switzerland, Infoflora https://www.infoflora.ch/fr/ (accessed on 28 December 2022). Because the objectives of this age group are aligned with, and better known as, those of CS, they were modeled under this label in the application's use cases. The workshop led to the inclusion of features that allow users to achieve these CS goals directly in the field, such as creating POIs, uploading photographs, and writing comments and observations.


### 3.1.4. Use Cases

The previous steps (participant observation, interviews, co-design) provided grounds to model the use cases for each of our target users, as visible in Figure 2. Our primary users are pupils who will use the system to learn about biodiversity by exploring a milieu while visualizing media in AR (sounds, images, 3D models) during field trips. For pupils to effectively use AR for educational purposes, the augmented content should be tailored to their needs, which implies that their teachers have to be able to use the system in the first place. To help pupils learn a new task or concept, teachers routinely use instructional scaffolding or target their zone of proximal development. AR contents should be adaptable and their management should be easy, as identified by Cubillo et al. [2]. Therefore, our system's secondary users are teachers whose goal is to teach biodiversity and use the desktop client of the system to develop or adapt an ARLE prior to a field trip. CS pursues the dual objectives of learning about biodiversity and collaboratively collecting data, which overlap with the goals of biodiversity education. We therefore included citizen scientists, who may use the system occasionally, as another secondary user group. The system should be a useful resource for CS initiatives, notably through the ability to visualize any collaboratively collected (interoperable) geodata in an AR interface.

### 3.1.5. Rapid Prototyping and Ad Hoc Testing

We used a rapid prototyping method [43] to create an early version of the application with ARIS [44], a user-friendly, open-source authoring tool for AR experiences, as a way to test our initial ideas and gather user feedback. This included testing storytelling scenarios using analogies to convey abstract concepts related to biodiversity. We also wanted to test gamification features such as characters, points, and missions, as visible in Figure 3. Due to the limitations of the underlying AR engine (Vuforia), the AR was marker-based, which means that visual targets had to be recorded in the application and placed in the field.

**Figure 2.** Use Cases: system's scope, end users and their interactions with the system, main expected scenarios and the goals that the system may help end users achieve.

**Figure 3.** Screenshots of the prototype made with ARIS. Location-triggered POIs prompt dialogs with fictional characters who send users on missions to explore newly appeared POIs on the map. When reaching a POI's location, users either had to listen to sounds, identify or photograph a specimen, or watch a video before returning to the character to earn a reward.

In July 2021, we tested this prototype during a one-day continuing education outdoor learning training session. As visible in Figure 4, twenty teachers from elementary and middle school individually explored a patch of forest with our AR game for about twenty minutes on a smartphone [45]. After the tests, we conducted group discussions to learn about what the end users thought of the system and its features.

**Figure 4.** Teachers watching a video in AR about a plant species, overlaid on a visual marker placed next to a specimen.

Overall, a majority of the teachers thought that the system could help their pupils engage with nature and teach them contextual information about the plants they encountered. They discovered plant specimens and their taxonomy, and they appreciated that the system prompted them to walk around and explore areas of interest. However, some participants failed to see the link between the augmented objects and plant specimens in the real world, which confused them about the system's main goal. Based on our observations, this was due to the inaccuracy of the geolocation data, concurring with what had been observed during the *BioSentiers* test. Text-based media in the system were deemed generally too long, while the audio contents (animal sounds) were appreciated. This feedback helped us refine our specifications and set priorities to make our concept more user-friendly and engaging.

### 3.1.6. Ideation

As visible in Figure 5, an ideation session was organized as a method to leverage collective intelligence and come up with solutions to some of the problems we faced while conceiving the application. The participants received a short introduction about the application, its target users, and intended use cases. A wireframe was also presented. During the first activity, each participant had to identify any difficult, complex, or unclear points on post-its; these were then clustered by topic and displayed, and the participants discussed possible workarounds. In a second activity, participants were asked to draw on paper six different versions of a proposed interface that had previously been identified as complex for users. Both activities were helpful in simplifying the issues and identifying the most logical options: the external observations made during the activity led to the removal of several redundant layers in the final application. They also allowed us to determine which of the system's features were instrumental in achieving its set goals.

**Figure 5.** UX/UI designers and developers during an ideation session.

*3.2. Architecture and Implementation*

### 3.2.1. Concept

The *BiodivAR* mobile application aims to support nature exploration for pupils, students, and concerned citizens and to allow them to visualize and interact with geolocated POIs containing media in AR. Additionally, it aims to enable teachers to author location-based AR experiences. Our concept thus assumes two main types of utilization: media visualization in AR on location/outdoors; and the authoring of AR environments, which can be carried out remotely/indoors. As such, the application comprises two main modules: a mobile component, which mainly allows AR visualization of geographic data, and a desktop cartographic authoring geoportal that allows users to create and manage environments and visualize data on a 2D interactive map. This two-part design is shown in Figure 6. The desktop geoportal may also be used in a learning context. The translation and correspondence between the cartographic authoring tool and its 3D, augmented counterpart is at the core of the concept.

**Figure 6.** The *BiodivAR* concept: the mobile AR interface allows on-site, spatial interaction with media anchored to geographic coordinates. The desktop interface displays the same geodata on a 2D interactive map (Leaflet). GeoJSON data can be imported or exported.

The data are stored within *bioverses*, which are thematic augmented reality learning environments (ARLEs). They can be set to public/private, editable or non-editable, duplicated, merged, and visualized on the 2D map. They contain geolocated points of interest (POIs), which in turn contain various media and/or data. Each medium within a POI can behave in one of two ways (a sketch of this logic follows the list):

- It can be visible/audible from the moment a user is within the "scope" and disappear when they enter the "radius." Use case: a 3D cartographic symbol; a sound effect that plays when users enter/exit the "radius."
- It can be invisible/silent until the user reaches the "radius." Use case: a gallery of media about a specimen that should only be visible once users reach a specific area; a podcast that plays when users are facing a landmark.
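A minimal JavaScript sketch of this scope/radius logic is shown below; the property names (`poi.scope`, `poi.radius`, `media.trigger`) are our own illustrative assumptions, not *BiodivAR*'s actual data model:

```js
// Toggle media visibility from the user's distance to a POI's center.
// "scope"-triggered media show between the outer scope and the inner
// radius; "radius"-triggered media appear only inside the radius.
function updatePoiMedia(poi, distanceToUser) {
  const insideScope = distanceToUser <= poi.scope;   // outer threshold (m)
  const insideRadius = distanceToUser <= poi.radius; // inner threshold (m)
  for (const media of poi.media) {
    if (media.trigger === 'scope') {
      media.visible = insideScope && !insideRadius;
    } else if (media.trigger === 'radius') {
      media.visible = insideRadius;
    }
  }
}
```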

### 3.2.2. Tools, Frameworks and Standards

After the conceptual grounds were set, the actual development of the application began and the adoption of suitable frameworks came into question. After reviewing various options (Unity, Ionic, Wikitude, WebXR), we opted for a web-based application for the numerous advantages it offered. It can be accessed through a URL in a WebXR-enabled web browser without downloading any file, and a single deployment allows the targeting of both handheld and head-mounted AR devices. As an applied research project, continuity is important, and this choice meant our code would run on new devices without the need for major updates. However, it also came with limitations. Because the code and assets are stored on a remote server, web-based applications tend to be slower and less responsive than native apps, which can be critical for AR usability. Access to the device's hardware features may also be limited. Last but not least, users are required to have an active network connection.

*BiodivAR* was thus conceived as a client–server application built exclusively with web technologies. We had to adapt to the current limitations of the WebXR Device API (Application Programming Interface), the standard for VR/AR/XR experiences on the web. WebXR uses Google's ARCore SDK and enables access to a device's inertial sensors and camera data in order to handle the AR cameras in compatible (Chromium-based) browsers. Although experimental support for the WebXR API is offered by Mozilla to iOS users through their "WebXR Viewer" application, Apple has yet to enable global support for the API. WebXR is thus officially supported on Android devices (as well as non-Android AR glasses, i.e., HoloLens 2). However, WebXR features have recently appeared in the Safari browser of iOS 15.4 (June 2022), which suggests that Apple is ready to adopt the standard in the near future, despite its own parallel development of the ARKit SDK and the USDZ format in collaboration with Pixar.
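In practice, a web client can detect at runtime whether the browser supports immersive AR through the standard WebXR Device API; the snippet below is illustrative and not taken from *BiodivAR*'s source:

```js
// Feature-detect WebXR and the immersive AR session mode.
if (navigator.xr) {
  navigator.xr.isSessionSupported('immersive-ar').then((supported) => {
    if (supported) {
      console.log('immersive-ar supported: the "enter AR" button can be enabled');
    } else {
      console.log('WebXR is present, but immersive AR is not supported');
    }
  });
} else {
  console.log('WebXR Device API unavailable (e.g., most iOS browsers today)');
}
```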

We used the A-Frame https://aframe.io/ (accessed on 28 December 2022) open-source web framework to support the building of WebXR-compatible 3D scenes. It uses an entity component system architecture, familiar from game development, and is convenient for both developers and designers to create 3D scenes without having to struggle with WebGL. It also benefits from an enthusiastic and dedicated community, which allows it to be enhanced with new features at a rapid pace. To our knowledge, the only existing open-source web framework featuring location-based AR is AR.js, which also features an authoring tool [47]. However, AR.js covers a wide range of AR types that we would not use in our system, and its geolocation components had not been updated or maintained in a while. Consequently, we created LBAR.js https://github.com/MediaComem/LBAR.js/ (accessed on 28 December 2022), a minimalist A-Frame library for creating WebXR location-based anchors. LBAR.js places entities in a virtual 3D space based on geolocation data and inertial sensor data. It includes one system (*gps-position*) and three components (*faces-north*; *gps-position*; *pitch-roll-look-controls*) [48]. The *gps-position* system defines the accuracy threshold over which (after a movement of *n* meters) the geolocated entities' anchoring is refreshed. The *faces-north* component forces an entity to always face north (as measured by the compass bearing, or absolute device orientation). It lets users create a parent entity aligned with reality, under which all geolocated entities are anchored as children. The *gps-position* component allows the position of an entity (a geolocated POI) to be expressed in WGS84 coordinates, converting those into local coordinates so that the entity may be anchored in the A-Frame scene. It needs to be a child of an entity with the *faces-north* component in order to work. The *pitch-roll-look-controls* component disables the default yaw control of the AR camera, since the *faces-north* entity (and its children) rotates on that axis instead. Other open-source dependencies used for the client–server structure include SQLite (database), PrismaJS (ORM), hapi.js (API), vue.js (frontend), and leaflet.js (2D interactive map). The full dependency diagram is visible in Figure 7.
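A minimal scene using these building blocks might be declared as follows; this markup is our own sketch, and the exact attribute syntax accepted by the LBAR.js components is assumed from the descriptions above:

```html
<!-- Sketch of an LBAR.js scene. The camera's yaw is driven by the
     faces-north parent entity rather than by A-Frame's default
     look-controls, and the box is anchored at WGS84 coordinates
     (example values) as a child of that parent. -->
<a-scene>
  <a-camera pitch-roll-look-controls></a-camera>
  <a-entity faces-north>
    <a-box gps-position="latitude: 46.781; longitude: 6.641" color="#3A7"></a-box>
  </a-entity>
</a-scene>
```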

**Figure 7.** *BiodivAR* structure and dependencies. The user interfaces (desktop and mobile) are managed by the Vue.js framework, which sends requests to a Rest API built with hapi.js. The Rest API gets the requested data from an SQLite database through a Prisma.js object–relational mapping (ORM). The mobile and desktop 2D interactive maps are powered by the Leaflet.js library, and the AR cameras and objects are managed by the WebXR API, the three.js 3D library, and the A-Frame framework, with the additional LBAR.js [48] custom library we designed to power geolocated POIs. All of these dependencies are open-source.

### 3.2.3. Desktop User Interface

In *BiodivAR*, users may browse bioverses they have authored as well as publicly accessible ones. Bioverses may be opened in AR mode on mobile devices only, because AR mode requires geolocation and angular data that only mobile devices can provide. However, bioverses can also be visualized on desktop devices on an interactive map, as visible in Figure 8. On this interface, a user may create a new POI (1), import external geodata (GeoJSON), or export what is currently displayed on the map (2). They may browse available bioverses (3) and open several at once to visualize, analyze, or merge data. In any of the available tabs (POIs, paths, tracelog, and events), geodata may be toggled (4) to be made visible or invisible on the map. An object's title and description appear in an infobox (5) when it is hovered over or selected on the map. If the current bioverse can be edited by the current user, an "edit" button appears. POIs' radii are displayed on the map (6) with the same custom style (color, stroke, fill, etc.) assigned to their AR, 3D counterparts. POIs' scopes show as white dashed circles (7). If a POI contains audio media, the audio scope, or audibility threshold, appears on the map as a dark blue dashed circle (8). Events (9) and traces (10) are user-generated geodata that can be displayed on the map. The traces tab (12) holds a table with the trajectory a user followed from the moment they opened a bioverse in AR mode to the moment they exited it. If selected, the paths appear as randomly colored polylines on the map. The events tab (13) stores user actions that occur during regular use of the application: entering/exiting a bioverse, entering/exiting a POI, creating/editing a POI, and opening/closing the 2D map. The events appear as red circles on the map, with infoboxes specifying their type. Traces and events are only accessible to the users who authored them, in a logic of data transparency and awareness. These data may be collected during user testing for evaluation and analysis purposes. Finally, an additional menu (14) allows users to export, copy, paste, delete, and reorganize data. POIs can be copied and pasted (individually or in bulk) from one bioverse to another. Bioverses can also be exported as GeoJSON files and opened with other geoportals, or archived.

**Figure 8.** A view of *BiodivAR*'s desktop user interface with its main features highlighted. Various types of geodata (POIs, traces, events) may be displayed on the map by toggling them on their respective tabs.

When a POI is created or edited, the POI editor modal window opens, as seen in Figure 9. It lets users manage and preview the POI as an embedded A-Frame 3D scene that can be navigated with the "WASD" keyboard keys. Some parameters are POI-wide (i.e., geographic coordinates, title, description). Style attributes can be assigned to a POI's radius and are reflected both on the 2D map and in AR mode. The scope (visibility/audibility threshold) can be adapted. Users may add new media (image/sound files, texts) and define their position relative to the origin (the arrow helper). Various behaviors can be assigned to each object in the scene: whether the media should always face users or hold an absolute position; whether they should appear when users enter the scope *or* the radius; whether their position should be animated; and so on. The image on the left of Figure 9 previews the state of a POI before the user has entered the perimeter of the radius; the image on the right previews its state after the user has entered the radius.

**Figure 9.** The POI editor appears when a POI is created or edited. It lets users manage and customize POIs. Users can upload media (3D models, pictures, sound, or plain text) and position them in the embedded 3D scene preview. Each media may be assigned individual behaviors so that they appear at specific user-triggered events.

### 3.2.4. Mobile User Interface

*BiodivAR*'s mobile user interface is accessed through the same URL. After logging in (Figure 10a), users may browse bioverses similarly to the desktop counterpart (b). The "enter AR" blue button initiates WebXR's AR mode and loads the geodata contained in a bioverse: the POIs are displayed superimposed on the camera view background, simulating their presence in the real world (c). Users may swipe the interactive 2D map up from the bottom of the screen to help them navigate toward POIs (d). When they enter a POI's radius, author-scheduled events are triggered and new media may appear (e). In the example shown in Figure 10, a distribution map of the tree specimen sitting next to the POI appears, while a short audio podcast delivers information about the species.

**Figure 10.** *BiodivAR*'s mobile user interface: (**a**) login; (**b**) all available bioverses sorted by categories (authored by user, public, bookmarked); (**c**) AR view after opening a bioverse; (**d**) a collapsible 2D map for navigation; (**e**) view of a media showing the distribution map of an adjacent specimen's species, after entering a POI's radius.

### 3.2.5. Database Structure

The data are organized around the concept of the *bioverse*, consistently with the application's principle of managing these learning environments on an interactive 2D map and visualizing them in AR. A bioverse is stored in a GeoJSON object with the type "FeatureCollection" [49]. Each POI is a feature that contains a geometry type (point), geographic coordinates, and a set of user-defined properties related to the POI's style, contents, media, positions, and events. As visible in Figure 11, eight tables structure all the data in the database. Among the logged data, user actions are identified by event types such as:

- *biovers-open|biovers-close*: a user opens or closes a bioverse.
- *biovers-enter-poi-708|biovers-exit-poi-708*: a user enters or exits the radius of a POI (+ POI id).
- *open-map|close-map*: a user opens or closes the map while in AR mode.
- *create|update-poi-933*: a user creates or edits a POI (+ POI id).

As visible in Figure 11, the *Bioverse* table is the central element: it is linked to all of the other tables and is therefore aware of all the data. It contains *POIs*, *Paths*, *Events*, and *UserTraces*. Each of these elements may include its own subdata: a *POI* may be enriched by as many *Media* elements as desired. Most objects contain one or more *Coordinates*, either to position them in the AR interface (*POI*, *Path*) or to log the location at which an object was generated by the user (*UserTrace* and *Event*). Finally, these elements are linked to the *User* table to manage creation, deletion, and editing rights.

**Figure 11.** *BiodivAR*'s data structure: The *Bioverse* table in the center contains all the other elements and notably *POIs*, which contain one *Coordinate* and *n Media*.

The data structure aims at supporting and leveraging interoperable standards so as to enable the use and creation of open data sets. The following data can be exported and downloaded as GeoJSON files by users (see Figure 8, item 14); a minimal example follows the list:


• *POIs*: a GeoJSON "FeatureCollection" containing a bioverse's POIs, with their coordinates and user-defined properties (style, media, events).

• *Traces*: a GeoJSON file that contains the trajectories logged while users explore a bioverse in AR mode.

• *Events*: a GeoJSON file that contains user-generated events, covering most of the possible interactions between a user and the system while in AR mode.
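As an illustration, a minimal exported file might look like the following sketch; only the FeatureCollection/Feature/Point structure is mandated by GeoJSON [49], while the property names under `properties` are our own assumptions, not *BiodivAR*'s exact schema:

```json
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [6.641, 46.781] },
      "properties": {
        "title": "Quercus robur",
        "description": "A pedunculate oak next to the trail",
        "radius": 5,
        "scope": 30,
        "media": [
          { "type": "audio", "url": "oak-podcast.mp3", "trigger": "radius" }
        ]
      }
    }
  ]
}
```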

### **4. Geolocation Data Enhancement**

### *4.1. Location-Based AR Bottleneck*

One of location-based AR's main advantages over other types of AR is the possibility to create and anchor augmented objects in a global spatial referential. They can thus be anchored remotely, as opposed to having to be physically on site to anchor content. It can leverage existing geodatabases, which can in turn be enriched for further visualization or analysis in other geoportals. However, as observed during the *BioSentiers* field test [9], as well as by other researchers who have tried to use this technology for educational purposes [4,5,8,10], location-based AR's main drawback is the instability of the anchoring of augmented objects, which results in a degraded user experience. Indeed, a typical mobile device-embedded GNSS sensor has limited accuracy [50]. Under favorable conditions, the error can be as low as one meter; as conditions worsen, it can routinely reach 15 to 30 meters [51]. Unlike 2D interactive maps, which tolerate somewhat imprecise geolocation data and remain usable, location-based AR interfaces running on such data impair the user experience to the point of being inoperable. In comparison, marker-based or SLAM-based AR usually anchors virtual objects with millimeter accuracy [52].
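A back-of-the-envelope calculation (ours, not from the cited sources) illustrates why AR tolerates positioning error so much less gracefully than a 2D map. A position error $e$ perpendicular to the line of sight toward a POI at true distance $d$ shifts the object's apparent bearing by

$$\theta = \arctan\!\left(\frac{e}{d}\right),$$

so a 10 m error on a POI located 5 m away yields $\theta \approx 63^{\circ}$, enough to push the virtual object entirely out of a typical smartphone camera's field of view, whereas the same 10 m offset leaves a 2D map marker perfectly interpretable.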

We believe that the combination of location-based AR with consumer-grade mobile devices' geolocation data may be responsible for part of the usability issues routinely associated with this technology. Furthermore, we believe that this causes an additional adverse side effect: an increase in the time spent interacting with the screen rather than with the surrounding nature (88.5% versus 11.5% during the *BioSentiers* tests). While it is of course necessary for users to interact with the screen to benefit from the potential positive aspects of the ARLE, any added time spent repositioning and reorienting themselves does not serve the learning process and should be prevented if possible. We suspect that geolocation data accuracy may be the primary limiting factor for the normal use of location-based AR. In the course of our project, we will thus explore the use of external GNSS modules in combination with our application. We expect more accurate data to result in improved positioning of geolocated objects in the AR interface.

### *4.2. Geolocation Data Enhancement with RTK*

In an effort to acquire more accurate geolocation data than that provided by mobile devices' embedded GNSS modules, we propose to use external hardware modules for satellite navigation with Real-Time Kinematic (RTK) positioning. RTK is a type of differential positioning system that works with a base station whose actual, accurate position is known. The difference between the known position and the one continuously measured by the base station is known as the range error. The station broadcasts the range error over a standard internet protocol (an NTRIP caster), which makes it accessible to any mobile device equipped with a multi-GNSS receiver and an internet connection (also known as a "rover station"). The receiving mobile device needs to run a third-party open-source NTRIP client application (e.g., SW Maps, Lefebure) to retrieve the range error, or correction data. By removing the range error from each satellite distance measurement made by the multi-GNSS receiver, the application computes adjusted geolocation data and exposes it device-wide as a "mock location." In developer mode, Android devices feature an option that replaces the default geolocation data with mock-location data. For optimal results, the base station and the rover need to be subjected to similar phenomena (atmospheric, meteorological, environmental, etc.), which means they must be within a radius of roughly twenty kilometers of each other. Thanks to this process, the majority of GNSS ambiguities are fixed and the rover station retrieves centimeter-accurate positioning.
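Schematically, and leaving aside the full carrier-phase and double-differencing formulation used by real RTK engines, the correction described above can be written per satellite $s$ as:

$$\varepsilon^{s} = \rho^{s}_{\mathrm{base}} - r^{s}_{\mathrm{base}}, \qquad \rho^{s}_{\mathrm{rover,\,corrected}} = \rho^{s}_{\mathrm{rover}} - \varepsilon^{s},$$

where $\rho^{s}_{\mathrm{base}}$ and $\rho^{s}_{\mathrm{rover}}$ are the distances to satellite $s$ measured by the base and the rover, and $r^{s}_{\mathrm{base}}$ is the geometric distance computed from the base's known position. Because base and rover observe nearly the same atmospheric and clock biases within a few tens of kilometers, subtracting $\varepsilon^{s}$ cancels most of the shared error.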

### 4.2.1. Ardusimple RTK Surveyor Kit

After considering various options, we decided to use a low-cost (399 €) ready-made GNSS/RTK module manufactured by Ardusimple for Android mobile devices [53]. It includes a Funduino processor with a u-blox ZED-F9P multi-GNSS receiver, a helical GNSS antenna (SY-301), and a Bluetooth module. Once the kit is powered and connected to a mobile device via Bluetooth or USB, the NTRIP client application retrieves the correction data from the base station and streams it to the kit's multi-GNSS receiver, which additionally measures satellite distances on its own. The correction data and measured data are integrated in the navigation engine, and the adjusted geolocation data are sent by the processor and its Bluetooth module to the NTRIP client application. The application exposes these data to the entire mobile device through the "mock location" feature. This setup is visible in Figure 12.

**Figure 12.** RTK geolocation data integration from the Ardusimple RTK kit to a mobile device. This diagram details the setup for real-time kinematic positioning: A reference/base station broadcasts correction data (RTCM raw format) through an NTRIP caster to an NTRIP client (mobile application). The data are integrated by the multi-GNSS receiver, which corrects most of the GNSS biases, resulting in centimeter-accurate geolocation. The Funduino processor broadcasts the geolocation data (3DOF) to the mobile device through a Bluetooth module.

### 4.2.2. Data Fusion: SLAM and GNSS

Although geolocation data are important in the context of location-based AR, orientation tracking, i.e., the pitch, yaw (also known as azimuth or heading), and roll data, may play an equally important role in the accurate positioning and anchoring of augmented objects. Azimuth in particular is known to be relatively inaccurate on mobile devices' built-in compasses [54], due to causes ranging from magnetic anomalies created by electronic devices and ferrous materials to electrical or mechanical infrastructure. As a result, an augmented object will diverge from its intended position in the AR interface as the user moves away from its exact location, even when accurate geolocation data are used; inversely, the placement of an augmented object in the interface becomes increasingly accurate as the user approaches its geographic coordinates. Because any small-scale microelectromechanical systems (MEMS) magnetic field sensor (magnetometer) will suffer the same perturbations, the use of a more accurate external magnetometer module does not seem to be an option. Instead, we are laying down a concept for data fusion using SLAM data and geolocation data to improve the positioning and anchoring of the virtual layer all at once. SLAM is the process by which a device in motion simultaneously localizes itself and maps the surrounding environment using optical sensors. When a user moves while holding a device running a WebXR-powered system in AR mode, the device keeps track of their displacement relative to a *local* origin (the place where AR mode was initiated) with great accuracy. Simultaneously, it is able to estimate its *absolute* geolocation with the Geolocation API, based on the device's GNSS sensors or an external RTK module. As soon as enough common (3D) points have been measured in both datums (*local* and *global*) to solve for seven parameters (at least three points), it is possible to use a geometric similarity transformation method (also known as the seven-parameter or Helmert transformation) to derive a rotation matrix, a scale factor, and a translation vector. These allow us to seamlessly convert coordinates from one datum to the other, enabling continuous updates of the position of the augmented space as new measurements are made. Thanks to the rotation matrix, the heading data would be derived without relying on the magnetometers, using only positions measured in both datums. This could potentially solve both the geolocation and heading accuracy issues at once. The implementation of this concept is currently in a pilot phase, in collaboration with Geographic Information System (GIS) specialists from the Territorial Engineering Institute (INSIT).
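Under these definitions, the datum conversion can be written as the standard Helmert similarity transformation (notation ours, not from the source):

$$\mathbf{x}_{\mathrm{global}} = \mathbf{t} + \mu\,\mathbf{R}\,\mathbf{x}_{\mathrm{local}},$$

where $\mathbf{t}$ is the translation vector, $\mu$ the scale factor, and $\mathbf{R}$ the rotation matrix parameterized by three angles (seven parameters in total). The parameters can be estimated by least squares from the point pairs measured in both datums as the user walks, and the yaw angle extracted from $\mathbf{R}$ can then substitute for the magnetometer-based heading.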

### **5. Proposed Methodology for Evaluation**

One of our project's research goals (Section 1.2) is to improve the overall user experience of location-based AR interfaces so that they may achieve their full potential for nature exploration and biodiversity education. In the *BioSentiers* experiment, we observed that the geolocation data caused imprecise placement of the augmented objects in the interface, and that users seemed to spend a considerable amount of time interacting with the screen merely to reorient themselves. We want to determine whether the more accurate geolocation data provided by RTK positioning can help stabilize the AR interface, and whether it has an impact on usability, exploration, and interaction time with the screen.

### *Comparative User Test: Location-Based AR Combined with GNSS Versus RTK*

Between November and December 2022, we conducted a comparative user test (n = 54) with two groups. Half of the participants used the *BiodivAR* application in combination with the Ardusimple RTK kit, while the other half, as a control group, used it with data provided by the devices' embedded GNSS sensors. The participants used the app to explore a biodiversity-themed ARLE for 15 min. During the test, in-app events and geolocated traces were recorded by the application. Forty-seven of the participants also agreed to wear an eye-tracking device (Tobii Pro Glasses 3) that captured their gaze direction in order to measure how much time they spent interacting with the screen versus nature. Directly after the test, participants answered an online survey containing a demographic questionnaire, an open question, and three standardized usability questionnaires.


With the data collected through the questionnaires, we intend to obtain an overall evaluation of the system we developed, as well as more specific observations on the impact of the different geolocation data sources. The eye-tracking data will allow us to compare how much time users in each group interacted with the screen versus nature. The in-app events and geolocated traces will allow us to compute variables such as the total distance traveled, the time spent within or between POIs, and how long users used the interactive 2D map. We will investigate the role played by these independent variables (interaction time, total distance, number of POIs visited, etc.) on user-reported usability by means of multiple linear regression, as sketched below. Through this process, we expect to further observe the impact of geolocation data on usability. Finally, thanks to the unstructured feedback gathered through the open question, we hope to improve the *BiodivAR* application as much as possible before it is made available to a learning audience. The data collected during this test are currently being processed and the findings are not yet known. The design of the test is visible in Figure 13.
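The planned analysis corresponds to a standard multiple linear regression of the form (predictor names are ours, for illustration):

$$\mathrm{Usability}_i = \beta_0 + \beta_1\,\mathrm{InteractionTime}_i + \beta_2\,\mathrm{TotalDistance}_i + \beta_3\,\mathrm{POIsVisited}_i + \cdots + \epsilon_i,$$

where $\mathrm{Usability}_i$ is participant $i$'s questionnaire score, the $\beta_j$ are coefficients estimated by ordinary least squares, and $\epsilon_i$ is the error term; group membership (GNSS versus RTK) can additionally be included as a binary predictor.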

**Figure 13.** Comparative user test design: an experimental group used the *BiodivAR* application for 15 min in combination with RTK data while the control group used it with GNSS data. Various data were collected during and after the test to help observe the impact of geolocation data on usability.

### **6. Conclusions**

In this paper, we present the UCD approach that has shaped the development of an original authoring application for location-based AR experiences. Its architecture handles various types of media for the customization of interactive, augmented points of interest triggered by user location. The system allows an immersive visualization of media in geographical space, through a weighted use of geolocation, computer vision, and inertial sensor data. Its end users may use it in various contexts such as tourism, education, infrastructure engineering, immersive cartography for navigation, geofencing, etc. The UCD approach presented allowed us to tailor the system to the end users' needs while achieving the first two of our project's goals, namely the creation of a versatile authoring tool usable by non-specialists, and its use to design engaging AR experiences featuring geolocated media that populate the geographical space. While only a complete analysis of the comparative user test results will unlock the fulfillment of the remaining goals, the fact that we conducted the test keeps us on a steady track and allows us to expect results soon. Furthermore, running the test put the system into practice, which was an enriching experience in itself. Considering previous efforts for location-based AR in education, several meaningful outcomes were achieved:


1. We leveraged several UCD methods to tailor the application to fit the needs and expectations of our target users. We introduced the use of an external RTK module to try and address the inaccuracy of geolocation data and the usability issues it induces.
2. While the previous prototype had hard-coded geolocated POIs, the current application allows the creation and editing of augmented objects by unskilled users. The cartographic interface with online editing features makes it possible for anyone without advanced programming skills to create AR experiences. The data structure we conceptualized enables the creation and editing of advanced, customizable, and interactive geolocated POIs.

3. Finally, we conducted a comparative test to assess our system's usability and to understand the impact of different geolocation data sources on usability. Eye-tracking data and geolocated traces will provide us with a multitude of unique points of view on how the application is actually used by participants. Following a typical iterative design methodology, we will use user feedback to help us further refine and improve the application in a subsequent iteration.

In the coming months, we will process and analyze the results of the comparative user test we conducted. This will be informative in many ways, beginning with the relevance of using external RTK modules in our future experiments. As of now, we do not know whether the use of RTK data in combination with location-based AR brought any enhancement in usability. Even if there are differences between our two groups, they may not be statistically significant, because the anchoring of the POIs in the AR interface may also be impacted by the orientation data. Because we think that the use of location-based AR in combination with standard GNSS data can cause usability problems and is the main limiting factor for broad adoption, our future efforts will focus on developing solutions to these problems. There are encouraging signs, such as the recent publication of Google's Geospatial API and Visual Positioning System (VPS), which is based on trillions of point clouds harvested over the last 15 years for the Google Street View service [58]. It retrieves accurate positioning by comparing the device's camera view with this huge database, after initial filtering based on the user's approximate geolocation. While this approach is very promising for solving the anchoring accuracy problem, it will only be available in locations where Google Street View cars were able to collect data. There will thus always remain large non-urban areas where this method is unavailable. We therefore remain hopeful that the development of a fusion algorithm for SLAM and GNSS data will meet a certain type of need.

Following the analysis of our initial comparative user test results, we have planned to conduct a new test with pupils to evaluate the effectiveness of location-based AR in supporting nature exploration and biodiversity education. We are currently recruiting middle-school teachers to that end and we plan to begin the tests from June 2023 onward.

Several questions remain open that may provide guidance for future research. For example, we have not investigated the importance of visual variables for the differentiation of various POIs, although their impact on usability is documented [46]. When visualized on an AR interface, 3D cartographic symbols simulate their belonging to the user's spatial referential. Prior knowledge on the semiology of graphics may not apply in the same way it does to two-dimensional maps. While it is currently beyond our study's scope, we believe our tool would lend itself well to investigating the impact of cartographic design on usability. Researchers have made observations on the usefulness of various animated cartographic symbols by comparing objective data (effectiveness in completing a task) with expert opinion [59]. With a comparable approach in mind, it would be interesting for us to compare the usability scores obtained from self-reported user data with objective effectiveness, as measured by quantitative data generated by those same users (geolocated trace logs and eye-tracking data).

**Author Contributions:** Conceptualization, Julien Mercier; methodology, Julien Mercier; software, Nicolas Chabloz, Gregory Dozot and Julien Mercier; validation, Erwan Bocher, Olivier Ertz and Daniel Rappo; formal analysis, Julien Mercier; investigation, Julien Mercier; resources, Olivier Ertz and Daniel Rappo; data curation, Julien Mercier; writing—original draft preparation, Julien Mercier; writing—review and editing, Julien Mercier, Nicolas Chabloz, Olivier Ertz, Erwan Bocher and Daniel Rappo; visualization, Julien Mercier; supervision, Erwan Bocher, Olivier Ertz and Daniel Rappo; project administration, Olivier Ertz and Daniel Rappo; funding acquisition, Olivier Ertz and Daniel Rappo. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Swiss National Science Foundation (SNSF) as part of the NRP 77 "Digital Transformation" (project number 407740\_187313) and by the University of Applied Sciences and Arts Western Switzerland (HES-SO): Programme stratégique "Transition numérique et enjeux sociétaux".

**Informed Consent Statement:** Study participation was voluntary, and written informed consent to publish this paper was obtained from all participants involved in the study. Participants were informed that they could withdraw from the study at any point.

**Data Availability Statement:** The data presented in this study are openly available on Zenodo at https://doi.org/10.5281/zenodo.6542781 (accessed on 28 December 2022).

**Acknowledgments:** Thanks to Sébastien Guillaume and Kilian Morel of the Institute of Territorial Engineering (INSIT) for their help with implementing the RTK positioning systems and for their continuous collaboration on geographic information science matters.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

### **Abbreviations**

The following abbreviations are used in this manuscript:

AR: Augmented Reality; ARLE: Augmented Reality Learning Experience; API: Application Programming Interface; CS: Citizen Science; edTech: Educational Technology; GIS: Geographic Information System; GNSS: Global Navigation Satellite System; INSIT: Territorial Engineering Institute; MEI: Media Engineering Institute; MEMS: Microelectromechanical Systems; NTRIP: Networked Transport of RTCM via Internet Protocol; ORM: Object–Relational Mapping; POI: Point of Interest; RTK: Real-Time Kinematic; SLAM: Simultaneous Localization and Mapping; UCD: User-Centered Design; VPS: Visual Positioning System.


### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
