**High Resolution Geospatial Evapotranspiration Mapping of Irrigated Field Crops Using Multispectral and Thermal Infrared Imagery with METRIC Energy Balance Model**

**Abhilash K. Chandel 1,2, Behnaz Molaei 1,2, Lav R. Khot 1,2,\*, R. Troy Peters 1,2 and Claudio O. Stöckle 2,\***


Received: 29 July 2020; Accepted: 28 August 2020; Published: 1 September 2020

**Abstract:** Geospatial crop water use mapping is critical for field-scale site-specific irrigation management. Landsat 7/8 satellite imagery with the widely adopted METRIC (Mapping Evapotranspiration at high Resolution with Internalized Calibration) energy balance model (LM approach) estimates evapotranspiration (ET) accurately but limits field-scale spatiotemporal mapping (30 m pixel⁻¹, ~16 days). A study was therefore conducted to map actual ET of commercially grown irrigated field crops (spearmint, potato, and alfalfa) at very high resolution (7 cm pixel⁻¹). Six small unmanned aerial system (UAS)-based multispectral and thermal infrared imagery campaigns were conducted (two per crop) at the same time as the Landsat 7/8 overpass. Three variants of the METRIC model, UAS-METRIC-1, -2, and -3 (UASM-1, -2, and -3), were used to process the UAS imagery, and the outputs were compared with the standard LM approach. ET root mean square differences (RMSD) between the LM and UASM-1, -2, and -3 outputs were in the ranges of 0.2–2.9, 0.5–0.9, and 0.5–2.7 mm day⁻¹, respectively. These differences resulted mainly from the internal calibrations and sensible heat flux estimates. UASM-2 had the highest similarity with the LM approach (RMSD: 0.5–0.9 mm day⁻¹, daily ET departures ETdep,abs: 2–14%, Pearson correlation coefficient *r* = 0.91). Strong ET correlations between the UASM and LM approaches (*r*: 0.7–0.8, 0.7–0.8, and 0.8–0.9 for the spearmint, potato, and alfalfa crops, respectively) suggest that the UASM approaches are as suitable as LM for mapping ET of a range of similar crops. The UASM approaches (coefficient of variation, CV: 6.7–24.3%), however, outperformed the LM approach (CV: 2.1–11.2%) in mapping spatial ET variations, owing to the much larger number of pixels. On-demand UAS imagery may thus help derive high-resolution site-specific ET maps to aid growers in timely crop water management.

**Keywords:** actual evapotranspiration; high spatiotemporal resolution; multispectral imagery; thermal infrared imagery; METRIC energy balance model; irrigated field crops

#### **1. Introduction**

Satellite-based remote sensing (RS) has been extensively used with energy balance models for regional-scale evapotranspiration (ET) mapping [1–8]. One such widely adopted model is Mapping ET at high Resolution with Internalized Calibration (METRIC) [9,10]. The METRIC model uses Landsat 5/7/8 or Terra and Aqua satellite-based multispectral and thermal infrared imagery to compute ET as an energy balance residual. METRIC is advantageous for its independence from surface conditions, stabilized sensible heat flux (H) estimation, and internal calibration using hot and cold anchor pixels. This internal calibration compensates for computational biases arising from atmospheric dynamics, surface albedo, net radiation (Rn), surface temperature (Ts), air temperature (Ta), soil heat flux (G), and wind speed (u). The METRIC model has been evaluated for a wide range of irrigated field and orchard/tree crops across different agroclimatic zones, with studies reporting ET estimation errors in the range of 1–11% [11–22].

Conventional METRIC uses high-orbiting satellite imagery and as a result produces ET maps of low spatial (~30 m pixel⁻¹ for Landsat and 1 km pixel⁻¹ for the Terra and Aqua satellites) and temporal (~16 days for Landsat) resolution. Cloud cover is an additional challenge. Alternatively, ET mapping can be improved spatiotemporally using unmanned aerial system (UAS)-based imagery [23–25] that can be collected on-demand [7,26–29]. Recent UAS technology advancements have enabled rapid assessment of crop vigor and soil characteristics, crop water requirements, disease infestation, and yield prediction [23,26–30]. Pertinent to ET estimation, Chavez et al. [27] used airborne imagery data with METRIC and reported daily ET errors of 9% for corn and sorghum grown on experimental plots. Similarly, Ortega-Farias et al. [31] estimated ET using UAS imagery for a drip-irrigated olive orchard and reported maximum errors below 7%. Brenner et al. [32] estimated latent heat fluxes (LE) from an energy balance model for irrigated grassland and observed errors below 10%. They also reported strong correlations of up to 0.9 between estimated LE and that measured with an eddy covariance system. Furthermore, Paul et al. [33] reported the suitability of UAS-based imagery in conjunction with the METRIC model to estimate ET with maximum errors below 11% for irrigated forage, corn, and sorghum crops. All these studies have experimentally illustrated the potential of UAS-based RS data for accurate ET estimation. However, ET mapping of commercial crop fields for grower-oriented, site-specific irrigation management has been limited. Additionally, the differences in spatial ET mapping due to in-field elevation variability and near-site meteorological inputs have not been evaluated with respect to the conventional satellite-based METRIC approach.

Agricultural growers typically use standard crop coefficients or point soil moisture measurements for irrigation scheduling and supply water at a constant rate to the entire field. This results in either under- or over-irrigation. Satellite imagery also restricts site-specific irrigation management due to its low spatiotemporal resolution. Alternatively, ET mapping with UAS imagery could be directly applicable for growers. This study therefore assesses the suitability of UAS-based imagery to geospatially map actual ET at very high resolution relative to satellite-based imagery. The spatial ET differences resulting from local field conditions and near-site meteorological inputs are also assessed. The specific objectives are: (1) ET mapping of irrigated spearmint, potato, and alfalfa crops under commercial operations using UAS-based multispectral and thermal infrared imagery with variants of the METRIC energy balance model, and (2) parametric comparison of the model outputs and of the spatial ET variation assessment potential of UAS imagery with that of satellite (Landsat 7/8) imagery.

#### **2. Materials and Methods**

#### *2.1. Field Sites*

Commercial sites (Table 1) planted with spearmint (cultivar (cv.) Native) in Toppenish, potato (cv. Russet Burbank) in Paterson, and alfalfa (cv. FD-4) in Prosser, all in Washington state, were selected (Figure 1). All sites received limited rainfall (<60 mm); seasonal mean air temperatures up to 20 °C and cumulative reference ET between 1000 and 1200 mm were observed.


**Table 1.** Details of the field site and weather parameters for the 2018 growing season.

**Figure 1.** Geolocations of (**a**) spearmint, (**b**) potato, and (**c**) alfalfa crop study sites (Source: Google Maps).

#### *2.2. Data Acquisition Campaigns*

#### 2.2.1. UAS-Based Imagery

A UAS (ATI AgBOT™, Aerial Technology International, Wilsonville, OR, USA) was deployed for imagery data acquisition (Figure 2). On-board was a multispectral imaging sensor (RedEdge 3, MicaSense, Inc., Seattle, WA, USA) with Blue (B, 475 ± 10 nm), Green (G, 560 ± 10 nm), Red (R, 668 ± 5 nm), Red Edge (RE, 717 ± 5 nm), and Near Infrared (NIR, 840 ± 20 nm) wavebands, and a radiometrically calibrated thermal infrared imaging sensor (11,000 ± 3000 nm, Tau 2 640, FLIR Systems, Wilsonville, OR, USA). Image-specific geotags for the multispectral sensor were received from the global positioning system (GPS) receiver on-board the UAS, and those for the thermal sensor from an independent GPS receiver (ThermalCapture GPS, TeAx Technology GmbH, Wilnsdorf, Germany). Also on-board the UAS was a downwelling light sensor (DLS, MicaSense, Inc., Seattle, WA, USA) facing skyward to embed solar irradiance data in the imagery during flights.

UAS flights were configured using ground control software (MissionPlanner, version 1.3.49, Ardupilot, USA) for an altitude of 100 m above ground level (AGL) to acquire multispectral and thermal images at ground sampling resolutions of 7 and 13 cm pixel⁻¹, respectively. The software aided in configuring the multispectral sensor to acquire images at 85% front and 75% side overlaps. The thermal imaging sensor was configured independently to acquire frames at 3 Hz to ensure image overlaps of 90–95%. Imagery data were stored on-board in the respective memory cards of the two sensors. A calibrated reflectance panel (CRP, MicaSense, Inc., Seattle, WA, USA) was imaged before and after each flight, and these reference images were used for radiometric calibration of the multispectral imagery (Figure 2).

A total of six UAS flights were conducted in the summer of 2018 (two missions/site × 3 sites) on the dates and at approximately the times of the Landsat 7/8 overpasses. UAS imagery for the spearmint crop was collected 10 days before the first harvest (dataset-1) and 37 days before the second harvest (dataset-2). Imagery for the potato crop was acquired 72 days (dataset-3) and 48 days before harvest (DBH, dataset-4). Data for the alfalfa crop were collected 2 days before the first harvest (dataset-5) and 7 days before the second harvest (dataset-6).

**Figure 2.** High-resolution imagery campaigns with optical sensors integrated with the small unmanned aerial system.

2.2.2. Satellite-Based Imagery and Weather Data

Landsat 7/8 imagery datasets and 1-arc-second SRTM (Shuttle Radar Topography Mission) digital elevation models (DEM) were downloaded for the data collection days (Table 2). Weather data logged every 15 min were downloaded from the nearest (1–5 km) stations (AgWeatherNet, Washington State University, WA, USA). All data collected on a given day were grouped, yielding a total of six datasets (Table 2) ready for analysis.


**Table 2.** Summary of datasets used as inputs to METRIC energy balance.

DOY: day of year, DBH: days before harvest, ✓: subset data; scene metadata: date, time, sun azimuth, and solar elevation angles during flights.

#### *2.3. Imagery Analysis and METRIC Models Implementation*

#### 2.3.1. Preprocessing

UAS imagery was initially processed to obtain orthomosaics of the study site(s) through a series of image stitching operations (Figure 3a) in photogrammetry and mapping software (Pix4D Mapper, Pix4D, Inc., Lausanne, Switzerland). The surface temperature orthomosaic was then georeferenced and resampled to the multispectral orthomosaic (7 cm pixel⁻¹) in the Quantum Geographic Information System (QGIS) platform (ver. 2.18.16, open source). The "Georeferencer" tool with "Thin Plate Spline" as the transformation type was used for georeferencing, and the "Nearest Neighborhood" method was used for resampling. Five surface reflectance (B, G, R, RE, and NIR (Figure 3b)), one surface temperature (Figure 3c), and one DEM orthomosaic were obtained for each UAS mission.

**Figure 3.** Images showing (**a**) preprocessing flow chart and unmanned aerial system imagery derived (**b**) near-infrared, and (**c**) temperature orthomosaics of the alfalfa field site (DBH: 2, DOY: 191).
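As an illustration of the resampling step, the sketch below nearest-neighbor resamples a coarser single-band raster onto a finer grid using plain NumPy. It assumes the two orthomosaics are already co-registered over the same extent (as done here in QGIS); the array sizes and temperature values are hypothetical.

```python
import numpy as np

def resample_nearest(src: np.ndarray, dst_shape: tuple) -> np.ndarray:
    """Nearest-neighbor resampling of a single-band raster to a new grid.

    Assumes both grids cover the same georeferenced extent, i.e., the
    orthomosaics are already co-registered.
    """
    rows = np.round(np.linspace(0, src.shape[0] - 1, dst_shape[0])).astype(int)
    cols = np.round(np.linspace(0, src.shape[1] - 1, dst_shape[1])).astype(int)
    return src[np.ix_(rows, cols)]

# Hypothetical example: a 13 cm pixel surface temperature orthomosaic (K)
# resampled onto the 7 cm multispectral grid.
ts_13cm = np.random.uniform(290.0, 320.0, size=(540, 540))
ts_7cm = resample_nearest(ts_13cm, (1003, 1003))
```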

#### 2.3.2. METRIC Model Implementation

The METRIC model is detailed in Allen et al. [9]. Briefly, the model computes Rn, G, and H within the blending height (100–200 m AGL), and the residual LE can be converted to actual water loss to the atmosphere (ET). METRIC eliminates the need for absolutely accurate surface temperature (Ts) measurements. It uses ground weather station-based reference ET (ETr, alfalfa based) for internal calibration and reduction of biases related to satellite imagery. The internal calibration indexes near-surface temperature gradients to the radiometric Ts using extreme hot and cold anchor pixels. The model was primarily developed for agricultural fields and does not need crop-specific inputs. This enhances METRIC's applicability to high-resolution RS data. Small UAS-based RS data capture distinct features of soil, vegetation, or mixed regions, which can improve the internal calibration of the METRIC model with respect to the study environment [26–29,33].
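For orientation, a minimal per-pixel sketch of the residual computation and its conversion to daily ET is given below, following the general METRIC formulation in Allen et al. [9]. The latent-heat-of-vaporization expression and the ETrF-based extrapolation to 24 h are the standard METRIC steps; the variable names are illustrative.

```python
import numpy as np

def metric_daily_et(rn, g, h, ts, etr_inst, etr_24):
    """METRIC-style daily ET from the energy balance residual (sketch).

    rn, g, h : flux arrays in W m-2; ts : surface temperature in K;
    etr_inst : instantaneous alfalfa reference ET in mm h-1;
    etr_24   : daily alfalfa reference ET in mm day-1.
    """
    le = rn - g - h                                # latent heat flux, W m-2 (residual)
    lam = (2.501 - 0.00236 * (ts - 273.15)) * 1e6  # latent heat of vaporization, J kg-1
    et_inst = 3600.0 * le / (lam * 1000.0)         # instantaneous ET, mm h-1 (rho_w = 1000 kg m-3)
    etrf = et_inst / etr_inst                      # reference ET fraction
    return etrf * etr_24                           # daily ET, mm day-1
```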

Conventional METRIC maps regional ET for a 185 km × 185 km scene size, and all its assumptions might not necessarily apply at the field scale. Therefore, the METRIC model was implemented with three variant approaches, named UAS-METRIC-1, -2, and -3 (UASM-1, -2, and -3), with the pertinent modifications summarized in Table 3. UASM-1 (Figure 4) was implemented to isolate relative differences due only to UAS imagery-specific inputs, which include (i) scene metadata from the flight missions, (ii) surface albedo from the on-board multispectral imager, and (iii) a high-resolution DEM. All other equations were identical to those of conventional METRIC. Similar to LM, leaf area index (LAI) in UASM-1 was calculated using the soil adjusted vegetation index (SAVI) with a background adjustment factor (L) of 0.1 for high crop cover [34,35]. UASM-2 was employed for two reasons: (i) to explore reducing model input data needs from the grower's perspective and (ii) to assess the differences caused by the DEM for flat agricultural fields in non-mountainous regions. Implementation of UASM-2 was identical to UASM-1 except that UASM-2 used a flat DEM representing local field conditions. This scenario forced surface slopes and aspects to zero and set elevation to the mean of all pixels in the DEM.

**Figure 4.** Flow chart for processing UAS-based imagery data with Mapping Evapotranspiration at High Resolution with Internalized Calibration (METRIC) energy balance model.

UASM-3 considered site-specific energy balance inputs typical of the agricultural field plots imaged in this study. Pertinent modifications include (i) DEM as per field conditions (as in UASM-2), (ii) LAI calculation from spatial fraction canopy cover (FCC), (iii) incoming shortwave radiation (ISWR) from the nearest open-field weather station, (iv) incoming longwave radiation (ILWR) calculation using air temperature (Ta), and (v) momentum roughness length (MRL) calculated without surface slope (S) adjustments. The LAI calculation in UASM-3 was based on the fact that surface features (i.e., soil, vegetation, and mixed) can be distinctly captured in every pixel [28–30] of high-resolution imagery. FCC was first derived from the normalized difference vegetation index (NDVI) [28,36,37], and soil was segmented using an NDVI vegetation threshold mask (>0.3) to ensure LAI was computed for vegetation only. ISWR measurements from the nearest weather station can be assumed to represent actual shortwave radiation at the surface because these are open-field stations installed at 2 m AGL and located within a 5 km radius of the study sites. Moreover, the conventional ISWR calculation may be affected by cloud cover and other atmospheric dynamics [38,39]. Ta from the nearest weather station was used to calculate ILWR (Stefan-Boltzmann's law, [40]) and can be assumed constant over a field.
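The two LAI pathways can be sketched as follows. The SAVI-to-LAI curve is the commonly published METRIC formulation (capped at 6 m² m⁻²), and the FCC-to-LAI step uses a Beer's-law inversion with the solar extinction coefficient K; the linear NDVI-to-FCC scaling, K = 0.5, and the NDVI endpoints are illustrative assumptions rather than the exact coefficients used in this study.

```python
import numpy as np

def lai_from_savi(nir, red, l=0.1):
    """LM/UASM-1/-2 pathway: SAVI (L = 0.1 for high cover [34,35]) to LAI
    using the commonly published METRIC curve, capped at 6 m2 m-2."""
    savi = (1.0 + l) * (nir - red) / (nir + red + l)
    savi = np.clip(savi, None, 0.689)            # keep the log argument positive
    lai = -np.log((0.69 - savi) / 0.59) / 0.91
    return np.clip(lai, 0.0, 6.0)

def lai_from_fcc(nir, red, k=0.5, ndvi_soil=0.3, ndvi_full=0.9):
    """UASM-3 pathway: NDVI -> soil mask (NDVI > 0.3) -> FCC -> LAI via
    Beer's-law inversion; scaling endpoints and k are illustrative."""
    ndvi = (nir - red) / (nir + red)
    fcc = np.clip((ndvi - ndvi_soil) / (ndvi_full - ndvi_soil), 0.0, 0.99)
    lai = -np.log(1.0 - fcc) / k                 # Beer's law inversion
    return np.where(ndvi > ndvi_soil, lai, 0.0)  # vegetation-only pixels
```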

For comparison, Landsat imagery was processed through the conventional METRIC model as implemented in an R package ("water", [41,42]), hereafter abbreviated as the LM (Landsat-METRIC) approach. Level-2 surface reflectance products were downloaded for Landsat 8, while such products were calculated within the package for the Landsat 7 datasets. Similar to the LM approach, all UASM approaches performed an energy balance within the blending height and used the conventional automated approach for selecting anchor pixels.

#### *2.4. Output Comparisons*

This study aimed at evaluating spatial ET maps from UAS-based remote sensing (RS) data relative to satellite-based RS data for crops grown under commercial farming operations. As the conventional METRIC model is well established for spatial ET mapping, having been corroborated in over 25 countries and in almost all types of agroclimatic zones [22], additional field validation was deemed unnecessary, i.e., outside the scope of this study. Thus, we did not conduct ground-reference ET measurements.



**NDVI**: normalized difference vegetation index, **SAVI**: soil adjusted vegetation index, **K**: solar extinction coefficient, **RNIR**: reflectance in the near-infrared band, **RR**: reflectance in the red band, **L**: soil background adjustment factor, **θrel**: angle of incidence, **τsw**: atmospheric transmissivity, **d**: relative earth–sun distance, **Gsc**: solar constant, **εa**: atmospheric emissivity, **σ**: Stefan-Boltzmann's constant, **Ts**: surface temperature, **Ta**: air temperature, **S**: surface slope derived from the DEM, and **Zom,mtn**: momentum roughness length adjusted for varying surface elevation.

Since the energy balance computations with the LM approach were performed for images of 180 km × 180 km dimension, intermediate and final output maps were clipped to the study site areas covered by the UAS flights. The mean, standard deviation (SD), and coefficient of variation (CV, %) were calculated to assess the spatial-variability-capturing potential of ET pixels from the LM (ETLM) and UASM-1, -2, and -3 (ETUASM) approaches. Percentage absolute departures of mean ET (ETdep,abs, Equation (1)) of all UAS-based approaches with respect to the LM approach were also calculated. Next, ET maps from all UAS-based approaches were resampled to the resolution of the ET maps from the LM approach (30 m pixel⁻¹) using the "Nearest Neighborhood" method, and region of interest (ROI) samples were randomly extracted. Sample ET from these ROIs was compared using the root mean square difference (RMSD) and Pearson's linear correlation (*r*) at 5% significance (RStudio, Inc., Boston, MA, USA).

$$ET_{dep,abs}\,(\%) = \frac{\left|ET_{UASM} - ET_{LM}\right|}{ET_{LM}} \times 100 \tag{1}$$
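A minimal sketch of these comparison metrics over paired ROI samples (both maps already on the 30 m grid) might look as follows; the array names are illustrative.

```python
import numpy as np

def compare_et(et_uasm, et_lm):
    """Comparison metrics used in this study over paired ROI samples."""
    et_dep_abs = abs(et_uasm.mean() - et_lm.mean()) * 100.0 / et_lm.mean()  # Eq. (1), %
    rmsd = float(np.sqrt(np.mean((et_uasm - et_lm) ** 2)))                  # mm day-1
    r = np.corrcoef(et_uasm.ravel(), et_lm.ravel())[0, 1]                   # Pearson's r
    cv = 100.0 * et_uasm.std() / et_uasm.mean()                             # spatial CV, %
    return et_dep_abs, rmsd, r, cv
```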

#### **3. Results**

Major energy balance computations from the UASM and LM approaches are presented in the following subsections. Results pertinent to spearmint (Table 4, Figures 5a–d and 6), potato (Table 5, Figures 5e–h and 7), and alfalfa crops (Table 6, Figures 5i–l and 8) are also presented.

#### *3.1. Crop Vigor*

The UAS imagery mapped crop vigor indicators of SAVI (mean: 0.7–0.9, SD: 0.1–0.2) and NDVI (mean: 0.8–0.9, SD: 0.1–0.2) similar to the Landsat imagery-mapped SAVI (mean: 0.7–0.85, SD: ~0.02) and NDVI (mean: 0.8–0.9, SD: 0.02–0.1). The mean LAI from the LM approach ranged between 4–6 m² m⁻², and that from UASM-1/-2 and -3 ranged from 5.1–5.8 m² m⁻² and 3.5–5.9 m² m⁻², respectively. Mean FCC ranges were 0.91–0.93 (SD: ~0.14) for the spearmint, 0.92–0.93 (SD: ~0.1) for the potato, and 0.8–0.9 (SD: 0.1–0.24) for the alfalfa fields. All these estimates indicate the high-density vegetation also observed at the sites.

#### *3.2. Net Radiation, Soil Heat and Sensible Heat Fluxes*

The mean ISWR ranges from the LM, UASM-1, -2, and -3 approaches were 819–903, 817–927, 818–921, and 737–949 W m⁻², respectively. Mean ILWR ranges from those approaches were 337–396, 296–347, 296–347, and 321–375 W m⁻², respectively. Mean outgoing longwave radiation (OLWR) from the LM, UASM-1, -2, and -3 approaches ranged between 440–514, 381–451, 381–451, and 381–451 W m⁻², respectively, and Rn ranges were 508–594, 596–685, 596–685, and 564–716 W m⁻². Mapped G from the LM, UASM-1, -2, and -3 approaches was in the ranges of 23.8–63.6, 19.4–35.9, 19.4–35.9, and 18.2–36.1 W m⁻², respectively, and mean H ranges were 60–181, 114–237, 126–250, and 127–233 W m⁻², respectively.

#### *3.3. Daily Evapotranspiration*

The mean ranges of mapped instantaneous ET (ETinst) from the LM, UASM-1, -2, and -3 approaches were each 0.5–0.7 mm h⁻¹, and the pertinent ETrF ranges were 0.7–0.9, 0.74–0.95, 0.7–1, and 0.7–0.93, respectively. ETr24 for the six datasets was 9.4, 5.2, 6.3, 6.5, 8.8, and 8.5 mm day⁻¹. The means of mapped daily ET from the LM, UASM-1, -2, and -3 approaches were in the ranges of 4–7, 4.3–8.4, 4.1–6.7, and 3.9–8.2 mm day⁻¹, respectively. Ranges of absolute departures of daily ET (ETdep,abs) from UASM-1, -2, and -3 were 0.4–20.8, 2.0–14.4, and 2.2–21% of ETLM, respectively, for the selected crops. The highest correlations and minimum RMSDs (Figures 6–9) were observed for the UASM-2 (*r*: 0.66–0.9, RMSD: 0.5–0.9), followed by the UASM-1 (*r*: 0.65–0.9, RMSD: 0.2–2.9) and UASM-3 (*r*: 0.65–0.9, RMSD: 0.5–2.7) approaches, relative to the LM approach.






#### **4. Discussion**

#### *4.1. Crop Vigor*

High spatial variability in crop vigor was captured by the UAS imagery. A soil background adjustment factor (L) of 0.1, as in the LM approach, produced similar SAVI from the UAS imagery for the selected crops [34,35]. Higher NDVI compared to SAVI for some datasets may be due to NDVI saturation effects, which SAVI mitigates [43]. Soil-segmented FCC mapped similar LAI from UASM-3 as from the LM approach. MRL estimates from the UAS and Landsat imagery were also similar. Slight differences in MRL between the UASM-1 and -2 approaches were mainly due to the surface slope factor in UASM-1, and those between the UASM-1 and -3 approaches were due to the LAI and DEM inputs.

#### *4.2. Net Radiation and Soil Heat Flux*

The ISWR from the LM, UASM-1, and -2 approaches was similar, with slight differences due to the DEM inputs. ISWR from the UASM-3 approach differed because of its direct measurement source (the nearest weather station). Such ISWR measurements may be considered closer to actual values than the standard computations because the weather stations were installed near the study sites (within 5 km) at approximately 2 m height [38,39]. Higher ILWR estimates from the LM approach compared to UASM-1 and -2 were primarily due to higher Ts. Minor ILWR differences between the LM and UASM-1 or -2 approaches would be due to the slightly different DEMs used to calculate atmospheric transmissivities. The ILWR computation in the UASM-3 approach used Ta as the input (Stefan-Boltzmann's law) instead of Ts [40], and using a single Ta value across a field may be assumed realistic. OLWR differences between the LM and UASM approaches were primarily due to differences in Ts from the pertinent imaging systems. Similar OLWR from the UASM-1, -2, and -3 approaches showed no effect of the LAI differences, which contribute only 1% to the surface emissivity calculations. Rn and G estimate differences between the UASM and LM approaches can be attributed to the differences between the ISWR, ILWR, OLWR, and surface albedo computations. The absence of Rn and G differences between the UASM-1 and UASM-2 approaches reflects the negligible differences between their ISWR, ILWR, and OLWR estimates, while such differences between the UASM-1 and UASM-3 approaches were due to the ISWR and ILWR differences. Since high LAI was observed in all imaging campaigns, somewhat larger variations in OLWR and Rn may be expected during the initial days after planting (low LAI).
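For reference, the ILWR step reduces to a one-line application of the Stefan-Boltzmann law; the emissivity value below is an illustrative constant (METRIC derives the atmospheric emissivity εa from the atmospheric transmissivity τsw).

```python
SIGMA = 5.67e-8                       # Stefan-Boltzmann constant, W m-2 K-4

def ilwr_from_ta(ta_k, eps_a=0.85):
    """Incoming longwave radiation from a single air temperature (K),
    assumed constant across the field; eps_a is illustrative."""
    return eps_a * SIGMA * ta_k ** 4  # W m-2

print(ilwr_from_ta(290.0))            # ~341 W m-2, within the UASM-3 range reported above
```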

#### *4.3. Sensible Heat Flux*

Differences between the H estimates from the LM and UASM approaches were primarily due to Ts differences and secondarily due to the different internal calibrations. As the Landsat 7/8 imagery has coarse resolution (~30 m pixel⁻¹), it might not be readily possible to find uniform hot and cold anchor pixels. This demands additional flexibility in anchor pixel identification, as also suggested by Jaafar et al. [42]. Contrarily, it was easier with the UAS-based approaches (~7 cm pixel⁻¹) to identify anchor pixels within the imaged field area (Table 7). The irrigated crops would also have facilitated identification of highly transpiring cold anchor pixels, akin to the reference alfalfa. H differences between UASM-1, -2, and -3 could be due to the LAI, MRL, Rn, and G differences at the different cold pixels, which resulted in different internal calibrations.

#### *4.4. Daily Evapotranspiration*

Similar ET was mapped by all UASM approaches compared to the LM approach (Figures 5–9). However, the UASM approaches showed larger potential for assessing spatial variations in field ET maps (Figure 10) due to the large number of pixels [21]. The ET differences between the LM, UASM-1, and -3 approaches were large for dataset-5 (alfalfa field) and dataset-3 (potato field), primarily due to internal calibration, H, and Rn differences. The ETrF for the alfalfa crop in dataset-6 was higher than in dataset-5, reflecting the difference in crop maturity between the two acquisitions (2 DBH vs. 7 DBH). Notable ETrF differences between the LM and UASM approaches could also be due to accumulated differences between the intermediate outputs. ETrFs for the selected crops did not exceed 1.1, complying with the cold anchor pixel assumption (1.05 × ETr, [9]). The UASM-3 approach (METRIC model modified to local conditions) can also provide geospatial ET maps as reliable as LM. As also reported by Paul et al. [33], G and H differences between the LM and UAS approaches did not influence the daily ET maps for irrigated field crops due to their small magnitudes. The UASM approaches indicate their versatility for geospatial ET mapping for a range of crops similar to spearmint, potato, and alfalfa (*r*: 0.8–0.9). However, additional parametrization may be needed for different crop physiologies [21,27]. An average 10% uncertainty of the UASM approaches for point-scale ET might be acceptable, as discussed in prior research studies that used similar UAS imagery with the METRIC model [29,33]. Flight missions within solar noon ± 2 h could have produced comparable and reasonable ETrF values for extrapolation to 24-h ET [44].


**Table 7.** Details of hot and cold anchor pixels identified in the LM and UASM approaches.

**Figure 10.** Assessment of the spatial daily ET variation potential of the LM, UASM-1, -2, and -3 approaches.

#### **5. Conclusions**

Agricultural growers depend heavily on generalized crop coefficients or point measurements (soil moisture, sap flow, etc.) for irrigation scheduling and are hesitant to adopt satellite-based remote sensing approaches due to low spatiotemporal resolution and cloud cover issues. Considering the critical need for geospatial ET mapping for precision irrigation management, the ET of irrigated field crops was mapped using very high-resolution multispectral and thermal infrared imagery. Three variants of the standard METRIC energy balance model, modified per UAS and local conditions, were used to process the UAS imagery, and their performance was evaluated relative to the standard LM approach.

The increased spatial resolution did not influence the mean field-scale ET (low RMSD: 0.2–1.0 mm day⁻¹, ETdep,abs: 0.4–21.2%, and *r*: 0.7–0.9). However, greater spatial ET variation was captured in the UAS-derived maps (CV: 6.7–24.3%) compared to the Landsat satellite-derived maps (CV: 2.1–11.2%). The UASM model variants performed very similarly to the LM approach for the irrigated field crops: spearmint (ETdep,abs: 0.4–6.2% and *r*: 0.7–0.8), potato (ETdep,abs: 0.9–21.2% and *r*: 0.7–0.8), and alfalfa (ETdep,abs: 7.2–9.9% and *r*: 0.8–0.9). UASM-2 showed the highest similarity with the LM approach (RMSD: 0.5–0.9 mm day⁻¹ and *r*: 0.7–0.9), with very similar energy balance outputs. ET differences between the LM and UAS-based approaches were largely due to internal calibration differences, which could have been affected slightly by the spatial resolution. UASM-1 could be advantageous for agricultural fields with elevation variability and the resulting lapse-rate temperature differences, as it uses high-resolution UAS imagery-derived DEM inputs that are more accurate than satellite-based DEMs. UASM-2 could be advantageous for crop fields where spatial variability in ground elevation is negligible, as it also removes the additional DEM data requirement for ET mapping. UASM-3 uses actual meteorological parameters, ISWR and Ta (for ILWR), measured by an automated weather station near the field site. UASM-3 could also be advantageous for ensuring LAI calculation over vegetated areas only, as supported by the high spatial resolution that distinctly captures soil, vegetation, and mixed pixels. UASM-1 and UASM-3 could also be merged to incorporate field-relevant energy balance inputs, such as meteorological parameters and ground elevation variability, to obtain more realistic crop water use maps.

ET mapping at high spatiotemporal resolution provides more control over timely monitoring of spatial crop water requirements, avoiding the cloud cover interference and 16-day acquisition interval of the LM approach. LM-based approaches may assist in the review of seasonal water distribution rights, while UASM approaches may assist in irrigation water savings through grower-friendly, on-demand, site-specific irrigation prescription maps/tools.

**Author Contributions:** Conceptualization, A.K.C., B.M., and L.R.K.; data curation, A.K.C. and B.M.; funding acquisition, L.R.K., R.T.P., and C.O.S.; drone flights, A.K.C. and B.M.; investigation, A.K.C. and B.M.; methodology, A.K.C., L.R.K., and C.O.S.; project administration, L.R.K., R.T.P., and C.O.S.; resources, L.R.K., R.T.P., and C.O.S.; software, A.K.C.; supervision, L.R.K., R.T.P., and C.O.S.; validation, A.K.C. and B.M.; visualization, A.K.C., B.M., L.R.K., R.T.P., and C.O.S.; writing—original draft, A.K.C.; and writing—review and editing, A.K.C., B.M., L.R.K., R.T.P., and C.O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the USDA-National Institute of Food and Agriculture projects 1016467, WNP0745, and WNP0839, and Washington State University-CAHNRS Office of Research Emerging Research Issues Internal Competitive Grant Program.

**Acknowledgments:** The authors would like to thank Landon Lombers, Troy Grimes, and the WSU farm management team for their assistance in providing the commercial field sites for data collection.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Vegetation Extraction Using Visible-Bands from Openly Licensed Unmanned Aerial Vehicle Imagery**

#### **Athos Agapiou 1,2**


Received: 25 May 2020; Accepted: 25 June 2020; Published: 26 June 2020

**Abstract:** Red–green–blue (RGB) cameras attached to commercial unmanned aerial vehicles (UAVs) can support small-scale remote-observation campaigns by mapping an area of interest with accuracy of a few centimeters. Vegetated areas need to be identified either for masking purposes (e.g., to exclude vegetated areas in the production of a digital elevation model (DEM)) or for monitoring vegetation anomalies, especially for precision agriculture applications. However, while the detection of vegetated areas is of great importance for several UAV remote sensing applications, this type of processing can be quite challenging. Usually, healthy vegetation can be extracted in the near-infrared part of the spectrum (approximately between 760–900 nm), which is not captured by visible (RGB) cameras. In this study, we explore several visible (RGB) vegetation indices in different environments, using various UAV sensors and cameras, to validate their performance. For this purpose, openly licensed unmanned aerial vehicle (UAV) imagery was downloaded "as is" and analyzed. The overall results are presented in the study. As was found, the green leaf index (GLI) provided the optimum results across all case studies.

**Keywords:** vegetation indices; RGB cameras; unmanned aerial vehicle (UAV); empirical line method; Green leaf index; open aerial map

#### **1. Introduction**

Unmanned aerial vehicles (UAVs) are widely applied for monitoring and mapping purposes all around the world [1–4]. The use of relatively low-cost commercial UAV platforms can produce high-resolution visible orthophotos, thus providing an increased-resolution product in comparison to traditional aerial or satellite observations. Over the years, technological development in sensors and the decreasing cost of UAV sensors have popularized them among both experts and amateurs [5,6]. While several countries have recently adopted restrictions for safety reasons, UAVs are still being used for mapping relatively small areas (in comparison to aerial and satellite observations) [7].

Today, a variety of UAVs and cameras exists on the market, providing a plethora of options to end-users. As [8] mentioned in their work, UAVs can be classified according to the characteristics of the drones, such as their size, ranging from nano (<30 mm) to large (>2 m) drones, their maximum take-off weight (from less than 1 kg to more than 25 kg), their range of operation, etc. In addition, existing UAV cameras can be classified into visible red–green–blue (RGB), near-infrared (NIR), multispectral, hyperspectral, and thermal cameras.

Once stereo pairs of images are taken by the UAV camera sensors, these are processed using known control points and orthorectified based on the digital surface model (DSM) produced by triangulation of the stereo pairs [9]. In many applications, the detection of vegetated areas is essential, as in the case of monitoring agricultural areas or forests [10–13]. Even when vegetation is not the focus of a study, it needs to be masked out to produce a digital elevation model (DEM) and provide realistic contours of the area.

Vegetated areas can usually be detected using the near-infrared part of the spectrum (approximately between 760–900 nm). In this spectral range, healthy vegetation tends to give high reflectance values in comparison to the visible bands (red–green–blue, RGB) [14]. The sudden increase in reflectance in the near-infrared part of the spectrum is a unique characteristic of healthy vegetation. For this reason, this specific spectral window has been widely exploited in remote sensing applications. Indeed, numerous vegetation indices based on different mathematical equations have been developed in recent decades, aiming to detect healthy vegetation while taking into consideration atmospheric effects and soil background reflectance noise [15]. One of the most common vegetation indices applied in remote sensing applications is the so-called normalized difference vegetation index (NDVI), which is estimated using the reflectance values of the near-infrared and red bands of multispectral images [16].

However, in most UAV cameras, the near-infrared part of the spectrum, which is sensitive to vegetation, is absent. UAV cameras normally record only the visible part of the spectrum (red–green–blue), thus making the detection of vegetated areas quite challenging. In addition, radiometric calibration of the images is needed to convert the raw digital numbers (DNs) into reflectance values. To do this, calibration targets and field campaigns are essential to obtain a good approximation of the backscattered radiance of the various targets observed in the orthophotos [17,18].

This study aims to investigate the detection of vegetated areas based on limited metadata information and with no information regarding the reflectance properties of the targets visible in the orthophoto. For this reason, five orthophotos from an openly licensed unmanned aerial vehicle (UAV) imagery repository were used, while simplified linear regression models were established to convert the DNs of the images to reflectance values. Once this was accomplished, more than ten (10) different visible vegetation indices were applied, and their results are discussed. The methodology presented here can, therefore, be used on products for which knowledge is limited and the extraction of vegetation needs to be carried out in a semi-automatic way.

#### **2. Materials and Methods**

For the needs of the current study, five different datasets were selected through the OpenAerialMap platform [19]. OpenAerialMap relies on a set of open-source tools where users can upload their products, such as orthophotos, filling in basic metadata information to support their re-use. OpenAerialMap provides a set of tools for searching, sharing, and using openly licensed satellite and unmanned aerial vehicle (UAV) imagery. The platform is operated by the Open Imagery Network (OIN). All images uploaded to the platform are publicly licensed under the Creative Commons (CC) license (CC-BY 4.0), thus allowing both sharing and adaptation of the content by third parties and other users.

The case studies were selected based on the following criteria: (1) a different context, (2) a different geographical distribution, (3) capture by different UAV/camera sensors, and (4) the quality of the final orthophoto. In the end, the following case studies were identified and downloaded for further processing (Table 1). A preview of these areas can be found in Figure 1. Case study 1 was a 6 cm-resolution orthophoto from a highly urbanized area (Figure 1a), located in the Philippines, where vegetation was randomly scattered. A mixed environment was selected as the second case study, from St. Petersburg in Russia (Figure 1b), where high trees, grassland, and buildings were all visible. A UAV corridor mapping along a river near the city of Arta, Greece, was the third case study (Figure 1c). It should be mentioned that watergrass was also visible in this orthophoto.


**Table 1.** Case studies selected through the OpenAerialMap [19].

**Figure 1.** Case studies selected through the OpenAerialMap [19]. (**a**) a highly urbanized area, at Taytay, Philippines, (**b**) a campus at St. Petersburg, Russia, (**c**) a river near Arta, Greece, (**d**) a picnic area at Ohio, USA, (**e**) an agricultural area at Nîmes, France, and (**f**) the geographical distribution of the selected case studies.

The next case study referred to a picnic area in Ohio, USA, with low vegetation (grass) and some sporadic high trees (Figure 1d), while the last case study was an agricultural field near Nîmes, France. All orthophotos have a nominal resolution of a few centimeters (5–6 cm), without the ability to further evaluate geometric distortions of the images (e.g., radial distortion, root mean square error, maximum horizontal and vertical error, and distribution of the control points). Therefore, these orthophotos were processed "as is". The first two orthophotos were obtained using the DJI Mavic 2 Pro, while the DJI FC6310 model was used for the case study of Greece. The SONY DSC-WX220 and Parrot Anafi UAV models were used for the last two case studies.

Once the orthophotos were downloaded, the digital numbers (DN) of each band were calibrated using the empirical line method (ELM) [18]. The ELM is a simple and direct approach to calibrate image DNs to approximate units of surface reflectance when no further information is available, as in our example. The ELM aims to build a relationship between at-sensor radiance and at-surface reflectance by identifying spectrally invariant targets and comparing these measurements with the respective DNs in the image. The calibration of raw DNs to the surface reflectance factor is based on a linear relationship for each image band using reflectance targets in the image. The derived prediction equations can account for changes in illumination and atmospheric effects [20]. In our case study, since no additional information was available, the impact of atmospheric effects was ignored. The ELM for the RGB UAV-sensed data can be expressed by the following equation:

$$
\rho_{(\lambda)} = A \cdot DN + B \tag{1}
$$

where ρ(λ) is the reflectance value for a specific band (range 0–100%), DN is the raw digital number of the orthophoto, and A and B are terms that can be determined using a least-squares fitting approach. Figure 2 illustrates the basic concept of the ELM calibration.


**Figure 2.** Empirical line method (ELM) schematic diagram.
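A minimal sketch of Equation (1) fitted by least squares is shown below; the two calibration targets and their assumed reflectance values are hypothetical (this study used only bright targets, or image statistics where no suitable targets existed).

```python
import numpy as np

# Hypothetical calibration targets for one band: mean DNs and assumed
# surface reflectance (0-1). Real applications use measured targets.
dn = np.array([12.0, 231.0])
rho = np.array([0.05, 0.85])

a, b = np.polyfit(dn, rho, deg=1)  # fit rho = A*DN + B per band

def calibrate(band_dn: np.ndarray) -> np.ndarray:
    """Convert a raw DN band to approximate surface reflectance."""
    return np.clip(a * band_dn + b, 0.0, 1.0)
```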

In cases where no appropriate targets were available, a simple normalization of the orthophotos was performed using per-band image statistics. Once the orthophotos were radiometrically calibrated, with pixel values between 0 and 1, various visible vegetation indices were applied. Specifically, we implemented ten (10) different equations, shown in more detail in Table 2. The following vegetation indices were applied to all case studies: (1) normalized green–red difference index, (2) green leaf index, (3) visible atmospherically resistant index, (4) triangular greenness index, (5) red–green ratio index (IRG), (6) red–green–blue vegetation index, (7) red–green ratio index (RGRI), (8) modified green–red vegetation index, (9) excess green index, and (10) color index of vegetation. These indices explore the visible bands (red–green–blue) in different ways. The outcomes were then evaluated and compared using random points defined in the orthophotos. The overall results are provided in the next section.


**Table 2.** Vegetation indices used in this study.

where ρb is the reflectance at the blue band, ρg is the reflectance at the green band, ρr is the reflectance at the red band, λb is the wavelength of the blue band, and λr is the wavelength of the red band.
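As a sketch of how a few of these indices are computed from the calibrated reflectance bands, the snippet below implements four of the more common definitions. The formulas follow their commonly published forms and should be checked against Table 2; in particular, the ExG variant on raw (non-chromatic) reflectance is one of several in use.

```python
import numpy as np

def rgb_indices(r, g, b, eps=1e-6):
    """Four visible-band vegetation indices on calibrated reflectance (0-1)
    arrays; eps avoids division by zero on dark pixels."""
    ngrdi = (g - r) / (g + r + eps)                # V1: normalized green-red difference index
    gli = (2 * g - r - b) / (2 * g + r + b + eps)  # V2: green leaf index
    vari = (g - r) / (g + r - b + eps)             # V3: visible atmospherically resistant index
    exg = 2 * g - r - b                            # V9: excess green index (reflectance form)
    return ngrdi, gli, vari, exg

# A vegetation mask can then be obtained by thresholding, e.g., gli > 0.05
# (the threshold is illustrative and context-dependent).
```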

#### **3. Results**

#### *3.1. Radiometric Calibration of the Raw Orthophotos*

After a detailed examination of the selected orthophotos, high-reflectance target pixels were identified and mapped in three out of the five case studies. These targets were selected in case study 1 (Figure 3a) and case study 2 (Figure 3b). For both of these orthophotos, smooth white roofs were found, and the average DN value per band was extracted. A similar procedure was also implemented for the fourth case study (Figure 3c), where a white, high-reflectance asphalt area was evident in the southern part of the image. In contrast, no dark objects, such as deep water reservoirs, newly placed asphalt, or other black targets, were visible in these images. Therefore, the ELM was applied using the DNs from these high-reflectance targets as known input parameters. For the remaining orthophotos (case study 3 and case study 5), no suitable invariant targets could be detected in the images due to their environment. In these cases, an image normalization between 0 and 1, using the image statistics, was implemented.

**Figure 3.** High reflectance targets selected for case study 1 (**a**), case study 2 (**b**), and case study 4 (**c**). For case studies 3 and 5, no appropriate targets were found, and an image-based normalization was applied.

Upon radiometric calibration, several spectral signatures from different targets within the visible part of the spectrum (i.e., approximately between 450 and 760 nm) were plotted. This is an easy way to verify that the simplified ELM and image normalization procedures followed here did not distort the reflectance of the targets. Figure 4 presents the results of the reflectance analysis for the first case study (similar findings were also observed for the other case studies). Three types of targets are presented: vegetation (first row of Figure 4), asphalt (second row of Figure 4), and soil (third row of Figure 4). Figure 4a,d,g shows the general locations of these targets, while a closer look is shown in Figure 4b,e,h, respectively. The spectral signature diagrams of the three targets can be seen in the last column of Figure 4 (Figure 4c,f,i). The vegetation spectral profile (Figure 4c) followed the typical spectral behavior of healthy vegetation within this part of the spectrum, with low reflectance values in the blue and red parts and higher reflectance in the green band. The asphalt targets (Figure 4f) had similar reflectance values for all three bands, while their relatively high reflectance (i.e., between 60–75%) can be explained by the type and age of the asphalt. The soil spectral profile (Figure 4i) showed a pattern similar to the asphalt, with a slight increase in reflectance moving from the blue to the red part of the spectrum. Other types of targets (not shown here) had reflectance patterns consistent with the literature, indicating that the ELM did not distort any spectral band and provided, as far as possible, reasonable outcomes.

**Figure 4.** General locations (**a**,**d**,**g**), close-up views (**b**,**e**,**h**), and spectral signatures (**c**,**f**,**i**) of the vegetation, asphalt, and soil targets for case study 1.

#### *3.2. Vegetation Indices*

Once the orthophotos were radiometrically corrected, all vegetation indices mentioned in Table 2 were applied. The final results of this implementation for case studies 1 and 4 are shown in Figures 5 and 6, respectively. The calibrated RGB orthophoto of each case study is shown in Figures 5a and 6a, while the normalized green–red difference index is shown in Figures 5b and 6b. Similarly, Figures 5c–k and 6c–k show the results from the green leaf index, visible atmospherically resistant index, triangular greenness index, red–green ratio index, red–green–blue vegetation index, red–green ratio index, modified green–red vegetation index, excess green index, and color index of vegetation, respectively. Vegetated areas are highlighted in the light grayscale tones, while non-vegetated areas appear in the darkest tones of gray.

As shown in Figures 5 and 6, all vegetation indices were able to enhance vegetation in both areas; however, the best performance was observed in Figure 5b,f,i. For Figure 6, the clearest view of the vegetated areas can be seen in Figure 6e. Similar findings were also observed in the rest of the case studies (not shown here), indicating that visible vegetation indices using the RGB bands can enhance healthy vegetation; however, their performance depends on the context of the image. Indeed, for instance, the triangular greenness index in Figure 5e tended to give poor results, since vegetation was not well enhanced in the urban environment. However, the same index was the best for the picnic area in Figure 6e.

**Figure 5.** Vegetation indices results applied to the red–green–blue (RGB) orthophoto of case study No. 1 (**a**), normalized green–red difference index (**b**), green leaf index (**c**), visible atmospherically resistant index (**d**), triangular greenness index (**e**), red–green ratio index (**f**), red–green–blue vegetation index (**g**), red–green ratio index (**h**), modified green–red vegetation index (**i**), excess green index (**j**), and color index of vegetation (**k**). Vegetated areas are highlighted with the light grayscale tone, while non-vegetated areas with the darkest tone of gray.

**Figure 6.** Vegetation indices results applied in the RGB orthophoto of the case study No. 4 (**a**), normalized green–red difference index (**b**), green leaf index (**c**), visible atmospherically resistant index (**d**), triangular greenness index (**e**), red–green ratio index (**f**), red–green–blue vegetation index (**g**), red–green ratio index (**h**), modified green–red vegetation index (**i**), excess green index (**j**), and color index of vegetation (**k**). Vegetated areas are highlighted with the light grayscale tone while non-vegetated areas with the darkest tone of grey.

Extraction of the vegetated regions using RGB cameras can be quite problematic, regardless of the vegetation index applied, in an environment similar to the one of case study 3 (along a river). Figure 7 below shows a closer look at the northern part of the river for all ten vegetation indices mentioned in Table 2. While some indices can enhance the vegetation along the river, as the case of the normalized green–red difference index (Figure 7b), the river itself can also be characterized as "vegetated areas". This is due to the low level of water in the river and the apparent watergrass within the river.

Therefore, as found from the visual interpretation of the results, the RGB vegetation indices can enhance vegetated areas. However, they can also give false results. For this reason, a statistical comparison across all vegetation indices and all case studies was performed.

**Figure 7.** Vegetation indices results applied in the RGB orthophoto of the case study No. 3 (close look) (**a**), normalized green–red difference index (**b**), green leaf index (**c**), visible atmospherically resistant index (**d**), triangular greenness index (**e**), red–green ratio index (**f**), red–green–blue vegetation index (**g**), red–green ratio index (**h**), modified green–red vegetation index (**i**), excess green index (**j**), and color index of vegetation (**k**). Vegetated areas are highlighted with the light grayscale tone while non-vegetated areas with the darkest tone of grey.

#### *3.3. Statistics*

To evaluate the overall performance of the ten vegetation indices (see Table 2) per case study, several points were distributed in each orthophoto (100 in total per case study). These points were randomly positioned either over vegetated areas (trees, grass, etc.) or scattered over other types of targets (e.g., asphalt, roofs, water, etc.). The geographical distributions of the points per case study are visualized in Figure 8, while Figure 9 presents the allocation of the random points to "vegetated" and "non-vegetated" areas. As expected, in orthophotos with limited vegetation, such as case study No. 1, the number of points characterized as "vegetation" was lower than the number of "non-vegetation" points (14 and 86 points, respectively).

**Figure 8.** Distribution of the random points per case study. (**a**) a highly urbanized area, at Taytay, Philippines, (**b**) a campus at St. Petersburg, Russia, (**c**) a river near Arta, Greece, (**d**) a picnic area at Ohio, USA, (**e**) an agricultural area at Nîmes, France.

**Figure 9.** Overall distribution of the 100 random points over "vegetated" and "non-vegetated" areas for the five different case studies.

The normalized difference between the mean value of each index over "vegetated" and "non-vegetated" areas is presented in Table 3. Blue color indicates the lowest normalized difference value, while red color indicates the highest value per vegetation index (V1 to V10). Overall, the normalized difference spanned from 1.2% to 269% across all indices and case studies. For the NGRDI (normalized green–red difference index, V1 of Table 3), the lowest value was observed for case study No. 3, which depicted an area along a river near Arta, Greece. The highest values were reported for the small agricultural area of case study No. 5. The normalized difference of the NGRDI index across all case studies was between 50% and 107%. Similar observations were made for the green leaf index (GLI) and visible atmospherically resistant index (VARI) indices (V2 and V3 of Table 3, respectively). An analogous pattern was also observed for the red–green ratio index (IRG), red–green–blue vegetation index (RGBVI), modified green–red vegetation index (MGRVI), and excess green index (ExG) indices (V5, V6, V8, and V9 of Table 3, respectively). For the triangular greenness index (TGI) (V4 of Table 3), the lowest normalized difference was once again reported for case study No. 3, but the highest for the urban area of case study No. 1. The same area also gave the highest relative values for the red–green ratio index (RGRI) (V7 of Table 3). Finally, for the color index of vegetation (CIVE) (V10 of Table 3), the lowest score was reported for case study No. 5, and the highest for the picnic area of case study No. 4.


**Table 3.** The normalized difference for "vegetated" and "non-vegetated" areas for all vegetation indices (V1 to V10) mentioned in Table 2 for each case study.

In general, we can state that case study No. 3 (river area) tended to give low differences between the vegetated and non-vegetated areas regardless of the vegetation index applied, indicating that this is by far the most challenging environment in which to discriminate vegetation from the remaining areas. In contrast, high separability for all indices could be seen for case study No. 5 (agricultural area) and case study No. 1 (urban area).

Based on the results of Table 3, we then relatively compared the normalized difference for all case studies per vegetation index, setting vegetation index No. 1 (NGRDI) as the reference index. The results of this analysis are shown in Figure 10. The relative difference indicates the percentage difference (index No. i − index No. 1)/index No. 1. Therefore, negative values in Figure 10 indicate that the specific index provided poorer results in comparison with the NGRDI index (vegetation index No. 1). In contrast, high values imply that the particular index gives better results than the NGRDI index. Vegetation indices close to zero have performance similar to the reference index (NGRDI).

**Figure 10.** The relative difference of vegetation indices No. 2 to No. 10 in comparison to vegetation index No. 1.
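A sketch of the two comparison steps is given below: the normalized difference between the mean index values over the "vegetated" and "non-vegetated" points (one plausible formulation of Table 3; the exact normalization used in the paper is not spelled out), and the relative difference against the NGRDI reference (Figure 10). The significance test mirrors the *t*-test mentioned in the Discussion.

```python
import numpy as np
from scipy.stats import ttest_ind

def separability(index_veg, index_nonveg):
    """Normalized difference (%) between mean index values over vegetated
    and non-vegetated random points (one plausible formulation), plus a
    two-sample t-test p-value for significance."""
    nd = 100.0 * abs(index_veg.mean() - index_nonveg.mean()) / abs(index_nonveg.mean())
    _, p = ttest_ind(index_veg, index_nonveg, equal_var=False)
    return nd, p

def relative_to_reference(nd_index, nd_ngrdi):
    """Relative difference (%) of an index's normalized difference with
    respect to the NGRDI reference, as in Figure 10."""
    return 100.0 * (nd_index - nd_ngrdi) / nd_ngrdi
```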

From the results of Figure 10, we can observe that the most promising index was vegetation index No. 2, namely the green leaf index (GLI), which provided better results than the NGRDI index for all case studies. Its performance advantage ranged from 10% to 35%, and it was the only index that provided better results for all case studies. Vegetation index No. 9 (excess green index, ExG) performed well for all case studies except case study No. 3 (a river near Arta, Greece), providing a relative difference with respect to the NGRDI index of between 12% and 23% for case studies 1, 2, 4, and 5. In contrast, for case study No. 3, this index gave the worst performance (42%) relative to the reference index. Vegetation indices 5, 7, and 10 (IRG, red–green ratio index; RGRI, red–green ratio index; and CIVE, color index of vegetation, respectively) did not perform better than the reference index for any case study.

However, it is important to note that the optimum index varied for each case study, which also aligns with the previous findings of Table 3. For case study 1 (a highly urbanized area, at Taytay, Philippines), the best index was No. 4 (TGI, triangular greenness index); for case study 2 (a campus at St. Petersburg, Russia), the best index was No. 8 (MGRVI, modified green–red vegetation index). For case study 3 (a river near Arta, Greece), the best index was No. 2 (GLI, green leaf index). Finally, for the last two case studies, No. 4 and No. 5, representing a picnic area in Ohio, USA, and an agricultural area at Nîmes, France, respectively, the best indices were No. 4 (TGI, triangular greenness index) and No. 6 (RGBVI, red–green–blue vegetation index).

#### **4. Discussion**

The results presented in the previous section provide some very helpful information regarding the extraction of vegetation in visible orthophotos across various environments. It was shown that the application of vegetation indices based on visible bands can highlight vegetated areas and, therefore, enhance healthy vegetation. Indeed, the results presented in Figures 5 and 6 indicate that several indices achieved high separability between vegetated and non-vegetated areas, while the findings of Table 3 demonstrate that for each case study there was a unique index that could best distinguish these two types of areas, with a relative difference ranging from 57% up to 269%. The differences between the vegetated and non-vegetated areas were also found to be statistically significant for all case studies, after the application of a *t*-test at a 95% confidence level.

While this is true, the context of some orthophotos can also be characterized as quite challenging, as in case study No. 3. The results from the application of all indices are shown in Figure 7, which shows that the detection of vegetated areas could also produce several false positives within the river basin.

Beyond the spectral complexity and heterogeneity of the orthophoto, some other factors, not discussed in this paper, can also influence the overall performance of the indices. First, the spectral response filters of each camera used for these orthophotos were different. Differences in the sensitivity of cameras to the backscattered reflectance at specific wavelengths can be significant, as demonstrated in the past by other studies [30]. In addition, the resolution of the orthophoto was not always optimal for each case study. Recent studies [31,32] have shown that the optimum resolution for remote sensing applications is connected not only to the spatial characteristics of the targets under investigation but also to their spectral properties. Finally, assumptions made during the radiometric calibration of the orthophotos need to be taken into consideration. At the same time, a pre-flight plan with dedicated targets and spectroradiometric campaigns can minimize these errors.

#### **5. Conclusions**

Vegetation extraction has attracted the interest of researchers all around the world due to its importance for monitoring agricultural areas, forests, etc. While vegetation detection traditionally relies on the near-infrared part of the spectrum, the tremendous increase in low altitude platforms, such as UAVs, equipped with only visible-band cameras has made this task quite challenging.

In this paper, we explored openly licensed unmanned aerial vehicle (UAV) imagery from the OpenAerialMap platform, selecting five different case studies with different contexts and UAV sensors. Since these products were downloaded "as is", it was necessary to apply a radiometric correction before any further processing. For this reason, the EML image-based technique was applied for some case studies (namely case studies Nos. 1, 2, and 4), while for the rest of the case studies (Nos. 3 and 5), normalization of the orthophotos based on image statistics was applied. This procedure does not require any knowledge of ground targets or field campaigns with spectroradiometers and spectral reflectance targets, which could not be performed in this study (i.e., after the UAV flight). Once the radiometric calibration was applied and verified using spectral signature profiles of targets in the UAV imagery, various visible vegetation indices were applied to all case studies. The results were further elaborated to examine the performance of each index. From the findings of this study, two aspects can be highlighted:


The findings of this study can be applied to any RGB orthophoto, taken either from a low altitude system or even from aerial images. Given the wide availability of ready-to-fly (RTF) drones costing less than approximately 2000 euros, RGB cameras will continue to play an important role in the near future for small survey campaigns. While field campaigns and dedicated targets are necessary to calibrate the reflectance of the images, if these are for any reason absent, then an approach similar to the one presented here can be followed. In the future, specialized vegetation indices can be developed to address specific needs, making the extraction of vegetation an easier and more straightforward procedure. Given the various phenological growth stages of vegetation, a dynamic threshold method can also be investigated for specific types of vegetation (e.g., crops) towards the automatic extraction of vegetation from RGB orthophotos. These vegetation-specific optimum thresholds could eventually be used to mask or extract the vegetated areas. Finally, a different approach for the extraction of vegetation, based on supervised classification analysis, can be explored in the future.

**Author Contributions:** Conceptualization, methodology, formal analysis, investigation, writing—original draft preparation, A.A. The author has read and agreed to the published version of the manuscript.

**Funding:** This article is submitted under the NAVIGATOR project. The project is co-funded by the Republic of Cyprus and the Structural Funds of the European Union in Cyprus under the Research and Innovation Foundation grant agreement EXCELLENCE/0918/0052 (Copernicus Earth Observation Big Data for Cultural Heritage).

**Acknowledgments:** The author acknowledges the use of high resolution openly licensed unmanned aerial vehicle (UAV) imagery. All imagery is publicly licensed and made available through the Humanitarian OpenStreetMap Team's Open Imagery Network (OIN) Node. All imagery contained in OIN is licensed CC-BY 4.0, with attribution as contributors of Open Imagery Network. All imagery is available to be traced in OpenStreetMap (https://openaerialmap.org). Thanks are also extended to the Eratosthenes Research Centre of the Cyprus University of Technology for its support. The Centre is currently being upgraded through the H2020 Teaming Excelsior project (www.excelsior2020.eu).

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Comparing UAS LiDAR and Structure-from-Motion Photogrammetry for Peatland Mapping and Virtual Reality (VR) Visualization**

**Margaret Kalacska 1,\*, J. Pablo Arroyo-Mora <sup>2</sup> and Oliver Lucanus <sup>1</sup>**


**Abstract:** The mapping of peatland microtopography (e.g., hummocks and hollows) is key for understanding and modeling complex hydrological and biochemical processes. Here we compare unmanned aerial system (UAS) derived structure-from-motion (SfM) photogrammetry and LiDAR point clouds and digital surface models of an ombrotrophic bog, and we assess the utility of these technologies in terms of payload, efficiency, and end product quality (e.g., point density, microform representation, etc.). In addition, given their generally poor accessibility and fragility, peatlands provide an ideal model to test the usability of virtual reality (VR) and augmented reality (AR) visualizations. As an integrated system, the LiDAR implementation was found to be more straightforward, with fewer points of potential failure (e.g., hardware interactions). It was also more efficient for data collection (10 vs. 18 min for 1.71 ha) and produced considerably smaller file sizes (e.g., 51 MB vs. 1 GB). However, SfM provided higher spatial detail of the microforms due to its greater point density (570.4 vs. 19.4 pts/m<sup>2</sup>). Our VR/AR assessment revealed that the most immersive user experience was achieved with the Oculus Quest 2 compared to Google Cardboard VR viewers or mobile AR, showcasing the potential of VR for the natural sciences in different environments. We expect VR implementations in environmental sciences to become more popular, as evaluations such as the one shown in our study are carried out for different ecosystems.

**Keywords:** bog; drone; Oculus Quest 2; Mer Bleue; SfM; UAV; augmented reality; AR

#### **1. Introduction**

Peatlands cover a significant area globally (≈3%), particularly in northern regions (e.g., ≈12% of Canada), and they have an increasingly important role in carbon sequestration and climate change mitigation [1–4]. Ongoing monitoring of peatlands over large spatial extents through the use of satellite-based Earth observation products is needed to understand their response to climate change (e.g., [5–7]). However, given their generally poor accessibility and the fine-scale topographic variation of vegetation microforms (often <1 m in height), satellite-based mapping requires validation from ground data (e.g., water table depth, species composition, biochemistry) [8,9]. Unmanned aerial systems (UAS) have shown potential for characterizing these ecosystems at fine scales [9–11]. In general terms, microtopographic features such as hollows and hummocks are key elements that are closely related to complex hydrological, ecophysiological, and biogeochemical processes in peatlands [12]. Hummocks are elevated features composed of vascular plants overlaying mosses that consistently remain above the water table, while hollows are lower lying areas with primarily exposed mosses [13]. The multitemporal characterization of hollows and hummocks at submeter scales is key to validating satellite-derived products such as phenology tracking, net ecosystem exchange estimation, etc. [9].

**Citation:** Kalacska, M.; Arroyo-Mora, J.P.; Lucanus, O. Comparing UAS LiDAR and Structure-from-Motion Photogrammetry for Peatland Mapping and Virtual Reality (VR) Visualization. *Drones* **2021**, *5*, 36. https://doi.org/10.3390/ drones5020036

Academic Editor: Higinio González-Jorge

Received: 15 April 2021; Accepted: 6 May 2021; Published: 9 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

To date, mapping microtopography with UAS has relied on two main technologies: light detection and ranging (LiDAR) and structure-from-motion (SfM) multiview stereo (MVS) photogrammetry (hereinafter referred to as SfM), with variable results for each technology (e.g., [14–16]). LiDAR is an active remote sensing technology that uses a pulsed laser, generally between 800 and 1500 nm for terrestrial applications, to measure ranges, i.e., the variable distances from the instrument to objects on the surface of the Earth. It does so by measuring the exact time it takes for the pulses to return after they are reflected off objects or the ground [17]. In contrast, SfM is a passive remote sensing technique that reconstructs the landscape from overlapping offset photographs [18,19]. In forested areas, LiDAR pulses can penetrate the canopy and allow for the development of both canopy and surface terrain models [17], while SfM only provides a surface model of the highest layer, often the canopy, as seen from the photographs [20]. Comparatively, across ecosystems, SfM has been shown to produce higher density point clouds than those from LiDAR. Previously in peatlands, mapping microtopography has been compared between UAS SfM and airborne LiDAR (e.g., [16]). Many studies have also employed airborne LiDAR for large-scale peatland assessments (e.g., [21–26]). Terrestrial laser scanning (TLS) has also been shown to successfully map microforms at very high spatial detail (e.g., [27]). However, no formal study has rigorously compared UAS LiDAR and SfM for mapping peatland microtopography.

Because peatlands are both fragile ecosystems and in general have poor accessibility, tools to remotely study, access, and visualize peatland structure in 3D are needed for advancing our understanding of their response to climate change. Although not a new technology [28], the recent advances in virtual reality (VR) [29], with its applications in medicine [30], conservation [31], geosciences [32,33], e-tourism [34,35], and education [36], among others, provide novel opportunities to study peatlands and other ecosystems remotely without disturbance [37]. VR is technology (hardware and software) that generates a simulated environment which stimulates a "sense of being present" in the virtual representation [38]. In contrast, augmented reality (AR) superimposes the virtual representation on the real world through glasses or other mobile digital displays, in turn supplementing reality rather than replacing it [39]. Thus, through VR, users experience an immersive experience of the field conditions in a cost-effective and repeatable manner. For instance, [29] showcases the advantages of VR, such as the quantification and analysis of field observations, which can be performed at multiple scales. While early implementations required extensive and expensive hardware, such as CAVE (CAVE Automatic Virtual Environments) [38], recent commercial grade VR systems that utilize improved head mounted displays (HMD), such as Oculus Rift, Sony PlayStation VR, HTC Vive Cosmos, etc., allow for outstanding visualization capabilities and sharing of scientific output through web-based platforms.

Our study aims to bridge the implementation of 3D models derived from UAS (LiDAR and SfM) and VR/AR visualization. Thus, our objectives are to (1) compare SfM and LiDAR point cloud characteristics from a peatland; (2) compare the representation of peatland microtopography from the SfM and LiDAR data; and (3) provide a qualitative evaluation of VR and AR usability and quality of visualization of the two point clouds. We further discuss the potential of VR in peatland research and provide web-based examples of the study area. While we primarily focus on VR due to the maturity of the technology and its suitability for scientific data visualization, we also briefly compare the point clouds in AR. To our knowledge, ours is the first study to compare microtopography between LiDAR and SfM for a peatland, in addition to investigating peatland VR/AR models derived from UAS data.

#### **2. Materials and Methods**

#### *2.1. Study Area*

This study was carried out at Mer Bleue, an ≈8500 year-old ombrotrophic bog near Ottawa in Ontario, Canada (Figure 1). A bog is a type of peatland commonly found in northern regions. Bogs are acidic, nutrient-poor ecosystems, receiving incoming water and nutrients only from precipitation and deposition. Mer Bleue is slightly domed, with peat depth decreasing from >5 m across most of its area to ≈30 cm along the edges. It has a hummock–hollow–lawn microtopography with a mean relief between hummocks and hollows of <30 cm [40,41]. While the water table depth is variable throughout the growing season, it generally remains below the surface of the hollows [42]. Malhotra et al. (2016) [43] found a strong association between spatial variations in vegetation composition, water table depth, and microtopography. However, the strength of the association varied spatially within the bog. Mosses, predominantly *Sphagnum capillifolium, S. divinum*, and *S. medium* (the latter two species were formerly referred to as *S. magellanicum*) [44], form the ground layer of the bog and can be seen exposed in low lying hollows. Vascular plants comprise the visible upper plant canopy of the hummocks (Figure 1). The most common vascular plant species are dwarf evergreen and deciduous shrubs (*Chamaedaphne calyculata*, *Rhododendron groenlandicum, Kalmia angustifolia, Vaccinium myrtilloides*), sedges (*Eriophorum vaginatum*), and trees (*Picea mariana, Betula populifolia*, and *Larix laricina*) [45]. Hummocks have been estimated to account for 51.2% and hollows for 12.7% of the total area [46]. Trees and water bodies (open and vegetated) around the margins of the peatland, which are heavily impacted by beavers, comprise the remaining classes.


**Figure 1.** (**A**) Map of Mer Bleue, near Ottawa in Ontario, Canada. Locations where photographs B–E were taken are indicated on the map. (**B**) UAV photograph facing north, taken in October; (**C**) photograph facing SE across the study area, taken in June; (**D**) UAV photograph of the southern margin of the study area where dense stands of *Typha latifolia* (cattail) grow in areas of permanent slow-moving water impacted by beavers. Photograph facing west, taken in May. (**E**) Photograph facing the treed bog, taken in June. A 360° aerial panorama acquired in late June can be viewed at https://bit.ly/mbpano2017 (accessed on 14 April 2021).

#### *2.2. Airframe*

We used a Matrice 600 Pro (M600P) (DJI, Shenzhen, China) for both the RGB photograph and LiDAR acquisitions (Figure 2, Table A1). The M600P is a six-rotor unmanned aerial vehicle (UAV) with a maximum takeoff weight of 21 kg (10.2 kg payload) (DJI Technical Support, 2017) that uses an A3 Pro flight controller with triple-redundant GPS, compass, and IMU units. We integrated a differential real-time kinematic (D-RTK) GPS (dual-band, four-frequency receiver) module with the A3 Pro [47] for improved precision of navigation [10]. For both datasets, DJI Ground Station Pro was used for flight planning and for the automated flight control of the M600P.

**Figure 2.** M600P RTK enabled UAV with the (**A**) Canon DSLR and (**B**) LiAIR S220.

#### *2.3. Structure from Motion Photogrammetry*

A Canon 5D Mark III digital single-lens reflex (DSLR) camera with a Canon EF 24–70 mm f/2.8L II USM lens set to 24 mm was used for the RGB photograph acquisition in June (Table A1). This is a full frame (36 × 24 mm CMOS) 22.1 MP camera with an image size of 5760 × 3840 pixels (6.25 μm pixel pitch). At 24 mm, the field of view of the lens is 84°. With the camera body and lens combined, the total weight was 1.9 kg. The camera was mounted on a DJI Ronin MX gimbal (2.3 kg) for stabilization and orientation control (Figure 2a). The camera's ISO was set to 800 to achieve fast shutter speeds of 1/640 to 1/1000 s at f/14 to f/16. The photographs were acquired at nadir in Canon RAW (.cr2) format and were subsequently converted to large JPG (.jpg) files in Adobe Lightroom® with minimal compression. Because the M600P does not automatically geotag the photographs acquired by third party cameras, geotags were acquired separately.

Geotagging was achieved through a post-processed kinematic (PPK) workflow with an M+ GNSS module and Tallysman TW4721 antenna (Emlid, St. Petersburg, Russia) to record the position and altitude each time the camera was triggered (5 Hz update rate for the GPS and GLONASS constellations) (Table A1). A 12 × 12 cm aluminum ground plane was used for the antenna to reduce multipath and electromagnetic interference and to improve signal reception. The camera was triggered at two second intervals with a PocketWizard MultiMax II intervalometer (LPA Design, South Burlington, VT, USA). A hot shoe adaptor between the camera and the M+ recorded the time each photograph was taken with a resolution of <1 μs (i.e., the flash sync pulse generated by the camera). The setup and configuration steps are described in [48]. The combined weight of the M+ GNSS module, the Tallysman antenna, the intervalometer, and cables was 300 g. Photographs were acquired from an altitude of 50 m AGL with 90% front overlap and 85% side overlap. Given these camera characteristics, altitude, and overlap, the flight speed was set to 2.5 m/s by the flight controller; a quick consistency check is sketched below. The total flight time required was ≈18 min.
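As a back-of-the-envelope check (not from the paper), the stated 2.5 m/s flight speed follows directly from the sensor geometry, altitude, overlap, and 2 s trigger interval, assuming the 24 mm side of the full-frame sensor is oriented along-track:

```python
# Consistency check of the flight speed implied by the stated overlap,
# altitude, and 2 s trigger interval (values from the text; assumes the
# sensor's 24 mm side is oriented along-track).
sensor_along_track_mm = 24.0   # full-frame short side
focal_mm = 24.0
altitude_m = 50.0
front_overlap = 0.90
trigger_interval_s = 2.0

footprint_m = sensor_along_track_mm / focal_mm * altitude_m   # 50 m on the ground
advance_m = footprint_m * (1.0 - front_overlap)               # 5 m between exposures
speed_ms = advance_m / trigger_interval_s                     # 2.5 m/s
print(speed_ms)  # matches the 2.5 m/s set by the flight controller
```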

Base station data from Natural Resources Canada's Canadian Active Control System station 943020 [49] (9.8 km baseline) was downloaded with precise clock and ephemeris data for PPK processing of the M+ geotags. The open-source RTKLib software v2.4.3B33 [50] was used to generate a PPK corrected geotag for each photograph. A lever arm correction was also applied to account for the separation of the camera sensor from the position of the TW4721 antenna.
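Conceptually, the lever arm correction rotates the fixed body-frame offset between the antenna phase center and the camera into the mapping frame and shifts each geotag accordingly. Below is a minimal, heading-only sketch of that idea; a full workflow would use the complete roll/pitch/yaw attitude, and all names and the offset convention here are assumptions for illustration.

```python
import numpy as np

def lever_arm_correction(antenna_enu, offset_body, yaw_deg):
    """Shift a PPK antenna position (E, N, U, metres) to the camera's
    optical center given a body-frame offset (x forward, y right, z down,
    metres, antenna -> camera). Heading-only sketch: rotates the offset
    about the vertical axis; roll and pitch are ignored for simplicity."""
    yaw = np.radians(yaw_deg)  # heading, clockwise from north
    east = offset_body[0] * np.sin(yaw) + offset_body[1] * np.cos(yaw)
    north = offset_body[0] * np.cos(yaw) - offset_body[1] * np.sin(yaw)
    up = -offset_body[2]       # body z points down
    return np.asarray(antenna_enu) + np.array([east, north, up])
```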

We used Pix4D Enterprise v4.6.4 (Pix4D S.A., Prilly, Switzerland) to carry out an SfM-MVS workflow to generate the dense 3D point cloud (Table A1). Unlike UAV-integrated cameras, which write camera orientation to the EXIF data, the DSLR photographs lack this information. However, these initial estimates are not necessary because, during processing, Pix4D calculates and optimizes both the internal (e.g., focal length) and external (e.g., orientation) camera parameters. In addition to the camera calibration and optimization in the initial processing step, an automatic aerial triangulation and a bundle block adjustment are also carried out [51]. Pix4D generates a sparse 3D point cloud through a modified scale-invariant feature transform (SIFT) algorithm [52,53]. Next, the point cloud is densified with an MVS photogrammetry algorithm [54]. For this comparison, we did not generate the raster digital surface model (DSM) through Pix4D (see Section 2.5).
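For readers unfamiliar with the keypoint stage, the sketch below reproduces the flavor of SIFT feature matching between two overlapping photographs using OpenCV. Pix4D's modified SIFT and subsequent bundle adjustment are proprietary, so this is an illustration only, and the file names are placeholders.

```python
import cv2

# Illustrative sketch of the keypoint-matching step underlying SfM:
# detect SIFT features in two overlapping photographs and keep
# unambiguous correspondences via Lowe's ratio test.
img1 = cv2.imread("photo_0001.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("photo_0002.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(kp1)} keypoints, {len(good)} retained matches")
```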

#### SfM Point Cloud Accuracy

Two separate flights (≈12 min total flight time) with the same equipment described above were carried out ≈30 min earlier in a vegetated field, 300 m south of the primary bog study area. This field was located on mineral soil and is therefore less impacted by foot traffic than the fragile bog ecosystem. In an area of 0.2 ha, twenty targets to be used as check points were placed flat on the ground. Their positions were recorded with an Emlid Reach RS+ single-band GNSS receiver (Emlid, St. Petersburg, Russia) (Table A1). The RS+ received incoming NTRIP corrections from the Smartnet North America (Hexagon Geosystems, Atlanta, GA, USA) NTRIP casting service on an RTCM3-iMAX (individualized master–auxiliary) mount point utilizing both GPS and GLONASS constellations. The accuracy of the RS+ with the incoming NTRIP correction was previously determined in comparison to a Natural Resources Canada High Precision 3D Geodetic Passive Control Network station and was found to be <3 cm in X and Y and 5.1 cm in Z [55]. The photographs from the camera and the geotags were processed the same way as described above with RTKLib and Pix4D up to the generation of the sparse point cloud (i.e., prior to the implementation of the MVS algorithm). Horizontal and vertical positional accuracies of the sparse 3D point cloud were determined from the coordinates of the checkpoints within Pix4D. The results of this accuracy assessment are used as an estimate of the positional accuracy of the SfM model of the study area within the bog, where no checkpoints were available.
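The per-axis accuracy figures reported in Section 3.1 reduce to a root-mean-square error between the surveyed checkpoint coordinates and the same points located in the sparse cloud; a minimal sketch (array names assumed):

```python
import numpy as np

def checkpoint_rmse(measured, model):
    """RMSE per axis between GNSS-surveyed checkpoints and the same
    points located in the sparse SfM cloud (both n x 3 arrays: E, N, Z).
    Mirrors the form of the reported RMSEx/RMSEy/RMSEz figures."""
    diff = np.asarray(model) - np.asarray(measured)
    return np.sqrt((diff ** 2).mean(axis=0))  # [RMSE_E, RMSE_N, RMSE_Z]
```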

#### *2.4. LiDAR*

We used a LiAIR S220 integrated UAS LiDAR system (4.8 kg) (GreenValley International, Berkeley, CA, USA) hard mounted to the M600P in August (Figure 2b) (Table A1). The system uses a Hesai Pandar40P 905 nm laser with a ±2 cm range accuracy, a range of 200 m at 10% reflectivity, and a vertical FOV of −25° to +15° [56,57]. The Pandar40P is a 40-channel mechanical LiDAR that creates the 3D scene through a 360° rotation of 40 laser diodes. The majority of the lasers (channels 6–30) are within a +2° to −6° range of the FOV [58]. The integrated S220 system utilizes an RTK enabled INS (0.1° attitude and azimuth resolution) with an external base station and a manufacturer stated relative final product accuracy of ±5 cm. The system includes an integrated Sony a6000 mirrorless camera that is triggered automatically during flight. These JPG photographs are used to apply realistic RGB colors to the point cloud in postprocessing.

Two flights at 50 m AGL and 5 m/s consisting of 6 parallel flight lines (40 m apart) were carried out. Importantly, prior to the flight lines, two figure 8s were flown to calibrate the IMU. The same figure 8s were repeated after the flight lines prior to landing. Total flight time was ≈10 min. The LiAcquire software (GreenValley International, Berkeley, CA, USA) provided a real-time view of the point cloud generation.

LiAcquire and LiNAV were used for the postprocessing of trajectory data and the geotagging of the RGB photographs. The LiDAR360 software (GreenValley International, Berkeley, CA, USA) was then used to correct the boresight error, carry out a strip alignment, merge individual strips, and calculate quality metrics consisting of analyses of the overlap, elevation difference between flight lines, and trajectory quality.

#### *2.5. Analysis*

The open source CloudCompare Stereo v2.11.3 (https://www.danielgm.net/cc/) (accessed on 14 April 2021) software was used to analyze the point clouds (Table A1). After the initial positional difference between the point clouds was computed, the LiDAR point cloud was coarsely aligned to the SfM point cloud, followed by a refinement with an iterative closest point (ICP) alignment; a minimal sketch of this refinement step is given below. Each point cloud was detrended to remove the slope of the bog surface. The point clouds were then clipped to the same area and compared. Characteristics including the number of neighbor points, point density, height distribution, surface roughness (the distance between a point and the best fitting plane of its nearest neighbors), and the absolute difference between the point clouds were calculated. DSMs at 10 and 50 cm pixel sizes were also created from each dataset. CloudCompare was used to generate the DSMs, rather than Pix4D and LiDAR360, respectively, to ensure that differences in the surfaces were not due to varying interpolation methodology between the different software packages. The average method with nearest neighbor interpolation (in the case of empty cells) was chosen for the rasterization of the point clouds.
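As an illustration of the ICP refinement, the sketch below runs a point-to-point ICP alignment with the open-source Open3D library in place of CloudCompare; the file names and the 0.5 m correspondence threshold are assumptions.

```python
import open3d as o3d

# Sketch of the coarse-to-fine refinement described above: align the
# LiDAR cloud to the SfM cloud with point-to-point ICP.
lidar = o3d.io.read_point_cloud("lidar.ply")  # placeholder file names
sfm = o3d.io.read_point_cloud("sfm.ply")

result = o3d.pipelines.registration.registration_icp(
    source=lidar, target=sfm,
    max_correspondence_distance=0.5,  # metres; generous for a coarse start
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
lidar.transform(result.transformation)  # apply the refined rigid transform
print(result.fitness, result.inlier_rmse)
```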

To classify the hummocks and hollows, the DSMs were first normalized in MATLAB v2020b (MathWorks, Natick, MA, USA) by subtracting the median elevation in a sliding window of 10 × 10 m [59]. Hummocks were defined as having a height range of 5–31 cm above the median, and hollows as >5 cm below the median. These thresholds were defined on the basis of expert knowledge of the site. In the SfM data, this corresponded to the 55th–90th percentile of the height for hummocks and the bottom 38th percentile for hollows. In the LiDAR data, it corresponded to the 48th–71st percentile of the height for hummocks and the bottom 40th percentile for hollows. A decision tree was used to assign the DSM pixels to hummock, hollow, and other classes based on their normalized height value, as sketched below.
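A minimal reimplementation of this normalization and thresholding, using SciPy's sliding median filter rather than the study's MATLAB code (thresholds mirror the rule described above; edge handling is left to the filter's default):

```python
import numpy as np
from scipy.ndimage import median_filter

def classify_microforms(dsm, pixel_size_m, window_m=10.0):
    """Normalize a DSM by the median elevation in a sliding window and
    threshold into microform classes (hummock: 5-31 cm above the local
    median; hollow: >5 cm below it). dsm is a 2D array of heights (m)."""
    win = max(1, int(round(window_m / pixel_size_m)))
    normalized = dsm - median_filter(dsm, size=win)
    classes = np.zeros(dsm.shape, dtype=np.uint8)              # 0 = other
    classes[(normalized >= 0.05) & (normalized <= 0.31)] = 1   # hummock
    classes[normalized < -0.05] = 2                            # hollow
    return classes
```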

To quantify the shape and compare the apparent complexity of the microforms from the SfM and LiDAR, we calculated the 3D Minkowski–Bouligand fractal dimension (D) of the surface of the bog [60]. The 3D fractal dimension combines information about an object/surface across different spatial scales to provide a holistic quantification of the shape [61]. The point clouds were converted to triangular meshes at rasterization scales of 10 and 50 cm in CloudCompare. The fractal dimension, D, was then calculated following the methodology described in [61]. The fractal dimension is a scale-independent measure of complexity. As defined by [62], fractals are "used to describe objects that possess self-similarity and scale-independent properties; small parts of the object resemble the whole object". Here, D is a measure of the complexity of the bog surface as modeled by the triangular mesh objects from the SfM and LiDAR data sources. The value of D ranges from 0 to 3, with higher values indicating more complexity in the shapes. In this case, the complexity quantified by D is related to the irregularity pattern [61], with more regular shapes having lower values.
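Conceptually, a box-counting estimate of D counts occupied boxes N(s) at several box sizes s and takes the slope of log N versus log(1/s). The sketch below applies this idea directly to a point cloud for illustration; the study computed D on triangular meshes, so the input and scales here are assumptions.

```python
import numpy as np

def fractal_dimension_3d(points, scales=(0.1, 0.2, 0.4, 0.8, 1.6)):
    """Box-counting estimate of the Minkowski-Bouligand dimension of a
    point cloud (n x 3, metres): count occupied voxels N(s) at several
    box sizes s and fit the slope of log N(s) against log(1/s)."""
    pts = np.asarray(points)
    pts = pts - pts.min(axis=0)  # shift to the origin
    counts = []
    for s in scales:
        voxels = np.unique(np.floor(pts / s).astype(np.int64), axis=0)
        counts.append(len(voxels))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)), np.log(counts), 1)
    return slope  # D in [0, 3]; higher = more complex surface
```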

Lastly, empirical semivariograms were used to compare the scale dependence of the hummock–hollow microtopography to determine whether the scale of the vegetation pattern captured by the SfM and LiDAR datasets is similar. The spatial dependence of the height of the vegetation can be inferred from the semivariogram which plots a dissimilarity measure (γ) against distance (h). The range, sill, and nugget describe the properties of the semivariogram. The range indicates the spatial distance below which the height values are autocorrelated. The sill indicates the amount of variability and the nugget is a measure of sampling error and fine-scale variability. Previous application of empirical semivariograms to terrestrial LiDAR data from a peatland indicated the hummock–hollow microtopography had an isotropic pattern with a range of up to 1 m, and in sites with increased shrub cover, the range increased to 3–4 m [27]. The empirical semivariograms were calculated in MATLAB v2020b for a subset of the open bog that did not include boardwalks.
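An empirical semivariogram of this kind can be computed directly from point heights by binning squared height differences by pair separation; the range, sill, and nugget are then read off (or fit) from the resulting curve. A minimal sketch (subsampling, bin settings, and array names are illustrative, not the study's MATLAB code):

```python
import numpy as np

def empirical_semivariogram(xy, z, max_h=10.0, n_bins=20, sample=2000):
    """Empirical semivariogram of vegetation height: gamma(h) =
    0.5 * mean((z_i - z_j)^2) over point pairs binned by separation h.
    Pairs are drawn from a random subsample to keep the O(n^2) cost down."""
    rng = np.random.default_rng(42)
    idx = rng.choice(len(z), size=min(sample, len(z)), replace=False)
    xy, z = np.asarray(xy)[idx], np.asarray(z)[idx]
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    g = 0.5 * (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)  # unique pairs only
    d, g = d[iu], g[iu]
    bins = np.linspace(0.0, max_h, n_bins + 1)
    which = np.digitize(d, bins)
    gamma = np.array([g[which == i].mean() if np.any(which == i) else np.nan
                      for i in range(1, n_bins + 1)])
    h = 0.5 * (bins[:-1] + bins[1:])  # bin centers
    return h, gamma
```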

In order to generate the PLY files (i.e., Polygon file format, .ply) needed for VR and AR visualization, the horizontal coordinates (UTM) were reduced in size (i.e., the number of digits before the decimal) using a global shift. In this case, 459,400 was subtracted from the easting and 5,028,400 from the northing. Binary PLY files were then generated with CloudCompare.
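The global shift and binary PLY export can be reproduced in a few lines; the sketch below uses Open3D for the I/O (file names are placeholders; the shift values are those given above). The shift matters because large UTM coordinates lose precision in the 32-bit floats used by most VR viewers.

```python
import numpy as np
import open3d as o3d

# Apply the global shift described above before writing binary PLY.
SHIFT = np.array([459400.0, 5028400.0, 0.0])  # easting, northing (from text)

pcd = o3d.io.read_point_cloud("bog_utm.ply")  # placeholder file name
pcd.points = o3d.utility.Vector3dVector(np.asarray(pcd.points) - SHIFT)
o3d.io.write_point_cloud("bog_vr.ply", pcd, write_ascii=False)  # binary PLY
```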

Both VR (Section 2.6) and AR (Section 2.7) visualizations were compared to a standard web-based 3D point cloud viewer as a baseline. We used a Windows server implementation of Potree v1.8 [63], a free open-source WebGL-based point cloud renderer, to host the point clouds (https://potree.github.io/) (accessed on 14 April 2021). The Potree Converter application was used to convert the LAS files (.las) into the Potree file and folder structure used by the web-based viewer for efficient tile-based rendering. In addition to navigation within the point cloud, user interactions include measurements of distance and volume and the generation of cross sections.

#### *2.6. Virtual Reality Visualization*

We tested the VR visualization of the point clouds with an Oculus Quest 2 headset (Facebook Technologies LLC, Menlo Park, CA, USA) (Table A1). The Oculus Quest 2, released in 2020, is a relatively low-cost, consumer-grade standalone VR HMD. It has 6 GB RAM and uses the Qualcomm Snapdragon XR2 chip running an Android-based operating system. The model we tested had 64 GB of internal storage. The fast-switching LCD display has 1832 × 1920 pixels per eye at a refresh rate of 72–90 Hz (depending on the application, with 120 Hz potentially available in a future update).

In order to access point cloud visualization software, the Oculus Quest 2 was connected to a Windows 10 PC through high-speed USB 3. In this tethered mode, the Oculus Link software uses the PC's processing to simulate an Oculus Rift VR headset and to access software and data directly from the PC. The PC used had an Intel Core i7 4 GHz CPU, 64 GB RAM, and an NVIDIA GeForce GTX 1080 GPU. The PLY files were loaded in VRifier (Teatime Research Ltd., Helsinki, Finland), a 3D data viewer package that runs on Steam VR, a set of PC software and tools that allow for content to be viewed and interacted with on VR HMDs. The two touch controllers were used to navigate through the point clouds as well as to capture 2D and 360-degree "photographs" from within the VR environment.

As a simple and low-cost alternative VR visualization option, we also tested two Google Cardboard compatible viewers: a DSCVR viewer from I Am Cardboard (Sun Scale Technologies, Monrovia, CA, USA) and a second generation Google Official 87002823-01 Cardboard viewer (Google, Mountain View, CA, USA) (Table A1). These low-tech viewers can be used with both iOS and Android smartphones by placing the phone in the headset and viewing VR content through the built-in lenses. The LiDAR and SfM point clouds were uploaded in PLY format to Sketchfab (https://sketchfab.com) (accessed on 14 April 2021), an online platform for hosting and viewing interactive and immersive 3D content. The models were accessed through the smartphone's web browser. The entire LiDAR point cloud was viewable with the smartphone's web browser, but the SfM model was subset to a 0.3 ha area of the open bog and a 0.4 ha area of the treed bog due to the 200 MB maximum file size upload limit of our Sketchfab subscription. The PLY models were scaled in Sketchfab relative to a 1.8 m tall observer.

#### *2.7. Augmented Reality Visualization*

In comparison to consumer VR systems, AR head-up-displays and smart glasses capable of visualizing scientific data are predominantly expensive enterprise grade (e.g., Magic Leap 1, Epson Moverio series, Microsoft Hololens, Vuzix Blade, etc.) systems. Therefore, we tested mobile AR using webhosted data viewed through an iOS/Android smartphone application. The point clouds in PLY format were uploaded to Sketchfab, and the models were accessed in AR mode via the Sketchfab iOS/Android smartphone application. The entire LiDAR point cloud was viewable with the smartphone application, but the SfM model was subset to an area of 788 m<sup>2</sup> due to RAM limitations of the phones tested (i.e., iPhone XR, 11 Pro, 12 Pro and Samsung Galaxy 20 FE).

#### **3. Results**

#### *3.1. SfM-MVS Point Cloud*

Each of the 333 bog photographs was geotagged with a fixed PPK solution (AR ratio μ = 877.3 ± 302, range of 3–999.99). The precision of the calculated positions was μ = 1.2 ± 0.6 cm (easting), μ = 1.6 ± 0.7 cm (northing), and μ = 3.2 ± 1.3 cm (vertical). The final ground sampling distance (GSD) of the bog point cloud was 1.2 cm. Pix4D found a median of 58,982 keypoints per photograph and a median of 26,459.9 matches between photographs. Total processing time in Pix4D was ≈2.5 h (Intel® Xeon® Platinum 8124M CPU @ 3.00 GHz, 69 GB RAM). The average density of the final point cloud was 2677.96 points per m<sup>3</sup> (40,605,564 total points).

In the field south of the bog, the point cloud was generated with a GSD of 1.8 cm, and similar to the bog dataset, all photographs were geotagged with a fixed PPK solution. Pix4D found a median of 75,786 keypoints per photograph and a median of 23,202.9 matches between photographs. The positional accuracy of this point cloud in relation to the checkpoints was RMSEx = 5 cm, RMSEy = 6 cm, and RMSEz = 5 cm. These values serve as an estimate of the positional accuracy of the bog point cloud.

#### *3.2. LiDAR Point Cloud*

The individual LiDAR strip quality metrics calculated by LiDAR360 are shown in Table 1. These metrics are calculated for each entire strip, including edges and turns that were not used in the final dataset. At an acquisition height of 50 m AGL, the width of the individual LiDAR strips was ≈80 m with neighboring strips overlapping by 50–52%. As expected, the treed portion of the bog had the greatest elevation difference between neighboring strips (13.1–17.3 cm) compared to the open bog predominantly comprised of hummocks and hollows (5.8–7.1 cm).

**Table 1.** Quality metrics of the full individual LiDAR strips calculated from LiDAR360.


<sup>1</sup> These values represent the full strips, including edges without overlap, turns, and infrastructure (sheds) that were cut from the final dataset.

#### *3.3. Point Cloud Comparisons*

The final SfM and LiDAR point clouds covering 1.71 ha are shown in Figure 3. The SfM dataset has 30,413,182 points while the LiDAR dataset has 1,010,278 points (Table 2). As a result, the SfM point cloud is 19.6 times larger (LAS format) than the LiDAR dataset. The data acquisition time was nearly double for the SfM (18 vs. 10 min), and the computation time to generate the 3D point cloud was at least 10 times greater than for the LiDAR dataset. Considering the time needed to process the geotags and prepare the photographs (i.e., convert from CR2 to JPEG and color correct if necessary), the SfM point cloud takes even longer to generate.

The increased detail obtained from the ≈30× more points in the SfM dataset is apparent in Figure 3, resulting in a more realistic reconstruction of the bog. The several "no data" areas in the LiDAR dataset (shown in black) and the linear pattern of point distribution are artefacts of the mechanical laser diodes spinning during acquisition in a system hard mounted on a moving platform (Figure 2b).

Figure 4 illustrates the point density of the two datasets. The SfM dataset has an average density of 570.4 ± 172.8 pts/m<sup>2</sup>, while the LiDAR dataset has an average density of 19.4 ± 7.5 pts/m<sup>2</sup>. In both datasets, the lowest density is in the treed bog.

**Figure 3.** The final 1.71 ha point clouds for the bog study area from (**A**) SfM and (**B**) LiDAR at three increasing levels of scale.



<sup>1</sup> Includes transit from the takeoff area and the two sets of figure 8s required for the LiDAR INS calibration after takeoff and before landing; does not include time on the ground between flights. <sup>2</sup> These files contain only six columns: x, y, z coordinates and R, G, B color intensity. <sup>3</sup> Does not include the time needed to convert or geotag the photographs for the SfM.

**Figure 4.** Point density of the (**A**) SfM and (**B**) LiDAR datasets. The number of neighbors is the count of points within a sphere with a 1 m radius. The pts/m<sup>2</sup> represents the number of points within a surface area of 1 m<sup>2</sup>. The distribution next to the color bars represents the histogram of the height values. No data are shown in black.

Despite the differences in point density, the gross microtopography and the presence of both large and small trees can be seen in both datasets (Figure 5). A *t* location-scale distribution was found to best fit the vegetation height from both datasets based on the AIC criterion (Table 3, Figure 6). This distribution better represents data with a heavier tail (i.e., more outliers) than a Gaussian one. In this case, the relatively few points representing the trees are the outliers. The distribution is described by three parameters: location (μ), scale (σ), and shape (ν). Larger values of ν indicate a lighter tail and, therefore, a distribution more similar to a Gaussian. A two-sample Kolmogorov–Smirnov test indicates the height values are from different continuous distributions (k = 0.11, *p* = 0, α = 0.05). Figure 6 shows that the SfM's distribution is slightly wider (σ = 1.591) than that of the LiDAR (σ = 0.1151).
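For reference, SciPy's Student's *t* distribution with fitted degrees of freedom (ν), location (μ), and scale (σ) is exactly a *t* location-scale distribution, so the reported fits and the two-sample Kolmogorov–Smirnov test can be sketched as follows (a minimal illustration, not the study's MATLAB code; array names assumed):

```python
import numpy as np
from scipy import stats

def compare_height_distributions(z_sfm, z_lidar):
    """Fit t location-scale distributions (scipy.stats.t: df = shape nu,
    loc = mu, scale = sigma) to the two height samples and run the
    two-sample Kolmogorov-Smirnov test used above."""
    nu_s, mu_s, sigma_s = stats.t.fit(z_sfm)
    nu_l, mu_l, sigma_l = stats.t.fit(z_lidar)
    ks = stats.ks_2samp(z_sfm, z_lidar)  # statistic k and p-value
    return (mu_s, sigma_s, nu_s), (mu_l, sigma_l, nu_l), ks
```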

**Figure 5.** Subset of the point clouds illustrating the height of the vegetation (m ASL) for a subset of the point cloud from (**A**) SfM and (**B**) LiDAR. The distribution next to the color bars represents the histogram of the height values.

**Table 3.** Parameters of the best-fit *t* location-scale distributions of height (m ASL) from the two datasets; μ = location, σ = scale, ν = shape, CI = confidence interval.


**Figure 6.** Best-fit *t* location-scale distribution probability density functions of height for the SfM and LiDAR datasets.

Prior to alignment in CloudCompare, there was a 15 ± 22 cm vertical and 50 ± 7 cm horizontal offset between the point clouds. After ICP, the horizontal offset decreased to 10.5 ± 11.5 cm. The sparseness of the LiDAR point cloud precluded a closer horizontal alignment. Vertically, the difference in height varies by spatial location (average of 4 ± 13 cm) (Figure 7) due to a more pronounced depression in the center of the SfM-MVS dataset, where the bog has a higher density of hollows. However, when the uncertainties of the height values of both the SfM and LiDAR surfaces are taken into account, the height differences are minimal for the majority of the study area.

**Figure 7.** Difference in height between the SfM and LiDAR point clouds. The distribution next to the color bar represents the histogram of the difference in height.

The values of surface roughness (Figure 8) reveal similarities across both datasets, with the trees and boardwalks differentiated from the hummocks and hollows by higher values of roughness. In the SfM dataset, hummocks (roughness ≈ 0.1–0.35) can be better differentiated from hollows (roughness ≈ 0.06). In the LiDAR dataset, the sparseness of the point cloud results in an incomplete definition of the hummocks (roughness ≈ 0.05–0.29).

**Figure 8.** Surface roughness for the (**A**) SfM and (**B**) LiDAR datasets, calculated with a kernel size of 1 m (the radius of a sphere centered on each point). The distributions next to the color bars represent the histograms of the roughness values.

After rasterization, the density of the points averaged per pixel of the 10 cm DSM was 17.6 ± 7.3 pts/px for the SfM and 0.6 ± 1.1 pts/px for the LiDAR. At a 50 cm pixel size, the density increased to 437.5 ± 144.4 pts/px for the SfM and 14.5 ± 8.3 pts/px for the LiDAR. The low point density of the LiDAR at the 10 cm pixel size resulted in interpolation artefacts. From the DSMs, the percentages of classified hummocks and hollows are similar between the SfM and LiDAR classifications (Table 4). In both cases, the proportions of the two microforms decreased with increasing pixel size, most notably for the SfM hummock class (a loss of 5%). For both pixel sizes, the estimated total area of hummocks and hollows is lower from the LiDAR DSM than from the SfM.


**Table 4.** Percentage of hummocks (HU) and hollows (HO) in the study area classified from the SfM and LiDAR DSMs.

Comparisons of transects across the profile of a tree and across hummocks and hollows (Figure 9), extracted from the 10 cm DSM of each dataset, reveal similarities in the heights along the transects. The remaining horizontal offset between the two datasets is most apparent in the profile of the tree (Figure 9a), but it can also be seen, to a lesser degree, in the hummocks and hollows (Figure 9b). The incomplete resolution of the tree crown, with sections dropping to ground level, can be seen in the transect across the tree and is due to the low density of the LiDAR. At the finer height scale of the hummocks and hollows transect, a vertical offset of 10–20 cm can be seen between the SfM and LiDAR data. This transect is located near the center of the study area and, as can be seen in Figure 7, the difference in height between the datasets in that section is 9–21 cm.

**Figure 9.** Comparison of transects across a profile of a tree (**A**) and hummocks and hollows (**B**) for the SfM and LiDAR datasets. The panels on the left illustrate the DSMs from which the transects were extracted.

The 3D fractal dimension reveals opposite patterns of complexity between the 10 and 50 cm scales for the SfM and LiDAR derived triangular meshes (Table 5). At both scales, the LiDAR data have higher values of D, indicating greater complexity of the 3D shape of the bog surface. However, this is likely influenced by the sparseness of the point cloud, which results in interpolation artefacts that produce artificial complexity. The lowest value of D (1.36), obtained for the 10 cm SfM data, indicates that at that scale, the microtopography of the bog is more regular. At 50 cm, some of the lawns (height values spanning ±5 cm around the median) that are intermediate between the hummocks and the hollows are grouped together with either the neighboring hummock or hollow, resulting in a more distinct boundary between microforms, a more irregular pattern, and a greater value of D (1.81).


**Table 5.** Value of the 3D Minkowski–Bouligand fractal dimension (D) for the SfM and LiDAR.

Similar to the findings of [27], we also found that the bog has an isotropic (nondirectional) semivariogram (from both SfM and LiDAR). From the SfM, the range was approximately 2.5 m with a sill of 0.06 and a nugget of 0.01. The LiDAR had similar results with a range of approximately 2.7 m, a sill of 0.05, and a nugget of 0.01. The semivariograms from both datasets support a hummock–hollow pattern. The longer range value of the LiDAR indicates it was able to resolve a less well-defined pattern between the hummocks and hollows than the SfM.

Lastly, based on the system implementations and acquisition of the data, Table 6 summarizes the main strengths and weaknesses of SfM and LiDAR data acquisition for 3D surface reconstruction of the bog.



<sup>1</sup> System described here has a low user-friendliness (i.e., complex to operate) but integrated systems (e.g., Phantom 4 RTK [55]) are more user friendly. <sup>2</sup> System described here has a high potential for component failure unlike integrated systems. <sup>3</sup> The DSLR setup described here requires precise balancing of the camera on the gimbal, which can be difficult and time consuming to achieve in the field. This is not a concern for integrated systems.

#### *3.4. Web-Based Visualization*

Both point clouds could be visualized in full spatial extent through a web browser from both a desktop computer (Figure 10) and a smartphone. Navigation was simple and intuitive using either the mouse (desktop) or swiping across the screen (smartphone). For both datasets, virtually no lag was experienced when interacting with the point clouds. The basic tools, which included measuring distances and areas and drawing cross sections (Figure 10b), further allowed the user to explore the characteristics of the bog. While interactivity and usability were high, this baseline implementation lacked the "sense of being present" within the data. The overall definition of the detail in the point clouds depended on the speed of the internet connection. The server used Cat6 Ethernet to a gigabit broadband connection. From the user side, slow connections, especially on a mobile browser (e.g., HSPA-3G 3.5–14 Mbps), resulted in the point clouds requiring more time to load at full resolution, especially for the SfM model (i.e., tens of seconds). On an LTE mobile internet connection (197 Mbps), there was no difference in the speed at which the models would load (i.e., <5 s) in comparison to a high-speed Wi-Fi or Ethernet connection (i.e., 150–800 Mbps). This web-based implementation is the simplest to access, requiring the user only to click a URL.

**Figure 10.** Screen captures illustrating the (**A**) SfM and (**B**) LiDAR point clouds in the web-based Potree viewer. The LiDAR data are shown with the cross-section tool enabled, illustrating that while the microtopography is difficult to see in the point cloud due to the low density of points, when viewed as a cross section, the difference in elevation between the hummocks and hollows is visible. The point clouds can be viewed at https://bit.ly/MB\_SfM (accessed on 14 April 2021) and https://bit.ly/MB\_lidar (accessed on 14 April 2021), respectively.

#### *3.5. Virtual Reality Visualization*

#### 3.5.1. Oculus Quest 2

Similar to the web-based visualization, the full point clouds could be loaded and displayed in the HMD through VRifier (Figure 11). The LiDAR point cloud loaded near-instantaneously, while ≈15–20 s were needed for the SfM model to load. The Oculus Quest 2 provided a fully immersive experience with a higher "sense of being present" in the data than what was achieved by the web-based visualization. In this VR implementation, the importance of point density was apparent. With the SfM model, the user has the "next best" experience to being in the bog in person due to the high level of detail, while the low point density of the LiDAR resulted in a less realistic experience because of the gaps in the data. Similar to the web-based viewer, the ability to easily scale the model with the touch controllers enhanced the immersive experience.

**Figure 11.** Video from VRifier illustrating the experience navigating the SfM and LiDAR point clouds on the Oculus Quest 2 headset. The input PLY files and video are available for download from https://doi.org/10.5281/zenodo.4692367 (accessed on 14 April 2021).

While generation of the PLY files was straightforward, the setup and integration of the Oculus Quest 2 and the desktop PC were more complicated, requiring the installation and configuration of several software packages and drivers. As of April 2021, VRifier was still in development, and not all features had been implemented. While it was possible to navigate through the point cloud and capture 2D and 3D panoramas (Figure 12) from within VRifier, tools to measure distances or areas were not available. When combined, the software packages (i.e., VRifier, Steam, various Oculus services) committed between 1.5 and 3 GB of the PC's RAM and 2.5–3% of the CPU during the visualization of the models.

**Figure 12.** Videos illustrating 360° panoramas of the (**A**) SfM and (**B**) LiDAR point clouds within VRifier. These panoramas are being viewed on the Insta360 Player but can be opened by most 360° photograph viewers. The 360° panoramas are available for download from https://doi.org/10.5281/zenodo.4692367 (accessed on 14 April 2021).

One of the most useful options from within VRifier was the generation of the 360° panoramas (Figure 12). These files (PNG format, .png) can be readily shared, and many free programs are available to view them in 360° format. While they do not provide the navigation element of the immersive experience, these files are a suitable alternative for sharing geospatial data visualization.

#### 3.5.2. Google Cardboard

Other than the web browser, the Google Cardboard headsets were the easiest option for visualizing the 3D models. However, the quality of the stereoscopic 3D effect depended on the smartphone model used due to differences in screen size. For example, it was not possible to avoid duplication artefacts with the iPhone XR (screen size 6.06") with either viewer, but on the iPhone 11 Pro (screen size 5.85"), both viewers worked well in showing clear 3D content. Both viewers are intended to work with screens 4–6" in size. With the Google 87002823-01 Cardboard viewer, navigation through teleportation within the model was straightforward, but this did not work with the DSCVR headset, in which the experience was more similar to viewing a static 360° 3D photograph. Despite the 3D effect, the experience was less immersive than with the Oculus Quest 2 implementation.

#### *3.6. Augmented Reality Visualization*

We found the density of the 3D point clouds and the resultant file sizes to be a limiting factor in the usability of the mobile AR viewer. While the entire LiDAR point cloud (14 MB in .ply) could be opened in the Sketchfab application (Figure 13b), the SfM model had to be reduced in overall extent to 788 m<sup>2</sup> (20 MB in .ply) (Figure 13a). In addition, the relatively small screen size of the smartphones did not allow for fine scale investigation of the models. Nevertheless, the ability to "walk through" and inspect the models from different viewpoints by simply rotating the phone allowed for a partially immersive experience. With the LiDAR data, the sparseness of the point cloud resulted in the user being able to see through the model to the real-world ground below (Figure 13b), and the hummock–hollow microtopography was very difficult to discern. From the SfM model, gross microtopographic features could be seen on the screen, but because of the small spatial extent of the subset dataset, very little of the bog's spatial structure could be examined. Table 7 summarizes a comparison between the main considerations of the different point cloud visualizations in VR and AR tested here for the SfM and LiDAR point clouds.


**Table 7.** Qualitative comparison between main considerations for visualization of the LiDAR and SfM point clouds.

<sup>1</sup> A local PC installation of the Potree viewer that does not require access to the internet to view the models is available. <sup>2</sup> After initial setup, internet access is not required. <sup>3</sup> Models could be saved locally to the smartphone and accessed without internet.

**Figure 13.** Videos illustrating a screen recording of the AR visualization of the (**A**) SfM and (**B**) LiDAR point clouds through the iOS Sketchfab application. The models can be viewed in AR at https://skfb.ly/onuU9 (LiDAR) (accessed on 14 April 2021) and https://skfb.ly/onuUs (SfM) (accessed on 14 April 2021).

#### **4. Discussion**

Microtopography and vegetation patterns at various scales can provide important information about the composition and environmental gradients (e.g., moisture and aeration) in peatlands. Ecological functions, greenhouse gas sequestration and emission, and hydrology can further be inferred from detailed analyses of the vegetation patterns [27,43]. As expected, our study revealed differences between the SfM and LiDAR bog microtopography characterizations. The greatest difference is the spatial detail defining the microforms in the point clouds or DSMs. This is a result of the varying point densities, i.e., 570.4 ± 172.8 pts/m<sup>2</sup> from the SfM versus 19.4 ± 7.5 pts/m<sup>2</sup> from the LiDAR. Despite being sparser than the SfM, the UAS LiDAR data are considerably higher in density than conventional airborne LiDAR data from manned aircraft due to the low altitude of the UAS data collection. For example, airborne LiDAR data over the same study area produced a point cloud with a density of 2–4 pts/m<sup>2</sup> [59]. Similarly, the authors in [64] reported a point density of 1–2 pts/m<sup>2</sup> from airborne LiDAR for wetlands in Eastern Canada. Nevertheless, the point density achieved here for the LiDAR is lower than that reported by other UAS systems used to study forested ecosystems (e.g., up to 35 pts/m<sup>2</sup> [65]).

Contrary to most forest ecosystems with a solid mineral soil ground layer, the ground layer of the bog is composed of living *Sphagnum* sp. moss over a thick peat column (several meters) with high water content, which prevents the pulses from encountering a solid non-vegetated surface below. Furthermore, the shrubs that comprise the hummocks have a complex branch architecture. A laser pulse encountering vegetation is likely to undergo foliage structural interference, resulting in a reduced amplitude of return in comparison to solid open ground [66]. Luscombe et al. (2015) [67] showed that dense bog vegetation disrupts the return of the laser pulses and can result in an uncertain representation of the microform topography. The authors in [22,25] found that penetration of the laser pulses into the hummock shrub canopy was low from airborne LiDAR because the vegetation blocked the pulse interaction with the ground beneath the hummocks; similarly, our results did not show multiple returns over the hummocks. As can be seen in the cross section of the LiDAR point cloud (Figure 9b), the points follow the elevation of the top of the canopy. A similar phenomenon has been noted in other ecosystems with short dense vegetation, such as crops and grasslands [27]. The SfM also cannot distinguish between the tops of the hummocks and the moss ground layer beneath. Our results were also similar to those of the authors in [23,24], who found that exposed *Sphagnum* sp. mosses are good planar reflectors for LiDAR, which allows for mapping surface details in open bogs.

As input to models that require a DSM as part of the workflow or as a covariate, e.g., peat burn intensity mapping [68], biomass estimation [59], and peat depth estimation [21], either the SfM or LiDAR would be sufficient. Both retain the gross microtopography of the bog, with similar semivariogram ranges and complexity (at the 50 cm scale). LiDAR should be used with caution at fine scales of interpolation due to the artefacts introduced from the low point density. Where fine scale detail is required (<10 cm), the SfM provides better results.

While both technologies provide valuable datasets of the bog, they are optimized for different scenarios (Table 6). The SfM dataset is better suited for studies that require fine spatial detail over a smaller area (<10 ha). The longer time for data acquisition and processing makes this technology more feasible for localized studies. In contrast, the more efficient LiDAR is better suited to acquiring data over larger areas at lower spatial detail. The point density of the LiDAR could be increased, at the expense of total area covered, by flying at a lower altitude and a slower flight speed, but further testing is required to determine by how much in this ecosystem. Both payloads are of moderate weight, 4.5 kg for the SfM and 4.8 kg for the LiDAR (Table 6), and as such require a UAS with sufficient payload capacity (e.g., the M600P used in our study).

When manipulating the point clouds on a desktop PC or viewing them through the web-based Potree viewer, the difference in file size (1 GB for the SfM vs. 51 MB for the LiDAR LAS files) is not apparent when navigating within or interacting with the dataset. Even with a slow mobile internet connection, the Potree viewer remained usable. The file size was also not an important consideration when viewing the point clouds in VR with the Oculus Quest 2. Because the HMD is tethered to the PC during this operation and the desktop computer renders the data, the full datasets can be readily interacted with. When mobile VR (e.g., Google Cardboard) or mobile AR was used, the file size of the SfM dataset hindered the user experience. The main limitation was the file size limit of the cloud-based hosting platform (i.e., Sketchfab) for mobile VR and the RAM capacity of the smartphones for AR. Potentially, the commercial AR implementations developed for medical imaging would not have the same file size restrictions, although these were not tested here.

All VR and AR visualizations provided a sense of agency through the user's ability to turn their head or smartphone and explore the bog through a full 360° panorama and change their perspective or scale of observation. While this ability is also true for the 360° panoramas captured within VRifier, dynamic agency was only fully achieved through motion tracking in the VR and AR implementations. As described by [69], this is an important distinction between a desktop digital experience and immersive technology. Such transformative developments in visualization lead to the user becoming "part of" the digital representation as opposed to the digital content remaining distinct from the user's experience [69]. Of the VR and AR options tested here, only the Oculus Quest 2 rendered a visually immersive experience. In comparison to other VR implementations such as CAVEs and video walls with smart glasses, the full 360° panoramic view of the VR HMD cannot be matched [70].

Visualization technology is important because it allows users to study areas of interest in virtual 3D environments, and it facilitates the interaction of groups in different locations, the collection of data in time and space, and the ability to view the object studied in environments at varying scales. In addition to its use in scientific queries, the immersive digital content is a further benefit for educational material and for the individual exploration of questions related to the datasets. Making virtual models of the region of interest accessible with immersive VR or AR technology greatly benefits the overall understanding of and interest in the subject matter [71,72]. Because VR/AR content is interactive, the datasets can now be manipulated by each person with different questions or interests.

With the popularization of this technology for gaming and entertainment, there has been both a surge in development and an improvement in the quality of the hardware, as well as a decrease in the price of consumer-grade VR headsets. It is therefore becoming more feasible to equip teams with this technology, both for meetings and for virtual collaboration with datasets and colleagues from anywhere in the world. Popular for virtual tech support, AR lags behind VR in technological maturity for geospatial visualization. Nevertheless, with more compact datasets, such as the LiDAR point cloud, these 3D scenes can be displayed on most modern smartphones, making interactive files both easily accessible and readily shareable. With the majority of VR and AR development occurring in fields other than the geospatial sciences (e.g., gaming, marketing, telepresence), there is a need for improved functionality and for specialized software able to effectively handle the large files produced by technologies such as SfM and LiDAR [73].

Despite their promise, neither VR nor AR can replicate virtual environments with sufficient detail or fidelity to be indistinguishable from the real world. They are not a substitute for fieldwork or firsthand in situ field experience. Rather, they are tools to augment and enhance geospatial visualization, data exploration, and collaboration.

#### **5. Conclusions**

It is only a matter of time before peatland ecosystem models (e.g., [74–76]) become adapted for spatially explicit 3D input. Fine-scale microtopographic ecohydrological structures represented from either UAS SfM or LiDAR would provide the resolution needed for models to quantify how peatland structure and function change over time [67], which can lead to insights into ecohydrological feedbacks [43]. We show that vegetation structure can be reliably mapped from UAS platforms using either SfM or LiDAR. This is important in sites such as Mer Bleue, where the spatial structure of the peatland accounts for 20–40% of the vegetation community distribution [43] and the associated ecohydrology. Given the scarcity of UAS LiDAR studies in peatlands (compared to the SfM literature), additional research in peatlands (and other wetlands) is essential. New, relatively low-cost LiDAR technologies, such as DJI's Zenmuse L1 (point rate of 240,000 pts/s, up to 3 returns, and manufacturer-stated high vertical and horizontal accuracy), could provide new opportunities to expand the use of LiDAR in peatlands and other ecosystems.

**Author Contributions:** Conceptualization, M.K. and O.L.; Data curation, M.K.; Formal analysis, M.K. and J.P.A.-M.; Investigation, O.L.; Methodology, M.K.; Writing—original draft, M.K., J.P.A.-M. and O.L.; Writing—review & editing, M.K., J.P.A.-M. and O.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Canadian Airborne Biodiversity Observatory (CABO) and the Natural Sciences and Engineering Research Council Canada. The APC was funded by MDPI.

**Data Availability Statement:** The data presented in this study (LAS files) are available on request from the corresponding author following the CABO data use agreement from https://cabo.geog.mcgill.ca (accessed on 14 April 2021).

**Acknowledgments:** We thank Jacky Heshi from CanDrone for technical support with the LiAIR S220. We also thank the three anonymous reviewers, Nicolas Cadieux, Kathryn Elmer, Deep Inamdar, and Raymond J. Soffer for their comments, which helped improve the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

**Table A1.** Summary of equipment and software used in this study.


<sup>1</sup> There is a free plan for individuals, with limitations on uploaded file size. <sup>2</sup> Free for users to view models. Open source and free software are designated with an asterisk (\*).

#### **References**


### *Review* **Application of Drone Technologies in Surface Water Resources Monitoring and Assessment: A Systematic Review of Progress, Challenges, and Opportunities in the Global South**

**Mbulisi Sibanda 1,\*, Onisimo Mutanga 2, Vimbayi G. P. Chimonyo 3,4, Alistair D. Clulow 5, Cletah Shoko 6, Dominic Mazvimavi 7, Timothy Dube <sup>7</sup> and Tafadzwanashe Mabhaudhi <sup>3</sup>**


**Abstract:** Accurate and timely information on surface water quality and quantity is critical for various applications, including irrigation agriculture. In-field water quality and quantity data from unmanned aerial vehicle systems (UAVs) could be useful in closing spatial data gaps through the generation of the near-real-time, fine resolution, spatially explicit information required for water resources accounting. This study assessed the progress, opportunities, and challenges in mapping and modelling water quality and quantity using data from UAVs. To achieve this research objective, a systematic review was adopted. The results show modest progress in the utility of UAVs, especially in the global south. This could be attributed, in part, to high costs, a lack of relevant skills, and the regulations associated with drone procurement and operation. The progress is further compounded by a general lack of research focusing on UAV application in water resources monitoring and assessment. More importantly, the lack of the robust and reliable water quantity and quality data needed to parameterise models remains a challenge. However, there are opportunities to advance scientific inquiry for water quality and quantity accounting by integrating UAV data and machine learning.

**Keywords:** drones; green water; integrated water management strategies; remote sensing; smallholder farms; water productivity

#### **1. Introduction**

Freshwater accounts for only 2.5% of the total amount of water on the earth's surface, and about 1.5% of that amount is accessible for biophysical processes [1]. Meanwhile, freshwater is a fundamental input in agricultural production and numerous manufacturing industries, and a basic need for domestic uses.

**Citation:** Sibanda, M.; Mutanga, O.; Chimonyo, V.G.P.; Clulow, A.D.; Shoko, C.; Mazvimavi, D.; Dube, T.; Mabhaudhi, T. Application of Drone Technologies in Surface Water Resources Monitoring and Assessment: A Systematic Review of Progress, Challenges, and Opportunities in the Global South. *Drones* **2021**, *5*, 84. https://doi.org/10.3390/drones5030084

Academic Editors: Diego González-Aguilera and Pablo Rodríguez-Gonzálvez

Received: 19 July 2021; Accepted: 26 August 2021; Published: 28 August 2021; Corrected: 20 May 2022


Specifically, agriculture accounts for about 70% of total global freshwater usage, mostly through irrigation [2,3]. Intense competition for water between different sectors will increase as the world population grows from the current 7.8 billion to about 9.7 billion by 2050. Consequently, global agricultural production is expected to increase by 60 to 70% [4], which will substantially increase water demand.

In the global south, particularly Southern Africa, water resources are unevenly distributed, and this is compounded by climate variability (i.e., an unpredictable seasonality of precipitation). The quality and quantity of available water affect all water users, including crop irrigation. Based on recent findings presented by Bronkhorst et al. [5], irrigation agriculture contributes 25–30% of South Africa's agricultural production and is responsible for up to 90% of high-value crop production and 25 to 49% of industrial crop production; however, it uses 60% of freshwater resources [5,6]. Meanwhile, urban and rural water use (including domestic use) consumes 30% of available water resources. In this regard, there is an urgent need to identify accurate and efficient methods for assessing the quality and quantity of available surface water resources. These are conventionally determined from in situ measurements, which in some cases can be time-consuming and costly [7]. In situ measurements do not always provide adequate spatial representativeness, and the information may not be readily available to users such as farmers. Nor do they always capture the temporal variability of available water, which is necessary for managing crop irrigation [7].

Earth observation and geospatial technologies have been widely proven to provide synoptic, timely, and spatially explicit data on various aspects of the earth's surface, including the spatio–temporal variability of both the quality and quantity of available water [7]. The literature shows that clean water generally absorbs electromagnetic energy from the visible section (green) through the longer wavelengths in the infrared sections [8,9]. This property has enabled water to be detected and discriminated from other land-cover types, and has facilitated the determination of the quantity (surface volumes, spatial extent) and quality of surface water resources from earth observation data and geospatial approaches. Earth observation facilities have proven useful in accurately and efficiently characterising various attributes of surface water resources. These include the Moderate Resolution Imaging Spectroradiometer (MODIS) [10], Landsat [11], SPOT [12], Worldview [13], and the Medium Resolution Imaging Spectrometer (MERIS) [14], to mention a few. Work by Gholizadeh et al. [7] comprehensively details the parameters that have been widely used to estimate water quality using remote sensing techniques. However, Gholizadeh et al. [7] mostly illustrate the application of remote sensing techniques at regional and landscape scales. Additionally, freely available satellite-borne earth observation facilities such as Landsat and the Sentinel 2 multispectral instrument remain inapplicable for local- to farm-scale water resources monitoring and management.
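To make the index-based detection concrete, the following is a minimal, illustrative sketch (not drawn from any reviewed study) of the Normalized Difference Water Index (NDWI) of McFeeters, which exploits exactly this green-versus-NIR contrast; the reflectance arrays and the zero threshold are hypothetical placeholders.

```python
# Sketch: delineating open water from green and NIR reflectance using the
# Normalized Difference Water Index (NDWI, McFeeters 1996). The arrays and
# the 0.0 threshold are illustrative; real imagery needs calibration.
import numpy as np

def ndwi(green: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """NDWI = (Green - NIR) / (Green + NIR); water pixels trend positive."""
    green = green.astype(np.float64)
    nir = nir.astype(np.float64)
    return (green - nir) / np.maximum(green + nir, 1e-9)  # avoid divide-by-zero

# Hypothetical 2x3 reflectance tiles (0-1): left pixels water, right vegetation.
green = np.array([[0.12, 0.11, 0.30], [0.13, 0.10, 0.28]])
nir   = np.array([[0.03, 0.02, 0.45], [0.04, 0.02, 0.43]])

water_mask = ndwi(green, nir) > 0.0   # simple global threshold
print(water_mask)
```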

Unmanned aerial vehicle systems (UAVs), also known as drones, have emerged as a potential alternative for mapping and monitoring the quality and quantity of water resources at local scales. Drones are flexible, relatively cheap in comparison to in situ measurements and spaceborne remote sensing, and can be flown at low altitudes, offering very high spatial resolution data with good prospects of timely and accurately characterising water quality and quantity for smallholder irrigation farms (Xiang et al., 2019). Unlike satellite and other airborne sensors, UAVs can be used to monitor hazards (i.e., after landslides, floods, fires) [15] because they generate near-real-time, fine resolution, spatially explicit information. Despite their usefulness, the application of UAVs in agriculture, rural development, and, more importantly, water resources management remains limited. Although some studies have sought to assess the literature on the utility of drones for water resources assessment [7,16,17], they do not provide a systematic review focused on characterising water quality and quantity in the context of smallholder farming in the global south. To the best of our knowledge, the aforementioned studies did not conduct any bibliometric analysis to evaluate the progress, gaps, and challenges faced by the global south in utilising drone technologies for mapping and monitoring the quality and quantity of surface water bodies. In this regard, this paper seeks to offer an in-depth systematic assessment of the literature on progress, challenges, and opportunities in the utilisation of UAVs for mapping and monitoring surface water resources to improve crop water production in smallholder farms in the global south.

#### **2. Materials and Methods**

This study sought to conduct a systematic literature review on assessing the quality and quantity of water using UAVs. The review was structured into two sections. The first section sought to establish the progress attained using drone technologies to map and monitor open water bodies and to identify existing gaps. The second section then outlined the challenges and opportunities for applying drone technologies in mapping and monitoring open water bodies for improving crop water production. To address these sections, the literature search and analysis were conducted in three phases.

#### *2.1. Phase 1: Literature Search*

The initial step of the literature search was to identify the keywords, terms, and phrases used in the actual search strings. The review's objective was copied and pasted into Google Scholar, and the top three articles—Gholizadeh et al. [7], Lally et al. [16], and Cancela et al. [18]—were downloaded and reviewed for keywords, terms, and phrases. We identified the following keywords and their variants: "unmanned aerial vehicle(s)", "drone(s)", "remote sensing", "GIS", "crop water use", "irrigation", "water productivity", "water use efficiency", "water bodies", "dam(s)", "reservoir(s)" OR "river(s)", "water quality", "water quantity", and "water volume". The query strings used across the databases are presented in Table 1. The searches were restricted to titles, abstracts, and keywords.



SCOPUS, Web of Science, and Science Direct were used to retrieve literature based on the specified keywords. The literature search was framed on the PRISMA statement (Table 1). Since the current work was adding to what has already been established, the literature search was not restricted to the above databases: we used Google Scholar to identify and include articles that SCOPUS, Web of Science, and Science Direct had not indexed. The search covered the period from 1980 to January 2021.

Initially, the literature searches from SCOPUS, Web of Science, and Science Direct retrieved 136, 108, and 73 articles, respectively (Table 1, Figure 1). All retrieved articles were compiled in EndNote in preparation for screening; specifically, the bibliographic information of the articles was used at this stage. The first screening procedure was the removal of duplicates, considering that the key search terms were similar across databases. In the second step, articles not written in English were excluded from the analysis. The following step involved examining whether each article was based on detecting and/or predicting surface water quantity or quality. Full-length articles of the selected abstracts were then sought and downloaded. Subsequently, 214 articles were retained after the screening procedure (Figure 1). A Microsoft Excel spreadsheet was then created to capture the details of each study. The spreadsheet was then reduced to consider only the studies that had specifically utilised drones in mapping and modelling the quality and quantity of surface water bodies; fifty-six articles were retained in the drones database (Table 1). The developed database was later used for quantitative assessment of the captured information.

**Figure 1.** Selection of the studies considered in this review.
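For readers replicating this kind of screening, the bookkeeping can be scripted. The following is a minimal pandas sketch, assuming hypothetical CSV exports with "title" and "language" columns; it is not the exact EndNote/Excel workflow used here.

```python
# Sketch of the screening bookkeeping: merging database exports, dropping
# duplicates, and filtering to English records. File names and column names
# ("title", "language") are hypothetical placeholders.
import pandas as pd

exports = ["scopus.csv", "web_of_science.csv", "science_direct.csv"]
records = pd.concat([pd.read_csv(f) for f in exports], ignore_index=True)

# Step 1: remove duplicates retrieved by the shared search terms
records["title_key"] = records["title"].str.lower().str.strip()
records = records.drop_duplicates(subset="title_key")

# Step 2: exclude articles not written in English
records = records[records["language"].str.lower() == "english"]

print(f"{len(records)} articles retained for full-text screening")
```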

#### *2.2. Phase 2: Data Extraction*

The database created in the previous phase was used to identify and comprehensively outline the existing progress, gaps, challenges, and opportunities in using drone technologies to map and model the quality and quantity of surface water bodies. To address these objectives, the second phase extracted data from the identified articles. Specifically, information on the year the study was conducted, the study site, the type of surface water body, the water quality parameter, the sensor, vegetation indices, prediction or classification algorithms, and the optimal spectral variables derived was captured. The categorical variables were then converted into numerical variables in preparation for data analysis. Meanwhile, key bibliometric information was also extracted during this phase.

The bibliometric data extracted included author names, country, year of publication, the title of the article, the name of the journal, and the abstract. A few studies and grey literature that were not captured by the search were included at this phase. Subsequently, this phase also served as the relevance evaluation and quality assessment stage of the systematic review.

#### *2.3. Phase 3: Data Analysis*

The identified literature and extracted data were subjected to quantitative and qualitative analysis. For the quantitative analysis, basic frequency statistics were computed. Furthermore, exploratory trend analysis was conducted to assess progress on the utility and applicability of satellite- and drone-based sensors in mapping and modelling the quality and quantity of surface water bodies. Bibliometric analysis was also conducted to reveal trends in the key terms used in monitoring surface water bodies. Bibliometric analysis is a quantitative method used to assess published articles and has become helpful in evaluating peer-reviewed studies in a specific field of research [19,20]. The evolutionary trends were inferred by statistically assessing the occurrence and co-occurrence of key terms used to map and monitor surface water bodies using the VOSviewer software [21]. The titles and abstracts of articles in the final database (214 articles, Table 1), as well as those in the database of articles that specifically used drone-derived datasets (56 articles, Table 1), were used in VOSviewer to investigate how concepts and topics have evolved in mapping and monitoring the quality and quantity of surface water bodies. Considering that only the occurrence and co-occurrence of key terms and frequency distributions were computed, a bias assessment was not conducted. Meanwhile, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist (http://www.prisma-statement.org/, accessed on 19 July 2021) was used as a guideline to avoid biased reporting [22,23]. The peer review system of the *Drones* MDPI journal was also used in evaluating the findings presented in this study.
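The occurrence/co-occurrence counting that underlies such term maps can be illustrated in a few lines. The sketch below, with placeholder abstracts and terms rather than the study database, counts how often key terms appear together in the same abstract, which is essentially the statistic that VOSviewer clusters.

```python
# Sketch of the term co-occurrence counting behind bibliometric maps such as
# the VOSviewer clusters: count how often key terms appear together in the
# same abstract. The abstracts and term list are illustrative placeholders.
from collections import Counter
from itertools import combinations

terms = ["uav", "remote sensing", "chlorophyll", "turbidity", "reservoir", "landsat"]
abstracts = [
    "low cost uav remote sensing of chlorophyll in a small reservoir",
    "mapping reservoir volume with landsat and uav imagery",
    "turbidity and chlorophyll retrieval from uav remote sensing",
]

occurrence = Counter()
co_occurrence = Counter()
for text in abstracts:
    present = sorted(t for t in terms if t in text.lower())
    occurrence.update(present)
    co_occurrence.update(combinations(present, 2))  # each unordered pair once

print(occurrence.most_common(3))
print(co_occurrence.most_common(3))
```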

The review was then presented in two sections to address the research objectives. The first section explored the progress attained, hitherto, in mapping and modelling the quality and quantity of surface water bodies using remotely sensed data. This section presented and detailed literature trends quantitatively evaluating the quality and quantity of surface water bodies. Specifically, the water quality and quantity parameters, earth observation sensors (cameras), sensor platforms, algorithms, and optimal spectral variables used to date were showcased. The final section then outlined and discussed the challenges, gaps, and opportunities existing in the context of knowledge creation in mapping and modelling the quality and quantity of surface water bodies using drone-derived remotely sensed data.

#### **3. Results**

#### *3.1. Searched Literature Characteristics*

In evaluating the evolution of topical concepts in mapping and monitoring the quality and quantity of open water bodies, the results showed that drone-based 'remote sensing', 'application', and 'case studies' were trending, mostly in the 'small reservoirs' of 'China', around 2017 (Figure 2). The period from 2018 to 2020 represents an intensification of imagery analysis in the evaluation of water quality, marked by the application of hyperspectral drone cameras in mapping water quality (Figure 2). Meanwhile, Figure 3 illustrates three topical clusters, green, blue, and red, in monitoring water. The key terms of the blue cluster were 'UAV', 'remote sensing', ('image'), 'application', 'mapping', 'chlorophyll', 'concentration', 'low cost', and 'measurement', which directly imply the utility of UAVs as a low-cost remote sensing system associated with mapping chlorophyll concentrations (Figure 3). The second-largest cluster linked to UAVs was in red and had 'reservoir', 'lake', 'dam', 'basin', 'area', 'volume', 'data', and 'Landsat'. This articulates the major water quantity parameters, i.e., volume and area, that were widely characterised using Landsat data and 'GIS' techniques. The third cluster, in green, had 'water quality', 'water', 'model', 'river', 'turbidity', and 'irrigation' as the key terms in order of importance (Figure 3). This cluster presented the linkages between chlorophyll and turbidity concentrations, which were previously modelled using satellite-borne data, mainly in the context of evaluating the quality of crop irrigation water.

**Figure 2.** Direction and evolution of topical concepts in mapping and monitoring the quality and quantity of open water bodies, derived using data from abstracts and titles.

**Figure 3.** Topical concepts in mapping and monitoring the quality and quantity of open water bodies, derived using data from abstracts and titles.

#### *3.2. Progress in Modelling Water Quality and Quantity*

Generally, progress is noted in detecting, mapping, and monitoring surface water resources using remotely sensed data (Figure 3). As in Gholizadeh, Melesse and Reddi [7], the results of this study illustrated that most of the studies that utilised earth observation data sought to characterise water quality more than water quantity (Figure 3). The widely researched water quality parameters included conductivity [24,25], pH [25,26], Cl− [24], dissolved oxygen [27], total suspended solids (TSS) [28,29], chlorophyll [30–33], turbidity [34–36], K+, ammonium nitrogen (NH4-N), sodium (Na+), BOD, magnesium (Mg), total phosphorous, orthophosphate (PO4-P), temperature, total nitrogen, iron (Fe), COD, zinc (Zn), calcium (Ca), manganese (Mn), salinity, copper (Cu), bicarbonate (HCO3−), sodium adsorption ratio (SAR), coliforms, cadmium (Cd), chromium (Cr), Ca2+, and total hardness, in order of frequency, as illustrated in Figure 4b. These parameters were mostly characterised using satellite remotely sensed data.

**Figure 4.** Frequency of studies that mapped surface water resources per year based on (**a**) both satellite and drone-borne sensors. (**b**) Drone-borne sensors.

The use of satellite remotely sensed data in mapping and modelling water quality and quantity has of late received extensive attention, as illustrated by the steady increase in the number of studies that applied remote sensing techniques to mapping and monitoring water quality and quantity (Figure 4a). Meanwhile, a significant number of studies have ventured into the utility of drones (Figure 4b). This study showed that works utilising UAVs in mapping and monitoring water quality and quantity appeared around 2013 (Figure 4b). The studies that evaluated the utility of drone-derived data for mapping the quantity of water were significantly fewer than those that sought to assess the quality of water. Specifically, only fourteen studies assessed water levels, whereas thirty-seven studies assessed water quality parameters based on drone remotely sensed data. The majority of the drone-based studies principally mapped and monitored chlorophyll content [30,32,33,37,38] and turbidity in lakes, ponds, and dams (Figure 5b) [34–36].

Irrigation water that is generally considered acceptable should be colourless, odourless, and foamless, with minimum turbidity, TDS below 1000 mg L−1 at circumneutral pH, and a specific conductance below 1.5 mmhos/m [39–41]. COD, ZSD, TP, conductivity, pH, TSS, DO, and turbidity are critical attributes of water that need to be frequently monitored if high-quality crops and full potential harvests are to be attained.
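These acceptability thresholds are simple to encode. The following minimal sketch flags a water sample against the limits cited above [39–41]; the function name and the 6.5–7.5 range taken here as "circumneutral" pH are our own assumptions for illustration.

```python
# Sketch: flagging an irrigation water sample against the acceptability
# thresholds cited above ([39-41]). The function name is hypothetical and
# "circumneutral" pH is approximated here as 6.5-7.5.
def acceptable_for_irrigation(tds_mg_per_l: float, ph: float,
                              conductance_mmhos_per_m: float) -> bool:
    checks = [
        tds_mg_per_l < 1000.0,            # TDS below 1000 mg/L
        6.5 <= ph <= 7.5,                 # circumneutral pH (assumed range)
        conductance_mmhos_per_m < 1.5,    # specific conductance below 1.5 mmhos/m
    ]
    return all(checks)

print(acceptable_for_irrigation(tds_mg_per_l=640, ph=7.1,
                                conductance_mmhos_per_m=0.9))  # True
```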

For irrigation water, COD is a critical attribute that needs to be frequently monitored. COD and biochemical oxygen demand (BOD) are appropriate indicators of organic matter concentrations in irrigation water. When COD and BOD are high in irrigation water, much of the oxygen in the water is consumed during the decomposition of organic matter, resulting in anaerobic conditions [42]. In this process, soil oxides such as Fe3+, Mn5+ and SO42− exhaust oxygen and reduce the oxidation–reduction potential. Subsequently, the generated iron, manganese, sulphides, and organic acids may limit crop uptake and the absorption of nutrients. This frequently results in stunted growth, poor quality, and reduced harvests.

**Figure 5.** Frequency of studies that considered a specific water quality parameter based on (**a**) all satellite and drone-borne sensors. (**b**) Drone-borne sensors only.

ZSD is an indicator of turbidity and of the total number of suspended particles in irrigation water [39]. A high concentration of suspended particles tends to result in an altered colour of the water and lower ZSD measurements. Higher concentrations of suspended sediments result in the clogging of irrigation equipment such as sprinklers; high turbidity or TSS therefore impedes irrigation by drippers and sprinklers [43].

In terms of spatial distribution, the majority of studies that have hitherto ventured into the utility of drones in mapping water quality parameters such as chlorophyll, turbidity, DO, TSS, and pH were conducted in China, the USA, Latin America, Europe, and Australia (Figure 6). This could be attributed to the fact that the earliest drone technologies originated in Europe, the USA, and China between 1849 and 1916, and the technology has been spreading since. However, very few studies have been conducted in the global south, especially in Africa. Subsequently, there is a need to consider and prioritise water quality parameters such as chlorophyll, turbidity, DO, TSS, and pH when devising irrigation water quality assessment techniques, especially in the global south.

**Figure 6.** Spatial distribution of UAV-based remote sensing studies in the context of open water bodies.

#### *3.3. Types of Sensor Platforms*

Some of the most widely used satellite platforms are Landsat, the Shuttle Radar Topography Mission, MODIS, SPOT, and Sentinel 2 MSI (Figure 4b). Studies mapping and monitoring the quality and quantity of surface water bodies based on satellite-borne remotely sensed data have drastically increased, which could be attributed to the significant growth in earth observation technologies. However, no studies were conducted using both satellite and drone data simultaneously (Figure 3). As illustrated in the characterisation of the literature, studies based on UAV remotely sensed data only picked up in 2013 (Figures 2 and 6). The increase in research effort and attention towards the utility of UAVs relative to satellite-borne data could be explained by the fact that UAVs offer near-real-time, fine resolution remotely sensed data suitable for high-throughput quantification of water quantity and quality at user-defined revisit frequencies.

Satellite platforms with freely available datasets, such as Sentinel 2 and Landsat, tend to be limited by cloud cover and relatively coarse spatial and temporal resolutions, which are difficult to implement at farm scales. The findings of this study also illustrated that 77% of the studies mapping surface water resources using drones were conducted with multi-copter platform systems, and 23% with fixed-wing platforms (Table 1). Interestingly, innovative octocopters and hovercraft have also been harnessed for this particular purpose [25,44]. The dominance of multi-copter platforms in water resources mapping could be because they are relatively cheaper than fixed-wing platforms [45,46]. Specifically, in their comparative study, Brito et al. (2019) noted that the superiority of multirotor platforms was better established in the context of mapping two-dimensional surfaces, as is the case with mapping agricultural fields and surface water resources. Above all, multi-copter platforms are capable of vertical take-off and landing (VTOL), and most DJI multi-copters can take off and land vertically [46]. This makes it easy to utilise multi-copter drones in any environment. However, batteries are their major setback [46]: the weight and capacity of multi-copter drone batteries limit their flight duration significantly. Specifically, 35% of the studies noted in this review utilised the DJI multi-copter series from the Chinese company Shenzhen DJI Sciences and Technologies Ltd. (Figure 7); this was the most widely used platform based on the findings of this study. Generally, the DJI Matrice and Phantom series were the dominant DJI platforms noted in our study. This could be attributed to the fact that the DJI Matrice platforms can be integrated with many types of sensors, as illustrated in Table 2, when compared to other platforms.

**Figure 7.** Frequency and types of UAV platforms that were used in mapping water quality and quantity.


**Table 2.** Platforms and sensors that were used in mapping water quality.

Meanwhile, the superiority of fixed-wing drone platforms was reported in mapping linear features (i.e., rivers and roads) [45]. They were also associated with longer flight durations [46]. However, the disadvantage of fixed-wing drones is that they require a runway, making them challenging to operate in some environments [46]. Despite these differences, the current drive in the drone technology industry is to combine the VTOL capability of multi-copters with the long flight time of fixed-wing platforms by creating hybrid VTOL fixed-wing UAVs [46].

#### *3.4. Sensors and Spectral Wavebands*

In terms of sensors, the results of this review showed that, among the satellite-borne sensors, Landsat has been the most widely used for characterising irrigation water quality across the world (Figure 7). Specifically, Landsat 5 was the most widely used sensor with 39 studies, followed by the Landsat 8 Operational Land Imager (OLI) with 18 studies, and then Landsat 1 with only three studies. These findings are similar to those of Gholizadeh, Melesse, and Reddi [7], who also echoed the dominance of Landsat data in mapping and monitoring water quality. This could be attributed to the fact that Landsat is the longest-running mission that has consistently supplied remotely sensed data suitable for a wide variety of applications, including water quality and quantity parameters, free of charge. However, the moderate spatial resolution of Landsat datasets, with a 30 m ground sampling distance, has limited their application to regional and landscape scales. Subsequently, there has been a gap at local scales, since the available very high spatial resolution (VHSR) sensors such as Worldview and QuickBird are associated with high costs.

The advent of drone technologies has seen the use of sensors ranging from RGB cameras, such as the Nikon (NIKKOR AF-S 24–85 mm f/3.5–4.5G ED VR) and Nikon D800 [47], GoPro Hero 4 Black Edition [48], Feiyu Mini 3D Pro [48], Sony [44], and CMOS [49] cameras, to multispectral sensors such as the MicaSense, Parrot Sequoia [28,50–55], Sentera [38], and MicaSense RedEdge [29,56], and hyperspectral sensors such as the Headwall Photonics Inc. (207 bands), Ocean Optics STS-VIS (640 bands) [27], AvaSpec-dual Gaia (640 bands) [35,57], Sky-mini Nano-Hyperspec [30], Canon EOS 5DS R, and Headwall Nano-Hyperspec (640 bands), for local-scale water remote sensing applications (Table 2). However, as the spectral resolution of drone sensors increases, the associated costs also tend to increase. From our results, it can be observed that the most widely used sensors were the Canon, Sony, MicaSense, and Nikon.

Nevertheless, most of the Canon, Sony, and Nikon sensors acquire images only in the visible section of the electromagnetic spectrum, covering the red, green, and blue (RGB) regions (Figure 8). The RGB sections of the electromagnetic spectrum alone do not offer sufficient data for extensive applications such as characterising water quality, despite their relatively low cost and very high spatial resolutions in relation to other, more robust sensors. Meanwhile, the MicaSense series are multispectral cameras that acquire data not only in the visible section but also in the red-edge and near-infrared sections of the electromagnetic spectrum at a very high spatial resolution. This makes them among the most sought after for a wide variety of applications, ranging from the characterisation of vegetation traits to water levels and quality [26,28,29,56]. For example, the MicaSense RedEdge multispectral sensor covers the RGB, red-edge, NIR, and thermal infrared portions of the electromagnetic spectrum at a ground sampling distance of about 4 cm, depending on the flight height. These spectral settings make this sensor comparable to the renowned Sentinel 2 multispectral instrument, which covers almost the same spectral regions, save for the thermal infrared section. Based on the findings of this study, there is a growing interest in the utility of hyperspectral sensors in mapping water quality and quantity. These hyperspectral sensors cover the spectral range between 300 and 1000 nm (Table 3). The major advantage of hyperspectral remotely sensed data in water quality remote sensing is its sensitivity to small changes in water quality parameters such as chlorophyll and TSS concentrations. Hyperspectral wavebands have a narrow spectral resolution of about 1–3.5 nm, making them more sensitive than the generally broader bands of multispectral drone cameras.


**Table 3.** Details of drone based hyperspectral sensors.

**Figure 8.** Satellite-borne sensors that were used in mapping surface water resources.

Table 3 summarises the technical details of the hyperspectral sensors that were used in mapping water quality and quantity using drones. These sensors typically covered the visible to NIR sections of the electromagnetic spectrum at very high spatial resolutions. The visible and near-infrared (VIS–NIR) sections of the electromagnetic spectrum have been widely proven to be instrumental in assessing water quality. The high frequency of use of the RGB spectrum (illustrated in Figure 8) could be explained by the relatively low costs associated with such three-band sensors and, as aforementioned, the ease of interpreting the spectral signature of water in the visible and near-infrared regions. However, there seem to be limited efforts to evaluate other sections of the electromagnetic spectrum, relative to the VIS–NIR, in characterising water quality parameters. Based on the performance of hyperspectral data in other areas of research [58,59], there is a need to test the robustness and capability of the narrow spectral channels in detecting various water quality parameters.

#### *3.5. The Role of Drone-Data-Derived Vegetation Indices and Machine Learning Algorithms in Remote Sensing of Water Quality and Quantity*

Numerous vegetation indices were derived from drone remotely sensed data for characterising surface water quality and quantity. The most widely used sections of the electromagnetic spectrum in detecting water quality parameters were the visible section (blue and green) and the NIR wavebands. In this regard, vegetation indices such as the red and near-infrared (NIR) ratio, the Surface Algal Bloom Index (SABI) [60], the two-band algorithm (2BDA) [26], NDVI, and Green NDVI [33], as well as band combinations and differencing such as (R+NIR/G), were used mostly to characterise chlorophyll content as well as TSS. As suggested in many studies, the combination of sensitive spectral variables with robust and efficient algorithms produces accurate models. This study noted that algorithms such as linear regression (LR), image differencing, matching pixel-by-pixel (mpp), artificial neural networks (ANN), the Manning–Strickler, and the adaptive cosine estimator were utilised in characterising mostly water quality parameters (Figure 9). The mpp-based algorithms were also detected during the bibliometric analysis illustrated in Figure 3 (red cluster). Despite being a parametric estimator, LR was the most widely used algorithm because it is simple to implement [61] across various statistical platforms, ranging from Microsoft Excel to R. Since LR is a parametric statistic, it requires the data to meet specific assumptions, such as normality, that are often a challenge to attain. In this regard, more effort is needed to assess the utility of robust machine learning algorithms, such as stochastic gradient boosting, random forest, and ANNs, in mapping water quality based on drone remotely sensed data (Figure 10).

**Figure 9.** Spectral resolutions of drone sensors illustrated in Table 1.

**Figure 10.** Algorithms used to detect and map water quality and quantity using drone remotely sensed data.
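As an illustration of the band-ratio-plus-LR workflow described above, the following minimal sketch fits a linear model to a hypothetical 2BDA-style NIR/red predictor; all numbers are fabricated placeholders, not data from the reviewed studies.

```python
# Sketch: a common two-band (NIR/red) ratio plus linear regression workflow
# for chlorophyll-a, in the spirit of several reviewed studies. The
# reflectances and chlorophyll samples are fabricated placeholders.
import numpy as np

# Hypothetical co-located field samples: band reflectances and lab chl-a (ug/L)
nir = np.array([0.08, 0.11, 0.15, 0.19, 0.23])
red = np.array([0.10, 0.10, 0.11, 0.11, 0.12])
chl = np.array([12.0, 21.0, 33.0, 44.0, 55.0])

x = nir / red                    # 2BDA-style band-ratio predictor
A = np.vstack([x, np.ones_like(x)]).T
slope, intercept = np.linalg.lstsq(A, chl, rcond=None)[0]

predicted = slope * (nir / red) + intercept   # apply model to map pixels
print(f"chl ~= {slope:.1f} * (NIR/red) + {intercept:.1f}")
```

In practice, the same fitted coefficients would be applied per pixel to the ratio image to produce a chlorophyll map, which is where the normality and linearity assumptions noted above become limiting.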

#### **4. Discussion**

#### *4.1. Evolution of Drone Technology Applications in Remote Sensing Water Quality and Quantity*

The results of this study showed that the application of drones dates back to the late 1940s. Initially, drones were developed for offensive use as cheap and less risky military airborne fighting machines. With modernisation and the easing of prohibitive regulations, drones became a significant source of spatial data. Specifically, between 2012 and 2014, the United States of America eased the regulations that had restricted UAVs to military purposes. Subsequently, countries across the globe began to venture into utilising drones for earth observation. It was also observed that studies on the utility of UAVs in mapping and monitoring water quality and quantity are increasing significantly (Figure 4b) [16]. This could be explained by advancements in drone and sensor technologies as well as the easing of restrictive regulations associated with drone technologies.

Meanwhile, the results showed that more effort from the community of practice was exerted towards mapping water quality than water quantity. Specifically, only fourteen studies assessed water levels, whereas thirty-seven studies assessed water quality parameters based on drone remotely sensed data [44,47–49,62–72]. Examples of studies that mapped water levels include Ridolfi and Manciola [63], who used a method based on Ground Control Points (GCPs) to detect water levels from drone-derived data. Meanwhile, Adongo et al. [64] assessed the utility of bathymetric surveys combined with geographic information systems (GIS) functionalities in remotely determining the reservoir volume of nine irrigation dams in three northern regions of Ghana. On the other hand, the majority of water quality-related studies based on drone remotely sensed data principally mapped and monitored chlorophyll content [30,32,33,37,38] and turbidity in lakes, ponds, and dams (Figure 5b) [34–36]. This trend was also revealed through the bibliometric analysis illustrated in Figure 3. Other water quality parameters of interest included chemical oxygen demand (COD) [26,35,73], Secchi disk depth (ZSD) [26,34,74], total nitrogen [35], total phosphorous [35,73], conductivity [24–26,73], the water quality index [73], pH [27,75], total suspended solids (TSS) [28,29,76], dissolved oxygen (DO) [75,77], and turbidity [35,48], in order of their frequency in the literature.

#### *4.2. Challenges in the Application of Drone Technologies with Special Reference to the Global South*

The major challenge associated with many regions is the statutory regulations that govern the operation of UAVs [77–79]. In many countries, there are still stringent restrictions regarding where and how UAVs may be operated [16]. In some countries of the global south, the take-off mass, the maximum flight altitude, and the operational areas of drones tend to be regulated [16]. For instance, the South African Civil Aviation Authority (SACAA) stipulates that remotely piloted or toy aircraft should not be operated at 50 m or closer to any person or group of persons, and must not be operated at an altitude higher than 45.72 m (150 ft) above the ground unless approved by the Director of Civil Aviation of the SACAA. Remotely piloted or toy aircraft weighing more than 7 kg may be operated only if approved by the SACAA (http://www.caa.co.za/pages/rpas/remotely%20piloted%20aircraft%20systems.aspx, accessed on 19 July 2021). The size of a UAV, which is closely associated with its batteries, engine efficiency, load, and type (fixed-wing or multi-rotor), tends to determine the length of time it can spend on a single flight plan and the size of the area it can cover [46,79]. In this regard, take-off mass regulations tend to indirectly restrict the areal extent that can be covered, as well as the size of the camera that can be mounted for research purposes, amongst other uses [16,68]. Specifically, due to the weight restrictions, many of the frequently used sensors tend to be lightweight, small, consumer-grade models with limited spectral resolutions [15]. Moreover, the SACAA states that Remotely Piloted Aircraft Systems (RPAS) shall not be operated beyond the visual line of sight (BVLOS). Insufficient flight autonomy to cover large areas [50] limits the areal extent that can be covered by drones to a farm or field scale. Meanwhile, supporting regulation and operationalisation of BVLOS drone technology applications would facilitate coverage of greater areas on a single mission. Covering a greater area on a single mission improves the cost-effectiveness of acquiring VHR imagery and will increase the prospects of drone applications covering large dams and lakes in mapping and monitoring water quality and quantity. Further advancements and improvements towards the automation of drone operations will enable routine monitoring and mapping applications. This study shows that single- and three-band cameras are the most widely used sensors in characterising water quality parameters (Table 1 and Figure 6).

The SACAA stipulates the need for a pilot licence to operate UAVs for commercial purposes in South Africa. The Eighth Amendment of the Civil Aviation Regulations, 2011, which came into operation on 1 July 2015 and contains Part 101 on Remotely Piloted Aircraft Systems, states that:

*"2.3* ... *the SACAA acknowledges that many entrepreneurs interested in obtaining a Remotely Piloted Aircraft Systems Operator Certificate (ROC) to provide aerial services, for example, real estate photography, academia etc. are not aviation professionals. As such, they have limited aviation backgrounds, and a lack knowledge about existing flight and airspace regulations. To protect the safety of the public and for these individuals to become viable UAS operators, they need to be aware of the requirements and the process. UAS operators, in turn, must be informed on the current regulations, policies and procedures to develop safe business practices in a similar fashion to professional "manned" aviation companies" (source: http://www.caa.co.za/RPAS%20AICs/AIC%20 007-2015.pdf, accessed on 19 July 2021).*

Meanwhile, the process of acquiring a licence costs about USD 1500–2000. On the other hand, the prices of drone platforms and cameras remain high and beyond the reach of many researchers. Drone platforms capable of mounting various cameras generally cost between USD 1000 and 10,000, inclusive of the sensor; hence, they are not accessible for research purposes in most Southern African countries. Only affordable small platforms, restricted in terms of sensor type (spectral and spatial resolution), flight height, and flight time, are easily accessible, and these are widely used for recreational purposes. This review highlights the current state of affairs (opportunities and challenges) associated with research using cutting-edge drone technologies, especially in some countries of the global south. While highlighting limitations such as the lack of funding, laboratories, and human capacity, this study sought to expose the plausible opportunities associated with these technologies. In this regard, this work shows ways in which researchers in countries of the global south can become aware of the prospects of UAV technologies and seek collaborations with countries of the global north. For instance, UNICEF and the Virginia Polytechnic Institute and State University, commonly known as Virginia Tech, joined the Government of Malawi in establishing the African Drone and Data Academy (ADDA) (https://www.unicef.org/malawi/african-drone-and-data-academy-malawi, accessed on 19 July 2021). The ADDA aims to be a centre of excellence, equipping young people in Malawi and the African region with necessary 21st-century skills while strengthening the drone ecosystem for a more effective humanitarian and development response in Southern Africa.

Mapping water volume using drone remotely sensed data is also one of the major challenges, in the global south and across all continents. Mapping such complex channel bathymetric characteristics requires robust sensor systems that can penetrate water to detect a water body's volume. Active sensors, in conjunction with robust machine learning algorithms, could be suitable for this task.

#### *4.3. Research Gaps and Opportunities*

The following gaps were identified from the results of this study in the context of irrigation water quantity and quality monitoring based on drone remotely sensed data:

• There are a limited number of studies that have sought to evaluate the utility of drone remotely sensed data in the global south;


#### *4.4. Way Forward: Closing the Gaps in the Utilisation of Drone Technology in Mapping Water Quality and Quantity*

Research efforts need to be promoted to evaluate the utility of UAVs in monitoring irrigation water quality and quantity, especially in the global south's smallholder farms, which are susceptible to climate variability shocks and unpredictable rains. As the fourth industrial revolution progresses, UAVs are emerging as an innovative source of near-real-time spatial data for mapping and monitoring surface water resources to improve agricultural productivity. Drone data have high prospects of providing well-calibrated, time-efficient, and spatially explicit models of water quantity and quality. In this regard, the application of multispectral sensors to characterising surface water levels and water quality needs to be pursued if sustainable utilisation of water resources and water security are to be achieved in the light of the rapidly growing population and its associated water demands. Since the current focus in the agricultural sector is on reducing the amount of irrigation water while increasing agricultural productivity, timely and accurate fine spatial resolution data derived using drones such as the DJI series, in concert with multispectral sensors such as the MicaSense and hyperspectral sensors such as the Headwall Nano-Hyperspec, could be useful in detecting and mapping the spatio–temporal variability of available irrigation water quality and volume at field level. Field-level, near-real-time, fine resolution, spatially explicit information on water quality and quantity is useful for informing smallholder farmers in the field, and policymakers away from the fields, about water leakages occurring at the grassroots level. Such information will help farmers plan their irrigation schedules, limiting water leakages and losses while improving productivity. This is critical in reducing further strain on already stressed water resources. Evidence-based, timely information on available water resources allows farmers to adapt their irrigation water management strategies to in-field spatial variability and seasonal changes in water quantity and quality. Subsequently, robust and effective local-to-regional frameworks and policies to facilitate sustainable water utilisation and management are more likely to be achieved.

#### **5. Conclusions**

The objective of this study was to conduct a systematic review assessing the progress, opportunities, and challenges of using drone-derived remotely sensed data to map and model water quality and quantity. The utility of UAVs globally in mapping and monitoring the amount and quality of surface water at a farm scale is still in its infancy. This is partly due to high costs, a lack of personnel with the requisite skills, and the stringent regulations around securing and operating drones. Nevertheless, drones are a cutting-edge technology with high prospects of providing spatially explicit, timely, robust, and reliable surface water resources accounting. There is a need to swiftly embrace this technology to minimise water leakages, improve on-farm irrigation strategies, and draw up local, regional, and national strategies and policies focused on the sustainable utilisation of water, to reduce the strain on already stressed water resources. There remains room for research on a wide range of aspects of the quality and quantity of irrigation water in situ, which will require research effort and integration with other upcoming innovative technologies such as artificial intelligence and deep learning.

**Author Contributions:** Conceptualization, M.S., V.G.P.C. and T.M.; methodology, V.G.P.C., M.S. and T.M.; investigation, M.S., V.G.P.C., T.D., T.M. and O.M.; resources, T.M. and O.M.; data curation and analyses, M.S. and V.G.P.C.; writing—original draft preparation, M.S., V.G.P.C., T.D. and T.M.; writing—review and editing, M.S., V.G.P.C., T.D., T.M., D.M., A.D.C., C.S. and O.M.; visualization, M.S. and V.G.P.C.; project administration, T.M.; funding acquisition, T.M.; critical review and redrafting, M.S., V.G.P.C., T.D., D.M., A.D.C., C.S., T.M. and O.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** The Water Research Commission of South Africa is acknowledged for funding through the WRC Project, No. K5/2791//4, 'Use of drones to monitor crop health, water stress, crop water requirements and improve crop water productivity to enhance precision agriculture and irrigation scheduling'. This work was also based on the research supported in part by the National Research Foundation of South Africa (grant number 119409).

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


### *Article* **Monitoring Dynamic Braided River Habitats: Applicability and Efficacy of Aerial Photogrammetry from Manned Aircraft versus Unmanned Aerial Systems**

**M Saif I. Khan 1,\*, Ralf Ohlemüller 2, Richard F. Maloney <sup>3</sup> and Philip J. Seddon <sup>1</sup>**


**Abstract:** Despite growing interest in using lightweight unmanned aerial systems (UASs) for ecological research and conservation, review of the operational aspects of these evolving technologies is limited in the scientific literature. To derive an objective framework for choosing among technologies, we calculated efficiency measures and conducted a data envelopment productivity frontier analysis (DEA) to compare the efficacy of using a manned aircraft (Cessna with Aviatrix-triggered image capture using a 50 mm lens) and a UAS (Mavic 2 Pro) for photogrammetric monitoring of restoration efforts in dynamic braided rivers in Southern New Zealand. The efficacy assessment was based on the technological, logistical, administrative, and economic requirements of the pre- (planning), peri- (image acquisition), and post- (image processing) phases. The results reveal that the technological and logistical aspects of UASs were more efficient than manned aircraft flights. Administratively, the first deployment of a UAS was less efficient, but subsequent deployments were very flexible. Manned aircraft flights were more productive in terms of the number of acquired images, but the ground resolution of those images was lower than that of the UAS images. Frontier analysis confirmed that UASs would be economical for regular monitoring of habitats, and even more so if research personnel are trained to fly the UASs.

**Keywords:** unmanned aerial systems (UAS); aerial photogrammetry; habitat monitoring; braided river habitats; efficiency; data envelopment analysis (DEA)

#### **1. Introduction**

The use of digital aerial imagery for habitat monitoring is an evolving technology [1]. Increasing computational power, the availability of low-cost unmanned aerial systems (UASs), and the development of software for image analysis have made aerial imagery using UASs a tool of growing interest among conservation researchers and practitioners [2]. Since these technologies are relatively new, there is only a handful of scientific papers discussing their operational complexities [3]. It would be useful to have a comparative summary of the available technologies, with information about their applicability and efficiency for a given purpose such as habitat monitoring [4–6], wildlife monitoring [7–9], vegetation change analysis [10,11], forest inventory [12], or monitoring agricultural productivity [13,14].

Braided rivers are one of the most dynamic ecosystems of the world [15]. The unique geomorphology and hydrological regime give rise to a range of habitats along the braided riverbeds [16]. These diverse microhabitats are often endangered by the growing threat of habitat modification induced by upstream hydrological change and invasion by introduced flora and fauna [15]. Monitoring the consequences of these impacts is important for conserving the unique and often endemic flora and fauna of these habitats [17]. Braided river systems are prone to change, as the complex spatial and fluvial arrangements are easily altered by fluctuations in the upstream flow regime [15].

**Citation:** Khan, M.S.I.; Ohlemüller, R.; Maloney, R.F.; Seddon, P.J. Monitoring Dynamic Braided River Habitats: Applicability and Efficacy of Aerial Photogrammetry from Manned Aircraft versus Unmanned Aerial Systems. *Drones* **2021**, *5*, 39. https://doi.org/10.3390/drones5020039

Academic Editors: Higinio González Jorge and Margarita Mulero-Pazmany

Received: 30 March 2021; Accepted: 14 May 2021; Published: 17 May 2021


The dynamic nature of the braided river ecosystem makes habitat monitoring challenging, as changes often happen within a short period following local weather events, such as high precipitation in the catchments. Lack of accessibility, logistics, and resources can be additional challenges. Remote sensing methods are being increasingly used in such situations [6].

There are many candidate remote sensing technologies to choose from, ranging from satellite imagery to lightweight unmanned aerial systems (UASs) [18] and aerial photographs from flights with manned aircraft [19]. Unmanned aerial systems are also referred to as unmanned aerial vehicles (UAVs). The term "unmanned" is sometimes argued as not being gender-neutral, and "unoccupied" has been suggested as an alternative [20]; however, we have persisted with the terms "manned" and "unmanned" as these are widely used. Each of these remote sensing methods has a wide range of features to choose from. For example, UASs vary in flying technology from fixed-wing to multirotor [21] and can be equipped with various sensors, including RGB [11], infrared [22], and thermal [9] cameras, or even laser scanners [23]. Reviews of different UAS technologies are available in the literature [21], but the technology is ever-evolving and new choices are added to the mix relatively frequently. There are also varying options for flight planning, operating, image acquisition, and image processing software [24]. However, the scientific literature lacks objective comparisons of technological alternatives for a given application [1]. Cost-based decision making often underestimates logistical, administrative, and other technological challenges [3]. There is a need for a comprehensive framework that can incorporate these different aspects and still objectively compare different technologies and assess their efficiencies for a given purpose. In this study, we assess the applicability of aerial photogrammetry for monitoring changes in the habitat features of the Aparima River, a braided river in Southern New Zealand, and compare the efficacy of using manned aircraft versus unmanned aerial systems (UASs).

#### **2. Materials and Methods**

#### *2.1. Study Site*

A 10 km stretch of the Aparima River (46.0003° S, 168.1095° E) is being monitored for changes in habitat features due to ongoing commercial gravel extraction from the riverbed. The goal of the monitoring is to assess the changes and inform habitat management solutions to maintain habitat suitability for a range of native species. In this research, manned aircraft and unmanned aerial photogrammetry tools are compared in order to select an economically viable and technologically suitable remote sensing monitoring system.

#### *2.2. Manned Aircraft and Unmanned Flight Missions*

The manned aircraft flights were performed with a Cessna 180 customized to carry aerial photogrammetry equipment [19]. Onboard was an Aviatrix aerial photography system capable of triggering photographs at pre-fixed points along the planned flight lines. Altitude above ground level (AGL), image resolution, and the number of flight lines were also pre-set. The camera used was a Canon EOS 5DS R with a Sigma 50 mm lens. Flight planning was carried out in Flight Planner Pro software from Aeroscientific (Adelaide, Australia), licensed through the Department of Conservation. The image sizes were 8688 × 5792 pixels with 50% side and 68% forward overlap. Three manned aircraft flight missions were carried out in February 2018, December 2019, and October 2020 at 608 m, 518 m, and 304 m AGL, respectively. All manned aircraft flights were performed at a speed of 166.70 km/h.

The unmanned flights were carried out with a commercially available Mavic 2 Pro quadcopter (DJI, China) carrying a 20 MP Hasselblad camera with a 28 mm equivalent focal length. The image sizes for unmanned flights were 5472 × 3648 pixels with 70% overlaps. Flight planning for the UAS was carried out with Pix4D software (Prilly, Switzerland), and the same software was used to fly the UAS via the DJI controller with a Samsung A10 mobile phone. The unmanned flights took place in October 2020 and November 2020. One site at the Northern end of the study area was flown on both the October and November missions, before and after a flood of the riverbed habitat induced by high precipitation in late October. The flooding event possibly triggered the abandonment of colony exploration by white-fronted terns (*Sterna striata*) at that site; the abandonment was a potential indicator of changes in habitat features. The UAS was re-deployed within this short period to search for changes in the habitat due to the flooding event. The average speed of the UAS flights was 16.80 km/h and all were performed at 50 m altitude AGL. No ground control points (GCPs) were used for either the manned aircraft or the UAS flights.

The manned aircraft flights covered the whole 10 km stretch of the Aparima River study area, covering on average 863 ha. The unmanned flights covered three subsections of this larger area, comprising the Northern and Southern ends and a central area, totalling more than 83 ha (Figure 1). When necessary, the manned aircraft image geotags were updated using ExifTool software (Kingston, ON, Canada) [25] by syncing with the Aviatrix trigger time log. All image processing analyses, including image mosaicking, were carried out in ESRI ArcGIS Pro 2.5 (Redlands, CA, USA).

**Figure 1.** Flight plans and image footprints of flight missions over the Aparima River: (**a**) Manned aircraft flight 2018, (**b**) Manned aircraft flight 2019, (**c**) Manned aircraft flight 2020, (**d**) unmanned flight 2020 over the Northern end, (**e**) unmanned flight 2020 over the central area, (**f**) unmanned flight 2020 over the Southern end of the manned aircraft flights. The image mosaics of these areas are included in Figure 2.

**Figure 2.** Output image mosaics from different manned aircraft and unmanned flight missions over the Aparima River: (**a**) Manned aircraft flight 2018, (**b**) Manned aircraft flight 2019, (**c**) Manned aircraft flight 2020, (**d**) unmanned flight 2020 over the Northern end, (**e**) unmanned flight 2020 over the central area, (**f**) unmanned flight 2020 over the Southern end, (**g**) zoomed-in subsection of manned aircraft flight 2018, (**h**) zoomed-in subsection of manned aircraft flight 2019, (**i**) zoomed-in subsection of manned aircraft flight 2020, (**j**) zoomed-in subsection of unmanned flight 2020 over the Northern end. Red, blue, and yellow polygons in (**e**) delineate the Northern, central, and Southern sections flown with UASs. The zoomed-in subsections (**g**–**j**) showcase the ground resolution differences among the flight missions.

#### *2.3. Input Resource Assessment*

For a better insight into the aerial imaging technology, we have separated aerial photography operations into pre-, peri-, and post-flying phases. The pre-flying phase includes flight planning and mobilizing resources for the flying operation. The peri-flying phase includes the actual flight operation and image acquisition. The post-flying phase includes image sorting, storage, and various image analysis techniques. For ease of comparison, we have considered orthophoto mosaicking of the acquired images as the minimum step to be completed in the image processing phase.

Each of these three phases has technological, logistical, administrative, and economic aspects to assess. Technological resource assessment includes software and hardware availability, the required operating system or platform, and the trained personnel to operate these tools. Logistics refers to organizing the required resources for the operation. The administrative aspect is considered separately from logistics as it deals with compliance with various rules and guidelines for undertaking flight missions for aerial photography.

#### 2.3.1. Technological Aspects

Technological complexity is subjective, depending on the previous experience of the user. For a more objective assessment, the number of days of training required for someone who is generally enthusiastic about the technology was used as a proxy for training needs. The data were generated from available formal training manuals, and the minimum number of days needed for basic functional engagement with the technology was considered. For example, for drone flying, diploma courses of a few months are available. However, for basic engagement, a two-day course and some hands-on flying experience are the basic necessity; consequently, 30 days was considered as a proxy for drone-flying training needs. For manned aircraft flight, flying the plane and managing the aerial imagery tools (the Aviatrix system) are two distinct sets of skills. In our case, the pilot also managed the Aviatrix system on board; however, for the resource assessment, we have considered the requirement as two trained skillsets. Any time requirement was rounded up to the next full day.

#### 2.3.2. Administrative Aspects

For the administrative assessment, we considered the number of days required to secure the consent of all relevant stakeholders as an indicator of operational requirement. Stakeholders were selected based on the rules and guidelines in place for flying airplanes and drones. The stakeholders included air traffic control authorities, local councils, government organizations, and private owners. The time required for consent was considered as a whole, as the consent process could move simultaneously once initial communication had been established.

#### 2.3.3. Economic Assessment

For the economic assessment, the cost of procuring all the services and equipment was considered. This includes the overall time required on the technological, administrative, and logistical fronts for the different flying phases. For recurring monitoring missions, we included the establishment cost in the first mission and only counted the resources required for any subsequent mission. For manned aircraft flights, the cost of the customization required to equip the airplane for aerial imaging is available from reports by the New Zealand Department of Conservation [19]. The lead time required for this customization is included as the logistics organization time for the first flight, as a required investment. The cost of a drone flight was taken from the advertised price of the drone available for sale in New Zealand [26]. In our case, we borrowed the drone from the Department of Conservation's Maukahuka project [27]. The communication required to organize the drone was taken as a substitute for the time required to procure a drone from the market. No customization was required for the drone itself.

Apart from the quantitative assessments, we used information synthesis on different operational complexities as a qualitative assessment. For example, this includes assessing software that has restricted access or that cannot be used in the field because no portable platform is available. Where possible, we have directly compared the resource requirements of the two technologies. The analytical frame used for these analyses is presented in Table 1.


**Table 1.** Analytical framework for comparing manned aircraft and unmanned flying technologies for aerial photography.

#### *2.4. Efficiency Measurements and Data Envelopment Analysis (DEA)*

For quantitative efficiency assessment, we used Farrell efficiency within the data envelopment analysis (DEA) framework [28,29]. Farrell measures of efficiency are based on proportional changes to input or output, i.e., how much the input can be proportionally reduced while maintaining the same output (input efficiency), and how much the output can be proportionally increased with the same amount of input (output efficiency) [11]. The input-based Farrell efficiency, or simply input efficiency, for a technology with input x and output y is defined as

$$Efficiency\ (Input),\ E_i = \min\left\{\, E_i > 0 \mid (E_i x;\ y) \,\right\}$$

In other words, it is the maximum proportional reduction of all inputs x that can still produce the output amount y. An input efficiency measure of 0.7 would indicate that the technology can save 30% of all inputs while producing the same outputs. Similarly, output-based Farrell efficiency, or output efficiency, is defined as

$$Efficiency\ (Output),\ E_o = \max\left\{\, E_o > 0 \mid (x;\ E_o y) \,\right\}$$

In other words, it is the maximal proportional increment of all outputs y that is possible for a given input x. An output efficiency measure of 1.2 suggests that the output can be increased by 20% without any additional input [28]. Simply put, the input efficiency is the minimum input required divided by the actual input, while the output efficiency is the actual output divided by the maximum attainable output [28]. In general terms, efficiency is measured as output divided by input; however, for a set of observed productivity scenarios, the efficiency is scaled to the different input sets. The efficiency measures reported here are adjusted through the parameters used in the benchmarking procedure, such as the free disposability hull, variable returns to scale, or decreasing returns to scale. The reported efficiencies are on a scale of 0 to 1, with 1 being the optimum efficiency [28–30].

Non-parametric DEA is a widely used method for comparing production efficiency among different firms using varying technologies [29,31–33]. The chosen frontier type for the DEA analysis was FDH+, which is a combination of the free disposability hull (FDH) and constant returns to scale [28,30]. FDH is a stairway-shaped frontier and the least restrictive on input data [28]. Free disposability implies that increased input results in the same or higher productivity. These efficiency measures and frontier analysis were chosen because they can be used with low data availability and, being non-parametric, require minimal data transformation. The key performance indicator (KPI) used for the efficiency analysis was the number of images acquired or processed. Area coverage and the desired ground resolution of those images were also included for discussing different aspects of efficiency. Since coarser (numerically higher) ground resolution means lower productivity, the multiplicative inverse (reciprocal) of ground resolution, i.e., 1/ground resolution in cm, was used in the quantitative analysis of efficiency. To avoid working with small fractions, the reciprocal was scaled by multiplying by 100. We used the R package Benchmarking [30] for all efficiency measurements and DEA production frontier analysis; a sketch of this workflow is given below.
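For readers wishing to reproduce this type of analysis, the following is a minimal sketch of the workflow using the Benchmarking package cited above. The input and output values are illustrative placeholders only, not the study's data (the actual figures are reported in Tables 2 and 3 and Supplementary Materials Table S1):

```r
# Minimal DEA sketch with the Benchmarking package (Bogetoft & Otto).
# All numbers below are hypothetical placeholders, not the study's data.
library(Benchmarking)

# Inputs per mission (rows): days of effort and cost in NZ$
x <- matrix(c(30,    2,    3,       # days
              12000, 1500, 1200),   # cost
            ncol = 2)

# Outputs per mission: images acquired, and the scaled reciprocal of
# ground resolution (100 / resolution in cm), as described above
y <- matrix(c(1200,    250,     300,
              100/4.3, 100/0.8, 100/0.8),
            ncol = 2)

# Farrell efficiencies under the FDH+ frontier used in the study
e_in  <- dea(x, y, RTS = "fdh+", ORIENTATION = "in")   # input efficiency
e_out <- dea(x, y, RTS = "fdh+", ORIENTATION = "out")  # output efficiency
eff(e_in)
eff(e_out)

# Frontier plot for a single input/output pair (cf. Figure 4)
dea.plot.frontier(x[, 1], y[, 1], RTS = "fdh")
```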

#### **3. Results**

#### *3.1. Resource Assessment and Efficiency Measures*

#### 3.1.1. Technological

For the manned aircraft flights, Flight Planner Pro software was used. The software is licensed to the Department of Conservation. A demonstration version is available only by contacting the company, which also provides orientation and training sessions for the software as needed for any potential customer. Even though this is useful, there are certain challenges to accessing the service, as one must prove genuine intent as a potential buyer. The flight planner is available on PC and can be taken on board the flight with the Aviatrix system. An Aviatrix trigger box is available on order through Aeroscientific, which is less accessible than other platforms available off the shelf.

For the UAS, the Pix4D software suite was used for flight planning and control of the drone during flight. The suite has software for flight planning and extensions for operating flights on various platforms. Most of the drone flight planning and flight control interface comes as an application (app) for Android and Apple iOS mobile devices. This is convenient during field operation, as adjustments and modifications can be easily accommodated. Similar software can be used for image analysis of both manned aircraft and unmanned flights. However, the images taken through manned aircraft flights needed to go through GPS geotagging correction, as camera GPS tags are not always able to keep up with the image capture rate and often record the same GPS location for multiple images.

Technologically, manned aircraft and unmanned flights have similar personnel requirements. For the peri-flying phase, a manned aircraft flight needs people trained in two distinct skill sets, whereas unmanned flights can be undertaken by one trained person and one supporting person (Supplementary Materials Table S1) to keep the UAS under constant manual viewing. However, the training needs for the peri-flying phase are much higher for manned aircraft flights than for unmanned flights (Table 2).

Whereas the manned aircraft flights generally produced more output in terms of images captured or processed (Table 3), the image quality, as indicated by ground resolution, was higher for the UAS (Table 3; Figure 2).

These findings are reflected in the efficiency measures (Table 4): under the technological aspect, manned aircraft flights were, on average, about 90% as efficient as unmanned flights through the different flying phases (Table 4). The overall technological efficiency of manned aircraft flights was about 76% that of the unmanned flights. The detailed efficiency measures for all the flight missions are included in Supplementary Materials Table S2.


**Table 2.** Average input across manned aircraft and unmanned aerial missions through their different flying phases.


**Table 3.** Outputs and key performance indicators (KPI) for different manned aircraft and unmanned flight missions over the Aparima River.

**Table 4.** Average efficiency of manned and unmanned flights through different flying phases over the Aparima River.


#### 3.1.2. Logistical

The logistics requirement is highest for the peri-flying phase for both manned aircraft and unmanned flights. However, the logistical requirement, both in terms of time and cost, is noticeably higher for manned aircraft flights than for unmanned flights (Table 4). In terms of efficiency, overall, the manned aircraft flights are on average 33% less efficient than unmanned flights (Table 4). For the pre- and post-flying phases, the logistics requirements for manned aircraft flights are still higher than for unmanned flights. The lower logistics requirement makes the UAS more flexible, especially for re-deployments.

#### 3.1.3. Administrative

The administrative input requirement for unmanned flights is higher than for manned aircraft flights, especially for the pre-flying phase (Table 2; Supplementary Materials Table S1). Unmanned flights require consent from any private landowners who have land within the flight range. This adds to the time required for both the planning and flying phases of unmanned flights (Supplementary Materials Table S1). Since flight planning could proceed alongside the consent process, unmanned flights still performed more efficiently than manned aircraft flights, on average, for the pre-flying phase (Table 4). However, since consent is required each time an unmanned mission is performed, this phase becomes less efficient than for a manned aircraft flight. Overall, the administrative efficiency of manned aircraft flights was higher (Table 4).

#### 3.1.4. Economic

From the economic aspect, unmanned flights required considerably lower inputs in terms of time and money throughout all flying phases (Table 2). The same trend is reflected in efficiency measures, and overall, the manned aircraft flights, on average, were about 39% efficient, whereas unmanned flights were 73% efficient (Table 4).

#### *3.2. Re-Deployment of Unmanned Aerial Systems (UASs) Following a Flooding Event of the Riverbed*

The Aparima River experienced a flooding event in late October 2020, immediately after the second UAS mission acquiring photographs of the Northern end of the project site (Figure 2). Before the weather event, white-fronted terns (*S. striata*) were exploring the site for potential colony establishment, with an estimated 70 individuals observed at the site for a few weeks spanning September to October 2020. However, after the weather event, the terns abandoned the site and did not return that summer. This was a good opportunity to assess the changes in the habitat due to the weather event, within two weeks of the flooding, in early November. Although detailed image analysis (Figure 3) remains outside the scope of this article, a quick visual inspection reveals erosion of the sandy site as floodwater forced through the riverbed (Figure 3e,f). The ability to rapidly deploy a UAS enabled documentation of the sudden changes in habitat features. This would not have been possible with a manned aircraft flight, given the long lead time required to organize logistics.

**Figure 3.** Image mosaics of the same site at the Aparima River, New Zealand, before and after a flooding event. Images were acquired with an unmanned aerial system (UAS): (**a**,**b**) show the flight paths and photo centres of the flight missions in October 2020 and November 2020, respectively; (**c**,**d**) show the image mosaics; and (**e**,**f**) are zoomed-in subsections of the same area before and after the flooding event that occurred at the site in late October 2020.

#### *3.3. Comparison of Manned Aircraft and Unmanned Flights Through Data Envelopment Productivity Frontier Analysis (DEA)*

The DEA frontier analysis shows that across all four aspects of aerial imaging, manned aircraft flight missions (Figure 4, solid lines) had higher productivity potential, i.e., higher output efficiency, than unmanned flights (Figure 4, dashed lines). The unmanned flights had lower input requirements, indicating their higher input efficiency (Figure 4).

**Figure 4.** Frontier analysis of the efficiency of different aspects, comparing manned aircraft and unmanned aerial imaging techniques for different flying phases: (**a**) technological, (**b**) administrative, (**c**) logistical, and (**d**) economic. The solid line represents the manned aircraft flight production frontier, the dashed line the unmanned flight production frontier, and the dotted line the combined manned aircraft and unmanned flight production frontier. Where the frontier lines cross the *x*-axis indicates the minimum input required for any productivity to occur, and the highest y-values the lines reach represent the highest production possible. O1, O2, and O3 are the first, second, and third manned aircraft flights, while U1, U2, and U3 are the unmanned flights in the same sequence.

Technologically, the manned aircraft and unmanned flights have similar minimum input requirements, and manned aircraft flights will have higher productivity given the same inputs (Figure 4a). Unmanned flights need lower inputs for the administrative, logistical, and economic components (Figure 4b–d). However, it is the minimum input requirement (where the frontier crosses the *x*-axis representing inputs) that clearly shows that unmanned flights had much lower initial investment and less time required for flight deployments.

#### **4. Discussion**

The results indicate that UASs have higher input efficiency, as the minimum resource requirements for deploying UASs are lower than those for manned aircraft flights. The results also re-affirm that manned aircraft flights have higher output efficiency, acquiring more images and greater area coverage than UASs. However, UAS flights are performed at a lower height and usually achieve finer ground resolution in the acquired images. More importantly, UAS missions are much more flexible than manned aircraft flights for subsequent deployments when monitoring dynamic habitats.

#### *4.1. UASs Have Higher Input Efficiency Than Manned Aircraft Flights*

The cost of arranging the UAS flights was much lower than for the manned aircraft flights. Moreover, the manned aircraft needed customization to fit cameras and other accessories and therefore required approval from the Civil Aviation Authority [8]. Commercial aerial survey operators usually have these arrangements in place, so such requirements are not captured in a cost-only analysis. However, these requirements added to the pre-flying phase input, especially for the first flight mission of the manned aircraft, both in terms of time and money. These inputs are indicative of the technological complexities of manned aircraft flights.

Manned aircraft flight operations are also much more training intensive than UAS operations. Flying a manned aircraft for aerial photography, where one needs to fly along a pre-planned flight line at a fixed speed, requires a long-term commitment to training and the accumulation of flying experience. On the contrary, UAS operations are much easier to learn and implement. With only relatively brief training, project personnel can readily operate a UAS to obtain aerial photographs of an area of interest.

The lower cost of UAS-based aerial photography for a small area is considered one of the reasons for the growing use of UASs in various remote sensing operations [34], including aerial photography for monitoring conservation projects [2]. Apart from the investment, UASs are considered safer, as there is a lower risk to life and property in the event of an accident [35]. The critical public concern for drone usage is privacy [18], and this is reflected in UAS flying and usage guidelines mandating that consent be secured from all private landowners and people present within the UAS flying zone. Even though manned aircraft flights could capture high-resolution images of the same area and people, there is no legal requirement for specific consent for image acquisition [36]. Securing flight consent is one notable feature, falling within the administrative category of flight operations, where the input requirements for UASs are higher than for manned aircraft flights. The other aspect requiring higher input for UAS operation is the need to arrange logistics on the ground, including finding a secure launching and landing site near the flight path [3]. In all other aspects, UASs have higher input efficiency than manned aircraft flight missions for aerial photography.

#### *4.2. Output Efficiency of Manned Aircraft Flights Is Higher Than That of UASs for Aerial Photography*

Manned aircraft flights have the potential to acquire more images over a larger area than UASs during flying missions, because UASs are restricted by speed and battery power. The ability to acquire and process more images can be translated into larger area coverage, improved ground resolution, or covering the same area in fewer images.

The need for UASs to always be visible to the pilot and co-observer also limits the extent of the flying area from each launching site [3]. The area coverage of manned aircraft flights was significantly higher than that of the UASs for all three flight missions (Table 3). Accordingly, the average overall cost of manned aircraft flights (US\$18.75/ha) was much lower than for UASs (US\$104.28/ha). For image quantity, however, the average cost is reversed, as the manned aircraft flights (US\$66.81/image) were much more expensive than the UASs (US\$2.97/image). On the other hand, the ground resolution of aerial images taken from UASs is finer (0.8 cm) than that of manned aircraft flights (average 4.3 cm; Table 3). These contrasting figures highlight the importance of a holistic analysis for comparing new technologies.

The area coverage and ground resolution can be adjusted by flying at different heights according to the intended end use of the aerial images. It is the number of acquired and processed images that is independent of the user's choice or requirement; therefore, the number of images was considered the key performance indicator (KPI) for the frontier analysis. As noted earlier, the ground resolution of images acquired from manned aircraft flights can be improved by flying at a lower height. However, there are administrative restrictions on flight heights. For manned aircraft flights in New Zealand, the lowest flight height is set at 304 m above ground level over settlements and 152 m over other areas; a flight plan with a lower altitude AGL needs special approval from the Civil Aviation Authority (CAA), New Zealand [37]. The ground resolution of the resultant images can also be improved by using a lens of longer focal length. In either case, the field of view of such images will be reduced [8], necessitating more images to cover the area of interest. Maintaining the image triggering sequence with a shorter time interval between photocentres is challenging, and the potential for missing images also increases. Correcting the GPS tags of these photos can also become challenging, as the interval between images will likely be shorter than a second and thereby difficult to align with the trigger time log. For UASs, on the other hand, the flight restriction is on the maximum height. The maximum height UASs are permitted to fly at in New Zealand is 122 m [38], and there are similar regulations for flying UASs in many countries of the world [39]. Additionally, safety concerns and the reactions of wildlife such as birds can limit how low a UAS can be flown [40], putting a cap on the image ground resolution for a given camera setting. In any case, despite the coarser ground resolution, the output efficiency of manned aircraft flights remains higher than that of UASs in terms of the number of images acquired and the higher ground coverage.

#### *4.3. High UAS Flexibility for Monitoring Dynamic Ecosystems*

The productivity frontier analysis indicates that deploying UASs requires much less time and money than commissioning a manned aircraft flight for an aerial imaging mission (Figure 4). This input efficiency gives UAS missions very high flexibility for arranging aerial imaging at short notice. The low cost of procuring a UAS makes the hardware accessible at all times, even at the project level. In contrast, given the maintenance and other associated costs, procuring a manned aircraft flight at short notice would likely be unrealistic at the project level. As manned aircraft flights are specialized, even outsourcing the service needs considerable lead time to secure a slot for the mission. This is a key constraint for monitoring dynamic ecosystems such as braided rivers, where weather events might rapidly change flow regimes and necessitate that monitoring take place within only a few days, as demonstrated by the Aparima River weather event in October 2020 (Figure 3). UASs have been used for detecting temporal changes in both biotic and abiotic habitat features, such as detecting spring phenology [18] and monitoring erosion [41]. The high temporal resolution of image sequences from UASs can also be key to detecting vegetation changes [42]. Overall, this flexibility in re-deployment is a positive attribute of UASs for monitoring dynamic habitats.

Even technologically, drone operations are more flexible than manned aircraft flights, especially for pre- and peri-flying phases. The flexibility comes from the availability of flight planning and operating software suites such as Pix4D, which can be used through everyday platforms such as mobile smartphones. It is also useful to be able to adjust flight plans according to changing field conditions. For our second unmanned mission, due to high winds, we adjusted our mission by slightly reducing the area to conserve the battery power used to stabilize the drone against the wind. Such flexibility for adjusting flight missions in response to local conditions is limited for manned aircraft flight missions.

#### *4.4. Use of Frontier Analysis in Future Research*

It would be interesting to assess the efficiency of varying aerial flight mission settings (flight height, camera, and focal length for manned aircraft flights and UASs) through a scenario analysis. In this article, we demonstrated efficiency measurement using the number of images as a performance indicator. However, it is possible to use other performance indicators, such as ground resolution or area coverage, as outputs. The DEA productivity analytical framework can be used for any such analysis. The productivity frontier draws a general threshold for productivity (minimum input requirement and maximum possible output), and free disposability allows the input-output observations to move proportionately along that frontier [28,30]. If a mission setting falls under the curve, it is operating at a lower efficiency. Additionally, frontier analysis is considered better able to handle differences in scale than some other analyses and is used for technological comparisons in other fields [29]. With this, researchers and project managers can make use of DEA productivity frontier analysis for selecting the technology and settings for aerial photography missions, including the flexibility of re-deployment for frequent monitoring.

#### **5. Conclusions**

Management of dynamic ecosystems such as braided riverbeds typically requires monitoring at high frequencies over large spatial extents. We applied an established and easy-to-use method for comparing two aerial imaging technologies to be used for monitoring dynamic braided river habitats. This provides some insight into the field operations of both manned aircraft and unmanned aerial image flight missions and demonstrates a way to make technological choices in research and conservation practice.

We assessed the applicability of aerial photogrammetry for monitoring habitat restoration efforts in the Aparima River, a braided river in Southern New Zealand, and compared the efficacy of using manned aircraft and unmanned aerial systems. We found that, technologically, manned aircraft and unmanned flights have similar efficiency. Even though manned aircraft flights have the potential to cover a larger area, they are constrained by the need for high initial investment, both in terms of money and personnel training. Outsourcing from specialized aerial photography aviation companies can also be costly and will have limited availability due to demands elsewhere. UASs require much less initial investment, and it is relatively easy to train project personnel to fly UASs for aerial photography missions. The short lead time required for UAS flying makes them flexible to deploy, which is critical for monitoring dynamic braided river habitats.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/drones5020039/s1, Tables S1 and S2.

**Author Contributions:** Conceptualization, P.J.S., R.F.M. and M.S.I.K.; methodology, M.S.I.K.; formal analysis, M.S.I.K.; data curation, M.S.I.K.; writing—original draft preparation, M.S.I.K.; writing—review and editing, R.O., R.F.M. and P.J.S.; visualization, M.S.I.K.; supervision, P.J.S.; funding acquisition, R.F.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially (manned aircraft flights and on-field operations) funded by the Aparima River Habitat Enhancement Project through the Department of Conservation.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data used for analysis and results presented in this manuscript are available as Supplementary Materials Table S1 of this manuscript.

**Acknowledgments:** We are grateful to Simone Cleland and Terry Green for helping with manned aircraft flight planning and operations. We are also grateful to the Maukahuka project of the Department of Conservation, New Zealand for providing the Mavic 2 Pro drone. We would like to thank Ann de Schutter for providing hands-on orientation on flying the Mavic 2 Pro drone and Ella Sussex for helping as a ground observer for the UAS flights. Clement Lagrue coordinated the UAS flights, including ensuring the consents of the different stakeholders involved. We would also like to thank Grant McGregor and Wreys Bush Concrete for support and facilitating groundwork, including communication with local landowners. Special thanks to Hugh Robertson and Canterbury Aviation for their services in conducting the manned aircraft flight missions. We are also thankful for the contributions of the three anonymous reviewers and the academic editor, whose comments have significantly improved the content of this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*

**Assessing the Potential of Remotely-Sensed Drone Spectroscopy to Determine Live Coral Cover on Heron Reef**

**Valerie J. Cornet 1,\* and Karen E. Joyce 2**

Received: 28 February 2021; Accepted: 15 April 2021; Published: 17 April 2021

**Abstract:** Coral reefs, as biologically diverse ecosystems, hold significant ecological and economic value. With increased threats imposed on them, it is increasingly important to monitor reef health by developing accessible methods to quantify coral cover. Discriminating between substrate types has previously been achieved with in situ spectroscopy but has not been tested using drones. In this study, we test the ability of using point-based drone spectroscopy to determine substrate cover through spectral unmixing on a portion of Heron Reef, Australia. A spectral mixture analysis was conducted to separate the components contributing to spectral signatures obtained across the reef. The pure spectra used to unmix measured data include live coral, algae, sand, and rock, obtained from a public spectral library. These were able to account for over 82% of the spectral mixing captured in each spectroscopy measurement, highlighting the benefits of using a public database. The unmixing results were then compared to a categorical classification on an overlapping mosaicked drone image but yielded inconclusive results due to challenges in co-registration. This study uniquely showcases the potential of using commercial-grade drones and point spectroscopy in mapping complex environments. This can pave the way for future research, by increasing access to repeatable, effective, and affordable technology.

**Keywords:** remote sensing; coral reefs; drones; linear unmixing; R; Google Earth Engine

#### **1. Introduction**

Coral reefs are some of the most biologically diverse ecosystems on the planet, providing key ecosystem services to coastal communities through tourism, food security, and coastal protection [1]. However, reefs around the world are currently experiencing decline, through mass coral bleaching, ocean acidification, and water quality reduction [2]. Due to both their ecological and economic importance, more accessible and cost-effective methods to map and monitor the decline of coral reefs are needed.

Many monitoring programs have focused on studying reefs locally using in situ field methods [3]. Due to the various difficulties of working in aquatic environments, there is increasing pressure to develop better, wide-scale methods to map and monitor coral reef benthos [4,5]. Collecting data using remote sensing therefore complements research conducted in the field. By developing more affordable and repeatable remote sensing methods, research can be made more accessible and efficient. This is particularly useful in locations that are hard to access, as the improved capacity to survey remote areas can facilitate repeated monitoring [6]. This can be achieved at broad spatial and temporal scales, using platforms such as drones, aircraft, or satellites.

Drone-based remote sensing presents a wide array of advantages with regard to local, detailed assessments of study sites. With advances in the technological field over the years, the cost of using drone-mounted sensors has decreased, making consumer-grade drones accessible to many whilst reducing the need for expertise in operating commercial-grade drone technology [7]. Increased battery life has led to increased flight time, and decreased payload weight has made drones lighter and more user friendly [8]. Additionally, on-demand deployment has the advantage of choosing favourable weather conditions for collecting data [9]. Drones provide the benefit of flying under cloud cover, resulting in greater flexibility in data collection time frames compared to satellites and aircraft. Furthermore, since external limitations presented by the environment influence the accuracy of benthic mapping studies, reducing the distance between the sensor and the subject reduces atmospheric effects on readings [10]. These combined benefits give drones a competitive advantage over other remote sensing platforms.

However, there are disadvantages to using drones that need to be considered as well. Data processing errors often occur within the quantitative analysis and classification steps, while errors in data collection arise from the sensors and platforms, the classification steps, and the environment. When collecting data with drones, it is important to note that the platform moves; as it does, attached spectrometers do not always point directly downwards, meaning that spectral readings may not always be taken from the area of interest [11]. Errors in data collection from drones may also occur through geopositioning, as the accuracy of the GPS location determined by the drone is not always exact. This is particularly the case for commercial-grade drones, as the inertial navigation systems that measure position information are often of low to medium accuracy to save costs and payload weight [12]. Similar to the inaccuracies presented by drones' GPS, in-water validation imagery collected in situ is also subject to spatial inaccuracy. This, along with scale differences in field data, makes it difficult to use field data for direct comparison with aerial mapping [13].

Mapping and monitoring using remote sensing often rely on being able to accurately record colour or light interactions in the environment [14]. This includes using spectrometers to measure reflection, absorption, and transmission, and finding patterns or 'spectral signatures' that may be unique to features of interest—in this case, live coral, algae, rock, and sand. This information can be collected using imaging spectrometers (e.g., hyperspectral scanners) or with individual point-based spectroscopy [14]. Spectroscopy has been used to distinguish between live coral and other coral reef benthos in the past, but these studies have largely been limited to in situ underwater or close-range measurements [15–20]. Capturing data in that way is time intensive and provides limited coverage.

However, drone-based spectroscopy provides the opportunity to extend this coverage, providing a tool for rapid data collection. While other research has documented the potential for using small and lightweight imaging spectrometers on drone platforms (e.g., [21]), little work has been done to test the extent to which the more affordable point-based spectrometers can also capture categorical and continuous variable information about the benthos and water column.

In using drones, a major consideration is the influence of the water column and the nature of its influence under different light and environmental conditions, such as waves. With varying depths and water quality, there is likely increased confusion between more spectrally similar classes such as algae and coral due to uneven attenuation throughout spectral signatures [18]. For example, it has been found that with higher chlorophyll or sediment content in the water, more algae will likely be classified as coral [22]. Lee et al. [23] proposed a widely used inversion model that uses diffuse attenuation coefficients as functions of light absorption and scattering. This model was built upon to derive water column properties and water depth, and has been widely used in water column correction [24,25]. Classification of benthic groups was successfully achieved by Goodman and Ustin [24] by combining Lee et al. [25]'s semi-analytical inversion model with linear spectral unmixing, which allowed for correction of the water column and achieved an overall accuracy of 80% for all substrate groups. BRUCE, a model built upon Lee et al. [25]'s algorithm, achieved an overall accuracy of 79% in mapping benthic substrates [26]. However, in clear, unturbid waters shallower than 5 metres, water column correction is not always necessary to capture accurate measurements of the benthos.

This study tests the extent to which consumer-grade drones are capable of providing fine resolution information on coral reefs. This type of data offers a low-cost resource that has the potential to overcome separability issues between classes such as coral and algae, as well as a level of detail and information that cannot be provided by multispectral and RGB data. By using drones, there is the potential to bridge the scale gaps presented between field and satellite-based assessments. Achieving this would help pave the way for future research in the field of remote sensing, as it would demonstrate how accessible technology such as consumer-grade drones and public spectral endmember libraries can be used by anyone. As such, the aim of this study is to quantify the amount of various benthic substrates using drone-based spectroscopy on Heron Reef.

#### **2. Materials and Methods**

*2.1. Study Site*

Data were collected at Heron Reef (23.44° S, 151.91° E), a shallow, lagoonal coral reef located on the Southern end of the Great Barrier Reef, Australia (Figure 1). The shallow depth of the reef and the clear water afforded by its offshore location allow for effective spectral data collection. As it is a lagoonal reef, the depth remains relatively constant across the reef.

**Figure 1.** (**a**) Heron Reef study site. The satellite image, obtained from Google Earth, shows Heron Island, Heron Reef and the lagoon, along with the primary drone flight used to establish the workflow; the drone flight is indicated as a series of black points. (**b**) Mosaicked RGB image of the reef and study area in Google Earth Engine. Note that in Google Earth Engine the creation of completely circular areas is not possible; a pentagon was therefore used as the closest equivalent for the linear unmixing results. (**c**) Example spectrum from the drone flight, sampled from the high coral cover region of the reef.

*2.2. Data Collection*

2.2.1. Public Spectral Library

We used a spectral library of known features to calibrate and validate our drone spectroscopy mapping model. For the purpose of this study, we defined the benthic substrate features of interest (spectral endmembers) as live coral, algae, sand, and rock, as these are the most common generalised substrate groups found at the study site [27]. Representative spectra were chosen from the public spectral library of substrata collected in situ at Heron Island in 2006 by Dr. Christian Roelfsema and Dr. Stuart Phinn [28]. The public library consisted of endmember spectra recorded at shallow depths using a dive torch as a light source 5 cm away from the subject, with a white panel used as a baseline to calibrate the respective spectrometers. The dark current of the spectrometer (which varies with temperature) was also accounted for, removing the effects of dark current noise. The digital numbers recorded were converted to reflectance values through the equation below, where Dark denotes the dark current reading, Target the reading of the target, and White the reading of the white panel:

$$R = \frac{(\text{Target} - \text{Dark})}{(\text{White} - \text{Dark})} \tag{1}$$
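As a worked illustration, Equation (1) can be applied directly to raw readings. The following minimal R sketch (function and variable names are ours, not from the cited library [28]) converts hypothetical digital numbers at a single wavelength:

```r
# Equation (1): convert raw digital numbers (DN) to reflectance using
# dark-current and white-panel readings. All values are hypothetical.
dn_to_reflectance <- function(target, dark, white) {
  (target - dark) / (white - dark)
}

dn_to_reflectance(target = 1850, dark = 120, white = 3900)  # ~0.458
```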

#### 2.2.2. Drone Spectroscopy

Spectroscopy data were collected using an Ocean Optics STS-Vis spectrometer with a 15° field of view, mounted on a 3DR Solo drone, which measured reflectance within an effective spectral range of 350 to 800 nm [29]. At a flying altitude of 20 m, this achieved an approximate spectral and spatial resolution of 0.13 nm and 5.2 m, respectively (Figure 2). Point spectroscopy data were collected approximately four times per second, and each data point was attributed with the time and coordinates of the drone at the time of capture. The drone was flown up and down adjacent flight paths using a trajectory perpendicular to the shore in order to obtain a cross-reef-flat study area (Figure 1a). Data were calibrated to reflectance using a 99% Spectralon® reference panel (Labsphere) [22]. A Phantom 4 Pro with an RGB camera also captured photos over the same region for accuracy assessment.

**Figure 2.** Drone footprint of the flight path for spectrometer data collection. Since the drone was flown at a constant height of 20 m, a point resolution of approximately 5.2 m diameter was achieved.

#### *2.3. Data Processing*

As seen in Figure 3, there are three sequential steps in the methods tested, which are described below. As the workflow relies entirely on code written in R and Google Earth Engine, there is no need for expertise in commercial software, nor for licenses for paid software (see Supplementary Materials).

**Figure 3.** Workflow of the study. Each column represents the workflow to achieve the three objectives. The first column demonstrates the steps to obtain and modify the spectral library from a public database to obtain the four spectra for live coral, sand, algae, and rock. The second column shows the steps to obtain the processed drone data. These two datasets will be used in combination to derive the fractional contributions of each endmember class. The third column shows the steps of the accuracy assessment. The output datasets are shown in boxes and unboxed comments represent the steps for each objective.

#### 2.3.1. Evaluating the Spectral Library

To choose endmember spectra, a principal component analysis (PCA) was conducted on all the endmembers present in the public library [30]. This method of visualising the maximum variation between data points has previously been used for endmember determination [31]. Spectra that were projected far from other substrate classes and within their own substrate class were chosen. A preliminary review comparing the spectral signatures to known "pure" endmember signatures was conducted to confirm the suitability of each endmember.

Spectra were processed to create the final endmember library of the four substrate classes (Figure 3). Spectra were smoothed through Savitzky–Golay smoothing and normalised for vector length. Vector length normalisation involves calculating the length of a reflectance vector and dividing the reflectance values by that length [32]. This ensures that the focus in the spectral unmixing step is on the shape of the spectral signatures rather than on their absolute values. Finally, the spectra were tested for collinearity using the detect.lindep() function in R from the plm package [33,34]. There should be no collinearity or linear dependence detected between endmember spectra, as this is likely to lead to misclassification. A sketch of these preprocessing steps is given below.
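The following is a minimal sketch of this preprocessing, assuming hypothetical endmember spectra; we use signal::sgolayfilt() for the Savitzky–Golay step and a QR rank check as a simple stand-in for the collinearity test (the study itself used plm::detect.lindep()):

```r
# Preprocessing sketch: Savitzky-Golay smoothing, vector-length
# normalisation, and a linear-dependence check. Spectra are hypothetical.
library(signal)

set.seed(1)
wl <- seq(400, 750, by = 1)  # wavelengths (nm)
endmembers <- cbind(coral = runif(length(wl)),
                    algae = runif(length(wl)),
                    sand  = runif(length(wl)),
                    rock  = runif(length(wl)))

# 1. Savitzky-Golay smoothing (3rd-order polynomial, 11-point window)
smoothed <- apply(endmembers, 2, sgolayfilt, p = 3, n = 11)

# 2. Vector-length normalisation: divide each spectrum by its Euclidean norm
normalised <- apply(smoothed, 2, function(s) s / sqrt(sum(s^2)))

# 3. Linear-dependence check: full column rank means no exact collinearity
stopifnot(qr(normalised)$rank == ncol(normalised))
```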

#### 2.3.2. Evaluating Drone Spectroscopy

A separate endmember library, consisting of endmembers sampled from the hyperspectral drone data, was created by conducting a principal component analysis of the data and choosing the "purest" endmembers found clustered furthest apart. This was used as a comparison to the public endmember library in order to evaluate the pros and cons of each. Sampling within the studied dataset has the advantage of providing spectra that are inherently sourced from the same sensor and under the same environmental conditions. However, the likelihood of obtaining "pure" spectra is low due to the resolution of the spectroscopy data and the fine scale of spatial heterogeneity of the reef benthos.

Spectral reflectance values of the final endmembers were smoothed and normalised in the same manner as the drone data, as explained above. The spectra were then resampled to match the wavelengths sampled in the drone spectroscopy dataset. Resampling was conducted through the resample() function in the spectrolab package in R [33,35]. The unmixing algorithm was then used to separate these endmembers and determine the fractional contribution of each endmember; through this, live coral cover may be estimated. This section of the workflow was processed in R (Figure 3) [33].

#### 2.3.3. Unmixing Drone Spectroscopy

Spectra were imported into and processed in R in the appropriate format to run the code (columns as wavelengths and rows as individual points) and to work through the steps of objective 2 in the workflow (Figure 3) [33]. Prior to the unmixing step, spectra from the drone data were also smoothed using Savitzky–Golay smoothing and normalised through vector length normalisation [32,36]. Spectra collected by the drone were subset to reflectance between 400 and 750 nm, due to the opaque nature of the water column at wavelengths above this range and atmospheric scattering below it in the visible spectrum. Data reduction reduces the dimensionality of the dataset, which further aids algorithm performance, complexity, and data storage [37]. Due to the time limitations of the study and the aim of shaping a more accessible, repeatable, and relatively simple workflow, the water column was not corrected for using radiative transfer equations. Previous studies have confirmed that classification of reef substrata using the aforementioned spectral range remains possible at depths shallower than six metres, which was the case for this study [38].

A single endmember spectral mixture analysis (SMA) was conducted to unmix endmembers for the hyperspectral data obtained. This was chosen because previous studies have established its ability to unmix benthic classes, the accessibility of its unmixing algorithms, and the lower computational power needed compared to MESMA or non-linear SMAs. Single endmember unmixing functions as a linear unmixing method, assuming a linear contribution of endmembers to the spectra. This implies that the fractional spatial contribution of an endmember will equal the fractional spectral contribution that the endmember has on a spectrum. Although it is unlikely that the nature of spectral mixing among reef substrata is completely linear, most coral reef benthic studies that have used this method have yielded positive results [20,39]. The lack of perfect linearity in coral reef systems could be explained by the morphologic nature of coral colonies, where spectral reflectance may differ depending on the viewing angle of the colony [40]. This is also important when considering different substrate types overlaying one another. For example, a coral colony may have a dead top that presents as turf algae, whilst the rest of the coral colony below classifies as live coral. Despite this, using a linear unmixing model provides the additional advantage of being less sensitive to collinearity between endmembers [41]. This is useful for this study, as live coral spectra and algae spectra are known to be highly similar, resulting in an increased likelihood of estimation errors if a non-linear model is used.
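For reference, the linear mixture model underlying this approach (standard in the unmixing literature, though not written out explicitly here) can be stated as

$$\mathbf{r} = \sum_{i=1}^{k} f_i\, \mathbf{e}_i + \boldsymbol{\varepsilon}, \qquad f_i \geq 0$$

where **r** is the measured spectrum, **e**<sub>*i*</sub> are the *k* endmember spectra, *f<sub>i</sub>* their fractional contributions, and **ε** the residual; as described below, the fractions are constrained to be non-negative but are not forced to sum to one.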

Non-negative least squares (NNLS) constraints were applied to carry out simultaneous inversion of the data and endmember determination. The inversion step constrains the retrieved fractional abundances to be non-negative, meaning that all fractions within a pixel will be positive, rendering the results more realistic than unconstrained methods. The model was not forced to sum to one, to give a better indication of the spectral contributions left unexplained by the endmembers. If the summed fractional contributions obtained from the linear unmixing step are significantly less than one, this indicates the inability of the set of endmembers to fully explain the spectral signature of the hyperspectral data point. The linear unmixing was performed in R using the unmix() function in the RStoolbox package [33,42]. It was chosen as the model allows sparsity within the pixel, meaning that some endmembers within a hyperspectral pixel can be set to zero; this is important as not all endmembers will necessarily be present in all pixels. NNLS unmixing is also widely used in the field of marine studies due to its simplicity and proven ability to yield more accurate results than unconstrained unmixing [43,44]. In addition, NNLS unmixing decreases fractional retrieval error relative to unconstrained methods, especially in waters under 5 m of depth, which was the case for this study's dataset. Previous studies have demonstrated that the highest classification accuracy occurs when the fractional percentage of an endmember covers over 25% of the recorded pixel [45]. As coral colonies on Heron Reef can span an area larger than one pixel (>5 m wide), accurately determining live coral cover using this method is likely.
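A minimal, self-contained sketch of NNLS unmixing for a single measured spectrum is shown below, using the standalone nnls package as an illustration (the study itself used RStoolbox); the endmember spectra and mixture are simulated, not the Heron Island library spectra:

```r
# NNLS unmixing sketch: recover non-negative endmember fractions from a
# simulated measured spectrum. All spectra here are hypothetical.
library(nnls)

set.seed(42)
n_bands <- 351  # 400-750 nm at 1 nm intervals
E <- matrix(runif(n_bands * 4), ncol = 4,
            dimnames = list(NULL, c("coral", "algae", "sand", "rock")))
E <- apply(E, 2, function(s) s / sqrt(sum(s^2)))  # vector-length normalised

# Simulate a measurement: 60% coral, 30% algae, 10% sand, plus noise
r <- as.vector(E %*% c(0.6, 0.3, 0.1, 0)) + rnorm(n_bands, sd = 0.001)

# Solve min ||E f - r|| subject to f >= 0 (fractions not forced to sum to 1)
fit <- nnls(E, r)
setNames(fit$x, colnames(E))  # approximately (0.6, 0.3, 0.1, 0)
sum(fit$x)                    # close to, but not constrained to, one
```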

#### *2.4. Accuracy Assessment*

Using RGB/multispectral drone data collected along the same flight paths, an accuracy assessment was conducted. An RGB image was created by mosaicking images collected along the flight path. Live coral cover was estimated for each RGB image through supervised classification in Google Earth Engine, based on the methods of Bennett et al. [13], which yielded a high classification accuracy of over 85% for live coral. The workflow in this study was modified to suit the format of the dataset and to calculate substrate cover for point sizes comparable to those obtained by drone spectroscopy (Figure 3).

Within the multispectral images, polygons delimiting each substrate class were created to train the classification. The same number of random points was then selected within these polygons for each substrate class to ensure equal sampling and validation of the training data. The Classification and Regression Tree (CART) algorithm was chosen as the most suitable when compared with Random Forest [46]. To calculate live coral cover, 50 randomly generated points with coordinates overlapping the hyperspectral drone data were marked. A pentagonal area of 2.6 m radius was then demarcated around each point and the live coral cover within each area was calculated. This radius was chosen to approximate the circular footprint of the spectral point's 5.2 m diameter; as completely circular areas cannot be created in Google Earth Engine, a pentagon was used as the closest equivalent for comparison with the linear unmixing results. The accuracy of live coral cover assessment through spectral unmixing was then assessed using a Spearman's correlation test between the measured live coral cover (recorded from the RGB classification) and the percentage values obtained from spectral unmixing. The same test was conducted for the substrate classes of algae, rock, and sand. Through conducting the linear spectral unmixing, the root mean square error was also obtained for each endmember as an additional measure of error.
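The comparison itself reduces to a rank correlation per substrate class; a sketch is given below, where `cover` is a hypothetical data frame with one row per validation point holding the unmixed fraction and the RGB-classified cover:

```r
# Spearman's correlation between unmixed and RGB-derived cover
# (`cover` is a hypothetical data frame; column names are illustrative).
cor.test(cover$coral_unmixed, cover$coral_rgb, method = "spearman")

# Repeated likewise for algae, rock, and sand, e.g.:
cor.test(cover$sand_unmixed, cover$sand_rgb, method = "spearman")
```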

#### **3. Results**

By combining drone spectroscopy data and a public spectral library, linear unmixing of the spectroscopy points collected on the drone flight was achieved. Over 82% of the spectral variance seen in the drone spectroscopy dataset was explained by the chosen endmembers. With statistically significant correlations between live coral, rock, and sand cover derived from the linear unmixing and the RGB classifications, we highlight the potential for using drone spectroscopy in mapping coral reef habitats.

#### *3.1. Evaluating Spectral Libraries*

The PCA was conducted on a total of 101 spectra from a public spectral library, divided into eight substrate classes. There was a lack of distinct clusters for all substrate classes, although most coral spectra formed a cluster with low scores in the first principal component (Figure 4a). Spectrum '56' was chosen as it was projected furthest away from the highest density of algae spectra, with high scores in the first component. The coral spectrum chosen was that of an Acropora colony, deemed appropriate due to the prevalence of Acropora in shallow, lagoonal waters and specifically at Heron Reef in the area of the data capture [47]. The projections of algae spectra led to the choice of spectrum '67', that of turf algae. This was also deemed appropriate, as turf algae is generally the most abundant algal assemblage found on coral reefs [48,49]. For both sand and bare rock, due to the low number of spectra present in the public spectral library, spectra '80' and '48', respectively, were chosen, as they were positioned away from the other selected spectra.
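A minimal sketch of this screening step is given below, assuming a hypothetical matrix `library_spectra` (101 rows of spectra) and a vector `class` of substrate labels:

```r
# PCA-based endmember screening sketch (object names are hypothetical).
pca <- prcomp(library_spectra, center = TRUE, scale. = FALSE)
summary(pca)$importance[2, 1:2]   # proportion of variance in PC1 and PC2

# Candidate endmembers are picked from points projected away from the
# dense clusters of other classes in the score plot.
plot(pca$x[, 1], pca$x[, 2], col = as.factor(class),
     xlab = "PC1", ylab = "PC2")
```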

The four spectra were normalised, smoothed, and tested for linear dependence, which was not detected. Spectral signature shapes were compared with known endmember spectra in the literature to validate their likeness to "pure" spectra. Comparison with the spectra published by Joyce and Phinn [43] confirmed that the spectra chosen for the endmember library were suitable (Figure 4b).
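These pre-processing operations can be sketched as follows, for a hypothetical (bands x 4) endmember matrix `em`; the Savitzky–Golay filter parameters are illustrative, not those of the original study:

```r
# Smoothing, vector-length normalisation, and a linear dependence check.
library(signal)

em_smooth <- apply(em, 2, sgolayfilt, p = 3, n = 11)  # Savitzky-Golay smoothing
em_norm <- apply(em_smooth, 2, function(x) x / sqrt(sum(x^2)))  # unit length

# Full column rank means no endmember is a linear combination of the others
qr(em_norm)$rank == ncol(em_norm)
```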

#### *3.2. Evaluating Drone Spectroscopy*

To challenge the use of public libraries, a PCA was conducted on the drone spectroscopy data to evaluate the potential for endmember extraction within the dataset. As seen in Figure 5i, no clear clusters can be seen, but points were projected across the plot in three directions (a, b, and c). Points projected around "b" and "c" were situated with low and high scores in the first principal component, respectively, whereas points around "a" were projected with high scores in the first and second principal components (Figure 5i). Situating these spectra on a map indicated that they represent deep water, coral, and sand (Figure 5ii). This was validated upon further inspection of the spectral signatures, with the deep-water signature showing a characteristic continuous dip in reflectance past 750 nm (Figure 5iii). However, due to the spatial resolution of the drone data (circular area of 5.2 m diameter) and the heterogeneous nature of coral reefs, the extracted spectra were unlikely to be as "pure" as those obtained from the public spectral library. The difference in spectral signature between the extracted spectra and the public library spectra could also be attributed to endmember heterogeneity, where the extracted endmembers for coral could have represented different species or even bleached corals. Despite some clustering in the plot, it was difficult to confidently extract "pure" algae and rock endmembers, reinforcing the advantages of using the public spectral library.

#### *3.3. Unmixing Drone Spectroscopy*

A total of 2553 reflectance measurements were unmixed using the selected endmember library. Spectral unmixing of the drone data revealed a live coral coverage ranging from 0 to 24% across the drone flight path studied. An increasing coral cover gradient can be observed progressing away from the island (Figure 6a). Similarly, rock cover decreased along the same gradient, but was found in lower density compared with live coral, ranging from 0 to 17% (Figure 6b). Conversely, sand cover was higher on the sandy reef areas, ranging from 0 to 64%, as expected. Data points where no sand influenced the spectral signatures all coincide with the highly structured section of the reef preceding the reef slope (Figure 6c). Algae showed the greatest range of percentage cover, of 0 to 69%. As seen in Figure 6d, most points displayed a percentage algal cover between 25 and 60%, which is high compared with the other substrate classes.

**Figure 4.** (**a**) Principal component analysis of endmember spectral signatures. Chosen endmembers are circled in green (coral), red (algae), green-blue (rock), and blue (sand). Various other substrate classes from the spectral library are also shown but were not included in the formation of the final endmember library. The first principal component accounts for 37.5% of the variation, whilst the second component accounts for 29.43%, indicating that the first two principal components do not fully explain the variance. (**b**) Spectral signatures of chosen endmembers. Spectra were all smoothed using Savitzky–Golay smoothing and normalised for vector length.

**Figure 5.** (**i**) Principal component analysis of hyperspectral drone data, (**ii**) Map of drone flight, (**iii**) Spectral signatures of self-sampled spectra. Spectra were chosen from the points that clustered the furthest apart, where "a" is likely to represent deep water, "b" coral, and "c" sand. Note that spectra are unlikely to be pure but serve as the purest spectra within the drone dataset. Axes are not shown to the same scale for better visualisation of spectral trends.

**Figure 6.** Percentage benthic habitat type estimated from linear unmixing using drone spectroscopy: (**a**) coral, (**b**) rock, (**c**) sand, and (**d**) algae. Results were overlaid on a map of the study area. Substrate cover is shown on a scale of 0 to 1. The model yielded an RMSE of 0.00204.

**Figure 7.** Mosaicked RGB image of the corresponding study region showing the fifty randomly generated pentagons used to calculate substrate cover, with a classified map of the four substrate classes: coral (purple), rock (orange), sand (yellow), and algae (green).

As fractional contributions of endmembers were not forced to sum to one (100%), the unexplained fractional contributions may be explained by endmembers that were not included in the endmember library, such as species within the same class with variable spectra or completely separate substrate classes such as marine biota or mud. Overall, the summed percentage cover of the four endmembers for all points ranged from 82 to 100%, showing that the endmembers chosen were able to account for at least 82% of the spectral mixing seen within each drone point. Over 78% of data points studied showed total percentages of over 90%. This demonstrates that the use of only four endmembers can produce a relatively representative map.
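This check amounts to summing the retrieved fractions per point, as in the short sketch below (using the hypothetical `fractions` matrix from the earlier unmixing sketch):

```r
# Share of each spectrum explained by the four endmembers
total <- rowSums(fractions)
range(total)        # reported range: 0.82 to 1.00
mean(total > 0.90)  # proportion of points explained above 90%
```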

#### *3.4. Accuracy Assessment*

To check the validity of the results, an accuracy assessment was conducted to compare the unmixing results to a classification of fifty polygonal areas (Figure 7). As seen in the classified mosaicked image, the inner reef flat showed the greatest number of pixels classified as sand. Further towards the crest, algae was the dominant substrate class, with coral and rock substrate types also increasing in this area (Figure 7).

Retrieving fractional contributions by linear unmixing revealed a plausible spatial distribution of the unmixed endmembers. Spearman's correlation tests revealed a significant moderate correlation between live coral cover derived from spectral unmixing and from RGB classification (rs = 0.408, S = 13085, p = 0.00297) (Table 1). Results estimated a mean live coral cover of 17% and 14%, respectively, for the unmixing and RGB classifications. As seen in Figure 8a, this agreement occurred predominantly at low to moderate coral cover. Although a correlation is seen, a greater range in coral cover would need to be tested to better characterise the relationship between the unmixing results and the RGB classification.


**Table 1.** Results of Spearman's correlation test. Significant correlations are highlighted in bold.

**Figure 8.** Relationship between substrate cover determined from linear unmixing and from the RGB classification. The black line represents the best fit line and the grey area indicates the 95% confidence interval. (**a**) Regression results for live coral cover, (**b**) Rock cover, (**c**) Algae cover, and (**d**) Sand cover. Axes are not shown to the same scale for better visualisation of individual trends.

Similar to live coral, rock cover yielded a moderate correlation between the unmixing and RGB classification results (rs = 0.505, S = 10943, p = 0.000158). However, rock cover was underestimated in the linear unmixing process when compared with the RGB classification (Figure 8b). This underestimation may have resulted from misclassification within the accuracy assessment, with other benthic types falsely classified as rock. Confusion between rock and algae is especially likely, given the difficulty of distinguishing turf algae overgrowing rock or dead coral specimens. This would have resulted in an overestimation of rock in the RGB classification.

On the other hand, algae cover showed a low and insignificant correlation between unmixing and RGB classification results (rs = 0.115, S = 19570, p = 0.424) (Figure 8c). Again, this may be due to the inability to distinguish between turf algae and other benthic groups in the RGB classification, but could also be linked to human error in the training step, which was limited by the reduced spectral information and insufficient spatial resolution available to confidently classify groups.

Sand classified by linear unmixing had the highest correlation with that obtained from the accuracy assessment, likely meaning that the sand measured is indeed sand (rs = 0.620, S = 8392.8, p = 1.208 × 10−6). Sand was underestimated in the linear unmixing process, which could potentially be explained by error in the unmixing step, but could also be attributed to misclassification in the RGB classification (Figure 8d). Sand could have been underrepresented due to the unmixing algorithm detecting spectral influences from other substrate classes such as algae and small biota that may be too small to be visualised at the resolution of the RGB image. Sand is also variable in origin, grain size, and mineralogy, and one endmember may therefore not explain the spectral mixing caused by both silicate sand and carbonate sand produced by bioeroders such as parrotfish and by physical erosion.

#### **4. Discussion**

#### *4.1. Using Public Spectral Libraries*

Successfully using public spectral libraries for these types of analyses encourages mapping efforts by making unmixing studies more accessible, decreasing the need for field collection of endmembers and facilitating endmember determination [28]. It is important to note that determining endmembers arguably remains the most crucial step in the spectral unmixing process [50]. As the first step alongside data pre-processing, minimising error here is vital, as errors caused by insufficient or unrepresentative endmember selection propagate through all subsequent steps of the analysis.

Despite yielding positive unmixing results, the methods can be directly improved in the future through additional selective steps. Choosing endmembers from the PCA plot is the only step in the linear unmixing workflow that is not automated and relies on user choice. Although the endmembers chosen were successful in unmixing the drone spectroscopy points in this study, the method of choosing endmembers should be automated to remove bias and ensure repeatability. The method is also flawed in that choosing endmembers that cluster far apart on a PCA plot could lead to the extraction of anomalies, and hence to the use of endmembers that are not representative of their substrate class. This stresses the need for standardised endmember determination methods, examples of which include iterative endmember selection (IES) and endmember average RMSE (EAR) [50]. Using automated steps such as these will help standardise the proposed workflow and ensure that the choice of suitable endmembers is statistically backed.

With marine public spectral libraries becoming more accessible and complete, it may soon be possible to find pure spectra collected with the same sensors and under the same environmental conditions as those of individual studies, facilitating the pre-processing step by decreasing the need for extensive normalisation of datasets. To achieve this, future data collection efforts should aim to standardise methods for collecting spectra for libraries and provide additional information on the factors affecting intra-specific variability, such as developmental stage, tidal position, and bathymetric position [51]. Public databases are commonly used in the fields of mineral exploration and canopy analysis, where organisations and individual researchers have combined efforts to develop shared libraries for a range of different materials, both natural and anthropogenic. To minimise the variation in spectra caused by differences in data collection techniques, various standardisation methods have been proposed, such as continuous wavelet analysis, a form of scaling spectra [52]. By doing this, spectral libraries become increasingly transferable between studies, and the use of spectra from different libraries can be made possible. Shared public databases such as USGS, SPECMIN, and SPECCHIO also help in identifying the requirements of a spectral library by using a Database Management System (DBMS) that stores spectral information in relational tables [53]. However, this does not necessarily enforce data integrity, reinforcing the need for standardisation methods during data collection.

#### *4.2. Benthic Distribution on Heron Reef*

According to the current Reef Check Australia Health Report for Heron Island, the reef comprises approximately 37% live coral, similar to the 36% in 2017, the year the drone spectroscopy data were collected [54]. In that year, across 17 sites studied, hard coral cover ranged between 3% and 65%, a range within which the unmixed fractional contributions fell. Coral cover was highest at the reef slopes and lowest on sandy reef flats, in agreement with the Reef Check Report [54]. Although the unmixing results fall within the live coral cover range found by Reef Check, comparing results with monitoring studies must always be done with caution, as these in situ studies often overestimate live coral. This is often due to bias in choosing monitoring sites, where sandy regions tend to be monitored less frequently. Additionally, the findings of this study were based on a single drone flight and therefore may not accurately represent the benthic distribution on the rest of Heron Reef. This could explain the lower overall coral coverage yielded by the unmixing, at 17%, compared with the estimated 37% found by Reef Check.

For algae, a previous study by Roelfsema et al. [55] found that chlorophyll a concentrations in Heron Reef sediments were among the highest reported for any marine sediments. This was especially the case on the windward side of the reef, which is where the drone data from this study were captured. The sediments sampled were used to quantify benthic microalgal communities [46]. The high levels of benthic microalgae could partly explain the dominance of algae seen in the spectral unmixing results, as the algae endmember could have captured the fractional contributions of turf, macro-, and microalgae combined. Similarly, this could also explain the low rock cover found through the unmixing process, as rock covered by turf algae is likely to have a spectral signature similar to that of the turf algae endmember used.

#### *4.3. Sources of Error and Potential Improvements*

Weak correlations in the accuracy assessment may be attributed to errors in data collection and to misclassification during data processing. Whilst errors in data processing generally occur in the quantitative analysis and classification steps, errors in data collection accumulate through the sensors and platforms, the classification steps, and the water column, as previously mentioned [56]. As this study involved combining three separate datasets, errors produced in collecting or processing all three need to be considered.

#### 4.3.1. Sources of Error from Sensors and Platforms

As discussed, sources of error from sensors and platforms may arise from the instability of the moving drone platform and inaccuracies in geopositioning. To mitigate this, spectrometers may be mounted on a gimbal; however, not all commercial-grade drones have a built-in gimbal, and attaching one adds weight and cost. Errors in geopositioning have implications for the accuracy assessment step, as the coordinates of the drone spectroscopy data and the mosaicked RGB image will not match exactly. This could explain the lack of correlation seen in the accuracy assessment, as even minimal spatial inaccuracy can lead to significant changes in substrate cover in a heterogeneous environment. As the spectroscopy data are not in the form of imagery and do not provide spatial context, matching up data through landmark structures is not possible. This highlights one of the drawbacks of the accuracy assessment chosen for this study.

#### 4.3.2. Sources of Error from the Classification Steps

It must be noted that the accuracy assessment used in this study serves as one option for testing accuracy without the need for underwater data collection. The RGB classification itself presents inaccuracies, as it relies on visual classification by the user and is therefore prone to human error and bias. It is also limited by the amount of spectral information it holds and is more likely to confuse benthic groups such as coral and algae [27]. Therefore, the classification obtained from the linear unmixing has the potential to be more accurate than that obtained from the RGB image. To improve the RGB classification accuracy, more polygons and points could have been used to train the classifier, and sun glint could be added as a substrate group to minimise misclassification. By ensuring that data are collected in the most appropriate way in the first place, artefacts due to sampling and environmental conditions are minimised [57]. As in situ underwater validation was not conducted as part of the protocol established here, further testing by future studies is recommended to validate it.

As previously discussed, GPS location errors also arise during underwater validation, and images collected underwater in situ cannot be compared at the same scale. Efforts to minimise GPS location errors include the use of georeferenced quadrat sampling in estimating benthic cover, combined with underwater photography [13]. However, this would greatly increase data collection effort and does not address the issue of scale. An alternative would be to conduct an accuracy assessment using imagery collected from the same drone and at the same time as the drone spectroscopy collection. This would reduce spatial discrepancies between the spectroscopy dataset and the validation data, as both would be collected from the same source. Although this may serve as a credible accuracy assessment, the need to develop more effective validation methods remains.

Aside from the accuracy assessment, misclassification errors could have occurred in the linear unmixing step. These errors could be linked to inefficient data reduction, the absence of representative endmembers, or the confounding presence of the water column. Studies have found that many of the differences between coral and algae lie between 520 and 580 nm, and linear unmixing could therefore have been conducted on a dataset in which these wavelengths were given a greater weighting [13]. Hochberg et al. [27] used a multivariate stepwise selection procedure to isolate the wavelengths that best differentiate between substrate classes. Spectral feature selection is another method, which relies on extracting endmembers that minimise intra-class variability and maximise inter-class variability [58]. These methods remove less meaningful information from the dataset for more efficient classification. Inefficient data reduction could therefore be improved by focusing on wavelengths where diagnostic features of substrate classes are found, although the disadvantages of losing data must be considered.
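One hedged sketch of such a weighting is given below, assuming a hypothetical `wavelengths` vector of band centres in nm, the `endmembers` matrix from the earlier unmixing sketch, a single measured spectrum `r`, and a purely illustrative weight of 2:

```r
# Up-weight the 520-580 nm bands before NNLS unmixing; scaling the rows of
# both the endmember matrix and the spectrum yields a weighted least squares.
w <- ifelse(wavelengths >= 520 & wavelengths <= 580, 2, 1)
fit <- nnls::nnls(endmembers * w, r * w)
fit$x  # fractional abundances with the diagnostic bands emphasised
```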

Inaccurate estimation of benthic cover could also have occurred in the linear unmixing step by not including certain endmember classes (biota such as holothurians) or by not accounting for endmember variability within the analysis [59]. Due to the inherent spectral variation that occurs within and between species of the same class, using one endmember spectrum per substrate class leads to an oversimplification of the model that does not incorporate the heterogeneous nature of coral reef habitats [50]. Algae comprises more than turf, with various species of red, brown, green, fleshy, and calcareous algae, whereas corals can be classified as bleached, blue, brown, or soft/gorgonian, each differing in spectral signature [60]. To account for spectral variability within endmember classes, previous studies have used averages of various species and yielded a lower overall RMSE.

Alternatively, implementing MESMA instead of single endmember SMA accounts for endmember variability [45,61]. MESMA, in which different endmember spectra can be chosen depending on the pixel, has been shown to yield lower RMSE values in coral reef unmixing studies in the past [62,63]. However, MESMA can also be flawed, as it cannot fully incorporate the heterogeneous nature of coral reefs, choosing only one endmember spectrum per class in each pixel or point [62]. Further research should be conducted to evaluate its potential with point spectroscopy data such as that used in this study. Results can then be compared with those from studies in which endmember variability is not accounted for, or in which an average signature is used to represent one endmember class.

There are significant biological implications associated with grouping species together within endmember classes or omitting substrate classes. Although some mapping studies may not require differentiation within algal and coral groups, the use of such proxies for coral reef health could be misleading. Generally, an increase in turf or macroalgae over time can represent a phase shift from coral-dominated to algae-dominated reefs, indicating a decline in reef health, as the presence of some algae affects coral recruitment and survival [48]. However, observed increases in crustose coralline algae (a red alga) can instead indicate increasing coral reef health [48,64]. Although this study develops and tests a workflow and does not address biology directly, future applications of the technique should consider the biological implications of the dataset being used.

#### 4.3.3. Sources of Error from the Environment

This study was conducted on a shallow study site when environmental conditions were good and water quality was high; it was therefore an optimal site to test the effectiveness of the workflow under ideal conditions. If this method were applied in deeper water, water column correction would be needed. The algorithm of Lee et al. [25] has consistently improved classification accuracy when applied and should be used to build upon the workflow in this paper for future studies requiring water column correction. Combining a semi-analytical model with linear unmixing of hyperspectral imagery has been achieved with positive results by Goodman and Ustin [24] and Klonowski et al. [65], demonstrating the potential for using such models on spectroscopy data. It is important to note that water column correction remains a difficult task, which explains its exclusion from this study, designed for simplicity of use on shallow reefs. Nonetheless, through the use of linear unmixing techniques, this workflow serves as a first step towards scaling up the mapping of hyperspectral point data and can be extended to incorporate water column correction.

#### *4.4. Examples of Future Applications*

Although accurate measurements have been made using the relatively more affordable RGB data, there are benefits to using data with a greater amount of information. Mapping studies such as that by Bennett et al. [13] focused on using RGB images, rather than spectroscopy, to extract substrate cover, and showed the pros and cons of doing so. Although yielding positive results, that paper highlighted the difficulties in differentiating between certain substrate types such as live coral and rock, where live coral cover is often overestimated because rock is classified as coral. Spectroscopy was used in this project to assess whether a greater number of wavebands would help differentiate between similar-looking substrate classes. For estimating live coral cover, or the cover of other substrata, point-based (one-dimensional) coverage could be an effective way to obtain estimates whilst ensuring a higher classification accuracy than RGB images provide. Such data can also be useful for more sophisticated information extraction purposes in the future.

Although spectroscopy has been shown to successfully help in monitoring live coral cover, it is not limited to this application. Since it provides access to complex datasets without the need for extensive remote sensing expertise, the proposed workflow could be used for various purposes, such as quantitative mapping through monitoring bleaching and reef health, without being restricted by the environmental and time limitations of satellites. Joyce and Phinn [66] used hyperspectral imagery to derive the chlorophyll content of coral reef substrates. Quantifying pigment concentrations using drones may serve as an early warning for bleaching, or support reef health monitoring for conservation managers. Drone spectroscopy could be further applied to quantitative mapping by quantifying in situ fluorescence spectra of benthic substrates, which, if further tested, could open doors to quantifying the photosynthetic potential of the substrata [67]. These are applications of drone spectroscopy that remain to be tested and that could facilitate monitoring through the direct quantification of key variables. Developing this workflow for mapping substrate cover demonstrated a relatively simple application, but presents a method that enables a range of more sophisticated applications, made achievable by the simplicity of running the code.

#### **5. Conclusions**

Overall, using drone spectroscopy data shows promise for mapping benthic cover on Heron Reef. This type of data offers a low-cost resource that has the potential to provide a level of detail and information that cannot be provided by multispectral and RGB data.

The process of determining endmembers in this study was able to account for over 82% of the spectral mixing throughout all spectral measurements collected from a consumer-grade drone, and determined the fractional contributions of live coral, sand, and rock with moderate accuracy. Although current workflows still need further refinement, this method provides an accessible process that can be applied to data collected by affordable technology. Future research should therefore focus on testing the effectiveness of drone spectroscopy for specific applications, such as quantitative mapping or detecting coral bleaching. Further recommended steps to improve the study include an automated endmember selection step, bathymetric retrieval, and water column correction.

This highlights the importance of this study, as it can help further wide-scale research and monitoring programs, not only in highly studied sites but also in remote areas. With the increasing accessibility of both drone hyperspectral data and public spectral libraries, high spectral resolution information will become available for a wide range of mapping research.

**Supplementary Materials:** R Script used in RStudio for linear unmixing of hyperspectral points is available online at https://github.com/valeriecornet/DroneSpectroscopy/blob/main/R%20Linear%20Unmixing.R (accessed on 1 April 2021). JavaScript code used in Google Earth Engine for classification of the RGB mosaicked image is available online at https://github.com/valeriecornet/DroneSpectroscopy/blob/main/RGBClassification (accessed on 1 April 2021). The script was modified and extended from Bennett et al. (2020)'s code. RGB drone data are available via https://data.geonadir.com/project-details/173 (accessed on 1 April 2021).

**Author Contributions:** Conceptualization: V.J.C. and K.E.J.; methodology, V.J.C.; data collection, K.E.J.; formal analysis, V.J.C.; original draft preparation, V.J.C.; writing—review and editing, V.J.C. and K.E.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** Internal JCU staff grants to Dr Karen E. Joyce and Dr Stephanie Duce provided funding for field survey and data acquisition. There was no external funding provided for this project.

**Data Availability Statement:** The spectral endmember library collected by Dr Christian Roelfsema and Dr Stuart Phinn is openly available in Pangaea at https://doi.org/10.1594/PANGAEA.804589 (accessed on 3 October 2020), reference number 804589. The RGB images that were used to create the mosaicked image for accuracy assessment, i.e., the RGB drone data, can be found at https://data.geonadir.com/project-details/173 (accessed on 1 April 2021). R Script used in RStudio for linear unmixing of hyperspectral points is available online at https://github.com/valeriecornet/DroneSpectroscopy/blob/main/R%20Linear%20Unmixing.R (accessed on 1 April 2021). JavaScript code used in Google Earth Engine for classification of the RGB mosaicked image is available online at https://github.com/valeriecornet/DroneSpectroscopy/blob/main/RGBClassification (accessed on 1 April 2021). The script was modified and extended from Bennett et al. (2020)'s code.

**Acknowledgments:** We thank Stephanie Duce for assistance with drone data collection and Arnold Dekker for his valuable input on the study and for pointing towards the spectral endmember library used. We thank Christian Roelfsema and Stuart Phinn for sharing their valuable data. We thank Katie Bennett and Florence Sefton for sharing the JavaScript code that was used and modified for the classification of the mosaicked RGB image. Finally, thank you to Jonathan Kok, Raf Rashid, Redbird Ferguson, and Joan Li who reviewed and provided useful comments on drafts of the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Technical Note* **The Use of UAVs for the Characterization and Analysis of Rocky Coasts**

**Alejandro Gómez-Pazo 1,2,\* and Augusto Pérez-Alberti 1,3**

Received: 12 February 2021; Accepted: 5 March 2021; Published: 16 March 2021

**Abstract:** Rocky coasts represent three quarters of all coastlines worldwide. These areas are part of ecosystems of great ecological value, but their steep configuration and their elevation make field surveys difficult. This fact, together with their lower variation rates, explains the lower number of publications about cliffs and rocky coasts in general compared with those about beach-dune systems. The introduction of UAVs in research has enormously expanded the possibilities for the study of rocky coasts. Their relatively low costs allow for the generation of information with a high level of detail. This information, combined with GIS tools, enables coastal analysis based on digital models and high spatial resolution images. This investigation summarizes the main results obtained with the help of UAVs between 2012 and the present day in rocky coastline sections in the northwest of the Iberian Peninsula. These investigations have particularly focused on monitoring the dynamics of boulder beaches, cliffs, and shore platforms, as well as the structure and function of ecosystems. This work demonstrates the importance of unmanned aerial vehicles (UAVs) for coastal studies and their usefulness for improving coastal management. The Galician case is used to explain their importance and the advances in UAV techniques.

**Keywords:** UAV; rocky coast; Galicia; ecosystems; geomorphic change detection; GIS

#### **1. Introduction**

Rocky coasts are of great importance within the global context. This typology occupies three quarters of all coastlines worldwide, and cliff sections are found in 52% of coastal areas at the global level [1,2]. While these environments could be expected to show lower variation rates than other areas, such as sedimentary systems, they have been found to be characterized by a great dynamism [3]. Although many studies have been performed on rocky coasts, the overall research on these areas has been restricted by several limitations, such as spatial and temporal resolution or the relative importance of different erosion factors, which can generate very different shapes with similar values [2].

In Spain, coastal cliffs with slopes greater than 32° represent 21.88% of the total coastal area, and 80% of them have elevations below 100 m. These values are relatively similar in Galicia, where 15.29% of cliffs have slopes greater than 32°, and 6% of them have elevations above 200 m at their highest point [4,5]. In this regard, it is worth highlighting that, since rocky coasts show a wide variability, multiple factors affect and modify their characteristics and determine their evolution. In the case of Galicia, those factors are summarized in Figure 1 [4]. Lithological differences, such as different degrees of fracturing, play a key role in cliff behavior, with different lithological types having higher or lower degrees of vulnerability and different degrees of predisposition to the occurrence of mass movements [6,7] or other erosive processes related to bioerosion [8,9].


**Figure 1.** Main characteristics of the Galician coastline. (**a**) Coastal types in Galicia; (**b**) lithology of coastal areas in Galicia; (**c**) elevation of the Galician coast, in meters; and (**d**) slope in the same sector.

Another key element in the behavior of coastal systems is the type of structures associated with cliff toes. In this respect, there is a great diversity, ranging from shore platforms to heterometric boulder accumulations and sedimentary beaches [4]. All these elements determine coastal evolution and must be taken into account for any study related to coastal characterization or dynamic analysis. Moreover, the forms present at a given moment are part of the natural heritage, and this can play a relevant role in coastal evolution [10]. Indeed, in the Galician case, it is necessary to understand the past cold processes in order to explain the current coastal landscape [10].

Other factors add to the slower rates of change; among them, it is worth highlighting the difficulties of access, which have historically led to a lower degree of attention from researchers. Moreover, sedimentary areas have generally been considered more important due to their social and economic relevance, which has promoted their study [11]. This lack of attention to rocky coasts has been clearly evidenced by Naylor et al. [12], who reported an increase in this type of work in recent years, covering a wide variety of environments such as shore platforms [11,13], boulder beaches [14,15], and cliffs [16,17].

In general terms, the first studies on rocky coasts, performed only a few decades ago, focused on describing study areas by analyzing topographic maps and field surveys [18,19]. These works provided a first approach to the landscape of these areas and an interpretation of their possible evolution. Since then, new techniques and devices have been applied to understand the evolution of rocky coasts, among which the following are worth highlighting: TMEM (transverse micro-erosion meters) [11,20–24], TLS (terrestrial laser scanning) [25,26], LiDAR (light detection and ranging) [27–29], hardness testers [30–32], aerial and satellite images [33–36], and sensors such as RFID (radio frequency identification) [14,37–39], among others (Figure 2).

**Figure 2.** Timeline showing the evolution of research about rocky coasts. The position of the different techniques and methods along the timeline represents the date they were first used to study rocky coasts. Abbreviations: TMEM (transverse micro-erosion meters); LiDAR (light detection and ranging); UAV (unmanned aerial vehicles), and SfM (structure from motion).

This improvement in research on rocky coasts has been related to technical and technological advances (Figure 2), which have partly solved the problems associated with measurements in these areas [40]. This fact can be clearly observed in the above-mentioned references, as well as in the diagram in Figure 2. The last decade has seen a great improvement both in research in general and in studies on rocky coasts in particular, related to the increasing number of researchers employing UAVs (unmanned aerial vehicles). These have been applied to different fields, such as vegetation analysis [41] and wildfire research [42]. In the context of coastal research, these devices have gained importance in the last decade, allowing for new work on coastal areas in general [43,44] and rocky areas in particular [13,44–46], without the need for complicated and expensive field surveys. In this sense, it is worth mentioning the study by Pérez-Alberti et al. [47,48] as the first application of UAVs to rocky coasts, in this case boulder beaches, published in [46]. The use of UAVs allowed for quantifying topographic variations and current dynamics with a great level of detail and spatial continuity, which was often not possible with classic field surveys, such as topographic profiles, or with aerial images, due to their low spatial and temporal resolution. Even in a novel field such as the use of UAVs on rocky coasts, great technical and methodological advances have taken place, from an early period based on the photointerpretation of high-resolution images [45,46] to more recent works using photogrammetric techniques such as DSMs (digital surface models) generated through SfM (structure from motion), which enhance the possibilities for quantification [3,49–51].

The aim of this research was to emphasize the importance of rocky coasts in general and to analyze how UAVs can improve research on the different environments present in these areas. UAVs allowed for a detailed analysis of the dynamics of boulder beaches, shore platforms, and cliff ecosystems. For this purpose, the results of several studies performed in the last decade using UAVs as a key tool are shown, paying special attention to their methods and results. These investigations summarize the main results obtained on the Galician coast, a sector with a great variety of coastal typologies, and allow understanding of their usefulness in other sectors with similar characteristics. Moreover, this project provides a review of the evolution of techniques and methodologies applied to the study of rocky coasts and outlines the future of these devices both for general research and for the study of rocky coasts, as a potentially helpful tool for new researchers in rocky coastal environments.

#### **2. Materials and Methods**

#### *2.1. Study Areas*

This study is part of a project on rocky coast dynamics and evolution in Galicia (NW Iberian Peninsula), started in 2012 and currently ongoing. The first results have been published in communications by Pérez-Alberti [47,48]. In this case, four sites were selected for measurement and analysis. Their locations are shown in Figure 3, while Figure 4 shows photographs of the four sites.

#### 2.1.1. Oia (Pontevedra)

Oia is located on the SW coast of Galicia. This area has been monitored since 2012 [47,48], and previous works have analyzed variations in a natural boulder beach based on information from UAVs using different techniques; it constitutes one of the first areas analyzed with these vehicles. As described in previous works [3,37,47,48], the boulder beach is approximately 20 m wide and 100 m long. It is located at the edge of a shore platform and is limited by two rocky promontories to the north and south. The majority of boulders come from the erosion of the back cliff, formed by fluvio-nival deposits in a cold environment more than 40,000 years ago [48]. The beach is composed of heterometric boulders: the largest, exceeding 60 cm along their major axis, sit in the northern section, while decimetric clasts are accumulated in the southern section. The substrate is dominated by intensely fractured two-mica granites. This area shows low elevations, reaching 8.56 m at the cliff toe, while a major portion of the shore platform and part of the boulder beach lie within the intertidal level (Figure 5a). Figure 5b shows the slopes of this site, which show a great variability clearly related to the boulder accumulations that define the landscape.

**Figure 3.** Location of study sites. (**a**) Location of Galicia in Europe; (**b**) location of the study sites (red points) in the western coast of Galicia; (**c**) Oia study site (41.998; −8.879); (**d**) Caamaño shore platform (42.655; −9.041); (**e**) Laxe Brava site (42.598; −9.075); and (**f**) Ponzos cliff site (43.561; −8.254).

**Figure 4.** Photographs of study sites. (**a**) Oia boulder beach; (**b**) part of the Caamaño shore platform with its lithological variations; (**c**) Laxe Brava boulder beach; and (**d**) north section of Ponzos cliff site.

**Figure 5.** Main characteristics of the Oia site. (**a**) Elevation in meters; (**b**) slope in degrees. Figure adapted from [3].

Oia is located in a mesotidal region, and the beach is exposed to waves from the northwest, the main direction of waves during winter storms, which are of great importance for the dynamics of this site.

#### 2.1.2. Laxe Brava (A Coruña)

The Laxe Brava site includes a boulder beach associated with a 45–65 m wide shore platform. As in the previous case, this area has been studied in detail using UAVs since 2012, and the first results have been published in communications by Pérez-Alberti [47,48]. This area is fully composed of two-mica granites that, as in the case of Oia, are intensely fractured. The boulders in Laxe Brava originate from the shore platform and its edge, since no cliffs or other similar forms that could supply boulders to this beach are present inland. This origin is related to wave action in the foreshore, combined with frost weathering during glacial periods.

Mean boulder size is 66 cm along the major axis. However, some larger boulders, exceeding 200 cm in some cases, are found especially in the landward zone, where the current marine influence does not mobilize them, although some large boulders located in the most dynamic area have been observed to change position in recent years. Among these large landward-located boulders are smaller boulders that have been moved during extreme winter events.

Figure 6a shows elevation in this area, which is homogeneous in the active boulder area and reaches 20 m in the distal sections of the platform and along the edge of the study site, protecting part of this area from marine influence. As in the previous case, Figure 6b clearly shows the boulder distribution, outlined in the slope map, which becomes smoother in the landward beach zone.

Laxe Brava is located in one of the most energetic coastal sections of the Iberian Peninsula, where waves above 9 m are frequent during winter storms [46]. As in the case of Oia, their orientation is N-NW; thus, the boulder beach is exposed to the main component of storm waves while being partially protected from SW waves by a rocky promontory located west of the accumulation zone.

#### 2.1.3. Caamaño (A Coruña)

Like Laxe Brava, the Caamaño site is located in the Barbanza Peninsula. This site is a large, 600 m-long shore platform facing west. Its width varies between 50 and 120 m. This area does not have a main sedimentary deposit, showing only small areas with clasts and sediments cemented by iron and accumulated in small hollows, which stand out amid the mostly horizontal shore platform. This platform is dominated by two-mica granites and schists. This lithological variety adds interest to the analysis of this site, since it allows comparing the behavior of these different rock types when subjected to similar erosive factors.

**Figure 6.** Main characteristics of the Laxe Brava site. (**a**) Elevation in meters; (**b**) Slope in degrees.

Figure 7a shows elevation distribution throughout the entire site. In this case, the shore platform area reaches an elevation of 4 m. The highest values occur in the southern area, while elevation is below 3 m in the rest of the shore platform. As for the slope, major differences were observed between the central section, with lower values, and the edges, with the highest slopes (Figure 7b).

The marine influence in this case is similar to that observed in Laxe Brava, with a great impact of winter storms and waves from the NW. The most relevant difference relative to the previous site is orientation: Caamaño does not directly receive the influence of high waves from the NW.

#### 2.1.4. Ponzos (A Coruña)

Ponzos is a cliff site located in the northwestern coast of Galicia, near the city of Ferrol. This area is characterized by cliffs above 50 m with beaches at their base, related to rocky promontories and with a wide lithological heterogeneity. This variety is composed of schists and biotite–muscovite orthogneisses related to paragneisses with a medium and high degree of metamorphism in the southwest edge, as well as small granite seams near schist areas.

**Figure 7.** Main characteristics of the Caamaño shore platform. (**a**) Elevation in meters; (**b**) slope in degrees.

In this case, the analysis focused on an 800-m-long section selected as a pilot area to validate different tools for monitoring cliff areas. Figure 8a clearly shows the elevation distribution in this area, with cliffs exceeding 100 m in the central section. Moreover, an analysis of the slopes (Figure 8b) confirmed the wide variety of this zone. These variations are related to the materials accumulated at the cliff toe and to the presence of almost vertical areas, especially in the central section, many of them exceeding 75°.

Ponzos is also located on a high-energy coastline, in a mesotidal setting where significant wave height exceeds 5 m for 3.79% of the time during winters. This fact is relevant for the redistribution of materials at the cliff toe and in sedimentary areas in general. Moreover, in order to understand the dynamics of the cliffs, it is necessary to take into account that the analyzed section is located in an area where rainfall exceeds 1000 mm/year, with minimum values during the summer.

#### *2.2. Material*

All the processes and studies included in this research have as their key element the use of UAVs to characterize and analyze variations in different rocky coasts. It is worth noting that significant changes in UAV devices, their associated cameras, and their quality have taken place during the study period. Table 1 summarizes the main characteristics of each flight.

**Figure 8.** Main characteristics of the Ponzos site. (**a**) Elevation in meters; (**b**) slope in degrees.



#### *2.3. Methods*

The process for image acquisition at the study sites is explained in detail in previous publications [3,45,46] and consisted of the placement of GCPs (ground control points) and their positioning with a GPS device, in this case a Stonex S8 GNSS device. These were used to determine the exact position of the acquired images. The flight area was then defined and routes were created, along with other parameters such as flight elevation, overlap, and the outline of the analyzed section. After completing the flights, the collected information was processed to yield a comprehensive image of the areas of interest through the generation of a mosaic from the UAV photographs and of a point cloud providing elevation values at each point [45,46].

Once UAV data were processed, various approaches were taken to analyze the areas. First, high-resolution aerial images were used to map the elements present in Oia and Laxe Brava. This approach was based on photointerpretation techniques using ArcGIS (licensed to the USC) to identify variations in the position of thousands of boulders in both sectors [45,46]. This methodology has some drawbacks, such as the time necessary to map all boulders and the uncertainty associated with the analyst who draws each boulder contour.

More recent projects have also employed high-resolution images using SfM techniques to generate digital surface models (DSMs) for analyzing spatial and volumetric variations. Based on this information, techniques such as geomorphic change detection (GCD) [52] were applied to identify and quantify differences between survey dates, through a raster file with the DEMs of difference (DoD), and to identify the areas with the greatest losses or gains during the analyzed period [3,53]. In these analyses, a Limit of Detection (LoD) was applied based on the uncertainty of the raw data and the processing; this value varied between sectors, at 3.71 cm in Oia and 1.5 cm in Laxe Brava. Estimating and reducing uncertainty is one of the most important elements for improving these studies.
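A minimal sketch of this DoD step is given below, using the terra R package rather than the GCD software itself; the file names are hypothetical, and the LoD value is the one reported above for Oia:

```r
# DEM of difference (DoD) with a Limit of Detection (LoD) threshold.
library(terra)

dsm_2014 <- rast("oia_dsm_2014.tif")   # hypothetical DSM file names
dsm_2016 <- rast("oia_dsm_2016.tif")

dod <- dsm_2016 - dsm_2014             # elevation change per cell, in metres
lod <- 0.0371                          # 3.71 cm LoD reported for Oia
dod[abs(dod) < lod] <- NA              # discard sub-detectable changes

# Net volumetric change: change depth x cell area, summed over valid cells
global(dod * cellSize(dod, unit = "m"), "sum", na.rm = TRUE)
```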

To characterize shore platforms, high-resolution images and DSMs generated from UAV flights were employed. Photographs were used to distinguish lithologies in the shore platform, while the DSM was used to analyze the elevation, slope, and roughness of these areas. Using both data sources, joints were then traced to analyze platform evolution. In the specific case of Caamaño, this remote information has been complemented with field surveys to identify surface hardness using a durometer (Proceq Equotip 3) [54], thus obtaining a more comprehensive view of the behavior of this site.

In the monitoring of rocky ecosystems, the necessary processes for their correct management are much more costly than in the previous cases in terms of time and resources. As a first approach, variations in the cliff sector were analyzed for two years (2016–2018), detecting mass movements and variations in cliff profiles [52,55]. Moreover, in this case, variations in the cliff top were analyzed to estimate the evolution of the cliff during the aforementioned period using DSAS (Digital Shoreline Analysis System) [55], widely used in coastal studies [56–58]. In this sense, this type of study must be complemented with future field surveys by installing sensors to monitor soil humidity or by analyzing the flora and fauna of each site in order to understand their behavior and importance.

#### **3. Evolution of UAVs Studies in Galician Coast and Their Future**

*3.1. Boulder Beach Dynamics*

Similar methodologies were applied in the two studied boulder beaches in Galicia, ranging from visual analysis between the winters of 2012–2013 and 2013–2014 [45,46], to more quantitative methods such as the analysis of volumetric variations during the 2014–2016 period [3,53]. The analysis of boulder contour in both sectors revealed differences between Oia and Laxe Brava sites (Figure 9c,d).

The first area showed the highest percentage of relocated boulders, which is related to the smaller size of its boulders. However, size by itself is not sufficient to explain the differences, because in some cases larger boulders moved while smaller ones remained in place. For this reason, it is essential to understand that the beach environment is key to explaining boulder dynamics, with a great emphasis on the degree of confinement [37]: boulder groups have a more limited capacity for movement, while isolated boulders have a greater degree of freedom to move around.

In the case of Laxe Brava, the proportion of relocated boulders notably increased from 17.5% to 47.8% between the first and second winter. Movements covering the greatest distances were observed in the seaward zone, between 0 and 4 m, and in the eastern section, where mobility increased to between 4 and 6 m. In Oia, the movement rate was greater during the winter of 2013–14 (87.6%) than in the previous winter (53.0%). These variations mainly involved boulders between 0.5 and 6 m in diameter and occurred mainly in the central and northern sections of the study site, far from the protection of the rocky promontory to the southwest. In both areas, variations were related to the higher frequency of storms during the winter of 2013–14 [59], which promoted boulder mobilization.

**Figure 9.** Analysis of spatial variations and variations in boulder position in Laxe Brava and Oia. (**a**) Elevation difference in the Laxe Brava site between 2014 and 2016; the blue polygon represents the area detailed in (**c**); (**b**) variations in elevation in the Oia area during the 2014–2016 period; the blue polygon represents the area detailed in (**d**); (**c**) boulders movement in Laxe Brava during the 2013–2014 period; and (**d**) variations in boulder position in Oia between 2013 and 2014. Figures adapted from [3,45,46,53].

In both areas, boulders did not move individually but rather in groups; this led to important variations in the beach profile, whose slope is usually 8°. Greater boulder sizes lead to steeper beach profiles and, therefore, more energy would be necessary to initiate movement. In this sense, it is worth noting that waves are not the only factor in boulder mobilization; the hydrostatic overload derived from the water mass arriving at the coast is also important. For this reason, major mobility events must be related to storms and wave height, as well as to tides.

Based on the results obtained with the previous method, DSMs were used, focusing on data from the 2014–2016 period. In Laxe Brava, a decrease in elevation was observed throughout almost the whole analyzed site (97.81%), corresponding to a net volume variation of −5141.32 m³. Similarly to Oia, erosion concentrated in the lower section of the beach, while accretion was limited to small areas, especially in the center of the study area.

In the case of Oia, 80% of the area underwent accretion, with erosion concentrated in the seaward sector and lower variations in the southern part, the area most protected from winter storms (Figure 9b). For this period, the net difference was 1461.07 m³. The greatest variations occurred in the central section, consistent with the findings of previous works [45,46]. The differences in dynamics between both sectors were very similar to those revealed by previous analyses using aerial images and boulder contours (Figure 9a).

One of the main advantages of the latter technique is the collection of continuous data on spatial variations in both areas. This provided a more accurate picture of the dynamics affecting these boulder beaches and improved the quantitative results (Table 2). It is nevertheless necessary to highlight the uncertainty associated with this type of analysis: the spaces between boulders in rocky environments affect accretion/erosion values and could not be quantified [3,60].
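
The net volume figures above come from differencing successive DSMs. A minimal DEM-of-Difference (DoD) sketch is shown below, with an illustrative minimum level of detection (LoD) used to suppress sub-threshold noise in the spirit of the uncertainty just mentioned; the synthetic grids, 0.07 m cell size, and 0.1 m LoD are assumptions, not survey values.

```python
# Sketch: net volume change from a DEM of Difference (DoD).
import numpy as np

def dod_volume(dsm_old: np.ndarray, dsm_new: np.ndarray,
               cell_size: float, lod: float = 0.1) -> float:
    """Net volume change in m^3 (positive = accretion)."""
    diff = dsm_new - dsm_old
    diff[np.abs(diff) < lod] = 0.0   # discard changes below the level of detection
    return float(diff.sum() * cell_size ** 2)

rng = np.random.default_rng(1)
dsm_2014 = rng.normal(3.0, 0.5, (400, 400))                  # placeholder elevations (m)
dsm_2016 = dsm_2014 + rng.normal(-0.02, 0.15, (400, 400))    # slightly eroded surface
print(f"net change: {dod_volume(dsm_2014, dsm_2016, cell_size=0.07):.1f} m³")
```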

**Table 2.** Variations in Laxe Brava and Oia in all the analyzed periods, based on [3,45,46,53].


For this type of study, it is necessary to combine these methods with other techniques to improve our understanding of the existing dynamics. In this context, the Oia site was selected for the introduction of RFID sensors in recent years. These devices have allowed the positions of boulders to be monitored since 2016 and have verified the behavior of this beach as described in the aforementioned studies, leading to a better understanding of its evolutionary dynamics [37].

#### *3.2. Shore Platform Analysis Using UAV Data*

The use of UAVs allowed for the detailed characterization of the Caamaño shore platform. This area shows relevant lithological differences between sections dominated by schists and by granites. In this sense, this study revealed that granitic areas form islands with a higher elevation than their surroundings at the lower part of the shore platform. This is related to the greater resistance to erosion displayed by this type of rock. Moreover, this lithology shows higher degrees of roughness and steeper slopes, a fact that can be qualitatively observed in aerial images, as shown in Figure 4.

Figure 10 summarizes the main characteristics of a specific section of the Caamaño shore platform. First, joints were traced, given their key role in understanding platform evolution and in identifying the most erodible areas. Joint traces are closely related to roughness and slope (Figure 10b,d), which explains the importance of taking these variables into account for shore platform characterization. As for elevation, Figure 10c clearly shows hollow areas in the platform and also allows the most resistant materials to be identified by their higher elevations.

Taking a closer look at lithological differences, this research demonstrated that granites had high roughness values, especially in the upper platform areas (0.44), while values for schists were similar in all areas (0.32–0.37). Using the Proceq Equotip 3, resistance was found to be higher for granite (424.89) than for schist (332.22). These values could be related to remote sensing variables. This type of research not only allows for shore platform characterization [15,61] but, through the detailed examination of future platform dynamics [13], can identify differences in behavior among lithological types and reveal how physical and biological factors influence their evolution [8].

**Figure 10.** Analyzed parameters in the Caamaño shore platform. (**a**) The main joints are shown as black dashed lines; (**b**) roughness in the shore platform section; (**c**) elevation in meters; (**d**) slope of the shore platform in degrees.

#### *3.3. Management of Rocky Ecosystems Using UAV Data*

Ecosystem characterization and monitoring are key aspects in the field of coastal management. This study provides a first approach to the evolution of a Galician rocky coastal cliff applying photointerpretation and spatial statistics techniques. Nevertheless, this is only one of the necessary elements for this type of research. To describe ecosystems in detail, it is necessary to generate reports about the species present and the floristic and faunal composition, to analyze trace species that indicate high environmental quality, and to characterize soils and the degree of anthropization [5]. All these parameters require great efforts in terms of field surveys, which must be performed in different seasons and with a specific frequency to understand the evolution of these sectors.

Despite the great usefulness of UAVs, some analyses must still be performed through field surveys to thoroughly understand these areas and the species inhabiting them. Remote sensing information nevertheless enabled a close analysis of the evolution of mass movements in this area, as well as of cliff-top retreat. Both are key parameters for understanding ecosystem evolution and functionality. Technological improvements over the last decades have been of great importance in this field, bringing an important increase in accuracy [62,63].

In this project, important dynamism was observed at the Ponzos site, with major landslides affecting the cliff section and mobilizing more than 4000 m³ of material in two years (Figure 11). A similar dynamic was observed in cliff-top behavior, with a mean retreat of 0.9 m. Other areas along the coastline are vulnerable to factors such as rock fracturing, sediment accumulation at cliff toes, or rainfall [5]. It is worth emphasizing the importance of continental factors in the alteration of cliff areas, among which rainfall accumulation is fundamental to understanding mass movements [64].

**Figure 11.** DEM of difference (DoD) in the Ponzos south section between 2016 and 2018.

The mass movements that occurred in the southern sector during the analyzed period have been outlined. Other mass movements have previously occurred in this area, and further analyses are required to understand their evolution [65]. In this sense, UAVs are a fundamental tool to identify cliff dynamics and, therefore, to improve coastal management through the creation of inventories of the most vulnerable areas [64,66], while avoiding activities that could be dangerous for the area's inhabitants and infrastructure.

UAVs can be extremely useful to estimate vegetation status and soil humidity in this type of research. For these purposes, multiband cameras enable a quantitative approach [67], as sketched below.
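
A minimal sketch of such a quantitative use is NDVI computed from red and near-infrared bands as a proxy for vegetation status on the cliff top; the reflectance arrays below are synthetic stand-ins for rasters exported from a multispectral UAV survey, not data from these sites.

```python
# Sketch: NDVI from red and near-infrared reflectance bands.
import numpy as np

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    return (nir - red) / np.clip(nir + red, 1e-6, None)  # clip avoids divide-by-zero

red = np.array([[0.10, 0.25], [0.08, 0.30]])   # red reflectance (placeholder)
nir = np.array([[0.45, 0.30], [0.50, 0.28]])   # near-infrared reflectance (placeholder)
print(ndvi(red, nir))  # values near 0.6-0.7 suggest healthy vegetation
```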

#### *3.4. Improvements in Research with the Use of UAVs*

The arrival of UAVs in coastal geomorphology research has brought two main advances. The first, as already mentioned, is an improvement in spatial resolution. In addition, the relatively low cost of these devices allows faster image acquisition and more comprehensive temporal coverage than other technologies such as satellite or aerial images [68–70]. This improvement currently enables detailed analyses of the influence of marine and continental factors on the evolution of rocky areas, thanks to the possibility of surveying study areas between storms or after extreme rainfall events [71,72]. Moreover, on coastal cliffs or steep slopes, other data sources (e.g., satellite or aerial images) do not allow a detailed analysis of dynamics: in steep zones, their pixel size prevents the identification of variations.

The application of UAVs to geomorphological studies and, more generally, to studies of ecosystem behavior did not initially target rocky coasts, nor are these the most studied areas in this field. Nowadays, multiple studies have used UAVs to monitor sedimentary systems [73,74], vulnerable areas such as coastal lagoons [75], saltmarshes [76], mangroves [77], and estuaries [78], and other coastal areas that require detailed analysis [79–81], as well as the general coastal context [68,73].

All the aforementioned sectors had been studied previously, but the technical advances associated with UAVs have improved time series and allowed the acquisition of better and more accurate data at the mesoscale. This has expanded knowledge about the natural environment in general and about rocky coasts in particular.

#### *3.5. The Future of UAVs in Coastal Research*

Based on the number of projects generated in the last decade and on their relevance, it is clear that coastal research using UAVs will continue to grow in the near future. The wide possibilities of the associated methodologies, together with fast data acquisition and low costs, have promoted their use, especially in sectors where field surveys are difficult to perform due to costs or other factors.

It is worth emphasizing the importance of combining the information obtained through UAVs with other data to improve results and generate better and more accurate explanations of coastal evolution. A clear case is the combined use of instruments such as durometers on shore platforms, which, together with DSM analysis, can expand our knowledge of these areas and their evolution. In the case of Oia, the information acquired by the flights was merged with data obtained by other techniques, such as RFID sensors, to determine the directions and measure the distances of boulder movements. Combining information from different sources helps to understand the variations observed in DSMs [37].

Moreover, as has been observed in other fields such as in wildfire research [82], UAVs are likely to constitute a key tool to monitor coastal environments in the future. They could play an important role in improving coastal management projects by updating the actions applied in protected spaces and other sensitive areas according to variations related to global change. It is precisely in relation to global change and its consequences where UAVs could be of great importance to analyze how alterations in physical factors could impact the territory.

In relation to the aforementioned, research carried out on cliffs, shore platforms, and boulder beaches has revealed the importance of understanding soil temperature and humidity for rock alteration processes, particularly at the start of mass movements in cliff areas. For this reason, multiband sensors in UAVs could be immensely useful to monitor sectors such as Ponzos during heavy rainfall periods. The use of multispectral sensors is a logical evolution that will allow analyzing seasonal variations in vegetation, as well as soil characteristics, such as humidity or other physicochemical parameters, to understand landscape evolution in detail [67,83,84].

#### **4. Conclusions**

This study shows the importance of UAVs for rocky coastal research. These instruments have allowed for a detailed analysis of the dynamics affecting rocky coasts.

The associated methodologies confirmed the high dynamism of rocky areas, such as boulder beaches, by measuring variations in the position of elements and volumetric differences. These studies allow nearby sectors to be compared in order to understand the factors controlling their dynamics and to explain possible differences (e.g., the behavior of Oia and Laxe Brava during the same period).

The use of UAVs combined with other techniques is of enormous value to expand knowledge about rocky coasts. As previously observed, their combination with field surveys in accessible areas is of crucial importance for correct environmental characterization. Clear examples are shore platforms, where relating DSMs to durometer values allows a more realistic approach to platform characteristics, and boulder beaches, where UAV data are currently combined with RFID sensors, improving knowledge of boulder displacements.

The techniques employed, as well as the uncertainty associated with these analyses, could be improved in the future by increasing processing capabilities and by performing more studies using similar techniques. In this sense, the application of multispectral cameras could greatly impact our understanding of coastal areas.

The use of UAVs for ecosystem monitoring and coastal management is expected to increase in the future within the context of global change. These devices allow an accurate characterization of land uses and ecosystem distribution.

**Author Contributions:** Conceptualization: A.G.-P. and A.P.-A.; methodology: A.G.-P. and A.P.-A.; formal analysis: A.G.-P.; investigation: A.G.-P. and A.P.-A.; writing—original draft preparation: A.G.-P. and A.P.-A.; writing—review and editing: A.G.-P. and A.P.-A. All authors have read and agreed to the published version of the manuscript.

**Funding:** A.G.-P. is supported by an FPU predoctoral contract by the Spanish government (Ministerio de Educación, Cultura y Deporte). Grant Number: FPU16/03050. This work was supported by CRETUS Institute.

**Data Availability Statement:** Data sharing not applicable.

**Acknowledgments:** This work was supported by CRETUS Institute. A.G-P. was in receipt of an FPU predoctoral contract with reference FPU16/03050.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Modeling Streamflow and Sediment Loads with a Photogrammetrically Derived UAS Digital Terrain Model: Empirical Evaluation from a Fluvial Aggregate Excavation Operation**

**Joseph P. Hupy 1,\* and Cyril O. Wilson <sup>2</sup>**

Received: 9 February 2021; Accepted: 9 March 2021; Published: 12 March 2021


**Abstract:** Soil erosion monitoring is a pivotal exercise at macro through micro landscape levels, which directly informs environmental management at diverse spatial and temporal scales. The monitoring of soil erosion can be an arduous task when completed through ground-based surveys, and there are uncertainties associated with the use of large-scale medium resolution image-based digital elevation models for estimating erosion rates. LiDAR derived elevation models have proven effective in modeling erosion, but such data prove costly to obtain, process, and analyze. The proliferation of images and other geospatial datasets generated by unmanned aerial systems (UAS) is increasingly able to reveal nuances that traditional geospatial datasets could not capture, owing to the former's higher spatial resolution. This study evaluated the efficacy of a UAS derived digital terrain model (DTM) to estimate surface flow and sediment loading in a fluvial aggregate excavation operation in Waukesha County, Wisconsin. A nested scale distributed hydrologic flow and sediment loading model was constructed for the UAS point cloud derived DTM. To evaluate the effectiveness of flow and sediment loading generated by the UAS point cloud derived DTM, a LiDAR derived DTM was used for comparison in consonance with several statistical measures of model efficiency. Results demonstrate that the UAS derived DTM can be used in modeling flow and estimating sediment erosion across space in the absence of a LiDAR derived DTM.

**Keywords:** streamflow; sediment loading; unmanned aerial systems; drones; digital terrain model

### **1. Introduction**

Unmanned aerial systems, hereafter referred to as UAS, are now widely recognized in the remote sensing community as a valid geospatial data collection tool. Their utility extends into a wide variety of applications, including but not limited to general mapping, precision agriculture, forestry, wetlands, mining, excavation, and hydrology [1–4]. While UAS platforms can be equipped with a wide array of sensor types, i.e., meteorological, gas, and particle sensors, the majority of current UAS platforms are equipped with imaging sensors [5,6]. UAS are a useful platform to gather imagery over small to moderately sized areas due to their relatively low cost and overall versatility compared to traditional satellite and manned aircraft remote sensing platforms. Manfreda [7] detailed how traditional manned aircraft and satellites were limited in their ability to gather remotely sensed data based on their altitude constraints and their inability to gather information over certain areas within a given set of time constraints and temporal frequency needs. Furthermore, both satellite and fixed-wing aircraft cannot achieve the centimeter to sub-centimeter spatial resolution that UAS deliver.

The versatility of a UAS platform is best described as being able to fly 'low and slow', which means that a UAS platform equipped with a small format camera sensor can fly at a low altitude over a pre-defined area to gather remotely sensed images at resolutions unattainable with traditional platforms, and at a temporal frequency deemed necessary. When UAS imagery is gathered by the platform sensor at a constant altitude, and with enough overlap, the imagery can be processed using structure-from-motion with multi-view stereo (hereafter referred to as SfM) methods to generate a three-dimensional point cloud model [8–10]. This point cloud can then be used to generate a digital surface or terrain model (DSM/DTM). A DSM contains all above ground surface features, such as vegetation and buildings, while a DTM only contains the bare ground.

Software packages such as Pix4D, Agisoft, and BAE Systems' Socet Set [15] have revolutionized the field of photogrammetry by transforming the laborious and time-consuming conventional photogrammetric method into more efficient and optimized workflows built on the scale invariant feature transform [11] and its associated SfM algorithms [12–14]. This development is timely and welcome for the processing of UAS imagery and the extraction of derivatives, and it is slowly being adopted into the mainstream of many different fields that rely on geospatial data. While the increased use of UAS data is indeed a boon to the geospatial community, the ease of creating SfM derived data products such as point clouds, DSMs, DTMs, and orthomosaics means that both the limitations and the potential of such data need to be understood.

Despite airborne and terrestrial LiDAR's accuracy in generating surface and terrain models for diverse applications, it is expensive to collect and process, which limits its widespread use [7,16,17]. Furthermore, because of the altitude of the platform used, airborne LiDAR data is often not available at the 1–2 cm spatial resolution obtainable with UAS derived SfM methods [9,18]. While progress has been made toward equipping UAS platforms with LiDAR sensors, the technology remains cost prohibitive for most researchers and UAS users [7,17]. Scientific investigations in the emerging frontier of UAS SfM approaches are needed to better understand the potentially concealed functionalities of photogrammetrically derived UAS data, which might be beneficial in certain research and market niches.

Extraction and excavation-based activities associated with vegetation removal, such as open pit mines, construction sites, barren farm fields, and post-hazard events (e.g., slope failure), are well suited for UAS derived DSM and/or DTM analysis [19–21]. The lack of vegetation cover associated with surface disturbance on excavated surfaces makes them ideal for modeling both drainage and erosion potential. In an applied sense, the erosion associated with surface drainage at sites devoid of vegetation cover can impact day-to-day operations at the site. Excessive erosion at construction and mining sites can also result in fines and citations from regulatory bodies. Most mining and construction operations are aware of the erosion modeling techniques available with LiDAR data but understand the financial and temporal limitations of using such datasets, and are seeking means to model surface drainage with more attainable technology, such as UAS derived DTMs from SfM models.

While LiDAR derived terrain models in hydrologic applications have their place as an essential data source [18], traditional photogrammetrically derived products like digital elevation models (DEMs) are still an important source of topographic information for hydrologic applications [22,23]. Historically, such DEM products have been retrieved from medium resolution optical and microwave images [7,24,25]. Quite recently, photogrammetrically derived DSMs and DTMs from high resolution UAS imagery have been emerging in the hydrologic literature [18,26–28]. Photogrammetrically derived UAS data has been used to model surface flow within and outside urban areas [17,29,30], spatial and temporal variability of riverbed hydraulic conductivity [31], channel morphology [32], streambank topography [33], streambank erosion [27], and gully erosion in agricultural and urban watersheds [34,35]. Stocker [34] demonstrated that photogrammetrically derived UAS data can measure gully erosion in farmland in a way that LiDAR technology could not, owing to the increased spatial and temporal resolution that UAS models provide. Gudino-Elizondo et al. [35] reported that a UAS derived DSM was effective in estimating gully erosion rates in an urban catchment. While the use of UAS derived datasets to estimate specific types of erosion rates in catchments of differing characteristics is welcome, we currently know very little about the use and effectiveness of this relatively new data source for estimating runoff and erosion rates on diverse land cover types [35]. Moreover, an investigation of this phenomenon within the lens of a nested scale distributed hydrologic modeling framework has great potential to unlock the efficacy of UAS derived DTMs for the monitoring of runoff and erosion rates across space.

Nested-scale hydrologic modeling frameworks have proven effective in predicting streamflow and water pollutants when model input data are not available at the same scale [36–38]. Didszun and Uhlenbrook [38] applied a nested-scale approach to investigate hydrologic responses at scales <1 km² and ≥40 km²; they reported slight variation in hydrologic responses at the smallest scale, attributed to varying topography. Van der Velde [36] concluded that hydrologic models configured within a nested-scale framework improve prediction of stream discharge and nitrate loads. In a related vein, Zeiger and Hubbart [37] echoed the efficacy of a nested-scale experimental hydrologic modeling design to predict suspended sediment loads. The overarching objective of this study is to assess the potential of a UAS photogrammetrically derived DTM for modeling surface runoff and sediment loading in an open pit fluvial aggregate mining operation experiencing high amounts of erosion. Specifically, this study addresses the following objectives: (1) to develop a downscaling calibration and validation framework for a large-scale hydrologic water quality model and extend it to the smaller UAS dataset areal extent; and (2) to evaluate the suitability of a UAS derived DTM for surface runoff and erosion modeling within an open pit fluvial aggregate mining operation.

#### **2. Materials and Methods**

#### *2.1. Study Area*

This study was conducted in an open pit fluvial aggregate excavation operation in Waukesha County, near the village of Delafield, Wisconsin, USA (Figure 1). The operation uses extracted aggregate materials from the open pit for its activities in paving and construction. The data acquisition area covers 10 ha, with elevation values ranging between 287 m and 314 m above mean sea level. Due to the erosion issues that the site was experiencing, a request was made to have the area flown with a UAS to potentially identify areas where significant erosion was occurring. The steep slopes and loosely consolidated material made a ground-based survey both impractical and dangerous (Figure 2). The study area is sparsely vegetated (<15%) due to excavation and extraction activities at the site. Average annual precipitation at the site is 87.9 mm. The nearest climatic records, from Waukesha, WI, show average January temperatures ranging from −11.8 to −2.3 °C and average July temperatures within the 15.4 to 27.7 °C range. Soils in the study area are designated as an open gravel pit soil unit and classified as Psammic Fluvent [39].

#### *2.2. UAS Data Collection and Processing*

All data was gathered in June 2015 by MenetAero LLC, a UAS service provider specializing in UAS data acquisition. The platform used to collect imagery was a DJI Matrice 600 Pro with D-RTK, allowing for reliable flight paths and altitude consistency during flight. Image collection was facilitated by a gimbal-mounted Zenmuse X5 RGB camera equipped with a 15 mm lens (Table 1).

**Figure 1.** Map of the study area: Excavation operation site (*a*), Wisconsin (*b*), USA (*c*).

**Figure 2.** Optical UAS imagery derived land cover (*a*) and hillshade DTM based on UAS point cloud (*b*).

Flight altitude at image capture was 80 m with 80% frontal and lateral overlap. The images were saved onto a 32 GB Generation V SanDisk SD card in JPEG format. Image geolocation was stored in the image EXIF file using the WGS 84 geographic coordinate system. This coordinate system is what most UAS platforms utilize to record data related to their GPS log and is the default setting for the DJI platform. It should be noted here that the Matrice 600 platform, although equipped with D-RTK GPS, did not communicate with the Pix4D Capture application to geolocate the imagery with RTK precision. Spatial accuracy was instead achieved by the placement and survey of ground control markers prior to the flight. A Trimble R2 GNSS integrated system was used to acquire coordinate locations at six ground control points (GCPs) distributed across the flight area. To ensure survey quality, redundant check shots were also recorded at each GCP with a variance tolerance of 0.01 m horizontally and 0.02 m vertically. GCPs were recorded using the projected Universal Transverse Mercator coordinate system with a WGS 84 datum, chosen to match the desired processed data output projection and coordinate system for further use in a Geographic Information System with other forms of geospatial data.
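
A minimal sketch of the check-shot acceptance test implied by those tolerances follows; the UTM coordinates are hypothetical, and the two thresholds are the ones stated above.

```python
# Sketch: redundant GCP check-shot test against the stated survey tolerances.
import math

H_TOL, V_TOL = 0.01, 0.02  # m, from the survey specification above

def within_tolerance(first: tuple, check: tuple) -> bool:
    (e1, n1, z1), (e2, n2, z2) = first, check
    horizontal_ok = math.hypot(e2 - e1, n2 - n1) <= H_TOL
    vertical_ok = abs(z2 - z1) <= V_TOL
    return horizontal_ok and vertical_ok

gcp_first = (417230.512, 4769881.204, 296.118)  # UTM easting, northing, elevation (m)
gcp_check = (417230.518, 4769881.199, 296.131)
print("GCP accepted" if within_tolerance(gcp_first, gcp_check) else "re-survey")
```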


**Table 1.** Survey data collection parameters and equipment specifications.

Pix4D Structure from Motion Multi-View Stereo (SfM MVS) software (version 3.1.23) was used to generate a point cloud, digital surface model, orthomosaic image, and subsequent derivative data products that allowed for further analysis within LP360, ESRI ArcMap Desktop (version 10.7) software, and image processing utilities. Calibration, validation, subsequent processing, and error reporting details associated with SfM model creation in this study are summarized in Table 2 and adhere as closely as possible to the guidelines put forth by James [40]. Following the initial processing phase, where the geolocated images are used to generate a low-density point cloud, the dataset was adjusted for both horizontal and vertical accuracy using the GCP markers. The dataset was then reoptimized and used to generate a high-density point cloud (LAS format) with 0.058 m RMS error.

Utilizing a hybrid approach, the UAS photogrammetrically derived point cloud was classified into ground and above ground points with the aid of LP360 software (GeoCue Group, Madison, AL, USA). The small fraction of above ground points was a result of shrubs and trees dispersed throughout the study area (Figures 1a and 2a). In the first stage of point cloud classification, we employed the adaptive triangulated irregular network (TIN) ground filter to separate ground points from non-ground points [41,42]. The adaptive TIN-based ground filter generates tiles over the point cloud dataset and identifies the lowest point in each tile as a candidate ground point [41]. Next, a TIN is generated from the earmarked lowest points. The algorithm then uses thresholds that encapsulate elevation difference and angle to the closest TIN face to iteratively remove non-ground points. A detailed description of the adaptive TIN-based ground filter can be found in Axelsson [41]; a simplified sketch of its seed-selection stage is given after this paragraph. In stage two, a two-dimensional profile window equipped with vertical manual classification tools was utilized to improve on the stage one automated classification. Following the successful classification of ground points, a 3.6 cm DTM was derived by interpolating ground points using a triangulation algorithm. The output spatial resolution of the DTM was set at 3.6 cm to be consistent with the nominal point spacing (NPS) of the classified ground points. A 14.3 cm LiDAR DTM was derived from already classified LiDAR ground points for Waukesha County collected in late spring 2015 [43]. The LiDAR derived DTM spatial resolution was determined by the NPS of the LiDAR ground points (~7 points per m²). We used the triangulation interpolation method to generate the LiDAR derived DTM to be consistent with the UAS derived DTM mentioned above. The final processing step for the UAS and LiDAR derived DTMs was the generation of a hillshade for visualization purposes.
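
The sketch below illustrates only the seed-selection stage of such a filter under simplifying assumptions: the lowest point per tile is taken as a ground seed, and points within a vertical threshold of a surface interpolated from those seeds are kept. The synthetic points, 5 m tile, and 0.15 m threshold are illustrative; production filters (e.g., in LP360) add iterative TIN densification with the angle criteria described above.

```python
# Simplified sketch of the seed-selection stage of a TIN-style ground filter.
import numpy as np
from scipy.interpolate import griddata

def ground_filter(xyz: np.ndarray, tile: float = 5.0, dz: float = 0.15) -> np.ndarray:
    keys = np.floor(xyz[:, :2] / tile).astype(int)
    seeds: dict = {}
    for i, k in enumerate(map(tuple, keys)):          # lowest point per tile
        if k not in seeds or xyz[i, 2] < xyz[seeds[k], 2]:
            seeds[k] = i
    seed_pts = xyz[list(seeds.values())]
    # interpolate a seed surface and keep points close to it in elevation
    z_surf = griddata(seed_pts[:, :2], seed_pts[:, 2], xyz[:, :2], method="nearest")
    return xyz[np.abs(xyz[:, 2] - z_surf) <= dz]

rng = np.random.default_rng(2)
pts = rng.uniform([0, 0, 290], [50, 50, 291], (2000, 3))  # synthetic point cloud
print(len(ground_filter(pts)), "of", len(pts), "points classified as ground")
```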


**Table 2.** Calibration, processing, and error reporting detail summary table.

In hydrologic modeling, land use/land cover (LULC) data are needed to establish parameters related to erosion potential. LULC data was derived from the orthomosaic image of the study area collected during the UAS flight mission. We employed a two-stage hybrid classification framework to generate land cover information for the study area. Object-based image analysis (OBIA) followed by a random forest classifier was utilized in stage one image processing [44,45]. The determination of spectral and spatial segmentation parameters was informed by local variance of heterogeneity; following this, image objects were generated by a multiresolution segmentation algorithm [46,47]. Equation (1) illustrates a simplified example of the major segmentation parameters employed in OBIA.

$$F_s = C_w \, r_c + (1 - C_w) \, r_s, \quad 0 \le C_w \le 1 \tag{1}$$

where *Fs* is the segmentation function; *Cw* is the weight given to color; *rc* is the color criterion; and *rs* denotes the spatial criterion. The random forest classifier was trained to classify image objects in conjunction with textural and contextual information [47]. In stage two, the output of the random forest classification was integrated in an expert system ruleset classifier [48] with the use of ancillary data to improve on the result of the stage 1 classification. Ancillary data was obtained by creating patches to fix misclassifications encountered during stage 1. Image classification accuracy assessment was conducted by collecting 300 ground reference points via stratified random sampling from high-resolution National Agriculture Imagery Program (NAIP) imagery collected at the same temporal scale as the orthomosaic imagery [39,49]. Overall image classification accuracy was 89%. At the end of image processing, five classes were produced: trees, shrub, marshland, loose sediment, and compacted sediment (Figure 2a).
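
A hedged sketch of the stage one classifier follows: a random forest trained on per-object features such as mean band values plus the textural and contextual measures mentioned above. The feature vectors and labels are synthetic placeholders, not the study's segmented image objects.

```python
# Sketch: random forest classification of OBIA image objects (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))   # e.g., mean R/G/B plus texture and shape metrics
y = rng.integers(0, 5, 300)     # 5 classes: trees, shrub, marsh, loose/compacted sediment

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(clf.predict(X[:5]))       # predicted class labels for the first five objects
```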

#### *2.3. Flow and Sediment Model Construction*

The Soil and Water Assessment Tool (SWAT) was employed to model flow and sediment loading in the study area. SWAT is a fully distributed model that aids the evaluation of land management practices on flow and water quality in river basins over time [50,51]. For a SWAT model to be successfully implemented, it requires LULC, soil, elevation, and climate variables. Soil data was obtained from the Soil Survey Geographic (SSURGO) database [52], while climate data was acquired from the SWAT database. For a SWAT model to produce results that are close to reality, it must be calibrated and validated. We constructed two SWAT models, for the UAS derived DTM and the LiDAR derived DTM, respectively. Due to the unavailability of observed streamflow data within the UAS image acquisition area, a nested downscaling approach was developed and applied in operationalizing model calibration and validation. We reconstructed the hydrologically active area covered by the closest USGS streamflow gauging station (4 km upstream) to the UAS study extent. The larger watershed (199 km²) encapsulated the UAS study area and accounted for the hydrologically active area of the observed streamflow data (Figure 3). An ungauged SWAT model with a warmup period of 15 years (2000–2014) was initially operationalized for this larger calibration study area. Following this, the model was calibrated and validated to obtain the appropriate coefficients for 17 key SWAT parameters (Table 3) that were found to be highly sensitive at this large spatial scale and extendible to the smaller UAS spatial scale.

**Figure 3.** Simplified SWAT large extent calibration and validation model for study area.


**Table 3.** Fitted SWAT model calibration parameters and their coefficients.

We employed the Sequential Uncertainty Fitting version 2 (SUFI-2) program embedded in SWAT-CUP 2012 to calibrate and validate the models [53,54]. SUFI-2 fits SWAT simulated output to observed data and in the process adjusts the coefficients of SWAT model parameters during model calibration [55]. In evaluating how well a model is calibrated, SUFI-2 utilizes two major criteria: the *P*-factor provides a measure of SUFI-2's ability to capture uncertainty, while the *R*-factor gauges the quality of model calibration [56]. Equation (2) depicts the *R*-factor.

$$R = \frac{\frac{1}{p} \sum_{i=1}^{p} \left( B_{s,97.5\%} - B_{s,2.5\%} \right)_i}{\sigma_{obs}} \tag{2}$$

where *p* is the number of parameters fitted; *Bs*,97.5% and *Bs*,2.5% represent the upper and lower bounds of the 95PPU (95% prediction uncertainty) for a simulated variable *Bs*; and *σobs* is the standard deviation of the observed data. Values for the *R*-factor range between 0 and infinity, where an *R*-factor of zero demonstrates a perfect fit between simulated and measured data. Figure 3 illustrates the simplified SWAT model used in calibration at the larger spatial extent.
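
A minimal sketch of both SUFI-2 criteria is given below: the *P*-factor as the fraction of observations falling inside the 95PPU band, and the *R*-factor of Equation (2) as the mean band width normalized by the standard deviation of the observations (here the band width is averaged over time steps). The ensemble and observations are synthetic, not the study's SWAT output.

```python
# Sketch: P-factor and R-factor from a 95PPU band of an ensemble of simulations.
import numpy as np

def sufi2_criteria(ensemble: np.ndarray, obs: np.ndarray):
    upper = np.percentile(ensemble, 97.5, axis=0)   # B_s,97.5% per time step
    lower = np.percentile(ensemble, 2.5, axis=0)    # B_s,2.5% per time step
    p_factor = np.mean((obs >= lower) & (obs <= upper))
    r_factor = np.mean(upper - lower) / np.std(obs, ddof=1)
    return p_factor, r_factor

rng = np.random.default_rng(4)
obs = rng.gamma(2.0, 1.5, 24)                       # e.g., 24 monthly flow values
sims = obs + rng.normal(0, 0.6, (500, 24))          # 500-member simulation ensemble
print("P-factor %.2f, R-factor %.2f" % sufi2_criteria(sims, obs))
```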

Due to the paucity of monitoring stations in the watershed, temporal split sampling was used in the calibration and validation for flow [57]. The model was calibrated for flow between August 2015 and August 2016, while validation was implemented between August 2017 and August 2018. To obtain the appropriate calibrated coefficients for the parameters outlined in Table 3, we executed 10,000 iterations. The best simulation that produced the appropriate coefficients for the parameters was achieved at iteration 9457. Following the successful model calibration and validation for flow at the larger spatial extent, the simulated flow value at the sub-basin that most closely coincides with the UAS study area location was used as observed data to calibrate and validate the UAS DTM and LiDAR derived DTM models for flow, respectively. To evaluate the efficacy of model calibration and validation for (i) the large extent SWAT model and (ii) the UAS DTM and LiDAR derived DTM models for flow, three additional statistical measures besides the one outlined in Equation (2) were employed.

The first additional statistical measure used to evaluate model effectiveness for calibration and validation is the Nash–Sutcliffe (NS) coefficient [58]. Equation (3) illustrates the Nash–Sutcliffe coefficient:

$$E = 1 - \frac{\sum_{i=1}^{n} \left( O_i - S_i \right)^2}{\sum_{i=1}^{n} \left( O_i - \overline{O} \right)^2} \tag{3}$$

where E is the Nash–Sutcliffe coefficient of model efficiency; *Oi* is the observed data; *O¯* is the mean of the observed data; *Si* is the simulated value; and *n* is the total number of observations. Possible values of NS range between −∞ and 1.0. A Nash–Sutcliffe statistic of 1.0 suggests a perfect fit between simulated and observed data; NS values between zero and 1 are generally regarded as tolerable levels of model performance; and NS values less than zero indicate that the mean of the observed data is a better predictor than the simulated values. Another model efficiency criterion that we employed is the index of agreement (d), which is calculated according to the following equation [59]:

$$d = 1 - \frac{\sum_{i=1}^{n} \left( O_i - S_i \right)^2}{\sum_{i=1}^{n} \left( |S_i - \overline{O}| + |O_i - \overline{O}| \right)^2}, \quad 0 \le d \le 1 \tag{4}$$

where d is the index of agreement; *Oi* is the observed signal; *O¯* is the mean of the observed signal; *Si* is the simulated value; and *n* is the number of observations. An index of agreement of 1 suggests a perfect fit between simulated and observed data, while zero depicts no association. We further employed the root mean square error (RMSE) in evaluating model predictive power [60]. The RMSE statistic quantifies the prediction error (residuals), in the units of the simulated value, as a single measure of model efficiency. The RMSE is calculated according to the following equation:

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( O_i - S_i \right)^2}{n}} \tag{5}$$

where *Oi* and *Si* represent observed and simulated values of a sample of size *n*. Values for RMSE range between 0 and ∞, where an RMSE of zero suggests a perfect fit between observed and simulated information.
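
Equations (3)–(5) are straightforward to compute together; a compact sketch with illustrative monthly flow values follows.

```python
# Sketch: Nash-Sutcliffe (Eq. 3), index of agreement (Eq. 4), and RMSE (Eq. 5).
import numpy as np

def efficiency(obs: np.ndarray, sim: np.ndarray):
    resid_sq = np.sum((obs - sim) ** 2)
    ns = 1 - resid_sq / np.sum((obs - obs.mean()) ** 2)                 # Eq. (3)
    d = 1 - resid_sq / np.sum(
        (np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)    # Eq. (4)
    rmse = np.sqrt(resid_sq / obs.size)                                 # Eq. (5)
    return ns, d, rmse

obs = np.array([1.2, 0.8, 2.4, 3.1, 1.9, 0.7])  # illustrative observed flows
sim = np.array([1.0, 0.9, 2.2, 3.4, 1.7, 0.8])  # illustrative simulated flows
print("NS=%.3f, d=%.3f, RMSE=%.3f" % efficiency(obs, sim))
```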

Calibration of suspended sediments was not performed due to the unavailability of observed suspended sediment data covering the accepted period of model calibration and validation, outside the SWAT warmup period (2000–2014). Notwithstanding, one pivotal sediment-related SWAT parameter (PRF), which is tied to flow parameters, was calibrated, thus providing an indirect calibration for suspended sediment (Table 3).

Figure 4 shows a simplified SWAT model constructed for the UAS spatial extent using a LiDAR derived DTM (Figure 4a) and a photogrammetrically derived point cloud DTM (Figure 4b). In each of the models, the watershed was automatically delineated into 60 sub-basins with very similar characteristics of monitoring points, stream network, and sub-basin sizes and morphology.

**Figure 4.** Simplified SWAT UAS extent calibration and validation model for study area: LiDAR derived DTM (*a*) Photogrammetrically derived DTM (*b*).

Both models were calibrated and validated for flow at the monitoring points marked with red circle in Figure 4a,b. The warmup period including calibration and validation period for the SWAT models was consistent with that assigned to the large spatial extent illustrated in Figure 3.

#### **3. Results and Discussion**

#### *3.1. Model Calibration and Validation at the Larger Spatial Extent*

The SWAT model was calibrated for flow from August 2015 through August 2016 at a monthly timestep. In Figure 5, the statistical measures of model efficiency clearly demonstrate that the simulated flow is within an acceptable threshold of the USGS measured data. Moreover, all the measures of model efficiency demonstrated that the 17 SWAT parameters outlined in Table 3 did a decent job of fitting the simulated data to the USGS observed streamflow data. The extent of model uncertainty captured by the 95PPU (>76%) further attests to the effectiveness of the model calibration. The R-factor (R), Nash–Sutcliffe (E), and index of agreement (d) show strong association between simulated and observed data; this, together with the relatively low RMSE, suggests that the model calibration is robust.

**Figure 5.** SWAT large extent model calibration 8/2015–8/2016. Note: R is *R*-factor, E is Nash–Sutcliffe coefficient of simulation efficiency, d is index of agreement, and RMSE is root mean square error.

The SWAT model was validated for flow between August 2017 and August 2018 at a monthly time interval. The model evaluation criteria R, E, and d (Figure 6) are not markedly different from those obtained for model calibration and suggest a strong validation of the model, though the model was slightly better calibrated than validated. Notwithstanding, all the statistical criteria strongly suggest that the model was constructed in a manner that closely matches surface fluvial hydrologic characteristics. Figure 6 also shows that a large fraction (>80%) of model uncertainty was captured by the 95PPU. The widely used model efficiency values generated by the Nash–Sutcliffe coefficient, the index of agreement, and the RMSE are within acceptable levels reported in other studies [61–63].

**Figure 6.** SWAT large extent model validation 8/2017–8/2018. Note: R is *R*-factor, E is Nash–Sutcliffe coefficient of simulation efficiency, d is index of agreement, and RMSE is root mean square error.

#### *3.2. Model Calibration and Validation at the UAS Spatial Scale*

Following the successful calibration of the large spatial extent SWAT model, the fitted parameter coefficients were transferred to the nested UAS spatial scale models. Moreover, the predicted streamflow at the sub-basin closest to the UAS spatial extent watershed was used for calibrating and validating the models. It has been shown that model parameters and their coefficients are regionally transferrable within a watershed if the efficiency statistics are reproducible at a different sub-basin [64]. Calibration results for the UAS derived DTM and LiDAR derived DTM demonstrated that both models fall within an acceptable threshold of model calibration efficiency, despite the LiDAR derived DTM having relatively higher R-factor, Nash–Sutcliffe, and index of agreement values (Figure 7). Notwithstanding, both models had identical RMSE, which turned out to be higher than that obtained at the larger spatial extent of calibration. The models generated by the LiDAR DTM and the UAS DTM also illustrated that a high fraction (>70%) of model uncertainty was captured during calibration, as can be seen from the 95PPU. From the model calibration, it can be concluded that a photogrammetrically derived DTM from a UAS point cloud is effective in modeling flow. Jeziorska et al. [30] reported that a UAS derived terrain model is more effective than a LiDAR derived DTM in accounting for flow morphology and patterns in areas not covered by vegetation because of its increased spatial resolution. We attribute the slightly lower values of R, E, and d in the UAS derived terrain model to the uncertainty of the interpolated terrain beneath the few areas within the watershed that are covered by trees and shrub, and also to the single flow (D-8) algorithm used by SWAT. Studies have shown that multiple flow algorithms estimate flow better than single flow algorithms [65,66]. Figure 8 shows the validation results for the LiDAR derived DTM and UAS derived DTM. Both models are within the acceptable threshold of validation based on their simulation efficiency, despite the LiDAR based DTM scoring slightly higher values than the UAS derived DTM in three of the four model efficiency criteria used (Figure 8).

**Figure 7.** SWAT UAS extent model calibration 8/2015–8/2016: (*a*) LiDAR point cloud DTM and (*b*) photogrammetrically derived point cloud DTM. Note: R is *R*-factor, E is Nash–Sutcliffe coefficient of simulation efficiency, d is index of agreement, and RMSE is root mean square error.

**Figure 8.** SWAT UAS extent model validation 8/2017–8/2018: (*a*) LiDAR point cloud DTM and (*b*) photogrammetrically derived point cloud DTM. Note: R is *R*-factor, E is Nash–Sutcliffe coefficient of simulation efficiency, d is index of agreement, and RMSE is root mean square error.

Validation results for the two models display close similarities to the results generated by calibration. This suggests that a UAS derived DTM can serve as an alternative dataset to model streamflow in the absence of a LiDAR DTM, provided that the study area has minimal to no vegetation cover. When vegetation cover dominates a study area, ground/terrain models generated from a UAS derived point cloud contain higher errors [67,68] and may not be suitable for modeling streamflow. Since surface runoff is mostly controlled by terrain, we recommend that a UAS derived DTM used for estimating flow and eventual sediment erosion be collected over areas with minimal high vegetation cover, such as trees (<10%). Moreover, as demonstrated by Jensen and Mathews [69], the point cloud should be classified into ground points using a robust algorithm, such as the adaptive TIN ground filter employed in this study, followed by manual classification of the automatically classified ground points to further eliminate above ground features. The resulting DTM generated from the hybrid classified ground points can be used in modeling flow across space in the absence of a LiDAR derived DTM. This refined UAS derived DTM has great potential to extend the applications of UAS data.

#### *3.3. Assessment of Sediment Erosion at the UAS Spatial Scale*

The calibrated and validated SWAT models for the LiDAR point cloud DTM and the photogrammetrically derived point cloud DTM were used to model the amount of sediment eroded and washed from the watershed at the sub-basin level. Figure 9 compares the sediment loads generated by the two models. The photogrammetrically derived point cloud DTM accounted for slightly higher sediment loading from the watershed than the LiDAR derived DTM (Figure 9). In sub-basins covered with loose sediment and gentle slopes, the amount of sediment eroded is identical between the LiDAR derived DTM and the UAS derived point cloud DTM. However, this is not the case in sub-basins with rugged terrain, where the UAS derived DTM generated greater sediment loads than the LiDAR derived DTM. We speculate that this difference might be ascribed to the higher spatial resolution of the UAS derived DTM (3.6 cm) compared to the relatively lower spatial resolution of the LiDAR derived DTM (14.3 cm). Digital terrain models derived from UAS point clouds have been shown to be more effective than LiDAR derived DTMs in accounting for streambank erosion [27,33] and measuring gully erosion rates [34]. This effectiveness demonstrates how photogrammetrically derived SfM terrain, when used in scenarios with little to no vegetative ground cover, can serve as a low-cost, viable alternative to more costly methods that rely on LiDAR data.

**Figure 9.** Comparison of sediment erosion generated by (*a*) LiDAR point cloud DTM and (*b*) Photogrammetrically derived point cloud DTM.

Additional research is needed to compare UAS and LiDAR derived DTMs collected at identical spatial and temporal resolution over non-vegetated terrain in order to comprehensively evaluate the efficacy of the UAS derived DTM in estimating flow and sediment erosion. Moreover, the key difference unearthed in this study, where both DTMs performed identically on gentle slopes and loose sediments but differently in rugged terrain, needs further testing in similar site settings. As lighter payload LiDAR sensors developed for drone platforms become cheaper [70], hydrologic modeling of flow and nonpoint source pollutants, which has historically been conducted at moderate to large scales, will become more practical at the smaller UAS spatial scale, thus providing a more effective tool for monitoring erosion at mining sites.

#### **4. Conclusions**

Unmanned aerial systems have long been recognized for their ability to acquire imagery over areas of interest at spatial resolutions that provide incredible amounts of detail, both temporally and spatially. Coupled with their ability to be quickly deployed over small areas on a frequent basis, UAS have rapidly demonstrated themselves as a valid data collection tool in many geomorphic and geologic applications. While UAS derived data products such as DSMs and DTMs have been used in many forms of fluvial research, the integration of a UAS derived DTM in nested scale distributed hydrologic modeling, which this study investigated, is a unique domain of UAS application. In this research we assessed the feasibility and efficacy of a photogrammetrically derived DTM in modeling sediment erosion across space. The nested scale hydrologic modeling framework successfully downscaled streamflow data from a larger spatial extent and applied it to a smaller UAS spatial scale. In this study, we have demonstrated that it is possible to extend the use of UAS derived DTMs from river and other narrow transects to the entire image area in modeling erosion potential. We built on the literature, which mostly agrees that the higher spatial resolution obtained from UAS derived products facilitates the modeling of erosion at the transect level. The study also demonstrates that, with the tools of model calibration and validation, it is possible to utilize a UAS derived DTM to model flow and estimate sediment loads in the absence of measured data at the UAS spatial scale. Notwithstanding, we caution that if LiDAR data is available at a higher temporal and spatial frequency, such as from the recent lighter payload LiDAR sensors that can be mounted on UAS platforms, it should be used to monitor flow and sediment loading rather than a photogrammetrically derived DTM, especially if the study area is covered with significant vegetation. The nested scale methodology developed and utilized in this study can be extended to similar fluvial aggregate excavation operations. The hydrologic modeling framework serves as an excellent example of how UAS data can serve as a low-cost alternative to LiDAR for decision making and lowering overhead costs in a variety of extraction-based industries. Future research should evaluate the quality and accuracy of models over areas with diverse amounts of vegetation cover and provide a direct comparison of DTMs gathered via LiDAR and UAS imagery, respectively.

**Author Contributions:** This article was written jointly by J.P.H. and C.O.W. The project was conceptualized and its methodology designed as a joint endeavor between the two authors, where C.O.W. took on the main role in the GIS modeling, while data collection and preliminary analysis were done by J.P.H. Formal analysis of the data was done predominantly by C.O.W. Writing, reviewing, and editing were a joint effort between the two co-authors. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by The University of Wisconsin Regent Scholar Award, awarded to J.H. for his work titled 'Lowering Overhead Costs within the Industrial Aggregate and Sand Mining Industry using Unmanned Aerial Systems'.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank P. M. from Menet Aero for conducting the flights and performing the ground control point survey.

**Conflicts of Interest:** The authors declare no conflict of interest related to this study. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


### *Technical Note* **Temperature Profiling of Waterbodies with a UAV-Integrated Sensor Subsystem**

#### **Cengiz Koparan, Ali Bulent Koc \*, Calvin Sawyer and Charles Privette**

Department of Agricultural Sciences, Clemson University, Clemson, SC 29634, USA; ckopara@g.clemson.edu (C.K.); CALVINS@clemson.edu (C.S.); privett@clemson.edu (C.P.)

**\*** Correspondence: bulent@clemson.edu; Tel.: +1-864-656-0496

Received: 21 June 2020; Accepted: 19 July 2020; Published: 21 July 2020

**Abstract:** Evaluation of thermal stratification and systematic monitoring of water temperature are required for lake management. Water temperature profiling requires temperature measurements through a water column to assess the level of thermal stratification, which impacts oxygen content, microbial growth, and the distribution of fish. The objective of this research was to develop and assess the functions of a water temperature profiling system mounted on a multirotor unmanned aerial vehicle (UAV). The buoyancy apparatus mounted on the UAV allowed vertical takeoff and landing on the water surface for in situ measurements. The sensor node integrated with the UAV consisted of a microcontroller unit, a temperature sensor, and a pressure sensor. The system measured water temperature and depth at seven pre-selected locations in a lake using autonomous navigation with autopilot control. Measurements at 100 ms intervals were made while the UAV was descending at 2 m/s until it landed on the water surface. Water temperature maps at three consecutive depths at each location were created from the measurements. The average surface water temperature at 0.3 m was 22.5 °C, while the average water temperature at 4 m depth was 21.5 °C. The UAV-based profiling system successfully performed autonomous water temperature measurements within a lake.

**Keywords:** autonomous; hexacopter; water quality; water stratification; water temperature

#### **1. Introduction**

Evaluation of the physiochemical parameters of lake water is crucial for lake management and water quality monitoring. Water temperature is one of the physiochemical parameters that has a significant impact on water chemistry. Change in water temperature can trigger several phenomena in a waterbody. Some of these phenomena can occur naturally, causing no harm to the aquatic system, while others can negatively impact water quality. Thermal stratification occurs at a depth of 3.6 m in many lakes, where layers with different temperatures are formed [1]. These layers are categorized from top to bottom, with the warmest layer on the top and the coolest layer at the bottom, as the epilimnion, the thermocline, and the hypolimnion [2]. A lake can be considered stratified when the temperature difference between the epilimnion and the hypolimnion is greater than 1 °C [3]. An inverse stratification, where the coolest layer forms at the top while warmer water rests at the bottom, occurs during winter [4]. This phenomenon can impact many aspects of the lake, such as the spatial distribution of fish, microbial growth, and oxygen content [5]. Beyond thermal stratification, water temperature can be a direct indicator of dissolved oxygen (DO), toxic absorption, and salinity [6]. The growth rates of algae and aquatic plants are influenced by change in temperature, and reduced DO due to increased temperature can cause harmful effects to aquatic life [7]. Other factors, such as discharge of industrial wastes, forest harvesting, and agricultural runoff, can also affect water temperature [8]. Therefore, periodic evaluation of thermal stratification, as well as systematic monitoring of water temperature, is important for water quality monitoring and lake management.
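
A minimal sketch of how a measured depth-temperature profile can be screened against these criteria follows: the lake is flagged as stratified when the top-to-bottom difference exceeds 1 °C, and the thermocline is located at the depth of maximum temperature gradient. The profile values are illustrative, not measurements from this study.

```python
# Sketch: stratification test and thermocline depth from a temperature profile.
import numpy as np

def stratification(depth_m: np.ndarray, temp_c: np.ndarray):
    stratified = (temp_c[0] - temp_c[-1]) > 1.0        # epilimnion vs. hypolimnion
    grad = np.abs(np.diff(temp_c) / np.diff(depth_m))  # °C per metre between samples
    thermocline = depth_m[np.argmax(grad)]             # depth of steepest change
    return bool(stratified), float(thermocline)

depth = np.array([0.3, 1.0, 2.0, 3.0, 4.0])   # m
temp = np.array([22.5, 22.4, 22.0, 21.7, 21.5])  # °C
print(stratification(depth, temp))  # (False, ...) for this weakly mixed profile
```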

Water temperature monitoring systems vary depending on the size of the targeted waterbody. The most common are manual sampling with multi-parameter sensors and buoy-based submersible sensor systems that can provide real-time water temperature measurements from water columns [5,9]. Multi-parameter sensors and buoy-based temperature sensors come in different configurations depending on the desired monitoring depth and sampling conditions [10]. Buoy-based temperature sensors are formed by thermistors embedded along a single cable, forming a thermistor chain [11]. The total number of thermistors and the distance between them vary depending on the depth, width, and other hydrological properties of the lake [12,13]. A buoy-based thermistor chain can make synchronous water temperature measurements at various depths, thus providing information for water column profiling. Multi-parameter sensors are useful for rapid water temperature monitoring from shore; however, they require transport vehicles and extensive labor. Buoy-based thermistors are fixed in location, and the spatial resolution of the measurements depends on the number of thermistors. Buoy-based systems must also be installed for long periods and in limited numbers due to cost and maintenance constraints. Because shallow lakes stratify for short periods of time, the installation of buoy-based systems can be impractical and expensive [14]. An easily deployable system that can collect water temperature measurements with high spatial resolution within a short period of time would therefore be well suited to shallow waters.

Unmanned aerial vehicles (UAVs) offer advantages over current multi-parameter sensors and buoy-based systems for water temperature profiling in lake management and water quality monitoring. UAVs are mobile and easily deployable from locations near a waterbody. Recent studies have utilized remote sensing and UAVs for monitoring the surface temperature of waterbodies. Thermal infrared remote sensing has been used for measuring surface water temperature in rivers and lakes for practical applications [15], and UAV-based thermal infrared mapping has been studied as a means to assess groundwater discharge into coastal zones [16]. In addition to remote sensing, temperature sensor-integrated UAVs have been tested for water temperature measurements in lakes [17,18]. These UAV-based systems acquire temperature measurements from an applicable depth while hovering above the water surface. Hovering above a water surface during measurements, however, increases battery use, thereby limiting the number of samples that can be taken [19]. UAV systems also rely on sensitive navigational sensors to fix their position in the air; when sampling water at low altitude, navigation errors can result in a crash landing into the water. The hover altitude of a UAV depends on wind speed, sensor calibration, and payload swing motion, and these factors prevent precise control of the depth at which water temperature measurements are made [20]. Therefore, a more reliable UAV-based water temperature measurement approach is required to provide water column temperature data.

Our previous studies introduced the development, application, and evaluation of UAVs for water quality monitoring. First, a water sampling UAV for aerial water sample collection was designed and evaluated [21]. Second, an in situ water quality measurement UAV was designed and utilized for autonomous water quality measurements within an agricultural pond [19]. Third, the water collection apparatus and the sensor node were combined in the same UAV with a relatively larger payload capacity [22]. Fourth, the combined UAV was re-designed for adaptive water sampling where the decision for water collection was made based on in situ water quality measurements with the onboard sensor node.

The objective of this research was to develop and test a UAV-based water temperature measurement system for lake temperature profiling and monitoring. The in situ water quality measurement UAV reported in Koparan et al. [19] was re-modeled by replacing the sensor node with depth and temperature sensors, and the buoyancy apparatus on the UAV was modified for safer water landing. The novelty of the system presented here is that the UAV starts measuring temperature and depth as soon as the temperature and pressure probes enter the water during descent. Another key feature is that the UAV lands on and takes off from the water surface rather than hovering during measurements.

#### **2. Materials and Methods**

#### *2.1. UAV and Sensor Node Components*

The system developed for water temperature profiling consisted of a hexacopter UAV and a sensor node. The UAV was custom designed, and its technical specifications were provided in a previous publication [19]. The gross weight of the aircraft was 3100 g (UAV and payload). The UAV weighed 2300 g and the payload (sensor node) 800 g, including a second battery, a microcontroller unit with a protective case, temperature and pressure probes, an extension cord (10 m), and a protective steel case for the probes. The second battery was a Li-Po battery (7.4 V, 2200 mAh, Venom, Rathdrum, ID, USA). The voltage to the microcontroller was regulated using this second battery with a battery eliminator circuit (BEC). A separate battery for the sensor node allowed dismounting the unit for standalone measurements without using the UAV.

The pressure and temperature sensors were embedded as a single unit by the manufacturer (Bar02, Blue Robotics, Torrance, CA, USA). This unit was integrated with a microcontroller unit (Arduino Mega 2560, Ivrea, Italy) for calibration, control, and data recording. The pressure measurements were used to determine the depth at which temperature measurements were made. The measurements were recorded on a secure digital (SD) card (SunFounder, Shenzhen City, Guangdong Province, China) inserted into the microcontroller unit. A voltage converter circuit (I2C Level Converter, Blue Robotics, Torrance, CA, USA) was used with the pressure sensor to regulate voltage and to enable communication with the microcontroller unit. The pressure sensor and voltage converter circuit were waterproofed in a custom-designed 3D printed case and sealed with epoxy (Figure 1). The 3D printed case was placed in a steel tube to ensure that the pressure sensor would submerge rapidly in the water. The steel tube was coated with Flex Seal (Flex Seal, Weston, FL, USA) to prevent corrosion. The microcontroller platform was sealed in a box and mounted on the UAV. The pressure sensor was suspended on a 10 m long tether.
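The depth at which each temperature reading was taken follows directly from hydrostatics. A minimal post-processing sketch of the conversion is given below, assuming the sensor reports absolute pressure in millibar and that freshwater density applies; the constants and function name are illustrative, not the system's actual firmware.

```python
# Illustrative post-processing sketch: convert absolute pressure readings
# (mbar) into water depth. All constants are assumptions for freshwater.

RHO_FRESHWATER = 997.0   # kg/m^3, approximate freshwater density
G = 9.81                 # m/s^2, gravitational acceleration
P_ATM_MBAR = 1013.25     # mbar, surface pressure; ideally read before descent

def depth_from_pressure(p_mbar: float) -> float:
    """Hydrostatic depth (m) from absolute pressure (mbar)."""
    p_gauge_pa = (p_mbar - P_ATM_MBAR) * 100.0  # 1 mbar = 100 Pa
    return max(p_gauge_pa / (RHO_FRESHWATER * G), 0.0)

# Example: a reading of about 1411 mbar corresponds to roughly 4 m of water.
print(round(depth_from_pressure(1411.0), 2))
```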

**Figure 1.** The sensor node assembly: (**a**) pressure sensor and voltage converter circuit, (**b**) computer aided design (CAD) of the case in SolidWorks, (**c**) pressure sensor in 3D printed and sealed case, and (**d**) steel tube to enable rapid submerge and sensor protection.

#### *2.2. Experiment Site and Sampling Locations*

The UAV-based water temperature profiling system was evaluated and tested in Lake Issaqueena (Pickens County, South Carolina). The length of the lake is 13 km, with an approximate surface area of 36 ha. The width of the lake at its largest section is 400 m. The top of the dam at Lake Issaqueena is about 15.7 m above bedrock. The water temperature averages 21.9 °C in summer and 4 °C in winter [23]. In 2005, the South Carolina Department of Health and Environmental Control (SCDHEC) reported that water quality parameters meet the standards at this lake [24]. This lake was chosen for the experiments because the results could be used to generate new data sets for water quality monitoring. Lake Issaqueena has no boat access from the neighboring Keowee River, thereby providing safe UAV flight conditions for the experiments. Figure 2 shows the UAV-integrated sensor node and launch locations at the lake.

**Figure 2.** (**a**) Sensor node integrated with the aircraft and (**b**) the launch location in Lake Issaqueena.

Due to the flight restrictions imposed by the Federal Aviation Administration (FAA) and limited battery power, the water temperature profiling experiments were conducted in a smaller portion of the lake. The FAA mandates that UAV flights stay within the line of sight at a maximum altitude of 120 m above ground level [25]. Because of these limitations, the sampling locations were selected in areas that the UAV could access with the limited battery power while staying within line of sight. The UAV launch location was marked as zero and the water sampling locations were marked with numbers one to seven on the map in Figure 3. The launch location was free of trees and provided flat ground for safe takeoff and landing. Water sampling locations were assigned in a grid sampling fashion, scattered so that the water temperature measurements would represent the entire area within the mission plan boundary. The sampling locations were 80 m to 90 m apart from each other. The shortest flight distance was 73 m, from the launch location to sampling location one, and the longest was 290 m, from the launch location to sampling location seven.

**Figure 3.** A section of the lake was used as the experiment site for measurements.

#### *2.3. Water Temperature Profiling Data Collection*

The experiments for water temperature profiling were conducted on 25 April 2019 at 3:00 p.m. The average air temperature from 20 m altitude to the water surface was 24 °C within the mission plan boundary; the air temperature measurements were obtained from the UAV's internal temperature sensor. The UAV with the integrated sensor node was deployed to each of the sampling points with autopilot-controlled autonomous flights. The navigation altitude was set to 20 m to provide a safe flight during travel, as the probe was mounted on a 10 m long tether. After the navigation destination had been reached, the autopilot let the UAV slowly descend and land on the water surface for 5 s. The temperature and water depth measurements were made during the descent until landing. After completing the measurements, the UAV took off and continued with the mission plan to measure water temperature and depth at the next sampling location (Figure 4). The autonomous flights were programmed with a ground control station using the open-source Mission Planner (MP) software, and each individual flight was assigned as a mission plan [26]. The limited battery power and long flight distances made it necessary to divide the selected area into individual mission plans. Locations one and two were included in the first mission plan, location three in the second, locations four and five in the third, location six in the fourth, and location seven in the fifth mission plan.

**Figure 4.** Applied method of water temperature measurements with the UAV.

The water depth and temperature measurements were initiated by the autopilot when the UAV arrived at the predefined sampling location at 20 m altitude. Water depth and temperature measurements were recorded at 100 ms intervals while the UAV was descending at a rate of 2 m/s for landing; at this descent rate and sampling interval, a measurement was recorded roughly every 0.2 m of altitude. A flare altitude of 10 m was assigned in the autopilot's configuration for a safe, smooth, and steady landing. The flare altitude marks the final stage of the auto-landing procedure, in which the autopilot decreases the throttle and slows down the UAV to readjust the descent speed prior to landing [26].

The number of measurements at each location varied depending on the water depth. The depth measurements indicated the depth of the probe during descent; therefore, repeated measurements were expected once the probe had reached the bottom of the water column. Measurements that repeated themselves after a certain depth were assigned as the maximum water depth at that sampling location. The collected water depth and temperature data were used to create a bathymetric map and water temperature maps for visualization of the water temperature distribution at the surface (0.3 m) and at depths of 2 m and 4 m. The inverse distance weighted (IDW) interpolation method was used for processing and interpolation in ArcMap (ESRI, Redlands, CA, USA). Raster maps were developed by interpolating vector data in a Geographic Information System (GIS) to illustrate data values at intermediate locations [27]. The relationship between water depth, water temperature, and location was evaluated, and the water temperature distribution was illustrated in a 3D scatter plot generated in R (R-GUI, Vienna, Austria) [28].
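For readers who wish to reproduce the interpolation outside ArcMap, a minimal IDW sketch is given below. The sample coordinates and temperatures are hypothetical placeholders, and the power parameter of 2 is the common default rather than a value reported in this study.

```python
import numpy as np

def idw(xy_known, values, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighted interpolation.

    xy_known: (n, 2) array of sampled coordinates
    values:   (n,)  array of measurements (e.g., temperature, deg C)
    xy_query: (m, 2) array of locations to estimate
    """
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)
    # Pairwise distances between query points and known samples
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, eps) ** power  # closer samples weigh more
    return (w * values).sum(axis=1) / w.sum(axis=1)

# Hypothetical surface temperatures (deg C) at four sampling locations (m)
pts = [(0, 0), (80, 10), (160, 40), (240, 90)]
temps = [28.0, 20.0, 19.5, 18.3]
print(idw(pts, temps, [(120, 30)]))  # estimate at an intermediate location
```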

#### **3. Results and Discussion**

In indoor tests, the depth measurements from the sensor node closely matched the reference depth values within a water column. The 3D printed watertight case protected the probe and the circuits from water damage, and the indoor experiments showed that the probe was submersible and provided reliable water depth and temperature measurements. Table 1 shows the summary statistics of the indoor tests used to evaluate whether the 3D printed case affected the sensor depth measurements. The difference between the actual and measured sensor depth values was not significant at the 0.05 level of significance (t (18) = 2.03, p = 0.57), and the two sets of depth measurements differed by less than 1 percent. The accuracy of the water temperature measurements from the sensor was not investigated, because temperature measurements are reported to be within 2 °C in the manufacturer's specifications. Visual observations were made to confirm the sensor temperature measurements.
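A depth comparison of this kind can be reproduced with a paired t-test. The sketch below only illustrates the procedure; the reference and measured values are hypothetical, as the raw data behind Table 1 are not reproduced here.

```python
# Hypothetical paired depth data (m): illustrates the test procedure only;
# the actual values behind Table 1 are not reproduced here.
from scipy import stats

reference = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
measured = [0.51, 0.99, 1.52, 2.01, 2.48, 3.02, 3.49, 4.03]

t_stat, p_value = stats.ttest_rel(reference, measured)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")
# At the 0.05 level, p > 0.05 means the measured depths do not differ
# significantly from the reference depths.
```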


**Table 1.** Summary statistics for evaluation of sensor depth measurement.

The steel tube-enclosed sensor probe descended rapidly toward the bottom of the lake, as expected. The rapid descent of the pressure sensor reduced the time that the UAV had to stay on the water surface and increased the speed of data collection. Reducing the floating time on the water surface minimized battery use because the duration of the UAV's idle mode was reduced; in idle mode, the flight controller kept the propellers spinning at the slowest rate so that the UAV could take off immediately when requested, either by the mission plan or by the ground control station. The water depth evaluations estimated the maximum water depth as 8.4 m within the experiment boundary, near the center (Figure 5). Among the sampling points, location three was the deepest at 7.3 m, while the water depth at sampling location four was 4 m.

**Figure 5.** Water depth map of Lake Issaqueena within the mission plan boundary.

Water temperature varied with location across the mission plan boundary and with water depth. The temperature profiling experiments showed that the water temperature was highest at location one, both at the water surface and at the bottom (Figure 6). At location one, the water temperature was 28 °C at the surface and 23 °C at the bottom, the largest temperature variation observed. At location six, the water temperature was 18.3 °C at the surface and 17.6 °C at the bottom. Water temperature at locations two and five followed the same trend, with 20 °C at the surface and 19 °C at the bottom; the trend was the same at these two locations because both were located at the center, in the downstream direction from northeast to southwest. A similar trend was observed at locations three and seven, with a 1 °C difference in water temperature. The water column measurements indicate a sudden temperature change at locations two, five, and six at depths of 3.67 m, 3.93 m, and 3.67 m, respectively. A rapid and steady cooling was observed at these depths, continuing to the bottom of the lake at each location. At location three, the water temperature was steady down to a depth of 1.4 m, after which a sudden temperature drop was observed, indicating that the cooling depth at location three was shallower than at locations two, five, and six. Thermal stratification occurs at a depth of 3.6 m in many lakes, and the temperature difference between the epilimnion and the hypolimnion must be greater than 1 °C [1,3]. While a temperature drop of more than 1 °C was observed at an average depth of 3.8 m, it was not clear from these measurements whether thermal stratification had occurred.

**Figure 6.** Water temperature distribution by location and water depth: (**a**) 2D scatter plot illustration and (**b**) 3D scatter plot illustration.

Sampling location one was closest to a stream located at the west corner of the experiment boundary. The increase in water temperature there might have been due to runoff after a rain event that occurred before the field experiments. The change in water temperature across intermediate locations and sampling depths is illustrated in the water temperature maps in Figure 7, which represent the water temperature at the surface (0.3 m) and at depths of 2 m and 4 m. The average surface water temperature was 22.5 °C, while the average water temperature at the 4 m depth was 21.5 °C. The water temperature remained at around 18 °C at all depths at sampling location six. The largest water temperature drop, 3 °C, was recorded at sampling location one. The difference in water temperature between sampling locations one and six was highest at the surface, at 10 °C, and lowest at the sampling depth of 4 m, at 6 °C.

**Figure 7.** Water temperature maps representing change in water temperature by intermediate locations and sampling depth.

#### **4. Conclusions**

The UAV-based water temperature profiling system described here provides a different perspective on water quality monitoring practices. Its ability to remotely access waterbodies and its ease of deployment provided better and faster data collection compared to other water quality monitoring methods. The system successfully navigated to pre-defined water sampling locations and executed mission plans for water temperature and depth measurements. The 3D printed pressure sensor case successfully prevented water leakage and kept the sensor components safe while allowing the probe to descend quickly through the water column. The maximum water depth was 8.4 m within the selected boundary in Lake Issaqueena. A rapid water temperature drop at sampling location one was attributed to the stream entering the waterbody. A rapid water temperature drop of greater than 1 °C at an average depth of 3.8 m was observed at locations two, five, and six. However, a wider data collection experiment covering the entire lake is necessary to establish whether thermal stratification occurred. The length of the probe's extension cable can be adjusted to the depth of the waterbody under study, taking into consideration the endurance and thrust performance of the UAV and the maximum operational depth of the sensor node. Water temperature profiling with this system could be achieved within a relatively short time span, providing substantial advantages over other methods such as traditional sampling by boat. The UAV-assisted temperature profiling option could also reduce costs by minimizing the required time on site while covering a larger area with ease. Considering the maintenance time, cost, and lack of spatial resolution of fixed sensor stations, the UAV-assisted temperature profiling system described here provides unique advantages, including advanced mobility, high spatial resolution, low cost, and fast response to disasters and other natural events.

**Author Contributions:** Conceptualization, C.K. and A.B.K.; Methodology, C.K., A.B.K., C.P. and C.S.; Software, C.K.; Validation, C.K., A.B.K., and C.S.; Formal Analysis, C.K.; Resources, A.B.K. and C.P.; Data Curation, C.K.; Writing—Original Draft Preparation, C.K.; Writing—Review and Editing, A.B.K., C.S. and C.P.; Visualization, C.K.; Supervision, A.B.K.; Project Administration, A.B.K.; Funding Acquisition, A.B.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Implementing Mitigations for Improving Societal Acceptance of Urban Air Mobility**

**Ender Çetin 1,\*, Alicia Cano 2, Robin Deransy 3, Sergi Tres 2 and Cristina Barrado 1**

Received: 30 November 2021; Accepted: 14 January 2022; Published: 18 January 2022


**Abstract:** The continuous development of technical innovations provides the opportunity to create new economic markets and a wealth of new services. However, these innovations sometimes raise concerns, notably in terms of societal, safety, and environmental impacts. This is the case for services related to the operation of unmanned aerial vehicles (UAV), which are emerging rapidly. Unmanned aerial vehicles, also called drones, date back to the first third of the twentieth century in the aviation industry, when they were mostly used for military purposes. Nowadays, drones of various types and sizes are used for many purposes, such as precision agriculture, search and rescue missions, aerial photography, and shipping and delivery. Having started to operate in areas with low population density, drones are now looking for business in urban and suburban areas, in what is called urban air mobility (UAM). However, this rapid growth of the drone industry creates a psychological fear of the unknown in some parts of society. Reducing this fear will play an important role in the public acceptance of drone operations in urban areas. This paper presents the main concerns of society with regard to drone operations, as already captured in some public surveys, and proposes a list of mitigation measures to reduce these concerns. The proposed list is then analyzed, and its applicability to individual very large demonstration flights in urban environments is explained, using feedback from the CORUS-XUAM project. CORUS-XUAM will organize a set of very large drone flight demonstrations across seven European countries to investigate how to safely integrate drone operations into airspace with the support of U-space.

**Keywords:** drones; unmanned aerial vehicles (UAV); social acceptance; urban air mobility (UAM); CORUS-XUAM

#### **1. Introduction**

Drones are flying machines ranging from insect-sized flapping craft to airplanes the size of a commercial airline jet [1]. Their capabilities are also wide-ranging: some drones are capable of flying for only a few minutes, while others can fly for days at a time. The applications of drones are equally diverse. While the initial applications were mainly military, and later recreational, drones are used today in many civil applications and in public spaces. Some of the most common commercial applications include agriculture (crop spraying, crop monitoring, etc.), live streaming of events, emergency response, search and rescue, firefighting, disaster zone mapping, mapping and surveying, and artificial intelligence applications [2–5]. More recently, the societal utility of drones has been further demonstrated in the management of the global COVID-19 pandemic, with use cases such as aerial spraying to disinfect streets, the surveillance of public spaces, and supporting local authorities in monitoring lockdowns and quarantines [6].


The 2016 European Drones Outlook Study [7] forecasts promising economic growth fostered by the emerging drone market. Unmanned aircraft will be part of everyday life in most economic sectors, as suggested by the bubble sizes in Figure 1, with the greatest impact on air travel, utilities, entertainment and media, logistics, and agriculture. Indeed, the number of drones flying in European airspace is expected to increase from a few thousand to several hundred thousand by 2050, most notably in government and commercial activities. The annual economic benefit could exceed EUR 10 billion by 2035 in Europe and create 100,000 new direct jobs to support drone-related operations. An example of this growth is the agricultural sector, where the authors estimate that 150,000 drones will be operated by 2035. The same is true in the fields of utilities and security, where around 60,000 unmanned aircraft are expected to assist in natural disaster management and traffic control, among other tasks.

**Figure 1.** Drone market areas (size represents the market expectations).

However, despite the multiple operational services and the huge potential economic benefits of the drone industry, this relatively new technology will not really take off until the societal concerns associated with its widespread deployment are properly addressed.

As in the early days of aviation, safety will remain the main factor that will influence public acceptance of drones, especially as, unlike conventional commercial and general aviation, drones will often operate over moderately to densely populated areas and at lower altitudes. Visible to the naked eye, civil drone operations will raise questions about their nature and the risk they may represent for the populations and installations overflown. Noise pollution generated by drones will have to be contained to acceptable levels depending on the time of day and the frequency of operations. In addition, other societal and environmental impacts on the population, fauna, and flora will also have to be anticipated and mitigated.

Aware of these societal and environmental challenges, the CORUS-XUAM project has undertaken a review of surveys covering the public acceptance of drones and has initiated the identification of possible mitigation measures. The aim of this work is to address public concerns before the UAM business spreads over cities, and to have mitigation measures in place that facilitate a seamless acceptance of drones in our urban skies.

#### **2. Public Surveys about Drones**

The surveys reviewed are from various organizations (air traffic service providers, industry, research, universities, airspace security agencies) and countries (Australia, Germany, Brazil, USA, China, Korea, etc.), and were conducted between 2015 and 2021, with 2018 marking the inflection point at which drones were first considered as new-entrant vehicles in urban transport.

In 2015, Clothier et al. [8] studied the risk perception and public acceptance of drones in Australia. The objectives of this study were to investigate whether the public perceives the risks of drones differently to those of conventionally piloted aircraft, to provide guidance for setting safety requirements for drones, and to understand how the terminology used to describe the technology influences how the public perceives the risk. This research found that terminology had a minimal effect on public perceptions. However, this may change as more information about drone technology and the risks and benefits of its usage becomes available to the public.

In 2016, the Office of Inspector General of the United States Postal Service published a report [9] on the public perception of drone delivery in the United States. This report refers to an online survey, administered in June 2016 to a sample of 18–75-year-old residents in all 50 states and the District of Columbia, to understand the current state of public opinion on drone delivery among potential customers. The survey showed, among other things, that most Americans like the concept of drone delivery rather than dislike it, but that many have yet to make up their minds, and that different groups have different levels of interest in drone delivery. Drone malfunction was the main concern of the public, but other concerns included misuse, privacy, potential damage, and nuisance.

In 2017, Lidynia et al. [10] conducted a survey of 200 people, both laypersons and active users, living in Germany about their acceptance of, and perceived barriers to, drones. The survey questions covered the general evaluation of civil drone technology, barriers, demography, and further user factors. The results show that user diversity strongly influences the acceptance of drones and the perceived barriers. Active drone pilots were more concerned about the risk of possible accidents, while laypeople were more concerned about the violation of their privacy (the routes that drones should and should not be allowed to use).

In 2018, an online survey from NATS [11], the UK airspace service provider, showed that drone acceptance ranges from 45% when drones are seen as a generic technology tool to 80% when they are used in emergency situations. A deep market study conducted by NASA [12] forecasts that, in the coming years, there will be numerous markets in which drones will have a stake.

As a novelty, additional operations, such as passenger transport by unmanned aircraft, or "air taxis", are expected to grow exponentially. Air taxi operations will reduce the travel times of part of the commuting traffic to city centers and contribute to decongesting ground transport by up to 25%. Urban air mobility (UAM) is emerging as the new concept for future drone business. In the US, the concept will later be extended to also include manned electrical vehicles with vertical take-off and landing capabilities, known as eVTOL, under the new term advanced air mobility (AAM). The study shows that the acceptance level rises to 55% with the development of new safety technologies, the improvement of the air flow network, and the automation of flights.

In 2019, Airbus also conducted a survey [13] about the public perception of UAM. The Airbus survey covered four cities/countries around the world, Los Angeles, Mexico City, New Zealand, and Switzerland, and collected 1540 responses. Results revealed that 44.5% of respondents supported or strongly supported UAM and that 41.4% of respondents thought UAM was safe to very safe. This suggests that the initial perception of UAM is quite positive.

The same year, a meta-analysis by Legere of former US public surveys [14] and a DLR survey of 832 German citizens [15] showed acceptance levels of 60% and 49%, respectively. The meta-analysis focused on the different acceptance levels per mission, with public missions having higher acceptance than private/commercial uses. The German survey provides results on the major public concerns, the most important being the misuse of drones for crime (91%) and the violation of privacy (86%). Both surveys refer to generic (small) drones involved in missions such as police surveillance or search and rescue.

In 2020, Tan et al. [16] surveyed the opinion of more than 1000 citizens from Singapore. Delivery drones and passenger vehicles were considered to have an average acceptance of 62%.

In 2021, an EASA survey [17] obtained the highest acceptance (83%) for the UAM composed of passenger electrical vehicles, not necessarily unmanned, cargo drones, and also surveillance drones. Special emphasis was given to the different types of passenger vehicles, and also to concerns related to the environment.

In addition, some surveys [18–20] focused mainly on analyzing the demand for future UAM services, with questions addressed to the public as potential customers. Kloss and Riedel surveyed almost 5000 people from Brazil, China, Germany, India, Poland, and the US. Acceptance was measured for different missions (six using eVTOLs and four using cargo drones), and they found that only 27.3% of the people declared themselves willing to try passenger drones, mostly for commuting, business trips, or travel to/from the airport. In contrast, the willingness to use cargo drones, even at twice or more of today's cost, was 57.8%.

The responses from the Lundqvist survey were more positive. This survey was conducted on almost 500 people from five EU regions (in Holland, England, Spain, Croatia, and Poland), with respondents mainly connected to drone operators or their business. The general positive attitude towards drones reached approximately 70%. Specific questions about concerns covered safety, environment, and privacy issues. Finally, the Park and Joo survey was conducted in South Korea on more than 1000 citizens plus 44 experts. The willingness to use UAM (both passenger and cargo) was 47%, and it decreased as the automation of the vehicles increased.

In Figure 2, the surveys are visualized according to their main focus, such as public acceptance (blue) or market analysis (green). The number of surveys for each year from 2015 to 2021 can be seen on the vertical axis. The different types of drones (surveillance, cargo, and passenger) covered by each questionnaire are also indicated by a picture.

**Figure 2.** Summary of the surveys per year, aim, and proposed vehicles.

A growing interest in passenger drones can be observed from 2018 to 2021. Conversely, interest in surveillance drones has been decreasing, which may be one reason why privacy concerns have decreased over the years in these surveys while noise and environmental concerns have increased.

As an overall metric, the levels of acceptance of drones and of urban air mobility are shown in Figure 3. Each bar represents one survey, and the bars are sorted by year to reveal any trend across time. Again, the color indicates the focus of the survey: blue for public acceptance and green for market analysis.

As can be seen, public acceptance shows no clear trend over the years but reached its six-year maximum of 83% in 2021 (EASA survey). However, the other surveys of the same year had very different results. The way questions are posed in these surveys partly explains these differences. In the EASA survey [17], with the highest acceptance value, the question concerned the "general attitude towards urban air mobility". In the Park and Joo survey [18], also conducted in 2021 but from a market analysis perspective, the question that obtained 47% of positive responses was about the "public's willingness to use UAM in its initial phase". This shows how difficult it is to compare survey results.

**Figure 3.** Acceptance levels of drones and/or UAM per survey (in %).

Surveys, in general, have a first set of questions to classify the public according to their age, gender, and economic status, but also their knowledge about drones, so that the answers can be further studied by groups. Typically, females, elders, and less-educated people have a slightly lower acceptance of drones than the other groups. On the contrary, experts in the field are generally more concerned about safety than laypersons.

Most surveys are also usually accompanied by a scenario of drone usage, and in the market study surveys, the scenarios include a forecast about the cost of the services. Many unknowns are yet to be unveiled: Will safety increase or decrease? Will the projected drone service costs/times be achievable? Will drones generate the expected economical growth? For the moment, only predictions can be provided when conducting surveys, whereas the survey results show clearly that the costs of drone services, as well as the time saved, have a high impact on responses. Drone operations related to health and welfare always have a high level of acceptance, while leisure or business related to leisure are always the least accepted drone operations.

In the most recent surveys, we found that quantitative data are obtained from the questionnaire responses, while qualitative information is obtained from a set of persons who are interviewed separately and whose responses are analyzed in more detail. Typically, this set of respondents, referred to as experts, is used to validate and interpret the responses to the questionnaire. Expert answers usually point towards a positive attitude to drones, as confirmed during the first CORUS-XUAM stakeholder workshop. This workshop analyzed the most critical elements related to UAS/UAM operations along with possible solutions that could enable a sustainable and accepted expansion of drone operations in and around European cities. In particular, the fifth day of the workshop was dedicated to the analysis of the societal impact of drone operations and possible mitigation measures. The responses to the questions in Table 1 showed a high acceptance rate among the 66 workshop participants, as in the surveys analyzed.


**Table 1.** Please check the option(s) that apply/applies to you.

Although public opinions vary with time/country, trends seem to show that between half and three-fourths of the public accepts the deployment of business-related drone operations.

In addition to acceptance, most surveys include questions about public concerns, but they do not use an equivalent set of concerns or the same terminology. To highlight this fact, we used word clouds to process the surveys addressing "public concerns" (see Figure 4). In word clouds, the most frequently used terms within a document are displayed in larger font size.

As can be seen in these word clouds, the public concerns related to drone operations are mostly focused on safety, environment, privacy, and noise. Terms such as animals, visual, and waste are classified as environmental concerns, while others, such as risk and danger, are considered safety concerns. In addition, we see terms related to the economy (i.e., cost and liability) and to other topics, such as regulation or ethics.
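A word cloud of this kind can be generated in a few lines. The sketch below uses the open-source wordcloud Python package on a placeholder string of survey terms; the actual survey texts are not reproduced here.

```python
# Minimal word cloud sketch using the open-source `wordcloud` package.
# The input text is a placeholder; the actual survey responses are not
# reproduced here. Term frequency determines font size.
import matplotlib.pyplot as plt
from wordcloud import WordCloud

survey_text = (
    "safety privacy noise environment safety risk danger cost "
    "liability regulation ethics animals visual waste safety noise"
)

wc = WordCloud(width=800, height=400, background_color="white")
wc.generate(survey_text)

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
```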

**Figure 4.** Word clouds highlighting the public concerns in the different surveys.

In CORUS-XUAM, the workshop participants were asked to select their top three concerns, and the results are shown in Table 2. As most participants came from the aviation sector, it is not surprising that safety was selected as the major concern.

**Table 2.** Please select which of the three main concerns is most important for you.


It is worth mentioning some specific issues that yield "not in my backyard" responses. The location of vertiports is a good example. People are open to the concept, but would not be happy to have one near their home or office.

The full list of societal concerns that can be identified by the CORUS-XUAM project is as follows:


As the environmental area has many items, and noise and privacy are highly mentioned as concerns, we treat them separately in the following sections. While analyzing each societal concern, we also hint at possible mitigation measures.

#### **3. Materials and Methods**

The procedure followed for defining the mitigation measures and analyzing them is summarized in Figure 5. First, the main societal concerns were extracted from the surveys. Aspects related to safety, the economy, the environment and noise were the result of this first step, as depicted in the figure. Next, the societal concerns were analyzed during several brainstorming sessions. For each concern, we determined possible actions that could help to minimize its negative perception. The result was a list of mitigation measures, in which each item is an individual action that can mitigate one or several concerns.

**Figure 5.** Mitigation analysis process.

Finally, the list of mitigation measures was analyzed to draw conclusions. As part of this process, this list was presented and discussed in the CORUS-XUAM workshop. The majority of the participants felt that it was a good start (details can be seen in Table 3) but it was still incomplete. During the debate, new potential actions were proposed and added to the existing ones.

**Table 3.** Opinions collected during CORUS-XUAM workshop to the question "Please select the best option to end this sentence: *The presented list of mitigations* . . .".


Once the list of mitigation actions was completed, the analysis was performed using the double classification process illustrated in Figure 6. With the workflow moving from the inside to the outside, we started by collecting public concerns, then proposed actions to mitigate those concerns, and finally applied two overlapping classifications: first assigning a category to each action and then a level that measures the effort required to implement that action.

**Figure 6.** Mitigation measures classification procedure.

In more detail, the analysis starts by categorizing each mitigation measure according to the scope in which it can be applied. We established four different scopes, or categories, as follows:

• Regulation and policy. This category contains the mitigations that should be part of a regulation made by the authorities.


Figure 7 shows some examples of mitigation for each category. Note that simply rewording a mitigation slightly can move it from one category to another. For instance, "setting up countermeasures to criminal/illegal use of drones" was categorized under "tools and technologies", but rephrasing it to "make mandatory the use of countermeasures ..." would have categorized it under "regulation and policy".

In addition to the category, we assigned each mitigation a second classification into three levels, "easy", "medium", and "difficult", according to its ease of implementation in terms of resources and time. Figure 8 shows some mitigation examples for each of the three levels of ease of implementation. For example, the mitigation "creating an independent authority to investigate accidents/incidents/complaints related to drone operations" is considered difficult to implement at the moment because it requires a high level of agreement between stakeholders; in particular, it would involve regulatory bodies, which must follow lengthy legal procedures. In contrast, the mitigation "limit minimum altitude" is an operational action that is easy to implement.
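One possible way to encode this double classification is sketched below; the category and level names follow the paper where available (the "dissemination" category name is an assumption based on Appendix B), but the data structure itself is only illustrative, not part of the CORUS-XUAM tooling.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    REGULATION_AND_POLICY = "regulation and policy"
    OPERATIONAL_CONOPS = "operational/ConOps"
    DISSEMINATION = "dissemination"  # assumed category name
    TOOLS_AND_TECHNOLOGIES = "tools and technologies"

class Ease(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    DIFFICULT = "difficult"

@dataclass
class Mitigation:
    description: str
    category: Category
    ease: Ease

# Example entry taken from the text
m1 = Mitigation("Limit minimum altitude",
                Category.OPERATIONAL_CONOPS, Ease.EASY)
```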

**Figure 8.** Examples of mitigations by ease of implementation.

#### *3.1. Scoring of the Mitigations*

Given the long list of mitigation measures, we needed a method to rank them from highest to lowest priority. The prioritization process uses a scoring value generated from a dynamic table, created by transposing the rows and columns of the table used to generate the mitigation measures.

The score for each individual mitigation is obtained by adding up the applicability values of that mitigation measure across every concern.

Indeed, a mitigation that reduces visual impact may have a negative effect on the safety of the surrounding traffic, but at the same time may be neutral for natural life and for privacy. For this reason, we crossed each mitigation with each concern on the long list of concerns presented in Section 2 and set +1 for a positive impact, −1 for a negative impact, and 0 for a neutral one.

This is similar to the process used in the specific operations risk assessment (SORA) methodology [21]. The final sum of the values for a mitigation provides a numerical proxy of the impact of its applicability: the higher the number, the more positive the impact.
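A minimal sketch of this scoring step is shown below; the mitigation names, concern list, and impact values are illustrative placeholders rather than the project's actual table.

```python
# Illustrative scoring of mitigations against concerns: +1 positive,
# -1 negative, 0 neutral impact. All names and values are placeholders.
concerns = ["safety", "noise", "privacy", "environment", "economy"]

impact = {
    "limit minimum altitude": [+1, +1, +1, +1, 0],
    "use alternate paths":    [0, +1, +1, -1, -1],
    "low-noise propellers":   [0, +1, 0, +1, -1],
}

# Score = sum of applicability values across all concerns
scores = {name: sum(vals) for name, vals in impact.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:+d}  {name}")  # higher score = more positive impact
```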

#### *3.2. CORUS-XUAM Mitigations Subset*

As a final step of this work, we selected a subset of mitigation measures, mainly (but not exclusively) from the operational/ConOps category, that are applicable to the very large (drone flight) demonstrations (VLDs) being prepared within the CORUS-XUAM project.

Very-large-scale demonstration (VLD) activities will be at the heart of CORUS-XUAM and will support the integrated operations of UAS/UAM and manned aircraft, together with advanced forms of interaction through digital data exchange, supported by integrated and advanced U-space services in urban, suburban, and intercity scenarios, as well as in and near ATM-controlled airspace and airports. The VLDs will focus on different types of missions, such as passenger transport, delivery, emergency response, and surveillance. They will use different U-space deployment architectures and state-of-the-art technologies, and will take into account the coordination between ATC and U-space, including interaction with ATCOs and pilots. The VLDs will combine eVTOL flights with other traffic, and operations in the CTRs of major airports. Vertiport procedures, separation, and data services will also be demonstrated [22].

The mitigation measures proposed to be tested during VLDs are mainly those that can be implemented by the U-space service providers or any other partner involved. As the VLDs are in the planning phase, at the time of writing this paper, each VLD responds differently to the proposed list of mitigation measures, depending on its mission and capacity.

#### **4. Results**

#### *4.1. Mitigations*

4.1.1. Full Mitigation List: Categories, Ease of Implementation, and Top 10 Scored

The full list of social acceptance mitigation measures identified after the CORUS-XUAM brainstorming sessions [23] is presented in Appendix A.

Figure 9 shows the percentage of categorization of the mitigation measures according to the scope in which they can be applied. The categories are explained in detail in Section 3.

**Figure 9.** Mitigation categories for full list.

A categorization of the ease of implementation of each mitigation measure was established to analyze those that could be implemented and achieved with the current technologies and regulations. Figure 10 shows the percentages of the ease of implementation of the full list of mitigation measures. The aim was to identify and analyze the possible mitigation measures that could be implemented quickly.

**Figure 10.** Percentages of total ease of implementation for full list.

The prioritized top 10 mitigation measures and the concerns they can improve upon are presented in Table 4. Figure 11 shows the ease of implementation of those that mitigate the largest number of concerns; for example, mitigation "M1—limit minimum altitude" is thought to address six different concerns and to be easy to implement.

**Figure 11.** Mitigation scores for full list of mitigation measures.

It can be observed that more than 70% of the mitigation measures are found to be achievable in a short or medium timeframe, either because the necessary applied science exists today or because the required technologies are under development. However, 31% of the mitigation measures are still considered complex to implement, which means that there is still a long way to go in the research and development of the new technologies and regulations that would make these mitigations possible.


**Table 4.** Prioritized top 10 mitigation measures.

4.1.2. Partial Mitigation List Applicable to VLDs: Categories, Ease of Implementation, and Top 10 Scored

The mitigation measures were selected by considering their applicability to VLDs. This partial mitigation list applicable to VLDs is in Appendix B.

In Figure 12, the percentages of the categorization of the mitigation measures are shown. As can be seen in this figure, the first category, "regulation and policy", accounts for almost 43% of the mitigation categories applicable to VLDs, whereas the fourth category, "tools and technologies", accounts for only 9.5%.

**Figure 12.** Mitigation categories for VLDs.

In Figure 13, the percentages of the ease of implementation considering the partial mitigation list applicable to VLDs are shown. In this figure, it can be seen that 57% of these mitigation measures can be implemented quickly. Only 10% of the partial mitigation measures are considered difficult to implement.

**Figure 13.** Percentages of total ease of implementation for VLDs.

Figure 14 and Table 5 show that most of the top 10 scored mitigation measures that are applicable to VLDs are considered to be easy to implement. Only the mitigation measure "ensure that electronic devices on drones (cameras, sensors, etc.) cannot be used to infringe on privacy" is considered to be hard to implement in a short time.

**Figure 14.** Mitigation scores for mitigation measures applicable to VLDs.


**Table 5.** TOP 10 VLDs mitigation measures.

#### **5. Discussion**

The application of actions to mitigate risks is the basis of the SORA methodology [21]. For instance, to reduce the energy of a falling drone, a common mitigation is the addition of a parachute. While the parachute will, in general, improve safety, it may also introduce new risks and failure modes, such as an undesired deployment of the parachute. We have to understand that any well-intentioned action may indirectly introduce adverse effects as well.

In the case of the proposed social concern mitigation list, we found a number of contradictory effects.

For instance, we proposed a number of mitigations related to the flight trajectory and noise, such as limiting hover time and flying direct routes, but also using alternate paths, avoiding certain areas, and limiting speeds. Although each of these helps to reduce noise on the ground, it is not possible to apply all of them at the same time. Trade-offs need to be elaborated to avoid long route deviations due to protected zones. Other route characteristics, such as altitude, time of day, and maximum capacity, play important roles in noise abatement, and they should all be taken into account together when selecting the best mitigation strategy for drone operations.

Another example is the location of vertiports. For safety reasons, vertiports should be located in isolated areas with few air and ground risks, but for economic reasons they should be close to transportation hubs (for persons and/or freight). Moreover, the high traffic density of a vertiport can generate a nuisance for neighbors. Using a building rooftop could mitigate this nuisance, but at the cost of increasing the flight risk. A split of opinions is clearly shown in Table 6, obtained from the workshop attendees' opinions. It seems clear that more research is needed to further develop this and some of the other proposed mitigations.


**Table 6.** Which factors should be considered as constraints for the location of vertiports?

A number of mitigations have been classified as "tools and technologies". Research on clean energy sources, artificial intelligence, and new materials is key to reducing societal concerns. A drone with low-noise propellers may be inaudible at 10–15 m of height, thus reinforcing the minimum-altitude noise mitigation. Especially relevant are the object avoidance technologies, currently based on near-infrared or ultrasound sensors, which only work at low speeds; future developments could help to avoid unexpected encounters (e.g., collisions with birds) at any speed.

Most societal concerns cannot be measured purely objectively, since human perception is highly subjective. A clear example is the experiment reported in the EASA survey [17] about noise. In a lab, a number of people were asked to rank a list of sounds according to how much of a nuisance they considered each to be. While all sounds were played at the same volume (80 decibels, louder than a vacuum cleaner), responses penalized unknown noise sources more than known ones. As the public becomes informed about and used to the characteristic noise of drones, this human factor will change. Moreover, according to [24], a VA-X4 taxi drone flying at 300 m produces a noise of 43 decibels, a loudness between that of a quiet urban night (40 dBA) and that of light urban traffic (50 dBA). A very interesting review of drone noise emissions and noise effects on humans can be found in [25]. Furthermore, the effect of drone noise on natural life, especially birds, seems to be a growing societal concern [17], but scientific studies show that certain frequencies, such as drone high-frequency noise, are not audible to most birds [26].

A number of the proposed mitigations can be adopted in future regulations. However, the role of governments must go beyond the regulatory aspects. Actions are needed to disseminate the benefits of drones as environmentally friendly vehicles with a capacity for the fast transport of people and goods, usable in emergency situations, and as a motor of a new cycle of economic growth. Simultaneously, initial support in terms of the infrastructure to be deployed (i.e., U-space) is needed to foster the new era of transport using drones. The development of this infrastructure still requires decisions about U-space airspace organization to be made. This remains a hot topic for research, as there does not appear to be any consensus based on the expert responses in Table 7.

**Table 7.** How should UAM flights be organized to mitigate ground risks and noise?


Another aspect that governments and authorities should address is fairness of access to airspace. Transparency is both a tool of fairness and a strategy for mitigating citizen concerns about privacy, according to the responses to the workshop poll shown in Table 8.


**Table 8.** To which point do you agree with the following sentence: *The ability of citizens to obtain information about drone flights in their vicinity would resolve privacy concerns*.

In exchange, drone operators must carefully monitor safety levels to be fully compliant with the regulations. With this paper, we hope to provide them with ideas to help improve the social acceptance of drone operations (and thus increase business), especially in urban environments. The authors' aim is to encourage drone operators to apply the most convenient mitigations to their operations, including dissemination actions and collaboration with researchers on new environmentally friendly technologies for drones.

#### **6. Conclusions**

Many governments believe that drone-related business can provide a competitive advantage for developing their country and are taking political and economic measures to foster drone business and urban air mobility. Drones are expected to be widely adopted by citizens in urban areas once some issues are addressed and resolved, the most important being safety and societal issues.

Safety issues are largely anticipated in traditional aviation in order to reduce risks to airspace users and to people, assets, and facilities on the ground. Safety is achieved through thoughtful airspace design, robust and certified industrial processes, and the use of operational mitigation measures, all supported by international regulation. Societal issues, in contrast, are sometimes overlooked before deployment.

This paper proposes to address societal issues in the same way as safety risks, by anticipating and reducing risks (public concerns) prior to deployment. Social acceptance can be facilitated by ensuring mitigation measures that prevent the negative impact of drones on citizens and on the environment. Public concerns must be identified, and actions that mitigate them implemented, well in advance of the widespread deployment of urban air mobility. The paper presented the main concerns of society with regard to drone operations, as already captured in some public surveys, and proposed a list of mitigation measures to reduce these concerns. The proposed list was then analyzed, and its applicability to individual very large demonstration urban flights was explained within the framework of the CORUS-XUAM project. The proposed mitigation measures concern not only drone operators but also regulators, educational bodies, other airspace stakeholders, infrastructure providers, technology and software developers, and research centers.

Future work includes the analysis of the application of mitigations in the six very large demonstrations of CORUS-XUAM to understand their impact and to fully consolidate the mitigation list proposed in this paper. It also includes new measures and scientific work to provide more detailed data for some mitigation measures, such as the perception of noise on the ground, which will help to suggest limits on altitude and speed. Future observations are needed to understand the interaction with birds, and the analysis of images captured during flights will be useful to estimate threats to privacy. Additionally, further work is needed to develop a comprehensive list of mitigation measures, identify regulatory gaps, propose suitable infrastructure deployment, and influence pilot training in the future. Social concerns need to be anticipated and mitigated in advance if urban air mobility is to become an accepted part of a modern, efficient, environmentally friendly, and competitive future mobility.

**Author Contributions:** The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

**Funding:** This project has received funding from the Ministry of Science and Innovation of Spain under grant PID2020-116377RB-C21 and from the SESAR Joint Undertaking (JU) under grant agreement No 101017682. The JU receives support from the European Union's Horizon 2020 research and innovation programme and the SESAR JU members other than the Union.

**Acknowledgments:** The authors want to thank to the participants of CORUS-XUAM workshop for their feedback.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

UAM Urban air mobility


#### **Appendix A. Full List of Mitigations**

The full set of mitigations proposed in this paper is as follows:


#### **Appendix B. Partial List of Mitigations Adapted to Very Large Demonstrations**

The following tables show the mitigations that we believe are applicable to flight demonstrations. These mitigations have been grouped according to the areas they influence.

**Table A1.** List of mitigation actions on the flight plan design to reduce the social concerns of demonstration flights.


**Table A2.** List of dissemination actions to reduce the social concerns of demonstration flights.

| Mitigation Action | Areas |
|---|---|
| General aviation pilots' engagement in activities about UAM | Fairness, safety, economy |
| Public engagement activities about drone technology and operations | Transparency |
| Disseminate the environmental benefits of drones and disseminate results (emission savings) | Transparency, economy |
| Disseminate the mobility and economic benefits of drones | Transparency, economy |

**Table A3.** List of mitigation actions applicable to drones that reduce the social concerns of demonstration flights.


**Table A4.** List of other mitigation actions applicable to drones that reduce the social concerns of demonstration flights.


#### **References**


### *Concept Paper* **The Use of Drones in the Spatial Social Sciences**

**Ola Hall \* and Ibrahim Wahab**

Department of Human Geography, Lund University, SE-223 62 Lund, Sweden; ibrahim.wahab@keg.lu.se **\*** Correspondence: ola.hall@keg.lu.se; Tel.: +46-73-374-7849

**Abstract:** Drones are increasingly becoming a ubiquitous feature of society. They are being used for a multiplicity of applications for military, leisure, economic, and academic purposes. Their application in academia, especially as social science research tools, has seen a sharp uptake in the last decade. This has been possible due, largely, to significant developments in computerization and miniaturization, which have culminated in safer, cheaper, lighter, and thus more accessible drones for social scientists. Despite their increasingly widespread use, there has not been an adequate reflection on their use in the spatial social sciences. There is a need for deeper reflection on their application in these fields of study. Should the drone even be considered a tool in the toolbox of the social scientist? In which fields is it most relevant? Should it be taught as a course in the social sciences much in the same way that spatially-oriented software packages have become mainstream in institutions of higher learning? What are the ethical implications of its application in spatial social science? This paper is a brief reflection on these questions. We contend that drones are a neutral tool that can be used for good and evil. They have actual and potentially wide applicability in academia but can be a tool through which breaches of ethics can be occasioned, given their unique ability to capture data from vantage perspectives. Researchers therefore need to be circumspect in how they deploy this powerful tool, which is increasingly becoming mainstream in the social sciences.

**Keywords:** drones; legislation; ethics; spatial social sciences

**Citation:** Hall, O.; Wahab, I. The Use of Drones in the Spatial Social Sciences. *Drones* **2021**, *5*, 112. https://doi.org/10.3390/drones5040112

Academic Editor: Diego González-Aguilera

Received: 6 September 2021 Accepted: 2 October 2021 Published: 6 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

The term drone (for the purposes of this paper, whenever we use the term drones, we are referring to unmanned aerial systems used to capture photographic data in the social sciences; our focus is therefore the photographic product that can be collected from drones, not the devices themselves), originally referring to the male bee, is the everyday name for autonomous aircraft. Drones tend to evoke memories of warfare as they were first flown during the First World War, during which they were often launched using a catapult. Since then, they have been used as tools for reconnaissance, for deploying propaganda leaflets, as decoys for missile launches, or even as actual combat platforms in multiple theaters of war. In more recent times, drones have become household names for their non-military uses. There have been several attempts to delink them from their militaristic past. These efforts include popularizing more civilian-leaning names for these systems. To this end, common monikers in the academic literature include unmanned aerial vehicle (UAV), remotely-piloted aircraft (RPA) or vehicle (RPV), unmanned aerial system (UAS), or the recent, gender-neutral form, uncrewed aerial vehicle (also UAV). Notwithstanding these efforts, the traditional terminology has stuck, and so even now, major industry players offer 'drones' and 'mini-drones' as flagship products.

Within the last decade, drones have become a much more common feature of life. This is largely attributable to the fact that they have become relatively cheaper to manufacture than just two decades ago, mainly due to significant technological advances in computerization and miniaturization. The former has exponentially increased the processing power of computers and researchers' ability to process data from drones, even on low-cost laptops, while the latter has dramatically reduced the cost of production with less expensive components such as carbon fiber. Increasing civilian use of drones has also been accompanied by improvements in features aimed at augmenting safety. These features include obstacle-avoidance and vertical take-off and landing (VTOL) systems. The latter allows take-off and landing of the drone even in challenging terrain, while the former helps prevent mid-flight collisions with other aircraft, trees, or buildings.

The applications of drones are expanding just as their use in multiple facets of life grows. In addition to being used as leisure tools, drones have several applications in weather forecasting, search and rescue operations, disaster management, crowd control, and the delivery of vaccines and blood for transfusion, among others. One area that has also begun seeing increased application of drones in the last decade is academia. In this regard, both the physical and the social sciences have found them to be useful tools. Some disciplines, by virtue of their subject matter and focus, find the incorporation of drones into their research agenda easier than others. For instance, in the physical sciences, the subject matter is often physical, and the unit of analysis could be rocks, stars, plants, or animals. Studying such subjects using drones is easier and more straightforward than in the social sciences, in which the subject matter is society itself and the unit of analysis is often humans. The latter are more complex units of study because they are sentient, conscious, and can modify their behaviour when under study. Related to this are ethical dilemmas, as well as questions about objectivity when we study our own, as well as other, societies. Given our biases, both declared and undeclared, recognized and unrecognized, there is often the need for greater reflexivity in social science studies, especially in collecting data.

Generally, social scientists employ a range of tools and methods to collect data for their studies. These methods include experiments, surveys, interviews, focus group discussions (FGDs), participant observation, life histories, and document analysis, among others. While each of these methods has its strengths and weaknesses, their appropriateness depends on the context of the study. Some social science disciplines are more inclined to rely on visual methodologies than others. The social science disciplines of anthropology, archaeology, economics, geography, history, and sociology have found drones to be useful research tools. Of these, anthropology and geography are unique in their reliance on the visual and on visual images to construct their knowledge [1]. Given drones' advantage of providing access to a bird's-eye view of geographical space [2], and geography's preoccupation as a spatial science, drones have found a much more accepting audience among geographers than among other social scientists. Even within geography, not only are drones helping to bridge the gap, but they are also offering new opportunities for collaborative research between human and physical geographers, given that these two subdisciplines often approach the application of drones differently [3].

In this conceptual paper, we aim to discuss the use of drones as social science research tools. In this vein, we focus the discussion on three main themes: teaching their use as a course in the social science faculties of universities, the legislation governing their use across countries, and the ethical and political hurdles that need reflection in their application, particularly in the social sciences. While the civilian use of drones for surveillance and policing to fight crime is generally socially acceptable [4], there is often a certain level of uneasiness about, and strong pushback against, a continuous, universal, all-seeing flying big brother in the sky [5], given the potential for abuse and concerns for privacy. These reflections are critical as drones continue to become mainstream tools in the toolbox of the social scientist.

#### **2. Drones as Social Science Research Tools**

The application of drones in the social sciences as data collection tools comes on the back of the use of satellite imagery in the same endeavours. The latter can be traced to the mid-1990s, when the National Aeronautics and Space Administration (NASA) approached the research community to realize the potential of satellite imagery to specifically address questions with which social scientists are preoccupied. Notwithstanding the high expectations from this collaboration expressed in *People and Pixels: Linking Remote Sensing and Social Science* [6], the results have been meagre and their added value questioned. Many of the difficulties that limited the success of remotely sensed satellite imagery (the coarse resolution of most readily available datasets, the challenge of cloud cover, particularly in the tropics, and limitations in temporal resolution) have persisted, despite the significant strides made in this area in the last two decades. It is on the back of these challenges that other platforms have been proposed as alternatives to satellites for collecting critical data about the earth's surface. As the third-generation remote sensing platform (with piloted aircraft as the first generation and earth-orbiting satellites as the second [7]), drones are proving much more ubiquitous in terms of their application in scientific research.

There are, of course, substantial differences between drone data and satellite imagery and, as such, the two are not directly comparable. An important differentiating factor is the scale of application. While satellites are ideal when a macro view of the terrain is needed, due to their larger spatial coverage, drone imagery is better suited to a micro view of the landscape, given its higher, centimetre-level resolution. Some studies have catalogued the pros and cons of each platform and shown where each performs optimally [8,9], while others have been preoccupied with integrating them in a synergistic manner [10,11]. The general trend with regard to spatial resolution is a continuous increase, with some satellite platforms now offering sub-metre resolutions. This opens the door to applications that were hitherto virtually impossible. The recent use of 1 m resolution Terra Bella satellite imagery for measuring smallholder productivity in Western Kenya is a case in point [12]. Thus, each platform, and the data it generates, meets specific needs. In some cases, however, drone data can be up-scaled to cover larger areas. This, of course, raises questions of cost-effectiveness [13].
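The centimetre-level claim can be made concrete with the standard ground sample distance (GSD) relation. The short Python sketch below is our illustration, not the authors'; the camera parameters (13.2 mm sensor width, 8.8 mm focal length, 5472-pixel image width) are assumptions resembling a common consumer quadcopter camera.

```python
def ground_sample_distance(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """GSD in metres per pixel for a nadir-pointing camera:
    GSD = (altitude * sensor width) / (focal length * image width)."""
    return (altitude_m * sensor_width_mm / 1000.0) / ((focal_length_mm / 1000.0) * image_width_px)

# Assumed camera: 13.2 mm sensor, 8.8 mm lens, 5472 px image width
for altitude in (50, 100, 120):
    gsd_cm = 100.0 * ground_sample_distance(altitude, 13.2, 8.8, 5472)
    print(f"{altitude} m altitude -> {gsd_cm:.1f} cm per pixel")
```

Under these assumptions, the GSD stays between roughly 1 and 3 cm per pixel at typical flight altitudes, which is why drone imagery resolves detail that even sub-metre satellite products cannot.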

Drones and drone imagery position the researcher and the objects of interest closer together, both physically and conceptually. Compared to satellite imagery, the low flying altitude, small area coverage, and detailed visuals that drones offer create a familiar perspective, closely related to traditional fieldwork. Unlike satellite imagery, drone imagery is usually ready to be used as a map base or in photo-elicitation interviews [14–16]. It can also be processed, classified, and analyzed in a more conventional remote sensing way [17,18]. Thus, drone imagery can either be used on its own merit or to improve the quality of data produced by other, more conventional methods of data collection in the social sciences. For instance, van Auken, Frisvoll [19] enumerate the advantages that photo-elicitation interviews have over more traditional social science research tools: they provide tangible stimuli that tap more effectively into informants' tacit, and often unconscious, knowledge and their consumption of representations, images, and metaphors, thereby producing different and richer information than other techniques, while also helping to reduce differences in power, class, and knowledge between researcher and researched. In the developing world, which is invariably the 'data-poor' world, drone imagery has proven to be an indispensable tool for research.

The geographical applications of drones are perhaps more widespread than those in other fields. In the sub-field of physical geography, drones have gained wide acceptance for studying rock weathering [20,21], for riverbed monitoring [22,23], and for restoration [24]. Here, studies have progressed beyond proofs-of-concept to real-world applications in geomorphological change detection and mapping, vegetation mapping, habitat classification, and sediment transport path delineation [25]. Further downstream, Callow, May [26] used a drone to generate high-accuracy, centimetre-resolution digital topographic models, which offer insights into the likely consequences of inundation and the dynamics that control low-gradient sedimentary landforms. It is not surprising that geography in general, and the subdiscipline of physical geography in particular, were always going to be more accepting of drones, due mainly to their 'vertical' and 'visual' character. The proliferation of drones as research tools, however, avails further opportunities for intra-discipline collaboration between physical and human geographers [3].

In agricultural geography, drone applications include mapping crop condition and yield estimation [18,27,28], crop classification [17], seedling emergence assessment, crop damage assessment, and weed detection and mapping [29]. In general terms, drones have been heralded as the right tools for making agriculture smarter, especially in Sub-Saharan Africa (SSA), where the application of the first- and second-generation remote sensing platforms has met with largely limited success. This limited success is due to factors such as acquisition costs, cloud cover, and low spatial and temporal resolutions vis-à-vis the predominance of small farms in SSA. Multiple reviews, such as those by Daponte, De Vito [30] and Puri, Nayyar [31], have chronicled the use of drones in the field of smart agriculture. Iost Filho, Heldens [32] more specifically reviewed the application of drones as noninvasive crop monitoring systems in precision pest management. Three main niches exist in this subfield for drone application: (1) scouting for problems; (2) monitoring crops to prevent/reduce losses; and (3) planning crop management operations [33]. Similarly, Barbedo [34] offers a more comprehensive and critical review of the use of drones in this area, chronicling the major milestones, the main research gaps, and the possibilities for future research applying even newer machine learning techniques to drone image analysis.

This is, however, not to assert that drones are not already useful research tools in the other subdisciplines of human geography. In tourism studies, for example, the usefulness of drones continues to grow. Here, drones are being used for monitoring and patrolling tourism activities for safety and security, as well as for virtual tourism systems [35]. For virtual tourism, drones serve as destination marketing tools, producing large amounts of visually appealing footage of various destinations [36]. On the part of tourism service providers, major considerations regarding economic viability and operational feasibility need to be addressed in order to deploy drones efficiently in the tourism sector [37]. While tourists tend to have a better appreciation of the potential uses of drones than managers of tourist centers, there is a need to set boundaries on what is acceptable [38]. Beyond tourism studies, drones are finding applications as research tools in cultural geography, health geography, rural geography, transportation geography, and urban geography, among others.

In the area of environmental geography, for instance, the use of community drones for natural resource management and conservation is a strongly growing niche [39–42]. Much of the work in this area emanates from Latin America and, to a lesser extent, South-East Asia. Drone applications are most useful where study locations are difficult to access physically. A number of reviews, such as those by Paneque-Gálvez, Vargas-Ramírez [43], Canal and Negro [44], and Beaver, Baldwin [45], have emerged in this area, pointing to the current and potential contributions that drones can make to natural resource management. Cummings, Cummings [46] go a step further to demonstrate how drones can be adopted in settings dominated by indigenous peoples in a collective and concerted manner. Following such a collaborative approach can help build mutually beneficial relationships, as it respects indigenous culture and customary norms, which in turn augurs well for the sustainable monitoring and protection of natural ecosystems. Vargas-Ramírez and Paneque-Gálvez [39] provide a broad overview of this growing field of community drones, finding that local knowledge is often neglected or undervalued, and emphasizing the need to recognize indigenous peoples' territorial rights. Done well, participatory action mapping (PAM) using drones can be useful for bolstering the political and legal claims of indigenous communities to counteract land grabs by foreign entities [47,48]. Conversely, unintended negative consequences of PAM include fragmentation and conflicts among indigenous communities and the facilitation of land acquisitions, either by the state or corporations, following legal recognition [49]. There is also the need for researchers engaged in PAM to pay attention to the digital divide that often exists between them and indigenous communities, a situation symptomatic of broader socioeconomic and political inequalities that are largely legacies of colonialism [50].

In archaeology, drones are becoming increasingly useful for studying previously unrecognizable features. For example, Cucchiaro, Fallu [51] demonstrate that drone orthomosaics provide accurate, highly detailed records of terrace landscapes, archaeological features, and sediment stratigraphy along excavation trenches that were previously unobserved. Similarly, Brown, Walsh [52] show the beneficial use of drones for mapping multi-faceted terraces under intensification and diversification. In landscape archaeology, Stek [53] demonstrates the utility of drones for detecting previously undocumented subsurface archaeological artifacts in mountainous Mediterranean landscapes. Campana [54] provides an excellent review of the application of drones in archaeology and delineates five main areas of application: exploratory aerial surveys, surveys of archaeological sites and landscapes, three-dimensional (3D) documentation of excavations, 3D surveys of monuments and historic buildings, and archaeological surveys of woodland areas. Just as in tourism studies, the application of drones, which makes hitherto unobservable sites accessible, also requires safeguards to ensure that aerial photos do not contribute to the looting and destruction of heritage sites [55].

Different fields incorporate drones into their fieldwork and studies to varying extents. The broad field of geography is, however, relatively more predisposed to employing drones than other fields because it uses the full sweep of quantitative and qualitative methods and places greater emphasis on fieldwork and mapping. This is underpinned by its special focus on spatial analysis and areal differentiation. Perhaps, through more widespread teaching of drones as a stand-alone course in the social sciences in universities, other fields might come to realize their value and how they can be adopted and utilized to suit each discipline's peculiar needs.

#### **3. Teaching Drones in Higher Education**

The teaching of drones in institutions of higher learning is fast catching on, notwithstanding the substantial capital outlay it involves due to the infrastructure demands it entails. The teaching can broadly be categorised into two main areas: (1) teaching drones as hardware, including the development of technical improvements to their navigation systems; and (2) teaching drone-based data capture and processing. The first is the kind of work that the more technical departments of universities, such as engineering, already do, and it is not the focus of the present paper. Here we limit ourselves to the social sciences, and thus to the second: the teaching of the capture and processing of drone-based data.

In this era of rapid data collection, drones have emerged as a well-established geospatial technology for collecting and analysing primary remote sensing data. In terms of importance, they are poised to be as revolutionary for geography as the spatially-oriented software packages that are now mainstream. They offer a method for collecting and accumulating data from strategic viewpoints [56] and at such fine spatial resolutions that many social science disciplines can benefit from this vantage perspective. Given their ubiquity in society and increasing applications, even in the social sciences, there is an increasingly obvious need for a dedicated course on drones in research-oriented universities. To be fair, much of what we propose here, in terms of teaching drones, is already being done by many engineering departments across many universities. Our focus relates to flying the drones, capturing spatial data in photographic format, processing these into orthomosaics, applying these in the social sciences, and the ethical implications that this entails. It is our position that the reflexivity and reflections that arise when social scientists undertake these processes themselves are markedly different from those that arise when the orthomosaics are presented to them for analysis. Hence the need to teach these in universities to social science students, even if photogrammetry has been taught to engineers for years. Herein lies the gravamen of the argument for teaching drones as an important tool in the toolbox of the social scientist.

Effective teaching of the principles and applications of a field such as dynamic geographic information systems and technology in higher education is usually challenging [57], and drones are no exception. This is partly due to the constant change to which this niche of study is subject. Effective teaching of drones in institutions of higher learning needs to overcome two fundamental issues: teachers need to adopt and adapt new paradigms and tools while keeping up to date with newer trends in the field, and they must also develop effective methods for transferring the new competencies to students. The teaching of drones, especially when it encompasses image acquisition, data processing, and interpretation, has been shown to significantly enhance students' data processing skills while strengthening their competence in handling data quality issues [58]. Consequently, this field is usually at the cutting edge of teaching and learning approaches, with traditional approaches giving way to more modern methods.

In recent times, more traditional approaches, such as pen-and-paper lectures and laboratory exercises, have been giving way to more active learning strategies such as 'flipped classrooms' [57]. Such participatory and collaborative approaches lay good foundations among students for participatory action research and popular education approaches, which ensure community participation and the cultural appropriateness of the methods employed in data collection using drones [39]. These, however, often require further training not only on the part of students but of instructors as well. On the part of instructors, there is often the need to allocate extended periods of preparation for classes to keep abreast of software updates as well as new trends and developments in the general field of geographic information science and technology [57] and the specific field of image analysis, especially using the artificial intelligence techniques of machine learning and deep learning on big data. Holloway, Kenna [59] further posit that, with regard to new technologies, such new approaches foster teamwork and peer-to-peer learning, and positively reinforce the uptake of such technologies in fieldwork.

Already, some institutions are setting the pace in teaching drones at both undergraduate and advanced levels. In the United States, the Drone Journalism programmes at the Universities of Missouri and Nebraska-Lincoln are pacesetters in drone studies. In such programmes, students are taught not only the technical skill of flying a camera-mounted drone to collect aerial data, but also the ethics of collecting data on people in public places, the legal, safety, and regulatory frameworks across various jurisdictions and areas, and the analysis of aerial data [56]. For instance, flying regulations are different within a 2-km radius of an airport than they are for a rural area [3]. For safety reasons, special rules also apply to flying altitudes, with about 100 metres often considered a safe height. Elsewhere, in continental Europe, the Oslo School of Architecture and Design is considered an early pioneer in the teaching of drones as a course [60]. Similarly, Lund University has an aviation school which specializes in the training and certification of drone pilots and is in the process of acquiring the necessary credentials from the country's transport administration. Other research-oriented universities should follow these early innovators in this endeavour. The main challenge for instructors is to cover the three fundamentals of remote sensing, these being planning, data collection, and image analysis, while minimising the logistical and practical issues associated with the actual flights [61]. A further hurdle is securing the necessary certifications to be able to train pilots within the existing legislative framework.
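The airport-proximity and altitude rules just mentioned lend themselves to a simple automated pre-flight check. The sketch below is illustrative only: it reuses the 2 km airport radius and the ~100 m ceiling cited above as thresholds, the coordinates are invented, and actual limits vary by jurisdiction.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

AIRPORT_RADIUS_M = 2000    # stricter rules within ~2 km of an airport (per the text)
MAX_ALTITUDE_M = 100       # ~100 m often treated as a safe ceiling (per the text)

def preflight_check(drone_lat, drone_lon, altitude_m, airports):
    """Flag flights that fall under stricter airport-proximity rules
    or exceed the commonly cited altitude ceiling."""
    near_airport = any(
        haversine_m(drone_lat, drone_lon, lat, lon) < AIRPORT_RADIUS_M
        for lat, lon in airports
    )
    return {"near_airport": near_airport, "altitude_ok": altitude_m <= MAX_ALTITUDE_M}

# Hypothetical field site checked against one nearby airport
print(preflight_check(55.715, 13.205, 80, [(55.738, 13.227)]))
```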

Those institutions that do not have the infrastructure, or which are in jurisdictions where private drone use, even for academic purposes, is significantly restricted, could liaise with established pilot schools to train students in the technical aspects of flying while they focus on the theoretical, philosophical, ethical, and methodological aspects of drone use in the social sciences. The onus falls on geographers, both physical and human, to actively engage with this technology and lead cross-disciplinary discussions not only on the processing of drone data but also on the ethical implications of its use. Collaborating with specialized flight training schools thus helps to overcome barriers relating to certification and licensing.

#### **4. Legislation on Drone Use across Countries**

At the core of the legislation regulating drones is the need to ensure safety and minimize harm in the use of drones in civilian airspace. Professional drone use necessarily needs to be guided by a multiplicity of regulations at the national, regional, and even local levels. Regulations often relate to the flying of the drone itself, the safe and secure management of communication to and from the drone system, and the ethical issues arising from the acquisition, processing, and dissemination of drone imagery. Underpinning the first two is the need to prevent airspace conflicts and interference with commercial airport systems. The third is concerned with preventing breaches of confidentiality, privacy, and the safety of people and of locations of national security importance.

Much like other technological innovations, regulation has been playing catch-up with the proliferation and use of drones [62]. Different countries have reached different stages of regulatory development for drones, with most countries, especially in Africa, having developed their national guidelines within the last half decade. Even among OECD countries, there is substantial heterogeneity in national legal frameworks on drone regulation [63]. Regulations on the use of drones are necessary due to the potential for breaches of privacy, data protection, and public peace [64]. Regulations relating to licensing and operations therefore vary significantly across countries, even though a substantial proportion (40 to 85%) of the provisions of legislation governing drone usage is sourced from the manual of the International Civil Aviation Organization [64]. Even where regulations have been harmonized, as is the case with the European Union (new regulations came into force in January 2021), stakeholders often find them cumbersome due to administrative and bureaucratic complexities in their interpretation [65]. The regulatory field will most likely continue to be characterized by fluidity for the foreseeable future. There are a few data repositories for information on drone regulations worldwide. A useful portal for the most up-to-date information on drone regulations for various jurisdictions can be found at https://www.droneregulations.info/index.html (accessed on 16 June 2021). Here, one can access the specific websites of the respective national authorities responsible for licensing and for issuing guidelines and regulations for drone pilots. The portal thus serves as a one-stop shop for the most updated laws on the use of drones in each country.

A major challenge relating to the regulations is the restrictions they tend to come with. This is particularly true when regulatory agencies fear that lives could be at risk. In such circumstances, there is a tendency toward broad restrictions that limit the adoption and use of drones even for academic research [64,65]. These barriers are sometimes purely financial. For example, Kenya Airways charges an entry cost of about USD 1600 for a month-long course to obtain a license to fly a drone in Kenya [66]. This excludes other charges, such as the cost of a medical examination. The initial license costs some USD 720 and is renewable at a fee of about USD 460 [67]. This area is, however, in constant flux, as countries regularly review regulations to improve the ease of using drones in their jurisdictions. In the United States, a pioneer in this area, drone operators are no longer required to pass a medical examination or hold liability insurance, for example. Drone pilots are only required to pass an aeronautical knowledge test rather than acquire any form of pilot's license.

Other portals exist to check drone operations and report incidents involving drones. The most popular of these is the drone incidents and intelligence system https://www.drone-detectives.com/ (accessed on 7 July 2021). The primary purpose of this portal is to safeguard public safety by allowing private individuals to report dangerous drone activities and to file accident reports involving drones. The details one can report include the date and time of the incident, the type/model of drone involved, and the altitude at which the drone was flying, as well as the specifics of the incident, such as proximity to airport airspace or military installations. Such reports are useful for regulatory institutions in their monitoring activities. Apart from showing incidents involving drones, Drone Detectives is also useful for noting the various no-fly zones in all countries. These are usually over military installations, airport airspace, and public spaces such as parks and stadia.

Thus, while the fundamental role of these national regulations is to ensure public safety and security, some of the rules will have to be relaxed as drone features such as obstacle sensing and avoidance systems improve. Legislation on drones often has three main aims: (1) to regulate the use of airspace; (2) to impose operational limitations; and (3) to outline administrative processes for permissions, licenses, and authorizations [64]. The overall aim is therefore safety and security. The enactment of such rules is fundamental to further tapping the potential benefits that drones bring to the various fields.

Several studies and reviews have been carried out in this area of regulations governing drone use and their implications for the industry. In the last year alone, Alamouri, Lampert [65] provided an overview of recent updates to drone regulations in the European Union and showed how regulations can both help and hinder the use of the technology. Similarly, Hodgson and Sella-Villa [68] provide a review of the regulatory regimes in the United States, with particular reference to their application to academic research. They further highlight the complexities relating to restrictions on flying over critical infrastructure, such as security installations whose locations are classified for security reasons, as well as recommendations on how researchers can obtain exemptions from often sweeping restrictions. In the African context, Ayamga, Tekinerdogan [64] provide a review of the challenges that regulations pose for drone adoption and application, with a specific focus on agriculture. They argue that while political commitment may be present in most Sub-Saharan African countries, regulation is often hampered by inadequate capacity to develop and enforce drone rules.

#### **5. Ethical and Safety Considerations**

In terms of safety, drones are generally considered safer than piloted aircraft for two main reasons: first, they are not piloted, so there is minimal risk of harm to the human controller in the case of a crash; and second, they do less damage on crashing because they are relatively small [69]. Modern drones also come with more safety features, such as obstacle avoidance systems and return-to-launch buttons on controllers, than their predecessors. This notwithstanding, drones raise some heightened safety concerns due mainly to their pilotless nature [70]. How safe a particular drone system is depends, to some extent, on the drone configuration. For instance, rotary-wing drones tend to fall stone-like in cases of rotor failure, while their fixed-wing counterparts tend to descend more gracefully. It is for this reason that drone licenses and flight permissions depend on the type of drone.

Beyond safety are the ethical implications of research done using drones. The main issues of concern when discussing ethics in drone research are not markedly different from those that arise when using conventional research techniques such as interviews, surveys, and FGDs, among others. Indeed, the key issues of privacy, confidentiality, and consent remain fundamental. The distinction between the private sphere and the public domain is critical. While drone data collection does not involve human test subjects *per se*, it often involves the observation of public places of which humans are an intrinsic part. In such contexts, it would be required that the data be recorded in a manner such that individuals are not personally identifiable and, if they were identifiable, disclosure of their identity outside of the research environment would not place them at risk of any harm [71]. Studies that do not meet these thresholds may be subject to institutional ethics restrictions. On private property, however, studies will necessarily require consent from individuals to pass the ethics requirement. It is, for instance, not inconceivable that a drone captures an individual engaging in an illegal activity that would make them liable to criminal prosecution. The possibility of such accidental breaches of people's privacy means that drone operations over the private domain often require researchers to obtain informed consent. Conversely, Sella-Villa [72] argues that drones are primarily data collection devices whose impact on privacy is rather limited, as they are not substantially different from other camera-equipped technology. From this perspective, if a photograph is taken, the platform used is largely irrelevant. This notwithstanding, certain unique characteristics and qualities of drones mean that their use as data collection tools in the social sciences brings key ethical concerns to the fore.

Issues regarding ethics in drone research, as in many other fields, are not straightforward and are often riddled with inconsistencies and contradictions; what is private can quickly become public and vice versa. For example, can people have private moments in a public park, and how does one draw the distinction? There remain many grey areas and a lack of universality in the principles surrounding these requirements. For instance, what constitutes private information? While the airspace may reasonably be considered public space, would flying a drone over a farmer's field in open view require consent? What happens if they were growing marijuana in this field? Would institutional ethics review committees require researchers to gain informed consent for such drone operations? Moreover, obtaining informed consent from individuals in a study using drones can be a daunting task. It becomes impractical where, for instance, the study involves an indefinite number of people in a village. In such a scenario, a community-wide forum prior to data collection becomes prudent. Through this, researchers can inform community members of what kind of information is to be collected and assure them of the protection of their anonymity, privacy, and confidentiality [71]. This could engender public trust and buy-in for drone projects, which is especially critical in resource conservation, in the interest of the long-term sustainability of projects long after researchers leave research communities [69].

For all their ubiquity in the last few years, the capabilities of drones make them an excellent tool for surveillance, since they capture data from vantage perspectives inaccessible to other technologies [72]. There is, therefore, the need to be circumspect when applying them to data collection in the social sciences [70]. There is general agreement that researchers who employ drones to collect data should ideally submit their proposals to institutional review committees or some other oversight body for vetting, to ensure compliance with ethical and regulatory standards [71]. While most drone studies, based on current ethical requirements, would qualify as exempt from such stringent reviews due to the minimal risk of harm to human subjects, researchers should nonetheless be aware of the ethical breaches that the collection of data in the public domain can occasion.

#### **6. Conclusions**

The last decade has seen a significant uptake in studies that use drones either as supporting tools or as the primary method of data collection. Drones can play important roles in mixed methods, especially in the areas of natural resource conservation, agriculture, and tourism, among others. In this paper, we have discussed the increasingly widespread application of drones as tools for research in the social sciences. Given the unique capabilities of drones, there is the need for adequate ethical consideration when using them in research. While they hold enormous potential in multiple fields of study, certain fields, such as geography and archaeology, are already more inclined to their application than others. In archaeology, drones have enabled hitherto unobserved artifacts to become accessible to researchers. This has both positive and negative implications for heritage sites and indigenous populations. The application of drones in the fields of tourism studies and archaeology thus requires additional reflexivity to ensure that their use does not contribute to the exploitation and looting of sites that were hitherto inaccessible. These considerations should fit into broader national guidelines and regulatory frameworks, which should, in turn, be streamlined and made less cumbersome to engender compliance. Certain barriers, which in most countries are financial, do not augur well for the adoption of the technology and the reaping of the full benefits of its application.

Drones are already an inexorable part of society, and so spatial social scientists should engage actively with this tool and with debates on its application in their increasingly dynamic toolbox. This will ensure that the benefits inherent in the use of drones are maximised without exacerbating the possibility of breaches of ethics. A key plank of this engagement is the teaching of specialised drone courses in institutions of higher learning. Such a course will not only help students acquire the technical skills to operate drones but also help them explore ways in which the tool can be applied in their own research specialisations, as well as enable them to engage critically with the ethical dilemmas inherent in its application in the social sciences. This latter discussion is critical as drones become cheaper, safer, more accessible, and an integral tool in the toolbox of the social scientist.

**Author Contributions:** Conceptualization, O.H. and I.W.; methodology, O.H. and I.W.; formal analysis, O.H. and I.W.; resources, O.H.; data curation, I.W.; writing—original draft preparation, O.H. and I.W.; writing—review and editing, O.H. and I.W.; Project administration, O.H. Both authors read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding; the APC was funded by Lund University.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **On the Dominant Factors of Civilian-Use Drones: A Thorough Study and Analysis of Cross-Group Opinions Using a Triple Helix Model (THM) with the Analytic Hierarchy Process (AHP)**

**Chen-Hua Fu 1, Ming-Wen Tsao 2, Li-Pin Chi <sup>3</sup> and Zheng-Yun Zhuang 4,\***


**Abstract:** This study explores experts' opinions during the consultation stage before law-making for civilian drones. A thorough literature study is first undertaken to establish the set of influencing factors suitable for investigation from the perspective of designing and selecting civilian drones. Several rounds of surveys using the Delphi method, followed by an analytic hierarchy process (AHP), are performed to confirm the organized tree structure of constructs and factors and to obtain knowledge about the opinions of the expert groups, with the expert sample intentionally partitioned into three opinion groups at the outset: academia (A), industry (I), and research institutes (R). Doing so facilitates a "mind-mining" process using the triple helix model (THM), while the opinions across the groups can also be visualized and compared. This yields a new set of knowledge for the design and selection of civilian drones on a scientific yet empirical basis, and the observed differences and similarities among the groups may benefit their future negotiations when proposing drafts for regulating the design, manufacturing, and use of civilian drones. Several significant implications and insights are drawn from these results, and some possible research directions are identified as worthwhile. The proposed hybrid methodological flow is another novelty.

**Keywords:** drones; civilian use; factors; design and selection; law-making; mind-mining; expert groups; literature study; triple helix model (THM); analytic hierarchy process (AHP)

**Citation:** Fu, C.-H.; Tsao, M.-W.; Chi, L.-P.; Zhuang, Z.-Y. On the Dominant Factors of Civilian-Use Drones: A Thorough Study and Analysis of Cross-Group Opinions Using a Triple Helix Model (THM) with the Analytic Hierarchy Process (AHP). *Drones* **2021**, *5*, 46. https://doi.org/10.3390/drones5020046

Received: 13 April 2021 Accepted: 20 May 2021 Published: 26 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

With the mature development of the Internet of Things (IoT) and mobile communication technology, especially 5G, civilian drones are widely used in many applications. The drone industry is developing rapidly, bringing drones with higher technical capabilities and lower costs to the market. Combined with their ease of use, drones could become primary players in the field of surveying for commercial, government, and scientific entities [1].

A forecast of global commercial and private drone market sizes indicates that in 2020, the global drone market generated almost 22.5 billion USD in global revenue. By 2025, it is expected to generate over 42.8 billion USD, with annual growth projected to exceed 13.8% [2]. The forecast also shows that the market will keep growing, with civilian consumers as the major driver, which means that many drone-based applications will become available to enterprises and other organizations in the future. Therefore, for the government, how to control the use of civilian drones becomes a critical issue. On the other hand, for the aeronautics industry, the design of advanced civilian-use drones has also become a hot topic valued by many researchers and manufacturers. A nexus of these possible conflicts is law-making. For example, Sah et al. [3] have pointed out that regulations are the most critical barrier to implementing drones in the logistics sector, an important application domain of civilian drones. This highlights the significance of drone regulations.
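As a quick arithmetic sanity check on the cited figures (ours, not the authors'), growing from 22.5 to 42.8 billion USD over the five years from 2020 to 2025 implies a compound annual growth rate of

$$\left(\frac{42.8}{22.5}\right)^{1/5} - 1 \approx 0.137,$$

which is consistent with the projected annual growth of roughly 13.8%.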

In Taiwan, for example, as in many developed countries or regions, many drone applications have gradually emerged recently. Many enterprises and organizations could use drones to perform different kinds of jobs for their business. Given the gradual rise of demand for drone applications and regulations, the Civil Aeronautics Administration (CAA), Taiwan R.O.C., is planning relevant regulations and certification mechanisms to manage drone applications.

However, most recent studies related to drones have explored issues of applications, technologies, markets, safety, privacy, and others. Some have also explored the cooperative relationships among industry (I), government (G), and academia (A) regarding military and civil drones using the triple helix model (THM) [4].

The THM is the key theory for this study. A THM usually refers to a triangle that involves the cooperation of the I, G, and A parties (i.e., the vertices) to achieve some certain objective (O), e.g., designing a large (mega) construction project in the public interest, as shown in the left model in Figure 1. It is a "conceptual model" because a party's name is not uniform, e.g., party I is sometimes called "Business" (B) and party A is sometimes called "University" (U), varying case by case. Although the names vary, the interchangeable names usually connote the same party without ambiguity. Given this model, the relations and actions between each pair of parties can be analyzed along the corresponding edge of the model. Moreover, the intersections between the three parties may form yet another interesting matter to be observed. In this study, however, the THM is slightly modified for the research purpose, and the relations, intersections, and gaps between the group opinions that affect law-making actions are analyzed.

**Figure 1.** From conventional THM model to the proposed "modified THM".

Before drafting the relevant regulations for law-making, the CAA, the institution in charge (of the G party), consulted and collected the opinions from the other three parties, namely, academia (A) (e.g., the university-level academics), industry (I) (e.g., the largest aeronautic manufacturer), and research institutes (R) (e.g., the largest government-funded aeronautic R&D institution), in Taiwan, R.O.C. These three parties, namely, A, I, and R, may constitute another meaningful triangle in THM rather than industry, government, and academia (I, G, and A). Problems related to designing civilian drones and making reasonable legislation can be dissected and studied using the proposed and renewed THM model. In other words, among the players in the THM model, G is replaced by R, and such a replacement should be reasonable because of the following two main reasons.

First, the original THM was used to explore military-, civil-, and dual-use drone development issues [4], so it is based on the cooperative relationship between I, G, and A. For this study, however, G usually entrusts A, I, and R to draft the relevant regulations, and G is then responsible for legislating and implementing the law later. This is because the R party continuously performs cutting-edge research on drones, so it can also give G more suggestions about newly developing directions, limits, and applications of drones. Hence, the players in the THM change, i.e., in terms of which three parties form the "triple", as required by the nature of the main aim.

Second, according to the surrogate model [5] in the engineering domain, replacing G with R to explore the design factors of civilian drones is also reasonable, because R, if properly selected, also represents G. Consider the largest domestic R&D institution, chosen as R for opinion collection in this study: could such an institution not represent G, given that it is government funded? In other words, in the A-I-R THM, R can be carefully chosen to concur with the original structure of the I-G-A THM without breaking its rule for G. These are the reasons why we replace G with R in the "modified THM" in this study. For clarity, in this text, THM refers to this *modified THM model* in most cases.

Additionally, in this study, following the above, we argue that a systematic study of the *design factors* of civilian drones for makers (or, equivalently, the *selection factors* for users) may help to *understand the opinions of these three parties in the THM* and to *clarify the differences in their opinions*. Such knowledge may *inform the emphasis of the subsequent advice* given to the government *for law-making*.

To begin with, a set of design factors is clearly established through a thorough literature study, and the Delphi method can be used to confirm whether this set of factors is effective. From the user's standpoint, civilian drone selection is a typical multiple-criteria decision-making (MCDM) problem. The analytic hierarchy process (AHP) is a very suitable approach for understanding how each group of experts (i.e., the three parties) perceives the importance of these factors (i.e., for mind-mining [6,7]). That is, in terms of MCDM, civilian drone selection can be treated as a multi-attribute decision-making (MADM) problem that involves many consideration factors and a priority among them (i.e., their significance). These consideration factors are the "decision criteria" in MCDM terms, so the information about the "criteria weight vector" (CWV) can be obtained using AHP.
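To illustrate how a CWV can be derived with AHP, the following minimal Python sketch applies the principal-eigenvector method to a hypothetical 3 × 3 reciprocal pairwise-comparison matrix and reports Saaty's consistency ratio (CR); the criteria and judgments are invented for illustration and are not the study's data.

```python
import numpy as np

# Saaty's random consistency index (RI) for matrix sizes 1..10
RANDOM_INDEX = [0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45, 1.49]

def ahp_weights(pairwise: np.ndarray):
    """Derive the criteria weight vector (CWV) from a reciprocal
    pairwise-comparison matrix (n >= 3) via the principal eigenvector,
    and report Saaty's consistency ratio (CR)."""
    n = pairwise.shape[0]
    eigvals, eigvecs = np.linalg.eig(pairwise)
    k = np.argmax(eigvals.real)            # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                           # normalize weights to sum to 1
    ci = (eigvals[k].real - n) / (n - 1)   # consistency index
    return w, ci / RANDOM_INDEX[n - 1]     # (weights, consistency ratio)

# Hypothetical comparison of three invented criteria, e.g., safety vs. cost vs. endurance
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, cr = ahp_weights(A)
print("weights:", w.round(3), "CR:", round(cr, 3))
```

A CR below 0.1 is conventionally taken to mean the pairwise judgments are acceptably consistent; otherwise the expert is typically asked to revisit the comparisons.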

Prior to this, we perform a comprehensive literature study to seek and summarize a set of such factors, mounting them under meaningful "constructs" in a tree-formed hierarchy, and confirm whether the included set of factors is effective by reference to the opinions of domain experts using the Delphi method. Note specifically that the abovementioned Delphi–AHP approach to studying design factors has been shown to be effective in practice for other aircraft types with military purposes, e.g., next-generation fighters and MALE UAS [8,9], and for other UAV subjects, e.g., in Ulloa et al.'s work [10] on designing a lightweight, portable, and flexible air-based PV-T module for UAV shelter hangars and in Song et al.'s work [11] on evaluating the comprehensive performance of UAV-based granular fertilizer spreaders (GFSs).
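As a sketch of how a Delphi round can be operationalized (our illustration; the study's actual retention thresholds are not stated here), a candidate factor can be retained when the panel rates it important on average and the ratings have converged:

```python
import statistics

def delphi_screen(ratings_by_factor, mean_cutoff=3.5, cv_cutoff=0.3):
    """Screen candidate factors after a Delphi round: retain a factor when
    the panel rates it important on average (mean >= mean_cutoff) and the
    opinions have converged (coefficient of variation <= cv_cutoff).
    Thresholds here are illustrative, not those used in the study."""
    retained = {}
    for factor, ratings in ratings_by_factor.items():
        mean = statistics.mean(ratings)
        cv = statistics.stdev(ratings) / mean
        retained[factor] = (mean >= mean_cutoff) and (cv <= cv_cutoff)
    return retained

# Hypothetical 5-point importance ratings from a small expert panel
panel = {
    "endurance": [4, 5, 4, 4, 5],
    "payload":   [4, 4, 3, 5, 4],
    "unit cost": [2, 5, 1, 4, 3],   # divergent ratings -> another round needed
}
print(delphi_screen(panel))
```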

In short, this study explores the main design factors for civilian applications of drones. It adopts several research methods: a literature study to collect and establish the initial set of factors, the Delphi method to confirm and solidify the set, and AHP to understand the priority over these factors and how opinions may differ (or not) among the three interest groups (A, I, and R). Since the results help in understanding what the different groups of experts think about the set of civilian drone design factors and the priority over these factors, before those groups advise on the relevant regulations, this study fits the requirement and core spirit of data-driven decision-making (DDDM) [12].

Therefore, the research question of this study is worth exploring, and *the possible contributions of this study are two-fold*. First, as there are relatively few studies on the market and applications of civilian drones driven by a law-making requirement, the results of this study may offer valuable empirical knowledge. Second, to the authors' knowledge, there seems to be no research that applies THMs together with decision analysis (MCDM) methods, such as AHP, to explore further know-how for drones. This study fills the gap by offering a new hybrid way of conducting such research in the methodological sense.

This introductory section placed the study in a broad context and highlighted the reason why it was conducted. Section 2 reviews the relevant literature, including a thorough literature study for the influencing factors. Section 3 explains the processes performed by using the methods and the results obtained from using these methods. Section 4 provides the discussions and implications drawn. Section 5 gives the concluding remarks and the recommendations for future works.

#### **2. Literature Study and Methods**

This section starts with the history of drones and related works about drones (Section 2.1). This is followed by an in-depth review of the possible consideration factors (Section 2.2). In addition, aside from the THM theory, which should be clear by now, a review of Delphi and AHP is given to link these methods with the study (Section 2.3).

#### *2.1. Drones and Their Civilian Uses*

Drones originated in 1917, when the U.S. military began conducting research and tested them during World War I. The relevant development drivers and phases are therefore reviewed here using this country as the example, for reasons of space. The availability of public datasets is another reason (i.e., data availability) to adopt this country as the example, at least for the researchers of this study.

After its early invention stage, the drone was not taken into combat until after the Vietnam War. Since then, however, advanced communication technology has increased the bandwidth of military communications satellites, and the development of navigation technology has improved the remote-control capability of unmanned aerial vehicles (UAV), making remote UAV operations more practical. Additionally, the geographical nature of the wars in Iraq and Afghanistan increased the need to identify, locate, and attack hidden targets through continuous surveillance and rapid strikes while minimizing collateral damage; unmanned aircraft systems (UAS) provided asymmetrical technical advantages in these conflicts [13]. Over the past decades, UAS have also played a critical role in non-military operations, such as supporting humanitarian relief operations in Haiti, mine detection, and chemical, biological, radiological, and nuclear reconnaissance.

The U.S. Federal Aviation Administration (FAA) defines the term UAV as "equipment used or intended for use in air, without an on-board pilot". In other words, UAV include all categories of aircraft, helicopters, airships, and powered lifts without on-board pilots, implying that a "drone" flies autonomously or remotely without a pilot operating the aircraft. Throughout their development history, drones have had different designs and functions for military missions and for civilian and commercial applications [14]. Because many missions are dull, dirty, or dangerous (3D) for pilots, drones are better suited for some tasks. For example, many types of drones, large or small, are widely used by government departments or research institutes to carry out different tasks and research work [15].

Additionally, UAV can integrate with ground control stations and data links to form UAV systems (UAS). Drones therefore involve command, control, and communications (C3) systems and must support those who control them [16]. A UAS can be considered a system of multiple subsystems, including the aircraft (often referred to as the drone), payloads, control stations (often with other remote stations), aircraft launch and recovery, support, communications, transmission, etc. With advanced navigation and communication technologies, UAS have become a "new capability" available to the government (public) and commercial (civil) aviation sectors [17]. As drone technology matures, the civilian applications are growing day by day. Table 1 lists the related applications of drones in the civilian domain [18].

**Table 1.** The uses of drone technology in the civilian domain.


Drone manufacturers and UAS suppliers worldwide are developing industry-specific solutions to meet customers' business needs effectively. Advancements in drone technology have enabled manufacturers to produce models of different sizes, weights, and shapes that can carry various devices and payloads for a wide range of applications. However, safety issues and drone traffic management issues could challenge the growth of the commercial drone market to some extent. Demand for drones in the commercial sector is increasing because they raise productivity through improved graphical visualization and an overall reduction in project costs. Owing to significant improvements in drone control accuracy, many drone application requirements are emerging quickly in the commercial sector, and applications related to cost and time reduction have led to the increasing use of drones. This trend is expected to raise the overall value of commercial drones. Figure 2 presents another forecast of the drone growth trend, in addition to the forecast made for 2020–2025 [2] (see Section 1), in the commercial sector in the U.S. from 2014 to 2025 [19].

**Figure 2.** The forecast of the drone growth trend in the commercial sector from 2014 to 2025.

Besides the U.S., many other countries around the world also use drones in the civilian sector. Currently, at least 90 countries or non-state organizations are known to operate drones [20]. However, different types of drones have different usages. For example, micro and small drones are commonly used in low-altitude and unregulated airspace. Typically, light drones weighing less than 150 kg are suitable for monitoring tasks in many real-world applications.

Regarding the classification of drones, Gupta et al. argued that there is no uniform classification due to the diversity of drone capabilities, sizes, and operational characteristics [16]. We can nonetheless list drones categorically by the types seen so far. Table 2 lists the categories and their associated parameters, including the maximum total take-off weight (UAV with payload), regular operating altitude, mission radius, endurance, general use, and purpose of use.


**Table 2.** The classification of drones by drone type.

Table note for legends: R: reconnaissance, I: inspection, S: surveillance, DG: data gathering, CT: cargo transportation, SR: signal relay.

Grimaccia et al. proposed an effective method to classify drones according to their functionalities [21]. In addition, the U.S. Department of Defense provided other classifications, while NASA provided another "classification matrix" [15], as shown in Tables 3 and 4, respectively. In Table 3, the classifications are based on weight, altitude, mission radius, and duration [13,22].


**Table 3.** U.S. DoD's classification for drones.

**Table 4.** NASA's drone classification matrix.


In Table 4, drones are classified by weight, airspeed, and type. Although the terms differ, weight can be used as the only common classification criterion, as other characteristics of drones are usually weight-related. Additionally, some studies [9,16,18] provided another means of drone classification that is descriptive, giving "feature descriptions" rather than the numerical parameters above. Table 5 shows the categorization of drones with this method.

#### *2.2. Review for the Consideration Factors*

The last subsection reviewed the history and classifications of drones, as well as their civilian uses, in general terms. This subsection provides a deeper review of the design and selection factors for civilian-drone applications, because the subsequent work (i.e., the confirmation of the set of factors, the solidified basis of the established AHP hierarchy, and the final results obtained using AHP) relies heavily on a well-grounded set of factors. The surveyed literature falls into several categories, e.g., cost, performance and applications, operation, and maintenance, and we present the review by reference to these categories in order. In fact, as will be shown later, these four factor categories form exactly the four constructs used to establish the AHP hierarchy.

#### 2.2.1. Cost

Cost is always a concern when designing and applying any equipment and facilities, and this also holds for drones. Many studies have explored the related costs of drones. Aragão et al. [23] proposed a UAV selection model in which initial investment and maintenance cost are two main factors. In an exploration of life cycle costing (LCC), Kianian et al. [24] argued that the LCC should contain acquisition costs, operation costs, maintenance costs, and disposal costs. In a study of a stochastic facility location model for drones, Kim et al. [25] argued that, besides the purchase cost, it is critical whether the operation cost and maintenance cost are reasonable when considering the costs of a drone. They also mentioned that the total relevant costs of a drone include the costs of opening drone facilities and of operating and maintaining drones and drone facilities. Discussing the sustainability of small UAV, Figliozzi [26] highlighted some important parameters, such as UAV tare weight, payload, battery energy, purchase cost, and energy consumption per unit of time flown. He also argued that the cost of operating a drone should include UAV operation staff costs, maintenance costs, ground costs, energy costs, purchase costs, battery costs, software costs, and communications costs. In a study recommending the use of UAV platforms in precision agriculture in Brazil, Jorge et al. [27] mentioned that reduced operating cost is one of the consideration factors when purchasing a UAV. In another study about UAV technology supporting maintenance operations, Miari [28] emphasized the importance of UAV operation and maintenance costs in UAV usage. As for a drone's purchase cost and operation cost, Yu et al. [29] mentioned that these are very economical, at 1/5 of the purchase cost and 1/10 of the operation cost of unmanned helicopters.

**Table 5.** The categorization of drones with the "feature description" method.


Moreover, a study on maintenance cost estimation of Royal Canadian Navy ships described operating costs as comprising three major categories: the cost of personnel operation, the cost of operation consumables, and the costs of all activities that support the system's operation. The study also described maintenance costs, stating that they cover all planned and unplanned activities to keep or return the system to a given state or to provide additional operational capability. These maintenance activities include detection; inspection; troubleshooting; prevention; testing and calibration; overhaul; and replacement of parts, components, or assemblies performed by the crew, by specialist repair personnel, by a depot or agency, and by the industry [30]. From the airlines' perspective, Dožić et al. [31] chose evaluation criteria to solve the aircraft type(s) selection problem. The chosen criteria include aircraft seat capacity, reflecting the match between demand and capacity; aircraft price, describing the needed investment; total baggage, related to the earning possibility from cargo transport; maximum take-off weight (MTOW), the main unit for calculating airport and navigation fees; payment conditions, describing payment advantages offered by different manufacturers or leasing companies; and total cost per available seat mile (CASM), indicating operational costs and aircraft performance. Gomes et al. [32] proposed a novel approach to imprecise assessment and decision environments (NAIADE) method, based on three criteria (financial, logistics, and quality), to select an aircraft for regional chartering. In a study about cost–benefit assessment and implications for service pricing of electric taxis, Wang et al. [33] argued that the total life-cycle cost model for cost–benefit assessment should consider purchase cost, usage cost, and other operation costs. Additionally, Yeh and Chang [34] claimed that purchase cost and operating cost are the two criteria that can be used to evaluate an aircraft's fuzzy rating performance.

In a study related to LCC, Woodward [35] mentioned that purchase costs are just one of the initial capital costs, which should include land, buildings, fees, furniture, and equipment; that is, purchase costs should contain the costs of purchasing the equipment and the facilities related to it. Operation costing is usually applied to ascertain the costs of providing or operating a service, and this method of costing is applied by undertakings that provide services rather than produce goods [36]. The role of operational cost is also evident in expected-value-approach (EVA) studies determining the best purchase portfolio, e.g., in the photovoltaic manufacturing industry [37]. Relatedly, Bressani-Ribeiro et al. [38] showed that low operational cost is one of the primary reasons for users to adopt a new technology. Nachimuthu et al. [39] mentioned that total maintenance costs should include maintenance personnel costs, maintenance ship costs, dedicated repair ship costs, spare parts costs, and production losses due to downtime.

From the discussions above, it can be asserted that *purchase cost*, *operation cost*, and *maintenance cost* are the primary consideration factors for civilian drone design and selection. For drones, the purchase costs would include the flight vehicle, manipulation devices, and the devices required to perform related applications; the operation costs would contain fuel/electricity, mission-related consumables, and the use of landing sites; and the maintenance costs would consist of flight vehicle and primary equipment maintenance and component replacement. This completes the review of the factors that should be included in the "cost" category.

#### 2.2.2. Performance and Applications

A civilian drone's performance and applications are perhaps the most complicated category. Although a single construct can be established to recapitulate all relevant factors, a thorough study is required to clearly specify the initial set of factors to be included under the construct and sent for the experts' approval during the Delphi process.

Performance and applications are usually the primary consideration factors when users adopt any new equipment. Many studies have discussed drones' performance and applications. In discussing drone performance, Hwang et al. [40] concluded that the key performance of a drone should include speed, altitude, range, payload, and specific operational activity. In a study on UAV selection for precision agriculture, Petkovics et al. [41] argued that flight duration time, flight speed, on-board computer, sensors, payload, coverage area, and operational time are the UAV selection factors for precise agricultural usage. In a study about imagery collection to aid Aedes aegypti (mosquito) breeding site identification, Aragão et al. [23] proposed a UAV selection model that adopted weight, dimension, technique, performance, speed, and investment as the main criteria, with 12 performance-related sub-criteria: maximum take-off weight, maximum payload weight, wingspan, take-off, landing, maximum range, maximum mapped area, wind resistance, maximum altitude, cruise speed, stall speed, and maximum level speed. For last-mile delivery drone selection, Nur et al. [42] proposed an evaluation model with 19 sub-criteria under the "performance" main criterion: the drone's overall size, weight, drone type, fuel type, internal computing components, location and proximity accuracy, communication and data quality, traceability, reliability, required delivery distance, maximum flight time, charge and fuel usage rate, maximum load, maximum carry dimensions, maximum reachable altitude, drone speed, adaptability to a dynamic assignment, package handling flexibility, and delivery flexibility.

For selecting the most suitable UAV for transportation in emergency operations, Ulukavak and Miman [43] used eight factors in their evaluation process: payload, UAV weight, maximum altitude, maximum ground speed, approximate flight time, remote controller range, landing field, and ease of use. For visual inspection, monitoring, and analysis of infrastructure using UAV, Duque et al. [44] also selected drones based on eight evaluation factors: flying time, an additional camera on top of the drone, camera resolution under low illumination, video resolution, payload capacity, drone lights, remote control range, and price. For inspecting bridges with a drone, Duque et al. selected a drone based on various specifications, including user controls and interface, maneuverability, software capability, adaptability, size, and payload [45]. Hoyas Ester [46] selected a drone for his study with another set of selection criteria: price, body size, drone weight, flight time, radio frequency (RF) range, lens field of view (FOV), and lens aperture.

Cesnik et al. [47] mentioned that payload mass, flight speed, fuel mass, time, flight altitude, and landing and flying distance are important indicators for evaluating the vehicle flight performance of a drone. Chen et al. [48] argued that drone designers must consider high mobility and flight time while minding the limited battery life; the flight duration time of a drone is a key performance indicator. In a study on drone flight capabilities, Ajanic et al. [49] discussed the importance of manipulative ability for a drone; in drone selection, the manipulative ability of a drone is therefore a critical evaluation factor. Yang et al. [50] mentioned that the operation range of a drone, the control distance of a drone's controller, the transmission bandwidth for a drone, and the signal interference of a drone are critical performance issues. Shakeri et al. compared the advantages and disadvantages of multi-UAV and single-UAV systems across several features: targeted area coverage, cost, task time, radar cross-section, power, network topology, application, and security [51].

In discussing disaster management with UAV, Erdelj et al. [52] also argued that the airborne operation duration of a drone is an important consideration factor in disaster operations. In fact, UAV now play a critical role not only in disaster management but also in law enforcement and first response, which can be mapped to "Fire Service and Forestry" and "Police Authorities" in Table 1. Since the essence of these applications is quickness (i.e., launching actions just in time for law enforcement and enabling effective interventions by first responders such as police officers, firefighters, and disaster managers), the design of drones should also address this noteworthy feature (i.e., "quick reconnaissance"). For this specific domain, we cite Laszlo et al.'s work in 2018 [53] and the work of Restas in 2015 [54], which have shown the importance of this feature in the aforementioned application domains.

Aljehani and Inoue [55] discussed the coverage problem of a UAV and mentioned that the communication system's performance is important for a UAV's flight control and operation. In a study on the design and trajectory control of a universal drone system, Yıldırım et al. [56] argued that the controllers and sensors of a drone are the components that directly affect the vehicle's flight performance, while other components directly affect the payload; beyond these basic components, other devices may be added to a drone to meet specific requirements. Additionally, Amiri [57] argued that the agility and stability of manipulation for the required maneuvers are key performance indicators for a drone. Pai et al. [58] mentioned that the drone operator should be familiar with the operation interface of a drone, as it involves the drone's manipulative ability; the operation interface of a drone might therefore affect its manipulative ability. Liu et al. [59] also discussed the manipulative ability of robots, which refers to a robot's capacity to perform certain tasks. In terms of maneuverability, different types of rescue robots perform differently; robots must be controllable and easy to control. Global manipulative ability embodies survivability, mobility, sensors, communication, and the human–machine interface.

Gomes et al. [32] used 12 sub-criteria to evaluate aircraft selection; the performance-related sub-criteria include range, flexibility, cruising speed, landing and take-off distance, comfort, and avionics. Bruno et al. [60] proposed an aircraft evaluation model based on airlines' requirements, containing four main criteria and eight sub-criteria; one main criterion, technical performance, and two sub-criteria, cruise speed and autonomy, relate to performance considerations in aircraft evaluation. Yeh and Chang [34] used three main criteria and 11 sub-criteria to evaluate each aircraft's fuzzy rating performance; one of the three main criteria, technological advances, and two of the 11 sub-criteria, aircraft reliability and maximum range, relate to an aircraft's performance. See et al. [61] used speed, range, and the number of passengers as criteria to select the best aircraft among a set of alternatives with a multi-attribute methodology.

In a study about the classifications, applications, and design challenges of drones, Hassanalian and Abdelkefi [62] mentioned that drones' applications cover a wide range of civilian and military fields. This is important. Drones can perform outdoor and indoor missions in very challenging environments and can be equipped with a variety of sensors and monitors for intelligence, surveillance, and reconnaissance missions. Drone applications can be categorized along different dimensions: mission type (military and civilian), flight area type (outdoor and indoor), and environmental type (underwater, water, ground, and air and space), as shown in Figure 3. Depending on the type of drone, more than 200 drone applications are foreseen, including search and rescue missions, environmental protection, mailing and delivery, shooting and reconnaissance, performance, bird repellent, cleaning, agricultural spraying, missions at sea or on other planets, and other miscellaneous applications.

In discussing disaster management with UAV, Erdelj et al. [52] divided UAV applications into six categories: monitoring, forecasting, and early warnings; disaster information fusion and sharing; situational awareness, logistics, and evacuation support; support for a standalone communication system; support for search and rescue (SAR) missions; and damage assessment. Figure 3 shows a classification of drones' applications [62].

**Figure 3.** A classification of drones' applications.

In particular, Hassanalian and Abdelkefi [62] mentioned that different types of drones use different power sources, such as fuel, batteries, solar cells, and laser power beaming, and they discussed the advantages and disadvantages of each. Nur et al. [42] also mentioned that the power sources for UAV include batteries, solar energy, hydraulic fuel cells, internal combustion engines, and tethered and laser transmitters. In an invention related to radiation detection and a "CdZnTe" aerial inspection system, Zhang et al. [63] emphasized the importance of a long operation duration. Pickett [64] likewise emphasized the importance of extended operation duration in a patent about unmanned aerial vehicle boosters. Moreover, for rescue applications with an energy-constrained UAV, Liang et al. [65] argued that the longest operation duration of a UAV is a constraint factor in rescue applications.

From the above survey, several critical factors emerge as significant to a drone's performance and applications: *vehicle flight performance*, *manipulative ability*, *flight power*, *main application*, and *operation duration*.

#### 2.2.3. Operation(s)

Operation is another important category of concern when users use drones. Many studies have discussed drone operation issues, and several patents are also related to this category.

Several patents relate to drone operation convenience. Zhang et al. [66] emphasized that operation convenience and operation security are critical for a drone; their rotary-wing invention improves the operation convenience and operational safety of rotary-wing drones to a large extent. Ng [67] invented a mobile vehicle charging system that can improve the operation convenience of drones. Tang et al. [68] invented a remote controller for a UAV with a handle structure that simplifies the control interface and enhances operation convenience. From these UAV-related inventions, we can see that drone operational convenience is significant for users, and drone manufacturers can use such inventions to improve it. Additionally, Zhou et al. [69] argued that the size of the landing site affects the convenience of drone operations.

Xiao et al. [70] mentioned that UAV offer advantages such as flexibility, ease of operation, convenience, safety, reliability, and low costs; ease of operation and safety are thus two of the reasons why users want to use UAV. In an exploration of the features of consumer UAV, Mao et al. [71] used DJI UAV products to illustrate the characteristics of consumer UAV, which include convenience, easy operation, safety, intelligence, and entertainment; safety is therefore a critical consideration factor for a consumer UAV. In a study on the risk, vulnerability, and safety improvement of the industrial UAV transportation system, Johnsen and Evjemo [72] explored the safety of UAS based on drones and other UAV. Safety relates to accidental harm, and security relates to intentional harm. They mentioned that drones have often been used in tasks that are dangerous, dirty, or dull; thus, safety and security have often been the principal drivers when using UAS, and high reliability, safety, and security must be ensured when UAS are used in critical operations such as the transport of medical supplies. In a study on operator engagement during UAV operation, Roy et al. [73] argued that bio-cybernetic systems could adapt to the UAV operator's mental state to optimize UAV performance and increase UAV operation safety; how to improve the safety of UAV operation is therefore a critical issue. Jorge et al. [27] also mentioned that ease of operation and safety are two of the consideration factors in purchasing a UAV, and they explored the advantages and disadvantages of UAV by type, application, and operation.

In a book about UAV design, Gundlach [74] described the supportability of unmanned aircraft, arguing that supportability covers the activities required to sustain operations, including support equipment, spares management, and other items; both the customer and the contractor can perform support activities. Hwang et al. [40] also argued that supportability is a key performance indicator of a drone. In a study on UAV design, Sadraey [75] listed the design-related requirements as follows: performance, stability, handling qualities, operational requirements, low weight, affordability, reliability, maintainability, producibility, evaluability, usability, safety (airworthiness for aircraft and operator), supportability and serviceability, sustainability, disposability, marketability, environmental requirements, detectability (i.e., stealth), standards requirements, and legal requirements. In the design of a precision aerial delivery system, Hall [76] argued that the design team should consider the system's operational supportability.

As drone applications have kept growing for both civilian and military purposes, Candeloro et al. [77] mentioned that this situation has exacerbated the noise pollution problem caused by drones. In a study about a drone's noise scattering, Jiang et al. [78] described how many applications in both the commercial and civil domains have caused growing environmental concerns about noise emissions; drone noise has thus received much attention in recent years. In an exploratory investigation of the combustion and NVH emissions signature of a drone jet engine, Soloiu et al. [79] mentioned that the largest source of greenhouse gas (GHG) emissions comes from aerospace transportation, and the pollution caused by drones' fuel exhaust is also getting worse. Gaynutdinov and Chermoshentsev [80] mentioned that UAV avionics use analog, digital, and RF devices to manipulate a UAV simultaneously over a wide frequency range (up to several gigahertz) of voltages and currents, which expands the paths of electromagnetic interactions and leads to deterioration of the electromagnetic environment. Petrov et al. [81] mentioned that the mass electrification of UAV creates massive mobile sources of electromagnetic interference, which is worse in dense urban environments in particular.

The abovementioned literature points to many factors related to drone operations. Through this comprehensive survey, we can identify several key consideration factors for drone operations: *convenience*, *safety and security*, *supportability*, and *environmental impact*.

#### 2.2.4. Maintenance

Users should consider the maintenance issue when they adopt new equipment or facilities. Design factors under this category should also be critical for drones, and the literature on the relevant issues is accordingly abundant.

In discussing the U.S. Army's light tactical vehicle solution, Metzler argued that its design should keep both maintenance requirements and maintenance costs low; a possible design solution is to adopt commercial off-the-shelf parts found in civilian vehicles [82]. In a study on a neural network control system for UAV altitude dynamics, Muliadi et al. mentioned ease of maintenance as one of the reasons why UAV have become popular [83]. Edgell et al. [84] argued that the development of modular aircraft results from highly reliable and easy-to-maintain systems, which minimizes logistical requirements and simplifies maintenance requirements. They also mentioned that locally manufactured parts reduce the maintenance process's dependence on the parts supply, and they described the importance of acquisition processes for critical components.

In a study about trickling filters following anaerobic sewage treatment, Bressani-Ribeiro et al. [38] argued that maintenance simplicity is the key consideration factor for new technology adoption. In a study about the role of nanoparticles in the productivity of solar desalination systems, Rashidi et al. [85] mentioned the advantages of solar distillation systems, including low costs of construction, repair, and maintenance; simplicity; portability; and the use of solar energy resources. In a study on a time service improvement scheme for a nuclear power plant, Zhai and Bai [86] proposed an improved time service system to address redundant backup, system networking, maintenance convenience, and power supply optimization; maintenance convenience is thus a critical issue for the operation of a system. In a study on point cloud processing system development, Dingning and Qiong [87] improved the security and maintenance convenience of their proposed point cloud processing system, showing the significance of maintenance convenience for an information system. In a patent about container-type data centers, Zhao et al. [88] also emphasized the importance of maintenance convenience for a mini data center.

Moreover, in a study comparing life-cycle costing and performance parts costing, Adebimpe et al. [89] mentioned that maintenance cost involves the usage of consumables and spare parts. In a patent about a spare parts and consumables management system, Dellar et al. [90] used wafer manufacturing as an example to illustrate the importance of spare parts and consumables: a lack of required spare parts or consumables at critical points could mean damage to expensive wafers in process and to other wafers in pipelines waiting for the stopped machines, and machine downtime reduces wafer output, which can be very expensive. They also emphasized that ensuring needed spare parts are available is critical when designing a system. In an illustration of the life cycle of durable manufactured products, Oliva and Kallenberg [91] also mentioned that the assured acquisition of spare parts and consumables is a critical issue for users.

The literature study above has systematically described many maintenance-related consideration factors for drones. In the review process, a set of common critical factors was identified, including *maintenance simplicity*, *maintenance convenience*, and *acquisition of parts and operation consumables*. These should be the critical factors placed under the drone maintenance construct.

#### 2.2.5. Short Summary

In this subsection, a suitable set of criteria (design factors) for the application and selection of civilian drones has been filtered and obtained on a solid basis, and the constructs, each of which covers a separate subset of these criteria, have been identified. The hierarchical tree structure for the AHP investigation (i.e., the AHP hierarchy) is therefore justified and developed concretely in Section 3, and all subsequent work can proceed from it: the Delphi process confirms the included criteria and the tree's form, and the main AHP process then collects opinions using the designed expert questionnaire to calculate the weights for the factors and prioritize them. The THM-integrated analysis addresses the preferential orders over the criteria and constructs in each opinion group, and the implicative discussions identify the emphases for law-making advice. In other words, the thoroughness of the reviewed literature underlies the credibility of all later results; indeed, the review itself is perhaps comprehensive enough to stand as a review study.

#### *2.3. Methods: Delphi and AHP*

Having covered the use of THM in this field (see Section 1), this subsection gives a higher-level review of Delphi and AHP, the other two main methodological elements.

#### 2.3.1. Delphi Method

Delphi is a survey research method that aims to structure group opinions and discussions. The RAND Corporation developed the Delphi method to enable discussion of, and judgments on, a specific topic, so that synchronized decisions related to that topic can be made to represent a given group's opinions and views. The Delphi method can overcome the limitations of traditional methods for obtaining a specific group's opinions or judgments for policy making [92]. Woudenberg [93] evaluated the characteristics of Delphi, which are listed as follows:


Delphi is also helpful when using other methods would not be adequate or appropriate. The following application contexts wherein it is particularly useful are summarized [94]:


Turoff and Linstone [95] have also summarized several application areas of Delphi, which are as follows:


In this study, Delphi is used to confirm the criteria set, the proposed constructs, and the tree form of the AHP hierarchy, so that the subsequent studies and results can be made and justified on a solid basis (see Section 3.1).

#### 2.3.2. Analytic Hierarchy Process (AHP)

AHP is a theoretical and methodological framework for measuring human judgments through repeated "pairwise comparisons". It relies on the judgment of experts to derive priorities and measure the relative values of intangible things, e.g., assets, alternatives, criteria, etc. Comparisons are carried out with an absolute "judgment scale" that represents the extent to which one element dominates another with respect to a given concept or content [96].

Specifically, if the things being compared are criteria in the MADM context, those relative values would mean the relative importance among the criteria (or alternatives). The vector-based form consisting of these values is called a "criteria weight vector," or CWV, which can be used to prioritize the criteria (or alternatives) (i.e., to order them or rank them) [7,97].

In AHP, judgments may be inconsistent, and how to measure this inconsistency is another focus of the method. Saaty broke the decision-making process down into several steps, including a step called "CR-validation" that verifies whether one's pairwise judgments, made individually for each pair of things, are consistent when put together [98].

Zhuang et al. [12] proposed a two-phase separation of the entire MADM process using AHP, namely, a first phase of "CWV-determination" and a second of "alternative ranking". As the research question of this study concerns only the first phase, we refer to [12] for a summary of the formulas in the CWV-determination calculation process; another summary of these calculations, in a different mathematical form, is given in [8]. We also refer to [6] for a demonstration of CR-validation using a numerical example. The mathematical details are omitted here for space reasons, but similar full applications of AHP for CWV-determination (with CR-validation) are illustrated in [8] and [9] to study the design factors for fighters and MALE UAS.
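To make the CWV-determination and CR-validation steps concrete, the following is a minimal sketch in Python, not the implementation used in this study: it derives a CWV as the principal eigenvector of a reciprocal pairwise comparison matrix and checks Saaty's consistency ratio against the 0.1 threshold. The `ahp_weights` helper and the 3 × 3 judgment matrix for the three cost factors are hypothetical and purely illustrative.

```python
import numpy as np

# Saaty's random consistency indices (RI) for matrix sizes 1..9
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A: np.ndarray) -> tuple[np.ndarray, float]:
    """Return the criteria weight vector (CWV) and consistency ratio (CR)
    for a reciprocal pairwise comparison matrix A."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                 # principal eigenvalue index
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                             # normalize so weights sum to 1
    lambda_max = eigvals[k].real
    ci = (lambda_max - n) / (n - 1)             # consistency index
    cr = ci / RI[n] if RI[n] > 0 else 0.0       # consistency ratio
    return w, cr

# Hypothetical 3x3 judgment matrix for the three cost factors
# (purchase cost, operation cost, maintenance cost); values are illustrative only.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])
w, cr = ahp_weights(A)
print("CWV:", w.round(3), "CR:", round(cr, 3))  # CR < 0.1 passes CR-validation
```

For this illustrative matrix, CR ≈ 0.03 < 0.1, so the judgments would pass CR-validation; a matrix failing the check would be returned to the expert for another round of comparison.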

#### **3. Processes and Results**

#### *3.1. Confirming the Factors with Delphi Method*

According to Section 2.3.1, in the first round of Delphi in our study, three "almost experts" who are not familiar with each other were intentionally chosen to meet the selection standard of the method, and these participants gave their qualitative estimates via e-mail. This round confirmed that the 15 factors distilled from the literature review in Section 2.2—purchase cost, operation cost, maintenance cost, vehicle flight performance, manipulative ability, flight power, main application, operation duration, operation convenience, operation safety and security, operation supportability, operation environmental impact, maintenance simplicity, maintenance convenience, and acquisition of parts and operation consumables—are all effective, and no comment was given to augment this factor set with any other factor. In addition, none of the participants opposed the four constructs (categories) established by the authors after the literature study, and all agreed with how each individual factor (criterion) is placed (mounted) under its upper-level construct. This means that the hierarchical structure of the tree was confirmed at this stage, and the "decision hierarchy" for AHP can therefore be ascertained as in Figure 4.

**Figure 4.** The decision hierarchy for civilian drone design/selection.

However, as the operational definitions for each factor were sent along with the polling data in the first round and were slightly modified by the participants upon return, we present the modified version in Table 6.

**Table 6.** Operational definitions of the factors for civilian drone design/selection.


Then, the second round of Delphi fed the whole group's results from the previous round back to all participants for subsequent evaluation. These included the hierarchical structure of the tree weaving together the constructs and the criteria (i.e., the decision hierarchy) as shown in Figure 4, the set of factors (although no change was made), and the revised operational definitions for the factors in Table 6. All participants returned with no further comments, confirming that all the delivered data (i.e., the set of factors, the constructs, the decision hierarchy that reveals their relationships, and the operational definitions) are effective. Therefore, the Delphi process did not require a third round.

#### *3.2. Prioritizing the Factors with AHP Method*

Based on the confirmed decision hierarchy shown in Figure 4, a set of expert questionnaires using the pairwise-comparison scales of 9:1, 7:1, 5:1, 3:1, 1:1, 1:3, 1:5, 1:7, and 1:9 was designed to investigate the individual opinions of the experts (or decision-makers (DMs)).

Following the THM, we sought to understand the opinions of domain experts in industry, academia, and research institutes. To this end, the entire sample was designed to contain 27 experts, intentionally partitioned into three groups of 9 experts each. The researchers visited these 27 experts and conducted the AHP investigations using the designed questionnaire set during the period from 3 January to 28 February 2021.

Eventually, for each expert interviewee (respondent), we received five pairwise comparison matrices, one per AHP expert questionnaire, with consistent results. Among the five matrices, one connotes the pairwise comparison of the constructs (i.e., with respect to CDSEM, the civilian drone selection evaluation model shown in Figure 4), and each of the remaining four connotes the pairwise comparison of the individual factors under one construct.

Table 7 gives the background statistics of the 27 interviewed experts overall, as the "respondent profiles" were polled along with the AHP-style survey using an initial, anonymous block of questions. As can be observed, 24 of these experts are male and three are female; 13 hold a Ph.D. and 14 hold a master's degree. In terms of age, all are 31–65 without exception. Finally, five of them have less than 10 years of service, 12 have 10–20 years, and 10 have worked for over 21 years. The A, I, and R partitions of the expert sample are also clear in this table.


**Table 7.** Background statistics of the interviewed experts.

Surprisingly, given these sample stratifications, no respondent in this survey possesses only a bachelor's degree (or lower), even in group I (industry). A post-survey consultation revealed an interesting explanation. For group I, we interviewed respondents from a partially government-funded manufacturer, because it is not only the largest aeronautic manufacturer in the studied country (as mentioned in Section 1) but also where people with decisive or influencing power over drones' law-making may come from. As such, the respondents we interviewed are all (very) high-level staff, and in such institutions, only people who possess a master's or Ph.D. degree can be promoted to these higher positions.

As usual, the investigation process required one to three rounds of interviews for all pairwise comparison matrices filled in by an expert to pass the CR-validation process, with the CR (consistency ratio) threshold set at 0.1 (see Section 2.3.2). We also found that most DMs passed the CR-validation easily within one round of interviews when answering the questionnaires comparing three items pair-wisely (i.e., the three factors under construct CA and the three under construct CD). Otherwise, many of them passed the CR-validation after two to three rounds of interviews, especially when asked to compare the five factors under CB pair-wisely.

After the consistency of all results was guaranteed, the CWVs were calculated from the data in the five pairwise comparison matrices for each expert using the Expert-Choice software, which was installed on a laptop serving as a mobile office during the investigation to record the experts' answers in real time.

By reference to the THM, we further combined the CWVs with respect to the same thing (i.e., CDSEM or one of the four constructs) across all experts in each expert group to obtain an "aggregated CWV" (for more details, please see Appendix A). This aggregated CWV connotes a "group opinion" with respect to that thing. As the different groups (i.e., A, I, and R) give different group opinions on the same thing, their variety and heterogeneity can be observed, analyzed, and compared. The following sub-subsections present these results in a visualized manner.
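As a concrete illustration of what such an aggregation can look like, the following minimal sketch assumes geometric-mean aggregation of the individual CWVs, a common convention in AHP group decision-making; the study's exact procedure is the one given in Appendix A, and the expert CWVs below are hypothetical.

```python
import numpy as np

def aggregate_cwv(cwvs: np.ndarray) -> np.ndarray:
    """Aggregate individual CWVs (one row per expert) into a group CWV.

    Assumes element-wise geometric-mean aggregation, a common convention
    in AHP group decision-making; see Appendix A for the paper's procedure.
    """
    g = np.exp(np.log(cwvs).mean(axis=0))  # element-wise geometric mean
    return g / g.sum()                      # re-normalize so weights sum to 1

# Hypothetical CWVs of three experts over the four constructs (CA, CB, CC, CD)
cwvs = np.array([[0.10, 0.45, 0.30, 0.15],
                 [0.15, 0.50, 0.20, 0.15],
                 [0.12, 0.44, 0.28, 0.16]])
print(aggregate_cwv(cwvs).round(3))
```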

#### 3.2.1. Academia (A) Group's Opinions

The weights of the four main constructs are shown and ranked in Figure 5 based on the group opinion according to the "aggregated CWV" over the individual CWVs of the nine experts in the "A" group, with respect to CDSEM.

**Figure 5.** Academia group's opinion for the constructs.

Reviewing Figure 5, we can find the significance ranking of the four constructs: CB (performance and application) > CC (operation) > CD (maintenance) > CA (cost). The sum of the relative weights of the two constructs CB and CC exceeds 70%; in particular, the performance and application construct alone received nearly half of the total relative weight (0.463).

Figure 6 shows the 15 consideration factors' "absolute weights", obtained by multiplying the aggregated CWV (over a subset of factors) with respect to each construct by the associated element of the aggregated CWV with respect to CDSEM (i.e., a number in Figure 5, in its original non-ranked order); the priority over these 15 factors is also visualized.
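Concretely, each factor's absolute weight is the product of its local weight within its construct and that construct's weight with respect to CDSEM. Using the A group's construct weight for CB (0.463, from Figure 5) and an assumed, purely illustrative local weight of 0.35 for cb2:

$$w_{\mathrm{abs}}(c_{b2}) = w(C_B) \times w_{\mathrm{local}}(c_{b2}) \approx 0.463 \times 0.35 \approx 0.162.$$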

**Figure 6.** Academia (A) group's opinion for the absolute weights of all factors.

Examining Figure 6, we find that only the top five factors are "more important" and the remaining 10 decision criteria are less important, according to the academia (A) group's overall opinion. Manipulative ability (cb2) and vehicle flight performance (cb1), both under construct CB, are the two most important factors; both absolute weights exceed 0.1, and their sum exceeds 30%. Conversely, operation convenience (cc1) is deemed the least important factor among the 15 by the experts in the A group; its absolute weight is less than 0.0335. Note that we have used ">0.07" as the rule to judge whether a factor is "more important", i.e., by checking whether the factor's absolute weight surpasses the threshold of 0.07.

#### 3.2.2. Industry (I) Group's Opinions

The weights of the four main constructs are shown and ranked in Figure 7 based on the group opinion according to the "aggregated CWV" over the individual CWVs of the nine experts in the "I" group, with respect to CDSEM.

**Figure 7.** Industry group's opinion for the constructs.

Reviewing Figure 7, we can find the significance ranking of the four constructs: CB (performance and application) > CA (cost) > CD (maintenance) > CC (operation). The industry (I) group thinks that CB and CA are the two most significant constructs for designing and selecting civilian drones; their relative weights, 0.288 and 0.287, are almost even. The operation construct (CC) is the least significant, with a relative weight of only 0.180. However, CB and CA are only slightly more important than CD, and the weight of CC is just 0.07 below one quarter (0.25).

Figure 8 shows the 15 consideration factors' absolute weights, while the priority over these 15 factors is also visualized. The process to obtain this figure is analogous to that described in Section 3.2.1, but the studied sample group is I (industry) instead of A (academia).

**Figure 8.** Industry (I) group's opinion for the absolute weights of all factors.

Examining Figure 8, we can see that eight of the 15 factors are "more important" and the other seven are "less important", according to the industry (I) group's overall opinion, using the same threshold (0.07) for this classification. Among the eight "more important" factors, five have weights that dominate the other three to a certain extent, in the order: cb5 (operation duration) > ca3 (maintenance cost) > cb1 (vehicle flight performance) > cb2 (manipulative ability) > cd3 (acquisition of parts and operation consumables). These five factors together account for about 55% of the total importance. Among the seven less important factors, operation environmental impact (cc4) is the least important decision criterion, receiving a weight of only 0.014, which is obviously lower than the other six less important ones. This means that the experts from the I group think the possible environmental impacts caused by operating civilian-use drones are of little significance.

#### 3.2.3. Research Institute (R) Group's Opinions

Finally, this study explores the four main constructs' weights as shown and ranked in Figure 9 based on the group opinion according to the "aggregated CWV" over the individual CWVs of the nine experts in the research institute (R) group, with respect to CDSEM.

**Figure 9.** Research institute (R) group's opinion for the constructs.

Reviewing Figure 9, we can find the significance ranking of the four constructs: CB (performance and application) > CD (maintenance) > CA (cost) > CC (operation). The research institute (R) group thinks that CB is the most significant factorial construct for designing and selecting civilian drones; its relative weight, 0.315, is almost one-third and dominates the other three constructs. In contrast, the cost construct (CA) and the operation construct (CC) are the less significant ones (see their weights, 0.221 and 0.211, in the figure), although both weights are not far from one quarter (0.25).

Figure 10 gives the absolute weights of the 15 consideration factors, while the priority over these 15 factors is also visualized. The process to obtain this figure is analogous to that described in Section 3.2.1, but the studied sample group is R (research institute) instead of A (academia).

**Figure 10.** Research institute (R) group's opinion for the absolute weights of all factors.

Given the results in Figure 10, seven of the 15 factors' absolute weights surpass the threshold of 0.07, so they can be classified as "more important" factors; the other eight factors are less important for the design and selection of civilian drones. Looking further into these two groups, we find that the only three factors whose weight values reach or approach 0.10, i.e., main application (cb4), vehicle flight performance (cb1), and acquisition of parts and operation consumables (cd3), are regarded as the most important by the experts from the R&D institutions. These factors contribute substantially to CB and CD, which Figure 9 shows to be the two more important constructs.

Conversely, the only two factors that received a weight value of 0.03 or less, i.e., operation convenience (cc1) and operation environmental impact (cc4), are deemed the two least important factors in the minds of the experts from the research institute (R) group. That both of the least important factors fall under the operation construct (CC) reflects the fact that CC is the least important construct in Figure 9.

#### *3.3. A Short Summary*

In this section, the analysis process has been demonstrated, and the results have been shown and justified based on the empirical survey data, retrieving a body of knowledge about the design and selection factors of civilian drones; that is, it is a mind-mining process for the experts' key opinions that will be reflected in law-making. This process involves the use of multiple theories and methods, including THM, Delphi, and AHP.

In Section 3.1, the Delphi method was used to confirm, with the help of the selected "almost expert" participants, that the set of influencing factors established after the thorough literature study is effective, that the operational definitions made for these factors are correct, that the postulated set of constructs used to cover these factors is reasonable, and that the defined tree structure (i.e., the AHP hierarchy) is plausible.

In Section 3.2, the AHP survey was designed so as to link with the THM: three groups of experts, A, I, and R, were intentionally organized and interviewed to understand the "group opinions" rather than individual opinions. These group opinions were aggregated from each individual expert's CWVs for the civilian drone selection and design constructs and for the criteria under each construct.

However, after the studies in Section 3.2, we can see that there are both heterogeneities and homogeneities in the opinions of the three groups stratified using the THM; the similarities and differences when these groups' opinions are compared on "the same thing" require further exploration. This is the purpose of the next section.

#### **4. Discussion and Implications**

The priority and relative weights for the four constructs and for the 15 criteria were explored group by group in the previous section by aggregating the individual opinions (CWVs) in each group. Given the information, questions were raised as to the homogeneities and heterogeneities that exist across the different expert groups (i.e., A for academia, I for industry, and R for research institutes).

The first insight gained is that, regardless of the subject group, the experts have a consensus that "CB: Performance and application" is the most significant construct to be considered when designing and selecting civilian drones. Such an outcome is persuasive because no exception is found among the group opinions.

Following the first insight, the *second insight* is that *the groups of experts vary in how they weigh the importance of the other three constructs* when designing a civilian drone: CA (cost), CC (operation), and CD (maintenance). Group A feels that operation is more important than cost and maintenance; group I feels that cost is more important than maintenance and operation; and group R feels that maintenance is more important than cost and operation. These differences in "which construct is more significant than the other two" *in the three expert groups' minds* show that, for these constructs (CA, CC, and CD), *the group opinions are quite diversified* (apart from the commonly agreed critical importance of CB).

Following from the former two insights, the *third insight* relates to the claim that *no construct is meaningless*. The I group's opinion on the constructs (with respect to CDSEM) (Figure 7) shows that the importance of its least important construct, operation, is 18.0%. The A group's opinion (Figure 5) shows that the importance of its least important construct, cost, is 12.1% (still >10%), let alone the operation construct regarded as the least important by the R group, whose relative importance is 21.1% (see Figure 9). These results imply that *the set of constructs proposed by this study is effective*, since each construct, whether critical or not, possesses a weight to a certain extent.

Before discussing further insights about the consideration factors in more detail (in addition to those about the constructs), Table 8 summarizes the final ordinal ranks over all criteria (row-wise) as assessed by the three expert groups (column-wise), in terms of the "rank order vector" (ROV; see [7,12,99]), and analyzes the ranking differences among the groups.


**Table 8.** Rankings for individual criteria and the ranking differences among the expert groups.

\* Table legend: SRD: sum of ranking differences; ARD: average ranking difference.

Scrutiny of this table yields the *fourth insight*: several factors (decision criteria) receive similar overall rankings from the three expert groups. Because the SRD (sum of ranking differences) aggregates the "difference in rank between each pair of groups" for a specific criterion, and there are always three pairs of groups, SRD/3 can be used to measure the average diversity of the group opinions for that factor. If we set the threshold "SRD/3 ≤ 2", meaning that the three expert groups give similar rankings to the same factor (the average diversity of the group opinions is well converged), and use it to filter the penultimate SRD column in Table 8, seven factors fall under this threshold. Therefore, we can gain the knowledge that, regardless of the expert group, the experts in these three groups have reached *a consensus to a certain extent on the priorities of "operation cost" (ca2), "vehicle flight performance" (cb1), "flight power" (cb3), "operation environmental impact" (cc4), "maintenance simplicity" (cd1), "maintenance convenience" (cd2), and "acquisition of parts and operation consumables" (cd3)*. Note that here the "priority" of any factor, represented using a rank order number as shown in the middle three columns of Table 8, does not necessarily mean its importance in terms of absolute weight.

The *fifth insight*. As can be seen in the above list of seven factors, one is under CA, one is under CC, two are under CB, and three are under CD (indeed, all criteria under construct CD are listed). This indicates that under every construct there is at least one factor that has received (very) similar rankings from the three groups of experts, though the exact count varies construct by construct. Moreover, *compared to the factors under CA and CB, for which the expert groups sometimes give very diversified opinions and sometimes do not* (the gap in SRD is 12 for the factors under CA and likewise 12 for those under CB), *their opinions* (among the groups) *are relatively stable for all factors under CC* (with smaller variance; the gap in SRD is six), *and no variance at all is observed in the SRD values for the factors under CD* (the smallest possible gap, 0).

The *sixth insight*. The last column in Table 8, *"ARD" (average ranking difference), provides another measure*. The ARD measures the degree to which the opinions on the factors included under a construct are diversified on average, i.e., 10, 8.4, 7.5, and 6.0 for constructs CA (cost), CB (performance and application), CC (operation), and CD (maintenance), respectively. It is a different measure from the "gap" between the highest and lowest SRD values used in the previous point to connote the variance in those SRD values for each construct. Nevertheless, it is striking that *the order of the constructs justified using the ARD measure roughly concurs with the order justified using the "gap" measure of the variances in SRD under the constructs*. In other words, *these outcomes may cross-validate each other regarding how diversified the experts' group opinions are under each construct and its included factors*.
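To make these two measures concrete, the following minimal Python sketch computes the SRD of each criterion from three groups' rank orders and the ARD of each construct. The rank numbers are hypothetical placeholders (chosen only so the derived figures echo the CA and CD values quoted above), not the actual Table 8 data, and ARD is taken here as the mean of the SRD values under a construct, which reproduces the four values reported in the text.

```python
# Illustrative computation of SRD and ARD following the definitions in the
# text. The rank numbers are hypothetical placeholders, NOT the actual
# Table 8 rankings.
from itertools import combinations

# Rank of each criterion as assessed by groups A, I, and R (1 = highest).
ranks = {
    "ca1": (5, 1, 6), "ca2": (7, 8, 9), "ca3": (10, 4, 12),
    "cd1": (11, 14, 11), "cd2": (13, 11, 14), "cd3": (15, 12, 14),
}
constructs = {"CA": ["ca1", "ca2", "ca3"], "CD": ["cd1", "cd2", "cd3"]}

def srd(rank_triple):
    """Sum of ranking differences over the three pairs of groups."""
    return sum(abs(a - b) for a, b in combinations(rank_triple, 2))

srd_values = {c: srd(r) for c, r in ranks.items()}

# Fourth insight: factors whose average pairwise diversity SRD/3 <= 2.
consensus = [c for c, s in srd_values.items() if s / 3 <= 2]

# Sixth insight: ARD = mean SRD over the factors under each construct.
ard = {k: sum(srd_values[c] for c in fs) / len(fs)
       for k, fs in constructs.items()}

print(srd_values)  # e.g., ca3 -> 16, the most diversified factor
print(consensus)   # ca2, cd1, cd2, cd3 pass the SRD/3 <= 2 threshold
print(ard)         # CA -> 10.0, CD -> 6.0
```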

Then, following the fourth insight, the *seventh insight* relates to *the set of the remaining eight factors, on which there are larger opinion differences among the three groups of experts* in terms of SRD. These include: "purchase cost" (ca1), "maintenance cost" (ca3), "manipulative ability" (cb2), "main application" (cb4), "operation duration" (cb5), "operation convenience" (cc1), "operation safety and security" (cc2), and "operation supportability" (cc3). According to their SRD values (i.e., the degree to which opinions diversify across the groups for each factor), these eight factors are sorted in descending order as: *ca3* (SRD = 16) *> cb4* (14) *> cb5* (12) *> (ca1, cb2, cc1)* (10) *> (cc2, cc3)* (8). Both this set of factors and their order constitute worthwhile knowledge for the design and selection of civilian drones. They are also important for the different groups of experts when offering advice for law-making: integrating their diversified opinions on these factors while drafting legislation is usually a tough task, but the results of this study may offer supplemental information about the opinion gaps before the experts sit down with the actual lawmakers.

The *eighth insight* pertains to "*what factors are the most critical ones to watch for?*" Both big data analytics and MADM usually place more emphasis on items (criteria or alternatives, and the factors in the case here) that are extraordinary (i.e., outlying data points or opinions from DMs). For example, an MADM model may assign a heavier weight to a DM whose opinion is far from the average by using the opinion weight vector (OWV) concept [8]. On this basis, and given the "diversified order" of the factors obtained in the previous point, we find that the three groups of experts have the largest opinion differences on the maintenance cost (ca3), main application (cb4), and operation duration (cb5) factors for designing and selecting civilian drones. Following from the former insight, this implies that the three groups may reach little consensus when the relevant drafts pertaining to these three factors are discussed among them for regulating the civilian use of drones, unless extra (or at least sufficient) effort is made to coordinate their preferences and intentions toward civilian drone design, selection, and law-making.

#### **5. Conclusions and Future Recommendations**

This study, at the outset, aims to understand the expert opinions gathered during a consultation process before law-making. These opinions will shape the laws made and enacted to regulate the relevant matters of civilian drones (i.e., to control the design, manufacturing, and use of civilian drones in Taiwan). As the consultation process involves polling the opinions of three expert groups, academia (A), industry (I), and research institutes (R), the main research question is thus about understanding their opinions and noting the similarities and differences in their group opinions. However, how can the relevant set of knowledge be explored thoroughly and "neatly"?

In this study, first, the literature was studied comprehensively to derive a set of influencing factors for designing and selecting civilian drones. Eventually, a set of 15 consideration factors was established, with four constructs covering these factors, i.e., CA: cost, CB: performance and application, CC: operation, and CD: maintenance. A hierarchical tree weaving the 15 factors under the four constructs was then postulated, forming the initial "AHP hierarchy" that awaited further confirmation. This rigorous process is critical because drone technologies are nowadays combined with non-aeronautic emerging technologies for various application purposes, e.g., 5G [100], data integration techniques [101], machine learning models [102], optimized routing algorithms [103], and humanoid robots [104]. The multiplexing of these advanced capabilities (and technologies) into existing UAV areas [105] may thus interfere with identifying the key influencing factors.

Next, the established initial AHP hierarchy was sent to several "almost experts" for evaluation using a two-round Delphi process. The AHP hierarchy was ultimately confirmed to be effective, with minor corrections made to the operational definitions. Following this, questionnaires were designed in the AHP style, and experts in the three groups were interviewed using them. Stratifying the respondent sample into three groups intentionally followed the proposed "modified THM" model, in which the "triple player parties" were altered from I, G, and A to A, I, and R; see the former sections for the reasoning.

The entire survey process spanned two months in early 2021. It required one to three rounds of interviews for the expert opinions to pass the CR-validation check (of AHP). Eventually, the pairwise matrices collected from all 27 experts were used to calculate the CWVs (i.e., the opinions in each expert's mind) for the constructs (with respect to the total goal) and for the criteria (factors) (with respect to each construct). Based on these individual CWVs, the "group opinions" were aggregated in terms of "group CWVs" for the relative importance of the four constructs and of the different factors under each construct.
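The text does not spell out the aggregation operator used to form the group CWVs; the sketch below assumes the normalized geometric mean, a common choice for aggregating individual AHP weight vectors into a group vector. The function name and sample weights are illustrative.

```python
# Group CWV aggregation sketched under the ASSUMPTION of a normalized
# geometric mean of the individual experts' weight vectors.
import math

def group_cwv(individual_cwvs):
    """individual_cwvs: one weight vector (list of floats) per expert."""
    n_experts = len(individual_cwvs)
    n_items = len(individual_cwvs[0])
    gm = [math.prod(v[i] for v in individual_cwvs) ** (1.0 / n_experts)
          for i in range(n_items)]
    total = sum(gm)
    return [g / total for g in gm]  # renormalize so the weights sum to 1

# e.g., two experts' construct weights for (CA, CB, CC, CD)
print(group_cwv([[0.20, 0.40, 0.20, 0.20], [0.10, 0.50, 0.20, 0.20]]))
```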

For all three groups, the group CWVs justified for the four constructs were dissected and analyzed. Furthermore, within each group, its group CWVs were "synthesized" to obtain the "absolute weights" for the 15 factors overall, and these weights were also ranked within each group. These overall ranks of the factors became another target to be compared across the THM groups. Through a process of interpreting the former results, several practical implications for drone design/selection and for the advice offered by the different expert groups for law-making were drawn. As a short summary, the insights gained are as follows:


As can be seen from the list of insights gained, all of this is critical for the different groups of experts when giving their advice for law-making, because investigating and integrating their opinions, diversified or not, is usually a difficult task while drafting legislation. Moreover, the results may provide knowledge to facilitate the communication processes and close the opinion gaps before the experts sit down together, or even before a final draft is formally delivered for law-making. As some law-relevant issues have recently been addressed from the perspective of operational research (OR) [106], these insights also serve the general aim of this research line.

The THM has been popular for years, but studies based on it have explored many other issues, such as innovation (or national innovation systems), governmental aspects (e.g., smart cities), industrial revolution and cooperation, local economic or regional development, and knowledge production, transfer, and economy matters. Most of these studies were based on academia (university)–industry–government interactions (i.e., A–I–G) to explore the topics of concern. Analogously, this study proposed a "modified THM" to explore the factors for civilian drone design/selection in terms of academia–industry–research (i.e., A–I–R) interactions and to identify the relationships between the group opinions (attitudes) toward the constructs and factors, so as to build the knowledge required for law-making. Previous THM studies, e.g., in the field of smart cities, have only touched on the sphere of drone applications. In this sense, this study not only fills the gap by using the modified THM to offer another set of in-depth knowledge about the design/selection of civilian drones with a systematic study flow, but is also helpful for lawmakers developing drone regulations during the law-making process. Future research directions may involve:


**Author Contributions:** C.-H.F.: conceptualization, methodology, software, visualization, and writing (original manuscript); M.-W.T.: questionnaire design, investigation, and data analysis; L.-P.C.: project administration, and supervision; Z.-Y.Z.: Conceptualization, validation, writing (review and editing), funding and project administration. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Ministry of Science and Technology, Taiwan, ROC, grant number 109-2410-H-992-015. The APC was funded by *Drones* following the journal's kind invitation, for which the authors are grateful.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

This study also explores all twenty-seven DMs' decision opinions together. Figure A1 shows the four decision constructs' relative weights for the civilian drone selection evaluation model. From Figure A1, the significance ranking of the four decision constructs is "performance and application", "maintenance", "cost", and "operation". The most significant decision construct, "performance and application", has a relative weight of 0.363, a large gap from the other three decision constructs, whose relative weights are relatively close, all around 0.210.

**Figure A1.** The four decision constructs' aggregated relative weights in the all-experts group.

This study uses the mean value of all decision criteria's absolute weights as an evaluation threshold. A decision criterion belongs to the more significant group when its absolute weight exceeds the mean of all criteria's absolute weights; otherwise, it belongs to the less significant group. Figure A2 shows the absolute weights of all fifteen decision criteria under the total designed goal. Examining Figure A2, eight decision criteria belong to the more significant group and seven to the less significant group. In the more significant group, "vehicle flight performance" and "manipulative ability" are the two most significant decision criteria; both of their absolute weights exceed 0.1 (1.5 times the mean value), and their combined absolute weight is over 25%. In the less significant group, "operation convenience", "operation supportability", and "operation environmental impact" are the three least significant decision criteria; their absolute weights are less than 0.0335 (half the mean value).

**Figure A2.** The fifteen decision criteria's aggregated absolute weights in the all-experts group.
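A minimal sketch of the mean-value grouping rule described above follows; the weights are placeholders rather than the Figure A2 values (for fifteen weights summing to one, the mean is 1/15 ≈ 0.067, consistent with the 0.1 and 0.0335 cut-offs quoted as 1.5 times and half the mean).

```python
# Mean-value grouping of decision criteria by absolute weight.
# Placeholder weights, NOT the values plotted in Figure A2.
weights = {
    "cb1": 0.14, "cb2": 0.12, "ca1": 0.08, "cd1": 0.07, "cc1": 0.03,
    # ... the remaining criteria of the full 15-item set would follow
}
mean_w = sum(weights.values()) / len(weights)
more_significant = sorted((c for c, w in weights.items() if w > mean_w),
                          key=weights.get, reverse=True)
less_significant = [c for c in weights if c not in more_significant]
print(round(mean_w, 4), more_significant, less_significant)
```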

#### **References**


### *Article* **Drone Control in AR: An Intuitive System for Single-Handed Gesture Control, Drone Tracking, and Contextualized Camera Feed Visualization in Augmented Reality**

**Konstantinos Konstantoudakis \*,†, Kyriaki Christaki \*,†, Dimitrios Tsiakmakis, Dimitrios Sainidis, Georgios Albanis, Anastasios Dimou \* and Petros Daras**

> Visual Computing Lab (VCL), Centre for Research and Technology-Hellas (CERTH), Information Technologies Institute (ITI), 57001 Thessaloniki, Greece; tsiakmakis@iti.gr (D.T.); dsainidis@iti.gr (D.S.); galbanis@iti.gr (G.A.); daras@iti.gr (P.D.)

**\*** Correspondence: k.konstantoudakis@iti.gr (K.K.); kchristaki@iti.gr (K.C.); dimou@iti.gr (A.D.)

† These authors contributed equally to this work.

Academic Editors: Diego González-Aguilera and Pablo Rodríguez-Gonzálvez

Received: 31 December 2021; Accepted: 1 February 2022; Published: 10 February 2022

**Abstract:** Traditional drone handheld remote controllers, although well-established and widely used, are not a particularly intuitive control method. At the same time, drone pilots normally watch the drone video feed on a smartphone or another small screen attached to the remote. This forces them to constantly shift their visual focus from the drone to the screen and vice-versa. This can be an eye-and-mind-tiring and stressful experience, as the eyes constantly change focus and the mind struggles to merge two different points of view. This paper presents a solution based on Microsoft's HoloLens 2 headset that leverages augmented reality and gesture recognition to make drone piloting easier, more comfortable, and more intuitive. It describes a system for single-handed gesture control that can achieve all maneuvers possible with a traditional remote, including complex motions; a method for tracking a real drone in AR to improve flying beyond line of sight or at distances where the physical drone is hard to see; and the option to display the drone's live video feed in AR, either in first-person-view mode or in context with the environment.

**Keywords:** drones; augmented reality; AR; gesture recognition

#### **1. Introduction**

Over the past decade commercial drones, and quadcopters in particular, have become increasingly popular and affordable. In addition to their use in professional or casual photography, they have grown into a transformative force in diverse sectors, including inspection [1,2], mapping [3,4], exploration [5], human–machine interaction [6], search-and-rescue missions [7,8], and more. More recently, their combination with virtual and augmented reality (VR and AR, respectively) has yielded new experiences such as first-person-view (FPV) flights (www.dji.com/gr/dji-fpv, accessed on 20 December 2021), AR training (vrscout.com/news/dronoss-training-drone-pilots-with-ar/, accessed on 20 December 2021), and mixed reality games (www.dji.com/newsroom/news/edgybees-launches-thefirst-augmented-reality-game-for-dji-drone-users, accessed on 20 December 2021).

Learning to pilot a quadcopter effectively can be a challenging task: conventional remote controllers are largely unintuitive, as they use two joysticks to control flight, with one corresponding to horizontal motions (pitch and roll) and the other to vertical (throttle) and rotational (yaw) motions. Additional wheels and buttons control the drone's camera. While basic motions in relaxed circumstances are achievable with short training sessions, complex motions can be more difficult. Moreover, in challenging or stressful circumstances (e.g., in disaster response or under tight time constraints), the lack of intuitive controls adds cognitive load on the pilot, affecting his/her safety and efficiency.

In addition, the remote controller itself, requiring the continuous use of both hands, can be restrictive. Although alternative remote controllers, such as the DJI motion controller, provide a less cumbersome solution, they cannot support the full range of motions executable with a traditional remote.

Another challenge stems from the difference in position and orientation between the drone and its pilot, which can make it difficult to match the drone's camera feed with the pilot's surroundings. In particular, when the drone is at some distance or out of direct line of sight, it can be challenging both to judge its position and orientation and to understand where its camera is pointing. Moreover, as the video feed is normally displayed on a screen attached to the remote, users must constantly change their focus from the drone (in the air) to the screen (usually held at chest level, hence towards the ground), glancing from one to the other. This can be both tiring (mentally as well as visually) and adds to the cognitive load, as the user alternates between two different points of view. Although VR FPV mode eliminates the change of perspective, it leaves users unaware of their own surroundings, which can be prohibitive in many cases.

In this paper, we propose an AR solution that leverages gesture recognition, computer vision AI, and motion tracking techniques to provide a natural user interface for intuitive drone control and contextualized information visualization. Based on the Microsoft HoloLens 2 headset (https://www.microsoft.com/en-us/hololens/hardware, accessed on 20 December 2021), the presented system allows users to pilot a drone using single-hand gestures, tracks the position of the drone in an AR environment, and provides video feed visualization in either the contextualized or FPV modes. It allows for a comfortable flight, as both hands are free and can be used for other tasks when not directing the drone. AR tracking improves drone visibility at longer distances or in low-light conditions. The visualization of the drone's video feed in the AR display means that users can view both the video and their surroundings without glancing in different directions. In conjunction with tracking, a contextualized video view projects the video in the approximate location and/or direction of its contents, considering the drone's location and orientation, resulting in a mixed reality view that combines the virtual video feed with the real world. The main contributions of this work are as follows:


The rest of the paper is organized as follows: Section 2 discusses earlier work related to the presented solution, in particular regarding drone gesture control, drone tracking, and the visualization of information in AR; Section 3 presents the system's architecture, including hardware, software, communication, and data flow; Section 4 describes the primary technical tasks that were needed to realize the solution; Section 5 discusses the testing of the system, including subjective and objective measurements and feedback; and finally, Section 6 summarizes this work and outlines possible future developments.

#### **2. Related Work**

#### *2.1. Gesture Control*

With the ever-growing accessibility of drones and the range of their potential applications, the users acting as pilots are no longer limited to highly trained professionals. This, in turn, leads to a shift in the design of human–drone interfaces, as there is a need for simplified control methods [9] that provide an easy way to control a drone without extensive training. There are now several approaches that go beyond the standard remote controller with joysticks and allow the user to rely on inherent expression and communication means such as gesture, touch, and speech. These methods are also known as Natural User Interfaces (NUIs) and have the advantage, in some application domains, of feeling intuitive and requiring less training time and cognitive effort. Suárez Fernández et al. [10] implemented and tested a gesture drone control interface utilizing a LeapMotion sensor (https://www.ultraleap.com/product/leap-motion-controller/, accessed on 20 December 2021). At the early stages of our work, we applied a similar approach using LeapMotion, but later abandoned it in favor of more efficient AR interfaces. Herrmann and Schmidt [11] designed and evaluated an NUI for drone piloting based on gestures and speech in AR. Their study results indicate that the designed NUI is not more effective and efficient than a conventional remote controller. However, the selected gesture "alphabet" may have played a crucial role in the study results. For example, their input alphabet included combinations of head and hand movements to perform certain navigation tasks, which many users of the study found confusing. In our approach, although head gestures do exist, they are not combined with hand gestures, and they are only used to perform auxiliary tasks in certain navigation modes. In their survey on human–drone interactions, Tezza and Andujar [9] classify gesture interaction as an intuitive and easy-to-learn control interface but with the disadvantages of high latency and low precision. However, we argue that current gesture recognition technologies are able to overcome these disadvantages.

#### *2.2. Tracking*

Positional or motion tracking is the ability of a device to estimate its position in relation to the environment around it or the world coordinate system. As described by Kleinschmidt et al. [12], there are two fundamental approaches to positional tracking: marker- and non-marker-based methods. Marker-based methods require standard tags such as ARTags and QR codes for the coupling between an agent's camera and the tracked element. In our application both approaches are available, since each has certain pros and cons: our current non-marker-based approach does not require tags but needs a two-step calibration process, whereas marker-based approaches suit time-critical tasks or hostile environments with insufficient time for calibration but can only be used with drones already equipped with compatible tags. Regardless of the approach used, the tracking process is prone to the accumulation of errors and latency, and is in general computationally intensive. Several methods attempt to overcome these challenges. Islam et al. [13] introduced an indoor tracking system with a rotary-laser base station and photo-diode sensors on the tracked objects. The base station scans the space and detects the UAV reliably, with almost negligible error: the position estimate at 5 m deviates by only 13 cm. Arreola et al. [14] proposed a position estimation method using a low-cost GPS and optical flow from the UAV's camera; this approach is available only outdoors, since the system uses GPS measurements. In a hybrid case, Tsai and Zhuang [15] combined optical flow and ultrasonic sensors on the drone and achieved better results than GPS positioning. A similar approach was implemented for real-time pose estimation by Hong et al. [16], combining a Cubature Kalman Filter on IMU readings with an optical-flow-based correction to minimize positioning error.

#### *2.3. Computer Vision for AR*

For successful AR use and application, a detailed understanding of the scene is required, a challenging task that includes multi-sensor fusion, object tracking, and real- and virtual-world registration [17]. Computer vision approaches have been used in AR applications for object detection and tracking [18,19], object pose estimation [20], localization [21], and even gaze-based UAV navigation [22]. In our application, we utilize computer vision techniques for visual drone pose detection [23].

#### *2.4. Situation Awareness and Virtual Information Visualization in AR*

Situation awareness (SA) relates to the perception and comprehension of a dynamic environment. Increased SA is associated with improved performance and efficiency in the completion of complex tasks and operations. According to Endsley's definition [24], SA comprises three main phases: perception of the relevant elements, their relation to the operation goals, and projection of the future states of the operation environment. In the context of disaster response and law enforcement operations, drones can expand SA by providing video and images from advantageous vantage points [25,26]. Leveraging AR visualization and tools can have positive effects on SA, as task-relevant information and feedback can be displayed on the same screen and in the context of real-world elements. The user can easily correlate the augmented information elements with the real world, avoiding additional cognitive load. Earlier work [27,28] has leveraged AR for vehicle operators in maritime applications: in the AR environment, information from maps, radars, and other instruments was fused with the real-world view, and the results indicate improved SA for the operators. Lukosch et al. [29] used AR to support information exchange for operational units in the security domain. Their results showed improved SA for personnel who were not present at the scene.

#### *2.5. Visualization in Context*

Apart from visualizing virtual information with the aim of improved perception and SA, another key element in our application area is live video in contextualized mixed reality. Cameras are nowadays ubiquitous and used for a wide range of applications, quite often with multiple cameras deployed at once. Having multiple views of an observed area, while useful, can also be straining for the observer who, in order to understand the situation, needs to mentally reconstruct the observed area and understand the spatial relations between the different views. To reduce this mental workload, new design approaches should be identified. Brejcha et al. [30] introduced a pipeline for aligning photographs in a reconstructed 3D terrain based on location and rotation metadata included in the photographs. The result offers improved spatial orientation and scene understanding and, overall, an enhanced first-person experience for the viewers. Wang et al. [31] suggested and tested different visualization techniques for embedding videos in a 3D spatial environment. One of their preferred approaches is the billboard video view, in which the video is projected on a rectangle that orients itself to face the user. A modified billboard approach is utilized in our AR application.

#### **3. Architecture**

#### *3.1. Overview*

As stated in Section 1, the main focus of this work is to provide drone pilots with intuitive gesture control, drone tracking in AR, and a video feed displayed in context with the real environment. This section presents the system architecture developed (depicted in Figure 1) to support these functionalities.

The two end-points of the system are the pilot and the drone. The pilot interacts with the system via an autonomous mixed reality headset that provides hand-tracking capabilities along with visualization in AR. A HoloLens application was developed to support gesture acquisition, give feedback to the pilot, display the received video feed and the drone's tracked position in AR, and handle connectivity and common settings. The drone connects with the system via a direct wireless link with its remote controller, which is connected to a smartphone running a custom Android app to support the needed functionalities. These include transmitting telemetry data from the drone to the AR device and control commands from the AR device to the drone. The communication between the AR device and the drone is routed through a message broker, while the video data are streamed via a direct socket connection between the smartphone and the AR device.

**Figure 1.** System architecture and data flow between components.

There are three primary data flows pertinent to the desired functionalities:


The remainder of this section further describes the individual system components and their functionalities, while the next section (Section 4) provides technical details regarding the main components of the solution.

#### *3.2. Hardware*

The main hardware devices used in the proposed solution are the drone and the AR device serving as the user interface, both of which are briefly described below. Additional hardware, such as the smartphone connected to the drone's remote and the computer server hosting the message broker, is generic and needs no further elaboration.

The architecture was developed and tested using DJI's Mavic 2 Enterprise Dual and Mavic Mini drones. The DJI Mobile SDK platform offers high-level control of the aircraft and the camera–gimbal system, low-latency video feed retrieval from the camera, and state information about the aircraft and the controller through various sensors, all of which are essential for development. For outdoor testing, we primarily used the larger Mavic 2 Enterprise Dual drone with its onboard obstacle avoidance system, which was important for safety reasons when implementing gesture control. For indoor testing, the smaller and lighter Mavic Mini was used.

Microsoft's HoloLens 2 was used as the augmented reality platform. HoloLens is a head-mounted system that can display virtual objects in the real world through two see-through holographic displays. It has a plethora of onboard sensors for understanding its surrounding environment (spatial mapping), including four RGB cameras which, combined with a time-of-flight depth sensor, are used to track the user's hands. In addition, two infrared cameras track the user's eyes to optimize object rendering, and a microphone array can be used to issue voice commands. We capture hand-tracking data through the HoloLens interface and translate them into UAV commands to control the drone and the camera. We also use the head-tracking information the HoloLens provides to rotate the drone in accordance with head movement. HoloLens was selected because it provides three features necessary for the presented application: AR visualization, hand tracking, and autonomous operation (i.e., not tethered to a computer).

#### *3.3. Communication*

Information exchange between the different modules of the presented architecture is achieved via the Kafka (https://kafka.apache.org/, accessed on 20 December 2021) message broker. We use Confluent's implementation (https://www.confluent.io/, accessed on 20 December 2021), which includes a built-in REST proxy and a corresponding API; these are necessary since there are no Kafka clients compatible with UWP (Universal Windows Platform) and the HoloLens. The broker's contents are organized into topics, and each topic corresponds to a different functionality of the system (e.g., control commands, positioning info, etc.). The messages are JSON-encoded, which the REST proxy natively supports, and their structure is strictly defined. This not only ensures compatibility between modules but also provides significant modularity, since message types can be added, removed, or modified in future versions as long as they comply with the same JSON structure. Finally, our modules use HTTP operations (POST, GET) to produce and consume the stated messages.
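As an illustration of this pattern, the following minimal Python sketch produces one control-command message through the Confluent REST proxy (v2 API); the proxy address, topic name, and message fields are illustrative assumptions, not the system's actual values.

```python
# Producing a JSON message via the Confluent REST proxy, as used where no
# native Kafka client is available (e.g., UWP/HoloLens).
# PROXY and TOPIC are hypothetical placeholders.
import json
import requests

PROXY = "http://broker.example.com:8082"
TOPIC = "drone-control-commands"

command = {  # illustrative command structure, not the system's exact schema
    "type": "navigation",
    "pitch": 0.25, "roll": 0.0, "yaw": -0.1, "throttle": 0.0,
}

resp = requests.post(
    f"{PROXY}/topics/{TOPIC}",
    headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    data=json.dumps({"records": [{"value": command}]}),
)
resp.raise_for_status()
print(resp.json())  # the proxy reports the partition/offset of each record
```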

#### *3.4. Augmented Reality Application*

The augmented reality app was developed for HoloLens using Unity3D (https://unity.com/, accessed on 20 December 2021) and the Mixed Reality Toolkit (https://github.com/microsoft/MixedRealityToolkit-Unity, accessed on 20 December 2021). The main functional modules of the application are as follows:

- **–** Visualization of the virtual hand joints overlaid on top of the user's hands, making it possible for the user to directly check whether his/her hands are correctly perceived by HoloLens (Figure 4).
- **–** Drone visualization based on the drone tracking module. A virtual drone is overlaid on top of the real drone so that the user is aware of the drone's relative position even if it is not in a direct line of sight (e.g., behind a wall or building) or too far away to be easily visible with the naked eye.


**Figure 2.** First Person View (FPV) into the Augmented Reality environment. Real and virtual drones are shown in the lower right corner.

**Figure 3.** The AR Egocentric View mode. The collinearity between the user and the drone's camera can be seen in the right corner.

**Figure 4.** Hand tracking as visualized in our AR application. Cubes represent hand joints. (The displacement of joints is due to the location of the camera capturing the screenshots; the user of the app sees the virtual joints on top of his/her actual joints.)

#### *3.5. UAV Interface Application*

An application was developed to act as a communication bridge between the DJI drone and the rest of the system architecture. We chose the Android platform since it was the most mature and versatile Mobile SDK platform DJI had to offer at the time of development. The app is installed on an Android smartphone connected to the remote controller via a USB cable. The drone connects to the remote controller via DJI's proprietary communication protocol, OcuSync, while the smartphone connects to the rest of the system via WiFi or a commercial mobile network. The app's primary responsibilities are:


The application, depicted in Figure 5, displays practical information (e.g., the battery status of the aircraft and of the remote controller), along with a live view of the drone's camera feed. Buttons are also present to perform various actions. The "Configurator" button opens a settings page, shown in the left pane of Figure 5, where several parameters can be configured, including connection settings, the sensitivity of the drone to gesture commands, and infrared visualization options.

**Figure 5.** The Android UAV interface app. Left: the app's configurator page, with settings regarding connectivity, gesture command sensitivity, and infrared view display. Right (rotated): the app's main view, showing a live view of the drone's camera feed.

#### **4. Methodology**

The realization of the presented solution can be broken down into five tasks, each regarding a different aspect of the gesture control or the AR visualization pipeline:


The rest of this section describes in detail each of these five tasks, presenting the methodologies followed, noting weaknesses, and outlining possible future improvements.

#### *4.1. Gesture Definition*

#### 4.1.1. Requirements and Considerations

To be a viable alternative to a regular handheld remote, a drone control system should allow users to perform the same motions: pitch, roll, yaw, throttle, and their combinations, as well as camera motions. Similarly, it should allow users to define the sensitivity of each command and, by extension, the amplitude of the corresponding motion.

When designing Natural User Interfaces (NUIs), such as gesture control, some additional requirements must be taken into account to achieve intuitive interaction. Peshkova et al. [33] explained that the gestures should be related to each other, following a single metaphor. The same study encourages multi-modal interaction, such as a combination of gesture and speech commands, to increase naturalness. Herrmann and Schmidt [11] highlighted additional requirements, including the avoidance of non-ergonomic positions (as described in [34]) and the presence of some form of feedback. In an AR environment, several forms of feedback can be integrated into the user's real-world view as overlaid virtual objects.

Hence, UAV gesture control has three primary aims:


#### 4.1.2. Gesture Vocabulary

In a previous user study and survey [32], we tested two sets of gesture vocabularies, a finger-based and a palm-based vocabulary, in a simulated drone navigation environment. Users were required to navigate a virtual drone in an outdoor and an indoor environment. In the finger-based mode, each gesture is defined by which fingers are extended, and the velocity value is defined by the pointing direction of a selected extended finger. In the palm-based mode, the operator uses the position and orientation of his/her open palm (all fingers extended) to control the drone. The subjective study results showed a clear user preference for the palm-based control, which was described as more comfortable and easier to learn and use. The objective metrics showed faster task completion using the palm-based controls, although finger-based control offered more precise navigation. The overall objective score showed that the palm-based control performs slightly better.

The study results are aligned with findings in [33] that intuitive NUIs should be based on a common metaphor. In this case, the metaphor chosen is that the drone corresponds to the user's hand, and hence mimics and follows all its movements. Therefore, raising the hand up higher will correspond to an ascend (increase throttle) command; a tilt of the hand will correspond to the drone assuming a similar angle (pitch and/or roll); a horizontal rotation will correspond to yaw; and so on. This metaphor, validated by the user study, addresses the first aim of gesture control.

The second aim of gesture control is to avoid non-ergonomic, physically stressing hand gestures. Even easy-to-perform gestures can be tiring when performed repeatedly, or may oblige users to hold their hand/arm steady at the same height for a long time. To avoid the latter, we applied a relative gesture vocabulary based on a varying reference position. One key gesture is used as a reset and calibration command: when the user performs it, the current hand pose is set as the neutral (resting) position, and all following gestures are interpreted relative to this. This allows users to choose the resting position that is most comfortable to them and even to define new resting positions in preparation for certain maneuvers.

Hence, a user wishing to command the drone to ascend may define a low resting position, allowing her to move her hand higher with no discomfort. Later, in preparation for a descent, a higher resting position may be chosen so that the user can lower her hand easily. The reset and calibration command is tied to the user opening the palm and extending the fingers. Hence, after a brake and stop command (closed fist), a new resting position is always defined before the user issues a new command. While the hand is closed and no commands are issued, users can position their hand in preparation for the next gestures safely.

This palm-based gesture set was expanded in later work to include take-off and landing gestures [35]. Similar to a handheld remote, these gestures are mapped to hand positions assumed and held over a couple of seconds. To differentiate them from the other UAV control commands, take-off and landing use an upwards-facing palm. The full palm gesture vocabulary used for drone navigation is shown in Figure 6. Moreover, the user's other hand can be used to control the camera gimbal, where the same set of gestures are mapped to camera pitch and yaw.

**Figure 6.** The palm gesture vocabulary used for drone navigation.

It can be noted that the selected gesture set can be used to perform any UAV motion achievable with the traditional handheld remote. Velocities of UAV commands are mapped directly to the angles or distances relative to the resting position, allowing for continuous control ranging from no motion at all to fast motion. Combination motions, such as pitching forward while yawing a little to the left and slowly descending, are easily achievable. This follows because the human hand, like all real-world objects, has six degrees of freedom (translation and rotation along three axes), while a quadcopter's control has four (pitch, roll, yaw, and throttle); the movements are therefore independent of each other and can be freely combined. Naturally, the same holds for the camera-controlling hand. Therefore, the third aim of gesture control is addressed.

#### 4.1.3. Drone Control Features

The drone control module was implemented following an agile development process: starting from a set of basic features, continuous testing and user feedback led to the refinement of the user requirements. Initially, the *hand gesture drone control* was implemented; subsequently, three additional features were added: *hand gesture camera control*, *periscope* mode, and high-level *voice control* commands.

The primary task was to accurately navigate the drone using the selected palm gestures. It should also be noted that when no gesture is performed, or when the user's hand is not visible, no command is sent and the drone stops. Inside the AR application, visual cues were displayed to provide feedback and facilitate the navigation task. Drone navigation is performed with the user's right hand by default; however, this is configurable.

In the hand gesture camera control, the unassigned user hand, by default the left hand, is used for controlling the view direction of the drone's camera. This feature controls the camera gimbal, which can perform 2DoF movements: rotation up/down (pitch) and rotation right/left (yaw). The gesture vocabulary used for camera control is similar to the vocabulary for drone navigation, but with fewer defined gestures, since the camera's possible movements are fewer: by tilting the open palm left or right, the camera turns in the corresponding direction about the vertical axis, while by tilting the palm up or down, the camera rotates in the corresponding direction about the horizontal axis.

Periscope mode is an additional form of gesture control that does not involve the use of hands but is based on the user's head direction. When Periscope mode is enabled, the drone's viewing direction follows the user's head direction in real time. For example, when the user rotates his/her head to look 90° east, the drone rotates to exactly the same direction. Periscope is an auxiliary feature allowing for quick and intuitive inspection of the surrounding environment and has received very positive reviews from the end-users.

Finally, voice commands are used to perform high-level commands such as resetting the camera direction. Voice commands are completely optional, as all commands can alternatively be issued via buttons and controls in the virtual application menu. However, they offer a multi-modal means of interaction, allowing users to keep their hands on drone control or another task while dictating the desired command by speech.

#### *4.2. Gesture Acquisition*

#### 4.2.1. Background

Our early research [32] into gesture acquisition for drone control considered two different hardware peripherals: the AvatarVR glove (https://avatarvr.es/, accessed on 20 December 2021), a wearable with embedded IMU and touch sensors, and the LeapMotion controller (https://developer.leapmotion.com/, accessed on 20 December 2021), an infrared LED and camera device. The former was found to falsely report finger curl when rotating the hand, which made it incompatible with the open-hand metaphor described in the previous subsection; hence, it was discarded for the current application. The LeapMotion controller performed well in a lab setting, but the lack of wireless portability (as it needs a USB connection) and its inability to track hands in a sunlit outdoor environment were important limiting factors. However, it served well for the initial development phase of the solution, as it includes a Unity API. Most of the gesture recognition and interpretation functions were based on the LeapMotion API and later adapted for the HoloLens.

Google's MediaPipe Hands [36] was also considered for purely RGB-camera-based gesture acquisition. This has proved robust in a wide range of lighting conditions, making it suitable both indoors and outdoors. However, as it is compatible with Unix and Android systems only, the integration of this modality into the HoloLens has not been realized. It has been used with a webcam for training on a computer simulator, and it could be considered for future developments on different AR devices or standalone gesture control applications not including AR visualization.

#### 4.2.2. Gesture Acquisition and Interpretation in HoloLens 2

Gesture recognition for the palm-based control in HoloLens 2 utilized the MRTK Unity3D plugin and, more specifically, the hand-tracking API. The MRTK hand-tracking API did not provide (at least at the time of development) high-level functions for recognizing extended fingers or an open/closed palm. The main values provided are handedness (left or right hand), joint transform (the position and orientation in 3D space), and joint type (e.g., index tip, index distal, metacarpals, etc.). As a result, the development of palm navigation had to start by implementing a Gesture Recognizer component able to detect an open palm gesture in real-life conditions, where fingers are quite often occluded or not completely straight. The gesture recognizer component is based on the work provided by Rob Jellinghaus (https://github.com/RobJellinghaus/MRTK\_HL2\_HandPose/, accessed on 20 December 2021), but is heavily modified to the needs of our use case. The code takes into account finger–eye alignment and collinearity between finger pairs to cope with finger occlusions. It also uses ratios between different parts of the hand instead of absolute measurements, so as to handle different hand sizes. Since in the palm-based control we are only interested in two classes of gestures, open/extended palm and closed palm, the gesture detector does not have to be very precise in recognizing gesture details. For that reason, the gesture classification criteria are modified to match the needs of the application: the criterion for an open palm classification is the detection of an extended thumb and at least three extended fingers. This way, the recognizer is more sensitive in recognizing an open palm gesture, which is the main state of interest.
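As a minimal illustration of this classification rule (expressed here in Python rather than the app's Unity/C# code, with the joint-level extension heuristics abstracted away):

```python
# Sketch of the open-palm criterion described above: an extended thumb plus
# at least three extended fingers. The per-finger extension tests (ratios,
# finger-eye alignment, collinearity) are abstracted behind the input dict.
def classify_palm(extended):
    """extended: finger name -> bool, as produced by the joint heuristics."""
    fingers = ("index", "middle", "ring", "pinky")
    if extended.get("thumb") and sum(extended.get(f, False) for f in fingers) >= 3:
        return "open"
    return "closed"

# e.g., an occluded ring finger still classifies as an open palm
print(classify_palm({"thumb": True, "index": True, "middle": True,
                     "ring": False, "pinky": True}))
```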

When the Gesture Recognizer module detects an open palm, the hand joints' information (position and rotation) is processed by the Palm Navigation module that calculates the navigation commands (pitch, yaw, roll, throttle, and take off/landing) and their amplitude (or velocity values) based on the hand joints' current and past rotation angles. As mentioned in Section 4.1.2, in order to avoid physical stress, the reference null position can change every time the user performs the reset command (closed palm gesture). The next open palm position after the reset command corresponds to the reference position and the joint angles and position at that point are stored as the neutral (or zero) transforms for the future commands.

In particular, three components of the hand's position are relevant: the hand's Up normal vector, perpendicular to the palm; the hand's Forward normal vector, perpendicular to Up and facing forward, along the direction of the middle finger; and the hand's vertical position. An overview of the vectors and their correspondence with the drone control values is shown in Figure 7. Interpretation is relative to the reference hand position. The current position's normals are translated to the y-axis and analyzed into components. Pitch is computed from the angle of the Forward vector to the xz-plane. Yaw is tied to the angle of the Forward vector's xz-component to the x-axis. Roll is computed from the angle of the Up vector's yz-component to the y-axis. Throttle is proportional to the vertical (y-axis) distance between the current position and the reference. All commands allow for a "neutral zone", so that minor or involuntary motions are not translated into a drone command.

**Figure 7.** Palm vectors and angles used to interpret gesture commands. Interpretation is always relative to the reference position, whose normal vectors form the axes x, y, and z. The current position's normals are translated to the y-axis and analyzed into components. Pitch, roll, yaw, and throttle values are then computed according to the angles and distances between these component vectors and the reference axes.
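A simplified numerical sketch of this interpretation step is given below (in Python with NumPy rather than the app's Unity code). The hand normals are assumed to be already expressed in the reference frame, and the neutral-zone thresholds are illustrative choices, not the application's actual values.

```python
# Mapping palm vectors to (pitch, yaw, roll, throttle), following the
# geometric description above. Reference frame: x = reference forward,
# y = vertical, z = right. Thresholds are illustrative.
import numpy as np

DEAD_ZONE_DEG = 8.0   # neutral zone for angular commands (illustrative)
DEAD_ZONE_M = 0.03    # neutral zone for vertical motion (illustrative)

def dead_zone(value, threshold):
    """Suppress minor or involuntary motions."""
    return 0.0 if abs(value) < threshold else value

def interpret(forward, up, height, ref_height):
    """forward, up: unit normal vectors of the palm in the reference frame."""
    forward = np.asarray(forward, dtype=float)
    up = np.asarray(up, dtype=float)
    pitch = np.degrees(np.arcsin(np.clip(forward[1], -1.0, 1.0)))  # vs xz-plane
    yaw = np.degrees(np.arctan2(forward[2], forward[0]))  # xz-component vs x-axis
    roll = np.degrees(np.arctan2(up[2], up[1]))           # yz-component vs y-axis
    throttle = height - ref_height                        # vertical offset (m)
    return (dead_zone(pitch, DEAD_ZONE_DEG), dead_zone(yaw, DEAD_ZONE_DEG),
            dead_zone(roll, DEAD_ZONE_DEG), dead_zone(throttle, DEAD_ZONE_M))

# e.g., hand tilted 15 degrees forward at the reference height -> pitch only
print(interpret((np.cos(np.radians(-15)), np.sin(np.radians(-15)), 0.0),
                (np.sin(np.radians(15)), np.cos(np.radians(15)), 0.0),
                1.10, 1.10))
```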

#### *4.3. Drone Position Tracking in AR*

Drone tracking is used to monitor the physical location of the drone during a flight. In our case, this information is utilized in two different ways: to assist in monitoring and piloting the drone, via a virtual drone marker in the AR environment, when it is out of line of sight or too distant to distinguish clearly; and to position the AR video canvas according to the drone's location and orientation, and hence in context with the real environment.

Drones are commonly equipped with two positioning mechanisms: GPS and IMU. GPS provides an approximate location based on satellite communication. However, it presents two major drawbacks: it is only available outdoors, and its precision is limited, ranging from 2.5 m to 10 m or more [14], which can be inadequate for flights requiring fine positioning. Hence, the presented system relies on IMU readings. The DJI SDK offers no access to low-level IMU measurements; instead, it combines the readings from multiple sensors, including magnetometer, gyroscope, and accelerometer measurements, to report the estimated velocity of the drone at regular intervals. Velocity is reported in a North–East–Down (NED) coordinate system, with an origin point at the drone's starting location. The reported velocity measurements *V* are collected by the Android UAV interface app, multiplied by the interval time *T* to yield translations in distance, and summed to provide the estimated position of the drone *D* at time *t*, relative to its starting point:

$$D_t = \sum_{i=0}^{t} \Delta D_i = \sum_{i=0}^{t} V_i\,T_i \tag{1}$$

The above assumes that velocities remain constant between reports, which is not the case when performing maneuvers. However, with a 100 ms reporting period, these errors are negligible. A more significant drawback of this procedure is the low precision (two decimal places) at which velocities are reported by the SDK. This makes for an average error of 0.005 m/s. Naturally, this becomes more significant at lower velocities, where it can account for up to 16% of the real velocity. Even without compensation, the position as derived by aggregated IMU readings is more accurate than GPS over short periods of time and short distances. With the error-correcting compensation described in Section 5, the accuracy improves significantly and is adequate for both locating and guiding the drone, perhaps excluding very precise maneuvering in tight spaces. For longer distances, GPS readings can be used correctively to ensure that the aggregated IMU drift is capped to the GPS precision.
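A minimal sketch of this dead-reckoning step (Equation (1)) follows; the velocity samples are synthetic placeholders, rounded to two decimals as the SDK reports them, and the 100 ms period matches the reporting interval mentioned above.

```python
# Dead reckoning per Equation (1): integrate reported NED velocities over
# the reporting period to estimate position relative to the start point.
REPORT_PERIOD_S = 0.1

def integrate(velocities, period=REPORT_PERIOD_S):
    """velocities: iterable of (vN, vE, vD) tuples in m/s (NED frame).
    Returns the running position estimate after each velocity report."""
    pos = [0.0, 0.0, 0.0]
    track = []
    for v in velocities:
        for axis in range(3):
            pos[axis] += v[axis] * period  # Delta D_i = V_i * T_i
        track.append(tuple(pos))
    return track

# e.g., 3 s of steady 1.20 m/s northward flight -> about 3.6 m displacement
samples = [(1.20, 0.00, 0.00)] * 30
print(integrate(samples)[-1])  # (3.6..., 0.0, 0.0)
```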

The application posts the position-related data, in JSON format, to the Kafka broker. The post frequency is four times per second, to limit network and computational load. On the client side, the HoloLens application connects to the Kafka broker and consumes these messages from a specific topic. Based on the message information, it renders a virtual drone-like object into the augmented reality environment, overlaying the position of the real drone. In order to achieve this coupling, a prior calibration process is required to align the HoloLens's internal coordinate system with that of the drone.

#### *4.4. Calibration of the AR Environment*

Tracking a real object in augmented reality and visualizing it in context with the real environment requires a correspondence between two different coordinate systems: one based in the real world and one based in the virtual world, which will be superimposed on the real world to provide the AR graphics. As outlined in Section 4.3, the drone's position is continuously estimated based on aggregated IMU readings and its heading is supplied directly by its onboard compass. Hence, the calculated pose is expressed in relation to a system of coordinates with the drone's starting location as an origin point and its y-axis aligned with the north.

The HoloLens does not use its internal compass to orient its internal coordinate system, and the compass readings are not readily available to Unity apps. Therefore, the AR elements, including the virtual drone marker and the video panel, must be expressed in relation to the internal coordinate system, with an origin and y-axis direction equal to that at the user's location and heading, respectively, at the time of launching the application.

In order to link the two coordinate systems and place the virtual drone marker in the corresponding position of the physical drone and, by extension, for the video panel to be displayed correctly in context with the environment, a calibration procedure should be performed. The calibration process aims to calculate the relative translation and rotation between the two systems. Even though both real and virtual objects are mobile in 3D, with six DoF, the drone's altitude is tracked and reported independently via its altimeter. Hence, this can be viewed as a 2D problem of axis rotation and translation.

Keeping elementary linear algebra in mind [37], for two 2D coordinate systems with the same alignment (no rotation) and an offset of $x'_0$, $y'_0$, if $x$, $y$ are the coordinates of a point in one system, then its coordinates in the translated system will be:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} x'_0 \\ y'_0 \end{bmatrix} \tag{2}$$

Meanwhile, for two 2D coordinate systems with the same origin point (no translation) and a rotation of *φ*, if *x*, *y* are the coordinates of a point in one system, then its coordinates in the rotated system will be:

$$
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(\phi) & \sin(\phi) \\ -\sin(\phi) & \cos(\phi) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{3}
$$

In the generic case of both translation and rotation, we have:

$$
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(\phi) & \sin(\phi) \\ -\sin(\phi) & \cos(\phi) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} x'_0 \\ y'_0 \end{bmatrix} \tag{4}
$$

Hence, using the real-world drone and its virtual marker in the AR coordinate system as the common point, we can link the two coordinate systems following a calibration procedure. The following text describes both the default, manual calibration method and two vision-based alternatives.

#### 4.4.1. Manual Two-Step Calibration

The manual two-step calibration method relies on users performing the HoloLens's "air tap" gesture to mark the drone's location to the HoloLens app.

In the generic case, the HoloLens app will launch with the user some distance away from the drone and facing a random direction, forming an angle *φ* with the north. Figure 8 shows the two different coordinate systems and the devices' starting locations in different colors. In order to translate coordinates from one system to the other, we need to calculate the coordinates of the origin of one with respect to the other (the translation vector) and the angle *φ*, which forms the rotation matrix.

During the first step of calibration, the drone has not yet moved, and the user air-taps its present (starting) position, as shown in Figure 8 on the left. That position's HoloLens coordinates $(x_0^H, y_0^H)$ are then captured and stored, and they form the translation vector.

**Figure 8.** Axis translation and rotation via the two-step manual calibration process. Left: with the first tap on the drone, the translation offset is captured by the HoloLens. Right: after moving the drone a small distance, the second tap captures the drone's position in both coordinate systems (HoloLens and Drone), allowing the calculation of rotation angle *φ*.

To capture the rotation angle, the drone is moved from its starting position. The new position's HoloLens coordinates $(x_1^H, y_1^H)$ are captured with a second air-tap. At the same time, the new position's Drone coordinates $(x_1^D, y_1^D)$ are calculated based on the aggregated IMU readings. Figure 8, on the right, shows this second step. Hence, Equation (4) yields:

$$
\begin{bmatrix} x_1^H \\ y_1^H \end{bmatrix} = \begin{bmatrix} \cos(\phi) & \sin(\phi) \\ -\sin(\phi) & \cos(\phi) \end{bmatrix} \begin{bmatrix} x_1^D \\ y_1^D \end{bmatrix} + \begin{bmatrix} x_0^H \\ y_0^H \end{bmatrix} \tag{5}
$$

The only unknown being *φ*, the above can yield:

$$\begin{aligned} \cos(\phi) &= \frac{x_1^H - \sin(\phi)\,y_1^D - x_0^H}{x_1^D} \\ \sin(\phi) &= \frac{x_1^H y_1^D - x_1^D y_1^H - x_0^H y_1^D + x_1^D y_0^H}{(x_1^D)^2 + (y_1^D)^2} \end{aligned} \tag{6}$$

Therefore, Equation (6) can be used to obtain the rotation angle *φ*, given any non-zero displacement $(x_1^D, y_1^D)$. Knowing both translation and rotation, any future drone-system coordinates can be translated to HoloLens-system coordinates to allow visualization in AR.
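The calibration math above is compact enough to show in code. Below is a Python sketch of Equations (4)-(6) under the stated conventions (the production code lives in the Unity app; the function names are ours). The cosine is recovered symmetrically from both linear equations rather than via the division by $x_1^D$ shown in Equation (6), which avoids a division by zero for purely lateral displacements.

```python
import math

def calibration_angle(xH0, yH0, xH1, yH1, xD1, yD1):
    """Solve the two-tap calibration for the rotation angle phi (Eq. (6)).

    (xH0, yH0): drone start position in HoloLens coordinates (first tap).
    (xH1, yH1): drone's new position in HoloLens coordinates (second tap).
    (xD1, yD1): the same position in Drone coordinates (IMU aggregation).
    """
    d2 = xD1**2 + yD1**2  # requires a non-zero displacement
    sin_phi = (xH1*yD1 - xD1*yH1 - xH0*yD1 + xD1*yH0) / d2
    cos_phi = ((xH1 - xH0)*xD1 + (yH1 - yH0)*yD1) / d2
    return math.atan2(sin_phi, cos_phi)

def drone_to_hololens(xD, yD, phi, xH0, yH0):
    """Map Drone-system coordinates into HoloLens coordinates (Eq. (4))."""
    xH = math.cos(phi)*xD + math.sin(phi)*yD + xH0
    yH = -math.sin(phi)*xD + math.cos(phi)*yD + yH0
    return xH, yH
```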

#### 4.4.2. Visual Drone Pose Estimation

Instead of relying on user input (air-taps), a drone's relative pose (position and orientation) may be inferred from visual data captured live by the HoloLens's onboard camera. To that end, we propose a method that exploits the latest advances in deep learning to automatically retrieve the 6DoF pose of a drone from a single image. An early version of this architecture, described in more detail in [23], uses a CNN encoder backbone followed by three fully connected layers that output two predictions, one for the translation components and one for the rotation. The translation prediction uses an $L_2$ loss, while the rotation prediction aims to minimize the angular difference $L_R$. The two loss terms are weighted by a balancing hyperparameter *λ* and combined to form the training loss function.
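As an illustration of this two-term objective (our own sketch, not the network's actual training code), the combined loss might look as follows, with rotations expressed as unit quaternions and the arccos form standing in for the paper's $L_R$:

```python
import numpy as np

def pose_loss(t_pred, t_true, q_pred, q_true, lam=1.0):
    """Two-term pose loss: L2 on translation plus angular distance on rotation.

    t_pred, t_true: predicted and ground-truth translation vectors.
    q_pred, q_true: predicted and ground-truth rotations as unit quaternions.
    lam: the balancing hyperparameter (lambda in the text, value assumed).
    """
    l_t = np.linalg.norm(np.asarray(t_pred) - np.asarray(t_true))  # L2 loss
    dot = np.clip(abs(np.dot(q_pred, q_true)), 0.0, 1.0)
    l_r = 2.0 * np.arccos(dot)  # angle between the two rotations, in radians
    return l_t + lam * l_r
```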

The newest update employs a state-of-the-art architecture, HRNet [38], as a landmark regression network to predict the eight landmark image positions. Then, we retrieve the pose using the predicted 2D–3D correspondences. However, this approach does not permit end-to-end training, as the pose retrieval step is not differentiable. To address this, we exploited the recently introduced BPnP algorithm [39], which has proven effective for the pose estimation task [40].

For training our models and evaluating their performance, we have compiled and made publicly available the UAVA dataset (also in [23]), an extensive set of synthetic, photorealistic images of drones in indoor and outdoor environments, which provides ground truth annotations of 2D keypoints, pose, depth maps, normal maps, and more.

Accurate visual drone pose estimation can automate and simplify the calibration process, as both distance (step 1) and orientation (step 2) can be estimated from a single image, automatically, without user input. Visual recognition of the drone and its pose by the HoloLens is robust at distances of up to 5–10 m, depending on the drone model and size. In addition, this method can be used to periodically correct the position computed from the aggregated IMU readings, offsetting IMU drift and any other errors. With the current HoloLens specifications, such an algorithm cannot run continuously in real time. However, even sparse inferences, performed every 5–15 s, can keep IMU drift at a minimum and improve tracking.

#### 4.4.3. QR Code Reading

An alternative to the visual estimation of the drone's pose is to rely on QR codes pasted on the physical drones. The HoloLens provides a built-in capability to detect and read QR codes robustly. Hence, by pasting a QR code on top of the drone, the HoloLens can detect its starting position (analogous to step 1 in the manual calibration process) automatically, with no need for human intervention. While the air-tap determines a single point and hence provides only position information, the QR code is an object, and its detection can provide both position and orientation. Hence, in conjunction with the heading provided by the drone's compass, both steps of the manual calibration process can be completed in a single QR code read.

Naturally, a QR code pasted on a drone's top is easily readable at short distances (perhaps a maximum of 2 m), i.e., while the drone is landed and the user stands close to it. Hence, while this method can provide fast and automated calibration, it cannot be used for correcting the drone's position during flight and offsetting IMU drift, as is the case with Section 4.4.2.

#### *4.5. Video Transmission and Visualization*

When implementing the live streaming feature of the drone's camera video feed to HoloLens 2, we had to consider specific requirements regarding latency and frame rate. In order for the video stream to be usable for piloting, it had to be low in latency and high in frame rate. Overall latency is the sum of two values: the latency from the drone to the remote controller, which depends on the drone manufacturer and the streaming protocols used, and the latency from the remote controller to the HoloLens, via the UAV interface app, which depends on the presented solution. In this case, DJI uses OcuSync for video transmission, which adds a nominal latency of 120 to 270 ms. The following text focuses on the latency between the remote controller and the HoloLens.

Ideally, the overall latency should be low enough to allow easy and safe control based solely on the video feed (e.g., in FPV mode) and frame rate should be high enough for the video to be viewable as a continuous stream. Therefore, we set targets for <350 ms overall latency (hence 120 ms for the Android-to-HoloLens latency) and >15 frames per second (FPS).

#### 4.5.1. Video Streaming

Several methods and streaming protocols were considered for the transmission of live video to the HoloLens, including Real-Time Messaging Protocol (RTMP), Real-time streaming protocol (RTSP), sending frames as individual messages through Kafka, and direct connection via network sockets.

RTMP is a Flash-based video streaming protocol that routes the video stream through a server and is supported natively by the DJI SDK. However, RTMP induces high latency, ranging from one to several seconds, depending on the network conditions and on server location and capabilities. This makes RTMP a good choice for non-reactive video watching (e.g., for a spectator) but unsuitable for remote control. RTSP and HLS (HTTP Live Streaming) exhibit similar behavior, and hence they were discarded for use in the presented solution.

Sending video frames through Kafka was implemented in an effort to simplify the architecture by routing all data through the same hub (Kafka). Two flavors were tested: sending encoded frames directly from the UAV interface app to the HoloLens, and decoding the frames first on Android and then forwarding the decoded data to the HoloLens. Naturally, the first flavor minimizes bandwidth requirements, while the second minimizes the processing strain on the HoloLens. On a local (Wi-Fi) network, bandwidth is less of a concern, and latency was indeed reduced when streaming decoded frames. However, the frame rate proved too low (about 11 FPS) in both cases. This is attributed to the HoloLens using a REST API to connect to Kafka (as there are no clients available for UWP), which induces an overhead for each message request. Therefore, this approach was also discarded.

The solution finally selected made use of network sockets. This type of connection is point-to-point, meaning that once the connection is established, no additional overhead is required for the flow of data. With this implementation, in contrast to the two previous methods, there is no intermediate server between the drone and the HoloLens. The Android device running the UAV interface app acts as the server, with the HoloLens being the client. To test this approach, we connected the drone and the HoloLens on the same local network to alleviate network bottlenecks. We measured the Android-to-HoloLens latency at below 100 ms and the frame rate at exactly 30 FPS, which is the native frame rate of the drone's camera. Since both requirements were met, this method is the most appropriate for piloting the drone through the live video feed from the camera.

#### 4.5.2. Video Decoding

On the HoloLens side, the decoding process starts once the connection has been established. For the decoding task, we implemented a dedicated library, named Decoder, using the FFmpeg tool to handle video frames. The library has been implemented as a dynamic-link library for the Microsoft Windows operating system and built for the UWP architecture compatible with HoloLens 2.

The Android app feeds the AR app with H.264-encoded frames. In the AR application, decoding and visualization of the received video frames are handled in two background processing threads. The first thread runs the Decoder module, while the second thread is responsible for frame visualization. Decoding is performed at a 30 Hz (native) rate, while rendering runs at 15 Hz for performance reasons and efficient management of computational resources. Rendered frames are displayed on a virtual projection panel. In contextualized video mode, this panel is placed in front of the drone, in the direction its camera is facing, while in FPV mode, it is centered in the user's field of view.
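The two-thread, two-rate structure can be sketched as a simple producer-consumer pair. The sketch below is in Python for brevity (the actual app is a Unity/C# application using the FFmpeg-based Decoder DLL), and the `next_encoded_frame`, `decode`, and `display` callables are hypothetical placeholders.

```python
import queue
import threading
import time

frame_queue = queue.Queue(maxsize=4)  # small buffer between the two threads

def decoder_loop(next_encoded_frame, decode):
    """First thread: decode at the stream's native 30 Hz rate."""
    while True:
        frame = decode(next_encoded_frame())  # decode() stands in for the Decoder module
        try:
            frame_queue.put_nowait(frame)
        except queue.Full:
            pass  # drop a frame rather than stall the decoder

def render_loop(display):
    """Second thread: render at 15 Hz to conserve computational resources."""
    while True:
        try:
            display(frame_queue.get_nowait())
        except queue.Empty:
            pass
        time.sleep(1 / 15)

# threading.Thread(target=decoder_loop, args=(source, codec), daemon=True).start()
# threading.Thread(target=render_loop, args=(panel,), daemon=True).start()
```

Decoupling the two rates through a bounded queue lets the decoder keep pace with the stream while the renderer draws only every other frame.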

During development, different methods of video streaming were implemented and tried (as shown in Figure 9). These included RTMP streaming (top left), sending video frames through Kafka (top right), and a direct connection using web sockets (bottom). The latter was finally selected, as it yields the lowest latency and requires no intermediaries.

**Figure 9.** Diagrams of different streaming methods.

#### **5. Testing and Evaluation**

The presented solution was tested both subjectively and objectively. Subjective tests included a pre-development user study to define the gesture set used for drone control and a number of field trials where users had a chance to try the system in action. Objective measurements concerned the response time between gesture execution and drone reaction, the IMU-based tracking error, and the latency of video transmission. Each of these is presented in the following subsections.

#### *5.1. Subjective User Evaluation*

#### 5.1.1. Gesture Usability

The gestures and underlying metaphor were selected in a pre-development user study combining subjective and objective measurements. Two gesture sets were considered: a palm-based set, where the drone mimics the orientation of the user's palm, and a finger-based set with no common underlying metaphor, where three different commands (pitch, yaw, and throttle) were mapped to distinct gestures, differentiated by which fingers are extended. The study aimed to first define the finger-based gesture set and then measure user preference and performance across the two sets: the mutually exclusive, rigidly defined finger-based gestures, and the freer, combinable palm-based gestures.

The study, conducted between December 2019 and February 2020, included two parts. The first part aimed to define the finger-based gesture set. Based on a questionnaire completed by 29 participants, we selected the finger-based gestures voted both comfortable and intuitively appropriate and mapped them to the three drone control commands.

The second part employed a drone flight simulator developed in Unity, coupled with a gesture acquisition peripheral and gesture interpretation software. It aimed at measuring the ease of learning, usability, and comfort of each control mode, as well as performance in terms of mission completion time and collision avoidance. An outdoor and an indoor scene were prepared for testing. Participants, including 27 men and 12 women with varying degrees of familiarity with drone control, were asked to control a virtual drone in the simulator and pass through specific waypoints while avoiding collision with walls. Each participant tried both modes (palm- and finger-based) and both types of scenes (indoor and outdoor). Both objective and subjective results were obtained: the former by measuring timings and collisions in the simulator itself, and the latter via a questionnaire.

An overview of the results can be seen in Figure 10. The results showed an overall preference for palm-based control: both faster timings and fewer collisions, and a subjective evaluation of it being easier to learn and more comfortable to use. In contrast, finger-based control was generally slower but could be more precise, and hence useful in very tight environments (e.g., indoors). For more information and further results, the interested reader is referred to the full user study publication [32].

**Figure 10.** Overview of the gesture selection user study results.

#### 5.1.2. Field Trials

Post-development, the complete system has been demonstrated to both drone pilots and non-expert users. While these evaluations are still ongoing, early results are being used to guide adjustments and improvements to the system. In these events, the participants are first responders interested in using drones for search-and-rescue operations. Such missions are often stressful, highlighting the need for more comfortable and intuitive drone control and a video feed that is easily viewed and placed in context. Early demonstration and feedback sessions have been held in Japan (July 2021—9 participants), Greece (October 2021—6 participants), and France (November 2021—12 participants), with additional events planned for Italy, Finland, and Spain in the first half of 2022.

Initial feedback from these events indicates that gesture control with a real drone is easy for inexpert users to learn, comfortable, and allows full control, including any maneuver possible with a handheld remote. One setback concerned the use of the HoloLens in very bright sunlight or on hot days: both conditions degraded the HoloLens's hand-tracking performance, and the heat caused the device to shut down after a few minutes. This is expected, as the HoloLens employs passive cooling and is designed primarily for indoor use. However, under non-extreme heat and sunlight conditions, performance has been consistent and robust.

Two experienced drone pilots taking part in the demonstrations stated that they would feel comfortable using gesture control, provided conditions did not degrade the robustness of the HoloLens's hand tracking. The periscope mode, which keeps the drone motionless while tying its yaw to that of the HoloLens, has proven particularly popular and useful for scanning an area.

Regarding drone tracking in AR, feedback was mixed: the manual calibration process, in particular, often proves too difficult and complicated for users. This feedback has driven a turn towards an easier, largely automated, vision-based calibration process relying on AI visual pose estimation and/or QR codes, as described in Section 4.4. The new calibration process will be tested and evaluated in upcoming demonstration events.

Concerning visualization, users preferred the FPV mode over the contextualized view mode. This has been partly due to the difficult calibration process, which is a prerequisite for drone tracking and hence for the contextualized display. In addition, the size of the contextualized display was sometimes deemed too small. Based on this feedback, future versions will feature a larger canvas on which to display the video feed, as well as the option to scale this canvas using gestures on the camera-controlling hand.

#### *5.2. Lab Testing and Objective Measurements*

#### 5.2.1. Drone Responsiveness

When testing alternative drone control methods, an important consideration is the response time of the drone. In the presented solution, this is the latency between the user performing a gesture and the drone responding to this command. In the presented architecture, commands are transmitted from the HoloLens to the UAV interface app via a Kafka broker on the Internet or a local network. The location of the broker, as well as network conditions, can impact drone response latency.

To measure this latency, a high-speed camera was used to capture both the user and the drone at 240 frames per second, as also described in previous work [35]. Over a series of experiments, counting the number of elapsed frames between the user's gesture and the drone's reaction yielded average latency timings. Different locations of the Kafka broker were considered and measured separately: the broker on the same local network as the HoloLens and the UAV interface; the broker in a different city in the same country; and the broker in a different country. Figure 11 shows the minimum and average timings measured in these experiments. With a camera frame rate of 240 frames per second, the measurement error is less than 5 ms, which has no significant impact on the conclusions of this experiment.
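Converting the counted frames to a latency figure is direct; a one-line sketch of the calculation (constants and names are ours):

```python
CAMERA_FPS = 240  # frame rate of the high-speed camera

def latency_ms(frames_elapsed: int) -> float:
    """Latency between gesture and drone reaction, from counted camera frames."""
    return frames_elapsed / CAMERA_FPS * 1000.0

# e.g., 48 elapsed frames -> 200 ms; the one-frame resolution is ~4.2 ms, under 5 ms
```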

Since a DJI drone consumes commands from its remote controller every 40–200 ms, that degree of latency is unavoidable. In a local area network, the observed latency was minimal, at less than 200 ms, and therefore no worse than that of conventional handheld remote controls. At more remote broker locales, response latency increases; however, even at around 300 ms, it is still within acceptable limits for drone piloting.

**Figure 11.** Drone response time measurements according to broker location.

#### 5.2.2. Positioning Accuracy

The basic modality of drone tracking calculates the drone's position from the IMU readings it reports. Even small errors from individual readings aggregate over time, resulting in IMU drift. Hence, any error correction or compensation can improve the robustness of the system over a span of seconds or minutes.

To measure IMU positioning accuracy, an experiment was conducted where the drone would move forward close to ground level and the position as calculated from the IMU readings would be compared to the actual distance traveled, measured physically on the ground. The experiment was repeated for different values of pitch, corresponding to different velocities.

The results of these experiments are presented in Figure 12. The position estimated from the IMU is marked on the horizontal axis, while the ground truth is on the vertical. For reference, the yellow diagonal line marks the ideal of zero error, where IMU measurements equal the ground truth. The error is relatively small for higher pitch values and velocities, increasing as the drone moves slower. This is also evident from the angles of the different linear trendlines: for pitch = 0.3, the trendline is almost parallel to the zero-error line, while for lower pitches, the lines diverge. It may also be noted that the IMU measurements consistently report a smaller distance than the actual one. Using linear regression, the function of each trendline can be calculated. In the following, *GT* is the ground truth, and *IMU* is the distance measured by aggregating the IMU readings:

$$\begin{aligned} GT &= 1.31 \ast IMU + 0.20, \text{ for } pitch = 0.05\\ GT &= 1.17 \ast IMU + 0.27, \text{ for } pitch = 0.1\\ GT &= 0.99 \ast IMU + 0.217, \text{ for } pitch = 0.3 \end{aligned} \tag{7}$$

The observations of Figure 12 may be used to improve the IMU measurements by compensating for the error. As noted, this compensation must be inversely tied to the pitch value. Performing linear regression on the coefficients of Equation (7), we can approximate them as a function of the respective pitch value as $1 + \frac{1}{52.36 \ast pitch}$. As the $GT = IMU$ equation (with a coefficient of 1) is the ideal (i.e., no error), the $\frac{1}{52.36 \ast pitch}$ term expresses the error of the measurements.

**Figure 12.** IMU measurements vs. ground truth for various pitch values, including linear trendlines and a no-error line for reference. Note how lower pitch values result in greater errors, and how the angle between the trendlines and the no-error line increases for slower speeds.

Hence, a simple, first-degree estimation of the error *E* per distance unit would be:

$$E = \frac{1}{52.36 \ast pitch} \tag{8}$$

Therefore, the compensated distance *Dcomp* can be calculated from the measured distance *D* as:

$$D_{comp} = D \ast (1 + E) = D \ast \left(1 + \frac{1}{52.36 \ast pitch}\right) \tag{9}$$
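Equations (8) and (9) translate directly into a compensation function; a minimal Python sketch (the function name is ours, the constant is the regression fit above):

```python
def compensated_distance(d_measured: float, pitch: float, k: float = 52.36) -> float:
    """Apply the first-degree IMU error compensation of Eqs. (8) and (9).

    d_measured: distance from aggregated IMU readings, in metres.
    pitch: the (non-zero) pitch command value used during the motion.
    k: the regression constant fit in this section.
    """
    error_per_unit = 1.0 / (k * pitch)          # Eq. (8)
    return d_measured * (1.0 + error_per_unit)  # Eq. (9)
```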

Figure 13 shows the effect of error compensation on IMU measurements. The left graph shows the mean squared error (MSE) of the measurements for different pitch values, without compensation (solid lines) and with it (dashed lines). The right graph shows the average MSE values for the different pitch values, again with and without compensation. Compensation has a substantial impact, especially at lower pitches (slower drone speeds).

Compensation research is still ongoing. Future tests will include combination motions and different drone models, as well as higher-degree error modeling.

**Figure 13.** IMU measurement compensation. (**Left**): MSE of the error without compensation (solid lines) and with (dashed lines) for different pitch values. Note that the vertical axis is in logarithmic scale. (**Right**): Comparison of average MSE values with and without compensation, for different pitch values. Compensation has a drastic impact on slower speeds, where the error is greatest.

#### 5.2.3. Video Transmission

A final measurement concerned the latency and frame rate of video transmission and display. End-to-end latency was measured, as above, with a high-frame-rate camera, yielding an average of 467 ms. While such latency is acceptable for inspection or searching, it is not ideal for piloting a drone in FPV based on video feedback alone. Video transmission latency can be considered in two steps: from the drone to the UAV interface app connected to the remote controller, and from there to the HoloLens. The first part was measured at 386 ms, accounting for the larger part of the overall latency. However, this is significantly greater than the timings reported by DJI (170–240 ms) (https://www.dji.com/gr/mavic-mini/specs, accessed on 20 December 2021), so further investigation could yield decreased video latency. A full frame rate of 30 frames per second (native to the drone's camera) was achieved.

#### **6. Conclusions and Future Steps**

#### *6.1. Conclusions*

In this paper, we have presented a unified system for AR-integrated drone use, encompassing gesture control, tracking, and camera feed visualization in context with the user's environment. We have outlined the overall system architecture as well as individual components, described the mechanics of the vital tasks necessary for its function and conducted both objective and subjective evaluation experiments. Different aspects of the proposed solution were evaluated, with the results described in Section 5, including gesture selection and usability; drone responsiveness in terms of time lag; drone tracking accuracy and the efficacy of compensation; and video transmission quality.

Although the presented implementation focused on specific hardware (the HoloLens and DJI drones), the underlying logic and architecture are modular and not tied to the current hardware choices. As mentioned in Section 4.2, the same gesture control methodology has been implemented and tested successfully with alternative hardware, including a regular webcam or smartphone camera. Hence, integration with different drone models or AR hardware is possible and mostly a matter of implementation.

The presented system has largely achieved its objectives, and future plans include both refinements and wider validation. In particular, gesture control has proved both intuitive and accurate, providing the same level of control as a traditional handheld remote. However, objective evaluation with real drones (i.e., not in a simulator) has not yet been completed and is scheduled for the near future. This evaluation should compare against the handheld remote controller, with pilots asked to place the drone in a specified position; measurements could include accuracy (distance from the specified position) and time (to reach that position). In addition, more complex drone behaviors can be tied to either specific gestures or virtual buttons in future implementations; such behaviors can include flying in a circular or other pattern or returning to land either at the take-off location or near the user.

#### *6.2. AR Tracking and Visualization Present and Future Vision*

Work into the AR components—drone tracking, calibration, and visualization—of the presented solution is still ongoing. While the currently developed system is a working prototype, a number of possible future improvements have been outlined and planned for the medium term.

Figure 14 presents our vision for such future additions.

**Figure 14.** Present and future components of the AR part of the solution. Solid lines indicate completed modules, dashed lines work in progress, and dotted lines future plans.

Regarding calibration, the working prototype uses a manual, two-step calibration, which can prove both tiresome and challenging for inexpert users. Hence, work is already in progress to calibrate the initial drone pose with a largely automated, visual method. Two options are currently under consideration, as outlined in Section 4.4: a visual drone pose estimation AI and the reading of QR codes pasted on the drones. The former can also provide intermittent visual recognition of the drone during flight, correcting the IMU readings and eliminating any accumulated drift.

In addition, it can be noted that the position as tracked in AR will always lag some time behind that of the real drone, as IMU readings must be collected, read, and aggregated, and a position estimated, forwarded to the HoloLens, and there displayed. A future module could take raw input from the drone's flight control (pitch, roll, yaw, and throttle) and estimate a future position for the drone, offsetting the lag of data transmission and processing.

Finally, the AR video canvas is currently displayed at a set distance in front of the virtual (tracked) drone. However, the actual distance between the drone and the objects in its field of view might range from a couple of meters to hundreds. A depth estimation algorithm could gauge this distance and position the video canvas appropriately, for a more realistic display in context with the environment.

**Author Contributions:** Conceptualization, K.K., G.A. and A.D.; data curation, D.T. and G.A.; formal analysis, K.K. and D.T.; funding acquisition, A.D. and P.D.; investigation, K.K. and D.T.; methodology, K.K. and K.C.; project administration, A.D. and P.D.; resources, A.D. and P.D.; software, K.C., D.T. and D.S.; supervision, K.K., A.D. and P.D.; validation, K.K.; visualization, D.S.; writing—original draft, K.K., K.C., D.T., D.S. and G.A.; writing—review and editing, K.K., K.C., A.D. and P.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research has been supported by the European Commission within the context of the project FASTER, funded under EU H2020 Grant Agreement 833507.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Acknowledgments:** The authors would like to thank the Hellenic Rescue Team Attica (HRTA—Greece) and the École Nationale Supérieure des Officiers de Sapeurs-Pompiers (ENSOSP—France) for testing the proposed system in the context of search-and-rescue operations and providing valuable feedback. Additional thanks are due to drone operators Michail Fotoglou and Eleni Antoniou for testing the gesture control system during its development and offering feedback from an experienced operator's point of view.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Demystifying the Differences between Structure-from-Motion Software Packages for Pre-Processing Drone Data**

**Taleatha Pell 1,\*, Joan Y. Q. Li 2,\* and Karen E. Joyce 3**

Received: 16 December 2021; Accepted: 9 January 2022; Published: 13 January 2022


**Abstract:** With the increased availability of low-cost, off-the-shelf drone platforms, drone data become easy to capture and are now a key component of environmental assessments and monitoring. Once the data are collected, there are many structure-from-motion (SfM) photogrammetry software options available to pre-process the data into digital elevation models (DEMs) and orthomosaics for further environmental analysis. However, not all software packages are created equal, nor are their outputs. Here, we evaluated the workflows and output products of four desktop SfM packages (AgiSoft Metashape, Correlator3D, Pix4Dmapper, WebODM), across five input datasets representing various ecosystems. We considered the processing times, output file characteristics, colour representation of orthomosaics, geographic shift, visual artefacts, and digital surface model (DSM) elevation values. No single software package was determined the "winner" across all metrics, but we hope our results help others demystify the differences between the options, allowing users to make an informed decision about which software and parameters to select for their specific application. Our comparisons highlight some of the challenges that may arise when comparing datasets that have been processed using different parameters and different software packages, thus demonstrating a need to provide metadata associated with processing workflows.

**Keywords:** unmanned aerial vehicle (UAV); digital elevation model (DEM); digital surface model (DSM); orthomosaic; photogrammetry; Earth observation; environmental monitoring

#### **1. Introduction**

Drone data use within environmental sciences has increased considerably over the past 20 years. This is due in part to the increased availability of drone platforms on the market, technological advances providing better sensors, a longer battery life, easier-to-use systems, and enhanced structure-from-motion (SfM) software that is able to process these datasets into orthomosaics and digital elevation models (DEMs) [1]. Further, in contrast to traditional aerial survey and satellite data capture, drones are able to survey at a fine resolution from a low altitude, be deployed on flexible time schedules, and fly below clouds for unobstructed data collection [2]. In some ways, a drone can capture data more akin to field surveys, though over larger and potentially inaccessible areas, thus effectively bridging the gap between satellite and on-ground data collection across terrestrial and marine environments [3].

Drone data have been captured to provide information across a range of environmental fields, predominantly to assess vegetation coverage, composition, and/or structure in the terrestrial environment (e.g., [4–8]). However, they have also been used to study a range of other environments, including mangroves [9–11], oyster reefs [12], coral reefs [13,14], coastal dunes [15,16], and seagrass beds [17]. They have also been used to identify invasive plant species [18–20] and estimate animal populations [21–23]. Drones are also commonly used within agriculture (e.g., [24,25]), forestry [26], and urban settings (e.g., [27,28]).

Most of these environmental applications require the drone data to undergo some form of preprocessing before the data are suitable for mapping analyses. Typically, individual drone images will retain metadata not only of the time and date of capture, but also the x, y, z coordinate location of the drone (longitude, latitude, altitude), its orientation, and the angular rotations of the platform and camera. The combination of imagery and metadata is used by SfM photogrammetry software to commence the pre-processing workflow.

The two most commonly used outputs of an SfM workflow include a DEM and an orthomosaic. DEMs are spatial datasets that describe surface terrain features and are broken up into two categories: digital terrain models (DTMs) and digital surface models (DSMs) [29]. DTMs measure the elevation of the mapped surface, minus objects on the surface (e.g., trees, buildings), whereas DSMs measure the mapped elevation including surface objects. The derived DEM is then used in the process of building an orthomosaic.

Orthomosaics are created by stitching together a series of individual, overlapping orthorectified aerial images to produce a single continuous image/map [30]. This process corrects for distortions in the image, introduced by factors such as camera tilt, lens distortion, and environmental conditions [31]. The end product is a uniformly scaled, georeferenced image allowing for accurate estimation of the location, size, and shape of photographed objects.

The accessibility of drones and their derived data products allow scientists, land managers, and other users to collect and manage their own spatial datasets [1]. However, many end users remain unaware of the processes that take place within the workflow of a chosen software or the potential differences in the end product as a result of processing and software choices [32]. There are both proprietary and open-source software options available for conducting SfM photogrammetry. Provided one has sufficient skills in coding, open-source toolkits can be more flexible and allow customisation of many stages in the workflow. In contrast, proprietary software often provides a streamlined workflow to facilitate photogrammetric processing. However, these packages are often referred to as a "black box"-type solution because they offer little control or insight for users on the internal workings of the software, and in many cases, there is limited opportunity for customisation [33].

The uptake of SfM methods in research and monitoring requires some understanding of the data acquisition and image processing workflow to ensure research design repeatability and comparability [34]. Decisions in the image capturing process such as camera type, image resolution, level of image overlap, use of ground control points (GCPs), time of day, tides, and weather conditions can all affect the final orthomosaic and DEM [2]. Additionally, the different SfM software use different algorithms and processing options, which can also affect the final outputs [34]. The subtle difference in outputs between software types, combined with the limited photogrammetry background knowledge of many users of the software, means it is often difficult to reproduce or confidently compare results across photogrammetry studies.

While there have been some studies that have sought to compare the outputs provided by different photogrammetry software, they appear to have exclusively been conducted in the terrestrial environment, with a focus on forests [35,36], sandpits [37], agriculture [38,39], or urban environments [38]. Only Jiang et al. [38] investigated the output of different software across different types of datasets (urban and agricultural). There appear to be no assessments of the comparative accuracy or suitability of different photogrammetry software in processing unmanned aerial vehicle (UAV) data in marine or coastal environments that evaluate both orthomosaic and DSM outputs. This is particularly problematic since most terrestrial UAV mapping uses GCPs and real-time kinematic positioning (RTK) to enhance the accuracy of mapping outputs from photogrammetry software. Previous comparative studies that have assessed the performance of SfM software have focussed on the accuracy of GCPs compared to ground-truth GPS measurements under differing levels of GCPs [37]. However, in the marine environment, the placement of GCPs and the use of RTK is extremely difficult, requiring photogrammetry software to produce orthomosaics with limited or no GCPs and RTK.

Through this research, we aim to assist others in selecting an appropriate SfM software package for pre-processing drone data. To do this, we provide qualitative and quantitative assessments of four desktop-based SfM photogrammetry packages, assessing their output file dimensions and specifications, orthomosaics, and digital surface models using input aerial drone data from a variety of terrestrial and marine environments, both natural and built. Finally, we compare the outputs of the software packages with each other and against satellite-derived data in the same locations. We hope that these comparisons highlight some of the challenges that may arise when comparing spatial datasets that have been processed using different parameters and different software packages, thus demonstrating the need to provide the metadata associated with a processing workflow.

#### **2. Methods**

There are a multitude of SfM photogrammetry software packages designed to pre-process drone data and create DEMs and orthomosaics. Within this study, we focussed on a subset of these packages, selecting four that are commonly cited and available in the desktop processing environment, namely Pix4Dmapper [40], AgiSoft Metashape [41], Correlator3D [42], and WebODM [43].

#### *2.1. Study Sites and Input Data*

Using the online drone data platform GeoNadir [44], we downloaded datasets representing variable and commonly studied ecosystems (agriculture, marine, coastal, and urban) (Figure 1). All drone images were captured during mapping missions. As such, the images have a high degree of overlap and sidelap between adjacent photos, were captured using an RGB camera at nadir angle, and include a location at the time of capture in the image metadata. The dataset specifications are included in Table 1.


**Table 1.** Details of each drone image collection dataset.

#### *2.2. Software Packages*

We selected three commercial and one open-source desktop SfM photogrammetry software packages to construct a DSM and orthomosaic for each sample area:


WebODM [43], the open-source package, is based on the command-line toolkit Open Drone Map [48]; it can also be used across Linux, macOS, and Windows OS.

**Figure 1.** Study site locations and orthomosaic examples: (**A**) a vineyard (6°21′14.62042592″ E, 49°32′41.87405578″ N) located at Remich, Luxembourg [49]; (**B**) Fringing Reef on Yanooa (Pelorus) Island on the Great Barrier Reef (146°30′03.20650026″ E, 18°33′37.06807973″ S) located in Queensland, Australia [50]; (**C**) coastal suburban recreational beach (145°42′38.60723871″ E, 16°47′52.90367696″ S) in Trinity Park, Queensland, Australia [51]; (**D**) urban residential block (111°03′33.67593896″ W, 32°20′56.84116378″ N) located in Tucson, Arizona, USA [52]; and (**E**) Lung Island (145°13′49.31663938″ E, 15°31′16.66610816″ S) in Annan River (Yuku Baja-Muliku) National Park located in Cooktown, Queensland, Australia [53]. Service layer credits: HERE, Garmin, USGS, ESRI, ©OpenStreetMap (and) contributors, CC-BY-SA.

Each of the software packages has its own manufacturer-suggested workflow, in addition to a variety of parameters that can be manually altered depending on user requirements. In this study, we opted to follow the suggested workflow of each package, based on the assumptions that many users are likely to opt for default settings, at least initially, and that the default settings have been selected by the manufacturer as producing the most consistent and hopefully optimal outcomes. It was outside the scope of this study to evaluate each and every parameter within the software themselves, and we refer interested persons to the user manuals of each software package for further details.

With minor variations in terminology between software packages, each follows a similar workflow including loading data, aligning photos, bundle adjustment, creating a dense point cloud and/or mesh, creating a DSM, and building an orthomosaic. We used the "True Ortho" Correlator3D wizard workflow; the "3D Maps—Standard" Pix4D workflow; and the default template for WebODM. As AgiSoftMS does not have a templated automated workflow, we selected the manufacturer-recommended components, namely align photos, optimise alignment, build dense cloud, build DEM, and build orthomosaic. We accepted the default recommended settings for each package.

While we recognise the benefit of including independent GCPs to improve the spatial registration of the output products, we did not have access to the required reference data for this study. Further, there are many circumstances where it may not be possible to obtain sufficient GCP data (e.g., in marine environments). This study therefore evaluated the software outputs in their absence, but remains relevant as a relative comparison of the "worst-case" spatial registration between each software package.

All processing was performed using a computer with Windows 10 Enterprise OS, an Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz, 32 GB of installed RAM, and an NVIDIA Quadro P1000 GPU with 4096 MB RAM. All spatial analyses to compare the output products were conducted in ArcGIS Pro [54], and quantitative analytics were completed using Python [55].

#### *2.3. Comparing Output File Dimensions and Specifications*

After processing all datasets using the manufacturer-recommended default parameters, we compared the output details for every software and dataset combination including output file size, projected coordinate system, geographic coordinate system, x and y resolution, absolute geographical coverage, and relative coverage. The areal coverage of each orthomosaic was obtained by extracting the polygon footprint of the projected DSM and orthomosaic boundary, excluding the "no data" values.

We selected the output from AgiSoft Metashape as the baseline product to which the other datasets were compared to obtain the relative areal coverage.

#### *2.4. Comparing Orthomosaics*

To compare the output orthomosaics, we assessed the following:

a. Colour correlation score: The luminance value of each pixel was extracted from each colour channel (red, green, and blue) of the original drone images, as well as of the output orthomosaic. A density histogram was subsequently plotted to visualise the similarity between the unprocessed and processed image for each colour band. A correlation score [56] was also calculated to quantify the resemblance of the histograms to each other using the equation below:

$$d(H_1, H_2) = \frac{\sum_{I} \left(H_1(I) - \bar{H}_1\right)\left(H_2(I) - \bar{H}_2\right)}{\sqrt{\sum_{I} \left(H_1(I) - \bar{H}_1\right)^2 \sum_{I} \left(H_2(I) - \bar{H}_2\right)^2}} \tag{1}$$

where $H_1$ and $H_2$ are the colour density histograms of any two of the five sources (original drone images and outputs from the four software packages) being compared,

$$\bar{H}_k = \frac{1}{N} \sum_{J} H_k(J) \tag{2}$$

and *N* is the total number of histogram bins (256 for 8 bit true colour images). A correlation score close to one indicates high similarity between the colour density of the input images and that of the orthomosaic, while a score approaching zero indicates low similarity;
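As an illustrative Python (NumPy) sketch of the score in Equation (1) above (our own implementation, not the one used in the study):

```python
import numpy as np

def histogram_correlation(h1, h2):
    """Correlation score between two colour density histograms (Eq. (1)).

    h1, h2: 1D arrays of N bins (N = 256 for 8 bit channels).
    Returns a value near 1 for similar densities, near 0 for dissimilar ones.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    d1 = h1 - h1.mean()  # deviations from the mean bin count, H_k(I) - H_k_bar
    d2 = h2 - h2.mean()
    return float((d1 * d2).sum() / np.sqrt((d1**2).sum() * (d2**2).sum()))
```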


#### *2.5. Comparing Digital Surface Models*

DSMs are often associated with various uncertainties and errors that can arise either at data collection time or during processing [58]. In the absence of high-resolution LiDAR or field-verified elevation data, all DSM outputs were compared to each other and to the DSM derived from the Space Shuttle Radar Topography Mission (SRTM) DEM 1 Arc-Second Global data (approximately 30 m resolution) [59]. At each site, the four SfM-derived DSMs, in addition to the SRTM DSM, were paired up with each other (i.e., n = 10 combinations per site). Within each pair, both DSMs were resampled to the smaller pixel size of the pair and compared using the following statistical measures adapted from Szypuła [60]:


$$MBE = \frac{\sum_{i=1}^{N} (a_i - b_i)}{N - 1} \tag{3}$$

$$MAE = \frac{\sum_{i=1}^{N} |a_i - b_i|}{N - 1} \tag{4}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (a_i - b_i)^2}{N - 1}} \tag{5}$$

where *ai*, *bi* are the pixel values (i.e., elevation) at the same location of the paired up DSMs and *N* is the total number of overlapping pixels.
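The three measures are straightforward to compute over the overlapping pixels; a Python (NumPy) sketch follows, with the NaN masking of non-overlapping pixels being our assumption about how the overlap was handled.

```python
import numpy as np

def dsm_difference_stats(a, b):
    """MBE, MAE, and RMSE between two co-registered DSMs (Eqs. (3)-(5)).

    a, b: arrays of elevation values at the same locations; NaNs mark
    non-overlapping pixels and are excluded from the comparison.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    mask = ~(np.isnan(a) | np.isnan(b))  # keep only overlapping pixels
    diff = a[mask] - b[mask]
    n = diff.size
    mbe = diff.sum() / (n - 1)                  # mean bias error
    mae = np.abs(diff).sum() / (n - 1)          # mean absolute error
    rmse = np.sqrt((diff**2).sum() / (n - 1))   # root mean squared error
    return mbe, mae, rmse
```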

#### **3. Results and Discussion**

All software packages were able to successfully build a DSM and orthomosaic using the input datasets; however, we observed differences in the output file size, projected coordinate system, geographic coordinate system, x and y resolution, geographical coverage, relative coverage, and processing time between software packages.

Unsurprisingly, the total processing time was closely and linearly related to the number of images processed (Figure 2). In most cases, C3D was the fastest-performing software package, followed by AgiSoftMS. With the three smaller datasets, P4D was the slowest-performing software; however, with the two larger datasets, WebODM became the slowest. The slow performance of WebODM on large datasets is likely due to it using the CPU for processing, while the other three packages can access the GPU for higher performance. Of particular note, P4D had a processing time of up to 348% more than that of C3D (Figure 3). The longer processing time for Pix4D is likely related to additional processing steps requiring the software to generate a 3D mesh and to automatically export the DEM and orthomosaic. These features are not included in the recommended templates for the other packages, where 3D models are not required or where the export of files occurs after the processing stage. There was only one case where C3D was outperformed on speed (Dataset C—Trinity Park), where AgiSoftMS processed the data in 29 min compared to 34 min, or 85% of the time taken by C3D (Figure 3).

**Figure 2.** Comparison between the number of images to be processed and the time taken for each software package to complete the processing.

When time is money, the speed of processing is likely to influence software selection, particularly when multiple large datasets are captured. Yet, this cannot be considered in isolation, as the quality of the output is most likely the primary determinant of SfM software choice. We also note that it is possible to reduce the processing time of each of these packages by modifying the standard workflows (e.g., deselect the mesh option for P4D and WebODM), with the caveat that the modification may reduce the quality of the output products, so it should be evaluated accordingly.

**Figure 3.** Comparison of the percentage of time taken for each software package to complete the processing, using Correlator3D as the baseline. Blue shades depict a faster comparison time, while red shades indicate slower comparison times. The darker the tone, the greater the difference.


#### *3.1. Comparing Output File Dimensions and Specifications*

The output orthomosaic and DSM file sizes varied considerably between the software packages (see the details in Tables A1 and A2). This is a result of a combination of the output image resolution and the area that was successfully processed. For example, the default WebODM processing resamples the output to a resolution of 5 × 5 cm per pixel. This lower resolution results in the lowest output file size among all software, which is useful for sharing data between collaborators or hosting on online servers. However, the loss of detail may prove problematic for some users. The other packages tend to generate the maximum resolution output by default, which is closer to the ground sample distance (GSD) of the original input drone images. As with all other parameters, the user can deviate from the default settings to stipulate the desired output resolution, and the software will resample the output accordingly. This might be important for maintaining consistency across multiple datasets, in particular for time series analysis, but resampling will inevitably alter the output image accuracy.

In combination with the output pixel size, the total areal coverage also impacts the file size. In Figure 4, we compare each of the software DSM and orthomosaic outputs to the areal coverage generated by AgiSoftMS and note the considerable differences. Pix4D in particular returns smaller areal coverages for both the DSM and orthomosaic in each of the datasets that contain water bodies (B, C, and E—Yanooa Reef, Trinity Park, and Lung Island). P4D and, to some extent, WebODM clearly have difficulty aligning and resolving water and submerged features—in particular where there is sunglint on the water's surface—and consequently crop these features from the final products (Figure 5).

WebODM failed to reconstruct the crop field on the right half of Dataset A, also resulting in a comparatively small areal coverage for those output products (Figure 6).

**Figure 4.** Relative coverage for the DSM (**left**) and orthomosaic (**right**) in all datasets. Where the cell colouring is blue, the output areal extent is smaller than the AgiSoft reference, while shades of red indicate a larger areal extent.

**Figure 5.** AgiSoft vs. P4D with Dataset B (**left**), Dataset C (**middle**), and Dataset E (**right**). The red shade and the inset at the bottom right corner show the coverage of the orthomosaic generated by P4D. The red clusters scattered within the shaded area are voids that have no value in the orthomosaic datasets. This is overlaid on the output from AgiSoft to show the difference in coverage.

**Figure 6.** AgiSoft vs. WebODM with Dataset A. The red shade and the inset at the bottom right corner show the coverage of the orthomosaic generated by WebODM. The red clusters scattered along the bottom right edge of the shaded area are voids that have no value in the orthomosaic datasets. This is overlaid on the output from AgiSoft to show the difference in coverage.

#### *3.2. Comparing Orthomosaics*

In comparing the orthomosaics, we aimed to evaluate the similarity in colour between the input and output data; any geographic shift between the output products and reference satellite imagery; and the visual consistency between the output product and ground features.

#### 3.2.1. Colour Density Correlation Score

During the process of building an orthomosaic, pixel values are averaged in areas of overlap, and as we already demonstrated with the coastal datasets, some pixels are excluded entirely. The colour density correlation score provides further evidence for the differences seen between the SfM packages and the original drone images (Figures 7–9). In particular, the density histogram for the red channel of the P4D orthomosaic created from Dataset B (Figure 10A) is very different from those of the original images and other software outputs, which could be due to the cropped water-feature pixels (Figure 5). The green and blue channels (Figure 10B,C) show closer alignment between all software packages with the exception of P4D, in particular in the middle range values for luminance (i.e., pixels that are neither very bright nor dark). This also results in decreased contrast across the orthomosaic scene. In cases where it is important to retain the input absolute pixel values (e.g., for quantitative mapping and assessments), it is worth further investigating the methods of feathering and averaging between images in overlapping areas to ensure the appropriate algorithms are used.

**Figure 7.** Correlation score of red pixel luminance values. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with orthomosaic sources, and "Original" refers to the original drone images. Darker shading denotes a higher correlation score, i.e., similar luminance value density.

**Figure 8.** Correlation score of green pixel luminance values. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with orthomosaic sources, and "Original" refers to the original drone images. Darker shading denotes a higher correlation score, i.e., similar luminance value density.

**Figure 9.** Correlation score of blue pixel luminance values. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with orthomosaic sources, and "Original" refers to the original drone images. Darker shading denotes a higher correlation score, i.e., similar luminance value density.

**Figure 10.** A subset of colour density histograms that show the most variance (i.e., lowest correlation score) between different software packages and original drone images. (**A**) Red channel colour density plot for Dataset B, where the lowest score (0.55) occurs between P4D and AgiSoft outputs. (**B**) Green channel colour density plot for Dataset C, where the lowest score (0.47) occurs between P4D and C3D outputs. (**C**) Blue channel colour density plot for Dataset C, where the lowest score (0.39) occurs between P4D and C3D outputs.

#### 3.2.2. Geographic Shift

When compared to the satellite data available in Esri base maps within ArcGIS Pro [57], the drone data show between two and four metres of displacement (Figure 11), which is reasonable considering the positional accuracy of Global Navigation Satellite System (GNSS) units on drone platforms, in particular without additional ground control (Kalacska et al. 2020). The geographical shift is more prominent towards the edges of the orthomosaic than in the centre, due to the lower overlap of the input images in these areas. This reinforces the need to plan data capture missions that cover areas beyond the bounds of the central region of interest. In the centre of all software-generated orthomosaics, all features were within 2.50 m of the satellite features (WebODM: 1.86 ± 0.36 m, C3D: 2.06 ± 0.10 m, P4D: 2.44 ± 0.25 m, AgiSoftMS: 2.50 ± 0.26 m).

**Figure 11.** Average displacement (m) (±SE) of the centre and edge features on the AgiSoft Metashape, Correlator3D, Pix4DMapper, and WebODM orthomosaics from satellite imagery [57].

In contrast, at the orthomosaic edges, P4D, AgiSoftMS, and WebODM showed slightly larger displacement from the satellite features (Pix4DMapper: 3.29 ± 0.51 m, AgiSoftMS: 3.61 ± 0.53 m, WebODM: 4.13 ± 0.49 m). C3D, however, displayed similar displacement at the orthomosaic edges to the centre, (2.12 ± 0.16 m), making it the nearest to the satellite imagery at the orthomosaic edges. Interestingly, while WebODM appeared to be nearest to the satellite imagery in the centre of the orthomosaic, displacement at the edges was the furthest, at 2.2-times further from the satellite imagery than the centre features.
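The displacement statistics reported above can be reproduced from matched feature coordinates with a few lines of code; a minimal sketch, assuming features have been digitised in both sources and projected to a common metric CRS (the coordinates below are hypothetical):

```python
import numpy as np

def displacement_stats(drone_xy: np.ndarray, sat_xy: np.ndarray):
    """Mean displacement (m) and standard error between matched features
    digitised on a drone orthomosaic and on satellite imagery.
    Both arrays are (n, 2) easting/northing pairs in the same projected CRS."""
    d = np.hypot(*(drone_xy - sat_xy).T)            # per-feature Euclidean distance
    return d.mean(), d.std(ddof=1) / np.sqrt(len(d))

# hypothetical centre features digitised in both sources
drone = np.array([[305012.1, 8136420.5], [305040.8, 8136398.2], [305075.3, 8136441.9]])
sat   = np.array([[305010.4, 8136421.6], [305038.9, 8136396.5], [305073.1, 8136443.2]])
mean, se = displacement_stats(drone, sat)
print(f"{mean:.2f} ± {se:.2f} m")
```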

There are a range of factors in both the data collection and image processing phases that can influence geographic shift/geometric accuracy. These include the flight path, camera quality, calibration, georeferencing strategy (use of GCPs or reliance on direct onboard georeferencing via GNSS with RTK), and the SfM algorithms [61]. While georeferencing strategies that employ GCPs have been found to yield horizontal accuracy at the decimetre/centimetre scale [61–63], direct methods that rely on GNSS alone (i.e., non-RTK drones) have resulted in accuracy at the metre scale [31]. The metre-scale horizontal accuracy found in those previous studies, which did not employ GCPs or GNSS with RTK, corresponds to the results of this study, where average displacement ranged from 1.86–2.50 m in the centre of the orthomosaics and 2.12–4.13 m at the orthomosaic edges. If users are seeking accuracy at much finer scales, the software choice is not likely to improve the outcome greatly, and they will need to consider the addition of GCPs or the use of GNSS with RTK to achieve centimetre-scale accuracy [64,65]. The SfM algorithms employed by each software package are another possible source of variation in the geographic shift observed. Although C3D appeared more consistent in the geographic shift across the orthomosaic, this is based on the assumption that the satellite imagery represents the true location of the features. A previous study comparing both horizontal and vertical accuracy across software platforms found AgiSoft PhotoScan to be more accurate than Pix4D web-based image processing and the Bundler SfM algorithm [66], but did not evaluate C3D or WebODM. Additionally, a study that compared the accuracy of five different software packages, including AgiSoft and Pix4D, found little difference in accuracy; however, this was under differing levels of GCPs, as opposed to what the software can produce without these inputs [37]. While the differences were attributed to the algorithms in that case, it remains difficult to pinpoint the cause due to the lack of detailed information released by proprietary software developers [34,66]. This serves as a reminder to always retain copies of the original imagery so the data can be re-processed using the best available methods, as software packages and their algorithms will change and hopefully improve over time.

#### 3.2.3. Visual Artefacts

The qualitative analysis of artefacts contained within the orthomosaics found that all software packages produced more artefacts at the edges of the orthomosaic than in the centre. Centre artefacts were generally at a smaller scale and only became evident at increased zoom. Artefacts were also more evident in areas where DSM values in neighbouring pixels changed rapidly, such as at the edges of buildings or trees and forests (Figure 12). Artefacts presented in the form of missing data or gaps in information; "filled" data through smoothing, interpolation, extrapolation, or filtering; and cutlines at feature edges. Cutlines in the orthomosaic often produce visual artefacts at high zoom levels and will also present challenges for automated information extraction at later processing stages. Alternatively, the user can deviate from the standard workflow and create the orthomosaic using the derived DTM instead of the DSM, which tends to result in fewer visual artefacts, though it can introduce a greater geographic shift of tall features in the imagery due to uncorrected radial displacement. Users must therefore determine for themselves the most suitable outcome for their specific application.

Each software package's default settings deal with missing data in different ways. C3D simply excludes pixels where it cannot reconstruct portions of the image, presenting them as "no data" fill in the orthomosaic (see Figure 12I). Pix4DMapper and AgiSoftMS have interpolation enabled in their default settings to fill the space based on surrounding values, resulting in warped or shaded sections in the orthomosaic (see Figure 12G). However, interpolation will only fill where there are enough close points, and a lack of information from close points can result in areas that are filtered out of the final DSM and orthomosaic and present as holes (see Figure 12G,H). Close inspection of the DSM and the final orthomosaic is recommended to detect holes and warped areas, as these may not be noticed until zooming in on smaller features in the orthomosaic.

**Figure 12.** Examples of artefacts found in Dataset D (**A**–**E**) and Dataset C (**F**–**J**) orthomosaics generated from various image processing software where: (**A**) raw image, (**B**) AgiSoftMS, (**C**) Correlator3D, D: Pix4DMapper, and (**E**) WebODM for Dataset D; (**F**) raw image, (**G**) AgiSoftMS, (**H**) Correlator3D, (**I**) Pix4DMapper, and (**J**) WebODM for Dataset C.

#### 3.2.4. Comparing Digital Surface Models

When using similarity metrics to evaluate the output DSMs, we found very little overall difference between the software outputs except for Dataset E, though all differed somewhat from the SRTM data. The MAE (Figure 13) and RMSE (Figure 14) showed little difference between comparison pairs in Datasets A, B, and D, which indicates that the difference across all pixels was fairly even. A mixture of terrestrial and aquatic features, however, led to greater variance in the differences (the RMSE of Datasets C and E was higher than the MAE). The depth (or "negative elevation") of underwater features was inadequately represented across all models, though none of the packages claims capability in this respect, and bathymetric LiDAR would certainly be a better option for deriving depth information [67].
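For readers reproducing these comparisons, here is a minimal sketch of the three similarity metrics (MAE, RMSE, and the MBE used in Figure 15), assuming the two DSMs have already been co-registered and resampled onto a common grid:

```python
import numpy as np

def dsm_difference_metrics(dsm_a: np.ndarray, dsm_b: np.ndarray) -> dict:
    """MAE, RMSE, and MBE between two co-registered, equally sampled DSM arrays.
    Non-finite cells (voids in either model) are excluded from the comparison."""
    diff = dsm_a - dsm_b
    diff = diff[np.isfinite(diff)]
    return {
        "MAE": float(np.mean(np.abs(diff))),
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "MBE": float(np.mean(diff)),  # sign convention: a - b, as in Figure 15
    }
```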

**Figure 13.** Mean absolute error (MAE) comparison between each DSM output. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with DSM sources (SRTM and drone derived). Darker shading denotes higher MAE values.

**Figure 14.** Root-mean-squared error (RMSE) comparison between each DSM output. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with DSM sources (SRTM and drone derived). Darker shading denotes higher RMSE values.

For the datasets dominated by terrestrial features, the drone-derived elevation was overestimated, producing a positive MBE in Datasets A and D when compared to the SRTM data (Figure 15). Most of the differences between the drone-derived DSMs from AgiSoftMS, C3D, and P4D were minor (seen as light shading in Figure 15), indicating that the software algorithm is not the most important factor in obtaining an accurate elevation estimate. However, WebODM displayed greater differences from the other drone-derived DSM values, in particular for Dataset D, and was also more closely aligned to the SRTM data for Dataset D than the other software packages. As a previous review pointed out, the accuracies of UAV true-colour image-derived DSMs are comparable to those obtained by LiDAR [68]; regardless of the use of GCPs or checkpoints, the absolute difference between LiDAR and true-colour image-derived DSMs was less than 4 m [69,70]. The similarity between software outputs was therefore expected (Figures 13–15). On the other hand, the relatively large difference between the SRTM data and the software outputs is worth noting, since UAV imagery is considered a new tool to fill the gap between satellite imagery and in-person field surveys. However, with such a small sample size, inconsistent findings between datasets, and no GCP information, these results are inconclusive. Given the significant difference in resolution between the SRTM and the true-colour image-derived DSMs, this technique is better suited to comparisons at the regional scale. Higher-resolution DSMs (e.g., derived from UAV-based LiDAR sensors or commercial satellites) would allow for more detailed comparison at the pixel level.

**Figure 15.** Mean bias error (MBE) comparison between each dataset. The results from each dataset comprise half of the square presented, separated by the diagonal dashed line and labelled with the corresponding letter. Both columns and rows are labelled with DSM sources (SRTM and drone derived). In each square, for the top right half, *cellvalue* = *DSMrow* − *DSMcolumn*; for the bottom left half, *cellvalue* = *DSMcolumn* − *DSMrow*.

In most cases, if deriving absolute elevation is an important project consideration, using the standard processing workflows listed above will be insufficient. Similar to reducing geographic shift, large improvements in elevation accuracy are unlikely to be achieved through software or algorithm choice at this stage, based on the options available. Rather, previous research has indicated that the use of GCPs is highly important to calibrate DEMs [61], and additional direct georeferencing also improves the precision and accuracy [66]. There is growing potential for platforms with onboard RTK GNSS capability to also address this challenge [71]. Employing these technologies where available will undoubtedly increase the output product's accuracy, but will require an increase in the user's time and financial investment across both data capture and processing.

#### **4. Conclusions**

In this study, we tested the prescribed workflows and output products (DEMs and orthomosaics) of four different SfM photogrammetry packages using five drone image datasets. We observed considerable differences in processing times, with Correlator3D and AgiSoft outperforming Pix4Dmapper and WebODM, in particular on large datasets. It was also clear that Pix4Dmapper in particular struggled to reconstruct underwater features, while the other software packages provided suitable outputs in reef and coastal ecosystems. Each software package introduced visual artefacts into the output orthomosaic products, in particular around the edges of buildings and tall vegetation, and we leave the judgement of what constitutes an acceptable level of artefacts to users and their particular applications. Based on our qualitative and quantitative assessments of the output orthomosaics and DSMs, we caution users against comparing multitemporal drone datasets that have been processed using different software packages and algorithms. Using the same software will give users greater confidence that any detected change is in fact a change in an ecosystem and not an effect of the processing workflows. The information contained herein will allow users to make informed decisions about future software selections and the impact that their choices may have on the output product's accuracy.

**Author Contributions:** Conceptualisation, K.E.J., T.P., and J.Y.Q.L.; methodology, T.P. and J.Y.Q.L.; data collection: K.E.J.; formal analysis, T.P. and J.Y.Q.L.; original draft preparation: T.P., J.Y.Q.L., and K.E.J.; review and editing: T.P., J.Y.Q.L., and K.E.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** All data analysed in this study are available via https://www.geonadir.com: 1. https://data.geonadir.com/project-details/341 [49]; 2. https://data.geonadir.com/project-details/98 [50]; 3. https://data.geonadir.com/project-details/139 [51]; 4. https://data.geonadir.com/project-details/353 [52]; 5. https://data.geonadir.com/project-details/523 [53] (accessed on 13 December 2021).

**Acknowledgments:** We thank Margherita Bruscolini, Jack Koci, David Rogers, and Mick Hale, as well as their colleagues for capturing and for uploading their data to GeoNadir to be findable, accessible, interoperable, and reusable (FAIR). We owe deep gratitude to Anne Crosby for her valuable feedback on the manuscript. We acknowledge the useful assessments and corrections from the anonymous reviewers, as well as the Journal Editors.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A**

Detailed output file dimensions and specifications.



**Table A2.** Output file specifications for the orthomosaics.

#### **References**


*Article*

## **Determining the Optimal Number of Ground Control Points for Varying Study Sites through Accuracy Evaluation of Unmanned Aerial System-Based 3D Point Clouds and Digital Surface Models**

#### **Jae Jin Yu, Dong Woo Kim, Eun Jung Lee and Seung Woo Son \***

Korea Environment Institute, Bldg. B, 370 Sicheong-daero, Sejong 30147, Korea; jjyu@kei.re.kr (J.J.Y.); dwkim@kei.re.kr (D.W.K.); ejlee@kei.re.kr (E.J.L.)

**\*** Correspondence: swson@kei.re.kr

Received: 21 July 2020; Accepted: 20 August 2020; Published: 27 August 2020

**Abstract:** The rapid development of drone technologies, such as unmanned aerial systems (UASs) and unmanned aerial vehicles (UAVs), has led to the widespread application of three-dimensional (3D) point clouds and digital surface models (DSMs). Due to the number of UAS technology applications across many fields, studies on the verification of the accuracy of image processing results have increased. In previous studies, the optimal number of ground control points (GCPs) was determined for a specific area of a study site by increasing or decreasing the amount of GCPs. However, these studies were mainly conducted in a single study site, and the results were not compared with those from various study sites. In this study, to determine the optimal number of GCPs for modeling multiple areas, the accuracy of 3D point clouds and DSMs was analyzed in three study sites with different areas according to the number of GCPs. The results showed that the optimal number of GCPs was 12 for the small and medium sites (7 and 39 ha) and 18 for the large site (342 ha) based on the overall accuracy. If these results are used for UAV image processing in the future, accurate modeling will be possible with minimal effort in GCPs.

**Keywords:** UAS; GCP; 3D point cloud; DSM; image processing accuracy

#### **1. Introduction**

Various studies have been conducted to investigate the utility of unmanned aerial systems (UASs) and unmanned aerial vehicles (UAVs) as these technologies have developed and become widespread. These studies mainly acquire data through UAVs equipped with optical (RGB), multi-spectral, or infrared sensors [1–3] and produce 3D point clouds and digital surface models (DSMs) using image processing programs. Before the popularization of UAV technology, satellites, manned aircraft, and professional survey cameras were the prominent methods of data acquisition for research [4,5].

The 3D point cloud is a set of data in which each point has a 3D coordinate value. Initially, 3D point clouds were acquired with expensive laser scanners [6,7]; recently, however, they can be constructed from images taken with a camera using the Structure from Motion-Multi View Stereo (SfM-MVS) algorithm [8–10]. A DSM is 2.5D raster-format data generated from stereo images [5,11] and can also be produced by interpolating point clouds built using the SfM-MVS algorithm [12,13]. Image processing results, such as 3D point clouds and DSMs built from UAV images, have been applied in various fields, such as geography, environment, administration, and industry [14–17].

Due to the increase in the number of studies that produce and utilize the results of UAV image processing, studies that verify their accuracy are also increasing. Parameters that affect the accuracy of the results include flight parameters [18–20], such as the front and side overlap and flight altitude, the interior orientation parameters of the camera [1,21–23], and the exterior orientation parameters established through ground control points (GCPs) [24,25]. Accuracy is mainly evaluated using the root-mean-square error (RMSE), which statistically represents the errors between the constructed results and the checkpoints (CPs) [25–28].

To verify the accuracy of image processing results, studies have been continuously conducted using flight parameters [18,29] or the interior orientation parameters and calibration of the camera as variables [1,22,30]. However, the most actively pursued accuracy verification studies have focused on the number of GCPs as the variable. Studies on 3D modeling using UAVs have mainly used non-measurement cameras and low-cost UAVs [31–34], which require an increased number of GCPs [34]. Installing GCPs is labor-intensive and time-consuming work [35,36], and human access can be difficult depending on the terrain of the target site (e.g., steep mountain areas and rock quarries) and the material of the ground surface (e.g., tidal flats and waste stock). In order to reduce the time and labor required to install GCPs, real-time kinematic (RTK) and post-processing kinematic (PPK) methods have recently been introduced [25,27,37]. However, the RTK and PPK methods require expensive devices and complex technologies that make them difficult to use. Therefore, many studies focus instead on the number of GCPs installed.

Previous studies have suggested optimal numbers of GCPs. Soo-Bong Lim [38] mentioned that 10–12 GCPs are appropriate per 100 ha. Yong-ho Yoo et al. [39] tested 0, 3, and 6 GCPs and reported that no significant difference exists in the deviation of horizontal accuracy depending on the number of GCPs, whereas the deviation of vertical accuracy decreased as the number of GCPs increased. Bu-yeol Yun et al. [40] reported that stable accuracy could be achieved when eight to nine GCPs are used to determine the precise position. Seung-woo Son [29] mentioned that two to three GCPs should be set per 1 ha to obtain a 3D model with high accuracy, although the number may differ depending on the flight altitude.

One study found high accuracy when one GCP was used per 2 ha [41], and another noted that one GCP is required per approximately 1.17 ha, because no change occurred once the number of GCPs reached 15 or more for its study site (17.64 ha) [42]. A study on the appropriate number of GCPs, in which more than 100 GCPs were installed in a study site of approximately 12 km2 (1200 ha), reported that sufficiently high accuracy could be achieved when the number of GCPs was three or fewer per 100 photographs (out of a total of 2514 photographs) [43]. Patricio Martínez-Carricondo et al. [26] reported that GCPs must be placed at a density of 0.5–1 GCP per ha.

These studies indicate that 9–12 GCPs are generally required per 100 ha (1 km2). As previous studies determined the optimal number of GCPs at only a single site each, the question is whether their results can be applied to study sites of various areas. In other words, if 9–12 GCPs are assumed to be optimal per 100 ha, it is necessary to examine whether the required number of GCPs increases (or decreases) as the area increases (or decreases). Additionally, the Public Surveying Regulation Using Unmanned Aerial Vehicle enacted recently in South Korea [44] specifies that nine or more GCPs are required per 1 km2. However, it does not state whether the number of GCPs should change with the size of the target site, and no criterion exists for areas less than 100 ha. 3D point clouds and DSMs are frequently constructed using UAVs in sites smaller than 100 ha [45–47] due to technical limitations, such as battery shortages [48–50]; hence the need for research on GCP placement for various target sites.

In this study, the accuracy of 3D point clouds and DSMs were analyzed in three study sites with different areas according to the number of GCPs to determine the optimal number of GCPs, which is required when 3D modeling is performed for target sites with different areas using UAVs.

#### **2. Materials and Methods**

#### *2.1. Study Sites*

This study aimed to determine the optimal number of GCPs for each area by examining the accuracy of 3D point clouds and DSMs. Three sites of differing area were selected as study sites. The small site (SS) was an aggregate yard located in Jiphyeon-ri, Sejong city, with an area of 7 ha. The long direction of the site ran N-S with a length of approximately 0.34 km, whereas the short direction ran E-W with a distance of approximately 0.25 km. As the SS was small, one mission of a rotary-wing UAV could cover the entire area. The medium site (MS) was the Pado-ri coast located in Taean-gun, Chungcheongnam-do, with an area of 39 ha. The long direction of the site ran N-S with a length of approximately 1.2 km, whereas the short direction ran E-W with a distance of approximately 0.3 km. The MS required four missions of a rotary-wing UAV. The large site (LS) was the Daedeok industrial complex in Daejeon metropolitan city, with an area of 342 ha. The long direction of the site ran NW-SE with a length of approximately 2.4 km, whereas the short direction ran NE-SW with a distance of approximately 1.7 km. As the LS was extensive even for a fixed-wing UAV, which has a longer flight time than a rotary-wing UAV, three flight missions were performed using the fixed-wing UAV. The flight sites and their positions within the Korean peninsula are summarized in Figure 1.

**Figure 1.** From left to right: a map of the Korean peninsula showing the locations of the three study sites, (**a**) the small site, (**b**) the medium site, and (**c**) the large site; aerial orthomosaics of the three labeled study sites; and images of the study sites captured from the UAVs.

In the SS, no vegetation grew on the ground surface because aggregate was continuously carried in and out. The MS was composed of tidal flats, where flood and ebb tides alternated, with artificial structures behind them; thus, there was no vegetation in the tidal flats. The LS was covered with buildings, pavements, and sidewalk blocks; therefore, there was no vegetation except street trees.

#### *2.2. Data Collection and Photogrammetry Process*

The target sites were classified as SS, MS, and LS; then, 3, 6, 9, 12, 15 (14 in the case of the SS, where the increase in the number of GCPs was halted owing to space restrictions), and 18 GCPs were placed in each target site to investigate the effects of the number of GCPs on the accuracy of 3D point clouds and DSMs for these sites. The 3D point clouds and DSMs were produced using each GCP set as an exterior orientation parameter in SfM-based image processing. The accuracy of the generated results was analyzed through CPs.

The research method can be divided into in situ operations, including UAV flights, image acquisition, and land survey; the photogrammetry process, where results are produced according to the number of GCPs; and accuracy analysis (Figure 2).

**Figure 2.** Flow of research, from in situ operations to the photogrammetry process, for each of the SS, MS, and LS.

#### 2.2.1. In Situ Operations

The field survey was divided into UAV flight, image acquisition, and land survey using a global navigation satellite system (GNSS). A field survey for the SS was conducted on 10 October 2018. In the SS, the Inspire 1 Pro model was used, and images were captured using a Zenmuse X5 camera. As the area of the SS was small, imaging could be completed during a single flight.

A field survey for the MS was conducted on 2 December 2016. In the MS, the Phantom 3 Advanced model was used, and the camera mounted on the UAV (FC300S) was used. As the area of the MS exceeded the one-time flight coverage of the rotary-wing UAV, four missions were executed.

A field survey for the LS was conducted on 13 April 2017. In the LS, the QuestUAV DataHawk, a fixed-wing UAV, and ILCE-QX1 imaging camera were used. As the area of the LS exceeded the coverage of the fixed-wing UAV, which had a longer flight time than the rotary-wing UAV, three flight missions of the fixed-wing UAV were executed.

For land surveys, the Trimble R8s model was used for the SS and LS, and the Huace X90 model was used for the MS. This is summarized in Table 1.

Automatic path flight measurements for uniform image acquisition, SfM-based image processing, accuracy analysis, and mapping were processed using the same software for each site. Pix4d Capture was used for automatic path flight measurements. Agisoft Metashape (v. 1.5.3) was used for SfM-based image processing, CloudCompare (v. 2.10.2) for accuracy analysis, and ArcMap (v. 10.1) for mapping, as summarized in Table 2.

The coordinate system used for both flight and imaging was WGS 84 (EPSG: 4326), and the Korea 2000 / Central Belt 2010 (EPSG: 5186) coordinate system was used for GCP measurement. Although different coordinate systems were used for flight and measurement, absolute orientation was performed according to the land survey results during image processing.
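For illustration, this kind of coordinate conversion can be scripted, e.g., with the pyproj library; the point below is a hypothetical location near Sejong, not one of the surveyed GCPs:

```python
from pyproj import Transformer

# WGS 84 (EPSG:4326) flight/imaging coordinates -> Korea 2000 / Central Belt 2010
# (EPSG:5186) used for the GCP survey; always_xy keeps (lon, lat) axis order.
to_korea2000 = Transformer.from_crs("EPSG:4326", "EPSG:5186", always_xy=True)

lon, lat = 127.289, 36.502  # hypothetical point near Sejong
easting, northing = to_korea2000.transform(lon, lat)
print(easting, northing)
```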


**Table 1.** Summary of study sites, hardware, and specifications used in the research.

#### 2.2.2. Construction of 3D Point Clouds and DSMs and Their Accuracy Evaluation

SfM is a technology for reconstructing the camera's position and direction from multiple overlapping two-dimensional (2D) images and restoring the subject and scene in 3D. This technology, based on computer vision, was developed in the 1990s and became widely used in the 2000s [31,51].


In this study, imaging (acquisition of 2D images) was performed in each study site using the UAVs. Based on this, 3D point clouds and DSMs were constructed as final results using Metashape, a software program based on the SfM algorithm. To produce the final results from 2D images, Metashape proceeds through the following steps: camera calibration and alignment, absolute orientation, camera alignment optimization, 3D point cloud generation, and DSM generation.
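For reference, the same pipeline can be driven through Metashape's bundled Python API. The sketch below is illustrative only: method names vary somewhat across Metashape versions, the image list is a placeholder, and GCP marker placement is indicated only as a comment:

```python
import Metashape  # Agisoft's Python module, bundled with Metashape Professional

doc = Metashape.Document()
chunk = doc.addChunk()
chunk.addPhotos(["IMG_0001.JPG", "IMG_0002.JPG"])  # placeholder image list

chunk.matchPhotos()      # camera calibration and tie-point matching
chunk.alignCameras()     # camera alignment
# ... GCP markers would be placed here for absolute orientation ...
chunk.optimizeCameras()  # camera alignment optimization against the markers
chunk.buildDepthMaps()
chunk.buildDenseCloud()  # dense 3D point cloud (method name varies by version)
chunk.buildDem()         # DSM interpolated from the dense cloud
doc.save("project.psx")
```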

This imaging process is a minimal procedure required to produce a 3D point cloud and DSM. As this study aims to identify the accuracy of 3D point clouds and DSMs according to the area of each study site and the number of GCPs, image processing was performed by applying multiple GCP sets for each study site.

The accuracy of the produced 3D point clouds and DSMs can be evaluated using various methods. In this study, the accuracy of *x*, *y*, *z*, *xy*, and *xyz* was evaluated for verification using the errors and *RMSE* between the constructed results and the measured CPs. The errors represent individual errors between the created point clouds and CPs. *RMSE* indicates the overall accuracy of the results by combining individual errors and is one of the generally used criteria for position accuracy [28,52].

The error between the 3D point cloud and the CPs was calculated using the "Cloud/Cloud distance" tool in CloudCompare. Similarly, the error between the DSM and the CPs was calculated using the "Extract Values to Point" tool in ArcMap. The mean distance and standard deviation were calculated from these errors with the programs, and finally the (*RMSE*) values in *x*, *y*, and *z* were produced (Equations (1)–(5)):

$$(RMSE)\_x = \sqrt{\frac{\sum\_{i=1}^{n} \Delta x\_i^2}{n}},\tag{1}$$

where Δ*xi* is the difference between the CP coordinates and the coordinates determined from the 3D point cloud and DSM, and *n* is the number of points. The same equation applies to (*RMSE*)*y* and (*RMSE*)*z*, mutatis mutandis:

$$(RMSE)\_y = \sqrt{\frac{\sum\_{i=1}^{n} \Delta y\_i^2}{n}},\tag{2}$$

$$(RMSE)\_{xy} = \sqrt{(RMSE)\_x^2 + (RMSE)\_y^2},\tag{3}$$

where (*RMSE*)*<sup>x</sup>* represents the *x*-direction error in the plane between the CPs and the produced 3D point cloud (Equation (1)) and (*RMSE*)*<sup>y</sup>* represents the *y*-direction error in the plane between the CPs and the generated 3D point cloud (Equation (2)). The individual *x*- and *y*-direction error is calculated as (*RMSE*)*xy*, a radius error, which corresponds to the horizontal error of the 3D point cloud (Equation (3)):

$$(RMSE)\_z = \sqrt{\frac{\sum\_{i=1}^{n} \Delta z\_i^2}{n}},\tag{4}$$

$$(RMSE)\_{xyz} = \sqrt{(RMSE)\_x^2 + (RMSE)\_y^2 + (RMSE)\_z^2}.\tag{5}$$

Further, (*RMSE*)*<sup>z</sup>* represents the *z*-direction (vertical) error. For vertical accuracy testing, different methods are used in non-vegetated terrain (where errors typically follow a normal distribution suitable for RMSE statistical analyses) and vegetated terrain (where errors do not necessarily follow a normal distribution) [28]. In this study, (*RMSE*)*<sup>z</sup>* was applied for the evaluation of the vertical accuracy because all of the study sites were non-vegetated terrains (Equation (4)). (*RMSE*)*xyz* represents the error of the 3D point cloud in the overall direction (easting, northing, and elevation) (Equation (5)).
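As an illustrative alternative to the CloudCompare/ArcMap workflow described above, Equations (1)–(5) can be computed directly by pairing each CP with its nearest cloud point, in the spirit of the "Cloud/Cloud distance" tool; a minimal sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def rmse_components(cps: np.ndarray, cloud: np.ndarray) -> dict:
    """RMSE_x/y/z/xy/xyz (Equations (1)-(5)) between surveyed checkpoints
    (n, 3) and a 3D point cloud (m, 3), pairing each CP with its nearest
    cloud point as a stand-in for CloudCompare's Cloud/Cloud distance tool."""
    _, idx = cKDTree(cloud).query(cps)   # nearest cloud point per CP
    d = cps - cloud[idx]                 # per-CP (dx, dy, dz) differences
    rx, ry, rz = np.sqrt(np.mean(d ** 2, axis=0))
    return {
        "x": rx, "y": ry, "z": rz,
        "xy": np.hypot(rx, ry),                    # Equation (3)
        "xyz": np.sqrt(rx**2 + ry**2 + rz**2),     # Equation (5)
    }
```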

#### **3. Results**

#### *3.1. In Situ Operation*

In the SS, a total of 52 aerial images were acquired through a flight at an altitude of 120 m. The front and side overlap were 85% and 65%, respectively. In the MS, imaging was performed at an altitude of 70 m, and both the front and side overlap were 80%. As the site exceeded the flight radius of the rotary-wing UAV, four flights were performed, and a total of 1022 aerial images were acquired. In the LS, imaging was performed at an altitude of 150 m, and both the front and side overlap were 80%. As the site exceeded the flight radius of the fixed-wing UAV, three flights were performed, and 1163 aerial images were obtained. The results of the UAV flights and field surveys are summarized in Table 3.
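As a rough illustration of how altitude and overlap translate into image counts, a back-of-the-envelope sketch follows; the sensor and focal-length values are placeholders (roughly a Micro Four Thirds camera), not the verified specifications of the cameras used here, and real flight plans add turns and margins, so the estimate is indicative only:

```python
import math

def photo_count(site_w_m, site_l_m, altitude_m, front_overlap, side_overlap,
                sensor_w_mm=17.3, sensor_h_mm=13.0, focal_mm=15.0):
    """Estimate the number of nadir photos needed to cover a rectangular site."""
    footprint_w = altitude_m * sensor_w_mm / focal_mm   # across-track coverage (m)
    footprint_h = altitude_m * sensor_h_mm / focal_mm   # along-track coverage (m)
    line_spacing = footprint_w * (1 - side_overlap)     # distance between flight lines
    photo_spacing = footprint_h * (1 - front_overlap)   # distance between exposures
    return math.ceil(site_w_m / line_spacing) * math.ceil(site_l_m / photo_spacing)

# SS-like scenario: ~250 m x 340 m site, 120 m altitude, 85%/65% overlap
print(photo_count(250, 340, 120, 0.85, 0.65))
```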


**Table 3.** Results of UAV flight and field survey.

The arrangement adopted for the placement and the number of the GCPs is as follows: The number of GCPs was increased by a multiple of 3, starting with the minimum number of GCPs required to obtain an absolute orientation in Metashape. The GCPs were arranged to form a central polygon that covered the study area while keeping the gaps between the GCPs as uniform as possible.
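One plausible way to script such a layout (a ring of GCPs forming a central polygon plus a centre point, in local site coordinates) is sketched below; the actual placements in this study were adapted to the terrain and are not reproduced by this heuristic:

```python
import numpy as np

def gcp_layout(n: int, width: float, height: float) -> np.ndarray:
    """Place n GCPs: n-1 on a ring inscribed in the site's bounding box
    (keeping gaps as uniform as possible) plus one at the centre.
    Coordinates are local metres; returns an (n, 2) array."""
    cx, cy, r = width / 2, height / 2, 0.4 * min(width, height)
    angles = np.linspace(0, 2 * np.pi, n - 1, endpoint=False)
    ring = np.c_[cx + r * np.cos(angles), cy + r * np.sin(angles)]
    return np.vstack([ring, [cx, cy]])

print(gcp_layout(9, 340, 250))  # e.g., a 9-GCP set for an SS-sized site
```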

Although the SS area was small, there was a difference in altitude between the aggregate stacked at the center of the site and the surroundings. Considering this, GCPs were evenly placed at the top of the aggregate and in the surrounding areas. A total of 14 GCPs were placed, of which 3, 6, 9, 12, and 14 were used in the absolute orientation step of image processing. The MS was long in the N-S direction owing to the nature of the coastal area, and GCPs were placed accordingly. A total of 18 GCPs were placed and divided into six cases (3, 6, 9, 12, 15, and 18 GCPs). In the LS, 18 GCPs were placed, as in the MS.

At least 20 CPs are required to evaluate the accuracy of the produced image processing results [28]. Surveying CPs across the whole study area can ensure higher reliability; however, limitations of time and labor restrict the number of CPs surveyed. Therefore, only CPs around the main targets, such as aggregate mounds, sand beaches, and main streets in each study area, were measured. In the SS, 311 points were acquired in the upper and surrounding areas of the aggregate yard; in the MS, 79 points were acquired by setting two survey lines across the sand beach; and in the LS, 436 points were acquired by planning one survey line in the long direction of the site and two lateral lines in the short direction, as shown in Figure 3.


**Figure 3.** Deployment of GCPs (red dots) and CPs (black crosses) for the (**a**) small site, (**b**) medium site, and (**c**) large site.

#### *3.2. Photogrammetry Process*

#### Constructing 3D Point Clouds and DSMs

The SS had five GCP sets, generating five 3D point clouds and DSMs. The MS and LS each had six GCP sets, producing six 3D point clouds and DSMs (Figure 4). There were small differences in the number of points, point density, and DSM and orthomosaic resolution depending on the number of GCPs. Since the average number of points in a 3D point cloud is proportional to the area, a large difference in the average number of points is seen between the study areas. Table 4 shows the average values of the produced results.

**Table 4.** Results of the field survey, summarizing the average number of points, density, and resolution of the 3–18 GCP results.


There were differences between the resolutions of the DSM and orthomosaic because 3D point cloud processing was set to "Medium" in Metashape, which downscales the original images at a ratio of 1:4. If a 3D point cloud and DSM were produced at the original scale in Metashape, the average DSM resolution shown in Table 4 would improve by a factor of four, matching the orthomosaic resolution, and the average point density and average number of points would also increase fourfold. The image processing results were not produced at the original scale because of their file size. This restriction matters less for the SS and MS because their areas are not large. In the case of the LS, however, the area is very large, and the file size would become excessive if 3D point clouds and DSMs were produced at a 1:1 scale, making post-processing of the results difficult.

**Figure 4.** 3D point clouds (**a**) and DSMs (**b**) of the study sites.

#### *3.3. Accuracy Analysis According to the Number of GCPs*

#### 3.3.1. Analysis of the Accuracy of 3D Point Clouds

The accuracy of 3D point clouds was evaluated using the horizontal (*xy*), vertical (*z*), and total (*xyz*) errors between the 3D point clouds and CPs. Figure 5 shows the average horizontal error between the 3D point cloud and CPs according to the number of GCPs. Figure 5a shows the average horizontal error distribution of the SS, and the error ranged from 0.046 to 0.050 m. Figure 5b shows the average horizontal error distribution of the MS. The average horizontal error was 0.103 m when the number of GCPs was three. This error ranged from 0.041 to 0.045 m when the number was six or larger. Figure 5c shows the average horizontal error distribution of the LS. The average horizontal error was 0.453 m when the number of GCPs was three, with a range of 0.061 to 0.063 m for other cases.

**Figure 5.** 3D point cloud horizontal error of the study areas: (**a**) SS, (**b**) MS, and (**c**) LS.

Figure 6 shows the average vertical errors between the 3D point clouds and CPs in each site. Figure 6a shows the vertical error distribution of the SS, where the error ranged from −0.004 to −0.063 m. Figure 6b shows the average vertical error distribution of the MS. The average vertical error was −0.916 m for three GCPs and ranged from −0.018 to −0.046 m in other cases. Figure 6c shows the average vertical error distribution of the LS. The average vertical error was 3.81 m for three GCPs, and it ranged from −0.004 to 0.114 m in other cases.

**Figure 6.** 3D point cloud vertical error of the study areas: (**a**) SS, (**b**) MS, and (**c**) LS.

Figure 7 shows the average total errors between the 3D point clouds and CPs in each site. Figure 7a shows the average total error distribution of the SS, and the error ranged from 0.063 to 0.111 m. The average total error of the MS was 0.972 m for three GCPs, as shown in Figure 7b, and it ranged from 0.055 to 0.07 m in other cases. Figure 7c shows the total error distribution of the LS. The average total error was 3.951 m for three GCPs, and it ranged from 0.088 to 0.174 m in other cases.

**Figure 7.** 3D point cloud total error of the study areas: (**a**) SS, (**b**) MS, and (**c**) LS.

The 3D point clouds produced for each GCP set were analyzed by using the average error range. In the case of the SS area, there was no significant difference observed in the average horizontal error corresponding to the increase or decrease in the GCP number. However, in the cases of the MS and LS areas, the average horizontal error showed a decrease when the number of GCPs was increased. Additionally, the average vertical error decreased significantly when six GCPs were used. The average total error showed a similar pattern to the average vertical error; however, when more than 12 GCPs were used, there was no significant decrease observed in the average total error.

#### 3.3.2. Analysis of the Accuracy of DSMs

DSMs are produced by interpolating 3D point clouds, and their vertical errors were analyzed because the elevation of the surface is considered important. Figure 8 shows the average vertical errors between the DSMs produced with varying numbers of GCPs and the CPs in each site.

**Figure 8.** Vertical DSM accuracy of the study areas: (**a**) SS, (**b**) MS, and (**c**) LS.

Figure 8a shows the average vertical error distribution of the SS, which ranged from −0.011 to 0.074 m. Figure 8b shows the average vertical error distribution of the MS. The average error was −0.939 m for three GCPs, with a −0.019 to −0.061 m range in other cases. Figure 8c shows the average vertical error distribution of the LS. The average error was 3.986 m when three GCPs were used, and it ranged from −0.01 to 0.11 m in other cases. The vertical error observed in the DSMs was similar to the average vertical error observed in the 3D point clouds.

#### 3.3.3. Comprehensive Comparison

The errors between the constructed results and CPs were analyzed in the above sections to verify the 3D point clouds and DSMs constructed for each site. RMSE by region and the number of GCPs were analyzed to evaluate the overall accuracy of the constructed results (Figure 9; Table 5).


**Table 5.** RMSEs of the 3D point clouds (in meters).


**Figure 9.** Comprehensive comparison of the RMSE results for the different study sites: (**a**) horizontal RMSEs of 3D point clouds, (**b**) vertical RMSEs of 3D point clouds, (**c**) total RMSEs of 3D point clouds, and (**d**) vertical RMSEs of DSMs.

The SS's horizontal RMSEs were 0.054 m (largest) for three GCPs and 0.050 m for 14 GCPs. The smallest RMSE of 0.050 m was observed when using 6 and 14 GCPs. The MS's horizontal RMSEs were 0.117 m (largest) for three GCPs and 0.048 m for 18 GCPs. The smallest RMSE of 0.044 m was observed when nine GCPs were used. The LS's horizontal RMSEs were 0.907 m (largest) for three GCPs and 0.065 m (smallest) for 18 GCPs.

The vertical RMSEs of the SS were 0.114 m (largest) for three GCPs and 0.044 m (smallest) for 14 GCPs. The vertical RMSEs of the MS were 1.148 m (largest) for three GCPs and 0.036 m for 18 GCPs. The smallest vertical RMSE of the MS was 0.036 m when 15 and 18 GCPs were used, and the vertical RMSEs of the LS were 4.101 m (largest) for three GCPs and 0.067 m for 18 GCPs. The smallest vertical RMSE was observed when the number of GCPs was largest.

The total RMSEs of the SS were 0.126 m (largest) for three GCPs and 0.067 m for 14 GCPs. The total RMSEs of the MS were 1.154 m (largest) for three GCPs and 0.060 m for 18 GCPs. The LS's total RMSEs were 4.200 m (largest) for three GCPs and 0.093 m for 18 GCPs.

The vertical RMSEs of DSMs for each site are as follows. Vertical RMSEs of the SS were 0.117 m (largest) for three GCPs and 0.039 m (smallest) for 14 GCPs. The MS had vertical RMSEs of 1.173 m (largest) for three GCPs and 0.037 m (smallest) for 18 GCPs. Finally, the LS's vertical RMSEs were 4.248 m (largest) for three GCPs and 0.069 m (smallest) for 18 GCPs, as in the previous case.

The RMSE analysis showed similar trends to the error analysis. Excluding the SS area, the RMSE decreased significantly when six GCPs were used. The total RMSE appeared to be optimized when 12 GCPs were used, although there were differences depending on the area of the study site.

#### **4. Discussion**

As mentioned in the introduction, the number of GCPs used is important for performing absolute orientation in UAV surveys [24,53] because installing GCPs is not only labor-intensive but also time-consuming [35,36]. In addition, a large number of GCPs may increase the time required to mark them in drone images during image processing. Therefore, determining the optimal number of GCPs for constructing 3D point clouds and DSMs is significant, as it enables time-efficient work without wasted labor.

To find the optimal number of GCPs according to each target site's area, the accuracy of the 3D point clouds and DSMs was calculated according to the number of GCPs. GCPs were evenly distributed in each target site so that the GCP network could form a central polygon capable of covering the site [26,43,54].

Drone modeling has various end-users, and each end-user requires a different type of accuracy (vertical, horizontal, or total) depending on their needs. Therefore, the horizontal, vertical, and total accuracy were calculated individually, and whether they met the horizontal/vertical accuracy criteria of the American Society for Photogrammetry and Remote Sensing (ASPRS) was assessed [25]. In the ASPRS accuracy classes, horizontal accuracy is judged relative to the ground sample distance (GSD) of the original image, whereas vertical accuracy is specified only as an absolute value. As all three study sites were non-vegetated terrains, the non-vegetated vertical accuracy class was used for the judgment of accuracy (Table 6).

**Table 6.** Horizontal accuracy quality examples for high-accuracy digital planimetric data and vertical accuracy quality examples for digital elevation data.


#### *4.1. Horizontal Errors and RMSEs of 3D Point Clouds*

In the SS, the average horizontal errors were 0.05 m for three GCPs and 0.047 m for 14 GCPs. The smallest average horizontal error was 0.046 m when using six GCPs. The standard deviation ranged from 0.019 to 0.020 m, indicating no significant difference between the largest and smallest numbers of GCPs. The RMSE also ranged from 0.050 to 0.054 m, within 2 \* GSD (5.94 cm) in all cases. In summary, using only three GCPs in the small area of the SS met the horizontal position accuracy criterion of ASPRS. Therefore, using only three GCPs is sufficient when only the horizontal accuracy is considered.
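The screening applied throughout this section can be expressed compactly; the sketch below encodes only the 2 \* GSD rule of thumb used in the text, not the full class table of the ASPRS Positional Accuracy Standards:

```python
def meets_asprs_horizontal(rmse_xy_m: float, gsd_m: float) -> bool:
    """Check the 2 * GSD horizontal screening used in the text: the
    horizontal RMSE must be within twice the ground sample distance
    of the original imagery."""
    return rmse_xy_m <= 2 * gsd_m

# SS: RMSE 0.050-0.054 m vs 2 * GSD = 0.0594 m -> criterion met with only 3 GCPs
print(meets_asprs_horizontal(0.054, 0.0297))
```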

The average horizontal error of the MS exceeded 0.1 m when the number of GCPs was three and became smaller when it was six. The error was the smallest (0.041 m) when the number of GCPs was nine. The standard deviation was 0.055 m for three GCPs and sharply decreased from six GCPs with a range of 0.014 to 0.016 m when the number of GCPs was between 6 and 18. RMSE was also inaccurate (approximately 5 \* GSD) when three GCPs were used but exhibited an accuracy of less than 2 \* GSD (5.38 cm) from six GCPs. As the horizontal accuracy criterion of ASPRS could be met when the number of GCPs was six or larger for the area of the MS, it is necessary to secure at least six GCPs when only the horizontal accuracy is considered.

The accuracy trend of the LS according to the number of GCPs was very similar to that of the MS. When the number of GCPs was three, the average horizontal error was 0.453 m. However, the error sharply decreased and ranged from 0.061 to 0.067 m when the number of GCPs was between 6 and 18. The standard deviation also decreased from 0.78 m to 0.023–0.028 m as the number of GCPs increased from three. The RMSE was also excessively large (close to 1 m) when the number of GCPs was three but decreased to less than 2 \* GSD with six GCPs. This indicates that using at least six GCPs can meet the horizontal accuracy criterion of ASPRS for sites whose area is similar to or larger than that of the MS.

#### *4.2. Vertical Errors and RMSEs of 3D Point Clouds*

In the SS, the average vertical error was the largest (−0.063 m) when the number of GCPs was three. As the number increased, the error slowly decreased and showed a tendency to converge to zero. The difference between the two cases was only 1 mm when using 12 and 14 GCPs. The standard deviation was also the largest for three GCPs. However, it slowly decreased and there was a small difference when 12 and 14 GCPs were used. RMSE also exceeded 0.1 m for three GCPs but was less than 0.1 m for six and nine GCPs. It was close to 0.04 m from 12 GCPs. These results indicate that at least six GCPs must be installed in small areas, such as SS, considering the vertical accuracy and that at least 12 GCPs must be used to achieve the 5 cm class of ASPRS.

The average vertical error of the MS was very inaccurate (approximately −0.9 m) with three GCPs. As the number increased from 6 to 12, the average vertical error tended to decrease, reaching −0.027 m at 12 GCPs and −0.018 m when 15 and 18 GCPs were used. The standard deviation amounted to 0.7 m when the number of GCPs was three; it was 0.036 m for six GCPs and approximately 0.03 m from nine GCPs. The RMSE was 1.148 m for three GCPs, slightly exceeded 0.05 m for six and nine GCPs, and was close to 0.03 m from 12 GCPs. In summary, when considering the vertical position accuracy, excellent accuracy could be secured for the MS with six GCPs, as in the SS; accuracy within the 5 cm class was observed when 12 or more GCPs were used.

In the LS, the error was close to 3.8 m with three GCPs, and it exceeded 0.1 m even when the number increased to six. The error was 0.06 m for nine GCPs, and a stable error range was recorded only when the number of GCPs reached 12. The standard deviation also varied widely as the number of GCPs increased from three to nine but became less than 0.08 m when 12 GCPs were used. Unlike the SS and MS, which exhibited excellent accuracy from six GCPs, the RMSE of the LS became less than 0.1 m only from 12 GCPs, and the LS exhibited the highest accuracy using 18 GCPs. In the LS, with its large area, the accuracy tended to improve as the number of GCPs increased; therefore, higher accuracy could have been achieved with more GCPs. In many cases, however, UAV modeling is not performed over areas as large as the LS [45].

#### *4.3. Total Errors and RMSEs of 3D Point Clouds*

While the horizontal and vertical errors represent the xy- and z-direction errors, the total error expresses the 3D error in the xyz direction. The mean total errors of the SS were approximately 0.1 m when three, six, and nine GCPs were used and decreased to 0.064 and 0.063 m when 12 and 14 GCPs were used. The standard deviation also exhibited the smallest difference when 12 or more GCPs were used. The RMSE exceeded 0.1 m for three and six GCPs and presented the best results (0.068 and 0.067 m) when 12 and 14 GCPs were used. Therefore, for the area of the SS, 12 or more GCPs must be used to obtain high accuracy when considering the total accuracy.

In the case of the MS, the average error changed only by millimeters from 12 GCPs onward, which is also true of the standard deviation and RMSE. As the RMSE converged to less than 0.06 m from 12 GCPs, it is necessary to install 12 or more GCPs to obtain high accuracy.

In the LS, the average error, standard deviation, and RMSE were significantly reduced when 12 or more GCPs were used, as in the MS. The RMSE was approximately 0.08 m from 12 GCPs. Unlike in the MS, however, the error and standard deviation were somewhat reduced when 18 GCPs were used compared to 12 and 15 GCPs. Thus, 12 or more GCPs must be used to secure high accuracy in large areas such as the LS, and using more than 18 GCPs may increase the accuracy slightly.

#### *4.4. Vertical Errors and RMSEs of DSMs*

As DSMs are produced by interpolating 3D point clouds, their accuracy is similar to that of the 3D point clouds. In the SS, the average vertical error of the DSM was the largest when the number of GCPs was three. It decreased as the number of GCPs increased, and there was almost no difference between 12 and 14 GCPs. The standard deviation was also the smallest when using 12 and 14 GCPs, similarly to the 3D point clouds. Further, the RMSE was approximately 0.04 m from 12 GCPs.

In both the MS and LS, the DSM vertical errors differed little from those of the corresponding 3D point clouds. In summary, in the case of DSMs, at least 12 GCPs must be secured to ensure the accuracy of the result when vertical errors are considered.

#### *4.5. Comprehensive Discussion*

For each study site, the number of GCPs required to derive highly accurate horizontal, vertical, and total accuracy differed (Table 7). In this table, "minimum" represents the minimum number of GCPs needed to achieve an accuracy of approximately 0.1 m, which is regarded as "excellent" accuracy, whereas "optimal" is the number of GCPs required to meet the ASPRS accuracy criteria. The criteria for vertical and total errors are relative.

**Table 7.** Minimum and optimal number of GCPs for each area and accuracy (horizontal, vertical, and total).


This study's results showed that using only three GCPs could result in the optimal accuracy when considering the horizontal accuracy in a small area. Regarding the vertical and total accuracy in typical drone survey areas, such as SS and MS, 6 GCPs are required for excellent accuracy and 12 GCPs for optimal accuracy. Therefore, installing more than 12 GCPs in an area of approximately 39 ha or less is inefficient.

In the LS, the optimal horizontal accuracy was observed from six GCPs. The vertical and total accuracy showed excellent results when at least 12 GCPs were used, and the optimal accuracy was observed using 18 GCPs. In previous studies, the accuracy improved as the number of GCPs increased for very large areas, but the degree of improvement with increasing numbers of GCPs was unclear [19,24,26,43]. In this study, the degree of improvement was not evident either.

Thus, in typical areas (such as the SS and MS), using 12 GCPs will produce highly accurate 3D point clouds and DSMs. In larger areas, the installation of more than 12 GCPs will be required depending on the needs. These results indicate that using too many GCPs may not be worth the labor and cost [55,56], even for target sites of diverse areas, in contrast to previous studies that calculated an optimal number of GCPs per unit area [26,29,41,43].

In order to find methods that can improve UAV modeling accuracy, research has been conducted on factors such as camera angle, altitude, overlap, and the interior orientation parameters and calibration of the camera, rather than on exterior orientation parameters such as GCPs. In addition, the need for GCP installation has been significantly reduced by the RTK and PPK techniques [25,27,37]. Therefore, with the development of technology, the dependence on the number of GCPs is expected to decrease gradually. Nevertheless, discussion of the number of GCPs will continue, as using GCPs remains the most widely used and most accurate UAV modeling method.

There are extremely diverse target sites for UAV modeling, as almost the entire surface of the earth can be targeted [57–61]; thus, the elevation and roughness of the ground surface differ between sites. In our study, the three study areas were similar in that there was little or no vegetation. However, the main surfaces varied as follows: (i) aggregate and bare land (SS), (ii) sand and gravel beach (MS), and (iii) artificial/man-made areas (LS). In addition, the studies on the number of GCPs mentioned in the introduction [38–43] had elevation differences of less than 100 m within their study areas, except for the study of Enoc Sanz-Ablanedo et al. [43]. Generally, the distance between GCPs is given as the plane distance, but if the height difference in the study area is large, the height should be considered because the slant distance increases in proportion to the height difference. If the characteristics of the ground surface are considered in future studies, this may produce interesting results.

Herein, different UAVs and cameras were used for the three study sites. Various models and cameras were used because UAVs ranging from hobby-grade to industrial models are routinely employed in UAV modeling; therefore, it should be possible to apply the results of this study more universally. In future research, however, it will be essential to reinforce variable control by changing only the relative area while keeping the same UAV, camera, and site.

Up to 18 GCPs were used for each site, and the LS exhibited the smallest RMSE with 18 GCPs. In future research, if the interval between tested numbers of GCPs is reduced to less than three and more than 18 GCPs are used, the results could reinforce those of this study.

#### **5. Conclusions**

In this study, the accuracy of 3D point clouds and DSMs were analyzed in three study sites with different areas according to the number of GCPs to propose the optimal number of GCPs for the 3D UAV modeling of various areas.

When the horizontal accuracy was considered, three or more GCPs had to be used in the SS, while six or more GCPs had to be used in the MS and LS. In terms of the vertical accuracy, using only 12 GCPs reached the optimal accuracy in SS and MS, and 18 GCPs in the LS. When considering the total accuracy that covers both the horizontal and vertical accuracy, using only 12 GCPs exhibited the optimal accuracy in SS and MS, and again 18 GCPs in the LS as with the vertical accuracy.

When 3D point clouds, DSMs, and orthomosaics are produced using UAVs, the installation of GCPs requires considerable time and labor, both indoors and outdoors. Most previous studies discussed the number of GCPs in only one study area. In contrast, our study selected various UAV modeling sites (encompassing a natural environment, a large industrial complex, and an environmental monitoring area) with different areas. We expect that if the results of this study are applied to actual UAV modeling, it may be possible to reduce the time and labor required for GCP installation.

**Author Contributions:** Conceptualization, J.J.Y.; methodology, J.J.Y.; land survey and UAV operation, J.J.Y., D.W.K.; validation, J.J.Y., S.W.S.; formal analysis, J.J.Y.; investigation, J.J.Y.; resources, S.W.S.; data curation, J.J.Y.; writing—original draft preparation, S.W.S.; writing—review and editing, J.J.Y., E.J.L. and S.W.S.; visualization, J.J.Y., D.W.K.; supervision, J.J.Y., D.W.K., E.J.L. and S.W.S.; project administration, J.J.Y.; funding acquisition, J.J.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Environmental Assessment Monitoring Project (GP-2020-05) of the Korea Environment Institute, and Broadcasting and Communication Development Fund of the Ministry of Science and ICT.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Drone Magnetometry in Mining Research. An Application in the Study of Triassic Cu–Co–Ni Mineralizations in the Estancias Mountain Range, Almería (Spain)**

**Daniel Porras 1, Javier Carrasco 2, Pedro Carrasco 3, Santiago Alfageme 4, Diego Gonzalez-Aguilera 3,\* and Rafael Lopez Guijarro <sup>5</sup>**

Received: 19 October 2021; Accepted: 14 December 2021; Published: 18 December 2021

**Abstract:** The use of drones in mining and geological exploration is under rapid development, especially in the field of magnetic prospection. In part, this is related to the advantages they present over ground surveys, allowing for high-density data acquisition with low loss of resolution, while being particularly useful in scenarios where vegetation, topography, and access are limiting factors. This work analyzes the results of a drone magnetic survey acquired across the old mines of Don Jacobo, where copper-cobalt-nickel stratabound mineralizations were exploited, in the Estancias mountain range of the Betic Cordillera, Spain. The survey used a potassium-vapor magnetometer installed on a Matrice 600 Pro hexacopter. Twenty-four parallel survey lines were flown at a speed of 5 m/s, orthogonal to the regional strike of the geological structure and mineralization, with 50 m line separation and 20 m flight height above the ground. The interpretation of the magnetic data allowed us to reveal and model two bodies of high magnetic susceptibility with remanent magnetization, close to the old mines and surface mineral shows. These bodies could be related to potentially unexploited mineralized areas whose formation may be linked to a normal fault located to the south of the survey area. Our geophysical survey provides essential data for improving the assessment of the geological and mining potential of the area and for designing future research activities.

**Keywords:** aeromagnetics; drone survey; mineral exploration; geophysical prospecting

#### **1. Introduction**

The demand for raw materials is rapidly increasing and is a fundamental pillar of modern development as well as of the future prospects of European industries. The result is a rise in demand for new material extraction sites able to support this type of development. In Europe, geological and mining research is currently hindered by the limited effectiveness and speed of traditional methodologies. This has resulted in adaptations in the field of geophysics in response to increasingly strict requirements.

In this context, the use of drones is under important development, incorporating more autonomous systems that allow for the integration of multiple sensors, such as RGB sensors, ultrasonic sensors, infrared (IR) sensors, stereo cameras, laser range finders (LRFs), ultra-wideband (UWB) radar, and hyperspectral cameras. This enables their use in a variety of civilian and military applications [1,2] and missions, including magnetometer surveys for geological and mining research [3–9]. Other authors have used drone magnetometry in the oil and gas industry to locate abandoned wells and other buried infrastructure such as pipelines over wide areas [10–12]. Traditionally, magnetic surveys are performed on-site by moving the sensor manually, capturing high-resolution spatial data at the cost of low productivity. These approaches are additionally limited by access to the area of study. Other approaches consist of installing sensors on planes, increasing productivity by covering larger areas in shorter periods of time, at the cost of spatial resolution [13]. The use of drones is thus an alternative of interest, presenting multiple operational advantages such as flexibility, ease of use, and lower logistic costs. Additionally, drones offer a high capacity for obtaining data over large areas in short periods of time, with fewer restrictions imposed by poor accessibility and by topographical, environmental, and vegetative conditions.

One of the most common techniques used in geological and mining research is magnetometry, which employs specially designed sensors that can be made airborne using drones [14–16]. It provides a means of remotely carrying out geophysical surveys based on the measurement of terrestrial magnetic variations at regular intervals along a set of profiles. Most minerals exhibit essentially non-magnetic behavior. Nevertheless, another group of minerals exists, the ferromagnetic minerals, which include the cobalt ores frequently found in the Estancias mountain range of the Betic Cordillera, Spain; their concentration in the Earth's crust generates detectable local variations in the magnetic field.

The present research project focuses on the analysis of the Don Jacobo mine, where copper and cobalt minerals hosted in Triassic carbonates were extracted from the mid-19th century to the beginning of the 21st century [17]. These minerals include azurite, malachite, limonite, pyrite, galena, and erythrite. Mining works in the area were limited to small galleries that penetrated only a few meters into the rock. There are no data from previous geophysical surveys or drilling activities.

This study thus has the primary objective of extracting information from this area regarding the possible presence of mineral bodies located under areas of complex topography, where steep slopes, scree debris, and vegetation make the application of ground geophysical techniques difficult. These topographic particularities hinder the use of other ground geophysical techniques, such as electrical tomography, induced polarization, and ground magnetometry, which highlights the value of drone-based research in this application.

#### **2. Geological Context and General Characteristics of the Locality**

The region of interest is located in the Estancias mountain range, in the southeastern part of the Iberian Peninsula, geologically positioned in the northern region of the Internal Betic Cordillera (Figure 1A), in materials of the Alpujárride Complex, belonging to the Internal Zones of the Betic Cordillera. The diverse lithologies present can be organized into three large lithological units defining this complex. Stratigraphically, these three units correspond, from bottom to top, with Paleozoic shales, Triassic phyllite-quartzites, and Triassic carbonates (dolomitic limestones). The first two units are additionally affected by Alpine metamorphism.

These units display a general east-west orientation (N75-90E) (Figure 1B), conditioned by strong deformation due to tectonic accidents [18,19]. These accidents have been interpreted mostly as fault propagation folds verging towards the south/south-east [20], thus conditioning the geological structures within this region (Figure 1B,C).

The area occupied by the Don Jacobo mine, and the general location of the target minerals, lies in a topographically abrupt region, positioned on a 1000 m by 300 m (length × width) portion of the Triassic dolomitic-limestone unit. This unit is located above phyllites and is bounded towards the south by a structure interpreted as a possible Tertiary normal fault, partially covered by Plio-Quaternary materials (Figure 1C).

From a metallogenetic perspective, the mineralizations are located within the Upper Triassic dolomites and limestones of the Alpujarride Complex and are considered to be stratabound [18]. Rich Co-Cu mineralization can be found here alongside other minor elements (Ni-Pb-Zn-Ag-Se-As-Hg) [21]. This is similar to other geological-lithological contexts of the Betic region, such as those found in Molvízar (Granada) and Huércal-Overa (Almería).

**Figure 1.** (**A**) Geological Context of the southeastern part of the Iberian Peninsula; (**B**) 1:50,000 Geological map (sheet 973—Chirivel and 995—Cantoria, Instituto Geológico y Minero de España, 1972 [18,19]), where the general east-west trend of the Alpujarride Complex can be observed; (**C**) 1:5000 cartographic revision of the Don Jacobo mine area, showing the main tectonic accident in the area (southern normal fault—black dashed line), survey area, and position of the Don Jacobo mines and outcropping mineral shows.

Genetically, these mineralizations have been attributed to a Mississippi Valley type, where the metals are a product of the hydrothermal washing of marine series or of mafic intrusions, with the carbonates acting as a reducing trap for the mineralization. The carbonate lithologies are constituted by large structures separated from one another by phyllite lithologies and typically associated with important tectonic accidents. These accidents are considered the potential channels for hydrothermal circulation [20].

The geochemical studies, as well as nearby mineral indices documented from the initial exploitation of the Don Jacobo mine, indicate the presence of Cu-Co-Ni primary mineralizations, with the presence of Pb-Zn (Ag), and with Cu and Co contents of >3% and 1%, respectively (Figure 2, [21]). No drilling data or geophysical surveys are available.

**Figure 2.** Mineralizations of the Don Jacobo mine; (**A**,**B**) Cu carbonates (green and blue colors) and veins of black Co oxides (black colors).

#### **3. Materials and Methods**

This research project consisted of the analysis of the mine's surroundings and outcropping mineral shows. Additionally, the carbonate outcrops that appear over the 80 ha area were analyzed.

#### *3.1. Site Conditions*

Data collection was performed on 14 June 2020, on a warm, sunny, cloud-free day with low wind speeds (<16 m/s), and lasted a total of 4 h (from 10 a.m. to 2 p.m.). The area under study is largely covered with matorral-type shrubland and scattered dense wooded areas (Corine Land Cover 2018 types 312 and 323; https://land.copernicus.eu/pan-european/corine-land-cover/clc2018, accessed on 1 November 2021). The area is accessible by only one track and is characterized by low anthropogenic magnetic noise and severe topography, with an average slope of >26° (Figure 3). To minimize risks such as drone collision with topographic or vegetative elements (the tallest trees in the area reach approximately 8 m above ground level, AGL) and to ensure flight at a constant height above the ground, a digital elevation model (DEM) of the area was first generated. For this flight, the most important geometric criteria for photogrammetric applications were considered [22], allowing the generation of high-quality cartographic data and a ground sample distance (GSD) of 4 cm/pixel. This DEM was calculated using a DJI Mavic 2 Pro drone with a 1" CMOS sensor, flying at 150 m AGL.
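For reference, the GSD of such a photogrammetric flight is commonly approximated from the sensor pixel pitch *p*, flying height *H*, and lens focal length *f* (a generic relation added here for illustration; the exact Mavic 2 Pro camera parameters are not given in the text):

$$GSD = \frac{p \cdot H}{f}$$

With a pixel pitch on the order of a few micrometers and a focal length of ~10 mm, a flight at 150 m AGL yields a GSD of a few centimeters per pixel, consistent with the 4 cm/pixel reported above.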

**Figure 3.** Slope map of the study area including drone flight lines (pink lines). The topographic profiles of flight lines 9 (FL9) and 18 (FL18) show the rugged topographical characteristics of the area.

#### *3.2. Platform and Flight Planning*

The platform employed for the present study was the DJI Matrice 600 Pro multi-rotor hexacopter (Figure 5A), equipped with an A3 Pro flight controller and compatible with the UgCS mission planning software. This equipment has a total takeoff weight of 9.6 kg and a payload capacity of up to 6 kg, and is powered by six lithium polymer batteries (4500 mAh).

Flight software (UgCS) was used for the design and control of the survey. Twenty-four 650-m-long parallel survey lines were flown trending N170E, orthogonal to the regional strike of the geological structure and mineralization (N75-90E), with a 50 m line separation, in order to obtain a spatial resolution high enough to observe magnetic field variations of the target size. Due to the extent of the survey area, the total flight was divided into four flight blocks with two take-off and landing points, taking into account the capacity of the batteries (Figure 4). The flight was programmed at a speed of 5 m/s with a sampling interval of 200 ms, yielding a measurement approximately every meter along the registered profile. This configuration results in a total of 14,500 magnetic total field registration points. The flight altitude (20 m AGL) was selected to maximize the resolution of the sensor while guaranteeing safety against obstacles, integrating the previously acquired DEM. A 1 m tolerance level was used for altitude adjustments.
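As a rough arithmetic check of these figures (an illustrative sketch, not the UgCS planning output; the published total of ~14,500 points also reflects line-end extensions and edited samples):

```python
# Back-of-the-envelope survey sampling check (illustrative values from the text).
speed_mps = 5.0          # programmed flight speed (m/s)
sample_interval_s = 0.2  # magnetometer sampling interval (200 ms)
n_lines = 24             # parallel survey lines
line_length_m = 650.0    # nominal line length (m)

sample_spacing_m = speed_mps * sample_interval_s    # ~1 m between readings
points_per_line = line_length_m / sample_spacing_m  # ~650 points per line
total_points = n_lines * points_per_line            # ~15,600 nominal points

print(f"sample spacing: {sample_spacing_m:.1f} m")
print(f"approx. total registration points: {total_points:,.0f}")
```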

**Figure 4.** Survey design including flight lines (pink lines) and flying blocks division, including the take-off and landing points.

The distance between the sensor and the surface produces a slight decrease in spatial resolution and intensity compared to ground surveys [16,23,24], which is compensated by regular, higher-density data acquisition.

#### *3.3. Magnetometry*

The drone was equipped with a GEM Systems GSMP-35U potassium-vapor magnetometer, with a sensitivity of 0.0002 nT at 1 Hz. The system simultaneously registers the magnetic field and position, using a real-time single-frequency (L1) GPS receiver with up to 0.7 m absolute accuracy in Satellite Based Augmentation System coverage areas.

This equipment consists of a sensor attached by cable to a controller, datalogger, 5 V battery power source, and GPS, with a total weight of <2 kg. The datalogger and batteries were securely fixed and balanced in the payload container of the drone's undercarriage, while the GPS antenna was installed on the upper portion of the drone to ensure a constant signal. Due to the magnetic interference generated by the drone's motors [5,24,25], the magnetic sensor was installed at a 3 m distance from the base of the drone, connected by cable, so as to counteract this electromagnetic effect (Figure 5B). In this configuration, the magnetic field produced by the drone is attenuated and does not affect the measurements of the GSMP-35U magnetometer [20].

**Figure 5.** Registration system. (**A**) The Matrice 600 Pro hexacopter drone; (**B**) static drone position with the magnetometer hanging below; (**C**) base magnetometer for diurnal corrections.

The magnetic sensor was deployed with no rotational restrictions about any of its axes. The survey was therefore designed with an extra 25 m added at each end of the flight lines, where the drone performs its 180° turn at reduced speed, to prevent the pendulum motion of the sensor that creates yaw, pitch, and roll variations.

In parallel, a fixed base magnetometer was installed in a nearby area, away from sources of magnetic interference, to calculate the diurnal correction for the temporal variation of the magnetic field throughout the day (Figure 5C). The base magnetometer recorded the total field at 1 s intervals and registered a maximum variation of 11 nT throughout the data collection period.

#### *3.4. Data Processing*

Data were processed using the OASIS Montaj 9.8 software, applying classic methods for the calculation of anomalies through a series of filters to obtain residual anomaly maps. First, erroneous values due to drone position, take-off, landing, pitch, excess roll, and lag errors between the sensor and the drone were deleted. All points located over the extended ends of the survey lines and other outliers were discarded by applying a 1D median filter. Finally, the data obtained during the survey period were corrected for the diurnal variation of the Earth's magnetic field caused by solar activity [26].
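A 1D median filter of the kind described can be reproduced with standard tools; the sketch below (illustrative only, not the OASIS Montaj implementation, with an assumed 2 nT spike threshold) despikes a single total-field profile:

```python
import numpy as np
from scipy.signal import medfilt

def despike_profile(total_field_nT, kernel_size=5, threshold_nT=2.0):
    """Replace samples deviating from a running median by more than a threshold.

    total_field_nT : 1D array of total-field readings along one flight line.
    Returns the profile with spikes replaced by the local median value.
    """
    smoothed = medfilt(total_field_nT, kernel_size=kernel_size)
    spikes = np.abs(total_field_nT - smoothed) > threshold_nT
    return np.where(spikes, smoothed, total_field_nT)

# Example: a short profile with one outlier reading
profile = np.array([47210.1, 47210.4, 47310.0, 47210.6, 47210.8])
print(despike_profile(profile))  # the 47310.0 spike is replaced by the local median
```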

Aeromagnetic data processing is based on a gridding routine that interpolates the observed aeromagnetic data from the survey locations onto a regular grid, with the nodes displayed as a 2D total magnetic field contour map (RGB image). The minimum curvature gridding method was applied to the observed data [27], at 1/4 of the flight line spacing (12.5 m) [28].
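The gridding step can be approximated with open-source tools. The sketch below uses SciPy's generic scattered-data interpolation as a stand-in for the minimum curvature algorithm (a different, spline-based method available in Oasis Montaj), with the 12.5 m node spacing quoted above and synthetic sample data:

```python
import numpy as np
from scipy.interpolate import griddata

# x, y: sample coordinates (m); tmf: total magnetic field values (nT).
# Synthetic stand-in data for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1200, 5000)
y = rng.uniform(0, 650, 5000)
tmf = 47000 + 100 * np.exp(-((x - 600)**2 + (y - 300)**2) / 1e5)

# Regular grid at 1/4 of the 50 m line spacing (12.5 m nodes), as in the text.
node = 12.5
gx, gy = np.meshgrid(np.arange(x.min(), x.max(), node),
                     np.arange(y.min(), y.max(), node))

# Cubic interpolation onto the grid; NOT minimum curvature, but a common proxy.
grid = griddata((x, y), tmf, (gx, gy), method="cubic")
print(grid.shape)
```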

#### **4. Results**

#### *4.1. Total Magnetic Field (TMF) and Reduction to Pole (TMFRP)*

Data interpretation begins with the calculation of the Total Magnetic Field (TMF) from the filtered data (Figure 6A), and of the Reduction to the Pole (TMFRP). By applying the magnetic inclination and declination of the area on the day of data collection, in combination with the International Geomagnetic Reference Field (IGRF), the magnetic anomalies can be viewed as if centered directly over the bodies that generate them (Figure 6B).

**Figure 6.** (**A**) Total Magnetic Field (TMF) and (**B**) Total Magnetic Field with Reduction to Pole (TMFRP). Dipoles A and B have been marked on each map, as well as the location of different mining activities and areas where the mineralization had been observed on the surface.

On both maps, the presence of two strong magnetic dipoles can be observed (A and B) towards the south-east quadrant of the region of interest. Both dipoles are located on carbonate materials and are found in the immediate surroundings of the old mine, as well as of the areas where the mineralization had originally been detected at the surface.

The TMFRP presents a slight variation in the position of the dipoles with respect to the TMF (Figure 6B), as well as an increase in their intensity, reaching variations in the magnetic field of up to 88 nT (dipole A) and 165 nT (dipole B). The dipoles can be related to the presence of ferromagnetic elements compatible with the paragenesis of the minerals from this locality.

It is important to point out that the reduction to the pole has not removed the dipolar character of the magnetic anomalies, which indicates a remanent magnetization of the materials at this source. This is a product of the natural axis of these dipoles, oriented approximately E-W rather than N-S, as would be expected for the present-day magnetic field. Under this premise, the mineral bodies that generate these dipoles acquired their remanent magnetization at a time when the magnetic field differed from that of the present day.

#### *4.2. Analytical Signal*

The filtered Analytical Signal (AS) allows for the spatial identification of the two sources producing the observed dipoles. The calculation of the AS is based on directional derivatives; the resulting anomalies are organized in a bell shape, with the maxima located directly over the edges of the anomalous bodies and with an amplitude related to the depth of the magnetic source [29].
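For reference, the analytic signal amplitude conventionally used for this purpose combines the three orthogonal gradients of the total field *T* (a standard definition added here for clarity; it is not reproduced in the original text):

$$|AS(x,y)| = \sqrt{\left(\frac{\partial T}{\partial x}\right)^2 + \left(\frac{\partial T}{\partial y}\right)^2 + \left(\frac{\partial T}{\partial z}\right)^2}$$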

The map displaying the computed analytical signal (Figure 7) presents a preferential alignment of anomalous areas with an approximate orientation of N80E, coinciding with the two maxima (A and B, Figure 7) and with the general trend of the main geological contacts and structures, especially the normal fault defined through the geological cartography of the area (Figure 8).

**Figure 7.** Analytical signal map over the aerial RGB image of the area of Don Jacobo with the position of the main mining works and outcropping mineral shows. Note the alignment of the analytical signal anomalies with approximate N80E orientation (white dashed line) and with the two AS maxima (A and B), as well as the position of the mining works and the outcropping mineral shows close to the main anomalies.

It is important to point out that all the mining works and outcropping mineral shows surround the northern part of the analytical signal anomalies, all of them situated in the northern block of the normal fault.

#### *4.3. 3D Inversion Model*

The creation of a uniform grid with a high density of drone-acquired information allows for the creation of 3D models of magnetic susceptibility by applying the Magnetic Vector Inversion (MVI) technique [30]. The model, generated using the VOXI Earth Modeling software of OASIS Montaj, is made up of a 117 × 69 × 82 data mesh, for a total of 661,986 cells of 10 × 10 m size. This model facilitated the spatial definition of the magnetic bodies of interest. The method also allows for the characterization of these bodies by computing, for each block, magnetization vectors that contain information about both directionality and intensity.

**Figure 8.** Analytical Signal anomalies over the aerial RGB image and normal fault trace from the 1:5000 geological cartography (Figure 1C) of the area of Don Jacobo. Note the alignment of analytical signal anomalies parallel to the normal fault placed at the south of the survey area.

The results from this modeling (Figure 9) reveal the presence of two bodies of high magnetic susceptibility (A and B), presenting susceptibility values over 12 × 10<sup>−4</sup>, with peak values of 15 × 10<sup>−3</sup>. Both bodies are modelled as presenting an oval morphology, with the maximum magnetization located at 45 m and 60 m depth below the surface (Figure 9A,B). This depth is an estimation, and other methods such as drilling would be required to validate it.

In the case of the magnetization vectors, a preferential W-E orientation is observed, different from the direction of the Earth's present magnetic field. This confirms the presence of a remanent magnetization in the materials encountered here, coinciding with the observations obtained from the TMFRP, and could therefore be related to the presence of mineralized bodies.

Both bodies are located to the south of the mining works and outcropping mineral shows, and just to the north of the normal fault that crosses the southern border of the survey area (Figure 10). This makes it possible to interpret that the fault may have played an important role in the formation of the mineralization, probably as a channel for the circulation of hydrothermal mineralizing fluids.

**Figure 9.** 3D model of the magnetic susceptibility in the area of Don Jacobo. (**A**,**B**) Different views and sections of the two magnetic bodies defined by the model. (**C**,**D**) Visualization of magnetization vectors obtained from each of the studies and their relationship with magnetic susceptibility.

**Figure 10.** 3D model of the magnetic susceptibility of the Don Jacobo area, displaying the position of the two bodies of high magnetic susceptibility, A and B (12 × 10<sup>−4</sup> susceptibility threshold value), over the 1:5000 geological cartography and aerial RGB image of the area of Don Jacobo, together with the normal fault placed at the south of the survey area and the mining works and outcropping mineral shows.

#### **5. Conclusions**

We present the results of a drone magnetometry survey for mineral exploration over the Don Jacobo mining area. We demonstrate the utility and advantages of drone magnetometry in the study of mineral deposits, allowing for the acquisition of high-quality data in unfavorable conditions where traditional approaches are limited.

The present study detected the presence of two magnetic dipoles with remanent magnetization in the immediate surroundings of old mining activities.

The use of 3D inversion was able to define the morphology and limits of the two potential mineral bodies and to further confirm their relation to the surrounding geological features, such as the normal fault to the south of the region of interest.

Our results strongly support that these dipoles are related to the potential presence of ferromagnetic mineral elements compatible with the copper-cobalt-nickel paragenesis of the Don Jacobo area. The alignment of the analytical signal anomalies parallel and close to a normal fault indicates that this fault played an important role in the formation of the mineralization, probably as a channel for the circulation of hydrothermal mineralizing fluids.

We conclude that the drone magnetic survey method can be an important tool for studying mineralized areas such as the Don Jacobo mine, where it allows the precise modeling and definition of magnetic anomalies and supports the design and development of future investigation activities.

**Author Contributions:** Conceptualization, D.P. and S.A.; methodology, D.P.; software, P.C. and J.C.; validation, D.P.; formal analysis, D.P.; investigation, D.P.; resources, D.P.; data curation, P.C. and J.C.; writing—original draft preparation, D.P. and R.L.G.; writing—review and editing, D.G.-A.; visualization, D.P.; supervision, D.G.-A.; project administration, D.G.-A.; funding acquisition, D.P. and R.L.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not available.

**Acknowledgments:** We would like to acknowledge EXCO MINING SL and GEOLAND SERVICES SL for permitting the publication of this research, as well as for the chance to participate in this research project.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **A Practical Validation of Uncooled Thermal Imagers for Small RPAS**

### **George Leblanc 1,\*, Margaret Kalacska 2, J. Pablo Arroyo-Mora 1, Oliver Lucanus <sup>2</sup> and Andrew Todd <sup>3</sup>**

Received: 20 October 2021; Accepted: 4 November 2021; Published: 6 November 2021

**Abstract:** Uncooled thermal imaging sensors in the LWIR (7.5 μm to 14 μm) have recently been developed for use with small RPAS. This study derives a new thermal imaging validation methodology via the use of a blackbody source (indoors) and real-world field conditions (outdoors). We demonstrate this method with three popular LWIR cameras by DJI (Zenmuse XT-R, Zenmuse XT2, and M2EA) operated from three popular DJI RPAS platforms (Matrice 600 Pro, M300 RTK, and Mavic 2 Enterprise Advanced). Results from the blackbody work show that each camera has a highly linearized response (R<sup>2</sup> > 0.99) in the temperature range 5–40 °C, as well as a small (<2 °C) temperature bias that is less than the stated accuracy of the cameras. Field validation was accomplished by imaging vegetation and concrete targets (outdoors and at night) that were instrumented with surface temperature sensors. Environmental parameters (air temperature, humidity, pressure, wind, and gusting) were measured for several hours prior to imaging data collection and were found to either not be a factor or to remain constant during the ~30 min data collection period. In-field results from imagery at five heights between 10 m and 50 m show that absolute temperature retrievals of the concrete and two vegetation sites were within the specifications of the cameras. The methodology has been developed with consideration of active RPAS operational requirements.

**Keywords:** drone; UAV; UAS; thermal imaging; blackbody; emissivity; thermography

#### **1. Introduction**

With the nearly ubiquitous use of small (under 25 kg) Remotely Piloted Aircraft Systems (RPAS), incredibly versatile, capable, and cost-effective systems are being applied to a diverse span of applications [1–4], acquiring very high-quality (4K) optical video and other sensor data with exceptional stability. Only a few years ago, this ability did not exist or belonged solidly in the realm of high-cost, much larger, and lower-performing systems. For optical cameras and RPAS, it is clear that very small systems (sensor, avionics, and airframe) are now very capable of producing high-quality images and video; but what about the Thermal InfraRed (TIR) and, specifically, TIR Imaging (TIRI) within the Long Wave InfraRed (LWIR) from 7.5 μm to 14.0 μm?

Within less than half a decade, capable small RPAS TIRI systems have been employed in a wide variety of studies, such as forestry [5–7], wildlife surveys [8–11], natural hazards [12–14], urban environments [15–18], archeology [19–21], mining [22–24], and building inspection [25–27]. These works, as well as many others, have aided the overall technology push toward the use of thermography with small RPAS. This, in turn, has driven the demand for more robust, accurate, easier-to-use, and lower-cost sensor systems.


The heart of the new breed of easily accessible TIRI cameras is an uncooled, radiometrically calibrated microbolometer that is able to produce usable data [28–31]. Until relatively recently, TIRI detectors required cryogenic environments (operating temperatures <−150 °C) in order to produce useful data, due to the inherent thermal noise sources associated with environmental and electronics temperatures [32]. However, present-day uncooled TIRI detectors have devised clever ways around this thermal noise barrier, such that the performance of uncooled calibrated TIRI instruments is now sufficient to produce very useful data [28,29,33,34].

With the wide accessibility of low-cost uncooled TIRI cameras and RPAS platforms, their coalescence was, as with optical systems, inevitable. However, unlike optical imagery, the proper collection and use of TIRI require greater user knowledge of TIR, including the behavior of materials, sensors, and calibration [35]. One concern is that many users of TIRI are unaware that the calibration of the instrument is obtained in a highly idealized, laboratory-based environment and that it may change over the course of transport and under real-world conditions.

In the following work, we present the development of, and our approach to, collecting and validating TIRI in real-world field examples. This work relies on the validation of indoor (non-laboratory) blackbody measurements and in-field surface temperature measurements of common target materials. We evaluate three different uncooled, manufacturer radiometrically calibrated TIRI cameras of various ages and abilities, on three different RPAS airframes, under the same environmental conditions. From those data, we analyze and show the abilities of these systems to replicate surface temperature measurements of known sources, and we provide examples of the applicability of uncooled TIRI cameras for RPAS in real-world environments.

Finally, it is important to explicitly state that our objective was to develop a validation that can be performed by most RPAS operators of TIRI systems, either pre- and/or post-campaign, to ensure that their cameras are operating within the specifications of the calibration. We are not seeking to do a field-based calibration of the cameras, since true calibrations are an entirely different process requiring a highly controlled environment for temperature, pressure, humidity, air movement, etc.

#### *Previous Work*

Principally due to the "newness" of the small RPAS TIRI ability, there are relatively few studies in the literature [36–39] that endeavor to address the issue of TIRI sensor calibration or validation. In [36], an in-field (non-laboratory and exposed to the natural environment) blackbody was used as a thermal target within the imagery, with the development of a vicarious calibration method. This work included the environmental influences (e.g., wind, humidity, etc.) as part of an overall correction factor, along with other sensor-specific influences on the data. In a different approach from [36], Ribeiro-Gomes et al. [37] used a blackbody source to characterize their TIRI system before flight and also used a variety of methods, including artificial neural networks, to perform a calibration of the instrument from the blackbody measurements. The results of the analysis of the calibration method, image filtering, and geocorrection improvements were applied to a field-level data collection campaign over a vineyard. This method resulted in increased accuracy with the use of neural networks and in a requirement to use more accurate spectroradiometers for the blackbody calibration process in follow-on work.

Work by [38] using TIRI on RPAS and piloted aircraft produced results from snow, water, and forest canopy as validation targets. The conclusions of this work were that there is a significant component of instrument bias in the resulting TIRI data, as well as difficult spectral mixing conditions at the boundaries of the validation materials. More recently, [39] developed new electrical hardware and a methodology for the calibration of thermographic RPAS cameras within a controlled laboratory environment, providing the calibration function necessary to go from digital numbers to calibrated temperatures.

#### **2. Materials and Methods**

*2.1. RPAS Airframes and Cameras*

Da-Jiang Innovations, Shenzhen, China (most commonly referred to as DJI) is the largest and most popular civilian multirotor VTOL (Vertical Take-Off and Landing) RPAS manufacturer in the world. As of 2019, DJI systems comprised 76.8% of the market in the USA (based on FAA registrations) [40]. As such, we have chosen three DJI RPAS models with TIRI capabilities: the Mavic 2 Enterprise Advanced (hereafter M2EA), the Matrice 600 Pro (hereafter M600P), and the Matrice 300-RTK (hereafter M300) (Figure 1). Table 1 contains general physical [41–43] and costing information for each of the RPAS airframes.

The M2EA airframe is the only one of the three tested here that has the dual TIRI/visible camera and gimbal system form-integrated into its airframe. Therefore, unlike the M600P and the M300 airframes, the M2EA is not suitable for interchanging with other camera systems. Both the M2EA and M300 geotag the thermal images with RTK (Real Time Kinematics) corrected positional information, which allows for accurate (<3 cm horizontal) geopositioning [44]. However, because the M2EA does not use a local base station, an external cellular internet connection and incoming corrections are necessary for RTK to be enabled. While the M600P uses the RTK module in differential mode for flight control, the geotags of the XT-R are based on the basic GNSS (Global Navigation Satellite System) position.

**Figure 1.** RPAS systems and cameras in this study. (**A**) The M600P (left), M2EA (center) and M300 (right). (**B**) the Zenmuse XT-R with gimbal, (**C**) the M2EA form-integrated thermal (upper) and visible (lower) cameras and, (**D**) the Zenmuse XT2 thermal (right) and visible (left) cameras. The cameras are shown in the same order from L-R as the airframes with which they are compatible.


**Table 1.** Physical and costing information on the 3 DJI airframes (unfolded, base-level batteries, no payload configuration for the M600P and M300) used in this work. The costing was accurate as of the date of the data collection (5 July 2021).

\* M2EA camera is form-integrated as part of the airframe. \*\* The M600P used here has the integrated D-RTK upgrade and Max. Take-off Weight updated from [41] by personal communication (K. Toderel, RMUS Canada).

Unlike the M2EA, the M600P and the M300 airframes are able to carry a wide range of payloads, including various TIRI systems and gimbals. This is due to their payload carrying capacity of up to 11.5 kg and 2.7 kg for the M600P and M300, respectively. The M600P and the M300 airframes are also more robust than many similarly classed airframes in terms of general physical presence, flight endurance, and operational envelope. The M300 has an Ingress Protection (IP) rating of IP45, meaning it is protected against solid objects >1.0 mm in diameter and from water coming from any direction.

As this study is focused on TIRI for RPAS, we have selected DJI's M2EA, Zenmuse XT-R, and Zenmuse XT2 cameras for comparison. We distinguish the XT-R (radiometrically calibrated) version from the non-calibrated performance version not tested here. The standard specifications of each camera are shown in Table 2 [41,45,46]. The XT2 is the only camera tested that has an Ingress Protection (IP) rating (IP44), which indicates that it is protected from solids (i.e., dust) and water particles/drops larger than 1 mm. Considering that these cameras are some of the most popular models for RPAS TIRI, the outcome of this comparison is expected to bring some insight into the potential impacts that newer/older, smaller/larger, and expensive/cost-effective technology may have on the quality of the data produced. For the greatest measurement accuracy, all three cameras are recommended by the manufacturer for applications where the emissivity (see Section 2.2.1) of the material under study exceeds 0.9. All three cameras have high and low gain modes; in this study, all data were acquired in high gain mode to increase sensitivity (at a loss of overall usable scene temperature range).


**Table 2.** Standard information on each of the TIRI camera systems.

Both the XT-R and XT2 used here were equipped with non-interchangeable 13 mm lenses. The M2EA's focal length is reported as ~9 mm. All three models have a frame rate of 30 Hz. The reported sensitivity (Noise Equivalent Differential Temperature, NEdT) of the XT-R and XT2 is <50 mK @ f1.0, and <50 mK @ f1.1 for the M2EA (personal communication, K. Toderel, RMUS Canada).

#### *2.2. Blackbody Indoor Camera Validation*

#### 2.2.1. Thermal Radiation

In considering the thermal behavior of a material, it is very useful to invoke the construct of a "blackbody": a theoretically perfect radiator that absorbs all incoming energy (no reflection) and, when in equilibrium, becomes a perfect emitter. Therefore, when in thermal equilibrium, a blackbody is both a perfect absorber and a perfect emitter of incident radiation. An important quality of its surface is that it is perfectly isotropic, emitting and absorbing incident radiation without directional bias.

To understand the blackbody energy relationships of interest to this study, we begin with Planck's blackbody radiation law (Equation (1)), which describes the intensity of the electromagnetic radiation emitted by a blackbody at a given wavelength as a function of its temperature,

$$I_{\lambda} = \frac{2\pi hc^2}{\lambda^5 \left(e^{\frac{hc}{\lambda kT}} - 1\right)}\tag{1}$$

where *I<sub>λ</sub>* is the spectral emissive intensity (W·m<sup>−2</sup>·sr<sup>−1</sup>·μm<sup>−1</sup>) of the radiation emitted by the blackbody at a given wavelength *λ* (in μm), *T* (in kelvin) is the absolute temperature, *h* is Planck's constant (6.626 × 10<sup>−34</sup> J·s), *c* is the speed of light (2.9979 × 10<sup>8</sup> m·s<sup>−1</sup>), and *k* is the Boltzmann constant (1.380649 × 10<sup>−23</sup> J·K<sup>−1</sup>).

In work with electromagnetic sensors and imaging systems, one of the main quantities of interest is the energy at which a system radiates. Therefore, with Equation (1) as a basis, the total emitted radiation (*E*) of a thermal system is well known to be a function of the temperature of the body and is given as:

$$E = \sigma T^4 \tag{2}$$

where *σ* is the Stefan–Boltzmann constant (5.670374 × 10<sup>−8</sup> W·m<sup>−2</sup>·K<sup>−4</sup>) [47]. Equation (2) shows that there is a relatively simple relationship between the temperature of a body and its emitted energy: the energy is proportional (via the Stefan–Boltzmann constant) to the 4th power of the temperature. The important behavior captured by Equation (2) is that a change in temperature within a system (a body) produces a large change (relative to the magnitude of the temperature change) in the energy output of the system.

While Equation (2) is certainly useful in many respects, for this work we are primarily interested in a subset of wavelengths, because imagers have a finite sensitivity over specific wavelength ranges (7.5–14 μm in our case, from Table 2). Fortunately, there is a well-known derivation (Wien's law) from Equation (1) that determines the wavelength at which the maximum radiant energy (*λ<sub>max</sub>*) is produced as a function of temperature, given below as:

$$\lambda_{max} = \frac{b}{T}\tag{3}$$

where *b* is Wien's constant (2.8978 × 10<sup>−3</sup> m·K). We note that while Equation (3) determines the maximum, or peak, wavelength, it must not be forgotten that a continuum of wavelengths is still produced by a body at any temperature *T* above 0 K. This continuum is fully described by Equation (1) and is the blackbody curve for that temperature.
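As a numeric illustration of Equations (1)–(3) using the constants quoted above (our own sketch; the 300 K example temperature is arbitrary):

```python
import numpy as np

h = 6.626e-34        # Planck's constant (J*s)
c = 2.9979e8         # speed of light (m/s)
k = 1.380649e-23     # Boltzmann constant (J/K)
sigma = 5.670374e-8  # Stefan-Boltzmann constant (W*m^-2*K^-4)
b = 2.8978e-3        # Wien's displacement constant (m*K)

T = 300.0  # a ~27 C surface, typical of field targets

# Equation (3): peak wavelength -- ~9.66 um, inside the 7.5-14 um LWIR band
lam_max_um = b / T * 1e6
# Equation (2): total emitted radiation of a blackbody at T (~459 W/m^2)
E = sigma * T**4
# Equation (1): spectral intensity evaluated at the peak wavelength
lam = lam_max_um * 1e-6
I_lam = 2 * np.pi * h * c**2 / (lam**5 * (np.exp(h * c / (lam * k * T)) - 1))

print(f"peak wavelength: {lam_max_um:.2f} um, total emission: {E:.0f} W/m^2")
```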

Although the construct of a perfectly emissive radiator (blackbody) is well-defined, real-world materials deviate from this perfect behavior by emitting less energy than that predicted by blackbody theory. This deviation is captured by the emissivity coefficient (*ε*) and is quantified as the ratio of the energy emitted from a material's surface (*Em*) to that emitted by a blackbody (*Eb*):

$$\varepsilon = \frac{E_m}{E_b}.\tag{4}$$

Emissivity is a unitless characteristic of real-world materials that depends on several factors, including wavelength, temperature, material composition, and surface characteristics (roughness, angle, and direction). It is the most important factor of a surface affecting the amount of energy radiating from it [48]. The closer the value of *ε* is to 1, the more closely that material's thermal behavior approximates that of a blackbody. Accordingly, objects with high *ε* absorb and radiate large amounts of energy, while those with low *ε*, such as most metals, absorb and radiate little energy but are highly reflective. Therefore, in order to retrieve accurate surface temperatures of real-world materials, it is fundamentally important to correct for the *ε* of the materials. In general, when collecting raw TIRI data, the radiative temperature is often acquired and expressed as Brightness Temperature (BT). The BT represents the temperature that a blackbody would need to be at in order to generate the observed radiance at the given wavelength (as *ε* = 1 in this case). It is important to note that BT produces incorrect surface temperatures for real materials; therefore, it is necessary to apply the correct emissivity value (from Equation (4)) in order to determine the true surface temperature of a material.
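To first order, combining Equations (2) and (4) and neglecting reflected and atmospheric contributions gives the familiar correction from brightness temperature to surface temperature (a simplified relation we add for illustration; practical software such as the tools used below also accounts for reflected temperature, distance, and humidity):

$$\varepsilon\,\sigma T_{s}^{4} = \sigma T_{BT}^{4} \quad\Rightarrow\quad T_{s} = \varepsilon^{-1/4}\,T_{BT}$$

For example, a grass target (*ε* ≈ 0.98) reading a BT of 290 K corresponds to a surface temperature of roughly 291.5 K.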

While emissivity is certainly important, there are also energy contributions from a number of other factors that need to be taken into account in typical RPAS TIRI estimations of surface temperature (Figure 2). As an example, sky/cloud contributions depend heavily on conditions (i.e., percent cloud cover and cloud type), with clear skies contributing the least. In contrast, thick cumulus clouds contribute considerably more energy to the measurement [49].

**Figure 2.** Illustration of the various sources of thermal infrared radiation to be considered with RPAS estimations of surface temperature, in the absence of nearby objects that would radiate energy into the scene. The arrows are representative of the relative contribution of each source. Three materials (grass, concrete, InfraGold) with a range of emissivity values considered in this study are shown. Atmospheric transmission decreases with increased distance between the TIRI sensor and the target. Atmospheric constituents such as water, smoke, dust, etc. all influence thermal transmission, with increasing significance at greater distances and/or as the concentration of the constituents increases.

#### 2.2.2. Indoor Camera Validation—Blackbody Radiator

While it is often the case that small RPAS radiometric TIRI cameras come with a factory calibration certification, it is important (and often necessary in practice) to validate whether the calibration remains accurate at or near the time of the survey. The two primary reasons are that (1) physical jarring during transport can potentially cause the unit to fall out of calibration, and (2) not all camera systems are created equal, with quality differing greatly between them. The validation of TIRI systems involves the use of a stable, calibrated blackbody source. In this study, we used the FLUKE 4180 Calibrator (Fluke Corporation, Everett, WA, USA) [50] as the target source for the validation. The FLUKE 4180 is a well-known portable blackbody operating within the −15 °C to 120 °C temperature range, with a large 15.24 cm diameter target radiator. Over the temperature range of this work, the calibration error was 0.4 °C (obtained via the instrument's calibration certificate) at each temperature measured. The unit has a stability of +/−0.05 °C and +/−0.10 °C at 0 °C and 120 °C, respectively. Over the central 12.7 cm diameter of the target, the radiator has a uniformity of +/−0.1 °C and +/−0.20 °C at 0 °C and 120 °C, respectively. The unit has a nominal emissivity of 0.95 but is applicable over a range of emissivities (0.9 to 1.0) via thermometer emissivity compensation. As it was calibrated for *ε* = 0.95, all the imaging measurements in this work used the 0.95 nominal emissivity value. As per the instructions of the manufacturer, the unit requires 10 min of settling time after reaching the desired testing temperature to ensure the stability of the target radiator. Figure 3 shows the set-up of the indoor validation exercise.

**Figure 3.** Validation set-up, typical of a workspace pre/post RPAS operations: the tripod (**A**) carrying the environmental sensors for air temperature, RH, pressure, and air movement; the FLUKE 4180 blackbody (**B**); and the M600P airframe with the XT-R camera (**C**).

The motivation of this validation process was to set the blackbody source at a distance from the cameras greater than the minimum focal distance of the camera (7.6 cm for the XT-R and XT2, and ~3.2 cm for the M2EA [41,45,46]), then set the temperature of the blackbody and let it settle to ensure uniformity. After settling, the blackbody source was imaged with the central pixels of the camera; the process then moved to the next temperature, with settling and imaging repeated as necessary for the temperature range expected of the phenomena under investigation. Then, to ensure that the instrument and related software properly account for distance, we doubled the distance and re-imaged at the same temperatures. In this process, we used the central pixels, as they are often (but not always) the pixels of concern when investigating a phenomenon. This method is, of course, not entirely complete, as the non-central pixels do not image the blackbody; however, we are proposing a realistic method to field-validate RPAS imagers. In order to fully validate the imager, each pixel of the imaging camera should image the central area of the blackbody target. Since we are interested in field-applicable methods, and our imager is 640 × 512 pixels, it is not at all practical (for RPAS operations) to test more than the central pixels.

We set up air temperature, pressure, humidity, and air speed monitors within the indoor space using the HOBO™ (Honest Observer By Onset) smart sensor system (Onset Computer Corporation, Bourne, MA, USA). The HOBO system is a well-known and widely used suite of field-proofed environmental measurement instruments [51–54]. During the entire exercise there was no measurable flow of air, which was not unexpected, as we performed this work indoors with no forced air circulation. However, it was necessary to ensure that this was the case.

Each RPAS camera was positioned at a 2 m distance from the blackbody and normal (90°) to the center of the face of the radiator target, so that any angular dependence of the measurement was fundamentally restricted to the direction perpendicular to the measured surface. We then allowed the blackbody to come to the first temperature (5 °C) and, once at that temperature, let it stabilize for 10 min. With each camera in succession, the blackbody radiator target was imaged with the central portion of the camera's FOV. The blackbody temperature remained constant at the set point while the camera was moved to 4 m from the blackbody target, and each camera then re-imaged the target. The use of two distances (2 m and 4 m) was an exercise to determine whether there were considerable differences in measured temperature as a function of distance, even though each camera has a small focal length of 9 mm (M2EA) or 13 mm (XT-R and XT2). Once all cameras had imaged the blackbody target at both distances for a given temperature, the temperature was increased to the next setting and the process was repeated. We selected eight uniformly spaced temperature settings (5, 10, 15, 20, 25, 30, 35, and 40 °C), over a range of 35 °C, that would replicate the air and target temperatures expected in real-world environmental applications. This process also ensures the evaluation of camera performance near to, and beyond, the temperatures of interest to the application. In our case, the application temperatures of interest were from 15 °C to 25 °C.

The indoor blackbody validation data were acquired over a period of 2.5 h after set-up, with the majority of that time dedicated to allowing the blackbody target radiator to settle after reaching each testing temperature. Once the temperatures were recorded by the TIRI cameras, the images were processed in FLIR Thermal Studio (Teledyne FLIR, Wilsonville, OR, USA) to account for emissivity, distance to source, air temperature, humidity, and optics temperature. As the images from the M2EA cannot be read directly by FLIR Thermal Studio, they were first converted to a standard FLIR radiometric JPG with ThermoConverter (Aethea, London, UK).
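The analysis of such validation data then reduces to a linear fit of retrieved against set-point temperatures; below is a minimal sketch (with made-up example readings, not our measured data) of the slope, R², and mean bias computation this involves:

```python
import numpy as np

# Blackbody set points (deg C) and hypothetical camera-retrieved temperatures.
setpoints = np.array([5, 10, 15, 20, 25, 30, 35, 40], dtype=float)
retrieved = np.array([5.8, 10.9, 15.7, 20.8, 25.6, 30.9, 35.8, 40.7])

slope, intercept = np.polyfit(setpoints, retrieved, 1)  # linearity of response
residuals = retrieved - (slope * setpoints + intercept)
r2 = 1 - residuals.var() / retrieved.var()              # coefficient of determination
bias = (retrieved - setpoints).mean()                   # mean temperature bias

print(f"slope={slope:.3f}, intercept={intercept:.2f} C, R^2={r2:.4f}, bias={bias:.2f} C")
```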

#### *2.3. Outdoor Field Trial*

#### 2.3.1. Study Site Set-up

The field trial took place at a private RPAS test site near Vaudreuil-Dorion, QC, Canada (45°26′23″ N, 74°13′05″ W). The extensively instrumented ground site is shown in Figure 4. The ground site set-up and instrument distribution included: two differential RTK base stations supporting the M300 and M600P, the HOBO environmental measurement station, vegetation validation targets, an InfraGold target panel, and concrete slab validation targets. The RTK base stations were used during the trial to replicate flight conditions appropriate to larger-area TIRI acquisitions.

The HOBO environmental monitoring instrumentation (Figure 5) measured the following (with model numbers): wind speed and wind gust with an anemometer (S-WSA-M003); air temperature (*Tair*) and relative humidity (RH) (S-THB-M002) under solar shielding (RS3); air pressure (S-BPB-CM50); two in-canopy vegetation temperatures (S-TMB-M006); and two soil temperatures (S-TMB-M002). The electronics and data collection units (H21-USB) were stored under the tripod. *Tair* and RH are integrated into one unit (S-THB-M002), which was placed under the solar shield (RS3) in order to obtain accurate *Tair* and RH measurements free from the effects of direct solar radiative heating [51].

While *Tair* has a large impact on surface temperature, wind speed has been shown to induce substantial localized cooling, depending on its value over a sustained period of hours. Wind-induced surface cooling arises as a case of forced convective cooling and is proportional to the temperature differential between the air in motion and the surface [55]. Several studies using TIRI have shown and measured this effect on building surfaces [56,57] and on bare ground above heating pipe systems [58]. In general, these studies show that with wind speeds below 2 m/s the effect is negligible, or at least at the limit of the resolution of the TIRI data. The work of [58] remains one of the few examples of TIRI-derived temperature measurement as a function of increasing wind speed. In [58], TIRI taken above buried heating pipes showed that winds below ~2 m/s did affect the measurements, but the magnitude of this effect (~+/−0.2 °C) is less than the error in the TIRI data. In more recent work, [56] found that for many building surfaces, winds of up to 5 m/s did not significantly alter the values obtained via a TIRI survey. In an even more recent work, [57] generalized wind-pulsation (gusting) induced changes in TIRI data as being "not critical" for speeds of up to 2 m/s. Sustained wind over the several hours before data collection did have an impact on surface temperature and should be avoided. As a result of this known wind-induced cooling, the majority of past studies identified that it is often necessary to install the environmental monitoring equipment (Figure 5) at least several hours before TIRI data collection.
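The forced convective cooling described above is commonly expressed through Newton's law of cooling, in which the heat flux *q* scales with the surface-to-air temperature differential through a convective coefficient *h<sub>c</sub>* that increases with wind speed (a textbook relation added here for illustration, not a formula from the cited studies):

$$q = h_c\left(T_{surface} - T_{air}\right)$$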

**Figure 4.** Field site near Vaudreuil-Dorion, QC (top panel), with the light blue/green box representing the area of the investigation and the instrumented ground site location. Ground site set-up (lower panels) including: differential RTK base station for the M300 (**A**) and for the M600P (**B**), environmental monitoring station and vegetation validation site (**C**), the InfraGold dark sky diffuse reflectance panel (**D**) with foil protective covering that was removed before data collection, and the concrete slab target (**E**) with three thermal flux sensors (used for deriving temperature) indicated by the red arrow in the lower left panel. Photographs were taken during the day for clarity. The orange circular feature in the lower left panel (with an "H") shows the launch area for the RPAS vehicles.

**Figure 5.** Environmental monitoring station and vegetation target from Figure 4. The photograph shows the emplacement of the anemometer (**A**) at 1.0 m height, the *Tair* and RH (one unit) under a solar radiative shield (**B**), air pressure (**C**) and, control electronics and data collection units (**D**). Also shown are the vegetation/soil target areas of T1/T2 (**E**) and T3/T4 (**F**) at 1.0 m distance from the center of the tripod location. Soil temperature probes (T2 and T4) were implanted at a depth of 5 cm directly below the in-canopy temperature sensors (T1 and T3).

Figure 6 shows the three targets of interest for this study: the InfraGold panel, the concrete slabs, and the in-canopy vegetation and soil targets. The diffuse InfraGold panel (25.5 cm × 25.5 cm) is used only to determine whether there is a significant contribution to the signal from the downwelling irradiance of the open sky [49]. Having an *ε* = 0.06 means that it is highly reflective; as it faces the sky, the conditions it presents to the imager are those of the sky, with negligible contribution from its own emission (i.e., less than the detection limit of the TIRI camera) at most ambient air temperatures at which RPAS TIRI is carried out.

The patio stone concrete targets (Figure 6B) had been in place for several years prior to the experiment and were therefore in good contact with the underlying soil. The surfaces of the concrete slabs were weathered. Three FluxTeq™ PHFS-01e 1 × 1" standard heat flux sensors (FluxTeq, Blacksburg, VA, USA) were connected to a high-resolution, 8-channel thermocouple data acquisition device (TC-DAQ) logging to a laptop. The PHFS-01e has a nominal sensitivity of 9.0 mV/(W/cm<sup>2</sup>) and a specific thermal resistance of 0.9 K/(kW/m<sup>2</sup>) [59]. The heat flux sensors were adhered to the concrete using Arctic MX-4-4G thermal compound paste (carbon micro-particle based) (Arctic GmbH, Braunschweig, Germany). One of the three flux sensors did not properly adhere to the surface of the concrete, and those data were therefore ignored in this study.

**Figure 6.** Target materials used for comparison. (**A**) InfraGold panel atop a plastic storage box, used as a platform to keep the surface debris-free. (**B**) Concrete patio stones with three heat flux sensors (black squares on the concrete slab, as also indicated by label E in Figure 4) from which temperature was derived. The left-most sensor did not stay adhered to the surface and those data were therefore not used. (**C**) In-canopy temperature sensor T1 (indicated by the red arrow) within the living and detritus vegetation canopy. The in-canopy temperature sensors (T1 and T3) were positioned at a height of approximately 2–3 cm above the soil.

The soil and in-canopy temperature sensors were configured to work in pairs, where the soil temperature probes were placed directly below the in-canopy temperature probes at a depth of 5 cm from the surface. In our set-up, shown in Figure 5E and in detail in Figure 6C, the T1 and T2 sensor pair was located 1 m north of the middle of the anemometer tripod. The in-canopy sensor (T1), located by the red arrow in Figure 6C, was held in place by a nylon cord attached to wooden stakes driven into the ground approximately 5 cm on either side of the sensor. This was necessary as curious field animals have a tendency to remove these sensors when left in place overnight. The other pair of in-canopy and soil temperature sensors, respectively T3 and T4, was installed in the same manner, 1 m south of the anemometer tripod (location F in Figure 5). The vegetation target area is composed of a heterogeneous mix of grasses, white clover, dandelion and detritus, as well as void space. The overall result is that the soil is fully overlaid by the living vegetation canopy and its detritus. At the time of the TIRI measurements, the height of vegetation was ~10 cm. The soil and in-canopy sensors were installed and logged continuously for 12 days prior to the acquisition of the RPAS TIRI images.

#### 2.3.2. RPAS TIRI Acquisition

The TIRI RPAS measurements were made after twilight (21:25 EST) on 5 July 2021 to ensure that solar irradiance would not influence the measurements. For much TIRI work, it is highly desirable to work after twilight in order to eliminate the potentially large non-linear differential solar contribution to the measured signal (Holtz 2000). For the TIRI acquisition, starting at 10 m height above the study area, the first RPAS airframe hovered over the targets and acquired a thermal image, manually triggered by the operator. It then ascended to 20 m and repeated the acquisition. In this manner, all three airframes completed the acquisition of the TIRI at heights of 10, 20, 30, 40 and 50 m. The multi-altitude data set was acquired over a period of ~30 min. Even though each of the thermal cameras carries out a flat field correction (FFC) at power up and periodically during use, a supplemental FFC was triggered by the operator prior to the acquisition of each image. The FFC compensates for errors that occur during operation, such as those induced by temperature change at altitude. The images were processed in FLIR Thermal Studio to account for height (i.e., distance from target), atmospheric temperature, relative humidity and reflected temperature. The external optics temperature was set to *Tair* recorded at the time of acquisition. The images were processed with three different emissivity values: 1 for BT, 0.98 for grass and 0.95 for concrete. As with the blackbody radiator experiment, the M2EA images from this test were also first converted to a standard FLIR radiometric jpg with ThermoConverter.

In this work, we have used the value of *ε* = 0.98 for vegetation, which is based upon [60] (as cited in [61]), who suggested this value for a general canopy emissivity. The value is further supported by [62], who described grass emissivities for complete and partial canopy covers ranging from 0.956 to 0.986. More recently, [16] used *ε* = 0.979 for grasses and *ε* = 0.977 for canopy cover, and [61] used *ε* = 0.98 as the vegetation canopy emissivity in non-arid environments. The emissivity value used for concrete (grey weathered, rough surface) was obtained from [63], where a large number of urban materials have undergone emissivity determination within the 8–14 μm range.
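For readers reproducing this type of processing outside FLIR Thermal Studio, the sketch below illustrates a simplified single-band radiometric model of the kind such corrections are commonly built on (atmospheric transmission is neglected, which is a reasonable assumption at these short ranges): the measured band radiance is modeled as *L*meas = *ε*·*L*(*T*s) + (1 − *ε*)·*L*(*T*refl), and the surface temperature is recovered by numerical inversion. This is a minimal sketch under our own assumptions and function names, not the vendor's algorithm.

```python
import numpy as np

H, C, KB = 6.626e-34, 2.998e8, 1.381e-23  # Planck, speed of light, Boltzmann

def band_radiance(t_kelvin, lam_lo=7.5e-6, lam_hi=14e-6, n=500):
    """Planck spectral radiance integrated over the LWIR band (W/m^2/sr)."""
    lam = np.linspace(lam_lo, lam_hi, n)
    b = 2 * H * C**2 / lam**5 / np.expm1(H * C / (lam * KB * t_kelvin))
    return float(np.sum(b) * (lam[1] - lam[0]))  # simple rectangle-rule integral

def surface_temperature(l_meas, eps, t_refl_k):
    """Invert L_meas = eps*L(T_s) + (1 - eps)*L(T_refl) for T_s by bisection."""
    l_target = (l_meas - (1 - eps) * band_radiance(t_refl_k)) / eps
    lo, hi = 200.0, 400.0  # bracketing temperatures in kelvin
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if band_radiance(mid) < l_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: a grass pixel (eps = 0.98) reflecting a cold sky (-25 C = 248.15 K)
l_px = 0.98 * band_radiance(292.65) + 0.02 * band_radiance(248.15)
print(surface_temperature(l_px, 0.98, 248.15) - 273.15)  # recovers ~19.5 C
```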

#### **3. Results**

#### *3.1. Indoor Blackbody Validation*

During the indoor trial, the environmental parameters were as follows: *Tair* averaged 20.6 °C (range 20.2–21.0 °C), RH averaged 70.9% (range 69–71.7%) and barometric pressure averaged 1009.2 mbar (range 1008.5–1009.9 mbar). For the purposes of this work, we used the average *Tair*, pressure and RH for the calculation of the environmental temperature in the vicinity of the blackbody by each camera. The imagery obtained from each camera showed that the blackbody radiator target was well identified at each of the eight temperature settings. Examples of the imagery are shown in Figure 7 for the XT2 camera at 2 m and 4 m distance, for 5 °C and 40 °C, respectively. The pixels in the central portion of the blackbody radiator (all within the 12.7 cm diameter of the central area of the radiator) were used in the subsequent analysis of the temperature measurements. The portion of the image outside the area of the circular blackbody radiator exhibits less detail (lower contrast), as the material within the room was at or near the average indoor ambient temperature of 20.6 °C. A total of 830–950 pixels comprised the area extracted for analysis at 2 m, and 220–230 pixels at the 4 m distance.

**Figure 7.** Example results at different distances and temperatures of imaging the variable temperature blackbody with the Zenmuse XT2. (**A**) 2 m distance and a blackbody temperature of 5 °C, and (**B**) 4 m distance with 40 °C blackbody temperature. The lack of contrast in the objects outside of the circular blackbody radiator (in the center of the images) is due to the objects' temperatures being nearly at the ambient room temperature of 20.6 °C.

The measured results for each camera at all temperatures and distances are shown in Figure 8, along with the best fit lines and uncertainties related to the measurements. The data used for Figure 8 are provided in Table A1 in Appendix A. The error bars shown for each plot arise from the range (minimum and maximum values of the central pixels) of the measurements themselves. Also presented in each plot is the corresponding 1:1 temperature line (black solid), which represents the theoretically perfect blackbody target temperature, and the uncertainty (black dashed lines) corresponding to the ±0.4 °C uncertainty value supplied by the calibration certificate. For each data set shown in Figure 8, statistical analytics describing the linearity of the response and the deviation from the blackbody source are shown in Table 3.

**Figure 8.** Results of the indoor validation with error bars derived from blackbody errors. (**A**) M2EA at 2 m, (**B**) M2EA at 4 m, (**C**) XT-R at 2 m, (**D**) XT-R at 4 m, (**E**) XT2 at 2 m, (**F**) XT2 at 4 m. The red line is the best fit line between the TIRI calculated temperature and the blackbody set temperature. The red error bars illustrate the uncertainty of the FLUKE 4180 (x-axis) and the minimum and maximum calculated TIRI temperature (y-axis). The black line is the 1:1 blackbody temperature line, with the dashed lines on either side illustrating the 0.4 °C FLUKE 4180 temperature uncertainty.

**Table 3.** Results from the indoor validation exercise for the M2EA, XT-R and XT2 cameras at 2 m and 4 m distance from the blackbody radiator over the 5 °C to 40 °C range. The coefficient of determination (R²) is a measure of the variability explained by the linear regressions, the root mean square error (RMSE) is a measure of the difference between predicted and observed temperatures, and the bias is a measure of the over- or under-prediction of temperature by the cameras. SD is the standard deviation of the temperature recorded by the cameras.


Overall, Figure 8 shows that each camera is highly linear over the temperature testing range of 5 °C to 40 °C. The linear relationship for each camera shown in Figure 8 is reflected, numerically, in the results of Table 3, with each best fit line having an R² > 0.99. The 95% confidence intervals of the slope for the M2EA at both distances are the only ones that do not span 1.0 (i.e., the slope of the 1:1 line), substantiating the larger underestimation of the blackbody temperatures of 5 °C and 10 °C seen in Figure 8A,B. Over the temperature range investigated with the blackbody (Δ*T* of 35 °C), the M2EA had the greatest measured temperature difference (36.2 °C at 2 m and 39.3 °C at 4 m). In contrast, the XT2 had the least difference, with a Δ*T* of 35.8 °C and 35.6 °C at 2 m and 4 m, respectively. The standard deviation for the measurements at each temperature was less than 0.15 °C for all cameras, indicating consistency in pixel values across the imaged surface of the blackbody (Table A1).

The analysis results for each camera at each distance are summarized in Table 3 and indicate that all cameras, in general, underestimated the blackbody temperatures. At the 2 m distance, the greatest deviations in the measured temperatures were −3.1 °C and −2.2 °C for the M2EA at blackbody temperatures of 5 °C and 10 °C, respectively. At the 4 m distance, the XT-R camera had the greatest deviations from the blackbody, with −2.9 °C and −1.9 °C at 40 °C and 5 °C, respectively (apparent in Figure 8 as well). The largest RMSE values at both distances are seen for the XT-R (0.94 °C and 1.12 °C), which also presents the greatest bias at both distances (−1.00 °C and −1.15 °C) (Table 3, Figure 8). Consistently at both distances, the XT2 has the lowest bias (−0.21 °C and −0.23 °C). While its RMSE values were not as low as those of the M2EA (<0.4 °C), they were still less than 0.7 °C for both distances.
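As a worked illustration of the Table 3 statistics, the snippet below computes the slope, R², RMSE, and bias for one camera/distance series; the camera readings here are hypothetical placeholders, not the measured data from Table A1.

```python
import numpy as np

t_set = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0])  # blackbody set points (C)
t_cam = np.array([3.5, 8.9, 14.2, 19.4, 24.5, 29.6, 34.6, 39.5])   # hypothetical camera means (C)

slope, intercept = np.polyfit(t_set, t_cam, 1)            # best fit line
pred = slope * t_set + intercept
r2 = 1 - np.sum((t_cam - pred)**2) / np.sum((t_cam - t_cam.mean())**2)
rmse = np.sqrt(np.mean((t_cam - t_set)**2))               # deviation from the 1:1 line
bias = np.mean(t_cam - t_set)                             # negative => underestimation

print(f"slope = {slope:.3f}, R2 = {r2:.4f}, RMSE = {rmse:.2f} C, bias = {bias:.2f} C")
```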

#### *3.2. Outdoor Field Trial*

#### Environmental Conditions

As previously noted, when planning TIRI work, it is imperative to collect environmental data (*Tair*, wind and gust speed, RH, and air pressure) for several hours prior to taking thermal measurements. In our work, the results of these environmental measurements for the two hours prior to the end of the RPAS TIRI collection are shown in Figure 9 (along with the start time of the RPAS survey). On the day of the survey, twilight occurred at 21:25 EST.

Figure 9A shows the temperature profiles of each sensor that measured the vegetation (T1 and T3) and soil (T2 and T4), as well as that of *Tair*, for direct comparison. It is clear that the soil temperatures (varying by 1.0 °C) do not experience the larger fluctuations evident in the vegetation (3.3 °C) or *Tair* (2.1 °C). The soil profiles (T2 and T4) are also very similar in value, having a maximum difference of 0.07 °C (well within the ±0.2 °C accuracy of the sensors); they also show a constant cooling slope over the first 90 min and again over the last half hour of the trial period (~2 h). The small change in cooling slope for the soil is likely due to the delayed effects of the increasing *Tair* and vegetation temperatures. A similar relationship exists for the vegetation sensors (T1 and T3), which mimic each other's profiles; however, there are differences that amount to ~0.7 °C (nearly 10 times that of the soil profiles), which is larger than the error of the sensors and is therefore real, but still quite small. T1 and T3 also mimic *Tair* (with some offset in time and a larger offset in temperature), but to a lesser degree. It is also evident that *Tair* has the most influence on the variability of T1 and T3; moreover, during the time of the RPAS survey (shown in Figure 9A), *Tair*, T1 and T3 experience ~0.5 °C of variability, which we consider as remaining essentially constant for the duration of the survey.

**Figure 9.** Two hours (of the 12 days of continuous logging) of environmental parameter data from 5 July, 21:00 to 23:00. (**A**) Temperature data. (**B**) Environmental data: wind, RH, and barometric pressure. RPAS data collection started at minute 94 and ceased after minute 116. The blocky nature of the barometric signal is a result of resolution effects near the threshold level (0.1 mbar) of the instrument.

For the environmental factors in Figure 9B, the RPAS TIRI data collection period shows small amounts of variability, with barometric pressure varying by 0.2 mbar and RH by 5.9%. Therefore, we treat these parameters as essentially constant for the purposes of calculating source temperatures from TIRI. In this study, the wind and gust averages for the 2 h prior to the end of the trial were 0.01 m/s and 0.03 m/s, respectively—well below the 2.0 m/s value where they would start to influence the surface temperature measurements. Moreover, the maximum gust speed obtained was 2.0 m/s for a single point anomaly (at time point 108 in Figure 9B). Under these conditions, we can confidently say that the wind had no effect on the targets of interest.

While the environmental parameters in this study have been shown to be essentially constant or insignificant, it is important to measure these variables in situ [14], as it could easily have been the case that the conditions exceeded the bounds of influence for TIRI temperature retrieval. Furthermore, the variables of humidity and *Tair*, along with an assumption of constant and near-standard air pressure (1000 mbar), are required input values (or assumptions) for the radiometric correction software.

#### *3.3. TIRI*

#### 3.3.1. Brightness Temperature

As previously discussed, BT is the collection of TIR measurements with *ε* = 1, which assumes all radiating surfaces in the scene are blackbodies. While BT is not an accurate measure of surface temperature for real materials, it is useful for comparing values across the scene and for quickly identifying changes in the thermal properties of the area under investigation. The relative differences within the imagery can provide a "quick view" of the potential areas of interest. As an example, Figure 10 shows the results of collecting TIRI at 20 m and 50 m height with each camera in this study. It is clear from the 20 m data that all the features we are interested in (i.e., the InfraGold panel, vegetation and the concrete patio stones) are identifiable, while the larger area of the 50 m data also shows a curious high temperature linear feature (shown by the red arrow) cross-cutting the direction of the dirt road (the brightest feature in the imagery from all three cameras). This feature was later investigated and found to be a buried drainage pipe under the roadway.

#### 3.3.2. InfraGold Panel

To reiterate, the InfraGold panel (*ε* = 0.06) is used in this study as an indicating tool to determine whether the thermal contribution from the sky is significantly contributing to the TIRI data. Processing the InfraGold panel in FLIR Thermal Studio (at *ε* = 0.06) produced open sky average temperatures of <−60.13 °C. This value constitutes the lowest temperature that FLIR Thermal Studio could produce; however, the lower limit for all three cameras in high gain mode is −25 °C. Therefore, we can only state that the reflected sky temperature from the panel is <−25 °C (as shown in Figure 11), indicating a negligible downwelling contribution. From Equations (1) and (3), a <−25 °C source corresponds to an intensity of <3.85 W/m²/sr/μm at a peak wavelength of ~11.678 μm for the panel surface. In comparison, the average temperature of the vegetation scene, 19.5 °C, gives, through Equations (1) and (3), an energy of ~8.79 W/m²/sr/μm at a peak wavelength of ~9.9 μm. In addition, because the 11.678 μm peak is near the edge of the sensors' measurable response, the effects of the downwelling irradiance do not appreciably impact this work.
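The quoted intensities can be reproduced from Planck's law and Wien's displacement law, which, based on the values above, is what Equations (1) and (3) implement; a minimal check:

```python
import numpy as np

H, C, KB = 6.626e-34, 2.998e8, 1.381e-23
WIEN_B = 2.898e-3  # Wien displacement constant (m K)

def planck(lam_m, t_k):
    """Spectral radiance at wavelength lam_m (m), in W/m^2/sr/um."""
    b = 2 * H * C**2 / lam_m**5 / np.expm1(H * C / (lam_m * KB * t_k))
    return b * 1e-6  # convert from per metre to per micrometre

for t_c in (-25.0, 19.5):
    t_k = t_c + 273.15
    lam_peak = WIEN_B / t_k  # Wien's displacement law
    print(f"T = {t_c:6.1f} C: peak {lam_peak*1e6:.2f} um, "
          f"radiance {planck(lam_peak, t_k):.2f} W/m2/sr/um")
# -> ~11.68 um / ~3.86 and ~9.90 um / ~8.80, matching the values quoted above
```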

#### 3.3.3. Concrete Patio Stone Target

Figure 12 provides an example of the results of imaging the concrete target with the XT-R camera at 20 m altitude with an emissivity of 0.95. Because of the 41 cm width of the concrete tiles, only the 10–30 m altitude images were included in the analysis. Considering the largest pixel pitch (i.e., 17 μm for the XT-R and XT2) and the 13 mm lens, the sensor/target separation can be at most 30 m in order to retain the recommended minimum region of interest size of 10 pixels diameter [41,45,46], which ensures accurate temperature retrievals. With fewer pixels, the spot size effect results in lower accuracy measurements due to contamination (spectral mixing) from neighboring materials.
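The 30 m ceiling follows from simple pinhole geometry; the short sketch below reproduces the calculation from the values quoted in the text (the 10-pixel minimum ROI is the recommendation of [41,45,46]):

```python
# Ground sample distance (GSD) and the 10-pixel region of interest per height.
pixel_pitch = 17e-6   # m, detector pitch of the XT-R and XT2
focal_length = 13e-3  # m, lens focal length
tile_width = 0.41     # m, concrete tile width
min_roi_px = 10       # recommended minimum ROI diameter in pixels

for height_m in (10, 20, 30, 40, 50):
    gsd = pixel_pitch * height_m / focal_length   # metres per pixel on the ground
    roi = min_roi_px * gsd                        # ground footprint of a 10-px ROI
    fits = "fits within" if roi <= tile_width else "exceeds"
    print(f"{height_m} m: GSD = {gsd*100:.1f} cm, 10-px ROI = {roi*100:.0f} cm "
          f"({fits} the 41 cm tile)")
# At 30 m the 10-pixel ROI spans ~39 cm; at 40 m and above it exceeds the tile.
```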

**Figure 10.** Brightness temperature (BT) in kelvin for all TIRI RPAS systems in this study at 20 m (left column of images) and 50 m (right column of images) with the M2EA (**top**), XT-R (**middle**) and XT2 (**lower**) cameras. The red arrow in the 50 m column indicates the location of a buried drainage pipe.

**Figure 11.** Thermal imaging results at 10 m AGL with the M2EA camera of the InfraGold panel (dark square in the center of the image). Evaluation of the panel area with *ε* = 0.06 produces a sky reflected temperature <−25 °C, below the lower limit of the camera in high gain mode.

**Figure 12.** Subset of a TIRI from the XT-R camera over the concrete target from 20 m height. In this version of the image, only the concrete pixels provide valid temperature measurements due to the scene being processed with ε = 0.95.

The results of applying the emissivity corrections for concrete to the TIRI data from each camera are shown in Figure 13. The contact measurements from the PHFS-01-e sensors indicated a temperature of 23.8 °C during the data collection. At 20 m and 30 m, the M2EA reports the closest surface temperature to the PHFS-01-e (overestimation of 0.3–0.5 °C). At the 10 m height, however, the M2EA overestimated the concrete temperature by 1.8 °C. At 20 m and 30 m, both the XT-R and XT2 have a similarly large deviation from the in situ measurement, although in opposite directions (−1.1 °C and −1.2 °C for the XT-R, and 1.0 °C and 1.1 °C for the XT2). At 10 m height, the XT-R was the most similar to the in situ temperature of the concrete (−0.3 °C difference) in comparison to the M2EA and the XT2, which both overestimated the surface temperature by 1.8 °C. In context, however, while the deviations from the in situ measurements are greater than the bias reported with the blackbody experiment (Table 3), they are all relatively small (<2 °C).

**Figure 13.** Results of comparing the TIRI data from each camera at 10–30 m height to the temperature determined by the PHFS-01-e heat sensors. (**A**) Mean with error bars representing the min and max of the temperature of the concrete tile from the TIRI. The solid line represents the temperature of the in-situ measurement of surface temperature from the PHFS-01-e heat sensors, the dotted lines represent the uncertainty of the PHFS-01-e in situ calculated temperature. (**B**) Mean difference between the TIRI estimated surface temperature and the PHFS-01-e heat sensors.

#### 3.3.4. Grass and Soil

While vegetation targets are complex, due primarily to water content, air space and canopy distribution, they are nevertheless important to include, as a great deal of outdoor work involves their measurement [64,65]. Figure 14 shows an example of a TIRI image of the grass (*ε* = 0.98) from the XT2 camera at 40 m height. In the image are two black squares (10 cm × 10 cm) of aluminum tape that were used as markers (due to their very low emissivity, providing high thermal contrast) to locate the areas of the in situ temperature sensors in the resulting TIRI. The markers were placed ~10 cm adjacent to the areas of the in-canopy temperature sensors, bookending the areas of investigation.

**Figure 14.** TIRI from the XT2 camera at 40 m height over the areas of vegetation identified by the red rectangles adjacent to the black squares. Validation sites for Vegetation 1 (left red rectangle) and Vegetation 2 (right red rectangle) correspond to locations E and F, respectively, in Figure 5. The black squares are 10 cm × 10 cm metal markers that were used in all the TIRI to locate the areas of the in situ temperature sensors, due to their high thermal contrast with the rest of the material in the image. The geometrical relationship of the location of the red validation sites with respect to the true ground coordinates relies on the visual identification of the corners of the black markers in the image.

Results for vegetation, obtained by comparing the TIRI data from each camera at each height with the in situ ground measurements, are shown in Figures 15 and 16. For the most part, the grass temperature estimate from the M2EA lies between the in situ temperatures obtained for the grass canopy and soil for both the T1/T2 (Figure 15) and T3/T4 locations (Figure 16). The actual location (on the temperature axis) at which the grass and the image measurements lie is a function of air temperature, air space within the canopy and the coverage of the vegetation over the soil. As the soil sensor was emplaced 5 cm below the surface, and below the root mat, its value represents the true temperature of the upper surface of the soil. As seen in Figures 15 and 16, the TIRI temperature values from the M2EA camera exhibit the greatest deviation from the in situ measurements. As with the concrete example in Figure 13, the source of this anomaly has yet to be determined. At both vegetation sites, across heights, the XT-R camera is most similar to the measurements of the in situ soil probes (0.3–0.8 °C difference). In contrast, the XT2 is most similar to the grass canopy in situ probes at both vegetation sites (0.7–1.3 °C difference) (Figures 15 and 16). As expected, for all three cameras, the greatest variability in the temperature of the grass pixels is seen at the 10 m height (smallest pixels). For the XT-R and XT2, the range of grass pixel values is lowest, but similar, across the 30–50 m heights. As seen with the concrete target (Figure 13), the deviation of the TIRI estimated temperatures is mostly greater than the bias determined from the blackbody experiment (Table 3); nevertheless, the differences seen for the grass canopies and soil are small for all sensors (<2.5 °C and <2.0 °C, respectively).

**Figure 15.** Comparison of TIRI from the three cameras with the grass region of interest surrounding temperature probes T1 (canopy) and T2 (soil) for Vegetation site 1. (**A**) Mean with error bars representing the minimum and maximum TIRI estimated temperature value. Solid lines indicate the temperature recorded in situ by the HOBO sensors. The dotted lines represent the uncertainty of the HOBO in situ temperature measurement. (**B**) Mean difference between the soil in situ temperature measurement (T2) and the TIRI estimated temperature. (**C**) Mean difference between the grass canopy in situ temperature measurement (T1) and the TIRI estimated temperature.

**Figure 16.** Comparison of TIRI from the three cameras with the grass region of interest surrounding temperature probes T3 (canopy) and T4 (soil) for Vegetation site 2. (**A**) Mean with error bars representing the minimum and maximum TIRI estimated temperature value. Solid lines indicate the temperature recorded in situ by the HOBO sensors. The dotted lines represent the uncertainty of the HOBO in situ temperature measurement. (**B**) Mean difference between the soil in situ temperature measurement (T4) and the TIRI estimated temperature. (**C**) Mean difference between the grass canopy in situ temperature measurement (T3) and the TIRI estimated temperature.

#### **4. Discussion**

The overall results obtained from the blackbody work show that each camera performed within the uncertainty envelope of its calibration and is thus validated for use within the constraints of that calibration. Further, this work has shown that there is essentially no appreciable difference in validation results or camera performance as a function of distance—under the constraints of 2 m and 4 m distances—which may be extended, with caution, to suggest that these same results hold when the distance is doubled, provided the source is relatively close to the camera. All three cameras showed high linearity in their response and low errors for all targets (blackbody, concrete, soil), with differences <2 °C between them (except for the blackbody measurements with the XT-R camera at 5 °C and 40 °C, which are outside of the environmental range of the field study).

It is evident from the results of the concrete targets and both vegetation sites that temperatures retrieved with the XT-R camera are consistently lower than those of the other two cameras. While the temperature retrievals for the vegetation and soil plots show that the XT2 overestimates the grass temperature, they also show that the M2EA falls between the values for the vegetation and the soil. However, all the systems generally performed within 2 °C (i.e., Δ*T* ≤ 2 °C) of the in situ measured temperatures of the targets.

In the context of how this work fits into that of previous studies, our methodology and results have taken into account several parameters that were either missing or identified as topics of further work. For example, the process we have shown herein is very different from [36], due primarily to our night time operations at the field site; further, we also consider environmental factors, such as air temperature and humidity. In [36], the in-field blackbody target was imaged; however, environmental factors such as wind, humidity and air temperature were not accounted for, but rather treated as a portion of a common residual bias offset of 2.67 °C. Furthermore, the work was performed during the daytime (between 07:10 and 14:00), when solar heating effects have a significant non-linear impact [54,66,67], even over the timespan of an RPAS flight—typically under 30 min.

Comparing our current work to [37] shows similarities in the thinking behind the use of the indoor blackbody to characterize the sensor before flight campaigns. While [37] provides a vicarious calibration, we have focused on validation for the reasons outlined previously in this work (calibration requires a highly controlled environment and exceedingly accurate radiometric instrumentation—as was also the conclusion of [37]). Further, in [37], there was no mention of collecting or addressing environmental data, nor the use of any in-field thermal calibration targets, although they did validate their neural network-based photogrammetry process, which aided in improving the accuracy of their imaging. Also, as in [36], [37] collected their field-based TIRI during the daytime and, therefore, the previous statements regarding [36] on that practice also apply to the work of [37].

In the work by [38], a constant 0 °C value for the imaged surface of melting snow targets was used; however, it was identified that the vicarious calibration method employed performed better without the natural target of the melting snow surface. The melting snow target temperature was found to be variable and correlated to underlying forest vegetation conditions. Moreover, the retrieved TIRI data showed spatial pixel size temperature dependencies and angular dependencies, as well as mixed pixel issues with denser forest canopy. In our work, we have developed a method that is not influenced by such sources.

The calibration work of [39] produced TIRI of an ice bath for evaluation. While [39] provides support for the idea of requiring accurate RPAS TIRI, it has not provided those results in a field environment under RPAS flight conditions—which is stated within [39] as a suggestion for further work. This study has developed a field-level validation method (not the same as calibration) in order to determine whether the calibration (be it the manufacturer's, or based on [39] or similar methods) remains applicable near the time of the RPAS work.

In our study, all cameras performed as per specifications, and therefore, it may be difficult to decide which of these cameras to use. As all three produce essentially identical ground pixel sizes (the M2EA produces slightly larger pixels due to its slightly larger FOV: a ~0–2 mm difference within a square ground pixel of ~10 cm side length) and have nearly the same noise floors (<50 mK), the decision may be best driven by the non-thermal portions of the system. The M300 airframe and XT2 camera have ratings of IP45 and IP44, respectively, and are thus essentially weatherproof—neither of the other two cameras, nor airframes, have an IP rating. The XT-R camera requires a stationary drone to take the images (otherwise the results have considerable blur), while the XT2 and M2EA can acquire at slow but continuous flight speeds. The M2EA is, by far, the easiest to transport, followed by the M300 and finally the M600P, which is difficult due to its cube-shaped case with 67 cm sides. While the M300 and M600P require a separate ground station with tripod to function in RTK mode, the M2EA is fully self-contained and only requires a Wi-Fi connection and incoming corrections for RTK mode. The inclusion of a 4K visible RGB camera coincident with the thermal cameras on the M2EA and XT2 is a large advantage in many applications during daytime hours. However, care must be taken with all three cameras not to burn the thermal sensor with incoming solar radiation. For applications that require stealth, such as wildlife studies [9,11], RPAS size and rotor noise (more significant with the M600P) are primary concerns. Further, payload and ancillary sensors (hyperspectral, LiDAR, etc.) are often a complementary desire with TIRI; therefore, RPAS payload capability will play a dominant part in the selection criteria. Finally, one other non-technical consideration that has a tremendous impact on operations is that the XT-R camera has an Export Control Classification Number (6A003.b.4.b., [46]) and is therefore very difficult to transport across international borders.

While it is also important to note that the XT-R and XT2 are no longer manufactured, they continue to be very much in use [9,11,13,17,18,21–24,26–28,68,69] and are sure to be primary instruments in works to come. The M2EA is relatively new and highly capable, being able to compete (results-wise) with the XT2 system.

Finally, although this work has been applied to TIRI in the LWIR region, it has been derived in accordance with the basic properties and principles of the TIR region of the electromagnetic spectrum. As a result, we suggest that it is also applicable to the Mid-Wave InfraRed (3–7.5 μm) region and, therefore, can be used with all TIRI data from within the TIR.

#### **5. Conclusions**

In this work, we have developed a new TIRI validation methodology that has been applied to LWIR imagery (7.5 μm to 14 μm) taken of a blackbody source (indoors) and under real-world field conditions (outdoors) using instrumented concrete and vegetation sites as validation targets. We have tested our method on three popular LWIR TIRI cameras (the Zenmuse XT-R, Zenmuse XT2 and the M2EA) operated from three popular DJI RPAS platforms—the Matrice 600 Pro, the M300 RTK and the Mavic 2 Enterprise Advanced. Results of the blackbody work over the temperatures expected at the field site show a highly linear response for each sensor as well as a small temperature bias. All TIRI camera measurements were validated (by surface thermal sensors) to be within the stated tolerances of uncertainty for the cameras. Environmental parameters (air temperature, humidity, pressure and wind) were measured for several hours before the TIRI data were collected; wind cooling was not a factor for at least 2 h prior to data collection. In-field TIRI results from 10 m to 50 m heights show absolute temperature retrievals of the target materials (concrete and two vegetation sites) that were within the specifications of the cameras. The methodology has been developed under the condition that it should be applicable to RPAS operators who need to verify their equipment either pre- or post-operations.

**Author Contributions:** Conceptualization, G.L. and M.K.; methodology, G.L., M.K., J.P.A.-M., O.L. and A.T.; software, M.K. and J.P.A.-M.; validation, M.K., O.L. and G.L.; formal analysis, G.L. and M.K.; investigation, G.L., M.K., J.P.A.-M. and O.L.; resources, G.L., M.K. and O.L.; data curation, M.K. and J.P.A.-M.; writing—original draft preparation, G.L., M.K.; writing—review and editing, G.L., M.K., J.P.A.-M., A.T. and O.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received external funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant program to M.K.

**Data Availability Statement:** Data available upon request to the corresponding author.

**Acknowledgments:** We would like to thank Stephen Scheunert for use of the RPAS field site, Paul Mondor for support at the field site and, Calvin Leblanc for support with the blackbody work. We also thank Iryna Borshchova, Greg Craig and two anonymous reviewers for their input that helped to improve the manuscript. Finally, we would like to thank Tim Mammatt from Aethea for use of the ThermoConverter software and information about the M2EA, and Kevin Toderel from RMUS for help with the M2EA.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Table A1.** Summary data from the blackbody measurements.




#### **References**


### *Review* **Advanced Leak Detection and Quantification of Methane Emissions Using sUAS**

**Derek Hollenbeck, Demitrius Zulevic and Yangquan Chen \***

Mechatronics Embedded Systems and Automation (MESA) Lab, Department of Mechanical Engineering, University of California at Merced, Merced, CA 95343, USA; dhollenbeck@ucmerced.edu (D.H.); dzulevic@ucmerced.edu (D.Z.)

**\*** Correspondence: ychen53@ucmerced.edu

Received: 24 August 2021; Accepted: 30 September 2021; Published: 14 October 2021

**Abstract:** Detecting and quantifying methane emissions is gaining an increasingly vital role in mitigating emissions for the oil and gas industry through early detection and repair, and will aid our understanding of how emissions in natural ecosystems play a role in the global carbon cycle and its impact on the climate. Traditional methods of measuring and quantifying emissions utilize chamber methods, bagging individual equipment, or require the release of a tracer gas. Advanced leak detection techniques have been developed over the past few years, utilizing technologies such as optical gas imaging, mobile surveyors equipped with sensitive cavity ring-down spectroscopy (CRDS), and manned aircraft and satellite approaches. More recently, sUAS-based approaches have been developed to provide, in some ways, cheaper alternatives that also offer sensing advantages over traditional methods, including not being constrained to roadways and being able to access class G airspace (0–400 ft) where manned aviation cannot travel. This work reviews methods of quantifying methane emissions that can be, or are, carried out using small unmanned aircraft systems (sUAS), as well as traditional methods, to provide a clear comparison for future practitioners. This includes the current limitations, capabilities, assumptions, and survey details. The suggested technique for LDAQ depends on the desired accuracy and is a function of the survey time and survey distance. Based on the complexity and precision, the most promising sUAS methods are the near-field Gaussian plume inversion (NGI) and the vertical flux plane (VFP), which have comparable accuracy to those found in conventional state-of-the-art methods.

**Keywords:** advanced leak detection; advanced leak quantification; remote sensing; source estimation; environmental monitoring; landfill; natural gas

#### **1. Introduction**

Why is methane so important? Methane is a greenhouse gas (GHG) with a global warming potential 86 times that of carbon dioxide over a 20-year time window, and even larger over shorter time scales. Mitigating and reducing methane emissions can help reduce global warming in the near term. The first step is improving the way we measure emissions in practice, both in accuracy and in frequency. The overall measurement of methane emissions in oil and gas, for example (top-down vs. bottom-up), has been shown to have discrepancies and is often underestimated [1,2].

For example, 190 oil and gas production sites were explored in [3,4], and the measurements indicated that well completion emissions were lower than previously estimated. The data also showed that emissions from pneumatic controllers and equipment leaks were higher than the Environmental Protection Agency (EPA) national emission projections. In a report titled "Lessons from a decade of emissions gap assessments" [5], the authors argued about where we need to be and where we think we are, including the Paris climate agreement and what steps to take in order to keep global warming below 2 °C. One way to combat this is by detecting super emitters through tiered remote sensing strategies, which is outlined in [6].


This approach aims to focus on detecting and repairing the largest emitters first, which can contribute a significant portion of the overall emissions of oil and gas systems. Furthermore, customers and investors have applied pressure to reduce contributions to climate change, insisting on reduced carbon footprints, including from landfills, with much-needed debate on inventory methods, direct emission measurements, and accountability [7]. A wide array of strategies for mitigating methane emissions is needed to stay on track with the Paris agreement [8].

Who is currently looking at methane? From an anthropogenic point of view (e.g., oil and gas), companies such as Picarro (Santa Clara, CA, USA), Aerodyne Research (Billerica, MA, USA), Bridger Photonics (Bozeman, MT, USA), SeekOps (Austin, TX, USA), Heath Consultants (Houston, TX, USA), Flir (global), Scientific Aviation (Boulder, CO, USA), Avitas (Houston, TX, USA), Ventus Geospatial (Houston, TX, USA), Aerometrix (Canada), and many more have provided methane detection and quantification solutions using a variety of technologies.

For example, a quantum cascade laser spectrometer was deployed on a small unmanned aircraft system (sUAS) for measuring facility-scale emissions using a mass balance approach with kriging [9]. For biogenic sources in ecosystems, there has been work looking at permafrost bogs [10,11], lakes [12], small ponds, wetlands [13], and vernal pools [14–17], to name a few. Seasonal dynamics of methane emissions from permafrost landscapes, specifically a lagoon pingo, are explored in [18], with emissions estimated using a thin boundary layer approach.

Porewater samples were analyzed using a quantum cascade laser spectrometer and combined with high-resolution images from sUAS as input to a neural network for creating a prediction map to upscale methane flux [19]. The spatial distribution of methane in Arctic permafrost bluffs was explored in [20] with a backscatter tunable diode laser absorption spectrometer (bs-TDLAS), namely the Pergam Laser Methane mini.

Flux estimates are typically made using methods based on static measurements, on foot, by vehicle, by manned aircraft, and by satellite. Static measurements consist of: (1) eddy covariance (EC) towers: a footprint modeling technique that looks at the turbulent exchange with the environment and utilizes meteorological conditions with precision concentration measurements to estimate the flux; (2) chambers (autochambers): an enclosed chamber is placed over a target piece of land and is sampled with a syringe occasionally (to be analyzed at a later time, typically with gas chromatography) or dynamically sampled within a closed loop (such as GASMET's Fourier transform infrared (FTIR) analyzer [21]).

Measurements made on foot use handheld sensors to survey equipment and the ground surface. Surface emission monitoring (SEM) is typically a routine operation for landfills, done quarterly to maintain compliance with local regulations and to account for emissions lost from gas control systems. SEM is a point-based scanning technique that can take on the order of a week or so to complete. The concentrations are measured with devices such as the flame ionization detector (FID) (regulated by the EPA's guidance and Method 21) and are integrated along subdivided grids looking for elevated methane levels (greater than 500 ppmv) [22].

Landfill emissions are generally calculated using inventory-type estimates. Measurements by vehicle consist of methods such as the tracer correlation method (TCM) and the Environmental Protection Agency (EPA)'s other test method 33A (OTM33A) [23]. In [24], vehicle-based advanced leak detection (ALD) with a cavity ring-down spectrometer (CRDS) from Picarro was explored, and it was determined that five to eight drives will capture a majority of leaks (>90%); detection limitations were also indicated (such as wind and soil conditions and variations in methane enhancements, which make quantifying emissions difficult).

In [25], a vehicle-mounted CRDS (Picarro G2301 and G4302) used an empirical formulation to turn elevated concentration levels (or leak indications), *C*, into emission rates, *Q* (based on the work of [26], ln *C* = −0.988 + 0.817 ln *Q*), and used a Gaussian plume model to quantify site-level emissions in Utrecht (the Netherlands) and Hamburg (Germany). In work by [27], 6650 sites were evaluated using inventory and inverse point source Gaussian measurements, and it was found that the methane inventory was underestimated by a factor of 1.5.
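For illustration, the empirical relation from [26] can be inverted to turn a measured enhancement into an emission-rate estimate; this small helper is our own sketch (units follow the definitions in [26], which are not restated here):

```python
import math

def emission_rate(c_enhancement):
    """Invert ln C = -0.988 + 0.817 ln Q for Q (units as defined in [26])."""
    return math.exp((math.log(c_enhancement) + 0.988) / 0.817)

print(emission_rate(1.0))  # a unit enhancement maps to Q ~ 3.35
```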

A series of campaigns was carried out utilizing TCM and downwind mobile measurements to explore the accuracy of different TCM approaches, as well as to compare CRDS with FTIR instrumentation in multiple source separation [28]. Measurements from manned aircraft have been made using FID, mounted on both fixed-wing aircraft (Piper (Vero Beach, FL, USA) Seneca or Piper Navajo twin engine) and helicopters (Bell (Fort Worth, TX, USA) 206 Long Ranger), to detect liquid hydrocarbons from pipelines [29]. In [30,31], the next generation Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-NG) was utilized to retrieve methane, carbon dioxide, and water vapor.

In [32], AVIRIS-NG was used to generate the Vista-CA geospatial dataset to provide a comparison for the attribution of sources with the California Air Resources Board (CARB) Pollution Mapping Tool (CARB PMT) and the U.S. Environmental Protection Agency (EPA) Facility Level Information on Greenhouse gases Tool (EPA FLIGHT). In [33], the uncertainty in estimating urban fluxes by an aircraft-based mass balance approach is investigated. The authors assess the sensitivity of the estimated city-wide CO2 and CH4 fluxes for several flight experiments, including the regional background concentration, depth of the convective boundary layer, magnitude of the wind speed, and type of interpolation technique.

In [34], a Sky Arrow Environmental Research Aircraft was utilized to measure emissions from multiple landfills, combined with steady state Gaussian models to distinguish the emission coefficients for each individual site. A Bridger Photonics Gas Mapping LiDAR (GML) system was deployed on a Cessna 172 and blindly evaluated, with detection limits as low as 1 kg/h depending on the wind conditions [35].

This method was also introduced into the Fugitive Emissions Abatement Simulation Toolkit (FEAST) [36] and shown to be comparable to OGI-based methods at equivalent survey frequencies for the detection and repair of emissions. Optical gas imaging (OGI) was explored in [37,38], and its effectiveness was evaluated in [38]. NASA's Alpha Jet Atmospheric eXperiment (AJAX) and the AutoMObile greenhouse Gas (AMOG) surveyor were used to fuse airborne and ground-based data together (as part of the GOSAT-COMEX Experiment) using an anomaly approach instead of the typical mass balance approach [39]. Measurements by satellite have been explored in [40], where a ResNet-50 was trained on ESA Sentinel-2 data labeled with a U-Net to detect smoke plumes.

Other works in the literature where emissions are detected, quantified, mapped, or localized include the following: a mid-wave infrared (MWIR) camera was used to compare eight supervised multivariate methods for detecting oil spills along the coastline in [41], and, using an array of stationary laser fetches, a controlled release emission was estimated with a Bayesian Markov chain Monte Carlo (MCMC) approach in [42]. There have been several works devoted to gas distribution mapping (GDM) using the Kernel DM/V methodology [43], including simultaneous localization and mapping (SLAM) [44].

GDM and gas source localization (GSL) with micro-drones have been explored in [45]. GDM has also been used in olfactory simulations in [46]. In [47], different GSL strategies (spiral, surge-cast, spiral-surge, and particle filter) were evaluated using the GADEN gas dispersion simulator. A mobile ground robot system named ARMEx was used to perform gas distribution mapping with a Heath Consultants remote methane leak detector (RMLD) [48].

In recent years, sUAS-based sensing approaches have become increasingly popular amongst practitioners for a variety of reasons, such as not being restricted to roadways or landlocked areas, the ability to operate within class G airspace at altitudes at which traditional manned aircraft cannot operate (improving resolution), low cost, and the ability for high-frequency deployment to capture temporal changes.

Here, we provide an overview of some of the recent literature utilizing sUAS. For example: a fixed-wing SIERRA sUAS with an off-axis integrated cavity output spectrometer (OA-ICOS) instrument was deployed in Svalbard, Norway prior to the NASA Characterization of Arctic Sea Ice Experiment (CASIE) [49]; single and multi-sUAS systems for source seeking based on the Luenberger observer were explored in [50]; an open path GHG analyzer based on a vertical cavity surface emitting laser (VCSEL) was developed and tested in [51], with an aim to provide improved measurements compared to satellites; and volcanic emissions were captured using thermal cameras [52].

In [53], detection and spatial-temporal analysis of a thermokarst lake was done with RGB images taken from a plane and an sUAS, examining the bubble characteristics in the images to determine methane ebullition. Long wave infrared (LWIR), short wave infrared (SWIR), hyperspectral, and visible cameras were used to detect liquid hydrocarbons with machine learning in [54]. Detection of methane gas with a custom open-path absorption spectrometer, mounted on a fixed-wing sUAS, was explored in [55]. Emission factors from a combustion source were calculated using the EPA-based sensor Kalibri on an sUAS in [56]. GHG profiles from an sUAS-based AirCore system were analyzed with CRDS in [57].

Terra Sana Consultants developed an sUAS system with a path-integrated laser absorption instrument (10 Hz at 30 m with 1 ppm-m) used in the detection of landfill gas. In a field trial, they compared the sUAS results to a ground-based walk-over survey, reporting good correlation between the two [58]. A bs-TDLAS-equipped drone with a laser rangefinder was used to reconstruct 2D plumes under realistic conditions [59]. SEM, drone emission monitoring (DEM), and downwind plume emission monitoring (DWPEM) with CRDS were used with a genetic algorithm (GA) to estimate methane emissions from a landfill [60].

The AlphaSense electro-chemical sensor suite was used on a DJI 100 series sUAS that conducted zig-zag and spiral localization flights around a stationary source [61]. In [62], red green blue (RGB), near infrared (NIR), and thermal infrared (TIR) cameras were used to map the topography and create digital elevation maps for identifying problematic areas where localized CH4 emissions were present, using a static prototype semiconductor sensor. In [63], an sUAS equipped with a Pergam (Renton, WA, USA) backscatter-based tunable diode laser absorption spectrometer (bs-TDLAS) and an OGI camera was used to detect and quantify pipeline leaks. The sUAS traveled 4 m from the pipeline during the surveys and had a minimum detection limit of 0.06 g/s.

In [64], atmospheric particulate matter and carbon dioxide were measured using sUAS sampling and a bag collection system; the bags were collected and analyzed in a lab. Termite mounds were characterized using ground- and sUAS-based laser scanning in [65]. In a recent paper, ref. [66] utilized an NDIR instrument to measure CO2 flux (characterized, corrected, and validated in a laboratory experiment at the Integrated Carbon Observation Station (ICOS) in Steinkimmen, Germany) at an ExxonMobil (Irving, TX, USA) natural gas processing facility in Germany. They used an on-board anemometer (FT-205), gain and bias corrected prior to the field experiments, and flux measurements were calculated using the mass balance approach.

General questions one can ask are: "What technologies and methods fall under advanced leak detection and quantification (LDAQ)?" Does this include mobile and sUAS-based approaches? These questions, in practice, are unfortunately left to the owners and operators of natural gas facilities, as they have the choice regarding what becomes adopted. However, the potential impact that LDAQ can have on improving methane mitigation is yet to be seen. Is leak grading quantification? What is accurate enough? What size leaks should we (or can we) care about? In the literature, it is often observed that leak quantification estimates and variability are reported in place of accuracy and uncertainty. How can we determine the necessary and sufficient conditions for the application of these methodologies? To the best of our knowledge, these questions remain unresolved in practice.

In the literature, there have been several reviews conducted on topics that deal with emissions, including remote sensing, source term estimation, and fugitive gas emissions. A remote sensing review paper in [67] describes many applications and topics within remote sensing, including the environmental sensing of volcanic eruptions, soil erosion, and geology-related areas. A thorough review paper on source term estimation techniques is presented in [68]. In [69], a review was conducted on chemical sensing drones, which includes the sUAS platforms, sensors, and a brief overview of methodologies.

In [70], a review was conducted measuring fugitive gas emissions from landfills using various methodologies, including surface chambers (closed and open), EC towers, stationary mass balance, aerial mass balance, vertical radial plume mapping (VRPM), differential absorption LiDAR (DiAL), tracer gas dispersion (stationary and dynamic), and inverse modeling approaches (stationary, dynamic, and aerial [71]). In [72], several biogas plants in the UK were evaluated using the point source Gaussian plume model.

In this manuscript, we provide a review of the literature regarding the source rate estimation of continuous emission sources, focused on UAV-based methodology. We provide an overview of the theoretical methodology and establish a quantitative comparison (for papers that have demonstrated validated accuracy) between existing approaches and UAV-based approaches, in an attempt to shed light on the current accuracy of these methods. In Section 2, we discuss the problem overview. In Section 3, we overview some common sensors (chemical and wind). In Section 4, we overview the LDAQ methodology. In Section 5, we analyze the reported accuracy. Section 6 summarizes the methodologies, Section 7 discusses possible future directions, and in Section 8, we conclude the paper.

#### **2. Problem Overview**

The general problem considered in this work is methane emissions released into the air in gaseous form. The release mechanism or interface can vary depending on the application or system. For instance, in the oil and gas industry, leaks generally appear from tanks, valves, or hatches in the form of a point source, typically an above-ground leak. Underground leaks also occur in practice, and the resulting emissions can manifest at the surface in many ways.

This may also be the case for landfills, where many small sources can be present across a very large area. If the distribution of sources is spatially uniform, we refer to this as an area source. The distribution of emissions may also vary, as in the case of natural ecosystems where the amount of methane may be produced at different rates depending on key factors of natural methane and carbon dioxide production (e.g., temperature, soil properties, water properties, etc.). Examples of different types of leaks are shown in Figure 1.

**Figure 1.** Example illustration for source types: (**a**) continuous point source, (**b**) uniform area source, (**c**) distributed area source, (**d**) intermittent point source, (**e**) elevated area source, and (**f**) underground point source.

Landfills, which typically have area sources, are required to do quarterly walkover surveys, based on landfill regulations [73], using SEM on gas collection facilities. Landfills also have to consider the production and control of hydrogen sulfide gas, which is reviewed in [74]. Assuming some level of uniformity, chamber measurements have been taken and compared against atmospheric tracer methods (or TCM) [75]. The mass flux for chamber measurements can be calculated as

$$E = \frac{V}{A}\,\rho\left(\frac{\Delta C}{\Delta t}\right),\tag{1}$$

where *V* is the volume of the chamber, *A* is the area covered, *ρ* is the gas density given the headspace temperature, and (Δ*C*/Δ*t*) is the change in mixing ratio, derived from linear regression of the temporal observations (four to five headspace measurements to achieve an acceptable correlation coefficient). However, due to the large size of these sites, they are difficult to measure accurately. For example, four methods (aircraft mass balance, tracer correlation (TCM), vertical radial plume mapping (VRPM), and static chambers) and the California Landfill Methane Inventory Model (CALMIN) were compared in a landfill study in Indiana [76].
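A minimal sketch of Equation (1): the mixing-ratio slope is obtained by linear regression over four to five headspace samples and then scaled by the chamber geometry and gas density. All numbers below are hypothetical, and the unit handling (ppm as a 10⁻⁶ fraction) is our own assumption.

```python
import numpy as np

# Hypothetical headspace time series: time (s) and CH4 mixing ratio (ppm).
t = np.array([0.0, 60.0, 120.0, 180.0, 240.0])
c_ppm = np.array([2.00, 2.31, 2.59, 2.92, 3.20])

slope_ppm_s, _ = np.polyfit(t, c_ppm, 1)   # dC/dt from linear regression
r = np.corrcoef(t, c_ppm)[0, 1]            # check the correlation is acceptable

V = 0.030    # chamber volume (m^3), hypothetical
A = 0.25     # footprint area (m^2), hypothetical
rho = 0.657  # CH4 density at headspace temperature (kg/m^3), assumed

# E = (V/A) * rho * dC/dt; with C as a 1e-6 mole fraction this yields a
# CH4 mass flux per unit area, converted here to mg m^-2 h^-1 for readability.
E = (V / A) * rho * slope_ppm_s * 1e-6      # kg m^-2 s^-1
print(f"r = {r:.3f}, flux = {E * 1e6 * 3600:.2f} mg m^-2 h^-1")
```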

A field study comparison of different landfill methods for the assessment of fugitive gas emissions was explored in [77]. This included VRPM, TDM, DiAL, the micrometeorological (eddy covariance) method, and flux chambers; VRPM (close to the source, ≈10 m) and TDM (≈400 m) performed quite well against DiAL. In a paper from the UK, a quantification of biogas plants was undertaken with inverse dispersion modeling (e.g., bLS), a tracer dispersion model, and OGI for different feedstock cases [72]. Mass balance approaches have also been applied using UAVs, as developed in [78].

In [79], chamber measurements were used to compare TIR images to quantify emissions at two landfills. The overall site emissions were verified using TCM (which tends to be the gold standard). The methane flux from different types of surface emissions was explored using chamber and FID measurements in [80]. A point-based scanning method utilizing a portable gas detector (bs-TDLAS based) was correlated in lab testing using chamber methods and deployed on a landfill experimentally. This study showed a positive correlation between ambient methane concentrations and flux, as well as a direct proportionality to flow rates [81]. Then, using this relation, a spatial map of the emissions was derived.

In a landfill study, the TCM was quantified over a 6-day campaign during different wind conditions, and it was found that the methane emitted accounted for 31% of the generated methane [82]. Based on these findings, it is clear that fast and effective methods for estimating emissions from landfills are needed.

Natural ecosystems, which typically manifest as distributed sources (sometimes point sources distributed across a landscape), generally emit much less than anthropogenic sources. These emissions have long been thought to be small; however, recent research suggests that they are still not well understood. For example, digital elevation maps with SWIR imagery have been used to detect temporal trends in an ombrotrophic peatland [83].

A Patagonian peat bog was examined with a UAV carrying a high-resolution color infrared (CIR) camera. The images were classified using chamber measurements and different microforms in an attempt to upscale the plot-scale fluxes [84]. Thawing permafrost, peatland bogs, etc., have typically been measured using chambers [85], autochambers, and eddy covariance (EC) towers. Commercialized chambers include those from LICOR, Picarro, and GASMET [21].

Aside from source types, wind is a direct input into the flux calculation, and it can introduce considerable uncertainty into the emission estimate. There are many important weather-related measurements that can provide metrics for quantification methods, such as atmospheric stability. These stability classification schemes can depend on mechanical turbulence (roughness length and friction velocity), convective turbulence (mixing depth, Monin–Obukhov length, and heat flux), wind speed, and wind direction fluctuations [86,87]. These meteorological measures are summarized here: the Monin–Obukhov length is given as

$$L = -\frac{u_*^3\, \overline{T}}{\kappa g\, \overline{w'T'}},\tag{2}$$

where *κ* is the von Karman constant, *g* is the acceleration of gravity, $\overline{T}$ is the average temperature, and $\overline{w'T'}$ is the mean covariance between the vertical wind speed and sonic temperature [88]; the friction velocity is given as

$$u_* = \sqrt{-\overline{u'w'}},\tag{3}$$

where $\overline{u'w'}$ is the mean covariance between the horizontal and vertical wind speed components. The effective plume height, $\bar{z}$, can be determined using the following two equations,

$$L_{x,eff} + x_0 = \begin{cases} (\bar{z}/\kappa^2)\left[\ln(c\bar{z}/z_0) - \Psi(c\bar{z}/L)\right]\left[1 - p\bar{z}/(4L)\right]^{-1/2}, & L < 0, \\ (\bar{z}/\kappa^2)\left\{\left[\ln(c\bar{z}/z_0) + 2b_2 p\bar{z}/(3L)\right]\left[1 + b_1 p\bar{z}/(2L)\right] + (b_1/4 - b_2/6)\,p\bar{z}/L\right\}, & L > 0. \end{cases}\tag{4}$$

This equation is first initialized by setting the effective distance $L_{x,eff} = 0$ and the effective plume height to the source height, $\bar{z} = z_s$, and solving for the integration constant $x_0$. $L_{x,eff}$ is calculated from the longitudinal distance to the source using the angle to the center of the plume, $\theta_p$, by $L_{x,eff} = L_x \cos(\theta - \theta_p)$, where $L_x$ is the longitudinal distance from the source. The stability parameter, Ψ, which is dependent on the effective plume height $\bar{z}$ and Monin–Obukhov length *L* [89], can be calculated (for a given height) as,

$$\Psi(z/L) = \begin{cases} (1 - a\_2 z/L)^{1/4} - 1, & L < 0, \\ -b\_2 z/L, & L > 0. \end{cases} \tag{5}$$

The coefficient *c* is dependent on the shape function parameter, *s*, described in [89] and given as

$$s = \begin{cases} \dfrac{1 - a_1 c\bar{z}/(2L)}{1 - a_1 c\bar{z}/L} + \dfrac{(1 - a_2 c\bar{z}/L)^{-1/4}}{\ln(c\bar{z}/z_0) - \Psi(c\bar{z}/L)}, & L < 0, \\[2ex] \dfrac{1 + b_1 c\bar{z}/(2L)}{1 + b_1 c\bar{z}/L} + \dfrac{1 + b_2 c\bar{z}/L}{\ln(c\bar{z}/z_0) + \Psi(c\bar{z}/L)}, & L > 0. \end{cases}\tag{6}$$

The remaining coefficients (also from [89]), $p$, $a_1$, $b_1$, and $b_2$, can be set to 1.55, 16, 5, and 5, respectively (as used in [88]). The speed of the plume is given as

$$U(z) = \frac{u_*}{\kappa}\left[\ln(z/z_0) - \Psi(z/L)\right].\tag{7}$$

For the interested reader, the Monin–Obukhov similarity theory overview is given in [90].
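
To make these profile relations concrete, the following sketch evaluates Equations (5) and (7) for the stability-corrected logarithmic wind profile. The coefficient defaults and all inputs (friction velocity, roughness length, Obukhov length) are illustrative assumptions.

```python
import numpy as np

def psi(z, L, a2=16.0, b2=5.0):
    """Stability correction Psi(z/L), Eq. (5)."""
    if L < 0:
        return (1.0 - a2 * z / L) ** 0.25 - 1.0
    return -b2 * z / L

def wind_speed(z, u_star, z0, L, kappa=0.4):
    """Log-law wind profile with stability correction, Eq. (7)."""
    return (u_star / kappa) * (np.log(z / z0) - psi(z, L))

# Example: unstable conditions (L < 0) over short grass (hypothetical values).
for z in (2.0, 5.0, 10.0):
    print(z, wind_speed(z, u_star=0.35, z0=0.03, L=-25.0))
```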

#### **3. Sensors and Equipment**

There are many types of sensors that can be used on board sUAS, provided they are light enough for a given platform's payload capacity. In this section, we briefly overview the key sensors used in the works reviewed here and refer interested readers to a thorough review paper for more on chemical sensing drones [69].

There are generally two types of sensing modalities, passive and active. Passive sensing encompasses any sensor that receives information from the environment. Common examples include optical cameras, such as visual spectrum cameras (e.g., RGB) and thermal cameras (e.g., thermal infrared (TIR) [79,91], near infrared (NIR), short-wave infrared (SWIR), mid-wave infrared (MWIR), and long-wave infrared (LWIR)).

TIR cameras tend to span a larger bandwidth of wavelengths, whereas hyperspectral cameras can control which wavelengths to focus on. For example, in [92], Telops (Quebec, Canada) used a standoff tripod-mounted hyperspectral camera to estimate the flow rate by integrating the mass per unit area and multiplying by the mean velocity of the gas. They utilized a two-layer model to calculate the background radiance,

$$L_{tot} = \left[ L_{bkg}\,\tau_{plume} + L_{plume}\left(1 - \tau_{plume}\right) \right]\tau_{atm} + L_{atm}\left(1 - \tau_{atm}\right).\tag{8}$$

When the hyperspectral camera was optimized for methane detection (as in [93], 7.7 μm band), two controlled release tests showed calculated (measured) flow rates of 25.3 ± 2.8 g/h (23 ± 2.3 g/h) and 102.9 ± 5.8 g/h (100 ± 10 g/h). The authors claimed that the approach is 40 to 100 times more sensitive and can potentially be mounted on an aerial platform, remotely sensing from several hundred meters, deeming it suitable for both natural and anthropogenic sources. Active sensing, in contrast, encompasses any sensor that actively transmits information into the environment, probing a response.

An example of active sensing is tunable diode laser absorption spectroscopy (TDLAS), which comes in several forms. The working principle relies on the gas species entering the sensor region or laser path, such that some of the laser power is absorbed by the gas and a power drop is detected. One form is the closed path TDLAS (i.e., the sensing region is enclosed in a controlled environment), where the emitter and detector are part of the same device at a fixed distance apart, optimized for a desired detection species.

Other variations of TDLAS include: the open path TDLAS (i.e., the sensing region is open to the environment; see the open path laser spectrometer (OPLS) [94]), the backscatter TDLAS (bs-TDLAS), where the laser is reflected off the natural environment before being received at the detector, and the long path TDLAS (e.g., used with retro-reflectors not connected to the physical instrument). Several examples of bs-TDLAS include LiDAR-based instruments (continuous wave laser absorption LiDAR and pulsed differential absorption LiDAR (DIAL)), the Pergam Laser Methane Mini [95], the RMLD, the Gasfinder2 [96], and the Gasfinder3 [97] [98].

Other, more sensitive laser-based instruments are also used in practice, such as cavity ring-down spectroscopy (CRDS) and off-axis integrated cavity output spectroscopy (ICOS). Examples of these types of instruments include the Los Gatos Research Inc. (LGR) micro greenhouse gas analyzer (MGGA) (also referred to in the literature as the ultra-portable greenhouse gas analyzer (UGGA)) and the Picarro G2301 and G4302. These instruments are typically the gold standard for gas sensing, though they are also typically the heaviest. Other sensors used consist of non-dispersive infrared (NDIR), ceramic metal oxide sensors (CMOS) [99–102], photo-ionization detectors (PID, such as the Honeywell MiniRAE® 3000), and electro-chemical sensors (see [103] for a review of applications).

A recent survey paper [104] outlined new electronic nose technologies and applications. As low cost sensing solutions become commercially available, we are faced with evaluating the accuracy and characteristics of these sensors for practical use. For example, in [105], low cost commercially available sensors were evaluated for precision and accuracy in a gas mixing chamber, showing promise for continuous monitoring applications. Once such sensors are evaluated and integrated on a platform with suitable sensor characteristics, they can be applied in practice.

For example, in [102], the authors demonstrated a proof of concept using a chemical multisensor payload for gas monitoring based on the DJI S900 platform. In [106], a semiconductor-type sensor (Testo Gas Detector, Testo SE & Co. KGaA, Titisee-Neustadt, Germany) was used on a DJI M600 to analyze the spatial distribution of methane at a landfill, and different spatial interpolation techniques were compared. This required calibration and consideration of the vehicle's critical velocity.

For sUAS, lightweight and accurate wind sensors are needed to provide in situ measurements that can be used in the quantification methodology. Examples of lightweight sensors range from five-hole (or multi-hole) probes, to ultrasonic anemometers that utilize time of flight (e.g., the Anemoment (Longmont, CO, USA) Trisonica used in [107] or the Gill (Lymington, Hampshire, UK) WindMaster used on the OP-TOKopter [108]), to resonance-based sensors (the FT Technologies FT742 and FT205 used in [109]), to more custom micro-electro-mechanical systems (MEMS)-based solutions (such as in [110]). These wind sensors can also be applied to general wind profiling and mapping applications. For example, in [111], in situ wind measurements on sUAS were used to study the atmospheric boundary layer by developing wind profiling measurements from wind-induced perturbations. Mapping wind distributions over complex terrain was explored in [112], where a Gill WindMaster 3D ultrasonic anemometer was mounted on an octocopter (called the WindLocater).

Wind and temperature profiling from fires were explored in [113] based on wind measurements using the Trisonica. Vertical velocity measurements of aerosol cloud interactions were compared with ground-based radar in [114]. If payload limitations prevent integration of an on-board wind sensor, wind estimation techniques can also be explored, such as in [115,116]. For a more thorough understanding of different sensors and estimation techniques, we refer the reader to [117].

The choice of a platform and ancillary equipment depends directly on the choice of payload system (e.g., the collection of on-board sensors) that needs to be integrated onto the sUAS. For LDAQ, this typically includes lightweight methane and wind sensors. Due to sUAS payload capacity limitations, this often leads to integration problems, as the weight of methane sensors can vary greatly depending on the desired sensitivity and response time (e.g., from a couple hundred grams to a couple of kilograms). Lightweight and low cost methane sensors (such as CMOS) are slower in response and less sensitive. High accuracy, fast response sensors typically weigh more, which ultimately affects the selection of the sUAS, requiring bigger and more costly platforms to maintain safe stability and control.

For this reason, ground-based wind sensors are often used (placed adjacent to the operating area) with smaller sUAS platforms gathering the in situ methane measurements. The data from these two systems are collected by a local data acquisition system for live observation and post-processing. This kind of configuration implicitly assumes that the average wind, or mean wind field, represents the overall sensing region quite well. This is usually only the case in rural areas, where there are few to no obstructions (i.e., trees, hills, buildings, infrastructure, etc.). Most of the scenarios faced in practice, however, involve obstructions and require on-board, or in situ, wind measurements.

This decision requires platform specific knowledge (e.g., hardware setup, autopilot, co-pilot software, etc.) as well as payload knowledge, which will vary depending on the desired application, sensitivity, measurement mode, and so on. For these reasons, and since this manuscript's focus is on detection and quantification methodologies, we omit these details and suggest that the interested reader see [118] for a guide on sUAS platform selection.

Payload integration strategies for methane sensors include several configurations, such as boom-mounted, bottom-mounted, or top-mounted (see Figure 2). Boom-mounted approaches typically consist of TDLAS-based sensors, which are subject to measurement disturbances from the downwash of the propellers. This is avoided by placing the sensor out in front of the aircraft along a boom and sampling when the effective wind speed over the sensor is greater than 2 m/s [119].

**Figure 2.** Payload configuration examples of (**a**) boom-mounted TDLAS on a DJI M210 used in [120], © 2020 IEEE, used with permission, (**b**) top-mounted anemometer in a wind tunnel showcasing the effect of propellers on the streamlines used in [108], and (**c**) bottom-mounted RMLD used in [121].

Bottom-mounted approaches are typically used with bs-TDLAS or OGI. The sensor is mounted on a gimbal system or sometimes hard-mounted to the aircraft frame. Top-mounted approaches are typically only suitable for bs-TDLAS or OGI based methods. Point source measurements with TDLAS will, on average, underestimate the concentration (see [69] for more details). On the contrary, top-mounted wind sensors can provide high accuracy if translational and induced wind velocities can be removed [108].

#### **4. Advanced Leak Detection and Quantification Methods**

In this section, we overview the conventional and sUAS-based advanced leak detection and quantification (LDAQ) methodologies. The LDAQ methods utilize several concepts and approaches within numerics, control, and optimization, as well as approaches based on different available sensing modalities (see Figure 3 for effective length-scales). In this manuscript, we divide these approaches into five general categories, namely: Simulation-based (Section 4.1), Optimization-based (Section 4.2), Mass-Balance-based (Section 4.3), Imaging-based (Section 4.4), and Correlation-based (Section 4.5).

In the Simulation-based approaches, the methods depend heavily on simulation and computational tools for solving dynamic partial differential equations, which are used to determine the source rate estimation. Sometimes other source parameters are also estimated in the process and this is typically referred to as source term estimation (STE) or the source determination problem (SDP). Two methods that show up in the literature are backwards Lagrangian stochastic (bLS) and mesoscale recursive Bayesian least squares inverse (RB-LSI).

The optimization-based methods showcased in this manuscript depend on some form of a parameterized system model, which undergoes a model fitting or recursive optimization (statistical or information based). Many of these methods include variations of the point source Gaussian (PSG) solution of the classical Gaussian plume model. This is seen in the PSG approach based on the EPA's Other Test Method (OTM) 33A, where the data are gathered from a single sensor downwind and undergo a model fit of the peak measured concentration.

Next is the conditionally sampled PSG (PSG-CS) approach, which utilizes meteorological data in the model fitting process using concentration data conditionally sampled on incremental changes in wind direction. Another variation is the recursive Bayesian PSG (PSG-RB), which utilizes a moving sensor and meteorological data to condition the model's likelihood function and prior for updating the posterior distribution that is used to quantify the source estimate. This approach also considers past knowledge about equipment characteristics, if known.

A different approach to the Bayesian way of thinking is to solve for the parameters of the model conditioned on the observations. This approach also utilizes a particle filter and Markov Chain Monte Carlo (MCMC) to update the posterior and is referred to as the PSG sequential Bayesian MCMC (PSG-SBM). The last optimization approach mentioned in this manuscript is the Near-Field Gaussian Plume Inversion (NGI) approach.

The NGI utilizes fitting the Gaussian plume model based on sampling of a perpendicular plane downwind of the source. The vertical and horizontal dispersion relations are used to find the center of the plume within the perpendicular plane and minimize, by least square fit, the difference between the modeled concentration and the observed concentration (integrated over the lateral dispersion direction).

The next category is the Mass-Balance-based approaches, which includes methods that utilize equations based on mass conservation and continuity. The simplest approach is the vertical flux plane (VFP), which takes a control volume approach to estimating the emission rate by measuring the flux entering and leaving the control volume. Traditionally, the plume is sampled using a raster scanning approach in a perpendicular plane upwind and downwind of the source. The sparse set of observations within the plane undergo a spatial interpolation process and are combined with the wind to estimate the source rate.

A direct variation of this approach is the cylindrical flux plane (CFP), in which the sensing system measures concentrations on successive loops around the source at different altitudes. The flux going into and out of this cylindrical plane is used to estimate the source rate. Using different sensing modalities (such as imaging or backscatter-based sensors), a path integrated vertical flux plane (PI-VFP) method can be formulated. Both aircraft and sUAS-based approaches to PI-VFP have been implemented, which rely on horizontal scanning of the area of interest.

For sUAS-based PI-VFP, concentric circles are flown to confirm that sources are contained inside the path before estimating the source rate. A flux plane approach has also been explored using a series of TDLAS-based laser fetches at different altitudes, utilizing the time-average of the line-integral of the instantaneous product of the wind speed and concentration. This is advantageous compared to other VFP approaches, as it provides very good performance and does not take time to scan the plane. However, it is in some ways impractical, as it requires setup of the laser fetches and knowledge of the source geometry.

The next method is the Gauss divergence theorem (GDT) approach. It utilizes the CFP approach with mass flux continuity as well as the expected time rate of change of the mass within the control volume to estimate the source rate. Another VFP approach was included in this review that uses Gaussian plume model optimization with a general linear model (GLM) to help determine the contributions of multiple sources. This approach is referred to as the VFP-GLM.

The last Mass-Balance-based approach is vertical radial plume mapping (VRPM). The VRPM approach uses a ground based laser with retro-reflectors at different altitudes downwind of the source. The path integrated concentrations are measured at different radial angles and used to estimate the flux.

The next category is the Imaging-based approaches that utilize MWIR, hyperspectral cameras, and absorption spectroscopy (such as iterative maximum a posteriori differential optical absorption spectroscopy (IMAP-DOAS)).

The last category covered in this manuscript is the correlation-based approaches, which includes the traditional Eddy covariance (EC) method (in brief) and the tracer correlation method (TCM). The TCM has also been referred to in the literature as the tracer dispersion method (TDM) and atmospheric tracer method (ATM).

**Figure 3.** Conventional methods and their effective emission quantification length-scales [70], © 2019 Elsevier, used with permission.

#### *4.1. Simulation-Based*

#### 4.1.1. Forward Modeling

Forward modeling is typically used for projecting or forecasting dispersion. It is not directly used for emission quantification by itself, but rather is paired with feedback in the optimization sense. This can include numerically solving a governing set of equations, such as the advection diffusion equation (ADE), or applying a parameterized general model (such as the Gaussian plume). It is also common in practice to utilize existing numerical models, such as WindTrax 2.0, the WRF model, FLEXible PARTicle-Weather Research and Forecasting (FLEXPART-WRF), SCIPUFF, QUIC, and others that can be Lagrangian-based or include turbulence modeling, e.g., large eddy simulation (LES) and Reynolds-averaged Navier–Stokes (RANS). Interested readers can check the review paper from [122] on dispersion models.

#### 4.1.2. Backward Lagrangian Stochastic (bLS)

The accepted backward modeling approach used in the draft OTM-33A document [23] and in several applications (e.g., dairy farms [123]) is the backwards Lagrangian stochastic (bLS) approach by [124]. The bLS approach aims to answer two general questions: What is the proper form of the LS trajectory model? And how can source estimates be extracted from the particle's backward LS trajectory? The forward model, formulated as a generalized Langevin equation, is evolved jointly as a Markov process,

$$du_i = a_i(\mathbf{x}, \mathbf{u}, t)\,dt + b_{i,j}(\mathbf{x}, \mathbf{u}, t)\,d\xi_j, \quad dx_i = u_i\,dt,\tag{9}$$

where the particle position is given by $\mathbf{x} = (x_1, x_2, x_3) = (x, y, z)$, and $d\xi_j$ is a random increment governed by a Gaussian process. The functions $a_i$ and $b_{i,j}$ have to be specified such that the velocity probability density function, $g_a(\mathbf{x}, \mathbf{u}, t)$, satisfies the Fokker–Planck equation (FPE) [124],

$$\frac{\partial g_a}{\partial t} = -\frac{\partial}{\partial x_i}\left(u_i g_a\right) - \frac{\partial}{\partial u_i}\left[a_i(\mathbf{x}, \mathbf{u}, t)\, g_a\right] + \frac{\partial^2}{\partial u_i\, \partial u_j}\left[B_{i,j}(\mathbf{x}, \mathbf{u}, t)\, g_a\right].\tag{10}$$

This method provides a source estimation for an area source given the source location (with unknown source rate), assuming a horizontally uniform surface source and a horizontally homogeneous atmosphere (see Figure 4). To make an emission estimate using bLS, the method utilizes the dispersion model relation,

**Figure 4.** A diagram depicting the bLS approach [124], © American Meteorological Society, used with permission.

$$\frac{\overline{C}\,\overline{U}}{Q} = n = f(z_m, z_0, L, h, G),\tag{11}$$

where *L* is the Monin–Obukhov length, *h* is the depth of the mixing layer, *G* describes the set of parameters characterizing the plume, and $z_m$ represents the measurement height. As the particles from the back trajectories touch down in the source area, the vertical touchdown velocities, $w_0$, are logged and used to estimate *n*,

$$n(z_m) = \frac{C(z_m)\,U(z_m)}{Q} = \frac{1}{N} \sum \left| \frac{2}{w_0 / U(z_m)} \right|.\tag{12}$$

Once *n* is known, an estimate of the source rate can be determined from the measured concentration and wind speed as $Q = C\,U/n$. In this approach, due to the time-averaged ensemble, the accuracy improves over time (nominal averaging period of 15 min [124]). An alternate expression for the emission estimate is given as

$$Q_{bLS} = \frac{C - C_b}{(C/Q)_{sim}},\tag{13}$$

where *Cb* is the background concentration and (*C*/*Q*)*sim* is calculated using

$$(C/Q)_{sim} = \frac{1}{N} \sum \left|\frac{2}{w_0}\right|.\tag{14}$$

This Monin–Obukhov similarity theory (MOST)-based bLS emission estimation methodology was validated against the mass balance approach (given the along-wind distance of the source *D*),

$$Q_{MB} = \frac{1}{D} \int_0^\infty C(z)\, U(z)\, dz,\tag{15}$$

and field tested in [96,125,126]. A lagoon leak scenario was also explored with the bLS approach by constructing a large 45 m by 45 m emission source on a pond; the accuracy was shown to be lower during the summer period due to more frequent unstable atmospheric conditions [127].
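
As an illustration of Equations (13) and (14), the sketch below forms $(C/Q)_{sim}$ from synthetic touchdown velocities and converts a measured enhancement into a source rate. The touchdown statistics and concentrations are placeholders, not outputs of any dispersion tool or cited study.

```python
import numpy as np

def bls_source_rate(C_meas, C_bkg, w0_touchdowns, N_released):
    """bLS emission estimate, Eqs. (13)-(14).

    w0_touchdowns: vertical velocities logged when backward particle
                   trajectories touch down inside the source area (m/s)
    N_released:    total number of released backward particles
    """
    # (C/Q)_sim = (1/N) * sum |2 / w0| over touchdown events, Eq. (14).
    CQ_sim = np.sum(np.abs(2.0 / np.asarray(w0_touchdowns))) / N_released
    return (C_meas - C_bkg) / CQ_sim   # Eq. (13)

# Example with synthetic touchdown velocities (hypothetical values).
rng = np.random.default_rng(0)
w0 = -np.abs(rng.normal(0.3, 0.1, size=5000))   # downward touchdowns
print(bls_source_rate(C_meas=2.4, C_bkg=1.9,
                      w0_touchdowns=w0, N_released=50000))
```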

#### 4.1.3. Mesoscale Recursive Bayesian Least Squares Inverse (RB-LSI)

Utilizing the NOAA P-3 aircraft and a wavelength-scanned CRDS, ref. [71] used a mesoscale recursive Bayesian least squares approach to solve the inverse problem of estimating emissions. They used FLEXPART-WRF to model the forward problem, which was compared to physical observations and minimized via an iterative cost function that assumes lognormal distributions,

$$\begin{split} J &= \frac{1}{2} \left(\ln(y_0) - \ln(Hx)\right)^T R^{-1} \left(\ln(y_0) - \ln(Hx)\right) \\ &+ \frac{1}{2}\, a \left(\ln(x) - \ln(x_b)\right)^T B^{-1} \left(\ln(x) - \ln(x_b)\right), \end{split}\tag{16}$$

where the observed concentration enhancements are given as $y_0$, the posterior solutions are *x*, the FLEXPART-WRF outputs are *H*, the prior fluxes are $x_b$, the observation error covariance matrix is *R*, and the prior flux error covariance matrix is *B*, all in the lognormal space.

#### *4.2. Optimization-Based*

In this section, we discuss emission quantification techniques that fit a model using some form of optimization.

#### 4.2.1. Point Source Gaussian (PSG)—OTM33A

In [88], the point source Gaussian (PSG) approach is discussed. The measurement involves a vehicle with a concentration measurement instrument (CMI) parked downwind of the known source with the vehicle off. The CMI (such as a Picarro or LGR UGGA) collects data at roughly 2.5 m above ground at a known distance from the source. The variations in the wind direction are measured using a sonic anemometer (e.g., R.M. Young). The PSG calculations are based on concentration enhancements above background, where the background can be taken as the fifth percentile of the concentration time series signal [88]. The PSG estimate then becomes a simple 2-D Gaussian integration with no reflection term,

$$S_E = 2\pi \sigma_y \sigma_z U_m C_p,\tag{17}$$

where $C_p$ is the peak concentration from the Gaussian fit, $U_m$ is the mean wind speed, and $\sigma_z$ and $\sigma_y$ are the vertical and lateral plume dispersion coefficients that can be determined from the meteorological conditions, e.g., via the Pasquill–Gifford stability classification curves [128] (see Figure 5). The accuracy of the OTM33A method is explored in [129,130].

**Figure 5.** (**A**) Depiction of Gaussian plume dispersion with an observer making a stationary measurement downwind. (**B**) Resulting time-integrated data with a Gaussian fit applied [23].
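
A minimal sketch of the PSG estimate in Equation (17) follows; the $\sigma_y$, $\sigma_z$ values stand in for a Pasquill–Gifford table lookup at the standoff distance, and all inputs are illustrative assumptions.

```python
import numpy as np

def psg_estimate(C_peak, U_mean, sigma_y, sigma_z):
    """Point source Gaussian estimate, Eq. (17): S_E = 2*pi*sy*sz*Um*Cp.

    sigma_y, sigma_z come from the meteorological conditions, e.g. the
    Pasquill-Gifford stability curves evaluated at the standoff distance.
    """
    return 2.0 * np.pi * sigma_y * sigma_z * U_mean * C_peak

# Example: peak enhancement in g/m^3 with hypothetical dispersion values.
C_p = 1.3e-3   # g/m^3 peak methane enhancement from the Gaussian fit
print(psg_estimate(C_p, U_mean=3.2, sigma_y=5.6, sigma_z=3.4))  # g/s
```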

#### 4.2.2. Conditionally Sampled PSG (PSG-CS)

To capture the ensemble mean of the downwind plume behavior, a dispersion model is used in [88]. The model is a function of downwind distance and dispersion factors *Dy*(*x*, *y*) and *Dz*(*x*, *z*), given as

$$C_m(x, y, z) = \frac{S}{\overline{U}}\, D_y(x, y)\, D_z(x, z).\tag{18}$$

This method essentially aims to determine the source rate, *S*, using the conditional mean concentration data, *Cm*, of the downwind plume. The lateral dispersion downwind of a continuous point source can be shown to have a Gaussian distribution such that it can be represented as

$$D_y(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\left[-\frac{1}{2}\left(\frac{y}{\sigma_y}\right)^2\right].\tag{19}$$

However, the vertical dispersion (assuming vertical eddy diffusivity and wind speed that scales vertically to a power law) can be formulated as a parameterized stretched exponential (originally expressed in [131]),

$$D_z = D_z(x, z) = \frac{A}{\bar{z}} \exp\left[-\left(\frac{Bz}{\bar{z}}\right)^s\right].\tag{20}$$

The parameters $\bar{z}$, *s*, *A*, and *B* are functions of the atmospheric stability and downwind distance, *x*. *A* and *B* can be described using the usual Gamma function, Γ(·), as

$$A = s\,\Gamma(2/s)/\left[\Gamma(1/s)\right]^2,\tag{21}$$

$$B = \Gamma(2/s)/\Gamma(1/s).\tag{22}$$

The conditional averaged concentration can be calculated using

$$\langle C|\theta\rangle = \frac{1}{n} \sum_{\theta_i \in \Theta} C(\theta_i),\tag{23}$$

where the set $\Theta(\theta) = \{\theta_i : |\theta - \theta_i| < \Delta\theta/2,\ \forall i = 1, 2, \ldots, n\}$ and $\Delta\theta = 2°$. The basic idea is to capture the plume geometry in the crosswind direction, which is further used to derive the least squares source estimate,

$$S = \left[\sum\_{i=1}^{N} \frac{D\_y D\_z}{\overline{\mathcal{U}}} \langle \mathbb{C} | \hat{Y}\_i \rangle \right] / \left[\sum\_{i=1}^{N} (\frac{D\_y D\_z}{\overline{\mathcal{U}}})^2 \right]. \tag{24}$$

As shown in [88], the lateral dispersion can be determined in two ways: classically, using atmospheric stability (with constants $a_y$ and $p_y$) [132],

$$\sigma_y = a_y z_0\, 1.9\, (L_x / z_0)^{p_y};\tag{25}$$

and by reconstructing the lateral dispersion,

$$\sigma_y = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \hat{Y}_i^2},\tag{26}$$

where *N* is the number of values in $\langle C|\hat{Y}\rangle$, and the $\hat{Y}_i$ are the crosswind values whose concentrations exceed the minimum concentration (i.e., the background) and that lie within ±40° of the plume center $\theta_p$. The distance $\hat{Y}$ is calculated as

$$\hat{Y}(\theta) = L\_x \sin \left(\theta - \theta\_p\right),\tag{27}$$

with $\theta_p = \arg\max_\theta \langle C|\theta\rangle$ (see Figure 6).

**Figure 6.** (**a**) Polar plot with the wind direction, *θ*, as the radial axis, and the conditionally averaged concentration, $\langle C|\theta\rangle$, as the distance from the center. $\theta_p$ is the peak wind direction located at the maximum conditionally averaged concentration. (**b**) Illustration of the wind direction geometry for conversion of *θ* to crosswind position $\hat{Y}$, with the source plume represented by the dashed lines [88], © 2015 Atmospheric Environment, used with permission.
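
The conditional sampling step of Equations (23), (26), and (27) can be sketched as follows, using the 2° bin width from the text; the wind and concentration series are synthetic and all parameter values are illustrative assumptions.

```python
import numpy as np

def conditional_average(theta_deg, conc, bin_width_deg=2.0):
    """Conditionally averaged concentration <C|theta>, Eq. (23)."""
    bins = np.arange(0.0, 360.0 + bin_width_deg, bin_width_deg)
    idx = np.digitize(theta_deg, bins) - 1
    centers = 0.5 * (bins[:-1] + bins[1:])
    avg = np.array([conc[idx == i].mean() if np.any(idx == i) else np.nan
                    for i in range(len(centers))])
    return centers, avg

# Synthetic plume: peak concentration near a 200 deg wind direction.
rng = np.random.default_rng(1)
theta = (200 + 15 * rng.standard_normal(3000)) % 360
conc = 1.9 + np.exp(-((theta - 200) / 10) ** 2) \
       + 0.05 * rng.standard_normal(3000)
centers, c_avg = conditional_average(theta, conc)
theta_p = centers[np.nanargmax(c_avg)]              # plume center direction
Y_hat = 100.0 * np.sin(np.radians(centers - theta_p))  # Eq. (27), L_x = 100 m
print(theta_p)
```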

#### 4.2.3. Recursive Bayesian Point Source Gaussian Method (PSG-RB)

In work from [133,134], a moving sensor measured a point source concentration that can be formulated as

$$C(x, y, z) = \frac{S}{\overline{U}}\, D_y(x, y)\, D_z(x, z).\tag{28}$$

The source rate is given as *S*, the effective wind speed is $\overline{U}$, and the lateral and vertical dispersion are characterized by $D_y(x, y)$ and $D_z(x, z)$, respectively. The equation is formulated such that the downwind distance, *x*, is aligned with the predominant wind direction. Since the measurement is taken at closer distances to the source, the lateral dispersion is taken as a random function such that

$$\int\_{-\infty}^{\infty} D\_y(x, y) dy = 1.\tag{29}$$

This can be advantageous for instantaneous plumes. The integrated lateral concentration can be written as

$$C^{\mathcal{Y}}(\mathbf{x}, z) = \frac{S}{\overline{\mathcal{U}}} D\_{\overline{z}}(\mathbf{x}, z). \tag{30}$$

The choice of the vertical dispersion *Dz* (originally expressed in [131]) is that of a parameterized stretched exponential function,

$$D\_{\overline{z}} = D\_{\overline{z}}(\mathbf{x}, z) = \frac{A}{\overline{z}} \exp\left[-(\frac{Bz}{\overline{z}})^s\right],\tag{31}$$

where $\bar{z}$, *s*, *A*, and *B* are functions of atmospheric stability and downwind distance, *x*. The lateral dispersion is given as

$$D\_y = \frac{1}{\sqrt{2\pi}\sigma\_y} \exp\left[-\frac{1}{2}(\frac{y\_i}{\sigma\_y})^2\right]. \tag{32}$$

Then, by numerically integrating (28) and incorporating the vehicle movement *V*,

$$C^y = \sum_{i=0}^{\infty} C(x_i, y_i, z_i)\,\Delta t\, V = S \sum_{i=0}^{\infty} \frac{\Delta t\, V}{\overline{U}_i}\, D_z(x_i, z_i)\, D_y(x_i, y_i).\tag{33}$$

The recursive Bayesian approach described here is based on well pads in oil and gas production, whose characteristics are used to help inform the path planning of the mobile sensor. For brevity, we cover only the formulation of the quantification. Starting with the definition of the posterior distribution,

$$p(S|M, W, \Lambda) = \frac{p(S|W)p(M|S, \Lambda)}{p(M|\Lambda)},\tag{34}$$

where *M* is the concentration data, *W* is the ancillary information (e.g., well pad characteristics), Λ is the meteorological conditions, *p*(*S*|*W*) is the prior, *p*(*M*|*S*, Λ) is the likelihood, and *p*(*M*|Λ) is the evidence (which can be thought of as a normalization constant for the likelihood [135]). The prior is given as

$$p(S|\mathcal{W}) = \frac{1}{\beta} \exp\left[ -\left( 1 + \gamma \frac{S - \mu}{\beta} \right)^{-\frac{1}{\gamma}} \right] \left( 1 + \gamma \frac{S - \mu}{\beta} \right)^{-1 - 1/\gamma},\tag{35}$$

where the hyperparameters need to be fit to the application (for well-pad source, *γ* = 1, *μ* = 0.19, *β* = 0.23 based on [136]). The likelihood function is chosen to be a Gaussian,

$$p(M|S,\Lambda) = \frac{1}{\sqrt{2\pi}\sigma\_{\varepsilon}} \exp\left[-\frac{1}{2} \left(\frac{C^y - C^{y,M}}{\sigma\_{\varepsilon}}\right)^2\right],\tag{36}$$

where *Cy*,*<sup>M</sup>* is the modeled concentration for a given source rate, and *σ<sup>e</sup>* is the combined model and measurement error (outlined in [137]). The recursive approach involves replacing the prior with the previous posterior distribution found using the likelihood function,

$$p(S|W)_i = \begin{cases} p(S|W), & i = 1, \\ p(S|M, W, \Lambda)_{i-1}, & i > 1. \end{cases}\tag{37}$$

As the number of passes increases, the posterior distribution improves and can be used to estimate the source rate,

$$S = \int\_{S\_{\min}}^{S\_{\max}} Sp(S|M, W, \Lambda)dS. \tag{38}$$

Variations of this method are seen in [134], where the measurement noise was assumed to be Gaussian; this work also included a UAV with sensor noise and utilized the flux plane mass balance method to estimate the source rate, which was then used in the calculation of the posterior distribution. Further field tests of this method were carried out in [138].
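
The recursion in Equations (34)–(38) can be illustrated on a discretized source-rate axis. In the sketch below, the flat prior, the linear forward model $C^{y,M} = \text{gain} \times S$, and the noise level are simplifying stand-ins for the GEV prior of Equation (35) and the plume model of Equation (33); all numbers are hypothetical.

```python
import numpy as np

def recursive_bayes_update(prior, S_grid, Cy_meas, Cy_model, sigma_e):
    """One pass of the PSG-RB update, Eqs. (34)-(37), on a source-rate grid.

    Cy_model is the modeled lateral-integrated concentration C^{y,M}
    evaluated for each candidate source rate S in S_grid.
    """
    like = np.exp(-0.5 * ((Cy_meas - Cy_model) / sigma_e) ** 2)  # Eq. (36)
    post = prior * like
    post /= np.trapz(post, S_grid)   # normalize (the "evidence")
    return post                       # becomes the next prior, Eq. (37)

S = np.linspace(0.0, 2.0, 400)        # candidate source rates (g/s)
prior = np.full_like(S, 0.5)          # flat initial prior over [0, 2]
gain = 0.8                            # assumed C^y per unit S, from Eq. (33)
for Cy in (0.35, 0.42, 0.38):         # three measurement passes
    prior = recursive_bayes_update(prior, S, Cy, gain * S, sigma_e=0.1)
print(np.trapz(S * prior, S))         # posterior-mean source rate, Eq. (38)
```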

#### 4.2.4. Point Source Gaussian Sequential Bayesian Markov Chain Monte Carlo (PSG-SBM)

Utilizing the Gaussian plume model for the likelihood of a sequential Bayesian Markov Chain Monte Carlo (MCMC) method, a UAV scans horizontally to update the estimated posterior distribution in [139]. The parameters are given as $\Theta_k = [\mathbf{p}_s^T, q_s, u_s, \phi_s, \zeta_s]^T$, where the position is $\mathbf{p}_s$, the source rate is $q_s$, the wind speed and direction are $u_s$ and $\phi_s$, and the model diffusion coefficients are $\zeta_s = [\zeta_{s1}, \zeta_{s2}]^T$. The point source observations, $\mathbf{z}_{1:k} = \{z_1, z_2, \ldots, z_k\}$, are used within Bayes' rule to update the posterior,

$$p(\Theta_{k+1}|\mathbf{z}_{1:k+1}) = \frac{p(z_{k+1}|\Theta_{k+1})\, p(\Theta_{k+1}|\mathbf{z}_{1:k})}{p(z_{k+1}|\mathbf{z}_{1:k})}.\tag{39}$$

The measurement model, $\mathcal{M}(\mathbf{p}_k, \zeta_k)$, in [139] relates to the observational data as $z_k = \mathcal{M}(\mathbf{p}_k, \zeta_k) + v_k$; the likelihood, $p(z_k|\Theta_k)$, was taken to be a detection event if $z_k > z_{thr}$,

$$p(z_k|\Theta_k) = \frac{1}{\sigma_k \sqrt{2\pi}} \exp\left[-\frac{\left(z_k - \mathcal{M}(\mathbf{p}_k, \zeta_k)\right)^2}{2\sigma_k^2}\right],\tag{40}$$

and a non-detection event otherwise,

$$p(z_k|\Theta_k) = \frac{p_b}{2}\left[1 + \mathrm{erf}\left(\frac{z_{thr} - \mu_b}{\sigma_b\sqrt{2}}\right)\right] + p_m + \frac{p_s}{2}\left[1 + \mathrm{erf}\left(\frac{z_{thr} - \left(\mu_b + \mathcal{M}(\mathbf{p}_k, \zeta_k)\right)}{\sigma_b\sqrt{2}}\right)\right].\tag{41}$$

The three terms in the non-detection event account for instrument noise, turbulence, and observing concentrations above the threshold, where $p_b + p_m + p_s = 1$, and $\mu_b$ and $\sigma_b$ are the mean background noise and its standard deviation, respectively. Using a particle filter, the posterior can be approximated by a set of *n* weighted random samples $\{\Theta_k^{(i)}, w_k^{(i)}\}_{i=1}^n$,

$$p(\Theta_k | \mathbf{z}_{1:k}) \approx \sum_{i=1}^n w_k^{(i)}\, \delta(\Theta_k - \Theta_k^{(i)}),\tag{42}$$

where *δ* is the Dirac delta function. The un-normalized weights are then updated using

$$
\overline{w}\_{k+1}^{(i)} = w\_k^{(i)} \cdot p(z\_{k+1} | \Theta\_{k+1}^{(i)}).\tag{43}
$$

Once the weights are determined, they can be normalized by dividing by the sum of all the weights. Additionally, an effective sample size must be considered to avoid the degeneracy problem. The new samples undergo an MCMC step that is accepted with the likelihood probability distribution described earlier (see Figure 7).

**Figure 7.** Example run of the PSG-SBM method at time steps: (**a**) k = 0, (**b**) k = 6, (**c**) k = 16, and (**d**) k = 36. The white lines indicate the path of the UAV from the start (white rectangle) to the UAV's current position (white quadrotor symbol). The black circle is the source location, and the red arrow is the wind direction. The red dots are the random sample approximation of the source parameter estimates at the respective time step [139], © 2019 Field Robotics, used with permission.
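
A minimal particle filter step matching Equations (40), (42), and (43) is sketched below for the detection-event case only; reducing the parameter vector to the source rate alone, and all numeric values, are simplifying assumptions rather than the full method of [139].

```python
import numpy as np

def update_weights(weights, z_k, model_vals, sigma_k):
    """Un-normalized weight update, Eq. (43), with the Gaussian
    detection-event likelihood of Eq. (40)."""
    like = np.exp(-0.5 * ((z_k - model_vals) / sigma_k) ** 2) \
           / (sigma_k * np.sqrt(2.0 * np.pi))
    w = weights * like
    return w / w.sum()                 # normalize by the weight sum

def effective_sample_size(weights):
    """ESS check used to trigger resampling (degeneracy problem)."""
    return 1.0 / np.sum(weights ** 2)

rng = np.random.default_rng(2)
n = 2000
q_particles = rng.uniform(0.0, 2.0, n)   # sampled source rates Theta^(i)
w = np.full(n, 1.0 / n)
z_obs, gain = 0.9, 1.1                   # observation and model gain (hypothetical)
w = update_weights(w, z_obs, gain * q_particles, sigma_k=0.2)
if effective_sample_size(w) < n / 2:     # resample when ESS drops
    q_particles = rng.choice(q_particles, size=n, p=w)
    w = np.full(n, 1.0 / n)
print((w * q_particles).sum())           # weighted source-rate estimate, Eq. (42)
```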

#### 4.2.5. Near-Field Gaussian Plume Inversion (NGI)

The near-field Gaussian plume inversion (NGI) method [140,141] is, in principle, a mass continuity model in which upwind and downwind concentration measurements of an emission source, combined with wind measurements, are differenced to quantify the emission flux. The NGI method typically samples around 100 m from the source. The sampling aims to capture the time-invariant behavior of the plume, which, under turbulent conditions, may not map out the characteristic Gaussian plume shape.

This is because spatial variability in the time-averaged plume is assumed to be Gaussian. This method was initially carried out with a DJI S900 equipped with an ultra-portable greenhouse gas analyzer (UGGA) by Los Gatos Research Inc. (LGR). The flux estimate is derived by fitting the experimentally measured flux values, $q_{me}$, to the modeled flux values, $q_{m0}$, given as

$$q_{me} = (C - C_b)\, U(z)\, \rho,\tag{44}$$

where the modeled flux is given by the Gaussian model,

$$q_{m0} = \frac{F_e}{2\pi\sigma_y(x)\sigma_z(x)} \exp\left(\frac{-(y - y_c)^2}{2\sigma_y(x)^2}\right) \left(\exp\left(\frac{-(z - h)^2}{2\sigma_z(x)^2}\right) + \exp\left(\frac{-(z + h)^2}{2\sigma_z(x)^2}\right)\right).\tag{45}$$

The lateral and vertical dispersion relations are typically looked up in the PGT stability tables; in this method, however, they are assumed to be linearly proportional to the downwind distance,

$$\tau_y = \sigma_y(x)/x, \quad \tau_z = \sigma_z(x)/x.\tag{46}$$

Fitting (45) directly is not always well constrained, and thus the method proposes to separate (45) and fit the model along the z-direction,

$$q_{me,y} = q_{me} \frac{\tau_z x \sqrt{2\pi}}{\exp\left(\frac{-(z-h)^2}{2(\tau_z x)^2}\right) + \exp\left(\frac{-(z+h)^2}{2(\tau_z x)^2}\right)}.\tag{47}$$

The spatial variability in the z-direction has to be sampled to determine $\tau_z$. The lateral spatial variability $\tau_y$ and plume center $y_c$ are determined simultaneously,

$$y_c = \frac{\sum_j q_{me,y_j}\, y_j}{\sum_j q_{me,y_j}},\tag{48}$$

$$\tau_y = \sqrt{\frac{\sum_j q_{me,y_j} \left(\frac{y_j - y_c}{x_j}\right)^2}{\sum_j q_{me,y_j}}}.\tag{49}$$

Once the unknowns $\tau_z$, $\tau_y$, and $y_c$ are determined, the source emission rate, *F*, can be estimated by minimizing the least squares difference between $q_{me}$ and $q_{m0}$ with respect to $F_e$. The uncertainty in *F* and the impact of limiting $\tau_z$ are given in [140].
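
Because $q_{m0}$ in Equation (45) is linear in $F_e$ once $\tau_y$, $\tau_z$, and $y_c$ are fixed, the least squares fit admits a closed form. The sketch below assumes exactly that simplification, with synthetic sample positions and an assumed true rate used only as a self-check.

```python
import numpy as np

def plume_shape(y, z, x, h, tau_y, tau_z, y_c):
    """Gaussian plume shape g(x, y, z) so that q_m0 = F_e * g, per Eq. (45)."""
    sy, sz = tau_y * x, tau_z * x
    return (np.exp(-(y - y_c) ** 2 / (2 * sy ** 2))
            * (np.exp(-(z - h) ** 2 / (2 * sz ** 2))
               + np.exp(-(z + h) ** 2 / (2 * sz ** 2)))
            / (2 * np.pi * sy * sz))

def ngi_source_rate(q_me, g):
    """Closed-form least squares fit: min_F ||q_me - F g||^2."""
    return np.sum(q_me * g) / np.sum(g * g)

# Synthetic check: fabricate q_me from a known rate (2.0 g/s) plus noise.
rng = np.random.default_rng(7)
y, z = rng.uniform(-20, 20, 300), rng.uniform(0.5, 15, 300)
x = np.full(300, 100.0)
g = plume_shape(y, z, x, h=3.0, tau_y=0.1, tau_z=0.06, y_c=0.0)
q_me = 2.0 * g + rng.normal(0.0, 1e-4, 300)
print(ngi_source_rate(q_me, g))   # recovers ~2.0
```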

#### *4.3. Mass Balance Based*

The mass balance approach aims to estimate an emission source by balancing the mass flux leaving or entering a control volume. Generally, there are two path planning approaches to the mass balance method: (1) rectangular vertical flux plane (or curtain) downwind of the source and (2) a cylindrical flux plane enclosing the source. For a well behaved plume under stable atmospheric conditions, the downwind plume contains all the flux. The sampling distance from the source may vary based on each submethod. The measured flux plane data can be sparse and is typically subject to spatial interpolation.

#### 4.3.1. Vertical Flux Plane (VFP)

The flux plane method generally involves sampling within a plane, vertically or horizontally, upwind and downwind of an emission source. It has been applied in several works [9,33,37,66,76,78,120,142–149]. The plane is typically sampled using a raster-scanning approach, capturing the plume within the width and height of the plane. The emission rate (in mol s⁻¹) can be estimated as

$$Q\_{\mathcal{L}} = \int\_{0}^{z} \int\_{A}^{B} n\_{ij} (\mathbb{C} - \mathbb{C}\_{b}) \mathbf{u} \cdot \mathbf{n}\_{f} dx dz,\tag{50}$$

where $n_{ij}$ is the mole density of air (given standard temperature and pressure), $(C - C_b)$ is the enhanced mole fraction (referenced to air), $C_b$ is the background mole fraction, **u** is the wind speed vector, and $\mathbf{n}_f$ is the flux plane normal vector (see Figure 8). Since the measurements are sparse, the integral must be evaluated from irregularly spaced samples. To combat this, the sparsely sampled points are spatially interpolated using techniques such as inverse distance weighting (IDW) [150] or kriging [151]. This is a common problem in geostatistics: estimating unknown data, $z(s_0)$, at desired spatial locations $s_0$ in a domain $\Omega \subset \mathbb{R}^2$, using only *N* sparse sampling points, $z(s_i)$, and some optimal weights, $\lambda_i$,

$$\hat{z}(s_0) = \sum_{i=1}^{N} \lambda_i\, z(s_i).\tag{51}$$

For example, in ordinary kriging [151], a semivariogram is used to model the spatial variability and, given a spatial distance, *h*, is defined as,

$$\gamma(h) = \frac{1}{2N(h)} \sum\_{i=1}^{N(h)} (z(s\_i) - z(s\_i + h))^2. \tag{52}$$

This experimental semivariogram can be fitted to the model semivariogram with one of several common functions: circular, spherical, exponential, Gaussian, or linear. The weights are determined by solving

$$\sum_{j=1}^{N} \lambda_j\, C(\mathbf{s}_i - \mathbf{s}_j) + \mu(\mathbf{s}_0) = C(\mathbf{s}_i - \mathbf{s}_0), \quad \text{for } i = 1, 2, \ldots, N,\tag{53}$$

where *C*(·), in this context, represents the point support covariance matrix. This matrix is related to the semivariogram via $\gamma(h) = C(0) - C(h)$ [151], and the mean square prediction error is $\sigma_e^2 = \mathrm{Var}(z(s_0) - \hat{z}(s_0))$, which, for ordinary kriging, is minimized to make the estimated values $\hat{z}(s_0)$ optimal. Furthermore, the estimator should be unbiased (e.g., $E[\hat{z}(s_0)] = E[z(s_0)]$), which requires $\sum \lambda_i = 1$ and the spatial mean to be stationary, $E[z(s)] = \mu,\ \forall s \in \Omega$.

If the kriging process is not stationary, it is considered, at best, an approximate solution to the spatial interpolation problem and incorrect at worst. A better approach could be to apply a spectral method that accounts for non-stationarity and higher frequencies, namely the high frequency kriging method [152]. Temporal observations could be considered as well; see quantile kriging in [153].

An enhanced version of IDW was proposed in [154] to include an adaptive distance-decay parameter based on the density characteristics of the sampled points. Available tools, such as Kriging Assistant (KA) [155], Golden Software Surfer, or ESRI Geostatistical Analyst for ArcMap, have been used in the literature. For irregular geographical units with different sizes and shapes, the interested reader should consult [156].
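
As an illustration of the interpolate-then-integrate workflow of Equations (50) and (51), the sketch below uses IDW (the simpler of the two interpolators mentioned) on synthetic flux plane samples; the plume shape, mole density, and wind speed are illustrative assumptions.

```python
import numpy as np

def idw(points, values, grid_pts, power=2.0, eps=1e-9):
    """Inverse distance weighting, one realization of Eq. (51)."""
    d = np.linalg.norm(grid_pts[:, None, :] - points[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power
    w /= w.sum(axis=1, keepdims=True)   # weights sum to one per grid point
    return w @ values

# Sparse raster-scan samples of enhancement (C - Cb) on a 40 m x 20 m plane.
rng = np.random.default_rng(3)
pts = rng.uniform([0, 0], [40, 20], size=(60, 2))       # (y, z) positions [m]
vals = np.exp(-((pts[:, 0] - 20) ** 2 / 50 + (pts[:, 1] - 5) ** 2 / 10))
yy, zz = np.meshgrid(np.linspace(0, 40, 81), np.linspace(0, 20, 41))
grid = np.column_stack([yy.ravel(), zz.ravel()])
C_enh = idw(pts, vals, grid)
# Eq. (50), discretized: Q = n_air * sum(C_enh * u_normal * dA).
dA = (40 / 80) * (20 / 40)
print(41.6 * np.sum(C_enh * 3.0 * dA))  # n_air ~41.6 mol/m^3, u_n = 3 m/s
```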

A variation of the VFP technique is illustrated in [157], where a path-averaged, long open path dual-comb spectroscopy instrument is operated from a ground vehicle to a sUAS carrying a retro-reflector. A vertical profile is flown downwind of the source to conduct the VFP. This technique is also very similar to VRPM.

**Figure 8.** Demonstration of using sampled flux plane data and applying kriging to it for spatial interpolation [11].

#### 4.3.2. Cylindrical Flux Plane (CFP)

A variation of the VFP is the cylindrical flux plane (CFP). This method has been used with manned aircraft, as it is not easy for them to raster-scan a rectangular flux plane. The methodology is essentially very similar to the VFP and can be found in the work by [158]; it is omitted here for brevity.

#### 4.3.3. Path Integrated Vertical Flux Plane (PI-VFP)

A variation of the VFP is the path integrated vertical flux plane (PI-VFP). This method utilizes a bs-TDLAS approach in which the instrument points straight down and scans or circles the emission source (see Figure 9). In [159], the AVIRIS-NG manned aircraft used the IMAP-DOAS technique to retrieve methane concentrations and estimated fluxes using a PI-VFP type calculation. This approach was compared with the GDT and Gaussian inverse approaches during a joint flight campaign.

**Figure 9.** Example of the PI-VFP strategy via a UAV sensing in circular trajectories, with (**a**) an internal leak producing a net positive flux and (**b**) an external leak producing a net zero flux. The color of the arcs is indicative of methane flux strength, with green being more negative and red being more positive [121].

The emission rates were estimated by $Q \approx \mathbf{u} \cdot \mathbf{n} \sum_i V_i \Delta s_i$, where $V_i$ represents the vertically integrated concentration, and $\Delta s_i$ is a path segment along the boundary. The individual measurements are integrated together (referred to as the integrated methane enhancement (IME)) such that $\mathrm{IME} = k \sum_i X_{CH_4}(i) \cdot S(i)$. The value $X_{CH_4}$ is the methane plume enhancement exceeding the minimum threshold of 200 ppm m, and *k* is a conversion factor.

Using an RMLD sensor fitted to a small quadrotor UAV, a circular scanning approach can be applied to horizontally sample a site of interest. The sensor uses bs-TDLAS to measure integrated methane emissions from a known height. The resulting measurements are then combined with wind measurements to estimate the flux [121,160],

$$q = \int_0^H \int_{-W/2}^{W/2} \mathbf{u} \cdot \hat{\mathbf{n}}\, (X - X_b)\, dx\, dz,\tag{54}$$

where *H* and *W* are the vertical and lateral dimensions, and *Xb* is the background concentration. This calculation encompasses a single circular loop and if the source is encapsulated, multiple passes can be used to estimate the source,

$$Q = \frac{1}{n} \sum q\_i. \tag{55}$$

In practice, the circular flight path is actually made up of line segments that are box-like. The source location was also identified by coarse raster scanning over the area of interest, followed by a finer, free-pattern flight combined with triangular natural neighbor interpolation. The maximum observed concentration was used for the source location.

#### 4.3.4. Micrometeorological Mass Difference (MMD)

Utilizing the technique from [161], sampling the plume far enough downwind of the source, the averaged MMD can be calculated as

$$Q = \overline{\iint \mathcal{U}\_{(y,z)}(\rho\_{(y,z)} - \rho\_b) dydz} = \int \chi(z) dz. \tag{56}$$

The work in [162] utilized the time-average of the line-integral of the instantaneous product of *U* and *ρ* in the y-direction. Alternatively, while using a laser fetch, an instantaneous product of a single wind measurement *U* and line-averaged laser concentration was used,

$$
\chi \approx \Delta y \overline{\mathcal{U}\_{(z)}(\rho\_{L(z)} - \rho\_b)}.\tag{57}
$$

This method can also be used to calculate the turbulent fluxes,

$$\frac{Q_{tur}}{Q} = \frac{Q_{\overline{U\rho}} - Q_{\overline{U}\,\overline{\rho}}}{Q_{\overline{U\rho}}},\tag{58}$$

where $Q_{\overline{U\rho}}$ is calculated from the flux term in (57) and $Q_{\overline{U}\,\overline{\rho}}$ from (59),

$$
\chi \approx \Delta y \overline{\mathcal{U}\_{(z)}} \overline{(\rho\_{L(z)} - \rho\_{\mathcal{b}})}.\tag{59}
$$

This prescription of the flux does not capture the turbulent component of the horizontal flux; although strictly incorrect, it is often necessary due to the short time-scale behavior of the wind (e.g., limitations in wind measurement devices).

#### 4.3.5. Gauss Divergence Theorem (GDT)

In the paper by Conley et al. [163], the focus is on the continuity equation,

$$Q_{\mathcal{L}} = \left\langle \frac{\partial m}{\partial t} \right\rangle + \iiint \nabla \cdot (c\mathbf{u})\, dV,\tag{60}$$

where *m* is the mass of the aerosol, ⟨·⟩ is the expectation or average, $c = \bar{c} + c'$ is the concentration (composed of an average term and a deviation term), **u** is the wind speed, and *V* is the volume of the area of interest. The flux divergence can be expanded as

$$
\nabla \cdot \boldsymbol{c} \mathbf{u} = \mathbf{u} \cdot \nabla \boldsymbol{c} + \boldsymbol{c} \nabla \cdot \mathbf{u}.\tag{61}
$$

The surface integral is taken to be a cylinder, which can be broken into several parts: the floor, the walls of the cylinder, and the top. The height of the cylinder is taken such that the emission is encapsulated with the minimum and maximum height. The resulting emission rate can be calculated as

$$Q_{\mathcal{L}} = \left\langle \frac{\partial m}{\partial t} \right\rangle + \int_0^{z_{max}} \oint c'\, \mathbf{u}_h \cdot \hat{\mathbf{n}}\, dl\, dz,\tag{62}$$

where *z* represents the altitude and *l* the flight path. The temporal trend of the total mass, $\partial m/\partial t$, within the volume can be estimated from the measurements. The cylinder passes can be vertically binned and discretely summed,

$$Q\_{\mathcal{L}} = \frac{\Delta m}{\Delta t} + \sum\_{z=0}^{z=Z\_l} \left(\sum\_{0}^{L} \rho \cdot u\_n\right) \cdot \Delta z. \tag{63}$$
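A discrete rendering of Equation (63) might look like the following sketch, with two altitude bins of synthetic along-loop samples; the explicit path segment length `dl` is added here for dimensional bookkeeping, and all values are hypothetical.

```python
import numpy as np

def gdt_emission(conc_dev_per_level, u_n_per_level, dl, dz, dm_dt=0.0):
    """Discrete Gauss divergence estimate following Eq. (63).

    conc_dev_per_level[k]: concentration deviations c' sampled along the
                           closed loop at altitude bin k (kg/m^3)
    u_n_per_level[k]:      outward-normal horizontal wind at the same
                           samples (m/s)
    dl, dz:                along-path and vertical bin sizes (m)
    dm_dt:                 estimated storage term <dm/dt> (kg/s)
    """
    per_level = [np.sum(c * u) * dl
                 for c, u in zip(conc_dev_per_level, u_n_per_level)]
    return dm_dt + np.sum(per_level) * dz

# Two altitude bins of synthetic along-loop samples (illustrative only).
rng = np.random.default_rng(4)
c_dev = [rng.normal(1e-6, 2e-7, 200), rng.normal(5e-7, 1e-7, 200)]
u_n = [rng.normal(0.5, 0.2, 200), rng.normal(0.4, 0.2, 200)]
print(gdt_emission(c_dev, u_n, dl=5.0, dz=10.0))   # kg/s
```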

#### 4.3.6. Vertical Flux Planes with GLM (GLM-VFP)

In [34], a 3D grid of airborne measurements is collected across multiple landfill sites. The resulting downwind observational points are then spatially interpolated with IDW and used to calculate the total mass flux. Multiple steady state Gaussian dispersion models,

$$C(x, y, z) = \frac{Q}{2\pi\sigma_y\sigma_z U} \exp\left(\frac{-y^2}{2\sigma_y^2}\right) \exp\left[\frac{-(z - L)^2}{2\sigma_z^2}\right],\tag{64}$$

are applied to a fixed grid (50 by 50 m), where the mixing ratios found over each individual landfill were used to calculate a modeled mass flux (for each site, integrated along the x, y, and z directions). The experimental measurements are then used with the simulation measurements and a general linear model,

$$\min\_{\alpha} \left| MF - \sum\_{i=1}^{\max} (MMF\_i \cdot \alpha\_i) \right| \,, \tag{65}$$

to approximate the emission coefficients, $\alpha_i$, for the multiple landfill sources. The emission findings are further corroborated with a local eddy covariance tower measurement.

#### 4.3.7. Vertical Radial Plume Mapping (VRPM)

The vertical radial plume mapping approach (compared with other methods in [77]) utilizes a long path TDLAS instrument from the ground. The laser is aimed at retro-reflectors situated perpendicular to, and downwind of, the source. The heights of the retro-reflectors constitute the different radial angles at which the path-integrated concentrations are measured and combined with the normal wind component to estimate the flux (similar to VFP or MMD). An illustration of this is seen in Figure 10.

**Figure 10.** A diagram of the VRPM method [77].

#### *4.4. Imaging-Based*

In this section, we overview the imaging-based methodology for quantifying methane emissions. This typically includes techniques that sample images passively, such as TIR, MWIR, or other OGI-based instrumentation. The methods mentioned here that can quantify methane emissions are considered as quantitative optical gas imaging (QOGI).

#### 4.4.1. Mid-Wave Infrared (MWIR) and Hyperspectral

In the work by [164], the detection limits of the MWIR band of hyperspectral data were explored using the Spatially-Enhanced Broadband Array Spectrograph System (SEBASS) airborne instrument. The authors also provided a comparison between LWIR and MWIR (see Figure 11) using the radiative transfer model,

$$R\_s = \left(R\_T^\uparrow + R\_S^\uparrow\right) + t\left\{\epsilon\_s B(T\_s) + (1 - \epsilon\_s)\left[\frac{R\_T^\downarrow + R\_S^\downarrow}{1 - S(1 - \epsilon\_s)}\right]\right\},\tag{66}$$

where $R_s$ is the total radiance at the sensor, $R_T^\uparrow$ is the upwelling emitted atmospheric path radiance, $R_T^\downarrow$ is the downwelling emitted atmospheric path radiance, $R_S^\uparrow$ is the scattered path radiance at the sensor, $R_S^\downarrow$ is the total solar radiance that reaches the surface, *t* is the atmospheric transmittance, $\epsilon_s$ is the surface emissivity, and $B(T_s)$ is the blackbody radiation at the surface temperature.

**Figure 11.** Methane plume detections in the (**A**) MWIR and (**B**) LWIR ranges [164].

Other works, such as [98], have used MWIR cameras combined with two Pergam Methane Mini G lasers in pipeline leak detection. In [165], a FLIR GF320 and a RMLD were used together to make volumetric flow rate calculations in the laboratory using a data fusion approach. In [166], they utilized a thermal camera and steady state energy balance approach to estimate methane emissions from thermal anomalies in urban landfills.

#### 4.4.2. Iterative Maximum a Posteriori Differential Optical Absorption Spectroscopy (IMAP-DOAS)

The IMAP-DOAS method was applied to the AVIRIS-NG [30,31] aircraft, which measures reflected solar radiation between 0.35 μm and 2.5 μm with 5 nm spectral resolution and sampling, using a nonlinear iterative minimization of the differences between modeled and measured radiance. The retrieved concentrations can be applied to the PI-VFP method to calculate fluxes [159]. Variations of this approach for retrieving methane concentrations have been seen in [167] for albedo correction and in [39] for anomaly-based mass balance.

#### *4.5. Correlation-Based*

#### 4.5.1. Tracer Correlation (TCM)

The tracer correlation method, or isolated source tracer ratio method, initially proposed and implemented in works by Lamb et al. [168] and Czepiel et al. [75], aims to quantify the emission rate of an unknown gas species by releasing a tracer gas at a known flow rate while measuring both the tracer and the unknown signals collocated downwind. This method assumes that the location of the source is known and that, at the measurement location, the plume is well mixed. The elevated signal downwind also typically needs to be greater than 50 ppb. The authors report uncertainty estimates of ±15%. The general equation is given as

$$Q\_m = Q\_t \frac{\mathcal{C}\_m}{\mathcal{C}\_t} \, \tag{67}$$

where $Q_t$ is the tracer release rate, and $C_m$ and $C_t$ are the elevated mixing ratios of the unknown source gas and tracer gas, respectively. A comparison study between TCM and other fugitive emission quantification methods is given in [77]. The effect of wind on the accuracy of TCM for landfills was explored using the WRF model [169]. An in situ method was used to evaluate the collection efficiency of gas extraction wells based on tracer gas [170].

Variations of TCM quantification were explored in [28], which quantified emission rates based on plume integration of a transect, on the peak height of the transect using a scatter plot to calculate the ratio (best-fit line), and by comparison with a fitted Gaussian plume model. In a landfill field study, modeled methane emissions were compared against emissions measured using TCM [171]. The TCM method was also applied to quantifying emissions from dairy farms in [123].

A dual tracer method was explored in [172]. The second tracer allows for closer downwind measurements that can be refined by assessing the plume position; in far-field measurements, the second tracer becomes an internal standard for the measurement. A mobile version of the TCM approach was proposed in [173].
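
A sketch of Equation (67), using the best-fit-line variant from [28] (the ratio taken as the slope of methane versus tracer enhancements across a transect), is shown below with synthetic data; the release rate and signal levels are hypothetical.

```python
import numpy as np

def tracer_ratio_rate(Q_tracer, c_methane, c_tracer):
    """Tracer correlation estimate, Eq. (67): Q_m = Q_t * (C_m / C_t).

    The ratio C_m/C_t is taken as the slope of the best-fit line through
    collocated downwind enhancements (scatter-plot variant of [28]).
    """
    slope, _ = np.polyfit(c_tracer, c_methane, 1)
    return Q_tracer * slope

# Synthetic transect: tracer released at 1.5 g/s, methane ~2x tracer signal.
rng = np.random.default_rng(5)
ct = np.abs(rng.normal(0.2, 0.08, 200))         # tracer enhancement
cm = 2.0 * ct + rng.normal(0.0, 0.01, 200)      # methane enhancement
print(tracer_ratio_rate(1.5, cm, ct))           # ~3.0 g/s
```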

#### 4.5.2. Eddy Covariance (EC)

The Eddy covariance method aims to estimate the emission flux from a footprint area given the boundary layer meteorology. Historical developments and current implementations of this method are summarized in [174]. This method generally assumes stationarity of the measured data and fully developed turbulent conditions [175]. One way it can be expressed is,

$$Q = \frac{1}{t_f - t_i} \int_{t_i}^{t_f} \left(C(t) - \overline{C}\right)\left(w(t) - \overline{w}\right)dt,\tag{68}$$

where *C*(*t*) and *w*(*t*) are the concentration and vertical wind speed, and the overbars denote their time averages. Several assumptions are required to make this flux calculation valid.
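A minimal discrete implementation of Eq. (68) is sketched below on synthetic data; the sampling rate, averaging window, and signal values are illustrative:

```python
import numpy as np

def eddy_covariance_flux(c, w):
    """Discrete form of Eq. (68): time-averaged covariance of the
    concentration c(t) and vertical wind speed w(t) fluctuations.
    Assumes stationarity and fully developed turbulence over the window."""
    c = np.asarray(c, dtype=float)
    w = np.asarray(w, dtype=float)
    return np.mean((c - c.mean()) * (w - w.mean()))

# Synthetic 10 Hz series over a 30 min averaging window (illustrative)
rng = np.random.default_rng(1)
n = 10 * 60 * 30
w = rng.normal(0.0, 0.3, n)                   # vertical wind (m/s)
c = 1.9 + 0.05 * w + rng.normal(0, 0.01, n)   # concentration correlated with updrafts
print(f"Flux estimate: {eddy_covariance_flux(c, w):.4e} (conc. units * m/s)")
```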

#### **5. Analysis of Methods and Assessment**

In an attempt to analyze the methods covered in this paper, we decided to use the following metrics: required assumptions, sample distance, survey time, complexity, average precision, average accuracy, and average cost. The required assumptions are meant to inform the practitioner so that the best method can be applied to a given problem. For example, if the source location is unknown, the PSG method may not be directly applicable unless a source location estimate is supplied. The sample distance is defined as the distance from the source at which the method requires measurements to be taken.

The survey time consists of the time required to make a single flux estimate. Understandably, some methods may require multiple flux estimates in order to approximate the emission source to within an acceptable error. Complexity is a measure of how difficult it is to implement any given method. In order to determine a value for complexity, a scheme was developed using figures of merit (FOM) that assign factors and weights to the metrics (detailed in Table 1). Determining the values for these factors was based on loose estimates inferred from papers found in the literature.
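A minimal sketch of such a weighted figure-of-merit score is given below; the factor names, weights, and example values are hypothetical stand-ins, since Table 1 is not reproduced here:

```python
# Hypothetical figures-of-merit scheme in the spirit of Table 1; the
# actual factors and weights in the paper may differ.
FOM_WEIGHTS = {
    "operator_skill": 0.25,
    "num_operators": 0.15,
    "equipment_cost": 0.30,
    "setup_time": 0.15,
    "survey_time": 0.15,
}

def complexity_score(factors: dict) -> float:
    """Weighted sum of factor scores, each given on a 0-10 scale."""
    return sum(FOM_WEIGHTS[k] * v for k, v in factors.items())

# Example: a sUAS-based VFP setup vs a static bLS tower (illustrative values)
vfp_suas = {"operator_skill": 5, "num_operators": 2, "equipment_cost": 4,
            "setup_time": 2, "survey_time": 2}
bls_static = {"operator_skill": 7, "num_operators": 3, "equipment_cost": 6,
              "setup_time": 8, "survey_time": 9}
print(f"VFP (sUAS) complexity:  {complexity_score(vfp_suas):.2f}")
print(f"bLS (static) complexity: {complexity_score(bls_static):.2f}")
```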

Ranges were assigned to the metrics to capture variations in the factors due to either the operators or the equipment being used, and are given in Table 2. For example, some setups may use more expensive equipment or more people for the same method and, as a result, are reflected in the complexity metric.


**Table 1.** Figures of merit for defining complexity of an estimation method.

Evaluating methane quantification techniques is important, and much work has already gone into this topic through controlled release experiments and evaluation frameworks. Examples from controlled release facilities (CRF) include, but are not limited to, the following:

In the Joint Urban 2003 study [176,177], static sensors were distributed in an urban setting to measure the dispersion of tracer particulates. In [178], area-averaged velocity and turbulent kinetic energy profiles were derived from data collected at the Mock Urban Setting Test (MUST). MUST was also evaluated with photo-ionization detectors (PIDs) [179,180] and was further simulated using MISKAM 6 [181]. In [182], the WRF model was used to model wind and turbulence inside the Quick Urban and Industrial Complex (QUIC) model for comparing simulated and observed plume transport. A test plan for Jack Rabbit II was developed in [183], which aimed to improve chemical hazard modeling, produce better planning for release incidents, improve emergency response, and improve mitigation measures.

More recently, single-blind tests at the Methane Emission Technology Evaluation Center (METEC) in Fort Collins, Colorado evaluated several types of LDAQ sensing modalities (such as by vehicle, plane, and drone, shown in Figures 12 and 13) as a part of the Stanford/EDF Mobile Monitoring Challenge (MMC) and the Advanced Research Projects Agency-Energy (ARPA-E) MONITOR program. In the Stanford/EDF MMC, it was observed that the drone-based technologies performed quite well (e.g., SeekOps) with an *R*<sup>2</sup> = 0.42 [144].

While the results shown in Figure 12 seem quite promising, there is still room for improvement in precision. In the ARPA-E MONITOR program, 6 of the 11 participants tested their technologies at the METEC facility in [184] against six other industry-based participants. Due to confidentiality agreements at the time of testing, the data gathered from the 12 participants were aggregated to compare the methodologies based on measurement type (handheld, mobile, and continuous monitoring). However, to the best of the authors' knowledge, only four of the MONITOR program participants have published data regarding the METEC tests (shown in Figure 13).

In a white paper by Bridger Photonics, a sUAS-based approach using a LiDAR-based sensor also demonstrated promising results, although the uncertainty is not given. In [121], an RMLD was used on a sUAS with the PI-VFP method. In contrast, ref. [185] utilized a portable TDLAS-based instrument and the PSG method to quantify emissions. Lastly, ref. [186] used a dual frequency comb spectrometer (from over one kilometer away) with the non-zero minimum bootstrap method (see [187]) and the Gaussian plume model to estimate the source rate.
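Several of the approaches above rely on a Gaussian plume forward model; a minimal textbook form is sketched below, with hypothetical dispersion widths standing in for the stability-class parameterizations the cited works would use:

```python
import numpy as np

def gaussian_plume(Q, u, y, z, H, sigma_y, sigma_z):
    """Steady-state Gaussian plume concentration (kg/m^3) at crosswind
    offset y and height z, for a point source of rate Q (kg/s) at effective
    height H with mean wind speed u (m/s). sigma_y and sigma_z are the
    dispersion widths (m), which in practice grow with downwind distance
    according to the atmospheric stability class. Includes ground
    reflection (the z + H image term); a simplified textbook form."""
    lateral = np.exp(-y**2 / (2 * sigma_y**2))
    vertical = (np.exp(-(z - H)**2 / (2 * sigma_z**2))
                + np.exp(-(z + H)**2 / (2 * sigma_z**2)))
    return Q / (2 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Illustrative: 1 g/s source at 3 m height, 3 m/s wind, receptor on the
# plume centerline at 2 m height, with widths typical of ~100 m downwind
c = gaussian_plume(Q=1e-3, u=3.0, y=0.0, z=2.0, H=3.0, sigma_y=8.0, sigma_z=5.0)
print(f"Concentration: {c:.3e} kg/m^3")
```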

Examples from active operations with comparison to conventional OGI-based methods were conducted in the Alberta Methane Field Challenge (AMFC) [147,188,189], which aimed to answer two questions: are leak detection and repair (LDAR) programs effective at reducing methane emissions, and can new technologies provide more cost-effective leak detection compared to existing approaches?

**Figure 12.** METEC results from the Stanford/EDF Mobile Monitoring Challenge [144].

**Figure 13.** Published METEC results of ARPA-E MONITOR program participants from: (**a**) [185] using a static on-site portable TDLAS and the PSG method, (**b**) [186] using the dual frequency comb spectrometer and non-zero minimum bootstrap (NZMB) method [187] with the Gaussian plume model, reprinted (adapted) with permission from [186], © 2019 American Chemical Society, (**c**) Bridger Photonics' group white paper using Gas Mapping LiDAR [190], used with permission, © 2019 Bridger Photonics, Inc., and (**d**) [121] using the bs-TDLAS and PI-VFP methods.


**Table 2.** Summary of methods and their assumptions, operational details, complexity, cost, average precision, and average accuracy, generalized over implementations.

In order to compare the performances of each of the methods, their performance metrics were garnered from different studies where the method was utilized in either a field study or a controlled release scenario and recorded in Table 2. Precision values were gathered from the standard deviations of consecutive flux estimates of a single-source leak scenario. Accuracy pertains to the error of the flux estimate relative to the known source rate.

This information was limited primarily to controlled release scenarios. For each method, performances and details were separated into the broad types of sampling strategies: fixed/static, on foot, mounted on a vehicle, mounted on an aircraft, and mounted on a sUAS. This prevents conflation of performance values between, for example, long aircraft sampling flights at far distances and short sampling flights near the source via sUAS.

#### **6. Summary of Methods**

After analyzing the quantification methods, we can separate them based on whether they have used sUAS or not. In this manuscript, we observed that the sUAS-based methods consist of the near-field Gaussian plume inverse (NGI), vertical flux plane (VFP), and path-integrated vertical flux plane (PI-VFP). The non-sUAS-based methods consist of backwards Lagrangian Stochastic (bLS), point source Gaussian (PSG), recursive Bayesian point source Gaussian (PSG-RB), conditionally sampled point source Gaussian (PSG-CS), micrometeorological mass difference (MMD), Gauss divergence theorem (GDT), VFP, PI-VFP, cylindrical flux plane (CFP), general linear model vertical flux plane (GLM-VFP), vertical radial plume mapping (VRPM), quantitative optical gas imaging (QOGI), tracer correlation method (TCM), and Eddy covariance (EC).

When comparing their performances in Table 2, it can be seen that, when categorizing by means of mobility (i.e., fixed, on-foot, etc.), methods using static sensors show a trend of having higher complexity values, while UAV-based methods display generally lower complexity values. For a subset of the methods, the survey times, sample distances, and average accuracies can be seen in Figure 14. This subset was displayed specifically because these methods had both upper and lower bounds for survey times and sample distances along with accuracy data, which allowed these quantities to be plotted for each method as ellipses on a log–log plot.

When analyzing this plot, it can be seen that the sUAS-based methods are generally lower in sample distances and survey times, whereas the aircraft-based methods are among the highest in both. The bLS and TCM methods are shown to have the best average accuracy, with several sUAS and mobile methods close in accuracy. The long sample times of the bLS method are due to the values reported in [191], and it is possible that these values do not reflect typical bLS sample times. The advantages and disadvantages of each of the methods, along with the typical application fields in which they were applied, can be seen in Table 3.

The final ranking of the methods depends heavily on the desired application, which in turn depends on factors such as sample distance, sample time, and desired accuracy. For that reason, it is difficult to rank the methods in general. Thus, we provide a ranking of the methods in terms of complexity (outlined in Table 1), with highlights from the precision and cost in Figure 15. The results indicate that the simplest methods, in terms of complexity, are the sUAS-based NGI<sup>5</sup> and VFP<sup>5</sup> as well as the fixed QOGI<sup>1</sup>.

The most complex methods include bLS<sup>1</sup> and manned aircraft-based approaches. In terms of precision, bLS<sup>1</sup>, NGI<sup>5</sup>, GDT<sup>4</sup>, VFP<sup>5</sup>, QOGI<sup>1</sup>, TCM<sup>3</sup>, and EC<sup>1</sup> tend to be the best. Thus, for sUAS-based methods, NGI<sup>5</sup> and VFP<sup>5</sup> are the most promising approaches. Additionally, the GDT<sup>4</sup>, TCM<sup>3</sup>, and EC<sup>1</sup> approaches can be treated as candidate methods for future implementation using sUAS.

**Figure 14.** Summary of the methods (based on Table 2) showing the relationship between typical survey time and sample distance and their associated normalized accuracy, where lower values represent more accurate measurements (measured using <sup>1</sup>fixed, <sup>2</sup>foot, <sup>3</sup>vehicle, <sup>4</sup>manned aircraft, or <sup>5</sup>UAV).

**Figure 15.** Complexity ranking of the methods (based on Table 1), showing the relationship between method complexity (red), precision (green), and cost (blue). The precision is normalized on the source estimate and multiplied by 10, and the cost is ranked from 0 to 10 (measured using <sup>1</sup>fixed, <sup>2</sup>foot, <sup>3</sup>vehicle, <sup>4</sup>manned aircraft, or <sup>5</sup>UAV).


**Table 3.** Summary of the method advantages and disadvantages along with fields of application.

#### **7. Future Directions**

What areas should we begin to focus and invest in, and where is the field going? One direction is smart sensing using sensor arrays and machine learning [192]. Another is leveraging gas dispersion modeling in path planning with source estimation approaches. A recent paper [193] showed a joint estimation method (wind and gas) that performed well compared to existing methods at reconstructing plumes within enclosed structures.

Can these approaches be used with sUAS for smarter path planning to improve LDAQ methodology? As these methods develop for outdoor gas dispersion modeling, sUAS could potentially use these concepts for improved sensor placement, which can yield improved quantification results. For techniques, such as SEM of landfills as well as any 2D plume reconstruction problems, the sUAS-based complex tomography techniques outlined in [59] could be applied. Additionally, the combination of gas tomography with mass balance approaches or model-based optimizations could yield improved emissions estimates as well.

As survey areas become larger and harder to capture with single sUAS systems, the use of swarms can also be leveraged to improve VFP mass balance based approaches, such as in [194], or can be applied to the Bayesian inference framework, which aims to maximize the mutual information, such as in [195], to estimate source parameters. Currently, regulatory hurdles and cost may prevent these systems from being applied in practice. Given that low-cost methane sensors are being actively researched, it is possible that swarms of sUAS may be used in the near future.

Considering that some non-sUAS methods overviewed in this manuscript that require longer sampling times (such as PSG and TCM) may be adopted by sUAS, the use of power-over-tether may become desirable for increasing the survey time of sUAS. This has been demonstrated for meteorological applications in [196], offering nearly indefinite flight times. There have also been advances in the digital transformation of technological applications and control, where concepts such as digital twins are being applied to perform smart control engineering or industrial artificial intelligence (IAI).

These techniques utilize modeling, machine learning, edge computing, and/or Internet of Things (IoT) approaches to create digital representations of physical assets that evolve system parameters over time. They can be used to estimate the remaining useful life (RUL) of equipment and could be coupled with LDAQ to determine when to survey equipment that is projected to fail in the near term. This can allow the limited resources of companies and practitioners to focus on problematic areas in an attempt to detect and mitigate super emitters.

In a previous work [120], plume modeling was applied not only to improve methodology but also to aid in smarter path planning (as mentioned above). As low-cost methane sensors become more sensitive, they can be integrated into existing infrastructure to give system status updates that can be fed back into modeling approaches, such as the PSG-RB method's a priori well information [133] and PSG-SBM [139]. Furthermore, having access to a priori information and digital twins of the plume (e.g., a model of the system) can allow for improved autonomy of the sUAS. Ultimately, by applying sUAS in this context, early detection and repair of methane leaks can be better approached.

#### **8. Conclusions**

Overall, this manuscript serves to capture the majority of sUAS-related emission quantification strategies as well as provide some accuracy comparisons to more conventional and non-sUAS quantification strategies. LDAQ methods based on sUAS can provide accuracy close to the state-of-the-art conventional methods, while improving the sampling distances and sampling times (see Figure 14). The advantage to using sUAS in some cases allows for better localization of emission sources and provides more flexibility in deployment.

Taking into consideration the operator skill, number of operators, equipment cost, setup time, and survey time, the complexity of the methods was derived. The complexity ranking indicated that NGI<sup>5</sup>, VFP<sup>5</sup> (sUAS-based), and QOGI<sup>1</sup> have the lowest complexity, while bLS<sup>1</sup> and the manned aircraft-based approaches have the highest. Comparing the precision of each method indicated that bLS<sup>1</sup>, NGI<sup>5</sup>, GDT<sup>4</sup>, VFP<sup>5</sup>, QOGI<sup>1</sup>, TCM<sup>3</sup>, and EC<sup>1</sup> have the most precise estimations, while VFP<sup>4</sup> and PI-VFP<sup>4</sup> are the least precise. To conclude, for sUAS-based methods, NGI<sup>5</sup> and VFP<sup>5</sup> are the most promising approaches.

Additionally, the GDT4, TCM3, and EC1 approaches can be treated as candidate methods for future implementation using sUAS. Lastly, sUAS-based quantification approaches, outlined in this manuscript, can be combined with new modeling and control approaches and a priori information (e.g., digital twins, machine learning, or the joint estimation method) to improve autonomy and estimation. For interested readers, the papers and bib file can be made available upon request.

**Author Contributions:** Conceptualization, D.H. and Y.C.; writing—original draft preparation, D.H. and D.Z.; and writing—review and editing, D.H., D.Z., and Y.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** D.H. was supported by National Science Foundation Research Traineeship Grant DGE—1633722.

**Acknowledgments:** The authors would like to thank the reviewers for their helpful comments, which aided in improving the overall quality of the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:



#### **References**


### *Article* **Numerical Fluid Dynamics Simulation for Drones' Chemical Detection**

**Fabio Marturano 1, Luca Martellucci 1,\*, Andrea Chierici 1, Andrea Malizia 2, Daniele Di Giovanni 1,3, Francesco d'Errico 4, Pasquale Gaudio <sup>1</sup> and Jean-François Ciparisse <sup>1</sup>**

Received: 21 June 2021; Accepted: 24 July 2021; Published: 29 July 2021

**Abstract:** The risk associated with chemical, biological, radiological, nuclear, and explosive (CBRNe) threats in the last two decades has grown as a result of easier access to hazardous materials and agents, potentially increasing the chance for dangerous events. Consequently, early detection of a threat following a CBRNe event is a mandatory requirement for the safety and security of human operators involved in the management of the emergency. Drones are nowadays one of the most advanced and versatile tools available, and they have been used successfully in many different application fields. The use of drones equipped with inexpensive and selective detectors could be both a solution to improve the early detection of threats and, at the same time, a solution for human operators to prevent dangerous situations. To maximize the drone's capability of detecting dangerous volatile substances, fluid dynamics numerical simulations may be used to understand the optimal configuration of the detectors positioned on the drone. This study serves as a first step to investigate how the fluid dynamics of the drone propeller flow and the different sensor positions on-board could affect the conditioning and acquisition of data. The first consequence of this approach may lead to optimizing the position of the detectors on the drone based not only on the specific technology of the sensor, but also on the type of chemical agent dispersed in the environment, eventually allowing the definition of a technological solution to enhance the detection process and ensure the safety and security of first responders.

**Keywords:** detection; CBRNe; drone; MOX; chemical sensor; simulation; fluid dynamics simulations

#### **1. Introduction**

The rise of new technologies, such as drones, and the improvement of their capabilities nowadays allow the design and development of useful detection and sampling systems aimed at limiting the exposure of the workforce and of the population to hazardous agents following a chemical, biological, radiological, nuclear, and explosive (CBRNe) event. Factors such as early detection and alarms are primary requirements to consider when designing and deploying new technologies in the field of CBRNe events' management [1–4].

CBRNe events may belong to either the hostile or accidental dimensions. Intervention by responder teams is quite complex and needs to be structured in many phases. When approaching a dangerous scenario, the first and most critical phase is usually considered "situational awareness"; indeed, the right perception of the potential hazards is the basis for the future decision-making process. Being able to characterize a CBRNe event in terms
of time, space, required actions, and operations is a fundamental step to successfully protect the public, the workforce, and the environment. Such an approach is commonly the best suited to ensure a fast and effective operative response when facing a wide range of dangerous situations. A good situational awareness could mean the difference between life and death of both rescuers and responders; in order to protect themselves and others against a CBRNe event, responders must assess as soon as possible the nature and proportion of the threat/hazard for subsequent life-saving and decontamination operations. In order to possess the right situational awareness, first responders not only need to be very well prepared, but they shall use whatever tool or technology available to them to enhance their response capabilities [5].

Mobile robot olfaction (MRO), the field of robotics where intelligent mobile platforms are equipped with a mixture of chemical sensors, has made tremendous progress in the last few years. Monitoring of environmental gases for risk assessment both indoors and outdoors usually requires a complex sensor system and a long operational time. A typical application field is gas pipelines' monitoring, where MRO-equipped drones are used to monitor and localize a dangerous dispersion along the pipeline. The idea of installing a portable gas detector on a mobile robotic platform was first described in [6] for the localization of gas dispersion in nuclear power plants, with the aim of minimizing workforce exposure to dangerous environments. With the advancement of autonomous robots, different applications have resulted from the integration of specific sensors into different kinds of mobile platforms [7], such as firefighting, demining, environmental monitoring, and search and rescue [8–11]. Mobile robots are effective tools for replacing the workforce in repetitive tasks, such as continuous monitoring; they can work in hostile environments (e.g., chemical and radioactive dispersion, or oxygen-deficient atmospheres) and explore impervious areas that cannot be easily reached by human operators. Moreover, the use of this kind of system is twofold: it can support the monitoring of a chemical dispersion, and it can be used for direct response to the event [12]. Such aspects make drones a suitable platform in the context of CBRNe events' management [13–16].

MRO systems need, on the one hand, to satisfy the requirements of early disaster response, where a high degree of mobility, fast operation, and highly efficient collaboration with human operators and decision makers are crucial; on the other hand, MRO systems also need to satisfy the needs of long-term monitoring when less critical events affecting permanent infrastructures may exist.

Recent advances in mobile robot platforms, specifically in drones' technology, together with the improvements in the performance of chemical detectors, nowadays present a great prospect to deploy an integrated platform in a wide range of applications. Furthermore, recent advances in the miniaturization of chemical instrumentation, as well as in data processing algorithms and methodologies, allow a better understanding of the nature and the origin of the chemical dispersion event [17–19]. For example, micro aerial vehicles (MAVs) equipped with gas detection systems and/or sampling devices have already been used in the fields of environmental monitoring [11–27], volcanic gas sampling [28–32], localization of gas dispersion [33,34], early fire detection [35,36], precision agriculture [37–39], landfill monitoring [40–42], disaster response [43,44], demining [45], and others [46–48].

The use of inexpensive, low complexity sensors mounted on small commercial drones for the detection of specific substances could respond to the requisites of rapid response to a threat and allow the safety requirements for the operators involved in a chemical release event to be satisfied. However, a potential problem in the accuracy of the acquired data arises when low-cost chemical sensors are used on board a drone because of the vortexes generated by the propellers, mainly during the approaching phase, which prevent the sensor from correctly detecting the presence of hazardous substances or cause it to misread the real concentration of the substance. Several works indicate that the turbulence generated by drone propellers may strongly affect the chemical sensor signals. In the work of Rossi et al. [26] and Burgues [22], the applicability of nano-drones for gas sensing tasks is explored: preliminary indoor experiments using nano-drones equipped with metal-oxide semiconductor (MOX) gas sensors showed that the air drawn around the airframe strongly affects the sensor response, basically resulting in useless signals.

To reduce the interferences of the drone propellers on the behavior of the chemical sensors, this work is aimed at optimizing the position of the detectors on the drone platform in order to maximize the effectiveness of sensor detection. A series of fluid dynamic simulations have been performed, aimed at improving the capability and the proficiency of such mobile systems to correctly collect data during the drone approaching phase by optimizing the position of the sensors on the drone body.

#### **2. Materials and Methods**

In this section, the authors describe the methodology to evaluate and improve the performance of metal oxide chemical sensors when used on board a drone, taking into account the fluid-dynamics interaction through the software COMSOL Multiphysics® (COMSOL Inc., Stockholm, Sweden [49]). The authors provide an explanation of the advantages of using numerical simulation versus experiments, followed by the case study analyzed in this work. The last two parts of this section are devoted to explaining the models and the designed geometry, respectively.

The design of a mobile system, such as a drone platform for the detection, identification, and monitoring of a chemical substance release, requires a deep analysis of the interference that the propulsion of the drone itself produces owing to the volatile nature of the chemical substances that the system needs to analyze. In order to understand if a specific design of a drone equipped with chemical sensors will be able to correctly detect the target substance/particles during different phases of flying, such as approaching and hovering, studies on the fluid-dynamics of the vortexes generated by the propellers need to be performed.

Drones come in different shapes, sizes, and configurations. Among the most used configurations are the quad-copter and hexa-copter. While the former is more widely available commercially for hobbyists, hexa-copter configurations are usually aimed at professional and specialized applications. To carry out a simulation of a realistic drone platform, a 3D drone was modelled based on an existing prototype. A specific commercial drone that has already been the object of trials in the CBRNe domain is the SR-SF6 (Figure 1). It is a hexa-drone created by Skyrobotics, which has shown good performance when applied in a wide range of applications. For example, it has been modified to host a biochemical aerosol detector and sampler system with the goal of being used as a tool in the management of CBRNe scenarios. In this work, the hexa-drone was chosen as the reference drone to be modelled in the simulation thanks to its relatively simple structure and good performance balance in terms of speed, stability, and aerodynamic control compared with commercially available quad-copters and octa-copters.

**Figure 1.** SR-SF6 by Skyrobotics.

Because of the volatile nature of the substances, experimental tests could lead to unpredictable results and require a strong effort to define an experimental setup. Moreover, repeatability of the experimental setup between different cycles is hard to achieve, particularly for the particle dispersion flow. Fluid-dynamic simulations, mainly carried out to optimize the positioning of the chemical sensors on a drone platform so as to minimize the fluid dynamics interference of the propellers, could lead to an optimal system able to correctly acquire the data from the chemical sensors used. In computational fluid dynamics (CFD), finite volume algorithms subdivide the fluid domain into many small volumes in which, once the boundary conditions are imposed, the Navier–Stokes equations are solved using an iterative method [50]. Each fluid is then characterized by its macroscopic properties, such as density, viscosity, and pressure. The CFD equations can predict, with a reasonable degree of approximation, the behavior of fluid flow using a mathematical model and numerical methods. A set of pre- and post-processing algorithms is usually applied to check the error in the resolution of the system and to help perform the virtual simulations. To correctly set up the variables in the CFD simulation, attention should be paid to the following properties:


In the end, the CFD simulation provides a solution for complex flow problems whose experimental investigation may be expensive and unreliable.
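As a toy illustration of this discretize-and-iterate idea, the sketch below advances a scalar mass fraction through a one-dimensional finite-volume advection-diffusion grid; it is a didactic stand-in for the full three-dimensional Navier–Stokes system solved by COMSOL, and all parameter values are illustrative:

```python
import numpy as np

# 1D finite-volume advection-diffusion of a scalar (e.g., a gas mass
# fraction): a toy illustration of the discretize-and-iterate approach
# that CFD codes apply to the full Navier-Stokes system in 3D.
nx, L = 100, 50.0            # number of cells, domain length (m)
dx = L / nx
u, D = 0.5, 0.1              # advection velocity (m/s), diffusivity (m^2/s)
dt = 0.4 * min(dx / u, dx**2 / (2 * D))   # stable explicit time step

c = np.zeros(nx)
c[10:15] = 1.0               # initial slug of contaminant

for _ in range(150):
    # Upwind convective flux and central diffusive flux at the cell faces
    flux = u * c[:-1] - D * (c[1:] - c[:-1]) / dx
    # Update interior cells from the net flux across their faces
    c[1:-1] += dt / dx * (flux[:-1] - flux[1:])

print(f"Peak mass fraction {c.max():.3f} at x = {c.argmax() * dx:.1f} m")
```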

#### *2.1. Aim of the Investigation*

The present study addresses the problem of optimizing the chemical sensor location on the drone to avoid interference with the sensors caused by the drone's propulsion. Specifically, the flow from the propellers could cause the loss or corruption of the data collected by the sensors, owing to the turbulence area and the dispersion of the concentration around the propeller.

#### *2.2. Fluid Dynamics Models*

In this work, the dispersion of ammonia due to an accidental release is simulated, and the sensitivity of sensors placed in different areas of the drone is analysed. Ammonia was chosen as the representative chemical substance in this study because it is a common by-product of several chemical industrial processes and is an irritant, toxic, and colourless gas. Because of its molecular characteristics, ammonia is more dangerous compared with more volatile substances, and its stagnation could create a dangerous area. Moreover, ammonia in low concentrations does not represent a large threat to human life; therefore, it is a good candidate for an experimental campaign aimed at characterizing the detecting capabilities of different kinds of chemical sensors, such as MOXs. It has to be highlighted that such an approach should be avoided for concentrations larger than the lower explosion limit (LEL).

The numerical model is a multiphase mixture model, where two phases are considered: air and ammonia. The first equation used to simulate the event is the continuity equation, which describes the conservation of the mass:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0 \tag{1}$$

where *u* is the velocity vector and *ρ* is the density of the mixture. The momentum conservation is taken into account by a set of three equations:

$$
\rho \frac{\partial \mathbf{u}}{\partial t} + \rho (\mathbf{u} \cdot \nabla)\mathbf{u} = -\nabla p + \rho \mathbf{g} + \nabla \cdot \mathbf{K} \tag{2}
$$

where *p* is the pressure and

$$\mathbf{K} = (\mu + \mu_t)\left(\nabla \mathbf{u} + (\nabla \mathbf{u})^T\right) - \frac{2}{3}(\mu + \mu_t)(\nabla \cdot \mathbf{u})\mathbf{I} - \frac{2}{3}\rho k \mathbf{I} \tag{3}$$

where *μ* is the dynamic viscosity, *μ<sup>t</sup>* is the turbulent viscosity, *I* is the identity matrix, and *k* is the turbulent energy. The turbulent variables are calculated by the *k*-*ε* turbulent model, which describes, using a two equations approach, the turbulent kinetic energy (*k*) and the turbulent dissipation ratio (*ε*):

$$
\rho \frac{\partial k}{\partial t} + \rho \mathbf{u} \cdot \nabla k = \nabla \cdot \left(\left(\mu + \frac{\mu_t}{\sigma_k}\right)\nabla k\right) + P_k - \rho \varepsilon \tag{4}
$$

$$
\rho \frac{\partial \varepsilon}{\partial t} + \rho \mathbf{u} \cdot \nabla \varepsilon = \nabla \cdot \left(\left(\mu + \frac{\mu_t}{\sigma_\varepsilon}\right)\nabla \varepsilon\right) + C_{\varepsilon 1} \frac{\varepsilon}{k} P_k - C_{\varepsilon 2} \rho \frac{\varepsilon^2}{k} \tag{5}
$$

where

$$P_k = \mu_t \left(\nabla \mathbf{u} : \left(\nabla \mathbf{u} + (\nabla \mathbf{u})^T\right) - \frac{2}{3}(\nabla \cdot \mathbf{u})^2\right) - \frac{2}{3}\rho k \nabla \cdot \mathbf{u} \tag{6}$$

$$
\mu_t = \frac{\rho C_\mu k^2}{\varepsilon} \tag{7}
$$

The turbulent model constants are *C<sup>μ</sup>* = 0.09, *C<sup>ε1</sup>* = 1.44, *C<sup>ε2</sup>* = 1.92, *σ<sup>k</sup>* = 1, and *σ<sup>ε</sup>* = 1.3. The mixture properties are calculated by the mixture flux equation:

$$
\rho \frac{\partial \alpha_i}{\partial t} + \rho \nabla \cdot (\alpha_i \mathbf{u}) + \nabla \cdot \mathbf{j}_i = 0 \tag{8}
$$

where *α<sup>i</sup>* is the mass fraction of the *i*-th phase and *j<sup>i</sup>* is the mass flux vector, calculated taking into account both diffusion and convection.
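As a small numerical aside, Eq. (7) can be evaluated directly with the constants listed above, e.g., to sanity-check solver outputs; the input values below are illustrative:

```python
def turbulent_viscosity(rho, k, eps, C_mu=0.09):
    """Eq. (7): mu_t = rho * C_mu * k^2 / eps (standard k-epsilon closure)."""
    return rho * C_mu * k**2 / eps

# Illustrative near-rotor values: air density 1.2 kg/m^3,
# k = 0.5 m^2/s^2, eps = 2.0 m^2/s^3
mu_t = turbulent_viscosity(1.2, 0.5, 2.0)
print(f"Turbulent viscosity: {mu_t:.4f} Pa s")  # ~0.0135 Pa s
```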

#### *2.3. Settings of COMSOL Multiphysics Parameters*

To run a CFD simulation, the software needs to process the data modelling the drone and the volume where the simulation is going to be carried out with well-defined boundaries. For this simulation, it was decided to consider a cube of air with a linear length of 50 m (**L**) (Figure 2), assuming the drone position is at the centre of the volume.

**Figure 2.** Schematics of the numerical simulation geometry.

Transforming partial differential equations (PDEs) into algebraic equations over the entire model at once might introduce significant errors into the results. To avoid this problem, the surface is divided into a number of sub-elements of geometrically simple shape (tetrahedrons) to be studied one by one, in order to increase the goodness of the approximation and reduce the final errors. The mesh settings determine the resolution of the finite element mesh used to discretize the model. The mesh used in a fluid flow simulation depends on the fluid flow model and on the accuracy required in the simulation. Generally, a fluid flow model requires a fine resolution to converge. The finer the resolution (number of sub-cells), the better the goodness of the approximation and the lower the error of the final result.

Two different meshes are used for the simulation: a first one for the reference volume, characterized by larger sub-elements; and a second one for the drone, whose dimensions require the sub-elements to be much smaller. Once the virtual environment is defined, with the fundamental boundary conditions, the geometry of the drone can be created by choosing a simplified geometric shape and by providing the spatial information to the system. The geometry of the drone was created starting from a three-element shaped body, using an ellipsoid for the central body, while cylindrical shapes were used to model the rotor and the ducted propeller. The cylindrical shape is an approximation of the propeller area for simulating the inlet and outlet surfaces in the simulation of the flow, whereas the ducting effects are not taken into account. These components are joined by the arm connecting the rotor shape with the central body. The union of these three separate parts was then used to define the final ensemble shape as a single body. Finally, the drone was completed, with the exception of the sensors, by maintaining the center of the ellipsoid as a fulcrum, applying a rotation on the XY plane by an alpha angle equal to 60 degrees five times, and gluing all the parts together (Figure 3).

**Figure 3.** Schematics of the drone geometric components.

In order to validate the CFD simulation results by comparing them with experimental data, a fundamental step involves the choice of the model of the chemical sensors, which needs to take into account both the availability of the sensor on the market and the volatile substance to analyze. The aim of the simulations is the analysis of a specific area contaminated with ammonia, thus an MQ137 chemical sensor was selected as the detector to be modeled in the simulation thanks to its low cost, commercial availability, and ease of use in a real environment. Moreover, MOX technology is today one of the most used in the context of MRO thanks to its relatively rapid prototyping requirements. MOX sensors in the simulation are modelled as a sphere with a radius of 5 cm to better approximate the shape and simplify the fluid dynamics computation. To simulate a release of ammonia in the environment for the bi-component simulation, a solid sphere releasing the particles in all directions at a speed of 0.5 m/s was configured. In the first set of simulations, aimed at evaluating the propeller interference, the releasing source was located at 5 meters along the Z-axis, at 15 meters from the ZX plane along the Y-axis, and at 25 meters from the ZY plane along the X-axis. Wind was introduced as a parameter, blowing along the Y-axis direction at a speed of 3 m/s.

The simulations were run with two configurations, each with a different location of the sensors on the drone. The first configuration considered is the nadir position (underneath the central body ellipsoid), hosting a single sensor, whereas the radial configuration hosts six sensors radially displaced around the outer edge of each propeller (Figure 4).

**Figure 4.** Nadir (**left**) and radial (**right**) sensors' displacement.

The study of the flow and the interference of the propeller was carried out for two different virtual environment settings: the first setting considers only the presence of air as a single fluid flow (mono-component CFD simulation), whereas the other takes into account the presence of a dispersion source (inlet) of ammonia that spreads the volatile substance into the air at a specific height (bi-component CFD, owing to the mixing of ammonia with the surrounding air). Regarding the latter configuration, the simulation of the fluid flow is more complex compared with the single fluid because the different properties of the molecules give rise to different behaviors when mixed together.

The last group of settings defines the simulation scene: the prototype of the drone with the sensors and the releasing point (modelled as a sphere) had to be positioned inside the reference volume.

#### **3. Results and Discussion**

To test the optimal positioning of the detectors, two simulations were carried out, each characterized by a different displacement of chemical sensors on the drone body. A first check of the goodness of the simulations could be performed during the run itself: the convergence of the velocity, pressure, volume fraction of the dispersion, and turbulence variables can be observed as the simulation progresses. It was observed that the errors of the solutions stabilized to a constant value as the number of iterations increased. The difference of a 10<sup>−3</sup> factor between the error values was assumed to be dictated by the numerical model used and the computational algorithms.

By examining the flow lines in the simulation, the air flow follows the expected path; that is, it is drawn from the reference volume into the upper side of the ducted propeller and re-enters the reference volume from the lower side (Figure 5a). It is important to note that the flow re-entering the volume is accelerated and concentrated in a coherent downwash until the ground interference dissipates it radially (Figure 5b). Moreover, it is important to consider that the air flow entering the propeller duct is not only directly influenced by the propeller action on the vertical space but also derives from the outer boundaries of the ducts; in fact, the fluid starts its acceleration downward, and the mass flow demanded by the propeller increases with the propeller velocity.

By analyzing the air speed and fluid flow, we can instead observe that the air is accelerated on top of the ducted propeller to a speed close to 8–9 m/s, and the same value is maintained for almost 2 meters downward, where it starts to slow down to 5–6 m/s (Figure 6). The speed of the air is still appreciable as the flow hits the ground at 25 m with a speed of 1–3 m/s. Furthermore, the air flow assumes a larger radial angle as it leaves the drone, such that the flow cross section at ground level is almost double that of the cross section below the drone.

**Figure 5.** (**a**) Fluid flow simulation lines; (**b**) flow ground radial dissipation.

**Figure 6.** Air speed behaviour.

The air flow section underneath the central ellipsoid shows an interesting feature: the air flow has an acceleration of between 2 and 4 m/s. As the air flow downstream expands below the propeller, it merges at 2 meters with another air flow. This phenomenon creates a vortex (a local vorticity generated by the drone propeller) that redirects the flow under the nadir area upward and finally forces it to re-join the mainstream downward from the propeller (Figure 7). The toroidal recirculation under the nadir, made by the six propellers, creates the increment in speed toward the sensor located under the centre of the drone. This implies that, in order to detect chemical substances under the drone, the aerosol/particles first need to be attracted by the propeller, then pushed downward, and finally reach the sensor after being recirculated upward in the toroidal pattern.

**Figure 7.** Downstream of the hexacopter.

Given the strength of the downstream, which creates a virtual wall around the sensor, it may be difficult for the drone to detect anything when flying at medium to high altitudes unless the plume is higher than the propellers. Flying at low altitudes will instead facilitate the detection as the downstream of the air flow will increase the terrain turbulence by impacting with the ground at a high speed. Furthermore, once the chemical substance is accelerated through the ducted propeller and reaches the sensor, its concentration could change enough to be detectable during a measurement.

In the other setting, namely the radial configuration, the sensors are displaced in the outer ring of the propeller; here, the presence of several other surfaces gives rise to a fluctuating error affecting the convergence of the variables considered in the simulation.

Most of the considerations and analysis for the nadir configuration are applicable to the radial configuration as well, where the sensors are placed in the outer ring of the propellers. The pressure and velocity values, as well as the turbulent variables, need to converge to a constant value as the number of iterations increases. Given the presence of several other aerodynamic surfaces in this configuration, the pressure and velocity values converge too, but they maintain a fluctuating error range with an average value of 10<sup>−9</sup>. Despite this fluctuating error in the convergence of the variables, the radial setting seems to offer an interesting solution for enhancing the detection compared with the nadir sensor location. Whereas the nadir sensor is limited to sampling the particles that arrive from the vortex below, which redirects the flow upward, sensors in the radial configuration appear not to be influenced by the acceleration of the air flow (Figure 8). As the drone hovers on a stationary plane and is reached by a chemical plume from a lower altitude, none of its rotors' downstream flows should affect the plume before it is detected. Moreover, if the chemical plume flows at the same altitude as the drone, the rotors' air demand would accelerate the air radially from outside, forcing the mixture to impact the sensors before being sucked into the ducted propeller rotors.

#### *Bi-Component Simulation*

In the bi-component (ammonia and air) simulation, where the ammonia source is introduced into the environment, the influence of the wind direction and intensity (3 m/s) results in turbulence both in proximity to the drone air flow and in proximity to the release point (Figure 9). As the wind velocity is higher than the normal air flow in the reference volume above the drone, the flow lines coming from the propeller described by the simulation are dominated by the wind direction and intensity. After entering the rotors, they are twisted and accelerated downwards, but are still affected by the external wind direction.

**Figure 8.** Air flow behavior for the radial configuration.

**Figure 9.** Turbulence around the drone (**left**) and around the emission sphere (**right**) owing to the introduction of the wind vector.

In the final simulation, the reference volume was resized to become a parallelepiped (50 m × 50 m × 300 m), with the drone positioned at 225 meters from the ammonia source and lowered to be 5 meters higher than the source. The nadir drone bi-component simulation in the new reference volume shows the same results for both the velocity magnitude and flow lines behavior graphs as for the previous reference volume (cube). It should be noted that the different speed areas around the drone in the section view are not related to the reference plane (ZX) including two ducted propeller rotors (as in the first mono-component simulation), but to the reference plane (ZY) passing through the ammonia sphere and crossing the drone between its rotors.

The nadir drone simulation provides interesting results when analyzing the mass fraction, and more specifically when assessing whether the sensor detects a certain concentration of ammonia particles. Mass fraction is defined as the mass of one chemical species in a set volume divided by the total mass of all species in the same volume; it is normally expressed in parts per million (ppm). With the COMSOL software, it is possible to calculate the concentration of ammonia at any specific point, and thus to assess whether a sensor in a specific location on the drone will be able to detect the presence of the contaminant. Focusing on the mass fraction of the contaminants around the drone, different concentrations may result owing to the propeller flow. In front of the drone, as in the radial position, the mass fraction appears uninterrupted, whereas under the drone it decreases. By comparing the mass fraction in front of the forward rotor ducts (where the sensors are located in the radial position) and below the nadir of the central body of the drone, the difference in the concentration sampling of the two configurations is evident (Figure 10). Ranging from a minimum of 122.60 ppm to a maximum of 169.17 ppm with an average value of 151.8 ppm, the radial design shows a detection capability 5.8 times higher than the nadir configuration, where the average sensor value is 26.089 ppm.

**Figure 10.** As the gas dispersion reaches the sensors in the radial configuration, the frontal sensors detect the highest values in ppm, followed by the lateral sensors and finally the back sensors, where a lower concentration level is detected.

When considering an inexpensive commercial portable detector for industrial and domestic uses with sensitivity to ammonia, such as the MQ137, and by comparing its detection threshold against the mass fraction values for both the nadir sensor and the radial ones, the advantage of the radial configuration is evident. When the dispersed concentration changes (for example, by changing the relative distance to the emission source), a reduction to 1/10 of the concentration can put the limit of detection (LoD) of a real sensor out of reach, making the sensor unable to detect the dispersion at all. In the case of the MQ137, with an LoD of 5 ppm, a relative detection of 1/10 of the mass fraction implies that, in the nadir position, the detected concentration drops to 2.6 ppm, under the LoD of the sensor, whereas for the radial configuration it drops to 12.26 ppm. This implies that the nadir detection is under the threshold, whereas the radial sensor is still able to detect a useful level of concentration.
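This detectability arithmetic can be expressed as a small check using the values reported above; the helper function is hypothetical, while the LoD and ppm values come from the preceding paragraph:

```python
def detectable(mean_ppm, dilution, lod_ppm=5.0):
    """Check whether a sensor still sees the plume after a given dilution.
    lod_ppm = 5.0 matches the MQ137 limit of detection cited in the text."""
    return mean_ppm * dilution >= lod_ppm

# Values reported in the text: 26.089 ppm (nadir average) vs 122.60 ppm
# (radial minimum), each reduced to 1/10 of the concentration
for name, ppm in [("nadir", 26.089), ("radial", 122.60)]:
    print(f"{name}: {ppm / 10:.2f} ppm -> detectable: {detectable(ppm, 0.1)}")
```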

#### **4. Conclusions and Prospects**

Both CBRNe threats and hazards have evolved significantly over the last decades. Know-how and dangerous elements have become more accessible; consequently, the probability of a chemical or biological attack or accident has grown significantly all over the world. Every country in the world should therefore be prepared to respond to a CBRNe event. Such a response is provided by the intervention of first responder teams, whose main task is focused on acquiring the correct situational awareness in order to save lives, rescue people, and decontaminate the hazardous area. A good situational awareness could mean the difference between life and death of both rescuers and responders. In order to possess the right situational awareness, first responders not only need to be very well prepared, but they need to use whatever tool or technology is available to enhance their capabilities. By providing all first responder teams with commercial drones equipped with inexpensive, low complexity sensors capable of detecting a wide range of substances, the risk assessment process could be improved, ensuring the safety and security of the operators through a fast and effective response to the threat.

The main problem to be solved when sensors for chemical and biological detection are used on board a drone relates to the vortexes generated by the drone propellers. Two different sets of computational fluid-dynamics simulations, using the COMSOL software and starting from a specific drone design, were carried out to demonstrate the effectiveness and advantage of correct positioning of the sensors.

In this work, the simulation of the fluid-dynamics variables in proximity to the drone and the sensors in two different configurations helped to identify the optimal positioning of the detectors in the case of a chemical release/dispersion scenario. The distribution of the concentrations of the particles around the drone, with and without wind interference, allowed the identification of the radial configuration as an optimal solution for the detection of a chemical particle release. The results of the simulation emphasized how radially positioned sensors would be less affected by the rotors' downstream flow compared with the one placed on the central belly. Moreover, as the virtual drone is a hexa-copter able to carry up to six radial sensors, a similar model could be considered a recommendation for further simulations and experimentation; for example, it may be equipped with six different low-cost and selective sensors to detect nerve agents, blister agents, choking agents, blood agents, and riot control agents. If such a low-cost device proves effective, it could determine an improvement in the detection of hazardous agents following a CBRNe event. By providing first responder teams of any organization in the world with such a platform, fast and full situational awareness and risk assessment could be achieved in order to face challenges in different hostile environments.

**Author Contributions:** Conceptualization, J.-F.C., F.M., D.D.G. and L.M.; methodology, J.-F.C. and F.M.; validation, J.-F.C., A.M., F.d. and L.M.; formal analysis, A.M.; investigation, F.M., J.-F.C., L.M. and A.C.; data curation, F.M., J.-F.C., F.d. and L.M.; writing—original draft preparation, L.M. and A.C.; writing—review and editing, L.M., F.d. and A.C.; visualization, F.M.; supervision, P.G. and A.M.; project administration, A.M. and P.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** Authors are grateful to International Master Courses in "Protection against CBRNe event" for material used and interaction with students and experts in this field. For any other information please visit http://www.cbrngate.com, accessed on 1 March 2017.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Convolutional Neural Networks for Classification of Drones Using Radars**

**Divy Raval 1,2, Emily Hunter 1,3, Sinclair Hudson 4, Anthony Damini <sup>1</sup> and Bhashyam Balaji 1,\***

Received: 25 November 2021; Accepted: 10 December 2021; Published: 15 December 2021

**Abstract:** The ability to classify drones using radar signals is a problem of great interest. In this paper, we apply convolutional neural networks (CNNs) to the Short-Time Fourier Transform (STFT) spectrograms of the simulated radar signals reflected from the drones. The drones vary in many ways that impact the STFT spectrograms, including blade length and blade rotation rates. Some of these physical parameters are captured in the Martin and Mulgrew model which was used to produce the datasets. We examine the data under X-band and W-band radar simulation scenarios and show that a CNN approach leads to an F1 score of 0.816 ± 0.011 when trained on data with a signal-to-noise ratio (SNR) of 10 dB. The neural network which was trained on data from an X-band radar with 2 kHz pulse repetition frequency was shown to perform better than the CNN trained on the aforementioned W-band radar. It remained robust to the drone blade pitch and its performance varied directly in a linear fashion with the SNR.

**Keywords:** drone classification; CNN; machine learning; HERM lines; micro-Doppler; radars

### **1. Introduction**

Modern drones are more affordable than ever, and their uses extend into many industries such as emergency response, disease control, weather forecasting, and journalism [1]. Their increased military use and the possible weaponization of drones have caused drone detection and identification to be an important matter of public safety.

There are several types of technology which can facilitate drone detection and classification. Some sensors employ sound-based or acoustic technology to classify drones. Drones give off a unique acoustic signature ranging from 400 Hz to 8 kHz, and microphones can capture this information. Unfortunately, this technology can only be used at a maximum range of 10 meters, and the microphones are sensitive to environmental noise [2]. When tracking drones through the air, this method becomes impractical.

Optical sensors use one or more cameras to create a video showing the target drone. The classification problem then becomes the identification of specific patterns in the shapes and colours of the drones. This approach is a popular technique because it is intuitive and enables the use of image processing and computer vision libraries [3] as well as neural networks [4]. However, optical sensors have a limited range and require favourable weather conditions. For these reasons, they are not reliable enough to use for drone classification, especially at longer ranges.

Drones and other unmanned aerial vehicles (UAVs) rely on (typically hand-held) controllers, which send radio frequency signals to the drone. These signals have a unique
radio frequency fingerprint that depends on the circuitry of the controller, drone, and the chosen modulation techniques. Radio frequency fingerprint analysis has been studied as a method to detect and classify drones [5].

Finally, radar sensors for drone tracking and classification have been extensively studied [6–9]. Radars are capable of detecting targets at longer ranges than other sensors and perform reliably in all weather conditions at any time of day [10].

The classification technique investigated in this paper is based on the target drone's micro-Doppler signature. A micro-Doppler signature is created when specific components of an object move separately from the rest. The rotation of propeller blades on a drone is sufficient to generate these signatures. The use of radars for studying micro-Doppler signatures has been shown to be effective [11] and has been used in conjunction with machine learning for many UAV classification problems [12–19]. As such, radars are the chosen technology for this paper.

Previous work has shown that an analysis of the radar return, including the micro-Doppler signature, can reliably distinguish drones from birds [7,8,20]. We now turn to the problem of distinguishing different types of drones. Standard analyses of the radar return include the short-window and long-window Short-Time Fourier Transform (STFT). The short- and long-window labels refer to the rotation period of the drone, i.e., the time it takes for the drone's blades to make a complete 360-degree rotation. A short-window STFT uses a window length shorter than a rotation period, while a long-window STFT uses a window length that exceeds the rotation period. The long-window STFT generates a unique signature of the drones in the form of Helicopter Rotation Modulation (HERM) lines. The number of HERM lines and their frequency separation can be used to distinguish between the different drones.

For situations where the pulse repetition frequency (PRF) of the radar is not high enough to extract the full micro-Doppler signature, Huang et al. proposed a log harmonic summation algorithm to use on the HERM lines [21]. This algorithm estimates the micro-Doppler periodicity and performs better than the previously used cepstrum method [8] in the presence of noise. Huang et al. also showed, using collected radar data, that the Minimum Description Length Parametric Spectral Estimation Technique reliably estimates the number of HERM lines. This information can be used to determine whether the target is a rotary drone with spinning propellers [22].

When the full micro-Doppler signature is available (using a high PRF radar), the short-window STFT can be utilized for analysis. Klaer et al. used HERM lines to estimate the number of propeller blades in these situations [23]. They also proposed a new multifrequency analysis of the HERM lines, which enables the approximation of the propeller rates [23]. In this paper, we leverage the work of Hudson et al., who demonstrated the potential of passing STFT spectrograms into a Convolutional Neural Network (CNN) to classify drones [13].

Recently, Passafiume et al. presented a novel micro-Doppler vibrational spectral model for flying UAVs using radars. This model incorporates the number of vibrational motors in the drone and the propeller rotation rates. They showed that this model is able to reliably simulate the micro-Doppler signature of drones. Furthermore, they proposed that the model could be further studied for use in unsupervised machine learning [24].

In another study, Lehmann and Dall trained a Support Vector Machine (SVM) on simulated data. They simulated their data by considering the drone as a set of point scatterers and superimposing the radar return of each point [25]. However, their work modelled the data as free from thermal noise. In this paper, using the Martin and Mulgrew model instead, we simulate varying signal-to-noise ratio (SNR) conditions, which provides a more realistic setting in which to apply machine learning. Additionally, our use of a CNN provides better classification accuracy than their SVM.

In our investigation, we use the updated version of the Martin and Mulgrew model [26] to simulate drone signals and perform additional augmentation to improve the data's realism. We used the model to produce datasets distinguished by the SNR and PRF of the contained samples. A CNN was trained for each of these datasets, and their performances were analyzed with the F1 metric. Our findings suggest that it is possible to train a robust five-drone classifier (plus an additional noise class) using just one thousand data samples, each 0.3 s in duration. Furthermore, we show it is possible to train CNN classifiers robust to SNR levels not included in training while maintaining performance that is invariant to the blade pitch of the drones.

The work presented in this paper contributes to the field of drone classification in several ways. Many studies explore the use of neural networks for a specific SNR. Here, we provide an analysis for a wide range of SNR values, thus making our results more generalizable to different situations. We also show that the selected model is robust against varying pitches of the propeller blades, maintaining its performance when tested on drones whose blade pitch is outside of the training range. Additionally, we find that X-band radars provide better data than W-band radars for this application within the studied SNR range. This last result is likely due to the configuration of our neural network and may not be true in general cases. Finally, we leverage the Martin and Mulgrew model for data simulation, a model that is not commonly used for drone classification.

This paper is organized as follows. Section 2 introduces the reader to the concepts used in our work. We review some of the important radar parameters in Section 2.1, paying close attention to their use in our context. The Martin and Mulgrew model is used to simulate returns from different types of radars and is summarized in Section 2.2. Drone parameters and data generation are discussed in Section 2.3, and an overview of the machine learning pipeline is presented in Section 2.4. The results of the machine learning model are shown and discussed in Sections 3 and 4, respectively. Finally, we present our conclusions and future steps in Section 5.

#### **2. Materials and Methods**

#### *2.1. Radar Preliminaries*

As discussed previously, radar (RAdio Detection And Ranging) systems are advantageous over other surveillance systems for several reasons. This subsection will define the radar parameters and discuss the signal-to-noise ratio and the radar cross-section, two significant quantities for drone classification.

#### 2.1.1. Radar Parameters

There are two main classes of radars: active and passive. Active radars emit electromagnetic waves at radio frequencies and detect the pulses' reflections off objects. Passive radars detect reflections of electromagnetic waves that originated from other sources or transmitters of opportunity. In this paper, we focus our attention on active radars. Such radars may be either pulsed or frequency-modulated continuous wave (FMCW) radars. Pulsed radars transmit pulses at regular intervals, with a nominal pulse duration (or pulse width) on the order of a microsecond and a pulse repetition interval on the order of a millisecond. Many variables related to the radar and target dictate a radar's performance. These variables are presented in Table 1.

**Table 1.** The radar parameters and pre-determined quantities which describe the radar-drone interaction.




For more information about radar types and their operation, we direct interested readers to the text by Skolnik [28]. We will now turn our attention to some of the specific quantities that help describe how radars can detect and classify drones.

#### 2.1.2. Radar Cross-Section

The radar cross-section (RCS) is critical when working with drones. As explained in Table 1, the RCS of a target is the effective surface area that is visible to the radar. The RCS varies with the target's size, shape, surface material, and pitch. Typical drones have an RCS value from −15 dBsm to −20 dBsm at X-band frequencies and smaller than −20 dBsm at frequencies between 30 and 37 GHz [29]. The RCS of drones varies significantly with the drone model and position in the air. A comprehensive study of drone RCS was performed by Schröder et al., who reported that material is a significant factor in the blade RCS, as metal blades have a much higher RCS than plastic ones [30].

The strength of the returned radar signal varies directly with the RCS, making it a critical factor in drone classification using radars. This paper will utilize the micro-Doppler effects from drone propeller blades. Thus, the RCS of the drones' blades is much more important for this investigation than that of the body.

#### 2.1.3. Signal-to-Noise Ratio

Another important quantity for radar studies is the signal-to-noise ratio (SNR). The SNR is the ratio of the received power from the target(s) to the received noise power. The expression for the SNR depends on the radar parameters previously introduced, including the RCS, and is provided by the radar range equation [31]:

$$\text{SNR} = \frac{P\_t G\_t G\_r \lambda^2 \sigma \tau}{(4\pi)^3 R^4 T\_n k\_b L} \tag{1}$$

One would expect classification performance to decrease as the SNR decreases because the target becomes less clear. Dale et al. showed this to be true when distinguishing drones from birds [32]. As seen in Equation (1), the SNR is directly related to the RCS and consequently tends to be small for drones. It is, therefore, crucial to understand the signal SNR because it significantly impacts the quality of the trained model. If the training data have an SNR that is too high, the model will not generalize well to realistic scenarios with a lower SNR. The work later in this paper analyzes model performance as a function of the SNR of the signals in the training data.

It is often more convenient to express Equation (1) in decibels (dB), a logarithmic scale. The log scale simplifies the calculation of the SNR: the decibel equivalents of the numerator terms are added, and those of the denominator terms are subtracted. A Blake Chart clarifies this process. Table 2 shows an example of such a calculation where the radar operates in the X-band (10 GHz frequency) and the object is 1 km away.


**Table 2.** Example usage of a Blake Chart for calculating the SNR of a particular drone using a given radar.

Blake Charts make it easy to see how slightly adjusting one parameter can impact the SNR per pulse. It is important to note that for a particular radar and object at a specified range, the only parameters that can be adjusted are the *Pt*, *λ*, and *τ*. Each of these parameters comes with an associated cost due to limited power supply or the specifications of the radar, and so it is not always possible to achieve a desirable SNR. Due to this, classification models need to perform well in low-SNR conditions.
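To make this bookkeeping concrete, the short Python sketch below evaluates Equation (1) in decibel form, adding the numerator terms and subtracting the denominator terms just as a Blake Chart does. The function name and all numerical values in the example call are hypothetical placeholders, not parameters of any radar used in this paper.

```python
import math

def snr_db(pt_w, gt_db, gr_db, wavelength_m, rcs_dbsm,
           tau_s, range_m, tn_k, loss_db):
    """Radar range equation (Eq. 1) evaluated in decibels,
    Blake-chart style: add numerator terms, subtract denominator terms."""
    kb = 1.380649e-23  # Boltzmann constant, J/K
    to_db = lambda x: 10.0 * math.log10(x)
    numerator = (to_db(pt_w) + gt_db + gr_db
                 + 2 * to_db(wavelength_m) + rcs_dbsm + to_db(tau_s))
    denominator = (3 * to_db(4 * math.pi) + 4 * to_db(range_m)
                   + to_db(tn_k) + to_db(kb) + loss_db)
    return numerator - denominator

# Hypothetical X-band example: 10 GHz (lambda = 3 cm), -20 dBsm target at 1 km.
print(snr_db(pt_w=1e3, gt_db=30, gr_db=30, wavelength_m=0.03,
             rcs_dbsm=-20, tau_s=1e-6, range_m=1e3, tn_k=290, loss_db=5))
```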

#### *2.2. Modelling Radar Returns from Drones*

The Martin and Mulgrew equation models the complex radar return signal of aerial vehicles with rotating propellers [26]. The model assumes that the aerial vehicle (or drone, in our context) has one rotor. The formulation of the model is presented in Equation (2); it was used to simulate the radar return signals of five different drones under two different radar settings. French [33] provides a derivation of and detailed insights into the model.

$$\psi(t) = A\_r e^{j\left(2\pi f\_c t - \frac{4\pi}{\lambda}(R + v\_{\text{rad}} t)\right)} \sum\_{n=0}^{N-1} (\alpha + \beta \cos \Omega\_n)\, e^{-j\frac{L\_1 + L\_2}{2}\gamma\_n}\, \text{sinc}\left(\frac{L\_2 - L\_1}{2}\gamma\_n\right) \tag{2}$$

where

$$\alpha = \sin\left(|\theta| + \phi\_p\right) + \sin\left(|\theta| - \phi\_p\right) \tag{3}$$

$$\beta = \text{sign}(\theta) \left( \sin\left(|\theta| + \phi\_p\right) - \sin\left(|\theta| - \phi\_p\right) \right) \tag{4}$$

$$\Omega\_n = 2\pi \left( f\_r t + \frac{n}{N} \right) \tag{5}$$

$$\gamma\_n = \frac{4\pi}{\lambda} \cos\theta \sin\Omega\_n \tag{6}$$

Table 3 provides a complete description of each of the parameters in the model. Excluding time, *t*, the model has eleven parameters, approximately categorized as radar and drone parameters. The radar parameters are the carrier frequency, *fc*, and the transmitted wavelength, *λ*. The drone parameters depend on the position of the drone relative to the radar and on the characteristics of the drone's propeller. In particular, the strength of the presented model over the initial version of the Martin and Mulgrew equation is its ability to account for variation in the blade pitch of the drones, *φ<sup>p</sup>* [26,34].

**Table 3.** Interpretation of all the parameters in the Martin and Mulgrew model. *fc* and *λ* depend on the specific radar. *θ*, *R*, and *v*rad depend on the position of the drone, while *fr*, *N*, *L*1, and *L*<sup>2</sup> are characteristic of the drone's blades.


For several reasons, the Martin and Mulgrew model was chosen as the data simulation model. It is based on electromagnetic theory and Maxwell's equations, yet it is computationally efficient compared to more sophisticated models. Additionally, the drone parameters used were previously compiled and demonstrated by Hudson et al. [13]. The following section strengthens confidence in the model by comparing it to an actual drone signal. It is found that the Martin and Mulgrew model produces distinct HERM line signatures (dependent on the parameters), as seen in the collected data, a fact that is crucial for this investigation. Although the number of rotors on the drone is not the focus of this paper, it is helpful to note that the model assumes a single rotor. A proposed extension to the model sums the signal over the different rotors [35], but it has not been extensively studied.
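For concreteness, the following Python sketch implements Equations (2)–(6) directly. The parameter values in the example call are illustrative assumptions only (they are not taken from Tables 4–6), and the sketch works at baseband, i.e., the carrier term e^(j2πfct) that would normally be removed by downconversion before sampling at the PRF is omitted.

```python
import numpy as np

def martin_mulgrew(t, lam, theta, R, v_rad, fr, N, L1, L2, phi_p, Ar=4.0):
    """Complex radar return of a rotating-propeller target, Eqs. (2)-(6),
    written at baseband (the 2*pi*f_c*t carrier term is omitted)."""
    alpha = np.sin(abs(theta) + phi_p) + np.sin(abs(theta) - phi_p)  # Eq. (3)
    beta = np.sign(theta) * (np.sin(abs(theta) + phi_p)
                             - np.sin(abs(theta) - phi_p))           # Eq. (4)
    envelope = Ar * np.exp(-1j * 4 * np.pi / lam * (R + v_rad * t))
    total = np.zeros_like(t, dtype=complex)
    for n in range(N):  # sum over the N propeller blades
        omega = 2 * np.pi * (fr * t + n / N)                         # Eq. (5)
        gamma = 4 * np.pi / lam * np.cos(theta) * np.sin(omega)      # Eq. (6)
        x = (L2 - L1) / 2 * gamma
        # np.sinc(y) is sin(pi*y)/(pi*y), so divide by pi for sin(x)/x.
        total += ((alpha + beta * np.cos(omega))
                  * np.exp(-1j * (L1 + L2) / 2 * gamma)
                  * np.sinc(x / np.pi))
    return envelope * total

# Illustrative X-band example: lambda = 3 cm, two blades, 0.3 s at 2 kHz PRF.
t = np.arange(0, 0.3, 1 / 2000)
psi = martin_mulgrew(t, lam=0.03, theta=np.pi / 4, R=1000.0, v_rad=0.0,
                     fr=100.0, N=2, L1=0.01, L2=0.1, phi_p=np.pi / 8)
```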

#### *2.3. Data Generation and Augmentation*

This section describes the different sampling and augmentation considerations taken to simulate a Martin and Mulgrew signal. As will be elaborated in Section 2.4, many simulated signals were put together to produce datasets for machine learning.

The data simulation step involved two sets of radar parameters, representing an X-band and a W-band radar, respectively. Furthermore, five sets of drone parameters corresponding to different commercial drones were used. Tables 4 and 5 list the parameters for the radars and the drones' blades. Note that along with the five drones (classes), a sixth Gaussian-noise class was produced to investigate the possibility of false alarms during classification and their impact.

**Table 4.** Transmission wavelength and frequency values for a W-band and X-band radar.


**Table 5.** Approximate drone-blade parameters of five drones.


Although the selected drones have fixed blade pitches, the parameter was assumed to be variable for modelling purposes since some drones can have adjustable blade pitches. This assumption can improve the generalizability of the analysis. Moreover, *θ* and *R* were similarly considered as variable parameters while *v*rad and *Ar* were set to be constant (zero and four, respectively) for simplicity. As seen in Table 6, these variable parameters were uniformly sampled to produce meaningful variations between each simulated drone signal.

**Table 6.** Sampling distributions for some variable parameters.


Besides varying the above parameters, additional methods were used to produce differences between the simulated signals. We applied shifts in the time domain, adjusted the signal to reflect the probability of detection, and added noise to augment each sample. The time shift was introduced by randomly selecting a *ts* such that the resulting signal would be *ψ*(*t* + *ts*). Next, a probability of detection (*pd*) of 0.8 was asserted by removing some data from the signal, simulating the amount of information typically present in real scenarios. Finally, Gaussian noise was introduced to produce a signal of the desired SNR. The added noise, **n**, was sampled from N(0, *σ*0), where the standard deviation *σ*0 is given by the rearranged form of Equation (7). Equation (8) presents the final augmented signal produced using the Martin and Mulgrew equations. Each simulated sample used for machine learning was 0.3 s in length.

$$\text{SNR} = 10 \log\_{10}\left(\frac{A\_r^2}{\sigma\_0^2}\right) \quad \Longrightarrow \quad \sigma\_0 = \sqrt{10^{-\text{SNR}/10}\, A\_r^2} \tag{7}$$

$$\psi\_{\text{final}}(t) = \text{detection}\big(\psi(t + t\_s),\; p\_d = 0.8\big) + \mathbf{n} \tag{8}$$
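A minimal sketch of this augmentation chain, under the stated assumptions (*Ar* = 4, *pd* = 0.8), might look as follows. The use of a circular shift for *ts* and the even split of complex noise power across real and imaginary parts are implementation assumptions, not details given in the text.

```python
import numpy as np

def augment(psi, snr_db, ar=4.0, pd=0.8, rng=None):
    """Apply Eq. (8): random time shift, detection dropout, additive noise."""
    rng = rng or np.random.default_rng()
    # Random time shift t_s, realized here as a circular shift (assumption).
    shifted = np.roll(psi, rng.integers(len(psi)))
    # Keep each sample with probability p_d = 0.8 (missed detections -> 0).
    detected = shifted * (rng.random(len(psi)) < pd)
    # Noise standard deviation from the rearranged Eq. (7).
    sigma0 = np.sqrt(10.0 ** (-snr_db / 10.0) * ar ** 2)
    noise = (rng.normal(0.0, sigma0 / np.sqrt(2), len(psi))
             + 1j * rng.normal(0.0, sigma0 / np.sqrt(2), len(psi)))
    return detected + noise
```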

The use of a convolutional network requires data with spatially relevant information. The long-window STFT was applied to produce a spectrogram representation of the simulated signals [36]. The STFT is one of many methods used to produce spectrograms for convolutional learning [18]. Recall that a short-window STFT has a window size smaller than the rotation period of the drone. However, according to the Nyquist Sampling Theorem, using a short window requires that the radar PRF is at least four times the maximum Doppler shift of the propeller blades to detect micro-Doppler blade flashes unambiguously. In contrast, a long-window STFT cannot detect blade flashes because the window size is larger than the rotational period of the propellers. The long-window method only requires that the PRF is at least twice the propeller rotation rate, making this method more versatile for different radars. Previous work suggests that the long-window STFT can reveal HERM micro-Doppler signatures of drones even under low PRF conditions [21,23].
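To make these sampling requirements concrete, the following back-of-the-envelope check uses assumed, illustrative blade values rather than those of any drone in Table 5:

```python
import math

lam = 0.03            # assumed X-band wavelength, m
fr, L2 = 100.0, 0.1   # assumed blade rotation rate (Hz) and blade length (m)

v_tip = 2 * math.pi * fr * L2      # blade-tip speed: ~62.8 m/s
f_doppler = 2 * v_tip / lam        # max Doppler shift: ~4.2 kHz

print(f"short-window STFT needs PRF >= {4 * f_doppler / 1e3:.1f} kHz")  # ~16.8
print(f"long-window STFT needs PRF >= {2 * fr:.0f} Hz")                 # 200
```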

We used two configurations of the long-window STFT. The first has a window size of 512 for the X-band radar with a PRF of 2 kHz, while the second has a window size of 2048 for the W-band radar with a PRF of 20 kHz. Due to its higher PRF, the latter requires a larger window size.
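A sketch of this spectrogram step, using SciPy's STFT with the two window sizes above, is given below. The overlap and padding choices are SciPy defaults and therefore assumptions, so the exact output shapes may differ slightly from the 512 × 4 and 2048 × 7 sizes quoted later.

```python
import numpy as np
from scipy.signal import stft

def long_window_spectrogram(signal, prf_hz, window_size):
    """Magnitude of the long-window STFT of a complex radar return."""
    _, _, zxx = stft(signal, fs=prf_hz, nperseg=window_size)
    return np.abs(zxx)  # two-sided spectrum, since the input is complex

# X-band configuration: 0.3 s of (placeholder) signal at PRF = 2 kHz.
signal = np.exp(2j * np.pi * 100 * np.arange(0, 0.3, 1 / 2000))
spec_x = long_window_spectrogram(signal, prf_hz=2000, window_size=512)
# The W-band configuration would use prf_hz=20000, window_size=2048.
```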

Figure 1 shows a long-window STFT spectrogram for each of the five drones outlined in Table 5. The signals were produced using radar parameters with the X-band (left column) and W-band (right column). For demonstration purposes, these signals have no augmentation. Notice the unique HERM line signature, or bands, within each spectrogram. These signatures are not an exact representation of the drones, but the important fact is that they are distinct from one another—just as we would expect in practice. Furthermore, these spectrograms would be easily identifiable by a convolutional network. We will investigate whether this remains true for signals that have undergone the previously discussed augmentation.

**Figure 1.** Long-window STFTs displaying the HERM line signatures of five different drones under (**A**–**E**) X-band and (**F**–**J**) W-band simulation conditions. (**A**,**F**) Mavic Air 2, (**B**,**G**) Mavic Mini, (**C**,**H**) Matrice 300 RTK, (**D**,**I**) Phantom 4, (**E**,**J**) Parrot Disco. For demonstration, *v*rad = 0, *θ* = *π*/4, *R* = 1000, *φ<sup>p</sup>* = *π*/8, and *Ar* = 4 were enforced. Signals were produced using no augmentation, and a *pd* of 1.

Before continuing with the creation of our neural network, it is prudent to examine whether the radar simulation is suitable for our purposes. To validate the Martin and Mulgrew model, we used the results of a laboratory experiment involving a commercial Typhoon H hexacopter drone. An X-band radar measured the drone, which was fixed in place and operating at a constant blade rotation frequency. Figure 2 shows the reflected radar signal and its corresponding long-window STFT spectrogram. This measured time series is periodic, just like the simulated signals produced using Equation (2). Additionally, there appear to be HERM line signatures in the STFT, verifying the reasonability of the artificial-data spectrograms shown previously. The spectrogram is not as clean as those seen in Figure 1 owing to background noise in the collected signal. This fact is addressed by our augmentation methods, which make the simulated signals more realistic. Further validation would have been performed; however, the authors were limited by data access. From this point on, the Martin and Mulgrew model was pursued because it captures many of the physical drone-blade parameters that contribute to the micro-Doppler signature.

**Figure 2.** (**A**) Radar return signal and (**B**) its corresponding long-window STFT. Captured using a TyphoonH hexacopter and an X-band radar with *fc* = 9.8 GHz and 1.5 kHz PRF.

#### *2.4. Machine Learning Pipeline*

Datasets were produced using the described data generation and augmentation methodology. A new dataset was created for each combination of radar specification and SNR, where the SNR ranged between 0 dB and 20 dB in increments of 5 dB. This SNR range was motivated by the expected SNR of actual signals collected by our available radars. Large, high-fidelity real-drone datasets are not easy to collect in practice; thus, each dataset contained only 1000 spectrogram training samples, equally weighted among the six classes (five drones and noise). A smaller validation set of 350 samples was created to follow an approximate 60-20-20 percentage split. Since the artificial data can be produced with relative ease, three test datasets, each with 350 samples, were uniquely generated. Therefore, when models were evaluated, the results included standard deviation measures.

The architecture of the neural network has a convolutional and a linear portion. An input batch of spectrograms undergoes three rounds of convolution, SoftPlus activation, instance normalization, dropout, and periodic max pooling. The batch is then flattened and passes through three hidden layers with ReLU activation. Finally, the loss is computed from the logits using cross-entropy. Figure 3 demonstrates this pipeline, with the dashed outline representing the model itself.

**Figure 3.** Machine learning pipeline where the data input is a batch of spectrograms. The dashed portion is the neural network.
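A PyTorch sketch of this architecture is given below. The channel widths, kernel sizes, dropout rate, and hidden-layer widths are assumptions chosen for illustration, as the paper does not specify them; pooling is applied along the frequency axis only so that the narrow time dimension of a 512 × 4 spectrogram is preserved.

```python
import torch
import torch.nn as nn

class DroneCNN(nn.Module):
    """Convolution -> SoftPlus -> instance norm -> dropout -> max pooling,
    repeated three times, then three ReLU hidden layers and class logits."""
    def __init__(self, n_classes=6):
        super().__init__()
        layers, in_ch = [], 1
        for out_ch in (8, 16, 32):             # assumed channel widths
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                       nn.Softplus(),
                       nn.InstanceNorm2d(out_ch),
                       nn.Dropout(0.2),
                       nn.MaxPool2d((2, 1))]   # pool frequency axis only
            in_ch = out_ch
        self.conv = nn.Sequential(*layers)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(256), nn.ReLU(),     # three hidden layers
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_classes))          # logits for cross-entropy

    def forward(self, x):                      # x: (batch, 1, freq, time)
        return self.head(self.conv(x))

# Usage: loss = nn.CrossEntropyLoss()(DroneCNN()(batch), labels)
```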

Recall that the two simulated radars used different PRF values, resulting in the size of the input spectrograms being different. A PRF of 20 kHz produces more data points per 0.3 s in the signal, resulting in a larger spectrogram than a PRF of 2 kHz. The spectrogram sizes are 512 × 4 and 2048 × 7 for the 2 kHz PRF and 20 kHz PRF, respectively. This required the creation of two (similar) networks, one for each radar. The general architecture still holds because both networks differ only in kernel sizes and the number of hidden units in the linear layers.

A CNN model was trained for each training dataset (each dataset represents a unique radar and SNR combination). The training was conducted for 300 epochs, and the most generalizable models were selected through consideration of the training and validation loss and accuracy. All training occurred using PyTorch, a Python machine learning library, on an RTX 2060S GPU.
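A condensed training-loop sketch consistent with this setup follows; the optimizer choice, learning rate, and checkpoint-selection criterion (validation loss only) are assumptions, as the paper does not state them.

```python
import copy
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=300, lr=1e-3):
    """Train for 300 epochs, keeping the checkpoint with the lowest
    validation loss as the most generalizable model."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # assumed optimizer
    criterion = nn.CrossEntropyLoss()
    best_val, best_state = float("inf"), None
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss = criterion(model(x.to(device)), y.to(device))
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(criterion(model(x.to(device)), y.to(device)).item()
                      for x, y in val_loader)
        if val < best_val:
            best_val, best_state = val, copy.deepcopy(model.state_dict())
    return best_state
```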

Following the work of El Kafrawy et al. [37], the macro-F1 score, hereafter referred to as the F1 score, was used to evaluate the performance of the trained models against the three test datasets. In many cases, the F1 score is similar to pure accuracy; however, it accounts for false positives and false negatives through precision and recall, respectively, making it preferable to accuracy. The formulation is provided in Equation (9), where *C* is the number of classes (six), and P*c* and R*c* are the precision and recall of class *c* ∈ {1, 2, ... , *C*} [37]. In their definitions, TP*c*, FP*c*, and FN*<sup>c</sup>* are the numbers of true positives, false positives, and false negatives for class *c*, respectively.

$$\text{F}\_1 = \frac{2}{C} \sum\_{c=1}^{C} \frac{\text{P}\_c \text{R}\_c}{\text{P}\_c + \text{R}\_c} \tag{9}$$

where

$$\text{P}\_c = \frac{\text{TP}\_c}{\text{TP}\_c + \text{FP}\_c} \tag{10}$$

$$\text{R}\_c = \frac{\text{TP}\_c}{\text{TP}\_c + \text{FN}\_c} \tag{11}$$
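As a check on these definitions, a minimal NumPy sketch of the macro-F1 computation from a confusion matrix is shown below (taking rows as true classes and columns as predicted classes, an assumed convention):

```python
import numpy as np

def macro_f1(conf):
    """Macro-F1 per Eqs. (9)-(11) from a C x C confusion matrix."""
    tp = np.diag(conf).astype(float)          # true positives per class
    fp = conf.sum(axis=0) - tp                # false positives per class
    fn = conf.sum(axis=1) - tp                # false negatives per class
    precision = tp / (tp + fp)                # Eq. (10)
    recall = tp / (tp + fn)                   # Eq. (11)
    return np.mean(2 * precision * recall / (precision + recall))  # Eq. (9)
```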

#### **3. Results**

Presented in Figure 4 are the F1 score results of the trained models. The model for the X-band 2 kHz PRF radar achieved an F1 score of 0.816 ± 0.011 at an SNR of 10 dB. The W-band 20 kHz PRF radar performed much worse, only reaching comparable results at 20 dB SNR. A model for a W-band radar with a 2 kHz PRF was trained as a control, and it also demonstrated weaker performance than the X-band radar at the same PRF. The X-band 2 kHz PRF model trained on 10 dB SNR data was used for further investigation due to its ability to perform relatively well under high-noise conditions.

**Figure 4.** F1 score of various models versus the SNR of the training data. The vertical black line at the top of each bar shows the standard deviation of that model. The standard deviations were created using three test datasets.

A multi-class confusion matrix and Receiver Operating Characteristic (ROC) curve were used to gain insight into the performance of the selected X-band model for each class.

The selected model performs well at classifying drone classes. However, the classifier demonstrates a high false-negative rate when a false alarm sample (noise) is presented. As seen in Figure 5, noise is most often confused with the Matrice 300 RTK. Similarly, within Figure 6, noise has the lowest area under the ROC curve. Nevertheless, the model maintains a high average ROC area of 0.9767, suggesting a minimal compromise between true positives and false negatives.

**Figure 5.** Row-normalized confusion matrix results of the X-band 2 kHz PRF model trained on 10 dB training data. Order of the classes as follows: Mavic Air 2, Mavic Mini, Matrice 300 RTK, Phantom 4, Parrot Disco, and Noise (false alarm).

**Figure 6.** Multi-class ROC curves of the X-band 2 kHz PRF model trained on 10 dB training data. The micro-Average ROC, an aggregate representation, is shown as the black dotted-line.

Next, the robustness of the model against blade pitch and varying SNR values was investigated. The inclusion of blade pitch was a driving factor in using the more complex version of the Martin and Mulgrew equations to produce the artificial dataset. The results of the trained X-band 2 kHz PRF model against the blade pitch, *φp*, are shown in Figure 7. The model's F1 score for the bins between 0 and *π*/4 remains close to the mean of 0.816, as expected. More importantly, the model remained robust, maintaining comparable performance when tested with *φ<sup>p</sup>* values outside those used in training.

**Figure 7.** F1 score versus different blade pitch, *φp*, bins for the X-band 2 kHz PRF model trained on the data within the *φp* range shown by the solid bins. The striped bins show the performance when tested on data with *φp* values not used in training. The standard deviations were created using three test datasets.

In Figure 8, a similar analysis was conducted to determine the model's robustness to varying SNR values. The model performs worse when tested on data with an SNR lower than 10 dB. In contrast, the same model can classify much more effectively as the SNR of the provided test data increases.

**Figure 8.** F1 score versus varying SNR values for the X-band 2 kHz PRF model trained on 10 dB SNR data, where the blue bars strictly represent test data with SNR values not used in training. The standard deviations were created using three test datasets.

#### **4. Discussion**

Following the processes outlined in Section 2.3, artificial datasets were generated using the Martin and Mulgrew model, and CNN classifiers were trained for each dataset. Training each of these models enables us to quickly identify the SNR required of the data for drone classification and to compare the performance of different radars for this application. Of particular importance, only 1000 spectrograms were present in each training set. This small size reflects applications where real drone data are scarce. The multi-class F1 score metric, taken from the work of El Kafrawy et al. and presented in Equation (9), was used to measure the performance of each trained model [37].

As found in Figure 4, performance in the X-band is superior to that in the W-band for the chosen CNN. This is somewhat surprising, as the W-band corresponds to a shorter wavelength and would intuitively yield better classification performance. There could be several reasons for this. Firstly, the 20 kHz PRF spectrogram is quite dense and holds much more information than a 2 kHz PRF STFT in the X-band; the additional complexity and detail may not be as robust against noise. By simulating a W-band radar at the same PRF as the X-band, we identify that the discrepancy in performance lies in the transmitted-wavelength parameter of the simulation. Note that we do not claim the X-band frequency is the best, but rather that it is better than W-band radars for typical parameters. Furthermore, X-band radars offer superior all-weather capabilities over other frequency bands. The W- and X-band radars were chosen for this paper because they are lighter and more portable than radars of other bands; however, it is possible that lower frequency bands (e.g., S-band and L-band) may provide superior performance over the X-band. This is undoubtedly a topic of exploration for future work. Additionally, our conclusion is limited to the choice of the model: a deeper CNN might yield a different conclusion, and this is something we plan to explore in future work.

In any case, the result is interesting and valuable. It suggests that an X-band radar with a PRF on the order of a few kHz can be highly effective in classifying drones under our simulation model. The lower PRF requirement in the X-band is also welcome as this leads to a longer unambiguous range. The 2 kHz PRF X-band classifier, trained on 10 dB SNR data, was selected for further investigation.

A multi-class confusion matrix and ROC curve were produced in Figures 5 and 6, respectively, which revealed shortcomings in the selected model. The noise class exhibits the weakest performance in both, and the confusion matrix reveals that noise is most often misclassified as a DJI Matrice 300 RTK drone. The reason for this misclassification is not immediately apparent. However, a qualitative consideration of Figure 1 suggests that spectrograms with a dense micro-Doppler signature (HERM lines) might be more easily confused with noise. The Matrice 300 RTK, in particular, has a very dense signature. On the other hand, an STFT of Gaussian noise contains no distinct signature or obvious spatial information. Under the low, 10 dB SNR conditions, the Matrice 300 RTK spectrogram likely becomes augmented to the point where it begins to resemble the random-looking noise STFTs. This result emphasizes the need for reliable drone detection models and algorithms, as false alarms can easily be confused with some drone types during the classification process.

Consideration of Figure 7 shows that model performance is invariant to different values of *φp*, even for values not included in the training dataset. This result is important because it suggests that classification performance is mostly unaffected by the pitch of the drone blades. In a similar analysis against varying SNR, the trained model demonstrated an approximately linear trend in performance. The model performed worse when evaluated on data with an SNR lower than the 10 dB used for training, while showing superior performance on less noisy, higher-SNR samples. This result is shown in Figure 8. Together, these results indicate that the blade pitch of the samples is not critical when training drone classifiers, and that models should maintain an approximately linear performance trend near the SNR of the training samples.

In Figures 4, 7 and 8, we observe the standard deviation of the test results shown by vertical lines. These were produced by evaluating the model's performance on three different test sets. The small size of the standard deviations strengthens our confidence in the results and, therefore, in the efficacy of our CNN classifier.

#### **5. Conclusions**

This paper investigated the feasibility of classifying drones using radar signals and machine learning. The utilized datasets were generated using the Martin and Mulgrew model with realistic drone parameters. Five distinct drones were classified, and a sixth noise (false alarm) class was also considered during our analysis. We show that it is possible to train a CNN model to classify drones in low-SNR scenarios using a signal of just 0.3 s in duration. We find that practically realizable radars (e.g., X-band with 2 kHz PRF) can lead to an F1 performance of 0.816 ± 0.011 using training data of 10 dB SNR. Further analysis shows that the trained model remains robust to varying drone blade pitch and that its performance scales approximately linearly with the SNR of the signal. However, the model becomes less viable when false alarms are presented, as they can be confused with some drone classes. We wish to stress that the presented analysis is based on a simple model and should be corroborated with higher-fidelity models.

Our goal is to continue exploring the Martin and Mulgrew model for the purpose of constructing drone classifiers. Specifically, we wish to investigate the effect of the transmitted wavelength on model performance. An in-depth comparison of classification results across a more comprehensive range of radars is a good topic for future work. Further exploration should uncover why the W-band radar signals provided worse data than the X-band for use in our CNN. As stated previously, the Martin and Mulgrew model assumes that the modelled drone has a single rotor; however, many real drones have up to eight rotors. Future work could investigate the adapted Martin and Mulgrew model proposed in [35], which considers multiple rotors. Moreover, it would be interesting to compare the performance of models trained using samples of shorter duration but higher PRF against models trained on samples of longer duration with lower PRF values. Consideration of different machine learning approaches and more complex CNN architectures will also be of interest. In parallel, recognizing that simulations produce results under controlled and ideal conditions, we wish to elucidate the applicability of the trained CNN models to real drone data. This can be investigated via direct application of the models to real data or through a transfer learning process in which the trained models are used as a starting point for further training.

**Author Contributions:** Conceptualization, E.H., D.R., A.D. and B.B.; methodology, D.R., S.H. and B.B.; software, S.H., D.R.; validation, D.R., S.H. and B.B.; formal analysis, D.R.; investigation, D.R., S.H.; resources, B.B. and A.D.; data curation, D.R.; writing—original draft preparation, D.R., E.H.; writing—review and editing, E.H., D.R., A.D. and B.B.; visualization, D.R.; supervision, B.B. and A.D.; project administration, B.B. and A.D.; funding acquisition, A.D. and B.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not Applicable.

**Informed Consent Statement:** Not Applicable.

**Data Availability Statement:** Some of our initial code: https://github.com/SinclairHudson/CANSOFCOM (accessed on 13 December 2021).

**Acknowledgments:** We would like to thank CANSOFCOM and the organizers of Hack the North for the challenge.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

CNN Convolutional Neural Network
FMCW Frequency-Modulated Continuous Wave
HERM Helicopter Rotation Modulation
PRF Pulse Repetition Frequency
RCS Radar Cross-Section
ROC Receiver Operating Characteristic
SNR Signal-to-Noise Ratio
STFT Short-Time Fourier Transform
SVM Support Vector Machine
UAV Unmanned Aerial Vehicle


#### **References**

