1. Introduction
Mediterranean-type ecosystems (MTEs) are biodiversity hotspots, accounting for almost 20% of the world’s species in 5% of its area [1,2]. Heterogeneity in climate, topography, and fire regimes contributes to these high levels of species diversity [3,4,5]; however, climate change, frequent fire, urbanization, and non-native plant invasions are threatening the biodiversity of MTEs, leading to species extinctions as well as rapid changes in vegetation cover [2,4,6]. Therefore, it is essential to understand the trends in species loss and monitor changes in biodiversity [7]. This requires the capacity to consistently monitor occurrences of species or assemblages over broad spatial and temporal ranges, which is increasingly accomplished with advanced optical remote-sensing technologies, such as imaging spectroscopy [7,8,9,10,11,12].
Imaging spectroscopy of multispectral or hyperspectral remote-sensing data is a rapidly expanding method used in ecological and conservation applications. In the context of biodiversity monitoring, imaging spectroscopy relates the spectral properties of an area to biodiversity metrics such as species identity and diversity, functional traits, or other variables such as photosynthetic vegetation and nitrogen content [12,13,14,15]. Analyses have been broadly applied across ecosystems, including forest, dryland, marine, and urban ecosystems, with increasing emphasis on the identification of individual species and/or mapping plant species diversity at landscape scales [13,14,15].
The expanding use of remote sensing and imaging spectroscopy to estimate metrics of biodiversity requires robust verification and validation protocols that generate measures consistent with traditional ground-survey methods. This need is intensified by forthcoming spaceborne spectral missions, such as NASA’s Surface Biology and Geology mission (SBG) and other related missions in Europe and Japan [16,17]. However, validation will be challenging for spaceborne data given the extent and volume of measurements that will be needed to accommodate global satellite missions. To quantify how remote measurements relate to traditional metrics of biodiversity, it is important to understand how they scale across different ecosystems, including those that are under-studied due to limited access.
While topographic heterogeneity contributes to biodiversity in MTEs, it also creates barriers to their study due to large regions that are too steep for traditional field surveys. Studies of these regions either lack data or rely heavily on visual estimation (e.g., [18,19,20]). Furthermore, much of the protected land in MTEs occurs in steep, moderate- to high-elevation regions, which are unsuitable for development, making the study of these areas even more important for understanding trends in biodiversity [6]. While all remote-sensing studies include some type of mechanism to compare algorithmic results to reality, often the baseline condition is derived from satellite imagery or potentially outdated ground surveys, which were not always performed with the original purpose of being used to validate remote-sensing predictions [21]. Finally, the prospect of worldwide satellite-data collection intensifies the need for efficient, accurate ground-truthing methods that can be applied in a variety of landscape settings while maintaining accuracy among users.
Here, we sought to develop a field protocol appropriate for calibrating and validating vegetation-mapping algorithms from optical remote-sensing data, with a focus on methods that can be rapidly implemented in the rugged shrubland ecosystems that are characteristic of MTEs. Because optical remote sensing primarily characterizes the top of the canopy of vegetation, our method prioritizes the enumeration of species composition from semi-aerial photos taken above the canopy [22,23,24]. This method has been successfully used to document classes of vegetation cover in rangelands, but has less often been applied to identifying species [25,26]. Vegetation-cover surveys were performed for 83 plots (each 45 m × 45 m) of mostly shrubland plant communities in the southern California San Gabriel Mountains. The pictures were analyzed using a classification procedure in which species were identified on a grid superimposed on the pictures [27,28]. We evaluated the accuracy and efficiency of the method in both flat and steep terrain, and discuss the tradeoffs between more traditional vegetation-cover surveys and this method at different sampling efforts and for estimating various metrics of plant species diversity. We present this approach as a viable alternative for intensive surveys in shrublands, especially in settings where traditional surveys are impractical due to difficulty traversing steep terrain or other conditions.
2. Methods
2.1. Study-Site Description
Data collection occurred in two post-burn areas in the Angeles National Forest in southern California, with elevations ranging from less than 400 m to over 1300 m above sea level (
Figure 1). The mean (SE) slope of our study area was 27.3 (1.2) percent; however, 46 of our study plots (nearly half of the total) had an average slope value greater than 30 percent, and 14 plots had slopes greater than 40 percent. The Copper and Sayre fires occurred in 2002 and 2008, respectively, both at high intensities, burning a collective 25,000 acres. In the following years, there has been re-growth of the predominantly montane chaparral and coastal sage scrub vegetation types, as well as the establishment of non-native invasive forbs and grasses. Information on the amount of area revegetated to native and non-native vegetation, and the composition of that vegetation, has been limited by the notably steep and rugged terrain of the San Gabriel Mountains.
2.2. Field Plot Selection
In the spring and summer of 2018, the cover of all plant species was recorded from 83 field plots (43 within the Copper fire and 40 within the Sayre fire) consisting of low- and mid-statured vegetation (mainly shrubs and herbaceous species). A LiDAR digital elevation model was used to quantify the area of the study site falling within distinct topographic zones of elevation, slope, and aspect, and to ensure that the survey plots sampled a similar proportion of attributes as the survey area itself (Supplementary Materials S1: Figure S1) [29].
Each survey plot was 45 m × 45 m and was delineated in the field based upon the uniformity of vegetation, slope, and aspect. We aimed to survey individual plots that were representative of a single cover class (plant species, ground cover, etc.) or vegetation community/alliance. We similarly aimed to survey plots having uniform slope and aspect within the plot boundary.
Plots were classified as either “steep” or “flat” based on their average percent slope. “Steep” plots (having slopes greater than 10 percent, 69 plots total) were surveyed using a “Steep Protocol” to optimize efficiency and safety of surveyors, whereas “flat” plots (with an average slope less than 10 percent, 14 plots total) were surveyed with the “Flat Protocol” (both described below and pictured in
Figure 2).
2.3. Image-Based Method: Field Sampling
Photogrid (Liu, Singh and Townsend, https://github.com/EnSpec/PhotoGrid) is an open-source Python program that was developed for users to classify plant species from field photos using an image-based point-intercept sampling schema. The program superimposes a user-defined grid over a photograph of a field plot, and an analyst manually identifies the species present at each crosshair. From this, users can calculate standard metrics of species composition and diversity, such as percent cover by species or species richness. Sampling of vegetation using Photogrid requires a camera of sufficient resolution operated at a low enough altitude to enable species identification. Here, we used a camera attached to a 6 m telescoping fiberglass pole (Wonder Pole, American Flag & Banner Company, Salem, OR, USA).
To survey plots with steep slopes, we followed the “Steep Protocol.” A 45-m transect was laid along a ridgeline (generally a roadside or trail) and overhead photos were taken looking downslope along the transect using a Sony Alpha 5100 digital camera (16–50 mm lens, 23.5 × 15.6 mm CMOS sensor with 6.65 MP/cm² pixel density) mounted to the fiberglass pole. The pole was extended to the appropriate height (generally 5–6 m) per ground-slope angle to capture an approximately 5-m × 5-m area within each photo frame, which was verified by placing a meter stick on the ground in the photo area for scaling (Figure 2A,B). The camera was set to automatic focus and was controlled wirelessly via the PlayMemories Mobile application [30]. We set the shutter speed and aperture settings each day depending on the light conditions (e.g., full sun, clouds, etc.), although few adjustments were needed during our surveys. The PlayMemories application also allows for adjusting the aperture and shutter speed remotely if conditions are variable while taking the photos. Photos were taken every 5 m along the transect for a total of 10 photos per plot.
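As a rough check on the stated pole heights, the ground area captured in a photo can be approximated with a pinhole-camera model. The sketch below is illustrative only: it assumes a camera pointing straight down over level ground and the lens at its widest setting (16 mm), neither of which is stated in the text, and it ignores lens distortion and slope. The function names are hypothetical.

```python
def ground_footprint(height_m, focal_mm, sensor_w_mm=23.5, sensor_h_mm=15.6):
    """Approximate ground footprint (width, height) in metres for a
    nadir-pointing camera over level ground (pinhole approximation)."""
    scale = height_m / focal_mm  # metres on the ground per mm on the sensor
    return sensor_w_mm * scale, sensor_h_mm * scale


def height_for_short_side(short_side_m, focal_mm, sensor_h_mm=15.6):
    """Camera height needed for the short image side to span `short_side_m`."""
    return short_side_m * focal_mm / sensor_h_mm


# Assumed settings: 5 m pole height, 16 mm focal length (not given in the text).
width_m, height_m = ground_footprint(5.0, 16.0)   # roughly 7.3 m by 4.9 m
needed_m = height_for_short_side(5.0, 16.0)       # roughly 5.1 m
```

Under these assumptions, a 5-m short side corresponds to a camera height of just over 5 m, which is in line with the 5–6 m range reported above. On a slope the geometry differs, which is why the meter stick was used for scaling in the field.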
The “Flat Protocol” was used for plots surveyed in less-frequently occurring flat terrain. For these plots, two bisecting 45-m transects were laid out at 90-degree angles to each other and ten 5-m × 5-m ground photos were taken from above with the camera set-up (Figure 2C).
When using both Steep and Flat protocols, the area covered by the photos was also surveyed by experienced botanists to compile a complete plant census from ground surveys. An individual recorded all observable plant species by walking along the transect and broadly classified their percent coverage, as well as that of other cover classes such as bare ground and leaf litter. This individual also took notes on important identifying features (such as phenology and coloration of vegetation) and annotated a set of photos in the field to assist analysts back in the lab with species identification (
Figure 3A). Static location points were taken along each survey transect with a Trimble R10 GNSS system to determine precise geographic locations of surveys for referencing with airborne imagery.
2.4. Image-Based Method: Photo Analysis
Photogrid enables a systematic approach to classifying cover within photos to avoid human bias. For each 5-m × 5-m ground plot photo, we specified a grid of six rows and seven columns, totaling 42 sampling points per photo (
Figure 3B) and 420 points per 10-photo plot. In Photogrid, each gridpoint is enclosed by a solid box on the screen; the user clicks on each box and is able to zoom in or out until they can classify the dominant species or cover class within the box and select it from a corresponding list compiled from all species and cover classes that were surveyed in the field within that particular plot (
Figure 3C). Cover classes were used to accommodate the identification of bare ground, as well as a few taxa that were identifiable to genus or family from photographs, but not to species (e.g., annual grasses, which included species such as Bromus spp., Avena spp., etc.). The program saves the user’s entry for each point and calculates percent cover for each species/class once all photos have been classified for a plot (Figure 3D).
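The grid-and-tally logic described in this section can be sketched in a few lines of Python. This is not Photogrid's actual code: the even spacing of crosshairs, the function names, and the placeholder class labels are assumptions made for illustration.

```python
from collections import Counter


def grid_points(width_px, height_px, rows=6, cols=7):
    """Evenly spaced crosshair centres (x, y) in pixels for a rows x cols grid.
    The exact placement used by Photogrid may differ (assumption)."""
    return [((c + 0.5) * width_px / cols, (r + 0.5) * height_px / rows)
            for r in range(rows) for c in range(cols)]


def percent_cover(labels):
    """Percent cover per class from one label per classified gridpoint."""
    if not labels:
        raise ValueError("no classified gridpoints")
    counts = Counter(labels)
    return {name: 100.0 * n / len(labels) for name, n in counts.items()}


points = grid_points(6000, 4000)  # 42 crosshairs for one photo
# Hypothetical labels for a 10-photo plot (420 gridpoints in total).
plot_labels = ["shrub_a"] * 210 + ["annual grasses"] * 126 + ["bare ground"] * 84
cover = percent_cover(plot_labels)
```

With the six-by-seven grid, each photo contributes 42 labels, so a plot's percent cover is the share of its 420 labels assigned to each species or cover class.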
2.5. Image-Based Method: Validation
The purpose of surveying the 83 plots in 2018 was to accurately and efficiently collect percent-cover data within the two fire zones. The image-based survey method allowed a large area to be surveyed over one field season, including many steep slopes that would have been impossible to survey using a traditional field-transect approach. Consequently, only species richness (based on surveys for species presence) could be precisely measured in each field plot in 2018; no other quantitative measures of species composition were taken using traditional ground surveys.
To compare the accuracy and efficiency of the image-based method with more traditional point-intercept field-transect surveys (henceforth “point-intercept method”), we employed a second field study in 2019 with an additional 16 plots that could be safely sampled with both the Photogrid and point-intercept methods. Due to a lack of level sites within the original study area, the 16 validation plots were located outside of the 2018 study sites, but within a nearby region of the San Gabriel Mountains with similar vegetation.
For the 2019 plots, species percent cover was estimated within a 45-m × 45-m plot. A 45-m baseline transect was established with seven perpendicular survey transects laid out evenly and extending 45 m from the baseline. Along these perpendicular transects, surveyors stopped every 1 m and recorded the tallest species or object at that exact point and in contact with a PVC touch pole with a diameter of 1.9 cm. If no standing object or vegetation was touching the pole at a single location, the ground cover class was recorded. A total of 315 survey points were recorded per plot, and the percentage of each species or object out of 315 points was calculated.
Each of the 16 plots was also surveyed by the image-based method, with 10 overhead photos taken per plot using the Flat Protocol as the plots had slopes less than 10 degrees. For the photos from these plots, we modified the Photogrid output to subsample the gridpoints to test how different levels of sampling effort affect the accuracy of estimates derived from Photogrid in comparison to the field-sampled point-intercept data. Twelve Photogrid configurations (henceforth referenced as configurations 1–12) were tested, each comprising a different combination of rows, columns, and gridpoints surveyed per photo and per plot (Table 1). Configuration 1, with 420 gridpoints, had the highest sampling effort.
Through experience, we found that a three-person crew was optimal for surveying using either method, as adding more people did not improve efficiency. Using a three-person crew, we kept detailed logs of time spent both in the field and on data processing out of the field for each plot and method. These data allowed us to compare the time spent per survey type and Photogrid configuration (Table 1). The time necessary to process the photos by each configuration was estimated based on the average time required to process photos by configuration 1, reduced proportionally based on the number of gridpoints in each configuration (Table 1).
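The subsampling and the proportional time estimate can be illustrated as follows. This is a minimal sketch: the row and column steps shown are only one possible reduced configuration (the actual twelve are listed in Table 1), and the 60-minute processing time is a made-up placeholder, not a value from the study.

```python
def subsample_grid(labels, row_step=1, col_step=1):
    """Keep every `row_step`-th row and `col_step`-th column of a 2-D grid
    of gridpoint labels (list of rows)."""
    return [row[::col_step] for row in labels[::row_step]]


def estimated_processing_minutes(full_minutes, n_points, full_points=420):
    """Scale the configuration 1 processing time by the share of gridpoints kept."""
    return full_minutes * n_points / full_points


# One photo classified on the full 6 x 7 grid (placeholder labels).
full_grid = [[f"r{r}c{c}" for c in range(7)] for r in range(6)]

# Hypothetical reduced configuration: every second row, all columns.
reduced = subsample_grid(full_grid, row_step=2)
points_per_photo = sum(len(row) for row in reduced)   # 3 rows x 7 columns
points_per_plot = 10 * points_per_photo               # 10 photos per plot

# 60.0 minutes is an illustrative value for configuration 1, not a measured one.
minutes = estimated_processing_minutes(60.0, points_per_plot)
```

Because the estimate is strictly proportional, halving the number of gridpoints halves the estimated processing time; any fixed per-photo overhead (opening files, zooming) is ignored by this simplification.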
2.6. Statistical Analysis
Because percent-cover data were not collected using traditional ground-sampling in the steep 2018 plots, only species richness was compared between field observations (S_field) versus Photogrid survey (S_photo), as well as by protocol (flat versus steep). In cases where a plant could only be identified to a genus or functional group, it was counted as one species (e.g., annual grasses). A paired t-test was used to compare S_field with S_photo across all plots in the 2018 dataset. To test the effect of protocol on the difference between S_photo and S_field, a linear model was used with S_photo as the response variable, S_field as a covariate, protocol (flat, steep) as a fixed factor, and the interaction between S_field and protocol (S_photo ~ S_field + protocol + S_field:protocol).
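For readers unfamiliar with the paired comparison, the test statistic is the mean of the per-plot richness differences divided by its standard error. The sketch below uses only the Python standard library and made-up richness values; the study's own analysis was done in R, and obtaining a p-value additionally requires the t distribution with n − 1 degrees of freedom, which is not implemented here.

```python
import math
import statistics


def paired_t(field, photo):
    """Return (t statistic, degrees of freedom) for paired observations."""
    if len(field) != len(photo) or len(field) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [f - p for f, p in zip(field, photo)]
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))  # zero if all diffs are equal
    return mean_d / se_d, len(diffs) - 1


# Hypothetical richness counts for four plots (not data from the study).
s_field = [12, 9, 15, 11]
s_photo = [9, 7, 10, 8]
t_stat, dof = paired_t(s_field, s_photo)
```

A positive statistic here indicates that field richness exceeds photo-based richness on average across plots.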
For the 2019 Photogrid method validation data, our goal was to assess the correspondence of the field and Photogrid surveys, as well as compare the correspondence of the 12 Photogrid configurations with field data. The following attributes were calculated for each plot using the Photogrid and point-intercept-method cover data: species richness (S), Simpson’s species diversity (1/D), and Simpson’s evenness (E). A linear model was used with the Photogrid attribute as the response variable (e.g., S_p), the field attribute as a covariate (e.g., S_p-i, where “p-i” indicates the point-intercept method), Photogrid configuration as a fixed factor, and the interaction between the field attribute and configuration (e.g., S_p ~ S_p-i + configuration + S_p-i:configuration). A separate model was used to test each attribute. Percent cover was calculated for each species and cover class (% cover) in a plot; therefore, models testing % cover included a random factor for plot to account for multiple species measurements in each plot. Tukey tests were used to perform post-hoc comparisons for significant effects of configuration [31]. For all attributes, separate simple linear regression models and a correlation analysis were run for each configuration to report the model coefficients by configuration. Simple linear regression models were used to analyze the relationship between % cover_p using configuration 1 and % cover_p-i for species and cover classes observed in more than 50% of the plots (9 or more plots).
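The three plot-level attributes can be computed from cover data as sketched below. The formulas are the commonly used definitions (D as the sum of squared proportional abundances, 1/D as the inverse Simpson index, and E as (1/D)/S); the text does not spell out its exact formulas, nor whether non-vegetation classes such as bare ground were included, so both points are assumptions. The input values are placeholders.

```python
def diversity_metrics(cover):
    """Compute richness S, inverse Simpson diversity 1/D, and Simpson's
    evenness E = (1/D)/S from a mapping of class name -> cover
    (point counts or percent). Classes with zero cover are ignored."""
    values = [v for v in cover.values() if v > 0]
    total = sum(values)
    if total == 0:
        raise ValueError("no cover recorded")
    proportions = [v / total for v in values]
    richness = len(proportions)
    simpson_d = sum(p * p for p in proportions)
    inverse_d = 1.0 / simpson_d
    return {"S": richness, "inv_D": inverse_d, "E": inverse_d / richness}


# Hypothetical percent cover for one plot (not data from the study).
metrics = diversity_metrics({"species_a": 50, "species_b": 25, "species_c": 25})
```

Because evenness divides 1/D by S, a plot whose classes all have equal cover yields E = 1, and dominance by a single class pushes E toward 1/S.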
For the 2019 surveys, time data were analyzed to assess the efficiency of each method and whether efficiency varied with the height of vegetation, e.g., comparing low grasses or shrubs with taller shrubs. Plots were categorized into one of three classes based on visual estimation of vegetation height: low (<1 m, 4 plots), mid (1 m to 1.5 m, 7 plots), and high (>1.5 m, 2 plots). The response variable of time to complete a plot survey was analyzed with a repeated measures linear mixed model. The model included method (Photogrid or point-intercept) and height class as fixed factors and the method × height class interaction term. For this repeated-measures model, the plot was treated as a random factor to account for two measures in each plot. All data were analyzed in R using the lme4 package [
32,
33].
4. Discussion
With the increasing prominence of using remote-sensing imagery to estimate metrics of plant diversity [
7,
8,
9,
10,
11,
12], there is a concomitant need to develop new methods to rapidly measure diversity on the ground. Our method is an alternative to ground surveys that efficiently sampled vegetation in large plots over rugged terrain. Such methods will be needed to collect a sufficient number of calibration and validation points for forthcoming global imaging-spectroscopy missions such as NASA’s SBG [
16], which have specified mission objectives for characterizing metrics related to diversity.
When compared to a traditional point-intercept field sampling method, the image-based method provided comparable results for measures of Simpson’s species diversity (1/D) and percent cover of individual cover classes, while reducing time and effort to collect these data in the field. There was high correlation between point-intercept and image-based survey results and the Photogrid sampling effort (configuration) did not lead to significantly different estimates of percent cover. By using the lowest sampling effort (configuration 12) the amount of time to complete a Photogrid survey could be reduced by up to 46 min per plot without compromising the accuracy of the percent-cover estimates. This made the image-based method substantially more efficient than a traditional ground survey for estimating percent cover.
When investigating the differences in percent-cover estimates among cover classes, we found the image-based method to have the greatest accuracy for the most abundant classes (Supplementary Materials S1: Figure S2, Table S6). In addition, the image-based method produced lower values for percent cover for species that were small-statured (e.g., Cryptantha sp., a small white-flowering herb) or sparse in their cover (Acmispon glaber, dead shrub), which is likely due to the difficulty in observing these classes in the photos compared to in-person in the field. These classes are important for the overall biodiversity of the community, but have less influence on the total vegetative cover due to their lower abundances. Therefore, the performance of the image-based method in classifying abundant species makes it useful for map calibration and validation but may limit it for other more diversity-focused applications.
Furthermore, the results for species richness (S) and evenness (E) were mixed when traditional point-intercept surveys were replaced by image-based surveys. Species-richness estimates were generally higher in the field, systematically so by an average of 3.6 species across all types of plots in 2018, but only by less than one species in 2019 (Figure 4 and Figure 5C; Supplementary Materials S1: Table S1). Surveyors were able to more easily see small and less abundant species in the field that were more difficult to identify in the photos or were missed altogether through the generation of gridpoints by the Photogrid program. In addition, many smaller-statured species are likely obscured by the overstory, an observation that is likely to make image-based methods untenable for total richness in forested plots, although in many ecosystems remote sensing is only sensitive to overstory richness in forests. A few species could only be identified to a genus or functional group using photos (e.g., annual grasses), whereas they could be identified to species in the field. This was uncommon and mostly occurred with non-native annual grasses, which could lead to recording one or two more species in field surveys. Finally, as the photo-sampling effort decreased, the difference between point-intercept and image-based results for species richness and evenness became larger, with the image-based method becoming progressively less accurate (Supplementary Materials S1: Table S1; Figure 5).
While our larger field effort took place during the first year of the study when species percent cover was estimated within 83 plots, we did not conduct quantitative field sampling of each plot using a traditional field-based method. However, we did perform an exhaustive visual survey to identify the number of species present, and the difference in richness followed the same trend found in the survey conducted in 2019, i.e., that field sampling recorded more species present. In 2018, it is notable that image-based estimates of richness were closer to field estimates of richness for the steeper plots than the flat plots. This result may occur because transects for steep plots were laid at the top of the slope along a flat trail or road, with images collected downslope of the flat area that could be traversed safely, whereas transects for flat plots were laid across the sampled area, and the field crew could traverse the entire sampling area. This difference in the ability to traverse the plots may account for species being missed in the field in steep areas. This finding highlights the general issue that the steep and dangerous terrain within the majority of this region makes it impossible to use traditional field survey methods to “ground-truth” the image-based data for these plots.
As such, 16 new plots were chosen during the second-year validation study with more moderate, traversable slopes well-suited for more high-effort field surveys in addition to the image-based surveys. The results of our validation experiment confirmed the effectiveness and accuracy of the image-based method; however, our validation was limited to only the Flat Protocol due to safety concerns with traversing steep plots. Based on our experience using these protocols in a variety of settings, we think the Steep Protocol is more accurate than visual estimation for collecting percent-cover data; however, we lack a quantitative assessment of this accuracy. The fact that our small-scale and highly controlled study faced such impediments in accessing reliable ground-truthing data suggests this is likely a universal problem in remote-sensing and other ecological studies involving steep or otherwise hazardous terrain. This limitation highlights the importance of this tool in problematic landscapes that are typical of MTEs, which feature both steep terrain and a heterogeneous composition of short- and mid-statured vegetation.
The sampling method with a camera on a pole seems best utilized in ecosystems such as MTEs, which are dominated by shrubs or large bunch grasses that can be identified easily with semi-aerial images. Species identification from photos could be difficult in ecosystems with many small-statured species, such as some grasslands, but could still be used in these ecosystems to map broader cover classes such as photosynthetic and non-photosynthetic vegetation [24,34]. It is also worth reiterating that in some settings, an image-based method may be the only safe and efficient approach to collect a sufficient sample size of data. We used a camera affixed to a telescoping fiberglass pole, but the image-sampling method could also be applied to data from an unmanned aerial system (UAS) or airborne imagery. Certainly, image acquisition from a drone would be necessary for the method to be used in tall vegetation such as forests or in very steep canyons, but as noted previously, the method would miss large numbers of understory species in such environments. More generally, though, the operation of UASs for data collection in remote environments may be difficult. For species identification of low-stature vegetation such as in our study and characteristic of MTEs, a UAS may need to be flown at a low altitude in which air movement by propellers may preclude the identification of species. This can be mitigated by higher flight altitudes using higher-resolution cameras. However, there can also be logistical issues with UASs, including transport into remote sites, restrictions related to line-of-sight flying, and safety concerns when flying at low altitudes in steep terrain. Although low-tech, the extendable pole was easily transported into the field and deployed by a team of three to rapidly collect imagery over the shrubland vegetation at our sites. In some conditions, using a camera on a pole will be more affordable and logistically simpler than surveys with a UAS.
5. Conclusions
The implications of these results vary depending on the application of the survey method. To meet certain biodiversity assessment goals highly influenced by species richness, the image-based method will fall short in its failure to identify all species, especially those occurring in the understory. A more appropriate application for the method is in broader-scale habitat mapping, and to capture spatial or temporal trends in diversity metrics rather than the correct absolute values of those metrics, such as global comparisons of biodiversity trends across MTEs. This method should also work well to capture the total cover and abundance of common or physiognomically large species, and it had its greatest efficiency in sampling vegetation more than 1 m tall (Figure 6). The spatial scale and image resolution of our surveys were appropriate for identifying the required species and composition patterns of many described plant communities and vegetation alliances commonly found in MTEs. This resolution was also ideally suited to calibrating vegetation maps derived from imaging spectroscopy for southern California shrublands (Bonfield et al., manuscript in preparation). An added value of this method is the ability to store the images for archival purposes or for future analysis for a different purpose [23,24,27].
Ultimately, the metric most important for validation of imaging spectroscopy is the percent cover of each species, which was found to be reliably measured on the plot scale by the image-based method (Figure 5B) and for dominant individual species (Supplementary Materials S1: Figure S2). Image-based surveys require less effort to perform compared to more traditional field surveys, and they also open up some areas to ground-truthing that may have previously gone unsurveyed due to hazardous or steep terrain. As remote sensing of vegetation continues to be a common and relied-upon technique for collecting ecological data, it is increasingly important for ground-truthing efforts to match the levels of quality necessary to confidently assess the calibration and validity of these efforts, and to keep pace with the evolving technology. This image-based sampling method can be used in a variety of applications for these purposes.