1. Introduction
A major trend in human habitation is the movement towards cities. According to a 2015 report by the International Organization for Migration, 54% of the global population now lives in cities versus 30% in 1950. These trends are almost certain to continue, with 67% of the world’s population expected to live in cities by the mid-21st century. Within the United States, urban areas grew over 10% from 57.9 million acres in 2000 to 68.0 million acres in 2010.
Currently, 70% of global fossil fuel CO2 emissions are from urban sources [1], but this percentage is likely to change with urbanization, increasing human presence in cities, and changing socio-ecological and climate factors. In particular, there is growing evidence and awareness that urban environments function as ecosystems, with strong feedbacks from both the altered biophysical environment and the biota to the resultant system. Recent work has shown that the urban biosphere has a substantial influence on regional carbon and water cycling [2,3]. For example, up to 20% of the excess CO2 flux in the Los Angeles basin has been attributed to biogenic emissions from Los Angeles’ irrigated landscapes, and urban/suburban water use restrictions can decrease irrigation by 6–35% in Los Angeles [4,5]. Recent studies in Boston [6] and Los Angeles [7] suggest that soil respiration is significantly higher in urban areas due to the fertilization and watering of irrigated lawns and other extensive land management practices. Furthermore, there are several well-established impacts of urban vegetation on climate, such as indirect cooling of urban areas through shading and transpiration, but these factors are heavily dependent on the type of vegetation, land management practices, and air temperature [8].
The Southern California Air Basin (SoCAB) contains both unmanaged, non-urban vegetation and heavily managed urban vegetation with different and unknown impacts on water, carbon, and climate. As cities grow and new water restrictions and carbon policies are implemented, there is an urgent need to better characterize urban ecosystems and land cover classes. We rely heavily on regional distributed land surface models to accurately hindcast and forecast carbon and water cycles. These models require land cover and land use maps that distinguish between land cover types like tree, shrub, grass, water, and impervious surface at appropriate spatial resolutions, but they often ignore or misrepresent urban regions due to poorly characterized land cover. High-resolution land cover maps with specific urban vegetation classes are needed to better estimate CO2 sources and sinks within heterogeneous urban environments, detect urban land cover and land use change, and quantify changes in the hydrologic cycle.
Current categorical land cover classification maps in the SoCAB region either cover large areas at poor spatial resolution or small areas at high spatial resolution. For example, the US National Land Cover Database (NLCD) covers the continental US at a spatial resolution of 30 m and classifies urban regions as different levels of development intensity, which is relatively coarse for urban applications [9]. In Los Angeles, several studies have created higher spatial resolution maps using a combination of field studies and remote sensing technology to investigate the socioenvironmental value of Los Angeles’s urban forests and other environmental impacts of urban land cover [10,11,12,13]. Field studies are critical to validating vegetation classification based on remotely sensed imagery but challenging to replicate over large areas. Thus, the region of interest for such studies has been limited to the census-defined City of Los Angeles or Los Angeles County (Figure 1), limiting their applicability for use in regional carbon and hydrologic cycle models. For example, [14] assessed 28 1-hectare plots throughout the city using high-resolution aerial imagery, QuickBird, Landsat, moderate resolution imaging spectroradiometer (MODIS), and airborne lidar to categorize species richness, tree density, and tree cover. Nowak et al. [11] used the Forest Service’s Urban Forest Effects (UFORE) model to assess 348 0.04-hectare field plots throughout the city to quantify urban tree species distribution, urban forest structure, and its effects on human health in Los Angeles. Wetherley et al. [15] used airborne visible-infrared imaging spectrometer (AVIRIS) (18 m) and AVIRIS Next Generation (AVIRIS-NG) (4 m) imagery to estimate sub-pixel fractions of urban land cover in Santa Barbara, CA with multiple endmember spectral mixture analysis (MESMA), evaluated against known field pixels to obtain accuracies ranging between 75% and 91% for spectrally similar and dissimilar classes, respectively. The Environmental Protection Agency (EPA) EnviroAtlas project recently developed a NAIP- and LiDAR-based 1 m land cover classification with 82% overall accuracy and 10 distinct land cover classes in 30 urban communities, including Los Angeles County [16].
Outside of southern California, Erker et al. [17] used airborne imagery from the National Agriculture Imagery Program (NAIP) to classify five main land cover classes across the entire state of Wisconsin at 0.6 m resolution: tree/woody vegetation, grass/herbaceous vegetation, impervious surface/bare soil, water, and non-forested wetland. The final classification contained misclassifications from shade effects due to topography and heterogeneous urban environments, both of which are prevalent in SoCAB due to the presence of 16.8 million people and a substantial elevation gradient, with elevations ranging from 0 m at the Pacific Ocean to 3500 m in the San Gabriel Mountains.
Our study will use publicly available data to create a flexible framework for urban land cover mapping with land cover classes relevant to disentangling urban carbon fluxes. As a case study, we will apply this framework over the entire SoCAB domain and classify multiple land cover classes, including grass, shrubs, and forests. We will implement a novel technique for classifying urban and non-urban shadow pixels, which have been a major source of uncertainty in previous remote sensing-based urban land cover classifications, and provide an accuracy assessment of our classification map.
This classification effort will use a supervised random forest algorithm in Google Earth Engine (GEE) and a combination of NAIP imagery (0.6 × 0.6 m) and Sentinel-2 imagery (10 × 10 m) to classify impervious surface, tree, grass, shrub, water, and bare soil/non-photosynthetic vegetation (NPV) in heterogeneous urban and non-urban regions across SoCAB. NAIP imagery will be used to classify urban regions, and Sentinel-2 imagery will be used to classify non-urban regions and shadow regions. We combine bare soil and NPV into a single NPV/bare soil class because these land cover classes have a negligible effect on the carbon cycle and are challenging to distinguish using only red-green-blue (RGB) and near-infrared (NIR) optical imagery. Our framework for both urban and non-urban land cover classification across SoCAB covers a larger region of southern California at a higher spatial resolution (0.6–10 m) than other known classifications. In addition, this classification utilizes NAIP and Sentinel-2 imagery that are widely available across the United States to map land cover types relevant to carbon and hydrological cycle modeling.
2. Materials and Methods
In the present study, we implement a modified method for vegetation classification following Erker et al. [17] to better account for shadow effects (Figure 2). The following sections describe our classification scheme in more detail, including a brief description of the Sentinel-2 and NAIP imagery data (Section 2.1), preprocessing for water and shadow effects (Section 2.2), selection of training and validation data (Section 2.3) for supervised image classification (Section 2.4) using object-based classification (Section 2.5), and validation (Section 2.6). All code and documentation for our high-resolution vegetation classification are available in the Supplementary Materials section.
2.1. Sentinel-2 and NAIP Data
All classification efforts were completed in Google Earth Engine (GEE) (https://earthengine.google.com) in August 2019 using the built-in NAIP and Level 1C Sentinel-2 data sets under the Earth Engine snippets ee.ImageCollection("USDA/NAIP/DOQQ") and ee.ImageCollection("COPERNICUS/S2"), respectively. GEE is a cloud-based platform for geospatial analysis that supports a multi-petabyte data catalogue through internet-based application programming interfaces (APIs) in Python and JavaScript [18]. The JavaScript API and data catalogue provided both the Sentinel-2 and NAIP imagery necessary for our classification effort and offered a fast and efficient way to classify and export the entire SoCAB spatial domain in 8 hours.
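As a minimal sketch of this setup, the two collections can be loaded and filtered in the GEE JavaScript Code Editor as follows; the socab geometry and the exact date filters are illustrative placeholders rather than the values in our released scripts:

// Load NAIP and Level 1C Sentinel-2 from the GEE data catalogue.
var naip = ee.ImageCollection('USDA/NAIP/DOQQ')
  .filterBounds(socab)                       // socab: assumed polygon of the study domain
  .filterDate('2016-01-01', '2016-12-31');   // 2016 NAIP acquisition over California
var s2 = ee.ImageCollection('COPERNICUS/S2')
  .filterBounds(socab)
  .filterDate('2018-02-01', '2018-12-31');   // Level 1C imagery used in this study
print('NAIP tiles:', naip.size(), 'Sentinel-2 scenes:', s2.size());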
NAIP airborne imagery is delivered by the United States Department of Agriculture’s (USDA) Farm Service Agency to provide leaf-on imagery over the entire continental US during the growing season and is freely and publicly accessible at either 60 cm or 1 m resolution. As in most states, California’s NAIP imagery contains four spectral bands (red, green, blue, and near-infrared) at 60 cm resolution and is re-acquired roughly every four years (2008, 2012, and 2016). The NAIP dataset’s high spatial resolution and availability over the entire country made it an ideal first step for urban vegetation classification in Los Angeles. This classification used 2016 NAIP imagery, the most recent collection available over southern California on Google Earth Engine.
Imagery from the Copernicus Sentinel-2 mission has a relatively high spatial resolution (10–60 m across bands spanning 440–2200 nm in wavelength) and frequent revisit time (3–5 days in midlatitudes) for an earth-observing satellite, making it ideal for studying vegetation change. Sentinel-2 carries a 13-band multispectral instrument with red, green, blue (RGB), and near-infrared (NIR) bands (Band 2, Band 3, Band 4, Band 8) at 10 m spatial resolution, and other bands at 20–60 m spatial resolution. This classification used Level 1C Sentinel-2 imagery, which provides top-of-atmosphere reflectance in cartographic geometry and has been radiometrically and geometrically corrected by the European Space Agency (ESA). Due to Sentinel-2’s limited operating time and availability in Google Earth Engine, only Level 1C imagery from February 2018 to December 2018 was used, rather than Level 2A Sentinel-2 imagery.
2.2. Imagery Preprocessing
2.2.1. Shadow Effects
Sentinel-2 imagery provides the advantage of frequent revisit times throughout the year; as the sun angle changes, shadow positions shift, so shadow effects can be minimized by calculating the median spectral signature in urban and non-urban regions of SoCAB. We used the median 2018 Sentinel-2 spectral signature per pixel, aligned with the time frame for 2016 NAIP acquisition, to produce shadow-corrected Sentinel-2 imagery as an input to the land cover classification (Figure 3). This study combined 2016 NAIP imagery and 2018 Sentinel-2 imagery because, at the time of development, neither 2016 Sentinel-2 imagery nor 2018 NAIP imagery for California was available on Google Earth Engine.
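A minimal sketch of this shadow correction, reusing the s2 collection from the Section 2.1 sketch, reduces the 2018 stack to one per-pixel, per-band median image:

// Per-pixel, per-band median over Feb–Dec 2018 suppresses transient shadows,
// clouds, and snow, since a given location is shadowed in only some acquisitions.
var s2Median = s2.median();
Map.addLayer(s2Median, {bands: ['B4', 'B3', 'B2'], min: 0, max: 3000}, 'S2 2018 median');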
Using median-derived Sentinel-2 imagery helps to exclude both high and low spectral reflectance values that often skew training data, corresponding to snow/clouds/high albedo and shadows, respectively. To demonstrate, we compare Sentinel-2 images derived from a single snapshot and from the 6-month median in urban and non-urban cases (Figure 4). Our method is effective at removing both non-urban shadows in topographically complex mountainous regions (Figure 4a,c) and urban shadows cast by tall buildings (Figure 4b,d).
Recognizing that a pixel’s spectral signature may change substantially between leaf-on and leaf-off seasons for urban and non-urban pixels, we used the Sentinel-2 median monthly normalized difference vegetation index (NDVI) from 2015 to 2019 to define the boundaries of leaf-on and leaf-off seasons (Figure 5). Urban regions were defined as areas of SoCAB overlapping the five 2010 U.S. Census Bureau urbanized areas in Los Angeles: Camarillo, Los Angeles-Long Beach-Anaheim, Riverside-San Bernardino, Simi Valley, and Thousand Oaks, with large agricultural and undeveloped areas removed [12]. Both non-urban and urban ecosystems in SoCAB experience a leaf-on period from February to June and a leaf-off period from July to January. A recent study of photosynthetic seasonality in California found a double peak in solar-induced fluorescence (SIF) activity in April and June due to grass and shrub ecosystems versus evergreen ecosystems, similar to our non-urban Sentinel-2 NDVI data [19]. Determining the leaf-on and leaf-off timeframes in SoCAB is important to the urban/non-urban land cover classification because separating textural and spectral features into leaf-on and leaf-off periods effectively doubles the number of training features for our supervised classification process.
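The monthly NDVI climatology behind Figure 5 can be sketched as follows; B8 and B4 are Sentinel-2’s NIR and red bands, and socab is the same placeholder geometry as above:

// Median NDVI per calendar month over 2015–2019 to locate leaf-on/leaf-off breaks.
var s2All = ee.ImageCollection('COPERNICUS/S2')
  .filterBounds(socab)
  .filterDate('2015-01-01', '2019-12-31');
var monthlyNdvi = ee.ImageCollection.fromImages(
  ee.List.sequence(1, 12).map(function(m) {
    return s2All.filter(ee.Filter.calendarRange(m, m, 'month'))
      .median()
      .normalizedDifference(['B8', 'B4'])   // (NIR - red) / (NIR + red)
      .rename('NDVI')
      .set('month', m);
  }));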
2.2.2. Water Pixels
Prior to performing supervised classification on NAIP and Sentinel-2 imagery, all water pixels were masked in Google Earth Engine to reduce misclassifications due to confusion between spectrally similar water and impervious surface or shadowed pixels. The normalized difference water index (NDWI) uses the green and NIR spectral bands to identify open water features in remotely sensed imagery [20]. The modified NDWI (MNDWI) uses the green and MIR spectral bands to identify open water features, particularly in regions dominated by built-up land features, by suppressing urban, vegetation, and soil noise [21]. NDWI can be calculated from both Sentinel-2 and NAIP imagery using the green and NIR bands, and MNDWI can be calculated from Sentinel-2 using the green and SWIR bands scaled to 20 m resolution to more accurately classify water features in urban areas [22]. A threshold approach masked all pixels with MNDWI or NDWI greater than 0.3 as water [23]. Sentinel-2 MNDWI was used to mask larger water features at 20 m resolution, such as lakes and reservoirs, and NAIP NDWI was used to mask smaller water features at 0.6 m resolution, such as pools.
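A minimal sketch of this water masking, reusing naip and s2Median from the sketches above (NAIP bands are R, G, B, N; Sentinel-2’s green and SWIR bands are B3 and B11):

// NDWI = (green - NIR) / (green + NIR); MNDWI = (green - SWIR) / (green + SWIR).
var naipMosaic = naip.mosaic();                            // single image from NAIP tiles
var ndwi  = naipMosaic.normalizedDifference(['G', 'N']);   // 0.6 m, catches pools
var mndwi = s2Median.normalizedDifference(['B3', 'B11']);  // 20 m, catches lakes/reservoirs
var waterMask = ndwi.gt(0.3).or(mndwi.gt(0.3));            // 0.3 threshold [23]
var naipLand = naipMosaic.updateMask(waterMask.not());     // keep only non-water pixels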
2.3. Training and Validating Data for Supervised Classification
To train and validate the land cover classification model, we relied on known GIS layers for impervious surface (roads and buildings) and hand-drawn polygons for the other classes. Building footprints were obtained as polygons from Microsoft US Building Footprints [24], which contains 8 million building footprints in SoCAB. Road centerline data were obtained as vectors from Los Angeles, Orange, Riverside, San Bernardino, Ventura, and Santa Barbara county GIS servers, then mosaicked together for a total of 630,000 road vectors. For the remaining land cover classes, we drew 500 polygons per land cover type for tree, grass, shrub, and bare soil/NPV (i.e., dormant and dead vegetation), and 200 polygons per land cover type for shadow, pool, and lake.
Training data for impervious surface classification were obtained independently for buildings and roads, then combined as one data set with 1000 randomly chosen polygons and vectors. Training data for water classification were obtained separately for pools and lakes, then combined into a single water class to reduce model complexity. The training data shapefiles for each land cover class (impervious, tree, grass, shrub, bare soil/NPV, shadow, and water) were imported to Google Earth Engine, and 5000 points were randomly selected from each class and binned as training (80%) and validation (20%) data.
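In GEE, the point sampling and 80/20 split can be sketched as follows, where classPolygons is an assumed merged FeatureCollection with an integer landcover property and featureStack is the feature image assembled in Section 2.4:

// Sample labeled points from the feature image, then split 80/20 for training/validation.
var points = featureStack.sampleRegions({
  collection: classPolygons,     // assumed polygons/vectors labeled with 'landcover'
  properties: ['landcover'],
  scale: 0.6                     // NAIP pixel size
}).randomColumn('random', 42);   // arbitrary seed for reproducibility
var trainingSet   = points.filter(ee.Filter.lt('random', 0.8));
var validationSet = points.filter(ee.Filter.gte('random', 0.8));
// Capping at 5000 randomly selected points per class is omitted for brevity.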
2.4. Supervised Image Classification with NAIP and Sentinel-2 Imagery
We performed two distinct image classifications on (1) urban areas defined by Wetherley et al. [12] and (2) non-urban and urban areas with significant shadow effects. NAIP imagery was used for the initial classification of all urban areas, and Sentinel-2 data were used to refine the classification in urban areas with shadows. An index-based thresholding method was used to remove water pixels from NAIP and Sentinel-2 imagery prior to supervised classification (Section 2.2.2).
The first image classification was performed across all urban areas with NAIP imagery at 60 cm resolution using the land cover classes of impervious surface, tree, grass, bare soil/NPV, and shadow in Google Earth Engine. Regions classified as shadow were reclassified with 10 m Sentinel-2 imagery, resampled to 60 cm by nearest neighbor resampling, using the land cover classes of impervious, tree, grass, shrub, and bare soil/NPV. It was assumed that there were no shadows in the Sentinel-2 imagery after shadow removal and that all water pixels were masked in the MNDWI and NDWI thresholding step. The second image classification was performed across all non-urban areas with Sentinel-2 imagery at 10 m resolution using the land cover classes of impervious surface, tree, grass, shrub, and bare soil/NPV. To create the final land cover classification map over SoCAB, all non-urban pixels classified with Sentinel-2 were resampled to 60 cm using nearest neighbor resampling, and the land cover maps of urban shadow, urban non-shadow, and non-urban areas were mosaicked together in Google Earth Engine. As such, our land cover classification map has a resolution of 0.6 m in urban areas and 10 m in non-urban areas of SoCAB.
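Under these assumptions, the shadow reclassification and final mosaic can be sketched as follows; urbanNaipClass, nonUrbanS2Class, s2Classifier, and SHADOW are placeholders for the urban NAIP classification, the non-urban Sentinel-2 classification, a classifier trained on Sentinel-2 features, and the integer shadow class value (nearest-neighbor resampling is GEE’s default behavior):

// Reclassify NAIP shadow pixels with the shadow-corrected Sentinel-2 composite.
var shadowMask = urbanNaipClass.eq(SHADOW);
var shadowReclass = s2Median.classify(s2Classifier).updateMask(shadowMask);
// Mosaic non-urban (10 m), urban non-shadow (0.6 m), and urban shadow (10 m) maps;
// later images in the collection are drawn on top.
var finalMap = ee.ImageCollection([
  nonUrbanS2Class,
  urbanNaipClass.updateMask(shadowMask.not()),
  shadowReclass
]).mosaic();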
2.5. Object-Based Classification
Given the high spatial resolution of NAIP imagery, we used object-based vegetation classification [25] to resolve “salt-and-pepper” effects, where individual pixels are incorrectly classified differently from their neighbors, thereby decreasing classification accuracy. Object-based classification uses a superpixel segmentation method to decrease computational complexity and increase classification accuracy [26]. This study implemented simple non-iterative clustering (SNIC) to identify clusters of spectrally similar pixels in Google Earth Engine. The SNIC algorithm grows clusters of spectrally similar neighboring pixels outward from regularly spaced seed pixels in an RGB image [27]. At the resolution of NAIP imagery, a single SNIC superpixel typically clustered around a tree or similarly sized land cover object (Figure 6) [28].
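GEE exposes SNIC as ee.Algorithms.Image.Segmentation.SNIC; a minimal sketch over the water-masked NAIP mosaic, with illustrative rather than tuned parameters, is:

// Segment NAIP into superpixels; per-cluster band means replace raw pixel values.
var snic = ee.Algorithms.Image.Segmentation.SNIC({
  image: naipLand.select(['R', 'G', 'B', 'N']),
  size: 5,            // seed spacing in pixels (illustrative)
  compactness: 1,     // spatial vs. spectral weighting (illustrative)
  connectivity: 8
});
var clusterMeans = snic.select(['R_mean', 'G_mean', 'B_mean', 'N_mean']);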
In addition to increasing classification accuracy, implementing an object-based classification method for NAIP imagery decreased the computational resources needed to perform land cover classification across all of SoCAB. For example, there are 553 NAIP images covering all of SoCAB (17,100 square km), each with 100–150 million 60 cm pixels. At a minimum, there are 55 billion 60 cm pixels requiring land cover classification. By reducing the number of objects in a single tile to 20 million superpixels, which roughly corresponds to a new seed pixel every 5 pixels, we decreased the total number of pixels to classify to 11 billion.
For classifying NAIP and Sentinel-2 imagery, we used a combination of multispectral and textural features to train a Rifle Serial Classifier with 30 ensembled decision trees, which is the Random Forest algorithm implementation in Google Earth Engine [29]. The random forest supervised machine learning model was used to classify urban land cover as impervious surface, tree, grass, bare soil/NPV, and shrub. Spectral features included red, green, blue, NIR, and NDVI. We derived textural features, commonly used to identify spatial tone relationships in images, from Google Earth Engine’s native gray-level co-occurrence matrix (GLCM) functions on the NDVI spectral band [28]. The selected textural features for this classification were contrast, entropy, correlation, and inertia, as defined in Haralick et al. [28].
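A sketch of the feature construction and classifier training follows; ee.Classifier.smileRandomForest is the current name of GEE’s random forest (the equivalent call at the time of our study has since been deprecated), and trainingSet comes from the Section 2.3 sketch:

// Build the spectral + GLCM textural feature stack and train a 30-tree random forest.
var ndvi = naipLand.normalizedDifference(['N', 'R']).rename('NDVI');
var glcm = ndvi.multiply(100).toInt().glcmTexture({size: 3});  // glcmTexture needs integer input
var featureStack = naipLand.select(['R', 'G', 'B', 'N'])
  .addBands(ndvi)
  .addBands(glcm.select(['NDVI_contrast', 'NDVI_ent', 'NDVI_corr', 'NDVI_inertia']));
var classifier = ee.Classifier.smileRandomForest(30).train({
  features: trainingSet,
  classProperty: 'landcover',
  inputProperties: featureStack.bandNames()
});
var urbanNaipClass = featureStack.classify(classifier);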
2.6. Land Cover Classification Validation
We withheld 20% of the initial training data to perform validation of our land cover classification across SoCAB and generate three confusion matrices: (1) SoCAB classification using only NAIP, (2) SoCAB classification using only Sentinel-2, and (3) SoCAB classification using the combined NAIP/Sentinel-2 approach. Confusion matrices are frequently used to quantify the performance of a supervised classification algorithm and provide a visualization of an algorithm’s performance in terms of user’s accuracy, producer’s accuracy, and overall accuracy. The user’s accuracy refers to the classification accuracy from the point of view of the map user, or how frequently the predicted class matches the known ground features. The producer’s accuracy refers to the classification accuracy from the point of view of the map maker, or how frequently known ground features are correctly predicted as such. The overall accuracy refers to the number of features that were correctly predicted out of the total number of known features [30]. We used a weighted accuracy assessment, based on the percentage of pixels classified as each land cover class, to calculate the weighted overall accuracy for each land cover map.
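In GEE, these matrices follow from classifying the withheld points and calling errorMatrix; a minimal sketch, reusing classifier and validationSet from the sketches above:

// Confusion matrix and accuracies from the withheld 20% validation points.
var validated = validationSet.classify(classifier);   // adds a 'classification' property
var confusion = validated.errorMatrix('landcover', 'classification');
print('Confusion matrix:', confusion);
print('Overall accuracy:', confusion.accuracy());
print("User's accuracy:", confusion.consumersAccuracy());
print("Producer's accuracy:", confusion.producersAccuracy());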
4. Discussion
In this study, we combined NAIP imagery with Sentinel-2 satellite imagery to create a high-resolution land cover classification map across urban and non-urban regions of SoCAB. This novel combination of Sentinel-2 and NAIP imagery produced the highest spatial resolution land cover classification map currently available for SoCAB. Moreover, our method, as implemented in Google Earth Engine (GEE), can be replicated for any urban area of the United States, provided sufficient training data are available. Preliminary tests also show the applicability of our supervised classification algorithm for classifying urban vegetation outside the United States using Sentinel-2 imagery. Our map has wide-ranging applications for climate science in southern California, including carbon flux quantification, urban land use planning, and hydrology modeling.
4.1. Supervised Classification Errors
The main source of error in our SoCAB vegetation classification map was misclassification of non-water pixels as water due to spectral similarity between water and impervious surface or shadow pixels. Trees and shrubs in mountainous regions of San Bernardino National Forest and Angeles National Forest were misclassified as water. Shadows cast by buildings, high-reflectance roofs, and some high-reflectance roads in desert regions of SoCAB were also misclassified as water. Although masking NAIP and Sentinel-2 imagery prior to classification with NDWI and MNDWI thresholding reduced water misclassification, an important area of future research will be to further refine water classification methods to limit misclassification in urban and topographically complex regions.
Another source of error was confusion between grass and trees in our classification. These results are similar to those of Erker et al. [17], who noted that trees and grass can have very similar spectra that are difficult to distinguish with four-band imagery, like the NAIP or Sentinel-2 RGB and NIR bands. Incorporating textural features into our random forest classifier reduced this error, but tall grass and illuminated tree canopies can appear texturally similar when using gray-level co-occurrence matrices [17]. LiDAR imagery has been shown to improve differentiation among grass, shrub, and tree classes [31], but is not available across all of SoCAB at high spatial resolution and does not have repeated acquisitions like Sentinel-2 and NAIP imagery in most areas. Future work should investigate incorporating LiDAR to improve classifications.
Re-classifying NAIP urban shadow pixels with shadow-corrected Sentinel-2 imagery reduced misclassification of impervious surface as shrub and tree but did not fully eliminate this source of error. This is likely because the spectral and textural features within a single Sentinel-2 pixel (10 × 10 m) represent the averaged features across many (~278) 60 × 60 cm pixels, so any classification of that pixel is more prone to error [32]. Furthermore, vegetation classes are more likely to be confused at coarser resolutions because urban vegetation is often found at scales <1 m2. Lastly, the sensitivity of the classification depends in part on the training data used. The training data used for shrub were derived in the wildlands. Shrub vegetation in southern California is highly heterogeneous, consisting of shrubs of many different species and sizes, as well as bare soil and sand. Thus, the shrub training polygons drawn for our supervised classifier were also heterogeneous, containing both vegetated shrub and bare soil, leading to possible spectral and textural confusion between shrub and bare soil/NPV.
4.2. Potential for Error with Sentinel-2 and NAIP Imagery
There are some concerns with mixing 2018 Sentinel-2 imagery and 2016 NAIP imagery, as the United States Drought Monitor calculated that portions of SoCAB were in D4 exceptional drought in April 2016 versus D2 severe drought in April 2018. For example, there was likely less irrigated turf grass in 2016 than in 2018 due to water usage restrictions during drought years, which could confound classification distinctions between bare soil/NPV and grass. The use of commercial very high resolution (VHR) time series imagery, such as Planet Labs or DigitalGlobe, would help resolve this issue by matching imagery time frames more closely and providing multiple years of data.
Another possible concern is geolocation mismatch between Sentinel-2 and NAIP imagery, as this classification approach combined both data sets over the same region. Based on the algorithm theoretical basis documents for NAIP and Sentinel-2 imagery, NAIP has a horizontal accuracy of 1 m ground sample distance and Sentinel-2 has an absolute geolocation uncertainty of less than 11 m [33,34]. We assumed that the co-registration between Sentinel-2 and NAIP imagery was accurate enough for this classification, but further work is needed in this area.
4.3. Comparison to Other Los Angeles, CA Land Cover Maps
Previous land cover classifications across SoCAB either fail to cover the entire spatial domain at a high enough spatial resolution for carbon flux modeling [12,16] or cover the entire spatial domain at a coarse resolution [9]. Google Earth Engine has been used for accurate high-resolution urban land cover classification across the globe, but no published results of such a classification for Los Angeles exist [35,36,37,38]. The overall accuracy of our combined classification, 85%, is comparable to that of the EPA EnviroAtlas project, but our map covers a larger spatial domain and only incorporates publicly available data sets (i.e., NAIP, Sentinel-2). Wetherley et al. [12] developed an AVIRIS-derived 15 m fractional land cover classification across urban areas of Los Angeles, with accuracies varying between 77% (impervious surface) and 94% (turfgrass) for six distinct land cover classes. Although the AVIRIS-based classification has a higher accuracy within some land cover classes (i.e., turfgrass), AVIRIS imagery is not available in many regions and covers a smaller spatial domain than our combined NAIP and Sentinel-2 classification across SoCAB.
To our knowledge, the NLCD database represents the only other urban vegetation classification scheme for SoCAB. The 2011 NLCD was derived from Landsat imagery by the USGS and Multi-Resolution Land Characteristics Consortium (MRLC) and spans the continental United States at a 30 m resolution, with 16 distinct land cover classes and 89% overall accuracy [39]. In heterogeneous urban centers like Los Angeles, the NLCD database reported nearly all pixels as “developed, open space” or “developed, low, mid or high intensity,” with no reference to urban vegetation (Figure 7). For example, the Black Gold Golf Course in Yorba Linda clearly appears as irrigated grass in NAIP imagery (Figure 7b) and was correctly classified as such in our Sentinel-2/NAIP classification (Figure 7d), but was classified as “developed, open space” in the NLCD classification (Figure 7f), with no reference to irrigated grass as managed urban vegetation.
While the NLCD classification covers the entire SoCAB spatial domain with a high overall accuracy, its 30 m spatial resolution is too coarse for a heterogeneous urban environment like Los Angeles, and NLCD categorizations do not include the land cover classes necessary for CO2 flux quantification. We note that NAIP imagery is not available outside the continental United States, and thus 60 × 60 cm classification using our combined technique is not possible there. However, our results suggest that 10 × 10 m Sentinel-2 satellite imagery may be a suitable alternative to coarse classification (e.g., MODIS land cover binned into the International Geosphere-Biosphere Programme (IGBP) categorizations) for international urban land cover classification.