Data Descriptor

Tree Cover for the Year 2010 of the Metropolitan Region of São Paulo, Brazil

by Fabien H. Wagner * and Mayumi C.M. Hirye
Remote Sensing Division, National Institute for Space Research (INPE), São José dos Campos 12227-010, SP, Brazil
* Author to whom correspondence should be addressed.
Data 2019, 4(4), 145; https://doi.org/10.3390/data4040145
Submission received: 20 October 2019 / Revised: 10 November 2019 / Accepted: 11 November 2019 / Published: 14 November 2019

Abstract

Mapping urban trees with images at a very high spatial resolution (≤1 m) is a particularly relevant recent challenge due to the need to assess the ecosystem services they provide. However, due to the effort needed to produce these maps from tree censuses or with remote sensing data, few cities in the world have a complete tree cover map. Here, we present the tree cover data at 1-m spatial resolution of the Metropolitan Region of São Paulo, Brazil, the fourth largest urban agglomeration in the world. This dataset, based on 71 orthorectified RGB aerial photographs taken in 2010 at 1-m spatial resolution, was produced using a deep learning method for image segmentation called U-net. The model was trained with 1286 images of size 64 × 64 pixels at 1-m spatial resolution, containing one or more trees or only background, and their labelled masks. The validation was based on 322 images of the same size not used in the training and their labelled masks. The map produced by the U-net algorithm showed an excellent level of accuracy, with an overall accuracy of 96.4% and an F1-score of 0.941 (precision = 0.945 and recall = 0.937). This dataset is a valuable input for the estimation of urban forest ecosystem services, and more broadly for urban studies or urban ecological modelling of the São Paulo Metropolitan Region.
Dataset: The dataset is available at https://doi.org/10.5281/zenodo.3373632.
Dataset License: CC-BY

1. Introduction

Recently, urban and peri-urban forests have received growing interest because they are now officially recognized as ecosystem service providers that can help to achieve the sustainable development goals (SDGs), particularly SDG 11, which aims to make cities and human settlements inclusive, safe, resilient, and sustainable [1,2]. These forests can provide ecosystem services related (i) to provision and wealth, as they can be used for the production of food or goods; (ii) to climate-related regulation, such as the reduction or mitigation of the heat island effect or of runoff, which contributes to the safety of the urban environment; (iii) to ecosystem support, such as carbon and nutrient cycling, photosynthesis and soil formation; and, finally, (iv) to cultural services, by providing an aesthetic environment for recreation spaces and social venues and by improving the diversity and attractiveness of cities through diverse landscapes and increased biodiversity [3,4,5].
In its last report on the state of the world’s forests, the Food and Agriculture Organization of the United Nations (FAO) presented a methodology to quantify the benefits of urban and peri-urban forests (see Box 21 in [1]). It consists of mapping and measuring forests and trees on the ground or in images and using i-Tree Eco (www.itreetools.org), a software program designed to assess specific tree systems’ benefits and also to express their value in monetary terms [1,6]. However, these estimates of services/disservices often use random point sampling and visual interpretation, and the outputs are not spatially explicit maps, limiting the interpretation of where ecosystem services are concentrated, and who benefits [7].
Furthermore, while such ground maps are feasible in cities or towns in the developed world, they are extremely challenging to produce for mega-cities of developing countries, such as the Metropolitan Region of São Paulo (MRSP), which is the most important agglomeration in Brazil and the fourth largest urban agglomeration in the world with 21.6 million inhabitants [8,9]. In the MRSP, efforts have been made in recent years to map forest/vegetation patches with satellite or aerial images and to map the trees on the streets of the São Paulo municipality (both datasets available at http://geosampa.prefeitura.sp.gov.br). However, a complete tree cover map of the MRSP is still lacking.
Making tree cover maps with very high-resolution images is still challenging. It has been shown that some classical pixel- or segment-based methods can achieve good accuracy. For example, an overall accuracy of 79.3% was achieved for the tree canopy cover of the state of Wisconsin with the most recently published method [7]. However, this is still far from the accuracy that can be reached by deep learning-based segmentation methods. Recently, an innovative method to extract tree cover in very high-resolution images was proposed for tropical forests [10]. Using a convolutional network called U-net [11], it was shown that forest cover could be segmented at a regional scale with very high-resolution images, and an overall accuracy >95% was obtained for the forest cover map [10], a value previously unattainable with traditional segmentation methods.
In our dataset, we provide the tree cover map for the year 2010 of the MRSP. Aerial RGB images with a spatial resolution of 1 m provided by the ‘Empresa Paulista de Planejamento Metropolitano S.A’ (Emplasa) were used in combination with the U-net deep learning method to segment the tree cover in the images. The dataset presented in this paper mitigates the lack of a reliable and complete tree cover map for the MRSP. It can be broadly used for urban studies and urban ecological modelling, and it is an important contribution to the estimation of urban forest and street tree ecosystem services. The tree cover dataset at 1-m spatial resolution is available at https://doi.org/10.5281/zenodo.3373632 [12].

2. Materials and Methods

2.1. Study Site

The tree cover dataset covers the MRSP, which consists of 39 municipalities and covers a total area of ∼8000 km² (Figure 1). The shapefiles containing the border of the MRSP and of the municipalities are available online (http://datageo.ambiente.sp.gov.br/app/# in the directory /Limites Administrativos-Completo/).

2.2. High-Resolution Images of São Paulo

In this study, we used 71 orthorectified RGB aerial photographs that were produced and made available to this project by the ‘Empresa Paulista de Planejamento Metropolitano S.A’ (Emplasa). Emplasa is a public institution that elaborates and subsidizes the implementation of public policies and integrated projects of urban and regional development in the State of São Paulo. The 71 orthorectified RGB aerial photographs from 2010 are a sample of the complete set of aerial photographs of the State of São Paulo, which comprises 1727 orthophotos with 30% to 60% lateral overlap between flights. These aerial orthophotos were generated with a spatial resolution of 1 m on average and were acquired over the region during the dry winter months (July, August and September) of 2010 and 2011. The aircraft used were a Carajá Turboprop and a Lear Jet, and the cameras were the Ultracam, models X and XP, acquired from Microsoft. The images can be viewed on the following Brazilian government website: http://datageo.ambiente.sp.gov.br/app/# in the directory /Base Imagem/Portal de imagens-DigitalGlobe/Ortofotos do Estado de São Paulo-2010/2011 (EMPLASA), or in GIS software using the WMS link: http://datageo.ambiente.sp.gov.br/serviceTranslator/rest/getXml/Geoserver_Imagem/ORTOFOTOS_EMPLASA_2010/1435155780713/wms.

2.3. Tree Cover Segmentation

2.3.1. U-Net Model

In this study, we used a convolutional network for image segmentation known as U-net [11]. Details regarding the architecture of the model can be found in [10]. This network performs a per-pixel classification, predicting the probability that each pixel belongs to a particular class. The U-net model has recently become a new standard in dense image labeling [14]. We used the three-band RGB image as the input. The code of the U-net model was adapted from the original U-net code developed for Keras and RStudio (available here: https://keras.rstudio.com/articles/examples/unet.html).

2.3.2. Network Training

To train the U-net algorithm to recognize and segment tree cover, a tree cover mask was manually delineated in parts of one of the aerial images (image ID: SF-23-Y-C-VI-2-SO), resulting in 4015 polygons. The image was chosen because it contained all the types of São Paulo tree cover, including isolated trees, natural forests, degraded natural forests and planted forests, as well as a high diversity of background classes, with different urban building types (high-rise buildings, individual houses, slums and industrial buildings) and other important classes for the city of São Paulo such as water reservoirs, roads and highways. The tree cover polygons were converted to a raster with the same spatial resolution and dimensions as the image SF-23-Y-C-VI-2-SO, where 1 and 0 indicate tree cover and background, respectively. Then, 1608 images and their associated labels, each 64 × 64 pixels at 1-m spatial resolution, were extracted from the image SF-23-Y-C-VI-2-SO and the tree cover raster. Among the extracted images, 1296 contained trees or forest and 312 contained only background. A regular grid with a cell size of 64 × 64 pixels was used to ease the extraction of these images. To constitute the training and validation samples, we randomly selected 1286 of the 1608 images (80%) for training and the remaining 322 (20%) for validation. Data augmentation was applied randomly to the input images: rotations of 0°, 90°, 180° or 270°, and changes in brightness, saturation and hue. For the latter, the images were converted from RGB to Brightness-Saturation-Hue (BSH) space and the current values were modulated to between 95% and 110% for brightness, 95% and 105% for saturation, and 99% and 101% for hue (as changes in plant hues are not expected). We trained our network for 150 epochs, with 16 images per batch. In deep learning, one epoch represents a complete learning cycle, in which all the training images have been presented once to the neural network. To ease computation and convergence, the training images are sent to the network in small groups called batches. The initial learning rate was set to 1 × 10⁻⁴. The optimization was stopped when the improvement in the loss function did not exceed 1 × 10⁻⁴.
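The sampling and augmentation steps described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R/Keras code. The function names, the fixed random seed and the use of the standard-library HSV conversion as a stand-in for the BSH space are all assumptions; only the sample sizes, batch size and modulation ranges come from the text.

```python
import colorsys
import random

N_IMAGES = 1608        # 64 x 64 pixel patches extracted from the labelled orthophoto
TRAIN_FRACTION = 0.8   # 80% training, 20% validation, as stated in the text
BATCH_SIZE = 16


def split_indices(n_images, train_fraction, seed=0):
    """Randomly split patch indices into training and validation sets."""
    rng = random.Random(seed)  # the seed is an arbitrary, hypothetical choice
    indices = list(range(n_images))
    rng.shuffle(indices)
    n_train = int(n_images * train_fraction)
    return indices[:n_train], indices[n_train:]


def rotate90(patch, k):
    """Rotate a square patch (list of rows) clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        patch = [list(row) for row in zip(*patch[::-1])]
    return patch


def augment_pixel(rgb, rng):
    """Jitter one RGB pixel (floats in [0, 1]) in HSV space.

    Ranges follow the text: 95-110% brightness, 95-105% saturation and
    99-101% hue. HSV is an assumed stand-in for the BSH space.
    """
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h * rng.uniform(0.99, 1.01)) % 1.0
    s = min(1.0, s * rng.uniform(0.95, 1.05))
    v = min(1.0, v * rng.uniform(0.95, 1.10))
    return colorsys.hsv_to_rgb(h, s, v)


train_idx, val_idx = split_indices(N_IMAGES, TRAIN_FRACTION)
batches_per_epoch = -(-len(train_idx) // BATCH_SIZE)  # ceiling division
```

With these settings the split yields 1286 training and 322 validation patches, and one epoch corresponds to 81 batches of at most 16 images.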

2.3.3. Segmentation Accuracy Assessment

Two performance metrics were computed. First, the overall accuracy was computed as the percentage of correctly classified pixels. Second, the F1 score was computed for each class i as the harmonic mean of the precision and recall, Equation (1), where precision was the ratio of the number of segments correctly classified as i to the number of all segments classified as i (true positives and false positives), and recall was the ratio of the number of segments correctly classified as i to the total number of segments belonging to class i (true positives and false negatives). This score varies between 0 (lowest value) and 1 (best value).
$$F1_i = 2 \times \frac{precision_i \times recall_i}{precision_i + recall_i} \tag{1}$$
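As a minimal sketch of Equation (1) and of the overall accuracy, the following Python snippet derives the metrics from two binary label sequences. The function names are hypothetical, and counting at the pixel level is an assumption made here for simplicity.

```python
def confusion_counts(predicted, reference):
    """Count true/false positives and negatives for the tree class (label 1)."""
    tp = fp = fn = tn = 0
    for p, r in zip(predicted, reference):
        if p == 1 and r == 1:
            tp += 1
        elif p == 1 and r == 0:
            fp += 1
        elif p == 0 and r == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as in Equation (1)."""
    return 2 * precision * recall / (precision + recall)


def evaluate(predicted, reference):
    """Return precision, recall, F1 score and overall accuracy."""
    tp, fp, fn, tn = confusion_counts(predicted, reference)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1_score(precision, recall), accuracy
```

Applied to the precision and recall reported in the abstract, `f1_score(0.945, 0.937)` rounds to 0.941, which matches the published F1-score.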

2.3.4. Prediction

For prediction, each orthophotograph was cropped on a regular grid of 512 × 512 pixels, and 64 neighbouring pixels were added on each side to create an overlap between the patches. The predictions were made on the resulting 640 × 640 pixel images, and the predicted images were cropped back to 512 × 512 pixels and merged to reconstitute an image of the tree cover with the extent of the original orthophotograph. This overlapping method was used to avoid prediction artifacts at the patch borders, a known problem for the U-net algorithm [11]. The prediction returned by the algorithm is an image containing the probability that each pixel belongs to the tree cover class. Pixels with a probability greater than or equal to 0.5 were labelled as tree cover (value = 1), and the others as background (value = 0). The resulting tree cover map had the same 1-m spatial resolution as the input images.
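The overlapping-tile procedure can be illustrated with a short Python sketch. It is not the authors' implementation: `predict_tiled` and `predict_patch` are hypothetical names, the image is simplified to a single 2-D list of values instead of three RGB bands, and clipping the padding at the image border is an assumption, since the text does not say how borders were handled.

```python
def predict_tiled(image, predict_patch, tile=512, pad=64, threshold=0.5):
    """Predict a binary mask tile by tile, using padded (overlapping) patches.

    image         -- 2-D list (rows of pixel values)
    predict_patch -- function returning per-pixel probabilities with the
                     same shape as the patch it receives (stand-in for the model)
    """
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Padded window around the tile, clipped at the image border.
            r0, r1 = max(0, top - pad), min(height, top + tile + pad)
            c0, c1 = max(0, left - pad), min(width, left + tile + pad)
            patch = [row[c0:c1] for row in image[r0:r1]]
            probs = predict_patch(patch)
            # Keep only the central tile; the padded margin is discarded.
            for r in range(top, min(height, top + tile)):
                for c in range(left, min(width, left + tile)):
                    mask[r][c] = 1 if probs[r - r0][c - c0] >= threshold else 0
    return mask
```

With the default values, an interior patch measures 512 + 2 × 64 = 640 pixels on each side, as in the text, and only its central 512 × 512 block is written to the output mask.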

2.3.5. Algorithm

The model was coded in the R language [15] with the RStudio interface to Keras [16,17] and a TensorFlow backend [18]. Training the model took ∼10 h on the GPU of an Nvidia GeForce GTX970m with 3 GB of dedicated memory. Predicting the tree cover of a single image on the GPU took approximately 35 min.

3. Results

3.1. Tree Cover Segmentation Details and Accuracies

The overall accuracy measured for the 322 images of the validation sample was 96.4%, and the F1-score was 0.941 (precision = 0.945 and recall = 0.937), Table 1. Time for convergence was ∼10 h. The best model was obtained after 107 epochs with 16 images per batch (Table 1). Results of the segmentation for two image subsets of the São Paulo municipality are presented in Figure 2. The tree cover area and percentage for the MRSP and all its municipalities are presented in Table 2. The tree cover percentage ranges from 9.3% (São Caetano do Sul) to 87.0% (Juquitiba), and the municipality of São Paulo has a tree cover of 37.7%. Considering (i) a mean tree crown area of 32.5 m², as estimated for São Paulo previously [19,20], based on the mean tree crown area of 1109 adult trees of the species Tipuana tipu (Benth.) Kuntze, the most common species in the city, and (ii) the total MRSP tree cover obtained with the U-net segmentation (Table 2), we estimated that the number of trees in the MRSP is ∼136,484,100.
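The arithmetic behind the tree-number estimate and the cover percentages can be reproduced with a few lines of Python. This is an illustrative check only; the variable and function names are hypothetical, and the input values are taken from Table 2 and from the mean crown area given above.

```python
TOTAL_TREE_COVER_M2 = 4_435_732_828   # Table 2, row "TOTAL MRSP"
MEAN_CROWN_AREA_M2 = 32.5             # mean crown area of Tipuana tipu [19,20]

# Number of trees = total tree cover divided by the mean crown area.
estimated_trees = TOTAL_TREE_COVER_M2 / MEAN_CROWN_AREA_M2


def percent_cover(tree_cover_m2, area_m2):
    """Tree cover as a percentage of the municipality area."""
    return 100 * tree_cover_m2 / area_m2


sao_paulo = percent_cover(573_864_553, 1_520_949_482)
juquitiba = percent_cover(454_485_483, 522_311_329)
sao_caetano = percent_cover(1_423_650, 15_328_286)
```

The division gives roughly 136.5 million trees, in line with the rounded figure reported above. The estimate assumes that every tree has the mean crown area of a single species, so it should be read as an order of magnitude.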

3.2. Limitations of the Tree Cover Dataset

The segmentation produced by the algorithm has three main known limitations. First, due to the shade of some buildings or mountains, the image may be too dark for the algorithm to recognize objects. In our dataset, the main misclassifications were associated with shade occurring in the Serra do Mar, a mountain system that follows the Atlantic coast in the southern part of the imagery, which is outside the MRSP. Hence, we recommend using the tree cover data only inside the MRSP border. Misclassifications due to shadows cast by buildings were not frequent in our dataset but are likely to be observed in other cities, particularly in higher-density urban environments. Second, on relatively few occasions, the algorithm segments green vegetation or algae with a texture or color similar to that of trees. Finally, during the conversion of the segmentation results from raster format to shapefile, the tree cover borders were simplified to reduce the size of the data, so they are smoother than in the raster data. Accordingly, we recommend using the rasters rather than the shapefiles to estimate tree cover. Both rasters and shapefiles are available in the tree cover data.

3.3. Dataset Location and Format

The tree cover data are available in raster and shapefile formats at the Zenodo permanent data repository (https://doi.org/10.5281/zenodo.3373632) [12]. The dataset is distributed in 71 tiles (EPSG:32723, WGS 84 / UTM zone 23S). The rasters contain one band with a value of 1 if the pixel is tree cover and 0 otherwise. The shapefiles contain only polygons, and these polygons are the tree cover. The tree cover can represent individual trees, natural forests, degraded natural forests or forest plantations. The total size of the decompressed archives is 6.40 gigabytes (6 GB for the shapefiles and 0.4 GB for the rasters).
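Because each raster pixel is either 1 (tree cover) or 0 (background) at 1-m resolution, tree cover area can be obtained by counting pixels. The sketch below assumes the band has already been read into a 2-D list with a GIS raster library of the reader's choice; the function name and the returned keys are hypothetical.

```python
def tree_cover_stats(band, pixel_size_m=1.0):
    """Summarise a binary tree cover band (rows of 0/1 values).

    With 1-m pixels, each tree cover pixel represents 1 square metre.
    """
    pixel_area_m2 = pixel_size_m ** 2
    n_pixels = sum(len(row) for row in band)
    n_tree = sum(1 for row in band for value in row if value == 1)
    return {
        "tree_cover_m2": n_tree * pixel_area_m2,
        "total_area_m2": n_pixels * pixel_area_m2,
        "percent": 100 * n_tree / n_pixels,
    }


# Toy 2 x 3 band: three tree pixels out of six.
example = tree_cover_stats([[1, 0, 0], [1, 1, 0]])
```

When working with real tiles, pixels outside the MRSP border should be masked first, in line with the recommendation to use the data only inside that border.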

Author Contributions

F.H.W. and M.C.M.H. conceived and designed the experiments; F.H.W. performed the experiments; F.H.W. analysed the data; M.C.M.H. contributed materials; F.H.W. wrote the manuscript, which was revised, reviewed and edited by M.C.M.H.; Funding was acquired by F.H.W., and the project was administered by F.H.W.

Funding

The research leading to these results received funding from the project BIO-RED ‘Biomes of Brazil - Resilience, Recovery, and Diversity’, which is supported by the São Paulo Research Foundation (FAPESP, 2015/50484-0) and the U.K. Natural Environment Research Council (NERC, NE/N012542/1). F.H.W. has been funded by FAPESP (grant 2016/17652-9). M.C.M.H. acknowledges the support of CNPq through a doctoral fellowship.

Acknowledgments

We thank Emplasa for providing the orthorectified aerial photographs.

Conflicts of Interest

The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. FAO. The State of the World’s Forests 2018 - Forest Pathways to Sustainable Development; FAO: Rome, Italy, 2018.
  2. Keeler, B.L.; Hamel, P.; McPhearson, T.; Hamann, M.H.; Donahue, M.L.; Prado, K.A.M.; Arkema, K.K.; Bratman, G.N.; Brauman, K.A.; Finlay, J.C.; et al. Social-ecological and technological factors moderate the value of urban nature. Nat. Sustain. 2019, 2, 29.
  3. Salbitano, F.; Borelli, S.; Conigliaro, M.; Yujuan, C. Guidelines on Urban and Peri-Urban Forestry; FAO: Rome, Italy, 2016.
  4. Diaz, S.; Fargione, J.; Chapin, I.F.S.; Tilman, D. Biodiversity Loss Threatens Human Well-Being. PLoS Biol. 2006, 4, e277.
  5. World Resources Institute. Millennium Ecosystem Assessment - Ecosystems and Human Well-Being: Biodiversity Synthesis; World Resources Institute: Washington, DC, USA, 2005.
  6. Nowak, D.J.; Robert, E., III; Crane, D.E.; Stevens, J.C.; Walton, J.T. Assessing Urban Forest Effects and Values, New York City’s Urban Forest; Resour. Bull. NRS-9; US Department of Agriculture, Forest Service, Northern Research Station: Newtown Square, PA, USA, 2007; Volume 9, 22p.
  7. Erker, T.; Wang, L.; Lorentz, L.; Stoltman, A.; Townsend, P.A. A statewide urban tree canopy mapping method. Remote Sens. Environ. 2019, 229, 148–158.
  8. United Nations; Department of Economic and Social Affairs; Population Division. World Urbanization Prospects: The 2018 Revision (ST/ESA/SER.A/420); United Nations: New York, NY, USA, 2019.
  9. Instituto Brasileiro de Geografia e Estatística (IBGE). Censo Demográfico 2018; Instituto Brasileiro de Geografia e Estatística: Rio de Janeiro, Brazil, 2018.
  10. Wagner, F.H.; Sanchez, A.; Tarabalka, Y.; Lotte, R.G.; Ferreira, M.P.; Aidar, M.P.M.; Gloor, E.; Phillips, O.L.; Aragão, L.E.O.C. Using the U-net convolutional network to map forest types and disturbance in the Atlantic rainforest with very high resolution images. Remote Sens. Ecol. Conserv. 2019.
  11. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2015.
  12. Wagner, F.H.; Cursino de Moura Hirye, M. Tree cover for the year 2010 of the Metropolitan Region of São Paulo, Brazil. Zenodo 2019.
  13. MapBiomas. Project MapBiomas, Collection 2.3 of Brazilian Land Cover & Use Map Series; Technical Report. Available online: https://mapbiomas.org/ (accessed on 9 May 2018).
  14. Huang, B.; Lu, K.; Audebert, N.; Khalel, A.; Tarabalka, Y.; Malof, J.; Boulch, A.; Le Saux, B.; Collins, L.; Bradbury, K.; et al. Large-scale semantic classification: Outcome of the first year of Inria aerial image labeling benchmark. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2018), Valencia, Spain, 22–27 July 2018.
  15. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2016.
  16. Allaire, J.; Chollet, F. Keras: R Interface to ‘Keras’, R package version 2.1.4; 2016. Available online: https://keras.rstudio.com (accessed on 9 May 2018).
  17. Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 9 May 2018).
  18. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Software. Available online: tensorflow.org (accessed on 9 May 2018).
  19. Brazolin, S. Biodeterioração, Anatomia do Lenho e Análise de Risco de Queda de Árvores de Tipuana, Tipuana tipu (Benth.) O. Kuntze, nos Passeios Públicos da Cidade de São Paulo, SP. Ph.D. Thesis, Universidade de São Paulo, São Paulo, Brazil, 2009.
  20. Buckeridge, M. Árvores urbanas em São Paulo: Planejamento, economia e água. Estudos Avançados 2015, 29, 85–101.
Figure 1. Geographical locations of the borders of the Metropolitan region of São Paulo in red; municipality borders are represented in black; and, the extents of EMPLASA images used to generate the tree cover mask in this study are shown in green. The background colours are the 2010 land use/cover classes from the MapBiomas project [13]. No copyright is associated to the MapBiomas data.
Figure 2. Details of the segmentation for two image subsets not used for the training. Original RGB images (a,b), original RGB image with tree mask obtained by the U-net model in white colour (c,d), only trees obtained by the U-net model segmentation and background masked in white (e,f). Each image covers approximately 4 km². The images are the property of the EMPLASA and have been made available to the authors for research purposes. No copyright is associated to these images.
Table 1. Numerical evaluation of the models and convergence details.
| Model | Epoch | Batch | Training Sample | Validation Sample | Overall Accuracy | F1-Score | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| Tree cover | 107 | 16 | 1286 | 322 | 96.40% | 0.941 | 0.945 | 0.937 |
Table 2. Area, estimations of tree cover and percent cover estimates for all municipalities of the Metropolitan region of São Paulo.
| Municipality | Area (m²) | Tree Cover (m²) | Tree Cover Proportion (%) |
|---|---|---|---|
| ARUJA | 96,080,440 | 54,270,091 | 56.48 |
| BARUERI | 65,692,474 | 20,102,919 | 30.60 |
| BIRITIBA-MIRIM | 317,237,148 | 223,608,775 | 70.49 |
| CAIEIRAS | 96,102,694 | 71,034,715 | 73.92 |
| CAJAMAR | 131,347,100 | 86,694,991 | 66.00 |
| CARAPICUIBA | 34,548,308 | 8,292,009 | 24.00 |
| COTIA | 324,070,631 | 221,935,409 | 68.48 |
| DIADEMA | 30,791,640 | 6,030,683 | 19.59 |
| EMBU | 70,394,395 | 37,736,857 | 53.61 |
| EMBU-GUAÇU | 155,639,645 | 80,348,397 | 51.62 |
| FERRAZ DE VASCONCELOS | 29,556,363 | 12,981,528 | 43.92 |
| FRANCISCO MORATO | 49,071,419 | 25,169,178 | 51.29 |
| FRANCO DA ROCHA | 134,156,972 | 77,066,747 | 57.45 |
| GUARAREMA | 270,677,717 | 131,200,741 | 48.47 |
| GUARULHOS | 318,598,553 | 151,441,828 | 47.53 |
| ITAPECERICA DA SERRA | 150,882,196 | 97,725,433 | 64.77 |
| ITAPEVI | 82,674,965 | 43,733,761 | 52.90 |
| ITAQUAQUECETUBA | 82,577,422 | 25,236,094 | 30.56 |
| JANDIRA | 17,455,689 | 6,071,403 | 34.78 |
| JUQUITIBA | 522,311,329 | 454,485,483 | 87.01 |
| MAIRIPORA | 320,642,337 | 232,198,368 | 72.42 |
| MAUA | 61,849,444 | 20,879,373 | 33.76 |
| MOGI DAS CRUZES | 712,355,131 | 399,792,521 | 56.12 |
| OSASCO | 64,955,644 | 12,157,470 | 18.72 |
| PIRAPORA DO BOM JESUS | 108,541,021 | 69,323,286 | 63.87 |
| POA | 17,257,438 | 4,483,013 | 25.98 |
| RIBEIRAO PIRES | 99,089,353 | 64,058,945 | 64.65 |
| RIO GRANDE DA SERRA | 36,329,599 | 27,031,697 | 74.41 |
| SALESOPOLIS | 424,735,476 | 301,765,354 | 71.05 |
| SANTA ISABEL | 363,157,697 | 186,702,540 | 51.41 |
| SANTANA DE PARNAIBA | 179,960,318 | 105,001,269 | 58.35 |
| SANTO ANDRE | 175,734,910 | 94,738,489 | 53.91 |
| SAO BERNARDO DO CAMPO | 409,403,419 | 228,630,967 | 55.84 |
| SAO CAETANO DO SUL | 15,328,286 | 1,423,650 | 9.29 |
| SAO LOURENÇO DA SERRA | 186,359,362 | 155,273,390 | 83.32 |
| SAO PAULO | 1,520,949,482 | 573,864,553 | 37.73 |
| SUZANO | 206,127,277 | 99,847,434 | 48.44 |
| TABOAO DA SERRA | 20,387,896 | 4,527,716 | 22.21 |
| VARGEM GRANDE PAULISTA | 42,493,643 | 18,865,751 | 44.40 |
| TOTAL MRSP | 7,945,524,833 | 4,435,732,828 | 55.83 |
