Improving Estimates and Change Detection of Forest Above-Ground Biomass Using Statistical Methods

Turton, Amber E.; Augustin, Nicole H.; Mitchard, Edward T. A.

doi:10.3390/rs14194911

Open Access Review

by Amber E. Turton ¹,*, Nicole H. Augustin ¹ and Edward T. A. Mitchard ²

¹ School of Mathematics, University of Edinburgh, Edinburgh EH9 3FD, UK
² School of GeoSciences, University of Edinburgh, Edinburgh EH8 9XP, UK
* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(19), 4911; https://doi.org/10.3390/rs14194911

Submission received: 24 July 2022 / Revised: 7 September 2022 / Accepted: 27 September 2022 / Published: 1 October 2022

(This article belongs to the Section Forest Remote Sensing)

Abstract:

Forests store approximately as much carbon as is in the atmosphere, with potential to take in or release carbon rapidly based on growth, climate change and human disturbance. Above-ground biomass (AGB) is the largest carbon pool in most forest systems, and the quickest to change following disturbance. Quantifying AGB on a global scale, and being able to reliably map how it is changing, is therefore required for tackling climate change by targeting and monitoring policies. AGB can be mapped using remote sensing and machine learning methods, but such maps have high uncertainties, and simply subtracting one from another does not give a reliable indication of changes. To improve the quantification of AGB changes it is necessary to add advanced statistical methodology to existing machine learning and remote sensing methods. This review discusses the areas in which techniques used in statistical research could positively impact AGB quantification. Nine global or continental AGB maps, and a further eight local AGB maps, were investigated in detail to understand the limitations of techniques currently used. It was found that both modelling and validation of maps lacked spatial consideration. Spatial cross validation or other sampling methods, which specifically account for the spatial nature of these data, are important to introduce into AGB map validation. Modelling techniques which capture the spatial nature of the data should also be used. For example, spatial random effects can be included in various forms of hierarchical statistical models. These can be estimated using frequentist or Bayesian inference. Strategies including hierarchical modelling, Bayesian inference, and simulation methods can also be applied to improve uncertainty estimation. Additionally, visualising these uncertainties using pixelation or contour maps could improve interpretation. Reducing uncertainty, which is commonly between 30% and 40%, is also needed to produce accurate change maps, which will benefit policy decisions, policy implementation, and our understanding of the carbon cycle.

Keywords:

above-ground biomass; remote sensing; statistics; modelling; spatial modelling; machine learning; uncertainty propagation; validation; change detection; carbon cycle; forest degradation

1. Introduction

Forests cover 31% of the global land area, which is around four billion hectares [1]. Here, global land area is defined as the area under national sovereignty, excluding the area under inland and coastal waters [2]. These forests are an important carbon pool which actively takes in and releases large amounts of carbon dioxide, making a considerable impact on the amount of carbon dioxide in the atmosphere and its year-to-year variation [3]. FAO and UNEP [1] suggest that deforestation occurred at a rate of around 10 million hectares per year for 2015–2020. Meanwhile, other sources suggest tree cover loss of around 30.6 million hectares over the same time period [4,5]. Forest degradation also has a significant impact on carbon levels [6]. Deforestation and forest degradation account for approximately 11% of anthropogenic carbon emissions, second only to the energy sector, and these carbon emissions contribute heavily towards climate change [7]. However, forests also mitigate climate change by taking in, overall, about a quarter of the carbon released by humans (acting as a carbon sink). Climate change itself also poses a major risk to the health of forests, and their ability to continue acting as a carbon sink [1].

Forests store a large amount of carbon which can cause them to become carbon sinks as they grow or become carbon sources through deforestation. Tropical forests are soon likely to become a carbon source, owing to continued forest loss and climate change [8]. To understand climate change and mitigate global warming, it is essential to quantify the amount of carbon held in these forests, and detect changes occurring.

Monitoring carbon levels is also an important part of global policy making. Global forests are highly relevant to the 13th and 15th of the United Nations (UN) Sustainable Development Goals (SDGs), which focus on ‘Climate Action’ and ‘Life on Land’, respectively [9]. Additionally, the UN Paris Climate Agreement aims to limit global warming to less than 2 degrees Celsius above pre-industrial levels, and preferably less than 1.5 degrees [10]. To achieve these goals, reducing emissions from the forest sector is essential. A mechanism developed by the UN to reduce emissions is the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program [11]. This offers incentives to developing countries to reduce emissions from forested lands and invest in low-carbon paths to sustainable development [7]. Transparent reporting of changes in forest carbon is required for such international agreements. AGB can also be considered an essential climate variable (ECV), one of the variables that are important for characterizing the Earth’s climate system [12].

Above-ground biomass (AGB) is a good indicator of the amount of carbon held in a forest above the ground and is commonly used for forest monitoring. It is defined as all living biomass above the soil including stem, stump, branches, bark, seeds, and foliage [13]. Around 50% of the dry weight of trees, otherwise known as biomass, is carbon [14,15]. In some forests, the vast majority of the carbon is instead stored belowground as peat. However, this is hard to monitor because satellites can see only above ground, and so this study will focus on AGB estimation.
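As a rough worked example of the carbon fraction quoted above, consider a hypothetical stand with an AGB density of 200 Mg/ha (a value chosen purely for illustration, not taken from any cited map). Under the approximate 50% carbon fraction, its above-ground carbon density would be about:

```latex
C_{\mathrm{AGB}} \approx 0.5 \times \mathrm{AGB}
                 = 0.5 \times 200~\mathrm{Mg\,ha^{-1}}
                 = 100~\mathrm{Mg\,C\,ha^{-1}}
```

The symbol C_AGB is introduced here only for this example. Because the carbon fraction is approximate, any relative error in the AGB estimate carries through directly to the carbon estimate.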

Remote sensing is the process of acquiring information about the Earth’s land from a distance, most commonly using aircraft or satellites such as those shown in Table 1. Remote sensing has increasingly been used for forest monitoring and management alongside field surveys. It is a non-destructive method that can be used for AGB estimation. It requires less manual work than forest inventory plots, covers larger areas, and can provide data for inaccessible areas. Satellites can also offer repeated images using the same sensors over time, allowing change to be detected more efficiently [16]. However, the measurement, reporting, and verification of forest carbon is currently a lengthy and expensive process. It is hoped that carbon estimation from remote sensing data will become quicker and that uncertainties can be quantified and decreased [17]. Current industry maps, such as those shown in Table 2, have high uncertainty: for individual pixels, errors can be around 30–40% [17], where pixel error is defined by comparing the value of a pixel estimated from remote sensing data with the value estimated from a forest inventory plot underneath. Differences between maps also show that, in certain regions, actual uncertainties considerably exceed the reported uncertainties [18].

AGB can be estimated using various forms of remote sensing data from both aircraft and satellites. To perform predictions on a global scale the most appropriate form of remote sensing is satellite data, with aircraft or Unmanned Aerial Vehicle (UAV) data used for testing new methods, as a high resolution calibration/validation dataset for satellite-based maps, or as a ‘stepping stone’ dataset between field data and satellite data. Currently there is a range of satellites which can be used to produce AGB maps. The main forms of satellite data used for estimation purposes are optical, synthetic-aperture radar (SAR), and light detection and ranging (lidar). Satellite sensors can detect various characteristics of the forest area closely linked to AGB, such as tree height and vegetation levels.

This review will focus on a number of topics for which it is expected that statistical and machine learning methods will improve the current process. The statistical topics covered are:

(i): Spatial modelling: Spatial modelling techniques can account explicitly for the spatial correlation which is exhibited in satellite data.
(ii): Combining data sources: Interpolation techniques to combine data sources available at different spatial scales.
(iii): Model validation: Validation methods which can be used effectively for spatially correlated data.
(iv): Uncertainty measurements: Methods used to appropriately recognise and visualize final estimates of error, when errors arise from multiple sources.
(v): Change detection and quantification: Methods which may help to detect and quantify AGB change over time.

This study is unique in taking a statistical approach to reviewing AGB maps. Unlike other reviews, the investigation focuses on the statistical methods applied to each data set. Given the wide availability of data, it is important to understand how methods can be improved to make effective use of these data. A previous study by Lu [19] discusses the models used to estimate AGB, but focuses on the overall data used for estimation and the overall problems facing the field. More recently, Giménez et al. [17] evaluated the AGB estimation field. They indicate that statistical and computational methods are likely to improve estimation, but do not provide great detail on how this will occur.

This investigation should help researchers, from both the statistical and remote sensing communities, to advance the AGB mapping field. It should provide researchers with the background knowledge and further reading needed to motivate research into the most pressing topics and problems faced in the field of AGB mapping.

2. Background Information

2.1. Field Data

In most countries, there exist National Forest Inventories (NFIs) which monitor the extent of forests within the nation using field surveys. These occur at regular intervals and can provide extensive information about key forest attributes such as: size, species, and condition. These data are used to find estimates for stand volume and biomass. Ground data is essential for the creation of effective AGB maps and for validation purposes [20].

Historically, AGB has been estimated from field surveys using a methodology known as the harvest method or destructive method. This involves felling trees and taking measurements including: species, diameter at breast height (DBH), wood density, and height. These trees are then oven dried and weighed, giving their biomass [15]. These data sets can be used to find regression coefficients for models such as in Equation (1) [21,22].

$$\mathrm{AGB} = \beta_{0} + \beta_{1} \cdot \mathrm{Height} \cdot \mathrm{DBH}^{2} \qquad (1)$$

These models are known as allometric equations and are usually developed for particular regions or species of tree. Different allometric equations are developed because different species and regions can have very different carbon levels with the same characteristics. Once a model has been developed, it can be applied to (non-destructive) inventory measurements to give biomass estimates for trees, and thus whole sample plots. Various allometric equations have been developed for a range of species and regions [22].
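The fitting step behind an allometric equation of the form in Equation (1) can be sketched as follows. This is a minimal illustration only: the tree measurements are invented, the function names `fit_allometric` and `predict_agb` are hypothetical, and published allometric equations are typically fitted to far larger harvest data sets and often on transformed scales.

```python
from statistics import mean


def fit_allometric(height_m, dbh_cm, agb_kg):
    """Fit AGB = b0 + b1 * Height * DBH^2 by ordinary least squares."""
    # Single compound predictor x = Height * DBH^2, as in Equation (1)
    x = [h * d ** 2 for h, d in zip(height_m, dbh_cm)]
    x_bar, y_bar = mean(x), mean(agb_kg)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, agb_kg))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def predict_agb(b0, b1, height_m, dbh_cm):
    """Apply the fitted equation to a non-destructive inventory measurement."""
    return b0 + b1 * height_m * dbh_cm ** 2


# Hypothetical harvested trees (all values invented for illustration)
heights = [12.0, 18.0, 22.0, 27.0, 31.0]
dbhs = [15.0, 24.0, 33.0, 41.0, 52.0]
# Synthetic, noise-free biomass generated from known coefficients
agb = [5.0 + 0.03 * h * d ** 2 for h, d in zip(heights, dbhs)]

b0, b1 = fit_allometric(heights, dbhs, agb)
plot_tree_agb = predict_agb(b0, b1, height_m=20.0, dbh_cm=30.0)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients; with real harvest data the residual scatter would feed into the uncertainty of every downstream plot estimate.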

Equations are openly available for the temperate zones of Europe and for the tropical regions [23,24]. For the creation of AGB maps using GEDI lidar data, further stand-level allometric equations have been produced by continent and general plant functional type, such as broad-leaf trees [25].

2.2. Optical Remote Sensing

Optical satellite images are obtained by passive sensors which rely on reflectance of solar energy back to the sensor. These data are in the visible and near-infrared spectrum [26]. Optical data are provided in the form of multi-band images, where each band represents a region of the electromagnetic spectrum within the visible and near-infrared spectrum. Optical imagery most closely represents what is visible to the naked eye.

Generally, a multi-spectral image is provided with a number of spectral bands; for example, Landsat-9 has 11 available spectral bands [27]. Vegetation indices can also be developed from optical imagery to inform AGB estimation. Healthy vegetation exhibits unique patterns within the multi-spectral bands. Vegetation indices, such as the normalized difference vegetation index (NDVI), make use of these patterns and often use a transformation of two bands of information.
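As an illustration of such a two-band transformation, NDVI is computed from the near-infrared (NIR) and red reflectances as (NIR − Red)/(NIR + Red). The sketch below uses invented reflectance values, and the handling of pixels with zero total reflectance is an assumption made only to keep the example runnable.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        # Assumption for this sketch: report 0 where there is no signal
        return 0.0
    return (nir - red) / total


def ndvi_image(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized 2D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical surface reflectances (values invented for illustration)
nir_band = [[0.50, 0.45], [0.20, 0.05]]
red_band = [[0.10, 0.08], [0.20, 0.05]]
index = ndvi_image(nir_band, red_band)
```

Values lie between −1 and 1; dense healthy vegetation reflects strongly in the near-infrared and absorbs red light, so it gives values towards the upper end of that range.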

Optical imagery is widely and publicly available from satellites including: Landsat (15–100 m resolution) and Sentinel-2 (10–60 m resolution) [27,28]. Higher resolution optical imagery is also available from private satellites such as Pléiades Neo (30 cm resolution) owned by Airbus and Worldview-3 operated by Maxar Technologies [29].

Optical imagery is useful to map forest land cover and predict biomass due to its wide availability. In particular, change can be observed through optical imagery since long time series of optical data are available with a short revisit period [26]. For example, Landsat has data available from the 1970s, and each Landsat satellite has a 16 day revisit period (8 days when two satellites are operated together, as listed in Table 1). Optical imagery is also often used in conjunction with other data sources.

Table 1. Satellite data sources available for future AGB prediction.

Name	Type of Data	Years Available	Life Span	Revisit Period	Cost	Provider	Resolution
Sentinel-1	SAR C-band	2014-Present	Continuous	6/12 days	Open access	ESA	10 m
TanDEM-X	SAR X-band	2010-Present	5+ years	11 days	Private	DLR and Airbus	25 cm–40 m
ALOS-2 PALSAR-2	SAR L-band	2014-Present	5+ years	14 days	Yearly mosaic open access	JAXA	10 m
RADARSAT 1-2	SAR C-band	1995-Present	Continuous	24 days	Limited open access	CSA	1–100 m
BIOMASS	SAR P-band	2023 Launch	5.5 years	3 days	Open access	ESA	200 m
Landsat 4–9	Optical	1984-Present	Continuous	8 days	Open access	NASA	30 m
Sentinel-2	Optical	2015-Present	Continuous	2–5 days	Open access	ESA	10–60 m
MODIS	Optical	1999-Present	Beyond life span	16 days	Open access	NASA	250–1000 m
ICESat-2	Lidar	2018-Present	3–7 years	<33 days	Open access	NASA	2 m
GEDI	Lidar	2018-Present	2+ years	Not guaranteed	Open access	NASA	25 m circular footprints

2.3. Synthetic Aperture Radar (SAR) Remote Sensing

Unlike optical satellites, SAR satellites actively emit microwave radiation to survey the Earth. The amount of radiation scattered back increases as the volume of vegetation in an area increases. This makes SAR very useful for forest inventory. Different SAR satellites use different wavelengths, which are sensitive to different vegetation types. Shorter wavelengths, such as C and X band, are sensitive to small vegetation structures such as leaves and twigs. Meanwhile, longer wavelengths, such as L and P band, are sensitive to large trunks. Longer wavelengths are therefore more useful for above-ground biomass estimation in forests.

SAR data are available from various sources at wavelengths within the C, X and L bands. These include Sentinel-1, providing C-band data (>5 m resolution), and ALOS-2 PALSAR-2, providing L-band data (10 m resolution) [30,31]. ESA is expected to launch the BIOMASS satellite in 2023, which will provide P-band data-sets that will be much more useful for AGB prediction in areas with high biomass density [32]. These satellites are shown in Table 1.

SAR is generally used as the primary form of data to predict biomass as it has been found to have a strong relationship with AGB. Backscatter increases rapidly with AGB until it reaches a saturation point of around 100–150 Mg/ha for L-band data [33]. This saturation point makes it difficult to map densely forested areas accurately. The saturation point increases for longer-wavelength SAR such as P-band, and so it is expected that the future BIOMASS satellite will provide improved AGB predictions. It is also possible to use SAR interferometry and tomography, measurement methods which use the phenomenon of interference of waves, to find estimates of forest height through model-based inversions [34].
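The saturation behaviour described above can be illustrated with a simple exponential backscatter curve of the kind often associated with water cloud type models. The functional form here is a simplified stand-in, and the parameter values (`sigma_ground`, `sigma_veg`, `beta`) are invented for illustration rather than taken from any of the cited maps.

```python
import math


def backscatter(agb, sigma_ground=0.02, sigma_veg=0.10, beta=0.02):
    """Simplified backscatter (linear power units) as a function of AGB in Mg/ha.

    The ground contribution is attenuated by the canopy while the vegetation
    contribution grows towards a ceiling; all parameters are hypothetical.
    """
    transmissivity = math.exp(-beta * agb)
    return sigma_ground * transmissivity + sigma_veg * (1.0 - transmissivity)


def gain(agb_low, agb_high):
    """Increase in backscatter between two AGB densities."""
    return backscatter(agb_high) - backscatter(agb_low)


# The same 50 Mg/ha increase produces a much smaller signal change
# in dense forest than in sparse forest.
gain_sparse = gain(0.0, 50.0)
gain_dense = gain(150.0, 200.0)
```

Once the curve flattens, a small amount of sensor noise corresponds to a very wide range of plausible AGB values, which is one reason why uncertainty grows in high-biomass forests.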

2.4. Lidar Remote Sensing

Similarly to SAR satellites, lidar instruments are active sensors. They transmit laser light directly down to the target, some of which is scattered and reflected back to the instrument, where it is analysed. The change in properties of the light and the time taken to return enable properties of the target and its distance from the instrument to be determined. A representation of a forest can be built with structural parameters estimated directly. Lidar offers more detailed responses than the other methods given.
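The distance calculation rests on the two-way travel time of the laser pulse: the range is the speed of light multiplied by the return time, divided by two. The sketch below applies this to a hypothetical pulse with separate canopy-top and ground returns; the orbit altitude and tree height are invented, and real waveform processing is considerably more involved.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_from_time(return_time_s):
    """One-way distance to the reflecting surface from the two-way travel time."""
    return SPEED_OF_LIGHT * return_time_s / 2.0


def canopy_height(t_canopy_s, t_ground_s):
    """Height of the canopy top above the ground from two return times."""
    return range_from_time(t_ground_s) - range_from_time(t_canopy_s)


# Hypothetical sensor 400 km above the ground, over a 30 m tall canopy
t_ground = 2.0 * 400_000.0 / SPEED_OF_LIGHT
t_canopy = 2.0 * (400_000.0 - 30.0) / SPEED_OF_LIGHT
height = canopy_height(t_canopy, t_ground)
```

In this example the two returns are separated by only about 0.2 microseconds, which gives a sense of the timing precision needed to resolve forest height.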

Lidar is available on a local scale through airborne data from planes and UAVs. Lidar is also available from the satellite ICESat-2 and the International Space Station-borne instrument GEDI, which were both launched in 2018 [35,36] and are still operating at the time of writing. Both are shown in Table 1.

Lidar data are very useful for AGB mapping as they can map the height of forests with very high accuracy. Height can then be used to estimate biomass through allometric equations as outlined in Section 2.1. However, satellite lidar data-sets are only available sparsely, in discontinuous footprints along strips, and while data from aircraft/UAVs are continuous and often very high resolution (<1 m), the expense of data collection means they are collected rarely in space and time. This means advanced techniques are needed to use lidar for large scale continuous mapping. Because of this, lidar is often used as a supplementary data source to improve maps or as a calibration/validation data-set.

3. Data

This review will investigate the statistical methods used to create current AGB maps which are available on a continental or global scale. In 2021, the European Space Agency (ESA) and National Aeronautics and Space Administration (NASA) collaboratively introduced the first open source platform for global above-ground biomass maps, known as the Multi-Mission Algorithm and Analysis Platform (MAAP) [37]. It is hoped that this platform will encourage new research and algorithm development in the global scientific community [38], and it includes most maps regularly used by the research community. For these reasons, the maps available from this portal will be the main focus of investigation and discussion. Other high profile continental or global maps have also been included in this review. Nine maps were investigated thoroughly, as shown in Table 2. Other AGB maps, produced on a smaller regional scale or still in the development phase, are referenced throughout this review but are not included in this table. In total, seventeen AGB maps were investigated here.

Table 2. Data and methodology used to produce global AGB maps.

Owner	Map	Reference	Spatial Resolution	Input Data	Data Used to Train or Validate Models	Method to Obtain Estimate	Method to Combine Data	Method Used to Validate Model	Uncertainty Estimates
ESA, JAXA	GlobBiomass 2010	Santoro et al. [39]	100 m	SAR C-band, SAR L-band, Optical	Spaceborne lidar, Forest Inventory field data	Water cloud model	Weighted combination of two predictions	RMSE	Standard deviation available
NCEO	Africa Aboveground biomass map 2017	Rodriguez-Veiga and Balzter [40]	100 m	SAR L-band, Optical percent tree cover	Spaceborne lidar, Airborne lidar	Random forest for canopy height, empirical model for AGB	Tree cover used to constrain predictions to areas with tree cover	Spatial k-fold cross validation	N/A
NASA	GEDI Level 4A Footprint AGB 2020	Duncanson et al. [41]	25 m (available at footprints)	Spaceborne lidar	Airborne lidar and field data	OLS regression	N/A	Geographic cross validation	N/A
NASA	JPL Benchmark map	Saatchi et al. [42]	1 km	Optical vegetation indices, Microwave, Digital elevation map (DEM)	Field data and GLAS lidar	Maximum entropy machine learning	Variables in model	Cross validation with separate data-set	Available at pixel level
NASA	Mangrove canopy height and biomass map 2000	Simard et al. [43]	100 m	Digital elevation map (DEM), Spaceborne lidar	Field data	Allometric equations, regression models	N/A	RMSE	N/A
ESA	CCI Biomass 2017, 2020	Santoro [44]	100 m	SAR C-band, SAR L-band	Spaceborne lidar	Water cloud model, least squares regression and self calibration	Weighted combination of two predictions	RMSE	Standard deviation available
_	Tropical carbon density map 2003-14	Baccini et al. [45], Baccini et al. [46]	500 m	Optical mosaic imagery	Field data and GLAS lidar	Random forest	N/A	RMSE validation with separate data-set	Available at national scale
_	Integrated pan-tropical biomass map	Avitabile et al. [47]	1 km	Multiple AGB maps	Separate reference data-set	Regression model	Linear weighted average of predictors	RMSE with separate data-set	Map available for most regions

4. Large-Scale Spatial Modelling

Global satellite remote sensing data is used to estimate the underlying quantity of AGB. These estimates can be produced in areas where these satellite data are available. Models are developed using a set of training data for which the AGB values are known and satellite data are available. These training data sets usually make use of field data or local airborne lidar data, both of which provide accurate enough estimates of AGB to train models.

4.1. Current Global Modelling Approaches

There are very few cases of spatial and temporal statistical models being used for AGB prediction and this is an active field of research likely to improve AGB mapping [17]. The majority of global maps produced for AGB prediction use simple multivariate linear models or generalised linear models (GLMs). For example, Duncanson et al. [41] maps are produced using ordinary least squares (OLS) models, i.e., linear models. Simard et al. [43] also uses linear models. The water cloud model, a non-linear model, is commonly used in biomass estimation. The water cloud model is used within the ESA’s GlobBiomass and CCI Biomass maps [39,44].

There are also examples of global maps using machine learning methodology. Rodriguez-Veiga et al. [48] uses a combination of random forest and empirical modelling to predict biomass. Here, Random Forest (RF) is used for the initial estimation of canopy height, then empirical allometric equations are used to estimate biomass. More simply, Baccini et al. [45] use pure RF to do their prediction. Other examples of machine learning methodology used include the Saatchi et al. [42] map. This uses a Bayesian machine learning method known as a maximum entropy model to produce estimates.

Linear models (LMs) and generalised linear models (GLMs) by their nature do not account for nonlinear effects of predictors which may be apparent in an ecological setting, such as AGB estimation. Additionally, the LMs, GLMs and machine learning methods used in global maps do not account for any spatial correlation other than that induced by predictors.

To summarise, the two main strands of methods applied in the field of AGB estimation are statistical and machine learning methods. Statistical methods include LMs, GLMs and hierarchical models. Machine learning methods include, for example, random forests and k-nearest neighbours. The advantage of statistical models is that they readily provide uncertainty estimates, whereas machine learning methods do not always do so. For example, in random forests a loss function is minimized; unless the loss function is based on a likelihood, uncertainty can only be quantified through simulation.
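One simulation-based route to uncertainty is the bootstrap: refit the model on resampled training data and examine the spread of the resulting predictions. The sketch below uses a simple straight-line fit as a stand-in for a machine learning model, with invented data, and the function names are hypothetical. Note that this naive bootstrap resamples observations independently, so it inherits the independence assumption that Section 4.2 cautions against for spatial data.

```python
import random
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def bootstrap_prediction(x, y, x_new, n_boot=500, seed=0):
    """Mean and standard deviation of predictions over bootstrap refits."""
    rng = random.Random(seed)
    n = len(x)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        b0, b1 = fit_line(xs, ys)
        preds.append(b0 + b1 * x_new)
    return mean(preds), stdev(preds)


# Hypothetical training data: a remote sensing metric and plot AGB (Mg/ha)
metric = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
residual = [3, -4, 2, -1, 5, -3, 1, -2, 4, -5]
plot_agb = [4 * m + e for m, e in zip(metric, residual)]

pred_mean, pred_sd = bootstrap_prediction(metric, plot_agb, x_new=27.5)
```

For spatially correlated data, resampling whole spatial blocks rather than individual observations would be a more defensible variant of the same idea.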

4.2. What Problems Are Faced When Modelling AGB Data?

As stated in Cressie [49], basic modelling approaches often use the convenient assumption of independence between sample points. However, this independence assumption is violated for spatial data, where dependence is present in all directions and weakens as data points become more dispersed. Satellite data are strongly spatially and temporally correlated; data values will often be very similar to those of neighbouring pixels and are also likely to be similar to previous observations of the same location. This means that models such as LMs, GLMs or RF, where independence of data points is assumed, should not be used for AGB estimation. If such models are used when their assumptions are violated, this can cause model misspecification, leading to biased estimates. Non-spatial models are used in all of the current global AGB maps listed in Table 2. This is likely because most standard modelling techniques and machine learning methods do not account for correlation. The complexity of creating spatial models, or even a lack of awareness that they may be necessary, can also lead to spatial models not being used.
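A simple diagnostic for the spatial dependence described above is Moran's I, which compares each pixel's deviation from the mean with the deviations of its neighbours. It is used here only as an illustrative check and is not a method taken from the maps reviewed; the sketch assumes a small rectangular grid with rook (edge-sharing) neighbours and equal weights.

```python
def morans_i(grid):
    """Moran's I for a 2D grid using equally weighted rook neighbours."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean_value = sum(values) / n
    denominator = sum((v - mean_value) ** 2 for v in values)

    numerator = 0.0
    weight_sum = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    numerator += (grid[r][c] - mean_value) * (grid[rr][cc] - mean_value)
                    weight_sum += 1
    return (n / weight_sum) * numerator / denominator


# A smooth gradient: neighbouring pixels are similar
gradient = [[r + c for c in range(4)] for r in range(4)]
# A checkerboard: neighbouring pixels are always different
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

i_gradient = morans_i(gradient)
i_checkerboard = morans_i(checkerboard)
```

Values well above zero, as for the gradient, indicate that neighbouring pixels carry overlapping information; applying the same statistic to model residuals is one way to see whether a fitted model has left spatial structure unexplained.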

False assumptions are also often made regarding the errors of AGB maps. For example, Saatchi et al. [42] assume that all errors are identically distributed, independent, and follow a normal distribution. The first assumption is unlikely to hold for this type of data, as areas with low accessibility, low sampling, or species heterogeneity are likely to have higher errors, whilst spatial correlation causes non-independent errors.

Additionally, data used to create global AGB maps will be complex and large-scale. The data will be highly multivariate as there can be many sources of remote sensing data and there can be multiple predictor variables from just one source of satellite data. Applying advanced modelling techniques to these large multi-variate data sets requires high performance computing methods and is likely to benefit from future developments in this field [17].

4.3. Methods to Model Spatial Data

Hierarchical models provide a suitable framework for modelling in the presence of spatial dependence; see for example [50,51]. This modelling technique can improve spatial maps by including spatial random effects and hierarchical levels to account for data at different resolutions or scales. These methods typically use Gaussian random fields or spatial smooths to capture spatial dependence within the data. A recent example of this is the 1 km AGB map created with GEDI data using a generalised hierarchical model [52,53].

In terms of parameter estimation for these hierarchical models, there are different possibilities [50,51,54,55]. Hierarchical models can be estimated as generalised additive mixed models, implemented in the mgcv package [51,56], using a frequentist or empirical Bayes approach. A Bayesian approach to estimation is also possible using Markov chain Monte Carlo (MCMC); see for example [57]. The integrated nested Laplace approximation (INLA) approach proposed by Rue et al. [58] is a computationally efficient alternative to MCMC. Combined with the stochastic partial differential equation (SPDE) approach [59], one can accommodate all kinds of geographically referenced data, including areal and geostatistical data, as well as spatial point process data. The R package R-INLA implements these models [60,61]. R-INLA allows the implementation of spatial Generalised Linear Models (GLMs) and spatial Generalised Linear Mixed Models (GLMMs) alongside a further range of spatial models. These have already been shown to be effective in the environmetrics field. Lindgren et al. [62] highlight various examples of this, including air pollution models [63].

Machine learning methods have additionally been introduced as a way to improve estimates of AGB. These models have great potential to find patterns in large quantities of data and have commonly been used in preference to geostatistics. However, as discussed by Heuvelink and Webster [64], machine learning methods do not take spatial patterns into account when modelling, which can lead to overfitting and underestimation of error. Spatial versions of machine learning methodology may offer a solution to this, and work is being conducted in this area [65,66].
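To make the idea of a spatial random effect concrete, a generic hierarchical model with a Gaussian random field can be written as below. This is a textbook-style sketch rather than the specification used by any of the cited maps; the symbols (y for AGB at location s, x for remote sensing predictors, w for the spatial random effect) and the exponential covariance function are assumptions chosen for illustration.

```latex
\begin{aligned}
y(s_i) &= \mathbf{x}(s_i)^{\top}\boldsymbol{\beta} + w(s_i) + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \tau^{2}), \\
w(\cdot) &\sim \mathcal{GP}\bigl(0,\; C_{\theta}(\cdot,\cdot)\bigr),
  \qquad C_{\theta}(s, s') = \sigma^{2} \exp\!\left(-\frac{\lVert s - s' \rVert}{\phi}\right)
\end{aligned}
```

Here the covariance between two locations decays with their separation at a rate set by the range parameter φ, so nearby pixels share information through w while distant pixels are nearly independent. Setting σ² = 0 removes the random field and recovers an ordinary linear model with independent errors.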

5. Data Combination

5.1. Why Use Combinations of Data Sources?

It is difficult to accurately map AGB for all vegetation types and regions, at the spatial and temporal resolution needed, with a single remote sensor [17]. There are great benefits to be gained from combining various remote sensing sources. Each data type has its own advantages and drawbacks, and so combining different data sources allows the strengths of one to compensate for the weaknesses of another.

Various combinations of optical, SAR, and lidar data-sets are frequently used to produce global and local scale maps. Additionally, as an alternative to remote sensing, data sources such as digital elevation maps (DEMs), land use maps, climate variables and vegetation type are also associated with AGB and can be incorporated into biomass prediction models.

Combination methods can also be applied to a single data source. There can be many temporal repetitions of a single data source in a short space of time, which can be used to create better estimates. Combination methods can also be applied to several existing maps to produce a single map which encompasses information from all of them.

5.2. How Are Global Data Sources Currently Combined?

The various data sources used to create global maps are shown in Table 1. Multiple data sources are often used to create AGB maps. Commonly, small quantities of field data or airborne lidar data are used as ‘ground truth’ to understand relationships between more widely available data and the AGB values they represent, which allows models to be created. Duncanson et al. [41] produced maps in this way: parametric prediction models are applied to spaceborne lidar height metrics. Meanwhile, airborne laser scanning and field data are used as a ‘ground truth’ data-set to create the parametric models. Similarly, Rodriguez-Veiga et al. [48] applies models to L-band SAR data, whilst using spaceborne and airborne lidar data as ‘ground truth’ data-sets to build models.

Of more interest is the ability to combine multiple remote sensing data sources to provide more information and more accurate AGB estimates. Using complementary sources in combination has been shown to improve prediction of AGB [67,68]. This is because different data sources are sensitive to different information. For example, SAR data sources are generally sensitive to the density of vegetation, whilst optical imagery is sensitive to the type of vegetation or land cover. It is clear that complementary data-sets can be useful; however, it is still unclear how they should be combined most effectively.

A number of global AGB maps have begun to combine complementary data sources. The Santoro et al. [39] map and its following revisions have combined L-band and C-band SAR as input variables [44]. The methodology used here is to create two models, and in turn predictions, for each type of SAR data. These predictions are weighted, based on their sensitivity to biomass volume, and then combined [39].

Saatchi et al. [42] introduce a map which combines various data sources including optical indices, digital elevation data, and microwave remote sensing data. This map takes all layers as inputs to a maximum entropy model, which uses a sequential update machine learning algorithm to update the weights given to each data source for estimation.

Avitabile et al. [47] introduced a pan-tropical map which used two previously created AGB maps as input data sources [42,45]. The combined map was produced using a weighted average of the two input maps. It had improved error rates in comparison to both input maps and shows the advantages of ensemble methods, which combine various modelling techniques as well as data sources.
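A weighted combination of two pixel estimates can be sketched with inverse-variance weighting, in which the less uncertain estimate receives the larger weight. This is a generic stand-in and not the weighting scheme of the maps discussed above, which derive their weights in their own ways; the function name and the example values are hypothetical.

```python
def combine_estimates(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted average of two AGB estimates for one pixel.

    The returned variance is only valid if the two errors are independent,
    which is an assumption of this sketch.
    """
    weight_a = 1.0 / var_a
    weight_b = 1.0 / var_b
    combined = (weight_a * est_a + weight_b * est_b) / (weight_a + weight_b)
    combined_var = 1.0 / (weight_a + weight_b)
    return combined, combined_var


# Hypothetical pixel: map A says 100 Mg/ha (sd 20), map B says 140 Mg/ha (sd 10)
agb_combined, var_combined = combine_estimates(100.0, 20.0 ** 2, 140.0, 10.0 ** 2)
```

In this example the combined estimate is 132 Mg/ha with a variance of 80, lower than either input variance. If the two maps share training data or sensors their errors will be correlated, and this variance would be too optimistic.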

Frequentist hierarchical modelling techniques have also begun to be used on global scale data sets. A study using North American field data combined wall-to-wall remote sensing imagery with sparser discontinuous lidar data by using a generalised hierarchical model. This involved creating a model using the sparser more accurate data-set and then using this model’s predictions to fit a further model with the spatially continuous data-set [52].

5.3. Problems Faced When Combining Data

One difficulty faced when combining data-sets for prediction of AGB is spatial misalignment. This is when different spatial layers are acquired at different spatial scales [57]. For example, the GlobBiomass model combines data from C-band and L-band SAR, which are available at 150 m and 100 m resolutions, respectively. This can typically be dealt with using some form of spatial interpolation, through geostatistical methods or machine learning, to create wall-to-wall data-sets of the same resolution.
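As a simple illustration of bringing 150 m and 100 m layers onto a shared grid, the sketch below aggregates both to their common multiple of 300 m by block averaging. This is plain aggregation rather than the geostatistical or machine learning interpolation mentioned above, and it assumes the two grids share an origin and are perfectly aligned; the function name and values are hypothetical.

```python
def block_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    out = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            cells = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


# Hypothetical layers covering the same 300 m x 300 m footprint
layer_150m = [[10.0, 20.0], [30.0, 40.0]]                         # 2 x 2 cells of 150 m
layer_100m = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]  # 3 x 3 cells of 100 m

coarse_a = block_mean(layer_150m, 2)  # one 300 m cell
coarse_b = block_mean(layer_100m, 3)  # one 300 m cell
print(coarse_a, coarse_b)
```

Aggregating to the coarser common grid avoids inventing fine-scale detail, at the cost of discarding resolution from both layers.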

This problem is particularly apparent when using lidar and field data, both of which generally have sparse sample points and spatially discontinuous coverage. This can be seen in Figure 1a. However, since lidar and field sample points often produce highly accurate information in these small areas, incorporating this information into maps can improve estimates.

Satellite data are also prone to temporal misalignment, which occurs when the satellite imagery is acquired at a different time from its validation data. Since field data are measured far less regularly than satellite data are acquired, this is a common problem. It can cause issues for AGB estimation, particularly if significant AGB change, such as deforestation or forest degradation, has occurred in the period between satellite and validation data collection.

5.4. Methods to Tackle Spatial Misalignment

Statistical methods can be useful to tackle these spatial misalignment problems. Atkinson et al. [69] notes that statistical methods can be useful for downscaling from a coarse pixel to a finer spatial resolution, and fusion of images of a certain spatial resolution with other images of a different spatial resolution.

Kriging has been an especially effective methodology for tackling this problem and has been well accepted into the geoscience community. However, there are alternative statistical and machine learning methods which can improve results here. For example, a method proposed by Poggio and Gimona [70] uses generalised additive models (GAMs) in conjunction with kriging to downscale climate information. This method is more capable of handling non-linear relationships and showed improvements in the reproduction of the spatial pattern with a reduced bias in estimates, in comparison to standard kriging methods.

Machine learning methods are also capable of effectively interpolating maps and downscaling or upscaling remote sensing images. However, machine learning methods often do not account for spatial correlation explicitly and so more spatially aware methods are needed to make sure interpolations realistically represent spatial patterns [64].

5.5. Models to Improve Data Combination

As discussed by Gelfand [54], in the spatial setting there is often the need to combine spatially misaligned data sources. For AGB measurement this would usually consist of sparse field or lidar measurements combined with widely available satellite imagery such as optical or SAR.

Whilst generalised hierarchical models have been used to tackle this problem, none of the global maps presented here used Bayesian methodology to implement them. Bayesian hierarchical models provide a cohesive framework for combining complex data models and external knowledge or expert opinion [55]. Bayesian models treat the unknown parameters as random variables, that is, random effects, rather than fixed values. This modelling allows spatial correlation to be induced. The customary requirement for independent data values can also be relaxed, which avoids the violations of model assumptions that are common in spatial modelling.

A study conducted by Babcock et al. [71] highlighted the potential for Bayesian geostatistical methods in combining remote sensing data sources for AGB prediction. However, further work is needed to implement these effectively and on global scales.

Rayner et al. [72] introduces a prototype to estimate global daily air temperatures. It uses a large number of satellite observations, together with near-surface air temperature measurements, to produce temperature estimates and create a more complete record of air temperature. Bayesian spatio-temporal models such as these may be useful to improve data combination methods used for AGB estimation. When combined with methods for fast computations for hierarchical statistical models [73], these models can handle multiple scales as well as non-stationarity [59,74].

6. Model Validation

6.1. How Are Global Maps Currently Assessed?

It is clear that in recent years vast improvements have been made to model assessment within the AGB estimation field. Lu [19], a previous review of AGB modelling from 2006, notes that many AGB estimation studies failed to provide any accuracy assessments due to the difficulty in collecting ground reference data. All global studies considered here provided some form of model accuracy assessment.

Model validation techniques are important in both model selection and model assessment. Researchers would like to use validation methods which can effectively select the best models for AGB estimation and assess the success of a final chosen model. For this we measure the accuracy of the predictions obtained when the method is applied to previously unseen test data [75]. A commonly used measure of prediction error is the mean squared prediction error, often just called the mean squared error (MSE). This is the squared difference between the true value and the prediction, averaged over all predictions. The MSE is often expressed as its square root, the root mean squared error (RMSE), which is in the same units as AGB. Alternatively, the mean absolute error averages the absolute value of the difference. A low MSE indicates that AGB predictions are close to the reference values. Valbuena et al. [76] noted in 2017 the need for AGB maps to be reported with some form of error values.
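The three error measures described above can be written out directly. The following is a minimal sketch using invented reference and predicted AGB values; the function names are chosen here for illustration only.

```python
import math


def mse(observed, predicted):
    """Mean squared error: average of squared differences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    return math.sqrt(mse(observed, predicted))


def mae(observed, predicted):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical reference and predicted AGB values (Mg/ha) at four test plots
obs = [100.0, 150.0, 200.0, 250.0]
pred = [110.0, 140.0, 230.0, 250.0]
print(mse(obs, pred), rmse(obs, pred), mae(obs, pred))
```

Because the errors are squared, the single 30 Mg/ha miss dominates the MSE and RMSE, whereas the mean absolute error weights all misses linearly.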

MSE can be estimated by simply retaining a separate data-set with ‘ground truth’ values, known as the validation set, and applying the model to it. The MSE achieved for this validation set gives an indication of model performance. Cross validation is an alternative method for estimating MSE which makes full use of the available data rather than setting aside a validation data-set. This method splits the data into random partitions. Each partition is used in turn as a validation set for a model fitted to the remaining partitions. The resulting error rates are then averaged over all partitions [75,77].
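The partition-and-average procedure can be sketched in a few lines. The example below assumes a one-predictor linear model fitted by ordinary least squares as a stand-in for whichever AGB model is being assessed; the data are simulated and all names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b


def kfold_mse(x, y, k=5, seed=0):
    """Estimate MSE by k-fold cross validation with random partitions."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal random partitions
    fold_mse = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        errs = [(y[i] - (a + b * x[i])) ** 2 for i in test_idx]
        fold_mse.append(sum(errs) / len(errs))
    return sum(fold_mse) / k  # average the error over all partitions


# Simulated data: a hypothetical linear signal plus Gaussian noise
rng = random.Random(1)
x = [float(i) for i in range(40)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 5.0) for xi in x]
cv_mse = kfold_mse(x, y, k=5)
print(cv_mse)
```

Note that the partitions here are random, so this sketch inherits the independence assumption that makes it unsuitable for spatially correlated data.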

For the global models investigated here RMSE was the most commonly used form of model validation. This was used for the maps produced by Santoro et al. [39], Simard et al. [43] and Santoro [44]. Cross validation is also used in AGB estimation studies including Saatchi et al. [42], Ploton et al. [78] and Roberts et al. [79]. Cross validation can be used to estimate various statistics including RMSE or the coefficient of determination (R²) [78,80].

6.2. Problems with These Validation Methods?

In 2006, Lu [19] discussed the lack of sufficient AGB data to provide model validation. Since then there have been significant advances, and all maps shown in Table 2 provide some form of validation. However, there are still areas to improve on within model assessment of AGB estimates. The validation techniques discussed above are not adequate within the context of spatial and temporal modelling.

As previously discussed, AGB global maps are produced using models which are based on smaller amounts of training data, available only in certain locations. When models are applied to this data it is possible for overfitting to occur. Overfitting is when the statistical learning model follows patterns in the training data too closely, and may be picking up some patterns that are just caused by random chance rather than by true relationships between satellite data and AGB [75,77]. When a given method yields a small training error but a large test error, it is said to be overfitting the data. This has been shown to be a problem for global AGB maps [78]. Model transferability is an important point of AGB estimation [19]. A model is considered transferable if it is created and then can be successfully applied to unseen spatial locations. To allow this transferability it is important to prevent overfitting. To prevent such overfitting good validation techniques are needed. Effective validation techniques can identify when overfitting is occurring.

Ploton et al. [78] show in a large scale study of African forests, that the non-spatial validation methods which are widely used within AGB mapping vastly under-estimate prediction error. The study showed that for a RF model non-spatial cross validation suggested the model predicts over half of the forest biomass variation, whilst a spatial version of cross validation suggested almost zero predictive power [78].

General techniques which do not account for structural dependencies, such as spatial dependence, are more prone to overfitting and underestimating prediction errors. One cause of the poor performance of general cross-validation is dependence structures in the data that persist as dependence structures in model residuals, violating the assumption of independence made in general cross-validation [79]. This spatial dependence, where neighbouring measurements show similar values, is fairly common in AGB mapping. It is especially important for validation to take the structure of the data into account if the model has not done so, for example with machine learning techniques such as random forests and neural networks. This is because these methods do not account for spatial correlation and are hence more prone to overfitting.

Duncanson et al. [81] also states the importance of a global consensus on AGB map validation, whilst Araza et al. [82] notes the need for a global reference dataset for consistent global accuracy and uncertainty assessments. This will be important going forward as maps are used in collaboration and to direct global policy. In addition to problems with the way cross-validation is performed, for example by ignoring spatial correlation in the response, there are also problems with focussing only on the prediction error. Statistics like the mean squared error and R², estimated by cross-validation, only quantify the prediction error relative to the mean of the predictive distribution and hence ignore the uncertainty of the prediction. Gelman et al. [80] therefore recommends also considering the log score and the log likelihood. In particular, the log score is also useful for non-normal distributions (e.g., in logistic regression) and when the focus is on the accuracy of the whole predictive distribution rather than just point estimates.
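To see why a score on the whole predictive distribution adds information, consider a minimal sketch that assumes Gaussian predictive distributions. Two hypothetical models give identical point predictions, and therefore identical MSE, but state different predictive standard deviations; all values are invented.

```python
import math


def gaussian_log_score(y, mean, sd):
    """Log predictive density of observation y under a Normal(mean, sd) prediction."""
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - (y - mean) ** 2 / (2.0 * sd ** 2)


observed = [100.0, 150.0, 200.0]
pred_mean = [110.0, 140.0, 210.0]  # every prediction misses by 10 Mg/ha

# Same point predictions, different stated uncertainty (sd of 10 versus sd of 2)
score_calibrated = sum(gaussian_log_score(y, m, 10.0) for y, m in zip(observed, pred_mean)) / 3
score_overconfident = sum(gaussian_log_score(y, m, 2.0) for y, m in zip(observed, pred_mean)) / 3
print(score_calibrated, score_overconfident)
```

The model that claims a standard deviation of 2 while missing by 10 receives a much lower (worse) average log score, a difference that MSE and R² cannot reveal.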

6.3. Alternative Validation Methods

Various implementations of more spatially aware validation methods are available. A further variation on cross validation is block cross validation, where the data-set is separated into blocks which are similar. This forces the model to be tested on data which are distant from, and hence independent of, the training data used [79]. Spatial forms of block cross validation include Leave-Location-Out (LLO) cross validation, which trains the model on all locations but one (or a group of locations) and then tests the model on the held-out location. A further example of this is the R package ‘sperrorest’, which allows user-defined spatial sampling when applying cross validation or bootstrap methods [65].
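A spatial block split can be sketched as follows. This is a simplified illustration that assumes square blocks on a regular grid and holds out one block at a time; it is not a reproduction of the ‘sperrorest’ interface, and the function names, block size and coordinates are hypothetical.

```python
import math


def spatial_block_ids(coords, block_size):
    """Label each (x, y) point with the square block it falls in."""
    return [(math.floor(x / block_size), math.floor(y / block_size)) for x, y in coords]


def leave_block_out_splits(coords, block_size):
    """Yield (held_out_block, train_indices, test_indices), one block held out at a time."""
    ids = spatial_block_ids(coords, block_size)
    for block in sorted(set(ids)):
        test_idx = [i for i, b in enumerate(ids) if b == block]
        train_idx = [i for i, b in enumerate(ids) if b != block]
        yield block, train_idx, test_idx


# Hypothetical plot coordinates in km: two tight clusters far apart
plots = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (51.0, 50.5), (52.0, 51.0)]
splits = list(leave_block_out_splits(plots, block_size=10.0))
for block, train_idx, test_idx in splits:
    print(block, train_idx, test_idx)
```

With random partitions, neighbouring plots from the same cluster would usually appear on both sides of the split; here each cluster is tested only by a model trained on the other one.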

Spatial forms of cross validation have been used in the more recent global maps such as: Rodriguez-Veiga and Balzter [40], Duncanson et al. [41]. Since this method is more effective than general cross-validation and RMSE it should be more widely used across global maps.

Alternatively, Wadoux et al. [83] suggests that a more effective way to validate global maps is to use probability sampling and design-based inference, producing unbiased estimates of map accuracy. An example of this is using random sampling to select the points at which map accuracy is assessed.
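A minimal sketch of design-based accuracy estimation under simple random sampling is given below. It assumes that reference values could be obtained at any sampled location, and it simulates the full set of map errors purely so that a sample can be drawn; the function name, sample size and error distribution are hypothetical.

```python
import math
import random


def srs_accuracy(map_errors, n, seed=0):
    """Estimate the mean squared map error from a simple random sample of n locations.

    map_errors stands in for (map value - reference value) at every candidate
    location; in practice only the n sampled locations would be visited.
    """
    sample = random.Random(seed).sample(map_errors, n)
    sq = [e ** 2 for e in sample]
    mean_sq = sum(sq) / n
    var_sq = sum((s - mean_sq) ** 2 for s in sq) / (n - 1)
    se = math.sqrt(var_sq / n)  # standard error, ignoring the finite population correction
    return mean_sq, se


# Simulated population of map errors (Mg/ha), for illustration only
rng = random.Random(42)
population = [rng.gauss(0.0, 30.0) for _ in range(10_000)]
est_mse, est_se = srs_accuracy(population, n=200)
print(est_mse, est_se)
```

Because the sample locations are chosen at random, the estimate is unbiased for the map as a whole regardless of how the map was modelled, and the standard error follows from the sampling design alone.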

In the AGB estimation case, models are trained and validated using very sparse validation data. This means it is likely that models will estimate AGB in completely unseen locations. For this reason it is especially important that models are not overfitting and that models are highly transferable. Forms of validation which can improve this are very useful in this situation. However, Meyer and Pebesma [84] notes that, when sparse validation points are available, maps are often extrapolating beyond the geographic locations of training data. For example, when data are strongly concentrated in areas of intense research. This often can occur with AGB field data and can cause predicted values to be meaningless and validation estimates to be inaccurate.

Meyer and Pebesma [84] also suggests using an ‘area of applicability’, a way of assessing the areas in which the training data are similar enough to the data used for wider estimation for the model to be valid. Validation within this area also ensures that validation yields accurate results.

7. Uncertainty Measurements

7.1. The Importance of Uncertainty Measurements

AGB predictions are susceptible to uncertainty at almost every level of the modelling process. Uncertainty can be introduced from location and tree measurement error, sampling error, plot to map temporal mismatch, and errors from allometric models [82]. These uncertainties are quite often ignored entirely.

However, it is important for the utilization of AGB predictions that such sources of error are propagated and final uncertainty estimates are provided with AGB maps. Additionally, objective and consistent methods to estimate the accuracy and uncertainty of AGB maps across different research communities and global maps are needed [82].

In order to monitor progress in achieving emissions reductions uncertainty measurements are needed at an accurate enough scale to verify mitigation actions which have been implemented [85]. Without these uncertainties it is difficult to verify if mitigation has been successful; this hinders progress in preventing forest carbon emissions.

Similarly, it is also necessary to understand uncertainties in estimation to detect changes in biomass. When estimating biomass for two distinct time points, without uncertainties it is difficult to know if different biomass estimates indicate a real change or are simply expected due to differences in input data. Furthermore direct measurement of the uncertainty of change is also useful for policy decision making.

Uncertainty can be provided on various scales depending on the uses of the AGB maps produced. For example, uncertainty can be provided on a per pixel basis, over a large region, or for changes which have occurred. These uncertainties may be completely different, since aggregated predictions over a region, e.g., mean AGB of a region, are likely to be more certain than estimates for individual locations, i.e., on a pixel by pixel basis. If policies are made based on these maps it is important to understand the scale at which these uncertainties are needed. For monitoring, maps are typically required to check for changes and areas of interest, hence the uncertainty by location is of importance.

7.2. How Is Uncertainty of Global Maps Currently Presented?

There are multiple uncertainty metrics for biomass estimates, which include the relative and absolute systematic deviation and confidence interval or RMSE for the overall biomass estimate [12].

The majority of AGB maps currently do not provide uncertainty maps, or do not provide detailed uncertainty assessments. However, uncertainty maps are provided for the JPL benchmark map and the GEDI gridded AGB map [42,53]. Uncertainty is also provided at national scale for the Baccini et al. [45] tropical map.

In the current literature, uncertainty estimates are often not properly propagated via a (Bayesian) hierarchical model and instead are simply added up, assuming that different error components are independent of each other. This will be incorrect in many instances. For example, the uncertainty maps for the Saatchi et al. [42] AGB map are produced by totalling different sources of error. This map combines error from measurement, allometric models, sampling, and prediction. These uncertainties are given on pixel, national and regional scales.
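The consequence of the independence assumption can be made explicit with the general variance formula for a sum. The notation is introduced here for illustration: ε_1, …, ε_K denote the K error components (for example measurement, allometric, sampling and prediction error).

```latex
\operatorname{Var}\!\left(\sum_{k=1}^{K}\varepsilon_k\right)
  = \sum_{k=1}^{K}\operatorname{Var}(\varepsilon_k)
  + 2\sum_{k<l}\operatorname{Cov}(\varepsilon_k,\varepsilon_l)
```

Summing the component variances alone amounts to dropping the covariance term. If the components are positively correlated, the total uncertainty is then underestimated.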

Uncertainty for Baccini et al. [45] is produced by totalling estimates of error from both allometric models and the random forest model used for prediction. This gives an uncertainty value on a national level.

For uncertainty estimates of GEDI hybrid estimator, a variance estimator is used which includes terms for sampling error and uncertainty of parameter estimates [52,53].

7.3. Problems Faced When Providing Uncertainty Estimates

AGB maps are susceptible to uncertainties from many sources. It has also been observed that many contributing errors are often ignored when AGB uncertainty is calculated. Saarela et al. [86] concluded that the common practice of ignoring error occurring in the field-to-lidar step can lead to underestimation of variance. Methods which allow for many sources of error are needed for AGB map uncertainty estimates.

Methods, such as hierarchical models, capable of propagating uncertainty from multiple input data sources and realistic modeling of uncertainty due to spatial variability have seen only very limited use in the Earth sciences [72].

When mean AGB is estimated or predicted (at a location/pixel) using model-based methods, these estimates come with variance estimates, which feed into confidence or prediction intervals. In the case of machine learning methods, which are generally algorithmic, variance estimates are typically not available unless some form of resampling or simulation is used. Prediction error, or mean squared error, is the expected squared deviation of a random variable from its estimate. It estimates how far away a prediction at an unseen location would be from the actual value. It can be decomposed into contributions from the bias (systematic error) and variance (random error) and can be estimated for both model-based and machine learning methods; hence it can be used to compare the predictive performance of both kinds of method via some form of validation. Both variance and prediction error (RMSE) are relevant. Variance estimates are model dependent and can be estimated from the model or by simulation (see [87] for an example using geostatistics).
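The decomposition into bias and variance can be checked numerically. The sketch below assumes the simplest setting, repeated estimates of a single fixed true value, and simulates a hypothetical estimator with a known systematic error; all values are invented.

```python
import random


def decompose(estimates, truth):
    """Split the mean squared error of repeated estimates into bias^2 and variance."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - truth
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return mse, bias ** 2, variance


rng = random.Random(0)
true_agb = 200.0
# Hypothetical estimator with a +15 Mg/ha systematic error and random error of sd 20
estimates = [true_agb + 15.0 + rng.gauss(0.0, 20.0) for _ in range(5000)]
mse, bias_sq, variance = decompose(estimates, true_agb)
print(mse, bias_sq + variance)
```

The two printed values agree up to floating point rounding, which shows that a model with low variance can still have a large prediction error if it is biased.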

Meanwhile, a further problem faced in uncertainty estimates is spatial mismatch. This is inherent in modeling AGB on a large scale based on smaller-footprint measurements and makes uncertainty estimates difficult [53].

Further causes of incorrect uncertainty estimates may be attributed to statistical assumptions being broken. For example, Patterson et al. [53] assumes that over 1 km mean residual errors will tend to zero. However, in areas of high spatial correlation this is unlikely to be true.
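The effect of spatial correlation on aggregated residuals can be illustrated with a minimal sketch. It assumes an exponential correlation function between pixel residuals, which is a modelling choice made here for illustration and not the error model of Patterson et al. [53]; the block size, standard deviation and correlation range are hypothetical.

```python
import math


def variance_of_mean(n_side, pixel_sd, corr_range):
    """Variance of the mean residual over an n_side x n_side block of pixels.

    Correlation between two pixel residuals is exp(-distance / corr_range),
    with distance in pixel units; corr_range = 0 means independent residuals.
    """
    cells = [(i, j) for i in range(n_side) for j in range(n_side)]
    total = 0.0
    for i1, j1 in cells:
        for i2, j2 in cells:
            d = math.hypot(i1 - i2, j1 - j2)
            if d == 0.0:
                rho = 1.0
            elif corr_range == 0:
                rho = 0.0
            else:
                rho = math.exp(-d / corr_range)
            total += rho
    return pixel_sd ** 2 * total / len(cells) ** 2


var_independent = variance_of_mean(10, pixel_sd=50.0, corr_range=0)
var_correlated = variance_of_mean(10, pixel_sd=50.0, corr_range=5.0)
print(var_independent, var_correlated)
```

With independent residuals the variance of the block mean shrinks in proportion to the number of pixels, whereas with correlated residuals it shrinks far more slowly, so the mean residual cannot be assumed to vanish.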

A lack of appropriate uncertainty estimates can lead to incorrect interpretations of maps and their meaning. Furthermore, the communication of uncertainty is difficult. Unclear communication of uncertainty can lead to over reliance on the produced maps. When uncertainty measures are provided with AGB maps, they are generally provided as an additional uncertainty map. It can be difficult to interpret these two maps together and to visualize the amount of uncertainty.

7.4. Alternative Uncertainty Measurement Methods

Statistical methods provide a comprehensive framework for uncertainty propagation when modelling which can handle error from multiple data sources. One method to propagate error from multiple sources is to use hierarchical modelling. Other modelling techniques, such as many machine learning methods, may not have methods for direct uncertainty estimates to be produced.

The hierarchical models described in Section 4.3 have been more commonly used within the AGB estimation field [52,53]. They have considerable advantages in comparison to simple regression models. One of these advantages is the ability to propagate uncertainty. By recognizing the uncertainty in the model unknowns, uncertainty is properly propagated to inference arising from the model [54]. This allows for better uncertainty interpretation [55,57].

Bayesian inference methods can also be used to improve propagation of uncertainty. Bayesian inference allows us to represent uncertainty in the process by drawing samples from the posterior distributions of the model components. These methods allow uncertainty from many data sources to be incorporated into a final uncertainty estimate.

Rayner et al. [72] shows an example of using these advanced statistical methods to combine satellite data and field observations of air temperature. This has allowed for effective global air temperature estimates alongside their uncertainties. Further examples of using Bayesian methods to estimate air pollution levels using satellite data and other data sources are also available [63,71,88,89].

Simulation methods, such as sequential Gaussian cosimulation (SGCS), can also be used to provide uncertainty quantification, and have been applied to various remote sensing applications [17,87]. A range of studies have used simulation methods to model the spatial uncertainty of vegetation cover using remote sensing data. One method which can be used is sequential indicator simulation (SIS), which allows multiple maps to be produced that honour the available data [90]. For example, De Bruin [90] uses SIS to generate a set of equally probable maps, which can then be used to estimate uncertainty.

Whilst many machine learning algorithms do not directly provide uncertainty estimates there is growing demand for algorithms to do so. An example of this is Quantile Regression forests which build upon random forest methodology. They give a non-parametric and accurate way of estimating conditional quantiles for high-dimensional predictor variables [91]. These have been used to provide effective uncertainty estimates in the soil sciences [64,92].

Additionally, various statistical methods can be applied to uncertainty visualization. The R package Vizumap [93] offers a good suite of methods for this, including bivariate choropleth maps, pixel maps, glyph maps, and exceedance probability maps. Of these, the pixel map is a good method for showing the uncertainty of AGB estimates: areas of greater uncertainty appear pixelated while areas of less uncertainty appear smooth, as though filled by only a single colour. A similar method has been introduced by Taylor et al. [94]. A further method to visualize uncertainty is to use contour maps. Contour maps have been widely used in environmental sciences; however, they are rarely used to explain uncertainty. These maps use many contours when uncertainty of the estimated surface is low and fewer contours if uncertainty is high [95].

8. Change Detection and Quantification

8.1. The Importance of Change Detection

Commonly, global AGB maps are produced as one-time products; when repeat maps are produced, they are often separated by temporal gaps of over five years. Whilst there is value in and need for high-quality AGB stocks data, many applications require biomass change information [12]. For AGB maps to be useful as an ECV, they need to be provided annually, or at least every 5 years, to capture the most important changes [12]. Since there is an increasing availability of suitable satellites with recurring imagery, this is likely to be an important goal of future research. Quantification of global forest change has been lacking. Spatially and temporally detailed information on a global scale does not yet exist [4].

8.2. How Is Global Biomass Change Currently Detected?

There are two main strategies behind biomass change detection. The first is to produce multiple ‘one-time’ AGB maps over a set period from remote sensing data-sets. From these maps, biomass change can be detected by calculating the differences between maps.

The second strategy is to directly detect deforestation from remote sensing data. This is done by detecting departures from normal that may indicate deforestation or degradation. This detection may be done by classifying forested and deforested areas, or it may quantify the degradation that has occurred. Though both are useful metrics, for carbon emissions a quantified AGB change map can allow carbon losses to be calculated.

The number of global or continental biomass change maps is extremely low, as this is a very current area of research. However, strong efforts have been made towards producing biomass change maps on a global scale [4,44,46].

The Santoro [44] global AGB change map builds on the single time point AGB maps which have been produced for 2010, 2017 and 2018 [44]. The differences between these maps have been taken to provide a full global coverage of biomass change. However, as these original maps have high uncertainty levels of around 40–60% of the estimated value, the change maps also have considerable uncertainty [44].
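A minimal sketch shows how single-map uncertainty carries through to a difference of two maps. It assumes that each map's uncertainty can be summarised by a standard deviation and that the correlation between the two maps' errors is known; the function name and pixel values are hypothetical, with a relative uncertainty of 50% chosen from within the range quoted above.

```python
import math


def change_with_uncertainty(agb_t1, sd_t1, agb_t2, sd_t2, rho=0.0):
    """Difference of two AGB estimates and its standard deviation.

    rho is the assumed correlation between the two maps' errors.
    """
    change = agb_t2 - agb_t1
    sd_change = math.sqrt(sd_t1 ** 2 + sd_t2 ** 2 - 2.0 * rho * sd_t1 * sd_t2)
    return change, sd_change


# Hypothetical pixel: 200 -> 170 Mg/ha, each estimate with 50% relative uncertainty
change, sd_change = change_with_uncertainty(200.0, 100.0, 170.0, 85.0)
print(change, sd_change)
```

With independent errors, the standard deviation of the change is several times larger than the estimated loss of 30 Mg/ha, so the change cannot be distinguished from no change at this pixel. Positively correlated errors between the two maps would reduce the uncertainty of the difference.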

In the tropical regions, Baccini et al. [46] has produced Above-Ground Carbon (AGC) change maps. Similarly, change is detected using annual single time point AGC maps and finding differences between these maps. Single maps were produced from MODIS annual mosaics for each year between 2003 and 2014 [45]. For each pixel a 12-year time series was created. Time series change point analysis was then used to detect significant changes on a pixel level. This produced an estimated difference alongside a standard error which indicated the uncertainty of change.
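The idea of change point analysis on a per-pixel series can be illustrated with a simple least-squares search for a single shift in the mean. This is a generic sketch and not necessarily the change point method applied by Baccini et al. [46]; it includes no significance test, and the series is invented.

```python
def best_single_changepoint(series):
    """Return (index, sse) of the split that minimises within-segment squared error.

    index is the first position of the second segment.
    """
    def sse(segment):
        m = sum(segment) / len(segment)
        return sum((v - m) ** 2 for v in segment)

    best_idx, best_sse = None, float("inf")
    for k in range(1, len(series)):
        total = sse(series[:k]) + sse(series[k:])
        if total < best_sse:
            best_idx, best_sse = k, total
    return best_idx, best_sse


# Hypothetical 12-year carbon density series for one pixel, with a drop at index 7
pixel_series = [100, 101, 99, 100, 102, 98, 100, 60, 61, 59, 60, 62]
idx, sse = best_single_changepoint(pixel_series)
print(idx, sse)
```

The difference between the two segment means then gives an estimate of the change, and the residual scatter within the segments indicates how uncertain that estimate is.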

8.3. Problems Faced When Detecting AGB Change

One of the main problems faced in detecting AGB change is that there still exist high uncertainties in maps produced for single time points. As shown by Santoro [44], these large uncertainties make it difficult to be certain of changes and the quantities of change. To produce effective maps of change it is first necessary to decrease uncertainty of individual maps. Future developments producing a larger number of global scale maps using various sources of data should make conclusions more certain.

Alternatively, for strategies which directly detect change in biomass using change detection methods, problems may be caused by a lack of validation data. Collecting reliable temporal field-based data-sets is difficult [96]. Validation of AGB change requires data-sets which clearly indicate areas which have experienced changes and those that have remained stable. Without these regular data-sets such methodology is difficult. Hansen et al. [97] introduced the idea of comparing each time series of data (of a pixel) with the mean reference time series of the data. This involves transforming the time series of each pixel into measures of dissimilarity to the mean time series. The method hence enables change detection without reliable reference data and results are promising.

Remote sensing data are also susceptible to detecting changes in factors entirely unrelated to changes in biomass. Atmospheric conditions, seasonal changes, and sun angle all naturally contribute towards differences in remote sensing data [96]. The ability to separate these factors from real change in biomass levels is a problem which requires consideration when producing change maps.

8.4. How Can Change Detection Be Improved?

Due to cloud cover or satellite availability it is difficult to find repeat imagery while maintaining weather and seasonal conditions [98], which makes it more difficult to detect real change. To minimise the effect of seasonality and other confounders it is important to use methods which acknowledge these effects. This can be done using general time-series methodology. Hostert et al. [98] notes that phenology driven change detection can greatly benefit from the use of time series methods.
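One common time-series device for separating seasonality from trend is harmonic regression, sketched below as a least-squares fit of an intercept, a linear trend and one annual sine and cosine pair. This is an illustration chosen here and not a method attributed to Hostert et al. [98]; the signal is simulated without noise, and the function names and period are hypothetical.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_trend_and_season(t, y, period=12.0):
    """Least-squares fit of y = b0 + b1*t + b2*sin(2*pi*t/period) + b3*cos(2*pi*t/period)."""
    w = 2.0 * math.pi / period
    rows = [[1.0, ti, math.sin(w * ti), math.cos(w * ti)] for ti in t]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    return solve(xtx, xty)  # normal equations


# Hypothetical monthly signal: a decline of 0.5 per month plus a seasonal cycle
t = [float(i) for i in range(48)]
y = [80.0 - 0.5 * ti + 6.0 * math.sin(2.0 * math.pi * ti / 12.0) for ti in t]
b0, b1, b2, b3 = fit_trend_and_season(t, y)
print(b0, b1, b2, b3)
```

The fitted trend coefficient recovers the underlying decline even though month-to-month differences in the raw signal are dominated by the seasonal cycle.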

In the literature studied here there are no examples of this for global AGB change detection. However, time series analysis in other fields often uses models which incorporate seasonality, and there are some examples of time-series methods applied to deforestation detection. Zhao et al. [99] uses a Bayesian ensemble algorithm to detect change points, trend, and seasonality within satellite data. Similar methodology has been used to detect forest change on small scales using optical imagery [100,101]. To create accurate global AGB change detection, time series methodology could be applied on a larger scale, to some form of AGB measure which depends on multiple data sources.

Reiche et al. [102] has also implemented a proof of concept for deforestation detection based on both SAR and optical imagery. This used a Bayesian multi-source time series algorithm. This type of algorithm could be useful when applied on a global scale to AGB change detection, as AGB predictions are often improved when a combination of data sources are used.

Bayesian space-time models have also been used which can be adjusted for seasonality and confounders. Bayesian spatio-temporal models have been used to detect changes in air pollution during COVID-19, using these models to adjust for seasonality and contextual factors [60,62,63,103].

9. Discussion of Future Research and Potential Solutions

There is a wealth of remote sensing data sources available which are sensitive to AGB, indicated in Table 1. This availability of data means there are many opportunities to produce effective maps. Future data sources, such as the upcoming BIOMASS satellite [32], should also improve the accuracy of AGB estimations.

Above-ground biomass (AGB) maps are valuable data sources which can indicate the volume of carbon held in forests. This is especially useful in making policymakers aware of forest areas which need support or protection, and of changes occurring in forest carbon levels [3,7,17]. Research is needed to ensure that these data are used as effectively as possible and to ensure maps are reliable enough to be used within policy. Statistical methods may offer solutions to many problems faced in producing effective AGB maps.

This study focused on assessing global or continental maps shown in Table 2. Various topics of research stood out as areas which could be positively impacted by statistical knowledge and that would increase the practicality of AGB maps. These topics were: (i) spatial modelling, (ii) data combination, (iii) model and map validation, (iv) uncertainty estimation, and (v) change detection and quantification.

Almost all global models investigated did not account for spatial dependence within their models. Only one example using hierarchical modelling for AGB prediction was found [52]. This is an area which could benefit from the knowledge of spatial statisticians. Various models are available which incorporate a spatial element and do not assume independence of spatial data. These include forms of hierarchical modelling, spatial GLMs and spatial GLMMs which can be used in either a frequentist or Bayesian setting [51,54,55,56,60,61,62]. Models which incorporate a spatial element may be able to more effectively predict AGB using large spatial data sets obtained from satellites.

This study found that, to validate their AGB prediction models, the majority of global maps used an RMSE value, some using general cross validation to estimate this, with only one example of spatial cross validation [41]. Numerous studies have shown that general cross-validation will likely overestimate predictive performance in the presence of spatial auto-correlation, since spatial overfitting may occur [66,78,84]. Considerable work has been done in this area and suggests that using spatial forms of cross-validation or probabilistic sampling may be good options to prevent this overestimation of predictive performance [79,83,84]. However, there is no consensus as to the best method to be used. Further research is needed to understand which validation methods are most effective for AGB global maps. Furthermore, wider use of these spatial validation techniques should be encouraged.

This study found that uncertainty estimation and visualisation is a major area in which statistical methods can benefit the creation of AGB maps. Hierarchical modelling has begun to be used and can improve on current methods for uncertainty estimation. Using a Bayesian framework for inference could also improve uncertainty estimates [58,60,61,62]. Newly developed machine learning methods which can propagate uncertainty estimates are also available [91,92]. Map interpretation could be further improved by using statistical methods to create contour maps or pixelated maps that visualise uncertainty [93,94,95].
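Simulation-based and Bayesian approaches typically deliver a set of draws per pixel rather than a single value. A minimal sketch of turning such draws into a mappable summary is given below. It assumes the draws already exist (for example posterior samples or ensemble member predictions), and the function names and the 95% default are illustrative choices only.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarise_pixel(draws, level=0.95):
    """Median and central interval for one pixel from simulated AGB draws."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    return {
        "median": percentile(ordered, 0.5),
        "lower": percentile(ordered, tail),
        "upper": percentile(ordered, 1 - tail),
    }
```

The interval width (`upper` minus `lower`) is one quantity that could be shown alongside the AGB estimate, for instance through the pixelated or contour displays mentioned above.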

In addition, it was found that there is a severe lack of AGB change maps [4,44,46]. Those considered here also reported high uncertainty in detecting and quantifying change. This is to be expected, given that one-time maps, which have much better validation data, themselves have high uncertainty levels. However, change detection and quantification are very useful for understanding changes in carbon levels and for policy making regarding carbon levels [17]. There are many opportunities to improve the quality of AGB change maps. One would be to focus on decreasing the uncertainty of one-time maps, as previously described, which would in turn improve change estimates. However, the real gains are likely to be made through time series methodology, which may improve change estimates by detecting real change while ignoring effects such as seasonality [98]. Only a small amount of experimental work has been done using time series, so this could be an option for future researchers [97,98,99,100,101,102]. Another avenue is direct change detection. This has historically been difficult because only small amounts of change validation data are available; however, solutions to this can be found [97].
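The reason that differencing two uncertain maps gives an even more uncertain change estimate can be illustrated with a short calculation. The sketch below assumes that each map provides a per-pixel standard error and that the error correlation between the two dates is known or can be guessed. Both are strong assumptions, and the function names and the 1.96 threshold are illustrative only.

```python
import math


def change_standard_error(se_t1, se_t2, correlation=0.0):
    """Standard error of (AGB_t2 - AGB_t1) for a single pixel.

    Var(diff) = se_t1^2 + se_t2^2 - 2 * correlation * se_t1 * se_t2.
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    variance = se_t1 ** 2 + se_t2 ** 2 - 2.0 * correlation * se_t1 * se_t2
    return math.sqrt(max(variance, 0.0))


def change_is_detectable(agb_t1, agb_t2, se_t1, se_t2, correlation=0.0, z=1.96):
    """True if the absolute change exceeds z standard errors of the difference."""
    se_diff = change_standard_error(se_t1, se_t2, correlation)
    return abs(agb_t2 - agb_t1) > z * se_diff
```

With hypothetical standard errors of 30 and 40 Mg/ha and uncorrelated errors, the standard error of the difference is 50 Mg/ha, so a loss of 50 Mg/ha would not be distinguishable from noise at this threshold. Reducing the uncertainty of each one-time map, or modelling the dates jointly so that shared errors cancel, would both shrink this value.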

10. Summary

In summary, there are many areas in which advanced statistical techniques can be of great use to the geoscience community, in particular for above-ground biomass (AGB) mapping. There is an influx of global satellite data, and a growing need for accurate large-scale mapping to inform climate policy makers. Greater collaboration between researchers in geoscience, statistics, computer science and forestry could vastly improve the currently available global maps. This review should help to point remote sensing researchers to statistical and machine learning techniques which may lead to improvements in AGB maps, while also highlighting the main topics in which statistical methods are likely to have a positive impact. Moreover, it may give statistical researchers a real-world data source which inspires future developments.

Author Contributions

Conceptualization, investigation, writing—original draft preparation was completed by A.E.T. Writing—review and editing, supervision and funding acquisition was completed by N.H.A. and E.T.A.M. All authors have read and agreed to the published version of the manuscript.

Funding

Funding for this research was provided by the Natural Environment Research Council through a SENSE CDT studentship (NE/T00939X/1). Edward T.A. Mitchard is funded by a grant from the U.K.’s Natural Environment Research Council (NERC, NE/M021998/1) and by the European Research Council Starting Grant number 757526: Tropical Forest Degradation Experiment (FODEX).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AGB	Above-Ground Biomass
AGC	Above-Ground Carbon
ALOS	Advanced Land Observing Satellite
CCI	ESA Climate Change Initiative
CSA	Canadian Space Agency
DBH	Diameter at Breast Height
DEM	Digital Elevation Map
DLR	German Aerospace Center
ECV	Essential Climate Variable
ESA	European Space Agency
GAMs	Generalised Additive Models
GEDI	Global Ecosystem Dynamics Investigation
GLAS	Geoscience Laser Altimeter System
GLM	Generalised Linear Model
GLMMs	Generalised Linear Mixed Models
HV	Horizontal Vertical
INLA	Integrated Nested Laplace Approximation
JAXA	Japan Aerospace Exploration Agency
Lidar	Light Detection and Ranging
LLO	Leave Location Out
MAAP	Multi-Mission Algorithm and Analysis Platform
MCMC	Markov Chain Monte Carlo
MODIS	Moderate Resolution Imaging Spectroradiometer
MSE	Mean Squared Error
NASA	National Aeronautics and Space Administration
NCEO	National Centre for Earth Observation
NFIs	National Forest Inventories
OLS	Ordinary Least Squares
PALSAR-2	Phased Array L-band Synthetic Aperture Radar
REDD	Reducing Emissions from Deforestation and Forest Degradation
RF	Random Forest
RMSE	Root Mean Squared Error
SAR	Synthetic Aperture Radar
SDGs	Sustainable Development Goals
SGCS	Sequential Gaussian Cosimulation
SIS	Sequential Indicator Simulation
SPDE	Stochastic Partial Differential Equations
UAV	Unmanned Aerial Vehicle
UN	United Nations

References

1. FAO; UNEP. The State of the World’s Forests 2020. Forests, Biodiversity and People; FAO: Rome, Italy, 2020.
2. UN SDG 15 Definitions. 2022. Available online: https://unstats.un.org/sdgs/metadata/files/Metadata-15-01-01.pdf (accessed on 23 July 2022).
3. Friedlingstein, P.; Jones, M.W.; O’Sullivan, M.; Andrew, R.M.; Bakker, D.C.; Hauck, J.; Le Quéré, C.; Peters, G.P.; Peters, W.; Pongratz, J.; et al. Global carbon budget 2021. Earth Syst. Sci. Data 2022, 14, 1917–2005.
4. Hansen, M.C.; Potapov, P.V.; Moore, R.; Hancher, M.; Turubanova, S.A.; Tyukavina, A.; Thau, D.; Stehman, S.V.; Goetz, S.J.; Loveland, T.R.; et al. High-resolution global maps of 21st-century forest cover change. Science 2013, 342, 850–853.
5. Global Forest Watch. 2022. Available online: https://www.globalforestwatch.org/ (accessed on 23 July 2022).
6. Qin, Y.; Xiao, X.; Wigneron, J.P.; Ciais, P.; Brandt, M.; Fan, L.; Li, X.; Crowell, S.; Wu, X.; Doughty, R.; et al. Carbon loss from forest degradation exceeds that from deforestation in the Brazilian Amazon. Nat. Clim. Chang. 2021, 11, 442–448.
7. United Nations Reducing Emissions from Deforestation and Forest Degradation (REDD). 2021. Available online: https://www.unredd.net/about/what-is-redd-plus.html (accessed on 31 August 2021).
8. Mitchard, E.T. The tropical forest carbon cycle and climate change. Nature 2018, 559, 527–534.
9. United Nations Sustainable Development Goals. 2022. Available online: https://sdgs.un.org/goals (accessed on 13 May 2022).
10. United Nations Paris Agreement. 2015. Available online: https://treaties.un.org/pages/ViewDetails.aspx?src=TREATY&mtdsg_no=XXVII-7-d&chapter=27&clang=_en (accessed on 13 May 2022).
11. Voigt, C.; Ferreira, F. The Warsaw Framework for REDD+: Implications for national implementation and access to results-based finance. Carbon Clim. Law Rev. 2015, 9, 113–129.
12. Herold, M.; Carter, S.; Avitabile, V.; Espejo, A.B.; Jonckheere, I.; Lucas, R.; McRoberts, R.E.; Næsset, E.; Nightingale, J.; Petersen, R.; et al. The role and need for space-based forest biomass-related measurements in environmental management and policy. Surv. Geophys. 2019, 40, 757–778.
13. Ravindranath, N.H.; Ostwald, M. Carbon Inventory Methods: Handbook for Greenhouse Gas Inventory, Carbon Mitigation and Roundwood Production Projects; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2007; Volume 29.
14. Morison, J.I. Forest Research—Climate Change and Forests Report. 2020. Available online: https://www.forestresearch.gov.uk/documents/7910/20_0039_Leaflet_CC_factsheet_Forests_wip06_ACC.pdf (accessed on 31 August 2021).
15. Brown, S. Estimating Biomass and Biomass Change of Tropical Forests: A Primer; Food and Agriculture Organization: Rome, Italy, 1997; Volume 134.
16. Chuvieco, E. Fundamentals of Satellite Remote Sensing: An Environmental Approach; CRC Press: Boca Raton, FL, USA, 2016.
17. Giménez, M.G.; Ballester, M.J.Y.; Romero, B.R.; López, A.S.; De Grandi, E.C.; Dutta, O.; Bañuls, E.P.; Nieto, A.F.; Ramírez, P.P.; Carrillo, Á.F.; et al. Assessment of Innovative Technologies and Their Readiness for Remote Sensing-Based Estimation of Forest Carbon Stocks and Dynamics 2021. Available online: https://openknowledge.worldbank.org/handle/10986/35806 (accessed on 30 June 2022).
18. Mitchard, E.T.; Feldpausch, T.R.; Brienen, R.J.; Lopez-Gonzalez, G.; Monteagudo, A.; Baker, T.R.; Lewis, S.L.; Lloyd, J.; Quesada, C.A.; Gloor, M.; et al. Markedly divergent estimates of Amazon forest carbon density from ground plots and satellites. Glob. Ecol. Biogeogr. 2014, 23, 935–946.
19. Lu, D. The potential and challenge of remote sensing-based biomass estimation. Int. J. Remote Sens. 2006, 27, 1297–1328.
20. Chave, J.; Davies, S.J.; Phillips, O.L.; Lewis, S.L.; Sist, P.; Schepaschenko, D.; Armston, J.; Baker, T.R.; Coomes, D.; Disney, M.; et al. Ground data are essential for biomass remote sensing missions. Surv. Geophys. 2019, 40, 863–880.
21. Forest Research—Climate Change and Forests Report. 2021. Available online: https://www.forestresearch.gov.uk/documents/2726/FCNFI113.pdf (accessed on 13 May 2022).
22. Chave, J.; Andalo, C.; Brown, S.; Cairns, M.A.; Chambers, J.Q.; Eamus, D.; Fölster, H.; Fromard, F.; Higuchi, N.; Kira, T.; et al. Tree allometry and improved estimation of carbon stocks and balance in tropical forests. Oecologia 2005, 145, 87–99.
23. Muukkonen, P. Generalized allometric volume and biomass equations for some tree species in Europe. Eur. J. For. Res. 2007, 126, 157–166.
24. Chave, J.; Réjou-Méchain, M.; Búrquez, A.; Chidumayo, E.; Colgan, M.S.; Delitti, W.B.; Duque, A.; Eid, T.; Fearnside, P.M.; Goodman, R.C.; et al. Improved allometric models to estimate the aboveground biomass of tropical trees. Glob. Chang. Biol. 2014, 20, 3177–3190.
25. Dubayah, R.; Blair, J.B.; Goetz, S.; Fatoyinbo, L.; Hansen, M.; Healey, S.; Hofton, M.; Hurtt, G.; Kellner, J.; Luthcke, S.; et al. The Global Ecosystem Dynamics Investigation: High-resolution laser ranging of the Earth’s forests and topography. Sci. Remote Sens. 2020, 1, 100002.
26. Emery, B.; Camps, A. Introduction to Satellite Remote Sensing: Atmosphere, Ocean, Land and Cryosphere Applications; Elsevier: Amsterdam, The Netherlands, 2017.
27. NASA Landsat Mission Details. 2022. Available online: https://landsat.gsfc.nasa.gov/satellites/landsat-9/landsat-9-bands/ (accessed on 13 May 2022).
28. ESA Sentinel-2 Mission Details. 2022. Available online: https://sentinel.esa.int/web/sentinel/missions/sentinel-2 (accessed on 13 May 2022).
29. Airbus Pléiades Neo Mission Details. 2022. Available online: https://www.intelligence-airbusds.com/imagery/constellation/pleiades-neo/ (accessed on 13 May 2022).
30. ESA Sentinel-1 Mission Details. 2022. Available online: https://sentinel.esa.int/web/sentinel/missions/sentinel-1 (accessed on 13 May 2022).
31. JAXA ALOS-PALSAR-2 Mission Details. 2022. Available online: https://www.eorc.jaxa.jp/ALOS-2/en/about/palsar2.htm (accessed on 13 May 2022).
32. ESA BIOMASS Mission Details. 2022. Available online: https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass (accessed on 13 May 2022).
33. Mitchard, E.T.; Saatchi, S.S.; Woodhouse, I.H.; Nangendo, G.; Ribeiro, N.; Williams, M.; Ryan, C.M.; Lewis, S.L.; Feldpausch, T.; Meir, P. Using satellite radar backscatter to predict above-ground woody biomass: A consistent relationship across four different African landscapes. Geophys. Res. Lett. 2009, 36, L23401.
34. Mette, T.; Papathanassiou, K.; Hajnsek, I. Biomass estimation from polarimetric SAR interferometry over heterogeneous forest terrain. In Proceedings of the IGARSS 2004—2004 IEEE International Geoscience and Remote Sensing Symposium, Anchorage, AK, USA, 20–24 September 2004; Volume 1, pp. 511–514.
35. NASA ICESat-2 Mission Details. 2022. Available online: https://icesat-2.gsfc.nasa.gov/ (accessed on 13 May 2022).
36. NASA GEDI Mission Details. 2022. Available online: https://gedi.umd.edu/ (accessed on 13 May 2022).
37. ESA Multi-Mission Algorithm and Analysis Platform (MAAP). 2021. Available online: https://earthdata.nasa.gov/maap-biomass/products/global (accessed on 13 May 2022).
38. Albinet, C.; Whitehurst, A.S.; Jewell, L.A.; Bugbee, K.; Laur, H.; Murphy, K.J.; Frommknecht, B.; Scipal, K.; Costa, G.; Jai, B.; et al. A joint ESA-NASA multi-mission algorithm and analysis platform (MAAP) for biomass, NISAR, and GEDI. Surv. Geophys. 2019, 40, 1017–1027.
39. Santoro, M.; Cartus, O.; Carvalhais, N.; Rozendaal, D.; Avitabile, V.; Araza, A.; De Bruin, S.; Herold, M.; Quegan, S.; Rodríguez-Veiga, P.; et al. The global forest above-ground biomass pool for 2010 estimated from high-resolution satellite observations. Earth Syst. Sci. Data 2021, 13, 3927–3950.
40. Rodriguez-Veiga, P.; Balzter, H. Africa Aboveground Biomass Map for 2017. 2021. Available online: https://leicester.figshare.com/articles/dataset/Africa_Aboveground_Biomass_map_for_2017/15060270/1 (accessed on 13 May 2022).
41. Duncanson, L.; Kellner, J.R.; Armston, J.; Dubayah, R.; Minor, D.M.; Hancock, S.; Healey, S.P.; Patterson, P.L.; Saarela, S.; Marselis, S.; et al. Aboveground biomass density models for NASA’s Global Ecosystem Dynamics Investigation (GEDI) lidar mission. Remote Sens. Environ. 2022, 270, 112845.
42. Saatchi, S.S.; Harris, N.L.; Brown, S.; Lefsky, M.; Mitchard, E.T.; Salas, W.; Zutta, B.R.; Buermann, W.; Lewis, S.L.; Hagen, S.; et al. Benchmark map of forest carbon stocks in tropical regions across three continents. Proc. Natl. Acad. Sci. USA 2011, 108, 9899–9904.
43. Simard, M.; Fatoyinbo, L.; Smetanka, C.; Rivera-Monroy, V.H.; Castañeda-Moya, E.; Thomas, N.; Van der Stocken, T. Mangrove canopy height globally related to precipitation, temperature and cyclone frequency. Nat. Geosci. 2019, 12, 40–45.
44. Santoro, M. CCI Biomass Algorithm Theoretical Basis Document Year 3. 2021. Available online: https://climate.esa.int/media/documents/D2_2_Algorithm_Theoretical_Basis_Document_ATBD_V3.0_20210614_hkrml_SQ_MS.pdf (accessed on 13 May 2022).
45. Baccini, A.; Goetz, S.; Walker, W.; Laporte, N.; Sun, M.; Sulla-Menashe, D.; Hackler, J.; Beck, P.; Dubayah, R.; Friedl, M.; et al. Estimated carbon dioxide emissions from tropical deforestation improved by carbon-density maps. Nat. Clim. Chang. 2012, 2, 182–185.
46. Baccini, A.; Walker, W.; Carvalho, L.; Farina, M.; Sulla-Menashe, D.; Houghton, R. Tropical forests are a net carbon source based on aboveground measurements of gain and loss. Science 2017, 358, 230–234.
47. Avitabile, V.; Herold, M.; Heuvelink, G.B.; Lewis, S.L.; Phillips, O.L.; Asner, G.P.; Armston, J.; Ashton, P.S.; Banin, L.; Bayol, N.; et al. An integrated pan-tropical biomass map using multiple reference datasets. Glob. Chang. Biol. 2016, 22, 1406–1420.
48. Rodriguez-Veiga, P.; Wheeler, J.; Louis, V.; Tansey, K.; Balzter, H. Quantifying forest biomass carbon stocks from space. Curr. For. Rep. 2017, 3, 1–18.
49. Cressie, N. Statistics for Spatial Data; John Wiley & Sons: Hoboken, NJ, USA, 2015.
50. Cressie, N.; Calder, C.A.; Clark, J.S.; Hoef, J.M.V.; Wikle, C.K. Accounting for uncertainty in ecological analysis: The strengths and limitations of hierarchical statistical modeling. Ecol. Appl. 2009, 19, 553–570.
51. Pedersen, E.J.; Miller, D.L.; Simpson, G.L.; Ross, N. Hierarchical generalized additive models in ecology: An introduction with mgcv. PeerJ 2019, 7, e6876.
52. Saarela, S.; Holm, S.; Healey, S.P.; Andersen, H.E.; Petersson, H.; Prentius, W.; Patterson, P.L.; Næsset, E.; Gregoire, T.G.; Ståhl, G. Generalized hierarchical model-based estimation for aboveground biomass assessment using GEDI and Landsat data. Remote Sens. 2018, 10, 1832.
53. Patterson, P.L.; Healey, S.P.; Ståhl, G.; Saarela, S.; Holm, S.; Andersen, H.E.; Dubayah, R.O.; Duncanson, L.; Hancock, S.; Armston, J.; et al. Statistical properties of hybrid estimators proposed for GEDI—NASA’s global ecosystem dynamics investigation. Environ. Res. Lett. 2019, 14, 065007.
54. Gelfand, A.E. Hierarchical modeling for spatial data problems. Spat. Stat. 2012, 1, 30–39.
55. Clark, J.S.; Gelfand, A.E. Hierarchical Modelling for the Environmental Sciences: Statistical Methods and Applications; OUP Oxford: Oxford, UK, 2006.
56. Wood, S.N. Generalized Additive Models: An Introduction with R, 2nd ed.; CRC: Boca Raton, FL, USA, 2006.
57. Banerjee, S.; Carlin, B.P.; Gelfand, A.E. Hierarchical Modeling and Analysis for Spatial Data; CRC: Boca Raton, FL, USA, 2003.
58. Rue, H.; Martino, S.; Chopin, N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B 2009, 71, 319–392.
59. Lindgren, F.; Rue, H.; Lindström, J. An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B 2011, 73, 423–498.
60. Bakka, H.; Rue, H.; Fuglstad, G.A.; Riebler, A.; Bolin, D.; Illian, J.; Krainski, E.; Simpson, D.; Lindgren, F. Spatial modeling with R-INLA: A review. Wiley Interdiscip. Rev. Comput. Stat. 2018, 10, e1443.
61. Krainski, E.; Gómez-Rubio, V.; Bakka, H.; Lenzi, A.; Castro-Camilo, D.; Simpson, D.; Lindgren, F.; Rue, H. Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA; CRC: Boca Raton, FL, USA, 2018.
62. Lindgren, F.; Bolin, D.; Rue, H. The SPDE approach for Gaussian and non-Gaussian fields: 10 years and still running. Spat. Stat. 2022, 50, 100599.
63. Beloconi, A.; Probst-Hensch, N.M.; Vounatsou, P. Spatio-temporal modelling of changes in air pollution exposure associated with the COVID-19 lockdown measures across Europe. Sci. Total. Environ. 2021, 787, 147607.
64. Heuvelink, G.B.; Webster, R. Spatial statistics and soil mapping: A blossoming partnership under pressure. Spat. Stat. 2022, 50, 100639.
65. Brenning, A. Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing: The R package sperrorest. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012; pp. 5372–5375.
66. Meyer, H.; Reudenbach, C.; Hengl, T.; Katurji, M.; Nauss, T. Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environ. Model. Softw. 2018, 101, 1–9.
67. Persson, H.J.; Jonzén, J.; Nilsson, M. Combining TanDEM-X and Sentinel-2 for large-area species-wise prediction of forest biomass and volume. Int. J. Appl. Earth Obs. Geoinf. 2021, 96, 102275.
68. Jiang, F.; Zhao, F.; Ma, K.; Li, D.; Sun, H. Mapping the Forest Canopy Height in Northern China by Synergizing ICESat-2 with Sentinel-2 Using a Stacking Algorithm. Remote Sens. 2021, 13, 1535.
69. Atkinson, P.M.; Stein, A.; Jeganathan, C. Spatial sampling, data models, spatial scale and ontologies: Interpreting spatial statistics and machine learning applied to satellite optical remote sensing. Spat. Stat. 2022, 50, 100646.
70. Poggio, L.; Gimona, A. Downscaling and correction of regional climate models outputs with a hybrid geostatistical approach. Spat. Stat. 2015, 14, 4–21.
71. Babcock, C.; Finley, A.O.; Andersen, H.E.; Pattison, R.; Cook, B.D.; Morton, D.C.; Alonzo, M.; Nelson, R.; Gregoire, T.; Ene, L.; et al. Geostatistical estimation of forest biomass in interior Alaska combining Landsat-derived tree cover, sampled airborne lidar and field observations. Remote Sens. Environ. 2018, 212, 212–230.
72. Rayner, N.A.; Auchmann, R.; Bessembinder, J.; Brönnimann, S.; Brugnara, Y.; Capponi, F.; Carrea, L.; Dodd, E.M.; Ghent, D.; Good, E.; et al. The EUSTACE project: Delivering global, daily information on surface air temperature. Bull. Am. Meteorol. Soc. 2020, 101, E1924–E1947.
73. Rue, H.; Martino, S.; Lindgren, F.; Simpson, D.; Riebler, A. R-INLA: Approximate Bayesian Inference Using Integrated Nested Laplace Approximations. 2013. Available online: http://www.r-inla.org (accessed on 23 July 2022).
74. Bolin, D.; Lindgren, F. Spatial models generated by nested stochastic partial differential equations, with an application to global ozone mapping. Ann. Appl. Stat. 2011, 5, 523–550.
75. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2013; Volume 112.
76. Valbuena, R.; Hernando, A.; Manzanera, J.; Görgens, E.; Almeida, D.; Mauro, F.; García-Abril, A.; Coomes, D. Enhancing of accuracy assessment for forest above-ground biomass estimates obtained from remote sensing via hypothesis testing and overfitting evaluation. Ecol. Model. 2017, 366, 15–26.
77. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
78. Ploton, P.; Barbier, N.; Couteron, P.; Antin, C.; Ayyappan, N.; Balachandran, N.; Barathan, N.; Bastin, J.F.; Chuyong, G.; Dauby, G.; et al. Toward a general tropical forest biomass prediction model from very high resolution optical satellite images. Remote Sens. Environ. 2017, 200, 140–153.
79. Roberts, D.R.; Bahn, V.; Ciuti, S.; Boyce, M.S.; Elith, J.; Guillera-Arroita, G.; Hauenstein, S.; Lahoz-Monfort, J.J.; Schröder, B.; Thuiller, W.; et al. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 2017, 40, 913–929.
80. Gelman, A.; Hill, J.; Vehtari, A. Regression and Other Stories; Cambridge University Press: Cambridge, UK, 2020.
81. Duncanson, L.; Armston, J.; Disney, M.; Avitabile, V.; Barbier, N.; Calders, K.; Carter, S.; Chave, J.; Herold, M.; Crowther, T.W.; et al. The importance of consistent global forest aboveground biomass product validation. Surv. Geophys. 2019, 40, 979–999.
82. Araza, A.; de Bruin, S.; Herold, M.; Quegan, S.; Labriere, N.; Rodriguez-Veiga, P.; Avitabile, V.; Santoro, M.; Mitchard, E.T.; Ryan, C.M.; et al. A comprehensive framework for assessing the accuracy and uncertainty of global above-ground biomass maps. Remote Sens. Environ. 2022, 272, 112917.
83. Wadoux, A.M.C.; Heuvelink, G.B.; De Bruin, S.; Brus, D.J. Spatial cross-validation is not the right way to evaluate map accuracy. Ecol. Model. 2021, 457, 109692.
84. Meyer, H.; Pebesma, E. Predicting into unknown space? Estimating the area of applicability of spatial prediction models. Methods Ecol. Evol. 2021, 12, 1620–1633.
85. Romijn, E.; De Sy, V.; Herold, M.; Böttcher, H.; Roman-Cuesta, R.M.; Fritz, S.; Schepaschenko, D.; Avitabile, V.; Gaveau, D.; Verchot, L.; et al. Independent data for transparent monitoring of greenhouse gas emissions from the land use sector—What do stakeholders think and need? Environ. Sci. Policy 2018, 85, 101–112.
86. Saarela, S.; Holm, S.; Grafström, A.; Schnell, S.; Næsset, E.; Gregoire, T.G.; Nelson, R.F.; Ståhl, G. Hierarchical model-based inference for forest inventory utilizing three sources of information. Ann. For. Sci. 2016, 73, 895–910.
87. Zakeri, F.; Mariethoz, G. A review of geostatistical simulation models applied to satellite remote sensing: Methods and applications. Remote Sens. Environ. 2021, 259, 112381.
88. Beloconi, A.; Vounatsou, P. Bayesian geostatistical modelling of high-resolution NO2 exposure in Europe combining data from monitors, satellites and chemical transport models. Environ. Int. 2020, 138, 105578.
89. Beloconi, A.; Chrysoulakis, N.; Lyapustin, A.; Utzinger, J.; Vounatsou, P. Bayesian geostatistical modelling of PM10 and PM2.5 surface level concentrations in Europe using high-resolution satellite-derived products. Environ. Int. 2018, 121, 57–70.
90. De Bruin, S. Predicting the areal extent of land-cover types using classified imagery and geostatistics. Remote Sens. Environ. 2000, 74, 387–396.
91. Meinshausen, N.; Ridgeway, G. Quantile regression forests. J. Mach. Learn. Res. 2006, 7, 983–999.
92. Vaysse, K.; Lagacherie, P. Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma 2017, 291, 55–64.
93. Lucchesi, L.R.; Kuhnert, P.M.; Wikle, C.K. Vizumap: An R package for visualising uncertainty in spatial data. J. Open Source Softw. 2021, 6, 2409.
94. Taylor, A.R.; Watson, J.A.; Buckee, C.O. Pixelate to communicate: Visualising uncertainty in maps of disease risk and other spatial continua. arXiv 2020, arXiv:2005.11993.
95. Bolin, D.; Lindgren, F. Quantifying the uncertainty of contour maps. J. Comput. Graph. Stat. 2017, 26, 513–524.
96. Lu, D.; Mausel, P.; Brondizio, E.; Moran, E. Change detection techniques. Int. J. Remote Sens. 2004, 25, 2365–2401.
97. Hansen, J.N.; Mitchard, E.T.; King, S. Detecting Deforestation from Sentinel-1 Data in the Absence of Reliable Reference Data. arXiv 2022, arXiv:2205.12131.
98. Hostert, P.; Griffiths, P.; Linden, S.v.d.; Pflugmacher, D. Time series analyses in a new era of optical satellite data. In Remote Sensing Time Series; Springer: Berlin/Heidelberg, Germany, 2015; pp. 25–41.
99. Zhao, K.; Wulder, M.A.; Hu, T.; Bright, R.; Wu, Q.; Qin, H.; Li, Y.; Toman, E.; Mallick, B.; Zhang, X.; et al. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sens. Environ. 2019, 232, 111181.
100. Verbesselt, J.; Hyndman, R.; Newnham, G.; Culvenor, D. Detecting trend and seasonal changes in satellite image time series. Remote Sens. Environ. 2010, 114, 106–115.
101. Kennedy, R.E.; Yang, Z.; Cohen, W.B. Detecting trends in forest disturbance and recovery using yearly Landsat time series: 1. LandTrendr—Temporal segmentation algorithms. Remote Sens. Environ. 2010, 114, 2897–2910.
102. Reiche, J.; De Bruin, S.; Hoekman, D.; Verbesselt, J.; Herold, M. A Bayesian approach to combine Landsat and ALOS PALSAR time series for near real-time deforestation detection. Remote Sens. 2015, 7, 4973–4996.
103. Monteiro, A.; Menezes, R.; Silva, M.E. Modelling spatio-temporal data with multiple seasonalities: The NO₂ Portuguese case. Spat. Stat. 2017, 22, 371–387.

Figure 1. Satellite imagery downloaded using Google Earth Engine over the same location in the Black Forest, Germany [centre point: 48.528N, 8.228E]. (a) Lidar, GEDI. Footprints of data are shown by coloured dots ranging from pink (low height) to blue (large height). (b) SAR HV, ALOS PALSAR-2 annual mosaic, shown with no filtering. No backscatter is black and high backscatter is white. (c) Optical imagery, Sentinel-2. Bands 2, 3 and 4 are shown to create an RGB image, with a 90% stretch applied.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

