**1. Introduction**

Forest stock volume (FSV) refers to the total volume of tree trunks growing within a given forest area, and it is thus an important indicator of the total forest resources within that area [1]. It is also an important parameter for assessing forest quality, forest carbon sequestration potential, and the effectiveness of forest management [2]. Global warming has drawn widespread attention around the globe, particularly since the Chinese government formally proposed a strategic plan for carbon peaking and carbon neutrality in 2020 [3–5]. Forests matter in this context because their carbon sink capacity is an effective means of mitigating global warming. By tracking changes in FSV [6], the dynamics of carbon storage can be understood and the carbon sink capacity of the forest ecosystem can be estimated. Therefore, FSV studies are not only important for understanding the global carbon cycle, but also of practical significance for realizing China's dual-carbon objectives.

**Citation:** Ma, T.; Hu, Y.; Wang, J.; Beckline, M.; Pang, D.; Chen, L.; Ni, X.; Li, X. A Novel Vegetation Index Approach Using Sentinel-2 Data and Random Forest Algorithm for Estimating Forest Stock Volume in the Helan Mountains, Ningxia, China. *Remote Sens.* **2023**, *15*, 1853. https://doi.org/10.3390/rs15071853

Academic Editor: Jochem Verrelst

Received: 28 February 2023 Revised: 29 March 2023 Accepted: 29 March 2023 Published: 30 March 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The traditional FSV estimation method relies mainly on manual ground measurements of diameter at breast height (DBH) and tree height [7]. At fine scales, this approach can indeed yield high-precision estimates [8]. However, when it is extended to a large forest area, the small size and limited number of sample plots make it hard to obtain results that reflect actual conditions [9]. Furthermore, forest ecosystems are generally highly heterogeneous in space and difficult to access [10,11]. Therefore, at this stage, it is not recommended to estimate FSV purely by manual field surveys. The advent of remote sensing has provided a solution to the challenge of large-scale FSV estimation [1,8,12]. By utilizing satellite images, it is now possible to obtain information about forest structure and composition across vast areas without the need for extensive ground measurements [13]. This technology has revolutionized the field of forest inventory, allowing for more efficient and accurate estimation of FSV at large scales. Remote sensing images can be used in combination with a small number of ground samples to obtain highly accurate estimates of FSV or biomass [10]. By calibrating remote sensing data against ground-based measurements, it is possible to build statistical models that accurately predict FSV over a much larger area [14]. This combined approach has significant advantages over traditional manual field surveys, as it is more efficient and cost-effective across large areas. Furthermore, remote sensing data can provide a more comprehensive understanding of forest ecosystems, allowing for more informed management decisions.

However, as more and more optical remote sensing images are applied to FSV studies, researchers have turned their attention to the light saturation phenomenon that affects FSV estimation results [15–17]. A wide range of vegetation indices can be calculated from the band reflectance of optical remote sensing images, and these traditional indices are commonly used to estimate the corresponding FSV or biomass [18–22]. However, as a forest matures, the traditional vegetation indices stop responding proportionally to further changes in stand age [15,16]. The result is the overestimation of low values and underestimation of high values that often occurs in FSV estimation studies. It stems from the insensitivity of spectral variables to changes in FSV, especially in forest areas with high vegetation coverage. Previous studies have explored a variety of methods to reduce the influence of light saturation on remote sensing estimation, including spatial regression models and multi-source remote sensing image fusion [15,17]. Unfortunately, because each of these studies focused on a specific region, their findings have limited generalizability and may not apply to other regions.
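
The saturation effect described above can be illustrated with a minimal sketch. It uses the widely known normalized difference vegetation index, NDVI = (NIR − Red) / (NIR + Red), as a stand-in for a traditional index. The reflectance curves in `toy_reflectance` and the FSV levels are hypothetical values chosen only for illustration; they are not measurements from this study or from Sentinel-2.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def toy_reflectance(fsv):
    """Hypothetical reflectance curves (NOT measured data): red reflectance
    falls and NIR reflectance rises toward a plateau as stock volume grows."""
    decay = math.exp(-fsv / 40.0)
    red = 0.03 + 0.12 * decay
    nir = 0.45 - 0.25 * decay
    return nir, red

# Illustrative stock volume levels in m^3/ha (assumed, not from the paper).
fsv_levels = [0, 50, 100, 150, 200, 250]
ndvi_values = [ndvi(*toy_reflectance(v)) for v in fsv_levels]

# NDVI gained per additional 50 m^3/ha of stock volume.
gains = [b - a for a, b in zip(ndvi_values, ndvi_values[1:])]

for v, n in zip(fsv_levels, ndvi_values):
    print(f"FSV={v:>3} m3/ha  NDVI={n:.3f}")
print("NDVI gain per step:", [round(g, 4) for g in gains])
```

Under these assumed curves, each additional increment of stock volume adds less to the index than the previous one, so dense stands with quite different FSV end up with nearly identical index values. A model fitted to such an index has little signal with which to separate high-FSV plots.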

The present study proposes a novel vegetation index aimed at improving the ability to estimate FSV from remote sensing images. Sentinel-2 imagery covers 13 spectral bands [23–26], from visible light to shortwave infrared, at spatial resolutions that differ by band. Among all optical satellites, Sentinel-2 is the only one that includes three spectral bands in the red-edge range [24,26]. These bands are very effective for monitoring vegetation change. To estimate the FSV of the Helan Mountains, the vegetation reflectance in these three red-edge bands was therefore used to calculate the novel vegetation index [27]. The optimal weighting coefficient of each red-edge band was determined by varying the coefficients at a set step size. As this study was carried out in a typical semi-arid montane forest ecosystem of the Helan Mountains, it may serve as a reference for related research in similar areas across the globe.
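
The idea of choosing red-edge weighting coefficients by stepping through candidate values can be sketched as follows. This is only an illustration under stated assumptions, not the index or procedure defined later in the paper: the index form in `red_edge_index` (a normalized difference between the NIR band and a weighted red-edge reflectance), the constraint that the weights sum to 1, the use of absolute Pearson correlation as the selection criterion, and the synthetic plot data are all hypothetical. The band names B5, B6, B7 (red edge) and B8 (NIR) follow the standard Sentinel-2 band numbering.

```python
import math
import random

def weight_grid(step):
    """All (w5, w6, w7) triples on a grid of the given step that sum to 1."""
    n = round(1 / step)
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1) for j in range(n + 1 - i)]

def red_edge_index(b5, b6, b7, b8, w):
    """Hypothetical index: normalized difference of NIR and weighted red edge."""
    red_edge = w[0] * b5 + w[1] * b6 + w[2] * b7
    return (b8 - red_edge) / (b8 + red_edge)

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def score(plots, fsv, w):
    """Absolute correlation between the index and plot-level FSV (assumed criterion)."""
    index = [red_edge_index(*p, w) for p in plots]
    return abs(pearson(index, fsv))

def best_weights(plots, fsv, step=0.1):
    """Exhaustively search the weight grid for the highest-scoring triple."""
    return max(weight_grid(step), key=lambda w: score(plots, fsv, w))

# Synthetic sample plots (NOT real data): reflectance drifts with FSV plus noise.
random.seed(0)
fsv = [random.uniform(20, 250) for _ in range(40)]
plots = []
for v in fsv:
    b5 = 0.10 - 0.0002 * v + random.gauss(0, 0.005)
    b6 = 0.25 + 0.0001 * v + random.gauss(0, 0.01)
    b7 = 0.30 + 0.0002 * v + random.gauss(0, 0.01)
    b8 = 0.32 + 0.0003 * v + random.gauss(0, 0.01)
    plots.append((b5, b6, b7, b8))

best = best_weights(plots, fsv, step=0.1)
print("best weights (B5, B6, B7):", best)
print("abs. correlation with FSV:", round(score(plots, fsv, best), 3))
```

A smaller step gives a finer search at a higher cost: with three weights constrained to sum to 1, a step of 0.1 yields 66 candidate triples, while a step of 0.01 yields 5151.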

Specifically, the present study develops a novel vegetation index based on the multiple red-edge bands of Sentinel-2 and combines it with the original band information and traditional vegetation indices to estimate the FSV of the Helan Mountains using a machine learning algorithm. The study pursues three goals: (1) to explore the potential of the novel vegetation index, developed from Sentinel-2 data, for estimating FSV; (2) to compare the ability of different variable combinations to estimate FSV and to determine the best of the three models developed in this study; and (3) to map the FSV distribution of the study area using the best variable combination identified in goal (2).
