*Article* **Eco-Restoration of Coal Mine Spoil: Biochar Application and Carbon Sequestration for Achieving UN Sustainable Development Goals 13 and 15**

**Dipita Ghosh and Subodh Kumar Maiti \***

Ecological Restoration Laboratory, Department of Environmental Science & Engineering, Indian Institute of Technology (ISM), Dhanbad 826004, India; ghoshdipita@gmail.com

**\*** Correspondence: subodh@iitism.ac.in or skmism1960@gmail.com

**Abstract:** Open cast coal mining causes complete loss of the carbon sink through the destruction of vegetation and soil structure. To offset this loss and increase carbon sequestration, afforestation is widely used to restore mine spoils. The current field study assessed the ecosystem status, soil quality and C pool of an 8-year-old reclaimed mine spoil (RMS), compared with a reference forest (RF) site and an unamended mine spoil (UMS). Biochar (BC) prepared from the invasive weed *Calotropis procera* was applied to this 8-year-old RMS at 30 t ha<sup>−1</sup> (BC<sub>30</sub>) and 60 t ha<sup>−1</sup> (BC<sub>60</sub>) to study its impact on RMS properties and C pool. Carbon fractionation was also conducted to estimate inorganic, coal and biogenic carbon pools. The C stock of the 8-year-old RMS was 30.98 Mg C ha<sup>−1</sup>, equivalent to 113.69 Mg CO<sub>2</sub> ha<sup>−1</sup> sequestered. BC<sub>30</sub> and BC<sub>60</sub> improved the C-stock of RMS by 31% and 45%, respectively, and increased the recalcitrant carbon by 65% (BC<sub>30</sub>) and 67% (BC<sub>60</sub>). Spoil physico-chemical properties such as pH, cation exchange capacity, moisture content and bulk density were improved by biochar application. The total soil carbon at BC<sub>30</sub> (36.3 g C kg<sup>−1</sup>) and BC<sub>60</sub> (40 g C kg<sup>−1</sup>) was significantly higher than in RMS (21 g C kg<sup>−1</sup>) and comparable to RF (33 g C kg<sup>−1</sup>). Thus, eco-restoration of coal mine spoil and biochar application can be effective tools for coal mine reclamation and can help in achieving UN sustainable development goal 13 (climate action) by increasing carbon sequestration and goal 15 (biodiversity protection) by promoting ecosystem development.

**Keywords:** coal mine spoil; reclamation; biochar; carbon sequestration; carbon fractionation

#### **1. Introduction**

The UN sustainable development goal (SDG) 13 stands for climate action and promotes activities that ensure successful sequestration of carbon, whereas SDG 15 aims to safeguard and restore biodiversity [1,2]. Burning of fossil fuels is the primary driver of global warming and climate change, and the extraction of these resources adds to global concerns regarding the climate crisis [3,4]. Mining activities lead to complete loss of vegetation, and the carbon stored in the soil and plants is lost to the atmosphere [5]. Mine spoils are carbon deficient, with impoverished soil conditions that cannot support plant and microbial growth. Coal mine restoration can help restore the lost carbon sink by promoting plant growth and enriching the mine spoil, which helps sequester atmospheric carbon [6,7]. The most common techniques for mine restoration include afforestation, agriculture and grassland development [8,9]. Plantation of hardy species in reclaimed mine spoils (RMS) enlarges the soil organic carbon (SOC) pool and improves the carbon sequestration potential of the ecosystem [10]. Development of natural forest in mine spoils may take centuries due to the impoverished soil properties and lack of substrate for supporting plant growth. Degraded land can be reclaimed by development of forest cover. Technical reclamation such as leveling and grading of dump, reducing slope length, stabilization of slope by blanketing with coir mat along with grass-legume

**Citation:** Ghosh, D.; Maiti, S.K. Eco-Restoration of Coal Mine Spoil: Biochar Application and Carbon Sequestration for Achieving UN Sustainable Development Goals 13 and 15. *Land* **2021**, *10*, 1112. https://doi.org/10.3390/land10111112

Academic Editors: Marina Cabral Pinto, Amit Kumar and Munesh Kumar

Received: 27 September 2021 Accepted: 18 October 2021 Published: 20 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

mixture, application of top soil, fly ash and bio-solids can be used to enhance the vegetation growth [11–13].

Restoration of RMS can potentially enhance the soil C sequestration rate and improve soil properties [14–16]. Akala and Lal [17] reported that the SOC pool of an RMS increased from 14 Mg ha<sup>−1</sup> to 48.4 Mg ha<sup>−1</sup> after 21 years of pastureland development in a degraded coal mine site. In a 19-year-old revegetated coal mine spoil (Singrauli, India), the rate of carbon sequestration increased by 712% [18]. The sequestration of carbon depends on the type of vegetation used for reclamation, the age of reclamation and the nature of the coal mine spoil. A study conducted by Mukhopadhyay et al. [19] in RMS reported that the carbon density was highest for *Dalbergia sissoo* Roxb. and *Acacia auriculiformis* A. Cunn. ex Benth. (39.6–43.7 kg C tree<sup>−1</sup>) and lowest for *Albizzia lebbeck* L. (20.7 kg C tree<sup>−1</sup>). Thus, plantation can be an effective tool for coal mine spoil eco-restoration and enhanced carbon sequestration.

A number of studies have reported an increase in the soil carbon stock following biochar application [20,21]. Biochar is a thermal degradation product of biomass, produced under pyrolysis-like conditions with a limited supply of oxygen [21]. Pyrolytic conversion of biomass produces aromatic carbon that is resistant to degradation in soil, and is thus considered an option for addressing the CO<sub>2</sub> emissions caused by biomass decomposition [22]. Biochar has a high mean residence time and aromaticity, making it highly recalcitrant in nature [11,23]. Thus, carbon that would normally be released as CO<sub>2</sub> through biomass decomposition is converted to biochar, which is highly stable and aromatic. The aromaticity of biochar depends on the chemistry of the biomass used for biochar production. The mean residence time of biochar depends on the feedstock material, the pyrolytic method used and the substrate where it is applied [8,24,25]. Fidel et al. [26] reported that biochar has the potential to improve the soil inorganic carbon by 0.023–0.045 mg C kg<sup>−1</sup> and organic carbon by 0.001–0.0069 mg C kg<sup>−1</sup>. According to a study conducted by Ghosh and Maiti [27], *Lantana camara* biochar lowered the mine spoil CO<sub>2</sub> flux to 2.60 µmol CO<sub>2</sub> m<sup>−2</sup> s<sup>−1</sup> in the 3% treatment and 2.85 µmol CO<sub>2</sub> m<sup>−2</sup> s<sup>−1</sup> in the 2% treatment, compared with 4.92 µmol CO<sub>2</sub> m<sup>−2</sup> s<sup>−1</sup> in the control. Biochar acts as an amendment and improves the physico-chemical, biological and nutritive properties of soil [25,28]. An enriched soil supports the growth of vegetation, which can facilitate ecosystem development and promote carbon sequestration in vegetation and soil. Thus, it is imperative to investigate the link between the intrinsic characteristics of biochar and mine spoil restoration.

The excessive growth of invasive weeds in RMS during the plantation stage of reclamation causes the problem of allelopathy [28]. These weeds are usually uprooted and left to decompose, which adds to the atmospheric CO<sub>2</sub> pool. During the dry tropical summers, they act as fuel and cause the even bigger problem of mine fires. One such weed growing abundantly in RMS is *Calotropis procera* (Aiton) W.T.Aiton (family: Apocynaceae). *C. procera* is a hardy shrub with an average height of 2 m, covered with a fissured corky bark that is high in cellulose and lignin. It can be a potential feedstock for biochar production and mine reclamation. Only a few studies have reported on biochar-based carbon sequestration in an RMS, and the available data are from laboratory or greenhouse scale experiments [11,27]. The present study was conducted in an 8-year-old RMS, and the carbon sequestered in this RMS was calculated. A 6-month biochar-based field experiment was also conducted to study the effect of biochar as an amendment for reclamation of mine spoil. The study aims to understand how coal mine reclamation along with biochar application can help in achieving UN SDG 13 and 15. Thus, the objectives of the study are: (i) assessment of carbon sequestration in an 8-year-old RMS through vegetation, litter and soil carbon stock, (ii) application of *C. procera* biochar in the RMS in a 6-month field-based study, (iii) fractionation of carbon in RMS, biochar-amended RMS, reference forest (RF) and unreclaimed mine spoil (UMS), and (iv) calculation of total CO<sub>2</sub> sequestration in each system.

#### **2. Materials and Methods**

#### *2.1. Site Description*

The study area is located in Damoda colliery, Jharia Coalfield, situated in the Dhanbad district of Jharkhand, India (23°–23°48′ N; 86°11′–86°27′ E). The site map of the study area is presented in Figure 1a,b. The Damoda eco-restoration site is an 8-year-old backfilled dump site of 4 ha area. The mine spoil consists of sandstone, carbonaceous shale, intermixed shale and sandstone, Jhama (heat-affected coal) with mica peridotite, subsoil and coal. The area experiences extreme weather conditions, with summer temperatures of 42 to 46 °C and winter temperatures of 5 to 22 °C, and received 1900 mm of rainfall in the year of the study (2019). Jharia Coalfield is located in a dry tropical region and experiences three main seasons: summer, monsoon and winter. The carbon sequestration study was conducted in February 2019, after the biochar had been incubated for 6 months in field conditions.

**Figure 1.** (**a**) Map of India, showing the Jharkhand state (**b**) Location map of Damoda eco-restoration site showing the 3 quadrats in the sampling area (**c**) Outer view of Damoda eco-restoration site showing dense bamboo clumps and stone boundary (**d**) Reserve forest sampling site.

The eco-restored mine dump has a history of shovel–dumper based mining activity. In 2011, hardy and multipurpose tree saplings were planted in pits of dimension 30 cm × 30 cm × 30 cm. Seeds of grasses such as *Pennisetum pedicellatum* Trin. were also spread; these act as pioneer species and develop understory vegetation. Afforested trees such as *A. lebbeck*, *D. sissoo* and *Bambusa arundinacea* (L.) Voss were the dominant species, with sparse growth of plants such as *Azadirachta indica* A. Juss., *Bauhinia variegata* (L.) Benth., *Melia azedarach* L., *Psidium guajava* L., *Syzygium cumini* (L.) Skeels., *Terminalia arjuna* (Roxb.) Wight & Arn, and *Zizyphus mauritiana* Lam. Figure 1c shows the Damoda eco-restoration site with its stone boundary. An RF area near the mining area was used as a positive reference site, while UMS was used as a negative reference for the study. The most dominant trees in the RF site were *D. sissoo*, *A. lebbeck*, *Butea monosperma* (Lam.) Taub. and *Shorea robusta* Roth (Figure 1d). UMS was coarser, with rock debris, soils and subsoil materials. Since UMS was not revegetated, tree species were absent.

#### *2.2. Biochar Production, Characterization and Field Incubation*

*C. procera* growing in the RMS was collected in bulk, sun-dried, ground and used for biochar production. The feedstock was pyrolysed in a muffle furnace at 450 °C for 60 min. Biochar characterization was carried out using the methods given in Ghosh et al. [11] and Ghosh and Maiti [27]. The biochar field experiment followed a completely randomized design in a 2 × 3 scheme: two biochar application rates, 30 t ha<sup>−1</sup> (BC<sub>30</sub>) and 60 t ha<sup>−1</sup> (BC<sub>60</sub>), each with three replicate plots of 50 cm × 50 cm (Figure 2a–c). The carbon sequestration study was done after a 6-month incubation period under natural field conditions.

**Figure 2.** (**a**) Biochar being applied in RMS, (**b**) Plots showing biochar application (50 cm × 50 cm × 10 cm), (**c**) Ecological restoration project site of Damoda, showing boundary wall and sign board, (**d**) Soil sampling being done by a soil corer (80 cm × 20 cm), (**e**,**f**) Collection of litter from a metal quadrat (50 cm × 50 cm).

#### *2.3. Soil Sampling*

Soil samples were collected from the rhizospheric region (0–15 cm) of different tree species, with 8 samples taken from each quadrat using a metallic corer (80 cm × 20 cm) after removing the litter (Figure 2d). A total of 24 samples (8 samples × 3 quadrats) were collected from each of RMS and RF. In the biochar incubation sites, 4 samples were collected from each plot, making the total number of samples 24 (2 application levels × 3 replicates × 4 samples per plot). Samples were placed in zip-lock packets and brought back to the laboratory for physico-chemical analysis. Samples were air-dried inside the laboratory for a week at room temperature.

#### *2.4. Plant Biodiversity and Vegetation Analysis*

Vegetation carbon stock was analysed only for RMS, RF and UMS; no changes in the tree stock were observed after the 6-month biochar application, hence these data were not included. Three random quadrats of size 10 m × 10 m, covering a total area of 300 m<sup>2</sup>, were laid down for relative density [3]. Details on the density of the plants, IVI values and total number of species in RMS are provided in the Supplementary Materials. The density of the species present in each site was expressed as the number of individuals of each species per hectare [3].

$$\text{Relative density } (\%) = \frac{\text{Number of individuals of a species}}{\text{Total number of individuals of all species in a quadrat}} \times 100\tag{1}$$
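As a minimal sketch of Equation (1), the snippet below tallies individuals per species and converts the counts to percentages. It assumes the equation is read as the number of individuals of one species divided by the number of individuals of all species; the species labels and counts are hypothetical and are not field data from this study.

```python
from collections import Counter


def relative_density(individuals):
    """Return {species: relative density in %}.

    `individuals` holds one species label per plant recorded in the quadrats.
    """
    counts = Counter(individuals)
    total = sum(counts.values())
    return {species: 100.0 * n / total for species, n in counts.items()}


# Hypothetical tally of ten plants (illustrative only).
tally = ["bamboo"] * 5 + ["albizzia"] * 3 + ["sissoo"] * 2
rd = relative_density(tally)
```

By construction the percentages sum to 100, which is a quick sanity check on any field tally.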

The circumference of large trees in the quadrats was measured at 1.37 m to derive the diameter at breast height (DBH), and tree height was measured with a Distometer (Bosch GLM 40, India), while smaller vegetation (<3 m height) was measured using a digital Vernier caliper (Precise®, India). The specific gravity of the wood was measured by the water displacement method. The aboveground biomass (AGB) was estimated with the regression model developed by Chave et al. [29], which showed the best fit for tropical forests. The model estimates tree AGB by the following equation:

$$\text{AGB} = 0.0673 \times \left(\rho \text{D}^2 \text{H}\right)^{0.976} \tag{2}$$

where AGB = aboveground biomass (kg), ρ = wood specific gravity (g cm<sup>−3</sup>), D = DBH (cm) and H = tree height (m).

Root biomass (RB) was calculated by multiplying AGB by a factor of 2.25 [30]:

$$\text{RB} \left(\text{Mg} \,\text{ha}^{-1}\right) = \text{AGB} \left(\text{Mg} \,\text{ha}^{-1}\right) \times 2.25 \tag{3}$$

Tree carbon stock was calculated by multiplying the total tree biomass by a factor of 0.5 [31]:

$$\text{Tree C stock} \left(\text{Mg ha}^{-1}\right) = \text{Total tree biomass} \left(\text{Mg ha}^{-1}\right) \times 0.5 \tag{4}$$

The CO<sub>2</sub> sequestered by the plantation stock was calculated by the relation given by IPCC [30]:

$$\text{CO}_2\ \text{sequestered} \left(\text{Mg ha}^{-1}\right) = \text{Tree C stock} \left(\text{Mg ha}^{-1}\right) \times 3.67\tag{5}$$
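The chain from Equation (2) to Equation (5) can be sketched as below. This is an illustration under stated assumptions, not the authors' calculation script: tree height is taken in metres as in the Chave et al. model, total tree biomass is assumed to be AGB plus root biomass, the root factor of 2.25 is copied as printed in Equation (3), and the tree and plot values are hypothetical.

```python
def tree_agb_kg(rho, dbh_cm, height_m):
    """Eq. (2): aboveground biomass of one tree, in kg."""
    return 0.0673 * (rho * dbh_cm ** 2 * height_m) ** 0.976


def kg_per_plot_to_mg_per_ha(total_kg, plot_area_m2):
    """Scale a plot total (kg) to Mg per ha: kg/m2 * 10,000 m2/ha / 1,000 kg/Mg."""
    return total_kg / plot_area_m2 * 10.0


def stand_carbon(agb_mg_ha, root_factor=2.25, carbon_fraction=0.5, co2_factor=3.67):
    """Eqs. (3)-(5): root biomass, tree C stock and CO2 sequestered (Mg per ha)."""
    rb = agb_mg_ha * root_factor               # Eq. (3), factor as printed in the text
    total_biomass = agb_mg_ha + rb             # assumption: total = AGB + RB
    c_stock = total_biomass * carbon_fraction  # Eq. (4)
    co2 = c_stock * co2_factor                 # Eq. (5)
    return {"rb": rb, "c_stock": c_stock, "co2": co2}


# Hypothetical tree: specific gravity 0.6 g cm-3, DBH 20 cm, height 10 m.
agb_one_tree = tree_agb_kg(0.6, 20.0, 10.0)

# Hypothetical plot total of 3000 kg AGB over the 300 m2 sampled area.
agb_ha = kg_per_plot_to_mg_per_ha(3000.0, 300.0)
result = stand_carbon(agb_ha)
```

Keeping the three factors as keyword arguments makes it easy to rerun the chain with a different root-to-shoot ratio or carbon fraction.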

The AGB of bamboo clumps was calculated by the allometric relationship given by Nath et al. [32] and Mazumder et al. [33]. This equation was primarily developed to establish a relationship between culm height, density and AGB in thick-walled bamboo. The equation is as follows:

$$\text{AGB} = 7.5 \times \left(\text{D}^2\text{H}\right)^{0.91} \tag{6}$$

where H is total height of the bamboo culm, and D is DBH of the bamboo culm. 47% of the total biomass stock was considered as total carbon stock [32].
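A short sketch of Equation (6) and the 47% carbon fraction follows. The text does not state the units of D and H for the bamboo model, so the inputs are assumed to be in the units of the cited allometric relationship; the function names are hypothetical.

```python
def bamboo_agb(dbh, height):
    """Eq. (6): AGB = 7.5 * (D^2 * H)^0.91 (input units as in the cited model)."""
    return 7.5 * (dbh ** 2 * height) ** 0.91


def bamboo_carbon(biomass, carbon_fraction=0.47):
    """Carbon stock taken as 47% of the bamboo biomass stock."""
    return carbon_fraction * biomass


# With D = 1 and H = 1 the power term is 1, so AGB equals the coefficient.
unit_agb = bamboo_agb(1.0, 1.0)
carbon_from_100 = bamboo_carbon(100.0)
```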

#### *2.5. Herbaceous Biomass and Litter Analysis*

The herbaceous biomass and litter present in the respective quadrats were measured by placing three litter traps (50 cm × 50 cm) under the tree canopy per quadrat [34], as shown in Figure 2e,f. The collected biomass was then dried in a hot air oven at 65 °C for 48 h. The dry weight of the litter obtained was converted to kg m<sup>−2</sup> by dividing it by the quadrat area (50 cm × 50 cm) and then converted to kg ha<sup>−1</sup>. Litter was assumed to have 40% carbon; hence C stock was calculated by a conversion factor of 0.4.
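The unit conversion described above can be sketched as follows. Averaging across traps before scaling is an assumption, and the three dry weights are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def litter_c_stock_mg_ha(dry_weights_g, trap_side_m=0.5, carbon_fraction=0.4):
    """Convert oven-dry litter weights (g per trap) to a C stock in Mg C per ha."""
    trap_area_m2 = trap_side_m ** 2                      # 0.25 m2 for a 50 cm trap
    mean_kg = sum(dry_weights_g) / len(dry_weights_g) / 1000.0
    kg_per_m2 = mean_kg / trap_area_m2
    kg_per_ha = kg_per_m2 * 10_000                       # 10,000 m2 per ha
    return kg_per_ha / 1000.0 * carbon_fraction          # kg -> Mg, then 40% carbon


# Hypothetical dry weights (g) from three traps in one quadrat.
stock = litter_c_stock_mg_ha([75.0, 80.0, 77.5])
```

For these inputs the mean is 77.5 g per trap, which scales to 3.1 Mg ha<sup>−1</sup> of dry litter and 1.24 Mg C ha<sup>−1</sup>.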

#### *2.6. Soil Characterization*

#### 2.6.1. Soil Carbon Fractionation

The inorganic, biogenic (labile and stable) and coal carbon present in the mine spoil were determined by the sequence of steps given by Ussiri and Lal [10]. The steps followed for sequential extraction of the different forms of soil organic carbon, coal carbon and inorganic carbon in RMS, BC<sub>30</sub>, BC<sub>60</sub> and RF are given in Figure 3.

**Figure 3.** Sequential methods for the determination of total carbon, inorganic carbon and biogenic carbon pools in RMS, BC<sub>30</sub>, BC<sub>60</sub> and RF (n× indicates the number of times the step was repeated).

#### 2.6.2. Soil Physico-Chemical Properties

The soil samples were air dried and passed through a 2-mm sieve to separate the coarse fraction from the fine earth fraction (<2 mm). pH and EC were determined in a soil and water slurry (spoil: water, 1:2.5, *w/v*) by a multiparameter probe (HI-2020, Hanna Instruments, India). Cation exchange capacity (CEC) was determined by the ammonium acetate extraction method [35]. Available-N was determined by a Kjeldahl distillation unit (KJELODIST-EAS VA, Pelican Equipment Inc. India). Available-P was extracted by NaHCO<sub>3</sub> (pH 8.5) and measured by a UV-VIS Spectrophotometer (Shimadzu Corporation, Japan) [36]. Available-K was determined by the 1 N ammonium acetate method using a flame photometer [5]. The C-stock is often misestimated due to the coarse fraction in mine soils, hence only the soil fraction (<2-mm particle size) was considered for bulk density calculation. The bulk density was corrected by the equation given by Ahirwal et al. [6]:

$$\text{Corrected Bulk Density} \left(\text{Mg m}^{-3}\right) = \frac{\text{Sample weight (Mg)} \times \text{Fine earth fraction } \left(\%\right)}{\text{Volume of core } \left(\text{m}^3\right) \times 100} \tag{7}$$

The soil organic carbon (SOC) stock of the study sites was calculated by the relation [6]:

$$\text{SOC stock} \left(\text{Mg} \,\text{ha}^{-1}\right) = \frac{\text{Biogenic carbon pool}(\%) \times \text{BD} \left(\text{Mg} \,\text{m}^{-3}\right) \times \text{T} \left(\text{m}\right) \times 10^4 \left(\text{m}^2 \,\text{ha}^{-1}\right)}{100} \tag{8}$$

where SOC = Soil organic carbon; BD = corrected bulk density; and *T* = depth of the soil layer.
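Equations (7) and (8) can be sketched as two small functions. The corrected bulk density of 0.71 Mg m<sup>−3</sup> and the 0–15 cm depth come from this study; the core weight, fine earth percentage, core volume and biogenic carbon percentage below are hypothetical placeholders.

```python
def corrected_bulk_density(sample_weight_mg, fine_earth_pct, core_volume_m3):
    """Eq. (7): bulk density corrected for the coarse fraction (Mg per m3)."""
    return sample_weight_mg * fine_earth_pct / (core_volume_m3 * 100.0)


def soc_stock_mg_ha(biogenic_c_pct, bd_mg_m3, depth_m):
    """Eq. (8): SOC stock (Mg per ha) from biogenic C (%), corrected BD and depth."""
    return biogenic_c_pct * bd_mg_m3 * depth_m * 1e4 / 100.0


# Hypothetical core: 1.2 kg (0.0012 Mg) of spoil, 60% fine earth, 0.001 m3 volume.
bd = corrected_bulk_density(0.0012, 60.0, 0.001)

# Reported RMS bulk density and sampling depth, hypothetical biogenic C of 1.5%.
soc = soc_stock_mg_ha(1.5, 0.71, 0.15)
```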

#### *2.7. Carbon Sequestration Study*

The total ecosystem C pool was calculated by adding the C-stock associated with (i) AGB and RB, (ii) understory vegetation and the litter layer, and (iii) the biogenic carbon (SOC) stock at 0 to 15 cm depth. Carbon accumulates in vegetative parts such as leaves, twigs and logs, in live and dead roots, and in soil organic matter. The biomass carbon pool varies from plant to plant and with the age of the vegetation. To determine the carbon sequestration rate (Mg C ha<sup>−1</sup> year<sup>−1</sup>), the total C stock (Mg C ha<sup>−1</sup>) was divided by the age of reclamation.
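The bookkeeping described above can be illustrated with the RMS figures reported in this article. The plant biomass stock (12.59 Mg C ha<sup>−1</sup>), the litter and understory stock (1.24 Mg C ha<sup>−1</sup>), the total (30.98 Mg C ha<sup>−1</sup>) and the 8-year age are taken from the text; the SOC term is back-calculated here as the difference and is therefore an assumption rather than a reported value.

```python
def ecosystem_c_pool(tree_c, litter_understory_c, soc):
    """Total ecosystem C pool (Mg C per ha) as the sum of the three stocks."""
    return tree_c + litter_understory_c + soc


def sequestration_rate(total_c, age_years):
    """C sequestration rate (Mg C per ha per year)."""
    return total_c / age_years


def co2_equivalent(total_c, factor=3.67):
    """CO2 sequestered (Mg CO2 per ha), using the 3.67 factor of Eq. (5)."""
    return total_c * factor


tree_c = 12.59                       # reported plant biomass C stock of RMS
litter_c = 1.24                      # reported litter and understory C stock of RMS
soc = 30.98 - tree_c - litter_c      # back-calculated from the reported total

total = ecosystem_c_pool(tree_c, litter_c, soc)
rate = sequestration_rate(total, 8)  # 8 years since reclamation
co2 = co2_equivalent(total)
```

With these inputs the rate is about 3.87 Mg C ha<sup>−1</sup> year<sup>−1</sup>, and the CO<sub>2</sub> figure agrees with the 113.69 Mg ha<sup>−1</sup> reported for RMS to within rounding.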

#### *2.8. Statistics*

One-way ANOVA was used to compare the means of data obtained from RMS, BC<sub>30</sub>, BC<sub>60</sub> and RF. Duncan's multiple range test was used post hoc at the *p* < 0.05 significance level to test for significant differences in the C-stock at each level of analysis. SPSS 23 was used for statistical studies, and MS-EXCEL and ORIGIN Pro-8 were used for graphical representation.
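For readers without SPSS, the F statistic behind a one-way ANOVA can be sketched in plain Python as below. This is only an illustration of the test statistic, not the authors' workflow: it does not compute the *p*-value or Duncan's multiple range test, and the three groups are hypothetical numbers.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical measurements for three treatments (illustrative only).
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

For these groups the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.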

#### **3. Results and Discussions**

#### *3.1. Biochar Characteristics*

The general characteristics of *C. procera* biochar are presented in Table 1. The biochar yield obtained from *C. procera* feedstock was 51.87%. The biochar was alkaline in nature, with a pH of 7.75 and an EC of 4.7 mS cm<sup>−1</sup>. The total elemental C, H and N were 68.25%, 35.39% and 13.62%, respectively. The C/N ratio of 5.01 indicates that the biochar is rich in labile carbon, providing substrate for microbial action in the mine spoil, while the H/C ratio of 0.51 represents its high degree of aromatization. *C. procera* biochar has a high organic carbon content of 42.24%, a porosity of 78% and a low bulk density of 0.25 g cm<sup>−3</sup>.

The high surface area and the porous morphology of the biochar surface can be seen in the FE-SEM images of *C. procera* biochar given in Figure 4a,b. The porous structure provides an enlarged surface area and substrate for microbial action. Several spectral peaks representing various functional groups were observed on the *C. procera* biochar surface (Figure 4c). At a wavenumber of 3391 cm<sup>−1</sup>, an O–H bond is prominent due to the breaking of hydrogen-bonded hydroxyl groups. Other bonds such as –CH<sub>3</sub> (2924 cm<sup>−1</sup>), –CH<sub>2</sub> (2870 cm<sup>−1</sup>) and C=O (1600–1700 cm<sup>−1</sup>), due to the cellulose of the feedstock, are also present. The peaks at 500–600 cm<sup>−1</sup> represent the aromatic carbons on the biochar surface.

**Table 1.** Physico-chemical properties of *C. procera* biochar (*n* = 5, mean ± standard deviation).

**Figure 4.** (**a**) FE-SEM image of *C. procera* biochar at 500× magnification, (**b**) FE-SEM image of *C. procera* biochar at 1800× magnification showing the pore sizes, (**c**) FTIR spectra of *C. procera* biochar showing the surface functional groups.


#### *3.2. Plant Biodiversity and Vegetation Analysis*

*B. arundinacea* clump density was 4033 clumps ha<sup>−1</sup>, whereas the tree density was 3233 trees ha<sup>−1</sup>. The relative distribution of the species in the reclaimed site is shown in Figure 5. *B. arundinacea* clumps are most abundant (56%), followed by *Albizzia* spp. (18%), *D. sissoo* (10.5%), and *Z. mauritiana* (5%). Das and Maiti [37] reported that, at 4 years of age, the same reclaimed mine spoil had a bamboo clump density of 3033 clumps ha<sup>−1</sup> and a tree density of 2500 trees ha<sup>−1</sup>.

**Figure 5.** Relative density of each species in the RMS showing the percentage of each species in the study area.

#### *3.3. Estimation of Biomass Carbon Stock*

Plant biomass and C pool associated with RMS and RF are summarized in Table 2. In RMS, *Albizzia* spp. (4.34 Mg C ha<sup>−1</sup>) had the highest total tree carbon stock, followed by *D. sissoo* (2.91 Mg C ha<sup>−1</sup>) and *B. arundinacea* (2.61 Mg C ha<sup>−1</sup>). The total carbon stock from the plant biomass was 12.59 Mg C ha<sup>−1</sup> and the CO<sub>2</sub> sequestered was 48.17 Mg ha<sup>−1</sup>. The tree carbon stock of the RF (30.63 Mg C ha<sup>−1</sup>) was about 2.4 times that of the RMS (12.59 Mg C ha<sup>−1</sup>) and far higher than that of the UMS (3.52 Mg C ha<sup>−1</sup>). Čížková et al. [38] reported a carbon sequestration potential of 1.6 t ha<sup>−1</sup> y<sup>−1</sup> in reclaimed grasslands in a reclaimed lignite mine. Ahirwal et al. [39] reported the effect of fast-growing trees on soil properties and the ecosystem carbon pool after eight years of afforestation. The study reported greater carbon storage in *D. sissoo* (39 Mg C ha<sup>−1</sup>) compared to *A. lebbeck* (34 Mg C ha<sup>−1</sup>) and *A. procera* (26 Mg C ha<sup>−1</sup>). In a 16-year-old reclaimed coal mine site, Ahirwal and Maiti [40] reported that the tree carbon stock was 75% of that of the reference forest site, and that plantation of multipurpose tree species improved mine spoil fertility and facilitated the natural growth of indigenous tree species. Due to the plant–soil interaction in a coal mine spoil, the roots of the re-vegetated plants alter soil structure and function [8]. Although the carbon stock associated with RMS was less than that of RF, the growth of native species indicates that proper reclamation technology can influence ecosystem development. Thus, the increase in C-stock indicates that plantation is a successful means of mine reclamation, which helps in achieving SDG 13 and 15.


**Table 2.** Total biomass, tree carbon stock and CO<sub>2</sub> sequestered in various coal mine spoils of the world and their values in the current study (RMS: reclaimed mine spoil and RF: reserve forest).

> \* CO<sup>2</sup> sequestered (Mg C ha−<sup>1</sup> ) = Tree carbon stock (Mg C ha−<sup>1</sup> ) × 44/12.

#### *3.4. Herbaceous Biomass and Litter Analysis C-Stock*

Litter consists of twigs, plant debris, foliage and branches, which possess high nutrient content. It is one of the most important sources of organic matter in mine spoils, and litter decomposition contributes to nutrient recycling and improvement in soil fertility [36]. During reclamation of a degraded mine spoil, an increase in litter C improves the SOC of the ecosystem and can be an indicator of restoration success. Litter and understory carbon stocks in RMS, BC<sub>30</sub>, BC<sub>60</sub>, RF and UMS are presented in Figure 6. The litter and understory carbon stock in RMS was 1.24 Mg ha<sup>−1</sup>, while higher values of 1.64 Mg ha<sup>−1</sup> and 1.73 Mg ha<sup>−1</sup> were observed for BC<sub>30</sub> and BC<sub>60</sub>, respectively. The biochar-amended plots had a carbon stock comparable to the RF (1.79 Mg ha<sup>−1</sup>) and significantly higher (*p* < 0.05) than UMS (0.5 Mg ha<sup>−1</sup>).

**Figure 6.** Litter and understory carbon stock in reclaimed mine spoil (RMS), biochar treatment at 30 t ha<sup>−1</sup> (BC<sub>30</sub>) and 60 t ha<sup>−1</sup> (BC<sub>60</sub>), reserved forest (RF) and unreclaimed mine spoil (UMS) (*n* = 24).


#### *3.5. Mine Spoil Properties*

#### 3.5.1. Inorganic, Biogenic and Coal Carbon Estimation

Carbon fractions in RMS, BC<sub>30</sub>, BC<sub>60</sub>, RF and UMS are presented in Figure 7a. The total soil carbon at BC<sub>30</sub> (36.3 g C kg<sup>−1</sup>) and BC<sub>60</sub> (40 g C kg<sup>−1</sup>) was significantly higher (*p* < 0.05) than in RMS (21 g C kg<sup>−1</sup>), comparable to RF (33 g C kg<sup>−1</sup>), and low in UMS (12 g C kg<sup>−1</sup>). Although there was no significant difference between the soil inorganic carbon of RMS (1.9 g C kg<sup>−1</sup>), RF (2 g C kg<sup>−1</sup>) and UMS (0.7 g C kg<sup>−1</sup>), the inorganic fraction increased to 3.5 g C kg<sup>−1</sup> and 4.5 g C kg<sup>−1</sup> at BC<sub>30</sub> and BC<sub>60</sub>, respectively. Average coal carbon was higher in RMS, BC<sub>30</sub>, BC<sub>60</sub> and UMS compared to RF. A greenhouse experiment conducted by Rodríguez-Vila et al. [43] on copper mine spoils reported a range of 20–207 g C kg<sup>−1</sup> for total soil carbon and 3–27 g C kg<sup>−1</sup> for inorganic carbon at biochar application rates of 20–100%.

**Figure 7.** (**a**) Comparison of different carbon fractions in reclaimed mine spoil (RMS), biochar treatment at 30 t ha<sup>−1</sup> (BC<sub>30</sub>) and 60 t ha<sup>−1</sup> (BC<sub>60</sub>), reserved forest (RF) and unreclaimed mine spoil (UMS). (**b**) Comparison of labile and recalcitrant carbon fractions in RMS, BC<sub>30</sub>, BC<sub>60</sub>, RF and UMS.


The biogenic carbon fraction is a complex pool which can be broadly divided into labile and recalcitrant carbon pools. The labile pool has a residence time of years to a few decades, while the recalcitrant carbon pool can remain in the soil for hundreds to thousands of years [37]. Figure 7b shows the labile and the recalcitrant carbon present in RMS, BC<sub>30</sub>, BC<sub>60</sub> and RF. The labile fraction was 56% and 51% of the biogenic carbon pool for RMS and RF, respectively. The labile carbon trend was of the order UMS > RMS > RF > BC<sub>30</sub> > BC<sub>60</sub>. The recalcitrant fraction was 44% and 49% of the biogenic carbon pool for BC<sub>30</sub> and BC<sub>60</sub>, respectively. The recalcitrant carbon trend was of the order BC<sub>60</sub> > BC<sub>30</sub> > RF > RMS > UMS. Labile C pools can have a large impact on soil-biochar interactions in the short term (~6 months), whereas the recalcitrant carbon pool influences soil properties and soil function over a longer period of time [24,26]. Sub-fractions of soil organic carbon are indicators of soil fertility and are instrumental in understanding the influence of a management practice. In a study conducted on a 10-year-old reclaimed coal mine spoil, Das and Maiti [44] reported that biogenic C constituted 45–66% of total soil carbon in RMS. Fidel et al. [26] reported that biochar improves the soil inorganic carbon by 0.023–0.045 mg C kg<sup>−1</sup> and organic carbon by 0.001–0.0069 mg C kg<sup>−1</sup> in eroded soil. The study also reported that labile biochar pools are stabilized by the recalcitrant C pool. Thus, because of its higher recalcitrant fraction, biochar application can play an instrumental role in increasing the C-stock and help achieve UN SDG 13 and 15.

#### 3.5.2. Other Physico-Chemical Properties

The properties of RMS, BC<sub>30</sub>, BC<sub>60</sub>, RF and UMS are summarized in Table 3. The soil fraction was significantly (*p* < 0.05) higher in the reference forest than in the 8-year-old RMS and UMS, and biochar application had no effect on the soil fractions. The pH of RMS was neutral; biochar application resulted in an alkaline mine spoil, while the RF and UMS were slightly acidic. The EC of the RMS was 0.16 dS cm<sup>−1</sup>, compared to 0.11 dS cm<sup>−1</sup> in RF soil, 0.17 dS cm<sup>−1</sup> in UMS, 0.09 dS cm<sup>−1</sup> in BC<sub>30</sub> and 0.1 dS cm<sup>−1</sup> in BC<sub>60</sub>. CEC was higher in RF (13.1 cmol kg<sup>−1</sup>) than in RMS (8.22 cmol kg<sup>−1</sup>). Moisture content in RF was 26% higher than that of RMS, while BC<sub>60</sub> and BC<sub>30</sub> improved the moisture content by 33% and 55%, respectively (*p* < 0.05). Available N and P differed significantly (*p* < 0.05) between the RMS and RF; available N and P in RF were 26% and 47% higher, respectively. The exchangeable potassium was 55.22 mg kg<sup>−1</sup> in RF and 102.1 mg kg<sup>−1</sup> in the 8-year-old RMS. BC<sub>30</sub> and BC<sub>60</sub> improved the NPK values in the mine spoil significantly (*p* < 0.05), which helps in vegetation growth. The corrected bulk density was higher in the RF site (1.34 Mg m<sup>−3</sup>) compared to the RMS (0.71 Mg m<sup>−3</sup>), BC<sub>30</sub> (0.65 Mg m<sup>−3</sup>) and BC<sub>60</sub> (0.63 Mg m<sup>−3</sup>). Ghosh et al. [11] reported a threefold increase in organic carbon, a twofold increase in CEC and a halving of bulk density following *Lantana* biochar application in a coal mine spoil. Thus, biochar application can effectively improve the physico-chemical properties of the mine spoil, which will help accelerate the process of soil formation in an RMS and increase the C-stock to near the RF level. The ameliorated mine spoil will support plant growth and ecosystem development, which in the long run will help in achieving UN SDGs 13 and 15.

#### *3.6. Total C-Pool*

The total C stock, CO<sub>2</sub> sequestered and rate of C sequestration in RMS, BC<sub>30</sub>, BC<sub>60</sub>, RF and UMS are presented in Table 4. The C stock of RF (72.11 Mg C ha<sup>−1</sup>) was more than twice that of the RMS (30.98 Mg C ha<sup>−1</sup>) and about 5 times that of the UMS (13.92 Mg C ha<sup>−1</sup>). Application of biochar at 30 t ha<sup>−1</sup> improved the C-stock of RMS by 33%, although it remained 42% lower than that of the RF. Similarly, biochar at 60 t ha<sup>−1</sup> improved the C-stock of RMS by 47%, although it remained 36% lower than the RF C-stock. The CO<sub>2</sub> sequestered followed the sequence RF (264.64 Mg CO<sub>2</sub> ha<sup>−1</sup>) > BC<sub>60</sub> (168.22 Mg CO<sub>2</sub> ha<sup>−1</sup>) > BC<sub>30</sub> (151.70 Mg CO<sub>2</sub> ha<sup>−1</sup>) > RMS (113.69 Mg CO<sub>2</sub> ha<sup>−1</sup>) > UMS (48.37 Mg CO<sub>2</sub> ha<sup>−1</sup>). Mukhopadhyay and Masto [45] reported that yard waste biochar improved the stable carbon pool in biochar-amended mine spoil to 0.873 g CO<sub>2</sub>–C kg<sup>−1</sup>, compared to 0.03 g CO<sub>2</sub>–C kg<sup>−1</sup> in mine spoil. Xu et al. [46] reported that the application of bamboo leaf biochar at rates of 5 and 15 Mg ha<sup>−1</sup> increased the ecosystem carbon stock by 1486.31% and 252.98%, respectively. This increase could be due to the increase in vegetative cover with time. Plant growth can play an important role in decomposition and humus layer formation; the humus later turns into soil organic matter, which increases the organic carbon pool of the soil and provides nutrients to the reclaimed vegetation.

**Table 3.** Characteristics of RMS (reclaimed mine spoil), BC<sub>30</sub> (biochar @ 30 t ha−<sup>1</sup>), BC<sub>60</sub> (biochar @ 60 t ha−<sup>1</sup>), RF (reference forest) soil and unreclaimed mine spoil (UMS) (*n* = 24, mean ± standard deviation; within each row, values with the same letter are not significantly different, *p* < 0.05 with Duncan's multiple range test).


**Table 4.** Effect of biochar application on carbon stocks of different landforms. Comparison of total C stock, CO<sub>2</sub> sequestered and rate of C accumulation in 8 year old RMS (reclaimed mine spoil), BC<sub>30</sub> (biochar @ 30 t ha−<sup>1</sup>), BC<sub>60</sub> (biochar @ 60 t ha−<sup>1</sup>), RF (reference forest) and unreclaimed mine spoil (UMS).


Application of biochar increased the C-stock of RMS by 33% and 47% at application rates of 30 t ha−<sup>1</sup> and 60 t ha−<sup>1</sup>, respectively. As mentioned earlier, the recalcitrant fraction of this C-stock is 44% and 49% of the biogenic carbon pool for BC<sub>30</sub> and BC<sub>60</sub>, respectively. Thus, it can be concluded that the increase in the C-pool in the 6 months of the study was due to the labile fraction. The remaining 44% and 49% will remain in the form of recalcitrant carbon in the RMS and will thus aid in achieving the climate action goals of the SDGs.
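The relation between the C stock and the CO<sub>2</sub> sequestered values quoted in this section can be sketched as a simple unit conversion. The snippet below is only an illustration: the stoichiometric factor 44/12 (molar mass of CO<sub>2</sub> over atomic mass of C) is an assumption, since the conversion factor is not stated in this section, and the rounded factor 3.67 is an inference from the reported RMS and RF values. The function names are hypothetical.

```python
# Assumed conversion: 1 unit of C corresponds to 44/12 units of CO2.
C_TO_CO2 = 44.0 / 12.0


def co2_equivalent(c_stock_mg_ha, factor=C_TO_CO2):
    """Convert a carbon stock (Mg C per ha) into CO2 sequestered (Mg per ha)."""
    return c_stock_mg_ha * factor


def percent_change(value, baseline):
    """Relative difference of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# C stocks reported in the text (Mg C per ha).
rms, rf = 30.98, 72.11

# With the rounded factor 3.67 the reported CO2 values for RMS and RF
# (113.69 and 264.64) are reproduced to within rounding.
rms_co2 = co2_equivalent(rms, 3.67)
rf_co2 = co2_equivalent(rf, 3.67)

# RF stock relative to RMS stock.
rf_vs_rms = percent_change(rf, rms)
```

Under these assumptions the RF stock comes out roughly 133% above the RMS stock, that is, about 2.3 times as large.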

#### **4. Future Recommendations**

Through the course of this study, future goals and recommendations for continuing research in this field are as follows:


#### **5. Conclusions**

Plantation of hardy species in coal mine spoil restores the derelict ecosystem which promotes natural colonization of indigenous species. In the current study, 8 year old RMS effectively increased the biomass, litter and biogenic carbon stock in the soil. The rate of C accumulation for 8 year old RMS was calculated to be 3.92 Mg C ha−<sup>1</sup> year−<sup>1</sup> . Application of *C. procera* biochar in the RMS improved the soil physio-chemical properties. The inorganic and biogenic carbon pool, especially the recalcitrant pool, was improved by biochar application, suggesting that biochar can be an effective mode of enhanced carbon fixation in the spoil, along with plantation activities. A mere 6 month application period increased the C-stock by 36–42%, thus its recalcitrant carbon content can be fixed in the mine spoil for a longer period of time. This proves that biochar has tremendous potential in fixing carbon, along with forestry based reclamation of coal mine spoil. Thus, carbon stock increases with age of reclamation, and biochar application can increase the carbon stock close to reference forest site level. As the biochar-plantation synergy can both sequester carbon and also promote biodiversity, it can be an effective tool for achieving United Nations SDG 13 and 15.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/land10111112/s1, Table S1. List of species recorded in quadrats of RMS sites showing some biodiversity parameters.

**Author Contributions:** Both authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by D.G., S.K.M. was the overall supervisor of the work. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** Sincere thanks to Indian Institute of Technology (Indian School of Mines), Dhanbad and MHRD, GOI for providing fellowship to the first author (17DR000426). We would also like to thank the reviewers and the editors for their insightful comments during the review of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Huafang Huang 1,2,3, Xiaomao Wu 4,5 and Xianfu Cheng 1,3,\***


**Abstract:** This study aimed to respond to the national "carbon peak" mid-and long-term policy plan, comprehensively promote energy conservation and emission reduction, and accurately manage and predict carbon emissions. Firstly, the proposed method analyzes the Yangtze River Economic Belt as well as its "carbon peak" and carbon emissions. Secondly, a support vector regression (SVR) machine prediction model is proposed for the carbon emission information prediction of the Yangtze River Economic Zone. This experiment uses a long short-term memory neural network (LSTM) to train the model and realize the experiment's prediction of carbon emissions. Finally, this study obtained the fitting results of the prediction model and the training model, as well as the prediction results of the prediction model. Information indicators such as the scale of industry investment, labor efficiency output, and carbon emission intensity that affect carbon emissions in the "Yangtze River Economic Belt" basin can be used to accurately predict the carbon emissions information under this model. Therefore, the experiment shows that the SVR model for solving complex nonlinear problems can achieve a relatively excellent prediction effect under the training of LSTM. The deep learning model adopted herein realized the accurate prediction of carbon emission information in the Yangtze River Economic Zone and expanded the application space of deep learning. It provides a reference for the model in related fields of carbon emission information prediction, which has certain reference significance.

**Keywords:** carbon emission; SVR; LSTM neural network; carbon emission prediction

#### **1. Introduction**

In recent years, with the frequent occurrence of extremely severe weather due to global warming, countries around the world have begun to pay attention to the imbalance of carbon emissions caused by the emissions of greenhouse gases such as carbon dioxide (CO2) [1]. Excessive CO<sup>2</sup> and other greenhouse gas emissions have caused irreparable damage to the environment [2]. Meanwhile, the process of social development cannot avoid the problem of carbon emissions, so the real-time prediction and monitoring of carbon emissions information has become extremely crucial [3]. Many scholars have performed a lot of research in the field of carbon emissions.

**Citation:** Huang, H.; Wu, X.; Cheng, X. The Prediction of Carbon Emission Information in Yangtze River Economic Zone by Deep Learning. *Land* **2021**, *10*, 1380. https://doi.org/10.3390/land10121380

Academic Editors: Marina Cabral Pinto, Amit Kumar and Munesh Kumar

Received: 15 November 2021 Accepted: 12 December 2021 Published: 13 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The earliest representative studies abroad mainly used the factor decomposition method, index decomposition method, input–output method, and combination model forecasting. Scholars have proposed an improved artificial neural network structure, the cuckoo optimization algorithm neural network (COANN), which is optimized by the cuckoo optimization algorithm (COA). The performance of COANN is evaluated by the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (CC) between the model output and the actual data set. The COANN prediction model can predict the world's CO<sub>2</sub> emissions by 2050 [4]. Some researchers proposed a multi-objective predictive energy management strategy for residential grid-connected hybrid energy systems via machine learning technology. The proposed strategy includes three levels of control: (1) a logical level that manages computational load and accuracy; (2) a dual prediction model based on a residual causal dilated convolutional network, used for energy production and system power load; and (3) multi-objective optimization for effective transactions, which provides energy to the public grid through battery charging scheduling [5]. It was proposed that non-intrusive load monitoring (NILM) can save electricity by improving the awareness of behavior changes, thereby reducing carbon dioxide emissions into the environment. Using data published in Malaysia from 1996 to 2018, a predictive model was established and scenario simulations were carried out. Malaysia's public database was also used for predicting the impact of CO<sub>2</sub> emissions and NILM on environmental degradation from 2019 to 2030 [6]. Some scholars have used Eviews software to analyze the carbon emission data of Beijing, Henan, Guangdong, and Zhejiang from 1997 to 2017. They also used differencing to achieve stationarity, moving averages, and the substitution of strong impact points for data preprocessing. Through model identification, parameter estimation, and model testing, they established autoregressive integrated moving average (ARIMA) models to predict carbon emissions in the four regions [7]. Other authors argued that it is important to objectively evaluate the impact of relevant factors on carbon emissions. They proposed a modified production theory decomposition analysis (PDA) model under the semi-disposable hypothesis, and correspondingly decomposed the carbon emission changes of China's thermal power generation industry [8]. 
Some scholars have used the Lasso regression model to screen out eight significant factors affecting carbon emissions, and used the BP neural network model to predict the carbon emissions of Jiangsu Province from 2019 to 2030. They used artificial neural networks (ANN) to develop carbon emission intensity prediction models for Australia, Brazil, China, India, and the United States. Nine parameters that play an important role in the intensity of carbon emissions were selected as input variables. After many iterations, the best model was selected for each country by predefined criteria. They used a 9–5–1 multilayer perceptron with a backpropagation algorithm to build, validate, and train the model. The results of the verification model show that the error between the predicted value and the actual value is approximately 0, and the proposed ANN model can accurately predict carbon emissions [9,10]. Wang et al. (2021) used the random forest (RF) machine learning algorithm to analyze the relationship between urban factors and carbon emissions using real data from Chinese cities [11]. Yan et al. (2021) proposed a new integrated inversion model. This model was used for the intelligent assessment and prediction of water, carbon, and ecological footprint based on integrated multi-task machine learning (MML) and multi-model stack (MMS) algorithms. The accuracy and generalization ability of the model is further explained through the three largest urban agglomerations in the middle reaches of the Yangtze River [12]. Huang et al. (2021) proposed a new method to simulate the dual relationship between emission inventory and pollution concentration for emission inventory estimation [13].

Deep learning has become the latest means of studying carbon emissions. At present, most studies use a single algorithm and do not consider combining multiple algorithms. The prediction accuracy of the established model is therefore key to judging whether an algorithm is suitable for carbon emission prediction. The long short-term memory (LSTM) neural network is very effective for time-dependent problems, and the time factor has a strong impact on carbon emissions. In addition, carbon emission is a complex non-linear problem, and a large amount of data needs to be classified and processed. Because the error term in the LSTM prediction is large, slack variables are introduced through the SVR model to correct the larger prediction errors of the LSTM model; the resulting LSTM-SVR hybrid model achieves a better prediction effect. The innovation lies in the indexing of "carbon emissions" information and expanding the area of data availability. Meanwhile, the LSTM is revised by introducing slack variables into the SVR model. This research has a certain reference value for energy-saving development entities in the Yangtze River Economic Zone in implementing energy-saving emission reduction and refined carbon emission control forecasts.

#### **2. Related Concepts and Algorithm Analysis**

#### *2.1. Yangtze River Economic Belt*

The Yangtze River Economic Belt comprises 11 provinces and cities: Guizhou, Sichuan, Yunnan, and Chongqing in the western region; Hubei, Hunan, and Jiangxi in the central region; and Anhui, Jiangsu, Zhejiang, and Shanghai in the eastern region. It covers an area of approximately 2.05 million square kilometers, has a total population of approximately 600 million, and represents 40% of the country's GDP. It is regarded as a dynamic economic belt, second only to the coastal economic belt [14]. Its rapid economic development has been accompanied by increasingly prominent ecological and environmental problems in the Yangtze River Basin, such as soil erosion, floods, and ecological imbalances along the river. Due to the increasingly prominent environmental resource problems of the Yangtze River, the protection of its natural ecology has become ineffective, restricting the growth rate of the Yangtze River Economic Belt [15]. Figure 1 shows a sketch of the regional locations of provinces and cities in the Yangtze River Economic Belt.

**Figure 1.** Sketch map of the geographic regions of provinces and cities in the Yangtze River Economic Belt.

#### *2.2. Definition and Analysis of Carbon Sources*

#### 2.2.1. Definition

The "source" of greenhouse gases, in layman's terms, is the activity of emitting gases into the atmosphere. "Greenhouse gases" can make the Earth's surface temperature higher, generally by absorbing and re-emitting infrared radiation [16]. Greenhouse gases mainly include CO<sub>2</sub>, ozone (O<sub>3</sub>), and methane (CH<sub>4</sub>). In addition, they also include man-made greenhouse gases such as hydrofluorocarbons (HFCs) [17], of which CO<sub>2</sub> has the most obvious warming effect. The warming effects and life cycles of some greenhouse gases are shown in Figure 2.

**Figure 2.** Diagram of the warming effects and life cycles of some greenhouse gases.

#### 2.2.2. Analysis

"Carbon source", as the name implies, is a gas component that increases the amount of CO<sub>2</sub> in the Earth's atmosphere [18]. These gas components enter the atmosphere from the surface of the earth or are formed by the chemical conversion of CO<sub>2</sub> in the atmosphere. Corresponding to the "carbon source" is the "carbon sink". Simply put, the "carbon sink" refers to the mechanism of removing greenhouse gases from the atmosphere, such as by means of the photosynthesis process of plants. From this point of view, the reduction in "carbon sinks" will also lead to an increase in carbon emissions [19]. The specific classification of "carbon sources" is shown in Figure 3.

**Figure 3.** Schematic diagram of the classification of "carbon sources".

#### 2.2.3. Causes of Carbon Emissions

#### Carbon Emissions in the Process of Urbanization

The factors inducing high carbon emissions in the process of China's urbanization can be divided into two categories. The first comprises economic factors such as the expansion of infrastructure construction, the growth of residents' consumption, and the transformation of land use patterns. The second comprises policy incentives that lead to phenomena such as short-lived construction, major demolition and construction, and low-density urban sprawl. On the one hand, new construction in the process of urbanization constitutes an incremental part of carbon emissions. On the other hand, repeated construction and wasted building energy have aggravated high energy consumption and high carbon emissions in the process of urbanization. The combination of economic and policy factors in China's urbanization has made high carbonization more and more obvious. Firstly, industrial production has brought about an increase in carbon emissions. Rapid industrial development is the main driving force for China's carbon emissions growth. In addition, the embodied carbon in China's export trade plays an important role in the rise of China's carbon emissions. China's exports are dominated by processing trade, with high energy consumption, which is also one of the important factors behind China's energy demand growth. Secondly, carbon emissions from the construction industry have rapidly increased, and the increase in building areas has also brought more carbon emissions. Finally, transportation carbon emissions have rapidly increased, and the increase in transportation demand has led to an upward trend in the total energy consumption of transportation and its share. In recent years, there have been significant changes in China's transportation, road transportation infrastructure, and residents' travel. With the acceleration of urban logistics circulation, the freight capacity of cities and towns has gradually strengthened. Urban expansion will increase the distance traveled by residents, and the level of urban motorization will thus rapidly increase.

#### Carbon Emissions from Animals and Plants

The respiration of plants and animals produces CO<sub>2</sub>, while plants convert CO<sub>2</sub> into organic matter and oxygen through photosynthesis. Carbon exists in the form of CO<sub>2</sub> in nature, and plant straw is a biological resource produced in the process of crop production. Burning plant straw releases a large amount of carbon.

#### *2.3. Algorithmic Analysis of Carbon Emissions*

The classification of "carbon sources" shows that CO<sub>2</sub> emissions involve a wide range of sources and many uncertain factors. The amount of CO<sub>2</sub> emitted cannot be obtained directly from monitoring instruments and must instead be estimated. Common estimation methods are described below.

#### 2.3.1. Oak Ridge National Laboratory (ORNL)

The Oak Ridge National Laboratory was established by the US Department of Energy in 1943 and is the world's largest scientific energy research laboratory. In 1990, members of the laboratory proposed a method for estimating CO<sub>2</sub> emissions from fossil fuel combustion:

Carbon emissions from coal = coal consumption × 0.982 × 0.733

Fuel oil carbon emissions = standard coal equivalent × 0.982 × 0.733 × 0.813

Gas carbon emissions = standard coal equivalent × 0.982 × 0.733 × 0.561

In these equations, 0.982 is the effective oxidation fraction and 0.733 is the carbon content per ton of standard coal. The factor 0.813 means that, for the same heat energy obtained, the CO<sub>2</sub> released by fuel oil is 0.813 times that released by coal; likewise, 0.561 means that, for the same heat energy, the CO<sub>2</sub> released by natural gas is 0.561 times that released by coal [20].
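As a minimal sketch, the three ORNL expressions can be written as one function with a fuel-specific multiplier. It assumes that all consumption figures, including coal, are expressed in tons of standard coal equivalent; the function and dictionary names are hypothetical and the input of 100 is purely illustrative.

```python
OXIDATION_FRACTION = 0.982      # effective oxidation fraction
CARBON_PER_TCE = 0.733          # carbon content per ton of standard coal
# CO2 released relative to coal for the same heat energy.
RELATIVE_TO_COAL = {"coal": 1.0, "fuel_oil": 0.813, "gas": 0.561}


def ornl_carbon_emissions(consumption_tce, fuel):
    """Carbon emissions for a fuel, given consumption in standard coal equivalent."""
    return (consumption_tce * OXIDATION_FRACTION * CARBON_PER_TCE
            * RELATIVE_TO_COAL[fuel])


# Illustrative input: 100 tons of standard coal equivalent per fuel.
emissions = {fuel: ornl_carbon_emissions(100.0, fuel) for fuel in RELATIVE_TO_COAL}
```

For the same energy input, the estimate for fuel oil and gas is simply the coal estimate scaled by 0.813 and 0.561, respectively.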

#### 2.3.2. Logistic Model

Most economic indicators are increasing functions of time. Constraints such as the environment restrict their growth and gradually slow it down, so their graphs resemble a flattened S-shaped curve (logistic curve). The relationship between carbon emissions and time also closely resembles an S-shaped curve [21].
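The S-shaped behavior described above can be illustrated with the standard three-parameter logistic function. The text does not give a functional form, so the parameterization below (saturation level `capacity`, `growth_rate`, and `midpoint`) is an assumption, and the numerical values are illustrative rather than fitted to any data.

```python
import math


def logistic(t, capacity, growth_rate, midpoint):
    """Standard logistic curve: slow start, rapid growth, then saturation."""
    return capacity / (1.0 + math.exp(-growth_rate * (t - midpoint)))


# Illustrative trajectory: emissions saturating at 100 units, midpoint at t = 10.
trajectory = [logistic(t, capacity=100.0, growth_rate=0.5, midpoint=10.0)
              for t in range(0, 21)]
```

The curve reaches half of `capacity` at `midpoint` and its growth rate declines afterwards, which is the slowing growth that the paragraph attributes to environmental constraints.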

#### 2.3.3. System Dynamics Model

The system dynamics Stella software is used to construct the energy consumption model. The energy consumption model constructed by Stella software can obtain a simulation estimation model of energy consumption by inputting conditions such as GDP, population, and the proportion of output value of each industry. This model can effectively overcome errors caused by missing data [22].

The model obtains carbon emissions according to the following equation: carbon emissions = energy consumption × carbon emission factor × (1 − carbon sequestration rate) × oxidation rate.
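The equation above translates directly into code. The sketch below is not the Stella model itself, only the final emission formula; the function name and the input values are hypothetical.

```python
def system_dynamics_emissions(energy_consumption, emission_factor,
                              sequestration_rate, oxidation_rate):
    """carbon emissions = energy consumption x emission factor
    x (1 - carbon sequestration rate) x oxidation rate."""
    if not 0.0 <= sequestration_rate <= 1.0:
        raise ValueError("sequestration_rate must lie in [0, 1]")
    return (energy_consumption * emission_factor
            * (1.0 - sequestration_rate) * oxidation_rate)


# Illustrative values only.
example = system_dynamics_emissions(1000.0, 0.7, 0.1, 0.98)
```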

#### 2.3.4. Input–Output Analysis (IOA)

IOA combines input–output tables and uses mathematical methods to build models. IOA calculates carbon emissions by establishing relationships through models [23]. The principle is as follows:

First: Calculate the carbon emissions of energy consumption in various sectors.


$$ce_j = f_j \cdot e_j + f_j \cdot \left(\sum_{i=1}^{n} a_j b_{ij}\right) \tag{1}$$

Second: Calculate the embodied carbon emission coefficient in the production process. A prerequisite must first be met: the products of the *i*-th sector produce CO<sub>2</sub> during the production process. Then, since the *j*-th sector consumes the *i*-th sector's products, the *j*-th sector produces implicit carbon emissions during its production process, as expressed in Equation (2):

$$\begin{cases} i = j, & ce_j = g_j + g_j \cdot b_{ij} \\ i \ne j, & ce_j = g_j \cdot b_{ij} \end{cases} \tag{2}$$

In Equation (2), *g<sub>j</sub>* = *ω<sub>i</sub>*·*Q<sub>i</sub>*/*X<sub>i</sub>*, where *ω<sub>i</sub>* represents the carbon emissions from the industrial production process per unit physical quantity of product in sector *i* (tons of CO<sub>2</sub>/ton of product), *Q<sub>i</sub>* represents the product output of sector *i* (10,000 tons), and *X<sub>i</sub>* represents the total output of sector *i* (CNY 10,000). Combining the two equations yields *ce<sub>j</sub>* = *g<sub>j</sub>*·*c<sub>ij</sub>*.

Therefore, Equation (3) is the complete carbon emission coefficient of each sector:

$$ce_j(\text{complete}) = f_j \cdot \left(\sum_{i=1}^{n} e_i \cdot b_{ij}\right) + g_j \cdot c_{ij} \tag{3}$$

Based on Equation (3), the total carbon emissions into the atmosphere can be calculated, as shown in Equation (4):

$$CE = \sum_{j=1}^{n} ce_j \cdot Y_j \tag{4}$$

(*Y<sub>j</sub>* is the final use of sector *j* (CNY 10,000).)

The CO<sub>2</sub> estimate provided in the IPCC National Greenhouse Gas Inventory Guidelines for energy-based CO<sub>2</sub> is shown in Equation (5):

$$\text{CO}_2 = \sum_{i=1}^{n} \text{CO}_{2,i} = \sum_{i=1}^{n} E_i \times NCV_i \times CEF_i \times COF_i \times (44/12) \tag{5}$$

In Equation (5), CO<sub>2</sub> is the estimated amount of CO<sub>2</sub> emissions; *i* indexes the energy type; *E<sub>i</sub>* represents the consumption of each energy type; *NCV<sub>i</sub>* is the average low calorific value of each energy source; *CEF<sub>i</sub>* represents the carbon emission factor per calorific value provided by the IPCC Greenhouse Gas Inventory; and *COF<sub>i</sub>* stands for the carbon oxidation factor. The factor 44/12 is the ratio of the molecular mass of CO<sub>2</sub> to the atomic mass of carbon, that is, the amount of CO<sub>2</sub> into which 1 unit of carbon can be converted [24].
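Equation (5) is a sum over energy types, which the following sketch makes explicit. The tuple layout and the numerical inputs are hypothetical placeholders chosen for easy checking; they are not IPCC default values.

```python
CO2_PER_C = 44.0 / 12.0  # mass of CO2 produced per unit mass of carbon


def ipcc_co2(energy_sources):
    """Equation (5): sum of E_i * NCV_i * CEF_i * COF_i * (44/12).

    energy_sources is an iterable of (E, NCV, CEF, COF) tuples, one per
    energy type, in mutually consistent units.
    """
    return sum(e * ncv * cef * cof * CO2_PER_C
               for e, ncv, cef, cof in energy_sources)


# Placeholder inputs (not real inventory data).
sources = [
    (1.0, 1.0, 12.0, 1.0),   # contributes 12 units of carbon -> 44 units of CO2
    (2.0, 3.0, 1.0, 0.5),    # contributes 3 units of carbon -> 11 units of CO2
]
total_co2 = ipcc_co2(sources)
```

Because the formula is additive, the contribution of each energy type can be inspected separately by passing a one-element list.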

#### *2.4. Temporal and Spatial Characteristics of Carbon Emissions*

According to relevant data, the industrial structure, population size, economic development level, opening-up level, carbon emission intensity, etc. are all factors that affect the scale of carbon emission [25]. Carbon emissions have formed a time–space effect along with changes in time series and spatial sequences. When studying carbon emissions, it is necessary to consider the following: 1—spatial connections and changes in different economic regions; and 2—carbon emission prediction requires analysis and research on the spatial characteristics of carbon emission.

#### *2.5. Analysis of Deep Learning Algorithms*

#### (1) Recurrent Neural Network (RNN)

RNN is a neural network with cyclic characteristics. It can perform calculations on the characteristics of carbon emission time series data because it can continuously circulate information and has a short-term information memory function [26]. The most basic RNN network is composed of multiple neuron nodes, and each node has an activation function with time as a variable to enhance adaptability. Meanwhile, all function parameters of the node can be adjusted in real time. The expanded diagram of the RNN network structure is shown in Figure 4.

**Figure 4.** Expanded schematic diagram of an RNN network structure.

In Figure 4, h<sub>t</sub> represents the output of the hidden layer node at time t; h<sub>t−1</sub> represents the output at the previous moment; y<sub>t</sub> represents the output vector; x<sub>t</sub> represents the input vector; and h<sub>w</sub>, x<sub>w</sub>, and y<sub>w</sub> represent the weight vectors applied to the hidden state, the input, and the output, respectively.
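The unrolled structure can be sketched in a few lines. The text does not state the update rule, so the common Elman form h<sub>t</sub> = tanh(W<sub>x</sub>x<sub>t</sub> + W<sub>h</sub>h<sub>t−1</sub> + b), y<sub>t</sub> = W<sub>y</sub>h<sub>t</sub> is assumed here, with `W_x`, `W_h`, and `W_y` standing in for x<sub>w</sub>, h<sub>w</sub>, and y<sub>w</sub>; the bias `b` and all numerical values are illustrative additions.

```python
import math


def dot(w, v):
    """Inner product of two equal-length lists."""
    return sum(wi * vi for wi, vi in zip(w, v))


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """Assumed Elman update: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    return [math.tanh(dot(wx, x_t) + dot(wh, h_prev) + bi)
            for wx, wh, bi in zip(W_x, W_h, b)]


def rnn_forward(xs, h0, W_x, W_h, b, W_y):
    """Run the recurrence over a sequence and collect the outputs y_t = W_y h_t."""
    h = h0
    outputs = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)          # the same weights are reused at every step
        outputs.append([dot(wy, h) for wy in W_y])
    return outputs, h


# One input feature, one hidden unit, one output; values are illustrative.
ys, h_last = rnn_forward(xs=[[0.5], [0.5]], h0=[0.0],
                         W_x=[[1.0]], W_h=[[1.0]], b=[0.0], W_y=[[2.0]])
```

The hidden state after the second step depends on the first input through `h_prev`, which is the short-term memory the section refers to.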

(2) Long Short-Term Memory (LSTM)

LSTM is a classic variant of RNN. It has powerful classification and prediction capabilities and can handle relatively long time intervals and delays. Importantly, LSTM can mitigate the vanishing and exploding gradient problems that arise during long-sequence training [27].

The LSTM neural network structure uses a gate control unit. The neuron of each cell contains a forget gate, input gate, and output gate to strengthen the network structure, as shown in Figure 5.

**Figure 5.** Schematic diagram of an LSTM neural network structure.

In the network structure diagram of LSTM, the cell state runs along the entire chain like a conveyor belt. Because the cell state undergoes only minor linear operations, information can flow along the entire chain and remain stable. A gate is a weighting mechanism composed of a sigmoid function, a tanh function, and a pointwise multiplication operation, and it realizes the selection of information.

The operation of the LSTM is described below, following the structure in Figure 5.

#### 1. Forget gate calculation:

The forget gate decides whether information is retained, as shown in Equation (6). The gate reads the information state through the sigmoid function and outputs a value between 0 and 1, where 1 means complete retention and 0 means complete deletion; the sigmoid function is given in Equation (7):

$$F_t = \sigma\left(w_F \cdot \left[h_{t-1}, x_t\right] + b_F\right) \tag{6}$$

$$S(t) = \frac{1}{1 + e^{-t}} \tag{7}$$

#### 2. Input gate calculation:

The input gate determines how much new information enters, as shown in Equation (8). There are two steps: in the first, the input gate determines the new information that is allowed to enter the cell; in the second, the tanh function produces the candidate information that needs to be remembered, as shown in Equation (9):

$$I_t = \sigma\left(w_I \cdot \left[h_{t-1}, x_t\right] + b_I\right) \tag{8}$$

$$\tilde{C}_t = \tanh\left(w_C \cdot \left[h_{t-1}, x_t\right]\right) \tag{9}$$

#### 3. Cell state update:

To update the cell state from *C<sub>t−1</sub>* to *C<sub>t</sub>*, the previous state value *C<sub>t−1</sub>* is multiplied by *F<sub>t</sub>* to discard the unnecessary part, and the candidate information from Equation (9), multiplied by *I<sub>t</sub>*, is added. This yields the updated cell state *C<sub>t</sub>*, as shown in Equation (10):

$$C_t = F_t * C_{t-1} + I_t * \tilde{C}_t \tag{10}$$

#### 4. Output gate

The output gate determines the information to be output from the cell state, as shown in Equation (11). The sigmoid function determines which part of the cell state is output. The tanh function then processes the cell state to give the final output *h<sub>t</sub>* at the current moment, as shown in Equation (12):

$$O_t = \sigma\left(w_O \cdot \left[h_{t-1}, x_t\right] + b_O\right) \tag{11}$$

$$h_t = O_t * \tanh(C_t) \tag{12}$$

In Equations (6)–(12), *F<sub>t</sub>*, *I<sub>t</sub>*, and *O<sub>t</sub>* represent the outputs of the forget gate, input gate, and output gate at time *t*, respectively; *C<sub>t−1</sub>* represents the cell state at the previous moment; *C<sub>t</sub>* represents the cell state at time *t*; *x<sub>t</sub>* represents the input information; *h<sub>t−1</sub>* represents the output at the previous moment; *w<sub>F</sub>*, *w<sub>I</sub>*, and *w<sub>O</sub>* represent the weight vectors of the forget gate, input gate, and output gate, respectively; *b<sub>F</sub>*, *b<sub>I</sub>*, and *b<sub>O</sub>* represent the bias vectors of the forget gate, input gate, and output gate, respectively; and *σ*(·) represents the sigmoid activation function. The entire forward calculation of the LSTM cell is completed through three gates and one cell state.
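The forward calculation in Equations (6)–(12) can be sketched in plain Python as follows. This is an illustrative single-cell implementation, not the authors' code: the dictionary keys (`"w_F"`, `"b_F"`, and so on) are hypothetical names, each weight row acts on the concatenation [h<sub>t−1</sub>, x<sub>t</sub>], and the candidate state carries no bias term because Equation (9) does not include one.

```python
import math


def sigmoid(z):
    """Equation (7): logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, v):
    """Inner product of two equal-length lists."""
    return sum(wi * vi for wi, vi in zip(w, v))


def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of an LSTM cell; returns (h_t, c_t) as lists."""
    z = h_prev + x_t                      # concatenation [h_{t-1}, x_t]
    n = len(h_prev)                       # number of hidden units
    f = [sigmoid(dot(p["w_F"][k], z) + p["b_F"][k]) for k in range(n)]   # Eq. (6)
    i = [sigmoid(dot(p["w_I"][k], z) + p["b_I"][k]) for k in range(n)]   # Eq. (8)
    c_cand = [math.tanh(dot(p["w_C"][k], z)) for k in range(n)]          # Eq. (9)
    c_t = [f[k] * c_prev[k] + i[k] * c_cand[k] for k in range(n)]        # Eq. (10)
    o = [sigmoid(dot(p["w_O"][k], z) + p["b_O"][k]) for k in range(n)]   # Eq. (11)
    h_t = [o[k] * math.tanh(c_t[k]) for k in range(n)]                   # Eq. (12)
    return h_t, c_t


# Illustrative parameters: one hidden unit, one input feature, all weights zero.
params = {
    "w_F": [[0.0, 0.0]], "b_F": [0.0],
    "w_I": [[0.0, 0.0]], "b_I": [0.0],
    "w_C": [[0.0, 0.0]],
    "w_O": [[0.0, 0.0]], "b_O": [0.0],
}
h, c = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], p=params)
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate is 0, so the cell state is simply halved. This makes the roles of the forget and input gates in Equation (10) easy to verify by hand.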

#### **3. Indicator Creation and Model Design**

#### *3.1. Creation of Carbon Emission Information Indicators*

After analyzing the industry characteristics of the Yangtze River Economic Belt, the proposed method derives key factors, such as industry investment scale and labor output efficiency, that affect carbon emissions in the "economic belt". Then, the extended STIRPAT (Stochastic Impacts by Regression on Population, Affluence, and Technology) model is optimized to determine the information indicators:

(1) ZB1 is used to represent the total carbon emissions in the "Yangtze River Economic Belt" basin, with a unit of 10,000 tons;

(2) ZB2 is used to represent the investment scale of the industry, measured as the sum of the fixed assets and current assets of enterprises above the designated size, with a unit of CNY 100 million;

(3) ZB3 is used to represent industrial economic efficiency, measured by labor output efficiency;

(4) ZB4 is used to represent carbon emission intensity, that is, CO<sub>2</sub> emissions per unit of industrial added value;

(5) ZB5 is used to represent the scale of opening up to the outside world, and it is measured by the proportion of the sum of investments from Hong Kong, Macao, and Taiwan in addition to foreign investment in the industrial added value, and the unit is %;

(6) ZB6 is used to represent the intensity of environmental protection, and the investment in environmental pollution control is expressed as the proportion of industrial GDP, for which the unit is % [28].

The expanded model indicator variable list is shown in Table 1.



*3.2. Carbon Emission Modeling Process*

Carbon emission prediction means predicting future carbon emission information from past carbon emission indicator information. This study therefore performs regression prediction of the target value from a set of time series containing feature data.

For both regression and classification algorithms, data need to be preprocessed before the model is built, especially in practical research on multi-dimensional data. For the problem at hand, the six indicators in Table 1 are selected as data features to predict carbon emission information. Because the original data have different characteristics in different dimensions, they need to be standardized in a preprocessing step.

(1) Data normalization method:

Min–max standardization, also called maximum value normalization, is a linear transformation of the original data that maps the result to the range [0, 1], as shown in Equation (13):

$$x\_{\text{scale}} = \frac{x - x\_{\text{min}}}{x\_{\text{max}} - x\_{\text{min}}} \tag{13}$$

Maximum normalization computes the maximum and minimum values of each dimension of the data and then converts the original data. However, it has limitations. If the data change, the maximum and minimum values need to be recalculated. In addition, maximum normalization is extremely sensitive to extreme values. Therefore, it is more suitable for data with clear bounds [29].

(2) Mean variance normalization:

$$x\_{\text{scale}} = \frac{x - x\_{\text{mean}}}{\sigma} \tag{14}$$

Mean variance normalization is a common method for preprocessing data. It computes the mean and standard deviation of the data and rescales the original data so that they have a mean of 0 and a standard deviation of 1 [30].
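
The two preprocessing methods can be sketched in a few lines. This is an illustration only: the function names `min_max_scale` and `z_score_scale` and the sample values are hypothetical, and the use of the population standard deviation for *σ* is an assumption, since the text does not specify it.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Equation (13): linearly map values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Equation (14): subtract the mean and divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("standardization is undefined for constant data")
    return [(v - mu) / sigma for v in values]


# Hypothetical indicator values, for illustration only.
sample = [2.0, 4.0, 6.0, 8.0]
scaled = min_max_scale(sample)
standardized = z_score_scale(sample)
```

When the data are later split into subsets, the minimum, maximum, mean, and standard deviation are usually computed on the training portion only and reused for the remaining data, so that no information leaks from the evaluation data into the model.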

The evaluation of the model centers on how the collected data set is divided. The data can be divided into three sets: a training set, a validation set, and a test set [31]. The training set is used to fit the model, the model parameters are tuned on the validation set, and the model is then evaluated. Once the model performs well on the validation set, the best model parameters are considered found, and the test set is used for the final model testing.
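
A three-way division of this kind might look as follows. The sketch assumes a chronological split, which is a common choice for time-series data but is not stated in the text; the function name `chronological_split` and the 70/15/15 proportions are hypothetical.

```python
def chronological_split(records, train_frac=0.7, val_frac=0.15):
    """Split time-ordered records into training, validation, and test sets.

    The order is preserved, so the validation and test sets always lie
    later in time than the training set.
    """
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(records)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return records[:train_end], records[train_end:val_end], records[val_end:]


# 36 hypothetical monthly observations, indexed 0..35.
months = list(range(36))
train, val, test = chronological_split(months)
```

Keeping the original order avoids fitting the model on observations that occur after the ones it is evaluated on.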

Common indicators for evaluating the prediction accuracy of regression models are as follows:

(a) The sum of squared errors adds up the squared differences between the model predictions and the true values, but it keeps growing as the number of samples increases. To remove the influence of the sample size, the squared errors are averaged, which gives the mean squared error (MSE), as shown in Equation (15):

$$MSE = \frac{1}{N} \sum\_{i=1}^{N} |y\_i - y\_i'| ^2 \tag{15}$$

(b) The mean absolute error (MAE) is the average of the absolute value of the difference between the predicted value and the true value, as shown in Equation (16):

$$MAE = \frac{1}{N} \sum\_{i=1}^{N} \left| y\_i - y\_i' \right| \tag{16}$$

(c) The root mean squared error (RMSE) is the square root of the MSE, as shown in Equation (17):

$$RMSE = \sqrt{MSE} = \sqrt{\frac{1}{N} \sum\_{i=1}^{N} |y\_i - y\_i'|^2} \tag{17}$$

(d) The median absolute error (MedAE) takes the median, rather than the mean, of the absolute differences between the predicted values and the true values, as shown in Equation (18):

$$\text{MedAE} = \text{median}\_{i=1,\ldots,N} |y\_i - y\_i'|\tag{18}$$

The median absolute error contains an absolute value, which is not differentiable at zero. When the loss function of the model has to be differentiated, an index based on the absolute value is therefore usually unsuitable [32].
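
The four indicators can be computed directly from paired true and predicted values, as in the sketch below. The MAE follows the verbal definition in item (b), that is, the mean of the absolute differences. The function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import median


def _abs_errors(y_true, y_pred):
    """Absolute differences between paired true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


def mse(y_true, y_pred):
    """Mean squared error, Equation (15)."""
    errors = _abs_errors(y_true, y_pred)
    return sum(e * e for e in errors) / len(errors)


def mae(y_true, y_pred):
    """Mean absolute error: the mean of the absolute differences."""
    errors = _abs_errors(y_true, y_pred)
    return sum(errors) / len(errors)


def rmse(y_true, y_pred):
    """Root mean squared error, Equation (17)."""
    return sqrt(mse(y_true, y_pred))


def medae(y_true, y_pred):
    """Median absolute error, Equation (18)."""
    return median(_abs_errors(y_true, y_pred))


# Hypothetical true and predicted values, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
metrics = {
    "MSE": mse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MedAE": medae(y_true, y_pred),
}
```

Because the median ignores the size of the largest errors, MedAE is less affected by a few badly predicted samples than MSE or RMSE.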

#### *3.3. Data Source*

The data for this experiment were obtained through the online data query system on the official website of the National Bureau of Statistics (https://data.stats.gov.cn) (accessed on 11 December 2021). The database of the National Bureau of Statistics collected data from 2018 to 2020, and the monitoring interval was monthly. The relevant index data obtained are mainly data on the 11 provinces and cities in the Yangtze River Economic Belt. The data collected from the official website of the National Bureau of Statistics were used as the training set and test set of the proposed carbon emission prediction model; the data collected by the official statistical bureaus of local provinces and cities were used as the verification data set for demonstration applications.

Table 2 shows the descriptive data of carbon emissions in the Guizhou and Jiangsu provinces from 2000 to 2020.

**Table 2.** Carbon emissions in the Jiangsu and Guizhou provinces during the period 2000–2020.


#### *3.4. SVR Machine Model Creation*

Support vector machine (SVM) is a powerful machine learning algorithm [33]. SVM can solve both classification and regression problems. In addition, SVM can handle both supervised learning with target variables and unsupervised learning without target variables. Its application scenarios are very rich: it can be used for binary classification problems, multi-classification problems, linear and nonlinear problems, etc.

The problem being studied is essentially a regression prediction problem. The application of SVM to regression is called support vector regression (SVR) [34]. SVR is achieved by adding an insensitive loss function to SVM. It extends the classification problem to the regression problem: it fixes an error tolerance and tries to keep all the sample points within this tolerance to achieve a prediction of the data.

SVR is an application of SVM in the field of regression. Its principle is to obtain a regression model (Equation (20)) on the known sample set (Equation (19)):

$$\mathbf{D} = \{ (\mathbf{x}\_1, y\_1), (\mathbf{x}\_2, y\_2), \dots, (\mathbf{x}\_m, y\_m) \}, \quad \mathbf{x}\_i \in \mathbb{R}^n, \ y\_i \in \mathbb{R} \tag{19}$$

$$f(\mathbf{x}) = \boldsymbol{\omega}^T \mathbf{x} + b \tag{20}$$

The goal is to make *f*(*x*) as close to *y* as possible. Here, *ω* and *b* are the parameters to be determined in the model; *ω* is the normal vector of the hyperplane and *b* is the displacement term. For general regression problems, the loss of the model is zero only when *f*(*x*) and *y* are exactly equal. The SVR model instead allows a tolerance deviation *ε*: a loss is counted if and only if the absolute value of the difference between *f*(*x*) and *y* is greater than *ε*. This is equivalent to constructing an isolation band with a width of *2ε* centered on *f*(*x*), as shown in Figure 6.

**Figure 6.** Schematic diagram of SVR.

The problem of the SVR model can be transformed into:

$$\min\_{\boldsymbol{\omega}, b} \frac{1}{2} \parallel \boldsymbol{\omega} \parallel^2 + C \sum\_{i=1}^{m} l\_{\varepsilon} (f(\mathbf{x}\_i) - y\_i) \tag{21}$$

$$l\_{\varepsilon}(z) = \begin{cases} 0, & \text{if } |z| \le \varepsilon \\ |z| - \varepsilon, & \text{otherwise} \end{cases} \tag{22}$$

where *C* is the regularization constant and *l<sub>ε</sub>* is the ε-insensitive loss function illustrated in Figure 6. With the introduction of the slack variables *ξ*<sub>*i*</sub><sup>∨</sup> and *ξ*<sub>*i*</sub><sup>∧</sup>, Equations (21) and (22) can be rewritten as

$$\min \frac{1}{2} \parallel \boldsymbol{\omega} \parallel^2 + C \sum\_{i=1}^{m} \left( \xi\_i^\vee + \xi\_i^\wedge \right) \tag{23}$$

$$\text{s.t.} \quad -\varepsilon - \xi\_i^\vee \le y\_i - \boldsymbol{\omega} \cdot \phi(\mathbf{x}\_i) - b \le \varepsilon + \xi\_i^\wedge \tag{24}$$

$$\text{s.t.} \quad \begin{array}{l} f(\mathbf{x}\_{i}) - y\_{i} \leq \varepsilon + \xi\_{i}^{\vee} \\ y\_{i} - f(\mathbf{x}\_{i}) \leq \varepsilon + \xi\_{i}^{\wedge} \\ \xi\_{i}^{\vee} \geq 0, \ \xi\_{i}^{\wedge} \geq 0, \ i = 1, 2, \dots, m \end{array} \tag{25}$$
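
To make the role of the tolerance *ε* and the constant *C* concrete, the following sketch evaluates the ε-insensitive loss and the resulting objective for a linear model with fixed parameters. It is an illustration under stated assumptions, not a solver: the function names, the data, and the parameter values are hypothetical.

```python
def eps_insensitive_loss(z, eps):
    """Zero inside the band |z| <= eps, growing linearly outside it."""
    if eps < 0:
        raise ValueError("eps must be non-negative")
    return max(0.0, abs(z) - eps)


def linear_svr_objective(w, b, xs, ys, eps, c):
    """Regularization term plus C times the summed eps-insensitive loss,
    evaluated for the linear model f(x) = w . x + b."""
    regularization = 0.5 * sum(wj * wj for wj in w)
    total_loss = 0.0
    for x, y in zip(xs, ys):
        f_x = sum(wj * xj for wj, xj in zip(w, x)) + b
        total_loss += eps_insensitive_loss(f_x - y, eps)
    return regularization + c * total_loss


# Hypothetical one-dimensional samples, for illustration only.
xs = [[0.0], [1.0], [2.0]]
ys = [0.0, 1.05, 2.6]
value = linear_svr_objective([1.0], 0.0, xs, ys, eps=0.1, c=10.0)
```

In this example only the third sample lies outside the band, so it is the only one that contributes to the loss term; the first two are treated as predicted without error.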

SVM was originally designed for binary classification. Practical problems are often complex and nonlinear, so "dimension raising" is used to deal with the nonlinearity in the data [35]: the low-dimensional data are mapped to a high-dimensional space, the problem is solved there, and the result is converted back to the low-dimensional space. The kernel function is introduced to handle this conversion. When the SVR machine is used to make predictions, different kernel functions are selected for modeling, and the differences between them are compared. The kernel function categories and characteristics are shown in Figure 7.

**Figure 7.** Schematic diagram of kernel function names and characteristic parameters.

Using the duality principle [36] and introducing the kernel function, the SVR model is obtained, as shown in Equation (26):

$$f(\mathbf{x}) = \sum\_{i=1}^{m} (\hat{\alpha}\_i - \alpha\_i) k(\mathbf{x}\_i, \mathbf{x}) + b \tag{26}$$
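
As an illustration of how a kernelized SVR prediction is evaluated, the sketch below implements four commonly used kernel forms and the weighted kernel sum of Equation (26). The contents of Figure 7 are not reproduced here, so the parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions; the support vectors, coefficients, and bias are hypothetical rather than the result of training.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def gaussian_kernel(u, v, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


def svr_predict(x, support_vectors, dual_coefs, b, kernel):
    """Weighted kernel sum plus bias; dual_coefs[i] is the coefficient
    difference that multiplies k(x_i, x) in Equation (26)."""
    weighted = sum(coef * kernel(sv, x)
                   for sv, coef in zip(support_vectors, dual_coefs))
    return weighted + b


# Hypothetical support vectors and coefficients, for illustration only.
support_vectors = [[0.0], [1.0]]
dual_coefs = [0.5, -0.25]
pred_rbf = svr_predict([0.0], support_vectors, dual_coefs, 0.1, gaussian_kernel)
pred_lin = svr_predict([2.0], support_vectors, dual_coefs, 0.1, linear_kernel)
```

Only the kernel function changes between the four variants; the prediction formula itself stays the same, which is why the kernels can be compared on an otherwise identical model.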

#### *3.5. LSTM-SVR Hybrid Model Construction*

Carbon emission (CO<sub>2</sub>) concentration data are time-series data with non-linear characteristics. The LSTM-SVR hybrid model is proposed to improve the prediction accuracy of CO<sub>2</sub> concentration. The specific steps for predicting the CO<sub>2</sub> concentration with the LSTM-SVR hybrid model are as follows:

(1) Acquire the CO<sub>2</sub> concentration data and meteorological factor data;

(2) Preprocess the acquired data to eliminate errors or abnormal factors in the data;

(3) Use the LSTM model to train on and predict the processed data, generating a set of corresponding prediction values *D*<sup>f</sup><sub>*t*</sub>;

(4) Take the difference between the processed data *D<sub>t</sub>* and the predicted value *D*<sup>f</sup><sub>*t*</sub> to obtain the error value *e<sub>t</sub>* at time *t*;

(5) Use the SVR model to perform regression prediction on the error value *e<sub>t</sub>* at time *t*, with cross-validation and grid search algorithms used to find the optimal kernel function parameter *g* and penalty factor *c* of the SVR model; the resulting prediction is the corrected error value *ê<sub>t</sub>*;

(6) Combine the corrected error value *ê<sub>t</sub>* with the LSTM predicted value *D*<sup>f</sup><sub>*t*</sub> to obtain the final predicted value of the hybrid model, *D*<sup>∗</sup><sub>*t*</sub> = *ê<sub>t</sub>* + *D*<sup>f</sup><sub>*t*</sub>. The model framework is shown in Figure 8.

**Figure 8.** LSTM-SVR hybrid model framework diagram.
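
The residual-correction logic of steps (3) to (6) can be sketched independently of the two learning models. In the sketch below, a naive persistence forecast stands in for the trained LSTM and a mean-error predictor stands in for the trained SVR; both stand-ins, as well as all function names and the sample series, are hypothetical and only serve to show how the base forecast and the predicted error are combined.

```python
def hybrid_forecast(series, base_predict, fit_error_model):
    """Residual-correction hybrid: base forecast plus predicted error.

    base_predict(history) returns a forecast of the next value
    (stand-in for the LSTM).
    fit_error_model(errors) returns a callable that maps past errors to the
    next error (stand-in for the SVR).
    """
    # Steps (3)-(4): in-sample base forecasts and their errors.
    base = [base_predict(series[:t]) for t in range(1, len(series))]
    errors = [actual - pred for actual, pred in zip(series[1:], base)]

    # Step (5): model the error series and predict the next error.
    error_model = fit_error_model(errors)
    e_hat = error_model(errors)

    # Step (6): add the predicted error to the base forecast.
    d_f = base_predict(series)
    return d_f + e_hat, d_f, e_hat


def persistence(history):
    """Stand-in base model: repeat the last observed value."""
    return history[-1]


def mean_error_model(errors):
    """Stand-in error model: always predict the mean past error."""
    average = sum(errors) / len(errors)
    return lambda past_errors: average


# Hypothetical series with a steady upward trend, for illustration only.
series = [10.0, 12.0, 14.0, 16.0]
d_star, d_f, e_hat = hybrid_forecast(series, persistence, mean_error_model)
```

Here the base model systematically lags the trend, and the error model recovers that lag, which is the kind of structure the second stage of a hybrid model is meant to capture.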

#### **4. Experimental Results and Analysis**

#### *4.1. SVR Model Fitting Results*

The SVR model is used to perform non-linear regression fitting on the carbon emission-related information of all provinces and cities in the "Yangtze River Economic Belt" basin. Under different kernel functions, the fitting results of the model are shown in Figures 9–13.

The experiment finally obtained a test set RMSE of 0.715. This evaluation index is calculated on the normalized data. The fitting result of the Gaussian kernel function is shown in Figure 10.

Figure 11 indicates that the over-fitting is more serious and that the function parameters need to be adjusted. After adjustment, the training set score of the final model is 0.663, the test set score is 0.634, and the RMSE of the test set is 0.682.

**Figure 9.** Schematic diagram of linear kernel function fitting results.

**Figure 10.** Schematic diagram of the Gaussian kernel function fitting results.

Figure 11 reveals that the fitting effect of the sigmoid kernel function is relatively poor: the scores of both the test set and the training set are negative. The parameters need to be adjusted to achieve a reasonable fitting result.

Figure 12 shows that the model performs differently on the training set and the test set: the training set score is reasonable, whereas the test set score is low. The fitting results differ across the kernel functions in the above figures; the fitting scores and RMSE values are compared in Figure 13.

**Figure 11.** Schematic diagram of the sigmoid kernel function fitting results.

**Figure 12.** Schematic diagram of the polynomial kernel function fitting results.

Figure 13 visually shows that the training performance scores of the four different kernel functions of the model are relatively small. This is due to the constant use of grid search in the process of model training to adjust the parameters of the model.

**Figure 13.** Comparison of the model performance of four types of kernel functions.

#### *4.2. Deep Learning Network Model Training Results*

This experiment used the LSTM neural network to train the model. After 60 iterations, the loss of the model on the training and test sets was obtained, as shown in Figure 14.

**Figure 14.** Comparison of the training and testing losses for multiple iterations of LSTM.

Figure 14 shows that the loss gradually stabilizes after multiple iterations. Figure 15 shows the fitting trend graph of the model trained by the LSTM neural network.

**Figure 15.** Schematic diagram of the fitting trend results after LSTM training.

#### *4.3. Carbon Emission Forecast Results*

The SVR model and the LSTM model were trained on the experimental data set. The prediction curves of the carbon emission information-related indicators in the Yangtze River Economic Belt are compared with the real curves in Figures 16–18.

**Figure 16.** Schematic diagram of the comparison between the industry investment scale information and "Yangtze River Economic Belt" carbon emission information prediction and the actual comparison.

**Figure 17.** Schematic diagram of the comparison between the labor efficiency output information and carbon emission information prediction and the actual comparison of the "Yangtze River Economic Belt".

**Figure 18.** A schematic diagram of the comparison between the scale of carbon emissions information and the carbon emissions information prediction and the actual comparison of the "Yangtze River Economic Belt".

In Figure 16, the scale of the industry investment is used to predict carbon emissions. There are consistent overall trends. This also reflects the fact that the economic development and growth of the Yangtze River Economic Belt will inevitably increase carbon emissions.

Figure 17 suggests that the labor efficiency output information predicts carbon emissions less accurately, with certain errors. The reason for this is that the labor efficiency output has a spatial lag.

In Figure 18, the information on the scale of carbon emissions predicts the trend of carbon emissions more accurately. In summary, the combination of the SVR model and the LSTM model proposed herein predicts the carbon emission-related index information for the Yangtze River Economic Zone with relative accuracy. Meanwhile, the ZB1 and ZB2 indicator information plays a key role in predicting carbon emissions.

#### *4.4. Policy Recommendations*

In response to the proposed factors influencing carbon emissions and the prediction results of the carbon emissions model, the following policy recommendations are proposed, taking account of China's specific national conditions.

1. Strengthen carbon market capacity building:

Chinese central state-owned enterprises have the responsibility and obligation to implement and fulfill the state's requirements in terms of corporate, political, and social responsibility. Chinese central state-owned enterprises should actively implement the national climate change policy requirements. They also should take the lead in promoting research into carbon asset management, low-carbon development strategies and paths, and complete the country's binding targets for greenhouse gas emissions.

This requires Chinese central state-owned enterprises to interpret carbon emission policies. They also need to study how far these policies influence the company's choice of development strategy, production decision making, technological progress, energy conservation, and environmental protection. Furthermore, companies should explore the impact of carbon emission policy structure on the behavior of industrial enterprises and study the relationship between carbon emission policy and corporate competitiveness. Companies should also make full use of China's carbon emission policies to lower the cost of cutting carbon emissions while building a green and clean energy company, effectively increasing the value of carbon assets, and preparing for the cultivation of new industries in the future.

2. Strengthen corporate carbon asset management:

Only three (sub-)sectors are included in the first batch of the unified national carbon market. The pilot carbon emission control companies that were previously included in the pilot areas will also be included (including the Tianjin branch of CNOOC (China) Co., Ltd.).

With the improvement in the carbon market and the increasing pressure created by the conditions for implementing a carbon tax, no company will be exempt from reducing its carbon emissions. Therefore, it is particularly important to manage corporate carbon assets well in advance. A company's carbon emissions should be established as soon as possible through inventory and verification. Companies should adopt emission reduction or response measures as early as possible, in line with relevant policies and development trends, to minimize the impact of carbon emissions on the company. Companies with surplus carbon emission allowances can also seek additional benefits from them.

3. Properly assess the impact of new projects on carbon emissions:

Carbon emission assessment of projects is a new assessment method that has been gradually developed in recent years. Whether through the carbon market or the implementation of a carbon tax, companies with high carbon emissions will see their production and operation affected to a certain extent. For new, renovated, or expanded projects, it is recommended that a carbon emission assessment be carried out in the preliminary research stage of the project. The main purposes of the assessment are (1) to predict the carbon emission cost of investment projects and assess the degree of impact of carbon emissions on the economics of the project, and (2) to enable a better-informed choice of measures and paths to meet carbon emission requirements.

4. Deploy large-scale carbon emission reduction technologies as early as possible:

CO<sub>2</sub> emission reduction is a long-term and arduous task. In the near term, work can be carried out on energy optimization, production energy conservation, and the development and utilization of clean energy by tapping the company's internal emission reduction potential. However, more CO<sub>2</sub> emission reduction and utilization methods, i.e., "post-processing" technology, are needed to achieve long-term and effective emission reduction. Therefore, it is necessary to begin research and development of carbon emission reduction and utilization technology as soon as possible and to make full preparations for emission reduction technologies and programs. While fulfilling the goals required by the state, this will also enhance the core competitiveness of the enterprise and promote its low-carbon transformation as well as its sustainable development.

In summary, in the process of carbon emission management, due to the complexity of the factors influencing carbon emission, it is a great challenge to carry out forecasting, and it will also consume a lot of human effort and material resources. The use of deep learning to build a carbon emission prediction model can predict the carbon emissions of the Yangtze River Economic Zone with high accuracy, reduce the human and material investment in carbon emission management, and provide a reference for carbon emission management.

#### **5. Conclusions**

According to the model test results, the indicators that affect carbon emissions in the "Yangtze River Economic Belt" basin, such as the industry investment scale, labor efficiency output, and carbon emission intensity, predict carbon emission information relatively accurately [37]. Therefore, the study concludes that the SVR model for solving complex nonlinear problems can achieve a relatively good prediction effect when combined with LSTM training. Due to the complexity of the "carbon sources" of the research object, the information affecting carbon emissions is diverse, dynamic, and large in volume. On the other hand, deep learning algorithms integrate readily with other methods and come in many variants. Processing the information of the prediction model with a deep learning network is complicated, which is the main shortcoming of this study [38]. The prediction of carbon emission information is crucial to the country's mid-to-long-term "carbon peak" strategy. Deep learning networks should first be used to accurately predict carbon emissions within a specific economic region and then be promoted nationwide. This process requires the concerted efforts of researchers from related fields.

**Author Contributions:** Conceptualization, H.H. and X.W.; methodology, X.W.; software, X.W.; formal analysis, H.H.; investigation, X.C.; resources, X.C.; data curation, H.H.; writing—original draft preparation, H.H.; writing—review and editing, X.C.; visualization, X.C.; supervision, X.C.; project administration, X.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (Grant No. 41271516).

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of Anhui Normal University.

**Informed Consent Statement:** Informed consent was obtained from all individual participants included in the study.

**Data Availability Statement:** The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Review* **Social Vulnerability Assessment for Landslide Hazards in Malaysia: A Systematic Review Study**

**Mohd Idris Nor Diana <sup>1,</sup>\*, Nurfashareena Muhamad <sup>1</sup>, Mohd Raihan Taha <sup>2</sup>, Ashraf Osman <sup>3</sup> and Md. Mahmudul Alam <sup>4</sup>**


**Abstract:** Landslides represent one of the world's most dangerous and widespread risks, annually causing thousands of deaths and billions of dollars worth of damage. Building on and around hilly areas has increased in many regions, and it poses a severe threat to the physical infrastructure and people living within such zones. The state of quantitative social vulnerability assessment in Malaysia is a concern because it has received less attention than hazard-related studies. Therefore, this study's objective is to identify the indicators used for social vulnerability assessment in the context of landslides in Malaysia. The analysis is critical for understanding the measures of social vulnerability, given that the incorporation of climate change and disaster risk mitigation issues in urban planning and management is considered a priority in ensuring stable population growth and avoiding economic disruption. A systematic search of the Scopus and Web of Science repositories was conducted based on the PRISMA method. This article concludes that there are six important indicators of social vulnerability in the context of landslides in Malaysia.

**Keywords:** social vulnerability assessment; landslide; social indicator; disaster risk reduction; Malaysia

### **1. Introduction**

In recent years, extreme events have increased in intensity and frequency globally, leading to rising economic losses and casualties. It is believed that these events will continue to accelerate in future climate scenarios. An accurate understanding of the physical and socioeconomic drivers of these extreme events is crucial and can ultimately enhance adaptive strategies. The frequency and intensity of geophysical events are increasing as a result of the interaction between humans and the environment. Climate change and increasingly aggressive human activities contribute to the vulnerability of humans, their infrastructure, and the environment to catastrophic hazards [1]. Faced with ever-increasing societal impacts arising from such events, a wealth of research and analysis has focused on understanding causal processes and outcomes [2]. Landslides are a type of geophysical event that plays a significant role in the evolution of a landscape [3]. However, landslides pose a serious threat to local populations, given that these events are increasingly being triggered by a changing climate and more unpredictable weather patterns. In recent years, it has become clear from previous research that the location, abundance, activity, and frequency of landslides, as well as their social and economic consequences, are increasing over time and that more people are exposed to the risks [4–10]. It was reported in [11] that geophysical disasters such as landslides are the deadliest. The presence of humans, infrastructure, and other forms of vulnerabilities in one location will make things worse.

**Citation:** Nor Diana, M.I.; Muhamad, N.; Taha, M.R.; Osman, A.; Alam, M.M. Social Vulnerability Assessment for Landslide Hazards in Malaysia: A Systematic Review Study. *Land* **2021**, *10*, 315. https://doi.org/10.3390/land10030315

Academic Editor: Marina Cabral Pinto

Received: 11 February 2021 Accepted: 28 February 2021 Published: 19 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Historically, efforts to reduce landslides have been physically oriented, resulting in a proliferation of technocratic approaches in the literature, while financial losses and social vulnerability from geophysical events continue to increase. Over time, this gave rise to an alternative explanation that mounting losses are related less to the dynamics of the events and more to the vulnerability of exposed human populations [2]. Although assessing the magnitude and intensity of disasters is critical, the nature of population demographics and various socioeconomic contexts may also lead to a greater risk of disasters. Understanding the complexities of vulnerability to disasters, including those caused by geophysical events, is at the heart of disaster risk reduction. Efforts to reduce disaster risk involve various disciplines and should be viewed from numerous perspectives to provide long-term benefits. A comprehensive disaster risk reduction strategy that incorporates physical and socio-economic aspects is the key determinant of vulnerability.

Despite the very high importance of socioeconomic data for assessing landslide vulnerability, there is a lack of documented social data for analysis and mapping in Malaysia. Therefore, the objective of this study is to identify the indicators that are used for social vulnerability assessment in the context of landslides in Malaysia. The analysis is critical for understanding the measures of social vulnerability, since the incorporation of climate change and disaster risk mitigation issues in urban planning and management is a priority for ensuring stable population growth and avoiding economic disruption.

#### **2. Literature Review**

The definition of vulnerability is "the quality of being vulnerable (able to be easily hurt, influenced, or attacked), or something that is vulnerable" [12]. Vulnerability means the risk of being vulnerable or easily hurt by something or someone. Vulnerability is a concept that has been used over a long period of time, and it has been recognised in much research covering various fields of endeavour [13], for instance, the social sciences, economics, psychology, and engineering. It should be noted that there is no consensus regarding how vulnerability is defined [14]. It has, in fact, been interpreted in many ways according to the subject area being investigated.

According to [15], vulnerability refers to situations where individuals and societies are exposed to social, economic, and cultural risks and, in essence, to the dangers posed by harm to them. No person or community can avoid risk or harm entirely, so at best each individual needs to prepare for every situation. Moreover, the authors of [16] stress that social vulnerability is partly the result of social difference or social inequality, which shapes the susceptibility of different groups to harm or risk and governs their capacity to react to a certain situation. There is inequality in every society, and the unequal distribution of wealth and resources has permeated all of human history. For instance, from a farmer's perspective, inequality can take many forms, such as unequal distribution of wealth, water allocation, rights to land and water, taxation inequity, economic poverty, land tenure issues, and much more. The definition of climate vulnerability according to the Intergovernmental Panel on Climate Change (IPCC) is " . . . the degree to which geophysical, biological and socio-economic systems are susceptible to, and unable to cope with, adverse impacts of climate change" [17]. The concept of vulnerability has been refined over the decades so that people understand the disasters and hazards that occur in communities susceptible to this kind of situation. Vulnerability is something that can help people achieve a level of sustainable development realistically. Economic development or progress should be engaged with as long as the natural environment in which it occurs can be sustained.

For this reason, vulnerability can be defined in terms of individuals, households, or communities that are dealing with unexpected external shocks [18]. Vulnerability is present in both internal and external factors that influence the lives of individuals and communities. Furthermore, vulnerability can be understood as the capacity of individuals, groups or communities to reciprocate, cooperate, survive, and recover from the impact of environmental events that have happened around them [19]. Landslides are very indicative of how the characteristics of a social group can overcome this kind of disaster but also reflect the harsh realities of social vulnerability to natural events.

#### *2.1. Social Vulnerability to Disaster*

Vulnerability is broadly defined as the potential to suffer loss or harm. The theory includes the structural vulnerability of buildings and the physical exposure of people and places to natural events, while social vulnerability describes different kinds of susceptibility based on social, economic, and political factors [20,21]. Vulnerability and exposure are dynamic, varying across temporal and spatial scales, and depend on economic, social, geographical, demographic, cultural, institutional, governance, and environmental variables [22]. Analyses of vulnerability in the engineering context of landslides or slopes (or any disaster) are quite common [23,24]. The study by [25] described vulnerability as the characteristics of a person or group and their situation that influence their capacity to anticipate, cope with, resist, and recover from the impact of a natural hazard (an extreme natural event or process). Despite its importance in disaster risk reduction, there is still a lack of approaches that contribute to a better understanding of social vulnerability hidden in dynamic contextual conditions [26].

The definition of social vulnerability within the disaster framework was introduced in the 1970s, when researchers realised that exposure included socioeconomic factors affecting group resilience [27,28]. Social vulnerability is useful as an indicator of differential recovery potential from disasters. Its assessment normally employs individual characteristics of people such as age, race, health, income, type of dwelling unit, and employment [29]. Social vulnerability is a concept that can explain the social imbalances found in societies in some parts of the world. It is one of the results of the social inequalities that occur in many communities. Factors affecting social disparities evident in a society include: lack of resources such as information, knowledge, and technology; limited access to political power or representation; social capital; social networks and connections; beliefs and customs; building stock and age of infrastructure; and type and density of infrastructure and lifelines [30]. Next, the 18 social vulnerability indicators were introduced as follows: socioeconomic status (income, political power, and prestige), gender, race and ethnicity, age, commercial and industrial development, employment loss, rural/urban, residential property, infrastructure and lifelines, renters, occupation, family structure, education, population growth, medical services, social dependence, and special needs population [16].

The design of these indicators depends on their expected use, and they must be relevant to the hazard context, methodologies, and data availability [31]. However, social vulnerability exists based on the underlying characteristics of a population, and it does not rely on the hazard or susceptibility of an area. Apart from indicators, numerous indices have been developed to measure social vulnerability. Many pioneering researchers have devoted much effort to formulating the concept of social vulnerability. The Social Vulnerability Index (SoVI) was introduced in [32] to quantify social vulnerability on an empirical basis and to compare social differences within a community using social variables selected to mitigate the disadvantageous effects of certain events. It was asserted that socially vulnerable communities are more likely to be adversely affected in disaster events because they are much less likely to recover from them and more likely to die [33]. Even though the SoVI was devised with the United States in mind, many studies have adapted it for a variety of contexts, no matter the nature of the population or places being investigated.

#### *2.2. Landslides: Malaysia's Experience*

Malaysia is located in South-East Asia. It is divided into two regions, Peninsular Malaysia and the Malaysian part of Borneo Island. Malaysia is a tropical country with a warm and humid climate throughout the year. Over a recent 20-year period (1998–August 2018), Malaysia witnessed 51 disaster events [34–43]. During that time, 281 people died, more than 3 million people were affected, and disasters caused nearly US\$2 billion in damage [44]. Floods, landslides, drought, and forest fires are common in Malaysia, and annual rainfall is the main contributor, owing to two monsoon periods, the South West (SW) and North East (NE) monsoons, which occur between April and October and from November to March, respectively. These monsoons contribute to high annual rainfall of 2000–4000 mm, with a maximum of about 200 rainy days [45]. The amount of rainfall varies from one rainy day to the next [46]. The rain and consistently high temperatures throughout the year lead to intensive and extensive weathering of features on the ground. This combination of climatic and geological conditions, together with other causative factors such as slope angle, drainage conditions, geological boundaries, etc. [47], has led to landslides becoming one of Malaysia's most common natural disasters.

The most common trigger for landslides is heavy or prolonged rainfall, but seismicity, river undercutting, freeze-thaw cycles, and human activity may also cause substantial and destructive landslides. As reported in [48], Malaysia recorded 171 landslides between 2007 and March 2016, according to data from the US National Aeronautics and Space Administration (NASA), making the country the world's 10th highest in terms of landslide frequency. In recent years, Malaysia has experienced several landslides resulting from extreme tropical rainfall. Landslides have occurred in several parts of the country, such as Paya Terubong (Penang), Highland Towers (Kuala Lumpur), Hulu Langat, and Pos Dipang (Perak). These landslides have caused significant property loss and cost hundreds of lives. In 2017, 6000 people were severely affected by a flash flood and landslide in the Kundang, Selangor area, which left many stretches of road, infrastructure, and assets badly damaged [49]. When the population density of towns increases, development on highland or hilly terrain also increases, and this puts more stress on the natural environment. Urban areas are then exposed to a high risk of landslides [50]. Significant landslides in Malaysia were recorded from 1993 to 2020 (see Table 1).


**Table 1.** Series of significant landslide occurrences in Malaysia.


Source: [51–60].

In Malaysia, there have been numerous landslide events in the mountains and along valleys, rivers, and coastal regions [61,62], but the most massive have generally been associated with rivers. Findings from the literature show that landslides occur frequently in hilly areas during the rainy season. Landslides correlate strongly with drainage density and distance to the river, because landslides in mountainous regions are triggered by erosion-related phenomena [63]. Development in hilly areas in Malaysia has increased the risk and likelihood of landslides [64]. Hilly areas are attractive for building residential areas, hotels, or resorts. This poses a severe threat to the physical infrastructure and to the population living within those areas, and it will lead to many casualties and significant financial losses if these hilly regions are struck by landslides [65].

Globally, landslides cause billions of dollars' worth of infrastructure damage and thousands of deaths annually. The estimated number of deaths is 1000 per year, with destruction of property amounting to approximately US\$4 billion [66]. Meanwhile, losses due to landslides in Malaysia have exceeded US\$1 billion since 1973 [67]. Emergency preparedness plays a part in reducing the effects of disasters. The most effective preparedness at the initial stage is to make the right decisions to reduce the number of deaths and damage to property in communities. Rescue teams provide emergency response and preparedness training for each member of the community so that their reactions are practical. In Malaysia, several agencies are involved in dealing with landslides, such as the Malaysia Civil Defence Force (MCDF), the Fire and Rescue Department of Malaysia, the National Disaster Management Agency (NADMA), and others. Furthermore, the Ministry of Housing and Local Government has issued a guideline for any physical development on hilly terrain in Malaysia. Table 2 summarises the criteria of the biological effect based on the slope gradient, slope classification for engineering work, and the description of development activities.


**Table 2.** Malaysian Guideline on physical development in hilly terrains.

Source: [68].

Malaysia has its share of landslides, and most of the landslide studies conducted focus on the engineering perspective. Socioeconomic aspects should be taken into account to evaluate the vulnerability of the community, especially one at high risk of experiencing such catastrophic effects, but previous research has concentrated more on describing the disaster types [61,69], susceptibility, and risk assessment [70,71]. The level of quantitative evaluation of social vulnerability in Malaysia is worrying due to the lack of social data documented for analysis and mapping. Therefore, the objective of this study is to identify the indicators that are used for social vulnerability assessment in the context of landslides in Malaysia. The analysis is critical for understanding the measures of social vulnerability, since the incorporation of climate change and disaster risk mitigation issues in urban planning and management is a priority for ensuring stable population growth and avoiding economic disruption.

The representativeness of Malaysia as a case for research, though it can be critical in other cases, is not an issue for this study. What we are trying to demonstrate is that, in analysing landslide risk, the human dimension is integral and should be incorporated as detailed in this study. The methodology used in this study is a pioneering one for landslide risk assessment. Assessing landslide risk with the proposed methodology can be a crucial tool for engineers and policy-makers in developing a site, particularly in hilly areas, for population development. Thus, it must be done at the locality itself in order to assess the real risk of landslide. More importantly, this methodology can serve to highlight the importance of public education to increase the population's level of knowledge of the hazard and of the mitigation of possible landslide events in their area. Limited literature exists on social vulnerability mapping for climate-driven disasters in the country. The socio-economic aspect is the most apparent after disasters, as different patterns of damage, loss, and suffering may be experienced differently by certain groups of the population.

#### **3. Materials and Methods**

This section incorporates five significant sub-sections that explain the following: PRISMA, resources, inclusion and exclusion criteria, systematic review procedure, and data extraction and interpretation. The technique used to retrieve articles is the one suggested by [72].

#### *3.1. PRISMA*

The systematic review in this article was guided by the PRISMA method; the abbreviation stands for "Preferred Reporting Items for Systematic Reviews and Meta-Analyses." PRISMA has mainly been utilised by healthcare personnel to create systematic reviews and meta-analyses. Beyond the medical field, PRISMA has also been employed by environmental management experts to undertake systematic reviews.

#### *3.2. Resources*

This study used two primary journal databases, specifically Scopus and Web of Science (WoS). Scopus is a bibliographic database for journal articles and consists of abstract and citation sources. This database covers journals from the scientific, technical, medical and social sciences and currently has more than 5000 publishers worldwide and more than 22,000 titles. Web of Science (WoS) is a database produced by Clarivate Analytics, which includes articles from 256 disciplines such as science, social science, arts, humanities, etc. WoS offers full-text articles, reviews, editorials, abstracts, proceedings and book chapters. WoS includes more than 33,000 journals published from the year 1900 to the present day. Other databases, namely JSTOR and Google Scholar, were also considered for this research.

#### *3.3. Systematic Review Process*

The systematic review process includes four main stages to acquire relevant articles: identification, screening, eligibility, and data extraction.

#### 3.3.1. Identification

The first step in undertaking a systematic review is identification. Identification means finding the most relevant studies, using keywords, dictionary terms, thesauri, encyclopaedias, etc. The keywords used help to build the "search string" for the research (Table 3). Subsequently, 13 articles were found in JSTOR using the term "social vulnerability index." From the Scopus database, a total of 147 articles related to the search string were discovered, while a total of 69 items emerged from Web of Science (WoS). Meanwhile, 29 studies were found in the Google Scholar search engine, whose data cover a huge range of subjects and are essentially a superset of WoS and Scopus [73].



#### 3.3.2. Screening

The second part of the systematic review process is screening. Here, it is necessary to gather all the articles related to the study topic and exclude all irrelevant items. Table 4 shows the inclusion and exclusion criteria that need to be followed in finding related articles. A total of 258 articles were screened using the inclusion and exclusion criteria, which cover literature type, language, timeline, countries and territories, and subject area. For the first criterion, literature type, this study focused on journal research articles and excluded review articles, book chapters, and conference proceedings. For language, only English articles were chosen, and all non-English articles were excluded. The publication criterion was the period from 2010 to 2020 only, and the geographical criterion was Southeast Asia, Southwest Asia and Europe. Lastly, for the subject area, this study chose only articles from social sciences, environmental science, science, and agriculture. Applying the inclusion and exclusion criteria excluded 199 articles in total (Figure 1).


**Table 4.** Inclusion and Exclusion criteria.

**Figure 1.** Literature searches based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (adapted from [74]).

#### 3.3.3. Eligibility

In the third stage, eligibility, a total of 59 articles were assessed. The title, abstract, and content of each paper were examined thoroughly to make sure that it fulfilled the inclusion criteria and the review objective. In total, 50 articles were excluded because they did not fit these criteria. The articles selected for analysis therefore focus on social vulnerability and are empirical studies, because the purpose of this study is to define the indicators used to assess social vulnerability in the context of landslides in Malaysia. The research is important for understanding social vulnerability interventions, as the inclusion of climate change and disaster risk mitigation problems in urban/rural planning and management is a priority. More specifically, this approach will help to illustrate the value of public education in raising the population's level of awareness about the risk and mitigation of potential landslide events in their area. Although the occurrence of landslides differs with climatic conditions among the countries covered by the analysed articles, the lack of research on the formation of social vulnerability indicators in Southeast Asian countries led this study to expand its scope to Southwest Asian and European countries.
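The article counts reported across the identification, screening, and eligibility stages can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check: the function and variable names are hypothetical, and it assumes that the four database counts are simply added together (with no duplicate removal) to give the 258 screened articles.

```python
# Counts as reported in Sections 3.3.1 to 3.3.3; names are illustrative only.
identified = {
    "JSTOR": 13,
    "Scopus": 147,
    "Web of Science": 69,
    "Google Scholar": 29,
}


def prisma_flow(identified, excluded_screening, excluded_eligibility):
    """Return the number of articles remaining after each review stage.

    Assumes the database counts are summed without removing duplicates.
    """
    total = sum(identified.values())
    after_screening = total - excluded_screening
    included = after_screening - excluded_eligibility
    return {
        "identified": total,
        "after_screening": after_screening,
        "included": included,
    }


flow = prisma_flow(identified, excluded_screening=199, excluded_eligibility=50)
print(flow)
```

Under that assumption, the totals agree with the text: 258 articles identified, 59 carried into the eligibility stage, and nine retained for analysis.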

#### 3.3.4. Data Extraction

After the remaining articles were assessed and analysed, the researcher started to extract the data. First, this was done by reading the abstract of the article, and then the researcher read the full text to start identifying themes and sub-themes related to the objective. After that, themes and sub-themes were organised to establish a typology for the article.

#### **4. Results**

According to the results shown in Table 5, a total of nine articles were chosen for this study [75–83]. The selected articles were published between 2011 and 2020, which helps to identify research trends on the social indicators that are consistently studied and considered when forming a social vulnerability index for a certain area and community. With reference to the countries covered, two studies are from Nepal, and the rest are one study each from Portugal, England, Italy, Pakistan, India, China, and Indonesia. Table 5 comprises the names of the authors, the country of each study, the title of each article, and the objective of each study.


**Table 5.** List of articles analysed for systematic review.

Sources: Author analysis, 2020.

#### *4.1. Indicators Used to Measure Social Vulnerability in a Landslide*

There are 14 indicators serving to measure social vulnerability when a landslide occurs. Included here are age, gender, ethnicity, built environment, income, family structure, education, employment, occupation, urban or rural, disability, migration, medical, and population (Table 6).


**Table 6.** List of indicators used as social vulnerability index.

Sources: Author analysis, 2020.

In this study, five main indicators are focused on: age, ethnicity, education, disability, and health. These are the variables that most scholars measure when investigating landslides. They are explained in more detail below.

#### 4.1.1. Age

The first component discussed in [77] is "urban, age (elderly), and gender." The variables for age include the proportion of the resident population aged 65 and over, the proportion aged 4 and younger, the proportion aged 5–14, and the proportion aged 15–19. The study shows a negative result for elderly people, which means they are more susceptible to vulnerability. The study in [79] focuses on four component indicators: age, employment, population growth, and education. It also states that the aging index is one component that represents the age indicator.

The variables include the population aged 65 and above and the population aged 15 and younger. The aging phenomenon that is very evident in Italy has resulted from the depopulation of mountain areas, people leaving the land, migration, and the lure of promising jobs in the industrial and service sectors. Italy's people are generally living longer, and the average birth rate has declined. According to the study by [80], there are five main indicators affecting landslide risk perception: age, income, education level, location, and experience. In addition, the study shows that the age of the respondent has an effect on the perception of landslides.

#### 4.1.2. Ethnicity

According to [76], the ethnicity indicators focus on the Dalit population and minority populations such as Muslims and Sikhs/Punjabis. They found that this group made up less than 5% of the total population in Nepal and is considered a disadvantaged group. In [77], "nationality and ethnicity" is one of the five main indicators. The variables for the ethnicity indicator include persons of African origin living in the country, persons of foreign nationality, and residents born outside the country, treated as a marginal group. Like age, ethnicity can be an indicator in the social vulnerability index and help assess what is happening in a given society.

#### 4.1.3. Education

Education has always been regarded as one of the key vulnerabilities all communities have to deal with. Educated people are more likely to have advantages in everything they do compared to people without or with little education. There are three main variables relating to education as follows [75]: percentage of the population who can read and write, percentage who completed school certificate (SLC), and percentage who completed a college or university degree. In the study by [77], one of the indicators "development and education" included variables such as the proportion of illiterate people. The community can be very vulnerable when the proportions of literate and illiterate are dangerously disproportionate.

Furthermore, the level of education and qualification can affect vulnerability in a community. The higher the educational qualification someone has, the less likely they are to experience vulnerability from any hazard. According to [78], an individual who has enough education and knowledge regarding a certain issue will generally better understand the nature of a hazard and its likely effect on them. Education not only affects individuals' knowledge of certain issues but also helps to reduce poverty, improve health, and secure more and better job opportunities, higher salaries, etc.

#### 4.1.4. Special Needs Population

The population with special needs is usually much more vulnerable than people without a disability. Disability can be a major factor in assessing vulnerabilities, especially when disasters or hazards occur. As mentioned by [76], this factor is closely linked to socioeconomic status, education and built environment, and ethnicity, all of which are components of vulnerability assessment. This is shown by the variance for socioeconomic status (45.12%), education and built environment (19.74%), ethnicity (10.98%), and disability (10.78%).
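To see how much of the total variance these four components account for together, the reported percentages can be accumulated in order. The following is a minimal sketch that assumes the figures are additive shares of explained variance (as in a principal component analysis); that interpretation, and all names in the code, are assumptions rather than details confirmed by [76].

```python
# Variance shares (%) as quoted from [76] in the text above.
# Treating them as additive shares of total variance is an assumption.
explained = {
    "socioeconomic status": 45.12,
    "education and built environment": 19.74,
    "ethnicity": 10.98,
    "disability": 10.78,
}


def cumulative_shares(shares):
    """Return (component, share, running total) tuples in the given order."""
    running = 0.0
    rows = []
    for name, share in shares.items():
        running += share
        rows.append((name, share, round(running, 2)))
    return rows


rows = cumulative_shares(explained)
for name, share, cumulative in rows:
    print(f"{name:<34}{share:>7.2f}{cumulative:>8.2f}")
```

On this reading, the four components jointly account for about 86.6% of the variance, with socioeconomic status alone contributing more than half of that share.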

#### 4.1.5. Health

Health is one of the major indicators of this study. Variables such as medical services, health problems, and distance from the hospital are important factors in measuring social vulnerability, as mentioned by [77,81,82]. Being healthy and having a good public healthcare system is important for communities that are more vulnerable to a disaster or hazard. Poor public health systems can simply make problems worse and lead to more accidents and disruptions.

#### **5. Discussion**

There are not many studies concerning the Social Vulnerability Index (SoVI) with reference to landslides. Based on the research undertaken, articles on landslides in the context of a social vulnerability index usually consider other types of hazard or bracket landslides with other natural disasters. Articles based only on social vulnerability and landslides are difficult to locate. The social vulnerability index comes in several variants: not only SoVI but also indices referred to as SEVI or SVI. Even though the focus of this paper is only on SoVI, the researcher has taken note of the other types of social vulnerability index.

There are 14 indicators that have been employed to measure social vulnerability in the context of landslides (see Table 6): age, gender, ethnicity, built environment, income, family structure, education, employment, occupation, urban or rural, disability, migration, medical, and population. Based on the analysis, the researcher focuses on only five main indicators that have been used by many scholars: age, ethnicity, education, special needs population, and health. These were chosen because they are very relevant to the more vulnerable in society, especially where inequalities and imperiled areas are very evident.

#### *5.1. Education*

Several factors affect social vulnerability, including lack of access to: resources such as information, knowledge, and technology; social networks and connections with other individuals; social capital; and infrastructure [83]. In this study, education emerges as a major indicator employed in other studies regarding social vulnerability and landslides. Education is a bridge to success for many people, and it can refer to both formal and informal education. Education can also mean information, knowledge, and technology, depending on the scope of discussion. The importance of education is that it helps people achieve more success and status in society, get a better job, and understand the issues involved in a hazard or disaster. Furthermore, it helps individuals to be prepared for any circumstances. According to [84], individuals, households, and societies with better and more widespread higher education outcomes have better response mechanisms, are better prepared, and recover more consistently from disasters than others.

#### *5.2. Age*

Indicators such as age can also signal susceptibility to social vulnerability. Older and very young people are more vulnerable to hazards and disasters than people in the middle. A higher proportion of senior citizens means that a society is at greater risk from disaster and needs more strategies to repair any given situation, simply because older people are more vulnerable to hazards than other age groups. Older people normally need a lot of physical and emotional care and support services. They can also be more disadvantaged compared to other age groups. The indicators that have been collected from previous studies do not represent the population or the place.

#### *5.3. Ethnicity*

Racism or ethnic discord is one of the factors of disaster risk, especially for minority groups such as migrants and/or non-residents in a given location [85]. They are also known as marginalised groups, considered to be inferior in terms of their economic status, health, social relationships, and environment. If this situation continues, it will result in lasting social, political, and economic losses [86]. Although a mixture of sociospatial and biophysical influences shapes people's susceptibility to environmental hazards, race/ethnicity and class have been central to understanding social dynamics during hazard events [87].

#### *5.4. Special Needs Population*

Special needs populations, such as people with a disability, are the persons most at risk when a disaster occurs. Disability means that a person with a physical or mental condition has limited movement, senses, or ability to participate in activities. Conditions that are considered to be a disability include deafness, blindness, diabetes, autism, epilepsy, depression, and HIV. According to [88], disability emerges from the interaction between people with health conditions, such as cerebral palsy, Down syndrome, and depression, and personal and environmental influences, including negative attitudes, limited transport facilities, limited public service facilities, and insufficient social support systems. People with disabilities are generally the first victims of natural disasters. Indeed, early warning systems that alert the public may not actually reach disabled individuals in time. The death toll from a disaster among people with disabilities is two to four times larger than for those who are not disabled [89].

#### *5.5. Healthcare Accessibility*

Those with health problems are particularly vulnerable to landslides. They require constant attention and healthcare services to ensure their safety and good health. Therefore, access to health services such as hospitals, healthcare clinics, and pharmacies is an important need for this community. One of the principal components of emergency management is healthcare management to cope with disasters [90]. In disaster prevention activities, well targeted healthcare supply chain management can function effectively and efficiently. A substantial number of disaster casualties or even fatalities could be absorbed as long as healthcare services are up and running when a disaster occurs [91].

All the variables listed above play an essential role in determining the security of a community based on social indicators. However, this study found that income indicators and social capital are less emphasised. Income indicators refer to those with low incomes; households belonging to the bottom 40% of Malaysia's income distribution are very vulnerable to disasters. For example, the floods that occur every year cause suffering because people cannot work and, in the worst case, lose their jobs. The study in [92] found that the income sub-domain was the largest contributor, and gave a high value, to the index of endangered livelihoods of rural communities in Pahang in 2014. Low-income conditions also affect the time households need to recover after a catastrophic event. This study also found no studies that explore social asset indicators. Social assets are resources available to individuals and groups through membership in social networks. If a household has a higher position in a group or social institution, it will generate greater social strengths and resources [93]. A longer membership history, as well as more participation in other social groups, makes it easier to access information, business opportunities, social strength, and influence. The ability to access other assets is also simpler [94]. Social capital evolves through the interaction of relationships between people and groups in community social networks [95,96]. Social networking means the interaction of an individual with other individuals, organisations, and groups to obtain information and assistance on something related to their livelihood [96,97]. The lack or absence of these elements within the social life environment of an individual will contribute to their vulnerability, as emphasised by [98,99]. Social capital significantly influences the sustainability of their livelihoods by strengthening the ability to develop a network of cooperation between groups, both internally and externally, and by enhancing the institutional capacity of community groups to improve the well-being of society.

State government agencies, local governments, and community leaders are the most familiar with the people in their communities. The social vulnerability index is designed to assist them in ensuring the security and well-being of their population. The SVI components can help the state and local actors involved in all phases of the disaster sequence, in particular for landslides. Knowledge of the locations and communities that are vulnerable to landslides can help planners identify target groups and accelerate assistance in efforts to reduce property damage and loss of life, as well as prepare for disaster events. Stakeholders and management planners can site evacuation centres in secure places for those who need emergency assistance, such as elderly people, single mothers with children and infants, people without transportation, and migrants who are not fluent in the local language. In the recovery process, local governments may identify communities that require additional funding for human services, or use the index as a mitigation gauge to avoid further costs of post-disaster support [100]. The slowest to recover are socioeconomically low-income communities in areas prone to landslides. Therefore, the analysis results show that seven indicators, as outlined, should be used in a social vulnerability index to measure the level of susceptibility to landslide events: education, age, ethnicity, special needs population, healthcare accessibility, income, and social asset indicators. Future research will examine how SoVI can be used in the planning and mitigation processes to help target disaster management interventions as part of the system. In addition, the SoVI outcome can feed into geological mapping for disaster risk management in Malaysia's decision-making systems based on specific zones.

#### **6. Conclusions**

In this study, we have reviewed a selection of socioeconomic vulnerability components. At the search stage, 258 articles were found in key databases, and after applying inclusion and exclusion criteria following the PRISMA guideline, only nine articles were chosen as valid for this research. Fourteen variables were listed, and five variables of social vulnerability, which were typically used by scholars, proved to be relevant to Malaysia. Not all locations experience landslides in the same way, so the level of social vulnerability, and how it is measured, will differ. Although people may experience the same hazard or disaster, not all individuals go through the same processes of destruction, recovery, evaluation, etc. Some individuals experience much higher social vulnerability than others, depending on which indicators are employed. Given the climatic conditions and landslide occurrences in the Malaysian context, seven indicators are highlighted: education, age, ethnicity, special needs population, health accessibility, income, and social capital. These are the important indicators for measuring the social vulnerability of high-risk communities to landslide hazards. The results of measuring these indicators should be useful to authorities as complementary data to their geological mapping for disaster risk management, based on the locations of landslide events. This study is therefore important for understanding the social vulnerability index in the context of landslides in Malaysia.

**Author Contributions:** Conceptualization, M.I.N.D.; methodology, M.I.N.D.; software, M.I.N.D.; validation, M.I.N.D., M.R.T., and N.M.; formal analysis, M.I.N.D.; investigation, M.M.A.; resources, M.I.N.D.; data curation, M.M.A.; writing—original draft preparation, M.I.N.D.; writing—review and editing, M.M.A.; visualization, N.M.; supervision, M.R.T.; project administration, M.R.T. and A.O.; funding acquisition, M.R.T. and A.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Dana Padanan Antarabangsa—(MyPAIR)—Natural Environment Research Council (NERC), grant number NEWTON/1/2018/TK01/UKM/2.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors acknowledge the technical support of Nurul Atikah Zulkepli and Siti Nursakinah Selamat in searching for the material for this study. We also thank the four anonymous reviewers whose comments and suggestions helped improve and clarify this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*
