Article

Comprehensive Assessment of Biomass Properties for Energy Usage Using Near-Infrared Spectroscopy and Spectral Multi-Preprocessing Techniques

by Bijendra Shrestha 1, Jetsada Posom 2,3,*, Panmanas Sirisomboon 1 and Bim Prasad Shrestha 4,5
1 Department of Agricultural Engineering, School of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
2 Department of Agricultural Engineering, Faculty of Engineering, Khon Kaen University, Khon Kaen 40002, Thailand
3 Center for Alternative Energy Research and Development, Khon Kaen University, Khon Kaen 40002, Thailand
4 Department of Mechanical Engineering, Kathmandu University, Dhulikhel P.O. Box 6250, Nepal
5 Department of Bioengineering, University of Washington, Seattle, WA 98195, USA
* Author to whom correspondence should be addressed.
Energies 2023, 16(14), 5351; https://doi.org/10.3390/en16145351
Submission received: 8 June 2023 / Revised: 2 July 2023 / Accepted: 10 July 2023 / Published: 13 July 2023
(This article belongs to the Section I1: Fuel)

Abstract

In this study, partial least squares regression (PLSR) models were developed using no preprocessing, traditional preprocessing, multi-preprocessing 5-range, multi-preprocessing 3-range, a genetic algorithm (GA), and a successive projection algorithm (SPA) to assess the higher heating value (HHV) and ultimate analysis of ground biomass for energy usage by employing near-infrared (NIR) spectroscopy. A novel approach was utilized based on the assumption that using multiple pretreatment methods across different sections of the entire NIR wavenumber range would enhance the performance of the model. The performances of the models obtained from 200 biomass samples for HHV and 120 samples for ultimate analysis were compared, and the best model was selected based on the coefficient of determination of the validation set, root mean square error of prediction, and the ratio of prediction to deviation values. Based on the model performance results, the proposed HHV model from GA-PLSR and the N model from the multi-preprocessing PLSR 5-range method could be used for most applications, including research, whereas the C and H models from GA-PLSR and the O model from the multi-preprocessing PLSR 5-range method showed fair performance and are applicable only for rough screening. The overall findings highlight that the multi-preprocessing 5-range method, which was attempted as a novel approach in this study to develop the PLSR model, demonstrated better accuracy for HHV, C, N, and O, improving these models by 4.1839%, 8.1842%, 3.7587%, and 4.0085%, respectively. Therefore, this method can be considered a reliable and non-destructive alternative method for rapidly assessing biomass properties for energy usage and can also be used effectively in biomass trading. However, due to the small number of samples used in the model development, more samples are needed to update the model for robust application.

1. Introduction

Biomass is an important carbon-neutral, renewable bio-resource that is widely available throughout the world. It mainly consists of three polymers: cellulose, hemicellulose, and lignin, whose composition varies based on the type of biomass [1]. Hardwood and herbaceous biomass contain approximately 43–47% and 33–38% cellulose, 25–35% and 26–32% hemicellulose, and 16–24% and 17–19% lignin, respectively [2]. This composition of biomass can be converted into useful energy through various processes, such as combustion, gasification, torrefaction, or fermentation, making it a suitable alternative to fossil fuels. However, its low energy density, high moisture content, and high oxygen–carbon ratio make it challenging to store, transport, and utilize effectively. Therefore, a deep understanding of biomass properties is necessary to design the best thermal conversion methods [3,4,5]. In the current scenario, biomass is used mainly by the residential (cooking and heating) and industrial (combined heat and power) sectors through direct combustion, which negatively impacts health, the economy, energy, and the environment [6]. Research on bio-based energy technologies, such as clean cooking stoves, gasifiers, biogas, bio-char, bio-briquettes, and pellets, has yielded strong results in laboratory settings. However, due to inadequate and unreliable knowledge regarding the properties of biomass fuel, the overall efficiency and performance of these technologies remain only satisfactory. Additionally, various operation and maintenance challenges persist. Trading biomass based on volume and weight rather than its actual energy properties is still common. Therefore, the rapid, reliable, and non-destructive assessment of biomass properties is of utmost importance for identifying the actual energy potential and for proper technical and monetary management and utilization [5].
Biomass can be assessed for energy usage by evaluating its HHV and ultimate analysis. The HHV is an important and standard indicator of the energy content of biomass [7]. A bomb calorimeter is used to measure the HHV, which is destructive in nature [8]. The ultimate analysis provides information on the elemental composition of biomass in terms of wt.% of C, H, N, S, and O. The heating value of the biomass is directly correlated with C, H, and O composition [9]. Biomass with higher C and H and/or O and H contents and lower N and S contents is recommended for energy usage as it improves the HHV of the biomass [9,10].
Biomass is a good absorber of NIR radiation in the range of 3595 to 12,489 cm−1. It predominantly interacts with the bonds of non-symmetrical molecules, including C, O, H, and N [11,12], making it suitable for use in conjunction with NIRS and chemometrics for assessing the energy-related properties of biomass, including HHV and ultimate analysis parameters, such as C, H, N, S, and O [13]. Several previous studies have utilized NIRS to develop models for rapid and accurate measurement of various biomass properties for energy usage. For instance, Posom et al. [14] developed a reliable online method for measuring the HHV of sugarcane using NIRS. Phuphaphud et al. [15] developed spectroscopic models using visible and shortwave NIR to predict and classify the energy content of growing cane stalks for breeding programs. Huang et al. [10] developed a prediction model for the HHV as well as the elemental composition (C, H, and N) of straw using NIRS. Posom et al. predicted the HHV [3] and elemental composition (C, H, N, O, and S) [16] of grounded bamboo using NIRS. Skvaril et al. [17] reviewed the application of NIRS in biomass energy conversion processes. Zhang et al. [18] studied the fast analysis of HHV and elemental composition of sorghum biomass using NIRS. Xue et al. [19] studied the use of an online NIRS system for measurement of crop straw fuel properties. These studies demonstrate the potential for NIRS to provide rapid, reliable, and non-destructive alternative methods for characterizing biomass for energy usage compared to traditional destructive thermal analysis techniques.
NIRS, combined with a broad range of wavelengths and suitable chemometric models, offers extensive applications in various fields, such as food quality control, agriculture, biofuels, and drug analysis [13]. NIRS has been successfully employed for on-line, at-line, off-line, and in-line analysis, using instruments from different NIR ranges. For instance, in-line fiber-optic NIR spectra (300–1160 nm) have been utilized to classify durian pulp samples based on their dry matter content and soluble solids content [20]. FT-NIRS (800–2500 nm) has enabled rapid measurement of macronutrients such as nitrogen, phosphorus, and potassium in durian leaves, aiding in the production of high-quality durian fruits through optimal fertilization practices [21]. FT-NIR (700–2500 nm) has been employed to predict total phenolics and antioxidants in hulled and naked oats of different genotypes [22]. Vis-NIR (570–1031 nm) and Mid-NIR (860–1760 nm) spectroscopy have been utilized for starch content prediction in cassava [23]. The Micro-NIR portable spectrometer (900–1676 nm) has been found to have applications in the classification and quantification of crude oils and fuels [24]. Additionally, a portable NIR analyzer (1300–2600 nm) has been used for rapid confirmation of the presence of illicit drugs, such as cocaine [25]. NIRS provides better spectral reproducibility and a higher signal-to-noise ratio than complementary analytical techniques such as Raman and IR spectroscopy, and the signal-to-noise ratio is one of the most important parameters in quantitative calibration [26]. The better penetration depth in samples, minimal or no sample preparation, shorter acquisition times, and wide range of application in diverse fields highlight the multidisciplinary nature of NIRS. In contrast, the presence of a strong water absorption band in the NIR region limits the applicability of NIRS for samples with a high water content. In such cases, Raman spectroscopy can be a suitable alternative as it is relatively unaffected by water interference and can effectively analyze aqueous solutions and biological samples without significant water-related issues [27,28]. However, it is important to note that Raman scattering is inherently a weak phenomenon, often requiring longer acquisition times and being more sensitive to sample fluorescence [26]. In addition, the cost of instrumentation for Raman and IR spectroscopy is higher than that for NIRS. These factors showcase the acceptance of NIRS as a rapid, reliable, and non-destructive method, resulting in energy, environmental, cost, and time savings.
Despite NIRS being a rapid, reliable, and non-destructive analytical method, individual calibration models based on spectral data and each reference parameter must be developed for the NIR-based assessment of biomass properties. This procedure might be time-consuming and costly; however, in the long term, it will be beneficial for rapid and reliable evaluation procedures to assess biomass properties for their different applications.
In this study, a built-in code in MATLAB-R2020b was used to develop PLSR calibration models using spectral data from ten different biomass varieties (including five fast-growing tree varieties and five agricultural residue varieties); reference data obtained from a bomb calorimeter for HHV (J/g); a CHNS/O elemental analyzer for wt.% of C, N, H, S, and O; and a thermogravimetric analyzer for wt.% ash content. The main objectives of this research are:
  • To develop PLSR models using no preprocessing, traditional preprocessing, multi-preprocessing 5-range and 3-range methods, GA, and SPA for assessing biomass properties for energy usage by employing NIRS.
  • To compare the performance of the PLSR models based on R2C, RMSEC, R2P, RMSEP, RPD, and bias.
  • To select the better performing PLSR-based model for each parameter and establish it as a reliable and non-destructive alternative method for rapidly assessing biomass properties for energy usage.
The research outcomes of this study have practical applications in real life. The developed model offers a rapid, reliable, and non-destructive alternative to traditional laboratory methods for assessing biomass properties. This benefits biomass traders in determining a fair price based on actual energy properties, rather than relying solely on volume or weight. Industries relying on biomass for energy can optimize system efficiency and cost-effectiveness through informed feedstock selection. The model is applicable for process monitoring and quality control in biomass-based energy production facilities. This facilitates real-time adjustments by engineers and operators, ensuring consistent and efficient energy production. Policymakers, energy companies, and researchers can utilize these findings for the proper identification, management, and utilization of bio-resources to meet future energy demands. Moreover, the research outcomes pave the way for NIR-based research in various fields to adopt or enhance similar approaches.

2. Materials and Methods

Figure 1 shows the overall research methodology for the evaluation of HHV and ultimate analysis parameters of ground biomass for energy usage using NIRS combined with PLSR.

2.1. Sample Preparation

The biomass samples were collected from the Terai low flatland and mid-hill regions of Nepal, with altitudes ranging from 86 to 1940 m above sea level. The study included five fast-growing species: (1) Alnus nepalensis, (2) Pinus roxburghii, (3) Bambusa vulgaris, (4) Bombax ceiba, and (5) Eucalyptus camaldulensis. Also included were five agricultural residues: (1) Zea mays (cob), (2) Zea mays (shell), (3) Zea mays (stover), (4) Oryza sativa, and (5) Saccharum officinarum. Alnus nepalensis and Pinus roxburghii were collected from the mid-hill region; Bombax ceiba, Eucalyptus camaldulensis, and Saccharum officinarum were collected from the Terai region; and Zea mays (cob, shell, stover), Bambusa vulgaris, and Oryza sativa were collected from both the Terai and the mid-hill region of Nepal.
During preparation, all collected samples except for Oryza sativa were manually chopped into smaller pieces, i.e., less than 30 mm × 15 mm (refer to Figure 2a); dried in the open sun; and stored in an airtight aluminum bag to maintain their biomass properties by preventing the exchange of air and moisture during transport to the Near-Infrared Spectroscopy Research Center for Agricultural Product and Food at School of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Thailand. The samples were ground using a multi-functional high-speed disintegrator (WF-04, Thai grinder, Thailand). The particle size of the ground biomass was evaluated at the Scientific and Technological Research Equipment Center (STREC) at Chulalongkorn University, Bangkok, Thailand, using a Mastersizer 3000 instrument (MAL1099267, Hydro MV). Figure 3 shows the representative particle size distribution of the ground biomass used in this research, ranging from 0.01 to 3080 µm. The ground samples were stored in airtight plastic ziplock bags before and during the experiment.

2.2. Spectral Data Collection

As shown in Figure 2c,d, the ground biomass samples were placed in a glass vial (20 mm diameter and 48 mm height) and scanned using an FT-NIR spectrometer (MPA, Bruker, Ettlingen, Germany) in transflectance mode at a controlled temperature of 25 ± 2 °C. The spectrometer operated at a resolution of 16 cm−1, with 32 scans averaged for both the background and the sample, and logged absorbance as log(1/R) over the wavenumber range of 3595 to 12,489 cm−1, where R is the diffuse reflectance detected from the ground biomass sample. Prior to scanning, the FT-NIR spectrometer was normalized by performing a gold plate background scan. The primary purpose of performing a background scan for every new ground sample was to compensate for instrumental drift and ambient environmental influences, such as temperature, light, and relative humidity, on the measurement setup [12].
All the ground samples were scanned twice without changing their positions, with no NIR leakage occurring during scanning. The average absorbance value of each sample at each wavenumber was used as the spectroscopic data for model development. Figure 4a shows the raw spectra of the ten different ground biomasses within the wavenumber range from 3594.87 to 12,489.48 cm−1, which were used to evaluate the HHV and ultimate analysis parameters.
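As a rough illustration of the data handling described above, the following Python sketch converts reflectance to absorbance as log(1/R), averages two repeated scans point by point, and converts a wavenumber to a wavelength. The base-10 logarithm and all function names are assumptions made for illustration; they are not taken from the instrument software.

```python
import math


def absorbance(reflectance):
    """Absorbance as log10(1/R); base 10 is assumed here."""
    if reflectance <= 0:
        raise ValueError("reflectance must be positive")
    return math.log10(1.0 / reflectance)


def average_scans(scan_a, scan_b):
    """Point-by-point mean of two scans of the same sample."""
    if len(scan_a) != len(scan_b):
        raise ValueError("scans must have the same number of points")
    return [(a + b) / 2.0 for a, b in zip(scan_a, scan_b)]


def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1
```

With this conversion, the scanned range of 3595 to 12,489 cm−1 corresponds to roughly 2782 to 801 nm, which helps when comparing it with the nanometer ranges quoted for other instruments.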

2.3. Reference Analysis

Due to the complex nature of NIR absorbance data, it must be correlated with reference values obtained using a standard laboratory method [29]. Thus, the reference data, which include HHV, C, H, N, S, and O, were measured after the samples had been scanned with the FT-NIR spectrometer.

2.3.1. Higher Heating Value (HHV)

The HHV of the ground biomass was measured using the isoperibol method with an automatic bomb calorimeter (IKA C 200, Staufen, Baden-Württemberg, Germany). Before the start of the experiment, the bomb calorimeter was calibrated with two tablets of benzoic acid (IKA C 723), each with a total weight of 1.0092 g and a gross calorific value of 26,462 J/g. To verify the calibration, the test was repeated with a single tablet of benzoic acid and the results were compared. A cotton thread (IKA C 170.4) with a gross calorific value of 50 J/cotton twist was used for ignition in the bomb to measure the HHV of the ground sample. To ensure that the space in the bomb was saturated with water vapor throughout the entire experiment period, 2 mL of aqua pro (IKA C5003.1) were added into 1 L of water and poured into the bomb calorimeter vessel [14]. The HHV (J/g) of each ground sample was measured in duplicate, and the average value was used as reference data for model development. A quantity of 0.5 ± 0.2 g of ground sample was weighed using an electronic balance (Mettler Toledo JS1203C) with a resolution of 0.0001 g. Including preparation, the total experimental time to measure HHV for each sample was approximately 40 min.

2.3.2. Ultimate Analysis

The ultimate analysis includes quantification of wt.% of C, H, N, S, and O on a dry basis in the ground biomass to determine the major elemental composition. The wt.% of C, H, N, and S in the ground sample were measured using the CHNS/O analyzer (Thermo ScientificTM FLASH 2000, Waltham, MA, USA). The wt.% of ash content in the ground biomass was measured using the thermogravimetric analyzer (TG 209 F3 Tarsus, Netzsch, Bavaria, Germany). The wt.% of O on a dry basis in the ground biomass sample was calculated as a difference [30]:
$$\text{wt.\% O} = 100 - (\text{wt.\% C} + \text{wt.\% H} + \text{wt.\% N} + \text{wt.\% S} + \text{wt.\% Ash}) \tag{1}$$
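The oxygen-by-difference calculation can be sketched in a few lines of Python. The function name and the example composition are hypothetical and serve only to show the arithmetic; the range check is an added safeguard, not part of the source method.

```python
def oxygen_by_difference(c, h, n, s, ash):
    """Return wt.% O on a dry basis as 100 minus the other measured fractions."""
    total = c + h + n + s + ash
    if not 0.0 <= total <= 100.0:
        raise ValueError("C + H + N + S + ash must lie between 0 and 100 wt.%")
    return 100.0 - total


# Hypothetical composition for illustration only (not a measured sample)
print(oxygen_by_difference(c=48.0, h=6.0, n=0.5, s=0.0, ash=3.5))
```

Because O is obtained as a remainder, any measurement error in C, H, N, S, or ash carries over directly into the O value.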

2.3.3. Outlier and Standard Error of Laboratory

Outliers for all the measured reference data were identified using the following equation, where $X_i$ is the measured value of sample i, and $\bar{X}$ and SD are the average and standard deviation of the measured values of all samples, respectively:
$$\frac{(X_i - \bar{X})}{SD} \geq \pm 3 \tag{2}$$
If Equation (2) is satisfied for any sample i, the sample is considered an outlier and is excluded from the total dataset for model development [31].
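A minimal Python sketch of this screening rule is given below. It flags any sample whose deviation from the mean is at least three standard deviations in either direction. The use of the sample standard deviation (n − 1 denominator) and the function names are assumptions, since the source does not state them.

```python
import statistics


def find_outliers(values, limit=3.0):
    """Return the indices of samples at least `limit` SDs away from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        return []
    return [i for i, x in enumerate(values) if abs(x - mean) / sd >= limit]


def remove_outliers(values, limit=3.0):
    """Return the values with flagged samples dropped."""
    flagged = set(find_outliers(values, limit))
    return [x for i, x in enumerate(values) if i not in flagged]
```

Note that with very small datasets no point can reach the limit of 3, because the largest possible standardized deviation grows with the number of samples.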
Similarly, the standard error of laboratory (SEL), which explains the precision of the reference method, was calculated for the bomb calorimeter and CHNS/O elemental analyzer using the following equation, where y1 and y2 are the replicates of each sample reference value measurement and NT is the total number of experiment samples:
$$SEL = \sqrt{\frac{\sum_{i=1}^{N_T} (y_1 - y_2)^2}{N_T}} \tag{3}$$

2.4. Spectral Preprocessing

Spectral preprocessing is one of the most important components of NIR calibration. Ten different varieties of ground biomass samples were scanned to collect spectral data, whose physical, chemical, and biological properties may vary from sample to sample. Although the raw spectra of all the biomass samples appear similar, instrumental errors, variations in light scattering during sample scanning, and a large number of redundant and interfering variables can introduce unwanted and harmful signals into the spectrum (refer to Figure 4a). To improve spectral features, it is important to remove noise, address overlapping peaks and baseline shifts, handle collinearity within the spectral data, and enable easy data interpretation for calibration [32,33]. NIR spectral preprocessing is therefore necessary before model development.
To date, models have been developed using a traditional preprocessing approach (refer Figure 4b) on the entire available wavelength range for the prediction and evaluation of respective samples. However, there has been a lack of exploration regarding the pretreatment of raw spectra by employing different preprocessing techniques on distinct sections of the entire wavelength range. It is thought that a multi-preprocessing approach, i.e., a unique preprocessing technique that divides the entire spectrum into different sections using different spectral preprocessing methods based on random pairs, will improve the assessment of the biomass properties for energy usage using NIRS. Based on this hypothesis, this study introduced a novel multi-preprocessing approach: the 5-range and 3-range methods (refer to Figure 4c,d) as unique components to improve the assessment of biomass properties using NIRS. The research outcomes from the multi-preprocessing technique with PLSR will serve as a pivotal milestone in the research and development of NIRS. This will benefit NIRS-related research from diverse fields by permitting the upgrading of existing models and their effective utilization in various applications.
Therefore, in this study, the raw spectrum was subjected to two distinct pretreatment approaches. The first approach adhered to the traditional methodology, entailing the application of a single spectral preprocessing method to the entire wavenumber range (3595 to 12,489 cm−1). Meanwhile, the second approach introduced a novel and innovative multi-preprocessing technique, whereby the entire wavenumber range was partitioned into multiple sections and underwent pretreatment using a comprehensive combination of various preprocessing methods. For the traditional approach, ten different types of spectrum pretreatment methods were used for the calibration models. These included (1) first derivative (segment = 5 and gap = 5), (2) second derivative (segment = 5 and gap = 5), (3) constant offset, (4) SNV, (5) MSC, (6) vector normalization, (7) min-max normalization, (8) mean centering, (9) first derivative (segment = 5 and gap = 5) + vector normalization, and (10) first derivative (segment = 5 and gap = 5) + MSC.
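To make two of the listed pretreatments concrete, the sketch below shows textbook forms of SNV and MSC in plain Python. These are generic definitions written for illustration, not the code used in the study; the choice of the sample standard deviation in SNV and of the mean spectrum as the MSC reference are assumptions.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    reference = [statistics.mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = statistics.mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = statistics.mean(s)
        # Least-squares fit of the spectrum against the reference: s = intercept + slope * reference
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected
```

SNV works on each spectrum independently, whereas MSC depends on the whole set of spectra through the reference, so MSC-corrected values change whenever samples are added to or removed from the set.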
For the multi-preprocessing approach, the entire wavenumber range was divided into different sections and pretreated with various pretreatment combination sets obtained from seven different preprocessing methods, as indicated by the following markings: 0 = empty (all the absorbance values = 0), 1 = raw spectra, 2 = SNV, 3 = MSC, 4 = first derivative (5,5), 5 = second derivative (5,5), and 6 = constant offset.
For the multi-preprocessing 5-range method (refer to Figure 4c), the following procedures were adopted:
(1) Dividing the entire wavenumber range equally into five sections: 3625.72–5392.30 cm−1, 5400.02–7166.59 cm−1, 7174.31–8940.89 cm−1, 8948.60–10,715 cm−1, and 10,722.9–12,489.48 cm−1. However, since the 1154 variables between 3594.87 and 12,489.48 cm−1 cannot be divided equally by 5, the last four independent variables were excluded from the total dataset, resulting in 1150 of the 1154 variables being considered for model development.
(2) Generating all possible combinations of multi-preprocessing sets from 0 to 6.
(3) Selecting the most effective multi-preprocessing combination by evaluating different numbers of random pairs to develop the PLSR-based model.
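The arithmetic behind the five sections can be checked with a short Python sketch. It assumes an evenly spaced wavenumber axis between the two reported end points and assumes that the four excluded variables sit at the low-wavenumber end, which is what the reported section limits (starting at 3625.72 cm−1) suggest; both points are assumptions rather than statements from the source.

```python
N_VARIABLES = 1154
N_SECTIONS = 5
FIRST_WAVENUMBER = 3594.87   # cm^-1, lowest recorded point
LAST_WAVENUMBER = 12489.48   # cm^-1, highest recorded point

# Evenly spaced axis in ascending order (assumed spacing)
step = (LAST_WAVENUMBER - FIRST_WAVENUMBER) / (N_VARIABLES - 1)
axis = [FIRST_WAVENUMBER + i * step for i in range(N_VARIABLES)]

# Drop the leftover points so that the count is divisible by the number of sections
n_dropped = N_VARIABLES % N_SECTIONS
kept = axis[n_dropped:]
points_per_section = len(kept) // N_SECTIONS

sections = [kept[k * points_per_section:(k + 1) * points_per_section]
            for k in range(N_SECTIONS)]

for section in sections:
    print(f"{section[0]:.2f} to {section[-1]:.2f} cm^-1 ({len(section)} points)")
```

Under these assumptions each section holds 230 variables, and the printed limits agree with the reported section boundaries to within rounding.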
Similarly, for the multi-preprocessing 3-range method (refer to Figure 4d), the following procedures were adopted:
(1) Dividing the entire wavenumber range into three sections: 3594.87–5492.59 cm−1, 5500.30–7498.31 cm−1, and 7506.02–12,489.48 cm−1.
(2) Generating all possible combinations of multi-preprocessing sets from 0 to 6.
(3) Selecting the most effective multi-preprocessing combination by evaluating different numbers of random pairs to develop the PLSR-based model.
Figure 4c,d shows the spectra of the ground biomass obtained with the multi-preprocessing 5-range and 3-range methods, respectively. In Figure 4c, the raw spectrum was pretreated with the preprocessing combination set of 3, 0, 1, 0, and 1, i.e., MSC from 3625.72–5392.30 cm−1, empty from 5400.02–7166.59 cm−1, raw spectra from 7174.31–8940.89 cm−1, empty from 8948.60–10,715 cm−1, and raw spectra from 10,722.9–12,489.48 cm−1. Similarly, in Figure 4d, the raw spectrum was pretreated with the preprocessing combination set of 4, 4, and 6, i.e., second derivative from 3594.87–5492.59 cm−1, first derivative from 5500.30–7498.31 cm−1, and constant offset from 7506.02–12,489.48 cm−1. The best combination set for multi-preprocessing is determined by the optimum LVs obtained from full cross-validation.
MATLAB-R2020b (MathWorks, Natick, MA, USA) built-in code was used to select the optimal combination set of multi-preprocessing methods for developing a PLSR calibration model.
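The idea of coding each section with a number from 0 to 6 and treating the sections separately can be sketched in Python as follows. This is an illustrative sketch, not the MATLAB code used in the study. It assumes that each method is applied to its section in isolation, it implements only codes 0, 1, 2, and 6 to stay short, and the constant-offset definition (shifting the section so that its minimum is zero) is an assumption. All function names are hypothetical.

```python
import itertools
import random
import statistics


def empty(section):
    return [0.0] * len(section)          # code 0: all absorbance values set to 0


def raw(section):
    return list(section)                 # code 1: raw spectrum, unchanged


def snv(section):
    mean = statistics.mean(section)      # code 2: standard normal variate
    sd = statistics.stdev(section)
    return [(x - mean) / sd for x in section]


def constant_offset(section):
    low = min(section)                   # code 6: assumed to shift the minimum to zero
    return [x - low for x in section]


# Codes 3 (MSC), 4 and 5 (derivatives) are omitted to keep the sketch short.
METHODS = {0: empty, 1: raw, 2: snv, 6: constant_offset}


def apply_combination(spectrum, codes):
    """Split a spectrum into len(codes) equal sections and treat each one separately.

    Points left over after the equal split are discarded.
    """
    size = len(spectrum) // len(codes)
    treated = []
    for k, code in enumerate(codes):
        treated.extend(METHODS[code](spectrum[k * size:(k + 1) * size]))
    return treated


# Every possible 5-range combination set of the seven codes
all_sets = list(itertools.product(range(7), repeat=5))

# A random subset of candidate sets to evaluate (subset size chosen arbitrarily here)
random.seed(0)
candidates = random.sample(all_sets, 10)
```

With seven codes and five sections there are 7^5 = 16,807 possible sets, which is why evaluating a random subset, rather than every set, keeps the search manageable.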

2.5. Model Development

The accuracy of the model is one of the major concerns of NIRS. Accuracy can be improved by using different spectral pretreatments and appropriate data analysis methods. Various research articles related to NIRS modeling have concluded that PLSR is one of the most effective and commonly used quantitative analysis techniques [14,34,35,36]. Therefore, this study proposes PLSR-based models that can handle highly collinear spectroscopic data [37] for the assessment of grounded biomass properties. In this study, the following models were developed to match its objectives: (1) full wavenumber range–PLSR with no preprocessing and traditional preprocessing techniques, (2) multi-preprocessing PLSR 3-range method, (3) multi-preprocessing PLSR 5-range method, (4) GA-PLSR, and (5) SPA-PLSR.
To develop PLSR models using different methods, the total data obtained after removing outliers was manually divided into an 80% calibration set and a 20% validation set, as illustrated in Figure 1. The total data set consists of ten different varieties of biomass comprised of five fast-growing trees and five agricultural residues. Therefore, it is crucial to stratify the total dataset to ensure both the calibration and validation sets encompass representative samples, covering the entire range of variation within the overall sample population. Allocating 80% of the total dataset as the calibration set, which includes the maximum and minimum reference values, ensures proportional representation of all biomass varieties in the model development process. This approach reduces bias, facilitates effective learning of underlying patterns and relationships, and helps prevent issues, such as overfitting or underfitting, to generate a regression model [33]. The calibration set was first subjected to full cross-validation to select the optimal number of LVs. This number ensures the smallest possible standard error for data analysis—considering too few LVs leads to underfitting, and considering too many LVs leads to overfitting. If several LVs show similar or comparatively better model performance, the smallest number of LVs was selected for model development [38]. The PLSR models for assessing biomass properties for energy usage were created using in-house code in MATLAB-R2020b (Mathworks, USA).
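The source describes a manual 80/20 division, so the following Python sketch is only one possible automated rule that respects the stated constraint of keeping the minimum and maximum reference values in the calibration set. It ranks the samples by reference value and sends the middle sample of every full block of five to validation. It does not stratify by biomass variety, which would need an extra grouping step. The function name is hypothetical.

```python
def split_calibration_validation(reference_values, block=5):
    """Return (calibration_indices, validation_indices) for a roughly 80/20 split."""
    order = sorted(range(len(reference_values)), key=lambda i: reference_values[i])
    n_full_blocks = len(order) // block
    calibration, validation = [], []
    for rank, index in enumerate(order):
        # The middle sample of each full block of five goes to validation,
        # so the lowest and highest ranked samples always stay in calibration.
        in_full_block = rank // block < n_full_blocks
        if in_full_block and rank % block == block // 2:
            validation.append(index)
        else:
            calibration.append(index)
    return calibration, validation
```

For 200 samples this rule yields 160 calibration and 40 validation samples, with the validation values spread across the whole reference range.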
GA and SPA are the wavelength selection methods that select the highly influential wavenumbers from the spectra and have been shown to provide better performance when combined with PLSR compared to PLSR with the full wavenumber range only, thus avoiding overfitting [39,40,41]. SPA selects the variables with minimum collinearity and assesses them based on the value of the root mean square error obtained from the validation set. In SPA, uninformative variables are eliminated until the model’s performance no longer increases [42]. GA selects variables with a minimum amount of redundant information, starting with one variable and adding a new one to the loop in each iteration, thereby maximizing its fitness. The model developed with GA-PLSR shows the lowest prediction error as it maximizes the fitness and co-variance between the spectral and reference data [43,44]. In GA-PLSR and SPA-PLSR, the new calibration dataset was processed through full-cross validation to select the optimum LVs, which were then considered for PLSR model development.
The accuracy of the NIR model should be compared with the reference method. Therefore, the performance of the model was determined in terms of R2c, RMSEC, R2P, RMSEP, RPD, and bias [45]. These parameters can be calculated as follows, where $y$ is the measured value, $\hat{y}$ is the predicted value, i is the subscript used to indicate the number of the sample, $\bar{y}$ is the mean of the measured values, $N_T$ is the number of samples, SD is the standard deviation of the measured values of the validation set, and n is the number of samples in the validation set:
$$R^2_c,\; R^2_p = 1 - \frac{\sum_{i=1}^{N_T} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N_T} (y_i - \bar{y})^2} \tag{4}$$
$$RMSEC,\; RMSEP = \sqrt{\frac{\sum_{i=1}^{N_T} (y_i - \hat{y}_i)^2}{N_T}} \tag{5}$$
$$RPD = \frac{SD}{RMSEP} \tag{6}$$
$$Bias = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)}{n} \tag{7}$$
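The performance statistics defined above can be sketched in plain Python as follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for SD in the RPD is an assumption, since the source does not specify it.

```python
import math
import statistics


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = statistics.mean(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error (RMSEC or RMSEP, depending on the dataset passed in)."""
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    return math.sqrt(ss_res / len(measured))


def bias(measured, predicted):
    """Mean difference between measured and predicted values."""
    return sum(y - p for y, p in zip(measured, predicted)) / len(measured)


def rpd(measured, predicted):
    """Ratio of the SD of the measured values to the RMSE of prediction."""
    return statistics.stdev(measured) / rmse(measured, predicted)
```

Passing the calibration set gives R2c and RMSEC, and passing the validation set gives R2P, RMSEP, RPD, and bias.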
The better model was selected based on the tradeoff value between the highest R2c, R2P, and RPD and the lowest RMSEC, RMSEP, and bias. In this study, the performance results, namely the R2 and RPD value, were interpreted based on the recommendations of Williams et al. (2019) [46] and Zornoza et al. (2008) [47], respectively.
As per the recommendations of Williams et al. (2019), R2 up to 0.25 are not usable for NIRS calibration; 0.26–0.49 indicates poor calibration, and reasons for this should be researched; 0.50–0.64 is considered okay for rough screening; 0.66–0.81 is okay for rough screening and some other appropriate calibrations; 0.83–0.90 is usable with caution for most applications, including research; 0.92–0.96 is usable in most applications, including quality assurance; and 0.98+ is excellent and can be used in any application [46]. Similarly, according to Zornoza et al. (2008), an RPD value of less than 2 is considered insufficient for applications; RPD between 2 and 2.5 makes approximate quantitative predictions possible; RPD values between 2.5 and 3 are considered good for prediction; and RPD greater than 3 indicates an excellent prediction [47].
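The two interpretation scales can be written as simple lookup functions, sketched below in Python. The quoted R2 bands leave small gaps (for example between 0.64 and 0.66), and the RPD bands do not say which class the boundary values 2, 2.5, and 3 belong to. This sketch assigns gap values to the lower band and boundary values as commented, which is an assumption rather than part of either guideline.

```python
def interpret_r2(r2):
    """Map R2 to the guideline wording attributed to Williams et al. (2019)."""
    bands = [
        (0.98, "excellent, usable in any application"),
        (0.92, "usable in most applications, including quality assurance"),
        (0.83, "usable with caution for most applications, including research"),
        (0.66, "rough screening and some other appropriate calibrations"),
        (0.50, "rough screening"),
        (0.26, "poor calibration"),
    ]
    for lower_bound, label in bands:
        if r2 >= lower_bound:   # values in the gaps fall to the lower band (assumption)
            return label
    return "not usable"


def interpret_rpd(rpd):
    """Map RPD to the guideline wording attributed to Zornoza et al. (2008)."""
    if rpd > 3.0:
        return "excellent prediction"
    if rpd >= 2.5:              # boundary handling is an assumption
        return "good prediction"
    if rpd >= 2.0:
        return "approximate quantitative prediction"
    return "insufficient"
```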

3. Results and Discussion

3.1. Comparison of Near-Infrared Spectra of Ground Biomass with Pure Cellulose and Lignin

The energy potential and conversion efficiency of fast-growing trees and agricultural residues can be influenced by the composition of lignocellulosic matter [48]. Figure 5 compares the near-infrared spectra of pure cellulose and pure lignin with 90 samples of fast-growing trees and 110 samples of agricultural residues, all exhibiting average absorbance values. The figure reveals that the vibration band between approximately 5181–6150 cm−1 corresponds to the lignin band (with low absorbance for cellulose), while the range between approximately 6150–6800 cm−1 corresponds to the cellulose band (with low absorbance for lignin) [34]. Notably, the spectra of fast-growing trees and agricultural residues exhibit distinct peaks resembling those of both pure cellulose and pure lignin at approximately 4019 cm−1, 4405 cm−1, 4762 cm−1, 5181 cm−1, and 6897 cm−1. This resemblance of distinct peaks provides strong evidence that the ground biomass of fast-growing trees and agricultural residues contains cellulose and lignin.
The peak at 4019 cm−1 results from the combination of C-H stretching and C-C stretching in cellulose, whereas the peak at 4405 cm−1 corresponds to the combination of O-H stretching and C-O stretching in cellulose. The peak at 4762 cm−1 corresponds to the combination of O-H bending and C-O stretching in polysaccharides. The peak at 5181 cm−1 corresponds to the combination of O-H stretching and HOH bending in polysaccharides. The peak at 6897 cm−1 corresponds to the first overtone of the fundamental O-H stretching band in water and starch [49].
The cellulose and lignin content in biomass is a critical factor in determining its HHV. Biomass with a higher lignin content and lower cellulose content exhibits an improved HHV, and vice versa [50]. This finding confirms the suitability of the selected fast-growing tree and agricultural residue varieties for various applications that rely on lignocellulose matter. These applications include biomass trading for direct combustion, biomass pellet production, the paper and pulp industry, biomass-based construction and building material, and bioenergy and biofuel production, among others.

3.2. Higher Heating Value and Ultimate Analysis in Ground Biomass

Figure 6 displays a histogram of the HHV and ultimate analysis values used in the calibration set and validation set for the development of various PLSR-based models. The calibration set is represented by the blue color, while the validation set is represented by the red color. Equation (2) was employed to calculate the outliers, and any data points that satisfied the defined relation were excluded from the model development process.
The normal distribution of all the reference datasets, i.e., HHV (J/g) and wt.% of C, N, H, and O, on a dry basis—used for model development—was analyzed using SPSS 16.0. A histogram analysis revealed that all the datasets exhibit a bell-shaped normal distribution. This suggests that the data points are clustered around the mean value, demonstrating a nearly normal distribution. Additionally, the calculated standard deviations of these datasets (refer to Table 1) were found to be low, further indicating that the data points are closely packed around the mean.
Furthermore, a one-sample Kolmogorov–Smirnov test was performed using SPSS 16.0 to calculate the p-values for HHV (J/g) and wt.% of C, N, H, and O, resulting in values of 0.704, 0.060, 0.368, 0.565, and 0.119, respectively. Since all obtained p-values are greater than the significance level of 0.05, the null hypothesis of normality is not rejected, and the reference data used for modelling are considered to be normally distributed.
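The normality check described above can be illustrated with a short Python sketch. It is not the SPSS procedure used in this study: the function name `ks_normality` and the synthetic values are assumptions made for illustration, and because the mean and standard deviation are estimated from the same sample, the plain Kolmogorov–Smirnov p-value should be read as approximate.

```python
import numpy as np
from scipy import stats

def ks_normality(values, alpha=0.05):
    """One-sample Kolmogorov-Smirnov test against a normal distribution
    whose mean and SD are estimated from the sample (an assumption)."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std(ddof=1)  # standardize
    result = stats.kstest(z, "norm")                   # compare with N(0, 1)
    # Normality is "not rejected" when the p-value exceeds alpha
    return result.statistic, result.pvalue, bool(result.pvalue > alpha)

# Hypothetical, deterministic stand-in for 200 reference values (J/g):
# evenly spaced normal quantiles, so the sample is close to normal by design.
n = 200
probs = (np.arange(1, n + 1) - 0.5) / n
hhv_like = 18000 + 800 * stats.norm.ppf(probs)
stat, p_value, not_rejected = ks_normality(hhv_like)

# A clearly non-normal (two-valued) sample for contrast
two_point = np.array([0.0] * 100 + [1.0] * 100)
_, p_two_point, two_point_not_rejected = ks_normality(two_point)
```

A p-value above 0.05 means only that the test found no evidence against normality; it does not prove that the data are normal.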
The findings regarding the normal distribution, low standard deviations, and the concentration of data points around the mean value support the validity and reliability of the model developed in this research.
Table 2 presents the average HHV, ultimate analysis parameters (C, N, H and O), and ash content of different fast-growing trees and agricultural residues that were included as reference data for developing the model. The HHV is measured in J/g, and the ultimate analysis parameters and ash content are expressed as wt.% on a dry basis. In the case of the biomass samples analyzed using the CHNS/O analyzer (Thermo Scientific FLASH 2000), no sulfur content was detected. This could be attributed to the typically low levels of sulfur present in biomass, which may fall below the lower limit of detection of the analyzer. Therefore, for the purpose of this study, the sulfur content in the ground biomass is assumed to be zero and has not been considered in the model development. The wt.% of O was then calculated using Equation (1).
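Equation (1) is not reproduced in this section. The sketch below therefore assumes the common "by difference" form, in which oxygen is what remains after subtracting C, H, N, S, and ash from 100 wt.% on a dry basis. The function name and the sample values are hypothetical and are not taken from Table 2.

```python
def oxygen_by_difference(c, h, n, ash, s=0.0):
    """Estimate wt.% of O on a dry basis by difference.

    Assumed form (not confirmed by this section):
        O = 100 - (C + H + N + S + ash)
    Sulfur defaults to zero, as assumed in this study.
    """
    o = 100.0 - (c + h + n + s + ash)
    if o < 0.0:
        raise ValueError("Component percentages sum to more than 100 wt.%")
    return o

# Hypothetical composition in wt.% (dry basis), for illustration only
o_content = oxygen_by_difference(c=47.0, h=6.0, n=0.5, ash=3.5)
```

Because oxygen is obtained by subtraction, any measurement error in C, H, N, or ash carries over directly into the O value.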
As per previous research, the HHV of biomass is positively correlated with C and H contents, while it is negatively correlated with O and N contents [51]. Table 2 indicates that fast-growing trees have higher average values of HHV, C, and H contents and lower O and N contents compared to agricultural residues. These results are consistent with the correlation observed between the measured data of the HHV and elemental composition of ground biomass.
Table 1 shows the statistical summary data for the HHV (J/g) and ultimate analysis parameters, i.e., wt.% of C, N, H, and O on a dry basis, used in the calibration set and validation set for the model development.

3.2.1. Higher Heating Value

Out of the 200 samples, 4 were identified as outliers and were removed from the total data set to develop PLSR-based models for evaluating the HHV. The SEL for the bomb calorimeter used to evaluate HHV was calculated to be 255.7708 J/g. Table 3 displays the optimal result of various PLSR-based models using the full wavenumber range (3594.87–12,489.5 cm−1) to evaluate the HHV of the ground biomass from the fast-growing trees and agricultural residues.
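As a reading aid, the fitting of a single-response PLSR model with a fixed number of latent variables (LVs) can be sketched as follows. This is a generic NIPALS-style PLS1 written in NumPy, not the chemometric software used in this study. The function name `pls1_fit` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model and return (regression coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # mean-centered, deflated in loop
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                          # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xr @ w                             # scores of this latent variable
        tt = t @ t
        p = Xr.T @ t / tt                      # X loadings
        q_a = yr @ t / tt                      # y loading
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - q_a * t                      # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # coefficients in original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic example: 30 "spectra" with 5 variables and an exactly linear response
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 3.0, -1.0]) + 3.0

coef_full, b_full = pls1_fit(X_demo, y_demo, n_components=5)
coef_one, b_one = pls1_fit(X_demo, y_demo, n_components=1)
rmse_full = np.sqrt(np.mean((X_demo @ coef_full + b_full - y_demo) ** 2))
rmse_one = np.sqrt(np.mean((X_demo @ coef_one + b_one - y_demo) ** 2))
```

In practice the number of LVs is chosen by validation rather than fixed in advance, since too many LVs fit noise in the calibration spectra.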
Figure 7a shows the scatter plot of HHV measured and predicted values from the calibration and validation sets using GA-PLSR. The GA-PLSR with 14 LVs and spectral pretreatment first derivative using 692 important wavenumbers yielded the best performance results, with an R2C of 0.9505, RMSEC of 188.0117 J/g, R2P of 0.9574, RMSEP of 170.3282 J/g, RPD of 4.89, and bias of −21.9648 J/g. The model included a sufficient number of homogenous samples, from both fast-growing trees and agricultural residues, for model development and had a wider HHV range, resulting in higher R2C, R2P, and RPD, and lower RMSEC and RMSEP values compared to other models. Compared to the full-PLSR model performance, the GA improved the PLSR model accuracy by 8.5069%. Similarly, the multi-preprocessing 5-range method improved the accuracy of the full-PLSR model by 4.1839%. According to Williams et al. (2019) [46] and Zornoza et al. (2008) [47], the GA-PLSR model for evaluating HHV is acceptable for most applications with excellent prediction, including quality assurance.
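The validation statistics quoted above can be computed from measured and predicted values as in the following sketch. The definitions are common chemometric conventions and are assumptions here: bias is taken as the mean of predicted minus measured values, R2 as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the measured values divided by the bias-corrected standard error of prediction (SEP). The paper's exact formulas may differ, and the function name is hypothetical.

```python
import numpy as np

def prediction_metrics(y_measured, y_predicted):
    """Return R2, RMSEP, SEP, bias, and RPD for a validation set."""
    y = np.asarray(y_measured, dtype=float)
    y_hat = np.asarray(y_predicted, dtype=float)
    residual = y_hat - y                        # assumed sign convention
    bias = residual.mean()
    rmsep = np.sqrt(np.mean(residual ** 2))
    # SEP: standard deviation of the residuals after removing the bias
    sep = np.sqrt(np.sum((residual - bias) ** 2) / (len(y) - 1))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)
    rpd = y.std(ddof=1) / sep
    return {"r2": r2, "rmsep": rmsep, "sep": sep, "bias": bias, "rpd": rpd}

# Small illustrative data set (not from this study)
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
metrics = prediction_metrics(measured, predicted)
```

RMSEP and bias carry the units of the reference value (J/g for HHV, wt.% for the elements), whereas R2 and RPD are dimensionless, which is why the latter two are used to compare models across properties.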
Figure 8 shows the average absorbance values obtained after preprocessing with the first derivative, highlighting the 692 selected wavenumbers (marked in red) obtained from GA, which is within the full spectral range of 3594.87–12,489.5 cm−1. The figure highlights important peaks in the following ranges: 4003.73–4111.73 cm−1, 4366.3–4451.16 cm−1, 5091.45–5114.59 cm−1, and 5130.02–5292.02 cm−1, which may significantly influence the model performance.
In the range of 4003.73–4111.73 cm−1, the wavenumber 4019 cm−1 represents the combination of C-H stretching and C-C stretching in cellulose and is used as a reference. Similarly, the range of 4366.3–4451.16 cm−1 includes the reference wavenumber 4405 cm−1, which corresponds to the combination of O-H and C-O stretching in cellulose. Polysaccharides are characterized by the combination of O-H stretching and HOH bending, which is represented by the reference wavenumber 5102 cm−1 in the range of 5091.45–5114.59 cm−1. Additionally, the range of 5130.02–5292.02 cm−1 includes the reference wavenumber 5200 cm−1, which corresponds to the combination of O-H stretching and HOH deformation of O-H molecular water [49]. Lignocellulosic biomass derives its primary energy from cellulose, hemicellulose, and lignin [52,53]. As can be seen in Figure 8, the important peaks with vibration bands of C-H, C-C, O-H, and C-O stretching and HOH deformation of O-H molecular water correspond to the structure of cellulose and lignin. Therefore, they are likely to have the greatest influence on the assessment of the HHV of ground fast-growing trees and agricultural residues. This study is in line with previous studies by Sirisomboon et al. [54] and Lestander et al. [55], in which the authors reported that vibration bands of C-H, C-C, and O-H stretching contribute significantly to the HHV of bamboo and biofuels, respectively. Additionally, Zhang et al. [18] reported that the vibration band of C-H stretching in an aromatic CH3 structure can be used to assess the HHV of sorghum biomass. Posom et al. [5] indicated in their study that the vibration of C-H stretching highly influences the prediction of the HHV of Leucaena leucocephala pellets.

3.2.2. Ultimate Analysis

The sulfur content in the ground biomass samples of fast-growing trees and agricultural residues was not detected using the CHNS/O analyzer (Thermo Scientific FLASH 2000). This may be because the S content in the biomass is too low to be detected [56]. Therefore, PLSR-based models for the wt.% of S were not developed in this study.

wt.% of C

The SEL for the CHNS/O elemental analyzer used to evaluate the wt.% of C content in ground biomass was calculated as 1.6936 wt.%. Table 3 shows the overall optimum results of PLSR-based models for the evaluation of wt.% of C content in the ground biomass within the full wavenumber range of 3594.87–12,489.48 cm−1. Out of the 120 samples, 11 samples were identified as outliers and removed from the total dataset for model development. The model developed through GA-PLSR with spectrum preprocessing of first derivative (gap = 5 and segment = 5) and 9 LVs provided better results, with R2C, RMSEC, R2P, RMSEP, RPD, and bias values of 0.7851, 0.9753 wt.%, 0.7217, 0.9740 wt.%, 1.93, and 0.1877 wt.%, respectively. Compared with full-PLSR, the GA-PLSR method improved the model accuracy by 8.5069%. Similarly, the multi-preprocessing 5-range method improved the PLSR model by 8.1842%. The scatter plot of the GA-PLSR method for the wt.% of C content in ground biomass is shown in Figure 7b. According to the recommendation by Williams et al. (2019) [46], the PLSR model with the GA method is usable for rough screening and some other appropriate calibrations, based on the obtained R2 value. Similarly, considering the RPD value, as suggested by Zornoza et al. (2008) [47], the model is acceptable for the prediction of wt.% C content in the ground biomass.
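A first derivative with gap and segment settings, as used for the C model, can be sketched as follows. The exact definition of "gap" and "segment" differs between software packages. This sketch assumes that the spectrum is first smoothed with a moving average over `segment` points and then differenced across `gap` points. It is an illustration, not the implementation used in this study.

```python
import numpy as np

def gap_segment_first_derivative(spectrum, gap=5, segment=5):
    """Gap-segment style first derivative of a single spectrum.

    Assumed definition: moving-average smoothing over `segment` points,
    followed by a difference across `gap` points, scaled per point.
    The output is shorter than the input by (segment - 1 + gap) points.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    kernel = np.ones(segment) / segment
    smoothed = np.convolve(spectrum, kernel, mode="valid")
    return (smoothed[gap:] - smoothed[:-gap]) / gap

# A straight line with slope 2 per point should give a constant derivative of 2
line = 2.0 * np.arange(50)
derivative = gap_segment_first_derivative(line, gap=5, segment=5)
```

The derivative removes constant baseline offsets between spectra, while the segment averaging limits the noise amplification that differencing would otherwise cause.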
Figure 9 shows the average absorbance values obtained after preprocessing with the first derivative, highlighting the 50 selected wavenumbers (marked in red) obtained from the GA, which are within the full spectral range of 3594.87–12,489.5 cm−1. The high peaks with positive values marked in red at a specific wavenumber indicate the functional group, spectra-structure, and material type, which might be significant in the assessment of wt.% of C. In Figure 9, significant peaks can be noticed at 3650, 4019, 4405, 4878, and 7042 cm−1.
The peak at 3650 cm−1 corresponds to the functional group of O-H, the spectral structure with the fundamental stretching vibrational absorption band of O-H (-CH2-OH), and the material type of primary alcohols. The peak at 4019 cm−1 corresponds to the functional group of C-H/C-C, the spectral structure of the C-H stretching and C-C stretching combination, and the material type of cellulose. The positive peaks at 4405 cm−1 and 4878 cm−1 are associated with the functional group O-H/C-H and a combination of N-H/C-N/N-H amide II and amide III; spectral structure O-H stretching and C-O stretching; and N-H in-plane bend, C-N stretching, and N-H in-plane bend combination with material-type cellulose and amides/proteins, respectively. The peak at 7042 cm−1 corresponds to an O-H aromatic with the spectral structure of an O-H first overtone of the fundamental stretching band, as well as the material type of hydrocarbons [49]. Lignin contains a high carbon content [57]. According to Zhang et al. [19], vibration bands related to C-H stretching, CH2, C-H aromatics, O-H stretching, and HOH deformation are essential for predicting the C content of sorghum biomass. Similarly, Posom and Sirisomboon [58] found that N-H stretching, N-H deformation, C-N stretching, O-H stretching, and C-O stretching of starch significantly contribute to the model development of C content in bamboo. The average absorbance plot for wt.% of C shows the peaks at 3650, 4019, 4405, 4878, and 7042 cm−1, which complement the vibration bands reported in previous studies and also the spectra of pure lignin and pure cellulose. While these observed vibration bands at different peaks may have a significant impact on the overall performance of the model, this study suggests that the FT-NIRS may not provide sufficiently high-resolution spectra to create an accurate prediction model for wt.% of C.

wt.% of H

The SEL for the CHNS/O elemental analyzer to evaluate wt.% of H content in ground biomass was calculated as 0.3206 wt.%. The optimal results of different PLSR-based models for evaluation of wt.% of H within the full wavenumber range are presented in Table 3. Before modeling, outliers from the reference values were calculated, and 27 out of the 120 samples were detected as outliers. Therefore, 93 ground biomass samples were used for the model development. The best model was developed from the wavelength selection method, GA-PLSR, within the wavenumber range of 3594.87–12,489.48 cm−1 and with SNV spectral preprocessing. The best performing model for the evaluation of wt.% of H content in the ground biomass produced an R2C of 0.8814, RMSEC of 0.1041 wt.%, R2P of 0.7678, RMSEP of 0.1434 wt.%, RPD of 2.14, and bias of −0.0356 wt.%. The GA-PLSR model exhibits a minimal improvement in model accuracy of 0.0092% compared to the full-PLSR model.
Figure 7c shows the scatter plot of measured versus predicted wt.% of H content in the ground biomass obtained using GA-PLSR. According to Williams et al. (2019) [46], based on the R2 value, the model can be used for rough screening and some other appropriate calibrations. To improve the performance of the model, it is recommended to include additional representative biomass samples with a high concentration and wide range of wt.% of H content that are uniformly and representatively distributed in both the calibration and validation sets and are obtained from both fast-growing trees and agricultural residue varieties.
Figure 10 shows the average absorbance spectrum that was pretreated with the SNV and uses red marks to highlight the important wavenumbers obtained using GA. The important peaks selected at 4019, 4608, 5155, 6897, and 8163 cm−1 may have a significant influence on the performance of the model for the evaluation of wt.% of H content in the grounded biomass samples. The peak at 4019 cm−1 is associated with the functional group of C-H/C-C, and the spectral structure of C-H stretching and C-C stretching combination, with material-type cellulose. The peak at 4608 cm−1 is associated with the combination of C-H stretching and C-H deformation in alkenes. Similarly, the peak at 5155 cm−1 corresponds to a combination of O-H stretching and HOH bending in water. The peak at 6897 cm−1 corresponds to the spectral structure of O-H, arising from the first overtone of the fundamental stretching band, with a material-type starch/polymeric alcohol. The peak at 8163 cm−1 is associated with the second overtone of the C-H fundamental stretching band and material-type hydrocarbons [49]. The selected peaks mostly fall within a similar range compared to the study conducted by Posom and Sirisomboon [58]. This finding supports the results of the current study, indicating that these selected peaks are likely to have a significant influence on the performance of the models.
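The SNV pretreatment applied to the spectra for the H model can be summarized in a few lines. This is the standard textbook definition (each spectrum is centered on its own mean and divided by its own standard deviation) and is given as an illustration. The use of the sample standard deviation (`ddof=1`) is an assumption, as software packages differ on this point.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: scale every spectrum (row) individually."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Two illustrative spectra: the second is the first scaled by 2 and shifted by 0.8,
# mimicking multiplicative and additive scatter effects.
raw = np.array([[0.2, 0.4, 0.6, 0.8],
                [1.2, 1.6, 2.0, 2.4]])
corrected = snv(raw)
```

After SNV the two rows coincide, which shows how the transform removes additive and multiplicative scatter differences between samples, such as those caused by particle size in ground biomass.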

wt.% of O

Based on the assumption that the sulfur content in biomass is zero, as its wt.% is too low to be detected by instruments, the wt.% of O in biomass is calculated using Equation (1). The optimal results of the PLSR-based models for predicting the wt.% of O content in the ground biomass are shown in Table 3. Before modelling, outliers from the reference values were calculated, and 21 out of 120 samples were detected as outliers. Therefore, 99 ground biomass samples were used for the model development. The best result was obtained from the multi-preprocessing PLSR 5-range method with a spectral preprocessing combination set of 3, 2, 4, 6, and 0, i.e., MSC, SNV, first derivative, constant offset, and empty, respectively, from the range 3625.72–12,489.48 cm−1, which is equally divided into five sections. Figure 7d shows the scatter plot for the measured and predicted wt.% of O. With 12 LVs, the best performing model for evaluating the wt.% of O content in the ground biomass produced an R2C of 0.6674, RMSEC of 1.4461 wt.%, R2P of 0.6289, RMSEP of 1.5275 wt.%, RPD of 1.7147, and a bias of −0.4456 wt.%. Compared with full-PLSR, the multi-preprocessing 5-range method improved the model accuracy by 4.0085%. Based on Williams et al. (2019) [46] and Zornoza et al. (2008) [47], the model with the multi-preprocessing PLSR 5-range method is usable only for rough screening. Therefore, to improve the performance of the model, the inclusion of a larger number of representative samples spanning a wide range of oxygen contents is recommended. This will enable the model to capture the variability in oxygen levels across different biomass compositions. Additionally, minimizing instrumental errors through proper calibration and maintenance of the CHNS/O analyzer and thermogravimetric analyzer is essential. Exploring alternative methods for measuring the ash content in the biomass could also contribute to improving the accuracy of wt.% of O predictions.
Figure 11 shows the regression coefficient plot for wt.% of O content in the ground biomass, which is obtained from the multi-preprocessing PLSR 5-range method. Significant peaks were observed at wavenumbers 3650, 5155, 5675, 5952, 6330, and 7042 cm−1. The peak at 3650 cm−1 corresponds to the O-H functional group typically found in primary alcohols. Similarly, the peak at 5155 cm−1 represents a combination of O-H stretching and HOH bending in water. The negative peak at 5675 cm−1 and the positive peak at 5952 cm−1 are associated with the spectra-structure of the first overtone of the fundamental stretching band of C-H, with hydrocarbons, methylene, and aromatic hydrocarbons as the material types, respectively. The peak at 6330 cm−1 corresponds to the functional group of the O-H combination band observed in alcohols, such as R-C-O-H. The peak at 7042 cm−1 corresponds to the first overtone of the fundamental stretching band of O-H, which is typically present in hydrocarbons and aromatic compounds [49]. A previous study by Posom and Sirisomboon [58] showed peaks at similar wavenumbers with vibration bands of C-H aromatic, O-H stretching of alcohol, O-H stretching, and HOH bending of water, which supports the findings of this study. Hence, these vibration bands may have a significant influence on the development of the model for the assessment of wt.% of O in ground biomass.
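The idea of the multi-preprocessing 5-range method, in which the spectrum is divided into equal sections and each section receives its own pretreatment, can be sketched as follows. The code numbers follow the text of this section (0 = none, 2 = SNV, 3 = MSC, 4 = first derivative, 5 = second derivative, 6 = constant offset). Everything else is an assumption for illustration: the helper names, the simple finite-difference derivatives, constant offset taken as subtraction of the section minimum, statistics computed within each section, and equal division by number of points rather than by wavenumber.

```python
import numpy as np

def snv(block):
    mean = block.mean(axis=1, keepdims=True)
    return (block - mean) / block.std(axis=1, ddof=1, keepdims=True)

def msc(block):
    ref = block.mean(axis=0)                    # mean spectrum as reference
    out = np.empty_like(block)
    for i, row in enumerate(block):
        slope, intercept = np.polyfit(ref, row, 1)
        out[i] = (row - intercept) / slope      # undo scale and offset
    return out

def first_derivative(block):
    return np.gradient(block, axis=1)

def second_derivative(block):
    return np.gradient(np.gradient(block, axis=1), axis=1)

def constant_offset(block):
    return block - block.min(axis=1, keepdims=True)

PREPROCESSORS = {0: lambda b: b, 2: snv, 3: msc, 4: first_derivative,
                 5: second_derivative, 6: constant_offset}

def multi_preprocess(spectra, codes):
    """Split the columns into len(codes) equal sections and treat each one."""
    spectra = np.asarray(spectra, dtype=float)
    blocks = np.array_split(spectra, len(codes), axis=1)
    return np.hstack([PREPROCESSORS[c](b) for c, b in zip(codes, blocks)])

# Synthetic spectra: one base curve with sample-specific scale and offset
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0.0, 3.0, 50)) + 1.0
spectra = rng.uniform(0.8, 1.2, (6, 1)) * base + rng.uniform(-0.1, 0.1, (6, 1))

# Combination reported for the O model: MSC, SNV, first derivative,
# constant offset, and no preprocessing
processed = multi_preprocess(spectra, codes=(3, 2, 4, 6, 0))
```

The processed matrix has the same shape as the input and can be passed to a PLSR routine in place of the raw spectra. Searching over all code combinations for five sections is what distinguishes this approach from applying a single pretreatment to the whole range.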

wt.% of N

The SEL of the CHNS elemental analyzer for evaluating the wt.% of N content in ground biomass was calculated as 0.0761 wt.%. Table 3 shows the optimal outcomes of the PLSR-based models for predicting the wt.% of N content in ground biomass. Out of the 120 samples, 25 samples were identified as outliers and removed from the total dataset for model development. The best prediction result of the wt.% of N in ground biomass was obtained using the multi-preprocessing PLSR 5-range method with a spectral preprocessing combination set of 4, 4, 5, 3, and 4, i.e., first derivative, first derivative, second derivative, MSC, and first derivative, respectively, in five equally divided sections from 3625.72–12,489.48 cm−1. Figure 7e shows the scatter plot of the measured versus predicted wt.% of N content in the ground sample using the multi-preprocessing PLSR 5-range method. The best performance for evaluating wt.% of N content in the ground biomass resulted in an R2C of 0.8682, RMSEC of 0.0675 wt.%, R2P of 0.8410, RMSEP of 0.0973 wt.%, RPD of 2.65, and bias of −0.0309 wt.%. Compared with full-PLSR, the multi-preprocessing 5-range method improved the model accuracy by 3.7587%. According to Williams et al. (2019) [46], the model is suitable for most applications, including research. Based on the recommendation of Zornoza et al. (2008) [47], the prediction of wt.% of N content from the multi-preprocessing PLSR 5-range method with an RPD value of 2.65 is considered good for prediction.
Figure 12 shows the regression coefficient plot for wt.% of N content in the grounded biomass obtained from the multi-preprocessing PLSR 5-range method. The figure displays numerous positive and negative high and low peaks. The high peaks at 4019, 4307, 4673, 5200, 5952, 6711, and 12,453 cm−1 might significantly contribute to the evaluation of wt.% of N content. The negative peak at 4019 cm−1 might correspond to a C-H stretching and C-C stretching combination with the material type shown as cellulose. The positive peaks at 4307 cm−1, 4673 cm−1, 5200 cm−1, and 5952 cm−1 might be associated with the structure of a C-H stretching and CH2 deformation combination (material: polysaccharides), C-H stretching and C=O stretching combination and C-H deformation combination (material: lipids), O-H stretching and HOH deformation combination (material: O-H molecular water), and C-H (first overtone of fundamental stretching band) and aromatic C-H (material: hydrocarbons, aromatic), respectively. The peak at 6711 cm−1 might be associated with O-H (first overtone of fundamental stretching band) with the material type shown as starch/polymeric alcohol. The common natures of the peaks were noticed in the range between 11,500 and 12,500 cm−1, for which 12,453 cm−1 is described as a reference, which might correspond to the spectral structure of a C-H combination, with the material type being hydrocarbon and aliphatic [49]. The selected regression coefficient peaks show similar peaks compared to the study performed by Posom and Sirisomboon [58], with vibration bands of C-H stretching, C-C stretching, O-H stretching, and HOH deformation combination. This supports the findings of our study and suggests that these peaks are likely to have a vital influence on the performance of the model.

3.3. Comparison with Previous Work

Although various studies have been conducted on the development of models for evaluating HHV and ultimate analysis parameters using NIRS with a similar wavenumber range and reference mean value combined with chemometrics, no research or reports have been published to date on the application of NIRS and spectral multi-preprocessing techniques for fast-growing trees and agricultural residues of Nepalese biomass, which encompasses ten different biomass varieties.
In a previous study, Nakawajana et al. [59] evaluated the HHV of ground cassava rhizome using PLSR and achieved an R2 of 0.90. Similarly, Nakawajana et al. [34], Posom et al. [3], Zhang et al. [18], and Posom et al. [5] developed PLSR models for rice husk, ground bamboo, sorghum biomass, and Leucaena leucocephala pellets, with R2 values of 0.79, 0.92, 0.96, and 0.96, respectively. All these studies used NIRS scanning of biomass in diffuse reflectance mode. However, the GA-PLSR model in this study outperformed previous research by using NIRS scanning of biomass in transflectance mode for evaluating HHV.
The PLSR-based models developed from multi-preprocessing 5-range methods for ultimate analysis showed better performance in evaluating oxygen content compared to the PLS model developed by Posom and Sirisomboon [58] for bamboo, which had an R2P value of 0.52 for oxygen. However, the results of this study for the evaluation of C, N, and H contents were lower, with Posom and Sirisomboon [58] showing R2P values of 0.80 for C, 0.85 for H, and 0.97 for N for bamboo. Similarly, the models developed by Zhang et al. [18] for sorghum biomass with R2P values of 0.96 for wt.% of C, 0.87 for wt.% of H, 0.86 for wt.% of N, and 0.83 for wt.% of O, and by Huang et al. [10] for straw with R2P values of 0.97 for wt.% of C, 0.77 for wt.% of H, and 0.87 for wt.% of N showed better results than the PLSR-based model in this study. Nhuchhen [60] predicted the ultimate parameters of torrefied biomass with respect to proximate analysis, resulting in R2 values of 0.83 for wt.% of C, 0.70 for wt.% of H, and 0.84 for wt.% of O, respectively. The proposed model in this study showed better performance for H and O, but the performance of C content in the ground biomass could be improved.
In general, having a sufficient number of homogenous biomass samples with a wider range of reference values and low SEL from a bomb calorimeter and CHNS/O elemental analyzer could have played a catalytic role in achieving a higher model performance when evaluating HHV and N. However, the lower model performance for evaluating C, O, and H content may be due to a lower number of relevant variables or the selected variables in the calibration set not having a strong correlation with C, O, and H content in biomass. To enhance the model performance for evaluating C, O, and H content, the number of representative samples with a high concentration of C, O, and H should be increased, and possible contamination during sample preparation should be avoided. Additionally, the ambient environment of the laboratory should be properly controlled, and possible NIR radiation leakage during sample scanning should be rechecked. Outliers should be addressed properly, instrumental and analysis errors should be monitored correctly, or alternative modeling techniques should be considered for accurate evaluation.
Based on a comparison with previous studies, this research provides strong evidence that the model’s performance can be enhanced by conducting NIRS scanning of ground biomass in transflectance mode rather than diffuse reflectance mode, and by applying a spectral multi-preprocessing technique. To update the model for robust application, the number of ground biomass samples must be increased and validated using unknown samples.

4. Conclusions

PLSR-based models were developed and compared to evaluate the HHV and the ultimate analysis parameters, i.e., wt.% of C, H, N, and O, of ground biomass, using NIRS in transflectance mode to assess biomass properties for energy usage. The model with the optimum performance was selected based on the parameters of R2C, RMSEC, R2P, RMSEP, RPD, and bias. The models for HHV (J/g) and wt.% of N are suitable for most applications, including research, while the models for wt.% of C, wt.% of O, and wt.% of H were only fair and usable for rough screening. The performance of the fair models could be improved by incorporating more representative samples collected from various geographical locations in Nepal, thereby covering a wider statistical range of the reference values.
This study showed that the multi-preprocessing 5-range method, a novel approach to spectral preprocessing for PLSR model development, improved model accuracy compared to the traditional method of preprocessing NIR spectra across the entire wavenumber range with a single process. Therefore, this research provides a foundation in NIRS, indicating that preprocessing the entire wavenumber range with various preprocessing techniques could enhance model accuracy. The recommended models can serve as a reliable and non-destructive alternative method for rapidly assessing biomass properties for energy usage when employing NIRS. However, to create a robust model, it is necessary to expand the model with data from various samples and validate it with unknown samples. Adopting these models could significantly reduce the economic gap between biomass traders for energy usage and other applications. Furthermore, the research outcomes could guide academic and research institutions, policymaking think tanks, and energy companies in planning for the proper identification, management, and utilization of bio-resources to meet future energy demand and supply. The research outcomes also generate possibilities for NIR-based research to adopt or apply similar approaches.

Author Contributions

B.S.: conceptualization, methodology, software, formal analysis, investigation, resources, data curation, writing the original draft, writing–review & editing, visualization. J.P.: conceptualization, software, formal analysis, data curation, writing–review & editing, supervision. B.P.S.: writing–review & editing, conceptualization and validation. P.S.: conceptualization, data curation, writing–review & editing, supervision, project administration, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the King Mongkut’s Institute of Technology Ladkrabang, Thailand (KMITL doctoral scholarship KDS 2020/52).

Data Availability Statement

The data will be made available upon request from the corresponding author.

Acknowledgments

The authors express their gratitude to the Near-Infrared Spectroscopy Research Center for Agricultural Product and Food (www.nirsresearch.com (accessed on 9 July 2023)), Department of Agricultural Engineering, School of Engineering at King Mongkut’s Institute of Technology Ladkrabang, Thailand, for providing instruments, funding (KMITL doctoral scholarship–KDS 2020/052), and space for the experiment. They would also like to thank the Department of Research and Graduate studies, Khon Kaen University, Thailand for their research support. Additionally, the authors extend their appreciation to the Department of Mechanical Engineering, Kathmandu University, Nepal, for assisting in collecting biomass samples from various locations in Nepal.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

%: percentage
C: carbon
CHNS: CHNS elemental analyzer
GA: genetic algorithm
H: hydrogen
HHV: higher heating value
LVs: number of latent variables
Max: maximum
Min: minimum
MP: multi-preprocessing
MSC: multiplicative scatter correction
N: nitrogen
NT: total number of samples
Nc: number of samples in calibration set
NIRS: near-infrared spectroscopy
Np: number of samples in validation set
O: oxygen
PLSR: partial least squares regression
R2: coefficient of determination
R2C: coefficient of determination of calibration set
R2P: coefficient of determination of validation set
RMSEC: root mean square error of calibration set
RMSEP: root mean square error of prediction set
RPD: ratio of prediction to deviation
S: sulfur
SD: standard deviation
SEC: standard error of calibration set
SEL: standard error of laboratory
SEP: standard error of validation set
SNV: standard normal variate
SPA: successive projection algorithm
SW: selected wavenumber
wt.%: weight percentage

References

  1. Zhang, Y.; Wang, H.; Sun, X.; Wang, Y.; Liu, Z. Separation and Characterization of Biomass Components (Cellulose, Hemicellulose, and Lignin) from Corn Stalk. BioResources 2021, 16, 7205–7219. [Google Scholar] [CrossRef]
  2. Díez, D.; Urueña, A.; Piñero, R.; Barrio, A.; Tamminen, T. Determination of hemicellulose, cellulose, and lignin content in different types of biomasses by thermogravimetric analysis and pseudocomponent kinetic model (TGA-PKM method). Processes 2020, 8, 1048. [Google Scholar] [CrossRef]
  3. Posom, J.; Sirisomboon, P. Evaluation of the higher heating value, volatile matter, fixed carbon and ash content of ground bamboo using near infrared spectroscopy. J. Near Infrared Spectrosc. 2017, 25, 301–310. [Google Scholar] [CrossRef]
  4. Mierzwa-Hersztek, M.; Gondek, K.; Jewiarz, M.; Dziedzic, K. Assessment of energy parameters of biomass and biochars, leachability of heavy metals and phytotoxicity of their ashes. J. Mater. Cycles Waste Manag. 2019, 21, 786–800. [Google Scholar] [CrossRef]
  5. Posom, J.; Shrestha, A.; Saechua, W.; Sirisomboon, P. Rapid non-destructive evaluation of moisture content and higher heating value of Leucaena leucocephala pellets using near infrared spectroscopy. Energy 2016, 107, 464–472. [Google Scholar] [CrossRef]
  6. Demirbas, A. Hazardous Emissions from Combustion of Biomass. Energy Sources Part A Recovery Util. Environ. Eff. 2007, 30, 170–178. [Google Scholar] [CrossRef]
  7. Onifade, M.; Lawal, A.I.; Aladejare, A.E.; Bada, S.; Idris, M.A. Prediction of gross calorific value of solid fuels from their proximate analysis using soft computing and regression analysis. Int. J. Coal Prep. Util. 2019, 42, 1170–1184. [Google Scholar] [CrossRef]
  8. Sheng, C.; Azevedo, J.L.T. Estimating the higher heating value of biomass fuels from basic analysis data. Biomass Bioenergy 2005, 28, 499–507. [Google Scholar] [CrossRef]
  9. El-Sayed, S.A.; Mostafa, M. Pyrolysis characteristics and kinetic parameters determination of biomass fuel powders by differential thermal gravimetric analysis (TGA/DTG). Energy Convers. Manag. 2014, 85, 165–172. [Google Scholar] [CrossRef]
  10. Huang, C.; Han, L.; Yang, Z.; Liu, X. Ultimate analysis and heating value prediction of straw by near infrared spectroscopy. Waste Manag. 2009, 29, 1793–1797. [Google Scholar] [CrossRef] [PubMed]
  11. Adnan, A.; Horsten, D.V.; Pawelzik, E.; Morlein, A.D. Rapid Prediction of Moisture Content in Intact Green Coffee Beans Using Near Infrared Spectroscopy. Foods 2017, 6, 38. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Roger, J.-M.; Mallet, A.; Marini, F. Preprocessing NIR Spectra for Aquaphotomics. Molecules 2022, 27, 6795. [Google Scholar] [CrossRef] [PubMed]
  13. Maraphum, K.; Saengprachatanarug, K.; Wongpichet, S.; Phuphuphud, A.; Posom, J. Achieving robustness across different ages and cultivars for an NIRS-PLSR model of fresh cassava root starch and dry matter content. Comput. Electron. Agric. 2022, 196, 106872. [Google Scholar] [CrossRef]
  14. Posom, J.; Phuphaphud, A.; Saengprachatanarug, K.; Maraphum, K.; Saijan, S.; Pongkan, K.; Srimai, K. Real-time measuring energy characteristics of cane bagasse using NIR spectroscopy. Sens. Bio-Sens. Res. 2022, 38, 100519. [Google Scholar] [CrossRef]
  15. Phuphaphud, A.; Saengprachatanarug, K.; Posom, J.; Taira, E.; Panduangnate, L. Prediction and Classification of Energy Content in Growing Cane Stalks for Breeding Programmes Using Visible and Shortwave Near Infrared. Sugar Tech 2022, 24, 1497–1509. [Google Scholar] [CrossRef]
  16. Posom, J.; Saechua, W. Prediction of Elemental Components of Ground Bamboo Using Micro-NIR Spectrometer. IOP Conf. Ser. Earth Environ. Sci. 2019, 301, 012063. [Google Scholar] [CrossRef] [Green Version]
  17. Skvaril, J.; Kyprianidis, K.G.; Dahlquist, E. Applications of near-infrared spectroscopy (NIRS) in biomass energy conversion processes: A review. Appl. Spectrosc. Rev. 2017, 52, 675–728. [Google Scholar] [CrossRef]
  18. Zhang, K.; Zhou, L.; Brady, M.; Xu, F.; Yu, J.; Wang, D. Fast analysis of high heating value and elemental compositions of sorghum biomass using near-infrared spectroscopy. Energy 2017, 118, 1353–1360. [Google Scholar] [CrossRef] [Green Version]
  19. Xue, J.; Yang, Z.; Han, L.; Chen, L. Study of the influence of NIRS acquisition parameters on the spectral repeatability for on-line measurement of crop straw fuel properties. Fuel 2014, 117, 1027–1033. [Google Scholar] [CrossRef]
  20. Pokhrel, D.R.; Sirisomboon, P.; Khurnpoon, L.; Posom, J.; Saechua, W. Comparing Machine Learning and PLSDA Algorithms for Durian Pulp Classification Using Inline NIR Spectra. Sensors 2023, 23, 5327. [Google Scholar] [CrossRef]
  21. Phanomsophon, T.; Jaisue, N.; Worphet, A.; Tawinteung, N.; Shrestha, B.; Posom, J.; Khurnpoon, L.; Sirisomboon, P. Rapid measurement of classification levels of primary macronutrients in durian (Durio zibethinus Murray CV. Mon Thong) leaves using FT-NIR spectrometer and comparing the effect of imbalanced and balanced data for modelling. Measurement 2022, 203, 111975. [Google Scholar] [CrossRef]
  22. Meenu, M.; Cozzolino, D.; Xu, B. Non-destructive prediction of total phenolics and antioxidants in hulled and naked oat genotypes with near-infrared spectroscopy. J. Food Meas. Charact. 2023, 1–12. [Google Scholar] [CrossRef]
  23. Posom, J.; Maraphum, K. Achieving prediction of starch in cassava [Manihot esculenta Crantz] by data fusion of VIS-NIR and Mid-NIR spectroscopy via machine learning. J. Food Compos. Anal. 2023, 122, 105415. [Google Scholar] [CrossRef]
  24. Santos, F.D.; Santos, L.P.; Cunha, P.H.; Borghi, F.T.; Romao, W.; de Castro, E.V.; de Oliveira, E.C.; Filgueiras, P.R. Discrimination of oils and fuels using a portable NIR spectrometer. Fuel 2021, 283, 118854. [Google Scholar] [CrossRef]
  25. Kranenburg, R.F.; Ramaker, H.J.; Sap, S.; van Asten, A.C. A calibration friendly approach to identify drugs of abuse mixtures with a portable near-infrared analyzer. Drug Test. Anal. 2022, 14, 1089–1101. [Google Scholar] [CrossRef] [PubMed]
  26. Chung, H.; Ku, M.-S. Comparison of near-infrared, infrared, and Raman spectroscopy for the analysis of heavy petroleum products. Appl. Spectrosc. 2000, 54, 239–245. [Google Scholar] [CrossRef]
  27. Stauffer, M. (Ed.) Applications of Molecular Spectroscopy to Current Research in the Chemical and Biological Sciences; IntechOpen: Rijeka, Croatia, 2016. [Google Scholar]
  28. Durickovic, I. Using Raman spectroscopy for characterization of aqueous media and quantification of species in aqueous solution. In Applications of Molecular Spectroscopy to Current Research in the Chemical and Biological Sciences; Stauffer, M., Ed.; IntechOpen: Rijeka, Croatia, 2016; p. 405. [Google Scholar]
  29. Jiao, Y.; Li, Z.; Chen, X.; Fei, S. Preprocessing methods for near-infrared spectrum calibration. J. Chemom. 2020, 34, e3306. [Google Scholar] [CrossRef]
  30. Qian, H.; Guo, X.; Fan, S.; Hagos, K.; Lu, X.; Liu, C.; Huang, D. A simple prediction model for higher heat value of biomass. J. Chem. Eng. Data 2016, 61, 4039–4045. [Google Scholar] [CrossRef]
  31. Posom, J.; Maraphum, K.; Phuphaphud, A. Rapid Evaluation of Biomass Properties Used for Energy Purposes Using Near-Infrared Spectroscopy. In Renewable Energy-Technologies and Applications; IntechOpen: Bristol, UK, 2020. [Google Scholar]
  32. Yun, Y.-H.; Li, H.-D.; Deng, B.-C.; Cao, D.-S. An overview of variable selection methods in multivariate analysis of near-infrared spectra. TrAC Trends Anal. Chem. 2019, 113, 102–115. [Google Scholar] [CrossRef]
  33. Broad, N.; Graham, P.; Hailey, P.; Hardy, A.; Holland, S.; Hughes, S.; Lee, D.; Prebble, K.; Salton, N.; Warren, P. Guidelines for the development and validation of near-infrared spectroscopic methods in the pharmaceutical industry. Handb. Vib. Spectrosc. 2002, 5, 3590–3610. [Google Scholar]
  34. Nakawajana, N.; Posom, J.; Paeoui, J. The prediction of higher heating value, lower heating value and ash content of rice husk using FT-NIR spectroscopy. Eng. J. 2018, 22, 45–56. [Google Scholar] [CrossRef]
  35. Assis, C.; Ramos, R.S.; Silva, L.A.; Kist, V.; Barbosa, M.H.P.; Teofilo, R.F. Prediction of Lignin Content in Different Parts of Sugarcane Using Near-Infrared Spectroscopy (NIR), Ordered Predictors Selection (OPS), and Partial Least Squares (PLS). Appl. Spectrosc. 2017, 71, 2001–2012. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  36. Li, Z.; Song, J.; Ma, Y.; Yu, Y.; He, X.; Guo, Y.; Dou, J.; Dong, H. Identification of aged-rice adulteration based on near-infrared spectroscopy combined with partial least squares regression and characteristic wavelength variables. Food Chem. X 2023, 17, 100539. [Google Scholar] [CrossRef] [PubMed]
  37. Shetty, N.; Gislum, R. Quantification of fructan concentration in grasses using NIR spectroscopy and PLSR. Field Crops Res. 2011, 120, 31–37. [Google Scholar] [CrossRef]
  38. Conzen, J. Multivariate Calibration: A Practical Guide for Developing Methods in the Quantitative Analytical Chemistry; BrukerOptik GmbH: Ettlingen, Germany, 2006. [Google Scholar]
  39. Pitak, L.; Sirisomboon, P.; Saengprachatanarug, K.; Wongpichet, S.; Posom, J. Rapid elemental composition measurement of commercial pellets using line-scan hyperspectral imaging analysis. Energy 2021, 220, 119698. [Google Scholar] [CrossRef]
  40. Xia, Z.; Zhang, C.; Weng, H.; Nie, P.; He, Y. Sensitive wavelengths selection in identification of Ophiopogon japonicus based on near-infrared hyperspectral imaging technology. Int. J. Anal. Chem. 2017, 2017, 6018769. [Google Scholar] [CrossRef] [Green Version]
  41. Zhang, C.; Liu, F.; Kong, W.; Zhang, H.; He, Y. Fast identification of watermelon seed variety using near infrared hyperspectral imaging technology. Trans. Chin. Soc. Agric. Eng. 2013, 29, 270–277. [Google Scholar]
  42. Liu, D.; Sun, D.-W.; Zeng, X.-A. Recent advances in wavelength selection techniques for hyperspectral image processing in the food industry. Food Bioprocess Technol. 2014, 7, 307–323. [Google Scholar] [CrossRef]
  43. Santos-Rufo, A.; Mesas-Carrascosa, F.-J.; García-Ferrer, A.; Meroño-Larriva, J.E. Wavelength selection method based on partial least square from hyperspectral unmanned aerial vehicle orthomosaic of irrigated olive orchards. Remote Sens. 2020, 12, 3426. [Google Scholar] [CrossRef]
  44. Maraphum, K.; Ounkaew, A.; Kasemsiri, P.; Hiziroglu, S.; Posom, J. Wavelengths selection based on genetic algorithm (GA) and successive projections algorithms (SPA) combine with PLS regression for determination the soluble solids content in Nam-DokMai mangoes based on near infrared spectroscopy. Eng. Appl. Sci. Res. 2022, 49, 119–126. Available online: https://ph01.tci-thaijo.org/index.php/easr/article/view/245217 (accessed on 9 July 2023).
  45. Jiang, Q.; Chen, Y.; Guo, L.; Fei, T.; Qi, K. Estimating soil organic carbon of cropland soil at different levels of soil moisture using VIS-NIR spectroscopy. Remote Sens. 2016, 8, 755. [Google Scholar] [CrossRef] [Green Version]
  46. Williams, P.; Manley, M.; Antoniszyn, J. Near Infrared Technology: Getting the Best Out of Light; African Sun Media: Stellenbosch, South Africa, 2019. [Google Scholar]
  47. Zornoza, R.; Guerrero, C.; Mataix-Solera, J.; Scow, K.M.; Arcenegui, V.; Mataix-Beneyto, J. Near infrared spectroscopy for determination of various physical, chemical and biochemical properties in Mediterranean soils. Soil Biol. Biochem. 2008, 40, 1923–1930. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  48. Alzagameem, A.; Bergs, M.; Do, X.T.; Klein, S.E.; Rumpf, J.; Larkins, M.; Monakhova, Y.; Pude, R.; Schulze, M. Low-input crops as lignocellulosic feedstock for second-generation biorefineries and the potential of chemometrics in biomass quality control. Appl. Sci. 2019, 9, 2252. [Google Scholar] [CrossRef] [Green Version]
  49. Weyer, L. Practical Guide to Interpretive Near-Infrared Spectroscopy; CRC Press: Boca Raton, FL, USA, 2007. [Google Scholar]
  50. Demirbaş, A. Relationships between lignin contents and heating values of biomass. Energy Convers. Manag. 2001, 42, 183–188. [Google Scholar] [CrossRef]
  51. Hasan, M.; Haseli, Y.; Karadogan, E. Correlations to predict elemental compositions and heating value of torrefied biomass. Energies 2018, 11, 2443. [Google Scholar] [CrossRef] [Green Version]
  52. Zoghlami, A.; Paës, G. Lignocellulosic biomass: Understanding recalcitrance and predicting hydrolysis. Front. Chem. 2019, 7, 874. [Google Scholar] [CrossRef] [Green Version]
  53. Ge, X.; Chang, C.; Zhang, L.; Cui, S.; Luo, X.; Hu, S.; Qin, Y.; Li, Y. Conversion of lignocellulosic biomass into platform chemicals for biobased polyurethane application. In Advances in Bioenergy; Elsevier: Amsterdam, The Netherlands, 2018; Volume 3, pp. 161–213. [Google Scholar]
  54. Sirisomboon, P.; Funke, A.; Posom, J. Improvement of proximate data and calorific value assessment of bamboo through near infrared wood chips acquisition. Renew. Energy 2020, 147, 1921–1931. [Google Scholar] [CrossRef]
  55. Lestander, T.A.; Rhén, C. Multivariate NIR spectroscopy models for moisture, ash and calorific content in biofuels using bi-orthogonal partial least squares regression. Analyst 2005, 130, 1182–1189. [Google Scholar] [CrossRef]
  56. Han, K.; Gao, J.; Qi, J. The study of sulphur retention characteristics of biomass briquettes during combustion. Energy 2019, 186, 115788. [Google Scholar] [CrossRef]
  57. Cagnon, B.; Py, X.; Guillot, A.; Stoeckli, F.; Chambat, G. Contributions of hemicellulose, cellulose and lignin to the mass and the porous properties of chars and steam activated carbons from various lignocellulosic precursors. Bioresour. Technol. 2009, 100, 292–298. [Google Scholar] [CrossRef] [Green Version]
  58. Posom, J.; Sirisomboon, P. Evaluation of lower heating value and elemental composition of bamboo using near infrared spectroscopy. Energy 2017, 121, 147–158. [Google Scholar] [CrossRef]
  59. Nakawajana, N.; Posom, J. Comparison of analytical ability of pls and svm algorithm in estimation of moisture content, higher heating value, and lower heating value of cassava rhizome ground using FT-NIR spectroscopy. IOP Conf. Ser. Earth Environ. Sci. 2019, 301, 012032. [Google Scholar] [CrossRef]
  60. Nhuchhen, D.R. Prediction of carbon, hydrogen, and oxygen compositions of raw and torrefied biomass using proximate analysis. Fuel 2016, 180, 348–356. [Google Scholar] [CrossRef]
Figure 1. Flowchart of the overall research methodology for evaluating the HHV and ultimate analysis parameters of ground biomass for energy usage using NIRS combined with PLSR.
Figure 2. Nepal biomass in (a) chip form (>30 mm × 15 mm) and (b) ground form (1.88–3080 µm); (c) FT-NIRS instrument (MPA, Bruker, Ettlingen, Germany) scanning over the wavenumber range of 3595 to 12,489 cm⁻¹; and (d) ground sample presentation in transflectance mode.
Figure 3. Representative particle size distribution of the ground biomass ranging from 0.01 to 3080 µm.
Figure 4. (a) Raw spectra of ground biomass. (b) Raw spectra preprocessed using the traditional approach (MSC). (c) Spectra pretreated by the multi-preprocessing 5-range method, in five equal sections: MSC, empty, raw spectra, empty, and raw spectra (from left to right), within the wavenumber range of 3625.72–12,489.48 cm⁻¹. (d) Spectra pretreated by the multi-preprocessing 3-range method, in three sections: second derivative, first derivative, and constant offset (from left to right), within the wavenumber range of 3594.87–12,489.48 cm⁻¹.
Figure 5. Spectra of fast-growing trees and agricultural residues compared to pure cellulose and pure lignin.
Figure 6. Histograms of reference values in the calibration and validation sets for (a) HHV (J/g), (b) wt.% of C, (c) wt.% of H, (d) wt.% of O, and (e) wt.% of N.
Figure 7. Measured versus predicted values in the calibration and validation sets for (a) HHV (J/g), (b) wt.% of C, (c) wt.% of H, (d) wt.% of O, and (e) wt.% of N.
Figure 8. Average absorbance for HHV (J/g) obtained using first-derivative preprocessing, with the important wavenumbers selected by GA, within the full wavenumber range of 3594.87–12,489.5 cm⁻¹.
Figure 9. Average absorbance for wt.% of C obtained using first-derivative preprocessing, with the important wavenumbers selected by GA, within the full wavenumber range of 3594.87–12,489.5 cm⁻¹.
Figure 10. Average absorbance for wt.% of H obtained using SNV preprocessing, with the important wavenumbers selected by GA, within the full wavenumber range of 3594.87–12,489.5 cm⁻¹.
Figure 11. Regression coefficients for the wt.% of O of ground biomass using the multi-preprocessing PLSR 5-range method.
Figure 12. Regression coefficients for the wt.% of N of ground biomass using the multi-preprocessing PLSR 3-range method.
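The range-wise pretreatment described in the Figure 4c caption (five equal sections treated as MSC, empty, raw, empty, raw) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `snv`, `msc`, `raw`, and `multi_preprocess` are hypothetical, the spectra are synthetic placeholder data, an "empty" section is assumed to mean that the section is excluded from the model, and MSC is assumed to be fitted within each section against that section's mean spectrum.

```python
import numpy as np

def raw(X):
    """Identity treatment: keep the section as measured."""
    return X

def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(X, reference=None):
    """Multiplicative scatter correction against a reference spectrum
    (assumption: the mean spectrum of X when no reference is given)."""
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty_like(X, dtype=float)
    for i, row in enumerate(X):
        # Fit row ≈ intercept + slope * ref, then remove both effects.
        slope, intercept = np.polyfit(ref, row, 1)
        out[i] = (row - intercept) / slope
    return out

def multi_preprocess(X, treatments):
    """Split the wavenumber axis into len(treatments) near-equal sections and
    apply one treatment per section; None drops the section ("empty")."""
    sections = np.array_split(np.arange(X.shape[1]), len(treatments))
    kept = [f(X[:, idx]) for idx, f in zip(sections, treatments) if f is not None]
    return np.hstack(kept)

# Synthetic placeholder spectra: one smooth curve with per-sample scatter.
rng = np.random.default_rng(0)
base = np.linspace(0.2, 1.0, 50)
X = rng.uniform(0.8, 1.2, (6, 1)) * base + rng.uniform(-0.05, 0.05, (6, 1))

# Five sections as in the Figure 4c caption: MSC, empty, raw, empty, raw.
X5 = multi_preprocess(X, [msc, None, raw, None, raw])
print(X5.shape)  # (6, 30): three kept sections of 10 points each
```

Passing treatments as a list keeps the number of sections and the pretreatment in each section independent, so a 3-range variant only needs a three-element list.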
Table 1. Statistical data of the HHV and ultimate analysis parameters of the ground biomass used in PLSR model development.

| Parameter | Experimental method | NT | Cal. Nc | Cal. Max | Cal. Min | Cal. Mean | Cal. SD | Val. Np | Val. Max | Val. Min | Val. Mean | Val. SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HHV (J/g) | Bomb calorimeter | 196 | 157 | 18,616 | 14,682 | 16,962 | 848 | 39 | 18,553 | 14,965 | 17,049 | 836 |
| C (wt.%) | CHNS/O | 108 | 87 | 48.0000 | 38.3950 | 44.3278 | 2.1161 | 21 | 47.7400 | 40.8550 | 44.9039 | 1.8910 |
| N (wt.%) | CHNS/O | 95 | 76 | 0.8300 | 0.0000 | 0.2807 | 0.1870 | 19 | 0.8200 | 0.0000 | 0.3187 | 0.2506 |
| H (wt.%) | CHNS/O | 93 | 74 | 6.4800 | 4.9500 | 5.7448 | 0.3044 | 19 | 6.2550 | 5.1850 | 5.6911 | 0.3053 |
| O (wt.%) | CHNS/O | 99 | 79 | 51.1200 | 37.8200 | 44.9440 | 2.5233 | 20 | 48.9550 | 38.8500 | 44.9505 | 2.5718 |
Table 2. Average reference values of HHV (J/g), ash content (wt.%), and ultimate analysis parameters (wt.%) of fast-growing trees and agricultural residues on a dry basis.

| Category | Species (part) | HHV (J/g) | C (wt.%) | N (wt.%) | H (wt.%) | Ash (wt.%) | O (wt.%) |
|---|---|---|---|---|---|---|---|
| Fast-growing tree | Alnus nepalensis | 17,932 | 45.9115 | 0.3115 | 5.7255 | 2.3671 | 45.6844 |
| Fast-growing tree | Pinus roxburghii | 18,349 | 46.8367 | 0.0606 | 5.8283 | 2.0900 | 45.1844 |
| Fast-growing tree | Bambusa vulgaris | 17,310 | 45.6132 | 0.2327 | 5.7536 | 2.8120 | 45.5884 |
| Fast-growing tree | Eucalyptus camaldulensis | 17,105 | 44.5536 | 0.0896 | 5.6164 | 3.8158 | 45.9245 |
| Fast-growing tree | Bombax ceiba | 17,077 | 44.8557 | 0.3162 | 5.8179 | 5.1271 | 43.8832 |
| Agricultural residue | Zea mays (cob) | 17,297 | 44.7794 | 0.2488 | 5.7619 | 2.6146 | 46.5954 |
| Agricultural residue | Zea mays (shell) | 16,409 | 45.6518 | 0.4318 | 6.2113 | 3.5826 | 44.1224 |
| Agricultural residue | Zea mays (stover) | 16,753 | 44.3988 | 0.7069 | 5.6697 | 4.0033 | 45.2212 |
| Agricultural residue | Oryza sativa | 15,417 | 40.4261 | 0.4996 | 5.3042 | 13.7073 | 40.0629 |
| Agricultural residue | Saccharum officinarum | 17,029 | 43.6413 | 0.1047 | 5.7047 | 3.0300 | 47.5194 |
Table 3. Results of the PLSR-based models for the HHV (J/g) and ultimate analysis (wt.%) of ground biomass. The best-performing model for each parameter is shown in bold. R²c and RMSEC refer to the calibration set; R²p, RMSEP, RPD, and bias refer to the validation set.

| Parameter | Algorithm | Preprocessing | LVs | R²c | RMSEC | R²p | RMSEP | RPD | Bias |
|---|---|---|---|---|---|---|---|---|---|
| HHV (J/g) | Full-PLSR | First derivative (5,5) + vector normalization | 14 | 0.9527 | 183.6910 | 0.9491 | 186.1651 | 4.44 | −13.5781 |
| HHV (J/g) | SPA-PLSR | First derivative (SW = 479) | 15 | 0.9469 | 194.7442 | 0.9486 | 187.0927 | 4.41 | −0.1578 |
| HHV (J/g) | **GA-PLSR** | First derivative (SW = 692) | 14 | 0.9505 | 188.0117 | 0.9574 | 170.3282 | 4.89 | −21.9648 |
| HHV (J/g) | MP-PLSR: 3-range | Combination set: 5, 5, 4 | 15 | 0.9538 | 181.5744 | 0.9470 | 189.9072 | 4.35 | −12.3121 |
| HHV (J/g) | MP-PLSR: 5-range | Combination set: 4, 4, 5, 4, 3 | 13 | 0.9546 | 180.0513 | 0.9533 | 178.3761 | 4.72 | −35.5676 |
| wt.% C | Full-PLSR | Raw spectra | 13 | 0.8433 | 0.8326 | 0.6129 | 1.1488 | 1.67 | 0.3193 |
| wt.% C | SPA-PLSR | First derivative + vector normalization (SW = 70) | 9 | 0.8001 | 0.9405 | 0.6316 | 1.1207 | 1.76 | 0.4010 |
| wt.% C | **GA-PLSR** | First derivative (g = 5, s = 5) (SW = 50) | 9 | 0.7851 | 0.9753 | 0.7217 | 0.9740 | 1.93 | 0.1877 |
| wt.% C | MP-PLSR: 3-range | Combination set: 5, 0, 6 | 13 | 0.8791 | 0.7315 | 0.6765 | 1.0502 | 1.78 | 0.1739 |
| wt.% C | MP-PLSR: 5-range | Combination set: 3, 0, 1, 3, 0 | 12 | 0.8451 | 0.8280 | 0.6737 | 1.0548 | 1.88 | 0.3807 |
| wt.% N | Full-PLSR | First derivative (g = 5, s = 5) | 9 | 0.8457 | 0.0730 | 0.8284 | 0.1011 | 2.63 | −0.0403 |
| wt.% N | SPA-PLSR | Second derivative (g = 5, s = 5) (SW = 601) | 11 | 0.9091 | 0.0560 | 0.7691 | 0.1173 | 2.20 | −0.0381 |
| wt.% N | GA-PLSR | Second derivative (g = 5, s = 5) (SW = 990) | 10 | 0.9026 | 0.0580 | 0.8010 | 0.1089 | 2.36 | −0.0338 |
| wt.% N | MP-PLSR: 3-range | Combination set: 4, 4, 5 | 10 | 0.9196 | 0.0527 | 0.7961 | 0.1102 | 2.36 | −0.0383 |
| wt.% N | **MP-PLSR: 5-range** | Combination set: 4, 4, 5, 3, 4 | 9 | 0.8682 | 0.0675 | 0.8410 | 0.0973 | 2.65 | −0.0309 |
| wt.% H | Full-PLSR | SNV | 14 | 0.8335 | 0.1234 | 0.7678 | 0.1434 | 2.10 | −0.0234 |
| wt.% H | SPA-PLSR | Raw (SW = 1148) | 14 | 0.8286 | 0.1252 | 0.6439 | 0.1776 | 1.68 | 0.0014 |
| wt.% H | **GA-PLSR** | SNV (SW = 457) | 14 | 0.8814 | 0.1041 | 0.7678 | 0.1434 | 2.14 | −0.0356 |
| wt.% H | MP-PLSR: 3-range | Combination set: 4, 5, 6 | 13 | 0.8800 | 0.1047 | 0.6422 | 0.1780 | 1.68 | −0.0145 |
| wt.% H | MP-PLSR: 5-range | Combination set: 4, 0, 6, 4, 6 | 13 | 0.8864 | 0.1019 | 0.6040 | 0.1872 | 1.60 | −0.0254 |
| wt.% O | Full-PLSR | First derivative (g = 5, s = 5) | 12 | 0.7936 | 1.1390 | 0.5972 | 1.5913 | 1.5759 | −0.0256 |
| wt.% O | SPA-PLSR | First derivative (g = 5, s = 5) (SW = 190) | 11 | 0.5938 | 1.5981 | 0.6102 | 1.5654 | 1.6031 | 0.0658 |
| wt.% O | GA-PLSR | Raw (SW = 93) | 15 | 0.6472 | 1.4892 | 0.6265 | 1.5324 | 1.6602 | −0.2593 |
| wt.% O | MP-PLSR: 3-range | Combination set: 4, 6, 4 | 9 | 0.6623 | 1.4570 | 0.6283 | 1.5286 | 1.6591 | −0.2222 |
| wt.% O | **MP-PLSR: 5-range** | Combination set: 3, 2, 4, 6, 0 | 12 | 0.6674 | 1.4461 | 0.6289 | 1.5275 | 1.7147 | 0.4456 |
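The validation statistics reported in Table 3 (R²p, RMSEP, RPD, and bias) can be computed from measured and predicted values along the following lines. This is a hedged sketch using definitions that are common in NIR chemometrics, not necessarily the exact formulas used by the authors: bias is assumed to be the mean of (predicted − measured), SEP the bias-corrected standard error, and RPD the standard deviation of the reference values divided by SEP. The function name `validation_metrics` and the sample numbers are hypothetical.

```python
import numpy as np

def validation_metrics(y_true, y_pred):
    """Return R2, RMSEP, SEP, bias, and RPD for one validation set.
    Assumed conventions: bias = mean(pred - true); RPD = SD(true) / SEP."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_pred - y_true
    n = y_true.size

    bias = resid.mean()
    rmsep = np.sqrt(np.mean(resid ** 2))
    # SEP removes the systematic offset (bias) before taking the spread.
    sep = np.sqrt(np.sum((resid - bias) ** 2) / (n - 1))
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    rpd = y_true.std(ddof=1) / sep
    return {"R2": r2, "RMSEP": rmsep, "SEP": sep, "bias": bias, "RPD": rpd}

# Hypothetical measured and predicted values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
stats = validation_metrics(measured, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

As a rough consistency check against the tables, dividing the validation-set SD of HHV in Table 1 (836 J/g) by the GA-PLSR RMSEP in Table 3 (170.3282 J/g) gives about 4.9, close to the reported RPD of 4.89.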
