Article

Visible-Near Infrared Spectroscopy and Chemometric Methods for Wood Density Prediction and Origin/Species Identification

1 College of Engineering and Technology, Northeast Forestry University, Harbin 150040, China
2 Forest Products Development Center, School of Forestry and Wildlife Sciences, Auburn University, Auburn, AL 36849, USA
3 Center for Renewable Carbon, University of Tennessee, Knoxville, TN 37996, USA
* Author to whom correspondence should be addressed.
Forests 2019, 10(12), 1078; https://doi.org/10.3390/f10121078
Submission received: 13 August 2019 / Revised: 18 November 2019 / Accepted: 25 November 2019 / Published: 27 November 2019
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract

This study aimed to rapidly and accurately identify geographical origin and tree species and to model wood density using visible and near infrared (Vis-NIR) spectroscopy coupled with chemometric methods. A total of 280 samples from two origins (Jilin and Heilongjiang provinces, China) and three species, Dahurian larch (Larix gmelinii (Rupr.) Rupr.), Japanese elm (Ulmus davidiana Planch. var. japonica Nakai), and Chinese white poplar (Populus tomentosa Carrière), were collected for classification and prediction analysis. The spectral data were de-noised using lifting wavelet transform (LWT), and linear and nonlinear models were built from the de-noised spectra using partial least squares (PLS) and particle swarm optimization (PSO)-support vector machine (SVM) methods, respectively. The response surface methodology (RSM) was applied to determine the best combination of PSO-SVM parameters. The PSO-SVM model was employed for discrimination of origin and species. The identification accuracy for tree species using wavelet coefficients was better than that of models developed using raw spectra, and the accuracy for geographical origin and species was greater than 98% for the prediction dataset. The prediction accuracy of density using wavelet coefficients was better than that of constructed spectra. The PSO-SVM models optimized by RSM obtained the best results, with coefficients of determination for the calibration set of 0.953, 0.974, 0.959, and 0.837 for Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively. The results showed the feasibility of Vis-NIR spectroscopy coupled with chemometric methods for determining wood properties and geographical origin in a simple, rapid, and non-destructive manner.

1. Introduction

With the rising consumption of wood and the competitiveness among export firms, timber quality traceability systems, which can establish the identity or provenance of wood, are becoming increasingly important. Recently, many studies have been conducted on the discrimination of origin and valuable species, especially for species listed by the Convention on International Trade in Endangered Species (CITES). Silva et al. indicated that partial least squares for discriminant analysis (PLS-DA) was better than soft independent modeling of class analogy (SIMCA) when the origin of true mahogany (CITES Appendix II) was identified with near infrared (NIR) spectroscopy [1]. A study from Bergo et al. [2] demonstrated that the correct classification rate of mahogany was higher than 96.8% using NIR technology. Braga et al. employed NIR spectroscopy and PLS-DA to discriminate Swietenia macrophylla (CITES Appendix II) from three similar species [3]. Nisgoski et al. applied NIR spectroscopy to distinguish six Cryptomeria japonica varieties on the basis of needles and wood samples [4]. As for CITES-listed Gonystylus species, Ng et al. [5] indicated that 90% of Gonystylus species can be identified on the basis of DNA barcode sequence. A successful tracing system has the potential to drastically improve the manufacturing, management, and optimization of trees for product use. Additionally, wood traceability can help assure the legality and quality of wood and its products, which is beneficial for consumers and the corresponding departments of the wood production chain.
For the establishment of a useful traceability system, traceability tools that obtain wood information precisely and rapidly at certain points are essential. The common technologies are genomic, chemical, and morphological tools [6]. However, a high level of technical expertise and technology is required to analyze the results of these methods, which will impede the development of timber quality traceability systems due to scale, site differences, expense, and their destructive and non-repeatable nature.
Visible and near infrared (Vis-NIR) spectroscopy has been widely used in many sectors, including the food, material, and life sciences [7,8,9]. In the field of forestry, many studies have demonstrated its potential to determine properties and components such as moisture, density, and lignin content; to detect wood preservation; and to classify species [10,11,12,13,14,15,16]. Sandak et al. applied cluster analysis (CA) and principal component analysis (PCA) to classify powdered samples (fraction < 0.5 mm) and wood samples, respectively. They found that all samples were clearly separated [17]. Schimleck et al. [18] reported that the within-tree variation of three wood properties (air-dry density, modulus of elasticity (MOE), and microfibril angle (MFA)) for Pinus taeda trees aged 13 and 22 years was consistent with full maturity in older trees when near infrared (NIR) spectroscopy and Akima’s interpolation method were used. These studies show that Vis-NIR spectroscopy could be a powerful traceability tool due to its advantages and the nature of wood.
When infrared energy falls on the surface of wood samples, the C–H, N–H, and O–H bonds are excited above their ground state. Depending on inner structure, composition, and surface features, the Vis-NIR spectra can be translated into fiber morphology and chemistry information through multivariate modeling and the removal of noise and irrelevant information [19,20]. Therefore, advanced spectral optimizing techniques should be explored to extract key information and improve model performance. Spectral optimizing strategies are mainly divided into two groups: de-noising and selection of spectral variables. However, the weakness of these two strategies is their complex execution: different results will be obtained depending on the combination of de-noising and variable selection methods [21]. Lifting wavelet transform (LWT), as a second-generation wavelet, can be used to suppress noise and simplify the spectral signal [22]. Recently, wavelet coefficients have become popular in many sectors because they save time and introduce no bias when compared to traditional spectra constructed by wavelet transform [23,24].
Vis-NIR spectra consist of broad, overlapping, and nonspecific bands; in addition, the relationship between spectra and wood properties is generally complex [25]. Therefore, nonlinear modeling methods can yield higher accuracy and have gained popularity compared to traditional linear modeling methods. Support vector machine (SVM) optimized by particle swarm optimization (PSO) is one such method with good performance [26]. Nonetheless, the parameters of PSO-SVM models, including population size, maximum generation, and cross-validation number, influence the performance and should be determined before modeling. The response surface methodology (RSM) has been widely applied to optimize parameters in the field of food [27,28]. To our knowledge, studies with Vis-NIR technology coupled with chemometric methods have only been conducted for the determination of physical and chemical components [29,30,31]. However, after a review of the public domain literature, we found no study that combines Vis-NIR spectroscopy and the aforementioned chemometrics (LWT, PSO-SVM, and RSM) for wood qualitative and quantitative analysis, and providing such tools for wood origin and species classification coupled with density modeling could be a novel business-science innovation. In addition, compared to the traditional wavelet transform method, the LWT procedure not only offers multi-resolution analysis but also needs only a small amount of memory, resulting in the much faster analysis speeds necessary for a supply chain environment [32,33]. This is a key novelty of the paper, as these methods have been limited to fields other than forestry and forest products.
This study aimed to build reliable qualitative and quantitative models for the identification of geographical origin and tree species while simultaneously predicting a wood property, an important capability for a wood traceability system. Taking density as an example, 280 wood samples from two different geographical origins belonging to three of the major commercial species (i.e., Dahurian larch, Japanese elm, and Chinese white poplar) were collected for spectra collection and density estimation. The spectra were processed by lifting wavelet transform (LWT) [34], and two spectral representations (wavelet coefficients and constructed spectra after de-noising) were compared. The support vector machine (SVM) optimized by particle swarm optimization (PSO) was used for tracking origin and identifying species. To improve the prediction accuracy of density, the response surface methodology (RSM) was used to find the best combination of parameters for the PSO-SVM models.

2. Materials and Methods

2.1. Sample Preparation

A total of 280 wood samples, divided equally among four sample groups (three species, with Chinese white poplar collected at two sites), were harvested from two different locations (Table 1). To determine site of origin, the same species needs to be collected from multiple sites at the same age-class. Therefore, Chinese white poplar was collected from Jilin and Heilongjiang independently. Eight Dahurian larch and seven Chinese white poplar, with tree heights ranging from 17.2 to 22.6 m and 21.9 to 23.6 m, respectively, were from Heilongjiang province (131°08′–131°21′ E, 45°44′–45°53′ N). Eleven Japanese elm and seven Chinese white poplar were from Jilin province (126°30′–127°16′ E, 42°06′–42°48′ N), with tree heights ranging from 8.5 to 20.1 m and 22.4 to 23.8 m, respectively. Disks 5 cm thick were removed from each tree at 2 m intervals along the stem, with a total of seventy disks prepared for each species for model calibration and spectra collection. Additionally, to reduce the roughness of the sample surface, the cross-sections of all disks were polished using an electric planer (Eastern sketcher hardware tools, Yong Kang, China).

2.2. Spectra Collection and Density Measurement

The Vis-NIR spectra were acquired using a LabSpec Pro FR/A114260 (Analytical Spectral Devices, Inc., Boulder, CO, USA) from 350 to 2500 nm with a spectral resolution of 3 nm @700 nm and 10 nm @1400/2100 nm. Before spectral collection, the spectrometer was preheated for 30 min and calibrated with a commercial white plate made of polytetrafluoroethylene (PTFE) and sintered halon, which is nearly 100% reflective over the whole wavelength range (350–2500 nm). White references were collected every 15 min from the surface of the white plate. A total of 30 scans were acquired and automatically averaged into one spectrum. Each sample was scanned three times from the cross-section with a glare probe (unit 6523 h.i. contact probe), and the average spectrum was regarded as the raw spectrum. Additionally, to reduce the influence of baseline shifts, a baseline offset was applied in the multivariate calibration software, The Unscrambler V10.4 (CAMO Software AS, Oslo, Norway). The density of the samples was measured according to GB/T 1933-2009.
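The averaging and baseline-offset steps described above can be sketched as follows. This is a minimal illustration and not the Unscrambler implementation: it assumes that baseline offset means subtracting each spectrum's minimum value from all of its points, and the function names and the three short example scans are hypothetical.

```python
def average_scans(scans):
    """Average replicate scans point by point into one raw spectrum."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


def baseline_offset(spectrum):
    """Assumed form of baseline offset: shift so the minimum becomes zero."""
    lowest = min(spectrum)
    return [value - lowest for value in spectrum]


# Three hypothetical replicate scans of one sample (made-up values).
scans = [
    [0.30, 0.42, 0.51, 0.47],
    [0.32, 0.40, 0.53, 0.45],
    [0.31, 0.41, 0.52, 0.46],
]
raw_spectrum = average_scans(scans)
corrected = baseline_offset(raw_spectrum)
```

The offset only shifts the spectrum vertically, so band shapes and relative intensities are unchanged.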

2.3. Data Analysis

2.3.1. Vis-NIR Spectral Data Processing

The lifting wavelet transform (LWT) procedure consists of three steps: split, prediction, and update (Figure 1). The detailed procedure is described in Zhou et al. [35]. LWT can be used for de-noising or compressing spectra because the time domain can be replaced by the wavelength domain. When LWT is performed, the spectral data in the original wavelength domain are transformed into approximation coefficients (AC) and detail coefficients (DC) in a new domain. This transformation is time-saving and not biased [36,37]. Therefore, the spectra in the wavelength range from 350 to 2397 nm were de-noised using biorthogonal mother wavelets from the first to the seventh decomposition level on the basis of the lifting scheme to eliminate the influence of optical and electrical noise. The performance of the wavelet coefficients (the AC, which retain the low-frequency content) and of the spectra constructed from AC and DC was compared in this study. In addition, the de-noising results for different wavelet orders and decomposition levels were analyzed using partial least squares (PLS) regression to obtain the optimal de-noising parameters. The LWT was implemented using in-house algorithms in Matlab R2010b (MathWorks, Natick, MA, USA).
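The split, prediction, and update steps can be illustrated with a short sketch. This is not the in-house Matlab code used in the study: for brevity it uses the simplest lifting pair (Haar, which corresponds to bior1.1) rather than the higher-order biorthogonal wavelets compared here, and the synthetic 2048-point spectrum and all function names are assumptions made for illustration.

```python
import math


def lifting_step(signal):
    """One Haar lifting step: split, predict, update."""
    even, odd = signal[0::2], signal[1::2]               # split
    detail = [o - e for o, e in zip(odd, even)]          # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update: keep local mean
    return approx, detail


def inverse_lifting_step(approx, detail):
    """Undo one lifting step by reversing update, predict, and split."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    merged = []
    for e, o in zip(even, odd):
        merged.extend([e, o])
    return merged


def lwt(signal, levels):
    """Return approximation coefficients and per-level detail coefficients."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = lifting_step(approx)
        details.append(detail)
    return approx, details


def inverse_lwt(approx, details):
    """Reconstruct the signal from all coefficients."""
    signal = list(approx)
    for detail in reversed(details):
        signal = inverse_lifting_step(signal, detail)
    return signal


# Synthetic stand-in for a 2048-variable spectrum (not measured data).
spectrum = [math.sin(i / 50.0) + 0.01 * ((i * 37) % 11) for i in range(2048)]
approx2, _ = lwt(spectrum, 2)
approx6, details6 = lwt(spectrum, 6)
restored = inverse_lwt(approx6, details6)
```

Each level halves the number of approximation coefficients, so a 2048-point spectrum yields 512 coefficients at level 2 and 32 at level 6, and the full set of coefficients reconstructs the input.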

2.3.2. Optimization of Support Vector Machine Models

After linear PLS models were established using raw spectra and wavelet coefficients, the support vector machine (SVM) procedure was performed to analyze the nonlinear relationship between spectra and wood properties. SVM has been widely applied to classification and regression in various sectors with satisfactory model accuracy [38,39,40,41], but to our knowledge it has been tested in only a few studies [42,43] on forest-based materials. The mathematical background of SVM is described in Wang et al. [44]. The selection of appropriate parameter values is a key factor affecting model performance. The particle swarm optimization (PSO) procedure, which is based on the social and cooperative behavior of species such as birds that meet their needs in a multi-dimensional space, has been proven effective for optimizing SVM parameter values [45,46]. It was adopted here to track geographical origin and tree species and to determine density. The SVM and PSO were also implemented in Matlab.
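A minimal sketch of the PSO search is given below. It is not the Matlab implementation used in the study: the inertia and acceleration constants are common textbook values chosen as assumptions, and the objective is a hypothetical smooth stand-in for the cross-validation error of an SVM as a function of two hyperparameters, since training an SVM is outside the scope of the sketch.

```python
import random


def pso_minimize(objective, bounds, pop_size=20, max_gen=100, seed=0,
                 inertia=0.7, c1=1.5, c2=1.5):
    """Minimize `objective` over box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    vel = [[0.0] * dim for _ in range(pop_size)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(pop_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # swarm best so far

    for _ in range(max_gen):
        for i in range(pop_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_cv_error(params):
    """Hypothetical error surface with its minimum at (1, -2)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val = pso_minimize(stand_in_cv_error, search_box)
```

In a PSO-SVM setting the objective would instead train an SVM with the candidate hyperparameters and return its cross-validated error, which is where the cross-validation number enters.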
To obtain good optimization results from PSO, the number of initial individuals (population size) and the maximum generation should be determined before optimization; the cross-validation number is also critical for the optimization process. These parameters are usually selected by trial and error or by surveying the settings used in a large number of studies [47,48,49,50]. To investigate the relationship between the parameters (cross-validation number, maximum generation, and population size) and PSO-SVM model accuracy, the response surface methodology (RSM), which requires minimal data and resources, was employed to determine the best combination of parameters. A Box–Behnken design with three levels and three factors (cross-validation number, maximum generation, and population size) was applied in this study. The variables were coded at low (−1), middle (0), and high (1) levels (Table 2). Analysis of variance (ANOVA) was used to analyze the results. RSM was implemented in Design-Expert Software Version 11 (Stat-Ease, Minneapolis, MN, USA). More information about RSM can be found in He et al. [51].
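The coded runs of a three-factor Box–Behnken design can be generated as in the sketch below. The number of center points is an assumption (the study's run count is not restated here), and the function names are hypothetical. The decoding example uses only the cross-validation range of 5 to 15 reported in the Results; the actual levels of all factors are those of Table 2.

```python
from itertools import combinations, product


def box_behnken(n_factors=3, n_center=5):
    """Coded Box-Behnken runs: each factor pair at +/-1 with the rest at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * n_factors
            row[i], row[j] = a, b
            runs.append(tuple(row))
    # Replicated center points (count is an assumption, not from the paper).
    runs.extend([(0,) * n_factors] * n_center)
    return runs


def decode(coded, low, high):
    """Map a coded level (-1, 0, 1) to the actual factor value."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center + coded * half_range


design = box_behnken()
cv_levels = [decode(c, 5, 15) for c in (-1, 0, 1)]
```

With three factors this gives 12 edge-midpoint runs plus the center replicates, and no run sets all three factors to an extreme level at once.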

2.4. Overview of Spectral Data Analysis

The spectral data were subjected to both quantitative and qualitative analyses. For each species, the samples with the highest and lowest density values were assigned to the calibration set, and the remaining spectral data were then randomly divided by simple random sampling into a calibration set (N = 52) and a prediction set (N = 18), so that the range of the physical property was greatest in the calibration set. This was implemented using Matlab R2010b (MathWorks, Natick, MA, USA). First, the spectra of the calibration set were de-noised using biorthogonal mother wavelets (biorNr.Nd, where Nr and Nd are the orders for reconstruction and decomposition, respectively) from the first to the seventh decomposition level. The performance of wavelet coefficients and constructed spectra was compared using PLS models implemented in The Unscrambler V10.4 (CAMO Software AS, Oslo, Norway). The optimal de-noising parameters for these species were determined by the performance of PLS. The response surface methodology (RSM) was then performed to find the best combination of parameters for the estimation of density using the PSO-SVM model. Finally, wavelet coefficients with the best de-noising parameters were used for classification with SVM optimized by PSO. The performance of origin and species identification using wavelet coefficients was compared to that of the PSO-SVM model developed on the raw spectra. The overall workflow is shown in Figure 2.
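One way to realize the calibration/prediction split described above is sketched below. The function name, the random seed, and the synthetic density values are assumptions for illustration; the study performed this step in Matlab.

```python
import random


def split_with_extremes(densities, n_cal=52, seed=0):
    """Put the min- and max-density samples in calibration, then split randomly."""
    indices = list(range(len(densities)))
    lowest = min(indices, key=lambda i: densities[i])
    highest = max(indices, key=lambda i: densities[i])
    forced = {lowest, highest}

    remaining = [i for i in indices if i not in forced]
    rng = random.Random(seed)
    rng.shuffle(remaining)                      # simple random sampling

    n_extra = n_cal - len(forced)
    calibration = sorted(forced | set(remaining[:n_extra]))
    prediction = sorted(remaining[n_extra:])
    return calibration, prediction


# Synthetic densities for 70 disks of one species (made-up values).
value_rng = random.Random(1)
densities = [round(value_rng.uniform(0.35, 0.65), 3) for _ in range(70)]
cal_idx, pred_idx = split_with_extremes(densities)
```

Forcing the two extremes into the calibration set guarantees that every prediction sample lies within the calibrated density range.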

3. Results

3.1. Vis-NIR Spectra Analysis

The raw Vis-NIR spectra for these species are shown in Figure 3. The spectra of Dahurian larch had absorbance patterns similar to those of the other species in the range of 350–2008 nm, but the relative band intensities at 353–835, 1424–1828, and 1897–2397 nm were higher than those of the other species. Additionally, there were two obvious peaks around 2110 and 2272 nm, which were associated with the O–H stretching of cellulose and the C–H stretching of hemicellulose, respectively. This difference may be due to the different wood properties of softwood and hardwood. For Chinese white poplar, the different growing conditions caused the three main peaks around 1196, 1451, and 1929 nm, associated with the lignin/cellulose, lignin, and water absorbance bands, respectively, to shift toward longer wavelengths in the Jilin samples relative to those from Heilongjiang.

3.2. Quantification Models

The statistical characteristics for modeling density for the calibration and prediction set are provided in Table 3. As for these species, although the range of density was different, the range in the prediction set was within the calibration set for each species. Additionally, the values of average and standard deviation between calibration and prediction set were similar, which demonstrated that the dataset was as representative of the species in these sites as possible.
The performance of wavelet coefficients and constructed spectra was also compared on the basis of the results of PLS models. A higher coefficient of determination (R²) indicated a better de-noising result. The de-noising results for different decomposition levels and members of the biorthogonal wavelet family are shown in Figure 4. The de-noising results depended on the de-noising parameters, which indicates that different wavelet parameters can affect calibration models differently. For Dahurian larch, although the best wavelet coefficients yielded the same results as the best constructed spectra, the de-noising parameters were different: bior2.8 at the sixth decomposition level and bior1.3 at the fourth level, respectively. In the models of Japanese elm, little improvement was obtained after pretreatment; the wavelet coefficients of bior2.8 at the sixth level were the most effective, being better than the optimal results of constructed spectra (bior1.5 at the third level). For Chinese white poplar harvested from Jilin, the wavelet coefficients and constructed spectra pretreated by bior2.8 at the second level and bior1.5 at the second level, respectively, improved PLS performance, increasing R² up to 0.840. For Chinese white poplar from Heilongjiang, the performance of the optimal wavelet coefficients (bior2.8 at the sixth decomposition level) was better than that of constructed spectra (bior2.8 at the seventh decomposition level).
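For reference, the two fit statistics used to compare models can be computed as in the sketch below. It assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, and RMSE as the root of the mean squared residual); the exact formulas used by the modeling software are not stated in the text, and the example values are made up.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up densities (g/cm^3) for illustration only.
observed = [0.40, 0.50, 0.60]
perfect = list(observed)
mean_only = [sum(observed) / len(observed)] * len(observed)
```

A model that reproduces the observations exactly gives R² = 1 and RMSE = 0, whereas a model that always predicts the mean gives R² = 0.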
As for the de-noising performance of the biorthogonal wavelet family, no obvious trend from bior1.1 to bior2.8 at the same decomposition level was observed, which is in agreement with the results of Zhang et al. [52]. Except for Dahurian larch, for which the R² of the best models was the same, wavelet coefficients obtained better accuracy than the constructed spectra processed by LWT. Additionally, the 2048 spectral variables were reduced to 512 (Chinese white poplar from Jilin) and 32 (Dahurian larch, Japanese elm, and Chinese white poplar from Heilongjiang) wavelet coefficients. This indicates that the quality of the information was perhaps improved even though the amount of information needed for calibration or analysis was reduced. Therefore, the spectra were processed with the optimal de-noising parameters to obtain wavelet coefficients, which were used for further analysis. The optimal wavelet coefficients for each species are shown in Figure 5.
The SVM models optimized by PSO were used to analyze the relationship between wavelet coefficients and density. The cross-validation number, population size, and maximum generation were initially set to 5, 10, and 5, respectively. The results of PLS regression using raw spectra and wavelet coefficients and of PSO-SVM models using wavelet coefficients are shown in Table 4. For these species, the performance of the PSO-SVM models was better than that of the PLS models, especially for Dahurian larch. However, the accuracy for Japanese elm improved only slightly when going from the PLS to the PSO-SVM method (Table 4). This could be caused by the default parameter values used in modeling.
To investigate the effect of variables (i.e., cross-validation number, maximum generation, and population size) on the prediction accuracy of the PSO-SVM models for the estimation of density, a Box–Behnken design with three levels and three factors was used to construct a response surface trace. The results of the ANOVA for these four species are respectively shown in Table 5, Table 6, Table 7 and Table 8. X1, X2, and X3 represent the cross-validation number, maximum generation, and population size, respectively. DF, SS, and MS represent the degrees of freedom, sum of squares, and mean square, respectively.
As seen in Table 5, for Dahurian larch, the CVmse (mean squared error of cross-validation) of PSO-SVM models was more significantly affected by cross-validation number and population size (p < 0.01) than by maximum generation (p < 0.05). All two-way interaction terms were non-significant. The quadratic terms X1² and X3² were more significant, whereas X2² was significant at the level of p < 0.05. In the RSM models of Japanese elm and Chinese white poplar from Jilin and Heilongjiang (Table 6, Table 7 and Table 8), the linear terms X1, X2, and X3 were significant for the former two, and X1 was significant for Chinese white poplar from Heilongjiang. Only the interactions X2X3 and X1X2 were significant at p < 0.05, for Japanese elm and Chinese white poplar (Jilin), respectively. The quadratic term X1² was highly significant (p < 0.01) for Dahurian larch, Japanese elm, and Chinese white poplar from Jilin and Heilongjiang, and X2² was significant for Chinese white poplar from Jilin. These four models obtained a good fit, with R² values of up to 0.85 that were highly significant (p < 0.01).
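The linear, interaction, and quadratic terms discussed above correspond to the usual second-order response surface model. The form below is the standard one fitted to a Box–Behnken design and is given as an assumption, since the fitted equations are not reproduced in this section; Y denotes the CVmse, the β symbols are regression coefficients, and ε is the residual error.

```latex
Y = \beta_0
  + \sum_{i=1}^{3} \beta_i X_i
  + \sum_{i=1}^{3} \beta_{ii} X_i^{2}
  + \sum_{1 \le i < j \le 3} \beta_{ij} X_i X_j
  + \varepsilon
```

The ANOVA tables test each of these coefficients separately, which is why a factor such as the cross-validation number can be significant both in its linear term (X1) and in its quadratic term (X1²).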
Figure 6 illustrates the changes in CVmse of the PSO-SVM models as a function of cross-validation number (X1), maximum generation (X2), and population size (X3). As for Dahurian larch (Figure 6A), the CVmse decreased as the cross-validation number decreased from 15 to 5, and the lowest value of CVmse was observed at level (−1) for the cross-validation number. The minimum value of CVmse was obtained with the cross-validation number, maximum generation, and population size at 5, 57, and 60, respectively. In terms of the Japanese elm (Figure 6B), the CVmse increased with decreasing cross-validation number, and its lowest value was observed when the cross-validation number, maximum generation, and population size were 15, 73, and 45, respectively. For Chinese white poplar harvested from Jilin and Heilongjiang (Figure 6C,D), increasing the cross-validation number reduced the CVmse, and the minimum value was obtained with the cross-validation number, maximum generation, and population size at 15, 86, and 60 for Jilin and 10, 50, and 60 for Heilongjiang, respectively.
Thus, regardless of different tree species, the CVmse of PSO-SVM models was significantly affected by the cross-validation number. However, different tree species exhibited different variations, and different optimal parameters were obtained in this study, suggesting that there is not a universal optimization parameter that works for all scenarios. The reason for this difference may have been due to the differences of wood properties and growing conditions.
To verify the reliability of the RSM models, PSO-SVM models were run at the optimal parameters for these species. The results of the PSO-SVM models optimized by RSM are shown in Table 9. The Rc² values of the models for these species were above 0.80, indicating that the accuracy of the models was improved by RSM optimization when compared to conventional PLS models and PSO-SVM models without RSM optimization. Additionally, for the PSO-SVM model of Japanese elm, Rc² increased from 0.844 to 0.974 after RSM optimization, which demonstrated the feasibility of optimizing PSO-SVM model parameters on the basis of the RSM method.

3.3. Qualitative Analysis Models

The technology of rapid classification of geographical origin and tree species is needed and important for the establishment of a timber quality traceability system. Due to a good performance of wavelet coefficients decomposed by the optimal de-noising LWT parameters, PSO-SVM models using wavelet coefficients were used to identify geographical origin and tree species, and the results of raw spectra were also analyzed.
Chinese white poplar samples harvested from Jilin and Heilongjiang were used to determine geographical origin. Figure 7 and Table 10 show the clustering results of geographical origin and the detailed description of the classification results, respectively. Category labels 0 and 1 denote the samples from Jilin and Heilongjiang, respectively. For the training dataset, the PSO-SVM model based on wavelet coefficients performed similarly to the model based on raw spectra, with a correctness of approximately 100% (data not shown), indicating the effectiveness of the SVM model optimized by PSO for differentiating the origin. Additionally, for the prediction set, the accuracy of geographical origin was 100% for both raw spectra and LWT coefficients.
Figure 8 and Table 11 show the clustering results of species and the detailed description of the classification results, respectively.
As seen in Figure 8 and Table 11, the PSO-SVM model using wavelet coefficients separated these tree species well, with a correctness of 100% and 98.61% for the training (data not shown) and prediction sets, respectively. For the prediction set, the accuracy obtained with wavelet coefficients was higher than that obtained with raw spectra, so this method was more effective than models built with the raw spectra. The difference in prediction results was due to Dahurian larch and Japanese elm. In the model based on LWT wavelet coefficients, all Dahurian larch samples were correctly classified, but one Japanese elm sample was misclassified as Dahurian larch. Chinese white poplar samples from Jilin and Heilongjiang were well separated, without misclassification, for both raw spectra and wavelet coefficients. This indicates that PSO-SVM combined with LWT may be a powerful method not only for reducing the dimension of the spectral data but also for classifying geographical origin and tree species at the same time with high discrimination accuracy.
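The reported prediction accuracy can be checked with a short calculation. The confusion matrix below is a hypothetical reconstruction from the text, not a copy of Table 11: it assumes 18 prediction samples for each of the four sample groups and a single Japanese elm sample assigned to Dahurian larch.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Rows are true groups, columns are predicted groups (assumed layout):
# larch, elm, poplar (Jilin), poplar (Heilongjiang).
confusion = [
    [18, 0, 0, 0],
    [1, 17, 0, 0],
    [0, 0, 18, 0],
    [0, 0, 0, 18],
]
percent_correct = round(100 * accuracy(confusion), 2)
```

Under these assumptions, 71 of 72 samples are correct, which matches the 98.61% reported for the prediction set.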

4. Discussion

In our previous study on the prediction of wood tracheid length [34], LWT was used for de-noising and the optimal de-noising parameters were discussed; the results demonstrated that LWT, as a second-generation wavelet, has more power to suppress noise and improve model performance than other de-noising methods such as moving average, loess, Savitzky–Golay, and lowess. Given these advantages, the performance of models using LWT wavelet coefficients and constructed spectra was compared for the first time in this study. The ability to predict wood density effectively is attributed to the significant correlation of density with specific absorption bands of chemical constituents (lignin, cellulose, and hemicellulose), whose C–H, O–H, and N–H groups absorb strongly in the Vis-NIR region [53,54]. Therefore, improving spectral quality is important for decreasing modeling error. LWT was employed in this study to remove noise and other irrelevant information and thereby improve accuracy. The results show that, except for Dahurian larch, for which the results were similar, the best performance of wavelet coefficients was better than that of constructed spectra when raw spectra were decomposed from the first to the seventh decomposition level using the biorthogonal family (Figure 4). This was mainly because some features of wood are inconspicuous in the original Vis-NIR spectral domain but are magnified and become obvious in a new domain (the wavelet coefficients of LWT) [55]. In addition, the transformation to wavelet coefficients is not biased. This addresses a shortcoming of constructed spectra, in which some useful information is removed and model accuracy is decreased [56,57].
In terms of the optimal de-noising parameters of LWT, different species obtained their best calibration models with different de-noising parameters, demonstrating that there is no universal parameter that works in all cases. The optimal decomposition level differed between Chinese white poplar from Jilin (level = 2) and the other species (level = 6), possibly because the former spectra contained little noise; namely, the noise was removed while the useful information was retained when the spectra were decomposed to two levels. For the other species, more decomposition levels were needed to remove less obvious noise from the spectra. Additionally, there was no obvious trend with increasing order of the biorthogonal family. This result is in line with the finding reported by Zhang et al. [52]. Although the various combinations of LWT parameters affected wood spectral de-noising differently, good performance was obtained for the different species and the spectral dimension was also reduced in this study. This not only reduced the complexity of modeling but also saved time and memory.
As for the performance of the linear PLS models (Table 4), the RMSE values for Chinese white poplar (Jilin) were better than those of the other species for both raw spectra and wavelet coefficients. Therefore, the loadings of the spectral wavelength variables for these species were compared (Figure 9). The loading spectra indicate which wavelength variables were associated with the highest variation. Regions of high variation reflect contributions of spectra correlated with wood properties. As seen in Figure 9, the wavelengths with the highest variation for Chinese white poplar were mainly at 354, 1451, 1939, and 2395 nm, where the loadings were higher than those of the other species, suggesting the highest contribution of spectra to the density of Chinese white poplar (Jilin).
SVM optimized by PSO has been widely applied to predict complicated sample properties. Zhang et al. [58] applied PSO and a genetic algorithm (GA) to optimize the parameters of SVM, and they found that PSO showed better learning ability and generalization in wood drying process modeling. A study presented by He et al. [59] indicated that the PSO-SVM model is more accurate than the routine linear regression model for the water quality retrievals of Weihe River in Shanxi province. Their results demonstrated the improvement of SVM models through PSO optimization. The results in our study support their findings (Table 4). However, the parameters of PSO, including population size, maximum generation, and cross-validation number, influenced the performance of PSO-SVM models; these parameters have usually been selected by trial and error or by surveying the settings used across many studies [47,48,49,50]. Thus, RSM was employed to find the best combination of parameters. The cross-validation number had a significant effect on PSO-SVM models regardless of species (Table 5, Table 6, Table 7 and Table 8). After RSM optimization, the prediction accuracy (Rp²) of Japanese elm was lower than that of the other three species, possibly due to the narrow range of density values used for modeling or the small prediction set (N = 18). In a follow-up study, a larger set of representative samples covering a wide density range is needed for good prediction. Nevertheless, the performance of the calibration models was improved, and the Rc² of the models exceeded 0.80 for these species (Table 9). This demonstrated the feasibility of RSM-based optimization.
Identification of geographical origin and tree species is critical for the establishment of a timber quality traceability system. Sandak et al. [60] applied Fourier transform near infrared (FT-NIR) spectroscopy to track the provenance of spruce and found that FT-NIR is sensitive enough to detect differences in wood of different provenance. Yang et al. [61] reported species identification correctness of up to 90% when NIR spectroscopy and PLS-DA were employed to identify wood species from different locations. In this study, PSO-SVM models using wavelet coefficients performed better than those using raw spectra for both geographical origin and tree species (Figure 7 and Figure 8). The reason is that wavelet coefficients extract the feature information of spectral signals, which simplifies the model structure and improves model robustness [62]. Provenance influences wood properties [63]. A total of 280 wood samples of three species were harvested from two origins, Heilongjiang and Jilin provinces, China (Table 1). The two provinces differ considerably in climate and soil type: Heilongjiang has mainly an East Asian monsoon climate with humus soil, whereas Jilin has a temperate continental monsoon climate with marsh soil. Precipitation and frost-free period also differ. These growing conditions result in a wide variation in wood properties and spectral data, and these large differences between species corresponded to changes in the Vis-NIR spectra, enabling rapid discrimination of geographical origin and tree species on the basis of Vis-NIR spectroscopy.
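
The way a lifting wavelet transform condenses a spectrum can be sketched with one lifting level (split, predict, update). The Haar wavelet is used here purely as an assumption for illustration, since this section does not state which wavelet the study used; the function names and the toy signal are hypothetical, and the signal length is assumed to be even.

```python
def haar_lifting_forward(signal):
    """One lifting level: split, predict (detail), update (approximation)."""
    even, odd = signal[0::2], signal[1::2]                 # split
    detail = [o - e for e, o in zip(even, odd)]            # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]     # update: pairwise mean
    return approx, detail

def haar_lifting_inverse(approx, detail):
    """Undo the update and predict steps, then merge even and odd samples."""
    even = [s - d / 2 for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    signal = [0.0] * (2 * len(even))
    signal[0::2], signal[1::2] = even, odd
    return signal

# Toy "spectrum" (hypothetical values, even length).
spectrum = [2.0, 4.0, 6.0, 8.0, 5.0, 5.0, 1.0, 3.0]
approx, detail = haar_lifting_forward(spectrum)
restored = haar_lifting_inverse(approx, detail)
```

The approximation coefficients have half the length of the input, which is the sense in which wavelet coefficients give a more compact input for modeling; de-noising would additionally shrink or zero small detail coefficients before reconstruction.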
Therefore, given the differences among species, what matters in practice is discriminating the origin of species listed under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), such as mahogany and Dalbergia nigra, which would provide a powerful technical tool for detecting logs harvested illegally from protected areas.

5. Conclusions

Simultaneously estimating wood properties and identifying origin and species is important for wood traceability, yet challenging with traditional methods. This study demonstrated that wavelet coefficients based on LWT not only reduce the dimensionality of the spectral data but also improve model accuracy. Vis-NIR spectroscopy coupled with chemometric analysis concurrently discriminates geographical origin and tree species and predicts wood properties with high accuracy, providing a rapid and non-destructive method of obtaining wood property information for the establishment of a wood traceability system.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, B.K.V., T.Y., and X.L.; validation, X.L.; formal analysis, B.K.V.; investigation, Y.L., X.L., and T.Y.; resources, Y.L. and X.L.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, B.K.V.; visualization, Y.L., B.K.V., T.Y., and X.L.; supervision, Y.L., B.K.V., T.Y., and X.L.; project administration, Y.L., B.K.V., T.Y., and X.L.; funding acquisition, X.L.

Funding

This research was funded by the China National Key Research and Development Program, grant number 2017YFC0504103.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Silva, D.C.; Pastore, T.C.; Soares, L.F.; De Barros, F.A.; Bergo, M.C.; Coradin, V.T.; Braga, J.W. Determination of the country of origin of true mahogany (Swietenia macrophylla King) wood in five Latin American countries using handheld NIR devices and multivariate data analysis. Holzforschung 2018, 7, 521–530.
  2. Bergo, M.C.; Pastore, T.C.; Coradin, V.T.; Wiedenhoeft, A.C.; Braga, J.W. NIRS identification of Swietenia macrophylla is robust across specimens from 27 countries. IAWA J. 2016, 37, 420–430.
  3. Braga, J.W.B.; Pastore, T.C.M.; Coradin, V.T.R.; Camargos, J.A.A.; Da Silva, A.R. The use of near infrared spectroscopy to identify solid wood specimens of Swietenia macrophylla (CITES Appendix II). IAWA J. 2011, 32, 285–296.
  4. Nisgoski, S.; Schardosin, F.Z.; Batista, F.R.R.; De Muñiz, G.I.B.; Carneiro, M.E. Potential use of NIR spectroscopy to identify Cryptomeria japonica varieties from southern Brazil. Wood Sci. Technol. 2016, 50, 71–80.
  5. Ng, K.K.S.; Lee, S.L.; Tnah, L.H.; Nurul-Farhanah, Z.; Ng, C.H.; Lee, C.T.; Khoo, E. Forensic timber identification: A case study of a CITES listed species, Gonystylus bancanus (Thymelaeaceae). Forensic Sci. Int. Genet. 2016, 23, 197–209.
  6. Tzoulis, I.; Andreopoulou, Z. Emerging traceability technologies as a tool for quality wood trade. Procedia Technol. 2013, 8, 606–611.
  7. Li, S.; Ji, W.; Chen, S.; Peng, J.; Zhou, Y.; Shi, Z. Potential of VIS-NIR-SWIR spectroscopy from the Chinese soil spectral library for assessment of nitrogen fertilization rates in the paddy-rice region, China. Remote Sens. 2015, 7, 7029–7043.
  8. Anting, N.; Din, M.F.M.; Iwao, K.; Ponraj, M.; Siang, A.J.L.M.; Yong, L.Y.; Prasetijo, J. Optimizing of near infrared region reflectance of mix-waste tile aggregate as coating material for cool pavement with surface temperature measurement. Energ. Build. 2018, 158, 172–180.
  9. Ncama, K.; Tesfay, S.Z.; Fawole, O.A.; Opara, U.L.; Magwaza, L.S. Non-destructive prediction of ‘Marsh’ grapefruit susceptibility to postharvest rind pitting disorder using reflectance Vis/NIR spectroscopy. Sci. Hortic-Amst. 2018, 231, 265–271.
  10. Acquah, G.E.; Via, B.K.; Billor, N.; Fasina, O.O.; Eckhardt, L.G. Identifying plant part composition of forest logging residue using infrared spectral data and linear discriminant analysis. Sensors 2016, 16, 1375.
  11. Schimleck, L.; Meder, R. Guest editorial: Has the time finally come for NIR in the forestry sector? J. Near Infrared Spectrosc. 2011, 19, v-v.
  12. Li, Y.; Li, Y.X.; Li, W.B.; Jiang, L.C. Model optimization of wood property and quality tracing based on wavelet transform and NIR spectroscopy. Spectrosc. Spect. Anal. 2018, 38, 1384–1392.
  13. Schimleck, L.; Dahlen, J.; Yoon, S.C.; Lawrence, K.C.; Jones, P.D. Prediction of Douglas-fir lumber properties: Comparison between a benchtop near-infrared spectrometer and hyperspectral imaging system. Appl. Sci. 2018, 8, 2602.
  14. Bächle, H.; Zimmer, B.; Wegener, G. Classification of thermally modified wood by FT-NIR spectroscopy and SIMCA. Wood Sci. Technol. 2012, 46, 1181–1192.
  15. Abasolo, M.; Lee, D.J.; Raymond, C.; Meder, R.; Shepherd, M. Deviant near-infrared spectra identifies Corymbia hybrids. Forest Ecol. Manag. 2013, 304, 121–131.
  16. Meder, R.; Kain, D.; Ebdon, N.; Macdonell, P.; Brawner, J.T. Identifying hybridisation in Pinus species using near infrared spectroscopy of foliage. J. Near Infrared Spectrosc. 2014, 22, 337–345.
  17. Sandak, A.; Sandak, J.; Negri, M. Discrimination of Wood Origin with Near-Infrared Spectroscopy. In Proceedings of the 14th International Conference on NIR Spectroscopy, Bangkok, Thailand, 9–13 November 2009.
  18. Schimleck, L.; Antony, F.; Mora, C.; Dahlen, J. Comparison of whole-tree wood property maps for 13- and 22-year-old loblolly pine. Forests 2018, 9, 287.
  19. Toscano, G.; Rinnan, Å.; Pizzi, A.; Mancini, M. The use of near-infrared (NIR) spectroscopy and principal component analysis (PCA) to discriminate bark and wood of the most common species of the pellet sector. Energ. Fuel. 2017, 31, 2814–2821.
  20. Xu, L.; Zhou, Y.P.; Tang, L.J.; Wu, H.L.; Jiang, J.H.; Shen, G.L.; Yu, R.Q. Ensemble preprocessing of near-infrared (NIR) spectra for multivariate calibration. Anal. Chim. Acta 2008, 616, 138–143.
  21. Hong, Y.; Chen, Y.; Yu, L.; Liu, Y.; Liu, Y.; Zhang, Y.; Liu, Y.; Cheng, H. Combining fractional order derivative and spectral variable selection for organic matter estimation of homogeneous soil samples by Vis–NIR spectroscopy. Remote Sens. 2018, 10, 479.
  22. Ebadi, L.; Shafri, H.Z.; Mansor, S.B.; Ashurov, R. A review of applying second-generation wavelets for noise removal from remote sensing data. Environ. Earth Sci. 2013, 70, 2679–2690.
  23. Esteban-Díez, I.; González-Sáiz, J.M.; Gómez-Cámara, D.; Millan, C.P. Multivariate calibration of near infrared spectra by orthogonal wavelet correction using a genetic algorithm. Anal. Chim. Acta 2006, 555, 84–95.
  24. Depczynski, U.; Jetter, K.; Molt, K.; Niemöller, A. Quantitative analysis of near infrared spectra by wavelet coefficient regression using a genetic algorithm. Chemometr. Intell. Lab. 1999, 47, 179–187.
  25. Tsuchikawa, S. A review of recent near infrared research for wood and paper. Appl. Spectrosc. Rev. 2007, 42, 43–71.
  26. Fei, S.W.; Wang, M.J.; Miao, Y.B.; Tu, J.; Liu, C.L. Particle swarm optimization-based support vector machine for forecasting dissolved gases content in power transformer oil. Energ. Convers. Manage. 2009, 50, 1604–1609.
  27. Sin, H.N.; Yusof, S.; Hamid, N.S.A.; Rahman, R.A. Optimization of enzymatic clarification of sapodilla juice using response surface methodology. J. Food Eng. 2006, 73, 313–319.
  28. Fan, G.; Han, Y.; Gu, Z.; Chen, D. Optimizing conditions for anthocyanins extraction from purple sweet potato using response surface methodology (RSM). LWT-Food Sci. Technol. 2008, 41, 155–160.
  29. Via, B.K.; Zhou, C.F.; Acquah, G.; Jiang, W.; Eckhardt, L. Near infrared spectroscopy calibration for wood chemistry: Which chemometric technique is best for prediction and interpretation? Sensors 2014, 14, 13532–13547.
  30. Liu, Y.; Liu, Y.; Chen, Y.; Zhang, Y.; Shi, T.; Wang, J.; Hong, Y.; Fei, T.; Zhang, Y. The influence of spectral pretreatment on the selection of representative calibration samples for soil organic matter estimation using Vis-NIR reflectance spectroscopy. Remote Sens. 2019, 11, 450.
  31. Li, X.L.; Sun, C.J.; Zhou, B.X.; He, Y. Determination of hemicellulose, cellulose and lignin in Moso Bamboo by near infrared spectroscopy. Sci. Rep. 2015, 5, 17210.
  32. Daubechies, I.; Sweldens, W. Factoring wavelet transforms into lifting steps. J. Fourier Anal. Appl. 1998, 4, 247–269.
  33. Mehta, R.; Rajpal, N.; Vishwakarma, V.P. A robust and efficient image watermarking scheme based on Lagrangian SVR and lifting wavelet transform. Int. J. Mach. Learn. Cyber. 2017, 8, 379–395.
  34. Li, Y.; Via, B.K.; Cheng, Q.Z.; Li, Y.X. Lifting wavelet transform de-noising for model optimization of Vis-NIR spectroscopy to predict wood tracheid length in trees. Sensors 2018, 18, 4306.
  35. Zhou, F.B.; Li, C.G.; Zhu, H.Q. Research on threshold improved denoising algorithm based on lifting wavelet transform in UV-Vis spectrum. Spectrosc. Spect. Anal. 2018, 38, 506–510.
  36. Li, X.Y.; Xu, Z.H.; Cai, W.S.; Shao, X.G. Filter design for molecular factor computing using wavelet functions. Anal. Chim. Acta 2015, 880, 26–31.
  37. Zhang, Y.; Zou, H.Y.; Shi, P.; Yang, Q.; Tang, L.J.; Jiang, J.H.; Wu, H.L.; Yu, R.Q. Determination of benzo[a]pyrene in cigarette mainstream smoke by using mid-infrared spectroscopy associated with a novel chemometric algorithm. Anal. Chim. Acta 2016, 902, 43–49.
  38. Deo, R.C.; Wen, X.; Qi, F. A wavelet-coupled support vector machine model for forecasting global incident solar radiation using limited meteorological dataset. Appl. Energ. 2016, 168, 568–593.
  39. Xie, Z.; Chen, Y.; Lu, D.; Li, G.; Chen, E. Classification of land cover, forest, and tree species classes with ZiYuan-3 multispectral and stereo data. Remote Sens. 2019, 11, 164.
  40. Zhang, X.; Qiu, D.; Chen, F. Support vector machine with parameter optimization by a novel hybrid method and its application to fault diagnosis. Neurocomputing 2015, 149, 641–651.
  41. Liu, Y.S.; Zhou, S.B.; Liu, W.X.; Yang, X.H.; Luo, J. Least-squares support vector machine and successive projection algorithm for quantitative analysis of cotton-polyester textile by near infrared spectroscopy. J. Near Infrared Spectrosc. 2018, 26, 34–43.
  42. Cogdill, R.P.; Dardenne, P. Least-squares support vector machines for chemometrics: An introduction and evaluation. J. Near Infrared Spectrosc. 2004, 12, 93–100.
  43. Mora, C.R.; Schimleck, L.R. Kernel regression methods for the prediction of wood properties of Pinus taeda using near infrared spectroscopy. Wood Sci. Technol. 2010, 44, 561–578.
  44. Wang, D.; Xie, L.; Yang, S.; Tian, F. Support vector machine optimized by genetic algorithm for data analysis of near-infrared spectroscopy sensors. Sensors 2018, 18, 3222.
  45. Lin, S.W.; Ying, K.C.; Chen, S.C.; Lee, Z.J. Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst. Appl. 2008, 35, 1817–1824.
  46. Fu, C.; Gan, S.; Yuan, X.; Xiong, H.; Tian, A. Determination of soil salt content using a probability neural network model based on particle swarm optimization in areas affected and non-affected by human activities. Remote Sens. 2018, 10, 1387.
  47. Wu, Y.H.; Shen, H. Grey-related least squares support vector machine optimization model and its application in predicting natural gas consumption demand. J. Comput. Appl. Math. 2018, 338, 212–220.
  48. Lei, Z.; Ji, T.Y.; Xie, C.X.; Li, M.S.; Wu, Q.H. Power Quality Disturbance Identification Using Improved Particle Swarm Optimizer and Support Vector Machine. In Proceedings of the 2018 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Singapore, 22–25 May 2018.
  49. Bian, Y.; Yang, M.; Fan, X.; Liu, Y. A fire detection algorithm based on Tchebichef moment invariants and PSO-SVM. Algorithms 2018, 11, 79.
  50. Zhou, G.F.; Tan, W.; Zhang, D. Ice detection for wind turbine blades based on PSO-SVM method. J. Phys. Conf. Ser. 2018, 1087, 022036.
  51. He, B.; Zhang, L.L.; Yue, X.Y.; Liang, J.; Jiang, J.; Gao, X.L.; Yue, P.X. Optimization of ultrasound-assisted extraction of phenolic compounds and anthocyanins from blueberry (Vaccinium ashei) wine pomace. Food Chem. 2016, 204, 70–76.
  52. Zhang, G.J.; Li, L.N.; Li, Q.B.; Xu, Y.P. Application of denoising and background elimination based on wavelet transform to blood glucose noninvasive measurement of near infrared spectroscopy. J. Infrared Millim. Waves 2009, 28, 107–110.
  53. Schimleck, L.; Evans, R.; Ilic, J. Application of near infrared spectroscopy to a diverse range of species demonstrating wide density and stiffness variation. IAWA J. 2001, 22, 415–429.
  54. Cheng, Q.; Zhou, C.; Jiang, W.; Zhao, X.; Via, B.K.; Wan, H. Mechanical and physical properties of oriented strand board exposed to high temperature and relative humidity and coupled with near-infrared reflectance modeling. Forest Prod. J. 2018, 68, 78–85.
  55. Jetter, K.; Depczynski, U.; Molt, K.; Niemöller, A. Principles and applications of wavelet transformation to chemometrics. Anal. Chim. Acta 2000, 420, 169–180.
  56. Zhang, Q.J.; Luo, Z.Z. Wavelet De-noising of Electromyography. In Proceedings of the 2006 International Conference on Mechatronics and Automation, Henan, China, 25–28 June 2006.
  57. Liao, Y.; Fan, Y.; Cheng, F. On-line prediction of pH values in fresh pork using visible/near-infrared spectroscopy with wavelet de-noising and variable selection methods. J. Food Eng. 2012, 109, 668–675.
  58. Zhang, D.Y.; Zhang, C.Y.; Zhu, L.K.; Diao, Z.D. A novel intelligent modeling method for wood drying process. Appl. Mech. Mater. 2012, 121, 647–651.
  59. He, T.D.; Li, J.W. A model for water quality remote retrieval based on support vector regression with parameters optimized by particle swarm optimization. Adv. Mater. Res. 2011, 383, 3593–3597.
  60. Sandak, A.; Sandak, J.; Negri, M. Relationship between near-infrared (NIR) spectra and the geographical provenance of timber. Wood Sci. Technol. 2011, 45, 35–48.
  61. Yang, Z.; Liu, Y.; Pang, X.; Li, K. Preliminary investigation into the identification of wood species from different locations by near infrared spectroscopy. BioResources 2015, 10, 8505–8517.
  62. Liu, L.; Ji, M.; Buchroithner, M. Combining partial least squares and the gradient-boosting method for soil property retrieval using visible near-infrared shortwave infrared spectra. Remote Sens. 2017, 9, 1299.
  63. Bhat, K.M.; Priya, P.B. Influence of provenance variation on wood properties of teak from the Western Ghat region in India. IAWA J. 2004, 25, 273–282.
Figure 1. The procedure of lifting wavelet transform (LWT).
Figure 2. An overview of steps in visible and near infrared (Vis-NIR) spectra analysis. PLS: partial least squares, PSO: particle swarm optimization, SVM: support vector machine, RSM: response surface methodology.
Figure 3. The raw Vis-NIR spectra with baseline correction for each species. (A–D) are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively.
Figure 4. The de-noising results of LWT for each species. (A–D) are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively. (a) and (b) are constructed spectra and wavelet coefficients, respectively.
Figure 5. The optimal wavelet coefficients for each species. (A–D) are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively.
Figure 6. Response surface plot for interactions between variables on CVmse (mean squared error of cross-validation) of PSO-SVM models. (A–D) are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively.
Figure 7. Clustering results of geographical origin for prediction set: (a) raw spectra; (b) wavelet coefficients.
Figure 8. Clustering results of tree species for prediction set: (a) raw spectra; (b) wavelet coefficients. Category labels A–D are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively.
Figure 9. Vis-NIR loadings for density PLS calibration models. (A–D) are Dahurian larch, Japanese elm, Chinese white poplar (Jilin), and Chinese white poplar (Heilongjiang), respectively.
Table 1. Geographical origin information used in this study.

| Species | Geographical Origin | Soil Type | Climate | Elevation (m) | Frost-Free Period (day) | Annual Precipitation (mm) | Mean Annual Temperature (°C) |
|---|---|---|---|---|---|---|---|
| Dahurian larch, Chinese white poplar | Heilongjiang | Black soil, loess soil, humus soil | East Asian monsoon climate | 180–392 | 110–120 | 700–750 | 3.7 |
| Japanese elm, Chinese white poplar | Jilin | Dark brown soil, marsh soil | Temperate continental monsoon climate | 125–555 | 120–135 | 530–550 | 3 |

Table 2. Experimental values and coded levels of variables for Box–Behnken design.

| Factor Level | No. of Cross-Validation | Maximum Generation | Population Size |
|---|---|---|---|
| −1 | 5 | 50 | 20 |
| 0 | 10 | 75 | 40 |
| 1 | 15 | 100 | 60 |

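
The coded levels in Table 2 can be expanded into a run list for a three-factor Box–Behnken design, as in the sketch below. The function and factor names are hypothetical labels for the Table 2 columns, and the number of centre points (five here, giving 17 runs) is an assumption, since it is not stated in this section.

```python
from itertools import combinations, product

# Coded level -> experimental value, taken from Table 2.
LEVELS = {
    "cross_validation": {-1: 5, 0: 10, 1: 15},
    "max_generation": {-1: 50, 0: 75, 1: 100},
    "population_size": {-1: 20, 0: 40, 1: 60},
}

def box_behnken(n_factors=3, n_center=5):
    """Coded Box-Behnken runs: each factor pair at +/-1 with the rest at 0,
    followed by n_center centre points (n_center is an assumption)."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * n_factors] * n_center)
    return runs

def decode(run):
    """Map a coded run to experimental settings using LEVELS."""
    return {name: LEVELS[name][level] for name, level in zip(LEVELS, run)}

design = box_behnken()
settings = [decode(run) for run in design]
```

Each entry of `settings` would then correspond to one PSO-SVM training run whose cross-validation error is recorded as the response.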
Table 3. Descriptive statistics of density (g⋅cm−3) for each species. Calibration set: N = 52; prediction set: N = 18.

| Species | Calibration Max | Calibration Min | Calibration Avg. | Calibration SD | Prediction Max | Prediction Min | Prediction Avg. | Prediction SD |
|---|---|---|---|---|---|---|---|---|
| Dahurian larch | 1.119 | 0.717 | 0.891 | 0.119 | 1.072 | 0.734 | 0.886 | 0.111 |
| Japanese elm | 1.197 | 0.800 | 0.988 | 0.106 | 1.109 | 0.857 | 0.992 | 0.075 |
| Chinese white poplar (Jilin) | 0.823 | 0.700 | 0.761 | 0.030 | 0.810 | 0.709 | 0.763 | 0.028 |
| Chinese white poplar (Heilongjiang) | 0.822 | 0.674 | 0.757 | 0.039 | 0.804 | 0.703 | 0.733 | 0.031 |

Table 4. PLS and particle swarm optimization (PSO)-support vector machine (SVM) model statistics of wood density ¹.

| Species | PLS (raw spectra): LVs | PLS (raw spectra): $R_c^2$ | PLS (raw spectra): RMSEC | PLS (raw spectra): RPD | PLS (wavelet coefficients): LVs | PLS (wavelet coefficients): $R_c^2$ | PLS (wavelet coefficients): RMSEC | PLS (wavelet coefficients): RPD | PSO-SVM (wavelet coefficients): $R_c^2$ | PSO-SVM (wavelet coefficients): RMSEC | PSO-SVM (wavelet coefficients): RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Dahurian larch | 1 | 0.823 | 0.050 | 2.375 | 2 | 0.835 | 0.048 | 2.463 | 0.935 | 0.030 | 3.930 |
| Japanese elm | 3 | 0.816 | 0.045 | 2.333 | 4 | 0.826 | 0.044 | 2.398 | 0.844 | 0.041 | 2.530 |
| Chinese white poplar (Jilin) | 5 | 0.826 | 0.012 | 2.400 | 5 | 0.843 | 0.012 | 2.520 | 0.931 | 0.008 | 3.814 |
| Chinese white poplar (Heilongjiang) | 3 | 0.767 | 0.018 | 2.072 | 6 | 0.793 | 0.017 | 2.198 | 0.843 | 0.042 | 2.524 |

¹ RMSEC denotes the root mean square error of calibration and RPD denotes residual predictive deviation.
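
For readers who want to reproduce statistics of this kind, the sketch below computes RMSE, the coefficient of determination, and RPD for a set of reference and predicted values. The formulas are the common textbook ones and are an assumption here: the study does not state its exact definitions in this section, RPD is taken as the standard deviation of the reference values divided by the RMSE, and the population standard deviation is used so that RPD equals 1/sqrt(1 − R²) exactly. The data are hypothetical.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / SST."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst

def rpd(y_true, y_pred):
    """Assumed definition: SD of reference values divided by RMSE."""
    return statistics.pstdev(y_true) / rmse(y_true, y_pred)

# Hypothetical densities (g/cm^3), not taken from the study.
y_true = [0.70, 0.75, 0.80, 0.85, 0.90]
y_pred = [0.72, 0.74, 0.81, 0.84, 0.89]

error = rmse(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
ratio = rpd(y_true, y_pred)
```

Under this assumed definition, an $R_c^2$ of 0.823 corresponds to an RPD of about 2.38, which is close to the 2.375 reported for Dahurian larch with raw spectra once rounding of $R_c^2$ is taken into account.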
Table 5. The variance analysis results of Dahurian larch. DF: degrees of freedom, SS: sum of squares, MS: mean square.

| Source | DF | SS | MS | F-Value | p-Value |
|---|---|---|---|---|---|
| Model | 9 | 0.0001 | 0.0000 | 3343.02 | <0.0001 |
| Linear | | | | | |
| $X_1$ | 1 | 0.0001 | 0.0001 | 22,450.82 | <0.0001 |
| $X_2$ | 1 | 2.112 × 10⁻⁸ | 2.112 × 10⁻⁸ | 5.95 | 0.0449 |
| $X_3$ | 1 | 5.969 × 10⁻⁸ | 5.969 × 10⁻⁸ | 16.81 | 0.0046 |
| Two-way interaction | | | | | |
| $X_1X_2$ | 1 | 5.625 × 10⁻¹¹ | 5.625 × 10⁻¹¹ | 0.0158 | 0.9034 |
| $X_1X_3$ | 1 | 2.722 × 10⁻¹⁰ | 2.722 × 10⁻¹⁰ | 0.0767 | 0.7899 |
| $X_2X_3$ | 1 | 8.649 × 10⁻⁹ | 8.649 × 10⁻⁹ | 2.44 | 0.1626 |
| Quadratic | | | | | |
| $X_1^2$ | 1 | 0.0000 | 0.0000 | 7442.07 | <0.0001 |
| $X_2^2$ | 1 | 2.626 × 10⁻⁸ | 2.626 × 10⁻⁸ | 7.40 | 0.0298 |
| $X_3^2$ | 1 | 1.195 × 10⁻⁷ | 1.195 × 10⁻⁷ | 33.66 | 0.0007 |
| Lack of fit | 3 | 2.486 × 10⁻⁸ | 8.285 × 10⁻⁹ | 1.036 × 10⁶ | <0.0001 |

$R^2$ = 0.9998
Table 6. The variance analysis results of Japanese elm.

| Source | DF | SS | MS | F-Value | p-Value |
|---|---|---|---|---|---|
| Model | 9 | 0.0000 | 2.606 × 10⁻⁶ | 16.26 | 0.0007 |
| Linear | | | | | |
| $X_1$ | 1 | 0.0000 | 0.0000 | 105.35 | <0.0001 |
| $X_2$ | 1 | 9.244 × 10⁻⁷ | 9.244 × 10⁻⁷ | 5.77 | 0.0474 |
| $X_3$ | 1 | 9.785 × 10⁻⁷ | 9.785 × 10⁻⁷ | 6.10 | 0.0428 |
| Two-way interaction | | | | | |
| $X_1X_2$ | 1 | 1.023 × 10⁻⁷ | 1.023 × 10⁻⁷ | 0.6381 | 0.4507 |
| $X_1X_3$ | 1 | 5.856 × 10⁻⁸ | 5.856 × 10⁻⁸ | 0.3654 | 0.5646 |
| $X_2X_3$ | 1 | 1.392 × 10⁻⁶ | 1.392 × 10⁻⁶ | 8.68 | 0.0215 |
| Quadratic | | | | | |
| $X_1^2$ | 1 | 2.435 × 10⁻⁶ | 2.435 × 10⁻⁶ | 15.19 | 0.0059 |
| $X_2^2$ | 1 | 2.073 × 10⁻⁷ | 2.073 × 10⁻⁷ | 1.29 | 0.2929 |
| $X_3^2$ | 1 | 2.616 × 10⁻⁷ | 2.616 × 10⁻⁷ | 1.63 | 0.2421 |
| Lack of fit | 3 | 9.486 × 10⁻⁷ | 3.162 × 10⁻⁷ | 7.30 | 0.0424 |

$R^2$ = 0.9544
Table 7. The variance analysis results of Chinese white poplar (Jilin).

| Source | DF | SS | MS | F-Value | p-Value |
|---|---|---|---|---|---|
| Model | 9 | 0.0002 | 0.0000 | 2712.41 | <0.0001 |
| Linear | | | | | |
| $X_1$ | 1 | 0.0001 | 0.0001 | 19,504.74 | <0.0001 |
| $X_2$ | 1 | 9.279 × 10⁻⁸ | 9.279 × 10⁻⁸ | 12.73 | 0.0091 |
| $X_3$ | 1 | 4.667 × 10⁻⁸ | 4.667 × 10⁻⁸ | 6.40 | 0.0392 |
| Two-way interaction | | | | | |
| $X_1X_2$ | 1 | 6.240 × 10⁻⁸ | 6.240 × 10⁻⁸ | 8.56 | 0.0222 |
| $X_1X_3$ | 1 | 1.782 × 10⁻⁸ | 1.782 × 10⁻⁸ | 2.44 | 0.1619 |
| $X_2X_3$ | 1 | 2.500 × 10⁻¹¹ | 2.500 × 10⁻¹¹ | 0.0034 | 0.9549 |
| Quadratic | | | | | |
| $X_1^2$ | 1 | 0.0000 | 0.0000 | 4851.62 | <0.0001 |
| $X_2^2$ | 1 | 3.351 × 10⁻⁷ | 3.351 × 10⁻⁷ | 45.98 | 0.0003 |
| $X_3^2$ | 1 | 1.368 × 10⁻⁹ | 1.368 × 10⁻⁹ | 0.1877 | 0.6779 |
| Lack of fit | 3 | 4.810 × 10⁻⁸ | 1.603 × 10⁻⁸ | 21.94 | 0.0060 |

$R^2$ = 0.9997
Table 8. The variance analysis results of Chinese white poplar (Heilongjiang).

| Source | DF | SS | MS | F-Value | p-Value |
|---|---|---|---|---|---|
| Model | 9 | 0.0004 | 0.0000 | 4.36 | 0.0326 |
| Linear | | | | | |
| $X_1$ | 1 | 0.0002 | 0.0002 | 15.47 | 0.0057 |
| $X_2$ | 1 | 0.0000 | 0.0000 | 1.22 | 0.3067 |
| $X_3$ | 1 | 4.557 × 10⁻⁶ | 4.557 × 10⁻⁶ | 0.4681 | 0.5159 |
| Two-way interaction | | | | | |
| $X_1X_2$ | 1 | 6.268 × 10⁻⁶ | 6.268 × 10⁻⁶ | 0.6438 | 0.4487 |
| $X_1X_3$ | 1 | 3.660 × 10⁻⁹ | 3.660 × 10⁻⁹ | 0.0004 | 0.9851 |
| $X_2X_3$ | 1 | 8.860 × 10⁻⁶ | 8.860 × 10⁻⁶ | 0.9100 | 0.3719 |
| Quadratic | | | | | |
| $X_1^2$ | 1 | 0.0002 | 0.0002 | 19.72 | 0.0030 |
| $X_2^2$ | 1 | 5.294 × 10⁻⁷ | 5.294 × 10⁻⁷ | 0.0544 | 0.8223 |
| $X_3^2$ | 1 | 4.316 × 10⁻⁶ | 4.316 × 10⁻⁶ | 0.4433 | 0.5269 |
| Lack of fit | 3 | 0.0000 | 3.681 × 10⁻⁶ | 0.2578 | 0.8528 |

$R^2$ = 0.8486
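
The variance analyses in Tables 5–8 split the fitted response surface into linear, two-way interaction, and quadratic terms. Assuming the usual second-order polynomial of response surface methodology, the model can be written as below. The symbols $y$ (the response, for example CVmse as in Figure 6), $\beta$, and $\varepsilon$ are notation introduced here for illustration and do not appear in this section.

```latex
y = \beta_0
  + \sum_{i=1}^{3} \beta_i X_i
  + \sum_{i=1}^{3} \beta_{ii} X_i^2
  + \sum_{1 \le i < j \le 3} \beta_{ij} X_i X_j
  + \varepsilon
```

The three linear, three quadratic, and three interaction coefficients account for the nine model degrees of freedom listed in each table.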
Table 9. PSO-SVM model optimized by RSM statistics for wood density.

| Species | Calibration $R_c^2$ | Calibration RMSEC | Calibration RPD | Prediction $R_p^2$ | Prediction RMSEP | Prediction RPD |
|---|---|---|---|---|---|---|
| Dahurian larch | 0.953 | 0.026 | 4.619 | 0.924 | 0.030 | 3.617 |
| Japanese elm | 0.974 | 0.017 | 6.160 | 0.789 | 0.034 | 2.175 |
| Chinese white poplar (Jilin) | 0.959 | 0.006 | 4.964 | 0.924 | 0.007 | 3.635 |
| Chinese white poplar (Heilongjiang) | 0.837 | 0.047 | 2.477 | 0.921 | 0.022 | 3.558 |

Table 10. The detailed description of the origin classification results for prediction set.

| Category Label | Raw Spectra: 1 | Raw Spectra: 2 | Wavelet Coefficients: 1 | Wavelet Coefficients: 2 |
|---|---|---|---|---|
| 1 | 18 | 0 | 18 | 0 |
| 2 | 0 | 18 | 0 | 18 |

Correctness: 100% (raw spectra); 100% (wavelet coefficients).
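Table 11. The detailed description of the species classification results for prediction set.

| Category Label | Raw Spectra: A | Raw Spectra: B | Raw Spectra: C | Raw Spectra: D | Wavelet Coefficients: A | Wavelet Coefficients: B | Wavelet Coefficients: C | Wavelet Coefficients: D |
|---|---|---|---|---|---|---|---|---|
| A | 16 | 0 | 2 | 0 | 18 | 0 | 0 | 0 |
| B | 0 | 11 | 9 | 0 | 1 | 17 | 0 | 0 |
| C | 0 | 0 | 18 | 0 | 0 | 0 | 18 | 0 |
| D | 0 | 0 | 0 | 18 | 0 | 0 | 0 | 18 |

Correctness: 84.72% (raw spectra); 98.61% (wavelet coefficients).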
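
The correctness figure for the wavelet-coefficient model in Table 11 can be checked from the counts with a short calculation. In the sketch below the function names are hypothetical, and the reading of rows as labelled categories and columns as assigned categories is an assumption about the table layout.

```python
def overall_accuracy(confusion):
    """Share of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def per_class_recall(confusion):
    """Fraction of each row's samples assigned to the matching column."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]

# Wavelet-coefficient counts from Table 11 (categories A, B, C, D).
wavelet_counts = [
    [18, 0, 0, 0],
    [1, 17, 0, 0],
    [0, 0, 18, 0],
    [0, 0, 0, 18],
]

accuracy = overall_accuracy(wavelet_counts)
recall = per_class_recall(wavelet_counts)
```

Here 71 of the 72 prediction samples lie on the diagonal, which reproduces the reported 98.61%; the single error is a Japanese elm sample assigned to category A.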

Share and Cite

MDPI and ACS Style

Li, Y.; Via, B.K.; Young, T.; Li, Y. Visible-Near Infrared Spectroscopy and Chemometric Methods for Wood Density Prediction and Origin/Species Identification. Forests 2019, 10, 1078. https://doi.org/10.3390/f10121078

AMA Style

Li Y, Via BK, Young T, Li Y. Visible-Near Infrared Spectroscopy and Chemometric Methods for Wood Density Prediction and Origin/Species Identification. Forests. 2019; 10(12):1078. https://doi.org/10.3390/f10121078

Chicago/Turabian Style

Li, Ying, Brian K. Via, Tim Young, and Yaoxiang Li. 2019. "Visible-Near Infrared Spectroscopy and Chemometric Methods for Wood Density Prediction and Origin/Species Identification" Forests 10, no. 12: 1078. https://doi.org/10.3390/f10121078
