Article

A Long Short-Term Memory Neural Network Based Simultaneous Quantitative Analysis of Multiple Tobacco Chemical Components by Near-Infrared Hyperspectroscopy Images

1 College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA
3 BOE Technology Group Co., Ltd., Chongqing 400799, China
4 School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400066, China
5 School of Chemistry and Chemical Engineering, Chongqing University, Chongqing 400044, China
* Author to whom correspondence should be addressed.
Chemosensors 2022, 10(5), 164; https://doi.org/10.3390/chemosensors10050164
Submission received: 16 March 2022 / Revised: 25 April 2022 / Accepted: 26 April 2022 / Published: 28 April 2022
(This article belongs to the Section Analytical Methods, Instrumentation and Miniaturization)

Abstract

Near-infrared (NIR) spectroscopy has been widely used in agricultural operations to obtain various crop parameters, such as water content, sugar content, and different indicators of ripeness, as well as other information about crops that cannot be obtained directly by human observation. The chemical composition of tobacco plays an important role in the quality of cigarettes, and NIR spectroscopy-based chemical composition analysis has recently become one of the most effective methods in tobacco quality analysis. Existing NIR spectroscopy-related solutions either have relatively low analysis accuracy or are only able to analyze one or two chemical components, so a more precise prediction model is needed to improve the analysis accuracy of NIR data. This paper proposes a tobacco chemical component analysis method based on a neural network (TCCANN) to quantitatively analyze the chemical components of tobacco leaves from NIR spectroscopy, including nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH value. The proposed TCCANN consists of a residual network (ResNet) and a long short-term memory (LSTM) neural network. ResNet is applied to feature extraction from the high-dimensional NIR spectroscopy data, which effectively avoids the vanishing-gradient issue caused by increasing network depth. LSTM is used to quantitatively analyze the multiple chemical compositions of tobacco leaves simultaneously: it selectively allows information to pass through gated units, thereby comprehensively analyzing the correlation between the multiple chemical components and the corresponding spectra. The experimental results confirm that the proposed TCCANN not only predicts the values of all seven chemical components simultaneously, but also achieves better prediction performance than other existing machine learning methods.

1. Introduction

As an important economic plant, tobacco has a total global economic cost estimated at around USD 1.85 trillion, or about 1.8% of global GDP, and this figure continues to increase [1]. Tobacco contains a variety of chemical compositions, some of which contribute to the flavor and aroma of tobacco [2], and some of which affect its organoleptic quality [3]. Therefore, these chemical substances determine tobacco quality. The main chemical compositions of tobacco include nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH, among others. Nicotine has a strong effect on both the aroma and taste of tobacco products, and nicotine intake can have side effects on the body [4]. Both reducing sugar and total sugar are positively correlated with aftertaste, irritation, and aroma quality, and the amount of total nitrogen has a positive relationship with the smoke concentration and smoking strength [5]. The pH value of tobacco is a determinant of acute toxicity and is correlated with the total nitrogen, total alkaloid, and total volatile alkali bases of tobacco [6,7]. The amount of potassium has a positive relationship with the flavor and the degree of wetness [8].
Another important aspect of analyzing the chemical composition is that it affects the level of toxicants [9]. Some compounds derived from nicotine and total nitrogen are known or suspected human carcinogens and are linked to smoking-related diseases [10]. To ensure stable quality, reduce deleterious compositions, and improve the flavor of cigarette products, cigarette manufacturers need to rapidly obtain the test results of the chemical compositions of raw materials, processing intermediates, and final products. However, the traditional method for quantitative determination of the chemical composition requires a series of processing steps, which involve cracking and grinding the leaves and then reacting them with a variety of chemical reagents [4,9]. The chemical reaction process is time-consuming and destructive to tobacco samples. Moreover, it involves expensive equipment and complex operations that require expert personnel. Therefore, establishing a standard and efficient method to quantitatively detect the chemical compositions of tobacco leaf samples is valuable and essential.
NIR spectroscopy has become a powerful analytical method for process analytical chemistry owing to its rapid and non-destructive measurements. The NIR spectroscopy technique is widely applied to quantitative and qualitative analysis in the tobacco industry [11,12,13]. The spectra acquired from NIR sensors have the potential to provide corresponding feature information of the samples, which can reflect the rich structure and composition of tobacco [14,15,16,17]. By using the NIR spectral data to establish an analysis model, the chemical composition of tobacco leaves can be quickly determined by the model without any destructive effect on the tobacco.
Data analysis approaches for NIR spectroscopy play essential and vital roles in tobacco quality determination. The NIR spectra gathered with an NIR sensor are high-dimensional and include a large number of data feature points of tobacco leaves [15,18,19,20]. Existing analysis models of NIR spectroscopy use a variety of spectral preprocessing methods to reduce uninformative features, collinearity, and redundancy in the spectral data [12]. Existing spectral preprocessing methods, which include principal component analysis (PCA) [21,22], wavelet transformation (WT) [23], standard normal variate (SNV) [15], and spectral derivatives [11], first reduce the data dimensions and then extract the key information. However, the processing steps of existing methods are excessively complicated, and many of the involved parameters need to be adjusted; the results obtained with different parameter settings vary. Some important information in the original spectra may be lost during preprocessing, so the processed spectra cannot sufficiently reflect the relationship between the tobacco and the NIR spectra.
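As a point of reference, one of the preprocessing steps mentioned above, the standard normal variate (SNV) transform, can be sketched in a few lines of Python; this illustration is not part of the proposed method, and the array name spectra is hypothetical.

import numpy as np

def snv(spectra):
    # Standard normal variate: center and scale each spectrum (row) by its
    # own mean and standard deviation to reduce scatter effects.
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std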
With the rapid development of deep learning, various deep neural network models have been proposed in recent years [24,25,26,27]. The convolutional neural network (CNN) [28,29,30] has a powerful ability for feature extraction and can considerably reduce the number of training parameters and the training time by connecting local and global features [31,32]. The residual network (ResNet) [24] was proposed to solve or alleviate the issue of gradient disappearance caused by increasing network depth. ResNet introduces residual blocks, which enable data to realize identity mapping between different network layers in a deep network and fully extract advanced data features [24,33]. As a time-recurrent neural network, the long short-term memory (LSTM) neural network [34] can learn long-range dependence information. LSTM was applied to solve the issues of both gradient explosion and gradient disappearance that might occur in the training process of a traditional recurrent neural network (RNN) by using an internal memory unit and a gate mechanism [35,36]. Recently, deep learning has been reported to be applied to regression and classification problems in NIR spectral data analysis [12,37]. Wang [38] applied a one-dimensional convolutional neural network (1-D CNN) to identify the cultivation areas of tobacco by analyzing NIR spectra. These studies have shown that, compared with other machine learning methods, spectral analysis with deep learning can obtain better calibration performance [12,37]. This paper proposes TCCANN to achieve simultaneous quantitative analyses of multiple chemical compositions of tobacco by NIR spectroscopy. The proposed TCCANN employs a ResNet network to extract features of the NIR spectroscopy data and uses an LSTM network for the quantitative analyses of multiple chemical compositions.
The LSTM network can keep data information in internal gated units and achieve selective information transmission through these gated units [39]. Thus, the LSTM model can comprehensively analyze the correlation between chemical compositions and spectra, and achieve simultaneous quantitative analyses of multiple chemical compositions. A fully convolutional structure is used in TCCANN. In the network training process, the data dimension can be reduced by pooling layers, which decreases the number of parameters [40]. However, pooling layers may cause the loss of internal information and affect the training accuracy [41]. The features of tobacco spectroscopy at the different wavelengths of the NIR spectra are correlated with each other [14,38]. NIR spectra contain the information of the various components to be analyzed as well as the related background information [42]. In order to avoid the loss of spectral information during feature extraction, the max-pooling layer is replaced by a convolutional layer with a stride of two. In the network, a batch normalization (BN) operation [43] is applied to each convolutional layer to speed up network training and improve the generalization ability of the proposed network. Three appraisal indicators are used to evaluate the performance of the proposed TCCANN, and we compare the prediction performance of existing analysis methods with that of the TCCANN model. The paper has three main contributions, as follows.
  • A ResNet module is used to directly extract the features of high-dimensional NIR spectra. Traditional methods cannot process high-dimensional NIR spectra without preprocessing, yet part of the original spectral information may be lost in data preprocessing. In our proposed TCCANN, through the internal residual blocks, the ResNet network enables high-dimensional data to achieve identity mapping between different network layers and enables the shallow data to be identity mapped to the deep network. The ResNet network can thus fully extract the features of the spectra and improve the prediction accuracy of the proposed TCCANN.
  • The proposed ResNet network uses a fully convolutional structure to avoid the loss of spectral information in the feature extraction process and can extract spectroscopy features more effectively. The tobacco features at each wavelength point in the NIR spectra are correlated with each other. A pooling layer can reduce the number of network parameters by decreasing the feature dimensions, but partial feature information may be lost in the pooling process, which affects the performance of the proposed TCCANN. TCCANN therefore uses a convolutional layer with a stride of two to replace the pooling layer. In the feature extraction process, the relevance in the feature information of the spectra can be retained, the loss of spectral information can be reduced, and the prediction accuracy of the proposed model can be improved.
  • The proposed TCCANN uses LSTM network to achieve the simultaneous quantitative analyses of multiple chemical compositions of tobacco. Existing methods need to consider the correlation between each chemical component and the NIR spectra, and can only analyze each chemical composition individually. LSTM network can keep data information in internal gate units and achieve selective information transmission. Thus, the comprehensive correlation analysis between multiple chemical components and spectra, and simultaneous quantitative analyses of multiple chemical compositions of tobacco can be achieved. Compared with existing models, the proposed TCCANN is simple to operate, requires less running time, and achieves better prediction performance.
The remainder of this paper is organized as follows. Section 2 discusses related work; Section 3 presents the proposed TCCANN in detail; Section 4 evaluates the proposed TCCANN and compares the corresponding results with existing solutions; Section 5 concludes this paper.

2. Related Work

Many studies have been conducted to analyze the chemical compositions of various plants and foods by using NIR spectroscopy. Machine learning methods, such as partial least squares (PLS) [45], support vector machine (SVM) [11], and least-squares support vector machine (LS-SVM) [44], have been widely used to analyze chemical compositions. In recent studies, researchers have used spectral preprocessing, various modelling procedures, and the optimization of model parameters to improve determination accuracy. Chen [46] used fractional-calculus-augmented NIR spectra to detect the nitrogen contents of rubber trees. In this method, fractional calculus was used to extract additional information from the original spectra, and derivatives of different orders were analyzed; the selected wavelengths were then used by the PLS regression method to develop the estimation model. Olarewaju [47] developed a multivariate calibration model based on the PLS regression algorithm to determine the rind biochemical properties of citrus fruit from visible to near-infrared (Vis/NIR) spectra. Some mathematical preprocessing methods were introduced to develop regression models for NIR spectral analysis. Ting [45] developed a synergy interval partial least squares (Si-PLS) method for the quantitative analysis of the total flavonoid content (TFC) in Goji berries. The Si-PLS method first split the full-spectrum region into a number of subintervals (variable-wise) and then evaluated the possible combinations of subintervals with PLS models; the optimal Si-PLS model was established from the combination of subinterval spectra that yielded the lowest loss. Han [44] developed an MC-UVE-LS-SVM model to determine the total phenolic (TPC) and p-coumaric acid (PA) contents in barley grain. This model first used Monte Carlo uninformative variable elimination (MC-UVE) to select informative wavelengths and obtain the best calibration specificity for the different components; the analysis model was then established by using the least-squares support vector machine (LS-SVM) with the optimized spectra. Jin [48] applied a stepwise-PLS approach to estimate the leaf chlorophyll contents of various species from NIR reflected hyperspectral information. This method defined both the maximum and minimum values of the spectral bands used to explain the variations of the dependent variables in PLS regression; different informative spectral bands from the hyperspectral reflectance were selected and evaluated for consistency at different spectral resolutions to identify PLS regression models. Modlitbová [49] used laser-induced breakdown spectroscopy (LIBS) as part of a bio-imaging technique to analyze the nutrient contents of plant samples. This method used LIBS to analyze both the element distributions and the contents of various nutrients in plants, and highlighted the value of assessing spatial element distribution in phytotoxicity testing.
There have been many studies on the determination of routine chemical constituents of tobacco by NIR spectroscopy, in which machine learning methods have been widely used. Zhang [11] proposed a WT-SVM method that integrated SVM with wavelet transformation (WT) to analyze the chemical constituents of tobacco. They first employed the WT method to preprocess the spectra used as inputs; based on the radial basis function (RBF), the model then used SVM regression to analyze the chemical compositions. Tan [4] proposed a boosting partial least squares (boosting-PLS) method to determine the nicotine content in tobacco. The boosting method was first used to optimize the training sets, and PLS was then employed as the regression algorithm to determine the nicotine content. However, Tan highlighted that these results were valid only for their own dataset. Jing [50] applied the multiblock partial least squares (MB-PLS) method to determine the moisture in corn as well as both nicotine and sugar in tobacco leaves. In this method, the spectra were first separated into sub-blocks along the wavenumber axis, PLS was then used to build the corresponding model for each sub-block, and finally a determination model was built from the sub-block models. Duan [2] established a quantitative correction model by using PLS regression to analyze four different categories of chemical compositions in tobacco: routine chemicals, primary aromatic constituents, inorganic nutrients, and heavy metals. They first used a 9-point Savitzky–Golay algorithm to smooth the original data, and then the first derivative was used to eliminate the spectral differences from the baseline; different models were established by PLS regression to analyze the different types of chemical compositions. Tan [23] proposed a multivariate calibration method based on WT and mutual information (MI) to analyze the total sugar in tobacco. In this method, the spectra of the training set were transformed into a set of wavelet representations by WT; the reconstructed training set that retained the higher MI values was then obtained by calculation, and a PLS model was constructed and optimized. However, this method only analyzed the total sugar among the tobacco components. Li [51] applied both PLS and the nonlinear least-squares support vector machine (LS-SVM) to develop calibration models to estimate the constituents of tobacco seed. They used four preprocessing methods to optimize the original spectra before establishing the calibration models and compared the prediction performance of the two different models. However, the approach was complicated because each constituent had to be modeled individually.
Li [18] proposed a variable adaptive boosting partial least squares (VABPLS) method to establish a quantitative analysis model of tobacco NIR spectra. Following ARS theory, this method integrated a variable adaptive strategy into the BPLS algorithm to analyze the spectra and re-weight all of the involved samples and variables.
At present, the above methods used for analyzing the chemical compositions of tobacco leaves are machine learning methods, which are limited by the dimensionality of the input and need to preprocess the high-dimensional original spectral data. In the preprocessing process, part of the important information in the original spectra may be lost, which reduces the prediction accuracy of these models. Moreover, existing methods cannot comprehensively analyze the correlation between a variety of chemical components and the spectra and can only predict each chemical component individually, which increases the difficulty of analyzing multiple chemical components.
In order to overcome the above shortcomings and achieve higher prediction accuracy, the proposed TCCANN adopts a feature extraction network with a residual structure, which reduces the loss of important information in the NIR spectrum when performing nonlinear transformations. At the same time, the LSTM network is used to extract long-range dependencies in the NIR spectra, compensating for the residual network's limitations in this respect. Therefore, the features extracted by TCCANN retain the important information in the original NIR spectrum, and TCCANN is able to achieve excellent prediction results by establishing a complex mapping between the NIR spectra and the prediction sequences.

3. The Proposed Method

3.1. The Framework of The Proposed TCCANN

This paper proposes TCCANN to perform simultaneous quantitative analyses of multiple chemical compositions of tobacco from NIR spectra. Figure 1 shows the overall architecture for the analysis of the chemical compositions of tobacco using the TCCANN model, which can directly analyze the high-dimensional original spectrum. The tobacco samples were collected from different regions of Guizhou Province, China. A series of different analytical methods was used to determine the reference chemical values of the tobacco, and the spectra of the tobacco leaves were acquired by NIR sensors. The prepared tobacco samples were then divided into training and testing sets. TCCANN first uses the ResNet network to directly extract features from the NIR spectra and then applies the LSTM network to the quantitative analyses of multiple chemical compositions. The values of the chemical compositions obtained by the proposed TCCANN are evaluated and analyzed with several error metrics. Finally, the network structure and parameters are adjusted to further optimize TCCANN, and the trained TCCANN is tested.

3.2. The Structure of TCCANN

As shown in Figure 2, the proposed TCCANN has two key components, ResNet and LSTM. The ResNet model mainly consists of five convolutional layers and two residual blocks, and the LSTM model contains three hidden layers. The spectra of tobacco leaves are the input of the ResNet model, and the features of the spectra are extracted by one-dimensional convolutional layers. The output of the ResNet network is used as the input of the LSTM network. The LSTM model has many memory cells, which selectively transmit information through the gate structure within each unit [35] and generate the prediction values of the various chemical compositions.

3.2.1. ResNet

The ResNet structure is illustrated in Figure 2. The blue squares represent convolutional layers, the gray squares represent residual blocks, and the orange squares represent identity shortcut connections. ResNet mainly consists of five convolutional layers and two residual blocks; a residual block has three convolutional layers and an identity skip connection. The first layer of the ResNet network is a convolutional layer with a stride of 2 and 32 convolutional kernels. The second layer is a residual block that includes three convolutional layers with a stride of 1 and 64 convolutional kernels. The third layer is a convolutional layer with a stride of 2 and 128 channels. The fourth layer is also a residual block, including three convolutional layers with a stride of 1 and 256 convolutional kernels. The last layer of ResNet is composed of three convolutional layers with a stride of 2 and 512 convolutional kernels. The ResNet network is a fully convolutional network, in which each max-pooling layer is replaced by a convolutional layer with a stride of 2. In the feature extraction process, a pooling layer is typically used to reduce the dimension of the data, but it may cause the loss of internal information [41,52]. The forward calculation of a convolutional layer is as follows.
$y_i^{(l)} = \left( \sum_{j \in c} x_j^{(l-1)} \otimes w_{j,i}^{(l)} \right) + b_i^{(l)}$  (1)
$x_i^{(l)} = \mathrm{ReLU}\left( \mathrm{BN}\left( y_i^{(l)} \right) \right)$  (2)
where $i$ indexes the $i$-th convolutional kernel of layer $l$ and $j$ indexes the $j$-th convolutional kernel of layer $l-1$. In Equation (1), $x_j^{(l-1)}$ is the feature map of the $j$-th convolutional kernel of layer $l-1$, and $c$ represents the set of input feature maps in layer $l-1$. $b_i^{(l)}$ is the bias of the $i$-th convolutional kernel of convolutional layer $l$, $y_i^{(l)}$ is the output of convolutional layer $l$, and $w_{j,i}^{(l)}$ is the convolution kernel. $\otimes$ denotes a convolutional product with inverted weights. In Equation (2), $x_i^{(l)}$ is the feature map of convolutional layer $l$, and ReLU is the activation function. Rectified linear units (ReLU) [53] are used as the activation function, with the following formula.
$f(x) = \max\left( 0, \mathrm{BN}\left( y_i^{(l)} \right) \right)$  (3)
where BN represents batch normalization (BN) [43]. Before the activation function of each layer, batch normalization is applied so that the outputs of the inactive nodes follow a normal distribution with a mean of 0 and a variance of 1. The normalized results are then scaled and shifted to restore the original input characteristics [54]. This process preserves the network capacity, accelerates network training, and improves the generalization ability of the network [55].
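A minimal sketch of this batch normalization step is given below; gamma and beta stand for the learned scaling and shifting parameters, and the function is illustrative rather than the exact implementation used in the paper.

import numpy as np

def batch_norm(y, gamma, beta, eps=1e-5):
    # Normalize the pre-activations to zero mean and unit variance over the
    # batch, then scale (gamma) and shift (beta) to restore the representation.
    mean = y.mean(axis=0)
    var = y.var(axis=0)
    y_hat = (y - mean) / np.sqrt(var + eps)
    return gamma * y_hat + beta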
ResNet is built by stacking multiple basic structural elements called residual blocks [24]. ResNet introduces an "identity shortcut connection" to the residual blocks, which skips multiple layers in the network and uses the output of a previous layer as a partial input of a subsequent layer [33,56]. This structure ensures identity mapping during training and enables the shallow data to be identity mapped to the deep network; thus, the relevant data features can be effectively extracted from the high-dimensional NIR spectra. The residual block structure of the proposed TCCANN is shown in Figure 2, where $x$ is the input of a convolutional layer, $F(x)$ is the output of a convolutional layer, and the activation function of the convolutional layers is set to ReLU. By passing the input $x$ directly to a later convolutional layer, the output of the residual module changes from the original $F(x)$ to $H(x) = F(x) + x$ [57]. During back propagation, the gradients received at $H(x)$ flow back equally into $x$ and $F(x)$ [33]. This simple addition can significantly improve the training effect without adding any extra parameters.
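For illustration, such a one-dimensional convolutional residual block could be expressed with tf.keras as in the sketch below; the layer arrangement and the 1 × 1 projection applied when the channel counts differ are our assumptions, not necessarily the exact implementation of the paper.

import tensorflow as tf

def residual_block(x, filters, kernel_size=3):
    # F(x): three 1-D convolutions with stride 1, each followed by BN and ReLU
    shortcut = x
    for _ in range(3):
        x = tf.keras.layers.Conv1D(filters, kernel_size, strides=1, padding='same')(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation('relu')(x)
    # Project the shortcut with a 1x1 convolution when the channel counts differ
    if int(shortcut.shape[-1]) != filters:
        shortcut = tf.keras.layers.Conv1D(filters, 1, padding='same')(shortcut)
    # H(x) = F(x) + x
    return tf.keras.layers.Add()([x, shortcut])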

3.2.2. LSTM

The long short-term memory (LSTM) model can capture sequential patterns by learning how to store or ignore certain information from the data input [35,58]. The key of LSTM is the cell state (memory cell), which is also shown in Figure 2. It runs straight down the entire chain, with the ability to add or remove information to/from the cell state regulated by gates [59]. Gates act as optional inlets for information; the gate mechanism includes a forget gate, an input gate, and an output gate. At time step $t$, the input is $x_t$, and the hidden state from the previous time step, $h_{t-1}$, is introduced into the LSTM cell. The forward pass of an LSTM memory cell proceeds as follows.
(1) The first step decides what information is going to be removed from the cell state. This decision is made by the following forget gate $f_t$.
$f_t = \sigma\left( W_f \cdot [h_{t-1}, x_t] + b_f \right)$  (4)
(2) The following step decides which new information is going to be stored in the cell state. First, the input gate layer $i_t$ decides which values are to be updated. Second, a tanh layer [35] creates a vector of new candidate values $g_t$.
$i_t = \sigma\left( W_i \cdot [h_{t-1}, x_t] + b_i \right)$  (5)
$g_t = \tanh\left( W_g \cdot [h_{t-1}, x_t] + b_g \right)$  (6)
(3) Then, the old cell state $c_{t-1}$ is updated into a new cell state $c_t$ as follows.
$c_t = f_t * c_{t-1} + i_t * g_t$  (7)
(4) Finally, the output gate $o_t$ decides which parts of the cell state are going to be calculated as output. The cell state first goes through a tanh layer (to push the values to between −1 and 1), and then it is multiplied by the output gate as follows.
$o_t = \sigma\left( W_o \cdot [h_{t-1}, x_t] + b_o \right)$  (8)
$h_t = o_t * \tanh(c_t)$  (9)
During the calculations shown in Equations (4)–(9) and Figure 2, $f_t$, $i_t$, and $o_t$ are the forget gate, input gate, and output gate, respectively, and $g_t$ is the candidate value; $*$ indicates element-wise multiplication, and $\sigma$ and tanh are non-linear functions. $\sigma$ is the sigmoid activation function, which compresses its inputs to the range $[0, 1]$, while tanh compresses its inputs to the range $[-1, 1]$. $W_{f,i,g,o}$ and $b_{f,i,g,o}$ are the weight matrices and bias vectors, respectively. $t$ denotes the $t$-th time step, and the input at time step $t$ is $x_t$. $h_{t-1}$ and $h_t$ indicate the hidden states at times $t-1$ and $t$, while $c_t$ is the cell state at time $t$. The term $h_{t-1}$ carries the critical tobacco features extracted at the previous time step and, together with $x_t$, is used as the input of the LSTM memory cell.
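To make the gate computations concrete, a minimal NumPy sketch of a single forward step of the memory cell, following Equations (4)–(9), is given below; the weight and bias containers W and b are illustrative names.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    # W and b are dictionaries holding the weights/biases of the four gates;
    # each gate sees the concatenation [h_{t-1}, x_t].
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W['f'] @ z + b['f'])   # forget gate, Eq. (4)
    i_t = sigmoid(W['i'] @ z + b['i'])   # input gate, Eq. (5)
    g_t = np.tanh(W['g'] @ z + b['g'])   # candidate values, Eq. (6)
    c_t = f_t * c_prev + i_t * g_t       # cell state update, Eq. (7)
    o_t = sigmoid(W['o'] @ z + b['o'])   # output gate, Eq. (8)
    h_t = o_t * np.tanh(c_t)             # hidden state, Eq. (9)
    return h_t, c_t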

3.2.3. The Specific Architecture of TCCANN

The proposed TCCANN consists of ResNet and LSTM. ResNet model has five convolutional layers and two residual blocks, and LSTM model contains three hidden layers. In network training, the number of network training epochs is set to 250,000, and the batch size of training data is set to 16. The initial learning rate is set to 0.001 , the learning rate attenuation coefficient is set to 0.99 , and the learning rate is updated every 6000 rounds.
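Under these settings, the stepwise exponential decay of the learning rate can be written as a short helper; this is only a sketch of the schedule described above.

def learning_rate(step, initial_lr=0.001, decay=0.99, decay_every=6000):
    # The learning rate is multiplied by 0.99 once every 6000 training rounds.
    return initial_lr * (decay ** (step // decay_every))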
The detailed parameters of ResNet and LSTM are shown in Table 1 and Table 2, where Conv is short for convolutional layer and FullConnection is short for fully connected layer. The original data are one-dimensional NIR data, and each spectrum has 1609 features. Therefore, the convolutional kernel in all convolutional layers is a one-dimensional vector. As shown in Table 1, the upper part of TCCANN consists of ResNet. The input layer of the ResNet network is conv1, in which the number of convolutional kernels is 32, the size of the convolutional kernel is 9, and the stride is 2. The input size is 1 × 1609 × 1, and the output size of conv1 is 1 × 805 × 32. The next layer of ResNet is residual block1, in which the number of convolutional kernels is 64, the size of the convolutional kernel is 3, and the stride is 1. The input size is 1 × 805 × 32, and the output size of residual block1 is 1 × 805 × 64. The next layer of ResNet is conv2, which has 128 convolutional channels, a convolutional kernel size of 9, and a stride of 2. The input size is 1 × 805 × 64, and the output size of conv2 is 1 × 403 × 128. The features extracted by conv2 are the input of residual block2, in which the number of inner convolutional channels is 256 and the stride is 1; the output size of residual block2 is 1 × 403 × 256. The output stage of the ResNet network is composed of three convolutional layers, conv3, conv4, and conv5. The number of convolutional channels in these three layers is 512, the size of the convolutional kernel is 3, and the stride is 2. The input size is 1 × 403 × 256, and the output size is 1 × 51 × 512. As shown in Table 2, the lower part of TCCANN is the LSTM. The LSTM model contains three hidden layers, each of which has 100 LSTM units. The output layer of the LSTM model is a fully connected layer, which has 100 units and 7 outputs. The advanced features extracted by ResNet are used as the input of the LSTM, with an input size of 1 × 51 × 512. The final output of TCCANN is the predicted values of the seven chemical compositions of the tobacco leaves.
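Putting the pieces together, an illustrative tf.keras assembly of TCCANN following Tables 1 and 2 could look like the sketch below; it reuses the residual_block helper sketched in Section 3.2.1, and the exact layer hyperparameters should be taken from the tables rather than from this sketch.

import tensorflow as tf

def build_tccann(input_length=1609, n_outputs=7):
    inputs = tf.keras.Input(shape=(input_length, 1))
    # ResNet feature extractor: conv1, residual block1, conv2, residual block2, conv3-conv5
    x = tf.keras.layers.Conv1D(32, 9, strides=2, padding='same', activation='relu')(inputs)
    x = residual_block(x, 64)
    x = tf.keras.layers.Conv1D(128, 9, strides=2, padding='same', activation='relu')(x)
    x = residual_block(x, 256)
    for _ in range(3):  # conv3, conv4, conv5, each with stride 2
        x = tf.keras.layers.Conv1D(512, 3, strides=2, padding='same', activation='relu')(x)
    # Three stacked LSTM layers with 100 units each, then a fully connected output
    x = tf.keras.layers.LSTM(100, return_sequences=True)(x)
    x = tf.keras.layers.LSTM(100, return_sequences=True)(x)
    x = tf.keras.layers.LSTM(100)(x)
    outputs = tf.keras.layers.Dense(n_outputs)(x)  # seven chemical compositions
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(0.001), loss='mse')
    return model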
During the training process, mean square error (MSE) is chosen as the overall loss function evaluated at the end of each forward iteration, as shown in Equation (10).
$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2$  (10)
The Adam optimizer [60] is selected to minimize the total loss, which updates the network weights and biases based on the gradient of the loss function. In each convolutional layer, ReLU activation function is employed, and each convolutional layer uses the “msra” method proposed by He for weight initialization [61].
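The "msra" (He) initialization mentioned above draws each weight from a zero-mean normal distribution whose standard deviation depends on the fan-in of the layer; a minimal sketch, with an illustrative function name, is shown below.

import numpy as np

def msra_init(shape, fan_in):
    # He ("msra") initialization: zero-mean normal with standard deviation
    # sqrt(2 / fan_in), suited to layers followed by ReLU activations [61].
    return np.random.normal(0.0, np.sqrt(2.0 / fan_in), size=shape)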
The parameter updating in the back-propagation training of TCCANN is summarized in Algorithm 1. First, the network structure and parameters of TCCANN are initialized, including the parameters of each network layer, the initial learning rate, and the number of iterations. Before the model training, the network weights are initialized. During network training, the network weights and learning rate are updated through the loss function and back propagation, and Adam [60] is used to optimize the loss function.
Algorithm 1 TCCANN Training
Input: training set X; model layer parameters (number of filters, size of filters, stride, activation, loss function); iterative number Step; batch_size b; learning_rate lr
Output: the trained model
1: Initialization of hyperparameters: initial_model_weight
2: while loss > 0.0001 or s < Step do
3:   model_weight ← initial_model_weight
4:   Extract NIR spectroscopy features by using the ResNet model
5:   Analyze chemical composition values by using the LSTM model
6:   Calculate the total loss value
7:   Minimize the total loss by using Adam [60]
8:   model_weight ← update_model_weight(s)
9:   learning_rate lr ← update_lr(s)
10:  s ← s + 1
11: end while
12: Calculate the average loss to determine the various hyperparameters
13: return the trained model

3.3. The Overall Algorithm of TCCANN

The overall flow of TCCANN is summarized as shown in Algorithm 2.
Algorithm 2 Estimation of Chemical Compositions
Input: sample set of tobacco data X
Output: analysis results
1: Initialization: various hyperparameters
2: Split X into X_train, X_test according to 1:1
3: for i = 1 to 5 do
4:   Split X_train into train_CV, valid_CV randomly according to 4:1
5:   model = send train_CV to TCCANN for training according to Algorithm 1
6:   loss = model.evaluate(valid_CV)
7:   ĉ = model.predict(valid_CV)
8:   Calculate the total loss
9: end for
10: Calculate the average loss to determine the various hyperparameters
11: Input testing data X_test
12: model = train TCCANN according to Algorithm 1
13: ĉ_i = model.predict(X_test)
In Algorithm 2, ĉ represents the predicted values of the network on the validation set, ĉ_i represents the predicted values of the network on the testing set, and model represents the trained TCCANN.
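For illustration, the data split and 5-fold cross-validation loop of Algorithm 2 could be sketched as follows; X and Y are hypothetical arrays standing in for the 1609-point spectra and the seven reference values, build_tccann refers to the sketch in Section 3.2.3, and the epoch count is truncated for brevity.

import numpy as np
from sklearn.model_selection import KFold, train_test_split

X = np.random.rand(200, 1609, 1)   # stand-in for the NIR spectra
Y = np.random.rand(200, 7)         # stand-in for the seven reference values

# Step 2: split the samples into training and testing sets at a 1:1 ratio
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.5, random_state=0)

# Steps 3-9: 5-fold cross-validation with train_CV : valid_CV = 4 : 1
fold_losses = []
for train_idx, valid_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X_train):
    model = build_tccann()
    model.fit(X_train[train_idx], Y_train[train_idx], batch_size=16, epochs=5, verbose=0)
    fold_losses.append(model.evaluate(X_train[valid_idx], Y_train[valid_idx], verbose=0))
print('average cross-validation loss:', np.mean(fold_losses))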

4. Comparative Experiments

4.1. Experiment Preparation

In this study, a total of 4000 standard samples of tobacco leaves were collected and measured from different regions of Guizhou Province by the Guizhou Tobacco Science Research Institute of China. For the determination of the standard values of the tobacco chemical compositions, all tobacco samples were first dried in an oven at 60 °C under normal pressure for half an hour and then ground to a certain granularity with a whirlwind grinding instrument. Next, the sample powders were sieved through a mesh. The sieved powders were then processed and analyzed by a San+ Automated Wet Chemical Analyzer (Skalar, Holland), a continuous-flow injection analytical instrument. The analyzer can accurately measure the values of the routine chemical compositions, including nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH, using a range of different analytical methods [13].
The obtained values are used as the standard values for the experimental analysis of chemical compositions. Statistical values of seven tobacco compositions from 4000 standard samples of tobacco leaves are shown in Table 3.
As shown in Table 3, all the reference values of the seven compositions are normally distributed around the mean values (2.80, 30.42, 25.94, 2.35, 1.21, 0.36, 5.36) with the listed standard deviations (STD). The ranges of the seven compositions are 0.36–6.01, 5.29–51.05, 0.91–40.32, 1.00–4.45, 0.37–3.03, 0.01–1.55, and 4.53–6.01, respectively. This means that all the samples are well representative of the distribution and cover a wide range of values. For example, nicotine has a strong effect on both the aroma and taste of tobacco products, and nicotine intake can have side effects on the human body [4]. The contents of reducing sugar and total sugar correlate with aftertaste, irritation, and aroma quality, and the amount of total nitrogen correlates with the smoke concentration and smoking strength [5,9]. The pH value of tobacco is a determining factor in acute toxicity and is also correlated with the total nitrogen, total alkaloid, and total volatile alkali bases of tobacco [6]. The potassium amount has a positive relationship with the flavor and the degree of wetness [8]. The chemical compositions of tobacco leaves jointly affect the quality of tobacco, and the various chemical compositions are coupled and closely related to each other.
NIR spectra were collected with a Thermo Antaris 2 with multiple sensors (Thermo Fisher Scientific Inc., Waltham, MA, USA). The NIR chemical detector is shown in Figure 3. The collected spectra have a resolution of 8 cm⁻¹ and 64 scans.
As shown in Figure 4, the NIR range is from 3800 cm⁻¹ to 10,000 cm⁻¹, with significant fluctuations from 3800 cm⁻¹ to 6500 cm⁻¹. The 3800 cm⁻¹ to 4870 cm⁻¹ region mainly contains the combination bands of C-H plus C-H, N-H, and N-H plus O-H. The 3900 cm⁻¹ to 4010 cm⁻¹, 4110 cm⁻¹ to 4400 cm⁻¹, and 4400 cm⁻¹ to 4570 cm⁻¹ regions contain the combination bands of C-H, C-H plus C-H, and N-H plus O-H, respectively. The 4570 cm⁻¹ to 4870 cm⁻¹ region contains the combination bands of N-H and the first overtone of C=O plus O-H. The 5050 cm⁻¹ to 5250 cm⁻¹ region contains the combination bands of O-H and the second overtone of C=O. The 5725 cm⁻¹ to 6110 cm⁻¹ region contains the first overtone regions of C-H and S-H, and the 6110 cm⁻¹ to 7270 cm⁻¹ region contains the first overtone regions of N-H and C-H plus C-H. These NIR spectra characterize the main tobacco compositions, such as nicotine, total sugar, reducing sugar, and total nitrogen. In addition, potassium has a sensitive band in the spectra and chlorine participates in photosynthesis [8]. The characteristics of potassium and chlorine are bound up with the absorption of C-H, O-H, and N-H, which supports the theoretical foundation for determining potassium and chlorine by NIR spectra. Thus, there is a certain correlation between the NIR spectral data and the tobacco chemical constituents.
The detailed division of the tobacco data is shown in Figure 5. In this study, there are a total of 4000 standard tobacco samples, which were randomly divided into training and testing sets at a 1:1 ratio (2000 samples each). In the model training, the root-mean-squared error of 5-fold cross-validation is used to evaluate the network model: according to a 4:1 ratio, the spectra of the routine chemical constituents of tobacco were divided into calibration and validation sets.
Both training and testing on the tobacco data were performed using an NVIDIA GeForce RTX 2080 GPU and an Intel Core(TM) i7-8700 CPU with 24 GB of running memory. The neural network was built with the deep learning framework TensorFlow 1.15.0 on the Windows 10 operating system, and the training and testing of the proposed TCCANN were carried out on the Python 3.6 platform.

4.2. Evaluation Metrics

In this paper, both the root mean square error (RMSE) and the mean absolute error (MAE) are used to evaluate the performance of the proposed model. The corresponding equations of RMSE and MAE are shown in Equations (11) and (12), respectively. As shown in Equation (13), the determination coefficient R² is also used to evaluate the performance of the proposed model.
$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}$  (11)
$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|$  (12)
$R^2 = 1 - \frac{\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}{\sum_{t=1}^{n} \left( y_t - \bar{y} \right)^2}$  (13)
where $y_t$ is the true chemical composition value, $\hat{y}_t$ is the predicted value, $\bar{y}$ is the mean of all the actual samples, and $n$ is the number of samples. RMSE and MAE indicate the measurement precision: values of RMSE and MAE close to 0 indicate a good fit. In contrast, $R^2$ measures how successfully the fit explains the variation of the data: when the value of $R^2$ is close to 1, the model fits well [11]. Concisely, a useful model should have a high $R^2$ value and low RMSE and MAE values.
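These three indicators can be computed directly from the predicted and reference values; the short sketch below (the function name is ours) follows Equations (11)–(13).

import numpy as np

def regression_metrics(y_true, y_pred):
    # RMSE, MAE, and R^2 as defined in Equations (11)-(13)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mae = np.mean(np.abs(y_true - y_pred))
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return rmse, mae, r2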

4.3. Parameters of TCCANN

In this experiment, the total training rounds of TCCANN are set to 250,000. The loss value of the training set is shown in Figure 6. The ordinate represents the loss value, and the abscissa represents the number of training rounds. The red curve represents the trend of the loss value as the network training progresses.
As shown in Figure 6, in the initial stage of network training, the loss value of the network decreases rapidly and then decreases slowly until it stabilizes. After 100,000 training rounds, the training loss value is 0.00307; after 150,000 rounds, it is 7.681 × 10⁻⁴; and after 250,000 rounds, it eventually decreases to 1.38 × 10⁻⁴. The loss of the proposed TCCANN fluctuates considerably from 0 to 200,000 training rounds, but it finally stabilizes at a small value, which confirms that TCCANN converges well.
During the training process, 5-fold cross-validation is used in the proposed model, and the training and validation samples are randomly split according to a 4:1 ratio. In order to accurately measure both the generalization ability and the prediction accuracy of the proposed model, RMSE, R², and MAE are used as evaluation indexes for both the validation and test sets.
On the validation set, the mean RMSE of the chemical constituents is 0.03864 and the mean MAE is 0.02190; on the test set, the mean RMSE is 0.04134 and the mean MAE is 0.02501. The RMSE values do not differ significantly between the validation and test sets, and neither do the MAE values. The correlation coefficient R² of the seven chemical compositions on both the validation and test sets is greater than 0.99 and close to 1. The loss values of the seven chemical compositions on the validation and test sets show a good linear relationship. Thus, the proposed network does not suffer from overfitting or underfitting and has good generalization ability.
In order to test the analytical efficiency of the proposed TCCANN, we recorded the training and testing times. With the help of CUDA and the GPU, training 1928 samples for 250,000 steps took an average of only 44.35 s, and testing 690 samples took an average of only 0.83 s, i.e., about 1.19 milliseconds per sample. The results show that, given suitable equipment and data, TCCANN is simpler to operate than the traditional chemical method, the analysis speed is greatly improved, and good analysis results can be obtained.

4.4. Comparison with Existing Methods

The proposed TCCANN is used to analyze the complex NIR spectra and determine the chemical compositions of tobacco leaves, including nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH value. In order to demonstrate the good performance of the proposed model, TCCANN is compared with existing methods, including PLS regression [2], the wavelet transformation support vector machine (WT-SVM) [11], LS-SVM [51], and variable adaptive boosting partial least squares (VAB-PLS) [18]. PLS, WT-SVM, LS-SVM, and VAB-PLS were implemented according to the original papers to carry out the comparative experiments. The four models use the same settings, platform, and evaluation indicators as the proposed model. The specific data division is shown in Figure 5.
Figure 7 shows the correlations between the predicted values and the measured values of the 2000 tobacco samples in the testing set for the five different models. Figure 7 has seven rows and five columns (35 scatter plots in total). Each scatter plot represents the correlation between the predicted and measured values of one chemical composition: the abscissa represents the measured value, and the ordinate represents the value predicted by the corresponding model. The first to fifth columns show the results obtained by PLS, WT-SVM, VAB-PLS, LS-SVM, and TCCANN, respectively. The first to seventh rows show the results for nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH, corresponding to red, light green, blue, sage, orange, purple, and dark green points, respectively.
As shown in Figure 7, the predicted and measured values obtained by PLS and WT-SVM differ considerably, and most of the points are not evenly distributed around the diagonal. Some differences between the predicted and measured values of LS-SVM and VAB-PLS exist, with only a portion of the points distributed around the diagonal. The predicted and measured values of TCCANN differ only slightly, and most of the points are evenly and compactly distributed along the diagonal y = x. The closer the points are to the diagonal, the better the fitting effect of the model. There is thus a significant linear relationship between the predicted and measured values of the seven chemical compositions for the proposed TCCANN.
Figure 8 shows the results of the three evaluation indicators obtained by the five analysis models on the test set. The three sub-figures, from left to right, show the values of RMSE, MAE, and R² obtained by the five different models. The abscissa of each line chart represents the seven different chemical compositions, and the ordinate represents the specific loss value. NIC, TS, RS, TN, PO, CL, and PH represent nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH value, respectively. The grey line represents the PLS model, the red line the WT-SVM model, the green line the VAB-PLS model, the blue line the LS-SVM model, and the orange line the TCCANN model.
According to the results, the RMSE and MAE loss values of the seven chemical compositions obtained by PLS, WT-SVM, VAB-PLS, and LS-SVM are greater than the corresponding values obtained by TCCANN. The loss values of the seven chemical compositions obtained by LS-SVM fluctuate considerably, whereas the RMSE and MAE loss values of the seven chemical compositions obtained by TCCANN are more stable than the results of the other four models. According to the sub-figure for R², the R² values of the seven chemical compositions obtained by TCCANN are greater than the results obtained by the other four models; the TCCANN values are more stable and close to 1.
Table 4 shows the results obtained by the five analysis models on the test set. NIC, TS, RS, TN, PO, CL, and PH represent nicotine, total sugar, reducing sugar, total nitrogen, potassium, chlorine, and pH value, respectively, and CC represents the chemical compositions. For PLS, the mean RMSE of the seven chemical compositions is 0.08961, with the minimum loss of 0.08201 for total nitrogen; the mean MAE is 0.07857, with the minimum loss of 0.06393 for chlorine; and the mean R² is 0.72428. For WT-SVM, the mean RMSE of the seven chemical compositions is 0.08706, with the minimum loss of 0.07069 for total nitrogen; the mean MAE is 0.06764, with the minimum loss of 0.05401 for total nitrogen; and the mean R² is 0.78283. For VAB-PLS, the mean RMSE of the seven chemical compositions is 0.07191, the mean MAE is 0.05107, and the mean R² is 0.94358. For LS-SVM, the mean RMSE of the seven chemical compositions is 0.06507, the mean MAE is 0.05031, and the mean R² is 0.97301. As shown in Table 4, the average RMSE of LS-SVM is less than the corresponding values obtained by PLS and WT-SVM, the MAE value of LS-SVM is less than those of PLS, WT-SVM, and VAB-PLS, and the R² value of LS-SVM is larger than those of PLS, WT-SVM, and VAB-PLS. Therefore, the overall performance of the LS-SVM model is better than that of WT-SVM, PLS, and VAB-PLS.
For TCCANN, the mean values of RMSE and MAE are 0.04134 and 0.02501, respectively. The RMSE and MAE loss values of the seven chemical compositions are less than the corresponding values obtained by the other four models. According to the above results, TCCANN has good overall performance and high accuracy. The correlation coefficients R² of the seven chemical compositions are greater than 0.99, which is better than the other four models. As shown in Figure 8 and Table 4, TCCANN performs significantly better than the other four machine learning methods on the tobacco dataset. In the comparative experiments, the values of the appraisal indexes indicate that the generalization ability and prediction accuracy of TCCANN are superior to those of the other four methods. Thus, TCCANN is a powerful solution to the problem of detecting the chemical compositions of tobacco leaves.

5. Conclusions

Near-infrared spectroscopy has become an important research topic in the determination of the routine chemical compositions of tobacco. This paper proposes TCCANN to perform simultaneous quantitative analysis of multiple chemical compositions of tobacco by using NIR hyperspectroscopy imagery. TCCANN adopts a fully convolutional network that replaces max-pooling with a convolutional layer with a stride of two, which effectively avoids the loss of spectral information in the feature extraction process. TCCANN uses ResNet to directly extract features from the NIR spectroscopy data and applies LSTM to the simultaneous quantitative analysis of multiple chemical compositions. Through the internal residual blocks, the ResNet network enables data in the deep network to achieve identity mapping between different network layers, which allows advanced data features to be fully extracted. LSTM can store data information in internal gated units and implement selective information transmission through these gated units; thus, LSTM can comprehensively analyze the correlation between the chemical compositions and the spectra and achieve the simultaneous quantitative analysis of multiple chemical compositions. This paper uses RMSE, R², and MAE as the evaluation indexes to evaluate the performance of the proposed network. The proposed model is compared with four other machine learning methods (PLS, WT-SVM, VAB-PLS, LS-SVM) to demonstrate its usefulness. TCCANN achieves better results than the other four methods on several statistical indicators. The results demonstrate the superiority of TCCANN and also show that the deep learning framework can be applied to NIR spectra to achieve rapid and accurate analysis of the routine chemical compositions of tobacco. However, the proposed TCCANN still cannot provide a completely accurate determination of the chemical compositions of tobacco. In future work, a more effective analysis model of chemical compositions will be explored.

Author Contributions

Conceptualization, Z.Z. and G.Q.; methodology, Z.Z., G.Q. and Y.L. (Yangbo Lei); software, Y.L. (Yangbo Lei) and D.J.; validation, D.J., Y.L. (Yang Liu) and D.W.; formal analysis, G.Q., N.M. and W.Z.; investigation, Z.Z.; resources, Z.Z. and D.W.; data curation, D.J., Y.L. (Yang Liu) and D.W.; writing—original draft preparation, Z.Z., G.Q. and D.J.; writing—review and editing, Z.Z., G.Q., Y.L. (Yangbo Lei), D.J. and N.M.; visualization, Y.L. (Yangbo Lei) and D.J.; supervision, Z.Z. and W.Z.; project administration, G.Q.; funding acquisition, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is jointly supported by National Natural Science Foundation of China under Grant No. 61803061 and 61906026; Innovation Research Group of Universities in Chongqing; Chongqing Natural Science Foundation under Grant cstc2020jcyj-msxmX0577, cstc2020jcyj-msxmX0634, and cstc2019jcyj-msxmX0110; “Chengdu-Chongqing Economic Circle” innovation funding of Chongqing Municipal Education Commission under Grant KJCXZD2020028; Basic Research and Frontier Exploration Project of Yuzhong District, Chongqing under Grant 20210164; Science and Technology Research Program of Chongqing Municipal Education Commission under Grant KJQN202000602; Special Key Project of Chongqing Technology Innovation and Application Development under Grant cstc2019jscx-zdztzx0068; Special Fund for Young and Middle-aged Medical Top Talents of Chongqing (ZQNYXGDRCGZS2019005); Chongqing medical scientific research project (Joint project of Chongqing Health Commission and Science and Technology Bureau for Young and Middle-aged Medical Top Talents, 2020GDRC019); the China Postdoctoral Science Foundation (2020M670111ZX); Natural Science Foundation of Chongqing (cstc2020jcyj-bshX0068); the Basic Research and Frontier Exploration Project of Yuzhong District of Chongqing (20210164).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tobacconomics. Economic Costs of Tobacco Use. 2019. Available online: http://tobacconomics.org/files/research/523/UIC_Economic-Costs-of-Tabacco-Use-Policy-Brief_v1.3.pdf/ (accessed on 11 November 2021).
  2. Duan, J.; Huang, Y.; Li, Z.; Zheng, B.; Li, Q.; Xiong, Y.; Wu, L.; Min, S. Determination of 27 chemical constituents in Chinese southwest tobacco by FT-NIR spectroscopy. Ind. Crop. Prod. 2012, 40, 21–26. [Google Scholar] [CrossRef]
  3. Ye, X.; Liu, G.; Liu, H.; Li, S. Study on model of aroma quality evaluation for flue-cured tobacco based on principal component analysis. J. Food Agric. Environ. 2011, 9, 501–504. [Google Scholar]
  4. Tan, C.; Wang, J.; Wu, T.; Qin, X.; Li, M. Determination of nicotine in tobacco samples by near-infrared spectroscopy and boosting partial least squares. Vib. Spectrosc. 2010, 54, 35–41. [Google Scholar] [CrossRef]
  5. Wang, D.; Tian, F.; Yang, S.X.; Zhu, Z. Intelligent estimate of chemical compositions based on NIR spectra analysis. In Proceedings of the 2017 IEEE International Conference on Information and Automation (ICIA), Macao, China, 18–20 July 2017; pp. 472–477. [Google Scholar]
  6. Henningfield, J.E.; Fant, R.V.; Radzius, A.; Frost, S. Nicotine concentration, smoke pH and whole tobacco aqueous pH of some cigar brands and types popular in the United States. Nicotine Tob. Res. 1999, 1, 163–168. [Google Scholar] [CrossRef] [Green Version]
  7. Lawler, T.S.; Stanfill, S.B.; Zhang, L.; Ashley, D.L.; Watson, C.H. Chemical characterization of domestic oral tobacco products: Total nicotine, pH, unprotonated nicotine and tobacco-specific N-nitrosamines. Food Chem. Toxicol. 2013, 57, 380–386. [Google Scholar] [CrossRef] [Green Version]
  8. Rossel, R.V.; Walvoort, D.; McBratney, A.; Janik, L.J.; Skjemstad, J. Visible, near infrared, mid infrared or combined diffuse reflectance spectroscopy for simultaneous assessment of various soil properties. Geoderma 2006, 131, 59–75. [Google Scholar] [CrossRef]
  9. Soares, F.L.; Marcelo, M.C.; Porte, L.M.; Pontes, O.F.; Kaiser, S. Inline simultaneous quantitation of tobacco chemical composition by infrared hyperspectral image associated with chemometrics. Microchem. J. 2019, 151, 104225. [Google Scholar] [CrossRef]
  10. Gunduz, I.; Kondylis, A.; Jaccard, G.; Renaud, J.M.; Hofer, R.; Ruffieux, L.; Gadani, F. Tobacco-specific N-nitrosamines NNN and NNK levels in cigarette brands between 2000 and 2014. Regul. Toxicol. Pharmacol. 2016, 76, 113–120. [Google Scholar] [CrossRef] [Green Version]
  11. Zhang, Y.; Cong, Q.; Xie, Y.; Zhao, B. Quantitative analysis of routine chemical constituents in tobacco by near-infrared spectroscopy and support vector machine. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2008, 71, 1408–1413. [Google Scholar] [CrossRef]
  12. Zhang, C.; Wu, W.; Zhou, L.; Cheng, H.; Ye, X.; He, Y. Developing deep learning based regression approaches for determination of chemical compositions in dry black goji berries (Lycium ruthenicum Murr.) using near-infrared hyperspectral imaging. Food Chem. 2020, 319, 126536. [Google Scholar] [CrossRef]
  13. Jiang, D.; Hu, G.; Qi, G.; Mazur, N. A Fully Convolutional Neural Network-based Regression Approach for Effective Chemical Composition Analysis Using Near-infrared Spectroscopy in Cloud. J. Artif. Intell. Technol. 2021, 1, 74–82. [Google Scholar] [CrossRef]
  14. Wang, D.; Xie, L.; Yang, S.X.; Tian, F. Support vector machine optimized by genetic algorithm for data analysis of near-infrared spectroscopy sensors. Sensors 2018, 18, 3222. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Bi, Y.; Li, S.; Zhang, L.; Li, Y.; He, W.; Tie, J.; Liao, F.; Hao, X.; Tian, Y.; Tang, L.; et al. Quality evaluation of flue-cured tobacco by near infrared spectroscopy and spectral similarity method. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2019, 215, 398–404. [Google Scholar] [CrossRef] [PubMed]
  16. Jiang, D.; Qi, G.; Hu, G.; Mazur, N.; Zhu, Z.; Wang, D. A residual neural network based method for the classification of tobacco cultivation regions using near-infrared spectroscopy sensors. Infrared Phys. Technol. 2020, 111, 103494. [Google Scholar] [CrossRef]
  17. Kang, S.; Zhao, K.; Yu, D.; Zheng, X.; Huang, C. Advances in Biosensing and Environmental Monitoring Based on Electrospun Nanofibers. Adv. Fiber Mater. 2022, 9. [Google Scholar] [CrossRef]
  18. Li, P.; Du, G.; Ma, Y.; Zhou, J.; Jiang, L. A novel multivariate calibration method based on variable adaptive boosting partial least squares algorithm. Chemom. Intell. Lab. Syst. 2018, 176, 157–161. [Google Scholar] [CrossRef]
  19. Yang, X.; Yan, D. Direct white-light-emitting and near-infrared phosphorescence of zeolitic imidazolate framework-8. Chem. Commun. 2017, 53, 1801–1804. [Google Scholar] [CrossRef]
  20. Wu, S.; Zhou, B.; Yan, D. Low-Dimensional Organic Metal Halide Hybrids with Excitation-Dependent Optical Waveguides from Visible to Near-Infrared Emission. ACS Appl. Mater. Interfaces 2021, 13, 26451–26460. [Google Scholar] [CrossRef]
  21. Qin, Y.; Gong, H. NIR models for predicting total sugar in tobacco for samples with different physical states. Infrared Phys. Technol. 2016, 77, 239–243. [Google Scholar] [CrossRef]
  22. Qi, G.; Zhu, Z.; Erqinhu, K.; Chen, Y.; Chai, Y.; Sun, J. Fault-diagnosis for reciprocating compressors using big data and machine learning. Simul. Model. Pract. Theory 2018, 80, 104–127. [Google Scholar] [CrossRef]
  23. Tan, C.; Chen, H.; Wu, T.; Xu, Z.; Li, W.; Qin, X. Determination of total sugar in tobacco by near-infrared spectroscopy and wavelet transformation-based calibration. Anal. Lett. 2013, 46, 171–183. [Google Scholar] [CrossRef]
  24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  25. Yuan, Q.; Zhang, Q.; Li, J.; Shen, H.; Zhang, L. Hyperspectral image denoising employing a spatial–spectral deep residual convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2018, 57, 1205–1218. [Google Scholar] [CrossRef] [Green Version]
  26. Zeng, F.; Qi, G.; Zhu, Z.; Sun, J.; Hu, G.; Haner, M. Convex Neural Networks Based Reinforcement Learning for Load Frequency Control under Denial of Service Attacks. Algorithms 2022, 15, 34. [Google Scholar] [CrossRef]
  27. Qi, G.; Wang, H.; Haner, M.; Weng, C.; Chen, S.; Zhu, Z. Convolutional neural network based detection and judgement of environmental obstacle in vehicle operation. CAAI Trans. Intell. Technol. 2019, 4, 80–91. [Google Scholar] [CrossRef]
  28. Lee, H.; Kwon, H. Going Deeper With Contextual CNN for Hyperspectral Image Classification. IEEE Trans. Image Process. 2017, 26, 4843–4855. [Google Scholar] [CrossRef] [Green Version]
  29. Li, Y.; Meng, J.; Luo, Y.; Huang, X.; Qi, G.; Zhu, Z. Deep Convolutional Neural Network for Real and Fake Face Discrimination. In Proceedings of 2020 Chinese Intelligent Systems Conference; Springer: Singapore, 2021; pp. 590–598. [Google Scholar]
  30. Li, Y.; Xu, P.; Zhu, Z.; Huang, X.; Qi, G. Real-Time Driver Distraction Detection Using Lightweight Convolution Neural Network with Cheap Multi-scale Features Fusion Block. In Proceedings of 2021 Chinese Intelligent Systems Conference; Springer: Singapore, 2022; pp. 232–240. [Google Scholar]
  31. Vico, D.D.; Barran, A.T.; Omari, A.; Dorronsoro, J.R. Deep neural networks for wind and solar energy prediction. Neural Process. Lett. 2017, 46, 829–844. [Google Scholar] [CrossRef]
  32. Zhu, Z.; Yin, H.; Chai, Y.; Li, Y.; Qi, G. A novel multi-modality image fusion method based on image decomposition and sparse representation. Inf. Sci. 2018, 432, 516–529. [Google Scholar] [CrossRef]
  33. Xie, S.; Girshick, R.; Dollar, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500. [Google Scholar]
  34. Zuo, Z.; Shuai, B.; Wang, G.; Liu, X.; Wang, X.; Wang, B.; Chen, Y. Learning Contextual Dependence With Convolutional Hierarchical Recurrent Neural Networks. IEEE Trans. Image Process. 2016, 25, 2983–2996. [Google Scholar] [CrossRef] [Green Version]
  35. Rao, G.; Huang, W.; Feng, Z.; Cong, Q. LSTM with sentence representations for document-level sentiment classification. Neurocomputing 2018, 308, 49–57. [Google Scholar] [CrossRef]
  36. Song, S.; Lan, C.; Xing, J.; Zeng, W.; Liu, J. Spatio-Temporal Attention-Based LSTM Networks for 3D Action Recognition and Detection. IEEE Trans. Image Process. 2018, 27, 3459–3471. [Google Scholar] [CrossRef]
  37. Zhang, L.; Ding, X.; Hou, R. Classification Modeling Method for Near-Infrared Spectroscopy of Tobacco Based on Multimodal Convolution Neural Networks. J. Anal. Methods Chem. 2020, 2020, 9652470. [Google Scholar] [CrossRef] [PubMed]
  38. Wang, D.; Tian, F.; Yang, S.X.; Zhu, Z.; Jiang, D.; Cai, B. Improved Deep CNN with Parameter Initialization for Data Analysis of Near-Infrared Spectroscopy Sensors. Sensors 2020, 20, 874. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Bai, S.; Tang, H.; An, S. Coordinate CNNs and LSTMs to categorize scene images with multi-views and multi-levels of abstraction. Expert Syst. Appl. 2019, 120, 298–309. [Google Scholar] [CrossRef]
  40. Boureau, Y.L.; Ponce, J.; LeCun, Y. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 111–118. [Google Scholar]
  41. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation, Inc. (NIPS): La Jolla, CA, USA, 2017; pp. 3856–3866. [Google Scholar]
  42. Zhang, M.; Cai, W.; Shao, X. Wavelet unfolded partial least squares for near-infrared spectral quantitative analysis of blood and tobacco powder samples. Analyst 2011, 136, 4217–4221. [Google Scholar] [CrossRef]
  43. Wu, S.; Li, G.; Deng, L.; Liu, L.; Wu, D.; Xie, Y.; Shi, L. L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks. IEEE Trans. Neural Netw. 2019, 30, 2043–2051. [Google Scholar] [CrossRef] [Green Version]
  44. Han, Z.; Cai, S.; Zhang, X.; Qian, Q.; Huang, Y.; Dai, F.; Zhang, G. Development of predictive models for total phenolics and free p-coumaric acid contents in barley grain by near-infrared spectroscopy. Food Chem. 2017, 227, 342–348. [Google Scholar] [CrossRef]
  45. Tingting, S.; Xiaobo, Z.; Jiyong, S.; Zhihua, L.; Xiaowei, H.; Yiwei, X.; Wu, C. Determination Geographical Origin and Flavonoids Content of Goji Berry Using Near-Infrared Spectroscopy and Chemometrics. Food Anal. Methods 2016, 9, 68–79. [Google Scholar] [CrossRef]
  46. Chen, K.; Li, C.; Tang, R. Estimation of the nitrogen concentration of rubber tree using fractional calculus augmented NIR spectra. Ind. Crops Prod. 2017, 108, 831–839. [Google Scholar] [CrossRef]
  47. Olarewaju, O.O.; Magwaza, L.S.; Nieuwoudt, H.; Poblete-Echeverria, C.; Fawole, O.A.; Tesfay, S.Z.; Opara, U.L. Model development for non-destructive determination of rind biochemical properties of ‘Marsh’ grapefruit using visible to near-infrared spectroscopy and chemometrics. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2019, 209, 62–69. [Google Scholar] [CrossRef]
  48. Jin, J.; Wang, Q. Selection of informative spectral bands for PLS models to estimate foliar chlorophyll content using hyperspectral reflectance. IEEE Trans. Geosci. Remote Sens. 2018, 57, 3064–3072. [Google Scholar] [CrossRef]
  49. Modlitbova, P.; Pořizka, P.; Kaiser, J. Laser-induced breakdown spectroscopy as a promising tool in the elemental bioimaging of plant tissues. Trends Anal. Chem. 2020, 122, 115729. [Google Scholar] [CrossRef]
  50. Jing, M.; Cai, W.; Shao, X. Quantitative determination of the components in corn and tobacco samples by using near-infrared spectroscopy and multiblock partial least squares. Anal. Lett. 2010, 43, 1910–1921. [Google Scholar] [CrossRef]
  51. Li, Z.; Li, C.; Gao, Y.; Ma, W.; Zheng, Y.; Niu, Y.; Guan, Y.; Hu, J. Identification of oil, sugar and crude fiber during tobacco (Nicotiana tabacum L.) seed development based on near infrared spectroscopy. Biomass Bioenergy 2018, 111, 39–45. [Google Scholar] [CrossRef]
  52. Xu, B.; Ye, H.; Zheng, Y.; Wang, H.; Luwang, T.; Jiang, Y.G. Dense dilated network for few shot action recognition. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, Yokohama, Japan, 11–14 June 2018; pp. 379–387. [Google Scholar]
  53. Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323. [Google Scholar]
  54. Santurkar, S.; Tsipras, D.; Ilyas, A.; Madry, A. How does batch normalization help optimization. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation, Inc. (NIPS): La Jolla, CA, USA, 2018; pp. 2488–2498. [Google Scholar]
  55. Zheng, Q.; Fang, J.; Hu, Z.; Zhang, H. Aero-Engine On-Board Model Based on Batch Normalize Deep Neural Network. IEEE Access 2019, 7, 54855–54862. [Google Scholar] [CrossRef]
  56. Yang, W.; Zhang, X.; Tian, Y.; Wang, W.; Xue, J.; Liao, Q. LCSCNet: Linear Compressing-Based Skip-Connecting Network for Image Super-Resolution. IEEE Trans. Image Process. 2020, 29, 1450–1464. [Google Scholar] [CrossRef] [Green Version]
  57. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 630–645. [Google Scholar]
  58. Zhang, X.; Yin, F.; Zhang, Y.; Liu, C.; Bengio, Y. Drawing and Recognizing Chinese Characters with Recurrent Neural Network. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 849–862. [Google Scholar] [CrossRef] [Green Version]
  59. Liu, J.; Shahroudy, A.; Xu, D.; Kot, A.C.; Wang, G. Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 3007–3021. [Google Scholar] [CrossRef] [Green Version]
  60. Punnappurath, A.; Brown, M.S. Learning Raw Image Reconstruction-Aware Deep Image Compressors. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 1013–1019. [Google Scholar] [CrossRef]
  61. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034. [Google Scholar]
Figure 1. The Proposed TCCANN Block Diagram.
Figure 2. The Key Components of TCCANN.
Figure 3. NIR Chemical Detector.
Figure 4. Raw NIR Spectra of Samples. Different wavelengths of absorbed light are shown in different colors.
Figure 5. The Specific Division of Tobacco Data.
Figure 6. TCCANN Training.
Figure 7. Correlations between Predicted and Measured Values of Seven Chemical Compositions from Five Different Models.
Figure 8. Results of Different Evaluation Indexes Obtained by Different Models.
Table 1. Parameter Specifications of ResNet Model.

Model  | Detailed Structure | Kernel Size | Kernel Number | Stride | Output Size
ResNet | Conv1              | 9           | 32            | 2      | 1 × 805 × 32
       | Residual block1    | 3           | 64            | 1      | 1 × 805 × 64
       | Conv2              | 3           | 128           | 2      | 1 × 403 × 128
       | Residual block2    | 3           | 256           | 1      | 1 × 403 × 256
       | Conv3              | 3           | 512           | 2      | 1 × 202 × 512
       | Conv4              | 3           | 512           | 2      | 1 × 101 × 512
       | Conv5              | 3           | 512           | 2      | 1 × 51 × 512
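For readers who wish to reproduce the feature extractor, the layer stack in Table 1 can be approximated by the PyTorch sketch below. Table 1 only lists kernel size, number, and stride, so the padding scheme, the 1 × 1 projection inside the residual blocks, and the 1609-point input spectrum length are our assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block with a 1x1 projection on the skip path (assumed),
    needed here because the block changes the channel count (e.g., 32 -> 64)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv1d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_ch),
        )
        self.skip = nn.Conv1d(in_ch, out_ch, kernel_size=1)  # projection shortcut
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.skip(x))

def conv_bn_relu(in_ch, out_ch, k, s):
    # Convolution -> batch normalization -> ReLU, padded by k // 2 (assumed).
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=k, stride=s, padding=k // 2),
        nn.BatchNorm1d(out_ch),
        nn.ReLU(inplace=True),
    )

# Layer stack following Table 1 (kernel size / number / stride per row).
resnet_extractor = nn.Sequential(
    conv_bn_relu(1, 32, k=9, s=2),     # Conv1  -> ~1 x 805 x 32
    ResidualBlock(32, 64),             # Residual block1
    conv_bn_relu(64, 128, k=3, s=2),   # Conv2  -> ~1 x 403 x 128
    ResidualBlock(128, 256),           # Residual block2
    conv_bn_relu(256, 512, k=3, s=2),  # Conv3  -> ~1 x 202 x 512
    conv_bn_relu(512, 512, k=3, s=2),  # Conv4  -> ~1 x 101 x 512
    conv_bn_relu(512, 512, k=3, s=2),  # Conv5  -> ~1 x 51 x 512
)

# Hypothetical spectrum of 1609 points; the actual length depends on the instrument.
x = torch.randn(8, 1, 1609)            # (batch, channels, spectral points)
features = resnet_extractor(x)         # roughly (8, 512, 51)
```

With a 1609-point input and the assumed padding, the intermediate lengths reproduce the output sizes listed in Table 1.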
Table 2. Parameter Specifications of LSTM Model.

Model | Detailed Structure | Number of Units | Output Size
LSTM  | Unit1              | 100             | 1 × 51 × 100
      | Unit2              | 100             | 1 × 51 × 100
      | Unit3              | 100             | 1 × 100
      | Full connection    | 100             | 7
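A matching sketch of the LSTM stage in Table 2, again only an illustration under stated assumptions: three stacked LSTM layers with 100 hidden units consume the 51 × 512 feature sequence from the ResNet extractor, the hidden state of the last step is kept, and a fully connected layer maps it to the seven component values.

```python
import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    """Three stacked LSTM layers (100 units each) followed by a
    fully connected layer that outputs the seven chemical components."""
    def __init__(self, feat_dim=512, hidden=100, n_outputs=7):
        super().__init__()
        self.lstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden,
                            num_layers=3, batch_first=True)
        self.fc = nn.Linear(hidden, n_outputs)

    def forward(self, feats):
        # feats: (batch, 512, 51) from the ResNet extractor; treat the
        # 51 spatial positions as time steps of 512-dimensional features.
        seq = feats.permute(0, 2, 1)   # (batch, 51, 512)
        out, _ = self.lstm(seq)        # (batch, 51, 100)
        last = out[:, -1, :]           # keep the last step: (batch, 100)
        return self.fc(last)           # (batch, 7)

head = LSTMRegressor()
dummy_feats = torch.randn(8, 512, 51)
pred = head(dummy_feats)               # (8, 7) predicted component values
```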
Table 3. Statistical Values of Raw Spectra Data. MAX: maximum; MIN: minimum; STD: standard deviations.

Component      | MIN (%) | MAX (%) | Mean (%) | STD
Nicotine       | 0.36    | 6.01    | 2.80     | 0.82
Total sugar    | 5.29    | 51.05   | 30.42    | 6.60
Reducing sugar | 0.91    | 40.32   | 25.94    | 4.93
Total nitrogen | 1.00    | 4.45    | 2.35     | 0.52
Potassium      | 0.37    | 3.03    | 1.21     | 0.46
Chlorine       | 0.01    | 1.55    | 0.36     | 0.17
pH             | 4.53    | 6.01    | 5.36     | 0.11
Table 4. Analysis Results of Seven Chemical Components Obtained by Different Models. NIC: nicotine; TS: total sugar; RS: reducing sugar; TN: total nitrogen; PO: potassium; CL: chlorine; PH: pH value.

Models  | Metric | NIC     | TS      | RS      | TN      | PO      | CL      | PH      | Mean Value
PLS     | RMSE   | 0.09469 | 0.09341 | 0.09725 | 0.08201 | 0.09283 | 0.08334 | 0.08777 | 0.08961
        | MAE    | 0.08647 | 0.08524 | 0.08516 | 0.07971 | 0.07803 | 0.06393 | 0.07146 | 0.07857
        | R²     | 0.88712 | 0.81838 | 0.84037 | 0.79387 | 0.63877 | 0.63394 | 0.45755 | 0.72428
WT-SVM  | RMSE   | 0.08321 | 0.09881 | 0.09397 | 0.07069 | 0.08469 | 0.08677 | 0.09313 | 0.08706
        | MAE    | 0.07497 | 0.07125 | 0.07747 | 0.05401 | 0.06531 | 0.05894 | 0.07156 | 0.06764
        | R²     | 0.89656 | 0.92565 | 0.90699 | 0.89736 | 0.69183 | 0.65231 | 0.50911 | 0.78283
VAB-PLS | RMSE   | 0.07066 | 0.09741 | 0.08843 | 0.07036 | 0.06716 | 0.05754 | 0.05178 | 0.07191
        | MAE    | 0.04811 | 0.06340 | 0.07046 | 0.04749 | 0.05604 | 0.03723 | 0.03478 | 0.05107
        | R²     | 0.96445 | 0.95091 | 0.94303 | 0.90320 | 0.97046 | 0.96707 | 0.90598 | 0.94358
LS-SVM  | RMSE   | 0.06441 | 0.06014 | 0.10807 | 0.05671 | 0.06674 | 0.05659 | 0.04579 | 0.06507
        | MAE    | 0.04993 | 0.05519 | 0.09391 | 0.03943 | 0.03504 | 0.03963 | 0.03754 | 0.05031
        | R²     | 0.98154 | 0.98382 | 0.93357 | 0.96145 | 0.98971 | 0.97813 | 0.98281 | 0.97301
TCCANN  | RMSE   | 0.04126 | 0.05059 | 0.04081 | 0.04361 | 0.04737 | 0.03755 | 0.02819 | 0.04134
        | MAE    | 0.03654 | 0.03292 | 0.02434 | 0.02397 | 0.02996 | 0.01549 | 0.01186 | 0.02501
        | R²     | 0.99984 | 0.99882 | 0.99924 | 0.99956 | 0.99942 | 0.99986 | 0.99968 | 0.99948
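The RMSE, MAE, and R² figures in Table 4 follow their standard definitions. A minimal NumPy sketch of how such per-component scores can be computed from predicted and measured values is given below; the variable names and the synthetic data are purely illustrative.

```python
import numpy as np

def regression_scores(y_true, y_pred):
    """Per-column RMSE, MAE, and R^2 for (samples x components) arrays."""
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2, axis=0))
    mae = np.mean(np.abs(err), axis=0)
    ss_res = np.sum(err ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

# Illustrative usage: 7 columns in the order NIC, TS, RS, TN, PO, CL, PH.
y_true = np.random.rand(100, 7)
y_pred = y_true + 0.05 * np.random.randn(100, 7)
rmse, mae, r2 = regression_scores(y_true, y_pred)
print("mean RMSE:", rmse.mean(), "mean MAE:", mae.mean(), "mean R2:", r2.mean())
```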
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
