Article

Quantitative Comparison of Tree Ensemble Learning Methods for Perfume Identification Using a Portable Electronic Nose

Logistic Engineering College, Shanghai Maritime University, Shanghai 201306, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(19), 9716; https://doi.org/10.3390/app12199716
Submission received: 2 September 2022 / Revised: 19 September 2022 / Accepted: 25 September 2022 / Published: 27 September 2022

Abstract

Perfume identification (PI) based on an electronic nose (EN) can be used for exposing counterfeit perfumes more time-efficiently and cost-effectively than using gas chromatography and mass spectrometry instruments. During the past five years, decision-tree-based ensemble learning methods, also called tree ensemble learning methods, have demonstrated excellent performance when solving multi-class classification problems. However, the performance of tree ensemble learning methods for the EN-based PI problem remains uncertain. In this paper, four well-known tree ensemble learning classification methods, random forest (RF), stagewise additive modeling using a multi-class exponential loss function (SAMME), gradient-boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), were implemented for PI using our self-designed EN. For fair comparison, all the tested classification methods used as input the same feature data extracted using principal component analysis. Moreover, two benchmark methods, neural network and support vector machine, were also tested with the same experimental setup. The quantitative results of experiments undertaken demonstrated that the mean PI accuracy achieved by XGBoost was up to 97.5%, and that XGBoost outperformed other tested methods in terms of accuracy mean and variance based on our self-designed EN.

1. Introduction

The perfume imitation problem has become increasingly serious [1]. Counterfeit perfumes are typically made by blending ethyl alcohol, water, and cheap synthetic chemicals, which can cause dizziness, headaches, and skin allergies for consumers. Owing to their low manufacturing cost, counterfeit perfumes can undercut certified perfumes at very low prices. Exposing counterfeit perfumes is therefore important for the health of the perfume industry. However, it is hard for ordinary consumers to determine which perfumes are certified, since counterfeit and certified perfumes are generally enclosed in substantially identical external packaging. Existing sensing instruments that can be used for PI include gas chromatography (GC) [2], mass spectrometry (MS) [3], and the EN. With respect to the identification of perfume types, the EN has the merits of low cost, high time-efficiency, and portability [4].
To achieve EN-based PI, different perfume samples are dripped into the volatilization unit of an EN. Typically, the gas sensors in the EN respond differently to the individual chemical constituents of a perfume, so the response data from multiple chemical sensors contain specific ‘fingerprint’ information about the perfume sample. By treating samples of perfumes on the market and genuine perfumes as the test and training sets, respectively, the EN-based PI problem becomes a multi-class classification problem, since the number of candidate perfume types is often larger than three. Because the multiple decision boundaries interact with each other, multi-class classification problems are more challenging than binary classification [5].
Many EN-based PI results have been presented in the past twenty years. Nakamoto et al. [6] performed PI with an array of quartz-resonator sensors using neural network (NN) pattern recognition, which achieved relatively high separation accuracies among ethanol-diluted perfume samples. Branca et al. [7] proposed combining principal component analysis (PCA) and a radial basis function NN for perfume discrimination; PCA was used to extract the relevant information of the perfume samples into two principal components, which were then used as the input to a radial basis function NN. Jatmiko et al. [8] achieved fragrance mixture recognition using a back-propagation NN and fuzzy-neuro learning vector quantization with four quartz resonator-sensitive membrane sensors. Mei et al. [9] implemented the support vector machine (SVM) method for the identification of four perfume samples with an EN system containing three common gas sensors. Lias et al. [10] used a k-NN kfold method with an EN to identify pure and mixed agarwood oils, which can also be considered a specific type of perfume. However, in each of these studies, only an individual classifier was constructed to classify the features extracted from the sampling data. In the last decade, the field of machine learning (ML) has progressed significantly, with many new classification methods proposed [11,12]. As a flourishing branch of ML, ensemble learning methods combine the results of multiple base learners. Tree ensemble learning methods, which employ the decision tree learner [13] as the base learner, can usually achieve better performance than traditional methods [14]. The motivation for this paper was to implement tree ensemble learning methods for solving the EN-based PI problem and to compare their performance.
In this paper, we present EN-based PI experimental results obtained using four tree ensemble learning methods: XGBoost [14], RF [15], SAMME [16], and GBDT [17]. To obtain the experimental results, an EN system designed in the authors’ laboratory was used to collect sensing data for four perfumes. The collected raw measurements were preprocessed via wavelet denoising [18] and baseline subtraction. Then, based on the preprocessed data, six empirically selected equations were used for feature extraction. Afterwards, to further eliminate non-informative features, PCA [19] was used for feature dimension reduction. Finally, the extracted features were classified using each of the four tree ensemble learning methods. To conduct a comparative evaluation, two benchmark methods, NN [20] and SVM [21], were also used to classify the same features. To achieve a fair comparison, all the classification methods were tested on the same training and test sets determined using a ten-fold cross-validation method. Before obtaining the PI accuracies, the hyper-parameters of all the tested classification methods were tuned using the simulated annealing method [22]. The main contribution of this paper is the set of quantitative comparison results, which can provide important guidance when employing EN systems in the perfume industry. The combination of six empirically selected equations for feature extraction represents a further contribution of this paper.
The experimental setup and methods are presented in Section 2. For clarity and coherent explanation, the principles underpinning PCA and the four tree ensemble learning methods are briefly introduced in Section 2.3.2 and Section 2.3.3, respectively. In Section 3, the experimental results obtained using the tree ensemble learning and benchmark methods are compared and discussed. Finally, the conclusions and suggestions for future work are presented in Section 4.

2. Materials and Methods

2.1. The EN System

Figure 1 shows the first version of our self-designed EN: SMU-ENOSEv1, which employs eight metal-oxide-semiconductor (MOS) chemical sensors: MiCS-5524, MiCS-5914, MiCS-6814, TGS-2600, TGS-2602, TGS-2611, and TGS-2620. The chemical sensors are mounted inside a gas chamber to keep the volume of perfume samples stable during multiple sampling cycles. An air pump is used to accelerate the volatilization of the perfume sample, which is injected into a test-tube. Meanwhile, the gaseous perfume is carried through the gas chamber. When the chemical sensors inside the chamber are in contact with the gaseous perfume, an ARM processor is used to collect the sensing voltages via an analog-to-digital converter (ADC) module, and these are transmitted to the notebook computer. Then, processing of the sensing data and perfume identification are conducted on the notebook computer.
SMU-ENOSEv1 employs a novel and simple cleaning mechanism. Before sampling a perfume, the gas chamber and collapsible bulbs must be thoroughly washed to keep the sensor surroundings clean, which is important for ensuring sampling data reliability. By simply replacing the possibly contaminated test-tube with a clean one, the closed chamber and collapsible tubing can be washed by pumping clean air through the gas links, as shown in Figure 1. Afterwards, a perfume sample can be dripped into the clean test-tube using a pipettor, volatilized, and driven towards the gas chamber to activate the sampling process.

2.2. Experimental Setup

The procedure for the comparison experiments is shown in Figure 2. When the gaseous perfumes pass through the gas chamber, the sensing voltages of the sensor array are sampled to provide the raw measurements. As perfume samples, we used four types of perfumes with different fragrance notes. In each sampling cycle, the volume of perfume used was fixed at 3 μL, and the time span of each sampling cycle was 50 s. After each sampling cycle, the gas links and chamber were washed for 15 min. To obtain sufficient experimental instances, the number of sampling cycles for each perfume was set to 30; therefore, we obtained a set of 120 sampling instances as raw measurements.
To reduce the negative impact of measurement noise and background measurements, a preprocessing sub-procedure was conducted after obtaining the raw measurements. The preprocessing sub-procedure comprised denoising and baseline subtraction, which are detailed in Section 2.3.1. Afterwards, six empirically selected equations and PCA, which are detailed in Section 2.3.2, were used for feature extraction. As shown in Figure 2, a ten-fold cross-validation method was used to divide the set of extracted features into ten equal parts. Each part was, in turn, employed as the test set, with the remaining nine parts used as the training set. Thus, the ten-fold cross-validation method produced ten different pairs of training and test sets. Finally, based on each pair of training and test sets, the four tree ensemble learning methods and the two benchmark methods were separately evaluated and compared.
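To make this comparison procedure concrete, the following is a minimal sketch of the evaluation loop, assuming the extracted features are available as a 120 × d′ array X with integer labels y in {0, 1, 2, 3}. The scikit-learn and XGBoost classes and the illustrative hyperparameter values are our own choices for this sketch, not the tuned settings reported in Section 3.2.

from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Candidate classifiers; SAMME is the multi-class AdaBoost variant compared in this paper.
models = {
    "RF": RandomForestClassifier(n_estimators=100),
    "SAMME": AdaBoostClassifier(n_estimators=100, algorithm="SAMME"),
    "GBDT": GradientBoostingClassifier(n_estimators=100),
    "XGBoost": XGBClassifier(n_estimators=100, learning_rate=0.1),
    "NN": MLPClassifier(hidden_layer_sizes=(500,), max_iter=2000),
    "SVM": SVC(kernel="poly", degree=2, C=1e4),
}

def compare_models(X, y, n_splits=10, seed=0):
    # Return the mean and variance of the ten-fold cross-validation accuracy per model.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.var())
    return results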

2.3. Identification Methods

2.3.1. Data Preprocessing

As mentioned, denoising and baseline subtraction operations were conducted to preprocess the raw measurements. For EN-based PI, the measurement noise is mainly high-frequency noise; thus, the wavelet denoising method [18] was adopted in our experiments. The main steps of wavelet denoising can be expressed as follows.
(1)
Compute the wavelet decomposition of the raw measurements at a predefined level.
(2)
Identify a threshold for the measurement noise from the decomposed measurements and apply soft thresholding to the detail coefficients.
(3)
Reconstruct the measurements based on the original approximation and modified detail coefficients.
Then, a baseline subtraction operation is applied to the denoised measurements. The baseline is defined as the sensing voltage sampled before exposing the sensors in the EN to perfume samples. By subtracting the baselines from the denoised measurements, the influence of background gas residuals on the experiments can be reduced.
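As a concrete illustration of this preprocessing sub-procedure, the following is a minimal Python sketch using the PyWavelets package, assuming raw is a one-dimensional array of the sensing voltages of one sensor in one sampling cycle and baseline is the voltage sampled before the sensor is exposed to the perfume. The wavelet family, decomposition level, and threshold rule are illustrative choices, not necessarily the exact settings used in our experiments.

import numpy as np
import pywt

def wavelet_denoise(raw, wavelet="db4", level=4):
    # Step 1: decompose the raw measurements at the predefined level.
    coeffs = pywt.wavedec(raw, wavelet, level=level)
    # Step 2: estimate the noise level from the finest-scale detail coefficients
    # and derive a universal threshold.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(raw)))
    # Soft-threshold the detail coefficients; keep the approximation unchanged.
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    # Step 3: reconstruct the denoised measurements.
    return pywt.waverec(coeffs, wavelet)[: len(raw)]

def preprocess(raw, baseline):
    # Denoise, then subtract the baseline voltage.
    return wavelet_denoise(raw) - baseline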

2.3.2. Feature Extraction

First, as shown in Table 1, six equations were empirically selected from our previous work [23], and used to extract features from the preprocessed measurements. The preprocessed measurements obtained by an individual sensor in a single sampling cycle were substituted into the equations, which resulted in six feature values. Thus, the dimension of extracted features was 48, since there were eight gas sensors in our EN.
The six features can be categorized into two types. Features 1–3 are statistics in the time space of the original preprocessed data, while features 4 and 5 are calculated based on the first-order derivatives. Feature 6 incorporates both the first- and second-order derivatives of the preprocessed data. The different features in Table 1 characterize the perfume samples from different aspects which could contribute to the PI process.
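A sketch of how the six features in Table 1 could be computed for the preprocessed response of a single sensor is given below; V is assumed to be a one-dimensional array sampled at a fixed time step dt, and the finite-difference approximation of the derivatives is an implementation choice made for this illustration.

import numpy as np

def extract_six_features(V, dt=0.1):
    d1 = np.gradient(V, dt)                        # first-order derivative dV/dt
    d2 = np.gradient(d1, dt)                       # second-order derivative d2V/dt2
    curvature = np.abs(d2) / (1.0 + d1 ** 2) ** 1.5
    return np.array([
        V.mean(),             # 1: mean of response values
        V.max(),              # 2: maximal response value
        V[np.argmax(d1)],     # 3: response value at the first-order derivative maximum
        d1.max(),             # 4: first-order derivative maximum
        d1.mean(),            # 5: first-order derivative average
        curvature.mean(),     # 6: mean curvature of the response curve
    ])

# With eight sensors, the per-cycle feature vector is the concatenation of the
# eight six-dimensional vectors, giving 48 features in total.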
Then, to achieve further feature extraction and dimension reduction, the features extracted using the six equations were used as the input to PCA. The pseudo-code of PCA is sketched in Algorithm 1 [19]. PCA projects the high-dimensional feature space onto a relatively low-dimensional feature space. The input of PCA is an n × d feature matrix E. In PCA, a d × d′ transformation matrix W is constructed from the d′ (d′ < d) top-ranking eigenvectors of the covariance matrix EᵀE, ranked in descending order of eigenvalue [19]. Then, W is used to transform the n × d matrix E into a new n × d′ matrix X. The n row vectors of X, with the lower dimension d′, are output as the dimension reduction result.
Algorithm 1: PCA [19].
1 Calculate the covariance matrix EᵀE of the original n × d feature matrix E.
2 Calculate the eigenvectors and eigenvalues of EᵀE using eigendecomposition.
3 Sort the eigenvectors of EᵀE in descending order of eigenvalue.
4 Construct a matrix W using the d′ top-ranking eigenvectors.
5 Calculate the new n × d′ feature matrix X using the equation X = EW.
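A compact NumPy sketch of Algorithm 1 is given below, assuming the rows of E are the n samples and that the columns of E have been mean-centered beforehand.

import numpy as np

def pca_reduce(E, d_prime):
    # d x d covariance matrix of the (mean-centered) feature matrix E.
    cov = E.T @ E / (E.shape[0] - 1)
    # Eigendecomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort the eigenvectors in descending order of eigenvalue.
    order = np.argsort(eigvals)[::-1]
    # d x d' transformation matrix W from the top-ranking eigenvectors.
    W = eigvecs[:, order[:d_prime]]
    # New n x d' feature matrix X = EW.
    return E @ W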

2.3.3. Feature Classification

As mentioned in Section 1, the EN-based PI problem can be considered a multi-class classification problem. To evaluate the performance of the four ensemble learning methods, that is, RF, SAMME, GBDT, and XGBoost, the total set of extracted features was divided into a training subset and a test subset. A statistical model was fitted to the training set and then used to predict the corresponding perfume type of the samples in the test set. The principles of the four ensemble learning methods are outlined as follows:
  • RF
As a typical bagging method, RF constructs multiple decision trees, each trained on samples drawn with replacement from the training set, with the draw size controlled by the parameter sd [15]. The number of decision trees constructed by RF is M (M > 1). The base learner in RF is the classification and regression tree (CART) classifier [13], which uses the Gini index as the tree-node splitting criterion. Each constructed CART classifier outputs a classification result, and the final classification result is determined by majority voting.
Note that the prototypical CART can be a regression or classification tree, which are called the CART regressor and the CART classifier, respectively [13]. The base learners of RF and SAMME are CART classifiers, whereas GBDT and XGBoost both employ CART regressors as the base learner.
  • SAMME
SAMME is an improved form of AdaBoost [24] that sequentially fits copies of a CART classifier on the same dataset and steers subsequent CARTs to focus more on incorrectly classified samples by increasing their weights [16]. The pseudo-code of SAMME is shown in Algorithm 2. SAMME shares the same structure as AdaBoost; the critical improvement is the term log(K − 1) in line 5 of Algorithm 2, which directly influences the weight update. At the first iteration, the weights of all n samples are set to be equal. Then, as shown in line 6 of Algorithm 2, the weights of previously mis-classified samples are increased, whereas the other weights are decreased.
Algorithm 2: SAMME [16].
  • GBDT
The pseudo-code of GBDT [17] is shown in Algorithm 3. For K-class (K > 2) classification problems, GBDT sequentially constructs K × M CART regressors, where M is the number of iterations. In each iteration, an individual CART regressor is fitted for each class to the negative gradient of the multi-class log loss, in expectation of greedily minimizing that loss [17]. As shown in line 3 of Algorithm 3, the continuous prediction values Fk(xi) of the constructed CARTs are mapped, using the Sigmoid function, to the probabilities pk(xi) that xi belongs to the k-th class. During the construction of the CART regressors, the mean squared error with the improvement score by Friedman [17] was used as the splitting criterion to induce the trees and leaf nodes. Finally, the probabilities pkM(xi) mapped from the final CARTs can be used to classify the test samples.
Algorithm 3: GBDT [17].
  • XGBoost
XGBoost shares a similar sequential additive model training scheme with GBDT [14]. XGBoost incorporates several additional improvements [14] which include:
  • An additional regularization term in the loss function used for tree building. The objective function to be minimized in the m-th model training iteration of XGBoost is as follows:
    \mathcal{L}^{(m)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(m-1)} + f_m(x_i)\right) + \Omega(f_m),   (1)
    where l is the loss function, \hat{y}_i^{(m-1)} is the prediction for the i-th sample after the first m − 1 iterations, and \Omega(f_m) is the regularization term, which is proportional to the model complexity.
  • Usage of second-order gradient statistics. In order to minimize the objective function, XGBoost additionally uses its second-order Taylor expansion approximation, whereas GBDT only calculates its negative first-order gradient. The objective function in XGBoost can be approximated as follows:
    \mathcal{L}^{(m)} \approx \sum_{i=1}^{n} \left[ g_i f_m(x_i) + \frac{1}{2} h_i f_m^{2}(x_i) \right] + \Omega(f_m),   (2)
    where g_i is the first-order gradient of the loss function and h_i is the corresponding second-order gradient. When the tree structure q is fixed, the optimal leaf scores can be calculated by minimizing the approximated objective, which reduces to
    \tilde{\mathcal{L}}^{(m)}(q) = -\frac{1}{2} \sum_{j=1}^{T} \frac{\left( \sum_{i \in I_j} g_i \right)^{2}}{\sum_{i \in I_j} h_i + \lambda} + \gamma T,   (3)
    where T denotes the number of leaf nodes of q, I_j is the set of samples in the j-th leaf node, and λ and γ are two fixed regularization constants.
As shown in Algorithm 4, the exact greedy algorithm for CART regressor construction in XGBoost uses both the first- and second-order statistics. A CART regressor is constructed by iteratively adding child nodes to the tree, which is also called splitting the current leaf nodes. During the tree construction process, the loss is gradually reduced. After the split of a single node, the loss reduction can be represented as
\Delta_{\mathrm{split}} = \frac{1}{2} \left[ \frac{\left( \sum_{i \in I_L} g_i \right)^{2}}{\sum_{i \in I_L} h_i + \lambda} + \frac{\left( \sum_{i \in I_R} g_i \right)^{2}}{\sum_{i \in I_R} h_i + \lambda} - \frac{\left( \sum_{i \in I_L \cup I_R} g_i \right)^{2}}{\sum_{i \in I_L \cup I_R} h_i + \lambda} \right] - \gamma,   (4)
where IL and IR denote the set of samples in the left and right nodes after the split, respectively.
Algorithm 4: Exact greedy algorithm for CART construction in XGBoost [14].
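To make the split-gain criterion of Equation (4) concrete, the following is a simplified, single-feature sketch of an exact greedy split search; g and h are the per-sample first- and second-order gradient statistics and x holds the values of one feature. It illustrates the criterion only and is not a reproduction of Algorithm 4 from [14].

import numpy as np

def leaf_score(G, H, lam):
    # Contribution of a node with gradient sum G and Hessian sum H.
    return G ** 2 / (H + lam)

def best_split(x, g, h, lam=1.0, gamma=0.0):
    # Scan the sorted feature values and keep the split with the largest gain (Equation (4)).
    order = np.argsort(x)
    x, g, h = x[order], g[order], h[order]
    G_total, H_total = g.sum(), h.sum()
    GL = HL = 0.0
    best_gain, best_threshold = -np.inf, None
    for i in range(len(x) - 1):
        GL += g[i]
        HL += h[i]
        GR, HR = G_total - GL, H_total - HL
        gain = 0.5 * (leaf_score(GL, HL, lam) + leaf_score(GR, HR, lam)
                      - leaf_score(G_total, H_total, lam)) - gamma
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (x[i] + x[i + 1])
    return best_threshold, best_gain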

3. Results and Discussion

3.1. Data Preprocessing Results

Figure 3a,b show the raw and preprocessed measurements in a typical sampling cycle, respectively. In our EN, the sensing voltages of the eight sensors were sampled through eight different channels, and are plotted as curves of different colors in Figure 3. From Figure 3, we can see that all the curves rise quickly but descend slowly. This phenomenon coincides with the fast-reaction but slow-recovery characteristics of MOS sensors. All sensors were powered with a rated voltage of 5 V; thus, all raw voltage measurements in Figure 3a are smaller than 5 V. Moreover, the curves in Figure 3a seldom coincide with each other. In other words, the responses of the eight sensors to the same perfume sample clearly differ from one another, and jointly form the “fingerprint” information for the subsequent PI process.
The spiky curves in Figure 3a depict the high-frequency noise in the raw measurements. Thus, as mentioned in Section 2.3.1, the wavelet method was used to denoise the sensing data. To reduce the negative impact of background gas concentrations, the baseline subtraction operation was conducted to remove the sensing voltages sampled before the sensors came into contact with the perfume samples. Figure 3b shows the curves of the preprocessed data: the curves are smooth, which reveals that the wavelet denoising method removed the high-frequency noise. Additionally, it can readily be seen that all curves in Figure 3b start from the origin, because the measurements caused by background residues were removed by the baseline subtraction operation.

3.2. Parameter Selection

All four tree-ensemble learning methods (RF, SAMME, GBDT, and XGBoost) have multiple tunable hyperparameters that can severely influence the PI performance. It is important to select optimal values for these hyperparameters based on the available data. Moreover, as two benchmark methods that have been used for EN-based PI, NN and SVM were compared with the tree ensemble learning methods. The tunable parameters of NN and SVM were also selected based on the same data.
The determination of hyperparameter values was modeled as a function minimization problem, with the additive inverse of the identification accuracy as the objective function. To select the optimal value combination for the hyperparameters, the global minimum of the constructed objective function was searched for using the simulated annealing algorithm [22]. Table 2 lists the empirically determined ranges of the main tunable hyperparameters of the tested methods. Apart from the parameters listed in Table 2, the other parameters were found to have a minor influence on the identification accuracies and were therefore fixed at their default values. For example, the kernel parameter of SVM always converged to ‘polynomial’ when its optimal value was searched for within the set of all valid values; thus, the kernel parameter of SVM was not listed in Table 2 and was fixed as ‘polynomial’ in our experiments.
As mentioned in Section 2.2, the feature set output by PCA was divided into ten equal parts according to the ten-fold cross-validation method. Each part was, in turn, used as the test set, while the others were used as the training set. For each pair of training and test sets, the optimal value combination of the hyperparameters was searched for. Then, the optimal parameter value combination was used by the classification method to predict the perfume types in the test set. Finally, the fraction of correctly classified perfume samples was calculated as the identification accuracy; for example, if n′ of the n perfume instances are correctly identified, the identification accuracy is n′/n.
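As an illustration of this tuning procedure, the following sketch searches the XGBoost hyperparameter ranges of Table 2 by minimizing the additive inverse of the cross-validated accuracy. SciPy’s dual_annealing is used here as a stand-in for the simulated annealing method of [22], and the rounding of the continuous search variables onto the discrete grids is an implementation choice made for this sketch.

from scipy.optimize import dual_annealing
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def tune_xgboost(X_train, y_train, cv=5):
    def objective(params):
        m, eta, ud, ss, mcw = params
        model = XGBClassifier(
            n_estimators=int(round(m / 50.0) * 50),   # M in {50, 100, ..., 250}
            learning_rate=round(eta, 1),              # eta in {0.1, 0.2, ..., 1}
            max_depth=int(round(ud)),                 # ud in {1, 2, ..., 10}
            subsample=round(ss, 1),                   # ss in {0.5, 0.6, ..., 1}
            min_child_weight=int(round(mcw)),         # mcw in {3, 4, ..., 7}
        )
        acc = cross_val_score(model, X_train, y_train, cv=cv).mean()
        return -acc                                   # additive inverse of accuracy

    bounds = [(50, 250), (0.1, 1.0), (1, 10), (0.5, 1.0), (3, 7)]
    result = dual_annealing(objective, bounds, maxiter=50, seed=0)
    return result.x, -result.fun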

3.3. Identification Accuracy Comparison

The feature dimension reduction results obtained by PCA are shown in Figure 4. Each point in Figure 4 represents an individual sampling instance. A total of 120 points were plotted in each subfigure of Figure 4, since the perfumes were sampled 120 times. The point color stands for the corresponding perfume type, which means points in the same color were generated with the same perfume.
In both subfigures of Figure 4, points in different colors are mixed with each other. It is hard to find a decision boundary to discriminate the sampling instances corresponding to different perfumes. The mixing of points in different colors reveals that the perfume samples cannot be identified using PCA alone. Moreover, when the output feature dimension of PCA is set as two or three, the output features contain little distinctive information about the perfume type, which poses difficulties for further classification. It is necessary to evaluate the influence of output feature dimension of PCA (d′) on the further classification performance.
Figure 5 shows the distribution of the mean identification accuracies μ obtained in our experiments. As mentioned, the value of d′ is the input feature dimension for the six tested multi-class classification methods. The two bars, with and without a shaded background, centered at the same horizontal-axis label were generated with the same value of d′. For a given value of d′, the height difference between the shaded and unshaded bars represents the difference between the accuracies obtained on the training and test sets.
From the mean identification accuracies results shown in Figure 5, at least three conclusions can be drawn as follows:
  • The feature dimension d′ has an obvious impact on the mean identification accuracies μ for all six classification methods. When d′ < 9, all six tested methods achieved gradually increasing identification accuracies as the value of d′ increased. In particular, as the value of d′ increased in the range 3 < d′ < 15, XGBoost achieved gradually increasing mean identification accuracies. This suggests that a not-too-small feature dimension is necessary to provide sufficient information for classification. However, when the value of d′ exceeded a certain threshold, which differed among the methods, increasing the feature dimension could barely improve, and sometimes even deteriorated, the mean identification accuracies. When the feature dimension was unnecessarily high, much redundant and unhelpful information was incorporated, which increased the complexity of the learned model. Thus, it is important to determine an optimal value of d′ for each method to meet the demands of identification accuracy and computational burden. For example, according to the results in Figure 5, the optimal value of d′ for XGBoost was 15, since the mean identification accuracies obtained by XGBoost were fairly stable once the value of d′ exceeded 15.
  • With respect to the same feature dimension, the mean accuracies obtained on the test sets were notably lower than those obtained on the training sets for RF, SAMME, and GBDT. These differences indicate that RF, SAMME, and GBDT suffered from over-fitting when used for EN-based PI. In contrast, NN, SVM, and XGBoost obtained the same mean accuracies on the training and test sets when the feature dimension was fixed at the same value. The fact that XGBoost avoided the overfitting problem is mainly attributable to the regularization term Ω(fm) in the loss function shown in Equation (1), which penalized the model complexity during model learning. A highly complex statistical model usually contains terms tailored to the training set, which negatively influence its prediction performance on other datasets, including the test set.
  • Last, but not least, XGBoost achieved the highest mean accuracy in our experiments. When the value of d′ exceeded 13, the mean accuracies obtained by XGBoost were generally higher than those obtained by the other tested methods. For clarity, Table 3 lists the highest mean accuracies on the test sets and the corresponding feature dimensions for the tested methods. The generation process of the mean accuracies on the test sets is closer to real EN-based PI applications; thus, the mean accuracies on the training sets are not listed or compared. Although the value of d′ was relatively high (21) when XGBoost achieved its highest μ value of 97.5%, the mean accuracy obtained by XGBoost when d′ equaled 15 was 96.67%, which was close to its highest μ value and higher than the highest μ values of the other tested methods. In our experiments, when d′ equaled 15, the input data for the classification methods formed a 120 × 15 matrix, which can readily be processed by a typical commercial notebook computer.
The identification accuracy variances δ obtained in our experiments are shown in Figure 6. As mentioned, RF, SAMME, and GBDT suffered from overfitting, which explains why these three methods obtained generally higher δ values on the test sets than on the training sets. When the feature dimension d′ exceeded seven, the δ values of NN and SVM almost converged at 0.34% and 0.48%, respectively. When the feature dimension d′ exceeded nine, the δ values of XGBoost became smaller and more stable. Moreover, at the highest mean identification accuracies, the δ values of NN, SVM, and XGBoost were 0.34%, 0.48%, and 0.15%, respectively. The experimental results shown in Figure 5 and Figure 6 demonstrate that PI based on our self-designed EN can feasibly be realized by using PCA to reduce the feature dimension to about 15 and then using XGBoost for feature classification.

3.4. Advantages and Limitations

As mentioned in Section 1, all methods tested in this paper are based on a portable EN device and, thus, have the inherent advantages of low cost, high time-efficiency, and portability. However, their limitation is that they cannot discriminate the different constituents within a chemical compound, which can be achieved using GC and MS given sufficient investment, time, and training. Nevertheless, various real applications can be modeled as the problem of identifying the type of different separated volatile organic compounds (VOCs), which can be solved with our EN using the compared methods.
The advantages and limitations of the individual methods can be summarized as follows. The advantage of PCA is that it achieves good performance in feature dimension reduction; its limitation is that it cannot realize feature classification on its own. In contrast, the tested classification methods achieve good classification performance, but they cannot be used alone for feature extraction or dimension reduction. Moreover, according to our experimental results, XGBoost, NN, and SVM have the advantage of overcoming the overfitting problem, while RF, SAMME, and GBDT are influenced by the overfitting problem.

4. Conclusions and Future Work

Combined with PCA-based feature extraction, four well-known tree ensemble learning methods (RF, SAMME, GBDT, and XGBoost) were successfully implemented for PI using a portable, self-designed EN. By treating samples of perfumes on the market as the test sets, the model trained with genuine perfume samples was used to predict whether a tested perfume was genuine or fake. RF, SAMME, GBDT, and XGBoost obtained different PI accuracies in our experiments owing to their different ways of constructing the base decision tree learners and of combining the base learners’ predictions. From the experimental results, it was found that XGBoost is capable of avoiding the overfitting problem and was superior to RF, SAMME, GBDT, NN, and SVM in terms of accuracy mean and variance on our EN system. A managerial implication of this paper is that quantitatively comparing PI methods in advance is important for improving time efficiency and saving financial costs in EN-based PI.
Some future directions for our work include the following:
(1)
Comparing other feature extraction methods using the same framework as described here. In the present research, the feature classification subprocess was emphasized and thoroughly studied, while only PCA was tested for feature extraction.
(2)
Comparing the time-efficiency of the EN-based PI methods. Because the quantity of raw measurements used was not too large, all the tested methods ran fluently and accomplished the PI tasks in similarly short time spans.
(3)
Simultaneously identifying more types of perfumes with our EN. The identification of more perfume types can be achieved by identifying them in batches; only four types of perfumes were identified in this investigation.
(4)
Designing a new EN with more compact structure. For ease of use in field stations, the EN should be compact.
(5)
Identifying other types of VOCs. The gas sensors in our EN were also sensitive to other types of VOCs. It is anticipated that our EN can also be used for identifying other VOCs.

Author Contributions

Conceptualization, methodology, software, investigation, funding acquisition, writing—original draft preparation, writing—review and editing, M.C.; data curation, software, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant no. 61801287).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The experimental result data have been deposited in the GitHub repository (http://github.com/aoxfrank/EnoseData.git).

Acknowledgments

The authors would like to thank the anonymous reviewers for their helpful comments that have significantly improved this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Le Roux, A.; Bobrie, F.; Thébault, M. A typology of brand counterfeiting and imitation based on a semiotic approach. J. Bus. Res. 2016, 69, 349–356.
  2. Tissandié, L.; Brevard, H.; Belhassen, E.; Alberola, M.; Meierhenrich, U.; Filippi, J.-J. Integrated comprehensive two-dimensional gas-chromatographic and spectroscopic characterization of vetiveryl acetates: Molecular identifications, quantification of constituents, regulatory and olfactory considerations. J. Chromatogr. A 2018, 1573, 125–150.
  3. López-Nogueroles, M.; Chisvert, A.; Salvador, A. Determination of atranol and chloroatranol in perfumes using simultaneous derivatization and dispersive liquid–liquid microextraction followed by gas chromatography–mass spectrometry. Anal. Chim. Acta 2014, 826, 28–34.
  4. Huang, Y.; Doh, I.-J.; Bae, E. Design and Validation of a Portable Machine Learning-Based Electronic Nose. Sensors 2021, 21, 3923.
  5. Moral, P.D.; Nowaczyk, S.; Pashami, S. Why Is Multiclass Classification Hard? IEEE Access 2022, 10, 80448–80462.
  6. Nakamoto, T.; Fukuda, A.; Moriizumi, T. Perfume and flavour identification by odour-sensing system using quartz-resonator sensor array and neural-network pattern recognition. Sens. Actuators B Chem. 1993, 10, 85–90.
  7. Branca, A.; Simonian, P.; Ferrante, M.; Novas, E.; Negri, R.M. Electronic nose based discrimination of a perfumery compound in a fragrance. Sens. Actuators B Chem. 2003, 92, 222–227.
  8. Jatmiko, W.; Fukuda, T.; Arai, F.; Kusumoputro, B. Artificial odor discrimination system using multiple quartz resonator sensors and various neural networks for recognizing fragrance mixtures. IEEE Sens. J. 2006, 6, 223–233.
  9. Mei, X.; Wang, B.; Zhu, Z.; Zhao, P.; Hu, X.; Lu, G. Design of electronic nose system for perfume recognition based on support vector machine. J. Jilin Univ. (Inf. Sci. Ed.) 2014, 32, 355–360.
  10. Lias, S.; Ali, N.A.M.; Jamil, M.; Jalil, A.M.; Othman, M.F. Discrimination of Pure and Mixture Agarwood Oils via Electronic Nose Coupled with k-NN kfold Classifier. Procedia Chem. 2016, 20, 63–68.
  11. Kwon, H.; Kim, Y.; Park, K.-W.; Yoon, H.; Choi, D. Advanced ensemble adversarial example on unknown deep neural network classifiers. IEICE Trans. Inf. Syst. 2018, E101D, 2485–2500.
  12. Kwon, H. Detecting Backdoor Attacks via Class Difference in Deep Neural Networks. IEEE Access 2020, 8, 191049–191056.
  13. Breiman, L.; Friedman, J. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984.
  14. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  15. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  16. Zhu, J.; Zou, H.; Rosset, S.; Hastie, T. Multi-class AdaBoost. Stat. Its Interface 2009, 2, 349–360.
  17. Friedman, J. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
  18. Donoho, D.L. De-noising by soft-thresholding. IEEE Trans. Inf. Theory 1995, 41, 613–627.
  19. Gewers, F.L.; Ferreira, G.R.; Arruda, H.F.D.; Silva, F.N.; Comin, C.H.; Amancio, D.R.; Costa, L.D.F. Principal Component Analysis: A Natural Approach to Data Exploration. ACM Comput. Surv. 2022, 54, 1–34.
  20. Hinton, G.E. Connectionist learning procedures. Artif. Intell. 1989, 40, 185–234.
  21. Wu, T.F.; Lin, C.J.; Weng, R.C. Probability Estimates for Multi-class Classification by Pairwise Coupling. J. Mach. Learn. Res. 2004, 5, 975–1005.
  22. Hwang, C.R. Simulated annealing: Theory and applications. Acta Appl. Math. 1988, 12, 108–111.
  23. Ling, X.; Cao, M. Perfume identification using a chemical sensor array via LightGBM and prepositive feature reduction. In Proceedings of the 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 15–17 April 2022; pp. 1991–1995.
  24. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
Figure 1. Structure of the EN system designed for PI experiments in the authors’ laboratory.
Figure 2. The procedure for the comparison experiments.
Figure 3. The raw and preprocessed measurements in a typical sampling cycle. (a) The raw sensing voltage measurements; (b) The preprocessed sensing voltage measurements.
Figure 4. Distributions of the principal components output by PCA. Points in the same color were generated with samples of the same perfume type. (a) PC1 and PC2 when the output feature dimension of PCA is set as 2 (d′ = 2); (b) PC1, PC2, and PC3 when the output feature dimension of PCA is set as 3 (d′ = 3).
Figure 5. The mean identification accuracies μ obtained by the six tested classification methods when the output feature dimension of PCA is set as different numbers. Bars with and without shading pattern indicate the mean accuracies obtained with the test and training sets, respectively.
Figure 6. The variances of identification accuracies δ obtained by the six tested classification methods when the output vector dimension of PCA is set as different numbers. Bars with and without shading pattern indicate the accuracy variances obtained with the test and training sets, respectively.
Table 1. The six equations used to extract features from the preprocessed measurements.

ID | Feature Name | Calculation Equation
1 | Mean of response values | \frac{1}{N}\sum_{t=t_0}^{N} V(t)
2 | Maximal response value | \max V(t)
3 | Response value at the first-order derivative maximum | V(t_{agm}), where t_{agm} = \arg\max \frac{dV(t)}{dt}
4 | First-order derivative maximum | \max\left[\frac{dV(t)}{dt}\right]
5 | First-order derivative average | \frac{1}{N}\sum_{t=t_0}^{N} \frac{dV(t)}{dt}
6 | Mean curvature of the response curves | \frac{1}{N}\sum_{t=t_0}^{N} \frac{|d^{2}V(t)/dt^{2}|}{\left[1+(dV(t)/dt)^{2}\right]^{3/2}}
Table 2. Ranges of the main tunable hyperparameters of the tested methods.

Method | Notation | Range | Meaning
RF | M | {50, 100, …, 250} | Total number of tree learners
RF | sd | {0.1, 0.2, …, 1} | Proportion of samples drawn to train each tree learner
RF | ud | {1, 2, …, 10} | Upper limit of the tree learner depth
RF | mln | {2, 3, …, 10} | Upper limit of the number of samples in a single leaf node
RF | mss | {2, 3, …, 10} | Minimum number of samples required to split a node
SAMME | M | {50, 100, …, 250} | Total number of tree learners
SAMME | η | {0.1, 0.2, …, 1} | Learning rate used in the weight update
SAMME | ud | {1, 2, …, 10} | Upper limit of the tree learner depth
GBDT | M | {50, 100, …, 250} | Total number of tree learners
GBDT | η | {0.1, 0.2, …, 1} | Learning rate used in the prediction update
GBDT | ud | {1, 2, …, 10} | Upper limit of the tree learner depth
GBDT | ss | {0.5, 0.6, …, 1} | Proportion of samples used for fitting each tree learner
GBDT | mid | {0.5, 1, …, 5} | Minimum impurity decrease required to split a node
XGBoost | M | {50, 100, …, 250} | Total number of tree learners
XGBoost | η | {0.1, 0.2, …, 1} | Learning rate used in the prediction update
XGBoost | ud | {1, 2, …, 10} | Upper limit of the tree learner depth
XGBoost | ss | {0.5, 0.6, …, 1} | Proportion of samples used for fitting each tree learner
XGBoost | mcw | {3, 4, …, 7} | Minimum sum of sample weights required in a child node of each tree learner
NN | hls | {400, 500, 600, 700} | Number of neurons in the first hidden layer
NN | act | {‘relu’, ‘logistic’, ‘tanh’, ‘identity’} | Activation function for the hidden layer
NN | slv | {‘lbfgs’, ‘sgd’, ‘adam’} | Solver for weight optimization
SVM | deg | {0.5, 1, 1.5, 2} | Degree of the polynomial kernel function
SVM | C | {1e3, 1e4, 2e4, 1e5} | Penalty parameter inversely proportional to the regularization strength
Table 3. The highest mean identification accuracies μ obtained on the test sets with the six classification methods, and the corresponding feature dimensions d′.

Method | Highest μ | d′
RF | 94.17% | 17
SAMME | 95% | 23
GBDT | 88.33% | 13
NN | 92.5% | 13
SVM | 92.5% | 7
XGBoost | 97.5% | 21
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

