1. Introduction
Natural rubber is in short supply in China, which depends mainly on imports [1]. Due to exchange rates, freight, taxes and other factors, there are certain fluctuations and anomalies in the Natural Rubber Customs Declaration Price (NRCDP). The uncertainty of the NRCDP increases the commercial risk of illegal price declaration. This not only hurts the development of natural rubber production and its related industries but also damages the development of the national economy. Therefore, it is very important to predict the NRCDP for economic development. Moreover, the predicted NRCDP can be used for macroeconomic decisions or for early warning of falsely declared prices.
At present, the price prediction of a certain commodity is mainly performed by transferring price prediction methods from other commodities and by time series prediction methods. Since the collected price data are nonlinear and uncertain, neural networks are widely applied in price prediction. Gao et al. [2] and Yu et al. [3] used a BP neural network to predict egg prices and auto insurance claims, respectively. Su et al. [4] proposed a BP neural network model that combined principal component analysis with the Levenberg-Marquardt (LM) algorithm. Tkachenko et al. [5] predicted medical insurance costs based on Ito decomposition and the neural-like structure of the successive geometric transformations model (SGTM). Shakhovska et al. [6] applied Bagged CART and Random Forest algorithms to predict medical insurance costs. Moreover, neural networks have also obtained good performance in the prediction of stock prices [7,8,9,10,11] and other commodity prices [12,13,14]. However, none of these methods has been applied to NRCDP prediction. According to the no free lunch theorem [15], the superior performance of a model on one dataset is necessarily accompanied by its inferior performance on another particular dataset. Therefore, a model specifically designed for NRCDP prediction is needed; in this paper, the BP neural network is applied to NRCDP prediction for the first time. Due to the complexity and variability of the data, on the one hand it is difficult for a BP neural network to capture the data features accurately; on the other hand, training may fall into a local minimum, which can greatly reduce prediction accuracy. To overcome this problem, the Genetic Algorithm (GA) is applied to adjust the parameters of the BP neural network [16]. To analyze the features of complicated and changeable data, time-frequency analysis can be used, which describes the time and frequency features of data and usually includes the short-time Fourier transform, the wavelet transform and so on. Compared with other methods, the wavelet transform has a better time-frequency window and has obvious advantages in processing nonlinear and uncertain data; therefore, wavelet decomposition is chosen to analyze and process the data [17,18]. Since the feature sequences obtained by wavelet decomposition are of different scales, a single BP neural network cannot give full play to the advantages of wavelet decomposition, so it is necessary to design a group of BP neural networks, one for each feature sequence.
Based on the above, a prediction method for the NRCDP based on Wavelet decomposition and a BP neural network Group (BPG) optimized by GA (W-GA-BPG) is proposed, which makes the prediction of the NRCDP more accurate. The remainder of this paper is organized as follows. Section 2, Section 3 and Section 4 discuss wavelet decomposition, the BP neural network and the GA, respectively. In Section 5, the process of establishing the NRCDP dataset from the original dataset is introduced in detail. In Section 6, the framework of the proposed method is introduced in detail, and in Section 6.1, Section 6.2, Section 6.3 and Section 6.4, the model parameters are selected by experiments. In Section 6.5, experiments are performed to evaluate the effectiveness of the proposed model. The main work of this paper is shown in Figure 1.
2. Wavelet Decomposition
Wavelet decomposition overcomes the disadvantage that the Fourier transform is localized only in frequency; it is a transform analysis method in both the time and frequency domains [19]. Wavelet decomposition can extract information from the original data at different scales. A Low-Frequency Sequence (LFS) and High-Frequency Sequences (HFSs) can be obtained by decomposing the original data; thus, the time, location and frequency information of the data can be captured effectively. The original data x[n] are filtered by the low-pass filter h[n] and the high-pass filter g[n] to obtain multi-scale features, which is called the discrete wavelet transform (DWT) [20,21]. The original data are filtered through a low-pass filter with cutoff frequency f/2 (f is the highest frequency of the original data) to obtain the low-frequency component (x∗h)[n] = Σk x[k]·h[n − k], in which the resolution is reduced to half. At the same time, the original data are filtered through a high-pass filter with the same cutoff frequency to obtain the high-frequency component (x∗g)[n] = Σk x[k]·g[n − k].
Then, the obtained features are down-sampled so that half of the samples remain. The LFS A1 and the HFS D1 are thus obtained. The down-sampled LFS A1 is decomposed further by the same process. Decomposing the original data with an n-level wavelet yields the n + 1 sequences An, Dn, Dn−1, …, D2, D1.
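As an illustration of the filter-and-downsample scheme above, the following Python sketch implements a multi-level DWT using the Haar filter pair, a simplified stand-in for the db4/db6 filters used later in the paper; the function names and the NumPy-based implementation are illustrative, not the authors' code.

```python
import numpy as np

def haar_dwt(x):
    """One DWT level with the Haar filters: low-pass/high-pass filtering
    followed by downsampling by 2 (half of the samples remain)."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation (low-frequency) component
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail (high-frequency) component
    return a, d

def haar_idwt(a, d):
    """Inverse of one Haar DWT level (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wavedec(x, levels):
    """n-level decomposition: returns [A_n, D_n, ..., D_1], as in the text,
    by repeatedly decomposing the low-frequency sequence."""
    details = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        details.append(d)
    return [a] + details[::-1]
```

For a length-8 signal, `wavedec(x, 3)` returns four sequences (A3, D3, D2, D1) of lengths 1, 1, 2 and 4, mirroring the halving of resolution at each level.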
3. BP Neural Network
The BP neural network is mainly composed of an input layer, a hidden layer and an output layer. Among them, the number of hidden-layer nodes, the size of the training data, and the choice of activation functions for the hidden and output layers have a great influence on the prediction accuracy of the BP neural network [22]. A three-layer BP neural network can approximate any nonlinear mapping well; therefore, in practical applications, a three-layer network is usually used for training and prediction [23], whose structure is shown in Figure 2.
The BP neural network training process is as follows:
Step 1: Initialize the neural network. First, the number of input nodes NI and the number of output nodes NO of the BP neural network are determined according to the task. The node number NH of the hidden layer can be selected by an empirical formula or by experiments. Then, the network weights and biases are initialized. The weight between input node i and hidden-layer node h is denoted wih; the weight between hidden-layer node h and output node j is denoted whj; their biases are denoted a and b, respectively. Finally, the network parameters are initialized: the learning rate is set as η, and the activation function g(x) is taken to be the sigmoid function.
Step 2: Input the training samples. The processed samples are input into the initialized BP neural network.
Step 3: Calculate the forward propagation. The hidden-layer output H is calculated as Hh = g(Σi wih·xi + ah), h = 1, …, NH, where x is the input sample. Then, the output-layer value O is calculated as Oj = Σh whj·Hh + bj, j = 1, …, NO.
Step 4: Calculate the error and update the weights and biases. The error is ej = Yj − Oj, where e is the error between the predicted value O and the expected value Y. The weights and biases between the input layer and the hidden layer are updated as wih ← wih + η·Hh(1 − Hh)·xi·Σj whj·ej and ah ← ah + η·Hh(1 − Hh)·Σj whj·ej. The weights and biases between the hidden layer and the output layer are updated in the same way: whj ← whj + η·Hh·ej and bj ← bj + η·ej.
Step 5: Iterate. Steps 3 and 4 are carried out with new samples until the maximum iteration number is reached or the target error is satisfied.
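The five steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's exact implementation: the output activation is assumed linear (a common choice for price regression), and the initialization scale is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BPNet:
    """Three-layer BP network (Steps 1-5): NI inputs, NH hidden nodes, NO outputs."""
    def __init__(self, NI, NH, NO, eta=0.01):
        self.w_ih = rng.normal(0, 0.5, (NI, NH))   # input->hidden weights w_ih
        self.a = np.zeros(NH)                      # hidden biases a
        self.w_hj = rng.normal(0, 0.5, (NH, NO))   # hidden->output weights w_hj
        self.b = np.zeros(NO)                      # output biases b
        self.eta = eta                             # learning rate η

    def forward(self, x):
        self.H = sigmoid(x @ self.w_ih + self.a)   # Step 3: hidden output H
        return self.H @ self.w_hj + self.b         # output value O (linear)

    def train_step(self, x, y):
        O = self.forward(x)
        e = y - O                                  # Step 4: error e = Y - O
        # back-propagate: hidden delta uses the sigmoid derivative H(1 - H)
        dH = (e @ self.w_hj.T) * self.H * (1 - self.H)
        self.w_hj += self.eta * np.outer(self.H, e)
        self.b += self.eta * e
        self.w_ih += self.eta * np.outer(x, dH)
        self.a += self.eta * dH
        return float((e ** 2).mean())
```

Repeating `train_step` over the training samples (Step 5) drives the squared error down by gradient descent.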
4. Genetic Algorithm
To solve the problem that the BP neural network may fall into a local minimum, a GA is used to optimize it.
A GA conducts a self-adaptive probabilistic optimization search by mimicking the natural evolution laws of genetics; as a highly efficient optimization algorithm, it has the advantages of high robustness, strong global search capability and computational simplicity [24]. An optimization problem can be solved by simulating the process of biological evolution, generating the next generation of solutions through selection, crossover, mutation and other genetic operations. Therefore, a GA can be used to find the global minimum when training the BP neural network. Meanwhile, according to the designed fitness function, the population is updated by evaluating the fitness values. Finally, the optimal weights and biases obtained by the GA are assigned to the BP neural network. The detailed process of optimizing the weights and biases of the BP neural network by GA is as follows:
Step 1: Initialization. Generate random weights and initial populations.
Step 2: The fitness calculation. The fitness value F is used to measure the fit between an individual and the environment. It can be calculated from the absolute error between the predicted and real values of the BP neural network: F = k·Σi |yi − oi|, where k is a genetic coefficient, n is the number of nodes in the output layer (the sum runs over i = 1, …, n), yi is the expected value and oi is the value predicted by the BP neural network.
Step 3: The genetic operation. Until the termination conditions are met, a series of genetic operations such as selection, crossover and mutation are repeated to obtain new populations. The fitness-proportion (roulette-wheel) method is used to select individuals. The probability of individual i being selected is pi = fi/Σj fj, where fi = k/Fi is the individual fitness (a smaller error Fi gives a larger fitness, k is a coefficient) and N is the population size (the sum runs over j = 1, …, N). Then, crossover and mutation operations are performed.
Step 4: The termination conditions: (1) a given number of iterations is reached; or (2) the number of individuals in the new population reaches the maximum population size N.
This process can be expressed in pseudo-code as follows.
Set the parameters
Popsize: Population size
Maxgen: Number of evolutionary iterations
Pop: Initial population
P_cross: Crossover probability
P_mutation: Mutation probability
F: Fitness value
P: Selection probability
Pseudo code
Step 1: Initialize the parameters;
Step 2: Randomly generate the first population Pop;
Step 3: Calculate the fitness value F, and select two individuals with the largest selection probability P;
Step 4: Generate a random number l in (0, 1); if l < P_cross, the two chromosomes cross each other;
Step 5: Generate a random number l in (0, 1); if l < P_mutation, the two chromosomes are mutated;
Step 6: Put the evolved new individuals into the new population new_pop;
Step 7: If the number of individuals in new_pop has not reached Popsize AND the maximum number of generations Maxgen has not been reached, go to Step 3; else stop the evolution.
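A compact real-coded version of this pseudo-code might look as follows. The fitness callback here is an arbitrary error function standing in for the BP network's training error, fitness is taken as the reciprocal of the error (so smaller error means larger selection probability), and elitism is added so the best individual is never lost; the elitism step and mutation scale are assumptions, not stated in the pseudo-code.

```python
import numpy as np

rng = np.random.default_rng(1)

def ga_optimize(fitness, dim, popsize=30, maxgen=100, p_cross=0.2, p_mutation=0.5):
    """Minimal real-coded GA following the pseudo-code above: roulette-wheel
    selection, single-point crossover with probability p_cross, and
    Gaussian mutation with probability p_mutation."""
    pop = rng.uniform(-1.0, 1.0, (popsize, dim))
    for _ in range(maxgen):
        err = np.array([fitness(ind) for ind in pop])   # F: error of each individual
        f = 1.0 / (err + 1e-12)                         # smaller error -> larger fitness f_i
        p = f / f.sum()                                 # selection probability p_i
        new_pop = [pop[int(np.argmin(err))].copy()]     # elitism (an assumption)
        while len(new_pop) < popsize:
            i, j = rng.choice(popsize, size=2, p=p)     # Step 3: roulette-wheel selection
            c1, c2 = pop[i].copy(), pop[j].copy()
            if rng.random() < p_cross and dim > 1:      # Step 4: single-point crossover
                pt = int(rng.integers(1, dim))
                c1[:pt], c2[:pt] = pop[j][:pt].copy(), pop[i][:pt].copy()
            for c in (c1, c2):
                if rng.random() < p_mutation:           # Step 5: Gaussian mutation
                    c[int(rng.integers(dim))] += rng.normal(0.0, 0.1)
            new_pop.extend([c1, c2])                    # Step 6: fill new_pop
        pop = np.array(new_pop[:popsize])               # Step 7: next generation
    err = np.array([fitness(ind) for ind in pop])
    return pop[int(np.argmin(err))]
```

In the paper's setting, the chromosome would encode all weights and biases of a BP network, and the fitness would be the network's prediction error on the training set.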
5. The NRCDP Dataset Establishment
The original NRCDP dataset is provided by Qingdao Customs, which records the CDP information for different kinds of natural rubber commodities in different time periods every day. It contains 4422 records of natural rubber information dated from 15:45:20 on 17 March 2021 to 15:15:58 on 13 September 2021. The attributes and details are shown in
Table 1.
The original NRCDP dataset has some problems, such as outliers, irrelevant variables and inconsistent commodity types, which prevent it from reflecting how the data of some commodity types change over time and will affect the accuracy of the prediction model. To solve these problems, the NRCDP dataset is established by selecting and preprocessing the original data. The detailed operations are as follows:
Step 1: Select the commodity categories. The number of recorded different categories of commodities is shown in
Table 2. It can be seen from the table that the numbers of records for natural rubber categories 400129 and 400130 are too small to reflect the data characteristics, so these two categories are filtered out. Although the remaining categories can reflect the data characteristics to some extent, only natural rubber (400122) is selected as the sample after comparing their performance.
Step 2: Delete the irrelevant variables. By analyzing the dataset, it can be seen that the order number, commodity code, commodity name, total CNY price, quantity and unit are irrelevant to the experiment purpose, so they are deleted.
Step 3: Process the outliers. By analyzing the dataset, it is found that the data range is between 2 and 14,000, which is a relatively large span. Moreover, only a small part of the data lies between 20 and 14,000. This is because the NRCDP is particularly high if the number of individual batches declared at customs is smaller than that of normal batches during the final order processing. These NRCDPs cannot reflect the fluctuation rule of natural rubber prices in the market. Therefore, the NRCDPs larger than 20 are filtered out first. For a clear view, the distribution of the remaining data is shown in
Figure 3. As can be seen from
Figure 3, the NRCDP is mainly concentrated around 10–13 Yuan. The division of price ranges for the entire dataset is shown in
Table 3. Therefore, 3475 NRCDPs between 6 and 14 Yuan are chosen as the basic data in this paper.
Step 4: Calculate the average value of NRCDPs every day as a customs declaration price sample. Thus, a new NRCDP dataset is established. The NRCDP dataset information is shown in
Figure 4.
Due to the occasional absence of import transactions for natural rubber products (400122), date discontinuity exists in the established dataset. From 17 March 2021 to 13 September 2021, the NRCDP dataset includes 141 records of NRCDP. Every record contains two attributes (date and price). In this paper, the NRCDP dataset is divided into two parts, the first 2/3 part is selected as the training set, and the last 1/3 part is selected as the test set.
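The four preprocessing steps above can be sketched as follows; the record layout and field names are illustrative, since the real Qingdao Customs schema is not reproduced here, and the 6–14 Yuan filter combines Steps 3 and the range selection into one check.

```python
from collections import defaultdict

def build_nrcdp_dataset(records):
    """records: (date, category, price) tuples.  Keep only category '400122'
    (Step 1), drop prices outside the 6-14 Yuan range kept in the paper
    (Step 3 and Table 3), then average the prices of each day (Step 4)."""
    daily = defaultdict(list)
    for date, category, price in records:
        if category != "400122":
            continue
        if not (6 <= price <= 14):
            continue
        daily[date].append(price)
    dates = sorted(daily)
    return [(d, sum(daily[d]) / len(daily[d])) for d in dates]

def train_test_split(samples):
    """First 2/3 of the samples for training, last 1/3 for testing."""
    cut = (2 * len(samples)) // 3
    return samples[:cut], samples[cut:]
```

Each element of the result carries the two attributes of a record in the established dataset (date and average price), and the split mirrors the 2/3 vs. 1/3 division used in the paper.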
6. W-GA-BPG Model Construction
In the prediction model, wavelet transform is first carried out to obtain the LFS and HFSs of the NRCDP. Secondly, one BP neural network is established for each sequence; GA is used to optimize each BP neural network, and the corresponding LFS and HFS predictions are obtained, respectively. Finally, the final prediction result is obtained by wavelet reconstruction. The algorithm is shown in
Figure 5.
The detailed steps of the algorithm are as follows:
Step 1: Obtain the wavelet decomposition sequences. A group of sequences is obtained by wavelet decomposition, which contains n HFSs (D1, D2, …, Dn) and an LFS (An). The LFS clearly reflects the coarse trend of the NRCDP, which helps the following BP neural network capture the trend of the data. The HFSs reflect the detailed information of the NRCDP, which helps the following BP neural networks fully capture the changeable characteristics of the data.
Step 2: Select the optimal BP neural network model. The node numbers of the input and hidden layer for the BP neural network are obtained by experiments. Based on that, the corresponding network parameters can be obtained by the same strategy.
Step 3: Establish a Group of BP neural networks (BPG) optimized by GA. Since the feature sequences with different scales obtained by wavelet decomposition have different characteristics, the prediction performance using a single BP neural network cannot reach the expected effect. In addition, a BP neural network may fall into a local minimum in training. To solve these problems, a BPG optimized by GA (GA-BPG) is established. The network in the group is denoted as GA-BP1, GA-BP2 … GA-BPn, GA-BPn+1 (n is the decomposition level). A group of initial weights and biases are randomly assigned to the BP neural network, and the initial population is generated randomly. Then the weights and biases are updated by a series of genetic operations (selection, crossover and mutation).
Step 4: Obtain the wavelet decomposition prediction sequences. Each sequence is input into the corresponding GA-BP neural network, and the prediction sequence is obtained.
Step 5: The final prediction result of NRCDP is obtained by wavelet reconstruction.
6.1. The Wavelet Decomposition Function and Level Selection
To select the wavelet function, the prediction results of the BP neural network with different decomposition levels using the db4 and db6 wavelet functions are compared; the MSE is shown in
Table 4. It can be seen from the table that the MSEs obtained with two and three decomposition levels using the db4 wavelet function are 0.0742 and 0.1094, respectively; the MSE of two decomposition levels is 0.0352 lower than that of three. The MSEs obtained with two and three decomposition levels using the db6 wavelet function are 0.1164 and 0.0163, respectively; the MSE of three decomposition levels is 0.1001 lower than that of two. The MSE obtained with three decomposition levels and the db6 wavelet function is 0.0579 lower than that with two decomposition levels and the db4 wavelet function. In conclusion, the db6 wavelet function with three decomposition levels has the smallest MSE and the best effect.
The feature sequences with different scales by db6 wavelet function and three decomposition levels are shown in
Figure 6. It can be seen from the figure that the LFS
A3 ranges from 0 to 11.5, which reflects the main features of the original sequence. The range of HFSs
D1,
D2 and
D3 are −1 to 1, −0.2 to 0.2, −0.6 to 0.4. With the increase in the number of decomposition levels, the value of the HFS becomes smaller and smaller, which reflects the details of the original sequence in different decomposition levels in turn.
To sum up, the db6 wavelet function and three decomposition levels are selected to obtain four feature sequences (D1, D2, D3 and A3) in this paper.
6.2. The BP Neural Network Parameters Selection
A three-layer BP neural network structure, including an input layer, a hidden layer and an output layer, is adopted after analyzing the NRCDP data. It is very important to determine the numbers of input nodes and hidden nodes in the BP neural network. Since no clear formula for determining them exists in current studies, they are determined by experiments in this paper.
According to the experience of previous experiments, the number of input nodes is set between 3 and 10 and the number of hidden layer nodes is set between 3 and 15. Other initial network parameters are set as shown in
Table 5. The experimental results are shown in Table 6, where IN represents the number of input nodes and HN the number of hidden-layer nodes.
As shown in
Table 6, the values in each column fluctuate rather than increase or decrease monotonically. When the number of input nodes is 7 and the number of hidden-layer nodes ranges from 3 to 15, the MSE is 0.0386, 0.0417, 0.0476, 0.0453, 0.0565, 0.0319, 0.0737, 0.0691, 0.1456, 0.0847, 0.0854, 0.0654 and 0.6566, respectively; the MSE values show no monotonic trend with the number of hidden-layer nodes. When the number of input nodes is 7 and the number of hidden-layer nodes is 8, the MSE reaches its minimum of 0.0319. Thus, a 7-8-1 three-layer BP network structure is adopted in this paper.
Based on the above structure, different learning rates are tried. Setting the learning rate as 0.1, 0.01 and 0.001, respectively, the corresponding results are shown in
Table 7. It can be observed that the MSE is smallest when the learning rate is 0.01. Thus, the learning rate is set to 0.01 in this paper.
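The parameter selection in this subsection amounts to an exhaustive grid search; a generic sketch is shown below, where the `evaluate` callback is a hypothetical stand-in for training a three-layer BP network with the given node numbers and returning its test MSE.

```python
def grid_search(evaluate, input_nodes, hidden_nodes):
    """Return the (IN, HN) pair with the lowest MSE, as in Table 6.
    evaluate(ni, nh) is assumed to train a network and return its MSE."""
    best_score, best_pair = float("inf"), None
    for ni in input_nodes:
        for nh in hidden_nodes:
            score = evaluate(ni, nh)
            if score < best_score:
                best_score, best_pair = score, (ni, nh)
    return best_pair, best_score
```

The same loop, with the node numbers fixed at 7 and 8, can sweep the learning rate (0.1, 0.01, 0.001) or, in Section 6.3, the crossover and mutation probabilities.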
6.3. The GA Parameters Selection
The initial parameter Settings of the GA are shown in
Table 8.
In the GA, the crossover probability (CP) and mutation probability (MP) are very important for the convergence of the BP network. The CP and MP are each set to 0.1, 0.2, …, 0.9. The prediction effectiveness is shown in
Table 9. When the CP is fixed, for example at 0.2, the MSE values do not increase or decrease monotonically as the MP increases. By comparing the data, the prediction and optimization effect of the BP neural network is best when the CP is 0.2 and the MP is 0.5.
Based on the above analysis, the BP neural network is set as a three-layer neural network structure with a network topology of 7-8-1, and the learning rate is 0.01. The CP and MP of the GA are set as 0.2 and 0.5.
6.4. The GA_BPG Parameters Selection
For the components with different characteristics obtained by wavelet decomposition, it is necessary to establish BPG to predict them, respectively. Based on the above analysis, this paper selects the db6 wavelet function to decompose the NRCDP data at three levels to obtain four sub-sequences
D1,
D2,
D3 and
A3. Then, a GA-BPG is established, including four three-layer BP neural networks optimized by GA (GA-BP1, GA-BP2, GA-BP3 and GA-BP4), each with a 7-8-1 network topology. The network parameters are the same as in the above analysis. Finally, the sub-sequences are predicted, respectively. By experiments similar to those in
Section 6.3, the CPs and MPs of the GAs used to optimize the four BP neural networks are selected, as shown in
Table 10.
6.5. Comparative Analysis with Other Predictive Models
The prediction results are evaluated using the MSE and the coefficient of determination (R2): MSE = (1/n)·Σi (yi − ŷi)² and R2 = 1 − Σi (yi − ŷi)²/Σi (yi − ȳ)², where yi is the real value of the NRCDP, ŷi is the predicted value of the NRCDP, ȳ is the average of the real NRCDP values, and n is the length of the predicted sample.
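The two metrics can be computed directly from their definitions:

```python
def mse(y_true, y_pred):
    """Mean squared error between real and predicted NRCDP values."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives MSE = 0 and R2 = 1; an R2 near 0 means the model does no better than predicting the average price.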
6.5.1. The Other Predictive Model Parameter Selection
To evaluate the proposed method, W-GA-BPG is compared and analyzed with LSTM, the least square method, BP neural network, GA-BP, wavelet decomposition combined with BP neural network (W-BP) (The overall framework proposed in [
25] was used in the experiment), and wavelet decomposition combined with a single GA-BP (W-GA-BP).
Each model has the same structure with seven input nodes and one output node. The input of the BP neural network and GA-BP model is the NRCDP data without wavelet decomposition. The number of nodes in the middle layer of the LSTM model is 16, the learning rate is 0.01, and the maximum number of iterations is 50. The parameters used in BP neural networks, GA-BP, W-BP and W-GA-BP models are the optimal results selected by the above experiments.
6.5.2. Prediction Effect Comparison
The MSE and R2 are used to evaluate the models. The predicted result and error of each model are shown in Figure 7 and Figure 8. The MSE and R2 between the true and predicted values of the different models are shown in Table 11.
According to Figure 7, Figure 8 and Table 11, the results using W-GA-BPG have the highest accuracy compared with the other models, with MSE and R2 values of 0.0043 and 0.9302, respectively. The MSEs of LSTM, the least square method and the BP neural network are 0.0365, 0.0328 and 0.0319, respectively, and their R2 values are 0.4092, 0.4694 and 0.4829, respectively. Therefore, among the three traditional models (LSTM, the least square method and BP), the BP neural network achieves the highest accuracy. The MSE values obtained using the GA-BP and W-BP models are 0.0242 and 0.0163, respectively, and their R2 values are 0.6087 and 0.7325, respectively. Compared with the traditional BP neural network, combining it with GA or with wavelet decomposition successively improves the prediction accuracy. The MSE and R2 of W-GA-BP are 0.0061 and 0.9032, respectively; its MSE is smaller than those of GA-BP and W-BP, and its R2 is larger. Overall, the highest accuracy is achieved by the proposed W-GA-BPG model, which can predict the NRCDP more accurately.
To sum up, the prediction accuracy of a BP neural network enhanced by wavelet decomposition or GA alone is lower than that of the proposed method. The combined method can not only extract the trend and details of the original sequence but also optimize the weights of the neural network using GA. It provides a reference for early warning of the NRCDP, which is of great significance for macro-control of the natural rubber market.
7. Conclusions
In this paper, we have established the NRCDP dataset and proposed a prediction model, W-GA-BPG, which is based on wavelet decomposition and a BP neural network group optimized by GA. To obtain the general trend and detailed information of the NRCDP, the db6 wavelet function is used to decompose the NRCDP at three levels. To solve the problem that BP neural network training may fall into a local minimum and that the obtained sub-sequences differ in distribution and value, a GA-optimized BP neural network group is proposed to predict the sub-sequences, respectively. The proposed model can fully capture the information of the NRCDP at different scales and avoid the possibility of the BP neural network falling into a local minimum. Experimental results show that, compared with the other models, the proposed W-GA-BPG achieves the best performance, with the smallest error (0.0043) and the highest coefficient of determination (0.9302). The proposed model is applied to the prediction problem in the Research on Building Digital Ecology of Qingdao Shipping Trade Finance Based on Blockchain. It can then provide early warning of natural rubber market price fluctuations.
The proposed model for predicting the NRCDP only considers the impact of historical price data on future prices; other factors, such as exchange rates and economic indices, are not considered. To further improve the performance of the prediction model, more factors influencing the NRCDP will be considered in our future work. Moreover, the proposed model has only been validated on NRCDP prediction, and its wider applicability should be further verified.