1. Introduction
Natural rubber is in short supply in China, which depends mainly on imports [1]. Due to exchange rates, freight, taxes and other factors, there are certain fluctuations and anomalies in the Natural Rubber Customs Declaration Price (NRCDP). The uncertainty of the NRCDP increases the commercial risk of illegal price declaration. This not only hurts the development of natural rubber production and its related industries but also damages the development of the national economy. Therefore, it is very important to predict the NRCDP for economic development. Moreover, the predicted NRCDP can be used for macroeconomic decisions or for early warning of falsely declared prices.
At present, the price prediction of a certain commodity is mainly performed by transferring price prediction methods from other commodities and by time series prediction methods. Since the collected price data are nonlinear and uncertain, neural networks are widely applied in price prediction. Gao et al. [2] and Yu et al. [3] used a BP neural network to predict egg prices and auto insurance claims, respectively. Su et al. [4] proposed a BP neural network model that combined principal component analysis with the Levenberg-Marquardt (LM) algorithm. Tkachenko et al. [5] predicted medical insurance costs based on Ito decomposition and the neural-like structure of the successive geometric transformations model (SGTM). Shakhovska et al. [6] applied Bagged CART and Random Forest algorithms to predict medical insurance costs. Moreover, neural networks have also obtained good performance in the prediction of stock prices [7,8,9,10,11] and other commodity prices [12,13,14]. However, none of these methods has been applied to NRCDP prediction. According to the no free lunch theorem [15], the superior performance of a model on one dataset is necessarily accompanied by its inferior performance on another particular dataset. Therefore, a model specifically designed for NRCDP prediction is needed; in this paper, the BP neural network is applied to NRCDP prediction for the first time. Due to the complexity and variability of the data, on the one hand it is difficult for a BP neural network to capture the data features accurately; on the other hand, training may fall into a local minimum, which can greatly reduce prediction accuracy. To overcome this problem, the Genetic Algorithm (GA) is applied to adjust the parameters of the BP neural network [16]. To analyze the features of complicated and changeable data, time-frequency analysis can be used, which describes the time and frequency features of data and usually includes the short-time Fourier transform, the wavelet transform and so on. Compared with other methods, the wavelet transform has a better time-frequency window and has obvious advantages in processing nonlinear and uncertain data; therefore, wavelet decomposition is chosen to analyze and process the data [17,18]. Since the feature sequences obtained by wavelet decomposition are of different scales, a single BP neural network cannot give full play to the advantages of wavelet decomposition, so it is necessary to design a group of BP neural networks, one for each feature sequence.
Based on the above, a prediction method for the NRCDP based on Wavelet decomposition and a BP neural network Group (BPG) optimized by GA (W-GA-BPG) is proposed, which makes the prediction of the NRCDP more accurate. The remainder of this paper is organized as follows. Section 2, Section 3 and Section 4 discuss wavelet decomposition, the BP neural network and the GA, respectively. In Section 5, the process of establishing the NRCDP dataset from the original dataset is introduced in detail. In Section 6, the framework of the proposed method is introduced in detail, and in Section 6.1, Section 6.2, Section 6.3 and Section 6.4, the model parameters are selected by experiments. In Section 6.5, experiments are performed to evaluate the effectiveness of the proposed model. The main work of this paper is shown in Figure 1.
2. Wavelet Decomposition
Wavelet decomposition overcomes the disadvantage that the Fourier transform is localized only in frequency; it is a transform analysis method in both the time and frequency domains [19]. Wavelet decomposition can extract information from the original data at different scales. A Low-Frequency Sequence (LFS) and High-Frequency Sequences (HFSs) can be obtained by decomposing the original data; thus, the time, location and frequency information of the data can be captured effectively. The original data x[n] are filtered by the low-pass filter h[n] and the high-pass filter g[n] to obtain multi-scale features, which is called the discrete wavelet transform (DWT) [20,21]. The original data are filtered through a low-pass filter with cutoff frequency f/2 (f is the highest frequency of the original data) to obtain the low-frequency component (x∗h)[n] = Σk x[k]·h[n − k], in which the resolution is reduced to half. At the same time, the original data are filtered through a high-pass filter with the same cutoff frequency to obtain the high-frequency component (x∗g)[n] = Σk x[k]·g[n − k].
Then, the obtained features are down-sampled so that half of the samples remain. The LFS A1 and the HFS D1 are thus obtained. The down-sampled LFS A1 is decomposed further by the same process. Decomposing the original data with an n-level wavelet yields the n + 1 sequences An, Dn, Dn−1, …, D2, D1.
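As an illustration of the filter-and-downsample scheme above, the following Python sketch implements a multi-level DWT using the Haar filter pair, a simplified stand-in for the db4/db6 filters used later in the paper; the function names and the NumPy-based implementation are illustrative, not the authors' code.

```python
import numpy as np

def haar_dwt(x):
    """One DWT level with the Haar filters: low-pass/high-pass filtering
    followed by downsampling by 2 (half of the samples remain)."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation (low-frequency) component
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail (high-frequency) component
    return a, d

def haar_idwt(a, d):
    """Inverse of one Haar DWT level (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wavedec(x, levels):
    """n-level decomposition: returns [A_n, D_n, ..., D_1], as in the text,
    by repeatedly decomposing the low-frequency sequence."""
    details = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        details.append(d)
    return [a] + details[::-1]
```

For a length-8 signal, `wavedec(x, 3)` returns four sequences (A3, D3, D2, D1) of lengths 1, 1, 2 and 4, mirroring the halving of resolution at each level.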
3. BP Neural Network
The BP neural network is mainly composed of an input layer, a hidden layer and an output layer. Among them, the number of hidden-layer nodes, the size of the training data, and the choice of activation functions for the hidden and output layers have a great influence on the prediction accuracy of the BP neural network [22]. A three-layer BP neural network can approximate any nonlinear mapping well; therefore, in practical applications, a three-layer network is usually used for training and prediction [23], whose structure is shown in Figure 2.
The BP neural network training process is as follows:
Step 1: Initialize the neural network. First, the number of input nodes NI and the number of output nodes NO of the BP neural network are determined according to the task. The node number NH of the hidden layer can be selected by an empirical formula or by experiments. Then, the network weights and biases are initialized. The weight between input node i and hidden-layer node h is denoted wih; the weight between hidden-layer node h and output node j is denoted whj; their biases are denoted a and b, respectively. Finally, the network parameters are initialized: the learning rate is set as η, and the activation function g(x) is taken to be the sigmoid function.
Step 2: Input the training samples. The processed samples are input into the initialized BP neural network.
Step 3: Calculate the forward propagation. The hidden-layer output H is calculated as Hh = g(Σi wih·xi + ah), h = 1, …, NH, where x is the input sample. Then, the output-layer value O is calculated as Oj = Σh whj·Hh + bj, j = 1, …, NO.
Step 4: Calculate the error and update the weights and biases. The error is ej = Yj − Oj, where e is the error between the predicted value O and the expected value Y. The weights and biases between the input layer and the hidden layer are updated as wih ← wih + η·Hh(1 − Hh)·xi·Σj whj·ej and ah ← ah + η·Hh(1 − Hh)·Σj whj·ej. The weights and biases between the hidden layer and the output layer are updated in the same way: whj ← whj + η·Hh·ej and bj ← bj + η·ej.
Step 5: Iterate. Steps 3 and 4 are carried out with new samples until the maximum iteration number is reached or the target error is satisfied.
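The five steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's exact implementation: the output activation is assumed linear (a common choice for price regression), and the initialization scale is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BPNet:
    """Three-layer BP network (Steps 1-5): NI inputs, NH hidden nodes, NO outputs."""
    def __init__(self, NI, NH, NO, eta=0.01):
        self.w_ih = rng.normal(0, 0.5, (NI, NH))   # input->hidden weights w_ih
        self.a = np.zeros(NH)                      # hidden biases a
        self.w_hj = rng.normal(0, 0.5, (NH, NO))   # hidden->output weights w_hj
        self.b = np.zeros(NO)                      # output biases b
        self.eta = eta                             # learning rate η

    def forward(self, x):
        self.H = sigmoid(x @ self.w_ih + self.a)   # Step 3: hidden output H
        return self.H @ self.w_hj + self.b         # output value O (linear)

    def train_step(self, x, y):
        O = self.forward(x)
        e = y - O                                  # Step 4: error e = Y - O
        # back-propagate: hidden delta uses the sigmoid derivative H(1 - H)
        dH = (e @ self.w_hj.T) * self.H * (1 - self.H)
        self.w_hj += self.eta * np.outer(self.H, e)
        self.b += self.eta * e
        self.w_ih += self.eta * np.outer(x, dH)
        self.a += self.eta * dH
        return float((e ** 2).mean())
```

Repeating `train_step` over the training samples (Step 5) drives the squared error down by gradient descent.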
4. Genetic Algorithm
To solve the problem that the BP neural network may fall into a local minimum, a GA is used to optimize it.
A GA conducts a self-adaptive probabilistic optimization search by mimicking the natural evolution laws of genetics; as a highly efficient optimization algorithm, it has the advantages of high robustness, strong global search capability and computational simplicity [24]. An optimization problem can be solved by simulating the process of biological evolution, generating the next generation of solutions through selection, crossover, mutation and other genetic operations. Therefore, a GA can be used to find the global minimum when training the BP neural network. Meanwhile, according to the designed fitness function, the population is updated by evaluating the fitness values. Finally, the optimal weights and biases obtained by the GA are assigned to the BP neural network. The detailed process of optimizing the weights and biases of the BP neural network by GA is as follows:
Step 1: Initialization. Generate random weights and initial populations.
Step 2: The fitness calculation. The fitness value F is used to measure the fit between an individual and the environment. It can be calculated from the absolute error between the predicted and real values of the BP neural network: F = k·Σi |yi − oi|, where k is a genetic coefficient, n is the number of nodes in the output layer (the sum runs over i = 1, …, n), yi is the expected value and oi is the value predicted by the BP neural network.
Step 3: The genetic operation. Until the termination conditions are met, a series of genetic operations such as selection, crossover and mutation are repeated to obtain new populations. The fitness-proportion (roulette-wheel) method is used to select individuals. The probability of individual i being selected is pi = fi/Σj fj, where fi = k/Fi is the individual fitness (a smaller error Fi gives a larger fitness, k is a coefficient) and N is the population size (the sum runs over j = 1, …, N). Then, crossover and mutation operations are performed.
Step 4: The termination conditions: (1) a given number of iterations is reached; or (2) the number of individuals in the new population reaches the maximum population size N.
This process can be expressed in pseudo-code as follows.
Set the parameters
Popsize: Population size
Maxgen: Number of evolutionary iterations
Pop: Initial population
P_cross: Crossover probability
P_mutation: Mutation probability
F: Fitness value
P: Selection probability
Pseudo code
Step 1: Initialize the parameters;
Step 2: Randomly generate the first population Pop;
Step 3: Calculate the fitness value F, and select two individuals with the largest selection probability P;
Step 4: Generate a random number l in (0, 1); if l < P_cross, the two chromosomes cross each other;
Step 5: Generate a random number l in (0, 1); if l < P_mutation, the two chromosomes are mutated;
Step 6: Put the evolved new individuals into the new population new_pop;
Step 7: If the number of individuals in new_pop has not reached Popsize AND the maximum number of generations Maxgen has not been reached, go to Step 3; else stop the evolution.
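A compact real-coded version of this pseudo-code might look as follows. The fitness callback here is an arbitrary error function standing in for the BP network's training error, fitness is taken as the reciprocal of the error (so smaller error means larger selection probability), and elitism is added so the best individual is never lost; the elitism step and mutation scale are assumptions, not stated in the pseudo-code.

```python
import numpy as np

rng = np.random.default_rng(1)

def ga_optimize(fitness, dim, popsize=30, maxgen=100, p_cross=0.2, p_mutation=0.5):
    """Minimal real-coded GA following the pseudo-code above: roulette-wheel
    selection, single-point crossover with probability p_cross, and
    Gaussian mutation with probability p_mutation."""
    pop = rng.uniform(-1.0, 1.0, (popsize, dim))
    for _ in range(maxgen):
        err = np.array([fitness(ind) for ind in pop])   # F: error of each individual
        f = 1.0 / (err + 1e-12)                         # smaller error -> larger fitness f_i
        p = f / f.sum()                                 # selection probability p_i
        new_pop = [pop[int(np.argmin(err))].copy()]     # elitism (an assumption)
        while len(new_pop) < popsize:
            i, j = rng.choice(popsize, size=2, p=p)     # Step 3: roulette-wheel selection
            c1, c2 = pop[i].copy(), pop[j].copy()
            if rng.random() < p_cross and dim > 1:      # Step 4: single-point crossover
                pt = int(rng.integers(1, dim))
                c1[:pt], c2[:pt] = pop[j][:pt].copy(), pop[i][:pt].copy()
            for c in (c1, c2):
                if rng.random() < p_mutation:           # Step 5: Gaussian mutation
                    c[int(rng.integers(dim))] += rng.normal(0.0, 0.1)
            new_pop.extend([c1, c2])                    # Step 6: fill new_pop
        pop = np.array(new_pop[:popsize])               # Step 7: next generation
    err = np.array([fitness(ind) for ind in pop])
    return pop[int(np.argmin(err))]
```

In the paper's setting, the chromosome would encode all weights and biases of a BP network, and the fitness would be the network's prediction error on the training set.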
5. The NRCDP Dataset Establishment
The original NRCDP dataset is provided by Qingdao Customs, which records the CDP information for different kinds of natural rubber commodities in different time periods every day. It contains 4422 records of natural rubber information dated from 15:45:20 on 17 March 2021 to 15:15:58 on 13 September 2021. The attributes and details are shown in
Table 1.
The original NRCDP dataset has some problems, such as outliers, irrelevant variables and inconsistent commodity types, which prevent it from reflecting how the data of some commodity types change over time and will affect the accuracy of the prediction model. To solve these problems, the NRCDP dataset is established by selecting and preprocessing the original data. The detailed operations are as follows:
Step 1: Select the commodity categories. The number of recorded different categories of commodities is shown in
Table 2. It can be seen from the table that the numbers of records for natural rubber categories 400129 and 400130 are too small to reflect the data characteristics, so these two categories are filtered out. Although the remaining categories can reflect the data characteristics to some extent, only natural rubber (400122) is selected as the sample after comparing their performance.
Step 2: Delete the irrelevant variables. By analyzing the dataset, it can be seen that the order number, commodity code, commodity name, total CNY price, quantity and unit are irrelevant to the experiment purpose, so they are deleted.
Step 3: Process the outliers. By analyzing the dataset, it is found that the data range is between 2 and 14,000, which is a relatively large span. Moreover, only a small part of the data lies between 20 and 14,000. This is because the NRCDP is particularly high if the number of individual batches declared at customs is smaller than that of normal batches during the final order processing. These NRCDPs cannot reflect the fluctuation rule of natural rubber prices in the market. Therefore, the NRCDPs larger than 20 are filtered out first. For a clear view, the distribution of the remaining data is shown in
Figure 3. As can be seen from
Figure 3, the NRCDP is mainly concentrated around 10–13 Yuan. The division of price ranges for the entire dataset is shown in
Table 3. Therefore, 3475 NRCDPs between 6 and 14 Yuan are chosen as the basic data in this paper.
Step 4: Calculate the average value of NRCDPs every day as a customs declaration price sample. Thus, a new NRCDP dataset is established. The NRCDP dataset information is shown in
Figure 4.
Due to the occasional absence of import transactions for natural rubber products (400122), date discontinuity exists in the established dataset. From 17 March 2021 to 13 September 2021, the NRCDP dataset includes 141 records of NRCDP. Every record contains two attributes (date and price). In this paper, the NRCDP dataset is divided into two parts, the first 2/3 part is selected as the training set, and the last 1/3 part is selected as the test set.
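The four preprocessing steps above can be sketched as follows; the record layout and field names are illustrative, since the real Qingdao Customs schema is not reproduced here, and the 6–14 Yuan filter combines Steps 3 and the range selection into one check.

```python
from collections import defaultdict

def build_nrcdp_dataset(records):
    """records: (date, category, price) tuples.  Keep only category '400122'
    (Step 1), drop prices outside the 6-14 Yuan range kept in the paper
    (Step 3 and Table 3), then average the prices of each day (Step 4)."""
    daily = defaultdict(list)
    for date, category, price in records:
        if category != "400122":
            continue
        if not (6 <= price <= 14):
            continue
        daily[date].append(price)
    dates = sorted(daily)
    return [(d, sum(daily[d]) / len(daily[d])) for d in dates]

def train_test_split(samples):
    """First 2/3 of the samples for training, last 1/3 for testing."""
    cut = (2 * len(samples)) // 3
    return samples[:cut], samples[cut:]
```

Each element of the result carries the two attributes of a record in the established dataset (date and average price), and the split mirrors the 2/3 vs. 1/3 division used in the paper.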
6. W-GA-BPG Model Construction
In the prediction model, wavelet transform is first carried out to obtain the LFS and HFSs of the NRCDP. Secondly, one BP neural network is established for each sequence; GA is used to optimize each BP neural network, and the corresponding LFS and HFS predictions are obtained, respectively. Finally, the final prediction result is obtained by wavelet reconstruction. The algorithm is shown in
Figure 5.
The detailed steps of the algorithm are as follows:
Step 1: Obtain the wavelet decomposition sequences. A group of sequences is obtained by wavelet decomposition, which contains n HFSs (D1, D2, …, Dn) and an LFS (An). The LFS clearly reflects the coarse trend of the NRCDP, which helps the following BP neural network capture the trend of the data. The HFSs reflect the detailed information of the NRCDP, which helps the following BP neural networks fully capture the changeable characteristics of the data.
Step 2: Select the optimal BP neural network model. The node numbers of the input and hidden layer for the BP neural network are obtained by experiments. Based on that, the corresponding network parameters can be obtained by the same strategy.
Step 3: Establish a Group of BP neural networks (BPG) optimized by GA. Since the feature sequences with different scales obtained by wavelet decomposition have different characteristics, the prediction performance using a single BP neural network cannot reach the expected effect. In addition, a BP neural network may fall into a local minimum in training. To solve these problems, a BPG optimized by GA (GA-BPG) is established. The network in the group is denoted as GA-BP1, GA-BP2 … GA-BPn, GA-BPn+1 (n is the decomposition level). A group of initial weights and biases are randomly assigned to the BP neural network, and the initial population is generated randomly. Then the weights and biases are updated by a series of genetic operations (selection, crossover and mutation).
Step 4: Obtain the wavelet decomposition prediction sequences. Each sequence is input into the corresponding GA-BP neural network, and the prediction sequence is obtained.
Step 5: The final prediction result of NRCDP is obtained by wavelet reconstruction.
6.1. The Wavelet Decomposition Function and Level Selection
To select the wavelet function, the prediction results of the BP neural network with different decomposition levels using the db4 and db6 wavelet functions are compared; the MSE is shown in
Table 4. It can be seen from the table that the MSEs obtained with two and three decomposition levels using the db4 wavelet function are 0.0742 and 0.1094, respectively; the MSE of two decomposition levels is 0.0352 lower than that of three. The MSEs obtained with two and three decomposition levels using the db6 wavelet function are 0.1164 and 0.0163, respectively; the MSE of three decomposition levels is 0.1001 lower than that of two. The MSE obtained with three decomposition levels and the db6 wavelet function is 0.0579 lower than that with two decomposition levels and the db4 wavelet function. In conclusion, the db6 wavelet function with three decomposition levels has the smallest MSE and the best effect.
The feature sequences with different scales by db6 wavelet function and three decomposition levels are shown in
Figure 6. It can be seen from the figure that the LFS
A3 ranges from 0 to 11.5, which reflects the main features of the original sequence. The range of HFSs
D1,
D2 and
D3 are −1 to 1, −0.2 to 0.2, −0.6 to 0.4. With the increase in the number of decomposition levels, the value of the HFS becomes smaller and smaller, which reflects the details of the original sequence in different decomposition levels in turn.
To sum up, the db6 wavelet function and three decomposition levels are selected to obtain four feature sequences (D1, D2, D3 and A3) in this paper.
6.2. The BP Neural Network Parameters Selection
A three-layer BP neural network structure, including an input layer, a hidden layer and an output layer, is adopted after analyzing the NRCDP data. It is very important to determine the numbers of input nodes and hidden nodes in the BP neural network. Since no clear formula for determining them exists in current studies, they are determined by experiments in this paper.
According to the experience of previous experiments, the number of input nodes is set between 3 and 10 and the number of hidden layer nodes is set between 3 and 15. Other initial network parameters are set as shown in
Table 5. The experimental results are shown in Table 6, where IN represents the number of input nodes and HN the number of hidden-layer nodes.
As shown in
Table 6, the values in each column fluctuate rather than increase or decrease monotonically. When the number of input nodes is 7 and the number of hidden-layer nodes ranges from 3 to 15, the MSE is 0.0386, 0.0417, 0.0476, 0.0453, 0.0565, 0.0319, 0.0737, 0.0691, 0.1456, 0.0847, 0.0854, 0.0654 and 0.6566, respectively; the MSE values show no monotonic trend with the number of hidden-layer nodes. When the number of input nodes is 7 and the number of hidden-layer nodes is 8, the MSE reaches its minimum of 0.0319. Thus, a 7-8-1 three-layer BP network structure is adopted in this paper.
Based on the above structure, different learning rates are tried. Setting the learning rate as 0.1, 0.01 and 0.001, respectively, the corresponding results are shown in
Table 7. It can be observed that the MSE is smallest when the learning rate is 0.01. Thus, the learning rate is set to 0.01 in this paper.
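The parameter selection in this subsection amounts to an exhaustive grid search; a generic sketch is shown below, where the `evaluate` callback is a hypothetical stand-in for training a three-layer BP network with the given node numbers and returning its test MSE.

```python
def grid_search(evaluate, input_nodes, hidden_nodes):
    """Return the (IN, HN) pair with the lowest MSE, as in Table 6.
    evaluate(ni, nh) is assumed to train a network and return its MSE."""
    best_score, best_pair = float("inf"), None
    for ni in input_nodes:
        for nh in hidden_nodes:
            score = evaluate(ni, nh)
            if score < best_score:
                best_score, best_pair = score, (ni, nh)
    return best_pair, best_score
```

The same loop, with the node numbers fixed at 7 and 8, can sweep the learning rate (0.1, 0.01, 0.001) or, in Section 6.3, the crossover and mutation probabilities.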
6.3. The GA Parameters Selection
The initial parameter Settings of the GA are shown in
Table 8.
In the GA, the crossover probability (CP) and mutation probability (MP) are very important for the convergence of the BP network. The CP and MP are each set to 0.1, 0.2, …, 0.9. The prediction effectiveness is shown in
Table 9. When the CP is fixed, for example at 0.2, the MSE values do not increase or decrease monotonically as the MP increases. By comparing the data, the prediction and optimization effect of the BP neural network is best when the CP is 0.2 and the MP is 0.5.
Based on the above analysis, the BP neural network is set as a three-layer neural network structure with a network topology of 7-8-1, and the learning rate is 0.01. The CP and MP of the GA are set as 0.2 and 0.5.
6.4. The GA_BPG Parameters Selection
For the components with different characteristics obtained by wavelet decomposition, it is necessary to establish BPG to predict them, respectively. Based on the above analysis, this paper selects the db6 wavelet function to decompose the NRCDP data at three levels to obtain four sub-sequences
D1,
D2,
D3 and
A3. Then, a GA-BPG is established, including four three-layer BP neural networks optimized by GA (GA-BP1, GA-BP2, GA-BP3 and GA-BP4), each with a 7-8-1 network topology. The network parameters are the same as in the above analysis. Finally, the sub-sequences are predicted, respectively. By experiments similar to those in
Section 6.3, the CPs and MPs of the GAs used to optimize the four BP neural networks are selected, as shown in
Table 10.
6.5. Comparative Analysis with Other Predictive Models
The prediction results are evaluated using the MSE and the coefficient of determination (R2): MSE = (1/n)·Σi (yi − ŷi)² and R2 = 1 − Σi (yi − ŷi)²/Σi (yi − ȳ)², where yi is the real value of the NRCDP, ŷi is the predicted value of the NRCDP, ȳ is the average of the real NRCDP values, and n is the length of the predicted sample.
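The two metrics can be computed directly from their definitions:

```python
def mse(y_true, y_pred):
    """Mean squared error between real and predicted NRCDP values."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives MSE = 0 and R2 = 1; an R2 near 0 means the model does no better than predicting the average price.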
6.5.1. The Other Predictive Model Parameter Selection
To evaluate the proposed method, W-GA-BPG is compared and analyzed with LSTM, the least square method, BP neural network, GA-BP, wavelet decomposition combined with BP neural network (W-BP) (The overall framework proposed in [
25] was used in the experiment), and wavelet decomposition combined with a single GA-BP (W-GA-BP).
Each model has the same structure with seven input nodes and one output node. The input of the BP neural network and GA-BP model is the NRCDP data without wavelet decomposition. The number of nodes in the middle layer of the LSTM model is 16, the learning rate is 0.01, and the maximum number of iterations is 50. The parameters used in BP neural networks, GA-BP, W-BP and W-GA-BP models are the optimal results selected by the above experiments.
6.5.2. Prediction Effect Comparison
The MSE and R2 are used to evaluate the models. The predicted result and error of each model are shown in Figure 7 and Figure 8. The MSE and R2 between the true and predicted values of the different models are shown in Table 11.
According to Figure 7, Figure 8 and Table 11, the results using W-GA-BPG have the highest accuracy compared with the other models, with MSE and R2 values of 0.0043 and 0.9302, respectively. The MSEs of LSTM, the least square method and the BP neural network are 0.0365, 0.0328 and 0.0319, respectively, and their R2 values are 0.4092, 0.4694 and 0.4829, respectively. Therefore, among the three traditional models (LSTM, the least square method and BP), the BP neural network achieves the highest accuracy. The MSE values obtained using the GA-BP and W-BP models are 0.0242 and 0.0163, respectively, and their R2 values are 0.6087 and 0.7325, respectively. Compared with the traditional BP neural network, combining it with GA or with wavelet decomposition successively improves the prediction accuracy. The MSE and R2 of W-GA-BP are 0.0061 and 0.9032, respectively; its MSE is smaller than those of GA-BP and W-BP, and its R2 is larger. Overall, the highest accuracy is achieved by the proposed W-GA-BPG model, which can predict the NRCDP more accurately.
To sum up, the prediction accuracy of a BP neural network enhanced by wavelet decomposition or GA alone is lower than that of the proposed method. The combined method can not only extract the trend and details of the original sequence but also optimize the weights of the neural network using GA. It provides a reference for early warning of the NRCDP, which is of great significance for macro-control of the natural rubber market.
7. Conclusions
In this paper, we have established the NRCDP dataset and proposed a prediction model, W-GA-BPG, which is based on wavelet decomposition and a BP neural network group optimized by GA. To obtain the general trend and detailed information of the NRCDP, the db6 wavelet function is used to decompose the NRCDP at three levels. To solve the problem that BP neural network training may fall into a local minimum and that the obtained sub-sequences differ in distribution and value, a GA-optimized BP neural network group is proposed to predict the sub-sequences, respectively. The proposed model can fully capture the information of the NRCDP at different scales and avoid the possibility of the BP neural network falling into a local minimum. Experimental results show that, compared with the other models, the proposed W-GA-BPG achieves the best performance, with the smallest error (0.0043) and the highest coefficient of determination (0.9302). The proposed model is applied to the prediction problem in the Research on Building Digital Ecology of Qingdao Shipping Trade Finance Based on Blockchain. It can then provide early warning of natural rubber market price fluctuations.
The proposed model for predicting the NRCDP only considers the impact of historical price data on future prices; other factors, such as exchange rates and economic indices, are not considered. To further improve the performance of the prediction model, more factors influencing the NRCDP will be considered in our future work. Moreover, the proposed model has only been validated on NRCDP prediction, and its wider applicability should be further verified.