Regional Logistics Express Demand Forecasting Based on Improved GA-BP Neural Network with Indicator Data Characteristics

Ma, Feihu; Wang, Shuhan; Xie, Tianchang; Sun, Cuiyu

doi:10.3390/app14156766


¹ School of Transportation Engineering, East China Jiaotong University, Nanchang 330013, China

² School of Materials and Energy, Guangdong University of Technology, Guangzhou 510006, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6766; https://doi.org/10.3390/app14156766

Submission received: 9 July 2024 / Revised: 28 July 2024 / Accepted: 30 July 2024 / Published: 2 August 2024

(This article belongs to the Section Computing and Artificial Intelligence)


Abstract

In the current era, the government consistently emphasizes the pursuit of high-quality development, as evidenced by the ongoing increase in the tertiary industry’s GDP share. As a crucial component of the modern service sector, logistics plays a pivotal role in determining the operational efficiency and overall quality of the industrial economy. This study focuses on constructing a Chongqing logistics express demand prediction index system. It employs an improved BP neural network model to forecast the logistics express demand for Chongqing over the next five years. Given the limited express demand data sequence and the normalized characteristics of the data, the selected training method is the Bayesian regularization approach, with the LeCun Tanh function serving as the hidden layer activation function. Additionally, a genetic algorithm is designed to optimize the initial weights and thresholds of the BP neural network, thereby enhancing prediction accuracy and reducing the number of iterations. The experimental results of the improved GA-BP network are analyzed and compared, demonstrating that the improved BP neural network, utilizing GA optimization, can more reliably and accurately predict regional logistics express demand. According to the findings, the forecast indicates that the logistics express demand for Chongqing in 2026 will be 2,171,642,700 items.

Keywords:

region logistics development; demand forecasting; logistics planning; neural network optimization; genetic algorithm

1. Introduction

As China continues to optimize and upgrade its industrial structure, the proportion of tertiary industry in its GDP is steadily increasing. Concurrently, advancements in mobile Internet technology have fueled rapid growth in the modern service sector within the tertiary industry. Notably, the logistics industry plays a critical role in ensuring the effective circulation of production materials and consumer goods within urban operations, thereby supporting the normal functioning of urban industries, trade, and various service sectors. Furthermore, logistics enhances the efficiency of urban industrial operations through rational resource allocation and management. It also eliminates geographical constraints, fosters inter-regional economic connections and resource flows, and facilitates the process of regional economic integration.

In 2022, China’s total social logistics reached 347.6 trillion yuan, a 3.4% year-on-year increase, and logistics demand continued to grow steadily. The total cost of social logistics was 17.8 trillion yuan, a 4.4% year-on-year rise. Notably, the ratio of total social logistics costs to GDP stood at 14.7%, significantly higher than the approximately 7% observed in developed countries. The evolution of mobile Internet technology has led to the emergence of online trading platforms and to rapid growth in China’s e-commerce market, which is now the world’s largest. Within this expansive market, logistics costs are substantial, and reducing them is a crucial focus for the logistics industry’s development.

Japanese scholar Nishizawa Xiu has proposed the “third source of profit” doctrine, which highlights that in modern society, as industrial operation systems become more comprehensive and market competition intensifies, there is a limit to expanding profits through industrial development alone. Effectively reducing logistics costs, which constitute a significant portion of the total cost, equates to increasing profits, thus representing the third source of profit. Logistics costs are mainly divided into three major categories, namely transport costs, warehousing costs, and management costs. The reduction of logistics costs is of great significance in enhancing industrial efficiency, optimizing supply chain management, and promoting economic development.

Ding and Zhao [1] observed that the cost challenges associated with e-commerce logistics were further exacerbated during the COVID-19 pandemic. To effectively control logistics costs, it is essential to enhance information exchange among suppliers, manufacturers, third-party logistics service providers, shipping companies, and other stakeholders involved in the distribution chain. Therefore, the forecasting of transaction demand and logistics costs has become increasingly critical. Gülsah Şişman [2] proposed the implementation of a Lean Six Sigma (LSS) framework within the supply chain, exploring strategies to reduce logistics costs through practical case studies. The findings indicated that a primary reason for the rise in logistics costs was insufficient communication among different departments within the supply chain. Future research directions should focus on the development of a unified production plan and the enhancement of integrated functions, with logistics demand forecasting playing a vital role in the formulation of this production plan. Winkelhaus and Grosse [3] introduced a framework for a Logistics 4.0 system grounded in the principles of Industry 4.0. In their article, Logistics 4.0 is defined as a logistics system designed to meet individual customer needs in a cost-neutral manner, leveraging digital technologies to facilitate this development in both industry and trade. Emerging trends such as mass customization and sustainability have increased the complexity and demand for logistics systems. Effectively managing this complexity necessitates the development of advanced planning and control instruments.

Currently, problems caused by unreasonable logistics planning, such as uneven spatial development, inadequate logistics infrastructure, and a mismatch between urban transportation capacity and freight volume growth, have become increasingly prominent, resulting in high logistics costs and difficulty in improving logistics efficiency. Logistics infrastructure such as logistics hubs, logistics centers, and distribution centers is a long-term fixed-asset investment. If its planning is not supported by logistics demand data consistent with the region’s future economic development, the existing logistics service system will not match the region’s future logistics demand, infrastructure will be built repeatedly, and logistics supply and demand will fall out of balance, all of which raise logistics costs.

In this paper, we review a substantial body of literature and employ grey correlation analysis to establish a logistics express demand impact indicator system for Chongqing. We then normalize the extracted indicator data and apply a BP neural network, with Bayesian regularization as the training method and LeCun Tanh as the activation function, to predict the demand for logistics express services in Chongqing over the next five years. To address the issue of training exceeding the maximum number of iterations, we design a genetic algorithm to optimize the initial weights and thresholds of the BP neural network.

The main research contributions of this paper are as follows:

Influencing factors reported in the literature were comprehensively considered to establish the logistics express impact indicator system for Chongqing. In addition, tertiary industry and related indicators were added to link the planning of the logistics express industry with the development of the tertiary industry. The strong correlation between the tertiary industry and the logistics express industry is verified in this paper.
When short data series, such as indicators of logistics demand influencing factors, are used as input to a neural network, it should be considered whether the training algorithm will cause the network to overfit. The BR training algorithm used in this paper can, to a certain extent, avoid overfitting when predicting from short data series.
Selecting a hidden-layer activation function that suits the data features can improve prediction accuracy. In this research, the LeCun Tanh function was successfully applied to counteract vanishing gradients in the training of the backpropagation neural network.
A genetic algorithm is developed to optimize the initial weights and thresholds of the neural network when extended iterations lead to unstable prediction accuracy. This is intended to speed up iteration and improve the stability of the network.

The following is the arrangement for the rest of the paper: Section 2 provides an overview and summary of the relevant literature on the logistics demand impact indicator system and logistics demand forecasting. Section 3 describes the methods and models used in this study and highlights the improved parts of the paper and the research line. Section 4 compares and analyses the application of the previously introduced methods and models in the example. The optimal model is also chosen to forecast the logistics express demand in Chongqing. Section 5 summarizes the research in this paper and gives an outlook for future research given the limitations of this paper.

2. Literature Review

2.1. Impact of Tertiary Industry Development on the Logistics Express Industry

Guo et al. [4] noted that since China acceded to the World Trade Organization at the beginning of the 21st century, the country has seen a significant increase in opportunities to engage in international export trade, gradually establishing itself as a major exporting nation. This rapid economic development has led to substantial changes in the industrial structure, with the share of the tertiary sector in China’s output rising from 36.2 percent in 1998 to 44.6 percent in 2012. As China enters a new phase of economic development, the government has emphasized the importance of achieving high-quality economic growth through the optimization of its industrial structure.

Li [5] argues that as the proportion of the tertiary industry within the national economy continues to rise, its internal structure is undergoing significant changes. The shares of both living services and production services have increased, alongside a growing number of modern service industries that have been enhanced through the application of advanced industrial information technologies, notably logistics and distribution services. Many regions are prioritizing the logistics industry as a key sector in their plans for developing the tertiary industry. However, the advancement of the modern logistics industry is inextricably linked to the support of transportation networks, professional infrastructure, and effective management conditions. Therefore, regional governments must not neglect the foundational environmental conditions necessary for the growth of the logistics sector. Additionally, because logistics is a service-oriented industry, the demand for logistics services directly influences the trajectory of its development. Li et al. [6] analyzed the spatial distribution of industries in China and concluded that the long-term distribution of employment in the tertiary industry shows a trend of concentration toward large cities. Yuan et al. [7] studied the spatial distribution characteristics of the tertiary industry in the Pearl River Delta city cluster using the Dagum Gini coefficient, kernel density estimation, and local spatial auto-regression, among other methods. Their results showed that the tertiary industry agglomeration effect was most obvious in Guangzhou and Shenzhen, with a higher degree of agglomeration of consumptive and productive tertiary industries. These studies show that the development of the tertiary industry across China’s large and medium-sized cities is unbalanced, and that the development of logistics can strengthen resource flows and links between city networks. Attention to the logistics industry is therefore indispensable to the future development of the tertiary industry in China.

In general, China’s tertiary industry and the logistics and express delivery sector exhibit a complementary relationship. The advancement of the tertiary industry facilitates the transformation and upgrading of the logistics and express delivery sector by providing essential foundational elements, such as transport infrastructure, regional logistics demand, and a skilled workforce. Conversely, the growth of the logistics and express delivery industry lays the groundwork for the structural optimization of tertiary industry, thereby enhancing its efficiency and responsiveness to market demands.

2.2. System of Indicators Related to the Impact of Logistics Demand

The accuracy of logistics and express demand forecasting depends largely on whether the logistics and express demand impact indicator system is reasonably constructed and whether the indicator data are handled appropriately. Numerous scholars have proposed diverse logistics demand impact indicator systems according to the economic development of the study area and the industries of interest. Xiaoyan Xu et al. [8] constructed a market demand indicator system for fresh agricultural products while researching logistics demand in Shandong Province, China. Based on the current state of the local logistics economy and the three principles of practicability, importance, and scientificity, they attributed the factors affecting the logistics demand of agricultural products to six major aspects. Among these, indicators such as freight turnover and Internet development are verified in this paper to be widely applicable to the construction of urban logistics and express demand impact indicator systems. In their study of the postal industry and the express delivery sector, Nataša Čačić et al. [9] employed the theory of price elasticity to analyze the influence of price changes on the demand for express and parcel services and the subsequent impact on the profitability of postal operators. Their study revealed that indicators associated with the postal sector also significantly influence demand within the express sector. Li and Ye [10] studied the demand for cold chain logistics of agricultural products against the background of the “double cycle” and designed a system of influencing indicators for cold chain logistics market demand in China. They selected six typical data indicators from four dimensions, namely product supply, economy and society, residents’ consumption, and industry scale, and analyzed their importance using the grey correlation degree method. Among these, the tertiary sector contribution rate and other indicators were tested in this paper and adopted in the impact indicator system for urban logistics express demand. Xiangyang Ren et al. [11] analyzed the intrinsic correlation between potential factors and regional demand for cold chain logistics of agricultural products and found that the proportion of the tertiary industry, the index of disposable income per capita of urban households, and the index of total prices of agricultural products were the top three influencing factors. Xin Zou et al. [12] constructed an evaluation index system for urban logistics competitiveness (ULC) using 18 cities in Sichuan Province, China, as a case study; the system includes factors such as gross regional product (GRP), total freight volume, and postal service income. Its assessment system is divided into three levels: economic development, logistics industry scale, and informatization level. Among these, total freight volume and freight turnover at the logistics industry scale level have been shown to have a significant impact on regional logistics express demand.

2.3. Demand Forecasting Methodology

In the realm of logistics demand forecasting, early approaches relied on grey model forecasting, time series analysis, and linear regression methods, often treating the volume of transported goods as the primary indicator for logistics demand prediction. These methods typically operated within relatively narrow data dimensions. However, with the emergence of artificial intelligence, machine learning, and related technologies, researchers have increasingly gravitated towards the utilization of neural networks and the integration of these with traditional methods to forecast regional logistics demand. This trend has involved the incorporation of more comprehensive indicator data, leading to continual improvements in prediction accuracy. Nevertheless, it is important to note that neural network methods present unique challenges such as overfitting and gradient disappearance.

Wu Huirong et al. [13] presented a method to forecast the transportation demand for large quantities of goods by leveraging the production and transport coefficient. They also outlined a strategy for adjusting the transportation infrastructure based on the evolving trends of these goods. Through a case study in Heilongjiang Province, the authors took into account factors such as fertilizer usage, rural electricity consumption, total power of agricultural machinery, and the area of grain crops in their prediction of grain output. To this end, they developed GM(1, 1) and GM(1, 1)-MLP hybrid neural network models, which were then validated using real-world data. Minling Zeng et al. [14] established a regional logistics demand prediction indicator system and used the GM(1, 1) grey prediction model with a weakened Boolean operator to predict regional rural logistics demand. Xiangyang Ren et al. [11] analyzed the intrinsic correlation between potential factors and the regional cold chain logistics demand of agricultural products and then established a hierarchical cumulative grey model GM(1, N) to predict the cold chain logistics demand of the study area over the next five years. Lijuan Huang et al. [15] constructed a regional logistics demand forecasting indicator system and classified 12 indicators into three levels. They analyzed the coupling strength between the indicators at each level and logistics demand, reduced the dimensionality of the indicators at the three levels using principal component analysis (PCA), fed the resulting three principal components into a BP neural network for model training, and finally predicted the logistics demand of the study area over the next three years. After analyzing the factors affecting logistics demand, Nan Yu et al. [16] took into account that logistics demand data series are non-linear and small-sample, and, modeling from the perspective of urban freight transport volume, introduced the ant colony algorithm to optimize the penalty parameters “c” and “g” of the radial basis function in the support vector machine. The optimized support vector machine model was used to forecast the logistics demand in the study area. Zhang Jiaojiao [17] suggested integrating adaptive learning rates and momentum terms into the gradient descent technique used in BP neural networks to improve the speed of convergence. Furthermore, adjustments were made to the network architecture to guarantee the stability of the model. Hongpeng Guo et al. [18] designed a multi-layer perceptron (MLP) neural network model based on the simulation platform of the MLP neural network according to the matrix–vector multiplication operation and weight-updating operation. On the basis of the standard MLP neural network, they proposed a method to improve the network using the deep learning training mechanism, choosing the RBF function as the kernel function of the model and optimizing the parameter combination with the PSO algorithm. Ma and Luo [19] combined the logistic regression algorithm with an improved neural network algorithm to create a model for forecasting logistics demand. They further optimized the ladder network structure to better suit the unique features of the data mining assignment, combining its decision-making component with the shallow network to boost the model’s effectiveness in logistics demand prediction. Huang et al. [20] combined the grey prediction model and the BP neural network model and designed the GM-BPNN algorithm to solve the regional logistics demand forecasting problem, which is characterized by mutability, instability, and non-linearity. The results proved that the combined model integrated the advantages of the two algorithms and compensated for the shortcomings that were prominent in the forecasting data. To address the slow convergence and low accuracy of traditional neural networks, Hao and Zou [21] optimized the initial weights and thresholds of the BP neural network model using an improved particle swarm algorithm and applied the model to forecast the logistics demand of fresh agricultural products in Shanghai; the experimental results proved that the model converged faster and with higher accuracy. Liu et al. [22] combined Lasso regression forecasting with polynomials to forecast future natural gas demand in China, built a combined forecasting model, and optimized the weights of the combined model. The experimental results showed that the model increased the dimensionality of the data and had higher forecasting accuracy than the benchmark model. Rao et al. [23], in a study of carbon emission forecasting in Hubei Province, introduced a ridge regression model to regress the existing model in order to avoid the interference of multicollinearity and retain the key information in the data to the greatest extent.

From the existing research, we found that logistics demand impact indicators differ from research indicators with large data volumes: the data series generally contain only about 20 items (one per year, i.e., 20 years). Few researchers have paid attention to the characteristics of logistics demand impact indicator data or selected the neural network activation function and training algorithm to suit those characteristics. The goal of this study is to discuss how to select an activation function and training algorithm suited to the characteristics of the indicator data, and to optimize the improved model according to its shortcomings in order to improve the prediction accuracy and stability of the neural network.

3. Methodology and Modelling

3.1. Grey Correlation Analysis

The concept of grey correlation analysis is included in the grey system control theory proposed by Chinese scholar Deng [24] and is a multi-factor statistical analysis method used to analyze the extent to which a certain item in a grey system is affected by a variety of other influencing factors in the system. This statistical model has been used by scholars in research fields such as battery energy performance analysis [25], civil engineering structure analysis [26], and coordination analysis of regional economy and logistics industry [27]. In this paper, Deng’s grey correlation is used to analyze the indicators related to logistics express demand. The steps are as follows:

Step 1: In this research, the regional express transaction volume acts as the reference sequence used to assess the entire system. The comparison set includes 18 pertinent metrics chosen from the existing literature to represent the data sequences that influence the system’s performance.
Step 2: Before the grey correlation analysis, the data need to be standardized. The standardization method used in this paper is the Z-score method. The main formulas are as follows:

For a sequence $y_{1}, y_{2}, \dots, y_{n}$ to be standardized, the standardized values $x_{i}$ are

$$x_{i} = \frac{y_{i} - \bar{y}}{s} \tag{1}$$

$$\bar{y} = \frac{1}{n} \sum_{i = 1}^{n} y_{i} \tag{2}$$

$$s = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}} \tag{3}$$
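As a minimal sketch of Equations (1)–(3), the standardization can be written in a few lines of Python. The function name `z_score` and the sample values are illustrative assumptions, not data or code from the paper.

```python
import math

def z_score(values):
    """Standardize a sequence using the sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n                                            # Eq. (2)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))     # Eq. (3)
    return [(v - mean) / s for v in values]                           # Eq. (1)

# Hypothetical yearly values of a single indicator (illustrative only).
raw = [2.0, 4.0, 6.0, 8.0]
standardized = z_score(raw)
```

After standardization the sequence has zero mean and unit sample variance, which puts indicators with different units on a comparable scale before the correlation step.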

Step 3: Calculate the point correlation coefficients and the indicator series correlations.

The formula is as follows [28]:

$$\gamma(x_{0}(k), x_{i}(k)) = \frac{\min_{i} \min_{k} | x_{0}(k) - x_{i}(k) | + \xi \max_{i} \max_{k} | x_{0}(k) - x_{i}(k) |}{| x_{0}(k) - x_{i}(k) | + \xi \max_{i} \max_{k} | x_{0}(k) - x_{i}(k) |} \tag{4}$$

$$\gamma(X_{0}, X_{i}) = \frac{1}{n} \sum_{k = 1}^{n} \gamma(x_{0}(k), x_{i}(k)) \tag{5}$$

where $\gamma(x_{0}(k), x_{i}(k))$ denotes the point correlation coefficient, $\gamma(X_{0}, X_{i})$ denotes the indicator series correlation, and $\xi$ is the distinguishing coefficient.
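The calculation in Equations (4) and (5) can be sketched as follows. This is an illustrative implementation only: the function name `grey_relational_grades`, the default $\xi = 0.5$ (a commonly used value, assumed here because the text does not state one), and the toy sequences are all assumptions. The inputs are assumed to be already standardized.

```python
def grey_relational_grades(reference, comparisons, xi=0.5):
    """Return the series correlation gamma(X0, Xi) for each comparison sequence."""
    # Absolute differences |x0(k) - xi(k)| for every sequence i and point k.
    deltas = [[abs(r - c) for r, c in zip(reference, seq)] for seq in comparisons]
    d_min = min(min(row) for row in deltas)   # min over i and k
    d_max = max(max(row) for row in deltas)   # max over i and k
    if d_max == 0.0:
        # Every comparison sequence coincides with the reference.
        return [1.0] * len(comparisons)
    grades = []
    for row in deltas:
        # Point correlation coefficients, Eq. (4).
        coeffs = [(d_min + xi * d_max) / (d + xi * d_max) for d in row]
        # Series correlation as their mean, Eq. (5).
        grades.append(sum(coeffs) / len(coeffs))
    return grades

# Hypothetical standardized sequences (illustrative only).
reference = [0.0, 1.0, 2.0]
comparisons = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
grades = grey_relational_grades(reference, comparisons)
```

A sequence that tracks the reference exactly receives a grade of 1, while sequences that diverge receive lower grades, so the grades can be used to rank candidate indicators.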

3.2. Factor Analysis for Dimensionality Reduction

Factor analysis (FA) is a statistical method that studies the relationships between variables by reducing their number in order to reveal the underlying structure of the original variables. It aims to identify the latent factors behind the observed variables that explain the correlations observed among them. Factor analysis has been used in fields such as psychology [29], ecology [30], and animal husbandry [31].

The main formulas are as follows:

$$\begin{aligned} x_{1} &= a_{11} F_{1} + a_{12} F_{2} + \dots + a_{1m} F_{m} \\ x_{2} &= a_{21} F_{1} + a_{22} F_{2} + \dots + a_{2m} F_{m} \\ &\;\;\vdots \\ x_{p} &= a_{p1} F_{1} + a_{p2} F_{2} + \dots + a_{pm} F_{m} \end{aligned} \tag{6}$$

where $x_{1}, x_{2}, \dots, x_{p}$ denote the normalized raw data series and $F_{1}, F_{2}, \dots, F_{m}$ are the common factors, whose exact meaning needs to be interpreted according to the specific problem. The matrix of coefficients $a_{ij}$ of these linear combinations is called the loading matrix.

In this paper, we mainly use SPSS to perform factor analysis on the selected logistics express demand impact indicators and select the number of dimensionality reduction factors based on the results of the analysis.
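To make the structure of Equation (6) concrete, the sketch below evaluates the model for a hypothetical loading matrix with three indicators and two common factors. It does not reproduce the SPSS analysis; the loadings, factor scores, and function names are invented for illustration. The communality shown (the row sum of squared loadings) is a standard factor-analysis quantity that assumes uncorrelated, unit-variance factors and is not discussed in the text above.

```python
def reconstruct(loadings, factors):
    """Evaluate Eq. (6): x_j = sum_k a_jk * F_k for every indicator j."""
    return [sum(a * f for a, f in zip(row, factors)) for row in loadings]

def communalities(loadings):
    """Share of each indicator's variance carried by the common factors."""
    return [sum(a * a for a in row) for row in loadings]

# Hypothetical loading matrix: p = 3 indicators, m = 2 common factors.
loadings = [
    [0.90, 0.10],
    [0.80, 0.30],
    [0.20, 0.95],
]
factor_scores = [1.0, -0.5]   # hypothetical values of F1 and F2 for one year

x = reconstruct(loadings, factor_scores)
h2 = communalities(loadings)
```

Here the first two indicators load mainly on the first factor and the third on the second, which is the kind of pattern used to decide how many factors to retain.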

3.3. BP Neural Network

Backpropagation of the prediction error is the core concept of the BP neural network, proposed by Rumelhart [32] in the 1980s. The method solves the problem of adjusting the weights and biases during neural network training. Overall, the BP neural network is a feed-forward neural network whose structure is divided into an input layer, one or more hidden layers, and an output layer. The neurons in each layer are fully connected to all neurons in the next layer. The structure of a BP neural network with three input neurons, two output neurons, and one hidden layer is shown in Figure 1.

The following highlights the two aspects in which this paper improves the BP neural network based on the characteristics of the logistics express demand indicator data.

The main roles of activation functions in BP neural networks are:

Introducing the property of non-linearity enhances the fitted representation of the neural network.
In the error backpropagation process of the BP neural network, the partial derivatives of the activation function are used to determine the direction of error descent, which then guides the updating of the network weights and biases.
Choosing an activation function that is suitable for the characteristics of the input data can effectively avoid the problems of gradient vanishing and gradient explosion during the fitting process of the neural network and improve the overall learning fitting effect of the network.

Several common activation functions and their derivatives are shown in Figure 2.
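For reference, the sketch below defines a few widely used activation functions together with their derivatives. The selection (sigmoid, tanh, ReLU, LeCun Tanh) is an assumption and may not match the set plotted in Figure 2. The LeCun Tanh constants 1.7159 and 2/3 follow the commonly cited scaled form $f(x) = 1.7159 \tanh(2x/3)$; the paper's exact parameterization is not given in this section.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    return math.tanh(x)

def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2

def relu(x):
    return max(0.0, x)

def d_relu(x):
    return 1.0 if x > 0.0 else 0.0

def lecun_tanh(x):
    # Scaled tanh, chosen so that f(1) is approximately 1 and f(-1) approximately -1.
    return 1.7159 * math.tanh(2.0 * x / 3.0)

def d_lecun_tanh(x):
    return 1.7159 * (2.0 / 3.0) * (1.0 - math.tanh(2.0 * x / 3.0) ** 2)
```

With these definitions, the derivative of LeCun Tanh at zero is about 1.14, compared with 1.0 for tanh and 0.25 for the sigmoid, and it decays more slowly for inputs of moderate size. This is one way to see why it can ease vanishing gradients on normalized inputs.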

The main training algorithms for BP neural networks are:

3.3.1. Gradient Descent Algorithm

The gradient descent method is the basis of the BP neural network training algorithm; its core idea is to gradually adjust the parameters along the direction of the fastest decline of the objective function in order to achieve the purpose of minimizing the loss function. The specific formula is as follows:

$$L(\omega, b) = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - f(\omega x_{i} + b))^{2} \tag{7}$$

$$\omega_{1} = \omega_{0} - \eta \nabla L(\omega_{0}, b) \tag{8}$$

where $L(\omega, b)$ is the loss function, usually the mean square error (MSE), which measures the learning accuracy of the network model; $f(\omega x_{i} + b)$ is the predicted value; $\omega$ is the network propagation weight; and $b$ is the bias. $\omega_{1}$ is the updated network weight calculated from the current weight $\omega_{0}$ according to the gradient descent method, $\eta$ is the learning rate, and $\nabla L$ is the partial derivative of the loss function with respect to the weights or bias. The update is repeated during training until the loss falls below a set threshold or no longer changes.
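A minimal sketch of the update in Equations (7) and (8) is given below for a single neuron with an identity activation, so that the gradients can be written out by hand. The data, learning rate, tolerance, and function names are illustrative assumptions rather than the settings used in the paper.

```python
def mse(w, b, xs, ys):
    """Mean square error of Eq. (7) with f taken as the identity."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(w, b, xs, ys, lr):
    """One gradient descent update, Eq. (8), applied to both weight and bias."""
    n = len(xs)
    grad_w = sum(-2.0 * (y - (w * x + b)) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(-2.0 * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

def train(xs, ys, lr=0.1, tol=1e-10, max_iter=10000):
    """Iterate until the loss drops below tol or max_iter is reached."""
    w, b = 0.0, 0.0
    for _ in range(max_iter):
        if mse(w, b, xs, ys) < tol:
            break
        w, b = gradient_step(w, b, xs, ys, lr)
    return w, b

# Hypothetical data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
```

The stopping rule mirrors the text: training ends once the loss is below the threshold, with `max_iter` acting as a safeguard.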

3.3.2. Levenberg–Marquardt Algorithm (LM)

The Levenberg–Marquardt algorithm is an optimization algorithm designed to solve nonlinear least squares problems. It combines aspects of the Gauss–Newton method and the gradient descent method, balancing their characteristics through the introduction of a damping factor. The LM algorithm uses not only the gradient information of the objective function but also second-order derivative information (the Hessian matrix, approximated from the Jacobian), which makes it more robust and efficient in dealing with nonlinear problems. The specific formula is as follows:

$$(J_{k}^{T} J_{k} + \lambda_{k} I) d_{k} = - J_{k}^{T} F_{k} \tag{9}$$

Solving this equation gives the search direction $d_{k}$, where $\lambda_{k}$ is the damping factor, $F_{k} = F(x_{k})$ is the value at $x_{k}$ of the twice continuously differentiable function $F$, and $J_{k} = J(x_{k})$ is the Jacobian matrix of $F$ at $x_{k}$.
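The step in Equation (9) can be illustrated on a small curve-fitting problem. The sketch below fits $y = a e^{bx}$ with two parameters, so the linear system is 2×2 and can be solved by Cramer's rule. The model, data, starting point, and the rule of dividing or multiplying $\lambda$ by 10 after an accepted or rejected step are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def lm_fit(xs, ys, a=1.0, b=0.0, lam=1e-2, max_iter=200):
    """Fit y = a * exp(b * x) by repeatedly solving (J^T J + lam I) d = -J^T F."""
    def residuals(a, b):
        return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

    def cost(res):
        return sum(r * r for r in res)

    res = residuals(a, b)
    for _ in range(max_iter):
        if cost(res) < 1e-20:
            break
        # Jacobian rows: (dF/da, dF/db) for each sample.
        jac = [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]
        # Entries of J^T J + lam I and of the right-hand side -J^T F.
        m11 = sum(ja * ja for ja, _ in jac) + lam
        m12 = sum(ja * jb for ja, jb in jac)
        m22 = sum(jb * jb for _, jb in jac) + lam
        g1 = -sum(ja * r for (ja, _), r in zip(jac, res))
        g2 = -sum(jb * r for (_, jb), r in zip(jac, res))
        # Solve the 2x2 system with Cramer's rule.
        det = m11 * m22 - m12 * m12
        da = (g1 * m22 - m12 * g2) / det
        db = (m11 * g2 - m12 * g1) / det
        trial = residuals(a + da, b + db)
        if cost(trial) < cost(res):
            # Accept the step and move toward Gauss-Newton behaviour.
            a, b, res = a + da, b + db, trial
            lam /= 10.0
        else:
            # Reject the step and move toward gradient descent behaviour.
            lam *= 10.0
    return a, b

# Hypothetical noise-free data from y = 2 * exp(0.5 * x) (illustrative only).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = lm_fit(xs, ys)
```

A large $\lambda$ shortens the step and turns it toward the negative gradient, while a small $\lambda$ recovers the Gauss–Newton step, which is the balancing role of the damping factor described above.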

3.3.3. Bayesian Regularization (BR)

A neural network can be regarded as a conditional distribution model $P(Y \mid X, W)$, that is, the probability distribution of the predicted outcome $Y$ given that $X$ is the input and $W$ is the weight of the network. Bayesian regularization is a regularization method based on Bayesian statistical principles. It avoids overfitting by incorporating prior knowledge into the probability distribution of the model parameters. In Bayesian regularization, model parameters are treated as random variables that obey some prior distribution, and the posterior distribution of these parameters is updated by observing the data. The specific formula is as follows:

$$G = \varphi L(\omega, b) + \eta e_{\omega} \tag{10}$$

$$e_{\omega} = \frac{1}{n} \sum_{i = 1}^{N_{\omega}} W_{i}^{2} \tag{11}$$

where $n$ is the total number of samples in the selected training set; $e_{\omega}$ is the mean of the sum of squared network weights, i.e., the penalty term; $N_{\omega}$ is the number of weights; $W_{i}$ is the $i$th weight; and $\varphi$ and $\eta$ are the penalty factors (here $\eta$ is not the learning rate of Equation (8)). These parameters are calculated as follows:

φ = \frac{N_{ω} - γ}{2 L (ω, b)}

(12)

η = \frac{γ}{2 e_{ω}}

(13)

γ = N_{ω} - 2 η \cdot tr {(H)}^{- 1}

(14)

Here $\gamma$ is the number of effective weights and $H$ is the Hessian matrix of the loss function. During the training of the BP neural network, the Bayesian regularization algorithm adaptively adjusts $\varphi$ and $\eta$. If $\varphi > \eta$, the loss term $L$ in the objective function $G$ plays the greater role and the algorithm prioritizes training accuracy; if $\varphi < \eta$, the penalty term dominates and the algorithm keeps the effective weights of the network as few and as small as possible, so as to simplify the model and prevent overfitting.
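The regularized objective and the re-estimation of the penalty factors in Equations (10)–(14) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the numeric inputs are made up, and `trace_h_inv` stands for the trace term of Equation (14), assumed here to be the trace of the inverse Hessian and supplied by the caller rather than computed.

```python
def weight_penalty(weights, n_samples):
    """Equation (11): sum of squared weights divided by the sample count n."""
    return sum(w * w for w in weights) / n_samples


def br_objective(loss, weights, n_samples, phi, eta):
    """Equation (10): G = phi * L + eta * e_w."""
    return phi * loss + eta * weight_penalty(weights, n_samples)


def update_factors(loss, weights, n_samples, eta, trace_h_inv):
    """Equations (12)-(14): re-estimate gamma, phi and eta from the current state."""
    n_weights = len(weights)
    e_w = weight_penalty(weights, n_samples)
    gamma = n_weights - 2.0 * eta * trace_h_inv      # Equation (14)
    phi_new = (n_weights - gamma) / (2.0 * loss)     # Equation (12)
    eta_new = gamma / (2.0 * e_w)                    # Equation (13)
    return gamma, phi_new, eta_new


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
weights = [1.0, 2.0]
objective = br_objective(loss=0.5, weights=weights, n_samples=2, phi=1.0, eta=0.1)
gamma, phi_new, eta_new = update_factors(
    loss=0.5, weights=weights, n_samples=2, eta=0.1, trace_h_inv=5.0)
```

In a full training loop these two steps would alternate with weight updates, so that the balance between fitting the data and shrinking the weights is adjusted as training proceeds.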

The BP neural network optimization process of this paper is described below:

Step 1: Based on the inputs and outputs of the data, determine the optimal number of hidden layer neurons for the BP neural network using an empirical formula.
Step 2: To address the overfitting caused by the default LM algorithm, compare the experimental results of the LM and BR training algorithms and choose the better-performing algorithm for further optimization.
Step 3: With the training algorithm chosen in the previous step, compare the four neural network activation functions experimentally and select the one with the better experimental results for further optimization.
Step 4: To address the shortcomings of the previous experiments, such as the large number of iterations and unstable accuracy, design a genetic algorithm to optimize the weights and thresholds of the BP neural network, and finally compare the optimization results.

Specific experimental comparisons are presented in Section 4.

3.4. Genetic Algorithm Optimization

Genetic algorithms were proposed by John Holland in the early 1970s, inspired by the phenomena of “survival of the fittest” and “genetic variation” in biological evolution. The main process is as follows:

Step 1: Initialization: Randomly generate an initial population consisting of a set of individuals, each represented as a string (chromosome), with each character (gene) in the string representing a parameter value in the solution space.
Step 2: Fitness assessment and selection: Calculate the fitness (fitness function value) of each individual. The fitness function is designed from the objective function of the problem and measures how good each individual is. Individuals with higher fitness have a higher probability of being selected into the next generation.
Step 3: Gene crossover and mutation: Crossover generates new offspring by exchanging gene segments between parent individuals. Mutation changes genes at random with a set probability.
Step 4: Population replacement and iteration: The offspring generated by crossover and mutation form the new population, and the above steps are repeated until the termination condition is reached (the fitness meets a set threshold or the maximum number of iterations is reached).
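The four steps above can be sketched as a small real-coded genetic algorithm. This is a minimal illustration rather than the implementation used in this paper: binary tournament selection, single-point crossover, Gaussian mutation, elitism, and the toy fitness function are all assumptions, and the population size and generation count are reduced for brevity. The crossover and mutation probabilities default to the values reported in Section 4.4.

```python
import random


def run_ga(fitness, chrom_len, pop_size=20, generations=40,
           p_cross=0.58, p_mut=0.15, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of real-valued chromosomes.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(chrom_len)]
           for _ in range(pop_size)]
    best_history = []

    def select(population):
        # Step 2: binary tournament, the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(pop, key=fitness)
        best_history.append(fitness(elite))
        new_pop = [elite[:]]                      # elitism keeps the best individual
        while len(new_pop) < pop_size:
            child = select(pop)[:]
            # Step 3: single-point crossover with probability p_cross.
            if rng.random() < p_cross:
                mate = select(pop)
                cut = rng.randrange(1, chrom_len)
                child = child[:cut] + mate[cut:]
            # Step 3: each gene mutates with probability p_mut.
            for i in range(chrom_len):
                if rng.random() < p_mut:
                    child[i] += rng.gauss(0.0, 0.1)
            new_pop.append(child)
        pop = new_pop                             # Step 4: replace the population
    return max(pop, key=fitness), best_history


def toy_fitness(chromosome):
    """Hypothetical fitness: higher is better, maximal at the all-zero chromosome."""
    return -sum(g * g for g in chromosome)


best, history = run_ga(toy_fitness, chrom_len=5)
```

In a GA-BP setting, the fitness function would instead decode each chromosome into network weights and thresholds and score the resulting prediction error.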

Figure 3 shows the basic flowchart of the research in this paper:

4. Experimental Results and Analyses

4.1. Establishment of a Logistics Express Impact Indicator System

The logistics industry occupies an important position in modern urban development, and its development has a profound impact on the regional economy. Based on a review of the literature, this paper selects 18 candidate indicators for the logistics express demand impact indicator system. The indicators were selected according to the following principles:

The selection of indicators should reflect the development trend of the regional logistics industry as much as possible.
Select indicators that have a greater impact on the regional logistics and express delivery industry.
Ensure the reliability and authenticity of the sources of indicators.

In this paper, the grey correlation model is used to analyze the relationship between the 18 indicators and the regional logistics express volume, yielding a grey correlation coefficient for each indicator. The 14 indicators with grey correlation coefficients above 0.75 were selected for the next step, as shown in Table 1.
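The screening step can be illustrated with a minimal sketch of grey relational analysis followed by the 0.75 threshold. The sketch assumes Deng's standard formulation with a distinguishing coefficient of 0.5 and inputs that are already on a comparable scale; the exact preprocessing used in this paper may differ, and the sequences below are hypothetical.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Return one grey relational grade per candidate sequence."""
    diffs = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every candidate coincides with the reference sequence.
        return [1.0 for _ in candidates]
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in diffs
    ]


# Hypothetical sequences: one follows the reference exactly, one moves against it.
reference = [1.0, 2.0, 3.0]
indicators = {"same_trend": [1.0, 2.0, 3.0], "opposite_trend": [3.0, 2.0, 1.0]}
grades = dict(zip(indicators,
                  grey_relational_grades(reference, list(indicators.values()))))
selected = [name for name, grade in grades.items() if grade > 0.75]
```

An indicator that tracks the reference sequence closely receives a grade near 1 and passes the threshold, while one that moves against it is screened out.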

4.2. Data Sources

To ensure that the indicator data are authoritative, the data in this paper come from the Chongqing Statistical Yearbook (2002–2021) compiled by the Chongqing Municipal Bureau of Statistics. Because the indicators have different units, the collected data were standardized with the Z-score method using IBM SPSS Statistics 26, in order to remove the effect of unit scale and simplify the data. The results are shown in Table 2.
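For readers without SPSS, Z-score standardization and its inverse (needed later to convert standardized forecasts back to item counts) can be sketched as below. The use of the sample standard deviation (divisor n − 1) is an assumption about the SPSS setting, and the raw values are hypothetical.

```python
import statistics


def z_score(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)   # sample standard deviation (n - 1)
    return [(v - mean) / std for v in values], mean, std


def inverse_z_score(z_values, mean, std):
    """Map standardized values back to the original unit."""
    return [z * std + mean for z in z_values]


raw = [10.0, 20.0, 30.0, 40.0, 50.0]     # hypothetical indicator series
z, mean, std = z_score(raw)
restored = inverse_z_score(z, mean, std)
```

Keeping the mean and standard deviation of the target series is what allows a standardized network output to be converted back into a forecast in the original unit.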

4.3. Factor Analysis and Dimensionality Reduction

The standardized data for the 14 indicators were analyzed by using Spearman’s correlation analysis, the results of which are shown in Figure 4.

Figure 4 shows that the 14 indicators are highly correlated with one another, which provides the basis for the subsequent factor analysis. Factor analysis and dimensionality reduction of the indicators were then carried out using SPSS software. The KMO test value of the indicators is 0.754, and the significance of Bartlett’s sphericity test is reported as 0, which indicates that factor analysis is appropriate for reducing the dimensionality of the indicator data. After factor analysis, the indicators were reduced to three component factors, and the results are shown in Table 3.

4.4. Analysis of BP Neural Network Prediction Results

An empirical formula was used to determine the number of neurons in the hidden layer of the neural network for this experiment. This formula offers an interval range for the number of neurons in the hidden layer of the neural network. The empirical formula is shown below:

$$H = \sqrt{m + n} + a \tag{15}$$

where $m$ and $n$ are the numbers of input and output neurons, respectively, and $a$ is a constant between 1 and 10. The experiments in this paper use 3 input neurons and 1 output neuron, so the number of hidden layer neurons should be between 3 and 12. The dimensionality-reduced data were imported into a program written in Matlab R2021b to search for the best number of hidden layer neurons, which was found to be 10 for this experiment.
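The candidate range implied by Equation (15) can be checked with a few lines of Python. The function name is hypothetical, and rounding to the nearest integer is an assumption that only matters when $m + n$ is not a perfect square.

```python
import math


def hidden_neuron_candidates(n_inputs, n_outputs, a_min=1, a_max=10):
    """Evaluate H = sqrt(m + n) + a for every integer a in [a_min, a_max]."""
    base = math.sqrt(n_inputs + n_outputs)
    return [round(base + a) for a in range(a_min, a_max + 1)]


hidden_candidates = hidden_neuron_candidates(n_inputs=3, n_outputs=1)
```

For 3 inputs and 1 output the list runs from 3 to 12, and each candidate would then be trained and compared to pick the best-performing size.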

Subsequently, the neural network prediction module in Matlab R2021b was used for prediction, with the commonly used Levenberg–Marquardt algorithm as the training method and the Tanh function as the hidden layer activation function. Of the dataset, 75% was used as the training set and 25% as the test set. The results of the network training are shown in Figure 5.
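A 75%/25% split of the 20 annual samples can be sketched as follows. The split is assumed to be chronological, which is consistent with the test years 2017–2021 reported in Table 4; the function name is hypothetical.

```python
def chronological_split(samples, train_fraction=0.75):
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


years = list(range(2002, 2022))          # 2002-2021, one sample per year
train_years, test_years = chronological_split(years)
```

With 20 samples this yields 15 training years (2002–2016) and 5 test years (2017–2021).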

Figure 5 shows that some degree of overfitting occurs during training with the LM algorithm. Although the LM algorithm needs fewer iterations and a shorter training time, it tends to overfit when the training data sequence is short. Nevertheless, many researchers use the LM algorithm to train BP neural networks. The Bayesian regularization training method used in this paper can effectively avoid overfitting when few training data are available. The prediction results using the LM and BR algorithms are shown in Figure 6 and Table 4.

The various evaluation parameters of the prediction model are shown in Table 5.

This paper next analyzes the choice of activation function for the BP neural network. Because the dimensionality-reduced data take both positive and negative values, the activation function must respond to both positive and negative inputs, and the Tanh (hyperbolic tangent) function is the usual choice for the hidden layer. However, some of the inputs in this paper exceed 1.5 in absolute value, and the derivative of the Tanh function becomes small beyond ±1.5, which easily leads to vanishing gradients and impairs learning during training. Therefore, this paper uses the LeCun Tanh function as the hidden layer activation function to alleviate this problem. The derivatives of the LeCun Tanh and Tanh functions are compared in Figure 7.
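The difference between the two derivatives can also be checked numerically. The sketch below assumes the commonly used scaled form of the LeCun Tanh function, $f(x) = 1.7159 \tanh(2x/3)$; if the definition used earlier in this paper differs, the constants would need to be adjusted accordingly.

```python
import math


def tanh_derivative(x):
    """Derivative of tanh(x)."""
    return 1.0 - math.tanh(x) ** 2


def lecun_tanh(x):
    """Scaled tanh, assumed form: 1.7159 * tanh(2x / 3)."""
    return 1.7159 * math.tanh(2.0 * x / 3.0)


def lecun_tanh_derivative(x):
    """Derivative of the scaled tanh above."""
    return 1.7159 * (2.0 / 3.0) * (1.0 - math.tanh(2.0 * x / 3.0) ** 2)


# Print both derivatives at inputs beyond the +/-1.5 region discussed above.
for x in (1.5, 2.0, 3.0):
    print(x, round(tanh_derivative(x), 4), round(lecun_tanh_derivative(x), 4))
```

At each of these inputs the scaled function keeps a noticeably larger derivative than the plain Tanh function, which is the property relied on here to reduce vanishing gradients.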

The prediction results of the BP neural network using BR as the training algorithm and LeCun Tanh and Tanh as the activation function are shown in Figure 8 and Table 6.

The various evaluation parameters of the prediction model are shown in Table 7.

When BR is used as the training algorithm and the LeCun Tanh function as the activation function, the added complexity of the training algorithm and activation function means that the number of training iterations often exceeds the preset maximum, and the prediction accuracy of the network is sometimes unstable. To address this, this paper uses a genetic algorithm to optimize the initial connection weights and thresholds of the neural network, which improves prediction accuracy while reducing the number of training iterations and improving the stability of the network. A comparison of the experimental results of the optimized neural network with the BP neural network (both using BR as the training algorithm and LeCun Tanh as the hidden layer activation function) is shown in Figure 9 and Table 8.

The various evaluation parameters of the prediction model are shown in Table 9.
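The evaluation indicators can be recomputed from the fitted values. The sketch below assumes the standard definitions of SSE, MAE, MSE, RMSE and MAPE, with the accuracy rate taken as 100% minus MAPE, and applies them to the GA-BPNN column of Table 8; small differences from Table 9 are expected because the tabulated fitted values are rounded.

```python
import math


def evaluate(actual, fitted):
    """Return the error metrics used in the evaluation tables."""
    errors = [f - a for a, f in zip(actual, fitted)]
    n = len(errors)
    sse = sum(e * e for e in errors)
    mse = sse / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {
        "SSE": sse,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": mape,                  # in percent
        "Accuracy": 100.0 - mape,      # in percent
    }


# Standardized actual values and GA-BPNN fitted values for 2017-2021 (Table 8).
actual = [0.4656, 0.9272, 1.2675, 1.9027, 2.7883]
ga_bp_fitted = [0.476, 0.8932, 1.4013, 1.8507, 2.7539]
metrics = evaluate(actual, ga_bp_fitted)
```

The same function can be applied to the other fitted columns in Tables 4, 6 and 8 to compare models on identical terms.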

The relevant parameters of the genetic algorithm are set as follows. The chromosome length $C$ is

$$C = I \times H + H + H \times O + O \tag{16}$$

where $I$ is the number of input layer neurons, $H$ is the number of hidden layer neurons, and $O$ is the number of output layer neurons.
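For the 3-10-1 network used in this paper, Equation (16) gives a chromosome of 51 genes. The sketch below computes the length and shows one possible way to split a flat chromosome back into the four parameter groups; the gene ordering (input weights, hidden thresholds, output weights, output thresholds) and the function names are assumptions made for illustration.

```python
def chromosome_length(n_in, n_hidden, n_out):
    """Equation (16): C = I*H + H + H*O + O."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def decode(chromosome, n_in, n_hidden, n_out):
    """Split a flat chromosome into the four groups, in the order of Equation (16)."""
    sizes = [n_in * n_hidden, n_hidden, n_hidden * n_out, n_out]
    parts, start = [], 0
    for size in sizes:
        parts.append(chromosome[start:start + size])
        start += size
    return parts


length = chromosome_length(3, 10, 1)
w_in, b_hidden, w_out, b_out = decode(list(range(length)), 3, 10, 1)
```

Each individual in the population is then one candidate set of initial weights and thresholds for the network.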

The number of evolutionary generations is 350, the population size is 50, the crossover probability is 0.58, and the mutation probability is 0.15. The fitness iteration curve of the genetic algorithm is shown in Figure 10.

After optimizing the modules and parameters of the BP neural network model, we compared its forecasts with those of the forecasting methods used in the reviewed literature, all applied to the regional logistics express impact indicator system established in this paper. Multi-layer perceptron (MLP), ridge regression, lasso regression, and GM(1, 1) models were used for comparison. The comparison results are shown in Table 10 and Figure 11.

Firstly, the comparison shows that the SSE, MAE, MSE, RMSE, and MAPE of the GM(1, 1) model are higher than those of the other four models, probably because GM(1, 1) is better suited to data with smooth trends and is prone to larger errors on data with larger fluctuations. Secondly, the ridge regression and lasso regression models, two commonly used linear regression models, perform very similarly to each other on data from a nonlinear system but underperform the nonlinear MLP and improved GA-BP models.

Following the above comparison, the improved BP neural network, with its initial weights and thresholds optimized by a genetic algorithm, has higher prediction accuracy and stability and can be used as a prediction model for the future demand for logistics and express delivery in Chongqing. This paper then uses the time prediction model to predict Factor_1, Factor_2, and Factor_3 and inputs the predicted values for the next five years into the trained neural network in order to predict the logistics and express delivery demand of Chongqing Municipality from 2022 to 2026. The experimental results indicate that the volume of express delivery services in Chongqing Municipality is projected to reach 2,171,642,700 items by 2026, an increase of 122% compared to 979,000,000 items in 2021. The findings are presented in Table 11 and Figure 12.

In addition, when the data for this paper were collected and processed, the latest statistical yearbook published by the Chongqing Municipal Bureau of Statistics was that for 2021. In January 2024, however, the 2022 statistical yearbook was published, which reports an express delivery business volume of 1,091,768,000 items for 2022, only 3.7% different from the value predicted for 2022 in this paper. This supports the accuracy of the model in this paper.
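The percentages quoted above can be reproduced from the figures in Table 11 and in the text. In the sketch below, measuring the 2022 deviation relative to the published value (rather than the forecast) is an assumption, and the function name is hypothetical.

```python
def relative_deviation(actual, forecast):
    """Absolute deviation of the forecast from the actual value, in percent."""
    return 100.0 * abs(actual - forecast) / actual


forecast_2022 = 105_083.11 * 10_000      # Table 11, converted to items
actual_2022 = 1_091_768_000              # 2022 statistical yearbook
deviation_2022 = relative_deviation(actual_2022, forecast_2022)

actual_2021 = 979_000_000
forecast_2026 = 217_164.27 * 10_000      # Table 11, converted to items
growth_2021_to_2026 = 100.0 * (forecast_2026 - actual_2021) / actual_2021
```

Under these assumptions the 2022 deviation comes out at about 3.7% and the projected growth from 2021 to 2026 at about 122%, in line with the figures reported in the text.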

5. Conclusions and Prospect

Logistics express demand is determined by a combination of environmental factors, which is why impact indicators need to be selected from several aspects, such as the regional economy, the industrial environment, and infrastructure construction. Forecasting this demand is significant not only for assessing the development of the logistics industry but also for the outlook of socioeconomic industries. This paper first uses grey correlation analysis to screen indicators and establish a regional logistics express impact indicator system. Then, according to the characteristics of the processed indicator data, a suitable neural network activation function and training algorithm are selected. Finally, to address the defects found in the BP neural network prediction experiments, the GA-BP algorithm is designed for optimization. The comparison of experimental results shows that GA-BPNN has higher prediction accuracy, and its predicted logistics and express delivery demand for Chongqing in 2026 is 2,171,642,700 items.

From a practical point of view, the research in this paper can provide data support for local government policy-making and logistics planning and can offer modelling ideas for enterprises’ production planning.

Firstly, according to the forecast results of this paper (Table 11), the logistics express volume in Chongqing in 2026 will be roughly 107% higher than the 2022 forecast if the current development trend continues. The Chongqing government should pay attention to the future planning of regional logistics to avoid duplicated construction caused by unreasonable planning of logistics infrastructure. Policies should strongly promote innovation in logistics and distribution modes (e.g., logistics alliances and crowdsourced logistics) to cope with the significant growth of logistics demand in the future.

Secondly, the growth of logistics demand is clearly seasonal. In the next few years, logistics demand may not rise smoothly but may instead grow sharply at certain times. Logistics enterprises should therefore forecast and assess logistics demand ahead of events such as shopping carnivals and large international activities, in order to avoid operational breakdowns caused by too many orders and insufficient capacity. The forecasting model in this paper provides ideas for logistics enterprises to forecast logistics express demand.

The limitations of the study lie mainly in the research object and the model parameters. The development of the logistics industry is affected by comprehensive factors such as national policies and the international economic situation. Future research can therefore expand the scope of the study to consider policies and economic conditions, so as to analyze and evaluate demand at a more comprehensive level.

Additionally, this paper mainly focuses on optimizing and improving the activation function and training algorithm of the BP neural network, but whether other neural network models can be improved in the same way is worthy of further research.

Author Contributions

Methodology, S.W.; formal analysis, T.X.; data curation, S.W.; writing—original draft, S.W.; writing—review and editing, F.M. and C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We would like to express our gratitude to Zhao Bowen for helping us organize the relevant references.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ding, Q.; Zhao, H. Study on e-commerce logistics cost control methods in the context of COVID-19 prevention and control. Soft Comput. 2021, 25, 11955–11963.
2. Şişman, G. Implementing Lean Six Sigma methodology to reduce the logistics cost: A case study in Turkey. Int. J. Lean Six Sigma 2023, 14, 610–629.
3. Winkelhaus, S.; Grosse, E.H. Logistics 4.0: A systematic review towards a new logistics system. Int. J. Prod. Res. 2019, 58, 18–43.
4. Guo, X.H.; Wu, L. A review of the evolution of industrial structure in new China (1949–2016). Res. Chin. Econ. Hist. 2018, 133–142.
5. Li, J.F. Advanced Industrial Structure and Modernisation of Tertiary Industry. J. Sun Yat-Sen Univ. (Soc. Sci. Ed.) 2005, 4, 124–130.
6. Li, S.L.; Liu, X.Y.; Du, C. The spatial dynamic development pattern of Chinese industries and its causes--Another discussion on the flattening of urban system. Financ. Trade Res. 2018, 29, 15–27.
7. Yuan, X.D.; Chen, B.Y.; He, X.; Zhang, G.J.; Zhou, C.S. Spatial Differentiation and Influencing Factors of Tertiary Industry in the Pearl River Delta Urban Agglomeration. Land 2024, 13, 172.
8. Xu, X.Y.; Yang, H.M.; Lu, X.K.; Wang, X.; Kang, J.C. A comparative study of logistics demand forecasting based on different models in Shandong Province. Packag. Eng. 2022, 43, 207–215.
9. Čačić, N.; Jovanović, B.; Šarac, D.; Trubint, N.; Duđak, L.; Blagojević, M. Demand modelling and forecasting the future development of parcel and express services. Econ. Comput. Econ. Cybern. Stud. Res. 2023, 57, 253–268.
10. Li, S.C.; Ye, J. Demand analysis and prediction of cold chain logistics of agricultural products based on grey regression model. Highw. Traffic Sci. Technol. 2022, 39, 166–174.
11. Ren, X.; Tan, J.; Qiao, Q.; Wu, L.; Ren, L.; Meng, L. Demand forecast and influential factors of cold chain logistics based on a grey model. Math. Biosci. Eng. 2022, 19, 7669–7686.
12. Zou, X.; Somenahalli, S.; Scrafton, D. Evaluation and analysis of urban logistics competitiveness and spatial evolution. Int. J. Logist. Res. Appl. 2020, 23, 493–507.
13. Wu, H.; Chen, S.; Cui, S. Demand forecasting of bulk cargo transport based on GM (1, 1)-MLP neural network model. Highw. Traffic Sci. Technol. 2023, 40, 233–240.
14. Zeng, M.; Liu, R.; Gao, M.; Jiang, Y. Demand forecasting for rural e-commerce logistics: A gray prediction model based on weakening buffer operator. Mob. Inf. Syst. 2022, 2022, 3395757.
15. Huang, L.; Xie, G.; Zhao, W.; Gu, Y.; Huang, Y. Regional logistics demand forecasting: A BP neural network approach. Complex Intell. Syst. 2021, 9, 2297–2312.
16. Yu, N.; Xu, W.; Yu, K.L. Research on regional logistics demand forecast based on improved support vector machine: A case study of Qingdao city under the New Free Trade Zone Strategy. IEEE Access 2020, 8, 9551–9564.
17. Zhang, J. Cold Chain Logistics Demand Forecasting for Fresh Agricultural Products in Jilin Province Based on Improved BP Neural Networks. Master’s Thesis, Jilin University, Jilin, China, 2021.
18. Guo, H.; Guo, C.; Xu, B.; Xia, Y.; Sun, F. MLP neural network-based regional logistics demand prediction. Neural Comput. Appl. 2021, 33, 3939–3952.
19. Ma, H.; Luo, X. Logistics demand forecasting model based on improved neural network algorithm. J. Intell. Fuzzy Syst. 2021, 40, 6385–6395.
20. Huang, J.; Zhang, D. An improved GM-BPNN combinatorial forecasting method for uncertain logistics demand. Stat. Decis. Mak. 2022, 295, 881–922.
21. Hao, Y.; Zou, Y. Demand Forecasting for Fresh Agricultural Products Logistics in Shanghai Based on BP Neural Networks. J. Shanghai Marit. Univ. 2024, 45, 39–45.
22. Liu, H.; Liu, Y.; Wang, C.; Song, Y.; Jiang, W.; Li, C.; Zhang, S.; Hong, B. Natural Gas Demand Forecasting Model Based on LASSO and Polynomial Models and Its Application: A Case Study of China. Energies 2023, 16, 4268.
23. Rao, C.; Huang, Q.; Chen, L.; Mark, G.; Hu, Z. Forecasting the carbon emissions in Hubei Province under the background of carbon neutrality: A novel STIRPAT extended model with ridge regression and scenario analysis. Environ. Sci. Pollut. Res. 2023, 30, 57460–57480.
24. Deng, J.L. Control problems of grey systems. Syst. Control Lett. 1982, 1, 288–294.
25. Wu, G.; Luo, N. Multi-objective optimization of PEMFC performance based on grey correlation analysis and response surface method. Front. Energy Res. 2023, 11, 1206418.
26. Zhao, J.; Cui, C.; Zhang, P.; Wang, K.; Zhao, M. Parameter sensitivity analysis of the seismic response of a piled wharf structure. Buildings 2023, 13, 349.
27. Yue, Y.; Jiao, L.-Q.; Gao, P. Grey correlation analysis of logistics and economy in Shanxi. Econ. Issues 2017, 7, 121–124.
28. Liu, S.F.; Cai, H.; Yang, Y.J.; Cao, Y. Advances in grey correlation analysis models. Syst. Eng. Theory Pract. 2013, 33, 2041–2046.
29. Lorenzo-Seva, U. SOLOMON: A method for splitting a sample into equivalent subsamples in factor analysis. Behav. Res. Methods 2022, 54, 2665–2677.
30. Sannigrahi, S.; Zhang, Q.; Pilla, F.; Joshi, P.K.; Basu, B.; Keesstra, S.; Roy, P.S.; Wang, Y.; Sutton, P.C.; Chakraborti, S.; et al. Responses of ecosystem services to natural and anthropogenic forcings: A spatial regression based assessment in the world’s largest mangrove ecosystem. Sci. Total Environ. 2020, 715, 137004.
31. Kern, E.L.; Cobuci, J.A.; Costa, C.N.; Pimentel, C.M.M. Factor analysis of linear type traits and their relation with longevity in Brazilian Holstein cattle. Asian-Australas. J. Anim. Sci. 2014, 27, 784.
32. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.

Figure 1. Neural network structure diagram.

Figure 2. Four activation functions and derivative images.

Figure 3. Research flowchart.

Figure 4. Spearman’s correlation analysis of indicators.

Figure 5. Error decline plot for LM training algorithm.

Figure 6. LM and BR algorithm prediction results.

Figure 7. Comparison of the derivatives of Tanh with LeCun Tanh.

Figure 8. Comparison of Tanh function and LeCun Tanh function prediction results.

Figure 9. Comparison of BPNN and GA-BPNN prediction results.

Figure 10. Genetic optimization algorithm adaptation iterative graph.

Figure 11. Radar plots of experimental results for different models.

Figure 12. Prediction result diagram of express demand.

Table 1. Grey correlation coefficients for indicators.

Indicator Variables	Grey Correlation Coefficient	References
X₁—Fixed Internet user/10,000 households	0.8653	Xu et al. [8]
X₂—Scale of investment in transport, storage, and postal services/CNY 10,000	0.8543	Ren et al. [11]
X₃—Share of transport, storage, and postal services/%	0.8498	Čačić et al. [9]
X₄—Import and export trade/USD 10,000	0.8463
X₅—Total retail sales of consumer goods/10,000 CNY	0.8426	Ma and Luo [19]
X₆—Tertiary GDP/CNY 100,000,000	0.8393
X₇—Expenditure on transport/CNY 10,000	0.8287	Huang et al. [15]
X₈—GDP/CNY 100,000,000	0.8217	Hao and Zou [21]
X₉—Number of persons employed in transport, storage, and postal services/10,000 people	0.8181
X₁₀—Tertiary sector contribution rate/%	0.8082	Li and Ye [10]
X₁₁—Per capita disposable income of urban permanent residents/CNY	0.7704	Yu et al. [16]
X₁₂—Gross transport, storage, and postal product/CNY 100,000,000	0.7588
X₁₃—Volume of freight/10,000 tons	0.7515	Zeng et al. [14]
X₁₄—Freight turnover/10,000 ton-kilometers	0.7508
Mobile telephone users/10,000 households	0.7441
The scale of investment in the information transmission, computer services, and software industry/CNY 10,000	0.6952
Operate trucks/vehicles	0.6634	Guo et al. [18]
Road mileage/km	0.6386

Table 2. Standardization of indicators. (Y is the volume of logistics express transactions after standardization).

Year	$Y$	$X_{1}$	$X_{2}$	$X_{3}$	$X_{4}$	$X_{5}$	$X_{6}$	$X_{7}$	$X_{8}$	$X_{9}$	$X_{10}$	$X_{11}$	$X_{12}$	$X_{13}$	$X_{14}$
2002	−0.70	−1.01	−1.13	1.08	−1.07	−1.13	−1.11	−1.25	−1.18	−1.31	−0.64	−1.32	−1.40	−1.49	−1.50
2003	−0.70	−0.95	−1.12	0.79	−1.05	−1.10	−1.08	−1.24	−1.14	−1.22	−0.99	−1.24	−1.35	−1.42	−1.47
2004	−0.70	−0.88	−1.10	0.88	−1.02	−1.07	−1.06	−1.24	−1.09	−1.13	−1.01	−1.14	−1.25	−1.31	−1.35
2005	−0.70	−0.87	−1.04	1.18	−1.01	−1.02	−1.01	−1.18	−1.04	−1.05	−0.20	−1.05	−1.14	−1.23	−1.26
2006	−0.69	−0.84	−1.01	1.47	−0.98	−0.97	−0.97	−1.06	−0.99	−0.99	0.12	−0.94	−1.01	−1.12	−1.09
2007	−0.69	−0.79	−0.93	0.69	−0.93	−0.89	−0.88	−1.03	−0.88	−0.90	−1.02	−0.85	−0.93	−0.92	−0.90
2008	−0.65	−0.75	−0.82	0.88	−0.87	−0.77	−0.74	−0.93	−0.75	−0.81	−0.81	−0.69	−0.66	−0.53	−0.54
2009	−0.63	−0.72	−0.56	0.98	−0.92	−0.66	−0.66	−0.79	−0.66	−0.71	−0.65	−0.57	−0.49	−0.39	−0.40
2010	−0.61	−0.60	−0.48	0.79	−0.80	−0.51	−0.49	−0.62	−0.49	−0.58	−1.34	−0.41	−0.25	−0.03	−0.10
2011	−0.56	−0.48	−0.45	0.40	−0.38	−0.30	−0.27	0.21	−0.24	−0.30	−0.22	−0.17	0.06	0.41	0.34
2012	−0.51	−0.35	−0.13	−0.19	0.22	−0.12	−0.14	0.38	−0.06	−0.03	−0.21	0.07	0.09	0.12	0.44
2013	−0.33	−0.12	0.20	−0.39	0.61	0.07	0.02	0.77	0.11	0.32	0.42	0.08	0.26	0.14	0.14
2014	−0.21	−0.05	0.30	−0.58	1.28	0.26	0.19	0.82	0.30	0.62	−0.10	0.26	0.39	0.43	0.39
2015	0.02	0.26	0.64	−0.78	0.76	0.48	0.39	1.10	0.47	0.79	0.46	0.45	0.54	0.61	0.48
2016	0.31	0.56	0.94	−0.97	0.46	0.73	0.66	0.88	0.71	0.98	0.66	0.66	0.72	0.73	0.70
2017	0.47	1.00	0.88	−1.17	0.56	0.97	0.95	1.05	0.96	1.28	0.89	0.88	0.89	0.94	1.04
2018	0.93	1.40	1.13	−1.07	0.87	1.20	1.18	0.92	1.14	1.20	2.83	1.12	1.12	1.31	1.23
2019	1.27	1.59	1.27	−1.17	1.00	1.42	1.47	1.08	1.38	1.22	0.82	1.39	1.39	0.87	1.24
2020	1.90	1.69	1.65	−1.46	1.25	1.45	1.60	1.18	1.55	1.31	−0.43	1.57	1.27	1.11	1.17
2021	2.79	1.92	1.74	−1.36	2.00	1.97	1.94	0.96	1.90	1.31	1.43	1.88	1.75	1.77	1.43

Table 3. Factor analysis dimensionality reduction results.

Year	Factor_1	Factor_2	Factor_3
2002	−1.35941	−0.48689	−0.06964
2003	−1.17288	−0.30338	−0.59206
2004	−1.14122	−0.22605	−0.64507
2005	−1.36911	−0.51505	0.43343
2006	−1.36874	−0.66043	0.85061
2007	−0.72930	−0.19463	−0.81934
2008	−0.56759	−0.24500	−0.63413
2009	−0.52864	−0.22644	−0.47136
2010	0.00314	0.07949	−1.53201
2011	0.81866	−0.97357	−0.19903
2012	1.10003	−0.87051	−0.33696
2013	1.13923	−1.05458	0.44497
2014	1.72081	−0.87636	−0.38454
2015	1.37638	−0.63810	0.27886
2016	0.78650	0.13559	0.41601
2017	0.72043	0.44500	0.60881
2018	−0.17869	0.32164	3.09303
2019	0.14649	1.68686	0.27305
2020	0.60653	2.38330	−1.50996
2021	−0.00263	2.21911	0.79533

Table 4. Comparison of prediction results between LM algorithm and BR algorithm.

Year	Actual Value	LM Training		BR Training
		Fitted	Errors	Fitted	Errors
2017	0.4656	0.4642	−0.0015	0.5506	0.0849
2018	0.9272	0.9119	−0.0152	0.9239	−0.0033
2019	1.2675	1.7827	0.5152	1.4604	0.1929
2020	1.9027	1.8037	−0.0990	1.8985	−0.0042
2021	2.7883	2.7229	−0.0655	2.7823	−0.0061

Table 5. Evaluation of LM algorithm and BR algorithm prediction results.

Evaluation Indicators	LM Training	BR Training
SSE	0.27980	0.04450
MAE	0.13928	0.05828
MSE	0.05596	0.00890
RMSE	0.23656	0.09434
MAPE	10.03%	6.85%
Accuracy Rate	89.97%	93.15%

Table 6. Comparison of Tanh function and LeCun Tanh function prediction results.

Year	Actual Value	Tanh Activation		LeCun Tanh Activation
		Fitted	Errors	Fitted	Errors
2017	0.4656	0.5506	0.0849	0.5177	0.0521
2018	0.9272	0.9239	−0.0033	0.8919	−0.0353
2019	1.2675	1.4604	0.1929	1.3753	0.1078
2020	1.9027	1.8985	−0.0042	1.8365	−0.0661
2021	2.7883	2.7823	−0.0061	2.7689	−0.0195

Table 7. Evaluation of Tanh function and LeCun Tanh function prediction results.

Evaluation Indicators	Tanh Activation	LeCun Tanh Activation
SSE	0.04450	0.02034
MAE	0.05828	0.05616
MSE	0.00890	0.00407
RMSE	0.09434	0.06378
MAPE	6.85%	5.53%
Accuracy Rate	93.15%	94.47%

Table 8. Comparison of BPNN and GA-BPNN prediction results.

Year	Actual Value	BPNN		GA-BPNN
		Fitted	Errors	Fitted	Errors
2017	0.4656	0.5177	0.0521	0.476	0.0103
2018	0.9272	0.8919	−0.0353	0.8932	−0.0340
2019	1.2675	1.3753	0.1078	1.4013	0.1339
2020	1.9027	1.8365	−0.0661	1.8507	−0.0520
2021	2.7883	2.7689	−0.0195	2.7539	−0.0344

Table 9. Evaluation of BPNN and GA-BPNN prediction results.

Evaluation Indicators	BPNN	GA-BPNN
SSE	0.02034	0.02306
MAE	0.05616	0.05291
MSE	0.00407	0.00461
RMSE	0.06378	0.06792
MAPE	5.53%	4.08%
Accuracy Rate	94.47%	95.92%

Table 10. Experimental results for different models.

Methods Used	SSE	MAE	MSE	RMSE	MAPE	Accuracy Rate
MLP	0.24599	0.20377	0.049198	0.22181	17.39%	82.61%
Ridge regression	0.61513	0.29801	0.12303	0.35075	27.83%	72.17%
Lasso regression	0.6144	0.2979	0.12288	0.35054	27.88%	72.12%
GM(1, 1)	1.9265	0.45564	0.38529	0.62072	33.05%	66.95%
Improved GA-BP	0.02306	0.05291	0.00461	0.06792	4.08%	95.92%

Table 11. Chongqing 2022–2026 express demand forecasts.

Year	Standardized Value	Actual Forecast Value/10,000 Items
2022	3.0449	105,083.11
2023	4.1087	134,863.39
2024	5.4723	173,040.09
2025	6.3959	198,896.85
2026	7.0485	217,164.27

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
