Article

Logistics Transportation Vehicle Supply Forecasting Based on Improved Informer Modeling

1 School of Transportation Engineering, Xinjiang University, Urumqi 830017, China
2 Xinjiang Key Laboratory of Green Construction and Smart Traffic Control of Transportation Infrastructure, Xinjiang University, Urumqi 830017, China
3 School of Business, Xinjiang University, Urumqi 830017, China
4 Xinjiang Hualing Logistics & Distribution Co., Urumqi 830017, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8162; https://doi.org/10.3390/app14188162
Submission received: 27 July 2024 / Revised: 4 September 2024 / Accepted: 9 September 2024 / Published: 11 September 2024
(This article belongs to the Special Issue Data Science and Machine Learning in Logistics and Transport)

Abstract:
This study addresses the supply prediction of logistics transportation vehicles in road transportation. Logistics transportation supply data are characterized by long sequences, numerous influencing factors, and significant spatiotemporal evolution patterns, which undermine prediction accuracy; this paper therefore proposes a supply prediction method for logistics transportation based on an improved Informer model. Firstly, multidimensional feature engineering is applied to the historical supply data to enhance the interpretability of the labeled data. Secondly, a spatiotemporal convolutional network is designed to extract the spatiotemporal features of the supply volume. Lastly, a long short-term memory (LSTM) model is introduced to capture the supply volume’s long- and short-term dependencies, and the predicted value is derived through a multilayer perceptron. The experimental results show that, compared to the auto-regressive integrated moving average (ARIMA), support vector regression (SVR), LSTM, gated recurrent unit (GRU), back propagation neural network (BPNN), Informer, and InformerStack single models, the mean square error (MSE) is reduced by 73.8%, 79.36%, 82.24%, 78.58%, 77.02%, 53.96%, and 40.38%, and the mean absolute error (MAE) is reduced by 52%, 59.5%, 60.36%, 57.52%, 53.9%, 31.21%, and 36.58%, respectively; compared with the ARIMA + BPNN, ARIMA + GRU, and ARIMA + LSTM integrated models, the MSE is reduced by 74.88%, 71.56%, and 74.07%, and the MAE by 51.31%, 50%, and 52.02%, respectively. The proposed method thus effectively reduces the supply prediction error and improves prediction accuracy.

1. Introduction

With the rapid development of road transportation services in the remote regions of northwest China, the volume of road transport has also shown a significant upward trend. Vehicle transport capacity plays a crucial role in the logistics system. In an environment characterized by limited resources and diverse demands, accurately forecasting vehicle supply becomes particularly important. Therefore, enhancing vehicle supply forecasting capabilities will have a profound impact on optimizing road transportation operations and promoting economic development.
Logistics and transport supply capacity refers to the range of resources provided by logistics and transport services, including transport equipment, route facilities, service providers, and other transport resources. As the logistics transportation industry evolves, the number of transportation vehicles serves as a key quantitative indicator of logistics transportation supply capacity [1]. The adequate and judicious allocation of vehicles significantly influences transportation capacity, efficiency, and service quality. Thoughtful planning and management of vehicle numbers can ensure a balance between supply and demand, relieve transport congestion and delays, and enhance the efficiency and dependability of logistics transport. Additionally, logistics supply plays a crucial role in the supply chain, focusing on logistics transport and distribution services. It involves coordination and collaboration with other supply chain components to optimize the seamless functioning of each logistics process, elevate supply chain efficiency and service levels, and drive the high-quality development of regional economies [2]. The management of logistics supply is considered a pivotal strategy for cost reduction, efficiency enhancement, and competitiveness. It is often referred to as the “third source of profit” for logistics and transport enterprises [3]. However, due to excessive costs, structural mismatches, outdated service quality, and lack of innovation, effective reforms on the supply side are necessary. Key to these reforms is the forecasting of logistics and transport supply, which aids in estimating future transport demand and facilitating the effective planning and allocation of transport resources.
In summary, the main contributions of this paper include the following three points:
  • Multidimensional feature engineering based on historical data has been established. This is used to extract and select features closely related to the model’s prediction task, enhance the interpretability of label data, and improve the model’s generalization ability and predictive performance.
  • A new method for forecasting logistic transportation supply has been introduced. It is based on the enhanced Informer model. The original Informer model did not account for sequential spatial features, so a spatiotemporal convolutional network has been developed to address this. Additionally, an LSTM model has been incorporated to capture the long- and short-term dependencies in the temporal data.
  • In-depth experimental evaluations demonstrate that this method is suitable for long-sequence time series forecasting, exhibiting high predictive accuracy. It shows better capture capabilities for local changes and fluctuations, resulting in overall improved predictive performance.
The remaining parts of the article are as follows: the second part discusses the related work in the vehicle supply forecasting field; the third part introduces the model design, outlining the structure of the improved Informer model proposed; and the fourth part covers the design of the experiments and the analysis of the experimental results. Finally, the fifth part summarizes the entire text and provides an outlook on the next steps for this work.

2. Related Work

2.1. Logistics and Transportation Supply Forecasting Model

In recent years, research related to logistics and transport supply forecasting has gradually been carried out. Domestic and foreign research on logistics supply mainly analyzes supply prediction qualitatively through the introduction of information technology, the Internet of Things [4,5,6,7,8,9,10,11,12], blockchain, big data, system dynamics [13,14,15,16], and other modern high-tech systems. These improve the real-time supervision of the supply system and the performance of product-source tracking, thereby raising the supply capacity and security of the supply system; however, research specifically on logistics and transport supply prediction remains relatively sparse. Solak et al. [17] developed a fuzzy mixed-integer reverse logistics network model for end-of-life vehicles (ELVs) that complies with existing directives in Turkey to evaluate and coordinate supply capacity. Gruzauskas et al. [18] developed a collaborative technological strategy that facilitates information sharing, leading to improved forecasting accuracy and inventory control for better coordination between supply and demand. Mi et al. [19] combined traditional time series analysis with a moving average model to develop a time series model of total potato production and consumption, which they used to forecast medium- and long-term production and consumption trends. Ilaeva et al. [20] used panel data with temporal retrospection and spatial sampling, combining time series analysis with spatial observation analysis for logistics supply forecasting, and found experimentally that panel data have significant advantages over temporal or spatial data alone. Wen et al. [21] proposed an adaptive variational modal decomposition strategy using genetic algorithm (GA) optimization to address the significant impact of time series decomposition parameters on prediction results.
They proposed a real-time decomposition framework and embedded the whole decomposition-and-prediction process into it. Nishino et al. [22] used time series predictive analysis to assess the risk of ignitions and the trend of building fires after an earthquake. Huang et al. [23] established a multi-rule combination prediction of component data based on a multivariate fuzzy time series model, using fuzzy logic relations (FLRs) to explore the adjustment rules between components; a GA was used to find the optimal weights of the predicted values under different rules to derive the combined predictions. Wen et al. [24] proposed a new time series prediction model, LSTM–attention–LSTM, which uses two LSTM models as the encoder and decoder and introduces an attention mechanism between them. Sun et al. [25] designed an improved parallel attention-based long short-term memory (PA-LSTM) prediction model, building on a discussion of time series features, temporal attention mechanisms, and deep learning time series prediction.

2.2. Informer Model

The Informer model is a novel time series prediction model designed to bring new solutions to long-term series prediction tasks through its powerful long-sequence alignment and processing capabilities. The model significantly reduces graphics processing unit (GPU) resource consumption by combining the advantages of the transformer model with the processing power of traditional recurrent neural networks (RNNs), making long-term series prediction tasks more practical and feasible. The architecture and training process of the Informer model involve several key techniques and optimizations, including prob-sparse self-attention, a self-attention distillation mechanism, and a generative decoder. Together, these techniques make the Informer model perform well on long-term series prediction problems. Many scholars combine it with other deep learning methods to enhance prediction accuracy. Wang [26] and Gong et al. [27] proposed new short-term wind power prediction models based on temporal convolutional networks (TCNs) and the Informer model to address the low prediction accuracy caused by large short-term wind speed fluctuations. Wang et al. [28] proposed graph convolutional networks–Informer (GCNs–Informer) as a prediction model, using the GCN to establish relationships between multiple wind turbine arrays and enhance data correlation, and the Informer model to extract temporal information and predict long-term series. Yin et al. [29] proposed a hybrid optimization forecasting strategy consisting of the similar hours (SHs) method and the Informer model. Peng et al. [30] proposed an algorithm based on variational modal decomposition (VMD) and the enhanced chaotic game optimization (ECGO) algorithm, together with an improved solar power forecasting method using the Informer model with a locality-sensitive hashing (LSH) attention mechanism. Li et al. [31] proposed a hybrid algorithm combining ensemble empirical modal decomposition (EEMD) and the Informer model, in which the Informer’s parameters were optimized using particle swarm optimization (PSO). Yang et al. [32] argued that identifying causal relationships between external variables and loads is essential for accurate power load forecasting and therefore proposed an Informer hybrid forecasting method with improved causal reasoning. The model constructed by Xu et al. [33] abandoned the common recurrent neural network for time series and instead adopted a sparse self-attention mechanism as the main body of a seq2seq structure, supplemented by specific input and output modules to handle long-range relationships in the time series, effectively exploiting the parallelism of self-attention and thus improving both prediction accuracy and efficiency. Ma et al. [34] proposed a long-term structure for long-series time series forecasting (LSTF) based on an improved Informer model combined with the fast Fourier transform (FFT), referred to as the FFT–Informer model; this method uses the FFT to represent structural state trend characteristics by extracting the magnitude and phase from a period of a data series. Li et al. [35] proposed an optimization model combining convolutional neural networks (CNNs) and the Informer model to predict the dynamic temperature rise of bearings. Xie et al. [36] proposed a new deep learning network based on an Informer encoder and bi-directional long short-term memory (IE–Bi–LSTM), which uses the encoder part of the Informer model to capture global connections, extracts information-rich long feature sequences from multi-channel sensors, and employs an attentional distillation layer to improve computational efficiency. Gao et al. [37] proposed an improved stacked-integration LSTM–Informer model (ISt–LSTM–Informer) to accurately predict photovoltaic power generation at different time scales. Jiang et al. [38] improved the Informer model for photovoltaic power prediction using the locality-sensitive hashing (LSH) attention mechanism. Zhuang et al. [39] proposed a medium-term photovoltaic power prediction model combining a GCN and the Informer model; this fusion model exploits the multi-output capability of the Informer and provides more reliable feature information through the GCN’s node-level feature extraction, ensuring the accuracy of long-sequence prediction. Zhao et al. [40] proposed an anomaly detection scheme based on a graph attention network (GAT) and the Informer model. Shi et al. [41] used the Informer as the main predictor, efficiently outputting accurate prediction results based on its encoder–decoder architecture and self-attention mechanism.
Therefore, based on existing research, this paper proposes a logistics transportation supply prediction method based on an improved Informer model. Targeting the spatiotemporal evolution patterns of logistics transportation supply volume data, we establish multidimensional feature engineering to extract the feature information of correlated factors, design a spatiotemporal convolutional network inside the Informer model to extract the spatiotemporal features of the data, add an LSTM model to further extract the long- and short-term dependencies of the time series data, and finally predict the future supply volume through a multilayer perceptron.

3. Problem Description and Model Design

This section describes the spatiotemporal evolution patterns of the supply data and the principles of the Informer model’s time series prediction algorithm. It then designs a spatiotemporal convolutional network to extract the spatiotemporal features of the data and proposes a supply prediction method for logistics transportation based on the improved Informer model.

3.1. Problem Description

Logistics and transportation supply capacity is a key external factor in the development of China’s modern logistics system and its freight transportation network. The logistics industry relies on networked operations to ensure transportation efficiency. The logistics transportation supply network has two basic characteristics, spatiotemporal dispersion and fragmentation of volume, which produce significant spatiotemporal evolution patterns and noise in the supply data; moreover, the network is affected by a variety of supply-related factors, making the supply volume difficult to forecast.
Logistics and transportation supply prediction is a time series prediction problem involving historical observations of known length. If the length is $L$, covering times $t_1$ to $t_L$, then the historical observations are $[x_{t_1}, x_{t_2}, \ldots, x_{t_L}]$, and the goal of time series prediction is to find a mapping $f$ that minimizes the error between the predicted values $[\hat{y}_{t_{L+1}}, \hat{y}_{t_{L+2}}, \ldots, \hat{y}_{t_{L+T}}]$ and the true values $[y_{t_{L+1}}, y_{t_{L+2}}, \ldots, y_{t_{L+T}}]$:
$$f(x_{t_1}, x_{t_2}, \ldots, x_{t_L}) = [\hat{y}_{t_{L+1}}, \hat{y}_{t_{L+2}}, \ldots, \hat{y}_{t_{L+T}}]$$
where L denotes the observation length and T is the prediction length.
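The mapping above amounts to sliding-window supervision. A minimal sketch of how (observation, target) pairs of lengths L and T are formed from a series (the series values here are a toy example, not the paper’s data):

```python
import numpy as np

def make_windows(series, L, T):
    """Slice a 1-D series into (observation, target) pairs:
    inputs of length L and prediction targets of length T."""
    X, Y = [], []
    for start in range(len(series) - L - T + 1):
        X.append(series[start:start + L])
        Y.append(series[start + L:start + L + T])
    return np.array(X), np.array(Y)

series = np.arange(100, dtype=float)  # toy supply series
X, Y = make_windows(series, L=8, T=3)
print(X.shape, Y.shape)  # (90, 8) (90, 3)
```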
The paper focuses on the spatiotemporal evolutionary patterns of supply quantity data, designs multidimensional feature engineering to extract the feature information of related factors, and introduces the cutting-edge Informer model in the field of time series forecasting to model supply quantity. Within the Informer model, a spatiotemporal convolutional network is designed to extract data spatiotemporal features, an LSTM model is added to capture long- and short-term memory dependencies, and finally, a multilayer perceptron is used to predict future supply quantities. This creates a logistics transportation supply forecasting method based on an improved Informer model.

3.2. Principles of Informer Model Timing Prediction Algorithm

Existing forecasting methods in the field of time series prediction mostly focus on short-term problems. However, in practical applications, longer sequences place higher demands on a model’s predictive capability. The logistics transportation supply prediction in this paper has a sequence length of 64, which makes it a long-sequence time series task. Recent research [42] has shown that transformer models, with parallel computation and attention mechanisms, have great potential for improving temporal prediction accuracy but also have some limitations. The Informer model is an advanced method based on the transformer architecture designed specifically for long-sequence time series prediction. It addresses limitations of transformer models such as quadratic time complexity, high memory consumption, and the fixed encoder–decoder architecture. It primarily offers the following three significant advantages:
  • Multi-Head Sparse Attention Mechanism
The Informer model consists of an encoder and a decoder. In the encoder, a multi-head sparse attention mechanism (prob-sparse self-attention) is proposed to replace the traditional multi-head self-attention mechanism, allowing parallel independent training of variable feature information in different spaces. Through the integration computation mechanism, overfitting is prevented to a certain extent. The self-attention mechanism, based on tuple input, performs scaled dot-product calculations using Query–Key–Value as shown below [43].
$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{QK^T}{\sqrt{d}}\right)V$$
where $Q$ represents the query matrix, $K$ the key matrix, $V$ the value matrix, $d$ the input variable dimension, and Softmax the activation function. Because the attention scores typically follow a long-tailed distribution, only a few dot products contribute large weights to the overall attention while most contribute very little; the sparse attention mechanism exploits this to reduce the time and memory complexity from $O(L^2)$ to $O(L \log L)$ without significant loss of precision. To evaluate the sparsity of a query, Zhou et al. [43,44] used the Kullback–Leibler (KL) divergence to measure the distance between the query’s attention distribution and a uniform distribution; the larger the divergence, the larger the query’s influence weight, and the queries can be ranked accordingly. The sparsity measure of the $i$-th query vector is shown below.
$$M(q_i, K) = \ln \sum_{j=1}^{L_K} e^{\frac{q_i k_j^T}{\sqrt{d}}} - \frac{1}{L_K} \sum_{j=1}^{L_K} \frac{q_i k_j^T}{\sqrt{d}}$$
To reduce complexity, an approximate sparsity measure is employed [45], in which the log-sum-exp term is replaced by the maximum over the keys, as shown below.
$$\bar{M}(q_i, K) = \max_j\left(\frac{q_i k_j^T}{\sqrt{d}}\right) - \frac{1}{L_K} \sum_{j=1}^{L_K} \frac{q_i k_j^T}{\sqrt{d}}$$
The computational formula for the multi-head sparse attention mechanism at this point is as follows:
$$A(Q, K, V) = \mathrm{Softmax}\left(\frac{\bar{Q}K^T}{\sqrt{d}}\right)V$$
where $\sqrt{d}$ is the scaling factor and $\bar{Q}$ is a sparse matrix of the same size as $Q$ that contains only the query vectors with the largest sparsity measures.
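A minimal NumPy sketch of the max-minus-mean sparsity measure and the resulting prob-sparse attention. For simplicity the measure is computed over all keys, whereas the original algorithm samples a subset of keys; "lazy" queries (those not selected) fall back to the mean of V:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparsity_measure(Q, K):
    """Approximate measure M_bar: max score minus mean score per query."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # (L_Q, L_K)
    return scores.max(axis=1) - scores.mean(axis=1)

def prob_sparse_attention(Q, K, V, u):
    """Keep only the u queries with the largest sparsity measure;
    the remaining queries output the mean of V."""
    M = sparsity_measure(Q, K)
    top = np.argsort(M)[-u:]                        # active query indices
    out = np.tile(V.mean(axis=0), (Q.shape[0], 1))  # default: mean of V
    d = Q.shape[-1]
    attn = softmax(Q[top] @ K.T / np.sqrt(d))
    out[top] = attn @ V                             # full attention for top-u
    return out

rng = np.random.default_rng(0)
Q = rng.standard_normal((10, 4))
K = rng.standard_normal((10, 4))
V = rng.standard_normal((10, 4))
out = prob_sparse_attention(Q, K, V, u=3)
print(out.shape)  # (10, 4)
```

When u equals the number of queries, the result coincides with full scaled dot-product attention.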
  • Self-Attentive Distillation Mechanism
The Informer model’s encoder captures long-range dependencies among the long time series inputs, but the feature maps of the encoding layers contain redundancy. To reduce the high spatial complexity and memory consumption caused by long inputs, the encoder uses a distillation operation that privileges the dominant high-level features, assigning them higher weights while halving the sequence length from layer $j$ to layer $j+1$:
$$X_{j+1}^t = \mathrm{MaxPool}\left(\mathrm{ELU}\left(\mathrm{Conv1d}\left([X_j^t]_{AB}\right)\right)\right)$$
where $X_{j+1}^t$ denotes the result of the distillation operation from layer $j$ to layer $j+1$, $[X_j^t]_{AB}$ denotes the output of the attention module with its sparse attention variables, MaxPool represents max pooling with a stride of two, ELU represents the activation function, and Conv1d represents a one-dimensional convolution.
The distillation technique incorporates one-dimensional convolution and max pooling. After the sequence data have been processed by the multi-head sparse attention mechanism, it is fed directly into the distillation layer, which reduces the network parameters and emphasizes important features. This process involves dimensionality reduction and lowering memory consumption, scaling the output dimensional variables to half of their original length and temporal range. This operation is repeated until the final output is achieved, after which the results are passed to the subsequent multi-head attention module for feature exchange. Specifically, the distillation layer utilizes one-dimensional convolution and max pooling to scale down the dimensions, thereby enhancing the model’s robustness and reducing its memory usage. The construction of this process within the encoder is illustrated in Figure 1.
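The Conv1d → ELU → MaxPool distillation step can be sketched in NumPy. A toy kernel shared across channels stands in for the learned convolution, and pooling uses a window of two with stride two (the paper specifies a stride of two), halving the sequence length:

```python
import numpy as np

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def conv1d_same(x, w, b=0.0):
    """'Same'-padded 1-D convolution along the time axis.
    x: (L, C); w: odd-length kernel shared across channels (toy choice)."""
    k = len(w)
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.tensordot(xp[i:i + k], w, axes=([0], [0]))
                     for i in range(x.shape[0])]) + b

def distill_layer(x, w):
    """Conv1d -> ELU -> MaxPool(stride 2): halves the sequence length."""
    y = elu(conv1d_same(x, w))
    L = y.shape[0] // 2 * 2          # drop a trailing odd step if present
    return np.maximum(y[0:L:2], y[1:L:2])

rng = np.random.default_rng(1)
x = rng.standard_normal((16, 4))     # 16 time steps, 4 channels
w = np.array([0.25, 0.5, 0.25])
y = distill_layer(x, w)
print(x.shape, "->", y.shape)        # (16, 4) -> (8, 4)
```

Stacking the layer repeatedly halves the length at each level, mirroring the encoder pyramid in Figure 1.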
  • Generative Decoder
The decoder of the Informer model includes fully connected layers, full attention layers, and multi-head probability-sparse self-attention layers. To prevent auto-regression from interfering with training, masked sparse multi-head attention is applied to the decoder input, which consists of a target placeholder sequence and part of the historical sequence. The predicted length of the long sequence equals the length of the target placeholder sequence. Placeholder positions are filled with zeros so that information not yet predicted cannot influence the current prediction point. The input sequence $X_{de}^t$ of the decoder at time $t$ consists of two parts: the first is a reference to the historical sequence $X_{token}^t$, and the second is a placeholder sequence $X_0^t$ used to mask the future characterizing factors, as shown below.
$$X_{de}^t = \mathrm{Fusion}\left[\mathrm{Concat}\left(X_{token}^t, X_0^t\right)\right] \in \mathbb{R}^{(L_{token} + L_y) \times d_{model}}$$
where $X_{token}^t$ represents the historical (start token) sequence, matching the output dimension of the encoder, $X_0^t$ represents the target placeholder sequence, which is uniformly set to 0, Concat represents the sequence concatenation function that splices the historical sequence and the target placeholder sequence, and Fusion is the feature fusion operation. The generative decoder of the Informer model decodes by generative inference: it uses multi-head attention to process the intermediate results of the encoder output, adjusts the output data dimensionality through fully connected layers, and applies inverse normalization to the output. During training, the entire prediction is generated at once, eliminating the need to rely on the output of the previous step and significantly enhancing the computational efficiency of decoding.
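The decoder input construction above (start token plus zero placeholder) can be sketched as follows; the Fusion step is omitted and the array shapes are illustrative:

```python
import numpy as np

def build_decoder_input(history, L_token, L_y):
    """Decoder input: the last L_token historical steps concatenated with
    an all-zero placeholder of the prediction length L_y."""
    token = history[-L_token:]                       # (L_token, d)
    placeholder = np.zeros((L_y, history.shape[1]))  # masked future steps
    return np.concatenate([token, placeholder], axis=0)

history = np.random.rand(64, 7)   # 64 steps, 7 features
X_de = build_decoder_input(history, L_token=32, L_y=16)
print(X_de.shape)  # (48, 7)
```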

3.3. Improved Informer Modeling

The above study provides a detailed analysis of the basic principles of the Informer model and its adaptability in long-sequence time series prediction. The Informer model exhibits significant advantages in long-sequence time series prediction. However, there is a limitation in that it does not consider sequence spatial features. In response to the characteristics of logistics transportation supply data, this paper primarily focuses on improving the Informer model in the following two aspects: (1) establishing multidimensional feature engineering for input data to increase data complexity and enhance model representational capacity and (2) designing spatiotemporal convolutional networks to extract spatiotemporal features of supply quantity, incorporating LSTM modules to further extract both short-term and long-term temporal dependencies.

3.3.1. Multidimensional Feature Engineering Construction for Supply Data

Feature engineering is mainly used to extract and select the features closely related to the model prediction task, including the steps of feature selection and encoding, etc. Suitable features can reduce redundancy and noise, enhance data complexity, and improve the model’s generalization ability and prediction effect. This paper mainly analyzes demand, freight rate, and spatiotemporal evolution patterns in the prediction task where supply is the output label.
For the input supply volume data, this paper mines the time information, adding weekend, quarter, day-of-year, and month global timestamps to construct the feature engineering, and encodes the resulting labels. For example, the quarter timestamp takes values in the range [1, 4], where the first quarter covers January to March, the second quarter April to June, the third quarter July to September, and the fourth quarter October to December. The feature construction and coding are shown in Table 1.
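These timestamp features can be derived with Python’s standard library alone. The exact coding scheme is the paper’s Table 1; the encodings below are illustrative:

```python
from datetime import date

def timestamp_features(d: date) -> dict:
    """Global timestamp features: weekend flag, quarter, day-of-year,
    and month (illustrative encodings, not the paper's exact table)."""
    return {
        "is_weekend": 1 if d.isoweekday() >= 6 else 0,
        "quarter": (d.month - 1) // 3 + 1,     # Jan-Mar -> 1, ..., Oct-Dec -> 4
        "day_of_year": d.timetuple().tm_yday,  # 1..366
        "month": d.month,                      # 1..12
    }

print(timestamp_features(date(2024, 7, 27)))
# a Saturday in the third quarter: is_weekend 1, quarter 3
```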
Table 1 is used to construct seven-dimensional feature data for the supply volume, which is closely related to demand, freight rates, spatiotemporal evolution patterns, and other factors. To further quantify the correlation between these influencing factors and the supply volume, this section introduces the Pearson correlation coefficient and performs a Pearson correlation analysis of the influencing factors one by one, time step by time step, as shown below:
$$r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}}$$
where $X_i$ and $Y_i$ are the values of the influencing factor and the indicator at the $i$-th time step, $\bar{X}$ and $\bar{Y}$ are their respective mean values, and $n$ is the sample size.
The Pearson correlation coefficient value domain is [−1, 1], where a value close to 1 indicates a positive correlation; the classification of the correlation coefficient varies in different industries and fields, and the interpretation of the correlation coefficient criterion is not universal. This paper uses the literature in the field of time series prediction [46] to obtain the classification of the correlation coefficient, as shown in Table 2.
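The Pearson correlation coefficient above can be computed directly. A small NumPy sketch with toy demand and supply series (hypothetical values, not the paper’s data):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance of the mean-centred series divided
    by the product of their norms."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

demand = np.array([10., 12., 15., 13., 18., 20.])   # toy influencing factor
supply = np.array([ 8., 11., 14., 12., 17., 19.])   # toy supply volume
r = pearson_r(demand, supply)
print(round(r, 3))  # strongly positive, close to 1
```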
As the later stages require sequential spatiotemporal information, the positional and temporal information of the input sequences is encoded into vector form. Positional encoding generates position vectors using sine and cosine functions, which are then added to the embedding layer for processing. Let the length of the input sequence be $L$, the position be $p$, the embedding vector dimension be $d_{model}$, and the embedding dimension index be $i$; the position encoding vector $E$ is then given by the following equation. The specific position encoding steps are shown in Figure 2.
$$E_{(p, 2i)} = \sin\left(p \,/\, (2L)^{2i/d_{model}}\right), \qquad E_{(p, 2i+1)} = \cos\left(p \,/\, (2L)^{2i/d_{model}}\right)$$
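A NumPy sketch of this positional encoding, using $(2L)$ as the frequency base as in the equation above (the classic transformer uses a fixed base of 10000); an even `d_model` is assumed:

```python
import numpy as np

def positional_encoding(L, d_model):
    """Sine/cosine positional encoding with base (2L); d_model must be even."""
    E = np.zeros((L, d_model))
    pos = np.arange(L)[:, None]           # positions p = 0..L-1
    i = np.arange(0, d_model, 2)          # even embedding dimensions
    denom = (2 * L) ** (i / d_model)      # (2L)^{2i/d_model} per sin/cos pair
    E[:, 0::2] = np.sin(pos / denom)
    E[:, 1::2] = np.cos(pos / denom)
    return E

E = positional_encoding(L=64, d_model=8)
print(E.shape)  # (64, 8)
```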
Similarly, time encoding converts dates to timestamp encodings; the unit time step in this paper is the day (d), and the timestamp generated after encoding is a three-dimensional variable comprising the day-of-week, day-of-month, and day-of-year. In addition, since the supply volume and the features are different types of data with differing magnitudes and units, this paper applies the Z-score normalization method to each of them separately, calculated as shown below.
$$Z = \frac{x - \bar{x}}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}}$$
where $Z$ represents the normalization result, $x$ denotes the input value, $\bar{x}$ denotes the mean of the sample data, $x_i$ represents the $i$-th sample value, and $N$ represents the number of samples. Timestamp normalization maps the day-of-week, day-of-month, and day-of-year to the uniform range [−0.5, 0.5] via the following conversion formulas.
$$D_W = \frac{d_w - 1}{6} - 0.5, \qquad D_M = \frac{d_m - 1}{30} - 0.5, \qquad D_Y = \frac{d_y - 1}{365} - 0.5$$
where $d_w$ denotes the day-of-week, with values in [1, 7]; $d_m$ denotes the day-of-month, with values in [1, 31]; $d_y$ denotes the day-of-year, with values in [1, 366]; and $D_W$, $D_M$, and $D_Y$ are the normalized results of $d_w$, $d_m$, and $d_y$, respectively.
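The two normalizations can be sketched together; the Z-score here uses the population standard deviation, matching the $\frac{1}{N}$ in the formula above:

```python
import numpy as np

def z_score(x):
    """Z-score: subtract the sample mean, divide by the population std."""
    return (x - x.mean()) / x.std()

def normalize_days(dw, dm, dy):
    """Map day-of-week/month/year onto the uniform range [-0.5, 0.5]."""
    DW = (dw - 1) / 6 - 0.5
    DM = (dm - 1) / 30 - 0.5
    DY = (dy - 1) / 365 - 0.5
    return DW, DM, DY

print(normalize_days(1, 1, 1))     # (-0.5, -0.5, -0.5)
print(normalize_days(7, 31, 366))  # (0.5, 0.5, 0.5)
```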

3.3.2. Spatiotemporal Convolutional Network to Extract Spatiotemporal Features

The concatenated features are fed into the Informer model, with model parameters configured to accommodate the concatenated spatiotemporal data, including the sizes of the encoder and decoder, the attention mechanism, and the sequence length. This setup aims to further enhance the accuracy and fitting capability of time series predictions. As illustrated in Figure 3, this paper proposes a spatiotemporal convolutional network (STCN module) that applies spatiotemporal convolution to the output values of the Informer model’s decoder, facilitating the extraction of spatiotemporal features from the data. The STCN module comprises four main convolutional layers, each designed to extract features across different spatiotemporal dimensions.
The local spatial convolutional layer (st_region_convs) employs a standard convolutional kernel to capture the spatial information in the neighborhood of each data point. To maintain consistent spatial dimensions, the inputs and outputs use a padding strategy with a padding size of (1, 1) while extracting local spatial features. Let $X \in \mathbb{R}^{H \times W \times C}$ be the input feature matrix, where $H$, $W$, and $C$ denote the height, width, and number of channels, respectively, and let the st_region_convs layer use a $3 \times 3$ convolution kernel $K_s$, as expressed in Equation (12). Next, the temporal feature extraction convolutional layer (st_pair_t_convs) and the spatial feature extraction convolutional layer (st_pair_n_convs) are set up to optimize along the temporal and spatial dimensions, respectively. The st_pair_t_convs layer uses a $1 \times 3$ convolution kernel that focuses on capturing temporal continuity without changing the spatial dimensions, maintaining a length of one in the spatial dimension and helping to capture the dynamics of the time series in more detail, as shown in Equations (13) and (14). The st_pair_n_convs layer employs a $3 \times 1$ convolution kernel, aiming to strengthen the spatial relationships within a single time step and enhance the model’s understanding of spatial dynamics by expanding the spatial context, as shown below.
Y_s = X ∗ K_s + b_s   (12)
Y_t = X ∗ K_t + b_t   (13)
Y_n = X ∗ K_n + b_n   (14)
where ∗ denotes the convolution operation; b_s, b_t, and b_n denote the bias terms; Y_s, Y_t, Y_n ∈ ℝ^{H×W×C}; and K_s ∈ ℝ^{3×3×C}, K_t ∈ ℝ^{1×3×C}, and K_n ∈ ℝ^{3×1×C}. To synthesize the features extracted by the different convolutional layers and reduce the feature dimensions, a feature compression convolutional layer (st_condense) is designed. It uses a 1 × 1 convolutional kernel K_c to fuse the features of the three aforementioned convolutional layers and map them to a lower-dimensional space, realizing feature compression and integration, reducing computational complexity, and retaining key spatiotemporal information, as shown below.
Y_c = (Y_s ⊕ Y_t ⊕ Y_n) ∗ K_c + b_c   (15)
where ⊕ denotes the feature concatenation operation and Y_c ∈ ℝ^{H×W×C} is the compressed output feature matrix. Through this spatiotemporal convolutional network design, spatiotemporal features are effectively extracted and fused while the spatial structure of the time series data is kept unchanged, providing an informative and computationally efficient feature representation for the subsequent time series prediction task.
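As a concrete illustration of the four layers above, the STCN module can be sketched in PyTorch (the framework listed in Section 4.1). The layer names follow the paper; the channel count, the batch shape, and the choice to map the fused features back to the input channel dimension are assumptions made only for this sketch:

```python
import torch
import torch.nn as nn

class STCN(nn.Module):
    """Sketch of the spatiotemporal convolutional network described above."""
    def __init__(self, channels: int):
        super().__init__()
        # local spatial features: 3x3 kernel, padding (1, 1) keeps H x W unchanged
        self.st_region_convs = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # temporal features: 1x3 kernel along the time axis, spatial length of one
        self.st_pair_t_convs = nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1))
        # spatial relations within a single time step: 3x1 kernel
        self.st_pair_n_convs = nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0))
        # 1x1 fusion/compression of the three concatenated feature maps
        self.st_condense = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y_s = self.st_region_convs(x)
        y_t = self.st_pair_t_convs(x)
        y_n = self.st_pair_n_convs(x)
        # concatenate along the channel axis, then compress with the 1x1 kernel
        return self.st_condense(torch.cat([y_s, y_t, y_n], dim=1))

x = torch.randn(8, 16, 10, 10)  # (batch, channels, H, W)
y = STCN(16)(x)                 # same spatial shape as the input
```

Because every branch preserves H × W and the 1 × 1 convolution only mixes channels, a module of this form can sit between the decoder output and the LSTM module without altering the sequence layout.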

3.3.3. LSTM Module for Extracting Long- and Short-Term Temporal Dependencies

After the decoder output has been processed by the STCN module, this paper introduces a long short-term memory network (LSTM module) to extract the long- and short-term dependencies of the temporal data. The LSTM module can mitigate the gradient explosion and vanishing problems that arise when learning long-term dependencies [47]; its output is passed to a fully connected layer, and the predicted values are obtained through a multilayer perceptron. The structure of the LSTM module is shown in Figure 4. The forget gate determines which information is discarded from the cell state by reading the previous output and the current input and performing a nonlinear mapping; its output vector is multiplied with the cell state. The input gate determines which new information is stored in the cell state: the old state is multiplied by the forget gate output to discard information, and the candidate values are added to complete the cell-state update. Finally, based on the cell state, the result is emitted through the output gate. The overall working principle is shown below [48].
f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)
where x_t represents the input at the current moment, C̃_t represents the candidate state, h_{t−1} indicates the hidden state of the previous moment, f_t denotes the forget gate control signal, i_t the input gate control signal, o_t the output gate control signal, C_{t−1} the memory cell (cell state) of the previous moment, C_t the memory cell of the current moment, and h_t the hidden state of the current moment.
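A minimal NumPy sketch of one step of these gate equations; the dimensions and random weights below are illustrative only, not the parameters used in the model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    """One LSTM step; each weight matrix acts on the concatenation [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W_f @ z + b_f)        # forget gate: what to discard from the cell state
    i_t = sigmoid(W_i @ z + b_i)        # input gate: what new information to store
    c_tilde = np.tanh(W_c @ z + b_c)    # candidate state
    c_t = f_t * c_prev + i_t * c_tilde  # cell-state update
    o_t = sigmoid(W_o @ z + b_o)        # output gate
    h_t = o_t * np.tanh(c_t)            # hidden state for the current moment
    return h_t, c_t

# toy sizes: hidden state of 4, input of 3
rng = np.random.default_rng(0)
H, D = 4, 3
Ws = [rng.normal(size=(H, H + D)) for _ in range(4)]
bs = [np.zeros(H) for _ in range(4)]
h_t, c_t = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), *Ws, *bs)
```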

3.3.4. Improving Supply Forecasting with the Informer Model

This paper addresses the issue of supply volume forecasting by first constructing a multidimensional feature engineering framework aimed at extracting key features from various dimensions to enhance the expressiveness of label data. This involves capturing features related to spatiotemporal evolution characteristics, freight rates, and demand volume, thereby comprehensively revealing the intrinsic properties of the data and improving the model’s interpretability of the label data. Next, an embedding layer is introduced to facilitate effective encoding of data features, providing rich input information for subsequent model learning. Additionally, the STCN module is designed to further extract the spatiotemporal features of the supply volume data, considering dynamic information across time and space. Finally, an LSTM model is incorporated to capture both short-term and long-term dependencies in the supply volume, enhancing the model’s ability to grasp long-range dependencies. The prediction values are subsequently obtained through a multilayer perceptron. Overall, the structure of the improved Informer model is illustrated in Figure 5.

4. Experiment and Analysis

In this section, the development environment and data sources of the experiments are introduced, and the model parameters and evaluation indexes are specified. The experimental results of the improved Informer model are then analyzed in detail against single-prediction and combined-prediction models, and the experiments are additionally validated on an open-source dataset.

4.1. Experimental Environment and Data

The experiments used an AMD Ryzen 7 5800H CPU and an NVIDIA GeForce RTX 3060 laptop GPU. The development environment was Windows 11 with PyCharm 2022.1.1, using the PyTorch 1.8.0+cu111 framework, Pandas 1.1.5, and other software packages; the development language was Python 3.8. To verify the practicality of the proposed method, logistics transportation supply data from a less-than-truckload (LTL) logistics company, covering a transportation route from W city to K city between 29 April 2018 and 6 August 2022, were introduced; after removing anomalous and missing records, 1320 days of data remained. The dataset was ordered by day along the time axis and divided into training, test, and validation sets in a ratio of 7:2:1.

4.2. Parameterization and Evaluation Indicators

Comparative experiments were conducted on the above data using single-prediction models, combined-prediction models, and the method proposed in this paper. The specific parameter settings of the proposed supply quantity prediction model are shown in Table 3.
To validate the performance of the supply quantity prediction model, the fit on the supply quantity test set was evaluated. In this paper, prediction performance is comprehensively assessed using two indicators, the mean square error (MSE) and the mean absolute error (MAE), which measure the difference between the model's predicted values and the actual observed values; they are among the most widely used indicators for assessing the performance of regression models.
MSE = (1/N) Σ_{i=1}^{N} (ŷ_i − y_i)²
The mean absolute error is a common measure of the accuracy of a model’s predictions, characterizing the average size of the absolute value of the difference between the model’s output predictions and the actual values.
MAE = (1/N) Σ_{i=1}^{N} |ŷ_i − y_i|
where y_i represents the true value, ŷ_i represents the predicted value, and N represents the sample size of the test set.
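Both indicators are straightforward to compute; a minimal NumPy version of the two formulas above:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error: average of the squared prediction errors."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_pred - y_true) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

# errors of 1, 0, and -2 give MSE = 5/3 and MAE = 1
y_true = [10.0, 12.0, 9.0]
y_pred = [11.0, 12.0, 7.0]
```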

4.3. Experiments and Analysis of Results

4.3.1. Multidimensional Feature Engineering of Data

The supply volume, like the demand volume, exhibits temporal characteristics, and freight rates are a key influencing factor for changes in supply volume. The temporal evolution of supply volume and freight rates is illustrated in Figure 6, where (a) shows the time series of supply and (b) the time series of freight costs; the dashed lines represent trend lines. Overall, the supply volume shows an upward trend over time with recurring seasonality. In contrast, freight rates remain relatively stable, exhibiting a slight upward trend punctuated by sharp fluctuations on specific days each year. These fluctuations are closely related to factors such as transportation industry dynamics and unforeseen social events.
According to the Pearson correlation coefficient, the correlation coefficient between the supply quantity and each influencing factor is calculated to assess the linear relationship between the different factors and supply quantity. The Pearson correlation coefficient calculation results of the supply indicators are plotted in Figure 7, with a darker color representing a stronger correlation.
According to the analysis of supply correlation characteristics in Table 4, the correlation coefficient between supply (Car-num) and demand (Order-num) is 0.85, indicating a very strong correlation; freight rates and the temporal evolution features are also correlated with supply. Based on the influencing factors that correlate with supply, this paper constructs the feature engineering of the input data shown in Figure 8 to enhance the expressiveness of the input data and improve the supply prediction model's performance.
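A minimal sketch of the Pearson coefficient calculation; the six sample values are taken from the example rows of Table 4 and merely stand in for the full series:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    # covariance of the centered series divided by the product of their norms
    return float((xm @ ym) / np.sqrt((xm @ xm) * (ym @ ym)))

order_num = np.array([45, 112, 91, 1053, 889, 875], float)  # demand (Order-num)
car_num = np.array([1, 1, 1, 20, 25, 47], float)            # supply (Car-num)
r = pearson_r(order_num, car_num)  # strongly positive on this toy sample
```

On a full DataFrame, pandas' DataFrame.corr(method='pearson') produces the same coefficients for all factor pairs at once, which is the form visualized in Figure 7.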

4.3.2. Comparative Experiments on Single-Predictive Models

To validate the predictive performance of the supply forecasting method proposed in this paper, a comparative experiment was conducted using a single-prediction model. The enhanced Informer model was compared against several classic time series forecasting models, including the linear auto-regressive integrated moving average (ARIMA) model, the nonlinear support vector regression (SVR) model, the LSTM model, the gated recurrent units (GRUs) model, and the backpropagation neural network (BPNN) model, as well as the original Informer and InformerStack models. This comparison aimed to assess the fitting capability of the proposed supply forecasting model for the logistics transportation supply volume. The results of the comparative experiments between the proposed method and the single-prediction models are presented in Table 5.
From Table 5, it can be observed that compared to single-prediction models such as ARIMA, SVR, LSTM, GRU, BPNN, Informer, and InformerStack, the proposed method yields smaller MSE and MAE results, demonstrating superior performance in supply forecasting accuracy. To further validate and visually illustrate the predictive performance of the proposed method for supply volume, a visualization of the fitting results from the comparative experiments on the test set is presented. The results of the single prediction model comparison are depicted in Figure 9, where Enhanced_Informer represents the predicted values of the proposed method and the legend includes the corresponding supply volume predictions from the other models, with TRUE denoting the actual supply volume values.
In the comparative experiments of single-prediction models, the Enhanced_Informer model demonstrates a superior degree of fit for supply volumes, resulting in a reduced discrepancy between predicted and actual values. Specifically, the ARIMA model primarily captures the linear characteristics of supply volume but performs poorly on nonlinear traits; the proposed method reduces MSE by 73.8% and MAE by 52% compared to the ARIMA model. The support vector regression (SVR) model fits predictions by identifying a plane or “hyperplane” in a high-dimensional space of data points. In comparison, the proposed method achieves a reduction of 79.36% in MSE and 59.5% in MAE. When compared to the LSTM and GRU models, which utilize gated units for long-term predictions, the proposed method reduces MSE by 82.24% and 78.58%, respectively, with corresponding reductions in MAE of 60.36% and 57.52%. Additionally, in comparison to the BPNN model, known for its efficacy in extracting nonlinear features, the proposed method results in a 77.02% decrease in MSE and a 53.9% decrease in MAE. In the experiments involving long-sequence forecasting algorithms, the proposed method enhances data representation capabilities by incorporating multidimensional feature engineering, introducing GCN for extracting spatial features of transport routes, and integrating STCN and LSTM modules to bolster long-sequence dependency. This method results in a reduction of 53.96% in MSE and 31.21% in MAE compared to the original Informer model. The InformerStack model, which is a stacked version of the Informer model, possesses additional layer structures for module stacking; the proposed method demonstrates a reduction of 40.38% in MSE and 36.58% in MAE when compared to the InformerStack model. Overall, it is evident that the proposed Enhanced_Informer model provides higher accuracy in supply volume predictions in the context of single-prediction model comparisons.

4.3.3. Comparative Experiments on Integrated Predictive Models

To verify the performance of the proposed supply forecasting method in comparison with recently popular ensemble models composed of linear and nonlinear approaches, comparative experiments involving ensemble models were conducted. These included the ARIMA + BPNN model, ARIMA + GRU model, and ARIMA + LSTM model. The aim was to analyze the fitting effectiveness of the proposed supply forecasting model. The results of the comparative experiments between the proposed method and the ensemble prediction models are presented in Table 6.
To further validate and visually demonstrate the predictive performance of the proposed method for supply volume, a visualization of the fitting results from the comparative experiments on the test set is presented. The results of the ensemble model comparisons are shown in Figure 10. The supply volume predicted by the Enhanced_Informer model exhibits a smaller discrepancy from the true values compared to the ensemble prediction models, indicating a greater ability to capture the local signal fluctuations of the true values and demonstrating improved model fit quality.
In the comparative experiments involving ensemble models formed by combining linear and nonlinear approaches, the proposed method demonstrates significant improvements in predictive performance. Specifically, compared to the ARIMA + BPNN ensemble model, the proposed method reduces MSE by 74.88% and MAE by 51.31%. When compared to the ARIMA + GRU ensemble model, the proposed method achieves a reduction of 71.56% in MSE and 50% in MAE, effectively halving the error metrics. In comparison with the ARIMA + LSTM ensemble model, the proposed method results in a 74.07% decrease in MSE and a 52.02% decrease in MAE. These findings indicate that the enhanced Informer model proposed in this paper exhibits superior fitting performance for supply volume predictions within the context of ensemble prediction models.

4.3.4. Comparative Experiments on Open-Source Datasets

To further validate the diversity of the dataset used in the predictive model, the proposed method was tested on the open-source ETTh1 dataset associated with the original Informer model. Comparative experiments were conducted using both single models and ensemble models. The effectiveness of the proposed method was assessed through visualizations of the fitting results between the predicted and actual values from the test set. The visualization results for the test set fittings are presented in Figure 11 and Figure 12, where “TRUE” denotes the actual values and the labels for other models represent the predicted values of both the proposed model and the comparison models for the corresponding time series in the test set.
In the comparative experiments conducted using the open-source dataset, the enhanced predictive model presented in this paper yields conclusions consistent with the supply volume prediction data. On the ETTh1 dataset, the ensemble models demonstrate overall better fitting performance compared to the single models, exhibiting improved model accuracy and robustness. However, the proposed enhanced Informer model outperforms the ensemble prediction models, further indicating that the results from the ETTh1 dataset experiments suggest the superior fitting capability of the proposed model for long-sequence time series data.
In summary, through two sets of comparative experiments using empirical data on logistics transportation supply volume and the open-source ETTh1 dataset, the results consistently show that the proposed enhanced Informer model achieves high prediction accuracy for long-sequence time series forecasting. Additionally, the visual fitting charts indicate a better ability to capture local changes and fluctuations, thereby demonstrating overall improved predictive performance.

5. Conclusions

In the context of the road transportation environment in northwest China, vehicle supply forecasting faces several challenges. This region is characterized by vast geographic space and relatively low population density, which makes an efficient and smooth transportation network essential. The long sequences of supply volume data, numerous influencing factors, and significant spatiotemporal evolution patterns contribute to the low accuracy of traditional supply forecasting methods. Consequently, logistics companies urgently require improved forecasting methods to better manage vehicle scheduling and resource allocation.
This paper examines the logistics supply data from a certain consolidated freight logistics company, addressing issues related to long sequences of supply volume data and the multitude of influencing factors, which hinder prediction accuracy. This paper proposes a logistics supply forecasting method based on an enhanced Informer model. Building upon the Informer model, we design a multidimensional feature engineering approach and a spatiotemporal convolutional network while integrating an LSTM module to improve prediction performance.
According to the experimental results, the proposed enhanced Informer model demonstrates higher prediction accuracy for logistics supply volume, effectively reducing forecasting errors and significantly capturing local variation features. This improvement in fitting quality is crucial for enhancing supply–demand matching, vehicle scheduling, and decision-making in modern logistics transportation systems. In future work, we aim to incorporate a more diverse range of data sources to enhance the model’s overall predictive capability. Additionally, we will explore the development of adaptive supply forecasting strategies tailored to different traffic conditions.

Author Contributions

Conceptualization, D.G. and Y.Q.; methodology, D.G., P.J. and Y.Q.; software, D.G.; validation, D.G., P.J. and Y.Q.; formal analysis, D.G. and P.J.; investigation, X.Z., J.Z. and Y.Q.; resources, D.G. and Y.Q.; data curation, D.G., P.J. and Y.Q.; writing—original draft preparation, D.G.; writing—review and editing, D.G., P.J. and Y.Q.; supervision, D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Key Research and Development Program Project of the Department of Science and Technology of the Autonomous Region (grant no. 2022B01015) and the Science and Technology Program Project of the Bureau of Ecology, Environment and Industrial Development of Ganquanbao Economic and Technological Development Zone (Industrial Zone) (grant no. GKJ2023XTWL04).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Jinquan Zhang was employed by the Xinjiang Hualing Logistics & Distribution Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Xu, Z.; Wen, J.; Wang, X.; Liu, N. Optimization study of low-carbon vehicle paths for cold chain logistics with improved refrigeration cost—Based on ALNS genetic algorithm. Soft Sci. 2024, 38, 92–100. [Google Scholar]
  2. Wang, C.; Yang, X.; Shi, L. Research on the impact of Chengdu-Chongqing supply chain hub on regional economic development based on system dynamics perspective. Railw. Trans. Econ. 2023, 45, 131–139. [Google Scholar]
  3. Tang, H. Do a good job of logistics management, digging deep into the enterprise’s third profit source. Taxation 2018, 13, 130. [Google Scholar]
  4. Pan, Y. Application of information technology in the development of agricultural logistics. Guangdong Seric. 2020, 54, 81–82. [Google Scholar]
  5. Zhang, L. Research on the coupling and interaction mechanism of agricultural logistics capacity and agricultural economic development. Agric. Econ. 2019, 4, 141–142. [Google Scholar]
  6. Li, D.; Yang, H. Analysis of research progress and development trend of agricultural internet of things technology. Agric. Mach. 2018, 49, 1–20. [Google Scholar]
  7. Zhang, Y.; Liu, J.; Chen, C. Design of Traceability System for Agricultural Products by Applying Hyperledger Fabric and IoT Technology. J. Shanxi Agric. Univ. 2022, 42, 12–23. [Google Scholar]
  8. Mirabelli, G.; Solina, V. Blockchain and agricultural supply chains traceability: Research trends and future challenges. Procedia Manuf. 2020, 42, 414–421. [Google Scholar] [CrossRef]
  9. Pincheira, M.; Ali, M.; Vecchio, M.; Giaffreda, R. Blockchain-based traceability in Agri-Food supply chain management: A practical implementation. In Proceedings of the 2018 IoT Vertical and Topical Summit on Agriculture-Tuscany (IOT Tuscany), Tuscany, Italy, 8–9 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–4. [Google Scholar]
  10. Misra, N.N.; Dixit, Y.; Al Mallahi, A.; Bhullar, M.; Upadhyay, R.; Martynenko, A. IoT, big data and artificial intelligence in agriculture and food industry. IEEE Internet Things J. 2020, 9, 6305–6324. [Google Scholar] [CrossRef]
  11. Lokers, R.; Knapen, R.; Janssen, S.; van Randen, Y.; Jansen, J. Analysis of big data technologies for use in agro-environmental science. Environ. Model. Softw. 2016, 84, 494–504. [Google Scholar] [CrossRef]
  12. Jahani, R.; Jain, R.; Ivanov, D. Data science and big data analytics: A systematic review of methodologies used in the supply chain and logistics research. Ann. Oper. Res. 2023, 7, 11. [Google Scholar] [CrossRef]
  13. He, M.; An, Y.; Pu, J. Dynamic Modeling and Optimization of Modern Agricultural Logistics System in Jiangsu Province. J. Jiangsu Univ. 2021, 42, 562–568. [Google Scholar]
  14. Li, X.; Wang, Y.; Wang, F. Research on the application mode and implementation path of block chain technology in cross border logistics. Contemp. Econ. Manag. 2020, 42, 32–39. [Google Scholar]
  15. Lu, M.; Jiang, X.; Liu, S.; Lin, Q. Research on blocking database of power material supply chain based on system dynamics. J. Xi’an Univ. Technol. 2021, 37, 580–587. [Google Scholar]
  16. Liu, Y.; Tian, Q.; Lu, D. Research on interactive development of agriculture and logistics industry based on system dynamics. Chin. J. Agric. Mech. Chem. 2019, 40, 222–228. [Google Scholar]
  17. Solak, A. Squatter housing transformations in Turkey after 2002: Public choice perspective. Int. J. Hous. Policy 2021, 21, 612–625. [Google Scholar] [CrossRef]
  18. Gruzauskas, V.; Gimzauskiene, E.; Navickas, V. Forecasting accuracy influence on logistics clusters activities: The case of the food industry. J. Clean. Prod. 2019, 240, 118225.1–118225.13. [Google Scholar] [CrossRef]
  19. Mi, J.; Luo, Q.; Gao, M.; Zhang, H. Research on the medium- and long-term supply and demand balance of potato. China Agric. Resour. Zoning 2015, 36, 27–34. [Google Scholar]
  20. Ilaeva, Z.; Alikhadzhieva, D.; Pashaev, M. Transport Logistics as a Tool of Interaction Between Transport Streams. SHS Web Conf. 2023, 172, 02034. [Google Scholar] [CrossRef]
  21. Wen, J.; Wang, M.; Liu, J. Multi-step prediction algorithm for time series based on time series decomposition and random forest. J. East China Univ. Sci. Technol. 2023, 49, 873–881. [Google Scholar]
  22. Nishino, T.; Hokugo, A. A stochastic model for time series prediction of the number of post-earthquake fire ignitions in buildings based on the ignition record for the 2011 Tohoku Earthquake. Earthq. Spectra 2020, 36, 232–249. [Google Scholar] [CrossRef]
  23. Huang, H.; Tian, Y.; Tao, Z. Multi-rule combination prediction of compositional data time series based on multivariate fuzzy time series model and its application. Expert Syst. Appl. 2024, 238, 121966. [Google Scholar] [CrossRef]
  24. Wen, X.; Li, W. Time Series Prediction Based on LSTM-Attention-LSTM Model. IEEE Access. 2023, 11, 48322–48331. [Google Scholar] [CrossRef]
  25. Sun, J.; Guo, W. Time Series Prediction Based on Time Attention Mechanism and LSTM Neural Network. In Proceedings of the 2023 IEEE International Conference on Integrated Circuits and Communication Systems (ICICACS), Raichur, India, 24–25 February 2023; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar]
  26. Wang, S.; Chang, L.; Liu, H.; Chang, Y.; Xue, Q. Short-term prediction of wind power based on temporal convolutional network and the informer model. IET Gener. Transm. Distrib. 2023, 18, 941–951. [Google Scholar] [CrossRef]
  27. Gong, M.; Yan, C.; Xu, W.; Zhao, Z.; Li, W. Short-term wind power forecasting model based on temporal convolutional network and Informer. Energy 2023, 283, 129171. [Google Scholar] [CrossRef]
  28. Wang, H.; Li, D.; Chen, F.; Du, J.; Song, K. GCNInformer: A combined Deep learning model based on GCN and Informer for wind power forecasting. Energy Sci. Eng. 2023, 11, 3836–3854. [Google Scholar] [CrossRef]
  29. Yin, Z.; Gong, M.; Sun, J.; Han, C.; Jing, L. A new hybrid optimization prediction strategy based on SH-Informer for district heating system. Energy 2023, 282, 129010. [Google Scholar]
  30. Peng, T.; Fu, Y.; Wang, Y.; Xiong, J.; Suo, L. An intelligent hybrid approach for photovoltaic power forecasting using enhanced chaos game optimization algorithm and Locality sensitive hashing based Informer model. J. Build. Eng. 2023, 78, 107635. [Google Scholar] [CrossRef]
  31. Li, F.; Wan, Z.; Thomas, K.; Zan, G.; Li, M. Improving the accuracy of multi-step prediction of building energy consumption based on EEMD-PSO-Informer and long-time series. Comput. Electr. Eng. 2023, 110, 108845. [Google Scholar] [CrossRef]
  32. Yang, K.; Shi, F. Medium- and Long-Term Load Forecasting for Power Plants Based on Causal Inference and Informer. Appl. Sci. 2023, 13, 13. [Google Scholar] [CrossRef]
  33. Xu, H.; Peng, Q.; Wang, Y.; Zhan, Z. Power-Load Forecasting Model Based on Informer and Its Application. Energies 2023, 16, 7. [Google Scholar] [CrossRef]
  34. Ma, J.; Dan, J. Long-Term Structural State Trend Forecasting Based on an FFT–Informer Model. Appl. Sci. 2023, 13, 2553. [Google Scholar] [CrossRef]
  35. Li, H.; Liu, C.; Yang, F.; Ma, X.; Guo, N. Dynamic Temperature Prediction on High-Speed Angular Contact Ball Bearings of Machine Tool Spindles Based on CNN and Informer. Lubricants 2023, 11, 343. [Google Scholar] [CrossRef]
  36. Xie, X.; Huang, M.; Liu, Y.; An, Q. Intelligent Tool-Wear Prediction Based on Informer Encoder and Bi-Directional Long Short-Term Memory. Machines 2023, 11, 94. [Google Scholar] [CrossRef]
  37. Gao, Y.; Liu, G.; Luo, D.; Bavirisetti, D.P.; Xiao, G. Multi-timescale photovoltaic power forecasting using an improved Stacking ensemble algorithm based LSTM-Informer model. Energy 2023, 283, 128669. [Google Scholar]
  38. Jiang, Y.; Fu, K.; Huang, W.; Zhang, J.; Li, X. Ultra-short-term PV power prediction based on Informer with multi-head probability sparse self-attentiveness mechanism. Front. Energy Res. 2023, 11, 1301828. [Google Scholar] [CrossRef]
  39. Zhuang, W.; Li, Z.; Wang, Y.; Xi, Q.; Xia, M. GCN–Informer: A Novel Framework for Mid-Term Photovoltaic Power Forecasting. Appl. Sci. 2024, 14, 2181. [Google Scholar] [CrossRef]
  40. Zhao, M.; Peng, H.; Li, P.; Ren, Y. Graph Attention Network and Informer for Multivariate Time Series Anomaly Detection. Sensors 2024, 24, 5. [Google Scholar] [CrossRef]
  41. Shi, Z.; Li, J.; Jiang, Z.; Li, H.; Yu, C. WGformer: A Weibull-Gaussian Informer based model for wind speed prediction. Eng. Appl. Artif. Intell. 2024, 131, 107891. [Google Scholar] [CrossRef]
  42. Vaswani, A.; Shazeer, N.; Parmar, N. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5999–6009. [Google Scholar]
  43. Hu, F.; Sha, Z.; Wei, P. Deep learning for GNSS zenith tropospheric delay forecasting based on the informer model using 11-year ERA5 reanalysis data. GPS Solut. 2024, 182, 28. [Google Scholar] [CrossRef]
  44. Zhou, H.; Zhang, S.; Peng, J. Informer: Beyond efficient transformer for long sequence time-series forecasting. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11106–11115. [Google Scholar] [CrossRef]
  45. Tan, Y.; Huang, Y.; Liu, J.; Yang, R. Multi-time series forecasting of power system diverse loads based on the MMoE-CNN-informer model. J. Electr. Eng. 2024, 29, 1–11. [Google Scholar]
  46. Imane, J.; Fatima-Zahra, B.; Issam, M.K.; Amine, T. Prediction of solar energy guided by pearson correlation using machine learning. Energy 2021, 224, 120109. [Google Scholar]
  47. Federico, L.; Lorenzo, B.; Marcella, C.; Rita, C. Working Memory Connections for LSTM. Neural Netw. 2021, 144, 334–341. [Google Scholar]
  48. Zhang, X.; Yang, C.; Ding, H. Displacement prediction of the Jina Ancient Landslide in Zhaotong, Yunnan, based on time-series InSAR and GRU-ARIMA. Geod. Geodyn. 2024, 29, 1–14. [Google Scholar]
Figure 1. Self-attentive distillation mechanism.
Figure 2. Position encoding.
Figure 3. Spatiotemporal convolutional network (STCN module).
Figure 4. LSTM block.
Figure 5. Structure of the improved Informer model.
Figure 6. Evolution of supply and freight rates. (a) Chart of time series changes in supply and (b) chart of changes in the chronology of freight costs.
Figure 7. Pearson correlation analysis chart.
Figure 8. Multidimensional feature engineering of data.
Figure 9. Results of single model comparison experiment. (a) support vector regression (SVR) model; (b) long short-term memory (LSTM) model; (c) gated recurrent units (GRUs) model; (d) back propagation neural network (BPNN) model; (e) Informer model; and (f) InformerStack model.
Figure 10. Experimental results of integrated model comparison. (a) ARIMA + BPNN model; (b) ARIMA + GRU model; and (c) ARIMA + LSTM model.
Figure 11. Results of single model comparison experiment (ETTh1). (a) SVR model; (b) LSTM model; (c) GRU model; (d) BPNN model; (e) Informer model; and (f) InformerStack model.
Figure 12. Experimental results of integrated model comparison (ETTh1). (a) ARIMA + BPNN model; (b) ARIMA + GRU model; and (c) ARIMA + LSTM model.
Table 1. Data feature engineering and coding.
Table 1. Data feature engineering and coding.
| Feature Category | Feature | Characterization |
|---|---|---|
| Economic characteristics | Order_num | Quantity demanded |
| Economic characteristics | Cost | Freight cost |
| Date characteristics | Weekends | Weekend indicator: 0 = weekend, 1 = weekday |
| Date characteristics | Quarter | Quarter of the year, [1, 4] |
| Date characteristics | Day_of_year | Day of the year, [1, 365] |
| Date characteristics | Month | Month of the year, 1–12 |
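The date features in Table 1 can be derived directly from the calendar date. A minimal Python sketch (the function name is illustrative, not taken from the paper's code), following the paper's coding of 0 for weekends and 1 for weekdays:

```python
from datetime import date

def date_features(d: date) -> dict:
    """Derive the date features of Table 1 from a calendar date.
    Coding follows the paper: Weekends is 0 on weekends, 1 on weekdays."""
    return {
        "Weekends": 0 if d.isoweekday() >= 6 else 1,
        "Quarter": (d.month - 1) // 3 + 1,        # [1, 4]
        "Day_of_year": d.timetuple().tm_yday,     # [1, 365] (366 in leap years)
        "Month": d.month,                         # 1-12
    }

# 13 May 2023 is a Saturday in the second quarter, day 133 of the year.
print(date_features(date(2023, 5, 13)))
```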
Table 2. Classification of Pearson’s correlation coefficient.
| Form | Extremely Low Correlation | Low Correlation | Moderately Relevant | Strong Correlation | Highly Relevant |
|---|---|---|---|---|---|
| \|r_xy\| | [0, 0.2) | [0.2, 0.4) | [0.4, 0.6) | [0.6, 0.8) | [0.8, 1.0] |
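The classification in Table 2 is straightforward to apply in code. A small sketch (function names are illustrative) that computes the sample Pearson coefficient and buckets its absolute value into the five bands:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient r_xy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_level(r):
    """Bucket |r_xy| into the five bands of Table 2."""
    bands = [(0.8, "highly relevant"), (0.6, "strong correlation"),
             (0.4, "moderately relevant"), (0.2, "low correlation")]
    for lower, label in bands:
        if abs(r) >= lower:
            return label
    return "extremely low correlation"
```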
Table 3. Model parameter settings.
| Parameter | Description | Value |
|---|---|---|
| seq_len | Sequence length | 160 |
| label_len | Label length | 64 |
| pred_len | Prediction length | 64 |
| freq | Frequency (daily) | d |
| enc_in | Encoder input size | 7 |
| dec_in | Decoder input size | 7 |
| c_out | Output size | 7 |
| d_model | Model dimension | 512 |
| n_heads | Number of attention heads | 8 |
| e_layers | Encoder layers | 2 |
| d_layers | Decoder layers | 1 |
| s_layers | Stack encoder layers | [3, 2, 1] |
| d_ff | Feed-forward network dimension | 2048 |
| factor | ProbSparse attention factor | 5 |
| distill | Distilling operation | True |
| attn | Attention type | prob |
| embed | Embedding type | timeF |
| train_epochs | Training epochs | 100 |
| batch_size | Batch size | 32 |
| learning_rate | Learning rate | 0.0001 |
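The settings in Table 3 can be collected into a single configuration object. The sketch below mirrors the flag names used by the public Informer implementation; the exact names in the authors' code may differ:

```python
# Hypothetical configuration dictionary; key names follow the public
# Informer code's command-line flags and may not match the authors' code.
informer_config = dict(
    seq_len=160, label_len=64, pred_len=64,
    freq="d",                         # daily data
    enc_in=7, dec_in=7, c_out=7,      # input/output feature dimensions
    d_model=512, n_heads=8,
    e_layers=2, d_layers=1, s_layers=[3, 2, 1],
    d_ff=2048, factor=5,              # FFN width, ProbSparse factor
    distill=True, attn="prob", embed="timeF",
    train_epochs=100, batch_size=32, learning_rate=1e-4,
)

# Sanity check: the decoder's label + prediction window fits the input window.
assert informer_config["label_len"] + informer_config["pred_len"] <= informer_config["seq_len"]
```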
Table 4. Example of input data feature engineering.
| Date | Weekends | Quarter | Day_of_year | Month | Cost | Order_num | Car_num |
|---|---|---|---|---|---|---|---|
| 2018/1/1 | 1 | 1 | 1 | 1 | 9235 | 45 | 1 |
| 2018/1/2 | 1 | 1 | 2 | 1 | 15,205 | 112 | 1 |
| 2018/1/3 | 1 | 1 | 3 | 1 | 11,350 | 91 | 1 |
| … | … | … | … | … | … | … | … |
| 2023/5/12 | 1 | 2 | 132 | 5 | 118,018 | 1053 | 20 |
| 2023/5/13 | 0 | 2 | 133 | 5 | 103,078 | 889 | 25 |
| 2023/5/14 | 0 | 2 | 134 | 5 | 102,251 | 875 | 47 |
Data source: compiled by the authors from the logistics supply records of H Logistics on the transport route from city W to city K (April 2018–August 2022).
Table 5. Experimental results of single-prediction model comparison.
| Model | Metric | Car-num | ETTh1 |
|---|---|---|---|
| Improved Informer model | MSE | 2.79 | 1.39 |
| Improved Informer model | MAE | 1.30 | 0.60 |
| ARIMA model | MSE | 10.65 | 10.11 |
| ARIMA model | MAE | 2.71 | 2.98 |
| SVR model | MSE | 13.52 | 1.18 |
| SVR model | MAE | 3.21 | 0.94 |
| LSTM model | MSE | 15.71 | 0.21 |
| LSTM model | MAE | 3.28 | 0.35 |
| GRU model | MSE | 13.03 | 0.20 |
| GRU model | MAE | 3.06 | 0.34 |
| BPNN model | MSE | 12.14 | 0.22 |
| BPNN model | MAE | 2.82 | 0.36 |
| Informer model | MSE | 6.06 | 0.90 |
| Informer model | MAE | 1.89 | 0.82 |
| InformerStack model | MSE | 4.68 | 0.96 |
| InformerStack model | MAE | 2.05 | 0.96 |
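The error metrics and the percentage reductions quoted in the abstract follow from the standard MSE/MAE definitions. A minimal sketch (function names are illustrative), checked against two Car-num entries of Table 5:

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def reduction(baseline, improved):
    """Relative error reduction in percent, as quoted in the abstract."""
    return (baseline - improved) / baseline * 100

# Car-num MSE values from Table 5: improved model 2.79 vs. baselines.
print(round(reduction(15.71, 2.79), 2))  # 82.24 (vs. LSTM), matching the abstract
print(round(reduction(6.06, 2.79), 2))   # 53.96 (vs. plain Informer)
```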
Table 6. Comparative experimental results of integrated prediction models.
| Model | Metric | Car-num | ETTh1 |
|---|---|---|---|
| Improved Informer model | MSE | 2.79 | 1.39 |
| Improved Informer model | MAE | 1.30 | 0.60 |
| ARIMA + BPNN model | MSE | 11.11 | 2.90 |
| ARIMA + BPNN model | MAE | 2.67 | 1.61 |
| ARIMA + GRU model | MSE | 9.81 | 3.04 |
| ARIMA + GRU model | MAE | 2.60 | 1.64 |
| ARIMA + LSTM model | MSE | 10.76 | 2.90 |
| ARIMA + LSTM model | MAE | 2.71 | 1.61 |
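The same relative-reduction calculation applied to the Car-num MSE values of Table 6 recovers (to rounding) the 74.88%, 71.56%, and 74.07% figures quoted in the abstract for the integrated baselines. A short sketch with the table's values hard-coded for illustration:

```python
def reduction(baseline, improved):
    """Relative error reduction in percent."""
    return (baseline - improved) / baseline * 100

improved_mse = 2.79  # Car-num MSE of the improved model (Table 6)
integrated_mse = {"ARIMA + BPNN": 11.11, "ARIMA + GRU": 9.81, "ARIMA + LSTM": 10.76}
for name, baseline in integrated_mse.items():
    print(f"{name}: {reduction(baseline, improved_mse):.2f}% MSE reduction")
```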
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Guo, D.; Jiang, P.; Qin, Y.; Zhang, X.; Zhang, J. Logistics Transportation Vehicle Supply Forecasting Based on Improved Informer Modeling. Appl. Sci. 2024, 14, 8162. https://doi.org/10.3390/app14188162
