Inventory Prediction Using a Modified Multi-Dimensional Collaborative Wrapped Bi-Directional Long Short-Term Memory Model

Abualuroug, Said; Alzubi, Ahmad; Iyiola, Kolawole

doi:10.3390/app14135817

by Said Abualuroug, Ahmad Alzubi ^* and Kolawole Iyiola

Institute of Social Sciences, University of Mediterranean Karpasia, Mersin 33000, Turkey

^* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(13), 5817; https://doi.org/10.3390/app14135817

Submission received: 31 May 2024 / Revised: 21 June 2024 / Accepted: 27 June 2024 / Published: 3 July 2024

(This article belongs to the Special Issue Applications of Deep Learning and Artificial Intelligence Methods: 2nd Edition)

Abstract:

Inventory prediction is concerned with forecasting future demand for products in order to optimize inventory levels and supply chain management. The challenges include demand volatility, data quality, multi-dimensional interactions, lead time variability, seasonal trends, and dynamic pricing. Existing models suffer from numerous shortcomings in the face of these challenges, and in this research, we propose a new model, MMCW-BiLSTM (modified multi-dimensional collaboratively wrapped BiLSTM), for inventory prediction. The MMCW-BiLSTM model advances inventory forecasting by combining several components in order to capture intricate temporal dependencies and feature interactions. It makes use of BiLSTM layers, collaborative attention mechanisms, and a multi-dimensional attention approach to learn from augmented datasets consisting of the original features and the extracted time series data. Moreover, adding a Taylor series transformation allows for a more precise description of the features in the model, thus improving the prediction precision. The results show that the model achieves the lowest errors on the AV demand forecasting dataset, with an MAE of 1.75, an MAPE of 2.89, an MSE of 6.76, and an RMSE of 2.6. Similarly, on the product demand dataset, the model achieves the lowest error values for these metrics at 1.97, 3.91, 8.76, and 2.96, and on the dairy goods sales dataset, at 2.54, 3.69, 10.39, and 3.22.

Keywords:

inventory prediction; BiLSTM; collaborative attention mechanisms; multi-dimensional attention layer; Taylor series

1. Introduction

As enterprises grow larger, the role of inventory as a necessary buffer to maintain production and adapt to fluctuations in demand becomes more significant. As a company expands, it requires a wider range of raw materials, partially processed goods, and purchased components, and therefore more capital to handle these items. An overly high forecast of inventory demand leads to a backlog of stock, which can hinder company growth and potentially lead to bankruptcy [1,2,3]. Therefore, the importance of inventory management in the manufacturing sector cannot be overstated. In China, the actual cost of producing a product represents only 10% of the overall cost, while logistic expenses make up around 40%. Among these logistic expenses, inventory costs make up a substantial portion, ranging from 80 to 90%, which translates to 32–36% of the total product cost [4]. This is significantly higher than the cost of directly manufacturing products. Conversely, an overly low forecast of inventory demand can lead to losses from stock-outs, impacting not just one company but the entire supply chain. Large manufacturers with global subsidiaries, such as Boeing and Airbus, and the commercial logistics companies owned by global manufacturing corporations exemplify this phenomenon. Many of their main products consist of numerous parts, ranging from hundreds to tens of thousands [5,6,7,8].

Furthermore, a single supplier involves numerous companies worldwide, ranging from dozens to hundreds. Inadequate or excessive inventory levels can impact thousands of companies in the supply chain, affecting the overall health of these large enterprises [9,10]. This underscores the importance of a logical and scientific forecast for inventory demand. As companies expand, the number and variety of products they produce also increase, making inventory demand forecasting more challenging. For instance, as companies grow, their product range becomes larger and more complex [11,12]. A subsidiary of a major global manufacturing company focused on commercial logistics brought attention to the difficulties of predicting inventory needs for large items such as Boeing and Airbus aircraft, which consist of numerous parts. The complexity of forecasting inventory demands for these huge companies is apparent. Previous studies in accounting and inventory management have mainly concentrated on either controlling inventory or planning for inventory needs [13,14]. Taking into consideration the different systematic control solutions available, prior studies have aimed to determine the optimal timing and quantity for ordering inventory [15]. Furthermore, experts recommend different strategies for material planning based on the duration of product use. Most suggested analytical methods approach inventory management problems as a multi-faceted optimization challenge, aiming to minimize expenses related to ordering and storing inventory while simultaneously maximizing profit or overall usefulness [16]. Prior research has typically used stochastic approximations to predict expected backorders based on the base stock level. Researchers have recently introduced a model that analyzes inventory levels across multiple levels of distribution to predict stock-outs. Another method for forecasting inventory levels is time series forecasting. 
However, inventory databases contain a wealth of information that is not typically utilized, such as previous stock levels, amounts of overdue stock, and flags indicating operational issues [17,18]. Although numerous studies have addressed the anticipation of inventory backorders, research on profit-driven models based on big data is still lacking [19,20].

The primary goal of this research is to fill this gap by examining how a model can provide valuable insights into the financial benefits of utilizing optimal backorder strategies in the age of big data [21]. Moreover, this study underscores a specific challenge in utilizing machine learning classifiers for this purpose. In classic supply chain management, the number of items on backorder (referred to as the positive or minority category) is significantly lower than the number of active products or items not on backorder (referred to as the negative or majority category), resulting in class imbalance [22,23]. Other real-world forecasting areas, such as loan approval prediction, corporate bankruptcy forecasting, and credit card fraud detection, frequently exhibit the same imbalance. When modeling imbalanced datasets, it is essential to prioritize the minority positive instances over the majority class examples and to strike a balance between the two classes in order to make effective use of the positive data. Additionally, the varying costs associated with incorrectly predicting backordered and non-backordered products need to be addressed [24]. The existing inventory forecasting models built on the above-mentioned paradigms face several challenges. Demand volatility is one of the biggest problems in prediction. Data quality is also a major challenge for least squares-based inventory models. Multi-dimensional interaction is another, as it is hard to capture interactions between features. The gaps in existing inventory prediction models are multi-faceted, encompassing both technical limitations and practical barriers that hinder their real-world performance. Designing effective predictive models and optimizing their parameters is not easy. Practical obstacles, such as the high cost of investment, time, and effort, may also slow adoption, particularly among SMEs. To address these concerns, this research proposes a novel method for forecasting inventory.

The research introduces the utilization of the MMCW-BiLSTM model for inventory prediction, which is a new technique for the modeling of complex temporal dependencies and interactions between features. The model combines BiLSTM layers with collaborative and multi-dimensional attention mechanisms as well as Taylor series transformation, which are instrumental in the accurate forecasting of inventory levels.

The MMCW-BiLSTM model is built as a composite system for inventory prediction, combining different methods to enhance the reliability and precision of the forecast. Using BiLSTM layers together with collaborative and multi-dimensional attention helps the model capture complex temporal dependencies and correlations among features within a given dataset. Yet another advantage is the Taylor series transformation, which captures higher-order interactions and thus enhances the predictive power of the model.

We organize the document as follows: Section 2 outlines previous studies on inventory forecasting. Section 3 provides an explanation of the inventory prediction model’s approach. Section 4 examines the mathematical modeling of the modified multi-dimensional collaboratively wrapped BiLSTM model. Section 5 offers a detailed description of the implementation and evaluation of performance. Finally, Section 6 wraps up the research and proposes possible avenues for future research.

2. Literature Review

Babai M.Z. et al. [1] presented a new Bayesian method that utilizes compound Poisson distributions for predicting inventory. They compared their technique with a Bayesian approach utilizing Poisson distributions and a Gamma prior distribution, along with both parametric and non-parametric frequentist methods, all of which yielded precise predictions. Although the proposed Bayesian method demonstrated superior performance in both empirical and theoretical aspects, it was more computationally intensive and complex than the parametric frequentist methods, making it potentially challenging for practitioners to grasp. Although these frameworks provide SPC robustness, they can be computationally expensive. In our case, by building on advances in neural networks, we aim to reduce the computational burden while improving predictive precision.

Chaoliang Han and Qiuying Wang [2] developed a BP neural network to construct a predictive model that analyzes inventory demand and explores the complex relationship between inventory demand and different influencing factors. By training on data about the factors that influence inventory demand, this model generates effective strategies for inventory management and control and achieves more accurate predictions. However, this model faced a challenge in determining the optimal number of nodes in the hidden layer. Neural networks are flexible and nonlinear but sometimes require manual tuning. Our approach uses both attention mechanisms and multi-dimensional modeling so that the model can learn these underlying properties on its own, reducing the need to tune the parameters manually.

In their research, Ning Xue et al. [3] presented a meta-heuristic approach that successfully develops CNN-LSTM structures for predicting time series using actual data from a neighborhood food store. Their experiments demonstrate that this model can effectively handle complex nonlinear inventory forecasting issues. Nevertheless, optimizing the parameters may still pose a challenge and necessitate ongoing human oversight. Structures like CNN-LSTM achieve a rather high level of feature extraction but may have issues with parameter dependence. To boost model reliability and reduce the need for parameter tuning in the training process, attention mechanisms and feature engineering are applied here.

In their research, Muchaendepi et al. [4] concentrated on enhancing professionalism and education in the field of inventory management to strengthen the skills of individuals tasked with overseeing inventories in small and medium-sized businesses. This resulted in a significant enhancement in predictive accuracy. Nevertheless, having excessive stock levels in a company can result in increased expenses for storage, handling, and interest payments on short-term borrowings. Upskilling is important, but it does not seem to directly solve the intricacies of predicting inventory. The solution is designed to include both mathematical modeling and skill training as a central part of addressing inventory management issues.

G.T.S. Ho et al. [5] implemented a system utilizing blockchain technology to provide a management platform for the accurate storage of data related to the traceability of spare parts. This system utilized Hyperledger Fabric and Hyperledger Composer for organizational agreement and validation, ensuring high levels of privacy, resilience, flexibility, and scalability. Nevertheless, the repeated resale of many spare parts makes it difficult to track their past transactions. Blockchain provides secure and transparent handling of data, although the technology may struggle to accommodate varied supply chain transactions. Our approach is to use predictive modeling coupled with blockchain traceability to give insights into future inventories.

Petr Hajek and Mohammad Zoynul Abedin [6] introduced a machine learning model that includes an under-sampling technique to optimize the anticipated profit of backorder choices. The model was efficient in computation and resistant to changes in warehousing costs, inventory expenses, and sales margins. However, the model’s training duration was exceptionally long. Features of the machine learning models include scalability and flexibility, but they need a lot of processing power. Our approach is built upon resilient neural panels and feature engineering skills that enhance computational performance and the training period.

In order to achieve precise predictions, Dua Weraikat [7] introduced a pharmaceutical supply chain model that attains high prediction accuracy. However, the model faced difficulties due to computational complexities and problems with overfitting. Our framework learns the attention mechanism to enhance focus on other components and their interactions with each other; this might help to avoid overfitting by paying much attention to salient features within the dataset.

Tsukasa Demizu et al. [25] introduced a model-based deep reinforcement learning approach that demonstrated exceptional sample efficiency. They also suggested a novel inventory management technique for new products that integrates offline model learning with online planning. This model achieves significant profit and efficiency gains while also decreasing the occurrence of stock-outs throughout the entire sales duration of a product. It was challenging to improve inventory management right after it was introduced because there was not enough data available for learning purposes. By employing the BiLSTM layers with collaborative and multi-dimensional attention to the model, the model can capture even the temporal dependency and feature interactions when activating the model only with the preliminary data.

3. Methodology

The existing inventory forecasting models built on the above-mentioned paradigms face certain challenges, including demand volatility, which is one of the biggest problems in prediction. Data quality is also a major challenge for least squares-based inventory models. The main aim of this research is to design a predictive model to anticipate future inventory levels. We first read the datasets, namely AV Demand Forecasting [26], Forecast for Product Demand [27], and Dairy Goods Sales [28]. The model takes the input data and applies the BiLSTM to capture temporal dependencies in the forward and backward directions. Next, the output passes through collaborative and multi-dimensional attention to learn collaboration between different features and adjust the attention based on multi-dimensional feature interactions. A Taylor series transformation is then used to identify higher-order feature interactions more effectively and improve the model's capacity to identify complex underlying patterns within the dataset. Finally, dense layers are used for further transformations and dimensionality reduction to produce the final forecast of future inventory levels. Ultimately, we evaluate the model with new data to infer appropriate inventory levels, leading to effective inventory management and supply chain process optimization. The architecture of the proposed inventory prediction model is shown in Figure 1.

3.1. Input

The data utilized in the inventory forecasting model are gathered from [26,27,28] and are expressed mathematically as follows:

K = \sum_{n = 1}^{y} S_{e}

(1)

Here, $S_{e}$ denotes the database.

3.2. Preprocessing

In the first step, the dataset is read, attributes are extracted, and items, such as product type, sales volume, and timestamps, are identified and collected as the relevant information. In addition to this, time series analysis is carried out by employing the ARIMA method in order to identify some temporal patterns in the data.

Linear models, often referred to as Box–Jenkins models, are frequently utilized in the analysis of univariate time series data. These models depict the value of a sequence at a specific point in time by incorporating previous values and discrepancies in a straightforward approach. The ARIMA models can be expressed in the following equations:

ARIMA (q, e, r) {(Q, E, R)}_{s}

(2)

Here, $q$ is the number of parameters in the AR model, $e$ is the degree of differencing, and $r$ is the number of parameters in the MA model. Additionally, $Q$ is the number of parameters in the seasonal AR model, $E$ is the degree of seasonal differencing, and $R$ is the number of parameters in the seasonal MA model. Lastly, $s$ denotes the seasonality period.

φ_{q} (C) φ_{q_{s}} (C^{s}) \nabla^{e} \nabla^{E s} Y_{t} = θ_{r} (C) Θ_{R s} (C^{s}) b_{t}

(3)

In this context, $Y_{t}$ represents the value observed at time point $t$ (or a transformed value); $φ_{q} (C)$ is the AR operator $(1 - φ_{1} C - φ_{2} C^{2} - \dots - φ_{q} C^{q})$; $C$ is the backshift operator, $C Y_{t} = Y_{t - 1}$; $φ_{q_{s}} (C^{s})$ is the seasonal AR operator $(1 - φ_{1} C^{s} - φ_{2} C^{2 s} - \dots - φ_{Q} C^{Q s})$; $C^{s}$ is the seasonal backshift operator, $C^{s} Y_{t} = Y_{t - s}$; $\nabla^{e}$ denotes the differencing operator, $\nabla^{e} Y_{t} = {(1 - C)}^{e} Y_{t}$; $\nabla^{E s}$ represents the seasonal differencing operator, $\nabla^{E s} Y_{t} = {(1 - C^{s})}^{E} Y_{t}$; $b_{t}$ is the random error at time point $t$, with $b_{t} \sim N (0, σ_{b}^{2})$; $θ_{r} (C)$ is the MA operator $(1 - θ_{1} C - θ_{2} C^{2} - \dots - θ_{r} C^{r})$; and $Θ_{R s} (C^{s})$ signifies the seasonal MA operator $(1 - Θ_{1} C^{s} - Θ_{2} C^{2 s} - \dots - Θ_{R} C^{R s})$.

Autocorrelation and partial autocorrelation functions are essential components in building ARIMA models. Examining the coefficients at various time delays helps determine the degree of differencing $e$, seasonal differencing $E$, and the number of parameters within the AR, MA, and seasonal AR components.
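As an illustration of the differencing operators used in the ARIMA formulation, the following minimal Python sketch applies $(1 - C)^{e}$ and $(1 - C^{s})^{E}$ to a short series. The function name `difference` and the sample values are hypothetical and are not taken from the datasets used in this study.

```python
def difference(series, lag=1, order=1):
    """Apply (1 - C^lag)^order to a sequence, where C is the backshift operator."""
    values = list(series)
    for _ in range(order):
        # Each pass subtracts the value observed `lag` steps earlier.
        values = [values[t] - values[t - lag] for t in range(lag, len(values))]
    return values


# Hypothetical series with an upward trend and a pattern repeating every 4 steps.
demand = [10, 14, 12, 16, 14, 18, 16, 20, 18, 22, 20, 24]

regular = difference(demand, lag=1, order=1)   # regular differencing, e = 1
seasonal = difference(demand, lag=4, order=1)  # seasonal differencing, E = 1, s = 4
```

Each pass shortens the series by `lag` observations, so the differenced series has fewer points than the original.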

3.3. Feature Extraction

Feature extraction is the process of finding and extracting useful attributes or characteristics from the inventory dataset, with the specific aim of improving the precision of inventory-level forecasts.

3.3.1. Statistical Features

Inventory data contain statistical features that reflect the distribution, variability, and overall characteristics of the data. These features support forecasting techniques that analyze seasonal trends and external factors like weather patterns, and they help to make predictions about the future based on past data. The following is a description of each statistical feature:

(i): Mean: The mean represents the typical value or central tendency of inventory-related data over a specified period of time. It is calculated as the average after processing the data.

$M E = \frac{1}{d} \sum_{j = 1}^{d} S_{e}^{*}$

(4)

Here, $j$ denotes the position of a data point, $d$ denotes the overall number of elements, and $S_{e}^{*}$ denotes the preprocessed output.

(ii): Variance: Variance indicates the degree of dispersion of the inventory data points from the mean. A greater variance points to higher variability in the data.

$V A = \frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{2}$

(5)
(iii): Standard Deviation: The standard deviation is the square root of the variance and indicates the typical deviation of data points from the mean. It measures the extent of the dispersion or scatter in the dataset.

$S D = \sqrt{\frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{2}}$

(6)
(iv): Skewness: Skewness measures the asymmetry of the inventory data distribution about its mean. A positive skewness means that the distribution has a longer tail on the right side, while a negative skewness means that the tail extends further on the left side.

$S K = \frac{\frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{3}}{{(\frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{2})}^{3 / 2}}$

(7)
(v): Kurtosis: Kurtosis measures the peakedness or flatness of the inventory data distribution relative to a normal distribution. Larger kurtosis indicates a sharper peak and heavier tails, while smaller kurtosis indicates a flatter distribution.

$K U = \frac{\frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{4}}{{(\frac{1}{d} \sum_{j = 1}^{d} {(S_{e}^{*} - M E)}^{2})}^{2}}$

(8)
(vi): Entropy: Entropy quantifies the uncertainty or irregularity in the probability distribution of inventory data. It indicates how disordered the dataset is.

$E (d) = - \sum_{j = 1}^{d} p (S_{e}^{*}) \log_{2} p (S_{e}^{*})$

(9)

Here, $p (S_{e}^{*})$ represents the probability mass function of the preprocessed data.

(vii): Geometric Mean: The geometric mean is the $d$th root of the product of the $d$ values in the inventory dataset. This method is used to assess the central tendency of datasets that exhibit exponential growth or decay.

$G M = {(\prod_{j = 1}^{d} S_{e}^{*})}^{\frac{1}{d}}$

(10)
(viii): Harmonic Mean: The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the inventory data values. It is particularly useful for averaging rates or ratios.

$H M = \frac{d}{\sum_{j = 1}^{d} \frac{1}{S_{e}^{*}}}$

(11)
(ix): Minimum: The minimum value corresponds to the smallest data point in the inventory preprocessed dataset. It gives the lowest observed value.

$M i n = \min (S_{1}^{*}, S_{2}^{*}, \dots, S_{n}^{*})$

(12)
(x): Maximum: The maximum value stands for the largest data point in the inventory preprocessed data set. It determines the peak value of the considered phenomenon.

$M a x = \max (S_{1}^{*}, S_{2}^{*}, \dots, S_{n}^{*})$

(13)
(xi): Sum: The sum is the total of the inventory data values over a given time period. It offers the aggregated value of the whole dataset.

$S U = \sum_{j = 1}^{d} S_{n}^{*}$

(14)

The outcome obtained from the statistical features is denoted as $S = [M E \,\|\, V A \,\|\, S D \,\|\, S K \,\|\, K U \,\|\, E (d) \,\|\, G M \,\|\, H M \,\|\, M a x \,\|\, M i n \,\|\, S U]$.
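The statistical features in Equations (4)–(14) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `statistical_features` is hypothetical, the entropy is computed from the empirical frequencies of distinct values (one possible reading of the probability mass function), and the input is assumed to contain strictly positive values with non-zero variance so that the geometric mean, harmonic mean, skewness, and kurtosis are defined.

```python
import math
from collections import Counter


def statistical_features(values):
    """Return the eleven statistical features for a list of positive numbers."""
    d = len(values)
    me = sum(values) / d                                   # mean
    va = sum((x - me) ** 2 for x in values) / d            # population variance
    if va == 0:
        raise ValueError("skewness and kurtosis need non-zero variance")
    sd = math.sqrt(va)                                     # standard deviation
    sk = (sum((x - me) ** 3 for x in values) / d) / va ** 1.5
    ku = (sum((x - me) ** 4 for x in values) / d) / va ** 2
    # Assumption: probabilities are empirical frequencies of distinct values.
    probs = [count / d for count in Counter(values).values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    gm = math.exp(sum(math.log(x) for x in values) / d)    # geometric mean
    hm = d / sum(1.0 / x for x in values)                  # harmonic mean
    return {
        "ME": me, "VA": va, "SD": sd, "SK": sk, "KU": ku, "E": entropy,
        "GM": gm, "HM": hm, "Max": max(values), "Min": min(values),
        "SU": sum(values),
    }


# Hypothetical preprocessed values, used only to exercise the function.
features = statistical_features([2, 4, 4, 8])
```

Computing the geometric mean through logarithms avoids overflow when the product of many values becomes very large.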

3.3.2. Rolling Mean

The rolling mean is calculated over a window of fixed size, which determines the number of data points involved in each average: at each time point, it is the average of the consecutive data points inside the window. Hence, the rolling mean provides a smoothed representation of the underlying trend or pattern in the preprocessed inventory data.

R M = \frac{1}{y} \sum_{j = t - y + 1}^{t} S_{n}^{*}

(15)

In this context, $y$ represents the window size, indicating the total number of data points used in each calculation, and $S_{n}^{*}$ represents the preprocessed data at a specific time $t$.
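A minimal Python sketch of the rolling mean in Equation (15) is given below. The function name `rolling_mean` and the sample values are hypothetical; the sketch returns one value for every time point at which a full window is available.

```python
def rolling_mean(values, window):
    """Average of the most recent `window` points at each time point."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    means = [running / window]
    for t in range(window, len(values)):
        # Slide the window: add the newest point, drop the oldest one.
        running += values[t] - values[t - window]
        means.append(running / window)
    return means


# Hypothetical preprocessed series with window size y = 3.
smoothed = rolling_mean([3, 5, 7, 9, 11], window=3)
```

Updating a running total keeps the cost linear in the length of the series, regardless of the window size.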

3.3.3. Expanding Mean Features

The expanding mean, commonly referred to as the cumulative moving average, is a statistical calculation that provides the average of all the data points seen up to a specific time point. In contrast to the fixed window of the rolling mean, the expanding window grows continuously to include all available data points from the beginning of the dataset up to the current time point.

E M = \frac{1}{t} \sum_{j = 1}^{t} S_{n}^{*}

(16)
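The expanding mean in Equation (16) can be sketched in the same way. The function name `expanding_mean` and the sample values are hypothetical.

```python
def expanding_mean(values):
    """Cumulative average of all points seen up to each time point."""
    means = []
    total = 0.0
    for t, value in enumerate(values, start=1):
        total += value
        means.append(total / t)  # average over the first t observations
    return means


# Hypothetical preprocessed series.
cumulative = expanding_mean([2, 4, 6, 8])
```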

3.3.4. Exponential Smoothing Features (ESF)

The exponential smoothing technique uses exponential functions to smooth time series data. It is used to make short-term forecasts based on prior assumptions; long-term forecasts produced this way are somewhat unreliable. Exponential smoothing techniques are straightforward yet effective methods that are commonly applied in business scenarios such as inventory management. There are three distinct ESF techniques: simple exponential smoothing (SES), double exponential smoothing (DES), and triple exponential smoothing (TES). Simple exponential smoothing is an inexpensive approach used when historical data do not exhibit cyclical patterns or trends. It generates a smooth curve by combining one actual previous value and one forecasted previous value.

T_{t} = α z_{t - 1} + (1 - α) T_{t - 1}

(17)

In this context, $T_{t}$ signifies the estimated (smoothed) value at time $t$, $T_{t - 1}$ represents the smoothed value at time $t - 1$, and $z_{t - 1}$ denotes the actual value of the series at time $t - 1$. The smoothing parameter $α$, which varies between 0 and 1, determines the level of smoothness within the data series. Although $T_{t}$ takes into consideration the influence of all past data, only two numerical values, $z_{t - 1}$ and $T_{t - 1}$, are required for the actual computation. Hence, one of the main benefits of exponential smoothing is its ability to handle considerable amounts of data efficiently.
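The recursion of simple exponential smoothing can be sketched as follows. This is a minimal illustration: the function name `simple_exponential_smoothing` is hypothetical, and starting the recursion from the first observation is an assumption, since the text does not specify an initial smoothed value.

```python
def simple_exponential_smoothing(z, alpha, initial=None):
    """Return smoothed values; the last entry is the one-step-ahead forecast."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    # Assumption: the first smoothed value equals the first observation.
    smoothed = [z[0] if initial is None else initial]
    for t in range(1, len(z) + 1):
        # New estimate = alpha * previous actual + (1 - alpha) * previous estimate.
        smoothed.append(alpha * z[t - 1] + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical series with alpha = 0.5.
ses = simple_exponential_smoothing([10, 12, 14], alpha=0.5)
```

Only the previous observation and the previous estimate are kept in memory at each step, which is why the method scales well to long series.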

The method of double exponential smoothing, also referred to as Holt’s exponential smoothing, is frequently used in the field of economics to forecast values and predict trends. The mathematical formula for Holt’s exponential smoothing can be stated as follows:

U_{t} = α z_{t} + (1 - α) (U_{t - 1} + T_{t - 1})

(18)

T_{t} = γ (U_{t} - U_{t - 1}) + (1 - γ) T_{t - 1}

(19)

\hat{z}_{t} (l) = U_{t} + l T_{t}

(20)

In this context, $U_{t}$ represents the smoothed level, and $T_{t}$ represents the trend. The smoothing parameters are denoted by $α$ and $γ$. The variable $z_{t}$ denotes the actual value of the time series at time period $t$, while $\hat{z}_{t} (l)$ denotes the forecast made $l$ steps ahead of the forecast origin $t$. The initial values for the trend and level are given by the following equations.

T_{t} = 0

(21)

U_{t} = z_{2} - z_{1}

(22)
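A minimal Python sketch of Holt's double exponential smoothing follows. It uses the difference of successive levels in the trend update, as in the standard Holt formulation. The function name `holt_forecast` is hypothetical, and the sketch starts from a level equal to the first observation and a trend equal to the first difference; this is a common choice and an assumption of the sketch, not necessarily the initialization stated in Equations (21) and (22).

```python
def holt_forecast(z, alpha, gamma, steps=1):
    """Forecast `steps` periods ahead with Holt's linear trend method."""
    if len(z) < 2:
        raise ValueError("at least two observations are required")
    # Assumption: initial level = first value, initial trend = first difference.
    level = z[0]
    trend = z[1] - z[0]
    for t in range(1, len(z)):
        previous_level = level
        level = alpha * z[t] + (1 - alpha) * (previous_level + trend)
        trend = gamma * (level - previous_level) + (1 - gamma) * trend
    # The l-step-ahead forecast extends the last level along the last trend.
    return [level + l * trend for l in range(1, steps + 1)]


# Hypothetical series that grows by 2 each period.
forecast = holt_forecast([2, 4, 6, 8], alpha=0.5, gamma=0.3, steps=2)
```

For a perfectly linear series such as this one, the level and trend track the data exactly, so the forecasts continue the line.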

The Holt–Winters method, sometimes called triple exponential smoothing, is a valuable tool for analyzing seasonal time series since it can represent both the trend and the seasonality in past data. The approach decomposes the series into level, trend, and seasonal components. While widely used, it can only fit one seasonal pattern, which can be a drawback for load forecasting. The Holt–Winters technique offers two computational models: additive and multiplicative. The additive model is appropriate for data with consistent seasonal patterns, while the multiplicative model is suitable for data with changing seasonal patterns. Equations (23)–(26) are used for the additive model.

M_{t} = α (z_{t} - U_{t - m}) + (1 - α) (M_{t - 1} + T_{t - 1})

(23)

T_{t} = β (M_{t} - M_{t - 1}) + (1 - β) T_{t - 1}

(24)

U_{t} = γ (z_{t} - M_{t}) + (1 - γ) U_{t - m}

(25)

\hat{z}_{t + q} = M_{t} + q T_{t} + U_{t - m + q}

(26)

(26)

In this context, $M$ indicates the estimated level, governed by $α$; $T$ refers to the estimated trend, governed by $β$; and $U$ represents the estimated seasonality, governed by $γ$. $q$ signifies the number of periods ahead being forecast, $m$ represents the length of the seasonal cycle, $z$ is the current data, and $\hat{z}$ represents the forecasted value.
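The additive Holt–Winters recursion can be sketched in Python as follows, using the standard additive form with a level, a trend, and one seasonal term per position in the cycle. The function name `holt_winters_additive` is hypothetical, and the initialization (level from the first cycle, trend from the difference between the first two cycle averages, seasonal terms as deviations from the initial level) is an assumption of this sketch, since the text does not specify one.

```python
def holt_winters_additive(z, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` periods ahead with additive Holt-Winters smoothing."""
    if len(z) < 2 * m:
        raise ValueError("at least two full seasonal cycles are required")
    # Assumed initialization from the first two seasonal cycles.
    level = sum(z[:m]) / m
    trend = (sum(z[m:2 * m]) - sum(z[:m])) / (m * m)
    seasonal = [z[i] - level for i in range(m)]  # one term per cycle position
    for t in range(m, len(z)):
        previous_level = level
        old_seasonal = seasonal[t % m]           # seasonal estimate from t - m
        level = alpha * (z[t] - old_seasonal) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (z[t] - level) + (1 - gamma) * old_seasonal
    n = len(z)
    # Each forecast reuses the latest seasonal term for its cycle position.
    return [level + q * trend + seasonal[(n - 1 + q) % m]
            for q in range(1, horizon + 1)]


# Hypothetical series alternating between 10 and 20 with no trend (m = 2).
hw = holt_winters_additive([10, 20, 10, 20, 10, 20], m=2,
                           alpha=0.4, beta=0.2, gamma=0.3, horizon=2)
```

Because the sample series has a stable seasonal pattern and no trend, the forecasts simply repeat the cycle.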

3.3.5. Autocorrelation

Autocorrelation measures the correlation between a time series and lagged versions of itself. It examines the relationship between the current value of a variable and its past values.

A C = \frac{\sum_{t = l + 1}^{i} (S_{t}^{*} - \bar{S}) (S_{t - l}^{*} - \bar{S})}{\sum_{t = 1}^{i} {(S_{t}^{*} - \bar{S})}^{2}}

(27)

In this context, $A C$ signifies the correlation between data points separated by the lag $l$, $S_{t}^{*}$ represents the inventory information at a specific time $t$, and $\bar{S}$ is the average of the inventory data. The variable $i$ denotes the overall number of observations in the inventory data, while $l$ denotes the specific time interval used to calculate the correlation.
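The autocorrelation at a given lag can be computed directly from its definition, as in the following minimal Python sketch. The function name `autocorrelation` and the sample values are hypothetical.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given non-negative lag."""
    i = len(values)
    if not 0 <= lag < i:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / i
    denominator = sum((x - mean) ** 2 for x in values)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Pair each observation with the one observed `lag` steps earlier.
    numerator = sum((values[t] - mean) * (values[t - lag] - mean)
                    for t in range(lag, i))
    return numerator / denominator


# Hypothetical preprocessed series.
ac_lag1 = autocorrelation([1, 2, 3, 4, 5], lag=1)
```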

The feature-extracted output is denoted as $F E = [S \,\|\, R M \,\|\, E M \,\|\, E S \,\|\, A C]$.

3.4. Dataset Augmentation

To enrich the existing dataset and introduce a holistic set of variables into the predictive modeling, the features generated from the original dataset (statistical features, rolling mean, expanding mean, exponential smoothing, and autocorrelation) are merged with it. This augmentation process joins the extracted features to the existing dataset, usually by adding them as new columns or variables. The augmented dataset thus combines the static attributes with the dynamic insights gained from time series analysis, giving a fuller picture of the overall inventory system.

DA = [S_{e}, FE]

(28)

Here, $[S_{e}, F E]$ denotes the concatenation of $S_{e}$ and $F E$.

In this context, every row in the dataset $D A$ represents a sample, showing the original features and the additional extracted features derived from time series analysis. This expanded dataset $D A$ is subsequently utilized as input for predictive modeling techniques.
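The column-wise concatenation behind the augmented dataset can be sketched as below. The function name `augment` and the sample rows are hypothetical; the sketch assumes that the original features and the extracted features are already aligned row by row.

```python
def augment(original_rows, feature_rows):
    """Append the extracted features to each sample as additional columns."""
    if len(original_rows) != len(feature_rows):
        raise ValueError("both inputs must contain the same number of samples")
    return [list(original) + list(extracted)
            for original, extracted in zip(original_rows, feature_rows)]


# Hypothetical samples: two original columns and two extracted features each.
S_e = [[101, 12.0], [102, 15.0], [103, 11.0]]
FE = [[12.0, 0.0], [13.5, 0.4], [12.7, 0.1]]
DA = augment(S_e, FE)
```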

4. Working of Modified Multi-Dimensional Collaborative Wrapped BiLSTM in Inventory Prediction

The augmented dataset forms the input for the modified multi-dimensional collaboratively wrapped BiLSTM in inventory prediction. The MMCW-BiLSTM model is made up of several sub-components that work together to analyze the input information and deliver accurate predictions of future inventory levels. First, the model takes the input data, extracts their attributes, and then uses ARIMA to analyze the time series. From these time series data, various features are derived, such as statistical, rolling mean, expanding mean, exponential smoothing, and autocorrelation features. The extracted features are then used to augment the dataset features. The architecture consists of three BiLSTM layers, followed by two additional BiLSTM layers. The collaborative attention layer, the multi-dimensional attention layer, and the full attention layer all receive their input from the preceding two BiLSTM layers. A Taylor series transformation is then used to identify higher-order feature interactions more effectively and improve the model's capacity to identify complex underlying patterns within the dataset. Finally, dense layers are used for further transformations and dimensionality reduction to produce the final forecast of future inventory levels. This systematic interaction of components helps the MMCW-BiLSTM model learn from augmented datasets and generate accurate inventory forecasts.

4.1. BiLSTM

Hochreiter and Schmidhuber [15] introduced the idea of LSTM, which comprises three gates and two conveyor belts in a single unit to safeguard the state of each neuron and regulate the flow of information. This gating mechanism allows for controlled information transfer within the LSTM unit’s four interconnected neural network layers. To understand the structure and performance of a BiLSTM network, it is necessary to first examine a unidirectional LSTM network, which falls under the category of recurrent neural networks (RNNs). Traditional RNNs do not have a clear layer structure and use cyclic connections in hidden layers to handle short-term memory and sequences like time series data. LSTM networks, on the other hand, deal with the problem of long-term dependencies that comes with having more neurons. LSTMs tackle this problem by storing crucial information across all LSTM units in a memory cell, which functions like a conveyor belt. Each LSTM unit consists of a memory cell and three gates that regulate the information flow by deciding what to keep and what to discard. This process allows LSTMs to capture lasting relationships and address the challenge mentioned above. The three gates are commonly known as the forget gate, input gate, and output gate. Two BiLSTM layers process the input, recognizing both forward and backward temporal relationships in sequential data.

$$h_{j,t}^{f} = LSTM(DA, h_{j,t-1}^{f}) \tag{29}$$

$$h_{j,t}^{b} = LSTM(DA, h_{j,t+1}^{b}) \tag{30}$$

In this context, the input at time step $t$ is represented as $DA$, while the LSTM cell is symbolized as $LSTM$. The index $j$ of the LSTM cell ranges from 1 to 2, and the output of the cell is denoted as $H_{1}$. Ultimately, the output is generated by combining the forward and backward states:

$$H_{1} = [h_{j,t}^{f}, h_{j,t}^{b}] \tag{31}$$
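Equations (29) to (31) can be illustrated with a small NumPy sketch of a bidirectional LSTM pass. It is an assumption-laden illustration rather than the paper's Keras implementation: the gate ordering, the random initialisation, the hidden size of 8, and the helper names `lstm_pass`, `init_params`, and `bilstm` are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(X, params, reverse=False):
    """Run one LSTM direction over X of shape (T, n_in); returns states of shape (T, n_hid)."""
    W, U, b = params                      # W: (n_in, 4h), U: (h, 4h), b: (4h,)
    n_hid = U.shape[0]
    h = np.zeros(n_hid)                   # hidden state
    c = np.zeros(n_hid)                   # memory cell (the "conveyor belt")
    out = np.zeros((X.shape[0], n_hid))
    steps = range(X.shape[0] - 1, -1, -1) if reverse else range(X.shape[0])
    for t in steps:
        z = X[t] @ W + h @ U + b
        i, f, g, o = np.split(z, 4)       # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def init_params(n_in, n_hid, rng):
    return (rng.normal(0, 0.1, (n_in, 4 * n_hid)),
            rng.normal(0, 0.1, (n_hid, 4 * n_hid)),
            np.zeros(4 * n_hid))

def bilstm(X, fwd, bwd):
    h_f = lstm_pass(X, fwd)                    # Eq. (29): uses h at t - 1
    h_b = lstm_pass(X, bwd, reverse=True)      # Eq. (30): uses h at t + 1
    return np.concatenate([h_f, h_b], axis=1)  # Eq. (31): [h_f, h_b]

rng = np.random.default_rng(0)
DA = rng.normal(size=(6, 4))                   # toy input: 6 time steps, 4 features
p_f, p_b = init_params(4, 8, rng), init_params(4, 8, rng)
H1 = bilstm(DA, p_f, p_b)                      # first BiLSTM layer
H2 = bilstm(H1, init_params(16, 8, rng), init_params(16, 8, rng))  # second layer
```

The forward half of each output row depends only on past inputs and the backward half only on future inputs, which is what lets the stacked layers see temporal context in both directions.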

4.2. Collaborative Attention Layer

This layer is meant to discover collaborative patterns among the different features by computing attention scores for each feature. The attention weight, denoted as $C$, is given by

$$C = \mathrm{softmax}(W_{c} \cdot H_{2}) \tag{32}$$

In this context, $W_{c}$ represents the matrix of learnable parameters, while $H_{2}$ represents the output generated from the second BiLSTM layer.

4.3. Multi-Dimensional Attention Layer

This sub-layer further refines the attention mechanism by taking into account the multi-dimensional representations of the features. The input for this layer is the attention scores generated by the collaborative attention layer, which are then used to calculate new attention weights:

$$D = \mathrm{softmax}(W_{u} \cdot C) \tag{33}$$

In this case, $W_{u}$ represents a different parameter matrix. The algorithm for collaborative and multi-dimensional attention is provided in Algorithm 1.

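Algorithm 1: Collaborative and multi-dimensional attention.

# CollaborativeAttention and MultidimensionalAttention are custom layers whose definitions are not listed here.
# x1 and x2 are the tensors passed to the attention layer.
from keras.layers import Dense, Dropout

# Collaborative Attention
x = CollaborativeAttention(num_heads=64, key_dim=3)(x1, x2)
# Multi-Dimensional Attention
x = MultidimensionalAttention(x)
# Regularization and linear projection down to a single output
x = Dropout(0.5)(x)
x = Dense(100, activation='linear')(x)
x = Dense(64, activation='linear')(x)
model_out = Dense(units=1, activation='linear')(x)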

4.4. Full Attention Layer

Ultimately, the information generated by the previous BiLSTM layer is modified by the attention scores calculated in the multi-dimensional attention layer to produce the ultimate representation.

$$FA = D \cdot H_{2} \tag{34}$$
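The three attention steps in Eqs. (32) to (34) can be traced in a few lines of NumPy. The sketch below rests on assumptions the text does not settle: features lie along the last axis, the weight matrices are applied on the right (row-vector convention), the softmax runs over the feature axis, and the product in Eq. (34) is taken element-wise. The shapes and random values are illustrative only.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, d = 6, 16                        # time steps and BiLSTM output width (toy values)
H2 = rng.normal(size=(T, d))        # stand-in for the second BiLSTM layer's output
W_c = rng.normal(0, 0.1, (d, d))    # collaborative attention parameters
W_u = rng.normal(0, 0.1, (d, d))    # multi-dimensional attention parameters

C = softmax(H2 @ W_c)               # Eq. (32): collaborative attention weights
D = softmax(C @ W_u)                # Eq. (33): re-weighting of the first scores
FA = D * H2                         # Eq. (34): attention-scaled representation
```

Because every weight in `D` lies between 0 and 1, each entry of `FA` is a damped copy of the corresponding entry of `H2`, with the damping learned per feature.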

4.5. Taylor Series Transformation

The final representation $FA$ is subjected to a Taylor series transformation to account for more complex relationships between features. This transformation could involve expanding polynomials or applying other nonlinear transformations. Let the final representation following the full attention layer be denoted as $FA = [fa_{1}, fa_{2}, \dots, fa_{n}]$, with each component $fa_{i}$ belonging to the vector $FA$. The Taylor series transformation can be expressed mathematically as follows:

$$FA = \sum_{l=0}^{\infty} \frac{FA^{(l)}}{l!} (Z - Z_{o})^{l} \tag{35}$$

Here, $FA^{(l)}$ is the $l$th derivative of the representation $FA$ with respect to the input features, which are denoted by $Z$, and $Z_{o}$ serves as the central point of the Taylor series, at which the derivatives are evaluated. The incorporation of the Taylor series transformation allows the MMCW-BiLSTM model to capture more complex interactions between features, leading to a more thorough data representation and enhanced accuracy in inventory prediction. The algorithm for the Taylor series transformation is provided in Algorithm 2.

Algorithm 2: Taylor Series.
# Excerpt from a class method: model_in, model_out, input_layer, output_shape, approx_order, epochs, x_train, x_test, and self.exponent are defined elsewhere.
from keras.layers import Activation, Dense, Flatten, add
from keras.models import Model
from keras.utils import plot_model

# Add one power term of the input per order of the approximation
for i in range(1, approx_order):
    model_out = add([model_out, Dense(output_shape)(Activation(lambda x: self.exponent(x, n=i))(model_in))])
outputs = Activation('linear')(model_out)
outputs = Flatten()(outputs)
outputs = Dense(units=1, activation='linear')(outputs)
model = Model(inputs=input_layer, outputs=outputs)
model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])
model.summary()
# Save a diagram of the architecture
plot_model(model, to_file="Results\\Arc.png", show_shapes=True, show_dtype=True, show_layer_names=True, show_layer_activations=True, dpi=600)
model.fit(x_train, self.y_train, epochs=epochs, batch_size=32)
# Predict on the test set and flatten the result to a 1-D vector
pred = model.predict(x_test)
pred = pred.reshape(pred.shape[0] * pred.shape[1])
return pred
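To make the role of the expansion order concrete, the following sketch evaluates a truncated Taylor series in the form of Eq. (35). Since the derivatives of $FA$ are not available in closed form, the exponential function is used as a stand-in (an assumption made purely for illustration, because every derivative of exp at $Z_{o}$ equals $\exp(Z_{o})$); the function name `taylor_exp` and the chosen orders are hypothetical.

```python
import numpy as np
from math import factorial

def taylor_exp(z, z0, order):
    """Truncated Taylor series of exp about z0, summing terms l = 0..order."""
    z = np.asarray(z, dtype=float)
    return sum(np.exp(z0) / factorial(l) * (z - z0) ** l for l in range(order + 1))

z = np.array([-0.5, 0.0, 0.5])
# Maximum absolute error of the truncated series for increasing orders
errors = [float(np.max(np.abs(taylor_exp(z, 0.0, n) - np.exp(z)))) for n in (1, 2, 4, 8)]
```

The error shrinks as more power terms are added, which is the same trade-off that `approx_order` controls in Algorithm 2: higher orders represent more of the nonlinear structure at the cost of extra layers.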

4.6. Dense Layer

In the dense layer, the input values are multiplied by the weights (a dot product) and the bias is added. The weights and the biases are learnable parameters that the network tunes during training to minimize the loss function.

$$DL = W \cdot FA + b \tag{36}$$

Here, $W$ denotes the weight matrix, $FA$ denotes the Taylor series transformation output, and $b$ denotes the bias vector. The architecture of the proposed modified multi-dimensional collaborative wrapped BiLSTM for inventory prediction is shown in Figure 2, and Figure 3 depicts the layer details of the developed model.
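As a small worked instance of Eq. (36), the snippet below applies a dense layer with a linear activation to a three-element vector. The numbers and the helper name `dense` are made up for the example and have no connection to the trained model.

```python
import numpy as np

def dense(FA, W, b):
    """Affine map of Eq. (36) with a linear activation: DL = W . FA + b."""
    return W @ FA + b

FA = np.array([1.0, 2.0, 3.0])          # toy input vector
W = np.array([[0.5, -1.0, 0.0],
              [1.0,  1.0, 1.0]])        # maps 3 inputs to 2 outputs
b = np.array([0.1, -0.2])
DL = dense(FA, W, b)
```

Here the layer reduces three inputs to two outputs, the same kind of dimensionality reduction that the stacked dense layers perform before the single-unit forecast.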

5. Result and Discussion

A new model called MMCW-BiLSTM has been developed for inventory prediction in this research. Its performance is evaluated by comparing it with other top models to determine its effectiveness.

5.1. Experimental Setup

To conduct the inventory prediction experiment, a Python 3.7.6 script is run on Windows 10 with 8 GB of RAM. The datasets used are the AV demand forecasting dataset, the forecast for product demand dataset, and the dairy goods sales dataset. The activation function is linear, the loss function is MSE, the dropout rate is 0.5, the validation split is 20%, the batch size is 32, and Adam is used as the optimizer.

5.2. Dataset Description

5.2.1. AV Demand Forecasting Dataset

The AV demand forecasting dataset contains data on the demand for AV services in a specific urban environment over time [26]. It comprises a wide range of parameters, including time stamps, which give the date and time of the observations; metrics that measure AV usage, such as trip counts, distance traveled, and ride duration; and contextual factors such as climate conditions and important events. The dataset gives AV researchers, urban planners, and transport companies the opportunity to examine AV demand over time, anticipate future trends, and implement measures to meet specific targets. This structured dataset is a valuable resource for understanding the dynamics of AV utilization in an urban environment, supporting informed decision-making in policy planning, service optimization, and market analysis. A part of the training data is held out as the validation sample. Increasing the training percentage generally yields better predictions. The out-of-sample data points are used for testing.
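The split described above can be sketched as index ranges. This is one plausible reading rather than the authors' procedure: it assumes a chronological split, a training percentage of 90 as used in the later figures, and the 20% validation share from the experimental setup taken from the end of the training portion. The function name and the sample count are hypothetical.

```python
def chronological_split(n_samples, train_pct=90, val_frac=0.2):
    """Return index ranges for fitting, validation, and out-of-sample testing."""
    n_train_total = n_samples * train_pct // 100   # samples available for training
    n_val = int(n_train_total * val_frac)          # held out from the training part
    n_fit = n_train_total - n_val
    fit_idx = range(0, n_fit)
    val_idx = range(n_fit, n_train_total)
    test_idx = range(n_train_total, n_samples)     # out-of-sample points
    return fit_idx, val_idx, test_idx

fit_idx, val_idx, test_idx = chronological_split(1000)
```

Keeping the three ranges contiguous and ordered avoids leaking future observations into training, which matters for time series even though the text does not say whether the data were shuffled.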

5.2.2. Forecast for Product Demand Dataset

The data collection includes historical demand data for a manufacturing company operating on a global scale [27]. The company sells thousands of products in hundreds of different categories. There are four central warehouses, each shipping goods within the region it is responsible for. Because the products are manufactured in different places around the world, it normally takes more than one month to ship them by ocean to the central warehouses. It would therefore be advantageous for the company if the monthly demand for each product at the different central warehouses could be forecast two months ahead with a high degree of accuracy.

5.2.3. Dairy Goods Sales Dataset

The dairy goods sales dataset provides a detailed and comprehensive collection of data related to dairy farms, dairy products, sales, and inventory management [28]. This dataset encompasses a wide range of information, including farm location, land area, product details, brand information, quantities, pricing, sales information, customer locations, sales channels, stock quantities, stock thresholds, and reorder quantities.

5.3. Performance Analysis Based on AV Demand Forecasting Dataset

The performance of the MMCW-BiLSTM in predicting inventory for 180, 200, 300, 400, and 500 epochs is depicted in Figure 4. Figure 4a shows that, at a TP of 90, the MAE for epochs 180, 200, 300, 400, and 500 is 3.33, 3.16, 3.00, 2.38, and 1.75, respectively. Similarly, Figure 4b indicates that the MMCW-BiLSTM model has MAPE values of 8.66, 6.50, 5.23, 4.75, and 2.89 at a TP of 90. Figure 4c shows MSE values of 17.26, 14.89, 13.38, 10.43, and 6.76 for epochs 180, 200, 300, 400, and 500 at a TP of 90. Finally, Figure 4d presents the RMSE results of the MMCW-BiLSTM model at a TP of 90, with values of 4.15, 3.86, 3.66, 3.23, and 2.60.
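For reference, the four error measures reported in this section can be computed as in the sketch below, using their standard definitions. The helper name `forecast_errors` and the sample values are illustrative assumptions, and MAPE is expressed in percent.

```python
import numpy as np

def forecast_errors(y_true, y_pred):
    """Return MAE, MAPE (in percent), MSE, and RMSE for two equal-length series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),  # undefined if any y_true is 0
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
    }

# Hypothetical actual and predicted demand values
m = forecast_errors([100, 200, 400], [110, 190, 380])
```

Since RMSE is the square root of MSE, the two columns can be cross-checked against each other; for example, the reported MSE of 6.76 at 500 epochs corresponds to the reported RMSE of 2.60.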

5.4. Performance Analysis Based on Forecast for Product Demand Dataset

The performance of the MMCW-BiLSTM in predicting inventory for 180, 200, 300, 400, and 500 epochs is depicted in Figure 5. In Figure 5a, the MAE values at epochs 180, 200, 300, 400, and 500 are 3.74, 3.17, 2.74, 2.45, and 1.97 at a TP of 90. Similarly, Figure 5b shows the MAPE of the MMCW-BiLSTM model at a TP of 90, with values of 11.65, 6.86, 6.54, 5.76, and 3.91. Figure 5c presents the MSE for epochs 180, 200, 300, 400, and 500 at a TP of 90, with values of 20.10, 15.30, 12.64, 9.97, and 8.76. Finally, Figure 5d shows the RMSE of the MMCW-BiLSTM model at a TP of 90, with values of 4.48, 3.91, 3.56, 3.16, and 2.96.

5.5. Performance Analysis Based on Forecast for Dairy Goods Sales Dataset

The performance of the MMCW-BiLSTM in predicting inventory for 180, 200, 300, 400, and 500 epochs is depicted in Figure 6. In Figure 6a, the MAE values at epochs 180, 200, 300, 400, and 500 are 4.06, 3.57, 3.07, 2.86, and 2.54 at a TP of 90. Similarly, Figure 6b shows the MAPE of the MMCW-BiLSTM model at a TP of 90, with values of 17.17, 10.72, 7.57, 5.15, and 3.69. Figure 6c presents the MSE for epochs 180, 200, 300, 400, and 500 at a TP of 90, with values of 22.78, 19.13, 15.12, 12.34, and 10.39. Finally, Figure 6d shows the RMSE of the MMCW-BiLSTM model at a TP of 90, with values of 4.77, 4.37, 3.89, 3.51, and 3.22.

5.6. Comparative Methods

To showcase the successes of the MMCW-BiLSTM model, a comparison was conducted. This analysis included various techniques such as SVM [29], Bayesian approach [30], BP neural network [31], and meta-heuristic CNN-LSTM [3].

5.6.1. Comparative Analysis Based on TP for AV Demand Forecasting Dataset

The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model when it came to predicting inventory. It showed a significant improvement of 19.47% and achieved the lowest MAE of 1.75, as illustrated in Figure 7a.

In Figure 7b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 15.74%, reaching a minimum MAPE of 3.48 with a TP of 90.

The MMCW-BiLSTM model, shown in Figure 7c, outperforms the meta-heuristic CNN-LSTM model by 31.34% in forecasting inventory. It attains an MSE score of 6.76 with a TP of 90, surpassing existing methods.

In Figure 7d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 2.60 at a TP of 90, which is 14.60% lower than that of the CNN-LSTM model.

5.6.2. Comparative Analysis Based on TP for Forecast for Product Demand Dataset

The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model when it came to predicting inventory. It showed a significant improvement of 40.46% and achieved the lowest MAE of 1.97, as illustrated in Figure 8a.

In Figure 8b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 2.85%, reaching a minimum MAPE of 3.91 with a TP of 90.

The MMCW-BiLSTM model, shown in Figure 8c, outperforms the meta-heuristic CNN-LSTM model by 26.72% in forecasting inventory. It attains an MSE score of 8.76 with a TP of 90, surpassing existing methods.

In Figure 8d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 2.96 at a TP of 90, which is 12.56% lower than that of the CNN-LSTM model.

5.6.3. Comparative Analysis Based on TP for Forecast for Dairy Goods Sales Dataset

The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model when it came to predicting inventory. It showed a significant improvement of 5.92% and achieved the lowest MAE of 2.54, as illustrated in Figure 9a.

In Figure 9b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 67.43%, reaching a minimum MAPE of 3.69 with a TP of 90.

The MMCW-BiLSTM model, shown in Figure 9c, outperforms the meta-heuristic CNN-LSTM model by 11.91% in forecasting inventory. It attains an MSE score of 10.39 with a TP of 90, surpassing existing methods.

In Figure 9d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 3.22 at a TP of 90, which is 6.14% lower than that of the CNN-LSTM model.

5.7. Statistical Analysis Based on AV Demand Forecasting Dataset

In terms of MAE, the MMCW-BiLSTM model shows the highest accuracy in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 51.61 and achieving the lowest error value of 2.62.

In terms of MAPE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 22.54 and achieves the lowest error value of 5.87.

In terms of MSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model in predicting inventory levels, showing an improvement of 11.26 and achieving the lowest error value of 9.83.

In terms of RMSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 45.61, with the lowest RMSE of 3.14. The statistics for MAE, MAPE, MSE, and RMSE on the AV demand forecasting dataset are shown in Table 1, Table 2, Table 3, and Table 4, respectively.

5.8. Statistical Analysis Based on Product Demand Dataset

In terms of MAE, the MMCW-BiLSTM model shows the highest accuracy in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 14.13 and achieving the lowest error value of 3.37.

In terms of MAPE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 90.78 and achieves the lowest error value of 7.06.

In terms of MSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model in predicting inventory levels, showing an improvement of 32.22 and achieving the lowest error value of 16.37.

In terms of RMSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 14.87, with the lowest RMSE of 4.05. The statistics for MAE, MAPE, MSE, and RMSE on the product demand dataset are shown in Table 5, Table 6, Table 7, and Table 8, respectively.
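The text does not state how the improvement figures are computed. One reading that approximately reproduces the values quoted for MAPE and MSE on the product demand dataset is the gap between the two "Best" entries in Table 6 and Table 7, taken relative to the MMCW-BiLSTM value. The sketch below is that reading only; the function name `relative_gap` is hypothetical, and small differences from the quoted figures may come from rounding in the tables.

```python
def relative_gap(baseline, proposed):
    """Percentage by which the baseline error exceeds the proposed model's error."""
    return 100.0 * (baseline - proposed) / proposed

# "Best" column: pharmaceutical supply chain model vs. MMCW-BiLSTM
mape_gap = relative_gap(13.47, 7.06)    # Table 6, quoted in the text as 90.78
mse_gap = relative_gap(21.64, 16.37)    # Table 7, quoted in the text as 32.22
```

Under this reading the figures are percentages of the proposed model's error rather than of the baseline's, so they can exceed 100 when the baseline error is more than twice as large.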

5.9. Statistical Analysis Based on Dairy Goods Sales Dataset

In terms of MAE, the MMCW-BiLSTM model shows the highest accuracy in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 28.30 and achieving the lowest error value of 3.73.

In terms of MAPE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 36.47 and achieves the lowest error value of 9.73.

In terms of MSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model in predicting inventory levels, showing an improvement of 41.02 and achieving the lowest error value of 21.12.

In terms of RMSE, the MMCW-BiLSTM model outperforms the pharmaceutical supply chain model by 23.20, with the lowest RMSE of 4.60. The statistics for MAE, MAPE, MSE, and RMSE on the dairy goods sales dataset are shown in Table 9, Table 10, Table 11, and Table 12, respectively.

5.10. Time Complexity Analysis

The computational time of the developed MMCW-BiLSTM is compared with that of the existing methods over different numbers of iterations to show the effectiveness of the MMCW-BiLSTM method. The results show that the method is computationally efficient and consistently requires less time than the existing methods. When compared with the existing approaches on the datasets, the developed model has the lowest computing times of 20.46, 20.52, and 20.59 at iteration 100. Table 13 includes information on the computational complexity of the developed model.

5.11. Comparative Discussion

The proposed MMCW-BiLSTM model addresses the weaknesses of existing models, namely demand volatility, data quality issues, and difficulty in capturing multi-dimensional interactions between features, by using techniques purpose-built for inventory forecasting. Using BiLSTM layers in conjunction with both collaborative and multi-dimensional attention mechanisms gives the model the ability to handle complex temporal dependencies and feature correlations in the dataset. This leads to more accurate forecasts, especially when several factors influence inventory levels. In addition, the Taylor series transformation allows the model to take into account higher-order interactions among the features, which enhances its overall predictive power. The MMCW-BiLSTM model was compared with traditional and state-of-the-art methods, namely SVM, Bayesian approaches, BP neural networks, and meta-heuristic CNN-LSTM. It consistently performed better than these techniques, which supports its ability to forecast inventory levels accurately. In short, the combination of this architecture and the extensive evaluation explains why the MMCW-BiLSTM model performs better than the other predictors for inventory forecasting. The results show that the model has the lowest errors, with values of 1.75, 2.89, 6.76, and 2.6 for MAE, MAPE, MSE, and RMSE on the AV demand forecasting dataset. Likewise, on the forecast for product demand dataset, the model has the lowest error values of 1.97, 3.91, 8.76, and 2.96 for these metrics. Similarly, on the dairy goods sales dataset, the model has the lowest error values of 2.54, 3.69, 10.39, and 3.22. The comparative discussion table for the AV demand forecasting dataset and the forecast for product demand dataset is shown in Table 14.

6. Conclusions

In conclusion, the MMCW-BiLSTM model is a significant development in the field of inventory prediction. It provides better accuracy and efficiency in inventory forecasting by tackling the challenges that existing models face, including demand instability and the inability to capture multi-dimensional interactions. The combination of BiLSTM layers, collaborative and multi-dimensional attention mechanisms, and the Taylor series transformation allows the model to improve its predictive performance. The MMCW-BiLSTM model accurately forecasts inventory levels, which is crucial for efficient inventory management and supply chain optimization and ultimately leads to better operational efficiency and profitability for e-commerce firms. Further research and improvement of the MMCW-BiLSTM design will continue to pave the way for advances in stock prediction and supply chain management. The results show that the model achieves the lowest error values for MAE, MAPE, MSE, and RMSE at 1.75, 2.89, 6.76, and 2.6, respectively, on the AV demand forecasting dataset. Similarly, on the forecast for product demand dataset, the model achieves the lowest error values for the same metrics at 1.97, 3.91, 8.76, and 2.96, and on the dairy goods sales dataset, it achieves the lowest error values at 2.54, 3.69, 10.39, and 3.22. In the future, strategies need to be developed to achieve real-time inventory prediction using the MMCW-BiLSTM model, which would allow businesses to react to demand changes quickly and adjust their stock in real time. In addition, incorporating more complex techniques, such as deep reinforcement learning or generative adversarial networks, could help enhance the adaptability and accuracy of inventory prediction models. Deep reinforcement learning can learn non-stationary inventory policies: the model accepts the forecast as a policy input, matches dynamic programming in the case of backorders, and can outperform dynamic programming in the case of lost sales.

Author Contributions

Validation, K.I.; Data curation, S.A.; Writing—original draft, S.A.; Supervision, A.A. and K.I.; Project administration, A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data used in this study are available at: Dataset 1. Available online: https://www.kaggle.com/code/piyushrg/av-demand-forecasting/input?select=train_0irEZ2H.csv (accessed on 13 April 2024). Dataset 2. Available online: https://www.kaggle.com/datasets/felixzhao/productdemandforecasting (accessed on 23 March 2024). Dataset 3. Available online: https://www.kaggle.com/datasets/suraj520/dairy-goods-sales-dataset (accessed on 23 March 2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Babai, M.-Z.; Chen, H.; Syntetos, A.A.; Lengu, D. A compound-Poisson Bayesian approach for spare parts inventory forecasting. Int. J. Prod. Econ. 2021, 232, 107954.
2. Han, C.; Wang, Q. Research on commercial logistics inventory forecasting system based on neural network. Neural Comput. Appl. 2021, 33, 691–706.
3. Xue, N.; Triguero, I.; Figueredo, G.P.; Landa-Silva, D. Evolving deep CNN-LSTMs for inventory time series prediction. In Proceedings of the 2019 IEEE Congress on Evolutionary Computation (CEC), Wellington, New Zealand, 10–13 June 2019; pp. 1517–1524.
4. Muchaendepi, W.; Mbohwa, C.; Hamandishe, T.; Kanyepe, J. Inventory management and performance of SMEs in the manufacturing sector of Harare. Procedia Manuf. 2019, 33, 454–461.
5. Ho, G.T.; Tang, Y.M.; Tsang, K.Y.; Tang, V.; Chau, K.Y. A blockchain-based system to enhance aircraft parts traceability and trackability for inventory management. Expert Syst. Appl. 2021, 179, 115101.
6. Hajek, P.; Abedin, M.Z. A profit function-maximizing inventory backorder prediction system using big data analytics. IEEE Access 2020, 8, 58982–58994.
7. Weraikat, D.; Zanjani, M.K.; Lehoux, N. Improving sustainability in a two-level pharmaceutical supply chain through Vendor-Managed Inventory system. Oper. Res. Health Care 2019, 21, 44–55.
8. Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Hoboken, NJ, USA, 2015.
9. Lin, T.; Guo, T.; Aberer, K. Hybrid neural networks for learning the trend in time series. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, Melbourne, Australia, 19–25 August 2017; pp. 2273–2279.
10. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
11. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
12. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
13. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
14. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
15. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
16. Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzel, R. Learning to diagnose with lstm recurrent neural networks. arXiv 2015, arXiv:1511.03677.
17. Yang, J.; Nguyen, M.N.; San, P.P.; Li, X.; Krishnaswamy, S. Deep convolutional neural networks on multichannel time series for human activity recognition. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; Volume 15, pp. 3995–4001.
18. Benzazoua Bouazza, A.; Ardjouman, D.; Abada, O. Establishing the Factors Affecting the Growth of Small and Medium-sized Enterprises in Algeria. Am. Int. J. Soc. Sci. 2015, 4, 101–115.
19. Beyene, A. Enhancing the competitiveness and productivity of SMEs in Africa: An analysis of differential roles of national governments through improved support services. Afr. Dev. J. 2002, 27, 130–156.
20. Lukács, E. The economic role of SMEs in world economy, especially in Europe. Eur. Integr. Stud. 2005, 4, 3–12.
21. Bowen, M.; Morara, M.; Mureithi, S. Management of business challenges among small and micro enterprises in Nairobi-Kenya. KCA J. Bus. Manag. 2009, 2, 16–31.
22. Beck, T.; Demirguc-Kunt, A. Small and medium-size enterprises: Access to finance as a growth constraint. J. Bank. Financ. 2006, 30, 2931–2943.
23. Oyelaran-Oyeyinka, B.; Lal, K. Learning new technologies by small and medium enterprises in developing countries. Technovation 2006, 26, 220–231.
24. Naudé, W.; Havenga, J.J.D. Directions in African Entrepreneurship Research in Entrepreneurship and SME Research in Africa: A Selected Bibliography. 1963–2001; PU for CHE: Potchefstroom, South Africa, 2002.
25. Demizu, T.; Fukazawa, Y.; Morita, H. Inventory management of new products in retailers using model-based deep reinforcement learning. Expert Syst. Appl. 2023, 229, 120256.
26. Dataset 1. Available online: https://www.kaggle.com/code/piyushrg/av-demand-forecasting/input?select=train_0irEZ2H.csv (accessed on 13 April 2024).
27. Dataset 2. Available online: https://www.kaggle.com/datasets/felixzhao/productdemandforecasting (accessed on 23 March 2024).
28. Dataset 3. Available online: https://www.kaggle.com/datasets/suraj520/dairy-goods-sales-dataset (accessed on 23 March 2024).
29. Wang, J.; Zhu, W.; Sun, D.; Lu, H. Application of svm combined with mackov chain for inventory prediction in supply chain. In Proceedings of the 2008 4th International Conference on Wireless Communications, Networking and Mobile Computing, Dalian, China, 12–14 October 2008; pp. 1–4.
30. Bergman, J.J.; Noble, J.S.; McGarvey, R.G.; Bradley, R.L. A Bayesian approach to demand forecasting for new equipment programs. Robot. Comput.-Integr. Manuf. 2017, 47, 17–21.
31. Gu, L.; Han, Y.; Wang, C.; Shu, G.; Feng, J.; Wang, C. Research on inventory forecasting based on BP neural network. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–8.

Figure 1. The architecture of the proposed inventory prediction model.

Figure 2. The architecture of the proposed modified multi-dimensional collaborative wrapped BiLSTM in inventory prediction.

Figure 3. Layer details of the proposed modified multi-dimensional collaborative wrapped BiLSTM in inventory prediction.

Figure 4. Performance analysis based on TP for AV demand forecasting datasets: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Figure 5. Performance analysis based on TP for product demand datasets: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Figure 6. Performance analysis based on TP for Dairy Goods Sales Dataset: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Figure 7. Comparative analysis based on TP for demand forecasting datasets: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Figure 8. Comparative analysis based on TP for product demand datasets: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Figure 9. Comparative analysis based on TP for Dairy Goods Sales datasets: (a) MAE, (b) MAPE, (c) MSE, and (d) RMSE.

Table 1. Statistical table based on MAE on AV Demand Forecasting Dataset.

Method	Best	Mean	Variance
SVM	3.72	3.32	0.12
Bayesian approach	3.46	3.09	0.26
BP neural network	3.67	3.22	0.17
Meta-heuristic method evolves CNN-LSTM	3.32	2.67	0.17
Pharmaceutical supply chain model	3.97	3.14	0.42
MMCW-BiLSTM	2.62	2.08	0.08

Table 2. Statistical table based on MAPE on AV Demand Forecasting Dataset.

Method	Best	Mean	Variance
SVM	12.11	7.69	8.76
Bayesian approach	15.65	10.32	14.40
BP neural network	14.27	9.48	8.19
Meta-heuristic method evolves CNN-LSTM	16.16	9.65	10.63
Pharmaceutical supply chain model	19.10	11.92	18.86
MMCW-BiLSTM	5.87	4.41	1.36

Table 3. Statistical table based on MSE on AV Demand Forecasting Dataset.

Method	Best	Mean	Variance
SVM	20.03	14.19	15.44
Bayesian approach	17.65	15.13	9.80
BP neural network	19.95	15.66	12.42
Meta-heuristic method evolves CNN-LSTM	20.37	13.68	18.12
Pharmaceutical supply chain model	20.90	15.29	21.79
MMCW-BiLSTM	9.83	7.92	1.35

Table 4. Statistical table based on RMSE on AV Demand Forecasting Dataset.

Method	Best	Mean	Variance
SVM	4.48	3.73	0.30
Bayesian approach	4.20	3.86	0.20
BP neural network	4.47	3.93	0.21
Meta-heuristic method evolves CNN-LSTM	4.51	3.65	0.35
Pharmaceutical supply chain model	4.57	3.86	0.38
MMCW-BiLSTM	3.14	2.81	0.04

Table 5. Statistical table based on MAE on Product Demand Dataset.

Method	Best	Mean	Variance
SVM	4.28	3.53	0.36
Bayesian approach	4.24	3.70	0.18
BP neural network	4.11	3.89	0.04
Meta-heuristic method evolves CNN-LSTM	4.32	3.48	0.47
Pharmaceutical supply chain model	3.85	3.23	0.14
MMCW-BiLSTM	3.37	2.60	0.18

Table 6. Statistical table based on MAPE on Product Demand Dataset.

Method	Best	Mean	Variance
SVM	14.58	10.56	6.47
Bayesian approach	24.20	10.69	37.23
BP neural network	18.14	12.98	19.40
Meta-heuristic method evolves CNN-LSTM	23.31	14.85	22.90
Pharmaceutical supply chain model	13.47	9.51	10.34
MMCW-BiLSTM	7.06	5.71	1.09

Table 7. Statistical table based on MSE on Product Demand Dataset.

Method	Best	Mean	Variance
SVM	22.00	18.01	14.41
Bayesian approach	25.17	19.11	26.72
BP neural network	25.04	20.44	12.16
Meta-heuristic method evolves CNN-LSTM	25.77	21.03	28.73
Pharmaceutical supply chain model	21.64	16.39	13.75
MMCW-BiLSTM	16.37	11.76	7.22

Table 8. Statistical table based on RMSE on Product Demand Dataset.

Method	Best	Mean	Variance
SVM	4.69	4.22	0.22
Bayesian approach	5.02	4.33	0.38
BP neural network	5.00	4.50	0.16
Meta-heuristic method evolves CNN-LSTM	5.08	4.54	0.42
Pharmaceutical supply chain model	4.65	4.02	0.21
MMCW-BiLSTM	4.05	3.41	0.15

Table 9. Statistical table based on MAE on Dairy Goods Sales Dataset.

Method	Best	Mean	Variance
SVM	5.01	4.37	0.31
Bayesian approach	4.11	3.49	0.21
BP neural network	4.83	4.08	0.40
Meta-heuristic method evolves CNN-LSTM	4.76	4.02	0.80
Pharmaceutical supply chain model	4.76	4.01	0.17
MMCW-BiLSTM	3.73	2.91	0.20

Table 10. Statistical table based on MAPE on Dairy Goods Sales Dataset.

Method	Best	Mean	Variance
SVM	30.86	16.90	85.05
Bayesian approach	19.90	12.13	22.70
BP neural network	23.55	16.21	29.96
Meta-heuristic method evolves CNN-LSTM	27.69	18.77	41.33
Pharmaceutical supply chain model	18.71	13.51	24.84
MMCW-BiLSTM	9.73	6.04	6.09

Table 11. Statistical table based on MSE on Dairy Goods Sales Dataset.

Method	Best	Mean	Variance
SVM	32.33	26.73	38.98
Bayesian approach	25.02	18.25	17.60
BP neural network	31.47	23.86	32.84
Meta-heuristic method evolves CNN-LSTM	31.39	23.36	60.00
Pharmaceutical supply chain model	30.51	22.77	21.42
MMCW-BiLSTM	21.12	13.59	16.02

Table 12. Statistical table based on RMSE on Dairy Goods Sales Dataset.

Method	Best	Mean	Variance
SVM	5.69	5.13	0.40
Bayesian approach	5.00	4.24	0.24
BP neural network	5.61	4.85	0.36
Meta-heuristic method evolves CNN-LSTM	5.60	4.76	0.75
Pharmaceutical supply chain model	5.52	4.75	0.23
MMCW-BiLSTM	4.60	3.65	0.27

Table 13. Comparative analysis based on computational time.

Method	AV Demand Forecasting Dataset	Product Demand Dataset	Dairy Goods Sales Dataset
SVM	20.80	20.76	20.69
Bayesian approach	20.80	20.81	20.78
BP neural network	20.81	20.81	20.81
Meta-heuristic method evolves CNN-LSTM	20.81	20.83	20.82
Pharmaceutical supply chain model	20.83	20.83	20.83
MMCW-BiLSTM	20.46	20.52	20.59

Table 14. Comparative discussion of all methods on the AV demand forecasting dataset, the forecast for product demand dataset, and the dairy goods sales dataset.

Dataset	Metric	SVM	Bayesian Approach	BP Neural Network	Meta-Heuristic Method Evolves CNN-LSTM	Pharmaceutical Supply Chain Model	MMCW-BiLSTM
AV Demand Forecasting Dataset	MAE	2.78	3.46	2.57	2.09	2.09	1.75
	MAPE	12.11	9.85	4.8	5.88	7.44	2.89
	MSE	12.97	16.13	10.29	8.3	8.88	6.76
	RMSE	3.6	4.02	3.21	2.88	2.98	2.6
Forecast for Product Demand Dataset	MAE	2.39	4.06	3.64	2.48	2.77	1.97
	MAPE	8.34	24.20	6.74	9.39	4.02	3.91
	MSE	11.02	25.17	14.13	10.39	11.10	8.76
	RMSE	3.32	5.02	3.76	3.22	3.33	2.96
Dairy Goods Sales Dataset	MAE	4.74	3.79	3.03	2.70	3.54	2.54
	MAPE	27.48	5.15	8.10	11.33	5.81	3.69
	MSE	32.33	20.23	14.98	11.79	17.61	10.39
	RMSE	5.69	4.50	3.87	3.43	4.20	3.22
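
Since RMSE is by definition the square root of MSE, the MSE and RMSE rows of Table 14 can be cross-checked against each other. The sketch below copies those two rows for each dataset (columns in table order: SVM, Bayesian approach, BP neural network, meta-heuristic CNN-LSTM, pharmaceutical supply chain model, MMCW-BiLSTM) and reports the largest gap between the square root of the MSE and the tabulated RMSE. The dictionary layout and the function name are illustrative assumptions, not part of the paper.

```python
import math

# MSE and RMSE rows copied from Table 14; column order follows the table:
# SVM, Bayesian, BP NN, meta-heuristic CNN-LSTM, pharma model, MMCW-BiLSTM.
table14 = {
    "AV demand forecasting": {
        "mse": [12.97, 16.13, 10.29, 8.3, 8.88, 6.76],
        "rmse": [3.6, 4.02, 3.21, 2.88, 2.98, 2.6],
    },
    "Forecast for product demand": {
        "mse": [11.02, 25.17, 14.13, 10.39, 11.10, 8.76],
        "rmse": [3.32, 5.02, 3.76, 3.22, 3.33, 2.96],
    },
    "Dairy goods sales": {
        "mse": [32.33, 20.23, 14.98, 11.79, 17.61, 10.39],
        "rmse": [5.69, 4.50, 3.87, 3.43, 4.20, 3.22],
    },
}


def max_rmse_gap(table):
    """Largest |sqrt(MSE) - RMSE| over every dataset and method."""
    return max(
        abs(math.sqrt(m) - r)
        for rows in table.values()
        for m, r in zip(rows["mse"], rows["rmse"])
    )


gap = max_rmse_gap(table14)
# A gap below 0.005 means every pair agrees to two decimal places.
print(f"largest |sqrt(MSE) - RMSE| = {gap:.4f}")
```

Every pair agrees to within two-decimal rounding, so the MSE and RMSE rows of Table 14 are mutually consistent. For example, the MMCW-BiLSTM MSE of 6.76 on the AV demand forecasting dataset gives an RMSE of exactly 2.6.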

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

