1. Introduction
As enterprises grow, inventory becomes more significant as a buffer that sustains production and absorbs fluctuations in demand. Expansion widens the required range of raw materials, partially processed goods, and purchased components, and with it the capital needed to hold them. An inventory demand forecast that is too high leaves a backlog of stock, which can hinder company growth and potentially lead to bankruptcy [1,2,3]. The importance of inventory management in the manufacturing sector therefore cannot be overstated. In China, the actual cost of producing a product represents only 10% of the overall cost, while logistic expenses make up around 40%. Inventory costs account for 80 to 90% of these logistic expenses, which translates to 32–36% of the total product cost [4]. This is significantly higher than the cost of directly manufacturing products. Conversely, an inventory demand forecast that is too low can lead to losses from stock-outs, affecting not just one company but the entire supply chain. Companies with global subsidiaries, such as Boeing, Airbus, and a commercial logistics company owned by a global manufacturing corporation, exemplify this phenomenon. Many of their main products consist of hundreds to tens of thousands of parts [5,6,7,8].
Furthermore, the supplier network behind a single company involves dozens to hundreds of companies worldwide. Inadequate or excessive inventory levels can affect thousands of companies in the supply chain and, in turn, the overall health of these large enterprises [9,10]. This underscores the importance of a logical and scientific forecast of inventory demand. As companies expand, their product range becomes larger and more complex, which makes inventory demand forecasting more challenging [11,12]. A subsidiary of a major global manufacturing company focused on commercial logistics drew attention to the difficulty of predicting inventory needs for large items such as Boeing and Airbus aircraft, which consist of numerous parts. Previous studies in accounting and inventory management have mainly concentrated on either controlling inventory or planning for inventory needs [13,14]. Considering the different systematic control solutions available, prior studies have aimed to determine the optimal timing and quantity for ordering inventory [15]. Experts also recommend different material planning strategies depending on the duration of product use. Most of the suggested analytical methods treat inventory management as a multi-objective optimization problem that minimizes ordering and storage expenses while maximizing profit or overall utility [16]. Prior research has typically used stochastic approximations to predict expected backorders from the base stock level. More recently, researchers have introduced a model that analyzes inventory levels across multiple levels of distribution to predict stock-outs. Time series forecasting is another method for forecasting inventory levels. However, inventory databases hold a wealth of information that is not typically utilized, such as previous stock levels, amounts of overdue stock, and flags indicating operational issues [17,18]. Although numerous studies have addressed the anticipation of inventory backorders, research on profit-driven models based on big data is still lacking [19,20].
The primary goal of this research is to fill this gap by examining how a model can provide valuable insights into the financial benefits of optimal backorder strategies in the age of big data [21]. This study also highlights a specific challenge in using machine learning classifiers for this purpose. In classic supply chain management, the number of items on backorder (the positive or minority class) is significantly lower than the number of active products or items not on backorder (the negative or majority class), which results in imbalanced classes [22,23]. Other real-world forecasting areas, such as loan approval prediction, corporate bankruptcy forecasting, and credit card fraud detection, frequently exhibit the same imbalance. When modeling imbalanced datasets, the minority positive instances must be given priority over the majority class examples, and the two classes must be balanced so that the positive data are used effectively. The different costs of incorrectly predicting backordered and non-backordered products also need to be addressed [24]. The existing inventory forecasting models built on the paradigms mentioned above face several challenges. Demand volatility is one of the biggest problems in prediction. Data quality is also a major challenge for least squares-based inventory models. Multi-dimensional interaction is another, because interactions between features are hard to capture. The gaps in existing inventory prediction models are therefore multi-faceted, encompassing both technical limitations and practical barriers that hinder real-world performance. Designing effective predictive models and optimizing their parameters is not easy. Practical challenges, such as the high investment of money, time, and effort, may also slow adoption, particularly among SMEs. To address these concerns, this research proposes a novel method for forecasting inventory.
The research introduces the MMCW-BiLSTM model for inventory prediction, a new technique for modeling complex temporal dependencies and interactions between features. The model combines BiLSTM layers with collaborative and multi-dimensional attention mechanisms and a Taylor series transformation, which together are instrumental in accurately forecasting inventory levels.
The BiLSTM layers, together with collaborative and multi-dimensional attention, help the model capture complex temporal dependencies and correlations among features within a given dataset. The Taylor series transformation, in turn, captures higher-order interactions and thus enhances the predictive power of the model.
We organize the document as follows:
Section 2 outlines previous studies on inventory forecasting.
Section 3 provides an explanation of the inventory prediction model’s approach.
Section 4 examines the mathematical modeling of the modified multi-dimensional collaboratively wrapped BiLSTM model.
Section 5 offers a detailed description of the implementation and evaluation of performance.
Finally, Section 6 wraps up the research and proposes possible avenues for future research.
2. Literature Review
Babai M.Z. et al. [1] presented a new Bayesian method that uses compound Poisson distributions for predicting inventory. They compared their technique with a Bayesian approach based on Poisson distributions with a Gamma prior, along with both parametric and non-parametric frequentist methods, all of which yielded precise predictions. Although the proposed Bayesian method performed better in both empirical and theoretical terms, it was more computationally intensive and complex than the parametric frequentist methods, which may make it challenging for practitioners to grasp. Although these frameworks provide SPC robustness, they can be computationally expensive. In our case, by building on advances in neural networks, we aim to reduce the computational burden while improving predictive precision.
Chaoliang Han and Qiuying Wang [2] developed a BP neural network to construct a predictive model that analyzes inventory demand and explores the complex relationship between inventory demand and its influencing factors. By training on data about these influencing factors, the model generates effective strategies for inventory management and control and achieves more accurate predictions. However, determining the optimal number of nodes in the hidden layer remained a challenge. Neural networks are flexible and nonlinear but sometimes require manual tuning. Our approach uses both attention mechanisms and multi-dimensional modeling so that the model can learn these underlying properties on its own, which reduces the need to tune parameters manually.
In their research, Ning Xue et al. [3] presented a meta-heuristic approach that develops CNN-LSTM structures for time series prediction, using actual data from a neighborhood food store. Their experiments demonstrate that this model can handle complex nonlinear inventory forecasting problems. Nevertheless, optimizing the parameters may still pose a challenge and require ongoing human oversight. Structures such as CNN-LSTM extract features well but can depend heavily on their parameters. To improve model reliability and reduce the need for parameter tuning during training, attention mechanisms and feature engineering are applied here.
In their research, Muchaendepi et al. [4] concentrated on enhancing professionalism and education in the field of inventory management to strengthen the skills of those who oversee inventories in small and medium-sized businesses. This resulted in a significant enhancement in predictive accuracy. Nevertheless, excessive stock levels can still increase a company's expenses for storage, handling, and interest payments on short-term borrowings. Upskilling is important, but it does not directly resolve the intricacies of predicting inventory. Our solution is designed to make both mathematical modeling and skill training central to addressing inventory management issues.
G.T.S. Ho et al. [5] implemented a blockchain-based management platform for the accurate storage of data related to the traceability of spare parts. The system used Hyperledger Fabric and Hyperledger Composer for organizational agreement and validation, ensuring high levels of privacy, resilience, flexibility, and scalability. Nevertheless, the repeated resale of many spare parts makes it difficult to track their past transactions. Blockchain handles data securely and transparently, although the technology may struggle with varied supply chain transactions. Our approach is to couple predictive modeling with blockchain traceability to give insights into future inventories.
Petr Hajek and Mohammad Zoynul Abedin [6] introduced a machine learning model that includes an under-sampling technique to optimize the anticipated profit of backorder decisions. The model was computationally efficient and robust to changes in warehousing costs, inventory expenses, and sales margins. However, its training duration was exceptionally long. Machine learning models are scalable and flexible, but they need a lot of processing power. Our approach builds on resilient neural components and feature engineering to improve computational performance and shorten the training period.
Dua Weraikat [7] introduced a pharmaceutical supply chain model that attains high prediction accuracy. However, the model faced difficulties due to computational complexity and overfitting. Our framework learns attention weights that focus on individual components and their interactions with each other; concentrating on the salient features of the dataset in this way may help to avoid overfitting.
Tsukasa Demizu et al. [25] introduced a model-based deep reinforcement learning approach that demonstrated exceptional sample efficiency. They also suggested a novel inventory management technique for new products that integrates offline model learning with online planning. This model achieves significant profit and efficiency gains while also decreasing the occurrence of stock-outs throughout the entire sales duration of a product. However, improving inventory management immediately after a product's introduction was challenging because too little data was available for learning. By employing BiLSTM layers with collaborative and multi-dimensional attention, our model can capture temporal dependencies and feature interactions even when only preliminary data are available.
3. Methodology
As noted above, existing inventory forecasting models struggle with demand volatility, one of the biggest problems in prediction, and with data quality, which is a particular challenge for least squares-based inventory models. The main aim of this research is to design a predictive model that anticipates future inventory levels. We first read the datasets, namely AV Demand Forecasting [26], Forecast for Product Demand [27], and Dairy Goods Sales [28]. The model takes the input data and applies the BiLSTM to capture temporal dependencies in the forward and backward directions. Next, the output passes through collaborative and multi-dimensional attention, which learns the collaboration between different features and adjusts the attention based on multi-dimensional feature interactions. A Taylor series transformation is then used to identify higher-order feature interactions more effectively and to improve the model's capacity to identify complex underlying patterns within the dataset. Finally, dense layers perform further transformations and dimensionality reduction to produce the final forecast of future inventory levels. We evaluate the model with new data to infer appropriate inventory levels, leading to effective inventory management and supply chain process optimization. The architecture of the proposed inventory prediction model is shown in Figure 1.
3.1. Input
The data utilized in the inventory forecasting model are gathered from [26,27,28] and are represented mathematically as a database of records.
3.2. Preprocessing
In the first step, the dataset is read, its attributes are extracted, and items such as product type, sales volume, and timestamps are identified and collected as the relevant information. In addition, a time series analysis is carried out with the ARIMA method in order to identify temporal patterns in the data.
ARIMA models, often referred to as Box–Jenkins models, are linear models that are frequently used to analyze univariate time series data. These models express the value of a sequence at a specific point in time in terms of previous values and previous errors.
The nonseasonal part of an ARIMA model is characterized by the number of parameters in the AR model, the degree of differencing e, and the number of parameters r in the MA model. The seasonal part is characterized by the number of parameters in the seasonal AR model, the level of seasonal differencing, and the number of parameters in the seasonal MA model, together with the seasonality period.
The model relates the value observed at a specific time point (or a transformed value) to the random error at that time point. On the side of the observed value, the AR operator, the seasonal AR operator, the differencing operator, and the seasonal differencing operator are applied; on the side of the random error, the MA operator and the seasonal MA operator are applied. Each operator is expressed in terms of the backshift operator or the seasonal backshift operator.
Autocorrelation and partial autocorrelation functions are essential components in building ARIMA models. Examining their coefficients at various time lags helps determine the degree of differencing, the degree of seasonal differencing, and the number of parameters within the AR, MA, and seasonal AR components.
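To make the differencing step concrete, the following minimal sketch applies seasonal and then ordinary differencing to a short series in plain Python. The function name `difference`, the toy series, and the period of 4 are illustrative assumptions and are not taken from the original pipeline.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy series (assumed for illustration): a linear trend of +1 per step
# plus a seasonal pattern that repeats every 4 steps.
seasonal_pattern = [0, 5, 0, -5]
series = [t + seasonal_pattern[t % 4] for t in range(12)]

# Seasonal differencing with the seasonality period removes the repeating pattern.
seasonally_differenced = difference(series, lag=4)

# Ordinary differencing then removes what is left of the trend.
stationary = difference(seasonally_differenced, lag=1)
```

After both steps the toy series is constant, which is the kind of stationarity that the autocorrelation and partial autocorrelation plots are used to check before the AR and MA orders are chosen.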
3.3. Feature Extraction
Feature extraction finds and extracts useful attributes or characteristics from the inventory dataset in order to improve the precision of the inventory level forecast.
3.3.1. Statistical Features
Statistical features reflect the distribution, variability, and overall characteristics of the inventory data. They support forecasting techniques that analyze seasonal trends and external factors such as weather patterns, and they help to make predictions about the future based on past data. Each statistical feature is described below:
- (i) Mean: The mean represents the typical value or central tendency of inventory-related data over a specified period of time. It is calculated as the average of the preprocessed data, that is, the sum of the preprocessed values over all data point positions divided by the overall number of elements.
- (ii) Variance: Variance indicates the degree of dispersion of the inventory data points from the mean. A greater variance points to higher variability in the data.
- (iii) Standard Deviation: The standard deviation is the square root of the variance and indicates the typical deviation of data points from the mean. It measures the extent of the dispersion or scatter in the dataset.
- (iv) Skewness: Skewness measures the asymmetry of the distribution of the inventory data about its mean. A positive skewness means that the distribution has a longer tail on the right side of the mean, while a negative skewness means that the tail is extended on the left side.
- (v) Kurtosis: Kurtosis measures the peakedness or flatness of the inventory data distribution compared to a normal distribution. Larger kurtosis describes a sharper peak and heavier tails, while smaller kurtosis indicates a flatter distribution.
- (vi) Entropy: Entropy quantifies the uncertainty or irregularity in the probability distribution of the inventory data. It is computed from the probability mass function of the preprocessed data.
- (vii) Geometric Mean: The geometric mean is the nth root of the product of the n numbers in the inventory dataset. It is used to assess the central tendency of datasets that exhibit exponential growth or decay.
- (viii) Harmonic Mean: The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the inventory data values. It is useful for averaging rates or ratios.
- (ix) Minimum: The minimum is the smallest data point in the preprocessed inventory dataset. It gives the lowest observed value.
- (x) Maximum: The maximum is the largest data point in the preprocessed inventory dataset. It gives the peak observed value.
- (xi) Sum: The sum adds up the inventory data values over a given time period. It offers the aggregate value of the whole dataset.
Together, these values form the statistical feature output.
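The statistical features listed above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `statistical_features` is hypothetical, population (divide-by-n) moments are assumed, entropy is taken over the empirical frequencies of the observed values in bits, and the values are assumed to be strictly positive so that the geometric and harmonic means are defined.

```python
import math
from collections import Counter


def statistical_features(values):
    """Summary statistics for a non-empty list of strictly positive numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance
    std = math.sqrt(variance)
    # Standardized third and fourth central moments (0.0 for constant data).
    skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4) if std else 0.0
    # Shannon entropy of the empirical probability mass function, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(values).values())
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "geometric_mean": math.exp(sum(math.log(v) for v in values) / n),
        "harmonic_mean": n / sum(1.0 / v for v in values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
    }


features = statistical_features([2, 4, 4, 6])
```

For the symmetric sample `[2, 4, 4, 6]` the mean is 4, the variance is 2, and the skewness is 0, which matches the interpretation of skewness given in item (iv).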
3.3.2. Rolling Mean
The rolling mean is calculated over a window of a given size, which determines the number of data points involved in each average. It is obtained by averaging the consecutive preprocessed data points that fall inside the window as the window moves along the series, so each value depends only on the preprocessed data at the most recent time points. The rolling mean therefore provides a smoothed representation of the underlying trend or pattern in the preprocessed inventory data.
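A minimal sketch of the rolling mean in plain Python follows. The function name `rolling_mean` and the choice to emit one value per full window (rather than padding the first positions) are assumptions made for illustration.

```python
def rolling_mean(values, window):
    """Average each run of `window` consecutive points; one output per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[end - window + 1:end + 1]) / window
        for end in range(window - 1, len(values))
    ]


smoothed = rolling_mean([10, 20, 30, 40, 50], window=3)
```

A larger window smooths more aggressively but reacts more slowly to recent changes in inventory.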
3.3.3. Expanding Mean Features
The expanding mean, commonly referred to as the cumulative moving average, is the average of all the data points seen up to a specific time point. In contrast to the fixed window of the rolling mean, the window of the expanding mean grows continuously so that it always covers every data point from the beginning of the dataset up to the current time point.
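The expanding mean can be sketched with a running total, as below. The function name `expanding_mean` is a hypothetical choice for this illustration.

```python
def expanding_mean(values):
    """Cumulative moving average: the mean of values[0..t] for every t."""
    means = []
    running_total = 0.0
    for count, value in enumerate(values, start=1):
        running_total += value
        means.append(running_total / count)
    return means


cumulative = expanding_mean([10, 20, 30, 40])
```

Keeping a running total avoids re-summing the whole history at every step, so the feature costs one pass over the data.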
3.3.4. Exponential Smoothing Features (ESF)
The exponential smoothing technique uses exponential functions to smooth time series data. It is used to make short-term forecasts based on prior assumptions; its long-term forecasts are somewhat less reliable. Exponential smoothing techniques are straightforward yet effective and are commonly applied in business scenarios such as inventory management. There are three distinct techniques for ESF: simple exponential smoothing (SES), double exponential smoothing (DES), and triple exponential smoothing (TES). Simple exponential smoothing is an inexpensive approach used when historical data do not exhibit cyclical patterns or trends. It generates a smooth curve by combining one actual previous value and one forecasted previous value.
In this context, the estimated (smoothed) value at the current time is computed from the smoothed value and the actual value of the series at T = t − 1. The smoothing parameter varies between 0 and 1 and determines the level of smoothness of the data series. Although the influence of all past data is taken into account, only two numerical values, the previous actual value and the previous smoothed value, are required for the actual computation. Hence, one of the main benefits of exponential smoothing is its ability to handle considerable amounts of data efficiently.
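The SES recursion described above can be sketched as follows. The function name `simple_exponential_smoothing` and the choice to seed the first forecast with the first observation are assumptions of this sketch; the source does not state its initialization.

```python
def simple_exponential_smoothing(values, alpha):
    """One-step-ahead forecasts f[t] = alpha * y[t-1] + (1 - alpha) * f[t-1].

    Returns len(values) + 1 numbers; the last one is the forecast for the
    period after the end of the series.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    forecasts = [float(values[0])]  # assumption: seed with the first observation
    for actual in values:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


forecasts = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
```

Each step needs only the previous actual value and the previous forecast, which is why the method scales to long histories. An alpha close to 1 tracks recent observations closely, while an alpha close to 0 produces a smoother, slower-moving curve.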
Double exponential smoothing, also referred to as Holt's exponential smoothing, is frequently used in economics to forecast values and predict trends. Holt's method maintains two quantities, the smoothed level and the trend, each with its own smoothing parameter. Both are updated from the actual value of the time series at each time period, and the forecast made a given number of steps ahead of the forecast origin is obtained from the current level and trend. The initial values for the trend and level are determined by the governing equations provided in the following section.
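A minimal sketch of Holt's method follows, using the textbook level and trend recursions. The function name `holt_forecast`, the parameter names, and the initialization (level set to the first observation, trend set to the first difference) are assumptions of this sketch and may differ from the initialization used in the paper.

```python
def holt_forecast(values, alpha, beta, steps_ahead=1):
    """Forecast `steps_ahead` periods past the end of `values` with Holt's method."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed")
    # Assumed initialization: first observation and first difference.
    level = float(values[0])
    trend = float(values[1] - values[0])
    for actual in values[1:]:
        previous_level = level
        # Level: blend the new observation with the previous one-step forecast.
        level = alpha * actual + (1 - alpha) * (previous_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + steps_ahead * trend


next_value = holt_forecast([2, 4, 6, 8, 10], alpha=0.5, beta=0.5, steps_ahead=1)
```

On a perfectly linear series such as the one above, the level and trend reproduce the line, so the forecast simply continues it.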
The Holt–Winters method, sometimes called triple exponential smoothing, is a valuable tool for analyzing seasonal time series because it can represent both the trend and the seasonality in past data. The approach decomposes the series into level, trend, and seasonal components. While widely used, it can fit only one seasonal pattern, which can be a drawback for load forecasting. The Holt–Winters technique offers two computational models, one additive and one multiplicative. The additive model is appropriate when the seasonal variation is roughly constant in size, while the multiplicative model is suitable when the seasonal variation changes with the level of the series. Equations (23)–(26) are used for the additive model.
In this context, the estimated level, the estimated trend, and the estimate of seasonality are each governed by their own smoothing parameter. The remaining quantities are the seasonal period, the length, the current data value, and the forecasted value for the next period.
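The additive model can be sketched with the standard level, trend, and seasonal recursions. The function name `holt_winters_additive`, the parameter names, and the simple initialization from the first two seasons are assumptions of this sketch; they are not taken from the paper's Equations (23)–(26), which did not survive in this text.

```python
def holt_winters_additive(values, period, alpha, beta, gamma):
    """One-step-ahead forecast with additive Holt-Winters smoothing."""
    if len(values) < 2 * period:
        raise ValueError("at least two full seasons are needed")
    first, second = values[:period], values[period:2 * period]
    # Assumed initialization from the first two seasons.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [v - level for v in first]
    for t in range(period, len(values)):
        actual = values[t]
        season = seasonals[t % period]
        previous_level = level
        # Level: deseasonalized observation blended with the previous forecast.
        level = alpha * (actual - season) + (1 - alpha) * (previous_level + trend)
        # Trend: latest change in level blended with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonality: detrended observation blended with last season's estimate.
        seasonals[t % period] = gamma * (actual - level) + (1 - gamma) * season
    return level + trend + seasonals[len(values) % period]


history = [10, 20, 30, 20] * 3  # toy series with a period of 4 and no trend
forecast = holt_winters_additive(history, period=4, alpha=0.5, beta=0.5, gamma=0.5)
```

For this purely seasonal toy series the forecast for the next period equals the first value of the repeating pattern.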
3.3.5. Autocorrelation
Autocorrelation measures the correlation between a time series and lagged versions of itself, that is, the relationship between the current value of a variable and its past values.
The autocorrelation at a given time interval (lag) is computed from the inventory information at each time point, the average of the inventory data, and the overall number of observations in the inventory data.
The combined result of the feature extraction steps is the feature-extracted output.
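The autocorrelation feature can be sketched with the usual sample estimator, which divides the lagged covariance by the total variation around the mean. The function name `autocorrelation` is hypothetical, and the estimator is assumed to be the standard one because the original formula is not available here.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given non-negative lag."""
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / n
    denominator = sum((v - mean) ** 2 for v in values)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


lags = [autocorrelation([1, 2, 3, 4, 5], lag) for lag in range(3)]
```

The value at lag 0 is always 1, and the values at larger lags show how quickly the dependence on past inventory levels decays.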
3.4. Dataset Augmentation
To enrich the existing dataset and give the predictive model a more complete set of variables, the features generated from the original dataset (statistical features, rolling mean, expanding mean, exponential smoothing, and autocorrelation) are merged with it. This augmentation joins the extracted features to the existing dataset by adding them as new columns or variables. The augmented dataset thus combines the static attributes with the dynamic insights gained from time series analysis and reflects the overall state of the inventory system more fully.
The augmented dataset is the concatenation of the original dataset features and the feature-extracted output.
In this context, every row in the dataset represents a sample and contains the original features together with the additional features derived from time series analysis. This expanded dataset is subsequently used as input for the predictive modeling techniques.
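The column-wise concatenation described above can be sketched as follows. The function name `augment`, the dictionary-based row layout, and the column names are assumptions made for illustration; in practice a data frame library would typically be used.

```python
def augment(rows, feature_columns):
    """Append every extracted feature column to the matching row of the dataset."""
    for name, column in feature_columns.items():
        if len(column) != len(rows):
            raise ValueError(f"feature '{name}' does not have one value per row")
    return [
        {**row, **{name: column[i] for name, column in feature_columns.items()}}
        for i, row in enumerate(rows)
    ]


original = [{"sales": 10}, {"sales": 20}, {"sales": 30}]
extracted = {"expanding_mean": [10.0, 15.0, 20.0]}  # hypothetical feature column
augmented = augment(original, extracted)
```

Each output row keeps its original attributes and gains one new entry per extracted feature, so the number of samples is unchanged while the number of columns grows.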
4. Working of Modified Multi-Dimensional Collaborative Wrapped BiLSTM in Inventory Prediction
The augmented dataset forms the input of the modified multi-dimensional collaboratively wrapped BiLSTM for inventory prediction. The MMCW-BiLSTM model is made up of several sub-components that work together to analyze the input and deliver accurate predictions of future inventory levels. First, the input data are read, their attributes are extracted, and ARIMA is used for the time series analysis. From these time series data, the statistical, rolling mean, expanding mean, exponential smoothing, and autocorrelation features are extracted and then merged with the dataset features. The architecture consists of three BiLSTM layers, followed by two additional BiLSTM layers. The collaborative attention layer, the multi-dimensional attention layer, and the full attention layer take the output of the preceding two BiLSTM layers as their input. A Taylor series transformation is then used to identify higher-order feature interactions more effectively and to improve the model's capacity to identify complex underlying patterns within the dataset. Finally, dense layers perform further transformations and dimensionality reduction to produce the final forecast of future inventory levels. This systematic interaction of components helps the MMCW-BiLSTM model to learn from augmented datasets and generate accurate inventory forecasts.
4.1. BiLSTM
Hochreiter and Schmidhuber [13] introduced the idea of LSTM, which comprises three gates and two conveyor belts in a single unit to safeguard the state of each neuron and regulate the flow of information. This gating mechanism allows for controlled information transfer within the LSTM unit's four interconnected neural network layers. To understand the structure and performance of a BiLSTM network, it is necessary to first examine a unidirectional LSTM network, which falls under the category of recurrent neural networks (RNNs). Traditional RNNs do not have a clear layer structure and use cyclic connections in hidden layers to handle short-term memory and sequences such as time series data. LSTM networks, on the other hand, address the problem of long-term dependencies, which plain RNNs struggle with. LSTMs tackle this problem by carrying crucial information through all LSTM units in a memory cell that functions like a conveyor belt. Each LSTM unit consists of a memory cell and three gates that regulate the information flow by deciding what to keep and what to discard. This process allows LSTMs to capture lasting relationships and address the challenge mentioned above. The three gates are commonly known as the forget gate, input gate, and output gate. Two BiLSTM layers process the input, recognizing both forward and backward temporal relationships in sequential data.
In this context, the input at each time step is passed to the LSTM cell, denoted LSTM, and the LSTM cell outputs are indexed from 1 to 2. Ultimately, the output is generated by combining the forward and backward states.
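The way forward and backward states are combined can be illustrated with a deliberately small, pure-Python LSTM that uses scalar inputs and a scalar hidden state. Everything in this sketch is an assumption made for illustration: the function names, the fixed toy weights, and the choice to combine the two directions by concatenation. It is not the trained model of this work.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_pass(inputs, weights):
    """Run a scalar LSTM over `inputs` and return the hidden state at every step."""
    hidden, cell, outputs = 0.0, 0.0, []
    for x in inputs:
        # Every gate sees the current input and the previous hidden state.
        pre = {name: w_x * x + w_h * hidden + b
               for name, (w_x, w_h, b) in weights.items()}
        forget_gate = sigmoid(pre["forget"])
        input_gate = sigmoid(pre["input"])
        output_gate = sigmoid(pre["output"])
        candidate = math.tanh(pre["candidate"])
        # Memory cell (the "conveyor belt"): keep part of the old state, add new content.
        cell = forget_gate * cell + input_gate * candidate
        hidden = output_gate * math.tanh(cell)
        outputs.append(hidden)
    return outputs


def bilstm(inputs, forward_weights, backward_weights):
    """Pair the forward state with the backward state at every time step."""
    forward = lstm_pass(inputs, forward_weights)
    backward = lstm_pass(inputs[::-1], backward_weights)[::-1]
    return [[f, b] for f, b in zip(forward, backward)]


# Toy weights (w_input, w_hidden, bias) per gate, chosen arbitrarily.
toy = {name: (0.5, 0.1, 0.0) for name in ("forget", "input", "output", "candidate")}
sequence = [0.2, -0.4, 0.9, 0.1]
states = bilstm(sequence, toy, toy)
```

Each element of `states` holds two numbers for one time step: the first summarizes the past of the sequence and the second summarizes its future, which is what allows the later attention layers to use context from both directions.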
4.2. Collaborative Attention Layer
This layer discovers collaborative patterns among the different features by computing an attention score for each feature. The attention weight is computed from a matrix of learnable parameters and the output generated by the second BiLSTM layer.
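Because the exact attention formula is not available here, the sketch below shows one common way such weights are obtained: a learned linear scoring of each BiLSTM output followed by a softmax. The function names, the use of a single scoring vector instead of a full parameter matrix, and the weighted-sum output are all assumptions of this illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(states, parameters):
    """Score every state with `parameters`, then return (weights, weighted sum)."""
    scores = [sum(w * h for w, h in zip(parameters, state)) for state in states]
    weights = softmax(scores)
    width = len(states[0])
    context = [
        sum(a * state[j] for a, state in zip(weights, states)) for j in range(width)
    ]
    return weights, context


# Toy BiLSTM outputs (one vector per time step) and a toy scoring vector.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention(states, parameters=[1.0, 1.0])
```

States that score higher receive larger weights, so they contribute more to the representation passed on to the next layer.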
4.3. Multi-Dimensional Attention Layer
This sub-layer further refines the attention mechanism by taking into account the multi-dimensional representations of the features. Its input is the set of attention scores generated by the collaborative attention layer, from which new attention weights are calculated using a separate parameter matrix. The algorithm for collaborative and multi-dimensional attention is provided in Algorithm 1.
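Algorithm 1: Collaborative and multi-dimensional attention.
from keras.layers import Dense, Dropout
# CollaborativeAttention and MultidimensionalAttention are the custom components of this model; their definitions are not shown here.
# Collaborative attention
x = CollaborativeAttention(num_heads=64, key_dim=3)(x1, x2)
# Multi-dimensional attention
x = MultidimensionalAttention(x)
x = Dropout(0.5)(x)
x = Dense(100, activation='linear')(x)
x = Dense(64, activation='linear')(x)
model_out = Dense(units=1, activation='linear')(x)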
4.4. Full Attention Layer
Finally, the output of the previous BiLSTM layer is weighted by the attention scores calculated in the multi-dimensional attention layer to produce the final representation.
4.5. Taylor Series Transformation
The final representation produced by the full attention layer is subjected to a Taylor series transformation to account for more complex relationships between features. This transformation can involve polynomial expansion or other nonlinear transformations. The final representation is treated as a vector, and the expansion is applied to each of its components. Each term of the expansion involves a derivative of the representation with respect to the input features, taken at the corresponding order, and the expansion is carried out around a central point of the Taylor series. Incorporating the Taylor series transformation allows the MMCW-BiLSTM model to capture more complex interactions between features, leading to a more thorough data representation and enhanced accuracy in inventory prediction. The algorithm for the Taylor series transformation is provided in Algorithm 2.
| Algorithm 2: Taylor Series. |
S.NO | for i in range(1, approx_order): |
|     # add the order-i term, projected by a Dense layer, to the running output |
|     model_out = add([model_out, Dense(output_shape)(Activation(lambda x: self.exponent(x, n = i))(model_in))]) |
| outputs = Activation('linear')(model_out) |
| outputs = Flatten()(outputs) |
| outputs = Dense(units = 1, activation = 'linear')(outputs) |
| model = Model(inputs = input_layer, outputs = outputs) |
| model.compile( |
|     loss = 'mse', |
|     optimizer = 'adam', |
|     metrics = ['mean_absolute_error'] |
| ) |
| model.summary() |
| # save a diagram of the architecture |
| from keras.utils import plot_model |
| plot_model(model, to_file = "Results\\Arc.png", show_shapes = True, show_dtype = True, show_layer_names = True, show_layer_activations = True, dpi = 600) |
| model.fit(x_train, self.y_train, epochs = epochs, batch_size = 32) |
| # predict on the test set and flatten the result to one dimension |
| pred = model.predict(x_test) |
| pred = pred.reshape(pred.shape[0] * pred.shape[1]) |
| return pred |
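To make the role of the approximation order concrete, the sketch below truncates a Taylor series at a chosen order for a function whose derivatives are known in closed form. The choice of `exp`, and the names `taylor_exp` and `taylor_terms`, are assumptions for illustration only; the `self.exponent` helper used in Algorithm 2 is not shown in the paper and is not reproduced here.

```python
import math

def taylor_exp(x, a=0.0, order=4):
    """Truncated Taylor series of exp around the centre a.

    Every derivative of exp is exp itself, so f^(n)(a) = exp(a) and the
    series is sum_{n=0}^{order} exp(a) * (x - a)^n / n!.
    """
    return sum(math.exp(a) * (x - a) ** n / math.factorial(n)
               for n in range(order + 1))

def taylor_terms(x, a=0.0, order=3):
    """Per-order terms (x - a)^n / n! for n = 1..order.

    A hypothetical feature expansion: each term could be passed through its
    own projection and added to the output, loosely mirroring the loop in
    Algorithm 2.
    """
    return [(x - a) ** n / math.factorial(n) for n in range(1, order + 1)]

# The approximation error shrinks as the order grows.
error_low = abs(taylor_exp(1.0, order=2) - math.exp(1.0))
error_high = abs(taylor_exp(1.0, order=6) - math.exp(1.0))
```

Raising the order adds higher-degree terms, which is how the transformation can represent higher-order interactions at the cost of more parameters.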
4.6. Dense Layer
In the dense layer, the dot product of the input values and the weights is added to the bias. The weights and the biases are learnable parameters that the network tunes during training to minimize the loss function.
$$y = W \cdot FA + b$$
Here, $y$ is the layer output and the weight matrix is denoted as $W$. $FA$ denotes the Taylor series transformation output, and the bias vector is denoted as $b$. The architecture of the proposed modified multi-dimensional collaborative wrapped BiLSTM for inventory prediction is shown in Figure 2, and Figure 3 depicts the layer details of the developed model.
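The dense-layer computation can be sketched in a few lines of plain Python. This is an illustration of the affine map only (weights times inputs plus bias), not the authors' Keras layer; the function name `dense` and the example numbers are hypothetical.

```python
def dense(inputs, weights, bias):
    """Compute y_j = sum_i weights[j][i] * inputs[i] + bias[j].

    inputs:  input vector (here standing in for the Taylor series output).
    weights: one row of weights per output unit.
    bias:    one bias value per output unit.
    """
    if len(weights) != len(bias):
        raise ValueError("weights and bias must have one entry per output unit")
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

# Two inputs mapped to two outputs with made-up parameters.
y = dense([1.0, 2.0], [[0.5, -1.0], [2.0, 0.0]], [0.1, -0.5])
```

With a linear activation, as used in the experimental setup, the layer output is exactly this affine map.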
5. Results and Discussion
A new model called MMCW-BiLSTM has been developed for inventory prediction in this research. Its performance is evaluated by comparing it with other top models to determine its effectiveness.
5.1. Experimental Setup
The inventory prediction experiment is implemented in Python 3.7.6 and run on Windows 10 with 8 GB of RAM. The datasets used are the AV demand forecasting dataset, the forecast for product demand dataset, and the dairy goods sales dataset. The activation function is linear, the loss function is MSE, the dropout rate is 0.5, the validation split is 20%, the batch size is 32, and Adam is used as the optimizer.
5.2. Dataset Description
5.2.1. AV Demand Forecasting Dataset
The AV demand forecasting dataset contains data on the demand for AV services in a specific urban environment over the full observation period [26]. It comprises a wide range of parameters: time stamps, which give the date and time of each observation; metrics of AV usage, including trip counts, distance traveled, and ride duration; and contextual factors such as climate conditions and major events. The dataset allows AV researchers, urban planners, and transport companies to examine AV demand over time, anticipate future trends, and implement measures to meet specific targets. This structured dataset is a useful resource for understanding the dynamics of AV utilization in an urban environment and supports informed decision-making in policy planning, service optimization, and market analysis. A portion of the training data is held out as the validation sample. Increasing the training percentage generally yields better predictions. The out-of-sample data points are those reserved for testing.
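The split described above can be sketched as follows. This is a minimal illustration that assumes "TP" denotes the training percentage, that the series is split chronologically, and that the 20% validation share from the experimental setup is taken from the tail of the training portion; none of these details is confirmed by the paper, and `split_series` is a hypothetical helper.

```python
def split_series(samples, training_percentage=0.9, validation_split=0.2):
    """Split an ordered series into train, validation, and test parts.

    The first `training_percentage` of the samples forms the training
    portion; the last `validation_split` share of that portion is held out
    for validation; everything after the training portion is out-of-sample
    test data.
    """
    n_train_total = round(len(samples) * training_percentage)
    n_val = round(n_train_total * validation_split)
    train = samples[:n_train_total - n_val]
    val = samples[n_train_total - n_val:n_train_total]
    test = samples[n_train_total:]
    return train, val, test

# 100 observations at a training percentage of 90.
train, val, test = split_series(list(range(100)))
```

Keeping the order intact avoids leaking future observations into training, which matters for demand series.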
5.2.2. Forecast for Product Demand Dataset
The dataset contains historical demand data for a manufacturing company operating on a global scale [27]. The company sells thousands of products in hundreds of different categories. Four central warehouses ship products within the regions they are responsible for. Because the products are manufactured in different locations around the world, it normally takes more than one month to ship them by sea to the central warehouses. It would therefore benefit the company to forecast, with a high degree of accuracy, the monthly demand for products at each central warehouse two months ahead.
5.2.3. Dairy Goods Sales Dataset
The dairy goods sales dataset provides a detailed collection of data related to dairy farms, dairy products, sales, and inventory management [28]. This dataset encompasses a wide range of information, including farm location, land area, product details, brand information, quantities, pricing, sales information, customer locations, sales channels, stock quantities, stock thresholds, and reorder quantities.
5.3. Performance Analysis Based on AV Demand Forecasting Dataset
The performance of the MMCW-BiLSTM in predicting inventory at 180, 200, 300, 400, and 500 epochs is depicted in Figure 4. In Figure 4a, the MAE at epochs 180, 200, 300, 400, and 500 is 3.33, 3.16, 3.00, 2.38, and 1.75, respectively, at a TP of 90. Similarly, Figure 4b indicates that the MMCW-BiLSTM model attains MAPE values of 8.66, 6.50, 5.23, 4.75, and 2.89 at a TP of 90. Figure 4c shows MSE values of 17.26, 14.89, 13.38, 10.43, and 6.76 for the same epochs at a TP of 90. Finally, Figure 4d presents the RMSE of the MMCW-BiLSTM model at a TP of 90, with values of 4.15, 3.86, 3.66, 3.23, and 2.60.
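For reference, the four error metrics reported in this section can be written as below. These are the standard textbook definitions, given as a sketch; the paper does not list its own formulas, and the MAPE version here assumes no actual value is zero. The sample values are made up.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up demand values and forecasts.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
scores = {
    "MAE": mae(actual, forecast),
    "MAPE": mape(actual, forecast),
    "MSE": mse(actual, forecast),
    "RMSE": rmse(actual, forecast),
}
```

All four metrics are errors, so lower values indicate better forecasts.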
5.4. Performance Analysis Based on Forecast for Product Demand Dataset
The performance of the MMCW-BiLSTM in predicting inventory at 180, 200, 300, 400, and 500 epochs is depicted in Figure 5. In Figure 5a, the MAE at epochs 180, 200, 300, 400, and 500 is 3.74, 3.17, 2.74, 2.45, and 1.97, respectively, at a TP of 90. Similarly, Figure 5b shows the MAPE of the MMCW-BiLSTM model at a TP of 90, with values of 11.65, 6.86, 6.54, 5.76, and 3.91. Figure 5c presents MSE values of 20.10, 15.30, 12.64, 9.97, and 8.76 for the same epochs at a TP of 90. Finally, Figure 5d shows the RMSE of the MMCW-BiLSTM model at a TP of 90, with values of 4.48, 3.91, 3.56, 3.16, and 2.96.
5.5. Performance Analysis Based on Forecast for Dairy Goods Sales Dataset
The performance of the MMCW-BiLSTM in predicting inventory at 180, 200, 300, 400, and 500 epochs is depicted in Figure 6. In Figure 6a, the MAE at epochs 180, 200, 300, 400, and 500 is 4.06, 3.57, 3.07, 2.86, and 2.54, respectively, at a TP of 90. Similarly, Figure 6b shows the MAPE of the MMCW-BiLSTM model at a TP of 90, with values of 17.17, 10.72, 7.57, 5.15, and 3.69. Figure 6c presents MSE values of 22.78, 19.13, 15.12, 12.34, and 10.39 for the same epochs at a TP of 90. Finally, Figure 6d shows the RMSE of the MMCW-BiLSTM model at a TP of 90, with values of 4.77, 4.37, 3.89, 3.51, and 3.22.
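Because RMSE is the square root of MSE, the reported values can be cross-checked against each other. The sketch below does this for the MSE and RMSE figures quoted in Sections 5.3 to 5.5; the helper name `max_rmse_gap` and the dictionary layout are illustrative choices, not part of the original work.

```python
import math

# MSE and RMSE at epochs 180, 200, 300, 400, and 500 (TP of 90), as reported.
reported = {
    "AV demand forecasting": {
        "mse": [17.26, 14.89, 13.38, 10.43, 6.76],
        "rmse": [4.15, 3.86, 3.66, 3.23, 2.60],
    },
    "Forecast for product demand": {
        "mse": [20.10, 15.30, 12.64, 9.97, 8.76],
        "rmse": [4.48, 3.91, 3.56, 3.16, 2.96],
    },
    "Dairy goods sales": {
        "mse": [22.78, 19.13, 15.12, 12.34, 10.39],
        "rmse": [4.77, 4.37, 3.89, 3.51, 3.22],
    },
}

def max_rmse_gap(mse_values, rmse_values):
    """Largest absolute difference between sqrt(MSE) and the reported RMSE."""
    return max(abs(math.sqrt(m) - r) for m, r in zip(mse_values, rmse_values))

gaps = {name: max_rmse_gap(v["mse"], v["rmse"]) for name, v in reported.items()}
```

Every gap stays within two-decimal rounding, so the reported MSE and RMSE series agree with each other on all three datasets.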
5.6. Comparative Methods
To demonstrate the effectiveness of the MMCW-BiLSTM model, a comparison was conducted against several existing techniques: SVM [29], the Bayesian approach [30], the BP neural network [31], and the meta-heuristic CNN-LSTM [3].
5.6.1. Comparative Analysis Based on TP for AV Demand Forecasting Dataset
The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model in predicting inventory. It showed an improvement of 19.47% and achieved the lowest MAE of 1.75, as illustrated in Figure 7a.
In Figure 7b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 15.74%, reaching a minimum MAPE of 2.89 with a TP of 90.
The MMCW-BiLSTM model, shown in Figure 7c, outperforms the meta-heuristic CNN-LSTM model by 31.34% in forecasting inventory. It attains an MSE of 6.76 with a TP of 90, lower than that of the existing methods.
In Figure 7d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 2.60 with a TP of 90, which is 14.60% lower than that of the CNN-LSTM model.
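The paper does not state how the improvement percentages are calculated. One common definition, assumed here, is the reduction in error relative to the baseline model. The function name and the example error values are hypothetical and are not taken from the figures.

```python
def relative_improvement(baseline_error, proposed_error):
    """Percentage reduction in error relative to the baseline.

    Positive values mean the proposed model has the lower error.
    """
    if baseline_error == 0:
        raise ValueError("baseline error must be non-zero")
    return (baseline_error - proposed_error) / baseline_error * 100.0

# Hypothetical example: a baseline MAE of 2.0 against a proposed MAE of 1.5.
gain = relative_improvement(2.0, 1.5)
```

Under this definition a lower error for the proposed model always yields a positive percentage, whichever of the four metrics is being compared.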
5.6.2. Comparative Analysis Based on TP for Forecast for Product Demand Dataset
The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model in predicting inventory. It showed an improvement of 40.46% and achieved the lowest MAE of 1.97, as illustrated in Figure 8a.
In Figure 8b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 2.85%, reaching a minimum MAPE of 3.91 with a TP of 90.
The MMCW-BiLSTM model, shown in Figure 8c, outperforms the meta-heuristic CNN-LSTM model by 26.72% in forecasting inventory. It attains an MSE of 8.76 with a TP of 90, lower than that of the existing methods.
In Figure 8d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 2.96 with a TP of 90, which is 12.56% lower than that of the CNN-LSTM model.
5.6.3. Comparative Analysis Based on TP for Forecast for Dairy Goods Sales Dataset
The MMCW-BiLSTM model performed better than the meta-heuristic CNN-LSTM model in predicting inventory. It showed an improvement of 5.92% and achieved the lowest MAE of 2.54, as illustrated in Figure 9a.
In Figure 9b, the MMCW-BiLSTM model shows better predictive performance for inventory than the meta-heuristic CNN-LSTM model. It surpasses the meta-heuristic CNN-LSTM model by 67.43%, reaching a minimum MAPE of 3.69 with a TP of 90.
The MMCW-BiLSTM model, shown in Figure 9c, outperforms the meta-heuristic CNN-LSTM model by 11.91% in forecasting inventory. It attains an MSE of 10.39 with a TP of 90, lower than that of the existing methods.
In Figure 9d, the MMCW-BiLSTM model outperforms the meta-heuristic CNN-LSTM model in predicting inventory by achieving an RMSE of 3.22 with a TP of 90, which is 6.14% lower than that of the CNN-LSTM model.
5.7. Statistical Analysis Based on AV Demand Forecasting Dataset
The MMCW-BiLSTM model shows the lowest MAE in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 51.61 and achieving the lowest error value of 2.62.
The MMCW-BiLSTM model shows the lowest MAPE in forecasting inventory levels, outperforming the pharmaceutical supply chain model by 22.54 and achieving the lowest error value of 5.87.
The MMCW-BiLSTM model also outperforms the pharmaceutical supply chain model in terms of MSE, showing an improvement of 11.26 and achieving the lowest error value of 9.83.
The MMCW-BiLSTM model shows the lowest RMSE when forecasting inventory levels, outperforming the pharmaceutical supply chain model by 45.61 and reaching an RMSE of 3.14. The statistical results based on MAE, MAPE, MSE, and RMSE for the AV demand forecasting dataset are shown in Table 1, Table 2, Table 3, and Table 4.
5.8. Statistical Analysis Based on Product Demand Dataset
The MMCW-BiLSTM model shows the lowest MAE in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 14.13 and achieving the lowest error value of 3.37.
The MMCW-BiLSTM model shows the lowest MAPE in forecasting inventory levels, outperforming the pharmaceutical supply chain model by 90.78 and achieving the lowest error value of 7.06.
The MMCW-BiLSTM model also outperforms the pharmaceutical supply chain model in terms of MSE, showing an improvement of 32.22 and achieving the lowest error value of 16.37.
The MMCW-BiLSTM model shows the lowest RMSE when forecasting inventory levels, outperforming the pharmaceutical supply chain model by 14.87 and reaching an RMSE of 4.05. The statistical results based on MAE, MAPE, MSE, and RMSE for the product demand dataset are shown in Table 5, Table 6, Table 7, and Table 8.
5.9. Statistical Analysis Based on Dairy Goods Sales Dataset
The MMCW-BiLSTM model shows the lowest MAE in forecasting inventory levels, surpassing the pharmaceutical supply chain model by 28.30 and achieving the lowest error value of 3.73.
The MMCW-BiLSTM model shows the lowest MAPE in forecasting inventory levels, outperforming the pharmaceutical supply chain model by 36.47 and achieving the lowest error value of 9.73.
The MMCW-BiLSTM model also outperforms the pharmaceutical supply chain model in terms of MSE, showing an improvement of 41.02 and achieving the lowest error value of 21.12.
The MMCW-BiLSTM model shows the lowest RMSE when forecasting inventory levels, outperforming the pharmaceutical supply chain model by 23.20 and reaching an RMSE of 4.60. The statistical results based on MAE, MAPE, MSE, and RMSE for the dairy goods sales dataset are shown in Table 9, Table 10, Table 11, and Table 12.
5.10. Time Complexity Analysis
The computational time of the developed MMCW-BiLSTM is compared with that of the existing methods over different numbers of iterations to assess its efficiency. The results show that the MMCW-BiLSTM consistently requires less time than the existing methods. Across the three datasets, the developed model has the lowest computing times of 20.46, 20.52, and 20.59 at iteration 100. Table 13 includes information on the computational complexity of the developed model.
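The paper does not describe how computing time was measured. A minimal sketch of one way to time a fixed number of iterations is given below, using only the standard library; `time_iterations` and the stand-in workload are hypothetical and are not the authors' benchmarking code.

```python
import time

def time_iterations(step, iterations):
    """Return the wall-clock seconds taken to call `step` `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    return time.perf_counter() - start

# Stand-in workload in place of one training iteration of a real model.
calls = []

def dummy_step():
    calls.append(sum(range(1000)))

elapsed = time_iterations(dummy_step, 100)
```

Timing every compared method with the same harness, hardware, and iteration count keeps the comparison fair.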
5.11. Comparative Discussion
The proposed MMCW-BiLSTM model addresses challenges that limit existing models, namely demand volatility, data quality issues, and the difficulty of capturing multi-dimensional interactions between features, by using techniques built specifically for inventory forecasting. Combining BiLSTM layers with both collaborative and multi-dimensional attention mechanisms enables the model to handle complex temporal dependencies and feature correlations in the dataset. This leads to more accurate forecasts, especially when several factors influence inventory levels. In addition, the Taylor series transformation allows the model to take into account higher-order interactions among the features, which enhances its overall predictive power. The MMCW-BiLSTM model was compared with traditional and state-of-the-art methods: SVM, the Bayesian approach, the BP neural network, and the meta-heuristic CNN-LSTM. It consistently performed better than these techniques, confirming its ability to forecast inventory levels accurately. In short, the architecture and the thorough evaluation procedure together explain why the MMCW-BiLSTM model performs better than the other predictors for inventory forecasting. The results show that the model attains the lowest error values, 1.75, 2.89, 6.76, and 2.6 for MAE, MAPE, MSE, and RMSE, respectively, on the AV demand forecasting dataset. Likewise, on the forecast for product demand dataset, the model has the lowest error values of 1.97, 3.91, 8.76, and 2.96 for these metrics. Similarly, on the dairy goods sales dataset, the model has the lowest error values of 2.54, 3.69, 10.39, and 3.22. The comparative discussion table for the AV demand forecasting dataset and the forecast for product demand dataset is shown in Table 14.
6. Conclusions
The MMCW-BiLSTM model is a notable development in the inventory prediction field. It provides better accuracy and efficiency in inventory forecasting by tackling the challenges that existing models face, which include demand instability and the inability to capture multi-dimensional interactions. The combination of BiLSTM layers, collaborative and multi-dimensional attention mechanisms, and the Taylor series transformation allows the model to improve its predictive performance. Accurate forecasts of inventory levels are crucial for efficient inventory management and supply chain optimization, ultimately leading to better operational efficiency and profitability for e-commerce firms. Further research on and refinement of the MMCW-BiLSTM design can support new advances in stock prediction and supply chain management. The results show that the model achieves the lowest error values for MAE, MAPE, MSE, and RMSE, at 1.75, 2.89, 6.76, and 2.6, respectively, on the AV demand forecasting dataset. On the forecast for product demand dataset, the model achieves the lowest error values for the same metrics at 1.97, 3.91, 8.76, and 2.96. On the dairy goods sales dataset, the corresponding values are 2.54, 3.69, 10.39, and 3.22. In the future, strategies need to be developed to achieve real-time inventory prediction using the MMCW-BiLSTM model, which would allow businesses to react to demand changes quickly and adjust their stock in real time. In addition, incorporating more complex techniques, such as deep reinforcement learning or generative adversarial networks, could help enhance the adaptability and accuracy of inventory prediction models. Deep reinforcement learning can learn non-stationary inventory policies. Such a model accepts the forecast as a policy input, matches dynamic programming in the case of backorders, and can outperform dynamic programming in the case of lost sales.