Article

Demand Forecasting in Supply Chain Using Uni-Regression Deep Approximate Forecasting Model

Institute of Graduate Research and Studies, University of Mediterranean Karpasia, TRNC, Mersin 33010, Turkey
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8110; https://doi.org/10.3390/app14188110
Submission received: 23 July 2024 / Revised: 28 August 2024 / Accepted: 2 September 2024 / Published: 10 September 2024
(This article belongs to the Special Issue Deep Learning in Supply Chain and Logistics)

Abstract

This research presents a uni-regression deep approximate forecasting model for predicting future demand in supply chains, tackling issues such as complex patterns, external factors, and nonlinear relationships. It diverges from traditional models by employing a deep learning strategy built on a recurrent bidirectional long short-term memory (BiLSTM) network and a nonlinear autoregressive model with exogenous inputs (NARX), focusing on regression-based approaches. The model can capture intricate dependencies and patterns that elude conventional approaches, and the integration of BiLSTM and NARX provides a robust foundation for accurate demand forecasting. The novel uni-regression technique significantly improves the model’s ability to detect intricate patterns and dependencies in supply chain data, offering a new angle for demand forecasting. This approach not only broadens the scope of modeling techniques but also underlines the value of deep learning for improved accuracy in the fluctuating supply chain sector. The uni-regression model notably outperforms existing models, achieving the lowest errors: mean absolute error (MAE) of 1.73, mean square error (MSE) of 4.14, root mean square error (RMSE) of 2.03, and root mean squared scaled error (RMSSE) of 0.020, together with an R-squared of 0.94. This underscores its effectiveness in forecasting demand within dynamic supply chains. Practitioners and decision-makers can leverage the uni-regression model to make informed decisions, optimize inventory management, and enhance supply chain resilience. Furthermore, the findings contribute to the ongoing evolution of supply chain demand forecasting methodologies.

1. Introduction

Forecasting demand is an essential part of managing the supply chain and has a considerable influence on planning, capacity, and inventory control choices. The significance of precise demand prediction becomes apparent in inventory control when incorrect forecasts lead to higher backlog and holding expenses [1,2,3]. The uncertainty of demand has a significant impact on order amplification, which affects all participants in the supply chain. In the broader field of supply chain management, accurate demand forecasting is crucial for making decisions, allocating resources, and improving operational efficiency [4,5].
However, traditional methodologies face challenges in capturing the complex dynamics of modern supply chains as global markets evolve and become more interconnected [6]. Linear models and time series analyses, though foundational, struggle to predict the nonlinear and intricate relationships that characterize contemporary business environments. Traditional methods of demand forecasting are no longer effective in accurately predicting customer demands due to intense competition in various industries [7,8,9,10,11,12]. In order to address this problem, companies are currently implementing advanced data science methods to predict customer demand. By treating customer demand as a series of data points over a period of time, the issue of demand forecasting can be seen as a challenge of predicting values in a time series [13,14,15]. However, predicting the demand for components presents its own difficulties, such as a lack of adequate information about downstream processes, sporadic occurrences of demand, and a limited comprehension of market trends. It is crucial to address these challenges in order to reduce inventory costs and make flexible decisions in agile supply chains. While there have been several studies on supply chain management, limited research has specifically focused on the issues related to forecasting demand for components [16,17]. Conventional approaches like moving averages and the Croston method, which is commonly used, may not be effective in dealing with the growing volatility of the supply chain [11,18,19].
The demand patterns in the supply chain industry are constantly changing due to technological advancements, globalization, and changing consumer preferences. Traditional forecasting models struggle to keep up with these nonlinear and unpredictable patterns. Industries that rely on quick-to-market products and are heavily influenced by market trends face particular challenges in accurately predicting future demand [20]. This leads to inefficient inventory management and resource allocation [11,21,22,23]. Linear models like linear regression oversimplify the relationships between input variables and demand, making it difficult to capture the complex nature of supply chain data [24]. Time series analysis, although effective in capturing patterns over time, often fails to account for sudden shifts and irregularities in demand, resulting in inaccurate forecasts [25]. In recent years, artificial neural networks have gained attention in the field of computational intelligence due to their adaptability and robustness [10,26,27,28,29,30]. These networks excel in decision-making, offering the ability to adapt to environmental changes, handle nonlinear systems, and process data quickly [31]. Machine learning (ML) has been shown to be more effective than traditional methods in improving customer engagement and the accuracy of demand forecasts, owing to the application of predictive analytics. ML techniques are particularly skilled at managing intricate interdependencies and nonlinear relationships, ultimately leading to enhanced performance across the supply chain [8,9,32,33]. ML approaches have shown promising results when applied to large-scale homogeneous product sales data on Amazon; additional examples include foundries and chocolate manufacturing, where the time series are likewise homogeneous [34]. The intricacy of ML approaches poses a challenge to their adoption [27]. In addition to the methods’ inherent complexity, it is unclear how much economic benefit can be gained by adopting them. To temper the early excitement surrounding machine learning techniques and to assess their true usefulness for organizations, more research is required [35,36].

Motivation and Contribution

The motivation for this research stems from the inherent challenges and crucial role of demand forecasting in contemporary supply chain management. Accurate demand forecasting is pivotal for organizations to optimize their resources, streamline production processes, and maintain an efficient inventory. Traditional demand forecasting models often encounter various difficulties in capturing the intricate temporal patterns and external factors influencing demand. Recognizing the limitations of existing methods, this research seeks to contribute to the field by leveraging advanced machine learning techniques.
This research proposes a novel integration of NARX models and recurrent BiLSTM networks for demand forecasting. Autoregressive models, like NARX, utilize historical information to predict future values, but they may not fully exploit intricate patterns within supply chain data. By integrating NARX models with recurrent BiLSTM networks, a more comprehensive forecasting tool can be created that combines the strengths of capturing temporal dependencies and nonlinear patterns. This integration aims to enhance forecasting capabilities to adapt to the complexities of modern supply chains. The main contribution of the research is as follows:
  • Uni-regression deep approximate forecasting model: The contribution of the uni-regression deep approximate forecasting model could bring a transformative perspective to the predictive modeling landscape. While the existing models, such as the recurrent BiLSTM Network and NARX model, excel in capturing historical patterns and accounting for external factors, the introduction of the uni-regression model introduces a new dimension. Uni-regression, by embracing a deep approximate forecasting approach, likely enhances the capacity to grasp intricate relationships within the supply chain data. Its unique ability to approximate complex patterns might offer an advantage in capturing nuanced demand dynamics, especially in scenarios where traditional models may fall short. The fusion of insights from the uni-regression model with those derived from the BiLSTM and NARX models could potentially lead to a more comprehensive and accurate demand forecast, contributing to the overall improvement in forecasting accuracy and reliability within the dynamic processes of supply chains.
The manuscript follows a specific organizational structure. In Section 2, there is a discussion about existing works, their approaches, and challenges. Section 3 explains the effective demand forecasting methodology using a uni-regression deep approximate forecasting model. Section 4 outlines the functioning of the uni-regression deep approximate forecasting model. Section 5 presents the results obtained from applying the proposed demand forecasting model, and Section 6 concludes the manuscript.

2. Literature Review

In their research, [2] addressed the challenging task of anticipating demand in assembly industries. The accurate prediction of component demand is crucial, particularly when faced with uncertain end-customer demand. To tackle this issue, they suggested a comprehensive approach that integrates the autoregressive integrated moving average model with exogenous inputs (ARIMAX) and machine learning (ML) models. They also conducted a comparison between these advanced methods and traditional univariate benchmarks. The multivariate approach offers several advantages, including a comprehensive understanding of demand shifts using multiple leading indicators, resulting in a forecasting accuracy of 17.45% NMAE and a reduction in inventory-related costs of EUR 216.32 throughout the component’s life cycle. The use of statistical and ML-based models also allows for flexibility in adapting to different stages of the component’s life cycle.
Ref. [5] addressed the issue of demand forecasting in supply chain management in their study, where they presented an advanced model that combines ML techniques, time series analysis, and deep learning methodologies. Using data from the SOK market in Turkey, a rapidly expanding retail chain, they showcased the efficacy of this model in optimizing stocks, cutting costs, and enhancing sales and customer loyalty. The model attains an average of 24.7% of MAPE. The distinctive feature of their study lies in their combination of deep learning methodology and innovative decision integration techniques.
Ref. [13] introduced a technique for demand forecasting in highly competitive business environments. Their method, which utilizes a multi-layer long short-term memory (LSTM) network, excels in its ability to dynamically select models. This is achieved through a grid search approach that considers different combinations of LSTM hyperparameters. The method’s main strength lies in its capacity to effectively capture nonlinear patterns in time series data, outperforming established forecasting techniques such as ARIMA, ETS, ANN, KNN, RNN, SVM, and single-layer LSTM. Statistical tests confirm the significant superiority of the proposed approach, particularly when assessing performance measures like RMSE and SMAPE, attaining values of 2.596 and 0.1023, respectively. When comparing computational intelligence techniques like SVM, ANN, and LSTM with traditional statistical methods (ETS and ARIMA), it becomes evident that the former outperforms the latter.
Ref. [16] introduced the application of distributed random forest (DRF) and gradient boosting machine (GBM) learning methods for forecasting backorder situations in the supply chain. They utilized a versatile approach that considers different prediction attributes and the diverse attributes of real-time data influenced by errors from both machines and humans. The approach attains classification accuracy of 79% and 85% for training, which demonstrates its efficiency. This approach gives decision managers the ability to adjust to different scenarios. The authors justified their choice of DRF and GBM models by their ability to provide superior explanations.
Ref. [11] introduced the UNISON data-driven framework for forecasting intermittent demand for electronic components in the supply chain. They acknowledged the complexities of this task, which are exacerbated by demand fluctuations, short product life cycles, and long production and technology migration timelines. The framework aims to minimize demand series fluctuations while considering the limitations of available information and the need for cross-product modeling variability. The results demonstrated its effectiveness in improving forecast accuracy to 78.51% for various product categories, thereby enabling flexible decision-making and enhancing supply chain resilience for smart production.
Ref. [24] examined customer demand prediction for remanufactured products and market development. They highlighted the limitations of traditional statistical models in capturing the nonlinear dynamics of customer demand, particularly in e-marketplaces. To overcome this, they utilized advanced machine learning techniques, such as the random forest ensemble regression tree model, to create a precise demand prediction model for remanufactured products. The study used real-world Amazon data and revealed the complexity of predicting remanufactured product demand while demonstrating the effectiveness of machine learning in this field.
In [10] study, the integration of machine learning techniques, notably ARIMAX and neural networks, was explored with the aim of enhancing demand forecasting within multi-stage supply chains. The research showed significant improvements in operational and financial performance, proving the effectiveness of machine learning-based methods in reducing inventory costs and enhancing asset efficiency, especially in capital-intensive industries like steel manufacturing. Additionally, the research emphasized the hybrid forecasting model’s ability to capture complex relationships among variables at the aggregate level, resulting in more accurate demand predictions.
Ref. [31] developed a machine learning technique to enhance the precision of order demand prediction in e-commerce distribution centers. Their approach integrates the unique traits of time series data and employs an adaptive neuro-fuzzy inference system to analyze the patterns of e-commerce order arrivals as time progresses. The implementation framework consists of four stages to support practical application, and a structured model evaluation framework is in place to ensure the reliability of the predictive model. The model shows RMSE of 8.7, 11.33, and 9.55 for retailers 1, 2, and 3, respectively. Additionally, this method allows for the exploration of integration with optimization or heuristic approaches to enable more robust decision-making in the dynamic e-commerce environment.
Ref. [37] developed a manufacturer-and-retailer supply chain model considering the retailer’s selling price, product green level, and advertisement effort-sensitive demand. It analyzes a sharing contract policy on manufacturer green investment. This model demonstrates how green production methods and investments can boost manufacturer profits while reducing pollution in the environment. However, the analytical method is more complex and time-consuming.
In a two-layer supply chain management setting with stochastic conditions, Prasanta [38] focuses on the best replenishment strategies that a retailer may implement to maximize average profit when (s)he is presented with trade offers of credit and price discounts from suppliers. In order to maximize average profit, the extended study suggests that the merchant borrow half of the cost of purchases from financial support and make payments at the beginning of the advance payment period.
Ref. [39] established a production model that screens out faulty products and reduces greenhouse gas emissions. In this case, it has been assumed that the manufacturing process transfers from a controlled state to an uncontrolled state following an arbitrary, random period of time. The research determines the ideal business and production duration, as well as the ideal production rate, and optimizes the model’s predicted average profit.
Ref. [38] created a flawed production inventory model that includes warranty and insurance coverage for any unintentionally damaged product parts, rework, preventive machine maintenance, and other features. This study determines the best manufacturing schedule and selling price for the product’s various warranty periods in order to maximize profit per unit over the course of the business period.
Ref. [40] presented a gated recurrent unit model based on the firefly algorithm (FA-GRU) for intelligent demand forecasting. Here the use of FA helps to select the appropriate parameter for GRU. The inaccuracies in prediction are very low compared to other methods. However, the computational time required for the model is not considered here.
Ref. [41] introduced a random forest regressor for the prediction of responses in a CNC milling operation. The model’s robustness makes it suitable for generalizing different machining-related applications. However, the RF regressor performs poorly in sparse data where certain expected values do not exist at all. It also faces difficulties regarding interpretability.
Ref. [42] presented an error–trend–seasonal framework for secular seasonality and trend forecasting of the tuberculosis incidence rate in China. The ETS model provided a pronounced improvement for the long-term seasonality and trend forecasting in TB incidence rate over the SARIMA models. However, this model cannot obtain detailed information (age and sex) concerning TB cases.
Ishaani Priyadarshini and Chase Cotton (2021) introduced a long short-term memory (LSTM)–convolutional neural network (CNN)–grid-search-based deep neural network model for sentiment analysis. The incorporation of grid search as a hyperparameter optimization technique minimizes pre-defined losses and increases the accuracy of the model. However, the model works only on small datasets, and its training cost is very high.
Ref. [43] presented a recurrent BiLSTM for German named entity recognition. The recurrent model achieves the highest performance gains. However, the model suffers from increased computational complexity.
Ref. [44] introduced a nonlinear autoregressive network with exogenous input (NARX) for the prediction of global solar radiation. The model can be used to predict solar data in locations where measuring instruments are not employed or radiation data are unavailable. However, the model assumes that the future output depends only on past inputs, which may not hold in all cases.
Even though these existing approaches show considerable performance in demand forecasting, they still face various limitations like inaccurate predictions, increased computational complexity, interpretability issues, generalization issues, and difficulties in training complex methods. In order to resolve these limitations, a deep learning model is developed in this research.

Research Gaps

  • In the real world, data collection and cleaning are often challenging tasks. Ensuring that the data are accurate, consistent, and relevant is crucial for the success of the forecasting models; also, incomplete or noisy data may negatively impact the performance of the models, and the cleaning process might not eliminate all inconsistencies [2].
  • Applicability of feature extraction methods directly depends on domain-specific knowledge and the quality selection of techniques. The probabilistic and Apriori approaches may fail to cover all relevant features or introduce biases [5].
  • Statistical measures including mean, variance, and skewness, critical for time series feature extraction, are highly susceptible to outliers, potentially distorting extracted features and leading to incorrect predictions. Furthermore, the efficacy of these techniques presupposes the continuation of past trends into the future, a premise vulnerable to disruption by sudden changes or external factors [13].
  • The problem of overfitting or underfitting may arise, especially in a limited and unvaried dataset; incorporating relevant external factors into the model is also computationally intensive [16].

3. Proposed Methodology

This study aims to use machine learning techniques to predict demand in the supply chain. The first step involves gathering input data related to the supply chain, which are then cleaned to remove any inconsistencies, errors, or irrelevant information. The cleaned data are then used to extract features, with a particular focus on retail features. Probabilistic and Apriori approaches are used to extract retail-specific features, which are then combined into a single retail feature using a weighted sum. Along with retail features, statistical features like mean, variance, standard deviation, and skewness over time are also extracted. Time series feature extraction employs methods such as simple average, log–log linear regression, weighted average, and two-level linear model to identify temporal patterns in the data. Log–log linear regression uses logarithmic transformations to analyze and measure the rate of change over time, giving a more nuanced understanding of trends [45]. The weighted average assigns varying weights to data points based on their importance, resulting in a more precise representation of evolving patterns. Two-level linear model methods use linear modeling at two levels to analyze complex relationships within the data. These techniques collectively contribute to accurately forecasting future demand in the dynamic context of the supply chain by identifying and understanding time-dependent trends. Feature extraction for demand indicators involves considering features related to sales frequency and inverse sales frequency. These extracted features then serve as inputs to a recurrent BiLSTM network and NARX. The recurrent BiLSTM network is utilized in the initial stage of demand forecasting to capture temporal dependencies and patterns in the data. The NARX model with autoregression considers historical data and external factors to anticipate future demand in demand forecasting. To enhance the accuracy and reliability of the forecast results, a combination of forecasts from the recurrent BiLSTM network and NARX model is used. This fusion approach is presented in the proposed framework. The architecture of the developed demand forecasting model is presented in Figure 1.

3.1. Input

The demand forecasting dataset is sourced from [26,28,29]; let the product in the supply chain be denoted as
$K = \{S_1, S_2, \ldots, S_i, \ldots, S_m\}$
Here, $S_i$ denotes the $i$-th key sale product in the supply chain.

3.2. Preprocessing

Data collection marks the beginning of the preprocessing stage for demand forecasting in the supply chain, as data processing involves detailed handling to ensure the quality and relevance of data for analysis. One of the important steps in this process is the removal of NaN values representing lacking or undefined data. Later, data-cleaning strategies are employed to solve inconsistencies, errors, and irrelevant information in the dataset. Following the removal of NaN values and the cleaning of data, feature extraction techniques, such as probabilistic approaches along with Apriori methods, derive important findings that are relevant to the retail domain. The overall goal is to generate a more refined dataset that can provide the relevant features, thus paving the way for subsequent steps in demand forecasting.
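As an illustration of this step, the following minimal sketch performs NaN removal and basic cleaning with pandas; the DataFrame layout and the column name units_sold are assumptions for the example, not part of the original pipeline.

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Basic cleaning pass: drop NaN rows, remove duplicates, and
    discard obviously invalid records before feature extraction."""
    cleaned = df.copy()
    # Remove rows with missing (NaN/undefined) values.
    cleaned = cleaned.dropna()
    # Drop exact duplicate records introduced by logging errors.
    cleaned = cleaned.drop_duplicates()
    # Discard irrelevant or invalid entries, e.g. negative sales (hypothetical column).
    cleaned = cleaned[cleaned["units_sold"] >= 0]
    return cleaned

# Illustrative usage with hypothetical data.
raw = pd.DataFrame({"store": [1, 1, 2], "week": [1, 2, 1],
                    "units_sold": [10, None, 7]})
print(preprocess(raw))
```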

3.3. Feature Extraction

A feature extraction process is a systematic procedure for preparing the cleaned data with machine learning applications in mind. After the supply chain data go through a rigorous cleansing process whereby inconsistencies and inaccuracies are eliminated, it becomes necessary to move on to a feature extraction phase aimed at identifying relevant information. In this systematic process, several techniques are used to crystallize meaningful patterns and characteristics from the cleaned data. The procedure covers retail features, statistical feature extraction with metrics such as mean, variance, standard deviation, and skewness over time, time series features, and demand indicator features. Time series feature extraction involves several approaches, including the simple average, log–log linear regression, weighted averages, and a two-level linear model, to reveal temporal sequences in the data. With this systematic extraction of the aforementioned features, the dataset is refined and made more informative, paving the way for precise and efficient training for demand forecasting in an enriched supply chain environment.

3.3.1. Retail Features

Retail feature extraction refers to the process of extracting valuable features from retail data that are important for demand forecasting. In this framework, retail features are extracted by the blend of probabilistic and Apriori methods. The purpose is to find significant correlations and relationships between data from the retail industry.
(a)
Probabilistic Approach
This method employs probability distributions to simulate and assess retail data. For instance, it could use probability distributions to present the likelihood of some retail events or patterns emerging. This allows a probabilistic interpretation of the data that is useful in characterizing uncertainties common to retail environments. The probabilistic approach can handle noisy data and outliers effectively and provides confidence intervals, which helps in decision-making. The probabilistic approach adjusts well to changing market conditions. The frequency score of S i is
$P(S_i \mid S_j, K) = P(S_i)\, P(S_i \mid S_j) \cdots P(S_m \mid S_{m-1}, S_{m-2}, \ldots, S_1)$
where $K$ is the set of supply chain products and $S_i$, $S_j$, and $S_m$, etc. are the individual supply chain items.
(b)
Apriori Approach
The frequent item set mining algorithm Apriori is used to mine association rules [46]. In terms of retail feature extraction, the Apriori algorithm would probably find frequent item sets that are groups of items (features), mostly present jointly in a given amount of related data. This aids in identifying relationships and dependencies among various retail elements. The Apriori approach can handle large datasets effectively.
$C(S_i \Rightarrow S_j) = \dfrac{Sup(S_i \cup S_j)}{Sup(S_i)}$
Here, the confidence is written as $C$, and the support of the $i$-th key sale product is denoted $Sup(S_i)$.
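To make the two retail feature extractors concrete, the sketch below computes empirical item frequencies (the probabilistic score) and Apriori-style confidence values from a toy transaction list; the transactions and item labels are hypothetical, and the weighted-sum combination of the two scores described earlier is not reproduced here.

```python
from itertools import combinations
from collections import Counter

# Hypothetical transactions: each is the set of key sale products in one order.
transactions = [{"S1", "S2"}, {"S1", "S3"}, {"S1", "S2", "S3"}, {"S2"}]
n = len(transactions)

# Count how often single items and item pairs occur across transactions.
item_support = Counter()
pair_support = Counter()
for t in transactions:
    for item in t:
        item_support[item] += 1
    for pair in combinations(sorted(t), 2):
        pair_support[pair] += 1

# Probabilistic frequency score: empirical probability of each item.
freq_score = {i: c / n for i, c in item_support.items()}

# Apriori-style confidence C(Si -> Sj) = Sup(Si and Sj together) / Sup(Si).
confidence = {(i, j): pair_support[tuple(sorted((i, j)))] / item_support[i]
              for (i, j) in combinations(item_support, 2)}

print(freq_score)
print(confidence)
```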

3.3.2. Statistical Features

The statistical features are hence an important component in characterizing the underlying patterns and behaviors of the cleansed data across different times. Extracting statistical features is essential because it allows quantifying, interpreting, and making informed decisions based on data patterns. By calculating statistical features, insights into the central tendencies, variability, and distribution of data can be gained. Statistical features help to identify patterns and detect trends. By analyzing historical data, we can uncover recurring behaviors or changes over time. Here’s an explanation of each statistical feature, along with its mathematical equation:
(a)
Mean
The mean is a measure of the central tendency, which gives an average value in the dataset. The mean is obtained by adding all individual data points and dividing the sum by the total number of data points. In demand forecasting, the mean establishes a benchmark for comparison in understanding what is considered average, from around which these fluctuations ensue.
$M = \frac{1}{n}\sum_{f=1}^{n} S_i^*$
In this case, M stands for the mean value; S i * is an individual data point and n refers to how many numeric values there are in total within the dataset.
(b)
Variance
Variance measures the scatter or dispersion of data points from their mean value. Variance provides the average of squared deviations for each data point from its mean. High variance implies more variations in the demand over time.
$V = \frac{1}{n}\sum_{f=1}^{n} \left(S_i^* - M\right)^2$
(c)
Standard Deviation
This is the square root of the variance and it stands as a measure of average deviation between the dataset points from the mean. Dispersion is commonly measured by using standard deviation. It measures the average or typical distance between every data point and its mean. In the supply chain context, it aids in determining possible variations of demand.
$S = \sqrt{V}$
(d)
Skewness
Skewness describes the lack of symmetry in the distribution among data points. Skewness represents the form of distribution. A positive skewness implies that the graph is right-skewed, showing more of the high-demand periods. Negative skewness means left-skewed distribution, which reveals more periods of low demand.
$SK = \dfrac{\frac{1}{n}\sum_{f=1}^{n}\left(S_i^* - M\right)^3}{\left(\frac{1}{n}\sum_{f=1}^{n}\left(S_i^* - M\right)^2\right)^{3/2}}$
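A minimal sketch of the four statistical features, computed with NumPy over a synthetic demand series; it follows the standard population definitions reconstructed in the equations above.

```python
import numpy as np

def statistical_features(series: np.ndarray) -> dict:
    """Mean, variance, standard deviation, and skewness of a demand series."""
    m = series.mean()
    v = ((series - m) ** 2).mean()                 # population variance around the mean
    s = np.sqrt(v)                                  # standard deviation
    sk = ((series - m) ** 3).mean() / v ** 1.5 if v > 0 else 0.0  # skewness
    return {"mean": m, "variance": v, "std": s, "skewness": sk}

# Illustrative weekly demand values (hypothetical).
demand = np.array([12.0, 15.0, 9.0, 22.0, 14.0, 30.0])
print(statistical_features(demand))
```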

3.3.3. Time Series Features

The joint use of these time series feature extraction methods substantially supports the identification of time-dependent trends critical for forecasting within the dynamic supply chain environment. Each method provides a specific angle of view and helps to understand demand dynamics better. Using simple averages, logarithmic transformations, weighted averages, and multi-level linear modeling together makes the analysis more reliable, allowing forecasting models to capture and adapt to the temporal patterns hidden within supply chain data.
(a)
Simple Average
The simple mean is a basic method that computes the arithmetic average of the given values. In the framework of time series analysis, it constitutes a mean line by smoothing away short-term variations and accentuating long-run tendencies. The average is fundamental in detecting the general direction demand takes over time, which helps establish a stable baseline for detecting broader patterns and trends while reducing fluctuations due to short-term effects.
$SA = \frac{1}{n}\sum_{f=1}^{n} S_i^*$
(b)
Log–Log Linear Regression
Log–log linear regression uses logarithmic transforms of the data. This method will allow for a qualitative evaluation and quantification of the rate at which change occurs over time, by converting data to a log–log scale. One of the advantages of log–log linear regression is that it helps in identifying exponential or power law relationships within time series data. It reveals how the demand dynamics change over time and allows more accurate modeling of nonlinear trends.
$\log x = e \cdot \log y + c$
In this case, log x is the dependent variable while log y is the independent variable, with e being the slope and c the intercept.
(c)
Weighted Average
The weighted average assigns weights based on the importance of data points, so that some data points have a greater effect on the average calculation than others, reflecting their prominence in the analysis. The weighted average gives more significance to important data points so that changing patterns can be captured accurately. This is especially useful when some periods or events are more important for prediction.
$WA = \dfrac{\sum_{f=1}^{n} w_f \cdot S_i^*}{\sum_{f=1}^{n} w_f}$
(d)
Two-level Linear Model
Two-level linear model techniques refer to the use of two-level applications of linear modeling for the analysis of complex relationships in data. This might involve hierarchical or multi-layered modeling, where notions of different levels are regarded at the same time. Two-level linear model methods provide a powerful framework for modeling complex structures and hierarchies in time series data. Through the study of interdependences at different levels, these approaches contribute to a better understanding of factors affecting demand habits over time.
$x = \alpha_0 + \alpha_1 \cdot T_{t1}^* + \alpha_2 \cdot T_{t2}^* + \cdots + \epsilon$
Here, $x$ is the dependent variable, $T_{t1}^*$ and $T_{t2}^*$ are independent variables at the two levels, $\alpha_0, \alpha_1, \alpha_2$ denote coefficients, and $\epsilon$ is the error term.
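The sketch below illustrates the four time series feature extractors on a synthetic series; the weighting scheme and the choice of lag-1/lag-2 regressors for the two-level linear model are assumptions made only for this example.

```python
import numpy as np

demand = np.array([5.0, 7.0, 9.0, 12.0, 16.0, 21.0])
t = np.arange(1, len(demand) + 1)

# (a) Simple average: long-run level of the series.
simple_avg = demand.mean()

# (b) Log-log linear regression: slope e and intercept c of log(demand) vs log(t),
# capturing power-law growth in demand over time.
e, c = np.polyfit(np.log(t), np.log(demand), deg=1)

# (c) Weighted average: more recent observations get larger weights (illustrative scheme).
w = np.linspace(0.1, 1.0, len(demand))
weighted_avg = np.sum(w * demand) / np.sum(w)

# (d) Two-level linear model (sketch): regress demand on lagged values taken
# at two levels/horizons, here lag-1 and lag-2, via least squares.
X = np.column_stack([np.ones(len(demand) - 2), demand[1:-1], demand[:-2]])
alpha, *_ = np.linalg.lstsq(X, demand[2:], rcond=None)

print(simple_avg, e, c, weighted_avg, alpha)
```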

3.3.4. Demand Indicator Feature

The demand indicator feature is a specific attribute or characteristic that has been derived from an analysis of historical data regarding sales or demands. It is required to use demand indicators to understand the pattern of demands for certain products or items over time. These characteristics play a significant part in the prediction of future needs and informed decision-making within supply chain management. The forms of demand indicators may differ, and most often they are extracted from the analysis of sales data in history. Some common demand indicators include:
(a)
Sale Frequency (SF)
SF stands out as a significant measure in business analysis and refers to the mean frequency of sales transactions occurring within any set period. SF is a quantitative measure of the tempo of sales activities and is calculated by dividing the total number by total time durations. This metric is useful in measuring the frequency and intensity of customer transactions, providing critical information about demand patterns for a company’s offerings. In addition, the seasonality of market signals is particularly well analyzed by SF. Through SF changes under different time intervals, businesses can find correlations between high and low seasons. These data become crucial for strategic planning, enabling companies to synchronize their marketing campaigns and various inventory operations as well as coordinating the entire organization with customer demand. In essence, sale frequency is a dynamic metric that enables businesses to make informed decisions, ensures efficient utilization of resources, and facilitates adaptability in light of market dynamics.
SF is calculated by dividing the total number of sales $N$ by the total time period $T$:
$SF = \dfrac{N}{T}$
In this case, $N$ symbolizes the total number of times that $S_i$ was sold in supply chain delivery, while $T$ denotes the overall time period observed at the retail supplier firm.
(b)
Inverse Sale Frequency (ISF)
ISF in the retail supply chain is an important metric that marks product sales rarity within a given period. ISF is calculated by multiplying the total supply of a product by both seasonal and non-seasonal demand, which tells us how often it sells. A higher value of ISF means lower sales frequency, implying that a product is exclusive or has a slower turnover rate. This metric carries important implications for inventory management, as it helps retailers to recognize products that have unique market positions. Strategically, a higher ISF may make retailers concentrate on creating perceived value or dynamic marketing initiatives that stimulate demand as compared with the results of a lower ISF, which directs inventory optimization strategies based upon persistent demand levels. Basically, inverse sale frequency is a helpful tool to give retailers insight into whether to adjust inventory and marketing methods depending on the specificity of demand for each product in dynamic business conditions.
$ISF = \log\left(P \cdot S + P \cdot NS\right)$
In this case, $P$ is the total number of products in the retail supply chain, $S$ indicates the seasonal sale demand of $S_i$, and $NS$ represents its non-seasonal sale demand.
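A small numerical sketch of the two demand indicators; the counts are hypothetical, and the ISF expression follows the reconstruction of the formula above.

```python
import math

# Hypothetical counts for one key sale product S_i over an observation window.
N = 48     # number of sales of S_i recorded in the window
T = 52     # total time period observed (e.g. weeks)
P = 120    # total products in the retail supply chain
S = 30     # seasonal sale demand of S_i
NS = 18    # non-seasonal sale demand of S_i

sale_frequency = N / T                                # SF = N / T
inverse_sale_frequency = math.log(P * S + P * NS)     # ISF, per the reconstruction above

print(sale_frequency, inverse_sale_frequency)
```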

3.4. Notations and Assumptions

The notations and assumptions used in the proposed model are tabulated in Abbreviations.

4. Proposed Uni-Regression Deep Approximate Forecasting Model (UDAF) in Demand Forecasting

The feature extraction output vector forms the input for the uni-regression deep approximate forecasting model in demand forecasting. This demand forecasting methodology involves the use of statistical and time series features as inputs to two separate predictive models. The first model is a recurrent BiLSTM network, which belongs to the family of neural networks that can represent time dependencies and complex sequential patterns. Drawing on its bidirectional structure, the BiLSTM can capture historical patterns and relationships in the input features, enabling the representation of changing demand dynamics. The BiLSTM model is well suited for demand forecasting when there is a limited amount of historical data or when the demand patterns are nonlinear, and it can forecast both time series and cross-sectional data. At the same time, an NARX model is used for demand forecasting. This model is likely able to look beyond historical information and account for external factors that may affect demand, thus developing a fuller view of the variables influencing it. The final demand forecast is formed by integrating the recurrent BiLSTM network and NARX models. This fusion approach aims to improve the overall accuracy and reliability of the forecasting results by embedding partial insights from the different modeling techniques, providing a more complete picture of future demand within the dynamic processes of supply chains.
(i)
NARX Neural Network
NARXNN is introduced as a superior edition of NARNN, which is a nonlinear autoregressive neural network. Unlike NARNN, which employs a single delayed feedback loop of the output regressor, NARXNN includes two tapped-delay lines from the input–output signals [47]. Consequently, NARXNN has the capability to incorporate external input values into the parametric equation.
$i(q) = k\big(v(q);\ i(q-1)\big)$
$i(q) = k\big(v(q), v(q-1), \ldots, v(q - y_D + 1);\ i(q), i(q-1), \ldots, i(q - y_i + 1)\big)$
Here, $q$ is the discrete time step, and $v(q)$ and $i(q)$ denote the model input and output, respectively. The input and output memory orders are $y_D$ and $y_i$, with $y_D, y_i \in \mathbb{N}^*$. The feedback loop enhances the NARXNN predictor’s responsiveness to past data.
The NARXNN algorithm contains a feed-forward network with two layers. The output layer applies a linear transfer function, whereas the hidden layer employs a sigmoid function $\sigma$, computed as
$\sigma(u) = \dfrac{1}{1 + \exp(-u)}$
This network has a unique feature wherein tapped delay lines are used to store previous values of the $v(q)$ and $i(q)$ sequences. The NARXNN’s output $i(q)$ is then fed back into the network’s input with a delay. There are two different training modes for the NARXNN model:
(a)
Series-Parallel (SP) Mode
One that uses feedback-delayed information from real values.
(b)
Parallel (P) Mode
One that uses feedback-delayed information from real values and sets estimated outputs for the output’s regressor.

NARX Model

The NARX model captures dependencies based on historical demand data $B_{c-1}$ and exogenous inputs $A_c$:
$t_c = e\big(t_{c-1}, t_{c-2}, \ldots, t_{c-s},\ a_{c-1}, a_{c-2}, \ldots, a_{c-j}\big) + \varepsilon_c$
In this case, $t_c$ corresponds to the forecast demand at time $c$, $e$ is a nonlinear function describing the relation between the lagged demands and the exogenous inputs, and $\varepsilon_c$ is the error term.
NARXNN is a model that incorporates NARX-SP feedback for better accuracy and effective training. Simulations have confirmed that the NARXNN is more effective than conventional methods such as feed-forward and Elman networks in identifying the patterns of sequential output. This improved performance is credited to the utilization of tapped-delay lines for inserting input vectors, which enables faster propagation paths during the gradient descent process. However, the computational resources required for NARXNN are significantly reduced when the output memory is eliminated. Additionally, NARXNN, as part of the RNN, faces the vanishing gradient problem to some extent. This problem occurs when the RNN stops learning after a certain number of inputs, leading to a decrease in prediction accuracy due to fading memory caused by the shrinking gradient descent.
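As a rough illustration of the NARX idea (not the exact NARXNN configuration used in this work), the sketch below approximates the nonlinear mapping $e(\cdot)$ with a small feed-forward network over lagged demand and exogenous inputs, trained in series-parallel (teacher-forced) mode; the synthetic data, memory orders, and network size are arbitrary assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
demand = rng.poisson(20, size=200).astype(float)    # synthetic demand series t_c
exog = rng.normal(size=(200, 2))                     # synthetic exogenous inputs a_c

s, j = 3, 2   # demand and exogenous memory orders (illustrative)
X, y = [], []
for c in range(s, len(demand)):
    lagged_demand = demand[c - s:c][::-1]            # t_{c-1}, ..., t_{c-s}
    lagged_exog = exog[c - j:c][::-1].ravel()        # a_{c-1}, ..., a_{c-j}
    X.append(np.concatenate([lagged_demand, lagged_exog]))
    y.append(demand[c])
X, y = np.array(X), np.array(y)

# A small feed-forward net with a sigmoid hidden layer stands in for e(.),
# trained on actual past values (series-parallel mode).
narx = MLPRegressor(hidden_layer_sizes=(16,), activation="logistic",
                    max_iter=2000, random_state=0).fit(X[:-20], y[:-20])
print("held-out MAE:", np.abs(narx.predict(X[-20:]) - y[-20:]).mean())
```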
(ii)
Recurrent Bidirectional Long Short-Term Memory (BiLSTM) Network
To tackle issues like the vanishing gradient problem and the decrease in prediction accuracy due to fading memory caused by the shrinking gradient descent, LSTM was introduced. LSTM replaces nodes with memory cells and incorporates a gating mechanism to address the problem of gradient vanishing in RNN. The memory cells in LSTM have the capability to either forget or remember states based on the input scale. Three memory gates, namely input, forget, and output, are utilized to provide fundamental support to the memory cell. By enabling the decoder to consult with both the final hidden layer and a specific subset of the encoder’s initial input, the attention layer serves as a filter for pertinent context. A neural network that is trained to focus on significant information can better retain lengthy data sequences. This could enhance the performance of the model.
The BiLSTM network takes in the input features $A_c$ and produces a hidden state $b_c$ for each time step.
Equation for forward LSTM model:
$b_c^{for} = LSTM\big(A_c,\ b_{c-1}^{for},\ d_{c-1}^{for}\big)$
Equation for backward LSTM model:
$b_c^{back} = LSTM\big(A_c,\ b_{c+1}^{back},\ d_{c+1}^{back}\big)$
Bidirectional output:
$b_c^{BiLSTM} = \big[b_c^{for},\ b_c^{back}\big]$
General RNN:
$b_c^{RNN} = RNN\big(A_c,\ b_{c-1}^{RNN}\big)$
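A minimal BiLSTM regression sketch in Keras, assuming TensorFlow is available; the window length, layer sizes, and synthetic data are illustrative and not taken from the configuration used in this work. The Bidirectional wrapper concatenates the forward and backward hidden states, matching the bidirectional output above.

```python
import numpy as np
import tensorflow as tf

timesteps, n_features = 10, 4          # length and width of the feature window A_c (assumed)
X = np.random.rand(256, timesteps, n_features).astype("float32")
y = np.random.rand(256, 1).astype("float32")       # synthetic demand targets

# BiLSTM regressor: forward and backward states are concatenated internally.
bilstm = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, n_features)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(1),
])
bilstm.compile(optimizer="adam", loss="mse")
bilstm.fit(X, y, epochs=2, batch_size=32, verbose=0)
print(bilstm.predict(X[:3], verbose=0).ravel())
```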
(iii)
Concatenation of the models
The mathematics behind the hybrid model, which combines the recurrent BiLSTM network and the nonlinear autoregressive model with exogenous inputs (NARX), also includes a more general RNN term. The final demand forecast is derived by taking a weighted average of the predictions from the BiLSTM network and the NARX model, together with this additional general RNN term for completeness.
$M_c = \alpha \cdot BiLSTM(A_c) + \beta \cdot RNN(A_c) + (1 - \alpha - \beta) \cdot NARX\big(B_{c-1}, A_c\big)$
Here, $A_c$ represents the features extracted at time $c$, and $BiLSTM(A_c)$ stands for the prediction of the recurrent bidirectional LSTM network. $RNN(A_c)$ is a generalized RNN term. $B_{c-1} = \{b_{c-1}, b_{c-2}, \ldots, b_{c-s}\}$ is a representation of the lagged demand values, and $A_c = \{a_{c1}, a_{c2}, \ldots, a_{cj}\}$ represents the exogenous input variables. $NARX(B_{c-1}, A_c)$ is the forecast obtained from the NARX model. $\alpha$ and $\beta$ are weight parameters that determine how much each model contributes to the resulting prediction.
The fusion approach seeks to improve demand-forecasting accuracy and reliability by combining the strengths of the BiLSTM network, RNN component, and NARX model. The weight parameters α and β make it possible to adjust the contribution of each model with regard to their relative performance on different aspects of demand dynamics.
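The weighted fusion itself reduces to a convex combination of the three component forecasts, as in the following sketch; the weight values and component predictions shown are hypothetical and would in practice be tuned on a validation split.

```python
import numpy as np

def fuse_forecasts(bilstm_pred, rnn_pred, narx_pred, alpha=0.5, beta=0.2):
    """Weighted fusion M_c = alpha*BiLSTM + beta*RNN + (1 - alpha - beta)*NARX."""
    assert 0.0 <= alpha + beta <= 1.0
    return alpha * bilstm_pred + beta * rnn_pred + (1.0 - alpha - beta) * narx_pred

# Hypothetical per-period forecasts from the three components.
bilstm_pred = np.array([18.2, 21.0, 19.5])
rnn_pred = np.array([17.8, 20.4, 20.1])
narx_pred = np.array([19.0, 22.3, 18.9])
print(fuse_forecasts(bilstm_pred, rnn_pred, narx_pred))
```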
The architecture of the proposed uni-regression deep approximate forecasting model in demand forecasting is depicted in Figure 2, and its layer details are presented in Figure 3. In addition, the pseudo-code of the developed approach is presented in Algorithm 1.
Algorithm 1. Pseudo-code for UDAF
1. Input K
2. Procedure Preprocessing(input data)
   return preprocessed data
3. Procedure Feature_extraction(preprocessed data)
   return concatenated Retail_features, Statistical_features, Time_series_features, Demand_indicator_features
4. Procedure Train_test_split(Retail_features, Statistical_features, Time_series_features, Demand_indicator_features, label)
   return Train data, Train label, Test data, Test label
5. Procedure Uni_regression_deep_approximate_forecasting_model(Train data, Train label)
   return Trained model
6. Procedure Testing(Trained model, Test data)
   return output

5. Results and Discussions

The proposed UDAF model is applied to demand forecasting, and its effectiveness is assessed through a comparison with currently prevailing state-of-the-art models.

5.1. Experimental Setup

For the purpose of conducting the experiment in line with the requirement for predicting demand, a Python version 3.9 program is run on a Windows 10 operating system. The system used possesses a memory capacity of 8 GB.

5.2. Dataset Description

(i) Demand Forecasting Dataset: A global retail chain seeks to use its extensive data to create a reliable prediction model for forecasting the sales of each specific product in its inventory across its 76 stores. This requires analyzing three years of historical weekly sales data, together with the sales and promotional details specific to each week and store; no additional information regarding the stores and products is available. The task is to forecast the sales of every product and store combination for the next 12 weeks.
(ii) M5 Forecasting Dataset: The dataset is collected from one of the two complementary competitions that together constitute the M5 forecasting challenge [48]. In the challenge, the unit sales of items at stores in various locations are predicted over two 28-day time periods. The dates on which the products are sold are listed in the file calendar.csv. The historical daily unit sales data by product and store are contained in the sales_train_validation.csv file. The appropriate format for submissions is sample_submission.csv. The file sell_prices.csv includes details on the cost of goods sold by date and store. A minimal loading sketch is given after this list.
(iii) Daily Demand Forecasting Orders: The dataset is a real database from a large Brazilian logistics company, collected over 60 days. It consists of 12 predictive attributes and a target, the total daily orders. The attributes include the week of the month, the day of the week, non-urgent orders, urgent orders, order types (A, B, C), fiscal sector orders, traffic controller sector orders, and banking orders (A, B, C).
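A minimal loading sketch for the M5 files named above, assuming the CSVs are available in the working directory; paths are an assumption of this example.

```python
import pandas as pd

# File names as listed in the M5 dataset description; local paths are assumed.
calendar = pd.read_csv("calendar.csv")              # dates on which products are sold
sales = pd.read_csv("sales_train_validation.csv")   # daily unit sales per product and store
prices = pd.read_csv("sell_prices.csv")             # selling price per store, item, and week

print(calendar.shape, sales.shape, prices.shape)
```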

5.3. Comparative Methods

In order to showcase the accomplishments of the UDAF model, a comparative analysis is performed. The techniques employed for the comparison consist of the RF regressor, ensemble ARIMA–NN, distributed RF-GB, LSTM grid search, recurrent BiLSTM, and NARX [41,42,43,44,49,50]. The models are evaluated based on mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), root mean squared scaled error (RMSSE), and R-squared.
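For reference, the sketch below computes the five evaluation metrics; the RMSSE scaling by the in-sample one-step naive-forecast error follows the M5 convention and is an assumption about the exact variant used here, and all inputs are hypothetical.

```python
import numpy as np

def evaluate(y_true, y_pred, y_train):
    """MAE, MSE, RMSE, RMSSE, and R-squared for a forecast."""
    err = y_true - y_pred
    mae = np.abs(err).mean()
    mse = (err ** 2).mean()
    rmse = np.sqrt(mse)
    # RMSSE scales the MSE by the in-sample one-step naive-forecast error.
    scale = np.mean(np.diff(y_train) ** 2)
    rmsse = np.sqrt(mse / scale)
    # R-squared: 1 minus residual sum of squares over total sum of squares.
    r2 = 1.0 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "RMSSE": rmsse, "R2": r2}

# Hypothetical training history and held-out actual/predicted demand.
y_train = np.array([20.0, 22.0, 19.0, 25.0, 23.0])
y_true = np.array([24.0, 26.0, 21.0])
y_pred = np.array([23.1, 25.4, 22.0])
print(evaluate(y_true, y_pred, y_train))
```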

5.3.1. Comparative Analysis of Demand Forecasting Dataset Based on Delay

The UDAF model’s effectiveness in demand forecasting is evaluated by comparing it with the other methodologies using MAE, MSE, RMSE, RMSSE, and R-squared, for a delay of 25.
According to Figure 4a, the UDAF model surpasses NARX in demand forecasting, with the lowest error of 5.46 and an MAE of 1.73.
Furthermore, Figure 4b demonstrates the UDAF model’s proficiency in demand forecasting through MSE, outperforming NARX by 91.39% with a 4.14 MSE for a delay of 25.
Regarding demand forecasting, Figure 4c showcases the superior performance of the UDAF model compared to NARX. It achieves a 38.66% improvement with an RMSE of 2.03 for a delay of 25, surpassing previous techniques.
Similarly, in demand forecasting, as shown in Figure 4d, the UDAF model exhibits superior performance compared to NARX, achieving a 27.73% improvement with a 0.019 RMSSE for a delay of 25, surpassing previous techniques.
Regarding demand forecasting, Figure 4e showcases the superior performance of the UDAF model compared to NARX. It achieves a 4.62% improvement with an R-squared of 0.936 for a delay of 25, surpassing previous techniques.

5.3.2. Comparative Analysis of Demand Forecasting Dataset Based on K-Fold

The evaluation’s Figure 5a demonstrates the UDAF model’s exceptional performance in demand forecasting, outperforming NARX with the lowest error of 1.67 and an MAE of 1.22.
Figure 5b further showcases the UDAF model’s effectiveness in demand forecasting, surpassing NARX by 22.49% with an MSE of 1.16 for a K-fold 10.
When it comes to demand forecasting, the UDAF model, as seen in Figure 5c, excels in comparison to NARX. It achieves an improvement of 81.02% and has an RMSE of 1.08 for a K-fold 10, surpassing previous techniques.
Furthermore, Figure 5d demonstrates the UDAF model’s proficiency in demand forecasting through RMSSE, outperforming NARX by 47.36% with a 0.019 RMSSE for a K-fold 10.
Regarding demand forecasting, Figure 5e showcases the superior performance of the UDAF model compared to NARX with an R-squared of 0.96 for a K-fold 10, which is 1.04% improved over NARX.

5.3.3. Comparative Analysis of M5 Forecasting Dataset Based on Delay

The UDAF model’s effectiveness in demand forecasting is evaluated by comparing it with the other methodologies using MAE, MSE, RMSE, RMSSE, and R-squared, for a delay of 25.
According to Figure 6a, the UDAF model surpasses NARX in demand forecasting, with an MAE of 2.69.
Furthermore, Figure 6b demonstrates the UDAF model’s proficiency in demand forecasting through MSE, outperforming NARX by 5.56% with a 6.97 MSE for a delay of 25.
Regarding demand forecasting, Figure 6c showcases the superior performance of the UDAF model compared to NARX. It achieves a 2.94% improvement with an RMSE of 0.026 for a delay of 25, surpassing previous techniques.
Similarly, in demand forecasting, as shown in Figure 6d, the UDAF model exhibits superior performance compared to NARX, achieving a 6.17% improvement with an RMSSE of 0.76 for a delay of 25, surpassing previous techniques.
Regarding demand forecasting, Figure 6e showcases the superior performance of the UDAF model compared to NARX. It achieves a 6.17% improvement with an R-squared of 0.81 for a delay of 25, surpassing previous techniques.

5.3.4. Comparative Analysis of M5 Forecasting Dataset Based on K-Fold

The evaluation’s Figure 7a demonstrates the UDAF model’s exceptional performance in demand forecasting, outperforming NARX with the lowest MAE of 3.57.
Figure 7b further showcases the UDAF model’s effectiveness in demand forecasting, surpassing NARX by 43.27% with an MSE of 3.5 for a K-fold 10.
When it comes to demand forecasting, the UDAF model, as seen in Figure 7c, excels in comparison to NARX. It achieves an improvement of 24.59% and has an RMSE of 1.87 for a K-fold 10, surpassing previous techniques.
Furthermore, Figure 7d demonstrates the UDAF model’s proficiency in demand forecasting through RMSSE, outperforming NARX by 3.12% with a 0.031 RMSSE for a K-fold 10.
Regarding demand forecasting, Figure 7e showcases the superior performance of the UDAF model compared to NARX with an R-squared of 0.90 for a K-fold 10, which is 2.22% improved over NARX.

5.3.5. Comparative Analysis of Daily Demand Forecasting Orders Based on Delay

The UDAF model’s effectiveness in demand forecasting is evaluated by comparing it with the other methodologies using MAE, MSE, RMSE, RMSSE, and R-squared, for a delay of 25.
According to Figure 8a, the UDAF model surpasses NARX in demand forecasting, with an MAE of 2.94.
Furthermore, Figure 8b demonstrates the UDAF model’s proficiency in demand forecasting through MSE, outperforming NARX by 0.88% with a 7.85 MSE for a delay of 25.
Regarding demand forecasting, Figure 8c showcases the superior performance of the UDAF model compared to NARX. It achieves a 0.36% improvement with an RMSE of 2.80 for a delay of 25, surpassing previous techniques.
Similarly, in demand forecasting, as shown in Figure 8d, the UDAF model exhibits superior performance compared to NARX, achieving a 3.70% improvement with an RMSSE of 0.026 for a delay of 25, surpassing previous techniques.
Furthermore, Figure 8e demonstrates the UDAF model’s proficiency in demand forecasting through R-squared, outperforming NARX by 3.33% with an R-squared of 0.90 for a delay of 25.

5.3.6. Comparative Analysis of Daily Demand Forecasting Orders Based on K-Fold

The evaluation’s Figure 9a demonstrates the UDAF model’s exceptional performance in demand forecasting, outperforming NARX with the lowest MAE of 2.35.
Figure 9b further showcases the UDAF model’s effectiveness in demand forecasting, surpassing NARX by 51.55% with an MSE of 2.5 for a K-fold 10.
When it comes to demand forecasting, the UDAF model, as seen in Figure 9c, excels in comparison to NARX. It achieves an improvement of 30.39% and has an RMSE of 1.58 for a K-fold 10, surpassing previous techniques.
Furthermore, Figure 9d demonstrates the UDAF model’s proficiency in demand forecasting through RMSSE, outperforming NARX by 3.125% with a 0.031 RMSSE for a K-fold 10.
Regarding demand forecasting, Figure 9e showcases the superior performance of the UDAF model compared to NARX with an R-squared of 0.90 for a K-fold 10, which is 1.11% improved over NARX.

5.4. Comparative Discussion

The existing methods employed for demand forecasting, such as the RF regressor, ensemble ARIMA–NN, distributed RF-GB, LSTM grid search, recurrent BiLSTM, and NARX, have some limitations. The RF regressor [41] can be prone to overfitting and has difficulties regarding interpretability, whereas the developed UDAF model resolves interpretability issues and attains high performance on various datasets. Ensemble ARIMA–NN [49] is also prone to overfitting and may not hold for all time series data, whereas the developed UDAF model reduces the risk of overfitting by reducing the number of parameters in the model. The training of the LSTM grid search [50] is expensive. In recurrent BiLSTM [43], the BiLSTM can increase the model complexity. The developed UDAF model reduces the model complexity and attains efficient results by considering data beyond historical data. NARX models [44] assume that the future output depends only on past inputs, which may not hold in all cases. In order to solve these issues, the proposed UDAF model is developed by combining recurrent BiLSTM and NARX, which benefits from the complementary strengths of these two approaches. The BiLSTM captures temporal dependencies, while NARX incorporates autoregressive features and can look beyond historical information to account for external factors that may affect demand. This synergy leads to improved forecasting accuracy and robustness compared to the individual methods and reduces the complexity of the model. The results of the demand forecasting models are analyzed in Table 1, Table 2 and Table 3, indicating that the suggested model performs better than the others in terms of MAE, MSE, RMSE, RMSSE, and R-squared. In particular, the proposed UDAF model showcased significantly lower error values for MAE, MSE, RMSE, and RMSSE of 1.73, 4.14, 2.03, and 0.020, together with an R-squared of 0.94, for a delay of 25.

6. Conclusions

In conclusion, this research makes a substantial contribution to supply chain management by leveraging advanced machine learning techniques for demand prediction. The study follows a systematic approach, starting with the meticulous gathering and cleaning of input data from the supply chain, ensuring accuracy and relevance.
The focus on retail-specific features, extracted through probabilistic and Apriori approaches, adds a layer to the analysis. The incorporation of statistical and time series feature extraction methods, such as mean, variance, log–log linear regression, and two-level linear models, further enhances the model’s capability to capture temporal patterns in demand. The application of a recurrent BiLSTM network, combined with an NARX model, reflects a sophisticated approach to demand forecasting that considers both historical data and external factors. The fusion framework, which merges predictions from two models, seeks to enhance the precision and dependability of demand forecasts within the constantly changing supply chain context.
The uni-regression deep approximate forecasting model achieves the lowest error values of MAE, MSE, RMSE, and RMSSE, specifically 1.73, 4.14, 2.03, and 0.020, together with the highest R-squared of 0.94. These results demonstrate the model’s precision in forecasting demand. This study not only enhances academic comprehension of demand forecasting but also offers valuable practical guidance for supply chain professionals seeking more robust and effective prediction methods to optimize their operations and reduce excess inventory costs while meeting customer demand. The model can identify risks early and allows for risk management strategies that mitigate disruptions in the supply chain. Accurate forecasts enhance customer satisfaction and responsiveness, providing a competitive advantage, and also support the efficient allocation of production capacity, transportation, and labor.
However, the model relies heavily on the quality and availability of the data; incomplete data can degrade its performance. Future research may explore the scalability and adaptability of this framework across diverse supply chain contexts and industries, further advancing the field of predictive analytics in supply chain management.

Author Contributions

Investigation, A.A.; Writing—original draft, E.A.; Visualization, A.A. and K.I.; Supervision, A.A. and K.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the corresponding author on request.

Acknowledgments

The authors wish to express their sincere gratitude to all those who contributed to the successful completion of this research. Special thanks are extended to our colleagues, who provided invaluable feedback and insights throughout the study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

Notations in the developed model

Notations and Assumptions    Definition
K                            Input
P(S_i | S_j, K)              Probabilistic approach
C(S_i, S_j)                  Apriori approach
M                            Mean
V                            Variance
S                            Standard deviation
SK                           Skewness
SA                           Simple average
log x                        Dependent variable
log y                        Independent variable
WA                           Weighted average
SF                           Sale frequency
ISF                          Inverse sale frequency
i(q)                         Output of NARXNN
t_c                          Demand forecast at the time c of NARX
M_c                          Final demand forecast output

References

  1. Chen, F.; Drezner, Z.; Ryan, J.K.; Simchi-Levi, D. Quantifying the bullwhip effect in a simple supply chain: The impact of forecasting, lead times, and information. Manag. Sci. 2000, 46, 436–443. [Google Scholar] [CrossRef]
  2. Gonçalves, J.N.; Cortez, P.; Carvalho, M.S.; Frazao, N.M. A multivariate approach for multi-step demand forecasting in assembly industries: Empirical evidence from an automotive supply chain. Decis. Support Syst. 2021, 142, 113452. [Google Scholar] [CrossRef]
  3. Kerkkänen, A.; Korpela, J.; Huiskonen, J. Demand forecasting errors in industrial context: Measurement and impacts. Int. J. Prod. Econ. 2009, 118, 43–48. [Google Scholar] [CrossRef]
  4. Babai, M.Z.; Boylan, J.E.; Rostami-Tabar, B. Demand forecasting in supply chains: A review of aggregation and hierarchical approaches. Int. J. Prod. Res. 2022, 60, 324–348. [Google Scholar] [CrossRef]
  5. Kilimci, Z.H.; Akyuz, A.O.; Uysal, M.; Akyokus, S.; Uysal, M.O.; Bulbul, B.A.; Ekmis, M.A. An improved demand forecasting model using deep learning approach and proposed decision integration strategy for supply chain. In Complexity; Wiley: Hoboken, NJ, USA, 2019. [Google Scholar]
  6. Bahram, M.; Hubmann, C.; Lawitzky, A.; Aeberhard, M.; Wollherr, D. A combined model-and learning-based framework for interaction-aware maneuver prediction. IEEE Trans. Intell. Transp. Syst. 2016, 17, 1538–1550. [Google Scholar] [CrossRef]
  7. Dua, D.; Graff, C. UCI Machine Learning Repository. University of California, Irvine. 2017. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 6 February 2024).
  8. Fanoodi, B.; Malmir, B.; Jahantigh, F.F. Reducing demand uncertainty in the platelet supply chain through artificial neural networks and ARIMA models. Comput. Biol. Med. 2019, 113, 103415. [Google Scholar] [CrossRef]
  9. Fausett, L.V. Fundamentals of neural networks: Architectures, algorithms and applications. In Pearson Education India; Prentice Hall: Upper Saddle River, NJ, USA, 2006. [Google Scholar]
  10. Feizabadi, J. Machine learning demand forecasting and supply chain performance. Int. J. Logist. Res. Appl. 2022, 25, 119–142. [Google Scholar] [CrossRef]
  11. Fu, W.; Chien, C.F. UNISON data-driven intermittent demand forecast framework to empower supply chain resilience and an empirical study in electronics distribution. Comput. Ind. Eng. 2019, 135, 940–949. [Google Scholar] [CrossRef]
  12. Ghosh, P.K.; Manna, A.K.; Dey, J.K.; Kar, S. Optimal production run in an imperfect production process with maintenance under warranty and product insurance. Opsearch 2023, 60, 720–752. [Google Scholar] [CrossRef]
  13. Abbasimehr, H.; Shabani, M.; Yousefi, M. An optimized model using the LSTM network for demand forecasting. Comput. Ind. Eng. 2020, 143, 106435. [Google Scholar] [CrossRef]
  14. Guo, F.; Diao, J.; Zhao, Q.; Wang, D.; Sun, Q. A double-level combination approach for demand forecasting of repairable airplane spare parts based on turnover data. Comput. Ind. Eng. 2017, 110, 92–108. [Google Scholar] [CrossRef]
  15. Villegas, M.A.; Pedregal, D.J.; Trapero, J.R. A support vector machine for model selection in demand forecasting applications. Comput. Ind. Eng. 2018, 121, 1–7. [Google Scholar] [CrossRef]
  16. Islam, S.; Amin, S.H. Prediction of probable backorder scenarios in the supply chain using Distributed Random Forest and Gradient Boosting Machine learning techniques. J. Big Data 2020, 7, 65. [Google Scholar] [CrossRef]
  17. Kant, N.A.; Dar, M.R.; Khanday, F.A.; Psychalinos, C. Analog implementation of TDCNN single-cell architecture using sinh-domain companding technique. In Proceedings of the 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bengaluru, Karnataka, 27–28 August 2016; pp. 653–657. [Google Scholar]
  18. Gupta, P.R.; Sharma, D.; Goel, N. Image Forgery Detection by CNN and Pretrained VGG16 Model. In Proceedings of the Academia-Industry Consortium for Data Science: AICDS 2020; Springer: Singapore, 2022; pp. 141–152. [Google Scholar]
  19. Husna, A.; Amin, S.H.; Shah, B. Demand forecasting in supply chain management using different deep learning methods. In Demand Forecasting and Order Planning in Supply Chains and Humanitarian Logistics; IGI: Hershey, PA, USA, 2021; pp. 140–170. [Google Scholar]
  20. Borucka, A. Seasonal methods of demand forecasting in the supply chain as support for the company’s sustainable growth. Sustainability 2023, 15, 7399. [Google Scholar] [CrossRef]
  21. Croston, J.D. Forecasting and stock control for intermittent demands. J. Oper. Res. Soc. 1972, 23, 289–303. [Google Scholar] [CrossRef]
  22. Hoseini Shekarabi, S.A.; Gharaei, A.; Karimi, M. Modelling and optimal lot-sizing of integrated multi-level multi-wholesaler supply chains under the shortage and limited warehouse space: Generalised outer approximation. Int. J. Syst. Sci. Oper. Logist. 2019, 6, 237–257. [Google Scholar] [CrossRef]
  23. Nagare, M.; Dutta, P. Single-period ordering and pricing policies with markdown, multivariate demand and customer price sensitivity. Comput. Ind. Eng. 2018, 125, 451–466. [Google Scholar] [CrossRef]
  24. Van Nguyen, T.; Zhou, L.; Chong, A.Y.L.; Li, B.; Pu, X. Predicting customer demand for remanufactured products: A data-mining approach. Eur. J. Oper. Res. 2020, 281, 543–558. [Google Scholar] [CrossRef]
  25. Ren, A.; Li, Z.; Ding, C.; Qiu, Q.; Wang, Y.; Li, J.; Qian, X.; Yuan, B. Sc-dcnn: Highly-scalable deep convolutional neural network using stochastic computing. ACM SIGPLAN Not. 2017, 52, 405–418. [Google Scholar] [CrossRef]
  26. Ampazis, N. Forecasting demand in supply chain using machine learning algorithms. Int. J. Artif. Life Res. (IJALR) 2015, 5, 56–73. [Google Scholar] [CrossRef]
  27. Bohanec, M.; Borštnar, M.K.; Robnik-Šikonja, M. Explaining machine learning models in sales predictions. Expert Syst. Appl. 2017, 71, 416–428. [Google Scholar] [CrossRef]
  28. Chase, C.W., Jr. Machine learning is changing demand forecasting. J. Bus. Forecast. 2016, 35, 43. [Google Scholar]
  29. Minis, I. Applications of neural networks in supply chain management. In Handbook of Research on Nature-Inspired Computing for Economics and Management; Information Science Reference: Hershey, PA, USA, 2007; pp. 589–607. [Google Scholar]
  30. Tanaka, K. A sales forecasting model for new-released and nonlinear sales trend products. Expert Syst. Appl. 2010, 37, 7387–7393. [Google Scholar] [CrossRef]
  31. Leung, K.H.; Mo, D.Y.; Ho, G.T.; Wu, C.H.; Huang, G.Q. Modelling near-real-time order arrival demand in e-commerce context: A machine learning predictive methodology. Ind. Manag. Data Syst. 2020, 120, 1149–1174. [Google Scholar] [CrossRef]
  32. Batani, A. Providing a decision-making model for continuous monitoring of patient’s hypertension using artificial neural network and quality control charts. Razi J. Med. Sci. 2018, 25, 46–57. [Google Scholar]
  33. Menhaj, M.B. Computational Intelligence: Fundamental of Neural Networks; Amirkabir University of Technology Publication: Tehran, Iran, 2005. [Google Scholar]
  34. Carbonneau, R.; Vahidov, R.M.; Laframboise, K. Machine Learning-Based Demand Forecasting in Supply Chains. Int. J. Intell. Inf. Technol. 2007, 3, 40–57. [Google Scholar] [CrossRef]
  35. Wang, J.; Yu, L.C.; Lai, K.R.; Zhang, X. Dimensional sentiment analysis using a regional CNN-LSTM model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; pp. 225–230. [Google Scholar]
  36. Weng, T.; Liu, W.; Xiao, J. Supply chain sales forecasting based on lightGBM and LSTM combination model. Ind. Manag. Data Syst. 2020, 120, 265–279. [Google Scholar] [CrossRef]
  37. Saha, S.; Alrasheedi, A.F.; Khan, M.A.A.; Manna, A.K. Optimal strategies for green investment, sharing contract and advertisement effort in a supply chain coordination problem. Ain Shams Eng. J. 2024, 15, 102595. [Google Scholar] [CrossRef]
  38. Kumar Ghosh, P.; Manna, A.K.; Dey, J.K.; Kar, S. Optimal policy for an inventory system with retailer’s hybrid payment strategy and supplier’s price discount facility under a supply chain management. In Optimization; Taylor & Francis: London, UK, 2023; pp. 1–40. [Google Scholar]
  39. Dolai, M.; Manna, A.K.; Mondal, S.K. Sustainable manufacturing model with considering greenhouse gas emission and screening process of imperfect items under stochastic environment. Int. J. Appl. Comput. Math. 2022, 8, 93. [Google Scholar] [CrossRef]
  40. Al-Khazraji, H.; Nasser, A.R.; Khlil, S. An intelligent demand forecasting model using a hybrid of metaheuristic optimization and deep learning algorithm for predicting concrete block production. IAES Int. J. Artif. Intell. 2022, 11, 649. [Google Scholar] [CrossRef]
  41. Bhattacharya, S.; Chakraborty, S. Prediction of Responses in a CNC Milling Operation Using Random Forest Regressor. Facta Univ. Ser. Mech. Eng. 2023, 21, 685–700. [Google Scholar] [CrossRef]
  42. Wang, Y.; Xu, C.; Ren, J.; Wu, W.; Zhao, X.; Chao, L.; Yao, S. Secular seasonality and trend forecasting of tuberculosis incidence rate in China using the advanced error-trend-seasonal framework. Infect. Drug Resist. 2020, 13, 733–747. [Google Scholar] [CrossRef] [PubMed]
  43. Wiedemann, G.; Jindal, R.; Biemann, C. microNER: A Micro-Service for German Named Entity Recognition based on BiLSTM-CRF. arXiv 2018, arXiv:1811.02902. [Google Scholar]
  44. Mohanty, S.; Patra, P.K.; Sahoo, S.S. Prediction of global solar radiation using nonlinear auto regressive network with exogenous inputs (narx). In Proceedings of the 2015 39th National Systems Conference (NSC), Greater Noida, India, 14–16 December 2015; pp. 1–6. [Google Scholar]
  45. Lütkepohl, H.; Xu, F. The role of the log transformation in forecasting economic variables. Empir. Econ. 2012, 42, 619–638. [Google Scholar] [CrossRef]
  46. Xie, H. Research and case analysis of apriori algorithm based on mining frequent item-sets. Open J. Soc. Sci. 2021, 9, 458. [Google Scholar] [CrossRef]
  47. Massaoudi, M.; Chihi, I.; Sidhom, L.; Trabelsi, M.; Refaat, S.S.; Abu-Rub, H.; Oueslati, F.S. An effective hybrid NARX-LSTM model for point and interval PV power forecasting. IEEE Access 2021, 9, 36571–36588. [Google Scholar] [CrossRef]
  48. Howard, A.; Makridakis, S. M5 Forecasting—Accuracy. Kaggle. 2020. Available online: https://kaggle.com/competitions/m5-forecasting-accuracy (accessed on 25 February 2024).
  49. Boonmee, L.E.E.; Suhartono, S.; Apiradee, L.I.M.; Ahn, S.K. Forecasting World Tuna Catches with ARIMA-Spline and ARIMA-Neural Networks Models. Walailak J. Sci. Technol. (WJST) 2021, 18, 9726-15. [Google Scholar]
  50. Priyadarshini, I.; Cotton, C. A novel LSTM–CNN–grid search-based deep neural network for sentiment analysis. J. Supercomput. 2021, 77, 13911–13932. [Google Scholar] [CrossRef]
Figure 1. The architecture of the proposed demand forecasting model.
Figure 2. Architecture of the proposed uni-regression deep approximate forecasting model in demand forecasting.
Figure 3. Layer details for uni-regression deep approximate forecasting model.
Figure 4. Comparative analysis of demand forecasting dataset based on delay: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Figure 5. Comparative analysis of demand forecasting dataset based on K-fold: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Figure 6. Comparative analysis of M5 forecasting dataset based on delay: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Figure 7. Comparative analysis of M5 forecasting dataset based on K-fold: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Figure 8. Comparative analysis of daily demand forecasting orders based on delay: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Figure 9. Comparative analysis of daily demand forecasting orders based on K-fold: (a) MAE, (b) MSE, (c) RMSE, (d) RMSSE, and (e) R-squared.
Table 1. Demand forecasting dataset comparison for delay and K-fold.

Demand Forecasting Dataset, Delay 25:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         1.88    8.73    2.95    0.029   0.80
Ensemble ARIMA–NN    1.95    9.66    3.11    0.030   0.81
ETS                  1.90    9.49    3.08    0.030   0.82
LSTM Grid Search     1.86    8.55    2.92    0.028   0.82
Recurrent BiLSTM     1.85    8.14    2.85    0.028   0.84
NARX                 1.82    7.92    2.81    0.027   0.89
UDAF                 1.73    4.14    2.03    0.020   0.94

Demand Forecasting Dataset, K-Fold 10:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         1.67    6.87    2.62    0.025   0.84
Ensemble ARIMA–NN    1.77    7.07    2.66    0.026   0.87
ETS                  1.70    6.97    2.64    0.026   0.88
LSTM Grid Search     1.66    4.83    2.20    0.021   0.90
Recurrent BiLSTM     1.32    4.59    2.14    0.021   0.94
NARX                 1.31    3.82    1.96    0.019   0.95
UDAF                 1.22    1.16    1.08    0.010   0.96
Table 2. M5 forecasting dataset comparison for delay and K-fold.

M5 Forecasting Dataset, Delay 25:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         4.96    11.94   3.46    0.034   0.61
Ensemble ARIMA–NN    4.82    10.35   3.22    0.032   0.64
ETS                  4.00    9.90    3.15    0.031   0.69
LSTM Grid Search     3.65    8.24    2.87    0.028   0.71
Recurrent BiLSTM     2.99    7.96    2.82    0.028   0.74
NARX                 2.81    7.38    2.72    0.027   0.76
UDAF                 2.69    6.97    2.64    0.026   0.81

M5 Forecasting Dataset, K-Fold 10:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         4.02    9.21    3.03    0.042   0.84
Ensemble ARIMA–NN    4.13    9.42    3.07    0.041   0.85
ETS                  4.06    9.31    3.05    0.039   0.86
LSTM Grid Search     4.02    7.17    2.68    0.033   0.87
Recurrent BiLSTM     3.67    6.94    2.63    0.033   0.88
NARX                 3.67    6.17    2.48    0.032   0.88
UDAF                 3.57    3.50    1.87    0.030   0.90
Table 3. Daily demand forecasting orders dataset comparison for delay and K-fold.

Daily Demand Forecasting Orders, Delay 25:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         5.38    12.32   3.51    0.034   0.70
Ensemble ARIMA–NN    5.36    11.56   3.40    0.033   0.74
ETS                  5.18    10.23   3.20    0.031   0.76
LSTM Grid Search     5.12    8.21    2.86    0.027   0.78
Recurrent BiLSTM     3.46    8.05    2.84    0.027   0.80
NARX                 3.20    7.92    2.81    0.027   0.87
UDAF                 2.95    7.85    2.80    0.026   0.90

Daily Demand Forecasting Orders, K-Fold 10:
Models               MAE     MSE     RMSE    RMSSE   R-Squared
RF Regressor         2.80    8.21    2.87    0.043   0.85
Ensemble ARIMA–NN    2.90    8.41    2.90    0.039   0.85
ETS                  2.83    8.31    2.88    0.035   0.86
LSTM Grid Search     2.80    6.17    2.48    0.035   0.86
Recurrent BiLSTM     2.45    5.94    2.44    0.034   0.88
NARX                 2.45    5.16    2.27    0.032   0.89
UDAF                 2.35    2.50    1.58    0.031   0.90
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
