Article

A Deep Learning Approach to Predict Supply Chain Delivery Delay Risk Based on Macroeconomic Indicators: A Case Study in the Automotive Sector

Department of Industrial Engineering, University of Bologna, Viale del Risorgimento 4, 61121 Bologna, Italy
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(11), 4688; https://doi.org/10.3390/app14114688
Submission received: 12 April 2024 / Revised: 12 May 2024 / Accepted: 16 May 2024 / Published: 29 May 2024

Abstract

The development of predictive approaches to estimate supplier delivery risks has become vital for companies that rely heavily on outsourcing practices and lean management strategies in the era of the shortage economy. However, the literature proposing such approaches is still in its infancy, and several gaps have been found. In particular, most current studies present approaches that can only estimate whether suppliers will be late or not. Moreover, even though autocorrelation in data has been widely considered in demand forecasting, it has been neglected in supplier delivery risk predictions. Finally, current approaches struggle to consider macroeconomic data as input and rely mostly on machine learning models, while deep learning ones have rarely been investigated. The main contribution of this study is thus to propose a new approach that, for the first time, simultaneously adopts a deep learning model able to capture autocorrelation in data and integrates several macroeconomic indicators as input. As a second contribution, the performance of the proposed approach has been investigated in a real automotive case study and compared with that of approaches that adopt traditional statistical models and of models that do not consider macroeconomic indicators as additional inputs. The results highlight the capability of the proposed approach to provide good forecasts and to outperform the benchmarks for most of the considered predictions. Furthermore, the results provide evidence of the importance of considering macroeconomic indicators as additional input.

1. Introduction

The ability to anticipate supply chain (SC) risks has always been pivotal in supply chain risk management (SCRM). Indeed, the necessity to consider not only reactive strategies but also proactive ones has been highlighted in several studies in this field [1,2]. This long-standing need has recently become increasingly relevant. Several supply chain disruptions have occurred in recent years, leading unprepared industries nearly to default and obliging even big companies to shut down their plants or cut their production rates due to the shortages of parts experienced during the pandemic period [3,4]. For this reason, adopting a proactive risk-management perspective in the current shortage economy [5] by structuring SCs with resilience [6] has become fundamental for companies that are to survive.
In addition to the increased risk exposure of supply chains, the current period is also characterized by outstanding advancements in the field of artificial intelligence (AI) [7,8] and especially in machine learning (ML) and deep learning (DL) [9,10], which represent technological solutions with high potential for gaining visibility over supply chains. Visibility is one of the antecedents of resilience [11], and industries’ correct adoption of these technologies is a fundamental step toward more resilient systems.
Thus, researchers should guide practitioners through the intelligent adoption of ML technology in this era. More specifically, assuming a design research perspective [12], new studies should provide information about how to effectively design ML artifacts tailored to deal with different supply chain risks. Unfortunately, while most studies have proposed ML artifacts to forecast demand risks proactively, less attention has been dedicated to designing approaches able to predict supply risks. However, similar to demand risks, supply risks can lead to severe disruptions. Predicting supplier delivery risk is particularly relevant for companies that rely heavily on outsourcing and lean manufacturing practices. Indeed, when operating with a low level of stock, delivery delays can lead to shortages of components and thus block production lines. As a result, small delays have been demonstrated to decrease sales by up to 10% for three to four weeks, while longer delays can have an even larger negative impact [13].
Studies proposing innovative ML artifacts to predict supplier delivery delay risks are thus necessary to ensure more resilient supply chains. However, despite this need, the literature proposing ML approaches that can effectively predict supplier delivery risks is in its infancy, and a definitive solution to the problem thus remains elusive. In particular, most studies on this topic have proposed approaches that solve the problem in a classification manner (i.e., by predicting whether a delivery delay will occur or not) [14,15,16,17,18]. Predictions provided in this form are useful, but to effectively prioritize risk actions, a point estimate of the amount of delay is needed. Moreover, very little attention has been dedicated to integrating macroeconomic indicators to support predictions [18]. However, according to [19], relying only on internal company data or past historical delivery data does not allow for anticipating market changes. Lastly, even though autocorrelation in data has often been considered in demand forecasting [19,20,21], autocorrelation in delivery delay data has never been exploited to build a forecasting model for this problem.
The main motivation for this study is therefore to propose an approach that, for the first time, simultaneously covers all these gaps. A deep learning model able to capture autocorrelation in data, specifically a Long Short-Term Memory (LSTM) model, is proposed to predict delivery risk in a regression manner (i.e., by estimating the number of days by which a component will be delivered late or early). Concurrently, a procedure is presented to identify the best macroeconomic indicators and which lagged versions to consider for predictions.
A research methodology involving an experimental design that considers real supplier delivery data from an automotive case study has been adopted to investigate the capability of the proposed approach against several benchmarks. An automotive case study has been selected because the automotive sector is characterized by high outsourcing volumes and lean manufacturing strategies, and thus represents a situation in which predicting supplier delivery delays is vital. In particular, considering the proposed approach, the following research questions have been investigated:
  • Which predictive accuracy can the new proposed approach reach in predicting supplier delivery risk in a regression form in a real case study?
  • Which predictive accuracy advantages can the proposed deep learning approach obtain compared to traditional statistical models like ARIMA?
  • Which predictive accuracy advantages can the integration of macroeconomic indicators bring compared to models built without considering these variables?
The remainder of this paper is structured as follows. Section 2 reviews the literature. Section 3 presents the proposed approach and the research methodology adopted. Section 4 presents the results. Section 5 discusses the results of the study and presents their theoretical and managerial implications. Lastly, Section 6 summarizes the conclusions.

2. Literature Review

In this section, challenges arising in the development of predictive ML and DL models are introduced to highlight the high number of design options that arise when building these models. Afterward, studies proposing predictive approaches integrating macroeconomic indicators to predict supply chain risks are reviewed in Section 2.2, while studies proposing ML and DL approaches to predict supplier delivery risks specifically are reported in Section 2.3. Finally, common trends and the main gaps and limitations of the reviewed studies are reported in Section 2.4 and Table 1 to underline the main novelties of the proposed approach.

2.1. Modeling Challenges Arising in the Design of ML and DL Predictive Approaches

ML and DL have recently gained attention in the Supply Chain Risk Management (SCRM) field due to their abilities to automatically learn relationships from data and provide forecasts about future risks. However, despite their capabilities, ML and DL are far from being completely automated technologies. Human expertise remains fundamental for their design and optimization, and human experts need to make several modeling decisions to solve different problems effectively. Indeed, based on [22], multiple steps must be considered in the design of these approaches, and several options are available for each step. In particular, each approach needs to start with a proper data management stage and end with an appropriate model learning stage.
Regarding the data management stage, a decision must first be made as to which kinds of data are required. One option is to rely only on the past historical records of the variable to be predicted. However, considering additional internal company data or expanding the collection to include data from outside the company are potential alternatives. Once the data have been collected, a preprocessing step should be taken to adjust the data by performing different operations. Typically, these operations involve feature extraction, selection, and scaling. Feature extraction aims to build and extract new relevant features from the raw data to improve the model’s performance. Feature selection aims to reduce the collected data to only the relevant features. Lastly, feature scaling aims to bring all the collected data onto the same interval [23,24].
Different strategies and techniques must also be chosen in the model learning stage. Here, the type of problem to solve must be identified, and the most suitable learning algorithm must be selected in the model selection step. Predictions can thus be based on regression algorithms when the problem requires continuous variables to be predicted, while classification algorithms can be adopted if a binary or discrete forecast must be produced. Once selected, models must undergo the training stage, in which local or global training strategies can be adopted. In local training, one model is developed for each specific group of data. In contrast, a single model for multiple groups of data is built when a global training strategy is adopted [25]. Lastly, in the hyperparameter selection step, the values of the parameters that are not directly learned from data must be optimized, and the techniques and search spaces used to find the optimal values must be specified. Overall, designing a predictive approach thus involves choosing between multiple techniques for a high number of steps.
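Table 1. Literature summary.

| Study | Predicted Risk | Predictive Problem | Predictive Model |
|---|---|---|---|
| [19] | D | R | ML |
| [26] | D | R | ML |
| [27] | D | R | ML |
| [20] | D | R | ML |
| [21] | D | R | DL |
| [28] | D | R | DL |
| [14] | SD | C | ML |
| [15] | SD | C | ML |
| [16] | SD | C | ML |
| [17] | SD | C | ML |
| [18] | SD | C | DL |
| [29] | SD | R | ML |
| This study | SD | R | DL |

[The table has two further columns, "Autoregressive Input Data" and "Macroeconomic Input Data"; their cell entries are not recoverable here.]
D: Demand, SD: Supplier Delivery, R: Regression, C: Classification, ML: Machine Learning, and DL: Deep Learning.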

2.2. ML and DL Predictive Approaches for SCRM-Exploiting Macroeconomic Indicators

In the design of the data management stage for ML and DL models, macroeconomic indicators have been widely adopted as input when estimating supply chain demand risks. In [19], the problem of forecasting sales 12 months ahead was addressed. In particular, both historical product sales data and 67,851 macroeconomic variables were adopted as input for a LASSO regression model. The LASSO regression model was proposed because it can automatically perform the feature selection step by identifying the most useful macroeconomic indicators and their lagged versions. This approach was tested on five years of data related to two products of a raw material supplier for the tire industry, and the results suggest that the proposed method can lead to accuracy improvements of up to 18.8% compared to other traditional models like Holt-Winters, Exponential Smoothing, and ARIMA models. This study was subsequently extended in [26,27]. In ref. [26], a bigger dataset, including 10 years of sales from five global plants of tire manufacturers, was considered and, in addition to the forecasting accuracy of the proposed approach, its impacts on the service level and inventory were examined. In particular, the results showed that adopting the LASSO model leads to lower inventory costs. Lastly, in ref. [27], the empirical investigation was extended by including data from two other companies: a global steel producer and a producer of composite building materials. The additional case studies confirmed the advantages of the proposed method, which achieved a reduction of 25.6% in mean absolute percentage error compared to the conventional forecasting method. Other studies investigating the integration of macroeconomic indicators when forecasting demand data can also be found in [20,21,28].
In ref. [20], gross domestic product, unemployment rate, crude oil price, purchasing managers’ indices, and copper price were used as input to compare the forecasting accuracy of several traditional and ML models. In refs. [21,28], DL models were instead adopted for the task, with the results highlighting the greater capability of these models to provide accurate forecasts.
Conversely, supplier-related risks have rarely been predicted using macroeconomic data. To the best of this author’s knowledge, the only study considering macroeconomic indicators for predicting supplier delivery delays is the one performed by [18]. In this study, 54 variables, including internal company data and macroeconomic indicators, were provided as input to a deep neural network to solve the problem. However, contrary to the previous studies, the problem of predicting supplier delivery delays was formulated in a classification manner (i.e., by predicting only whether a component would be delivered late or not). Furthermore, this study did not consider autoregressive variables representing the past historical behavior of suppliers.

2.3. ML and DL Predictive Approaches for Supplier Delivery Risks

Different design strategies can be noted when developing predictive approaches to estimate supplier delivery risks. However, a clear lack of studies that consider macroeconomic data as input for these models can be found. Moreover, a common trend in proposing approaches to solving classification problems can also be found.
In particular, the first approaches to predicting supplier delivery delays were developed by [14,15]. Both approaches proposed collecting only internal company data and solving the predictive problem by using ML classification algorithms. In addition, the same learning strategies, namely, those based on constructing a separate model for each supplier, were proposed in both papers. No feature selection techniques were suggested in the former, while several feature selection techniques were compared in the latter, and a recursive feature elimination procedure was identified to yield the best results.
Classification algorithms were also adopted by [16,17,18]. However, these studies applied a learning strategy based on one global model that generated predictions for multiple suppliers at once. Refs. [16,17] only used internal company data, whereas [18] used data from inside and outside the company related to weather and macroeconomic indicators. Regarding the feature selection techniques, expert domain interviews were suggested by [16], while a literature survey was selected by [18]. In contrast, due to the nature of the federated learning approach proposed by [17], the selected features were limited to those available from all companies involved.
Lastly, the only study that proposed ML regression models to solve the problem was carried out by [29]. The study only collected internal company data, and the features were selected based on their correlations with the variable to predict. A learning strategy based on a single global model that generated predictions for all the considered suppliers was adopted in that study.

2.4. Research Gaps and Novelties of the Current Study

Based on the investigated literature, three main directions have been found to be underinvestigated.
First, while macroeconomic indicators have been widely adopted to support demand risk predictions, they have rarely been integrated when forecasting supplier delivery risks. Indeed, only the study of [18] started to investigate this area.
Second, predictive approaches to estimate supplier delivery risks have mainly been proposed for the classification formulation of the problem. Only ref. [29] proposed an approach for solving the delivery risk prediction problem in a regression manner.
Lastly, although autocorrelation in data and autoregressive models have often been applied when predicting demand risks, none of the approaches reported in Section 2.3 have considered adopting models that are specifically tailored to deal with time-series data and to consider this aspect.
Based on this evidence, the main innovation of this study lies in the definition of a predictive approach that, for the first time, simultaneously addresses the problem of forecasting supplier delivery delay risk in a regression manner, the necessity of integrating macroeconomic indicators, and the time-series nature of the problem and thus possible autocorrelation in the data. Indeed, according to Table 1, which summarizes the previous studies, no one has proposed a predictive approach covering all three of these aspects simultaneously for the supplier delivery risk prediction problem.

3. Materials and Methods

In this section, the problem under investigation is described. Thereafter, the problem is formally stated, and the proposed approach is presented. Lastly, the research methodology followed to test the proposed approach is discussed.

3.1. Problem Statement

Manufacturing companies rely on many suppliers. Thus, hundreds of entities compose their SCs, and suppliers deliver thousands of components daily. In this context, the on-time delivery of each component is fundamental for guaranteeing a smooth production flow for the final manufacturers. It has thus become essential to know in advance if future deliveries will be made on time. However, knowing whether a delivery will be late or not is sometimes not enough. Material planners should have precise daily information about each component’s expected delivery delay or advance to avoid excessive inventory or stockout. In addition, with the increase of globalization, SCs have become more exposed to global dynamics and as a result, the local delivery performance of suppliers could be affected by the macroeconomic conditions of different countries and sectors. Overall, the problem faced by an increasing number of industries is thus how to effectively predict the amount of delay or advance in the delivery of each component from their suppliers by leveraging the collection of publicly available macroeconomic indicators.

3.2. Proposed Approach

To solve the problem presented in Section 3.1, this study proposes an approach that starts by framing the problem as a one-step-ahead multivariate time-series forecasting regression problem for unevenly spaced time series.
As it is necessary to predict continuous values (i.e., the exact amount of delay or advance of each supplied component), the problem is framed as a regression problem. The need to investigate the relationship between the behaviors of multiple macroeconomic variables and delivery data suggests the multivariate aspect of the predictive problem, while the evolving nature of the delivery performance over time leads to a time series formulation. Furthermore, even if it is true that many manufacturers receive components daily (according to the ‘every part every day’ principle), it is also true that other typical planning and inventory management strategies based on material requirement planning (MRP) and the reorder point (ROP) can lead to component deliveries with different frequencies over time. Considering that an evenly spaced time series is only a special case of unevenly spaced time series, the problem has been formulated to consider the latter possibility. In conclusion, different forecasting horizons can be investigated, but the scope of the present paper limits this aspect to the prediction of delivery delay or advance to only one delivery in the future, leading to the one-step-ahead nature of the problem.
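The one-step-ahead framing can be sketched as a windowing routine that turns a sequence of deliveries into supervised (input, target) pairs. This is only an illustration: the function name `make_windows`, the window length, and the toy data are assumptions and do not come from the study. Observations are indexed by delivery order rather than by calendar day, which is one simple way to accommodate unevenly spaced deliveries.

```python
import numpy as np

def make_windows(features, target, window):
    """Build one-step-ahead samples from a multivariate series.

    features: array of shape (n_deliveries, n_features), one row per delivery
    target:   array of shape (n_deliveries,), delay (+) or advance (-) in days
    window:   number of past deliveries used as model input (assumed value)
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    for end in range(window, len(target)):
        X.append(features[end - window:end])  # the `window` previous deliveries
        y.append(target[end])                 # the next delivery to predict
    return np.array(X), np.array(y)

# Toy data: delay in days and one hypothetical macroeconomic indicator.
delays = np.array([1.0, -2.0, 0.0, 3.0, 5.0, -1.0])
indicator = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
features = np.column_stack([delays, indicator])

X, y = make_windows(features, delays, window=3)
print(X.shape, y.shape)
```

Each sample in `X` has shape (window, n_features), which is the layout recurrent models such as LSTMs typically expect.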
Following the guidelines reported in [22], an approach is proposed to start from the data management stage and end with model learning. It was decided to base the proposed approach on these two blocks because selecting a specific learning algorithm only partially affects the overall results.
Indeed, according to [31], the preprocessing step can seriously affect the overall performance, and different preprocessing techniques can be adopted based on the problem to solve. A framework of the proposed approach that specifically states the techniques suggested for each step is reported in Figure 1.

3.2.1. Data Management Stage

Following ref [22], three steps are considered in the data management phase: data collection, data augmentation, and data preprocessing.
First, domain expert interviews are proposed as the means of data collection that effectively restrict the possibly infinite number of macroeconomic data to those considered to most affect the delivery performance of suppliers. Indeed, experts’ interviews have proven to be effective in refiltering macroeconomic indicators [19]. In particular, the external cues-based methodology suggested in [32] is proposed to select experts. According to [32], the external cues-based methodology should be preferred when tasks are related to highly specialized markets.
Afterward, a data augmentation step based on a linear interpolation of macroeconomic data is proposed to reconstruct the daily value of the macroeconomic variables. Indeed, while the problem statement requires the delivery delay performance of each supplied component to be predicted at a daily level, macroeconomic variables are usually recorded with a monthly frequency.
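A minimal sketch of this augmentation step with pandas is given below. The indicator values and the choice of the first day of each month as the observation date are assumptions made for illustration; the study does not specify them.

```python
import pandas as pd

def monthly_to_daily(monthly: pd.Series) -> pd.Series:
    """Upsample a monthly series to daily frequency by linear interpolation."""
    # resample("D") creates one row per calendar day (missing days are empty);
    # interpolate("linear") fills them on a straight line between known values.
    return monthly.resample("D").interpolate(method="linear")

# Hypothetical indicator observed on the first day of each month.
monthly = pd.Series(
    [100.0, 131.0, 103.0],
    index=pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"]),
)

daily = monthly_to_daily(monthly)
print(daily.loc[pd.Timestamp("2021-01-16")])
```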
Once the data augmentation has been executed, a data preprocessing step is proposed involving feature engineering, feature selection, and feature scaling.
A lag transformation is proposed for the feature engineering step. Considering a variable $X$, the lag transformation $L(X, \alpha)$ creates a new variable whose $t$-th value is equal to $X_{t-\alpha}$ for $t > \alpha$, while for $t \le \alpha$, the new variable contains missing values. The lag transformation is proposed because macroeconomic variables usually affect local systems with a certain delay in time. Thus, generating different lagged values for the collected macroeconomic data is considered necessary under the requirement identified in the problem statement. However, considering all possible values of the parameter $\alpha$ for all macroeconomic variables at a daily level can easily lead to long computational times for dataset generation as well as memory storage problems. To overcome these problems, the proposed feature engineering step generates only those lagged variables obtained by restricting $\alpha$ to a limited discrete subset. Furthermore, for each component and each macroeconomic variable, only the observations for which delivery delay data are recorded are retained, excluding those containing missing values generated by the lag transformation.
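The lag transformation and the subsequent filtering can be sketched with pandas as follows. The helper `add_lagged_features`, the column name, the lag values, and the delivery dates are all hypothetical and chosen only to make the example small.

```python
import pandas as pd

def add_lagged_features(df, columns, alphas):
    """Append lagged copies of `columns` for each lag in the discrete set `alphas`."""
    out = df.copy()
    for col in columns:
        for alpha in alphas:
            # shift(alpha) moves values down by alpha rows, so row t holds the
            # value from row t - alpha and the first alpha rows become missing.
            out[f"{col}_lag{alpha}"] = df[col].shift(alpha)
    return out

# Hypothetical daily macroeconomic indicator.
daily = pd.DataFrame(
    {"indicator": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
)
lagged = add_lagged_features(daily, ["indicator"], alphas=[1, 3])

# Keep only the days on which a delivery was recorded for the component,
# then drop rows whose lagged values are missing.
delivery_dates = pd.to_datetime(["2021-01-02", "2021-01-05"])
usable = lagged.loc[delivery_dates].dropna()
print(usable)
```

Restricting `alphas` to a short list keeps the number of generated columns at (number of variables) × (number of lags), which is the motivation for the discrete subset mentioned in the text.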
A univariate feature selection based on the K-best algorithm implemented in Sklearn [33], applied separately for each component, is proposed for the feature selection step. The proposed feature selection algorithm selects, from the entire set of features, only the K features with the highest scores according to the algorithm’s ranking, where the parameter K must be found experimentally. This technique is suggested for two main reasons. First, compared to the heuristic approach proposed in [18], this technique does not require the choice of macroeconomic indicators to be limited to those related to the location of a specific supplier. Rather, the proposed technique automatically identifies the macroeconomic variables that are most relevant to the supplier’s delivery performance. Second, compared to other feature selection methods, its reduced computational complexity guarantees a fast application even when a large number of macroeconomic variables are considered.
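The selection step could look like the following sketch, which uses scikit-learn's `SelectKBest`. The scoring function is not stated in the text; `f_regression` is assumed here because the target is continuous. The synthetic data and the value K = 2 are purely illustrative.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic data for one component: 5 candidate (lagged) indicators,
# of which only columns 0 and 2 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

# Keep the K features with the highest univariate scores.
# K is a tuning parameter; 2 is used only for illustration.
selector = SelectKBest(score_func=f_regression, k=2)
X_selected = selector.fit_transform(X, y)

selected_idx = selector.get_support(indices=True)
print(selected_idx)
```

Because each feature is scored independently of the others, the cost grows linearly with the number of candidate variables, which is consistent with the low computational complexity mentioned above.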
Lastly, a data scaling step based on the MinMax scaler is proposed to scale each variable in the dataset in the range [0, 1] according to the equation below:
$$x_{scaled} = \frac{x - \min(x)}{\max(x) - \min(x)}$$
Here, for a given variable, $x_{scaled}$ represents the scaled vector, $x$ represents the original vector, and $\max(x)$ and $\min(x)$ represent, respectively, the maximum and minimum values of the vector. A data scaling step is proposed to overcome the problem generated by the possibly different units of measurement adopted for the different macroeconomic variables considered.
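As a small worked example of the MinMax scaling described above, the sketch below applies it to a single vector with NumPy. The handling of a constant vector (returning zeros) is an assumption added to avoid division by zero; the text does not address that case.

```python
import numpy as np

def min_max_scale(x):
    """Scale a 1-D vector to the range [0, 1] using its own min and max."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Assumption: a constant variable carries no information, map it to 0.
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

print(min_max_scale([2.0, 4.0, 6.0]))
```

In a forecasting setting, the minimum and maximum would usually be computed on the training portion only and reused for validation and test data; that practice is common but is not stated in the source.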

3.2.2. Model Learning Stage

According to the time-series regression conceptualization of the problem, a long short-term memory (LSTM) model [34] is proposed in the model selection step. Indeed, LSTM models have proven to be capable of dealing with time-series regression problems in several real SC applications [30,35,36,37]. Regarding the training strategy, a local training strategy is proposed with the root mean squared error (RMSE) as the loss function to find the optimal internal parameters of the LSTM model. Although the LSTM model can be trained globally on multiple components, and this option reduces the number of models that need to be maintained, it requires as input a unique feature vector for all components. Therefore, considering the possibility that different features can affect different components, it is proposed to perform training individually for each component. Furthermore, the approach of adopting one model separately for each component has been widely applied in the automotive sector when developing predictive approaches for estimating demand risks [20,38]. In ref [20], training multiple local models instead of one global model that provides forecasts for all the components led to better results for regression problems. The RMSE was adopted as the loss function as it is widely implemented for solving regression problems and because its formulation helps in avoiding large prediction errors.
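The remark that the RMSE discourages large prediction errors can be illustrated by comparing it with the mean absolute error (MAE) on two toy error patterns. The MAE is introduced here only as a point of comparison and is not part of the proposed approach.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, shown only for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

actual = [0.0, 0.0, 0.0, 0.0]
spread_errors = [1.0, 1.0, 1.0, 1.0]    # four errors of one day each
one_large_error = [0.0, 0.0, 0.0, 4.0]  # a single error of four days

print(mae(actual, spread_errors), mae(actual, one_large_error))
print(rmse(actual, spread_errors), rmse(actual, one_large_error))
```

Both patterns have the same MAE, but the RMSE of the second is twice that of the first, so a model trained on RMSE is pushed away from occasional large misses.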
In the hyperparameter tuning step, a grid search procedure executed individually for each trained model is proposed to find the best values for the number of layers and the learning rate. Although other hyperparameters (i.e., parameters that cannot be directly learned from data) define the structure of an LSTM model, only these two are optimized, as they have been identified as the most relevant in the tuning process of LSTM models [39]. In addition, an early stopping mechanism is adopted to identify the best number of epochs. The number of epochs can be regarded as the number of consecutive iterations for which the model is trained to find the values of the internal parameters that minimize the adopted loss function. Under this procedure, a high number of epochs can be initially set, but the training of the model is automatically stopped early if, for σ consecutive epochs, no improvement in the RMSE computed on the validation dataset is found. Adopting an automated strategy to properly train each model is fundamental for dealing with the typically high number of components supplied in the automotive sector.
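The tuning logic can be sketched independently of any deep learning library. In the sketch below, `early_stopping` applies the patience rule to a list of validation RMSE values, and `grid_search` loops over the two hyperparameters. The function names, the grid values, and the stand-in `fake_evaluate` function are assumptions; in practice the evaluation would train an LSTM and return its validation RMSE.

```python
import math
from itertools import product

def early_stopping(val_rmse_per_epoch, patience):
    """Return (best_epoch, stop_epoch) under a patience rule.

    Training stops once `patience` consecutive epochs bring no improvement
    in validation RMSE (the role played by sigma in the text).
    """
    best, best_epoch, wait = math.inf, -1, 0
    for epoch, value in enumerate(val_rmse_per_epoch):
        if value < best:
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_rmse_per_epoch) - 1

def grid_search(evaluate, layer_options, lr_options):
    """Try every (layers, learning rate) pair and keep the lowest score."""
    best_cfg, best_score = None, math.inf
    for n_layers, lr in product(layer_options, lr_options):
        score = evaluate(n_layers, lr)
        if score < best_score:
            best_cfg, best_score = (n_layers, lr), score
    return best_cfg, best_score

def fake_evaluate(n_layers, lr):
    # Stand-in for "train a model and return its validation RMSE".
    return abs(n_layers - 2) + abs(lr - 0.01)

history = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 2.0]
print(early_stopping(history, patience=3))
print(grid_search(fake_evaluate, [1, 2, 3], [0.1, 0.01, 0.001]))
```

Note that with a patience of 3 the toy run stops before reaching the final, lower value in `history`; this is the usual trade-off of early stopping between training time and the chance of missing a late improvement.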

3.3. Research Methodology

A multimethod research design integrating case-based research and experimental design was applied to investigate the capabilities of the proposed approach. Case-based research was selected due to its ability to capture real-world systems’ complexity [40]. On the other hand, an experimental design was developed based on the data collected in the case study to understand how the adoption of different techniques in the steps considered in the proposed approach affects its overall performance.

3.3.1. Case Study Selection and Data Collection

A real Italian automotive company was selected as the case study to investigate the potential of the proposed approach. The automotive sector was selected as it is one of the most important economic sectors globally [41] and has recently experienced a paradigm shift led by new technologies. Furthermore, the literature review identified a lack of empirical studies in this sector. Similarly, the company was selected due to its relevance and the complexity of its supply chain. Indeed, with almost 500 million in revenues, the company relies on more than 600 suppliers who supply, on average, 10,000 components.
Two different types of data were collected from the selected case study. First, suppliers’ delivery performance data for 134 different components, supplied by 24 suppliers, were recorded over 2021–2022. Specifically, the data were restricted to components for which at least 30 deliveries had been recorded, to guarantee a minimum amount of data for properly training and testing the proposed approach. A similar methodological strategy was applied by [16]. Second, multiple macroeconomic variables that are considered to affect suppliers’ potential deliveries were identified based on the expertise of the company’s SC manager. The Eurostat database was chosen as a data source due to its reliability and wide coverage of European company data. An overview of the collected data is reported in Table 2, while summary statistics of the delivery performance data are reported in Table 3.
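The restriction to components with at least 30 recorded deliveries can be expressed as a simple filter. The record layout (component identifier, delay in days) and the toy data below are assumptions for illustration only.

```python
from collections import Counter

def filter_components(deliveries, min_deliveries=30):
    """Keep only records of components with at least `min_deliveries` deliveries."""
    counts = Counter(component for component, _ in deliveries)
    return [(c, d) for c, d in deliveries if counts[c] >= min_deliveries]

# Toy records: component "A" has 35 deliveries, component "B" only 10.
deliveries = [("A", d % 5 - 2) for d in range(35)] + [("B", 1) for _ in range(10)]

kept = filter_components(deliveries)
print(len(kept), {c for c, _ in kept})
```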

3.3.2. Experimental Design

Based on the data collected in the case study described in Section 3.3.1, an experimental design was devised to investigate the capabilities of the proposed approach. Specifically, a factorial design technique was followed for the experiments [42]. A factorial design is particularly useful when the influence of different factors on specific response variables needs to be investigated. The factors (i.e., the independent variables that are manipulated in the experiment), their levels (i.e., the values investigated for each factor), and the response variables (i.e., the dependent variables measured over the experiments) are detailed in the sections “Experimental Conditions: Factors and Levels” and “Response Variables”.

Experimental Conditions: Factors and Levels

Two factors were selected for the experiment: the input data adopted in the data management stage and the predictive model considered in the model learning stage.
For the first factor, two levels were tested. No macroeconomic variables were considered in the first level, which was identified with the name UNIVARIATE. Here, the future supplier delivery delay predictions of a specific component were based only on that component’s past historical delivery performance records. In contrast, for the second level identified with MULTIVARIATE, macroeconomic variables and past delivery delay performance were used as input for the data management stage.
Two levels were also selected for the second factor. An ARIMAX model was used as a predictive model for the first level. An ARIMAX model is a generalization of the ARIMA model [43] whereby predictions of future values can be generated by considering both historical values and the values assumed by exogenous predictors. The ARIMAX model can be used for both stationary and nonstationary time series. When the time series is stationary, the ARIMAX model can be referred to as ARMAX. A general ARMAX model with exogenous predictors can be formulated as follows:
$$ y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_p y_{t-p} + \phi_1 \varepsilon_{t-1} + \dots + \phi_q \varepsilon_{t-q} + \varepsilon_t + \sum_{i} \theta_i X_{i,t} $$
where $\beta_i$ is the coefficient of the autoregressive part, $y_t$ is the value of the label at time $t$, $p$ is the order of the autoregressive process, $\phi_i$ represents the coefficient of the moving average part, $\varepsilon_t$ is the residual error at time $t$, and $q$ is the order of the moving average component. Lastly, $\theta_i$ is the coefficient of the exogenous variable $X_i$, and $X_{i,t}$ is the value of the covariate $X_i$ at the time instant $t$. When no exogenous variables are considered, the ARIMAX model coincides with the traditional ARIMA model. An LSTM was adopted as the other learning technique.
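As an illustration of how the ARMAX formulation above turns past values, past residuals, and exogenous covariates into a forecast, the following minimal sketch computes a one-step-ahead prediction with the unknown current residual set to zero. The function name, argument names, and toy coefficients are assumptions made for this example and are not taken from the case study.

```python
from typing import Sequence


def armax_one_step(
    intercept: float,
    ar_coefs: Sequence[float],
    ma_coefs: Sequence[float],
    exog_coefs: Sequence[float],
    past_values: Sequence[float],
    past_residuals: Sequence[float],
    exog_now: Sequence[float],
) -> float:
    """One-step-ahead ARMAX prediction (current residual assumed to be zero).

    past_values[0] is y_{t-1}, past_values[1] is y_{t-2}, and so on;
    past_residuals follows the same ordering.
    """
    if len(past_values) < len(ar_coefs) or len(past_residuals) < len(ma_coefs):
        raise ValueError("not enough history for the requested orders")
    if len(exog_now) != len(exog_coefs):
        raise ValueError("one coefficient is needed per exogenous variable")
    # Autoregressive part: weighted sum of the p most recent observations.
    ar_part = sum(b * y for b, y in zip(ar_coefs, past_values))
    # Moving average part: weighted sum of the q most recent residuals.
    ma_part = sum(c * e for c, e in zip(ma_coefs, past_residuals))
    # Exogenous part: weighted sum of the covariates observed at time t.
    exog_part = sum(th * x for th, x in zip(exog_coefs, exog_now))
    return intercept + ar_part + ma_part + exog_part


# Toy numbers (hypothetical): p = 2, q = 1, one exogenous variable.
prediction = armax_one_step(
    intercept=1.0,
    ar_coefs=[0.5, 0.2],
    ma_coefs=[0.3],
    exog_coefs=[0.1],
    past_values=[4.0, 2.0],
    past_residuals=[1.0],
    exog_now=[10.0],
)
print(round(prediction, 3))
```

With an empty list of exogenous coefficients and covariates, the same function reduces to a plain ARMA prediction, mirroring the remark that the model coincides with ARIMA when no exogenous variables are considered.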
Overall, the experimental design thus consisted of $2^2 = 4$ experimental conditions, represented as (UNIVARIATE, ARIMAX), (MULTIVARIATE, ARIMAX), (UNIVARIATE, LSTM), and (MULTIVARIATE, LSTM). Specifically, the experimental condition (MULTIVARIATE, LSTM) represents the proposed approach, while the remaining three are considered as benchmarks.
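The full factorial design can be enumerated mechanically as the Cartesian product of the levels of the two factors. The short sketch below is only an illustration of that enumeration; the variable names are assumptions.

```python
from itertools import product

# Levels of the two factors described in the text.
input_levels = ["UNIVARIATE", "MULTIVARIATE"]
model_levels = ["ARIMAX", "LSTM"]

# Full factorial design: every combination of input data and predictive model.
conditions = list(product(input_levels, model_levels))
proposed = ("MULTIVARIATE", "LSTM")
benchmarks = [c for c in conditions if c != proposed]

for condition in conditions:
    role = "proposed approach" if condition == proposed else "benchmark"
    print(condition, "->", role)
```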

Response Variables

The response variables monitored for each experimental condition are widely adopted accuracy metrics for regression problems. The considered metrics are the mean absolute error (MAE), the symmetric mean absolute percentage error (SMAPE), and the root mean square error (RMSE), which are defined as follows:
$$ MAE = \frac{1}{N} \sum_{t=1}^{N} \left| Y_t - \hat{Y}_t \right| $$
$$ SMAPE = \frac{1}{N} \sum_{t=1}^{N} \frac{\left| Y_t - \hat{Y}_t \right|}{\left( Y_t + \hat{Y}_t \right) / 2} $$
$$ RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( Y_t - \hat{Y}_t \right)^2} $$
where $Y_t$ is the true historical value of the delivery delay or the advance recorded at time $t$ for a specific component, $\hat{Y}_t$ is the value predicted, and $N$ is the length of the test set.
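The three metrics can be sketched in a few lines of Python. This is an illustrative implementation of the definitions above, not the authors' code: the SMAPE is returned as a fraction (multiply by 100 for a percentage), its denominator uses the signed values as written in the definition, and the explicit error on a zero denominator is an assumption added for this sketch.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the data (here, days)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, returned as a fraction."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        denominator = (y + p) / 2
        if denominator == 0:
            # Assumption of this sketch: fail loudly instead of dividing by zero.
            raise ValueError("SMAPE is undefined when Y_t + Y_hat_t equals zero")
        total += abs(y - p) / denominator
    return total / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the same unit as the data (here, days)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Toy delays in days (hypothetical values, not from the case study).
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(mae(actual, predicted), 3))
print(round(smape(actual, predicted), 3))
print(round(rmse(actual, predicted), 3))
```

Because the RMSE squares each error before averaging, it penalizes large individual misses more heavily than the MAE, which is why the two metrics can rank the experimental conditions differently.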

3.3.3. Experiment Set-Up

The set-up adopted for each of the four considered experimental conditions involved several steps. First, the amount of delivery delay or the advance of a specific component was considered the dependent variable to predict (label). In contrast, the macroeconomic variables and/or past historical delivery delay records of the same component were considered the independent variables adopted for the predictions (features).
Second, both the features and the label dataset were split into three consecutive temporal portions. The first 60% were identified as the training set, the second 20% as the validation set, and the last 20% as the test set.
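A chronological 60/20/20 split keeps the temporal order intact, so that the validation and test sets always follow the training data in time. The sketch below is a minimal illustration; the function name and the use of integer percentages are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_pct: int = 60, val_pct: int = 20
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered series into consecutive train/validation/test parts."""
    n = len(series)
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return list(series[:train_end]), list(series[train_end:val_end]), list(series[val_end:])


# Toy series of 30 deliveries, i.e., the minimum number required per component.
delays = [float(i) for i in range(30)]
train, val, test = chronological_split(delays)
print(len(train), len(val), len(test))
```

No shuffling is applied, since shuffling would leak future observations into the training set of an autocorrelated series.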
Thereafter, different data management strategies were applied according to the experiment under consideration. On the one hand, only the data augmentation and scaling steps described in Section 3.2.1 were applied for the UNIVARIATE experimental conditions. On the other hand, all steps reported in Section 3.2.1 were performed for the experiments that considered a MULTIVARIATE experimental condition. Specifically, for the feature engineering phase, the lagged macroeconomic indicators were generated considering possible lag values, expressed in days, in the discrete set [30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330, 360]. In the feature selection step, the maximum number K of macroeconomic features to retain was chosen by testing, separately for each model, the values in the discrete set [1, 20, 30].
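The generation of lagged macroeconomic features can be sketched as follows. This is only one possible reading of the step: the rule of taking the most recent indicator value available on or before the lagged date, the function name, and the indicator name `ppi` are all assumptions introduced for illustration.

```python
from bisect import bisect_right
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple

# Lag grid in days, as listed in the text: 30, 60, ..., 360.
LAGS_DAYS = list(range(30, 361, 30))


def lagged_features(
    indicator_name: str,
    observations: List[Tuple[date, float]],
    delivery_date: date,
    lags: List[int] = LAGS_DAYS,
) -> Dict[str, Optional[float]]:
    """Build one feature per lag from a date-sorted list of (date, value) pairs."""
    dates = [d for d, _ in observations]
    features: Dict[str, Optional[float]] = {}
    for lag in lags:
        target = delivery_date - timedelta(days=lag)
        # Index of the last observation dated on or before the lagged date.
        idx = bisect_right(dates, target) - 1
        features[f"{indicator_name}_lag{lag}"] = observations[idx][1] if idx >= 0 else None
    return features


# Toy monthly indicator for 2021 (hypothetical values 100..111).
monthly = [(date(2021, m, 1), 100.0 + m - 1) for m in range(1, 13)]
features = lagged_features("ppi", monthly, date(2022, 1, 15))
print(features["ppi_lag30"], features["ppi_lag360"])
```

With 12 lags per indicator, the number of candidate features grows quickly as more indicators, countries, and sectors are added, which is why a subsequent feature selection step limiting the features to K is needed.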
For each of the three considered possible values of K, 134 separate predictive models (one for each component) were thus trained on their respective training sets, and a hyperparameter tuning step based on a grid search procedure was executed to select the best values of the hyperparameters for each model, minimizing the RMSE on the validation dataset. For those experiments that considered an LSTM model in the model learning phase, the Keras library and the Keras Tuner package were adopted to build the model and tune its hyperparameters [44]. The search space adopted to perform the grid search and the threshold adopted for the early stopping procedure are reported in Table 4. The other parameters were left at the default values of the Keras library.
The search spaces adopted for the hyperparameter tuning when an ARIMAX model was considered in the experiments are reported in Table 5.
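A grid search of this kind can be sketched generically as an exhaustive loop over the candidate combinations, keeping the one with the lowest validation RMSE. In the sketch below, the ranges for p, d, and q follow Table 5, while `toy_validation_rmse` is a hypothetical stand-in for the real step of fitting an ARIMAX model on the training set and scoring it on the validation set.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


def grid_search(
    search_space: Dict[str, Iterable[int]],
    validation_rmse: Callable[[Dict[str, int]], float],
) -> Tuple[Dict[str, int], float]:
    """Return the hyperparameter combination with the lowest validation RMSE."""
    names = list(search_space)
    best_params: Dict[str, int] = {}
    best_score = float("inf")
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_rmse(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Search space from Table 5: p in 1..10, d in 1..3, q in 1..10.
arimax_space = {"p": range(1, 11), "d": range(1, 4), "q": range(1, 11)}
n_candidates = len(arimax_space["p"]) * len(arimax_space["d"]) * len(arimax_space["q"])


def toy_validation_rmse(params: Dict[str, int]) -> float:
    """Hypothetical scoring function, used only to make the sketch runnable."""
    return abs(params["p"] - 3) + abs(params["d"] - 1) + abs(params["q"] - 2) + 5.0


best_params, best_score = grid_search(arimax_space, toy_validation_rmse)
print(n_candidates, best_params, best_score)
```

Since one model is tuned per component and per value of K, the total number of fits is the number of candidate combinations multiplied by 134 components and three values of K, which gives a sense of the computational cost of the procedure.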
Lastly, the number of macroeconomic features reporting the lowest RMSE on the validation set for the majority of components was considered the best number of macroeconomic features to adopt for the final computation of the response metrics on the test set. According to the preliminary results reported in Figure 2a,b, the ARIMAX model reported better results when only one macroeconomic variable was considered. On the other hand, 20 features were identified as the optimal number of macroeconomic features for the LSTM models.
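The selection rule for K can be sketched as a vote across components: each component votes for the K that gives it the lowest validation RMSE, and the K with the most votes is retained. The function name and the toy scores below are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Tuple


def best_k_by_vote(val_rmse: Dict[str, Dict[int, float]]) -> Tuple[int, Counter]:
    """Pick the K that wins (lowest validation RMSE) for the most components."""
    votes = Counter(min(scores, key=scores.get) for scores in val_rmse.values())
    best_k = votes.most_common(1)[0][0]
    return best_k, votes


# Toy validation RMSE values per component and per K (hypothetical numbers).
toy_scores = {
    "component_A": {1: 5.2, 20: 4.1, 30: 4.6},
    "component_B": {1: 3.9, 20: 3.0, 30: 3.3},
    "component_C": {1: 6.0, 20: 6.4, 30: 6.8},
}
best_k, votes = best_k_by_vote(toy_scores)
print(best_k, dict(votes))
```

A single K is thus shared by all components of a given model type, even though some individual components (component_C in the toy data) would have preferred a different value.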
The selected features and hyperparameters computed in the previous step were thus implemented to retrain multiple models on the union of the training and validation sets, and the response metrics described in Section Response Variables were computed for each component separately over their respective test sets for each of the four identified experimental conditions.

4. Results

This section presents the results related to the experiments described in Section 3.3.2. First, this section reports the prediction accuracy that can be achieved when predicting supplier delivery delays with the proposed approach and with other benchmarks. Second, the approach that reported the best results for the investigated components is identified. Lastly, the relative weights that the data management and model learning stages assume in producing the overall performance of the proposed approach are investigated.

4.1. Response Variables Distribution

The boxplot shown in Figure 3 reports the distribution of the values that the three monitored response metrics (SMAPE, RMSE, MAE) assumed in each of the four considered experimental conditions. The chart thus represents the error distribution reported by the four predictive approaches when estimating the supplier delivery performance of each of the 134 examined components. The box for each group spans the interquartile range, with the bottom and top boundaries corresponding to the first quartile and third quartile, respectively. A line inside each box represents the median.
As shown in the chart, the experimental condition corresponding to the proposed approach, identified as (MULTIVARIATE, LSTM), reported a median error of 53% in terms of SMAPE, 5.5 days in terms of RMSE, and 4.1 days in terms of MAE. The proposed approach thus has a lower median error than the benchmarks. However, for the central 50% of the forecasts provided by the proposed approach, the prediction errors range from 25% to 88% in terms of SMAPE and from 2.8 days to 6.9 days in terms of MAE. Furthermore, SMAPE errors of up to 161% can occur, although these peaks are lower than those of the benchmarks.
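The quantities read from a boxplot (first quartile, median, third quartile, and interquartile range) can be computed directly from the per-component errors. The sketch below uses the standard library with linear interpolation between data points (`method="inclusive"`); this choice of quartile convention and the toy values are assumptions, since the text does not state which convention was used for Figure 3.

```python
from statistics import quantiles
from typing import Dict, Sequence


def box_summary(errors: Sequence[float]) -> Dict[str, float]:
    """Quartiles and interquartile range of a collection of error values."""
    q1, q2, q3 = quantiles(errors, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}


# Toy SMAPE values in percent for five components (hypothetical numbers).
toy_errors = [25.0, 40.0, 53.0, 60.0, 88.0]
summary = box_summary(toy_errors)
print(summary)
```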

4.2. Best Experimental Condition

Figure 4 shows the percentage of times that an experimental condition reported the best (lowest) error over the 134 generated forecasts. The plot thus reveals which experimental condition had the best forecasting results in predicting supplier delivery delay performance for most of the considered components.
In line with the results reported in Figure 3, Figure 4 highlights the fact that the experimental condition representing the proposed approach (MULTIVARIATE, LSTM) was best for most of the considered components. However, the proposed approach was not able to generate the best performance for all the considered components. Indeed, the proposed approach was best for only 62% of the considered components in terms of SMAPE and for only 50% and 41% of the examined components in terms of RMSE and MAE, respectively.

4.3. Analysis of Factors’ Relative Importance

Figure 5 reports the importance of the two investigated factors in generating the results reported in Figure 4. The analysis thus provides insight into which of the two building blocks composing the proposed approach (the data management block and the model learning block) would most affect the final results if changed. More specifically, the results reported in Figure 4 were provided as input to a random forest regression model, and the importance of each factor on the results was estimated according to the methodology described in [45].
The results show that the data management block was more important in generating the final results when considering the RMSE and MAE metrics, with the data management stage accounting, respectively, for 85% and 73% of the final results. In contrast, the model learning block was more important when considering the results related to the SMAPE error, in which the importance of the data management block was only 39%.

5. Discussion

The results reported in Figure 3 highlight the overall capability of the proposed approach to provide good predictions for most of the considered components. The results of [29] were similar, highlighting the competitiveness of the proposed approach and the possibility of estimating delivery delays based on regression algorithms in the automotive sector. Furthermore, as shown in Figure 4, the joint adoption of both macroeconomic variables in the data management phase and an LSTM model in the model learning phase (i.e., the proposed approach) outperformed the other benchmarks for most of the forecasts related to the considered components. These results thus support the literature that reports the benefits of adopting deep learning models in other sectors and for other types of forecasts. In addition, they align with the need expressed by [15,16] to use external variables to predict suppliers’ delivery delays. Lastly, the evidence shown in Figure 5 supports the suggestion made by [22] to consider the whole ML lifecycle when developing new approaches based on this technology.
Overall, these results provide evidence of the effectiveness of the proposed approach in solving the problem formulated in Section 3.1 and its capability to cover the research gap reported in Section 2.4. First, unlike other approaches [14,15,16,18], the proposed approach can make predictions about delivery risks based on regression algorithms. Second, unlike the work proposed by [29], it allows for the consideration of macroeconomic indicators in predictions. Third, the proposed approach considers the autocorrelated nature of delivery delays. Lastly, the need to potentially consider macroeconomic indicators related to multiple countries and sectors for each component has been met. Indeed, as shown in Figure 2b, more than 20 different macroeconomic variables have been considered for each component.
The outcome of this experiment has several theoretical and managerial implications.
From the theoretical point of view, the results support two major theories connected to the field of SC risk management [46]. The higher performance of the proposed approach, composed of a data management stage that considers macroeconomic indicators and a model learning stage based on a DL model, provides support for the information processing theory and the high-reliability theory. According to the former, organizations, as information processing systems, must enhance their capabilities by gathering and processing information from the environment to mitigate SC risk [47]. Meanwhile, according to the latter, organizations must accept complexity and avoid simple explanations for problems [48].
The results also have several managerial implications. First, according to Figure 5, managers should prioritize investment, time, and effort in the data management phase rather than in the model learning phase. Indeed, for the RMSE and MAE metrics, this is the phase that most affects the results and is thus likely to offer the highest return on investment. Moreover, two strategies should be preferred when considering the adoption of macroeconomic indicators in this phase. First, predictive models should be based on deep learning models rather than traditional statistical ones. According to Figure 2, deep learning models have been shown to lead to higher predictive accuracy. Second, contrary to the approach proposed in [18], macroeconomic indicators related to the specific sector or country of the supplier under investigation should not be the only indicators considered. Indeed, according to Figure 2, deep learning models reported the best results when 20 features were provided as input. This means that to increase predictive accuracy, models should be able to observe what happens in other sectors and countries. Considering sectors and countries other than those related to a specific supplier can indeed be seen as a form of implicitly monitoring upstream levels of the supply chain. Lastly, according to Figure 3, managers should take into consideration the fact that, even if the proposed approach can reach good results for the majority of components, there remain cases in which predicting delivery risks is difficult. The approach should therefore be complemented by other risk management strategies, such as prepositioned inventories or reactive mitigation strategies, to deal with these situations.

6. Conclusions

The effective design of predictive approaches that anticipate supplier delivery risks represents a fundamental step to building a more resilient supply chain.
This paper’s contribution is twofold. First, a new deep learning approach was proposed that, for the first time, predicts supplier delivery risks in a regression manner while simultaneously considering autocorrelation in the data and macroeconomic indicators as additional input. Second, the predictive capability of the proposed approach was investigated empirically in a real automotive case study and compared against several benchmarks for its main building blocks.
In particular, the predictive accuracy that the proposed approach can reach was investigated first. The results highlight the overall capability of the proposed approach to provide good predictions of delivery delays with errors that, for 50% of the considered components, span from 25% to 88% in terms of SMAPE and from 2.8 days to 6.9 days in terms of MAE.
Then, the predictive accuracy advantages that the proposed deep learning approach can obtain over traditional statistical models like ARIMA were explored. The results show that the proposed approach outperformed the other benchmarks in 41% to 62% of instances, depending on the metric considered.
Lastly, the predictive accuracy advantages that the integration of macroeconomic indicators can bring compared to models built without these variables were analyzed. The experiments showed that macroeconomic variables substantially affect the overall predictive performance, with an impact of up to 85% on the final result.
However, the results reported in the study are subject to some limitations. In particular, given the case-based research methodology adopted, the results cannot be easily extended to other sectors. Furthermore, the results cannot be extended to components for which fewer than 30 historical data points can be collected.
Future research can thus be directed at testing the proposed approach on multiple case studies that can also consider more components. Furthermore, investigating hybrid approaches that combine local models built independently for each component with global models built for different groups of components could be another interesting research direction when a different amount of historical data is available for each component. Lastly, integrating delivery delay predictions in typical SC risk-management problems such as supplier selection and order allocations, inventory management, or SC network design, and understanding how predictions’ short- and long-term accuracy impacts these decisions can be promising research areas.

Author Contributions

Conceptualization, M.G. and M.B.; methodology, M.G. and F.C.; software, M.G. and L.C.; formal analysis, M.G., F.C. and L.C.; investigation, M.G., F.C. and L.C.; resources, M.G., F.C. and L.C.; data curation, M.G., F.C. and L.C.; writing – original draft preparation, M.G., F.C. and L.C.; writing – review and editing, M.G., F.C., L.C. and M.B.; visualization, M.G., F.C. and L.C.; supervision, M.B.; project administration, M.B.; funding acquisition, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This study was carried out within the MICS (Made in Italy—Circular and Sustainable) Extended Partnership and received funding from the European Union Next-Generation EU (PIANO NAZIONALE DI RIPRESA E RESILIENZA (PNRR)—MISSIONE 4 COMPONENTE 2, INVESTIMENTO 1.3—D.D. 1551.11-10-2022, PE00000004). This manuscript reflects only the authors’ views and opinions; neither the European Union nor the European Commission can be considered responsible for them.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy issues.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Knemeyer, A.M.; Zinn, W.; Eroglu, C. Proactive planning for catastrophic events in supply chains. J. Oper. Manag. 2009, 27, 141–153.
  2. Grötsch, V.M.; Blome, C.; Schleper, M.C. Antecedents of proactive supply chain risk management – A contingency theory perspective. Int. J. Prod. Res. 2013, 51, 2842–2867.
  3. Chowdhury, P.; Paul, S.K.; Kaisar, S.; Moktadir, M.A. COVID-19 pandemic related supply chain studies: A systematic review. Transp. Res. Part E Logist. Transp. Rev. 2021, 148, 102271.
  4. Hobbs, J.E. Food supply chains during the COVID-19 pandemic. Can. J. Agric. Econ./Rev. Can. D’agroeconomie 2020, 68, 171–176.
  5. Ivanov, D.; Dolgui, A. The shortage economy and its implications for supply chain and operations management. Int. J. Prod. Res. 2022, 60, 7141–7154.
  6. Saglam, Y.C.; Çankaya, S.Y.; Sezen, B. Proactive risk mitigation strategies and supply chain risk management performance: An empirical analysis for manufacturing firms in Turkey. J. Manuf. Technol. Manag. 2020, 32, 1224–1244.
  7. Baryannis, G.; Validi, S.; Dani, S.; Antoniou, G. Supply chain risk management and artificial intelligence: State of the art and future research directions. Int. J. Prod. Res. 2019, 57, 2179–2202.
  8. Ganesh, A.D.; Kalpana, P. Future of artificial intelligence and its influence on supply chain risk management – A systematic review. Comput. Ind. Eng. 2022, 169, 108206.
  9. Yang, M.; Lim, M.K.; Qu, Y.; Ni, D.; Xiao, Z. Supply chain risk management with machine learning technology: A literature review and future research directions. Comput. Ind. Eng. 2023, 175, 108859.
  10. Akbari, M.; Do, T.N.A. A systematic review of machine learning in logistics and supply chain management: Current trends and future directions. Benchmarking Int. J. 2021, 28, 2977–3005.
  11. Spieske, A.; Birkel, H. Improving supply chain resilience through industry 4.0: A systematic literature review under the impressions of the COVID-19 pandemic. Comput. Ind. Eng. 2021, 158, 107452.
  12. vom Brocke, J.; Hevner, A.; Maedche, A. Introduction to Design Science Research. In Design Science Research. Cases; Springer: Cham, Switzerland, 2020; pp. 1–13.
  13. Niemi, T.; Hameri, A.P.; Kolesnyk, P.; Appelqvist, P. What is the value of delivering on time? J. Adv. Manag. Res. 2020, 17, 473–503.
  14. Cavalcante, I.M.; Frazzon, E.M.; Forcellini, F.A.; Ivanov, D. A supervised machine learning approach to data-driven simulation of resilient supplier selection in digital manufacturing. Int. J. Inf. Manag. 2019, 49, 86–97.
  15. Baryannis, G.; Dani, S.; Antoniou, G. Predicting supply chain risks using machine learning: The trade-off between performance and interpretability. Future Gener. Comput. Syst. 2019, 101, 993–1004.
  16. Brintrup, A.; Pak, J.; Ratiney, D.; Pearce, T.; Wichmann, P.; Woodall, P.; McFarlane, D. Supply chain data analytics for predicting supplier disruptions: A case study in complex asset manufacturing. Int. J. Prod. Res. 2019, 58, 3330–3341.
  17. Zheng, G.; Kong, L.; Brintrup, A. Federated machine learning for privacy preserving, collective supply chain risk prediction. Int. J. Prod. Res. 2023, 61, 8115–8132.
  18. Bodendorf, F.; Sauter, M.; Franke, J. A mixed methods approach to analyze and predict supply disruptions by combining causal inference and deep learning. Int. J. Prod. Econ. 2023, 256, 108708.
  19. Sagaert, Y.R.; Aghezzaf, E.H.; Kourentzes, N.; Desmet, B. Tactical sales forecasting using a very large set of macroeconomic indicators. Eur. J. Oper. Res. 2018, 264, 558–569.
  20. Rožanec, J.M.; Kažič, B.; Škrjanc, M.; Fortuna, B.; Mladenić, D. Automotive OEM Demand Forecasting: A Comparative Study of Forecasting Algorithms and Strategies. Appl. Sci. 2021, 11, 6787.
  21. Yasir, M.; Ansari, Y.; Latif, K.; Maqsood, H.; Habib, A.; Moon, J.; Rho, S. Machine learning–assisted efficient demand forecasting using endogenous and exogenous indicators for the textile industry. Int. J. Logist. Res. Appl. 2022, 1–20.
  22. Ashmore, R.; Calinescu, R.; Paterson, C. Assuring the Machine Learning Lifecycle: Desiderata, Methods, and Challenges. ACM Comput. Surv. (CSUR) 2019, 54, 1–39. Available online: http://arxiv.org/abs/1905.04223 (accessed on 30 April 2024).
  23. Mutlag, W.K.; Ali, S.K.; Aydam, Z.M.; Taher, B.H. Feature Extraction Methods: A Review. J. Phys. Conf. Ser. 2020, 1591, 012028.
  24. Kumar, V. Feature Selection: A literature Review. Smart Comput. Rev. 2014, 4, 1632–1653.
  25. Montero-Manso, P.; Hyndman, R.J. Principles and algorithms for forecasting groups of time series: Locality and globality. Int. J. Forecast. 2021, 37, 1632–1653.
  26. Sagaert, Y.R.; Kourentzes, N.; De Vuyst, S.; Aghezzaf, E.H.; Desmet, B. Incorporating macroeconomic leading indicators in tactical capacity planning. Int. J. Prod. Econ. 2019, 209, 12–19.
  27. Verstraete, G.; Aghezzaf, E.H.; Desmet, B. A leading macroeconomic indicators’ based framework to automatically generate tactical sales forecasts. Comput. Ind. Eng. 2020, 139, 106169.
  28. Wang, C.H. Considering economic indicators and dynamic channel interactions to conduct sales forecasting for retail sectors. Comput. Ind. Eng. 2022, 165, 107965.
  29. Steinberg, F.; Burggräf, P.; Wagner, J.; Heinbach, B.; Saßmannshausen, T.; Brintrup, A. A novel machine learning model for predicting late supplier deliveries of low-volume-high-variety products with application in a German machinery industry. Supply Chain Anal. 2023, 1, 100003.
  30. Gabellini, M.; Calabrese, F.; Civolani, L.; Regattieri, A.; Mora, C. A Data-Driven Approach to Predict Supply Chain Risk Due to Suppliers’ Partial Shipments. In Smart Innovation, Systems and Technologies; Scholz, S.G., Howlett, R.J., Setchi, R., Eds.; Springer Science and Business Media Deutschland GmbH: Berlin, Germany, 2024; pp. 227–237.
  31. Adineh, A.H.; Narimani, Z.; Satapathy, S.C. Importance of data preprocessing in time series prediction using SARIMA: A case study. Int. J. Knowl.-Based Intell. Eng. Syst. 2021, 24, 331–342.
  32. Mauksch, S.; von der Gracht, H.A.; Gordon, T.J. Who is an expert for foresight? A review of identification methods. Technol. Forecast. Soc. Change 2020, 154, 119982.
  33. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. Available online: http://jmlr.org/papers/v12/pedregosa11a.html (accessed on 30 April 2024).
  34. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
  35. Gabellini, M.; Civolani, L.; Regattieri, A.; Calabrese, F. A Data Model for Predictive Supply Chain Risk Management. In Lecture Notes in Mechanical Engineering; Springer: Cham, Switzerland, 2023; pp. 365–372.
  36. Gabellini, M.; Calabrese, F.; Regattieri, A.; Ferrari, E. Multivariate Multi-Output LSTM for Time Series Forecasting with Intermittent Demand Patterns. In Proceedings of the Summer School Francesco Turco, Otranto, Italy, 11–13 September 2024; AIDI – Italian Association of Industrial Operations Professors: Lazio, Italy, 2022. Available online: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85176726507&partnerID=40&md5=57a91bccffa761dad0e449d27de71820 (accessed on 30 April 2024).
  37. Pacella, M.; Papadia, G. Evaluation of deep learning with long short-term memory networks for time series forecasting in supply chain management. In Procedia CIRP; Elsevier B.V.: Amsterdam, The Netherlands, 2021; pp. 604–609.
  38. Gonçalves, J.N.C.; Cortez, P.; Carvalho, M.S.; Frazão, N.M. A multivariate approach for multi-step demand forecasting in assembly industries: Empirical evidence from an automotive supply chain. Decis. Support Syst. 2021, 142, 113452.
  39. Greff, K.; Srivastava, R.K.; Koutnik, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A Search Space Odyssey. Trans. Neural Netw. Learn. Syst. 2017, 28, 2222–2232.
  40. Gerring, J. What is a case study and what is it good for? Am. Political Sci. Rev. 2004, 98, 341–354.
  41. Helmold, M. New Work in the Automotive Industry. In New Work, Transformational and Virtual Leadership: Lessons from COVID-19 and Other Crises; Springer International Publishing: Cham, Switzerland, 2021; pp. 157–169.
  42. Nassis, E.; Gruffi, L. Factorial design: Design, measures, classic example. In Translational Sports Medicine; Academic Press: Cambridge, MA, USA, 2023; pp. 269–274.
  43. Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Hoboken, NJ, USA, 2015.
  44. Chollet, F.; et al. Keras. 2015. Available online: https://github.com/fchollet/keras (accessed on 30 April 2024).
  45. Hutter, F.; Hoos, H.; Leyton-Brown, K. An Efficient Approach for Assessing Hyperparameter Importance. In Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014; Xing, E.P., Jebara, T., Eds.; Proceedings of Machine Learning Research, PMLR: London, UK, 2014; Volume 32, pp. 754–762. Available online: https://proceedings.mlr.press/v32/hutter14.html (accessed on 30 April 2024).
  46. Fan, Y.; Stevenson, M. A review of supply chain risk management: Definition, theory, and research agenda. Int. J. Phys. Distrib. Logist. Manag. 2018, 48, 205–230.
  47. Daft, R.L.; Lengel, R.H.; Trevino, L.K. Message Equivocality, Media Selection, and Manager Performance: Implications for Information Systems. MIS Q. 1987, 11, 355–366. Available online: http://www.jstor.org/stable/248682 (accessed on 30 April 2024).
  48. Rijpma, J.A. Complexity, Tight-Coupling and Reliability: Connecting Normal Accidents Theory and High Reliability Theory. J. Contingencies Crisis Manag. 1997, 5, 15–23.
Figure 1. Proposed approach.
Figure 2. (a) Evaluation of the best number K of features to select for the ARIMAX model, showing the percentage of components for which a specific K yields the best results in different accuracy metrics; (b) Evaluation of the best number K of features to select for the LSTM model, showing the percentage of components for which a specific K yields the best results in different accuracy metrics.
Figure 3. Distribution of the monitored response metrics values over the investigated experimental conditions.
Figure 4. Percentage of forecasts for which each predictive approach reported the lowest error.
Figure 5. Relative importance of the data management and model learning blocks on the results reported in different response metrics.
Table 2. Case study data collection.
| Information | Value/Source |
| --- | --- |
| Company sector | Automotive |
| Company collected data | Historical days of delay or advance in the delivery of a specific component |
| Data collection period | January 2021–December 2022 |
| Considered suppliers | 24 |
| Considered components | 134 |
| Macroeconomic data source | EUROSTAT |
| Macroeconomic collected variables | Inflation rate of each European country; producer prices in the industry of different economic activities in the European Union for each European country; production in the industry index of different economic activities in the European Union for each European country; production in the service index of different economic activities in the European Union for each European country |
Table 3. Summary statistics of supplier delivery performance data.
Supplier | Number of Components | Minimum Value | Mean Value | Standard Deviation | Maximum Value
19−630.710.847
21630.712.862
31−424.219.357
42−711.29.040
51−920.911.550
61−228.513.761
71−54.86.028
81−247.514.945
92−1714.115.160
101−24−4.78.825
1112−34−0.37.635
1218−51−1.09.251
131−1018.814.354
141−390.113.236
153−31−3.28.115
161−690.728.346
172−197.09.036
181−236.48.030
1924−411.18.830
203−514.711.038
211−5012.912.639
2243−450.37.234
233−56−2.619.318
241−621.814.044
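As a rough illustration of how per-supplier statistics like those in Table 3 could be derived from raw delivery records, the sketch below aggregates delays with Python's standard library. The records, the supplier labels, and the function name `summarize_by_supplier` are hypothetical and are not the case-study data. The use of the sample standard deviation is also an assumption, since the source does not state which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_supplier(records):
    """Aggregate (supplier_id, component_id, delay_days) records per supplier.

    Negative delay values represent early deliveries.
    """
    delays = defaultdict(list)
    components = defaultdict(set)
    for supplier, component, delay in records:
        delays[supplier].append(delay)
        components[supplier].add(component)

    summary = {}
    for supplier, values in delays.items():
        summary[supplier] = {
            "components": len(components[supplier]),
            "min": min(values),
            "mean": round(mean(values), 1),
            # Sample standard deviation (assumption); undefined for one value.
            "std": round(stdev(values), 1) if len(values) > 1 else 0.0,
            "max": max(values),
        }
    return summary


# Hypothetical records, not taken from the case study.
records = [
    ("A", "c1", -2),
    ("A", "c1", 4),
    ("A", "c2", 10),
    ("B", "c3", 0),
    ("B", "c3", 6),
]
summary = summarize_by_supplier(records)
```

Pooling all components of a supplier into one series, as done here, is only one possible reading of how the table was built.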
Table 4. Research space of the LSTM hyperparameters.
| Hyperparameters | Research Space |
|---|---|
| Number of layers | 1–4 |
| Learning rate | 0.0001–0.01 |
| Max number of epochs | 1000 |
| Early stopping threshold (σ) | 15 epochs with no improvement in the RMSE of the validation dataset |
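The search space in Table 4 can be encoded in a few lines. The following is a minimal sketch, not the authors' implementation: the names `SEARCH_SPACE`, `sample_config`, and `stopping_epoch` are hypothetical, random sampling with a log-uniform learning rate is an assumption (the source does not say how the space was explored), and the early-stopping helper works on a plain list of validation RMSE values rather than a real training loop.

```python
import math
import random

# Ranges taken from Table 4.
SEARCH_SPACE = {
    "num_layers": (1, 4),
    "learning_rate": (1e-4, 1e-2),
}
MAX_EPOCHS = 1000
PATIENCE = 15  # epochs without validation RMSE improvement


def sample_config(rng):
    """Draw one candidate configuration (random search is an assumption)."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    return {
        "num_layers": rng.randint(*SEARCH_SPACE["num_layers"]),
        # Log-uniform draw, an assumption commonly made for learning rates.
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
    }


def stopping_epoch(val_rmse_per_epoch, patience=PATIENCE, max_epochs=MAX_EPOCHS):
    """Return (last epoch run, best validation RMSE) under early stopping."""
    best = float("inf")
    stale = 0
    history = val_rmse_per_epoch[:max_epochs]
    for epoch, rmse in enumerate(history, start=1):
        if rmse < best:
            best, stale = rmse, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best
    return len(history), best


config = sample_config(random.Random(0))
# Synthetic curve: improves for 3 epochs, then plateaus.
epoch, best = stopping_epoch([5.0, 4.0, 3.0] + [3.5] * 20)
```

With the synthetic curve above, training would stop 15 epochs after the last improvement, well before the 1000-epoch cap.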
Table 5. Research space of the ARIMAX hyperparameters.
| Hyperparameters | Research Space |
|---|---|
| p | 1–10 |
| d | 1–3 |
| q | 1–10 |
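To give a sense of the size of the space in Table 5, the sketch below enumerates every (p, d, q) combination. It assumes an exhaustive grid search, which the source does not confirm. The helper names `arimax_order_grid` and `best_order` are hypothetical, and the scoring function is a toy placeholder standing in for fitting an ARIMAX model and measuring its validation error.

```python
from itertools import product

# Ranges taken from Table 5 (inclusive bounds).
P_RANGE = range(1, 11)
D_RANGE = range(1, 4)
Q_RANGE = range(1, 11)


def arimax_order_grid():
    """List every (p, d, q) order in the search space."""
    return list(product(P_RANGE, D_RANGE, Q_RANGE))


def best_order(score_fn):
    """Return the order with the lowest score.

    score_fn is a placeholder: in practice it would fit a model with the
    given order and return an error metric such as validation RMSE.
    """
    return min(arimax_order_grid(), key=score_fn)


grid = arimax_order_grid()


# Toy score for illustration only; it has no statistical meaning.
def toy_score(order):
    p, d, q = order
    return abs(p - 2) + abs(d - 1) + abs(q - 3)


chosen = best_order(toy_score)
```

The grid contains 10 × 3 × 10 = 300 candidate orders per component, so with 134 components an exhaustive search would require tens of thousands of model fits.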
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Gabellini, M.; Civolani, L.; Calabrese, F.; Bortolini, M. A Deep Learning Approach to Predict Supply Chain Delivery Delay Risk Based on Macroeconomic Indicators: A Case Study in the Automotive Sector. Appl. Sci. 2024, 14, 4688. https://doi.org/10.3390/app14114688
