Article

Root Cause Attribution of Delivery Risks via Causal Discovery with Reinforcement Learning

1
Department of Statistics, Boston University, 1 Silber Way, Boston, MA 02215, USA
2
Department of Integrated System Engineering, Ohio State University, 281 W Lane Ave, Columbus, OH 43210, USA
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Algorithms 2024, 17(11), 498; https://doi.org/10.3390/a17110498
Submission received: 24 August 2024 / Revised: 29 October 2024 / Accepted: 1 November 2024 / Published: 4 November 2024
(This article belongs to the Special Issue Advancements in Causal Discovery Algorithms: Theory and Applications)

Abstract

Managing delivery risks is a critical challenge in modern supply chain management due to the increasing complexity and interdependencies of global supply networks. Existing methods often rely on correlation-based approaches, which fail to uncover the true causes behind delivery delays. This limitation makes it difficult for supply chain managers to identify actionable factors that can mitigate risks effectively. To address these challenges, we propose a novel method that integrates causal discovery with reinforcement learning to identify the root causes of delivery risks. Unlike traditional correlation-based methods, our approach uncovers both the direction and strength of causal relationships between variables, allowing for more accurate identification of the key drivers behind delivery delays. By applying causal strength quantification, we further measure the impact of each factor on delivery performance. Using real-world supply chain data, our results demonstrate that the proposed method reveals hidden causal relationships between factors such as shipping mode, order size, and delivery status. These insights enable supply chain managers to implement more targeted interventions, significantly improving risk mitigation strategies.

1. Introduction

In today’s global economy, supply chain management (SCM) is essential for ensuring the efficient delivery of goods from suppliers to customers. Effective SCM helps businesses boost productivity, enhance customer satisfaction, and stay competitive [1]. For example, Apple Inc. has optimized its supply chain to lower costs and speed up delivery times, contributing to its success in the tech industry [2]. The importance of SCM became even more evident during the COVID-19 pandemic, which caused shortages of critical goods like medical supplies and PPE [3]. Similarly, industries like automotive manufacturing, which use just-in-time (JIT) inventory systems, were heavily impacted by the 2011 Tohoku earthquake, as disrupted supply chains halted global production [4]. These examples highlight how crucial SCM is in reducing disruptions, cutting costs, and ensuring business continuity.
Managing supply chain risks has become increasingly challenging due to the complexity of modern supply chains, which often span multiple countries and involve numerous stakeholders. A key challenge is identifying the root causes of these risks. Traditional methods, which rely on expert judgment and statistical analysis, often fail to handle this complexity effectively [5]. These methods can produce misleading correlations, leading to misguided decisions. For example, delays may be wrongly attributed to specific suppliers when the real cause lies in broader issues such as transportation bottlenecks or production slowdowns [6]. External factors, like geopolitical tensions and natural disasters, can also cause unexpected disruptions. For instance, the U.S.–China trade war forced companies to rethink their sourcing strategies due to global supply chain disturbances [7]. The increasing pressure to meet shorter delivery times, as seen with companies like Amazon, further complicates SCM. Amazon’s same-day delivery service depends on advanced technologies and real-time data analytics, but maintaining this efficiency requires continuous adjustments to inventory and logistics systems [8]. As a result, supply chain managers must balance efficiency with risk management in a fast-changing environment, where unexpected disruptions can have significant financial and reputational consequences.
To address these challenges, we propose a novel approach that integrates causal discovery with reinforcement learning to identify the root causes of delivery risks. Causal discovery uncovers relationships between variables using observational data, providing a clearer understanding of what drives delivery delays [9]. Reinforcement learning enhances this process by refining the causal graph to focus on the most relevant factors [10]. Unlike traditional methods, our model introduces a causal strength quantification mechanism, which measures both the direction and magnitude of relationships. This is essential for understanding how factors like supplier delays and bottlenecks impact decision-making and risk management. Our method also leverages domain-specific knowledge, making it more adaptable to the variability and dependencies common in supply chains.
Alternative methods, such as the PC algorithm and Fast Greedy Search (FGS), have significant limitations. For instance, the PC algorithm assumes no hidden variables and outputs equivalence classes, leading to potential ambiguity [11]. FGS struggles with noisy and dynamic supply chain data [12]. Our approach, which combines reinforcement learning and causal strength quantification, overcomes these limitations by offering more accurate causal insights.
By incorporating domain-specific knowledge, our model understands the directional nature of relationships, such as how product specifications influence demand but not the reverse. This allows the model to continuously refine the causal graph, improving its precision over time. For example, it can identify whether factors like shipping modes, shipment times, or order characteristics directly or indirectly contribute to delivery delays. These insights enable supply chain managers to implement targeted interventions and make more informed decisions, ultimately improving both efficiency and risk management.

Existing Works in Supply Chain on Delivery Risk

To understand the current state of research on delivery risk management, we review existing works that focus on addressing delays in supply chains. This section examines both reactive and predictive approaches, highlighting their limitations in uncovering the root causes of delivery risks.
Most existing studies concentrate on reactive measures to address delivery risks after they occur. For example, some strategies aim to optimize production processes to reduce delays during manufacturing [13,14], while others focus on improving delivery strategies once delays are detected [15,16]. However, these approaches typically overlook the root causes of delivery risks, concentrating instead on short-term solutions.
Several studies have applied machine learning and deep learning techniques to predict delivery risks and improve supply chain operations. Algorithms like random forests, support vector machines (SVMs), and neural networks have been used to analyze historical data and identify patterns associated with delayed shipments [17,18,19,20,21]. While these models effectively predict delays, their black-box nature limits their ability to explain the underlying causes [22], making them less suitable for long-term strategic decisions.
Optimization techniques have also been employed to enhance supply chain resilience. Methods such as adjusting inventory levels, rerouting plans, and rescheduling production in response to predicted delays have shown promise in minimizing disruptions [23,24,25]. However, these techniques focus primarily on mitigating the immediate effects of delivery risks, such as rerouting shipments or increasing inventory buffers. They do not address the root causes of the risks, which may lead to recurring issues [26].
The use of real-time data and IoT technologies has further advanced the dynamic management of delivery risks. IoT-enabled systems offer real-time visibility into the supply chain, allowing managers to react quickly to emerging risks [27,28,29]. However, while these systems improve real-time decision-making, they still fall short in identifying the root causes of delays, limiting their effectiveness in providing long-term solutions [30]. This underscores the need for methods that not only predict delivery risks but also uncover their root causes, enabling supply chain managers to develop more strategic and sustainable risk mitigation strategies.
Table 1 summarizes the existing methods and their respective results in the context of delivery risk management. The cited studies encompass a wide range of approaches, from predictive modeling to IoT-enabled real-time management, highlighting the current emphasis on mitigating delivery risks. Despite the advancements in prediction and short-term interventions, as previously mentioned, a significant gap remains in understanding the root causes of these risks. Addressing these underlying causes is essential for developing long-term strategies that enhance the resilience of supply chains.
By integrating causal discovery with reinforcement learning, our proposed method aims to fill this gap by identifying not just the patterns associated with delays but the true drivers behind them. This approach offers supply chain managers the ability to implement more effective, targeted interventions that go beyond temporary fixes, ultimately ensuring a more sustainable and resilient supply chain system.
Paper Organization. In Section 2, we outline our proposed method for identifying the root causes of delivery risks. In Section 3, we conduct an exploratory data analysis (EDA) to identify key patterns and relationships in the data and analyze the experimental results. In Section 4, we discuss the potential business impact of our work. We conclude in Section 5 and discuss future work.

2. Methodology

The Proposed Method

Building on the gaps identified in existing works, this section introduces a novel framework that integrates causal discovery with reinforcement learning to address delivery risks in supply chains.
The proposed method consists of two main components: (1) causal discovery through reinforcement learning and (2) causal strength quantification using information entropy. The framework is designed to uncover the root causes of delivery risks by identifying the underlying causal relationships between supply chain variables. We adopt the data-generating model proposed by [31,32], where each variable $x_i$ corresponds to a node $i$ in a $d$-node directed acyclic graph (DAG) $G$. The observed value of $x_i$ is modeled as a function of its parent variables in the graph, plus an independent additive noise $n_i$. Specifically,
$$x_i := f_i(x_{\mathrm{pa}(i)}) + n_i, \quad i = 1, 2, \ldots, d, \tag{1}$$
where $x_{\mathrm{pa}(i)}$ denotes the set of parent variables $x_j$ that have directed edges toward $x_i$, and the noise terms $n_i$ are assumed to be jointly independent. We also assume causal minimality, meaning that each function $f_i$ is non-constant in any of its arguments, as explained by [32].
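The additive noise model above can be illustrated with a short sampling routine. This is a minimal sketch, not the authors' implementation: the node names, the example functions, and the Gaussian noise are assumptions chosen only for illustration.

```python
import random

def sample_anm(parents, functions, noise_scale=1.0, seed=0):
    """Draw one observation from x_i := f_i(x_pa(i)) + n_i.

    parents:   dict mapping each node to the list of its parent nodes
    functions: dict mapping each node to a callable that takes the list
               of parent values (empty for root nodes)
    Noise is Gaussian here; the model only requires independent noise.
    """
    rng = random.Random(seed)
    values = {}
    remaining = set(parents)
    while remaining:
        # Nodes whose parents are all evaluated can be sampled next.
        ready = [i for i in remaining if all(p in values for p in parents[i])]
        if not ready:
            raise ValueError("graph contains a cycle, so it is not a DAG")
        for i in sorted(ready):
            noise = rng.gauss(0.0, noise_scale)
            values[i] = functions[i]([values[p] for p in parents[i]]) + noise
            remaining.discard(i)
    return values

# Hypothetical three-node chain: mode -> days -> late
parents = {"mode": [], "days": ["mode"], "late": ["days"]}
functions = {
    "mode": lambda pa: 2.0,
    "days": lambda pa: 2.0 * pa[0],
    "late": lambda pa: pa[0] + 1.0,
}
sample = sample_anm(parents, functions, noise_scale=0.1, seed=1)
```

Each node is evaluated only after its parents, which is possible exactly when the graph is acyclic.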
As shown in Figure 1, our model consists of two components: causal discovery via reinforcement learning and the calculation of inverse information entropy (IIE) causal strength [33]. The model processes the observed dataset $X = \{x_1, x_2, \ldots, x_i\}$, where $x_i$ represents the dimensions of the input observations, and outputs a causal structure comprising a causal graph $G$ and causal strengths. Figure 2 illustrates the causal structure $O = \{G, S\}$ simulated from the observed data, where $S$ denotes the strength of the causal relationships in graph $G$.
Our encoder–decoder approach builds upon the work of [10]. The model employs an encoder–decoder framework to generate a directed graph. The encoder, following the architecture introduced by [34], consists of six identical layers, each with two sublayers. The first sublayer implements a multi-headed self-attention mechanism, while the second is a fully connected feedforward network with positional encoding. These sublayers are connected using residual connections, as described by [35]. The output of each sublayer is normalized as follows:
$$\mathrm{LayerNorm}(x + \mathrm{Sublayer}(x)), \tag{2}$$
where $\mathrm{Sublayer}(x)$ represents the function applied in the sublayer. To maintain consistency, all sublayer and embedding outputs in the model have the same dimension, $d_{\text{model}} = 512$.
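The residual connection followed by layer normalization can be sketched in a few lines. This is a simplified illustration that omits the learnable gain and bias usually attached to layer normalization; the example sublayer is a placeholder, not the attention or feedforward sublayer of the actual encoder.

```python
import math

def layer_norm(vector, eps=1e-6):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]

def residual_block(x, sublayer):
    """Compute LayerNorm(x + Sublayer(x)) for one input vector x."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

# Placeholder sublayer: halves every component (illustration only).
out = residual_block([1.0, 2.0, 3.0, 4.0], lambda v: [0.5 * a for a in v])
```

Because the sublayer output is added to its input before normalization, the sublayer must return a vector of the same dimension as `x`, which is why all sublayers share one output dimension.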
For decoding, we use a single-layer decoder, defined as:
$$g(W_1, W_2, u) = u^{\top} \tanh(W_1\, \mathrm{enc}_i + W_2\, \mathrm{enc}_j), \tag{3}$$
where $W_1, W_2 \in \mathbb{R}^{d_h \times d_n}$ and $u \in \mathbb{R}^{d_h \times 1}$ are trainable parameters. Here, $d_h$ is the hidden dimension of the decoder, $d_n$ is the dimension of the encoder output, and $\mathrm{enc}_i$ and $\mathrm{enc}_j$ are the encoder outputs for variables $i$ and $j$. In this equation, the decoder processes the transformed representation $z = \mathrm{Sublayer}(x)$ and attempts to map it to the observed relationships in the data. The sublayer function $\mathrm{Sublayer}(x)$ acts as a feature extractor: it filters out noise from the input $x$, which often contains high-dimensional, redundant data, and produces a representation $z$ that retains the most relevant information. The decoder then uses these features to predict the strength and direction of causal relationships and to reconstruct the causal graph.
To generate the adjacency matrix, each element is passed through a sigmoid function and then sampled based on a Bernoulli distribution with probability σ ( g ) :
$$M \sim \mathrm{Ber}(\sigma(g)). \tag{4}$$
Here, the Tanh (hyperbolic tangent) function in Equation (3) and the sigmoid function in Equation (4) are combined to balance transformation values and interpretability. The Tanh function maps input values to a range of $[-1, 1]$, ensuring that both positive and negative influences are captured symmetrically, which is important when dealing with data involving variations in both directions. However, for outputs like probabilities or causal relationship strengths, values need to be constrained within the range of $[0, 1]$. This is where the sigmoid function comes into play. The sigmoid function compresses the Tanh-generated values into this range, providing a smooth and interpretable output that represents probabilities or relative strengths. By using Tanh to handle a wide range of input values and sigmoid to transform these into a usable probability space, our model ensures that the results are both balanced and interpretable for causal discovery.
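The decoder score and the Bernoulli sampling step can be sketched as follows. This is a toy illustration under stated assumptions: the weights and encoder outputs are small hand-picked lists rather than trained parameters, and the helper names (`edge_logit`, `sample_adjacency`) are hypothetical.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def edge_logit(w1, w2, u, enc_i, enc_j):
    """Score u^T tanh(W1 enc_i + W2 enc_j) for the candidate edge i -> j."""
    pre = [a + b for a, b in zip(matvec(w1, enc_i), matvec(w2, enc_j))]
    return sum(ui * math.tanh(p) for ui, p in zip(u, pre))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_adjacency(enc, w1, w2, u, rng):
    """Sample each off-diagonal entry from Ber(sigmoid(g)); no self-loops."""
    d = len(enc)
    adj = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            if i == j:
                continue  # diagonal stays zero to exclude self-loops
            p = sigmoid(edge_logit(w1, w2, u, enc[i], enc[j]))
            adj[i][j] = 1 if rng.random() < p else 0
    return adj

# Toy values (assumptions): 3 variables, d_n = 2, d_h = 2
enc = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
w1 = [[0.5, -0.2], [0.1, 0.3]]
w2 = [[0.2, 0.1], [-0.3, 0.4]]
u = [1.0, -1.0]
adj = sample_adjacency(enc, w1, w2, u, random.Random(0))
```

Since each Tanh output lies in $[-1, 1]$, the score is bounded by the sum of the absolute entries of `u`, so the sigmoid never receives an extreme input in this toy setting.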
To prevent self-loops, the diagonal elements $(i, i)$ of the adjacency matrix are set to zero. By iteratively inputting the encoder outputs of all variables, a complete directed graph’s adjacency matrix is obtained. The scoring function uses the Bayesian Information Criterion (BIC), which is decomposable and allows for adjusting the penalty term. The BIC score for a graph $G$ is given by:
$$S_{\mathrm{BIC}}(G) = -2 \log p(X; \hat{L}; G) + d_L \log m, \tag{5}$$
where $\hat{L}$ is the maximum likelihood estimate, $d_L$ is the number of parameters in $L$, and $m$ is the size of the dataset $X$. To ensure the generated graph is a directed acyclic graph (DAG), the reward combines the BIC score with two acyclicity penalty terms:
$$\text{reward} = -\left[ S_{\mathrm{BIC}}(G) + \lambda_1 \mathbf{I}(G \notin \mathrm{DAGs}) + \lambda_2 h(A) \right], \tag{6}$$
where $\mathbf{I}(\cdot)$ is an indicator function, $\lambda_1, \lambda_2 \geq 0$ are hyperparameters, $A \in \{0, 1\}^{d \times d}$ is the binary adjacency matrix of $G$, and $h(A)$ is a function introduced by [36], which is non-negative and can be small for some cyclic graphs. A directed graph $G$ with binary adjacency matrix $A$ is acyclic if and only if:
$$h(A) = \operatorname{trace}\left(e^{A}\right) - d = 0, \tag{7}$$
where $e^{A}$ is the matrix exponential of $A$. Larger values of $\lambda_1$ and $\lambda_2$ increase the likelihood that a high-reward graph is acyclic. Our goal is to maximize the reward across all possible directed graphs, which is equivalent to solving:
$$\min_{G} \left[ S_{\mathrm{BIC}}(G) + \lambda_1 \mathbf{I}(G \notin \mathrm{DAGs}) + \lambda_2 h(A) \right]. \tag{8}$$
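The acyclicity measure and the penalized reward can be checked numerically on small graphs. The sketch below is illustrative only: it approximates the trace of the matrix exponential with a truncated power series (adequate for small binary matrices), takes the BIC score as a given number rather than computing it from data, and treats a graph as cyclic when the measure exceeds a small tolerance. The function names and the tolerance are assumptions.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def acyclicity(adj, terms=30):
    """h(A) = trace(exp(A)) - d, using trace(exp(A)) = sum_k trace(A^k) / k!."""
    d = len(adj)
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace_exp = float(d)  # k = 0 term: trace of the identity
    factorial = 1.0
    for k in range(1, terms):
        power = matmul(power, adj)
        factorial *= k
        trace_exp += sum(power[i][i] for i in range(d)) / factorial
    return trace_exp - d

def reward(bic_score, adj, lam1=0.5, lam2=0.1, tol=1e-8):
    """Negative of: BIC score + lam1 * [graph is cyclic] + lam2 * h(A)."""
    h = acyclicity(adj)
    not_dag = 1.0 if h > tol else 0.0
    return -(bic_score + lam1 * not_dag + lam2 * h)

chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0 -> 1 -> 2, acyclic
cycle = [[0, 1], [1, 0]]                   # 0 <-> 1, cyclic
h_chain = acyclicity(chain)
h_cycle = acyclicity(cycle)
```

For the chain, every power of the adjacency matrix has a zero diagonal, so the measure is exactly zero; for the two-node cycle it equals $2\cosh(1) - 2$, and the same BIC score therefore earns a strictly lower reward.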
The expected return during training can be expressed as:
$$J(\varphi \mid s) = \mathbb{E}_{A \sim \pi(\cdot \mid s)} \left\{ -\left[ S_{\mathrm{BIC}}(G) + \lambda_1 \mathbf{I}(G \notin \mathrm{DAGs}) + \lambda_2 h(A) \right] \right\}, \tag{9}$$
where $\pi(\cdot \mid s)$ represents the policy, and $\varphi$ denotes the neural network parameters for graph generation. During training, random samples are drawn from the dataset $X$. The encoder output is fed into the critic, which is a simple two-layer feedforward neural network with a Tanh activation function. The critic minimizes the mean squared error between predicted and actual rewards and penalties and is trained using the Adam optimizer.
Moving on to the inverse information entropy (IIE), based on [33], the causal strength is defined as:
$$T = \frac{1}{S(p_{X_2}) - S(p_{X_1})}, \tag{10}$$
where $S(p_{X_1})$ and $S(p_{X_2})$ represent the information entropies of variables $x_1$ and $x_2$, respectively. For finite datasets, the entropy of the probability distribution for risk factors can be estimated using the entropy estimator, as described by [37,38]:
$$\hat{S}(X) = \psi(n) - \psi(1) + \frac{1}{n-1} \sum_{i=1}^{n-1} \log \left| x_{i+1} - x_i \right|, \tag{11}$$
where $\psi(\cdot)$ is the digamma function, and $n$ is the number of observations in the dataset $X$. Based on Equation (10), the IIE causal strength is computed using raw data. However, because the variables in the raw data may differ in scale and units, the calculated causal strength might deviate from its true value. To account for this, the IIE causal strength for normalized data is given by:
$$T_N = \frac{1}{S(p_{X_{2,N}}) - S(p_{X_{1,N}})}. \tag{12}$$
Finally, we applied a log transformation to the computed causal strengths to normalize their distribution and stabilize variance. This transformation enhances the interpretability of the results, making it easier to compare the relative impact of different factors on delivery risks:
$$\text{Causal Strength} = \log(T_N). \tag{13}$$
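The spacing-based entropy estimator used for the causal strength can be sketched with the standard library alone. The sketch rests on several assumptions that are not stated in the text: values are sorted before taking consecutive gaps, zero gaps caused by ties are skipped to avoid taking the logarithm of zero, and normalization is done by min-max scaling. It uses the identity $\psi(n) - \psi(1) = \sum_{k=1}^{n-1} 1/k$ for integer $n$, so no special-function library is needed.

```python
import math

def min_max(values):
    """Rescale values to [0, 1] (one possible normalization, assumed here)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant variable")
    return [(v - lo) / (hi - lo) for v in values]

def spacing_entropy(values):
    """Estimate entropy as psi(n) - psi(1) + mean log gap of sorted values."""
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    # For integer n, psi(n) - psi(1) is the harmonic number H_{n-1}.
    harmonic = sum(1.0 / k for k in range(1, n))
    # Ties give zero gaps; they are skipped (an assumption of this sketch).
    log_gaps = sum(math.log(b - a) for a, b in zip(xs, xs[1:]) if b > a)
    return harmonic + log_gaps / (n - 1)

example = [0.0, 0.5, 1.0]
entropy = spacing_entropy(example)
entropy_scaled = spacing_entropy([2.0 * v for v in example])
```

Doubling every value shifts the estimate by $\log 2$, which illustrates why entropies computed on raw variables with different scales are not directly comparable and why normalization is applied first.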
Next, we present the algorithm and hyperparameters used in our causal discovery via RL for root cause attribution of late delivery. Algorithm 1 summarizes the proposed approach:
Algorithm 1 Causal Discovery with RL for Late Delivery Risk
1: Input: Data pertaining to late delivery risk.
2: Step 1: Preprocess the observed data and set hyperparameters, including the number of epochs, learning rate, and scoring function.
3: Step 2: Feed the preprocessed data into the encoder, and pass the encoder outputs along with those from the critic to the decoder. The decoder performs calculations using Equation (3) and samples according to Equation (4).
4: Step 3: Rate the generated directed graph using Equation (4), and return rewards and penalties to the critic. Track the maximum reward using Equation (8).
5: Step 4: Check if the preset number of iterations has been reached. If not, return to Step 2; otherwise, output the directed graph with the maximum score.
6: Step 5: From the final directed graph, calculate causal strengths with the log transformation in Equation (13), remove edges with a strength less than 0.1, and output the refined causal graph with the associated strengths.
7: Step 6: Generate an explainable delivery risk report from the output graph, tailored for business stakeholders.
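Steps 5 and 6 of Algorithm 1 amount to thresholding the edge strengths and ranking what remains. A minimal sketch follows; the variable names and strength values are hypothetical and serve only to show the mechanics.

```python
def prune_edges(strengths, threshold=0.1):
    """Keep only edges whose causal strength reaches the threshold (Step 5)."""
    return {edge: s for edge, s in strengths.items() if s >= threshold}

def risk_report(strengths):
    """List edges from strongest to weakest as plain-text lines (Step 6)."""
    ranked = sorted(strengths.items(), key=lambda item: item[1], reverse=True)
    return [f"{src} -> {dst}: causal strength {s:.2f}" for (src, dst), s in ranked]

# Hypothetical strengths, for illustration only
strengths = {
    ("shipping_mode", "late_delivery"): 2.5,
    ("order_quantity", "late_delivery"): 0.04,
    ("customer_segment", "late_delivery"): 0.3,
}
pruned = prune_edges(strengths)
report = risk_report(pruned)
```

Sorting by strength puts the most influential candidate causes at the top of the report, which is the order in which a stakeholder would typically review them.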
As shown in Table 2, the training of our model was performed with a learning rate of 0.001, 500 epochs, and a batch size of 64, chosen through cross-validation to ensure optimal performance. We used the Adam optimizer for its robustness in managing sparse and noisy gradients, while L2 regularization ($\lambda = 0.0001$) and a dropout rate of 0.2 were applied to mitigate overfitting. Additionally, we implemented a custom reward function designed to minimize supply chain risks and set a causal strength threshold of 0.05 to filter out weaker relationships.
The reward function parameters $\lambda_1 = 0.5$ and $\lambda_2 = 0.1$ were selected to balance the trade-off between discovering significant causal relationships and penalizing weaker, less relevant connections. Specifically, $\lambda_1$ adjusts the weight for strong causal rewards, helping the model prioritize meaningful relationships, while $\lambda_2$ introduces a mild penalty to reduce noise in the causal graph without being overly restrictive. These values were fine-tuned through empirical testing to optimize performance for supply chain scenarios.
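For reproducibility, the reported training settings can be collected in a single configuration object. The sketch below only restates the values given in the text; the class and field names are hypothetical and do not correspond to any published code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as reported in the text (field names are illustrative)."""
    learning_rate: float = 0.001
    epochs: int = 500
    batch_size: int = 64
    l2_weight: float = 0.0001        # L2 regularization coefficient
    dropout: float = 0.2
    lambda_1: float = 0.5            # reward-function weight lambda_1
    lambda_2: float = 0.1            # reward-function weight lambda_2
    strength_threshold: float = 0.05 # causal strength cutoff
    optimizer: str = "adam"

config = TrainingConfig()
```

Freezing the dataclass prevents accidental changes to the settings during a run.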
Our approach, which combines causal discovery with reinforcement learning, addresses limitations in traditional, theory-based structural models that often rely on fixed assumptions and expert-driven interpretations. Such models typically focus on predefined relationships and struggle with the high variability of real-world supply chains, where interactions between variables are dynamic and complex [9,26]. By contrast, our data-driven framework adapts to observed data, uncovering causal relationships in real-time and quantifying their strengths. This flexibility allows our method to respond to shifting supply chain conditions, capturing both direct and indirect causal influences on delivery risk—something conventional structural models may overlook.
Our experimental results in the next section further distinguish this approach by quantifying causal strengths, providing actionable insights beyond correlation. For example, while previous studies highlight shipment mode as influential, our model assigns a causal strength, indicating its true impact on late delivery risk. This enables supply chain managers to prioritize interventions based on data-driven causal influence rather than intuition alone [39]. By aligning with recent calls for more adaptable and quantifiable risk assessment tools, our framework not only confirms established risk factors but also provides a more granular, context-sensitive analysis that enhances decision-making in complex logistical environments.
In the next section, we present the experimental results and analysis. We evaluate our model’s performance using real-world supply chain data and compare it against existing methods to demonstrate its effectiveness in identifying the root causes of delivery risks.

3. Experiments

Identifying key factors influencing delivery risk is crucial in supply chain management, and multiple studies have laid the groundwork by focusing on variables that directly impact delivery outcomes. For instance, variables such as shipment mode and delivery status have consistently been found to influence delays and overall efficiency in supply chains. Shipment mode, in particular, determines the speed and reliability of deliveries, with studies showing that faster, premium shipping options reduce the likelihood of delays compared to economy modes [17,24]. Similarly, delivery status reflects real-time shipment progress, which has been linked to late delivery risks, as it indicates the presence or absence of disruptions [18].
Other critical variables commonly examined include order quantity and customer segment. Order quantity can strain logistics if not adequately managed, potentially leading to bottlenecks in shipping [19]. Customer segment analysis provides insights into delivery priority and logistical complexity, as certain segments (e.g., corporate clients) often require stricter delivery timelines and tailored services [28]. Additionally, sales per customer and profit ratio are frequently considered in supply chain studies to determine the financial impact of delayed deliveries and prioritize orders based on revenue potential [23].
Selecting these variables captures a comprehensive set of delivery risk factors identified in prior research (the 16 variables used in our experiment are listed in Table 3 in the next section). Each variable represents a distinct aspect of delivery risk, from logistical decisions (like shipping mode and scheduled days) to customer-related metrics (such as customer segment and sales). This diverse selection, grounded in existing literature, enables a robust analysis of factors that collectively contribute to delivery risks.

3.1. Experimental Data

3.1.1. DataCo Global

We first describe the experimental setup used to evaluate the model. This section outlines the dataset, performance metrics, and baseline models used as benchmarks for comparison. For this analysis, we employed the DataCo Global Supply Chain dataset from Kaggle, which is designed for analyzing various aspects of supply chain management. The dataset contains 180,520 samples and 53 features. Our goal is to identify the root causes of late delivery risks by applying our causal discovery method via reinforcement learning with causal strength calculation.
In the preprocessing stage, we manually filtered out variables deemed less relevant to delivery risk, such as “Order Date” and “Order City”, which had minimal influence on the analysis. This step reduced unnecessary noise and allowed us to concentrate on the most important variables. After this initial filtering, we addressed multicollinearity, which occurs when two or more features are highly correlated, potentially skewing the model by introducing redundant information. To resolve this, we computed the correlation matrix and removed features with correlation coefficients above 0.95, retaining only one representative feature from each correlated group. This step streamlined the model and improved its interpretability by ensuring that each feature contributed unique information.
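The correlation-based filter can be sketched as a greedy pass over the columns. This is one possible reading of the step, not the authors' code: a column is kept only if its absolute Pearson correlation with every previously kept column does not exceed 0.95, so the first column of each correlated group acts as its representative. The column names and values are made up for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(var_a * var_b)

def drop_correlated(columns, threshold=0.95):
    """Return the names of columns kept after removing near-duplicates."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Illustrative data: 'order_total' nearly duplicates 'sales'
columns = {
    "sales": [10.0, 20.0, 30.0, 40.0],
    "order_total": [11.0, 21.0, 29.0, 41.0],
    "shipping_days": [3.0, 1.0, 4.0, 2.0],
}
kept = drop_correlated(columns)
```

Which member of a correlated group survives depends on column order, so in practice the preferred representative should be placed first.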
Next, categorical variables such as “Shipment Mode” and “Customer Segment” were converted into numerical values using integer encoding, as most machine learning algorithms require numerical inputs. After encoding, we addressed missing values. Features with a large proportion of missing data (over 40%) and those irrelevant to the outcome were excluded to improve data quality and reduce bias in the model. Finally, we selected 16 key features based on their relevance to the delivery risk problem, ensuring a balanced and optimized input dataset for modeling. These features, including shipment time and delivery status, were directly related to late deliveries. The remaining features are listed in Table 3, with detailed descriptions provided in Appendix A.
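Integer encoding and the missing-value filter can be sketched as follows. The helper names, the category labels, and the use of `None` to mark missing entries are assumptions made for this illustration.

```python
def integer_encode(values):
    """Map each distinct category to an integer in order of first appearance."""
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping

def drop_sparse_columns(columns, max_missing=0.40):
    """Drop columns whose share of missing entries (None) exceeds max_missing."""
    kept = {}
    for name, values in columns.items():
        missing_share = sum(v is None for v in values) / len(values)
        if missing_share <= max_missing:
            kept[name] = values
    return kept

codes, mapping = integer_encode(["Standard Class", "First Class", "Standard Class"])

table = {
    "mostly_missing": [None, None, None, 1.0, 2.0],  # 60% missing
    "borderline": [None, None, 1.0, 2.0, 3.0],       # exactly 40% missing
}
filtered = drop_sparse_columns(table)
```

Integer codes impose an arbitrary ordering on the categories, so any method that treats the codes as magnitudes should be interpreted with that in mind.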
This preprocessing approach helps prevent overfitting, enhances model generalization, and simplifies interpretation by focusing on the most relevant data. Additionally, by removing irrelevant or redundant variables, we improve computational efficiency and ensure the model is better equipped to handle real-world scenarios, providing reasonable explanations for the root causes of late deliveries.
In the next subsection, we will conduct a comprehensive exploratory data analysis (EDA) on the dataset. This analysis will involve examining the data in detail to uncover underlying patterns, relationships, and trends. Using various visualization techniques and statistical summaries, we aim to identify key insights that can inform our modeling approach and provide a deeper understanding of the factors contributing to late delivery risks. These insights will be critical in guiding feature selection and improving the accuracy and interpretability of the model.

3.1.2. Exploratory Data Analysis

Figure 3 provides an initial overview of the “Late_delivery_risk” ($X_7$) variable within the dataset, which is essential for our exploratory data analysis (EDA) before proceeding with causal discovery through reinforcement learning for root cause attribution. The first chart presents a countplot of late delivery risks, indicating that instances of late deliveries (‘1’) slightly exceed on-time deliveries (‘0’), suggesting that delivery delays are a prevalent issue in this dataset. The second chart, a scatter plot, examines the relationship between actual shipping days and scheduled shipping days, with different colors representing the presence or absence of late delivery risk. The clustering of late deliveries around certain combinations of scheduled and actual shipment days hints at underlying patterns contributing to these delays. Together, these visualizations provide key insights into where delays are concentrated, laying the groundwork for the causal analysis to follow.
Figure 4 further enhances our exploratory data analysis (EDA) by offering key insights into delivery performance and customer segmentation, which are crucial for understanding factors influencing late delivery risks. The bar chart on the left (a) shows the distribution of delivery statuses across the dataset, with “Late delivery” being the most frequent outcome, significantly outnumbering “Advance shipping”, “Shipping on time”, and “Shipping canceled”. This indicates that delays are a pervasive issue, making it a central focus for our causal analysis. The high occurrence of late deliveries suggests potential systemic issues, which could be explored further by examining other variables in the dataset. For instance, we hypothesize that Delivery Status ($X_6$) could be a key causal factor contributing to late deliveries.
The pie chart on the right (b) provides a breakdown of orders by customer segment ($X_8$). The majority of orders come from the “Consumer” segment (51.8%), followed by “Corporate” (30.4%), and “Home Office” (17.9%). This segmentation is important for our analysis, as it may reveal segment-specific factors influencing delivery delays. While it is crucial to understand the different behaviors and risks associated with each segment, we have reservations about whether the customer segment directly causes late delivery. Although segments may impact shipping times, they might not be a direct causal factor for late deliveries, which will be explored in the subsequent causal analysis.
The heatmap visualization in Figure 5 presents the correlation matrix for key features in the dataset. Each cell in the matrix represents the correlation coefficient between two features, with the color intensity indicating the strength and direction of the correlation. Several important patterns emerge from this visualization. For example, “Days for shipment (scheduled)” ($X_3$) and “Shipping Mode” ($X_{16}$) display a strong positive correlation (0.92), suggesting that the shipping mode may significantly influence, or be influenced by, the scheduled shipment days. Likewise, “Sales per customer” ($X_5$), “Order Item Total”, and “Sales” are highly correlated, indicating potential redundancy among these features.
Of particular relevance to our study on late delivery risks, the “Late_delivery_risk” ($X_7$) variable shows a moderate negative correlation with “Days for shipment (scheduled)” ($X_3$) ($-0.37$) and a positive correlation with “Delivery Status” ($X_6$) (0.19). This suggests that shorter scheduled shipment times may be associated with a higher likelihood of late deliveries and that delivery status plays a significant role in explaining delays. Understanding these correlations is critical as we move toward applying causal discovery with reinforcement learning. By identifying and possibly removing highly correlated features, we can reduce multicollinearity and ensure that the remaining features contribute unique, meaningful information to the causal model. This step is essential for accurately attributing the root causes of late delivery risks in the dataset.
However, while correlation matrices offer valuable insights into relationships between variables, they also have significant limitations. Correlation analysis highlights associations but does not provide information about directionality or causality. For instance, a strong correlation between shipment delays and certain shipping modes may suggest an association, but it does not clarify whether the shipping mode is causing the delay or if both are influenced by another underlying factor. This limitation underscores the need for our causal discovery approach, which can go beyond correlations to uncover the true directional relationships between variables, leading to more actionable insights for mitigating delivery risks.
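A small simulation makes this limitation concrete. In the sketch below, a hidden factor drives two observed variables that have no causal link to each other, yet they end up strongly correlated. The variable names and noise levels are invented for illustration and are not drawn from the dataset.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(42)
n = 2000
# Hidden common cause (e.g., regional congestion), never observed directly.
congestion = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Both observed variables depend on the hidden factor, not on each other.
economy_share = [z + rng.gauss(0.0, 0.5) for z in congestion]
delay = [z + rng.gauss(0.0, 0.5) for z in congestion]

corr = pearson(economy_share, delay)
```

With these settings the population correlation is 0.8, even though intervening on one observed variable would leave the other unchanged. A correlation matrix alone cannot distinguish this situation from a direct causal effect.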

3.2. Experimental Analysis

In this section, we go beyond the correlation matrix. By leveraging causal discovery with reinforcement learning, we uncover the underlying causal mechanisms driving delivery risks. Causal discovery allows us to establish the directionality between variables, distinguishing between factors that are merely correlated and those that are true drivers of delays.
The final causal graph in Figure 6 illustrates the intricate relationships between various features and their impact on ‘Late_delivery_risk’ ($X_7$), providing a clear map of how different variables contribute to the likelihood of a late delivery. Notably, the shipping mode ($X_{16}$) emerges as a key factor, exerting a strong direct influence on ‘Days for shipping (real)’ ($X_2$), with a causal strength of 9.24. This relationship, in turn, significantly affects ‘Days for shipment (scheduled)’ ($X_3$) and directly influences ‘Late_delivery_risk’ ($X_7$), with strengths of 8.92 and 10.25, respectively. These results suggest that the choice of shipping mode triggers a cascade of effects, ultimately determining whether a delivery will be late.
Additionally, ‘Delivery Status’ ($X_6$) has a direct and significant impact on ‘Late_delivery_risk’, with a causal strength of 10.66, indicating that certain delivery statuses are strong predictors of potential delays. The interconnectedness of these nodes shows that delays in actual shipping times not only increase the risk of late delivery directly but also indirectly affect other factors, such as scheduled shipping times, compounding the delay.
At a more moderate level, ‘Sales per customer’ ($X_5$) and ‘Order Item Product Price’ ($X_{12}$) influence ‘Late_delivery_risk’ indirectly, with causal strengths of 1.10 and 1.12, respectively. While these variables have less impact, they suggest that the value of orders and customer segments may affect delivery priority or efficiency, thereby influencing the overall risk of delays. This combination of direct and indirect effects highlights the complexity of interactions where both logistical choices and customer-related factors converge to influence delivery outcomes.
Figure 7 reveals additional patterns within the supply chain that, while not directly tied to ‘Late_delivery_risk’, provide valuable business insights. The analysis shows that both ‘Order Item Product Price’ ($X_{11}$) and ‘Order Item Quantity’ ($X_{14}$) moderately influence ‘Benefit per Order’ ($X_4$), each with a causal strength of 1.28. This suggests that higher product prices and larger order quantities contribute to increased profitability per order. More importantly, ‘Order Item Profit Ratio’ ($X_{13}$) exerts a stronger influence on ‘Benefit per Order’, with a causal strength of 2.20, emphasizing the pivotal role of profit margins in driving overall order profitability. These results underscore the importance of optimizing pricing strategies and profit margins to maximize financial performance within the supply chain.
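To make the cascade toward late delivery risk concrete, the sketch below encodes the edges that the text states explicitly for Figure 6 as a small weighted directed graph, then lists the directed paths into $X_7$ and ranks its direct parents. It rests on assumptions: the 8.92 and 10.25 edges are read as originating at $X_2$, the indirect effects of $X_5$ and $X_{12}$ are omitted because their intermediate nodes are not stated, and the function names are hypothetical. No attempt is made to combine strengths along a path, since the text does not define such an operation.

```python
# Edges and strengths as reported in the text for Figure 6.
# Assumption: the 8.92 and 10.25 links start at X2 (actual shipping days).
EDGES = {
    ("X16", "X2"): 9.24,   # Shipping Mode -> Days for shipping (real)
    ("X2", "X3"): 8.92,    # Days for shipping (real) -> Days for shipment (scheduled)
    ("X2", "X7"): 10.25,   # Days for shipping (real) -> Late_delivery_risk
    ("X6", "X7"): 10.66,   # Delivery Status -> Late_delivery_risk
}


def children(node):
    """Nodes reachable from `node` through one directed edge."""
    return [dst for (src, dst) in EDGES if src == node]


def paths_to(target, start, prefix=None):
    """Enumerate all directed paths from `start` to `target` by depth-first search."""
    prefix = (prefix or []) + [start]
    if start == target:
        return [prefix]
    found = []
    for nxt in children(start):
        if nxt not in prefix:  # cycle guard; a valid causal DAG has no cycles
            found.extend(paths_to(target, nxt, prefix))
    return found


def direct_drivers(target):
    """Parents of `target`, sorted by reported causal strength, strongest first."""
    parents = [(src, w) for (src, dst), w in EDGES.items() if dst == target]
    return sorted(parents, key=lambda item: item[1], reverse=True)


cascade = paths_to("X7", "X16")
drivers = direct_drivers("X7")
print("Paths from shipping mode to late delivery risk:", cascade)
print("Direct drivers of late delivery risk:", drivers)
```

Listing paths separately from ranking direct parents mirrors the distinction drawn in the text between the cascade started by shipping mode and the direct influence of delivery status.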
Comparative Analysis with Existing Literature: Our causal analysis identified shipment mode and delivery status as key drivers with significant causal strengths impacting delivery risks. This finding aligns with previous research by Chong et al. [17] and Ryu et al. [18], which highlighted shipment mode as a critical factor in determining on-time delivery performance. However, while prior studies often established correlations between shipment mode and delays, they lacked insights into the causal direction and strength of these relationships. Our approach quantifies the causal strength (e.g., a causal strength of 9.24 from shipment mode to actual shipping days), illustrating not just association but the magnitude of impact that different shipment modes exert on delivery timelines. This causal insight enables a more precise prioritization of shipping strategies in risk mitigation, particularly for high-risk shipments that could benefit from optimized shipping mode selection.
Moreover, our analysis revealed a direct causal link between delivery status and late delivery risk, with a significant causal strength (10.66). This finding diverges from studies such as Bicer and Seifert [13], which focus primarily on reactive inventory adjustments to minimize delays. While inventory management is certainly relevant, our results suggest that monitoring delivery status provides a more immediate indicator of risk, allowing for preemptive interventions. This causal relationship, which our method detects with reinforced confidence, emphasizes the importance of real-time monitoring over purely inventory-based adjustments, thus proposing a shift in risk management priorities.
Interestingly, our method also found moderate causal effects of customer-related variables, such as order quantity and sales per customer, on late delivery risk (causal strengths of 1.10 and 1.12, respectively). This nuance contributes to the discussion by supporting findings from Bushuev and Filatova [15], who noted that large orders often disrupt logistics planning, leading to delays. However, unlike traditional studies that treat these customer variables as indirect or secondary risk factors, our model establishes them as contributing causal drivers, albeit with lower causal strength. This insight underscores the interconnectedness of logistical and customer factors in influencing delivery outcomes.
In summary, our causal discovery approach not only corroborates established relationships identified in prior studies but also provides additional granularity by quantifying the causal strengths. This level of detail advances current knowledge by facilitating targeted risk mitigation strategies that address high-impact factors with greater precision.
Although the profitability insights from Figure 7 are not directly related to the root causes of late deliveries, they point to key areas for further research and optimization in supply chain management. Focusing on these factors could lead to substantial improvements in profitability, highlighting the broader applicability of our causal discovery framework in uncovering critical drivers of business performance beyond delivery risks.

4. Discussion and Business Impact

We conclude that the application of causal discovery with reinforcement learning in the context of supply chain management, particularly for identifying the root causes of late delivery risks, yields several actionable business insights:
  • Improving Shipping Strategies: Our analysis revealed that the choice of shipping mode ($X_{16}$) has a direct and significant impact on actual shipping days ($X_2$), which, in turn, greatly influences the risk of late deliveries ($X_7$). The causal strength of 9.24 between shipping mode and actual shipping days highlights the critical role of selecting optimal shipping methods to minimize delays. By leveraging our causal discovery approach, businesses can optimize their shipping strategies, prioritizing methods that reduce delivery times and thus mitigate late delivery risks. This can lead to improved customer satisfaction and reduced operational costs.
  • Enhancing Delivery Status Monitoring: The direct relationship between delivery status ($X_6$) and late delivery risk ($X_7$), with a causal strength of 10.66, underscores the importance of real-time monitoring and management of delivery processes. Our method allows businesses to identify the statuses most predictive of delays, facilitating targeted interventions to address bottlenecks. By improving the monitoring of delivery statuses, companies can proactively reduce late deliveries, enhancing supply chain efficiency and providing a more reliable service.
  • Strategic Pricing and Inventory Decisions: Beyond delivery concerns, our analysis found that ‘Order Item Profit Ratio’ ($X_{13}$) and ‘Order Item Product Price’ ($X_{11}$) influence ‘Benefit per Order’ ($X_4$). While not directly related to late delivery risks, these findings are crucial for profitability. The causal strength of 2.20 for the profit ratio suggests that optimizing profit margins is key to maximizing order-level benefits. By applying our causal discovery approach, businesses can refine their pricing strategies and inventory decisions to enhance profitability, leading to more informed decision-making and improved financial performance across the supply chain.
Our approach not only identifies the key drivers of late deliveries but also offers a broader understanding of factors influencing overall supply chain performance. Integrating these insights into daily operations can drive both immediate and long-term improvements in efficiency, customer satisfaction, and profitability. Future research could further explore these relationships to refine and extend these strategies, ensuring that supply chain operations remain adaptive and resilient amid evolving challenges.

5. Conclusions and Future Work

Our proposed framework successfully overcomes the limitations of existing methods by integrating causal discovery and reinforcement learning to address delivery risks in supply chains. By uncovering the root causes of delivery delays, the model goes beyond mere prediction, offering a roadmap for proactive interventions that directly mitigate these risks. The inclusion of causal strength quantification ensures that the most impactful relationships are prioritized, allowing decision-makers to focus on the key drivers of delays. In our experiments, this approach consistently outperformed traditional models, providing clearer insights into the causal mechanisms behind delivery risks. These insights not only support short-term operational improvements but also inform long-term strategic decisions. Additionally, the framework’s adaptability ensures its continued effectiveness as supply chains evolve, making it a valuable tool for maintaining operational efficiency and resilience.
Looking ahead, several avenues for future research and improvement can be explored. First, the uncertainty in the calculation of causal strength could be further examined, addressing questions such as how confident we are in the existence of strong causal relationships between variables like $X_i$ and $X_j$. The model could also be fine-tuned to handle diverse types of data or adjust for varying supply chain conditions, increasing its robustness across different domains such as economics, healthcare, or manufacturing. Additionally, experimenting with alternative scoring functions and hyperparameters during the reinforcement learning process could further enhance the accuracy of causal discovery. The methodology could be expanded to include multi-modal data inputs or adapted for real-time data streams, providing dynamic and responsive insights that evolve with changing supply chain conditions. These advancements would broaden the applicability of our approach, making it a versatile tool for optimizing performance across a wide range of domains.
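One generic way to examine the uncertainty of a strength estimate between $X_i$ and $X_j$ is a percentile bootstrap, sketched below. This is an illustration of the idea rather than part of the proposed method: the score `abs_pearson` is a stand-in placeholder and is not the causal strength measure used in this work, the function names and defaults are assumptions, and the data are synthetic.

```python
import random
from math import sqrt


def abs_pearson(x, y):
    """Placeholder strength score: |Pearson r|. NOT the paper's causal strength."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # degenerate resample with a constant column
    return min(1.0, abs(cov / sqrt(var_x * var_y)))


def bootstrap_interval(x, y, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(x, y), resampling rows jointly."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        estimates.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic data (hypothetical): x_j depends linearly on x_i plus noise.
data_rng = random.Random(42)
x_i = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
x_j = [2.0 * v + data_rng.gauss(0.0, 0.5) for v in x_i]

low, high = bootstrap_interval(x_i, x_j, abs_pearson)
print(f"95% bootstrap interval for the placeholder score: [{low:.3f}, {high:.3f}]")
```

Any strength estimator that takes two samples could be passed as `statistic`. A wide interval, or one whose lower end approaches the pruning threshold, would indicate that the corresponding edge deserves less confidence.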

Author Contributions

The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated and/or analyzed during the current study are available on Kaggle: https://www.kaggle.com/datasets/shashwatwork/dataco-smart-supply-chain-for-big-data-analysis (accessed on 31 August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IIE: Inverse information entropy
BIC: Bayesian information criterion
DoS: Days of shipment
RL: Reinforcement learning
EDA: Exploratory data analysis

Appendix A. Data Description

Table A1. Variables Description.
| Variable | Description |
| --- | --- |
| Type | Type of transaction made |
| Days for shipping (real) | Actual shipping days of the purchased product |
| Days for shipment (scheduled) | Days of scheduled delivery of the purchased product |
| Benefit per order | Earnings per order placed |
| Sales per customer | Total sales made per customer |
| Delivery Status | Delivery status of orders: Advance shipping or not |
| Late_delivery_risk | Categorical variable that indicates if sending the product would cause a late delivery |
| Customer Segment | Business segment of the customer |
| Latitude | Latitude coordinates of the purchase |
| Longitude | Longitude coordinates of the purchase |
| Order Item Discount Rate | Order item discount percentage |
| Order Item Product Price | Price of products without discount |
| Order Item Profit Ratio | Order item profit ratio |
| Order Item Quantity | Number of products per order |
| Order Status | Order status: COMPLETE, PENDING, CLOSED, etc. |
| Shipping Mode | The following shipping modes are presented: First Class, Second Class, Same Day, etc. |

References

  1. Ketchen Jr, D.J.; Rebarick, W.; Hult, G.T.M.; Meyer, D. Best value supply chains: A key competitive weapon for the 21st century. Bus. Horiz. 2008, 51, 235–243.
  2. Singh, J.; Singh, P. Why Is Apple’s Supply Chain Management The Best In The World? Int. J. Humanit. Manag. 2015, II, 1.
  3. Ivanov, D. Predicting the impact of the COVID-19 pandemic on global supply chains: A simulation-based analysis on the case of China. Transp. Res. Part E Logist. Transp. Rev. 2020, 136, 101922.
  4. Park, J.; Hong, S.Y. Just-in-time production systems in the aftermath of a disaster: The 2011 Japanese earthquake and tsunami. Bus. Horiz. 2013, 56, 75–85.
  5. Blanchard, D. Supply Chain Management: Best Practices; John Wiley & Sons: Hoboken, NJ, USA, 2021.
  6. Neuberg, J. Causality and correlation: Pitfalls of traditional statistical methods. Stat. Sci. 2003, 18, 465–475.
  7. Mao, H.; Görg, H. Friends like this: The impact of the US–China trade war on global value chains. World Econ. 2020, 43, 1776–1791.
  8. Christopher, M.; Peck, H. Building the resilient supply chain. Int. J. Logist. Manag. 2004, 15, 1–13.
  9. Peters, J.; Janzing, D.; Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms; The MIT Press: Cambridge, MA, USA, 2017.
  10. Zhu, S.; Ng, I.; Chen, Z. Causal discovery with reinforcement learning. arXiv 2019, arXiv:1906.04477.
  11. Le, T.D.; Hoang, T.; Li, J.; Liu, L.; Liu, H.; Hu, S. A fast PC algorithm for high dimensional causal discovery with multi-core PCs. IEEE/ACM Trans. Comput. Biol. Bioinform. 2016, 16, 1483–1495.
  12. He, Z.; Deng, S.; Xu, X.; Huang, J.Z. A fast greedy algorithm for outlier mining. In Proceedings of the Advances in Knowledge Discovery and Data Mining: 10th Pacific-Asia Conference, PAKDD 2006, Singapore, 9–12 April 2006; Proceedings 10. Springer: Berlin/Heidelberg, Germany, 2006; pp. 567–576.
  13. Bicer, I.; Seifert, R.W. Optimal dynamic order scheduling under capacity constraints given demand-forecast evolution. Prod. Oper. Manag. 2017, 26, 2266–2286.
  14. Chen, H.; Zuo, L.; Wu, C.; Wang, L.; Diao, F.; Chen, J.; Huang, Y. Optimizing detailed schedules of a multiproduct pipeline by a monolithic MILP formulation. J. Pet. Sci. Eng. 2017, 159, 148–163.
  15. Bushuev, M.A.; Filatova, E.A. Delivery strategy improvement for e-commerce supply chain. J. Ind. Eng. Manag. 2018, 11, 50–64.
  16. Shao, X.; Tang, W. Production and delivery scheduling with deteriorating jobs and time-dependent learning effects. J. Oper. Res. Soc. 2018, 69, 108–121.
  17. Chong, A.Y.; Ch’ng, E.; Liu, M.J.; Li, B. Predicting late delivery risk using artificial neural networks. Expert Syst. Appl. 2018, 88, 1–10.
  18. Ryu, S.; Kang, K.; Lee, H. Predicting delivery delays in supply chain management using machine learning. J. Manuf. Syst. 2019, 50, 1–14.
  19. Lim, J.H.; Park, K.Y. Predicting delivery risk using ensemble learning techniques. IEEE Access 2019, 7, 156634–156642.
  20. Wang, S.; Zeng, J. Supply chain disruption risk management using machine learning. J. Manuf. Syst. 2019, 51, 195–206.
  21. Bo, S.; Zhang, Y.; Huang, J.; Liu, S.; Chen, Z.; Li, Z. Attention Mechanism and Context Modeling System for Text Mining Machine Translation. arXiv 2024, arXiv:2408.04216.
  22. Verma, S.; Gangele, V. Supply chain risk management using deep learning techniques. Int. J. Comput. Appl. 2019, 178, 8–12.
  23. Cruijssen, F.; Dullaert, W. Enhancing supply chain resilience through supply chain collaboration: An optimization perspective. Transp. Res. Part E Logist. Transp. Rev. 2019, 122, 14–26.
  24. Mourtzis, D.; Vlachou, E. Real-time monitoring and control in manufacturing: The role of digital twin and industry 4.0. Procedia CIRP 2019, 81, 467–472.
  25. Mishra, A.; Kumar, S. Smart supply chain management: A review and future research directions. J. Clean. Prod. 2020, 273, 123091.
  26. Zhou, K.; Dai, C. Supply chain risk prediction and management using deep reinforcement learning. J. Artif. Intell. Res. 2018, 62, 575–589.
  27. Lee, I.; Lee, K. The Internet of Things (IoT): Applications, investments, and challenges for enterprises. Bus. Horiz. 2018, 61, 577–590.
  28. Janjua, M.B.; Kausar, F. Real-time monitoring of delivery risks in supply chain using IoT and big data. J. Supply Chain. Manag. 2019, 55, 30–45.
  29. Lin, H.; Chen, C.H. Real-time risk management in smart manufacturing systems: A case study on IoT-enabled supply chain. J. Intell. Manuf. 2020, 31, 689–701.
  30. Liu, C.; Yu, L. Internet of Things for improving supply chain risk management: A systematic review. Int. J. Prod. Res. 2020, 58, 2954–2975.
  31. Hoyer, P.; Janzing, D.; Mooij, J.M.; Peters, J.; Schölkopf, B. Nonlinear causal discovery with additive noise models. In Proceedings of the Advances in Neural Information Processing Systems 21, Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–11 December 2008.
  32. Peters, J.; Mooij, J.M.; Janzing, D.; Schölkopf, B. Causal Discovery with Continuous Additive Noise Models. J. Mach. Learn. Res. 2014, 15, 2009–2053.
  33. Mu, G.; Chen, Q.; Liu, H.; An, J.; Wang, C. The inverse information entropy causal reasoning method to reveal causality in power system operation data. Chin. J. Electr. Eng. 2022, 42, 5406–5417.
  34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017.
  35. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  36. Zheng, X.; Aragam, B.; Ravikumar, P.K.; Xing, E.P. Dags with no tears: Continuous optimization for structure learning. In Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, Montreal, QC, Canada, 3–8 December 2018.
  37. Daniusis, P.; Janzing, D.; Mooij, J.; Zscheischler, J.; Steudel, B.; Zhang, K.; Schölkopf, B. Inferring deterministic causal relations. arXiv 2012, arXiv:1203.3475.
  38. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2004, 69, 066138.
  39. Bramer, M.; Heath, T. Data Mining for Advanced Analytics: Techniques for Risk Assessment and Decision-Making; Springer: Berlin/Heidelberg, Germany, 2016.
Figure 1. Proposed causal discovery approach with reinforcement learning for delivery risk.
Figure 2. Causal structure.
Figure 3. Analysis of late delivery risk.
Figure 4. Delivery status and customer segments.
Figure 5. Correlation matrix.
Figure 6. Causal structure for delivery risk.
Figure 7. Causal structure for benefits per order.
Table 1. Summary of methods and results on delivery risk prediction.
| Article | Existing Methods | Results |
| --- | --- | --- |
| [13] | Optimized Production Processes | Mitigated delays at the production stage |
| [14] | Optimizing delivery strategies | Reduced delivery delays post-risk materialization |
| [15] | Enhanced delivery strategies | Improved on-time delivery rate by optimizing routes |
| [17] | Machine Learning (Random Forest) | Improved accuracy in predicting delivery delays |
| [19] | Support Vector Machines | Identified key delay patterns in historical data |
| [21] | Clustering with Attention | Higher predictive accuracy for identifying delayed shipments |
| [23] | Optimization Algorithms | Adjusted inventory levels and routing plans to mitigate delays |
| [24] | Real-Time Adjustment (IoT) | Enabled dynamic responses to real-time risks |
| [28] | IoT-enabled Supply Chain Management | Improved response time to detected delivery risks |
| [30] | IoT and Predictive Analytics | Detected anomalies and delayed shipments using sensor data |
Table 2. Hyperparameters used in the training process.
| Parameter | Value | Description |
| --- | --- | --- |
| Learning Rate | 0.001 | Controls the size of the step taken in each iteration during gradient descent. Chosen to ensure stable convergence without overshooting. |
| Epochs | 500 | Number of iterations over the entire dataset during training. Selected based on performance stabilization after cross-validation. |
| Batch Size | 64 | Number of samples processed before the model updates. A moderate size chosen for memory efficiency and gradient accuracy. |
| Optimizer | Adam | Adam optimizer dynamically adjusts learning rates, suited for handling sparse gradients and noisy data in supply chains. |
| Regularization (L2) | $\lambda = 0.0001$ | Prevents overfitting by penalizing large weights. Regularization factor tuned for balance between complexity and generalization. |
| Dropout Rate | 0.2 | Reduces overfitting. A value of 20% chosen based on experimentation for improved generalization. |
| Reward Function Parameters | $\lambda_1 = 0.5$, $\lambda_2 = 0.1$ | Two hyperparameters for the reward function: $\lambda_1$ controls the causal reward weight, and $\lambda_2$ is a penalty factor for weak causal links. |
| Causal Strength Threshold | 0.05 | Threshold set to filter out weaker causal links, ensuring only significant relationships are included in the causal graph. |
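The settings in Table 2 could be collected in a single configuration object, with the causal strength threshold applied as a simple edge filter, as in the sketch below. The class name, field names, and the `prune_weak_edges` helper are hypothetical, the candidate edges are invented for illustration, and treating the threshold as inclusive is an assumption because the table does not say how ties are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameter values copied from Table 2; field names are assumptions."""
    learning_rate: float = 0.001
    epochs: int = 500
    batch_size: int = 64
    optimizer: str = "adam"
    l2_lambda: float = 0.0001
    dropout_rate: float = 0.2
    reward_lambda1: float = 0.5   # causal reward weight
    reward_lambda2: float = 0.1   # penalty factor for weak causal links
    causal_strength_threshold: float = 0.05


def prune_weak_edges(edge_strengths, threshold):
    """Keep edges whose strength is at or above the threshold (inclusive by assumption)."""
    return {edge: s for edge, s in edge_strengths.items() if s >= threshold}


config = TrainingConfig()

# Hypothetical candidate edges, not results from the paper.
candidate_edges = {("A", "B"): 1.28, ("B", "C"): 0.03, ("A", "C"): 0.05}
kept = prune_weak_edges(candidate_edges, config.causal_strength_threshold)
print(kept)
```

Freezing the dataclass keeps the values immutable during a run, which makes it easier to log exactly which settings produced a given causal graph.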
Table 3. Denotations and their corresponding variables.
| Denotation | Variable | Denotation | Variable |
| --- | --- | --- | --- |
| $X_1$ | Type | $X_9$ | Latitude |
| $X_2$ | Days for shipping (real) | $X_{10}$ | Longitude |
| $X_3$ | Days for shipment (scheduled) | $X_{11}$ | Order Item Discount Rate |
| $X_4$ | Benefit per order | $X_{12}$ | Order Item Product Price |
| $X_5$ | Sales per customer | $X_{13}$ | Order Item Profit Ratio |
| $X_6$ | Delivery Status | $X_{14}$ | Order Item Quantity |
| $X_7$ | Late_delivery_risk | $X_{15}$ | Order Status |
| $X_8$ | Customer Segment | $X_{16}$ | Shipping Mode |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Bo, S.; Xiao, M. Root Cause Attribution of Delivery Risks via Causal Discovery with Reinforcement Learning. Algorithms 2024, 17, 498. https://doi.org/10.3390/a17110498
