Product Demand Prediction with Spatial Graph Neural Networks

Li, Jiale; Fan, Li; Wang, Xuran; Sun, Tiejiang; Zhou, Mengjie

doi:10.3390/app14166989


by Jiale Li ¹, Li Fan ², Xuran Wang ³, Tiejiang Sun ⁴ and Mengjie Zhou ⁵,*

¹ Tandon School of Engineering, New York University, New York, NY 10012, USA

² College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China

³ The Department of Computer and Information Science, The University of Pennsylvania, Philadelphia, PA 19104, USA

⁴ School of Information Engineering, Chang’an University, Xi’an 710064, China

⁵ Department of Computer Science, University of Bristol, Bristol BS8 1QU, UK

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(16), 6989; https://doi.org/10.3390/app14166989

Submission received: 2 July 2024 / Revised: 30 July 2024 / Accepted: 6 August 2024 / Published: 9 August 2024

(This article belongs to the Special Issue Methods and Applications of Data Management and Analytics)


Abstract: In the rapidly evolving online marketplace, accurately predicting the demand for pre-owned items presents a significant challenge for sellers, impacting pricing strategies, product presentation, and marketing investments. Traditional demand prediction methods, while foundational, often fall short in addressing the dynamic and heterogeneous nature of e-commerce data, which encompasses textual descriptions, visual elements, geographic contexts, and temporal dynamics. This paper introduces a novel approach, named SGNN, that uses a Graph Neural Network (GNN) to enhance demand prediction accuracy by leveraging the spatial relationships inherent in online sales data. Drawing on the rich dataset provided in the fourth Kaggle competition (the Avito Demand Prediction Challenge), we construct a spatially aware graph representation of the marketplace and integrate attention mechanisms to refine predictive accuracy. Our methodology defines the product demand prediction problem as a regression task on an attributed graph, capturing both local and global spatial dependencies that are fundamental to accurate prediction. Through attention-aware message propagation and node-level demand prediction, our model addresses the multifaceted challenges of e-commerce demand prediction and outperforms traditional statistical methods, machine learning techniques, and deep learning models. The experimental findings validate the effectiveness of our GNN-based approach, offering actionable insights for sellers navigating the complexities of the online marketplace. This research not only contributes to the academic discourse on e-commerce demand prediction but also provides a scalable and adaptable framework for future applications, paving the way for more informed and effective online sales strategies.

Keywords:

demand prediction; Graph Neural Network; spatial information

1. Introduction

In the vast expanse of today’s online marketplace, the ability to effectively sell pre-owned items hinges not just on the inherent qualities of the items themselves but also critically on the nuanced presentation within their descriptions. From vivid imagery to compelling narratives, each element plays a pivotal role in captivating potential buyers. Sellers invest considerable effort into optimizing their listings, yet they frequently encounter a disheartening obstacle: despite a meticulously optimized product listing, the anticipated demand may fail to materialize. This scenario leaves sellers in a quandary, particularly when substantial investments in marketing have been made, underscoring a pressing need for accurate demand prediction.

Navigating the online marketplace demands more than just intuition; it requires a nuanced understanding of the many factors that influence buyer behavior [1]. The challenge of demand prediction lies in its inherent complexity: demand is shaped by a web of interrelated factors, including product descriptions, visual appeal, contextual cues such as geographic location, the presence of competing ads, and the intricate patterns of historical demand. These factors play a significant role in shaping customer behavior [2] and contribute to an environment where traditional sales strategies may fall short, necessitating innovative approaches to predict demand accurately. In addition, advertising demand often follows both short-term and long-term patterns, including weekly and seasonal trends, so models must be able to capture these temporal dynamics [3].

The fourth Kaggle competition, leveraging the Avito dataset (https://www.kaggle.com/competitions/avito-demand-prediction (accessed on 15 June 2024)), exemplifies this complex challenge and highlights the critical importance of accurate demand prediction in the online marketplace. Avito, as Russia’s largest classified ads platform, offers a rich and diverse dataset that encapsulates the multifaceted nature of e-commerce, including the very factors mentioned above that influence buyer behavior. This competition not only tasks participants with predicting demand for a wide range of products but also emphasizes the potential impact of such predictions on crucial business decisions, from pricing strategies to marketing investments. By providing a real-world context, the competition creates a unique opportunity to test and validate innovative approaches to demand prediction, directly addressing the need for more sophisticated models in this domain.

However, the path to effective demand prediction in the e-commerce domain is fraught with challenges. The heterogeneous nature of online advertising data, encompassing textual descriptions, images, and contextual information, presents a significant analytical challenge. Traditional statistical methods, while foundational, often lack the flexibility to accommodate the dynamic and multifaceted nature of e-commerce data. Machine learning techniques have introduced a degree of adaptability, yet the quest for models that can fully capture and interpret the complex interplay of factors influencing demand continues.

In response to these challenges, our paper proposes a novel approach utilizing the Graph Neural Network to enhance the accuracy of e-commerce product demand prediction. By constructing a spatially aware graph representation of the marketplace and integrating advanced attention mechanisms, our methodology aims to capture the nuanced relationships and dependencies that shape demand patterns. This approach not only promises to refine demand predictions but also introduces a scalable framework adaptable to the diverse landscape of online sales.

As we delve into this exploration, our paper seeks to make a significant contribution to the domain of e-commerce demand prediction. Through an in-depth examination of the challenges inherent in online sales, a comprehensive review of existing methodologies, and the introduction of an innovative GNN-based approach, we endeavor to equip sellers with a robust tool to navigate the complexities of the online marketplace. In our investigation, we propose the following research questions (RQs) to evaluate the efficacy of our spatial GNN model in the context of geographic product demand prediction: RQ1: How does the performance of the SGNN model in predicting product demand compare to a variety of established baseline models? RQ2: To what extent does each component within the SGNN model contribute to the improvement of product demand prediction accuracy? RQ3: How stable is the SGNN model’s predictive accuracy across a range of hyperparameter settings? Our findings validate the effectiveness of our approach and offer actionable insights for optimizing online sales strategies.

In essence, this paper enriches the academic discourse on demand prediction in e-commerce and addresses a critical gap with its pioneering GNN-based methodology. By doing so, it lays the groundwork for future research and practical applications poised to transform the online sales landscape.

2. Related Works

2.1. E-Commerce Demand Prediction

2.1.1. Characteristics of Online Advertising Data

Customers’ willingness to purchase on e-commerce platforms is influenced not only by historical data but also significantly by the quality, thoroughness, authenticity, and context of advertisements [1]. This complexity underscores the multifaceted nature of online advertising data. Such data encompass numerical data (e.g., historical demand), image data (e.g., ad visuals), text data (e.g., product descriptions), and location data (e.g., geographic targeting). Detailed descriptions and contextual cues, such as geographic location and the presence of similar ads, play a significant role in shaping customer behavior [2]. Moreover, advertising demand often follows seasonal patterns and short-term and long-term trends, necessitating models that can capture these temporal dynamics [3]. The dynamic nature of online markets introduces volatility and noise, further challenging models to provide accurate predictions amidst these fluctuations. These interconnected characteristics highlight the need for advanced models capable of processing and analyzing diverse and complex data types to enhance demand prediction accuracy.

2.1.2. Demand Prediction Methods

Demand prediction in the context of e-commerce is a relatively novel area of research.

Traditional Statistical Methods. Early efforts in demand prediction for online advertising predominantly employed traditional statistical methods. Levis et al. present a systematic optimization-based approach for customer demand prediction using support vector regression (SVR) [4]. Jain et al. also explore SVR for demand prediction, using a similar three-step algorithm involving nonlinear and linear programming, and a recursive step to adapt to historical sales data for accurate predictions [5]. These traditional statistical models have been widely used due to their simplicity and interpretability.

Machine Learning Techniques. Machine learning techniques offer enhanced flexibility and accuracy in demand prediction for e-commerce. Tugay et al. propose a novel approach that considers market dynamics with multiple sellers offering the same product at different prices [6]. They apply various regression algorithms and a stacked generalization (ensemble learning) technique, showing that the latter yields superior results.

Deep Learning Models. Deep learning models, particularly neural networks, have shown exceptional capability in processing vast amounts of data and identifying complex patterns. Bandara et al. use Long Short-Term Memory (LSTM) networks to exploit non-linear demand relationships in an e-commerce product hierarchy [7]. In domains beyond online advertising, Zhang et al. synthesize research on artificial neural networks (ANNs) for prediction, providing insights into modeling issues and future directions [8]. Kuo et al. compare neural networks with traditional models, highlighting their advantages in handling non-linear relationships [9]. Azzouni et al. propose an LSTM framework for predicting network traffic, demonstrating its effectiveness on real-world data [10].

Ensemble Models. Ensemble models combine different techniques to leverage their respective strengths for improved prediction accuracy. Aburto et al. present a hybrid system combining ARIMA models and neural networks for demand prediction, improving accuracy and inventory management in a Chilean supermarket [11]. Irem et al. investigate various classifiers and their combinations, showing that ensemble models outperform single classifiers and simple combinations in prediction accuracy [12].

2.1.3. Spatial Statistics in Demand Prediction

Spatial statistics has long been a cornerstone in understanding and predicting spatially distributed phenomena, including product demand. Traditional spatial statistical methods have provided valuable insights into the geographic aspects of demand prediction, forming a foundation upon which more advanced techniques like our SGNN approach build. Spatial autoregressive (SAR) models [13] have been widely used in econometrics and marketing to capture spatial dependencies in demand patterns. These models assume that the demand in one location is influenced by the demand in neighboring locations, an idea that aligns with our graph-based approach. However, SAR models typically rely on predefined spatial weight matrices and may struggle with non-linear spatial relationships. Geographically Weighted Regression (GWR) [14] extends traditional regression by allowing coefficients to vary over space, considering that the impact of predictors on demand may differ across locations. While GWR provides localized insights, it may not fully capture the complex, non-linear spatial interactions that our SGNN model aims to address. Kriging, or Gaussian process regression [15], has been applied in geostatistics to predict values at unobserved locations based on observed data points. This method excels in interpolation but may face challenges in extrapolation and handling the high-dimensional feature spaces that are common in e-commerce data. Our SGNN approach builds upon these foundational spatial statistical methods by leveraging the flexibility of Graph Neural Networks. It allows for learnable, non-linear spatial relationships and can incorporate high-dimensional feature spaces more naturally. Moreover, the attention mechanism in our model provides a data-driven alternative to the fixed spatial weights used in traditional spatial statistics, potentially capturing more nuanced spatial dependencies in demand patterns.

2.2. Graph Neural Network in Modeling Spatial Dynamics Events

Graph Neural Networks offer several advantages over traditional neural networks. They can be trained on datasets that include both input data and pairwise relationships between items, making them particularly effective for modeling spatial dynamics events [16]. Demand prediction in online marketplaces also involves spatial dynamics [17], as the location from which products are sold can influence the demand for geographically proximate products.

In this paper, we creatively apply GNN to predict e-commerce product demand by leveraging the geographic information inherent in the data. We construct an adjacency matrix based on the spatial connectivity of administrative regions in Russia, defining a graph where each geographic region is treated as a node, with various features from the dataset serving as attributes for each node. By utilizing this graph structure, GNN effectively integrates location-specific information into the prediction model, capturing spatial dependencies that other neural network models do not consider. This approach not only enhances the accuracy of demand predictions but also provides a more comprehensive understanding of the factors influencing demand across different regions.

3. Methodology

To tackle the complexities of online retail and to accurately predict product demand, our methodology builds on the Spatial Graph Neural Network (SGNN). This section presents our methodological framework, which is designed to exploit the spatial relationships and rich attribute data inherent in retail locations. We start with the attributed graph prediction problem formulation, setting the stage by defining the graph structure that encapsulates the multifaceted nature of retail locations and their interconnections. Following this, we describe constructing the graph with spatial adjacency, a critical step that operationalizes the notion that geographic proximity significantly influences demand patterns among retail nodes. The “Attention-Aware Message Propagation” subsection introduces a mechanism to dynamically weigh and integrate information from neighboring nodes, ensuring that the most relevant spatial and contextual signals are emphasized in predicting demand. In “Node-Level Product Demand Prediction”, we articulate how the model synthesizes the aggregated information to predict demand at individual locations, highlighting the model’s capacity to distill both local and global insights. Lastly, the “Training Objective” subsection outlines our strategy for refining the model’s predictions, emphasizing the optimization of a loss function that aligns predicted demands with actual demands. Together, these components aim to offer a robust framework that not only enhances the accuracy of demand predictions but also provides actionable insights for retailers operating in the dynamic online marketplace.

3.1. Attributed Graph Prediction Problem Formulation

We define the product demand prediction problem in the context of SGNN as a regression task on an attributed graph $G = (V, E, X)$. Here, $V$ represents the set of vertices corresponding to different retail locations, and $E \subseteq V \times V$ denotes the set of edges reflecting the spatial connectivity between these locations. The vertex feature matrix is denoted by $X \in \mathbb{R}^{|V| \times f}$, encapsulating $f$-dimensional features that capture each location’s attributes relevant to product demand, such as demographic data, store characteristics, and past demand trends.

Each node $v_{i} \in V$ is associated with a target variable $y_{i}$ representing the product demand to be predicted. Our goal is to learn a function $f : G \to \mathbb{R}^{|V|}$ that maps the attributed graph to a vector of predicted product demands. Formally, we define the predicted demand for a node $v_{i}$ as $\hat{y}_{i} = f(G, X)_{i}$, where $f$ is parameterized by the weights of a GNN.

3.2. Constructing Graph with Spatial Adjacency

In the realm of e-commerce, understanding the spatial dynamics of product demand is crucial for tailoring marketing strategies and optimizing inventory distribution. The geographic location of retail outlets significantly influences consumer behavior, as customers are more likely to purchase from nearby stores due to convenience and lower shipping costs. Furthermore, sociodemographic factors and regional preferences can lead to variations in demand patterns across different areas. Recognizing these spatial dependencies is essential for accurately predicting demand at each retail location. By constructing a graph with spatial adjacency, we can model the complex interplay between geographic proximity and demand similarity, providing a structured framework to capture these nuanced relationships. This approach allows us to not only predict demand more accurately but also to uncover insights into how spatial factors influence consumer preferences and buying behavior. Transitioning from this rationale, the construction of the adjacency matrix $A \in \mathbb{R}^{|V| \times |V|}$ becomes a foundational step in our methodology, enabling us to quantitatively analyze and leverage these spatial relationships for enhanced demand prediction.

The adjacency matrix $A \in \mathbb{R}^{|V| \times |V|}$ is constructed based on spatial proximity among the retail locations, capturing the notion that nearby locations may exhibit similar demand patterns. An edge weight $w_{ij}$ is assigned to each edge $(v_{i}, v_{j}) \in E$, formulated as

$$w_{ij} = \exp(-\delta \cdot d(v_{i}, v_{j})), \tag{1}$$

where $\delta$ is a decay factor and $d(v_{i}, v_{j})$ measures the geographical distance between locations $v_{i}$ and $v_{j}$.
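As a minimal sketch of Equation (1), the snippet below builds a weighted adjacency matrix from node coordinates. Several choices here are assumptions made for illustration and are not specified in the text: the haversine great-circle distance stands in for $d(v_i, v_j)$, every pair of distinct nodes is treated as an edge, self-loops are omitted, and the coordinates and the value of `delta` are hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def spatial_adjacency(coords, delta):
    """Weighted adjacency with w_ij = exp(-delta * d(v_i, v_j)), as in Eq. (1)."""
    n = len(coords)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: no self-loops, all other pairs connected
                A[i][j] = math.exp(-delta * haversine_km(coords[i], coords[j]))
    return A

# Hypothetical (lat, lon) coordinates for three nodes, for illustration only.
coords = [(45.04, 38.98), (47.23, 39.72), (56.84, 60.61)]
A = spatial_adjacency(coords, delta=0.001)  # delta is an arbitrary example value
```

Because the weight decays exponentially with distance, a larger `delta` concentrates each node's weight on its closest neighbors, while a smaller `delta` lets distant nodes contribute more.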

3.3. Attention-Aware Message Propagation

In the intricate landscape of retail, not all interactions between locations bear equal significance in shaping product demand. The influence of one retail location on another can vary dramatically based on a multitude of factors, such as the similarity of the products offered, the competitive dynamics between the stores, or even the demographic characteristics of the surrounding areas. Traditional methods of aggregating information across nodes in a graph often assume uniformity in these relationships, potentially glossing over the subtleties that could inform more nuanced demand predictions. To address this limitation and reflect the complexity of real-world retail networks, we need a mechanism that can discern and weigh the varying degrees of influence between neighboring locations. By implementing an attention-aware message propagation system, we can selectively amplify or attenuate the information flow between nodes, ensuring that the aggregation of features across the network accurately reflects the heterogeneity of inter-node impacts. This approach not only enhances the model’s ability to capture spatial interactions but also improves predictive accuracy by focusing on the most relevant signals for demand prediction in each specific location. The attention mechanism therefore allows for a dynamic and context-sensitive representation of retail locations within the graph.

We introduce an attention mechanism to capture the varying impact that different neighbors have on a node’s feature representation. The attention coefficients $\alpha_{ij}$ quantify the importance of node $v_{j}$’s features to node $v_{i}$, computed as

$$\alpha_{ij} = \frac{\exp(\mathrm{LeakyReLU}(a^{T} [W h_{i} \,\|\, W h_{j}]))}{\sum_{k \in N(v_{i})} \exp(\mathrm{LeakyReLU}(a^{T} [W h_{i} \,\|\, W h_{k}]))}, \tag{2}$$

where $a$ is a learnable weight vector, $W$ is a shared linear transformation applied to each node’s features, and $\|$ denotes concatenation.
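To make Equation (2) concrete, the following sketch computes the attention coefficients of one node over its neighborhood in plain Python. The LeakyReLU slope of 0.2, the toy feature vectors, and the values of `W` and `a` are assumptions chosen for illustration; in the model these parameters would be learned.

```python
import math

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):  # slope is an assumed value, not given in the text
    return x if x > 0 else slope * x

def attention_coefficients(i, neighbors, H, W, a):
    """Softmax-normalised alpha_ij over j in N(v_i), following Eq. (2).

    Assumes `neighbors` is a non-empty list of node indices.
    """
    Wh_i = matvec(W, H[i])
    scores = []
    for j in neighbors:
        concat = Wh_i + matvec(W, H[j])  # list concatenation: [W h_i || W h_j]
        scores.append(leaky_relu(sum(ak * ck for ak, ck in zip(a, concat))))
    m = max(scores)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors, exps)}

# Toy inputs (hypothetical): three nodes with 2-dimensional features.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.2], [0.1, 0.3]]
a = [0.4, -0.1, 0.2, 0.6]
alpha = attention_coefficients(0, [1, 2], H, W, a)
```

Subtracting the maximum score leaves the normalised coefficients unchanged, since the common factor cancels between numerator and denominator. The coefficients sum to one over the neighborhood, so they can be read as the relative share of influence each neighbor has on node $v_i$.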

3.4. Node-Level Product Demand Prediction

The process of predicting product demand at the node level, represented by $\hat{y}_{i}$ for a node $v_{i}$, is a sophisticated operation that integrates the core principles of spatial adjacency and attention mechanisms to accurately model demand dynamics. This integration is pivotal, as it allows the model to consider not just the intrinsic attributes of a retail location encapsulated in $h_{i}^{(l)}$, but also the complex web of interactions it has with its neighbors. The spatial adjacency construction ensures that the model recognizes the influence of geographic proximity in shaping demand patterns, acknowledging that retail locations situated closer together are more likely to exhibit similar demand characteristics due to shared market conditions and consumer bases. Furthermore, the introduction of attention-aware message propagation enhances the model’s ability to discern the varying degrees of relevance among these interactions. By computing attention coefficients $\alpha_{ij}$, the model dynamically adjusts the influence of neighboring nodes based on the context, ensuring that the aggregated information is tailored to reflect the unique demand landscape of each location. This nuanced approach to information aggregation, where the attention mechanism acts as a filter to prioritize the most impactful signals from the neighborhood, is crucial for capturing the heterogeneity inherent in retail networks.

Following the above analysis, the predicted demand $\hat{y}_{i}$ for a node $v_{i}$, which reflects the culmination of feature transformations and attention-weighted information aggregation from its neighborhood, is formulated as follows:

$$\hat{y}_{i} = \sigma\left(w_{o}^{T} \left(h_{i}^{(l)} + \sum_{v_{j} \in N(v_{i})} \alpha_{ij} h_{j}^{(l)}\right)\right), \tag{3}$$

where $h_{i}^{(l)}$ is the feature representation of node $v_{i}$ at the $l$-th layer, $w_{o}$ is the output layer’s weight vector, and $\sigma$ is an activation function, typically chosen as the identity function for regression tasks. The final layer’s output $\hat{y}_{i}$ is the demand prediction for node $v_{i}$, and the network is trained to minimize the prediction error over all nodes.
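The readout in Equation (3) can be sketched as follows for a single node. The feature vectors, attention coefficients, and output weights below are hypothetical toy values, and the activation defaults to the identity, as the text suggests for regression.

```python
def predict_demand(i, neighbors, alpha, H, w_o, activation=lambda z: z):
    """y_hat_i = sigma(w_o^T (h_i + sum_j alpha_ij * h_j)), following Eq. (3)."""
    agg = list(H[i])  # start from the node's own representation h_i
    for j in neighbors:
        for k, x in enumerate(H[j]):
            agg[k] += alpha[j] * x  # add the attention-weighted neighbor features
    return activation(sum(w * x for w, x in zip(w_o, agg)))

# Toy inputs (hypothetical): final-layer features and attention weights for node 0.
H = [[0.2, 0.4], [0.6, 0.0], [0.0, 0.8]]
alpha = {1: 0.25, 2: 0.75}
w_o = [0.5, 0.5]
y_hat_0 = predict_demand(0, [1, 2], alpha, H, w_o)
```

Here the aggregated vector is $(0.2 + 0.25 \cdot 0.6,\; 0.4 + 0.75 \cdot 0.8) = (0.35, 1.0)$, so the identity readout gives $0.5 \cdot 0.35 + 0.5 \cdot 1.0 = 0.675$.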

3.5. Training Objective

Because the predictive accuracy of our model depends on its ability to model the spatial dynamics and attention-driven interactions within the retail network, the training objective plays a crucial role. The loss function, $L$, serves as a quantifiable measure of the model’s performance, estimating the discrepancy between the predicted demands $\hat{y}$ and the actual demands $y$ across the graph. We deliberately choose a mean squared error (MSE) loss function, which penalizes larger errors more severely and thereby refines the model’s predictive accuracy:

$$L(y, \hat{y}) = \frac{1}{|V|} \sum_{i = 1}^{|V|} (y_{i} - \hat{y}_{i})^{2}. \tag{4}$$
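As a small illustration of Equation (4), the sketch below computes the MSE over a handful of nodes. The target and prediction values are made up for the example.

```python
def mse_loss(y, y_hat):
    """Mean squared error over all nodes, following Eq. (4)."""
    if len(y) != len(y_hat) or not y:
        raise ValueError("y and y_hat must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

# Toy values (hypothetical): true and predicted demand for three nodes.
y = [0.0, 0.5, 1.0]
y_hat = [0.1, 0.5, 0.6]
loss = mse_loss(y, y_hat)
```

The squared term is what makes large errors dominate: the error of 0.4 on the third node contributes 0.16 to the sum, sixteen times the 0.01 contributed by the error of 0.1 on the first node.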

This mathematical formulation is underpinned by a deeper strategy aimed at capturing the nuanced interplay of factors influencing product demand. The propagation of features across the layers of the Graph Neural Network is not merely a process of data transformation but a methodical approach to distill both the local and global spatial dependencies that are critical to understanding and predicting demand accurately. These dependencies are revealed through the model’s attention mechanisms and the spatial adjacency matrix, enabling a comprehensive analysis of how geographic proximity and contextual relevance between retail locations influence demand patterns.

By meticulously calibrating the parameters of the function f, our objective transcends the minimization of prediction error. We try to shape a predictive model that not only excels in capturing the complex dynamics governing demand across the retail network but also demonstrates an exceptional ability to generalize across diverse nodes and fluctuating demand scenarios. This involves a careful balancing act of ensuring the model remains sensitive to the subtleties of spatial and contextual information while avoiding overfitting to the training data, thereby ensuring its applicability and robustness in real-world settings.

The training phase, therefore, becomes a critical juncture where theoretical concepts and empirical data converge, guiding the model towards a contextual understanding of demand prediction. Through this process, we aim to equip stakeholders in the online marketplace with an analytical tool capable of handling the intricacies of demand prediction with sufficient precision and insight. This endeavor not only advances research in e-commerce demand prediction but also offers clear benefits for retailers seeking to optimize their strategies in response to the ever-evolving landscape of consumer preferences and market conditions.

4. Experiments

In our investigation, we answer the research questions (RQs) proposed in Section 1, which aim at evaluating the efficacy of our spatial GNN model in the scenario of geographic product demand prediction. These questions are designed to probe the model’s performance, the incremental value of its components, and its robustness across different settings.

4.1. Experiment Setup

4.1.1. Datasets

Overall Introduction

In the experiments, we utilize the Avito Demand Prediction Challenge hosted on Kaggle as the benchmark, which presents participants with a unique opportunity to apply machine learning techniques to predict demand for an extensive range of products listed on Avito’s platform. Avito, being Russia’s largest classified ads service, encompasses a wide variety of categories, including electronics, real estate, and services, making it a rich source of data for such predictive modeling tasks. The objective of the challenge is to accurately predict the probability of an ad leading to a product transaction, based on the information provided in the ad’s description, context, and metadata. This task has profound implications for both sellers by optimizing their ad placements for higher sales, and buyers, by enhancing their shopping experience through the prioritization of listings likely to meet their purchase intent.

Detailed Data Information

The dataset provided by Avito’s team contains multiple modalities of information, including images, text, categorical, and continuous features. We provide their details as follows:

item id: ID of a particular advertisement.
user id: ID of a user.
region: The region that the ad belongs to.
city: The city that the ad belongs to.
top-level category: The top-level ad category as classified by Avito’s ad model.
fine-grain category: The fine-grain ad category as classified by Avito’s ad model.
param 1: The first optional parameter from Avito’s ad model.
param 2: The second optional parameter from Avito’s ad model.
param 3: The third optional parameter from Avito’s ad model.
title: The textual title of the ad.
description: The multi-sentence textual description of the ad.
price: The numerical value of the ad’s price.
item seq number: The ad’s sequential number for the user.
activation date: The date on which the ad was placed on the platform.
user type: The type of the user: Private, Company, or Shop.
image: The ID code of the image, which is tied to a jpg file in train jpg. Because not every ad has an image, we do not use this feature in further analysis.
image top 1: Avito’s classification code for the image.
deal probability: The target variable. This is the likelihood that the ad will actually sell the item. It is not possible to verify every transaction with certainty, so the value of this column can be any floating-point number from zero to one.

This dataset contains 1,503,424 records, which are randomly divided into train/validation/test sets according to a 70%/10%/20% ratio.
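A random 70%/10%/20% split of this kind can be sketched as below. The function name, the fixed seed, and the rounding of partition sizes are assumptions for illustration; the text does not state how the split was actually implemented.

```python
import random

def split_indices(n, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle record indices and cut them into train/validation/test partitions."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded generator for a reproducible shuffle
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    # The test set takes whatever remains, so every record is used exactly once.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Small example with 1000 records; the same call works with n = 1503424.
train, val, test = split_indices(1000)
```

Assigning the remainder to the test set avoids dropping or duplicating records when the ratios do not divide the dataset size evenly.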

Data Analysis

Univariate analysis: In the univariate analysis, we examine six features that are important for understanding the dynamics of our dataset: deal probability, price, region, city, top-level category, and fine-grain category. This analysis aims to shed light on the individual characteristics and distributions of these features, providing foundational insights for further multivariate analysis. The distribution histograms for these features are provided in Figure 1 and Figure 2. From these figures, we make the following observations:

deal probability: An initial observation from the distribution histograms indicates a pronounced long-tail distribution in the deal probability feature. Notably, approximately 65% of the ads exhibit a zero deal probability, signifying a substantial portion of ads that do not culminate in a transaction. Conversely, a minimal fraction of ads achieve a deal probability of 1, indicating a successful sale. This distribution suggests a high variance in the likelihood of deals being closed across the dataset.
price: Prior to analysis, the price feature undergoes a logarithmic transformation to normalize its distribution. Post-transformation, the price distribution approximates a normal distribution, as evidenced by the histograms. This transformation mitigates the skewness originally present in the data, facilitating more meaningful statistical analysis and interpretation.
region: The analysis of the region feature reveals a geographical disparity in ad postings. The Krasnodar region emerges as the most prominent area for ad postings, followed closely by the Sverdlovsk and Rostov regions. This distribution highlights the regional variances in marketplace activity, potentially influenced by factors such as population density and economic conditions.
city: Delving into the city feature, we observe that the highest number of ads are posted in Krasnodar and Ekaterinburg cities. Subsequent rankings include Novosibirsk, Rostov-na-Donu, and Nizhny Novgorod cities. This urban-centric distribution underscores the role of major cities as hubs for online marketplace transactions, possibly attributed to their larger populations and higher internet penetration rates.
top-level category: The top-level category feature analysis reveals a dominant preference for posting ads in the “Personal things” category, accounting for more than 0.6 million users. Following this, the categories for “Home and Cottages” and “Consumer Electronics” are notable, with approximately 0.2 million users posting ads in each. This distribution indicates a significant inclination towards selling personal belongings, with a notable interest in home-related items and electronics.
fine-grain category: Within the subcategories, “Clothes, shoes, accessories”, “Children’s clothing and footwear”, and “Goods for children and toys” emerge as the top three, each with around 0.3 million postings. This detailed breakdown within the fine-grain category feature further elucidates consumer behavior, highlighting a strong market for personal and children-related items.
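The univariate checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the toy records are made up, the field names (`price`, `deal_probability`, `region`) are chosen for illustration, and the `log(1 + price)` offset is an assumption, since the text only says that a logarithmic transformation is applied.

```python
import math
from collections import Counter

# Hypothetical toy records standing in for ads; the values are invented.
ads = [
    {"price": 400.0, "deal_probability": 0.0, "region": "Krasnodar"},
    {"price": 3000.0, "deal_probability": 0.0, "region": "Sverdlovsk"},
    {"price": 150000.0, "deal_probability": 0.8, "region": "Krasnodar"},
    {"price": 1200.0, "deal_probability": 0.0, "region": "Rostov"},
    {"price": 60.0, "deal_probability": 1.0, "region": "Krasnodar"},
]


def log_price(price):
    # log(1 + price) keeps zero-priced ads finite; the +1 offset is an assumption.
    return math.log1p(price)


def zero_share(records):
    # Fraction of ads whose deal probability is exactly zero.
    zeros = sum(1 for r in records if r["deal_probability"] == 0.0)
    return zeros / len(records)


log_prices = [log_price(r["price"]) for r in ads]
region_counts = Counter(r["region"] for r in ads)

print(f"share of ads with zero deal probability: {zero_share(ads):.2f}")
print("log prices:", [round(v, 2) for v in log_prices])
print("most common region:", region_counts.most_common(1)[0])
```

On the real dataset, the same two summaries (the share of zero targets and the per-category counts) correspond to the observations drawn from Figure 1 and Figure 2.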

Figure 1. The distribution histogram for the deal probability and log of price over each value range.

Figure 2. The distribution histogram for the region, city, top-level category, and fine-grain category, correspondingly.

The univariate analysis of these selected features, supported by the distribution histograms in Figure 1 and Figure 2, provides a comprehensive overview of the dataset’s characteristics. This foundational understanding paves the way for more intricate multivariate analyses and predictive modeling, with the ultimate goal of enhancing the accuracy and efficiency of product demand prediction in the online marketplace.

Bivariate analysis: In the exploration of bivariate relationships within our dataset, we delve into the interactions between deal probability and four critical features: region, city, top-level category, and user type. This analysis aims to uncover the nuanced dynamics that these features may have with the likelihood of a deal being closed. By examining the mean deal probability across various groups within these features, we gain insights into how different factors influence transaction outcomes. The findings from this analysis are visualized in Figure 3, facilitating a more intuitive understanding of these relationships.

deal probability and region: Upon examining the relationship between deal probability and region, it is observed that the mean deal probability across all regions hovers around 15%. This uniformity suggests that while regional factors may influence the volume of ads, they do not significantly differentiate the likelihood of a deal being closed. This finding could indicate that other factors beyond geographical location play a more pivotal role in influencing deal probability.
deal probability and city: Similar to the observation with regions, the analysis of deal probability and city reveals that all cities also exhibit a mean deal probability of approximately 15%. This consistency across cities further supports the notion that the likelihood of closing a deal is not heavily dependent on specific urban centers, highlighting the importance of looking beyond geographic specifics to understand deal closure dynamics.
deal probability and top-level category: A more nuanced insight emerges from the relationship between deal probability and top-level category. The “Services” category stands out with the highest mean deal probability at 40%, followed by “Transport” and “Animals” at 25%. This distinction suggests that ads within the “Services” category are significantly more likely to result in a deal, possibly due to the inherent nature of services being in higher demand or more immediately consumable compared to physical goods. This disparity underscores the potential for tailoring strategies based on category-specific demand dynamics.
deal probability and user type: The analysis of deal probability and user type reveals a notable difference in mean deal probabilities between Private Users (15%) and Shop Users (5%). This discrepancy suggests that ads posted by private individuals are three times as likely to close a deal as those posted by shops. This could be attributed to a variety of factors, including perceived trustworthiness, pricing differences, or the nature of the goods and services offered by these user types.
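The group-wise means behind this bivariate analysis amount to a simple group-by-and-average. The sketch below assumes records stored as dictionaries; the field names and the toy values are hypothetical and do not reproduce the dataset.

```python
from collections import defaultdict


def mean_by_group(records, key, target="deal_probability"):
    """Average the target value within each group defined by `key`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        sums[record[key]] += record[target]
        counts[record[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical toy records; the numbers are invented for illustration.
toy_ads = [
    {"user_type": "Private", "deal_probability": 0.2},
    {"user_type": "Private", "deal_probability": 0.4},
    {"user_type": "Shop", "deal_probability": 0.1},
    {"user_type": "Shop", "deal_probability": 0.0},
]

group_means = mean_by_group(toy_ads, key="user_type")
for group, value in sorted(group_means.items()):
    print(f"{group}: mean deal probability = {value:.2f}")
```

Passing a different `key` (for example a region or category field) would give the other panels of the comparison.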

4.1.2. Baselines

In our experiments, we benchmark our proposed methodology against a diverse set of representative models that span across three major categories in the domain of product demand prediction. This comparative analysis aims to highlight the strengths and potential of our approach in capturing the complex dynamics of demand prediction. Below is an introduction to each baseline model utilized in our study.

Linear Model:

Generalized Linear Model (GLM) [18]: A foundational approach in statistical modeling, GLM extends traditional linear regression to support various types of distribution for the target variable, such as binomial and Poisson distributions. This model is pivotal for understanding the linear relationships between the features and the target demand, serving as a baseline to assess the incremental benefits of more complex models.

Tree-based Model:

XGBoost [19]: A highly efficient and scalable implementation of gradient boosting framework, XGBoost has gained popularity for its performance in various predictive modeling competitions. It leverages an ensemble of decision trees, optimized through gradient boosting, to capture non-linear relationships and interactions among features.
LightGBM [20]: An advanced gradient boosting model that utilizes a novel tree-growing algorithm to enhance efficiency and scalability. LightGBM is designed to handle large-scale data, offering a faster training process without compromising on model accuracy.
CatBoost [21]: Another gradient boosting variant, CatBoost is renowned for its handling of categorical features directly, without the need for extensive preprocessing. It provides robust solutions to avoid overfitting, making it highly effective in diverse predictive tasks, including demand prediction.

Deep Model:

Multilayer Perceptron (MLP) [22]: MLP is a class of feedforward artificial neural network (ANN) that consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. MLP is trained with backpropagation and is capable of capturing complex non-linear relationships between inputs and outputs.
LSTM [23]: Long Short-Term Memory networks, a type of recurrent neural network (RNN) architecture, are specifically designed to address the vanishing gradient problem of traditional RNNs. LSTMs are adept at learning long-term dependencies, making them particularly suitable for time-series forecasting tasks like demand prediction.
GRU [24]: Gated Recurrent Units (GRUs) are a variant of RNNs that simplify the LSTM architecture while retaining its capability to capture dependencies over various time spans. GRUs offer a more efficient and comparably effective alternative for sequential data modeling.
CNN [25]: Convolutional Neural Networks, traditionally known for their prowess in image processing, have also been adapted for spatial prediction. By capturing spatial dependencies through their hierarchical structure, CNNs can be utilized to model the geographical information and relationships in demand data.

4.1.3. Evaluation Metrics

In assessing the performance of our product demand prediction models, we employ three key evaluation metrics that offer a comprehensive view of model accuracy and fit. These metrics are essential for quantifying the discrepancy between the actual demand values and the predictions made by our models.

Mean Absolute Error (MAE): MAE is a straightforward metric that calculates the average absolute difference between the actual demand $y_{i}$ and the predicted demand ${\hat{y}}_{i}$ across all observations. It provides an intuitive measure of the model’s accuracy, with lower values indicating better performance. The MAE is defined as

$\begin{matrix} MAE = \frac{1}{N} \sum_{i = 1}^{N} | y_{i} - {\hat{y}}_{i} |, \end{matrix}$

(5)

where N is the total number of observations. This metric is particularly useful for understanding the magnitude of prediction errors without considering their direction.
Root Mean Squared Error (RMSE): This metric offers a more sensitive measure of model accuracy by squaring the errors before averaging, thus giving greater weight to larger errors. RMSE is defined as the square root of the average of squared differences between the predicted and actual values:

$\begin{matrix} RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}, \end{matrix}$

(6)

where N represents the total number of observations. The RMSE is beneficial for identifying when a model might be prone to producing significant errors, as it penalizes larger discrepancies more heavily than smaller ones.
R-squared (R²): R-squared, also known as the coefficient of determination, measures the proportion of the variance in the dependent variable that is predictable from the independent variables. It indicates how well the regression predictions approximate the real data points. An R-squared of 1 indicates that the regression predictions perfectly fit the data. The formula for R-squared is

$\begin{matrix} R^{2} = 1 - \frac{\sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{N} {(y_{i} - \bar{y})}^{2}} \end{matrix}$

(7)

where $\bar{y}$ is the mean of the actual demand values. R-squared is a relative measure of fit that can be useful when comparing different models’ abilities to explain the variability in the data.
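Equations (5)–(7) translate directly into code. The following is a minimal pure-Python sketch; the function names and the toy values are illustrative and are not taken from the authors' implementation.

```python
import math


def mae(y_true, y_pred):
    # Equation (5): mean of absolute errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # Equation (6): square root of the mean squared error.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    # Equation (7): 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.5, 0.8]
print(f"MAE={mae(y_true, y_pred):.4f}  RMSE={rmse(y_true, y_pred):.4f}  "
      f"R2={r_squared(y_true, y_pred):.4f}")
```

Because RMSE squares the errors before averaging, the single error of 0.2 in this toy example contributes four times as much to it as the error of 0.1, which is the sensitivity to large errors noted above.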

4.1.4. Implementation Details

To ensure reliable and replicable results, we run each experiment five times, each with a different random seed, and report the average of these runs as the final performance metric of our model.

In each experiment, our model is trained for up to 50 epochs, with an early stopping mechanism in place to curb overfitting and preserve generalizability to unseen data. The spatial GNN consists of three layers, a configuration chosen to balance capturing the intricate spatial relationships within the data against over-smoothing, where node representations become indistinct and lose valuable detail.
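The early stopping logic can be sketched independently of any deep learning framework. The version below is a minimal illustration: the `patience` parameter and the validation-loss criterion are assumptions, since the text does not state which stopping rule is used, and the list of losses stands in for values that a real training loop would produce epoch by epoch.

```python
def train_with_early_stopping(val_losses, max_epochs=50, patience=5):
    """Scan per-epoch validation losses and stop once they stop improving.

    Returns (best_epoch, best_loss, epochs_run), with epochs counted from 0.
    `patience` (hypothetical) is the number of consecutive non-improving
    epochs tolerated before training is stopped.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        epochs_run = epoch + 1
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss, epochs_run


# Simulated validation losses; a real run would compute these after each epoch.
simulated_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
result = train_with_early_stopping(simulated_losses, max_epochs=50, patience=3)
print(result)
```

With `patience=3`, the scan stops after three consecutive epochs without improvement, so the later loss of 0.30 is never reached. That is the usual trade-off of early stopping: a larger patience costs more epochs but is less likely to stop on a temporary plateau.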

Our method is implemented in PyTorch v2.3.0. All experiments are conducted on an NVIDIA RTX 4090 GPU (manufactured by TSMC in Hsinchu, Taiwan) with 24 GB of memory, which accelerates training and accommodates the model and dataset sizes used in our experiments.

4.2. Overall Performance (RQ1)

Addressing Research Question 1, we assess the performance of our Spatial Graph Neural Network model in product demand prediction against linear, tree-based, and deep learning baselines, using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²). As shown in Table 1, there is a clear hierarchy in model performance. Both the tree-based models (XGBoost, LightGBM, and CatBoost) and the deep learning models (MLP, LSTM, GRU, and CNN) surpass the Generalized Linear Model (GLM), which points to the limitations of traditional linear models in capturing the nonlinear relationships inherent in real-world product demand data. While LSTM and GRU have demonstrated promise across various domains, their edge over the tree-based models in product demand prediction is marginal, suggesting that deep learning approaches need further refinement and domain-specific tailoring to reach their full potential in this area. Notably, the CNN model outperforms MLP, LSTM, and GRU, highlighting the role of spatial information in enhancing demand prediction accuracy. Our SGNN model outperforms all tree-based and deep learning baselines, achieving the lowest MAE (0.1685) and RMSE (0.1504) and the highest R² (0.9234).

The SGNN model’s superior accuracy attests to its capacity to leverage spatial relationships, offering a robust solution that could benefit the online advertisement sector. These findings support the design of the SGNN model and point to directions for future work in demand prediction. The results achieved on the Avito dataset indicate its potential as a tool for marketers and strategists aiming to optimize their online presence and engagement strategies.
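To put the gap in Table 1 in relative terms, the MAE values can be converted into a percentage reduction with respect to each baseline. The snippet below uses only the MAE column reported in Table 1; the helper name `relative_reduction` is illustrative.

```python
# MAE values copied from Table 1.
table1_mae = {
    "GLM": 0.3285,
    "XGBoost": 0.2891,
    "LightGBM": 0.2572,
    "CatBoost": 0.2689,
    "MLP": 0.2952,
    "LSTM": 0.2491,
    "GRU": 0.2438,
    "CNN": 0.2347,
    "SGNN": 0.1685,
}


def relative_reduction(baseline, model):
    # Fraction by which `model` lowers the error relative to `baseline`.
    return (baseline - model) / baseline


sgnn_mae = table1_mae["SGNN"]
reductions = {
    name: relative_reduction(value, sgnn_mae)
    for name, value in table1_mae.items()
    if name != "SGNN"
}
for name, value in reductions.items():
    print(f"SGNN vs {name}: {100 * value:.1f}% lower MAE")
```

Against the strongest baseline (CNN), this works out to roughly a 28% reduction in MAE, since (0.2347 − 0.1685) / 0.2347 ≈ 0.282.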

4.3. Ablation Study (RQ2)

To answer Research Question 2 (RQ2), we examine the contribution of individual components of our Spatial Graph Neural Network model. Specifically, we remove the spatial adjacency module and the attention-aware message propagation module from the SGNN framework, one at a time, and observe the impact on prediction accuracy. The results are compiled in Table 2, where “spatial” refers to the module that integrates spatial adjacency information and “attention” refers to the module that performs attention-aware message propagation. Removing either module results in a marked reduction in predictive performance: MAE and RMSE increase, and R² decreases. Both modules are therefore important components of our predictive framework.

The spatial adjacency module embeds geographical information into the model, allowing it to capture spatial relationships and their impact on product demand. Removing it causes the larger of the two single-module drops (MAE rises from 0.1685 to 0.1962), underscoring its contribution to capturing the spatial dynamics that accurate demand prediction depends on. Similarly, the attention-aware message propagation module enables the model to selectively prioritize the information received from neighboring nodes, which helps it discern relevant patterns and relationships; without it, MAE rises to 0.1876. In sum, the ablation study reaffirms the value of incorporating geographical information into demand prediction models and shows that spatial adjacency and attention are complementary. The performance degradation observed upon excluding these modules supports our motivation for integrating them into the SGNN framework.

To further elucidate the impact of our proposed modules, we also evaluated a stripped-down version of SGNN without both the spatial adjacency and attention-aware message propagation modules. This basic GNN model achieved an MAE of 0.2103, an RMSE of 0.2014, and an R² of 0.8976. These results, which are notably inferior to those of the full SGNN model, underscore the contributions of both modules to the model’s predictive performance. Moreover, this baseline provides a useful point of comparison with the prediction methods discussed in Section 4.2, further highlighting the advantages of our full SGNN approach.
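To make the role of attention in message propagation more concrete, the sketch below contrasts a plain mean over neighbor features with an attention-weighted sum. It is a generic illustration and not the SGNN formulation: scoring neighbors by a dot product with the center node, and the absence of any learned parameters, are simplifying assumptions made here.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_aggregate(node_feat, neighbor_feats):
    """Weighted sum of neighbor features.

    Weights come from a softmax over dot-product scores between the center
    node and each neighbor (an assumed scoring function, no learned weights).
    """
    weights = softmax([dot(node_feat, nf) for nf in neighbor_feats])
    dim = len(node_feat)
    aggregated = [
        sum(w * nf[d] for w, nf in zip(weights, neighbor_feats)) for d in range(dim)
    ]
    return aggregated, weights


def mean_aggregate(neighbor_feats):
    # Baseline without attention: every neighbor gets the same weight.
    dim = len(neighbor_feats[0])
    count = len(neighbor_feats)
    return [sum(nf[d] for nf in neighbor_feats) / count for d in range(dim)]


node = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
attended, attn_weights = attention_aggregate(node, neighbors)
print("attention weights:", [round(w, 3) for w in attn_weights])
print("attention output: ", [round(v, 3) for v in attended])
print("mean output:      ", mean_aggregate(neighbors))
```

In this toy case the neighbor that resembles the center node receives the larger weight, whereas the mean treats both neighbors equally. This is the kind of selective prioritization that the "w/o attention" row in Table 2 removes.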

4.4. Hyperparameter Robustness (RQ3)

In pursuit of answering Research Question 3 (RQ3), our investigation delves into the sensitivity of the Spatial Graph Neural Network model’s predictive accuracy to variations in key hyperparameters: the decay factor, batch size, and the number of SGNN layers. This exploration aims to discern the extent to which these parameters influence the model’s precision in predicting product demand.

Decay Factor: We initiate our exploration by adjusting the decay factor within our model across a spectrum from 0.1 to 0.9. The empirical findings, encapsulated in Table 3, reveal an intriguing pattern: while the overall performance remains relatively stable across a wide range of decay factor values, we observe a discernible dip in predictive accuracy at the extremities of this range. This phenomenon underscores the delicate balance required in setting the decay factor, as values that are too high or too low can adversely affect the scope of information propagation within the SGNN, thereby impacting learning efficiency.

Batch Size: We next examine the impact of batch size on model performance, with batch sizes ranging from 1000 to 20,000. As detailed in Table 4, the performance metrics (MAE, RMSE, and R²) are resilient to changes in batch size. Notably, the SGNN model maintains an MAE below the 0.1700 threshold for almost all tested batch sizes (the only exception is a batch size of 1000, with an MAE of 0.1702), highlighting its robustness and stability in handling varying data volumes during training.

Layer Number: The final dimension of our hyperparameter analysis focuses on the number of SGNN layers, ranging from 1 to 5. This examination, presented in Table 5, seeks to ascertain whether a deeper SGNN architecture translates to enhanced predictive performance. The findings confirm the model’s resilience to variations in layer numbers, with an optimal performance observed at a layer count of 3. Beyond this point, an increase in layers leads to a noticeable decline in performance across all metrics (MAE, RMSE, and R²), attributed to the over-smoothing issue prevalent in deeper Graph Neural Network models. This observation highlights the criticality of layer number selection in preserving the model’s capacity to generate distinct and informative node representations without succumbing to the diluting effects of over-smoothing.
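The over-smoothing effect mentioned above can be illustrated on a toy graph. The sketch below repeatedly applies plain mean aggregation (each node averages itself and its neighbors) on a four-node path graph and tracks how far apart the node values remain. It is a simplified stand-in and not the SGNN layer: there are no learned weights, no attention, and no nonlinearity.

```python
def propagate(features, adjacency):
    """One round of mean aggregation over each node and its neighbors."""
    updated = []
    for node, neighbors in enumerate(adjacency):
        group = [node] + neighbors  # include the node itself (self-loop)
        updated.append(sum(features[j] for j in group) / len(group))
    return updated


def spread(features):
    # Gap between the largest and smallest node value.
    return max(features) - min(features)


# Path graph 0 - 1 - 2 - 3 with scalar node features (toy values).
adjacency = [[1], [0, 2], [1, 3], [2]]
features = [0.0, 0.0, 1.0, 1.0]

spreads = [spread(features)]
for layer in range(1, 6):
    features = propagate(features, adjacency)
    spreads.append(spread(features))
    print(f"after {layer} layer(s): spread = {spreads[-1]:.3f}")
```

The spread never increases and shrinks as more rounds are stacked, so node representations become progressively harder to tell apart. This matches the qualitative explanation given for the decline at four and five layers in Table 5.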

This comprehensive analysis of hyperparameter sensitivity not only affirms the SGNN model’s robustness across a range of configurations but also sheds light on the optimal settings conducive to maximizing predictive accuracy. The insights garnered from this exploration provide valuable guidance for fine-tuning the SGNN framework, ensuring its adaptability and effectiveness in the dynamic landscape of product demand prediction.

5. Conclusions and Future Works

5.1. Conclusions

In this paper, we introduced an innovative approach to product demand prediction through the utilization of the Spatial Graph Neural Network, tailored for the online retail domain. By adopting a novel methodology that intricately models the spatial relationships and attributes of retail locations, we have successfully demonstrated the potential of SGNN in transcending traditional demand prediction methods. Our approach, which constructs a spatially aware graph representation of the marketplace and integrates advanced attention mechanisms, aims to capture the nuanced relationships and dependencies that shape demand patterns.

The comparative analysis against a comprehensive suite of baseline models, spanning linear, tree-based, and deep learning approaches, has underscored the SGNN framework’s superior capability in accurately predicting demand. Through rigorous evaluation employing MAE, RMSE, and R², our findings reveal the SGNN model’s adeptness at providing actionable insights, thereby empowering retailers to navigate the intricacies of the online marketplace with enhanced strategic foresight.

One of the key strengths of our SGNN approach lies in its ability to effectively capture and leverage spatial dependencies in demand patterns. By constructing a graph representation of the marketplace, our model can inherently account for geographical proximity and regional influences on product demand, a feature that traditional methods often struggle to incorporate effectively. This spatial awareness allows for more nuanced predictions that consider the complex interplay between location and demand.

While our SGNN approach demonstrates significant advantages, it is important to acknowledge its limitations. The computational complexity of SGNN models can pose challenges for very large-scale applications, particularly in scenarios with extremely high numbers of retail locations or products. Additionally, the model’s performance is dependent on the quality and completeness of the spatial data available, which may not always be consistent across different marketplaces or regions. Furthermore, while our model effectively captures spatial relationships, it may not fully account for temporal dynamics in demand patterns, which could be crucial in certain contexts. These limitations point to potential areas for future research and refinement of the SGNN approach.

5.2. Future Works

In light of the promising results demonstrated by employing the Spatial Graph Neural Network for product demand prediction in online retail, several avenues for future research and development emerge. These potential directions not only aim to refine and extend the current model’s capabilities but also seek to explore new applications and methodologies within the realm of SGNN and beyond. Below are some possible directions for future work.

Temporal Dynamics Integration: Incorporating temporal dynamics into the SGNN framework could significantly enhance the model’s predictive accuracy. Future work could explore methods for embedding time-series data into the graph structure, allowing the model to capture not only spatial relationships but also temporal patterns in consumer behavior and demand fluctuations.
Hybrid Models: Combining SGNN with other machine learning techniques, such as reinforcement learning or unsupervised learning algorithms, could lead to hybrid models that leverage the strengths of multiple approaches. For instance, reinforcement learning could optimize inventory levels dynamically based on SGNN demand predictions, offering a comprehensive solution for supply chain management.
Cross-Domain Adaptation: Exploring the applicability of SGNN in domains beyond retail, such as urban planning, transportation, and social network analysis, could unveil new insights and applications. The spatial and relational modeling capabilities of SGNN hold potential for predicting traffic flow, urban development trends, or information propagation in social networks.
Advanced Graph Architectures: Investigating more sophisticated Graph Neural Network architectures, including Graph Attention Network and Heterogeneous Graph Neural Network, could provide deeper insights into complex spatial interactions. These advanced models could better capture the heterogeneity in data types and relationships present in retail networks.
Scalability and Efficiency: Addressing the computational challenges associated with SGNN, particularly for large-scale applications, remains a critical area for future work. Developing more efficient algorithms and leveraging distributed computing frameworks could enhance the scalability and practicality of SGNN for real-world applications.
Interpretability and Explainability: Enhancing the interpretability of SGNN models is crucial for gaining insights into the underlying factors driving demand predictions. Future work could focus on developing methodologies for visualizing and interpreting graph-based models, providing valuable feedback for decision-makers in retail and other sectors.

Author Contributions

Methodology, X.W. and T.S.; Software, T.S.; Formal analysis, X.W.; Investigation, J.L.; Writing—original draft, J.L.; Writing—review & editing, L.F. and M.Z.; Supervision, M.Z.; Project administration, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: https://www.kaggle.com/competitions/avito-demand-prediction (accessed on 5 August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Kim, J.U.; Kim, W.J.; Park, S.C. Consumer perceptions on web advertisements and motivation factors to purchase in the online shopping. Comput. Hum. Behav. 2010, 26, 1208–1222.
2. Rai, S.; Gupta, A.; Anand, A.; Trivedi, A.; Bhadauria, S. Demand prediction for e-commerce advertisements: A comparative study using state-of-the-art machine learning methods. In Proceedings of the 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, 6–8 July 2019; IEEE: Piscataway, NJ, USA.
3. Kin-man To, C. Innovation process management in global fashion businesses: A review of contextual aspects. Res. J. Text. Appar. 2003, 7, 60–73.
4. Levis, A.A.; Papageorgiou, L.G. Customer demand forecasting via support vector regression analysis. Chem. Eng. Res. Des. 2005, 83, 1009–1018.
5. Jain, A.; Karthikeyan, V.; Sahana, B.; Shambhavi, B.R.; Sindhu, K.; Balaji, S. Demand forecasting for e-commerce platforms. In Proceedings of the 2020 IEEE International Conference for Innovation in Technology (INOCON), Bangluru, India, 6–8 November 2020; IEEE: Piscataway, NJ, USA, 2020.
6. Tugay, R.; Oguducu, S.G. Demand prediction using machine learning methods and stacked generalization. arXiv 2020, arXiv:2009.09756.
7. Bandara, K.; Shi, P.; Bergmeir, C.; Hewamalage, H.; Tran, Q.; Seaman, B. Sales demand forecast in e-commerce using a long short-term memory neural network methodology. In Proceedings of the Neural Information Processing: 26th International Conference, ICONIP 2019, Sydney, NSW, Australia, 12–15 December 2019; Proceedings, Part III 26. Springer International Publishing: Berlin/Heidelberg, Germany, 2019.
8. Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62.
9. Kuo, C.; Reitsch, A. Neural networks vs. conventional methods of forecasting. J. Bus. Forecast. 1995, 14, 17.
10. Azzouni, A.; Pujolle, G. A long short-term memory recurrent neural network framework for network traffic matrix prediction. arXiv 2017, arXiv:1705.05690.
11. Aburto, L.; Weber, R. Improved supply chain management based on hybrid demand forecasts. Appl. Soft Comput. 2007, 7, 136–144.
12. Islek, I.; Ögüdücü, S.G. A Decision Support System for Demand Forecasting based on Classifier Ensemble. In FedCSIS (Communication Papers); FedCSIS: Sofia, Bulgaria, 2017.
13. LeSage, J.; Pace, R.K. Introduction to Spatial Econometrics; Chapman and Hall/CRC: Boca Raton, FL, USA, 2009.
14. O’Sullivan, D. Geographically weighted regression: The analysis of spatially varying relationships. Geogr. Anal. 2003, 35, 272–275.
15. Cressie, N. The origins of kriging. Math. Geol. 1990, 22, 239–252.
16. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24.
17. Gandhi, A.; Aakanksha; Kaveri, S.; Chaoji, V. Spatio-temporal multi-graph networks for demand forecasting in online marketplaces. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases; Springer International Publishing: Cham, Switzerland, 2021.
18. Hastie, T.J.; Pregibon, D. Generalized linear models. In Statistical Models in S; Routledge: London, UK, 2017; pp. 195–247.
19. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
20. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 52.
21. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. 2018, 31.
22. Amalnick, M.S.; Habibifar, N.; Hamid, M.; Bastan, M. An intelligent algorithm for final product demand forecasting in pharmaceutical units. Int. J. Syst. Assur. Eng. Manag. 2020, 11, 481–493.
23. Abbasimehr, H.; Shabani, M.; Yousefi, M. An optimized model using LSTM network for demand forecasting. Comput. Ind. Eng. 2020, 143, 106435.
24. Shu, W.; Zeng, F.; Ling, Z.; Liu, J.; Lu, T.; Chen, G. Resource demand prediction of cloud workloads using an attention-based GRU model. In Proceedings of the 2021 17th International Conference on Mobility, Sensing and Networking (MSN), Exeter, UK, 13–15 December 2021; IEEE: New York, NY, USA, 2021.
25. Tang, Z.; Ge, Y. CNN model optimization and intelligent balance model for material demand forecast. Int. J. Syst. Assur. Eng. Manag. 2022, 13 (Suppl. S3), 978–986.

Figure 3. The histograms for mean deal probability in each group of corresponding variables, including region, user type, city, and parent category.

Table 1. The evaluation performance of various baselines and our SGNN on the Avito dataset for product demand prediction.

Method	MAE (Avito)	RMSE (Avito)	R² (Avito)
GLM	0.3285	0.3172	0.7349
XGBoost	0.2891	0.2587	0.8170
LightGBM	0.2572	0.2312	0.8346
CatBoost	0.2689	0.2563	0.8308
MLP	0.2952	0.2951	0.7652
LSTM	0.2491	0.2286	0.8569
GRU	0.2438	0.2210	0.8625
CNN	0.2347	0.2054	0.8901
SGNN	0.1685	0.1504	0.9234

Table 2. The ablation study results for our SGNN model.

Method	MAE (Avito)	RMSE (Avito)	R² (Avito)
SGNN w/o spatial and attention	0.2103	0.2014	0.8976
SGNN w/o spatial	0.1962	0.1897	0.9045
SGNN w/o attention	0.1876	0.1721	0.9108
SGNN	0.1685	0.1504	0.9234

Table 3. The hyperparameter robustness analysis on the decay factor.

Decay Factor	MAE (Avito)	RMSE (Avito)	R² (Avito)
0.1	0.1958	0.1792	0.9063
0.3	0.1731	0.1565	0.9187
0.5	0.1685	0.1504	0.9234
0.7	0.1786	0.1592	0.9146
0.9	0.1901	0.1712	0.9084

Table 4. The hyperparameter robustness analysis on the batch size (Avito dataset).

Batch Size	MAE	RMSE	R²
1000	0.1702	0.1535	0.9216
2000	0.1685	0.1504	0.9234
5000	0.1674	0.1492	0.9257
10,000	0.1668	0.1487	0.9261
20,000	0.1665	0.1483	0.9273

Table 5. The hyperparameter robustness analysis on the number of layers (Avito dataset).

Layer Number	MAE	RMSE	R²
1	0.1923	0.1794	0.9082
2	0.1746	0.1593	0.9175
3	0.1685	0.1504	0.9234
4	0.1891	0.1750	0.9126
5	0.2187	0.2023	0.8927

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
