Dynamic Pricing with Parametric Demand Learning and Reference-Price Effects

Wang, Bing; Bi, Wenjie; Liu, Haiying

doi:10.3390/math11102387

by Bing Wang ¹, Wenjie Bi ¹ and Haiying Liu ²,*

¹ Business School, Central South University, No. 932, Lushan South Road, Changsha 410083, China

² School of Accounting, Hunan University of Finance and Economics, No. 139, Fenglin Second Road, Changsha 410205, China

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(10), 2387; https://doi.org/10.3390/math11102387

Submission received: 5 April 2023 / Revised: 13 May 2023 / Accepted: 19 May 2023 / Published: 21 May 2023

Abstract:

In reality, sellers face challenges in obtaining perfect demand information. Demand is influenced not only by price but also by behavioral factors such as reference effects, which complicate optimal pricing for enterprises. To address this problem, we propose a dynamic pricing model that incorporates demand learning and considers consumer reference effects. Using the Bayesian method and based on historical sales and prices, sellers can learn about demand patterns. We analyze the model to determine the existence of an optimal solution and provide an algorithm to solve it. Our numerical simulation demonstrates that the total consumer demand and the impact of price on demand remain relatively stable over time. However, the factors influencing the reference effects exhibit greater variability. Sellers can also gain insights into market demand through their learning behavior in each phase and adjust production based on market size. For instance, our simulation shows an increase in market demand over time, allowing the seller to adjust the production plan according to the demand change.

Keywords:

dynamic pricing; demand learning; Bayesian updating; reference effect

MSC:

90B50

1. Introduction

At the macro level, economic growth mainly relies on investment, exports, and consumption [1,2], while at the micro level, revenue management is the main means for enterprises to achieve revenue growth. Dynamic pricing, one of the most powerful revenue management techniques, has been widely applied in various industries, such as airlines, hotels, fashion, and retail [3]. The accurate prediction of demand is crucial for effective dynamic pricing strategies, leading many scholars to develop different demand models [4]. However, these models assume that sellers possess complete market information and fully understand the market demand, implying a determinate demand function. For example, in the classic linear demand function, this underlying assumption implies that the seller knows the specific values of all parameters of the demand function [5,6]. In reality, acquiring complete information on the market size is impractical for sellers, many of which often have a limited understanding of market size and product demand. This uncertainty poses pricing challenges for the seller. To address this, some scholars have proposed parametric demand learning [7,8,9]. However, most of these studies focus on one-time parameter estimation and develop price strategies based on a deterministic demand function, neglecting the dynamic nature of the learning process. In the retail industry, sellers are increasingly recognizing the need for gradual adjustment in demand estimation throughout the entire sales period [10]. Therefore, this paper focuses on dynamic pricing within the context of dynamic demand learning.

Whether a demand learning strategy can achieve good performance depends on the following three main points: First, the data source of learning plays a crucial role as the quality of the data directly impacts the learning outcomes. When the correlation between the data and the real demand is weak, the effectiveness of demand learning is compromised. Secondly, the parameter updating mechanism or learning method is essential. Since demand learning spans the entire sales period, each learning method will obtain parameter estimates, and it is vital to reasonably update them. Bayesian rule, a widely employed learning method, offers many advantages. It exhibits strong convergence properties, enables learning with limited data, and achieves more accurate results as the dataset grows. Additionally, the Bayesian model enhances interpretability, facilitating a clear understanding of underlying principles. Therefore, we adopt the Bayesian method for demand learning. The third point pertains to the construction of the demand model. If the established demand model fails to fully capture the factors influencing demand, the learning results will inevitably deviate from the real demand.

Historical prices and sales serve as an excellent data source for demand learning. Using such data offers several advantages. First, historical sales can be regarded as price experiments in a real business environment, providing direct insights into the relationship between price and demand. Second, these data do not entail additional costs for sellers to acquire. In contrast, conducting market research or questionnaire surveys often incurs additional costs. Finally, advances in information technology have facilitated the storage and analysis of historical data.

In addition to data, another critical aspect of demand learning is understanding the factors that influence market demand. In the realm of dynamic pricing, two key factors affecting market demand during a sales period are the current price of the product and its historical prices. The impact of the current price on demand is evident. Meanwhile, historical prices impact product demand through consumer reference price effects (hereafter referred to as reference effects). Specifically, a product’s historical prices shape consumers’ price expectations, serving as reference prices. When the current price exceeds their expectations, their willingness to purchase decreases. Conversely, when the current price is lower than expected, they are more willing to purchase the product. Numerous studies emphasize the importance of reference effects in demand estimation and revenue management for businesses, especially within the retail industry [11,12]. Mazumdar et al. [13] conducted a literature review on reference prices, examining their formation, usage, and impact on the purchase decisions of consumers. The findings indicated that reference prices have a crucial effect on enterprise management decisions. Cornelsen, Mazzocchi, and Smith [14] provided evidence of consumer reference effects by analyzing household-level data encompassing food prices and purchases. They found significant biases in pricing strategies when reference effects were not considered. Mehra, Sajeesh, and Voleti studied how the reference price in the non-durable goods market affects a firm’s product positioning and pricing strategy [15].

Based on the above analysis and findings, this paper investigates the problem of how a monopolistic seller determines the optimal prices for multiple sales periods in the presence of demand uncertainty and consumer reference price effects. Specifically, at the beginning of each period, the seller has only a rough estimate of the parameters of the demand function. Using deterministic dynamic programming, the seller solves for the optimal price at that time and conducts sales based on the solved price. At the end of the period, the seller re-estimates the demand function parameters using the Bayesian method while simultaneously initiating the process of solving for the next period’s optimal price. This iterative process continues until the end of the sales period.

The contribution of this paper is three-fold. First, it examines demand learning and reference price effects jointly. Unlike the existing literature, this paper adopts a dynamic learning process. Specifically, sellers can use the data generated during each sales period (i.e., prices and sales) for demand learning and update previous learning results. This dynamic approach closely aligns with real-world business applications. Second, we propose an effective algorithm for approximating Bayesian dynamic programming. This enables sellers to make precise real-time price adjustments. Finally, this paper shows that reference price-influencing factors exhibit greater variation than price-influencing factors in demand learning. This indicates that consumers are less heterogeneous in their price sensitivity than in their response to reference prices. Therefore, exploring mechanisms for heterogeneous reference price formation is crucial for both businesses and academia.

The remainder of this paper is organized as follows: Section 2 reviews the relevant literature. Section 3 introduces the dynamic pricing model incorporating the reference price and extends it to the case of uncertain demand, that is, dynamic pricing based on the reference effect and demand learning. Section 4 describes the approximate solution algorithm for the proposed model. Section 5 presents numerical analysis and discussions, while Section 6 concludes the study.

2. Literature Review

Two research streams are related to our work: dynamic pricing with uncertain demand and dynamic pricing with reference price effects.

Depending on the form of the demand function, the current literature on uncertain demand can be divided into two categories. One category focuses on demand functions where price is the only variable, while the other considers demand as a function of price and other non-price variables, which are collectively referred to as covariates. In the case of the former, three main research approaches are prevalent. The first approach employs statistical methods, such as classical maximum likelihood or least-squares estimators, for parameter estimation [16,17]. Keskin and Zeevi [18] used the greedy iterated least-squares (GILS) method to address a multi-product, limited inventory dynamic pricing problem with partially known demand distribution but unknown linear demand function parameters. They showed that their semi-myopic approximation strategy incurs an expected revenue loss of order log(T) and provided sufficient conditions for the asymptotic optimality of any strategy. Den-Boer and Zwart [19] investigated the dynamic pricing problem with uncertain demand function parameters under limited inventory. They employed maximum likelihood estimation to calculate the unknown parameters of the demand function, utilizing price, demand, and inventory data for each sales period. Due to the endogenous learning characteristics, the calculated price strategy yielded an expected revenue loss of O(\log^{2} T).

The second approach involves employing the Bayesian framework. Aoki [20] was the first to apply this framework in the study of demand learning, and subsequent scholars have built upon his research [21,22]. Harrison et al. [23] investigated dynamic pricing in financial services, where the parameters of the underlying demand model are unknown, but the seller can learn these parameters by observing the outcomes of each sales attempt. Ghate [24] focused on the dynamic auction–design problem, where the seller faced uncertainty regarding the market response. They formulated the problem as a Bayesian Markovian decision process, where the seller learned more about the demand by observing the number of posted bids. Uncertainty in demand at the individual consumer level, reflected in the unknown consumer’s arrival rate, has also received scholarly attention. For example, Mason and Välimäki [25] analyzed the optimal selling of an asset when demand is uncertain, with the seller acquiring knowledge of the arrival rate through a Bayesian approach.

The third approach involves utilizing online learning methods. This method employs price experiments to gather data, estimates parameters based on the data, and finally solves the optimization problem using the estimated parameters. A notable characteristic of this method is the trade-off it strikes between exploration and exploitation. Keskin and Zeevi [26] studied a dynamic pricing problem where a seller encounters an unknown demand model that can change over time. Yang et al. [27] designed an algorithm to facilitate synchronized dynamic pricing, allowing competitive firms to estimate their demand functions through observations and adjust their pricing strategies accordingly.

In reality, various factors beyond price influence demand, including product attributes and customer characteristics. Qiang and Bayati [28] considered linear demand as a function of price and additional information, such as demand covariates, including marketing expenditure, geographic information, customer socio-economic characteristics, macroeconomic indicators, and weather information. When the seller sells a large number of products, Javanmard and Nazerzadeh [29] assumed that consumers make purchase decisions based on a general choice model incorporating product features and their own characteristics, with unknown model parameters. Cohen et al. [30] posited that the demand function is determined by a linear combination of multiple product attributes, with unknown coefficient values that the seller can learn through a feasible collection. Chen and Jiang [31] employed the GILS method to segment the market into high- and low-quality product markets, where the seller lacked advanced knowledge of customer demand information for quality. They found that, under demand uncertainty, a high-quality firm may yield higher profits when initially setting prices. Ban and Keskin [32] examined dynamic pricing at the individual customer level, with customers encoding their personalized characteristics as a d-dimensional feature vector. The personalized demand model’s parameters depended on s out of the d features. The seller could learn the relationship between customer features and product demand through sales observations.

The above literature review demonstrates a wealth of research on demand uncertainty; however, it does not consider the impact of certain behavioral factors of consumers on demand.

In the field of dynamic pricing with reference effects, researchers have proposed various mechanisms for forming reference prices based on the different relationships between the reference and the historical prices. Popescu and Wu [33] investigated the dynamic pricing problem for a single monopoly producer facing consumers with reference effects. They assumed that the reference price is an exponential smoothing of historical prices, influenced by all historical prices. The authors theoretically proved that the price strategy converges to a steady state and provided numerical results indicating that the steady state price value decreases with customers’ memory and is sensitive to the reference effect. Yang et al. [34] extended Popescu and Wu’s [33] research by examining the dynamic pricing problem for a single monopoly producer under limited inventory and random demand, in which the consumers’ arrival in each period is subject to a Poisson distribution and the consumers’ reference price is also an exponential smoothing of historical prices. They explore how the reference effect influences the initial price, pricing trend, price dispersion, and expected revenue, along with the fixed pricing (FP), dynamic pricing (DP), and dynamic pricing with reference (DPR) policies. However, alternative reference price update mechanisms have been proposed. Nasiry and Popescu [35] suggested that the reference price update of consumers is determined by the maximum (minimum) historical price of the product and the price of the latest period (known as the peak-to-end law). They established a single-product dynamic pricing model based on the peak-to-end law and found that, if loss-averse consumers are anchored at a low price, the steady price range widens. Bi et al. [36] extended Nasiry and Popescu’s [35] model to multiple products and showed that manufacturers can lower consumers’ reference price by reducing the price of core products, thereby attracting more customers. 
Additionally, Bi et al. [37] considered consumers’ reference effect as a linear combination of prices within the memory window. They assumed that these memory factors are random variables subject to Markov first-order stochastic processes and studied the dynamic pricing problem for monopoly sellers. Drawing on prospect theory, consumers feel a “gain” (“loss”) when the reference price is greater (less) than the current price, and they react asymmetrically to these two situations (i.e., they exhibit loss-avoidance behavior). This is an aspect that has received a lot of attention in the literature with regard to reference prices [38,39,40,41]. Dynamic pricing models that integrate reference effects with other consumer behaviors have also attracted the interest of several scholars in recent years. For example, Chen et al. [42] investigated dynamic pricing where consumers exhibit both reference effects and strategic behavior.

Some scholars have jointly explored dynamic pricing based on demand learning and reference price effects. In a discrete dynamic pricing scenario, Cao et al. [43] assumed that the customer arrival rate is unknown to the seller and is influenced by the reference price. They found that overlooking the reference price effect could result in substantial revenue losses. Den and Keskin [44] designed a policy involving gradual changes in the selling price to control the evolution of the reference price. They used accumulated sales data to balance learning and earning, employing an online learning framework. Our study differs from both. We incorporate the reference effect into the linear demand model in a continuous dynamic pricing scenario, where the parameters of the demand function are unknown to the seller and are updated using the Bayesian method.

3. Model

This section describes the main setup of the study. To provide a better understanding of the dynamic pricing problem based on demand learning and reference effects, we first introduce the dynamic pricing problem with reference effects and then develop a model that considers the seller’s demand learning behavior. Before presenting the model, we define its notation, as shown in Table 1.

3.1. Dynamic Pricing Model Based on Consumer Reference Effect

According to the assumptions of Popescu and Wu [33], as well as Nasiry and Popescu [35], the seller in this study operates in a monopoly and continues to provide a product to the market. Owing to the existence of the reference effect, consumers’ purchase decisions are influenced by current and historical prices, i.e., the reference effect affects demand. In the aforementioned studies, the influence of the reference effect on demand takes the following form:

D (p, r) = D (p, p) + R (p - r, r)

(1)

Generally, D (p, r) is a non-negative, bounded, and continuous function that decreases in the price p and increases in the reference price r. The reference effect R (x, r) decreases in x and is twice differentiable in x and r, with R (x, r) > 0 for x < 0, R (x, r) < 0 for x > 0, and R (x, r) = 0 for x = 0.

For example, Nasiry and Popescu [35] assumed that the reference effect affects demand linearly, namely:

D (p, r) = (β_{0} + β_{1} \cdot p) + β_{2} \cdot \max \{p - r, 0\} + β_{3} \cdot \min \{p - r, 0\}

(2)

where β_{0} is the total market volume, β_{1} is the price influencing factor, and β_{2} and β_{3} are the reference price influencing factors, with β_{0} > 0 and β_{1}, β_{2}, β_{3} \leq 0. If β_{2} < β_{3}, consumers are loss-averse: a given gap between the price and the reference price lowers demand more when it is perceived as a “loss” than it raises demand when it is perceived as a “gain”. Conversely, consumers are loss-seeking if β_{2} > β_{3}. When β_{2} = β_{3}, consumers are loss-neutral and the demand function is a smooth curve.
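The asymmetry described above can be checked numerically. The following is a minimal sketch of Equation (2), not the authors' code; the function name `demand` and the coefficient values are illustrative assumptions chosen so that β_{2} < β_{3} \leq 0 (loss-averse consumers).

```python
def demand(p, r, beta0, beta1, beta2, beta3):
    """Linear demand with a kinked reference effect, following Equation (2)."""
    loss = max(p - r, 0.0)   # price above the reference price ("loss")
    gain = min(p - r, 0.0)   # price below the reference price ("gain"), non-positive
    return beta0 + beta1 * p + beta2 * loss + beta3 * gain

# Hypothetical, loss-averse coefficients (beta2 < beta3 <= 0); not taken from the paper.
params = dict(beta0=100.0, beta1=-20.0, beta2=-50.0, beta3=-30.0)

# Hold the price fixed and move only the reference price, so that the
# direct price effect (beta1 * p) is the same in all three cases.
base = demand(4.5, 4.5, **params)        # p == r: no reference effect
loss_case = demand(4.5, 4.3, **params)   # price 0.2 above the reference
gain_case = demand(4.5, 4.7, **params)   # price 0.2 below the reference

print(base - loss_case)   # demand lost to a 0.2 "loss"
print(gain_case - base)   # demand won from a 0.2 "gain"
```

With these illustrative numbers, the same 0.2 gap costs more demand as a loss than it adds as a gain, which is the loss-aversion case β_{2} < β_{3}.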

The vector form of Equation (2) is as follows:

\begin{array}{rl} D (p, r) & = (β_{0}, β_{1}, β_{2}, β_{3}) \cdot {(1, p, \max \{p - r, 0\}, \min \{p - r, 0\})}^{'} \\ & = θ \cdot {(1, p, \max \{p - r, 0\}, \min \{p - r, 0\})}^{'} . \end{array}

(3)

The single-period profit of the seller is as follows:

Π (p, r) = (p - c) [D (p, p) + R (p - r, r)] = π (p) + (p - c) R (p - r, r)

(4)

where π (p) = (p - c) D (p, p) is the profit in the absence of a reference effect. As in Bi et al. [28], π (p) is non-monotonic and concave in p, whereas (p - c) R (p - r, r) is concave in p and supermodular in (p, r).

If the demand function is linear, the single-period profit can be written as follows:

Π (p, r) = (p - c) [(β_{0} + β_{1} \cdot p) + β_{2} \cdot \max \{p - r, 0\} + β_{3} \cdot \min \{p - r, 0\}]

(5)

Considering the seller’s long-term profit, the description of the dynamic pricing model with the reference effect is as follows:

\begin{array}{l} V (r_{0}) = \max_{p_{t} \in [\underline{p}, \bar{p}]} \sum_{t = 1}^{\infty} γ^{t} E (Π (p_{t}, r_{t})) \\ \text{s.t.} \quad r_{t + 1} = g (p_{t}, r_{t}) . \end{array}

(6)

where E (\cdot) denotes the expectation, γ is the discount factor, r_{0} is the initial state variable, and the constraint represents the update rule for the reference price. The current reference price r_{t} is a function of previous prices. In Popescu and Wu [33], the reference price is a weighted average of historical prices, where more recent prices are assigned a larger weight. In Nasiry and Popescu’s [35] study, the reference price is a linear combination of prices in the memory window. In this study, we adopted the same reference price update rule as Popescu and Wu [33], as follows:

r_{t + 1} = α r_{t} + (1 - α) p_{t}

(7)

where α \in [0, 1) is the memory factor. Larger values of α represent a longer-term memory, so the reference price depends more strongly on historical prices.
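To illustrate the role of the memory factor, the sketch below iterates Equation (7) for a short and a long memory. The helper names `update_reference` and `reference_path` and the price sequence are hypothetical and only serve the illustration.

```python
def update_reference(r, p, alpha):
    """One step of Equation (7): r_{t+1} = alpha * r_t + (1 - alpha) * p_t."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return alpha * r + (1.0 - alpha) * p

def reference_path(r0, prices, alpha):
    """Reference prices, starting at r0, generated by a sequence of posted prices."""
    path = [r0]
    for p in prices:
        path.append(update_reference(path[-1], p, alpha))
    return path

# Hypothetical scenario: the reference price starts at 4.2 and the seller
# then posts a price of 5.0 for three consecutive periods.
prices = [5.0, 5.0, 5.0]
short_memory = reference_path(4.2, prices, alpha=0.2)
long_memory = reference_path(4.2, prices, alpha=0.8)

print([round(r, 4) for r in short_memory])
print([round(r, 4) for r in long_memory])
```

With α = 0.2 the reference price has almost caught up with the new price after three periods, whereas with α = 0.8 it is still anchored closer to the old level.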

In Model (6), p_{t} is the decision variable and r_{t} is the only state variable. The goal is to choose the price p_{t} in each period so as to maximize profit over the entire sales horizon. Combined with Equation (7), the Bellman equation of Model (6) can be written as follows:

\begin{array}{l} V (r_{t}) = \max_{p_{t} \in [\underline{p}, \bar{p}]} \{ Π (p_{t}, r_{t}) + γ V (r_{t + 1}) \} \\ \text{s.t.} \quad r_{t + 1} = α r_{t} + (1 - α) p_{t} . \end{array}

(8)

Using the first-order condition for profit maximization and the envelope theorem, we can derive that the steady-state price p^{*} of the Bellman Equation (8) must satisfy the following equation:

\frac{π^{'} (p)}{1 - γ} = \frac{(p - c) λ (p)}{1 - α γ}

(9)

where π^{'} (p) is the derivative of π (p), the part of the single-period profit that is affected only by price, and λ (p) is determined by the partial derivative of R (x, r) with respect to x at x = 0 (see Popescu and Wu [33] for details).
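The steady-state condition can be traced in a few lines. The following is a sketch rather than the authors' derivation: it assumes that V is differentiable, that the optimum is interior, and that the reference price evolves as in Equation (7). The subscript notation (Π_{p}, Π_{r}, R_{x} for partial derivatives, r' for next period's reference price) is introduced here for convenience.

```latex
\begin{align*}
V(r) &= \max_{p}\ \Pi(p, r) + \gamma V\bigl(\alpha r + (1-\alpha) p\bigr),
  \qquad \Pi(p, r) = \pi(p) + (p - c) R(p - r, r) \\
\text{first-order condition:}\quad
  0 &= \Pi_{p}(p, r) + \gamma (1-\alpha) V'(r') \\
\text{envelope theorem:}\quad
  V'(r) &= \Pi_{r}(p, r) + \gamma \alpha V'(r') \\
\text{steady state } (p = r = r'):\quad
  V'(p) &= \frac{\Pi_{r}(p, p)}{1 - \alpha\gamma} \\
\text{since } R(0, r) = 0 \text{ for all } r:\quad
  \Pi_{p}(p, p) &= \pi'(p) + (p - c) R_{x}(0, p), \qquad
  \Pi_{r}(p, p) = -(p - c) R_{x}(0, p) \\
\Rightarrow\quad
  \frac{\pi'(p)}{1-\gamma} &= \frac{-(p - c) R_{x}(0, p)}{1 - \alpha\gamma}
\end{align*}
```

This agrees with Equation (9) if λ (p) is read as -R_{x} (0, p) \geq 0; that sign convention is inferred from the derivation and is not stated in the text. As a rough check under the symmetric linear demand of Assumption 1, where R (x, r) = β_{2} x, the parameter values β_{0} = 100, β_{1} = -20, β_{2} = -50, c = 4, α = 0.5, and γ = 0.95 give a steady-state price of about 4.45 for the infinite-horizon problem with known parameters.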

3.2. Dynamic Pricing Model Based on Seller Demand Learning and Consumer Reference Effect

This section introduces a dynamic pricing model that considers the reference effect and demand learning. A distinct feature of our model is that the seller does not have prior knowledge of the coefficients of the demand function; namely, θ in Equations (2) and (3) is not a known constant. However, the seller can learn these coefficients from historical information, such as sales and prices. The seller’s goal is to maximize the expected total discounted profit. To facilitate the analysis, we make several assumptions.

Assumption 1. The demand function is a linear combination of price and reference price, as follows:

D (p_{t}, r_{t}) = (β_{0, t}, β_{1, t}, β_{2, t}) \cdot {(1, p_{t}, p_{t} - r_{t})}^{'} + ε_{t} = θ_{t} \cdot {(1, p_{t}, p_{t} - r_{t})}^{'} + ε_{t} = d (p_{t}, r_{t}; θ_{t}) + ε_{t}

(10)

where \{ε_{t}\} is a sequence of independent, identically distributed random variables with a mean of zero and finite variance, and θ_{t} = (β_{0, t}, β_{1, t}, β_{2, t}) \in Θ. In this study, we do not consider the asymmetry between “gain” and “loss”.

Although the demand function coefficients are unknown, the seller knows the a priori probability density function for θ, which is denoted by f_{0} (θ). In the first period, the seller sets the price according to f_{0} (θ) and the initial reference price r_{0}, while consumers observe p_{1} and make their purchase decisions. At the beginning of the second period, the seller updates θ based on p_{1}, D_{1}, and f_{1} (θ), sets p_{2}, and continues in this way until the end of the entire sales period (we assume that selling occurs over T discrete time periods). For convenience, D_{t} and D (p_{t}, r_{t}) are used interchangeably in this study. We denote the information available at the beginning of the t-th selling period as H_{t} = \{D_{t - 1}, p_{t - 1}, H_{t - 1}\}, with H_{0} = \{r_{0}, f_{0} (θ)\}. Following Aoki [20], the seller uses the Bayesian method to update the coefficients. Specifically,

f (θ | H_{t + 1}) = \frac{f (θ | H_{t}) f (D_{t} | H_{t}, p_{t}, θ)}{f (D_{t} | H_{t}, p_{t})} = \frac{f (θ | H_{t}) f (D_{t} | p_{t}, θ)}{f (D_{t} | H_{t}, p_{t})}

(11)

where f (θ | H_{t + 1}) represents the posterior probability density function of θ at the beginning of period t + 1, and f (θ | H_{0}) = \frac{f_{0} (θ) f (D_{0} |p_{0}, θ)}{\int_{Θ} f_{0} (θ) f (D_{0} |p_{0}, θ) d θ}.

Assumption 2. ε_{t} obeys a Gaussian distribution with mean zero and standard deviation σ.

According to Assumption 2, we can derive the following:

f (D_{t} | p_{t}, θ) = \frac{1}{\sqrt{2 π} σ} \exp (- \frac{1}{2 σ^{2}} {[D_{t} - d (p_{t}, r_{t}; θ)]}^{2})

Additionally, based on Aoki’s [20] analysis, if f_{0} (θ) does not belong to a known family of distributions, the set H_{t} grows over time and computing the posterior in Equation (11) becomes very costly. Therefore, we put forward the following assumption.

Assumption 3. f_{0} (θ) is a normal probability density function with mean μ_{0} and covariance Λ_{0}.

By repeating the calculation of Equation (11), we obtain

θ_{t} = \frac{1}{σ^{2}} Λ_{t} C_{t} X_{t} + Λ_{t} Λ_{0}^{- 1} μ_{0}

(12)

where

X_{t} = [\begin{matrix} D (p_{1}, r_{1}) \\ \vdots \\ D (p_{t - 1}, r_{t - 1}) \end{matrix}],

C_{t} = [\begin{matrix} p_{1} - r_{1} & \dots & p_{t - 1} - r_{t - 1} \\ p_{1} & \dots & p_{t - 1} \\ 1 & \dots & 1 \end{matrix}],

σ^{2} Λ_{t}^{- 1} = σ^{2} Λ_{t - 1}^{- 1} + (\begin{matrix} p_{t - 1} - r_{t - 1} \\ p_{t - 1} \\ 1 \end{matrix}) (p_{t - 1} - r_{t - 1}, p_{t - 1}, 1),

and

σ^{2} Λ_{0}^{- 1} = (\begin{matrix} λ_{1} & λ_{4} & λ_{5} \\ λ_{4} & λ_{2} & λ_{6} \\ λ_{5} & λ_{6} & λ_{3} \end{matrix}).

Moreover, λ_{1}, \dots, λ_{6} are constants. Therefore, we rewrite Equation (12) as:

θ_{t} = (I - K_{t}) [θ_{t - 1} + \frac{Λ_{t - 1}}{σ^{2}} (\begin{matrix} p_{t - 1} - r_{t - 1} \\ p_{t - 1} \\ 1 \end{matrix}) D (p_{t - 1}, r_{t - 1})]

(13)

where

K_{t} = \frac{\frac{Λ_{t - 1}}{σ^{2}} (\begin{matrix} {(p_{t - 1} - r_{t - 1})}^{2} & (p_{t - 1} - r_{t - 1}) \cdot p_{t - 1} & p_{t - 1} - r_{t - 1} \\ (p_{t - 1} - r_{t - 1}) \cdot p_{t - 1} & {(p_{t - 1})}^{2} & p_{t - 1} \\ p_{t - 1} - r_{t - 1} & p_{t - 1} & 1 \end{matrix})}{1 + (p_{t - 1} - r_{t - 1}, p_{t - 1}, 1) \frac{Λ_{t - 1}}{σ^{2}} (\begin{matrix} p_{t - 1} - r_{t - 1} \\ p_{t - 1} \\ 1 \end{matrix})}

and I is the identity matrix. Equation (13) gives the final update mechanism for the demand function coefficients.
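The recursion in Equation (13) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: `S` stands for Λ_{t - 1} / σ^{2}, the coefficient vector is ordered (β_{2}, β_{1}, β_{0}) so that it lines up with the regressor (p - r, p, 1) used in Equations (12) and (13), the covariance recursion (I - K_{t}) Λ_{t - 1} is the Sherman-Morrison form of the precision update given above, and the prior covariance and the observed demand are hypothetical values.

```python
def mat_vec(A, v):
    """Matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def bayes_update(theta, S, x, d):
    """One step of Equation (13).

    theta : current posterior mean, ordered like the regressor x
    S     : Lambda_{t-1} / sigma^2 (scaled posterior covariance)
    x     : regressor (p - r, p, 1) of the period just ended
    d     : demand observed in that period
    Returns the new mean and the new scaled covariance (I - K) S.
    """
    n = len(x)
    Sx = mat_vec(S, x)                                   # S x
    denom = 1.0 + sum(a * b for a, b in zip(x, Sx))      # 1 + x' S x
    # I - K, with K = S x x' / (1 + x' S x)
    I_K = [[(1.0 if i == j else 0.0) - Sx[i] * x[j] / denom for j in range(n)]
           for i in range(n)]
    new_theta = mat_vec(I_K, [theta[i] + Sx[i] * d for i in range(n)])
    new_S = [[sum(I_K[i][k] * S[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return new_theta, new_S

# Prior mean ordered (beta2, beta1, beta0); values from Section 5.1.
theta_prior = [-50.0, -20.0, 100.0]
# Hypothetical diagonal prior covariance (scaled by sigma^2); not from the paper.
S_prior = [[25.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 100.0]]

x = [4.4 - 4.2, 4.4, 1.0]                 # p = 4.4, r = 4.2
predicted = sum(a * b for a, b in zip(theta_prior, x))
observed = 4.0                            # hypothetical realized demand
theta_post, S_post = bayes_update(theta_prior, S_prior, x, observed)

print(predicted, [round(v, 3) for v in theta_post])
```

Two properties of the update are worth noting. The posterior prediction for the observed regressor is a weighted average of the prior prediction and the observation, and an observation that exactly equals the prior prediction leaves the estimate unchanged, so learning is driven entirely by prediction errors.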

Based on Assumptions 1–3, the stochastic dynamic pricing model based on demand learning and reference effects is expressed as follows:

\begin{array}{l} E (V (r_{0} |H_{0})) = \max_{p_{t} \in [\underline{p}, \bar{p}]} \sum_{t = 1}^{T} γ^{t} Π (p_{t}, r_{t}; θ_{t}) = \max_{p_{t} \in [\underline{p}, \bar{p}]} \sum_{t = 1}^{T} γ^{t} (p_{t} - c) (θ_{t} \cdot {(1, p_{t}, p_{t} - r_{t})}^{'}) \\ \text{s.t.} \quad r_{t + 1} = α r_{t} + (1 - α) p_{t}, \\ \phantom{\text{s.t.}} \quad θ_{t} = (I - K_{t}) [θ_{t - 1} + \frac{Λ_{t - 1}}{σ^{2}} (\begin{matrix} p_{t - 1} - r_{t - 1} \\ p_{t - 1} \\ 1 \end{matrix}) D (p_{t - 1}, r_{t - 1})] . \end{array}

(14)

4. Model Analysis and Solution

In this section, we discuss a solution to the dynamic pricing model based on the reference effect and demand learning.

The Bellman equation for Model (14) is as follows:

\begin{array}{l} E (V (r_{t} |H_{t})) = \max_{p_{t} \in [\underline{p}, \bar{p}]} \{ (p_{t} - c) (θ_{t} \cdot {(1, p_{t}, p_{t} - r_{t})}^{'}) + γ E (V (r_{t + 1} |H_{t + 1})) \} \\ \text{s.t.} \quad E (V (r_{T + 1} |H_{T + 1})) = 0 . \end{array}

(15)

In the last period, the seller has the information D_{1} (p_{1}, r_{1}; θ_{1}), \dots, D_{T - 1} (p_{T - 1}, r_{T - 1}; θ_{T - 1}) about the demand realized in past sales periods, together with the realized prices p_{1}, p_{2}, \dots, p_{T - 1} and θ_{T - 1}. The seller can compute θ_{T} using Equation (13). Thus, the optimal price can be solved as

p_{T}^{*} = \arg \max_{p_{T}} E [(p_{T} - c) (θ_{T} \cdot {(1, p_{T}, p_{T} - r_{T})}^{'}) |H_{T}]

In the penultimate period, the optimal prices p_{T - 1} and p_{T} are chosen to maximize

E [(p_{T - 1} - c) (θ_{T - 1} \cdot {(1, p_{T - 1}, p_{T - 1} - r_{T - 1})}^{'}) + V_{T} |H_{T - 1}]

Here, E [(p_{T - 1} - c) (θ_{T - 1} \cdot {(1, p_{T - 1}, p_{T - 1} - r_{T - 1})}^{'})] is a function of (p_{T - 1}, r_{T - 1}; θ_{T - 1}), and E (V_{T} |H_{T - 1}) can also be considered a function of (p_{T - 1}, r_{T - 1}; θ_{T - 1}). At time T - 1, the seller knows r_{T - 1} and θ_{T - 1}; thus, the optimal prices p_{T - 1} and p_{T} can be deduced.

Carrying this analysis back to the first period, we find that an optimal solution exists. However, we cannot obtain a closed form for E (V (r_{t} |H_{t})), and it is difficult to find the optimal solution. Therefore, we propose a numerical method to obtain an approximate solution. Specifically, in each period t, the seller updates θ_{t} and fixes it. Model (14) then becomes a deterministic dynamic programming model, and we solve for p_{t} by backward recursion according to the Bellman equation. Given (p_{t}, θ_{t}), the seller observes D_{t}, updates θ_{t} to θ_{t + 1}, and repeats the procedure until the end of the time horizon. The procedure is given in Algorithm 1:

Algorithm 1. Compute p_{t}^{*} and θ_{t}.

Initialization: r_{0}, p_{0}, θ_{0}, Λ_{0} / σ^{2}, α, γ, c, T
for t = 1, 2, \dots, T do
    compute r_{t} = α r_{t - 1} + (1 - α) p_{t - 1}
    for k = T, T - 1, \dots, t do
        compute V (r_{k}) = (p_{k} - c) (θ_{t} \cdot {(1, p_{k}, p_{k} - r_{k})}^{'}) + γ V (r_{k + 1})
    end for
    p_{t}^{*} = \underset{p_{t} \in [\underline{p}, \bar{p}]}{\arg \max} V (r_{t})
    D_{t} = θ_{t} \cdot {(1, p_{t}^{*}, p_{t}^{*} - r_{t})}^{'}
    update θ_{t + 1} = (I - K_{t + 1}) [θ_{t} + \frac{Λ_{t}}{σ^{2}} (\begin{matrix} p_{t} - r_{t} \\ p_{t} \\ 1 \end{matrix}) D (p_{t}, r_{t})]
end for
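The loop in Algorithm 1 can be sketched as follows. This is a minimal illustration under several assumptions, not the authors' implementation: prices come from a small hypothetical grid, the inner problem is solved by exhaustive backward recursion with the current estimate held fixed, the reference price follows Equation (7) (with α = 0.5 the weighting order makes no numerical difference), and demand is simulated from a hypothetical "true" coefficient vector plus Gaussian noise. The prior covariance reported in Section 5.1 is not reused; a diagonal matrix is substituted purely for illustration. Coefficients are ordered (β_{2}, β_{1}, β_{0}) to match the regressor (p - r, p, 1).

```python
import random

def mat_vec(A, v):
    """Matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def bayes_update(theta, S, x, d):
    """Equation (13), with S standing for Lambda / sigma^2."""
    n = len(x)
    Sx = mat_vec(S, x)
    denom = 1.0 + sum(a * b for a, b in zip(x, Sx))
    I_K = [[(1.0 if i == j else 0.0) - Sx[i] * x[j] / denom for j in range(n)]
           for i in range(n)]
    new_theta = mat_vec(I_K, [theta[i] + Sx[i] * d for i in range(n)])
    new_S = [[sum(I_K[i][k] * S[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return new_theta, new_S

def expected_demand(theta, p, r):
    """theta is ordered (beta2, beta1, beta0), matching the regressor (p - r, p, 1)."""
    return theta[0] * (p - r) + theta[1] * p + theta[2]

def plan(theta, r, k, T, grid, alpha, gamma, c):
    """Backward recursion with theta held fixed: (value from period k on, best p_k)."""
    if k > T:
        return 0.0, None                           # terminal condition V(r_{T+1}) = 0
    best_value, best_price = float("-inf"), None
    for p in grid:
        profit = (p - c) * expected_demand(theta, p, r)
        r_next = alpha * r + (1.0 - alpha) * p     # Equation (7)
        future, _ = plan(theta, r_next, k + 1, T, grid, alpha, gamma, c)
        if profit + gamma * future > best_value:
            best_value, best_price = profit + gamma * future, p
    return best_value, best_price

def run(theta0, S0, true_theta, r0, p0, grid, alpha, gamma, c, T, sigma, seed=0):
    """Price, observe noisy demand, update the estimate; repeat for T periods."""
    rng = random.Random(seed)
    theta, S = list(theta0), [row[:] for row in S0]
    r_prev, p_prev, history = r0, p0, []
    for t in range(1, T + 1):
        r = alpha * r_prev + (1.0 - alpha) * p_prev
        _, p = plan(theta, r, t, T, grid, alpha, gamma, c)
        # Assumption: demand is drawn from a hypothetical "true" theta plus noise.
        d = expected_demand(true_theta, p, r) + rng.gauss(0.0, sigma)
        theta, S = bayes_update(theta, S, [p - r, p, 1.0], d)
        history.append((t, p, r, d, list(theta)))
        r_prev, p_prev = r, p
    return history

GRID = [4.2, 4.4, 4.6, 4.8, 5.0]                   # hypothetical price menu
THETA0 = [-50.0, -20.0, 100.0]                     # prior mean from Section 5.1
S0 = [[25.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 100.0]]   # hypothetical
TRUE_THETA = [-45.0, -19.0, 98.0]                  # hypothetical
history = run(THETA0, S0, TRUE_THETA, r0=4.2, p0=4.2, grid=GRID,
              alpha=0.5, gamma=0.95, c=4.0, T=6, sigma=1.0)
for t, p, r, d, theta in history:
    print(t, p, round(r, 3), round(d, 2), [round(v, 2) for v in theta])
```

The sketch draws demand from a separate parameter vector because an observation generated from the current estimate itself would leave Equation (13) at a fixed point, and no learning would take place. Exhaustive recursion is affordable here only because the grid and the horizon are small; a larger problem would need a discretized state space or another approximation.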

5. Numerical Simulation

To present the results more intuitively, in this section, we conduct numerical simulations of Model (14) under given parameters and conditions and report the results. Because in practice a seller chooses among a finite number of prices rather than a continuum, we assume that the decision variable p_{t} takes values only in a finite, discrete, deterministic set.

5.1. Numerical Simulation Results

Given the parameters $β_{0, 0} = 100$, $β_{1, 0} = - 20$, $β_{2, 0} = - 50$, $α = 0.5$, $γ = 0.95$, $p_{0} = 4.2$, $r_{0} = 4.2$, $p_{t} \in [4.2, 5]$, $c = 4$, $T = 6$ and $Λ_{0} / σ^{2} = (\begin{matrix} 500 & - 50 & 40 \\ - 50 & 10 & 400 \\ 40 & 400 & 20 \end{matrix})$, and based on value iteration and the contraction mapping theorem, we first obtain the price and reference price set by the seller in each period. As shown in Figure 1, although the decision variable is limited to a finite set of values, the prices in this study do not converge monotonically to a single value but rather fluctuate back and forth. This is in contrast to the short-term results observed by Popescu and Wu [33]. As illustrated in Figure 2, unlike the decision variable $p_{t}$, the state variable $r_{t}$ is monotonic and increases with time in this simulation.
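As a check on the parameter values, the first-period quantities can be worked out by hand. The block below is an illustration only: it assumes the prior means enter demand as $θ \cdot (1, p, p - r)^{'}$, as in Algorithm 1, and it evaluates one candidate price, $p_{1} = 4.2$, rather than the optimal price reported in Figure 1.

```latex
\begin{aligned}
r_{1} &= α p_{0} + (1 - α) r_{0} = 0.5 \times 4.2 + 0.5 \times 4.2 = 4.2, \\
D_{1} &= β_{0, 0} + β_{1, 0} p_{1} + β_{2, 0} (p_{1} - r_{1}) = 100 - 20 \times 4.2 - 50 \times 0 = 16, \\
(p_{1} - c) D_{1} &= (4.2 - 4) \times 16 = 3.2.
\end{aligned}
```

Because $p_{0} = r_{0}$, the reference price does not move in the first period, and the reference term vanishes for any seller who repeats the initial price.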

Based on the derived prices and reference prices and the two previous assumptions, we further simulate how the demand function coefficients $β_{0}, β_{1}, β_{2}$ are updated. Figure 3, Figure 4 and Figure 5 illustrate the update of these coefficients. With a demand-learning seller, the change in market size in each period is not significant, and the gap between the first and sixth periods is only 2.5 units (see Figure 3). In other words, consumers do not react strongly to more “intelligent” sellers. The change in the price impact factor across periods is also insignificant (see Figure 4), at only 1.5 units. However, compared with the market size coefficient and the price impact factor, the change in the reference-effect impact factor is significant, at 7 units. The demand-learning seller learns the demand function by observing changes in product demand caused by the reference-effect impact factor. Figure 6 shows that market demand increases over time.

5.2. Discussion

In this subsection, we discuss the numerical simulation results. First, although the seller updates the values of $β_{0}$ (i.e., market size) and $β_{1}$ (i.e., price sensitivity) throughout the sales period, the changes are relatively flat. This indicates that the seller’s decisions are based on prior knowledge of these parameters, resulting in minimal deviation from the optimal outcome. Market segmentation theory helps explain why the changes in $β_{0}$ and $β_{1}$ are relatively flat. Before selling the product, the seller determines the target market and product positioning. According to market segmentation theory, defining the target market establishes the initial market size (i.e., $β_{0}$), and product positioning gives consumers a clear understanding of the product and its substitutability, which, in turn, affects consumers’ price sensitivity (i.e., $β_{1}$). This process of target market determination and product positioning is how the seller obtains prior information about $β_{0}$ and $β_{1}$. Therefore, the changes in $β_{0}$ and $β_{1}$ during the seller’s learning process occur gradually. For example, in China, Pinduoduo.com and JD.com are two well-known online shopping platforms. Although many people intuitively feel that JD.com’s retail prices are higher than those on Pinduoduo.com, consumers who have previously chosen to shop on JD.com do not tend to shift to Pinduoduo.com. This implies that the demand for products on JD.com has not changed significantly. This phenomenon can be attributed to the initial differences in target audiences for the two platforms. JD.com primarily caters to high-value customers, while Pinduoduo.com focuses on attracting low-value customers.

The second result of the numerical simulation is that the variation in the reference-effect coefficient (i.e., $β_{2}$) exceeds the variations in $β_{0}$ and $β_{1}$. This highlights the potential for significant decision losses if sellers make decisions based solely on prior information about $β_{2}$. The reference effect, as a behavioral characteristic of individual consumers, exhibits evident heterogeneity, which has been overlooked in the literature when constructing the reference price formation mechanism. Consequently, it is difficult to obtain reasonable prior information about $β_{2}$ through measures such as target market selection and product positioning. This underscores the necessity of learning about the factors that drive reference effects and prompts further exploration of heterogeneous reference price formation mechanisms.

Finally, considering supply chain management, it is necessary for sellers to engage in demand learning. By updating the parameters of the demand function at the end of each period, accurate demand forecasts for the next period can be generated. The predicted results can then be communicated to upstream production enterprises through the supply chain, enabling timely adjustments to production plans and optimizing revenue across the entire supply chain.
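The forecasting step described above can be sketched as a one-line prediction from the latest estimate. This is a hedged illustration, assuming the linear demand form $θ \cdot (1, p, p - r)^{'}$ and the exponential-smoothing reference price used in Algorithm 1; the function name `forecast_next_demand` and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def forecast_next_demand(theta_hat, planned_price, last_price, last_reference, alpha):
    """Point forecast of next-period demand from the latest parameter estimate."""
    # Reference price consumers are assumed to hold in the next period.
    r_next = alpha * last_price + (1.0 - alpha) * last_reference
    x = np.array([1.0, planned_price, planned_price - r_next])
    return float(np.dot(theta_hat, x)), r_next

theta_hat = np.array([100.0, -20.0, -50.0])  # hypothetical end-of-period estimate
demand, r_next = forecast_next_demand(theta_hat, 4.3, 4.2, 4.2, alpha=0.5)
print(f"reference price {r_next:.2f}, forecast demand {demand:.1f}")
```

With these placeholder numbers the forecast is about 9 units, which is the quantity a seller would pass upstream as the production signal for the next period.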

6. Conclusions

This study summarizes previous research on dynamic pricing problems with reference effects and proposes a model to address this problem. The existing literature has primarily assumed constant parameters in dynamic pricing problems with a reference effect, while studies of dynamic pricing based on seller demand learning have overlooked consumer behavioral factors. Drawing on this analysis, we first presented a dynamic pricing problem considering both seller demand learning and the consumer reference effect. In Section 4, we introduced a model that incorporates these factors and provided theoretical proof of its solution. We outlined the algorithmic steps for analyzing the numerical solution and presented a specific numerical example along with its analysis. Through the numerical analysis, we observed that, with the consumer reference effect, the total consumer demand and the effect of price on demand change only slightly in each period, whereas the reference-effect influence factor changes more, indicating that the seller can learn about the change in product demand by monitoring the reference-effect influence factor. The management implications of these findings for enterprises are as follows. First, conducting market and product positioning before the formal product launch is necessary, because it helps reduce demand uncertainty. Second, it is advisable for enterprises to adjust prices as early as possible, because increasing the frequency of price adjustments can accelerate the seller’s learning about the impact of reference effects on demand and thus lead to more stable demand predictions in subsequent periods.

While this study contributes to the field of dynamic pricing, it is not without limitations. For example, we focus on dynamic pricing based on demand learning in a monopolistic environment, while, in reality, sellers face intense competition, and their demands are influenced by competitors’ reactions. Therefore, constructing a reasonable demand learning model and solving it remains a challenging problem. In addition, this paper does not consider the loss aversion behavior in the consumers’ reference effects.

It is worth noting that this study introduces the reference effect into the model as one example of a consumer behavioral factor. Researchers have the opportunity to incorporate other consumer behavioral factors, such as consumer inertia and strategic behavior. Furthermore, the demand-learning problem presented in this study is model-driven and does not use historical sales data for the product. By contrast, the existing literature has focused on data-driven problems, as seen in the work of Chen and Wu [45]. Thus, integrating historical data into the problem addressed in this study warrants further investigation.

Author Contributions

Conceptualization, B.W., W.B. and H.L.; Formal analysis, B.W. and W.B.; Methodology, B.W. and H.L.; Project administration, B.W.; Supervision, W.B.; Visualization, B.W.; Writing—original draft, B.W.; Writing—review and editing, W.B. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the National Natural Science Foundation of China [grant number 71871231], the Graduate Research Innovation Project of Central South University [grant number 2018zzts093], the Humanities and Social Science Foundation Project of the Ministry of Education [grant number 21YJAZH051], and the Key Project of Teaching Reform in Hunan Province [grant number HNJG-2022-0351].

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Batrancea, L.M.; Balcı, M.A.; Akgüller, Ö.; Gaban, L. What Drives Economic Growth across European Countries? A Multimodal Approach. Mathematics 2022, 10, 3660.
2. Batrancea, L.M.; Balcı, M.A.; Chermezan, L.; Akgüller, Ö.; Masca, E.S.; Gaban, L. Sources of SMEs Financing and Their Impact on Economic Growth across the European Union: Insights from a Panel Data Study Spanning Sixteen Years. Sustainability 2022, 14, 15318.
3. Gallego, G.; Topaloglu, H. Revenue Management and Pricing Analytics; Springer: New York, NY, USA, 2019.
4. Huang, J.; Leng, M.; Parlar, M. Demand functions in decision modeling: A comprehensive survey and research directions. Decision Sci. 2013, 44, 557–609.
5. Wang, Y.; Zhang, J.; Tang, W. Dynamic pricing for non-instantaneous deteriorating items. J. Intell. Manuf. 2015, 26, 629–640.
6. Xue, M.; Tang, W.; Zhang, J. Optimal dynamic pricing for deteriorating items with reference-price effects. Int. J. Syst. Sci. 2016, 47, 2022–2031.
7. Den Boer, A.V. Dynamic pricing and learning: Historical origins, current research, and new directions. Surv. Oper. Res. Manag. Sci. 2015, 20, 1–18.
8. Cheung, W.C.; Simchi-Levi, D.; Wang, H. Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 2017, 65, 1722–1731.
9. Misra, K.; Schwartz, E.M.; Abernethy, J. Dynamic online pricing with incomplete information using multiarmed bandit experiments. Market. Sci. 2019, 38, 226–252.
10. Lin, K.Y. Dynamic pricing with real-time demand learning. Eur. J. Oper. Res. 2006, 174, 522–538.
11. Chen, X.; Hu, P.; Hu, Z. Efficient algorithms for the dynamic pricing problem with reference price effect. Manag. Sci. 2017, 63, 4389–4408.
12. Liu, H.; Liu, S. Research on advertising and quality of paid apps, considering the effects of reference price and goodwill. Mathematics 2020, 8, 733.
13. Mazumdar, T.; Raj, S.P.; Sinha, I. Reference price research: Review and propositions. J. Mark. 2005, 69, 84–102.
14. Cornelsen, L.; Mazzocchi, M.; Smith, R. Between Preferences and References: Evidence from Great Britain on Asymmetric Price Elasticities; Alma Mater Studiorum Università di Bologna: Bologna, Italy, 2018.
15. Mehra, A.; Sajeesh, S.; Voleti, S. Impact of reference prices on product positioning and profits. Prod. Oper. Manag. 2020, 29, 882–892.
16. Broder, J.; Rusmevichientong, P. Dynamic pricing under a general parametric choice model. Oper. Res. 2012, 60, 965–980.
17. den Boer, A.V.; Zwart, B. Simultaneously learning and optimizing using controlled variance pricing. Manag. Sci. 2014, 60, 770–783.
18. Keskin, N.B.; Zeevi, A. Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 2014, 62, 1142–1167.
19. den Boer, A.V.; Zwart, B. Dynamic pricing and learning with finite inventories. Oper. Res. 2015, 63, 965–978.
20. Aoki, M. On a dual control approach to the pricing policies of a trading specialist. In Proceedings of the 5th Conference on Optimization Techniques Part II 5, Rome, Italy, 7 May 1973; Volume 1973, pp. 272–282.
21. Nguyen, D. The monopolistic firm, random demand, and Bayesian learning. Oper. Res. 1984, 32, 1038–1051.
22. Kwon, H.D.; Lippman, S.A.; Tang, C.S. Optimal markdown pricing strategy with demand learning. Probab. Eng. Inform. Sci. 2012, 26, 77–104.
23. Harrison, J.M.; Keskin, N.B.; Zeevi, A. Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Manag. Sci. 2012, 58, 570–586.
24. Ghate, A. Optimal minimum bids and inventory scrapping in sequential, single-unit, Vickrey auctions with demand learning. Eur. J. Oper. Res. 2015, 245, 555–570.
25. Mason, R.; Välimäki, J. Learning about the arrival of sales. J. Econ. Theory 2011, 146, 1699–1711.
26. Keskin, N.B.; Zeevi, A. Chasing demand: Learning and earning in a changing environment. Math. Oper. Res. 2017, 42, 277–307.
27. Yang, Y.; Lee, Y.C.; Chen, P.A. Competitive Demand Learning: A Data-Driven Pricing Algorithm. arXiv 2020, arXiv:2008.05195.
28. Qiang, S.; Bayati, M. Dynamic pricing with demand covariates. arXiv 2016, arXiv:1604.07463.
29. Javanmard, A.; Nazerzadeh, H. Dynamic pricing in high-dimensions. J. Mach. Learn. Res. 2019, 20, 315–363.
30. Cohen, M.C.; Lobel, I.; Paes Leme, R. Feature-based dynamic pricing. Manag. Sci. 2020, 66, 4921–4943.
31. Chen, Y.H.; Jiang, B. Dynamic Pricing of Experience Goods in Markets with Demand Uncertainty. SSRN 2017. Available online: https://ssrn.com/abstract=2841112 (accessed on 1 February 2017).
32. Ban, G.Y.; Keskin, N.B. Personalized dynamic pricing with machine learning: High-dimensional features and heterogeneous elasticity. Manag. Sci. 2021, 67, 5549–5568.
33. Popescu, I.; Wu, Y. Dynamic pricing strategies with reference effects. Oper. Res. 2007, 55, 413–429.
34. Yang, H.; Zhang, D.; Zhang, C. The influence of reference effect on pricing strategies in revenue management settings. Int. Trans. Oper. Res. 2017, 24, 907–924.
35. Nasiry, J.; Popescu, I. Dynamic pricing with loss-averse consumers and peak-end anchoring. Oper. Res. 2011, 59, 1361–1368.
36. Bi, W.J.; Sun, Y.H.; Tian, L.Q. Multi-product dynamic pricing model with reference price submitting to peak-end rule. J. Syst. Eng. 2015, 30, 476–484.
37. Bi, W.; Li, G.; Liu, M. Dynamic pricing with stochastic reference effects based on a finite memory window. Int. J. Prod. Res. 2017, 55, 3331–3348.
38. Wang, Q.; Zhao, N.; Wu, J.; Zhu, Q. Optimal pricing and inventory policies with reference price effect and loss-averse customers. Omega 2021, 99, 102174.
39. Zhao, N.; Wang, Q.; Cao, P.; Wu, J. Pricing decisions with reference price effect and risk preference customers. Int. Trans. Oper. Res. 2021, 28, 2081–2109.
40. Bai, T.; Wu, M.; Zhu, S.X. Pricing and ordering by a loss averse newsvendor with reference dependence. Transport. Res. E-Log. 2019, 131, 343–365.
41. Colombo, L.; Labrecciosa, P. Dynamic oligopoly pricing with reference-price effects. Eur. J. Oper. Res. 2021, 288, 1006–1016.
42. Chen, K.; Zha, Y.; Alwan, L.C.; Zhang, L. Dynamic pricing in the presence of reference price effect and consumer strategic behaviour. Int. J. Prod. Res. 2020, 58, 546–561.
43. Cao, P.; Zhao, N.; Wu, J. Dynamic pricing with Bayesian demand learning and reference price effect. Eur. J. Oper. Res. 2019, 279, 540–556.
44. den Boer, A.V.; Keskin, N.B. Dynamic pricing with demand learning and reference effects. Manag. Sci. 2022, 68, 7112–7130.
45. Chen, L.; Wu, C. Bayesian dynamic pricing with unknown customer willingness-to-pay and limited inventory. SSRN J. 2018. Available online: https://ssrn.com/abstract=2689924 (accessed on 3 September 2018).

Figure 1. Price path.

Figure 2. Reference price path.

Figure 3. $β_{0}$ varies with time.

Figure 4. $β_{1}$ varies with time.

Figure 5. $β_{2}$ varies with time.

Figure 6. Demand varies with time.

Table 1. Symbol Description.

Notation	Description
$t$	Time period, where $1 \leq t \leq T$
$θ$	Demand function impact factor
$c$	Per-unit cost of product, $c > 0$
$α$	Consumer’s memory factor, $0 \leq α < 1$
$γ$	Discount factor, $0 < γ \leq 1$
$D (p_{t}, r_{t})$	Demand in period $t$
$p (θ)$	Probability density function of $θ$
$H_{t}$	Set of the prior information in period $t$
$p_{t}$	Product price in period $t$, where $p_{t} \in [\underline{p}, \bar{p}]$
$r_{t}$	Reference price in period $t$, where $r_{t} \in [\underline{p}, \bar{p}]$
$Π (p_{t}, r_{t})$	Profit in period $t$
$V (p_{t}, r_{t})$	Total profit in the previous $t$ periods

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

