*Article* **Imbalance Market Real Options and the Valuation of Storage in Future Energy Systems**

**John Moriarty <sup>1</sup> and Jan Palczewski <sup>2,</sup>\***


Received: 13 December 2018; Accepted: 28 March 2019; Published: 11 April 2019

**Abstract:** As decarbonisation progresses and conventional thermal generation gradually gives way to other technologies including intermittent renewables, there is an increasing requirement for system balancing from new and also fast-acting sources such as battery storage. In the deregulated context, this raises questions of market design and operational optimisation. In this paper, we assess the real option value of an arrangement under which an autonomous energy-limited storage unit sells incremental balancing reserve. The arrangement is akin to a perpetual American swing put option with random refraction times, where a single incremental balancing reserve action is sold at each exercise. The power used is bought in an energy imbalance market (EIM), whose price we take as a general regular one-dimensional diffusion. The storage operator's strategy and its real option value are derived in this framework by solving the twin timing problems of when to buy power and when to sell reserve. Our results are illustrated with an operational and economic analysis using data from the German Amprion EIM.

**Keywords:** multiple optimal stopping; general diffusion; real option analysis; energy imbalance market

#### **1. Introduction**

In today's electric grids, power system security is managed in real time by the system operator, who coordinates electricity supply and demand in a manner that avoids fluctuations in frequency or disruption of supply (see, for example, New Zealand Electricity Authority 2016). In addition, the system operator carries out planning work to ensure that supply can meet demand, including the procurement of non-energy or ancillary services such as operating reserve, the capacity to make near real-time adjustments to supply and demand. These services are provided principally by network solutions such as the control of large-scale generation, although from a technical perspective they can also be provided by smaller, distributed resources such as demand response or energy storage (National Grid ESO 2019a; Xu et al. 2016). Such resources have strongly differing operating characteristics: when compared to thermal generation, for example, energy storage is energy limited but can respond much more quickly. Storage also has important time linkages, since each discharge necessitates a corresponding recharge at a later time.

The coming decades are expected to bring a period of "energy transition" in which markets for ancillary services will evolve, among other highly significant changes to generation, consumption and network operation. The UK government, for example, has an ambition that *"new solutions such as storage or demand-side response can compete directly with more traditional network solutions"* (UK Office of Gas and Electricity Markets 2017, p. 29). In harmony, the UK System Operator National Grid has recently declared its intention to *"create a marketplace for balancing that encourages new and existing providers, and all new technology types"* (National Grid ESO 2019b). In anticipation of changes such as these, we will examine the participation of autonomous energy storage in a future marketplace for balancing.

Operating reserve is typically procured via a two-price mechanism, with a reservation payment plus an additional utilisation payment each time the reserve is called for (Ghaffari and Venkatesh 2013; Just and Weber 2008). Since the incentivisation and efficient use of operating reserve for system balancing is of increasing importance with growing penetration of variable renewable generation (King et al. 2011), several system operators have recently introduced real-time energy imbalance markets (EIMs) in which operating reserve is pooled, including in Germany (Ocker and Ehrhart 2017) and California (Lenhart et al. 2016; Western EIM 2019). Such markets typically involve the submission of bids and offers from several providers for reserves running across multiple time periods, which are then accepted, independently in each period, in price order until the real-time balancing requirement is met. As one provider can potentially be called upon over multiple consecutive periods, this reserve procurement mechanism is not well suited to energy-limited reserves such as energy storage. However, storage-oriented solutions are being pioneered in a number of countries including a recent tender by the National Grid in the UK (National Grid ESO 2019a) and various trials by state system operators in the US (Xu et al. 2016).

This paper considers operating reserve contracts for energy limited storage devices such as batteries. In contrast to previous work on the pricing and hedging of energy options where settlement is financial (see, for example, Benth et al. (2008) and references therein), we take account of the physical settlement required in system balancing, considering also the limited energy and time linkages of storage. The potential physical feedback effects of such contracts are investigated by studying the operational policy of the storage or battery operator. To address the limited nature of storage, the considered reserve contract is for a fixed quantity of energy. In this way, each contract written can be physically covered with the appropriate amount of stored energy. We consider a simple arrangement where the system operator sets the contract parameters, namely the premia (the reservation and utilisation payments) plus an EIM price level *x*∗ at which the energy is delivered. That is, rather than being the outcome of a price formation process, these parameters are set administratively. Our analysis thus focuses exclusively on the timing of the battery operator's actions. This dynamic modelling contrasts with previous economic studies of operating reserve in the literature, which have largely been static (Just and Weber 2008).

To quantify the economic opportunity for the storage operator, we use real options analysis. Real Options analysis is the application of option pricing techniques to the valuation of non-financial or 'real' investments with flexibility (Borison 2005; Dixit and Pindyck 1994). Here, the energy storage unit is the real asset, and is coupled with the timing flexibility of the battery operator, who observes the EIM price in real time. The arrangement may be viewed as providing the battery operator with a real perpetual American put option on the reserve contract described above. This option is either of swing type (called the lifetime problem in this paper) or of single exercise type (the single problem). The feature that sets it apart from the existing literature on swing options is the random refraction time (cf. Carmona and Touzi 2008).

A key question in Real Options analyses is the specification of the driving randomness (Borison 2005). In this paper, we model the EIM price to resemble the historical statistical dynamics of imbalance prices. In common with electricity spot prices and commodity prices more generally but unlike the prices of financial assets, imbalance prices typically exhibit significant mean reversion (Ghaffari and Venkatesh 2013; Pflug and Broussev 2009).

To avoid trivial cases, we impose the following, mild, sustainability conditions on the arrangement:


Condition **S1** is also known as the individual rationality or participation condition (Fudenberg et al. 1991). While the battery operator is assumed to be a profit maximiser, the system operator may engage in the arrangement for wider reasons than profit maximisation. To acknowledge the potential additional benefits provided by batteries, for example in providing response quickly and without direct emissions, condition **S2** is less strict than individual rationality.

By considering reserve contracts for incremental capacity (defined as an increase in generation or equivalently a decrease in load), we are able to provide complete solutions whose numerical evaluation is straightforward. Contracts for a decrease in generation, or an increase in load, lead to a fundamentally different set of optimisation problems which have been partially solved by Szabó and Martyr (2017).

This study extends earlier work (Moriarty and Palczewski 2017) with two important differences. Firstly, the dynamics of the imbalance price is described there by an exponential Brownian motion. In the present paper, by employing a different methodological approach, we obtain explicit results for mean-reverting processes (and also other general diffusions) which better describe the statistical properties of imbalance prices (Ghaffari and Venkatesh 2013; Pflug and Broussev 2009). Secondly, the present paper takes into account deterioration of the store. Without this feature, it was found that the value of storage is either very small (corresponding roughly to writing a single reserve contract) or infinite.

Through a benchmark case study, we obtain the following economic recommendations. Firstly, investments in battery storage to provide reserve will be profitable on average for a wide range of the contract parameters. Secondly, the EIM price level *x*∗ at which energy is delivered is an important consideration. This is because, as *x*∗ increases, the EIM price reaches *x*∗ significantly less frequently and the reserve contract starts to provide cover for rare events, resulting in infrequent power delivery and low utilisation of the battery, which may make the business case unattractive. These observations suggest that the contractual arrangement studied in this paper is more suitable for the frequent balancing of less severe imbalance.

#### *1.1. Objectives*

Given the model parameters *x*∗, *pc* ≥ 0 and *Kc* ≥ 0, we wish to analyse the actions A1–A3 below (a graphical description of this sequence of actions is provided in Figure 1):


**Figure 1.** The sequence of actions A1–A3.

Thus, the system operator obtains incremental reserve from the arrangement in preference to using the EIM, when the EIM price is higher than the level *x*∗ specified by the system operator. When the sequence A1–A3 is carried out once, we refer to this as the single problem; when it is repeated indefinitely back-to-back, we refer to it as the lifetime problem.

In the lifetime problem, because storage is energy limited, action A3 must be completed before the sequence A1–A3 can begin again. Thus, if the arrangement is considered as a real swing put option, the time between A2 and A3 is a random refraction period during which no exercise is possible. Note that, after action A3, the battery operator will perform action A1 again when the EIM price has fallen sufficiently. Mathematically, therefore, we have the following objectives:


We also aim to provide a straightforward numerical procedure to explicitly calculate *x̌* and the value function (for *x* ≥ *x̌*) in the lifetime problem.

#### *1.2. Approach and Related Work*

We take the EIM price to be a continuous time stochastic process (*Xt*)*t*≥0. Since markets operate in discrete time, this is an approximation, made for analytical tractability. Nevertheless, it is consistent with the physical fact that the system operator's system balancing challenge is both real-time and continuous.

Mathematically, the problem is one of choosing two optimal stopping times corresponding to the two actions A1 and A2, based on the evolution of the stochastic process *X* (the reader is referred to Peskir and Shiryaev (2006, chp. 1) for a thorough presentation of optimal stopping problems). We centre our solution techniques around ideas of Beibel and Lerche (2000), who characterise optimal stopping times using the Laplace transforms of first hitting times for the process *X* (see, for example, Borodin and Salminen (2012, sec. 1.10)). Methods and results from the single problem are then combined with a fixed point argument for the lifetime analysis.

Our methodological results feed into a growing body of research on timing problems in trading. In a financial context, Zervos et al. (2013) optimise the performance of "buy low, sell high" strategies, using the same Laplace transforms to provide a candidate value function, which is later verified as a solution to certain quasi-variational inequalities. An analogous strategy in an electricity market using hydroelectric storage is studied in Carmona and Ludkovski (2010) where the authors use Regression Monte Carlo methods to approximately solve the dynamic programming equations for a related optimal switching problem. Our results differ from the above papers in two aspects. Our analysis is purely probabilistic, leading to arguments that do not refer to the theory of PDEs and quasi-variational inequalities. Secondly, our characterisation of the value function and the optimal policy is explicit up to a single, one-dimensional nonlinear optimisation, which, as we demonstrate in an empirical experiment, can be performed in milliseconds using standard scientific software. Related to our lifetime analysis, Carmona and Dayanik (2008) apply probabilistic techniques to study the optimal multiple-stopping problem for a general linear regular diffusion process and reward function. However, the latter work deals with a finite number of option exercises in contrast to our lifetime analysis which addresses an infinite sequence of exercises via a fixed point argument. Our work thus yields results with a significantly simpler and more convenient structure.

The contracts we consider have features in common with the reliability options used in Colombia, Ireland and the ISO New England market and currently being introduced in Italy (Mastropietro et al. 2018). Reliability options pay an initial premium to a generator, usually require physical cover, and have a reference market price and a strike price that plays a similar role to *x*∗. Typically, the strike price is set at the variable cost of the technology used to satisfy demand peaks, and the generator is contracted to pay back the difference between the market price and the strike price in periods when energy is delivered and the market price is higher. However, instead of being designed for system balancing, the purpose of reliability options is to ensure sufficient investment in generation capacity.

The remainder of the paper is organised as follows. The mathematical formulation and main tools are developed in Section 2. In the results of Section 3, we show that, for a range of price processes *X* incorporating mean reversion, solutions for all initial values *x* can be obtained. Furthermore, an empirical illustration using data from the German Amprion system operator is provided and qualitative implications are drawn, while Section 5 presents the conclusions. Auxiliary results are collected in the appendices.

#### **2. Methodology**

#### *2.1. Formulation and Preliminary Results*

In this section, we characterise the real option value in the single and lifetime problems using the theory of regular one-dimensional diffusions. Denoting by (*Wt*)*t*≥0 a standard Brownian motion, let *X* = (*Xt*)*t*≥0 be a (weak) solution of the stochastic differential equation:

$$dX\_t = \mu(X\_t)\,dt + \sigma(X\_t)\,dW\_t, \tag{1}$$

with boundaries *a* ∈ R ∪ {−∞} and *b* ∈ R ∪ {∞}. The solution of this equation with the initial condition *X*<sub>0</sub> = *x* defines a probability measure P*<sup>x</sup>* and the related expectation operator E*<sup>x</sup>*. We assume that the boundaries are natural or entrance-not-exit, i.e., the process cannot reach them in finite time, and that *X* is a regular diffusion process, meaning that the state space *I* := (*a*, *b*) cannot be decomposed into smaller sets from which *X* cannot exit. The existence and uniqueness of such an *X* is guaranteed if the functions *μ* and *σ* are Borel measurable in *I* with *σ*<sup>2</sup> > 0, and

$$\forall y \in I, \ \exists\, \varepsilon > 0 \text{ such that } \int\_{y-\varepsilon}^{y+\varepsilon} \frac{1 + |\mu(\xi)|}{\sigma^2(\xi)} \, d\xi < +\infty \tag{2}$$

(see Karatzas and Shreve (1991, Theorem 5.5.15); condition (2) holds if, for example, *μ* is locally bounded and *σ* is locally bounded away from zero). Necessary and sufficient conditions for the boundaries *a* and *b* to be non-exit points, i.e., natural or entrance-not-exit, are formulated in Karatzas and Shreve (1991, Theorem 5.5.29). In particular, it is sufficient that the scale function

$$p(x) := \int\_{c}^{x} \exp\left(-2\int\_{c}^{z} \frac{\mu(u)}{\sigma^2(u)}\, du\right) dz, \quad x \in I \tag{3}$$

converges to −∞ when *x* approaches *a* and to +∞ when *x* approaches *b*. (Here, *c* ∈ *I* is arbitrary and the condition stated above does not depend on its choice.) These conditions are mild, in the sense that they are satisfied by all common diffusion models for commodity prices, including those in Section 3.
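As a quick check of this sufficient condition, consider an Ornstein-Uhlenbeck price of the form quoted in the caption of Figure 2, *dXt* = *θ*(*m* − *Xt*)*dt* + *σ dWt* on *I* = R, with *θ*, *σ* > 0 (the symbols *θ* and *m* are labels introduced here for illustration only). Substituting *μ*(*u*) = *θ*(*m* − *u*) and a constant *σ* into (3) gives

$$p(x) = \int\_{c}^{x} \exp\left(\frac{\theta}{\sigma^2}\left[(z-m)^2 - (c-m)^2\right]\right) dz.$$

The integrand is bounded below by the positive constant exp(−*θ*(*c* − *m*)<sup>2</sup>/*σ*<sup>2</sup>), so *p*(*x*) grows at least linearly in |*x* − *c*| in both directions. Hence *p*(*x*) → +∞ as *x* → +∞ and *p*(*x*) → −∞ as *x* → −∞, and both boundaries are non-exit for this model.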

Denote by *τx* the first time that the process *X* reaches *x* ∈ *I*:

$$\tau\_x = \inf \{ t \ge 0 : X\_t = x \}. \tag{4}$$

For *r* > 0, define

$$\psi\_r(x) = \begin{cases} \mathbb{E}^{x} \{ e^{-r\tau\_c} \}, & x \le c, \\ 1/\mathbb{E}^{c} \{ e^{-r\tau\_x} \}, & x > c, \end{cases} \quad \phi\_r(x) = \begin{cases} 1/\mathbb{E}^{c} \{ e^{-r\tau\_x} \}, & x \le c, \\ \mathbb{E}^{x} \{ e^{-r\tau\_c} \}, & x > c, \end{cases} \tag{5}$$

for any fixed *c* ∈ *I* (different choices of *c* merely result in a scaling of the above functions). It can be verified directly that function *φr*(*x*) is strictly decreasing in *x* while *ψr*(*x*) is strictly increasing, and, for *x*, *y* ∈ *I*, we have

$$\mathbb{E}^{x}\{e^{-r\tau\_y}\} = \begin{cases} \psi\_r(x)/\psi\_r(y), & x < y, \\ \phi\_r(x)/\phi\_r(y), & x \ge y. \end{cases} \tag{6}$$

Since the boundaries *a*, *b* are natural or entrance-not-exit, we have *ψr*(*a*+) ≥ 0, *φr*(*b*−) ≥ 0 and *ψr*(*b*−) = *φr*(*a*+) = ∞ (Borodin and Salminen 2012, Sec. II.1).
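The functions in (5) and the identity (6) can be evaluated numerically once a price model is fixed. The following is a minimal sketch for the Ornstein-Uhlenbeck parameters quoted in the caption of Figure 2. It assumes the standard integral representation of the increasing and decreasing solutions for that process (this representation is not stated in this section), and all Python names are hypothetical. Since only ratios enter (6), the normalisation at *c* is omitted.

```python
import numpy as np
from scipy.integrate import quad

# Ornstein-Uhlenbeck parameters quoted in the Figure 2 caption (time in days):
# dX = THETA * (M - X) dt + SIGMA dW, discount rate R.
THETA, M, SIGMA, R = 3.42, 47.66, 30.65, 0.03
NU = R / THETA
SCALE = np.sqrt(2.0 * THETA) / SIGMA


def _g(z):
    """g(z) = int_0^inf t^(NU-1) * exp(-t^2/2 + z*t) dt (assumed representation)."""
    # On (0, 1] the singular factor t^(NU-1) is split off: its integral is 1/NU.
    head = quad(lambda t: t ** (NU - 1.0) * np.expm1(-0.5 * t * t + z * t), 0.0, 1.0)[0]
    # Beyond max(z, 0) + 12 the integrand is negligible relative to its peak.
    tail = quad(lambda t: t ** (NU - 1.0) * np.exp(-0.5 * t * t + z * t),
                1.0, max(z, 0.0) + 12.0)[0]
    return 1.0 / NU + head + tail


def psi(x):
    """Increasing function psi_r, up to a positive constant factor."""
    return _g((x - M) * SCALE)


def phi(x):
    """Decreasing function phi_r, up to a positive constant factor."""
    return _g(-(x - M) * SCALE)


def laplace_hitting(x, y):
    """E^x[exp(-R * tau_y)] from equation (6)."""
    return psi(x) / psi(y) if x < y else phi(x) / phi(y)


for start, target in [(40.0, 60.0), (60.0, 40.0), (50.0, 50.0)]:
    print(f"start {start}, target {target}: {laplace_hitting(start, target):.6f}")
```

Each printed value is a discount factor in (0, 1], equal to 1 when the start and target prices coincide.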

#### 2.1.1. Optimal Stopping Problems and Solution Technique

The class of optimal stopping problems which we use in this paper is

$$v(x) = \sup\_{\tau} \mathbb{E}^{x} \{ e^{-r\tau} \vartheta(X\_{\tau}) \mathbf{1}\_{\tau < \infty} \},\tag{7}$$

where the supremum is taken over the set of all (possibly infinite) stopping times. Here, *ϑ* is the payoff function and *v* is the value function. If a stopping time *τ*∗ exists which attains the supremum in (7), we call this an optimal stopping time. In addition, if *v* and *ϑ* are continuous, then the set

$$\Gamma := \{ x \in I : v(x) = \vartheta(x) \} \tag{8}$$

is a closed subset of *I*. Under general conditions (Peskir and Shiryaev 2006, chp. 1), which are satisfied by all stopping problems studied in this paper, *τ*<sup>∗</sup> = inf{*t* ≥ 0 : *Xt* ∈ Γ} is the smallest optimal stopping time and the set Γ is then called the stopping set.

Appendix A contains three lemmas providing a classification of solutions to the stopping problem (7) which will be used below.

#### 2.1.2. Single Problem

Let (*Xt*)*t*≥0 denote the EIM price. We will develop a mathematical representation of actions A1–A3 (see Section 1.1) when only one reserve contract is traded. Considering A3, the time of power delivery is the first time that the EIM price exceeds a predetermined level *x*∗:

$$\hat{\tau}\_e = \inf \{ t \ge 0 : X\_t \ge x^\* \}.$$

Given the present level *x* of the EIM price, the expected net present value of the utilisation payment exchanged at time *τ̂e* can be expressed as follows thanks to (6):

$$h\_c(x) = \mathbb{E}^{x} \{ e^{-r\hat{\tau}\_e} K\_c \} = \begin{cases} K\_c, & x \ge x^\*, \\ K\_c \frac{\psi\_r(x)}{\psi\_r(x^\*)}, & x < x^\*. \end{cases} \tag{9}$$

Therefore, the optimal timing of action A2 corresponds to solving the following optimal stopping problem:

$$\sup\_{\tau} \mathbb{E}^{x} \{ e^{-r\tau} (p\_c + h\_c(X\_{\tau})) \mathbf{1}\_{\tau < \infty} \}.$$

Since the utilisation payment *Kc* obtained when the EIM price exceeds *x*∗ is positive and constant, as is the initial premium *pc*, it is best to obtain these cashflows as soon as possible. The solution of the above stopping problem is therefore trivial: the contract should be sold immediately after completing action A1, i.e., immediately after providing physical cover for the reserve contract. Optimally timing the simultaneous actions A1 and A2, the purchase of energy and sale of the incremental reserve contract, is therefore the core optimisation task. It corresponds to solving the following optimal stopping problem:

$$V\_c(x) = \sup\_{\tau} \mathbb{E}^{x} \{ e^{-r\tau} \left( -X\_{\tau} + p\_c + h\_c(X\_{\tau}) \right) \mathbf{1}\_{\tau < \infty} \} = \sup\_{\tau} \mathbb{E}^{x} \{ e^{-r\tau} h(X\_{\tau}) \mathbf{1}\_{\tau < \infty} \}, \tag{10}$$

where the payoff

$$h(x) = -x + p\_c + h\_c(x)\tag{11}$$

is non-smooth since *hc* is non-smooth. The function *Vc*(*x*) is the real option value in the single problem.
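The kink of the payoff at *x*∗ can be seen numerically. The sketch below evaluates (9) and (11) for the Ornstein-Uhlenbeck parameters and contract values quoted in the caption of Figure 2, and compares one-sided difference quotients at *x*∗. The integral representation used for *ψr* is an assumption (a standard formula for this process, not taken from this section), and all Python names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad

# Values quoted in the Figure 2 caption; the variable names are illustrative.
THETA, M, SIGMA, R = 3.42, 47.66, 30.65, 0.03   # OU dynamics and discount rate
X_STAR, P_C, K_C = 60.0, 10.0, 40.0             # delivery level, premium, utilisation payment
NU = R / THETA
SCALE = np.sqrt(2.0 * THETA) / SIGMA


def psi(x):
    """Increasing function psi_r for the OU process, up to a constant factor
    (assumed integral representation, split at t = 1 for numerical stability)."""
    z = (x - M) * SCALE
    head = quad(lambda t: t ** (NU - 1.0) * np.expm1(-0.5 * t * t + z * t), 0.0, 1.0)[0]
    tail = quad(lambda t: t ** (NU - 1.0) * np.exp(-0.5 * t * t + z * t),
                1.0, max(z, 0.0) + 12.0)[0]
    return 1.0 / NU + head + tail


PSI_STAR = psi(X_STAR)


def h_c(x):
    """Equation (9): expected discounted utilisation payment."""
    return K_C if x >= X_STAR else K_C * psi(x) / PSI_STAR


def h(x):
    """Equation (11): payoff from buying energy and selling the contract at price x."""
    return -x + P_C + h_c(x)


# One-sided difference quotients at x*: they differ, so h has a kink there.
eps = 1e-3
left_slope = (h(X_STAR) - h(X_STAR - eps)) / eps
right_slope = (h(X_STAR + eps) - h(X_STAR)) / eps
print(f"h(x*) = {h(X_STAR)}, left slope = {left_slope:.4f}, right slope = {right_slope:.4f}")
```

To the right of *x*∗ the payoff has slope −1, while to the left the term *Kc ψr*(*x*)/*ψr*(*x*∗) adds a positive contribution to the slope.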

#### 2.1.3. Lifetime Problem Formulation and Notation

In addition to having a design life of multiple decades, thermal power stations have the primary purpose of generating energy rather than providing ancillary services. In contrast, electricity storage technologies such as batteries have a design life of years and may be dedicated to providing ancillary services. In this paper, we take into account the potentially limited lifespan of electricity storage by modelling a multiplicative degradation of its storage capacity: each charge–discharge cycle reduces the capacity by a factor *A* ∈ (0, 1).

We now turn to the lifetime problem. To this end, suppose that a nonnegative continuation value *ζ*(*x*, *α*) is also received at the same time as action A3. It is a function of the capacity of the store *α* ∈ (0, 1) and the EIM price *x*, and represents the future proceeds from the arrangement.

The expected net present value of action A3 is now

$$h^{\zeta}(x,\alpha) := \mathbb{E}^{x}\{e^{-r\hat{\tau}\_e}(\alpha K\_c + \zeta(X\_{\hat{\tau}\_e}, A\alpha))\} = \begin{cases} (\alpha K\_c + \zeta(x^\*, A\alpha))\frac{\psi\_r(x)}{\psi\_r(x^\*)}, & x < x^\*,\\ \alpha K\_c + \zeta(x, A\alpha), & x \ge x^\*, \end{cases} \tag{12}$$

where *A* ∈ (0, 1) is the multiplicative decrease of storage capacity per cycle. Here, the optimal timing of action A2 may be non-trivial due to the continuation value *ζ*(*x*, *α*). We will show, however, that, for the functions *ζ* of interest in this paper, it is optimal to sell the reserve contract immediately after action A1, exactly as in the single problem. The timing of action A1 requires the solution of the optimal stopping problem

$$\mathcal{T}\zeta(x,\alpha) := \sup\_{\tau} \mathbb{E}^{x} \left\{ e^{-r\tau} \left( -\alpha X\_{\tau} + \alpha p\_c + h^{\zeta}(X\_{\tau}, \alpha) \right) \mathbf{1}\_{\tau < \infty} \right\}. \tag{13}$$

The optimal stopping operator T makes the dependence on *ζ* explicit: it maps *ζ* onto the real option value of selling a single reserve contract followed by continuation according to *ζ*. We define the lifetime value function *V̂* as the limit

$$\hat{V}(x) = \lim\_{n \to \infty} (\mathcal{T}^n \mathbf{0})(x, 1) \tag{14}$$

(if the limit exists), where T<sup>*n*</sup> denotes the *n*-fold iteration of the operator T and **0** is the function identically equal to 0. Thus, T<sup>*n*</sup>**0** is the real option value of selling at most *n* reserve contracts under the arrangement. (Note that a priori it may not be optimal to sell all *n* contracts in this case, since it is possible to offer fewer contracts and refrain from trading afterwards by choosing *τ* = ∞.)

Calculation of the lifetime value function requires the analysis of a two-argument function. We will show now that this computation may be reduced to a function of the single argument *x*. Define *ζ*0(*x*, *α*) = 0 and *ζn*+1(*x*, *α*) = T *ζn*(*x*, *α*). We interpret *ζn*(*x*, *α*) as the maximum expected wealth accumulated over at most *n* cycles of the actions A1–A3 when the initial capacity of the store is *α*.

**Lemma 1.** *We have ζn*(*x*, *α*) = *α ζ̂n*(*x*)*, where ζ̂n*(*x*) = *ζn*(*x*, 1)*. Moreover, ζ̂n*(*x*) = T̂<sup>*n*</sup>**0**(*x*)*, where*

$$\hat{\mathcal{T}}\hat{\zeta}(x) = \sup\_{\tau} \mathbb{E}^{x} \left\{ e^{-r\tau} \left( -X\_{\tau} + p\_c + \hat{h}^{\hat{\zeta}}(X\_{\tau}) \right) \mathbf{1}\_{\tau < \infty} \right\},\tag{15}$$

*and*

$$\hat{h}^{\hat{\zeta}}(x) = \begin{cases} \left(K\_c + A\hat{\zeta}(x^\*)\right) \frac{\psi\_r(x)}{\psi\_r(x^\*)}, & x < x^\*,\\ K\_c + A\hat{\zeta}(x), & x \ge x^\*. \end{cases} \tag{16}$$

**Proof.** The proof is by induction. Clearly, the statement is true for *n* = 0. Assume it is true for *n* ≥ 0. Then,

$$\zeta\_{n+1}(x, \alpha) = \mathcal{T}\zeta\_{n}(x, \alpha) = \alpha \sup\_{\tau} \mathbb{E}^{x} \left\{ e^{-r\tau} \left( -X\_{\tau} + p\_c + \frac{1}{\alpha} h^{\zeta\_n}(X\_{\tau}, \alpha) \right) \mathbf{1}\_{\tau < \infty} \right\},$$

and

$$\frac{1}{\alpha}h^{\zeta\_n}(x,\alpha) = \mathbb{E}^{x}\left\{e^{-r\hat{\tau}\_e}\left(K\_c + \frac{1}{\alpha}\zeta\_n(X\_{\hat{\tau}\_e}, A\alpha)\right)\right\} = \mathbb{E}^{x}\left\{e^{-r\hat{\tau}\_e}\left(K\_c + A\hat{\zeta}\_n(X\_{\hat{\tau}\_e})\right)\right\}.$$

Hence, *ζn*+1(*x*, *α*) = *α* T̂*ζ̂n*(*x*) = *α ζn*+1(*x*, 1). Consequently, *ζ̂n* = T̂<sup>*n*</sup>**0**.

Assume that *ζn*(*x*, *α*) converges to *ζ*(*x*, *α*) as *n* → ∞. Then, clearly, *ζ̂n* converges to *ζ̂*(*x*) = *ζ*(*x*, 1). It is also clear that *ζ* is a fixed point of T if and only if *ζ̂* is a fixed point of T̂. Therefore, we have simplified the problem to that of finding a limit of T̂<sup>*n*</sup>**0**(*x*). The stopping problem T̂*ζ̂* will be called the normalised stopping problem and its payoff denoted by

$$\hat{h}(x,\hat{\zeta}) = \begin{cases} -x + p\_c + \frac{\psi\_r(x)}{\psi\_r(x^\*)} (K\_c + A\hat{\zeta}(x^\*)), & x < x^\*, \\ -x + p\_c + K\_c + A\hat{\zeta}(x), & x \ge x^\*. \end{cases} \tag{17}$$

In particular, T̂**0** coincides with the single problem's value function *Vc*.

*Notation.* In the remainder of this paper, a caret (hat) will be used over symbols relating to the normalised lifetime problem:

$$\hat{V}(x) = \lim\_{n \to \infty} \hat{\mathcal{T}}^n \mathbf{0}(x).$$
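The iteration defining the normalised lifetime value can be sketched numerically under a simplifying assumption that this section does not establish: at every iteration the operator is restricted to threshold strategies that buy at some price *y* < *x*∗. For *y* < *x*∗ the payoff (17) depends on the continuation value only through its value at *x*∗, so the iteration reduces to a scalar recursion for the value at *x*∗. The sketch uses the Ornstein-Uhlenbeck parameters and contract values from the caption of Figure 2, an assumed integral representation of *ψr* and *φr*, a purely hypothetical degradation factor *A* = 0.9, and a finite grid of candidate thresholds. All Python names are illustrative.

```python
import numpy as np
from scipy.integrate import quad

THETA, M, SIGMA, R = 3.42, 47.66, 30.65, 0.03   # Figure 2 caption
X_STAR, P_C, K_C = 60.0, 10.0, 40.0             # Figure 2 caption
A = 0.9                                         # hypothetical degradation factor
NU = R / THETA
SCALE = np.sqrt(2.0 * THETA) / SIGMA


def _g(z):
    """g(z) = int_0^inf t^(NU-1) * exp(-t^2/2 + z*t) dt (assumed representation)."""
    head = quad(lambda t: t ** (NU - 1.0) * np.expm1(-0.5 * t * t + z * t), 0.0, 1.0)[0]
    tail = quad(lambda t: t ** (NU - 1.0) * np.exp(-0.5 * t * t + z * t),
                1.0, max(z, 0.0) + 12.0)[0]
    return 1.0 / NU + head + tail


def psi(x):
    return _g((x - M) * SCALE)      # increasing, up to a constant factor


def phi(x):
    return _g(-(x - M) * SCALE)     # decreasing, up to a constant factor


# Candidate purchase thresholds below x* (the lower bound -20 is arbitrary).
grid = np.linspace(-20.0, X_STAR - 0.1, 400)
phi_grid = np.array([phi(y) for y in grid])
psi_ratio = np.array([psi(y) for y in grid]) / psi(X_STAR)
phi_star = phi(X_STAR)


def step(v):
    """Map the continuation value at x* to the next iterate at x*,
    maximising over grid thresholds y < x* only."""
    payoff = -grid + P_C + psi_ratio * (K_C + A * v)   # equation (17) for y < x*
    values = phi_star * payoff / phi_grid              # value at x* of threshold y
    k = int(np.argmax(values))
    return float(values[k]), float(grid[k])


v, history = 0.0, []
for _ in range(500):
    v_next, threshold = step(v)
    history.append(v_next)
    if abs(v_next - v) < 1e-10:
        break
    v = v_next

print(f"single-cycle value at x*: {history[0]:.4f}")
print(f"approximate lifetime value at x*: {history[-1]:.4f}, threshold {threshold:.2f}")
```

The first iterate is the grid approximation of the single problem's value at *x*∗, and the iterates increase towards a fixed point because each step is an increasing map with slope below *A*.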

#### 2.1.4. Sustainability Conditions Revisited

The sustainability conditions **S1** and **S2** introduced in Section 1 are our standing economic assumptions. The next lemma, proved in the appendices, expresses them quantitatively. This makes way for their use in the mathematical considerations below.

**Lemma 2.** *When taken together, the sustainability conditions* **S1** *and* **S2** *are equivalent to the following quantitative conditions:*

*S1\*:* sup<sub>*x*∈(*a*,*b*)</sub> *h*(*x*) > 0*, and S2\*: pc* + *Kc* < *x*∗*.*

Notice that **S1\*** is always satisfied when *a* ≤ 0.

#### *2.2. Three Exhaustive Regimes in the Single Problem*

In this section, we consider the single problem. Recall that the sustainability assumptions, or equivalently assumptions **S1\*** and **S2\***, are in force. For completeness, the notation and general optimal stopping theory used below is presented in Appendix A.

Since the boundary *a* is not-exit, we have *φr*(*a*+) = ∞. When *h* is given by (11), the limit *L* of (A3) is then

$$L\_c := \limsup\_{x \to a} \frac{-x}{\phi\_r(x)}.\tag{18}$$

It can also be verified that the analogous constant *R* defined in (A4) in the appendices satisfies *R* < ∞ since, by **S2\***, *h* is negative on [*x*∗, ∞). The following theorem completes our aim **M2**.

**Theorem 1.** *(Single problem) Assume that conditions* **S1\*** *and* **S2\*** *hold. With the definition* (18)*, there are three exclusive cases:*

$$(A) \quad L\_c \le \frac{h(x)}{\phi\_r(x)} \text{ for some } x \implies \text{there is } \hat{x} < x^\* \text{ that maximises } \frac{h(x)}{\phi\_r(x)} \text{ and then, for } x \ge \hat{x}, \ \tau\_{\hat{x}} \text{ is optimal, and}$$

$$V\_c(x) = \phi\_r(x) \frac{h(\hat{x})}{\phi\_r(\hat{x})}, \qquad x \ge \hat{x}. \tag{19}$$


*Moreover, in cases A and B, the value function Vc is continuous.*

**Proof.** By condition **S1\***, *h*(*y*) is positive for some *y* ∈ *I* and the value function *Vc*(*x*) > 0. For case A, note first that the function *h* is negative on [*x*∗, *b*) by **S2\***, see (9) and (11). Therefore, the supremum of *h*/*φr* is positive and must be attained at some (not necessarily unique) *x̂* ∈ (*a*, *x*∗). The optimality of *τx̂* for *x* ≥ *x̂* then follows from Lemma A1. Case B follows from Lemma A2 and the fact that *Lc* > 0. Lemma A2 proves case C. The continuity of *Vc* follows from Lemma A3.

The optimal strategy in case A is of a threshold type. When an arbitrary threshold strategy *τx̃* is used, the resulting expected value for *x* ≥ *x̃* is given by *φr*(*x*)*h*(*x̃*)/*φr*(*x̃*). Figure 2 (whose problem data fall into case A) shows the potentially high sensitivity of the expected value of discounted cash flows for the single problem with respect to the level of the threshold *x̃*. It is therefore important in general to identify the optimal threshold accurately.

[Figure 2 plot omitted; axis label: stopping boundary, per MWh]

**Figure 2.** Sensitivity of the expected value in the single problem with respect to the stopping boundary. The EIM price is modelled as an Ornstein-Uhlenbeck process *dXt* = 3.42(47.66 − *Xt*)*dt* + 30.65*dWt* (time measured in days, fitted to Elexon Balancing Mechanism price half-hourly data from July 2011 to March 2014). The interest rate *r* = 0.03, power delivery level *x*∗ = 60, the initial premium *pc* = 10, and the utilisation payment *Kc* = 40. The initial price *X*<sub>0</sub> is set equal to *x*∗.
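The threshold optimisation in case A of Theorem 1, and the sensitivity described above, can be reproduced in outline with a one-dimensional numerical search. The sketch below uses the parameters quoted in the caption of Figure 2 and maximises *h*/*φr* over purchase thresholds below *x*∗. The integral representation of *ψr* and *φr* for the Ornstein-Uhlenbeck process is an assumption (a standard formula, not stated in this section), the lower search bound is arbitrary, and all Python names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

THETA, M, SIGMA, R = 3.42, 47.66, 30.65, 0.03   # Figure 2 caption
X_STAR, P_C, K_C = 60.0, 10.0, 40.0             # Figure 2 caption
NU = R / THETA
SCALE = np.sqrt(2.0 * THETA) / SIGMA


def _g(z):
    """g(z) = int_0^inf t^(NU-1) * exp(-t^2/2 + z*t) dt (assumed representation)."""
    head = quad(lambda t: t ** (NU - 1.0) * np.expm1(-0.5 * t * t + z * t), 0.0, 1.0)[0]
    tail = quad(lambda t: t ** (NU - 1.0) * np.exp(-0.5 * t * t + z * t),
                1.0, max(z, 0.0) + 12.0)[0]
    return 1.0 / NU + head + tail


def psi(x):
    return _g((x - M) * SCALE)      # increasing, up to a constant factor


def phi(x):
    return _g(-(x - M) * SCALE)     # decreasing, up to a constant factor


PSI_STAR = psi(X_STAR)


def h(x):
    """Payoff (11), with h_c taken from (9)."""
    h_c = K_C if x >= X_STAR else K_C * psi(x) / PSI_STAR
    return -x + P_C + h_c


def ratio(y):
    """The quantity h / phi_r that the optimal threshold maximises."""
    return h(y) / phi(y)


LOWER = -20.0   # arbitrary lower bound for the numerical search
result = minimize_scalar(lambda y: -ratio(y), bounds=(LOWER, X_STAR), method="bounded")
x_hat = float(result.x)


def threshold_value(x, x_tilde):
    """Expected value phi_r(x) * h(x_tilde) / phi_r(x_tilde), valid for x >= x_tilde."""
    return phi(x) * ratio(x_tilde)


for x_tilde in (x_hat - 10.0, x_hat, x_hat + 10.0):
    print(f"threshold {x_tilde:7.2f}: value at x* = {threshold_value(X_STAR, x_tilde):.4f}")
```

A bounded scalar search is used because the maximiser is known to lie below *x*∗. If several maximisers existed, a grid scan followed by local refinement would be a safer choice.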

We now show that, for commonly used diffusion price models, it is case A in the above theorem which is of principal interest. This is due to the mild sufficient conditions established in the following lemma, which are satisfied by, for instance, the examples in Section 3. Although condition 2(b) in Lemma 3 is rather implicit, it may be interpreted as requiring that the process *X* does not 'escape relatively quickly to −∞' (see Appendix D for a further discussion and examples); it is satisfied, for example, by the Ornstein-Uhlenbeck process.

**Lemma 3.** *If condition S1*∗ *holds, then:*

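1. *if* $L_c = 0$*, then case A of Theorem 1 applies;*
2. *the equality* $L_c = 0$ *holds if either of the following conditions is satisfied:*
	- *(a)* $a > -\infty$*,*
	- *(b)* $a = -\infty$ *and* $\lim_{x \to -\infty} \frac{x}{\phi_r(x)} = 0$*.*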

**Proof.** Condition **S1\*** ensures that $h$ takes positive values. Hence, when $L_c = 0$, the ratio $\frac{h(x)}{\phi_r(x)} > 0 = L_c$ for some $x$, which is case A of Theorem 1. For assertion 2(a), recall from Section 2 that $\phi_r(a+) = \infty$ since the boundary $a$ is non-exit. Then, we have $L_c = \limsup_{x \to a} (-x)/\phi_r(x) = 0$ as $a > -\infty$. In 2(b), the equality $L_c = 0$ is immediate from the definition of $L_c$.

Turning now to aim **M1**, we have

**Corollary 1.** *In the setting of Theorem 1 for the single problem, either*

*(a) the quantity*

$$\check{x} := \max \left\{ x \in I : \frac{h(x)}{\phi_r(x)} = \sup_{y \in I} \frac{h(y)}{\phi_r(y)} \right\} \tag{20}$$

*is well-defined, i.e., the set is non-empty. Then, x*ˇ *is the highest price at which the battery operator may buy energy when acting optimally, and we have x*ˇ < *x*∗ *(this is case A); or*

*(b) there is no price at which it is optimal for the battery operator to purchase energy. In this case, the single problem's value function may either be infinite (case C) or finite (case B).*

**Proof.** (a) Since the maximiser $\hat{x}$ in case A of Theorem 1 is not necessarily unique, the set in (20) may contain more than one point. Since $h$ and $\phi_r$ are continuous and all maximisers lie to the left of $x^*$, this set is closed and bounded from above, so $\check{x}$ is well-defined and a maximiser in case A. For any stopping time $\tau$ with $\mathbb{P}_x\{X_\tau > \check{x}\} > 0$, it is immediate from assertion 3 of Lemma A1 that $\tau$ is not optimal for the problem $V_c(x)$, $x \ge \check{x}$. Part (b) follows directly from cases B and C of Theorem 1.

Corollary 1 confirms that it is optimal for the battery operator to buy energy only when the EIM price is strictly lower than the price *x*∗ which would trigger immediate power delivery to the system operator. Thus, the battery operator (when acting optimally) does not directly conflict with the system operator's balancing actions.

#### *2.3. Two Exhaustive Regimes in the Lifetime Problem*

Turning to the lifetime problem, we begin by letting $\hat{\zeta}(x)$ in definition (16) be a general nonnegative continuation value depending only on the EIM price $x$, and studying the normalised stopping problem (15) in this case (the payoff $\hat{h}$ is therefore defined as in (17)).

We now wish to study the value of *n* cycles A1–A3, and hence the lifetime value, by iterating the operator $\hat{\mathcal{T}}$. To justify this approach, it is necessary to check the timing of action A2 in the lifetime problem. With the actions A1–A3 defined as in Section 1.1, recall that the timing of action A2 is trivial in the single problem: after A1 it is optimal to perform A2 immediately. Lemma A5, which may be found in Appendix B, confirms that the same property holds in the lifetime problem.

We may now provide the following answer to objective **M1** for the lifetime problem.

**Corollary 2.** *Assume that conditions* **S1\*** *and* **S2\*** *hold. In the lifetime problem with* $\hat{\zeta} = \hat{V}$*, either:*

*(a) the quantity*

$$\check{x} := \max \left\{ x \in I : \frac{\hat{h}(x, \hat{\zeta})}{\phi_r(x)} = \sup_{y \in I} \frac{\hat{h}(y, \hat{\zeta})}{\phi_r(y)} \right\} \tag{21}$$

*is well-defined, i.e., the set is non-empty. Then, x*ˇ *is the highest price at which the battery operator can buy energy when acting optimally in the lifetime problem, and we have x*ˇ < *x*∗ *(cases 1 and 2a in Lemma A4 in Appendix B); or*

*(b) there is no price at which it is optimal for the battery operator to purchase energy. In this case, the lifetime value function may either be infinite (case 3) or finite (case 2b in Lemma A4 in Appendix B).*

**Proof.** The proof proceeds exactly as that of Corollary 1 with the exception of showing that $\check{x} < x^*$ (this is because Lemma A4 in the appendices, which characterises the possible solution types in the lifetime problem, does not guarantee the strict inequality $\check{x} < x^*$). Assume then that $\check{x} = x^*$. At the EIM price $X_t = \check{x} = x^*$, the power delivery to the system operator is immediately followed by the purchase of energy by the battery operator and this cycle can be repeated instantaneously, arbitrarily many times. However, since each such cycle is loss making for the battery operator by condition **S2\***, this strategy would lead to unbounded losses almost surely in the lifetime problem started at EIM price $x^*$, leading to $\hat{V}(x^*) = -\infty$. This would contradict the fact that $\hat{V} > 0$, so we conclude that $\check{x} < x^*$.

Pursuing aim **M2**, we will show now that there are two regimes in the lifetime problem: either the lifetime value function is strictly greater than the single problem's value function (and the cycle A1–A3 is repeated infinitely many times), or the lifetime value equals the single problem's value. Although the latter case appears counterintuitive, it is explained by the fact that the lifetime problem's value is then attained only in the limit when the purchase of energy (action A1) is made at a decreasing sequence of prices converging to *a*, the left boundary of the process (*Xt*). In this limit, the benefit of future payoffs becomes negligible, equating the lifetime value to the single problem's value.<sup>1</sup>

**Theorem 2.** *There are two exclusive regimes:*

*(α)* $\hat{V}(x) > V_c(x)$ *for all* $x \ge x^*$*,*

*(β)* $\hat{V}(x) = V_c(x)$ *for all* $x \ge x^*$ *(or both are infinite for all x).*

*Moreover, in regime (α), an optimal stopping time exists when the continuation value is* $\hat{\zeta} = \hat{\zeta}_n = \hat{\mathcal{T}}^n \mathbf{0}$ *for* $n > 0$ *(that is, for a finite number of reserve contracts), and when* $\hat{\zeta} = \hat{V}$ *(for the lifetime value function).*

**Proof.** We take the continuation value $\hat{\zeta} = V_c$ in Lemma A4 from Appendix B and consider separately its cases 1, 2a, 2b and 3. Firstly, in case 3, we have $V_c = \infty$, implying that also $\hat{V} = \infty$ and we have regime (*β*).

Case 2 of Lemma A4 corresponds to case B of Theorem 1, when there is no optimal stopping time in the single problem and $V_c(x) = L_c \phi_r(x)$ for all $x \in I$. Considering first case 2b and defining $\hat{\zeta}_n$ as in Lemma A6 in Appendix B, it follows that $\hat{\zeta}_2(x) = L_c \phi_r(x) = V_c(x)$ for $x \in I$ and consequently $\hat{V} = V_c$, which again corresponds to regime (*β*).

In case 2a of Lemma A4, suppose first that the maximiser $\hat{x} \le x^*$ is such that $\frac{\hat{h}(\hat{x}, \hat{\zeta}_1)}{\phi_r(\hat{x})} = L_c$. Then, for $x \ge x^* \ge \hat{x}$, we have $\hat{\zeta}_2(x) = \phi_r(x) \frac{\hat{h}(\hat{x}, \hat{\zeta}_1)}{\phi_r(\hat{x})} = L_c \phi_r(x)$, which also yields regime (*β*). On the other hand, when $\frac{\hat{h}(\hat{x}, \hat{\zeta}_1)}{\phi_r(\hat{x})} > L_c$, we have for $x \ge x^* \ge \hat{x}$ that $\hat{\zeta}_2(x) = \phi_r(x) \frac{\hat{h}(\hat{x}, \hat{\zeta}_1)}{\phi_r(\hat{x})} > L_c \phi_r(x) = \hat{\zeta}_1(x)$, and so regime (*α*) applies by the monotonicity of the operator $\hat{\mathcal{T}}$. From the definition of $\hat{h}$ in (17), and holding the point $\hat{x} \le x^*$ constant, this monotonicity implies that $\frac{\hat{h}(\hat{x}, \hat{\zeta}_n)}{\phi_r(\hat{x})} > L_c$ for all $n > 1$ and that $\frac{\hat{h}(\hat{x}, \hat{V})}{\phi_r(\hat{x})} > L_c$.

<sup>1</sup> If the lifetime value is infinite then so is the single problem's value and they are equal in this sense. When the lifetime value is zero then it is optimal not to enter the contract, and so the single problem's value is also zero.

We conclude that case 2a of Lemma A4 applies (rather than case 2b) for a finite number of reserve contracts and also in the lifetime problem.

Considering now the maximiser *x*ˆ defined in case 1 of Lemma A4, we have for *x* ≥ *x*<sup>∗</sup> ≥ *x*ˆ that

$$\hat{\zeta}_2(x) = \phi_r(x) \frac{\hat{h}(\hat{x}, \hat{\zeta}_1)}{\phi_r(\hat{x})} \ge \phi_r(x) \frac{\hat{h}(\hat{x}_0, \hat{\zeta}_1)}{\phi_r(\hat{x}_0)} > \phi_r(x) \frac{\hat{h}(\hat{x}_0, \mathbf{0})}{\phi_r(\hat{x}_0)} = \hat{\zeta}_1(x) = V_c(x),$$

and regime (*α*) again follows by monotonicity. In addition, trivially, case 1 of Lemma A4 applies for $\hat{\zeta} = \hat{\zeta}_n$ and $\hat{\zeta} = \hat{V}$.

The following corollary follows immediately from the preceding proof.

**Corollary 3.** *Regime (β) holds if and only if* $\hat{\mathcal{T}}^2 \mathbf{0}(x) = \hat{\mathcal{T}} \mathbf{0}(x)$ *for all* $x \ge x^*$*.*

To address the implicit nature of our answers to **M1** and **M2** for the lifetime problem, in the next section, we provide results for the construction and verification of the lifetime value function and corresponding stopping time. For this purpose, we close this section by summarising results obtained above (making use of additional results from Appendix C).

**Theorem 3.** *In the setting of Theorem 2, assume that regime (α) holds. Then, the lifetime value function* $\hat{V}$ *is continuous, is a fixed point of the operator* $\hat{\mathcal{T}}$ *and* $\hat{\mathcal{T}}^n \mathbf{0}$ *converges to* $\hat{V}$ *exponentially fast in the supremum norm. Moreover, there is* $\check{x} < x^*$ *such that* $\tau_{\check{x}}$ *is an optimal stopping time for* $\hat{\mathcal{T}} \hat{V}(x)$ *when* $x \ge \check{x}$ *and, furthermore,* $\check{x}$ *is the highest price at which the battery operator can buy energy when acting optimally.*
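The convergence of the iterates $\hat{\mathcal{T}}^n \mathbf{0}$ can be illustrated at the single point $x = x^*$. The sketch below makes several assumptions that are not taken from the source: the price is a driftless Brownian motion $dX_t = \sigma\,dW_t$ (so that the discount factors are exponentials), only threshold strategies $\tau_z$ with $z < x^*$ on a finite grid are considered, and the payoff has the form displayed in (26). The function name, grid and parameter values are hypothetical.

```python
import math

def iterate_lifetime_value(r, sigma, x_star, p_c, K_c, A, lo, n_grid=2001, n_iter=25):
    """Iterate v -> max_z d(z) * (-z + p_c + u(z) * (K_c + A * v)), starting from v = 0.

    d(z) = phi_r(x*) / phi_r(z) discounts the passage from x* down to the buy price z;
    u(z) = psi_r(z) / psi_r(x*) discounts the passage from z back up to x*.
    For the ASSUMED model dX = sigma dW both equal exp(-kappa * (x* - z)).
    """
    kappa = math.sqrt(2.0 * r) / sigma
    grid = [lo + (x_star - lo) * i / n_grid for i in range(n_grid)]  # all points < x_star
    down = [math.exp(-kappa * (x_star - z)) for z in grid]
    up = [math.exp(-kappa * (x_star - z)) for z in grid]
    values = [0.0]
    for _ in range(n_iter):
        v = values[-1]
        values.append(max(d * (-z + p_c + u * (K_c + A * v))
                          for z, d, u in zip(grid, down, up)))
    return values

# Contract data from Figure 2 and the degradation factor quoted in Section 4.
values = iterate_lifetime_value(r=0.03, sigma=30.65, x_star=60.0,
                                p_c=10.0, K_c=40.0, A=0.9999, lo=-300.0)
for n in (1, 2, 3, 5, 10, 25):
    print(n, values[n], values[n] - values[n - 1])
```

The first iterate is the (grid-restricted) single problem value at $x^*$, and the printed increments give a rough indication of how quickly successive iterates settle.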

#### *2.4. Construction of the Lifetime Value Function*

In this section, we discuss a numerical procedure for solution of the lifetime problem. It is based on the problem's structure as summarised in Theorem 3. Lemma 4 provides a means of constructing the lifetime value function, together with the value *x*ˇ of Theorem 3, using a one-dimensional search. We assume that regime (*α*) of Theorem 2 holds.

In the circumstance when the above procedure is not followed, complementary findings in Appendix E enable one to verify if a candidate buy price *x*ˆ is optimal for the lifetime problem.

**Lemma 4.** *The lifetime value function evaluated at x*∗ *satisfies*

$$\hat{V}(x^*) = \max_{z \in (a, x^*)} y(z), \tag{22}$$

*where*

$$y(z) := \frac{-z + p_c + \frac{\psi_r(z)}{\psi_r(x^*)} K_c}{\frac{\phi_r(z)}{\phi_r(x^*)} - \frac{\psi_r(z)}{\psi_r(x^*)} A}. \tag{23}$$

**Proof.** Fix *z* ∈ (*a*, *x*∗). In the normalised lifetime problem of Section 2.1.3, suppose that the strategy *τ<sup>z</sup>* is used for each energy purchase. Writing *y* for the total value of this strategy under P*x*<sup>∗</sup> , by construction we have the recursion

$$y = \frac{\phi\_r(\mathbf{x}^\*)}{\phi\_r(z)} \left( -z + p\_c + \frac{\psi\_r(z)}{\psi\_r(\mathbf{x}^\*)} \left( \mathbf{K}\_c + Ay \right) \right).$$

Rearranging, we obtain (23). By Theorem 3, there exists an optimal strategy *τx*<sup>ˇ</sup> of the above form under P*x*<sup>∗</sup> and (22) follows.

Hence, under $\mathbb{P}_{x^*}$, an optimal stopping level $\hat{x}$ can be found by maximising $y(z)$ over $z \in (a, x^*)$. The value $\check{x}$ of Theorem 3 is given by $\check{x} = \max\{x : y(x) = \max_{z \in (a, x^*)} y(z)\}$.
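The one-dimensional search of Lemma 4 can be sketched as follows. As an illustrative assumption (not the model used in the case study), the price is a driftless Brownian motion $dX_t = \sigma\,dW_t$, so that $\psi_r(x) = e^{\kappa x}$ and $\phi_r(x) = e^{-\kappa x}$ with $\kappa = \sqrt{2r}/\sigma$. The grid search, its lower bound and the function names are hypothetical choices; any one-dimensional optimiser could be substituted.

```python
import math

def y_of_z(z, x_star, p_c, K_c, A, psi, phi):
    """Value (23) of using the buy threshold z < x_star for every energy purchase."""
    up = psi(z) / psi(x_star)          # discount factor for the passage z -> x_star
    down_inv = phi(z) / phi(x_star)    # reciprocal discount factor for x_star -> z
    return (-z + p_c + up * K_c) / (down_inv - up * A)

def largest_maximiser(func, lo, hi, n=20000):
    """Grid search on [lo, hi); '>=' returns the largest grid maximiser on ties."""
    best_z, best_val = None, -math.inf
    for i in range(n):
        z = lo + (hi - lo) * i / n
        val = func(z)
        if val >= best_val:
            best_z, best_val = z, val
    return best_z, best_val

# ASSUMED price model dX = sigma dW; contract data from Figure 2, A from Section 4.
r, sigma = 0.03, 30.65
kappa = math.sqrt(2.0 * r) / sigma
psi = lambda x: math.exp(kappa * x)
phi = lambda x: math.exp(-kappa * x)
x_star, p_c, K_c, A = 60.0, 10.0, 40.0, 0.9999

x_check, v_star = largest_maximiser(
    lambda z: y_of_z(z, x_star, p_c, K_c, A, psi, phi), -300.0, x_star)
print("buy threshold:", x_check, "lifetime value at x_star:", v_star)
```

Because $\phi_r(z)/\phi_r(x^*) > 1 > \psi_r(z)/\psi_r(x^*)$ for $z < x^*$ and $A < 1$ here, the denominator of (23) stays positive on the whole search interval.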

#### **3. Results**

The general theory presented above provides optimal stopping times for initial EIM prices *x* ≥ *x*ˇ, where *x*ˇ is the highest price at which the battery operator can buy energy optimally. In this section, for specific models of the EIM price, we derive optimal stopping times for *all possible* initial EIM prices *x* ∈ *I* when the sustainability conditions **S1\*** and **S2\*** hold. In the examples of this section, the stopping sets Γ for the single and lifetime problems take the form (*a*, *x*ˇ] although, in general, stopping sets may have much more complex structure. Interestingly, the stopping sets for the single and lifetime problem are either both half-lines or both compact intervals.

Note that condition **S2\*** is ensured by the explicit choice of parameters. Verification of condition **S1\*** is straightforward by checking, for example, if the left boundary $a$ of the interval $I$ satisfies $a < p_c + \lim_{x \to a} \frac{\psi_r(x)}{\psi_r(x^*)} K_c$, i.e., that $\limsup_{x \to a} h(x) > 0$. In particular, **S1\*** always holds if $a = -\infty$.

Our approach is to combine the above general results with the geometric method drawn from Section 5 of Dayanik and Karatzas (2003). Although Proposition 5.12 of the latter paper gives results for natural boundaries, we note that the same arguments apply to entrance-not-exit boundaries. In particular, we construct the least concave majorant *W* of the obstacle *H* : [0, ∞) → R, where

$$H(y) := \begin{cases} \frac{\hat{h}(F^{-1}(y), \hat{\zeta})}{\phi_r(F^{-1}(y))}, & y > 0, \\ \limsup_{x \to a} \frac{\hat{h}(x, \hat{\zeta})}{\phi_r(x)} = L_c, & y = 0 \end{cases} \tag{24}$$

(the latter equality was given in (A5) in the appendices). Here, the function $F(x) = \psi_r(x)/\phi_r(x)$ is strictly increasing with $F(a+) = 0$. Writing $\hat{\Gamma}$ for the set on which $W$ and $H$ coincide, under appropriate conditions, the smallest optimal stopping time is given by the first hitting time of the set $\Gamma := F^{-1}(\hat{\Gamma})$ (Dayanik and Karatzas 2003, Propositions 5.13 and 5.14).

The Ornstein-Uhlenbeck (OU) process is a continuous-time stochastic process with dynamics

$$dX_t = \theta(\mu - X_t)dt + \sigma\,dW_t, \tag{25}$$

where $\theta, \sigma > 0$ and $\mu \in \mathbb{R}$. It has two natural boundaries, $a = -\infty$ and $b = \infty$. This process extends the scaled Brownian motion model by introducing a mean-reverting drift term $\theta(\mu - X_t)dt$. Mean reversion is commonly observed in commodity price time series and may have several causes (Lutz 2009). In the present context, the mean reversion can also be interpreted as the impact on prices of the system operator's corrective balancing actions. Appendix F collects some useful facts about the Ornstein-Uhlenbeck process. In particular, when constructing $W$, it is convenient to note that $H'' \circ F$ has the same sign as $(\mathcal{L} - r)h$, where $\mathcal{L}$ is the infinitesimal generator of $X$ defined as in Appendix F.

#### *3.1. OU Price Process*

Assume now that the EIM price follows the OU process (25) so that $L_c = 0$ (see Equation (A19) in Appendix F) and, by Lemma 3, case A of Theorem 1 applies. We are able to deal with the single and lifetime problems simultaneously by setting $\hat{\zeta}$ equal to 0 for the single problem and equal to (the positive function) $\hat{V}$ in the lifetime problem. The results of Sections 2.2 and 2.3 yield that, in both problems, the right endpoint of the set $\hat{\Gamma}$ equals $F(\check{x})$ for some $-\infty < \check{x} < x^*$. Furthermore, since $\psi_r$ is a solution to $(\mathcal{L} - r)v = 0$ and since $\check{x} < x^*$, for $x \le x^*$, we have

$$(\mathcal{L} - r)\hat{h}(x, \hat{\zeta}) = (\mathcal{L} - r) \left( -x + p_c + \frac{\psi_r(x)}{\psi_r(x^*)} \left(K_c + A\hat{\zeta}(x^*)\right) \right) \tag{26}$$

$$= (\mathcal{L} - r)(-x + p_c) \tag{27}$$

$$= (r+\theta)x - r p_c - \theta\mu. \tag{28}$$

Therefore, the function $(\mathcal{L} - r)\hat{h}(\cdot, \hat{\zeta})$ is negative on $(-\infty, B_0)$ and positive on $(B_0, \infty)$, where $B_0 = \frac{r p_c + \theta\mu}{r + \theta}$. This implies that $H$ is strictly concave on $(0, F(B_0))$ and strictly convex on $(F(B_0), \infty)$. Since the concave majorant $W$ of $H$ cannot coincide with $H$ at any point of convexity, necessarily $\check{x} < B_0$ and $H$ is concave on $(0, F(\check{x}))$. Hence, we conclude that $W$ is equal to $H$ on the latter interval and so $\Gamma = (-\infty, \check{x}]$.
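The step from (27) to (28) and the location of the sign change can be checked with a few lines of arithmetic. The sketch below applies the OU generator $\mathcal{L} = \theta(\mu - x)\frac{d}{dx} + \frac{1}{2}\sigma^2\frac{d^2}{dx^2}$ to the affine function $g(x) = -x + p_c$; the function names are hypothetical and the parameter values are those quoted in the caption of Figure 2, used purely for illustration.

```python
def generator_minus_r_affine(x, r, theta, mu, sigma, p_c):
    """(L - r) g at x for g(x) = -x + p_c and dX = theta (mu - X) dt + sigma dW.

    Here g' = -1 and g'' = 0, so the diffusion term vanishes whatever sigma is.
    """
    drift_term = theta * (mu - x) * (-1.0)
    diffusion_term = 0.5 * sigma ** 2 * 0.0
    discount_term = -r * (-x + p_c)
    return drift_term + diffusion_term + discount_term

def sign_change_point(r, theta, mu, p_c):
    """B_0 = (r p_c + theta mu) / (r + theta), the root of (28)."""
    return (r * p_c + theta * mu) / (r + theta)

# Illustrative values from the caption of Figure 2.
r, theta, mu, sigma, p_c = 0.03, 3.42, 47.66, 30.65, 10.0
B0 = sign_change_point(r, theta, mu, p_c)
print("B0 =", B0)
print("below B0:", generator_minus_r_affine(B0 - 1.0, r, theta, mu, sigma, p_c))
print("above B0:", generator_minus_r_affine(B0 + 1.0, r, theta, mu, sigma, p_c))
```

Since $B_0$ is a weighted average of $p_c$ and $\mu$ with weights $r$ and $\theta$, it lies between the two.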

#### *3.2. General Mean-Reverting Processes*

The above reasoning can be extended to mean-reverting processes with general volatility

$$dX\_t = \theta(\mu - X\_t)dt + \sigma(X\_t)dW\_t$$

for a measurable function $\sigma$ such that the above equation admits a unique solution, cf. Section 2, and $L_c = 0$ (cf. (24)). Recall that we assume that $(X_t)$ has two non-exit boundaries $a$, $b$ (natural or entrance-not-exit boundaries) satisfying $a < x^* < b$. Since $\mathcal{L} = \theta(\mu - x) \frac{d}{dx} + \frac{1}{2}\sigma^2(x) \frac{d^2}{dx^2}$, Equations (26)–(28) still apply. In particular, we see that the diffusion coefficient $\sigma(\cdot)$ does not affect the sign of (28) and thus does not influence the concavity properties of $H$ on $(0, F(x^*))$. Proceeding as above, we argue that case A of Theorem 1 applies and the single and lifetime problems can be solved simultaneously. In particular, the largest buy price $\check{x}$ satisfies $a < \check{x} < x^*$ (and differs between the single and lifetime problems). Note that the form of the stopping set is determined purely by $\mu$, $\theta$, the left boundary $a$ and the initial premium $p_c$. The mean price level $\mu$ satisfies $\mu > a$ because $a$ is an unreachable boundary.

**Lemma 5.** *If pc* > *a, then the stopping sets for the single and lifetime problems are of the form* Γ = (*a*, *x*ˇ]*.*

**Proof.** The same arguments as in the OU case are directly applicable to the present setting and, under the assumptions of the lemma, we have $B_0 = \frac{r p_c + \theta\mu}{r + \theta} > a$. Hence, for each problem, the stopping set has the form $\Gamma = (a, \check{x}]$ for some $\check{x} < B_0$.

In the particular case of the CIR model (Cox et al. 1985)

$$dX_t = \theta(\mu - X_t)dt + \sigma \sqrt{X_t}\,dW_t, \tag{29}$$

we have *a* = 0, *b* = ∞. Then:

**Corollary 4.** *If X is the CIR process* (29) *with* $2\theta\mu \ge \sigma^2$ *and* $\mu > 0$*, then the boundary* $a = 0$ *is entrance-not-exit. Furthermore, if* $p_c > 0$*, then the stopping sets for the single and lifetime problems are of the form* $\Gamma = (0, \check{x}]$*.*

**Proof.** It follows from (Cox et al. 1985, p. 391) that the condition $2\theta\mu \ge \sigma^2$ is necessary and sufficient for the boundary 0 to be entrance-not-exit. By Lemma 3, we have $L_c = 0$. An application of Lemma 5 concludes the proof.
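The hypotheses of Corollary 4 are simple parameter checks, which can be sketched as below. The function names and the parameter values are hypothetical; the bound returned by `buy_price_upper_bound` is the quantity $B_0$ from the proof of Lemma 5, below which the largest buy price is stated to lie.

```python
def cir_left_boundary_is_entrance_not_exit(theta, mu, sigma):
    """Condition 2 * theta * mu >= sigma**2 quoted in Corollary 4 (theta, sigma > 0 assumed)."""
    return 2.0 * theta * mu >= sigma ** 2

def corollary4_applies(theta, mu, sigma, p_c):
    """True when mu > 0, p_c > 0 and the boundary condition all hold."""
    return mu > 0.0 and p_c > 0.0 and cir_left_boundary_is_entrance_not_exit(theta, mu, sigma)

def buy_price_upper_bound(r, theta, mu, p_c):
    """B_0 = (r p_c + theta mu) / (r + theta) from the proof of Lemma 5."""
    return (r * p_c + theta * mu) / (r + theta)

# Hypothetical CIR parameters, chosen only to exercise the checks.
r, theta, mu, sigma, p_c = 0.03, 2.0, 40.0, 5.0, 10.0
print(corollary4_applies(theta, mu, sigma, p_c))
print(buy_price_upper_bound(r, theta, mu, p_c))
```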

**Remark 1.** *More generally, suppose that the imbalance price process follows*

$$dX\_t = \theta(\mu - X\_t)dt + \sigma X\_t^\gamma dW\_t$$

*for some γ* > 0.5*. Then, the left boundary a* = 0 *is entrance-not-exit for any choice of parameters θ*, *μ*, *σ* > 0 *since the scale function p given in* (3) *converges to negative infinity at* 0*. Therefore, the arguments in the above corollary apply and the stopping sets for the single and lifetime problems are also of the form* Γ = (0, *x*ˇ]*.*

#### *3.3. Shifted Exponential Price Processes*

In order to first recover and then generalise previously obtained results (Moriarty and Palczewski 2017), take the following shifted exponential model for the price process:

$$f(z) \quad := \quad D + d e^{bz},\tag{30}$$

$$X\_t \quad = \quad f(Z\_t), \tag{31}$$

where $Z$ is a regular one-dimensional diffusion with non-exit (natural or entrance-not-exit) boundaries $a^Z$ and $b^Z$ (we will use the superscripts $X$ and $Z$ where necessary to emphasise the dependence on the stochastic process). The idea is that $Z$ models the physical system imbalance process while $f$ represents a *price stack* of bids and offers which is used to form the EIM price. In this case, the left boundary for $X$ is $a = f(a^Z) \ge D$ and, by Lemma 3, $L_c = 0$ and case A of Theorem 1 applies. Rather than working with the implicitly defined process $X$, however, we may work directly with the process $Z$ by setting:

$$z^* := f^{-1}(x^*), \tag{32}$$

$$h_f(z) := -f(z) + p_c + \begin{cases} \frac{\psi_r^Z(z)}{\psi_r^Z(z^*)} K_c, & z < z^*, \\ K_c, & z \ge z^*, \end{cases} \tag{33}$$

$$\hat{h}_f(z, \hat{\zeta}) := \begin{cases} -f(z) + p_c + \frac{\psi_r^Z(z)}{\psi_r^Z(z^*)} \left( K_c + A\hat{\zeta}(z^*) \right), & z < z^*, \\ -f(z) + p_c + K_c + A\hat{\zeta}(z), & z \ge z^*, \end{cases} \tag{34}$$

and modifying the definitions for $\mathcal{T}$, $\hat{\mathcal{T}}$, $V_c$ and $\hat{V}$ accordingly. We then have

**Theorem 4.** *Taking definitions* (30) *and* (32)*–*(34)*, assume that conditions* **S1\*** *and* **S2\*** *hold. Then,*

$$L\_c := \limsup\_{z \to a^Z} \frac{-f(z)}{\phi\_r^Z(z)} = 0.$$

*In addition:*

*(i) (Single problem) There exists* $\hat{z} < z^*$ *that maximises* $\frac{h_f(z)}{\phi_r^Z(z)}$*, the stopping time* $\tau_{\hat{z}}$ *is optimal for* $z \ge \hat{z}$*, and*

$$V_c(z) = \phi_r^Z(z) \frac{h_f(\hat{z})}{\phi_r^Z(\hat{z})}, \qquad z \ge \hat{z}.$$

*(ii) (Lifetime problem) The lifetime value function* $\hat{V}$ *is continuous and a fixed point of* $\hat{\mathcal{T}}$*. There exists* $\tilde{z} \in (\hat{z}, z^*)$ *which maximises* $\frac{\hat{h}_f(z, \hat{V})}{\phi_r^Z(z)}$ *and* $\tau_{\tilde{z}}$ *is an optimal stopping time for* $z \ge \tilde{z}$ *with*

$$\hat{V}(z) = \hat{\mathcal{T}}\hat{V}(z) = \phi_r^Z(z) \frac{\hat{h}_f(\tilde{z}, \hat{V})}{\phi_r^Z(\tilde{z})}, \qquad z \ge \tilde{z}.$$

**Proof.** The proof follows from the one-to-one correspondence between the process *X* and the process *Z*, and direct transfer from Theorems 1 and 3.

In some cases, explicit necessary and/or sufficient conditions for **S1\*** may be given in terms of the problem parameters. Assume that $a^Z = -\infty$ as in the examples studied below. If $p_c > D$ and $K_c \ge 0$, this is sufficient for the condition **S1\*** to be satisfied as then $h_f(z) \ge -f(z) + p_c > 0$ for sufficiently small $z$. When $p_c = D$ and $K_c > 0$, it is sufficient to verify that $e^{bz} = o\left(\psi_r^Z(z)\right)$ as $z \to -\infty$ since then $h_f(z) = -d e^{bz} + \psi_r^Z(z) K_c/\psi_r^Z(z^*)$ for $z < z^*$. On the other hand, our assumption that **S1\*** holds necessarily excludes parameter combinations with $p_c - D = K_c = 0$, since the reserve contract writer then cannot make any profit because $h_f(z) \le 0$ for all $z$.

In Section 3.3.1, we take *Z* to be the standard Brownian motion and recover results from the single problem of Moriarty and Palczewski (2017) (the lifetime problem is formulated differently in the latter reference, where degradation of the store is not modelled). In Section 3.3.2, we generalise to the case when *Z* is an OU process.

#### 3.3.1. Brownian Motion Imbalance Process

When the imbalance process *Z* = *W*, the Brownian motion, we have

$$(\mathcal{L} - r)\hat{h}_f(z, \hat{\zeta}) = (\mathcal{L} - r)(-f(z) + p_c) = d e^{bz} \left\{ r - \frac{1}{2}b^2 \right\} + r(D - p_c).$$

We have several cases depending on the signs of $(D - p_c)$ and $(r - \frac{1}{2} b^2)$.

1. Suppose first that $r > \frac{1}{2} b^2$.
	- (i) We may exclude the subcase $p_c \le D$, since then $H(y) = \left.\frac{\hat{h}_f(z, \hat{\zeta})}{\phi_r^Z(z)}\right|_{z = (F^Z)^{-1}(y)}$ is strictly convex on $(0, F^Z(z^*))$ for any $\hat{\zeta}$ and $\Gamma$ cannot intersect this interval, contradicting Theorem 4 and, consequently, violating **S1\*** or **S2\***.
	- (ii) If $p_c > D$, $H$ is concave on $(0, F^Z(B))$ and convex on $(F^Z(B), \infty)$, where

$$B = \frac{1}{b} \log \left( \frac{r(p\_c - D)}{d(r - \frac{1}{2}b^2)} \right).$$

By Theorem 4 and the positivity of $H$ on $(0, F^Z(\hat{z}))$, we have $\Gamma = (-\infty, \hat{z}]$ and $\Gamma = (-\infty, \tilde{z}]$ for the single and lifetime problems, respectively, with $\hat{z} < \tilde{z} < B$ (the ordering of $\hat{z}$ and $\tilde{z}$ being that of Theorem 4(ii)).

2. Suppose now that $r < \frac{1}{2} b^2$.
	- (i) When $p_c \ge D$, the function $H$ is concave on $(0, \infty)$. Hence, the stopping sets $\Gamma$ for the single and lifetime problems have the same form as in case 1(ii) above.
	- (ii) If $p_c < D$, the function $H$ is convex on $(0, F^Z(B))$ and concave on $(F^Z(B), \infty)$. The set $\Gamma$ must then be an interval, respectively $[\hat{z}_0, \hat{z}]$ and $[\tilde{z}_0, \tilde{z}]$. For explicit expressions for the left and right endpoints for the single problem, as well as sufficient conditions for **S1\***, the reader is referred to Moriarty and Palczewski (2017).
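The quantity $B$ separating the concave and convex regions can be evaluated directly from the displayed expression for $(\mathcal{L} - r)(-f(z) + p_c)$. The sketch below is illustrative only: the function names and the parameter values are hypothetical, chosen so that $r > \frac{1}{2}b^2$ and $p_c > D$.

```python
import math

def curvature_indicator(z, r, b, d, D, p_c):
    """(L - r)(-f(z) + p_c) for Z = W and f(z) = D + d * exp(b * z)."""
    return d * math.exp(b * z) * (r - 0.5 * b * b) + r * (D - p_c)

def inflection_point(r, b, d, D, p_c):
    """B = (1/b) log( r (p_c - D) / (d (r - b^2 / 2)) ), when the logarithm is defined."""
    denom = d * (r - 0.5 * b * b)
    if denom == 0.0:
        raise ValueError("r = b^2 / 2: the indicator is constant and has no sign change")
    ratio = r * (p_c - D) / denom
    if ratio <= 0.0:
        raise ValueError("no sign change: (p_c - D) and (r - b^2 / 2) must share a sign")
    return math.log(ratio) / b

# Hypothetical parameters with r > b^2 / 2 and p_c > D.
r, b, d, D, p_c = 0.03, 0.1, 5.0, 20.0, 30.0
B = inflection_point(r, b, d, D, p_c)
print("B =", B)
print("indicator just below B:", curvature_indicator(B - 1.0, r, b, d, D, p_c))
print("indicator just above B:", curvature_indicator(B + 1.0, r, b, d, D, p_c))
```

A negative indicator corresponds to a concave region of $H$ and a positive one to a convex region, following the sign relation used throughout this section.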

#### 3.3.2. OU Imbalance Process

When *Z* is the Ornstein-Uhlenbeck process, by adjusting *d* and *b* in the price stack function *f* (see (30)), we can restrict our analysis to the OU process with zero mean and unit volatility, that is:

$$dZ_t = -\theta Z_t\,dt + dW_t.$$

Then, for *z* < *z*∗,

$$(\mathcal{L} - r)\hat{h}_f(z, \hat{\zeta}) = (\mathcal{L} - r)(-f(z) + p_c) \tag{35}$$

$$= d e^{bz} \left\{ b \left( \theta z - \frac{1}{2} b \right) + r \right\} + r(D - p_c) =: \eta(z). \tag{36}$$

Differentiating *η*, we obtain

$$
\eta'(z) = db\theta e^{bz} \left( bz + 1 + \frac{r - \frac{1}{2}b^2}{\theta} \right),
$$

which has a unique root at $\bar{z} = \frac{1}{b}\left(\frac{\frac{1}{2}b^2 - r}{\theta} - 1\right)$. The function $\eta$ decreases from $r(D - p_c)$ at $-\infty$ until it reaches $\eta(\bar{z}) = -d e^{b\bar{z}}\theta + r(D - p_c)$ at $\bar{z}$, and then increases to positive infinity.

	- (i) Let $\bar{z} \ge z^*$. We exclude the possibility $\eta(z^*) \ge 0$, since then the function $H$ is convex on $(0, F^Z(z^*))$ and the set $\Gamma$ has empty intersection with this interval, contradicting Theorem 4 and, consequently, violating **S1\*** or **S2\***. When $\eta(z^*) < 0$, $H$ is convex on $(0, F^Z(u))$ and concave on $(F^Z(u), F^Z(z^*))$, where $u$ is the unique root of $\eta$ on $(0, z^*)$. Therefore, the stopping sets $\Gamma$ for the single and lifetime problems are of the form $[\hat{z}_0, \hat{z}]$ and $[\tilde{z}_0, \tilde{z}]$, respectively, with $\min(\hat{z}_0, \tilde{z}_0) > u$, cf. case 2(ii) in Section 3.3.1.
	- (ii) Consider now $\bar{z} < z^*$. As above, we exclude the case $\eta(\bar{z}) \ge 0$, since then $H$ is convex on $(0, F^Z(z^*))$. The remaining case $\eta(\bar{z}) < 0$ implies that the stopping sets $\Gamma$ have the same form as in case 2(i) above, as $H$ is convex and then concave if $\eta(z^*) \le 0$, and convex–concave–convex if $\eta(z^*) > 0$.
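The shape of $\eta$ in (36) can be explored numerically. The sketch below evaluates $\eta$, the root of $\eta'$ (the turning point), and locates by bisection a sign change of $\eta$ to the left of the turning point when $\eta(-\infty) = r(D - p_c) > 0 > \eta$ at the turning point. The function names, the bracketing interval and the parameter values are hypothetical.

```python
import math

def eta(z, r, theta, b, d, D, p_c):
    """Expression (36) for the unit-volatility, zero-mean OU imbalance process."""
    return d * math.exp(b * z) * (b * (theta * z - 0.5 * b) + r) + r * (D - p_c)

def turning_point(r, theta, b):
    """Root of eta', i.e. (1/b) * ((b^2/2 - r) / theta - 1)."""
    return ((0.5 * b * b - r) / theta - 1.0) / b

def bisect_root(func, lo, hi, tol=1e-12):
    """Plain bisection; requires a sign change of func between lo and hi."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Hypothetical parameters with p_c < D, so that eta is positive far to the left.
r, theta, b, d, D, p_c = 0.03, 1.0, 0.5, 1.0, 30.0, 20.0
z_bar = turning_point(r, theta, b)
eta_min = eta(z_bar, r, theta, b, d, D, p_c)
print("turning point:", z_bar, "minimum of eta:", eta_min)

if eta_min < 0.0 < r * (D - p_c):
    # eta is decreasing on (-inf, z_bar), so it crosses zero exactly once there.
    u = bisect_root(lambda z: eta(z, r, theta, b, d, D, p_c), -50.0, z_bar)
    print("sign change of eta at:", u)
```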

#### **4. Benchmark Case Study and Economic Implications**

In this section, we use a case study to draw qualitative implications from the above results. An OU model is assumed, which captures both the mean reversion and random variability present in EIM prices, and is fitted to relevant data. The interest rate is taken to be 3% per annum, and the degradation factor for the store to be *A* = 0.9999.

Our data is the 'balancing group price' from the German Amprion system operator, which is available for every 15 min period (AMPRION 2016). Summary statistics for the period from 1 June 2012 to 31 May 2016 are presented in Table 1. To address the issue of its extreme range, which impacts the fitting of both volatility and mean reversion in the OU model, the data was truncated at the values −150 and 150. The parameters obtained by maximum likelihood fitting were then *θ* = 68.69 (the rate of mean reversion), *σ* = 483.33 (the volatility), *μ* = 30.99 (the mean-reversion level). The effect of the truncation step was to approximately halve the fitted volatility.
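The fitting step can be sketched as follows. The source does not describe the estimator in detail, so the sketch assumes the standard conditional maximum likelihood approach based on the exact discretisation of the OU process, which reduces to a least-squares regression of each observation on the previous one. It is applied to simulated data rather than to the Amprion series; the time step of 1/96 (15 min measured in days), the seed and all function names are assumptions.

```python
import math
import random

def simulate_ou(theta, mu, sigma, dt, n, x0, rng):
    """Exact discretisation: X_{k+1} = mu + (X_k - mu) e^{-theta dt} + Gaussian noise."""
    a = math.exp(-theta * dt)
    sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * theta))
    xs = [x0]
    for _ in range(n):
        xs.append(mu + (xs[-1] - mu) * a + sd * rng.gauss(0.0, 1.0))
    return xs

def truncate(xs, lo, hi):
    """Clip observations to [lo, hi], mirroring the truncation at -150 and 150."""
    return [min(hi, max(lo, v)) for v in xs]

def fit_ou(xs, dt):
    """Conditional MLE of (theta, mu, sigma) via the AR(1) regression of X_{k+1} on X_k."""
    x, y = xs[:-1], xs[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    a = sxy / sxx                      # estimate of e^{-theta dt}
    if not 0.0 < a < 1.0:
        raise ValueError("AR(1) coefficient outside (0, 1): no mean-reverting OU fit")
    c = my - a * mx
    resid_var = sum((v - c - a * u) ** 2 for u, v in zip(x, y)) / n
    theta = -math.log(a) / dt
    mu = c / (1.0 - a)
    sigma = math.sqrt(resid_var * 2.0 * theta / (1.0 - a * a))
    return theta, mu, sigma

rng = random.Random(0)
dt = 1.0 / 96.0                        # ASSUMED: 15 min expressed in days
path = simulate_ou(theta=68.69, mu=30.99, sigma=483.33, dt=dt, n=20000, x0=30.99, rng=rng)
estimates = fit_ou(path, dt)
estimates_truncated = fit_ou(truncate(path, -150.0, 150.0), dt)
print("fitted (theta, mu, sigma):", estimates)
print("after truncation:        ", estimates_truncated)
```

On simulated OU data the truncation rarely binds, so the two sets of estimates are expected to be close; on data with an extreme range the effect can be much larger, as reported above for the fitted volatility.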

**Table 1.** Summary statistics for the 15 min balancing group price per MWh in the German Amprion area, 1 June 2012 to 31 May 2016.


The left panel of Figure 3 and Figure 4 show the lifetime value $\hat{V}(x^*)$, while the right panel of Figure 3 plots the stopping boundary $\check{x}$, which is the maximum price at which the battery operator can buy energy optimally. These values of $\check{x}$ are significantly below the long-term mean price $D$: indeed, the former value is negative while the latter is positive. Thus, in this example, the battery operator purchases energy when it is in excess supply, further contributing to balancing. To place the negative values of the stopping boundary in Figure 3 in their statistical context, recall from Table 1 that the first quartile of the price distribution is approximately zero. Indeed, negative energy prices usually occur several times per day in the German EIM. In the present dataset of 1461 days, there are only 11 days without negative prices and the longest observed time between negative prices is 41.5 h.

**Figure 3.** Results obtained with the Ornstein-Uhlenbeck model fitted in Section 4, as functions of the total premium, with interest rate 3% per annum. Solid lines: $x^* = 100$, dotted: $x^* = 75$, dashed: $x^* = 50$. Left: lifetime value $\hat{V}(x^*)$. Right: the stopping boundary $\check{x}$, the maximum price for which the battery operator can buy energy optimally.

**Figure 4.** Lifetime value $\hat{V}(x^*)$ as a function of $x^*$ with the Ornstein-Uhlenbeck model fitted in Section 4, with interest rate 3% per annum. Dashed line: $p_c + K_c = 20$, solid: $p_c + K_c = 30$, dotted: $p_c + K_c = 40$, mixed: $p_c + K_c = 50$. The horizontal grey line indicates the current price of lithium-ion battery storage per MWh (IRENA 2017, Figure 33).

We make the following empirical observations. Firstly, defining the total premium as the sum $p_c + K_c$, altering its distribution between the initial premium $p_c$ (which is received at $x = \check{x}$) and the utilisation payment $K_c$ (which is received at $x = x^*$) results in insignificant changes to the graphs, with relative differences on the vertical axes of the order $10^{-3}$ (data not shown). It is for this reason that the figures are indexed by the total premium $p_c + K_c$ rather than by individual premia. Secondly, it is seen from the right-hand panel of Figure 3 that the (negative valued) stopping boundary increases with the total premium, making exercise more frequent. Thus, as the total premium increases, both the frequency and size of the cashflows increase, yielding a superlinear relationship in the left-hand panel of Figure 3. This superlinearity is not very pronounced since the stopping boundary is relatively insensitive to the total premium in the range presented in the graphs (see the right-hand panel), so that the lifetime value is driven principally by the size of the cashflows. Thirdly, the grey horizontal line of Figure 4 is placed at a level indicative of recent costs for lithium-ion batteries per MWh (megawatt hour). Thus, the investment case for battery storage providing reserve is significantly positive for a wide range of the contract parameters. Finally, the contours in Figure 4 have an S-shape, the marginal influence of $x^*$ being smaller in the range $x^* < 110$ and larger for greater values of $x^*$ (with the marginal influence eventually decreasing again in the limit of large $x^*$).

These phenomena are explained by the presence of mean reversion in the OU price model. The timings of the cashflows to the battery operator are entirely determined by the successive *passage times* of the price process between the levels *x*<sup>∗</sup> and *x̌*. These passage times are relatively short on average for the fitted OU model. This means that the premia are received at almost the same time under each reserve contract, and it is the total premium which drives the real option value. Furthermore, the passage times between *x*<sup>∗</sup> and *x̌* may be decomposed into passage times between *x*<sup>∗</sup> and *D*, and between *D* and *x̌*. Since the OU process is statistically symmetric about *D*, let us compare the distances |*x̌* − *D*| and |*x*<sup>∗</sup> − *D*|. From Figure 3, we have *x̌* ≈ −70 so that |*x̌* − *D*| ≈ 100. Therefore, for *x*<sup>∗</sup> < 110, we have |*x*<sup>∗</sup> − *D*| < |*x̌* − *D*| and the passage time between *D* and *x̌*, which varies little, dominates that between *x*<sup>∗</sup> and *D*. Correspondingly, we observe in Figure 4 that the value function changes relatively little as *x*<sup>∗</sup> varies below 110. Conversely, as *x*<sup>∗</sup> increases beyond 110, it is the distance between *x*<sup>∗</sup> and *D* which dominates, and the value function begins to decrease relatively rapidly.
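The way passage times grow with the distance from the mean level can be illustrated by simulation. The following is a minimal Monte Carlo sketch, assuming an OU process d*X* = *θ*(*D* − *X*)d*t* + *σ* d*W* with hypothetical parameters: the values of `theta`, `D` and `sigma` below are illustrative only and are not the values fitted in Section 4, and the function name `mean_passage_time` is introduced here purely for illustration. It estimates the mean time taken by the price, started at *D*, to first reach a given level above *D*.

```python
import numpy as np

def mean_passage_time(level, D=30.0, theta=1.0, sigma=60.0,
                      n_paths=2000, dt=0.02, max_steps=50_000, seed=0):
    """Estimate the mean first time an OU path started at D reaches `level` >= D.

    All parameter defaults are hypothetical. The path is monitored only on a
    time grid of width dt, so passage times are slightly overestimated.
    Returns (estimated mean passage time, number of paths that never hit).
    """
    rng = np.random.default_rng(seed)
    decay = np.exp(-theta * dt)
    # Standard deviation of the exact OU transition over one time step.
    step_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    x = np.full(n_paths, D)
    hit_time = np.full(n_paths, np.nan)
    alive = np.ones(n_paths, dtype=bool)   # paths that have not yet hit the level
    for k in range(1, max_steps + 1):
        x = D + (x - D) * decay + step_sd * rng.standard_normal(n_paths)
        newly_hit = alive & (x >= level)
        hit_time[newly_hit] = k * dt
        alive &= ~newly_hit
        if not alive.any():
            break
    return np.nanmean(hit_time), int(alive.sum())

levels = (60.0, 90.0, 120.0)               # increasing distances from D = 30
estimates = [mean_passage_time(level) for level in levels]
for level, (mean_time, unfinished) in zip(levels, estimates):
    print(f"level {level:6.1f}: mean passage time ~ {mean_time:.2f} (unfinished paths: {unfinished})")
```

Because the same seed is reused for each level, every simulated path reaches a higher level no earlier than a lower one, so the comparison isolates how quickly the mean passage time grows with the distance from *D*.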

These results provide insights into the suitability of the considered arrangement for correcting differing levels of imbalance. As the distance between *x*∗ and the mean level *D* grows, the energy price reaches *x*∗ significantly less frequently and the reserve contract starts to provide insurance against rare events, resulting in infrequent power delivery and low utilisation of the battery. These observations suggest that the contractual arrangement studied in this paper is more suitable for the frequent balancing of less severe imbalance. In contrast, the more rapid reduction in the lifetime value for large values of *x*∗ suggests that such arrangements based on real-time markets are not suitable for balancing relatively rare events such as large system disturbances due to unplanned outages of large generators. The system operator may prefer to use alternative arrangements, based, for example, on fixed availability payments, to provide security against such events.

#### **5. Conclusions**

In this paper, we investigate the procurement of operating reserve from energy-limited storage using a sequence of physically covered incremental reserve contracts. This leads to the pricing of a real perpetual American swing put option with a random refraction time. We model the underlying energy imbalance market price as a general linear regular diffusion, which, in particular, is capable of modelling the mean reversion present in imbalance prices. Both the optimal operational policy and the real option value of the store are characterised explicitly. Although the solutions are generally not available in an analytical form, we have provided a straightforward procedure for their numerical evaluation together with empirical examples from the German energy imbalance market.

The results of the lifetime analysis in particular have both managerial implications for the battery operator and policy implications for the system operator. From the operational viewpoint, under the setup described in Section 1.1, we have established that the battery operator should purchase energy as soon as the EIM price falls to the level *x*ˇ, which may be calculated as described in Section 2.4. Furthermore, the battery operator should then sell the reserve contract immediately. Our real options valuation may be taken into account when deciding whether to invest in an energy store, and whether to sell such reserve contracts in preference to trading in other markets (for example, performing price arbitrage in the spot energy market).

Turning to the perspective of the system operator, we have demonstrated that the proposed arrangement can be mutually beneficial to the system operator and battery operator. More precisely, the system operator can be protected against guaranteed financial losses from the incremental capacity contract purchase while the battery operator has a quantifiable profit. The analysis also provides information on feedback due to battery charging by determining the highest price *x*ˇ at which the battery operator buys energy, hence identifying conditions under which the battery operator's operational strategy is aligned with system stability.

We address incremental reserve contracts, which are particularly valuable to the system operator when the margin of electricity generation capacity over peak demand is low. Decremental reserve may also be studied in the above framework, although the second stopping time (action A2) is non-trivial, which leads to a nested stopping problem beyond the scope of the present paper. Furthermore, we assume that the energy storage unit is dedicated to providing incremental reserve contracts, so that the opportunity costs of not operating in other markets or providing other services are not modelled. The extension to a finite expiry time, the lifetime analysis with decremental reserve contracts, and also the opportunity cost of not operating in other markets would be interesting areas for further work.

The methodological advances of this paper reach beyond energy markets. In particular, they are relevant to real options' analyses of storable commodities where the timing problem over the lifetime of the store is of primary interest. The lifetime analysis via optimal stopping techniques, developed in Section 2.3, provides an example of how timing problems can be addressed for rather general dynamics of the underlying stochastic process. In this context, we provide an alternative method to quasi-variational inequalities, which are often dynamics-specific and technically more involved.

**Author Contributions:** J.M. and J.P. contributed equally to the research and writing of the paper.

**Funding:** This research was funded by the UK Engineering and Physical Sciences Research Council Grant No. EP/K00557X/2 and MNiSzW grant UMO-2012/07/B/ST1/03298.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Lemmas and Proofs from Section 2**

The following three lemmas classify solutions to the stopping problem (7). Note that, if sup<sub>*x*</sub> *ϑ*(*x*) ≤ 0, then no choice of the stopping time *τ* gives a value function greater than 0. The optimal stopping time in this case is given by *τ* = ∞. In what follows, we therefore assume

$$\sup_{x \in (a,b)} \vartheta(x) > 0. \tag{A1}$$

These results can be derived from Beibel and Lerche (2000); however, for the convenience of the reader, we provide simple proofs.

**Lemma A1.** *Assume that there exists x̂* ∈ *I which maximises ϑ*(*x*)/*φ<sub>r</sub>*(*x*) *over I. Then, the value function v*(*x*) *is finite for all x, and for x* ≥ *x̂:*

*1. the stopping time τ<sub>x̂</sub> is optimal,*

$$2. \qquad v(x) = \frac{\vartheta(\hat{x})}{\phi_r(\hat{x})}\, \phi_r(x),$$

*3. any stopping time τ with* P<sup>*x*</sup>{*ϑ*(*X<sub>τ</sub>*)/*φ<sub>r</sub>*(*X<sub>τ</sub>*) < *ϑ*(*x̂*)/*φ<sub>r</sub>*(*x̂*)} > 0 *is strictly suboptimal for the problem v*(*x*)*.*

**Proof.** Since *φ<sub>r</sub>* is *r*-excessive (Borodin and Salminen 2012, Sec. II.5), for any finite stopping time *τ*

$$\mathbb{E}^{x}\{e^{-r\tau}\phi_r(X_{\tau})\} \le \phi_r(x).$$

Let now *τ* be a stopping time taking possibly infinite values. Let *b<sub>n</sub>* be an increasing sequence converging to *b* with *b*<sub>1</sub> > *x*, the initial point of the process *X*. Then, *τ*<sub>*b<sub>n</sub>*</sub> is an increasing sequence of stopping times converging to infinity and

$$\begin{split} \phi_r(x) &\geq \liminf_{n\to\infty} \mathbb{E}^{x} \{ e^{-r(\tau\wedge\tau_{b_n})} \phi_r(X_{\tau\wedge\tau_{b_n}}) \} \\ &\geq \mathbb{E}^{x} \{ \liminf_{n\to\infty} e^{-r(\tau\wedge\tau_{b_n})} \phi_r(X_{\tau\wedge\tau_{b_n}}) \} = \mathbb{E}^{x} \{ e^{-r\tau} \phi_r(X_{\tau}) \mathbf{1}_{\tau<\infty} \}, \end{split}$$

where *φr*(*b*−) = 0 was used in the last equality.

For any stopping time *τ*,

$$\begin{split} \mathbb{E}^{x} \left\{ e^{-r\tau} \vartheta(X_{\tau}) \mathbf{1}_{\tau < \infty} \right\} &= \mathbb{E}^{x} \left\{ e^{-r\tau} \phi_r(X_{\tau}) \frac{\vartheta(X_{\tau})}{\phi_r(X_{\tau})} \mathbf{1}_{\tau < \infty} \right\} \\ &\leq \frac{\vartheta(\hat{x})}{\phi_r(\hat{x})} \mathbb{E}^{x} \left\{ e^{-r\tau} \phi_r(X_{\tau}) \mathbf{1}_{\tau < \infty} \right\} \leq \frac{\vartheta(\hat{x})}{\phi_r(\hat{x})} \phi_r(x), \end{split} \tag{A2}$$

where the final inequality follows from the first part of the proof and (A1) (so *ϑ*(*x̂*)/*φ<sub>r</sub>*(*x̂*) > 0). Hence, *v*(*x*) is finite for all *x* ∈ *I*. To prove claim 1, note from (6) that for *x* ≥ *x̂* the upper bound is attained by *τ<sub>x̂</sub>*, which is therefore an optimal stopping time in the problem *v*(*x*). The assumption on *τ* in claim 3 leads to strict inequality in (A2), making *τ* strictly suboptimal in the problem *v*(*x*).
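Lemma A1 can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the paper: the price is a standard Brownian motion on the real line, for which *φ<sub>r</sub>*(*x*) = exp(−*x*√(2*r*)), and the payoff is the hypothetical function *ϑ*(*x*) = *p* − *x*. In this case, the maximiser of *ϑ*/*φ<sub>r</sub>* is available in closed form, *x̂* = *p* − 1/√(2*r*), which gives a check on the grid search.

```python
import numpy as np

r = 0.05                      # discount rate (hypothetical value)
p = 1.0                       # constant in the illustrative payoff (hypothetical)
s = np.sqrt(2.0 * r)

def phi_r(x):
    """Decreasing solution of (1/2) u'' = r u for standard Brownian motion."""
    return np.exp(-s * x)

def payoff(x):
    """Illustrative payoff theta(x) = p - x (an assumption, not the paper's h)."""
    return p - x

# Grid search for the maximiser of theta / phi_r.
grid = np.linspace(-20.0, 5.0, 250_001)
ratio = payoff(grid) / phi_r(grid)
x_hat_grid = grid[np.argmax(ratio)]

# Closed-form maximiser from the first-order condition of (p - x) * exp(s * x).
x_hat_exact = p - 1.0 / s

def value(x):
    """Claim 2 of Lemma A1: v(x) = theta(x_hat) / phi_r(x_hat) * phi_r(x) for x >= x_hat."""
    return payoff(x_hat_exact) / phi_r(x_hat_exact) * phi_r(x)

print(f"grid maximiser: {x_hat_grid:.4f}, closed form: {x_hat_exact:.4f}")
```

In this sketch, the value function dominates the payoff on [*x̂*, ∞) and touches it at *x̂*, as expected when stopping at the first passage to *x̂* is optimal.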

It is convenient to introduce the notation

$$L := \limsup_{x \to a} \frac{\vartheta(x)^{+}}{\phi_r(x)}. \tag{A3}$$

Lemma A2 corresponds to cases when there is no optimal stopping time, but the optimal value can be reached in the limit by a sequence of stopping times.

#### **Lemma A2.**


**Proof. Assertion 1.** Fix any *x* ∈ *I*. Then, for any *x̂* < *x*, we have

$$\mathbb{E}^{x}\{e^{-r\tau_{\hat{x}}}\vartheta(X_{\tau_{\hat{x}}})\} = \vartheta(\hat{x})\frac{\phi_r(x)}{\phi_r(\hat{x})},$$

which converges to infinity for *x*ˆ tending to *a* over an appropriate subsequence. Since the process is recurrent, the point *x* can be reached from any other point in the state space with positive probability in a finite time. This proves that the value function is infinite for all *x* ∈ *I*.

**Assertion 2.** Recall that, due to the supremum of *ϑ*/*φ<sub>r</sub>* being strictly positive, we have *L* > 0. From the proof of Lemma A1, for an arbitrary stopping time *τ*, we have

$$\begin{split} \mathbb{E}^{x}\{e^{-r\tau}\vartheta(X_{\tau})\mathbf{1}_{\tau<\infty}\} &= \mathbb{E}^{x}\left\{e^{-r\tau}\phi_r(X_{\tau})\frac{\vartheta(X_{\tau})}{\phi_r(X_{\tau})}\mathbf{1}_{\tau<\infty}\right\} \\ &< L\,\mathbb{E}^{x}\{e^{-r\tau}\phi_r(X_{\tau})\mathbf{1}_{\tau<\infty}\} \leq L\,\phi_r(x). \end{split}$$

However, one can construct a sequence of stopping times that achieves this value in the limit. Take *x<sub>n</sub>* such that lim<sub>*n*→∞</sub> *ϑ*(*x<sub>n</sub>*)/*φ<sub>r</sub>*(*x<sub>n</sub>*) = *L* and define *τ<sub>n</sub>* = *τ*<sub>*x<sub>n</sub>*</sub>. Then,

$$\lim_{n \to \infty} \mathbb{E}^{x} \{ e^{-r\tau_n} \vartheta(X_{\tau_n}) \} = \lim_{n \to \infty} \vartheta(x_n) \frac{\phi_r(x)}{\phi_r(x_n)} = \phi_r(x) L,$$

so *v*(*x*) = *φr*(*x*)*L*. This together with the strict inequality above proves that an optimal stopping time does not exist.

The results developed in this section also have a 'mirror' counterpart involving

$$R := \limsup_{x \to b} \frac{\vartheta(x)^{+}}{\psi_r(x)} \tag{A4}$$

rather than *L*. In particular, the value function is infinite if *R* = ∞, and

**Corollary A1.** *If x*ˆ ∈ *I maximises ϑ*(*x*)/*ψr*(*x*)*, then, for any x* ≤ *x*ˆ*, an optimal stopping time in the problem v*(*x*) *is given by τx*ˆ*.*

This also motivates the assumptions of the following lemma which collects results from Dayanik and Karatzas (2003, Sec. 5.2). Again, although those results are obtained under the assumption that both boundaries are natural, their proofs require only that they are non-exit.

**Lemma A3.** *Assume that L*, *R* < ∞ *and ϑ is locally bounded. Then, the value function v is finite and continuous on* (*a*, *b*)*.*

All the stopping problems considered in this paper have a finite right-hand limit *R* < ∞. Therefore, whenever *L* < ∞, their value functions will be continuous.

**Proof of Lemma 2.** If **S1\*** does not hold, then the payoff from cycle A1–A3 is not profitable (on average) for any value of the EIM price *x*, so **S1** does not hold. Conversely, if **S1\*** holds, then there exists *x* such that $\hat{\mathcal{T}}\mathbf{0}(x) \ge h(x) > 0$. For any other *x*′, consider the following strategy: wait until the process *X* hits *x* and proceed optimally thereafter. This results in a strictly positive expected value: $\hat{\mathcal{T}}\mathbf{0}(x') > 0$ and, by the arbitrariness of *x*′, we have $\hat{\mathcal{T}}\mathbf{0} > 0$.

Suppose that **S2\*** holds. Then, the system operator makes a profit on the reserve contract (relative to simply purchasing a unit of energy at the power delivery time *τ*ˆ*e*, at the price *X*(*τ*ˆ*e*) ≥ *x*∗) in undiscounted cash terms. Considering discounting, the system operator similarly makes a profit provided the EIM price reaches the level *x*∗ (or above) sufficiently quickly. Since this happens with positive probability for a regular diffusion, a certain financial loss for the system operator is excluded. When **S2\*** does not hold, suppose first that *pc* + *Kc* > *x*∗: then, the system operator makes a loss in undiscounted cash terms, and if the reserve contract is sold when *x* ≥ *x*∗, then this loss is certain. In the boundary case *pc* + *Kc* = *x*∗, the battery operator can only make a profit by purchasing energy and selling the reserve contract when *Xt* < *x*∗, in which case the system operator makes a certain loss. This follows since instead of buying the reserve contract, the system operator could invest *pc* > 0 temporarily in a riskless bond, withdrawing it with interest when the EIM price rises to *x*<sup>∗</sup> = *pc* + *Kc*. The loss in this case is equal in value to the interest payment.

#### **Appendix B. Lemmas for the Lifetime Problem**

It follows from the optimal stopping theory reviewed in Section 2.1.1 and Appendix A that the following definition of an *admissible* continuation function is natural in our setup. In particular, the final condition corresponds to the assumption that the energy purchase occurs at a price below *x*∗.

**Definition A1.** *(Admissible continuation value) A continuation value function* $\hat{\zeta}$ *is* admissible *if it is continuous on* (*a*, *x*<sup>∗</sup>] *and non-negative on I, with* $\hat{\zeta}(x)/\phi_r(x)$ *non-increasing on* [*x*<sup>∗</sup>, *b*)*.*

The following result now characterises the possible solution types in the lifetime problem.

**Lemma A4.** *Assume that conditions* **S1\*** *and* **S2\*** *hold. If* $\hat{\zeta}$ *is an admissible continuation value function, then*

$$\limsup_{x\to a} \frac{\hat{h}(x, \hat{\zeta})}{\phi_r(x)} = \limsup_{x\to a} \frac{-x}{\phi_r(x)} = L_c, \tag{A5}$$

*and with cases A, B, C defined just as in Theorem 1:*

*1. In case A, there exists* $\hat{x} \le x^{\ast}$ *which maximises* $\hat{h}(x, \hat{\zeta})/\phi_r(x)$ *and* $\tau_{\hat{x}}$ *is an optimal stopping time for* $x \ge \hat{x}$ *with value function*

$$v(x) = \hat{\mathcal{T}}\hat{\zeta}(x) = \phi_r(x) \frac{\hat{h}(\hat{x}, \hat{\zeta})}{\phi_r(\hat{x})}, \qquad x \ge \hat{x}.$$

*Denoting by* $\hat{x}_0$ *the corresponding* $\hat{x}$ *in case A of Theorem 1, we have* $\hat{x}_0 \le \hat{x}$*.*

*2. In case B, either*

	- *(a) there exists* $x_L \in (a, b)$ *with* $\hat{h}(x_L, \hat{\zeta})/\phi_r(x_L) \ge L_c$*: then, there exists* $\hat{x} \in (a, x^{\ast}]$ *which maximises* $\hat{h}(x, \hat{\zeta})/\phi_r(x)$*, and* $\tau_{\hat{x}}$ *is an optimal stopping time for* $x \ge \hat{x}$ *with value function* $v(x) = \phi_r(x)\,\hat{h}(\hat{x}, \hat{\zeta})/\phi_r(\hat{x})$ *for* $x \ge \hat{x}$*; or*
	- *(b) there does not exist* $x_L \in (a, b)$ *with* $\hat{h}(x_L, \hat{\zeta})/\phi_r(x_L) \ge L_c$*: then, the value function is* $v(x) = L_c\,\phi_r(x)$ *and there is no optimal stopping time.*

*Moreover, the value function v is continuous in cases A and B.*

**Proof.** Note that

$$h(x) = \hat{h}(x, \mathbf{0}) \le \hat{h}(x, \hat{\zeta}) = \begin{cases} h(x) + \dfrac{\psi_r(x)}{\psi_r(x^{*})} A\hat{\zeta}(x^{*}), & x < x^{*},\\ h(x) + A\hat{\zeta}(x), & x \ge x^{*}. \end{cases} \tag{A6}$$

This proves (A5), since $\lim_{x\to a} \psi_r(x)/\phi_r(x) = 0$. We verify from (A6) and the assumptions of the lemma that *R* < ∞ in (A4). Hence, whenever $L_c < \infty$, the value function *v* is finite and continuous by Lemma A3. As noted previously (in the proof of Theorem 1), *h* is negative and decreasing on [*x*<sup>∗</sup>, *b*), hence the ratio $h(x)/\phi_r(x)$ is strictly decreasing on that interval. It then follows from (A6) and the admissibility of $\hat{\zeta}$ that the function $x \mapsto \hat{h}(x, \hat{\zeta})/\phi_r(x)$ is strictly decreasing on [*x*<sup>∗</sup>, *b*). Therefore the supremum of $x \mapsto \hat{h}(x, \hat{\zeta})/\phi_r(x)$, which is positive by (A6) and **S1\***, is attained on (*a*, *x*<sup>∗</sup>] or asymptotically when *x* → *a*. In cases 1 and 2a, the optimality of $\tau_{\hat{x}}$ for $x \ge \hat{x}$ then follows from Lemma A1. To see that $\hat{x}_0 \le \hat{x}$ in case 1, take $x < \hat{x}_0$. Then, from (A6), we have

$$\frac{\hat{h}(x,\hat{\zeta})}{\phi_r(x)} = \frac{h(x)}{\phi_r(x)} + \frac{\psi_r(x)}{\phi_r(x)} \frac{A\hat{\zeta}(x^{*})}{\psi_r(x^{*})} < \frac{h(\hat{x}_0)}{\phi_r(\hat{x}_0)} + \frac{\psi_r(\hat{x}_0)}{\phi_r(\hat{x}_0)} \frac{A\hat{\zeta}(x^{*})}{\psi_r(x^{*})} = \frac{\hat{h}(\hat{x}_0,\hat{\zeta})}{\phi_r(\hat{x}_0)},$$

since $x \mapsto \psi_r(x)/\phi_r(x)$ is strictly increasing. Case 2b follows from Lemma A2 and the fact that $L_c > 0$, while Lemma A2 proves case 3.

Before proceeding, we note the following technicalities.

**Remark A1.** *The value function v in cases 1 and 2a of Lemma A4 satisfies the condition that v*(*x*)/*φr*(*x*) *is non-increasing on* [*x*∗, *b*)*. Indeed,*

$$\frac{v(x)}{\phi_r(x)} = \frac{\hat{h}(\hat{x}, \hat{\zeta})}{\phi_r(\hat{x})} = \text{const.}$$

*for x* ≥ *x̂.*

**Remark A2.** *For case 3 of Lemma A4, the assumption that* $\hat{\zeta}(x)/\phi_r(x)$ *is non-increasing on* [*x*<sup>∗</sup>, *b*) *can be dropped.*

**Lemma A5.** *The timing of action A2 remains trivial when the cycle A1–A3 is iterated a finite number of times.*

**Proof.** Let us suppose that action A1 has just been carried out in preparation for selling the first in a chain of *n* reserve contracts, and that the EIM price currently has the value *x*. Define *τ*<sub>A2</sub> to be the time at which the battery operator carries out action A2. The remaining cashflows are (i) the first contract premium *pc* (from action A2), (ii) the first utilisation payment *Kc* (from A3), and (iii) all cashflows arising from the remaining cycles A1–A3 (there are *n* − 1 cycles which remain available to the battery operator). The cashflows (i) and (ii) are both positive and fixed, making it best to obtain them as soon as possible. The cashflows (iii) include positive and negative amounts, so their timing is not as simple. However, it is sufficient to notice that

• their expected net present value is given by an optimal stopping problem, namely, the timing of the *next* action A1:

$$\sup_{\tau \ge \sigma^{*}} \mathbb{E}^{x} \{ e^{-r\tau} h_{(iii)}(X_{\tau}) \mathbf{1}_{\tau < \infty} \},\tag{A7}$$

where *σ*<sup>∗</sup> := inf{*t* ≥ *τ*<sub>A2</sub> : *X<sub>t</sub>* ≥ *x*<sup>∗</sup>}, for some suitable payoff function *h*<sub>(*iii*)</sub>,

• the choice *τ*<sub>A2</sub> = 0 minimises the exercise time *σ*<sup>∗</sup> and thus maximises the value of component (iii), since the supremum in (A7) is then taken over the largest possible set of stopping times.

It is therefore best to set *τ*<sub>A2</sub> = 0, since this choice maximises the value of components (i), (ii) and (iii).

The next result establishes the existence of, and characterises, the lifetime value function *V*'.

**Lemma A6.** *In cases A and B of Theorem 1,*


**Proof.** Part 1 is proved by induction. The claim is clearly true for *n* = 1. Assume it holds for *n*. Then, Lemma A4 applies and $\hat{\zeta}_{n+1}(x)/\phi_r(x) = \hat{h}(\hat{x}, \hat{\zeta}_n)/\phi_r(\hat{x})$ for $x \ge \hat{x}$ when the optimal stopping time exists and $\hat{\zeta}_{n+1}(x)/\phi_r(x) = L_c$ otherwise. Therefore, $\hat{\zeta}_{n+1}(x) = c\,\phi_r(x)$ for $x \ge x^{\ast}$ and some constant $c \ge 0$. Since $\phi_r$ is decreasing, we conclude that $\hat{\zeta}_{n+1}$ decreases on $[x^{\ast}, b)$.

The monotonicity of $\hat{\mathcal{T}}$ guarantees that, if $\hat{\mathcal{T}}\mathbf{0} > 0$, then $\hat{\mathcal{T}}^n\mathbf{0} > 0$ for every $n$. For the upper bound, notice that

$$\begin{split} \hat{\mathcal{T}}\hat{\zeta}_n(x) &= \sup_{\tau} \mathbb{E}^{x} \left\{ e^{-r\tau} \left( p_c - X_{\tau} + \mathbb{E}^{X_{\tau}} \left\{ e^{-r\hat{\tau}_e} \left( K_c + A\hat{\zeta}_n(X_{\hat{\tau}_e}) \right) \right\} \right) \mathbf{1}_{\tau < \infty} \right\} \\ &\leq \sup_{\tau} \mathbb{E}^{x} \left\{ e^{-r\tau} \left( p_c - X_{\tau} + K_c\, \mathbb{E}^{X_{\tau}} \left\{ e^{-r\hat{\tau}_e} \right\} \right) \mathbf{1}_{\tau < \infty} \right\} + A\hat{\zeta}_n(x^{*}) = V_c(x) + A\hat{\zeta}_n(x^{*}), \end{split}$$

where $V_c = \hat{\mathcal{T}}\mathbf{0}$ is the value function for the single problem and the inequality follows from the fact that $\hat{\zeta}_n$ is decreasing on $[x^{\ast}, b)$. From the above, we have $\hat{\zeta}_n(x) = \hat{\mathcal{T}}^n\mathbf{0}(x) \le V_c(x) + \frac{1-A^n}{1-A} V_c(x^{\ast})$. Recalling that $A \in (0, 1)$ yields that the $\hat{\zeta}_n(x)$ are bounded by $V_c(x) + \frac{1}{1-A} V_c(x^{\ast})$, so there exists a finite monotone limit $\hat{\zeta} := \lim_{n\to\infty} \hat{\zeta}_n$, and

$$\begin{split} \hat{\zeta}(x) &= \lim_{n \to \infty} \hat{\mathcal{T}} \hat{\zeta}_{n}(x) = \sup_{n} \sup_{\tau} \mathbb{E}^{x} \Big\{ e^{-r\tau} \Big( p_c - X_{\tau} + \mathbb{E}^{X_{\tau}} \Big\{ e^{-r\hat{\tau}_e} \Big( K_c + A\hat{\zeta}_n(X_{\hat{\tau}_e}) \Big) \Big\} \Big) \mathbf{1}_{\tau < \infty} \Big\} \\ &= \sup_{\tau} \lim_{n \to \infty} \mathbb{E}^{x} \Big\{ e^{-r\tau} \Big( p_c - X_{\tau} + \mathbb{E}^{X_{\tau}} \Big\{ e^{-r\hat{\tau}_e} \Big( K_c + A\hat{\zeta}_n(X_{\hat{\tau}_e}) \Big) \Big\} \Big) \mathbf{1}_{\tau < \infty} \Big\} \\ &= \sup_{\tau} \mathbb{E}^{x} \Big\{ e^{-r\tau} \Big( p_c - X_{\tau} + \mathbb{E}^{X_{\tau}} \Big\{ e^{-r\hat{\tau}_e} \Big( K_c + A\hat{\zeta}(X_{\hat{\tau}_e}) \Big) \Big\} \Big) \mathbf{1}_{\tau < \infty} \Big\} \\ &= \hat{\mathcal{T}} \hat{\zeta}(x), \end{split}$$

by monotone convergence. The equality of *V*' and ˆ *ζ* is clear from (14).

#### **Appendix C. Uniqueness of Fixed Points**

Corollary A2 below establishes the uniqueness of the fixed point of $\hat{\mathcal{T}}$. Lemma A8 shows that $\hat{\mathcal{T}}^n\mathbf{0}$ converges exponentially fast to this unique fixed point as $n \to \infty$.

**Lemma A7.** *Let ξ*, *ξ*′ *be two continuous non-negative functions with ξ satisfying the assumptions of Lemma A4 together with the bound ξ* ≥ *ξ*′*. In the problem* $\hat{\mathcal{T}}\xi$*, assume the existence of an optimal stopping time τ*<sup>∗</sup> *under which stopping occurs only at values bounded above by x*′ < *x*<sup>∗</sup>*. Then,*

$$\|\hat{\mathcal{T}}\xi - \hat{\mathcal{T}}\xi'\|_{\#} \le \rho\, \|\xi - \xi'\|_{\#},$$

*where* $\rho = A\,\psi_r(x')/\psi_r(x^{\ast}) < 1$ *and* $\|f\|_{\#} = |f(x^{\ast})|$ *is a seminorm on the space of continuous functions. Moreover,*

$$0 \le \hat{\mathcal{T}}\xi(x) - \hat{\mathcal{T}}\xi'(x) < \|\xi - \xi'\|_{\#}.\tag{A8}$$

Note that, in general, an optimal stopping time for $\hat{\mathcal{T}}\xi(x)$ depends on the initial state *x*. However, under general conditions (cf. Section 2.1.1), *τ*<sup>∗</sup> = inf{*t* ≥ 0 : *X<sub>t</sub>* ∈ Γ}, where Γ is the stopping set. Then, the condition in the above lemma can be written as Γ ⊂ (*a*, *x*′] for some *x*′ < *x*<sup>∗</sup>.

**Proof of Lemma A7.** By the monotonicity of $\hat{\mathcal{T}}$, for any *x*, we have

$$\begin{split} 0 \le \hat{\mathcal{T}}\xi(x) - \hat{\mathcal{T}}\xi'(x) &\le \mathbb{E}^{x}\left\{ e^{-r\tau^{*}} \left( -X_{\tau^{*}} + p_c + \left( K_c + A\xi(x^{*}) \right) \frac{\psi_r(X_{\tau^{*}})}{\psi_r(x^{*})} \right) \right\} \\ &\qquad - \mathbb{E}^{x}\left\{ e^{-r\tau^{*}} \left( -X_{\tau^{*}} + p_c + \left( K_c + A\xi'(x^{*}) \right) \frac{\psi_r(X_{\tau^{*}})}{\psi_r(x^{*})} \right) \right\} \\ &= \mathbb{E}^{x}\left\{ e^{-r\tau^{*}} A \left( \xi(x^{*}) - \xi'(x^{*}) \right) \frac{\psi_r(X_{\tau^{*}})}{\psi_r(x^{*})} \right\} \\ &= \|\xi - \xi'\|_{\#}\, A\, \mathbb{E}^{x} \left\{ e^{-r\tau^{*}} \frac{\psi_r(X_{\tau^{*}})}{\psi_r(x^{*})} \right\}. \end{split}$$

This proves (A8). In addition, we have

$$A\, \mathbb{E}^{x^{*}} \left\{ e^{-r\tau^{*}} \frac{\psi_r(X_{\tau^{*}})}{\psi_r(x^{*})} \right\} \le A\, \frac{\phi_r(x^{*})}{\phi_r(x')} \frac{\psi_r(x')}{\psi_r(x^{*})} \le \rho.$$

**Lemma A8.** *Assume that there exists a fixed point* $\hat{\zeta}^{\ast}$ *of* $\hat{\mathcal{T}}$ *in the space of continuous non-negative functions. In the problem* $\hat{\mathcal{T}}\hat{\zeta}^{\ast}$*, assume the existence of an optimal stopping time under which stopping occurs only at values bounded above by* $x' < x^{\ast}$ *(cf. the comment after the previous lemma). Then, there is a constant* $\rho < 1$ *such that* $\|\hat{\zeta}^{\ast} - \hat{\mathcal{T}}^n\mathbf{0}\|_{\#} \le \rho^n \|\hat{\zeta}^{\ast}\|_{\#}$ *and* $\|\hat{\zeta}^{\ast} - \hat{\mathcal{T}}^n\mathbf{0}\|_{\infty} \le \rho^{n-1} \|\hat{\zeta}^{\ast}\|_{\#}$*, where* $\|\cdot\|_{\infty}$ *is the supremum norm.*

**Proof.** Clearly, $\|\hat{\zeta}^{\ast} - \mathbf{0}\|_{\#} < \infty$. By virtue of Lemma A7, we have $\|\hat{\mathcal{T}}^n\mathbf{0} - \hat{\zeta}^{\ast}\|_{\#} \le \rho^n \|\mathbf{0} - \hat{\zeta}^{\ast}\|_{\#}$ for $\rho = \psi_r(x')/\psi_r(x^{\ast}) < 1$. Hence, $\hat{\mathcal{T}}^n\mathbf{0}$ converges exponentially fast to $\hat{\zeta}^{\ast}$ in the seminorm $\|\cdot\|_{\#}$. Using (A8), we have

$$\|\hat{\zeta}^{*} - \hat{\mathcal{T}}^{n} \mathbf{0}\|_{\infty} = \|\hat{\mathcal{T}}\hat{\zeta}^{*} - \hat{\mathcal{T}} \circ \hat{\mathcal{T}}^{n-1} \mathbf{0}\|_{\infty} \le \rho^{n-1} \|\hat{\zeta}^{*}\|_{\#}.$$

**Corollary A2.** *Let* $\hat{\zeta}^{\ast}$ *be a fixed point of* $\hat{\mathcal{T}}$ *and suppose that the problem* $\hat{\mathcal{T}}\hat{\zeta}^{\ast}$ *admits an optimal stopping time* $\hat{\tau}^{\ast}$ *satisfying* $X_{\hat{\tau}^{\ast}} \le x' < x^{\ast}$*, for some constant* $x'$*. Such a fixed point* $\hat{\zeta}^{\ast}$ *is unique.*

**Proof.** By Lemma A8, if $\hat{\zeta}^{\ast}$ is a fixed point satisfying the assumptions of the corollary, it is approximated by $\hat{\mathcal{T}}^n\mathbf{0}$ in the supremum norm; hence, it must be unique.
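The contraction in the seminorm at *x*<sup>∗</sup> can be seen in a small numerical experiment. The sketch below makes several assumptions that are not from the paper: the price is a standard Brownian motion (so that *φ<sub>r</sub>*(*x*) = exp(−*sx*) and *ψ<sub>r</sub>*(*x*) = exp(*sx*) with *s* = √(2*r*)), all parameter values are hypothetical, the energy purchase is assumed to occur strictly below *x*<sup>∗</sup>, and the maximisation is approximated on a grid. Since the payoff in (A6) depends on the continuation value only through its value at *x*<sup>∗</sup>, the iteration reduces to a scalar recursion for that value. The name `T_at_x_star` is introduced here for illustration only.

```python
import numpy as np

# Hypothetical inputs (not the paper's fitted values).
r, A = 0.05, 0.9                      # discount rate and factor A in (0, 1)
p_c, K_c, x_star = 1.0, 2.0, 10.0     # premia and trigger level, p_c + K_c < x_star
s = np.sqrt(2.0 * r)                  # phi_r(x) = exp(-s x), psi_r(x) = exp(s x)

# Candidate buy prices strictly below x_star (grid approximation).
y = np.linspace(-60.0, x_star - 1e-3, 600_001)

def T_at_x_star(c):
    """Apply the operator once and evaluate at x_star, given c = zeta(x_star).

    Uses v(x_star) = phi_r(x_star) * max_y h_hat(y, zeta) / phi_r(y), with
    h_hat(y, zeta) = p_c - y + (K_c + A c) * psi_r(y) / psi_r(x_star) for y < x_star.
    Returns the new value at x_star and the maximising buy price on the grid.
    """
    h_hat = p_c - y + (K_c + A * c) * np.exp(s * (y - x_star))
    ratio = h_hat * np.exp(s * (y - x_star))   # h_hat(y) * phi_r(x_star) / phi_r(y)
    k = np.argmax(ratio)
    return ratio[k], y[k]

iterates = [0.0]                      # start from the zero function
for _ in range(30):
    c_next, buy_price = T_at_x_star(iterates[-1])
    iterates.append(c_next)

c_star = iterates[-1]
print(f"approximate fixed-point value at x_star: {c_star:.6f}, buy price: {buy_price:.3f}")
```

In this sketch, successive differences of the iterates shrink by a factor of at most *A* per step, which mirrors the exponential convergence of Lemma A8; the actual rate is typically much faster because the purchase price lies well below *x*<sup>∗</sup>.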

#### **Appendix D. Note on Lemma 3**

The inequality $\lim_{x\to-\infty} \frac{-x}{\phi_r(x)} > 0$ when $a = -\infty$ asserts that the process $X$ escapes to $-\infty$ quickly. Indeed, choosing $z \in I$, we have $\mathbb{E}^z\{e^{-r\tau_x}\} = \frac{\phi_r(z)}{\phi_r(x)}$ for $x \le z$, hence $\mathbb{E}^z\{e^{-r\tau_x}\} \ge \frac{c}{-x}$ for some constant $c > 0$ and $x$ sufficiently close to $-\infty$. To illustrate the speed of escape, assume for simplicity that $X$ is a deterministic process. Then, the last inequality would imply $\tau_x \le \frac{1}{r}\big(\log(-x) - \log(c)\big)$, i.e., $X$ escapes to $-\infty$ exponentially quickly.

An example of a model that violates the assumptions of Lemma 3 is the negative geometric Brownian motion: $X_t = -\exp\big((\mu - \sigma^2/2)t + \sigma W_t\big)$ for $\mu, \sigma > 0$. With the generator $\mathcal{A} = \frac{1}{2}\sigma^2 x^2 \frac{d^2}{dx^2} + \mu x \frac{d}{dx}$, we have $\phi_r(x) = (-x)^{\gamma_2}$ and $\psi_r(x) = (-x)^{\gamma_1}$, where $\gamma_1 < 0 < \gamma_2$ are solutions to the quadratic equation $\frac{\sigma^2}{2}\gamma^2 + \big(\mu - \frac{\sigma^2}{2}\big)\gamma - r = 0$, i.e., $\gamma = B \pm \sqrt{B^2 + 2\frac{r}{\sigma^2}}$ with $B = \frac{1}{2} - \frac{\mu}{\sigma^2}$. Hence, $\lim_{x\to-\infty} \frac{-x}{\phi_r(x)} = \lim_{x\to-\infty} (-x)^{1-\gamma_2} > 0$ if and only if $\gamma_2 \le 1$. It is easy to check that $\gamma_2 = 1$ for $\mu = r$ and $\gamma_2$ is decreasing as a function of $\mu$. Therefore, the condition $\gamma_2 \le 1$ is equivalent to $\mu \ge r$.

In summary, the negative geometric Brownian motion violates the assumptions of Lemma 3 if $\mu \ge r$. If $\mu = r$, then case B of Theorem 1 applies with $L_c = 1$, while, if $\mu > r$, then $L_c = \infty$ and so case C applies. Both cases may be interpreted heuristically as the negative geometric Brownian motion $X$ escaping 'relatively quickly' to $-\infty$, that is, relative to the value $r$ of the continuously compounded interest rate. In the latter case, this happens sufficiently quickly that the single problem's value function $V_c$ is infinite.
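The dependence of $\gamma_2$ on $\mu$ can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names `gbm_exponents` and `escape_limit` and the parameter values are illustrative assumptions. It computes the roots of the quadratic above and reports the limit of $(-x)^{1-\gamma_2}$ as $x \to -\infty$.

```python
import math


def gbm_exponents(mu, sigma, r):
    """Roots gamma1 < 0 < gamma2 of (sigma^2/2) g^2 + (mu - sigma^2/2) g - r = 0."""
    b = 0.5 - mu / sigma**2
    root = math.sqrt(b * b + 2.0 * r / sigma**2)
    return b - root, b + root


def escape_limit(mu, sigma, r, tol=1e-12):
    """Limit of (-x)^(1 - gamma2) as x -> -infinity: 0, 1 or infinity."""
    _, gamma2 = gbm_exponents(mu, sigma, r)
    if abs(gamma2 - 1.0) <= tol:
        return 1.0  # gamma2 = 1, which occurs when mu = r
    # gamma2 > 1 gives limit 0; gamma2 < 1 gives an infinite limit
    return 0.0 if gamma2 > 1.0 else math.inf


if __name__ == "__main__":
    sigma, r = 0.3, 0.05  # illustrative values only, not taken from the paper
    for mu in (0.03, 0.05, 0.08):
        g1, g2 = gbm_exponents(mu, sigma, r)
        print(f"mu={mu:.2f} gamma1={g1:.4f} gamma2={g2:.4f} "
              f"limit={escape_limit(mu, sigma, r)}")
```

The tolerance `tol` is only there to absorb floating-point rounding in the boundary case $\mu = r$; the three values of `mu` are chosen to land on either side of, and exactly at, that boundary.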

#### **Appendix E. Verification Theorem for the Lifetime Value Function**

We now provide a verification lemma which may be used to verify whether a given value $\hat{x}$ is an optimal buy price in the lifetime problem. The result is motivated by the following argument using Theorem 3.

We claim that, for all $x \in I$, $\hat{\mathcal{T}}\widehat{V}(x)$ depends on the value function $\widehat{V}$ only through its value at $x = x^*$. The argument is as follows: when the battery operator acts optimally, the energy purchase occurs when the price is not greater than $x^*$: under $\mathbb{P}^x$ for $x \ge x^*$, this follows directly from Theorem 3; under $\mathbb{P}^x$ for $x < x^*$, the energy is either purchased before the price reaches $x^*$ or one applies a standard dynamic programming argument for optimal stopping problems (see, for example, Peskir and Shiryaev 2006) at $x^*$ to reduce this to the previous case. In our setup, the continuation value is not received until the EIM price rises again to $x^*$ (it is received immediately if the energy purchase occurs at $x^*$).

Suppose therefore that we can construct functions $V_i : I \to \mathbb{R}$, $i = 1, 2$, with the following properties:


Then, we have $V_2 = \hat{\mathcal{T}} V_1 = \hat{\mathcal{T}} V_2$, so that $V_2$ is a fixed point of $\hat{\mathcal{T}}$.

We postulate the following form for $V_i$: given $y > 0$ take

$$V_1(x) := \hat{\xi}_0^y(x) := \mathbf{1}_{x \le x^*}\, y, \tag{A9}$$

$$V_2(x) := \hat{\xi}^y(x) := \hat{\mathcal{T}} \hat{\xi}_0^y(x). \tag{A10}$$

For convenience, define $\mathfrak{h}(x, y)$ to be the payoff in the lifetime problem when the continuation value is $\hat{\xi}_0^y$. Thus, we have

$$
\mathfrak{h}(x, y) = \hat{h}(x, \hat{\xi}_0^y), \tag{A11}
$$

$$\hat{\xi}^y(x) = \hat{\mathcal{T}} \hat{\xi}_0^y(x) = \sup_{\tau} \mathbb{E}^x \left\{ e^{-r\tau} \mathfrak{h}(X_{\tau}, y) \mathbf{1}_{\tau < \infty} \right\}. \tag{A12}$$

**Lemma A9.** *Suppose that* $\hat{x} \in (a, x^*)$ *satisfies the system*

$$\frac{\mathfrak{h}(\hat{x}, y)}{\phi_r(\hat{x})} = \sup_{x \in (a, x^*)} \frac{\mathfrak{h}(x, y)}{\phi_r(x)}, \tag{A13}$$

$$y = \frac{\phi_r(x^*)}{\phi_r(\hat{x})} \mathfrak{h}(\hat{x}, y), \tag{A14}$$

$$y > 0. \tag{A15}$$

*Then, the function* $\hat{\xi}^y$ *of* (A12) *is a fixed point of* $\hat{\mathcal{T}}$*, is continuous and strictly positive, and*

$$\hat{\xi}^y(x) = \frac{\phi_r(x)}{\phi_r(x^*)}\, y, \quad \text{for } x \ge \hat{x}. \tag{A16}$$

**Proof.** Consider first the problem (A12) with $x \ge \hat{x}$. By construction, $\hat{\xi}_0^y$ is an admissible continuation value in Lemma A4, and cases 1 or 2a must then hold due to the standing assumption for this section that regime ($\alpha$) of Theorem 2 is in force. By (A13), the stopping time $\tau_{\hat{x}}$ is optimal, and the problem's value function $\hat{\xi}^y$ has the following three properties. Firstly, $\hat{\xi}^y$ is continuous on $I$ by Lemma A3. Secondly, using (A14), we see that $\hat{\xi}^y$ satisfies (A16). This implies thirdly that $\hat{\xi}^y/\phi_r$ is constant on $[x^*, b)$ and establishes that $\hat{\xi}^y(x^*) = y$, giving property (ii) above. Since $y > 0$ by (A15), the strict positivity of $\hat{\xi}^y$ everywhere follows as in part 1 of the proof of Lemma 2. Our standing assumption **S2\*** implies that the payoff $\mathfrak{h}(x, y)$ of (A11) is negative for $x > x^*$, which establishes property (iii) for problem (A12).

The three properties of $\hat{\xi}^y$ established above make it an admissible continuation value in Lemma A4, so we now consider the problem $\hat{\mathcal{T}} \hat{\xi}^y$ for $x \ge \hat{x}$. Under $\mathbb{P}^x$ for $x \ge x^*$, claim 2 of Lemma A1 prevents the battery operator from buying energy at prices greater than $x^*$ when acting optimally; under $\mathbb{P}^x$ for $x < x^*$, the dynamic programming principle mentioned above completes the argument.

The following corollary completes the verification argument, and also establishes the uniqueness of the value *y* in Lemma A9.

**Corollary A3.** *Under the conditions of Lemma A9:*


**Proof.** (i) We will appeal to Lemma A8 by refining property (iii) above for the problem $\hat{\mathcal{T}} V_2 = \hat{\mathcal{T}} \hat{\xi}^y$ (as was done in the proof of Corollary 2). Suppose that the battery operator buys energy at the price $x^*$. Then, since the function $\hat{\xi}^y$ is a fixed point of $\hat{\mathcal{T}}$ under our assumptions, we may consider $\hat{\mathcal{T}} \hat{\xi}^y(x^*) = -x^* + p_c + K_c + \hat{\xi}^y(x^*)$ and then **S2\*** leads to $\hat{\mathcal{T}} \hat{\xi}^y(x^*) < \hat{\xi}^y(x^*)$, which is a contradiction.

Thus, from Lemma A8, $\hat{\mathcal{T}}^n \mathbf{0}$ converges to $\hat{\xi}^y$ as $n \to \infty$. As the limit of $\hat{\mathcal{T}}^n \mathbf{0}$ is the lifetime value function, we obtain $\widehat{V} = \hat{\xi}^y$.

(ii) Assume the existence of two such values $y_1 \ne y_2$. Then, (A16) gives $\widehat{V}(x^*) = \hat{\xi}^{y_1}(x^*) = y_1 \ne y_2 = \hat{\xi}^{y_2}(x^*) = \widehat{V}(x^*)$, a contradiction.

We recall here that, on the other hand, the value $\hat{x}$ in Lemma A9 may not be uniquely determined (cf. part (a) of Corollary 2). In this case, the largest $\hat{x}$ satisfying the assumptions of Lemma A9 is the highest price $\check{x}$ at which the battery operator can buy energy optimally.

#### **Appendix F. Facts about the OU Process**

Let us temporarily fix *μ* = 0 and *θ* = *σ* = 1. Consider the ordinary differential equation (ODE)

$$w''(z) + \left(\nu + \frac{1}{2} - \frac{1}{4}z^2\right)w(z) = 0.$$

There are two fundamental solutions *Dν*(*z*) and *Dν*(−*z*), where *D<sup>ν</sup>* is a parabolic cylinder function. Assume that *ν* < 0. This function has a multitude of representations, but the following will be sufficient for our purposes (Érdelyi et al. 1953, p. 119):

$$D\_{\nu}(z) = \frac{e^{-z^2/4}}{\Gamma(-\nu)} \int\_0^\infty e^{-zt - \frac{1}{2}t^2} t^{-\nu - 1} dt.$$

Then, *D<sup>ν</sup>* is strictly positive. Fix *r* > 0. Define

$$\psi\_r(\mathbf{x}) = e^{\frac{(\mathbf{x}-\boldsymbol{\mu})^2\theta}{2\sigma^2}} D\_{-r/\theta} \left( -\frac{(\mathbf{x}-\boldsymbol{\mu})\sqrt{2\theta}}{\sigma} \right), \qquad \phi\_r(\mathbf{x}) = e^{\frac{(\mathbf{x}-\boldsymbol{\mu})^2\theta}{2\sigma^2}} D\_{-r/\theta} \left( \frac{(\mathbf{x}-\boldsymbol{\mu})\sqrt{2\theta}}{\sigma} \right).$$

By direct calculation, one verifies that these functions solve

$$
\mathcal{L}v = rv,\tag{A17}
$$

where

$$
\mathcal{L}v(\mathbf{x}) = \frac{1}{2}\sigma^2 v''(\mathbf{x}) + \theta(\mu - \mathbf{x})v'(\mathbf{x}) \tag{A18}
$$

is the infinitesimal generator of the OU process (25). Setting *ν* = −*r*/*θ*, we can write

$$\psi_r(x) = \frac{1}{\Gamma(-\nu)} \int_0^\infty e^{(x-\mu)t\frac{\sqrt{2\theta}}{\sigma} - \frac{1}{2}t^2}\, t^{-\nu-1}\, dt, \qquad \phi_r(x) = \frac{1}{\Gamma(-\nu)} \int_0^\infty e^{-(x-\mu)t\frac{\sqrt{2\theta}}{\sigma} - \frac{1}{2}t^2}\, t^{-\nu-1}\, dt.$$

Hence, $\psi_r$ is increasing and $\phi_r$ is decreasing in $x$. In addition, by monotone convergence, $\psi_r(-\infty) = \phi_r(\infty) = 0$ and $\psi_r(\infty) = \phi_r(-\infty) = \infty$. The functions $\psi_r$ and $\phi_r$ are then fundamental solutions of Equation (A17). Furthermore, they are strictly convex, which can be checked by differentiating under the integral sign (justified by the dominated convergence theorem). Defining $F(x) = \psi_r(x)/\phi_r(x)$, we see that $F$ is continuous and strictly increasing with $F(-\infty) = 0$ and $F(\infty) = \infty$.

Using the integral representation of *φ<sup>r</sup>* and l'Hôpital's rule, we have

$$\begin{split}
\lim_{x \to -\infty} \frac{-x}{\phi_r(x)} &= \lim_{x \to -\infty} \frac{-1}{\frac{1}{\Gamma(-\nu)} \int_0^\infty e^{-(x-\mu)t\frac{\sqrt{2\theta}}{\sigma} - \frac{1}{2}t^2} \left(-t \frac{\sqrt{2\theta}}{\sigma}\right) t^{-\nu-1}\, dt} \\
&= \frac{\sigma}{\sqrt{2\theta}} \lim_{x \to -\infty} \frac{1}{\frac{1}{\Gamma(-\nu)} \int_0^\infty e^{-(x-\mu)t\frac{\sqrt{2\theta}}{\sigma} - \frac{1}{2}t^2}\, t^{-\nu}\, dt} \\
&= \frac{\sigma}{\sqrt{2\theta}} \lim_{x \to -\infty} \frac{1}{\frac{\Gamma(-\nu+1)}{\Gamma(-\nu)} \frac{1}{\Gamma(-\nu+1)} \int_0^\infty e^{-(x-\mu)t\frac{\sqrt{2\theta}}{\sigma} - \frac{1}{2}t^2}\, t^{-\nu}\, dt} = 0,
\end{split} \tag{A19}$$

as the denominator is a scaled version of $\phi_{\tilde{r}}$ corresponding to a new $\tilde{r}$ such that $-\tilde{r}/\theta = \nu - 1 < \nu < 0$, and so it converges to infinity when $x \to -\infty$.
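The integral representations above lend themselves to a direct numerical check. The following is a minimal sketch using only the Python standard library; the function names `ou_phi` and `ou_psi`, the cut-off `t_max`, the Simpson grid size `n` and the substitution $t = u^{1/p}$ are our own illustrative choices, not part of the paper. The substitution removes the integrable singularity of $t^{-\nu-1}$ at $0$ when $-\nu = r/\theta < 1$. The sketch assumes moderate values of $r/\theta$ and of $|x-\mu|\sqrt{2\theta}/\sigma$ (roughly below 35), since otherwise the integrand overflows double precision or the cut-off is too short.

```python
import math


def ou_phi(x, r, theta, mu, sigma, n=4000):
    """Decreasing solution phi_r(x) of (A17), from its integral representation."""
    q = r / theta                                  # q = -nu > 0
    z = (x - mu) * math.sqrt(2.0 * theta) / sigma  # rescaled state
    p = min(q, 1.0)        # substitute t = u**(1/p) to avoid the singularity at 0
    t_max = 12.0 + abs(z)  # heuristic cut-off: integrand is negligible beyond it
    u_max = t_max ** p
    h = u_max / n          # n must be even for Simpson's rule

    def f(u):
        # Integrand t**(q-1) * exp(-z*t - t*t/2) rewritten in the variable u
        t = u ** (1.0 / p)
        return u ** (q / p - 1.0) * math.exp(-z * t - 0.5 * t * t) / p

    # Composite Simpson rule on [0, u_max]
    odd = sum(f((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    even = sum(f(2 * i * h) for i in range(1, n // 2))
    integral = h / 3.0 * (f(0.0) + 4.0 * odd + 2.0 * even + f(u_max))
    return integral / math.gamma(q)


def ou_psi(x, r, theta, mu, sigma, n=4000):
    """Increasing solution psi_r(x): the reflection of phi_r about mu."""
    return ou_phi(2.0 * mu - x, r, theta, mu, sigma, n)


if __name__ == "__main__":
    r, theta, mu, sigma = 0.5, 1.0, 0.0, 1.0  # illustrative values only
    for x in (-2.0, -5.0, -10.0):
        # The ratio -x / phi_r(x) should shrink towards 0, as in (A19)
        print(x, -x / ou_phi(x, r, theta, mu, sigma))
```

Two special cases give closed forms against which such a routine can be compared: for $r = \theta$ the integral reduces to $\sqrt{\pi/2}\, e^{z^2/2} \operatorname{erfc}(z/\sqrt{2})$ with $z = (x-\mu)\sqrt{2\theta}/\sigma$, and at $x = \mu$ it equals $2^{q/2-1}\Gamma(q/2)/\Gamma(q)$ with $q = r/\theta$.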

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
