*Article* **Resource Exploitation in a Stochastic Horizon under Two Parametric Interpretations**

**José Daniel López-Barrientos 1,\*,†,‡, Ekaterina Viktorovna Gromova 2,‡ and Ekaterina Sergeevna Miroshnichenko <sup>3</sup>**


Received: 25 May 2020; Accepted: 29 June 2020; Published: 3 July 2020

**Abstract:** This work presents a two-player extraction game where the random terminal times follow (different) heavy-tailed distributions which are not necessarily compactly supported. Besides, we delve into the implications of working with logarithmic utility/terminal payoff functions. To this end, we use standard actuarial results and notation, and state a connection between the so-called *actuarial equivalence principle*, and the feedback controllers found by means of the Dynamic Programming technique. Our conclusions include a conjecture on the form of the optimal premia for insuring the extraction tasks; and a comparison for the intensities of the extraction for each player under different phases of the lifetimes of their respective machineries.

**Keywords:** differential games; random time horizon; time until failure; discounted equilibrium; Weibull distribution; Chen distribution; equivalence principle

**MSC:** 91A10; 91A23; 49N90; 60E05

### **1. Introduction**

In this work we study an extension of the extraction game presented in Reference [1] to the case where the random terminal times follow (different) heavy-tailed distributions which are not necessarily compactly supported. We use the framework of the problem of common non-renewable resource exploitation as posed in Reference [2], from both the game-theoretical (cf. Reference [3]) and the actuarial (see References [4,5]) points of view.

The first reported works on the dynamic development of exhaustible resources by the members of an oligopoly are those by Hotelling (see References [6,7]). There, we can find the well-known principle of marginal revenue, as well as the standard hypothesis on the equality between the growth rate and the market interest rate over time. The survey [8] constitutes an excellent introduction to the topic from an economic point of view; see also Reference [9] for an empirical investigation of the dynamic and strategic behavior of common-pool resource users at the micro level using real-world data.

The area owes its main developments to a discussion that took place during the late '70s and the early '80s on the possibility of replacing the exploitation schemes with some cutting-edge technology to be attained in the near future. The relevance of the debate was the search of a path to move from extracting a non-renewable resource to extracting a renewable one. In this line, we can quote the works of Dasgupta et al. (e.g., References [10,11]; see also Reference [12] Chapter 10.2), and that of Reinganum and Stokey (see Reference [13]). In the former papers, two agents extract a resource that becomes extinct at a time instant which is not known a priori, and then, some technological breakthrough becomes a suitable replacement; while in the latter, the authors assume that the extraction costs equal zero to find an optimal extraction policy over time. From the point of view of our own research, one of the main features of these publications is a method for comparing aggregate extraction paths in terms of the impact of the commitment period of the players on how fast the resource becomes exhausted.

Harris and Vickers [14] revisited Dasgupta's model, enhanced his analyses of the state dynamics, proved the existence and uniqueness of a Nash equilibrium, and characterized it in terms of the slope of the extraction policies at equilibrium. Epstein [15] and Feliz [16] considered a degenerate game to study the policy of extraction from either one or two wells as the resource dynamics is affected by uncertainty. More recently, the empirical economic knowledge on the subject (along with the developments of Hartwick and Solow, see References [17,18] respectively, on the transformation of an exhaustible resource into productive capital to sustain a steady level of consumption) allowed Van der Ploeg to model the relation between the risk of depletion of a resource and the government policy on debt and precautionary saving (see Reference [19]). In Reference [20], an evolutionary analysis of the exploitation of renewable resources with differentiated technologies was undertaken. Comprehensive surveys of models of dynamic games for the development of (renewable and non-renewable) resources can be found in Reference [21] by Van Long, and Reference [12] (Chapter 10) by Dockner et al.

Almost all of the papers mentioned above include the notion of uncertainty. However, only Reference [1] uses random variables to model the terminal times of extraction of the competing firms. We propose a differential game for the extraction of exhaustible resources (see References [12,22] (Chapter 10; Chapter 7)), where we consider uncertainty and asymmetry in a cake-eating model (as shown in Reference [23]), and interpret it in actuarial terms to, for instance, insure the extraction tasks of the players. The uncertainty here is reflected by the fact that the game ends at a random time instant, while the asymmetry can be seen in the different distributions we use for each player involved in the game.

In the insurance literature on non-renewable resource extraction, the work of Stroebel and van Benthem (see Reference [4]) is relevant for our research because they prove that, when the economy is likely to expropriate, the insurance premium on the extraction tasks becomes increasingly expensive, and that it becomes less expensive as the extractors become more expert. Our approach resembles the one used by Delacote (see Reference [24]), because what he sees as households near the extraction points can be, in our case, an agent subjected to a double risk: they will (surely) occupy the place of what we dub the *Chen extractor* (which we prove to be the riskier of the two players), and they will not be "able to get more than their subsistence requirement" from the wells (the press article [25] presents a narrative of one such situation in Mexico, and the essays in the book [26] deal with the problem in African and Asian countries). There are three more references in the insurance literature on the extraction of renewable resources that are important for our developments [27–29]. Instead of the control methods we adopt, these pieces use statistical methods to show that natural insurance is a normal economic good, and we all agree on the willingness of the agents to pay premia in exchange for a continued supply of the resource under consideration.

The main feature of our game model can be traced back to the works of Petrosyan and Murzov (cf. Reference [30]), and Yaari (cf. Reference [31]). The former used Dynamic Programming to solve a zero-sum pursuit differential game with random duration, while the latter studied the maximization of a utility function by the design of an optimal consumption plan. This topic was further investigated in Reference [32], where the process of technological innovation was supposed to have a random duration. To the best of our knowledge, Petrosyan and Shevkoplyas (Gromova) were the first to propose a general differential game model with random duration (see Reference [33]), including the derivation and solution in closed form under a logarithmic payoff structure, of the corresponding Hamilton-Jacobi-Bellman-Isaacs (HJBI) equations (see Reference [34], and the generalization brought about by Reference [35]).

Game theoretical tools have been applied to model phenomena of interest to actuarial scientists since the time of the works of Borch and Lemaire, who modelled risk transfers between insurer and reinsurer by means of cooperative game tools (see References [36–38]). Among the most relevant and recent works in the actuarial literature related to ours, we can quote Schäl's paper on the application of stochastic Dynamic Programming for a specific form of the utility function (cf. Reference [5]), because in that research, as well as in ours, the aim is to minimize risks in the insurance industry, the focus is on a particular form of the reward functions, and the main tool is Dynamic Programming. This technique is the cornerstone of our approach (the textbook [39] by Hinderer, Rieder and Stieglitz represents an excellent introduction to it, for it covers the deterministic and stochastic cases, and it presents some actuarial applications in insurance). This method has been widely used since its dawn in the 1960s for many applications, including warfare, resource extraction, and control of pollution in the environment, among others. Schmidli's textbook [40] presents a complete introduction to the subject with traditional Actuarial Sciences in view. Indeed, he starts with the presentation of stochastic control (by means of Bellman's principle) in discrete and continuous time, then goes to applications in life insurance, and finally presents the classic ruin theory and Merton's model (Sections 2.6 and 3.1 in the survey [41], and Reference [42] constitute two quick reference guides to the techniques we use). The works of Dutang et al. (cf. References [43,44]), and that of Polborn (see Reference [45]) present a game theoretic model in discrete time on a non-life insurance market; and despite the fact that they focus on the competition for the market share of the agents, our research can be thought of as a continuation of their model on the fair (optimal) premia to be charged to the agents so as to maximize their profits from the competitive process. Pliska and Ye's article on optimal life insurance strategies (see Reference [46]), and Mango's work (cf. Reference [47]) on catastrophe modelling resemble our own contribution, but the characterization of the probability laws that these two works use lies at another extreme of the spectrum of distributions. This is the reason why the model of perishable inventories with fat-tailed distributions studied by Giri (cf. Reference [48]) is so appealing to us. As for our use of simple contingent functions in association with Markovian processes, the works of Mao and Ostaszewski, and Perry and Stadje on annuity theory (see References [49,50]) represent significant antecedents to our own ideas.

The problem with asymmetry in different random time instants has been studied for some differential games in References [1,51–53]. A similar cake-eating problem with different asymmetric discounting functions has been considered in Reference [54]. In our paper we also assume that the game stops at the moment when one of the players quits the extraction tasks. However, rather than a Verhulst-type dynamic for the stock of the resource (see Reference [55]), we use a model that resembles the analysis in Reference [12]. We have chosen to work with the classic two-parameter Weibull distribution and Chen's law. Weibull's model has been widely used in life and non-life Actuarial Mathematics, while Chen's law owes its importance to the fact that its hazard rate function is, in a broad sense, remarkably stronger than that of the former model; this feature automatically turns it into a very suitable option for modelling extreme events (see Reference [56]). We focus our attention on three phases of the lifetime of the machineries of each player: early stage, normal operation stage, and aging stage. We do this by changing the *shape parameters* of each one of these laws: a lower value of these parameters (*δ* < 1) results in a lower hazard rate for the distributions, the choice of *δ* = 1 yields a stable hazard rate function, and a greater value (*δ* > 1) gives us strong failure rates.

Our main conclusions have to do with the implications of using stochastic Dynamic Programming with a particular form of the utility functions, as in Reference [5]; however, all of our developments are presented in the continuous-time framework. To this end, we use standard actuarial results and notation, and state a connection between the so-called *actuarial equivalence principle* (see Chapter 6 in Reference [57], Section on Premium Calculation, Nonlife in References [46,58]), and the feedback controllers found by means of the Dynamic Programming technique. We will end up showing that, as in References [4,27–29], the agents are willing to pay for the coverage of the insurance in both the one-player context (see References [15,16]) and the game-theoretic case with asymmetries (see References [1,51–53]), where the imbalance comes from the choice of different fat-tailed distributions for the terminal times of extraction of the agents (see References [50,56,59]).

The rest of the paper is organized as follows. In Section 2 we state the main hypotheses of our model and argue about the connection of the game theoretic framework and the actuarial perspective. Section 3.1 is a reduced version of our study for a degenerate game where only one player performs extraction tasks. This part of the work allows us to exemplify an actuarial interpretation of the verification result given by Theorem 1; and define what we call *ease of the extraction* of the player in terms of the intensity of the extraction rate. In Section 3.2 we state and explicitly solve the two-player extraction game where the random terminal times are distributed according to Weibull and Chen laws; compare the results in terms of the ease of extraction of each player for particular choices of the parameters; and interpret the verification Theorem 3 in actuarial terms. We give our conclusions in Section 4.

### **2. Problem Statement**

In this section we present the problem we are interested in, introduce our hypotheses, and explain its connection with some ideas from the Actuarial Sciences.

### *2.1. Game Theoretical Framework*

Let us consider the conflict-control process of the extraction of a non-renewable resource in which two participants are involved. We assume that this set of agents remains fixed for the whole duration of the process.

We describe the dynamics of the consumption of the resource by means of the model presented in Reference [12] (Chapter 10.3), according to which,

$$
\dot{x}(t) = -u_1(t) - u_2(t), \quad \text{with } x(t_0) = x_0, \tag{1}
$$

where *x*(*t*) is the amount of the resource available at time *t* ≥ 0, *u<sub>i</sub>*(*t*) is the extraction rate of the *i*-th agent at time *t*, *x*<sub>0</sub> is the initial amount of the stock, and *i* = 1, 2.

Let Γ(*x*0) be a differential game whose system satisfies the following conditions.

**Hypothesis 1. (a)** *Both players act simultaneously and start the game at some initial time t*<sub>0</sub> = 0 *from the state x*<sub>0</sub>*.*


The system (1) mirrors the fact that the resource is non-renewable because, by Hypothesis 1(b), *x*(·) is non-increasing.

We assume that the extraction performed by each agent stops at some random moment of time *Ti*, for *i* = 1, 2. We impose the following hypothesis on the random terminal times *Ti*.

**Hypothesis 2.** *The cumulative distribution function of Ti satisfies the following:*

**(a)** *The random terminal times of the firms are distributed according to some (known) functions Fi* : [0, ∞) → [0, 1]*, i* = 1, 2*, that satisfy the normalization condition*

$$\int_0^\infty \mathrm{d} F_i(t) = 1,$$

*and that are absolutely continuous with respect to Lebesgue's measure.*


We define the *failure rate function associated with the i-th firm* (see References [57,60,61] (Chapter 3; Chapters 4 and 5; Chapter 8.5)) as

$$
\lambda\_i(t) := \frac{f\_i(t)}{1 - F\_i(t)},
\tag{2}
$$

where *f<sub>i</sub>* := *F′<sub>i</sub>* is a density function for the random terminal time of the *i*-th player. The existence of such a density function is ensured by Hypothesis 2(a).

By virtue of Hypothesis 2(c), Γ(*x*0) collapses to a one-player game at a random terminal time that may be formed by means of the rule of correspondence

$$T := \min\{T\_1, T\_2\}.\tag{3}$$

Now, by Hypothesis 2(b), the results in Reference [61] (Chapter 16.3) and the well-known relation

$$1 - F_i(t) = \exp\left(-\int_0^t \lambda_i(s)\,\mathrm{d}s\right) \tag{4}$$

(see Reference [57] (Chapter 3)), we define the distribution function of the random variable *T* as

$$\begin{array}{rcl} F(t) &:=& 1 - (1 - F\_1(t))(1 - F\_2(t)) \\ &=& 1 - \mathbf{e}^{-\int\_{t\_0}^t \lambda(s)ds} \end{array} \tag{5}$$

where *λ*(*t*) := *λ*1(*t*) + *λ*2(*t*) stands for the hazard rate function of the random variable *T* (see, for instance Reference [57] (Chapter 9.3)).
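As a quick numerical sanity check of (5), the following Python sketch compares the product form 1 − (1 − *F*<sub>1</sub>)(1 − *F*<sub>2</sub>) with the hazard form 1 − exp(−∫*λ*), taking as the two lifetimes the Weibull and Chen laws that the paper introduces in (19)–(22). The parameter values, the function names and the midpoint quadrature are illustrative assumptions and are not part of the original text.

```python
import math


def weibull_cdf(t, lam, delta):
    """Weibull law, equation (19): F_1(t) = 1 - exp(-lam * t**delta)."""
    return 1.0 - math.exp(-lam * t ** delta)


def chen_cdf(t, lam, delta):
    """Chen law, equation (21): F_2(t) = 1 - exp(lam * (1 - exp(t**delta)))."""
    return 1.0 - math.exp(lam * (1.0 - math.exp(t ** delta)))


def weibull_hazard(t, lam, delta):
    """Equation (20)."""
    return lam * delta * t ** (delta - 1.0)


def chen_hazard(t, lam, delta):
    """Equation (22)."""
    return lam * delta * t ** (delta - 1.0) * math.exp(t ** delta)


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule (an assumed, deliberately simple quadrature)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def cdf_of_minimum_product(t, weibull_params, chen_params):
    """First line of (5): 1 - (1 - F_1(t)) * (1 - F_2(t))."""
    s1 = 1.0 - weibull_cdf(t, *weibull_params)
    s2 = 1.0 - chen_cdf(t, *chen_params)
    return 1.0 - s1 * s2


def cdf_of_minimum_hazard(t, weibull_params, chen_params):
    """Second line of (5): 1 - exp(-integral of lambda_1 + lambda_2 over [0, t])."""
    def total_hazard(s):
        return weibull_hazard(s, *weibull_params) + chen_hazard(s, *chen_params)
    return 1.0 - math.exp(-midpoint_integral(total_hazard, 0.0, t))


if __name__ == "__main__":
    weibull_params = (2.0, 2.0)  # hypothetical (lambda_1, delta_1)
    chen_params = (2.0, 1.0)     # hypothetical (lambda_2, delta_2)
    for t in (0.1, 0.5, 1.0):
        print(t,
              cdf_of_minimum_product(t, weibull_params, chen_params),
              cdf_of_minimum_hazard(t, weibull_params, chen_params))
```

The two columns printed for each *t* should agree up to quadrature error. Shape parameters of at least 1 are used here only to keep the hazard bounded near *t* = 0, which the simple quadrature rule prefers.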

We now introduce the payoff function and the performance indices for each player in the game Γ(*x*0). To do this, we assume the following conditions.

**Hypothesis 3.** *For i* = 1, 2*, the utility functions h<sub>i</sub> and* Φ<sub>*i*</sub> *are continuous in all of their arguments, and concave with respect to u<sub>i</sub>. Additionally, the function h<sub>i</sub> satisfies either:*

**(a)** *It is a nonnegative function, that is, h<sub>i</sub>* : ℝ × U<sup>*n*</sup> → [0, ∞)*.*

**(a')** *The product f<sub>i</sub>* · *h<sub>i</sub> is such that*

$$\int\_0^\infty \int\_0^\infty |f\_i(t) \cdot h\_i(\mathbf{x}(\tau), u\_1(\tau), u\_2(\tau))| \mathrm{d}t \mathrm{d}\tau < \infty.$$

*At time T given by* (3)*, if the i-th player is the only one remaining in the extraction game, she receives a terminal payoff* Φ*i*(*x*(*T*))*.*

With Hypothesis 3, we can introduce the performance index we are interested in.

Define *u* : [0, ∞) → U<sup>2</sup> as the vector of functions (*u*<sub>1</sub>, *u*<sub>2</sub>). For *i* = 1, 2, the performance index for the *i*-th player is:

$$K_i(x_0, u_1, u_2) = \mathbb{E}_{x_0}^{u_1,u_2}\left[\int_0^{T_i} h_i(x(\tau), u_1(\tau), u_2(\tau))\,\mathrm{d}\tau\,\chi_{\{T_i \leq T_j\}}\right] \tag{6}$$

$$+\,\mathbb{E}_{x_0}^{u_1,u_2}\left[\int_0^{T_j} h_i(x(\tau), u_1(\tau), u_2(\tau))\,\mathrm{d}\tau\,\chi_{\{T_i > T_j\}}\right] \tag{7}$$

$$+\,\mathbb{E}_{x_0}^{u_1,u_2}\left[\Phi_i(x(T))\,\chi_{\{T_i \geq T_j\}}\right], \tag{8}$$

where 𝔼<sub>*x*<sub>0</sub></sub><sup>*u*<sub>1</sub>,*u*<sub>2</sub></sup>[·] is the conditional expectation of · given that the initial stock is *x*<sub>0</sub> and the agents use the strategies *u*<sub>1</sub> and *u*<sub>2</sub>; *χ*<sub>{·}</sub> is an indicator function; *T* is as in (3); and Φ<sub>*i*</sub>(·) is the terminal payoff function referred to in the final part of Hypothesis 3.

**Remark 1.** *Note that the payoff of the game has two components: the integral payoff* (6) *and* (7)*, achieved while playing; and* (8)*, a final reward, which is assigned to the player that stayed longer in the system.*

Now we use the compactness mentioned in Hypothesis 1(b) to introduce the sort of optimality we are interested in.

**Definition 1.** *Let* Π<sub>*i*</sub> := {*u<sub>i</sub>* : [0, ∞) × [0, *x*<sub>0</sub>] → [0, *x*<sub>0</sub>] | *u<sub>i</sub> is Lebesgue-measurable*}*, for i* = 1, 2*. We say that a pair of feedback strategies* (*u*<sub>1</sub><sup>∗</sup>, *u*<sub>2</sub><sup>∗</sup>) ∈ Π<sub>1</sub> × Π<sub>2</sub> *is* optimal *if it is a so-called* Nash equilibrium*, that is, if*

$$\begin{aligned} K_1(x_0, u_1^*, u_2^*) &\geq K_1(x_0, u_1, u_2^*) \quad \text{for every } u_1 \in \Pi_1, \\ K_2(x_0, u_1^*, u_2^*) &\geq K_2(x_0, u_1^*, u_2) \quad \text{for every } u_2 \in \Pi_2. \end{aligned}$$

In this game, each firm intends to maximize its profit. Then, we designate the corresponding strategies of the players as *u*<sub>1</sub><sup>∗</sup>, *u*<sub>2</sub><sup>∗</sup>, and call them "optimal". We also write the trajectory under such strategies as *x*<sup>∗</sup>, and call it the "optimal trajectory". Let *h<sub>i</sub>*<sup>∗</sup>(*t*) := *h<sub>i</sub>*(*x*<sup>∗</sup>, *u*<sub>1</sub><sup>∗</sup>, *u*<sub>2</sub><sup>∗</sup>), and rewrite the optimal expected payoff resulting from the maximization of (6)–(8) as

$$\begin{aligned} K_i(x_0, u_1^*, u_2^*) &= \mathbb{E}_{x_0}^{u_1^*,u_2^*}\left[\int_0^{T_i} h_i^*(\tau)\,\mathrm{d}\tau\,\chi_{\{T_i \leq T_j\}}\right] \\ &\quad + \mathbb{E}_{x_0}^{u_1^*,u_2^*}\left[\int_0^{T_j} h_i^*(\tau)\,\mathrm{d}\tau\,\chi_{\{T_i > T_j\}}\right] \\ &\quad + \mathbb{E}_{x_0}^{u_1^*,u_2^*}\left[\Phi_i(x^*(T))\,\chi_{\{T_i \geq T_j\}}\right]. \end{aligned}$$

The following result is an extension of Reference [1] (Corollary 3.1). We include its proof here for the sake of completeness.

**Proposition 1.** *Let Hypotheses 1–3 hold. If*

$$H\_i(\theta) := \int\_0^{\theta} h\_i^\*(t)dt < \infty \tag{9}$$

*for all θ* > 0*, then the optimal expected payoff for the problem starting at t* = 0 *is given by*

$$K_i(x_0, u_1^*, u_2^*) = \int_0^\infty \left[ h_i^*(\theta)\,(1 - F(\theta)) + \Phi_i(x^*(\theta))\,f_j(\theta)\,(1 - F_i(\theta)) \right] \mathrm{d}\theta.$$

**Proof.** It is easy to see that

$$\begin{aligned} K_i(x_0, u_1^*, u_2^*) &= \int_0^\infty \int_0^\infty \int_0^\theta h_i^*(t)\,\mathrm{d}t\,\chi_{\{\theta < \tau\}}\, f_j(\tau)\,\mathrm{d}\tau\, f_i(\theta)\,\mathrm{d}\theta \\ &\quad + \int_0^\infty \int_0^\infty \int_0^\tau h_i^*(t)\,\mathrm{d}t\,\chi_{\{\theta > \tau\}}\, f_i(\theta)\,\mathrm{d}\theta\, f_j(\tau)\,\mathrm{d}\tau \\ &\quad + \int_0^\infty \int_0^\infty \Phi_i(x^*(\tau))\,\chi_{\{\theta > \tau\}}\, f_j(\tau)\,\mathrm{d}\tau\, f_i(\theta)\,\mathrm{d}\theta. \end{aligned}$$

The use of (9) and Fubini's rule yield

$$\begin{aligned} K\_i(\mathbf{x}\_0, \mathbf{u}\_1^\*, \mathbf{u}\_2^\*) &= \int\_0^\infty \int\_0^\tau H\_i(\theta) f\_i(\theta) \mathbf{d}\theta f\_j(\tau) d\tau + \int\_0^\infty \int\_0^\theta H\_i(\tau) f\_j(\tau) d\tau f\_i(\theta) d\theta \end{aligned} \tag{10}$$

$$+\int\_{0}^{\infty}\int\_{0}^{\infty}\Phi\_{i}\left(\mathbf{x}^{\*}\left(\tau\right)\right)\chi\_{\{\theta>\tau\}}f\_{j}(\tau)\mathrm{d}\tau f\_{i}(\theta)\mathrm{d}\theta.\tag{11}$$

An integration by parts yields

$$\int\_{0}^{\infty} \int\_{0}^{\tau} H\_{i}(\theta) f\_{i}(\theta) \, \mathbf{d}\theta f\_{j}(\tau) \, \mathbf{d}\tau + \int\_{0}^{\infty} \int\_{0}^{\theta} H\_{i}(\tau) f\_{j}(\tau) \, \mathbf{d}\tau f\_{i}(\theta) \, \mathbf{d}\theta \tag{12}$$

$$=\int_0^\infty H_i(\theta) f_i(\theta)\,\mathrm{d}\theta - \int_0^\infty H_i(\theta) f_i(\theta) F_j(\theta)\,\mathrm{d}\theta \tag{13}$$

$$+\int_0^\infty H_i(\theta) f_j(\theta)\,\mathrm{d}\theta - \int_0^\infty H_i(\theta) f_j(\theta) F_i(\theta)\,\mathrm{d}\theta. \tag{14}$$

Working with ∫<sub>0</sub><sup>∞</sup> *H<sub>i</sub>*(*θ*)*f<sub>i</sub>*(*θ*)d*θ* and ∫<sub>0</sub><sup>∞</sup> *H<sub>i</sub>*(*θ*)*f<sub>i</sub>*(*θ*)*F<sub>j</sub>*(*θ*)d*θ* separately yields:

$$\int_0^\infty H_i(\theta) f_i(\theta)\,\mathrm{d}\theta = \lim_{\theta \to \infty} H_i(\theta) - \int_0^\infty h_i^*(\theta) F_i(\theta)\,\mathrm{d}\theta, \tag{15}$$

$$\int_0^\infty H_i(\theta) f_i(\theta) F_j(\theta)\,\mathrm{d}\theta = \lim_{\theta \to \infty} H_i(\theta) - \int_0^\infty H_i(\theta) f_j(\theta) F_i(\theta)\,\mathrm{d}\theta - \int_0^\infty h_i^*(\theta) F_i(\theta) F_j(\theta)\,\mathrm{d}\theta.$$

The last relations imply

$$\int_0^\infty H_i(\theta) f_i(\theta) F_j(\theta)\,\mathrm{d}\theta + \int_0^\infty H_i(\theta) f_j(\theta) F_i(\theta)\,\mathrm{d}\theta = \lim_{\theta \to \infty} H_i(\theta) - \int_0^\infty h_i^*(\theta) F_i(\theta) F_j(\theta)\,\mathrm{d}\theta. \tag{16}$$

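Substituting (15) (together with its analogue for the integral of *H<sub>i</sub>f<sub>j</sub>*, which is obtained in the same way) and (16) in (12)–(14), and collecting similar terms, yields: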

$$\begin{split} &\int\_{0}^{\infty} \int\_{0}^{\tau} H\_{i}(\theta) f\_{i}(\theta) \, \mathrm{d}\theta f\_{j}(\tau) \mathrm{d}\tau + \int\_{0}^{\infty} \int\_{0}^{\theta} H\_{i}(\tau) f\_{j}(\tau) \mathrm{d}\tau f\_{i}(\theta) \, \mathrm{d}\theta \\ &= \quad \lim\_{\theta \to \infty} H\_{i}(\theta) - \int\_{0}^{\infty} h\_{i}^{\*}(\theta) \left( F\_{i}(\theta) + F\_{j}(\theta) - F\_{i}(\theta) F\_{j}(\theta) \right) \, \mathrm{d}\theta \\ &= \quad \int\_{0}^{\infty} h\_{i}^{\*}(\theta) \left( 1 - F(\theta) \right) \, \mathrm{d}\theta. \end{split} \tag{17}$$

Note that the term ∫<sub>0</sub><sup>∞</sup>∫<sub>0</sub><sup>∞</sup> Φ<sub>*i*</sub>(*x*<sup>∗</sup>(*τ*)) *χ*<sub>{*θ*>*τ*}</sub> *f<sub>j</sub>*(*τ*)d*τ* *f<sub>i</sub>*(*θ*)d*θ* in (11) equals

$$\begin{aligned} \int_0^\infty \int_0^\theta \Phi_i(x^*(\tau))\, f_j(\tau)\,\mathrm{d}\tau\, f_i(\theta)\,\mathrm{d}\theta &= \int_0^\infty \Phi_i(x^*(\theta))\, f_j(\theta)\,\mathrm{d}\theta - \int_0^\infty F_i(\theta)\,\Phi_i(x^*(\theta))\, f_j(\theta)\,\mathrm{d}\theta \\ &= \int_0^\infty \Phi_i(x^*(\theta))\, f_j(\theta)\,(1 - F_i(\theta))\,\mathrm{d}\theta, \end{aligned} \tag{18}$$

where the first equality is due to an integration by parts. The substitution of (17) and (18) in (10) and (11) gives the result.
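Proposition 1 can be illustrated numerically in a toy case. The sketch below assumes, purely for illustration, a constant running reward *c*, a constant terminal reward *p*, and independent exponential terminal times with rates *a* (player *i*) and *b* (player *j*); none of these choices come from the paper, whose payoffs are logarithmic. Under these assumptions 1 − *F*(*θ*) = e<sup>−(*a*+*b*)*θ*</sup> and *f<sub>j</sub>*(*θ*)(1 − *F<sub>i</sub>*(*θ*)) = *b* e<sup>−(*a*+*b*)*θ*</sup>, so the formula of Proposition 1 reduces to (*c* + *p b*)/(*a* + *b*), which can be compared with a Monte Carlo estimate of (6)–(8).

```python
import random


def simulated_payoff(rate_i, rate_j, running_reward, terminal_reward,
                     n_samples=200_000, seed=0):
    """Monte Carlo estimate of (6)-(8) for constant rewards (toy assumption).

    The running reward accrues until min(T_i, T_j); the terminal reward is
    collected by player i only when T_i >= T_j, as in (8).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        t_i = rng.expovariate(rate_i)  # hypothetical exponential lifetime of player i
        t_j = rng.expovariate(rate_j)  # hypothetical exponential lifetime of player j
        payoff = running_reward * min(t_i, t_j)
        if t_i >= t_j:
            payoff += terminal_reward
        total += payoff
    return total / n_samples


def proposition_payoff(rate_i, rate_j, running_reward, terminal_reward):
    """Closed form of the integral in Proposition 1 under the same toy assumptions."""
    return (running_reward + terminal_reward * rate_j) / (rate_i + rate_j)


if __name__ == "__main__":
    args = (1.0, 2.0, 1.0, 3.0)  # hypothetical (a, b, c, p)
    print("simulation :", simulated_payoff(*args))
    print("proposition:", proposition_payoff(*args))
```

The two printed numbers should be close, with a discrepancy of the order of the Monte Carlo error. The exponential case is chosen only because both sides are available in closed form; it is not one of the heavy-tailed laws studied in the paper.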

### *2.2. Interconnection with the Actuarial Sciences*

In the traditional Actuarial Sciences literature, there are two main principles under which it is possible to compute the amount (of money, time or effort) to be invested/reserved, known as the "premium" (plural "premia"), in exchange for a benefit (*e.g.*, the coverage of an insurance or the earnings of the extraction tasks). These principles are classified according to the following methodologies.


is linear, for in that case, the expectation of the prospective loss turns out to be null, and the computation of the corresponding surcharge is very straightforward (see Reference [57] (Example 6.1.1)). The result of this method is called "equivalence premium", "indifferent price", or "utility/benefit premium".

From the point of view of the actuarial scientist, we are using a third principle, which connects the traditional approaches. Thus, we state such third principle:

**PIII.** *Optimization of the conditional expectation* (6)–(8) (see References [40] and [63] (Chapter 6)) to find the premium which is to be exchanged for the benefits derived from the extraction tasks. Such a mixture resembles the approach of the tail-value-at-risk, also dubbed *conditional tail expectation* and *expected shortfall* (cf. References [64–66] and [60] (Chapter 3.5.4)), from **PI**, in the sense that the objective function of the optimization program we work with is a conditional expectation. However, due to the logarithmic form of the reward functions we use, and the particular form of the HJBI equations that this yields, the premia we obtain are very much like the indifferent prices one would get, had one chosen to work with the equivalence principle from **PII** in the first place (see Remarks 2 and 3 below). In the present context, we follow References [31,33] to present the development of the extraction tasks as a process with uncertain duration, and interpret the maximizers *u*<sub>1</sub><sup>∗</sup> and *u*<sub>2</sub><sup>∗</sup> as measures of the value's cost that each of the agents earns from such development, that is, as the premia that they should invest in order to ensure/insure their operation.

We consider games with a random terminal time *T*, to which the basic terminology of reliability theory can be applied directly. In fact, the failure rate function (2) can be thought of as a conditional density provided that the agent did not *default* (i.e., leave the game) until the moment *t*. In our terminology, we would talk about the density of the terminal time of the game, provided that the game was not terminated before the moment *t*. The failure rate function *λ<sub>i</sub>*(*t*) that describes the life cycle for the player *i* has the form described by Figure 1. We follow Reference [67] (Chapter 1) to identify three phases of the hazard rate with respect to time.


Moreover, we are interested in presenting a comparative analysis of the results when the duration of the extraction is distributed according to the laws of Weibull and Chen. The former has been widely used for modelling losses, time-until-failure of many non-renewable electronic devices (electronic lamps, semiconductor devices, some microwave devices) and lifetimes in general (see References [57,59–61,67]), while the latter is a fat-tailed distribution (see Reference [56]).

**Figure 1.** A *U*-shaped (bathtub) hazard rate function.

The cumulative distribution function of the Weibull distribution is:

$$F\_1(t) = 1 - \exp\left(-\lambda\_1 t^{\delta\_1}\right),\tag{19}$$

where *λ*<sub>1</sub> > 0 is a scale parameter, and *δ*<sub>1</sub> > 0 is a shape parameter that corresponds to one of the three phases in which the lifetime of the player can be located. Namely, the value *δ*<sub>1</sub> < 1 corresponds to the pre-run period; here the failure rate function *λ*<sub>1</sub>(*t*) is a decreasing function of *t*. At *δ*<sub>1</sub> = 1, the system is in the normal operation mode, and *λ*<sub>1</sub>(·) equals the constant value *λ*<sub>1</sub> > 0; in this case the Weibull distribution reduces to an exponential distribution. For *δ*<sub>1</sub> > 1, the system is in an aging state, and therefore *λ*<sub>1</sub>(·) is an increasing function. A special case of the Weibull distribution in this regime is the so-called *Rayleigh* distribution, which corresponds to *δ*<sub>1</sub> = 2 (see Reference [60] (Appendix A.3)).

By (2), the corresponding failure rate function is

$$
\lambda\_1(t) = \lambda\_1 \delta\_1 t^{\delta\_1 - 1}.\tag{20}
$$
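The three Weibull phases can be checked with a few lines of Python. The sketch below evaluates the closed form (20), compares it with the definition (2) applied to the distribution function (19) through a central finite difference, and labels the phase from the shape parameter. The function names, the sample point *t* = 0.8 and the scale value are illustrative assumptions.

```python
import math


def weibull_cdf(t, lam, delta):
    """Equation (19): F_1(t) = 1 - exp(-lam * t**delta)."""
    return 1.0 - math.exp(-lam * t ** delta)


def weibull_hazard(t, lam, delta):
    """Equation (20): lambda_1(t) = lam * delta * t**(delta - 1)."""
    return lam * delta * t ** (delta - 1.0)


def hazard_from_definition(cdf, t, eps=1e-6):
    """Equation (2), f(t) / (1 - F(t)), with f approximated by a central difference."""
    density = (cdf(t + eps) - cdf(t - eps)) / (2.0 * eps)
    return density / (1.0 - cdf(t))


def lifetime_phase(delta):
    """Phase names follow the text: pre-run, normal operation, aging."""
    if delta < 1.0:
        return "pre-run"
    if delta == 1.0:
        return "normal operation"
    return "aging"


if __name__ == "__main__":
    lam = 2.0  # hypothetical scale parameter
    for delta in (0.5, 1.0, 2.0):
        closed_form = weibull_hazard(0.8, lam, delta)
        numerical = hazard_from_definition(lambda t: weibull_cdf(t, lam, delta), 0.8)
        print(lifetime_phase(delta), closed_form, numerical)
```

For each shape parameter the closed form and the finite-difference value should coincide up to discretization error, which is a simple way to confirm that (20) follows from (2) and (19).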

The cumulative distribution function of the Chen distribution is

$$F\_2(t) = 1 - \exp\left(\lambda\_2 \left(1 - \mathbf{e}^{t^{\delta\_2}}\right)\right),\tag{21}$$

where *λ*<sub>2</sub> > 0 is a scale parameter, and *δ*<sub>2</sub> > 0 is a shape parameter. For this case, it follows from (2) that the failure rate function takes the form

$$
\lambda\_2(t) = \lambda\_2 \delta\_2 t^{\delta\_2 - 1} \mathbf{e}^{t^{\delta\_2}}.\tag{22}
$$

If *δ*<sub>2</sub> < 1, we are in the "newborn" phase. Here, the failure rate function *λ*<sub>2</sub>(·) is bathtub-shaped. This corresponds to a realistic process of extracting natural resources. When *δ*<sub>2</sub> = 1, the system is in the normal operation mode and the hazard rate function *λ*<sub>2</sub>(·) is increasing. For *δ*<sub>2</sub> > 1, the system is in the aging state, and *λ*<sub>2</sub>(·) is also an increasing function, but from

$$
\lambda_2'(t) = \lambda_2 \delta_2\, t^{\delta_2 - 2}\, \mathrm{e}^{t^{\delta_2}} \left( \delta_2 t^{\delta_2} + \delta_2 - 1 \right) \text{ for } t > 0,
$$

it is straightforward that the growth rate of *λ*2(·) is noticeably larger than in the case of normal operation. This implies that there is a greater probability of failure at this stage of the extraction.
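The qualitative behavior of the Chen failure rate (22) in the three phases can be explored numerically. The sketch below evaluates (22) on a uniform grid and reports whether the values are increasing along the grid and, when they are not, where the smallest grid value sits. The grid, the parameter values and the helper names are assumptions made for illustration only.

```python
import math


def chen_hazard(t, lam, delta):
    """Equation (22): lambda_2(t) = lam * delta * t**(delta - 1) * exp(t**delta)."""
    return lam * delta * t ** (delta - 1.0) * math.exp(t ** delta)


def hazard_on_grid(lam, delta, step=0.05, n_points=60):
    """Pairs (t, lambda_2(t)) on the grid t = step, 2*step, ..., n_points*step."""
    return [(k * step, chen_hazard(k * step, lam, delta))
            for k in range(1, n_points + 1)]


def is_increasing(values):
    """True when every grid value is strictly larger than the previous one."""
    return all(a < b for a, b in zip(values, values[1:]))


def grid_minimizer(lam, delta):
    """Grid point at which the tabulated hazard is smallest."""
    return min(hazard_on_grid(lam, delta), key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    lam = 2.0  # hypothetical scale parameter
    for delta in (0.5, 1.0, 2.0):
        values = [v for _, v in hazard_on_grid(lam, delta)]
        print(delta, "increasing on grid:", is_increasing(values),
              "| grid minimizer:", grid_minimizer(lam, delta))
```

For a shape parameter below 1 the tabulated hazard first falls and then rises, so the minimizer lies in the interior of the grid, which is the bathtub shape mentioned in the text. For shape parameters equal to or above 1 the values increase along the whole grid.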

Graphical representations of the failure rate function of the Chen distribution for two values of the scale parameter *λ*<sub>2</sub> are shown in Figure 2. In Figure 2b, note that, as *δ*<sub>2</sub> → 1, the slope of the graph grows larger. This fact might be interpreted by arguing that Chen's distribution plausibly describes how the system goes from the pre-run state into the normal operation mode.

Graphical representations of the failure rate functions for the Weibull and Chen distributions for fixed *λ*<sup>1</sup> = 2 = *λ*2, and the same values of the parameter *δ*<sup>1</sup> and *δ*<sup>2</sup> are shown in Figure 3. Note that we display these functions for each of the periods we have identified.

**Figure 2.** The failure rate function *λ*2(·). (**a**) *λ*<sup>2</sup> = 2, different modes of *δ*2. (**b**) *λ*<sup>2</sup> = 1, "newborn" mode.

**Figure 3.** Failure rate functions for Weibull and Chen distributions with different shape parameters. (**a**) Comparison of the failure rate functions in the pre-run stage. (**b**) Comparison of the failure rate functions at the stage of normal operation. (**c**) Comparison of the failure rate functions during the system aging period.


### **3. Resource Exploitation under Two Parametric Interpretations**

It is easy to see that the Weibull and Chen distributions (19) and (21) satisfy Hypothesis 2(a). We will consider these probability laws for the random terminal times of the players, and take *T* = min{*T*1, *T*2} for different parameters of these families. This will allow us to model failures of the equipment depending on its operation mode.

### *3.1. Dynamic Models for the Extraction of Natural Resources by One Agent*

In this section, we follow Reference [32], and consider the situation where only one agent performs extraction tasks, that is, the degenerate game where *n* = 1 (in this case, we do not consider a terminal payoff function, because the *game* finishes as soon as the only player leaves the system). Let *x*(*t*) be the stock of the non-renewable resource under consideration. Then, the dynamics (1) reduces to

$$\dot{x}(t) = -u(t), \ x(t\_0) = x\_0 > 0,\tag{23}$$

where *u*(*t*) ≥ 0. An application of Proposition 1 yields that the expected payoff of the agent, provided that the terminal random time follows Weibull's law, is

$$K(x\_0, u) = \int\_0^\infty h(x(t), u(t)) \mathbf{e}^{-\lambda t^\delta} \mathrm{d}t.\tag{24}$$

Note that adding a terminal cost does not make sense with a single agent, so we set Φ(*x*) ≡ 0. If we use Chen's law for the random terminal time, the winnings of the extractor are given by

$$K(x\_0, u) = \int\_0^\infty h(x(t), u(t)) \exp\left[\lambda \left(1 - \mathbf{e}^{t^\delta}\right)\right] \mathrm{d}t.\tag{25}$$

The problem of obtaining feedback maximizers for the expected gains in Formulae (24) and (25) under the condition (23) can be solved using the following Bellman equation

$$
\lambda(t)\mathcal{W}(\mathbf{x},t) = \frac{\partial}{\partial t}\mathcal{W}(\mathbf{x},t) + \max\_{u} \left( h(\mathbf{x},u) - u \frac{\partial}{\partial \mathbf{x}} \mathcal{W}(\mathbf{x},t) \right), \tag{26}
$$

with transversality condition

$$\lim\_{t \to \infty} \mathcal{W}(\mathbf{x}, t) = 0. \tag{27}$$

In (26), *λ*(·) corresponds to the failure rate of either the Weibull or the Chen distribution. The details of the calculations leading to (26) and (27) can be found in, for instance, Reference [68] (Chapter I.5).

The following result assumes a specific form of the agent's utility function. There are, of course, many other forms, which need to be concave, continuous and non-decreasing. However, we have chosen this particular form because of the interest that the result has from the point of view of the Actuarial Scientist (see Remark 2 below).

**Theorem 1.** *If the utility function is of the form*

$$h(x, u) = \ln u,\tag{28}$$

*then the optimal controller is given by the Lebesgue-measurable function*

$$u^\*(t,x) = \frac{x}{\bar{a}\_t},\tag{29}$$

*where*

$$\bar{a}\_{t} := \int\_{0}^{\infty} \frac{1 - F(t + s)}{1 - F(t)} \mathrm{d}s. \tag{30}$$

**Proof.** The substitution of the *ansatz* (see References [34,35]):

$$W(\mathbf{x}, t) = A(t) \ln \mathbf{x} + B(t), \tag{31}$$

into Bellman's Equation (26) yields

$$\frac{\partial}{\partial \mathbf{x}} \mathcal{W}(\mathbf{x}, t) \quad = \quad \frac{A(t)}{\mathbf{x}}, \tag{32}$$

$$\frac{\partial}{\partial t} \mathcal{W}(\mathbf{x}, t) \quad = \quad \dot{A}(t) \ln \mathbf{x} + \dot{B}(t). \tag{33}$$

Plugging (32) and (33) into (26) shows that the maximizing control is

$$
u^\*(t, x) = \frac{x}{A(t)},\tag{34}
$$

and also the following system of differential equations:

$$
\dot{A}(t) - \lambda(t)A(t) + 1 \quad = \quad 0; \tag{35}
$$

$$
\dot{B}(t) - \lambda(t)B(t) - \ln(A(t)) - 1 \quad = \quad 0. \tag{36}
$$

The transversality condition (27) takes the form

$$\lim\_{t \to \infty} A(t) \quad = \quad 0,\tag{37}$$

$$\lim\_{t \to \infty} B(t) \quad = \quad 0. \tag{38}$$

Using (37) and (thus) the integrating factor (4) we can solve (35) and get

$$\begin{array}{rcl} A(t) &=& \frac{1}{1 - F(t)} \int\_{t}^{\infty} \mathbf{e}^{-\int\_{0}^{s} \lambda(\tau) \mathrm{d}\tau} \mathrm{d}s \\ &=& \frac{\int\_{t}^{\infty} (1 - F(s)) \mathrm{d}s}{1 - F(t)}. \end{array}$$

The last equality follows from (4). Here, of course, *F*(·) is given by (19) or (21). From (30), it is straightforward that

$$A(t) = \bar{a}\_t. \tag{39}$$

The substitution of (39) in (34) gives (29).

The fact that the controller (29) is optimal for the degenerate game follows from Theorem I.7.1(a) in Reference [68]. The fact that this controller is Lebesgue-measurable follows from Hypothesis 1(b), along with the so-named measurable selection theorems (see, for instance, References [69–71] (Theorem 12.1; Proposition D5(a); Theorem 3.4)). This proves the result.
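To make the controller (29) concrete, the annuity (30) can be approximated by numerical quadrature. The sketch below is a minimal illustration and not the computation behind the figures: the function names, the midpoint rule, the change of variable *s* = *v*/(1 − *v*) and the overflow guard in the Chen survival function are all assumptions made here.

```python
import math

def weibull_survival(t, lam, delta):
    """1 - F_1(t) = exp(-lam * t**delta), see (19)."""
    return math.exp(-lam * t ** delta)

def chen_survival(t, lam, delta):
    """1 - F_2(t) = exp(lam * (1 - exp(t**delta))), see (21)."""
    p = t ** delta
    if p > 50.0:  # exp(p) is huge, so the survival is numerically zero
        return 0.0
    return math.exp(lam * (1.0 - math.exp(p)))

def annuity(t, survival, n=20000):
    """Midpoint approximation of (30): int_0^inf S(t + s) / S(t) ds.

    The half-line is mapped onto (0, 1) through s = v / (1 - v).
    """
    s_t = survival(t)
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        s = v / (1.0 - v)
        total += survival(t + s) / s_t / (1.0 - v) ** 2
    return total / n

def optimal_rate(t, x, survival):
    """Feedback controller (29): u*(t, x) = x / a_bar_t."""
    return x / annuity(t, survival)

chen = lambda s: chen_survival(s, 1.0, 1.0)
print(annuity(0.0, chen), annuity(1.0, chen))
```

In the exponential case the routine should return a value close to 1/*λ* for every *t*, which is a convenient sanity check before using it with other shape parameters.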

**Remark 2.** *An actuarial interpretation of Theorem 1 is that the function A*(*t*) *agrees with the expectation of the so-called* contingent life annuity with 0% interest rate for a life aged (*t*) *displayed in* (30) *(see Reference [57] (Chapter 5.2)). This allows us to rewrite* (29) *as*

$$
x - u^\*(t, x) \cdot \bar{a}\_t = 0.\tag{40}
$$

*This expression is remarkably similar to the relations (6.2.3)–(6.2.4) in Reference [57] (see PII in Section 2.2, and References [57,58] (Chapters 6.2 and 7.2; sections on Premium Calculation for Nonlife Insurance)), which are used to establish the* equivalence premium *u*∗(*t*, *x*) *that is to be* continuously paid *to obtain a benefit of x. In the Actuarial Mathematics literature, we use this expression to state the existence of a balance between an* expected income *u*∗(*t*, *x*) · *ā<sub>t</sub> (that will be paid over a contingent horizon), and an* expected benefit *x (that will be received at a given moment of time). From this point of view, we might distinguish two parts within the Lebesgue-measurable optimal rate of extraction u*∗(*t*, *x*) *from* (29)*:*

**(i)** *the benefit that the agent will eventually get, x; and*

**(ii)** *the* intensity of the extraction *that the agent needs to apply to acquire u*∗(*t*, *x*)*, that is, A*(*t*) = *a*¯*t.*

*Keeping this in mind, we can state that the resulting instantaneous utility of obtaining x by continuously extracting u*∗(*t*, *x*) *with an intensity of a*¯*<sup>t</sup> is given by h*(*x*, *u*) *in* (28)*.*

*Since the intensity of the extraction appears in the denominator of* (29)*, we will dub* 1/*ā<sub>t</sub> the* ease of extraction*. Our goal is to emphasize the inverse proportionality of the optimal controller u*∗(*t*, *x*) *with respect to the* intensity of extraction *ā<sub>t</sub>.*

We obtain the following result as a by-product of Theorem 1.

**Theorem 2.** *If the utility function is of the form* (28)*, then the value function for the optimal control problem of maximizing* (6) *within* U *subject to* (1) *is given by*

$$W(x,t) = \bar{a}\_t \ln x - \int\_0^\infty (1 + \ln \bar{a}\_{t+s}) \frac{1 - F(t+s)}{1 - F(t)} \mathrm{d}s. \tag{41}$$

**Proof.** From (31) and (39), we readily know that

$$W(x, t) = \bar{a}\_t \ln x + B(t).$$

To find the function *B*(*t*), we write (36) as *Ḃ*(*t*) − *λ*(*t*)*B*(*t*) = 1 + ln *A*(*t*), multiply by the integrating factor (4), and integrate over [*t*, ∞) using the transversality condition (38). This gives

$$B(t) = -\frac{\int\_{t}^{\infty} \exp\left(-\int\_{0}^{s} \lambda(\tau) \mathrm{d}\tau\right) \left(1 + \ln A(s)\right) \mathrm{d}s}{\exp\left(-\int\_{0}^{t} \lambda(s) \mathrm{d}s\right)}$$

$$= -\frac{\int\_{t}^{\infty} \left(1 - F(s)\right) \left(1 + \ln \bar{a}\_{s}\right) \mathrm{d}s}{1 - F(t)}.\tag{42}$$

The last equality follows from (4) and (39). Substituting (42) into (31) gives (41). The fact that this function is actually the value of the optimal control problem of maximizing (6) subject to (1) follows from the verification Theorem I.7.1(b) in Reference [68]. This proves the result.
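A direct check of the value function is possible in the exponential case, where everything is explicit. If *δ* = 1, Theorem 1 gives *u*∗(*t*, *x*) = *λx* and *x*∗(*t*) = *x*0 e<sup>−*λt*</sup>, so the payoff (24) along the optimal path is the integral of (ln *λ* + ln *x*0 − *λt*) e<sup>−*λt*</sup>, which equals (ln *x*0)/*λ* + (ln *λ* − 1)/*λ*. The Python sketch below (hypothetical names, midpoint rule chosen here for simplicity) compares this closed form with a quadrature of the payoff.

```python
import math

def payoff_exponential(x0, lam, n=20000):
    """Midpoint approximation of int_0^inf ln(u*(t)) exp(-lam t) dt,
    where u*(t) = lam * x0 * exp(-lam t) is the optimal rate along x*(t)."""
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        t = v / (1.0 - v)  # maps (0, 1) onto the half-line
        log_rate = math.log(lam) + math.log(x0) - lam * t
        total += log_rate * math.exp(-lam * t) / (1.0 - v) ** 2
    return total / n

def closed_form(x0, lam):
    """ln(x0) / lam + (ln(lam) - 1) / lam."""
    return math.log(x0) / lam + (math.log(lam) - 1.0) / lam

print(payoff_exponential(5.0, 2.0), closed_form(5.0, 2.0))
```

Since *ā<sub>t</sub>* = 1/*λ* in this case, the printed number can be compared with *W*(*x*0, 0) from (41).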

### 3.1.1. Normal Mode (*δ*<sup>1</sup> = 1 = *δ*2)

As we already stated, for the case of the Weibull distribution, when *δ*<sup>1</sup> = 1, the random terminal time is exponentially distributed with mean 1/*λ*1. Then, by Theorem 1, the optimal strategy of the agent is *u*∗(*t*, *x*) = *λ*1*x*. We solve (23) and get that the optimal trajectory is

$$x^\*(\tau) = x\_0 \mathbf{e}^{-\lambda\_1 \tau}.$$

For the case of Chen's law, we let *δ*<sup>2</sup> = 1 and substitute (21) into (29) and (30) to get

$$u^\*(t, \mathbf{x}) = \frac{x \exp\left(-\lambda\_2 \mathbf{e}^t\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^s\right) \,\mathrm{d}s}.$$

We substitute this controller into (23) and solve the differential equation to obtain the optimal trajectory when the random terminal time is distributed according to Chen's law and *δ*<sup>2</sup> = 1. That is:

$$x^\*(\tau) = x\_0 \exp\left(-\int\_0^\tau \frac{\exp\left(-\lambda\_2 \mathbf{e}^{t}\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^{s}\right) \mathrm{d}s} \mathrm{d}t\right).$$

Figures 4–6 summarize these results when *λ*<sup>1</sup> = 1 = *λ*2. Since *λ*<sup>1</sup> and *λ*<sup>2</sup> are scale parameters of the distributions under study, this choice gives a sufficiently general idea of our developments: if we select other values for these parameters, we will end up with constant multiples of the random variables *X*<sup>1</sup> and *X*<sup>2</sup> analyzed in this study (see Reference [60] (Section 4.2.1)).

**Figure 4.** Optimal trajectories for the laws of Weibull (continuous-red) and Chen (dotted-blue) when *δ*<sup>1</sup> = 1 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

**Figure 5.** Optimal Weibull (red) and Chen (blue) controllers when *δ*<sup>1</sup> = 1 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*<sup>2</sup> for (*t*, *x*) ∈ [0, 1] × [0, 10].

**Figure 6.** Ease of the extraction for the laws of Weibull (straight-red) and Chen (decreasing-blue) when *δ*<sup>1</sup> = 1 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

We might give a plausible interpretation of Figures 5 and 6 in the direction of Remark 2. The optimal extraction rates are linear in the state variable and, as we established in Remark 2(ii), the larger the value of *A*(*t*) = *ā<sub>t</sub>* is, the easier it is for the extractor to obtain *u*∗(*t*, *x*). In this sense, it is straightforward that, for Chen's law, as time goes by, it becomes harder to extract at a rate of *u*∗(*t*, *x*). On the other hand, it is easy to see that, under Weibull's law with *λ*<sup>1</sup> = 1, *A*(*t*) ≡ 1 (see References [12,57,61] (Example 5.2.1; p. 323; Chapter 8.10.1)). This implies that, in the normal mode, the agent whose random terminal time follows the Weibull distribution is indifferent to the moment of time when the extraction task takes place, and should only look at the remaining amount of the resource.

As for the optimal trajectories in Figure 4, observe that the system exploited by an agent affected by Chen's law becomes exhausted much faster than a system exploited by a Weibull extractor. This is consistent with the fact that, according to Figure 5, a Chen extractor would be more intense in its exploitation tasks.
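The comparison of trajectories can be reproduced without solving (23) step by step. Because 1/*ā<sub>t</sub>* is minus the logarithmic derivative of the tail integral of 1 − *F* over [*t*, ∞), the closed-loop solution of (23) under (29) is *x*∗(*τ*) = *x*0 times the ratio between the tail integral at *τ* and the tail integral at 0. The Python sketch below uses this identity; the function names and the quadrature are assumptions of this illustration.

```python
import math

def weibull_survival(t, lam, delta):
    """1 - F_1(t) for the Weibull law (19)."""
    return math.exp(-lam * t ** delta)

def chen_survival(t, lam, delta):
    """1 - F_2(t) for the Chen law (21), guarded against overflow."""
    p = t ** delta
    if p > 50.0:
        return 0.0
    return math.exp(lam * (1.0 - math.exp(p)))

def tail_integral(t, survival, n=20000):
    """Midpoint approximation of int_t^inf S(s) ds, with s = t + v / (1 - v)."""
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        total += survival(t + v / (1.0 - v)) / (1.0 - v) ** 2
    return total / n

def optimal_trajectory(tau, x0, survival):
    """x*(tau) = x0 * int_tau^inf S / int_0^inf S."""
    return x0 * tail_integral(tau, survival) / tail_integral(0.0, survival)

weibull = lambda s: weibull_survival(s, 1.0, 1.0)
chen = lambda s: chen_survival(s, 1.0, 1.0)
for tau in (0.25, 0.5, 1.0):
    print(tau, optimal_trajectory(tau, 10.0, weibull), optimal_trajectory(tau, 10.0, chen))
```

With *λ*<sup>1</sup> = 1 = *λ*2 and *δ*<sup>1</sup> = 1 = *δ*2, the Chen column should fall below the Weibull column, in line with the faster exhaustion described above.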

### 3.1.2. Aging Mode (*δ*<sup>1</sup> = 2 = *δ*2)

We stated before that for *δ*<sup>1</sup> = 2 = *δ*2, Weibull's law coincides with Rayleigh distribution. In this case, (20) gives that *λ*1(*t*) = 2*λ*1*t*; then from Theorem 1 we get

$$u^\*(t, \mathbf{x}) \quad = \quad \frac{\mathbf{x}e^{-\lambda\_1 t^2}}{\int\_t^{\infty} \mathbf{e}^{-\lambda\_1 s^2} \mathbf{d}s}.$$

For Chen's distribution, when we have an aging system, by (22), the failure rate function is *λ*2(*t*) = 2*λ*2*t* e<sup>*t*<sup>2</sup></sup>, and Theorem 1 gives that the optimal control is

$$u^\*(t, x) = \frac{x \exp\left(-\lambda\_2 \mathbf{e}^{t^2}\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^{s^2}\right) \mathrm{d}s}.$$

We solve (23) and get that the optimal trajectories under each law are:

$$x^\*(\tau) = \begin{cases} x\_0 \exp\left(-\int\_0^\tau \frac{\mathbf{e}^{-\lambda\_1 t^2}}{\int\_t^\infty \mathbf{e}^{-\lambda\_1 s^2} \mathrm{d}s} \mathrm{d}t\right) & \text{for Weibull's law with} \quad \delta\_1 = 2, \\ x\_0 \exp\left(-\int\_0^\tau \frac{\exp\left(-\lambda\_2 \mathbf{e}^{t^2}\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^{s^2}\right) \mathrm{d}s} \mathrm{d}t\right) & \text{for Chen's law with} \quad \delta\_2 = 2. \end{cases}$$

Figures 7–9 summarize these results when *λ*<sup>1</sup> = 1 = *λ*2. There, we can see that, in the aging mode, as time passes by, the extraction tasks need to be more intense for both assumptions.

**Figure 7.** Optimal trajectories for the laws of Weibull (continuous-red) and Chen (dotted-blue) when *δ*<sup>1</sup> = 2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

**Figure 8.** Optimal Weibull (red) and Chen (blue) controllers when *δ*<sup>1</sup> = 2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*<sup>2</sup> for (*t*, *x*) ∈ [0, 1] × [0, 10].

However, for the Chen extractor, the situation deteriorates very rapidly, and hence, it needs to speed up the pace of its tasks. It readily follows from (29) that both controllers are linear in the state. Again, Figure 7 shows how the stock becomes exhausted for each case.
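For the Rayleigh case the annuity has a closed form in terms of the complementary error function, since the integral of e<sup>−*λs*²</sup> over [*t*, ∞) equals (1/2)√(π/*λ*) erfc(√*λ* *t*). The Python sketch below evaluates *ā<sub>t</sub>* and the controller (29) with it; the function names are hypothetical, and the formula is only reliable for moderate *t*, because both numerator and denominator underflow to zero for large arguments.

```python
import math

def rayleigh_annuity(t, lam):
    """a_bar_t for the Weibull law with delta_1 = 2.

    Uses int_t^inf exp(-lam s^2) ds = 0.5 * sqrt(pi / lam) * erfc(sqrt(lam) * t).
    """
    tail = 0.5 * math.sqrt(math.pi / lam) * math.erfc(math.sqrt(lam) * t)
    return tail / math.exp(-lam * t * t)

def rayleigh_rate(t, x, lam):
    """Controller (29) in the aging mode of Weibull's law."""
    return x / rayleigh_annuity(t, lam)

for t in (0.0, 0.5, 1.0):
    print(t, rayleigh_annuity(t, 1.0), rayleigh_rate(t, 10.0, 1.0))
```

The annuity decreases with *t*, so the extraction rate per unit of stock grows over time, which matches the behavior reported for the aging mode.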

**Figure 9.** Ease of the extraction for Weibull (red) and Chen (blue) laws when *δ*<sup>1</sup> = 2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

### 3.1.3. Early Period (*δ*<sup>1</sup> = 1/2 = *δ*2)

For the Weibull case, (20) yields that the hazard rate function is *λ*1(*t*) = *λ*1/(2√*t*). A substitution in (29) gives us

$$u^\*(t,x) = \frac{x\mathbf{e}^{-\lambda\_1\sqrt{t}}}{\int\_t^\infty \mathbf{e}^{-\lambda\_1\sqrt{s}}\mathrm{d}s}.$$

For Chen's law, the pre-run system has a hazard rate function of the form *λ*2(*t*) = (*λ*2/(2√*t*)) e<sup>√*t*</sup>. The corresponding optimal control has the form

$$u^\*(t, x) = \frac{x \exp\left(-\lambda\_2 \mathbf{e}^{\sqrt{t}}\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^{\sqrt{s}}\right) \mathrm{d}s}.$$

We solve (23) and get that the optimal trajectories under each law are:

$$x^\*(\tau) = \begin{cases} x\_0 \exp\left(-\int\_0^\tau \frac{\lambda\_1^2}{2(\lambda\_1\sqrt{t} + 1)} \mathrm{d}t\right) & \text{for Weibull's law with} \quad \delta\_1 = \frac{1}{2}, \\ x\_0 \exp\left(-\int\_0^\tau \frac{\exp\left(-\lambda\_2 \mathbf{e}^{\sqrt{t}}\right)}{\int\_t^\infty \exp\left(-\lambda\_2 \mathbf{e}^{\sqrt{s}}\right) \mathrm{d}s} \mathrm{d}t\right) & \text{for Chen's law with} \quad \delta\_2 = \frac{1}{2}. \end{cases}$$

Figures 10–12 summarize these results when *λ*<sup>1</sup> = 1 = *λ*2. It is of particular interest that, according to Figure 12, for a Chen extractor, we have almost the same situation displayed in Figure 6 for the Weibull extractor. However, in this case, the intensity function *A*(*t*) = *ā<sub>t</sub>* converges to a smaller value than the one displayed in that part of the illustration. This means that, during the early mode of the system, an agent whose random terminal time follows the Chen distribution need only mind the remaining stock of the resource.

**Figure 10.** Optimal trajectories for laws of Weibull (continuous-red) and Chen (dotted-blue) when *δ*<sup>1</sup> = 1/2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

**Figure 11.** Optimal Weibull (red) and Chen (blue) controllers when *δ*<sup>1</sup> = 1/2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*<sup>2</sup> for (*t*, *x*) ∈ [0, 1] × [0, 10].

**Figure 12.** Ease of the extraction for Weibull (red) and Chen (blue) laws when *δ*<sup>1</sup> = 1/2 = *δ*<sup>2</sup> and *λ*<sup>1</sup> = 1 = *λ*2.

For a Weibull extractor at the pre-run phase, as time goes by, the exploitation becomes less intense, and, in spite of the fact that the reserve will be consumed at an exponential rate, it will last longer than under the Chen assumption (see Figure 10). An economic interpretation of this fact is that the optimal rules of behavior demand that the Chen extractor be more intense in its labors even from the early period, while they allow the Weibull extractor to be less intense (see Figure 11).
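In the early period the Weibull case is fully explicit. The substitution *s* = *w*² gives *ā<sub>t</sub>* = 2(*λ*1√*t* + 1)/*λ*1², and integrating the resulting rate yields *x*∗(*τ*) = *x*0 (1 + *λ*1√*τ*) e<sup>−*λ*1√*τ*</sup>. The Python sketch below (hypothetical function names) evaluates both closed forms and cross-checks the annuity against a midpoint quadrature.

```python
import math

def early_weibull_annuity(t, lam):
    """Closed form of a_bar_t for delta_1 = 1/2: 2 (lam sqrt(t) + 1) / lam^2."""
    return 2.0 * (lam * math.sqrt(t) + 1.0) / lam ** 2

def early_weibull_trajectory(tau, x0, lam):
    """x*(tau) = x0 (1 + lam sqrt(tau)) exp(-lam sqrt(tau))."""
    r = lam * math.sqrt(tau)
    return x0 * (1.0 + r) * math.exp(-r)

def early_weibull_annuity_numeric(t, lam, n=20000):
    """Midpoint check of a_bar_t after the substitution s = w**2."""
    root = math.sqrt(t)
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        w = root + v / (1.0 - v)  # maps (0, 1) onto (sqrt(t), inf)
        total += 2.0 * w * math.exp(-lam * (w - root)) / (1.0 - v) ** 2
    return total / n

print(early_weibull_annuity(1.0, 1.0), early_weibull_annuity_numeric(1.0, 1.0))
print(early_weibull_trajectory(1.0, 5.0, 1.0))
```

The annuity grows like √*t*, so the rate per unit of stock, *λ*1²/(2(*λ*1√*t* + 1)), decreases with time; this is the sense in which the Weibull extractor becomes less intense in the pre-run phase.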

### *3.2. Game Theoretic Model for the Extraction of Natural Resources*

Now we consider the nondegenerate game model from Section 2. For that purpose, we will make extensive use of the results presented in References [1,50] and [57] (Chapter 9.2). Note that the utility function of each agent will explicitly depend only on its own control, that is, on the extraction rate applied by the agent, and on the stock of the resource at time *t* ≥ 0; there are no payoff transfers among the players.

Theorem 3.1 in Reference [1] allows us to state the HJBI equations associated with the optimization problem for the *i*-th player. Namely,

$$\begin{aligned} &-\frac{\partial}{\partial t}\mathcal{W}\_{i}(\mathbf{x},t) + (\lambda\_{1}(t) + \lambda\_{2}(t))\mathcal{W}\_{i}(\mathbf{x},t) \\ &= \max\_{u\_{i} \in \Pi^{i}} \left( h\_{i}(\mathbf{x},u\_{1},u\_{2}) + \Phi\_{i}(\mathbf{x})\lambda\_{j}(t) - \frac{\partial}{\partial \mathbf{x}}\mathcal{W}\_{i}(\mathbf{x},t)(u\_{1} + u\_{2}) \right), \end{aligned} \tag{43}$$

$$\mathcal{W}\_{i}(\mathbf{x},t) \to 0 \text{ as } t \to \infty,\tag{44}$$

for *i* = 1, 2 and *j* ≠ *i*.

We will find explicit solutions for (43) when the functions *hi* are analogous to (28) for *i* = 1, 2. That is, when

$$h\_i(\mathbf{x}, u\_1(t, \mathbf{x}), u\_2(t, \mathbf{x})) = \ln(u\_i(t, \mathbf{x})).\tag{45}$$

We also suppose that

$$\Phi\_i(\mathbf{x}(t \wedge \tau)) = c\_i \ln(\mathbf{x}(t \wedge \tau)) = c\_i \ln(\mathbf{x}) \chi\_{\tau \le t}, \tag{46}$$

for some positive constant values *ci* and *i* = 1, 2. In this case, the HJBI Equation (43) turns out to be

$$\begin{aligned} & -\frac{\partial}{\partial t} \mathcal{W}\_{i}(\mathbf{x}, t) + (\lambda\_1(t) + \lambda\_2(t)) \mathcal{W}\_{i}(\mathbf{x}, t) \\ &= \max\_{u\_{i} \in \Pi^{i}} \left( \ln(u\_{i}) + c\_{i} \ln(\mathbf{x}) \lambda\_{-i}(t) - \frac{\partial}{\partial \mathbf{x}} \mathcal{W}\_{i}(\mathbf{x}, t) (u\_1 + u\_2) \right), \end{aligned}$$

for *i* = 1, 2. Here,

$$
\lambda\_{-i}(\cdot) := \begin{cases}
\lambda\_2(\cdot) & \text{if } i = 1, \\
\lambda\_1(\cdot) & \text{if } i = 2.
\end{cases}
$$

In what follows, we find the optimal strategies and the value function for this problem by proceeding in the same way that led us to (29) and (41). The next result is similar to Reference [1] (Proposition 4.1).

**Theorem 3.** *If the utility functions are of the form* (28)*, and the terminal payoff functions are given by* (46)*, then the optimal strategies for the game* Γ(*x*0) *are Lebesgue-measurable, and are given by:*

$$u\_i^\*\left(t, x\right) = \frac{x}{\bar{a}\_{[t]\_1:[t]\_2} + c\_i \bar{A}^{1}\_{[t]\_i:[t]\_{-i}}},\tag{47}$$

*for i* = 1, 2*, where*

$$\bar{a}\_{[t]\_1 : [t]\_2} := \int\_0^\infty \frac{1 - F(t + s)}{1 - F(t)} \mathrm{d}s,\tag{48}$$

*and*

$$\bar{A}^{1}\_{[t]\_i : [t]\_{-i}} := \int\_0^\infty \frac{1 - F(t+s)}{1 - F(t)} \lambda\_{-i}(s+t) \, \mathrm{d}s.\tag{49}$$

*Here, F*(*t*) *is as in* (5)*.*

**Proof.** The substitution of the informed guesses

$$\mathcal{W}\_{i}(\mathbf{x},t) = A\_{i}(t)\ln\mathbf{x} + B\_{i}(t),$$

for *i* = 1, 2; into (43) and (44) yields:

• that the maximizers of (43) are of the form

$$
u\_i^\*(t, x) = \frac{x}{A\_i(t)}.\tag{50}
$$

The fact that these controllers are Lebesgue-measurable follows from Hypothesis 1(b), along with the so-named measurable selection theorems (see, for instance, References [69–71] (Theorem 12.1; Proposition D5(a); Theorem 3.4)).

• the following Cauchy problem (which is analogous to (35)–(38)):

$$-\dot{A}\_i(t) + A\_i(t)(\lambda\_i(t) + \lambda\_{-i}(t)) \quad = \quad 1 + c\_i \lambda\_{-i}(t),\tag{51}$$

$$\dot{B}\_i(t) - B\_i(t)(\lambda\_i(t) + \lambda\_{-i}(t)) \quad = \quad \ln A\_i(t) + 1 + \frac{A\_i(t)}{A\_{-i}(t)}, \tag{52}$$

$$\lim\_{t \to \infty} A\_i(t) \quad = \quad 0,\tag{53}$$

$$\lim\_{t \to \infty} B\_i(t) \quad = \quad 0. \tag{54}$$

We apply the technique of the integrating factor in (51); use the transversality condition (53), and get

$$\begin{split} A\_{i}(t) &= \frac{\int\_{t}^{\infty} (1 + c\_{i}\lambda\_{-i}(\tau)) \exp\left(-\int\_{0}^{\tau} \lambda\_{i}(s) + \lambda\_{-i}(s) \mathrm{d}s\right) d\tau}{\exp\left(-\int\_{0}^{t} \lambda\_{i}(\tau) + \lambda\_{-i}(\tau) \mathrm{d}\tau\right)} \\ &= \frac{\int\_{t}^{\infty} (1 + c\_{i}\lambda\_{-i}(\tau))(1 - F(\tau)) \mathrm{d}\tau}{1 - F(t)}. \end{split}$$

The last equality holds by virtue of (5).

We now use (48) and (49) and observe that

$$A\_{i}(t) = \bar{a}\_{[t]\_{1}:[t]\_{2}} + c\_{i}\bar{A}^{1}\_{[t]\_{i}:[t]\_{-i}}.\tag{55}$$

The substitution of this expression in (50) yields (47).

By Theorem I.7.1(a) in Reference [68] (see also Reference [52] (Theorem 2(i))), we know that (47) is optimal for the game with no explicit payoff transfer among the players. This proves the result.

**Remark 3.** *Just as we interpreted* (39) *in Remark 2 as a continuous life annuity, we can do the same with* (55)*. Indeed, the integrand of* (49)*,*

$$\frac{1 - F(t+s)}{1 - F(t)}\lambda\_{-i}(s+t),$$

*can be thought of as the probability that* both *agents survive up to moment t, and the* (−*i*)*-th agent fails at moment t* + *s. (See References [46,57] (Chapter 9.9)).*

*Keeping this in mind we can propose an extension of* (40) *and write* (47) *as:*

$$
x - u\_i^\*(t, x) \cdot \left( \bar{a}\_{[t]\_1 : [t]\_2} + c\_i \bar{A}^{1}\_{[t]\_i : [t]\_{-i}} \right) = 0. \tag{56}
$$

*From an actuarial perspective, this expression means that, for the i-th extractor, there is a balance between the* eventual benefit *x, and the continuous rate of extraction u<sub>i</sub>*∗(*t*, *x*)*, which includes a* final *payment (of size c<sub>i</sub>u<sub>i</sub>*∗(*t*, *x*)*) that covers the possibility that the other player fails and leaves the game.*

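*This means that the optimal rate of extraction of the i-th player in* (47) *can be viewed, as in Remark 2, as the composition of two parts: the eventual benefit x, and the intensity of the extraction A<sub>i</sub>*(*t*) *displayed in* (55)*.*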


*The resulting utility of this exercise for the i-th player is given by* (45)*.*

As a by-product of Theorem 3, we can state and prove the following result.

**Theorem 4.** *If the utility functions are of the form* (28)*, and the terminal payoff functions are given by* (46)*, then the value functions for game* Γ(*x*0) *are given by*

$$\begin{split} \mathcal{W}\_{i}(\mathbf{x},t) &= \left(\bar{a}\_{[t]\_{1}:[t]\_{2}} + c\_{i}\bar{A}^{1}\_{[t]\_{i}:[t]\_{-i}}\right) \ln \mathbf{x} \\ &- \int\_{0}^{\infty} \left(1 + \ln\left(\bar{a}\_{[t]\_{1}+s:[t]\_{2}+s} + c\_{i}\bar{A}^{1}\_{[t]\_{i}+s:[t]\_{-i}+s}\right) + \frac{\bar{a}\_{[t]\_{1}+s:[t]\_{2}+s} + c\_{i}\bar{A}^{1}\_{[t]\_{i}+s:[t]\_{-i}+s}}{\bar{a}\_{[t]\_{1}+s:[t]\_{2}+s} + c\_{-i}\bar{A}^{1}\_{[t]\_{-i}+s:[t]\_{i}+s}}\right) \frac{1-F(s+t)}{1-F(t)}\mathrm{d}s, \end{split}$$

*for i* = 1, 2*.*

**Proof.** From Theorem 3, we readily know that

$$\mathcal{W}\_{i}(\mathbf{x},t) = \left(\bar{a}\_{[t]\_{1}:[t]\_{2}} + c\_{i}\bar{A}^{1}\_{[t]\_{i}:[t]\_{-i}}\right) \ln \mathbf{x} + B\_{i}(t),$$

for *i* = 1, 2. To find the functions *Bi*(*t*), we apply the technique of the integrating factor to (52), integrate over [*t*, ∞), and use the transversality condition (54). This gives:

$$\begin{aligned} B\_i(t) &= -\frac{\int\_t^\infty \left( \ln A\_i(s) + 1 + \frac{A\_i(s)}{A\_{-i}(s)} \right) \exp\left( -\int\_0^s (\lambda\_i(\tau) + \lambda\_{-i}(\tau)) \mathrm{d}\tau \right) \mathrm{d}s}{\exp\left( -\int\_0^t (\lambda\_i(s) + \lambda\_{-i}(s)) \mathrm{d}s \right)} \\ &= -\frac{\int\_t^\infty \left( \ln A\_i(s) + 1 + \frac{A\_i(s)}{A\_{-i}(s)} \right) (1 - F(s)) \mathrm{d}s}{1 - F(t)}, \end{aligned}$$

where *Ai*(·) is as in (55).

The optimality of the functions *W*<sup>1</sup> and *W*<sup>2</sup> follows from Reference [52] (Theorem 2(ii)). This completes the proof.

The optimal trajectory can be found by plugging (47) into (1) and solving. That is,

$$x^\*(\tau) = x\_0 \exp\left(-\sum\_{i=1}^2 \int\_0^\tau \frac{\exp\left(-\int\_0^t (\lambda\_1(s) + \lambda\_2(s)) \mathrm{d}s\right)}{\int\_t^\infty (1 + c\_i \lambda\_{-i}(s)) \exp\left(-\int\_0^s (\lambda\_1(r) + \lambda\_2(r)) \mathrm{d}r\right) \mathrm{d}s} \mathrm{d}t\right). \tag{57}$$

The relation (57) can also be compactly written as

$$x^\*(\tau) = x\_0 \exp \left( -\int\_0^\tau \left( \frac{1}{\bar{a}\_{[t]\_1:[t]\_2} + c\_1 \bar{A}^{1}\_{[t]\_1:[t]\_2}} + \frac{1}{\bar{a}\_{[t]\_1:[t]\_2} + c\_2 \bar{A}^{1}\_{[t]\_2:[t]\_1}} \right) \mathrm{d}t \right),$$

thus showing the interaction between the players and their effect on the system.

### *3.3. An Illustration*

We devote this Section to the analysis of the particular cases of our interest.

If the random terminal time of player 1 has a Weibull distribution, and that of player 2 follows Chen's law, the optimal extraction rates are:

$$u\_1^\*(t, \mathbf{x}) = \frac{\mathbf{x} \exp\left(-\int\_0^t (\lambda\_1 \delta\_1 \mathbf{s}^{\delta\_1 - 1} + \lambda\_2 \delta\_2 \mathbf{s}^{\delta\_2 - 1} \mathbf{e}^{s^{\delta\_2}}) \mathbf{ds}\right)}{\int\_t^\infty (1 + c\_1 \lambda\_2 \delta\_2 \mathbf{s}^{\delta\_2 - 1} \mathbf{e}^{s^{\delta\_2}}) \exp\left(-\int\_0^s (\lambda\_1 \delta\_1 r^{\delta\_1 - 1} + \lambda\_2 \delta\_2 r^{\delta\_2 - 1} \mathbf{e}^{r^{\delta\_2}}) \mathbf{d}r\right) \mathbf{ds}}.$$

and

$$u\_2^\*(t, \mathbf{x}) = \frac{\mathbf{x} \exp\left(-\int\_0^t (\lambda\_1 \delta\_1 \mathbf{s}^{\delta\_1 - 1} + \lambda\_2 \delta\_2 \mathbf{s}^{\delta\_2 - 1} \mathbf{e}^{\mathbf{s}^{\delta\_2}}) \mathbf{ds}\right)}{\int\_t^\infty (1 + c\_2 \lambda\_1 \delta\_1 \mathbf{s}^{\delta\_1 - 1}) \exp\left(-\int\_0^s (\lambda\_1 \delta\_1 r^{\delta\_1 - 1} + \lambda\_2 \delta\_2 r^{\delta\_2 - 1} \mathbf{e}^{r^{\delta\_2}}) \mathbf{d} r\right) \mathbf{ds}}.$$
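The two rates above are *x*/*A*1(*t*) and *x*/*A*2(*t*), with *A<sub>i</sub>*(*t*) equal to the integral over [*t*, ∞) of (1 + *c<sub>i</sub>λ*<sub>−*i*</sub>(*s*)) times the joint survival, divided by the joint survival at *t*. The Python sketch below approximates these intensities; it is a minimal illustration in which the function names, the midpoint quadrature, the overflow guard and the values chosen for *c*1 and *c*2 are assumptions (the constants *c<sub>i</sub>* are not fixed in the text). For shape parameters below 1 the failure rates have an integrable singularity at 0, so the quadrature converges slowly near *t* = 0.

```python
import math

def joint_survival(s, lam1, d1, lam2, d2):
    """(1 - F_1(s)) (1 - F_2(s)): Weibull for player 1, Chen for player 2."""
    p = s ** d2
    if p > 50.0:  # the Chen survival is numerically zero well before this point
        return 0.0
    return math.exp(-lam1 * s ** d1 + lam2 * (1.0 - math.exp(p)))

def other_hazard(i, s, lam1, d1, lam2, d2):
    """lambda_{-i}(s): Chen's rate (22) if i = 1, Weibull's rate (20) if i = 2."""
    if i == 1:
        return lam2 * d2 * s ** (d2 - 1.0) * math.exp(s ** d2)
    return lam1 * d1 * s ** (d1 - 1.0)

def intensity(i, t, c_i, lam1, d1, lam2, d2, n=20000):
    """Midpoint approximation of A_i(t), using s = t + v / (1 - v)."""
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        s = t + v / (1.0 - v)
        surv = joint_survival(s, lam1, d1, lam2, d2)
        if surv > 0.0:  # also keeps exp(s**d2) inside the floating-point range
            rate = other_hazard(i, s, lam1, d1, lam2, d2)
            total += (1.0 + c_i * rate) * surv / (1.0 - v) ** 2
    return total / n / joint_survival(t, lam1, d1, lam2, d2)

# Hypothetical constants c_1 = c_2 = 1, with lambda_1 = 1 = lambda_2.
x = 5.0
for t in (0.25, 0.5, 1.0):
    a1 = intensity(1, t, 1.0, 1.0, 2.0, 1.0, 1.0)
    a2 = intensity(2, t, 1.0, 1.0, 2.0, 1.0, 1.0)
    print(t, x / a1, x / a2)
```

The extraction rates are then obtained by dividing the current stock by the corresponding intensity, as in (50).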

If we fix the initial stock at *x*<sup>0</sup> = 5, Figures 13–18 will show us a graphical depiction of the involved intensities in the process. From these images, we can notice that *almost all the time* the extraction rates of the player whose random terminal time follows the Chen distribution are higher than those of the other player. This fact is also consistent with the plots of Figures 4–6 and 10–12, which were exhibited in the illustration of Section 3.1.

**Figure 13.** Ease of the extraction when the shape parameter of Weibull distribution is *δ*<sup>1</sup> = 1/2. (**a**) Ease of the extraction for Weibull's player when *δ*<sup>1</sup> = 1/2. (**b**) Ease of the extraction for Chen's player when *δ*<sup>2</sup> = 1/2.

To have a proper interpretation of the results presented in Figure 13, recall Figures 10–12. Observe that the shapes of the plots in Figure 13a,b resemble those in Figure 12. It should be noted that the scales are much larger in the degenerate case. The reason is that, in this scenario, there are two actors making decisions and consuming the resource. Moreover, each of the agents needs to take into consideration the action of its counterpart (recall Theorem 3). As in the one-player case, it should be noticed that, in spite of the fact that the plots in Figure 13a,b have opposite trends, the optimal behavior of the Chen extractor is to obtain the resource at a much more rapid pace than that of the Weibull extractor.

**Figure 14.** Ease of the extraction when the shape parameters are *δ*<sup>1</sup> = 1/2 and *δ*<sup>2</sup> = 1 for Weibull and Chen distributions, respectively. (**a**) Ease of the extraction for Weibull's player when *δ*<sup>1</sup> = 1/2. (**b**) Ease of the extraction for Chen's player when *δ*<sup>2</sup> = 1.

In view of Figure 14, we must recall Figures 4–6 and 10–12. Again, for the intensity function of the Chen extractor, *A*2(·), we see the same general trend as that in Figure 6. However, for the Weibull case, the trend observed in Figure 12 becomes perturbed by the presence of the other agent in a more advanced mode of the extraction.

**Figure 15.** Ease of the extraction when the shape parameters are *δ*<sup>1</sup> = 1/2 and *δ*<sup>2</sup> = 2 for Weibull and Chen distributions, respectively. (**a**) Ease of the extraction for Weibull's player when *δ*<sup>1</sup> = 1/2. (**b**) Ease of the extraction for Chen's player when *δ*<sup>2</sup> = 2.

Figure 15 can be interpreted by means of Figures 7–12. Note that from the comparison of Figure 15a with Figure 12 it is very clear how the presence of the Chen extractor affects the ease of the extraction of Weibull's player.

For the case *δ*<sup>1</sup> = 1, we present the following result.

**Proposition 2.** *If δ*<sup>1</sup> = 1*, then*

$$A\_1(t) = c\_1 + (1 - c\_1 \lambda\_1) \frac{\mathbf{e}^t}{1 - F\_2(t)} \left(\frac{1}{\lambda\_1} - \mathcal{L}(F\_2)\right) \lambda\_1$$

*where* <sup>L</sup>(*F*2) :<sup>=</sup> <sup>∞</sup> <sup>0</sup> <sup>e</sup>−*λ*1*τF*2(*τ*)d*<sup>τ</sup> stands for the Laplace transform of the distribution function F*2*.*

**Proof.** By (55), we readily know that

$$\mathcal{A}\_1(t) = \bar{a}\_{[t]\_1;[t]\_2} + c\_1 \bar{\mathcal{A}}\_{[t]\_1[t]\_2} \, ^1 \tag{58}$$

Now, by (5) and (48) (with *v* ≡ 1) we can write

$$\begin{array}{rcl} \mathbf{a}\_{[t]\_1;[t]\_2} &=& \int\_0^\infty \frac{1 - F\_1(t + \tau)}{1 - F\_1(t)} \frac{1 - F\_2(t + \tau)}{1 - F\_2(t)} \mathbf{d}\tau \\ &=& \int\_0^\infty \mathbf{e}^{-\lambda\_1 \tau} \frac{1 - F\_2(t + \tau)}{1 - F\_2(t)} \mathbf{d}\tau. \end{array}$$

The last equality follows from plugging *δ*<sup>1</sup> = 1 into Weibull's distribution. Define *s*2(*τ*) := 1−*F*2(*t*+*τ*) <sup>1</sup>−*F*2(*t*) as the conditional *survival* of the random variable [*<sup>T</sup>* <sup>−</sup> *<sup>t</sup>*|*<sup>T</sup>* <sup>&</sup>gt; *<sup>t</sup>*] and write

$$\mathbb{d}\_{[t]\_1 \colon [t]\_2} = \int\_0^\infty \mathbf{e}^{-\lambda\_1 \tau} \mathbf{s}\_2(\tau) \mathbf{d}\tau. \tag{59}$$

We now use (5) and (49) (with *v* ≡ 1) to write

$$\begin{array}{rcl} \bar{A}\_{[t]\_1 \cdot [t]\_2} &=& \int\_0^\infty \frac{1 - F\_1(t + \tau)}{1 - F\_1(t)} \frac{1 - F\_2(t + \tau)}{1 - F\_2(t)} \lambda\_2(t + \tau) d\tau \\ &=& \int\_0^\infty \mathbf{e}^{-\lambda\_1 \tau} \frac{1 - F\_2(t + \tau)}{1 - F\_2(t)} \lambda\_2(t + \tau) d\tau. \end{array}$$

The last equality follows from the substitution of *δ*<sup>1</sup> = 1. Now, by the results in Reference [57] (Chapter 3.2.4) we can argue that

$$\bar{A}\_{[t]\_1 \cdot [t]\_2} ^1 = -\int\_0^\infty \mathbf{e}^{-\lambda \tau} s\_2'(\tau) \mathbf{d} \tau,\tag{60}$$

where $s_2'$ stands for the derivative of $s_2$. Plugging (59) and (60) into (58) yields

$$\begin{array}{rcl} A_1(t) &=& \int_0^\infty \mathrm{e}^{-\lambda_1 \tau} (s_2(\tau) - c_1 s_2'(\tau))\, \mathrm{d}\tau \\ &=& \mathcal{L}(s_2 - c_1 s_2') \\ &=& \mathcal{L}(s_2) - c_1 \mathcal{L}(s_2') \\ &=& \mathcal{L}(s_2) - c_1 (\lambda_1 \mathcal{L}(s_2) - s_2(0)). \end{array} \tag{61}$$

The last equality follows from the rule for the Laplace transform of derivatives (see Reference [72] (Theorem 6.2.1)). Noting that $s_2(0) = 1$ and rearranging the terms in (61) gives us

$$\begin{array}{rcl} A_1(t) &=& c_1 + (1 - c_1 \lambda_1) \mathcal{L}(s_2) \\ &=& c_1 + (1 - c_1 \lambda_1) \int_0^\infty \mathrm{e}^{-\lambda_1 \tau} \frac{1 - F_2(t + \tau)}{1 - F_2(t)}\, \mathrm{d}\tau \\ &=& c_1 + (1 - c_1 \lambda_1) \frac{\mathrm{e}^{t}}{1 - F_2(t)} \int_0^\infty \mathrm{e}^{-\lambda_1 \tau} (1 - F_2(\tau))\, \mathrm{d}\tau \\ &=& c_1 + (1 - c_1 \lambda_1) \frac{\mathrm{e}^{t}}{1 - F_2(t)} \left(\frac{1}{\lambda_1} - \mathcal{L}(F_2)\right). \end{array}$$

This proves the result.
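The step from the contingent insurance integral to (60), and the derivative rule used in (61), can be spelled out as follows. This is only a sketch: it assumes that $F_2$ has a density, denoted here by $f_2$ (our notation), and that the hazard rate is $\lambda_2 = f_2/(1 - F_2)$, as is standard.

$$\begin{array}{rcl} s_2'(\tau) &=& -\dfrac{f_2(t + \tau)}{1 - F_2(t)} \;=\; -\dfrac{1 - F_2(t + \tau)}{1 - F_2(t)}\, \lambda_2(t + \tau) \;=\; -s_2(\tau)\, \lambda_2(t + \tau), \\[2ex] \mathcal{L}(s_2') &=& \displaystyle\int_0^\infty \mathrm{e}^{-\lambda_1 \tau} s_2'(\tau)\, \mathrm{d}\tau \;=\; \Big[\mathrm{e}^{-\lambda_1 \tau} s_2(\tau)\Big]_0^\infty + \lambda_1 \int_0^\infty \mathrm{e}^{-\lambda_1 \tau} s_2(\tau)\, \mathrm{d}\tau \;=\; \lambda_1 \mathcal{L}(s_2) - s_2(0). \end{array}$$

The first line shows that the integrand $\mathrm{e}^{-\lambda_1 \tau} s_2(\tau) \lambda_2(t + \tau)$ equals $-\mathrm{e}^{-\lambda_1 \tau} s_2'(\tau)$. The second line is an integration by parts, where the boundary term at infinity vanishes because $s_2$ is bounded and $\lambda_1 > 0$.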

Proposition 2 implies that, regardless of the distribution of $T_2$ (or of its shape parameter in the case of Chen's law), when $T_1$ follows the exponential distribution and $c_1 \lambda_1 = 1$, the factor $1 - c_1 \lambda_1$ vanishes, so that $A_1(t) \equiv c_1$ and the optimal rate of extraction is invariant.
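The identity $A_1(t) = c_1 + (1 - c_1 \lambda_1)\mathcal{L}(s_2)$ obtained from (61), and the invariance that follows when $c_1 \lambda_1 = 1$, can be checked numerically. The sketch below is illustrative only. It assumes that Chen's survival function is $1 - F_2(t) = \exp(-\lambda_2(\mathrm{e}^{t^{\delta_2}} - 1))$, which is what integrating the hazard rate $\lambda_2 \delta_2 t^{\delta_2 - 1}\mathrm{e}^{t^{\delta_2}}$ gives. The function names, the parameter values, the truncation point `upper` and the Simpson rule are our own choices, and $t > 0$ is required when $\delta_2 < 1$.

```python
import math

def chen_survival(t, lam2, delta2):
    """Assumed Chen survival: exp(-lam2 * (exp(t**delta2) - 1))."""
    e = t ** delta2
    if e > 700.0:  # exp(e) would overflow; the survival is numerically zero
        return 0.0
    return math.exp(-lam2 * math.expm1(e))

def chen_hazard(t, lam2, delta2):
    """Chen hazard rate lam2 * delta2 * t**(delta2 - 1) * exp(t**delta2), t > 0."""
    return lam2 * delta2 * t ** (delta2 - 1.0) * math.exp(t ** delta2)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def A1_direct(t, c1, lam1, lam2, delta2, upper=30.0):
    """Annuity plus c1 times the contingent insurance, both integrated directly."""
    s_t = chen_survival(t, lam2, delta2)

    def s2(tau):
        return chen_survival(t + tau, lam2, delta2) / s_t

    def annuity(tau):
        return math.exp(-lam1 * tau) * s2(tau)

    def insurance(tau):
        surv = s2(tau)
        if surv == 0.0:  # skip the hazard once the survival has vanished
            return 0.0
        return math.exp(-lam1 * tau) * surv * chen_hazard(t + tau, lam2, delta2)

    return simpson(annuity, 0.0, upper) + c1 * simpson(insurance, 0.0, upper)

def A1_laplace(t, c1, lam1, lam2, delta2, upper=30.0):
    """Rearranged form of (61): c1 + (1 - c1 * lam1) * L(s_2)."""
    s_t = chen_survival(t, lam2, delta2)
    laplace_s2 = simpson(
        lambda tau: math.exp(-lam1 * tau) * chen_survival(t + tau, lam2, delta2) / s_t,
        0.0, upper)
    return c1 + (1.0 - c1 * lam1) * laplace_s2

# Hypothetical parameters: lam1 = 2 and c1 = 1/2, so that c1 * lam1 = 1.
for delta2 in (0.5, 1.0, 2.0):
    print(delta2, A1_direct(1.0, 0.5, 2.0, 1.0, delta2))
```

With $c_1 \lambda_1 = 1$, the printed values should all be close to $c_1$ whatever the shape parameter $\delta_2$, up to quadrature error.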

To interpret Figure 16, we recall Figures 7–12. This case corresponds to the situation where the Weibull extractor is already in the aging mode of the process, and the Chen extractor is only starting its tasks. Here, the intensity of the Weibull extractor has to increase as time passes, and the other player can take advantage of this because the intensity of its rate of extraction is decreasing. However, it is only at the beginning of the process that it is optimal for the Chen extractor to have a stronger rate than that of the Weibull extractor (this is the only case in which this situation is observed). This is consistent with the behaviors shown by the intensity functions for the Weibull extractor in Figure 9, and for the Chen extractor in Figure 12.

**Figure 16.** Ease of the extraction when the shape parameters of Weibull and Chen distributions are $\delta_1 = 2$ and $\delta_2 = \frac{1}{2}$, respectively. (**a**) Ease of the extraction for Weibull's player when $\delta_1 = 2$. (**b**) Ease of the extraction for Chen's player when $\delta_2 = \frac{1}{2}$.

**Figure 17.** Ease of the efforts when the shape parameters of Weibull and Chen distributions are $\delta_1 = 2$ and $\delta_2 = 1$, respectively. (**a**) Ease of the effort required by the Weibull extractor when $\delta_1 = 2$. (**b**) Ease of the effort required by the Chen extractor when $\delta_2 = 1$.

Figure 17a shows that the effort required by the Weibull extractor is a convex function that tends to stabilize in the long run. This feature is consistent with Figure 9. However, in the controlled case, we cannot observe an increasing trend. The behavior of the ease of the Chen extractor mirrors that of the corresponding ease function in Figure 6.

Figure 18 shows a more pronounced version of the effect shown in Figure 17. The reason is that both agents are in the aging mode of their respective processes.

**Figure 18.** Ease of the efforts when the shape parameters of Weibull and Chen distributions are $\delta_1 = 2$ and $\delta_2 = 2$, respectively. (**a**) Ease of the effort required by the Weibull extractor when $\delta_1 = 2$. (**b**) Ease of the effort required by the Chen extractor when $\delta_2 = 2$.

If we fix the initial stock at $x_0 = 5$ and use Theorem 4, we can calculate the following table, where we have assumed, as in Section 3, that the scale parameters $\lambda_1$ and $\lambda_2$ equal unity. Each entry represents the pair $(W_1(5, 0), W_2(5, 0))$.


It should be noted that the player whose random terminal time is distributed according to Chen's law ends up earning less than the other player in all cases. This is consistent with the comparisons we made in Section 2, and in particular, in Figure 3c.

The optimal trajectory is given by the following exponential function:

$$\begin{split} x^*(\tau) = x_0 \exp\Bigg(-\int_0^\tau \Bigg[ &\frac{\exp\left(-\int_0^t (\lambda_1 \delta_1 s^{\delta_1 - 1} + \lambda_2 \delta_2 s^{\delta_2 - 1} \mathrm{e}^{s^{\delta_2}})\, \mathrm{d}s\right)}{\int_t^\infty (1 + c_1 \lambda_2 \delta_2 s^{\delta_2 - 1} \mathrm{e}^{s^{\delta_2}}) \exp\left(-\int_0^s (\lambda_1 \delta_1 r^{\delta_1 - 1} + \lambda_2 \delta_2 r^{\delta_2 - 1} \mathrm{e}^{r^{\delta_2}})\, \mathrm{d}r\right) \mathrm{d}s} \\ &+ \frac{\exp\left(-\int_0^t (\lambda_1 \delta_1 s^{\delta_1 - 1} + \lambda_2 \delta_2 s^{\delta_2 - 1} \mathrm{e}^{s^{\delta_2}})\, \mathrm{d}s\right)}{\int_t^\infty (1 + c_2 \lambda_1 \delta_1 s^{\delta_1 - 1}) \exp\left(-\int_0^s (\lambda_1 \delta_1 r^{\delta_1 - 1} + \lambda_2 \delta_2 r^{\delta_2 - 1} \mathrm{e}^{r^{\delta_2}})\, \mathrm{d}r\right) \mathrm{d}s} \Bigg] \mathrm{d}t\Bigg). \end{split}$$
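The trajectory above can be evaluated numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation. The inner integral of the two hazard rates is taken in closed form, $\lambda_1 s^{\delta_1} + \lambda_2(\mathrm{e}^{s^{\delta_2}} - 1)$. The outer integrals are approximated by the trapezoidal rule on a uniform grid truncated at a hypothetical point `upper`. The shape parameters are assumed to satisfy $\delta_1, \delta_2 \geq 1$, so that the hazards are finite at zero, and $\tau$ is assumed to be well below `upper`. All function names and numerical values are illustrative.

```python
import math

def weibull_hazard(s, lam1, d1):
    return lam1 * d1 * s ** (d1 - 1.0)

def chen_hazard(s, lam2, d2):
    return lam2 * d2 * s ** (d2 - 1.0) * math.exp(s ** d2)

def joint_survival(s, lam1, d1, lam2, d2):
    """exp(-int_0^s (Weibull hazard + Chen hazard) dr), in closed form."""
    e = s ** d2
    if e > 700.0:  # exp(e) would overflow; the survival is numerically zero
        return 0.0
    return math.exp(-lam1 * s ** d1 - lam2 * math.expm1(e))

def optimal_trajectory(x0, c1, c2, lam1, d1, lam2, d2, tau, upper=40.0, n=40000):
    """Approximate x*(tau) on a uniform grid over [0, upper]."""
    h = upper / n
    grid = [i * h for i in range(n + 1)]
    surv = [joint_survival(s, lam1, d1, lam2, d2) for s in grid]

    # Integrands of the two denominators; set to zero once the survival vanishes.
    g1, g2 = [], []
    for s, S in zip(grid, surv):
        if S == 0.0:
            g1.append(0.0)
            g2.append(0.0)
        else:
            g1.append((1.0 + c1 * chen_hazard(s, lam2, d2)) * S)
            g2.append((1.0 + c2 * weibull_hazard(s, lam1, d1)) * S)

    # Backward cumulative trapezoid: tail[i] approximates int_{s_i}^{upper} g ds.
    tail1 = [0.0] * (n + 1)
    tail2 = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail1[i] = tail1[i + 1] + 0.5 * h * (g1[i] + g1[i + 1])
        tail2[i] = tail2[i + 1] + 0.5 * h * (g2[i] + g2[i + 1])

    # Sum of the two fractions, integrated from 0 to tau.
    m = int(round(tau / h))
    rate = [surv[i] / tail1[i] + surv[i] / tail2[i] for i in range(m + 1)]
    integral = sum(0.5 * h * (rate[i] + rate[i + 1]) for i in range(m))
    return x0 * math.exp(-integral)

# Hypothetical values: x0 = 5, unit scale parameters, c1 = c2 = 1, shapes (2, 1).
for tau in (0.0, 0.5, 1.0):
    print(tau, optimal_trajectory(5.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, tau))
```

As a sanity check on the sketch, setting $\lambda_2 = 0$ and $\delta_1 = 1$ makes both fractions constant, equal to $\lambda_1$ and $\lambda_1/(1 + c_2\lambda_1)$, so the stock decays exponentially at their sum.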

### **4. Conclusions**

In this paper we used standard Dynamic Programming techniques and classic Actuarial Mathematics tools to analyze a differential game with a linear system and logarithmic reward function under the total payoff criterion with a random horizon. In Section 2.1 we presented the game model of interest, and in Section 2.2 we presented a *third actuarial principle* to calculate premia under the total payoff criterion, which we used to introduce the game with random terminal times. The distributions that we studied for these random variables are of great importance for risk analysts, and we devoted Section 3 to describing them and to presenting our main results and analyses. The first distribution that we considered is the classic two-parameter Weibull law, and the second is the heavy-tailed law of Chen. We compared the results of a resource extraction differential game model under each of these laws, and we identified the phases of the extraction with different values of the shape parameter of the distributions we used. In Section 3.1 we solved the degenerate case of the game (i.e., for only one player) and, in Section 3.2, we used some actuarial tools to study a two-player version of the game with two independent random terminal times for the extraction tasks.

To illustrate our developments, we stated a degenerate game and a two-person game with independent random terminal times for each player. We emphasize the importance of the logarithmic utility of wealth function that we used, for it allowed us to find a connection between principles **PI** and **PII** from Section 2.2. This connection is expressed by (40) and (56) in Remarks 2 and 3, respectively. We believe that, regardless of the very particular form of the utility functions involved in our study, these expressions mean that the bond between the extraction game at hand (with random terminal times whose distributions are known) and the actuarial equivalence principle should be further studied, for it implies that, should there be an interest in insuring the operation of any of the agents, the optimal premium to be *continuously charged* to each player is given by $u^*(t, x)$. We think of such an extension as a plausible future line of work.

Additionally, we found short, explicit, numerical and graphical expressions in terms of actuarial nomenclature for the optimal rates of extraction of the agents in each case analyzed. We used this notation to characterize the optimal extraction rates in two parts, namely:


The resulting instantaneous utility of this exercise is given by (28) for the degenerate game, and (45) for the two-person differential game.

Moreover, with such representations of the premia available, other kinds of analyses become possible; for instance, building confidence intervals and adding a loading factor to the premium to compute probabilities of ruin of the (hypothetical) insurance company, thus extending our results to the realm of actuarial risk theory. One such extension, which we look forward to working on, would be to consider an agent whose entry into the system happens at a random moment after the start of the extraction tasks (as is the case in Reference [52]), where (any of) the agents require(s) insurance to continue to develop the resource.

Another plausible extension is suggested by the fact that, in References [1,52], the authors found expressions for the optimal rates of extraction that can also be put in terms of simple contingent functions; this, on the one hand, is consistent with our findings in this paper and, on the other, can be used to model the presence of both legal and illegal extracting agents. This means that, in spite of the particular form of the utility functions we used for the agents, the *optimal* premia are given by expressions that follow the actuarial equivalence principle. With this in mind, we could try to analyze the optimality of equivalence premia with a broader class of reward functions, under the total payoff criterion, by using Dynamic Programming.

The speed of the extraction tasks differs for two different moments of the end of the game. As was to be expected, Chen's distribution allows the most realistic description of the life cycle of the system. Our illustrations confirm that in normal operation it is necessary to "dig" at a rate that is gradually increased with time, but when the end of the process is subject to Chen's law, the pace of resource extraction needs to intensify even faster. In the equipment aging mode, when the failure rate function increases, it is necessary to increase the rate of development of the fields, no matter which distribution law determines the completion of the development process.
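The life-cycle phases mentioned above can be related to the shape of the failure rate functions. The sketch below is illustrative only: it evaluates the Weibull hazard $\lambda\delta t^{\delta - 1}$ and the Chen hazard $\lambda\delta t^{\delta - 1}\mathrm{e}^{t^{\delta}}$ on a grid and classifies their trend. Reading $\delta < 1$ as the pre-run phase, $\delta = 1$ as normal operation and $\delta > 1$ as the aging mode is our assumption, as are the function names, the unit scale parameter and the time grid.

```python
import math

def weibull_hazard(t, lam, delta):
    """Weibull failure rate lam * delta * t**(delta - 1), for t > 0."""
    return lam * delta * t ** (delta - 1.0)

def chen_hazard(t, lam, delta):
    """Chen failure rate lam * delta * t**(delta - 1) * exp(t**delta), for t > 0."""
    return lam * delta * t ** (delta - 1.0) * math.exp(t ** delta)

def trend(hazard, lam, delta, times):
    """Classify the hazard on the given grid by the signs of its increments."""
    values = [hazard(t, lam, delta) for t in times]
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) < 1e-12 for d in diffs):
        return "constant"
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "non-monotone"

times = [0.1 * k for k in range(1, 41)]  # hypothetical grid on (0, 4]
for delta in (0.5, 1.0, 2.0):
    print(delta,
          trend(weibull_hazard, 1.0, delta, times),
          trend(chen_hazard, 1.0, delta, times))
```

For $\delta < 1$, setting the derivative of the logarithm of the Chen hazard to zero gives a minimum at $t = ((1 - \delta)/\delta)^{1/\delta}$, that is, $t = 1$ when $\delta = \frac{1}{2}$. The Chen failure rate therefore first decreases and then increases, whereas the Weibull failure rate with the same shape parameter decreases throughout.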

Among our findings, we can also note that the optimal behaviors of the agents differ across game scenarios and moments of completion. For example, during the pre-run phase (i.e., when the equipment is not yet established and the overall picture of the process is not fully clear), the development speed should be the smallest, which corresponds to the agent's caution. This is reflected in the entries of the row $\delta_1 = \frac{1}{2}$ and the column $\delta_2 = \frac{1}{2}$ of the table of Section 3.2: for both players, the entries in the other rows and columns are greater than these. This feature is consistent with the conclusions drawn in Reference [4], in the sense that both works prove that the price of insuring the extraction tasks is decreasing in the level of expertise of the extractor (which we might interpret as the stage at which each agent is located). This conclusion is also coherent with the developments shown in References [27–29] on the willingness of the agents to pay for the coverage of an insurance strategy to guarantee the continued supply of the revenues from the extraction tasks.

To conclude, we mention that the results of this paper can be applied to a wide range of domains related to critical technologies, including the Internet of Things, energy management, and security, to mention just a few. An extension of our results to these applications will be addressed in our subsequent work.

**Author Contributions:** The conceptualization of the model with random-time horizon, as well as its formal analysis for each of the players, is due to E.V.G. The methodology for this particular problem is due to E.S.M., for it was the subject of her M.Sc. dissertation. The validation of the methods, and the comparison with the results from the Actuarial Sciences field, is due to J.D.L.-B. The financial resources came from Saint Petersburg State University and Universidad Anáhuac México. J.D.L.-B. was in charge of writing and preparing the original draft. All authors contributed to the process of writing, reviewing and editing. All authors have read and agreed to the published version of the manuscript.

**Funding:** The reported study on optimal controls of E.V.G. was funded by RFBR according to the research project N 18-00-00727 (18-00-00725). The participation of J.D.L.-B. was jointly financed by Universidad Anáhuac México and Saint Petersburg State University.

**Acknowledgments:** The authors wish to acknowledge the technical contributions of Dmitrii V. Gromov. The discussions we held were both fruitful and interesting.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
