Article

Optimal Exploitation of a General Renewable Natural Resource under State and Delay Constraints

by M’hamed Gaïgi, Idris Kharroubi and Thomas Lim
1 Ecole Nationale d’Ingénieurs de Tunis-LAMSIN, Université de Tunis El Manar, Tunis 2092, Tunisie
2 Sorbonne Université, Université de Paris, CNRS, Laboratoire de Probabilités, Statistiques et Modélisations (LPSM), 75005 Paris, France
3 Ecole Nationale Supérieure d’Informatique pour l’Industrie et l’Entreprise, Laboratoire de Mathématiques et Modélisation d’Evry, CNRS UMR 8071, 91037 Evry, France
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(11), 2053; https://doi.org/10.3390/math8112053
Submission received: 11 September 2020 / Revised: 5 November 2020 / Accepted: 9 November 2020 / Published: 18 November 2020
(This article belongs to the Special Issue Stochastic Optimization Methods in Economics, Finance and Insurance)

Abstract

In this work, we study an optimization problem arising in the management of a natural resource over an infinite time horizon. The resource is assumed to evolve according to a logistic stochastic differential equation. The manager is allowed to harvest the resource and sell it at a stochastic market price modeled by a geometric Brownian motion. We assume that delay constraints are imposed on the decisions of the manager: more precisely, harvesting and selling orders are executed only after a delay. Using the dynamic programming approach, we characterize the value function as the unique solution to an original partial differential equation. We complete our study with some numerical illustrations.
MSC (2010): 93E20; 62L15; 49L20; 49L25; 92D25

1. Introduction

In recent decades, the management of natural resources has become a major issue: for many countries, natural resources provide regular income, supporting economic growth and development. However, the pursuit of high short-term profits can lead to overconsumption of natural resources and therefore to their exhaustion (see, e.g., [1]). Hence, the question of the sustainability of such resources is crucial.
As a consequence, many countries have imposed restrictions on the exploitation of natural resources so as to avoid their depletion. One repercussion of these constraints is the non-immediacy of decisions: the actions of natural resource managers are executed only after some delay. Moreover, harvests are limited in time, and sometimes a lag constraint is imposed between two harvests. The aim of this work is to model these delays and to study their effects on the gain and the behavior of natural resource managers.
We suppose that the resource manager can act through two types of interventions: starting and stopping the harvesting of the natural resource. We therefore model the strategy by a double sequence $(d_i, s_i)_{i\geq 1}$, where $d_i$ and $s_i$ are stopping times representing, respectively, the time of the $i$-th decision to start and to stop harvesting. Accordingly, we assume $s_i \geq d_i$ for $i \geq 1$.
Such a formulation naturally appears in decision-making problems in economics and finance. In many cases, managers face technical and regulatory delays, which may be significant and must therefore be taken into account in the decision process (see, for example, [2,3]). In our case, we consider the management of a natural resource subject to constraints and lags. We first suppose that there is a minimum time $\delta$ between the end of an action and the beginning of the following one. This constraint can be written on the strategy $(d_i, s_i)_{i\geq 1}$ as $d_{i+1} \geq s_i + \delta$ for $i \geq 1$. We also suppose that there are two kinds of lags. The first one applies to starting orders: the harvest of the natural resource starts only after a given fixed delay $\ell$, which represents the time needed to access the natural resource. The second kind of lag, denoted by $m$, corresponds to the time between the end of the harvest and the date at which the manager sells the harvest; this lag can be due to drying, packaging, transport, the time needed to find a counterparty to buy the harvest, etc.
Hence, our modeling takes into account the non-immediacy of both the harvest and its sale. As a result, the corresponding optimal strategies will be more practical and will lead to economic and environmental policies that are more effective than those suggested in the classic literature.
We assume that, without any intervention of the manager, the natural resource abundance evolves according to a stochastic logistic diffusion model. Such logistic dynamics are classic in the modeling of population evolution; see for example [4]. If we denote by $X^\alpha$ the controlled resource abundance process and by $P$ its price process, the problem of the manager turns into the maximization of the expected total profit over an infinite horizon, of the form:
\[ \mathbb{E}\Big[\sum_{i\geq 1} f\big(d_i,\, s_i,\, P_{s_i+m},\, (X_t^\alpha)_{d_i+\ell \leq t \leq s_i}\big)\Big], \]
over the strategies $\alpha = (d_i, s_i)_{i\geq 1}$ satisfying the previous constraints.
From a mathematical point of view, control problems with delay were studied in [5,6] where there was only one kind of intervention. In our model, we consider two kinds of interventions, which are moreover related by a constraint. In the paper [7], the authors also considered two kinds of interventions. However, there is no constraint linking them, and only one of them is lagged. Furthermore, the state variable (the resource abundance) is a physical quantity. We therefore have the additional state constraint restricting strategies to those in which the remaining abundance is nonnegative.
Control problems under state constraints without delay have been intensely studied in the literature (see for example [8] for the study of optimal portfolio management under liquidity constraints), and the classical approach to deal with such problems is to consider the notion of constrained viscosity solutions introduced in [9,10]. In this work, we adapt these techniques to a state constraints control problem with delay. Using a dynamic programming approach, we characterize the associated value function as the unique solution in the viscosity sense to an original partial differential equation (PDE). The novelty of the PDE lies in the different forms it takes on several regions of the space.
We then test the applicability of our approach by computing numerically the optimal strategies on some examples. Our numerical tests show that the optimal strategies heavily depend on the delay parameters. In particular, the effective optimal strategies are different from the naive optimal strategies, i.e., without delays. This illustrates the contribution of our approach to identifying optimal solutions for the management of natural resources.
The paper is organized as follows. We define the model and formulate our stochastic control problem in Section 2. In Section 3, we derive the partial differential equations associated with the control problem. Then, we characterize the value function as the unique viscosity solution of a Hamilton–Jacobi–Bellman equation. Finally, in Section 4, we compute numerically the value function and the associated optimal policy via an iterative procedure based on a quantization method. We further enrich our studies with numerical illustrations.

2. Problem Formulation

2.1. The Model

Let $\Omega = C(\mathbb{R}_+, \mathbb{R}^2)$ be the space of continuous functions from $\mathbb{R}_+$ to $\mathbb{R}^2$. We define on $\Omega$ the $\sigma$-algebra $\mathcal{F}$ generated by the coordinate maps $\omega \in \Omega \mapsto \omega_t$, for $t \in \mathbb{R}_+$, and we endow $(\Omega, \mathcal{F})$ with the Wiener measure $\mathbb{P}$. By an abuse of notation, we still denote by $\mathcal{F}$ the $\mathbb{P}$-completed $\sigma$-algebra. We define on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ the two $\mathbb{R}$-valued processes $B$ and $W$ by:
\[ B_t(\omega) = \omega_t^1 \quad \text{and} \quad W_t(\omega) = \omega_t^2, \]
for $t \in \mathbb{R}_+$ and $\omega = (\omega_t^1, \omega_t^2)_{t\geq 0}$. We then denote by $\mathbb{F} = (\mathcal{F}_t)_{t\geq 0}$ the complete filtration generated by $(W, B)$.
We consider a resource that, in the absence of harvesting, evolves according to the classical logistic stochastic differential equation:
\[ dX_t = \eta X_t(\lambda - X_t)\,dt + \gamma X_t\,dB_t, \]
where $\eta$, $\lambda$, and $\gamma$ are three positive constants. The constant $\eta\lambda$ corresponds to the intrinsic rate of population growth, and $\lambda$ is the carrying capacity of the environment. A manager can harvest the resource under some conditions. We denote by $\alpha := (d_i, s_i)_{i\geq 1}$ a harvesting strategy, described as follows.
  • $d_i$ is the time at which the manager gives the order to harvest. The harvest starts only at time $d_i + \ell$, with $\ell$ a positive constant representing the delay.
  • $s_i$ is the time at which the harvest is stopped.
In the following, we only consider the set $\mathcal{A}$ of admissible strategies such that $(d_i)_{i\geq 1}$ and $(s_i)_{i\geq 1}$ are two increasing sequences of $\mathbb{F}$-stopping times satisfying:
\[ 0 \leq s_i - d_i \leq K, \]
and
\[ s_i + \delta \leq d_{i+1}, \]
for any $i \geq 1$, where $\delta$ and $K$ are positive constants with $\ell < K$.
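To make these timing constraints concrete, the following minimal sketch (a hypothetical helper of ours, not from the paper; `ell`, `delta` and `K` play the roles of $\ell$, $\delta$ and $K$) checks whether a finite list of decision times satisfies the two conditions above:

```python
# Illustrative sketch (not from the paper): admissibility check for a strategy.
def is_admissible(decisions, ell, delta, K):
    """Check the timing constraints on a strategy alpha = (d_i, s_i)_{i>=1}.

    `decisions` is a list of pairs (d_i, s_i), assumed sorted chronologically.
    The constraints are 0 <= s_i - d_i <= K (with ell < K) and s_i + delta <= d_{i+1}.
    """
    assert ell < K, "the model requires ell < K"
    for i, (d, s) in enumerate(decisions):
        if not (0.0 <= s - d <= K):
            return False                      # harvest order kept open too long (or stopped before started)
        if i + 1 < len(decisions):
            d_next = decisions[i + 1][0]
            if d_next < s + delta:            # minimum lag delta between two cycles
                return False
    return True


# Example: the first strategy is admissible, the second violates the lag constraint.
print(is_admissible([(0.0, 0.8), (1.5, 2.3)], ell=0.2, delta=0.5, K=1.0))  # True
print(is_admissible([(0.0, 0.8), (1.0, 1.6)], ell=0.2, delta=0.5, K=1.0))  # False
```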
We assume that, during harvesting, the manager harvests the quantity $g(x)$ per unit of time, where $x$ is the quantity of available resource and $g$ is a function satisfying the following condition.
(Hg) $g$ is an increasing function from $\mathbb{R}_+$ to $\mathbb{R}_+$ such that $g(0) = 0$, and there exist two positive constants $a_{\min}$ and $a_{\max}$ such that $a_{\min}\,x \leq g(x) \leq a_{\max}\,x$ for any $x \in \mathbb{R}_+$.
Moreover, the manager must pay a cost when harvesting during a period $\Delta t$; this cost is $f(\Delta t)$, where $f$ is an increasing function from $\mathbb{R}_+$ to $\mathbb{R}_+$ such that $f(0) = 0$.
Finally, after each harvest, the manager sells the harvested resource at time $s_i + m$, with $m$ a positive constant. We denote by $P^p$ the price of the resource, and we suppose that it evolves according to the following stochastic differential equation:
\[ dP_t^p = P_t^p\big(\mu_t\,dt + \sigma_t\,dW_t\big), \qquad P_0^p = p, \]
where $\mu$ and $\sigma$ are positive bounded $\mathbb{F}$-adapted processes and $p$ is the price at time 0. We assume that $m < \delta$.
We can sum up all the constraints with the following graph.
[Diagram: cycle between the states A (ready to harvest), B (harvesting), and C (sale), with transition times $\ell$, $m$, and $\delta - m$.]
The state A corresponds to the state where the manager can decide to start a harvest. The state B corresponds to the harvesting time. The state C corresponds to the moment of sale.
The variable $d_i$ (resp. $s_i$) corresponds to the time at which the manager decides to leave state A (resp. B). The time to go from state A to state B is $\ell$, which means that the time between the order to harvest and the start of the harvest is $\ell$. We cannot stay more than $K - \ell$ in state B, which means that the harvesting time cannot be more than $K - \ell$. The time to go from state B to state C is $m$, which means that the manager must wait $m$ after the harvest to sell the production. The time to go from state C to state A is $\delta - m$, which means that the minimum time between the sale and the next order to harvest is $\delta - m$.
If the manager follows an admissible strategy $\alpha = (d_i, s_i)_{i\geq 1}$, then the quantity of available resource $X_t^{x,\alpha}$ at time $t$ evolves according to the following stochastic differential equation:
\[ dX_t^{x,\alpha} = \eta X_t^{x,\alpha}\big(\lambda - X_t^{x,\alpha}\big)\,dt + \gamma X_t^{x,\alpha}\,dB_t - \sum_{i\geq 1} g\big(X_t^{x,\alpha}\big)\,\mathbf{1}_{d_i + \ell \leq t \leq s_i}\,dt, \]
with $X_0^{x,\alpha} = x$.
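As an illustration of these dynamics, the sketch below (our own assumption, not code from the paper) simulates one path of the controlled abundance $X^{x,\alpha}$ and of the price $P^p$ by an Euler–Maruyama discretization, for a single harvesting cycle $(d_1, s_1)$ with constant $\mu$ and $\sigma$; the initial values and parameters are only illustrative.

```python
# Illustrative sketch (not from the paper): Euler-Maruyama simulation of X and P.
import numpy as np

def simulate_paths(x0, p0, d1, s1, ell, T, dt,
                   eta, lam, gamma, mu, sigma, g, rng):
    """Simulate the controlled resource X and the price P on [0, T].

    Harvesting at rate g(X_t) is active on [d1 + ell, s1]; outside this
    window the resource follows the uncontrolled logistic diffusion.
    """
    n = int(round(T / dt))
    X = np.empty(n + 1); P = np.empty(n + 1)
    X[0], P[0] = x0, p0
    for k in range(n):
        t = k * dt
        dB, dW = rng.normal(0.0, np.sqrt(dt), size=2)   # independent Brownian increments
        harvesting = (d1 + ell <= t <= s1)
        drift_X = eta * X[k] * (lam - X[k]) - (g(X[k]) if harvesting else 0.0)
        X[k + 1] = max(X[k] + drift_X * dt + gamma * X[k] * dB, 0.0)  # keep abundance nonnegative
        P[k + 1] = P[k] * np.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dW)
    return X, P

# One path with the parameter values used later in Section 4.2.
rng = np.random.default_rng(0)
X, P = simulate_paths(x0=1.5, p0=1.0, d1=0.0, s1=0.4, ell=0.2069, T=1.0, dt=1e-3,
                      eta=0.1, lam=2.0, gamma=0.2, mu=0.2, sigma=0.1,
                      g=lambda x: x, rng=rng)
```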

2.2. The Value Function

The objective of the manager is to maximize the expected profit over an infinite horizon. The associated value function is given by:
\[ V(x, p) = \sup_{\alpha \in \mathcal{A}} \mathbb{E}\Big[\sum_{i\geq 1} e^{-\beta(s_i+m)}\big(G_i^\alpha - C_i^\alpha\big)\Big], \]
where $\beta$ is a positive constant corresponding to the discount factor, and $G_i^\alpha$ and $C_i^\alpha$ correspond, respectively, to the gain and the cost of the $i$-th harvest associated with the strategy $\alpha \in \mathcal{A}$:
\[ C_i^\alpha = f(s_i - d_i), \]
and:
\[ G_i^\alpha = P_{s_i+m}^p \int_{d_i+\ell}^{s_i} g\big(X_t^{x,\alpha}\big)\,dt. \]
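For a fixed (deterministic, for simplicity) first cycle $(d_1, s_1)$, the discounted net profit $e^{-\beta(s_1+m)}(G_1^\alpha - C_1^\alpha)$ can be estimated by Monte Carlo from the simulation sketch above; the helper below is hypothetical and only illustrates the definitions of $G_1^\alpha$ and $C_1^\alpha$, with the time integral approximated by a Riemann sum.

```python
# Illustrative sketch (not from the paper); reuses numpy and simulate_paths above.
def first_cycle_profit(n_paths, d1, s1, m, beta, f, sim_params, rng):
    """Monte Carlo estimate of E[e^{-beta (s1+m)} (G_1 - C_1)] for one cycle,
    with G_1 = P_{s1+m} * int_{d1+ell}^{s1} g(X_t) dt and C_1 = f(s1 - d1)."""
    ell, dt, g = sim_params["ell"], sim_params["dt"], sim_params["g"]
    total = 0.0
    for _ in range(n_paths):
        X, P = simulate_paths(d1=d1, s1=s1, rng=rng, **sim_params)
        t = np.arange(len(X)) * dt
        harvested = np.sum(g(X[(t >= d1 + ell) & (t <= s1)])) * dt   # Riemann sum of g(X_t) dt
        p_sale = P[min(int(round((s1 + m) / dt)), len(P) - 1)]       # price at the sale date s1 + m
        total += np.exp(-beta * (s1 + m)) * (p_sale * harvested - f(s1 - d1))
    return total / n_paths

estimate = first_cycle_profit(
    n_paths=2000, d1=0.0, s1=0.4, m=0.2069, beta=0.5, f=lambda u: 0.1 * u,
    sim_params=dict(x0=1.5, p0=1.0, ell=0.2069, T=1.0, dt=1e-3, eta=0.1,
                    lam=2.0, gamma=0.2, mu=0.2, sigma=0.1, g=lambda x: x),
    rng=np.random.default_rng(1))
```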

3. PDE Characterization

3.1. Extension of the Value Function

In order to provide an analytic characterization of the value function $V$ defined by (4), we need to extend the definition of the control problem to general initial conditions. Indeed, the delays imposed on the manager make the state process non-Markovian. To overcome this issue, we introduce new variables keeping track of the time elapsed since the previous decisions. More precisely, we consider a gain function $J(x, p, \theta, \rho, y, \alpha)$ from $E := \mathbb{R}_+ \times \mathbb{R}_+^* \times D \times \mathcal{A}(\Theta)$ to $\mathbb{R}$, with $x$ representing the size of the available resource at the initial time, $p$ the price of the resource, $\theta$ the time elapsed since the last decision of the manager (start or stop a harvest), $\rho$ the time elapsed since the manager last decided to harvest, $y$ the quantity harvested so far in the current harvest, and $\alpha$ the strategy. We introduce some notation to simplify the formulae: $z := (x, p)$, $Z := \mathbb{R}_+ \times \mathbb{R}_+^*$ and $\Theta := (\theta, \rho, y)$. We also introduce the following sets:
\[ D_0 := \big\{(\theta, \rho, y) \in \mathbb{R}_+^3:\ y \geq 0,\ 0 \leq \theta = \rho \leq K\big\}, \qquad D_1 := \big\{(\theta, \rho, y) \in \mathbb{R}_+^3:\ y \geq 0,\ 0 \leq \theta < \rho \wedge m \text{ and } \rho - K \leq \theta \leq \rho - \ell\big\}, \qquad D_2 := \big\{(\theta, \rho, y) \in \mathbb{R}_+^3:\ y \geq 0,\ m \leq \theta < \rho \text{ and } \rho - K \leq \theta \leq \rho - \ell\big\}, \]
and $D := D_0 \cup D_1 \cup D_2$.
The gain function $J$ is given, for any state $(z, \Theta) \in Z \times D$ and strategy $\alpha \in \mathcal{A}(\Theta)$, by:
\[ J(z, \Theta, \alpha) := \mathbb{E}\Big[\sum_{j\geq 1} e^{-\beta(s_j+m)}\big(G_j - C_j\big)\Big], \]
where $G_1$ and $C_1$ are defined by:
\[ G_1 = \Big(y + \int_{(\ell-\theta)^+}^{s_1} g\big(X_s^{x,\alpha}\big)\,ds\Big)\,P_{s_1+m}^p\,\mathbf{1}_{D_0} + y\,P_{m-\theta}^p\,\mathbf{1}_{D_1}, \qquad C_1 = f(s_1 + \theta)\,\mathbf{1}_{D_0} + f(\rho - \theta)\,\mathbf{1}_{D_1}. \]
For any $j \geq 2$, $C_j$ and $G_j$ are defined by (5) and (6).
To define the set of admissible strategies $\mathcal{A}(\Theta)$, we first introduce the sets $\mathcal{A}_i(\Theta)$ defined for any $\Theta \in D_i$ with $i \in \{0, 1, 2\}$:
\[ \mathcal{A}_0(\Theta) := \big\{(d_i, s_i)_{i\geq 1}:\ d_1 = -\theta,\ s_1 \text{ is a stopping time valued in } [(\ell-\theta)^+, K - \theta],\ \text{and } (d_i, s_i)_{i\geq 2} \in \mathcal{A} \text{ with } d_2 \geq s_1 + \delta\big\}, \]
\[ \mathcal{A}_1(\Theta) := \big\{(d_i, s_i)_{i\geq 1}:\ (d_1, s_1) = (-\rho, -\theta) \text{ and } (d_i, s_i)_{i\geq 2} \in \mathcal{A} \text{ with } d_2 \geq \delta - \theta\big\}, \]
\[ \mathcal{A}_2(\Theta) := \big\{(d_i, s_i)_{i\geq 1}:\ (d_1, s_1) = (-\rho, -\theta) \text{ and } (d_i, s_i)_{i\geq 2} \in \mathcal{A} \text{ with } d_2 \geq (\delta - \theta)^+\big\}. \]
Finally, we define the set $\mathcal{A}(\Theta)$ by $\mathcal{A}(\Theta) = \mathcal{A}_i(\Theta)$ when $\Theta \in D_i$ with $i \in \{0, 1, 2\}$.
We can now define the extended value function $v$ by:
\[ v(z, \Theta) := \sup_{\alpha \in \mathcal{A}(\Theta)} J(z, \Theta, \alpha), \]
for any $(z, \Theta) \in Z \times D$.

3.2. Dynamic Programming Principle

To characterize the value function $v$ by a PDE, we use the dynamic programming approach. The value function $v$ satisfies the following equalities, which depend on the set in which $\Theta$ lies.
Theorem 1.
For any $z \in Z$ and $\Theta \in D_0$, we have:
\[ v(z, \Theta) = \sup_{(\ell-\theta)^+ \leq s_1 \leq K - \theta} \mathbb{E}\Big[e^{-\beta s_1}\,v\Big(X_{s_1}^{x,\alpha},\, P_{s_1}^p,\, 0,\, s_1 + \theta,\, y + \int_{(\ell-\theta)^+}^{s_1} g\big(X_s^{x,\alpha}\big)\,ds\Big)\Big]. \]
For any $z \in Z$ and $\Theta \in D_1$, we have:
\[ v(z, \Theta) = \mathbb{E}\Big[e^{-\beta(m-\theta)}\big(y\,P_{m-\theta}^p - f(\rho - \theta)\big) + e^{-\beta(m-\theta)}\,v\big(X_{m-\theta}^{x,\alpha},\, P_{m-\theta}^p,\, m,\, \rho + m - \theta,\, 0\big)\Big]. \]
For any $z \in Z$ and $\Theta \in D_2$, we have:
\[ v(z, \Theta) = \sup_{d_2 \geq (\delta-\theta)^+} \mathbb{E}\Big[e^{-\beta d_2}\,v\big(X_{d_2}^{x,\alpha},\, P_{d_2}^p,\, 0,\, 0,\, y\big)\Big]. \]
The proof of this theorem is postponed to Appendix A, Appendix B and Appendix C.

3.3. Growth Property

We now impose the following assumption on the coefficients:
\[ \mu + \eta\lambda - \beta < 0. \]
We then obtain the following growth property for the value function $v$, which will be useful to characterize $v$ as the unique viscosity solution of a PDE system.
Proposition 1.
The value function $v$ satisfies the following growth condition: there exist two positive constants $C_1$ and $C_2$ such that:
\[ y\,p - C_1 \leq v(z, \Theta) \leq C_2\,\big(1 + |x|^2 + |p|^2\big), \]
for any $z = (x, p) \in Z$ and $\Theta = (\theta, \rho, y) \in D$.
Proof. 
We first prove the left inequality. If $\Theta \in D_0$, we can consider the strategy that consists of stopping the harvest as soon as possible and never harvesting afterwards, so:
\[ v(z, \Theta) \geq \mathbb{E}\big[y\,P_{(\ell-\theta)^+ + m}^p - f\big((\ell-\theta)^+\big)\big] \geq y\,p - f(K). \]
If $\Theta \in D_1$, we can consider the strategy that consists of selling the harvest and never harvesting afterwards, and we get:
\[ v(z, \Theta) \geq \mathbb{E}\big[y\,P_{m-\theta}^p - f(\rho - \theta)\big] \geq y\,p - f(K). \]
If $\Theta \in D_2$, we can consider the strategy that consists of starting the harvest as soon as possible, stopping it as soon as possible, and never harvesting afterwards, so:
\[ v(z, \Theta) \geq \mathbb{E}\big[y\,P_{(\delta-\theta)^+ + \ell + m}^p - f(0)\big] \geq y\,p - f(K). \]
Hence, the left inequality holds with $C_1 = f(K)$.
We now prove the right inequality. To this end, we introduce the process $\bar X^x$ defined by $\bar X_0^x = x$ and:
\[ d\bar X_t^x = \eta \bar X_t^x\big(\lambda - \bar X_t^x\big)\,dt + \gamma \bar X_t^x\,dB_t. \]
Using the closed-form expression of the logistic diffusion (see, e.g., [11]), we have, for $0 \leq t \leq T$:
\[ \bar X_t^x = \frac{e^{(\eta\lambda - \frac{\gamma^2}{2})t + \gamma B_t}}{\frac{1}{x} + \eta \int_0^t e^{(\eta\lambda - \frac{\gamma^2}{2})u + \gamma B_u}\,du} \leq x\,e^{\eta\lambda T}\,e^{-\frac{\gamma^2}{2}t + \gamma B_t}, \]
which implies the following inequality:
\[ \sup_{0 \leq t \leq T} \mathbb{E}\big[\bar X_t^x\big] \leq x\,e^{\eta\lambda T}. \]
We now consider any strategy α = ( d i , s i ) i 1 A ( Θ ) . Since the cost function f is positive, we get:
J ( z , Θ , α ) i 1 e β ( s i + m ) E P s i + m p d i + s i g ( X ¯ s x ) d s .
From Assumption (Hg), we have:
J ( z , Θ , α ) i 1 e β ( s i + m ) E [ P s i + m p ] d i + s i a max E X ¯ s x d s C p i 1 e ( μ β ) s i a max sup 0 t s i E X ¯ s x .
From Inequality (10), we get (C is a generic constant, which can be modified):
J ( z , Θ , α ) C p x i 1 e ( μ + η λ β ) s i .
From Inequality (8) and all the constraints about d i and s i , we know that s i K i + δ ( i 1 ) + m for any i N * , so we get:
J ( z , Θ , α ) C p x i 1 e ( μ + η λ β ) ( K i + δ ( i 1 ) + m ) C p x i 1 e ( μ + η λ β ) ( K + δ ) i C p x ,
which implies:
\[ v(z, \Theta) \leq C\,p\,x. \]
 □

3.4. Viscosity Properties and Uniqueness

We now consider all the cases in order to derive, from the dynamic programming relations, the PDEs satisfied by the value function $v$:
  • If $\Theta \in D_0$ with $\theta \in [0, \ell)$, the manager has given the order to harvest but the harvest has not yet started, which implies:
    \[ \beta v - \mathcal{L}_0 v = 0, \]
    with $\mathcal{L}_0\psi = \eta x(\lambda - x)\,\partial_x\psi + \mu p\,\partial_p\psi + \frac{|\gamma x|^2}{2}\,\partial_{xx}\psi + \frac{|\sigma p|^2}{2}\,\partial_{pp}\psi + \partial_\theta\psi + \partial_\rho\psi$ for any function $\psi \in C^2(Z \times D)$.
  • If $\Theta \in D_0$ with $\theta \in [\ell, K)$, the manager is harvesting and can decide to stop, which implies:
    \[ \min\big(\beta v - \mathcal{L}_1 v,\; v - \mathcal{M}_1 v\big) = 0, \]
    with $\mathcal{L}_1\psi = \big(\eta x(\lambda - x) - g(x)\big)\,\partial_x\psi + \mu p\,\partial_p\psi + g(x)\,\partial_y\psi + \frac{|\gamma x|^2}{2}\,\partial_{xx}\psi + \frac{|\sigma p|^2}{2}\,\partial_{pp}\psi + \partial_\theta\psi$ for any $\psi \in C^2(Z \times D)$, and the operator $\mathcal{M}_1$ is defined for any function $v \in C^2(Z \times D_0)$ by:
    \[ \mathcal{M}_1 v(x, p, \theta, \theta, y) = v(x, p, 0, \theta, y). \]
  • If $\Theta \in D_0$ with $\theta = K$, the manager must stop the harvest, so we have:
    \[ v(x, p, K, K, y) = v(x, p, 0, K, y). \]
  • If $\Theta \in D_1$ with $\theta \in [0, m)$, the manager has finished harvesting but has not yet sold the harvest, which implies:
    \[ \beta v - \mathcal{L}_0 v = 0. \]
  • If $\Theta \in D_1$ with $\theta = m$, the manager sells the harvest, which implies:
    \[ \lim_{\theta \to m} v(x, p, \theta, \rho, y) = y\,p - f(\rho - m) + v(x, p, m, \rho, 0). \]
  • If $\Theta \in D_2$ with $\theta \in [m, \delta)$, the manager can do nothing, which implies:
    \[ \beta v - \mathcal{L}_0 v = 0. \]
  • If $\Theta \in D_2$ with $\theta \geq \delta$, the manager can decide to start a harvest:
    \[ \min\big(\beta v - \mathcal{L}_0 v,\; v - \mathcal{M}_2 v\big) = 0. \]
    The operator $\mathcal{M}_2$ is defined for any function $v \in C^2(Z \times D_2)$ by:
    \[ \mathcal{M}_2 v(x, p, \theta, \rho, y) = v(x, p, 0, 0, y). \]
As usual, we do not have any regularity property on the value function v. We therefore work with the notion of the viscosity solution.
Definition 1 
(Viscosity solution to (11)–(17)). A locally bounded function $w$ defined on $Z \times D$ is a viscosity supersolution (resp. subsolution) if:
  • for any $(z, \Theta) \in Z \times D_0$ and $\varphi \in C^2(Z \times D_0)$ such that:
    \[ (w_* - \varphi)(z, \Theta) = \min_{Z \times D_0}(w_* - \varphi) \quad \big(\text{resp. } (w^* - \varphi)(z, \Theta) = \max_{Z \times D_0}(w^* - \varphi)\big), \]
    we have:
    \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) \geq 0 \ \text{ if } \theta \in [0, \ell) \quad \big(\text{resp. } \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) \leq 0\big), \]
    \[ \min\big(\beta\varphi(z, \Theta) - \mathcal{L}_1\varphi(z, \Theta),\; w_*(z, \Theta) - \mathcal{M}_1 w_*(z, \Theta)\big) \geq 0 \ \text{ if } \theta \in [\ell, K) \quad \big(\text{resp. } \min\big(\beta\varphi(z, \Theta) - \mathcal{L}_1\varphi(z, \Theta),\; w^*(z, \Theta) - \mathcal{M}_1 w^*(z, \Theta)\big) \leq 0\big); \]
  • for any $z \in Z$ and $y \in \mathbb{R}_+$:
    \[ w_*(x, p, K, K, y) \geq w_*(x, p, 0, K, y) \quad \big(\text{resp. } w^*(x, p, K, K, y) \leq w^*(x, p, 0, K, y)\big); \]
  • for any $(z, \Theta) \in Z \times D_1$ with $\theta \in [0, m)$ and $\varphi \in C^2(Z \times D_1)$ such that:
    \[ (w_* - \varphi)(z, \Theta) = \min_{Z \times D_1}(w_* - \varphi) \quad \big(\text{resp. } (w^* - \varphi)(z, \Theta) = \max_{Z \times D_1}(w^* - \varphi)\big), \]
    we have:
    \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) \geq 0 \quad \big(\text{resp. } \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) \leq 0\big); \]
  • for any $z \in Z$, $\rho \in [\ell + m, K + m]$ and $y \in \mathbb{R}_+$:
    \[ w_*(x, p, m, \rho, y) \geq y\,p - f(\rho - m) + w_*(x, p, m, \rho, 0) \quad \big(\text{resp. } w^*(x, p, m, \rho, y) \leq y\,p - f(\rho - m) + w^*(x, p, m, \rho, 0)\big); \]
  • for any $(z, \Theta) \in Z \times D_2$ and $\varphi \in C^2(Z \times D_2)$ such that:
    \[ (w_* - \varphi)(z, \Theta) = \min_{Z \times D_2}(w_* - \varphi) \quad \big(\text{resp. } (w^* - \varphi)(z, \Theta) = \max_{Z \times D_2}(w^* - \varphi)\big), \]
    we have:
    \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) \geq 0 \ \text{ if } \theta \in [m, \delta) \quad \big(\text{resp. } \leq 0\big), \]
    \[ \min\big(\beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta),\; w_*(z, \Theta) - \mathcal{M}_2 w_*(z, \Theta)\big) \geq 0 \ \text{ if } \theta \geq \delta \quad \big(\text{resp. } \min\big(\beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta),\; w^*(z, \Theta) - \mathcal{M}_2 w^*(z, \Theta)\big) \leq 0\big). \]
A locally bounded function w defined on Z × D is said to be a viscosity solution to (11)–(17) if it is a supersolution and a subsolution to (11)–(17).
The next result provides the viscosity properties of the value function v.
Theorem 2
(Viscosity characterization). The value function v is the unique viscosity solution to (11)–(17) satisfying the growth condition (9). Moreover, v is continuous on Z × D .
The proof of this theorem is postponed to Appendix B and Appendix C.

4. Numerical Results

Unfortunately, we are not able to provide an explicit solution for the HJB Equations (11)–(17). We therefore propose in this section a scheme to approximate the solution v.

4.1. The Discrete Problem

In the following, we introduce the numerical tools used to solve the HJB equations related to the value function $v$. We use a numerical backward scheme based on optimal quantization combined with an iterative procedure. The convergence of the solution of the numerical scheme towards the solution of the HJB equation, as the time-space step on a bounded grid goes to zero, can be shown using standard monotonicity, stability, and consistency arguments. We refer to [12,13] for numerical schemes of the same form.
Each HJB equation of the form $\min(\beta v - \mathcal{L}_i v,\; v - h_i) = 0$, with $i \in \{0, 1\}$, is approximated as follows:
\[ v^{n+1}(x, p, \theta, \rho, y) = \max\Big(\mathbb{E}\big[v^{n+1}\big(X_\Delta^i,\, P_\Delta,\, \theta + \Delta,\, \rho + \Delta,\, Y_\Delta^i\big)\big],\; h_i^n\Big), \]
where:
\[ X_\Delta^i = x + \big(\eta x(\lambda - x) - i\,g(x)\big)\,\Delta + \gamma x \sqrt{\Delta}\,\xi_k, \qquad P_\Delta = p\,\exp\big((\mu - \sigma^2/2)\,\Delta + \sigma\sqrt{\Delta}\,\xi_l\big), \qquad Y_\Delta^i = y + i\,g(x)\,\Delta. \]
The constant Δ represents the time step, and the index n stands for the iteration procedure steps, which are stopped when the error between two consecutive iterations becomes smaller than a given stopping criterion ε . The random variables ξ k and ξ l represent the quantization of two independent normally distributed random variables.
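A minimal sketch of one update of this scheme at a single grid point is given below (hypothetical names of ours: `v_interp` stands for an interpolation of the current iterate on the state grid, `quantizer` for the list of pairs (node, weight) of the quantized standard normal law, and `i` in {0, 1} switches the harvesting term on or off); iterating such updates over the grid until the change between two sweeps falls below ε gives the approximation of $v$.

```python
# Illustrative sketch (not from the paper): one point update of the discretized HJB equation.
import numpy as np

def scheme_update(v_interp, h, state, i, Delta, params, quantizer):
    """Compute max( E[ v(X_Delta^i, P_Delta, theta+Delta, rho+Delta, Y_Delta^i) ], h )
    at one state point, where the expectation is taken over the quantized noise.

    `quantizer` is a list of pairs (xi, weight) approximating an N(0,1) law,
    used once for the resource noise and once for the (independent) price noise."""
    x, p, theta, rho, y = state
    eta, lam, gamma, mu, sigma, g = (params[k] for k in
                                     ("eta", "lam", "gamma", "mu", "sigma", "g"))
    expectation = 0.0
    for xi_k, w_k in quantizer:              # resource noise
        X_next = x + (eta * x * (lam - x) - i * g(x)) * Delta \
                 + gamma * x * np.sqrt(Delta) * xi_k
        for xi_l, w_l in quantizer:          # price noise
            P_next = p * np.exp((mu - 0.5 * sigma ** 2) * Delta
                                + sigma * np.sqrt(Delta) * xi_l)
            expectation += w_k * w_l * v_interp(X_next, P_next,
                                                theta + Delta, rho + Delta,
                                                y + i * g(x) * Delta)
    return max(expectation, h)
```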
Remark 1.
Recall that the optimal quantization technique consists of approximating the expectation $\mathbb{E}[f(Z)]$, where $Z$ is a normally distributed random variable and $f$ is a given real function, by:
\[ \mathbb{E}[f(\xi)] = \sum_{k \in \xi(\Omega)} f(k)\,\mathbb{P}(\xi = k). \]
The distribution of the discrete variable $\xi$ is known for a fixed $N := \mathrm{card}(\xi(\Omega))$, and the approximation is optimal in the sense that the $L^2$-error between $\xi$ and $Z$ is of order $1/N$ (see [14]). The optimal grid $\xi(\Omega)$ and the associated weights $\mathbb{P}(\xi = k)$ can be downloaded from the website: http://www.quantize.maths-fi.com/downloads.
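As a rough illustration of the idea (this is not the authors' precomputed optimal grid): as a stand-in, the sketch below builds an approximate $N$-point quantizer of the standard normal law with Lloyd's algorithm on simulated samples and uses it to approximate $\mathbb{E}[f(Z)]$.

```python
# Illustrative sketch (not from the paper): a crude Gaussian quantizer and its use.
import numpy as np

def gaussian_quantizer(N, n_samples=100_000, n_iter=30, seed=0):
    """Approximate N-point quantizer of N(0,1) via Lloyd's algorithm (1-d k-means):
    returns the grid points xi(Omega) and the weights P(xi = k)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n_samples)
    grid = np.quantile(z, (np.arange(N) + 0.5) / N)              # initial grid from quantiles
    for _ in range(n_iter):
        idx = np.argmin(np.abs(z[:, None] - grid[None, :]), axis=1)
        grid = np.array([z[idx == k].mean() for k in range(N)])  # cell centroids
    weights = np.bincount(idx, minlength=N) / n_samples
    return grid, weights

grid, weights = gaussian_quantizer(N=20)
f = np.cos                                   # test function
approx = np.sum(weights * f(grid))           # sum_k f(k) P(xi = k)
print(approx, np.exp(-0.5))                  # exact value of E[cos(Z)] is e^{-1/2}
```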

4.2. Numerical Interpretations

The numerical computation is done using the following set of data:
  • $\eta = 0.1$, $\lambda = 2$, $\gamma = 0.2$, $\mu = 0.2$, $\sigma = 0.1$, $\beta = 0.5$.
  • $\delta = K = 0.4828$, $\ell = m = 0.2069$.
  • Penalty function: $f(x) = 0.1\,x$.
  • Gain function: $g(x) = x$.
Figure 1. The shape of the value function $v$ for $(x, p, \theta, \rho, y) \in D_0$ in the plane of $x$.
We plot the shape of the value function $v$ for a fixed state $(p, \theta, \rho, y)$ in the plane of $x$, such that $(x, p, \theta, \rho, y) \in D_0$ and $\theta \in [0, \ell)$. We can see that, as expected, $v$ is increasing with respect to $x$. We can distinguish three cases in this figure:
  • $x \in [1.5, 1.7]$: in this case, the value function is increasing since, if we have $1.5 \leq x < x' \leq 1.7$ at the initial time, then we will have $X_{\ell-\theta}^x < X_{\ell-\theta}^{x'} < 2$ a.s., and the bigger the resource is when we harvest, the more we can harvest since the function $g$ is increasing;
  • $x \in [1.7, 2]$: in this case, the value function is constant since, if we have $1.7 \leq x < x' \leq 2$ at the initial time, then we will have $X_{\ell-\theta}^x = X_{\ell-\theta}^{x'} = 2$ a.s., so we harvest exactly the same quantity in the two cases;
  • $x > 2$: in this case, the value function is increasing since, if we have $2 \leq x < x'$ at the initial time, then the resource decreases, but we will still have $X_{\ell-\theta}^x < X_{\ell-\theta}^{x'}$ a.s.
The value function increases with respect to $x$, which is natural since the greater the resource is, the more we can harvest, as the function $g$ is increasing. We can also see that the value function becomes concave after $x = 2$, which is due to the resource's mean-reverting nature: indeed, if the quantity of resource is greater than two, the drift is negative, so the resource necessarily decreases, reducing the harvest. The oscillations observed when $x$ is small are due to the delay $\ell$.
Figure 2. The shape of the value function $v$ for $(x, p, \theta, \rho, y) \in D_0$ in the plane of $x$, for different values of $\ell$.
We plot the shape of the value function $v$ for a fixed state $(p, \theta, \rho, y)$ in the plane of $x$, such that $(x, p, \theta, \rho, y) \in D_0$ and $\theta \in [0, \ell)$, for different values of $\ell$. We can see that, when the delay $\ell$ changes, the point at which the value function changes monotonicity is shifted accordingly: 1.6 for $\ell = 0.1379$, 1.7 for $\ell = 0.2069$, and 1.8 for $\ell = 0.2759$. Indeed, as the figure shows, as the delay decreases, this change point approaches zero and would likely disappear without delay, leading to a perfectly concave function. In fact, the time spent waiting because of the delay leads the manager to miss the increasing period of the resource: the actual harvesting starts when the population is already dropping towards the mean-reverting level $\lambda$, and the value function thus decreases.
Figure 3. The optimal policy for $(x, p, \theta, \rho, y) \in D_0$ in the plane of $x$.
We plot the optimal decision that the manager would make for a fixed state $(p, \theta, \rho, y)$ in the plane of $x$, such that $(x, p, \theta, \rho, y) \in D_0$ and $\theta \in [\ell, K]$. As we can see, the optimal decision is to harvest if the resource $x$ is above a given level; otherwise, the manager should stop and sell the harvest. This is due to the cost $f$, which penalizes the manager as long as the harvesting is ongoing: if the population is not large enough, he/she would not be able to cover this loss.
Figure 4. The optimal policy for $(x, p, \theta, \rho, y) \in D_0$ in the plane of $\theta$.
We plot the optimal decision that the manager would make for four fixed states $(x, p, y) \in \{P_1, P_2, P_3, P_4\}$ in the plane of $\theta$, such that $(x, p, \theta, \rho, y) \in D_0$. The state $P_1$ represents the case where $x$ and $y$ are both low; state $P_2$ is for $x$ low and $y$ high, $P_3$ for $x$ high and $y$ low, and the final state $P_4$ for $x$ and $y$ both high. Decision 1 stands for harvesting, and Decision 2 stands for stopping the harvest. As we can see, the optimal decision that the manager should make in state $P_1$ (resp. $P_2$), knowing that he/she has already spent a time $\theta$ since the starting decision, is to stop harvesting if $\theta \leq \theta_0$ (resp. $\theta \leq \theta_1$), where $\theta_0 \approx 0.34$ (resp. $\theta_1 \approx 0.38$). We can explain this as follows: on the one hand, in the case where $\theta \leq \theta_0$ (resp. $\theta \leq \theta_1$), due to the cost of harvesting and the fact that we are in state $P_1$ (resp. $P_2$) where the population is low, the manager prefers to stop harvesting immediately and sell the harvest; otherwise, he/she would likely lose money. On the other hand, if $\theta \geq \theta_0$ (resp. $\theta \geq \theta_1$), i.e., the order to harvest was given a long time ago, it is optimal to keep harvesting in order to cover the cost due to the large amount of time already spent harvesting. We can also note that this last window of time is larger for state $P_1$ than for state $P_2$: indeed, in state $P_2$, the manager has already harvested more than in state $P_1$, so he/she can stop harvesting sooner.
Concerning states P 3 and P 4 , where the population is high, obviously, the optimal decision for the manager is to harvest at all times and stop harvesting when θ = K = 0.4828 .
Figure 5. The shape of the value function $v$ for $(x, p, \theta, \rho, y) \in D_1$ in the plane of $p$.
We plot the value function $v$ for a fixed state $(x, \theta, \rho, y)$ in the plane of $p$, such that $(x, p, \theta, \rho, y) \in D_1$ and $\theta \in [0, m]$. As expected, $v$ is nondecreasing with respect to $p$: the more expensive the resource is, the greater the manager's profit.
Figure 6. The optimal policy for $(x, p, \theta, \rho, y) \in D_2$ in the plane of $x$.
We plot the optimal decision that the manager would make for a fixed state $(p, \theta, \rho, y)$ in the plane of $x$, such that $(x, p, \theta, \rho, y) \in D_2$ and $\theta \in [\delta, \theta_{\max}]$. As we can see, the optimal decision that the manager should make, knowing that he/she has already sold the previous harvest, is to start harvesting once the resource $x$ exceeds a certain level below the mean-reverting barrier $\lambda$, so that the population has grown enough to cover the harvesting costs and generate a profit.

4.3. Conclusions

Our modeling takes into account the non-immediacy of both the harvest and its sale, described by the time delays $\ell$ and $m$. The optimal strategies discussed above illustrate the effect of these delays on the manager's actions. In the classical literature, many studies suggesting optimal harvesting policies presume that the natural resource is immediately available, which is generally not the case. Consequently, the proposed policies would not be feasible and would, in the best-case scenario, lead to a sub-optimal use of the resource. The ecological and economic effects may thus be significant.
For example, a modeling of fisheries that does not involve delays may lead to a harvesting strategy that would likely deplete the fish population, leading to extinction.
In fact, if the population is at a high level at the initial time, it is then likely to decline due to the logistic nature of the dynamics. As a result, if the time required to reach the harvest region is not taken into account, the best approach would be to harvest massively, thus causing a drastic degradation of the fish population.
The same reasoning applies when it comes to selling the harvest. If we assume that we can sell the catch immediately after harvesting, neglecting the time required to return to land, then a drop in the fish price before the actual sale would cause losses, and we would not be able to amortize the costs of fishing.

Author Contributions

Theoretical framework, M.G., I.K. and T.L.; methodology, M.G., I.K. and T.L.; formal analysis, M.G., I.K. and T.L.; design of the experiments, M.G.; numerical experiments, M.G.; writing—original draft preparation, M.G., I.K. and T.L.; writing—review and editing, M.G. and T.L. All authors read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Dynamic Programming Principle

We introduce some notation in this part to lighten the proofs. We first denote by $\mathcal{T}$ the set of $\mathbb{F}$-stopping times. For $\omega, \omega' \in \Omega$ and $t \geq 0$, we set:
\[ ({}^t\omega_s)_{s\geq 0} = (\omega_{s \wedge t})_{s\geq 0} \quad \text{and} \quad \big((\omega \otimes_t \omega')_s\big)_{s\geq 0} = \big(\omega_s\,\mathbf{1}_{s\leq t} + (\omega'_s - \omega'_t + \omega_t)\,\mathbf{1}_{s > t}\big)_{s\geq 0}. \]
For any $(z, \Theta) \in Z \times D$ and $\alpha \in \mathcal{A}(\Theta)$, we define $Z^{z,\alpha}$ as the two-dimensional process $(X^{x,\alpha}, P^p)$. For any $t \geq 0$, we denote by $\Theta(t, \alpha)$ the triple $(\theta_t, \rho_t, y_t)$, where $\theta_t$ is the time elapsed since the manager's last decision before $t$ (an order to start or to stop a harvest), $\rho_t$ is the time elapsed since the last order to harvest given before $t$, and $y_t$ is the quantity harvested up to time $t$.
For $\tau \in \mathcal{T}$ and $\alpha = (d_i, s_i)_{i\geq 1} \in \mathcal{A}(\Theta)$, we define the shifted (random) strategy $\alpha_\tau$ by:
\[ \alpha_\tau(\omega) = \big\{\big(d_i(\omega \otimes_{\tau(\omega)} \omega') - \tau(\omega),\; s_i(\omega \otimes_{\tau(\omega)} \omega') - \tau(\omega)\big)_{i \geq \kappa(\tau,\alpha)(\omega)},\ \omega' \in \Omega\big\}, \]
with
\[ \kappa(\tau, \alpha)(\omega) := \sup\{i \geq 1:\ d_i(\omega) \leq \tau(\omega)\}, \]
for all $\omega \in \Omega$.
Before proving the dynamic programming principle, we need the following results.
Lemma A1.
For any $\vartheta \in \mathcal{T}$, $(z, \Theta) \in Z \times D$ and $\alpha = (d_i, s_i)_{i\geq 1} \in \mathcal{A}(\Theta)$, we have the following properties.
  • Consistency of the admissible strategies: $\Theta(\vartheta, \alpha) \in D$ and $\alpha_\vartheta \in \mathcal{A}(\Theta(\vartheta, \alpha))$ $\mathbb{P}$-a.s.
  • Consistency of the gain function:
    \[ J(z, \Theta, \alpha) = \mathbb{E}\Big[\sum_{i=1}^{\kappa(\vartheta,\alpha)-1} e^{-\beta(s_i+m)}\big(G_i^\alpha - C_i^\alpha\big)\Big] + \mathbb{E}\Big[e^{-\beta\vartheta}\,J\big(Z_\vartheta^{z,\alpha}, \Theta(\vartheta, \alpha), \alpha_\vartheta\big)\Big]. \]
Proof. 
These properties are direct consequences of the dynamics of $Z^{z,\alpha}$ and of the definitions of $J$ and $\mathcal{A}$.
We now turn to the proof of the dynamic programming principle. Unfortunately, we do not have enough information on the value function $v$ to prove these results directly. In particular, we do not know whether $v$ is measurable, which prevents us from computing expectations involving $v$ as in the dynamic programming principle. We therefore provide weaker dynamic programming principles involving the envelopes $v^*$ and $v_*$, as in [15], where:
\[ v^*(z, \Theta) = \limsup_{\substack{(z', \Theta') \to (z, \Theta) \\ (z', \Theta') \in E,\ \theta' \to \theta^+,\ \rho' \to \rho^+}} v(z', \Theta'), \]
and:
\[ v_*(z, \Theta) = \liminf_{\substack{(z', \Theta') \to (z, \Theta) \\ (z', \Theta') \in E,\ \theta' \to \theta^+,\ \rho' \to \rho^+}} v(z', \Theta'). \]
We recall that, in general, $v_* \leq v \leq v^*$. Since we obtain the continuity of $v$ at the end, these results imply the dynamic programming principle. □
Proposition A1.
For any $\Theta \in D_0$ and $z \in Z$, we have:
v ( z , Θ ) sup α A ( Θ ) sup ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ] .
For any $\Theta \in D_1$ and $z \in Z$, we have:
v ( z , Θ ) sup α A ( Θ ) sup ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < m θ + [ e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 m θ ϑ ] .
For any $\Theta \in D_2$ and $z \in Z$, we have:
v ( z , Θ ) sup α A ( Θ ) sup ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < d 2 + [ i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 ϑ d 2 ] .
Proof. 
Let $i \in \{0, 1, 2\}$, $z \in Z$, $\Theta \in D_i$, $\alpha \in \mathcal{A}(\Theta)$ and $\vartheta \in \mathcal{T}$. By definition of the value function $v$, for any $\varepsilon > 0$ and $\omega \in \Omega$, there exists $\alpha^{\varepsilon,\omega} = (s_k^{\varepsilon,\omega}, d_k^{\varepsilon,\omega})_{k\geq 1} \in \mathcal{A}(\Theta(\vartheta(\omega), \alpha))$, which is $\varepsilon$-optimal at $(Z_\vartheta^{z,\alpha}, \Theta(\vartheta, \alpha))(\omega)$, i.e.,
\[ v\big(Z_{\vartheta(\omega)}^{z,\alpha}(\omega), \Theta(\vartheta(\omega), \alpha(\omega))\big) - \varepsilon \leq J\big(Z_{\vartheta(\omega)}^{z,\alpha}(\omega), \Theta(\vartheta(\omega), \alpha(\omega)), \alpha^{\varepsilon,\omega}\big). \]
By a measurable selection theorem (see, e.g., Theorem 82 in the appendix of Chapter III in [16]), there exists a sequence of stopping times $\bar\alpha^\varepsilon = (\bar s_k^\varepsilon, \bar d_k^\varepsilon)_{k\geq 1}$ such that $\bar s_k^\varepsilon(\omega) = s_k^{\varepsilon,\omega}(\omega)$ and $\bar d_k^\varepsilon(\omega) = d_k^{\varepsilon,\omega}(\omega)$ for a.a. $\omega \in \Omega$.
We now define by concatenation the control strategy $\bar\alpha$ consisting of the impulse control components of $\alpha$ on $[0, \vartheta)$ and the impulse control components $(\bar\alpha^\varepsilon + \vartheta)$ on $[\vartheta, \infty)$. More precisely, $\bar\alpha$ is given by:
α ¯ ( ω ) = ( s k ( ω ) , d k ( ω ) ) 1 k < κ ( ϑ , α ) ( ω ) ( s ¯ k ε ( ω ) + ϑ ( ω ) , d ¯ k ε ( ω ) + ϑ ( ω ) ) κ ( ϑ , α ) ( ω ) k .
By definition of the shift given in (A1), we have:
α ¯ ϑ ( ω ) = { ( s ¯ k ε ( ω ϑ ( ω ) ω ) , d ¯ k ε ( ω ϑ ( ω ) ω ) ) k 1 , ω Ω } = { α ¯ ϑ , ε ( ω ϑ ( ω ) ω ) , ω Ω } .
From Lemma A1 (ii) and the definition of the performance criterion we get the following equalities.
  • If z Z and Θ D 0 , then we have:
    J ( z , Θ , α ¯ ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ϵ ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ε ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ] .
  • If z Z and Θ D 1 , then we have:
    J ( z , Θ , α ¯ ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ϵ ) 1 ϑ < m θ + [ e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ε ) + e β ( s κ ( ϑ , α ) + n ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 m θ ϑ ] .
  • If z Z and Θ D 2 , then we have:
    J ( z , Θ , α ¯ ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ϵ ) 1 ϑ < d 2 + [ i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ¯ ε ) + e β ( s κ ( ϑ , α ) + n ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 d 2 ϑ ] .
Together with (A2), this implies if z Z and Θ D 0 , we have:
v ( z , Θ ) J ( z , Θ , α ¯ ) E [ e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ] .
If z Z and Θ D 1 , we have:
v ( z , Θ ) J ( z , Θ , α ¯ ) E [ e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) 1 ϑ < m θ + e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 m θ ϑ ] .
If z Z and Θ D 2 , we have:
v ( z , Θ ) J ( z , Θ , α ¯ ) E [ e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) 1 ϑ < d 2 + [ i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ ( v * ( Z ϑ z , α , Θ ( ϑ , α ) ) ε ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 d 2 ϑ ] .
Since ε , ϑ and α are arbitrarily chosen, we get the result. □
Proposition A2.
For all $z \in Z$ and $\Theta \in D_0$, we have:
v ( z , Θ ) sup α A ( Θ ) inf ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ] .
For all $z \in Z$ and $\Theta \in D_1$, we have:
v ( z , Θ ) sup α A ( Θ ) inf ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < m θ + ( e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ) 1 m θ ϑ ] .
For all $z \in Z$ and $\Theta \in D_2$, we have:
v ( z , Θ ) sup α A ( Θ ) inf ϑ T E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < d 2 + [ i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ) 1 ϑ d 2 ] .
Proof. 
Fix $z \in Z$, $\Theta \in D_0$, $\alpha \in \mathcal{A}(\Theta)$ and $\vartheta \in \mathcal{T}$. From Lemma A1 and the definition of the performance criterion, we get:
J ( z , Θ , α ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ϑ ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ϑ ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ] E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < s 1 + m + [ e β ( s 1 + m ) y + ( θ ) + s 1 g ( X s x , α ) d s P s 1 + m p f ( s 1 + θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 s 1 + m ϑ ]
since ϑ and α are arbitrary, we obtain the required inequality.
If z Z and Θ D 1 , we get:
J ( z , Θ , α ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ϑ ) 1 ϑ < m θ + [ e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ϑ ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 m θ ϑ ] E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < m θ + [ e β ( m θ ) y P m θ p f ( ρ θ ) + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 m θ ϑ ]
since ϑ and α are arbitrary, we obtain the required inequality.
If z Z and Θ D 2 , we get:
J ( z , Θ , α ) = E [ e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , 2 , α ϑ ) 1 ϑ < d 2 + [ i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ J ( Z ϑ z , α , Θ ( ϑ , α ) , α ϑ ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 d 2 ϑ ] E [ e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) 1 ϑ < d 2 + i = 2 κ ( ϑ , α ) 1 e β ( s i + m ) ( G i C i ) + e β ϑ v * ( Z ϑ z , α , Θ ( ϑ , α ) ) + e β ( s κ ( ϑ , α ) + m ) ( G κ ( ϑ , α ) C κ ( ϑ , α ) ) 1 D 2 ] 1 d 2 ϑ ]
since ϑ and α are arbitrary, we obtain the required inequality. □

Appendix B. Viscosity Properties

  • We first prove the viscosity supersolution property. Fix $i \in \{0, 1, 2\}$, and let $\bar z \in Z$, $\bar\Theta \in D_i$ and $\varphi \in C^2(Z \times D_i)$ be such that:
    \[ (v_* - \varphi)(\bar z, \bar\Theta) = \min_{Z \times D_i}(v_* - \varphi) = 0. \]
    If $i = 0$ and $\bar\theta \geq \ell$, we can take the immediate control $s_1 = 0$, so we obtain by Theorem 1:
    \[ v(\bar x, \bar p, \bar\theta, \bar\theta, \bar y) \geq v(\bar x, \bar p, 0, \bar\theta, \bar y) = \mathcal{M}_1 v(\bar x, \bar p, \bar\theta, \bar\theta, \bar y). \]
    If $i = 2$ and $\bar\theta \geq \delta$, we can take the immediate control $d_2 = 0$, so we obtain by Theorem 1:
    \[ v(\bar x, \bar p, \bar\theta, \bar\rho, \bar y) \geq v(\bar x, \bar p, 0, 0, \bar y) = \mathcal{M}_2 v(\bar x, \bar p, \bar\theta, \bar\rho, \bar y). \]
    From the definition of $v_*$, there exists a sequence $(z_n, \Theta_n)_{n\in\mathbb{N}}$ of $Z \times D_i$ such that:
    \[ \big(z_n, \Theta_n, v(z_n, \Theta_n)\big) \xrightarrow[n \to +\infty]{} \big(\bar z, \bar\Theta, v_*(\bar z, \bar\Theta)\big). \]
    We define $\gamma_n := v(z_n, \Theta_n) - v_*(\bar z, \bar\Theta) - \varphi(z_n, \Theta_n) + \varphi(\bar z, \bar\Theta)$. By continuity of $\varphi$, we get $\gamma_n \to 0$ as $n \to \infty$.
    Applying Proposition A1 with ϑ = h n = | γ n | . We have for n large enough:
    • if i = 0 :
      v ( z n , Θ n ) E [ e β h n v * ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 h n < s 1 n + m + [ e β ( s 1 n + m ) ( y + ( θ n ) + K θ n g ( X s x n , α 0 , n ) d s ) P s 1 n + m p n f ( s 1 n + θ n ) + e β h n v * ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) ] 1 s 1 n + m h n ]
      where α 0 , n is the strategy ( d 1 n , s 1 n ) = ( θ n , K θ n ) and then the manager follows the optimal strategy after this date,
    • if i = 1 :
      v ( z n , Θ n ) E [ e β h n v * ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 h n < m + 1 h n m e β ( s 1 n + m ) ( y P s 1 n + m p n f ( s 1 n + θ n ) ) + e β h n v * ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) ]
      where α 1 , n is the strategy ( d 1 n , s 1 n ) = ( ρ n , θ n ) and then the manager follows the optimal strategy after this date,
    • if i = 2 :
      v ( z n , Θ n ) E [ e β h n v * ( Z h n z n , α 2 , n , Θ ( h n , α 2 , n ) ) 1 h n < δ + 1 h n δ e β h n v * ( Z h n z n , α 2 , n , Θ ( h n , α 2 , n ) ) ]
      where α 2 , n is the strategy ( d 2 n , s 2 n ) = ( ( δ θ n ) + , ( δ θ n ) + + K ) and then the manager follows the optimal strategy after this date.
    We get for n large enough from (A3) and the three previous inequalities:
    γ n + φ ( z n , Θ n ) E [ e β h n ( v * ( Z h n z n , α i , n , Θ ( h n , α i , n ) ) ] E [ e β h n ( φ ( Z h n z n , α i , n , Θ ( h n , α i , n ) ) ] .
    Applying Itô’s formula, we get:
    • if i = 0 and θ ¯ < :
      1 h n E 0 h n e β s ( β φ ( Z s z n , α 0 , n , Θ ( s , α 0 , n ) ) L 0 φ ( Z s z n , α 0 , n , Θ ( s , α 0 , n ) ) ) d s | γ n | ,
    • if i = 0 and θ ¯ < K :
      1 h n E 0 h n e β s ( β φ ( Z s z n , α 0 , n , Θ ( s , α 0 , n ) ) L 1 φ ( Z s z n , α 0 , n , Θ ( s , α 0 , n ) ) ) d s | γ n | ,
    • if i = 1 and θ ¯ < m :
      1 h n E 0 h n e β s ( β φ ( Z s z n , α 1 , n , Θ ( s , α 1 , n ) ) L 0 φ ( Z s z n , α 1 , n , Θ ( s , α 1 , n ) ) ) d s | γ n | ,
    • if i = 2 and θ ¯ m :
      1 h n E 0 h n e β s ( β φ ( Z s z n , α 2 , n , Θ ( s , α 2 , n ) ) L 0 φ ( Z s z n , α 2 , n , Θ ( s , α 2 , n ) ) ) d s | γ n | .
      Sending n to , we get the supersolution property from the mean value theorem.
  • We now turn to the viscosity subsolution property. Fix $i \in \{0, 1, 2\}$, and let $\bar z \in Z$, $\bar\Theta \in D_i$ and $\varphi \in C^2(Z \times D_i)$ be such that:
    \[ (v^* - \varphi)(\bar z, \bar\Theta) = \max_{Z \times D_i}(v^* - \varphi) = 0. \]
    If $v^*(\bar z, \bar\Theta) \leq \mathcal{M}_1 v^*(\bar z, \bar\Theta)$ for $\bar\Theta \in D_0$ with $\bar\theta \geq \ell$, and $v^*(\bar z, \bar\Theta) \leq \mathcal{M}_2 v^*(\bar z, \bar\Theta)$ for $\bar\Theta \in D_2$ with $\bar\theta \geq \delta$, then the subsolution inequality holds trivially. Consider now the case $v^*(\bar z, \bar\Theta) > \mathcal{M}_1 v^*(\bar z, \bar\Theta)$ for $\bar\Theta \in D_0$ with $\bar\theta \geq \ell$, and $v^*(\bar z, \bar\Theta) > \mathcal{M}_2 v^*(\bar z, \bar\Theta)$ for $\bar\Theta \in D_2$ with $\bar\theta \geq \delta$, and argue by contradiction, assuming on the contrary that:
    • if $\bar z \in Z$, $\bar\Theta \in D_0$ and $\bar\theta < \ell$:
      \[ r := \beta\varphi(\bar z, \bar\Theta) - \mathcal{L}_0\varphi(\bar z, \bar\Theta) > 0, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_0$ and $\bar\theta \geq \ell$:
      \[ r := \beta\varphi(\bar z, \bar\Theta) - \mathcal{L}_1\varphi(\bar z, \bar\Theta) > 0, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_1$:
      \[ r := \beta\varphi(\bar z, \bar\Theta) - \mathcal{L}_0\varphi(\bar z, \bar\Theta) > 0, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_2$:
      \[ r := \beta\varphi(\bar z, \bar\Theta) - \mathcal{L}_0\varphi(\bar z, \bar\Theta) > 0. \]
    By continuity of $\varphi$ and its derivatives, there exists some $\Delta_0 > 0$ such that for all $0 < \Delta \leq \Delta_0$, we have:
    • if $\bar z \in Z$, $\bar\Theta \in D_0$ and $\bar\theta < \ell$:
      \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) > r/2, \quad \forall (z, \Theta) \in B((\bar z, \bar\Theta), \Delta) \cap E_0 \text{ with } \theta < \ell, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_0$ and $\bar\theta \geq \ell$:
      \[ \beta\varphi(z, \Theta) - \mathcal{L}_1\varphi(z, \Theta) > r/2, \quad \forall (z, \Theta) \in B((\bar z, \bar\Theta), \Delta) \cap E_0 \text{ with } \theta \geq \ell, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_1$:
      \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) > r/2, \quad \forall (z, \Theta) \in B((\bar z, \bar\Theta), \Delta) \cap E_1, \]
    • if $\bar z \in Z$, $\bar\Theta \in D_2$:
      \[ \beta\varphi(z, \Theta) - \mathcal{L}_0\varphi(z, \Theta) > r/2, \quad \forall (z, \Theta) \in B((\bar z, \bar\Theta), \Delta) \cap E_1. \]
    From the definition of $v^*$, there exists a sequence $(z_n, \Theta_n)_{n\in\mathbb{N}}$ of $B((\bar z, \bar\Theta), \Delta/2) \cap E_0$ with $\theta_n < \ell$ (resp. $\theta_n \geq \ell$) if $\bar\theta < \ell$ (resp. $\bar\theta \geq \ell$) such that:
    \[ \big(z_n, \Theta_n, v(z_n, \Theta_n)\big) \xrightarrow[n \to \infty]{} \big(\bar z, \bar\Theta, v^*(\bar z, \bar\Theta)\big), \]
    and there exists a sequence $(z_n, \Theta_n)_{n\in\mathbb{N}}$ of $B((\bar z, \bar\Theta), \Delta/2) \cap E_1$ (resp. $B((\bar z, \bar\Theta), \Delta/2) \cap E_2$) if $\bar\theta < m$ (resp. $\bar\theta \geq m$) such that:
    \[ \big(z_n, \Theta_n, v(z_n, \Theta_n)\big) \xrightarrow[n \to \infty]{} \big(\bar z, \bar\Theta, v^*(\bar z, \bar\Theta)\big). \]
    By Theorem 1 we can find for each n N a control α i , n A ( Θ n ) such that for all h n T :
    v ( z n , Θ n ) E [ e β h n v ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 h n < s 1 0 , n + n + [ e β ( s 1 0 , n + m y n + ( θ ) + s 1 0 , n g ( X s x n , α 0 , n ) d s P s 1 0 , n + n p n f ( s 1 0 , n + θ ) + i = 2 κ ( h n , α 0 , n ) 1 e β ( s i 0 , n + m ) ( G i C i ) + e β h n v ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) ] 1 s 1 0 , n + n h n ] + 1 n ,
    and for all ( z , Θ ) E 1 , we have:
    v ( z n , Θ n ) E [ e β h n v ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 h n < m θ n + [ e β ( m θ n ) y n P m θ n p n f ( ρ n θ n ) + i = 2 κ ( h n , α 1 , n ) 1 e β ( s i 1 , n + n ) ( G i C i ) + e β h n v ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) ] 1 θ n h n ] + 1 n ,
    and for all ( z , Θ ) E 2 , we have:
    v ( z n , Θ n ) E [ e β h n v ( Z h n z n , α 2 , n , Θ ( h n , α 2 , n ) ) 1 h n < d 2 n + i = 2 κ ( h n , α 2 , n ) 1 e β ( s i 1 , n + m ) ( G i C i ) + e β h n v ( Z h n z n , α 2 , n , Θ ( h n , α 2 , n ) ) 1 d 2 n h n ] + 1 n .
    We now choose h n : = τ n 0 s 1 0 , n where τ n 0 : = inf { s 0 , ( Z s z n , α 0 , n , Θ ( s , α 0 , n ) ) B ( ( z n , Θ n ) , Δ / 2 ) } . Therefore, we get:
    v ( z n , Θ n ) E [ e β h n ( v ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 τ n 0 < s 1 0 , n + h 1 * ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 τ n 0 s 1 0 , n ) ] + 1 n
    Now, since h 1 * ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 τ n 0 s 1 0 , n < φ ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) 1 τ n 0 s 1 0 , n and v v * φ on E 0 , we get:
    φ ( z n , Θ n ) + γ n E e β h n φ ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) + 1 n .
    Applying Itô’s formula to e β h n φ ( Z h n z n , α 0 , n , Θ ( h n , α 0 , n ) ) between 0 and h n , we then get:
    γ n r 2 E [ h n ] + 1 n .
    This implies:
    lim n E [ h n ] = 0 .
    On the other hand, we have by (A6):
    v ( z n , Θ n ) sup ( z , Θ ) B ( ( z n , Θ n ) , Δ / 2 ) v 0 ( z , Θ ) P [ τ n 0 < s 1 0 , n ] + sup ( z , Θ ) B ( ( z n , Θ n ) , Δ / 2 ) h 1 * ( z , Θ ) P [ τ n 0 s 1 0 , n ] + 1 n .
    From (A7), we then get sending n to infinity and Δ to zero:
    v * ( z ¯ , Θ ¯ ) M 1 v * ( z ¯ , Θ ¯ ) .
    Concerning the proof for D 1 and D 2 , we consider two cases: the case θ ¯ < m and the case θ ¯ m .
    We start with the case θ ¯ m . In this case we consider h n = d 2 1 , n τ n 1 where τ n 1 : = inf { s 0 , ( Z s z n , α 1 , n , d ( s , α 1 , n ) ) B ( ( z n , Θ n ) , Δ / 2 ) } . Therefore, we have:
    v 2 ( z n , Θ n ) E [ e β h n ( h 2 * ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 d 2 1 , n τ n 1 + v 2 ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 d 2 1 , n > τ n 1 ) ] + 1 n
    Now, since h 2 * ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 d 2 1 , n τ n 1 < φ ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) 1 d 2 1 , n τ n 1 and v 2 v 2 * φ on E 1 , we get:
    φ ( z n , Θ n ) + γ n E e β h n φ ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) + 1 n .
    Applying Itô’s formula to e β h n φ ( Z h n z n , α 1 , n , Θ ( h n , α 1 , n ) ) between 0 and h n , we then get:
    γ n r 2 E [ h n ] + 1 n .
    This implies:
    lim n E [ h n ] = 0 .
    On the other hand, we have by (A8):
    v 2 ( z n , Θ n ) sup ( z , Θ ) B ( ( z n , Θ n ) , Δ / 2 ) v 2 ( z , Θ ) P [ τ n 1 < d 2 1 , n ] + sup ( z , Θ ) B ( ( z n , Θ n ) , Δ / 2 ) h 2 * ( z , Θ ) P [ τ n 1 d 2 1 , n ] + 1 n .
    From (A9), we then get sending n to infinity and Δ to zero:
    v 2 * ( z ¯ , d ¯ ) h 2 * ( z ¯ , d ¯ ) ,
    which is a contradiction.
    The case θ ¯ < m where i = 1 , is analogous to the previous case and we also obtain a contradiction.

Appendix C. Uniqueness

The uniqueness of v as a viscosity solution to (11), (12), (14), (16) and (17) satisfying (9) follows from the following comparison result.
Proposition A3.
Let $\underline w : Z \times D \to \mathbb{R}$ be a viscosity subsolution to (11), (12), (14), (16), and (17) such that:
\[ \underline w(x, p, K, K, y) \leq g(x, p, 0, K, y), \qquad \lim_{\theta \to m} \underline w(x, p, \theta, \rho, y) \leq y\,p - f(\rho - m) + g(x, p, m, \rho, 0), \]
and $\overline w : Z \times D \to \mathbb{R}$ be a viscosity supersolution to (11), (12), (14), (16), and (17) such that:
\[ \overline w(x, p, K, K, y) \geq g(x, p, 0, K, y), \qquad \lim_{\theta \to m} \overline w(x, p, \theta, \rho, y) \geq y\,p - f(\rho - m) + g(x, p, m, \rho, 0). \]
Suppose that there exist two positive constants $C_1$ and $C_2$ such that:
\[ \underline w(z, \Theta) \leq C_2\,\big(1 + |x|^2 + |p|^2\big) \quad \text{and} \quad \overline w(z, \Theta) \geq y\,p - C_1, \]
for all $(z, \Theta) \in Z \times D$ with $z = (x, p)$ and $\Theta = (\theta, \rho, y)$. Then $\underline w \leq \overline w$ on $Z \times D$. In particular, there exists at most one viscosity solution $w$ to (11), (12), (14), (16), (17), (A10), and (A11) satisfying (A12), and $w$ is continuous on $Z \times D$.
The proof follows from the classical doubling-of-variables argument for proving the comparison between a subsolution and a supersolution. We therefore omit it and refer to [8], Theorem 5.2, for a detailed proof that can easily be adapted to our PDE.

References

  1. Lande, R.; Engen, S.; Sæther, B.-E. Optimal harvesting of fluctuating populations with a risk of extinction. Am. Nat. 1995, 145, 728–745. [Google Scholar] [CrossRef]
  2. Alvarez, L.; Keppo, J. The impact of delivery lags on irreversible investment under uncertainty. Eur. J. Oper. Res. 2002, 136, 173–180. [Google Scholar] [CrossRef]
  3. Bar-Ilan, A.; Strange, W. Investment lags. Am. Econ. Rev. 1996, 8, 610–622. [Google Scholar]
  4. Alvarez, L.; Shepp, L. Optimal harvesting of stochastically fluctuating populations. J. Math. Biol. 1998, 37, 155–177. [Google Scholar] [CrossRef]
  5. Bruder, B.; Pham, H. Impulse control problem on finite horizon with execution delay. Stoch. Process. Appl. 2009, 119, 1436–1469. [Google Scholar] [CrossRef] [Green Version]
  6. Oksendal, B.; Sulem, A. Optimal Stochastic Impulse Control with Delayed Reaction. Appl. Math. Optim. 2008, 58, 253–298. [Google Scholar] [CrossRef] [Green Version]
  7. Kharroubi, I.; Lim, T.; Vath, V.L. Optimal Exploitation of a Resource with Stochastic Population Dynamics and Delayed Renewal. J. Math. Anal. Appl. 2019, 477, 627–656. [Google Scholar] [CrossRef] [Green Version]
  8. Mnif, M.; Pham, H. A model of optimal portfolio selection under liquidity risk and price impact. Financ. Stochastics 2007, 11, 51–90. [Google Scholar]
  9. Soner, H.M. Optimal Control with State-Space Constraints I. SIAM J. Control. Optim. 1986, 24, 552–561. [Google Scholar] [CrossRef]
  10. Soner, H.M. Optimal Control with State-Space Constraints II. SIAM J. Control. Optim. 1986, 24, 1110–1122. [Google Scholar] [CrossRef]
  11. Skiadas, C.H. Exact Solutions of Stochastic Differential Equations: Gompertz, Generalized Logistic and Revised Exponential. Methodol. Comput. Appl. Probab. 2010, 12, 261–270. [Google Scholar] [CrossRef]
  12. Gaïgi, M.; Vath, V.L.; Mnif, M.; Toum, S. Numerical Approximation for a Portfolio Optimization Problem Under Liquidity Risk and Costs. Appl. Math. Optim. 2016, 74, 163–195. [Google Scholar] [CrossRef]
  13. Guilbaud, F.; Mnif, M.; Pham, H. Numerical methods for an optimal order execution problem. J. Comput. Financ. 2013, 16, 3–45. [Google Scholar] [CrossRef] [Green Version]
  14. Pagès, G.; Pham, H.; Printems, J. Optimal quantization methods and applications to numerical problems in finance. In Handbook on Numerical Methods in Finance; Rachev, S., Ed.; Birkhäuser: Boston, MA, USA, 2004; pp. 253–298. [Google Scholar]
  15. Bouchard, B.; Touzi, N. Weak dynamic programming principle for viscosity solutions. SIAM J. Control. Optim. 2011, 49, 948–962. [Google Scholar] [CrossRef] [Green Version]
  16. Dellacherie, C.; Meyer, P.A. Probabilités et Potentiel, I–IV; Hermann: Paris, France, 1975. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
