Article

A Stochastic Control Approach for Constrained Stochastic Differential Games with Jumps and Regimes

Department of Mathematics, University of Oslo, Postboks 1053 Blindern, 0316 Oslo, Norway
Mathematics 2023, 11(14), 3043; https://doi.org/10.3390/math11143043
Submission received: 4 June 2023 / Revised: 3 July 2023 / Accepted: 7 July 2023 / Published: 9 July 2023
(This article belongs to the Special Issue Stochastic Analysis and Applications in Financial Mathematics)

Abstract

We develop an approach for two-player constrained zero-sum and nonzero-sum stochastic differential games, which are modeled by Markov regime-switching jump-diffusion processes. We establish the relations between a standard stochastic optimal control setting and a Lagrangian method. In this context, we prove corresponding theorems for two different types of constraints, which lead us to real-valued and stochastic Lagrange multipliers, respectively. We then illustrate our results for a nonzero-sum game problem via the stochastic maximum principle technique. Our application is an example of cooperation between a bank and an insurance company, a popular, well-known type of business agreement called bancassurance.

1. Introduction

A regime-switching model is one of the most powerful tools for efficiently capturing abrupt changes in a wide range of random phenomena. The discrete shifts from one state to another may easily be described mathematically as financial, natural, or mechanical events; hence, they enjoy a substantial application area. In this work, specifically, we focus on the fields of finance and actuarial science.
The states of a Markov chain can be seen as proxies of macroeconomic instruments such as gross domestic product or sovereign credit ratings. Furthermore, when we observe how regulation policies issued by governments or financial institutions cause deep modifications in the microstructure of financial markets (see [1]), the importance of regime-switching models becomes clear. Moreover, the periods that emerge after catastrophic events like a financial crisis, e.g., the bankruptcy of Lehman Brothers in 2008, can be efficiently described by such systems. Additionally, we can combine regime switches with stochastic optimal control, another fundamental method of managing random events (for complete treatments of control theory, see [2,3]). Hence, these models have attracted many researchers, such as [4,5,6,7,8,9,10,11,12,13].
Furthermore, our work utilizes the foundations of stochastic differential games and combines them with stochastic optimal control and regime switches in a clear way. Such effective mathematical approaches have attracted several authors; see [5,12,14,15,16,17,18,19] and references therein. In particular, we focus on providing a financial application of a nonzero-sum stochastic differential game, for which the stochastic maximum principle is our solution technique of choice. In this sense, we mention some specific attempts in the literature to tackle such problems [7,17,18], which applied the stochastic maximum principle as well.
In [7], the authors develop necessary and sufficient maximum principles for Markov regime-switching forward–backward zero-sum and nonzero-sum stochastic differential games. They then provide an application for a zero-sum game, which describes robust utility maximization under a relative entropy penalty. However, they do not give an application describing the solution techniques for a nonzero-sum game formulation. In [17], the authors investigate optimal dividend strategies for two insurance companies and model their work as a stopping time problem via a regime-switching process; in that work, the authors only consider a diffusion process with regimes. On the other hand, in [18], the authors study the optimal control problem of a nonzero-sum mean-field game with a delayed Markov regime-switching forward–backward stochastic system with Lévy processes. In this context, they provide necessary and sufficient maximum principles for these types of problems. They also define a single state process for both players and maximize each investor's profits over the specified objective functional.
In our work, we approach stochastic differential game problems from a different technical point of view. We propose to provide the mathematical formulations for constrained stochastic control problems in a regime-switching environment. We formulate corresponding theorems to describe both stochastic and constant Lagrange multipliers. In this context, we extend Theorem 11.3.1 in [20], which has been proven for stochastic optimal control problems but without regime switches and game theoretical structures. Hence, our main contributions are developing the required theorems for zero-sum and nonzero-sum stochastic differential games with constraints and generalizing the state processes of the system to a Markov regime-switching jump-diffusion environment. We would like to emphasize the flexibility of our theorems, which can be applied with both dynamic programming principle and stochastic maximum principle techniques. Also, unlike the above works, we define different state processes for each player in a matrix representation, and our control problems appear with diversified constraints.
This paper is organized as follows: In Section 2, we provide the details of the model dynamics. Then, we introduce our Markov regime-switching jump-diffusion process, which is going to correspond to the state process of the system in our game theoretical application. In Section 3, we extend Theorem 11.3.1 in [20] to develop techniques in order to find the saddle point of a zero-sum game. In Section 4, we generalize Theorem 11.3.1 in [20] for the Nash equilibrium concept, which presents the stochastic optimal control processes of a nonzero-sum game formulation. In Section 5, we investigate cooperation between a bank and an insurance company via a nonzero-sum stochastic differential game method. While the company makes a decision for the optimal dividend payment against the best decision of the bank, the bank tries to determine the optimal appreciation rate for its cash flow corresponding to the best action of the company, and vice versa. In Section 6, we provide an insight into our results. Finally, a version of the sufficient maximum principle theorem with all the required technical conditions can be found in the appendix.

2. Preliminaries

Throughout this work, we assume that the maturity time $T > 0$ is finite. Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, P)$ be a complete probability space, where $\{\mathcal{F}_t\}_{t \ge 0}$ is a right-continuous, $P$-completed filtration and $\mathbb{F} = \{\mathcal{F}_t : t \in [0,T]\}$ is generated by an $M$-dimensional Brownian motion $W(\cdot)$, an $L$-dimensional Poisson random measure $N(\cdot,\cdot)$, and a $D$-state Markov chain $\alpha(\cdot)$. It is assumed that these processes are independent of each other and adapted to $\mathbb{F}$.
Let $\{\alpha(t) : t \in [0,T]\}$ be a continuous-time, finite-state Markov chain. We can choose a time-homogeneous or a time-inhomogeneous Markov chain, depending on the application that we intend to formulate. Moreover, depending on the specific problem, the chain may be reducible or irreducible. In this work, we utilize a set of Markov jump martingales associated with the chain $\alpha$, as developed in [6]. Hence, we represent the canonical state space of the finite-state Markov chain $\alpha(t)$ by $S = \{e_1, e_2, \ldots, e_D\}$, where $D \in \mathbb{N}$, $e_i \in \mathbb{R}^D$, and the $j$th component of $e_i$ is the Kronecker delta $\delta_{ij}$ for each pair $i,j = 1,2,\ldots,D$. The generator of the chain under $P$ is defined by $\Lambda := [\mu_{ij}(t)]_{i,j=1,2,\ldots,D}$, $t \in [0,T]$, where, for each $i,j = 1,2,\ldots,D$, $\mu_{ij}(t)$ denotes the transition intensity of the chain from state $e_i$ to state $e_j$ at time $t$. Note that for $i \neq j$, $\mu_{ij}(t) \ge 0$ and $\sum_{j=1}^D \mu_{ij}(t) = 0$; hence, $\mu_{ii}(t) \le 0$. By Appendix B in [21], we know that the Markov chain $\alpha$ admits the semimartingale representation

$$\alpha(t) = \alpha(0) + \int_0^t \Lambda^T \alpha(u)\, du + M(t),$$

where $\{M(t) : t \in [0,T]\}$ is an $\mathbb{R}^D$-valued $(\mathbb{F}, P)$-martingale and $\Lambda^T$ denotes the transpose of the matrix $\Lambda$. Let $J_{ij}(t)$ represent the number of jumps from state $e_i$ to state $e_j$ up to and including time $t$ for each $i,j = 1,2,\ldots,D$ with $i \neq j$ and $t \in [0,T]$. Then,
$$
\begin{aligned}
J_{ij}(t) &:= \sum_{0 < s \le t} \langle \alpha(s-), e_i \rangle \langle \alpha(s), e_j \rangle = \sum_{0 < s \le t} \langle \alpha(s-), e_i \rangle \langle \alpha(s) - \alpha(s-), e_j \rangle \\
&= \int_0^t \langle \alpha(s-), e_i \rangle \, d\langle \alpha(s), e_j \rangle \\
&= \int_0^t \langle \alpha(s-), e_i \rangle \langle \Lambda^T \alpha(s-), e_j \rangle \, ds + \int_0^t \langle \alpha(s-), e_i \rangle \, d\langle M(s), e_j \rangle \\
&= \int_0^t \mu_{ij}(s) \langle \alpha(s-), e_i \rangle \, ds + m_{ij}(t),
\end{aligned}
$$

where the processes $m_{ij}$ are $(\mathbb{F}, P)$-martingales, called the basic martingales associated with the chain $\alpha$. For each fixed $j = 1,2,\ldots,D$, let $\Phi_j(t)$ be the number of jumps into state $e_j$ up to time $t$. Hence, we obtain:

$$\Phi_j(t) := \sum_{i=1,\, i \neq j}^D J_{ij}(t) = \sum_{i=1,\, i \neq j}^D \int_0^t \mu_{ij}(s) \langle \alpha(s-), e_i \rangle \, ds + \tilde{\Phi}_j(t).$$

Defining $\tilde{\Phi}_j(t) := \sum_{i=1,\, i \neq j}^D m_{ij}(t)$ and $\mu_j(t) := \sum_{i=1,\, i \neq j}^D \int_0^t \mu_{ij}(s) \langle \alpha(s-), e_i \rangle \, ds$, it follows that, for each $j = 1,2,\ldots,D$,

$$\tilde{\Phi}_j(t) = \Phi_j(t) - \mu_j(t)$$

is an $(\mathbb{F}, P)$-martingale. By $\tilde{\Phi}(t) = (\tilde{\Phi}_1(t), \tilde{\Phi}_2(t), \ldots, \tilde{\Phi}_D(t))^T$, we represent a compensated random measure on $([0,T] \times S, \mathcal{B}([0,T]) \otimes \mathcal{B}_S)$, where $\mathcal{B}_S$ is a $\sigma$-field of $S$. Note that another description of such a martingale representation for a random measure generated by a Markov chain can be found in Appendix A.3 in [22], within the framework of actuarial science.
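As a numerical aside, the martingale property of the compensated jump count $\tilde{\Phi}_j(t) = \Phi_j(t) - \mu_j(t)$ can be checked by simulation. The following sketch simulates a chain from a hypothetical two-state generator (the function `simulate_ctmc` and all parameter values are our own illustrative choices, not part of the paper) and verifies by Monte Carlo that $\Phi_1(T) - \mu_1(T)$ is centered:

```python
import numpy as np

def simulate_ctmc(Lam, i0, T, rng):
    """Simulate a continuous-time Markov chain with generator Lam
    (Lam[i, j] = transition intensity from state i to j, rows sum to 0)
    on [0, T], started in state i0. Returns jump times and visited states."""
    times, states = [0.0], [i0]
    t, i = 0.0, i0
    while -Lam[i, i] > 0:
        t += rng.exponential(1.0 / -Lam[i, i])   # exponential holding time
        if t >= T:
            break
        p = np.maximum(Lam[i], 0.0)              # off-diagonal intensities
        i = rng.choice(len(p), p=p / p.sum())    # jump to the next state
        times.append(t)
        states.append(i)
    return np.array(times), np.array(states)

# Monte Carlo check that Phi_1(T) - mu_1(T) has mean close to 0
Lam = np.array([[-1.0, 1.0], [2.0, -2.0]])       # hypothetical 2-state generator
rng = np.random.default_rng(0)
T, diffs = 10.0, []
for _ in range(2000):
    ts, ss = simulate_ctmc(Lam, 0, T, rng)
    jumps_into_1 = int(np.sum(ss[1:] == 1))            # Phi_1(T)
    occ0 = np.diff(np.append(ts, T))[ss == 0].sum()    # time spent in state 0
    diffs.append(jumps_into_1 - Lam[0, 1] * occ0)      # Phi_1(T) - mu_1(T)
mc_error = abs(np.mean(diffs))
```

With 2000 paths, `mc_error` should be small relative to the per-path standard deviation, which is the empirical counterpart of the martingale property above.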
Furthermore, let $\mathcal{B}_0$ be the Borel $\sigma$-field generated by the open subsets of $\mathbb{R}_0 := \mathbb{R} \setminus \{0\}$ whose closures do not contain the point 0. We define the compensated Poisson random measures as follows:

$$\tilde{N}_i(dt, dz) := N_i(dt, dz) - \nu_i(dz)\, dt, \quad i = 1, 2, \ldots, L,$$

where the $N_i(dt, dz)$, $t \in [0,T]$, $z \in \mathbb{R}_0$, are independent Poisson random measures on $([0,T] \times \mathbb{R}_0, \mathcal{B}([0,T]) \otimes \mathcal{B}_0)$, and $\nu_i(dz) = (\nu_{e_1}^i(dz), \nu_{e_2}^i(dz), \ldots, \nu_{e_D}^i(dz))^T$ are the Lévy densities of the jump sizes of the random measure $N_i(dt, dz)$ for $i = 1, 2, \ldots, L$.
Now, let us describe the state process of the system as a Markov regime-switching jump-diffusion process:

$$
\begin{aligned}
dY(t) = {} & b(t, Y(t), \alpha(t), u_1(t), u_2(t))\, dt + \sigma(t, Y(t), \alpha(t), u_1(t), u_2(t))\, dW(t) \\
& + \int_{\mathbb{R}_0} \eta(t, Y(t), \alpha(t), u_1(t), u_2(t), z)\, \tilde{N}_\alpha(dt, dz) \\
& + \gamma(t, Y(t), \alpha(t), u_1(t), u_2(t))\, d\tilde{\Phi}(t), \quad t \in [0,T],
\end{aligned} \tag{1}
$$

$$Y(0) = y_0 \in \mathbb{R}^N, \tag{2}$$

where $U_1$ and $U_2$ are nonempty subsets of $\mathbb{R}^N$, and $u_1 \in U_1$ and $u_2 \in U_2$ are $\mathcal{F}_t$-predictable, càdlàg (right continuous with left limits) control processes such that

$$E\left[\int_0^T |u_k(t)|^2\, dt\right] < \infty, \quad k = 1, 2.$$
Moreover,

$$
\begin{aligned}
& b: [0,T] \times \mathbb{R}^N \times S \times U_1 \times U_2 \to \mathbb{R}^N, \qquad
\sigma: [0,T] \times \mathbb{R}^N \times S \times U_1 \times U_2 \to \mathbb{R}^{N \times M}, \\
& \eta: [0,T] \times \mathbb{R}^N \times S \times U_1 \times U_2 \times \mathbb{R}_0 \to \mathbb{R}^{N \times L}, \qquad
\gamma: [0,T] \times \mathbb{R}^N \times S \times U_1 \times U_2 \to \mathbb{R}^{N \times D}
\end{aligned}
$$

are given measurable functions with respect to $\mathbb{F}$, such that

$$
\begin{aligned}
\int_0^T \Big\{ & |b(t, Y(t), \alpha(t), u_1(t), u_2(t))| + |\sigma(t, Y(t), \alpha(t), u_1(t), u_2(t))|^2 \\
& + \int_{\mathbb{R}_0} |\eta(t, Y(t), \alpha(t), u_1(t), u_2(t), z)|^2\, \nu(dz)
+ \sum_{j=1}^D |\gamma(t, Y(t), \alpha(t), u_1(t), u_2(t))|^2\, \mu_j(t) \Big\}\, dt < \infty.
\end{aligned}
$$
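A minimal numerical sketch may help fix ideas about the state process. The scheme below is a scalar Euler discretization of a regime-switching jump-diffusion of the form (1), with the controls $u_1, u_2$ frozen inside the coefficients and a deterministic regime path; the function names and all parameter values are hypothetical illustrations, not part of the paper:

```python
import numpy as np

def euler_regime_jump_diffusion(b, sigma, eta, jump_rate, alpha_of_t,
                                y0, T, n_steps, rng):
    """Euler scheme for a scalar regime-switching jump-diffusion
    dY = b dt + sigma dW + eta d(compensated Poisson), where the
    coefficients depend on the current regime alpha_of_t(t)."""
    dt = T / n_steps
    y = np.empty(n_steps + 1)
    y[0] = y0
    for k in range(n_steps):
        t = k * dt
        a = alpha_of_t(t)                      # current regime index
        dW = rng.normal(0.0, np.sqrt(dt))      # Brownian increment
        dN = rng.poisson(jump_rate * dt)       # Poisson jump count on [t, t+dt]
        y[k + 1] = (y[k] + b(t, y[k], a) * dt + sigma(t, y[k], a) * dW
                    + eta(t, y[k], a) * (dN - jump_rate * dt))  # compensated jumps
    return y

# two regimes: calm (0) and turbulent (1), switching deterministically at T/2
alpha_of_t = lambda t: 0 if t < 0.5 else 1
b = lambda t, y, a: (0.05, -0.10)[a] * y       # regime-dependent drift
sigma = lambda t, y, a: (0.10, 0.30)[a] * y    # regime-dependent volatility
eta = lambda t, y, a: 0.02 * y                 # jump coefficient
rng = np.random.default_rng(42)
path = euler_regime_jump_diffusion(b, sigma, eta, jump_rate=2.0,
                                   alpha_of_t=alpha_of_t, y0=1.0,
                                   T=1.0, n_steps=500, rng=rng)
```

In the paper's setting, the regime path would itself be a simulated Markov chain and the coefficients would also depend on the players' controls; the deterministic switch here is only for illustration.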
Let $f: [0,T] \times \mathbb{R}^N \times S \times U_1 \times U_2 \to \mathbb{R}$, called the profit rate, and $g: \mathbb{R}^N \times S \to \mathbb{R}$, called the terminal gain or bequest function, be $C^1$ functions with respect to $y$. Then, we can define the performance (objective) functional as follows:

$$J(y, e_i, u_1, u_2) = E^{y, e_i}\left[ \int_0^T f(s, Y(s), \alpha(s), u_1(s), u_2(s))\, ds + g(Y^{u_1,u_2}(T), \alpha(T)) \right],$$

for each $i = 1, 2, \ldots, D$, where $(u_1, u_2)$ are the control processes of the targeted problem. We call the control processes admissible, and assume that $\Theta_1$ and $\Theta_2$ are the given families of admissible control processes $u_1 \in U_1$ and $u_2 \in U_2$, respectively, if the following conditions are satisfied:
  • There exists a unique strong solution of the state process $Y(t)$ introduced in Equations (1) and (2) (see Proposition 7.1 in [23] for an existence–uniqueness theorem for such a system).
  • $E\left[ \int_0^T |f(t, Y(t), \alpha(t), u_1(t), u_2(t))|\, dt + |g(Y^{u_1,u_2}(T), \alpha(T))| \right] < \infty$.
In the following section, we develop our first constrained control problem in a zero-sum game theoretic framework.

3. A Zero-Sum Stochastic Differential Game Approach

First, let us recall the mathematical definition of a saddle point, i.e., of the optimal control processes $(u_1^*, u_2^*) \in \Theta_1 \times \Theta_2$ of a zero-sum stochastic differential game problem (if they exist). As we described in [12], assume that

$$J(y, e_i, u_1^*, u_2^*) \ge J(y, e_i, u_1, u_2^*) \quad \text{for all } u_1 \in \Theta_1,\ e_i \in S,\ i = 1, 2, \ldots, D,$$

where we define:

$$J(y, e_i, u_1^*, u_2^*) = \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2^*).$$

Furthermore, suppose that

$$J(y, e_i, u_1^*, u_2^*) \le J(y, e_i, u_1^*, u_2) \quad \text{for all } u_2 \in \Theta_2,\ e_i \in S,\ i = 1, 2, \ldots, D,$$

where we specify:

$$J(y, e_i, u_1^*, u_2^*) = \inf_{u_2 \in \Theta_2} J(y, e_i, u_1^*, u_2).$$

Then, $(u_1^*, u_2^*)$ is a saddle point of the zero-sum stochastic differential game and

$$\phi(y, e_i) = J(y, e_i, u_1^*, u_2^*) = \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2) = \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2)$$

for each $e_i \in S$, $i = 1, 2, \ldots, D$.
Now, we can express our constrained and unconstrained zero-sum stochastic differential game formulations and their relations.
Our constrained zero-sum problem is to find $(u_1^*, u_2^*)$ for the following system:

$$
\begin{aligned}
\phi(y, e_i) &= \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2) = \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2) \\
&= \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} E^{y, e_i}\left[ \int_0^T f(s, Y(s), \alpha(s), u_1(s), u_2(s))\, ds + g(Y^{u_1,u_2}(T), \alpha(T)) \right],
\end{aligned} \tag{3}
$$

for $i = 1, 2, \ldots, D$, subject to the System (1) and (2) and the constraints

$$\mathrm{(i)}\quad E^{y, e_i}[M(Y^{u_1,u_2}(T), \alpha(T))] = 0 \tag{4}$$

or

$$\mathrm{(ii)}\quad M(Y^{u_1,u_2}(T), \alpha(T)) = 0 \quad a.s., \tag{5}$$

where $M: \mathbb{R}^N \to \mathbb{R}$ is a $C^1$ function with respect to $y$. Here, we introduce two types of constraints. For Constraint (4), it is enough to determine a real-valued Lagrange multiplier, while for the stochastic Constraint (5), we have to find a stochastic one. Therefore, we specify the set of stochastic Lagrange multipliers by:

$$\Delta = \{\lambda: \Omega \to \mathbb{R} \mid \lambda \text{ is } \mathcal{F}_T\text{-measurable and } E[|\lambda|] < \infty\}.$$

Moreover, in this case, we assume that $E[|M(Y^{u_1,u_2}(T), \alpha(T))|] < \infty$.

Now, we can define our unconstrained zero-sum stochastic differential game as follows:

$$
\begin{aligned}
\phi^\lambda(y, e_i) &= \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1^\lambda, u_2^\lambda) = \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1^\lambda, u_2^\lambda) \\
&= \sup_{u_1 \in \Theta_1} \Big( \inf_{u_2 \in \Theta_2} E^{y, e_i}\Big[ \int_0^T f(t, Y(t), \alpha(t), u_1(t), u_2(t))\, dt + g(Y^{u_1,u_2}(T), \alpha(T)) \\
&\qquad\qquad + \lambda M(Y^{u_1,u_2}(T), \alpha(T)) \Big] \Big),
\end{aligned} \tag{6}
$$

for $i = 1, 2, \ldots, D$, subject to the System (1) and (2).
Let us provide the following theorem for the constraint type (5):
Theorem 1.
Suppose that for all $\lambda \in \Delta_1 \subseteq \Delta$, we can find $\phi^\lambda(y, e_i)$, $i = 1, 2, \ldots, D$, and a saddle point $(u_1^{*,\lambda}, u_2^{*,\lambda})$ solving the unconstrained stochastic control Problem (6) subject to (1) and (2). Moreover, suppose that there exists $\lambda_0 \in \Delta_1$ such that

$$M(Y_T^{u_1^{*,\lambda_0},\, u_2^{*,\lambda_0}}, e_i) = 0 \quad a.s. \tag{7}$$

for all $e_i \in S$, $i = 1, 2, \ldots, D$. Then, $\phi(y, e_i) = \phi^{\lambda_0}(y, e_i)$, $i = 1, 2, \ldots, D$, and $(u_1^*, u_2^*) = (u_1^{*,\lambda_0}, u_2^{*,\lambda_0})$ solves the constrained stochastic control Problem (3) subject to (1), (2), and (5).
Proof. 
By the definition of the saddle point, we have:

$$
\begin{aligned}
\phi^\lambda(y, e_i) = J(y, e_i, u_1^{*,\lambda}, u_2^{*,\lambda}) &= E^{y, e_i}\Big[ \int_0^T f(t, Y_t^{u_1^{*,\lambda}, u_2^{*,\lambda}}, e_i, u_1^{*,\lambda}, u_2^{*,\lambda})\, dt + g(Y_T^{u_1^{*,\lambda}, u_2^{*,\lambda}}, \alpha_T) + \lambda M(Y_T^{u_1^{*,\lambda}, u_2^{*,\lambda}}, \alpha_T) \Big] \\
&\ge J(y, e_i, u_1^\lambda, u_2^{*,\lambda}) = E^{y, e_i}\Big[ \int_0^T f(t, Y_t^{u_1^\lambda, u_2^{*,\lambda}}, e_i, u_1^\lambda, u_2^{*,\lambda})\, dt + g(Y_T^{u_1^\lambda, u_2^{*,\lambda}}, \alpha_T) + \lambda M(Y_T^{u_1^\lambda, u_2^{*,\lambda}}, \alpha_T) \Big].
\end{aligned}
$$

For the optimal strategy of Player 2, $u_2^{*,\lambda} \in \Theta_2$, $\lambda \in \Delta_1$, in particular for $\lambda = \lambda_0$, and since $u_1 \in \Theta_1$ is feasible in the constrained control problem, based on (7):

$$M(Y_T^{u_1^{*,\lambda_0},\, u_2^{*,\lambda_0}}, e_i) = 0 = M(Y_T^{u_1, u_2^*}, e_i), \quad \text{for } i = 1, 2, \ldots, D. \tag{8}$$

By (8):

$$\phi^{\lambda_0}(y, e_i) = J(y, e_i, u_1^{*,\lambda_0}, u_2^{*,\lambda_0}) \ge J(y, e_i, u_1, u_2^*), \tag{9}$$

for all $u_1 \in \Theta_1$ and $e_i \in S$, $i = 1, 2, \ldots, D$. Moreover, we know that

$$
\begin{aligned}
\phi^\lambda(y, e_i) = J(y, e_i, u_1^{*,\lambda}, u_2^{*,\lambda}) &= E^{y, e_i}\Big[ \int_0^T f(t, Y_t^{u_1^{*,\lambda}, u_2^{*,\lambda}}, e_i, u_1^{*,\lambda}, u_2^{*,\lambda})\, dt + g(Y_T^{u_1^{*,\lambda}, u_2^{*,\lambda}}, \alpha_T) + \lambda M(Y_T^{u_1^{*,\lambda}, u_2^{*,\lambda}}, \alpha_T) \Big] \\
&\le J(y, e_i, u_1^{*,\lambda}, u_2^\lambda) = E^{y, e_i}\Big[ \int_0^T f(t, Y_t^{u_1^{*,\lambda}, u_2^\lambda}, e_i, u_1^{*,\lambda}, u_2^\lambda)\, dt + g(Y_T^{u_1^{*,\lambda}, u_2^\lambda}, \alpha_T) + \lambda M(Y_T^{u_1^{*,\lambda}, u_2^\lambda}, \alpha_T) \Big],
\end{aligned}
$$

for all $u_2 \in \Theta_2$ and $e_i \in S$, $i = 1, 2, \ldots, D$. For the optimal strategy of Player 1, $u_1^{*,\lambda} \in \Theta_1$, $\lambda \in \Delta_1$, in particular for $\lambda = \lambda_0$, and since $u_2 \in \Theta_2$ is feasible in the constrained control problem, based on (7):

$$M(Y_T^{u_1^{*,\lambda_0},\, u_2^{*,\lambda_0}}, e_i) = 0 = M(Y_T^{u_1^*, u_2}, e_i) \quad a.s. \text{ for } i = 1, 2, \ldots, D. \tag{10}$$

Therefore, based on (10):

$$\phi^{\lambda_0}(y, e_i) = J(y, e_i, u_1^{*,\lambda_0}, u_2^{*,\lambda_0}) \le J(y, e_i, u_1^*, u_2), \tag{11}$$

for all $u_2 \in \Theta_2$ and $e_i \in S$, $i = 1, 2, \ldots, D$. Consequently, combining (9)–(11), we obtain:

$$J(y, e_i, u_1, u_2^*) \le J(y, e_i, u_1^{*,\lambda_0}, u_2^{*,\lambda_0}) = \phi^{\lambda_0}(y, e_i) \le J(y, e_i, u_1^*, u_2)$$

for any feasible $(u_1, u_2) \in \Theta_1 \times \Theta_2$ and for all $e_i \in S$, $i = 1, 2, \ldots, D$.
Then,

$$J(y, e_i, u_1^{*,\lambda_0}, u_2^{*,\lambda_0}) \le \inf_{u_2 \in \Theta_2} J(y, e_i, u_1^*, u_2) \le \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2).$$

Moreover,

$$J(y, e_i, u_1^{*,\lambda_0}, u_2^{*,\lambda_0}) \ge \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2^*) \ge \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2).$$

Hence, we obtain:

$$\sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2) \ge \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2).$$

Since we always have

$$\sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2) \le \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2),$$

we finally obtain

$$\phi(y, e_i) = \sup_{u_1 \in \Theta_1} \inf_{u_2 \in \Theta_2} J(y, e_i, u_1, u_2) = \inf_{u_2 \in \Theta_2} \sup_{u_1 \in \Theta_1} J(y, e_i, u_1, u_2) = \phi^{\lambda_0}(y, e_i),$$

for $i = 1, 2, \ldots, D$.
This completes the proof. □
We can prove the following theorem similarly for the constraint type (4).
Theorem 2.
Suppose that for all $\lambda \in K \subseteq \mathbb{R}$, we can find $\phi^\lambda(y, e_i)$, $i = 1, 2, \ldots, D$, and a saddle point $(u_1^{*,\lambda}, u_2^{*,\lambda})$ solving the unconstrained stochastic control Problem (6) subject to (1) and (2). Moreover, suppose that there exists $\lambda_0 \in K$ such that

$$E[M(Y_T^{u_1^{*,\lambda_0},\, u_2^{*,\lambda_0}}, e_i)] = 0,$$

for all $e_i \in S$, $i = 1, 2, \ldots, D$. Then, $\phi(y, e_i) = \phi^{\lambda_0}(y, e_i)$, $i = 1, 2, \ldots, D$, and $(u_1^*, u_2^*) = (u_1^{*,\lambda_0}, u_2^{*,\lambda_0})$ solves the constrained stochastic control Problem (3) subject to (1), (2), and (4).
In this section, we extended Theorem 11.3.1 in [20] to a zero-sum stochastic differential game formulation within the framework of regime switches.

4. A Nonzero-Sum Stochastic Differential Game Approach

By solving a nonzero-sum stochastic differential game, we aim to find a pair of optimal control processes corresponding to the Nash equilibrium of the two-player game, if it exists. Recall that a Nash equilibrium is a self-enforcing strategy profile, i.e., each player knows that a profitable unilateral deviation is not possible. This also means that each player's strategy is optimal, i.e., the best response, against the other player's strategy. A mathematical definition of the Nash equilibrium can then be introduced, as we described in [12]:
Let $u_1 \in \Theta_1$ and $u_2 \in \Theta_2$ be two admissible control processes for Player 1 and Player 2, respectively. We define the performance criterion for each player as follows:

$$J_k(y, e_i, u_1, u_2) = E^{y, e_i}\left[ \int_0^T f_k(s, Y(s), \alpha(s), u_1(s), u_2(s))\, ds + g_k(Y^{u_1,u_2}(T), \alpha(T)) \right],$$

for each $e_i \in S$, $i = 1, 2, \ldots, D$, and both players propose to maximize their payoffs with respect to the other player's best action as follows:

$$J_1(y, e_i, u_1^*, u_2^*) = \sup_{u_1 \in \Theta_1} J_1(y, e_i, u_1, u_2^*), \tag{12}$$

$$J_2(y, e_i, u_1^*, u_2^*) = \sup_{u_2 \in \Theta_2} J_2(y, e_i, u_1^*, u_2), \tag{13}$$

for each $e_i \in S$ and for all $y \in G$, where $G$ is an open subset of $\mathbb{R}^N$ corresponding to a solvency region for the state processes.
Definition 1
([12]). Let us assume that, for the optimal strategy of Player 2, $u_2^* \in \Theta_2$, the best response of Player 1 satisfies

$$J_1(y, e_i, u_1, u_2^*) \le J_1(y, e_i, u_1^*, u_2^*) \quad \text{for all } u_1 \in \Theta_1,\ e_i \in S,\ y \in G,$$

and, for the optimal strategy of Player 1, $u_1^* \in \Theta_1$, the best response of Player 2 satisfies

$$J_2(y, e_i, u_1^*, u_2) \le J_2(y, e_i, u_1^*, u_2^*) \quad \text{for all } u_2 \in \Theta_2,\ e_i \in S,\ y \in G.$$

Then, the pair of optimal control processes $(u_1^*, u_2^*) \in \Theta_1 \times \Theta_2$ is called a Nash equilibrium for the stochastic differential game of the System (1)–(2) and (12)–(13).
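The two best-response inequalities of Definition 1 have a simple finite analogue that can be checked exhaustively. The toy bimatrix game below, with our own illustrative payoffs (the function `is_nash` and the matrices are not part of the paper), tests every pure strategy pair against unilateral deviations:

```python
import numpy as np

def is_nash(A, B, i, j):
    """Pure-strategy Nash check for a bimatrix game: A[i, j] is Player 1's
    payoff, B[i, j] Player 2's. The pair (i, j) is an equilibrium iff neither
    player gains from a unilateral deviation, mirroring the two inequalities
    of Definition 1."""
    return A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max()

# toy coordination game with two pure equilibria, (0, 0) and (1, 1)
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[2.0, 0.0], [0.0, 1.0]])
equilibria = [(i, j) for i in range(2) for j in range(2) if is_nash(A, B, i, j)]
```

In the continuous setting of this paper, the strategy sets $\Theta_1, \Theta_2$ are infinite-dimensional, so the equilibrium is characterized via the maximum principle rather than by enumeration; the discrete check only illustrates the fixed-point nature of the definition.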
Our constrained nonzero-sum stochastic differential game is to find $(u_1^*, u_2^*)$ for Problems (12) and (13) subject to the System (1) and (2) and

$$\mathrm{(i)}\quad E^{y, e_i}[M_k(Y^{u_1,u_2}(T), \alpha(T))] = 0 \tag{14}$$

or

$$\mathrm{(ii)}\quad M_k(Y^{u_1,u_2}(T), \alpha(T)) = 0 \quad a.s., \tag{15}$$

where $M_k: \mathbb{R}^N \to \mathbb{R}$, $k = 1, 2$, are $C^1$ functions with respect to $y$, and we assume that $E[|M_k(Y^{u_1,u_2}(T), \alpha(T))|] < \infty$, $k = 1, 2$.

Finally, our unconstrained nonzero-sum stochastic differential game problem is described as follows:

$$
\begin{aligned}
\phi_k^{\lambda_k}(y, e_i) = J_k(y, e_i, u_1^{*,\lambda_1}, u_2^{*,\lambda_2}) = \sup_{u_k \in \Theta_k} E^{y, e_i}\Big[ & \int_0^T f_k(t, Y(t), \alpha(t), u_1(t), u_2(t))\, dt \\
& + g_k(Y^{u_1,u_2}(T), \alpha(T)) + \lambda_k M_k(Y^{u_1,u_2}(T), \alpha(T)) \Big],
\end{aligned} \tag{16}
$$

for $k = 1, 2$ and $e_i \in S$, $i = 1, 2, \ldots, D$, subject to the System (1) and (2).
Theorem 3.
Suppose that for all $\lambda_k \in \Delta_k \subseteq \Delta$, we can find $\phi_k^{\lambda_k}(y, e_i)$, $i = 1, 2, \ldots, D$, $k = 1, 2$, and a Nash equilibrium $(u_1^{*,\lambda_1}, u_2^{*,\lambda_2})$ solving the unconstrained stochastic control Problems (16) for each player. Moreover, suppose that there exist $\lambda_k^0 \in \Delta_k \subseteq \Delta$, $k = 1, 2$, such that

$$M_1(Y_T^{u_1^{*,\lambda_1^0},\, u_2^{*,\lambda_2}}, e_i) = 0 \quad \text{and} \quad M_2(Y_T^{u_1^{*,\lambda_1},\, u_2^{*,\lambda_2^0}}, e_i) = 0 \quad a.s., \tag{17}$$

for all $e_i \in S$, $i = 1, 2, \ldots, D$. Then, $\phi_k(y, e_i) = \phi_k^{\lambda_k^0}(y, e_i)$, $k = 1, 2$, $i = 1, 2, \ldots, D$, and $(u_1^*, u_2^*) = (u_1^{*,\lambda_1^0}, u_2^{*,\lambda_2^0})$ solves the constrained stochastic control problem.
Proof. 
By the definition of the Nash equilibrium, we have

$$
\begin{aligned}
J_1(y, e_i, u_1^{*,\lambda_1}, u_2^{*,\lambda_2}) &= E^{y, e_i}\Big[ \int_0^T f_1(t, Y_t^{u_1^{*,\lambda_1}, u_2^{*,\lambda_2}}, e_i, u_1^{*,\lambda_1}, u_2^{*,\lambda_2})\, dt + g_1(Y_T^{u_1^{*,\lambda_1}, u_2^{*,\lambda_2}}, \alpha_T) + \lambda_1 M_1(Y_T^{u_1^{*,\lambda_1}, u_2^{*,\lambda_2}}, \alpha_T) \Big] \\
&\ge J_1(y, e_i, u_1^{\lambda_1}, u_2^{*,\lambda_2}) = E^{y, e_i}\Big[ \int_0^T f_1(t, Y_t^{u_1^{\lambda_1}, u_2^{*,\lambda_2}}, e_i, u_1^{\lambda_1}, u_2^{*,\lambda_2})\, dt + g_1(Y_T^{u_1^{\lambda_1}, u_2^{*,\lambda_2}}, \alpha_T) + \lambda_1 M_1(Y_T^{u_1^{\lambda_1}, u_2^{*,\lambda_2}}, \alpha_T) \Big].
\end{aligned}
$$

For the optimal strategy of Player 2, $u_2^{*,\lambda_2} \in \Theta_2$, $\lambda_2 \in \Delta_2$, since $u_1 \in \Theta_1$ is feasible in the constrained control problem, if $\lambda_1 = \lambda_1^0$, then, based on (15) and (17):

$$M_1(Y_T^{u_1^{*,\lambda_1^0},\, u_2^{*,\lambda_2}}, e_i) = 0 = M_1(Y_T^{u_1, u_2^*}, e_i), \quad a.s. \text{ for } i = 1, 2, \ldots, D. \tag{18}$$

Based on (18):

$$J_1(y, e_i, u_1^{*,\lambda_1^0}, u_2^{*,\lambda_2}) \ge J_1(y, e_i, u_1, u_2^*), \tag{19}$$

for all $u_1 \in \Theta_1$ and $e_i \in S$, $i = 1, 2, \ldots, D$.
Similarly, we can obtain

$$J_2(y, e_i, u_1^{*,\lambda_1}, u_2^{*,\lambda_2^0}) \ge J_2(y, e_i, u_1^*, u_2), \tag{20}$$

for all $u_2 \in \Theta_2$ and $e_i \in S$, $i = 1, 2, \ldots, D$. Therefore, by the definition of the Nash equilibrium, the inequalities (19) and (20) complete the proof. □
We can easily develop a similar theorem for the constraint type (14) as well:
Theorem 4.
Suppose that for all $\lambda_k \in A_k \subseteq \mathbb{R}$, we can find $\phi_k^{\lambda_k}(y, e_i)$, $i = 1, 2, \ldots, D$, $k = 1, 2$, and a Nash equilibrium $(u_1^{*,\lambda_1}, u_2^{*,\lambda_2})$ solving the unconstrained stochastic control Problems (16) for each player. Moreover, suppose that there exist $\lambda_k^0 \in A_k \subseteq \mathbb{R}$, $k = 1, 2$, such that

$$E[M_1(Y_T^{u_1^{*,\lambda_1^0},\, u_2^{*,\lambda_2}}, e_i)] = 0 \quad \text{and} \quad E[M_2(Y_T^{u_1^{*,\lambda_1},\, u_2^{*,\lambda_2^0}}, e_i)] = 0,$$

for all $e_i \in S$, $i = 1, 2, \ldots, D$. Then, $\phi_k(y, e_i) = \phi_k^{\lambda_k^0}(y, e_i)$, $k = 1, 2$, $i = 1, 2, \ldots, D$, and $(u_1^*, u_2^*) = (u_1^{*,\lambda_1^0}, u_2^{*,\lambda_2^0})$ solves the constrained stochastic control problem described above.
In this section, we extended Theorem 11.3.1 in [20] to a nonzero-sum stochastic differential game formulation within the framework of regime switches.
Remark 1.
First, we should indicate that Theorems 1–4 can be applied with both the dynamic programming principle and the stochastic maximum principle, under the specific technical conditions that arise from the nature of the corresponding technique.
In this work, we provide an application of the nonzero-sum game formulation with the stochastic maximum principle. The technical conditions and problem formulation within the framework of the stochastic maximum principle for a Markov regime-switching jump-diffusion system of such a game have already been developed in [7], although without a Lagrangian setting. Hence, we only recall an appropriate version of the sufficient maximum principle in the appendix.
On the other hand, while applying Theorem A1 to a Lagrangian problem, one should ensure that the functions $g_k^{\lambda_k}(y, e_i) = g_k(y, e_i) + \lambda_k M_k(y, e_i)$ are concave and $C^1$ with respect to $y$ for $i = 1, 2, \ldots, D$ and $k = 1, 2$.

5. An Application: Bancassurance

In this section, we provide an application of Theorem 4 within the framework of a collaboration between a bank and an insurance company, which illustrates an example of a well-known concept: Bancassurance.
The core of the joint venture between the bank and the insurer is to strengthen their business objectives through a shared client database, product development, and coordination. Insurance companies and banks benefit from this long-term cooperation in many ways. For example, insurance companies may create new and more efficient financial instruments with the help of the banks' experience. Furthermore, they can reach the banks' wide customer portfolios without investing in more offices and manpower. Hence, the insurance companies may reduce costs while increasing sales. On the other hand, banks can also increase their income, diversify their offerings of financial products, and, by providing different services under one roof, gain more customer loyalty and satisfaction. Some financial and actuarial aspects of this legal and independent organizational entity, bancassurance, can be found in [24,25,26] and references therein.
Basically, in our formulation, the insurance company gives a certain amount of its surplus to the bank as a commission, which becomes the initial value of the bank’s wealth process. Moreover, in this application, we can approach our regime switching model from two different sides:
  • The states of the Markov chain may represent switches between different states of the economy, such as a shift from recession to growth periods, macroeconomic indicators, regulation changes, or a radical change like a financial crisis. All these impulses affect the cash flows of both the insurance company and the bank within the framework of decision making, dividend management, commissions, the number of clients, etc. In this case, we may assume a time-homogeneous, irreducible Markov chain.
  • We can see the shifts as the states of a life insurance policy, i.e., the states of the Markov chain may represent the injured, dead, or alive cases. We may assume that the bank makes an investment whose cash flow is affected by the abrupt changes experienced by the insured, for example by investing in the stocks of an insurance company. In this case, time-inhomogeneous and reducible Markov chains may be utilized.
First, let us introduce the dynamics of the wealth process of the insurance company. Let $\tilde{p}(t)$ represent the deterministic premium rate at time $t \in [0,T]$ for each claim. An insurance premium is the amount of money that has to be paid by an insured for an insurance policy covering risks such as healthcare, auto, home, and life insurance. The premium is income for an insurance company, which is used to cover the claims against the policy. Additionally, companies may utilize premiums to make investments and increase their own wealth. In our application, we focus on a life insurance policy.
Now, we can describe the dynamics of the surplus process $R(t)$ as follows:

$$dR(t) = a(t, \alpha(t))\, dt + \sigma_1(t, \alpha(t))\, dW_1(t) + \sum_{i,j=1,\, i \neq j}^D \gamma_{ij}(t)\, (dJ_{ij}(t) - \mu_{ij}(t)\, dt) = a(t, \alpha(t))\, dt + \sigma_1(t, \alpha(t))\, dW_1(t) + \gamma(t, \alpha(t))\, d\tilde{\Phi}(t).$$

Here, for $t \in [0,T]$, $\sigma_1(t, \alpha(t))$ denotes the instantaneous volatility of the aggregate insurance claims at time $t$, and $a(t, \alpha(t))$ specifies the payments of the insurer due during sojourns in state $i$. Finally, $\gamma_{ij}(\cdot)$ determines the claims of the insurance company to the insured due upon a transition from state $i$ to state $j$. In this context, $\alpha(\cdot)$ is a time-inhomogeneous Markov chain, and $\mu_{ij}(\cdot)$ indicates the intensity of the chain, which corresponds to the mortality rate for a life insurance contract. Furthermore, $J_{ij}(t)$ denotes the number of transitions into state $j$ up to and including time $t \in [0,T]$ for the associated càdlàg counting process, for which we use a martingale form. In this way, we describe a risk exchange between insurer and insured: while the company pays out the amount of the insurance claim $\gamma_{ij}(t)$ upon a transition to state $j$, the policyholder has to pay the amount $\gamma_{ij}(t)\, \mu_{ij}(t)$ while she is in state $i$, for $t \in [0,T]$.
In our application, we suppose that the insurance company pays part of its surplus to its shareholders, which is called dividend distribution; this is the only control process of the company and is defined as follows:

$$dD(t) = \delta(t)\, dt.$$

Hence, we represent the wealth (cash) process $X_1(t)$, $t \in [0,T]$, of the insurance company by:

$$dX_1(t) = \tilde{p}(t)\, dt - dR(t) - dD(t), \qquad X_1(0) = u - c,$$

where $u$ and $c$ are nonnegative constants that correspond to the initial surplus of the insurance company and the commission paid to the bank at time $t = 0$, respectively.
We focus on the cash flow of the bank that is generated solely by investing the commissions gathered via the bancassurance agreement, rather than on the bank's other investments.
Let us introduce the wealth (cash) process of the bank:

$$dX_2(t) = X_2(t)\left[ u(t)\, dt + \sigma_2(t, \alpha(t))\, dW_2(t) + \int_{\mathbb{R}_0} \eta(t, \alpha(t), z)\, \tilde{N}(dt, dz) \right], \qquad X_2(0) = c, \quad t \in [0,T],$$

where the appreciation rate is not given a priori. Specifically, $u(\cdot)$ is a control process depending on the interaction between the bank and the insurer.
Now, we can present the state process of the system as follows:

$$
dY(t) = \begin{pmatrix} dX_1(t) \\ dX_2(t) \end{pmatrix}
= \begin{pmatrix} \tilde{p}(t) - a(t, \alpha(t)) - \delta(t) \\ X_2(t)\, u(t) \end{pmatrix} dt
+ \begin{pmatrix} -\sigma_1(t, \alpha(t)) & 0 \\ 0 & X_2(t)\, \sigma_2(t, \alpha(t)) \end{pmatrix}
\begin{pmatrix} dW_1(t) \\ dW_2(t) \end{pmatrix}
+ \begin{pmatrix} 0 \\ X_2(t) \int_{\mathbb{R}_0} \eta(t, \alpha(t), z)\, \tilde{N}(dt, dz) \end{pmatrix}
+ \begin{pmatrix} -\gamma(t, \alpha(t)) \\ 0 \end{pmatrix} d\tilde{\Phi}(t), \tag{23}
$$

with initial values

$$Y(0) = \begin{pmatrix} X_1(0) \\ X_2(0) \end{pmatrix} = \begin{pmatrix} u - c \\ c \end{pmatrix} > 0.$$

Here, we assume that $W_1$ and $W_2$ are independent Brownian motions; moreover, $a$, $\sigma_1$, $\sigma_2$, $\eta$, and $\gamma$ are square-integrable, measurable functions.
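To build intuition for the two wealth processes, the following sketch simulates Euler discretizations with constant coefficients, dropping the regime-switching and claim-jump terms for brevity; the function `simulate_wealth` and all parameter values are hypothetical illustrations, not calibrated to the model:

```python
import numpy as np

def simulate_wealth(p, a, sigma1, delta, u, sigma2, x1_0, x2_0, T, n, rng):
    """Euler scheme for simplified insurer (X1) and bank (X2) wealth processes:
    X1 has additive dynamics (premiums in, claim drift and dividends out),
    X2 has multiplicative dynamics driven by the appreciation rate u."""
    dt = T / n
    x1 = np.empty(n + 1)
    x2 = np.empty(n + 1)
    x1[0], x2[0] = x1_0, x2_0
    for k in range(n):
        dW1, dW2 = rng.normal(0.0, np.sqrt(dt), size=2)  # independent Brownian increments
        # insurer: dX1 = (p - a - delta) dt - sigma1 dW1
        x1[k + 1] = x1[k] + (p - a - delta) * dt - sigma1 * dW1
        # bank: dX2 = X2 (u dt + sigma2 dW2)
        x2[k + 1] = x2[k] * (1.0 + u * dt + sigma2 * dW2)
    return x1, x2

rng = np.random.default_rng(1)
x1, x2 = simulate_wealth(p=1.5, a=1.0, sigma1=0.2, delta=0.3,
                         u=0.05, sigma2=0.1, x1_0=5.0, x2_0=1.0,
                         T=1.0, n=252, rng=rng)
```

In the full model, the drift, volatility, and jump coefficients would depend on the Markov chain $\alpha(t)$, and $\delta(\cdot)$ and $u(\cdot)$ would be the players' control processes rather than constants.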
Let us describe the performance functionals of the insurer and the bank by $J_1(\delta,u^*)$ and $J_2(\delta^*,u)$, respectively:
$$J_1(\delta,u^*) = E_{x,e_i}\Big[\int_0^T \frac{1}{1-\kappa_1}\,h_1(t,\alpha(t))\,\delta(t)^{1-\kappa_1}\,dt - X_2^2(T)\Big],$$
and
$$J_2(\delta^*,u) = E_{x,e_i}\Big[\int_0^T h_2(t,\alpha(t))\ln(u(t))\,dt + \kappa_2\,X_1(T)\Big],$$
where $\kappa_1 > 0$, $\kappa_1 \neq 1$, $\kappa_2 \in \mathbb{R}$, and $h_1, h_2$ are square-integrable, measurable functions. Furthermore, $\tilde r(t,e_i) = \tilde r_i$ for constants $\tilde r_1, \tilde r_2, \ldots, \tilde r_D$, $i = 1,2,\ldots,D$; these are constant in each state on $[0,T]$ and can be seen as the interest rates in the different states of the economy.
Then, our problem is to find $(\delta^*, u^*)$ by solving:
$$J_1(\delta^*,u^*) = \sup_{\delta \in \Theta_1} J_1(\delta,u^*)$$
subject to the System (23) and
$$E[X_1(T)] = K_1,$$
and
$$J_2(\delta^*,u^*) = \sup_{u \in \Theta_2} J_2(\delta^*,u)$$
subject to the System (23) and
$$E\big[e^{-\tilde r(T,\alpha(T))}\ln(X_2(T))\big] = K_2.$$
Here, Player 1 wants to maximize $h_1(\cdot,e_i)$ times the power utility of her dividend process while penalizing the deviation of the terminal value of Player 2's wealth from 0. Furthermore, Player 1 sets a goal of reaching the level $K_1$ for the expected terminal value of her wealth process. On the other side, Player 2 aims to maximize $h_2(\cdot,e_i)$ times the logarithmic utility of her appreciation rate plus $\kappa_2$ times the terminal value of the insurer's wealth process. Moreover, Player 2 sets a target of reaching the level $K_2$ for the discounted logarithm of her terminal value in the sense of expected values.
Finally, if we consider this nonzero-sum game problem in terms of the Lagrangian formulation described in Theorem 4, our problem becomes to find ( δ * , u * ) for:
$$J_1(\delta^*,u^*) = \sup_{\delta \in \Theta_1} E_{x,e_i}\Big[\int_0^T \frac{1}{1-\kappa_1}\,h_1(t,\alpha(t))\,\delta(t)^{1-\kappa_1}\,dt - X_2^2(T) + \lambda_1\big(X_1(T) - K_1\big)\Big],$$
and
$$J_2(\delta^*,u^*) = \sup_{u \in \Theta_2} E_{x,e_i}\Big[\int_0^T h_2(t,\alpha(t))\ln u(t)\,dt + \kappa_2\,X_1(T) + \lambda_2\big(e^{-\tilde r(T,\alpha(T))}\ln X_2(T) - K_2\big)\Big].$$
Now, we can provide the corresponding Hamiltonian functions for each player and solve them using Theorem A1 (see Appendix A):
$$\begin{aligned} H_1\big(t,y,\delta,u,p^1,q^1,r^1(\cdot),w^1,e_i\big) ={}& \frac{1}{1-\kappa_1}\,\delta^{1-\kappa_1}\,h_1(t,e_i) + \big(\tilde p - a(t,e_i) - \delta\big)p_1^1 + x_2\,u\,p_2^1 - \sigma_1(t,e_i)\,q_{11}^1 \\ &+ x_2\,\sigma_2(t,e_i)\,q_{22}^1 + x_2\int_{\mathbb{R}_0}\eta(t,e_i,z)\,r_2^1(t,z)\,\nu(dz) - \sum_{j=1}^{D}\gamma_{ij}(t)\,w_1^{1,j}\,\mu_{ij}(t), \end{aligned}$$
and
$$\begin{aligned} H_2\big(t,y,\delta,u,p^2,q^2,r^2(\cdot),w^2,e_i\big) ={}& h_2(t,e_i)\ln(u) + \big(\tilde p - a(t,e_i) - \delta\big)p_1^2 + x_2\,u\,p_2^2 - \sigma_1(t,e_i)\,q_{11}^2 \\ &+ x_2\,\sigma_2(t,e_i)\,q_{22}^2 + x_2\int_{\mathbb{R}_0}\eta(t,e_i,z)\,r_2^2(t,z)\,\nu(dz) - \sum_{j=1}^{D}\gamma_{ij}(t)\,w_1^{2,j}\,\mu_{ij}(t). \end{aligned}$$
We propose to solve the corresponding BSDEs with jumps and regimes and find the following:
$$p^k(t) = \begin{pmatrix} p_1^k(t) \\ p_2^k(t) \end{pmatrix}, \quad q^k(t) = \begin{pmatrix} q_{11}^k(t) & q_{12}^k(t) \\ q_{21}^k(t) & q_{22}^k(t) \end{pmatrix}, \quad r^k(t,z) = \begin{pmatrix} r_1^k(t,z) \\ r_2^k(t,z) \end{pmatrix}, \quad w^k(t) = \begin{pmatrix} w_1^k(t) \\ w_2^k(t) \end{pmatrix} = \begin{pmatrix} w_1^{k,j}(t) \\ w_2^{k,j}(t) \end{pmatrix}, \quad t \in [0,T],$$
for $k = 1,2$ and $j = 1,2,\ldots,D$.
Firstly, let us solve the adjoint equations corresponding to H 1 :
$$dp_1^1(t) = q_{11}^1(t)\,dW_1(t) + q_{12}^1(t)\,dW_2(t) + \int_{\mathbb{R}_0} r_1^1(t,z)\,\tilde N(dt,dz) + w_1^1(t)\,d\tilde\Phi(t), \qquad p_1^1(T) = \lambda_1,$$
and
$$\begin{aligned} dp_2^1(t) ={}& -\Big[u(t)\,p_2^1(t) + \sigma_2(t,\alpha(t))\,q_{22}^1(t) + \int_{\mathbb{R}_0}\eta(t,\alpha(t),z)\,r_2^1(t,z)\,\nu(dz)\Big]dt \\ &+ q_{21}^1(t)\,dW_1(t) + q_{22}^1(t)\,dW_2(t) + \int_{\mathbb{R}_0} r_2^1(t,z)\,\tilde N(dt,dz) + w_2^1(t)\,d\tilde\Phi(t), \qquad p_2^1(T) = -2X_2(T), \end{aligned}$$
where $w_k^1(t)\,d\tilde\Phi(t) = \sum_{j=1}^{D} w_k^{1,j}(t)\,d\tilde\Phi_j(t)$ for $k = 1,2$.
To find a solution for $p_2^1(t)$, $t \in [0,T]$, we try:
$$p_2^1(t) = \phi(t,\alpha(t))\,X_2(t), \qquad \phi(T,\alpha(T)) = \phi(T,e_i) = -2,$$
where ϕ ( · , e i ) is a C 1 deterministic function for all e i S , i = 1 , 2 , , D with the given terminal value. We apply Itô’s formula as described in [6]:
$$\begin{aligned} dp_2^1(t) ={}& \Big[\phi'(t,\alpha(t))\,X_2(t) + \phi(t,\alpha(t))\,X_2(t)\,u(t) + \sum_{j=1}^{D} X_2(t)\big(\phi(t,e_j) - \phi(t,\alpha(t))\big)\mu_{ij}(t)\Big]dt \\ &+ \phi(t,\alpha(t))\,X_2(t)\,\sigma_2(t,\alpha(t))\,dW_2(t) + \int_{\mathbb{R}_0}\phi(t,\alpha(t))\,X_2(t)\,\eta(t,\alpha(t),z)\,\tilde N(dt,dz) \\ &+ \sum_{j=1}^{D} X_2(t)\big(\phi(t,e_j) - \phi(t,\alpha(t))\big)\,d\tilde\Phi_j(t). \end{aligned}$$
Now, we compare Equations (25) and (26) and obtain the following solutions for p 2 1 ( t ) , q 21 1 ( t ) , q 22 1 ( t ) , r 2 1 ( t , z ) and w 2 1 ( t ) for t [ 0 , T ] :
$$q_{21}^1(t) = 0, \qquad q_{22}^1(t) = \phi(t,e_i)\,X_2(t)\,\sigma_2(t,e_i), \qquad r_2^1(t,z) = \phi(t,e_i)\,X_2(t)\,\eta(t,e_i,z), \qquad w_2^{1,j}(t) = \big(\phi(t,e_j) - \phi(t,e_i)\big)X_2(t),$$
and
$$-\phi(t,e_i)\,X_2(t)\Big[u(t) + \sigma_2^2(t,e_i) + \int_{\mathbb{R}_0}\eta^2(t,e_i,z)\,\nu(dz)\Big] = X_2(t)\Big[\phi'(t,e_i) + \phi(t,e_i)\,u(t) + \sum_{j=1}^{D}\big(\phi(t,e_j) - \phi(t,e_i)\big)\mu_{ij}(t)\Big].$$
Hence,
$$X_2(t)\Big[\phi'(t,e_i) + \phi(t,e_i)\Big(2u^*(t) + \sigma_2^2(t,e_i) + \int_{\mathbb{R}_0}\eta^2(t,e_i,z)\,\nu(dz)\Big) + \sum_{j=1}^{D}\big(\phi(t,e_j) - \phi(t,e_i)\big)\mu_{ij}(t)\Big] = 0.$$
Let us call
$$B(t,e_i) = 2u^*(t) + \sigma_2^2(t,e_i) + \int_{\mathbb{R}_0}\eta^2(t,e_i,z)\,\nu(dz).$$
Then, we obtain the following $D$-coupled differential equation with its terminal value:
$$\phi'(t,e_i) + \phi(t,e_i)\,B(t,e_i) + \sum_{j=1}^{D}\big(\phi(t,e_j) - \phi(t,e_i)\big)\mu_{ij}(t) = 0, \qquad \phi(T,e_i) = -2, \quad i = 1,2,\ldots,D.$$
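When the chain is time-homogeneous and $B(\cdot,e_i)$ is constant in each state, this terminal-value system is a linear ODE system, $\phi' = -(\mathrm{diag}(B) + Q)\,\phi$, with $Q$ the generator of the chain, and can be integrated backward in time. The sketch below uses assumed two-state data and the terminal value $-2$ coming from the $-X_2^2(T)$ penalty; all numbers are hypothetical.

```python
import numpy as np

# Hypothetical constant data for a two-state chain (assumed values).
B = np.array([0.12, 0.30])                # B(e_i) = 2u* + sigma2^2 + int eta^2 nu(dz)
Q = np.array([[-0.5, 0.5], [0.3, -0.3]])  # generator: Q[i, j] = mu_ij for i != j, rows sum to 0
T = 1.0
phi_T = np.array([-2.0, -2.0])            # terminal condition phi(T, e_i) = -2

def phi(t, n=2000):
    # Solve phi' = -(diag(B) + Q) phi backward from T to t.
    # Substituting tau = T - s turns this into a forward system, integrated by RK4.
    M = np.diag(B) + Q
    h = (T - t) / n
    y = phi_T.copy()
    for _ in range(n):
        k1 = M @ y
        k2 = M @ (y + h / 2 * k1)
        k3 = M @ (y + h / 2 * k2)
        k4 = M @ (y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

print(np.round(phi(0.0), 4))
```

Since $B > 0$ in both states, the Feynman-Kac representation implies $\phi(t,e_i) < -2$ for $t < T$, which the numerical solution reproduces.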
Finally, by applying the Feynman–Kac procedure:
$$\phi(t,e_i) = -2\,E\Big[\exp\Big(\int_t^T B(s,\alpha(s))\,ds\Big)\,\Big|\,\alpha(t) = e_i\Big], \qquad i = 1,2,\ldots,D.$$
Moreover, we can find $p_1^1(t)$, $t \in [0,T]$, by trying $p_1^1(t) = g(t)$, where $g(\cdot)$ is a deterministic function with terminal value $g(T) = \lambda_1$.
Then, according to Equation (24):
$$p_1^1(t) = \lambda_1, \qquad q_{11}^1(t) = q_{12}^1(t) = r_1^1(t,z) = w_1^1(t) = 0.$$
Now, let us differentiate H 1 with respect to δ to define the optimal control process for the insurance company:
$$\delta^{-\kappa_1}(t)\,h_1(t,\alpha(t)) - p_1^1(t) = 0.$$
Then,
$$\delta^*(t) = \Big(\frac{\lambda_1}{h_1(t,\alpha(t))}\Big)^{-1/\kappa_1}.$$
Finally, by taking expectations on both sides of Equation (22) and using the constraint for Player 1, we determine $\lambda_1$:
$$\lambda_1 = \left(\frac{u - c - K_1 + E\Big[\int_0^T\big(\tilde p(t) - a(t,\alpha(t))\big)\,dt\Big]}{E\Big[\int_0^T h_1^{1/\kappa_1}(t,\alpha(t))\,dt\Big]}\right)^{-\kappa_1}.
$$
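The two expectations in the expression for $\lambda_1$ involve only the Markov chain, so they can be estimated by Monte Carlo simulation of the chain alone; the resulting $\lambda_1$ then gives the optimal dividend rate state by state. The sketch below uses assumed two-state data (all parameter values are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data (assumed values): two-state chain, constant coefficients.
T, n, paths = 1.0, 200, 2000
dt = T / n
mu = np.array([[0.0, 0.4], [0.6, 0.0]])  # chain intensities mu_ij
p_tilde = 1.0
a = np.array([0.4, 0.6])
h1 = np.array([0.8, 1.2])
kappa1 = 0.5
u0, c, K1 = 5.0, 0.5, 4.6                # initial surplus, commission, target level

num = den = 0.0
for _ in range(paths):
    s, I1, I2 = 0, 0.0, 0.0
    for _ in range(n):
        I1 += (p_tilde - a[s]) * dt          # contributes to E int (p~ - a) dt
        I2 += h1[s] ** (1 / kappa1) * dt     # contributes to E int h1^(1/kappa1) dt
        if rng.random() < mu[s, 1 - s] * dt:
            s = 1 - s
    num += I1
    den += I2
num, den = num / paths, den / paths

lam1 = ((u0 - c - K1 + num) / den) ** (-kappa1)

def delta_star(h):
    # Optimal dividend rate delta*(t) = (h1 / lambda1)^(1/kappa1) in a state with h1 = h.
    return (h / lam1) ** (1 / kappa1)

print(round(lam1, 4), round(delta_star(h1[0]), 4))
```

Note that the target $K_1$ must be small enough that the numerator stays positive; otherwise, the constraint is infeasible for nonnegative dividends.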
Now, let us represent the adjoint equations for the Hamiltonian of the second player:
$$dp_1^2(t) = q_{11}^2(t)\,dW_1(t) + q_{12}^2(t)\,dW_2(t) + \int_{\mathbb{R}_0} r_1^2(t,z)\,\tilde N(dt,dz) + w_1^2(t)\,d\tilde\Phi(t), \qquad p_1^2(T) = \kappa_2,$$
and
$$\begin{aligned} dp_2^2(t) ={}& -\Big[u(t)\,p_2^2(t) + \sigma_2(t,\alpha(t))\,q_{22}^2(t) + \int_{\mathbb{R}_0}\eta(t,\alpha(t),z)\,r_2^2(t,z)\,\nu(dz)\Big]dt + q_{21}^2(t)\,dW_1(t) \\ &+ q_{22}^2(t)\,dW_2(t) + \int_{\mathbb{R}_0} r_2^2(t,z)\,\tilde N(dt,dz) + w_2^2(t)\,d\tilde\Phi(t), \qquad p_2^2(T) = \lambda_2\,e^{-\tilde r(T,\alpha(T))}\,X_2^{-1}(T), \end{aligned}$$
where $w_k^2(t)\,d\tilde\Phi(t) = \sum_{j=1}^{D} w_k^{2,j}(t)\,d\tilde\Phi_j(t)$ for $k = 1,2$.
Now, let us solve these adjoint equations. Firstly, we try:
$$p_2^2(t) = A(t,\alpha(t))\,X_2^{-1}(t), \quad t \in [0,T], \qquad A(T,\alpha(T)) = A(T,e_k) = \lambda_2\,e^{-\tilde r(T,e_k)},$$
where A ( · , e k ) is a deterministic C 1 function for all k = 1 , 2 , , D with the given terminal value. Then, we apply Itô’s formula as described in [6]:
$$\begin{aligned} dp_2^2(t) ={}& \Big[A'(t,\alpha(t))\,X_2^{-1}(t) + A(t,\alpha(t))\,X_2^{-1}(t)\Big(-u(t) + \sigma_2^2(t,\alpha(t)) + \int_{\mathbb{R}_0}\big((1+\eta(t,\alpha(t),z))^{-1} - 1 + \eta(t,\alpha(t),z)\big)\nu(dz)\Big) \\ &+ \sum_{j=1}^{D} X_2^{-1}(t)\big(A(t,e_j) - A(t,\alpha(t))\big)\mu_{ij}(t)\Big]dt \\ &+ A(t,\alpha(t))\,X_2^{-1}(t)\Big[-\sigma_2(t,\alpha(t))\,dW_2(t) + \int_{\mathbb{R}_0}\big((1+\eta(t,\alpha(t),z))^{-1} - 1\big)\tilde N(dt,dz)\Big] \\ &+ \sum_{j=1}^{D} X_2^{-1}(t)\big(A(t,e_j) - A(t,\alpha(t))\big)\,d\tilde\Phi_j(t). \end{aligned}$$
Let us compare Equations (28) and (29) and obtain:
$$\begin{aligned} -\Big[u(t)\,p_2^2(t) + \sigma_2(t,\alpha(t))\,q_{22}^2(t) &+ \int_{\mathbb{R}_0}\eta(t,\alpha(t),z)\,r_2^2(t,z)\,\nu(dz)\Big] \\ ={}& A'(t,\alpha(t))\,X_2^{-1}(t) + A(t,\alpha(t))\,X_2^{-1}(t)\Big(-u(t) + \sigma_2^2(t,\alpha(t)) + \int_{\mathbb{R}_0}\big((1+\eta(t,\alpha(t),z))^{-1} - 1 + \eta(t,\alpha(t),z)\big)\nu(dz)\Big) \\ &+ \sum_{j=1}^{D} X_2^{-1}(t)\big(A(t,e_j) - A(t,\alpha(t))\big)\mu_{ij}(t), \end{aligned}$$
and
$$q_{21}^2(t) = 0, \qquad q_{22}^2(t) = -A(t,e_i)\,X_2^{-1}(t)\,\sigma_2(t,e_i), \qquad r_2^2(t,z) = A(t,e_i)\,X_2^{-1}(t)\big((1+\eta(t,e_i,z))^{-1} - 1\big), \qquad w_2^{2,j}(t) = X_2^{-1}(t)\big(A(t,e_j) - A(t,e_i)\big),$$
for $i = 1,2,\ldots,D$.
If we substitute the values of $p_2^2$, $q_{22}^2$, and $r_2^2$ into Equation (30), then we get:
$$A'(t,e_i)\,X_2^{-1}(t) + \int_{\mathbb{R}_0} A(t,e_i)\,X_2^{-1}(t)\Big(\frac{\eta(t,e_i,z)}{\eta(t,e_i,z)+1} + \frac{1}{\eta(t,e_i,z)+1} - 1\Big)\nu(dz) + \sum_{j=1}^{D} X_2^{-1}(t)\big(A(t,e_j) - A(t,e_i)\big)\mu_{ij}(t) = 0.$$
Finally, we get:
$$X_2^{-1}(t)\Big[A'(t,e_i) + \sum_{j=1}^{D}\big(A(t,e_j) - A(t,e_i)\big)\mu_{ij}(t)\Big] = 0.$$
Then,
$$A'(t,e_i) + \sum_{j=1}^{D}\big(A(t,e_j) - A(t,e_i)\big)\mu_{ij}(t) = 0, \qquad A(T,\alpha(T)) = A(T,e_k) = \lambda_2\,e^{-\tilde r(T,e_k)},$$
for any $e_k \in S$, $k = 1,2,\ldots,D$. By applying the classical Feynman–Kac procedure, we can solve these $D$-coupled equations:
$$A(t,e_i) = \lambda_2\,E\big[e^{-\tilde r(T,\alpha(T))}\,\big|\,\alpha(t) = e_i\big], \qquad t \in [0,T], \quad i = 1,2,\ldots,D.$$
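For a time-homogeneous chain with generator $Q$, this conditional expectation (with the discounting convention of Player 2's constraint) equals $\big(e^{(T-t)Q} v\big)_i$ with $v_k = e^{-\tilde r(T,e_k)}$, which can be computed by integrating the linear system $y' = Qy$ forward. All numerical values below are our own assumptions.

```python
import numpy as np

# Hypothetical two-state data (assumed): generator Q and terminal rates r~(T, e_k).
Q = np.array([[-0.5, 0.5], [0.3, -0.3]])  # rows sum to 0
r_tilde = np.array([0.02, 0.05])
lam2, T = 1.3, 1.0

def A(t, n=2000):
    # A(t, e_i) = lam2 * E[exp(-r~(T, alpha(T))) | alpha(t) = e_i]
    #           = lam2 * (exp((T - t) Q) @ exp(-r~))_i, computed here by RK4
    # on y' = Q y with y(0) = exp(-r~), integrated over a horizon of length T - t.
    y = np.exp(-r_tilde)
    h = (T - t) / n
    for _ in range(n):
        k1 = Q @ y
        k2 = Q @ (y + h / 2 * k1)
        k3 = Q @ (y + h / 2 * k2)
        k4 = Q @ (y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return lam2 * y

print(np.round(A(0.0), 4))
```

Since $e^{(T-t)Q}$ is a stochastic matrix, each $A(t,e_i)$ is a convex combination of the terminal values, so it stays between the smallest and largest of $\lambda_2 e^{-\tilde r(T,e_k)}$.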
Now, let us find $p_1^2(t)$, $t \in [0,T]$, by trying $p_1^2(t) = h(t)$, where $h(t)$ is a deterministic function with terminal value $h(T) = \kappa_2$. Then, based on Equation (27):
$$p_1^2(t) = \kappa_2, \qquad q_{11}^2(t) = q_{12}^2(t) = r_1^2(t,z) = w_1^2(t) = 0.$$
We differentiate H 2 with respect to u to define the optimal control process for the bank:
$$\frac{h_2(t,\alpha(t))}{u(t)} + X_2(t)\,p_2^2(t) = 0.$$
Then,
$$u^*(t) = -\frac{1}{\lambda_2}\,e^{\tilde r(T,\alpha(T))}\,h_2(t,\alpha(t)), \qquad t \in [0,T].$$
In order to determine λ 2 , let us apply Itô’s formula to Y ( t ) = ln ( X 2 ( t ) ) :
$$\begin{aligned} dY(t) ={}& \Big[-\frac{1}{\lambda_2}\,e^{\tilde r(T,\alpha(T))}\,h_2(t,\alpha(t)) - \frac{1}{2}\,\sigma_2^2(t,\alpha(t)) + \int_{\mathbb{R}_0}\big(\ln(\eta(t,\alpha(t),z)+1) - \eta(t,\alpha(t),z)\big)\nu(dz)\Big]dt \\ &+ \sigma_2(t,\alpha(t))\,dW_2(t) + \int_{\mathbb{R}_0}\ln\big(\eta(t,\alpha(t),z)+1\big)\,\tilde N(dt,dz). \end{aligned}$$
If we multiply both sides of the equation by e r ˜ ( T , α ( T ) ) and apply expectation, we get:
$$E\big[e^{-\tilde r(T,\alpha(T))}\ln(X_2(T))\big] = E\Big[e^{-\tilde r(T,\alpha(T))}\ln(c) + \int_0^T\Big\{-\frac{1}{\lambda_2}\,h_2(t,\alpha(t)) - \frac{1}{2}\,e^{-\tilde r(T,\alpha(T))}\,\sigma_2^2(t,\alpha(t)) + \int_{\mathbb{R}_0} e^{-\tilde r(T,\alpha(T))}\big(\ln(\eta(t,\alpha(t),z)+1) - \eta(t,\alpha(t),z)\big)\nu(dz)\Big\}dt\Big].$$
Finally, let us call:
$$D_1 = E\big[e^{-\tilde r(T,\alpha(T))}\big]\ln(c), \qquad D_2 = E\Big[\int_0^T h_2(t,\alpha(t))\,dt\Big], \qquad D_3 = E\Big[\int_0^T e^{-\tilde r(T,\alpha(T))}\Big(-\frac{1}{2}\,\sigma_2^2(t,\alpha(t)) + \int_{\mathbb{R}_0}\big(\ln(\eta(t,\alpha(t),z)+1) - \eta(t,\alpha(t),z)\big)\nu(dz)\Big)dt\Big].$$
Therefore, based on the constraint for Player 2, we select λ 2 , such that:
$$\lambda_2 = \frac{D_2}{D_1 + D_3 - K_2} > 0.$$
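The three constants $D_1$, $D_2$, and $D_3$ can be estimated jointly from simulated chain paths, since the discount factor depends only on the terminal state of each path. The sketch below uses assumed two-state data with a single-atom jump measure; every value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data (assumed): two-state chain, constant jump size eta with intensity lam_N.
T, n, paths = 1.0, 200, 2000
dt = T / n
mu = np.array([[0.0, 0.4], [0.6, 0.0]])  # chain intensities mu_ij
r_tilde = np.array([0.02, 0.05])
h2 = np.array([0.6, 0.9])
sigma2 = np.array([0.15, 0.25])
eta, lam_N = 0.05, 2.0
c, K2 = 2.0, 0.55                        # initial bank wealth and target level

jump_term = lam_N * (np.log1p(eta) - eta)  # int (ln(1 + eta) - eta) nu(dz)
D1 = D2 = D3 = 0.0
for _ in range(paths):
    s = 0
    I_h2 = I_vol = 0.0
    for _ in range(n):
        I_h2 += h2[s] * dt
        I_vol += (-0.5 * sigma2[s] ** 2 + jump_term) * dt
        if rng.random() < mu[s, 1 - s] * dt:
            s = 1 - s
    disc = np.exp(-r_tilde[s])   # e^{-r~(T, alpha(T))}, set by the terminal state
    D1 += disc * np.log(c)
    D2 += I_h2
    D3 += disc * I_vol
D1, D2, D3 = D1 / paths, D2 / paths, D3 / paths

lam2 = D2 / (D1 + D3 - K2)
print(round(lam2, 4))
```

The sign of $\lambda_2$ is governed by the gap $D_1 + D_3 - K_2$, so the target $K_2$ must be chosen below the uncontrolled discounted log-wealth level for the multiplier to be positive.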
Finally, based on the measurability and square-integrability conditions for $\sigma_k$, $\eta$, $\gamma$, and $h_k$, and the choice $g_k^{\lambda_k}(y,e_i) = g_k(y,e_i) + \lambda_k\,M_k(y,e_i)$, for $i = 1,2,\ldots,D$ and $k = 1,2$, one can easily verify the integrability and concavity conditions of Theorem A1.

6. Conclusions

In this work, we developed techniques for solving stochastic optimal control problems in a Lagrangian game-theoretic environment. Both zero-sum and nonzero-sum stochastic differential game problems with two specific types of constraints can be approached via the dynamic programming principle and the stochastic maximum principle within the framework of our theorems. Moreover, we demonstrated these theorems for a quite general class of stochastic processes, namely Markov regime-switching jump-diffusions. As we explained in Section 1, such models have a wide range of application areas. In our work, we focused on a business agreement between a bank and an insurance company, called bancassurance, using the stochastic maximum principle for a nonzero-sum stochastic differential game. We investigated the optimal dividend strategy of the company as the best response to the bank's optimal choice of the mean rate of return of its own cash flow, and vice versa. We found a Nash equilibrium for this game and solved the adjoint equations explicitly in each state.
It is well known that the timing and amount of dividend payments are strategic decisions for companies. The announcement of a dividend payment may reduce or increase the stock price of a company. A high dividend payment may signal to shareholders and potential investors that the company has achieved substantial profits. On the other hand, it may create the impression that the company lacks good future projects to invest in and therefore prefers to pay out to its investors. Moreover, dividend payments may serve to reward shareholders for their trust in the company.
From the side of the bank, it is clear that creating a cash flow with high returns is the main goal. Note that, depending on the values of $h_2(\cdot,e_i)$, $e_i \in S$, $i = 1,2,\ldots,D$, the appreciation rate of the bank's investment may drop below zero.
Hence, in our formulation, we provide insight to both the bank and the insurance company about their best moves in a bancassurance commitment under specified technical conditions.

Funding

This project is supported by SCROLLER: A Stochastic ContROL approach to Machine Learning with applications to Environmental Risk models, Project 299897 from the Norwegian Research Council.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Let us clarify the general formulation of the technique that we apply here for the solution of a nonzero-sum stochastic differential game, within the context of Equations (1) and (2) and Problems (12) and (13), based on the stochastic maximum principle for a Markov regime-switching jump-diffusion model.
The Hamiltonian functions associated with Player $k$, namely $H_k$, $k = 1,2$, are defined from $[0,T] \times \mathbb{R}^N \times U_1 \times U_2 \times \mathbb{R}^N \times \mathbb{R}^{N \times M} \times \mathcal{R} \times \mathbb{R}^{N \times D} \times S$ to $\mathbb{R}$ as follows:
$$\begin{aligned} H_k\big(t,y,u_1,u_2,p^k,q^k,r^k(\cdot),w^k,e_i\big) ={}& f_k(t,y,u_1,u_2,e_i) + b^T(t,y,u_1,u_2,e_i)\,p^k + \mathrm{tr}\big(\sigma^T(t,y,u_1,u_2,e_i)\,q^k\big) \\ &+ \int_{\mathbb{R}_0^L}\sum_{l=1}^{L}\sum_{n=1}^{N}\eta_{nl}(t,y,u_1,u_2,e_i,z)\,r_{nl}^k(t,z)\,\nu_l(dz) \\ &+ \sum_{j=1}^{D}\sum_{n=1}^{N}\gamma_{nj}(t,y,u_1,u_2,e_i)\,w_{nj}^k(t)\,\mu_{ij}(t), \qquad k = 1,2, \end{aligned}$$
Each $H_k$, $k = 1,2$, is continuously differentiable with respect to $y$, i.e., a $C^1$-function of $y$, and differentiable with respect to the corresponding player's control process.
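As a sanity check of the bookkeeping in the Hamiltonian above, its terms can be assembled numerically. The sketch below uses toy dimensions ($N = M = D = 2$, $L = 1$) and approximates the jump measure by a single atom of mass `lam`; every input value is hypothetical.

```python
import numpy as np

def hamiltonian(f, b, sigma, eta, gamma, mu_i, p, q, r, w, lam):
    # f: running gain f_k(t, y, u1, u2, e_i)                    (scalar)
    # b: drift vector (N,); sigma: diffusion matrix (N, M)
    # eta: jump coefficients at the single atom z0, shape (N,)  (L = 1)
    # gamma: regime coefficients, shape (N, D); mu_i: intensities mu_ij, shape (D,)
    # p: (N,); q: (N, M); r: jump adjoint at z0, shape (N,); w: (N, D)
    return (f
            + b @ p                                # b^T p
            + np.trace(sigma.T @ q)                # tr(sigma^T q)
            + lam * (eta @ r)                      # int sum_n eta_n r_n nu(dz), one atom
            + np.sum(gamma * w * mu_i[None, :]))   # sum_j sum_n gamma_nj w_nj mu_ij

H = hamiltonian(
    f=0.3,
    b=np.array([0.1, 0.2]),
    sigma=np.array([[0.1, 0.0], [0.0, 0.3]]),
    eta=np.array([0.0, 0.05]),
    gamma=np.array([[0.2, 0.0], [0.0, 0.0]]),
    mu_i=np.array([0.4, 0.0]),
    p=np.array([1.0, -0.5]),
    q=np.array([[0.2, 0.0], [0.0, 0.1]]),
    r=np.array([0.0, -0.3]),
    w=np.array([[0.5, 0.0], [0.0, 0.0]]),
    lam=2.0,
)
print(round(H, 4))  # -> 0.36
```

With these toy inputs, the five contributions are $0.3 + 0 + 0.05 - 0.03 + 0.04 = 0.36$, which makes each term of the formula easy to trace.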
The corresponding adjoint equations for Player $k$, $k = 1,2$, in the unknown adapted processes $p^k(t) \in \mathbb{R}^N$, $q^k(t) \in \mathbb{R}^{N \times M}$, $r^k(t,z) \in \mathcal{R}$, where $\mathcal{R}$ is the set of functions $r : [0,T] \times \mathbb{R}_0 \to \mathbb{R}^{N \times L}$, and $w^k(t) \in \mathbb{R}^{N \times D}$, are given by the following equations:
$$dp^k(t) = -\nabla_y H_k\big(t, Y(t), u_1(t), u_2(t), p^k(t), q^k(t), r^k(t,\cdot), w^k(t), \alpha(t)\big)\,dt + q^k(t)\,dW(t) + \int_{\mathbb{R}_0} r^k(t,z)\,\tilde N(dt,dz) + w^k(t)\,d\tilde\Phi(t), \quad t < T,$$
$$p^k(T) = \nabla_y g_k\big(Y(T), \alpha(T)\big), \qquad k = 1,2,$$
where $\nabla_y\varphi = \big(\partial\varphi/\partial y_1, \ldots, \partial\varphi/\partial y_N\big)^T$ is the gradient of $\varphi : \mathbb{R}^N \to \mathbb{R}$ with respect to $y = (y_1,\ldots,y_N)$. For the existence and uniqueness results for the BSDEs with jumps and regimes (A1) and (A2), see Propositions 5.1 and 5.2 of Crépey and Matoussi [27]. In this context, we assume that $p^k(t)$, $q^k(t)$, $r^k(t,z)$, and $w^k(t)$, $k = 1,2$, are square integrable.
Now, we can present a sufficient maximum principle for such a game:
Theorem A1.
Let ( u 1 * , u 2 * ) Θ 1 × Θ 2 with a corresponding solution Y ^ ( t ) : = Y u 1 * , u 2 * ( t ) and suppose there exists an adapted solution ( p k ( t ) , q k ( t ) , r k ( t , z ) , w k ( t ) ) , k = 1 , 2 , of the corresponding adjoint Equations (A1) and (A2) such that for all ( u 1 , u 2 ) Θ 1 × Θ 2 , we have:
$$E\Big[\int_0^T\big(\hat Y(t) - Y^{u_1}(t)\big)^T\Big\{\hat q^1(t)\,\hat q^1(t)^T + \int_{\mathbb{R}_0}\hat r^1(t,z)\,\hat r^1(t,z)^T\,\nu(dz) + \hat w^1(t)\,\mathrm{Diag}\big(\mu(t)\big)\,\hat w^1(t)^T\Big\}\big(\hat Y(t) - Y^{u_1}(t)\big)\,dt\Big] < \infty,$$
and
$$E\Big[\int_0^T\big(\hat Y(t) - Y^{u_2}(t)\big)^T\Big\{\hat q^2(t)\,\hat q^2(t)^T + \int_{\mathbb{R}_0}\hat r^2(t,z)\,\hat r^2(t,z)^T\,\nu(dz) + \hat w^2(t)\,\mathrm{Diag}\big(\mu(t)\big)\,\hat w^2(t)^T\Big\}\big(\hat Y(t) - Y^{u_2}(t)\big)\,dt\Big] < \infty,$$
where Y u 1 ( t ) : = Y u 1 , u 2 * ( t ) and Y u 2 ( t ) : = Y u 1 * , u 2 ( t ) .
Furthermore,
$$\begin{aligned} E\Big[\int_0^T \hat p^1(t)^T\Big(&\big(\sigma(t, Y^{u_1}(t), \alpha(t), u_1(t), u_2^*(t)) - \hat\sigma(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t))\big)^2 \\ &+ \int_{\mathbb{R}_0}\big(\eta(t, Y^{u_1}(t), \alpha(t), u_1(t), u_2^*(t), z) - \hat\eta(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t), z)\big)^2\,\nu(dz) \\ &+ \sum_{j=1}^{D}\big(\gamma^j(t, Y^{u_1}(t), \alpha(t), u_1(t), u_2^*(t)) - \hat\gamma^j(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t))\big)^2\,\lambda_j(t)\Big)\,\hat p^1(t)\,dt\Big] < \infty, \end{aligned}$$
and
$$\begin{aligned} E\Big[\int_0^T \hat p^2(t)^T\Big(&\big(\sigma(t, Y^{u_2}(t), \alpha(t), u_1^*(t), u_2(t)) - \hat\sigma(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t))\big)^2 \\ &+ \int_{\mathbb{R}_0}\big(\eta(t, Y^{u_2}(t), \alpha(t), u_1^*(t), u_2(t), z) - \hat\eta(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t), z)\big)^2\,\nu(dz) \\ &+ \sum_{j=1}^{D}\big(\gamma^j(t, Y^{u_2}(t), \alpha(t), u_1^*(t), u_2(t)) - \hat\gamma^j(t, \hat Y(t), \alpha(t), u_1^*(t), u_2^*(t))\big)^2\,\lambda_j(t)\Big)\,\hat p^2(t)\,dt\Big] < \infty. \end{aligned}$$
Moreover, assume that the following conditions hold:
1. 
For almost all t [ 0 , T ] ,
$$H_1\big(t, \hat Y(t), u_1^*(t), u_2^*(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot), \hat w^1(t), \alpha(t)\big) = \sup_{u_1 \in U_1} H_1\big(t, \hat Y(t), u_1, u_2^*(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot), \hat w^1(t), \alpha(t)\big),$$
and
$$H_2\big(t, \hat Y(t), u_1^*(t), u_2^*(t), \hat p^2(t), \hat q^2(t), \hat r^2(t,\cdot), \hat w^2(t), \alpha(t)\big) = \sup_{u_2 \in U_2} H_2\big(t, \hat Y(t), u_1^*(t), u_2, \hat p^2(t), \hat q^2(t), \hat r^2(t,\cdot), \hat w^2(t), \alpha(t)\big).$$
2. 
For each fixed pair of ( t , e i ) [ 0 , T ] × S ,
$$\hat H_1(y) = \sup_{u_1 \in U_1} H_1\big(t, y, u_1, u_2^*(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot), \hat w^1(t), e_i\big),$$
and
$$\hat H_2(y) = \sup_{u_2 \in U_2} H_2\big(t, y, u_1^*(t), u_2, \hat p^2(t), \hat q^2(t), \hat r^2(t,\cdot), \hat w^2(t), e_i\big)$$
exist and are concave functions of y.
3. 
$g_k(y,e_i)$, $k = 1,2$, are concave functions of $y$ for each $e_i \in S$.
Then, ( u 1 * , u 2 * ) Θ 1 × Θ 2 is a Nash equilibrium of the System (1) and (2) and the Problems (12) and (13).
Proof. 
For the proof of this theorem, it suffices to follow the steps of Theorem 3.1 in [6] within our game-theoretic formulation for each player. Moreover, the proof may be seen as a special version of Theorem 3.1 in [7]. □

References

  1. Laruelle, S.; Rosenbaum, M.; Savku, E. Assessing MiFID 2 regulation on tick sizes: A transaction costs analysis viewpoint. Mark. Microstruct. Liq. 2018, 5, 2050003. [Google Scholar] [CrossRef]
  2. Øksendal, B.; Sulem, A. Stochastic Control of Jump Diffusions; Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
  3. Yong, J.; Zhou, X.Y. Stochastic Controls: Hamiltonian Systems and HJB Equations; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999; Volume 43. [Google Scholar]
  4. Crépey, S. About the pricing equations in finance. In Paris-Princeton Lectures on Mathematical Finance 2010; Springer: Berlin/Heidelberg, Germany, 2011; pp. 63–203. [Google Scholar]
  5. Elliott, R.J.; Siu, T.K. On risk minimizing portfolios under a Markovian regime-switching Black-Scholes economy. Ann. Oper. Res. 2010, 176, 271–291. [Google Scholar] [CrossRef]
  6. Zhang, X.; Elliott, R.J.; Siu, T.K. A stochastic maximum principle for a Markov regime-switching jump-diffusion model and an application to finance. Siam J. Control Optim. 2012, 50, 964–990. [Google Scholar] [CrossRef]
  7. Menoukeu-Pamen, O.; Momeya, R.H. A maximum principle for Markov regime-switching forward–backward stochastic differential games and applications. Math. Methods Oper. Res. 2017, 85, 349–388. [Google Scholar] [CrossRef] [Green Version]
  8. Lv, S.; Tao, R.; Wu, Z. Maximum principle for optimal control of anticipated forward–backward stochastic differential delayed systems with regime switching. Optim. Control Appl. Methods 2016, 37, 154–175. [Google Scholar] [CrossRef]
  9. Mao, X.; Yuan, C. Stochastic Differential Equations with Markovian Switching; Imperial College Press: London, UK, 2006. [Google Scholar]
  10. Savku, E.; Weber, G.W. A stochastic maximum principle for a markov regime-switching jump-diffusion model with delay and an application to finance. J. Optim. Theory Appl. 2018, 179, 696–721. [Google Scholar] [CrossRef]
  11. Savku, E.; Weber, G.W. A Regime-Switching Model with Applications to Finance: Markovian and Non-Markovian Cases. In Dynamic Economic Problems with Regime Switches; Springer: Berlin/Heidelberg, Germany, 2021; pp. 287–309. [Google Scholar]
  12. Savku, E.; Weber, G.W. Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market. Ann. Oper. Res. 2020, 312, 1171–1196. [Google Scholar] [CrossRef]
  13. Savku, E. Memory and Anticipation: Two main theorems for Markov regime-switching stochastic processes. arXiv 2023, arXiv:2302.13890. [Google Scholar]
  14. Elliott, R.J.; Siu, T.K. A stochastic differential game for optimal investment of an insurer with regime switching. Quant. Financ. 2011, 11, 365–380. [Google Scholar] [CrossRef]
  15. Ma, C.; Wu, H.; Lin, X. Nonzero-sum stochastic differential portfolio games under a Markovian regime switching model. Math. Probl. Eng. 2015, 2015, 738181. [Google Scholar] [CrossRef] [Green Version]
  16. Shen, Y.; Siu, T.K. Stochastic differential game, Esscher transform and general equilibrium under a Markovian regime-switching Lévy model. Insur. Math. Econ. 2013, 53, 757–768. [Google Scholar] [CrossRef]
  17. Zhang, J.; Chen, P.; Jin, Z.; Li, S. On a class of non-zero-sum stochastic differential dividend games with regime switching. Appl. Math. Comput. 2021, 397, 125956. [Google Scholar] [CrossRef]
  18. Deepa, R.; Muthukumar, P.; Hafayed, M. Optimal control of nonzero sum game mean-field delayed Markov regime-switching forward-backward system with Lévy processes. Optim. Control Appl. Methods 2021, 42, 110–125. [Google Scholar] [CrossRef]
  19. Bui, T.; Cheng, X.; Jin, Z.; Yin, G. Approximation of a class of non-zero-sum investment and reinsurance games for regime-switching jump–diffusion models. Nonlinear Anal. Hybrid Syst. 2019, 32, 276–293. [Google Scholar] [CrossRef] [Green Version]
  20. Øksendal, B. Stochastic differential equations. In Stochastic Differential Equations; Springer: Berlin/Heidelberg, Germany, 2003; pp. 65–84. [Google Scholar]
  21. Elliott, R.J.; Aggoun, L.; Moore, J.B. Hidden Markov Models: Estimation and Control; Springer: New York, NY, USA, 1995. [Google Scholar]
  22. Asmussen, S.; Steffensen, M. Risk and Insurance; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
  23. Crépey, S. About the Pricing Equations in Finance. Paris-Princet. Lect. Math. Financ. 2010, 2003, 63–203. [Google Scholar]
  24. Peng, J.L.; Jeng, V.; Wang, J.L.; Chen, Y.C. The impact of bancassurance on efficiency and profitability of banks: Evidence from the banking industry in Taiwan. J. Bank. Financ. 2017, 80, 1–13. [Google Scholar] [CrossRef]
  25. Leepsa, N.; Singh, R. Contribution of bancassurance on the performance of bank: A case study of acquisition of shares in max new york life insurance by axis bank. J. Bus. Financ. Aff. 2017, 6, 1000283. [Google Scholar] [CrossRef]
  26. Buric, M.N.; Kascelan, V.; Vujosevic, S. Bancassurance Concept from the Perspective of Montenegrin Market. Econ. Rev. J. Econ. Bus. 2015, 13, 62–73. [Google Scholar]
  27. Crépey, S.; Matoussi, A. Reflected and doubly reflected BSDEs with jumps: A priori estimates and comparison. Ann. Appl. Probab. 2008, 18, 2041–2069. [Google Scholar] [CrossRef]
