Article

Study of Pricing of High-Dimensional Financial Derivatives Based on Deep Learning

Xiangdong Liu and Yu Gu *
Department of Statistics and Data Science, Jinan University, Guangzhou 510632, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(12), 2658; https://doi.org/10.3390/math11122658
Submission received: 13 April 2023 / Revised: 31 May 2023 / Accepted: 6 June 2023 / Published: 11 June 2023
(This article belongs to the Special Issue Computational Economics and Mathematical Modeling)

Abstract
Many problems in finance and actuarial science can be transformed into solving backward stochastic differential equations (BSDEs) and partial differential equations (PDEs) with jumps, which are often difficult to solve in high dimensions. To address this, this paper applies a deep learning algorithm to a class of high-dimensional nonlinear PDEs with jump terms and their corresponding BSDEs with jump terms. Using the nonlinear Feynman-Kac formula, solving this kind of PDE is transformed into solving the corresponding BSDE with jump terms, and the numerical solution problem is recast as a stochastic control problem. The gradient and the jump process of the unknown solution are treated as separate policy functions and are approximated by two multilayer neural networks acting as function approximators. In this way, the deep learning-based method overcomes the “curse of dimensionality” caused by high-dimensional PDEs with jumps and yields numerical solutions. In addition, this paper replaces the existing stochastic optimization algorithm for neural networks with a newer one, compares the results with those of the traditional algorithm, and achieves good results. Finally, the proposed method is applied to three practical high-dimensional problems: the Hamilton-Jacobi-Bellman equation, bond pricing under the jump Vasicek model and option pricing under the jump diffusion model. The proposed numerical method achieves satisfactory accuracy and efficiency, and it has important application value and practical significance in investment decision-making, option pricing, insurance and other fields.

1. Introduction

High-dimensional (dimension ≥ 3) nonlinear partial differential equations (PDEs) attract a great deal of interest and are widely used in a variety of fields. Many practical problems require a high-dimensional PDE, for example: the Schrödinger equation of the quantum many-body problem, whose dimension is roughly three times the number of electrons or particles in the system; the Black-Scholes equation used to price financial derivatives, where the dimension of the PDE is the number of relevant financial assets under consideration; and the Hamilton-Jacobi-Bellman equation in dynamic programming. High-dimensional nonlinear partial differential equations are very practical, but because explicit solutions are rare or the analytical expressions are too complicated, numerical techniques are usually required. However, in practical applications these equations are very difficult to solve and remain one of the challenging topics in the academic community. The difficulty is that, due to the “curse of dimensionality” [1], the time complexity of traditional numerical methods grows exponentially with the dimension, demanding extensive computing resources. Nevertheless, we urgently need to approximate the numerical solutions of these high-dimensional nonlinear partial differential equations because they can resolve a variety of real-world issues.
In the field of finance, methods for solving high-dimensional PDEs are frequently employed. However, numerous empirical studies have demonstrated the “heavy tails” of financial markets. “Abnormal” events such as new inventions, economic policies, wars or other news, as well as various arbitrage and hedging activities, can lead to sudden and intermittent changes in asset prices. These sudden changes in price and price fluctuations intuitively reflect the jump effects that exist in financial markets. Although jumping behavior has a low likelihood of happening, when it does occur it causes significant harm and often results in investors suffering enormous losses or even going bankrupt. How to build adequate models to depict the jumping behavior of asset prices has become a hot topic for academics as financial market research has become more sophisticated and refined. Therefore, it is necessary to incorporate common jump behaviors in financial problems, such as defaults on asset purchases (sales), corporate bankruptcies, operational failures, insurance events, etc., into the numerical solution of high-dimensional PDEs.
Backward stochastic differential equations (BSDEs) are widely used in stochastic control and financial mathematics. One important reason is that a BSDE can be expressed as the stochastic counterpart of a PDE through the Feynman-Kac formula. El Karoui et al. [2] provided a detailed introduction to the properties of BSDEs and their applications to finance. Zhang [3] proposed a numerical scheme for a class of BSDEs with possibly path-dependent terminal values and proved its convergence. Recently, Barigou and Delong [4] examined the pricing of equity-linked life insurance contracts by framing the price as the solution of a system of nonlinear PDEs, reformulating the problem as a BSDE with jumps and solving it numerically using efficient neural networks.
In recent years, deep learning algorithms have rapidly gained popularity and success in various application sectors. As a result, several researchers have successfully used deep learning methods to handle high-dimensional PDE problems. In 2017, Han et al. [5,6] first proposed the deep BSDE method and systematically applied deep learning to general high-dimensional PDEs. Following that, Han and Hu (2020) [7] applied the deep BSDE method to the study of multi-agent games and also produced promising outcomes. The method was extended to 2BSDEs and the corresponding fully nonlinear PDEs by Beck et al. (2019) [8]. Raissi et al. (2017) [9] used recent developments in probabilistic machine learning to infer control equations represented by parametric linear operators, modifying the prior of a Gaussian process according to the special form of the operator and then inferring the parameters of the linear equation from scarce and possibly noisy observations. To approximate the solution of high-dimensional PDEs, Sirignano and Spiliopoulos (2018) [10] suggested a deep neural network technique in which the network is trained to satisfy the boundary conditions, initial conditions and differential operator at batches of randomly sampled times and space points. The achievements listed above show that it is feasible to solve high-dimensional PDEs with deep learning-based methods.
In this paper, we mainly apply the deep learning algorithm to a special class of high-dimensional nonlinear partial differential equations with jumps and obtain a numerical solution of the equation. Specifically, through mathematical derivation and equivalent formula expression of a high-dimensional PDE and backward stochastic differential equation, the problem of solving a partial differential equation is equivalent to the problem of solving a BSDE, and then it is transformed into a stochastic control problem. Then, a deep neural network framework is created to address the issue. Therefore, this method can be used to solve high-dimensional PDEs and corresponding backward stochastic differential equations simultaneously.
The deep learning-based method introduced in this paper for solving this class of high-dimensional PDEs with jumps is mainly applied in the financial field, where it is relevant to financial derivatives pricing, insurance investment decision-making, financing of small and micro enterprises, risk measurement mechanisms and other problems.

2. Background Knowledge

2.1. A Class of PDE

We consider a class of semilinear parabolic PDEs with jump term in the following form:
Let $T\in(0,\infty)$, $d\in\mathbb{N}$, and let $f:[0,T]\times\mathbb{R}^d\times\mathbb{R}\times\mathbb{R}^d\times\mathbb{R}\to\mathbb{R}$ and $g:\mathbb{R}^d\to\mathbb{R}$ be continuous functions. For all $t\in[0,T]$, $x\in\mathbb{R}^d$, the unknown solution $u=u(t,x)\in C^{1,2}\big([0,T]\times\mathbb{R}^d\big)$ satisfies the terminal condition $u(T,x)=g(x)$ and
$$\frac{\partial u}{\partial t}(t,x)+\mathcal{L}u(t,x)+f\big(t,x,u(t,x),\sigma^{\mathsf T}(t,x)\nabla_x u(t,x),\mathcal{B}u(t,x)\big)=0,$$
where we introduce the two operators
$$\mathcal{L}u(t,x)=\nabla_x u(t,x)\cdot\big(\mu(x)-\lambda\beta(x)\big)+\tfrac{1}{2}\operatorname{Tr}\!\big(\sigma\sigma^{\mathsf T}(t,x)\,\operatorname{Hess}_x u(t,x)\big)+\lambda\big(u(t,x+\beta(x))-u(t,x)\big),$$
$$\mathcal{B}u(t,x)=u(t,x+\beta(x))-u(t,x).$$
Here $t$ is the time variable, $x$ is the $d$-dimensional space variable, $f$ is the nonlinear part of the equation, $\nabla_x u$ denotes the gradient of $u$ with respect to $x$, and $\operatorname{Hess}_x u$ denotes the Hessian matrix of $u$ with respect to $x$. In particular, this equation can be regarded as a special case of a partial integro-differential equation. What we are interested in is the solution at $t=0$, $x=\xi\in\mathbb{R}^d$, that is, $u(0,\xi)$ [11,12].

2.2. Backward Stochastic Differential Equations with Jumps

Let $(\Omega,\mathcal{F},P)$ be a complete probability space, let $W:[0,T]\times\Omega\to\mathbb{R}^d$ be the $d$-dimensional standard Brownian motion on this space, and let $\{\mathcal{F}_t\}_{t\in[0,T]}$ be the normal filtration generated by $W$ in $(\Omega,\mathcal{F},P)$. Let $\{X_t\}_{0\le t\le T}$, $\{Y_t\}_{0\le t\le T}$, $\{Z_t\}_{0\le t\le T}$, $\{U_t\}_{0\le t\le T}$ be integrable $\mathcal{F}$-adapted stochastic processes. The class of forward-backward stochastic differential equations with jump terms that we consider has the following form:
$$\begin{aligned} X_t&=X_0+\int_0^t\mu(s,X_s)\,ds+\int_0^t\sigma(s,X_s)\,dW_s+\int_0^t\beta(s,X_{s-})\,d\tilde N_s,\\ Y_t&=g(X_T)+\int_t^T f(s,X_s,Y_s,Z_s,U_s)\,ds-\int_t^T Z_s\,dW_s-\int_t^T U_s\,d\tilde N_s, \end{aligned}$$
where $d\tilde N_t$ is the compensated Poisson measure, $\tilde N_t=N_t-\lambda t$ is the centered Poisson process with compensator intensity $\lambda t$, and $N_t$ is the Poisson process on $(\Omega,\mathcal{F},P)$ with $E[N_t]=\lambda t$.
Under the standard Lipschitz assumptions on the coefficients $\mu$, $\sigma$, $\beta$, $f$, $g$, the existence and uniqueness of the solution have been proved [13].

2.3. The Generalized Nonlinear Feynman-Kac Formula

Under appropriate regularity assumptions, if $u(t,x)\in C\big([0,T]\times\mathbb{R}^d\big)$ satisfies Equation (1) together with the linear growth conditions $|u(t,x)|\le K(1+|x|)$ and $|\nabla_x u(t,x)|\le K(1+|x|)$ for all $(t,x)\in[0,T]\times\mathbb{R}^d$, then the following relationships hold almost everywhere:
$$Y_t=u(t,X_t)\in\mathbb{R},$$
$$Z_t=\nabla_x u(t,X_t)\,\sigma(t,X_t)\in\mathbb{R}^d,$$
$$U_t=u\big(t,X_t+\beta(t,X_t)\big)-u(t,X_t)\in\mathbb{R},$$
and $(Y_t,Z_t,U_t)$ is the unique solution of the BSDE [11,12].

2.4. Improvement of Neural Network Parameter Optimization Algorithm

The Adam (Adaptive Moment Estimation) algorithm combines the RMSProp algorithm with classical momentum from physics. It uses first-order and second-order moment estimates of the gradient to dynamically adapt the learning rate of each parameter. The Adam optimizer is one of the most popular classical optimizers in deep learning and shows excellent performance in practice [14].
Although Adam combines RMSProp with momentum, adaptive moment estimation with Nesterov acceleration is often superior to classical momentum. We therefore consider incorporating the Nesterov acceleration effect [15] into the Adam algorithm, that is, using the NAdam (Nesterov-accelerated Adaptive Moment Estimation) optimization algorithm. The update formulas are as follows:
$$\hat g_t=\frac{g_t}{1-\prod_{i=1}^{t}\mu_i}$$
$$m_t=\mu\,m_{t-1}+(1-\mu)\,g_t$$
$$\hat m_t=\frac{m_t}{1-\prod_{i=1}^{t+1}\mu_i}$$
$$n_t=\nu\,n_{t-1}+(1-\nu)\,g_t^2$$
$$\hat n_t=\frac{n_t}{1-\nu^t}$$
$$\bar m_t=(1-\mu_t)\,\hat g_t+\mu_{t+1}\,\hat m_t$$
$$\theta_t=\theta_{t-1}-\eta\,\frac{\bar m_t}{\sqrt{\hat n_t}+\varepsilon}$$
Here $\theta_t$ denotes the neural network parameters and $g_t$ the gradient with respect to them. The quantities $m_t$ and $n_t$ are the first-order and second-order moment estimates of the gradient, respectively, which can be regarded as estimates of the expectations $\mathbb{E}[g_t]$ and $\mathbb{E}[g_t^2]$, and $\mu$ and $\nu$ are their respective decay rates. Moreover, $\hat m_t$ and $\hat n_t$ are bias corrections of $m_t$ and $n_t$ and can be viewed as approximately unbiased estimates of these expectations. NAdam thus places a stronger constraint on the learning rate and has a more direct influence on the gradient update. Accordingly, we apply this new optimization algorithm in our method and compare the outcomes with the traditional Adam algorithm.
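As an illustration only, the update above can be written out directly in code. The sketch below implements one NAdam step in NumPy under the simplifying assumption of a constant momentum schedule ($\mu_t\equiv\mu$), so the products in the bias corrections reduce to powers of $\mu$; the function name `nadam_step` and the `state` dictionary are our own conventions, not part of any library. In practice, the experiments in Section 4 could equally rely on a built-in implementation such as `torch.optim.NAdam`.

```python
import numpy as np

def nadam_step(theta, grad, state, lr=1e-3, mu=0.9, nu=0.999, eps=1e-8):
    """One NAdam update for a parameter vector theta given its gradient.

    state holds the running first/second moment estimates (m, n) and the
    step counter t; it is created on the first call (pass an empty dict).
    """
    if not state:
        state.update(m=np.zeros_like(theta), n=np.zeros_like(theta), t=0)
    state["t"] += 1
    t = state["t"]

    # Biased moment estimates (exponential moving averages).
    state["m"] = mu * state["m"] + (1 - mu) * grad
    state["n"] = nu * state["n"] + (1 - nu) * grad**2

    # Bias corrections; with a constant momentum mu the products in the
    # text reduce to mu**t and mu**(t + 1).
    g_hat = grad / (1 - mu**t)
    m_hat = state["m"] / (1 - mu**(t + 1))
    n_hat = state["n"] / (1 - nu**t)

    # Nesterov-style combination of the current gradient and the momentum term.
    m_bar = (1 - mu) * g_hat + mu * m_hat
    return theta - lr * m_bar / (np.sqrt(n_hat) + eps), state
```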

3. Main Theorem

3.1. Basic Ideas

In this paper we propose a deep learning-based PDE numerical solution for nonlinear PDEs with jump terms in the form of Equation (1). The basic ideas of the algorithm are as follows:
(1)
By using the generalized nonlinear Feynman-Kac formula, the PDE to be solved can be equivalently reformulated as a BSDE with jumps.
(2)
Treating the gradient of the unknown solution and its jump term as policy functions, the numerical solution of the BSDE can be viewed as a stochastic control problem, which can further be regarded as a reinforcement learning problem.
(3)
Two different deep neural networks are used to approximate this pair of high-dimensional policy functions, and the networks are trained with a deep learning method to obtain the numerical solution of the original equation.

3.2. Transforming the Nonlinear PDE Numerical Solution Problem into a Stochastic Control Problem

It is well known that the generalized nonlinear Feynman-Kac formula connects a PDE and a BSDE under appropriate assumptions. Thus, the solution $Y_t^{x}$ of the above equation is equivalent to the viscosity solution $u(t,x)$ of the semilinear parabolic partial differential equation with jump term in Equation (1) [16,17]. Because traditional numerical methods for partial differential equations perform poorly in high dimensions, we estimate the solution $u(t,x)$ of Equation (1) by estimating the solution of Equation (4). Since BSDEs developed from the study of stochastic control problems and share common characteristics and internal relations with them, the nonlinear PDE problem can also be transformed into a stochastic control problem through the nonlinear Feynman-Kac formula.
For the stochastic control problem associated with Equation (4), the following equations hold almost everywhere for all $(t,x)\in[0,T]\times\mathbb{R}^d$ and all $s\in[t,T]$ [18]:
$$\begin{aligned} X_s^{t,x}&=x+\int_t^s\mu\big(r,X_r^{t,x}\big)\,dr+\int_t^s\sigma\big(r,X_r^{t,x}\big)\,dW_r+\int_t^s\beta\big(r,X_{r-}^{t,x}\big)\,d\tilde N_r,\\ Y_s^{t,x}&=g\big(X_T^{t,x}\big)+\int_s^T f\big(r,X_r^{t,x},Y_r^{t,x},Z_r^{t,x},U_r^{t,x}\big)\,dr-\int_s^T Z_r^{t,x}\,dW_r-\int_s^T U_r^{t,x}\,d\tilde N_r. \end{aligned}$$
Under appropriate regularity assumptions on the nonlinear function $f$, the triple formed by $u(0,\xi)$, $\nabla_x u(t,X_t)\,\sigma(t,X_t)$ and $u\big(t,X_t+\beta(t,X_t)\big)-u(t,X_t)$ is the unique global minimizer, over admissible $(y,Z,U)$, of the functional
$$(y,Z,U)\;\mapsto\;\mathbb{E}\Big[\big|Y_T^{\,y,Z,U}-g\big(X_T^{t,x}\big)\big|^2\Big]\in[0,\infty).$$
We regard $Z$ and $U$ as the policy functions of a deep learning problem and use deep neural networks to approximate them. In this way, the stochastic process $u(t,X_t)$ corresponds to the solution of the stochastic control problem and can be recovered through the policy functions $Z$ and $U$; thus, the numerical solution of the nonlinear partial differential equation is transformed into a stochastic control problem.

3.3. Forward Discretization of the Backward Stochastic Differential Equations with Jumps

For the following BSDE:
$$\begin{aligned} X_t&=x+\int_0^t\mu(s,X_s)\,ds+\int_0^t\sigma(s,X_s)\,dW_s+\int_0^t\beta(s,X_{s-})\,d\tilde N_s,\\ Y_t&=g(X_T)+\int_t^T f(s,X_s,Y_s,Z_s,U_s)\,ds-\int_t^T Z_s\,dW_s-\int_t^T U_s\,d\tilde N_s, \end{aligned}$$
and for all $t_1,t_2\in[0,T]$ with $t_1\le t_2$, the following equations hold almost everywhere:
$$X_{t_2}=X_{t_1}+\int_{t_1}^{t_2}\mu(s,X_s)\,ds+\int_{t_1}^{t_2}\sigma(s,X_s)\,dW_s+\int_{t_1}^{t_2}\beta(s,X_{s-})\,d\tilde N_s$$
$$Y_{t_2}=Y_{t_1}-\int_{t_1}^{t_2}f(s,X_s,Y_s,Z_s,U_s)\,ds+\int_{t_1}^{t_2}Z_s\,dW_s+\int_{t_1}^{t_2}U_s\,d\tilde N_s$$
By substituting (6) and (7) into (19), we can obtain:
$$Y_{t_2}=Y_{t_1}-\int_{t_1}^{t_2}f\big(s,X_s,Y_s,(\nabla_x u\,\sigma)(s,X_s)\big)\,ds+\int_{t_1}^{t_2}\big\langle(\nabla_x u\,\sigma)(s,X_s),\,dW_s\big\rangle_{\mathbb{R}^d}+\int_{t_1}^{t_2}\big(u\big(s,X_{s-}+\beta(s,X_{s-})\big)-u(s,X_{s-})\big)\,d\tilde N_s$$
Now we discretize in time, dividing the interval $[0,T]$ into $N$ partitions with $t_0,t_1,\ldots,t_N\in[0,T]$ and $0=t_0<t_1<\cdots<t_N=T$. For $N$ sufficiently large, according to (18) and (19) we have:
$$X_{t_{n+1}}-X_{t_n}\approx\mu(t_n,X_{t_n})\,\Delta t_n+\sigma(t_n,X_{t_n})\,\Delta W_{t_n}+\beta(t_n,X_{t_n})\,\Delta\tilde N_{t_n}$$
$$Y_{t_{n+1}}\approx Y_{t_n}-f\big(t_n,X_{t_n},Y_{t_n},(\nabla_x u\,\sigma)(t_n,X_{t_n})\big)\,\Delta t_n+\big\langle(\nabla_x u\,\sigma)(t_n,X_{t_n}),\,W_{t_{n+1}}-W_{t_n}\big\rangle+U(t_n,X_{t_n})\,\big(\tilde N_{t_{n+1}}-\tilde N_{t_n}\big)$$
where $\Delta t_n=t_{n+1}-t_n$, $\Delta W_{t_n}=W_{t_{n+1}}-W_{t_n}$ and $\Delta\tilde N_{t_n}=\tilde N_{t_{n+1}}-\tilde N_{t_n}$ [5,18].
The above is the Euler scheme for discretization. In this way, we can start from the initial value X 0 of (4) and finally obtain the approximation of the Euler scheme of (4) at N partitions.
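For illustration, a minimal NumPy sketch of this Euler scheme is given below. The coefficient callables `mu_fn`, `sigma_fn` and `beta_fn`, the single compensated Poisson driver shared by all components, and the componentwise (diagonal) form of $\sigma$ are simplifying assumptions of ours; the function name `simulate_paths` is likewise our own, not the authors' implementation.

```python
import numpy as np

def simulate_paths(x0, mu_fn, sigma_fn, beta_fn, lam, T=1.0, N=100, M=1000, seed=0):
    """Euler discretization of the forward jump-diffusion X on [0, T].

    Returns X with shape (M, N + 1, d) together with the Brownian increments
    dW (M, N, d) and the compensated Poisson increments dNt (M, N, 1), which
    are reused later when propagating Y forward.
    """
    rng = np.random.default_rng(seed)
    d, dt = len(x0), T / N
    X = np.empty((M, N + 1, d))
    X[:, 0, :] = x0
    dW = rng.normal(0.0, np.sqrt(dt), size=(M, N, d))   # Brownian increments
    dN = rng.poisson(lam * dt, size=(M, N, 1))          # raw Poisson increments
    dNt = dN - lam * dt                                 # compensated increments
    for n in range(N):
        t = n * dt
        X[:, n + 1, :] = (X[:, n, :]
                          + mu_fn(t, X[:, n, :]) * dt
                          + sigma_fn(t, X[:, n, :]) * dW[:, n, :]
                          + beta_fn(t, X[:, n, :]) * dNt[:, n, :])
    return X, dW, dNt
```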

3.4. General Framework for Neural Networks

Two feedforward neural networks are established at each time step $t=t_n$ in (21) and (22). The first approximates the gradient of the unknown solution, i.e., the mapping $X_{t_n}\mapsto Z_{t_n}=\nabla_x u(t_n,X_{t_n})\,\sigma(t_n,X_{t_n})$; we denote it by $\mathrm{NN}_{\theta_Z^n}(x)$, where $\theta_Z^n$ collects all parameters of this network and $Z^n$ indicates that it approximates $Z_{t_n}$ at time $t_n$. The second approximates the jump part of the unknown solution, i.e., the mapping $X_{t_n}\mapsto U_{t_n}=u\big(t_n,X_{t_n}+\beta(t_n,X_{t_n})\big)-u(t_n,X_{t_n})$; we denote it by $\mathrm{NN}_{\theta_U^n}(x)$, where $\theta_U^n$ collects its parameters and $U^n$ indicates that it approximates $U_{t_n}$ at time $t_n$. For convenience we write $\theta_Z=\{\theta_Z^1,\ldots,\theta_Z^{N-1}\}$, $\theta_U=\{\theta_U^1,\ldots,\theta_U^{N-1}\}$ and $\theta=\{\theta_Z,\theta_U\}$. As shown in Figure 1, all sub-neural networks are stacked together to form one complete neural network.
$X_{t_0}=\xi\in\mathbb{R}^d$ in the initial layer is a random variable. The quantities $u(t_0,X_{t_0})$, $\nabla_x u(t_0,X_{t_0})$ and $u\big(t_0,X_{t_0}+\beta(t_0,X_{t_0})\big)-u(t_0,X_{t_0})$ in the initial layer are unknown, and we treat them as parameters of the neural network. They correspond to the values of the BSDE with jumps as follows:
$$Y_0=u(t_0,\xi),$$
$$Z_0=\nabla_x u(t_0,\xi),$$
and
$$U_0=u\big(t_0,\xi+\beta(t_0,\xi)\big)-u(t_0,\xi).$$
In this network, $X_{t_n}$ in the current layer depends on $X_{t_{n-1}}$ in the previous layer, and $u(t_n,X_{t_n})$ in the current layer depends on $X_{t_{n-1}}$, $u(t_{n-1},X_{t_{n-1}})$, $\nabla_x u(t_{n-1},X_{t_{n-1}})$ and $u\big(t_{n-1},X_{t_{n-1}}+\beta(t_{n-1},X_{t_{n-1}})\big)-u(t_{n-1},X_{t_{n-1}})$ in the previous layer. However, $\nabla_x u(t_n,X_{t_n})$ and $u\big(t_n,X_{t_n}+\beta(t_n,X_{t_n})\big)-u(t_n,X_{t_n})$ are not directly available in the current layer. Therefore, as shown in Figure 1, we start from $X_{t_n}$ in the current layer and build two sub-neural networks to represent these two quantities. In addition, the final loss function is constructed from the given terminal condition $u(T,x)$, that is, $u(t_N,X_{t_N})$ in the neural network, which corresponds to $g(X_T)$ in the nonlinear Feynman-Kac formula.
Specifically, for $n=0,1,\ldots,N-1$, we treat $\hat Y_0$, $\hat Z_0$ and $\hat U_0$ as parameters and propagate the process forward with the Euler approximation
$$\hat Y_{t_{n+1}}=\hat Y_{t_n}-f(t_n,X_{t_n},\hat Y_{t_n},\hat Z_{t_n})\,\Delta t_n+\big\langle\hat Z_{t_n},\Delta W_{t_n}\big\rangle+\hat U_{t_n}\,\Delta\tilde N_{t_n},$$
where, for suitable parameters $\theta_Z^n$, $\hat Z_{t_n}=\mathrm{NN}_{\theta_Z^n}(X_{t_n})\approx(\nabla_x u\,\sigma)(t_n,X_{t_n})$, and, for suitable parameters $\theta_U^n$, $\hat U_{t_n}=\mathrm{NN}_{\theta_U^n}(X_{t_n})\approx u\big(t_n,X_{t_n}+\beta(t_n,X_{t_n})\big)-u(t_n,X_{t_n})$. We then obtain a suitable approximation of $u(0,\xi)$:
$$\hat Y_0\approx u(0,\xi).$$
The mean square error between the final network output $\hat Y_{t_N}$ and the true terminal value $g(X_{t_N})$ is chosen as the loss function:
$$\theta\mapsto\mathrm{loss}(\theta)=\mathbb{E}\big[|\hat Y_{t_N}-g(X_{t_N})|^2\big]\in[0,\infty),$$
where $\theta=\{\hat Y_0,\hat Z_0,\hat U_0,\theta_Z^1,\ldots,\theta_Z^{N-1},\theta_U^1,\ldots,\theta_U^{N-1}\}$ is the set of all trainable parameters of the system. This loss function is natural since $Y_T=g(X_T)$. The expectation in (28) runs over all sample paths; since there are infinitely many, the full training set cannot be traversed, so we adopt an optimization method based on stochastic gradient descent. In each iteration, only a batch of sampled paths is used to estimate the loss, which yields the gradient of the loss with respect to all parameters and a one-step update of the network parameters.
In this way, backpropagation through the network updates the parameters layer by layer with the chosen optimizer. After the DNN is trained, $\hat Y_0$, which was kept as a trainable parameter, is read off as the required value.
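A minimal PyTorch sketch of this forward construction of $\hat Y$ is shown below. The names `forward_Y`, `net_z`, `net_u` and `f_fn` are our own placeholders: `net_z[n]` and `net_u[n]` stand for the sub-networks $\mathrm{NN}_{\theta_Z^n}$ and $\mathrm{NN}_{\theta_U^n}$, `f_fn` for the driver $f$, and the scalar jump process $U$ driven by a single compensated Poisson process is an assumption carried over from Section 2.2.

```python
import torch

def forward_Y(X, dW, dNt, y0, z0, u0, net_z, net_u, f_fn, dt):
    """Propagate the discretized BSDE with jumps forward in time.

    X: (M, N+1, d) simulated forward paths; dW: (M, N, d) Brownian increments;
    dNt: (M, N, 1) compensated Poisson increments; y0, z0, u0: trainable
    initial values; net_z, net_u: lists of N-1 sub-networks for Z and U.
    Returns the terminal approximation of Y for every path.
    """
    M, N = dW.shape[0], dW.shape[1]
    y = y0.expand(M, -1)
    z, u = z0.expand(M, -1), u0.expand(M, -1)
    for n in range(N):
        t = n * dt
        y = (y - f_fn(t, X[:, n, :], y, z, u) * dt
               + (z * dW[:, n, :]).sum(dim=1, keepdim=True)
               + (u * dNt[:, n, :]).sum(dim=1, keepdim=True))
        if n < N - 1:                       # sub-networks for the next time step
            z = net_z[n](X[:, n + 1, :])
            u = net_u[n](X[:, n + 1, :])
    return y                                # approximation of Y at terminal time T
```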

3.5. Details of the Algorithms

The detailed steps of our proposed deep learning-based algorithm for numerically solving the BSDE with jumps are presented as follows. Using the neural network solver described above, a BSDE with jumps in the form of Equation (4) is solved numerically in two stages:
  • Simulate sample paths using standard Monte Carlo methods
  • Use deep neural networks (DNNs) to approximate $Z_t=\nabla_x u(t,X_t)\,\sigma(t,X_t)$ and $U_t=u\big(t,X_t+\beta(t,X_t)\big)-u(t,X_t)$, plug them into the BSDE with jumps and perform a forward iteration in time
For simplicity, here we use the one-dimensional case as an example; the high-dimensional case is similar. We divide the time interval $[0,T]$ into $N$ partitions, so that $t_0,t_1,\ldots,t_N\in[0,T]$ and $0=t_0<t_1<\cdots<t_N=T$, and set $h_i=t_{i+1}-t_i$, $dW_i=W_{i+1}-W_i$ and $d\tilde N_i=\tilde N_{i+1}-\tilde N_i$. The detailed steps are as follows:
(1)
$M$ Monte Carlo paths of the diffusion process $\{X_i^j:\ i=0,1,\ldots,N;\ j=1,2,\ldots,M;\ X_0^j\equiv X_0\}$ are sampled with the Euler scheme:
$$X_{i+1}^j=X_i^j+\mu(t_i,X_i^j)\,h_i+\sigma(t_i,X_i^j)\,dW_i^j+\beta(t_i,X_i^j)\,d\tilde N_i^j$$
This step is the same as the standard Monte Carlo method. Other discretization schemes can also be used, such as logarithmic Euler discretization or Milstein discretization.
(2)
At $t_0=0$, the initial values $Y_0$, $Z_0=\nabla_x u(0,X_0)\,\sigma(0,X_0)$ and $U_0=u\big(0,X_0+\beta(0,X_0)\big)-u(0,X_0)$ are treated as part of the neural network parameters; $Y_0$, $Z_0$ and $U_0$ are all initialized as constant random numbers selected from an empirical range.
(3)
At each time step $t_i$, given $\{X_i^j:\ j=1,2,\ldots,M\}$, the values $\{Z_i^j:\ j=1,2,\ldots,M\}$ and $\{U_i^j:\ j=1,2,\ldots,M\}$ are approximated using deep neural networks. Note that at every time $t=t_i$ the same two sub-neural networks are used for all Monte Carlo paths. In our one-dimensional case, as described in Section 3.4, we introduce two sub-networks $\theta_Z^i:\mathbb{R}\to\mathbb{R}$ such that $Z_i^j=\theta_Z^i(X_i^j)$, $j=1,2,\ldots,M$, and $\theta_U^i:\mathbb{R}\to\mathbb{R}$ such that $U_i^j=\theta_U^i(X_i^j)$, $j=1,2,\ldots,M$, and write $\theta_Z=\{\theta_Z^1,\ldots,\theta_Z^{N-1}\}$, $\theta_U=\{\theta_U^1,\ldots,\theta_U^{N-1}\}$, $\theta=\{\theta_Z,\theta_U\}$.
(4)
For $t_i\in[0,T)$, we have
$$Y_{i+1}^j=Y_i^j-f(t_i,X_i^j,Y_i^j,Z_i^j,U_i^j)\,h_i+\big\langle Z_i^j,\,dW_i^j\big\rangle+\big\langle U_i^j,\,d\tilde N_i^j\big\rangle$$
According to this formula, we can directly compute $Y_{i+1}^j$ at the next time step; this step does not require optimizing any parameters. In this way, the BSDE with jumps propagates forward in time from $t_i$ to $t_{i+1}$. Along each Monte Carlo path, as the BSDE with jumps propagates forward from 0 to $T$, $Y_N^j$ is obtained as $Y_N^j(Y_0,Z_0,U_0,\theta)$, where $\theta=\{\theta_Z,\theta_U\}$ collects the parameters of the $N-1$ pairs of sub-networks.
(5)
Calculate the following loss function:
$$L_{\mathrm{Forward}}=\frac{1}{M}\sum_{j=1}^{M}\Big|Y_N^j(Y_0,Z_0,U_0,\theta)-g(X_N^j)\Big|^2$$
where g is the terminal function.
(6)
Use stochastic optimization algorithms to minimize the loss function:
$$\big(\tilde Y_0,\tilde Z_0,\tilde U_0,\tilde\theta\big)=\underset{Y_0,\,Z_0,\,U_0,\,\theta}{\operatorname{argmin}}\ \frac{1}{M}\sum_{j=1}^{M}\Big|Y_N^j(Y_0,Z_0,U_0,\theta)-g(X_N^j)\Big|^2$$
The estimate $\tilde Y_0$ is the desired value of the solution at time $t=0$.
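Putting the pieces together, the following is a hedged sketch of this optimization step in PyTorch, not the authors' actual implementation. It reuses the hypothetical helpers `simulate_paths` and `forward_Y` from the earlier sketches; `make_subnet` is an assumed constructor for the per-step sub-networks (a sketch matching the architecture described in Section 4 follows that paragraph), and switching between Adam and NAdam only changes the optimizer class.

```python
import torch

def train(x0, mu_fn, sigma_fn, beta_fn, lam, f_fn, g_fn,
          T=1.0, N=100, M=1000, iters=4000, lr=1e-2, use_nadam=True):
    d, dt = len(x0), T / N
    # Trainable initial values Y0, Z0, U0 and the N-1 pairs of sub-networks.
    y0 = torch.nn.Parameter(torch.rand(1, 1))
    z0 = torch.nn.Parameter(torch.rand(1, d))
    u0 = torch.nn.Parameter(torch.rand(1, 1))
    net_z = torch.nn.ModuleList([make_subnet(d) for _ in range(N - 1)])
    net_u = torch.nn.ModuleList([make_subnet(d, out_dim=1) for _ in range(N - 1)])
    params = [y0, z0, u0, *net_z.parameters(), *net_u.parameters()]
    opt_cls = torch.optim.NAdam if use_nadam else torch.optim.Adam
    opt = opt_cls(params, lr=lr)

    for it in range(iters):
        # Fresh Monte Carlo batch of forward paths for this iteration.
        paths = simulate_paths(x0, mu_fn, sigma_fn, beta_fn, lam, T, N, M, seed=it)
        X, dW, dNt = (torch.tensor(a, dtype=torch.float32) for a in paths)
        yT = forward_Y(X, dW, dNt, y0, z0, u0, net_z, net_u, f_fn, dt)
        loss = ((yT - g_fn(X[:, -1, :])) ** 2).mean()   # L_Forward
        opt.zero_grad()
        loss.backward()
        opt.step()
    return y0.item()   # estimate of u(0, x0) = Y_0
```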

4. Numerical Results

In this section, we use a deep neural network to implement the theoretical framework of the proposed algorithm. Many problems in the financial field can be solved numerically via high-dimensional PDEs and their corresponding BSDEs. Among them, the jump process can depict sudden external shocks and therefore accurately captures a class of uncertain events, a situation that is most common in financial markets. We therefore extend the numerical solution of high-dimensional PDEs to the jump-diffusion case. In this paper, three classical high-dimensional PDEs with important applications in finance-related fields are selected for numerical simulation: financial derivative pricing under the jump-diffusion model, bond pricing under the Vasicek model with jumps, and the Hamilton-Jacobi-Bellman equation.
The numerical experiments in this paper were run on a 64-bit Windows 10 operating system using the PyTorch deep learning framework in Python. All examples in this section are computed from 1000 sample paths, each with a time partition of N = 100. We ran each example 10 times independently and report the average result. All neural network parameters are initialized from a uniform distribution. Each sub-neural network at each time node contains 4 layers: one d-dimensional input layer, two (d + 10)-dimensional hidden layers and one d-dimensional output layer. We use the rectifier function (ReLU) as the activation function, and Batch Normalization (BN) is applied after each linear transformation and before activation.
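For reference, a sub-network of the kind just described might be built as in the sketch below; `make_subnet` is the hypothetical constructor referred to in the training-loop sketch of Section 3.5. The output width defaults to d (as for the networks approximating Z) and is set to 1 for the scalar jump term U; placing Batch Normalization only on the hidden layers is our assumption.

```python
import torch

def make_subnet(d, out_dim=None, hidden=None):
    """One per-time-step sub-network: d-dimensional input, two hidden layers
    of width d + 10 with Batch Normalization before the ReLU activation,
    and a linear output layer."""
    hidden = hidden or (d + 10)
    out_dim = out_dim or d
    return torch.nn.Sequential(
        torch.nn.Linear(d, hidden),
        torch.nn.BatchNorm1d(hidden),
        torch.nn.ReLU(),
        torch.nn.Linear(hidden, hidden),
        torch.nn.BatchNorm1d(hidden),
        torch.nn.ReLU(),
        torch.nn.Linear(hidden, out_dim),
    )
```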

4.1. Pricing of Financial Derivatives under Jump Diffusion Model

In this section, we will apply the proposed method to the pricing of derivatives related to a 100-dimensional jump diffusion model. In this model, the stock price X t satisfies the following jump diffusion model [19]:
$$dX_t=(r-\lambda k)X_t\,dt+\sigma X_t\,dW_t+(V-1)X_t\,d\tilde N_t$$
where $r$ is the constant discount rate, $\lambda$ is the average number of jumps of the stock price per unit time and $\sigma$ is the constant volatility. $V$ represents the jump magnitude. Assuming the jump amplitude is fixed, so that $V$ is constant, we can let $k=V-1$ with $k>-1$. The equation can then be written as $dX_t=rX_t\,dt+\sigma X_t\,dW_t+kX_t\,d\tilde N_t$.
It is known that the European call option with the stock as the underlying asset satisfies the following partial differential equation:
$$\frac{\partial u}{\partial t}(t,x)+\frac{1}{2}\sigma^2x^2\,\Delta_x u(t,x)+(r-\lambda k)\,x\,\nabla_x u(t,x)+\lambda\big(u(t,xV)-u(t,x)\big)-r\,u(t,x)=0$$
$$u(T,x)=(x-K)^+$$
For all $t\in[0,T]$, $x,\omega\in\mathbb{R}^d$, $y\in\mathbb{R}$, $z\in\mathbb{R}^{1\times d}$, suppose $d=100$, $T=1$, $N=100$, $\lambda=1$, $X_0=(1,\ldots,1)$, $\mu(t,x)=rX_t$, $\sigma(t,x)=\sigma X_t$, $\beta(t,x)=(V-1)X_t=kX_t$, $f(t,x,y,z)=-ry$ and $g(x)=\big(\max_{1\le i\le100}x_i-5\big)^+$, where $r_i$, $\sigma_i$, $k_i$ are randomly selected in $(0,1)$ for $i=1,\ldots,100$.
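Under these assumptions, the ingredients of this example could be passed to the training sketch of Section 3.5 roughly as follows. The names mirror the earlier sketches; treating the driver's discount rate as a single scalar `r0` (here the mean of the $r_i$) is our own simplification, since the text above specifies per-asset rates.

```python
import numpy as np
import torch

d = 100
rng = np.random.default_rng(1)
r   = rng.uniform(0.0, 1.0, size=d)   # per-asset rates r_i
sig = rng.uniform(0.0, 1.0, size=d)   # per-asset volatilities sigma_i
k   = rng.uniform(0.0, 1.0, size=d)   # per-asset jump magnitudes k_i
r0  = float(r.mean())                 # scalar discount rate for the driver (assumption)

x0 = np.ones(d)
mu_fn    = lambda t, x: r * x                   # mu(t, x)    = r_i x_i
sigma_fn = lambda t, x: sig * x                 # sigma(t, x) = sigma_i x_i
beta_fn  = lambda t, x: k * x                   # beta(t, x)  = k_i x_i
f_fn     = lambda t, x, y, z, u: -r0 * y        # driver f = -r y
g_fn     = lambda x: torch.clamp(x.max(dim=1, keepdim=True).values - 5.0, min=0.0)

# u0_hat = train(x0, mu_fn, sigma_fn, beta_fn, lam=1.0, f_fn=f_fn, g_fn=g_fn)
```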
Table 1 and Table 2 report the main numerical results of solving the derivative pricing problem for the 100-dimensional jump diffusion model with the Adam and NAdam optimizers, respectively, including, as the number of iteration steps changes, the mean and standard deviation of $u(0,X_0)$ (i.e., $Y_0$), the mean and standard deviation of the loss function, and the running time. Only the results for iteration steps $n\in\{1000,2000,3000,4000\}$ are shown as typical examples. We take the initial value $u(0,X_0)=0.3124$ obtained from the classical Monte Carlo method as the exact solution and compare it with the numerical results.
It can be seen from Table 1 and Table 2 that when the iteration steps are the same, the calculation time required for using the two optimizers is not very different. As the iteration progresses, the loss function value of the NAdam optimizer is smaller than that of the Adam optimizer at the same iteration steps and the final loss function value is also smaller.
Figure 2 and Figure 3 show how the initial value estimate and the loss function change with the number of iteration steps when the Adam and NAdam optimizers are used. The initial value estimates and loss functions of both methods converge rapidly, but the latter converges noticeably faster than the former for both the initial value estimate and the loss function: the former tends to converge after about 1500 iterations, while the latter tends to converge after about 700 iterations. After 4000 iterations, the numerical solution obtained with the Adam algorithm is 0.32143775, while that obtained with the NAdam algorithm is 0.32273737; both are close to the exact value simulated by the Monte Carlo method.

4.2. Bond Pricing under the Jump Vasicek Model

In this section, we use the proposed method to solve the pricing problem of a class of bonds with interest rates subject to the Vasicek model with jumps [20,21,22]. In this model, the short-term interest rate X t obeys the following stochastic differential equations:
$$dX_t=a(b-X_t)\,dt+\sigma\,dW_t+k\,d\tilde N_t$$
i.e., each component of $X_t$ follows
$$dX_t^i=a_i(b_i-X_t^i)\,dt+\sigma_i\,dW_t+k_i\,d\tilde N_t$$
For all $t\in[0,T]$, $x,\omega\in\mathbb{R}^d$, $y\in\mathbb{R}$, $z\in\mathbb{R}^{1\times d}$, suppose $d=100$, $T=1$, $N=100$, $\lambda=1$, $X_0=(1,\ldots,1)$, $\mu(t,x)=(\mu_1,\ldots,\mu_d)$ with $\mu_i=a_i(b_i-x_i)$, $\sigma(t,x)=(\sigma_1,\ldots,\sigma_d)$, $\beta(t,x)=(\beta_1,\ldots,\beta_d)$, where $a_i$, $b_i$, $\sigma_i$, $\beta_i$ are randomly selected in $(0,1)$, $f(t,x,y,z)=-\big(\max_{1\le i\le n}x_i\big)\,y$ and $g(x)=1$. Then the price $u$ of a zero-coupon bond paying 1 at maturity $T$ under the above jump Vasicek model satisfies $u(T,x)=1$ for all $x\in\mathbb{R}^d$, and the following partial differential equation holds for all $t\in[0,T]$, $x\in\mathbb{R}^d$:
$$\frac{\partial u}{\partial t}(t,x)+\sum_{i=1}^{n}\big(a_i(b_i-x_i)-\lambda\beta_i\big)\frac{\partial u}{\partial x_i}+\frac{1}{2}\sum_{1\le i,j\le n}\sigma_i\sigma_j\frac{\partial^2 u}{\partial x_i\,\partial x_j}+\lambda\big(u(t,x+\beta(x))-u(t,x)\big)-\big(\max_{1\le i\le n}x_i\big)\,u=0$$
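Analogously, a hedged specification of the jump Vasicek ingredients for the training sketch might look as follows; the short rate entering the driver is taken as $\max_i x_i$, as stated above, and the remaining names follow the earlier sketches rather than the authors' implementation.

```python
import numpy as np
import torch

d = 100
rng = np.random.default_rng(2)
a, b, sig, beta = (rng.uniform(0.0, 1.0, size=d) for _ in range(4))

x0 = np.ones(d)
mu_fn    = lambda t, x: a * (b - x)                  # mean reversion a_i (b_i - x_i)
sigma_fn = lambda t, x: sig * np.ones_like(x)        # constant diffusion sigma_i
beta_fn  = lambda t, x: beta * np.ones_like(x)       # constant jump size beta_i
f_fn     = lambda t, x, y, z, u: -x.max(dim=1, keepdim=True).values * y  # -max_i(x_i) * y
g_fn     = lambda x: torch.ones(x.shape[0], 1)       # zero-coupon bond pays 1 at maturity

# bond_price = train(x0, mu_fn, sigma_fn, beta_fn, lam=1.0, f_fn=f_fn, g_fn=g_fn)
```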
Table 3 and Table 4 report the main numerical results of solving this class of bond pricing problems, with interest rates following the Vasicek model with jumps, using the Adam and NAdam optimizers, respectively, including, as the number of iteration steps changes, the mean and standard deviation of $u(0,X_0)$ (i.e., $Y_0$), the mean and standard deviation of the loss function, and the running time. Only the results for iteration steps $n\in\{1000,2000,3000,4000\}$ are shown as typical examples. We take the initial value $u(0,X_0)=0.2702$ obtained from the classical Monte Carlo method as the exact solution and compare it with the numerical results.
It can be seen from Table 3 and Table 4 that with the iteration, the loss function value of the NAdam optimizer is smaller than that of the Adam optimizer at the same iteration steps, and the final loss function value is smaller, but the cost is that the operation time becomes longer.
Figure 4 and Figure 5 show how the initial value estimate and the loss function change with the number of iteration steps when the Adam and NAdam optimizers are used. The initial value estimates and loss functions of both methods converge rapidly, but the latter converges faster and more stably than the former for both the initial value estimate and the loss function. After 4000 iterations, the numerical solution obtained with the Adam algorithm is 0.25530547, while that obtained with the NAdam algorithm is 0.27403888; both are close to the exact value simulated by the Monte Carlo method.

4.3. Hamilton-Jacobi-Bellman (HJB) Equation

In fields such as finance, investment, and risk management, optimization and control problems are often involved, and these problems are often represented by stochastic optimal control models. One way to solve this kind of control problem is to obtain and solve the Hamilton-Jacobi-Bellman equation (HJB equation for short) of the corresponding control problem according to the principle of dynamic programming. In this section, we use the proposed method to solve a class of 100-dimensional HJB equations [12]:
For all $t\in[0,T]$, $x,\omega\in\mathbb{R}^d$, $y\in\mathbb{R}$, $z\in\mathbb{R}^{1\times d}$, suppose $d=100$, $T=1$, $N=100$, $\lambda=1$, $X_0=(1,\ldots,1)$, $\mu(t,x)=0$, $\sigma(t,x)=\sqrt{2}$, $\beta(t,x)=\beta^{\mathsf T}x$ with $\beta^{\mathsf T}=(\beta_1,\ldots,\beta_d)$ and $\beta_i$ randomly selected in $(0,1)$, $f(t,x,y,z)=-\|z\|^2$ and $g(x)=\ln\big((1+\|x\|^2)/2\big)$. Then $u(T,x)=\ln\big((1+\|x\|^2)/2\big)$ for all $x\in\mathbb{R}^d$, and the following partial differential equation holds for all $t\in[0,T]$, $x\in\mathbb{R}^d$:
$$\frac{\partial u}{\partial t}(t,x)+\Delta_x u(t,x)+u\big(t,x+\beta(x)\big)-u(t,x)-\nabla_x u(t,x)\cdot\beta(x)=\|\nabla_x u(t,x)\|^2$$
The main numerical results of solving this class of HJB equations with the Adam and NAdam optimizers are shown in Table 5 and Table 6, respectively, including, as the number of iteration steps changes, the mean and standard deviation of $u(0,X_0)$ (i.e., $Y_0$), the mean and standard deviation of the loss function, and the running time. Only the results for iteration steps $n\in\{2000,4000,6000,8000\}$ are shown as typical examples. We take the initial value $u(0,X_0)=2.6119$ obtained from the classical Monte Carlo method as the exact solution and compare it with the numerical results.
It can be seen from Table 5 and Table 6 that with the iteration, the loss function value of the NAdam optimizer is smaller than that of the Adam optimizer at the same iteration steps, and the final loss function value is smaller, but the operation time is longer.
Figure 6 and Figure 7 show how the initial value estimate and the loss function change with the number of iteration steps when the Adam and NAdam optimizers are used. The latter converges noticeably faster than the former for both the initial value estimate and the loss function: the former tends to converge after about 7000 iterations, while the latter tends to converge after about 5500 iterations. After 8000 iterations, the numerical solution obtained with the Adam algorithm is 2.73136686, while that obtained with the NAdam algorithm is 2.6456454; both are close to the exact value simulated by the Monte Carlo method.

5. Conclusions

In this paper, we propose an algorithm that can be used to solve a class of partial differential equations with jump terms and their corresponding backward stochastic differential equations. Through the nonlinear Feynman-Kac formula, the above-mentioned high-dimensional nonlinear PDE with jumps and its corresponding BSDE can be expressed equivalently, and the numerical solution problem can be regarded as a stochastic control problem. Next, we treat the gradient and jump process of the unknown solution separately as policy functions and use two neural networks at each time division to approximate the gradient and jump process of the unknown solution respectively. In this way, we can use deep learning to overcome the “curse of dimensionality” caused by high-dimensional PDE with jumps and obtain numerical solutions. In addition, we attempt to replace the traditional Adam algorithm with the new stochastic optimization approximation algorithm and apply it to our algorithm.
Next, focusing on the financial field, we applied our algorithm to solve three common high-dimensional problems in finance-related fields and then compared the results. We concluded that our algorithm performs well in numerical simulation. It is also concluded that after applying the new optimizer to the deep learning algorithm, the convergence speed of the model is mostly faster, and the generalization ability is significantly improved, at the cost of the operation time possibly being longer.
The algorithm proposed in this paper can be used for a wide variety of problems in finance, insurance, actuarial modeling and other fields, such as the pricing problem of some special path-dependent financial derivatives, the bond pricing problem under the jump diffusion model of interest rates, the pricing of equity-linked life insurance contracts and so on.

6. Discussion and Limitations

The algorithm proposed in this article targets only a special class of nonlinear parabolic PDEs with jump terms, so its range of application is limited. The algorithm could be improved to cover other special forms of PDE: for example, the deep learning-based algorithm could be extended to reflected PDEs by using the penalty function method to approximate them with PDEs of the general form considered here, and delayed PDEs, constrained PDEs and so on could also be considered. In addition, this paper only considers the most common Poisson process, not general jump processes; we hope to apply the algorithm to other jump processes in the future. Moreover, this paper does not consider changes in the intensity of the jump process, and we hope to extend the algorithm to general integro-partial differential equations in future work.
This paper mainly focuses on numerical simulations of examples in finance. In the future, more cases can be considered to test the practical effect of the proposed algorithm in other jump-diffusion instances. In addition, this paper only introduces the specific steps of the algorithm but does not analyze the convergence, error estimation and robustness of stochastic simulation, and thus, we hope to improve it in future research.

Author Contributions

Conceptualization, X.L.; Methodology, X.L.; Software, Y.G.; Validation, Y.G.; Writing—original draft, Y.G.; Writing—review & editing, Y.G.; Supervision, Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kang, W.; Wilcox, L.C. Mitigating the curse of dimensionality: Sparse grid characteristics method for optimal feedback control and HJB equations. Comput. Optim. Appl. 2017, 68, 289–315.
2. El Karoui, N.; Peng, S.; Quenez, M.C. Backward stochastic differential equations in finance. Math. Financ. 1997, 7, 1–71.
3. Zhang, J. A numerical scheme for BSDEs. Ann. Appl. Probab. 2004, 14, 459–488.
4. Barigou, K.; Delong, Ł. Pricing equity-linked life insurance contracts with multiple risk factors by neural networks. J. Comput. Appl. Math. 2022, 404, 113922.
5. Han, J.; Jentzen, A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat. 2017, 5, 349–380.
6. Han, J.; Jentzen, A.; Weinan, E. Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. USA 2018, 115, 8505–8510.
7. Han, J.; Hu, R. Deep fictitious play for finding Markovian Nash equilibrium in multi-agent games. In Proceedings of the First Mathematical and Scientific Machine Learning Conference, Princeton, NJ, USA, 20–24 July 2020; Volume 107, pp. 221–245.
8. Beck, C.; Weinan, E.; Jentzen, A. Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. J. Nonlinear Sci. 2019, 29, 1563–1619.
9. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Machine learning of linear differential equations using Gaussian processes. J. Comput. Phys. 2017, 348, 683–693.
10. Sirignano, J.; Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 2018, 375, 1339–1364.
11. Rong, S. Theory of Stochastic Differential Equations with Jumps and Applications; Springer: London, UK, 2005; pp. 205–290.
12. Delong, Ł. Backward Stochastic Differential Equations with Jumps and Their Actuarial and Financial Applications; Springer: London, UK, 2013; pp. 85–88.
13. Tang, S.; Li, X. Necessary conditions for optimal control of stochastic systems with random jumps. SIAM J. Control Optim. 1994, 32, 1447–1475.
14. Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015.
15. Sutskever, I.; Martens, J.; Dahl, G.; Hinton, G. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning (ICML 2013), Atlanta, GA, USA, 16–21 June 2013; pp. 1139–1147.
16. Barles, G.; Buckdahn, R.; Pardoux, E. Backward stochastic differential equations and integral-partial differential equations. Stoch. Stoch. Rep. 1997, 60, 57–83.
17. Buckdahn, R.; Pardoux, E. BSDE's with jumps and associated integro-partial differential equations. Preprint 1994, 79.
18. Gnoatto, A.; Patacca, M.; Picarelli, A. A deep solver for BSDEs with jumps. arXiv 2022, arXiv:2211.04349.
19. Merton, R.C. Option pricing when underlying stock returns are discontinuous. J. Financ. Econ. 1976, 3, 125–144.
20. Wu, Y.; Liang, X. Vasicek model with mixed-exponential jumps and its applications in finance and insurance. Adv. Differ. Equ. 2018, 2018, 1–15.
21. Lukman, P.C.; Handari, B.D.; Tasman, H. Study on European put option pricing with underlying asset zero-coupon bond and interest rate following the Vasicek model with jump. J. Phys. Conf. Ser. 2021, 1725, 012092.
22. Jiang, Y.; Li, J. Convergence of the Deep BSDE method for FBSDEs with non-Lipschitz coefficients. Probab. Uncertain. Quant. Risk 2021, 6, 391–408.
Figure 1. Deep neural network framework.
Figure 2. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (34) under Adam optimizer.
Figure 3. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (34) under NAdam optimizer.
Figure 4. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (37) under Adam optimizer.
Figure 5. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (37) under NAdam optimizer.
Figure 6. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (38) under Adam optimizer.
Figure 7. Changes of initial value estimation and loss function against the number of the iteration steps in the case of PDE (38) under NAdam optimizer.
Table 1. Numerical results with Adam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 1000 | 0.5989 | 0.1719 | 1.2743 | 7.7509 | 474 |
| 2000 | 0.4664 | 0.1235 | 1.2532 | 4.3095 | 891 |
| 3000 | 0.4183 | 0.1623 | 1.0894 | 3.7375 | 1271 |
| 4000 | 0.3940 | 0.1465 | 0.6269 | 2.4771 | 1676 |
Table 2. Numerical results with NAdam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 1000 | 0.2821 | 0.0625 | 1.1528 | 2.7450 | 549 |
| 2000 | 0.3019 | 0.0484 | 0.6747 | 2.0031 | 828 |
| 3000 | 0.3084 | 0.0406 | 0.4911 | 1.6566 | 1206 |
| 4000 | 0.3117 | 0.0356 | 0.4134 | 1.4535 | 1651 |
Table 3. Numerical results with Adam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 1000 | 0.2096 | 0.0435 | 0.1405 | 0.4825 | 538 |
| 2000 | 0.2300 | 0.0370 | 0.0820 | 0.3464 | 924 |
| 3000 | 0.2396 | 0.0331 | 0.0585 | 0.2849 | 1296 |
| 4000 | 0.2450 | 0.0302 | 0.0463 | 0.2476 | 1779 |
Table 4. Numerical results with NAdam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 1000 | 0.2828 | 0.0293 | 0.0323 | 0.1310 | 652 |
| 2000 | 0.2772 | 0.0215 | 0.0224 | 0.1033 | 1326 |
| 3000 | 0.2757 | 0.0177 | 0.0172 | 0.0855 | 1957 |
| 4000 | 0.2750 | 0.0154 | 0.0150 | 0.0753 | 2581 |
Table 5. Numerical results with Adam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 2000 | 1.2504 | 0.1521 | 1.3142 | 1.4709 | 857 |
| 4000 | 1.4850 | 0.2820 | 0.9090 | 1.1166 | 1770 |
| 6000 | 1.7704 | 0.4768 | 0.7447 | 0.9412 | 2906 |
| 8000 | 2.0139 | 0.5904 | 0.6314 | 0.8386 | 3864 |
Table 6. Numerical results with NAdam optimizer.

| Number of Iteration Steps $n$ | Mean of $u(0,X_0)$ | Standard Deviation of $u(0,X_0)$ | Mean of the Loss Function | Standard Deviation of the Loss Function | Runtime in Seconds |
|---|---|---|---|---|---|
| 2000 | 0.8762 | 0.1941 | 1.2294 | 1.5767 | 1649 |
| 4000 | 1.2456 | 0.4419 | 0.8682 | 1.1733 | 3104 |
| 6000 | 1.6858 | 0.7241 | 0.6715 | 0.9980 | 4527 |
| 8000 | 1.9266 | 0.7531 | 0.5606 | 0.8855 | 5948 |