Article

On the Numerical Integration of the Fokker–Planck Equation Driven by a Mechanical Force and the Bismut–Elworthy–Li Formula

by Julia Sanders *,† and Paolo Muratore-Ginanneschi †
Department of Mathematics and Statistics, University of Helsinki, 00014 Helsinki, Finland
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2025, 27(3), 218; https://doi.org/10.3390/e27030218
Submission received: 15 November 2024 / Revised: 10 February 2025 / Accepted: 18 February 2025 / Published: 20 February 2025
(This article belongs to the Special Issue Control of Driven Stochastic Systems: From Shortcuts to Optimality)

Abstract: Optimal control theory aims to find an optimal protocol to steer a system between assigned boundary conditions while minimizing a given cost functional in finite time. Equations arising from these types of problems are often non-linear and difficult to solve numerically. In this article, we describe numerical methods of integration for two partial differential equations that commonly arise in optimal control theory: the Fokker–Planck equation driven by a mechanical potential, for which we use the Girsanov theorem; and the Hamilton–Jacobi–Bellman, or dynamic programming, equation, for which we find the gradient of its solution using the Bismut–Elworthy–Li formula. The computation of the gradient is necessary to specify the optimal protocol. Finally, we give an example application of the numerical techniques to solving an optimal control problem without spatial discretization using machine learning.

1. Introduction: Optimal Control in Stochastic Thermodynamics and the Fokker–Planck Equation

Thermodynamic transitions at nanoscale occur in highly fluctuating environments. For instance, nanoscale bio-molecular machines operate within a power output range between $10^{-16}\,\mathrm{W}$ and $10^{-17}\,\mathrm{W}$ per molecule while experiencing random environmental buffeting of approximately $10^{-8}\,\mathrm{W}$ at room temperature [1]. Nanomachines experience topological randomness as their motion occurs in inherently non-smooth surroundings, because the dimensions of machine constituents are close to those of the atom [1]. The dynamics of nanosystems, therefore, need to be described in terms of stochastic [2] or, more generally, random differential equations [3]. Consequently, the laws of macroscopic thermodynamics are replaced by identities involving functions of indicators of the state of the system that are naturally expressed by stochastic processes. Addressing fundamental and technological questions of nanoscale physics has thus propelled interest in the field of stochastic thermodynamics over the last few years [4,5,6,7].
A class of important questions in stochastic thermodynamics revolves around finding efficient protocols that natural or artificial nanomachines adopt to perform useful work at nanoscale. Optimal control theory provides a natural mathematical formulation for this type of question [8]. For instance, the conversion of chemical energy into mechanical work typically implies steering the system probability distribution between two assigned values. Schrödinger bridge problems [9] (English translation in [10]) and their extensions, see, e.g., [11,12,13], depict this idea mathematically. In these types of problems, protocols optimizing a given functional of the stochastic process describing the state of a nanosystem are determined by solving a coupled Hamilton–Jacobi–Bellman equation [14] and Fokker–Planck equation. The solution of the Hamilton–Jacobi–Bellman equation determines the value of the optimal action (force) steering the dynamics at any instant of time in the control horizon. However, the boundary condition at the end of the control horizon is assigned on the system probability distribution. Solving the Fokker–Planck equation thus becomes necessary to fully determine the optimal protocols.
Possibly the most prominent physical application of such a setup is the derivation of a tight lower bound on the entropy production in classical stochastic thermodynamics [15]. Remarkably, when the system dynamics is modelled by Langevin–Smoluchowski (overdamped) dynamics, the problem maps into the Monge–Ampère–Kantorovich equations and becomes essentially integrable [16]. This allows one to extract relevant quantitative information about molecular engine efficiency [17,18,19,20] and minimum heat release [21]. The situation is, however, more complicated for more realistic models of nanosystem evolution. Specifically, if we adopt an underdamped (Langevin–Kramers) model of the dynamics [22,23], then even the equation connecting the optimal mechanical potential to the value function solving the dynamic programming equation is not analytically integrable. In the Gaussian case, the solution of a Lyapunov equation [24] specifies the optimal mechanical potential [25]. In general, optimal control duality methods [26] and multiscale perturbative expansions [25] yield lower bounds on the entropy production and approximate expressions of optimal protocols. More detailed quantitative information calls for exact numerical methods. This is particularly challenging, as integration strategies must be adapted to take into account the boundary conditions at the end of the control horizon imposed on the system probability density. Hence, the development of accurate and scalable methods for numerical integration of the Fokker–Planck equation becomes an essential element of optimization algorithms.
Traditional numerical methods from hydrodynamics, such as the pseudo-spectral method, see, e.g., [27], are certainly accurate, but require boundary conditions that are periodic in space, and may not suit problems in stochastic thermodynamics. An even more serious limitation is the exponentially fast increase in computational complexity with the degrees of freedom of the problem. Monte Carlo averages over Lagrangian particle paths, i.e., realizations of the solution of the stochastic differential equation associated with the Fokker–Planck equation, circumvent the curse of dimensionality; see [28,29] for example applications to classical and quantum physics. The drawback is, however, that these methods are best suited to computing expectation values of smooth indicators of the stochastic process. They lack accuracy when computing the probability density itself, as this involves averaging over Dirac distributions. These considerations motivate recent works [30,31], which use machine learning methods to construct solutions of the Fokker–Planck equation in the system’s state space. These approaches consider the associated probability flow equation [16,30,31], which uses the score function (the gradient of the log probability density [32]) to turn the Fokker–Planck equation into a mass conservation equation. The score function can be parametrized by, for example, a neural network [31], and the probability density can be recovered through a deterministic transport map.
In this article, we propose a Monte Carlo method adapted to the numerical integration of Fokker–Planck equations of diffusion processes driven by a time-dependent mechanical force. Although mathematically non-generic, they are recurrent in applications of stochastic thermodynamics, as they describe the evolution of a system under a mechanical potential, which may vary in time because of a feedback. We encounter this type of equation in generalized Schrödinger bridge problems instantiating refinements of the second law of thermodynamics [15,21,25]. In this context, the Fokker–Planck equation describes the evolution of the optimal distribution of the state throughout the time interval. Integrating this directly offers a challenging problem, particularly when the driving mechanical potential is non-linear or for systems of high dimension. By using the well-known Girsanov change-of-measure formula [33], we couch the solution to the Fokker–Planck in terms of a numerical expectation that can be evaluated from sampled trajectories of the dynamics.
In addition, we examine an application of the Bismut–Elworthy–Li formula [34,35,36] to computing the gradient of the solution to the Hamilton–Jacobi–Bellman equation. This equation determines the value function, which enforces the system dynamics over the control horizon, and is coupled to the Fokker–Planck equation through its boundary conditions. Direct access to the gradient of the value function is important, since the stationarity condition in control problems often links the optimal control protocol to the gradient of the value function. The Bismut–Elworthy–Li formula is commonly used in finance, for the calculation of the Greeks [37,38,39]. It has also been used in the numerical integration of non-linear parabolic partial differential equations [40]. We apply the Bismut–Elworthy–Li formula to the underdamped, or degenerate, dynamics [36], alongside a numerical example.
Numerical approaches to the Schrödinger bridge are often iterative. For example, ref. [41] turns the problem into a pair of Fokker–Planck equations and iteratively integrates them to recompute the boundary conditions via a proximal operator-based numerical integration method [42]. Machine learning techniques have been used to iteratively solve half-bridge problems [43,44]. We bring together the described Monte Carlo methods for the Fokker–Planck and the Hamilton–Jacobi–Bellman equation in a prototype numerical example to solve a Schrödinger bridge minimizing the Kullback–Leibler divergence from a free diffusion. This is done by an iteration between updating the drift, parametrized by a neural network, with the stationarity condition, and updating the value function with the probability density. Using Monte Carlo integration allows us to compute the update steps quickly and without spatial discretization.
The next sections are organized as follows. In Section 2, we leverage the Girsanov theorem to express the solution of a Fokker–Planck equation as an expectation which can be evaluated numerically. This is complemented by analytical and numerical examples. In Section 3, we extend this setup to the underdamped dynamics, with an accompanying numerical example from a stochastic optimal control model in the underdamped dynamics. Section 4 discusses the application of the Bismut–Elworthy–Li formula to non-degenerate dynamics, with an analytic example application in Section 4.3, and a numerical example in Section 4.4 of the Hamilton–Jacobi–Bellman equation coupled to the Fokker–Planck equation from Section 2.2.4. In Section 5, we extend the application of the Bismut–Elworthy–Li formula to the degenerate case. The formula is applied analytically in Section 5.2, and numerically in Section 5.3. Finally, we give an example use case for the derived formulae by solving an optimal control problem in the overdamped dynamics using machine learning.

2. Fokker–Planck for a Time-Dependent Mechanical Overdamped Diffusion Process

We consider the Langevin–Smoluchowski stochastic differential equation
$$ dq_t \;=\; -\,\mu\,(\partial U_t)(q_t)\,dt \;+\;\sqrt{\frac{2\,\mu}{\beta}}\;dw_t \tag{1} $$
where $w_t$ denotes a standard Wiener process [2,45]. The diffusion coefficient $\beta^{-1}$ is proportional to the temperature of the environment surrounding the system. The positive constant $\mu$ is the motility, with canonical dimensions of a time over a mass. The drift in (1) is the gradient of a time-dependent potential
$$ U \colon [t_\iota\,,\infty)\,\times\,\mathbb{R}^d \;\to\;\mathbb{R}_+\,,\qquad (t,q)\;\mapsto\; U_t(q) \tag{2} $$
that we assume to be sufficiently regular and confining.
Remark 1.
Following a well-established convention in stochastic thermodynamics (see, e.g., [46]), we denote functional dependence upon time, i.e., the dynamical parameter, with a subscript. Round brackets express dependence upon state coordinates in the configuration or phase space of the diffusion.
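Equation (1) lends itself to direct Euler–Maruyama simulation. The following minimal sketch uses an assumed quadratic potential $U(q)=k\,q^2/2$ and illustrative parameter values (not taken from the article); for a confining potential the law of (1) relaxes to the Boltzmann density, which gives a simple consistency check:

```python
import numpy as np

# Forward Euler-Maruyama simulation of (1) with the illustrative quadratic
# potential U(q) = k q^2 / 2, so the drift is -mu * k * q.  All parameter
# values are hypothetical choices.
rng = np.random.default_rng(0)
mu, beta, k = 1.0, 1.0, 2.0
dt, n_steps, n_paths = 1e-3, 5000, 20000

q = np.zeros(n_paths)                # initial condition q_0 = 0
for _ in range(n_steps):
    noise = rng.standard_normal(n_paths)
    q += -mu * k * q * dt + np.sqrt(2 * mu * dt / beta) * noise

# the ensemble relaxes to the Boltzmann equilibrium ~ exp(-beta k q^2 / 2),
# whose variance is 1 / (beta k)
print(q.var(), 1 / (beta * k))   # both ≈ 0.5
```

The total simulated time (five seconds) is many relaxation times $1/(\mu k)$, so the empirical variance should sit close to the equilibrium value.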
The probability density distribution of the solution of (1) at any instant of time t satisfies the Fokker–Planck equation
$$ \partial_t p_t(q) \;-\;\mu\,\big\langle \partial_q\,,(\partial U_t)(q)\,p_t(q)\big\rangle \;-\;\frac{\mu}{\beta}\,\partial_q^{2}\,p_t(q) \;=\; 0 \tag{3} $$
whose solution is fully specified by the assignment of an initial datum at time $t = t_\iota$. The assumption of a confining potential (2) guarantees that the probability density is integrable in $\mathbb{R}^d$. The connection between (1) and (3) stems from the representation of the transition probability density as a Monte Carlo average:
$$ p_t(q) \;=\;\int_{\mathbb{R}^d} d^d y\;\;\mathbb{E}_{\mathbb{P}}\Big[\,\delta^{(d)}(q-q_t)\;\Big|\;q_{t_\iota}=y\,\Big]\; p_{t_\iota}(y) \tag{4} $$
The expectation value $\mathbb{E}_{\mathbb{P}}$ is over the probability measure $\mathbb{P}$ weighing the realizations of the solutions of (1). The singular nature of the Dirac delta distribution prevents the accurate evaluation of the transition probability density as a Monte Carlo average. For this reason, we look for the solution in the form
$$ p_t(q) \;=\; e^{-\beta\,U_t(q)}\; f_t(q) \tag{5} $$
Upon inserting into (3), we arrive at
$$ \partial_t f_t(q) \;+\;\mu\,\big\langle (\partial U_t)(q)\,,(\partial f_t)(q)\big\rangle \;-\;\frac{\mu}{\beta}\,\partial_q^{2} f_t(q) \;=\;\beta\, f_t(q)\;\partial_t U_t(q) \tag{6} $$
Proposition 1.
The solution of (3) admits the representation
$$ p_t(q) \;=\; e^{-\beta\,U_t(q)}\;\mathbb{E}_{\mathbb{P}}\!\left[\, p_{t_\iota}(q_{t_\iota})\;\, e^{\beta\,U_{t_\iota}(q_{t_\iota})\,+\,\beta\int_{t_\iota}^{t} ds\;\partial_s U_s(q_s)} \;\middle|\; q_t=q \,\right] \tag{7} $$
where $\mathbb{P}$ is the probability measure over the paths of the backward diffusion process
$$ d_{-}q_t \;:=\; q_t - q_{t-dt} \;=\;\mu\,(\partial U_t)(q_t)\,dt \;+\;\sqrt{\frac{2\,\mu}{\beta}}\; d_{-}w_t\,,\qquad d_{-}w_t \;=\; w_t - w_{t-dt} \tag{8} $$
naturally complemented by conditions assigned at some $t_f \geq t$.
Idea of the proof: We start by recalling that, for any test function F t , Itô’s lemma for backward differentials yields
$$ d_{-}F_t(q_t) \;=\; dt\left(\partial_t F_t(q_t) \;+\;\mu\,\big\langle(\partial U_t)(q_t)\,,(\partial F_t)(q_t)\big\rangle \;-\;\frac{\mu}{\beta}\,\partial_q^{2}F_t(q_t)\right) \;+\;\sqrt{\frac{2\,\mu}{\beta}}\;\big\langle d_{-}w_t\,,(\partial F_t)(q_t)\big\rangle $$
We emphasize that $d_{-}w_t$ is just a standard Wiener process, but evolving backward in time. In the jargon of stochastic analysis, (8) is a diffusion process with respect to a backward filtration, as in, e.g., [47]. As is well known, the stochastic integrals and martingale properties are the same as in forward calculus once one exchanges the pre-point rule with the post-point one; see, e.g., [48]. Let us define the auxiliary function
$$ g_t(q_t) \;=\; e^{\beta\int_{t}^{t_f} ds\;\partial_s U_s(q_s)}\; f_t(q_t) $$
Then, Itô’s lemma and (6) immediately imply
$$ d_{-}g_t(q_t) \;=\; -\,e^{\beta\int_{t}^{t_f} ds\,\partial_s U_s(q_s)}\, f_t(q_t)\;\beta\,\partial_t U_t(q_t)\;dt \;+\; e^{\beta\int_{t}^{t_f} ds\,\partial_s U_s(q_s)}\; d_{-}f_t(q_t) \;=\; e^{\beta\int_{t}^{t_f} ds\,\partial_s U_s(q_s)}\;\sqrt{\frac{2\,\mu}{\beta}}\;\big\langle d_{-}w_t\,,(\partial f_t)(q_t)\big\rangle $$
The equation tells us that the auxiliary function is a local martingale of the backward diffusion (see, e.g., Chapter 7 of [2]). Since we assume that the confining potential also guarantees integrability, we infer that the martingale property
$$ \mathbb{E}_{\mathbb{P}}\big[\,g_{t_f}(q_{t_f})\;\big|\;q_{t_f}=q\,\big] \;=\;\mathbb{E}_{\mathbb{P}}\big[\,g_{t_\iota}(q_{t_\iota})\;\big|\;q_{t_f}=q\,\big] $$
must also hold true. By construction, we know that
$$ \mathbb{E}_{\mathbb{P}}\big[\,g_{t_f}(q_{t_f})\;\big|\;q_{t_f}=q\,\big] \;=\; f_{t_f}(q) $$
and we conclude
$$ f_{t_f}(q) \;=\;\mathbb{E}_{\mathbb{P}}\big[\,g_{t_\iota}(q_{t_\iota})\;\big|\;q_{t_f}=q\,\big] \;=\;\mathbb{E}_{\mathbb{P}}\Big[\, e^{\beta\int_{t_\iota}^{t_f} ds\,\partial_s U_s(q_s)}\; f_{t_\iota}(q_{t_\iota})\;\Big|\;q_{t_f}=q\,\Big] $$
Replacing t f with t in the above chain of identities completes the proof. □
The upshot is that we can use the Feynman–Kac formula over a backward diffusion to compute the solution of a forward Fokker–Planck equation. Next, we take advantage of Girsanov’s change-of-measure formula (see, e.g., Chapter 10 of [2] or 3.5 of [45]) to evaluate the conditional expectation in (7) directly over the paths of the Wiener process, or, more generally, over the paths of any diffusion that generates a measure with respect to which P is absolutely continuous. Girsanov’s change-of-measure formula is thus the basis of statistical inference for diffusion processes, see, e.g., [49]. We emphasize that we make use of the Girsanov formula while dealing with backward diffusions as, e.g., in [48]. As time is evolving from a larger to smaller value, correspondingly, the role of “past” and “future” events must be exchanged.
Remark 2.
As the increments of any Wiener process are independent, from now on we write
$$ d_{-}w_t \;=\; dw_t $$
to alleviate the notation.

2.1. Use of Girsanov’s Formula

We denote by $\mathbb{Q}$ the probability measure over the paths of
$$ d_{-}q_t \;=\;\sqrt{\frac{2\,\mu}{\beta}}\; d_{-}w_t \tag{9} $$
Our aim is to use Girsanov’s formula to express expectations with respect to the path measure $\mathbb{P}$ of (8) in terms of expectations with respect to $\mathbb{Q}$:
$$ p_t(q) \;=\; e^{-\beta\,U_t(q)}\;\mathbb{E}_{\mathbb{Q}}\!\left[\, p_{t_\iota}(q_{t_\iota})\;\, e^{\beta\,U_{t_\iota}(q_{t_\iota})\,+\,\beta\int_{t_\iota}^{t} ds\;\partial_s U_s(q_s)}\;\frac{d\mathbb{P}}{d\mathbb{Q}} \;\middle|\; q_t=q \,\right] $$
where
$$ \frac{d\mathbb{P}}{d\mathbb{Q}} \;=\;\exp\left(\int_{t_\iota}^{t}\left[\sqrt{\frac{\beta\,\mu}{2}}\;\big\langle {\circ}\,dw_s\,,(\partial U_s)(q_s)\big\rangle \;-\;\frac{\beta\,\mu}{4}\, ds\;\big\|(\partial U_s)(q_s)\big\|^{2}\right]\right) \tag{10} $$
is the Radon–Nikodym derivative. The symbol ∘ emphasizes that we define the stochastic integral in (10) using the post-point prescription:
$$ \int_{t_\iota}^{t}\big\langle {\circ}\,dw_s\,,(\partial U_s)(q_s)\big\rangle \;:=\;\lim_{\substack{dt\,\downarrow\,0 \\ N\,dt\,=\,t-t_\iota}}\;\sum_{i=0}^{N-1}\big\langle w_{t_{i+1}}-w_{t_i}\,,(\partial U_{t_{i+1}})(q_{t_{i+1}})\big\rangle $$
i.e., the function $(\partial U_t)(q_t)$ is evaluated at the end of each time interval. Accordingly, (10) is a martingale with respect to the backward filtration (i.e., the family of $\sigma$-algebras increases as $t$ decreases) to which we associate the probability measure $\mathbb{Q}$. In other, rougher, words, (10) is a martingale conditional to events occurring at times larger than or equal to the upper bound of integration $t$.
Writing the stochastic integral in the standard pre-point form allows us to simplify the expression of the probability density. We notice that (9) trivially implies
$$ \int_{t_\iota}^{t}\sqrt{\frac{\beta\,\mu}{2}}\;\big\langle {\circ}\,dw_s\,,(\partial U_s)(q_s)\big\rangle \;=\;\int_{t_\iota}^{t}\frac{\beta}{2}\;\big\langle {\circ}\,dq_s\,,(\partial U_s)(q_s)\big\rangle $$
Next, we use the relation between stochastic integrals in the post-point, mid-point (or Stratonovich, denoted by the ⋄-symbol), and pre-point prescriptions:
$$ \int_{t_\iota}^{t}\big\langle {\circ}\,dq_s\,,(\partial U_s)(q_s)\big\rangle \;=\;\int_{t_\iota}^{t}\left[\big\langle {\diamond}\,dq_s\,,(\partial U_s)(q_s)\big\rangle \;+\;\frac{\big\langle dq_s\,,\, d(\partial U_s)(q_s)\big\rangle}{2}\right] $$
Finally, we recall that ordinary differential calculus holds for the stochastic differentials in Stratonovich form:
$$ U_t(q_t) \;-\; U_{t_\iota}(q_{t_\iota}) \;=\;\int_{t_\iota}^{t}\Big[\, ds\;\partial_s U_s(q_s) \;+\;\big\langle {\diamond}\,dq_s\,,(\partial U_s)(q_s)\big\rangle \,\Big] $$
Putting these observations together, we obtain the following representation of the solution for the Fokker–Planck Equation (3):
$$ p_t(q) \;=\;\mathbb{E}\!\left[\, p_{t_\iota}(q_{t_\iota})\;\exp\left(-\frac{\beta}{2}\int_{t_\iota}^{t}\left[\big\langle dq_s\,,(\partial U_s)(q_s)\big\rangle \;+\;\frac{\mu}{2}\,ds\;\big\|(\partial U_s)(q_s)\big\|^{2}\right]\right) \;\middle|\; q_t=q \,\right] \tag{11} $$
In practice, this means that, to compute the probability density at the configuration space point $q$ at time $t \geq t_\iota$, we need to average the initial density over solutions of (9) evolved backward in time to $t_\iota$ and weighted by a path-dependent change-of-measure factor.

2.2. Examples and Path Integral Representation

Let us summarize the meaning of (11) in words. Formula (7) tells us that a Fokker–Planck equation of a forward diffusion process with gradient drift, i.e., of the form (3), admits a Feynman–Kac representation in terms of a backward diffusion process. This is because we can use the potential specifying the drift to turn the forward Fokker–Planck into a non-homogeneous backward Kolmogorov equation with respect to the backward diffusion process. This latter equation, as is well known, generically specifies a problem with initial data. We now turn to illustrate this fact with two examples.

2.2.1. Analytical Example

Consider a quadratic potential
$$ U_t(q) \;=\;\frac{1}{2}\,\big\langle q\,,\mathsf{U}_t\, q\big\rangle $$
with $\mathsf{U}_t$ denoting a $d\times d$ real symmetric time-dependent matrix. The backward stochastic differential Equation (8) reduces to
$$ d_{-}q_t \;=\;\mu\,\mathsf{U}_t\, q_t\,dt \;+\;\sqrt{\frac{2\,\mu}{\beta}}\; d_{-}w_t $$
Letting F denote the flow solution of the deterministic ordinary differential equation
$$ \frac{d}{dt}\,F_{t,s} \;=\; -\,\mu\,\mathsf{U}_t\, F_{t,s}\,,\qquad F_{s,s} \;=\;\mathrm{Id} $$
then the solution of the backward stochastic differential equation is
$$ q_t \;=\; F_{t_f,t}^{\top}\, q \;-\;\sqrt{\frac{2\,\mu}{\beta}}\int_{t}^{t_f} F_{s,t}^{\top}\; d_{-}w_s\,,\qquad q_{t_f} \;=\; q $$
The symbol ⊤ as usual denotes matrix transposition. The corresponding transition probability density is Gaussian with mean
$$ \mathbb{E}\big[\,q_t\;\big|\;q_u=q\,\big] \;=\; F_{u,t}^{\top}\, q\,,\qquad u \;\geq\; t $$
(recalling that, for standard backward differential equations, the martingale property arises upon conditioning on future events [47]) and variance matrix
$$ \mathbb{E}\big[\,(q_t-\mathbb{E}\,q_t)\,(q_t-\mathbb{E}\,q_t)^{\top}\;\big|\;q_u=q\,\big] \;=\;\frac{2\,\mu}{\beta}\int_{t}^{u} ds\;\, F_{s,t}^{\top}\, F_{s,t}\,,\qquad u \;\geq\; t $$
We need to compute
$$ f_t(q) \;=\;\mathbb{E}\!\left[\, f_{t_\iota}(q_{t_\iota})\;\, e^{\frac{\beta}{2}\int_{t_\iota}^{t} ds\;\langle q_s\,,\dot{\mathsf{U}}_s\, q_s\rangle} \;\middle|\; q_t=q \,\right] $$
If we couch this expression into the form of a path integral [50], we obtain
$$ f_t(q) \;=\;\int_{q_t=q}\mathcal{D}[q_{t:t_\iota}]\;\, e^{-\int_{t_\iota}^{t} ds\,\left[\frac{\beta\,\|\dot{q}_s\,-\,\mu\,\mathsf{U}_s q_s\|^{2}}{4\,\mu}\;-\;\frac{\beta}{2}\,\langle q_s\,,\dot{\mathsf{U}}_s\, q_s\rangle\right]}\;\, f_{t_\iota}(q_{t_\iota}) $$
Here, D [ q t : t ι ] denotes the limit over finite dimensional approximations over time lattices in [ t ι , t ] of paths satisfying the terminal condition q t = q . We are free to interpret the path integral in the mid-point sense, because any change of discretization generates a path-independent Jacobian that can be reabsorbed into the normalization constant. As the integral is Gaussian, we can compute it by infinite dimensional stationary phase using ordinary differential calculus. We are left with
$$ f_t(q) \;=\; e^{\beta\,U_t(q)}\int_{q_t=q}\mathcal{D}[q_{t:t_\iota}]\;\, e^{-\int_{t_\iota}^{t} ds\,\left[\frac{\beta\,\|\dot{q}_s\|^{2}}{4\,\mu}\,+\,\frac{\beta}{2}\,\langle \dot{q}_s\,,\mathsf{U}_s q_s\rangle\,+\,\frac{\beta\,\mu}{4}\,\langle \mathsf{U}_s q_s\,,\mathsf{U}_s q_s\rangle\right]}\;\, p_{t_\iota}(q_{t_\iota}) $$
We now readily recognize that
$$ p_{t,s}(q\,|\,y) \;=\;\int_{q_s=y}^{q_t=q}\mathcal{D}[q_{t:s}]\;\, e^{-\int_{s}^{t} dr\,\left[\frac{\beta\,\|\dot{q}_r\|^{2}}{4\,\mu}\,+\,\frac{\beta}{2}\,\langle \dot{q}_r\,,\mathsf{U}_r q_r\rangle\,+\,\frac{\beta\,\mu}{4}\,\langle \mathsf{U}_r q_r\,,\mathsf{U}_r q_r\rangle\right]} $$
is the path integral expression of the transition probability density of the forward stochastic differential equation
$$ dq_t \;=\; -\,\mu\,\mathsf{U}_t\, q_t\,dt \;+\;\sqrt{\frac{2\,\mu}{\beta}}\; dw_t $$
We therefore recover the Chapman–Kolmogorov representation of the solution of (3):
$$ p_t(q) \;=\;\int_{\mathbb{R}^d} d^d y\;\; p_{t,t_\iota}(q\,|\,y)\;\, p_{t_\iota}(y) \tag{12} $$
as expected.
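The mean and variance formulas above can be checked numerically in the scalar case. A minimal Monte Carlo sketch with a constant coefficient $\mathsf{U}_t = k$, so that $F_{t,s} = e^{-\mu k (t-s)}$; all parameter values are illustrative choices:

```python
import numpy as np

# Scalar Monte Carlo check of the Gaussian example: for constant U_t = k the
# flow is F_{t,s} = exp(-mu k (t - s)), so conditioning on q_u = q gives
#   mean      exp(-mu k (u - t)) q
#   variance  (1 - exp(-2 mu k (u - t))) / (beta k)
# All parameter values are illustrative choices.
rng = np.random.default_rng(1)
mu, beta, k = 1.0, 1.0, 1.5
u, t = 1.0, 0.4                     # condition at time u, observe at t < u
dt, n_paths = 1e-3, 100000
n_steps = int(round((u - t) / dt))

q = np.full(n_paths, 0.8)           # terminal condition q_u = 0.8
for _ in range(n_steps):
    # one backward Euler-Maruyama step of (8): drift +mu k q
    noise = rng.standard_normal(n_paths)
    q = q - mu * k * q * dt - np.sqrt(2 * mu * dt / beta) * noise

mean_exact = np.exp(-mu * k * (u - t)) * 0.8
var_exact = (1 - np.exp(-2 * mu * k * (u - t))) / (beta * k)
print(q.mean(), mean_exact)   # both ≈ 0.325
print(q.var(), var_exact)     # both ≈ 0.556
```

The agreement degrades as $O(dt)$ (weak error of the Euler–Maruyama scheme) plus the usual $O(n_{\text{paths}}^{-1/2})$ statistical error.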

2.2.2. Path-Integral Representation in General

The path integral representation of (7) is
$$ p_t(q) \;=\;\int_{\mathbb{R}^d} d^d y\int_{q_{t_\iota}=y}^{q_t=q}\mathcal{D}[q_{t:t_\iota}]\;\, e^{-\int_{t_\iota}^{t} ds\,\left[\frac{\beta\,\|\dot{q}_s\|^{2}}{4\,\mu}\,+\,\frac{\beta}{2}\,\langle \dot{q}_s\,,(\partial U_s)(q_s)\rangle\,+\,\frac{\beta\,\mu}{4}\,\|(\partial U_s)(q_s)\|^{2}\right]}\;\, p_{t_\iota}(y) $$
As the stochastic integral term is evaluated in the pre-point representation, the path integral exactly recovers the path integral expression of the transition probability density. We have thus verified that (12) holds in general. We refer the reader unfamiliar with path integral calculus to, e.g., [50].

2.2.3. Numerical Example: Time-Independent Drift

In this section, we demonstrate how the method described can be applied to find a numerical solution of a Fokker–Planck equation driven by a mechanical potential. We consider the Fokker–Planck equation of the form (3) with a time-independent drift. By applying the Girsanov theorem as above, we couch the solution of the Fokker–Planck equation at time $t \in [t_\iota, t_f]$ into a numerical average over simulated trajectories of the auxiliary dynamics, given by
$$ p_t(q) \;=\;\mathbb{E}_{\mathbb{Q}}\!\left[\, P_\iota(q_{t_\iota})\;\exp\left(-\int_{t_\iota}^{t}\left[\sqrt{\frac{\beta\,\mu}{2}}\,\big\langle dw_r\,,(\partial_q U_r)(q_r)\big\rangle \;+\; dr\;\frac{\beta\,\mu}{4}\,\big\|(\partial_q U_r)(q_r)\big\|^{2}\right]\right) \;\middle|\; q_t=q \,\right] $$
where $\mathbb{Q}$ is the measure generated by the backward diffusion (9). We approximate the expectation value numerically by repeated sampling of trajectories of the process (9). The trajectories are approximated on a discretization of the time interval $[t_\iota, t_f]$ given by
$$ t_\iota \;=\; t_0 \;<\; t_1 \;<\;\cdots\;<\; t_N \;=\; t_f $$
Trajectories of (9) are sampled using the Euler–Maruyama scheme:
$$ q_{t_{n-1}} \;=\; q_{t_n} \;-\;\sqrt{\frac{2\,\mu}{\beta}\,|t_{n-1}-t_n|}\;\;\epsilon_n $$
where $\epsilon_n$ is an increment of Brownian noise, sampled independently at each step from a standard normal distribution. The Girsanov weight $e^{-g}$ is computed from the running cost:
$$ g \;=\;\sum_{n=1}^{N}\left[\frac{\beta\,\mu}{4}\,|t_{n-1}-t_n|\;\big\|(\partial U_{t_{n-1}})(q_{t_{n-1}})\big\|^{2} \;+\;\sqrt{\frac{\mu\,\beta}{2}\,|t_{n-1}-t_n|}\;\big\langle(\partial U_{t_{n-1}})(q_{t_{n-1}})\,,\epsilon_n\big\rangle\right] $$
This computation is summarized in Algorithm 1. In Figure 1, we integrate an example Fokker–Planck equation driven by a time-independent mechanical potential in two ways. The results of Algorithm 1 are compared to the proximal gradient descent method of [42]. In this method, the solution is found via gradient descent on the space of probability distributions by solving a proximal fixed point recursion at each time step. Both methods discretize the time interval, but do not require spatial discretization. In our implementation, the Monte Carlo method performs significantly faster.
Algorithm 1 Integrating the Fokker–Planck equation using the Girsanov theorem
  • Initialize $q_{t_n} = q \in \mathbb{R}^d$
  • Initialize $g = 0$
  • Initialize $U_{t_n}$ for $n \in \{0,1,\dots,N\}$
  • Initialize $\delta t = |t_{n-1}-t_n|$ for $n \in \{0,1,\dots,N\}$
  • for $i$ in $0,\dots,n-1$ do
  •    Sample Brownian noise: $\epsilon \sim \mathcal{N}(0,1)$
  •    Evolve one step of (9): $q_{t_{n-i-1}} = q_{t_{n-i}} - \sqrt{2\,\delta t\,\mu/\beta}\;\epsilon$
  •    Add to the running total:
    $g \;=\; g \;+\;\delta t\,\frac{\mu\,\beta}{4}\,\big\|(\partial U_{t_{n-i-1}})(q_{t_{n-i-1}})\big\|^{2} \;+\;\sqrt{\frac{\delta t\,\mu\,\beta}{2}}\;\big\langle(\partial U_{t_{n-i-1}})(q_{t_{n-i-1}})\,,\epsilon\big\rangle$
  • end for
  • Return $p_{t_n}(q) = P_\iota(q_{t_0})\; e^{-g}$
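The loop above can be vectorized over samples. The sketch below follows Algorithm 1 for an illustrative quadratic potential $U(q) = k\,q^2/2$, for which the forward dynamics (1) is an Ornstein–Uhlenbeck process with a known Gaussian solution that serves as a cross-check; all parameter values are hypothetical choices, not taken from the article:

```python
import numpy as np

# Sketch of Algorithm 1: backward driftless paths (9) reweighted by the
# Girsanov factor exp(-g).  Illustrative quadratic potential U(q) = k q^2/2,
# so the forward dynamics (1) is Ornstein-Uhlenbeck with a known Gaussian
# solution; all parameter values below are hypothetical choices.
rng = np.random.default_rng(2)
mu, beta, k = 1.0, 1.0, 1.0
t_i, t_n = 0.0, 0.5                 # time interval [t_iota, t]
sigma0_sq = 0.25                    # initial density: centred Gaussian

def grad_U(q):
    return k * q

def p_initial(q):
    return np.exp(-q**2 / (2 * sigma0_sq)) / np.sqrt(2 * np.pi * sigma0_sq)

def fokker_planck_mc(q_eval, n_steps=200, n_paths=200000):
    dt = (t_n - t_i) / n_steps
    q = np.full(n_paths, q_eval)    # condition q_t = q_eval
    g = np.zeros(n_paths)           # running cost of the Girsanov weight
    for _ in range(n_steps):
        eps = rng.standard_normal(n_paths)
        q_new = q - np.sqrt(2 * mu * dt / beta) * eps  # driftless backward step
        dU = grad_U(q_new)          # pre-point evaluation (earlier time)
        g += dt * mu * beta / 4 * dU**2 \
             + np.sqrt(dt * mu * beta / 2) * dU * eps
        q = q_new
    return np.mean(p_initial(q) * np.exp(-g))

# exact Ornstein-Uhlenbeck variance at time t, for comparison
var_t = sigma0_sq * np.exp(-2 * mu * k * (t_n - t_i)) \
        + (1 - np.exp(-2 * mu * k * (t_n - t_i))) / (beta * k)
q_eval = 0.5
mc = fokker_planck_mc(q_eval)
exact = np.exp(-q_eval**2 / (2 * var_t)) / np.sqrt(2 * np.pi * var_t)
print(mc, exact)   # both ≈ 0.39
```

Note that the gradient of the potential must be evaluated at the pre-point (the freshly generated, earlier-time position), as in Algorithm 1; evaluating it at the later point introduces an $O(1)$ Itô-correction error that does not vanish with the time step.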

2.2.4. Numerical Example: “Föllmer’s Drift”

In this section, we apply (7) to a non-trivial example of gradient drift. Specifically, we consider the Föllmer drift solution of the dynamic Schrödinger bridge that steers the system between assigned boundary conditions while minimizing the Kullback–Leibler divergence from a free diffusion [10], given by
$$ D_{\mathrm{KL}} \;=\;\frac{\beta\,\mu}{4}\;\mathbb{E}\int_{t_\iota}^{t_f} dt\;\big\|(\partial U_t)(q_t)\big\|^{2} \tag{16} $$
in a finite time interval $t \in [t_\iota, t_f]$. The boundary conditions are assigned on the initial and final distributions of the position, denoted $P_\iota$ for the initial at time $t = t_\iota$ and $P_f$ for the final at time $t = t_f$. We consider boundary conditions of the form
$$ P_\iota(q) \;=\;\frac{e^{-\beta\,U_\iota(q)}}{\int_{\mathbb{R}^d} d^d y\;\, e^{-\beta\,U_\iota(y)}}\,,\qquad P_f(q) \;=\;\frac{e^{-\beta\,U_f(q)}}{\int_{\mathbb{R}^d} d^d y\;\, e^{-\beta\,U_f(y)}}\,. \tag{17} $$
Föllmer drifts are relevant to machine learning applications, see, e.g., [44,51]. We refer to [12] or [25] and references therein for further details on the mathematics and physics backgrounds, respectively.
We summarize how to construct the Föllmer drift by solving a Schrödinger bridge problem using an iterative method of [41]. In doing so, we also obtain the solution of the Fokker–Planck Equation (3) that we use for comparison with the numerical expression provided by (7). The Schrödinger bridge problem is formulated as the minimization of a Bismut–Pontryagin functional [25]. In this framework, we find that the intermediate density p and a value function V imposing the boundary conditions satisfy the coupled partial differential equations
$$ \partial_t p_t(q) \;-\;\mu\,\big\langle\partial_q\,,(\partial U_t)(q)\, p_t(q)\big\rangle \;-\;\frac{\mu}{\beta}\,\partial_q^{2} p_t(q) \;=\; 0\,; \tag{18a} $$
$$ \partial_t V_t(q) \;-\;\mu\,\big\langle(\partial U_t)(q)\,,\partial_q V_t(q)\big\rangle \;+\;\frac{\mu}{\beta}\,\partial_q^{2} V_t(q) \;+\;\frac{\beta\,\mu}{4}\,\big\|(\partial U_t)(q)\big\|^{2} \;=\; 0 \tag{18b} $$
along with the stationarity condition
$$ \partial_q V_t(q) \;=\;\frac{\beta}{2}\;\partial_q U_t(q) \tag{19} $$
We identify (18a) as the Fokker–Planck equation, and (18b) as the Hamilton–Jacobi–Bellman equation which is discussed in later sections. For a known U, we can apply the Girsanov theorem to integrate (18a). We find a reference solution to the system (18a) and (18b) using an adaptation of the method of [41], which is briefly described below. The transformation
$$ p_t(q) \;=\;\phi_t(q)\;\hat{\phi}_t(q)\,,\qquad V_t(q) \;=\; -\,\log\big(\phi_t(q)\big) \tag{20} $$
applied to (18a) and (18b) yields the linear coupled equations
$$ \partial_t \phi_t(q) \;+\;\frac{\mu}{\beta}\,\partial_q^{2}\phi_t(q) \;=\; 0 \tag{21a} $$
$$ \partial_t \hat{\phi}_t(q) \;-\;\frac{\mu}{\beta}\,\partial_q^{2}\hat{\phi}_t(q) \;=\; 0 \tag{21b} $$
with boundary conditions
$$ \phi_{t_f}(q) \;=\; P_f(q)\,/\,\hat{\phi}_{t_f}(q)\,,\qquad \hat{\phi}_{t_\iota}(q) \;=\; P_\iota(q)\,/\,\phi_{t_\iota}(q) $$
We make an initial guess for $\phi_{t_\iota}$, from which we compute the boundary condition $\hat{\phi}_{t_\iota}$ and integrate (21b) forward; we then recompute the boundary condition at $t_f$ and integrate (21a) backward, recomputing $\phi_{t_\iota}$, and repeat this process until convergence: see [41] or Section 8.2 of [25] for a more detailed treatment. We reconstruct the value function and intermediate densities using (20).
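The alternating scheme can be sketched on a one-dimensional grid, where both heat flows (21a) and (21b) reduce to a multiplication by the same Gaussian kernel matrix. Boundary distributions, grid, and all parameters below are our own illustrative choices (in the spirit of [41], not a reproduction of the article's implementation):

```python
import numpy as np

# Grid-based sketch of the iteration described above: alternate heat
# propagation of phi and phi-hat, (21a)-(21b), with the boundary updates
# at the two ends of the horizon.  All choices below are illustrative.
x = np.linspace(-4.0, 4.0, 401)
dx = x[1] - x[0]
D = 0.5                              # diffusivity mu / beta (assumed value)
T = 1.0                              # control horizon t_f - t_iota

def gaussian(y, mean, std):
    return np.exp(-(y - mean)**2 / (2 * std**2)) / np.sqrt(2 * np.pi * std**2)

P_i = gaussian(x, -1.0, 0.3)         # assigned initial density
P_f = gaussian(x, +1.0, 0.3)         # assigned final density

# heat kernel over the whole horizon, as a matrix acting on grid functions
K = gaussian(x[:, None], x[None, :], np.sqrt(2 * D * T)) * dx

phi_i = np.ones_like(x)              # initial guess for phi at t_iota
for _ in range(200):
    phi_hat_i = P_i / phi_i          # boundary update at t_iota
    phi_hat_f = K @ phi_hat_i        # propagate phi-hat forward, (21b)
    phi_f = P_f / phi_hat_f          # boundary update at t_f
    phi_i = K @ phi_f                # propagate phi backward, (21a)

# at the fixed point, phi * phi-hat reproduces the assigned marginal at t_f
err = np.max(np.abs(phi_f * (K @ (P_i / phi_i)) - P_f))
print(err)   # small once the iteration has converged
```

By construction, each half-update enforces one of the two marginals exactly, so the residual mismatch at the opposite end is a convenient convergence monitor.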
With these results, we have a numerical approximation of the drift which maps an assigned initial probability density into an assigned final density while minimizing the Kullback–Leibler divergence on the interval $[t_\iota, t_f]$. We use this drift to compute the solution $p_t$ of the Fokker–Planck Equation (18a) via Algorithm 1, and compare it to the density resulting from the iteration method of [41] in Figure 2.

3. Fokker–Planck for a Time-Dependent Mechanical Underdamped Diffusion Process

We now turn our attention to the underdamped dynamics [52]:
$$ dq_t \;=\;\frac{p_t}{m}\,dt\,,\qquad dp_t \;=\; -\left[\frac{p_t}{\tau} \;+\;(\partial U_t)(q_t)\right] dt \;+\;\sqrt{\frac{2\,m}{\tau\,\beta}}\; dw_t \tag{22} $$
This is probably the most popular model of an open classical system in contact with a bath at temperature $\beta^{-1}$, driven by a mechanical force and subject to a linear friction force dissipating energy at the Stokes rate $\tau^{-1}$. As in Section 2, we assume that the mechanical potential is confining. The corresponding Fokker–Planck equation
$$ \left[\partial_t \;+\;\Big\langle\frac{p}{m}\,,\partial_q\Big\rangle \;-\;\big\langle(\partial U_t)(q)\,,\partial_p\big\rangle \;-\;\Big\langle\partial_p\,,\frac{p}{\tau}\Big\rangle \;-\;\frac{m}{\beta\,\tau}\,\partial_p^{2}\right] p_t(x) \;=\; 0 \tag{23} $$
relaxes to a Maxwell–Boltzmann equilibrium for any time-independent confining potential
$$ p_{\mathrm{eq}}(x) \;=\;\frac{e^{-\frac{\beta\,\|p\|^{2}}{2\,m}\,-\,\beta\,U(q)}}{Z} $$
with $Z$ a normalizing constant. Our aim is to express the solution of (23) as a suitable Monte Carlo average over an initial datum at time $t_\iota$. We proceed analogously to Section 2 and posit
$$ p_t(x) \;=\; e^{-\frac{\beta\,\|p\|^{2}}{2\,m}}\; f_t(x) $$
where $x = [q, p]$. The symplectic component of the drift acts in the same manner on the probability density and the auxiliary function $f_t$. The dissipative component changes sign:
$$ \left[\partial_t \;+\;\Big\langle\frac{p}{m}\,,\partial_q\Big\rangle \;-\;\Big\langle(\partial U_t)(q)-\frac{p}{\tau}\,,\partial_p\Big\rangle \;-\;\frac{m}{\beta\,\tau}\,\partial_p^{2}\right] f_t(x) \;=\; -\,\frac{\beta}{m}\,\big\langle(\partial U_t)(q)\,,p\big\rangle\; f_t(x) \tag{24} $$
The upshot is that the resulting equation admits the interpretation of a non-homogeneous backward Kolmogorov equation associated with the backward process
$$ d_{-}q_t \;=\;\frac{p_t}{m}\,dt\,,\qquad d_{-}p_t \;=\;\left[\frac{p_t}{\tau} \;-\;(\partial U_t)(q_t)\right] dt \;+\;\sqrt{\frac{2\,m}{\tau\,\beta}}\; d_{-}w_t \tag{25} $$
Specifically, it is possible to prove the following:
Proposition 2.
The solution of the Fokker–Planck Equation (23) can be couched into a conditional expectation
$$ p_t(x) \;=\; e^{-\frac{\beta\,\|p\|^{2}}{2\,m}}\;\mathbb{E}_{\mathbb{P}}\!\left[\, e^{\frac{\beta\,\|p_{t_\iota}\|^{2}}{2\,m}}\;\, p_{t_\iota}(x_{t_\iota})\;\, e^{-\frac{\beta}{m}\int_{t_\iota}^{t} ds\;\langle(\partial U_s)(q_s)\,,p_s\rangle} \;\middle|\; x_t=x \,\right] \tag{26} $$
with respect to the path measure $\mathbb{P}$ generated by (25).
Idea of the proof: The proof mirrors that of the overdamped case. We define the auxiliary function
$$ g_t(x_t) \;=\; e^{-\frac{\beta}{m}\int_{t}^{t_f} ds\;\langle(\partial U_s)(q_s)\,,p_s\rangle}\; f_t(x_t) $$
for any solution $f_t$ of (24). Differentiation backward in time along the paths of (25) yields
$$ d_{-}g_t(x_t) \;=\; e^{-\frac{\beta}{m}\int_{t}^{t_f} ds\;\langle(\partial U_s)(q_s)\,,p_s\rangle}\;\sqrt{\frac{2\,m}{\tau\,\beta}}\;\big\langle d_{-}w_t\,,(\partial_p f_t)(x_t)\big\rangle $$
We conclude that, in any time interval where $g_t$ is integrable, it is also a martingale with respect to the measure $\mathbb{P}$ generated by (25):
$$ \mathbb{E}_{\mathbb{P}}\big[\,g_{t_\iota}(x_{t_\iota})\;\big|\;x_{t_f}=x\,\big] \;=\;\mathbb{E}_{\mathbb{P}}\big[\,g_{t_f}(x_{t_f})\;\big|\;x_{t_f}=x\,\big] \;=\; f_{t_f}(x) $$
The chain of identities yields the claim. □
In the case of particular physical interest, when the system probability density at time $t_\iota$ takes the Maxwell–Boltzmann form
$$ p_{t_\iota}(x) \;=\;\frac{e^{-\frac{\beta\,\|p\|^{2}}{2\,m}\,-\,\beta\,U_\iota(q)}}{Z} $$
(26) reduces to
$$ p_t(x) \;=\; e^{-\frac{\beta\,\|p\|^{2}}{2\,m}}\;\mathbb{E}_{\mathbb{P}}\!\left[\,\frac{e^{-\beta\,U_\iota(q_{t_\iota})}}{Z}\;\, e^{-\frac{\beta}{m}\int_{t_\iota}^{t} ds\;\langle(\partial U_s)(q_s)\,,p_s\rangle} \;\middle|\; x_t=x \,\right] $$

3.1. Representation with Respect to a Time Reversal Invariant Measure

We now turn our attention to the path measure $\mathbb{I}$ generated by the forward process
$$ dq_t \;=\;\frac{p_t}{m}\,dt \tag{27a} $$
$$ dp_t \;=\; -\,(\partial U_t)(q_t)\,dt \;+\;\sqrt{\frac{2\,m}{\tau\,\beta}}\; dw_t \tag{27b} $$
The drift of the process is divergenceless. This, together with the statistical invariance under time reversal of the Wiener process, implies that we can also interpret the equation
$$ \left[\partial_t \;+\;\Big\langle\frac{p}{m}\,,\partial_q\Big\rangle \;-\;\big\langle(\partial U_t)(q)\,,\partial_p\big\rangle \;+\;\frac{m}{\beta\,\tau}\,\partial_p^{2}\right] f_t(q,p) \;=\; 0 \tag{28} $$
both as a backward Kolmogorov equation and as the Fokker–Planck equation associated with (27a) and (27b) once we replace forward differentials with backward differentials. This means that the path measure $\mathbb{P}$ generated by (25) is absolutely continuous with respect to $\mathbb{I}$. In particular, we find that
$$ \frac{d\mathbb{P}}{d\mathbb{I}} \;=\;\exp\left(\int_{t_\iota}^{t}\left[\sqrt{\frac{\tau\,\beta}{2\,m}}\;\Big\langle {\circ}\,d_{-}w_s\,,\frac{p_s}{\tau}\Big\rangle \;-\;\frac{\tau\,\beta}{4\,m}\; ds\;\Big\|\frac{p_s}{\tau}\Big\|^{2}\right]\right) $$
As in Section 2, the symbol $\circ$ denotes the post-point prescription in the construction of the stochastic integral
$$ \int_{t_\iota}^{t}\Big\langle {\circ}\,d_{-}w_s\,,\frac{p_s}{\tau}\Big\rangle \;=\;\lim_{dt\,\downarrow\,0}\;\sum_{t_\iota < t_i \leq t}\Big\langle w_{t_i}-w_{t_{i-1}}\,,\frac{p_{t_i}}{\tau}\Big\rangle $$
The immediate consequence is the representation of the solution of the Fokker–Planck Equation (23):
$$ p_t(x) \;=\; e^{-\frac{\beta\,\|p\|^{2}}{2\,m}}\;\mathbb{E}_{\mathbb{I}}\!\left[\,\frac{e^{-\beta\,U_\iota(q_{t_\iota})}}{Z}\;\, e^{-\frac{\beta}{m}\int_{t_\iota}^{t} ds\;\langle(\partial U_s)(q_s)\,,p_s\rangle}\;\frac{d\mathbb{P}}{d\mathbb{I}} \;\middle|\; x_t=x \,\right] \tag{29} $$
Some further simplifications are possible. In view of the identities
t ι t d s     ( U s ) ( q s )   , p s   = t ι t   2   m τ   β d w s d p s   ,   p s  
and
d ‖p_t‖² = 2 ⟨ p_t , dp_t ⟩ + ( 2 m d/(τ β) ) dt
we can recast (29) in the form
p t ( x ) = e ( t t ι ) d τ E I e β   U ι ( q t ι ) β p t ι 2 2   m t ι t β   τ 2   m   d w s   ,   p s τ   + τ   β 4   m d s   p s τ 2 Z   |   x t = x
Finally, we can re-write the stochastic integral in the pre-point discretization using the identity
t ι t τ   β 2   m   d w s   ,   p s τ   = t ι t τ   β 2   m   d w s   , p s τ   + ( t t ι ) d τ
Thus, we arrive at
p t ( x ) = E I 1 Z   e β   U ι ( q t ι ) β p t ι 2 2   m t ι t β   τ 2   m   d w s   ,   p s τ   + τ   β 4   m   d s   p s τ 2 | x t = x
The advantage of taking averages with respect to the measure I is that it allows us to use paths of (27a) and (27b) to integrate both the Fokker–Planck equation and a coupled Hamilton–Jacobi–Bellman equation, necessary when constructing numerical solutions of Schrödinger bridge problems.

3.2. Numerical Example

In this section, we illustrate the numerical integration of the Fokker–Planck Equation (23), governing the evolution of the joint density of the momentum and position processes under the underdamped dynamics. We once again consider the optimal control problem of minimizing the Kullback–Leibler divergence from a free diffusion (16) on the interval [ t_ι , t_f ]. This optimal control problem is formulated as the minimization of a Bismut–Pontryagin functional, resulting in the coupled partial differential equations
t +   p m   , q     ( U t ) ( q )   , p     p   , p τ   m β   τ p 2   p t ( x ) = 0
t + p m   ,   p p τ + ( U t ) ( q )   ,   q + m β   τ   p 2 V t ( x ) = β m 4   τ 2 ( U t ) ( q ) 2
with the stationarity condition
∂_q U_t(q) = ( ∫_{R^d} d^d p  p_t(p,q) ∂_p V_t(p,q) ) / ( ∫_{R^d} d^d p  p_t(p,q) )
We identify (30a) as the Fokker–Planck and (30b) as the Hamilton–Jacobi–Bellman equation. Using the Girsanov theorem, we find an expression for the intermediate density, the solution of (30a), as an expectation
p t ( x ) = E P ι ( p t ι   ,   q t ι ) e τ β 4   m t ι t d s p s 2 τ β 2   m t ι t d w s   ,   p s   |   x t = x
taken over trajectories of the process (27a) and (27b).
We consider the case where the boundary conditions are assigned on the joint distribution at the initial and final times
P ι ( p , q ) = 1 Z ι exp β p 2 2 β   U ι ( q )
P f ( p , q ) = 1 Z f exp β p 2 2 β   U f ( q )
with Z ι , Z f normalizing constants.
The numerical computation is summarized in Algorithm 2. For the optimal control potential and the benchmark solution, we use the numerical predictions from [25], where the optimal protocol in the underdamped dynamics is obtained from a multiscale perturbative expansion around the overdamped problem; for more detail, see Section 8.2 of [25]. The prediction for the optimal control protocol is used as the drift in the integration of the Fokker–Planck Equation (30a), with the results shown in Figure 3. The convergence of the numerical scheme is illustrated in Figure 4: we compute relative distances from the assigned final boundary data (33b) and the variance of the results of Algorithm 2 as the number of sampled trajectories increases while the step size remains constant.
Algorithm 2 Integrating a Fokker–Planck equation for an underdamped diffusion process using the Girsanov theorem
  • Initialize p t n ,   q t n = p ,   q R
  • Initialize g = 0
  • Initialize U t n for n { 0 , 1 , , N }
  • Initialize δ t = | t n 1 t n |
  • for i in 0 , , n 1 do
  •    Sample Brownian noise: ϵ N ( 0 , 1 )
  •    Evolve one backward step of (29): q_{t_{n−1}}^i = q_{t_n}^i − ( p_{t_n}^i/m ) δt ,   p_{t_{n−1}}^i = p_{t_n}^i + (∇U_{t_n})(q_{t_n}^i) δt + √( 2 m δt/(τ β) ) ϵ
  •    Add Girsanov weight: g = g + δ t τ β 4   m   p t n 1 i 2 + δ t τ β 2   m     ϵ   p t n 1 i
  • end for
  • Return p t n ( p , q ) = P ι ( p t ι , q t ι )   e g
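A minimal Python sketch of Algorithm 2 may help fix ideas. The harmonic potential, all parameter values, and the sign conventions of the Girsanov log-weight (written here in exponential-martingale form) are our own illustrative assumptions, not the protocol of [25]; the exact weight to use is the one of (31).

```python
import numpy as np

# Sketch of Algorithm 2: estimate rho_t(p, q) by integrating the
# underdamped process backward in time and accumulating a Girsanov
# log-weight along each trajectory.  The potential U(q) = q^2/2 and
# all constants are illustrative placeholders.
m, tau, beta = 1.0, 1.0, 1.0

def grad_U(q):
    return q  # gradient of the placeholder potential U(q) = q^2 / 2

def rho_initial(p, q):
    # Maxwell-Boltzmann form (33a) for the placeholder potential
    Z = (2.0 * np.pi / beta) * np.sqrt(m)
    return np.exp(-beta * p**2 / (2.0 * m) - beta * q**2 / 2.0) / Z

def rho_fp_girsanov(p0, q0, t, t_iota, n_steps, n_samples, seed=0):
    """Monte Carlo estimate of rho_t(p0, q0), cf. Algorithm 2."""
    rng = np.random.default_rng(seed)
    dt = (t - t_iota) / n_steps
    p = np.full(n_samples, float(p0))
    q = np.full(n_samples, float(q0))
    g = np.zeros(n_samples)  # Girsanov log-weight
    for _ in range(n_steps):
        eps = rng.standard_normal(n_samples)
        q_new = q - (p / m) * dt
        p_new = p + grad_U(q) * dt + np.sqrt(2.0 * m * dt / (tau * beta)) * eps
        # quadratic term taken with the minus sign of the
        # exponential-martingale form (an assumption of this sketch)
        g += -(tau * beta / (4.0 * m)) * p_new**2 * dt \
             + np.sqrt(tau * beta * dt / (2.0 * m)) * eps * p_new
        p, q = p_new, q_new
    return np.mean(rho_initial(p, q) * np.exp(g))
```

The estimator averages the initial density, transported backward along the sampled paths, times the exponential weight; the convergence check of Figure 4 corresponds to increasing n_samples at fixed step size.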

4. Bismut–Elworthy–Li Monte Carlo Representation of Gradients

Numerical integration of Schrödinger bridge type problems, in the overdamped [12,13,15,41] and underdamped [22,23,25,53] cases, requires the solution of a Hamilton–Jacobi–Bellman (also known as a dynamic programming) equation, specifying the optimal control potential. In the simplest overdamped setup, the mechanical force is given by (19). The function
V t ( q ) : [ t ι , t f ]   ×   R d R
solves a Burgers type equation. More generally, optimization problems often require computing gradients of scalar functions satisfying a non-homogeneous backward Kolmogorov equation in [ t ι   , t f ] of the form
t +   b t ( x )   , x   +   A t ( x ) A t ( x )   , x x   V t ( x ) =     F t ( x )
V t f ( x ) = φ ( x )
The left hand side of (34a) is the mean forward derivative (see Chapter 11 of [54]) of V t along the paths of the n-dimensional system of Itô stochastic differential equations:
d x t = b t ( x t )   d t + A t ( x t )   d w t
In (34a) and (35), we consider drift b and volatility fields A of a more general form than in Section 2 and Section 3. This choice means that the following discussion is applicable to both overdamped and underdamped cases, as well as to more general situations, including non-linear problems [40]. In non-linear problems, the expression of the solution of (34a) and (34b) and its gradient are iteratively computed in sequences of infinitesimal time horizons [ t ι , t f ] to construct the solution of partial differential equations in which b , A and F depend upon the unknown field V.
It is well known that Dynkin’s formula (see, e.g., Chapter 6 of [2]) yields a Monte Carlo representation of the solution of (34a) and (34b):
V t ( x ) = E P φ ( x t f ) + t t f d s   F s ( x s )   |   x t = x
Our goal is to find an analogous expression for the gradient of V t . The Bismut–Elworthy–Li formula [34,35,55] accomplishes this task.
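Before turning to gradients, it is instructive to check (36) in a toy case of our own choosing: vanishing drift, constant volatility σ, F = 0, and φ(x) = x², for which V_t(x) = x² + σ²(t_f − t) exactly.

```python
import numpy as np

# Dynkin / Feynman-Kac check: for dx = sigma dw, F = 0 and
# phi(x) = x^2, the representation (36) gives
# V_t(x) = E[x_{t_f}^2 | x_t = x] = x^2 + sigma^2 (t_f - t).
rng = np.random.default_rng(0)
sigma, x0, t, t_f = 0.5, 1.0, 0.0, 1.0
n_samples = 200_000
# x_{t_f} is Gaussian, so no time stepping is needed in this toy case
x_tf = x0 + sigma * np.sqrt(t_f - t) * rng.standard_normal(n_samples)
V_mc = np.mean(x_tf**2)                 # Monte Carlo estimate of (36)
V_exact = x0**2 + sigma**2 * (t_f - t)  # analytic value
```

The Monte Carlo estimate agrees with the analytic value up to the usual O(n_samples^{−1/2}) statistical error.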
Remark 3.
In what follows, to neaten mathematical formulae, we adopt the push-forward notation for the Jacobian matrix of a vector field. We refer to Section O.j, pag. xlii of [56] for a geometrical justification of the notation.
For any v : R d R d , we write
⟨ e_i , v_∗ e_j ⟩ := ∂_{x_j} v^i(x)
where e i and e j are, respectively, the i-th and j-th elements of the canonical basis of R d . Under our regularity assumptions, we regard the solution of (35) satisfying the condition
x_s = x   ∀ s ≤ t
as the image of the stochastic flow X : R × R × R d R d [57] such that
x t = X t , s ( x s )
and omit reference to the initial data on the left hand side, when no ambiguity arises. According to (37), we denote the cocycle obtained by differentiating the flow X_{t,s} with respect to its argument by x_{t,s∗}, implying that
⟨ e_i , x_{t,s∗} e_j ⟩ := ∂_{x_j} X_{t,s}^i(x)
By definition, x_{t,s∗} enjoys the cocycle property [3], meaning that
x_{t,s∗}(x_s) = x_{t,u∗}(x_u) x_{u,s∗}(x_s)   ∀ s ≤ u ≤ t
Here, we present a heuristic, physics-style derivation of the formula based on Malliavin’s stochastic variational calculus [58] which draws from the mathematically more rigorous exposition in [36], and is close to the original treatment in [34]. To this end, we observe that if e i is the i-th element of the canonical basis of R n
e i   , ( V t ) ( x ) = E x t f , t e i   , ( φ ) ( x t f ) + t t f d s   x t f , t e i   , ( F s ) ( x s ) | x t = x
where x t f , t denotes the matrix valued process obtained by varying (35) with respect to its initial datum. In other words, if we suppose
x_s = x   ∀ s ≤ t
then
d x_{t,s∗} = ( b_{t∗}(x_t) dt + A_{t∗}(x_t) dw_t ) x_{t,s∗} ,   x_{s,s∗} = 1_n
The identity (38) allows us to derive the Bismut–Elworthy–Li formula from Malliavin’s integration by parts formula.

4.1. Integration by Parts Formula

Let us consider the equation
  d x t ( ε ) = b t ( x t ( ε ) ) + ε   h t   d t + A t ( x t ( ε ) )   d w t   x s ( ε ) = x
We assume h_t to be a differentiable process, although rigorous constructions of the integration by parts formula, see, e.g., [58], weaken this assumption to processes of bounded variation (see Chapter 1 of [2]). Differentiating at ε = 0 yields the variational equation
d x́_t = ( b_{t∗}(x_t) dt + A_{t∗}(x_t) dw_t ) x́_t + h_t dt ,   x́_s = 0
We can always write the solution of this latter equation in terms of the push-forward of the flow of (35):
x́_t = ∫_s^t du  x_{t,u∗} h_u
Therefore, for sufficiently small ε ,
x_t^{(ε)} = x_t + ε x́_t + h.o.t.
allows us to regard the solution of (40) as a functional of the solution of (35) (h.o.t. stands for higher order terms). The conclusion is that we can compute the expectation value of any integrable function g of a solution of (40) by expressing it as a function of the solution of (35) via (42) and then averaging with respect to the measure P generated by (35):
E P ε g ( x t )   |   x s = x = E P g ( x t ( ε ) )   |   x s = x
A second connection comes from Girsanov’s change-of-measure formula. Namely, if P ε is the path measure generated by (39), then, for any test function g, we obtain the identity
E P ε g ( x t )   |   x s = x = E P g ( x t ) d P ε d P   |   x s = x = E P g ( x t )   exp s t   d w u   , ε   A u 1 ( x u ) h u   d u   ε   A u 1 ( x u )   h u 2 2   |   x s = x
If g is also sufficiently regular, upon differentiating (43) and (44) at ε = 0 , we arrive at Malliavin’s integration by parts formula:
E_P[ ⟨ x́_t , (∇g)(x_t) ⟩ | x_s = x ] = E_P[ g(x_t) ∫_s^t ⟨ dw_u , A_u^{−1}(x_u) h_u ⟩ | x_s = x ]
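A quick numerical sanity check of the integration by parts formula, in a toy setup of our own: take x_t = x + w_t (b = 0, A = 1), the constant perturbation h = 1, and g(x) = x². Then the variation is simply x́_t = t − s, and both sides equal 2x(t − s).

```python
import numpy as np

# Integration by parts check for b = 0, A = 1, h = 1, g(x) = x^2:
# the variation is delta x_t = t - s, and both sides of (45)
# should equal 2 x (t - s).
rng = np.random.default_rng(4)
x, s, t = 1.5, 0.0, 1.0
n_samples = 400_000
w = np.sqrt(t - s) * rng.standard_normal(n_samples)
lhs = np.mean((t - s) * 2.0 * (x + w))  # E[<delta x_t, grad g(x_t)>]
rhs = np.mean((x + w) ** 2 * w)         # E[g(x_t) int <dw_u, h_u>]
exact = 2.0 * x * (t - s)
```

Both Monte Carlo averages agree with the exact value within statistical error, illustrating how a derivative on the left is traded for a stochastic integral weight on the right.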

4.2. Application to Non-Degenerate Diffusion

We set
h u = x u , s e i
where e i is the i-th element of the canonical basis of R n . This is legitimate because, under standard regularity assumptions, x u , s is a process of finite variation. Upon inserting into (41), we obtain
x́_t = ( t − s ) x_{t,s∗} e_i
The integration by parts formula becomes
E_P[ ⟨ x_{t,s∗} e_i , (∇g)(x_t) ⟩ | x_s = x ] = E_P[ ( g(x_t)/(t − s) ) ∫_s^t ⟨ dw_u , A_u^{−1}(x_u) x_{u,s∗} e_i ⟩ | x_s = x ]
The identity holds for arbitrary t ≥ s. Hence, we can apply it to (38) in order to derive the expression for the gradient of the solution of (34a) and (34b):
⟨ e_i , (∇V_t)(x) ⟩ = E[ ( φ(x_{t_f})/(t_f − t) ) ∫_t^{t_f} ⟨ dw_u , A_u^{−1}(x_u) x_{u,t∗} e_i ⟩ | x_t = x ] + E[ ∫_t^{t_f} ds ( F_s(x_s)/(s − t) ) ∫_t^s ⟨ dw_u , A_u^{−1}(x_u) x_{u,t∗} e_i ⟩ | x_t = x ]
provided the volatility field A is always non-singular.

Application to the Transition Probability Density

It is worth noticing the following consequence of (46) when F s vanishes. In such a case, (46) reduces to
∇_x ∫_{R^d} d^d y  φ(y) p_{t_f,t}(y | x) = ( 1/(t_f − t) ) E_P[ φ(x_{t_f}) ∫_t^{t_f} ⟨ dw_u , A_u^{−1}(x_u) x_{u,t∗} ⟩ | x_t = x ]
As the identity must hold true for any φ , we can also write
∇_x p_{t_f,t}(y | x) = ( 1/(t_f − t) ) E_P[ δ^{(d)}(y − x_{t_f}) ∫_t^{t_f} ⟨ dw_u , A_u^{−1}(x_u) x_{u,t∗} ⟩ | x_t = x ]
A result by Molchanov, Section 5 of [59], allows us to express (47) in terms of an expectation value with respect to a reciprocal process, see, e.g., [60]. Namely, given a Markov process in [ t ι , t f ] , we can use it to construct a reciprocal process, i.e., a process conditioned at both ends of the time horizon from the relations
p_{t | t_f, t_ι}(x | z, y) = p_{t_f,t}(z | x) p_{t,t_ι}(x | y) / p_{t_f,t_ι}(z | y) ,   t_ι ≤ t ≤ t_f
p_{t_2, t_1 | t_f, t_ι}(x_2, x_1 | z, y) = p_{t_f,t_2}(z | x_2) p_{t_2,t_1}(x_2 | x_1) p_{t_1,t_ι}(x_1 | y) / p_{t_f,t_ι}(z | y) ,   t_ι ≤ t_1 ≤ t_2 ≤ t_f
etc.
Upon contrasting (48) with (47), we thus arrive at Bismut’s formula (page 78 of [34]) for the gradient of the transition probability density:
∇_x p_{t_f,t}(y | x) / p_{t_f,t}(y | x) = E_P[ ∫_t^{t_f} ⟨ dw_u , A_u^{−1}(x_u) x_{u,t∗} ⟩ | x_{t_f} = y , x_t = x ]
The subscript P here means that we construct all finite dimensional approximations to the reciprocal process from the transition probability density of (35) according to (47).
Unfortunately, (49) does not directly provide a Monte Carlo representation of the score function because the derivative acts on the variable expressing the condition. It is, however, possible to use ideas similar to those of the previous sections to obtain a Monte Carlo representation of the score function.

4.3. Analytical Example

It is worth illustrating the use of the Dynkin’s and Bismut–Elworthy–Li formulas in a case where all calculations can be performed explicitly. To this end let us consider
d q_t = ( p_t/m ) dt + √( 2 η τ/(m β) ) dw_t^{(1)}
d p_t = −( p_t/τ ) dt + √( 2 m/(β τ) ) dw_t^{(2)}
whose solution is simply
q_t = q_{t_ι} + √( 2 η τ/(m β) ) w_t^{(1)} + ∫_{t_ι}^t ds  p_s/m
p_t = p_{t_ι} e^{−(t − t_ι)/τ} + √( 2 m/(β τ) ) ∫_{t_ι}^t dw_s^{(2)} e^{−(t − s)/τ}
Let us consider the partial differential equation
∂_t V_t(q,p) + ( p/m ) ∂_q V_t(q,p) − ( p/τ ) ∂_p V_t(q,p) + ( η τ/(m β) ) ∂_q² V_t(q,p) + ( m/(β τ) ) ∂_p² V_t(q,p) = 0
V_{t_f}(q,p) = p
It is straightforward to verify that, at any time t ≤ t_f,
V_t(p,q) = p e^{−(t_f − t)/τ}
Upon applying Dynkin’s formula (36), we verify that
V t ( p , q ) = E p t f   |   q t = q , p t = p
Next, we wish to apply the Bismut–Elworthy–Li formula to recover
∂_p V_t(p,q) = e^{−(t_f − t)/τ}
To this end, we determine the cocycle solution of the linearized dynamics. The cocycle equation is
ẋ_{t,s∗} = [ 0 , 1/m ; 0 , −1/τ ] x_{t,s∗} ,   x_{s,s∗} = [ 1 , 0 ; 0 , 1 ]   (matrix rows separated by semicolons)
from where we obtain the unique solution
x_{t,s∗} = [ 1 , ( τ/m ) ( 1 − e^{−(t − s)/τ} ) ; 0 , e^{−(t − s)/τ} ]
To evaluate the Bismut–Elworthy–Li formula, we also need the inverse of the volatility matrix, which is
A^{−1} = [ √( β m/(2 η τ) ) , 0 ; 0 , √( β τ/(2 m) ) ]
We thus obtain
∂_p V_t(p,q) = E[ ( p_t e^{−(t_f − t)/τ} + √( 2 m/(β τ) ) ∫_t^{t_f} dw_s^{(2)} e^{−(t_f − s)/τ} ) × ( 1/(t_f − t) ) ∫_t^{t_f} ( ⟨ dw_u^{(1)} , √( β m/(2 η τ) ) ( τ/m ) ( 1 − e^{−(u − t)/τ} ) ⟩ + ⟨ dw_u^{(2)} , √( β τ/(2 m) ) e^{−(u − t)/τ} ⟩ ) | q_t = q , p_t = p ]
Using standard properties of stochastic integrals [2], we recover the expected result:
∂_p V_t(p,q) = ( 1/(t_f − t) ) ∫_t^{t_f} ds  e^{−(t_f − s)/τ} e^{−(s − t)/τ} = e^{−(t_f − t)/τ}
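The same cancellation can be observed numerically. The sketch below (parameters of our own choosing; one-dimensional momentum only, since the position marginal does not contribute to ∂_p V in this example) estimates the gradient via the Bismut–Elworthy–Li weight with the scalar cocycle e^{−(u−t)/τ}.

```python
import numpy as np

# Bismut-Elworthy-Li check for the OU momentum dp = -(p/tau) dt + sigma dw
# with phi(p) = p: the estimate of d/dp E[p_T | p_t = p] should
# reproduce exp(-(T - t)/tau).  Parameters are illustrative.
rng = np.random.default_rng(1)
tau, sigma = 1.0, 0.8
t, T, n_steps, n_samples = 0.0, 1.0, 200, 100_000
dt = (T - t) / n_steps
p = np.full(n_samples, 0.3)
bel = np.zeros(n_samples)  # accumulates int sigma^{-1} x_{u,t} dw_u
u = t
for _ in range(n_steps):
    dw = np.sqrt(dt) * rng.standard_normal(n_samples)
    bel += (np.exp(-(u - t) / tau) / sigma) * dw  # cocycle e^{-(u-t)/tau}
    p += -(p / tau) * dt + sigma * dw             # Euler-Maruyama step
    u += dt
grad_mc = np.mean(p * bel) / (T - t)
grad_exact = np.exp(-(T - t) / tau)
```

Only the correlated part of p_T and the weight survives the average, which is precisely the integral evaluated above.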

4.4. Numerical Example

In this section, we apply the Bismut–Elworthy–Li formula to compute the gradient of the value function in the optimal control problem of minimizing the Kullback–Leibler divergence (16) in the overdamped dynamics: the gradient of the solution to (18b). This is calculated as a numerical average over sampled trajectories of (3). We use the same approximation of the optimal control potential U as in the Fokker–Planck example of Section 2.2.4. We find
q_{s,t∗} = 1 for s = t ,   q_{s,t∗} = e^{−μ ∫_t^s dr (∂² U_r)(q_r)} for s > t
Hence, (46) becomes
( ∇_q V_t )(q) = √( β/(2 μ) ) E[ ( φ(q_{t_f})/(t_f − t) ) ∫_t^{t_f} ⟨ dw_s , e^{−μ ∫_t^s dr (∂² U_r)(q_r)} ⟩ + ( β μ/4 ) ∫_t^{t_f} ds ‖(∇U_s)(q_s)‖² ( 1/(s − t) ) ∫_t^s ⟨ dw_r , e^{−μ ∫_t^r dv (∂² U_v)(q_v)} ⟩ | q_t = q ]
We repeatedly sample trajectories of process (3) using the Euler–Maruyama discretization scheme and compute the integrals as running costs over each trajectory, finally taking a numerical expectation. The calculation is summarized in Algorithm 3 and numerical results shown in Figure 5.
From the physics point of view, note that we can identify the mobility constant with the ratio
μ = τ/m
for consistency with the underdamped equations.
Algorithm 3 Monte Carlo integration for gradient of the value function
  • Initialize q t n = q R
  • Initialize ι 1 = 0
  • Initialize ι 2 = 0
  • Initialize ι 3 = 0
  • Initialize drift U t n for n { 0 , 1 , , N }
  • for i in n , , N 1 do
  •    Sample Brownian noise: ϵ N ( 0 , 1 )
  •    Compute the BEL weights:
  •    if i == n then
  •       ι 2 = ι 2 + δ t   ϵ
  •    else
  •       ι 1 = ι 1 + μ   δ t   ( 2 U t n ) ( q t n )
  •       ι 2 = ι 2 + δ t   ϵ   e   ι 1
  •    end if
  •     ι 3 = ι 3 + δ t   ( q U t i ) ( q t i ) 2   ι 2
  •    Evolve one step of (3): q t i + 1 = q t i μ   ( q U t i ) ( q t i )   δ t + δ t 2   μ β   ϵ
  • end for
  • Return V t n ( q ) = β   μ 2   φ ( q t f ) t f t n   ι 2 + β   μ 4   ι 3
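A degenerate check of the Algorithm 3 skeleton (a simplification of our own): for U = 0 the cocycle is identically 1, the running cost vanishes, and with terminal condition φ(q) = q the estimator should return ∂_q V ≈ 1.

```python
import numpy as np

# Force-free check of the Algorithm 3 skeleton: U = 0, phi(q) = q.
# The cocycle weight is 1, the running-cost term vanishes, and the
# BEL estimate of grad V equals 1.  Parameters are illustrative.
rng = np.random.default_rng(2)
mu, beta = 1.0, 1.0
t, T, n_steps, n_samples = 0.0, 1.0, 100, 100_000
dt = (T - t) / n_steps
q = np.full(n_samples, 0.7)
iota2 = np.zeros(n_samples)  # running BEL weight (cocycle = 1)
for _ in range(n_steps):
    eps = rng.standard_normal(n_samples)
    iota2 += np.sqrt(dt) * eps
    q += np.sqrt(2.0 * mu * dt / beta) * eps  # free overdamped diffusion
grad_mc = np.sqrt(beta / (2.0 * mu)) * np.mean(q * iota2) / (T - t)
```

The prefactor is the inverse volatility √(β/(2μ)); reinstating a non-zero potential amounts to multiplying each noise increment by the exponential cocycle weight, as in the pseudocode above.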

5. Application of Bismut–Elworthy–Li to Degenerate Diffusion

For a degenerate diffusion, we cannot directly apply (45) as it stands, because the expression is written in terms of the inverse of a degenerate matrix. Nevertheless, the Bismut–Elworthy–Li formula continues to hold. To give an idea of how this comes about, we consider the counterpart of (35), referring to [36] for a mathematically rigorous treatment. Our starting point is
  d q t = a t ( x t )   d t   d p t = b t ( x t )   d t + A t ( x t )   d w t
with
x t = q t p t : R + R 2   d
and A a non-singular matrix field. The variational equation is
d q́_t = a_{t∗}(x_t) x́_t dt
d ṕ_t = b_{t∗}(x_t) x́_t dt + A_{t∗}(x_t) x́_t dw_t + h_t dt
where we suppose again
x s = 0
We notice that we can always write the solution of the variational equation as
x́_t = x_{t,s∗} c_t
for some vector valued process c t : R R 2   d such that
c s = 0
Upon differentiation, we readily verify that the self-consistency condition is
x_{t,s∗} ċ_t = ( 0 , h_t )ᵀ
whose solution is
c_t = ∫_s^t du  x_{u,s∗}^{−1}(x) ( 0 , h_u )ᵀ
We now avail ourselves of the fact that the above relations hold for a sufficiently regular but otherwise arbitrary vector field h u and choose it such that
c_{t_f} = ( 0 , v )ᵀ
Here, v is a unit vector that specifies the direction of the gradient in the Bismut–Elworthy–Li formula. Namely, given
V t ( x ) = E φ ( x t f )   |   x t = q p
the Bismut–Elworthy–Li formula [36] continues to hold according to the chain of identities
( v · ∂_p ) V_t(x) = E[ ⟨ x_{t_f,t∗} c_{t_f} , (∇φ)(x_{t_f}) ⟩ ] = E[ ⟨ x́_{t_f} , (∇φ)(x_{t_f}) ⟩ ] = E[ φ(x_{t_f}) ∫_t^{t_f} ⟨ dw_u , A_u^{−1}(x_u) h_u ⟩ | x_t = ( q , p )ᵀ ]
provided the conditions (57), (58) are satisfied.
Similarly, we obtain a representation of the derivative with respect to the q variables by alternative choices of h t such that
c_{t_f} = ( v , 0 )ᵀ

5.1. A Strategy for the Explicit Construction of a Variational Field Enforcing the Boundary Conditions

Drawing from [36], we present a straightforward way to construct a variational field on the interval [ t , t f ] such that, e.g., (58) holds true. Let
H_u := ( d x_{u,t∗}/du ) x_{u,t∗}^{−1}
For clarity, we drop the subscripts t , t f in the following. However, there is still an implicit dependence on these parameters, with u taking values in [ t , t f ] . Consider the differential system
( ġ_u , ḟ_u )ᵀ = H_u ( g_u , ℓ_u )ᵀ
with ℓ_u arbitrarily assigned (but sufficiently regular), and g_u and f_u determined by the identity. Then,
ẋ́_u + ( ġ_u , ḟ_u )ᵀ = H_u ( x́_u + ( g_u , ℓ_u )ᵀ ) + ( 0 , h_u )ᵀ
holds by construction for u [ t , t f ] . Hence, if we require
ℓ̇_u = ḟ_u − h_u
we see that
y_u = x́_u + ( g_u , ℓ_u )ᵀ
satisfies
ẏ_u = H_u y_u ,   y_t = ( g_t , ℓ_t )ᵀ ,   ∀ u ∈ [ t , t_f ]
We solve (60) with the “initial condition”
g t = 0
As a consequence, we arrive at
x́_u = x_{u,t∗}(x) ( 0 , ℓ_t )ᵀ − ( g_u , ℓ_u )ᵀ
The identity we just obtained shows that, in order to obtain a representation of the gradient according to (59), we must restrict the choice of vector fields ℓ_u to those satisfying the boundary conditions
ℓ_t = v    &    ℓ_{t_f} = 0 ,   together with g_{t_f} = 0 ,
so that at time u = t f
x́_{t_f} = x_{t_f,t∗}(x) ( 0 , v )ᵀ
holds true. Once all the above conditions are satisfied, we can determine the right hand side of (59) from
h_u = ḟ_u − ℓ̇_u ,   ∀ u ∈ [ t , t_f ]

A Case of Particular Interest

There are an infinite number of ways to choose u such that condition (62) holds true. We detail here a choice of particular interest for physics. Let us consider the generator of the linearized dynamics around a path solution of (22):
H_u(x_u) = [ 0 , 1/m ; −(∂² U_u)(q_u) , −1/τ ]
In such a case, the instantiation of (60) is the differential system
ġ_u = ℓ_u/m ,   ḟ_u = −(∂² U_u)(q_u) g_u − ℓ_u/τ ,   ḟ_u − h_u = ℓ̇_u
We assign
ℓ_u = v ( t_f − u )/( t_f − t ) + v_1 ( t_f − u )( u − t )/( 2 ( t_f − t )² )
and obtain
g_u = v ∫_t^u ds ( t_f − s )/( m ( t_f − t ) ) + v_1 ∫_t^u ds ( t_f − s )( s − t )/( 2 m ( t_f − t )² )
We fix v 1 by requiring
g_{t_f} = 0   ⟺   v_1 = −6 v
We thus obtain
ℓ_u = v ( t_f − u )( t_f + 2 t − 3 u )/( t_f − t )²
g_u = v ( t_f − u )²( u − t )/( m ( t_f − t )² )
and therefore
ḟ_u = −( 1/τ ) v ( t_f − u )( t_f + 2 t − 3 u )/( t_f − t )² − (∂² U_u)(q_u) v ( t_f − u )²( u − t )/( m ( t_f − t )² )

5.2. Analytical Example

We return to the elementary case (50) and (51) but set η = 0. The momentum gradient (52) does not depend on η, yet the application of the Bismut–Elworthy–Li formula requires the inverse of the volatility, which in turn appears to depend on η. Since, in the present example, the potential vanishes identically,
U = 0
we arrive at
∂_p V_t(p,q) = E[ ( p_t e^{−(t_f − t)/τ} + √( 2 m/(β τ) ) ∫_t^{t_f} dw_s^{(2)} e^{−(t_f − s)/τ} ) ∫_t^{t_f} √( β τ/(2 m) ) ⟨ dw_u^{(2)} , −( ℓ_u/τ + ℓ̇_u ) ⟩ | q_t = q , p_t = p ]
Using again the properties of stochastic integrals, the expectation value reduces to
∂_p V_t(p,q) = −∫_t^{t_f} ds ( d/ds )( e^{−(t_f − s)/τ} ℓ_s ) = −ℓ_{t_f} + e^{−(t_f − t)/τ} ℓ_t = e^{−(t_f − t)/τ}
whence we recover the correct expression for the gradient upon recalling the boundary conditions ℓ_t = v = 1 and ℓ_{t_f} = 0 imposed on ℓ in [ t , t_f ].
This example also indicates that the Bismut–Elworthy–Li formula for Langevin–Kramers equations of the form (56) can be recovered in the limit of η tending to zero of a non-degenerate model, owing to the vanishing of expectations of products of Itô stochastic integrals with respect to independent Wiener processes.

5.3. Numerical Example

We demonstrate here a numerical example of using the Bismut–Elworthy–Li formula to find the gradient of a value function satisfying the Hamilton–Jacobi–Bellman Equation (30b). We consider the case where the initial and final conditions assigned on the density are Gaussian distributions. For Gaussian boundary conditions, the value function and optimal control protocol in the underdamped dynamics can be determined as the numerical solution of a system of differential equations; see Section IV of [25]. The value function V_t is quadratic in the momentum and position variables and, in the two-dimensional phase space case, reads
V_t(p,q) = v_t^{(0)} + v_t^{(p)} p + v_t^{(q)} q + ( 1/2 ) ( v_t^{(p,p)} p² + 2 v_t^{(p,q)} p q + v_t^{(q,q)} q² )
for time-dependent coefficients v t ( 0 ) ,   v t ( p ) ,   v t ( q ) ,   v t ( p , p ) ,   v t ( p , q ) and v t ( q , q ) found as in Section IV of [25]. The solution of (34a) and (34b) can be found as
V_t(x) = E[ φ(x_{t_f}) + ( β τ/(4 m) ) ∫_t^{t_f} ds ‖ ∇_q U_s(q_s) ‖² | x_t = x ]
Applying the Bismut–Elworthy–Li formula with the variational field h_u constructed in Section 5.1 gives the following expression for the gradient of the value function with respect to momentum:
∂_p V_t(x) = √( β τ/(2 m) ) E[ φ(x_{t_f}) ∫_t^{t_f} ⟨ dw_s , h_s ⟩ + ( β τ/(4 m) ) ∫_t^{t_f} ds ‖ ∇_q U_s(q_s) ‖² ∫_t^s ⟨ dw_u , h_u ⟩ | x_t = x ]
where
h_u = ḟ_u − ℓ̇_u = −( ℓ̇_u + ℓ_u/τ + (∂² U_u)(q_u) g_u ) = −( v/( τ ( t_f − t )² ) ) ( ( t_f − u )( t_f + 2 t − 3 u ) − τ ( 4 t_f − 6 u + 2 t ) + ( τ/m ) (∂² U_u)(q_u) ( t_f − u )²( u − t ) )
The computation is summarized in Algorithm 4 and results shown in Figure 6.
Algorithm 4 BEL for degenerate diffusion
  • Initialize q t n = q R
  • Initialize p t n = p R
  • Initialize ι 1 = 0
  • Initialize drift function q U t
  • Initialize h ( u , t , T ) as in (67)
  • for i in n , , N 1 do
  •    Sample Brownian noise: ϵ i N ( 0 , δ t )
  •    if  i > n then
  •      Add to running cost: ι 1 = ι 1 + δ t q U t i ( q t i ) 2 j = n i ϵ j   h ( t j , t n , t i )
  •    end if
  •    Evolve one step of (22): q t i + 1 = q t i + p t i m   δ t p t i + 1 = p t i p t i τ + q U t i ( q t i )   δ t + 2   m τ β   ϵ i
  • end for
  • Return p V t n ( x ) = β 2 φ ( x t f ) j = n N 1 ϵ j   h ( t j , t n , t f ) + β 4   ι 1
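The construction can be tested in the force-free setting of Section 5.2, where the answer is known in closed form. The sketch below (parameter values of our own choosing) uses the cubic variational field ℓ_u of (66) and, for U = 0, the weight h_u = −(ℓ̇_u + ℓ_u/τ); the estimate should recover ∂_p V_t = e^{−(t_f−t)/τ}.

```python
import numpy as np

# Degenerate BEL check (Section 5.2): underdamped dynamics with U = 0,
# phi(x) = p.  Using the cubic field ell of (66), the BEL estimate of
# the momentum gradient should approach exp(-(T - t)/tau).
rng = np.random.default_rng(3)
m, tau, beta = 1.0, 1.0, 1.0
t, T, n_steps, n_samples = 0.0, 1.0, 200, 100_000
dt = (T - t) / n_steps
sigma = np.sqrt(2.0 * m / (tau * beta))

def ell(u):  # variational field of (66): ell(t) = 1, ell(T) = 0
    return (T - u) * (T + 2.0 * t - 3.0 * u) / (T - t) ** 2

def ell_dot(u):  # its time derivative
    return (6.0 * u - 4.0 * T - 2.0 * t) / (T - t) ** 2

p = np.full(n_samples, 0.5)
q = np.zeros(n_samples)
bel = np.zeros(n_samples)  # accumulates int sigma^{-1} h_u dw_u
u = t
for _ in range(n_steps):
    dw = np.sqrt(dt) * rng.standard_normal(n_samples)
    h = -(ell_dot(u) + ell(u) / tau)  # h_u for U = 0
    bel += (h / sigma) * dw
    q += (p / m) * dt                 # position carries no noise
    p += -(p / tau) * dt + sigma * dw
    u += dt
grad_mc = np.mean(p * bel)
grad_exact = np.exp(-(T - t) / tau)
```

Note that the weight multiplies only the momentum noise; no inverse of a degenerate matrix is ever needed, which is the point of the construction.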

6. Application to Machine Learning

In this section, we return to the overdamped dynamics and demonstrate an application of the numerical methods discussed above. We present a prototype example of the optimal control problem of minimizing the Kullback–Leibler divergence (16) in the overdamped dynamics. Inspired by the seminal works [62,63], we model the optimal control protocol by a neural network and use gradient descent to update it iteratively based on the stationarity condition (19).
As before, we formulate the problem in terms of a Bismut–Pontryagin cost functional. Additionally, we enforce the assigned boundary conditions (initial and final conditions on the density of the form (17a) and (17b)) through a Lagrangian multiplier λ . This gives
A [ p , U , V ] = R d d d q V t f ( q ) p t f ( q ) + λ ( q )   ( p t f ( q ) P f ( q ) ) + V t ι ( q ) P ι ( q )   + t ι t f d t   R d d d q   p t ( q ) β μ 4 ( U t ) ( q ) 2 + t μ ( q U ) ( q )   ,   q + μ β q 2 V t ( q )
Taking stationary variations with respect to the density p, the control protocol U, and the value function V yields the coupled partial differential Equations (18a) and (18b), together with the stationarity condition (19). We identify
V t f ( q ) = λ ( q )
and the following update rule:
λ ( new ) ( q ) = λ ( old ) ( q ) + γ 1   log p t f ( q ) P f ( q )
chosen in this way to preserve the integrability conditions of the value function. The stationarity condition gives an update rule for the drift of the control protocol:
( ∇U_t )^{(new)}(q) = ( ∇U_t )^{(old)}(q) − γ_2 ( ( ∇U_t )^{(old)}(q) − ( β/2 ) ( ∇V_t )(q) )
The parameters γ_1 , γ_2 > 0 control the step sizes of the gradient descent and are known as learning rates. The update for the Lagrange multiplier is a gradient ascent rather than a descent [64].
The right hand sides of both (69) and (70) can be computed using Monte Carlo integration techniques discussed in this article. With appropriate parametrization of the gradient of the control protocol and the Lagrangian multiplier λ , the method could therefore scale to high dimensions. In this prototype example, we use a polynomial regression for fitting λ and a neural network for the gradient of the control protocol. The polynomial regression could be replaced with any suitable parametrization, in particular, with a second neural network.
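In schematic form, the two updates read as follows; rho_tf, target, grad_U, and grad_V are placeholder names of ours for arrays produced by Algorithms 1 and 3, and the sign conventions follow (69) and (70).

```python
import numpy as np

# Schematic implementation of the update rules (69) and (70).
def update_lagrange(lmbda, rho_tf, target, gamma1):
    # gradient ascent on the multiplier, acting on log densities;
    # fixed point: rho_tf == target
    return lmbda + gamma1 * np.log(rho_tf / target)

def update_drift(grad_U, grad_V, beta, gamma2):
    # gradient descent toward the stationarity condition (19);
    # fixed point: grad_U == (beta / 2) * grad_V
    return grad_U - gamma2 * (grad_U - (beta / 2.0) * grad_V)
```

At a joint fixed point of both updates, the final-time boundary condition and the stationarity condition are satisfied simultaneously, which is the convergence criterion of the training loop below.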
The gradient of the optimal control protocol U_t is modelled by a feed-forward neural network: fully connected layers, representing affine transformations, with non-linear functions (known as activation functions) between them. The network has a set of parameters (weights and biases) associated with the layers, which we denote by Θ. It takes the time t and the space coordinates q as input. Using a neural network allows the optimal control protocol to be evaluated at new space coordinates without interpolation, meaning that it can easily be used as the drift in the computation of the density and value functions with Algorithms 1 and 3.
The training process can be summarized as follows. Firstly, the functions λ and U with a set of parameters Θ ( 0 ) are initialized. We use these to find the final density p t f under this drift with Algorithm 1. The Lagrange multiplier is updated using (69). The new λ is used as the terminal condition of the value function. We then use Algorithm 3 to compute the gradient of the value function, q V t , using the current drift and terminal condition. The neural network parameters Θ are updated so that the new drift satisfies (70). Under the updated drift, the final density is recomputed and the process is repeated until convergence. The whole iteration is summarized in Algorithm 5.
The results of Algorithm 5 are illustrated in Figure 7. We show the final density in panel (a). Panels (b)–(g) show the approximation of the gradient of the control protocol by the trained neural network.
Algorithm 5 Learning an Optimal Control Protocol by Gradient Descent
  • Initialize a neural network U with parameters Θ ( 0 )
  • Initialize λ ( 0 ) as a polynomial with coefficients zero
  • Initialize γ 1 , γ 2 learning rates
  • while  L max iters do
  •    Initialize a batch q = { q k } k of K points
  •    Compute p t f ( ) ( q ) using Algorithm 1 with U as drift
  •    Update λ ( + 1 ) ( q ) = λ ( ) ( q ) + γ 1 log p t f ( ) ( q ) P f ( q )
  •    Set V t f ( + 1 ) ( q ) = λ ( + 1 ) ( q )
  •    for  t n , n = 0 , , N do
  •      Compute ( V t n ( + 1 ) ) ( q ) using Algorithm 3 with U as drift
  •      Update Θ ( + 1 ) such that U ( q , t n ;   Θ ( + 1 ) ) β 2   ( V t n ( + 1 ) ) ( q ) 2 is minimized
  •    end for
  • end while
  • Return the approximation of the gradient of the optimal control protocol ∇U

7. Conclusions

In this article, we discuss two integration methods for partial differential equations that frequently appear in optimal control problems. We show how the Girsanov theorem allows a Fokker–Planck equation driven by a mechanical potential to be integrated by taking a numerical expectation over Monte Carlo trajectories of an auxiliary stochastic process; the method applies whether the auxiliary process is non-degenerate or degenerate. Secondly, we use the Bismut–Elworthy–Li formula to find expressions for the gradient of the value function satisfying a Hamilton–Jacobi–Bellman equation, again for both a non-degenerate and a degenerate diffusion.
The discussed numerical methods are supported by computational examples. We examine the dynamic Schrödinger bridge problem, i.e., the minimization of the Kullback–Leibler divergence from a free diffusion subject to boundary conditions on the density at the initial and final times. For the overdamped dynamics, our integration shows good agreement with the iterative approach of Caluya and Halder [41] in Figure 2 and Figure 5. In the underdamped case, we integrate the associated Fokker–Planck equation to support the consistency of the multiscale perturbative approach used in [25]. In particular, we compute an estimate of the evolution of the joint density function of the system state for this problem in Figure 3. We also verify the stationarity condition using the Bismut–Elworthy–Li formula for a degenerate diffusion in Figure 6. Finally, we demonstrate an application of both integrations in a simple machine learning model in Figure 7.
The optimal control problem discussed here has many applications. One possibility is in machine learning, for instance, in the development of diffusion models for image generation [44], where one seeks an optimal steering protocol between a noise distribution (e.g., a Gaussian) and a target (e.g., an image) by minimizing the Kullback–Leibler divergence. Optimal control problems in the underdamped dynamics are particularly interesting. Underdamped dynamics take into account random thermal fluctuations, noise, and the effects of inertia; hence, they are well suited to model non-equilibrium transitions at the nanoscale. Models of certain biological systems require considering complex dynamics, for example, because of random external noise from the environment [30]. Such models then result in non-linear partial differential equations, making them difficult to integrate. While the implementation of machine learning to solve an optimal control problem used here is a prototype, it may be possible to extend it to a more general setting. Specifically, we have in mind transitions obeying underdamped dynamics and occurring at minimum entropy production, such as those considered in [53].

Author Contributions

Initial conceptualization, P.M.-G.; further development and refinement, J.S.; writing, J.S. and P.M.-G. All authors have read and agreed to the published version of the manuscript.

Funding

J.S. was supported by a University of Helsinki funded doctoral researcher position, Doctoral Programme in Mathematics and Statistics.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The codes used to generate numerical examples shown in this article are available in Github at https://github.com/julia-sand/kldivergence (accessed on 14 November 2024).

Acknowledgments

The authors gratefully acknowledge Paolo Andrea Erdman for many useful discussions on topics related to the present work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kay, E.R.; Leigh, D.A.; Zerbetto, F. Synthetic Molecular Motors and Mechanical Machines. Angew. Chem. Int. Ed. 2007, 46, 72–191. [Google Scholar] [CrossRef] [PubMed]
  2. Klebaner, F.C. Introduction to Stochastic Calculus with Applications, 2nd ed.; Imperial College Press: London, UK, 2005; p. 432. [Google Scholar] [CrossRef]
  3. Arnold, L. Random Dynamical Systems; Springer Monographs in Mathematics; Springer: Berlin/Heidelberg, Germany, 1998. [Google Scholar]
  4. Sekimoto, K. Stochastic Energetics; Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2010; Volume 799, p. 322. [Google Scholar] [CrossRef]
  5. Seifert, U. Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 2012, 75, 126001. [Google Scholar] [CrossRef] [PubMed]
  6. Peliti, L.; Pigolotti, S. Stochastic Thermodynamics; Princeton University Press: Princeton, NJ, USA; Oxford, UK, 2020. [Google Scholar]
  7. Fodor, E.; Jack, R.L.; Cates, M.E. Irreversibility and Biased Ensembles in Active Matter: Insights from Stochastic Thermodynamics. Annu. Rev. Condens. Matter Phys. 2022, 13, 215–238. [Google Scholar] [CrossRef]
  8. Bechhoefer, J. Control Theory for Physicists; Cambridge University Press: Cambridge, UK, 2021. [Google Scholar] [CrossRef]
  9. Schrödinger, E. Über die Umkehrung der Naturgesetze. Sitzungsberichte Der Preuss. Akad. Der Wiss. Phys. Math. Kl. 1931, 8, 144–153. [Google Scholar] [CrossRef]
  10. Chetrite, R.; Muratore-Ginanneschi, P.; Schwieger, K. E. Schrödinger’s 1931 paper “On the Reversal of the Laws of Nature” [“Über die Umkehrung der Naturgesetze”, Sitzungsberichte der preussischen Akademie der Wissenschaften, physikalisch-mathematische Klasse, 8 N9 144-153]. Eur. Phys. J. H 2021, 46, 1–29. [Google Scholar] [CrossRef]
  11. Todorov, E. Efficient computation of optimal actions. Proc. Natl. Acad. Sci. USA 2009, 106, 11478–11483. [Google Scholar] [CrossRef]
  12. Léonard, C. A survey of the Schrödinger problem and some of its connections with optimal transport. Discret. Contin. Dyn. Syst. Ser. A 2014, 34, 1533–1574. [Google Scholar] [CrossRef]
  13. Chen, Y.; Georgiou, T.T.; Pavon, M. Stochastic Control Liaisons: Richard Sinkhorn Meets Gaspard Monge on a Schrödinger Bridge. SIAM Rev. 2021, 63, 249–313. [Google Scholar] [CrossRef]
  14. Fleming, W.H.; Soner, M.H. Controlled Markov Processes and Viscosity Solutions, 2nd ed.; Stochastic Modelling and Applied Probability; Springer: Berlin/Heidelberg, Germany, 2006; Volume 25, p. 428. [Google Scholar]
  15. Aurell, E.; Gawȩdzki, K.; Mejía-Monasterio, C.; Mohayaee, R.; Muratore-Ginanneschi, P. Refined Second Law of Thermodynamics for fast random processes. J. Stat. Phys. 2012, 147, 487–505. [Google Scholar] [CrossRef]
  16. Villani, C. Optimal Transport: Old and New; Grundlehren der mathematischen Wissenschaften; Springer: Berlin/Heidelberg, Germany, 2009; Volume 338, p. 973. [Google Scholar] [CrossRef]
  17. Schmiedl, T.; Seifert, U. Efficiency at maximum power: An analytically solvable model for stochastic heat engines. EPL (Europhys. Lett.) 2008, 81, 20003. [Google Scholar] [CrossRef]
  18. Schmiedl, T.; Seifert, U. Efficiency of molecular motors at maximum power. EPL (Europhys. Lett.) 2008, 83, 30005. [Google Scholar] [CrossRef]
  19. Muratore-Ginanneschi, P.; Schwieger, K. Efficient protocols for Stirling heat engines at the micro-scale. EPL (Europhys. Lett.) 2015, 112, 20002. [Google Scholar] [CrossRef]
  20. Martínez, I.A.; Roldán, E.; Dinis, L.; Petrov, D.; Parrondo, J.M.R.; Rica, R.A. Brownian Carnot engine. Nat. Phys. 2016, 12, 67–70. [Google Scholar] [CrossRef] [PubMed]
  21. Proesmans, K.; Ehrich, J.; Bechhoefer, J. Finite-Time Landauer Principle. Phys. Rev. Lett. 2020, 125, 100602. [Google Scholar] [CrossRef] [PubMed]
  22. Muratore-Ginanneschi, P. On extremals of the entropy production by “Langevin–Kramers” dynamics. J. Stat. Mech. Theory Exp. 2014, 2014, P05013. [Google Scholar] [CrossRef]
  23. Muratore-Ginanneschi, P.; Schwieger, K. How nanomechanical systems can minimize dissipation. Phys. Rev. E 2014, 90, 060102(R). [Google Scholar] [CrossRef]
  24. Horn, R.A.; Johnson, C.R. Topics in Matrix Analysis; Cambridge University Press: New York, NY, USA, 1991. [Google Scholar] [CrossRef]
  25. Sanders, J.; Baldovin, M.; Muratore-Ginanneschi, P. Optimal Control of Underdamped Systems: An Analytic Approach. J. Stat. Phys. 2024, 191, 117. [Google Scholar] [CrossRef] [PubMed]
  26. Chiarini, A.; Conforti, G.; Greco, G.; Ren, Z. Entropic turnpike estimates for the kinetic Schrödinger problem. Electron. J. Probab. 2021, 27, 1–32. [Google Scholar] [CrossRef]
  27. Celani, A.; Lanotte, A.; Mazzino, A.; Vergassola, M. Fronts in passive scalar turbulence. Phys. Fluids 2001, 13, 1768–1783. [Google Scholar] [CrossRef]
  28. Mazzino, A.; Muratore-Ginanneschi, P. Passive scalar turbulence in high dimensions. Phys. Rev. E 2001, 63, 015302. [Google Scholar] [CrossRef] [PubMed]
  29. Donvil, B.; Muratore-Ginanneschi, P. Quantum trajectory framework for general time-local master equations. Nat. Commun. 2022, 13, 4140. [Google Scholar] [CrossRef] [PubMed]
  30. Maoutsa, D.; Reich, S.; Opper, M. Interacting Particle Solutions of Fokker–Planck Equations Through Gradient–Log–Density Estimation. Entropy 2020, 22, 802. [Google Scholar] [CrossRef] [PubMed]
  31. Boffi, N.M.; Vanden-Eijnden, E. Probability flow solution of the Fokker-Planck equation. Mach. Learn. Sci. Technol. 2022, 4, 035012. [Google Scholar] [CrossRef]
  32. Hyvärinen, A. Estimation of Non-Normalized Statistical Models by Score Matching. J. Mach. Learn. Res. 2005, 6, 695–709. [Google Scholar]
  33. Borrell, E.R.; Quer, J.; Richter, L.; Schütte, C. Improving control based importance sampling strategies for metastable diffusions via adapted metadynamics. SIAM J. Sci. Comput. 2024, 46, S298–S323. [Google Scholar] [CrossRef]
  34. Bismut, J.M. Large Deviations and the Malliavin Calculus; Progress in Mathematics; Birkhäuser Verlag: Basel, Switzerland, 1984; Volume 45, p. 232. [Google Scholar]
  35. Elworthy, K.D.; Li, X.M. Formulae for the Derivatives of Heat Semigroups. J. Funct. Anal. 1994, 125, 252–286. [Google Scholar] [CrossRef]
  36. Zhang, X. Stochastic flows and Bismut formulas for stochastic Hamiltonian systems. Stoch. Process. Their Appl. 2010, 120, 1929–1949. [Google Scholar] [CrossRef]
  37. Fournié, E.; Lasry, J.M.; Lebuchoux, J.; Lions, P.L.; Touzi, N. Applications of Malliavin calculus to Monte Carlo methods in finance. Financ. Stochastics 1999, 3, 391–412. [Google Scholar] [CrossRef]
  38. Fournié, E.; Lasry, J.M.; Lebuchoux, J.; Lions, P.L. Applications of Malliavin calculus to Monte-Carlo methods in finance. II. Financ. Stochastics 2001, 5, 201–236. [Google Scholar] [CrossRef]
  39. Baños, D. The Bismut–Elworthy–Li formula for mean-field stochastic differential equations. Ann. L’Institut Henri Poincaré, Probab. Stat. 2018, 54, 220–233. [Google Scholar] [CrossRef]
  40. Weinan, E.; Hutzenthaler, M.; Jentzen, A.; Kruse, T. Multilevel Picard iterations for solving smooth semilinear parabolic heat equations. Partial. Differ. Equations Appl. 2021, 2, 80. [Google Scholar] [CrossRef]
  41. Caluya, K.F.; Halder, A. Wasserstein Proximal Algorithms for the Schrödinger Bridge Problem: Density Control with Nonlinear Drift. IEEE Trans. Autom. Control 2022, 67, 1163–1178. [Google Scholar] [CrossRef]
  42. Caluya, K.F.; Halder, A. Gradient Flow Algorithms for Density Propagation in Stochastic Systems. IEEE Trans. Autom. Control 2020, 65, 3991–4004. [Google Scholar] [CrossRef]
  43. Vargas, F.; Thodoroff, P.; Lamacraft, A.; Lawrence, N. Solving Schrödinger Bridges via Maximum Likelihood. Entropy 2021, 23, 1134. [Google Scholar] [CrossRef] [PubMed]
  44. De Bortoli, V.; Thornton, J.; Heng, J.; Doucet, A. Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling. Adv. Neural Inf. Process. Syst. 2021, 34, 17695–17709. [Google Scholar]
  45. Pavliotis, G.A. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations; Springer: New York, NY, USA, 2014; p. 339. [Google Scholar] [CrossRef]
  46. Chétrite, R.; Gawȩdzki, K. Fluctuation relations for diffusion processes. Commun. Math. Phys. 2008, 282, 469–518. [Google Scholar] [CrossRef]
  47. Kunita, H. On backward stochastic differential equations. Stochastics 1982, 6, 293–313. [Google Scholar] [CrossRef]
  48. Meyer, P.A. Géométrie différentielle stochastique, II. SÉminaire Probab. Strasbg. 1982, S16, 165–207. [Google Scholar]
  49. Sørensen, M. Estimating functions for diffusion-type processes. In Statistical Methods for Stochastic Differential Equations; Chapman and Hall/CRC Press: New York, NY, USA, 2012; p. 507. [Google Scholar] [CrossRef]
  50. Langouche, F.; Roekaerts, D.; Tirapegui, E. Functional Integration and Semiclassical Expansions; Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 1982; Volume 10, p. 313. [Google Scholar] [CrossRef]
  51. Tzen, B.; Raginsky, M. Theoretical guarantees for sampling and inference in generative models with latent diffusions. In Proceedings of the 32nd Annual Conference on Learning Theory, Phoenix, AZ, USA, 25–28 June 2019; Volume 99, pp. 1–31. [Google Scholar]
  52. Zwanzig, R. Nonequilibrium Statistical Mechanics; Oxford University Press: Oxford, UK, 2001; p. 240. [Google Scholar]
  53. Sanders, J.; Baldovin, M.; Muratore-Ginanneschi, P. Minimal-work protocols for inertial particles in non-harmonic traps. arXiv 2024, arXiv:2407.15678. [Google Scholar] [CrossRef]
  54. Nelson, E. Dynamical Theories of Brownian Motion, 2nd ed.; Princeton University Press: Princeton, NJ, USA, 2001; p. 148. [Google Scholar] [CrossRef]
  55. Elworthy, K.D.; Li, X.M. Differentiation of heat semigroups and applications. In Probability Theory and Mathematical Statistics: Proceedings of the 6th Vilnius Conference 1993; Grigelionis, B., Ed.; V.S.P. International Science: Utrecht, The Netherlands, 1994. [Google Scholar]
  56. Frankel, T. The Geometry of Physics: An Introduction, 3rd ed.; Cambridge University Press: Cambridge, UK, 2012; p. LXII, 686. [Google Scholar]
  57. Kunita, H. Lectures on Stochastic Flows And Applications; Lectures on Mathematics and Physics: Mathematics; Springer: Berlin/Heidelberg, Germany, 1986; Volume 78. [Google Scholar]
  58. Nualart, D. The Malliavin Calculus and Related Topics; Probability and Its Applications; Springer: Berlin/Heidelberg, Germany, 1995; p. 266. [Google Scholar]
  59. Molchanov, S.A. Diffusion processes and Riemannian geometry. Uspekhi Mat. Nauk. 1975, 30, 3–59. [Google Scholar] [CrossRef]
  60. Krener, A.J. Reciprocal diffusions in flat space. Probab. Theory Relat. Fields 1997, 107, 243–281. [Google Scholar] [CrossRef]
  61. Rackauckas, C.; Nie, Q. DifferentialEquations.jl–A performant and feature-rich ecosystem for solving differential equations in Julia. J. Open Res. Softw. 2017, 5, 15. [Google Scholar] [CrossRef]
  62. Weinan, E.; Han, J.; Jentzen, A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat. 2017, 5, 349–380. [Google Scholar] [CrossRef]
  63. Han, J.; Jentzen, A.; Weinan, E. Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. USA 2018, 115, 8505–8510. [Google Scholar] [CrossRef] [PubMed]
  64. Platt, J.; Barr, A. Constrained Differential Optimization. In Proceedings of the Neural Information Processing Systems; Anderson, D., Ed.; American Institute of Physics: College Park, MD, USA, 1987. [Google Scholar]
  65. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  66. Innes, M. Flux: Elegant Machine Learning with Julia. J. Open Source Softw. 2018, 3, 602. [Google Scholar] [CrossRef]
  67. Innes, M.; Saba, E.; Fischer, K.; Gandhi, D.; Rudilosso, M.C.; Joy, N.M.; Karmali, T.; Pal, A.; Shah, V. Fashionable Modelling with Flux. arXiv 2018, arXiv:1811.01457. [Google Scholar] [CrossRef]
Figure 1. Solution of a Fokker–Planck equation driven by a mechanical potential (23), computed using Monte Carlo integration via the Girsanov formula (dashed blue line). We use ∂_q U(q) = 2 q³. The initial condition is p_{t_ι}(q) = (1/√(2π)) exp(−q²/2) at t_ι = 0. The Girsanov method is compared with an implementation of the “proximal gradient descent” method described in [42], shown in orange. For the proximal gradient descent, we use 10⁴ samples from the initial distribution and γ = 0.05 as the regularization parameter; see [42]. Both methods simulate trajectories of the auxiliary stochastic process (11) by the Euler–Maruyama scheme with step size h = 10⁻³. For the Girsanov theorem approach, we evolve 10³ trajectories from 10⁴ initial points in the interval [−6, 6]. The resulting distributions are smoothed by convolution with a box filter. We use t_ι = 0 and μ = β = 1. The expected equilibrium state of the distribution is shown by the shaded area in the final panel at t = 0.75. In our implementation, the Monte Carlo method of integration is roughly three orders of magnitude faster than the proximal gradient descent. Accompanying code for all figures can be found in the link in the Data Availability Statement.
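The Girsanov reweighting behind the dashed curves can be sketched as follows: simulate paths of an auxiliary process and attach to each an exponential Girsanov weight for the target drift. Below is a minimal Python sketch (the accompanying code is in Julia); for simplicity the reference process here is driftless, which need not coincide with the auxiliary process (11) of the text, and all names and parameters are illustrative.

```python
import math
import random

def girsanov_weighted_paths(x0, drift, T, n_steps, n_paths, sigma=1.0, seed=0):
    """Evolve driftless auxiliary paths dX = sigma dW from x0, attaching to
    each path the Girsanov weight for the target SDE
    dX = drift(X) dt + sigma dW.  The weighted empirical measure of the
    endpoints then estimates the Fokker-Planck solution at time T."""
    rng = random.Random(seed)
    h = T / n_steps
    endpoints, weights = [], []
    for _ in range(n_paths):
        x, logw = x0, 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(h))
            b = drift(x)
            # log of the Girsanov factor: (b/sigma) dW - (1/2)(b/sigma)^2 dt
            logw += (b / sigma) * dw - 0.5 * (b / sigma) ** 2 * h
            x += sigma * dw  # the auxiliary process carries no drift
        endpoints.append(x)
        weights.append(math.exp(logw))
    return endpoints, weights
```

A weighted histogram (or kernel estimate) of `endpoints` with `weights` then plays the role of the Monte Carlo density estimate. Since the weight process is a martingale, the weights should average to one, which gives a cheap sanity check.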
Figure 2. Solution of a Fokker–Planck equation driven by a time-dependent mechanical potential, computed using Monte Carlo integration via the Girsanov formula (dashed blue line). The optimal protocol U and the reference solution (orange line) are computed using an iterative method [41]. For the Girsanov theorem approach, we evolve M = 10,000 trajectories from 500 initial points in the interval [−3, 3] with a time step of h = 0.005 by the Euler–Maruyama scheme. The reference solution uses the iteration method of [41], where integration of Equations (21a) and (21b) is also computed as a numerical average of Monte Carlo sampled trajectories, using 5000 initial points from the interval [−6, 6]. Ten iterations in total are performed. Final distributions are normalized and smoothed by convolution with a box filter. We use t_ι = 0, t_f = 0.2 and μ = β = 1. Assigned boundary conditions (shaded blue area in the first and final panels) are given by (17a) with U_ι(q) = (1/4)(q − 1)⁴ and (17b) with U_f(q) = (1/4)(q² − 1)².
Figure 3. Solution of a Fokker–Planck equation for a non-linear mechanical underdamped diffusion, computed by Monte Carlo integration. Panels (a–f) show the following: center, the joint distribution of momentum and position; top, the marginal distribution of the momentum; left, the marginal distribution of the position. The optimal protocol U used in the integration and the reference solutions for the marginal densities (orange) are estimated from a perturbative expansion around the overdamped limit [25]. For the integration, we evolve M = 10,000 trajectories from a set of 2601 equally spaced points in [−5, 5] × [−5, 5]. We use a time step size h = 0.025 and integrate over trajectories of (23) using an Euler–Maruyama discretization. We use t_ι = 0, t_f = 5, β = 25 and τ = m = 1. The assigned initial condition is given by (33a) with U_ι(q) = (1/4)(q − 1)⁴ and the final condition by (33b) with U_f(q) = (1/4)(q² − 1)², indicated by contour lines in the initial and final panels, respectively.
Figure 4. Weighted errors and variances at time t = 5 (panel (f)) for the example in Figure 3, as a function of the number M of sampled trajectories of the SDE (22) with a fixed time step. The errors and variances are computed over 100 sample points x_i = (p, q) ∈ [−3, 3] × [−3, 3]. In panel (a), the output of Algorithm 2 at each point x_i is compared to the assigned final distribution (33b). The L¹ error (blue) is computed as Σ_{x_i} P_f(x_i) |P̂_f^(M)(x_i) − P_f(x_i)|; the L² error (orange) is (Σ_{x_i} P_f(x_i) |P̂_f^(M)(x_i) − P_f(x_i)|²)^{1/2}; and the L^∞ error (green) is max_{x_i} |P̂_f^(M)(x_i) − P_f(x_i)|, where P̂_f^(M)(x_i) in all cases denotes the value found using M sample trajectories in Algorithm 2. Panel (b) shows the largest (max, blue line) variance across all sample points as a function of the number M of sampled trajectories. All other parameters are as in Figure 3.
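The weighted norms in panel (a) are straightforward to compute once the estimated and reference densities are tabulated on the sample points. A small Python sketch (function and variable names are ours, not from the accompanying code):

```python
def weighted_errors(p_hat, p_ref):
    """Weighted L1, weighted L2 and (unweighted) sup-norm discrepancies
    between a density estimate p_hat and a reference density p_ref, both
    given as dicts mapping sample point -> density value, mirroring the
    norms used in panel (a) of Figure 4."""
    l1 = sum(p_ref[x] * abs(p_hat[x] - p_ref[x]) for x in p_ref)
    l2 = sum(p_ref[x] * abs(p_hat[x] - p_ref[x]) ** 2 for x in p_ref) ** 0.5
    linf = max(abs(p_hat[x] - p_ref[x]) for x in p_ref)
    return l1, l2, linf
```

Weighting by the reference density P_f emphasizes errors where the target distribution carries mass, which is why the L¹ and L² curves can decrease even while pointwise tails remain noisy.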
Figure 5. Gradient of the value function, i.e., the gradient of the solution to the Hamilton–Jacobi–Bellman Equation (18b), computed using the Bismut–Elworthy–Li (BEL) formula (dashed blue line) as described in Algorithm 3. We sample 10,000 trajectories of the stochastic process (3) from 500 initial points in the interval [−3, 3], discretized by the Euler–Maruyama scheme with time step size h = 0.005, and compute the BEL weights along the trajectories. The optimal control protocol U and the reference solution (orange) are computed by the iteration as in Figure 2. Numerical parameters and boundary conditions are as in Figure 2. We use t_ι = 0, t_f = 0.2 and μ = β = 1.
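In its simplest scalar form with constant diffusion, the Bismut–Elworthy–Li identity reads ∂_x E[f(X_T^x)] = E[ f(X_T^x) · (1/T) ∫_0^T (Y_s/σ) dW_s ], where Y is the first-variation process solving dY = b′(X) Y dt with Y_0 = 1. The following is a minimal Python sketch of the corresponding Monte Carlo estimator; it is not the paper's Algorithm 3 (which treats the controlled and underdamped cases), and all names are illustrative.

```python
import math
import random

def bel_gradient(f, x, b, db, T, n_steps, n_paths, sigma=1.0, seed=0):
    """Monte Carlo estimate of (d/dx) E[f(X_T^x)] for dX = b(X) dt + sigma dW,
    using the Bismut-Elworthy-Li weight (1/T) * int_0^T (Y_s / sigma) dW_s,
    where Y is the first-variation process: dY = b'(X) Y dt, Y_0 = 1."""
    rng = random.Random(seed)
    h = T / n_steps
    acc = 0.0
    for _ in range(n_paths):
        X, Y, mart = x, 1.0, 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(h))
            mart += (Y / sigma) * dw  # non-anticipating Ito sum for the weight
            # Euler-Maruyama step for X and the tangent process Y
            # (tuple assignment so the pre-step X is used in both updates)
            X, Y = X + b(X) * h + sigma * dw, Y + db(X) * Y * h
        acc += f(X) * mart / T
    return acc / n_paths
```

For an Ornstein–Uhlenbeck drift b(x) = −x and f the identity, E[X_T^x] = x e^{−T}, so the estimator should return approximately e^{−T}; no finite-difference perturbation of the initial point is needed, which is the practical appeal of the formula.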
Figure 6. The gradient of the optimal control potential minimizing the Kullback–Leibler divergence (16) in the underdamped dynamics. We compute the stationarity condition (31) at t = 0 from the gradient of the solution of the Hamilton–Jacobi–Bellman Equations (34a) and (34b), obtained using the Bismut–Elworthy–Li formula (Monte Carlo w. BEL, blue line) in Algorithm 4. The optimal control protocol U and the terminal condition φ of (34a) and (34b) are found by numerical integration of the system of equations described in Section IV of [25], using a fourth-order collocation method from the DifferentialEquations.jl library [61]. We use Gaussian boundary conditions: the initial and final position and momentum means are zero; the initial and final cross-correlations are zero; the initial variances are 1; the final position variance is 1.7; and the final momentum variance is 1. We sample 10,000 independent trajectories of the stochastic process (22) started from 500 sample points in [−5, 5] × [−5, 5], using an Euler–Maruyama discretization with time step h = 0.01. We use t_ι = 0, t_f = 1 and β = τ = m = 1.
Figure 7. Solution of the optimal control problem minimizing the Kullback–Leibler divergence from a free diffusion in a fixed time interval in the overdamped dynamics. The gradient of the control protocol is parametrized by a neural network and trained using the process described in Algorithm 5. Panel (a) shows the final boundary condition obtained by integrating the Fokker–Planck Equation (18a) with the trained neural network as the drift in Algorithm 1 (blue), against the assigned final boundary condition (orange). Panels (b–g) show the output of the neural network (blue) after training to estimate the gradient of the optimal control protocol, against a reference solution [41] (orange). We use the assigned boundary conditions as in Figure 2, with β = μ = 1, t_ι = 0 and t_f = 0.2. The gradient of the optimal control protocol is parametrized by a fully connected feed-forward neural network with an input layer of four neurons, one hidden layer of ten neurons and one output layer. The swish activation function (x ↦ x σ(x), with σ the logistic sigmoid) is used between the input and hidden layers, and between the hidden and output layers. Weights and biases are initialized using Glorot normal initialization, with Glorot uniform initialization for the output layer. The Lagrange multiplier λ is approximated by fitting a polynomial of degree 6 and is initialized with all coefficients set to 0. At each iteration, 512 points are sampled uniformly from the interval [−3, 3]. The gradient of the value function is computed using Algorithm 3 with the neural network as the drift, with 10 independent simulated trajectories of the associated SDE using an Euler–Maruyama discretization and time step 0.005. The final probability density is computed using Algorithm 1 with the neural network as the drift, with 100 independent Monte Carlo trajectories from each sample point using an Euler–Maruyama discretization and time step 0.005. The neural network is trained in four phases, as follows. The first phase consists of 20 full iterations of Algorithm 5 with 100 updates to the parameters Θ per iteration, using stochastic gradient descent with learning rate γ₂ = 10⁻³. At each iteration, the Lagrange multiplier λ is recomputed using (69) with γ₁ = 0.1. In the second phase, we make 20 full iterations with 100 updates to Θ per iteration according to (70), using stochastic gradient descent with learning rate γ₂ = 10⁻⁴; the Lagrange multiplier is recomputed once at each iteration using γ₁ = 10⁻². In the third phase, we make 20 full iterations with 400 updates to Θ per iteration, using stochastic gradient descent with learning rate γ₂ = 10⁻⁵. In the fourth phase, we make 20 full iterations with 400 update steps to Θ per iteration, using the ADAM optimizer [65] with γ₂ = 10⁻⁴. The code is written in the Julia programming language, using in particular the Flux.jl [66,67] and Polynomials.jl libraries.
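The four-phase schedule alternates gradient steps on the network parameters with updates of the Lagrange multiplier, in the spirit of the constrained differential optimization of Platt and Barr [64]. A toy scalar Python sketch of this min-max structure (our notation; the paper's updates (69) and (70) act on network weights and a polynomial multiplier rather than on a single scalar):

```python
def constrained_descent(grad_f, g, grad_g, theta, lam, gamma1, gamma2, iters):
    """Minimize f(theta) subject to g(theta) = 0 by gradient descent on the
    Lagrangian L = f + lam * g in theta, combined with gradient ascent in
    the multiplier lam (basic differential multiplier method)."""
    for _ in range(iters):
        theta -= gamma2 * (grad_f(theta) + lam * grad_g(theta))
        lam += gamma1 * g(theta)  # ascent step enforcing the constraint
    return theta, lam
```

For f(θ) = θ² with constraint g(θ) = θ − 1, the iterates spiral into the constrained minimizer θ = 1 with multiplier λ = −2, where the stationarity condition ∇f + λ∇g = 0 holds; the decreasing learning rates in the later training phases above play the same stabilizing role as small γ₁, γ₂ here.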

