Article

Stochastic Proximal Gradient Algorithms for Multi-Source Quantitative Photoacoustic Tomography

1
Department of Mathematics, University of Innsbruck, Technikerstraße 13, A-6020 Innsbruck, Austria
2
Institute of Basic Sciences in Engineering Science, University of Innsbruck, Technikerstraße 13, A-6020 Innsbruck, Austria
*
Author to whom correspondence should be addressed.
Entropy 2018, 20(2), 121; https://doi.org/10.3390/e20020121
Submission received: 13 December 2017 / Revised: 22 January 2018 / Accepted: 4 February 2018 / Published: 11 February 2018
(This article belongs to the Special Issue Probabilistic Methods for Inverse Problems)

Abstract

The development of accurate and efficient image reconstruction algorithms is a central aspect of quantitative photoacoustic tomography (QPAT). In this paper, we address this issue for multi-source QPAT using the radiative transfer equation (RTE) as an accurate model for light transport. The tissue parameters are jointly reconstructed from the acoustical data measured for each of the applied sources. We develop stochastic proximal gradient methods for multi-source QPAT, which are more efficient than standard proximal gradient methods, in which a single iterative update has complexity proportional to the number of applied sources. Additionally, we introduce a completely new formulation of QPAT as a multilinear (MULL) inverse problem, which avoids explicitly solving the RTE. The MULL formulation of QPAT is again addressed with stochastic proximal gradient methods. Numerical results for both approaches are presented. Besides the introduction of stochastic proximal gradient algorithms to QPAT, we consider the new MULL formulation of QPAT as the main contribution of this paper.

1. Introduction

Photoacoustic tomography (PAT) is an emerging imaging modality, which combines the benefits of pure ultrasound imaging (high resolution) with those of pure optical tomography (high contrast); see [1,2]. The basic principle of PAT is as follows (see Figure 1): A semitransparent sample such as a part of a human patient is illuminated with short pulses of optical radiation. A fraction of the optical energy is absorbed inside the sample, which causes thermal heating, expansion, and a subsequent acoustic pressure wave depending on the interior absorbing structure of the sample. The acoustic pressure is measured outside of the sample and used to reconstruct an image of the interior.
One important reconstruction problem in PAT is recovering the initial pressure distribution (see, for example, [3,4,5,6,7,8,9,10]). The initial pressure distribution only provides qualitative information about the tissue-relevant parameters, as it is the product of the optical absorption coefficient and the spatially varying optical intensity, which again indirectly depends on the tissue parameters. Quantitative photoacoustic tomography (QPAT) addresses this issue and aims at quantitatively estimating the tissue parameters by supplementing the inversion of the acoustic wave equation with an inverse problem for light propagation (see, for example, [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]).

1.1. Multi-Source QPAT

In this paper, we consider image reconstruction in QPAT using multiple sources. We allow limited view measurements, where, for each illumination, partial data are collected only from a certain angular domain. For modeling the light transport, we use the radiative transfer equation (RTE), which is commonly considered a very accurate model for light transport in tissue (see, for example, [27,28,29,30]). In particular, as opposed to the diffusion approximation, the RTE allows for modeling directed optical radiation, which is required for a reasonable QPAT forward model. Additionally, it allows for including internal voids as regions of low scattering. As proposed in [18], we work with a single-stage reconstruction procedure for QPAT, where the optical parameters are reconstructed directly from the measured acoustical data. The image reconstruction problem of multi-source QPAT using $N$ different sources can be formulated as a system of nonlinear equations (see, for example, [18,31])
$$\mathbf{F}_i(\mu) = v_i \quad \text{for } i = 1, \dots, N. \tag{1}$$
Here, $\mathbf{F}_i$ is the operator that maps the unknown parameter pair $\mu = (\mu_a, \mu_s)$, consisting of the absorption coefficient $\mu_a \colon \Omega \to \mathbb{R}$ and the scattering coefficient $\mu_s \colon \Omega \to \mathbb{R}$, to the measured acoustic data $v_i$ corresponding to the $i$-th source distribution (see Section 2 for precise definitions). There are two main classes of methods for solving the nonlinear inverse problem (1), namely Tikhonov-type regularization on the one hand and iterative regularization methods on the other [32,33,34]. Both approaches are based on rewriting (1) as a single equation $\mathbf{F}(\mu) = v$ with forward operator $\mathbf{F} = (\mathbf{F}_i)_{i=1}^N$ and data $v = (v_i)_{i=1}^N$. In Tikhonov regularization, one defines approximate solutions as minimizers of the penalized least squares functional $\frac{1}{2}\|\mathbf{F}(\mu) - v\|^2 + \lambda R(\mu)$. Here, $R(\cdot)$ is an appropriate regularization functional included to stabilize the inversion process and $\lambda$ a regularization parameter that has to be carefully chosen depending on the data and the noise. In iterative regularization methods, stabilization is achieved via early stopping of iterative schemes. In such a situation, one usually applies iterative optimization techniques designed for minimizing the un-regularized least squares functional $\frac{1}{2}\|\mathbf{F}(\mu) - v\|^2$, and the iteration index plays the role of the regularization parameter.
Tikhonov type as well as iterative regularization methods can both be formulated as finding a solution of the optimization problem
$$\min_{\mu} \ \frac{1}{2} \sum_{i=1}^N \|\mathbf{F}_i(\mu) - v_i\|^2 + G(\mu), \quad \text{with } \mu \in L^2(\Omega) \times L^2(\Omega). \tag{2}$$
In iterative regularization methods, one takes $G = \chi_{\mathbb{D}}$, the characteristic function of the domain of definition $\mathbb{D}$ of the forward operator (taking the value $0$ in $\mathbb{D}$ and the value $\infty$ outside of $\mathbb{D}$). In Tikhonov regularization, we take $G = \chi_{\mathbb{D}} + \lambda R$. Well-established algorithms for solving Equation (2) are proximal gradient algorithms [35,36], which can be written in the form
$$\mu_{k+1} = \operatorname{prox}_{s_k G}\Big(\mu_k - s_k \sum_{i=1}^N \mathbf{F}_i'(\mu_k)^* \big(\mathbf{F}_i(\mu_k) - v_i\big)\Big). \tag{3}$$
Here, $\operatorname{prox}_{s_k G}$ is the proximity operator and $s_k$ the positive step size; $\mathbf{F}_i'(\mu_k)$ denotes the derivative of the $i$-th forward operator evaluated at $\mu_k$, with $\mathbf{F}_i'(\mu_k)^*$ being its Hilbert space adjoint.

1.2. Stochastic Proximal Gradient Algorithms

Each iteration in the proximal gradient algorithm (3) can be numerically quite expensive, since it requires solving the forward and adjoint problems for all N equations in (1). In many cases, stochastic (proximal) gradient methods turn out to be more efficient since these methods only consider one of the equations in (1) per iteration. The stochastic proximal gradient method (see, for example, [37,38,39,40,41,42] and the references therein) for solving (2) is defined by
$$\mu_{k+1} = \operatorname{prox}_{s_k G}\Big(\mu_k - s_k \mathbf{F}_{i(k)}'(\mu_k)^* \big(\mathbf{F}_{i(k)}(\mu_k) - v_{i(k)}\big)\Big),$$
where $i(k) \in \{1, \dots, N\}$ corresponds to one of the equations in (1) that is selected randomly for the update in the $k$-th iteration. In contrast to the standard proximal gradient method, this requires solving only one forward and one adjoint problem per iteration. Therefore, one iterative step is much cheaper for the stochastic gradient method than for the full gradient method. In the case of no regularization, $\lambda = 0$, the stochastic proximal gradient method reduces to the Kaczmarz method for inverse problems studied in [43,44,45].
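The difference in per-iteration cost between the full update (3) and the stochastic update can be illustrated on a toy problem. The following is a minimal sketch, not the QPAT implementation: small random matrices (hypothetical stand-ins named `ops`) replace the nonlinear operators $\mathbf{F}_i$, the data are noise free, $\lambda = 0$, and $G$ is the indicator of a box, so the proximity operator is a simple projection. All function names and the step-size choices are assumptions of this sketch.

```python
import numpy as np

def prox_box(x, upper):
    # Proximity operator of the indicator of the box [0, upper]^n,
    # i.e. the orthogonal projection onto the box (the case lambda = 0).
    return np.clip(x, 0.0, upper)

def full_step(mu, ops, data, step, upper):
    # One full proximal gradient update: touches all N equations.
    grad = sum(A.T @ (A @ mu - v) for A, v in zip(ops, data))
    return prox_box(mu - step * grad, upper)

def stochastic_step(mu, ops, data, step, upper, rng):
    # One stochastic update: touches only the randomly drawn equation i(k).
    i = rng.integers(len(ops))
    grad = ops[i].T @ (ops[i] @ mu - data[i])
    return prox_box(mu - step * grad, upper)

rng = np.random.default_rng(0)
n, m, N = 8, 6, 5
mu_true = rng.uniform(0.2, 0.8, size=n)
# Toy linear operators standing in for the nonlinear F_i (assumption).
ops = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(N)]
data = [A @ mu_true for A in ops]  # noise-free toy data

# Conservative constant step sizes derived from operator norms (assumption).
step_full = 1.0 / np.linalg.norm(np.vstack(ops), 2) ** 2
step_stoch = 1.0 / max(np.linalg.norm(A, 2) for A in ops) ** 2

mu_full = np.full(n, 0.5)
mu_stoch = np.full(n, 0.5)
for k in range(5000):
    mu_full = full_step(mu_full, ops, data, step_full, 1.0)
    mu_stoch = stochastic_step(mu_stoch, ops, data, step_stoch, 1.0, rng)

err_full = np.linalg.norm(mu_full - mu_true)
err_stoch = np.linalg.norm(mu_stoch - mu_true)
```

In this sketch each call to `full_step` applies all $N$ operators and their transposes, whereas `stochastic_step` applies only one pair, which mirrors the cost argument made above. A constant step works here only because the toy data are consistent; with noisy data the step size would have to decrease.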
The computationally most expensive task in the above methods is the numerical solution of the RTE. In this paper, we therefore additionally study a reformulation of the inverse problem of QPAT that avoids computing a solution of the RTE. For this purpose, the inverse problem is reformulated as a multilinear inverse problem (35), where the RTE is added as a constraint instead of explicitly including its solution. The new formulation is again addressed by Tikhonov regularization in combination with proximal stochastic gradient methods, as discussed in Section 4.
Note that, in QPAT, it has often been assumed that the initial pressure distribution (corresponding to each illumination) is already recovered from acoustic measurements (see, for example, [13,16,23,25,26,46,47,48,49]). Research was focused on inverting the light propagation in tissues modeled either by the RTE or by the diffusion approximation. If acoustic measurements are only known on parts of the boundary, reconstruction of the initial pressure distribution is not possible in a stable manner. To obtain stable reconstruction results, we proposed in [18] a single-stage approach for QPAT, where the optical parameters are directly recovered from the acoustic boundary data. Throughout this paper, we make use of this approach, which delivers stable results especially in the limited view situation. In contrast to [18], in this paper we introduce (proximal) stochastic gradient methods, which effectively exploit the multi-illumination structure and turn out to be faster than the standard proximal gradient methods.

1.3. Outline

The remainder of this paper is organized as follows. In Section 2, we provide the mathematical model for QPAT (the forward problem) using the RTE. We allow multiple sources and partial acoustic measurements. We also recall known results for QPAT including differentiability of the forward problem. In Section 3, we address the inverse problem of QPAT using Tikhonov regularization and study the proximal stochastic gradient method for its solution. The new reformulation of the inverse problem of QPAT as a multilinear inverse problem is presented in Section 4. For the solution of the proposed formulation, we again develop proximal gradient methods. Numerical results are presented in Section 5. The paper is concluded with a summary and outlook presented in Section 6.

2. The Forward Problem in QPAT

The image reconstruction problem of QPAT can be written as the system (1) of nonlinear equations, where the forward operators $\mathbf{F}_i$ map tissue relevant parameters to acoustic data sets recorded in specific regions outside the tissue. Precise formulations will be given in this section.

2.1. Mathematical Notation

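We fix some mathematical notation that is used throughout this paper. We denote by $\Omega \subset \mathbb{R}^d$ a convex domain with piecewise smooth boundary modeling our domain of interest, with $d \in \{2, 3\}$ denoting the spatial dimension. In order to be able to impose appropriate boundary conditions for the RTE, it is convenient to split the set $\Gamma := \partial\Omega \times S^{d-1}$ into inflow and outflow boundaries,
$$\Gamma_- := \{ (x, \theta) \in \partial\Omega \times S^{d-1} \mid \nu(x) \cdot \theta \le 0 \}, \qquad \Gamma_+ := \{ (x, \theta) \in \partial\Omega \times S^{d-1} \mid \nu(x) \cdot \theta > 0 \},$$
with $\nu(x)$ denoting the outward pointing unit normal at $x \in \partial\Omega$ and $x \cdot y$ the standard inner product in $\mathbb{R}^d$. We write $B_R = \{ x \in \mathbb{R}^d \mid \|x\| < R \}$ for the ball of radius $R$ centered at the origin and suppose $\Omega \subset B_R$.
By $L^2(\Omega)$ and $L^2(\Omega \times S^{d-1})$, we denote the Hilbert spaces of square integrable functions on $\Omega$ and $\Omega \times S^{d-1}$, respectively. By $L^2(\Gamma_-, |\nu \cdot \theta|)$, we denote the space of all $q_o \colon \Gamma_- \to \mathbb{R}$ for which $\|q_o\|^2_{L^2(\Gamma_-, |\nu \cdot \theta|)} := \int_{\Gamma_-} |q_o(x, \theta)|^2 \, |\nu \cdot \theta| \, \mathrm{d}(x, \theta)$ is finite. We further write
$$\|\Phi\|_W^2 := \|\Phi\|^2_{L^2(\Omega \times S^{d-1})} + \|\theta \cdot \nabla_x \Phi\|^2_{L^2(\Omega \times S^{d-1})} + \big\|\Phi|_{\Gamma_-}\big\|^2_{L^2(\Gamma_-, |\nu \cdot \theta|)}, \qquad \|v\|_Y^2 := \int_0^\infty \int_{\partial B_R} |v(x, t)|^2 \, t \, \mathrm{d}x \, \mathrm{d}t,$$
and define
$$\begin{aligned} Q &:= L^2(\Omega \times S^{d-1}) \times L^2(\Gamma_-, |\nu \cdot \theta|), \\ X &:= L^2(\Omega) \times L^2(\Omega), \\ W &:= \{ \Phi \colon \Omega \times S^{d-1} \to \mathbb{R} \mid \|\Phi\|_W < \infty \}, \\ Y &:= \{ v \colon \partial B_R \times (0, \infty) \to \mathbb{R} \mid \|v\|_Y < \infty \}. \end{aligned}$$
The inner products in $Q$, $X$, $W$, $Y$ will be denoted by $\langle \cdot, \cdot \rangle_Q$, $\langle \cdot, \cdot \rangle_X$, $\langle \cdot, \cdot \rangle_W$, $\langle \cdot, \cdot \rangle_Y$, respectively. The subspace of all $\Phi \in W$ with $\Phi|_{\Gamma_-} = 0$ will be denoted by $W_0$.
Elements in $X$ will be written in the form $\mu = (\mu_a, \mu_s)$ and are the parameters we aim to determine. They are actually required to be contained in the convex subset
$$D(\mathbf{T}) := \{ \mu \in X \mid 0 \le \mu_a \le \bar\mu_a, \ 0 \le \mu_s \le \bar\mu_s \},$$
where $\bar\mu_a, \bar\mu_s > 0$. Elements in $Q$ will be written in the form $q = (q_o, q_i)$ and model the optical sources. Elements in $W$ describe the optical radiation, and elements in $Y$ the measured acoustic data.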

2.2. The Radiative Transfer Equation

To specify the forward operators, we require mathematical models for the light propagation, the conversion of optical into acoustic energy, and the propagation of the acoustic waves. These models will be presented in the rest of this section.
We model the optical radiation by a function $\Phi \colon \Omega \times S^{d-1} \to \mathbb{R}$, where $\Phi(x, \theta)$ is the density of photons at position $x \in \Omega$ propagating in direction $\theta \in S^{d-1}$. The interaction of the photons with the background is described by the absorption coefficient $\mu_a \colon \Omega \to \mathbb{R}$, the scattering coefficient $\mu_s \colon \Omega \to \mathbb{R}$, and the scattering operator $\mathbf{K} \colon \Phi \mapsto \mathbf{K}\Phi$, taking the form (see [27,30])
$$\forall (x, \theta) \in \Omega \times S^{d-1} \colon \quad (\mathbf{K}\Phi)(x, \theta) = \int_{S^{d-1}} k(\theta, \theta') \, \Phi(x, \theta') \, \mathrm{d}\theta',$$
with scattering kernel $k \colon S^{d-1} \times S^{d-1} \to \mathbb{R}$. The absorption coefficient describes the ability of the background to absorb photons and the scattering coefficient describes the amount of photon scattering. The scattering kernel $k(\theta, \theta')$ describes the redistribution of velocity directions due to interaction of the photons with the background. From physical considerations, it is natural to assume $k$ to be measurable, symmetric, nonnegative, and to satisfy $\int_{S^{d-1}} k(\cdot, \theta') \, \mathrm{d}\theta' = 1$. In this article, we are concerned with the situation when the kernel is known a priori.
The photon density $\Phi(x, \theta)$ is supposed to satisfy the stationary radiative transfer equation (RTE),
$$\big( \theta \cdot \nabla_x + \mu_a + \mu_s (\mathbf{I} - \mathbf{K}) \big) \Phi(x, \theta) = q_i(x, \theta) \quad \text{for } (x, \theta) \in \Omega \times S^{d-1} \tag{6}$$
with boundary conditions
$$\Phi|_{\Gamma_-}(x, \theta) = q_o(x, \theta) \quad \text{for } (x, \theta) \in \Gamma_-.$$
Here, $q_i \colon \Omega \times S^{d-1} \to \mathbb{R}$ denotes an internal photon source and $q_o \colon \Gamma_- \to \mathbb{R}$ a prescribed boundary source pattern. Note that PAT uses very short light pulses (below microseconds) and that light propagation happens on time scales much shorter than the scale of acoustic wave propagation. This justifies the use of the stationary case for the RTE; see [12] for a more complete discussion.
Theorem 1 (Well-posedness of the RTE).
For every $\mu \in D(\mathbf{T})$ and $q \in Q$, the stationary RTE (6) admits a unique solution $\Phi \in W$. Moreover, there exists a constant $C$ only depending on the parameters $\bar\mu_a, \bar\mu_s > 0$ (defining the domain $D(\mathbf{T})$), such that
$$\|\Phi\|_W \le C \big( \|q_i\|_{L^2} + \|q_o\|_{L^2} \big).$$
Proof.
See [50]. ☐
Definition 1 (Solution operator for the RTE).
The solution operator for the RTE is defined by
$$\mathbf{T} \colon Q \times D(\mathbf{T}) \to W \colon (q, \mu) \mapsto \mathbf{T}(q, \mu) := \Phi,$$
where $\Phi$ denotes the unique solution of (6).
Theorem 1 guarantees that the operator $\mathbf{T}$, mapping $(q, \mu) \in Q \times D(\mathbf{T})$ to the solution of the RTE, is well defined. Note that in the actual application $q = (q_i, q_o) \in Q$ are prescribed sources, and $\mu = (\mu_a, \mu_s) \in D(\mathbf{T})$ are the unknown parameters to be recovered.

2.3. Heating Operator

Due to the spatially varying absorption of photons, the tissue is locally heated and emits an acoustic pressure wave. The acoustic source is proportional to the amount of absorbed photons, the light intensity and the so-called Grüneisen parameter $\gamma$ describing the efficiency of conversion of optical to acoustical energy. We assume $\gamma$ to be constant and, after appropriate re-scaling, we take $\gamma = 1$; for more details about the Grüneisen parameter, we refer to [17]. Therefore, the conversion of the optical energy into an acoustic pressure wave is described by the heating operator defined as follows.
Definition 2 (Heating operator).
The heating operator is defined by
$$\mathbf{H} \colon Q \times D(\mathbf{T}) \to L^2(\Omega) \colon (q, \mu) \mapsto \mu_a \int_{S^{d-1}} \mathbf{T}(q, \mu)(\cdot, \theta) \, \mathrm{d}\theta.$$
If one introduces the averaging operator $\mathbf{A} \colon W \to L^2(\Omega)$ defined by $\mathbf{A}\Phi = \int_{S^{d-1}} \Phi(\cdot, \theta) \, \mathrm{d}\theta$, one may write the heating operator in the form
$$\mathbf{H}(q, \mu) = \mu_a \, \mathbf{A}\mathbf{T}(q, \mu) \quad \text{for } (q, \mu) \in Q \times D(\mathbf{T}).$$
Because $\mathbf{T}(q, \mu)$ models the photon density, $\mathbf{A}\mathbf{T}(q, \mu)$ actually models the total light intensity. The heating operator is therefore given by the product of the absorption coefficient and the light intensity. The averaging operator $\mathbf{A}$ is well defined and bounded, and therefore the heating operator is well defined as a mapping between $Q \times D(\mathbf{T})$ and $L^2(\Omega)$.
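In a discretized setting, the product structure of the heating operator becomes a quadrature over directions followed by a pointwise multiplication. The following minimal sketch assumes $d = 2$, an equal-weight quadrature rule on the unit circle, and a photon density stored as an array of shape (number of points, number of directions); the function names `averaging` and `heating` are hypothetical, and the photon density is simply prescribed rather than computed from the RTE.

```python
import numpy as np

def averaging(phi, weights):
    # Discrete averaging operator: quadrature of phi over the directions.
    # phi has shape (n_points, n_dirs), weights has shape (n_dirs,).
    return phi @ weights

def heating(mu_a, phi, weights):
    # Heating = absorption coefficient times total light intensity.
    return mu_a * averaging(phi, weights)

n_dirs = 16
# Equal weights summing to the length of the unit circle (d = 2, assumption).
weights = np.full(n_dirs, 2.0 * np.pi / n_dirs)
phi = np.ones((4, n_dirs))              # isotropic toy photon density
mu_a = np.array([0.0, 0.1, 0.2, 0.3])   # toy absorption values
h = heating(mu_a, phi, weights)
```

For the isotropic toy density the light intensity equals $2\pi$ at every point, so `h` is just $2\pi \mu_a$, which makes the multiplicative dependence on the absorption coefficient explicit.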

2.4. The Wave Equation

The local heating causes an acoustic pressure wave, where the initial pressure distribution $p_0$ is proportional to a fraction of the absorbed energy. Assuming constant speed of sound and after rescaling, the induced acoustic pressure $p \colon \mathbb{R}^d \times (0, \infty) \to \mathbb{R}$ satisfies the free-space wave equation:
$$\begin{aligned} (\partial_t^2 - \Delta) p(x, t) &= 0 && \text{for } (x, t) \in \mathbb{R}^d \times (0, \infty), \\ p(x, 0) &= p_0(x) && \text{for } x \in \mathbb{R}^d, \\ \partial_t p(x, 0) &= 0 && \text{for } x \in \mathbb{R}^d. \end{aligned} \tag{10}$$
Here, the function $p_0$ vanishes outside $B_R$, the ball of radius $R$, and acoustic data are collected on a subset of $\partial B_R \times (0, \infty)$ that we denote by $\Lambda \times (0, \infty)$. Recall that coupling of the RTE and the wave equation happens in such a way that the result of the heating operator $\mathbf{H}(q, \mu)$ acts as initial sound source $p_0$ depending on tissue parameters; see Definition 4. Standard existence and uniqueness theory for hyperbolic equations guarantees that, for any $p_0 \in H^1$, (10) has a unique solution $p \in H^1$, which continuously depends on $p_0$. Taking the trace results in loss of regularity by degree $1/2$. Therefore, $p_0 \mapsto p|_{\partial B_R \times (0, \infty)}$ is continuous between $H^1$ and $H^{1/2}$. The following lemma implies the much stronger result that $p_0 \mapsto p|_{\partial B_R \times (0, \infty)}$ is actually an $L^2$-isometry, up to a constant factor.
Lemma 1.
Let $p_0 \in C^\infty(\mathbb{R}^d)$ have support in $B_R$ and let $p$ denote the solution of (10). Then,
$$\int_{\partial B_R} \int_0^\infty |p(x, t)|^2 \, t \, \mathrm{d}t \, \mathrm{d}x = \frac{R}{2} \int_{B_R} |p_0(x)|^2 \, \mathrm{d}x.$$
Proof.
See [51] for $d$ odd and [52] for $d$ even. ☐
Definition 3.
We define the solution operator with full boundary data for the wave Equation (10) by
$$\mathbf{U} \colon C^\infty(B_R) \subset L^2(B_R) \to Y \colon p_0 \mapsto p|_{\partial B_R \times (0, \infty)},$$
where $p$ denotes the solution of (10).
According to Lemma 1, the operator $\mathbf{U}$ can be uniquely extended to a bounded linear operator defined on $L^2(B_R)$, denoted again by $\mathbf{U} \colon L^2(B_R) \to Y$. The partial acoustic measurements made on $\Lambda \subset \partial B_R$ are then modeled by $\chi_{\Lambda \times (0, \infty)} \mathbf{U} p_0$.

2.5. Analysis of the Forward Problem in Multi-Source QPAT

We assume that we perform $N$ individual experiments, where each experiment consists of separate optical sources and separate acoustic measurements. For the $i$-th experiment, we denote the source term by $q_i \in Q$ and assume the acoustic measurements are made on $\Lambda_i \times (0, T_i) \subset \partial B_R \times (0, \infty)$.
Definition 4.
For any $i \in \{1, \dots, N\}$, we denote
$$\begin{aligned} \mathbf{T}_i &\colon D(\mathbf{T}) \to W \colon \mu \mapsto \mathbf{T}(q_i, \mu), \\ \mathbf{H}_i &\colon D(\mathbf{T}) \to L^2(\Omega) \colon \mu \mapsto \mu_a \, (\mathbf{A} \circ \mathbf{T}_i)(\mu), \\ \mathbf{U}_i &\colon L^2(\Omega) \to Y \colon p_0 \mapsto \chi_{\Lambda_i \times (0, T_i)} \, \mathbf{U} p_0, \\ \mathbf{F}_i &\colon D(\mathbf{T}) \to Y \colon \mu \mapsto (\mathbf{U}_i \circ \mathbf{H}_i)(\mu). \end{aligned}$$
Here, $\mathbf{T}_i$ denotes the $i$-th solution operator for the RTE, $\mathbf{H}_i$ the $i$-th heating operator, $\mathbf{U}_i$ the $i$-th partial solution operator for the wave equation, and $\mathbf{F}_i$ the $i$-th forward operator.
Recall that $\mathbf{T}$ stands for the solution operator for the RTE (6) given in Definition 1, $\mathbf{A}\Phi = \int_{S^{d-1}} \Phi(\cdot, \theta) \, \mathrm{d}\theta$ is the averaging operator, and $\mathbf{U}$ the solution operator for the wave Equation (10); see Definition 3. The operator $\mathbf{T}_i$ models the photon transport, and its solution (via the heating operator) acts as input for the solution of the wave equation and thereby couples the optical with the acoustical part.
Next, we recall continuity and differentiability of the forward operators. For that purpose, we call $h \in X$ a feasible direction at $\mu \in D(\mathbf{T})$ if there exists some $\epsilon > 0$ with $\mu + \epsilon h \in D(\mathbf{T})$.
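Theorem 2 (Continuity and Differentiability).
(1) The operators $\mathbf{T}_i$, $\mathbf{F}_i$ and $\mathbf{H}_i$ are sequentially continuous and Lipschitz-continuous.
(2) For every $\mu \in D(\mathbf{T})$, the one-sided directional derivatives $\mathbf{T}_i'(\mu)(h)$, $\mathbf{F}_i'(\mu)(h)$ of $\mathbf{T}_i$, $\mathbf{F}_i$ at $\mu$ in any feasible direction $h$ exist, and are given by
$$\mathbf{T}_i'(\mu)(h) = \mathbf{T}\big( (0, \, -(h_a + h_s - h_s \mathbf{K}) \mathbf{T}_i(\mu)), \, \mu \big), \tag{13}$$
$$\mathbf{F}_i'(\mu)(h) = \mathbf{U}_i \big( h_a \, \mathbf{A}\mathbf{T}_i(\mu) + \mu_a \, \mathbf{A}(\mathbf{T}_i'(\mu)(h)) \big). \tag{14}$$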
Proof. 
See [18]. ☐
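Equations (13) and (14) define a bounded linear operator $\mathbf{F}_i'(\mu) \colon X \to Y$, which we call the derivative of $\mathbf{F}_i$ at $\mu \in D(\mathbf{T})$. Numerical minimization schemes actually require the adjoint of $\mathbf{F}_i'(\mu)$, which we compute next.
Theorem 3 (Adjoint of $\mathbf{F}_i'(\mu)$).
Let $i \in \{1, \dots, N\}$ and $\mu \in D(\mathbf{T})$. Furthermore, set $\Phi_i := \mathbf{T}_i(\mu)$ and let $\Phi_i^*$ denote the solution of the adjoint problem
$$\big( -\theta \cdot \nabla_x + \mu_a + \mu_s - \mu_s \mathbf{K} \big) \Phi_i^* = -\mathbf{A}^* \big( \mu_a \, (\mathbf{U}_i^* v) \big) \tag{15}$$
with $\Phi_i^*|_{\Gamma_+} = 0$. Then, $\mathbf{F}_i'(\mu)^* \colon Y \to X$ is given by
$$\mathbf{F}_i'(\mu)^* v = \begin{pmatrix} \mathbf{A}(\Phi_i^* \Phi_i) + (\mathbf{A}\Phi_i)(\mathbf{U}_i^* v) \\ \mathbf{A}(\mathbf{I} - \mathbf{K})(\Phi_i^* \Phi_i) \end{pmatrix}. \tag{16}$$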
Proof. 
See [18]. ☐
Given data $v_1, \dots, v_N \in Y$, most numerical schemes for QPAT use gradients of the partial data-fidelity terms $\mathcal{F}_I \colon D(\mathbf{T}) \to \mathbb{R}$ for $I \subset \{1, \dots, N\}$, where
$$\mathcal{F}_I(\mu) = \sum_{i \in I} \mathcal{F}_i(\mu) \quad \text{with} \quad \mathcal{F}_i(\mu) := \frac{1}{2} \|\mathbf{F}_i(\mu) - v_i\|_Y^2. \tag{17}$$
By the chain rule, the gradient of $\mathcal{F}_I$ is given by $\nabla\mathcal{F}_I(\mu) = \sum_{i \in I} \nabla\mathcal{F}_i(\mu)$ with $\nabla\mathcal{F}_i(\mu) = \mathbf{F}_i'(\mu)^* (\mathbf{F}_i(\mu) - v_i)$, where $\mathbf{F}_i'(\mu)^*$ can be computed by Theorem 3. Convergence of schemes such as the (stochastic) proximal gradient method considered in the following section requires the Lipschitz continuity of $\nabla\mathcal{F}_I$, which is shown in the following theorem.
Theorem 4 (Lipschitz continuity of $\nabla\mathcal{F}_I$).
For any data $v_1, \dots, v_N \in Y$ and any subset $I \subset \{1, \dots, N\}$, the map $\mu \mapsto \nabla\mathcal{F}_I(\mu)$ is Lipschitz-continuous.
Proof. 
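Without loss of generality, we assume $N = 1$, $I = \{1\}$ and write $v = v_1$, $\mathcal{F} = \mathcal{F}_{\{1\}}$, $\mathbf{T} = \mathbf{T}_1$, $\mathbf{U} = \mathbf{U}_1$, and $v(\mu) = \mathbf{F}_1(\mu) - v$. For any $\mu \in D(\mathbf{T})$, let $\mathbf{T}^*(\mu)$ denote the solution of (15) with $v(\mu)$ in place of $v$. Then, for any $\mu, \tilde\mu \in D(\mathbf{T})$,
$$\begin{aligned} \|\nabla\mathcal{F}(\mu) - \nabla\mathcal{F}(\tilde\mu)\|_X^2 ={} & \big\| \mathbf{A}(\mathbf{T}^*(\mu)\mathbf{T}(\mu)) + (\mathbf{A}\mathbf{T}(\mu))(\mathbf{U}^* v(\mu)) - \mathbf{A}(\mathbf{T}^*(\tilde\mu)\mathbf{T}(\tilde\mu)) - (\mathbf{A}\mathbf{T}(\tilde\mu))(\mathbf{U}^* v(\tilde\mu)) \big\|^2_{L^2(\Omega)} \\ & + \big\| \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\mu)\mathbf{T}(\mu)) - \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\tilde\mu)\mathbf{T}(\tilde\mu)) \big\|^2_{L^2(\Omega)}. \end{aligned} \tag{18}$$
For the second term in (18), we obtain
$$\begin{aligned} & \big\| \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\mu)\mathbf{T}(\mu)) - \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\mu)\mathbf{T}(\tilde\mu)) + \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\mu)\mathbf{T}(\tilde\mu)) - \mathbf{A}(\mathbf{I} - \mathbf{K})(\mathbf{T}^*(\tilde\mu)\mathbf{T}(\tilde\mu)) \big\|_{L^2(\Omega)} \\ & \quad = \big\| \mathbf{A}(\mathbf{I} - \mathbf{K})\big(\mathbf{T}^*(\mu)[\mathbf{T}(\mu) - \mathbf{T}(\tilde\mu)]\big) + \mathbf{A}(\mathbf{I} - \mathbf{K})\big(\mathbf{T}(\tilde\mu)[\mathbf{T}^*(\mu) - \mathbf{T}^*(\tilde\mu)]\big) \big\|_{L^2(\Omega)} \\ & \quad \le c_1 \|\mathbf{A}(\mathbf{I} - \mathbf{K})\| \, \|\mathbf{T}^*(\mu)\|_W \, L_{\mathbf{T}} \|\mu - \tilde\mu\|_X + c_1 \|\mathbf{A}(\mathbf{I} - \mathbf{K})\| \, \|\mathbf{T}(\tilde\mu)\|_W \, L_{\mathbf{T}^*} \|\mu - \tilde\mu\|_X \\ & \quad \le 2 c_1 \|\mathbf{A}(\mathbf{I} - \mathbf{K})\| \max\big( \|\mathbf{T}^*(\mu)\|_W, \|\mathbf{T}(\tilde\mu)\|_W \big) \max\big( L_{\mathbf{T}^*}, L_{\mathbf{T}} \big) \|\mu - \tilde\mu\|_X, \end{aligned}$$
where $L_{\mathbf{T}^*}$ and $L_{\mathbf{T}}$ denote the Lipschitz constants of $\mathbf{T}^*$ and $\mathbf{T}$, and $c_1$ is a constant. The difference $\mathbf{A}(\mathbf{T}^*(\mu)\mathbf{T}(\mu)) - \mathbf{A}(\mathbf{T}^*(\tilde\mu)\mathbf{T}(\tilde\mu))$ in the first term in (18) is estimated in a similar manner. Furthermore, we have
$$\begin{aligned} & \big\| (\mathbf{A}\mathbf{T}(\mu))(\mathbf{U}^* v(\mu)) - (\mathbf{A}\mathbf{T}(\tilde\mu))(\mathbf{U}^* v(\mu)) + (\mathbf{A}\mathbf{T}(\tilde\mu))(\mathbf{U}^* v(\mu)) - (\mathbf{A}\mathbf{T}(\tilde\mu))(\mathbf{U}^* v(\tilde\mu)) \big\|_{L^2(\Omega)} \\ & \quad \le \big\| (\mathbf{U}^* v(\mu)) \, \mathbf{A}[\mathbf{T}(\mu) - \mathbf{T}(\tilde\mu)] \big\|_{L^2(\Omega)} + \big\| \mathbf{A}\mathbf{T}(\tilde\mu) \, \big( \mathbf{U}^* (v(\mu) - v(\tilde\mu)) \big) \big\|_{L^2(\Omega)}. \end{aligned}$$
Noting that $\mathbf{A}$, $\mathbf{U}$ and $\mathbf{U}^*$ are linear and bounded, Theorem 2 and the computations above yield the Lipschitz continuity of $\nabla\mathcal{F}$. ☐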

3. The Stochastic Proximal Gradient Method for QPAT

3.1. Formulation of the Inverse Problem

The inverse problem of multi-source QPAT consists in finding $\mu \in X$ from measured data
$$v_i = \mathbf{F}_i(\mu) + z_i \quad \text{for } i = 1, \dots, N. \tag{19}$$
Here, $\mu = (\mu_a, \mu_s)$ are the unknowns to be estimated, $z_i$ are the unknown error vectors, and $v_1, \dots, v_N$ are the given noisy data. Using the notation
$$v := (v_1, \dots, v_N) \in Y^N, \qquad \mathbf{F} := (\mathbf{F}_1, \dots, \mathbf{F}_N) \colon D(\mathbf{T}) \to Y^N,$$
we can write (19) in the alternative form
$$\text{Estimate } \mu^* \in X \text{ from } v = \mathbf{F}(\mu^*) + z.$$
Here, $z \in Y^N$ denotes the error vector.
There are at least two different strategies to address such an inverse problem: Tikhonov-type regularization on the one hand and iterative methods on the other. In this section, we give an overview of such methods. In particular, we describe proximal stochastic gradient methods (for minimizing the Tikhonov functional), which seem particularly well suited for multi-source QPAT but have not yet been investigated for that purpose.

3.2. Tikhonov Regularization in QPAT

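In this section, we consider a quadratic Tikhonov regularization term for solving (19). Let
$$\mathbf{L} \colon D(\mathbf{L}) \subset X \to Z \colon \mu \mapsto \mathbf{L}\mu$$
be a linear, densely defined, and possibly unbounded operator between $X$ and another Hilbert space $(Z, \langle \cdot, \cdot \rangle_Z)$, and set $\mathbb{D} := D(\mathbf{T}) \cap D(\mathbf{L})$. In this context, any element $\mu^+ \in \mathbb{D}$ with $\|\mathbf{L}\mu^+\| = \min \{ \|\mathbf{L}\mu\| \mid \mathbf{F}(\mu) = v \}$ is called an $\|\mathbf{L}(\cdot)\|$-minimizing solution of $\mathbf{F}(\mu) = v$. Tikhonov regularization with regularization term $\frac{1}{2}\|\mathbf{L}\mu\|_Z^2$ consists in computing a minimizer of the generalized Tikhonov functional $\mathcal{T}_{v, \lambda} \colon X \to \mathbb{R} \cup \{\infty\}$, defined by
$$\mathcal{T}_{v, \lambda}(\mu) := \begin{cases} \frac{1}{2} \|\mathbf{F}(\mu) - v\|^2 + \frac{\lambda}{2} \|\mathbf{L}\mu\|^2, & \text{if } \mu \in \mathbb{D}, \\ \infty, & \text{otherwise}. \end{cases} \tag{21}$$
Here, $\lambda > 0$ denotes the regularization parameter that acts as a trade-off between the data fitting term and stability.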
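The extended-valued structure of the Tikhonov functional (21) can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the QPAT functional itself: `forward` is a hypothetical placeholder for the forward operator, $\mathbf{L}$ is a matrix, and the admissible set is taken to be the box $[0, \texttt{upper}]^n$.

```python
import numpy as np

def tikhonov(mu, forward, v, L, lam, upper):
    # Outside the admissible box the functional takes the value +infinity.
    if np.any(mu < 0.0) or np.any(mu > upper):
        return np.inf
    residual = forward(mu) - v
    # Data fitting term plus quadratic penalty weighted by lam.
    return 0.5 * residual @ residual + 0.5 * lam * np.sum((L @ mu) ** 2)

identity_forward = lambda mu: mu   # toy forward operator (assumption)
v = np.array([1.0, 1.0])
value_inside = tikhonov(np.array([0.5, 0.5]), identity_forward, v, np.eye(2), 2.0, 1.0)
value_outside = tikhonov(np.array([1.5, 0.5]), identity_forward, v, np.eye(2), 2.0, 1.0)
```

For the admissible point the data term contributes $0.25$ and the penalty $0.5$, so a larger $\lambda$ shifts the minimizer toward small $\|\mathbf{L}\mu\|$ at the expense of data fit.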
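Theorem 5 (Well-posedness and convergence).
(1) For any $v \in Y$ and any $\lambda > 0$, the Tikhonov functional $\mathcal{T}_{v, \lambda}$ has at least one minimizer.
(2) Let $v \in \operatorname{ran}(\mathbf{F})$, $(\delta_m)_{m \in \mathbb{N}} \in (0, \infty)^{\mathbb{N}}$, $(v_m)_{m \in \mathbb{N}} \in Y^{\mathbb{N}}$ with $\|v - v_m\| \le \delta_m$. Suppose further that $(\lambda_m)_{m \in \mathbb{N}} \in (0, \infty)^{\mathbb{N}}$ satisfies $\lambda_m \to 0$ and $\delta_m^2 / \lambda_m \to 0$ as $m \to \infty$. Then:
Every sequence $(\mu_m)_{m \in \mathbb{N}}$ with $\mu_m \in \arg\min \mathcal{T}_{v_m, \lambda_m}$ has a weakly converging subsequence.
The limit of every weakly convergent subsequence of $(\mu_m)_{m \in \mathbb{N}}$ is an $\|\mathbf{L}(\cdot)\|$-minimizing solution of $\mathbf{F}(\mu) = v$.
If the $\|\mathbf{L}(\cdot)\|$-minimizing solution of $\mathbf{F}(\mu) = v$ is unique and denoted by $\mu^+$, then $(\mu_m) \rightharpoonup \mu^+$.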
Proof. 
See [18]. ☐

3.3. The Proximal Stochastic Gradient Algorithm for QPAT

Depending on the particular choice of $\mathbf{L}$, the Tikhonov functional (21) may be ill-conditioned. To address this issue, we proposed in [18] the proximal gradient algorithm for minimizing (21), which is a very flexible algorithm for minimizing functionals of the form $F + G$, where $F$ is smooth and $G$ is convex (see, for example, [35,36]). Here, we extend the approach to the proximal stochastic gradient algorithm. Additionally, we propose computing the proximal step using Dykstra’s projection algorithm.
Proximal gradient algorithm: The proximal gradient algorithm is a splitting method that iteratively computes explicit gradient steps for $F$ and implicit proximal steps for $G$. In our context, we take $F$ as the data fidelity term and
$$G(\mu) = G_\lambda(\mu) := g_\lambda(\mu) + \chi_{\mathbb{D}}(\mu) := \frac{\lambda}{2} \|\mathbf{L}\mu\|^2 + \chi_{\mathbb{D}}(\mu), \tag{22}$$
where $\chi_{\mathbb{D}}$ is the characteristic function taking the value zero inside $\mathbb{D}$ and $\infty$ outside. The proximal gradient algorithm for minimizing the QPAT-Tikhonov functional (21) reads
$$\mu_{k+1} = \operatorname{prox}_{s_k G_\lambda}\Big( \mu_k - s_k \sum_{i=1}^N \nabla\mathcal{F}_i(\mu_k) \Big). \tag{23}$$
Here, $\operatorname{prox}_{s_k G_\lambda} \colon X \to \mathbb{D}$ denotes the proximal mapping corresponding to the functional $s_k G_\lambda$,
$$\operatorname{prox}_{s_k G_\lambda}(x) = \operatorname*{arg\,min}_{\mu \in X} \ \frac{1}{2} \|x - \mu\|^2 + s_k G_\lambda(\mu). \tag{24}$$
Furthermore, $\nabla\mathcal{F}_i(\mu_k)$ is the gradient of the $i$-th data fidelity term, which can be computed using Theorem 3.
Dykstra’s projection algorithm: The constrained quadratic optimization problem (24) can be solved efficiently by a proximal variant of Dykstra’s projection algorithm [35,36,53]. For that purpose, we write $s_k G_\lambda = \chi_{\mathbb{D}} + g$ with $g(x) := \frac{s_k \lambda}{2} \|\mathbf{L}x\|^2$. Setting $x_0 = \mu$, $p_0 = 0$ and $q_0 = 0$, Dykstra’s projection algorithm for (24) reads, for $m \in \mathbb{N}$,
$$y_m = \operatorname{prox}_g(x_m + p_m), \tag{25}$$
$$x_{m+1} = P_{\mathbb{D}}(y_m + q_m), \tag{26}$$
$$p_{m+1} = x_m + p_m - y_m, \tag{27}$$
$$q_{m+1} = y_m + q_m - x_{m+1}. \tag{28}$$
Both the proximal mapping in (25) and the projection in (26) can be computed explicitly. In fact, one readily verifies that
$$\operatorname{prox}_g(x) = \big( \mathbf{I}_X + s_k \lambda \mathbf{L}^* \mathbf{L} \big)^{-1} x,$$
$$P_{\mathbb{D}}(\mu) = \min\{ \bar\mu, \max\{ 0, \mu \} \}.$$
Here, $\mathbf{I}_X$ is the identity operator on $X$ and $P_{\mathbb{D}}$ the projection onto $\mathbb{D}$.
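The iteration (25)–(28) translates almost line by line into code once $\mathbf{L}$ is a matrix. The following is a minimal finite-dimensional sketch under that assumption; the function name `dykstra_prox`, the fixed iteration count, and the use of a dense linear solve for the resolvent are choices made for illustration only.

```python
import numpy as np

def dykstra_prox(mu, L, step, lam, upper, n_iter=200):
    # Approximates the prox of step*(lam/2*||L x||^2) + indicator of [0, upper]^n.
    n = mu.size
    system = np.eye(n) + step * lam * (L.T @ L)   # I + s*lambda*L^T L
    x, p, q = mu.astype(float), np.zeros(n), np.zeros(n)
    for _ in range(n_iter):
        y = np.linalg.solve(system, x + p)        # (25): proximal step for g
        x_next = np.clip(y + q, 0.0, upper)       # (26): projection onto the box
        p = x + p - y                             # (27)
        q = y + q - x_next                        # (28)
        x = x_next
    return x

mu = np.array([-3.0, 1.0, 5.0])
# With L = identity the result has a closed form: clip(mu / (1 + step*lam)).
result_identity = dykstra_prox(mu, np.eye(3), step=0.5, lam=2.0, upper=1.0)
# First-order finite differences as a simple smoothing operator (assumption).
L_diff = np.diff(np.eye(3), axis=0)
result_diff = dykstra_prox(mu, L_diff, step=0.5, lam=2.0, upper=1.0)
```

The identity case serves as a sanity check because the problem is then separable; for a non-diagonal $\mathbf{L}$ such as `L_diff` no closed form is available, which is where the alternating scheme becomes useful.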
Proximal stochastic gradient algorithm: The methods described so far require the computation of the full gradient in every iterative step,
$$\nabla\mathcal{F}(\mu) = \sum_{i=1}^N \nabla\mathcal{F}_i(\mu) \quad \text{with} \quad \nabla\mathcal{F}_i(\mu) = \mathbf{F}_i'(\mu)^* \big( \mathbf{F}_i(\mu) - v_i \big).$$
The evaluation of each $\nabla\mathcal{F}_i(\mu)$ requires the solution of the RTE and an adjoint problem and is therefore quite time-consuming. For multi-source QPAT, where $N > 1$, a significant acceleration may be obtained by a Kaczmarz strategy, where in each iterative step only one of the summands $\nabla\mathcal{F}_i(\mu)$ is used. The resulting proximal stochastic gradient method for minimizing the Tikhonov functional (21) in QPAT reads
$$\mu_{k+1} = \operatorname{prox}_{s_k G_\lambda}\big( \mu_k - s_k \nabla\mathcal{F}_{i(k)}(\mu_k) \big), \tag{31}$$
where $i(k) \in \{1, \dots, N\}$ is selected randomly for the update in the $k$-th iteration. Furthermore, $\operatorname{prox}_{s_k G_\lambda}$ is the proximal mapping of $s_k G_\lambda$, which can be computed by the Dykstra algorithm (25)–(28), and $\nabla\mathcal{F}_i(\mu_k)$ is the gradient of the $i$-th data fidelity term, which can be computed using Theorem 3.
One can also incorporate a block-iterative (or mini-batch) strategy in the stochastic gradient method, meaning that a small subset of the equations indexed by $\{1, \dots, N\}$ is used per iteration instead of a single one. Such a variant could be especially useful in the case of a large number of different illumination patterns. For more details about stochastic gradient methods, see [37,38,39,40,41,42] and the references therein. Note that, in general, convergence of stochastic gradient methods requires an asymptotically vanishing step size [42].
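The mini-batch variant and the vanishing step size can be sketched as follows. This is an illustrative toy setup, not the scheme used for the numerical results: random matrices replace the forward operators, $\lambda = 0$ so that the proximal step is a projection, and the polynomial decay $s_k = s_0 / (1 + k)^{0.6}$ is one common choice assumed here rather than taken from the text. The helper names `sample_batch`, `step_size` and `objective` are hypothetical.

```python
import numpy as np

def sample_batch(rng, n_sources, batch_size):
    # Draw distinct source indices for one mini-batch.
    return rng.choice(n_sources, size=batch_size, replace=False)

def step_size(k, s0, decay=0.6):
    # Vanishing step size; the sum over k diverges for 0 < decay <= 1.
    return s0 / (1.0 + k) ** decay

def objective(mu, ops, data):
    # Full data fidelity term, used here only to monitor progress.
    return 0.5 * sum(np.sum((A @ mu - v) ** 2) for A, v in zip(ops, data))

rng = np.random.default_rng(1)
n, m, N, batch_size = 8, 6, 12, 3
mu_true = rng.uniform(0.2, 0.8, size=n)
ops = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(N)]  # toy operators
data = [A @ mu_true + 0.01 * rng.standard_normal(m) for A in ops]   # noisy toy data

s0 = 1.0 / (batch_size * max(np.linalg.norm(A, 2) for A in ops) ** 2)
mu0 = np.full(n, 0.5)
mu = mu0.copy()
for k in range(500):
    batch = sample_batch(rng, N, batch_size)
    grad = sum(ops[i].T @ (ops[i] @ mu - data[i]) for i in batch)
    # lambda = 0 in this sketch, so the proximal step is a box projection.
    mu = np.clip(mu - step_size(k, s0) * grad, 0.0, 1.0)
```

Each iteration costs `batch_size` operator applications instead of `N`, and the batch size interpolates between the single-equation update and the full gradient step.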

3.4. Iterative Regularization Methods

An alternative class of algorithms to address nonlinear inverse problems are iterative techniques. The most basic iterative method for solving the nonlinear inverse problem $v = \mathbf{F}(\mu)$ is the Landweber iteration. In the case that the domain of definition $\mathbb{D}$ is a proper subset, we have to combine the Landweber iteration with a projection step onto $\mathbb{D}$, as presented in this subsection. The projected Landweber iteration applied to multi-source QPAT reads
$$\mu_{k+1} = P_{\mathbb{D}}\Big( \mu_k - s_k \sum_{i=1}^N \nabla\mathcal{F}_i(\mu_k) \Big), \tag{32}$$
where $\nabla\mathcal{F}_i$ is the gradient of $\mathcal{F}_i$ (see Equation (17)), and $P_{\mathbb{D}}(\mu) = \min\{ \bar\mu, \max\{ 0, \mu \} \}$ denotes the projection onto $\mathbb{D}$. In Tikhonov regularization, the regularity of solutions is enforced by an explicitly included penalty. In contrast, in iterative regularization methods, a stabilization effect is enforced by early stopping of the iteration. A common stopping rule is the discrepancy principle, where the iteration is stopped at the smallest index $k \in \mathbb{N}$ satisfying $\|v - \mathbf{F}(\mu_k)\| \le \tau\delta$, where $\delta$ is an estimate for the noise and $\tau \ge 1$. Formally, the projected Landweber iteration (32) arises as a special case of the proximal gradient iteration (23) for minimizing the Tikhonov functional, where the regularization parameter is taken as $\lambda = 0$ and where the proximal mapping (24) reduces to the orthogonal projection onto $\mathbb{D}$.
In a similar manner, one can also use a stochastic version of the projected Landweber iteration. Using the loping strategy of [43,44,45] in order to stabilize the iterative process, the resulting projected loping Landweber–Kaczmarz iteration reads
$$\mu_{k+1} = P_{\mathbb{D}}\big( \mu_k - s_k \, \omega_k \nabla\mathcal{F}_{i(k)}(\mu_k) \big), \tag{33}$$
$$\omega_k := \begin{cases} 1, & \|\mathbf{F}_{i(k)}(\mu_k) - v_{i(k)}\|_Y > \tau \delta_{i(k)}, \\ 0, & \text{otherwise}. \end{cases} \tag{34}$$
Here, $i(k) \in \{1, \dots, N\}$ for any $k \in \mathbb{N}$ may be randomly selected, $\tau > 1$ is an appropriately chosen constant, $\nabla\mathcal{F}_i(\cdot)$ is the gradient of the $i$-th data fidelity term, which can be computed using Theorem 3, and $P_{\mathbb{D}}(\mu) = \min\{ \bar\mu, \max\{ 0, \mu \} \}$ denotes the projection onto $\mathbb{D}$. The iteration (33), (34) terminates if $\|\mathbf{F}_i(\mu_k) - v_i\| \le \tau\delta_i$ for all $i \in \{1, \dots, N\}$. It is worth mentioning that, for noise free data, we have $\omega_k = 1$ for all $k$ and, therefore, in this special situation, the iteration becomes $\mu_{k+1} = P_{\mathbb{D}}\big( \mu_k - s_k \nabla\mathcal{F}_{i(k)}(\mu_k) \big)$, which formally arises from the proximal stochastic gradient method (31) with $\lambda = 0$. A convergence analysis of the loping Landweber–Kaczmarz method can be found in [43,44].
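The loping rule (34) and the termination criterion can be illustrated with a small sketch. As in any toy model, several assumptions are made: linear random matrices replace the operators $\mathbf{F}_i$, the noise levels $\delta_i$ are known exactly, a constant step size is used, and all residuals are recomputed in every iteration purely for readability (a practical implementation would avoid this extra cost). The function name `loping_landweber_kaczmarz` is hypothetical.

```python
import numpy as np

def loping_landweber_kaczmarz(ops, data, deltas, mu0, step, tau, upper, rng,
                              max_iter=20000):
    mu = mu0.copy()
    for k in range(max_iter):
        residuals = [np.linalg.norm(A @ mu - v) for A, v in zip(ops, data)]
        # Terminate once every equation satisfies the discrepancy criterion.
        if all(r <= tau * d for r, d in zip(residuals, deltas)):
            return mu, k
        i = rng.integers(len(ops))
        # omega_k = 1 only if the selected residual exceeds tau * delta_i;
        # otherwise (omega_k = 0) the update is skipped.
        if residuals[i] > tau * deltas[i]:
            grad = ops[i].T @ (ops[i] @ mu - data[i])
            mu = np.clip(mu - step * grad, 0.0, upper)
    return mu, max_iter

rng = np.random.default_rng(2)
n, m, N = 8, 6, 5
mu_true = rng.uniform(0.2, 0.8, size=n)
ops = [rng.standard_normal((m, n)) / np.sqrt(m) for _ in range(N)]  # toy operators
deltas = [0.05] * N
noise = [rng.standard_normal(m) for _ in range(N)]
# Scale the noise so that its norm equals delta_i exactly.
data = [A @ mu_true + d * z / np.linalg.norm(z)
        for A, z, d in zip(ops, noise, deltas)]

step = 1.0 / max(np.linalg.norm(A, 2) for A in ops) ** 2
mu_rec, n_iter = loping_landweber_kaczmarz(ops, data, deltas, np.full(n, 0.5),
                                           step, 2.5, 1.0, rng)
final_residuals = [np.linalg.norm(A @ mu_rec - v) for A, v in zip(ops, data)]
```

Skipping updates for equations that already fit the data up to the noise level is what prevents the iteration from fitting the noise, so early stopping is built into the scheme rather than imposed from outside.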

4. QPAT as Multilinear Inverse Problem

Since the RTE is time-consuming to solve, we are looking for a suitable reformulation of the inverse problem in multi-source QPAT avoiding computation of a solution of the RTE in each iterative step. In this paper, we propose to write (19) as a multilinear inverse problem, where we add the RTE as a constraint instead of explicitly including its solution. The new formulation will again be addressed by Tikhonov regularization and proximal stochastic gradient methods.

4.1. Reformulation as Multilinear Inverse Problem

Recall the forward problem of QPAT governed by the RTE (6). With the abbreviation $M(\mu) := \theta \cdot \nabla_x + \mu_a + \mu_s (I - K)$, the RTE can be written in compact form $M(\mu)\Phi = q$, where $\mu = (\mu_a, \mu_s)$ is the unknown parameter pair. In the case of exact data, the multi-source problem in QPAT (19) then can be reformulated as the problem of finding the tuple $z := (\mu, (\Phi_i, H_i)_{i=1}^N) \in D \times (W \times L^2(\Omega))^N$ such that
$$\begin{aligned} M(\mu)\Phi_i &= q_i && \text{for } i = 1, \dots, N, \\ H_i &= \mu_a A \Phi_i && \text{for } i = 1, \dots, N, \\ v_i &= U(H_i) && \text{for } i = 1, \dots, N. \end{aligned} \tag{35}$$
Here, the index $i$ indicates the $i$-th illumination, and $q_i \in Q$, $\Phi_i \in W$, $H_i \in L^2(\Omega)$, $v_i \in Y$ are the corresponding source, photon density, heating and acoustical data, respectively, and $A\Phi_i = \int_{\mathbb{S}^{d-1}} \Phi_i(\cdot, \theta)\,\mathrm{d}\theta$ is the averaging operator. We call (35) and the resulting formulations below the multilinear (MULL) formulation of QPAT.

4.2. Application of Tikhonov Regularization

In the case that the data $v_i$ are only known approximately, we use Tikhonov regularization for the stable solution of (35). For that purpose, we approximate (35) by the constrained optimization problem
$$\min_{(\mu, \Phi_i, H_i)_{i=1}^N} \; \frac{1}{2} \sum_{i=1}^N \|v_i - U(H_i)\|^2 + \frac{\lambda}{2} \|L(\mu)\|^2 + \chi_D(\mu) \quad \text{s.t.} \quad M(\mu)\Phi_i = q_i, \;\; H_i = \mu_a A \Phi_i \;\; \text{for } i = 1, \dots, N. \tag{36}$$
Here, the operator $L\mu = (L_a \mu_a, L_s \mu_s)$ is possibly unbounded, $\frac{\lambda}{2}\|L(\mu)\|^2$ is the regularization term and $\lambda > 0$ the regularization parameter. Note that (36) is equivalent to (21) and therefore the well-posedness and convergence results of Theorem 5 apply to (36) as well.
The constrained optimization problem (36) proposed in this paper can be addressed by various solution methods, for example using penalty methods or augmented Lagrangian techniques [54]. In this paper, we use a penalty approach for solving (36), where the constraints are included as penalty terms. To simplify notation, we introduce the unconstrained functionals
$$J^{(i)}(z) := \frac{a_1}{2} \|M(\mu)\Phi_i - q_i\|^2 + \frac{a_2}{2} \|\mu_a A \Phi_i - H_i\|^2 + \frac{a_3}{2} \|v_i - U(H_i)\|^2 + \frac{\lambda}{2} \|L(\mu)\|^2 + \chi_D(\mu), \tag{37}$$
for certain parameters $a_1, a_2, a_3 > 0$ and $z_i := (\mu, \Phi_i, H_i) \in Q \times W \times L^2(\Omega)$. The sum of the unconstrained functionals (37) over all illuminations is what is actually minimized in our numerical implementations. For that purpose, we define
$$J_1^{(i)}(z) = \frac{1}{2} \|M(\mu)\Phi_i - q_i\|^2, \quad J_2^{(i)}(z) = \frac{1}{2} \|\mu_a A \Phi_i - H_i\|^2, \quad J_3^{(i)}(z) = \frac{1}{2} \|v_i - U(H_i)\|^2, \quad J_4^{(i)}(z) = \frac{1}{2} \|L(\mu)\|^2.$$
Then, we have $J^{(i)}(z) = \sum_{\ell=1}^{4} a_\ell J_\ell^{(i)}(z) + \chi_D(\mu)$. For the approximate solution of (36), we minimize the unconstrained functional $J(z) = \sum_{i=1}^{N} J^{(i)}(z)$, which can be written in the forms
$$J(z) = \sum_{i=1}^{N} \sum_{\ell=1}^{4} a_\ell J_\ell^{(i)}(z) + \chi_D(\mu), \tag{38}$$
$$J(z) = \sum_{i=1}^{N} \sum_{\ell=1}^{3} a_\ell J_\ell^{(i)}(z) + \frac{\lambda}{2} \|L(\mu)\|^2 + \chi_D(\mu). \tag{39}$$
(Here and below, we also write $a_4 = \lambda$, if it simplifies notation.) The formulations (38) as well as (39) can be solved by various optimization techniques. In particular, as the functionals are given as the sum of simpler terms, the stochastic (proximal) gradient method is particularly appealing.

4.3. Solution of the MULL Formulation of QPAT Using Stochastic Gradient Methods

For solving QPAT in the novel MULL formulation (35), we use stochastic gradient methods similar to previous sections. For that purpose, we require the gradients (determining the steepest descent directions) of the individual functionals $J_\ell^{(i)}(z_i)$ with respect to $z_i = (\mu, \Phi_i, H_i)$, which are given as
$$\nabla_{\mu_a} J_1^{(i)}(z) = \Phi_i \,\bigl(M(\mu)\Phi_i - q_i\bigr), \tag{40}$$
$$\nabla_{\mu_s} J_1^{(i)}(z) = (I - K)\Phi_i \,\bigl(M(\mu)\Phi_i - q_i\bigr), \tag{41}$$
$$\nabla_{\Phi_i} J_1^{(i)}(z) = M(\mu)^{*} \bigl(M(\mu)\Phi_i - q_i\bigr), \tag{42}$$
$$\nabla_{\mu_a} J_2^{(i)}(z) = (A\Phi_i)\,(\mu_a A \Phi_i - H_i), \tag{43}$$
$$\nabla_{H_i} J_2^{(i)}(z) = -(\mu_a A \Phi_i - H_i), \tag{44}$$
$$\nabla_{\Phi_i} J_2^{(i)}(z) = A^{*} \bigl[\mu_a (\mu_a A \Phi_i - H_i)\bigr], \tag{45}$$
$$\nabla_{H_i} J_3^{(i)}(z) = -U^{T} \bigl(v_i - U(H_i)\bigr), \tag{46}$$
$$\nabla_{\mu_a} J_4^{(i)}(z) = L_a^{*} L_a \mu_a, \tag{47}$$
$$\nabla_{\mu_s} J_4^{(i)}(z) = L_s^{*} L_s \mu_s. \tag{48}$$
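The partial gradients of the first three functionals can be sanity-checked against central finite differences on a small discrete stand-in. The sketch below is a toy model under strong assumptions: there is no angular variable (so the averaging operator $A$ is the identity and $\mu_a$, $\Phi_i$ live on the same grid), and the matrices `T`, `K`, `U` are random or constant placeholders rather than discretizations of the actual operators.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 9
T = rng.standard_normal((n, n))        # placeholder for the transport part theta . grad_x
K = np.full((n, n), 1.0 / n)           # placeholder for the scattering operator K
U = rng.standard_normal((m, n))        # placeholder for the linear wave operator U
q, v = rng.standard_normal(n), rng.standard_normal(m)

def apply_M(mu_a, mu_s, Phi):
    """Toy version of M(mu) Phi = T Phi + mu_a Phi + mu_s (I - K) Phi."""
    return T @ Phi + mu_a * Phi + mu_s * (Phi - K @ Phi)

def J1(mu_a, mu_s, Phi, H):
    return 0.5 * np.sum((apply_M(mu_a, mu_s, Phi) - q) ** 2)

def J2(mu_a, mu_s, Phi, H):
    return 0.5 * np.sum((mu_a * Phi - H) ** 2)

def J3(mu_a, mu_s, Phi, H):
    return 0.5 * np.sum((v - U @ H) ** 2)

def analytic_gradients(mu_a, mu_s, Phi, H):
    """Gradients keyed by (functional, index of the variable block in z)."""
    r1 = apply_M(mu_a, mu_s, Phi) - q
    r2 = mu_a * Phi - H
    M = T + np.diag(mu_a) + mu_s[:, None] * (np.eye(n) - K)
    return {
        (J1, 0): Phi * r1,                  # w.r.t. mu_a
        (J1, 1): (Phi - K @ Phi) * r1,      # w.r.t. mu_s
        (J1, 2): M.T @ r1,                  # w.r.t. Phi
        (J2, 0): Phi * r2,                  # w.r.t. mu_a
        (J2, 3): -r2,                       # w.r.t. H
        (J2, 2): mu_a * r2,                 # w.r.t. Phi
        (J3, 3): -U.T @ (v - U @ H),        # w.r.t. H
    }

def numeric_gradient(J, z, block, eps=1e-6):
    """Central finite differences of J with respect to z[block]."""
    grad = np.zeros(n)
    for j in range(n):
        zp = [x.copy() for x in z]
        zm = [x.copy() for x in z]
        zp[block][j] += eps
        zm[block][j] -= eps
        grad[j] = (J(*zp) - J(*zm)) / (2.0 * eps)
    return grad

z = [rng.uniform(0.1, 1.0, n), rng.uniform(0.1, 1.0, n),
     rng.standard_normal(n), rng.standard_normal(n)]
max_error = max(np.max(np.abs(g - numeric_gradient(J, z, block)))
                for (J, block), g in analytic_gradients(*z).items())
```

Since each functional is quadratic in every single variable block, the central differences are exact up to rounding, so `max_error` should be tiny.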
(All other partial gradients are vanishing.) In the following, let $N$ be the number of illuminations, write $z = (\mu_a, \mu_s, (\Phi_i, H_i)_{i=1}^N)$ and let $(s_k)_{k \in \mathbb{N}}$ be a sequence of step sizes. In this paper, we propose the following instances of the stochastic proximal gradient method for QPAT based on the multilinear formulation (35).
MULL-projected stochastic gradient algorithm: Here, we consider the form (38). For any iteration index $k \in \mathbb{N}$ choose $i(k) \in \{1, \dots, N\}$ and $\ell(k) \in \{1, \dots, 4\}$ and define the sequence of iterates $(z_k)_{k \in \mathbb{N}}$ by
$$z_{k+1} = (P_D \times I)\Bigl(z_k - s_k \nabla J_{\ell(k)}^{(i(k))}(z_k)\Bigr). \tag{49}$$
Here, the mapping $P_D \times I$ is the proximal mapping corresponding to $z \mapsto \chi_D(\mu)$, which equals the projection $P_D$ in the $\mu$ component and equals the identity $I$ in the other components.
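A single update of the MULL-projected stochastic gradient algorithm can be sketched as follows. This is a toy illustration, not the implementation used for the experiments: the model has no angular variable (so $A$ is the identity), the operators in `model` are placeholders, the same matrix `L` is used for both coefficients, and the weights $a_\ell$ are absorbed into the step size.

```python
import numpy as np

def mull_projected_step(z, model, i, ell, s, mu_max=5.0):
    """Gradient step on J_ell^(i), then projection of the mu component onto D = [0, mu_max]^n."""
    T, K, U, L = model["T"], model["K"], model["U"], model["L"]
    mu_a, mu_s = z["mu_a"], z["mu_s"]
    Phi, H = z["Phi"][i], z["H"][i]
    if ell == 1:      # RTE residual 0.5 * ||M(mu) Phi_i - q_i||^2
        scat = Phi - K @ Phi
        r = T @ Phi + mu_a * Phi + mu_s * scat - model["q"][i]
        M = T + np.diag(mu_a) + mu_s[:, None] * (np.eye(Phi.size) - K)
        mu_a, mu_s, Phi = mu_a - s * Phi * r, mu_s - s * scat * r, Phi - s * (M.T @ r)
    elif ell == 2:    # heating residual 0.5 * ||mu_a Phi_i - H_i||^2
        r = mu_a * Phi - H
        mu_a, Phi, H = mu_a - s * Phi * r, Phi - s * mu_a * r, H + s * r
    elif ell == 3:    # data residual 0.5 * ||v_i - U H_i||^2
        H = H + s * (U.T @ (model["v"][i] - U @ H))
    else:             # regularization term 0.5 * ||L mu||^2
        mu_a, mu_s = mu_a - s * (L.T @ (L @ mu_a)), mu_s - s * (L.T @ (L @ mu_s))
    Phi_all, H_all = list(z["Phi"]), list(z["H"])
    Phi_all[i], H_all[i] = Phi, H
    # P_D x I: only the mu component is projected, Phi and H are left unchanged
    return {"mu_a": np.clip(mu_a, 0.0, mu_max), "mu_s": np.clip(mu_s, 0.0, mu_max),
            "Phi": Phi_all, "H": H_all}

rng = np.random.default_rng(0)
n, m, N = 8, 12, 4
model = {
    "T": (np.eye(n) - np.eye(n, k=-1)) * n / 2.0,     # upwind difference placeholder
    "K": np.full((n, n), 1.0 / n),
    "U": rng.standard_normal((m, n)),
    "L": np.diff(np.eye(n), axis=0),                  # first-difference regularizer
    "q": [rng.uniform(0.0, 1.0, n) for _ in range(N)],
    "v": [rng.standard_normal(m) for _ in range(N)],
}
z0 = {"mu_a": np.full(n, 0.3), "mu_s": np.full(n, 3.0),
      "Phi": [np.zeros(n) for _ in range(N)], "H": [np.zeros(n) for _ in range(N)]}

z = z0
for k in range(200):
    i, ell = rng.integers(N), rng.integers(1, 5)      # random illumination and functional
    z = mull_projected_step(z, model, i, ell, s=1e-3)
```

Each step touches only the variables that the selected functional depends on; for example, a step with `ell == 3` changes `H[i]` alone.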
MULL-proximal stochastic gradient algorithm: Here, we consider the form (39). For any iteration index $k \in \mathbb{N}$ choose $i(k) \in \{1, \dots, N\}$ and $\ell(k) \in \{1, \dots, 3\}$ and define the sequence of iterates $(z_k)_{k \in \mathbb{N}}$ by
$$z_{k+1} = \operatorname{prox}_{s_k G_\lambda}\Bigl(z_k - s_k \nabla J_{\ell(k)}^{(i(k))}(z_k)\Bigr). \tag{50}$$
The second step implements the proximal mapping of $z \mapsto s_k G_\lambda(\mu)$ with $G_\lambda(\mu) = \frac{\lambda}{2}\|L(\mu)\|^2 + \chi_D(\mu)$. As in the previous section, this can be computed with Dykstra’s projection algorithm (25)–(28).
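The proximal mapping of a quadratic penalty plus a box constraint can be approximated by a Dykstra-like splitting iteration that alternates the two individual proximal maps with correction terms. The sketch below uses the generic form of that iteration; whether it coincides line by line with (25)–(28) is an assumption. The first-difference matrix `L`, the weight `c` (playing the role of $s_k \lambda$) and the bound `mu_max` are illustrative choices.

```python
import numpy as np

def prox_G(r, L, c, mu_max, n_iter=500):
    """Approximate prox of G(mu) = (c/2) ||L mu||^2 + chi_D(mu) at the point r,
    with D = [0, mu_max]^n, by a Dykstra-like splitting iteration."""
    n = r.size
    P = np.linalg.inv(np.eye(n) + c * (L.T @ L))   # prox of the quadratic term is linear
    x, p, q = r.copy(), np.zeros(n), np.zeros(n)
    y = np.clip(x, 0.0, mu_max)
    for _ in range(n_iter):
        y = np.clip(x + p, 0.0, mu_max)            # prox of chi_D: projection onto the box
        p = x + p - y                              # correction term for the projection
        x = P @ (y + q)                            # prox of (c/2) ||L . ||^2
        q = y + q - x                              # correction term for the quadratic prox
    return y

L = np.diff(np.eye(6), axis=0)                     # first differences, shape (5, 6)
r = np.array([-1.0, 0.5, 3.0, 1.0, -0.5, 2.0])
c, mu_max = 2.0, 1.5
mu_prox = prox_G(r, L, c, mu_max)

# Optimality check: the minimizer of 0.5 ||x - r||^2 + (c/2) ||L x||^2 over the box
# is a fixed point of the projected gradient map.
grad = mu_prox - r + c * (L.T @ (L @ mu_prox))
kkt_residual = np.linalg.norm(mu_prox - np.clip(mu_prox - grad, 0.0, mu_max))
```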
For better scaling, in our actual numerical implementation, we replace the scalar step sizes $s_k$ by the adaptive step size rule
$$s_k^{i,\ell} := \arg\min \bigl\{ J_\ell^{(i)}\bigl(z_k - t\,\nabla J_\ell^{(i)}(z_k)\bigr) \bigm| t \in \mathbb{R} \bigr\}. \tag{51}$$
Note that computing such step sizes barely increases the computational time of the stochastic gradient method, since all involved calculations are necessary anyway for computing the gradient for the iterative update. In contrast, calculating a similar adaptive step size for the algorithms proposed in Section 3 would require evaluation of the forward operators $F_i$ and therefore would significantly increase the computation time. This might be seen as an additional advantage of the novel MULL formulation (35) and its regularized version (36).
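For a quadratic term such as the data residual $J_3^{(i)}$ with a linear operator $U$, the minimization along the negative gradient direction has a closed form. The following sketch shows this special case only; the matrix `U` and the data are random placeholders, and how the step size is evaluated for the other functionals in the actual implementation is not specified here.

```python
import numpy as np

def exact_step_J3(U, v, H):
    """Exact line search for J3(H) = 0.5 * ||v - U H||^2 along -grad J3(H).
    Minimizing over t gives t = ||g||^2 / ||U g||^2 with g = grad J3(H)."""
    g = -U.T @ (v - U @ H)
    Ug = U @ g
    denom = Ug @ Ug
    t = (g @ g) / denom if denom > 0.0 else 0.0
    return t, g

def J3(U, v, H):
    return 0.5 * np.sum((v - U @ H) ** 2)

rng = np.random.default_rng(2)
U = rng.standard_normal((12, 8))
v = rng.standard_normal(12)
H = np.zeros(8)
t_opt, g = exact_step_J3(U, v, H)
H_new = H - t_opt * g
```

Steps slightly shorter or longer than `t_opt` give a larger value of `J3`, which is a quick way to confirm the formula.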

5. Numerical Simulations

For the Tikhonov approach to multi-source QPAT, the radiative transfer equation is solved numerically by a streamline diffusion finite element method. Solving the RTE is required to evaluate the forward operator $F$ and the gradient $\nabla \mathcal{F}$ of the data fidelity term in every iterative step. For the alternative multilinear approach, these calculations are not necessary. However, the application of the transport operator to $\Phi$ has to be calculated for every update of $J_1$. The simulations are performed on the square domain $\Omega = [-1\,\mathrm{cm}, 1\,\mathrm{cm}]^2$, where the absorption and the scattering coefficients are supported.

5.1. Numerical Solution of the RTE

Employing a finite element scheme, we derive the weak formulation of Equation (6) by integrating against a test function $w\colon \Omega \times \mathbb{S}^1 \to \mathbb{R}$ and replacing the exact solution $\Phi$ by a linear combination in the finite element space, $\Phi^{(h)} = \sum_{i=1}^{N_h} c_i^{(h)} \psi_i^{(h)}(x, \theta)$, as in [18]. Here, each basis function $\psi_i^{(h)}(x, \theta)$ is the product of a basis function in space and a basis function in velocity. The spatial domain is triangulated uniformly with mesh size $h$, and $P_1$-Lagrangian elements are used for both the spatial and the velocity domain. By choosing the test function $w(x, \theta) = \sum_{j=1}^{N_h} w_j \bigl(\psi_j(x, \theta) + D(x, \theta)\,\theta \cdot \nabla_x \psi_j(x, \theta)\bigr)$ with streamline diffusion coefficient $D(x, \theta)$, we obtain
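$$\int_\Omega \int_{\mathbb{S}^1} (D\,\theta \cdot \nabla \psi_i - \psi_i)\,\theta \cdot \nabla_x \psi_j \,\mathrm{d}\theta\,\mathrm{d}x + \int_{\Gamma_+} |\theta \cdot \nu| \,\mathrm{d}\sigma + \int_\Omega \int_{\mathbb{S}^1} (\mu_a + \mu_s - \mu_s K)(\psi_j + D\,\theta \cdot \nabla_x \psi_j)\,\psi_i \,\mathrm{d}\theta\,\mathrm{d}x = \int_{\Gamma_-} |\theta \cdot \nu|\,\psi_i \psi_j \,\mathrm{d}\sigma. \tag{52}$$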
Equation (52) yields a system of linear equations $M^{(h)} c^{(h)} = b^{(h)}$, where evaluating the left-hand side of (52) provides the entries of $M^{(h)}$ and the right-hand side gives the components of the vector $b^{(h)}$. Note that the sparsity of the matrix $M^{(h)}$ is low and solving the linear system for the Tikhonov approach is very time-consuming. On the other hand, the solution via the MULL formulation requires only a matrix-vector multiplication, since in this case $\Phi^{(h)}$ is an independent variable. Thus, only the application of $M^{(h)}$ to $\Phi^{(h)}$ has to be calculated and the transport equation does not need to be solved.
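The difference between the two approaches can be made concrete with a small dense stand-in for $M^{(h)}$. The sketch assumes a one-dimensional toy model without angular variable; the matrices `T` and `K`, the coefficient values and the source `q` are illustrative placeholders, not the finite element matrices from (52).

```python
import numpy as np

def transport_matrix(T, K, mu_a, mu_s):
    """Dense toy analogue of M(mu) = T + mu_a + mu_s (I - K)."""
    n = T.shape[0]
    return T + np.diag(mu_a) + mu_s[:, None] * (np.eye(n) - K)

n = 50
h = 2.0 / n
T = (np.eye(n) - np.eye(n, k=-1)) / h      # upwind difference, stands in for theta . grad_x
K = np.full((n, n), 1.0 / n)               # averaging operator whose rows sum to one
mu_a = np.full(n, 0.3)
mu_s = np.full(n, 3.0)
q = np.zeros(n)
q[0] = 1.0 / h                             # inflow source at the first grid point

M = transport_matrix(T, K, mu_a, mu_s)

# Standard formulation: every forward evaluation requires a linear solve.
Phi_solved = np.linalg.solve(M, q)

# MULL formulation: Phi is an independent variable, so only the residual
# M(mu) Phi - q, i.e. one matrix-vector product, is needed per update.
Phi_guess = np.ones(n)
residual = M @ Phi_guess - q
```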

5.2. Test Scenario for Multiple Illumination

The sample is illuminated in orthogonal direction at the boundaries of $\Omega = [-1\,\mathrm{cm}, 1\,\mathrm{cm}]^2$. In our simulations, we use $N = 4$ homogeneous illuminations and no internal sources. The illuminations are applied separately from each side (left, right, top and bottom) with acoustic data measured on a half circle on the same side as the illumination (see Figure 2). For the scattering kernel, we use the two-dimensional Henyey–Greenstein kernel,
$$k(\theta, \theta') := \frac{1}{2\pi}\, \frac{1 - g^2}{1 + g^2 - 2 g \cos(\theta \cdot \theta')} \quad \text{for } \theta, \theta' \in \mathbb{S}^1,$$
where the anisotropy factor is chosen as $g = 0.5$ in all our experiments.
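A discretized scattering matrix for this kernel can be sketched as follows. The code assumes the standard two-dimensional form in which the kernel depends on the cosine of the angle between the two directions, passed in as `cos_angle`, and uses 64 equispaced directions with uniform quadrature weights; both choices are assumptions for this example.

```python
import numpy as np

def henyey_greenstein_2d(cos_angle, g=0.5):
    """Two-dimensional Henyey-Greenstein kernel as a function of the cosine
    of the angle between incoming and outgoing direction."""
    return (1.0 - g ** 2) / (2.0 * np.pi * (1.0 + g ** 2 - 2.0 * g * cos_angle))

n_theta = 64
angles = 2.0 * np.pi * np.arange(n_theta) / n_theta
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # unit vectors on S^1
cos_angles = directions @ directions.T                            # pairwise theta . theta'

# Scattering matrix with uniform quadrature weight 2*pi / n_theta per direction.
K = henyey_greenstein_2d(cos_angles) * (2.0 * np.pi / n_theta)
row_sums = K.sum(axis=1)
```

The kernel is a probability density on the circle, so each row of `K` should sum to one up to quadrature and rounding error.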
For the simulated data, we choose a spatial mesh size of $2/100$; to discretize the velocity direction, the unit circle is divided into 64 subintervals. In order to avoid an inverse crime, for the reconstruction, we use a different spatial mesh size $h = 2/80$ and use $N_\theta = 48$ velocity directions. Calculating the simulated data corresponds to evaluating the forward operators $F_i$ with perpendicular boundary illumination constant along one side of the boundary square, $q_o(x, \theta) = \delta(\theta - \theta_i)\,\chi_i(x) \cdot 1\,\mathrm{mJ\,cm^{-1}}$, where $\delta$ is an approximation of the Dirac delta function and $\chi_i$ the indicator function of side $i$ of $\Omega$. In this way, we simulate data
$$v_i = U H_i(\mu) + z_i^{\mathrm{noise}} \quad \text{for } i = 1, \dots, 4.$$
Thereby, the heating operator is computed numerically by solving the RTE as described in Section 5.1. The wave operator $U$ is evaluated by straightforward discretization of the well-known explicit formulas for (10) that can be found, for example, in [55,56]. In the following, we present results for exact data (where $z_i^{\mathrm{noise}} = 0$) as well as for noisy data. For the noisy data case, we add $0.5\%$ random noise to the simulated data, i.e., we take the maximum value of the simulated pressure and add white noise $z_i^{\mathrm{noise}}$ with a standard deviation of $0.5\%$ of that maximal value. The phantom, the setup and the simulated data for one of the four illuminations (top) are shown in Figure 2 and Figure 3.
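The noise model described above can be sketched in a few lines. The function name, the use of the maximum absolute value as reference level, and the synthetic `pressure` array are assumptions for this example.

```python
import numpy as np

def add_white_noise(pressure, relative_level=0.005, seed=0):
    """Add Gaussian white noise whose standard deviation is a fixed fraction
    (default 0.5 %) of the maximal absolute pressure value."""
    rng = np.random.default_rng(seed)
    sigma = relative_level * np.max(np.abs(pressure))
    return pressure + sigma * rng.standard_normal(pressure.shape), sigma

# Synthetic stand-in for simulated pressure data (detector position x time).
x = np.linspace(0.0, 2.0 * np.pi, 200)
pressure = np.outer(np.sin(x), np.cos(x))
noisy, sigma = add_white_noise(pressure)
```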

5.3. Numerical Results

For regularizing the absorption and scattering coefficients, we make use of Laplace regularization and choose $L_a = \Delta$ and $L_s = 100\,\Delta$, respectively. We assume that the coefficient $\mu$ is known at the boundary of $\Omega$ and is therefore used as the starting value of our iterative schemes. Furthermore, we use the boundary value of $\mu$ for regularization; that is, we implement it in the Dykstra projection procedure (26) by iteratively projecting on the known boundary value. In the following, we discuss the methods that we have outlined in the previous section.
Standard formulation of QPAT (19): We assume that the scattering coefficient is known and we restrict ourselves to reconstructing the absorption coefficient. Then, the proximal gradient and proximal stochastic gradient algorithm, respectively, read
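$$\mu_a^{k+1} = \operatorname{prox}_{s_k G_\lambda}\Bigl(\mu_a^k - \frac{s_k}{4} \sum_{i=1}^{4} \nabla \mathcal{F}_i(\mu_a^k, \mu_s)\Bigr), \tag{53}$$
$$\mu_a^{k+1} = \operatorname{prox}_{s_k G_\lambda}\Bigl(\mu_a^k - s_k \nabla \mathcal{F}_{i(k)}(\mu_a^k, \mu_s)\Bigr). \tag{54}$$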
In contrast to the full proximal gradient algorithm, the proximal stochastic gradient algorithm avoids evaluating the full gradient $\nabla \mathcal{F}$, but randomly selects an illumination number $i \in \{1, \dots, 4\}$ for each iterative step. Because of formula (16), each iteration of the above procedures requires the calculation of the solution $\Phi$ of the radiative transfer equation as well as of its adjoint $\Phi^*$.
The top row in Figure 4 shows reconstruction results for the absorption coefficient using the original formulation with the proximal gradient method with $\lambda = 2 \times 10^{-8}$ and 10 iterative steps. The left picture shows the relative error $\|\mu_a - \mu_a^k\| / \|\mu_a\|$. Note that, in this case, solutions of the RTE and its adjoint have to be computed for four illuminations per iterative step. The reconstruction results in the bottom row in Figure 4 are obtained by the proximal stochastic gradient method with $\lambda = 2 \times 10^{-7}$. The regularization parameters $\lambda$ have been selected empirically as a trade-off between stability and accuracy. The total number of iterations is taken as 30. In each iteration, an illumination pattern is chosen randomly and the RTE and its adjoint are solved only for this single illumination. Therefore, the computational effort for the proximal stochastic gradient method is approximately 3/4 of that of the proximal gradient algorithm using full gradients. For the algorithms based on the standard formulation (19), calculating adaptive step sizes similar to (51) is time-consuming, as this requires another evaluation of the forward operator $F_i$ and therefore another solution of the RTE. Therefore, we simply use a constant step size rule; in our numerical experiments, it turned out that $s_k = 0.5$ is a suitable choice.
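The factor 3/4 follows from counting pairs of forward and adjoint RTE solves, under the assumption that each pair costs the same regardless of the illumination: the full-gradient method needs four pairs in each of its 10 iterations, the stochastic method one pair in each of its 30 iterations, so that

$$\frac{\text{RTE solve pairs (stochastic)}}{\text{RTE solve pairs (full gradient)}} = \frac{30 \times 1}{10 \times 4} = \frac{3}{4}.$$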
Novel MULL formulation of QPAT (35): The multilinear approach avoids solving the RTE by minimizing (38) or (39). In both cases, one selects an arbitrary functional and performs a steepest descent step, resulting in an iterative scheme for the variables $\mu_a$, $\mu_s$, $\Phi$ and $H$. Recall that none of the partial gradients (40)–(48) requires solving the RTE (which is the most time-consuming part for the standard formulation of QPAT). In each iterative step, we take a random illumination number $i \in \{1, \dots, 4\}$ and a random functional number $\ell \in \{1, 2, 3\}$. The gradient step then consists of the update rule
$$\begin{pmatrix} \mu \\ \Phi_i \\ H_i \end{pmatrix}^{k+1} = \begin{pmatrix} \mu \\ \Phi_i \\ H_i \end{pmatrix}^{k} - s_k \cdot \nabla J_\ell\bigl((\mu, \Phi_i, H_i)^k\bigr). \tag{55}$$
Dykstra’s algorithm for smoothing the $\mu$ component is applied after each iterative step when $\ell \in \{1, 2\}$. Iteration (55) contains a gradient step for the RTE. Since one gradient step is not enough to obtain an appropriate approximation to the solution of the transport equation, we apply iteration (55) 40 times whenever $\ell = 1$ is chosen. In this situation, we apply the Dykstra iteration in the $\mu$ component after these 40 iteration steps, whereas the positivity projection is done in every step. Flowcharts of the stochastic gradient algorithms (standard and MULL formulations) are shown in Figure 5. For the projected stochastic gradient method, regularization of $\mu$ is done by incorporating the regularization functional $J_4$ in the random choice of functionals; see (38). The positivity restriction is realized with the cut projection $P_D(\mu) = \max\{0, \mu\}$ applied after every iterative step.
Figure 6 shows reconstructions with the stochastic gradient methods for the novel MULL formulation of QPAT (35). The results in the top row are for the MULL-proximal stochastic gradient algorithm with $\lambda = 2 \times 10^{-8}$, and the bottom row shows results for the MULL-projected stochastic gradient method with $\lambda = 2 \times 10^{-8}$. In both cases, we used 1000 iterations.
Remark 1.
Note that in the stochastic gradient methods for the novel MULL formulation of QPAT, calculating the matrix-vector product $M^{(h)} \cdot \Phi$ is the most costly part. In contrast, the standard formulation (19) requires the solution of the system $M^{(h)} c^{(h)} = b^{(h)}$. Since the matrix $M^{(h)}$ is sparse only in its spatial domain, this is very time-consuming. On the other hand, the matrix $M^{(h)}$ (which is a discretization of $\theta \cdot \nabla_x + \mu_a + \mu_s (I - K)$) has a simple dependence on the variables $\mu_a$, $\mu_s$. We therefore can compute the velocity entries of $M^{(h)}$ at the beginning of the iterative process to save computation time.
The reconstruction times for the final reconstructions using all methods described above are shown in Table 1. For the standard formulation of QPAT, the reconstruction times seem to be in accordance with reported results using gradient or Newton-type methods for QPAT (see, for example, [48]). The methods based on the new MULL formulation (35) (after 1000 iterations) are faster than the methods based on the standard formulation (19) of QPAT (after 10, respectively, 40 iterations). From the relative reconstruction errors shown in Figure 4 and Figure 6, one notices that, in contrast to the methods based on (19), the methods using the MULL formulation could even be stopped much earlier while still obtaining a comparable reconstruction quality. We roughly estimate a speedup by a factor of 10 when using the novel MULL formulation instead of the standard formulation of QPAT.
In Figure 7, we show results for noisy data using the proximal gradient method based on the standard formulation (19) (top) and the proximal stochastic gradient method using the MULL formulation for QPAT (bottom). The regularization parameter is chosen as in the exact data case. Finally, in Figure 8, we show reconstruction results using only two consecutive illuminations applied from the top and from the left with noisy data. We use 10 iterations for the proximal gradient algorithm based on (19) (shown in the left image in Figure 8) and 500 iterations for the stochastic gradient algorithms based on the MULL formulation (35) (shown in the right image in Figure 8).

6. Conclusions

In this paper, we developed efficient proximal stochastic gradient methods for image reconstruction in multi-source QPAT. We used the RTE as an accurate model for light transport and employed the single stage approach for QPAT introduced in [18]. One class of the proximal stochastic gradient methods has been developed based on the standard formulation for QPAT given in (19). Additionally, we developed another class using proximal stochastic gradient methods for the new MULL formulations (35) and (36) for QPAT. Besides proposing proximal stochastic gradient methods for QPAT, we also consider the formulations (35) and (36) as the main contributions of the present article. These new formulations avoid the time-consuming evaluation of the RTE at each iteration and allow for treating the QPAT problem as a constrained optimization problem, which enables the use of a variety of numerical algorithms. Here, we used a penalty approach in combination with stochastic gradient methods for the solution. Future work will be done in the direction of developing new algorithms based on (35) and (36). Additionally, we will investigate the use of different regularization terms in (36). Finally, the theoretical convergence analysis of proximal gradient algorithms and other iterative algorithms for solving (35) will be the subject of future research.

Acknowledgments

Markus Haltmeier acknowledges support of the Austrian Science Fund (FWF), project P 30747. Simon Rabanser is a recipient of a DOC Fellowship of the Austrian Academy of Sciences (OEAW) and acknowledges support of the OEAW for his PhD project. We thank the referees for helpful comments that increased the readability of this manuscript, as well as for pointing out some interesting references to us.

Author Contributions

Markus Haltmeier, Lukas Neumann and Simon Rabanser developed the reconstruction algorithms and the numerical implementation; Simon Rabanser performed the numerical experiments; Markus Haltmeier, Lukas Neumann and Simon Rabanser wrote the paper; Markus Haltmeier and Lukas Neumann supervised the project.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Beard, P. Biomedical photoacoustic imaging. Interface Focus 2011, 1, 602–631.
  2. Wang, L.V. Multiscale photoacoustic microscopy and computed tomography. Nat. Photonics 2009, 3, 503–509.
  3. Agranovsky, M.; Kuchment, P.; Kunyansky, L. On reconstruction formulas and algorithms for the thermoacoustic tomography. In Photoacoustic Imaging and Spectroscopy; Wang, L.V., Ed.; CRC Press: Boca Raton, FL, USA, 2009; pp. 89–101.
  4. Burgholzer, P.; Bauer-Marschallinger, J.; Grün, H.; Haltmeier, M.; Paltauf, G. Temporal back-projection algorithms for photoacoustic tomography with integrating line detectors. Inverse Probl. 2007, 23, S65–S80.
  5. Haltmeier, M. Universal inversion formulas for recovering a function from spherical means. SIAM J. Math. Anal. 2014, 46, 214–232.
  6. Haltmeier, M.; Nguyen, L.V. Analysis of iterative methods in photoacoustic tomography with variable sound speed. SIAM J. Imaging Sci. 2017, 10, 751–781.
  7. Haltmeier, M.; Schuster, T.; Scherzer, O. Filtered backprojection for thermoacoustic computed tomography in spherical geometry. Math. Methods Appl. Sci. 2005, 28, 1919–1937.
  8. Huang, C.; Wang, K.; Nie, L.; Wang, L.V.; Anastasio, M.A. Full-wave iterative image reconstruction in photoacoustic tomography with acoustically inhomogeneous media. IEEE Trans. Med. Imaging 2013, 32, 1097–1110.
  9. Nguyen, L.V.; Kunyansky, L.A. A Dissipative Time Reversal Technique for Photoacoustic Tomography in a Cavity. SIAM J. Imaging Sci. 2016, 9, 748–769.
  10. Rosenthal, A.; Ntziachristos, V.; Razansky, D. Acoustic Inversion in Optoacoustic Tomography: A Review. Curr. Med. Imaging Rev. 2013, 9, 318–336.
  11. Ammari, H.; Bossy, E.; Jugnon, V.; Kang, H. Reconstruction of the Optical Absorption Coefficient of a Small Absorber from the Absorbed Energy Density. SIAM J. Appl. Math. 2011, 71, 676–693.
  12. Bal, G.; Jollivet, A.; Jugnon, V. Inverse transport theory of photoacoustics. Inverse Probl. 2010, 26, 025011.
  13. Bal, G.; Ren, K. Multi-source quantitative photoacoustic tomography in a diffusive regime. Inverse Probl. 2011, 27, 075003.
  14. Chen, J.; Yang, Y. Quantitative photo-acoustic tomography with partial data. Inverse Probl. 2012, 28, 115014.
  15. Cox, B.T.; Arridge, S.A.; Beard, P.C. Gradient-Based Quantitative Photoacoustic Image Reconstruction for Molecular Imaging. Proc. SPIE 2007, 6437, 64371T.
  16. Cox, B.T.; Arridge, S.R.; Köstli, P.; Beard, P.C. Two-dimensional quantitative photoacoustic image reconstruction of absorption distributions in scattering media by use of a simple iterative method. Appl. Opt. 2006, 45, 1866–1875.
  17. Cox, B.T.; Laufer, J.G.; Arridge, S.R.; Beard, P.C. Quantitative spectroscopic photoacoustic imaging: A review. J. Biomed. Opt. 2012, 17, 0612021.
  18. Haltmeier, M.; Neumann, L.; Rabanser, S. Single-stage reconstruction algorithm for quantitative photoacoustic tomography. Inverse Probl. 2015, 31, 065005.
  19. Haltmeier, M.; Neumann, L.; Nguyen, L.V.; Rabanser, S. Analysis of the Linearized Problem of Quantitative Photoacoustic Tomography. arXiv, 2017; arXiv:1702.04560.
  20. Kruger, R.A.; Lui, P.; Fang, Y.R.; Appledorn, R.C. Photoacoustic Ultrasound (PAUS)–Reconstruction Tomography. Med. Phys. 1995, 22, 1605–1609.
  21. Mamonov, A.V.; Ren, K. Quantitative photoacoustic imaging in radiative transport regime. Commun. Math. Sci. 2014, 12, 201–234.
  22. Naetar, W.; Scherzer, O. Quantitative Photoacoustic Tomography with Piecewise Constant Material Parameters. SIAM J. Imaging Sci. 2014, 7, 1755–1774.
  23. Ren, K.; Gao, H.; Zhao, H. A hybrid reconstruction method for quantitative PAT. SIAM J. Imaging Sci. 2013, 6, 32–55.
  24. Rosenthal, A.; Razansky, D.; Ntziachristos, V. Fast Semi-Analytical Model-Based Acoustic Inversion for Quantitative Optoacoustic Tomography. IEEE Trans. Med. Imaging 2010, 29, 1275–1285.
  25. Tarvainen, T.; Cox, B.T.; Kaipio, J.P.; Arridge, S.A. Reconstructing absorption and scattering distributions in quantitative photoacoustic tomography. Inverse Probl. 2012, 28, 084009.
  26. Yao, L.; Sun, Y.; Jiang, H. Transport-based quantitative photoacoustic tomography: Simulations and experiments. Phys. Med. Biol. 2010, 55, 1917–1934.
  27. Arridge, S.R. Optical tomography in medical imaging. Inverse Probl. 1999, 15, R41–R93.
  28. Dautray, R.; Lions, J. Mathematical Analysis and Numerical Methods for Science and Technology; Springer: Berlin, Germany, 1993.
  29. Egger, H.; Schlottbom, M. Numerical methods for parameter identification in stationary radiative transfer. Comput. Optim. Appl. 2015, 62, 67–83.
  30. Kanschat, G. Solution of radiative transfer problems with finite elements. In Numerical Methods in Multidimensional Radiative Transfer; Springer: Berlin, Germany, 2009; pp. 49–98.
  31. Gao, H.; Feng, J.; Song, L. Limited-view multi-source quantitative photoacoustic tomography. Inverse Probl. 2015, 31, 065004.
  32. Engl, H.W.; Hanke, M.; Neubauer, A. Regularization of Inverse Problems; Springer Science & Business Media: Berlin, Germany, 1996.
  33. Kaltenbacher, B.; Neubauer, A.; Scherzer, O. Iterative Regularization Methods for Nonlinear Ill-Posed Problems; Walter de Gruyter: Berlin, Germany, 2008.
  34. Scherzer, O.; Grasmair, M.; Grossauer, H.; Haltmeier, M.; Lenzen, F. Variational Methods in Imaging; Applied Mathematical Sciences; Springer: New York, NY, USA, 2009.
  35. Combettes, P.L.; Wajs, V.R. Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 2005, 4, 1168–1200.
  36. Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces; Springer: Berlin, Germany, 2011.
  37. Bertsekas, D.P. Incremental gradient, subgradient, and proximal methods for convex optimization: A survey. In Optimization for Machine Learning; Sra, S., Nowozin, S., Wright, S.J., Eds.; The MIT Press: London, UK, 2012.
  38. Bertsekas, D.P. Incremental proximal methods for large scale convex optimization. Math. Program. 2011, 129, 163–195.
  39. Xiao, L.; Zhang, T. A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 2014, 24, 2057–2075.
  40. Duchi, J.; Singer, Y. Efficient online and batch learning using forward backward splitting. J. Mach. Learn. Res. 2009, 10, 2899–2934.
  41. Li, H.; Haltmeier, M. The Averaged Kaczmarz Iteration for Solving Inverse Problems. arXiv, 2017; arXiv:1709.00742.
  42. Pereyra, M.; Schniter, P.; Chouzenoux, E.; Pesquet, J.-C.; Tourneret, J.Y.; Hero, A.O.; McLaughlin, S. A survey of stochastic simulation and optimization methods in signal processing. IEEE J. Sel. Top. Signal Process. 2016, 10, 224–241.
  43. De Cezaro, A.; Haltmeier, M.; Leitão, A.; Scherzer, O. On steepest-descent-Kaczmarz methods for regularizing systems of nonlinear ill-posed equations. Appl. Math. Comput. 2008, 202, 596–607.
  44. Haltmeier, M.; Leitao, A.; Scherzer, O. Kaczmarz methods for regularizing nonlinear ill-posed equations I: convergence analysis. Inverse Probl. Imaging 2007, 1, 289–298.
  45. Haltmeier, M.; Kowar, R.; Leitao, A.; Scherzer, O. Kaczmarz methods for regularizing nonlinear ill-posed equations II: Applications. Inverse Probl. Imaging 2007, 1, 507–523.
  46. Alberti, G.; Ammari, H. Disjoint sparsity for signal separation and applications to hybrid inverse problems in medical imaging. Appl. Comput. Harmon. Anal. 2017, 42, 319–349.
  47. Ammari, H.; Garnier, J.; Kang, H.; Nguyen, L.; Seppecher, L. Multi-Wave Medical Imaging, Mathematical Modelling & Imaging Reconstruction; World Scientific Publishing: London, UK, 2017.
  48. Saratoon, T.; Tarvainen, T.; Cox, B.T.; Arridge, S.R. A gradient-based method for quantitative photoacoustic tomography using the radiative transfer equation. Inverse Probl. 2013, 29, 075006.
  49. Wang, C.; Zhou, T. On iterative algorithms for quantitative photoacoustic tomography in the radiative transport regime. Inverse Probl. 2017, 33, 115006.
  50. Egger, H.; Schlottbom, M. Stationary radiative transfer with vanishing absorption. Math. Models Methods Appl. Sci. 2014, 24, 973–990.
  51. Finch, D.; Patch, S.K.; Rakesh. Determining a function from its mean values over a family of spheres. SIAM J. Math. Anal. 2004, 35, 1213–1240.
  52. Finch, D.; Haltmeier, M.; Rakesh. Inversion of spherical means and the wave equation in even dimensions. SIAM J. Appl. Math. 2007, 68, 392–412.
  53. Combettes, P.L.; Pesquet, J.C. Proximal splitting methods in signal processing. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering; Springer: New York, NY, USA, 2011; pp. 185–212.
  54. Ito, K.; Kunisch, K. Lagrange Multiplier Approach to Variational Problems and Applications; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2008.
  55. John, F. Partial Differential Equations, 4th ed.; Applied Mathematical Sciences; Springer: New York, NY, USA, 1982.
  56. Evans, L.C. Partial Differential Equations; Graduate Studies in Mathematics; American Mathematical Society: Providence, RI, USA, 1998.
Figure 1. Basic principles of PAT. Left: the investigated object is illuminated with a short optical pulse; Middle: due to the thermoelastic effect, the absorbed light distribution induces an acoustic pressure wave depending on internal tissue properties; Right: the acoustic pressure wave is measured outside the object and used to reconstruct an image of the interior.
Figure 2. (a) The phantom is defined on the square $\Omega = [-1\,\mathrm{cm}, 1\,\mathrm{cm}]^2$ and the acoustic pressure is measured on a semi-circle on the side of the illumination; (b) the simulated pressure data correspond to the phantom and the illumination in (a) and are represented as a gray-scale density plot.
Figure 3. Absorption coefficient distribution of the tissue sample used for the numerical examples. Background absorption of the tissue is taken as $\mu_a = 0.3\,\mathrm{cm}^{-1}$, the blue obstacles have $\mu_a = 1\,\mathrm{cm}^{-1}$ and the red stripes $\mu_a = 2\,\mathrm{cm}^{-1}$. The area between the red stripes has absorption coefficient $\mu_a = 0.5\,\mathrm{cm}^{-1}$. The scattering coefficient is constant in the whole sample and chosen to be $\mu_s = 3\,\mathrm{cm}^{-1}$. Illuminations are applied consecutively from top, right, bottom and left. The corresponding boundary sources are given by $q_o(x, \theta) = \delta(\theta - \theta_i)\,\chi_i(x) \cdot 1\,\mathrm{mJ\,cm^{-1}}$. The x- and y-axis cover $[-1\,\mathrm{cm}, 1\,\mathrm{cm}]$.
Figure 4. Reconstruction results based on standard formulation (19). Top: proximal gradient method; Bottom: proximal stochastic gradient method. The left images show the relative reconstruction errors of the reconstructed absorption coefficient as a function of the number of iterations, whereas the right pictures show the result after the final iteration. (The phantom is as described in Figure 3.)
Figure 5. Flowcharts of the stochastic gradient algorithms for QPAT proposed in this paper. Left: algorithm based on the standard formulation (19). Right: algorithm based on the novel MULL formulation (35). The update (54) using the standard formulation requires solving the forward RTE and the adjoint RTE, which is not required by (55) with the MULL formulation. Simulations are performed with N = 4.
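The structure of a proximal stochastic gradient iteration, as summarized in the flowcharts of Figure 5, can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions, not the implementation used in the paper: small hand-picked linear operators stand in for the per-source forward operators (no RTE is solved), the proximal step is taken to be the projection onto a box constraint, and all names (`OPS`, `prox_box`, `prox_stochastic_gradient`, `relative_error`) are hypothetical. The four unknowns reuse the absorption values of the phantom in Figure 3.

```python
import random

# Toy stand-ins for N = 4 per-source forward operators (assumption: linear, 2 x 4).
OPS = [
    [[1.0, 0.5, 0.0, 0.0], [0.0, 1.0, 0.5, 0.0]],
    [[0.0, 0.0, 1.0, 0.5], [0.5, 0.0, 0.0, 1.0]],
    [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]],
    [[0.5, 0.0, 1.0, 0.0], [0.0, 0.5, 0.0, 1.0]],
]
X_TRUE = [0.3, 1.0, 2.0, 0.5]  # absorption values taken from the phantom description


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[k][j] * r[k] for k in range(len(A))) for j in range(len(A[0]))]


def prox_box(x, lo, hi):
    """Proximal map of the indicator of [lo, hi]^n, i.e. the projection onto the box."""
    return [min(max(v, lo), hi) for v in x]


def prox_stochastic_gradient(ops, data, x0, step, n_iter, lo=0.0, hi=10.0, seed=0):
    """Each iteration uses ONE randomly chosen source instead of all of them."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_iter):
        i = rng.randrange(len(ops))                   # pick a source at random
        residual = [p - d for p, d in zip(matvec(ops[i], x), data[i])]
        grad = rmatvec(ops[i], residual)              # gradient of 0.5 * ||A_i x - y_i||^2
        x = prox_box([xj - step * g for xj, g in zip(x, grad)], lo, hi)
    return x


def relative_error(x, x_true):
    """Relative Euclidean error, the quantity plotted over the iterations."""
    num = sum((a - b) ** 2 for a, b in zip(x, x_true)) ** 0.5
    den = sum(b ** 2 for b in x_true) ** 0.5
    return num / den


# Noise-free synthetic data, one data vector per source.
DATA = [matvec(A, X_TRUE) for A in OPS]

x_rec = prox_stochastic_gradient(OPS, DATA, x0=[0.3] * 4, step=0.5, n_iter=500)
print("relative error:", relative_error(x_rec, X_TRUE))
```

Because every iteration evaluates only one of the four operators, its cost is roughly a quarter of a full gradient step in this toy setting; the step size 0.5 was chosen by hand for these particular operators.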
Figure 6. Reconstruction results based on the novel MULL formulation (35). Top: MULL-proximal stochastic gradient method based on the decomposition (38). Bottom: MULL-projected stochastic gradient method based on the decomposition (39). The left images show the relative reconstruction errors of the reconstructed absorption coefficient as a function of the number of iterations, whereas the right pictures show the results after the last iterations. (The phantom is as described in Figure 3.)
Figure 7. Reconstruction results from noisy data. Top: Proximal gradient method based on (19). Bottom: MULL-proximal stochastic gradient method. The left images show the relative reconstruction errors of the reconstructed absorption coefficient as a function of the number of iterations, whereas the right pictures show the results after the last iterations. (The phantom is as described in Figure 3.)
Figure 8. Reconstruction results from noisy data for two illuminations. Left: proximal gradient method based on (19) using 10 iterations. Right: MULL-proximal stochastic gradient method using only 500 iterations. The phantom is as described in Figure 3 and, for both reconstruction methods, we use two consecutive illuminations (from the top and from the left). The reconstruction time was about 14 h for the method based on the standard formulation (19) and 3 h for the proposed MULL-proximal stochastic gradient method.
Table 1. Reconstruction times for all methods. Recall that one iteration of the proximal stochastic gradient method is approximately four times cheaper than one iteration of the full proximal gradient method (both based on (19)). Further recall that one step in the methods based on the MULL formulation (35) is much less time consuming than for the methods based on (19); see Remark 1.
| Algorithm | Model | Update | No. Iterations | Reconstruction Time |
|---|---|---|---|---|
| Proximal gradient | (19) | (23) | 10 | 27.2 h |
| Proximal stochastic gradient | (19) | (31) | 30 | 24.4 h |
| MULL-proximal stochastic gradient | (35) | (49) | 1000 | 14.7 h |
| MULL-projected stochastic gradient | (35) | (50) | 1000 | 11.9 h |
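To make the comparison in Table 1 concrete, the total times can be converted into average times per iteration. The short sketch below does only this arithmetic on the values listed in the table; the names `TABLE_1` and `hours_per_iteration` are hypothetical, and the averages necessarily include any overhead that is not spent inside the iterations themselves.

```python
# Values copied from Table 1: (algorithm, number of iterations, total time in hours).
TABLE_1 = [
    ("Proximal gradient", 10, 27.2),
    ("Proximal stochastic gradient", 30, 24.4),
    ("MULL-proximal stochastic gradient", 1000, 14.7),
    ("MULL-projected stochastic gradient", 1000, 11.9),
]


def hours_per_iteration(table):
    """Average wall-clock time per iteration, in hours, for each algorithm."""
    return {name: total / n_iter for name, n_iter, total in table}


per_iter = hours_per_iteration(TABLE_1)
for name, hours in per_iter.items():
    print(f"{name}: {hours * 60:.2f} min per iteration")

# Ratio between one full proximal gradient step and one stochastic step (both on (19)).
ratio = per_iter["Proximal gradient"] / per_iter["Proximal stochastic gradient"]
print(f"full / stochastic step cost: {ratio:.2f}")
```

With these numbers, a full proximal gradient step averages 2.72 h, and the measured ratio to a stochastic step is about 3.3, which is in rough agreement with the factor of four expected for N = 4 sources. A step of either MULL-based method averages under one minute.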

Share and Cite

MDPI and ACS Style

Rabanser, S.; Neumann, L.; Haltmeier, M. Stochastic Proximal Gradient Algorithms for Multi-Source Quantitative Photoacoustic Tomography. Entropy 2018, 20, 121. https://doi.org/10.3390/e20020121
