*2.1. Analytical Reconstruction*

In a PACT system, after laser excitation, acoustic waves are generated. The waves are collected by ultrasonic transducers all around the imaging target, stored in a computer by a data acquisition (DAQ) card, and processed in a reconstruction algorithm. The acoustic wave generation is based on thermoelastic expansion effect. Based on the principle of acoustic theories, in a homogeneous medium, the pressure at position *r* and time *t*, *p*(*<sup>r</sup>*, *t*), follows the thermoacoustic equation shown in Equation (1). In this equation, the initial pressure, *p*0(*r*), is generated by a short pulse, which can be mathematically considered as a delta function (*δ*(t)) [20].

$$\nabla^2 p(r, t) - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} p(r, t) = -p\_0(r) \frac{d\delta(t)}{dt} \tag{1}$$

where *c* indicates the speed of sound. The pressure at (*r*, *t*), generated from the initial pressure at *r*, is given in Equation (2), known as the Forward Problem.

$$p(r,t) = \frac{\partial}{\partial t} \left[ \frac{1}{4\pi c^3 t} \int dr' p\_0(r') \delta(t - \frac{|r - r'|}{c}) \right]. \tag{2}$$

In PACT, we look for initial pressure (*p*0(*r*)) calculation using the pressure measured at different view angles/times (*p*(*<sup>r</sup>*,*<sup>t</sup>*)). To this end, Equation (3), known as the Backward Problem, would be used [27].

$$p\_0(r) = \int \left[2p(r\_0, \bar{t}) - 2\bar{t}\frac{\partial}{\partial \bar{t}} p(r\_0, \bar{t})\right] \frac{d\Omega\_0}{\Omega\_0} \tag{3}$$

where ¯*t* = |*r* − *<sup>r</sup>*0|, and *d*Ω0 Ω0 denotes the weight that must be allocated to the detected pressures by transducers [27]. A favorite reconstruction algorithm for the PACT system could be the one that generates a high-quality image with fewer number of view angles. Analytical algorithms such as BP and TV-based have an inherent limitation (necessity of a large number of view angles around the target object) for accurate estimation of the optical absorption. In other words, such algorithms cause artifacts in the reconstructed images when a limited number of view angles is available in the PACT system. One solution to address this problem is a model-based algorithm in which artifacts and noise are iteratively degraded using a model. It should be noted that reconstruction algorithms are mostly simplified, and the effects of transducer size, imaging noise, and directivity of sensors are not considered in the reconstruction procedure. All these assumptions lead to negative effects called artifacts in this paper.

#### *2.2. The Proposed Method*

We model the procedure of PACT using *b* = *Ax*, where *A* is the measurement matrix, and *x* is the assumed phantom in the forward problem and the reconstructed image in the backward problem. *b* is the detected signals by the ultrasound transducers. When the number of unknowns (number of pixels of the PA image) is greater than the number of equations (number of recorded samples), which usually happens in PACT, especially when a low number of view angles is used for decreasing the data acquisition time, we have an underdetermined system. In this case, there is no exact solution.

One commonly used method to solve an underdetermined problem is the least square technique, giving *xest* = (*ATA*)(−<sup>1</sup>)*ATb*. The least square method uses the error minimization to obtain a solution, but the answer is not the best and the most accurate one that can be obtained. The sparse component analysis and sparse representation of signals can be used to solve the underdetermined problem of the PA image reconstruction. Having a prior knowledge, we can promote sparsity using *l*0, *l*1, and *lp* (0 < *p* < 1) norms to obtain a more accurate solution [30]. In other words, we can improve the image quality by assuming that the imaging target is sparse.

Targets in PA imaging are composed of many singularities. A singularity is the point of an exceptional set where it fails to be well-behaved in differentiability. These singularities can be considered as non-zero values of *x*. Such an assumption in the procedure of the backward problem leads to a more accurate solution in comparison with algorithms that use error minimization. The representation of the backward problem is given in Equation (4).

$$\min\_{\mathbf{x}} J(\mathbf{x}) \quad \text{s.t.} \quad b = A\mathbf{x} \tag{4}$$

where *J*(*x*) is the prior knowledge function that is used to promote sparsity. The basis pursuit method uses the *l*1-norm to sparsify the problem and transfers it into a linear or quadratic equation (see Equation (5)).

$$\min\_{\mathbf{x}} \frac{1}{2} \|\mathbf{b} - A\mathbf{x}\|\_2^2 + \lambda \|\mathbf{x}\|\_1 \tag{5}$$

where *λ* is the scalar regularization parameter, and .<sup>1</sup> and .<sup>2</sup> indicate the *l*1-norm and *l*2-norm, respectively. The first and second terms of Equation (5) represent the error of estimation and level of sparsity of estimated *x*, respectively. An effective sparsity method should represent all of the most important features of the PA images. Considering the diagnostically relevant features in a PA image, significant features can be edges, singular points, and homogenous texture.

A standard transform function can be used for signal sparsification. A single transformation, however, can well represent only one of the major features. In order to obtain an image having all these features, we propose combining some of the well-known standard transform functions. Although WT's magnitude will not oscillate around singularities, and it uses continuous transform to characterize the oscillations and discontinuities, the images generated by WT are blurred. DCT helps separate the image into spectral sub-bands of differing importance. In addition, it preserves homogeneous textures better than WT. However, it provides some blocking artifacts in the image. The blocking artifact is a distortion that appears as abnormally large pixel blocks. Therefore, WT and DCT cannot capture two-dimensional singularities, i.e., curves and edges in an image. On the other hand, the TV is an operator that works based on the local variations in a signal. It well preserves the edges in the image. However, the artifacts in the initial image introduced by limited view angles significantly affect the performance of TV [37]. As illustrated, all the three mentioned transformation functions have some benefits and disadvantages. We therefore propose to optimally use the combination of basis functions of WT, DCT, and TV, where WT captures point-like features, DCT captures homogeneous texture components, and TV sharpens the edges and reduces other artifacts without eliminating essential information of the image. In this way, due to the properties of the imaging target, which directly affect the PA reconstructed image generated in the first attempt, the reconstruction procedure is updated, moving toward a high image quality. The procedure of the proposed method can be seen in Figure 1.

The signal (*b*) can be decomposed to three sub-signals (*b*1, *b*2, and *b*3), each emphasizing a feature in the image, utilizing the morphological component analyses (MCA). Having three sub-signal and three sparse representation methods, MCA assumes that, for each sub-signal, there is a corresponding sparse representation which makes the sub-signal sparser than others. The proposed method is described in Equations (6) and (7), which are the expansion of Equation (5) for the MCA concept.

$$\min\_{\mathbf{x}} \frac{1}{2} \|\mathbf{b} - (\mathbf{b}\_1 + \mathbf{b}\_2 + \mathbf{b}\_3)\|\_2^2 + \alpha \|\psi\_w \mathbf{x}\|\_1 + \beta \|\psi\_{DCT} \mathbf{x}\|\_1 + \gamma TV(\mathbf{x}) \tag{6}$$

$$TV(\mathbf{x}) = \int |D\mathbf{x}|\tag{7}$$

where *x* is an initial image, obtained from BP algorithm, and *ψw* and *ψDCT* are WT and DCT transform operators, respectively. *D* is the gradient operator. *α*, *β*, and *γ* are the weight factors for WT, DCT, and TV transform operators, respectively. It should be noted that, in each basis, the emphasis is on one major feature and other features are less signified. These features are preserved as shown in Equation (6).

**Figure 1.** Block diagram of the proposed method. All the italic letters have been described in the text.

#### Paradigm of the Proposed Method

The proposed reconstruction method, demonstrated in Figure 1, is an iterative algorithm with the raw data acquired by transducer/s as the input and the reconstructed image as the output. The principle of the algorithm is as follows. First, we reconstruct an image (*x*) using the traditional BP method (considered as the initial image). The image reconstructed by the BP algorithm and the signal used within the BP algorithm (for reconstruction) are considered as the input of the proposed method. Here, *b* indicates the raw data, detected by the US transducer/s. Using the forward problem, shown in Equation (2), and the initial image (*x*), we calculate the raw data that could be used to obtain the initial image *x*. It should be noted that the initial image is obtained by the recorded signals using US transducer/s, but using the forward equation presented in Figure 1, we calculate the signal which could be as the initial signal, ignoring the recorded data. As the next step, we differentiate the recorded signals by the calculated signals. The result of differentiation is called *d*1. By applying *l*1-norm to *d*1, *f*1 is generated. The *f*1 indicates the power of the error signal. In addition, we calculate an error image, *G*1, from the error signal by applying the backward problem illustrated in Equation (3). Simultaneously, the system applies the proposed combinational sparsifying method on *x* to obtain an image. This image is called *G*2, as shown in Figure 1. At this step, the signals that could be used for generation of *G*2 is calculated using the forward problem, called *f*2. By applying *l*1-norm to *f*2, *f*3 is generated as the power of *f*2. Therefore, summation of *f*1 and *f*3 (i.e., *f* ) is a parameter that shows the difference between the real image and what we have estimated. If *f* does not meet the desired margin error, *x* should be updated. The updating process can be done by subtracting a portion of *G* from *x*, i.e., *δt*, iteration step.

Big step accelerates the speed of the program and reduces the execution time, but it may increase the error and lead to an unstable loop. If *f* meets the desired margin error, the last updated *x* would be the final image at this first step. The number of the steps is selected based on the trade-off between the execution time and the performance of the final reconstructed image. In our algorithm, we set the number of steps to 50. By allocating appropriate weights to each basis function, coefficients corresponding to dominant features will receive higher gain and those corresponding to less critical features will receive a lower gain.

It should be noted that the weighting factors are determined based on what diagnostically relevant features need to be emphasized in the image, i.e., edges, singular points, and homogeneous textures. A statistical texture analysis method, the Gray-Level Co-Occurrence Matrix (GLCM) technique, has been used to measure the homogeneous texture components [38]. GLCM is used in 0, 45, 90, and 135 degree directions. Singular points are small local areas that the signal values in these areas are changed in two dimensions. The Harris operator, by providing an analytical autocorrelation, considering the local intensity changes in different directions, has been used to detect singular points [39]. We used the canny filter, a first-order image edge detector, to determine the weighting factor TV (which signifies the edges). It should be noted that the main justification for the proposed method is to utilize the advantages of the overlapping methods to improve the image quality in limited view PACT systems. As mentioned, using only one sparsifying function leads to information loss. By combining the three basis, all the information loss in one of them can be mitigated using another one, leading to a higher image quality, compared to using any one of them.
