**2. Methodology**

### *2.1. Parabolic Radon Transform*

The fundamental strategy of multiple suppression based on Radon transform is to use the difference in kinematic features between primary and multiple reflections. The velocity between primary and multiple reflections is used to perform normal movement correction of the seismic data of CMP gather, and then the seismic events are more like parabolas [6]. Therefore, after the parabolic Radon transform, the seismic events with different curvatures can be mapped to different regions of the Radon domain to realize the separation of primary and multiple reflections.

The sum trajectories of parabolic Radon transform follow a parabola, and its definition is as follows:

$$m\left(\mathbf{r}, q\_{\rangle}\right) = \sum\_{j=0}^{N\_x} d\left(\mathbf{t} = \mathbf{r} + q\_{\rangle}\mathbf{x}^2, \mathbf{x}\right) \tag{1}$$

$$d(t, \mathbf{x}) = \sum\_{j=0}^{N\_q} m\left(\mathbf{r} = t - q\_j \mathbf{x}^2, q\_j\right) \tag{2}$$

where *d*(*t*, *x*) is seismic data, *m* - *τ*, *qj* is Radon domain data after parabolic Radon transform, *x* is offset distance, *t* is time, *q* is curvature parameter and τ is intercept time. The meaning of parabolic Radon transform is that the seismic events having a parabolic shape in the time domain are mapped to a point in the Radon domain.

For the sake of efficiency, Radon transform can be transformed via the time variable to the frequency domain, as shown in Formulations (3) and (4):

$$M(q\_{j'}f) = \sum\_{k=1}^{N\_{\mathfrak{X}}} D(\mathbf{x}\_{k'}f) e^{i2\pi f q\_j \mathbf{x}\_k^2} \tag{3}$$

$$D(\mathbf{x}\_{k\prime}f) = \sum\_{j=1}^{N\_q} M(q\_{j\prime}f)e^{-i2\pi f q\_j \mathbf{x}\_k^2} \tag{4}$$

where f is frequency, *k* = 1, 2, ··· , *Nx* and *j* = 1, 2, ··· , *Nq*

The matrix vector form of Formulations (3) and (4) can be represented by

$$m = A^H d\tag{5}$$

$$d = Am\tag{6}$$

where *d* is CMP gather and *m* is the matrix form of Radon domain data. The Radon operators *A* and *A<sup>H</sup>* of forward and inversion transform are defined as

$$\mathbf{A}^{H} = e^{i2\pi f q\_{\hat{f}} \mathbf{x}\_{\hat{k}}^{2}} \tag{7}$$

$$A = e^{-i2\pi f q\_\dagger \mathbf{x}\_k^2} \tag{8}$$

Due to the limited aperture and discretization leading to low resolution and aliasing, Hampson [6] proposed the least squares inversion method, and the solution is presented as

$$m = \left(A^H A\right)^{-1} A^H d\tag{9}$$

The solution of Tikhonov regularization [37] is

$$
\sigma \mathfrak{m} = \left(\mathbf{A}^H \mathbf{A} + \sigma \mathbf{I}\right)^{-1} \mathbf{A}^H \mathbf{d} \tag{10}
$$

where *σ* is a stable factor and *I* is the identity matrix.

The least squares parabolic Radon transform improves the resolution of the Radon domain data as well as the accuracy and focusing ability of the transformation. This method limits partial energy diffusion and ensures the consistency of the reconstructed data with the original data. However, for multiples and primaries with little difference in the time difference of normal moveout, this method cannot separate them without distortion, and there is still some energy overlap in the Radon domain. It is necessary to improve the parabolic Radon transform to make the solution of the Radon domain more sparse. Radon transform is regarded as a nonlinear inversion problem. The data transformation error is constrained by *L*2-norm, and the model sparsity is constrained by L-norm to improve the sparsity and resolution of the model. Therefore, *L*<sup>1</sup> regularization [20] and *Lq* regularization [25,27] are proposed to constrain data.
