*2.3. Periodic Noise Representation*

Because periodic noise is stationary, its basis function *ϕ*<sub>0</sub> is obtained by repeating the waveform *w* until it spans the length of a single seismic trace and then normalizing the result to unit energy:

$$\boldsymbol{\varphi} = [\boldsymbol{w}, \boldsymbol{w}, \dots, \boldsymbol{w}] \tag{7}$$

$$\boldsymbol{\varphi}_0 = \frac{\boldsymbol{\varphi}}{\|\boldsymbol{\varphi}\|_2} \tag{8}$$

where ‖·‖<sub>2</sub> is the *l*<sub>2</sub> norm. The size of *ϕ* is *N* × 1, which is equal to that of *S<sub>j</sub>*. The noise dictionary is constructed from basis functions of different phases:

$$\mathbf{D} = \begin{bmatrix} \boldsymbol{\varphi}_0 & \boldsymbol{\varphi}_1 & \boldsymbol{\varphi}_2 & \cdots & \boldsymbol{\varphi}_{T-1} \end{bmatrix} \tag{9}$$

where *ϕ*<sub>1</sub>, *ϕ*<sub>2</sub>, ..., *ϕ*<sub>*T*−1</sub> are obtained by cyclically shifting *ϕ*<sub>0</sub> by 1, 2, ..., *T* − 1 time samples. The size of **D** is *N* × *T*. The dictionary provides a sparse representation of periodic noise. In sparse representation theory [20], signals are expressed as linear combinations of prespecified basis functions, and the linear coefficients are sparse. Based on this theory, the mathematical model of our noise representation is

$$S_j = \mathbf{D} \mathbf{x} \tag{10}$$

where *S<sub>j</sub>* (*j* = 1, 2, ..., *m*) is the actual periodic noise and *x* is its coefficient vector with respect to the dictionary **D**. The size of *x* is *T* × 1. One approach to solving Equation (10) is to impose a sparsity constraint via an *L*<sub>0</sub> regularization term. The periodic noise of multi-trace seismic data is then obtained by solving the following optimization problem:

$$\tilde{S}_j = \underset{S_j}{\operatorname{argmin}} \left\| S_j - \mathbf{D} \mathbf{x} \right\|_2^2 \quad \text{s.t.} \quad \left\| \mathbf{x} \right\|_0 \le 1 \tag{11}$$

where *S̃<sub>j</sub>* (*j* = 1, 2, ..., *m*) is the approximate periodic noise and *x* is its coefficient vector with respect to the dictionary **D**. In this optimization problem, the sparsity of the representation is 1: because only one basis function corresponds to a single trace, the coefficient vector *x* has only one non-zero component. This is the reason for the constraint ‖*x*‖<sub>0</sub> ≤ 1 in Equation (11). Equation (11) is solved by the matching pursuit algorithm, which iteratively computes the inner products between the residual and the dictionary elements, updates the coefficient, and updates the residual [20,21]. Finally, the de-noised data are obtained by subtracting the periodic noise from the raw seismic data.
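
The construction in Equations (7)–(11) can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the function names `build_noise_dictionary` and `fit_periodic_noise`, the synthetic waveform, and all numerical values are assumptions chosen for illustration. The sketch also assumes that, because every dictionary column has unit norm and the sparsity is 1, matching pursuit reduces to a single step: pick the column with the largest absolute inner product with the trace and use that inner product as the coefficient.

```python
import numpy as np


def build_noise_dictionary(w, n_samples):
    """Tile waveform w to n_samples, normalize it, and stack its T cyclic shifts."""
    w = np.asarray(w, dtype=float)
    period = w.size                            # T, assumed to be the length of w
    reps = -(-n_samples // period)             # ceil(N / T)
    phi = np.tile(w, reps)[:n_samples]         # Eq. (7), truncated to N samples
    phi0 = phi / np.linalg.norm(phi)           # Eq. (8), unit l2 norm
    # Column k is phi0 cyclically shifted by k samples, Eq. (9); D is N x T.
    return np.column_stack([np.roll(phi0, k) for k in range(period)])


def fit_periodic_noise(trace, D):
    """Single matching-pursuit step, i.e. Eq. (11) with sparsity 1."""
    trace = np.asarray(trace, dtype=float)
    corr = D.T @ trace                         # inner products with every column
    k = int(np.argmax(np.abs(corr)))           # best-matching phase
    x = np.zeros(D.shape[1])
    x[k] = corr[k]                             # valid because columns have unit norm
    noise = D @ x                              # estimated periodic noise
    return noise, x, k


# Synthetic single-trace example (hypothetical values).
rng = np.random.default_rng(0)
T, N = 20, 400
t = np.arange(T)
w = np.sin(2 * np.pi * t / T) + 0.5 * np.sin(4 * np.pi * t / T + 0.3)
D = build_noise_dictionary(w, N)

signal = 0.01 * rng.standard_normal(N)         # stand-in for the useful signal
trace = signal + 5.0 * D[:, 7]                 # periodic noise with phase 7
noise, x, k = fit_periodic_noise(trace, D)
denoised = trace - noise                       # subtract the estimated noise
print("selected phase:", k, "coefficient:", x[k])
```

For multi-trace data the same fit would be repeated for each trace *S<sub>j</sub>*. Note that `np.roll` wraps around at the end of the trace, so the shifted columns are exactly periodic only when *N* is a multiple of *T*, as in this example.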
