**Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A. The Strategies to Determine** *N*′ **and** *K*

This appendix describes the strategies we follow to determine the values of *N*′ and *K*. The strategies are taken from [22], which gives a more complete description.

#### *Appendix A.1. The Determination of N′*

Supposing that *γ* is the slowness of the medium, *f* is the frequency, *N* is the number of sensors, *r*<sub>*ij*</sub> is the great-circle distance between two sensors *i* and *j*, and *r̄* is the typical separation of the array, defined as

$$\bar{r} = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j>i}^{N} r_{ij}, \tag{A1}$$

we can determine the value of *N*′ using

$$N'(f) = \min\{2\lceil 2\pi f \gamma \bar{r} \rceil + 1,\; N/2\},\tag{A2}$$

where ⌈*x*⌉ is the least integer greater than or equal to *x*.

Equation (A2) implies that *N*′ depends only on the frequency, the slowness of the medium, the typical separation, and the number of sensors used. Since the slowness of the medium is not known beforehand, a rough estimate of the average slowness is sufficient in practice, as suggested by Seydoux et al. [30] (1.1 s/km is used in this paper). Thus, if we choose gathers with the same shape, *N*′ is a function of frequency only.
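As an illustration, Equations (A1) and (A2) can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in this paper: the function names are hypothetical, the pairwise distances are assumed to be precomputed in kilometres, and *N*/2 is rounded down to an integer, which is an assumption the text does not state.

```python
import math
import numpy as np

def typical_separation(dist_km):
    """Typical separation of Equation (A1): mean of r_ij over all pairs i < j."""
    dist = np.asarray(dist_km, dtype=float)
    n = dist.shape[0]
    upper = np.triu_indices(n, k=1)          # index pairs with j > i
    return 2.0 * dist[upper].sum() / (n * (n - 1))

def n_prime(freq_hz, slowness_s_per_km, dist_km):
    """N'(f) of Equation (A2) for a symmetric matrix of pairwise distances."""
    n = np.asarray(dist_km).shape[0]
    r_bar = typical_separation(dist_km)
    bound = 2 * math.ceil(2.0 * math.pi * freq_hz * slowness_s_per_km * r_bar) + 1
    # Assumption: N/2 is floored so that N' is always an integer.
    return min(bound, n // 2)
```

For a fixed array geometry and a fixed slowness (for example the 1.1 s/km quoted above), calling `n_prime` over a range of frequencies gives *N*′ as a function of frequency alone.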

#### *Appendix A.2. The Determination of K*

A statistical hypothesis test is used to determine *K*. At each frequency *f*, the eigenvalues {*λ̂*<sub>1</sub>, ..., *λ̂*<sub>*N*−1</sub>} of **R̂** are tested sequentially, at each step *k*, using the statistic

$$
\tau(k) = \frac{\hat{\lambda}_k}{\bar{\sigma}_k},\tag{A3}
$$

with

$$
\bar{\sigma}_k = \frac{\sum_{i=k}^{N'} \hat{\lambda}_i}{N' - k + 1}. \tag{A4}
$$
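The statistic of Equations (A3) and (A4) can be evaluated for all steps *k* at once. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the eigenvalues are sorted in decreasing order and that only the first *N*′ of them enter the test.

```python
import numpy as np

def tau_statistic(eigvals, n_prime):
    """Return tau(k) for k = 1..N' following Equations (A3) and (A4)."""
    # Assumption: eigenvalues are ranked in decreasing order, first N' kept.
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1][:n_prime]
    if lam.size < n_prime:
        raise ValueError("fewer eigenvalues than N'")
    # tail_sum[k-1] = sum of lam_i for i = k..N'
    tail_sum = np.cumsum(lam[::-1])[::-1]
    # counts[k-1] = N' - k + 1
    counts = n_prime - np.arange(n_prime)
    sigma_bar = tail_sum / counts
    return lam / sigma_bar
```

Each entry compares one eigenvalue with the mean of itself and all smaller retained eigenvalues, so a flat spectrum gives values of 1 throughout.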

Note that *τ*(*k*) depends on the local noise characteristics of the chosen gather and may differ across gathers. The tested eigenvalue *λ̂*<sub>*k*</sub> is rejected and suppressed to an average level at a significance level *α* if

$$\tau(k) > w\, \mathrm{P}^{-1}_{\max|\hat{\mathbf{R}}_c^{N-k+1}}(1-\alpha),\tag{A5}$$

where P<sub>max|**R̂**<sub>*c*</sub><sup>*N*−*k*+1</sup></sub> is the empirical cumulative distribution of the largest eigenvalue of an (*N* − *k* + 1)-dimensional sample covariance matrix (SCM) **R̂**<sub>*c*</sub><sup>*N*−*k*+1</sup>, which can be pre-computed with 1000 Monte Carlo trials. A significance level of *α* = 0.05 is commonly used. The weight *w* satisfies 0 ≤ *w* ≤ 1 and influences the selection of *K*; the weight selection study in [22] can be consulted to choose an appropriate value. When the test stops rejecting, the current value of *k* is used as *K*. The statistical model of the SCM **R̂**<sub>*c*</sub> for a diffuse noise field is simulated using

$$
\hat{\mathbf{R}}_{c} = \frac{1}{M} \mathbf{R}_{c} \mathbf{X} \mathbf{X}^{H}, \tag{A6}
$$

where **X** is an *N* × *M* random matrix with entries *X*<sub>*ij*</sub> ∼ CN(0, 1), and **R**<sub>*c*</sub> is the analytical covariance matrix of an isotropic noise field, defined as

$$[\mathbf{R}_c]_{ij} = J_0(2\pi f \gamma \|\mathbf{r}_i - \mathbf{r}_j\|),\tag{A7}$$

in which *J*<sub>0</sub> is the zeroth-order Bessel function of the first kind and **r**<sub>*i*</sub> denotes the position of receiver *i*. As before, a rough estimate of the average slowness is sufficient in practice (1.1 s/km is used in this paper). Note that, if we choose gathers with the same shape, **R**<sub>*c*</sub> is a function of frequency only.
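The Monte Carlo pre-computation described above can be sketched as follows. This is a hedged illustration under several assumptions that are not taken from the paper: receiver positions are planar coordinates in kilometres, *J*<sub>0</sub> is evaluated by numerical quadrature of its integral representation (a library routine such as `scipy.special.j0` would be the usual choice), and each trial forms the Hermitian matrix **R**<sub>*c*</sub><sup>1/2</sup>**XX**<sup>*H*</sup>**R**<sub>*c*</sub><sup>1/2</sup>/*M*, which has the same eigenvalues as the product in Equation (A6). All function and parameter names are hypothetical.

```python
import numpy as np

def bessel_j0(x, n_quad=256):
    """J0(x) = (1/pi) * integral over [0, pi] of cos(x sin t) dt, midpoint rule."""
    x = np.asarray(x, dtype=float)
    theta = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    return np.cos(x[..., None] * np.sin(theta)).mean(axis=-1)

def isotropic_covariance(freq_hz, slowness_s_per_km, xy_km):
    """Analytical covariance R_c of Equation (A7) for planar positions (assumed)."""
    xy = np.asarray(xy_km, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    return bessel_j0(2.0 * np.pi * freq_hz * slowness_s_per_km * dist)

def largest_eig_quantile(r_c, n_snapshots, alpha=0.05, n_trials=1000, seed=0):
    """Empirical (1 - alpha) quantile of the largest eigenvalue of the simulated SCM."""
    rng = np.random.default_rng(seed)
    n = r_c.shape[0]
    # Symmetric square root of R_c; round-off negatives are clipped to zero.
    vals, vecs = np.linalg.eigh(r_c)
    root = (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    largest = np.empty(n_trials)
    for t in range(n_trials):
        # Entries X_ij ~ CN(0, 1): real and imaginary parts each have variance 1/2.
        x = (rng.standard_normal((n, n_snapshots))
             + 1j * rng.standard_normal((n, n_snapshots))) / np.sqrt(2.0)
        y = root @ x
        scm = (y @ y.conj().T) / n_snapshots
        largest[t] = np.linalg.eigvalsh(scm)[-1]   # eigvalsh sorts ascending
    return np.quantile(largest, 1.0 - alpha)
```

Because **R**<sub>*c*</sub> depends only on frequency for a fixed gather shape, the quantiles can be tabulated once per frequency and reused across gathers.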

Equations (A3)–(A7) show that the determination of *K* relies not only on the frequency but also on the local noise characteristics of the chosen gathers.
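The sequential decision rule of Equation (A5) can be sketched as a simple loop. This is only one possible reading of the procedure: the function name is hypothetical, the thresholds are assumed to be pre-computed quantiles P<sup>−1</sup>(1 − *α*) supplied one per step, and the returned value follows the wording above literally (the step *k* at which the test stops rejecting). Reference [22] should be consulted for the exact convention.

```python
def select_k(tau, thresholds, weight=1.0):
    """Return the first step k (1-based) at which tau(k) is not rejected.

    tau        : sequence of test statistics tau(1), tau(2), ...
    thresholds : sequence of pre-computed quantiles, one per step (assumed input)
    weight     : the weight w of Equation (A5), with 0 <= w <= 1
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must satisfy 0 <= w <= 1")
    for k, (t, q) in enumerate(zip(tau, thresholds), start=1):
        rejected = t > weight * q      # rejection rule of Equation (A5)
        if not rejected:
            return k
    # Assumption: if every tested eigenvalue is rejected, return the last step.
    return len(tau)
```

Lowering the weight lowers every threshold, so more eigenvalues are rejected and the selected value tends to grow, which is how *w* influences the choice of *K*.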
