GPR Diffraction Separation by Incorporating Multilevel Wavelet Transform and Multiple Singular Spectrum Analysis

Wang, Haolin; Wang, Honghua; Hou, Zhiyang; Zhou, Fei

doi:10.3390/app15063204


College of Earth Science, Guilin University of Technology, Guilin 541004, China


Appl. Sci. 2025, 15(6), 3204; https://doi.org/10.3390/app15063204

Submission received: 8 February 2025 / Revised: 8 March 2025 / Accepted: 11 March 2025 / Published: 14 March 2025

(This article belongs to the Special Issue Ground Penetrating Radar (GPR): Theory, Methods and Applications)


Abstract


By leveraging amplitude differences between reflected and diffracted signals in Ground Penetrating Radar (GPR) data, multiple singular spectrum analysis (MSSA) is an attractive approach for separating diffractions, which have shown great potential for detecting small-scale geological structures. However, conventional MSSA has difficulty pinpointing the singular value thresholds that separate reflection, diffraction, and noise within the singular spectrum, leading to a loss of resolution in the extracted diffraction profile. To address this issue, this paper develops a new technique that incorporates multilevel wavelet transform (MWT) and MSSA to separate GPR diffraction. The MWT is first applied to decompose the GPR data into approximation and detail coefficients at multiple transformation levels, and the inverse MWT is then used to construct the corresponding coefficient profiles. Because the coefficient profiles that depict reflections often contain residual diffractions, these profiles are further processed by performing the SVDs of multiple singular spectrum analysis on Hankel matrices within the dominant frequency domain. Building upon this, the k-means clustering algorithm is introduced into MSSA to classify the singular values into k categories. The diffraction wavefield is rebuilt by combining these outcomes with the coefficient profiles that depict diffractions at various transformation levels. Numerical tests show that the biorthogonal wavelet basis function bior4.4 provides efficient GPR diffraction separation performance, and that the number of clusters in the k-means clustering algorithm typically ranges from 9 to 15, depending on the complexity of the wave components. Compared to plane wave deconstruction (PWD), the proposed MWT-MSSA approach reduces energy loss at the diffraction vertex, decreases residual diffraction energy within the reflection profile, and enhances computational efficiency by approximately 70–80% to facilitate the subsequent subtle imaging.

Keywords:

ground penetrating radar (GPR); multilevel wavelet transform (MWT); multiple singular spectrum analysis (MSSA); k-means clustering algorithm; diffraction separation

1. Introduction

Ground Penetrating Radar (GPR) is a high-frequency electromagnetic detection technique that typically operates within a frequency range of 10 MHz to 10 GHz. It is widely used in applications such as infrastructure quality assessment, urban road detection, and archaeological exploration owing to its high precision, high resolution, and non-destructive nature. The penetration depth and resolution of GPR are strongly affected by operating frequency and site conditions. Higher frequencies (e.g., 500–1000 MHz) provide greater resolution but are limited to shallow depths, while lower frequencies (e.g., 50–200 MHz) enable deeper penetration at the cost of reduced resolution [1,2,3]. Detecting shallow geological structures with GPR typically involves focusing on small-scale anomalous features such as voids, cracks, collapses, and reinforcing bars [4,5]. Because diffractions are the principal responses of such small-scale anomalies, obtaining the geometric and electrical properties of these anomalies requires separating the diffractions from GPR profiles efficiently and completely using advanced data processing techniques. This process is crucial for both infrastructure quality assessment and near-surface surveys that rely on subtle imaging.

However, measured GPR data typically include strong reflections from larger structures as well as weaker diffraction events from small-scale anomalies and noise [3]. Reflections typically appear as horizontal or nearly horizontal linear events with strong energy, whereas diffractions generally exhibit a hyperbolic shape with relatively faint energy, and noise energy is usually the faintest [6,7]. Small-scale anomalies are therefore challenging to identify and interpret, because the faint diffractions are usually obscured by crosstalk or even masked by stronger reflections in the GPR profile [8]. Effective methods for separating GPR diffraction are thus of utmost importance, as they significantly improve the detectability of small-scale anomalies.

Diffraction separation was initially introduced in seismic data processing and has now become widely accepted [9,10,11]. The most commonly used approaches to separating diffraction are focusing–defocusing, plane wave deconstruction (PWD) enabled dip filtering [12], coherence analysis [8], and multiple singular spectrum analysis (MSSA). The dip filtering method separates diffraction by exploiting the difference in dip angle between diffraction and reflection events, based on dip estimates extracted for the different components. However, accurately extracting low dip angle information is often difficult [13]. The focusing–defocusing method aids diffraction separation by focusing the amplitudes of signals from discontinuous diffraction points [14,15,16]. Likewise, accurately extracting angle–focus information to guarantee the completeness of the diffraction components remains a challenge when reflection–diffraction crosstalk is involved, particularly in field datasets [17]. The Radon transform methods separate diffraction by suppressing reflection, exploiting the different characteristics of the two within the dip domain [18,19,20]. However, the conventional Radon transform often suffers from low resolution because residual reflection remains in the separated diffraction profile [18]. The PWD method employs plane wave filters designed with the finite difference method to separate diffraction and eliminate reflection [12].

Recent studies have demonstrated that diffraction separation methods such as PWD and coherence analysis can be applied to GPR data by exploiting the similarities in propagation characteristics between seismic waves and radar-emitted electromagnetic waves [8,21,22]. However, the performance of these methods is significantly affected by scattering artifacts at grid edges, which are caused by improper parameter settings in the conventional data processing applied before separation [9,23,24]. By utilizing amplitude differences between reflections and diffractions, multiple singular spectrum analysis (MSSA) offers a simple and efficient way to remove the intense reflections while retaining the faint diffraction energy [25]. The essence of this technique is singular value decomposition (SVD), in which larger singular values indicate reflection, medium values indicate diffraction, and smaller values indicate noise [26,27]. However, determining the singular value thresholds that separate the different wave components within a singular spectrum remains a challenge, since GPR data often contain strong direct waves, coda waves, and ambient noise and therefore differ inherently from the seismic data for which the MSSA strategy was developed. This can lead to a significant reduction in the resolution of the extracted diffraction profiles in GPR applications.

To address these issues, this paper proposes a new technique for GPR diffraction separation, which incorporates multilevel wavelet transform (MWT) for preprocessing and multiple singular spectrum analysis (MSSA) with adaptive threshold determination using a partitioning-based k-means clustering algorithm [28]. The clustering step differentiates the components in the singular spectrum based on their similarity. The accuracy of k-means clustering decreases when the maximum and minimum singular values of the singular spectrum differ by approximately 2–3 orders of magnitude [29]; to address this, we first preprocess the GPR dataset with MWT [30,31,32,33]. The MWT progressively decomposes the different components (i.e., direct wave, reflection, diffraction, and noise) into corresponding multilevel coefficients, which enhances the accuracy of k-means clustering in the coefficient profiles. These coefficients comprise low-frequency approximation coefficients and high-frequency horizontal, vertical, and diagonal detail coefficients. Numerical experiments and field tests are conducted to validate the effectiveness and robustness of the proposed method. In particular, we examine how the separation performance depends on the wavelet basis function, along with the selection of clustering parameters and the anti-noise performance.

2. Related Works

Current studies show that no unified standard has been precisely defined for the thresholds between different wave components within a singular spectrum. To mitigate this, Lin et al. [27] proposed singular value truncation, which separates diffraction wavefields by setting the larger singular values and some smaller singular values to zero based on the composition characteristics of the wavefield and then reconstructing the diffraction from the remaining singular values. Other researchers employed the mean value method and the cumulative energy method to define the singular value threshold [10,27]. These two methods rely on the mean value and the total energy, respectively, of all singular values in the singular spectrum [10,27]. In GPR data processing, the complexity of the wavefield often determines the number of singular values to be selected. Choosing the appropriate number of singular values becomes more difficult as the number of acquired traces increases, because the number of singular values obtained through SVD increases accordingly. In contrast to seismic data, GPR data commonly contain strong direct waves and complicated crosstalk, resulting in a first singular value in the singular spectrum that is substantially larger than the subsequent ones.

Figure 1a,b illustrate the synthetic dataset and its singular spectrum at the dominant frequency, obtained from the SVD of the GPR data. When the singular spectrum is analyzed with the mean value method or the cumulative energy approach [27], it becomes evident that the first singular value alone accounts for more than 98% of the total sum of singular values or cumulative energy. For example, the singular values larger than 0.3 times the mean value (i.e., the first ten singular values, highlighted in red in Figure 1b) are set to zero on the rationale that these ten components account for 99.9995% of the total energy in the singular spectrum [10,27]. However, the separated diffraction profiles still present multi-reflective and crosstalk features, with oblique events of considerable energy appearing at 10–15 ns and 25–30 ns, where the two undulating interfaces are located, as shown in Figure 1c,d. This disproportionate contribution (i.e., faint but vital residual diffraction components alongside intense coda reflection components) could significantly limit the practicality of these mainstream seismic approaches for analyzing the singular spectrum of GPR data.
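
The two threshold rules discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the spectrum `sv` is a hypothetical toy example (not the data of Figure 1), "energy" is assumed to mean the squared singular value, and the function names are ours rather than those of the cited works.

```python
import numpy as np

def mean_value_mask(sv, factor=0.3):
    """Flag singular values larger than `factor` times the mean of the spectrum."""
    sv = np.asarray(sv, dtype=float)
    return sv > factor * sv.mean()

def cumulative_energy(sv):
    """Cumulative energy fraction, with energy assumed to be sv**2."""
    energy = np.asarray(sv, dtype=float) ** 2
    return np.cumsum(energy) / energy.sum()

# Hypothetical spectrum dominated by its first value, as may happen when a
# strong direct wave is present (illustrative numbers only).
sv = np.array([100.0, 1.0, 0.5, 0.1])
mask = mean_value_mask(sv)
cum = cumulative_energy(sv)
print(mask)
print(cum)
```

With such a dominated spectrum the mean is pulled up by the first value, so a mean-based rule flags only that value and the cumulative energy is almost saturated after one term, which mirrors the difficulty described above.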

3. Methodology

3.1. Principle of Multi-Level Wavelet Transform

Two-dimensional (2D) time-domain GPR data are typically represented as d(x, t), x = 1, 2, …, N_x, t = 1, 2, …, N_t, where N_x is the total number of traces and N_t is the number of samples per trace (e.g., a 10 m profile collected with a trace interval of 0.02 m and 1024 sampling points yields trace indices from 1 to 500 and sample indices from 1 to 1024). According to wavelet transform theory, the first-level 2D discrete wavelet transform can be expressed as [34,35]:

\begin{cases} LL_{1}(x, t) = \sum_{m} \sum_{n} h[m] \, h[n] \, d(2x - m, 2t - n) \\ LH_{1}(x, t) = \sum_{m} \sum_{n} h[m] \, g[n] \, d(2x - m, 2t - n) \\ HL_{1}(x, t) = \sum_{m} \sum_{n} g[m] \, h[n] \, d(2x - m, 2t - n) \\ HH_{1}(x, t) = \sum_{m} \sum_{n} g[m] \, g[n] \, d(2x - m, 2t - n) \end{cases}

(1)

where LL₁, LH₁, HL₁, and HH₁ represent the low-frequency approximation coefficient, high-frequency horizontal, vertical and diagonal detail coefficient of the first-level wavelet transform of d(x, t), respectively. h[m] and g[n] represent the filter coefficients used to extract the low-frequency components and the high-frequency components, respectively. d(2x − m, 2t − n) denotes the local sampling value. m and n are the filter indices in horizontal and vertical directions, respectively.
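
As a minimal sketch of Equation (1), the following NumPy function performs one level of a separable 2D transform. It assumes the two-tap Haar filters (h = [1/√2, 1/√2], g = [1/√2, −1/√2]) purely for brevity, whereas the paper uses bior4.4, and it pairs samples (2x, 2x + 1), which differs from the 2x − m indexing of Equation (1) only by an index shift. The function name `haar_dwt2` is hypothetical.

```python
import numpy as np

def haar_dwt2(d):
    """One-level 2D Haar transform of d[x, t]; both dimensions must be even."""
    s = 1.0 / np.sqrt(2.0)
    # Filter and downsample along x (axis 0): pair sums = low-pass h,
    # pair differences = high-pass g.
    lo_x = s * (d[0::2, :] + d[1::2, :])
    hi_x = s * (d[0::2, :] - d[1::2, :])
    # Filter and downsample along t (axis 1).
    LL = s * (lo_x[:, 0::2] + lo_x[:, 1::2])   # h along x, h along t
    LH = s * (lo_x[:, 0::2] - lo_x[:, 1::2])   # h along x, g along t
    HL = s * (hi_x[:, 0::2] + hi_x[:, 1::2])   # g along x, h along t
    HH = s * (hi_x[:, 0::2] - hi_x[:, 1::2])   # g along x, g along t
    return LL, LH, HL, HH

# A perfectly flat "reflector": every trace is identical along x.
flat = np.tile(np.sin(np.arange(16.0)), (8, 1))
LL_f, LH_f, HL_f, HH_f = haar_dwt2(flat)
print(np.abs(HL_f).max(), np.abs(HH_f).max())
```

For an event that does not vary along x, the high-pass filter along x returns zero, so HL and HH vanish. Events that vary laterally, such as hyperbolic diffractions, are the ones that leave energy in these subbands.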

The MWT builds upon the results of each previous wavelet transformation level: the low-frequency approximation coefficients LL_{i−1} derived from the previous level are further decomposed using the wavelet transform, and this iterative process is repeated until the desired number of decomposition levels is reached [36]. Based on Equation (1), and with LL_0(x, t) = d(x, t), the i-th level wavelet transform of d(x, t) can be expressed as [37,38]:

\begin{cases} LL_{i}(x, t) = \sum_{m} \sum_{n} h[m] \, h[n] \, LL_{i-1}(2x - m, 2t - n) \\ LH_{i}(x, t) = \sum_{m} \sum_{n} h[m] \, g[n] \, LL_{i-1}(2x - m, 2t - n) \\ HL_{i}(x, t) = \sum_{m} \sum_{n} g[m] \, h[n] \, LL_{i-1}(2x - m, 2t - n) \\ HH_{i}(x, t) = \sum_{m} \sum_{n} g[m] \, g[n] \, LL_{i-1}(2x - m, 2t - n) \end{cases}

(2)

Upon completing a k-level wavelet transform, k sets of low-frequency approximation coefficients and high-frequency horizontal, vertical, and diagonal detail coefficients are obtained. To visualize the distribution of the GPR wavefield at each level of the wavelet transform, the inverse transform is applied to the wavelet coefficients, as described in Equation (3). This inverse transform generates the time-domain GPR coefficient profiles d_{LL_k}(x, t), d_{LH_k}(x, t), d_{HL_k}(x, t), and d_{HH_k}(x, t), which represent the wavelet transform coefficients at each decomposition level.

\begin{cases} d_{LL_{k}}(x, t) = \sum_{m} \sum_{n} h[m] \, h[n] \, LL_{k}(2x - m, 2t - n) \\ d_{LH_{k}}(x, t) = \sum_{m} \sum_{n} h[m] \, g[n] \, LH_{k}(2x - m, 2t - n) \\ d_{HL_{k}}(x, t) = \sum_{m} \sum_{n} g[m] \, h[n] \, HL_{k}(2x - m, 2t - n) \\ d_{HH_{k}}(x, t) = \sum_{m} \sum_{n} g[m] \, g[n] \, HH_{k}(2x - m, 2t - n) \end{cases}

(3)

Without loss of generality, the subsequent discussion refers to d_{**_k}(x, t) as the time-domain GPR coefficient profiles that are derived from the inverse transform of the wavelet coefficients at each decomposition level.
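
The idea of a coefficient profile can be sketched as follows: transform to level k, keep a single subband, set everything else to zero, and invert back to the size of the original data. The sketch again assumes Haar filters for brevity (the paper uses bior4.4), and the helper names `dwt2`, `idwt2`, and `coefficient_profile` are hypothetical.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)
BANDS = ("LL", "LH", "HL", "HH")

def dwt2(a):
    """One Haar analysis level on a[x, t]; returns LL, LH, HL, HH."""
    lo, hi = S * (a[0::2] + a[1::2]), S * (a[0::2] - a[1::2])    # along x
    return (S * (lo[:, 0::2] + lo[:, 1::2]), S * (lo[:, 0::2] - lo[:, 1::2]),
            S * (hi[:, 0::2] + hi[:, 1::2]), S * (hi[:, 0::2] - hi[:, 1::2]))

def idwt2(LL, LH, HL, HH):
    """Exact inverse of dwt2."""
    n_x, n_t = LL.shape
    lo, hi = np.empty((n_x, 2 * n_t)), np.empty((n_x, 2 * n_t))
    lo[:, 0::2], lo[:, 1::2] = S * (LL + LH), S * (LL - LH)      # undo t step
    hi[:, 0::2], hi[:, 1::2] = S * (HL + HH), S * (HL - HH)
    out = np.empty((2 * n_x, 2 * n_t))
    out[0::2], out[1::2] = S * (lo + hi), S * (lo - hi)          # undo x step
    return out

def coefficient_profile(d, level, band):
    """Time-domain profile of one subband (`band`) at decomposition `level`."""
    bands = {"LL": d}
    for _ in range(level):                       # forward transform to `level`
        bands = dict(zip(BANDS, dwt2(bands["LL"])))
    zero = np.zeros_like(bands["LL"])
    out = idwt2(*[bands[b] if b == band else zero for b in BANDS])
    for _ in range(level - 1):                   # climb back to the data size
        z = np.zeros_like(out)
        out = idwt2(out, z, z, z)
    return out

rng = np.random.default_rng(0)
d = rng.standard_normal((8, 16))                 # toy profile, not GPR data
total = coefficient_profile(d, 2, "LL")
for k in (1, 2):
    for b in ("LH", "HL", "HH"):
        total = total + coefficient_profile(d, k, b)
print(np.allclose(total, d))
```

Because the transform is linear, the level-2 approximation profile plus all detail profiles of levels 1 and 2 add up to the original data. This is what allows a separated wavefield to be rebuilt by summing selected profiles.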

3.2. Multiple Singular Spectrum Analysis Method Based on k-Means Clustering

In accordance with MSSA theory, the initial step involves applying a Fourier transform to convert the multilevel wavelet coefficient profiles d_{**_k}(x, t) into their frequency-domain counterparts \tilde{d}_{**}(x, f). The data \tilde{d}_{**}(x, f_i) at a specific frequency f_i can be expressed as:

\tilde{d}_{**}(x, f_{i}) = [\tilde{d}_{1}, \tilde{d}_{2}, \dots, \tilde{d}_{N_{x}}]

(4)
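
A minimal sketch of this step, assuming the profile is stored as a real array `profile[x, t]` with sampling interval `dt`: the Fourier transform is taken along the time axis, and one frequency slice of length N_x is extracted. Picking the slice with the largest total energy is our stand-in for the "dominant frequency" and is an assumption; the function name is hypothetical.

```python
import numpy as np

def dominant_frequency_slice(profile, dt):
    """Return the frequency slice d~(x, f_i) with the largest energy, and f_i."""
    spectrum = np.fft.rfft(profile, axis=1)             # shape (N_x, N_t//2 + 1)
    freqs = np.fft.rfftfreq(profile.shape[1], d=dt)
    i_dom = int(np.argmax(np.sum(np.abs(spectrum) ** 2, axis=0)))
    return spectrum[:, i_dom], freqs[i_dom]

# Toy profile: four identical traces holding exactly 8 cycles of a cosine.
dt = 0.5
t = np.arange(64) * dt
profile = np.tile(np.cos(2.0 * np.pi * 0.25 * t), (4, 1))
d_f, f_dom = dominant_frequency_slice(profile, dt)
print(d_f.shape, f_dom)
```

In a full implementation every frequency slice would presumably be processed in turn. The single slice here only illustrates the vector of Equation (4).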

Based on the Hankel transform [39,40], the Hankel matrix of \tilde{d}_{**}(x, f_i) can be represented as:

H = \begin{pmatrix} \tilde{d}_{1} & \tilde{d}_{2} & \dots & \tilde{d}_{p} \\ \tilde{d}_{2} & \tilde{d}_{3} & \dots & \tilde{d}_{p+1} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{d}_{q} & \tilde{d}_{q+1} & \dots & \tilde{d}_{N_{x}} \end{pmatrix}

(5)

where p and q are the numbers of columns and rows of the Hankel matrix, respectively [41]. Typically, q is set to the integer part of N_x/2 + 1, and p is defined as N_x − q + 1. The Hankel matrix can then be decomposed using the SVD technique [40]:

H = \sum_{i=1}^{r} \lambda_{i} U_{i} V_{i}^{T}

(6)

where U_i and V_i represent the i-th eigenvectors of HH^T and H^TH, respectively, λ_i represents the i-th singular value of H, and r is the rank of H. They satisfy the following:

\begin{cases} H H^{T} U_{i} = \lambda_{i}^{2} U_{i} \\ H^{T} H V_{i} = \lambda_{i}^{2} V_{i} \end{cases}

(7)

The matrix product form of Equation (6) is given by

H = U \Lambda V^{T}

(8)

where Λ represents the diagonal matrix composed of singular values, arranged in descending order along the diagonal. According to the composition characteristics of the GPR profile, the singular spectrum formed by the diagonal elements of the singular value matrix can be considered as primarily consisting of direct waves, reflection, diffraction, and noise [39].
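
Equations (5) to (8) can be checked numerically with a short sketch. The frequency slice below is a random complex toy vector, not GPR data, and `hankel_matrix` is a hypothetical helper. Because frequency-domain data are complex, the sketch uses the conjugate transpose where the text writes the transpose, which is an assumption on our part.

```python
import numpy as np

def hankel_matrix(d_f):
    """Build the q-by-p Hankel matrix of Eq. (5) from one frequency slice."""
    d_f = np.asarray(d_f)
    n_x = d_f.size
    q = n_x // 2 + 1                  # rows: integer part of N_x/2 + 1
    p = n_x - q + 1                   # columns
    i = np.arange(q)[:, None]
    j = np.arange(p)[None, :]
    return d_f[i + j]                 # H[i, j] = d_f[i + j] (0-based indices)

rng = np.random.default_rng(1)
d_f = rng.standard_normal(9) + 1j * rng.standard_normal(9)   # toy slice, N_x = 9
H = hankel_matrix(d_f)

# numpy returns the singular values in descending order (the singular spectrum)
U, lam, Vh = np.linalg.svd(H, full_matrices=False)
H_rebuilt = U @ np.diag(lam) @ Vh     # Eq. (8), with Vh the conjugate transpose of V
print(H.shape, lam)
```

The vector `lam` is the singular spectrum that the clustering step later partitions into wave components.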

In this work, we employ the k-means clustering algorithm to partition the singular values in the singular spectrum into k clusters based on their similarity, with the aim of defining the threshold between the singular values that correspond to different wave components. The following details describe this procedure.

1. Determine the number of clusters k and randomly select k singular values from the singular spectrum as the initial cluster centers.
2. Compute the Manhattan distance between each singular value in the singular spectrum and each cluster center [28]:

Q = |x_{i} - x_{k}| + |y_{i} - y_{k}|

(9)

where Q represents the calculated Manhattan distance, x_i and y_i represent the horizontal and vertical coordinates of the i-th singular value in the singular spectrum, respectively, and x_k and y_k represent the horizontal and vertical coordinates of the k-th cluster center, respectively. A smaller value of Q indicates higher similarity, while a larger Q denotes lower similarity. Each singular value is therefore assigned to the cluster whose center gives the smallest Q.

3. Calculate the mean value of all singular values in each cluster and update the cluster center to this mean value.
4. Repeat steps 2 and 3 until the cluster centers stabilize or a predefined number of iterations is reached.
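
The four steps above can be sketched as follows. This is an illustrative implementation, not the authors' code. Each singular value is treated as a 2D point (index, value), the toy spectrum `sv` is hypothetical, and leaving an empty cluster at its previous center is our own choice, since the text does not cover that case.

```python
import numpy as np

def manhattan_labels(points, centers):
    """Assign each point to the center with the smallest Manhattan distance Q."""
    q = np.abs(points[:, None, :] - centers[None, :, :]).sum(axis=2)
    return q.argmin(axis=1)

def kmeans_manhattan(points, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct points as the initial cluster centers
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = manhattan_labels(points, centers)        # Step 2
        new_centers = centers.copy()
        for j in range(k):                                # Step 3
            members = points[labels == j]
            if len(members) > 0:       # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):             # Step 4: stabilized
            break
        centers = new_centers
    return manhattan_labels(points, centers), centers

# Hypothetical singular spectrum with two clearly separated groups
sv = np.array([100.0, 98.0, 5.0, 4.5, 4.0, 0.1])
points = np.column_stack([np.arange(1, sv.size + 1), sv])   # (index, value)
labels, centers = kmeans_manhattan(points, k=2)
print(labels)
```

Because the index and the value enter Q on their raw scales, the result depends on how the spectrum is scaled. Whether any normalization is applied beforehand is not stated in the text, so none is applied here.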

After these calculations, the singular values in the singular spectrum are partitioned into k clusters based on their similarity. Given the intrinsic characteristics of the GPR wavefield, the first cluster is taken to represent direct waves, and the next several clusters with larger singular values are taken to represent reflections. The clusters with the smallest singular values are indicative of noise, and the remaining intermediate singular values are indicative of diffraction [40]. The singular values representing direct wave, reflection, and noise are then set to zero, so that the singular spectrum represents only the diffraction. This spectrum is used to reconstruct the singular value matrix Λ'. From Λ', the Hankel matrix H', which characterizes the diffraction, is finally reconstructed as:

H' = U \Lambda' V^{T}

(10)

An inverse Hankel transform and inverse Fourier transform are then performed on H' to convert it back into the time domain, thereby separating the diffraction while also denoising. By aggregating the extracted diffraction profiles, we can form a separated GPR diffraction profile. The workflow of the proposed MWT-MSSA algorithm for GPR diffraction separation is illustrated in Figure 2.
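
For a single frequency slice, the reconstruction of Equation (10) followed by the inverse Hankel step can be sketched as below. Averaging along anti-diagonals is a common way to invert a Hankel embedding and is assumed here, since the text does not specify the inverse. The boolean mask `keep` stands in for the cluster selection, and all names and the toy data are hypothetical.

```python
import numpy as np

def hankel_matrix(d_f):
    """q-by-p Hankel matrix of one frequency slice, as in Eq. (5)."""
    n_x = d_f.size
    q = n_x // 2 + 1
    p = n_x - q + 1
    return d_f[np.arange(q)[:, None] + np.arange(p)[None, :]]

def inverse_hankel(H):
    """Recover a length p + q - 1 vector by averaging each anti-diagonal."""
    q, p = H.shape
    total = np.zeros(p + q - 1, dtype=H.dtype)
    count = np.zeros(p + q - 1)
    for i in range(q):
        total[i:i + p] += H[i]        # row i covers samples i .. i + p - 1
        count[i:i + p] += 1
    return total / count

def reconstruct(d_f, keep):
    """Zero the singular values where `keep` is False, then rebuild the slice."""
    U, lam, Vh = np.linalg.svd(hankel_matrix(d_f), full_matrices=False)
    lam_kept = np.where(keep, lam, 0.0)                    # diagonal of Lambda'
    return inverse_hankel(U @ np.diag(lam_kept) @ Vh)      # H' = U Lambda' V^H

rng = np.random.default_rng(2)
d_f = rng.standard_normal(9) + 1j * rng.standard_normal(9)   # toy slice
keep = np.array([False, True, True, False, False])   # "diffraction" values (toy)
diffraction = reconstruct(d_f, keep)
remainder = reconstruct(d_f, ~keep)
print(np.allclose(diffraction + remainder, d_f))
```

Since every step is linear, the part rebuilt from the kept singular values and the part rebuilt from the discarded ones sum to the original slice. Repeating this over all frequencies and applying an inverse Fourier transform would return the result to the time domain.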

4. Numerical Experiment

4.1. Effectiveness Verification of MWT-MSSA Method

To verify the effectiveness of the MWT-MSSA GPR diffraction separation method proposed in this work, we first test it on simulated GPR data that feature small-scale anomalies. The model dimensions were set to 8.0 m × 2.5 m, with two undulating reflection interfaces located at depths of approximately 0.9 m and 1.4 m, respectively. The relative dielectric permittivity and electric conductivity of the layers are shown in Figure 3a. Three voids with diameters of roughly 0.1 m are placed in the second and third layers and at the interfaces. The simulation was implemented using the finite difference time-domain (FDTD) method with a Ricker wavelet source at a central frequency of 400 MHz and a time window of 50 ns. A common-offset configuration was employed for data acquisition, with a grid size of 0.005 m and a trace interval of 0.05 m. Figure 3b shows two nearly linear reflection events that correspond to the different media layers and three hyperbolic diffraction events produced by the voids. It is noteworthy that the hyperbolic diffraction has a distinct shape when the voids are farther away from the interfaces, as shown by the left and middle diffractions. However, when the voids are close to or coincide with the interfaces, the energy at the vertex and wings of the hyperbolic diffraction is severely obscured by the reflection, which carries stronger energy due to the presence of the interfaces. This is evident in the right-side diffraction, where interference from the reflection complicates the accurate identification and extraction of the diffraction.

The simulated data shown in Figure 3b are first processed using MWT to separate the reflection and diffraction components into distinct wavelet transform coefficients. To overcome the ambiguities of MWT coefficients in describing the spatial and temporal distribution, we employ an inverse wavelet transform to reconstruct GPR profiles that could precisely depict the characteristics of the wavelet transform coefficients across all levels.

Figure 4 presents the coefficient profiles corresponding to the first, second, and third levels after the wavelet transform was applied to the data shown in Figure 3b. The biorthogonal wavelet basis function bior4.4 is employed here for the wavelet transform; a detailed comparison with alternative wavelet basis functions is given in Section 4.2. As illustrated in Figure 4a,e,i, the majority of the reflective components in the synthetic data concentrate in the low-frequency approximation coefficients d_{LL_k} at each level, while the diffraction data are predominantly distributed in the high-frequency vertical detail coefficients d_{HL_k} at each level, as shown in Figure 4c,g,k. After the first-level wavelet transform, the faint energy at the wings of the diffraction is decomposed into the first-level high-frequency vertical detail coefficient d_{HL_1}, whereas the remaining energy of both reflection and diffraction is retained in the first-level low-frequency approximation coefficient d_{LL_1}. Following the second-level wavelet transform, the majority of the diffractive energy previously contained in the first-level low-frequency approximation coefficient is redistributed into the second-level high-frequency vertical detail coefficient, as shown in Figure 4g, leaving only faint energy at the diffraction vertexes within the second-level low-frequency approximation coefficient, as evidenced in Figure 4e,f,h. Similarly, as presented in Figure 4i–l, the third-level wavelet transform further decomposes the diffractive vertex energy (i.e., that initially present in the second-level low-frequency approximation coefficient) into the third-level high-frequency vertical detail coefficient. However, reflective energy is also present in these coefficients, as indicated by the arrows in Figure 4k.
This progression illustrates that, with increasing wavelet transform levels, wavefield energy is incrementally transferred to the high-frequency detail coefficients of higher levels. Notably, diffraction energy shifts into the high-frequency detail coefficients at earlier levels than reflection energy does, which is the decomposition characteristic exploited in this work.

After the second-level wavelet transform, a portion of the energy at the vertex of the diffraction remains within the second-level low-frequency approximation coefficient. Consequently, reconstructing the diffraction wavefield using only the high-frequency vertical detail coefficients d_{HL_1} and d_{HL_2} from the first and second levels would result in a partial energy loss at the vertex of the diffraction. Conversely, incorporating the high-frequency vertical detail coefficient from the third level can reduce the vertex energy loss but introduces significant reflection energy into the reconstruction, ultimately compromising the separation accuracy. To avoid the energy loss at the diffraction vertex that affects wavelet transform-based separation methods, the MSSA method coupled with k-means clustering is used. With this coupling strategy, the diffraction energy can be effectively separated from the second-level low-frequency approximation coefficient d_{LL_2}, thereby improving the precision of wave separation.

Figure 5a,b present the diffraction and reflection profiles extracted from the second-level low-frequency approximation coefficient d_{LL_2} (Figure 4e) using the MSSA method coupled with k-means clustering. The results indicate that the majority of the diffraction vertex energy has been successfully separated with minimal energy loss, and the reflection is practically invisible. Compared to the low-frequency approximation coefficient profile derived from the third-level wavelet transform, the profile in Figure 5b shows more complete reflection energy and significantly less diffraction energy.

Figure 6b shows the diffraction profile reconstructed from the high-frequency vertical detail coefficient profiles d_{HL_1} and d_{HL_2} (from the first- and second-level wavelet transforms) together with the diffraction profile in Figure 5a; the reconstructed reflection profile is shown in Figure 6c. The diffraction profile eliminates a significant amount of reflection energy, with only minimal energy loss at the top of the diffraction, indicating encouraging separation performance. To further assess the proposed method, we compare these reconstructed profiles with those of the PWD method, shown in Figure 6e,f. The comparison suggests that the diffraction profile reconstructed by PWD using the dip field (Figure 6d) loses more energy at the vertex of the diffraction while retaining a portion of the reflective energy. Moreover, the PWD reflection profile contains more residual diffraction energy along the tails of the hyperbolic events.

To highlight the detailed benefits of the proposed MWT-MSSA method, A-scan waveforms were extracted from the diffraction wing and vertex, as indicated by the white dashed lines in Figure 6b,e. The waveform comparison results are shown in Figure 7a–c. As revealed by the black dashed rectangle in Figure 7a,b, the energy around the wings of the diffraction separated by MWT-MSSA closely matches the synthetic A-scan, which indicates a minimal loss of energy. We use the energy-retention ratio to evaluate the performance of the method. Its formula is shown as follows

R = \frac{E_{rec}}{E_{ori}} \times 100\%

(11)

where E_{rec} represents the maximum amplitude of the separated diffraction peak and E_{ori} represents the maximum amplitude of the same diffraction peak in the original profile before separation.
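
Equation (11) can be evaluated on a pair of A-scans as sketched below. Taking the "maximum amplitude" as the largest absolute value within the analyzed window is an assumption, and the two short arrays are made-up numbers used only to exercise the function.

```python
import numpy as np

def energy_retention_ratio(separated, original):
    """Eq. (11): peak amplitude after separation relative to before, in percent."""
    e_rec = np.max(np.abs(separated))   # peak amplitude of separated diffraction
    e_ori = np.max(np.abs(original))    # peak amplitude in the original profile
    return 100.0 * e_rec / e_ori

# Hypothetical A-scan windows around a diffraction peak (illustrative values)
original = np.array([0.0, 2.0, -4.0, 1.0])
separated = np.array([0.0, 1.0, -2.0, 0.5])
r = energy_retention_ratio(separated, original)
print(r)
```

In practice the two windows would be restricted to the same diffraction event, so that a nearby reflection peak does not set E_{ori}.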

Furthermore, as illustrated in Figure 7c, the diffraction separated by the MWT-MSSA method shows a notable reduction in energy loss at the vertex, achieving an energy-retention ratio of approximately 35.6%. In contrast, the PWD method shows significant energy loss at the vertex, with an energy-retention ratio of only 10.7%. This comparison indicates that the proposed method preserves diffraction energy more completely than PWD, leading to better separation performance.

To quantify the computational efficiency of the proposed method, the execution times are recorded and listed in Table 1. All the tests discussed in this paper were performed on a laptop (Windows 11 with 16 GB RAM and CPU frequency of 2.5 GHz) using MATLAB 2024b. As shown, PWD took approximately 38 min, while MWT-MSSA was completed in roughly 8 min, corresponding to a reduction in run time of approximately 79%.
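
Reading the reported improvement as the relative reduction in run time, which is our interpretation, the figure follows from the two approximate timings:

```latex
\frac{T_{PWD} - T_{MWT\text{-}MSSA}}{T_{PWD}} \times 100\% = \frac{38 - 8}{38} \times 100\% \approx 78.9\% \approx 79\%
```

Here T_{PWD} and T_{MWT-MSSA} are simply labels for the two execution times quoted above.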

4.2. Impact of Wavelet Basis Functions on Separation Performance

To evaluate the influence of different wavelet basis functions on the performance of MWT-MSSA, a three-level wavelet transform was applied to the synthetic data using three typical wavelet basis functions (i.e., haar, db4, and rbio4.4). The selection of an appropriate wavelet basis function is a critical step, as it can affect the results to some extent. Assessing this effect requires considering the specific problem and evaluating key wavelet properties such as vanishing moments, support width, regularity, and symmetry. A comprehensive assessment of these properties helps ensure good performance in wavelet-based analysis.

As shown in Table 2, the bior4.4 wavelet has a large number of vanishing moments that enable it to capture local features of the signal effectively, while its extended support length enhances the accuracy of signal representation. Also, its favorable regularity ensures a smooth reconstruction, and its symmetry minimizes the chance of reconstruction errors [42]. Based on these properties, we prefer the bior4.4 wavelet in our study to separate GPR diffraction waves for subsequent analyses.

The MWT coefficient profiles are presented in Figure 8, Figure 9 and Figure 10. In comparison to the three-level bior4.4 wavelet transform coefficient profiles shown in Figure 4, the high-frequency vertical detail coefficient profiles d_{HL_1} and d_{HL_2} obtained with the haar wavelet basis function exhibit significant linear reflective components at the positions marked by the black arrows (see Figure 8c,g), which could be detrimental to diffraction separation. The high-frequency vertical detail coefficient profile d_{HL_2} obtained with the rbio4.4 wavelet basis function still retains some weak reflection (see Figure 9g). In contrast, the high-frequency vertical detail coefficient profile d_{HL_2} from the db4 wavelet basis function exhibits minimal reflection energy. However, compared to the profile d_{HL_2} from the bior4.4 basis function, a slight loss of energy can be observed at the diffraction vertex, as indicated by the black arrow in Figure 10g.

By examining Figure 8, Figure 9 and Figure 10, it can be concluded that the wavelet basis function used in the MWT affects the rate at which reflection and diffraction energy in the synthetic data is decomposed into the high-frequency detail coefficients. In general, diffractions decompose into high-frequency coefficients at a faster rate than reflections. Figure 11 presents the diffraction and reflection profiles obtained after MWT-MSSA separation using the haar, rbio4.4, and db4 wavelet basis functions. The diffraction profile separated with the haar wavelet basis function retains some reflective energy, as shown by the black arrow in Figure 11a. The diffraction profiles separated with the rbio4.4 and db4 functions contain less reflective energy and show no significant difference from the profile obtained with the bior4.4 wavelet basis function.

The A-scan waveform comparison of the diffraction vertex and wings, illustrated in Figure 12, demonstrates that the diffractions separated using the MWT-MSSA method with different wavelet basis functions exhibit similar energy loss at the vertex. However, all cases show stronger vertex energy than the PWD method, as shown in Figure 12a. For the diffraction wings, the waveform comparison in Figure 12b indicates that the haar and db4 wavelet basis functions result in the most energy loss, whereas the bior4.4 wavelet basis function achieves the least energy loss and most closely matches the synthetic data. In summary, the MWT-MSSA approach with the bior4.4 wavelet basis function yields the least energy loss and the best separation performance among the tested basis functions.

4.3. Impact of Clustering Parameter Selection on Separation Performance in the k-Means Algorithm

As stated, the proposed MWT-MSSA leverages the distinct amplitude differences between the direct wave, reflection, diffraction and noise in the GPR profile to achieve effective diffraction separation. In contrast to seismic wavefields, the GPR wavefield predominantly consists of these four components: the direct wave and reflection have the highest energy levels, diffraction possesses moderate energy levels, and noise has the weakest energy [40]. According to the SVD principle, the direct wave and reflection are represented by the largest singular values, diffraction by moderate singular values, and noise by the smallest singular values [25,27,40]. To accurately determine the thresholds between different components in the singular spectrum, we employ a partitioning-based k-means clustering algorithm to classify the singular values in the singular spectrum into k clusters based on their similarity. The diffraction is then reconstructed by selecting the singular values associated with the diffraction cluster.
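The clustering step can be sketched as follows. This is a minimal illustration, assuming a plain Lloyd-type k-means on scalar singular values with a deterministic seeding; the initialization and implementation used in this study are not specified here, and the function name and the toy spectrum are hypothetical.

```python
def kmeans_1d(values, k, n_iter=100):
    """Lloyd's algorithm on scalars with deterministic seeding.

    Returns (labels, centers); label 0 is the cluster with the largest center.
    """
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    # Seed the centers at evenly spaced positions of the sorted spectrum.
    centers = [ordered[(2 * c + 1) * n // (2 * k)] for c in range(k)]
    for _ in range(n_iter):
        # Assignment step: each singular value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    # Relabel so that clusters are ordered from largest to smallest center.
    order = sorted(range(k), key=lambda c: -centers[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[lab] for lab in labels], [centers[c] for c in order]

# Toy singular spectrum (illustrative values, not taken from the paper):
# two strong values, three moderate values, five weak values.
spectrum = [100.0, 95.0, 40.0, 38.0, 36.0, 5.0, 4.0, 3.0, 2.0, 1.0]
labels, centers = kmeans_1d(spectrum, k=3)

# Indices of the singular values in the middle (moderate-energy) cluster.
diffraction_idx = [i for i, lab in enumerate(labels) if lab == 1]
print(labels, centers, diffraction_idx)
```

With only three clusters the middle cluster plays the role of the diffraction cluster in this toy example; for real GPR data a larger k is needed, as discussed in the following paragraphs.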

Typically, there is no definitive criterion for determining the number of clusters k. Recent studies in seismic exploration have demonstrated that k can be guided by the complexity of the wavefield: more complex wavefields typically require a larger number of clusters to accurately represent their distinct components [27,40]. Take the GPR wavefield shown in Figure 3b as an example. This wavefield consists of a direct wave, two reflections that originate from undulating interfaces, three diffractions that are caused by small-scale anomalous bodies, and ambient noise. In this regard, the number of clusters k may need to exceed seven to ensure an accurate representation of these distinct components. Also, the complexity of the wavefield necessitates an appropriate increase in k, as discussed in [39].

To evaluate the impact of the total number of k-means clusters on separation efficiency, we employ the k-means clustering algorithm to analyze the GPR data depicted in Figure 3b. The clustering analysis is performed with the number of k-means clusters set to 9, 11, and 13, respectively, while the first six clusters are fixed at zero and all other parameters remain unchanged. As shown in Figure 13, the effectiveness of diffraction separation is significantly affected by the number of k-means clusters chosen. Specifically, when the cluster count is set to 13, the separated diffraction profile exhibits optimal results. Conversely, a decrease in cluster count to 9 leads to a notable decline in separation performance, highlighting the sensitivity of the process to the selected number of clusters.

After performing MWT on the synthetic data (shown in Figure 3b), residual diffraction can be observed in the low-frequency approximation coefficient profile shown in Figure 4e. The singular spectrum at the dominant frequency of the coefficient profile illustrated in Figure 4e was analyzed using k-means clustering with a total of 13 clusters. As shown in Figure 14, the singular spectrum at the dominant frequency is divided into 13 distinct clusters, each marked by a gray region. The distribution of singular values across the clusters is non-uniform, with a different number of singular values in each cluster.
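For readers unfamiliar with how a singular spectrum arises at a single frequency, the Hankel embedding that precedes the SVD can be sketched as below. This is a minimal illustration under assumptions: the row count L = N // 2 + 1 follows a common MSSA convention and is not necessarily the embedding dimension used in this study, the function names and the toy frequency slice are hypothetical, and the SVD and rank selection between the two steps are omitted.

```python
def hankel_embed(samples):
    """Arrange N spatial samples of one frequency slice into an L x K Hankel matrix."""
    n = len(samples)
    n_rows = n // 2 + 1          # assumed embedding dimension L
    n_cols = n - n_rows + 1      # K = N - L + 1
    return [[samples[i + j] for j in range(n_cols)] for i in range(n_rows)]

def antidiagonal_average(matrix):
    """Map a (possibly rank-reduced) Hankel-like matrix back to N samples."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    sums = [0j] * (n_rows + n_cols - 1)
    counts = [0] * (n_rows + n_cols - 1)
    for i in range(n_rows):
        for j in range(n_cols):
            sums[i + j] += matrix[i][j]   # entries with equal i + j share one sample
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

# Toy frequency slice across five traces (illustrative complex values).
freq_slice = [1 + 0j, 2 - 1j, 3 + 2j, 4 + 0j, 5 - 3j]
H = hankel_embed(freq_slice)
recovered = antidiagonal_average(H)
print(len(H), len(H[0]), recovered == freq_slice)
```

In a full workflow, the matrix H would be decomposed by SVD, selected singular values would be set to zero, and the modified matrix would be passed to the averaging step instead of H itself.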

Using the 13 clusters listed in Table 3, we investigated how the number of clusters (singular values) selected to represent diffraction affects the separation of the GPR diffraction. The GPR diffraction is reconstructed using different numbers of singular values from the clusters. A total of six cases are considered, in which the following singular values are set to zero: 5 singular values from the first 4 clusters (Case 11), 6 singular values from the first 5 clusters (Case 12), 7 singular values from the first 6 clusters (Case 13), 9 singular values from the first 7 clusters (Case 14), 12 singular values from the first 8 clusters (Case 15), and 17 singular values from the first 9 clusters (Case 16). In each case, the remaining singular values from the other clusters are used to reconstruct the diffraction, and the resulting performance is shown in Figure 15.

It can be seen in Figure 15 that setting a smaller number of clusters (singular values) to zero results in less energy loss at the diffraction vertex. However, this also leaves stronger residual reflection energy within the diffraction profile, as can be observed in Figure 15a,b. Conversely, increasing the number of clusters (singular values) set to zero decreases the residual reflection energy in the reconstructed diffraction profile. It is worth noting that the reduction in diffraction vertex energy remains significantly smaller than the reduction in reflection energy, thereby enhancing the overall separation performance. As the number of clusters (singular values) set to zero continues to increase, the residual reflection energy in the reconstructed diffraction profile is eliminated, but at the cost of increased energy loss in the diffraction itself. This finding is also substantiated by the A-scan waveform comparison of the diffraction vertex and two wings, as shown in Figure 16. Case 14 offers the best balance: it preserves the highest diffraction energy at the vertex and maintains superior energy retention in the wings, as indicated by the arrow. Case 14 is therefore regarded as the most effective setting for separating the GPR diffraction.
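The bookkeeping behind the six cases can be made explicit with a short sketch. The cumulative counts below are the ones quoted above for k = 13; the helper names are hypothetical, and the total of 30 singular values used for the mask is an assumed figure for illustration only, since the actual length of the singular spectrum is not stated in this section.

```python
# Number of leading clusters set to zero -> number of singular values removed,
# as quoted in the text for Cases 11 to 16 (k = 13).
zeroed_by_clusters = {4: 5, 5: 6, 6: 7, 7: 9, 8: 12, 9: 17}

def cluster_sizes(cumulative):
    """Sizes of individual clusters implied by consecutive cumulative counts."""
    keys = sorted(cumulative)
    return {b: cumulative[b] - cumulative[a] for a, b in zip(keys, keys[1:])}

def keep_mask(n_singular_values, n_zeroed):
    """Weights for the singular values: 0 removes a value, 1 keeps it."""
    return [0] * n_zeroed + [1] * (n_singular_values - n_zeroed)

sizes = cluster_sizes(zeroed_by_clusters)
print(sizes)  # sizes of clusters 5 to 9

# Case 14: zero the 9 singular values of the first 7 clusters.
# The spectrum length of 30 is a hypothetical value.
mask_case_14 = keep_mask(30, zeroed_by_clusters[7])
print(mask_case_14)
```

The implied cluster sizes grow toward the tail of the spectrum, which is consistent with the non-uniform distribution of singular values across clusters noted for Figure 14.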

Diffraction profiles can likewise be reconstructed by zeroing out different clusters (singular values) when the total number of clusters is 9, 11, and 15. The specific clusters and their corresponding singular values set to zero are listed in Table 3. Testing and analysis show that diffraction separation is most effective in Cases 3, 8, and 20, which correspond to total cluster numbers of 9, 11, and 15, respectively, as shown in Figure 17, where the reconstructed diffraction and reflection profiles exhibit minimal differences. Although the number of clusters set to zero differs among these three cases, the initial 9 singular values involved are identical. Therefore, for the MSSA method, the total number of clusters in the k-means clustering algorithm is recommended to range between 9 and 15. Within this range, the value of k is typically optimized based on the complexity of the data and the wavefield characteristics.

4.4. Anti-Noise Performance Testing

To evaluate the anti-noise performance of the MWT-MSSA method, Gaussian noise with an SNR of 35 dB is added to the synthetic data. In the noisy profile depicted in Figure 18, the diffractions are visible only at the vertex. The majority of the energy around the two wings is concealed by noise and is difficult to discern; in addition, the diffraction on the right side is nearly undetectable. Even though the reflection can still be identified, the noise affects its energy, leading to a significant decrease in resolution.
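A noise level specified in decibels can be imposed on a trace as sketched below. This is a minimal illustration, assuming that the SNR is defined as the ratio of mean signal power to mean noise power expressed in dB; the exact definition and noise generator used in this study are not stated, and the function names and the toy trace are hypothetical.

```python
import math
import random

def add_gaussian_noise(signal, target_snr_db, seed=0):
    """Add Gaussian noise scaled so that the power ratio matches target_snr_db."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Required noise power for the requested SNR (assumed power-ratio definition).
    p_target = p_signal / 10.0 ** (target_snr_db / 10.0)
    scale = math.sqrt(p_target / p_noise)
    return [s + scale * n for s, n in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy trace relative to its clean version."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)

# Toy decaying oscillation standing in for one A-scan (illustrative only).
trace = [math.sin(0.3 * i) * math.exp(-0.02 * i) for i in range(200)]
noisy_trace = add_gaussian_noise(trace, target_snr_db=35.0)
print(round(measured_snr_db(trace, noisy_trace), 3))
```

Because the noise is rescaled after it is drawn, the measured SNR matches the target for any seed; for a 2-D profile the same scaling would be applied with powers computed over all traces.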

Figure 19 shows the low-frequency approximation and high-frequency detail coefficient profiles obtained by performing a three-level wavelet transform with the bior4.4 wavelet basis function. As can be observed, the distribution of the reflection and diffraction across the coefficient profiles at each level closely mirrors the patterns seen under noise-free conditions. It is noteworthy that after the three-level wavelet transform, the noise is mainly decomposed into the low-frequency approximation coefficients and high-frequency horizontal detail coefficients at each level. Likewise, as the wavelet transformation level increases, the reflection, diffraction, and noise are progressively decomposed into the high-frequency detail coefficients at each level. This observation underscores the effectiveness of MWT in separating diffraction in noisy GPR simulation profiles.

As shown in Figure 19e, the profile contains a certain amount of noise, while the noise presented in Figure 19c,g is more prominent, making it difficult to identify the diffraction. Since the singular values that correspond to noise are relatively small and typically located in the latter half of the singular spectrum after SVD, the k-means clustering algorithm is employed to perform cluster analysis on the coefficient profiles. Figure 20 displays the diffraction and reflection profiles reconstructed by the MSSA, and Table 4 lists the parameters related to the cluster analysis. As can be seen, the noise energy is relatively weak in Figure 20a, and the diffraction is clearly visible in Figure 20b,c. The GPR diffraction and reflection profiles reconstructed from the coefficient profiles after diffraction extraction are presented in Figure 21b,c. The diffraction profile obtained using the MWT-MSSA shows a significant decrease in reflection and noise components. The separation performance is encouraging, as three hyperbolic diffraction events are clearly visible.

Figure 21e illustrates the diffraction profile obtained by applying PWD filtering with the dip field depicted in Figure 21d. The diffraction profile still contains substantial noise, which reduces the clarity of the three hyperbolic diffraction events and indicates a suboptimal separation performance. More importantly, as listed in Table 1, PWD took approximately 39 min, while MWT-MSSA completed the process in 12 min, resulting in a 69.2% improvement in computational efficiency. A comparison between the diffraction profiles separated from the noisy data and those derived from the noise-free data revealed a noticeable difference in SNR. The diffraction profile obtained using MWT-MSSA has an SNR of 1.382, which is significantly higher than the SNR of 0.014 achieved using PWD. This confirms the robustness of the MWT-MSSA method for diffraction separation in terms of anti-noise performance.
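The quoted efficiency figure follows from the two run times, assuming that "improvement in computational efficiency" denotes the relative reduction in run time. The short sketch below reproduces this arithmetic; the function name is hypothetical.

```python
def runtime_reduction_percent(t_reference_min, t_proposed_min):
    """Relative reduction in run time, in percent of the reference run time."""
    return 100.0 * (t_reference_min - t_proposed_min) / t_reference_min

# Run times quoted from Table 1: PWD about 39 min, MWT-MSSA 12 min.
reduction = runtime_reduction_percent(39.0, 12.0)
speedup = 39.0 / 12.0  # equivalent speed-up factor

print(round(reduction, 1), speedup)
```

Expressed as a speed-up factor, the same two run times correspond to MWT-MSSA being 3.25 times faster than PWD.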

5. Field Data Test

To validate the efficacy of the proposed method on field data, we performed a field trial with the SIR-4000 GPR system and surveyed a concrete pavement. With a central frequency of 400 MHz for both transmitting and receiving antennas, the acquisition employed a common-offset configuration with an offset of 0.02 m, a time window of 45 ns, and 1024 samples per trace. As shown in Figure 22, several diffractions with relatively high energy are distinctly visible in the observed GPR profile. However, faint diffractions, as highlighted by white arrows, are severely obscured by strong reflections that may originate from layer interfaces, making these vital components difficult to discern.

A 5-level wavelet transform was performed on the raw GPR profile. The resulting low-frequency approximation and high-frequency detail coefficient profiles for each level are presented in Figure 23. As illustrated, MWT effectively decomposes the majority of the energy from reflection and diffraction into high-frequency detail coefficients. However, the rate of decomposition is slightly slower than that observed in the synthetic data. Figure 23m–t shows that the energy of the reflection is mostly distributed among the low-frequency approximation coefficients d_LL₄ and high-frequency horizontal detail coefficients d_LH₄. In contrast, the diffraction energy is mainly concentrated in the high-frequency vertical detail coefficients d_HL₄, while the vertex energy of the diffraction is still located in the low-frequency approximation coefficients d_LL₄. After the fifth-level wavelet transform, the energy at the vertex of the diffraction (previously retained in the fourth-level low-frequency approximation coefficients) is further decomposed into the fifth-level high-frequency horizontal detail coefficients, as illustrated in Figure 23r. If the low-frequency approximation coefficients and high-frequency vertical detail coefficients from the fifth-level wavelet transform are used to reconstruct the diffraction, a significant energy loss at the vertex of the diffraction may occur in the reconstructed result. To address this, the MWT-MSSA method is employed to reconstruct the diffraction from the coefficients of the first to fourth levels of the wavelet transform. The corresponding k-means cluster analysis parameters are listed in Table 5.

Figure 24a,b present the diffraction and reflection profiles derived using the MWT-MSSA. The results demonstrate that the reflection present in the observed GPR profile has been effectively eliminated, with only slight energy loss observed at the vertex of the diffraction. As indicated by the arrow in Figure 22, the diffraction is obscured by the reflection. However, in the profile shown in Figure 24a, the diffraction (marked by the dashed circle) is clearly visible. In contrast, the diffraction obtained from the dip field (Figure 25c) through PWD exhibits a higher energy loss at the vertex of the diffraction (indicated by the arrow), resulting in less effective separation, as shown in Figure 25b. A comparison of the A-scan waveforms (Figure 26) along the white dashed lines in Figure 24a and Figure 25a shows that the MWT-MSSA method performs better in conserving diffraction energy. Quantitative evaluation shows that MWT-MSSA achieves an energy-retention ratio of approximately 69.7%, a significant improvement over the 58.3% obtained with the PWD method. More importantly, as listed in Table 1, PWD took approximately 39 min, while MWT-MSSA finished the task in 12 min, resulting in a 69.2% improvement in computational efficiency. In this sense, the proposed MWT-MSSA outperforms PWD, with significantly reduced energy loss at the vertex of the diffraction, enhanced noise suppression capability, and improved computational efficiency. The production of high-quality diffraction profiles from GPR field data using this new method could facilitate subsequent diffraction processing or subtle imaging.

6. Conclusions

To address the challenge of pinpointing the singular value thresholds for different wave components, this paper combines a multilevel wavelet transform (MWT) with multiple singular spectrum analysis (MSSA). The intrinsic attributes of near-linear reflections and hyperbolic diffractions are characterized by analyzing the multilevel coefficient profiles of the GPR signals with singular value decomposition and k-means clustering. Through cluster analysis, the singular spectrum is divided into categories based on similarity, and the clusters of singular values that represent the residual diffraction are identified and used in the reconstruction of the GPR diffraction.

To facilitate reproduction of the proposed method, we examine the impact of the wavelet basis function on separation performance by comparing three common wavelet basis functions (haar, db4, and rbio4.4) under a three-level wavelet transform, and we analyze the selection of clustering parameters and the robustness against noise.

Numerical experiments and field trials demonstrate that the biorthogonal wavelet basis function bior4.4 separates GPR diffraction efficiently. Notably, the number of clusters in the k-means algorithm should take into account the complexity of the GPR wavefield and typically ranges from 9 to 15 in the proposed MWT-MSSA. Compared with the classical plane-wave destruction (PWD) method, the proposed approach reduces energy loss at the diffraction vertex, decreases residual diffraction energy within the reflection profile, and improves noise immunity. Additionally, the computational efficiency is improved by approximately 70–80%, which allows for the creation of high-resolution diffraction profiles for efficient data processing and subsequent subtle imaging.

Author Contributions

Methodology, H.W. (Haolin Wang) and H.W. (Honghua Wang); validation, H.W. (Haolin Wang), H.W. (Honghua Wang), Z.H. and F.Z.; formal analysis, Z.H.; investigation, H.W. (Haolin Wang), Z.H. and F.Z.; data curation, H.W. (Haolin Wang); writing—original draft, H.W. (Haolin Wang); writing—review and editing, H.W. (Honghua Wang); supervision, F.Z.; Funding acquisition, H.W. (Honghua Wang). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 42364010) and the Natural Science Foundation of Guangxi Province (Grant No. 2022GXNSFAA035595).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Acknowledgments

The authors greatly appreciate the anonymous reviewers and academic editors for their valuable comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Alani, A.M.; Tosti, F.; Ciampoli, L.B.; Gagliardi, V.; Benedetto, A. An integrated investigative approach in health monitoring of masonry arch bridges using GPR and InSAR technologies. NDT E Int. 2020, 115, 102288.
2. Joshaghani, A.; Shokrabadi, M. Ground penetrating radar (GPR) applications in concrete pavements. Int. J. Pavement Eng. 2022, 23, 4504–4531.
3. Bornik, A.; Neubauer, W. 3D Visualization Techniques for Analysis and Archaeological Interpretation of GPR Data. Remote Sens. 2022, 14, 1709.
4. Kang, M.-S.; Kim, N.; Im, S.B.; Lee, J.-J.; An, Y.-K. 3D GPR image-based UcNet for enhancing underground cavity detectability. Remote Sens. 2019, 11, 2545.
5. Liu, Z.; Yeoh, J.K.W.; Gu, X.; Dong, Q.; Chen, Y.; Wu, W.; Wang, L.; Wang, D. Automatic pixel-level detection of vertical cracks in asphalt pavement based on GPR investigation and improved mask R-CNN. Autom. Constr. 2023, 146, 104689.
6. Schwarz, B.; Krawczyk, C.M. Coherent diffraction imaging for enhanced fault and fracture network characterization. Solid Earth 2020, 11, 1891–1907.
7. Park, S.; Kim, J.; Jeon, K. Improvement of GPR-based rebar diameter estimation using YOLO-v3. Remote Sens. 2021, 13, 2011.
8. Zhang, B.; He, R.; Zhang, H. Diffraction separation by a coherence analysis framework for ground penetrating radar applications. IEEE Trans. Geosci. Remote Sens. 2023, 61, 2003115.
9. Xue, Z.; Zhang, H.; Zhao, Y. Pattern-guided dip estimation with plane-wave destruction filters. Geophys. Prospect. 2019, 67, 1798–1810.
10. Shen, H.-Y.; Li, Q.; Yan, Y.-Y.; Li, X.-X.; Zhao, J. Separation of diffracted waves via SVD filter. Pet. Sci. 2020, 17, 1259–1271.
11. Lin, P.; Peng, S.; Li, C. Separation and imaging of seismic diffractions using geometric mode decomposition. Geophysics 2023, 88, WA239–WA251.
12. Fomel, S.; Landa, E.; Taner, M.T. Post-stack velocity analysis by separation and imaging of seismic diffractions. Geophysics 2007, 72, U89–U94.
13. Zhang, X.; Feng, X.; Zhang, Z. Dip filter and random noise suppression for GPR B-scan data based on a hybrid method in f-x domain. Remote Sens. 2019, 11, 2180.
14. Kanasewich, E.R.; Phadke, S.M. Imaging discontinuities on seismic sections. Geophysics 1988, 53, 334–345.
15. Khaidukov, V.; Landa, E.; Moser, T.J. Diffraction imaging by focusing-defocusing: An outlook on seismic superresolution. Geophysics 2004, 69, 1478–1490.
16. Moser, T.J.; Howard, C.B. Diffraction imaging in depth. Geophys. Prospect. 2008, 56, 627–641.
17. De Ribet, B.; Yelin, G.; Serfaty, Y. High resolution diffraction imaging for reliable interpretation of fracture systems. First Break 2017, 35, 2.
18. Nowak, E.J.; Imhof, M.G. Diffractor Localization via Weighted Radon Transforms. In Proceedings of the SEG International Exposition and Annual Meeting, Denver, CO, USA, 10–15 October 2004; pp. 2108–2111.
19. Akalin, M.F.; Khan, F.A.; Saleh, A.S.B. An effective methodology for high resolution diffraction imaging. In Proceedings of the SEG International Exposition and Annual Meeting, Houston, TX, USA, 22–27 September 2013; pp. 4060–4065.
20. Karimpouli, S.; Malehmir, A.; Hassani, H.; Khoshdel, H.; Nabi-Bidhendi, M. Automated diffraction delineation using an apex-shifted Radon transform. J. Geophys. Eng. 2015, 12, 199–209.
21. St. Clair, J.; Holbrook, W.S. Measuring snow water equivalent from common-offset GPR records through migration velocity analysis. Cryosphere 2017, 11, 2997–3009.
22. Liu, Y.; Irving, J.; Holliger, K. High-resolution velocity estimation from surface-based common-offset GPR reflection data. Geophys. J. Int. 2022, 230, 131–144.
23. Kong, X.; Wang, D.-Y.; Li, Z.-C.; Zhang, R.-X.; Hu, Q.-Y. Diffraction separation by plane-wave prediction filtering. Appl. Geophys. 2017, 14, 399–405.
24. Lowney, B.; Lokmer, I.; O’Brien, G.S.; Amy, L.; Bean, C.J.; Igoe, M. Enhancing interpretability with diffraction imaging using plane-wave destruction aided by frequency-wavenumber f-k filtering. Interpretation 2020, 8, T541–T554.
25. Oropeza, V.; Sacchi, M. Simultaneous seismic data denoising and reconstruction via multichannel singular spectrum analysis. Geophysics 2011, 76, V25–V32.
26. Yu, Z.; Hattori, K.; Zhu, K.; Chi, C.; Fan, M.; He, X. Detecting earthquake-related anomalies of a borehole strain network based on multi-channel singular spectrum analysis. Entropy 2020, 22, 1086.
27. Lin, P.; Zhao, J.; Peng, S. Low-rank diffraction separation using an improved MSSA algorithm. Acta Geophys. 2021, 69, 1651–1665.
28. Abdeyazdan, M. Data clustering based on hybrid K-harmonic means and modifier imperialist competitive algorithm. J. Supercomput. 2014, 68, 574–598.
29. Kumar, B.S.; Sahoo, A.K.; Maiti, S. Integrated Feature Investigation and Classification Methods for Discrimination of Subsurface Objects in GPR Imagery. IEEE Sens. J. 2024, 24, 11003–11013.
30. Zhang, L.; Ling, T.; Yu, B.; Huang, F.; Zhang, S. Intensive interferences processing for GPR signal based on the wavelet transform and FK filtering. J. Appl. Geophys. 2021, 186, 104273.
31. Oliveira, R.J.; Caldeira, B.; Teixidó, T.; Borges, J.F.; Bezzeghoud, M. Geophysical data fusion of ground-penetrating radar and magnetic datasets using 2D wavelet transform and singular value decomposition. Front. Earth Sci. 2022, 10, 1011999.
32. Al Bassam, N.; Ramachandran, V.; Parameswaran, S.E. Wavelet theory and application in communication and signal processing. In Wavelet Theory; BoD—Books on Demand: Hamburg, Germany, 2021; Volume 45.
33. Ge, J.; Sun, H.; Shao, W.; Liu, D.; Liu, H.; Zhao, F.; Tian, B.; Liu, S. Wavelet-GAN: A GPR Noise and Clutter Removal Method Based on Small Real Datasets. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5918214.
34. Nuzzo, L.; Quarta, T. Improvement in GPR coherent noise attenuation using t-p and wavelet transforms. Geophysics 2004, 69, 789–802.
35. Lu, G.; Zhao, W.; Forte, E.; Tian, G.; Li, Y.; Pipan, M. Multi-frequency and multi-attribute GPR data fusion based on 2-D wavelet transform. Measurement 2020, 166, 108243.
36. Bazine, R.; Wu, H.; Boukhechba, K. Spectral DWT multilevel decomposition with spatial filtering enhancement preprocessing-based approaches for hyperspectral imagery classification. Remote Sens. 2019, 11, 2906.
37. Yin, Z.; Shi, W.-Z.; Wu, Z.; Zhang, J. Multilevel wavelet-based hierarchical networks for image compressed sensing. Pattern Recognit. 2022, 129, 108758.
38. Guo, T.; Zhang, T.; Lim, E.; Lopez-Benitez, M.; Ma, F.; Yu, L. A review of wavelet analysis and its applications: Challenges and opportunities. IEEE Access 2022, 10, 58869–58903.
39. Werthmüller, D.; Key, K.; Slob, E.C. A tool for designing digital filters for the Hankel and Fourier transforms in potential, diffusive, and wavefield modeling. Geophysics 2019, 84, F47–F56.
40. Lin, P.; Peng, S.; Zhao, J.; Cui, X. Diffraction separation and imaging using multichannel singular-spectrum analysis. Geophysics 2020, 85, V11–V24.
41. Li, H.; Lin, J.; Liu, N.; Li, F.; Gao, J. Seismic reservoir delineation via Hankel transform based enhanced empirical wavelet transform. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1411–1414.
42. Ramnivas, K.; Ravi, N.; Singh, S.K. Selection of suitable mother wavelet along with vanishing moment for the effective detection of crack in a beam. Mech. Syst. Signal Process. 2022, 163, 108136.

Figure 1. Simulated GPR profile (a) and its singular spectrum at the dominant frequency (b), with the diffraction profiles separated by the mean value method (c) and the cumulative energy approach (d), respectively.

Figure 2. Flowchart of the proposed MWT-MSSA algorithm of GPR diffraction separation.

Figure 3. (a) Model of relative dielectric permittivity and electric conductivity distribution and (b) its simulated GPR data.

Figure 4. Three-level wavelet transform coefficient profiles of the GPR data present in Figure 3b. (a,e,i) are the reconstructed profiles of the low-frequency approximation coefficients for the first, second, and third levels, respectively. (b,f,j) are the reconstructed profiles of the high-frequency horizontal detail coefficients for the first, second, and third levels, respectively. (c,g,k) are the reconstructed profiles of the high-frequency vertical detail coefficients for the first, second, and third levels, respectively. (d,h,l) are the reconstructed profiles of the high-frequency diagonal detail coefficients for the first, second, and third levels, respectively.

Figure 5. The diffraction profile (a) and reflection profile (b) extracted from the second-level low-frequency approximation coefficient in Figure 4e by using the k-means clustering-based MSSA method.

Figure 6. Comparison of diffraction and reflection profiles separated using MWT-MSSA and PWD. (a) Synthetic GPR data; (b) Diffraction profile and (c) reflection profile obtained using MWT-MSSA; (d) Dip field utilized for PWD separation; (e) Diffraction profile and (f) reflection profile obtained using PWD.

Figure 7. Comparison of A-scan waveforms extracted from the wings (a,b) and vertex (c) of the diffraction located at the white dashed lines marked in Figure 6b,e.

Figure 8. Third level wavelet transform coefficient profiles of the GPR data shown in Figure 3b using the haar wavelet. (a,e,i) represent the reconstructed profiles of the first to third level low-frequency approximation coefficients. (b,f,j) represent the reconstructed profiles of the first to third level high-frequency horizontal detail coefficients. (c,g,k) represent the reconstructed profiles of the first to third level high-frequency vertical detail coefficients. (d,h,l) represent the reconstructed profiles of the first to third level high-frequency diagonal detail coefficients.

Figure 9. Third level wavelet transform coefficient profiles of the GPR data shown in Figure 3b using the rbio4.4 wavelet. (a,e,i) represent the reconstructed profiles of the first to third level low-frequency approximation coefficients. (b,f,j) represent the reconstructed profiles of the first to third level high-frequency horizontal detail coefficients. (c,g,k) represent the reconstructed profiles of the first to third level high-frequency vertical detail coefficients. (d,h,l) represent the reconstructed profiles of the first to third level high-frequency diagonal detail coefficients (the location indicated by the arrow shows a weak-energy reflected wave).

Figure 10. Third level wavelet transform coefficient profiles of the GPR data shown in Figure 3b using the db4 wavelet. (a,e,i) represent the reconstructed profiles of the first to third level low-frequency approximation coefficients. (b,f,j) represent the reconstructed profiles of the first to third level high-frequency horizontal detail coefficients. (c,g,k) represent the reconstructed profiles of the first to third level high-frequency vertical detail coefficients. (d,h,l) represent the reconstructed profiles of the first to third level high-frequency diagonal detail coefficients.

Figure 11. Diffraction separation results with different wavelet basis functions. (a–c) show the diffraction separated using haar, rbio4.4, and db4 wavelet basis functions, respectively. (d–f) show the reflection separated using haar, rbio4.4, and db4 wavelet basis functions, respectively.

Figure 12. Comparison of the single-trace waveforms of the diffraction vertex (a) and wings (b) extracted from Figure 11a–c.

Figure 13. Diffraction and reflection profiles extracted from Figure 4a with k-means cluster numbers set to 9, 11, and 13. (a–c) represent the diffraction profiles extracted with k-means cluster numbers of 9, 11, and 13, respectively. (d–f) represent the reflection profiles extracted with k-means cluster numbers of 9, 11, and 13, respectively.

Figure 14. k-means clustering result of the singular spectrum at the dominant frequency location of the coefficient profile d_LL₂. (a) shows the singular value distribution of the first cluster in the singular spectrum, while (b) shows the singular value distribution of the second to thirteenth clusters in the singular spectrum.

Figure 15. Diffraction profiles reconstructed by selecting different numbers of clusters (singular values) in 6 Cases when the total number of clusters is 13. (a) Case 11, (b) Case 12, (c) Case 13, (d) Case 14, (e) Case 15, (f) Case 16.

Figure 16. Comparison of the single trace waveforms of the diffraction’s vertex (a) and wings (b) extracted from Figure 15 (In (a), the left arrow indicates the reflection removal effect under different cases, while the right arrow indicates the energy retention effect of the diffraction vertex, and in (b), the arrow indicates the energy retention effect of the diffraction wave wings under different cases).

Figure 17. Optimal diffraction profiles reconstructed with different total numbers of clusters. (a,b) show the diffraction and reflection profiles separated in Case 3 with a total of 9 clusters. (c,d) show the diffraction and reflection profiles separated in Case 8 with a total of 11 clusters. (e,f) show the diffraction and reflection profiles separated in Case 20 with a total of 15 clusters.

Figure 18. Noisy GPR profile by adding Gaussian noise (SNR = 35 dB) to the synthetic data shown in Figure 3b.

Figure 19. Coefficient profiles of the three-level wavelet transform of the GPR data from Figure 18. (a,e,i) are the profiles reconstructed from the low-frequency approximation coefficients of levels 1–3. (b,f,j) are the profiles reconstructed from the high-frequency horizontal detail coefficients of levels 1–3. (c,g,k) are the profiles reconstructed from the high-frequency vertical detail coefficients of levels 1–3. (d,h,l) are the profiles reconstructed from the high-frequency diagonal detail coefficients of levels 1–3.

Figure 20. Diffraction separation results from Figure 19c,e,g. (a,d) show the diffraction and reflection with noise extracted from Figure 19e, respectively. (b,e) present the diffraction and noise extracted from Figure 19g, respectively. (c,f) display the diffraction and noise extracted from Figure 19c, respectively.

Figure 21. Comparison of diffraction and reflection profiles separated using MWT-MSSA and PWD. (a) Noisy GPR profile; (b) diffraction profile and (c) reflection profile separated using the MWT-MSSA method; (d) dip field applied in PWD separation; (e) diffraction profile and (f) reflection profile obtained from PWD separation.

Figure 22. Observed GPR profile.

Figure 23. Coefficient profiles of the 5-level wavelet transform of the observed GPR data in Figure 22. (a,e,i,m,q) are the profiles reconstructed from the low-frequency approximation coefficients of levels 1–5. (b,f,j,n,r) are the profiles reconstructed from the high-frequency horizontal detail coefficients of levels 1–5. (c,g,k,o,s) are the profiles reconstructed from the high-frequency vertical detail coefficients of levels 1–5. (d,h,l,p,t) are the profiles reconstructed from the high-frequency diagonal detail coefficients of levels 1–5.

Figure 24. Results of MWT-MSSA separated diffraction (a) and reflection (b) in the observed GPR profile.

Figure 25. PWD-separated diffraction (a) and reflection (b), along with the dip field (c) from the observed GPR profile.

Figure 26. Comparison of single-trace waveforms of the diffraction separated using the MWT-MSSA and PWD methods (the arrow in the figure indicates the vertex of the diffraction).

Table 1. Computation time statistics for MWT-MSSA and PWD methods.

Computation Time (min)	Synthetic Data	Synthetic Data with Noise	Field Data
MWT-MSSA	8	12	16
PWD	38	39	45
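
The timings in Table 1 can be turned into a relative cost with simple arithmetic. The short sketch below only restates the tabulated values (in minutes) and divides them; the variable names are illustrative and not taken from the paper.

```python
# Computation times from Table 1, in minutes: (MWT-MSSA, PWD) per data set.
times_min = {
    "Synthetic data": (8, 38),
    "Synthetic data with noise": (12, 39),
    "Field data": (16, 45),
}

# Ratio of PWD time to MWT-MSSA time; a value above 1 means MWT-MSSA was faster.
speedup = {name: pwd / mwt for name, (mwt, pwd) in times_min.items()}

for name, ratio in speedup.items():
    print(f"{name}: PWD / MWT-MSSA = {ratio:.2f}x")
```

On these numbers, PWD took roughly 2.8 to 4.75 times as long as MWT-MSSA, with the advantage shrinking as the data become noisier or larger.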

Table 2. Comparison of Properties of Common Wavelet Basis Functions.

Properties	Haar Wavelet	Daubechies Wavelet (dbN)	Reverse Biorthogonal Wavelet (rbioN_d.N_r)	Biorthogonal Wavelet (biorN_r.N_d)
Orthogonality	Yes	Yes	No	No
Biorthogonality	Yes	Yes	Yes	Yes
Support Width	1	2N − 1	2N_d + 1 (Reconstruction) 2N_r + 1 (Decomposition)	2N_r + 1 (Reconstruction) 2N_d + 1 (Decomposition)
Regularity	Discontinuous	Approximately 0.2N	N_d − 1, N_d − 2	N_r − 1, N_r − 2
Symmetry	Yes	No	Yes	Yes
Vanishing Moments	1	N	N_d	N_r

Table 3. Singular Spectrum Truncation Points Determined by the k-means Clustering Algorithm.

Cases	Total Clusters	Zeroed Clusters	Reconstructed Clusters	Zeroed Singular Values	Reconstructed Singular Values
Case 1	9	1–3	4–9	1–5	6–78
Case 2		1–4	5–9	1–6	7–78
Case 3		1–5	6–9	1–9	10–78
Case 4		1–6	7–9	1–12	13–78
Case 5	11	1–3	4–11	1–5	6–78
Case 6		1–4	5–11	1–6	7–78
Case 7		1–5	6–11	1–7	8–78
Case 8		1–6	7–11	1–9	10–78
Case 9		1–7	8–11	1–12	13–78
Case 10		1–8	9–11	1–17	18–78
Case 11	13	1–4	5–13	1–5	6–78
Case 12		1–5	6–13	1–6	7–78
Case 13		1–6	7–13	1–7	8–78
Case 14		1–7	8–13	1–9	10–78
Case 15		1–8	9–13	1–12	13–78
Case 16		1–9	10–13	1–17	18–78
Case 17	15	1–5	6–15	1–5	6–78
Case 18		1–6	7–15	1–6	7–78
Case 19		1–7	8–15	1–7	8–78
Case 20		1–8	9–15	1–9	10–78
Case 21		1–9	10–15	1–12	13–78
Case 22		1–10	11–15	1–14	15–78
Case 23		1–11	12–15	1–17	18–78
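
The mapping in Table 3 from "zeroed clusters" to "zeroed singular values" can be illustrated with a minimal sketch. It assumes that the clustering is applied to the singular value magnitudes alone (a one-dimensional k-means) and that clusters are numbered from the largest values downward. The paper may use different features or initialisation, so this is not the authors' implementation. The function names and the toy spectrum are hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars.

    Returns one label per value, relabelled so that label 0 is the cluster
    with the largest centroid (cluster 1 in the table's 1-based numbering).
    """
    lo, hi = min(values), max(values)
    # Deterministic start (an assumption): centroids evenly spaced over the range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assign each value to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each non-empty cluster's centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    # Rank the non-empty clusters by descending centroid.
    used = sorted(set(labels), key=lambda c: -centroids[c])
    rank = {c: r for r, c in enumerate(used)}
    return [rank[lab] for lab in labels]


def zero_leading_clusters(singular_values, k, n_zeroed):
    """Zero every singular value that falls in the first n_zeroed clusters."""
    labels = kmeans_1d(singular_values, k)
    truncated = [0.0 if lab < n_zeroed else s
                 for s, lab in zip(singular_values, labels)]
    return truncated, labels


# Toy singular spectrum (hypothetical), sorted in descending order.
spectrum = [100, 95, 50, 48, 46, 10, 9, 8, 1, 0.9, 0.8]
truncated, labels = zero_leading_clusters(spectrum, k=3, n_zeroed=1)
print(labels)
print(truncated)
```

On this toy spectrum the two largest values form the first cluster and are set to zero, which mirrors how each row of Table 3 suppresses the high-amplitude (reflection-dominated) part of the spectrum and keeps the remainder for reconstruction.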

Table 4. k-means clustering parameters and singular spectrum truncation positions for noisy GPR profile.

Reconstructed Profiles	Total Clusters	Reconstructed Clusters	Reconstructed Singular Values
Figure 19e	13	8–9	10–19
Figure 19c	13	1–4	1–15
Figure 19g	13	1–5	1–15

Table 5. k-means clustering parameters and singular spectrum truncation for GPR field data.

Reconstructed Profiles	Total Clusters	Total Singular Values	Reconstructed Clusters	Reconstructed Singular Values
Figure 23m	10	425	4–8	7–58

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, H.; Wang, H.; Hou, Z.; Zhou, F. GPR Diffraction Separation by Incorporating Multilevel Wavelet Transform and Multiple Singular Spectrum Analysis. Appl. Sci. 2025, 15, 3204. https://doi.org/10.3390/app15063204
