Automatic Marine Sub-Bottom Sediment Classification Using Feature Clustering and Quality Factor

Zong, Zaixiang; Zhao, Jianhu; Li, Shaobo; Zhang, Hongmei

doi:10.3390/jmse11091770

¹ School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China

² Institute of Marine Science and Technology, Wuhan University, Wuhan 430079, China

³ The School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China

⁴ Department of Artificial Intelligence and Automation, School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2023, 11(9), 1770; https://doi.org/10.3390/jmse11091770

Submission received: 11 August 2023 / Revised: 6 September 2023 / Accepted: 8 September 2023 / Published: 11 September 2023

(This article belongs to the Section Geological Oceanography)

Abstract:

The quality factor (Q) has been shown to be important for representing the attenuation attributes of marine sediments and is helpful for sediment classification. However, the traditional spectral-ratio (SR) method is affected by interference caused by thin interbeds, which seriously degrades its performance. To address this problem, this paper presents a novel method based on variational mode decomposition (VMD) and correlation analysis that separates interference reflections from effective signals. After the effective signals are obtained, a frequency band selection method is employed to weaken the influence of background noise. To apply the proposed method to large-area sediment classification, a sediment clustering method based on texture features is introduced. Experiments on real data validate the effectiveness of the proposed method. The accuracy of the correlation analysis method using the modified parameters is 94 percent. The standard deviation of the Q calculation can be reduced by more than 90 percent. Moreover, the interpretation of sediment categories using the mean value of Q fits the drilling data well. The proposed method therefore shows considerable potential for engineering applications in sub-bottom sediment classification.

Keywords:

sediment classification; quality factor; VMD; correlation analysis; feature clustering

1. Introduction

The sub-bottom profiler (SBP) has been widely used in underwater topographic surveys, sediment surveys, mineral resource exploration, and marine scientific research [1,2,3]. The acoustic wave emitted by an SBP attenuates as it propagates through water and sediment because of the heterogeneity and anelasticity of the media [2,4]. Commonly, the attenuation is parameterized by the quality factor (Q), which is inversely proportional to the attenuation coefficient [3,5,6]. Q is sensitive to physical properties such as lithology, porosity, saturation, and permeability [5,7,8]. Hence, accurate Q estimation might help to improve the resolution of SBP data, enhance the fine details in stratigraphic features, and classify the stratigraphic material [2,8,9,10,11].

Q is often assumed to be approximately constant in the seismic frequency band [9,10,12]. Earlier methods for calculating Q are mainly based on signal amplitude information. However, they require prior knowledge of the true amplitude of the echo signal, which limits their application [13].

Since real amplitude information is difficult to obtain from echo signal data, signal-amplitude-based methods are difficult to apply in practice [14]. Therefore, most researchers have calculated Q using the energy intensity loss in the frequency domain.

The traditionally adopted spectral-ratio (SR) method is widely applied because its theory is simple and it is easy to use [15,16,17,18]. However, it is sensitive to noise and is easily influenced by interference effects. Schock [18] analyzed the performance of the SR method and derived the terms that describe the influence of thin interbed reflections. He claimed that these reflections cause difficulties in the calculation of Q [12,18,19,20].

Panda [21] proposed an attenuation-based sediment classification method based on the instantaneous frequency obtained with the Hilbert transform to avoid the influence of thin interbed reflections. This method is seriously affected by noise and is not suitable for multi-component and wide-band SBP sonar data [22,23,24]. Suitable strategies for obtaining a good instantaneous frequency series with instantaneous frequency-based methods have not yet been developed [2].

Therefore, the correction of the signal spectrum shape to remove the influence of interference effects has become the direction of Q calculation methods [23,24,25,26]. Hackert et al. [23] used the impedance parameters obtained from drilling data to correct the spectrum. This method improved the traditional SR method, but the drilling data are still needed. Pinson et al. [5] proposed an anti-noise SR method by considering the characteristic of the noise spectrum. This method can avoid the interference of noise, but ignores overlapping reflections. Chen et al. [26] proposed a generalized dispersive mode decomposition method. Xue et al. [27] designed a novel method based on variational mode decomposition (VMD). These methods work well on seismic data, but are not specifically designed for SBP sonar data [27,28,29,30,31].

These methods separate the signals of different components and achieve better results. However, since these methods do not make clear provisions for the selection of the band ranges, they need further improvement.

Based on the Q calculation method using VMD, this paper improves both the correlation analysis and the study of band range selection. In addition, we study the automatic classification method of the sediment based on feature clustering and Q to realize the automatic and accurate sub-bottom sediment classification in a large area. The rest of this paper is organized as follows. Section 2 provides a detailed description of the proposed method. Section 3 describes the data used in this study and analyzes the experimental results. Finally, our discussions and conclusions are presented in Section 4 and Section 5, respectively.

2. Methods

The technical flow of this paper is shown in Figure 1. There are two parts. The first part obtains the statistical features of the SBP data and, with the help of an unsupervised classifier (using the k-means algorithm), the multiple data subsets are derived. The second part uses VMD and the correlation distortion coefficient (CDC), as well as the frequency band selection criteria, to obtain an accurate Q. Combining the two parts, the automatic and accurate sub-bottom sediment classification in a large area is realized.

2.1. Q Calculation Based on VMD

Since Q calculation is more important, it is described first. The workflow of the modified Q calculation method is shown in Figure 2. The Q estimation is performed by a generalized Stockwell transform (GST), a VMD, a correlation analysis, band range selection, and finally a least-squares method.

2.1.1. Signal Decomposition and Reconstruction

The one-dimensional ping signal h(t) is transformed into a 2D time–frequency signal S(t, f) using generalized Stockwell transform (GST) [32,33,34]. The amplitude spectra at the two horizons (t = t₁, t = t₂) are S(t₁, f) and S(t₂, f). Then, according to the principle of acoustic wave attenuation, the following equation is obtained [5]:

S (t_{2}, f) = S (t_{1}, f) \exp (\frac{(1 - R_{1}^{2}) \cdot G_{2} R_{2}}{R_{1}}) \exp (- \frac{π f d T}{Q})

(1)

where f denotes frequency, dT = t₂ − t₁, R₁ and R₂ are the medium reflection coefficients at t₁ and t₂, G₂ is the acoustic spherical diffusion loss from t₁ to t₂, and Q is the average or equivalent Q from t₁ to t₂. Assuming that R, G, and Q are frequency independent, taking the natural logarithm of Equation (1) gives the following linear equation in f:

\ln S (t_{2}, f) = \ln S (t_{1}, f) + (\frac{(1 - R_{1}^{2}) \cdot G_{2} R_{2}}{R_{1}}) - (\frac{π d T}{Q}) f

(2)

To attenuate the thin interbed interference between t₁ and t₂, the echo spectrum is modeled as the sum of two contributions with different sources:

\ln S (f) = \ln W (f) + \ln R (f)

(3)

where W(f) and R(f) denote the Fourier transforms of the wavelet function and the reflection function, which vary slowly and rapidly with frequency, respectively [22,27]. Consequently, ln W(f) and ln R(f) are also slowly and rapidly varying functions of frequency, respectively.

The ln R(f) is related to the interference effect and should be removed. We introduce VMD combined with correlation analysis to adaptively achieve this goal.

2.1.2. Variational Mode Decomposition (VMD)

The VMD algorithm is an adaptive and fully non-recursive approach to mode decomposition and signal processing that adaptively matches the optimal center frequency and finite bandwidth of each mode. It efficiently decomposes a signal into multiple quasi-orthogonal intrinsic mode functions (IMFs) with a narrow-band nature by finding the optimal solution of a variational problem [27,35,36].

For a multimodal signal x(t) decomposed into n IMFs, the VMD can be expressed as:

\begin{cases} \min_{\{u_{k}\}, \{ω_{k}\}} \sum_{k = 1}^{n} ‖ \partial_{t} [(δ (t) + \frac{j}{π t}) * u_{k} (t)] e^{- j ω_{k} t} ‖_{2}^{2} \\ \sum_{k = 1}^{n} u_{k} (t) = x (t) \\ u_{k} (t) = A_{k} (t) \cos (ϕ_{k} (t)) \end{cases}

(4)

where u_k is the kth IMF, ω_k denotes the corresponding center frequency, δ(t) denotes the impulse function, and * is the convolution operation in the time domain. ϕ_k(t) is the signal phase of u_k (a non-decreasing function), A_k(t) ≥ 0 is the signal envelope, and both A_k(t) and the instantaneous frequency ω_k(t) = ϕ_k′(t) ≥ 0 change slowly compared with ϕ_k(t).

Based on VMD, ln S(f) is decomposed into multiple IMFs from low to high frequencies, and these IMFs can be divided into two parts: ln W(f)- and ln R(f)-dominant. The ln W(f) part of the signal changes slowly with frequency, so it mainly exists in the first few IMFs, while the ln R(f) that should be excluded mainly exists in the later IMFs.

2.1.3. Interference Effect ln R(f) Removal

Correlation analysis can be used to find the dividing point between the two parts of the IMFs (the ln W(f) and ln R(f) parts). The signal is then reconstructed from the low-frequency components before the dividing point, namely, the ln W(f)-related parts.

Based on the mutual information (MI) [37,38], the normalized “generalized correlation coefficient (GCC)” is first defined as:

R_{g}^{k} = \frac{I (u_{k}, u_{k + 1})}{\sqrt{H (u_{k}) H (u_{k + 1})}}, \quad k = 1, \ldots, n - 1

(5)
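As a rough illustration of Equation (5), the sketch below estimates the GCC between two sampled sequences. It assumes that I(·,·) is the mutual information and H(·) the Shannon entropy, and it estimates both from a joint histogram. The histogram estimator, the bin count, and the function names `shannon_entropy` and `gcc` are assumptions made for this example; the paper does not specify them.

```python
import numpy as np


def shannon_entropy(p):
    """Shannon entropy (in nats) of a probability array; empty cells are skipped."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def gcc(u_a, u_b, bins=16):
    """Histogram-based estimate of I(u_a, u_b) / sqrt(H(u_a) * H(u_b)).

    The estimator and the default bin count are illustrative assumptions.
    """
    joint, _, _ = np.histogram2d(u_a, u_b, bins=bins)
    p_ab = joint / joint.sum()
    h_a = shannon_entropy(p_ab.sum(axis=1))
    h_b = shannon_entropy(p_ab.sum(axis=0))
    h_ab = shannon_entropy(p_ab.ravel())
    mutual_info = h_a + h_b - h_ab  # I(a, b) = H(a) + H(b) - H(a, b)
    return mutual_info / np.sqrt(h_a * h_b)


# Hypothetical sequences standing in for two IMFs.
rng = np.random.default_rng(0)
u1 = rng.normal(size=2000)
u2 = rng.normal(size=2000)
print(gcc(u1, u1))  # identical sequences
print(gcc(u1, u2))  # independent sequences
```

With this normalization a sequence compared with itself gives a value of one, while independent sequences give a value near zero, which is the behavior the dividing-point search relies on.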

Theoretically, two IMFs belonging to the same ln W(f) or ln R(f) are highly correlated, whereas two IMFs belonging to ln W(f) and ln R(f) are less correlated, so the minimum in R_g can be used as a dividing point between ln W(f) and ln R(f). However, due to the random noise nature exhibited by the ln R(f) part of the signal, the degree of correlation between two IMFs belonging to the same ln R(f) is unpredictable, and the minimum R_g does not guarantee an accurate finding of the cutoff.

Therefore, this paper uses the difference between two adjacent R_g as the evaluation criterion for the cutoff point, and further defines the “correlation distortion coefficient (CDC)” as:

I^{k} = R_{g}^{k} - R_{g}^{k - 1}, \quad k = 2, \ldots, n - 1

(6)

A negative I^k with a large absolute value marks the dividing point, which is therefore taken as the index at which I^k is smallest:

G = \arg\min_{k = 2, \ldots, n - 1} I^{k}

(7)

Then, the ln W(f)-dominant signal at t = t₁ is reconstructed by:

\ln U (t_{1}, f) = \sum_{k = 1}^{G} u_{k}

(8)

where ln U(t₁, f) is considered as the reconstructed signal without the effect of reflection coefficients between thin interbeds.
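The selection rule in Equations (6) to (8) can be sketched as follows, taking G as the index at which I^k is smallest. The GCC values in the example are hypothetical and are not taken from the paper's tables, and the function names are placeholders chosen for this illustration.

```python
import numpy as np


def cdc_dividing_point(r_g):
    """Return (G, cdc) from a sequence of GCC values.

    r_g[i] holds R_g^k for k = i + 1 (GCC between IMF k and IMF k + 1).
    cdc[j] holds I^k for k = j + 2, following Equation (6).
    """
    r_g = np.asarray(r_g, dtype=float)
    cdc = np.diff(r_g)               # I^k = R_g^k - R_g^(k-1)
    g = int(np.argmin(cdc)) + 2      # shift the array index back to k
    return g, cdc


def reconstruct(imfs, g):
    """Sum the first g IMFs (rows of `imfs`), as in Equation (8)."""
    return np.asarray(imfs, dtype=float)[:g].sum(axis=0)


# Hypothetical GCC values for 7 IMFs (illustrative only).
r_g = [0.80, 0.35, 0.50, 0.45, 0.40, 0.20]
g, cdc = cdc_dividing_point(r_g)
k_min_gcc = int(np.argmin(r_g)) + 1  # index picked by the plain GCC minimum
print(g, k_min_gcc)
```

In this made-up case the largest drop in GCC occurs at the second value, whereas the smallest GCC sits at the end of the sequence, which mirrors the situation the CDC is meant to handle.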

Following Xue et al. [27], the signal at t = t₂ is too complex to be analyzed separately, so the result at t = t₁ is used as a reference. The higher the correlation coefficient between U(t₁, f) and U(t₂, f), the smaller the residual influence of reflection-coefficient-related content in U(t₂, f). Let:

G_{2} = \arg\max_{k = 1, \ldots, n} R (\sum_{i = 1}^{k} \exp (u_{i}), U (t_{1}, f))

(9)

where R denotes the correlation coefficient and the u_i are the IMFs of the signal at t = t₂. Here G₂ is the cutoff index at t = t₂ and should not be confused with the spherical diffusion loss G₂ in Equation (1).

Then, the signal at t = t₂ is reconstructed by:

\ln U (t_{2}, f) = \sum_{k = 1}^{G_{2}} u_{k}

(10)

2.1.4. Q Calculation and Frequency Band Range Selection

Combining Equations (8) and (10), Equation (2) can be rewritten as:

\ln U (t_{2}, f) = \ln U (t_{1}, f) + (\frac{(1 - R_{1}^{2}) \cdot G_{2} R_{2}}{R_{1}}) - (\frac{π d T}{Q}) f

(11)

To estimate Q, the quadratic objective function F(a, b) is minimized over the frequency band [f₁, f₂] using the least-squares method, where a is the frequency-independent term and b = π dT/Q is the slope in Equation (11):

F (a, b) = \sum_{f = f_{1}}^{f_{2}} [\ln U (t_{1}, f) - \ln U (t_{2}, f) + a - b f]^{2}

(12)

By using Equation (12) to obtain b, Q is estimated as:

Q = \frac{π d T}{b}

(13)
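A minimal sketch of the fit in Equations (11) to (13) is given below, assuming the two reconstructed log spectra are already sampled on a common frequency axis in Hz. The function name `estimate_q` and all numerical values are hypothetical and serve only as a synthetic check.

```python
import numpy as np


def estimate_q(freqs, ln_u1, ln_u2, d_t):
    """Fit ln U(t1, f) - ln U(t2, f) = b * f - a and return (Q, a, b)."""
    diff = np.asarray(ln_u1, dtype=float) - np.asarray(ln_u2, dtype=float)
    b, intercept = np.polyfit(freqs, diff, 1)  # slope b, intercept equals -a
    return np.pi * d_t / b, -intercept, b


# Synthetic check with hypothetical values: Q = 100, dT = 2 ms, band 4 to 12 kHz.
freqs = np.linspace(4e3, 12e3, 81)
q_true, d_t, a_true = 100.0, 2e-3, -0.5
ln_u1 = -((freqs - 8e3) / 4e3) ** 2  # arbitrary smooth log spectrum
ln_u2 = ln_u1 + a_true - (np.pi * d_t / q_true) * freqs
q_est, a_est, b_est = estimate_q(freqs, ln_u1, ln_u2, d_t)
print(q_est, a_est, b_est)
```

The sketch does not guard against a zero or negative slope b, which would give an undefined or negative Q; a practical implementation would need to handle that case.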

To obtain Q, the frequency band [f₁, f₂] in Equation (12) must be determined, and the estimated Q depends strongly on this choice [5]. Therefore, screening for frequencies with a high signal-to-noise ratio (SNR) is necessary to weaken the interference from background noise and other full-band properties.

The following band range selection process was developed:

(1): Band filtering based on background noise

The average energy intensity of the water body is chosen to be the background noise. We calculate the SNR of the reconstructed signal ln U(t, f) in the full band range and then filter out the frequencies where the SNR is less than one.

(2): Band filtering based on energy anomalies

Because acoustic waves attenuate as they propagate, the energy at the lower horizon must be smaller than that at the upper horizon in an ideal signal. The difference between the energies at the upper and lower horizons is therefore calculated, and frequencies where this difference is less than zero are filtered out.

(3): Selecting for the energy concentration band

The peak of the horizon energy in the remaining band range is found, and then a search is started from the peak position to the left and right to find the first point that does not satisfy the following equation:

\frac{E (f)}{E_{p e a k}} > η

(14)

where E_peak is the peak energy, E(f) is the energy intensity at each frequency, and η is a preset threshold. Based on experiments, η is chosen empirically in the range 0.3 to 0.6. The two critical frequencies identified in this way are denoted f₁ and f₂ and bound the band used in Equation (12).
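The three screening steps above can be sketched as follows. Several details are assumptions made for this illustration because the text leaves them open: the SNR is taken as the ratio of horizon energy to water-column energy and is checked at both horizons, the peak is searched on the upper-horizon energy, and f₁ and f₂ are taken as the outermost frequencies that still satisfy Equation (14). The function name `select_band` is hypothetical.

```python
import numpy as np


def select_band(freqs, e_upper, e_lower, e_noise, eta=0.5):
    """Return (f1, f2), or None if no frequency survives the filters."""
    freqs = np.asarray(freqs, dtype=float)
    e_upper = np.asarray(e_upper, dtype=float)
    e_lower = np.asarray(e_lower, dtype=float)
    e_noise = np.asarray(e_noise, dtype=float)

    # Step (1): drop frequencies whose SNR is below one at either horizon.
    keep = (e_upper / e_noise >= 1.0) & (e_lower / e_noise >= 1.0)
    # Step (2): drop frequencies where the lower horizon carries more energy.
    keep &= (e_upper - e_lower) >= 0.0
    if not keep.any():
        return None

    # Step (3): grow the band outward from the peak while E(f) / E_peak > eta.
    peak = int(np.argmax(np.where(keep, e_upper, -np.inf)))
    ok = keep & (e_upper / e_upper[peak] > eta)
    lo = hi = peak
    while lo > 0 and ok[lo - 1]:
        lo -= 1
    while hi < len(freqs) - 1 and ok[hi + 1]:
        hi += 1
    return freqs[lo], freqs[hi]


# Hypothetical spectra on a 2 kHz grid (illustrative values only).
freqs = np.arange(1, 11) * 2000.0
e_upper = np.array([1, 2, 4, 8, 10, 8, 4, 2, 1, 0.5])
e_lower = np.array([0.5, 1, 2, 4, 5, 4, 2, 1, 0.5, 0.6])
band = select_band(freqs, e_upper, e_lower, e_noise=0.9, eta=0.5)
print(band)
```

A larger η narrows the band around the energy peak, and a smaller η widens it, so the choice within the empirical range trades noise rejection against the number of points available for the fit.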

2.2. Joint Feature Clustering and Q for Sediment Classification

2.2.1. Unsupervised Classification

In the classification of SBP data, the classification units are partitioned by horizons. There are manual, semi-automatic, and automatic methods for horizon picking. This paper adopts the integrated automatic horizon picking method with joint amplitude and phase information [39], supplemented by manual correction.

Based on a comprehensive consideration of the media properties, the feature parameters need to be chosen as comprehensively as possible. In this paper, we select various statistical features of the sediment-dependent SBP data, including amplitude intensity features and grayscale image texture features. The four amplitude feature parameters of mean, variance, median, and correlation coefficient are selected. The six texture feature parameters of mean, variance, skewness, kurtosis, energy (uniformity), and entropy are selected [40,41].
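To make the six texture parameters concrete, the sketch below computes them from the gray-level histogram of one classification unit. The paper does not give the formulas, so the standard first-order histogram definitions are assumed here, and the function name `texture_features` is hypothetical.

```python
import numpy as np


def texture_features(patch, levels=256):
    """First-order histogram statistics of an integer grayscale patch."""
    values = np.asarray(patch).ravel().astype(np.int64)
    p = np.bincount(values, minlength=levels) / values.size  # gray-level probabilities
    g = np.arange(p.size)
    mean = float(np.sum(g * p))
    var = float(np.sum((g - mean) ** 2 * p))
    std = np.sqrt(var)
    # Standardized third and fourth moments; set to 0 for a flat patch.
    skew = float(np.sum((g - mean) ** 3 * p) / std ** 3) if std > 0 else 0.0
    kurt = float(np.sum((g - mean) ** 4 * p) / std ** 4) if std > 0 else 0.0
    energy = float(np.sum(p ** 2))  # uniformity
    nz = p[p > 0]
    entropy = float(-np.sum(nz * np.log2(nz)))
    return {
        "mean": mean,
        "variance": var,
        "skewness": skew,
        "kurtosis": kurt,
        "energy": energy,
        "entropy": entropy,
    }


# Tiny hypothetical patch with two gray levels in equal proportion.
patch = np.array([[0, 2], [0, 2]], dtype=np.uint8)
features = texture_features(patch)
print(features)
```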

Meanwhile, to address the information redundancy that can arise from having too many feature parameters, PCA is performed to determine the divergence and correlation of each feature parameter to obtain the size of each feature parameter’s contribution to the classification. Accordingly, the feature vector is dimensionally reduced to reduce the classification error [42,43].

Finally, unsupervised feature clustering is performed rapidly based on the k-means algorithm [44,45]. k-means is an unsupervised classification method based on Euclidean distance. The basic principle of the k-means algorithm is to determine the cluster to which each sample belongs based on its distance from the cluster center, update the cluster center with the newly added sample data, and so on, iteratively, until the clustering result converges. The ideal goal of clustering is to partition the sample data into K disjoint datasets such that the sum of distances from each data point to the cluster center is minimized, and the clustering is determined according to whether the sum of distances converges or not.

The number of clusters K is a hyperparameter that generally needs to be chosen empirically. In this paper, it is set to n + 1, where n is the number of sediment classes known a priori from the drilling data, and the additional class covers the seawater region and the deep non-reflective echo region.
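The iteration described above can be sketched with plain NumPy as follows. This is a basic Lloyd-style k-means, not the authors' implementation; the optional `init` argument, the stopping rule, and the synthetic three-dimensional feature vectors are assumptions made for the example, which uses K = n + 1 = 3 with n = 2.

```python
import numpy as np


def kmeans(x, k, init=None, n_iter=100, seed=0):
    """Cluster rows of x into k groups by Euclidean distance."""
    x = np.asarray(x, dtype=float)
    if init is None:
        rng = np.random.default_rng(seed)
        centers = x[rng.choice(len(x), size=k, replace=False)]
    else:
        centers = np.asarray(init, dtype=float)
    for _ in range(n_iter):
        # Assign every sample to its nearest cluster center.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members (empty clusters stay put).
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical feature vectors: three well-separated groups of 50 units each.
rng = np.random.default_rng(1)
groups = [rng.normal(loc=c, scale=0.5, size=(50, 3))
          for c in ([0, 0, 0], [10, 0, 0], [0, 10, 0])]
x = np.vstack(groups)
labels, centers = kmeans(x, k=3, init=x[[0, 50, 100]])
print(np.bincount(labels))
```

With random initialization, k-means can converge to a poor local optimum, so in practice several restarts or a more careful seeding scheme are commonly used.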

2.2.2. Identification of the Type of Sediment

Since SBP data are segmented by unsupervised classification without specific sediment categories, further determinations based on drilling data or other information are necessary. In the geological survey, the sediment category is often classified according to the grain size, and empirical models between Q and the average grain size have been established in existing papers [2,5].

Thus, the Q values of the central regions of each subset can be computed using the proposed method, and the sediment classes of each subset can then be determined from the Q values.

3. Experiment and Analysis

3.1. Overview of Experimental Data

The data were acquired in Tianjin, China, using an Innomar SES2000 Standard SBP [16]. Four measurement lines close to the drilling points were selected. The total length of the lines was 6.471 km, with an average separation of 540 m between the lines and an average water depth of about 20 m.

A total of 25 drill holes (ZK01~ZK25) were scheduled for this survey, with a required depth of 8 m below the surface of the seabed. Based on laboratory analysis of the drilling data, the sediment in the measured area can be roughly divided into two layers, with the upper layer being predominantly sandy mud and the lower layer being predominantly sandy clay, with a small amount of sandy silt interspersed. Sandy mud is characterized by uneven soil quality, large amounts of sand particles with large grain sizes, and about 27.3% clay particles. Sandy clay generally has grain sizes less than 0.005 mm, with a small quantity of silty soil layers. Sandy silt has a grain size between 0.005 and 0.05 mm, and the soil is not uniform in quality, with a small amount of silt included. The typical sedimentary drilling figures are shown in Figure 3.

Combining the empirical models of Stevenson et al. [2] and Pinson et al. [5], the Q and sediment comparison table applicable to this region is shown in Table 1.

3.2. Unsupervised Classification

The method of Li et al. [39] was adopted for horizon extraction, and the units for classification and Q calculation were divided by horizons. The nine feature parameters mentioned in Section 2.2.1 were computed separately according to the classification unit and PCA was performed.

In PCA, degree is the ratio of each principal component eigenvalue to the sum of all eigenvalues, and sum-degree denotes the cumulative contribution degree. A share of 85% of the sum-degree was used as the basis for selection. Thus, the first three principal components were chosen to form a rotation matrix and the original feature parameters were reduced to 3D feature vectors by rotation transformation.
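The selection rule described above can be sketched as follows. Standardizing the features before the eigen-decomposition is an assumption of this example, as are the function name `pca_reduce` and the synthetic feature matrix; the paper states only the 85% cumulative-contribution criterion.

```python
import numpy as np


def pca_reduce(features, threshold=0.85):
    """Project features onto the leading components reaching the threshold."""
    x = np.asarray(features, dtype=float)
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardization (assumed)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
    order = np.argsort(eigvals)[::-1]         # sort by descending eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    degree = eigvals / eigvals.sum()          # contribution of each component
    sum_degree = np.cumsum(degree)            # cumulative contribution
    m = int(np.searchsorted(sum_degree, threshold)) + 1
    return x @ eigvecs[:, :m], degree, sum_degree


# Hypothetical data: 200 units, 10 features driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
feats = latent @ mixing + 0.01 * rng.normal(size=(200, 10))
reduced, degree, sum_degree = pca_reduce(feats)
print(reduced.shape, sum_degree[:3])
```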

Based on the feature vectors of each unit after rotation, k-means classification was performed for each measurement line (considering the regional sediment class, K = 3 was taken), and the classification results are shown in Figure 4.

It can be seen that the unsupervised classification results are fairly consistent with the law of sediment distribution and have high reliability. The situation is complicated by the large number of horizons in the fourth line. The number of clusters will thus be increased later for targeted experimental analysis.

3.3. Q Calculation Results

3.3.1. Validity of Correlation Analysis Method

To verify the effectiveness of the “CDC” proposed in this paper for signal decomposition, 16 IMFs were obtained by VMD of the signal, and the MI, “GCC” (R_g), and “CDC” (I^k) were then calculated. The results are given in Table 2, Table 3, and Figure 5. The separation point among the IMFs determined by manual discrimination is between the second and third IMFs, i.e., G = 2.

By comparison, the MI is found to fluctuate considerably and the trend of the curve change is not obvious. Its minimum point lies between the 13th and 14th IMFs and the second MI value (the theoretical minimum) is at the higher level of the overall graph. The normalized GCC is more stable and has a minimum between the second and third IMFs if only the first ten IMFs are considered, which is more consistent with the manual interpretation. However, its minimum is still wrong when there are more IMFs. In contrast, the CDC is more in line with the desired target, both in terms of the location of the minimum and the fluctuations. The experimental results can demonstrate that the CDC is more detailed and accurate in distinguishing IMFs belonging to different signals.

The experiment was also repeated on 100 pings of data, and the results were compared with manual interpretation, which was taken as the true value. The accuracy of the dividing point determined using MI is 23 percent and that using GCC is 56 percent. In contrast, the accuracy using CDC is 94 percent and the maximum deviation is no more than one IMF, indicating that the proposed modified algorithm significantly improves the accuracy of the reconstructed signal.

3.3.2. Single-Ping Data Results

As an example of the signal processing approach in this paper, ping data were taken near the location of a drilling site, and the results are shown in Figure 6.

Figure 6a shows the amplitude sequence of the ping data, where three stronger echo positions can be seen; Figure 6b shows the time–frequency spectrum obtained after GST, where the horizontal axis is the signal time, the left vertical axis is the instantaneous frequency, and the color bar on the right indicates the amplitude value; Figure 6c shows the original amplitude signal at the three horizon positions and the corresponding reconstructed signal; Figure 6d shows the difference between two adjacent reconstructed logarithmic spectra and its fitted straight line.

Based on the fit results, for layer 1, the fit residuals are 3.6144 and the Q value is 108.4639, which is judged to be sandy silt. For layer 2, the fit yielded a residual of 1.1636 and a Q value of 221.4818, which was judged to be clay. The sediment determination is in good agreement with the drilling data.

3.3.3. Multi-Ping Results

The results of the successive multi-ping data experiments are shown in Table 4, which compares the traditional SR method with the proposed method. The traditional SR method is not robust: the Q results deviate widely between neighboring pings, with a standard deviation (SD) as high as 562.9094. After signal reconstruction and band selection with the proposed method, the Q results are more stable and the SD is reduced to 7.5292. Thus, in terms of SD, the stability of the Q calculation can be improved by more than 90 percent with this method.
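As a quick check of the stated improvement, and assuming the two standard deviations quoted above form a matched pair (Table 4 may list further cases), the relative reduction works out as:

```latex
\frac{SD_{SR} - SD_{proposed}}{SD_{SR}} = \frac{562.9094 - 7.5292}{562.9094} \approx 0.987
```

That is, a reduction of roughly 98.7 percent for this pair, which is consistent with the stated figure of more than 90 percent.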

3.4. Sediment Class Identification

Based on the unsupervised classification results, Q values were calculated for all units and boxplots were constructed for different categories, as shown in Figure 7 and Table 5.

Figure 7 shows that there is a clear difference in the Q values between the different classes, with only a partial overlap in line 3, but the main parts are still clearly differentiated. Drilling data indicate that the upper layers are sandy silt and the lower layers are sandy clay in the measured area. From Table 5, it can be seen that the specific categories of each type can be accurately interpreted using the mean values. Therefore, it can be argued that the interpretation of the sediment class based on Q is well defined.

4. Discussion

4.1. Comparison with Traditional SR Method

In contrast to the conventional SR method of calculating Q, the proposed method reduces the effect of thin interbeds and background noise by signal reconstruction and band selection (as exemplified by the single-ping data in Section 3.3.2). The comparison is shown in Figure 8.

Figure 8a,c shows the log spectrum difference (ln S(t_i) − ln S(t_{i+1})) between the two horizons and the corresponding fitted line obtained using the original signal and the traditional SR method. Figure 8b,d shows the reconstructed signal log spectrum difference (ln U(t_i) − ln U(t_{i+1})) and the fitted line obtained using the method in this paper. They show that the difference between the two reconstructed log spectra in the selected frequency bands is more linear than that obtained with the traditional method.

Between horizons 1 and 2, the conventional SR method yields a fit residual of 18.3287, which is reduced by our method to 3.6144. The slopes of the lines obtained using the two methods are 1.7577 × 10⁻⁵ and 5.2484 × 10⁻⁵, respectively, and the calculated Q values are 323.8621 and 108.4639, respectively. Drilling data suggest that the sediment there is sandy mud and that Q should be below 100. Between horizons 2 and 3, the residuals are 13.8364 and 1.1636 for the two methods, respectively, resulting in final Q values of 4149.6 and 221.4818, respectively. Drilling data suggest that the sediment there is sandy clay and that Q should be above 150.
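The slope and Q pairs quoted for horizons 1 and 2 can be cross-checked against Equation (13), since Q · b = π dT must be the same for both methods. The short sketch below does this, assuming the slopes are expressed per Hz; the implied dT is an inference from the quoted numbers and is not a value stated in the paper.

```python
import math

# Slope b and Q pairs quoted in the text for horizons 1 and 2.
reported = {
    "traditional SR": (1.7577e-5, 323.8621),
    "proposed": (5.2484e-5, 108.4639),
}

# Equation (13) rearranged: dT = Q * b / pi.
implied_dt = {name: q * b / math.pi for name, (b, q) in reported.items()}
for name, d_t in implied_dt.items():
    print(f"{name}: implied dT = {d_t * 1e3:.3f} ms")
```

Both pairs imply a travel-time difference of about 1.81 ms, so the two Q values differ only because of the fitted slopes, as expected.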

From the fitted residuals, our method effectively improves the confidence level of the results. The results of Q computation show that our method yields results that are closer to the true values and can improve the accuracy of Q computation.

4.2. Influence of K Number on Unsupervised Classification Results

The number of clusters, K, is a hyperparameter in the computation of the k-means algorithm and has a large impact on the results. In this paper, it is set according to the a priori number of sediment categories. Due to the complexity of the horizons of line 4, the number of classification categories is again increased to three for unsupervised classification, and the classification results are shown in Figure 9.

It can be seen that the distribution of the first layer is basically unchanged, and the main changes are concentrated in the lower layers (marked as Type 2 and Type 3). These two types do not show a clear upper and lower stratification in their spatial distribution and are intermingled with each other. Their Q values were calculated separately to obtain Figure 10 and Table 6.

As shown in Figure 10 and Table 6, there is little difference between the two types and the distribution of the various values of Q is almost the same between the two types. This is also consistent with their spatial distribution features. Thus, they can be identified as the same sediment (the sandy clay).

The above experimental results show that using the prior number of sediment categories to determine K is practical and reliable. Moreover, the classification results from measurement line 4 show that Q not only ensures the accuracy of sediment classification, but also serves as an evaluation index for k-means classification. It can even be used to indicate the number of clusters in the absence of a prior number of sediment categories.

4.3. Comparison with the Method Based on VMD and MI

To eliminate the interference of reflection for Q estimation in seismic data, Xue et al. [27] proposed a method based on the VMD and MI. To better fit the SBP data, we introduce CDC and band selection based on this method. The two methods are now applied to the same data and the comparison results are shown in Table 7.

If only the mean is considered, both results point to the same sediment class. However, it is clear that the Q values of our method are more stable. In addition, Q obtained by Xue’s method is occasionally incorrectly determined to be negative.

It is also noted that Xue’s method generally achieves relatively small Q values, which may be caused by the increased tail of log SR due to the use of MI. Moreover, in contrast to the proposed method, Xue’s method does not indicate how the fitted frequency bands are determined, which somewhat limits the application of the method.

4.4. Limitations and Future Directions

In this paper, we propose an improved correlation analysis method based on the IMFs obtained by VMD, which outperforms conventional MI. However, whether VMD is suitable for underwater acoustic signals in general requires further investigation.

Experiments on real data validate the effectiveness of the proposed sediment classification method. However, restricted to experimental conditions, the experimental data are from the same shallow sea area. The applicability of this method to multi-area data and deep water data is still unknown.

Since current methods are implemented with drilling data support, it is worth considering how to ensure the reliability of results without drilling data in order to better implement these techniques.

In addition, the correlation between Q and sediment used in this paper is derived from previous experimental results [2,5]. While these results of correlation between the sediment class and Q have been widely accepted, the backward inference procedure from Q to the sediment class is still controversial. It is also worth investigating whether the Q values of the same type of sediment vary across different layers. Those interested in going further in this area may consider more scientific experimental validation.

5. Conclusions

In this paper, the accurate determination of Q for the noise sensitivity problem in traditional SR methods is studied. Using the VMD method and the accurate determination of the effective band range, we achieved the removal of interference effects caused by thin interbeds and the robust estimation of Q. Combining automatic determination of the sediment class, we further addressed the problem that traditional Q-based methods cannot be used for large areas. In conclusion, we achieved an accurate and automatic classification of sub-bottom sediments over a large area based on Q, which provides theoretical support for engineering applications of acoustic classification on the seafloor.

However, the introduction of VMD for signal decomposition also introduces many related problems, such as the determination of the number of modes. Whether the proposed approach is applicable to deep-water acoustic signals also requires further investigation. The practical application of this method to complex sedimentary environments is still far from being implemented, and further studies are needed.

Author Contributions

Conceptualization, Z.Z., J.Z. and S.L.; funding acquisition, J.Z. and H.Z.; investigation, Z.Z. and S.L.; methodology, Z.Z., J.Z. and S.L.; writing—original draft, Z.Z.; writing—review and editing, S.L., J.Z. and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2022YFC2808303, and the National Natural Science Foundation of China, grant number 42176186.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Access to the data will be considered upon request to the authors.

Acknowledgments

We would like to thank the editor and anonymous reviewers for their valuable comments and suggestions that greatly improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Plets, R.M.K. The Acoustic Imaging, Reconstruction and Characterization of Buried Archaeological Material; University of Southampton: Southampton, UK, 2007.
2. Stevenson, I.; McCann, C.; Runciman, P. An attenuation-based sediment classification technique using Chirp sub-bottom profiler data and laboratory acoustic analysis. Mar. Geophys. Res. 2002, 23, 277–298.
3. Dvorkin, J.P.; Mavko, G. Modeling attenuation in reservoir and nonreservoir rock. Lead. Edge 2006, 25, 194–197.
4. Kneib, G.; Shapiro, S.A. Viscoacoustic wave propagation in 2-D random media and separation of absorption and scattering attenuation. Geophysics 1995, 60, 459–467.
5. Pinson, L.J.W.; Henstock, T.J.; Dix, J.K.; Bull, J.M. Estimating quality factor and mean grain size of sediments from high-resolution marine seismic data. Geophysics 2008, 73, G19–G28.
6. Aki, K.; Richards, P.G. Quantitative Seismology, 2nd ed.; University Science Books: Melville, NY, USA, 2002; pp. 161–177.
7. Blias, E. Accurate interval Q-factor estimation from VSP data. Geophysics 2012, 77, WA149–WA156.
8. Zhao, L.F.; Mousavi, S.M. Lateral variation of crustal Lg attenuation in eastern North America. Sci. Rep. 2018, 8, 7285.
9. Wang, Y. Q analysis on reflection seismic data. Geophys. Res. Lett. 2004, 31, L17606.
10. Chopra, S.; Marfurt, K.J. Seismic Attributes for Prospect Identification and Reservoir Characterization; Society of Exploration Geophysicists and European Association of Geoscientists and Engineers: Utrecht, The Netherlands, 2007.
11. Yaojun, W.; Shuangquan, C.; Lei, W.; Li, X.Y. Modeling and analysis of seismic wave dispersion based on the rock physics model. J. Geophys. Eng. 2013, 10, 054001.
12. Schock, S. A method for estimating the physical and acoustic properties of the sea bed using chirp sonar data. IEEE J. Ocean. Eng. 2005, 29, 1200–1217.
13. Tonn, R. The Determination of the seismic quality factor Q from VSP data: A comparison of different computational methods. Geophys. Prospect. 1991, 39, 1–27.
14. Engelhard, L. Determination of seismic-wave attenuation by complex trace analysis. Geophys. J. Int. 1996, 125, 608–622.
15. Bath, M. Spectral Analysis in Geophysics; Elsevier Scientific Pub: London, UK, 1974.
16. Wu, Z.; Yang, F.; Luo, X.; Li, S.; Xiong, M. High-Resolution Submarine Topography—Theory and Technology for Surveying and Post-Processing; Science Press: Beijing, China, 2017.
17. Jannsen, D.; Voss, J.; Theilen, F. Comparison of methods to determine Q in shallow marine sediments from vertical reflection seismograms. Geophys. Prospect. 2010, 33, 479–497.
18. Schock, S.G. The Chirp Sonar—A High-Resolution, Quantitative Subbottom Profiler; University of Rhode Island: Kingston, RI, USA, 1989.
19. Schock, S. Remote estimates of physical and acoustic sediment properties in the South China Sea using chirp sonar data and the biot model. IEEE J. Ocean. Eng. 2004, 29, 1218–1230.
20. Liu, G.; Chen, X.; Rao, Y. Seismic quality factor estimation using frequency-dependent linear fitting. J. Appl. Geophys. 2018, 156, 1–8.
21. Panda, S. Remote Acoustic Evaluation of Seafloor Sediment Properties; University of Rhode Island: Kingston, RI, USA, 1992.
22. Li, S.; Zhao, J.; Zhang, H.; Qu, S. Sub-Bottom Sediment Classification Using Reliable Instantaneous Frequency Calculation and Relaxation Time Estimation. Remote Sens. 2021, 13, 4809.
23. Hackert, C.L.; Parra, J.O. Improving Q estimates from seismic reflection data using well-log-based localized spectral correction. Geophysics 2004, 69, 1521–1529.
24. Tu, N.; Lu, W.-K. Improve Q estimates with spectrum correction based on seismic wavelet estimation. Appl. Geophys. 2010, 7, 217–228.
25. Li, C.; Liu, X. A new method for interval Q-factor inversion from seismic reflection data. Geophysics 2015, 80, R361–R373.
26. Chen, S.; Wang, K.; Peng, Z.; Chang, C.; Zhai, W. Generalized dispersive mode decomposition: Algorithm and applications. J. Sound Vib. 2020, 492, 115800.
27. Xue, Y.J.; Cao, J.X.; Wang, X.J.; Du, H.K. Estimation of seismic quality factor in the time-frequency domain using variational mode decomposition. Geophysics 2020, 85, V329–V343.
28. Baradello, L. An improved processing sequence for uncorrelated Chirp sonar data. Mar. Geophys. Res. 2014, 35, 337–344.
29. Forte, E.; Dossi, M.; Pipan, M.; Del Ben, A. Automated phase attribute-based picking applied to reflection seismics. Geophysics 2016, 81, V141–V150.
30. LeBlanc, L.R.; Panda, S.; Schock, S.G. Sonar attenuation modeling for classification of marine sediments. J. Acoust. Soc. Am. 1992, 91, 116–126.
31. Mattia, N.; Tore, G.K.; Daniel, P. Sketch-based modelling and visualization of geological deposition. Comput. Geosci. 2014, 67, 40–48.
32. Stockwell, R.G.; Mansinha, L.; Lowe, R.P. Localization of the complex spectrum: The S transform. IEEE Trans. Signal Process. 1996, 44, 998–1001.
33. Adams, M.D.; Kossentini, F.; Ward, R.K. Generalized S transform. IEEE Trans. Signal Process. 2002, 50, 2831–2842.
34. Pinnegar, C.R.; Mansinha, L. The S-transform with windows of arbitrary and varying shape. Geophysics 2003, 68, 381–385.
35. Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
36. Xu, T.; Zeng, Z.; Huang, X.; Li, J.; Feng, H. Pipeline Leak Detection based on Variational Mode Decomposition and Support Vector Machine Using an Interior Spherical Detector. Process. Saf. Environ. Prot. 2021, 153, 167–177.
37. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69, 066138.
38. Maes, F.; Collignon, A.; Vandermeulen, D.; Marchal, G.; Suetens, P. Multimodality image registration by maximization of mutual information. IEEE Trans. Med Imaging 1997, 16, 187–198.
39. Li, S.; Zhao, J.; Zhang, H.; Qu, S. An Integrated Horizon Picking Method for Obtaining the Main and Detailed Reflectors on Sub-Bottom Profiler Sonar Image. Remote Sens. 2021, 13, 2959.
40. Lee, G.H.; Kim, H.J.; Kim, D.C.; Yi, B.Y.; Nam, S.M.; Khim, B.K.; Lim, M.S. The acoustic diversity of the seabed based on the similarity index computed from Chirp seismic data. ICES J. Mar. Sci. J. Du Cons. 2008, 66, 227–236.
41. Shang, X.; Robert, K.; Misiuk, B.; Mackin-McLaughlin, J.; Zhao, J. Self-adaptive analysis scale determination for terrain features in seafloor substrate classification. Estuarine Coast. Shelf Sci. 2021, 254, 107359.
42. Reich, D.; Price, A.L.; Patterson, N. Principal component analysis of genetic data. Nat. Genet. 2008, 40, 491–492.
43. Takane, Y.; Shibayama, T. Principal component analysis with external information on both subjects and variables. Psychometrika 1991, 56, 97–120.
44. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
45. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. Appl. Stat. 1979, 28, 100–108.

Figure 1. Overall technology roadmap. It covers two parts: unsupervised classification and Q computation with modified SR.

Figure 2. Workflow of Q estimation using VMD. VMD is first used to obtain a series of IMFs from the time–frequency spectra generated by the GST of SBP ping data. Then, the correlation analysis is performed successively on the IMFs to reconstruct the signal, which removes the effect of the reflection coefficient. Finally, band selection and least-squares fitting are used to calculate Q.

Figure 3. Typical sediment drilling columns. (a) Borehole ZK08; (b) borehole ZK15.

Figure 4. Results of k-means classification (K = 3). (a–d) represent the four measurement lines. Different colors indicate different categories. Since each line is an independently unsupervised classification, the same color in different lines does not necessarily indicate the same sediment type.

Figure 5. Chart of the variation in the correlation coefficient.

Figure 6. Single-ping data results. (a) The ping signal; (b) its time–frequency spectrum by GST; (c) reconstructed signal at the horizon location (the original signal in blue and the reconstructed signal in orange); (d) spectral-ratio results (ln U(t_i) − ln U(t_{i+1}) in blue and the fitted line in orange).
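
The least-squares step named in the captions of Figures 2 and 6d can be illustrated with a minimal Python sketch. It assumes the standard spectral-ratio relation, ln U(t_i) − ln U(t_{i+1}) = πfΔt/Q + c, so that Q = πΔt/slope. The function names, the synthetic spectrum, the band limits, and the travel-time difference are all hypothetical and are not taken from the paper.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def q_from_spectral_ratio(freqs_hz, log_ratio, dt_s, band_hz):
    """Estimate Q from ln U(t_i) - ln U(t_{i+1}) inside a chosen band.

    Assumes log_ratio = pi * f * dt / Q + const (standard SR relation).
    """
    lo, hi = band_hz
    pairs = [(f, r) for f, r in zip(freqs_hz, log_ratio) if lo <= f <= hi]
    slope, _ = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return math.pi * dt_s / slope

# Synthetic, noise-free example (all values are illustrative assumptions).
true_q = 50.0
dt_s = 0.002                                          # travel time between horizons
freqs_hz = [1000.0 + 250.0 * i for i in range(29)]    # 1 kHz to 8 kHz
log_ratio = [math.pi * f * dt_s / true_q + 0.3 for f in freqs_hz]

q_est = q_from_spectral_ratio(freqs_hz, log_ratio, dt_s, band_hz=(2000.0, 6000.0))
print(round(q_est, 3))
```

With real data the fitted slope is sensitive to which frequencies enter the fit, which is why the band limits are passed explicitly here rather than fitting the whole spectrum.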

Figure 7. Boxplots of Q for different measurement lines. (a–d) denote the four lines and correspond one-to-one to the subfigures in Figure 4. Type 1 and Type 2 are the classes shown in different colors in Figure 4, ordered from the upper to the lower layer and excluding the seawater area. Each line is computed independently. Red plus signs mark extreme outliers.

Figure 8. Comparison of traditional SR and our method. (a) Original signal and fitted line between horizon 1 and 2; (b) reconstructed signal and fitted line between horizon 1 and 2; (c) original signal and fitted line between horizon 2 and 3; (d) reconstructed signal and fitted line between horizon 2 and 3. (Difference value between two horizons in blue and the fitted line in orange).

Figure 9. Comparison of classification results for line 4 using different K: (a) K = 3; (b) K = 4. Different colors indicate different categories.

Figure 10. Boxplots of Q for line 4. Red plus signs mark extreme outliers.

Table 1. Correspondence between sediment category and Q.

Category	Moisture (%)	Mean Grain Size (mm)	Q
Sandy Clay	27~39	0~0.005	150~300
Sandy Silt	20~36	0.005~0.05	100~150
Sandy Mud	30~50	0.05~0.5	30~100

Table 2. Table of correlation coefficients (MI and GCC). The minimum of each column (k = 13) gives the separation point G.

k	Source	MI	R_g
1	IMF1 to IMF2	1.7683	0.6195
2	IMF2 to IMF3	0.5776	0.2184
3	IMF3 to IMF4	0.5074	0.2481
4	IMF4 to IMF5	0.3966	0.2856
5	IMF5 to IMF6	0.4436	0.4589
6	IMF6 to IMF7	0.2565	0.4074
7	IMF7 to IMF8	0.2530	0.4429
8	IMF8 to IMF9	0.2526	0.4462
9	IMF9 to IMF10	0.3011	0.5960
10	IMF10 to IMF11	0.2757	0.2551
11	IMF11 to IMF12	0.3911	0.1566
12	IMF12 to IMF13	0.7177	0.2355
13	IMF13 to IMF14	0.1279	0.1071
14	IMF14 to IMF15	0.2753	0.5658
15	IMF15 to IMF16	0.2372	0.2719
	Separation point (G)	13	13
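
As a quick check on Table 2, the sketch below reads the separation point as the index k of the smallest coefficient in each column, which is how the last row of the table appears to be obtained. The values are transcribed from the table; the function name `separation_point` is hypothetical.

```python
# Values transcribed from Table 2 for k = 1..15.
mi = [1.7683, 0.5776, 0.5074, 0.3966, 0.4436, 0.2565, 0.2530, 0.2526,
      0.3011, 0.2757, 0.3911, 0.7177, 0.1279, 0.2753, 0.2372]
r_g = [0.6195, 0.2184, 0.2481, 0.2856, 0.4589, 0.4074, 0.4429, 0.4462,
       0.5960, 0.2551, 0.1566, 0.2355, 0.1071, 0.5658, 0.2719]

def separation_point(values):
    """Return the 1-based index k of the smallest coefficient."""
    return min(range(len(values)), key=values.__getitem__) + 1

g_mi = separation_point(mi)     # column MI
g_gcc = separation_point(r_g)   # column R_g
print(g_mi, g_gcc)
```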

Table 3. Table of correlation coefficients (CDC). The minimum (k = 2) gives the separation point G.

k	Source	I^k
2	R_g 1 to R_g 2	−0.4011
3	R_g 2 to R_g 3	0.0297
4	R_g 3 to R_g 4	0.0375
5	R_g 4 to R_g 5	0.1733
6	R_g 5 to R_g 6	−0.0515
7	R_g 6 to R_g 7	0.0355
8	R_g 7 to R_g 8	0.0033
9	R_g 8 to R_g 9	0.1498
10	R_g 9 to R_g 10	−0.3409
11	R_g 10 to R_g 11	−0.0985
12	R_g 11 to R_g 12	0.0789
13	R_g 12 to R_g 13	−0.1284
14	R_g 13 to R_g 14	0.4587
15	R_g 14 to R_g 15	−0.2939
	Separation point (G)	2
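
The I^k column of Table 3 can be reproduced numerically from the R_g column of Table 2. The sketch below assumes that I^k is the first difference R_g(k) − R_g(k−1); this reading is inferred from the tabulated numbers rather than from a definition given in this section, and the helper name `first_difference` is hypothetical.

```python
# R_g values transcribed from Table 2 (k = 1..15).
r_g = [0.6195, 0.2184, 0.2481, 0.2856, 0.4589, 0.4074, 0.4429, 0.4462,
       0.5960, 0.2551, 0.1566, 0.2355, 0.1071, 0.5658, 0.2719]

# I^k values transcribed from Table 3 (k = 2..15).
table3 = {2: -0.4011, 3: 0.0297, 4: 0.0375, 5: 0.1733, 6: -0.0515,
          7: 0.0355, 8: 0.0033, 9: 0.1498, 10: -0.3409, 11: -0.0985,
          12: 0.0789, 13: -0.1284, 14: 0.4587, 15: -0.2939}

def first_difference(values):
    """Map each 1-based k >= 2 to values(k) - values(k - 1)."""
    return {k: values[k - 1] - values[k - 2] for k in range(2, len(values) + 1)}

i_k = first_difference(r_g)

# Largest disagreement between the recomputed and the tabulated values.
max_dev = max(abs(i_k[k] - table3[k]) for k in table3)

# Separation point taken as the k with the smallest I^k (assumption).
g = min(i_k, key=i_k.get)
print(g, max_dev < 1e-6)
```

Under this assumption the sharpest drop in R_g is between IMF pairs 1 and 2, which matches the separation point of 2 reported in the table.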

Table 4. Comparison of Q calculations between 10 neighboring pings at layer 1.

Ping No.	Traditional SR	Proposed Method
401	192.7969	35.3588
402	286.8748	39.6956
403	130.9156	40.6357
404	131.1762	55.1115
405	1437.9062	36.0064
406	92.3441	40.2316
407	1332.5844	35.7613
408	367.6768	47.3555
409	1373.0436	55.1474
410	573.2333	46.6754
Mean	591.8552	43.1979
SD	562.9094	7.5292
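
The summary rows of Table 4 can be recomputed from the ten per-ping values. The sketch below assumes that SD is the sample standard deviation (n − 1 denominator), which reproduces the tabulated figures; `sd_reduction` is a hypothetical name for the relative drop in SD between the two methods.

```python
import statistics

# Per-ping Q values transcribed from Table 4 (pings 401 to 410).
traditional_sr = [192.7969, 286.8748, 130.9156, 131.1762, 1437.9062,
                  92.3441, 1332.5844, 367.6768, 1373.0436, 573.2333]
proposed = [35.3588, 39.6956, 40.6357, 55.1115, 36.0064,
            40.2316, 35.7613, 47.3555, 55.1474, 46.6754]

def summarize(values):
    """Return (mean, sample standard deviation) of a list of Q values."""
    return statistics.mean(values), statistics.stdev(values)

mean_sr, sd_sr = summarize(traditional_sr)
mean_new, sd_new = summarize(proposed)

# Relative reduction in the spread of Q across neighboring pings.
sd_reduction = 1.0 - sd_new / sd_sr

print(f"traditional SR: mean={mean_sr:.4f}, SD={sd_sr:.4f}")
print(f"proposed:       mean={mean_new:.4f}, SD={sd_new:.4f}")
print(f"SD reduction:   {sd_reduction:.1%}")
```

For these ten pings the reduction is well above the 90 percent figure quoted in the abstract.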

Table 5. Statistical table of Q for measurement lines.

		Mean-Q	Sediment Class
Line1	Type1	90.5234	Sandy Mud
Line1	Type2	224.6792	Sandy Clay
Line2	Type1	34.4638	Sandy Mud
Line2	Type2	166.6610	Sandy Clay
Line3	Type1	72.5031	Sandy Mud
Line3	Type2	177.4148	Sandy Clay
Line4	Type1	45.7010	Sandy Mud
Line4	Type2	186.6549	Sandy Clay
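
The class labels in Table 5 follow from comparing each mean Q with the ranges in Table 1. A minimal sketch of that lookup is given below. The thresholds 100 and 150 come from Table 1; how boundary values are assigned, and the treatment of Q below 30 or above 300 as the nearest class, are assumptions made here for illustration, and `classify_by_q` is a hypothetical name.

```python
def classify_by_q(mean_q):
    """Map a mean Q value to a sediment class using the Table 1 ranges.

    Assumption: the ranges are treated as half-open and the outer ends
    are left open, so any Q below 100 is Sandy Mud and any Q of 150 or
    more is Sandy Clay.
    """
    if mean_q < 100.0:
        return "Sandy Mud"
    if mean_q < 150.0:
        return "Sandy Silt"
    return "Sandy Clay"

# Mean Q values transcribed from Table 5.
table5 = {
    ("Line1", "Type1"): 90.5234, ("Line1", "Type2"): 224.6792,
    ("Line2", "Type1"): 34.4638, ("Line2", "Type2"): 166.6610,
    ("Line3", "Type1"): 72.5031, ("Line3", "Type2"): 177.4148,
    ("Line4", "Type1"): 45.7010, ("Line4", "Type2"): 186.6549,
}

labels = {key: classify_by_q(q) for key, q in table5.items()}
for (line, kind), label in labels.items():
    print(line, kind, label)
```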

Table 6. Statistical table of Q.

	Mean-Q	Sediment Class	Min-Q	Sediment Class	Max-Q	Sediment Class
Type 2	153.6138	Sandy Clay	101.4761	Sandy Silt	310.3159	Sandy Clay
Type 3	167.8000	Sandy Clay	122.0791	Sandy Silt	295.2156	Sandy Clay

Table 7. Comparison of Q calculations between 10 neighboring pings.

Ping No.	Xue’s Method	Proposed Method
01	64.9375	72.5031
02	54.2861	68.3482
03	68.9251	72.4638
04	102.2962	80.2263
05	57.9562	74.3086
06	80.2833	76.9745
07	41.0345	69.9683
08	−43.2664	53.6392
09	58.3921	65.8944
10	−34.0844	62.3942
Mean	45.0760	69.6721
SD	47.1311	7.6605

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
