Article

Automatic Marine Sub-Bottom Sediment Classification Using Feature Clustering and Quality Factor

1 School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
2 Institute of Marine Science and Technology, Wuhan University, Wuhan 430079, China
3 The School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China
4 Department of Artificial Intelligence and Automation, School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2023, 11(9), 1770; https://doi.org/10.3390/jmse11091770
Submission received: 11 August 2023 / Revised: 6 September 2023 / Accepted: 8 September 2023 / Published: 11 September 2023
(This article belongs to the Section Geological Oceanography)

Abstract

It has been shown that the quality factor (Q) is important for representing the attenuation attributes of marine sediment and is helpful for sediment classification. However, the traditional spectral-ratio (SR) method is affected by the interference caused by thin interbeds, which seriously degrades its performance. To address this problem, this paper presents a novel method based on variational mode decomposition (VMD) and correlation analysis, which separates interference reflections from effective signals. After the effective signals are obtained, a frequency band selection method is employed to weaken the influence of background noise. To better apply the proposed method to large-area sediment classification, a sediment clustering method based on texture features is introduced. Experiments on real data validate the effectiveness of the proposed method. The accuracy of the correlation analysis method using the modified parameters is 94 percent. The standard deviation of the Q calculation is reduced by more than 90 percent. Moreover, the interpretation of sediment categories using the mean value of Q fits the drilling data well. The proposed method is therefore believed to have considerable potential for engineering applications in sub-bottom sediment classification.

1. Introduction

The sub-bottom profiler (SBP) has been widely used in underwater topographic surveys, sediment surveys, mineral resource exploration, and marine scientific research [1,2,3]. The SBP acoustic wave is attenuated as it propagates through water and sediment because of the heterogeneity and anelasticity of the media [2,4]. The attenuation is commonly parameterized by the quality factor (Q), which is inversely proportional to the attenuation coefficient [3,5,6]. Q is sensitive to physical properties such as lithology, porosity, saturation, and permeability [5,7,8]. Hence, accurate Q estimation might help to improve the resolution of SBP data, enhance the fine details in stratigraphic features, and classify the stratigraphic material [2,8,9,10,11].
Q is often assumed to be approximately constant in the seismic frequency band [9,10,12]. Earlier methods for calculating Q are mainly based on signal amplitude information. However, these methods require prior knowledge of the true amplitude of the echo signal, which limits their application [13].
Since real amplitude information is difficult to obtain from echo signal data, signal-amplitude-based methods are difficult to apply in practice [14]. Therefore, most researchers have calculated Q using the energy intensity loss in the frequency domain.
The traditional spectral-ratio (SR) method is widely applied because of its simple theory and ease of use [15,16,17,18]. However, it is sensitive to noise and is easily influenced by interference effects. Schock [18] analyzed the performance of the SR method and derived the terms through which thin interbed reflections influence it. He concluded that these reflections cause difficulties in the calculation of Q [12,18,19,20].
Panda [21] proposed an attenuation-based sediment classification method that uses the instantaneous frequency obtained with the Hilbert transform to avoid the influence of thin interbed reflections. However, this method is seriously affected by noise and is not suitable for multi-component, wide-band SBP sonar data [22,23,24]. Reasonable strategies for obtaining good instantaneous frequency series with such methods have not yet been developed [2].
Therefore, the correction of the signal spectrum shape to remove the influence of interference effects has become the direction of Q calculation methods [23,24,25,26]. Hackert et al. [23] used the impedance parameters obtained from drilling data to correct the spectrum. This method improved the traditional SR method, but the drilling data are still needed. Pinson et al. [5] proposed an anti-noise SR method by considering the characteristic of the noise spectrum. This method can avoid the interference of noise, but ignores overlapping reflections. Chen et al. [26] proposed a generalized dispersive mode decomposition method. Xue et al. [27] designed a novel method based on variational mode decomposition (VMD). These methods work well on seismic data, but are not specifically designed for SBP sonar data [27,28,29,30,31].
These methods separate the signals of different components and achieve better results. However, since these methods do not make clear provisions for the selection of the band ranges, they need further improvement.
Based on the Q calculation method using VMD, this paper improves both the correlation analysis and the study of band range selection. In addition, we study the automatic classification method of the sediment based on feature clustering and Q to realize the automatic and accurate sub-bottom sediment classification in a large area. The rest of this paper is organized as follows. Section 2 provides a detailed description of the proposed method. Section 3 describes the data used in this study and analyzes the experimental results. Finally, our discussions and conclusions are presented in Section 4 and Section 5, respectively.

2. Methods

The technical flow of this paper is shown in Figure 1. There are two parts. The first part obtains the statistical features of the SBP data and, with the help of an unsupervised classifier (using the k-means algorithm), the multiple data subsets are derived. The second part uses VMD and the correlation distortion coefficient (CDC), as well as the frequency band selection criteria, to obtain an accurate Q. Combining the two parts, the automatic and accurate sub-bottom sediment classification in a large area is realized.

2.1. Q Calculation Based on VMD

Since Q calculation is more important, it is described first. The workflow of the modified Q calculation method is shown in Figure 2. The Q estimation is performed by a generalized Stockwell transform (GST), a VMD, a correlation analysis, band range selection, and finally a least-squares method.

2.1.1. Signal Decomposition and Reconstruction

The one-dimensional ping signal h(t) is transformed into a 2D time–frequency signal S(t, f) using the GST [32,33,34]. The amplitude spectra at the two horizons (t = t1, t = t2) are S(t1, f) and S(t2, f). Then, according to the principle of acoustic wave attenuation, the following equation is obtained [5]:
$$S(t_2, f) = S(t_1, f)\,\exp\!\left(\frac{(1 - R_1^2)\,G_2 R_2}{R_1}\right)\exp\!\left(-\frac{\pi f\,dT}{Q}\right) \tag{1}$$
where f denotes frequency, dT = t2 − t1, R1 and R2 are the medium reflection coefficients at t1 and t2, G2 is the acoustic spherical diffusion loss from t1 to t2, and Q is the average or equivalent Q from t1 to t2. Assuming that R, G, and Q are frequency independent, Equation (1) can be rewritten as the following linear equation:
$$\ln S(t_2, f) = \ln S(t_1, f) + \frac{(1 - R_1^2)\,G_2 R_2}{R_1} - \frac{\pi\,dT}{Q}\,f \tag{2}$$
To suppress thin-interbed interference between t1 and t2, the echo is modeled as the sum of two contributions from different sources:
$$\ln S(f) = \ln W(f) + \ln R(f) \tag{3}$$
where W(f) and R(f) denote the Fourier transforms of the wavelet function and the reflection function, which vary slowly and rapidly with frequency, respectively [22,27]. Accordingly, ln W(f) and ln R(f) are also slowly and rapidly varying functions of frequency, respectively.
The ln R(f) is related to the interference effect and should be removed. We introduce VMD combined with correlation analysis to adaptively achieve this goal.

2.1.2. Variational Mode Decomposition (VMD)

The VMD algorithm is an adaptive, fully non-recursive signal decomposition approach that matches the optimal center frequency and finite bandwidth of each mode. It efficiently decomposes a signal into multiple quasi-orthogonal, narrow-band intrinsic mode functions (IMFs) by solving a variational problem [27,35,36].
For a multimodal signal x(t) decomposed into n IMFs, the VMD can be expressed as:
$$\begin{cases} \min\limits_{\{u_k\},\{\omega_k\}} \left\{ \sum\limits_{k=1}^{n} \left\| \partial_t \left[ \left( \delta(t) + \dfrac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \\ \sum\limits_{k=1}^{n} u_k = x(t) \\ u_k(t) = A_k(t)\cos(\phi_k(t)) \end{cases} \tag{4}$$
where uk is the kth IMF, ωk denotes the corresponding center frequency, δ(t) denotes the impulse function, and * is the convolution operation in the time domain. ϕk(t) is the phase of uk (a non-decreasing function), Ak(t) ≥ 0 is the signal envelope, and both Ak(t) and the instantaneous frequency ωk(t) = ϕ′k(t) ≥ 0 change slowly compared with ϕk(t).
Based on VMD, ln S(f) is decomposed into multiple IMFs from low to high frequencies, and these IMFs can be divided into two parts: ln W(f)- and ln R(f)-dominant. The ln W(f) part of the signal changes slowly with frequency, so it mainly exists in the first few IMFs, while the ln R(f) that should be excluded mainly exists in the later IMFs.

2.1.3. Interference Effect ln R(f) Removal

Correlation analysis can be used to find the dividing point between the two groups of IMFs (the ln W(f) and ln R(f) parts). The signal is then reconstructed from the low-frequency components before the dividing point, namely the ln W(f)-related parts.
Based on the mutual information (MI) [37,38], the normalized “generalized correlation coefficient (GCC)” is first defined as:
$$R_g^k = \frac{I(u_k, u_{k+1})}{\sqrt{H(u_k)\,H(u_{k+1})}}, \quad k = 1 \sim n-1 \tag{5}$$
where I(·,·) denotes the MI of two adjacent IMFs and H(·) denotes the entropy of an IMF.
Theoretically, two IMFs belonging to the same ln W(f) or ln R(f) are highly correlated, whereas two IMFs belonging to ln W(f) and ln R(f) are less correlated, so the minimum in Rg can be used as a dividing point between ln W(f) and ln R(f). However, due to the random noise nature exhibited by the ln R(f) part of the signal, the degree of correlation between two IMFs belonging to the same ln R(f) is unpredictable, and the minimum Rg does not guarantee an accurate finding of the cutoff.
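As an illustration of how Rg might be computed in practice, the following Python sketch estimates the MI and the entropies with a plain equal-width histogram and normalizes the MI by the geometric mean of the two entropies. The histogram estimator, the bin count `bins`, and the helper names `gcc` and `adjacent_gcc` are assumptions made for this example; the text does not state which estimator is used.

```python
import math
from collections import Counter


def _bin_indices(x, bins):
    """Map each sample to an equal-width histogram bin index."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), bins - 1) for v in x]


def _entropy(counts, n):
    """Shannon entropy (in nats) from a Counter of bin occupancies."""
    return -sum(c / n * math.log(c / n) for c in counts.values())


def gcc(u, v, bins=8):
    """Normalised MI of two equally long series: I(u, v) / sqrt(H(u) H(v))."""
    bu, bv = _bin_indices(u, bins), _bin_indices(v, bins)
    n = len(u)
    h_u = _entropy(Counter(bu), n)
    h_v = _entropy(Counter(bv), n)
    h_uv = _entropy(Counter(zip(bu, bv)), n)  # joint entropy
    mi = h_u + h_v - h_uv                     # I(u, v) = H(u) + H(v) - H(u, v)
    denom = math.sqrt(h_u * h_v)
    return mi / denom if denom > 0 else 0.0


def adjacent_gcc(imfs, bins=8):
    """Return [R_g^1, ..., R_g^(n-1)] for a list of n IMFs (assumed helper)."""
    return [gcc(imfs[k], imfs[k + 1], bins) for k in range(len(imfs) - 1)]
```

With this normalization a series compared with itself gives a value of 1, and a constant series gives 0, which makes values from different IMF pairs directly comparable.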
Therefore, this paper uses the difference between two adjacent Rg values as the evaluation criterion for the cutoff point, and further defines the “correlation distortion coefficient (CDC)” as:
$$I_k = R_g^k - R_g^{k-1}, \quad k = 2 \sim n-1 \tag{6}$$
The dividing point is the index at which Ik takes a negative value with the largest absolute value:
$$G = \underset{k = 2 \sim n-1}{\arg\min}\,\{I_k\} \tag{7}$$
Then, the ln W(f)-dominant signal at t = t1 is reconstructed by:
$$\ln U(t_1, f) = \sum_{k=1}^{G} u_k \tag{8}$$
where ln U(t1, f) is considered as the reconstructed signal without the effect of reflection coefficients between thin interbeds.
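The cutoff search described above can be sketched in a few lines of Python. The Rg values below are made up to illustrate the failure mode mentioned earlier: the global minimum of Rg falls among the late IMFs, while the sharpest drop occurs at k = 2. The helper names `cdc_split` and `reconstruct` are hypothetical, and taking the index of the most negative Ik is this sketch's reading of the criterion.

```python
def cdc_split(rg):
    """Find the cutoff index G from adjacent-IMF correlations.

    rg[k - 1] holds R_g^k for k = 1 .. n-1.
    Returns (G, cdc), where cdc maps k -> I_k = R_g^k - R_g^(k-1) for k = 2 .. n-1.
    """
    cdc = {k: rg[k - 1] - rg[k - 2] for k in range(2, len(rg) + 1)}
    g = min(cdc, key=cdc.get)  # index of the most negative I_k
    return g, cdc


def reconstruct(imfs, g):
    """Sum the first g IMFs sample by sample (the ln W(f)-dominant part)."""
    return [sum(column) for column in zip(*imfs[:g])]


# Hypothetical R_g^1 .. R_g^5 for n = 6 IMFs.
rg = [0.90, 0.30, 0.35, 0.25, 0.40]
g, cdc = cdc_split(rg)
print(g)                      # 2
print(rg.index(min(rg)) + 1)  # 4: the plain minimum of Rg points at a late IMF
```

In this made-up case the minimum of Rg alone would place the cutoff after the fourth IMF, whereas the largest drop between adjacent Rg values places it after the second.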
Following Xue et al. [27], the signal at t = t2 is too complex to be analyzed separately, so the result at t = t1 is used as a reference. The higher the correlation coefficient between U(t1, f) and U(t2, f), the smaller the residual influence of reflection-coefficient-related content in U(t2, f). Let:
$$G_2 = \underset{k = 1 \sim n}{\arg\max}\left\{ R\!\left( \sum_{i=1}^{k} \exp(u_i),\; U(t_1, f) \right) \right\} \tag{9}$$
where R denotes the correlation coefficient, and G2 here denotes the cutoff index at t = t2 rather than the diffusion loss in Equation (1).
Then, the signal at t = t2 is reconstructed by:
$$\ln U(t_2, f) = \sum_{k=1}^{G_2} u_k \tag{10}$$

2.1.4. Q Calculation and Frequency Band Range Selection

Combining Equations (8) and (10), the Q calculation in Equation (2) can be rewritten as:
$$\ln U(t_2, f) = \ln U(t_1, f) + \frac{(1 - R_1^2)\,G_2 R_2}{R_1} - \frac{\pi\,dT}{Q}\,f \tag{11}$$
To estimate Q, the quadratic objective function F(a, b) is minimized over the frequency band [f1, f2] using the least-squares method:
$$F(a, b) = \sum_{f = f_1}^{f_2} \left[ \ln U(t_1, f) - \ln U(t_2, f) + a - b f \right]^2 \tag{12}$$
where a corresponds to the frequency-independent term in Equation (11) and b corresponds to π dT/Q. With b obtained from Equation (12), Q is estimated as:
$$Q = \frac{\pi\,dT}{b} \tag{13}$$
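The least-squares step can be illustrated with a short Python sketch that fits a straight line to the difference of the two reconstructed log spectra and converts the slope into Q. The function name `estimate_q`, the units (frequency in Hz, dT in seconds), and the synthetic spectra are assumptions for illustration only, not values from this study.

```python
import math


def estimate_q(freqs, ln_u1, ln_u2, d_t):
    """Fit ln U(t1, f) - ln U(t2, f) = -a + b f by ordinary least squares.

    Returns (Q, a, b) with Q = pi * d_t / b.
    """
    y = [p - q for p, q in zip(ln_u1, ln_u2)]
    n = len(freqs)
    f_mean = sum(freqs) / n
    y_mean = sum(y) / n
    sxx = sum((f - f_mean) ** 2 for f in freqs)
    sxy = sum((f - f_mean) * (v - y_mean) for f, v in zip(freqs, y))
    b = sxy / sxx               # slope of the fitted line
    a = b * f_mean - y_mean     # the intercept of y equals -a
    return math.pi * d_t / b, a, b


# Synthetic check: build a lower-horizon spectrum with a known Q and recover it.
d_t, q_true, a_true = 0.002, 100.0, -0.5          # hypothetical values
freqs = [4000.0 + 500.0 * i for i in range(17)]   # 4 kHz to 12 kHz
ln_u1 = [-1e-4 * f for f in freqs]
ln_u2 = [u + a_true - math.pi * d_t / q_true * f for u, f in zip(ln_u1, freqs)]
q_est, a_est, b_est = estimate_q(freqs, ln_u1, ln_u2, d_t)
print(round(q_est, 6), round(a_est, 6))
```

Because the fit uses only the frequencies passed in, the choice of the band [f1, f2] directly controls which samples drive the slope, which is why the band selection matters for the final Q.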
To obtain Q, the frequency band [f1, f2] in Equation (12) should be determined. [f1, f2] is closely related to the final estimated Q [5]. Therefore, high signal-to-noise ratio (SNR) screening is necessary to weaken the interference from background noise and other full-band properties.
The following band range selection process was developed:
(1) Band filtering based on background noise
The average energy intensity of the water body is taken as the background noise. The SNR of the reconstructed signal ln U(t, f) is calculated over the full band, and frequencies where the SNR is less than one are filtered out.
(2) Band filtering based on energy anomalies
Because acoustic waves are attenuated during propagation, the energy at the lower horizon must be smaller than that at the upper horizon in an ideal signal. The difference between the energies at the upper and lower horizons is therefore calculated, and frequencies where the difference is less than zero are filtered out.
(3) Selecting the energy concentration band
The peak of the horizon energy in the remaining band range is found, and a search is then started from the peak position to the left and to the right to find the first point that does not satisfy the following condition:
$$\frac{E(f)}{E_{peak}} > \eta \tag{14}$$
where Epeak is the peak energy, E(f) is the energy intensity at each frequency, and η is the set threshold value. Based on experiments, η is given an empirical range of 0.3~0.6. The two critical frequencies identified in this way are denoted f1 and f2 and bound the band used in Equation (12).
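The three screening steps can be sketched as follows. This is a minimal Python illustration under several assumptions that the text does not fix: energies are given on a linear scale per frequency sample, the SNR test is applied at both horizons, the peak search uses the upper-horizon energy, the search also stops at a frequency removed in steps (1) or (2), and the returned bounds are the outermost frequencies that still satisfy the threshold. The function name `select_band` and all numbers are hypothetical.

```python
def select_band(freqs, e_upper, e_lower, e_noise, eta=0.5):
    """Return (f1, f2) for the fitting band, or None if no frequency survives."""
    n = len(freqs)
    keep = [
        min(e_upper[i], e_lower[i]) >= e_noise[i]  # step 1: drop SNR < 1
        and e_upper[i] - e_lower[i] >= 0           # step 2: drop energy anomalies
        for i in range(n)
    ]
    candidates = [i for i in range(n) if keep[i]]
    if not candidates:
        return None

    # Step 3: grow the band outwards from the energy peak while E(f)/E_peak > eta.
    peak = max(candidates, key=lambda i: e_upper[i])
    threshold = eta * e_upper[peak]
    lo = peak
    while lo - 1 >= 0 and keep[lo - 1] and e_upper[lo - 1] > threshold:
        lo -= 1
    hi = peak
    while hi + 1 < n and keep[hi + 1] and e_upper[hi + 1] > threshold:
        hi += 1
    return freqs[lo], freqs[hi]


# Hypothetical energies at seven frequencies (kHz).
freqs = [2, 4, 6, 8, 10, 12, 14]
e_upper = [1, 3, 8, 10, 7, 2, 1]
e_lower = [0.5, 2, 5, 6, 4, 2.5, 0.2]
e_noise = [1] * 7
print(select_band(freqs, e_upper, e_lower, e_noise))  # (6, 10)
```

A larger η narrows the band around the energy peak, while a smaller η admits more low-energy samples into the fit.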

2.2. Joint Feature Clustering and Q for Sediment Classification

2.2.1. Unsupervised Classification

In the classification of SBP data, the classification units are partitioned by horizons. Horizons can be picked manually, semi-automatically, or automatically. This paper adopts the integrated automatic horizon picking method based on joint amplitude and phase information [39], supplemented with manual correction.
Based on a comprehensive consideration of the media properties, the feature parameters need to be chosen as comprehensively as possible. In this paper, we select various statistical features of the sediment-dependent SBP data, including amplitude intensity features and grayscale image texture features. The four amplitude feature parameters of mean, variance, median, and correlation coefficient are selected. The six texture feature parameters of mean, variance, skewness, kurtosis, energy (uniformity), and entropy are selected [40,41].
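The six texture parameters named above can be illustrated with first-order gray-level histogram statistics. The following Python sketch assumes these standard definitions (normalized third and fourth moments for skewness and kurtosis, sum of squared probabilities for energy, base-2 Shannon entropy); the exact definitions used in [40,41] may differ, and the function name `texture_features` is hypothetical.

```python
import math
from collections import Counter


def texture_features(gray_levels):
    """First-order histogram statistics of a flat list of integer gray levels."""
    n = len(gray_levels)
    probs = {g: c / n for g, c in Counter(gray_levels).items()}
    mean = sum(g * p for g, p in probs.items())
    var = sum((g - mean) ** 2 * p for g, p in probs.items())
    std = math.sqrt(var)
    # Skewness and kurtosis are undefined for a flat patch; report 0 in that case.
    skew = sum((g - mean) ** 3 * p for g, p in probs.items()) / std ** 3 if std else 0.0
    kurt = sum((g - mean) ** 4 * p for g, p in probs.items()) / std ** 4 if std else 0.0
    energy = sum(p * p for p in probs.values())               # uniformity
    entropy = -sum(p * math.log2(p) for p in probs.values())  # in bits
    return {
        "mean": mean,
        "variance": var,
        "skewness": skew,
        "kurtosis": kurt,
        "energy": energy,
        "entropy": entropy,
    }


# A tiny hypothetical patch with two equally frequent gray levels.
print(texture_features([0, 0, 1, 1]))
```

Energy and entropy move in opposite directions: a uniform patch has energy 1 and entropy 0, while a patch spread over many gray levels has low energy and high entropy.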
Meanwhile, to address the information redundancy that can arise from having too many feature parameters, principal component analysis (PCA) is performed to determine the divergence and correlation of the feature parameters and thus the contribution of each one to the classification. Accordingly, the dimension of the feature vector is reduced to lower the classification error [42,43].
Finally, unsupervised feature clustering is performed rapidly based on the k-means algorithm [44,45]. k-means is an unsupervised classification method based on Euclidean distance. The basic principle of the k-means algorithm is to determine the cluster to which each sample belongs based on its distance from the cluster center, update the cluster center with the newly added sample data, and so on, iteratively, until the clustering result converges. The ideal goal of clustering is to partition the sample data into K disjoint datasets such that the sum of distances from each data point to the cluster center is minimized, and the clustering is determined according to whether the sum of distances converges or not.
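The iteration described above (assign each sample to the nearest center, recompute the centers, repeat until the assignment stops changing) can be sketched in plain Python. This is a generic illustration of the k-means procedure, not the implementation used in this study; the function name `kmeans`, the explicit `init_centers` argument, and the sample points are assumptions chosen to keep the example deterministic.

```python
import math


def kmeans(points, init_centers, max_iter=100):
    """Cluster points (tuples of floats) by Euclidean distance.

    Returns (labels, centers); the number of clusters K is len(init_centers).
    """
    centers = [tuple(c) for c in init_centers]
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center for every sample.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical 2D feature vectors forming three well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
labels, centers = kmeans(points, init_centers=[(0, 0), (10, 10), (20, 0)])
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The result of k-means depends on the initial centers, so in practice several initializations are often tried and the one with the smallest total distance is kept.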
The number of classifications K is a hyperparameter that generally needs to be chosen empirically. In this paper, it is set to n + 1, where n is the sediment class number, which is a priori the number obtained from the drilling data, and 1 denotes the seawater region and the deep non-reflective echo region.

2.2.2. Identification of the Type of Sediment

Since SBP data are segmented by unsupervised classification without specific sediment categories, further determinations based on drilling data or other information are necessary. In the geological survey, the sediment category is often classified according to the grain size, and empirical models between Q and the average grain size have been established in existing papers [2,5].
Thus, the Q values of the central regions of each subset can be computed using the proposed method, and the sediment classes of each subset can then be determined from the Q values.

3. Experiment and Analysis

3.1. Overview of Experimental Data

The data were surveyed in Tianjin, China, using Innomar SES2000 Standard SBP [16]. Four measurement lines were selected that were close to the drilling points. The total length of the lines was 6.471 km, with an average separation of 540 m between the lines and an average water depth of about 20 m.
A total of 25 drill holes (ZK01~ZK25) were scheduled for this survey, with a required depth of 8 m below the surface of the seabed. Based on laboratory analysis of the drilling data, the sediment in the measured area can be roughly divided into two layers, with the upper layer being predominantly sandy mud and the lower layer being predominantly sandy clay, with a small amount of sandy silt interspersed. Sandy mud is characterized by uneven soil quality, large amounts of sand particles with large grain sizes, and about 27.3% clay particles. Sandy clay generally has grain sizes less than 0.005 mm, with a small quantity of silty soil layers. Sandy silt has a grain size between 0.005 and 0.05 mm, and the soil is not uniform in quality, with a small amount of silt included. The typical sedimentary drilling figures are shown in Figure 3.
Combining the empirical models of Stevenson et al. [2] and Pinson et al. [5], the Q and sediment comparison table applicable to this region is shown in Table 1.

3.2. Unsupervised Classification

The method of Li et al. [39] was adopted for horizon extraction, and the units for classification and Q calculation were divided by horizons. The nine feature parameters mentioned in Section 2.2.1 were computed separately according to the classification unit and PCA was performed.
In PCA, degree is the ratio of each principal component eigenvalue to the sum of all eigenvalues, and sum-degree denotes the cumulative contribution degree. A share of 85% of the sum-degree was used as the basis for selection. Thus, the first three principal components were chosen to form a rotation matrix and the original feature parameters were reduced to 3D feature vectors by rotation transformation.
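The selection rule based on the cumulative contribution can be written as a small Python helper. The nine eigenvalues below are invented for illustration and are not the values obtained in this study; the function name `n_components_for` is hypothetical.

```python
def n_components_for(eigenvalues, target=0.85):
    """Smallest number of leading components whose sum-degree reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value / total  # add this component's degree
        if cumulative >= target:
            return count
    return len(ordered)


# Hypothetical eigenvalues of a nine-feature covariance matrix.
eigenvalues = [4.5, 2.5, 1.6, 0.6, 0.4, 0.2, 0.1, 0.05, 0.05]
print(n_components_for(eigenvalues))  # 3 (cumulative degrees 0.45, 0.70, 0.86)
```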
Based on the feature vectors of each unit after rotation, k-means classification was performed for each measurement line (considering the regional sediment class, K = 3 was taken), and the classification results are shown in Figure 4.
It can be seen that the unsupervised classification results are fairly consistent with the law of sediment distribution and have high reliability. The situation is complicated by the large number of horizons in the fourth line. The number of clusters will thus be increased later for targeted experimental analysis.

3.3. Q Calculation Results

3.3.1. Validity of Correlation Analysis Method

To verify the effectiveness of the CDC proposed in this paper for signal decomposition, 16 IMFs were obtained by VMD of the signal, and the MI, GCC (Rg), and CDC (Ik) were then calculated. The results are given in Table 2, Table 3, and Figure 5. The separation point among the IMFs determined by manual discrimination is between the second and third IMFs, i.e., G = 2.
By comparison, the MI is found to fluctuate considerably and the trend of the curve change is not obvious. Its minimum point lies between the 13th and 14th IMFs and the second MI value (the theoretical minimum) is at the higher level of the overall graph. The normalized GCC is more stable and has a minimum between the second and third IMFs if only the first ten IMFs are considered, which is more consistent with the manual interpretation. However, its minimum is still wrong when there are more IMFs. In contrast, the CDC is more in line with the desired target, both in terms of the location of the minimum and the fluctuations. The experimental results can demonstrate that the CDC is more detailed and accurate in distinguishing IMFs belonging to different signals.
The experiment was repeated on 100 pings of data, and the results were compared with manual interpretation, which was taken as the true value. The accuracy obtained using MI is 23 percent and that using GCC is 56 percent, whereas the accuracy using CDC is 94 percent with a maximum deviation of no more than one IMF. This indicates that the proposed modified algorithm significantly improves the accuracy of the reconstructed signal.

3.3.2. Single-Ping Data Results

As an example of the signal processing approach in this paper, ping data were taken near a drilling site, and the results are shown in Figure 6.
Figure 6a shows the amplitude sequence of the ping data, where three stronger echo positions can be seen; Figure 6b shows the time–frequency spectrum obtained after GST, where the horizontal axis is the signal time, the left vertical axis is the instantaneous frequency, and the color bar on the right indicates the amplitude value; Figure 6c shows the original amplitude signal at the three horizon positions and the corresponding reconstructed signal; Figure 6d shows the difference between two adjacent reconstructed logarithmic spectra and its fitted straight line.
Based on the fit results, for layer 1, the fit residual is 3.6144 and the Q value is 108.4639, so the layer is judged to be sandy silt. For layer 2, the fit residual is 1.1636 and the Q value is 221.4818, so the layer is judged to be clay. The sediment determination is in good agreement with the drilling data.

3.3.3. Multi-Ping Results

The results of the successive multi-ping data experiments are shown in Table 4, which compares the traditional SR method with the proposed method. It can be seen that the traditional SR method is far from robust: the Q results deviate strongly between neighboring pings, with a standard deviation (SD) of up to 562.9094. After signal reconstruction and band selection with the proposed method, the Q results are more stable and the standard deviation is reduced to 7.5292. Thus, in terms of the standard deviation, the improvement in the stability of the Q calculation using this method can reach more than 90 percent.
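As a quick check of the quoted improvement, and assuming that "improvement" means the relative reduction in SD, the two values given above yield:

$$\frac{562.9094 - 7.5292}{562.9094} = \frac{555.3802}{562.9094} \approx 0.987$$

that is, a reduction of about 98.7 percent for this pair of values, which is consistent with the statement that the improvement can exceed 90 percent.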

3.4. Sediment Class Identification

Based on the unsupervised classification results, Q values were calculated for all units and boxplots were constructed for different categories, as shown in Figure 7 and Table 5.
Figure 7 shows that there is a clear difference in the Q values between the different classes, with only a partial overlap in line 3, but the main parts are still clearly differentiated. Drilling data indicate that the upper layers are sandy silt and the lower layers are sandy clay in the measured area. From Table 5, it can be seen that the specific categories of each type can be accurately interpreted using the mean values. Therefore, it can be argued that the interpretation of the sediment class based on Q is well defined.

4. Discussion

4.1. Comparison with Traditional SR Method

In contrast to the conventional SR method of calculating Q, the proposed method reduces the effect of thin interbeds and background noise by signal reconstruction and band selection (as exemplified by the single-ping data in Section 3.3.2). The comparison is shown in Figure 8.
Figure 8a,c shows the log spectrum difference (ln S(ti) − ln S(ti+1)) between the two horizons and the corresponding fitted line obtained using the original signal and the traditional SR method. Figure 8b,d shows the reconstructed signal log spectrum difference (ln U(ti) − ln U(ti+1)) and the fitted line obtained using the method in this paper. They show that the difference between the two reconstructed log spectra in the selected frequency bands has a better linear character compared to the traditional method.
Between horizons 1 and 2, the conventional SR method yields a fit residual of 18.3287, which our method reduces to 3.6144. The slopes of the lines obtained using the two methods are 1.7577 × 10⁻⁵ and 5.2484 × 10⁻⁵, respectively, and the calculated Q values are 323.8621 and 108.4639, respectively. Drilling data suggest that the sediment there is sandy mud and that Q should be below 100. Between horizons 2 and 3, the residuals are 13.8364 and 1.1636 for the two methods, giving final Q values of 4149.6 and 221.4818, respectively. Drilling data suggest that the sediment there is sandy clay and that Q should be above 150.
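The slopes and Q values quoted for horizons 1 and 2 can be cross-checked against the relation Q = π dT / b: since both methods use the same pair of horizons, the product Q·b should give the same dT. The short Python check below assumes the slopes are expressed in seconds (that is, frequency in Hz), which the text does not state; `implied_dt` is a hypothetical helper name.

```python
import math


def implied_dt(q, slope):
    """Invert Q = pi * dT / b to recover the travel-time difference dT."""
    return q * slope / math.pi


dt_sr = implied_dt(323.8621, 1.7577e-5)   # traditional SR method
dt_new = implied_dt(108.4639, 5.2484e-5)  # proposed method
print(f"{dt_sr * 1e3:.3f} ms, {dt_new * 1e3:.3f} ms")
```

Both pairs imply a dT of about 1.81 ms under this unit assumption, so the two reported Q values differ only because of the fitted slopes, not because of different horizon picks.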
From the fitted residuals, our method effectively improves the confidence level of the results. The results of Q computation show that our method yields results that are closer to the true values and can improve the accuracy of Q computation.

4.2. Influence of K Number on Unsupervised Classification Results

The number of clusters, K, is a hyperparameter in the computation of the k-means algorithm and has a large impact on the results. In this paper, it is set according to the a priori number of sediment categories. Due to the complexity of the horizons of line 4, the number of classification categories is again increased to three for unsupervised classification, and the classification results are shown in Figure 9.
It can be seen that the distribution of the first layer is basically unchanged, and the main changes are concentrated in the lower layers (marked as Type 2 and Type 3). These two types show no clear upper and lower stratification in their spatial distribution and are intermingled with each other. Their Q values were calculated separately to obtain Figure 10 and Table 6.
As shown in Figure 10 and Table 6, there is little difference between the two types and the distribution of the various values of Q is almost the same between the two types. This is also consistent with their spatial distribution features. Thus, they can be identified as the same sediment (the sandy clay).
The above experimental results show that using the prior number of sediment categories to determine K is practical and reliable. Moreover, the classification results from measurement line 4 show that Q not only ensures the accuracy of sediment classification, but also serves as an evaluation index for k-means classification. It can even be used to indicate the number of clusters in the absence of a prior number of sediment categories.

4.3. Comparison with the Method Based on VMD and MI

To eliminate the interference of reflection for Q estimation in seismic data, Xue et al. [27] proposed a method based on the VMD and MI. To better fit the SBP data, we introduce CDC and band selection based on this method. The two methods are now applied to the same data and the comparison results are shown in Table 7.
If only the mean is considered, both results point to the same sediment class. However, it is clear that the Q values of our method are more stable. In addition, Q obtained by Xue’s method is occasionally incorrectly determined to be negative.
It is also noted that Xue’s method generally achieves relatively small Q values, which may be caused by the increased tail of log SR due to the use of MI. Moreover, in contrast to the proposed method, Xue’s method does not indicate how the fitted frequency bands are determined, which somewhat limits the application of the method.

4.4. Limitations and Future Directions

In this paper, we propose an improved correlation analysis method based on the IMFs obtained by VMD, which outperforms conventional MI. However, whether VMD is suitable for water acoustic signals requires further investigation.
Experiments on real data validate the effectiveness of the proposed sediment classification method. However, restricted to experimental conditions, the experimental data are from the same shallow sea area. The applicability of this method to multi-area data and deep water data is still unknown.
Since current methods are implemented with drilling data support, it is worth considering how to ensure the reliability of results without drilling data in order to better implement these techniques.
In addition, the correlation between Q and sediment used in this paper is derived from previous experimental results [2,5]. While these results of correlation between the sediment class and Q have been widely accepted, the backward inference procedure from Q to the sediment class is still controversial. It is also worth investigating whether the Q values of the same type of sediment vary across different layers. Those interested in going further in this area may consider more scientific experimental validation.

5. Conclusions

This paper studies the accurate determination of Q in view of the noise sensitivity of traditional SR methods. Using VMD and an accurate determination of the effective band range, we removed the interference effects caused by thin interbeds and obtained a robust estimate of Q. By combining this with automatic determination of the sediment class, we further addressed the problem that traditional Q-based methods cannot be applied to large areas. In conclusion, we achieved an accurate and automatic classification of sub-bottom sediments over a large area based on Q, which provides theoretical support for engineering applications of acoustic classification on the seafloor.
However, the introduction of VMD for signal decomposition also introduces many related problems, such as the determination of the number of modes. Whether the proposed approach is applicable to deep-water acoustic signals also requires further investigation. The practical application of this method to complex sedimentary environments is still far from being implemented, and further studies are needed.

Author Contributions

Conceptualization, Z.Z., J.Z. and S.L.; funding acquisition, J.Z. and H.Z.; investigation, Z.Z. and S.L.; methodology, Z.Z., J.Z. and S.L.; writing—original draft, Z.Z.; writing—review and editing, S.L., J.Z. and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2022YFC2808303 and National Natural Science Foundation of China under Grant 42176186.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Access to the data will be considered upon request to the authors.

Acknowledgments

We would like to thank the editor and anonymous reviewers for their valuable comments and suggestions that greatly improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Plets, R.M.K. The Acoustic Imaging, Reconstruction and Characterization of Buried Archaeological Material; University of Southampton: Southampton, UK, 2007.
  2. Stevenson, I.; McCann, C.; Runciman, P. An attenuation-based sediment classification technique using Chirp sub-bottom profiler data and laboratory acoustic analysis. Mar. Geophys. Res. 2002, 23, 277–298.
  3. Dvorkin, J.P.; Mavko, G. Modeling attenuation in reservoir and nonreservoir rock. Lead. Edge 2006, 25, 194–197.
  4. Kneib, G.; Shapiro, S.A. Viscoacoustic wave propagation in 2-D random media and separation of absorption and scattering attenuation. Geophysics 1995, 60, 459–467.
  5. Pinson, L.J.W.; Henstock, T.J.; Dix, J.K.; Bull, J.M. Estimating quality factor and mean grain size of sediments from high-resolution marine seismic data. Geophysics 2008, 73, G19–G28.
  6. Aki, K.; Richards, P.G. Quantitative Seismology, 2nd ed.; University Science Books: Melville, NY, USA, 2002; pp. 161–177.
  7. Blias, E. Accurate interval Q-factor estimation from VSP data. Geophysics 2012, 77, WA149–WA156.
  8. Zhao, L.F.; Mousavi, S.M. Lateral variation of crustal Lg attenuation in eastern North America. Sci. Rep. 2018, 8, 7285.
  9. Wang, Y. Q analysis on reflection seismic data. Geophys. Res. Lett. 2004, 31, L17606.
  10. Chopra, S.; Marfurt, K.J. Seismic Attributes for Prospect Identification and Reservoir Characterization; Society of Exploration Geophysicists and European Association of Geoscientists and Engineers: Utrecht, The Netherlands, 2007.
  11. Yaojun, W.; Shuangquan, C.; Lei, W.; Li, X.Y. Modeling and analysis of seismic wave dispersion based on the rock physics model. J. Geophys. Eng. 2013, 10, 054001.
  12. Schock, S. A method for estimating the physical and acoustic properties of the sea bed using chirp sonar data. IEEE J. Ocean. Eng. 2005, 29, 1200–1217.
  13. Tonn, R. The Determination of the seismic quality factor Q from VSP data: A comparison of different computational methods. Geophys. Prospect. 1991, 39, 1–27.
  14. Engelhard, L. Determination of seismic-wave attenuation by complex trace analysis. Geophys. J. Int. 1996, 125, 608–622.
  15. Bath, M. Spectral Analysis in Geophysics; Elsevier Scientific Pub: London, UK, 1974.
  16. Wu, Z.; Yang, F.; Luo, X.; Li, S.; Xiong, M. High-Resolution Submarine Topography—Theory and Technology for Surveying and Post-Processing; Science Press: Beijing, China, 2017.
  17. Jannsen, D.; Voss, J.; Theilen, F. Comparison of methods to determine Q in shallow marine sediments from vertical reflection seismograms. Geophys. Prospect. 2010, 33, 479–497.
  18. Schock, S.G. The Chirp Sonar—A High-Resolution, Quantitative Subbottom Profiler; University of Rhode Island: Kingston, RI, USA, 1989.
  19. Schock, S. Remote estimates of physical and acoustic sediment properties in the South China Sea using chirp sonar data and the biot model. IEEE J. Ocean. Eng. 2004, 29, 1218–1230.
  20. Liu, G.; Chen, X.; Rao, Y. Seismic quality factor estimation using frequency-dependent linear fitting. J. Appl. Geophys. 2018, 156, 1–8.
  21. Panda, S. Remote Acoustic Evaluation of Seafloor Sediment Properties; University of Rhode Island: Kingston, RI, USA, 1992.
  22. Li, S.; Zhao, J.; Zhang, H.; Qu, S. Sub-Bottom Sediment Classification Using Reliable Instantaneous Frequency Calculation and Relaxation Time Estimation. Remote Sens. 2021, 13, 4809.
  23. Hackert, C.L.; Parra, J.O. Improving Q estimates from seismic reflection data using well-log-based localized spectral correction. Geophysics 2004, 69, 1521–1529.
  24. Tu, N.; Lu, W.-K. Improve Q estimates with spectrum correction based on seismic wavelet estimation. Appl. Geophys. 2010, 7, 217–228.
  25. Li, C.; Liu, X. A new method for interval Q-factor inversion from seismic reflection data. Geophysics 2015, 80, R361–R373.
  26. Chen, S.; Wang, K.; Peng, Z.; Chang, C.; Zhai, W. Generalized dispersive mode decomposition: Algorithm and applications. J. Sound Vib. 2020, 492, 115800.
  27. Xue, Y.J.; Cao, J.X.; Wang, X.J.; Du, H.K. Estimation of seismic quality factor in the time-frequency domain using variational mode decomposition. Geophysics 2020, 85, V329–V343.
  28. Baradello, L. An improved processing sequence for uncorrelated Chirp sonar data. Mar. Geophys. Res. 2014, 35, 337–344.
  29. Forte, E.; Dossi, M.; Pipan, M.; Del Ben, A. Automated phase attribute-based picking applied to reflection seismics. Geophysics 2016, 81, V141–V150.
  30. LeBlanc, L.R.; Panda, S.; Schock, S.G. Sonar attenuation modeling for classification of marine sediments. J. Acoust. Soc. Am. 1992, 91, 116–126.
  31. Mattia, N.; Tore, G.K.; Daniel, P. Sketch-based modelling and visualization of geological deposition. Comput. Geosci. 2014, 67, 40–48.
  32. Stockwell, R.G.; Mansinha, L.; Lowe, R.P. Localization of the complex spectrum: The S transform. IEEE Trans. Signal Process. 1996, 44, 998–1001.
  33. Adams, M.D.; Kossentini, F.; Ward, R.K. Generalized S transform. IEEE Trans. Signal Process. 2002, 50, 2831–2842.
  34. Pinnegar, C.R.; Mansinha, L. The S-transform with windows of arbitrary and varying shape. Geophysics 2003, 68, 381–385.
  35. Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
  36. Xu, T.; Zeng, Z.; Huang, X.; Li, J.; Feng, H. Pipeline Leak Detection based on Variational Mode Decomposition and Support Vector Machine Using an Interior Spherical Detector. Process. Saf. Environ. Prot. 2021, 153, 167–177.
  37. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69, 066138.
  38. Maes, F.; Collignon, A.; Vandermeulen, D.; Marchal, G.; Suetens, P. Multimodality image registration by maximization of mutual information. IEEE Trans. Med Imaging 1997, 16, 187–198.
  39. Li, S.; Zhao, J.; Zhang, H.; Qu, S. An Integrated Horizon Picking Method for Obtaining the Main and Detailed Reflectors on Sub-Bottom Profiler Sonar Image. Remote Sens. 2021, 13, 2959.
  40. Lee, G.H.; Kim, H.J.; Kim, D.C.; Yi, B.Y.; Nam, S.M.; Khim, B.K.; Lim, M.S. The acoustic diversity of the seabed based on the similarity index computed from Chirp seismic data. ICES J. Mar. Sci. J. Du Cons. 2008, 66, 227–236.
  41. Shang, X.; Robert, K.; Misiuk, B.; Mackin-McLaughlin, J.; Zhao, J. Self-adaptive analysis scale determination for terrain features in seafloor substrate classification. Estuarine Coast. Shelf Sci. 2021, 254, 107359.
  42. Reich, D.; Price, A.L.; Patterson, N. Principal component analysis of genetic data. Nat. Genet. 2008, 40, 491–492.
  43. Takane, Y.; Shibayama, T. Principal component analysis with external information on both subjects and variables. Psychometrika 1991, 56, 97–120.
  44. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
  45. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. Appl. Stat. 1979, 28, 100–108.
Figure 1. Overall technology roadmap. It covers two parts: unsupervised classification and Q computation with modified SR.
Figure 2. Workflow of Q estimation using VMD. VMD is first used to obtain a series of IMFs from the time–frequency spectra generated by the GST of SBP ping data. Then, the correlation analysis is performed successively on the IMFs to reconstruct the signal, which removes the effect of the reflection coefficient. Finally, band selection and least-squares fitting are used to calculate Q.
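The last step of this workflow, band selection followed by least-squares fitting, can be illustrated with a minimal Python sketch. It assumes the standard spectral-ratio model, in which the amplitude spectrum at the deeper horizon equals the spectrum at the shallower horizon multiplied by a frequency-independent factor and by exp(−π f Δt / Q), so that ln U(t_i) − ln U(t_{i+1}) is linear in frequency with slope πΔt/Q. The function name `estimate_q`, the synthetic spectra, and the 2–8 kHz band are illustrative assumptions and are not the authors' implementation or data.

```python
import math


def estimate_q(freqs, amp_top, amp_bottom, dt, band):
    """Spectral-ratio Q estimate from two amplitude spectra (hypothetical helper).

    freqs: frequencies in Hz; amp_top / amp_bottom: amplitude spectra at the
    shallower and deeper horizons; dt: travel time between them in seconds;
    band: (f_low, f_high) effective band used for the fit.
    """
    f_low, f_high = band
    # Band selection: keep only samples inside the effective frequency band.
    pts = [(f, math.log(a1) - math.log(a2))
           for f, a1, a2 in zip(freqs, amp_top, amp_bottom)
           if f_low <= f <= f_high]
    n = len(pts)
    f_mean = sum(f for f, _ in pts) / n
    y_mean = sum(y for _, y in pts) / n
    # Ordinary least-squares slope of the log-spectral difference vs. frequency.
    slope = (sum((f - f_mean) * (y - y_mean) for f, y in pts)
             / sum((f - f_mean) ** 2 for f, _ in pts))
    # Under the assumed model, slope = pi * dt / Q.
    return math.pi * dt / slope


# Synthetic check with assumed values (not data from the paper).
q_true, dt = 50.0, 0.002
freqs = [1000.0 + 100.0 * i for i in range(91)]  # 1 kHz to 10 kHz
amp_top = [math.exp(-((f - 5000.0) / 3000.0) ** 2) for f in freqs]
# 0.3 stands in for a frequency-independent reflection/spreading factor.
amp_bottom = [0.3 * a * math.exp(-math.pi * f * dt / q_true)
              for f, a in zip(freqs, amp_top)]
q_est = estimate_q(freqs, amp_top, amp_bottom, dt, band=(2000.0, 8000.0))
print(round(q_est, 3))
```

On this noise-free synthetic input the fit recovers the assumed Q to within floating-point error. With real pings the slope is sensitive to interference and noise in the spectra, which is the problem the VMD reconstruction and band selection are meant to reduce.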
Figure 3. Typical sedimentary drilling figure. (a) ZK08 drilling figure; (b) ZK15 drilling figure.
Figure 4. Results of k-means classification (K = 3). (a–d) represent the four measurement lines. Different colors indicate different categories. Since each line is classified independently and without supervision, the same color in different lines does not necessarily indicate the same sediment type.
Figure 5. Chart of the variation in the correlation coefficient.
Figure 6. Results for a single ping. (a) The ping signal; (b) its time–frequency spectrum obtained by GST; (c) reconstructed signal at the horizon location (the original signal in blue and the reconstructed signal in orange); (d) spectral-ratio results ($\ln U(t_i) - \ln U(t_{i+1})$ in blue and the fitted line in orange).
Figure 7. Boxplots of Q for different measurement lines. (a–d) denote the four lines and correspond one-to-one to the subfigures in Figure 4. Type 1 and Type 2 denote the classes shown in different colors from the upper to the lower layer in Figure 4, excluding the seawater area. Each line is computed independently. The red plus signs mark extreme outliers.
Figure 8. Comparison of traditional SR and our method. (a) Original signal and fitted line between horizon 1 and 2; (b) reconstructed signal and fitted line between horizon 1 and 2; (c) original signal and fitted line between horizon 2 and 3; (d) reconstructed signal and fitted line between horizon 2 and 3. (Difference value between two horizons in blue and the fitted line in orange).
Figure 9. Comparison of classification results for line 4 using different K. ((a) K = 3; (b) K = 4). Different colors indicate different categories.
Figure 10. Boxplots of Q for line 4. The red plus sign represents extreme abnormal values.
Table 1. Comparison of Q ranges by sediment category.

| Category | Moisture (%) | Mean Grain Size (mm) | Q |
|---|---|---|---|
| Sandy Clay | 27~39 | 0~0.005 | 150~300 |
| Sandy Silt | 20~36 | 0.005~0.05 | 100~150 |
| Sandy Mud | 30~50 | 0.05~0.5 | 30~100 |
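As a rough illustration of how the Q ranges in Table 1 can be turned into a lookup, the following Python sketch maps a Q value to a sediment category. The function name `classify_by_q` is hypothetical, and the boundary handling is an assumption: the intervals are treated as half-open, and no upper cut-off is applied to Sandy Clay because Table 6 labels a maximum Q of 310.3159 as Sandy Clay.

```python
def classify_by_q(q):
    """Map a Q value to a sediment category using the Table 1 ranges.

    Boundary handling is an assumption of this sketch, not stated in the paper.
    """
    if q < 30:
        return None          # below every tabulated range
    if q < 100:
        return "Sandy Mud"   # Q 30~100
    if q < 150:
        return "Sandy Silt"  # Q 100~150
    return "Sandy Clay"      # Q 150~300; assumption: no upper cut-off


# Values reported in Tables 5 and 6.
for q in (90.5234, 224.6792, 101.4761, 310.3159):
    print(q, classify_by_q(q))
```

Applied to the reported values, this lookup gives the same labels as Tables 5 and 6 for the four examples shown.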
Table 2. Table of correlation coefficients (MI and GCC); minima are marked in bold.

| k | Source | MI | Rg |
|---|---|---|---|
| 1 | IMF1 to IMF2 | 1.7683 | 0.6195 |
| 2 | IMF2 to IMF3 | 0.5776 | 0.2184 |
| 3 | IMF3 to IMF4 | 0.5074 | 0.2481 |
| 4 | IMF4 to IMF5 | 0.3966 | 0.2856 |
| 5 | IMF5 to IMF6 | 0.4436 | 0.4589 |
| 6 | IMF6 to IMF7 | 0.2565 | 0.4074 |
| 7 | IMF7 to IMF8 | 0.2530 | 0.4429 |
| 8 | IMF8 to IMF9 | 0.2526 | 0.4462 |
| 9 | IMF9 to IMF10 | 0.3011 | 0.5960 |
| 10 | IMF10 to IMF11 | 0.2757 | 0.2551 |
| 11 | IMF11 to IMF12 | 0.3911 | 0.1566 |
| 12 | IMF12 to IMF13 | 0.7177 | 0.2355 |
| 13 | IMF13 to IMF14 | **0.1279** | **0.1071** |
| 14 | IMF14 to IMF15 | 0.2753 | 0.5658 |
| 15 | IMF15 to IMF16 | 0.2372 | 0.2719 |
| Separation point (G) | | 13 | 13 |
Table 3. Table of correlation coefficients (CDC); minima are marked in bold.

| k | Source | Ik |
|---|---|---|
| 2 | Rg 1 to Rg 2 | **−0.4011** |
| 3 | Rg 2 to Rg 3 | 0.0297 |
| 4 | Rg 3 to Rg 4 | 0.0375 |
| 5 | Rg 4 to Rg 5 | 0.1733 |
| 6 | Rg 5 to Rg 6 | −0.0515 |
| 7 | Rg 6 to Rg 7 | 0.0355 |
| 8 | Rg 7 to Rg 8 | 0.0033 |
| 9 | Rg 8 to Rg 9 | 0.1498 |
| 10 | Rg 9 to Rg 10 | −0.3409 |
| 11 | Rg 10 to Rg 11 | −0.0985 |
| 12 | Rg 11 to Rg 12 | 0.0789 |
| 13 | Rg 12 to Rg 13 | −0.1284 |
| 14 | Rg 13 to Rg 14 | 0.4587 |
| 15 | Rg 14 to Rg 15 | −0.2939 |
| Separation point (G) | | 2 |
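The separation points reported in Tables 2 and 3 can be reproduced from the tabulated values with a short Python sketch. It takes the separation point as the index k of the column minimum, as the captions suggest. It also relies on an observation rather than a stated definition: the Ik values in Table 3 coincide with the first differences of the Rg column in Table 2, so treating Ik as Rg(k) − Rg(k−1) is an assumption here. The helper name `argmin_k` is hypothetical.

```python
# Values copied from Table 2 (k = 1..15).
mi = [1.7683, 0.5776, 0.5074, 0.3966, 0.4436, 0.2565, 0.2530, 0.2526,
      0.3011, 0.2757, 0.3911, 0.7177, 0.1279, 0.2753, 0.2372]
rg = [0.6195, 0.2184, 0.2481, 0.2856, 0.4589, 0.4074, 0.4429, 0.4462,
      0.5960, 0.2551, 0.1566, 0.2355, 0.1071, 0.5658, 0.2719]


def argmin_k(values, first_k):
    """Return the k index (counting from first_k) of the smallest value."""
    return min(range(len(values)), key=values.__getitem__) + first_k


# First differences of Rg (k = 2..15); assumed to correspond to Ik in Table 3.
i_k = [round(b - a, 4) for a, b in zip(rg, rg[1:])]

print(argmin_k(mi, 1), argmin_k(rg, 1))  # separation points from MI and Rg
print(argmin_k(i_k, 2))                  # separation point from Ik
print(i_k)
```

The two criteria select different indices on this ping (13 from MI and Rg, 2 from Ik), which matches the "Separation point (G)" rows of the two tables.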
Table 4. Comparison of Q calculations between 10 neighboring pings at layer 1.

| Ping No. | Traditional SR | Proposed Method |
|---|---|---|
| 401 | 192.7969 | 35.3588 |
| 402 | 286.8748 | 39.6956 |
| 403 | 130.9156 | 40.6357 |
| 404 | 131.1762 | 55.1115 |
| 405 | 1437.9062 | 36.0064 |
| 406 | 92.3441 | 40.2316 |
| 407 | 1332.5844 | 35.7613 |
| 408 | 367.6768 | 47.3555 |
| 409 | 1373.0436 | 55.1474 |
| 410 | 573.2333 | 46.6754 |
| Mean | 591.8552 | 43.1979 |
| SD | 562.9094 | 7.5292 |
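The summary rows of Table 4 can be checked with Python's standard library. The sketch below recomputes the mean and standard deviation of both columns and the relative reduction in standard deviation. The tabulated SD values appear to be sample standard deviations (n − 1 denominator), which is what `statistics.stdev` computes. The variable names are illustrative.

```python
from statistics import mean, stdev

# Q values copied from Table 4 (pings 401-410).
traditional_sr = [192.7969, 286.8748, 130.9156, 131.1762, 1437.9062,
                  92.3441, 1332.5844, 367.6768, 1373.0436, 573.2333]
proposed = [35.3588, 39.6956, 40.6357, 55.1115, 36.0064,
            40.2316, 35.7613, 47.3555, 55.1474, 46.6754]

sd_sr = stdev(traditional_sr)   # sample standard deviation (n - 1)
sd_prop = stdev(proposed)
reduction = 1.0 - sd_prop / sd_sr  # relative reduction in SD

print(f"mean: {mean(traditional_sr):.4f} vs {mean(proposed):.4f}")
print(f"SD:   {sd_sr:.4f} vs {sd_prop:.4f}")
print(f"SD reduction: {100 * reduction:.1f}%")
```

For these ten pings the reduction in standard deviation is well above 90 percent, consistent with the stability improvement stated in the abstract.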
Table 5. Statistical table of Q for measurement lines.

| Line | Type | Mean-Q | Sediment Class |
|---|---|---|---|
| Line 1 | Type 1 | 90.5234 | Sandy Mud |
| Line 1 | Type 2 | 224.6792 | Sandy Clay |
| Line 2 | Type 1 | 34.4638 | Sandy Mud |
| Line 2 | Type 2 | 166.6610 | Sandy Clay |
| Line 3 | Type 1 | 72.5031 | Sandy Mud |
| Line 3 | Type 2 | 177.4148 | Sandy Clay |
| Line 4 | Type 1 | 45.7010 | Sandy Mud |
| Line 4 | Type 2 | 186.6549 | Sandy Clay |
Table 6. Statistical table of Q.

| Type | Mean-Q | Sediment Class | Min-Q | Sediment Class | Max-Q | Sediment Class |
|---|---|---|---|---|---|---|
| Type 2 | 153.6138 | Sandy Clay | 101.4761 | Sandy Silt | 310.3159 | Sandy Clay |
| Type 3 | 167.8000 | Sandy Clay | 122.0791 | Sandy Silt | 295.2156 | Sandy Clay |
Table 7. Comparison of Q calculations between 10 neighboring pings.

| Ping No. | Xue’s Method | Proposed Method |
|---|---|---|
| 01 | 64.9375 | 72.5031 |
| 02 | 54.2861 | 68.3482 |
| 03 | 68.9251 | 72.4638 |
| 04 | 102.2962 | 80.2263 |
| 05 | 57.9562 | 74.3086 |
| 06 | 80.2833 | 76.9745 |
| 07 | 41.0345 | 69.9683 |
| 08 | −43.2664 | 53.6392 |
| 09 | 58.3921 | 65.8944 |
| 10 | −34.0844 | 62.3942 |
| Mean | 45.0760 | 69.6721 |
| SD | 47.1311 | 7.6605 |
Share and Cite

MDPI and ACS Style

Zong, Z.; Zhao, J.; Li, S.; Zhang, H. Automatic Marine Sub-Bottom Sediment Classification Using Feature Clustering and Quality Factor. J. Mar. Sci. Eng. 2023, 11, 1770. https://doi.org/10.3390/jmse11091770
