1. Introduction
Synthetic aperture sonar (SAS) is a technique that repeatedly transmits and receives pulses while the sonar is moving and coherently combines the received signals to obtain a high-resolution image [1,2,3]. By combining multiple pings, it is possible to achieve the effect of a sonar operating with an aperture larger than the physical sonar aperture, hence the name “synthetic aperture” sonar. Compared to other techniques for obtaining underwater images, such as side-scan sonar, SAS obtains images with a higher resolution [4] and is used in various fields such as crude oil exploration, geological exploration, and military applications such as mine detection [5,6].
Conventional SAS methods reconstruct the image by performing a Fourier transform and matched filtering in the slant-range or azimuth domain. Depending on the domain in which the Fourier transform is performed, they are classified into back-projection in the spatial–temporal domain [1], correlation in the spatial–temporal domain [7], range-Doppler in the range-Doppler domain [8], wavenumber in the wavenumber domain [9,10], and chirp-scaling in the wavenumber domain [11]. Forming a synthetic aperture with these traditional signal processing techniques requires sampling at the Nyquist rate in the time domain as well as dense sampling in the spatial domain along the sonar track. Moreover, because conventional SAS signal processing techniques pass the data through a matched filter, side lobes are generated, which degrades image reconstruction performance [12,13].
This paper proposes SAS imaging algorithms that apply the compressive sensing (CS) framework to compensate for the disadvantages of conventional SAS signal processing techniques. CS is a technique used to recover a sparse signal from a small number of measurements [14]. Under suitable conditions, CS obtains a better resolution than conventional signal processing and suppresses side lobes. In addition, again under suitable conditions, CS obtains exact solutions even at a sampling rate below the Nyquist rate. In recent years, studies related to the CS framework have been conducted in various fields, such as medical imaging (including MRI and ultrasound imaging) and sensor networks [15,16,17,18,19,20]. In direction-of-arrival (DOA) estimation, which is a classical source localization method, CS is applied to increase the number of employed sensors or observations, thereby enhancing localization performance [21,22]. Additionally, studies have been conducted on the application of sparse reconstruction to synthetic aperture radar (SAR) [23,24,25,26,27,28] and SAS [29,30]. Many studies have applied CS to SAR but, as far as could be determined, few have applied it to SAS, especially with underwater data.
In [29], a method that applies CS to SAS imaging is presented, which estimates the reflectivity function in the area of interest using all the given data at once. Good results were obtained even when some of the data were excluded, showing that a large data reduction is possible. However, that study does not show results for actual underwater acoustic conditions; it only shows results for an ultrasonic synthetic aperture laboratory system using assumed point targets. The lack of verification against actual underwater experimental data matters because targets in real underwater environments are generally not point targets but targets with continuous characteristics. Therefore, if the reflectivity function is estimated in a single step, as in the method proposed in [29], the shape of the target will not be properly revealed and only segments with high reflectivity will be obtained. In [30], CS was applied to SAS to obtain a parsimonious representation that utilizes aspect- or frequency-specific information. Simulations and real underwater experimental data verified that the strategy using aspect- or frequency-specific information was effective. However, the method proposed in [30], which uses an iterative method called the alternating direction method of multipliers (ADMM), has the limitation that it is unstable because its convergence is highly dependent on the regularization parameter. Therefore, we propose a stable method that does not use an iterative method and that expresses the characteristics of a real underwater target by dividing the data and repeatedly estimating the reflectivity function of the area of interest.
This study offers three main contributions. First, the proposed method (called the CS-SAS algorithm for simplicity) shows better reconstruction performance than a conventional SAS algorithm. The proposed algorithms are SAS algorithms formulated from the perspective of the CS framework and in accordance with the CS characteristics; as a result, less aliasing occurs and high-resolution results are obtained. Section 4 shows that the proposed method outperforms one of the conventional SAS algorithms, the ω-k algorithm. Second, the proposed algorithms are more robust to missing sensor data. Because conventional SAS algorithms require sampling at the Nyquist rate in the time and spatial domains, they are not resistant to sensor failure or data loss in the sonar system. Conversely, the proposed algorithms are robust, as shown later in this paper. Third, few studies apply CS to SAS underwater; this study is therefore meaningful in that it applies the method to both simulation data and actual underwater experimental data.
The remainder of this paper is organized into four sections. In Section 2, the geometry of the SAS system and the ω-k algorithm, which is a representative conventional SAS algorithm, are described. In Section 3, the basic theory of CS is described, and SAS algorithms using CS are proposed. In Section 4, the performance of the proposed method is verified by comparing the results of applying the CS-SAS and the ω-k algorithms to the simulation and experimental data. Finally, conclusions are presented in Section 5.
In the following, vectors are represented by bold lowercase letters, and matrices are represented by bold capital letters. The lp-norm of a vector $\mathbf{x} \in \mathbb{C}^{N}$ is defined as $\|\mathbf{x}\|_p = \left(\sum_{i=1}^{N} |x_i|^p\right)^{1/p}$. The imaginary unit is denoted as j. The operators $(\cdot)^{T}$ and $(\cdot)^{*}$ denote the transpose and conjugate operators, respectively.
3. SAS Algorithm with CS Framework (CS-SAS)
3.1. Compressive Sensing
Compressive sensing is a method or framework for solving linear problems, such as $\mathbf{y} = \mathbf{A}\mathbf{x}$, for a sparse signal $\mathbf{x}$ [36]. Here, $\mathbf{x} \in \mathbb{C}^{N}$ is an unknown signal vector that we want to reconstruct. The unknown signal vector $\mathbf{x}$ is a k-sparse vector, meaning that $\|\mathbf{x}\|_0 = k$, that is, $\mathbf{x}$ has only k non-zero elements. $\mathbf{y} \in \mathbb{C}^{M}$ is a measurement vector consisting of measured values. In many realistic problems, $\mathbf{A} \in \mathbb{C}^{M \times N}$, called a sensing matrix, is introduced to represent the problem as a linear relationship, such as $\mathbf{y} = \mathbf{A}\mathbf{x}$. When the dimension of the measurement vector $\mathbf{y}$ is smaller than the dimension of $\mathbf{x}$, that is, $M < N$, the $\mathbf{y} = \mathbf{A}\mathbf{x}$ problem becomes an underdetermined problem and has infinitely many solutions, making it impossible to specify $\mathbf{x}$. Using the sparse property of $\mathbf{x}$, it is possible to specify a unique and exact solution among the feasible solutions of the underdetermined problem. The sparsity is imposed by the sparsity constraint l0-norm. The l0-norm minimization problem is formulated as follows:

$$\min_{\mathbf{x}} \|\mathbf{x}\|_0 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{15}$$

However, the l0-norm minimization problem, Equation (15), is an NP-hard problem that is computationally intractable. To deal with the NP-hard problem, various methods have been developed, such as l1-norm relaxation or greedy algorithms represented by orthogonal matching pursuit.
One of the most representative methods for solving the compressive sensing problem is l1-norm relaxation, which solves the problem by replacing the l0-norm with the l1-norm. The l0-norm minimization problem, Equation (15), can be relaxed by reformulating it as

$$\min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{16}$$

In the presence of noise, a sparse solution $\hat{\mathbf{x}}$ can be obtained by the following equation:

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \mu \|\mathbf{x}\|_1. \tag{17}$$

Equations (16) and (17) are called basis pursuit (BP) and basis pursuit denoising (BPDN) problems, respectively. The larger the hyperparameter $\mu$, the sparser the optimized solution $\hat{\mathbf{x}}$. Conversely, the smaller the $\mu$, the more closely the optimized solution fits the data. Therefore, it is important to choose a suitable hyperparameter. However, finding a suitable hyperparameter is complex and deemed to be outside the scope of this study.
In this study, the SAS image was obtained by solving the BPDN problem using the tool provided by CVX [37].
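As a rough illustration of how an l1-regularized problem of the BPDN type behaves, the following sketch minimizes $\tfrac{1}{2}\|\mathbf{y}-\mathbf{A}\mathbf{x}\|_2^2 + \mu\|\mathbf{x}\|_1$ with iterative soft thresholding (ISTA). This is not the CVX solver used in this study: ISTA, the factor 1/2 on the data term, the real-valued toy matrix, and the names `ista` and `soft_threshold` are assumptions made only for the example.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t*|v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, mu, step, n_iter=1000):
    """Minimize 0.5*||y - A x||_2^2 + mu*||x||_1 (real-valued A, y)."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - y.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the data term: g = A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft thresholding.
        x = [soft_threshold(x[j] - step * g[j], step * mu) for j in range(n)]
    return x


# Toy underdetermined system: 2 measurements, 3 unknowns, 1-sparse truth.
A = [[1.0, 0.0, 0.8],
     [0.0, 1.0, 0.8]]
x_true = [0.0, 0.0, 1.0]
y = [sum(a * x for a, x in zip(row, x_true)) for row in A]

# The step must not exceed 1 / lambda_max(A^T A); here lambda_max = 2.28.
x_hat = ista(A, y, mu=0.1, step=0.4)
```

In this toy case the estimate keeps only the third element non-zero, but its value is shrunk to about 0.92 instead of 1. This shrinkage grows with the hyperparameter, which is the sparsity versus data-fit trade-off described above.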
3.2. CS-SAS Algorithm for Single Sensor
To handle SAS imaging problems from the perspective of compressive sensing, the problem must first be well defined as the $\mathbf{y} = \mathbf{A}\mathbf{x}$ problem. To formulate a compressive sensing algorithm in which a single sensor is in linear motion, the signal reflected by the targets and returned to the single sensor needs to be considered first. When a signal p(t) is transmitted from a single sensor located at $(0, u)$ and reflected by N targets, the received signal $s(t, u)$ can be written as

$$s(t, u) = \sum_{k=1}^{N} \sigma_k \, p\big(t - \tau_k(u)\big), \tag{19}$$

where $\sigma_k$ is the target strength of the k-th target, $\tau_k(u) = 2\sqrt{x_k^2 + (y_k - u)^2}/c$ is a function representing the two-way travel time at sound speed c, xk is the slant-range of the k-th target, and yk is the cross-range of the k-th target. When p(t) is a continuous wave (CW) signal with a pulse duration of Tp and carrier frequency of fc, Equation (19) can be rewritten as Equation (21).
The above is a formulation of the signal received at one location, $u$. This can be expanded to the expression for a single sensor.
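The received-signal model for one sensor location can be sketched numerically as follows. The sketch assumes the usual monostatic two-way travel time and a CW pulse with a rectangular envelope that starts at t = 0; the exact envelope convention of the paper is not recoverable from the text, and the names `travel_time`, `cw_pulse`, and `received` are hypothetical.

```python
import cmath
import math


def travel_time(xk, yk, u, c=1500.0):
    """Two-way travel time between a sensor at (0, u) and a target at (xk, yk)."""
    return 2.0 * math.hypot(xk, yk - u) / c


def cw_pulse(t, fc, Tp):
    """CW pulse with a rectangular envelope on 0 <= t <= Tp (assumed convention)."""
    if 0.0 <= t <= Tp:
        return cmath.exp(2j * math.pi * fc * t)
    return 0j


def received(t, u, targets, fc, Tp, c=1500.0):
    """Sum of delayed echoes; targets is a list of (strength, xk, yk) tuples."""
    return sum(s * cw_pulse(t - travel_time(x, y, u, c), fc, Tp)
               for s, x, y in targets)


# One unit-strength target at slant-range 250 m, sensor at cross-range 0 m.
targets = [(1.0, 250.0, 0.0)]
tau = travel_time(250.0, 0.0, 0.0)
before = received(tau - 1e-5, 0.0, targets, fc=400e3, Tp=1e-4)  # echo not yet arrived
during = received(tau + 1e-5, 0.0, targets, fc=400e3, Tp=1e-4)  # inside the echo
```

Before the echo arrives the sample is zero, and inside the pulse it has unit magnitude with a phase set by the carrier frequency and the delay.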
The operation of a single sensor sonar can be expressed as shown in Figure 3. The total number of pings is Np, the single sensor (transmitter and receiver) corresponding to the m-th ping is um, and the position of um is $(0, u_m)$, m = 1, ..., Np. The center point of the area of interest is $[X_c, Y_c]$, the half-size of the area of interest in the range is X0, and the half-size of the area of interest in the cross-range is Y0. By dividing the area of interest into $N_X \times N_Y$ grids and assuming that there is a virtual target σk at each grid point, the signal $s(t, u_m)$ received at the m-th ping can be written as follows:

$$s(t, u_m) = \sum_{k=1}^{N_X N_Y} \sigma_k \, p\big(t - \tau_k(u_m)\big), \tag{22}$$

where $\tau_k(u_m)$ is the travel time between the sensor position of the m-th ping and the k-th grid point.
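The signal vector $\mathbf{s}(t_i)$, which consists of signals received at a specific time ti at each position of the sonar, can be written as

$$\mathbf{s}(t_i) = \big[s(t_i, u_1), \ldots, s(t_i, u_{N_p})\big]^{T}. \tag{23}$$

Combining Equations (22) and (23), $\mathbf{s}(t_i)$ can be expressed as the product of the target strength vector $\boldsymbol{\sigma}$ of the virtual targets and the sensing matrix $\mathbf{A}(t_i)$:

$$\mathbf{s}(t_i) = \mathbf{A}(t_i)\,\boldsymbol{\sigma}, \qquad [\mathbf{A}(t_i)]_{m,k} = p\big(t_i - \tau_k(u_m)\big),$$

where $[\mathbf{A}(t_i)]_{m,k}$ denotes the element corresponding to the m-th row and k-th column of $\mathbf{A}(t_i)$, and $\tau_k(u_m)$ is the travel time between the m-th sensor position and the k-th grid point. In the CS system, where the length of $\mathbf{y}$ is M, the length of $\mathbf{x}$ is N, and $\mathbf{x}$ is k-sparse, $\mathbf{x}$ is successfully recovered when $M = O(k \log(N/k))$ measurements are used [36,38]. In the proposed algorithm, if the length Np of the received signal $\mathbf{s}(t_i)$ is too small compared to the length $N_X N_Y$ of $\boldsymbol{\sigma}$, an accurate solution cannot be obtained. Therefore, in the case where the length of the signal $\mathbf{s}(t_i)$ is too small, signals for a total of Nt times are collected to form a long signal vector $\tilde{\mathbf{s}}$ and, similarly, the corresponding matrices are collected to form a long sensing matrix $\tilde{\mathbf{A}}$:

$$\tilde{\mathbf{s}} = \begin{bmatrix} \mathbf{s}(t_i) \\ \vdots \\ \mathbf{s}(t_{i+N_t-1}) \end{bmatrix}, \qquad \tilde{\mathbf{A}} = \begin{bmatrix} \mathbf{A}(t_i) \\ \vdots \\ \mathbf{A}(t_{i+N_t-1}) \end{bmatrix},$$

where $\tilde{\mathbf{s}} \in \mathbb{C}^{N_p N_t}$, $\tilde{\mathbf{A}} \in \mathbb{C}^{N_p N_t \times N_X N_Y}$, and $\boldsymbol{\sigma} \in \mathbb{C}^{N_X N_Y}$. $\boldsymbol{\sigma}$ is estimated by solving the following BP or BPDN problems, where $\mu$ is the regularization hyperparameter:

$$\min_{\boldsymbol{\sigma}} \|\boldsymbol{\sigma}\|_1 \quad \text{subject to} \quad \tilde{\mathbf{s}} = \tilde{\mathbf{A}}\boldsymbol{\sigma}, \tag{30}$$

$$\min_{\boldsymbol{\sigma}} \|\tilde{\mathbf{s}} - \tilde{\mathbf{A}}\boldsymbol{\sigma}\|_2^2 + \mu \|\boldsymbol{\sigma}\|_1. \tag{31}$$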
The estimate obtained from Equation (30) or (31) is a target strength vector that uses only the signals received in Nt consecutive time snapshots starting from ti. Therefore, to obtain the target strength vector for the entire area of interest, the l1-norm minimization process must be repeated for all time snapshots corresponding to the area of interest. The final image is compiled by adding all the estimates obtained in each process.
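The construction of a stacked sensing matrix from Nt consecutive time snapshots can be sketched as below. The sketch assumes that each matrix entry is the transmitted CW pulse delayed by the two-way travel time between a sensor position and a grid point, with a rectangular envelope starting at t = 0; the function names, the three-point grid, and the three sensor positions are hypothetical and chosen only to keep the example small.

```python
import cmath
import math


def sensing_matrix(t, sensor_us, grid, fc, Tp, c=1500.0):
    """One snapshot: rows are sensor positions (pings), columns are grid points."""
    A = []
    for u in sensor_us:
        row = []
        for x, y in grid:
            dt = t - 2.0 * math.hypot(x, y - u) / c  # time since echo onset
            inside = 0.0 <= dt <= Tp                 # rectangular envelope
            row.append(cmath.exp(2j * math.pi * fc * dt) if inside else 0j)
        A.append(row)
    return A


def stacked_matrix(t0, n_t, fs, sensor_us, grid, fc, Tp, c=1500.0):
    """Stack n_t consecutive snapshots (sample spacing 1/fs) row-wise."""
    A = []
    for i in range(n_t):
        A.extend(sensing_matrix(t0 + i / fs, sensor_us, grid, fc, Tp, c))
    return A


sensor_us = [-0.02, 0.0, 0.02]                        # three ping positions (m)
grid = [(250.0, -0.02), (250.0, 0.0), (250.0, 0.02)]  # three grid points (m)
t0 = 2.0 * 250.0 / 1500.0 + 2e-5                      # just after the first echoes
A_long = stacked_matrix(t0, n_t=2, fs=50e3, sensor_us=sensor_us,
                        grid=grid, fc=400e3, Tp=1e-4)
```

With two snapshots and three sensor positions, the stacked matrix has six rows and three columns. Repeating this for successive starting times and summing the resulting estimates corresponds to the image compilation step described above.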
When obtaining solutions for BP or BPDN, the elements of the target strength vector $\boldsymbol{\sigma}$ whose corresponding sensing matrix column vector has a large l1-norm tend to have a non-zero value. To eliminate this bias, each column vector $\mathbf{a}_k$ of the long sensing matrix $\tilde{\mathbf{A}}$ and the long received signal $\tilde{\mathbf{s}}$ are normalized by their l2-norms [14].
Therefore, Equations (30) and (31) are modified, and the final solution is obtained by compensating the l2-norms to the $\hat{\boldsymbol{\sigma}}$ estimated from Equation (34) or Equation (35) as follows:

$$\sigma_k = \frac{\|\tilde{\mathbf{s}}\|_2}{\|\mathbf{a}_k\|_2}\,\hat{\sigma}_k.$$
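The normalization and the subsequent compensation can be sketched as follows. The sketch assumes that the solver returns an estimate for the normalized problem and that each element is then rescaled by the signal norm divided by the corresponding column norm; the function names are hypothetical, and the handling of all-zero columns is an added safeguard that the text does not discuss.

```python
import math


def normalize_columns(A):
    """Return (A with unit l2-norm columns, list of the original column norms)."""
    n_rows, n_cols = len(A), len(A[0])
    norms = [math.sqrt(sum(abs(A[i][k]) ** 2 for i in range(n_rows)))
             for k in range(n_cols)]
    # All-zero columns are left as zeros instead of dividing by zero.
    A_n = [[A[i][k] / norms[k] if norms[k] > 0 else 0.0 for k in range(n_cols)]
           for i in range(n_rows)]
    return A_n, norms


def normalize_vector(s):
    """Return (s scaled to unit l2-norm, original norm)."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in s))
    return [v / norm for v in s], norm


def compensate(sigma_n, col_norms, s_norm):
    """Undo both normalizations: sigma_k = sigma_n_k * ||s|| / ||a_k||."""
    return [z * s_norm / a if a > 0 else 0.0
            for z, a in zip(sigma_n, col_norms)]


# Small consistency check with a known target strength vector.
A = [[3.0, 0.0], [4.0, 2.0]]
sigma_true = [1.0, 2.0]
s = [3.0, 8.0]                       # s = A @ sigma_true
A_n, col_norms = normalize_columns(A)
s_n, s_norm = normalize_vector(s)
# The solution of the normalized system A_n @ sigma_n = s_n:
sigma_n = [col_norms[k] * sigma_true[k] / s_norm for k in range(2)]
sigma = compensate(sigma_n, col_norms, s_norm)
```

The check confirms that solving the normalized system and then compensating recovers the original target strengths, so the normalization changes only the bias of the sparse solver and not the quantity being estimated.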
3.3. CS-SAS Algorithm for Uniform Line Array
The algorithm proposed in Section 3.2 is intended for a single sensor. However, in many cases, a uniform line array sonar consisting of one transmitter and multiple receivers is used, so the algorithm needs to be extended to a uniform line array. The algorithm for the uniform line array introduced in this section is the same as the algorithm for a single sensor, except for the travel time function. The transmitter in the m-th ping is utm, m = 1, …, Np, and its position vector is $\mathbf{r}_{u_{tm}}$. The number of receivers in the physical array is Nu, and the n-th receiver in the physical array is un, n = 1, …, Nu. Therefore, Equations (22) and (23) can be reformulated with a travel time that is the sum of the path from the transmitter to the k-th grid point and the path from that grid point back to the n-th receiver.
The signal vector composed of measurements received at a specific time ti of each receiver in the j-th ping is denoted as $\mathbf{s}_j(t_i)$. Therefore, the signal vectors $\mathbf{s}_j(t_i)$, j = 1, …, Np, can be arranged as

$$\mathbf{s}(t_i) = \big[\mathbf{s}_1(t_i)^{T}, \ldots, \mathbf{s}_{N_p}(t_i)^{T}\big]^{T}.$$

Similar to the previous section, the arranged signal vector is again expressed as the product of a sensing matrix and the target strength vector, so the CS framework can be applied.
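Because the line-array formulation differs from the single-sensor one only in the travel time function, that difference can be illustrated with a short sketch. It assumes a bistatic path (transmitter to grid point, then grid point to receiver) and coordinates given as (slant-range, cross-range) pairs with the array on the cross-range axis; the function names and the example geometry are hypothetical.

```python
import math


def bistatic_travel_time(tx, rx, target, c=1500.0):
    """Travel time along transmitter -> target -> receiver."""
    return (math.dist(tx, target) + math.dist(target, rx)) / c


def receiver_positions(first_rx_y, spacing, n_rx, ping_index, ping_step):
    """Positions of the n_rx receivers during a given ping (0-based index)."""
    offset = ping_index * ping_step
    return [(0.0, first_rx_y + offset + n * spacing) for n in range(n_rx)]


# Example geometry: 20 receivers spaced 0.04 m apart, array moved 0.4 m per ping.
rx_ping1 = receiver_positions(first_rx_y=0.0, spacing=0.04, n_rx=20,
                              ping_index=1, ping_step=0.4)
# With collocated transmitter and receiver the two-way monostatic time results.
t_mono = bistatic_travel_time((0.0, 0.0), (0.0, 0.0), (3.0, 4.0))
t_bi = bistatic_travel_time((0.0, 0.0), (0.0, 0.04), (3.0, 4.0))
```

Replacing the monostatic delay with `bistatic_travel_time` in each matrix entry, and ordering the rows by ping and then by receiver, gives a sensing matrix of the same form as in the single-sensor case.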
4. Results
In this section, the performance of the proposed algorithms is demonstrated by comparing the results of applying the ω-k algorithm and the CS-SAS algorithms to both the simulation and experimental data. In the simulation and in the experiment, the carrier frequencies of the CW signal were 400 and 455 kHz, respectively, whereas the sampling frequencies were 25 or 50 kHz; therefore, the ω-k algorithm included a basebanding step.
The following shows that the CS-SAS algorithms exhibit superior performance in terms of resolution and noise robustness, and that they remain robust when sensors are not working or data are lost.
4.1. Simulation Results
4.1.1. Simulation Results for Single Sensor
The ω-k and CS-SAS algorithms were compared for various cases. For the single-sensor SAS, five cases were simulated. In the basic simulation environment, a single-sensor sonar was operated at 0.02 m intervals from −5 to 5 m along the cross-range axis, that is, Np = 501. The signal p(t) was a CW signal with carrier frequency fc = 400 kHz and pulse duration Tp = 0.1 ms. The center point of the area of interest was [Xc, Yc] = [250, 0] m, the half-size of the area of interest in slant-range was X0 = 0.5 m, the half-size of the area of interest in cross-range was Y0 = 1 m, the sampling frequency was fs = 50 kHz, and the area of interest consisted of NX = 101 and NY = 101 uniform grid points. The sound speed was c = 1500 m/s. Twelve point targets were placed, as described in Figure 4.
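The quantities implied by this basic simulation environment can be checked with a few lines of arithmetic. The ping count, grid spacing, and number of unknowns follow directly from the stated parameters; the two-way conversions c·Tp/2 and c/(2·fs) are standard sonar relations that the text does not state explicitly, so they are included here as assumptions.

```python
c = 1500.0          # sound speed (m/s)
fc = 400e3          # carrier frequency (Hz)
Tp = 0.1e-3         # pulse duration (s)
fs = 50e3           # sampling frequency (Hz)
X0, Y0 = 0.5, 1.0   # half-sizes of the area of interest (m)
NX, NY = 101, 101   # grid points in slant-range and cross-range

# Number of pings for a track from -5 m to 5 m at 0.02 m intervals.
n_pings = round((5.0 - (-5.0)) / 0.02) + 1

wavelength = c / fc              # acoustic wavelength (m)
pulse_extent = c * Tp / 2.0      # slant-range extent of the pulse, two-way (m)
range_per_sample = c / (2.0 * fs)  # slant-range step per time sample, two-way (m)

dx = 2.0 * X0 / (NX - 1)         # grid spacing in slant-range (m)
dy = 2.0 * Y0 / (NY - 1)         # grid spacing in cross-range (m)
n_unknowns = NX * NY             # length of the target strength vector
```

With 10201 unknowns against 501 measurements per time snapshot, a single snapshot is heavily underdetermined, which is why several consecutive snapshots are collected into one long system in Section 3.2.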
All simulation environments for the single-sensor sonar were fundamentally the same as the basic simulation environment described above; the cases differ in the noise level, Xc, fs, and sonar interval, as shown in Table 1.
As shown in Figure 5, the images obtained through the CS-SAS algorithm accurately distinguished the 12 targets and had a good azimuth resolution. In addition, there was no performance degradation caused by side lobes. In contrast, the ω-k algorithm is unable to distinguish between targets adjacent to each other in the center of the area of interest. In addition, much aliasing occurs, especially in the azimuth direction. On account of the influence of matched filtering and the Fourier transform, the exact position of the point targets cannot be obtained, resulting in a blurred result.
The results of the simulation with the halved sampling frequency are shown in Figure 6. In the case of the ω-k algorithm, because the sampling frequency is reduced by half compared to Case 1, the resolution in the slant-range direction is reduced. The aliasing at the center of the area of interest has a larger value than the values at the four target locations, [250 ± 0.02, 0] and [250 ± 0.04, 0]. Nonetheless, the CS-SAS algorithm yielded an accurate target image.
The results with the spatial sampling reduced to 1/10 are shown in Figure 7. The CS-SAS algorithm accurately recovers the images of the 12 targets. However, the ω-k algorithm requires a Fourier transform in the space domain, and reducing the spatial sampling to 1/10 violates the Nyquist criterion. Therefore, aliasing occurred, and the image of the targets could not be properly obtained. The eight target points in the center were not distinguishable, and the remaining four points were difficult to identify.
The results with the spatial sampling reduced to 1/20 are depicted in Figure 8. Even though some side lobes occur at this sampling level, the CS-SAS algorithm still recovers the image of the 12 targets, whereas the ω-k algorithm fails to produce a proper image of the targets.
Noisy conditions with signal-to-noise ratios (SNRs) of 20, 10, and 5 dB were simulated. The results in Figure 9 indicate that the CS-SAS algorithm, applied to environments with SNRs of 20 dB and 10 dB, obtained an almost accurate image of the targets. However, when the SNR was 5 dB, the values at the grid points between [250, ±0.04] and [250, ±0.08] were greater than the values at [250, ±0.04] and [250, ±0.08], and an accurate image could not be obtained. As the noise level increased, a degree of degradation occurred. Nevertheless, the CS-SAS algorithm was still superior to the ω-k algorithm in terms of resolution and side lobe suppression.
4.1.2. Simulation Results for Uniform Line Array
The CS-SAS and ω-k algorithms for a uniform line array were simulated in two cases. In the first case, the sampling frequency fs and sound speed c are the same as those of simulation Case 1 for a single sensor; only the sonar configuration and Xc differ. The array has 20 receivers, with a 0.04 m spacing between receivers, as shown in Figure 10a. By moving 0.4 m between pings, a total of 25 pings were transmitted. The source of the uniform line array is 0.1 m away from the first sensor in the cross-range direction. The second case is the same as the first, except that the sensor spacing of the array is different: the array has two receivers, with a 0.4 m spacing between receivers, as shown in Figure 10b. As before, the array moves 0.4 m between pings for a total of 25 pings, and the source is 0.1 m away from the first sensor in the cross-range direction.
The results for Case 1 are shown in Figure 11. The CS-SAS algorithm accurately recovered images of the 12 targets. In contrast, the result for the ω-k algorithm showed aliasing. The total number of sensor positions used was 500 = 20 × 25, which is comparable to the 501 positions in the single-sensor simulation; however, the interval between the sensors doubled to 0.04 m, so the spatial sampling interval also doubled, resulting in aliasing near [250, ±0.07].
The results for Case 2 are shown in Figure 12. The synthetic aperture is the same as in Case 1, but the sensor spacing is increased 10 times. Severe aliasing occurred in the ω-k results, and the targets could not be properly identified. The CS-SAS algorithm, however, distinguished the targets significantly better.
The performances of the CS-SAS and ω-k algorithms were compared using simulation results for a single sensor and a uniform line array. In the case of the ω-k algorithm, even under the simplest simulation conditions, adjacent targets could not be distinguished and aliasing occurred, whereas in the case of the proposed algorithm, because CS was applied and side lobes were suppressed, high-resolution results were obtained. CS-SAS also clearly distinguished targets under harsher conditions, with a larger spatial sampling interval or a reduced fs, obtained accurate locations, and showed robustness in noisy situations. This is possible because the measured signal can be expressed in a sparse representation in a certain domain, and CS can significantly lower the sampling rate and is robust to noise [36,39].
4.2. Experimental Data Results
This was a water tank experiment conducted by SonaTech Inc. (Santa Barbara, CA, USA). As shown in Figure 13a, the experiment was performed to obtain images of two rings in a water tank. As shown in Figure 13b, the sonar has one transmitter and 32 receivers. After transmitting and receiving the signal once, the transmitter moves 616.5 mm and then sends and receives the next signal. This process was repeated seven times to receive signals from 224 locations. The ping signal p(t) is a CW signal with carrier frequency fc = 455 kHz and pulse duration Tp = 0.3 ms, the sampling frequency is fs = 50 kHz, and the sound speed is c = 1480 m/s. There are two ring-shaped targets, each with a major axis of approximately 1.5 m, in the slant-range of 7 to 10 m and a cross-range of −2.4385 to 2.4385 m.
The raw data recorded in the slant-range are shown in Figure 14a. The CS-SAS result was derived by dividing the area of interest into a uniform grid of NX = 101 and NY = 101. The results of the ω-k and CS-SAS algorithms are shown in Figure 14b,c, respectively.
From the raw data, the targets can be seen in the form of rings, but the shape appears thick, and it is difficult to accurately determine the location of the targets. In the results of the ω-k algorithm, the shapes are slightly thinner, but aliasing is severe in the azimuth direction, and there appear to be several rings. The CS-SAS result reconstructs two ring-shaped targets. Because the CS-SAS algorithm seeks an image with as sparse a target distribution as possible, the side lobes are suppressed and thin ring-shaped targets are obtained.
To examine whether the CS-SAS algorithm is robust when some of the sensors are broken, results were derived under the assumption that data from some sensors were lost. Two cases were considered: one where sensors failed in a uniform pattern (Sensor Loss: Uniform, SLU) and another where sensors failed at random (Sensor Loss: Random, SLR). The percentage of sensors that operated normally is also indicated in the results. The results are displayed in Figure 15.
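The two sensor-loss patterns can be emulated by keeping only a subset of the rows of the sensing matrix and of the measurement vector. How the surviving sensors were actually chosen in the experiment is not specified, so the selection rules below (every k-th sensor for SLU, a seeded random subset for SLR) and all function names are assumptions for illustration.

```python
import random


def uniform_keep(n_sensors, keep_every):
    """Indices of surviving sensors when every keep_every-th sensor is kept."""
    return [i for i in range(n_sensors) if i % keep_every == 0]


def random_keep(n_sensors, n_keep, seed=0):
    """Indices of n_keep surviving sensors drawn at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_sensors), n_keep))


def select_rows(A, s, keep):
    """Keep only the rows of A and entries of s that belong to surviving sensors."""
    return [A[i] for i in keep], [s[i] for i in keep]


# 224 receive locations, 50% of the sensors operating.
keep_slu = uniform_keep(224, keep_every=2)
keep_slr = random_keep(224, n_keep=112, seed=0)

# Row selection on a tiny placeholder system.
A_small = [[1.0], [2.0], [3.0], [4.0]]
s_small = [10.0, 20.0, 30.0, 40.0]
A_kept, s_kept = select_rows(A_small, s_small, [0, 2])
```

Because the CS-SAS formulation only needs the rows that correspond to available measurements, the same solver can be run on the reduced system without any change to the grid or to the unknown vector.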
In addition, the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), which are representative image quality indicators [40,41], were calculated for quantitative comparison. PSNR evaluates the information lost in generated or compressed images and is expressed as the ratio of the peak signal to the mean square error (MSE). It has a unit of dB, and a higher value indicates less loss, i.e., higher image quality. SSIM is an index designed to evaluate differences in perceived visual quality rather than numerical errors, and it evaluates three aspects: luminance, contrast, and structure. However, since the ground-truth image is not known, PSNR and SSIM were calculated for all sensor loss situations using Figure 14c as a reference image. The calculation results are shown in Table 2.
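A minimal PSNR computation against a reference image can be sketched as follows. It assumes real-valued (for example, magnitude) images stored as nested lists and takes the peak value from the reference image unless one is given; the paper does not state which peak value or image scaling was used, so both are assumptions, and SSIM is omitted because it requires windowed statistics.

```python
import math


def psnr(reference, image, peak=None):
    """PSNR in dB of `image` relative to `reference` (same shape, real-valued)."""
    ref = [v for row in reference for v in row]
    img = [v for row in image for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, img)) / len(ref)
    if mse == 0.0:
        return math.inf  # identical images
    if peak is None:
        peak = max(ref)  # assumed peak: maximum of the reference image
    return 10.0 * math.log10(peak ** 2 / mse)


reference = [[1.0, 0.0], [0.0, 1.0]]
degraded = [[1.0, 0.0], [0.0, 0.5]]
value_db = psnr(reference, degraded)      # MSE = 0.0625, peak = 1
same_db = psnr(reference, reference)      # infinite for identical images
```

Since the reference here is itself a reconstruction rather than a ground-truth image, such a value measures similarity to the full-sensor result and not absolute image quality.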
In some conventional methods such as ω-k, a Fourier transform in space is performed. Therefore, it is difficult to obtain results for an arbitrary array geometry because uniform sampling in space is not possible when some sensors in the array fail. The CS-SAS algorithm, however, does not perform a Fourier transform in space and its formulation places no constraint on the array geometry; therefore, it is easy to obtain a result in a sensor loss situation. In addition, no significant performance degradation is visible with 75% of the sensors operating for both SLU and SLR, and SLR at 50% shows particularly good results; note that CS performs best for random arrays or random downsampling [42,43].
Table 2 also shows that the random array results are generally better. In Table 2, the image quality of SLR is high for both 50% and 25%. At 75%, the indicators of SLR are worse than those of SLU. Because there is little deterioration in image quality at 75%, the PSNR and SSIM here mainly show how similar the reconstruction results are to Figure 14c, rather than measuring image quality deterioration. At approximately 25%, both SLU and SLR appear blurred to some extent, but in terms of side lobe suppression, they are still not inferior to the ω-k algorithm.
Using the CS-SAS algorithm in this study made it possible to obtain a higher-resolution image than with the conventional synthetic aperture sonar algorithm (the ω-k algorithm) and to reduce the aliasing that occurs in the conventional method. In addition, even with less spatial sampling, better results were obtained than with the conventional algorithm, and the method was confirmed to be robust even when some sensors failed. Good results can therefore be expected even if the number of sensors is reduced during actual sonar operation, which makes cost reduction possible. Moreover, the method is dependable because it remains robust in failure situations. Results for actual experimental data were also presented, and satisfactory results are expected when the CS-SAS algorithm is applied to a natural underwater environment.
5. Conclusions
In this paper, we proposed an algorithm that applies compressive sensing (CS) to synthetic aperture sonar (SAS) under the assumption that the target distribution in water is sparse. Through simulation, it was confirmed that the proposed algorithm produces images with better resolution than the conventional SAS algorithm, the ω-k algorithm. In addition, because images obtained by the proposed method present very few and small side lobes, imaging performance does not deteriorate because of them. Furthermore, even when sampling below the Nyquist rate in the time and space domains, a higher-quality target image was obtained than with the ω-k algorithm.
The applicability of the proposed method to real environments was demonstrated using actual experimental data. The results confirm that aliasing is reduced and side lobes are suppressed when applying the compressive sensing method. In contrast, the ω-k algorithm does not obtain accurate target images due to aliasing. Importantly, it was confirmed that the proposed method is robust when some sensors of the sonar system fail or when some data are lost.