**2. Basic SAS Imaging**

#### *2.1. SAS Geometry*

The general geometry of SAS is depicted in Figure 1. The direction of *y* along which the sonar moves is defined as azimuth or the cross-range axis, the direction *x* perpendicular to *y* is the slant-range, the size of the sonar is *D*, and the synthetic aperture and the distance the sonar travels is 2*L*. The basic concept of SAS is that the sonar moves from −*L* to *L* along the cross-range axis, and then transmits and receives signals to synthesize the received signals scattered back from the targets to obtain underwater images.

One of the standards for evaluating the level of performance of an SAS system is the resolution of the reconstructed images. The slant-range resolution Δ*<sup>x</sup>* of the SAS was determined by the matched filtering process as follows:

$$
\Delta\_x = \frac{c\pi}{\omega\_{bd}} = \frac{c}{2f\_{bd}},
\tag{1}
$$

where *ωbd* is the bandwidth of the transmitted signal, and *c* is the sound speed.

The cross-range resolution Δ*<sup>y</sup>* can be derived simply through the following development. Consider a single frequency signal as the simplest signal model. Then, the −3 dB main lobe width of an *l*-length uniform line array is *λ*/*l*, the main lobe width *θSAS* of the synthetic aperture is *λ*/2*L*. For the slant-range of the target *R*, *L Rθ* and *θ* = *λ*/*D*. Therefore, the cross-range resolution Δ*<sup>y</sup>* can be expressed as

$$
\Delta\_y = R D / (2R) = D / 2. \tag{2}
$$

For convenience, a single frequency signal was assumed, but it is known that crossrange resolution Δ*<sup>y</sup>* = *D*/2 even if a signal with bandwidth like LFM signal is used [28,31]. From Equation (2), it is clear that the cross-range resolution of the synthetic aperture sonar is independent of range and frequency. This independence makes it possible to reconstruct high-resolution images over a long range [32,33].

#### *2.2. Wavenumber Domain Algorithm (ω-k Algorithm)*

The wavenumber domain algorithm, a representative conventional SAS algorithm, was used as a baseline method. The wavenumber domain algorithm is a method that

obtains an image using 2-D Fourier transform of recorded signals and is also called the ω-k algorithm because signal processing is performed in the frequency domain [1,8]. The signal of duration *Tp* transmitted from the sonar, is denoted by *p*(*t*), and the signal received by the sonar at position *u* is denoted by *s*(*t*,*u*). The signal *s*(*t*,*u*) can be expressed as the sum of the signals scattered by the *N*-targets as follows:

$$s(t,u) = \sum\_{n=1}^{N} \sigma\_n p\left(t - \frac{2\sqrt{\mathbf{x}\_n^2 + (y\_n - u)^2}}{c}\right) \left(0 < t - \frac{2\sqrt{\mathbf{x}\_n^2 + (y\_n - u)^2}}{c} < T\_p\right) \tag{3}$$

where *σ<sup>n</sup>* is the target strength of the *n*-th target, and *xn* and *yn* are the slant-range and cross-range of the *n*-th target, respectively. The 2-D Fourier transform of Equation (3) using the stationary-phase principle [34] gives

$$S(\omega, k\_{\rm u}) = \sum\_{n=1}^{N} P(\omega) \sigma\_n \exp\left(-j\sqrt{4k^2 - k\_{\rm u}^2} \mathbf{x}\_n - jk\_{\rm u} y\_n\right). \tag{4}$$

where *ω* is the angular frequency, *k* is the wavenumber, and *ku* is the azimuth wavenumber. By changing the coordinates as shown in Equation (5), it can be arranged as in Equation (6).

$$k\_x = \sqrt{4k^2 - k\_{\mu\prime}^2} \; \; k\_y = k\_u. \tag{5}$$

$$S(\omega, k\_n) = \sum\_{n=1}^{N} P(\omega) \sigma\_n \exp\left(-jk\_x x\_n - jk\_y y\_n\right). \tag{6}$$

The function of the distribution of the targets is expressed in Equation (7), and its Fourier transform is expressed as Equation (8).

$$f\_0 = \sum\_{n=1}^{N} \sigma\_n \delta(\mathbf{x} - \mathbf{x}\_n, \mathbf{y} - \mathbf{y}\_n). \tag{7}$$

$$F\_0(k\_\mathcal{X}, k\_\mathcal{Y}) = \sum\_{n=1}^N \sigma\_n \exp\left(-jk\_\mathcal{X}\mathbf{x}\_\mathcal{H} - jk\_\mathcal{Y}y\_\mathcal{H}\right). \tag{8}$$

Equation (9) can be obtained by combining Equations (6) and (8).

$$S(\omega, k\_{\rm u}) = P(\omega) F\_0. \tag{9}$$

Therefore, the distribution of the targets can be estimated through the following relationship:

$$F(k\_{\mathcal{X}}, k\_{\mathcal{Y}}) = P^\*(\omega) S(\omega, k\_{\mathcal{U}}) = \left| P(\omega) \right|^2 \sum\_{n=1}^N \sigma\_n \exp \left( -jk\_{\mathcal{X}} \mathbf{x}\_n - jk\_{\mathcal{Y}} y\_n \right). \tag{10}$$

The mapping—Equation (5) from (*ω*, *ku*) to *kx*, *ky* , called Stolt mapping [35] involves interpolation from the data. The interpolation process can be made more rational through the following spatial shift formulation:

$$F\_b(k\_x, k\_y) = F(k\_x, k\_y) \exp\left(jk\_x X\_c + jk\_y Y\_c\right) \tag{11}$$

where subscript *b* indicates baseband conversion, and *Xc*,*Yc* are the centers of the area of interest in the slant-range and cross-range, respectively. In Equation (11), the exponential term performs a spatial shift function, which performs a function similar to the carrier removal process in spectrum demodulation. It enables interpolation in a slowly varying region while moving the entire swath down to the origin of the spatial coordinates.

Because the received signal is scattered by a target with a target strength of 1 in the center of the area of interest, its Fourier transform can be expressed as Equations (12) and (13), respectively, and Equation (11) can be summarized as follows:

$$s\_0(t, \mu) = p \left( t - \frac{2\sqrt{X\_c^2 + \left(\chi\_c - \mu\right)^2}}{c} \right), \tag{12}$$

$$S\_0(\omega, k\_\mu) = P(\omega) \exp\left(-jk\_\mathrm{x} X\_\mathrm{c} - jk\_\mathrm{y} Y\_\mathrm{c}\right),\tag{13}$$

$$F\_b(k\_x, k\_y) = S(\omega, k\_u) S\_0^\*(\omega, k\_u). \tag{14}$$

The flow chart of the ω-k algorithm is shown in Figure 2 [1].

**Figure 2.** Flow chart of the ω-k algorithm.

#### **3. SAS Algorithm with CS Framework (CS-SAS)**

#### *3.1. Compressive Sensing*

Compressive sensing is a method or framework for solving linear problems, such as **<sup>y</sup>** <sup>=</sup> **Ax** for sparse signal **<sup>x</sup>** [36]. **<sup>x</sup>** <sup>∈</sup> <sup>C</sup>*<sup>N</sup>* is an unknown signal vector that we want to reconstruct. The unknown signal vector **x** is a *k-sparse* vector, where **x** is *k-sparse*, meaning that **x**<sup>0</sup> <sup>=</sup> *<sup>k</sup>*, that is, **<sup>x</sup>** has only *<sup>k</sup>* non-zero elements. **<sup>y</sup>** <sup>∈</sup> <sup>C</sup>*<sup>M</sup>* is a measurement vector consisting of measured values. In many realistic problems, **<sup>A</sup>** <sup>∈</sup> <sup>C</sup>*M*×*N*—called a sensing matrix—is introduced to represent the problem as a linear relationship, such as **y** = **Ax**. When the dimension of the measurement vector **y** is smaller than the dimension of **x**, that is, M N, the **y** = **Ax** problem becomes an underdetermined problem and has numerous solutions, making it impossible to specify **x**. Using the sparse property of **x**, it is possible to specify a unique and exact solution among countless feasible solutions of the underdetermined problem. The sparsity is imposed by the sparsity constraint *l*0-norm. The *l*0-norm minimization problem is formulated as follows:

$$\min\_{\mathbf{x}\in\mathbb{C}^{\mathcal{N}}} \|\mathbf{x}\|\_{0} \text{ subject to } \mathbf{y} = \mathbf{A}\mathbf{x}.\tag{15}$$

However, the *l*0-norm minimization problem, Equation (15), is an NP-hard problem that is computationally intractable. To deal with the NP-hard problem, various methods have been developed such as *l*1-norm relaxation or greedy algorithms represented by orthogonal match-pursuit.

One of the most representative methods for solving the compressive sensing problem is *l*1-norm relaxation, which solves the problem by replacing *l*0-norm with *l*1-norm. The *l*0-norm minimization problem, Equation (15), can be relaxed by reformulating as

$$\min\_{\mathbf{x}\in\mathbb{C}^{\mathcal{N}}} \|\mathbf{x}\|\_{1} \text{ subject to } \mathbf{y} = \mathbf{A}\mathbf{x}.\tag{16}$$

In the presence of noise, a sparse solution **x** can be obtained by the following equation:

$$\min\_{\mathbf{x}\in\mathbb{C}^{N}} \|\mathbf{x}\|\_{1} \text{ subject to } \|\mathbf{y} - \mathbf{A}\mathbf{x}\|\_{2} < \epsilon. \tag{17}$$

Equations (16) and (17) are called basis pursuit (BP) and basis pursuit denoising (BPDN) problems, respectively. The larger the hyperparameter , the sparser the optimized solution **x**. Oppositely, the smaller the , the more optimized solution **x** fits the data. Therefore, it is important to assume a suitable hyperparameter. However, finding a suitable hyperparameter is complex and deemed to be outside the scope of this study.

In this study, the SAS image was obtained by solving the BPDN problem using the tool provided by CVX [37].

#### *3.2. CS-SAS Algorithm for Single Sensor*

To handle SAS imaging problems from the perspective of compressive sensing, the problem must first be well defined as the **y** = **Ax** problem. To formulate a compressive algorithm in which a single sensor is in linear motion, the signal reflected by the targets and returned to the single sensor needs to be considered first. When a signal *p*(*t*) is transmitted from a single sensor located in **r***<sup>u</sup>* = [*x*0, *y*0] and reflected by *N*-targets, the received signal s(*t*,**r***u*) can be written as

$$\mathbf{s}(t, \mathbf{r}\_{\rm u}) = \sum\_{k=1}^{N} \sigma\_{k} p(t - \tau(\mathbf{r}\_{k'} \mathbf{r}\_{\rm u})),\tag{18}$$

$$\pi(\mathbf{r}\_{k\prime}, \mathbf{r}\_{u}) = 2\frac{|\mathbf{r}\_{k} - \mathbf{r}\_{u}|}{c} = 2\frac{\sqrt{(\mathbf{x}\_{k} - \mathbf{x}\_{0})^{2} + (y\_{k} - y\_{0})^{2}}}{c},\tag{19}$$

where *σk*, *k* = 1, ... , *N* is the target strength of the *k*-th target, τ(**r***k*, **r***u*) is a function representing travel time, *xk* is the slant-range of the *k*-th target, and *yk* is the cross-range of the *k*-th target. When *p*(*t*) is a continuous wave (CW) signal with a pulse duration of *Tp* and carrier frequency of *fc*, Equation (19) can be rewritten as Equation (21).

$$\mathbf{p}(t) = e^{j\omega t} = e^{j2\pi f\_c t} \begin{pmatrix} 0 \le t \le T\_p \end{pmatrix} \tag{20}$$

$$\mathbf{s}(t, \mathbf{r}\_{\mathsf{u}}) = \sum\_{k=1}^{N} \sigma\_{k} \exp\left(j\omega(t - \tau(\mathbf{r}\_{k}, \mathbf{r}\_{\mathsf{u}}))\right) \left(0 \le t - \tau(\mathbf{r}\_{k}, \mathbf{r}\_{\mathsf{u}}) \le T\_{\mathsf{P}}\right). \tag{21}$$

The above is a formulation of the signal received at one location, **r***u*. This can be expanded to the expression for a single sensor.

The operation of a single sensor sonar can be expressed as shown in Figure 3. The total number of pings is *Np*, the single sensor—transmitter and receiver—corresponding to the *m*-th ping, is *um*, and the position of *um* is *rum* , *m* = 1, ..., *Np*. The center point of the area of interest is (*Xc*, *Yc*), the half-size of the area of interest in the range is *X*0, and the half-size of the area of interest in the cross-range is *Y*0. By dividing the area of interest into *Nx* × *Ny* grids, assuming that there is a virtual target *σ<sup>k</sup>* at each grid point, the signal s(*t*,**r***um* ) received at the *m*-th ping can be written as follows:

$$\mathbf{s}(t, \mathbf{r}\_{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\rm{\tiny{\rm{\rm{\rm{\tiny{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\cdots}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}$$
}}

The signal vector **s***i*, which consists of signals received at a specific time *ti* at each position of the sonar, can be written as

$$\mathbf{s}\_{i} = \left[ \mathbf{s}(t\_{i}, \mathbf{r}\_{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\tiny{\rm{\rm{\tiny{\rm{\rm{\tiny{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\rm{\cdots}}\cdot}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} \right}} \right{\,$$
}}}

Combining Equations (22) and (23), **s***<sup>i</sup>* can be expressed as the product of the target strength vector σ of the virtual targets and the sensing matrix **A***i*:

$$\mathbf{s}\_{i} = \mathbf{A}\_{i}\boldsymbol{\sigma}\_{i} \tag{24}$$

$$\boldsymbol{\sigma} = \begin{bmatrix} \sigma\_{1\prime} \dots \dots \sigma\_{N\_x N\_y} \end{bmatrix}^T \tag{25}$$

$$\mathbf{A}\_{i}(m,k) = \varepsilon \mathbf{x} p(j\omega(t\_i - \tau(\mathbf{r}\_k, \mathbf{r}\_{\text{li}\_m}))) \quad \left(0 \le t\_i - \tau(\mathbf{r}\_k, \mathbf{r}\_{\text{li}\_m}) \le T\_{\overline{p}}\right), \tag{26}$$

where **A***i*(*m*, *k*) denotes an element corresponding to the *m*-th row and *k*-th column of **A***i*. In the CS system, where the length of **y** is *M* and the length of **x** is *N* and *k-sparse*, **x** is successfully recovered when *M* ≥ *O*(*k* log(*N*/*k*)) measurements are used [36,38]. In the proposed algorithm, if the length *Np* of the received signal **s***<sup>i</sup>* is too small compared to the length *NxNy* of σ, an accurate solution cannot be obtained. Therefore, in the case where the length of the signal **s***<sup>i</sup>* is too small, signals for a total of *Nt* times are collected to form a long signal vector **s** and, similarly, corresponding matrices are collected to form a long sensing matrix **A**:

$$\mathbf{s} = \mathbf{A}\,\sigma,\tag{27}$$

$$\mathbf{s} = \begin{bmatrix} \mathbf{s}\_i^T, \mathbf{s}\_{i+1}^T, \dots, \mathbf{s}\_{i+N\_l-1}^T \end{bmatrix}^T \tag{28}$$

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}\_i^T \mathbf{,} \mathbf{A}\_{i+1} \mathbf{}^T \dots \mathbf{,} \mathbf{A}\_{i+N\_l-1} \mathbf{}^T \end{bmatrix}^T \tag{29}$$

where **<sup>s</sup>** <sup>∈</sup> <sup>C</sup>*NtNu* , **<sup>A</sup>** <sup>∈</sup> <sup>C</sup>*NtNu*×*NxNy* , and <sup>σ</sup> <sup>∈</sup> <sup>C</sup>*NxNy* . <sup>σ</sup> is estimated by solving the following BP or BPDN problems:

$$\min\_{\sigma \in \mathbb{C}^{N\_x N\_y}} ||\sigma||\_1 \text{ subject to } \mathbf{s} = \mathbf{A}\sigma,\tag{30}$$

$$\min\_{\sigma \in \mathbb{C}^{N\_R N\_Y}} \left\lVert \|\sigma\|\_1 \text{ subject to } \left\lVert \mathbf{s} - \mathbf{A}\sigma \right\rVert\_2 < \epsilon. \tag{31}$$

The σ value, determined by applying Equation (30) or (31), is a target strength vector obtained from using only the signals received in consecutive *Nt* time snapshots from *ti*. Therefore, to obtain the target strength vector for all the area of interest, the *l*1-norm minimization process must be repeated for all time snapshots corresponding to the area of interest. The final image was compiled by adding all the σ values obtained in each process.

**Figure 3.** Description for a single sensor sonar operation: *um* is the single sensor corresponding to the *m*-th ping, *Np* is the total number of pings, *σ<sup>k</sup>* is a virtual target, *Nx* is the number of the grid in the *x* direction, *Ny* is the number of the grid in the *y* direction, (*Xc*,*Yc*) is the center point of the area of interest, *X*<sup>0</sup> is the half-size of the area of interest in the *x* direction, *Y*<sup>0</sup> is the half-size of the area of interest in the *y* direction.

When obtaining solutions for BP or BPDN, the elements of σ with a large *l*1-norm of the corresponding sensing matrix column vector tend to have a non-zero value. To eliminate this bias, each column vector *ai* of the sensing matrix **A** and the received signal **s** is normalized by their *l*2-norms [14].

$$\mathbf{s}\_{\text{normal}} = \frac{\mathbf{s}}{\|\mathbf{s}\|\_2} \tag{32}$$

$$\mathbf{A}\_{\text{normal}} = \left[ \frac{\mathbf{a}\_1}{||\mathbf{a}\_1||\_2}, \frac{\mathbf{a}\_2}{||\mathbf{a}\_2||\_2}, \dots, \frac{\mathbf{a}\_{N\_x N\_y}}{||\mathbf{a}\_{N\_x N\_y}||\_2} \right]^T. \tag{33}$$

Therefore, Equations (30) and (31) are modified, and the final solution is obtained by compensating the *l*2-norms to the σ estimated from Equation (34) or Equation (35) as follows:

$$\min\_{\sigma \in \mathbb{C}^{N\_x N\_y}} \|\sigma\|\_1 \text{ subject to } \mathbf{s}\_{\text{normal}} = \mathbf{A}\_{\text{normal}} \sigma,\tag{34}$$

$$\min\_{\sigma \in \mathbb{C}^{N\_x N\_y}} ||\sigma||\_1 \text{ subject to } ||\mathbf{s}\_{\text{normal}} - \mathbf{A}\_{\text{normal}} \sigma||\_2 < \lambda. \tag{35}$$

#### *3.3. CS-SAS Algorithm for Uniform Line Array*

The algorithm proposed in Section 3.2 is a method used for a single sensor. However, in many cases, a uniform array sonar consisting of one transmitter and multiple receivers is used, and it needs to be extended to the algorithm for a uniform line array. The algorithm for the uniform linear array introduced in this section is the same as the algorithm for a single sensor, except for the travel time function. The transmitter in the *m*-th ping is *utm*, *m* = 1, ... , *Np*, and its position vector is **r***utm* . The number of receivers in the physical array is *Nu*, and the *n*-th receiver in the physical array is *un*, *n* = 1, ... , *Nu*. Therefore, Equations (22) and (23) can be reformulated as follows:

$$\mathbf{s}(t, \mathbf{r}\_{\mathrm{u}\_{n}}, \mathbf{r}\_{\mathrm{lu}\_{\mathrm{tr}}}) = \sum\_{k=1}^{N\_{\mathrm{x}}N\_{\mathrm{y}}} \sigma\_{\mathrm{k}} \exp\left(\mathbf{j}\omega(t - \boldsymbol{\tau}(\mathbf{r}\_{\mathrm{k}}, \mathbf{r}\_{\mathrm{u}\_{\mathrm{n}}}, \mathbf{r}\_{\mathrm{lu}\_{\mathrm{f}}}))\right) \left(0 \le t - \boldsymbol{\tau}(\mathbf{r}\_{\mathrm{k}}, \mathbf{r}\_{\mathrm{u}\_{\mathrm{n}}}, \mathbf{r}\_{\mathrm{lu}\_{\mathrm{f}}}) \le T\_{\mathrm{p}}\right), \tag{36}$$

$$\tau(\mathbf{r}\_{k\prime}, \mathbf{r}\_{\nu\_{l\prime}}, \mathbf{r}\_{\nu\_{lm}}) = \frac{|\mathbf{r}\_k - \mathbf{r}\_{\nu\_{l\prime}}| + |\mathbf{r}\_k - \mathbf{r}\_{\nu\_{lm}}|}{c}. \tag{37}$$

The signal vector composed of measurements received at a specific time *ti* of each receiver in the *j*-th ping is denoted as **s***i*,*j*. Therefore, the signal vectors **s***i*,*<sup>j</sup>* can be arranged as

$$\mathbf{s}\_{i,j} = \mathbf{A}\_{i,j}\boldsymbol{\sigma}\_{i} \tag{38}$$

$$\mathbf{s}\_{i,j} = \left[ \mathbf{s}\left(t\_{i\prime}, \mathbf{r}\_{u\_1}, \mathbf{r}\_{u\_{tj}}\right), \mathbf{s}\left(t\_i, \mathbf{r}\_{u\_{2\prime}}, \mathbf{r}\_{u\_{tj}}\right), \dots, \mathbf{s}\left(t\_{i\prime}, \mathbf{r}\_{u\_{N\_{\rm id}}}, \mathbf{r}\_{u\_{tj}}\right) \right]^T \tag{39}$$

$$\mathbf{A}\_{i,j}(m,k) = \exp\left(j\omega\left(t\_i - \tau\left(\mathbf{r}\_{k\prime}, \mathbf{r}\_{\mathcal{U}\_m,\prime} \mathbf{r}\_{\mathcal{U}\_l}\right)\right)\right) \quad \left(0 \le t\_i - \tau\left(\mathbf{r}\_{k\prime}, \mathbf{r}\_{\mathcal{U}\_m,\prime} \mathbf{r}\_{\mathcal{U}\_l}\right) \le T\_p\right). \tag{40}$$

Similar to the previous section, the following formulas are obtained:

$$\mathbf{s} = \mathbf{A}\boldsymbol{\sigma},\tag{41}$$

$$\mathbf{s} = \begin{bmatrix} \mathbf{s}\_{i,1} \, ^T \, \mathbf{s}\_{i,2} \, ^T \, \dots \, \mathbf{s}\_{i,N\_p} \, ^T \, \mathbf{s}\_{i+1,1} \, ^T \, \mathbf{s}\_{i+1,2} \, ^T \, \dots \, \mathbf{s}\_i \, \mathbf{s}\_{i+N\_l-1} \, \boldsymbol{N}\_p \end{bmatrix}^T \tag{42}$$

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}\_{i,1} \, ^T \boldsymbol{\mathcal{A}}\_{i,2} \, ^T \boldsymbol{\omega} \dots \boldsymbol{\mathcal{A}}\_{i,N\_p} \, ^T \boldsymbol{\mathcal{A}}\_{i+1,1} \, ^T \boldsymbol{\mathcal{A}}\_{i+1,2} \, ^T \boldsymbol{\upgamma} \dots \boldsymbol{\upgamma} \mathbf{A}\_{i+N\_l-1,N\_p} \, ^T \end{bmatrix}^T \,. \tag{43}$$

The CS framework can be applied by formulation as above.

#### **4. Results**

In this section, the performance of the proposed algorithms is demonstrated by comparing the results of applying the ω-k algorithm and the CS-SAS algorithms to both the simulation and experimental data. In the simulation and in the experiment, the carrier frequencies of the CW signal were 400 and 455 kHz, respectively, whereas the sampling frequencies were 25 or 50 kHz and, therefore, the ω-k algorithm included the baseband process.

The following shows that the CS-SAS algorithms exhibit superior performance in terms of resolution and noise robustness and indicates how to become robust when combating conditions where sensors are not working or data are lost.

#### *4.1. Simulation Results*

#### 4.1.1. Simulation Results for Single Sensor

The ω-k and CS-SAS algorithms were compared for various cases. For the singlesensor SAS, five cases were simulated. The basic simulation environment was a singlesensor sonar operated at 0.02 m intervals from −5 to 5 in the cross-range axis, that is, *Np* = 501. The signal *p*(*t*) was a CW signal with carrier frequency *fc* = 400 kHz and pulse duration *Tp* = 0.1 ms. The center point of the area of interest was [*Xc*,*Yc*] = [250, 0], the half-size of the area of interest in slant-range was *X*<sup>0</sup> = 0.5 m, the half-size of the area of interest in cross-range was *Y*<sup>0</sup> = 1 m, the sampling frequency was *fs* = 50 kHz, and the area of interest consisted of *NX* = 101 and *NY* = 101 uniform grid points. The sound speed was *c* = 1500 m/s. Twelve-point targets were placed, as described in Figure 4.

**Figure 4.** Locations of the 12 targets.

All simulation environments for single-sensor sonar were fundamentally the same as the basic simulation environment described above and were performed by changing the noise level, *Xc*, *fs*, and sonar interval, as shown in Table 1.

**Table 1.** Simulation Case for Single Sensor.


As shown in Figure 5, it was confirmed that the images obtained through the CS-SAS algorithm accurately distinguished 12 targets and had a good azimuth resolution. In addition, it was confirmed that there was no performance degradation caused by the side lobes. However, the result for the ω-k algorithm is unable to distinguish between targets adjacent to each other in the center of the area of interest. In addition, much aliasing occurs, especially in the azimuth direction. On account of the influence of matched filtering and Fourier transform, the exact position of the point targets cannot be obtained, resulting in a blurred result.

**Figure 5.** Simulation results for Case 1 for single sensor: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for single sensor.

• Case 2

The results of the simulation are shown in Figure 6. In the case of the ω-k algorithm, because the sampling frequency is reduced by half compared to Case 1, the resolution in the slant-range direction is reduced. The aliasing at the center of the area of interest has a larger value than the values at the four target locations, [250 ± 0.02, 0] and [250 ± 0.04, 0]. Nonetheless, the image obtained using the CS-SAS algorithm yielded an accurate target image.

**Figure 6.** Simulation results for Case 2 for single sensor: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for single sensor.

The results are shown in Figure 7. The CS-SAS algorithm accurately fetches the images of 12 targets. However, the ω-k algorithm requires a Fourier transform in the space domain, which violates Nyquist theory because the sampling level in the space domain is reduced to 1/10. Therefore, aliasing occurred, and the image of the target could not be properly obtained. The eight target points in the center were not distinguishable, and the remaining four points were difficult to determine.

**Figure 7.** Simulation results for Case 3 for single sensor: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for single sensor.

The results are depicted in Figure 8. Even when the spatial sampling is reduced to a level of 1/20 and when some side lobes occur, the CS-SAS algorithm still fetches the image of 12 targets, whereas the ω-k algorithm fails to depict the proper image of the targets.

**Figure 8.** Simulation results for Case 4 for single sensor: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for single sensor.

• Case 5

Noisy conditions with signal-to-noise ratios (SNRs) of 20, 10, and 5 dB were simulated. The results in Figure 9 indicate that the CS-SAS algorithm applied to an environment with SNRs of 20 dB and 10 dB, obtained an almost accurate image of the targets. However, when the SNR was 5 dB, the values at the grid point between [250, ±0.04] and [250, ±0.08] were greater than the values at [250, ±0.04] and [250, ±0.08], and an accurate image could not be obtained. As the noise became louder, a degree of degradation occurred. Nevertheless, the CS-SAS algorithm was still superior to the ω-k algorithm in terms of resolution and sidelobe suppression.

**Figure 9.** Simulation results of CS-SAS algorithm for Case 5 for single sensor: (**a**) SNR = 20 dB; (**b**) SNR = 10 dB; (**c**) SNR = 5 dB.

#### 4.1.2. Simulation Results for Uniform Line Array

The CS-SAS and ω-k algorithms for a uniform line array were simulated in two cases. The first simulation case is as follows: The simulation environments of sampling frequency *fs* and sound velocity *c*, excluding sonar configuration and *Xc*, are the same as those of simulation Case 1 for a single sensor. The array has 20 receivers, with a 0.04 m spacing between receivers, as shown in Figure 10a. By moving 0.4 m between pings, a total of 25 pings were shot. The source position of the uniform line array is 0.1 m away from the first sensor in the cross-range direction. Conditions for other simulations are the same as for Case 1, except that the sensor spacing of the array is different. The array has two receivers, with 0.4 m spacing between receivers, as shown in Figure 10b. By moving 0.4 m between pings, a total of 25 pings were shot. The source position of the uniform line array is 0.1 m away from the first sensor in the cross-range direction.

**Figure 10.** Description of line array in: (**a**) the first case of simulation; (**b**) the second case of simulation; the red box is a transmitter, and the black boxes are receivers. The black-dash boxes in (**b**) are actually empty but marked for comparison with (**a**).

• Case 1

The results for Case 1 are shown in Figure 11. The CS-SAS algorithm accurately fetched images of the 12 targets. Contrarily, the result for the ω-k algorithm showed aliasing. In particular, the total number of sensors used were 500 = 20 × 25, which compares favorably to the simulation environment of a single sensor; however, the interval between the sensors doubled to 0.04, and the spatial sampling interval also doubled, resulting in aliasing near [250, ±0.07].

**Figure 11.** Simulation results for Case 1 for uniform line array: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for uniform line array.

The results for Case 2 are shown in Figure 12. The synthetic aperture is the same as in Case 1, but the sensor spacing is increased 10 times. Severe aliasing occurred in the ω-k results as well as the inability to properly identify the targets. However, the CS-SAS algorithm was significantly better distinguished.

**Figure 12.** Simulation results for Case 2 for uniform line array: (**a**) result for the ω-k algorithm; (**b**) result for the CS-SAS algorithm for a uniform line array.

The performances of the CS-SAS and ω-k algorithms were compared using simulation results for a single sensor and uniform line array. In the case of the ω-k algorithm, even under the most naïve simulation conditions, adjacent targets could not be distinguished and aliasing occurred, whereas in the case of the proposed algorithm, because CS was applied and sidelobes were rather suppressed, high-resolution results were obtained. In effect, CS-SAS has clearly distinguished targets under harsher conditions by increasing spatial sampling or reducing *fs*, and has obtained accurate locations and shown robustness in noisy situations. This is possible because the measured signal can be expressed in a sparse representation for a certain domain, and CS can significantly lower the sampling rate and has robust resistance to noise [36,39].

#### *4.2. Experimental Data Results*

This was a water tank experiment conducted by SonaTech Inc. (Santa Barbara, CA, USA). As shown in Figure 13a, an experiment was performed to obtain images of two rings in a water tank. As shown in Figure 13b, the sonar has one transmitter and 32 receivers. After transmitting and receiving the signal once, the transmitter moves 616.5 mm and then sends and receives the next signal. This process was repeated seven times to receive signals from 224 locations. The ping signal *p*(*t*) is a CW signal with carrier frequency *fc* = 455 kHz and pulse duration *Tp* = 0.3 ms, the sampling frequency is *fs* = 50 kHz, and the sound speed *c* = 1480 m/s. There are two ring-shaped targets of approximate length of major axis 1.5 m each in the slant-range of 7 to 10 m and a cross-range of −2.4385 to 2.4385 m.

**Figure 13.** Water tank experiment overview: (**a**) water tank structure; (**b**) synthetic aperture line-array sonar.

The raw data recorded in the slant-range are shown in Figure 14a. The CS-SAS result was derived by dividing the area of interest into a uniform grid of *NX* = 101 and *NY* = 101. The results of the ω-k and CS-SAS algorithms are shown in Figure 14b,c, respectively.

**Figure 14.** (**a**) Raw data; (**b**) result for the ω-k algorithm; (**c**) result for the CS-SAS algorithm.

From the raw data, the targets can be seen in the form of rings, but the shape appears thick, and it is difficult to accurately determine the location of the targets. In the results of the ω-k algorithm, the shapes are slightly thinner, but aliasing is severe in the azimuth direction, and it appears that there are several rings. The results of the CS-SAS algorithm construct tworing-shaped targets. Because the CS-SAS algorithm attempts to bring an image with as little target distribution as possible, the side lobes are suppressed to obtain thin ring-shaped targets.

To examine whether the CS-SAS algorithm is robust under conditions where some of the sensors are broken, the result was derived by assuming a situation in which data from some sensors were lost. The experiment was divided into two cases: one case where the sensor failed uniformly (Sensor Loss: Uniform, SLU) and another case where the sensor failed randomly (Sensor Loss: Random, SLR). The percentages of sensors that did not malfunction and operated normally are also indicated in the results. The results are displayed in Figure 15.

**Figure 15.** Results for some sensor loss conditions. Uniformly broken case (SLU) where the percentage of the remaining sensors are: (**a**) 75%; (**b**) 50%; (**c**) 25%. Randomly broken case (SRU) where the percentage of the remaining sensors are (**d**) 75%; (**e**) 50%; (**f**) 25%.

In addition, peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), which are representative image quality measurement indicators [40,41], were calculated for quantitative comparison. PSNR is an index that evaluates loss information for the quality of the generated or compressed images and is expressed by the peak signal/mean square error (MSE) term. It has a unit of dB, and a higher value indicates less loss, i.e., higher image quality. SSIM is an index designed to evaluate differences in human visual quality rather than numerical errors. SSIM quality is evaluated in three aspects: luminance, contrast, and structural. However, since the correct answer image is not known, PSNR and SSIM were calculated for all sensor loss situations using Figure 14c as a reference image. The calculation results are shown in Table 2.


**Table 2.** PSNR and SSIM.

In the case of some conventional methods such as ω-k, a Fourier transform in space is performed. Therefore, it is difficult to obtain results freely in the form of an array because linear sampling is not possible in space when some sensors in the array fail. However, the CS-SAS algorithm does not perform Fourier transform in space and has a formulation that is free in the form of an array and, therefore, it is easy to obtain a result in a sensor loss situation. In addition, it is difficult to detect significant performance degradation of up to 75% for both SLU and SLR, and 50% of the SLR show particularly good results; note that CS has the best performance for random array or random down sampling [42,43]. Table 2 also shows that the random array results are generally better. In Table 2, it can be seen that the image quality of SLR is high for both 50% and 25%. At 75%, the indicators of SLR are worse than at 75% of SLU. Because there is little deterioration in image quality in 75% of cases, it can be seen that the PSNR and SSIM simply show how similar the reconstruction results are to Figure 14c, rather than showing the results of image quality deterioration. When it reaches approximately 25%, both SLU and SLR seem to blur to some extent, but in terms of side lobe suppression, it still shows no inferiority over the ω-k algorithm.

Using the CS-SAS algorithm in this study made it possible to obtain a higher resolution image than when using the conventional synthetic aperture sonar algorithm—the ω-k algorithm—and made it possible to reduce the problem of aliasing which also occurs in the conventional method. In addition, even with less spatial sampling, better results were obtained than compared to the conventional algorithm, and it was confirmed to be robust even when some sensors failed. Good results can be expected even if the number of sensors are reduced during actual sonar operation, and as a consequence, cost reduction is possible. Moreover, it is durable because it presents robust characteristics in failure situations. Results of actual experimental data were also observed, and it is expected that satisfying results will be obtained in the event that the CS-SAS algorithm is applied to a natural underwater environment.

#### **5. Conclusions**

In this paper, we proposed an algorithm that applies compressive sensing (CS) to a synthetic aperture sonar (SAS) under the assumption that the target distribution in water is sparse. Through simulation, it was confirmed that the proposed algorithm produces images with better resolution than the conventional SAS algorithm, the ω-k algorithm. In addition, because images obtained by the proposed method present very few and small side lobes, no deterioration of imaging performance occurs. Furthermore, even in the case of sampling at a low level that violates Nyquist theory in the time and space domain, a higher quality target image was obtained than with the ω-k algorithm.

Real environment applicability was revealed for the proposed method when comparing the results with actual experimental data. The results confirm that aliasing is reduced and side lobes are suppressed when applying the compressive sensing method. Contrarily, the ω-k algorithm does not obtain accurate target images due to aliasing. Importantly, it was confirmed that the proposed method is robust in the event of some sensors of the sonar system failing or when some data are lost.

**Author Contributions:** Conceptualization, H.-m.C. and H.-s.Y.; methodology, H.-m.C.; software, H. m.C.; validation, H.-s.Y. and W.-j.S.; formal analysis, H.-m.C.; investigation, H.-m.C.; resources, H.-s.Y. and W.-j.S.; data curation, H.-m.C.; writing—original draft preparation, H.-m.C.; writing—review and editing, H.-s.Y. and W.-j.S.; visualization, H.-m.C.; supervision, W.-j.S.; project administration, H.-s.Y.; funding acquisition, W.-j.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are not publicly available because these data belong to SonaTech Inc.

**Acknowledgments:** This research was supported by the Agency for Defense Development in Korea under Contract No. UD190005DD and the SonaTech Inc under Contract No. C180005.

**Conflicts of Interest:** The authors declare no conflict of interest.
