**2. Method**

To obtain high-resolution seabed topography and surface backscatter data, the proposed method was designed as shown in Figure 2.

Firstly, both SSS and MBES data are processed to form the geocoded images.

Secondly, the SSS and MBES image matching based on common features is conducted, which includes the initial matching with the SURF algorithm with image geographic coordinate constraint and finer matching by using dense local self-similarity (DLSS) descriptors.

Thirdly, the random sample consensus (RANSAC) algorithm is used for removing mismatches and then the SSS backscatter image geographic coordinates are rectified according to the relationship model established by the correct matched points.

Finally, according to the consistent geographic coordinates, the rectified SSS backscatter image is superposed on the MBES bathymetric terrain.

The proposed method is described in detail in the following paragraphs. The image matching program was written in MATLAB on a computer with an Intel Core i7 (3.40 GHz) CPU and 8.00 GB of RAM.

**Figure 2.** The acquisition of high-resolution seabed topography and surface details.

#### *2.1. SSS and MBES Data Processing*

The original SSS data were first decoded to form waterfall images ping by ping. A complete SSS data process includes bottom tracking, radiometric correction, geometric correction, geocoding and so forth [5,28–32]. This SSS data processing was conducted by self-developed software and the procedure is shown in Figure 3a. Because thousands of echoes exist in one ping scanning line, a high-resolution SSS backscatter image could be obtained after the above processing. Referring to Figure 1a, the positions of the SSS towfish were reckoned by combining the vessel-mounted GPS positioning results, the cable length and the towfish heading; they were therefore inaccurate because of the towing operation, the flat-bottom assumption, and the towing distance and bearing controlled by the towing speed and the currents. Therefore, the geographic coordinates of the high-resolution SSS backscatter image needed to be rectified.
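As an aside on the geometric correction step, the flat-bottom slant-range correction for a single echo can be sketched as below. This is a minimal illustration with hypothetical names, not the authors' software:

```python
import math

def slant_to_ground_range(slant_range_m, towfish_altitude_m):
    """Flat-bottom geometric correction for one SSS echo: recover the
    horizontal (ground) range from the slant range and towfish altitude.
    Echoes arriving before the first bottom return (slant range smaller
    than the altitude) belong to the water column and get no ground range."""
    if slant_range_m < towfish_altitude_m:
        return 0.0
    return math.sqrt(slant_range_m**2 - towfish_altitude_m**2)
```

For example, an echo at 5 m slant range recorded by a towfish flying 3 m above a flat bottom maps to a 4 m ground range.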

**Figure 3.** The flow charts of SSS (**a**) and MBES (**b**) data processing.

The original MBES data were first decoded to obtain the bathymetric data, backscatter data, GPS data, attitude data, sound velocity, and so forth. Quality control of these raw data was done first. Then, the bathymetric points were calculated, which includes attitude correction, sound ray tracing, obtaining the absolute coordinates of each bathymetric point, and so forth. After obtaining the locations of the beam footprints, the authors conducted radiometric correction, angular response correction and geocoding for the MBES backscatter data, and could then obtain the seabed image. Finally, after tidal correction of the sound ray tracing results, the absolute 3D coordinates of each bathymetric point were obtained and thus the seabed topography could be achieved. The procedure of MBES backscatter and bathymetric data processing is displayed in Figure 3b. To avoid gaps in the MBES image, the survey vessel velocity should be chosen appropriately. According to the MBES data processing procedure and the MBES measurement principle (Figure 1b), the geographic coordinates of the bathymetric and backscatter data were accurate. However, the seabed topography and image obtained by MBES were of low resolution, as only hundreds of echoes exist in one ping. As for the MBES backscatter, there were three types of acquired data: single beam intensity, individual beam time series and integrated time series. Although there exist a few to dozens of snippets in each beam footprint, the resolution of the MBES image was still far lower than that of the SSS backscatter image. Thus, the MBES image is usually used for reflecting large-scale seabed surface features instead of high-resolution visualization.
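Under the simplifying assumption of a constant sound speed, the geometry of one beam footprint reduces to the following sketch. The function name and the single-layer model are illustrative; the authors' processing uses full ray tracing through the measured sound velocity profile:

```python
import math

def mbes_footprint(two_way_travel_s, beam_angle_deg, sound_speed_mps=1500.0):
    """Across-track offset and depth of one beam footprint below the
    transducer, assuming a constant sound speed. A single-layer stand-in
    for the layered sound ray tracing used in real MBES processing."""
    slant = sound_speed_mps * two_way_travel_s / 2.0  # one-way slant range
    theta = math.radians(beam_angle_deg)              # angle from nadir
    return slant * math.sin(theta), slant * math.cos(theta)
```

For a nadir beam (0°) with a 20 ms two-way travel time, this gives zero across-track offset and a 15 m depth.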

#### *2.2. SSS and MBES Image Matching*

Though with different characteristics (e.g., resolution, position accuracy and intensity contrast), both SSS and MBES images can reflect seabed features of the same area. Thus, SSS and MBES image matching can be conducted by searching for common features in both images. To obtain accurate matched points, a matching method is proposed, which includes initial matching using the SURF algorithm with an image geographic coordinate constraint, and finer matching using dense local self-similarity (DLSS) descriptors. The initial image matching step first detects image feature points and then uses the SURF descriptors for feature-based image matching under the constraint of the image geographic coordinates. Since the SURF algorithm is not robust to nonlinear intensity differences between SSS and MBES images, some mismatches may arise in the initial matching step. To obtain more accurate matched points, the DLSS descriptor was used in the finer matching step and a template matching strategy was adopted. This two-step image matching process is described in detail in the following paragraphs.

#### 2.2.1. Initial Image Matching Using SURF Algorithm with Image Geographic Coordinate Constraint

The SURF algorithm, which is invariant to rotation, scale, brightness and contrast [33], was adopted to detect candidate feature points in the SSS and MBES images and to produce the SURF descriptors. By concatenating the 4D descriptor vector *V* of each sub-region, which describes the underlying intensity structure of a 4 × 4 grid of square sub-regions, a SURF descriptor of length 64 is obtained [33].

$$V = \left(\sum d_x, \sum d_y, \sum |d_x|, \sum |d_y|\right) \tag{1}$$

where *d*x and *d*y are the Haar wavelet responses in the horizontal and vertical directions, respectively. Since using the SURF descriptors directly for multi-modal image matching may lead to many mismatches, some valid constraints can be added for specific applications [34,35].
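Equation (1) can be illustrated with a short sketch that assembles a 64-D descriptor from precomputed Haar responses. This is a simplified layout with hypothetical names; real SURF additionally applies Gaussian weighting and interest-point orientation:

```python
import numpy as np

def subregion_vector(dx, dy):
    """The 4-D vector V of Equation (1) for one sub-region:
    (sum dx, sum dy, sum |dx|, sum |dy|)."""
    dx, dy = np.asarray(dx, float), np.asarray(dy, float)
    return np.array([dx.sum(), dy.sum(), np.abs(dx).sum(), np.abs(dy).sum()])

def surf_descriptor(dx_grid, dy_grid):
    """Concatenate V over a 4 x 4 grid of sub-regions into a 64-D
    descriptor. dx_grid and dy_grid are 16 x 16 arrays of Haar wavelet
    responses sampled around the keypoint."""
    parts = []
    for i in range(4):
        for j in range(4):
            block = np.s_[4 * i:4 * i + 4, 4 * j:4 * j + 4]
            parts.append(subregion_vector(dx_grid[block], dy_grid[block]))
    v = np.concatenate(parts)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v  # unit norm for contrast invariance
```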

For geocoded SSS and MBES images, the accuracies of their geographic coordinates are different: the geographic coordinates of the MBES image are more accurate than those of the SSS image because SSS surveying is easily affected by the towing operation and many other factors. This inconsistency of image positional accuracy means that the geographic coordinates of the same features presented in both images will differ. This difference is bounded within a range that can be determined by the SSS location error, which is discussed in detail in Section 5.1. Thus, when the distance between detected feature points in the two images is too large, they will certainly not be a correct match. Accordingly, SURF matching between the SSS and MBES images is conducted only when the distance between detected feature points is within a threshold *Dis*, which reduces the mismatches compared with using the SURF algorithm directly. With this constraint, the initial feature-based image matching was conducted as shown in Equation (2).

$$\begin{cases} \text{if } D_i \leq Dis, & \text{SURF matching} \\ \text{otherwise}, & \text{refused} \end{cases} \tag{2}$$

where *D<sub>i</sub>* is the distance between the candidate matched points. The determination of the threshold *Dis* is also described in detail in Section 5.1.
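The constraint of Equation (2) amounts to a simple distance gate on candidate matches, as in this sketch (function and variable names are illustrative):

```python
import math

def within_constraint(pt_sss, pt_mbes, dis_threshold_m):
    """Keep a candidate SURF match only if the geographic distance D_i
    between its two feature points is within the threshold Dis (Eq. 2)."""
    d = math.hypot(pt_sss[0] - pt_mbes[0], pt_sss[1] - pt_mbes[1])
    return d <= dis_threshold_m

def filter_candidates(matches, dis_threshold_m):
    """matches: list of ((x_sss, y_sss), (x_mbes, y_mbes)) candidates."""
    return [m for m in matches
            if within_constraint(m[0], m[1], dis_threshold_m)]
```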

#### 2.2.2. Finer Image Matching Using DLSS Descriptors

Because the SSS backscatter reflects relative backscatter levels while the MBES images are usually normalized to a dB scale, there is no linear relationship between their backscatter intensity levels. Meanwhile, as the resolution, SNR and intensity contrast of the SSS backscatter image are higher than those of the MBES image, the same seabed feature points may be presented more clearly in the SSS backscatter image than in the MBES image. As a result, some feature points can be detected in the SSS backscatter image but not in the MBES image. The SURF algorithm may produce incorrect matches when dealing with nonlinear intensity differences because it mainly focuses on the intensity variations of feature points and the spatial distributions of their gradient information [33–36]. Moreover, as the numbers of detected feature points in the two images were different, the finally obtained number of matched points was limited. To obtain better matched points, the shape properties of distinct targets, topography changes or sediment distributions in these images can be used.

The DLSS descriptor can represent these shape properties and is robust against significant nonlinear intensity differences by integrating the local self-similarity (LSS) descriptors of multiple local image regions in a dense sampling grid [37]. The LSS is calculated by correlating the image patch centered at a point *q* with a larger surrounding image region, resulting in a correlation surface *Sq*(*x,y*) as below [38].

$$S_q(x, y) = \exp\left(-\frac{SSD_q(x, y)}{\max(\text{var}_{noise}, \text{var}_{auto}(q))}\right) \tag{3}$$

where *SSDq*(*x,y*) is the sum of squared differences between the image patch intensities, var*noise* is a constant accounting for acceptable intensity variations and var*auto*(*q*) is the maximal *SSD* of all patches within a very small neighborhood of *q*. Then, *Sq*(*x,y*) is transformed into a log-polar representation and partitioned into bins. The maximal correlation value of each bin is selected to generate the LSS descriptor associated with the pixel *q*. To be invariant to the intensity variations of the image, the LSS is linearly stretched into the range [0, 1] [38]. An example of the LSS descriptor is shown in Figure 4, which indicates that the LSS descriptors of feature points are similar when the points share a common layout.

**Figure 4.** An example of the local self-similarity (LSS) descriptor. (**a**) and (**c**) are the SSS and MBES images; the red rectangles in (**a**) and (**c**) mark the locations of the feature points; (**b**) and (**d**) are the LSS descriptors of the two points.
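A simplified version of the LSS computation in Equation (3) might look as follows. The patch sizes, bin counts and log-polar mapping here are illustrative choices, not the parameters of the cited papers:

```python
import numpy as np

def lss_descriptor(image, q, patch=1, region=5, n_rad=3, n_ang=4,
                   var_noise=1e-3):
    """Simplified LSS descriptor at pixel q = (row, col), following Eq. (3):
    correlate the centre patch with every patch in the surrounding window,
    take the maximal correlation per log-polar bin, stretch to [0, 1]."""
    img = np.asarray(image, float)
    r0, c0 = q
    centre = img[r0 - patch:r0 + patch + 1, c0 - patch:c0 + patch + 1]
    offsets = range(-region, region + 1)
    ssd = np.empty((2 * region + 1, 2 * region + 1))
    for i, dr in enumerate(offsets):
        for j, dc in enumerate(offsets):
            p = img[r0 + dr - patch:r0 + dr + patch + 1,
                    c0 + dc - patch:c0 + dc + patch + 1]
            ssd[i, j] = np.sum((p - centre) ** 2)
    # var_auto: maximal SSD within a very small neighbourhood of q
    var_auto = ssd[region - 1:region + 2, region - 1:region + 2].max()
    surface = np.exp(-ssd / max(var_noise, var_auto))
    # log-polar binning: keep the maximal correlation value per bin
    desc = np.zeros(n_rad * n_ang)
    max_rad = np.log1p(region * np.sqrt(2))
    for i, dr in enumerate(offsets):
        for j, dc in enumerate(offsets):
            if dr == 0 and dc == 0:
                continue
            rb = min(int(np.log1p(np.hypot(dr, dc)) / max_rad * n_rad),
                     n_rad - 1)
            ab = int((np.arctan2(dr, dc) + np.pi) / (2 * np.pi) * n_ang) % n_ang
            k = rb * n_ang + ab
            desc[k] = max(desc[k], surface[i, j])
    lo, hi = desc.min(), desc.max()
    return (desc - lo) / (hi - lo) if hi > lo else desc
```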

By collecting the LSS descriptors from spatial regions within a dense overlapping grid covering a defined square area, the DLSS was produced as a combined feature descriptor [37]. Using the DLSS descriptors, a template matching strategy was conducted based on the initial image matching results.

First, the DLSS descriptors of the initial matched points in the SSS backscatter image were produced, and a search area of radius *r*, centered at the corresponding initial matched points in the MBES image, was defined. Meanwhile, the DLSS descriptors of every pixel in this defined area were calculated.

Secondly, the similarity between the DLSS descriptors of the SSS backscatter image feature points and those of the as-yet-unprocessed points in the above-defined search area of the MBES image was calculated. The normalized cross correlation (NCC) of the DLSS descriptors, as shown in Equation (4), was adopted as the similarity metric.

$$\text{NCC} = \frac{\sum_{i=1}^{N} \left( DLSS1_i - \text{mean}(DLSS1) \right)\left( DLSS2_i - \text{mean}(DLSS2) \right)}{\sqrt{\sum_{i=1}^{N} \left( DLSS1_i - \text{mean}(DLSS1) \right)^2 \sum_{i=1}^{N} \left( DLSS2_i - \text{mean}(DLSS2) \right)^2}} \tag{4}$$

where *DLSS1* and *DLSS2* are the DLSS descriptors of the matched points in the SSS and MBES images, and *N* is the dimension of the DLSS descriptor.

Finally, by choosing the maximum NCC and selecting the corresponding point in the defined area of the MBES image, the finer match could be obtained. After conducting the above process for all the initial matched points in the SSS backscatter and MBES images, more correct matched points were obtained.
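The similarity metric of Equation (4) and the selection of the maximal NCC can be sketched as follows (names are illustrative):

```python
import numpy as np

def ncc(d1, d2):
    """Normalised cross correlation between two DLSS descriptors (Eq. 4)."""
    d1, d2 = np.asarray(d1, float), np.asarray(d2, float)
    a, b = d1 - d1.mean(), d2 - d2.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_match(d_sss, candidates):
    """Pick the candidate point whose DLSS descriptor maximises the NCC.
    candidates: list of ((row, col), descriptor) inside the search area."""
    return max(candidates, key=lambda c: ncc(d_sss, c[1]))[0]
```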

#### *2.3. SSS Backscatter Image Geographic Coordinates Rectification Based on the Matched Points*

Even after the finer matching step, mismatches still existed. The reason is that, affected by the measurement environment (e.g., the currents, the survey vessel velocity and attitude changes), the seabed features presented in the SSS and MBES images may not be identical. To eliminate these mismatches, the random sample consensus (RANSAC) algorithm was adopted. Since the RANSAC algorithm is conducted by selecting some of the existing matched points to build a geometric model [39] and the local geometric distortions in different parts of the SSS backscatter image are different, the SSS backscatter image was segmented into blocks according to the feature distributions and towfish attitudes, and the RANSAC algorithm was conducted separately in each block [40–42]. When segmenting the SSS backscatter image, enough matched points should be contained in each block and the split lines are supposed to lie on the edges of the targets [41,42]. Meanwhile, to ensure continuity between adjacent blocks, an overlapping ratio (e.g., one seventh of the block) can be set between them. To further ensure that the remaining matches were correct, a geometric transformation model was established using the matched points retained by the RANSAC algorithm, and the coordinates of the feature points in the SSS backscatter image were rectified. By referring to the matched feature points in the MBES image, the corresponding feature points in the rectified SSS backscatter image were considered correct matches when their coordinate errors were less than double the standard deviation. After this process, the correct matched points were obtained.
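The block-wise RANSAC step, using an affine transformation as the geometric model, can be sketched as follows. This is a minimal illustration; the iteration count, tolerance and names are assumptions, not the authors' settings:

```python
import numpy as np

def _fit_affine(src, dst):
    """Least-squares affine model mapping src points to dst points.
    Returns a 3x2 coefficient matrix: columns (a0,a1,a2) and (b0,b1,b2)."""
    A = np.column_stack([np.ones(len(src)), src])  # rows [1, x_s, y_s]
    coef, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return coef

def ransac_affine(src, dst, n_iter=300, tol=1.0, seed=0):
    """Minimal RANSAC sketch for one block: fit an affine model to 3
    random matches, count matches whose residual is below tol (metres),
    and keep the consensus set of the best model."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), size=3, replace=False)
        coef = _fit_affine(src[idx], dst[idx])
        pred = np.column_stack([np.ones(len(src)), src]) @ coef
        inliers = np.linalg.norm(pred - dst, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best
```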

Based on the correct matched points between SSS and MBES images in each block, a geometric transformation model could be built to rectify the geographic coordinates of the SSS backscatter image by referring to those of the MBES image. Since the geographic distortions in local regions of the SSS backscatter image can be modeled by the affine function [43], the affine transformation model as shown in Equation (5) can be built within each block.

$$\begin{cases} x_m = a_0 + a_1 x_s + a_2 y_s \\ y_m = b_0 + b_1 x_s + b_2 y_s \end{cases} \tag{5}$$

where (*x*s, *y*s) and (*x*m, *y*m) are the geographic coordinates of matched feature points in the SSS and MBES images, and *a*0, *a*1, *a*2 and *b*0, *b*1, *b*2 are the model parameters to be estimated. After obtaining the geometric transformation model within each segmented block, the SSS backscatter image geographic coordinates could be rectified.
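Estimating the six parameters of Equation (5) from the correct matched points of a block is a linear least-squares problem, as in this sketch (function names are illustrative):

```python
import numpy as np

def fit_block_affine(sss_pts, mbes_pts):
    """Least-squares estimate of (a0, a1, a2) and (b0, b1, b2) in Eq. (5)
    from the correct matched points of one block."""
    sss = np.asarray(sss_pts, float)
    mbes = np.asarray(mbes_pts, float)
    A = np.column_stack([np.ones(len(sss)), sss])   # rows [1, x_s, y_s]
    coef, *_ = np.linalg.lstsq(A, mbes, rcond=None)
    return coef                                      # 3x2: columns x_m, y_m

def rectify(coef, sss_pts):
    """Apply the block's affine model to SSS image coordinates."""
    sss = np.asarray(sss_pts, float)
    return np.column_stack([np.ones(len(sss)), sss]) @ coef
```

For instance, matched points related by a pure translation (*x*m = *x*s + 2, *y*m = *y*s − 1) recover exactly that model.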

#### *2.4. Superimposition of Rectified SSS Backscatter Image on MBES Terrain*

After the rectification, the geographic coordinates of the SSS backscatter image were consistent with those of the MBES image. Thus, the rectified SSS backscatter image could be superposed on the MBES terrain. This process was conducted in the following steps:

(1) Based on the matched points, the geographic coordinates of the SSS backscatter image were rectified by referring to those of the MBES image.

(2) The 3D seabed topography was constructed using the regular square grid method, with the same resolution as the SSS backscatter image.

(3) According to the geographic coordinates, each grid point of the 3D topography was attributed with the grey value of the corresponding pixel of the SSS backscatter image.

After this process, the high-resolution 3D seabed topography and surface details can be presented together.
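Step (3) above, attaching grey values to the terrain grid nodes by geographic coordinates, can be sketched as below, assuming both products share the same GSD and that the image row and column indices increase with y and x respectively (names and layout are illustrative):

```python
import numpy as np

def drape_sss_on_terrain(grid_x, grid_y, grid_z, sss_grey, origin, gsd):
    """Attach the rectified SSS grey value to every terrain grid node.

    grid_x, grid_y, grid_z: 2-D arrays of node coordinates and depths;
    sss_grey: rectified SSS backscatter image resampled at the same GSD;
    origin: geographic coordinates (x0, y0) of sss_grey pixel (0, 0).
    Returns a grey-value array aligned with the terrain grid; nodes
    falling outside the image get NaN."""
    cols = np.round((grid_x - origin[0]) / gsd).astype(int)
    rows = np.round((grid_y - origin[1]) / gsd).astype(int)
    grey = np.full(grid_z.shape, np.nan)
    ok = ((rows >= 0) & (rows < sss_grey.shape[0])
          & (cols >= 0) & (cols < sss_grey.shape[1]))
    grey[ok] = sss_grey[rows[ok], cols[ok]]
    return grey
```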

#### **3. Experiments and Results**

#### *3.1. Experimental Data*

To validate the proposed method, an experiment was carried out in Meizhou Bay, China, over an area of 1.4 × 0.3 km. In this water area, the depth ranges from 10 to 14 m; seabed sampling showed that the sediments mainly contain silt and sand. In this experiment, an EdgeTech 4100P with a slant range of 112 m and 7502 sampling points per ping was adopted for the SSS measurement. For the MBES measurement, an EM 3002 with a maximum angular sector of 130° and 131 beams was adopted. The MBES data were collected using the Seabed Information System (SIS) and stored in "\*.all" files. The SSS data were collected using the EdgeTech Discover software and stored in "\*.xtf" files. After collection, both the SSS and MBES data were processed by self-developed software. The processing of the SSS and MBES data is shown in Figure 3. After data processing, the geocoded SSS backscatter image, MBES image and seabed topography were created, as shown in Figure 5a–c. The coordinates are in WGS84. Considering the sensitivity of geographical data, a constant has been subtracted from the coordinates. Meanwhile, the MBES and SSS backscatter images and the MBES topography were resampled at the same ground sample distance (GSD) of 1 m. Even with the same GSD, the SSS backscatter image reflects clearer contours of the seabed surface features because there are many more sampling points in one ping of the SSS measurement than in one ping of the MBES measurement. For example, the pipeline is presented more clearly in the SSS backscatter image than in the MBES image. Meanwhile, the topography shown in Figure 5c mainly reflects the 3D seabed undulations but with less detail.

Thus, the superimposition of the high-resolution SSS backscatter image on high-accuracy seabed terrain can take advantage of both datasets, which can capture both the 3D seabed topography and the detailed surface features together.

**Figure 5.** The SSS backscatter image (**a**), MBES image (**b**) and seabed topography (**c**) of the experimental area.

#### *3.2. Experimental Procedure and Results*

According to the proposed method depicted in Section 2, the SURF algorithm was first used to detect feature points in the SSS and MBES images. In a unified geographical framework, the image geographic coordinates can serve as the constraint in the initial image matching step, as described in Section 2.2.1. This matching result is shown in Figure 6b. Compared with the matching results shown in Figure 6a, which were obtained by the classical SURF algorithm without the constraint, the initial matching result with the constraint is better. The reason is that the SURF algorithm is not robust when dealing with the nonlinear intensity difference between SSS and MBES images. Meanwhile, the features present on the seabed are relatively simple, and the detected edge and point features can be represented by similar SURF descriptors, which may decrease the distinctiveness of the SURF descriptors. Because of these two factors, even when two feature points in the SSS and MBES images are distant, they may be regarded as matches when they are represented by similar SURF descriptors. When the image geographic coordinates are used as the constraint in the initial matching step, the initial image matching is only conducted within a valid distance, which avoids mismatches when the geographic coordinates of two feature points are too far apart.

Moreover, since there are many more sampling points in the SSS measurement than in the MBES measurement, the seabed features in the MBES image may not be as clear as those in the SSS backscatter image. As a result, some feature points may be detected in the SSS backscatter image but not be presented in the MBES image. Consequently, it was found that several detected feature points in the SSS backscatter image were matched to one feature point in the MBES image, particularly in the zoomed area (b1) of Figure 6b. This several-to-one matching is obviously incorrect and is resolved in the finer image matching step.

As the image geographic coordinates are used as a constraint in the initial image matching step, the correct matched points in the MBES image are supposed to be located within an area centered at the initial matched ones. Thus, the optimized search operation described in Section 2.2.2 was conducted and the better matched points are shown in Figure 6c. In this process, a template matching strategy using DLSS descriptors was adopted. Compared with the SURF descriptor, the DLSS descriptor is robust to the nonlinear intensity difference between SSS and MBES images because it reflects the shape properties of feature points and their surrounding areas. Meanwhile, this finer image matching using a template matching strategy can also help find the same number of feature points in the MBES image as those in the SSS backscatter image. As a result, more correct matched points can be obtained by adopting the finer image matching step. Compared with Figure 6b, obtained from the initial image matching step, the several-to-one matching phenomenon was removed after the finer matching step, as clearly shown in the zoomed area (c1) of Figure 6c. This improvement of matching performance is also shown in Table 1, which lists the numbers of all the matches obtained by SURF, initial matching and finer matching, as well as the numbers of correct matches after the application of the RANSAC algorithm.

Even after the finer image matching step, mismatches still exist in Figure 6c. To eliminate these mismatches, the SSS backscatter image was segmented into blocks and the RANSAC algorithm was adopted in each block. In the segmentation process, it can be seen that there exist three main independent areas with abundant seabed features. Thus, the SSS backscatter image was segmented into three blocks according to the feature distributions and towfish attitudes. Meanwhile, the overlapping ratio of one seventh between adjacent blocks was set and the split lines lay on the edges of isolated targets, as shown by the red rectangles in Figures 6 and 7. The image matching results after this process are shown in Figure 7 and Table 1. It can be seen that, even after using the RANSAC algorithm, some mismatches still exist in Figure 7a, because too many mismatches occurred when the SURF algorithm was used directly and the RANSAC algorithm is sensitive to a high outlier rate. As a result, these mismatches cannot be effectively eliminated by the RANSAC algorithm. This phenomenon also proves that adding the constraint of image geographic coordinates in the initial matching step was necessary; it improved the matching ratio from 8% to 55%, as shown in Table 1. Even so, mismatches still exist after the initial step, as seen in the zoomed area (b1) of Figure 7b. After the finer image matching step, the matching ratio rose to 86%. More importantly, these matches are correct, as shown in the zoomed area (c1) of Figure 7c. This is because the combination of the initial and finer image matching not only makes full use of the intensity variations and gradient information of feature points, but also considers their geographic coordinates and the shape properties of the areas around them.

Using the correct matched points that were finally obtained in each block, the transformation model shown in Equation (5) could be established. According to the obtained transformation model, the SSS backscatter image geographic coordinates could be rectified following the method in Section 2.3. The accuracy of the rectified SSS backscatter image geographic coordinates is discussed in Section 4.

**Figure 6.** Image matching results by the Speeded-Up Robust Features (SURF) algorithm (**a**), initial matching with constraint (**b**) and finer matching (**c**). The colored full lines in (**a**) (**b**) (**c**) (**b1**) and (**c1**) are used to connect matched points. The red dotted rectangles in (**a**) (**b**) (**c**) are the segmented blocks. The black rectangles in (**b**) and (**c**) are the zoomed areas (**b1**) and (**c1**). The red rectangle in (**b1**) shows the several-to-one matching phenomenon.

**Table 1.** The matching results of the SURF algorithm, initial matching with constraint and finer matching, before and after using RANSAC.

**Figure 7.** Image matching results by SURF algorithm (**a**), initial matching with constraint (**b**) and finer matching (**c**) after the application of random sample consensus (RANSAC) algorithm. The colored full lines in (**a**) (**b**) (**c**) (**b1**) and (**c1)** are used to connect matched points. The red rectangles in (**a**) (**b**) (**c**) are the segmented blocks. The black rectangles in (**b**) and (**c**) are the zoomed areas (**b1**) and (**c1**).

Superimposing the rectified SSS backscatter image on the seabed topography, the comprehensive presentation of the 3D topography and surface details is shown in Figure 8. Compared with Figure 5c, Figure 8 reflects the topography undulations in combination with highly detailed SSS backscatter imagery.

**Figure 8.** Representation of seabed topography and surface details.

#### **4. Evaluation and Analysis**

To assess the accuracy of the rectified SSS backscatter image geographic coordinates, interior checking and exterior checking were conducted. The interior checking calculates the geographic coordinate biases, before and after the rectification, of the matched feature points used to build the transformation model, while the exterior checking calculates those of the matched points not used to build the model. Meanwhile, the statistics of these biases were also calculated. In this experiment, two thirds of the matched points were used for establishing the transformation model and for the interior checking; the remaining one third were used for the exterior checking. The statistical results and probability distribution function (PDF) curves of each checking are shown in Tables 2 and 3 and Figures 9 and 10. In Tables 2 and 3, "Raw" denotes the coordinate biases of the matched feature points in the raw SSS and MBES images; "Rectified" denotes those in the rectified SSS and MBES images; "dy" denotes the biases in the east direction, and "dx" those in the north direction.

**Table 2.** The statistical results of interior checking (Unit: m).


**Table 3.** The statistical results of exterior checking (Unit: m).


**Figure 9.** The probability distribution function (PDF) curves of interior checking. (**a**) and (**c**) are PDF curves of coordinate biases in y direction before and after rectification; (**b**) and (**d**) are those of coordinate biases in x direction.

**Figure 10.** The PDF curves of exterior checking. (**a**) and (**c**) are PDF curves of coordinate biases in y direction before and after rectification; (**b**) and (**d**) are those of coordinate biases in x direction.

From Tables 2 and 3 and Figures 9 and 10, it can be seen that obvious systematic errors and large standard deviations exist between the raw geographic coordinates of the matched points. Tables 2 and 3 show that the systematic errors in the two directions are 4.32 m and 5.98 m for the interior checking and 5.06 m and 6.17 m for the exterior checking. After applying the transformation model, these systematic errors were removed. Meanwhile, the standard deviations became smaller, as seen in Tables 2 and 3 and Figures 9 and 10. These results prove that the accuracy of the SSS backscatter image geographic coordinates improved after the rectification. Moreover, the accuracies of the interior and exterior checking were consistent with each other, which indicates that the proposed method is robust in building the transformation model.
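The reported statistics (the mean as systematic error and the standard deviation as dispersion) can be reproduced per axis from the coordinate biases of the matched points, as in this minimal sketch (names are illustrative):

```python
import numpy as np

def bias_statistics(sss_xy, mbes_xy):
    """Mean (systematic error) and sample standard deviation of the
    coordinate biases between matched points, per axis (east, north)."""
    d = np.asarray(sss_xy, float) - np.asarray(mbes_xy, float)
    return d.mean(axis=0), d.std(axis=0, ddof=1)
```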

#### **5. Discussion**
