#### *3.1. Network Model Establishment and Bottom Tracking*

The experiment started with a short side scan survey line of 121 pings. The raw recorded data (\*.xtf) were decoded, and the corresponding waterfall sonar image was constructed, as shown in Figure 10a. The bottom positions were then picked manually, as shown in Figure 10b.

**Figure 10.** (**a**) Side scan waterfall sonar image and (**b**) manual bottom tracking result of the survey line.

The positive sample sequences were taken from the bottom backscatter strength sequences of the side scan data at the corresponding manually tracked bottom positions. The negative sample sequences were selected uniformly across the water column and seabed areas. Together, the positive and negative samples constituted the sample set for model training. Because the survey line had 121 pings, each with a port and a starboard channel, the sample set contained 242 positive samples and 2662 negative samples.
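The sample counts above can be reproduced with a short sketch. The counts imply 11 negative sequences per positive one (2662 / 242 = 11). Everything else here is an assumption made for illustration and is not taken from the paper: the window length `WIN`, the guard distance around the bottom, centring the positive window on the bottom index, and the helper name `extract_sequences`.

```python
import numpy as np

WIN = 32     # hypothetical sequence length in samples (not given in this section)
N_NEG = 11   # negatives per channel, implied by 2662 / 242


def extract_sequences(channel, bottom_idx, win=WIN, n_neg=N_NEG, guard=None):
    """Build one positive and n_neg negative sequences for one channel of one ping.

    channel    : 1-D array of backscatter samples (port or starboard)
    bottom_idx : manually picked bottom sample index for this channel
    Returns (X, y) with X of shape (1 + n_neg, win) and labels 1 / 0.
    """
    channel = np.asarray(channel, dtype=float)
    guard = win if guard is None else guard
    half = win // 2

    # Candidate window centres spread uniformly over the whole channel,
    # i.e. over both the water column and the seabed area.
    centres = np.linspace(half, len(channel) - half - 1, 4 * n_neg).astype(int)
    # Assumption: discard candidates that overlap the bottom position.
    centres = centres[np.abs(centres - bottom_idx) >= guard]
    # Thin the remaining candidates uniformly down to n_neg centres.
    keep = np.linspace(0, len(centres) - 1, n_neg).astype(int)

    windows = [channel[bottom_idx - half: bottom_idx + half]]        # positive
    windows += [channel[c - half: c + half] for c in centres[keep]]  # negatives
    labels = np.array([1] + [0] * n_neg)
    return np.stack(windows), labels
```

Applying this helper to both channels of all 121 pings yields 242 positive and 2662 negative sequences, matching the counts reported above.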

The sample set was normalized according to Equation (1) and fed to the input layer of the network. The corresponding labels (1: positive, 0: negative) were used as the target output. During training, the samples were randomly divided into training and validation sets at a 70:30 ratio. The 1D-CNN (Figure 4) was trained to learn the variation features of the data samples. The training and validation accuracies improved as the number of training epochs increased, as shown in Figure 11.
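The random 70:30 division can be sketched as follows. This is a minimal illustration, assuming the samples have already been normalized with Equation (1) and stored as a NumPy array; the function name `random_split`, the fixed seed, and the rounding of the training set size are assumptions and not details from the paper.

```python
import numpy as np


def random_split(X, y, train_frac=0.7, seed=0):
    """Shuffle the samples once and split them into training and validation sets."""
    rng = np.random.default_rng(seed)      # fixed seed only for repeatability
    order = rng.permutation(len(X))        # random ordering of sample indices
    n_train = int(round(train_frac * len(X)))
    train_idx, val_idx = order[:n_train], order[n_train:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]
```

With 11 negative samples for every positive one, a classifier that labels every sequence as negative would already score 11/12 (about 91.7%) accuracy, so the accuracy curves are best read against that baseline.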

As shown in Figure 11, the training accuracy improved gradually and reached a stable value of 100% after approximately 40 training epochs. The validation accuracy fluctuated during the first 10 training epochs and reached a stable value after 20 epochs. The training and validation losses decreased gradually as the number of epochs increased and eventually approached 0. For the small sample set of the selected survey line, the network model proposed in this paper effectively learned the features of the positive and negative samples and recognized them accurately after training, which is the basis for the real-time bottom tracking of the survey line.

Using the trained network model, each ping of the survey line was bottom tracked following the procedure illustrated in Figure 6. The bottom tracking results of the port and starboard side scan data were then processed (Figure 5) and compared with each other (Figure 12a). The corresponding bottom tracking result is overlaid on the side scan waterfall image in Figure 12b.

**Figure 11.** Training and validation accuracies and losses of the network in 50 epochs.

**Figure 12.** Bottom tracking results obtained by using the trained network. (**a**) Bottom positions (sample indexes) of the port and starboard sides and (**b**) waterfall image with the bottom tracking lines.

The bottom tracking results of the port and starboard data were highly consistent with each other: all bottom position differences were less than four samples (0.16 m), as expected given that the seabed topography of the survey area was relatively flat. Moreover, the tracking lines overlaid on the waterfall image clearly delineated the edges of the port and starboard seabed areas, in agreement with visual inspection. The comparison between the network and manual bottom tracking results showed that the bottom tracking accuracy reached 100% on the training survey line. These results support the validity of the proposed bottom tracking method.
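The consistency check described above can be sketched in a few lines. The sample spacing of 0.04 m follows from the stated equivalence of four samples and 0.16 m. The function names and the tolerance parameter `tol` are hypothetical, because the criterion used to count a tracked position as correct is not specified in this section.

```python
import numpy as np

SAMPLE_SPACING_M = 0.16 / 4  # 0.04 m per sample, implied by "four samples (0.16 m)"


def port_starboard_difference(port_idx, stbd_idx, spacing=SAMPLE_SPACING_M):
    """Per-ping absolute difference between port and starboard bottom positions."""
    diff = np.abs(np.asarray(port_idx, dtype=int) - np.asarray(stbd_idx, dtype=int))
    return diff, diff * spacing  # in samples and in metres


def tracking_accuracy(auto_idx, manual_idx, tol=0):
    """Fraction of pings whose tracked bottom lies within tol samples of the manual pick.

    tol is an assumed parameter; the paper's matching criterion is not given here.
    """
    err = np.abs(np.asarray(auto_idx, dtype=int) - np.asarray(manual_idx, dtype=int))
    return float(np.mean(err <= tol))
```

For a survey line, checking `diff.max() < 4` on the first return value corresponds to the four-sample bound reported above.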

#### *3.2. Method Validation and Comparison*

To validate the generalization ability of the trained model and the effectiveness of the proposed method on other survey lines in the test area, the trained model was applied to the side scan data of survey lines that were not used in training.
