1. Introduction
The underwater acoustic (UWA) channel is one of the most challenging communication media [1,2]. The low propagation speed of acoustic waves in water makes the multi-path and Doppler effects several orders of magnitude larger than in wireless radio communication. Even when the transceivers do not move, seawater movement and sea surface fluctuations still cause Doppler shifts. The severe multi-path and Doppler effects cause time- and frequency-selective fading. Since the available carrier frequencies for medium-range UWA communication are only in the kHz range, a slight movement of the transceiver causes a large Doppler shift. Orthogonal frequency division multiplexing (OFDM) is widely applied in UWA communication due to its high spectrum efficiency and robustness against the multi-path effect [3,4,5,6,7,8]. For classical OFDM communication, however, the severe Doppler shift in the UWA channel leads to inter-carrier interference (ICI), and the performance of OFDM degrades significantly.
Orthogonal time frequency space (OTFS) modulation is a promising two-dimensional (2D) modulation technique proposed in recent years for high-mobility communication scenarios [9,10]. The basic principle of OTFS is to modulate information symbols in the 2D delay-Doppler (DD) domain rather than the time-frequency (TF) domain. By working in the DD domain, OTFS modulation can transform the channel into an approximately non-fading channel through a series of 2D transformations. In UWA OTFS communication, however, the fast time-varying UWA channel still introduces ICI and inter-symbol interference (ISI). Channel equalization and signal detection can mitigate this interference and improve the communication performance of OTFS. Signal detection algorithms are generally divided into linear and nonlinear algorithms. Linear signal detection methods, such as the zero-forcing (ZF) algorithm [11] and linear minimum mean squared error (LMMSE) detection [12], have high complexity in practical implementation. Bayesian nonlinear algorithms, such as message passing (MP) [13] and the Markov chain Monte Carlo algorithm, assume that the interference terms are approximately Gaussian-distributed noise. However, in actual UWA communication systems, the interference term may not obey the Gaussian distribution. Although the nonlinear algorithms can approach the optimal performance with a large number of iterations, their complexity is much higher than that of the LMMSE algorithm. In UWA OTFS communication, signal detection has been studied with linear equalizers [14,15,16] under different UWA channels.
Machine learning can be used in wireless communication for signal detection. In [17], supervised machine learning techniques were applied to decode the tag symbols. The input features that form the training data were explored and extracted from the received signal for machine learning-based detectors. In [18], support vector machine (SVM)-based data detection was proposed for optical OFDM in visible light communication. In that work, the SVM detector contained multiple binary classifiers with different classification strategies. The experimental results showed that SVM detection offered improved bit error rate (BER) performance compared with the traditional direct decision method. In [19], an SVM was used to jointly optimize the processing chain of signal detection, feature extraction and signal classification, and the simulation results showed that the SVM performed well. However, classical machine learning has difficulty handling problems with large sample sizes, which deep learning (DL)-based methods can handle well. For example, the convolutional neural network (CNN) and its derivative algorithms can automatically learn the deep features of input digital information for subsequent classification [20]. The recurrent neural network (RNN) is also widely used due to its advantages in processing time series data [20].
In recent years, the DL-based method has shown its potential in communication systems [
21]. Ye et al. replaced the channel equalization and demodulation blocks of the receiver with a five-layer fully connected deep neural network (FC-DNN) in the OFDM system [
22]. The experiment results show that the DNN-based receiver was more reliable than the conventional methods. In the UWA OFDM communication system, an FC-DNN is used to realize the whole signal processing at the receiver [
23,
24], and simulation results show that the FC-DNN offered better bit error rate (BER) performance than conventional algorithms. In [
25], the long short-term memory (LSTM) neural network architecture was employed as the receiving module of the cyclic shift keying spread spectrum UWA communication system. The neural network is fed the communication signals passing through known channel impulse responses in the offline stage and then used to demodulate the received signal in the online stage. For UWA communication, the receiver in [
26] jointly employed a CNN for channel equalization and an FC-DNN for demodulation. Compared with a single DNN-based OFDM receiver, this joint network model can better extract channel information for data recovery. In [
27], DenseNet was proposed to replace the entire information recovery process of block-based MIMO receivers. DenseNet jointly optimizes multiple modules as one system, and its BER performance is better than that of block-by-block receivers. A signal detection scheme based on LSTM was proposed in [
28]. The authors utilized an RNN with the BiLSTM architecture for signal detection in [
29]. The simulation results show that the trained model can trace the characteristics of wireless time-varying channels and achieve accurate and robust signal recovery performance.
For using DL in OTFS systems, Naikoti et al. conducted a preliminary exploration of FC-DNN-based signal detection [
30]. Li et al. proposed a receiver with CNN-based signal detection for OTFS [
31]. Y. K. Enku et al. proposed a 2D-CNN-based OTFS signal detection scheme [
32] and utilized data augmentation to improve the overall performance. In [
33], an FC-DNN was used to replace the signal detection in the UWA OTFS system. It can be seen that the DL-based methods outperformed the conventional methods under complex channels. However, because the DNN and CNN can only extract local features, this paper proposes an OTFS signal detection scheme based on a joint CNN and RNN to utilize both local and sequential features.
The main contributions of this paper are summarized as follows:
We propose a UWA OTFS signal detection method based on a deep neural network. The UWA channel has severe transmission loss, time-varying multi-path propagation and a severe Doppler effect, which make signal detection extremely challenging. Conventional signal detection methods not only have high computational complexity but also require prior knowledge of the noise. DL-based signal detection can recover signals with complex nonlinear interference and noise through training, and it does not need to assume any prior knowledge. In this paper, we propose a DL-based signal detection method for UWA OTFS in the complex, nonlinear UWA channel to improve system performance. To the best of our knowledge, this is the first DL-based signal detector proposed for UWA OTFS communication.
The SC-CNN-BiLSTM network is designed for OTFS signal detection in the complex UWA channel, and it takes advantage of both CNNs and RNNs for feature extraction and sequential data processing. Different from our previous work [33], a totally new neural network structure is proposed for performance improvement. The CNN in the proposed network can extract data features and learn the potential relationship between its input and output. Furthermore, the skip connection (SC) in a CNN can provide the flexibility of data feature fusion for performance improvement. The cascaded BiLSTM in the network can memorize and extract the effective information from sequentially transmitted symbols from the past to the future, which can mitigate the ICI and ISI. For UWA OTFS communication, SC-CNN-BiLSTM signal detection outperforms previously proposed DL-based detectors as well as conventional linear and nonlinear signal detection methods.
The remainder of this paper is organized as follows.
Section 2 presents the UWA OTFS system model.
Section 3 proposes the SC-CNN-BiLSTM scheme for signal detection.
Section 4 evaluates the performance of SC-CNN-BiLSTM-based signal detection with the simulation and experimental data and compares its performance with other signal detection methods.
Section 5 provides a discussion about the results.
Section 6 concludes our research.
2. UWA-OTFS System Model
Compared with wireless radio communication, the Doppler shift in UWA communication is more severe, which is mainly determined by the transmission characteristics of the UWA channel [33].
Table 1 compares the characteristics of the wireless radio channel and the UWA channel. The propagation speed of UWA waves is five orders of magnitude slower than that of radio waves. Due to the severe distance- and frequency-dependent attenuation, the available frequency for long-range communication is only roughly in the kHz range. Due to these factors, even a slight movement can cause obvious Doppler shifts in the UWA channel.
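In Table 1, the carrier frequency offset (CFO) can be calculated as $f_{\mathrm{CFO}} = f_c v / c$, where $f_c$ is the carrier frequency, $c$ is the wave propagation speed and $v$ is the speed of movement between the transceivers. The normalized CFO can be calculated as $f_{\mathrm{CFO}} / \Delta f$, where $\Delta f$ is the subcarrier spacing, which represents the impact of the CFO relative to the subcarrier spacing. For example, as shown in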
Table 1, for a relative moving speed of 1 m/s, the normalized CFO is about
for radio communication and 1 for UWA communication. Doppler shifts on subcarriers have more severe impact on UWA communication performance than radio systems. In the UWA OTFS system, the Doppler effect will cause severe ISI and ICI. This paper will enhance the performance of the OTFS system with the power of deep learning.
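The normalized CFO comparison can be checked with a few lines of arithmetic. The sketch below is only illustrative: the radio parameters (2.4 GHz carrier, 15 kHz subcarrier spacing) and the UWA subcarrier spacing (4 Hz) are assumed values chosen for this example, not the entries of Table 1, and the function names are hypothetical.

```python
def doppler_shift_hz(speed_mps, carrier_hz, wave_speed_mps):
    """Carrier frequency offset caused by relative motion: f_c * v / c."""
    return carrier_hz * speed_mps / wave_speed_mps


def normalized_cfo(speed_mps, carrier_hz, wave_speed_mps, subcarrier_spacing_hz):
    """CFO expressed as a fraction of the subcarrier spacing."""
    cfo = doppler_shift_hz(speed_mps, carrier_hz, wave_speed_mps)
    return cfo / subcarrier_spacing_hz


# Assumed parameter sets for illustration only (NOT the values of Table 1).
radio = normalized_cfo(1.0, 2.4e9, 3.0e8, 15e3)  # radio link, 1 m/s relative speed
uwa = normalized_cfo(1.0, 6e3, 1500.0, 4.0)      # acoustic link, 1 m/s relative speed

print(f"radio: {radio:.2e}, UWA: {uwa:.2f}")
```

Under these assumptions, the same 1 m/s of relative motion shifts the acoustic signal by a full subcarrier spacing, while the radio signal moves by a small fraction of one.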
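Figure 1 shows the block diagram of the UWA OTFS system. At the transmitter, the modulation module maps the one-dimensional constellation symbols $\mathbf{x}$ into 2D transmission symbols $x[k,l]$, $k = 0, \ldots, N-1$, $l = 0, \ldots, M-1$, with the specified modulation mode (e.g., BPSK or QPSK). The 2D symbols are distributed over the $N \times M$ OTFS delay-Doppler data grid. Then, the symbols in the DD domain are converted to the TF domain by an inverse symplectic finite Fourier transform (ISFFT) as
$$X[n,m] = \frac{1}{\sqrt{NM}} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} x[k,l]\, e^{j2\pi\left(\frac{nk}{N} - \frac{ml}{M}\right)}.$$
The TF domain signal is further transformed into the time domain signal $s(t)$ by the Heisenberg transform as
$$s(t) = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} X[n,m]\, g_{tx}(t - nT)\, e^{j2\pi m \Delta f (t - nT)},$$
where $\Delta f$ is the subcarrier spacing, $T$ is the symbol duration and $g_{tx}(t)$ is the transmit pulse-shaping filter.
The channel impulse response (CIR) $h(\tau,\nu)$ in the DD domain can be expressed as
$$h(\tau,\nu) = \sum_{i=1}^{P} h_i\, \delta(\tau - \tau_i)\, \delta(\nu - \nu_i),$$
where $P$ is the number of paths, $h_i$ is the channel coefficient of path $i$ and $\nu_i$ and $\tau_i$ are the frequency bias and time delay of path $i$, respectively. We assume that the CIRs are perfectly known at the receiver.
The transmitted signal goes through the UWA channel, which is represented by the CIR and additive noise. The received signal can be expressed as
$$r(t) = \iint h(\tau,\nu)\, s(t - \tau)\, e^{j2\pi\nu(t - \tau)}\, d\tau\, d\nu + w(t),$$
where $w(t)$ is the additive white Gaussian noise.
At the receiver, the received time domain signal $r(t)$ is converted into a TF domain signal through a Wigner transform as
$$Y[n,m] = \int g_{rx}^{*}(t - nT)\, r(t)\, e^{-j2\pi m \Delta f (t - nT)}\, dt,$$
where $g_{rx}(t)$ is the receive pulse-shaping filter, $n = 0, \ldots, N-1$ and $m = 0, \ldots, M-1$.
Then, the TF domain signal is converted into a DD domain signal by a symplectic finite Fourier transform (SFFT) as
$$y[k,l] = \frac{1}{\sqrt{NM}} \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} Y[n,m]\, e^{-j2\pi\left(\frac{nk}{N} - \frac{ml}{M}\right)}.$$
After parallel-to-serial conversion, $y[k,l]$ is converted to the vector $\mathbf{y}$ of size $NM \times 1$.
Finally, signal detection and demodulation are performed to recover the transmitted signal as $\hat{\mathbf{x}}$.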
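The ISFFT and SFFT steps of this chain can be illustrated with a small numerical round trip. The sketch below assumes the commonly used unitary form of the transform pair, with Doppler index k, delay index l, and a 1/sqrt(NM) scaling; the function names and the toy 4 x 8 BPSK grid are hypothetical, and the channel, pulse shaping and noise are left out.

```python
import cmath
import math


def isfft(x_dd):
    """Map a delay-Doppler grid x[k][l] (N x M) to the time-frequency grid X[n][m]."""
    n_dopp, m_delay = len(x_dd), len(x_dd[0])
    scale = 1.0 / math.sqrt(n_dopp * m_delay)
    return [
        [
            scale * sum(
                x_dd[k][l] * cmath.exp(2j * math.pi * (n * k / n_dopp - m * l / m_delay))
                for k in range(n_dopp)
                for l in range(m_delay)
            )
            for m in range(m_delay)
        ]
        for n in range(n_dopp)
    ]


def sfft(y_tf):
    """Inverse of isfft: map the time-frequency grid Y[n][m] back to y[k][l]."""
    n_dopp, m_delay = len(y_tf), len(y_tf[0])
    scale = 1.0 / math.sqrt(n_dopp * m_delay)
    return [
        [
            scale * sum(
                y_tf[n][m] * cmath.exp(-2j * math.pi * (n * k / n_dopp - m * l / m_delay))
                for n in range(n_dopp)
                for m in range(m_delay)
            )
            for l in range(m_delay)
        ]
        for k in range(n_dopp)
    ]


# Toy BPSK grid (N = 4, M = 8), sent through an ideal channel.
grid = [[1.0 if (k + l) % 3 else -1.0 for l in range(8)] for k in range(4)]
recovered = sfft(isfft(grid))
round_trip_error = max(
    abs(recovered[k][l] - grid[k][l]) for k in range(4) for l in range(8)
)
print(f"max round-trip error: {round_trip_error:.1e}")
```

With an ideal channel the two transforms cancel, so any residual after the SFFT in a real link is due to the channel and noise, which is what the detector has to undo.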
The multi-path effect and Doppler shift in UWA communication are more severe than those in radio communication. In UWA communication, assume that the maximum delay of the time-varying UWA channel is $\tau_{\max}$ and the maximum Doppler shift is $\nu_{\max}$. In OTFS modulation, from the view of the DD domain, the OTFS parameter design is related to the channel conditions. In the Doppler axis, $\Delta f$ determines the maximum supportable Doppler shift as $\nu_{\max} < \Delta f$. In the delay axis, $T$ determines the maximum supportable multi-path delay as $\tau_{\max} < T$. In time-varying UWA channels, the maximum multi-path delay is large, so the corresponding designed value of $\Delta f$ should be small, and $T$ is as large as $1/\Delta f$.
To support a certain data rate with $M$ subcarriers per frame, the OTFS system is designed with a total bandwidth $B = M \Delta f$ and frame duration $T_f = NT$. Since $\Delta f$ is small, $M$ should be large for high data rate communication. Since $T$ is large and the frame duration should not be too long because of demodulation latency, $N$ cannot be too large. For effective OTFS communication in the UWA channel, a small value of $N$ and a large value of $M$ should therefore be selected.
Based on the above analysis of UWA OTFS, a large $M$ and small $\Delta f$ result in a high resolution in frequency, which is sensitive to inter-carrier interference. Additionally, when $T$ is large and the value of $N$ is small, the resolution of the corresponding Doppler shift decreases, which will affect the accuracy of signal detection. For the challenging UWA OTFS communication, this paper designs a DL-based signal detector for data recovery in the UWA channel with severe interference.
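The design trade-off above can be made concrete with a short calculation. The sketch below assumes the usual OTFS relations (symbol duration equal to the inverse subcarrier spacing, delay resolution equal to the inverse bandwidth, Doppler resolution equal to the inverse frame duration). The 8 x 64 grid and the 100 ms delay follow the system set-up reported later in the paper, while the 4 Hz subcarrier spacing and the 2 Hz maximum Doppler shift are assumed values for illustration; the helper name is hypothetical.

```python
def otfs_grid_summary(n_symbols, m_subcarriers, subcarrier_spacing_hz,
                      max_delay_s, max_doppler_hz):
    """Derive frame-level quantities from the grid size and subcarrier spacing."""
    symbol_duration_s = 1.0 / subcarrier_spacing_hz
    bandwidth_hz = m_subcarriers * subcarrier_spacing_hz
    frame_duration_s = n_symbols * symbol_duration_s
    return {
        "symbol_duration_s": symbol_duration_s,
        "bandwidth_hz": bandwidth_hz,
        "frame_duration_s": frame_duration_s,
        "delay_resolution_s": 1.0 / bandwidth_hz,
        "doppler_resolution_hz": 1.0 / frame_duration_s,
        # The channel is supportable when its spread fits inside one grid period.
        "delay_supported": max_delay_s < symbol_duration_s,
        "doppler_supported": max_doppler_hz < subcarrier_spacing_hz,
    }


# N = 8, M = 64, 100 ms delay spread; 4 Hz spacing and 2 Hz Doppler are assumptions.
summary = otfs_grid_summary(8, 64, 4.0, 0.100, 2.0)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these numbers a small N keeps the frame at 2 s, but the Doppler resolution is then only 0.5 Hz, which illustrates why a coarse Doppler grid makes detection harder.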
3. SC-CNN-BiLSTM-Based Signal Detection for UWA-OTFS
Figure 2 shows the proposed deep learning-based OTFS system, where the transmitter is the same as in the typical OTFS system and the detection module is replaced by SC-CNN-BiLSTM. We assume that the CIR is known in the detection module.
SC-CNN-BiLSTM training is performed using training data known at both the transmitter and receiver. The training data are pseudo-randomly generated by the transmitter and sent to the receiver through the DD channel. The received signal vector y and the transmitted signal vector x are used for training the neural network. After training, the real and imaginary parts of y in the validation set are used as input to the SC-CNN-BiLSTM to recover the unknown transmitted data.
3.1. Architecture of the Proposed SC-CNN-BiLSTM Detector
For UWA OTFS with a severe Doppler effect, a DL-based signal detection method is designed by cascading the skip connection CNN and BiLSTM. The architecture of SC-CNN-BiLSTM for UWA OTFS is shown in Figure 2. It includes the following layers:
SC-CNN layer: The SC-CNN layer extracts local signal features and learns the hidden relationship between the input and output.
BiLSTM layer: The BiLSTM layer can extract features of time series data from both the forward and backward directions and keep correlated and ignore uncorrelated information by the gates structure. It can mitigate interference for UWA OTFS.
FC layer: A fully connected layer with a sigmoid activation function is used to output soft bits for signal detection.
SC-CNN layer:
The CNN is a type of feedforward neural network structure with convolution calculations. With the advantage of the convolution operation, the CNN can extract and express the internal complex correlation of signals, which plays the role of the mapping function. Meanwhile, its weight-sharing structure significantly reduces the number of weights and the network complexity. In the SC-CNN layer of the proposed SC-CNN-BiLSTM network, the CNN consists of three convolutional neural network layers and three deconvolutional neural network (DeCNN) layers. The multiple convolutional layers are used to extract the signal features and internal correlations. The hidden layers do not produce the final output directly; the output of each layer is the input of the next hidden layer. Accordingly, the output of the CNN can be expressed as
$$\mathbf{z} = f_n\left(f_{n-1}\left(\cdots f_1(\mathbf{y})\right)\right),$$
where $\mathbf{y}$ is the input data, $\mathbf{z}$ is the output of the CNN, and the function $f_l(\cdot)$ represents the operation in the $l$th convolutional layer.
In the neural network structure, skip connections create short paths from earlier layers to later layers. They not only reuse information for training but also ease the vanishing gradient problem in network backpropagation. In the proposed SC-CNN layer, we add symmetrical skip connections to transfer learned feature maps from earlier layers to the current layer. Each DeCNN layer not only takes the output of the previous layer as input but is also skip connected to the corresponding earlier CNN layer. This mechanism enhances feature reusability. With the output in the same dimensionality from the earlier layers, the SC-CNN can learn more effective information through the interactions of layers. As shown in
Figure 2, in SC-CNN-BiLSTM, the first DeCNN layer takes the output of the previous CNN layer as its input. Starting from the second DeCNN layer, the fused feature vector of the SC-CNN can be expressed as
where the function
represents the feature fusion of different network layers,
n is the number of layers for the CNN or DeCNN and the total number of layers is
, while
l represents the
lth CNN or DeCNN layer. The input of
is the fusion of output of
and
.
BiLSTM layer:
As shown in
Figure 3, in an SC-CNN-BiLSTM cascaded neural network, the BiLSTM layer includes two LSTM networks in different directions: LSTM-F and LSTM-B. The input sequences are passed into LSTM-F in the forward direction and LSTM-B in the backward direction. The outputs of these two LSTM networks are combined and passed to the subsequent BiLSTM layers. In the forward layer, the calculation is performed from the start time to the time
t, and the output of the forward hidden layer at each time is obtained and saved. The backward layer is calculated in reverse along the time axis, and the output of the backward hidden layer at each time is also obtained and saved. Finally, at each moment, the final output can be achieved by combining the corresponding output results of the forward layer and the backward layer, which can be expressed as
where
and
represent the output of the forward calculation and backward calculation at time
t, respectively,
represents the final output of the BiLSTM,
represents the input of the current LSTM,
represents the output of the last LSTM,
indicates the output of LSTM in opposite directions,
are the corresponding weights of the variables,
is the cell state in LSTM and
is the forgetting factor. BiLSTM can learn more comprehensive intrinsic correlation of the input series signal by learning from the past to the future and from the future to the past. Therefore, it can improve the performance of signal detection in UWA OTFS.
BiLSTM consists of multiple LSTM cells.
Figure 4 shows the basic structure of the LSTM cell. With collaboration of the input gates, forget gates and output gates, LSTM can memorize important information and solve the problem of long-term dependence on data in the learning process.
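First, the forget gate of the LSTM decides which information to forget, producing $f_t$ for the cell state update. Then, the input gate updates the important information $i_t$ in the learning process and determines the useful information retained in the cell state $c_t$. The cell state is calculated as
$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right),$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right),$$
$$\tilde{c}_t = \tanh\left(W_c [h_{t-1}, x_t] + b_c\right),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t.$$
Finally, the output gate calculates the forgetting factor $o_t$ according to $h_{t-1}$ and $x_t$, and it obtains the final output $h_t$ according to $o_t$ and the cell state $c_t$:
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right),$$
$$h_t = o_t \odot \tanh(c_t),$$
where $x_t$ is the current input, $h_{t-1}$ is the previous output, $W_{*}$ is the weight of gate $*$, $b_{*}$ is the bias of gate $*$, $\sigma(\cdot)$ is the logistic sigmoid function and $\tanh(\cdot)$ is the hyperbolic tangent activation function.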
The BiLSTM layer in SC-CNN-BiLSTM has a strong ability to capture the correlations of times series data. It can not only remember correlated and ignore uncorrelated information by the gates structure, but it also can extract features from both the forward and backward directions. Therefore, for the signal detection of sequential data with interference, BiLSTM can enhance the neural network to better memorize and extract the effective information with sequentially transmitted symbols from the past to the future, which can mitigate the ICI and ISI in UWA communication.
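The gate mechanism and the two-direction processing described in this subsection can be sketched with a toy LSTM whose input and hidden state are scalars. This is a minimal illustration of the standard LSTM update, not the trained network of the paper: the weights in `params` are arbitrary, the helper names are hypothetical, and the forward and backward outputs are simply paired at each time step.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps each gate name to (input weight, recurrent weight, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre("forget"))   # share of the old cell state that is kept
    i_t = sigmoid(pre("input"))    # share of the candidate that is admitted
    g_t = math.tanh(pre("cand"))   # candidate cell content
    o_t = sigmoid(pre("output"))   # share of the cell state that is exposed
    c_t = f_t * c_prev + i_t * g_t
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


def run_lstm(seq, p):
    """Run the cell over a sequence, starting from zero hidden and cell states."""
    h, c, outputs = 0.0, 0.0, []
    for x_t in seq:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)
    return outputs


def run_bilstm(seq, p_fwd, p_bwd):
    """Pair the forward and backward hidden outputs at every time step."""
    fwd = run_lstm(seq, p_fwd)
    bwd = run_lstm(seq[::-1], p_bwd)[::-1]  # process reversed, then realign in time
    return list(zip(fwd, bwd))


# Arbitrary illustrative weights (hypothetical, not trained values).
params = {"forget": (0.5, 0.1, 0.0), "input": (0.6, -0.2, 0.0),
          "cand": (1.0, 0.3, 0.0), "output": (0.4, 0.2, 0.0)}
outputs = run_bilstm([1.0, -1.0, 1.0, 1.0], params, params)
for t, (h_fwd, h_bwd) in enumerate(outputs):
    print(t, round(h_fwd, 4), round(h_bwd, 4))
```

At each time step the forward output depends only on past symbols and the backward output only on future symbols, which is why the pair carries context from both directions.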
FC layer:
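For the output of the SC-CNN-BiLSTM network, there is one FC-DNN layer with 32 neurons and a logistic sigmoid activation function. The 32 neurons correspond to 32 bits to be estimated from 32 consecutive subcarriers. The logistic sigmoid function maps the output values into [0,1] as soft bits, and the soft bits are then processed to obtain the target sent bits, which can be expressed as
$$\hat{\mathbf{x}} = \sigma(\mathbf{W}\mathbf{h} + \mathbf{b}), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},$$
where $\mathbf{h}$ represents the output of BiLSTM and $\mathbf{W}$ and $\mathbf{b}$ are the weights and bias of the FC layer.
Finally, the transmitted bits are obtained according to the decision formula as
$$\hat{b}_k = \begin{cases} 1, & \hat{x}_k \geq 0.5, \\ 0, & \hat{x}_k < 0.5. \end{cases}$$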
3.2. Training of the Proposed SC-CNN-BiLSTM Neural Network
In the training stage, the number of training examples is chosen through trials. We start with a small number of examples and increase it until the SC-CNN-BiLSTM training becomes stable. During training, SC-CNN-BiLSTM learns the mapping between the received vector and the corresponding transmitted vector. After training, the SC-CNN-BiLSTM can be used for signal detection. In the testing stage, the transmitter generates random information bits, modulates the bits by OTFS and transmits the OTFS signal over the UWA channel to the receiver. The receiver utilizes the trained SC-CNN-BiLSTM to detect symbols in the DD domain for data recovery.
The performance of a neural network depends greatly on the training process. First, the loss function should be reasonably designed to provide an accurate measure of the distance between the outputs and true labels. The training process aims to minimize the difference between the original transmitted data sequence
and the signal detection output
through the deep learning model. In this study, we define the loss function
as
where
is the batch size and
represents the bits in the
ith batch.
In addition, the hyperparameters related to the network structure and training will affect the capabilities of neural networks. The learning rate affects the convergence rate and results of the DL network. The adaptive learning rate strategy is employed, which can avoid being trapped in the local optimum. In our training, the initial learning rate was set to 0.001, and the decay factor was set to 0.1.
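One common way to realize an adaptive learning rate is to shrink it whenever the validation loss stops improving. The sketch below assumes this reduce-on-plateau rule purely for illustration, since the text does not specify the exact schedule: the initial rate of 0.001 and the decay factor of 0.1 follow the text, while `patience`, `min_delta` and the function name are hypothetical.

```python
def reduce_on_plateau(val_losses, initial_lr=1e-3, factor=0.1, patience=3, min_delta=0.0):
    """Return the learning rate in effect during each epoch.

    The rate is multiplied by `factor` after `patience` consecutive epochs
    without an improvement of more than `min_delta` in the validation loss.
    """
    lr, best, wait, history = initial_lr, float("inf"), 0, []
    for loss in val_losses:
        history.append(lr)
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr, wait = lr * factor, 0
    return history


# Toy validation losses: improvement stalls at 0.8 for three epochs.
schedule = reduce_on_plateau([1.0, 0.8, 0.8, 0.8, 0.8, 0.7])
print(schedule)
```

In Keras, the `ReduceLROnPlateau` callback implements the same idea with a `factor` argument, so a decay factor of 0.1 would map onto it directly if that is the strategy in use.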
For training optimizer selection, we compared the performance of three typical optimizers: the stochastic gradient descent (SGD) optimizer, adaptive moment estimation (Adam) optimizer and root mean square propagation (RMSprop) optimizer. The test was conducted with an OTFS dataset that went through an experimental channel. As shown in
Figure 5, with the SGD optimizer, the loss of the proposed neural network did not converge well during the training process. The Adam optimizer converged better than the SGD optimizer, with a much lower loss, and the RMSprop optimizer converged best. Therefore, our proposed SC-CNN-BiLSTM employed the RMSprop optimizer for training.
4. Numerical Results
4.1. System Set-Up
Both the simulation channel and experimental channel were used to evaluate the performance of the proposed signal detection scheme. The OTFS frame size was set to $N \times M = 8 \times 64$, which means each frame had 8 symbols and 64 subcarriers in the TF domain. The carrier frequency was set to 6 kHz. The maximum multi-path delay in the sea experiment was about 100 ms, so the subcarrier spacing was set to Hz. The sound speed was set to c = 1500 m/s. Binary phase shift keying (BPSK) was utilized for symbol constellation mapping.
The proposed model was implemented on the DL framework of TensorFlow and Keras for training and testing. The parameters of the neural network are listed in
Table 2. In the proposed SC-CNN-BiLSTM, there are
input neurons, where (
) is the frame size of the OTFS. For the output of the SC-CNN-BiLSTM-based OTFS detector, every 32 bits of transmitted data were grouped and predicted by a separately trained model, and the groups were then serially concatenated to form the final output. In the proposed SC-CNN-BiLSTM, the first three convolutional layers used 4, 8 and 16 filters with the rectified linear unit (ReLU) activation function, kernel sizes of 4, 2 and 2 and strides of 4, 2 and 2, respectively. The last three deconvolution layers had 8, 4 and 2 filters, kernel sizes of 2, 2 and 4 and strides of 2, 2 and 4, respectively. The BiLSTM part includes three BiLSTM layers with 30, 20 and 16 hidden units, and layer normalization (LN) was added between consecutive BiLSTM layers to accelerate convergence and prevent overfitting. The BiLSTM layers used a hyperbolic tangent (tanh) activation function. For the output, there was one FC-DNN layer with 32 neurons and a logistic sigmoid activation function. The logistic sigmoid function mapped the output values into [0,1] as soft bits, which were then processed to obtain the target sent bits. The output layer used the regression sigmoid function to find the predicted values of the transmitted symbols [
33].
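The kernel and stride settings listed above make the encoder and decoder symmetric, which is what allows skip connections between layers of equal size. The sketch below only traces tensor shapes under stated assumptions: 1D convolutions without padding, and an input of length 512 (that is, 8 x 64) with two channels for the real and imaginary parts. Both assumptions and the helper names are hypothetical; the filter counts, kernel sizes and strides are those given in the text.

```python
def conv1d_out_len(length, kernel, stride):
    """Output length of an unpadded 1D convolution."""
    return (length - kernel) // stride + 1


def deconv1d_out_len(length, kernel, stride):
    """Output length of an unpadded 1D transposed convolution."""
    return (length - 1) * stride + kernel


def trace_shapes(input_len, input_channels=2):
    """Return (layer name, length, channels) for the six-layer SC-CNN."""
    conv_cfg = [(4, 4, 4), (8, 2, 2), (16, 2, 2)]    # (filters, kernel, stride)
    deconv_cfg = [(8, 2, 2), (4, 2, 2), (2, 4, 4)]   # (filters, kernel, stride)
    shapes = [("input", input_len, input_channels)]
    length = input_len
    for idx, (filters, kernel, stride) in enumerate(conv_cfg, start=1):
        length = conv1d_out_len(length, kernel, stride)
        shapes.append((f"conv{idx}", length, filters))
    for idx, (filters, kernel, stride) in enumerate(deconv_cfg, start=1):
        length = deconv1d_out_len(length, kernel, stride)
        shapes.append((f"deconv{idx}", length, filters))
    return shapes


for name, length, channels in trace_shapes(512):
    print(f"{name:8s} length={length:4d} channels={channels}")
```

Under these assumptions the first deconvolution output matches the second convolution output, the second matches the first, and the last matches the input, so each pair can be fused by a symmetric skip connection without reshaping.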
We generated 60,000 OTFS frame samples under time-varying delay-Doppler channels. The data samples were divided into the training set, validation set and test set at a ratio of 4:1:1.
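As a quick check of the split sizes, the 4:1:1 ratio applied to 60,000 frames can be computed directly. The helper name below is hypothetical.

```python
def split_counts(total, ratio):
    """Split `total` samples according to integer ratio parts."""
    unit, remainder = divmod(total, sum(ratio))
    if remainder:
        raise ValueError("total is not divisible by the sum of the ratio parts")
    return tuple(unit * part for part in ratio)


train_count, val_count, test_count = split_counts(60_000, (4, 1, 1))
print(train_count, val_count, test_count)
```

This gives 40,000 training frames and 10,000 frames each for validation and testing.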
The BER performance of the following signal detection methods will be compared:
We will evaluate the performance of SC-CNN-BiLSTM in both the simulation and experimental channels with the above system settings and also consider the non-ideal factor of signal processing in practical underwater acoustic communication.
4.2. Simulation Results
We considered a statistical channel simulation model in a mobile communication scenario, where the channel gains followed an independent Rayleigh distribution. The simulation parameter settings are shown in
Table 3. The maximum multi-path delay was set to
ms. There was a total number of eight random multi-paths within the maximum delay range, in which the channel gain followed an independent Rayleigh distribution. The moving speed was set to
knots (1 knot is 1 nautical mile per hour, which is equal to 1.852 kilometers per hour), and the corresponding maximum Doppler spread was
. The Doppler coefficient of each path was generated in
with equal probability.
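A channel realization of the kind described here can be drawn in a few lines. The sketch below is an illustration under assumptions: path delays are drawn uniformly up to the maximum delay, Doppler shifts are drawn uniformly between minus and plus the maximum Doppler shift, and gains are unit-variance complex Gaussian so that their magnitudes are Rayleigh distributed. The 4-knot speed and 10 ms maximum delay are placeholder values, not the settings of Table 3, and the function names are hypothetical.

```python
import math
import random

KNOT_IN_MPS = 1852.0 / 3600.0  # 1 knot = 1.852 km/h


def max_doppler_hz(speed_knots, carrier_hz=6000.0, sound_speed_mps=1500.0):
    """Maximum Doppler shift f_c * v / c for a speed given in knots."""
    return speed_knots * KNOT_IN_MPS * carrier_hz / sound_speed_mps


def draw_channel(num_paths, max_delay_s, nu_max_hz, rng):
    """Draw (gain, delay, doppler) triples for one channel realization."""
    sigma = math.sqrt(0.5)  # real and imaginary parts each carry half the power
    paths = []
    for _ in range(num_paths):
        gain = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))  # Rayleigh magnitude
        delay = rng.uniform(0.0, max_delay_s)        # assumed uniform delay
        doppler = rng.uniform(-nu_max_hz, nu_max_hz)  # assumed uniform Doppler
        paths.append((gain, delay, doppler))
    return paths


nu_max = max_doppler_hz(4.0)                       # placeholder speed of 4 knots
paths = draw_channel(8, 0.010, nu_max, random.Random(0))
for gain, delay, doppler in paths:
    print(f"|h|={abs(gain):.3f}  tau={delay * 1e3:.2f} ms  nu={doppler:+.2f} Hz")
```

At a 6 kHz carrier each knot of relative speed contributes roughly 2 Hz of Doppler shift, which is already comparable to the subcarrier spacings discussed for UWA OTFS.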
Figure 6 shows the BER comparison of multiple signal detection methods for UWA OTFS. At the BER of
, the proposed SC-CNN-BiLSTM could achieve about 5.5 dB, 3 dB and 1.5 dB of improvement compared with the MP, FC-DNN and 2D-CNN, respectively.
All deep learning-based signal detection methods (SC-CNN-BiLSTM, 2D-CNN and FC-DNN) perform better than conventional linear ZF, LMMSE and nonlinear MP, as DL-based OTFS signal detection methods can use nonlinear operations in neural networks to better fit data in the DD domain compared with the linear-based method. Compared with nonlinear MP, DL-based OTFS detection can fit the input-output relationship through iterative optimization of the parameters and avoid falling into a local optimum for better performance.
The two CNN-based signal detection methods, SC-CNN-BiLSTM and 2D-CNN, performed better than FC-DNN-based signal detection. Because the neurons in a CNN are locally connected and the weights of neurons on the same feature map are shared, the CNN has far fewer parameters, which helps it avoid overfitting and converge faster. This is the major advantage of CNNs compared with fully connected networks. Moreover, the CNN uses the ReLU activation function to mitigate the vanishing gradient problem.
SC-CNN-BiLSTM outperformed the 2D-CNN. A single CNN can only extract local features and cannot process time series data efficiently. In SC-CNN-BiLSTM, we added symmetric skip connections to the six-layer CNN. The SC-CNN can provide more efficient information through interactions of the convolutional and deconvolutional layers. After the SC-CNN extracts the important features of the input vector y, BiLSTM can further focus on effective information in the data sequences to mitigate ICI and ISI.
4.3. Experimental Results
We further evaluated the performance of SC-CNN-BiLSTM-based OTFS signal detection under multiple channels from a sea experimental dataset [
34]. For evaluation of the proposed scheme, we used UWA experimental channels from the WATERMARK dataset. WATERMARK is a benchmark dataset driven by at-sea measurements of the time-varying impulse response. In this paper, we employed the raw CIR measured at Norway-Oslofjord (NOF) and Kauai 1 (KAU1). The parameter settings of the experiments are shown in
Table 4.
The CIRs of the NOF channel in the time domain and DD domain are shown in
Figure 7 and
Figure 8. As shown in
Figure 7, the CIR of NOF in the time domain had an obvious time-varying multi-path.
Figure 8 presents the corresponding CIRs in the DD domain, where the Doppler shift for each path can be observed.
As shown in
Figure 9, the BER performance of the proposed SC-CNN-BiLSTM-based signal detection method was compared with other methods. Similar to the results in the simulation channel, the proposed SC-CNN-BiLSTM-based signal detection showed the best performance under the NOF UWA experimental channel. The proposed method had 5 dB, 2.5 dB and 2 dB gains at a BER of
compared with MP-, FC-DNN- and 2D-CNN-based signal detection methods.
Compared with the 2D-CNN, SC-CNN-BiLSTM enhanced the performance by employing skip connections for data reuse and BiLSTM in an RNN for time series data processing. Specifically, in SC-CNN-BiLSTM, skip connections with the CNN can provide more efficient information through the interactions of the convolutional and deconvolutional layers. After the SC-CNN extracts the features of the input vector, LSTM can further focus on effective information in the data sequences to mitigate ICI and ISI. A single CNN can only extract local features and cannot process time series data as efficiently as an RNN.
Compared with DNN-based signal detection, our proposed method utilized a CNN and BiLSTM cascaded network for better data fitting than a single neural network. There are non-convex optimization and gradient disappearance problems in the FC-DNN, which limit its robustness.
Compared with nonlinear MP detection, SC-CNN-BiLSTM can converge to the optimum, whereas MP may get trapped in a local optimum and has high complexity during iteration. Compared with linear methods, such as LMMSE and ZF, the SC-CNN-BiLSTM signal detection method can use nonlinear operations in the neural network to better fit data in the DD domain.
The CIRs of the KAU1 channel in the time and DD domains are shown in
Figure 10 and
Figure 11, respectively. In both the time and DD domains, the CIR structure was more complex than that for NOF. The channel variations were also more obvious than those for NOF. In the DD domain, it can be seen that the maximum Doppler shift of KAU1 was larger than that for NOF, which was about 4 Hz. Note that in these two sea experiments, although the transmitter and receiver were deployed in fixed locations, the Doppler shift was still evident. The Doppler shift in practical UWA channels is severe and complex, and it is caused by the multiple unique characteristics of the UWA environment. For example, the movement of seawater can cause the transceiver to move.
In
Figure 12, the BER performance of multiple signal detection methods is similar to that of the NOF channel. In the KAU1 experimental channel, SC-CNN-BiLSTM-based signal detection could achieve 4.5 dB, 2.5 dB and 1.5 dB SNR gains at a BER of
compared with the MP, FC-DNN and 2D-CNN methods, respectively. Owing to the specific design of the neural network, our proposed method outperformed both DL-based signal detection and conventional linear or nonlinear signal detection methods.
When comparing the BER performance of SC-CNN-BiLSTM in NOF (
Figure 9) and KAU1 (
Figure 12), the BER performance at NOF was better than that at KAU1. The multi-path structure and Doppler shift of KAU1 were more severe than those at NOF, which would degrade the BER performance. In the two experimental channels, SC-CNN-BiLSTM-based OTFS detection outperformed all the other signal detection methods.
4.4. Robustness Analysis with UWA Non-Ideal Channel Estimation
In the actual UWA communication system, many uncertainties will make the estimated CSI non-ideal. It can be seen from the literature [
35] that when the system obtains the CSI with errors, the performance of the system will degrade. In this subsection, the impact of non-ideal channel estimation on the proposed model will be analyzed. Assume that the channel estimation errors follow a Gaussian distribution with a zero mean and a certain variance.
The KAU1 experimental channel dataset was used to evaluate the multiple signal detection methods with channel estimation error. In the simulation, the variance of the channel estimation error was set to 0.1.
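The error model described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function name `add_estimation_error`, the array shape and the placeholder channel are hypothetical, and the error is assumed here to be circularly symmetric complex Gaussian, so that the stated variance of 0.1 is split evenly between the real and imaginary parts.

```python
import numpy as np


def add_estimation_error(h_true, error_variance=0.1, rng=None):
    """Return h_true plus zero-mean Gaussian error with the given total variance."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: complex error with independent real and imaginary parts,
    # each carrying half of the total variance.
    scale = np.sqrt(error_variance / 2.0)
    error = (rng.normal(0.0, scale, h_true.shape)
             + 1j * rng.normal(0.0, scale, h_true.shape))
    return h_true + error


rng = np.random.default_rng(0)
# Placeholder channel matrix (hypothetical size); a measured CIR would go here.
h_true = np.zeros((256, 256), dtype=complex)
h_est = add_estimation_error(h_true, error_variance=0.1, rng=rng)

# Empirical check that the injected error has the requested statistics.
empirical_mean = np.mean(h_est - h_true)
empirical_variance = np.mean(np.abs(h_est - h_true) ** 2)
print(abs(empirical_mean), empirical_variance)
```

The perturbed estimate `h_est` would then be passed to each detector in place of the ideal CSI.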
As shown in
Figure 13, with the channel estimation error, the BER performance of the conventional signal detection methods degraded markedly. In the experimental channel, the BER of ZF and LMMSE rose almost to the error floor, and the BER of MP increased by up to two orders of magnitude.
The BER performance of the DL-based signal detection methods degraded less sharply than that of the conventional methods. As shown in
Figure 13, compared with the other two DL-based signal detection methods, SC-CNN-BiLSTM-based signal detection could recover data more accurately in the presence of channel estimation error. In the KAU1 experimental channel, at a BER of
, the FC-DNN, 2D-CNN and SC-CNN-BiLSTM methods with channel estimation error had SNR losses of 3 dB, 2 dB and 1.5 dB, respectively, compared with the same methods under ideal channel estimation. The robustness of SC-CNN-BiLSTM signal detection can be attributed to the proposed cascaded network: the SC-CNN fuses information from the interaction of the current layer and the previous layer for more effective signal feature extraction, and the BiLSTM continuously stores and extracts valid information from symbols that are transmitted sequentially from the past to the future.
4.5. Computational Complexity Analysis
Table 5 compares the computational complexity of the proposed SC-CNN-BiLSTM signal detection method with that of the other methods.
The complexity of SC-CNN-BiLSTM can be calculated as the sum of the complexities of the CNN and BiLSTM components. According to [
36], the computational complexity of the CNN can be defined as
, where
is the number of CNN layers used to construct the model,
l is the index of a convolutional layer,
is the number of filters (also known as the width) in the
lth layer,
is the number of input channels of the
lth layer,
is the kernel size and
is the OTFS frame size for the network input. As all the other parameters are constant during the training and testing phase, the overall complexity is a linear function of
expressed as
. Meanwhile, the computational complexity of the BiLSTM can be defined as
, where
is the dimension of each cell,
is the size of the input and
in our system. The complexity of BiLSTM can be expressed as
. Thus, the SC-CNN-BiLSTM complexity is a linear function expressed as
.
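The linear scaling with frame size can be illustrated with a rough operation count. The sketch below is an assumption-laden illustration rather than the paper's formulas: it uses the standard multiply-accumulate (MAC) count for stride-1 convolutions with "same" padding and the standard four-gate LSTM cell cost, it assumes the BiLSTM runs one step per frame element, and the layer sizes, cell size and function names are all hypothetical.

```python
def cnn_macs(layers, frame_size):
    """MACs of stacked stride-1, 'same'-padded 2D convolutions.

    layers: list of (in_channels, out_channels, kernel_size) tuples.
    frame_size: number of grid points in the input frame.
    """
    # Each output point of a layer costs in_channels * k * k * out_channels MACs.
    return sum(c_in * k * k * c_out * frame_size for c_in, c_out, k in layers)


def bilstm_macs(input_size, cell_size, steps):
    """MACs of a bidirectional LSTM over `steps` time steps (biases ignored)."""
    # Four gates, each a matrix-vector product over [input, hidden state].
    per_step = 4 * cell_size * (input_size + cell_size)
    # Two directions (forward and backward).
    return 2 * per_step * steps


# Hypothetical network sizes, chosen only to show the scaling behaviour.
layers = [(2, 16, 3), (16, 16, 3)]
for frame_size in (64, 128, 256):
    total = cnn_macs(layers, frame_size) + bilstm_macs(16, 32, frame_size)
    print(frame_size, total)
```

With all layer parameters held fixed, doubling the frame size doubles both counts, which is the sense in which the combined complexity is linear in the frame size.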
The complexity of the 2D-CNN is
[
32]. The complexity of the MP-based method is
, where
is the number of iterations,
is the number of non-zero channel taps and
is the modulation bit size. Thus, the complexity of MP depends on the sparsity level of the channels. The complexity of LMMSE and ZF is
.
From
Table 5, we can see that the complexity of SC-CNN-BiLSTM was lower than those of the MP, LMMSE and ZF methods but higher than that of the 2D-CNN. Consider complexity and BER performance together. SC-CNN-BiLSTM was better than FC-DNN, MP, LMMSE and ZF signal detection in both complexity and BER performance. Although the proposed method has higher complexity than the 2D-CNN, it can achieve better BER performance. The complexity of SC-CNN-BiLSTM is a linear function expressed as
, and the complexity of the 2D-CNN is
. Therefore, the proposed SC-CNN-BiLSTM signal detection method has higher complexity than the 2D-CNN. For BER performance, in
Figure 6, at a BER of
, the proposed SC-CNN-BiLSTM could achieve about 1.5 dB of improvement compared with the 2D-CNN under a statistical simulation channel model. In
Figure 9, at a BER of
, the proposed method could achieve about 2 dB of improvement compared with the 2D-CNN under the NOF channel. In
Figure 12, at a BER of
, the proposed method could achieve about 1.5 dB of improvement under the KAU1 channel. In summary, SC-CNN-BiLSTM had higher complexity than the 2D-CNN but achieved better BER performance under all the tested channel conditions.