Article

Low-Complexity Pruned Convolutional Neural Network Based Nonlinear Equalizer in Coherent Optical Communication Systems

1 School of Information and Electronics, Beijing Institute of Technology (BIT), Beijing 100081, China
2 School of Electronic Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(14), 3120; https://doi.org/10.3390/electronics12143120
Submission received: 10 June 2023 / Revised: 8 July 2023 / Accepted: 13 July 2023 / Published: 18 July 2023
(This article belongs to the Special Issue High-Speed Optical Communication and Information Processing)

Abstract

Nonlinear impairments caused by devices and fiber transmission links in a coherent optical communication system can severely limit its transmission distance and achievable capacity. In this paper, we propose a low-complexity pruned-convolutional-neural-network (CNN)-based nonlinear equalizer to compensate for nonlinear signal impairments in coherent optical communication systems. Increasing the size of the effective receptive field with an 11 × 11 large convolutional kernel enhances the feature extraction performance of the CNN and simplifies its structure, while a channel-level pruning algorithm removes insignificant channels and dramatically reduces the complexity of the CNN model. Together, these operations retain the important components of the CNN model while reducing its width and computation amount. The performance of the proposed CNN-based nonlinear equalizer was experimentally evaluated in a 120 Gbit/s 64-quadrature-amplitude-modulation (64-QAM) coherent optical communication system over 375 km of standard single-mode fiber (SSMF). The experimental results showed that, compared to a CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel, the proposed equalizer with an 11 × 11 large convolutional kernel saved, after channel-level pruning, approximately 15.5% space complexity and 43.1% time complexity without degrading the equalization performance. The proposed low-complexity pruned-CNN-based nonlinear equalizer therefore has great potential for application in realistic devices and holds promising prospects for coherent optical communication systems.

1. Introduction

Breakthroughs and the continuous development of emerging technologies—such as cloud computing, artificial intelligence and the mobile internet—have propelled modern society into the “Big Data Era”. The increasing demand for super-large data storage, transmission and sharing services has further driven the explosive growth of network traffic. Therefore, modern communication networks need higher transmission rates, larger transmission capacity and better transmission quality to meet the growing demand of network traffic [1,2,3,4,5]. Improving the capacity of optical fiber communication systems has become an urgent issue for economic and social development [6,7,8,9,10].
Coherent optical communication technology—which combines a high-order modulation format, coherent detection technology and digital signal processing technology—can achieve high spectral efficiency, long distance and high-capacity signal transmission, and is an important technology for coping with the traffic crisis in modern communication networks. However, in current high-speed coherent optical communication systems, nonlinear impairment caused by devices and fiber transmission links is the most important factor limiting the high-capacity and long-distance transmission of optical signals in higher-order modulation formats [11,12,13]. Therefore, it is of great significance to explore nonlinear equalization techniques for coherent optical communication systems.
Over the last few years, with the development of machine learning, several machine learning techniques have been explored for nonlinear equalization in optical communication systems [14,15,16,17,18,19,20,21]. Neural network algorithms have achieved excellent performance in the nonlinear equalization of optical communication systems, owing to their powerful nonlinear mapping capabilities between inputs and outputs. Examples include the artificial neural network (ANN) [22,23], the deep neural network (DNN) [24], the convolutional neural network (CNN) [25], the recurrent neural network (RNN) [26,27,28,29], the soft deep neural network (SDNN) [30], the sparsity learning deep neural network [31], the complex-valued neural network [32], the probabilistic neural network (PNN) [33], the echo state network [34] and Bayesian neural networks (BNN) [35]. On this basis, the neural-network-aided perturbation-theory-based fiber nonlinearity compensation technique has been widely investigated and has demonstrated its effectiveness in estimating complex nonlinear distortion fields with perturbation triplets as the input features [36]. In our previous work [37], we constructed a dual-channel feature map based on perturbation triplets as the input features, and a multilayer CNN with a 6 × 6 normal convolutional kernel was used to process the input dual-channel feature map for nonlinear equalization. Significant complexity reductions were obtained compared to perturbative nonlinearity compensation using a fully connected neural network. However, there was scope for further optimizing the structure of the CNN to reduce the complexity without affecting the nonlinear equalization performance.
In this paper, in order to further reduce the complexity and improve the performance of nonlinear equalization, we propose a low-complexity pruned-CNN-based nonlinear equalizer to compensate for nonlinear signal impairments in coherent optical communication systems. An 11 × 11 large convolutional kernel is used to increase the size of the effective receptive field, so that the feature extraction performance of the CNN is enhanced and the structure of the CNN is simplified. A channel-level pruning algorithm is utilized to prune the insignificant channels, whereby the complexity of the CNN can be dramatically reduced. The proposed pruned-CNN-based nonlinear equalizer was experimentally verified in a 120 Gb/s 64-QAM coherent optical communication system with a transmission distance of 375 km. The experimental results showed that the proposed CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel outperformed the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel. Moreover, by further applying the channel-level pruning algorithm, the space complexity and time complexity of the proposed equalizer could be reduced further. The proposed low-complexity pruned-CNN-based nonlinear equalizer shows great potential for coherent optical communication systems.

2. Principles

In this section, we present the feature map construction based on perturbation theory, propose our CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel, introduce the channel-level pruning algorithm and analyze the corresponding complexity.

2.1. Feature Map Construction

For a polarization multiplexing coherent optical communication system, the nonlinear Schrödinger equation that describes the evolution of the optical field is as follows [38]:
$$\frac{\partial u_x(z,t)}{\partial z} + \frac{j\beta_2}{2}\frac{\partial^2 u_x(z,t)}{\partial t^2} = j\frac{8}{9}\gamma \left( \left|u_x(z,t)\right|^2 + \left|u_y(z,t)\right|^2 \right) u_x(z,t); \tag{1}$$
$$\frac{\partial u_y(z,t)}{\partial z} + \frac{j\beta_2}{2}\frac{\partial^2 u_y(z,t)}{\partial t^2} = j\frac{8}{9}\gamma \left( \left|u_y(z,t)\right|^2 + \left|u_x(z,t)\right|^2 \right) u_y(z,t), \tag{2}$$
where $u_x(z,t)$ and $u_y(z,t)$ are the optical fields of the X and Y polarizations, respectively, $\beta_2$ is the group velocity dispersion and $\gamma$ is the nonlinear coefficient. In first-order perturbation theory, the solutions of Equations (1) and (2) are composed of the linear term $u_{0,x}(z,t)$ or $u_{0,y}(z,t)$ and the nonlinear perturbation term $\Delta u_x(z,t)$ or $\Delta u_y(z,t)$:
$$u_x(z,t) = u_{0,x}(z,t) + \Delta u_x(z,t); \tag{3}$$
$$u_y(z,t) = u_{0,y}(z,t) + \Delta u_y(z,t). \tag{4}$$
Assuming that the pulse spreading caused by fiber dispersion is much larger than the symbol duration, the nonlinear perturbation terms for the symbol at t = 0 can be approximated as [39]:
$$\Delta u_x = P_0^{3/2} \sum_{m,n} \left( X_n X_{m+n}^* X_m + Y_n Y_{m+n}^* X_m \right) C_{m,n}; \tag{5}$$
$$\Delta u_y = P_0^{3/2} \sum_{m,n} \left( Y_n Y_{m+n}^* Y_m + X_n X_{m+n}^* Y_m \right) C_{m,n}, \tag{6}$$
where $P_0$ is the launched optical power, $X_n$ and $Y_n$ are the symbol sequences of the X and Y polarizations, $m$ and $n$ are symbol indices relative to the symbols of interest $X_0$ and $Y_0$, and $C_{m,n}$ are the nonlinear perturbation coefficients. The $X_n X_{m+n}^* X_m + Y_n Y_{m+n}^* X_m$ and $Y_n Y_{m+n}^* Y_m + X_n X_{m+n}^* Y_m$ terms are defined as the intra-channel cross-phase modulation (IXPM) and intra-channel four-wave mixing (IFWM) triplets. The triplets do not depend on the link parameters and can be calculated directly from the received symbol sequences.
In this paper, we apply the feature map construction method we proposed in [37], where the feature units (FUs) of received complex M-QAM symbols are defined as the triplets $X_n X_{m+n}^* X_m + Y_n Y_{m+n}^* X_m$ or $Y_n Y_{m+n}^* Y_m + X_n X_{m+n}^* Y_m$. The schematic diagram of the feature map is shown in Figure 1. Each FU of a received complex M-QAM symbol is a complex unit corresponding to the triplet term with particular $n$ and $m$ values. The FUs form a feature map with two channels, Channel 1 and Channel 2, which represent the real and imaginary parts of the feature map, respectively. The superscript and subscript of an FU denote the values of $n$ and $m$, respectively, and $S$ denotes the maximum value of $n$ and $m$. The channel size, i.e., the side length of the square feature map, is $2S+1$.
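As an illustration, the construction above can be sketched in a few lines of NumPy. The function name and the random test sequences are hypothetical; the inputs are assumed to be the received complex symbol sequences after linear equalization:

```python
import numpy as np

def build_feature_map(X, Y, k, S=5):
    """Dual-channel feature map for the X-polarization symbol at index k.

    Each feature unit (FU) is the triplet X_n X_{m+n}^* X_m + Y_n Y_{m+n}^* X_m,
    with m, n offsets relative to the symbol of interest; channel 0 holds the
    real parts and channel 1 the imaginary parts.
    """
    side = 2 * S + 1  # side length of the square feature map
    fmap = np.zeros((2, side, side))
    for i, n in enumerate(range(-S, S + 1)):
        for j, m in enumerate(range(-S, S + 1)):
            fu = (X[k + n] * np.conj(X[k + m + n]) * X[k + m]
                  + Y[k + n] * np.conj(Y[k + m + n]) * X[k + m])
            fmap[0, i, j] = fu.real
            fmap[1, i, j] = fu.imag
    return fmap
```

With S = 5 this yields the 11 × 11 × 2 input used below; note the symbol index k must lie at least 2S symbols away from both ends of the sequence.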

2.2. CNN-Based Nonlinear Equalizer with an 11 × 11 Large Convolutional Kernel

In this paper, as a trade-off between nonlinear equalization performance and compensation complexity, the value of S for the feature map was set to 5. Thus, the size of the corresponding input feature map of our proposed CNN-based nonlinear equalizer was 11 × 11 × 2. The input feature map not only contains the inherent connection between the real and imaginary parts of the triplet terms, but also preserves the position relationships between different FUs.
The architecture of the proposed CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel is shown in Figure 2. The proposed CNN-based nonlinear equalizer contains four layers: an input layer, a convolution layer, a fully connected layer and an output layer. In the input layer, the 11 × 11 × 2 feature map of a received complex M-QAM symbol is fed into the convolution layer. In the convolution layer, n convolutional channels with an 11 × 11 large kernel are utilized to capture the global information of the input feature map, and the output feature map is 1 × 1 × n. Batch normalization [40] is adopted after the convolution layer, as a standard approach to achieving fast convergence and better generalization performance. The output feature map is fully connected to the fully connected layer, whose number of nodes equals the number of classes of the M-QAM signals. The Softmax operation calculates the probabilities that the corresponding received complex M-QAM symbol maps to each class. Then, the output layer outputs the predicted class of the received complex M-QAM symbol, i.e., the class with the maximum probability.
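A minimal PyTorch sketch of this four-layer architecture follows. The class name is hypothetical, the channel count n = 270 anticipates the results section, and training details (optimizer, structural reparameterization) are omitted:

```python
import torch
import torch.nn as nn

class LargeKernelCNNEqualizer(nn.Module):
    """11 x 11 x 2 feature map -> 11 x 11 convolution (valid padding, so the
    output map is 1 x 1 x n) -> batch normalization -> fully connected layer
    with one node per M-QAM class. Softmax is applied implicitly by the
    cross-entropy loss during training, and explicitly at inference to obtain
    class probabilities."""

    def __init__(self, n_channels=270, n_classes=64):
        super().__init__()
        self.conv = nn.Conv2d(2, n_channels, kernel_size=11)
        self.bn = nn.BatchNorm2d(n_channels)
        self.fc = nn.Linear(n_channels, n_classes)

    def forward(self, x):
        z = self.bn(self.conv(x))   # (batch, n, 1, 1)
        z = torch.flatten(z, 1)     # (batch, n)
        return self.fc(z)           # class logits

model = LargeKernelCNNEqualizer()
logits = model(torch.randn(4, 2, 11, 11))  # a batch of 4 feature maps
```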
An 11 × 11 large convolutional kernel is utilized to capture the global information of the input feature map. As shown in Figure 3, compared to the normal 6 × 6 convolutional kernel utilized in our previous work [37], the large 11 × 11 convolutional kernel has the following advantages: (1) it has a larger effective receptive field, so the feature extraction performance of the convolutional layer can be significantly enhanced; (2) to obtain 1 × 1 output maps, only one convolution layer with an 11 × 11 large convolutional kernel is required, whereas multiple convolution layers are required with a 6 × 6 normal convolutional kernel, so the structure of the CNN can be greatly simplified; (3) the effective receptive field of an 11 × 11 large convolutional kernel is large enough to cover the entire input feature map, so the correlation between the FUs in the input feature map can be better captured. Thus, the CNN model with an 11 × 11 large convolutional kernel establishes a closer connection between the input feature map and the output result.

2.3. The Channel-Level Pruning Algorithm

In order to further reduce the CNN model size and decrease its computational complexity without compromising the nonlinear equalization performance, a simple yet effective network training scheme [41] is applied to our proposed CNN-based nonlinear equalizer. This training scheme achieves channel-level sparsity by leveraging a scaling factor for each channel, which effectively identifies unimportant channels in the CNN model so that they can be pruned.
The idea of this channel-level pruning method is to introduce a scaling factor $\gamma$ for each channel, jointly train the network weights and these scaling factors with sparsity regularization, prune the channels with small factors and fine-tune the pruned network, as shown in Figure 4. The scaling factors $\gamma$ act as agents for channel selection: as they are jointly optimized with the network weights, the network can automatically identify insignificant channels, which can be safely removed without greatly affecting the performance. When we prune the channels of a CNN model trained with channel-level sparsity, the channels are sorted from largest to smallest according to the scaling factors $\gamma$, and the bottom $\alpha n$ channels with the smallest scaling factors are pruned. The pruned percentage is denoted by $\alpha$, a manually set percentile among all channels.
The training objective in this training scheme is as follows:
$$L = \sum_{(x,y)} l\left(f(x, W), y\right) + \lambda \sum_{\gamma \in \Gamma} g(\gamma), \tag{7}$$
where $(x, y)$ denotes an input–target pair from the training dataset, $W$ denotes the trainable weights of the CNN model, the first sum is the normal training loss of the CNN (cross-entropy loss in this paper), $g(\cdot)$ is a sparsity-induced penalty on the scaling factors ($g(\gamma) = |\gamma|$, the L1-norm, in this paper) and $\lambda$ balances the two terms. Subgradient descent is adopted as the optimization method for the non-smooth L1 penalty term.
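This objective and the subsequent channel selection can be sketched as follows, reusing the per-channel BatchNorm scaling factors as the $\gamma$ of [41]; the helper names and the value of $\lambda$ are assumptions:

```python
import torch
import torch.nn.functional as F

def sparsity_loss(logits, targets, bn, lam=1e-4):
    """L = cross-entropy + lambda * sum(|gamma|), where gamma are the
    per-channel BatchNorm scaling factors; lam is a hypothetical value,
    as the paper does not report its lambda."""
    return F.cross_entropy(logits, targets) + lam * bn.weight.abs().sum()

def channels_to_prune(bn, alpha=0.15):
    """Sort channels by |gamma| and return the indices of the bottom
    alpha fraction, which are pruned and then fine-tuned away."""
    gamma = bn.weight.detach().abs()
    n_prune = int(alpha * gamma.numel())
    return torch.argsort(gamma)[:n_prune]
```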
Pruning a channel essentially corresponds to removing all the incoming and outgoing connections of that channel: by doing so, a more compact network with fewer parameters and fewer computing operations is obtained. A fine-tuning process on the pruned CNN model can partly compensate for the performance loss caused by channel pruning.

2.4. Complexity Analysis

We analyze the complexity of the proposed CNN-based nonlinear equalizer before and after pruning, focusing on two aspects: the number of parameters of the CNN-based nonlinear equalizer, and the number of multiplications it requires to equalize each M-QAM symbol.
(1) The number of parameters of the CNN-based nonlinear equalizer before pruning mainly comprises the parameters of the convolution layer and those of the fully connected layer. The number of parameters of the convolution layer is the number of weights of the convolutional kernels, $N_{P\_C\_unp} = C_{in} \times C_{out} \times K$. The number of parameters of the fully connected layer is $N_{P\_F\_unp} = C_{out} \times M$. Thus, the number of parameters of the CNN-based nonlinear equalizer before pruning is $N_{P\_unp} = C_{in} \times C_{out} \times K + C_{out} \times M$, where $C_{in}$ and $C_{out}$ denote the numbers of input and output channels, respectively, $K$ denotes the number of weights per convolutional kernel and $M$ denotes the number of nodes in the fully connected layer.
(2) After pruning a percentage $\alpha$ of the channels, $(1-\alpha) C_{out}$ output channels remain. The number of parameters of the convolution layer is $N_{P\_C\_p} = C_{in} \times (1-\alpha) C_{out} \times K + (1-\alpha) C_{out}$. The number of parameters of the fully connected layer is $N_{P\_F\_p} = (1-\alpha) C_{out} \times M$. Thus, the number of parameters of the CNN-based nonlinear equalizer after pruning is $N_{P\_p} = C_{in} \times (1-\alpha) C_{out} \times K + (1-\alpha) C_{out} + (1-\alpha) C_{out} \times M$.
(3) The number of multiplications required for the CNN-based nonlinear equalizer before pruning to equalize each M-QAM symbol mainly comprises the multiplications of the convolution layer and those of the fully connected layer. The number of multiplications of the convolution layer is $N_{M\_C\_unp} = C_{in} \times C_{out} \times K$. The number of multiplications of the fully connected layer is $N_{M\_F\_unp} = C_{out} \times M$. Thus, the number of multiplications of the CNN-based nonlinear equalizer before pruning is $N_{M\_unp} = C_{in} \times C_{out} \times K + C_{out} \times M$.
(4) The number of multiplications required for the CNN-based nonlinear equalizer after pruning to equalize each M-QAM symbol mainly comprises the multiplications of the convolution layer and those of the fully connected layer. The number of multiplications of the convolution layer is $N_{M\_C\_p} = C_{in} \times (1-\alpha) C_{out} \times K$. The number of multiplications of the fully connected layer is $N_{M\_F\_p} = (1-\alpha) C_{out} \times M$. Thus, the number of multiplications of the CNN-based nonlinear equalizer after pruning is $N_{M\_p} = C_{in} \times (1-\alpha) C_{out} \times K + (1-\alpha) C_{out} \times M$.
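These counts can be checked with a small helper. The function name is hypothetical, $\alpha$ is taken as the pruned fraction (so $(1-\alpha) C_{out}$ channels remain) and bias/normalization parameters are omitted for simplicity:

```python
def cnn_equalizer_complexity(C_in=2, C_out=270, K=11 * 11, M=64, alpha=0.0):
    """Parameter count and per-symbol multiplication count for the
    single-convolution-layer equalizer. With a 1 x 1 output map the two
    counts coincide, since every kernel weight is used exactly once
    per equalized symbol."""
    kept = round((1 - alpha) * C_out)  # output channels remaining after pruning
    n_params = C_in * kept * K + kept * M
    n_mults = C_in * kept * K + kept * M
    return n_params, n_mults

unpruned = cnn_equalizer_complexity()            # alpha = 0: no pruning
pruned = cnn_equalizer_complexity(alpha=0.15)    # 15% of channels removed
```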

3. Experimental Setup

The experimental setup for a 120 Gb/s 64-QAM coherent optical communication system with a transmission distance of 375 km is depicted in Figure 5.
At the transmitter, the nominal linewidth of the external cavity laser (ECL) is 100 kHz. Symbol sequences of the 64-QAM signals generated by a MATLAB program are uploaded to an arbitrary waveform generator (AWG) with a sampling rate of 25 GSa/s. Two electric amplifiers (EAs) amplify the two analog signals, which then drive the in-phase/quadrature (I/Q) modulator to generate the modulated optical signal. The polarization multiplexing of the signal is completed by a polarization-division multiplexing (PDM) module, composed of a polarization-maintaining optical coupler (PM-OC), an optical delay line, a polarization controller (PC) and a polarization beam combiner (PBC). The optical signal is then amplified by an erbium-doped fiber amplifier (EDFA), and the optical signal power can be adjusted by a variable optical attenuator (VOA).
The transmission link consists of five spans of G.652D single-mode fiber (SMF), each 75 km long. An EDFA is used to compensate the fiber loss at the end of each span.
At the receiver, the local oscillator (LO) for coherent detection is an ECL with 100 kHz linewidth. An LO, two polarization beam splitters (PBSs), two 90-degree optical hybrids and four balanced photodetectors (BPDs) form an optical polarization and phase-diversity coherent receiver front end. The X- and Y-polarization components of the received optical signal and the local oscillator are separately combined and detected by two identical phase-diversity receivers. A 90-degree optical hybrid and two BPDs form a phase-diversity receiver. Then, the signals are digitized by a 4-channel digital phosphor oscilloscope (DPO) with a 100 GSa/s sampling rate.
The offline DSP consists of a low-pass filter, amplitude normalization, chromatic dispersion compensation (CDC), clock recovery, resampling, the Gram–Schmidt orthogonalizing process (GSOP), constant-modulus-algorithm-(CMA) equalization, frequency offset estimation (FOE), carrier phase estimation (CPE) based on blind phase search, a CNN-based nonlinear equalizer (CNN-based NLE) and 64-QAM demapping.
In our experiment, the measured launched optical power ranged from −4 dBm to 5 dBm. The dataset for each launched optical power contained approximately $2^{20}$ symbols. The whole dataset was divided into training data (80%) and testing data (20%). The CNN models were built, trained and evaluated in PyTorch 1.6.0. Furthermore, structural reparameterization [42] was utilized in the training process, in order for the CNN model to achieve better feature representation.
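The 80/20 split described above can be sketched as follows, with hypothetical placeholder tensors standing in for the experimentally captured feature maps and labels:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Placeholder data: N feature maps of shape (2, 11, 11) with 64-QAM labels.
N = 1000
feature_maps = torch.randn(N, 2, 11, 11)
labels = torch.randint(0, 64, (N,))

dataset = TensorDataset(feature_maps, labels)
n_train = int(0.8 * len(dataset))  # 80% for training, 20% for testing
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
```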

4. Results and Discussion

The number of output channels of the convolution layer is an important parameter influencing the equalization performance and computational complexity of the proposed CNN-based nonlinear equalizer. Figure 6a presents the BER performance of the proposed CNN-based nonlinear equalizer shown in Figure 2, with different numbers of output channels of the convolution layer. The launched optical power was 1 dBm. As shown in the figure, when the number of output channels exceeded 160, the proposed CNN-based nonlinear equalizer achieved a BER below the HD-FEC limit of $3.8 \times 10^{-3}$; when the number of output channels exceeded 310, it achieved a BER below $1.0 \times 10^{-3}$. The more output channels in the convolution layer, the more convolutional kernels there are, and the stronger the ability to globally extract information from the input feature map. However, too many output channels would result in a redundant CNN model structure and excessive computational complexity.
The BER performance of the proposed CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels was similar to that of the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel [37], when the launched optical power was 1 dBm. Figure 6b presents the BER performance of the proposed pruned-CNN-based nonlinear equalizer with 270 output channels at different pruned percentages. It can be seen that, when the pruned percentage was 15%, meaning that 15% of the channels of the CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel were pruned, the BER changed from $1.23 \times 10^{-3}$ to $1.28 \times 10^{-3}$, an acceptable degradation in BER performance.
Figure 7 presents the BER performance of the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel [37], of the proposed unpruned CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels, and of the 15%-pruned version of the latter. The blue line shows the BER performance of the minimum-Euclidean-distance (MED)-based decision; the orange line (A) corresponds to the 6 × 6 normal-kernel equalizer [37]; the yellow line (B) to the proposed unpruned 11 × 11 large-kernel equalizer with 270 output channels; and the purple line (C) to its 15%-pruned counterpart. As shown in the figure, both the unpruned and the 15%-pruned equalizers with an 11 × 11 large convolutional kernel and 270 output channels outperformed the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel.
Figure 8 presents the computational complexity of the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel [37] (orange columns (A)), the proposed unpruned CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels (yellow columns (B)) and the 15%-pruned CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels (purple columns (C)), which achieved similar BER performances ($1.284 \times 10^{-3}$, $1.239 \times 10^{-3}$ and $1.283 \times 10^{-3}$, respectively, at a launched optical power of 1 dBm). Space complexity represents the number of parameters of the CNN-based nonlinear equalizer, and time complexity represents the number of multiplications required to equalize each symbol. Lower space complexity means less memory is occupied, and lower time complexity means less running time is required.
It can be seen that, compared to the CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel, the proposed unpruned CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels saved about 33% time complexity. Furthermore, the 15%-pruned CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels saved about 15.5% space complexity and 43.1% time complexity. The proposed pruned-CNN-based nonlinear equalizer requires less memory and less running time, has great potential for application in realistic devices and is well suited to coherent optical communication systems with higher rate, longer distance and higher capacity requirements.

5. Conclusions

In this paper, we propose a low-complexity pruned-CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel. The 11 × 11 large convolutional kernel better captures the global information of the input feature map, and the insignificant channels of the CNN model are pruned through training with channel-level sparsity. In this way, the important components of the CNN model are retained while the model size and computation amount are reduced. An experimental 120 Gb/s 64-QAM coherent optical communication system with a transmission distance of 375 km was built to verify the performance of the proposed low-complexity pruned-CNN-based nonlinear equalizer. The experimental results showed that, compared to a CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel, the proposed CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel saved about 33% time complexity while achieving similar BERs. Furthermore, the complexity of the proposed equalizer after channel-level pruning was reduced further: about 15.5% space complexity and 43.1% time complexity could be saved without degrading the equalization performance. This enables the application of the proposed low-complexity pruned-CNN-based nonlinear equalizer in realistic devices with less memory occupation and less running time. The proposed equalizer has great potential for coherent optical communication systems and promises to have a positive impact on the further development of modern communication networks.

Author Contributions

Conceptualization, X.L. and C.L.; methodology, X.L. and C.L.; software, Z.J. and L.H.; validation, X.L., C.L., Z.J. and L.H.; formal analysis, X.L. and C.L.; investigation, X.L. and C.L.; resources, Z.J.; data curation, Z.J. and L.H.; writing—original draft preparation, X.L., C.L. and Z.J.; writing—review and editing, X.L., C.L. and Z.J.; visualization, X.L.; supervision, X.L.; project administration, X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China from the Ministry of Science and Technology (2021YFB2800904), the National Natural Science Foundation of China (62206018), and the Open Fund of IPOC (BUPT) (IPOC2021B05).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chang, H.; Yin, X.; Yao, H.; Wang, J.; Gao, R.; An, J.; Hanzo, L. Low-Complexity Adaptive Optics Aided Orbital Angular Momentum Based Wireless Communications. IEEE Trans. Veh. Technol. 2021, 70, 7812–7824. [Google Scholar] [CrossRef]
  2. Guo, D.; Zhang, W.; Tian, F.; Shi, J.; Wang, K.; Kong, M.; Zhang, Q.; Lv, K.; Li, D.; Pan, X.; et al. LDPC-Coded Generalized Frequency Division Multiplexing for Intensity-Modulated Direct-Detection Optical Systems. IEEE Photonics J. 2019, 11, 7902115. [Google Scholar] [CrossRef]
  3. Zhou, S.; Zhang, Q.; Gao, R.; Chang, H.; Xin, X.; Li, S.; Pan, X.; Tian, Q.; Tian, F.; Wang, Y. High-accuracy atmospheric turbulence compensation based on a Wirtinger flow algorithm in an orbital angular momentum-free space optical communication system. Opt. Commun. 2020, 477, 126322. [Google Scholar] [CrossRef]
  4. Winzer, P.; Neilson, D.; Chraplyvy, A. Fiber-optic transmission and networking: The previous 20 and the next 20 years [Invited]. Opt. Express 2018, 26, 24190–24239. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, S.; Gao, R.; Xin, X.; Zhou, S.; Chang, H.; Li, Z.; Wang, F.; Guo, D.; Yu, C.; Liu, X.; et al. Adaptive Bayes-Adam MIMO equalizer with high accuracy and fast convergence for Orbital angular momentum mode division multiplexed transmission. J. Lightwave Technol. 2023; in press. [Google Scholar] [CrossRef]
  6. Chang, H.; Yin, X.; Yao, H.; Wang, J.; Gao, R.; Xin, X.; Guizani, M. Adaptive Optics Compensation for Orbital Angular Momentum Optical Wireless Communications. IEEE Trans. Wirel. Commun. 2022, 21, 11151–11163. [Google Scholar] [CrossRef]
  7. Guo, D.; Zhang, Q.; Xin, X.; Lv, K.; Tian, F.; Pan, X.; Tian, Q.; Wang, X.; Wang, Y. Adaptive Reed-Solomon coding and envelope detection of photonic vector signal in V-band radio over fiber system. IEEE Trans. Wirel. Commun. 2019, 439, 210–217. [Google Scholar] [CrossRef]
  8. Zhou, S.; Gao, R.; Zhang, Q.; Chang, H.; Xin, X.; Zhao, Y.; Liu, J.; Lin, Z. Data-defined naïve Bayes (DNB) based decision scheme for the nonlinear mitigation for OAM mode division multiplexed optical fiber communication. Opt. Express 2021, 29, 5901–5914. [Google Scholar] [CrossRef]
  9. Li, C.; Wang, Y.; Yao, H.; Yang, L.; Liu, X.; Huang, X.; Xin, X. Ultra-low complexity random forest for optical fiber communications. Opt. Express 2023, 31, 11633–11648. [Google Scholar] [CrossRef]
  10. Zhu, L.; Yao, H.; Chang, H.; Tian, Q.; Zhang, Q.; Xin, X.; Yu, F. Adaptive Optics for Orbital Angular Momentum-Based Internet of Underwater Things Applications. IEEE Internet Things J. 2022, 9, 24281–24299. [Google Scholar] [CrossRef]
  11. Liu, X.; Wang, Y.; Wang, X.; Tian, F.; Xin, X.; Xin, X.; Zhang, Q.; Tian, Q.; Yang, L. Mixture-of-Gaussian clustering-based decision technique for a coherent optical communication system. Appl. Opt. 2019, 58, 9201–9207. [Google Scholar] [CrossRef]
12. Redyuk, A.; Averyanov, E.; Sidelnikov, O.; Fedoruk, M.; Turitsyn, S. Compensation of Nonlinear Impairments Using Inverse Perturbation Theory With Reduced Complexity. J. Light. Technol. 2020, 38, 1250–1257.
13. Ip, E. Nonlinear compensation using backpropagation for polarization-multiplexed transmission. J. Light. Technol. 2010, 28, 939–951.
14. Freire, P.; Osadchuk, Y.; Spinnler, B.; Napoli, A.; Schairer, W.; Costa, N.; Prilepsky, J.; Turitsyn, S. Performance Versus Complexity Study of Neural Network Equalizers in Coherent Optical Systems. J. Light. Technol. 2021, 39, 6085–6096.
15. Zhang, J.; Chen, W.; Gao, M.; Shen, G. K-means-clustering-based fiber nonlinearity equalization techniques for 64-QAM coherent optical communication system. Opt. Express 2017, 25, 27570–27580.
16. Giacoumidis, E.; Matin, A.; Wei, J.; Doran, N.; Barry, L.; Wang, X. Blind nonlinearity equalization by machine-learning-based clustering for single- and multichannel coherent optical OFDM. J. Light. Technol. 2018, 36, 721–727.
17. Xu, M.; Zhang, J.; Zhang, H.; Jia, Z.; Wang, J.; Cheng, L.; Campos, L.; Knittle, C. Multi-stage machine learning enhanced DSP for DP-64QAM coherent optical transmission systems. In Proceedings of the Optical Fiber Communication Conference, San Diego, CA, USA, 3–7 March 2019; paper M2H.1.
18. Zhang, J.; Gao, M.; Chen, W.; Shen, G. Non-data-aided k-nearest neighbors technique for optical fiber nonlinearity mitigation. J. Light. Technol. 2018, 36, 3564–3572.
19. Giacoumidis, E.; Mhatli, S.; Stephens, M.; Tsokanos, A.; Wei, J.; McCarthy, M.; Doran, N.; Ellis, A. Reduction of nonlinear intersubcarrier intermixing in coherent optical OFDM by a fast Newton-based support vector machine nonlinear equalizer. J. Light. Technol. 2017, 35, 2391–2397.
20. Tian, F.; Zhou, Q.; Yang, C. Gaussian mixture model-hidden Markov model based nonlinear equalizer for optical fiber transmission. Opt. Express 2020, 28, 9728–9737.
21. Zhang, J.; Xu, T.; Jin, T.; Jiang, W.; Hu, S.; Huang, X. Meta-Learning Assisted Source Domain Optimization for Transfer Learning Based Optical Fiber Nonlinear Equalization. J. Light. Technol. 2023, 41, 1269–1277.
22. Jarajreh, M.; Giacoumidis, E.; Aldaya, I.; Le, S.; Tsokanos, A.; Ghassemlooy, Z.; Doran, N. Artificial Neural Network Nonlinear Equalizer for Coherent Optical OFDM. IEEE Photonics Technol. Lett. 2015, 27, 387–390.
23. Chen, X.; Fang, X.; Pittalà, F.; Zhang, F. 100 Gbaud PDM 16QAM NFDM transmission with neural network-based equalization. Opt. Fiber Technol. 2023, 78, 103329.
24. Aldaya, I.; Giacoumidis, E.; Tsokanos, A.; Jarajreh, M.; Wen, Y.; Wei, J.; Campuzano, G.; Abbade, M.; Barry, L. Compensation of nonlinear distortion in coherent optical OFDM systems using a MIMO deep neural network-based equalizer. Opt. Lett. 2020, 45, 5820–5823.
25. Chuang, C.; Liu, L.; Wei, C.; Liu, J.; Henrickson, L.; Huang, W.; Wang, C.; Chen, Y.; Chen, J. Convolutional neural network based nonlinear classifier for 112-Gbps high speed optical link. In Proceedings of the Optical Fiber Communication Conference, San Diego, CA, USA, 11–15 March 2018; paper W2A.43.
26. Zhou, Q.; Yang, C.; Liang, A.; Zheng, X.; Chen, Z. Low computationally complex recurrent neural network for high speed optical fiber transmission. Opt. Commun. 2019, 441, 121–126.
27. Liu, X.; Wang, Y.; Wang, X.; Xu, H.; Li, C.; Xin, X. Bi-directional gated recurrent unit neural network based nonlinear equalizer for coherent optical communication system. Opt. Express 2021, 29, 5923–5933.
28. Lu, X.; Lu, C.; Yu, W.; Qiao, L.; Liang, S.; Lau, A.; Chi, N. Memory-controlled deep LSTM neural network post-equalizer used in high-speed PAM VLC system. Opt. Express 2019, 27, 7822–7833.
29. Deligiannidis, S.; Bogris, A.; Mesaritakis, C.; Kopsinis, Y. Compensation of fiber nonlinearities in digital coherent systems leveraging long short-term memory neural networks. J. Light. Technol. 2020, 38, 5991–5999.
30. Schädler, M.; Böcherer, G.; Pachnicke, S. Soft-Demapping for Short Reach Optical Communication: A Comparison of Deep Neural Networks and Volterra Series. J. Light. Technol. 2021, 39, 3095–3105.
31. Yadav, G.; Chuang, C.; Feng, K.; Chen, J.; Chen, Y. Sparsity Learning Deep Neural Network Nonlinear Equalization Method for 112Gbps PAM4 Short-Reach Transmission Links. J. Light. Technol. 2023, 41, 2333–2342.
32. Xie, T.; Yu, J. 4Gbaud PS-16QAM D-Band Fiber-Wireless Transmission over 4.6 km by Using Balance Complex-Valued NN Equalizer with Random Oversampling. Sensors 2023, 23, 3655.
33. Wang, F.; Gao, R.; Zhou, S.; Li, Z.; Cui, Y.; Chang, H.; Wang, F.; Guo, D.; Yu, C.; Liu, X.; et al. Probabilistic neural network equalizer for nonlinear mitigation in OAM mode division multiplexed optical fiber communication. Opt. Express 2022, 30, 47957–47969.
34. Wang, F.; Wang, Y.; Li, W.; Zhu, B.; Ding, J.; Wang, K.; Liu, C.; Wang, C.; Kong, M.; Zhao, L.; et al. Echo State Network Based Nonlinear Equalization for 4.6 km 135 GHz D-Band Wireless Transmission. J. Light. Technol. 2023, 41, 1278–1285.
35. Zhou, S.; Liu, X.; Gao, G.; Jiang, Z.; Zhang, H.; Xin, X. Adaptive Bayesian neural networks nonlinear equalizer in a 300-Gbit/s PAM8 transmission for IM/DD OAM mode division multiplexing. Opt. Lett. 2023, 48, 464–467.
36. Zhang, S.; Yaman, F.; Nakamura, K.; Inoue, T.; Kamalov, V.; Jovanovski, L.; Vusirikala, V.; Mateo, E.; Inada, Y.; Wang, W. Field and lab experimental demonstration of nonlinear impairment compensation using neural networks. Nat. Commun. 2019, 10, 3033.
37. Li, C.; Wang, Y.; Wang, J.; Yao, H.; Liu, X.; Gao, R.; Yang, L.; Xu, H.; Zhang, Q.; Ma, P.; et al. Convolutional Neural Network-Aided DP-64 QAM Coherent Optical Communication Systems. J. Light. Technol. 2022, 40, 2880–2889.
38. Agrawal, G. Nonlinear Fiber Optics; Academic Press: San Diego, CA, USA, 2007.
39. Mecozzi, A.; Clausen, C.; Shtaif, M. Analysis of intrachannel nonlinear effects in highly dispersed optical pulse transmission. IEEE Photonics Technol. Lett. 2000, 12, 392–394.
40. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the ICML'15: 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015.
41. Liu, Z.; Li, J.; Shen, Z.; Huang, G.; Yan, S.; Zhang, C. Learning Efficient Convolutional Networks through Network Slimming. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
42. Ding, X.; Zhang, X.; Han, J.; Ding, G. Diverse Branch Block: Building a Convolution as an Inception-like Unit. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021.
Figure 1. Feature map with two channels: (a) Channel 1: real part of the complex FUs; (b) Channel 2: imaginary part of the complex FUs.
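The two-channel layout of Figure 1 can be sketched as follows. This is a minimal illustration, not the paper's code: the helper name `to_feature_map` and the 11 × 11 window size are assumptions for the example.

```python
# Hypothetical sketch: arrange a window of complex received symbols into a
# two-channel feature map (real part, imaginary part), as in Figure 1.
def to_feature_map(symbols, size=11):
    """Split a size*size window of complex symbols into real/imag channels."""
    assert len(symbols) == size * size
    grid = [symbols[r * size:(r + 1) * size] for r in range(size)]
    real_ch = [[s.real for s in row] for row in grid]  # Channel 1
    imag_ch = [[s.imag for s in row] for row in grid]  # Channel 2
    return [real_ch, imag_ch]  # shape: (2, size, size)
```

Feeding both channels to the convolution layer lets the network see the in-phase and quadrature components of each symbol jointly.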
Figure 2. Architecture of the proposed CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel.
Figure 3. Effective receptive field ranges of different convolutional kernels: (a) 6 × 6 normal convolutional kernel; (b) 11 × 11 large convolutional kernel.
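The comparison in Figure 3 follows from the textbook receptive-field recursion for stacked convolutions. As a sketch (strides assumed to be 1, helper name illustrative), a single 11 × 11 kernel spans the same input range as two stacked 6 × 6 kernels:

```python
# Standard receptive-field recursion: each layer adds (k - 1) * jump input
# positions, where jump is the product of the strides of earlier layers.
def receptive_field(kernel_sizes, strides=None):
    strides = strides or [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

# One 11 x 11 conv and two stacked 6 x 6 convs cover the same 11-symbol span:
assert receptive_field([11]) == receptive_field([6, 6]) == 11
```

This is why enlarging the kernel can simplify the network: one large-kernel layer replaces a stack of small-kernel layers with the same coverage.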
Figure 4. Flowchart of network pruning procedure.
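The channel-selection step behind this flowchart can be sketched in the spirit of network slimming (ref. 41): channels are ranked by the magnitude of their batch-normalization scaling factor γ, and the least significant fraction is dropped. The function name and the γ values below are illustrative, not taken from the paper.

```python
# Hedged sketch of channel-level pruning: keep the channels whose BN scaling
# factors |gamma| are largest; drop the smallest `prune_ratio` fraction.
def select_channels(gammas, prune_ratio):
    """Return indices of channels kept after pruning `prune_ratio` of them."""
    n_prune = int(len(gammas) * prune_ratio)
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]))
    pruned = set(order[:n_prune])
    return [i for i in range(len(gammas)) if i not in pruned]

kept = select_channels([0.9, 0.01, 0.5, 0.02, 0.7], prune_ratio=0.4)
# channels 1 and 3, with the smallest |gamma|, are removed
```

After selection, the corresponding convolution filters and BN parameters are physically removed and the slimmed network is fine-tuned, which is what reduces the model width and computation amount.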
Figure 5. Experimental setup.
Figure 6. (a) BER performance of the proposed CNN-based nonlinear equalizer with different numbers of output channels of the convolution layer; (b) BER performance of the proposed CNN-based nonlinear equalizer with 270 output channels at different pruning percentages. The launched optical power was 1 dBm.
Figure 7. BER performance versus launched optical power: with a MED-based decision (blue line); with a CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel (orange line); with an unpruned-CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels (yellow line); with a 15%-pruned-CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels (purple line).
Figure 8. Computational complexity, including space complexity and time complexity. Orange columns (A): the complexity of a CNN-based nonlinear equalizer with a 6 × 6 normal convolutional kernel. Yellow columns (B): the complexity of the proposed unpruned-CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels. Purple columns (C): the complexity of the 15%-pruned-CNN-based nonlinear equalizer with an 11 × 11 large convolutional kernel and 270 output channels.
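Space and time complexity in comparisons like Figure 8 are conventionally counted as the number of parameters and the number of multiply-accumulate operations per layer. The sketch below shows that generic accounting for a single 2-D convolution layer; it is not the paper's exact model, and the sizes in the example (input channels, output spatial size) are assumptions.

```python
# Generic complexity accounting for one 2-D convolution layer with a square
# k x k kernel: parameters measure space complexity, MACs measure time
# complexity. Layer sizes in the example below are illustrative only.
def conv2d_complexity(c_in, c_out, k, h_out, w_out, bias=True):
    params = c_out * (c_in * k * k + (1 if bias else 0))   # space complexity
    macs = c_out * c_in * k * k * h_out * w_out            # time complexity
    return params, macs

# e.g. a 2-channel input, 270 output channels, 11 x 11 kernel (sizes assumed)
params, macs = conv2d_complexity(c_in=2, c_out=270, k=11, h_out=1, w_out=1)
```

Both counts scale linearly in `c_in` and `c_out`, which is why removing 15% of the channels at the channel level translates directly into the reported savings in parameters and operations.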
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, X.; Li, C.; Jiang, Z.; Han, L. Low-Complexity Pruned Convolutional Neural Network Based Nonlinear Equalizer in Coherent Optical Communication Systems. Electronics 2023, 12, 3120. https://doi.org/10.3390/electronics12143120


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
