2.1.1. Convolutional Layer

The convolutional layer uses a convolution kernel to perform convolution operations on the input data or on local regions of the feature maps, extracting relevant features from the data. Figure 2 shows the structure of the convolutional layer and the pooling layer: the top layer is the pooling layer, the middle layer is the convolutional layer, and the bottom layer is the input layer [34]. In Figure 2, the convolutional neurons are organized into feature planes, and each neuron in the convolutional layer is locally connected to the feature plane of its input layer. The output of each neuron in the convolutional layer is obtained by computing a local weighted sum and passing it through the activation function.

An important feature of convolutional neural networks is weight sharing: the weights are shared across the plane of the same input feature map and the same output feature map. Weight sharing reduces the complexity of the network model to a certain extent, and it also avoids the over-fitting problem caused by an excessive number of parameters. In practical implementations, the convolution operations are usually replaced by cross-correlation operations, which avoids having to flip the convolution kernel during backpropagation. The formula for the convolution operation is shown in (1):
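To illustrate how much weight sharing reduces the parameter count, the sketch below compares a fully connected mapping with a shared 1-D kernel over the same input. The sizes (an input of length 100, a kernel of length 3) are illustrative assumptions, not values from the paper:

```python
# Illustrative parameter count: fully connected map vs. a shared 1-D kernel.
# Sizes are assumed for illustration only.
input_len, kernel_len = 100, 3
output_len = input_len - kernel_len + 1          # 98 outputs (stride 1, no padding)

# Fully connected: one weight per (input, output) pair, plus one bias per output.
fc_params = input_len * output_len + output_len

# Weight sharing: one kernel reused at every position, plus a single bias.
shared_params = kernel_len + 1

print(fc_params, shared_params)  # 9898 vs. 4
```

The shared kernel needs 4 parameters where the fully connected mapping needs 9898, which is the complexity reduction the text refers to.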

$$y_i^{k+1}(j) = \kappa_i^k \times \chi^k(j) + b_i^k. \tag{1}$$

where $\kappa_i^k$ and $b_i^k$ respectively denote the weight and bias of the *i*th filter kernel of the *k*th layer of the neural network, and $\chi^k(j)$ denotes the *j*th local region of the *k*th layer. The symbol × denotes the inner product of the kernel and the local region, and $y_i^{k+1}(j)$ represents the input of the *j*th neuron in frame *i* of layer *k* + 1.
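Equation (1) can be sketched directly in code: at every position *j*, take the inner product of the kernel with the corresponding local region and add the bias. This is a minimal 1-D sketch assuming stride 1, no padding, and a single channel; the function and variable names are illustrative, not from the paper:

```python
def conv1d_output(x, kernel, bias):
    """Eq. (1) sketch: y(j) = <kernel, x[j : j+len(kernel)]> + bias for each j."""
    n = len(kernel)
    # Slide the shared kernel over every local region of the input (stride 1).
    return [sum(k * v for k, v in zip(kernel, x[j:j + n])) + bias
            for j in range(len(x) - n + 1)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # input feature sequence, chi^k
kernel = [1.0, 0.0, -1.0]       # shared filter weights, kappa_i^k
bias = 0.5                      # b_i^k
y = conv1d_output(x, kernel, bias)
print(y)  # → [-1.5, -1.5, -1.5]
```

Each entry of `y` is the pre-activation input to one neuron of layer *k* + 1; passing it through the activation function, as described above, yields that neuron's output. Note the kernel is applied without flipping, i.e., this is the cross-correlation form commonly used in practice.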

**Figure 2.** Schematic diagram of the convolutional layer and the pooling layer structure.
