2.1.2. Activation Layer

After the convolutional layer, an activation function is applied to introduce nonlinear modeling capability into the neural network, suppress redundant information in the data, and enhance the network's learning ability, so that the features in the data can be further processed for segmentation. Commonly used activation functions include the sigmoid, tanh, ReLU, and ELU functions; for details, please refer to the literature [35]. In our convolutional neural network, we choose the ReLU function as the activation function. Unlike linear functions, ReLU provides nonlinear expressive power, and unlike saturating nonlinear functions such as sigmoid and tanh, it does not suffer from the vanishing-gradient problem and keeps the convergence rate of the model stable. The ReLU function is expressed in Equation (2):

$$\alpha_i^{k+1}(j) = \text{ReLU}\bigl(y_i^{k+1}(j)\bigr) = \max\bigl\{0,\, y_i^{k+1}(j)\bigr\},\tag{2}$$

where $y_i^{k+1}(j)$ represents the output of the $(k+1)$-th convolutional layer, and $\alpha_i^{k+1}(j)$ represents the result of activating $y_i^{k+1}(j)$ with ReLU.
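To make the operation concrete, the following is a minimal NumPy sketch (not part of the original paper) that applies Equation (2) elementwise to a hypothetical convolutional feature map; the array shape and values are illustrative assumptions only.

```python
import numpy as np

def relu(y):
    """Elementwise ReLU: max{0, y}, as in Equation (2)."""
    return np.maximum(0.0, y)

# Hypothetical output y of a convolutional layer (layer k+1),
# shaped (channels, height, width); values are for illustration.
y = np.array([[[-1.5, 0.3],
               [ 2.0, -0.7]]])

alpha = relu(y)  # negative responses are zeroed, positives pass through
print(alpha)
# [[[0.  0.3]
#   [2.  0. ]]]
```

Because the gradient of $\max\{0, y\}$ is exactly 1 for all positive inputs, the backpropagated signal is not attenuated the way it is in the saturating regions of sigmoid or tanh, which is the stability property noted above.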
