*2.3. Activation Functions*

Activation functions are used to add non-linear behavior to the ANN [53–56]. Without an activation function, the output of each layer of the ANN would simply be that of a linear model whose number of parameters equals the number of neurons in the layer [54,55]. Consequently, activation functions improve the overall performance of the ANN and introduce non-linear behavior that depends on the particular activation function used. Thus, if activation functions are not applied, the ANN has limited performance and acts as a linear regression model [54,55,57,59].

Figure 5 shows the basic structure of a neuron with its activation function, where *x* denotes the inputs, *w* the weights, *f*(Σ) the activation function, and *y* the output [54,55].

**Figure 5.** Structure of neural networks with activation function.
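As an illustration of the structure in Figure 5, the following minimal sketch (in Python with NumPy; the variable names and numerical values are purely illustrative and not taken from the paper) computes a single neuron's output as the activation of the weighted sum of its inputs. A bias term *b* is included for generality, although it is not shown explicitly in Figure 5.

```python
import numpy as np

def neuron_output(x, w, b, activation=np.tanh):
    """Return y = f(sum_i w_i * x_i + b) for a single neuron."""
    weighted_sum = np.dot(w, x) + b   # summation block in Figure 5
    return activation(weighted_sum)   # activation block f(.)

# Illustrative values only (not from the paper)
x = np.array([0.5, -1.2, 0.3])
w = np.array([0.8, 0.1, -0.4])
y = neuron_output(x, w, b=0.05)
```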

The most common activation functions are the hyperbolic tangent function, the sigmoid function, the linear function, the ReLU (rectified linear unit) function, the leaky ReLU function, the softmax function, and the swish (self-gated) function [55,58,59].

In this work, we use the hyperbolic tangent function (*tanh*) as the activation function for the input and hidden layers of our proposed ANN and DNN models.

The hyperbolic tangent function maps its input to the range (−1, 1):

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \tag{11}$$
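For reference, a minimal NumPy sketch of Equation (11) is given below; the built-in `np.tanh` computes the same quantity directly.

```python
import numpy as np

def tanh_activation(x):
    """Hyperbolic tangent activation, Equation (11); values lie in (-1, 1)."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

# np.tanh(x) computes the same quantity and is numerically safer for large |x|.
```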

In contrast, the rectified linear unit (ReLU) activation function is used in the output layer so that the predicted solar radiation value is non-negative [54–59].

The ReLU (rectified linear unit) function maps its input to the range [0, ∞):

$$f(x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x \ge 0 \end{cases} \tag{12}$$
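The sketch below gives a minimal NumPy version of Equation (12), together with an illustrative forward pass mirroring the layer arrangement described above (tanh in the hidden layers, ReLU at the output). The function names, argument structure, and layer sizes are placeholders for illustration, not the configuration of the proposed ANN and DNN models.

```python
import numpy as np

def relu(x):
    """Rectified linear unit, Equation (12); values lie in [0, inf)."""
    return np.maximum(0.0, x)

def forward(x, hidden_layers, w_out, b_out):
    """Forward pass with tanh in the hidden layers and ReLU at the output,
    so the predicted solar radiation value is non-negative."""
    a = x
    for W, b in hidden_layers:          # hidden_layers: list of (weights, biases)
        a = np.tanh(W @ a + b)          # hidden layers, Equation (11)
    return relu(w_out @ a + b_out)      # output layer, Equation (12)
```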
