### 3.4.1. Model Descriptions

The artificial neural network (ANN) is one of the most popular machine learning techniques for nonlinear approximation because of its ability to approximate a large class of functions with a high degree of accuracy (Chen et al. 2003). The idea of the ANN came from the structure of the animal brain, more specifically from the human nervous system. It is based on how the brain works: the neurons receive information from the input neurons, analyse it, and finally identify the object or pattern. Fundamentally, the mechanism has three layers: an input layer, hidden layers, and an output layer. Each layer consists of neurons, or nodes. The hidden part may consist of many layers; however, for time series analysis and forecasting, the single-hidden-layer feed-forward network is the most widely used model structure (Zhang et al. 1998). A simple three-layer neural network has the following mathematical form

$$Y\_t = w\_0 + \sum\_{j=1}^q w\_j \cdot g \left(w\_{0,j} + \sum\_{i=1}^p w\_{i,j} \cdot Y\_{t-i}\right) + e\_t \tag{9}$$

where *w<sub>i,j</sub>* and *w<sub>j</sub>* for *i* = 1, 2, ... , *p*, *j* = 1, 2, ... , *q* are known as connection weights. The parameters *p* and *q* are the numbers of input and hidden nodes, respectively. The network involves an activation function *g*, which plays a very important role because it converts the input signals for the neurons or nodes in the next layer, and eventually for the output neuron. The most widely used activation functions are the logistic and hyperbolic tangent functions (Khashei and Bijari 2010), which are shown in Equations (10) and (11)

$$\text{sig}(x) = \frac{1}{1 + e^{-x}} \tag{10}$$

$$\tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}.\tag{11}$$

Most modelers prefer the hyperbolic tangent function as the activation function because of its faster convergence, which makes the optimization easier. Hence, we used this activation function in our model. There is no systematic rule for choosing the number of neurons or nodes, *q*, in the hidden layer (Khashei and Bijari 2010). In most cases it is data-dependent and chosen on the basis of trial and error.
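As a minimal sketch, the forecast in Equation (9) with a tanh activation can be computed as below. The function name, weight variables, and weight values are illustrative assumptions, not the paper's fitted model.

```python
import numpy as np

def ann_forecast(y_lags, w0, w, w0_hidden, W):
    """One-step forecast from the last p observations (Equation (9)).

    y_lags    : array of shape (p,), the inputs Y_{t-1}, ..., Y_{t-p}
    w0        : output-layer bias
    w         : array of shape (q,), hidden-to-output weights w_j
    w0_hidden : array of shape (q,), hidden-node biases w_{0,j}
    W         : array of shape (p, q), input-to-hidden weights w_{i,j}
    """
    hidden = np.tanh(w0_hidden + y_lags @ W)  # g(...) applied at each hidden node
    return w0 + hidden @ w                    # linear combination at the output node

# Tiny usage example with p = 3 input nodes and q = 2 hidden nodes
rng = np.random.default_rng(0)
p, q = 3, 2
forecast = ann_forecast(
    y_lags=np.array([1.0, 0.9, 1.1]),
    w0=0.1,
    w=rng.normal(size=q),
    w0_hidden=rng.normal(size=q),
    W=rng.normal(size=(p, q)),
)
```

With all weights set to zero the hidden activations vanish and the forecast reduces to the output bias, which is a quick sanity check on the implementation.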

### 3.4.2. Artificial Neural Network for S & P 500 Index

The model proposed for S & P 500 in this section is a three-layer model: input, hidden, and output layers. The input layer consists of a total of seven nodes, which are daily *Open, Close, High, Low, Average, Volume,* and *Return*. The variable *Average* is the average of the daily *Open, Close, High,* and *Low*. The *Volume* was converted to million units. The daily return was calculated by the formula $r_t = \log(S_t / S_{t-1})$, where $S_t$ is the adjusted close price, and the day-one return $r_0$ was set to zero. The output layer has only one node, which corresponds to the predicting variable, the *Adjusted Close* price. The number of nodes in the hidden layer was chosen based on the error measures in Equation (1) for different combinations of hidden nodes, which are displayed in Table 3. From Table 3, we see that the model ANN(7-15-1) has the lowest APE, AAE, ARPE, and RMSE and the highest adjusted *R*<sup>2</sup> value.
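The input-feature construction described above can be sketched as follows. The column names and the toy values are assumptions for illustration; the paper does not provide code.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the daily S & P 500 data (column names are assumed)
df = pd.DataFrame({
    "Open":     [100.0, 101.0, 102.5],
    "High":     [101.5, 102.0, 103.0],
    "Low":      [ 99.5, 100.5, 101.8],
    "Close":    [101.0, 101.8, 102.9],
    "AdjClose": [100.8, 101.6, 102.7],
    "Volume":   [3_200_000, 2_900_000, 3_500_000],
})

# Average of daily Open, Close, High, and Low
df["Average"] = df[["Open", "Close", "High", "Low"]].mean(axis=1)

# Volume converted to million units
df["Volume"] = df["Volume"] / 1e6

# Log return r_t = log(S_t / S_{t-1}) on the adjusted close, with r_0 = 0
df["Return"] = np.log(df["AdjClose"] / df["AdjClose"].shift(1))
df.loc[0, "Return"] = 0.0
```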


**Table 3.** Error measures for different network structures.

The original dataset had 1257 observations, but the dataset used in this method was modified as follows: all the predictor and predicting variables have the same length of 1256, with the predictor variables running from day 1 to day 1256 and the predicting variable from day 2 to day 1257. The dataset was then divided into two parts to run the model. The test dataset contained the last 63 actual stock prices (adjusted close), which were compared to the predicted prices. The best model was selected on the basis of the adjusted *R*<sup>2</sup> and the four error measures (Table 3). The model architecture is shown in Figure 5, and the result of this model is discussed in Section 4.3.
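The one-day alignment and train/test split described above can be sketched as below, with stand-in arrays in place of the real predictor matrix and adjusted close series.

```python
import numpy as np

n_obs, n_test = 1257, 63

# Stand-in data: one predictor column and the adjusted close series
features = np.arange(n_obs, dtype=float).reshape(-1, 1)
target = np.arange(n_obs, dtype=float)

# Predictors cover days 1..1256; the target covers days 2..1257,
# so both have length 1256 and the target leads by one day.
X = features[:-1]
y = target[1:]

# The last 63 rows form the test set; the rest are used for training.
X_train, X_test = X[:-n_test], X[-n_test:]
y_train, y_test = y[:-n_test], y[-n_test:]
```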

**Figure 5.** Artificial neural network architecture.
