*2.1. OP-ELM*

The optimally pruned extreme learning machine (OP-ELM), introduced by Miche et al., is an improved version of the extreme learning machine (ELM) that uses the leave-one-out (LOO) method to select the optimal number of neurons [38]. LOO marginalizes the irrelevant neurons built into the ELM network; this marginalization helps overcome the shortfall caused by correlated and irrelevant variables in the training dataset. Given a training set *xi* with a target vector *ti*, the OP-ELM's objective is to minimize the error function, given by (1). If there exist an input weight vector *wk* connecting the input to the *k*th hidden neuron, a bias *bk* of the *k*th hidden node, and an output weight *βk* connecting the *k*th hidden neuron to the output, such that $\sum_{k=1}^{j} f(w_k, b_k, x_i)\,\beta_k = y_i$, then (1) can be rewritten as (2).

$$\sum_{k=1}^{j} f\left(w_k, b_k, x_i\right) \beta_k = t_i \tag{1}$$

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T} \tag{2}$$

$$H = \begin{bmatrix} f(w_1, b_1, \mathbf{x}_1) & \cdots & f(w_j, b_j, \mathbf{x}_1) \\ \vdots & \ddots & \vdots \\ f(w_1, b_1, \mathbf{x}_m) & \cdots & f(w_j, b_j, \mathbf{x}_m) \end{bmatrix}_{m \times j} \tag{3}$$

$$\boldsymbol{\beta} = H^{\dagger} \mathbf{T} = \left( H^T H \right)^{-1} H^T \mathbf{T} \tag{4}$$

where *yi* is the network output vector, *ti* is the output target vector, *H* is the hidden layer's output matrix, and *k* = 1, 2, ..., *j*. The input weights and biases are assigned at random and do not require tuning, and the hidden layer's output matrix parameters likewise take random values. If *H* is a square matrix, matrix inversion can be used to determine the output weights; when *H* is not square, the Moore–Penrose generalized inverse is used instead, as in (4). The neurons are then ranked using multi-response sparse regression (MRSR), and LOO is applied to select how many to keep.
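The core computation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: it uses a sigmoid activation and, in place of the full MRSR ranking, selects the neuron count directly by the closed-form leave-one-out (PRESS) error of the linear solve $H\beta = T$. All function names and the toy dataset are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_hidden(X, W, b):
    # Hidden-layer output matrix H (m x j): f(w_k, b_k, x_i) with a sigmoid f
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, T, j):
    # Input weights and biases are assigned at random and not tuned (ELM)
    W = rng.standard_normal((X.shape[1], j))
    b = rng.standard_normal(j)
    H = elm_hidden(X, W, b)
    # Output weights via the Moore-Penrose pseudoinverse: beta = H^+ T, eq. (4)
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def press_loo_error(H, T):
    # Closed-form leave-one-out (PRESS) mean squared error for H beta = T,
    # a cheap stand-in for explicitly refitting m times
    P = H @ np.linalg.pinv(H.T @ H) @ H.T   # hat matrix
    resid = T - P @ T
    d = 1.0 - np.diag(P)
    return float(np.mean((resid / d[:, None]) ** 2))

# Toy regression data (assumed for illustration only)
X = rng.uniform(-1, 1, size=(200, 2))
T = np.sin(3 * X[:, :1]) + 0.1 * rng.standard_normal((200, 1))

# Pick the neuron count j with the lowest LOO error
errors = {}
for j in (5, 10, 20, 40):
    W = rng.standard_normal((2, j))
    b = rng.standard_normal(j)
    errors[j] = press_loo_error(elm_hidden(X, W, b), T)
best_j = min(errors, key=errors.get)
print(best_j, errors[best_j])
```

In the actual OP-ELM, the candidate neurons would first be ranked by MRSR and the LOO error evaluated along that ranking; the PRESS shortcut above shows why the selection step is inexpensive.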
