*3.1. ELM Model*

As shown in Figure 3, the extreme learning machine (ELM) is a single-hidden-layer neural network with low computational complexity and good generalization performance [24,25]. Compared with traditional neural networks, ELM randomly initializes the input weights and biases and leaves them unadjusted during training [26], which makes it simple and fast while maintaining accuracy [27].

**Figure 3.** Extreme learning machine (ELM) network model.

Supposing there are *N* arbitrary training samples, the ELM network with *M* hidden-layer nodes can be expressed as follows:

$$\sum_{j=1}^{M} \beta_j\, g(w_j \cdot x_i + b_j) = y_i, \quad i = 1, 2, \dots, N \tag{19}$$

where *g*(*x*) is the activation function, and *w<sub>j</sub>*, β*<sub>j</sub>*, and *b<sub>j</sub>* are the input weight vector, output weight, and bias of the *j*th hidden node, respectively.
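
As a concrete illustration (not from the paper), the hidden-layer response in Equation (19) can be computed with a few lines of NumPy; the sigmoid activation and all dimensions below are assumptions chosen for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, M = 100, 4, 20                     # samples, input features, hidden nodes (assumed)
X = rng.standard_normal((N, d))          # input samples x_i as rows

# Input weights w_j and biases b_j are drawn randomly and never adjusted
W = rng.uniform(-1.0, 1.0, size=(d, M))  # column j holds w_j
b = rng.uniform(-1.0, 1.0, size=M)

def g(z):
    """Assumed sigmoid activation g(x)."""
    return 1.0 / (1.0 + np.exp(-z))

# H[i, j] = g(w_j . x_i + b_j); row i holds the hidden-layer response to x_i
H = g(X @ W + b)
```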

The aim of network training is to minimize the output error, which can be expressed as follows:

$$E(w, b, \beta) = \sum_{i=1}^{N} \|y_i - t_i\| \tag{20}$$

$$\min_{w, b, \beta} E(w, b, \beta) = \min_{w, b, \beta} \left\| H(w_1, \dots, w_M; b_1, \dots, b_M; x_1, \dots, x_N)\,\beta - T \right\| \tag{21}$$

In matrix form, Equation (21) amounts to solving the linear system

$$H\beta = T \tag{22}$$

whose minimum-norm least-squares solution is

$$\hat{\beta} = H^+ T \tag{23}$$

where *H* is the hidden-layer output matrix, β is the output weight matrix, *T* is the matrix of expected outputs, and *H*<sup>+</sup> is the Moore-Penrose generalized inverse of *H*.
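
Putting Equations (19)-(23) together, ELM training reduces to a single pseudoinverse computation. The following minimal, self-contained sketch (repeating the setup from the earlier snippet for completeness) uses `numpy.linalg.pinv` for the Moore-Penrose inverse *H*<sup>+</sup>; the targets *T* below are synthetic and for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, M = 100, 4, 20                       # illustrative sizes
X = rng.standard_normal((N, d))            # training inputs
T = rng.standard_normal((N, 1))            # synthetic expected outputs (illustration only)

W = rng.uniform(-1.0, 1.0, size=(d, M))    # random input weights, fixed after initialization
b = rng.uniform(-1.0, 1.0, size=M)         # random biases, fixed after initialization
g = lambda z: 1.0 / (1.0 + np.exp(-z))     # assumed sigmoid activation

H = g(X @ W + b)                           # hidden-layer output matrix of Eq. (22)
beta_hat = np.linalg.pinv(H) @ T           # Eq. (23): beta_hat = H^+ T

train_error = np.linalg.norm(H @ beta_hat - T)   # residual of Eqs. (20) and (21)
print(train_error)
```

Because the only free parameters are the output weights, training costs one linear-algebra solve rather than iterative backpropagation, which is the source of the speed advantage noted above.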
