Article

A Multireservoir Echo State Network Combined with Olfactory Feelings Structure

1 School of Control Science and Engineering, Bohai University, Jinzhou 121013, China
2 School of Information Engineering, Suqian University, Suqian 223800, China
* Authors to whom correspondence should be addressed.
Electronics 2023, 12(22), 4635; https://doi.org/10.3390/electronics12224635
Submission received: 5 October 2023 / Revised: 31 October 2023 / Accepted: 31 October 2023 / Published: 13 November 2023

Abstract

As a special form of recurrent neural network (RNN), echo state networks (ESNs) have achieved good results in nonlinear system modeling, fuzzy nonlinear control, time series prediction, and so on. However, the traditional single-reservoir ESN topology limits the prediction ability of the network. In this paper, we design a multireservoir olfactory feelings echo state network (OFESN) inspired by the structure of the Drosophila olfactory bulb, which provides a new connection mode. The connections between subreservoirs are transformed into connections between their master neurons, the neurons within each subreservoir are sparsely connected, and neurons in different subreservoirs cannot communicate with each other. The OFESN greatly simplifies the coupling connections between neurons in different subreservoirs, reduces information redundancy, and improves the running speed of the network. The simulation results demonstrate that the OFESN model introduced in this study enhances the capacity to approximate the sine superposition function and the Mackey–Glass system. Additionally, the model improves prediction accuracy by 98% in some cases and reduces fluctuations in prediction errors.

1. Introduction

A recurrent neural network (RNN) is a class of artificial neural networks that use their internal memory to process arbitrary sequences of inputs, forming internal states of the network through cyclic connections between the units, which allows their dynamic temporal behavior to be displayed [1]. The echo state network (ESN) is a member of the RNN family [2]. Like RNNs, ESNs have the ability of nonlinear autoregression, but they also overcome the slow convergence speed and complex training process encountered in traditional RNNs. ESNs have been employed for forecasting the renowned Mackey–Glass chaotic time series; remarkably, the application of an ESN improved prediction accuracy by a factor of 2400, as reported in [3]. ESNs are composed of three main components: an input layer, a reservoir, and an output layer. The reservoir can be considered analogous to the hidden layer of a conventional neural network. It is composed of several neurons that exhibit sparse connections. The reservoir of an ESN has the following characteristics: (1) In contrast to the hidden layer of conventional neural networks, the reservoir can accommodate a relatively large number of neurons without significantly increasing the difficulty of the training algorithm or the time complexity. (2) The connections between neurons are formed in a random manner, and no subsequent adjustments are performed following their initial formation. (3) Neurons exhibit a sparse connectivity pattern.
The training procedure of ESNs involves the adjustment of the connection weights between the reservoir and the output layer. Given the aforementioned attributes of the reservoir, ESNs exhibit the following noteworthy features: (1) ESNs employ a hidden layer that consists of a sparsely connected reservoir, which is produced randomly. (2) The reservoir generation process is autonomous and occurs prior to the training phase of the ESN, hence guaranteeing the stability of the ESN throughout training and its ability to generalize after training. (3) With the exception of the output connection weights, all other connection weights are initially created randomly and stay unaltered during the training process. (4) The output connection weights can be acquired through linear regression or the least-squares approach [4]. This streamlines the training procedure of the neural network. The architecture of ESNs is straightforward, and the training procedure is fast. ESNs have been applied successfully to a wide range of domains, including nonlinear modeling [5], pattern recognition [6], fuzzy nonlinear control [7,8], time series prediction [9,10,11,12], and so on.
ESNs have a fixed reservoir composed of randomly and sparsely connected neurons, which gives them a degree of universality. However, this reservoir is usually not optimal. A good reservoir generally needs to meet the following conditions: (1) The reservoir parameters must be chosen to ensure the echo state property. (2) The reservoir neurons are dynamically rich and capable of representing classification features or approximating complex dynamic systems. (3) The coupling connections between reservoir neurons are as simple as possible. (4) Overfitting or underfitting should be avoided. The reservoir is the core factor that determines the performance of an ESN. There have been many attempts to find more efficient reservoir schemes to improve the performance of ESNs, for example, regarding the reservoir structure [13,14,15,16,17], the type of reservoir neurons [18,19], reservoir parameter optimization [20,21,22], conditions for the echo state property (ESP) [23,24], etc.
In [3], H. Jaeger pointed out that an ESN with a single reservoir can be well trained to generate a superposition of sinusoidal functions, but its performance deteriorates when faced with the task of implementing multiple superimposed sinusoidal functions. The reason may be that the neurons in the same reservoir are coupled, while the task requires multiple uncoupled neurons [25]. The topology of a single reservoir limits the application of ESNs in time series prediction and other fields. In order to further improve the prediction accuracy, researchers have proposed a series of multireservoir ESN models, including the deep reservoir [16,26], the growing reservoir [13], the chain reservoir [27], etc.
This paper proposes a novel multireservoir echo state network, called the olfactory feelings echo state network (OFESN), to improve the approximation ability and classification ability of ESNs. Each subreservoir of the OFESN is composed of a master neuron and several other neurons called sister neurons. The master neuron plays a key role and is the core representative of its own subreservoir. The sister neurons belonging to the same subreservoir can communicate with each other, but the sister neurons belonging to different subreservoirs cannot.
The OFESN model provides a new connection mode, as follows: (i) The connections between subreservoirs are transformed into the connections between their respective master neurons. (ii) The sister neurons within each subreservoir are sparsely connected. (iii) There are no connections between the sister neurons in different subreservoirs. Such a new connection mode greatly simplifies the coupling connections between neurons in different subreservoirs and thus reduces information redundancy. The sparse connection between the master neurons actually creates a virtual subreservoir, which adds to the reservoir a new subreservoir composed of master neurons with large diversity and is thus equivalent to increasing the number of neurons. Therefore, the OFESN model may be deemed more appropriate in scenarios where the network's approximation capability is limited due to a small number of neurons in the overall network reservoir. This is particularly relevant in cases where the number of neurons in each subreservoir is small and the number of subreservoirs is large. The validity of the OFESN is assessed using two time series, i.e., the sine superposition function and the Mackey–Glass system. The findings from the simulation demonstrate that the OFESN enhances the capacity to approximate the superposition of several sinusoidal functions. Moreover, it exhibits attributes such as increased prediction accuracy and reduced fluctuations in prediction errors.
The rest of this article is organized as follows. In Section 2, the basic theory of the Leaky-ESN is introduced. In Section 3, the OFESN model is proposed; the stability of the OFESN is considered, sufficient conditions for the OFESN to possess the echo state property are given, and the implementation steps of the OFESN are described. In Section 4, the OFESN model is simulated and discussed. Finally, the conclusion is given in Section 5.

2. Basic Theories of the Standard Leaky-ESN

A standard ESN typically consists of an input layer, a reservoir, and an output layer, as shown in Figure 1. The circles in Figure 1 represent neurons.
The quantities of input neurons, reservoir neurons, and output neurons are denoted as $K$, $N$, and $L$, respectively. At time step $n$, the input vector is denoted as $u(n) = [u_1(n), u_2(n), \ldots, u_K(n)]^T$, the state of the reservoir is represented by $x(n) = [x_1(n), x_2(n), \ldots, x_N(n)]^T$, and the output vector is given by $y(n) = [y_1(n), y_2(n), \ldots, y_L(n)]^T$. The input weight matrix $W^{in}$, with dimensions $N \times K$, represents the weights associated with the input of the system. The reservoir weight matrix $W$, with dimensions $N \times N$, represents the weights within the reservoir. The feedback weight matrix $W^{fb}$, with dimensions $N \times L$, represents the weights associated with the feedback connections. Lastly, the output weight matrix $W^{out}$, with dimensions $L \times (K+N)$, represents the weights associated with the output of the system. Typically, the initial values of $W^{in}$, $W$, and $W^{fb}$ are predetermined and remain constant during the training of an ESN. Conversely, the weight matrix $W^{out}$ is acquired through the training process of the ESN, which is one of the notable benefits of this approach. The Leaky-ESN is an enhanced variant of the standard ESN, characterized by a reservoir composed of leaky integrator neurons. The reservoir state equation of the Leaky-ESN is represented as follows [17]:
$$\dot{x} = \frac{1}{c}\left(-a x + f\left(W^{in} u + W x + W^{fb} y\right)\right),$$
$$y = g\left(W^{out}\left[x; u\right]\right),$$
where $c > 0$ represents the time constant of the Leaky-ESN model. The parameter $a > 0$ corresponds to the leaking rate of the reservoir nodes, which can be interpreted as the rate at which the reservoir state update equation is discretized in time. The function $f$ refers to a sigmoid function used within the reservoir, commonly either the hyperbolic tangent (tanh) or the logistic sigmoid function. The function $g$ represents the output activation function, typically either the identity function or the hyperbolic tangent (tanh) function. The notation $[\cdot\,;\cdot]$ denotes the concatenation of two vectors. By applying Euler's discretization method to the ordinary differential Equation (1) with respect to time, we can derive the discrete equation for the Leaky-ESN model:
$$x(n+1) = (1-a)\,x(n) + f\left(W^{in} u(n+1) + W x(n) + W^{fb} y(n)\right),$$
In Equation (2), the variable $W^{out}$ is both unknown and adjustable, and its computation is necessary throughout the training process of the desired network. During the training phase, the echo states $x(n)$ are organized row-wise into a state collection $X$. Similarly, the learned output values $y(n)$ that correspond to the $x(n)$ are arranged row-wise into a vector $Y$. Next, the calculation of $W^{out}$ is determined using the learning equation in the following manner:
$$W^{out} = \left(X^T X\right)^{-1} X^T Y,$$
where $X^T$ represents the transpose of the matrix $X$, while $\left(X^T X\right)^{-1}$ represents the inverse of the square matrix $X^T X$.
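A minimal NumPy sketch of the discrete Leaky-ESN state update and of the learning equation for $W^{out}$ is given below; the matrix sizes, scaling values, and random initialization are illustrative assumptions rather than settings taken from the paper.

```python
# Minimal sketch of the discrete Leaky-ESN update and the readout learning equation.
import numpy as np

rng = np.random.default_rng(0)
K, N, L = 1, 36, 1                      # input, reservoir, and output dimensions (assumed)
a, rho = 0.8, 0.6                       # leaking rate and target spectral radius (assumed)

W_in = rng.uniform(-0.1, 0.1, (N, K))
W = rng.uniform(-0.5, 0.5, (N, N))
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))   # rescale W to the target spectral radius
W_fb = np.zeros((N, L))                           # no output feedback in this sketch

def update(x, u, y):
    """x(n+1) = (1 - a) x(n) + f(W_in u(n+1) + W x(n) + W_fb y(n)), with f = tanh."""
    return (1 - a) * x + np.tanh(W_in @ u + W @ x + W_fb @ y)

def train_readout(X, Y):
    """Learning equation W_out = (X^T X)^{-1} X^T Y, computed via the pseudoinverse
    for numerical stability; rows of X are the collected states [x(n); u(n)]."""
    return np.linalg.pinv(X) @ Y
```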
The objective of training the ESN is to minimize the error function $E(y, d)$. The error $E(y, d)$ is commonly represented as a normalized root-mean-square error (NRMSE). The formula for the NRMSE is given by
$$E(y, d) = \frac{\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left\| y(i) - d(i) \right\|^2}}{\sigma(d)},$$
where $y(i)$ denotes the $i$th data point of the actual output; $d(i)$ denotes the $i$th data point of the desired output; $\|\cdot\|$ denotes the Euclidean norm; and $\sigma(d)$ denotes the standard deviation of the desired output.
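A small helper that evaluates the NRMSE above might look as follows (a sketch; the scalar-output case is assumed).

```python
# NRMSE: RMSE of (y - d) normalized by the standard deviation of the desired output d.
import numpy as np

def nrmse(y, d):
    y, d = np.asarray(y, dtype=float), np.asarray(d, dtype=float)
    return np.sqrt(np.mean((y - d) ** 2)) / np.std(d)
```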

3. Olfactory Feelings Echo State Network

3.1. The Structure of the OFESN

The subreservoirs of the OFESN may be composed of different types of neurons or the same types of neurons. Here, we assume that the reservoir of the OFESN is composed of m subreservoirs, and each subreservoir is composed of the same types of neurons, i.e., the neuron state update model of each subreservoir is the same. The enrichment of the dynamics of the reservoir can be achieved by constructing an echo state network with the following idea:
(1)
First, $m$ neurons with different initial states are generated, and the Euclidean distance between any two initial states is greater than or equal to a certain number that can be either specified or generated randomly. If this number is randomly generated, the states of the neurons of the subsequently generated subreservoirs are guaranteed to have more complex differences. The $m$ neurons generated above are referred to as master neurons, similar to the master neurons of typical neural circuits, such as the olfactory cortex, cerebellar cortex, and hippocampal structures, that are responsible for the input and output of the circuit. Each master neuron becomes the core of a subreservoir and thus becomes the representative of that subreservoir. Let $x_{11}, x_{22}, \ldots, x_{mm}$ denote these $m$ neurons, respectively. Their initial states need to satisfy the inequality $\|x_{ii} - x_{jj}\| \geq \sigma_{ij}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m)$. Here, $\sigma_{ij}$ may be either specified or randomly generated.
(2)
Next, with each master neuron as the center, a subreservoir is constructed around it. In each subreservoir, the neurons other than the master neuron are called sister neurons. The master neuron and the sister neurons of a subreservoir need to ensure high similarity and correlation, so that the master neuron can represent its own subreservoir. The communication between subreservoirs can then be realized by the communication between master neurons. The sister neurons belonging to the same subreservoir can communicate with each other, but the sister neurons belonging to different subreservoirs cannot. The $m$ master neurons thus give rise to $m$ subreservoirs, called the actual subreservoirs. Let the neurons of the $i$th subreservoir satisfy $\|x_{ii} - x_{ij}\| \leq \bar{\sigma}_{2i}$. Here, $x_{ij}$ $(j = 1, 2, \ldots, d_i,\ j \neq i)$ denotes the $j$th sister neuron of the $i$th subreservoir, and $d_i$ denotes the number of neurons of the $i$th subreservoir. (A small code sketch of this initialization is given after this list.)
(3)
The OFESN model can provide a new connection mode as follows: (i) The connections between subreservoirs are transformed into the connections between their respective master neurons, which can be determined by the small-world network method or by sparse connection. The connections between these master neurons actually generate a virtual and flexible subreservoir. The biggest difference between the virtual subreservoir and the actual subreservoirs is that the virtual subreservoir is composed only of the master neurons, and thus its neuron states have a bigger difference and less redundant information than those of an actual subreservoir. Therefore, the OFESN is equivalent to having a flexible virtual subreservoir and $m$ actual subreservoirs. (ii) The sister neurons within each subreservoir are sparsely connected. (iii) The sister neurons in different subreservoirs cannot communicate with each other, and there are no connections among them. Such a new connection mode greatly simplifies the coupling connections between neurons in different subreservoirs and thus reduces information redundancy. The sparse connection between the master neurons actually creates a virtual subreservoir, which adds a new subreservoir composed of master neurons with large diversity and is thus equivalent to increasing the number of neurons. Therefore, the OFESN model can be more suitable for situations where the network approximation ability is poor due to the small number of neurons in the whole network reservoir, especially when the number of neurons in each subreservoir is small and the number of subreservoirs is large.
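A minimal sketch of how steps (1) and (2) above could be realized is given below; the scalar state range, the single threshold values, and the rejection-sampling loop are illustrative assumptions rather than the authors' implementation.

```python
# Place m master-neuron initial states with pairwise distance >= sigma, then build
# each subreservoir from sister states close to their master.
import numpy as np

rng = np.random.default_rng(1)

def init_masters(m, sigma, max_tries=10000):
    masters = []
    for _ in range(max_tries):
        candidate = rng.uniform(-1, 1)
        if all(abs(candidate - x) >= sigma for x in masters):
            masters.append(candidate)
        if len(masters) == m:
            return np.array(masters)
    raise RuntimeError("could not place all master neurons; relax sigma")

def init_subreservoir(master, d_i, sigma_bar):
    # master state first, followed by d_i - 1 sister states within sigma_bar of it
    sisters = master + rng.uniform(-sigma_bar, sigma_bar, d_i - 1)
    return np.concatenate(([master], sisters))

masters = init_masters(m=3, sigma=0.4)
states = [init_subreservoir(x, d_i=12, sigma_bar=0.05) for x in masters]
x0 = np.concatenate(states)     # full reservoir initial state of length sum(d_i)
```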
The structure of the OFESN is shown in Figure 2. In Figure 2, the dashed lines from the output layer to the reservoir represent $W^{fb}$, the solid lines from the reservoir to the output layer represent $W^{out}$, and the colored dotted line indicates that $W^{out}$ can be adjusted online such that the network output $y(k)$ follows $y_d(k)$, in addition to being calculated by Equation (4). The black ellipses in the reservoir represent the actual subreservoirs. In each subreservoir, the circle filled with red denotes its master neuron. All master neurons together construct a virtual subreservoir, denoted by the green dotted ellipse. The neurons of each actual subreservoir, including its master neuron and the sister neurons, use sparse connection. Each subreservoir can be represented by its master neuron, and then the connections between actual subreservoirs can be determined by the connections between master neurons. The master neurons, i.e., the neurons of the virtual subreservoir, may use sparse connection or the small-world network method.
The state update equation of the OFESN is as follows:
$$x(n+1) = (1-a)\,x(n) + f\left(W^{in} u(n+1) + W x(n) + W^{fb} y(n)\right),$$
$$y(n) = g\left(W^{out}\left[x(n); u(n)\right]\right),$$
where $x(n) = [x_1^T(n), x_2^T(n), \ldots, x_m^T(n)]^T$ denotes the state vector of the reservoir and $x_i(n) = [x_{i1}, x_{i2}, \ldots, x_{id_i}]^T$ denotes the state of the $i$th subreservoir. $d_i$ indicates the number of neurons of the $i$th subreservoir. $W^{in}$, $W$, and $W^{fb}$ are the input connection weight matrix, the reservoir neuron connection weight matrix, and the output feedback connection weight matrix, respectively. The reservoir neuron connection weight matrix takes the following form:
$$W = \begin{bmatrix} W_1 & W_{12} & \cdots & W_{1m} \\ W_{21} & W_2 & \cdots & W_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ W_{m1} & W_{m2} & \cdots & W_m \end{bmatrix}$$
where $W_{ij}$ is the connection weight matrix between the $i$th subreservoir and the $j$th subreservoir, and $W_1, W_2, \ldots, W_m$ denote the internal connection weight matrices of the $m$ actual subreservoirs, which can be generated randomly with a certain sparsity. The dimensions of $W$, $W_{ij}$, and $W_i$ are $\sum_{i=1}^m d_i \times \sum_{i=1}^m d_i$, $d_i \times d_j$, and $d_i \times d_i$, respectively.
The following example of how to generate $W_{12}, W_{13}, \ldots, W_{1m}$ explains how to generate $W_{ij}$. $W_{12}$ denotes the connection weight matrix between the 1st subreservoir and the 2nd subreservoir. To reduce information redundancy and simplify the connections between neurons, the OFESN uses the connection between the master neuron of the 1st subreservoir and the master neuron of the 2nd subreservoir to represent the connection between the two subreservoirs. We assume that the first neuron of the first subreservoir is its master neuron and the second neuron of the second subreservoir is its master neuron. Thus, the element in the 1st row and the 2nd column of $W_{12}$ denotes the connection weight between the master neuron of the 1st subreservoir and the master neuron of the 2nd subreservoir, which can be expressed as $\bar{W}_{12}$. $\bar{W}_{12}$ may be a zero or nonzero value: $\bar{W}_{12} = 0$ means that there is no connection between the master neuron of the 1st subreservoir and the master neuron of the 2nd subreservoir, whereas $\bar{W}_{12} \neq 0$ means that there is such a connection. In other words, only $\bar{W}_{12}$ of $W_{12}$ may be a nonzero element, and the rest of the elements of $W_{12}$ are zero. Similarly, the element in the 1st row and 3rd column of $W_{13}$, denoted by $\bar{W}_{13}$, represents the connection weight between the master neuron of the 1st subreservoir and the master neuron of the 3rd subreservoir; only $\bar{W}_{13}$ of $W_{13}$ may be a nonzero element, and the rest of the elements of $W_{13}$ are zero. Likewise, the element in the 1st row and $m$th column of $W_{1m}$ is denoted by $\bar{W}_{1m}$; only $\bar{W}_{1m}$ of $W_{1m}$ may be a nonzero element, and the rest of the elements of $W_{1m}$ are zero. The values of $\bar{W}_{12}, \bar{W}_{13}, \ldots, \bar{W}_{1m}$ are determined by the connection weights between the neurons of the virtual subreservoir, and $\bar{W}_{12}, \bar{W}_{13}, \ldots, \bar{W}_{1m}$ are in fact the corresponding elements of the connection weight matrix $\bar{W}$ of the virtual subreservoir.
From the above, the element in the $i$th row and the $j$th column of $W_{ij}$, denoted by $\bar{W}_{ij}$, represents the connection weight between the master neuron of the $i$th subreservoir and the master neuron of the $j$th subreservoir. Only one element of $W_{ij}$ is possibly nonzero, and the rest of the elements of $W_{ij}$ are zero. The possibly nonzero element characterizes whether the master neuron of the $i$th subreservoir is connected with the master neuron of the $j$th subreservoir, and its value is determined by the random sparse connection of the virtual subreservoir. $\bar{W}_{ii}$ represents the self-connection of the master neuron in the $i$th subreservoir. Thus, $\bar{W}$ is made up of $\bar{W}_{ij}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m)$. We let $\bar{W}_{ii} = 0$, which is equivalent to $\bar{W}_{ii}$ being absorbed into $W_i$. $\bar{W}$ can be generated randomly with a spectral radius less than 1 and a certain sparsity. It is especially worth noting that the reservoir connection weight matrix $W$ is asymmetric; that is, the connections between reservoir neurons are directed.
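The block structure of $W$ described above can be sketched in NumPy as follows; treating the master neuron of the $i$th subreservoir as its $i$th neuron (matching the $x_{ii}$ notation), as well as the weight ranges and the sparsity mechanism, are assumptions made for illustration.

```python
# Assemble W: random sparse blocks W_i on the diagonal and off-diagonal blocks W_ij
# with at most one nonzero entry, the virtual-subreservoir weight between the master
# neurons of subreservoirs i and j. Requires d_i > i for the assumed master indexing.
import numpy as np

rng = np.random.default_rng(2)

def build_reservoir(d, sparsity=0.5):
    """d: list of subreservoir sizes d_i; returns the full N x N reservoir matrix W."""
    m, N = len(d), sum(d)
    offsets = np.cumsum([0] + list(d[:-1]))

    # virtual subreservoir: sparse m x m matrix over the master neurons, zero diagonal
    W_bar = rng.uniform(-0.5, 0.5, (m, m)) * (rng.random((m, m)) < sparsity)
    np.fill_diagonal(W_bar, 0.0)

    W = np.zeros((N, N))
    for i in range(m):
        # internal sparse connections of the i-th actual subreservoir
        Wi = rng.uniform(-0.5, 0.5, (d[i], d[i])) * (rng.random((d[i], d[i])) < sparsity)
        W[offsets[i]:offsets[i] + d[i], offsets[i]:offsets[i] + d[i]] = Wi
        for j in range(m):
            if i != j:
                # only the master-to-master entry of block W_ij may be nonzero
                W[offsets[i] + i, offsets[j] + j] = W_bar[i, j]
    return W

W = build_reservoir([12, 12, 12])
```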
In addition, in Equation (7), the input matrix $W^{in}$ and the feedback connection weight matrix $W^{fb}$ take the following form:
$$W^{in} = \left[ (W_1^{in})^T, (W_2^{in})^T, \ldots, (W_m^{in})^T \right]^T$$
$$W^{fb} = \left[ (W_1^{fb})^T, (W_2^{fb})^T, \ldots, (W_m^{fb})^T \right]^T$$
After the matrix is normalized, Equation (6) is rewritten as
$$x(n+1) = (1-a)\,x(n) + f\left(S^{in} W^{in} u(n+1) + \rho W x(n) + S^{fb} W^{fb} y(n)\right),$$
where $S^{in}$ is the input scaling factor, $\rho$ is the spectral radius, and $S^{fb}$ is the output feedback scaling factor.
After normalization, $W$, $W^{in}$, and $W^{fb}$ can be rewritten, respectively, as
$$W = \begin{bmatrix} \rho_1 W_1 & W_{12} & \cdots & W_{1m} \\ W_{21} & \rho_2 W_2 & \cdots & W_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ W_{m1} & W_{m2} & \cdots & \rho_m W_m \end{bmatrix}$$
$$W^{in} = \left[ (S_1^{in} W_1^{in})^T, (S_2^{in} W_2^{in})^T, \ldots, (S_m^{in} W_m^{in})^T \right]^T$$
$$W^{fb} = \left[ (S_1^{fb} W_1^{fb})^T, (S_2^{fb} W_2^{fb})^T, \ldots, (S_m^{fb} W_m^{fb})^T \right]^T$$
where $W_i$, $W_i^{in}$, and $W_i^{fb}$ are the normalized matrices.

3.2. The Echo State Property of the OFESN

Theorem 1 ([28]).
For a discrete OFESN model (8), if the following conditions are satisfied:
(i) 
$f$ is a sigmoid function (e.g., tanh);
(ii) 
The output activation function $g$ is a bounded function (for example, tanh), or $W^{fb} = 0$;
(iii) 
There is no output feedback, that is, $W^{fb} = 0$;
(iv) 
$|1 - a + \rho\,\delta_{max}| < 1$ (where $\delta_{max}$ is the maximal singular value of $W$);
the OFESN model has the echo state property.
In order to ensure that the OFESN satisfies the echo state property, the reservoir connection weight matrix must satisfy Theorem 1. In fact, each subreservoir does not need to meet the echo state property, as long as the entire reservoir does.
Proof. 
For any two states $x(n+1)$ and $\tilde{x}(n+1)$ of the reservoir at time $n+1$ [23], the following holds:
$$\begin{aligned} \|x(n+1) - \tilde{x}(n+1)\| &= \left\| (1-a)\left(x(n) - \tilde{x}(n)\right) + \left( f\left(S^{in} W^{in} u(n+1) + \rho W x(n)\right) - f\left(S^{in} W^{in} u(n+1) + \rho W \tilde{x}(n)\right) \right) \right\| \\ &\leq (1-a)\left\| x(n) - \tilde{x}(n) \right\| + \left\| f\left(S^{in} W^{in} u(n+1) + \rho W x(n)\right) - f\left(S^{in} W^{in} u(n+1) + \rho W \tilde{x}(n)\right) \right\| \\ &\leq (1-a)\left\| x(n) - \tilde{x}(n) \right\| + \rho\left\| W x(n) - W \tilde{x}(n) \right\| \\ &\leq \left(1 - a + \rho\,\delta_{max}\right)\left\| x(n) - \tilde{x}(n) \right\| \end{aligned}$$
Thus, $|1 - a + \rho\,\delta_{max}|$ is a global Lipschitz rate by which any two states approach each other in the state update. To guarantee that the OFESN has the echo state property, the inequality (10) must be satisfied:
$$|1 - a + \rho\,\delta_{max}| < 1,$$
hence, the proof is complete.    □
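Condition (iv) of Theorem 1 is easy to check numerically; the helper below is a sketch that uses the largest singular value of $W$ (note that the condition is sufficient, not necessary).

```python
# Check |1 - a + rho * delta_max| < 1, where delta_max is the largest singular value of W.
import numpy as np

def satisfies_esp(W, a, rho):
    delta_max = np.linalg.norm(W, 2)     # largest singular value of W
    return abs(1 - a + rho * delta_max) < 1
```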

3.3. Optimizing the Global Parameters of OFESN

The models to be optimized here are Equations (7) and (8). The parameters to be optimized are $a$, $S_i^{in}$, $S_i^{fb}$, $\rho_i$, $\bar{\rho}$ (the spectral radius of the matrix), and $W^{out}$. $W^{out}$ can be solved by a linear regression method, such as the pseudoinverse method. In order to simplify the operation, $\rho_i$, $\sigma_{ij}$, and $\bar{\sigma}_{2i}$ are not optimized and are given in advance. Only the parameters $q \in \{a, S_i^{in}, S_i^{fb}, \bar{\rho}\}$ are optimized, subject to the echo state property condition being satisfied. In this paper, the stochastic gradient descent method is used to optimize these parameters. Here, $i = 1, 2, \ldots, m$.
When $W^{fb} = 0$, invoking the chain rule on (8), we obtain $\frac{\partial x(n)}{\partial q}$:
$$\frac{\partial x(n)}{\partial a} = (1-a)\frac{\partial x(n-1)}{\partial a} - x(n-1) + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial a}\right),$$
$$\frac{\partial x(n)}{\partial \bar{\rho}} = (1-a)\frac{\partial x(n-1)}{\partial \bar{\rho}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial \bar{\rho}} + W x(n-1)\right),$$
$$\frac{\partial x(n)}{\partial S_i^{in}} = (1-a)\frac{\partial x(n-1)}{\partial S_i^{in}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial S_i^{in}} + \frac{\partial W^{in}}{\partial S_i^{in}} u(n)\right),$$
$$\frac{\partial x(n)}{\partial S_i^{fb}} = (1-a)\frac{\partial x(n-1)}{\partial S_i^{fb}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial S_i^{fb}}\right),$$
where $\odot$ denotes the element-wise product of two vectors.
When $W^{fb}$ is included, let $X(n) = S^{in} W^{in} u(n+1) + \rho W x(n) + S^{fb} W^{fb} y(n)$. Here, we use the symbol $0_u = [0, \ldots, 0]^T$ to represent an input vector whose entries are all zeros; then $\frac{\partial y(n)}{\partial q} = W^{out}\left[\frac{\partial x(n-2)}{\partial q}; 0_u\right]$, and we obtain $\frac{\partial x(n)}{\partial q}$:
$$\frac{\partial x(n)}{\partial a} = (1-a)\frac{\partial x(n-1)}{\partial a} - x(n-1) + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial a} + W^{fb} W^{out}\left[\frac{\partial x(n-2)}{\partial a}; 0_u\right]\right),$$
$$\frac{\partial x(n)}{\partial \bar{\rho}} = (1-a)\frac{\partial x(n-1)}{\partial \bar{\rho}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial \bar{\rho}} + W x(n-1) + W^{fb} W^{out}\left[\frac{\partial x(n-2)}{\partial \bar{\rho}}; 0_u\right]\right),$$
$$\frac{\partial x(n)}{\partial S_i^{in}} = (1-a)\frac{\partial x(n-1)}{\partial S_i^{in}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial S_i^{in}} + \frac{\partial W^{in}}{\partial S_i^{in}} u(n) + W^{fb} W^{out}\left[\frac{\partial x(n-2)}{\partial S_i^{in}}; 0_u\right]\right),$$
$$\frac{\partial x(n)}{\partial S_i^{fb}} = (1-a)\frac{\partial x(n-1)}{\partial S_i^{fb}} + f'(\cdot) \odot \left(\bar{\rho} W \frac{\partial x(n-1)}{\partial S_i^{fb}} + \frac{\partial W^{fb}}{\partial S_i^{fb}} y(n) + W^{fb} W^{out}\left[\frac{\partial x(n-2)}{\partial S_i^{fb}}; 0_u\right]\right),$$
In other words, to train the output weight matrix $W^{out}$, the output $y(n)$ should be as close as possible to the teacher output $d(n)$ during the training process. The error $\varepsilon(n)$ is expressed as follows:
$$\varepsilon(n) = y(n) - d(n),$$
We define the squared error $E(n)$ as follows:
$$E(n) = \frac{1}{2}\,\varepsilon(n)^2,$$
With $0_u = [0, \ldots, 0]^T$, we have
$$\frac{\partial E(n+1)}{\partial q} = \varepsilon(n+1)\, W^{out}\left[\frac{\partial x(n+1)}{\partial q}; 0_u\right],$$
and the global parameter update expression is as follows:
$$q(n+1) = q(n) - K\,\frac{\partial E(n+1)}{\partial q},$$
where $K$ represents the learning rate of the global parameter $q$. The parameters modified in this process must ensure that the OFESN model retains the echo state property in practical applications.
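A minimal sketch of one stochastic-gradient step for the feedback-free case ($W^{fb} = 0$) is shown below, updating only $a$ and $\bar{\rho}$ for brevity; the function signature and the folding of $S^{in}$ into $W^{in}$ are our simplifications, not the authors' implementation.

```python
# One SGD step on the global parameters a and rho_bar (W_fb = 0 case).
# f = tanh, so f'(z) = 1 - tanh(z)^2; S_in is assumed folded into W_in.
import numpy as np

def sgd_step(x_prev, dxda, dxdr, u, d, a, rho_bar, W, W_in, W_out, K_lr):
    z = W_in @ u + rho_bar * (W @ x_prev)       # argument of f in the state update
    x = (1 - a) * x_prev + np.tanh(z)
    fprime = 1 - np.tanh(z) ** 2

    # sensitivity recursions for dx/da and dx/drho_bar
    dxda = (1 - a) * dxda - x_prev + fprime * (rho_bar * (W @ dxda))
    dxdr = (1 - a) * dxdr + fprime * (rho_bar * (W @ dxdr) + W @ x_prev)

    # squared-error gradient through the linear readout on [x; u]
    eps = W_out @ np.concatenate([x, u]) - d
    zeros_u = np.zeros_like(u)
    dEda = float(eps @ (W_out @ np.concatenate([dxda, zeros_u])))
    dEdr = float(eps @ (W_out @ np.concatenate([dxdr, zeros_u])))

    # gradient step; in practice the new values must still satisfy the ESP condition
    return x, dxda, dxdr, a - K_lr * dEda, rho_bar - K_lr * dEdr
```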

3.4. Implementation of the OFESN

The simulation flow chart is shown in Figure 3, and the steps for OFESN implementation are as follows:
(i)
Assume that the reservoir of the OFESN is composed of $m$ classes of neurons, each class of neurons constitutes a subreservoir, and the number of neurons in the $i$th subreservoir is $d_i$; then the total number of neurons in the reservoir is $N = \sum_{i=1}^{m} d_i$.
(ii)
Initialize the parameters, including the sparse degree, the run steps, the learning rate, and the connection weight matrices $W^{in}$ and $W^{fb}$, as well as the parameters to be optimized: $a$, $\rho$, $S_i^{in}$, $S_i^{fb}$.
(iii)
The master neurons of the $m$ subreservoirs constitute a virtual subreservoir, and its corresponding connection weight matrix $\bar{W}$ is randomly generated with a certain sparsity.
(iv)
Initialize the $m$ master neurons. Let $x_{11}, x_{22}, x_{33}, \ldots, x_{mm}$ be given, respectively, and let their initial states satisfy $\|x_{ii} - x_{jj}\| \geq \sigma_{ij}$, where $\sigma_{ij}$ can be either specified or randomly generated.
(v)
Generate the initial states of the other sister neurons of the $m$ subreservoirs. With master neuron $x_{ii}$ $(i = 1, 2, \ldots, m)$ as the center, multiple sister neurons with high similarity and high correlation form the $i$th subreservoir, and the neuron states of the $i$th subreservoir satisfy $\|x_{ii} - x_{ij}\| \leq \bar{\sigma}_{2i}$.
(vi)
The internal connection weight matrix $W_i$ $(i = 1, 2, \ldots, m)$ of the $i$th actual subreservoir can be randomly generated, has a certain sparsity, and may satisfy a spectral radius of less than 1. $\bar{W}$ is composed of $\bar{W}_{ij}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m)$ and can be randomly generated with a spectral radius less than 1 and a certain sparsity. The spectral radii of $W_i$ and $\bar{W}$ need not be less than 1, but the echo state property condition, inequality (10), must be satisfied.
(vii)
Update the system reservoir states according to (8).
(viii)
The optimal parameters and $W^{out}$ are obtained at the end of training.
(ix)
Test OFESN accuracy and running time.
The implementation pseudocode is shown in Algorithm 1:
Algorithm 1: OFESN algorithm
Input: dataset, number of subreservoirs $m$, number of neurons $d_i$ in the $i$th subreservoir, sparse degree, run steps, learning rate $K$, $\sigma_{ij}$, $\bar{\sigma}_{2i}$
Output: testing set accuracy and running time
Split the dataset into a training set and a testing set;
Initialize the parameters: $W^{in}$, $W^{fb}$, $a$, $\rho$, $S_i^{in}$, $S_i^{fb}$;
Generate $\bar{W}$ from the master neurons ($\|x_{ii} - x_{jj}\| \geq \sigma_{ij}$) and the subreservoirs ($\|x_{ii} - x_{ij}\| \leq \bar{\sigma}_{2i}$);
while error $\geq \varepsilon$ or $n \leq$ run steps do
    Update the reservoir states according to Equation (8);
    Calculate $W^{out}$ according to Equation (4);
    Calculate the predicted value according to Equation (7);
    Calculate the error;
    Update the parameters by SGD;
    $n \leftarrow n + 1$;
end
Return $a$, $\rho$, $S_i^{in}$, $S_i^{fb}$, $W^{out}$;
Generate the optimal model;
Validate accuracy on the testing set;
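A compact Python skeleton of this training loop, reusing the helper sketches from Section 3 (build_reservoir, satisfies_esp, and nrmse are assumed to be in scope), could look as follows; the readout is fit once by the pseudoinverse (Equation (4)), and the SGD refinement of the global parameters is omitted, so this is a sketch rather than the full method.

```python
# Compact skeleton of the training loop in Algorithm 1.
import numpy as np

def train_ofesn(u, d, sizes, a=0.8, rho=0.6, s_in=0.1, washout=100):
    N = sum(sizes)
    rng = np.random.default_rng(3)
    W = build_reservoir(sizes)
    W /= np.linalg.norm(W, 2)                   # scale so delta_max = 1 (illustrative choice)
    W_in = rng.uniform(-1, 1, (N, 1))
    assert satisfies_esp(W, a, rho), "Theorem 1 condition violated"

    x = np.zeros(N)
    rows, targets = [], []
    for n in range(len(u)):
        x = (1 - a) * x + np.tanh(s_in * (W_in @ u[n:n+1]) + rho * (W @ x))
        if n >= washout:                        # discard initial transients
            rows.append(np.concatenate([x, u[n:n+1]]))
            targets.append(d[n])
    X, Y = np.array(rows), np.array(targets)
    W_out = np.linalg.pinv(X) @ Y
    return W, W_in, W_out, nrmse(X @ W_out, Y)  # reservoir, input weights, readout, training NRMSE
```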

4. Verification by Experiment Simulation

To verify the effectiveness of the OFESN, the sine superposition function and the Mackey–Glass system were used, and comparisons between the OFESN and Leaky-ESN are given in terms of run time and prediction accuracy.

4.1. Simulation Example 1

In this section, a sine superposition function is given as follows:
$$u_1(n) = \sin(n) + \sin(0.51n) + \sin(0.22n) + \sin(0.1002n) + \sin(0.05343n),$$
The teacher output $d_1(n)$ is
$$d_1(n) = \frac{u_1(n)}{5},$$
According to Equations (23) and (24), we generate 20,500 data samples, which are divided into 20,000 training samples (of which the first 100 are washout samples) and 500 testing samples. Errors are collected every 500 samples (one epoch), so the 20,000 training samples yield 40 points. Two scenarios for the size of the reservoir are examined: the first uses 36 neurons, and the second uses 17 neurons. When the size of the reservoir, denoted as $N$, is equal to 36, the reservoir is divided according to the number of subreservoirs it contains, i.e., into either 6 subreservoirs or 3 subreservoirs. In each case, the model is run 20 times with random initializations, and the resulting mean and standard deviation of the NRMSE are gathered. Error bar graphs are then constructed to visually represent these values.
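The data generation of Equations (23) and (24) can be reproduced as in the following sketch; reading Equation (24) as a simple scaling $d_1(n) = u_1(n)/5$ is our assumption from the extracted text.

```python
# Generate the sine superposition data of simulation example 1.
import numpy as np

n = np.arange(20500)
u1 = (np.sin(n) + np.sin(0.51 * n) + np.sin(0.22 * n)
      + np.sin(0.1002 * n) + np.sin(0.05343 * n))
d1 = u1 / 5                                  # assumed reading of the teacher output

u_train, d_train = u1[:20000], d1[:20000]    # first 100 of these serve as washout
u_test, d_test = u1[20000:], d1[20000:]      # 500 testing samples
```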

4.1.1. The Structure of 3 Actual Subreservoirs

The reservoir contains 3 actual subreservoirs, and the number of neurons in the $i$th actual subreservoir is denoted by $N_i$ $(i = 1, 2, 3)$. The number of neurons in the virtual subreservoir is denoted by $N_4$. Assuming that the actual subreservoirs have $N_i = 12$ and the virtual subreservoir has $N_4 = 3$, different initial values were selected for the parameters $a$, $\bar{\rho}$, $S_i^{in}$, and different sparsities of the neuron connections in the subreservoirs were tested. In this section, $a$, $\bar{\rho}$, $S_i^{in}$ take two different sets of initial values. The first case is $[a(0), \bar{\rho}(0), S_i^{in}(0)] = [1;\ 0.5\,rand(1);\ 0.15\,rand(1)]$, and the second case is $[a(0), \bar{\rho}(0), S_i^{in}(0)] = [0.8;\ 0.6;\ 0.1]$. Here, $rand(1)$ denotes a random number uniformly distributed within the interval $(0, 1)$. The connections between neurons in each subreservoir can be divided into two cases: full connection and partial connection. In view of the different initial values of the parameters and the different connections between neurons in each subreservoir, simulation tests are carried out in this section. The results are shown in Figure 4, Figure 5, Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10, respectively. Each figure is an error bar diagram of the NRMSE. The error bars are calculated using the mean and standard deviation, where the standard deviation is obtained by dividing by $M - 1$, with $M$ representing the number of samples.
In the initial phase of the network model, the prediction error is slightly larger. In order to see the details of the subsequent prediction errors, the initial 3 prediction error points are usually not drawn in the figures; so, there are only 37 data points. In addition, the $x(n)$ data from the beginning of the training run are usually discarded (i.e., not used for learning $W^{out}$) since they are contaminated by initial transients. Figure 4 shows the prediction error of the OFESN proposed in this paper for the different initial values and for fully or partially connected subreservoirs.
(i).
The first initial value case and the subreservoir fully connected (called Case 1)
$[a(0), \bar{\rho}(0), S_i^{in}(0)] = [1;\ 0.5\,rand(1);\ 0.15\,rand(1)]$. The internal neurons of each subreservoir are fully connected, i.e., the sparse degree of each actual subreservoir is 1, and the sparse degree of the virtual subreservoir is also 1. Thus, the number of nonzero internal connection weights in the whole reservoir, $N_{c\_Full\_3}$, is approximately calculated as follows:
$$N_{c\_Full\_3} = \sum_{i=1}^{3} N_i \cdot N_i + N_4 \cdot N_4$$
According to Equation (25), $N_{c\_Full\_3} = 441$. To make Leaky-ESN have the same number of nonzero internal connection weights, its corresponding sparse degree should be calculated as follows:
$$N_{c\_Full\_3} = N \cdot N \cdot \frac{S_1}{N}$$
where $\frac{S_1}{N}$ represents the sparse degree of the reservoir. According to Equation (26) and $N = 36$, $S_1 = 12.25$ is obtained, and then Leaky-ESN has a sparse degree of $\frac{12.25}{N}$ (a small helper that reproduces this bookkeeping is sketched at the end of this subsection). Figure 5 shows the prediction accuracy comparison between the OFESN and Leaky-ESN. In Figure 6, the performance of Leaky-ESN with a fully connected reservoir is given.
As can be seen from Figure 5 and Figure 6, the OFESN has higher prediction accuracy in the training process, and its error fluctuation range is smaller than that of Leaky-ESN, which verifies the effectiveness of the OFESN model.
(ii).
The first initial value case and the partially connected subreservoir (called Case 2)
The sparse degrees of the 3 subreservoirs of the OFESN are selected as $\frac{6}{N_i}$ $(i = 1, 2, 3)$, respectively, and the virtual subreservoir composed of the 3 master neurons is fully connected. Then, the number of internal connection weights in the whole reservoir is calculated as follows:
$$N_{c\_partial\_3} = \sum_{i=1}^{3} N_i \cdot N_i \cdot \frac{6}{N_i} + N_4 \cdot N_4$$
According to Equation (27), $N_{c\_partial\_3} = 225$. The Leaky-ESN with the same number of internal connection weights should have a reservoir sparse degree that satisfies:
$$N_{c\_partial\_3} = N \cdot N \cdot \frac{S_2}{N}$$
Solving Equation (28) gives $S_2 = 6.25$. Thus, the sparse degree of the corresponding Leaky-ESN should be set to $\frac{6.25}{36}$, which is consistent with the number of neuron interconnections in the reservoir of the OFESN. Here, the neuron interconnections include the self-connections of neurons, the two-way connections between two neurons, and the one-way connections between two neurons.
Figure 7 shows the comparison of the prediction error bars between the OFESN and Leaky-ESN. It can be seen that in the training process, the OFESN is superior to Leaky-ESN in terms of prediction accuracy and error stability, which further verifies the effectiveness of the OFESN model.
(iii).
The second initial value case and the fully connected subreservoir (called Case 3)
The initial value is the second case, i.e., $[a(0), \bar{\rho}(0), S_i^{in}(0)] = [0.8;\ 0.6;\ 0.1]$; the sparse degree of the 3 actual subreservoirs is 1, and the sparse degree of the virtual subreservoir is also 1. Similar to Equations (25) and (26), the corresponding sparse degree of Leaky-ESN should be set to $\frac{12.25}{36}$, which matches the number of neuron interconnections in the OFESN reservoir.
Figure 8 shows the prediction accuracy comparison between the OFESN and Leaky-ESN. Figure 9 shows the performance of Leaky-ESN with fully connected reservoir and the performance of the OFESN with all subreservoirs fully connected. As can be seen from Figure 8 and Figure 9, the prediction performance of the OFESN is better than Leaky-ESN, and the prediction error fluctuation is also much smaller than Leaky-ESN.
(iv).
The second initial value case and the partially connected subreservoir (called Case 4)
The sparse degrees of the 3 actual subreservoirs are $\frac{6}{N_i}$ $(i = 1, 2, 3)$, respectively. The virtual subreservoir composed of the 3 master neurons is fully connected, and its sparse degree is 1. Similar to Equations (27) and (28), the corresponding sparse degree of Leaky-ESN should be set to $\frac{5.8}{36}$. This makes Leaky-ESN have the same number of nonzero internal connection weights as the OFESN. The comparison of the training accuracy of the two models is shown in Figure 10. It can be seen that over the whole training process, the OFESN with partially connected subreservoirs has better prediction performance than Leaky-ESN.
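The connection-count bookkeeping of Equations (25)–(28) used in the cases above can be captured by a small helper; the function name and interface below are illustrative.

```python
# Count the nonzero internal weights of an OFESN reservoir and convert the count into
# the Leaky-ESN sparse degree S/N that gives the same number of connections.
def equivalent_sparse_degree(sub_sizes, sub_sparsities, virt_size, virt_sparsity):
    n_c = sum(d * d * s for d, s in zip(sub_sizes, sub_sparsities))
    n_c += virt_size * virt_size * virt_sparsity
    N = sum(sub_sizes)
    return n_c, n_c / N          # (connection count, S such that the degree is S/N)

# Case 1: three fully connected 12-neuron subreservoirs plus a fully connected
# 3-neuron virtual subreservoir -> 441 connections, S = 12.25.
print(equivalent_sparse_degree([12, 12, 12], [1, 1, 1], 3, 1))
# Case 2: sparse degree 6/N_i per subreservoir -> 225 connections, S = 6.25.
print(equivalent_sparse_degree([12, 12, 12], [6 / 12] * 3, 3, 1))
```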

4.1.2. The Structure of Six Actual Subreservoirs

The OFESN reservoir is divided into 6 actual subreservoirs, and the number of neurons in the $i$th actual subreservoir is denoted by $N_i$ $(i = 1, 2, \ldots, 6)$. Thus, the number of neurons in the virtual subreservoir, denoted by $N_7$, is $N_7 = 6$. The prediction accuracy of the OFESN and Leaky-ESN is compared for the two different initial values of the parameters $a$, $\bar{\rho}$, and $S_i^{in}$ and for the two kinds of connections, i.e., partial connection and full connection of the subreservoirs. Figure 11 shows the prediction accuracy of the OFESN under the different conditions. It can be seen that under both the sparsely connected and fully connected conditions, the OFESN converges, and the error is stable between 0 and 0.005.
(i).
The first initial value case and the fully connected subreservoir (called Case 1)
The sparse degree of each actual subreservoir is 1; that is, all the neurons in each actual subreservoir are connected, including interconnections and self-connections. The sparse degree of the virtual subreservoir is 1, and then the number of connections is 252. Thus, the corresponding sparse degree of Leaky-ESN should be set to $\frac{7}{36}$. In this case, the number of reservoir neuron connections of the OFESN is approximately the same as that of Leaky-ESN. Figure 12 shows the comparison between the OFESN and Leaky-ESN. Figure 13 shows the comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir.
It can be seen that under the same conditions, compared with Leaky-ESN, the OFESN has higher prediction accuracy and smaller error fluctuation.
(ii).
The first initial value case and the partially connected subreservoir (called Case 2)
The sparse degrees of the subreservoirs are selected as $\frac{3}{N_i}$ $(i = 1, 2, \ldots, 5)$ and $\frac{2}{N_6}$, and the sparse degree of the virtual subreservoir is $\frac{6}{N_7}$. Thus, the number of neuron connections in the whole reservoir is approximately as follows:
$$N_{c\_partial\_6} = \sum_{i=1}^{5} N_i \cdot N_i \cdot \frac{3}{N_i} + N_6 \cdot N_6 \cdot \frac{2}{N_6} + N_7 \cdot N_7$$
The corresponding approximate sparse degree of Leaky-ESN should be set to $\frac{3.83}{36}$. When the initial value is $[a(0), \bar{\rho}(0), S_i^{in}(0)] = [1;\ 0.5\,rand(1);\ 0.15\,rand(1)]$, the comparison between the OFESN and Leaky-ESN is shown in Figure 14. It can be seen that under the condition of different sparsities, the OFESN and Leaky-ESN have a similar prediction effect.
(iii).
The second initial value case and the fully connected subreservoir (called Case 3)
Each subreservoir is fully connected, and the corresponding sparse degree of Leaky-ESN should be set to $\frac{7}{36}$. Figure 15 shows the comparison between Leaky-ESN and the OFESN with each subreservoir fully connected at the second initial value. Figure 16 shows the comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir. It can be seen that under the condition of full connection, Leaky-ESN has higher prediction accuracy and smaller error fluctuation compared with the OFESN.
(iv).
The second initial value case and the partially connected subreservoir (called Case 4)
The sparse degrees of the actual subreservoirs are $\frac{3}{N_i}$ $(i = 1, 2, 3, 4, 5)$ and $\frac{2}{N_6}$, and the sparse degree of the virtual subreservoir is $\frac{6}{N_7}$. Thus, the corresponding sparse degree of Leaky-ESN should be set to $\frac{3.83}{36}$. Figure 17 shows the comparison between the OFESN and Leaky-ESN at the second initial value with partially connected subreservoirs. It can be seen that under partial connection conditions, compared with Leaky-ESN, the OFESN has weaker prediction accuracy and greater error fluctuation.
With the same number of neurons in the whole reservoir, the performance comparison of the OFESN with different numbers of subreservoirs is shown in Figure 18. In Figure 18, the red line with ∗ denotes the prediction error bars of the OFESN with three subreservoirs under the first initial value and partially connected subreservoirs, and the blue line with ∘ denotes the prediction error bars of the OFESN with six subreservoirs under the first initial value and partially connected subreservoirs.
The predicted time series length is 500, and the run time is shown in Table 1. Table 2 shows the comparison of the core parameters of the OFESN and Leaky-ESN. In Table 1, as in Table 3 and Table 4, $NRMSE\_train$ represents the training error, $Time\_train$ represents the training time, $NRMSE\_test$ represents the testing error, and $Time\_test$ represents the testing time. In Table 2, as in Table 5 and Table 6, $a$ represents the leaking rate, $\rho$ represents the spectral radius, and $S^{in}$ represents the input scaling factor. As can be seen from Table 1, the training and testing times of the OFESN are slightly better than those of Leaky-ESN; the training and testing errors are both improved by an order of magnitude, and the prediction accuracy is improved by 98% at most. As can be seen from Table 2, $a$, $\rho$, and $S^{in}$ are closely related to the selection of the initial values.

4.1.3. The Reservoir with 17 Neurons

In order to test the performance of the OFESN when the reservoir contains only a small number of neurons, we set the total number of neurons in the reservoir to $N = 17$ in the following simulation. Assuming that the reservoir is divided into 6 actual subreservoirs with $N_i = 3$ $(i = 1, 2, \ldots, 5)$ and $N_6 = 2$, the number of neurons in the virtual subreservoir is $N_7 = 6$. The performance of the OFESN is tested for the reservoir neurons with different connections and for the two different initial values of the reservoir parameters.
(i).
The first initial value case and the fully connected subreservoir (called Case 1)
The sparse degree of each actual subreservoir of the OFESN is set to 1; that is, each is fully connected. The sparse degree of the virtual subreservoir is $\frac{3}{6}$. Thus, the reservoir of the OFESN is equivalent to having 67 internal connection weights. The sparse degree of the corresponding Leaky-ESN is $\frac{3.94}{17}$. Figure 19 shows the performance of the OFESN and Leaky-ESN. It can be seen that the prediction performance of the OFESN is significantly better than that of Leaky-ESN, with smaller error fluctuation.
(ii).
The first initial value case and the partially connected subreservoir (called Case 2)
The sparse degree of each subreservoir is set to $\frac{2}{N_i}$ $(i = 1, 2, 3, 4, 5, 6)$ and that of the virtual subreservoir to $\frac{3}{N_7}$, which is equivalent to 52 internal connection weights. Thus, the corresponding sparse degree of Leaky-ESN should be set to $\frac{3.06}{17}$. The performance comparison between the OFESN and Leaky-ESN is shown in Figure 20. It can be seen that over the whole training process, the prediction performance of Leaky-ESN is better than that of the OFESN, with smaller error fluctuation.
(iii).
The second initial value case and the fully connected subreservoir (called Case 3)
The sparse degree of each actual subreservoir is set to 1; that is, each is fully connected. The sparse degree of the virtual subreservoir is $\frac{3}{6}$, which is equivalent to having 67 internal connection weights in the whole reservoir; the sparse degree of the corresponding Leaky-ESN is $\frac{3.94}{17}$. Figure 21 shows the performance comparison between the OFESN and Leaky-ESN. It can be seen that the OFESN is slightly better than Leaky-ESN in terms of prediction accuracy and error fluctuation.
(iv).
The second initial value case and the partially connected subreservoir (called Case 4)
The sparse degree of each actual subreservoir is set to $\frac{2}{N_i}$ $(i = 1, 2, \ldots, 6)$; that is, each is partially connected. The sparse degree of the virtual subreservoir is $\frac{3}{6}$, which is equivalent to having 52 internal connection weights in the whole reservoir; the sparse degree of the corresponding Leaky-ESN is $\frac{3.06}{17}$. Figure 22 shows the performance comparison between the OFESN and Leaky-ESN.
Figure 23 shows the performance comparison between the OFESN with 36 neurons and the OFESN with 17 neurons in the case of the first initial value and partially connected subreservoirs. The predicted time series length is 20,000, and the run time is shown in Table 3. Table 5 shows the comparison of the core parameters of the OFESN and Leaky-ESN. As can be seen from Table 3, the training and testing times of the OFESN are slightly better than those of Leaky-ESN; the training and testing errors are both improved by an order of magnitude, and the prediction accuracy is improved by 96% at most. As can be seen from Table 5, $a$, $\rho$, and $S^{in}$ are closely related to the selection of the initial values.

4.1.4. Analysis of the Simulation Results

The simulation figures show that the prediction performance of the OFESN with the same number of internal connection weights is better than that of Leaky-ESN in most cases. Even when the prediction performance of the OFESN with the same number of connection weights is similar to that of Leaky-ESN, the prediction error volatility of the OFESN is much smaller. In addition, comparisons between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir are shown in Figure 6, Figure 9, Figure 13 and Figure 16; the OFESN with each subreservoir fully connected has better performance and less error fluctuation than Leaky-ESN with full connection. Figure 23 shows that the OFESN with 36 neurons and 3 subreservoirs has a prediction performance close to that of the OFESN with 17 neurons and 6 subreservoirs. Thus, when the number of neurons is small, increasing the number of subreservoirs effectively increases the number of neurons in the whole reservoir through the added virtual subreservoir. In a word, the OFESN has more stable performance and a shorter run time.

4.2. Mackey–Glass Chaotic Time Series

The Mackey–Glass chaotic time series (MGS) is a classic nonlinear dynamical system, commonly used in fields such as time series analysis, signal processing, and chaos theory. For example, the Mackey–Glass equation can be used to describe the fluctuations and trends in market prices in economics and study the biological clock and rhythmic behavior within organisms in biology. The Mackey–Glass equation can be represented as
$$g(\theta+1) = g(\theta) + \Delta T\left(\frac{p\, g\!\left(\theta - \frac{\tau}{\Delta T}\right)}{1 + g\!\left(\theta - \frac{\tau}{\Delta T}\right)^{10}} - q\, g(\theta)\right)$$
where $p = 0.2$, $\tau = 17$, $q = 0.1$, and $\Delta T = 0.1$. $\tau$ denotes the delay factor, and the sequence has chaotic properties when $\tau > 16.8$. We normalize the dataset by the min-max scaling method so that all the data lie between 0 and 1. We split the time series dataset by using 70% of it for training, 20% for validation, and the remaining 10% for testing. The first 100 samples are discarded during training to guarantee that the system is not affected by the initial transient, in accordance with the echo state property. According to the equation above, we generate 40,000 data samples, which are divided into three parts: 20,000 training samples, 10,000 testing samples, and 100 initial washout samples. The predicted time series length is 20,000, and the run time is shown in Table 4. Table 6 shows the comparison of the core parameters of the OFESN and Leaky-ESN.
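The series can be generated with the Euler discretization above, as in the following sketch; the minus sign on the $q\,g(\theta)$ term follows the standard Mackey–Glass equation (the sign is ambiguous in the extracted text), and the constant initial history value is an assumption.

```python
# Generate and normalize the Mackey-Glass series (delay in steps: tau / dT = 170).
import numpy as np

def mackey_glass(n_samples, p=0.2, q=0.1, tau=17, dT=0.1, g0=1.2):
    delay = int(tau / dT)
    g = np.full(n_samples + delay, g0)          # constant initial history (assumed)
    for t in range(delay, n_samples + delay - 1):
        g[t + 1] = g[t] + dT * (p * g[t - delay] / (1 + g[t - delay] ** 10) - q * g[t])
    series = g[delay:]
    # min-max scaling to [0, 1], as described in the text
    return (series - series.min()) / (series.max() - series.min())

mg = mackey_glass(40000)
```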
Figure 24, Figure 25, Figure 26 and Figure 27 show the comparison of the prediction error results of the OFESN and Leaky-ESN in the different cases. As can be seen from the figures, in Cases 1, 2, and 4, the prediction error of Leaky-ESN is significantly better than that of the OFESN and tends to converge, while the OFESN tends to diverge. In Case 3, the prediction effect of the OFESN is better than that of Leaky-ESN. As can be seen from Table 4, the training and testing times of the OFESN are slightly better than those of Leaky-ESN; the training and testing errors are both improved by an order of magnitude, and the prediction accuracy is improved by 98% at most. As can be seen from Table 6, $a$, $\rho$, and $S^{in}$ are closely related to the selection of the initial values.

5. Conclusions

This paper proposed a new multireservoir echo state network, namely the OFESN. The OFESN transforms the connections between subreservoirs into connections between their master neurons, greatly reducing the coupling connections between neurons in the reservoir, further increasing the sparse degree, and reducing information redundancy. Compared with the Leaky-ESN network, the OFESN greatly reduces the internal coupling and correlation between different subreservoirs. Without increasing the number of neurons in the reservoir, introducing a sparse connection between the master neurons is effectively equivalent to greatly increasing the number of neurons, especially the number of neurons with great dissimilarity. Therefore, when the number of neurons in each actual subreservoir is small and the number of subreservoirs is large, this is equivalent to obtaining more reservoir neurons while ensuring the dissimilarity of neuron states, which improves the prediction accuracy compared with Leaky-ESN and greatly reduces the amount of calculation. The prediction accuracy improved by 98% in some cases. However, there are some limitations to the OFESN and Leaky-ESN. The gradient descent optimization method is used to optimize the parameters under constraint conditions (the echo state property conditions); it is sensitive to the initial values of the parameters and cannot guarantee global optimality. Therefore, in the future, we will look for a suitable swarm intelligence optimization algorithm to optimize the parameters so as to eliminate the impact of the initial parameter values. The performance comparison between the OFESN and Leaky-ESN is not a comparison between the optimal performances of the two models but a comparison under the same parameters and equivalent conditions.

Author Contributions

Methodology, Q.W.; software, Q.W.; validation, Q.W.; data curation, Q.W.; writing—original draft preparation, Q.W.; writing—review and editing, S.L., J.C. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number 61773074 and Key research projects of the Education Department in Liaoning Province grant number LJKZZ20220118.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Beritelli, F.; Capizzi, G.; Sciuto, G.L.; Napoli, C.; Tramontana, E.; Woźniak, M. Reducing interferences in wireless communication systems by mobile agents with recurrent neural networks-based adaptive channel equalization. In Proceedings of the XXXVI Symposium on Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments (Wilga 2015), Wilga, Poland, 25 May 2015; Volume 9662, pp. 497–505. [Google Scholar]
  2. Jaeger, H. The “echo state” approach to analysing and training recurrent neural networks-with an erratum note. Bonn Ger. Ger. Natl. Res. Cent. Inf. Technol. GMD Tech. Rep. 2001, 148, 13. [Google Scholar]
  3. Jaeger, H.; Haas, H. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 2004, 304, 78–80. [Google Scholar] [CrossRef]
  4. Jaeger, H. Tutorial on Training Recurrent Neural Networks, Covering BPPT, RTRL, EKF and the “Echo State Network” Approach; German National Research Center for Information Technology: Sankt Augustin, Germany, 2002. [Google Scholar]
  5. Soliman, M.; Mousa, M.A.; Saleh, M.A.; Elsamanty, M.; Radwan, A.G. Modelling and implementation of soft bio-mimetic turtle using echo state network and soft pneumatic actuators. Sci. Rep. 2021, 11, 12076. [Google Scholar] [CrossRef] [PubMed]
  6. Wootton, A.J.; Taylor, S.L.; Day, C.R.; Haycock, P.W. Optimizing echo state networks for static pattern recognition. Cogn. Comput. 2017, 9, 391–399.
  7. Mahmoud, T.A.; Abdo, M.I.; Elsheikh, E.A.; Elshenawy, L.M. Direct adaptive control for nonlinear systems using a TSK fuzzy echo state network based on fractional-order learning algorithm. J. Frankl. Inst. 2021, 358, 9034–9060.
  8. Wang, Q.; Pan, Y.; Cao, J.; Liu, H. Adaptive Fuzzy Echo State Network Control of Fractional-Order Large-Scale Nonlinear Systems With Time-Varying Deferred Constraints. IEEE Trans. Fuzzy Syst. 2023, 1–15.
  9. Gao, R.; Du, L.; Duru, O.; Yuen, K.F. Time series forecasting based on echo state network and empirical wavelet transformation. Appl. Soft Comput. 2021, 102, 107111.
  10. Bai, Y.; Liu, M.D.; Ding, L.; Ma, Y.J. Double-layer staged training echo-state networks for wind speed prediction using variational mode decomposition. Appl. Energy 2021, 301, 117461.
  11. Tian, Z. Echo state network based on improved fruit fly optimization algorithm for chaotic time series prediction. J. Ambient Intell. Humaniz. Comput. 2022, 13, 3483–3502.
  12. Ribeiro, G.T.; Santos, A.A.P.; Mariani, V.C.; dos Santos Coelho, L. Novel hybrid model based on echo state neural network applied to the prediction of stock price return volatility. Expert Syst. Appl. 2021, 184, 115490.
  13. Qiao, J.; Li, F.; Han, H.; Li, W. Growing echo-state network with multiple subreservoirs. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 391–404.
  14. Huang, J.; Li, Y.; Shardt, Y.A.; Qiao, L.; Shi, M.; Yang, X. Error-driven chained multiple-subnetwork echo state network for time-series prediction. IEEE Sens. J. 2022, 22, 19533–19542.
  15. Ma, Q.; Chen, E.; Lin, Z.; Yan, J.; Yu, Z.; Ng, W.W. Convolutional multitimescale echo state network. IEEE Trans. Cybern. 2019, 51, 1613–1625.
  16. Gallicchio, C.; Micheli, A. Deep reservoir neural networks for trees. Inf. Sci. 2019, 480, 174–193.
  17. Na, X.; Zhang, M.; Ren, W.; Han, M. Multi-step-ahead chaotic time series prediction based on hierarchical echo state network with augmented random features. IEEE Trans. Cogn. Dev. Syst. 2022, 15, 700–711.
  18. Wang, H.; Wu, Q.J.; Wang, D.; Xin, J.; Yang, Y.; Yu, K. Echo state network with a global reversible autoencoder for time series classification. Inf. Sci. 2021, 570, 744–768.
  19. Lun, S.X.; Yao, X.S.; Qi, H.Y.; Hu, H.F. A novel model of leaky integrator echo state network for time-series prediction. Neurocomputing 2015, 159, 58–66.
  20. Lun, S.; Zhang, Z.; Li, M.; Lu, X. Parameter Optimization in a Leaky Integrator Echo State Network with an Improved Gravitational Search Algorithm. Mathematics 2023, 11, 1514.
  21. Ren, W.; Ma, D.; Han, M. Multivariate Time Series Predictor With Parameter Optimization and Feature Selection Based on Modified Binary Salp Swarm Algorithm. IEEE Trans. Ind. Inform. 2022, 19, 6150–6159.
  22. Hu, H.; Wang, L.; Tao, R. Wind speed forecasting based on variational mode decomposition and improved echo state network. Renew. Energy 2021, 164, 729–751.
  23. Wainrib, G.; Galtier, M.N. A local echo state property through the largest Lyapunov exponent. Neural Netw. 2016, 76, 39–45.
  24. Gallicchio, C.; Micheli, A. Echo state property of deep reservoir computing networks. Cogn. Comput. 2017, 9, 337–350.
  25. Yang, C.; Qiao, J.; Ahmad, Z.; Nie, K.; Wang, L. Online sequential echo state network with sparse RLS algorithm for time series prediction. Neural Netw. 2019, 118, 32–42.
  26. Gallicchio, C.; Micheli, A.; Pedrelli, L. Deep reservoir computing: A critical experimental analysis. Neurocomputing 2017, 268, 87–99.
  27. Wu, Z.; Li, Q.; Zhang, H. Chain-structure echo state network with stochastic optimization: Methodology and application. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 1974–1985.
  28. Jaeger, H.; Lukoševičius, M.; Popovici, D.; Siewert, U. Optimization and applications of echo state networks with leaky-integrator neurons. Neural Netw. 2007, 20, 335–352.
Figure 1. The structure of a standard ESN.
Figure 2. The structure of the OFESN.
Figure 3. OFESN test error under different initial values and connection modes.
Figure 4. OFESN test error under different initial values and connection modes.
Figure 5. Comparison between the OFESN with full connection and Leaky-ESN with a sparse degree of 12.25/36 at the first initial value.
Figure 6. Comparison between the OFESN with subreservoirs fully connected and Leaky-ESN with full connection at the first initial value.
Figure 7. Comparison between the OFESN and Leaky-ESN with partially connected reservoirs at the first initial value.
Figure 8. Comparison between the OFESN with fully connected subreservoirs and Leaky-ESN with a sparse degree of 12.25/36 at the second initial value.
Figure 9. Comparison of Leaky-ESN with a fully connected reservoir and the OFESN with all subreservoirs fully connected.
Figure 10. Comparison between the OFESN and Leaky-ESN with partially connected reservoirs at the second initial value.
Figure 11. Prediction error bars of the OFESN under different conditions.
Figure 12. Comparison between the OFESN and Leaky-ESN under the first initial value and full connection.
Figure 13. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir under the first initial value.
Figure 14. Comparison between the OFESN and Leaky-ESN under partial connection and the first initial value.
Figure 15. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a sparse degree of 7/36 at the second initial value.
Figure 16. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir at the second initial value.
Figure 17. Comparison between the OFESN and Leaky-ESN under partial connection and the second initial value.
Figure 18. Comparison of the OFESN with different numbers of subreservoirs under the first initial value and partial connection.
Figure 19. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN under the first initial value.
Figure 20. Comparison between the OFESN with each subreservoir partially connected and Leaky-ESN under the first initial value.
Figure 21. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN under the first initial value.
Figure 22. Comparison between the OFESN and Leaky-ESN under partial connection and the second initial value.
Figure 23. Comparison between the OFESN with each subreservoir fully connected and Leaky-ESN with a fully connected reservoir under the first initial value.
Figure 24. Error results based on the OFESN and Leaky-ESN for the Mackey–Glass series.
Figure 25. Error results based on the OFESN and Leaky-ESN for the Mackey–Glass series.
Figure 26. Error results based on the OFESN and Leaky-ESN for the Mackey–Glass series.
Figure 27. Error results based on the OFESN and Leaky-ESN for the Mackey–Glass series.
Table 1. Comparison of running time and NRMSE between the OFESN and Leaky-ESN.
Model      Case    NRMSE_train     Time_train (s)  NRMSE_test      Time_test (s)
OFESN      Case 1  2.5609 × 10⁻⁷   0.231060        2.5282 × 10⁻⁷   0.004107
OFESN      Case 2  6.3762 × 10⁻⁷   0.234678        7.2728 × 10⁻⁷   0.004057
OFESN      Case 3  8.2226 × 10⁻⁶   0.228521        8.4277 × 10⁻⁶   0.004182
OFESN      Case 4  4.7944 × 10⁻⁶   0.232367        5.2103 × 10⁻⁶   0.004416
Leaky-ESN  Case 1  2.0096 × 10⁻⁶   0.232815        2.2874 × 10⁻⁶   0.007072
Leaky-ESN  Case 2  2.9704 × 10⁻⁶   0.246669        3.1719 × 10⁻⁶   0.006204
Leaky-ESN  Case 3  1.3395 × 10⁻⁵   0.239171        1.4666 × 10⁻⁵   0.005308
Leaky-ESN  Case 4  1.3040 × 10⁻⁵   0.276972        1.4304 × 10⁻⁵   0.006306
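The error values in Tables 1, 3 and 4 are reported as NRMSE. The helper below is a minimal sketch of the variance-normalized NRMSE that is conventional for ESN benchmarks; it is offered as an assumption for readers reproducing these columns, since the paper's exact normalization is not restated in this back matter.

```python
import numpy as np

def nrmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Normalized root-mean-square error (assumed variance-normalized form):
    NRMSE = sqrt( mean((y_true - y_pred)^2) / var(y_true) )."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    return float(np.sqrt(mse / np.var(y_true)))
```

Applying such a function to the training and test predictions yields values on the same scale as the NRMSE_train and NRMSE_test columns above.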
Table 2. Comparison of parameter values between the OFESN and Leaky-ESN.
Model      Case    a        ρ        S_in
OFESN      Case 1  0.99972  0.30466  0.015876
OFESN      Case 2  0.99573  0.43532  0.219020
OFESN      Case 3  0.80057  0.59874  0.099303
OFESN      Case 4  0.80151  0.59847  0.096980
Leaky-ESN  Case 1  0.99979  0.40591  0.090287
Leaky-ESN  Case 2  0.99853  0.12769  0.095093
Leaky-ESN  Case 3  0.79942  0.59929  0.100000
Leaky-ESN  Case 4  0.80024  0.60003  0.100000
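Tables 2, 5 and 6 compare the tuned leaking rate a, spectral radius ρ, and input scaling S_in. The sketch below shows where these three parameters enter a standard leaky-integrator ESN update in the spirit of [28]; the block-structured OFESN reservoir is not reproduced here, and the 36-neuron size (inferred from the 12.25/36 and 7/36 sparsity figures), the random initialization, and the sparsity level are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_reservoir(n_res, n_in, rho, s_in, density=0.1):
    """Random reservoir and input weights rescaled to spectral radius rho and
    input scaling s_in (illustrative initialization, not the paper's exact one)."""
    W = rng.uniform(-1.0, 1.0, (n_res, n_res))
    W *= rng.random((n_res, n_res)) < density            # keep only a sparse subset of connections
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))      # rescale so the spectral radius equals rho
    W_in = s_in * rng.uniform(-1.0, 1.0, (n_res, n_in))  # input weights scaled by S_in
    return W, W_in

def leaky_update(x, u, W, W_in, a):
    """One leaky-integrator state update with leaking rate a (cf. ref. [28])."""
    return (1.0 - a) * x + a * np.tanh(W @ x + W_in @ u)

# Illustration with the OFESN Case 1 values from Table 2.
W, W_in = init_reservoir(n_res=36, n_in=1, rho=0.30466, s_in=0.015876)
x = np.zeros(36)
x = leaky_update(x, np.array([0.5]), W, W_in, a=0.99972)
```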
Table 3. Comparison of running time and NRMSE between the OFESN and Leaky-ESN.
Model      Case    NRMSE_train     Time_train (s)  NRMSE_test      Time_test (s)
OFESN      Case 1  8.6111 × 10⁻⁷   0.205972        8.4250 × 10⁻⁷   0.005496
OFESN      Case 2  2.8909 × 10⁻⁵   0.214866        2.8671 × 10⁻⁵   0.004983
OFESN      Case 3  3.4598 × 10⁻⁵   0.207253        3.6116 × 10⁻⁵   0.005006
OFESN      Case 4  6.3554 × 10⁻⁵   0.251348        6.6807 × 10⁻⁵   0.004716
Leaky-ESN  Case 1  6.1811 × 10⁻⁶   0.220781        8.3576 × 10⁻⁶   0.007218
Leaky-ESN  Case 2  2.1921 × 10⁻⁵   0.202413        6.4650 × 10⁻⁵   0.005537
Leaky-ESN  Case 3  6.0361 × 10⁻⁶   0.207606        4.9683 × 10⁻⁵   0.005591
Leaky-ESN  Case 4  2.2701 × 10⁻⁵   0.216183        2.5725 × 10⁻⁴   0.005744
Table 4. Comparison of running time and NRMSE between the OFESN and Leaky-ESN for the Mackey–Glass series.
Model      Case    NRMSE_train     Time_train (s)  NRMSE_test      Time_test (s)
OFESN      Case 1  8.3616 × 10⁻⁸   0.535822        6.0156 × 10⁻⁸   0.059386
OFESN      Case 2  6.6978 × 10⁻⁸   0.231891        8.2199 × 10⁻⁸   0.066284
OFESN      Case 3  1.6348 × 10⁻⁷   0.237385        6.4672 × 10⁻⁸   0.046180
OFESN      Case 4  1.3460 × 10⁻⁷   0.237897        5.4552 × 10⁻⁸   0.046180
Leaky-ESN  Case 1  6.093 × 10⁻⁷    0.238008        6.9946 × 10⁻⁷   0.076255
Leaky-ESN  Case 2  1.8484 × 10⁻⁷   0.233209        2.1197 × 10⁻⁷   0.055636
Leaky-ESN  Case 3  1.1278 × 10⁻⁷   0.239115        1.4215 × 10⁻⁷   0.055965
Leaky-ESN  Case 4  1.9905 × 10⁻⁷   0.235569        2.2779 × 10⁻⁷   0.056949
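Tables 4 and 6 refer to the Mackey–Glass benchmark. A minimal generator for such a series is sketched below, assuming the commonly used delay-differential form with τ = 17, β = 0.2, γ = 0.1, n = 10 and simple Euler integration; the paper's exact sampling step, delay, and initialization may differ.

```python
import numpy as np

def mackey_glass(n_samples: int, tau: int = 17, beta: float = 0.2,
                 gamma: float = 0.1, n: int = 10, dt: float = 1.0,
                 x0: float = 1.2) -> np.ndarray:
    """Mackey-Glass series via Euler integration of
    dx/dt = beta * x(t - tau) / (1 + x(t - tau)**n) - gamma * x(t)."""
    history = int(tau / dt)                      # number of delayed samples kept
    x = np.full(history + n_samples, x0)         # constant initial history
    for t in range(history, history + n_samples - 1):
        x_tau = x[t - history]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau ** n) - gamma * x[t])
    return x[history:]

series = mackey_glass(2000)   # e.g., split into training and test segments
```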
Table 5. Comparison of parameter values between the OFESN and Leaky-ESN.
Model      Case    a        ρ         S_in
OFESN      Case 1  0.95811  0.32002   0.017681
OFESN      Case 2  0.99767  0.17067   0.080665
OFESN      Case 3  0.72506  0.29053   0.071419
OFESN      Case 4  0.80024  0.59986   0.098906
Leaky-ESN  Case 1  1.0008   0.23961   0.08112
Leaky-ESN  Case 2  0.97694  0.027551  0.13631
Leaky-ESN  Case 3  0.79864  0.59956   0.100000
Leaky-ESN  Case 4  0.79933  0.59911   0.100000
Table 6. Comparison of parameter values between the OFESN and Leaky-ESN for the Mackey–Glass series.
Model      Case    a        ρ        S_in
OFESN      Case 1  0.99923  0.36061  0.083281
OFESN      Case 2  1.00020  0.41180  0.14214
OFESN      Case 3  0.79966  0.60101  0.095925
OFESN      Case 4  0.80077  0.59993  0.098408
Leaky-ESN  Case 1  0.99521  0.32643  0.094922
Leaky-ESN  Case 2  1.00150  0.43965  0.044478
Leaky-ESN  Case 3  0.80458  0.59322  0.100000
Leaky-ESN  Case 4  0.79861  0.60050  0.100000
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
