Article

Generalized Framework for Liquid Neural Network upon Sequential and Non-Sequential Tasks

by Prakash Kumar Karn 1,*, Iman Ardekani 2 and Waleed H. Abdulla 1,*
1 Department of Electrical, Computer and Software Engineering, University of Auckland, Auckland 1010, New Zealand
2 School of Arts and Sciences, The University of Notre Dame Australia, Fremantle 6160, Australia
* Authors to whom correspondence should be addressed.
Mathematics 2024, 12(16), 2525; https://doi.org/10.3390/math12162525
Submission received: 27 May 2024 / Revised: 8 August 2024 / Accepted: 12 August 2024 / Published: 15 August 2024
(This article belongs to the Special Issue Machine-Learning-Based Process and Analysis of Medical Images)

Abstract:
This paper introduces a novel approach to neural networks: a Generalized Liquid Neural Network (GLNN) framework. This design excels at handling both sequential and non-sequential tasks. By leveraging the Runge-Kutta DOPRI method, the GLNN enables dynamic simulation of complex systems across diverse fields. Our research demonstrates the framework’s capabilities through three key applications. In predicting damped sinusoidal trajectories, the Generalized LNN outperforms the neural ODE by approximately 46.03% and the conventional LNN by 57.88%. In modelling non-linear RLC circuits, it shows a 20% improvement in precision. Finally, in medical diagnosis through Optical Coherence Tomography (OCT) image analysis, our approach achieves an F1 score of 0.98, surpassing the classical LNN by 10%. These advancements signify a significant shift, opening new possibilities for neural networks in complex system modelling and healthcare diagnostics. This research advances the field by introducing a versatile and reliable neural network architecture.

1. Introduction

Neural ordinary differential equations (ODEs) present a novel paradigm in deep learning, offering a continuous depth (or time) perspective on model architecture [1]. Unlike traditional neural networks, which operate in discrete layers, neural ODEs conceptualize the depth of a network as a continuous variable. This continuous-depth approach enables neural ODEs to dynamically adjust their complexity based on the input data, potentially leading to more efficient and adaptable models [2]. Neural ODEs are especially promising for tasks involving time series data or any form of data that evolves over time. This continuous-depth approach is also beneficial in applications such as financial forecasting, climate modelling, and any domain requiring the analysis of evolving data over continuous intervals. Their inherent design aligns closely with the continuous nature of physical systems, making them particularly suitable for physics-informed machine learning and dynamic systems modelling applications. By treating the evolution of the network’s hidden states as a continuous process governed by differential equations, neural ODEs can naturally model the temporal dynamics of complex systems, and they have already shown promising results across a variety of such domains.
Furthermore, the adaptive computation feature of neural ODEs allows for variable computational resources depending on the task’s complexity. A neural ODE might use fewer computational steps for tasks with more straightforward dynamics, whereas more complex dynamics might necessitate a deeper computational “depth”. This flexibility can lead to more efficient models that scale their complexity based on the complexities of the input data.
Neural ordinary differential equations (ODEs) are increasingly used for their properties like invertibility, stability, and parameter efficiency, forming a family of models through continuous-time ODE approximations. Additionally, their inherent stability and ability to invert transformations make them robust against perturbations in data, which is crucial for reliable performance in critical applications. H. Cai et al. [3] proposed that an ODE-based brain state recognition neural network (OSR-Net) demonstrates superior performance in recognizing cognitive states from neuroimaging data by leveraging steady neural ODEs and incorporating SPD matrices on the Riemannian manifold, outperforming traditional RNN-based models. Neural ODEs also offer benefits in terms of parameter efficiency and the ability to handle irregularly sampled data. Since a continuous transformation defines the model, it can effectively interpolate or extrapolate the data dynamics even when the sampling points are not uniformly distributed. This characteristic is particularly advantageous for real-world applications where data may be missing or sampled at irregular intervals.
The advent of neural networks has significantly advanced the field of artificial intelligence, offering powerful tools for a wide range of applications, from image recognition to natural language processing. Among various architectures, Liquid Neural Networks (LNNs) have gained attention due to their capability to process time-varying signals, making them particularly suited for sequential tasks. However, the conventional application of LNNs has been predominantly constrained to such tasks, limiting their broader utility. Recent advancements have begun exploring their potential beyond these confines, indicating a promising future for LNNs in a wider array of applications. For instance, LNNs have been effectively applied in real-time speech recognition, sensor data analysis, and sequential decision-making processes, demonstrating their versatility in handling time-dependent data [4,5,6]. The extension of Liquid Neural Networks (LNNs) to non-sequential tasks represents a major advancement, as it expands the applicability of LNNs beyond their traditional domain of sequential data processing. This is particularly important because many real-world problems, such as image classification and other spatial data tasks, are inherently non-sequential. Traditional methods for handling non-sequential tasks, such as feedforward neural networks and convolutional neural networks, are typically static in their architecture and lack the dynamic adaptability that LNNs offer. By leveraging the continuous-time dynamics inherent in LNNs, our approach can dynamically adjust to varying complexities within the data, providing a more flexible and robust solution.
LNNs are designed to handle time-varying signals and sequential data by employing neurons that evolve according to differential equations. This dynamic adaptability allows LNNs to capture and model complex temporal patterns effectively.
Pros:
  • Dynamic Adaptability: LNNs can adjust their internal states continuously, making them highly effective for tasks involving time series data.
  • Robustness: The differential equation-based framework provides stability and robustness against perturbations in the data.
  • Parameter Efficiency: LNNs often require fewer parameters compared to traditional RNNs, leading to more efficient models.
Cons:
  • Complexity in Training: The continuous nature of LNNs can make the training process more complex and computationally intensive.
  • Limited Application to Non-Sequential Tasks: Traditionally, LNNs have been applied primarily to sequential tasks, limiting their broader utility.
This paper introduces a novel augmentation to the traditional Liquid Neural Network framework, extending its applicability to encompass non-sequential tasks. Integrating the Runge-Kutta DOPRI 5 method [7] empowers the LNN to address non-sequential tasks effectively by enhancing its stability and predictability through improved numerical handling of the underlying dynamics [8]. By reformulating the system model from time-dependent to time-independent, assuming it has reached equilibrium (no change, dx/dt = 0), we enable LNNs to tackle new challenges.
The intersection of neural networks and differential equations presents fertile ground for innovation, especially in modelling dynamic systems. Traditional LNNs emphasizing sequential data processing have successfully captured temporal dynamics. However, their utility in non-sequential tasks remains underexplored, mainly due to their inherent design favouring time-varying inputs. This paper seeks to bridge this gap by building a generalized framework of LNN architecture to operate effectively in scenarios where temporal sequences do not govern the input–output relationship.
The primary contributions of this study are threefold. First, we propose a Generalized Liquid Neural Network architecture framework that breaks the traditional boundary of sequential task processing. Second, we detailed the integration of the Runge-Kutta DOPRI 5 method into the LNN, facilitating its operation under equilibrium conditions. Lastly, we validate our approach through three diverse case studies:
Case-1: Prediction of Damped Sinusoidal Trajectories: This case explores the proposed framework’s capability to predict the behaviour of damped sinusoidal systems, a common phenomenon in oscillatory systems subject to friction or resistance. This generalized framework can accurately model and predict these trajectories and showcase its potential in physics and engineering applications.
Case-2: Non-linear RLC Circuit Output Estimation: In our second case, the GLNN is applied to predict the output of a non-linear Resistor-Inductor-Capacitor (RLC) circuit. RLC circuits are fundamental components in electrical engineering, and their non-linear variants pose significant challenges for traditional prediction methods. In this case, the performance underscores its applicability in complex electronic systems.
Case-3: Retinal Disease Classification: The third case extends the GLNN’s application to the biomedical field, particularly in classifying retinal diseases from OCT images. Using a Neural Circuit Policy within the framework, this case demonstrates the network’s potential in healthcare, offering a novel approach to diagnosing and understanding various retinal conditions.
Through these case studies, we demonstrated the versatility and efficacy of the proposed framework in addressing a wide range of problems, from engineering and physics to biomedical applications.
The remainder of this paper is organized as follows: Section 2 and Section 3 review related studies and background, highlighting previous approaches and identifying the gaps our research aims to fill. Section 4 details the methodology, including the system model design for the GLNN framework. Section 5 presents three case studies, outlining the application of GLNN in each scenario and discussing the results. Finally, Section 6 concludes the paper with our findings and suggestions for future research directions.

2. Related Work

In an extensive exploration of the literature surrounding Neural ODEs, it is essential to trace the trajectory of this innovative approach from its inception to its current applications and potential future directions. The foundational concept of Neural ODEs was pioneered by Chen et al. [9] in their groundbreaking work, which introduced the idea of modelling the depth of neural networks as a continuous variable. This novel perspective allowed for the conceptualization of infinitely deep networks, providing a more natural and flexible approach to capturing the dynamics of complex systems. Building upon this, Dupont et al. [10] explored the potential of Neural ODEs more deeply, emphasizing their adaptability and efficiency, particularly in scenarios requiring fine-grained modelling of temporal dynamics, setting the stage for a new paradigm in deep learning.
The initial application of Neural ODEs was predominantly in sequential data, where their ability to model continuous dynamics proved highly beneficial. Rubanova et al. [11] extended the Neural ODE framework to Recurrent Neural Networks (RNNs), introducing ODE-RNNs. This innovative integration showcased the superior performance of Neural ODEs in handling irregularly sampled time series data, a common challenge in fields such as finance and healthcare. The adaptability of Neural ODEs to various temporal resolutions was further highlighted, broadening their applicability and showcasing their potential beyond traditional time-dependent models.
Exploring Neural ODEs in non-sequential domains marked a significant expansion in their applicability. Massaroli et al. [12] provided critical insights into the theoretical underpinnings that make Neural ODEs suitable for a broader range of applications, including static data analysis and image classification. This opened new avenues for the continuous-depth nature of Neural ODEs in scenarios not constrained by temporal dynamics, thus challenging the conventional boundaries of their application.
Integrating Neural ODEs with existing neural network architectures, such as Convolutional Neural Networks (CNNs), has been another area of significant advancement. Haber and Ruthotto [13] explored the parallels between ResNets and discretized ODEs, laying the groundwork for the subsequent fusion of Neural ODEs with more complex architectures. This integration has led to notable improvements in tasks such as image processing, demonstrating the versatility and potential of Neural ODEs in enhancing existing deep-learning models.
Neural ODEs have shown exceptional promise in modelling complex dynamical systems. Their capability to accurately represent the dynamics of systems in fields ranging from climate science to neuroscience has been a game changer. For instance, Poli et al. [14] demonstrated how Neural ODEs could be adapted to graph-structured data, expanding their utility to encompass a wide array of complex systems. This adaptation has paved the way for innovative applications such as social network analysis and biological systems modelling.
Despite the significant advancements facilitated by Neural ODEs, challenges remain, particularly in computational efficiency and handling stiff equations. Gholami et al. [15] addressed some of these challenges, proposing solutions to enhance the practicality of Neural ODEs for real-world applications. Exploring methods to improve the computational tractability of Neural ODEs is an ongoing area of research with significant implications for their broader adoption and implementation.
As the landscape of Neural ODE research continues to evolve, it is clear that the boundaries of what can be modelled and understood through this framework are expanding. Neural ODEs are paving the way for discoveries and innovations across various scientific and engineering domains with each new application and integration. The versatility and adaptability of Neural ODEs, coupled with ongoing advancements in computational efficiency and model integration, promise to sustain their position at the forefront of deep learning research.
Building upon the foundational work on Neural ODEs, we investigate the precedents of earlier research in continuous-time neural networks, the adaptability and efficiency of computational mechanisms in deep learning architectures, and novel methodologies for learning from differential equation models. The lineage of continuous-time neural network training through the adjoint method, as initially suggested by LeCun et al. [16] and further elaborated by Pearlmutter [17], laid the groundwork for understanding the dynamics of such networks, though without extensive practical demonstrations.
The innovative reinterpretation of Residual Networks (ResNets) by He et al. [18] as functional analogues to ordinary differential equation (ODE) solvers marked a significant shift in understanding deep neural architectures. They posited that the layer-by-layer progression in ResNets could be viewed through the lens of a discretized ODE solver, where each layer approximates a small step in solving a continuous dynamical system. This conceptual framework opened avenues for exploring the inherent properties of reversibility and the capacity for approximation within these network structures.
Building upon this foundational insight, subsequent research by Chang et al. [19] and Lu et al. [20] examined the practical implications of these theoretical properties within the domain of ResNets. Chang et al. [19] explored the potential of enhancing computational efficiency and model interpretability through the reversible nature of certain neural network architectures, a concept closely aligned with the reversible nature of many ODE solvers. On the other hand, Lu et al. [20] investigated the integration of ODE solvers within deep learning frameworks, proposing models that closely mimic the continuous dynamics of ODEs, thus enabling a more natural representation of continuous data processes.
In the scope of adaptive computation, previous endeavours, as noted by Graves [21], Jernite et al. [22], and Figurnov et al. [23], involved training auxiliary neural networks to dictate the computational depth of recurrent or residual networks. In contrast, the utilization of ODE solvers circumvents the need for such auxiliary networks by providing a set of established, computationally efficient rules for dynamically adjusting computation, thereby offering a more streamlined approach.
As Baydin et al. [18] highlighted, this integration of black-box ODE solvers with automatic differentiation signifies a pivotal advancement in the field, enabling the end-to-end training of models that encapsulate complex dynamical systems within their architecture. Incorporating ODE solvers seamlessly alongside other model components facilitates a more holistic and nuanced understanding of the data-generating processes, enhancing the model’s predictive capabilities and interpretability.
Furthermore, the application of Neural ODEs extends beyond mere data fitting; it opens new horizons in exploring data with inherent temporal dynamics or continuous-time processes. From ecological modelling to financial time series prediction, the ability to directly model the continuous-time evolution of systems presents a significant advantage. It allows for a more natural representation of phenomena, leading to more accurate models that align with the underlying physical principles.
In conclusion, integrating Neural ODEs into deep learning architectures represents mathematical elegance and practical utility. Drawing on the strengths of differential equations and combining them with the flexibility of neural networks offers a powerful paradigm for modelling complex systems.

3. Neural ODE-Based Model’s Background

The modern integration of technology into various sectors of society necessitates processing vast amounts of data efficiently. This need has led to the development of artificial intelligence (AI), particularly in machine learning. Neural networks, a prominent framework in machine learning, consist of layers for receiving data, processing it through hidden layers, and producing output. Deep learning, a subset of machine learning, involves neural networks with more than three hidden layers.
One notable type of neural network is the Residual Network (ResNet), which features skip connections that preserve input data from previous layers. This method aids in learning processes. The core idea behind machine learning with neural networks is to replicate and recognize given inputs. This research investigates ordinary differential equations (ODEs) representing input data for neural networks.
A deterministic dynamical system involves a changing “state” over time governed by rules described by differential equations (DEs). Numerical methods have been developed to solve DEs with high accuracy, paving the way for integrating neural networks with numerical analysis.
Transitioning from a ResNet to an ODE net involves incorporating time as a parameter within the network’s architecture. The trajectory of an ODE net is defined by a local initial condition and dynamics shared across the entire time series. Solving ODEs for an ODE net involves ODE solvers, which compute the network’s output as a function of time.
Training ODE nets requires continuous time backpropagation, solving the initial value problem backwards using the adjoint sensitivity method. This method computes gradients by solving a second ODE backwards in time, facilitating optimization of the network’s weights and dynamics.
The research extends to the augmented state, which considers the gradients for the parameters of the network. This thorough investigation provides insights into integrating neural networks with ODEs, paving the way for efficient and accurate learning in non-sequential tasks.

3.1. Dynamical System

Dynamical systems form the backbone of many scientific models, describing how a system changes over time. In the context of this research, a dynamical system $f^t$ comprises a set of mappings $M \to M$, parameterized by either discrete or continuous time $t$. This system operates within a topological and differentiable structure, with $f^t$ being continuous and differentiable.

3.1.1. Differential Equations

Differential equations (DEs) are fundamental to describing how these dynamical systems evolve over time. A DE is essentially a condition involving functions, describing the rate of change of a variable with respect to its independent variables. In its simplest form, a differential equation can be represented as $\frac{dy}{dt} = f(t, y)$. Here, $f$ is a given function of two variables. A solution to this equation, $y = \varphi(t)$, is a function that satisfies the equation for all $t$ in a given interval. Depending on the number of independent variables, DEs can be categorized into ODEs and PDEs.

3.1.2. Numerical Methods

Numerical methods play a crucial role in solving DEs, especially when analytical solutions are not feasible. These methods provide approximate solutions with high accuracy. Techniques from the Runge-Kutta family, including the classical fourth-order Runge-Kutta method (RK4) and the Dormand-Prince method (DOPRI 5) [7] used in this research, are widely applied for solving ODEs. These methods break the problem down into smaller steps, allowing the solution to be computed over discrete intervals.
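As a brief illustration of how such a solver is used in practice, the sketch below integrates a simple damped oscillator with SciPy's solve_ivp routine, whose default "RK45" method implements the Dormand-Prince 5(4) embedded pair discussed above; the example system and the tolerances are illustrative assumptions, not models used in this work.

# Minimal sketch: integrating an ODE with an adaptive Dormand-Prince solver.
# The damped oscillator below is an illustrative example, not the paper's system.
import numpy as np
from scipy.integrate import solve_ivp

def damped_oscillator(t, y):
    # y = [position, velocity]; right-hand side of a lightly damped oscillator
    return [y[1], -0.1 * y[1] - y[0]]

sol = solve_ivp(damped_oscillator, t_span=(0.0, 25.0), y0=[2.0, 0.0],
                method="RK45",          # Dormand-Prince 5(4) embedded pair
                rtol=1e-6, atol=1e-9)   # adaptive step-size error control

print(sol.t.shape, sol.y.shape)         # accepted time points and the solution trajectory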

3.2. Backpropagation in Neural ODEs

Backpropagation is a crucial algorithm in training neural networks, enabling them to learn from data by adjusting their weights and biases. In the context of Neural ODEs, backpropagation operates similarly to traditional neural networks but with an added twist. It involves computing gradients with respect to the network’s parameters, which in this case include the ODE parameters and initial conditions.

3.2.1. Continuous Time Backpropagation in Neural ODEs

Continuous time backpropagation in Neural ODEs extends the concept of training these networks by incorporating the dynamics of the ODEs. When training Neural ODEs, the goal is to optimize the ODE parameters and initial conditions to minimize a loss function. This process solves the initial value problem backwards in time, ensuring that the network’s trajectory aligns with the desired output.

3.2.2. Solution for Neural ODEs

The solution for Neural ODEs involves using ODE solvers to compute the network’s output as a function of time. By treating the ODE net as a continuous system, ODE solvers determine the evolution of the network’s hidden states over time. The neural network inside an ODE net outputs not the solution itself but its derivative; the solver integrates this derivative to obtain the trajectory, which is then used to compute the loss function. This loss function measures the discrepancy between the predicted and actual outputs, guiding the optimization process during training.
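To make this training procedure concrete, the following sketch outlines a minimal neural ODE training step using the third-party torchdiffeq package (the reference implementation accompanying Chen et al. [9]), with gradients obtained through the adjoint method described above. The network size, the placeholder data, and the optimiser settings are illustrative assumptions, not the configuration used later in this paper.

# Sketch of neural-ODE training with adjoint backpropagation (torchdiffeq assumed installed).
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint   # gradients via the adjoint ODE

class ODEFunc(nn.Module):
    """Parameterises dz/dt = f(z, t; theta)."""
    def __init__(self, dim=2, hidden=50):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, z):
        return self.net(z)

func = ODEFunc()
optimizer = torch.optim.RMSprop(func.parameters(), lr=1e-3)
t = torch.linspace(0.0, 2.5, 10)                # time points of one training sequence (assumed)
z0 = torch.tensor([[2.0, 0.0]])                 # initial state (batch of 1)
target = torch.randn(10, 1, 2)                  # placeholder trajectory; use real data in practice

pred = odeint(func, z0, t, method="dopri5")     # forward pass = ODE solve
loss = torch.mean(torch.abs(pred - target))
loss.backward()                                 # backward pass solves the adjoint ODE
optimizer.step()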

4. Model Design of Generalized Liquid Neural Network

In transitioning our focus from time-dependent dynamics to time-independent scenarios within system modelling, we explored adapting continuous-time frameworks to accommodate static or equilibrium states. A Continuous Time-Recurrent Neural Network (CT-RNN) is a type of recurrent neural network that models the dynamics of neural processing with continuous rather than discrete time steps. CT-RNNs can handle tasks involving complex temporal patterns and time-sensitive information more effectively.
We have, for continuous time-recurrent neural network (CT-RNN) [24]:
$\frac{dx(t)}{dt} = -\frac{x(t)}{\tau} + f(x(t), I(t), t, \theta)$  (1)
Here, the leakage term $-\frac{x(t)}{\tau}$ drives the system towards equilibrium with time constant $\tau$. In the equation, $x(t)$ is the dependent variable (usually representing some quantity that changes with time), $t$ is the independent variable (time), and $\tau$ is a positive constant known as the time constant. The function $f(x(t), I(t), t, \theta)$ represents the external input or forcing function, and $\theta$ represents any additional parameters. Solving this differential equation involves finding the function $x(t)$ that satisfies the equation, given an initial condition. The solution describes the system’s evolution based on the given dynamics and initial conditions.
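For illustration, the short sketch below integrates the CT-RNN dynamics of Equation (1) with a simple explicit Euler scheme; the choice of f as a tanh of a weighted sum, and all parameter values, are illustrative assumptions rather than settings used in this work.

# Sketch: explicit Euler integration of the CT-RNN dynamics in Equation (1),
# dx/dt = -x/tau + f(x, I; theta), with f chosen as a tanh of a weighted sum (an assumption).
import numpy as np

rng = np.random.default_rng(0)
n, tau, dt = 4, 1.0, 0.01                  # neurons, time constant, step size (illustrative)
W = rng.normal(scale=0.5, size=(n, n))     # recurrent weights (part of theta)
U = rng.normal(scale=0.5, size=(n, 1))     # input weights (part of theta)

def f(x, I):
    return np.tanh(W @ x + U @ I)          # illustrative choice of f

x = np.zeros((n, 1))
for step in range(1000):
    I = np.array([[np.sin(0.01 * step)]])  # time-varying external input I(t)
    dxdt = -x / tau + f(x, I)              # Equation (1)
    x = x + dt * dxdt                      # Euler update

print(x.ravel())                           # state after integration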
On the other hand, the dynamic of non-spiking neural potential [25] can be written as:
$\frac{dx(t)}{dt} = -g_l x(t) + s(t)$  (2)
where $g_l$ is the leakage conductance and $s(t)$ is the sum of all synaptic inputs to the node. Accounting for the synaptic non-linearity, $s(t)$ can be written as:
$s(t) = f(x(t), I(t))\,(A - x(t))$  (3)
where $A$ is the synaptic potential between nodes. Substituting (3) into (2) yields:
$\frac{dx(t)}{dt} = -g_l x(t) + f(x(t), I(t))\,(A - x(t))$  (4)
Comparing (4) with (1) and introducing the model parameter $\theta$, we get:
$\frac{dx(t)}{dt} = -\frac{x(t)}{\tau} + f(x(t), I(t), \theta)\,(A - x(t))$  (5)

4.1. Derivation of the Steady-State Conditions

Suppose we want to adapt the described model in (5) for time-independent or non-sequential data. In that case, it is essential to recognize that the equations we have derived are differential equations that describe the dynamics of a system over time. These equations inherently capture sequential or temporal aspects. Hence, it might be more appropriate to consider a static or steady-state version of the model.
In this case, we can set the left-hand side of our differential equation to zero, indicating that the system has reached equilibrium. The equation then becomes a static equation representing the steady-state conditions:
$0 = -\frac{x}{\tau} + f(x, I, \theta)\,(A - x)$  (6)
Since $x(t)$ no longer changes at the steady state, it effectively becomes a constant $x$. Hence, in the equations, $x(t)$ is replaced by $x$ and $I(t)$ by $I$ to denote these constant values. Equation (6) can be solved for $x$ to find the steady-state solution in terms of the input parameters ($I$, $A$, and $\theta$) without considering the time-dependent dynamics. To solve for $x$ at steady state, we set $\frac{dx(t)}{dt} = 0$. The solution then represents the system’s equilibrium state under the given input conditions.
It is important to note that this approach assumes that the system has reached a stable, time-independent state. If our data are non-sequential and we are interested in steady-state behaviour, solving the steady-state equation provides insights into the equilibrium conditions of the system.
To solve the steady-state equation further, we can follow these steps. In our case, the steady-state equation is given by (6) as follows:
We decompose $f(x(t), I(t), \theta)\,(A - x(t))$ as:
$f(x(t), I(t), \theta)\,(A - x(t)) = -g_l x(t) + h(x(t), I(t), A)$  (7)
where $-g_l x(t)$ is a leakage term, as found in many biological or physical systems; the negative sign indicates that this term reduces $x(t)$ over time, pulling it back towards zero. The term $h$ is a function that accounts for the combined effects of the system’s state $x(t)$, the input $I(t)$, and the threshold $A$.
$\frac{dx(t)}{dt} = -\frac{x(t)}{\tau} - g_l x(t) + h(x(t), I(t), A)$  (8)
Under steady-state conditions, we have $\frac{dx(t)}{dt} = 0$; therefore,
$0 = -\frac{x}{\tau} - g_l x + h(x, I, A)$  (9)
Now, rearranging (9) results in the following expression for $x$:
$\frac{x}{\tau} + g_l x = h(x, I, A)$  (10)
$x\left(\frac{1}{\tau} + g_l\right) = h(x, I, A)$  (11)
$x = \frac{h(x, I, A)}{\frac{1}{\tau} + g_l}$  (12)
Here, $h(x, I, A)$ is the part of $f(x(t), I(t), \theta)\,(A - x(t))$ that remains after separating out the leakage term $-g_l x(t)$. This provides an implicit solution for the steady-state $x$ in terms of the input parameters $I$, $A$, $\tau$, and $g_l$. The above equation shows that the implicit solution is still a function of $x$.
To resolve this, we propose an explicit solution, assuming that the system has already reached equilibrium. Proceeding further from (9), a common approach is to approximate $h$ around an operating point (typically the expected steady state) using a Taylor series expansion, truncated to the linear terms for simplicity:
$h(x, I, A) \approx h(x_0, I_0, A_0) + \left.\frac{\partial h}{\partial x}\right|_{x_0, I_0, A_0}(x - x_0) + \left.\frac{\partial h}{\partial I}\right|_{x_0, I_0, A_0}(I - I_0) + \left.\frac{\partial h}{\partial A}\right|_{x_0, I_0, A_0}(A - A_0)$  (13)
In this expression, $(x_0, I_0, A_0)$ are the operating points around which the function $h$ is linearized: the values of the state variable ($x$), input ($I$), and parameter ($A$) at the equilibrium or steady-state point.
Assuming $h(x_0, I_0, A_0) = 0$ (no contribution at the equilibrium point) and simplifying the notation with constants for the partial derivatives:
$c_1 = \left.\frac{\partial h}{\partial x}\right|_{x_0, I_0, A_0}, \quad c_2 = \left.\frac{\partial h}{\partial I}\right|_{x_0, I_0, A_0}, \quad c_3 = \left.\frac{\partial h}{\partial A}\right|_{x_0, I_0, A_0}$  (14)
Substituting the linearization back into the steady-state equation and neglecting the constant term for simplicity (it vanishes by the definition of the equilibrium), we obtain:
$0 = \left(-\frac{1}{\tau} - g_l + c_1\right)x + c_2 I + c_3 A$  (15)
$x = \frac{c_2 I + c_3 A}{\frac{1}{\tau} + g_l - c_1}$  (16)
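The following sketch illustrates both routes to the steady state numerically: the implicit form (12) solved by root finding, and the linearized closed form (16). The specific choice of h (a tanh of a weighted sum) and all numerical values are illustrative assumptions.

# Sketch: steady state of Equation (9), 0 = -x/tau - g_l*x + h(x, I, A),
# solved (a) implicitly by root finding and (b) via the linearised form (16).
# The choice h = tanh(wx*x + wI*I + wA*A) and all numbers are illustrative assumptions.
import numpy as np
from scipy.optimize import fsolve

tau, gl = 1.0, 0.5
wx, wI, wA = 0.3, 1.2, 0.8
I, A = 0.4, 0.2

def h(x, I, A):
    return np.tanh(wx * x + wI * I + wA * A)

# (a) implicit steady state: find the root of the full steady-state equation
x_star = fsolve(lambda x: -x / tau - gl * x + h(x, I, A), x0=0.0)[0]

# (b) linearisation around (x0, I0, A0) = (0, 0, 0); since sech^2(0) = 1, c_i = w_i here
c1, c2, c3 = wx, wI, wA
x_lin = (c2 * I + c3 * A) / (1.0 / tau + gl - c1)

print(f"root-finding: {x_star:.4f}, linearised: {x_lin:.4f}")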

4.2. Derivation of Dynamic-State Condition

For the system dynamics, we are concerned with how $x(t)$ evolves over time given some initial conditions and changes in inputs or parameters. In this case, the DOPRI 5 method is a common yet powerful option: it provides error control through adaptive step sizing and handles a broad range of problems efficiently.
Let us consider an example with two neurons, as depicted in Figure 1. In this model, the leakage constant $g_l$, common to both neurons, represents the passive decay of membrane potential. The $f_{ij}$ functions define the various interactions: $f_{01}$ and $f_{02}$ describe external inputs to each neuron, $f_{11}$ and $f_{22}$ model self-regulation within each neuron, and $f_{21}$ and $f_{12}$ capture synaptic interactions between the neurons. The external input function $I(t)$, which may vary over time, simulates changing external conditions. The parameters $A_{ij}$ associated with each $f_{ij}$ function represent synaptic strengths or thresholds, which are crucial for determining each neuron’s sensitivity to its inputs.
$\frac{dx_1(t)}{dt} = -g_l x_1(t) + S_{01}(t) + S_{21}(t) + S_{11}(t)$  (17)
$\frac{dx_2(t)}{dt} = -g_l x_2(t) + S_{02}(t) + S_{12}(t) + S_{22}(t)$  (18)
where
$S_{01}(t) = f_{01}(I(t))\,(A_{01} - x_1(t))$  (i)
$S_{21}(t) = f_{21}(x_2(t))\,(A_{21} - x_1(t))$  (ii)
$S_{11}(t) = f_{11}(x_1(t))\,(A_{11} - x_1(t))$  (iii)
$S_{02}(t) = f_{02}(I(t))\,(A_{02} - x_2(t))$  (iv)
$S_{12}(t) = f_{12}(x_1(t))\,(A_{12} - x_2(t))$  (v)
$S_{22}(t) = f_{22}(x_2(t))\,(A_{22} - x_2(t))$  (vi)
Therefore, the general expression for $S_{ij}(t)$ is as follows:
$S_{ij}(t) = f_{ij}(u_j(t))\,(A_{ij} - x_i(t))$
where $u_j(t)$ is given by:
$u_j(t) = \begin{cases} I(t), & \text{if } j = 0 \\ x_j(t), & \text{if } j \geq 1 \end{cases}$
Substituting the corresponding expressions (i) to (vi) into (17) and (18), we get:
$\frac{dx_1(t)}{dt} = -g_l x_1(t) + f_{01}(I(t))\,(A_{01} - x_1(t)) + f_{21}(x_2(t))\,(A_{21} - x_1(t)) + f_{11}(x_1(t))\,(A_{11} - x_1(t))$
$\frac{dx_2(t)}{dt} = -g_l x_2(t) + f_{02}(I(t))\,(A_{02} - x_2(t)) + f_{12}(x_1(t))\,(A_{12} - x_2(t)) + f_{22}(x_2(t))\,(A_{22} - x_2(t))$
The general form for the dynamic x(t) in the presence of interaction and input is as follows:
$\frac{dx_i(t)}{dt} = -g_l x_i(t) + \sum_{j=0}^{n} f_{ij}(u_j(t))\,(A_{ij} - x_i(t))$
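As a concrete sketch of this general form, the function below evaluates the right-hand side of the two-neuron system of Figure 1 so that it can be passed directly to an ODE solver; the sigmoidal choice of the $f_{ij}$ functions and all parameter values are illustrative assumptions.

# Sketch: right-hand side of the two-neuron dynamics, dx_i/dt = -g_l*x_i + sum_j f_ij(u_j)*(A_ij - x_i).
# The sigmoidal synaptic activation and all parameter values are illustrative assumptions.
import numpy as np

gl = 0.3
W = np.array([[0.8, 0.5, 1.0],      # weights inside f_ij; row i = target neuron, column j = source (j = 0 is I)
              [0.6, 0.9, 0.4]])
A = np.array([[1.0, -0.5, 0.8],     # synaptic potentials / thresholds A_ij
              [0.7,  0.9, -0.3]])

def I_ext(t):
    return np.sin(t)                # external input I(t)

def f_syn(w, u):
    return 1.0 / (1.0 + np.exp(-w * u))   # illustrative sigmoidal synapse

def rhs(t, x):
    u = np.array([I_ext(t), x[0], x[1]])  # u_0 = I(t), u_j = x_j for j >= 1
    dx = np.empty(2)
    for i in range(2):
        syn = sum(f_syn(W[i, j], u[j]) * (A[i, j] - x[i]) for j in range(3))
        dx[i] = -gl * x[i] + syn
    return dx

# Example: integrate with SciPy's Dormand-Prince solver
from scipy.integrate import solve_ivp
sol = solve_ivp(rhs, (0.0, 20.0), [0.1, -0.1], method="RK45")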
In this research, we have used the Dormand-Prince method to solve these equations numerically. The Dormand–Prince (DOPRI) method is a numerical method for solving ODEs and belongs to the Runge–Kutta family. It computes fourth- and fifth-order accurate solutions using six function evaluations per step. The difference between these solutions provides a convenient error estimate, making it suitable for adaptive step size integration algorithms. The method employs seven stages but benefits from the “First Same as Last” (FSAL) property, evaluating the last stage at the same point as the first stage of the next step. Dormand and Prince designed the coefficients to minimize the error of the fifth-order solution. Algorithm 1 explains the pseudocode to find an explicit x solution using the DOPRI method.
Algorithm 1: Steps involved in finding the explicit solution of x using the DOPRI method.
Objective: Numerically solve the differential equations using the DOPRI adaptive step-size control method.
Input: Initial condition $x(t_0) = x_0$, end time $t_{end}$, initial step size $h$, and tolerance tol.
Output: Approximation of $x(t)$ over the interval $[t_0, t_{end}]$.
1: Initialize the solution $x$ at the starting time $t_0$: set $x_n = x_0$ and $t = t_0$.
2: Set the initial step size $h$ and tolerance tol.
3: While $t < t_{end}$, do:
  Use the Dormand-Prince coefficients (from the Butcher tableau) to compute the stages $k_1, \dots, k_6$:
  $k_1 = h \, f(x_n, I_n, A_n)$
  $k_2 = h \, f\!\left(x_n + \tfrac{1}{5}k_1, I_n, A_n\right)$, evaluated at $t + \tfrac{1}{5}h$
  $k_3 = h \, f\!\left(x_n + \tfrac{3}{40}k_1 + \tfrac{9}{40}k_2, I_n, A_n\right)$, evaluated at $t + \tfrac{3}{10}h$
  $k_4 = h \, f\!\left(x_n + \tfrac{44}{45}k_1 - \tfrac{56}{15}k_2 + \tfrac{32}{9}k_3, I_n, A_n\right)$, evaluated at $t + \tfrac{4}{5}h$
  $k_5 = h \, f\!\left(x_n + \tfrac{19372}{6561}k_1 - \tfrac{25360}{2187}k_2 + \tfrac{64448}{6561}k_3 - \tfrac{212}{729}k_4, I_n, A_n\right)$, evaluated at $t + \tfrac{8}{9}h$
  $k_6 = h \, f\!\left(x_n + \tfrac{9017}{3168}k_1 - \tfrac{355}{33}k_2 + \tfrac{46732}{5247}k_3 + \tfrac{49}{176}k_4 - \tfrac{5103}{18656}k_5, I_n, A_n\right)$, evaluated at $t + h$
4: Compute the fifth-order solution $x_5$ and the fourth-order solution $x_4$:
  $x_5 = x_n + \tfrac{35}{384}k_1 + \tfrac{500}{1113}k_3 + \tfrac{125}{192}k_4 - \tfrac{2187}{6784}k_5 + \tfrac{11}{84}k_6$
  $x_4 = x_n + \tfrac{5179}{57600}k_1 + \tfrac{7571}{16695}k_3 + \tfrac{393}{640}k_4 - \tfrac{92097}{339200}k_5 + \tfrac{187}{2100}k_6$
5: Compute $error = \lVert x_5 - x_4 \rVert$.
6: If $error \le$ tolerance, then accept the step and update $x_n \leftarrow x_5$; else, reject the step and repeat from Step 3 with a reduced step size.
7: Adjust the step size $h$ based on the error and tolerance.
8: Update $t \leftarrow t + h$. Repeat from Step 3 until $t \ge t_{end}$.
End while
Further, we designed a unified approach that analyses a system’s dynamic behaviour and steady state with a single solution method, particularly when working with numerical solvers for differential equations. In practice, this involves using the numerical solver to handle the dynamic simulation and then analysing the resulting data to infer the steady state. In an equilibrium state, $f(x, I, A) = 0$, so the stage values $k_2$ to $k_6$ (and $k_1$ itself) vanish. From this, we get the following equations:
$x_5 = x_n + \tfrac{35}{384}k_1$
$x_4 = x_n + \tfrac{5179}{57600}k_1$
With $k_1 = 0$, this implies $error = \lVert x_5 - x_4 \rVert = 0$. In this case, the Dormand-Prince method simplifies to one in which the solution $x$ remains constant ($\frac{dx(t)}{dt} = 0$) during each time step, and the method adapts the step size to control the accuracy of this constant solution.
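For completeness, the sketch below implements one adaptive Dormand-Prince step following Algorithm 1, using the standard Butcher tableau coefficients and a common safety-factor rule for the step size; it is a didactic re-implementation under these assumptions, not the exact code used in this work.

# Sketch: one adaptive Dormand-Prince 5(4) step, following Algorithm 1.
# Coefficients are the standard DOPRI tableau; the step-size update is a common heuristic.
import numpy as np

A_TAB = [
    [],
    [1/5],
    [3/40, 9/40],
    [44/45, -56/15, 32/9],
    [19372/6561, -25360/2187, 64448/6561, -212/729],
    [9017/3168, -355/33, 46732/5247, 49/176, -5103/18656],
]
C_TAB = [0, 1/5, 3/10, 4/5, 8/9, 1]                                       # stage time nodes
B5 = [35/384, 0, 500/1113, 125/192, -2187/6784, 11/84]                    # 5th-order weights
B4 = [5179/57600, 0, 7571/16695, 393/640, -92097/339200, 187/2100, 1/40]  # 4th-order weights (incl. FSAL stage)

def dopri_step(f, t, x, h, tol):
    """Attempt one step; return (t, x, h) after acceptance or rejection."""
    k = []
    for i in range(6):
        xi = x + h * sum(a * kj for a, kj in zip(A_TAB[i], k))
        k.append(f(t + C_TAB[i] * h, xi))
    x5 = x + h * sum(b * kj for b, kj in zip(B5, k))
    k.append(f(t + h, x5))                 # FSAL stage k7 (reused as k1 of the next step in full implementations)
    x4 = x + h * sum(b * kj for b, kj in zip(B4, k))
    err = np.max(np.abs(x5 - x4))          # embedded error estimate
    if err <= tol:                         # accept: advance the solution
        t, x = t + h, x5
    # shrink or grow h based on the error estimate (safety factor 0.9, order-5 exponent)
    h = h * min(2.0, max(0.2, 0.9 * (tol / max(err, 1e-16)) ** 0.2))
    return t, x, h

# Example: dx/dt = -x decays to equilibrium, where all stages tend to zero.
f = lambda t, x: -x
t, x, h = 0.0, np.array([1.0]), 0.1
while t < 5.0 - 1e-12:
    t, x, h = dopri_step(f, t, x, min(h, 5.0 - t), tol=1e-6)
print(t, x)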

4.3. Integrating Unified Solver Outputs with Neural Circuit Policies for Generalized Liquid Neural Network Framework

Integrating the output from a unified solver with Neural Circuit Policies (NCPs) offers a sophisticated approach to enhancing the capabilities of the GLNN framework, as shown in Figure 2. This integration merges the data generated by unified solvers with the structured, biologically inspired architecture of NCPs [26], whose design mimics the hierarchical neural organization observed in the C. elegans nematode [25] through a distinctive four-tier hierarchy. Within this organism, sensory neurons initially collect environmental cues and pass these data through interneurons and command neurons, which in turn dictate actions to motor neurons and, ultimately, muscle movements.
By channelling the solver outputs into the sensory neurons of an NCP, the framework can process these inputs through interneurons and command neurons, culminating in precise action determinations by motor neurons. This method enables the neural network to adapt fluidly to changes in dynamic environments, enhancing its decision-making and control capabilities. The seamless fusion of ODE dynamics with the hierarchical processing power of NCPs within a Liquid Neural Network extends the application range of these networks. It also boosts their efficiency in real-time adaptive systems, offering promising possibilities for robotics and autonomous system design advancements.
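A minimal sketch of such an integration is given below, assuming the open-source ncps package that accompanies the NCP work of Lechner et al. [26]; the layer sizes, the wiring, and the input shapes are illustrative assumptions.

# Sketch: feeding solver/feature outputs into an NCP-wired liquid network.
# Assumes the open-source `ncps` package; layer sizes and shapes are illustrative.
import torch
from ncps.wirings import AutoNCP
from ncps.torch import LTC

n_features, n_outputs = 32, 4
wiring = AutoNCP(19, n_outputs)            # 19 hidden neurons wired into inter-, command-, and motor-neuron layers
rnn = LTC(n_features, wiring, batch_first=True)

x = torch.randn(8, 32, n_features)         # batch of 8 sequences of solver/feature outputs
out, _ = rnn(x)                            # hierarchical NCP processing of each sequence
decisions = out[:, -1, :]                  # last-step motor-neuron outputs as action/class scores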

5. Applications of Generalized Liquid Neural Network Framework

5.1. Case 1: Predicting Damped Sinusoidal Trajectories

Various methods have been used in the literature for sinusoidal trajectory prediction and time-frequency analysis [27,28,29]. This case study explores the application of the GLNN framework, an evolution of traditional Liquid Neural Networks [30] integrated with the adjoint sensitivity algorithm [31], to predicting damped sinusoidal trajectories. This combination enables efficient gradient computation through the network’s parameters and optimizes the network’s response to variations in trajectory data, ensuring accurate predictions. The adjoint sensitivity method further augments this process by facilitating scalable and computationally efficient backpropagation, which is essential for continuous learning and adaptation in dynamic environments. This framework thus significantly enhances predictive accuracy and system adaptability in tasks that require analysing and predicting damped sinusoidal movements.
This study examines a system characterized by linear ordinary differential equations (ODEs) specified through a given Jacobian matrix and initial conditions. Throughout the training process, the GLNN consistently exhibited a capability to minimize loss and improve predictive accuracy, effectively emulating the system’s actual behaviour as demonstrated by phase plane plots. This performance highlights the framework’s potential for managing complex, time-dependent systems in physics and engineering applications, underscoring its viability and effectiveness in practical scenarios.
Experimental Setup and Observations: In this section, we explore the dynamics of a system described by a pair of linear ordinary differential equations (ODEs), formulated explicitly as per the following equations:
$\frac{dx}{dt} = -\frac{x}{10} - y$
$\frac{dy}{dt} = x - \frac{y}{10}$
The Jacobian matrix encapsulates the system’s dynamics as follows:
$J(x, y) = \frac{d}{dt}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -0.1 & -1 \\ 1 & -0.1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$
with initial conditions $f(0) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$.
Throughout the training phase, the initial value problem was solved for each test sample using the Generalized LNN framework. In this research, we evaluated the performance of different neural network models for predicting damped sinusoidal signals, focusing on a comparative analysis of three methodologies: a Neural ODE model, a conventional LNN, and the proposed Generalized Liquid Neural Network. Figure 3 plots the true trajectory against the predicted trajectory in the phase plane for each model, showing how the learned function f (the orange trajectory) fits the true values. A comparison of the models’ trajectory-prediction abilities and training losses is given in Figure 4. The experiments demonstrated a consistent decrease in loss, indicating an improvement in learning accuracy. Our experimental setup consists of a neural network model that predicts the system’s evolution. This model comprises a single hidden layer with 50 neurons and uses the hyperbolic tangent activation function. The network outputs are two-dimensional, corresponding to the dynamical system under study. Weights and biases are initialized with a normal distribution (mean = 0 and standard deviation = 0.1) and zeros, respectively, to promote convergence. The dataset involves 1000 data points sampled from the system’s trajectory, with training conducted in mini-batches of 20 samples, each consisting of sequences of 10 time points. The training process is repeated over 20,000 iterations with a learning rate of 0.001 using the RMSprop optimization algorithm. Loss is calculated as the norm of the difference between predicted and actual trajectories, and this metric is recorded every 20 iterations to monitor training progress.
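The sketch below condenses the data-generation and mini-batching part of this setup, using the torchdiffeq solver interface; the time horizon and solver settings are assumptions, while the remaining quantities follow the description above.

# Sketch of the Case-1 data pipeline: simulate the linear system defined by the Jacobian,
# then sample mini-batches of 20 sub-trajectories of 10 time points each.
# The time horizon is an assumption; other settings follow the text.
import torch
from torchdiffeq import odeint

true_A = torch.tensor([[-0.1, -1.0], [1.0, -0.1]])      # Jacobian of the damped-sinusoid system
y0 = torch.tensor([2.0, 0.0])                           # initial condition f(0)
t = torch.linspace(0.0, 25.0, 1000)                     # 1000 samples along the trajectory

with torch.no_grad():
    true_y = odeint(lambda t, y: y @ true_A.T, y0, t, method="dopri5")   # shape (1000, 2)

def get_batch(batch_size=20, seq_len=10):
    """Random starting indices -> mini-batch of short sequences for training."""
    idx = torch.randint(0, len(t) - seq_len, (batch_size,))
    batch_y0 = true_y[idx]                               # (20, 2) initial states
    batch_t = t[:seq_len]                                # shared relative time grid
    batch_y = torch.stack([true_y[i:i + seq_len] for i in idx], dim=1)   # (10, 20, 2) targets
    return batch_y0, batch_t, batch_y

The neural vector field itself then matches the architecture described above (a single 50-neuron tanh hidden layer with a two-dimensional output) and is trained with RMSprop at a learning rate of 0.001 on these mini-batches.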
Both the conventional LNN and the proposed Generalized LNN were trained for 20,000 iterations, using early stopping mechanisms to optimize training time and prevent overfitting. The Generalized LNN framework utilized an NCP to learn the vector field of the dynamical system, integrating it with the ‘odeint’ function using the “DOPRI 5” method.
The final loss reported for the Neural ODE method was 1.9899, which sets a benchmark for performance over the specified number of iterations. The conventional LNN reached a higher loss of 2.5494, indicating substantially weaker learning of the target signal compared to the Neural ODE. The proposed Generalized LNN, which features enhancements tailored to the model’s architecture and learning strategy, recorded a significantly lower loss of 1.0738 at the end of 20,000 iterations. This represents the best performance among the three methods tested and highlights the effectiveness of the modifications in accurately predicting complex signal patterns.
These results underscore the potential of the proposed Generalized LNN, which outperformed the traditional approaches in accuracy and demonstrated robustness in handling the intricacies of damped sinusoidal signals, thereby confirming the benefits of the architectural adjustments and specialized Neural Circuit Policies.

5.2. Case 2: Estimating Output in a Non-Linear RLC Circuit Using a Generalized Liquid Neural Network Framework

In this case study, we explored the application of a GLNN framework to predict the output of a non-linear RLC circuit, as depicted in Figure 5. The circuit’s dynamic behaviour is captured by continuous-time state-space equations, where the non-linear dependency of the inductance on the inductor current $i_L$ is a focal point of our analysis.
The circuit setup is inspired by [32] and is designed around several key components and parameters that dictate its overall behaviour. The input voltage, denoted as $v_{in}$, serves as the primary driving force influencing the dynamics of the circuit. Accompanying this are two state variables: the capacitor voltage $v_C$ and the inductor current $i_L$, both critical to the circuit’s functionality. The capacitor voltage $v_C$ is directly influenced by $v_{in}$, while the inductor current $i_L$ plays a pivotal role in affecting the inductance $L$. The resistance $R$ is fixed at 3 Ohms, and the capacitance $C$ is set at 270 nF (nanofarads).
A continuous-time state-space equation describes the behaviour of the circuit as follows:
$\begin{bmatrix} \dot{v}_C \\ \dot{i}_L \end{bmatrix} = \begin{bmatrix} 0 & \frac{1}{C} \\ -\frac{1}{L(i_L)} & -\frac{R}{L(i_L)} \end{bmatrix}\begin{bmatrix} v_C \\ i_L \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{1}{L(i_L)} \end{bmatrix} v_{in}$
Additionally, the inductance, represented as $L(i_L)$, varies with the inductor current and is given by the following relationship:
$L(i_L) = L_0\left[0.9\left(\frac{1}{\pi}\arctan\!\left(-5\left(\lvert i_L \rvert - 5\right)\right) + 0.5\right) + 0.1\right]$
This formula ensures that $L(i_L)$ adjusts dynamically with changes in $i_L$, thereby influencing both the inductor current and the overall circuit response to varying input voltages. Integrating these elements forms a comprehensive system crucial for understanding and predicting the circuit’s behaviour under different operational conditions.
This relationship reflects the characteristics typical of ferrite inductors operating in partial saturation. For this study, the GLNN framework is tasked with estimating the system’s state variables, $v_C$ and $i_L$, which are observed under noisy conditions. The training dataset, denoted as D, with N = 4000 samples, is constructed by simulating the system over a 2 ms duration with a fixed timestep $\Delta t = 0.5$ μs. The input $v_{in}$ for this dataset is a filtered white noise signal with a bandwidth of 150 kHz and a standard deviation of 80 V. This setup provides a realistic noise environment in which the observations of $v_C$ and $i_L$ are corrupted by additive white Gaussian noise with zero mean and standard deviations of 10 V and 1 A, respectively, corresponding to signal-to-noise ratios (SNRs) of 20 dB and 13 dB.
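A simulation sketch for generating such a dataset is shown below; the values of R, C, the timestep, the input bandwidth and amplitude, and the observation-noise levels follow the description above, whereas L_0, the filter order, and the solver settings are assumptions made for illustration.

# Sketch: simulating the non-linear RLC circuit to build the training set.
# R and C follow the text; L0 and the input-filter design are assumed values.
import numpy as np
from scipy.signal import butter, lfilter
from scipy.integrate import solve_ivp

R, C, L0 = 3.0, 270e-9, 50e-6           # ohms, farads; L0 (henry) is an assumption
dt, N = 0.5e-6, 4000                    # 0.5 us timestep, 2 ms horizon
t_grid = np.arange(N) * dt

def L(i_l):
    return L0 * (0.9 * (np.arctan(-5.0 * (np.abs(i_l) - 5.0)) / np.pi + 0.5) + 0.1)

# Input: white noise low-pass filtered to ~150 kHz bandwidth, scaled to an 80 V standard deviation
rng = np.random.default_rng(0)
b, a = butter(2, 150e3, fs=1.0 / dt)    # 2nd-order Butterworth (assumed filter order)
v_in = lfilter(b, a, rng.normal(size=N))
v_in = 80.0 * v_in / v_in.std()

def rhs(t, x):
    v_c, i_l = x
    vin = np.interp(t, t_grid, v_in)    # interpolated input voltage at time t
    return [i_l / C, (-v_c - R * i_l + vin) / L(i_l)]

sol = solve_ivp(rhs, (0.0, t_grid[-1]), [0.0, 0.0], t_eval=t_grid, method="RK45", max_step=dt)
v_c_noisy = sol.y[0] + rng.normal(scale=10.0, size=N)   # ~20 dB SNR observation noise on v_C
i_l_noisy = sol.y[1] + rng.normal(scale=1.0, size=N)    # ~13 dB SNR observation noise on i_L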
An independent test dataset is generated in the same way. On this dataset, a traditional LNN is also used to estimate the system’s state variables, $v_C$ and $i_L$, allowing the robustness and adaptability of the GLNN framework to be compared against it.
This case study effectively demonstrates the proposed framework’s capability to handle complex, non-linear systems with varying parameters and under significant observational noise. It showcases its potential utility in real-world engineering applications where precise estimation of system dynamics is crucial.
The Generalized Liquid Neural Network framework demonstrated a notable enhancement in precision, achieving a precision score of 0.95 compared to the 0.75 achieved by the conventional LNN. This 20% improvement highlights the superior capability of the Generalized LNN in capturing the complex dynamics of non-linear systems more accurately. Additional performance metrics also showed improvement, with accuracy rising from 0.82 to 0.90 and recall from 0.78 to 0.88, confirming the overall enhanced effectiveness of the Generalized LNN framework. Figure 6 presents a comparison of voltage and current over time in an RLC circuit, as captured by two distinct frameworks: a standard Liquid Neural Network (LNN) and a Generalized Liquid Neural Network framework. The true output of the circuit is depicted in black, while the predicted outputs by the respective frameworks are shown in red and blue. The term ‘sim’ denotes that the data represented are from simulations, highlighting the effectiveness of these frameworks in accurately modelling the circuit’s dynamics.

5.3. Case 3: Retinal Disease Classification Using Generalized LNN Framework

Our research explored integrating the Neural Circuit Policies (NCPs) network within an encoder architecture to extract features from Optical Coherence Tomography (OCT) images and classify the diseases. The OCTMNIST dataset [33], comprising 100,000 validated OCT images, is categorized into four diagnostic groups: Choroidal Neovascularization (CNV), Diabetic Macular Oedema (DME), Drusen, and normal retina. These conditions vary from CNV, involving new blood vessel growth that can lead to significant visual loss, to DME, characterized by fluid accumulation impacting sharp central vision, and Drusen, indicative of potential age-related macular degeneration [34]. Various machine learning techniques [35,36] have been used to analyse the OCT image, ranging from hybrid attention-based U-Net to vision transformers.
Our experimental framework utilizes the proposed Generalized Liquid Neural Network (LNN) framework with 19 hidden neurons, designed to process sequences of 32 features simulated from the spatial dimensions of OCT images. This setup allows us to capture intricate patterns within the retina, which is crucial for accurate disease identification. The model undergoes training over 100 epochs, with a batch size of 128 and a learning rate of 0.001, aiming to balance computational efficiency with high performance. We employ Cross-Entropy Loss to evaluate model performance and the F1 score to estimate the balance between precision and recall. These metrics are vital in medical imaging, where accurate disease classification can significantly influence diagnostic and treatment decisions.
In our exploration of OCT image handling, we reviewed two sequence modelling techniques within the Neural Circuit Policies (NCPs) framework described by Truong et al. [37]. These techniques, known as Y-NCP and Z-NCP, mentioned in Figure 7, were explicitly adapted to manage the unique data structure of OCT images. The Y-NCP approach organizes data along the y-axis, creating sequences representing rows of pixels across the image’s width. In contrast, the Z-NCP method arranges pixel values along the depth, or z-axis, effectively segregating channel layers into sequences. This alignment allows for independent learning among sequence elements, enhancing the model’s capability to process and analyse the complexities inherent in medical imagery. Truong et al.’s research demonstrated that the Z-NCP model offers superior robustness compared to the Y-NCP method. Consequently, we have chosen to utilize the Z-NCP sequencing model, which significantly improves our ability to effectively process and classify OCT images, facilitating more accurate and reliable medical assessments.
These sequences are then processed through the generalized framework of LNN, enhanced by the NCP head. The NCP head is configured similarly to the neural structure of the C. elegans nematode, including sensory, interneuron, command, and motor layers. This multi-layered approach ensures that each sequence is thoroughly analysed, decisions are made intelligently, and accurate classifications are executed. Post-processing, the data passes through a fully connected (FC) layer, which transforms the sequence of predictions into a single probability score per class, ensuring precise disease classification. This innovative method improves the efficiency and accuracy of feature extraction and utilizes the unique capabilities of the NCP network to enhance diagnostic performance while minimizing computational resources. The block diagram in Figure 8 consists of three main blocks: encoder-based feature extraction, bottleneck, and NCP head. The first block extracts the features using an encoder, and the second block sequentializes the data using the adaptive global average and Z-NCP. The third block comprises the NCP head and FC layer for disease classification.
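The sketch below mirrors this three-block pipeline (encoder, bottleneck with depth-wise sequencing, and NCP head followed by an FC classifier), again assuming the ncps package for the NCP head and the 28 × 28 grayscale OCTMNIST input format; the encoder depth, channel counts, and optimiser are illustrative assumptions, while the 19 hidden neurons, 32-step sequences of 32 features, four classes, cross-entropy loss, and 0.001 learning rate follow the description above.

# Sketch of the Figure 8 pipeline: CNN encoder -> bottleneck pooling -> depth-wise (Z-NCP-style)
# sequence -> NCP head -> FC classifier. Channel counts, encoder depth, and optimiser are assumptions.
import torch
import torch.nn as nn
from ncps.wirings import AutoNCP
from ncps.torch import LTC

class GLNNClassifier(nn.Module):
    def __init__(self, n_classes=4, seq_len=32, feat_dim=32, ncp_neurons=19):
        super().__init__()
        self.encoder = nn.Sequential(                    # feature-extraction block (assumed depth)
            nn.Conv2d(1, seq_len, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(seq_len, seq_len, 3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d((4, 8))         # bottleneck: 4 x 8 = 32 spatial features per channel
        self.ncp = LTC(feat_dim, AutoNCP(ncp_neurons, n_classes), batch_first=True)
        self.fc = nn.Linear(n_classes, n_classes)        # final fully connected layer

    def forward(self, x):
        z = self.pool(self.encoder(x))                   # (B, 32, 4, 8)
        z = z.flatten(2)                                 # depth-wise sequence: (B, 32 channels, 32 features)
        out, _ = self.ncp(z)                             # NCP head over the channel-wise sequence
        return self.fc(out[:, -1, :])                    # one score (logit) per class

model = GLNNClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)    # learning rate from the text; optimiser assumed
logits = model(torch.randn(4, 1, 28, 28))                    # assumes 28 x 28 grayscale OCTMNIST inputs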
The LNN achieved an AUC of 0.97, an accuracy of 0.96, and an F1 score of 0.88. Meanwhile, the GLNN showed identical AUC performance at 0.97 but surpassed the LNN with an accuracy of 0.98 and an F1 score of 0.98. These results indicate that while both models are highly effective in distinguishing between classes, GLNN offers improved accuracy and precision, making it a superior choice in scenarios requiring optimal classification performance. The comparative analysis of the training accuracy and loss curve for the LNN and Generalized LNN, presented in Figure 9, distinctly demonstrates the superiority of the proposed method. Throughout the 100 epochs, Generalized LNN achieves a steeper and more rapid improvement in training accuracy and maintains higher accuracy levels, indicating efficient learning and better stability. The loss graphs further underscore Generalized LNN’s advantages, showing a sharper and more consistent decrease in loss, with the final values approximately 10% lower than LNN. This consistent outperformance in terms of higher accuracy and lower loss highlights its potential for better generalization, making it a more effective model. The results suggest that the proposed framework is a more robust and efficient choice, effectively adapting and learning quicker within the same training regime. The Generalized LNN is particularly suitable for retinal disease classification due to its dynamic temporal processing capabilities, biologically inspired architecture, and recurrent neuron dynamics. GLNNs excel at handling the sequential nature of OCT images, capturing patterns across different slices, which is crucial for accurate disease detection. The general form of the dynamic neurons allows them to maintain and update hidden states over time, effectively integrating information across multiple slices. Additionally, advanced sequence modelling techniques like Y-NCP and Z-NCP enable the network to analyse spatial and depth information within the images. GLNNs’ adaptability and efficient resource utilization, resulting from their sparse and compact structure, make them highly effective and computationally efficient for medical imaging tasks. These characteristics collectively enhance GLNNs’ ability to capture complex patterns and slight variations in OCT images, leading to precise and reliable retinal disease classification. Table 1 provides a performance comparison between the proposed model and existing state-of-the-art approaches, with the best results emphasized in bold.

6. Conclusions

The Generalized Liquid Neural Network (GLNN) framework demonstrates exceptional versatility and adaptability across diverse applications, including predicting damped sinusoidal trajectories, modelling non-linear RLC circuits, and classifying retinal diseases from Optical Coherence Tomography (OCT) images. By integrating advanced computational techniques such as neural ordinary differential equations and the Runge-Kutta DOPRI 5 method, the GLNN significantly enhances predictive accuracy and performance.
In each case study, the GLNN outperformed traditional models, proving its capability to address complex scientific and engineering challenges. The framework’s application to biomedical imaging also underscores its potential to advance healthcare diagnostics by improving the accuracy and efficiency of medical image analysis.
Overall, the GLNN framework is a practical tool for solving complex problems across various domains, offering improved accuracy and expanding the scope of applications. Future research will explore further applications and innovations, continuing to integrate neural network models with differential equation solvers to tackle increasingly complex systems.

Author Contributions

P.K.K.: conceptualization, algorithm development, methodology, software implementation and experimentation, investigation, writing—original draft, writing—reviewing and editing, I.A.: conceptualization, algorithm development, research development, investigation, writing—reviewing and editing. W.H.A.: conceptualization, research direction and investigation, resources, supervision, writing—reviewing and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data source is cited within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Karlsson, D.; Svanström, O. Modelling Dynamical Systems Using Neural Ordinary Differential Equations. 2019. Available online: https://odr.chalmers.se/handle/20.500.12380/256887 (accessed on 26 May 2024).
  2. Biloš, M.; Sommer, J.; Rangapuram, S.S.; Januschowski, T.; Günnemann, S. Neural flows: Efficient alternative to neural ODEs. Adv. Neural Inf. Process. Syst. 2021, 34, 21325–21337. [Google Scholar]
  3. Cai, H.; Dan, T.; Huang, Z.; Wu, G. OSR-NET: Ordinary Differential Equation-Based Brain State Recognition Neural Network. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, Colombia, 18–21 April 2023; pp. 1–5. [Google Scholar]
  4. Wu, Y.; Dong, M.; Jena, R.; Qin, C.; Gee, J.C. Neural Ordinary Differential Equation based Sequential Image Registration for Dynamic Characterization. arXiv 2024, arXiv:2404.02106. [Google Scholar]
  5. Shi, Y.; Jiang, K.; Wang, K.; Li, J.; Wang, Y.; Yang, M.; Yang, D. StreamingFlow: Streaming Occupancy Forecasting with Asynchronous Multi-modal Data Streams via Neural Ordinary Differential Equation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 14833–14842. [Google Scholar]
  6. Pan, L.; Lu, J.; Tang, X. Spatial-temporal graph neural ODE networks for skeleton-based action recognition. Sci. Rep. 2024, 14, 7629. [Google Scholar] [CrossRef] [PubMed]
  7. Dormand, J.R.; Prince, P.J. A family of embedded Runge-Kutta formulae. J. Comput. Appl. Math. 1980, 6, 19–26. [Google Scholar] [CrossRef]
  8. Al Ghafli, A.A.; Nawaz, Y.; Al Salman, H.J.; Mansoor, M. Extended Runge-Kutta scheme and neural network approach for SEIR epidemic model with convex incidence rate. Processes 2023, 11, 2518. [Google Scholar] [CrossRef]
  9. Chen, R.T.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. arXiv 2018, arXiv:1806.07366. [Google Scholar] [CrossRef]
  10. Dupont, E.; Doucet, A.; Teh, Y.W. Augmented neural odes. arXiv 2019, arXiv:1904.01681. [Google Scholar] [CrossRef]
  11. Rubanova, Y.; Chen, R.T.; Duvenaud, D.K. Latent ordinary differential equations for irregularly-sampled time series. arXiv 2019, arXiv:1907.03907. [Google Scholar] [CrossRef]
  12. Massaroli, S.; Poli, M.; Park, J.; Yamashita, A.; Asama, H. Dissecting neural ODEs. Adv. Neural Inf. Process. Syst. 2020, 33, 3952–3963. [Google Scholar]
  13. Haber, E.; Ruthotto, L. Stable architectures for deep neural networks. Inverse Probl. 2017, 34, 014004. [Google Scholar] [CrossRef]
  14. Poli, M.; Massaroli, S.; Park, J.; Yamashita, A.; Asama, H.; Park, J. Graph neural ordinary differential equations. arXiv 2019, arXiv:1911.07532. [Google Scholar]
  15. Gholami, A.; Keutzer, K.; Biros, G. ANODE: Unconditionally accurate memory-efficient gradients for neural ODEs. arXiv 2019, arXiv:1902.10298. [Google Scholar]
  17. Pearlmutter, B.A. Gradient calculations for dynamic recurrent neural networks: A survey. IEEE Trans. Neural Netw. 1995, 6, 1212–1228. [Google Scholar] [CrossRef]
  18. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  19. Chang, B.; Meng, L.; Haber, E.; Ruthotto, L.; Begert, D.; Holtham, E. Reversible architectures for arbitrarily deep residual neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018. [Google Scholar] [CrossRef]
  20. Lu, Y.; Zhong, A.; Li, Q.; Dong, B. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 3276–3285. [Google Scholar] [CrossRef]
  21. Graves, A. Adaptive computation time for recurrent neural networks. arXiv 2016, arXiv:1603.08983. [Google Scholar]
  22. Jernite, Y.; Grave, E.; Joulin, A.; Mikolov, T. Variable computation in recurrent neural networks. arXiv 2016, arXiv:1611.06188. [Google Scholar]
  23. Huang, Q.; Zhou, K.; You, S.; Neumann, U. Learning to prune filters in convolutional neural networks. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 709–718. [Google Scholar] [CrossRef]
  24. Funahashi, K.; Nakamura, Y. Approximation of dynamical systems by continuous time recurrent neural networks. Neural Netw. 1993, 6, 801–806. [Google Scholar] [CrossRef]
  25. Koch, C.; Segev, I. Methods in Neuronal Modeling: From Ions to Networks; MIT Press: Cambridge, MA, USA, 1998. [Google Scholar] [CrossRef]
  26. Lechner, M.; Hasani, R.; Amini, A.; Henzinger, T.A.; Rus, D.; Grosu, R. Neural circuit policies enabling auditable autonomy. Nat. Mach. Intell. 2020, 2, 642–652. [Google Scholar] [CrossRef]
  27. Biswal, B.; Karn, P.K.; Sairam, M.; Surekhabolli, B.R. Time-frequency analysis and classification of power signals using adaptive cuckoo search algorithm. Int. J. Numer. Model. Electron. Netw. Devices Fields 2019, 32, e2477. [Google Scholar] [CrossRef]
  28. Al-Fahoum, A.S.; Al-Fraihat, A.A. Methods of EEG Signal Features Extraction Using Linear Analysis in Frequency and Time-Frequency Domains. Int. Sch. Res. Not. 2014, 2014, 730218. [Google Scholar] [CrossRef]
  29. Boashash, B. Time-Frequency Signal Analysis and Processing: A Comprehensive Reference; Academic Press: Cambridge, MA, USA, 2016. [Google Scholar] [CrossRef]
  30. Hasani, R.; Lechner, M.; Amini, A.; Rus, D.; Grosu, R. Liquid time-constant networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; pp. 7657–7666. [Google Scholar] [CrossRef]
  31. Cao, Y.; Li, S.; Petzold, L.; Serban, R. Adjoint sensitivity analysis for differential-algebraic equations: The adjoint DAE system and its numerical solution. SIAM J. Sci. Comput. 2003, 24, 1076–1089. [Google Scholar] [CrossRef]
  32. Forgione, M.; Piga, D. Continuous-time system identification with neural networks: Model structures and fitting criteria. Eur. J. Control 2021, 59, 69–81. [Google Scholar] [CrossRef]
  33. Yang, J.; Shi, R.; Ni, B. MedMNIST classification decathlon: A lightweight AutoML benchmark for medical image analysis. In Proceedings of the 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), Nice, France, 13–16 April 2021; pp. 191–195. [Google Scholar]
  34. Karn, P.K.; Abdulla, W.H. On Machine Learning in Clinical Interpretation of Retinal Diseases Using OCT Images. Bioengineering 2023, 10, 407. [Google Scholar] [CrossRef] [PubMed]
  35. Karn, P.K.; Abdulla, W.H. Abdulla Enhancing Retinal Disease Classification with Dual Scale Twin Vision Transformers using OCT Imaging. In Proceedings of the 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Taipei, Taiwan, 31 October–3 November 2023; pp. 2362–2369. [Google Scholar]
  36. Karn, P.K.; Abdulla, W.H. Advancing Ocular Imaging: A Hybrid Attention Mechanism-Based U-Net Model for Precise Segmentation of Sub-Retinal Layers in OCT Images. Bioengineering 2024, 11, 240. [Google Scholar] [CrossRef]
  37. Truong, H.M.; Huynh, H.T. A Novel Approach of Using Neural Circuit Policies for COVID-19 Classification on CT-Images. In Proceedings of the International Conference on Future Data and Security Engineering, Ho Chi Minh City, Vietnam, 23–25 November 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 640–652. [Google Scholar]
  38. Kermany, D.S.; Goldbaum, M.; Cai, W.; Valentim, C.C.S.; Liang, H.; Baxter, S.L.; McKeown, A.; Yang, G.; Wu, X.; Yan, F.; et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell 2018, 172, 1122–1131.e9. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Two neurons example for a dynamic state system.
Figure 2. Generalized Liquid Neural Network framework for sequential and non-sequential data.
Figure 3. Phase portrait of predictions from different models.
Figure 4. Damped sinusoidal prediction using GLNN vs. LNN vs. Neural ODE.
Figure 5. RLC circuit.
Figure 6. Voltage and current over time in an RLC circuit obtained by the LNN and Generalized LNN frameworks.
Figure 7. Y-NCP and Z-NCP sequence modelling techniques.
Figure 8. Block diagram for OCT disease classification with NCP adoption.
Figure 9. Training and validation accuracy plot (right) and training loss (left).
Table 1. Comparison of the proposed model with the existing state-of-the-art [38] (best values in bold).

Methods            AUC      ACC
ResNet-18 (28)     0.943    0.743
ResNet-18 (224)    0.958    0.763
ResNet-50 (28)     0.952    0.762
ResNet-50 (224)    0.958    0.776
auto-sklearn       0.887    0.601
AutoKeras          0.955    0.763
LNN                0.97     0.96
Proposed GLNN      0.97     0.98
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
