1. Introduction
In recent years, the integration of artificial intelligence (AI) and machine learning (ML) with the natural sciences and physical engineering has led to significant advancements, particularly in addressing the complexities of nonlinear partial differential equations (PDEs). These equations are fundamental to understanding various physical phenomena, ranging from turbulent fluid dynamics to complicated physico-chemical processes. The domain of nonlinear PDE systems contains a rich tapestry of intricate dynamics, including instabilities, multiscale interactions, and chaotic behaviors. To enhance predictive capabilities and design robust control strategies in engineering applications, computational methods are indispensable. These methods, often in the form of numerical solvers, enable the accurate simulation of PDE solutions across spatial and temporal domains. Implicit in these solvers is the concept of a functional mapping operator, which iteratively advances the PDE solution in time, providing a pathway to explore the evolution of physical systems over extended durations. A distinctive class of machine learning methods has emerged, capable of learning and replicating the behavior of these PDE operators.
Recent advancements have seen the proliferation of operator learning methods, each offering unique insights and capabilities. Early efforts in this domain drew inspiration from deep convolutional neural networks (CNNs) [1,2,3,4,5,6,7], employing techniques from computer vision. These CNN-based approaches parameterize the PDE operator in a finite-dimensional space, enabling the mapping of discrete functions onto image-like representations. Building upon this foundation, recent strides have witnessed the development of neural operator methods [8,9] capable of learning operators of infinite dimensionality. Notable examples include the Deep Operator Network [10] and the Fourier Neural Operator (FNO) [11], both demonstrating remarkable proficiency across a diverse array of benchmark problems [12,13]. Furthermore, recent advancements have extended neural operators by incorporating concepts from wavelet methods [14,15] and by adapting the approaches to complex domains [16].
In our recent investigations [17,18], we delved into the intricate dynamics of flame instability and nonlinear evolution, a canonical problem with profound implications for combustion science. Flames can undergo destabilization due to intrinsic instabilities, including the hydrodynamic Darrieus–Landau (DL) mechanism [19,20], attributed to density gradients across a flame, and the diffusive–thermal (DT) mechanism [21,22], driven by disparities between heat and reactant diffusion. Our previous work [17] primarily focused on DL flames, scrutinizing the evolution of unstable flame fronts within periodic channels of varying widths. Under DL instability, an initially planar flame morphs into a steady curved front; as the channel width increases, the curved front becomes sensitive to random noise, and small wrinkles start to emerge. In sufficiently wide channels, DL flames give rise to complicated fractal fronts characterized by a hierarchical cascade of cellular structures [23].
The nonlinear evolution of DL flame development can be modeled by the Michelson–Sivashinsky equation [24], while a more accurate but computationally expensive approach involves direct numerical simulation (DNS) of the Navier–Stokes equations. Utilizing these two approaches to generate training datasets, our investigations [17] demonstrated that both CNNs and FNO could effectively capture the evolution of DL flames, with FNO exhibiting superior performance in modeling complex flame geometries over longer durations. Subsequently, we embarked on developing parameterized learning methodologies capable of encapsulating dynamics across diverse parameter regimes within a single network framework. Through the introduction of the pCNN and pFNO models [18], we demonstrated their efficacy in replicating the behavior of DL flames across varying channel widths. Additionally, our methods have shown success in learning the parametric solutions of the Kuramoto–Sivashinsky equation [25], which models unstable flame evolution due to the DT mechanism. However, a challenge remains in mitigating the tendency of these models to overestimate noise effects.
In this paper, we extend our research horizon to encompass the complexities arising from hybrid instabilities, specifically those stemming from the coexistence of the DL and DT mechanisms. These hybrid systems pose novel challenges, as they embody a rich spectrum of behaviors produced by the interplay of distinct instability modes. Leveraging our recently developed operator-learning methodologies, we aim to unravel the nuanced dynamics underlying such hybrid instabilities, shedding light on their short-term evolution and long-term statistical properties. Furthermore, our endeavor holds promise for the development of robust modeling frameworks capable of capturing the intricate dynamics of real-world flame evolution scenarios.
The paper is organized as follows: first, we describe the problem setup for learning PDE operators, followed by brief descriptions of the two parametric learning methods used in this work. These methods are then compared in the context of learning parameter-dependent solution time-advancement operators for the Sivashinsky equation [26], which models unstable front evolution due to hybrid mechanisms of flame instability. Finally, we provide a summary and conclusions.
2. Problem Setup for Learning PDE Operators
In this section, we delineate the problem setup for learning a parametric PDE operator, along with a description of recurrent training methods.
Consider a system governed by PDEs, typically involving multiple functions and mappings between them. Our focus here is on a parametric operator mapping, denoted as
$$ G_{\lambda} : \mathcal{A} \rightarrow \mathcal{U}, \tag{1} $$
where $\lambda$ represents a set of parameters. The input function is $a \in \mathcal{A}$, where $\mathcal{A} = \mathcal{A}(D; \mathbb{R}^{d_a})$ is a functional space with domain $D \subset \mathbb{R}^{d}$ and co-domain $\mathbb{R}^{d_a}$, while the output function is $u \in \mathcal{U}$, where $\mathcal{U} = \mathcal{U}(D'; \mathbb{R}^{d_u})$ belongs to another functional space with domain $D' \subset \mathbb{R}^{d'}$ and co-domain $\mathbb{R}^{d_u}$.
Our primary interest lies in the solution time-advancement operator with parametric dependence, given by
$$ G_{\lambda} : u(\cdot, t) \mapsto u(\cdot, t + \Delta t), \tag{2} $$
where $u(x, t)$ denotes the solution to a PDE under the parameters $\lambda$, and $t$ represents normalized time with a small time increment $\Delta t$. For simplicity, we assume identical domain and co-domain for both input and output functions, i.e., $D' = D$, $d' = d$, $d_u = d_a$, and $\mathcal{U} = \mathcal{A}$, with periodic boundary conditions on $D$.
To approximate the mapping $G_{\lambda}$ using neural network methods, let $\theta$ denote the trainable parameters in the network. A neural network can be defined as
$$ \hat{G}_{\lambda,\theta} : \mathcal{A} \rightarrow \mathcal{U}, \qquad \theta \in \Theta, \tag{3} $$
where $\Theta$ represents the space of network parameters. Training the neural network involves finding an optimal choice of parameters $\theta^{*} \in \Theta$ such that $\hat{G}_{\lambda,\theta^{*}}$ approximates $G_{\lambda}$.
Starting with an initial solution function $u(\cdot, 0)$ under fixed parameter values $\lambda$, the recurrent application of the operator can roll out predicted solutions of arbitrary length by iteratively updating the input function with the output from the previous prediction. Note that while the learned operator is expected to make accurate short-term predictions, its long-term prediction may be allowed to deviate if the ground-truth PDE admits chaotic solutions. On the other hand, it is still desirable that the learned operator reproduces the correct statistics of the long-term solutions.
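To make the recurrent rollout concrete, the following minimal sketch iterates a generic learned one-step operator; the `step_operator` interface and tensor shapes are illustrative assumptions rather than the exact implementation used in this work.

```python
import torch

def rollout(step_operator, u0: torch.Tensor, lam: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Recurrently apply a learned one-step advancement operator (assumed interface).

    step_operator : callable mapping (u, lam) -> solution at the next time level
    u0            : initial solution on the spatial mesh, shape (n_mesh,)
    lam           : fixed PDE parameters for the whole trajectory
    n_steps       : number of recurrent applications
    """
    trajectory = [u0]
    u = u0
    with torch.no_grad():                      # inference only
        for _ in range(n_steps):
            u = step_operator(u, lam)          # previous output becomes the next input
            trajectory.append(u)
    return torch.stack(trajectory)             # shape (n_steps + 1, n_mesh)
```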
Following previous studies [17,18], our training approach adopts a one-to-many setup in which the recurrent network is trained to make multiple successive predictions from a single input function. Such a setup promotes numerical stability in the learned solution-advancement operator, a crucial consideration highlighted in the prior work [17,18]. More specifically, let the training data be arranged as a total of $N$ input/output pairs in the 1-to-$n$ manner, $\{(a^{(j)},\, u_{1}^{(j)}, \ldots, u_{n}^{(j)})\}_{j=1}^{N}$, and let an operator with a superscript $n$ denote its repeated application $n$ times, e.g., $G_{\lambda}^{\,n} = G_{\lambda} \circ \cdots \circ G_{\lambda}$. Training a network $\hat{G}_{\lambda,\theta}$ to approximate $G_{\lambda}$ then becomes the minimization task
$$ \min_{\theta \in \Theta}\; \mathbb{E}_{a,\,\lambda}\, \Big[\, C\big( (\hat{G}_{\lambda,\theta}(a), \ldots, \hat{G}_{\lambda,\theta}^{\,n}(a)),\; (G_{\lambda}(a), \ldots, G_{\lambda}^{\,n}(a)) \big) \Big], \tag{4} $$
where $a$ and $\lambda$ are randomly drawn according to independent probability measures on $\mathcal{A}$ and on the parameter space, respectively. The cost function $C$ is set to the relative mean-square (L2) error on $\mathcal{U}^{n}$; here, $\mathcal{U}^{n}$ abbreviates the Cartesian product of $n$ copies of $\mathcal{U}$.
3. Parametric Operator Learning Methods
In this section, we present a concise overview of two methods capable of learning the parametric operator $G_{\lambda}$. Further details about these methods can be found in [18].
3.1. Parametric Convolutional Neural Network (pCNN)
The operator $G_{\lambda}$ can be regarded as an image-to-image map when applied to the temporal advancement of discretized solutions. Deep convolutional neural networks (CNNs) have demonstrated effectiveness in image-learning tasks. A network architecture suitable for learning operators resembles a convolutional auto-encoder similar to those in U-Net [6] and ConvPDE [7]. This network comprises an encoder block and a decoder block, with the input data passing through a series of transformations via convolutional layers. Additionally, the method incorporates side networks to handle the additional parameter inputs. The pCNN model is outlined in Figure 1.
Let $v_{0}$ denote the input function represented on an $x$-mesh. The encoder block follows an iterative update procedure in which the data at level $l$ are mapped to the next level, $v_{l} \mapsto v_{l+1}$; this iteration occurs over the level sequence $l = 0, 1, \ldots, L-1$. Denoting the last encoding output as $v_{L}$, a subsequent decoding procedure is applied by reversing the levels $l$.

Here, four data sequences are involved in the encoding/decoding updates, each with a prescribed number of channels and a spatial size that is halved at every level, i.e., the data size decreases as $1/2^{l}$ with increasing $l$. The first-stage encoder contains two sub-maps; both are implemented using vanilla stacked convolution layers (with a filter size of 3, stride 1, periodic padding, and ReLU activation), and some layers are replaced by Inception layers for improved performance. Additionally, a size-2 max-pooling layer is prepended to halve the image size for $l \geq 1$. The second-stage encoder map rescales its input by a parameter-dependent ratio, where a simple side function (a two-layer perceptron) converts the PDE parameters $\lambda$ into the scaling ratio. The decoder update involves up-sampling the current decoder output to double its size and concatenating it with the corresponding encoder output along the channel dimension. The final output is obtained from the last decoding level.
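Since the exact pCNN layout follows [18] and Figure 1, the sketch below is only a schematic analogue: the layer counts, channel widths, and the placement of the parameter scaling are illustrative assumptions. It shows how a convolutional encoder–decoder with periodic padding can be combined with a two-layer-perceptron side network that rescales encoded features according to the PDE parameters $\lambda$.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two stacked conv layers: filter size 3, stride 1, periodic (circular) padding, ReLU."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(c_in, c_out, 3, padding=1, padding_mode="circular"), nn.ReLU(),
            nn.Conv1d(c_out, c_out, 3, padding=1, padding_mode="circular"), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class ParamCNN(nn.Module):
    """Schematic pCNN analogue: U-Net-style encoder/decoder plus a parameter side network."""
    def __init__(self, width=32, n_params=2):
        super().__init__()
        c = [width, 2 * width, 4 * width]                        # channels per level (assumed)
        self.enc0, self.enc1, self.enc2 = ConvBlock(1, c[0]), ConvBlock(c[0], c[1]), ConvBlock(c[1], c[2])
        self.pool = nn.MaxPool1d(2)                              # halves the mesh size per level
        # side network (two-layer perceptron): PDE parameters -> one scaling ratio per level
        self.side = nn.Sequential(nn.Linear(n_params, 32), nn.ReLU(), nn.Linear(32, 3))
        self.up = nn.Upsample(scale_factor=2, mode="linear", align_corners=False)
        self.dec1, self.dec0 = ConvBlock(c[2] + c[1], c[1]), ConvBlock(c[1] + c[0], c[0])
        self.out = nn.Conv1d(c[0], 1, kernel_size=1)

    def forward(self, u, lam):
        # u: (batch, n_mesh) solution on the x-mesh; lam: (batch, n_params)
        x = u.unsqueeze(1)                                       # add a channel dimension
        r = self.side(lam)                                       # per-level scaling ratios
        e0 = self.enc0(x) * r[:, 0:1, None]
        e1 = self.enc1(self.pool(e0)) * r[:, 1:2, None]
        e2 = self.enc2(self.pool(e1)) * r[:, 2:3, None]
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))      # up-sample, then concat skip
        d0 = self.dec0(torch.cat([self.up(d1), e0], dim=1))
        return self.out(d0).squeeze(1)                           # back to (batch, n_mesh)
```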
3.2. Parametric Fourier Neural Operator (pFNO)
The parametric Fourier neural operator (pFNO) [18] was developed based on the original FNO method [11], wherein learning of the infinite-dimensional operator is achieved by parameterizing the integral kernel operators in Fourier space. The pFNO adopts an architecture of map composition,
$$ \hat{G}_{\lambda,\theta} = Q \circ H_{L} \circ \cdots \circ H_{1} \circ P \circ C, $$
comprising a concatenation map $C$, a lifting map $P$, a sequence of hidden maps $H_{l}$ for $l = 1, \ldots, L$, and a projection map $Q$.
The first map $C$ simply concatenates the parameters $\lambda$ to the co-dimension of the input function $a(x)$, yielding $v_{0}(x) = (a(x), \lambda)$. The second map $P$ lifts the input to a higher-dimensional functional space, $v_{1}(x) = P(v_{0}(x)) \in \mathbb{R}^{d_v}$. The subsequent hidden maps act sequentially to update $v_{l+1} = H_{l}(v_{l})$ for all $l = 1, \ldots, L$. Finally, the map $Q$ projects back to the low-dimensional functional space, yielding the output $u(x) = Q(v_{L+1}(x))$.
Both $P$ and $Q$ are implemented using simple multilayer perceptrons (MLPs). The hidden maps $H_{l}$ are implemented as parametric Fourier layers:
$$ v_{l+1}(x) = \sigma\Big( W v_{l}(x) + b + \mathcal{F}^{-1}\big( R_{\lambda} \cdot \mathcal{F}(v_{l}) \big)(x) \Big), \tag{5} $$
where $W$ and $b$ are learnable weights and biases, respectively, and $\sigma$ is a ReLU activation function. Here, $\mathcal{F}$ and $\mathcal{F}^{-1}$ represent the Fourier transform and its inverse, respectively. The function $R_{\lambda}$ acts on the truncated Fourier modes $k \leq k_{\max}$, transforming them as
$$ \big( R_{\lambda} \cdot \mathcal{F}(v_{l}) \big)(k) = \big( R_{1}(k) + g_{k}(\lambda)\, R_{2}(k) \big)\, \mathcal{F}(v_{l})(k), \tag{6} $$
where $R_{1}$ and $R_{2}$ are two learnable weight tensors and $g$ is a function converting the parameters $\lambda$ into $k_{\max}$ scaling ratios $g_{k}(\lambda)$. This function consists of a two-stage map, with the first stage outputting a reduced number of scaling ratios and implemented as an MLP. The second map hierarchically redistributes these ratios across the wave numbers: in one dimension ($d = 1$), the low wave numbers receive individual ratios, while progressively wider bands of higher wave numbers share common ratios (the exact distribution map is given in [18]).
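A minimal sketch of such a parametric Fourier layer is given below; for simplicity, the hierarchical redistribution of [18] is replaced by a plain MLP that outputs one ratio per retained wave number, and the layer width and mode count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ParamFourierLayer(nn.Module):
    """Schematic parametric Fourier layer: v -> ReLU(W v + b + IFFT(R_lambda * FFT(v)))."""
    def __init__(self, width=64, k_max=32, n_params=2):
        super().__init__()
        self.k_max = k_max
        scale = 1.0 / (width * width)
        # two learnable complex weight tensors acting on the truncated Fourier modes
        self.R1 = nn.Parameter(scale * torch.randn(k_max, width, width, dtype=torch.cfloat))
        self.R2 = nn.Parameter(scale * torch.randn(k_max, width, width, dtype=torch.cfloat))
        self.W = nn.Conv1d(width, width, kernel_size=1)          # pointwise linear map W v + b
        # g: PDE parameters -> one scaling ratio per retained wave number (simplified; see [18])
        self.g = nn.Sequential(nn.Linear(n_params, 32), nn.ReLU(), nn.Linear(32, k_max))

    def forward(self, v, lam):
        # v: (batch, width, n_mesh); lam: (batch, n_params)
        v_hat = torch.fft.rfft(v, dim=-1)[..., : self.k_max]     # truncated Fourier modes
        ratios = self.g(lam).to(torch.cfloat)                    # (batch, k_max)
        R = self.R1 + ratios[:, :, None, None] * self.R2         # parameter-dependent kernel
        out_hat = torch.einsum("bkij,bjk->bik", R, v_hat)        # mix channels mode by mode
        out = torch.zeros(v.shape[0], v.shape[1], v.shape[-1] // 2 + 1,
                          dtype=torch.cfloat, device=v.device)
        out[..., : self.k_max] = out_hat
        spectral = torch.fft.irfft(out, n=v.shape[-1], dim=-1)
        return torch.relu(self.W(v) + spectral)
```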
One might observe that the second weight tensor $R_{2}$ in Equation (6) can be deactivated by enforcing the map $g$ to output only zeros. This modification still enables learning of the parametric operator owing to the concatenation map $C$. Such a modified method can be viewed as a simple tweak of the baseline FNO method [11] and will be referred to as pFNO* in later sections.
4. Numerical Experiments and Result Discussions
In this section, we employ the pFNO and pCNN methods to learn flame evolution under hybrid instabilities arising from both the Darrieus–Landau (DL) [19,20] and diffusive–thermal (DT) [21,22] mechanisms. The dynamics of such unstable flame development are encapsulated by the Sivashinsky equation [26]. To facilitate parametric learning, we begin by reformulating the Sivashinsky equation, introducing two parameters that enable straightforward specification of the blending between the two instabilities and control of the largest unstable wave number. By sampling across these parameters, we construct an extensive training dataset covering a range of relevant scenarios subjected to different DL/DT mixing. Subsequently, we present the results and compare the performance of the different methods in learning these hybrid instabilities.
4.1. Governing Equations
Consider modeling the unstable development of a statistically planar flame front. Let $t$ denote time and $x$ represent the spatial coordinate along the normal direction of flame propagation. Introduce a displacement function $\phi(x, t)$ describing the stream-wise coordinate of a flame front undergoing intrinsic flame instabilities. Such evolution can be modeled by the Sivashinsky equation [26]:
$$ \frac{\partial \phi}{\partial t} + \frac{1}{2}\left(\frac{\partial \phi}{\partial x}\right)^{2} = \alpha\, \frac{\partial^{2} \phi}{\partial x^{2}} - \frac{\partial^{4} \phi}{\partial x^{4}} + \gamma\, \Gamma(\phi), \tag{7} $$
where $\Gamma$ is a linear singular non-local operator defined using the Hilbert transform $\mathcal{H}$, or equivalently written as $\Gamma(\phi) = \mathcal{F}^{-1}\big( |k|\, \mathcal{F}(\phi) \big)$ using the spatial Fourier transform $\mathcal{F}$ and its inverse $\mathcal{F}^{-1}$.
In Equation (7), the coefficient $\gamma$ of the non-local term is set by the density ratio between burned product and fresh reactant, whereas $\alpha$ is a ratio (positive or negative) depending on the Lewis number of the deficient reactant and a critical Lewis number. Introducing three constants for the transformation of the time $t$, the space coordinate $x$, and the displacement function $\phi$, Equation (7) can be rewritten as
$$ \frac{\partial \phi}{\partial t} + \frac{1}{2}\left(\frac{\partial \phi}{\partial x}\right)^{2} = c_{2}\, \frac{\partial^{2} \phi}{\partial x^{2}} - c_{4}\, \frac{\partial^{4} \phi}{\partial x^{4}} + c_{\Gamma}\, \Gamma(\phi), \tag{8} $$
with rescaled coefficients $c_{2}$, $c_{4}$, and $c_{\Gamma}$ determined by $\alpha$, $\gamma$, and the three transformation constants.
In this work, we consider the flame-front solution $\phi(x, t)$ of Equation (8) in a channel domain subjected to a periodic boundary condition, i.e., $\phi(x, t) = \phi(x + 2\pi, t)$. One might notice that Equation (8) admits a zero equilibrium solution corresponding to a flat flame (i.e., $\phi = 0$); a perturbation analysis around this zero solution yields the linear dispersion relation
$$ \omega(k) = -\,c_{2}\, k^{2} - c_{4}\, k^{4} + c_{\Gamma}\, |k|, \tag{9} $$
with the perturbed solution being $\phi = \epsilon\, e^{\omega t + \mathrm{i} k x} + \epsilon^{*} e^{\omega t - \mathrm{i} k x}$ (superscript * denotes the complex conjugate) and the Fourier mode of the perturbation evolving as $e^{\omega(k) t}$.
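The non-local operator $\Gamma$ is most conveniently evaluated in Fourier space. A minimal sketch (assuming a $2\pi$-periodic domain discretized on a uniform mesh) is:

```python
import numpy as np

def gamma_op(phi: np.ndarray) -> np.ndarray:
    """Non-local DL operator: Gamma(phi) = F^{-1}( |k| * F(phi) ) on a periodic mesh."""
    n = phi.size
    k = np.fft.rfftfreq(n, d=1.0 / n)          # integer wave numbers 0, 1, ..., n/2 on a 2*pi domain
    return np.fft.irfft(np.abs(k) * np.fft.rfft(phi), n=n)
```

Applied to a single Fourier mode $e^{\mathrm{i}kx}$, this returns $|k|\,e^{\mathrm{i}kx}$, which is the contribution entering the dispersion relation (9).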
Equations (8) and (9) present a straightforward approach to hybridizing the two flame instabilities of the DT and DL mechanisms. This strategy is accomplished by specifying two parameters, a blending ratio $\rho$ and a cut-off wave number $k_{c}$, while the remaining coefficients ($c_{2}$, $c_{4}$, and $c_{\Gamma}$) are determined by the additional constraints outlined below. Initially, the parameter $\rho$ (between 0 and 1) is defined to allow for the continuous blending of the two instabilities. When $\rho = 1$, the non-local term dominates and the Sivashinsky Equation (8) yields a pure DL instability described by the Michelson–Sivashinsky (MS) equation [24]:
$$ \frac{\partial \phi}{\partial t} + \frac{1}{2}\left(\frac{\partial \phi}{\partial x}\right)^{2} = c_{2}\, \frac{\partial^{2} \phi}{\partial x^{2}} + c_{\Gamma}\, \Gamma(\phi), \tag{10} $$
whereas, at the other end ($\rho = 0$), it recovers the pure DT instability as described by the Kuramoto–Sivashinsky (KS) equation [25]:
$$ \frac{\partial \phi}{\partial t} + \frac{1}{2}\left(\frac{\partial \phi}{\partial x}\right)^{2} = c_{2}\, \frac{\partial^{2} \phi}{\partial x^{2}} - c_{4}\, \frac{\partial^{4} \phi}{\partial x^{4}}. \tag{11} $$
Secondly, the parameter $k_{c}$ is determined as the largest wave number at which the dispersion relation of Equation (9) equals zero (i.e., $\omega(k_{c}) = 0$). Consequently, we can prescribe $k_{c}$ to establish the largest unstable wave number. To remove the remaining ambiguity in the coefficients, a third constraint is imposed: the maximum value of $\omega(k)$ over the interval $0 < k < k_{c}$ is fixed to a common value. This normalization is employed to better accommodate the timescales attributed to the various hybrid instabilities. The strategy allows all remaining coefficients to be determined given the values of $\rho$ and $k_{c}$, as illustrated in Figure 2, which presents dispersion-relation plots and the associated parameters.
Before proceeding further, it may be worthwhile to mention a few well-known results. The KS Equation (11) is often utilized as a benchmark example for PDE-learning studies and is renowned for exhibiting chaotic solutions when the domain admits many unstable modes. On the other hand, the MS Equation (10), although less familiar outside the flame-instability community, can be solved exactly using a pole-decomposition technique [27], which transforms it into a set of ODEs with finitely many degrees of freedom. Moreover, when only a few wave modes are unstable, the MS equation admits a stable solution in the form of a giant cusp front; with a larger number of unstable modes, however, the equation becomes susceptible to noise, resulting in unstable solutions characterized by persistent small wrinkles atop the giant cusp. Additional details about the known theory can be found in references [23,28,29,30,31,32,33,34,35].
4.2. Training Dataset
Equation (8) is solved using a pseudo-spectral approach combined with a Runge–Kutta (4,5) time-integration method. All solutions are computed on a uniformly spaced 1D mesh consisting of 256 points. Training datasets are generated for a total of 15 parametric configuration tuples $(\rho, k_{c})$, formed as the Cartesian product of three values of $k_{c}$ (10, 25, and 40) and five values of $\rho$ spanning the range from 0 to 1. For each of the fifteen parametric configurations, we generate 250 sequences of short-duration solutions, as well as a single sequence of long-duration solutions. Each short solution sequence contains 500 consecutive solutions separated by a fixed output time interval, and each sequence starts from small random initial conditions sampled from a uniform distribution. The long sequence comprises 125,000 consecutive solutions outputted at the same time interval. A validation dataset is similarly created for all fifteen parameter tuples, but it contains only 10 percent of the data present in the training dataset.
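For concreteness, a minimal sketch of how one such sequence could be generated with a pseudo-spectral right-hand side and SciPy's adaptive RK45 integrator is given below; the coefficients `c2`, `c4`, `cg` (standing in for the rescaled coefficients of Equation (8)), the output interval, the initial-condition amplitude, and the $2\pi$ domain length are placeholder assumptions rather than the actual settings.

```python
import numpy as np
from scipy.integrate import solve_ivp

N, L = 256, 2 * np.pi                                  # mesh points; assumed domain length
k = 2 * np.pi * np.fft.rfftfreq(N, d=L / N)            # wave numbers 2*pi*j/L

def rhs(t, phi, c2, c4, cg):
    """Pseudo-spectral RHS of a Sivashinsky-type equation with placeholder coefficients."""
    phi_hat = np.fft.rfft(phi)
    dphi = np.fft.irfft(1j * k * phi_hat, n=N)                     # d(phi)/dx
    lin = np.fft.irfft((-c2 * k**2 - c4 * k**4 + cg * np.abs(k)) * phi_hat, n=N)
    return lin - 0.5 * dphi**2                                     # linear terms + nonlinearity

def make_sequence(c2, c4, cg, dt=0.01, n_out=500, seed=0):
    """One short training sequence: n_out snapshots separated by dt, from a small random start."""
    rng = np.random.default_rng(seed)
    phi0 = 0.01 * rng.uniform(-1.0, 1.0, N)
    t_eval = dt * np.arange(n_out + 1)
    sol = solve_ivp(rhs, (0.0, t_eval[-1]), phi0, t_eval=t_eval,
                    args=(c2, c4, cg), method="RK45", rtol=1e-8, atol=1e-10)
    return sol.y.T                                                 # shape (n_out + 1, N)
```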
4.3. Result Analysis
The training datasets described in the previous section are utilized to train parametric solution-advancement operators, denoted as $G_{\rho, k_{c}}$, with $\rho \in [0, 1]$ and $k_{c} \in \{10, 25, 40\}$. As a reminder, the ending value $\rho = 0$ enables the pure DT instability, while the other ending value $\rho = 1$ activates the pure DL instability.
In this study, three models (pFNO, pFNO*, and pCNN) described in Section 3 are employed to learn the two-parameter-dependent operator $G_{\rho, k_{c}}$. As explained in the last paragraph of Section 3, pFNO* is a simple variant of the baseline FNO method [11] that includes the parameters in the co-domain of the input function. On the other hand, pCNN has shown poor performance in learning the full operator, with a high training error exceeding 3 percent; see Table 1. Therefore, we resort to two slightly restricted models (pCNN10 and pCNN40), which learn the single-parameter ($\rho$) dependent operators $G_{\rho,\,k_{c}=10}$ and $G_{\rho,\,k_{c}=40}$, with each model trained using the one-third of the total dataset at $k_{c}$ = 10 and 40, respectively.
The learned operator at given parameters is expected to make recurrent predictions of solutions over an extended period. The training for such operators aims not only for accurate short-term predictions but also for robust predictions of long-term solutions with statistics similar to the ground truth. As demonstrated in previous studies [17,18], achieving this involves organizing the training data in 1-to-20 pairs, as expressed in Equation (4), optimized for accurately predicting 20 successive steps of outputs from a single input over a range of parameter values.
Table 1 presents the relative training/validation errors for the various models. The validation errors in Table 1 are consistent with those reported in our previous work [17,18]. Additional details on training and model hyper-parameters are provided in Appendix A.
Figure 3 compares two randomly initialized sequences of front displacements predicted by two models (pFNO* and pCNN10) against the reference solutions at varying $\rho$ and $k_{c}$ = 10. A similar comparison for pFNO* and pCNN40 at $k_{c}$ = 40 is shown in Figure 4. Additionally, Figure 5 and Figure 6 depict similar comparisons for the predicted front slope ($\partial \phi / \partial x$) at $k_{c}$ = 10 and 40, respectively.
All relevant model predictions at all fifteen parametric configurations $(\rho, k_{c})$ are compared in Figure 7 for the normalized total front length, in Figure 8 for the model errors accumulated through recurrent predictions, and in Figure 9 for the long-term auto-correlation function. This auto-correlation function characterizes the long-term recurrently predicted solutions:
$$ R(\xi) = \mathbb{E}\big[\, \phi(x, t)\, \phi(x + \xi, t) \,\big], \tag{12} $$
where $\phi$ denotes the predicted solutions obtained after a sufficiently long time. Numerical calculation of the expectation $\mathbb{E}$ in Equation (12) is implemented by averaging over seven randomly initialized sequences of model predictions for a sufficiently long time duration.
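Assuming Equation (12) denotes a spatial auto-correlation of the long-time fronts (as reconstructed above), it can be evaluated efficiently with FFTs; a minimal sketch:

```python
import numpy as np

def spatial_autocorrelation(phi_fields: np.ndarray) -> np.ndarray:
    """Average spatial auto-correlation over pooled snapshots and realizations.

    phi_fields : long-time predicted fronts, shape (n_samples, n_mesh); samples are
                 pooled over random initializations and output times.
    """
    phi = phi_fields - phi_fields.mean(axis=1, keepdims=True)    # remove the mean displacement
    phi_hat = np.fft.rfft(phi, axis=1)
    # Wiener-Khinchin: the correlation is the inverse transform of the power spectrum
    corr = np.fft.irfft(phi_hat * np.conj(phi_hat), n=phi.shape[1], axis=1).mean(axis=0)
    return corr / corr[0]                                        # normalize by the zero-shift value
```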
Moreover, for each of the learned models $\hat{G}_{\lambda,\theta}$, we compute an approximated dispersion relation
$$ \hat{\omega}(k) = \frac{1}{\Delta t}\, \ln \Big| \big( \mathcal{F}\, J\, \mathcal{F}^{-1} \big)_{kk} \Big|, \tag{13} $$
where $J$ is the operator Jacobian evaluated at the flat-front solution,
$$ J = \left. \frac{\partial\, \hat{G}_{\lambda,\theta}(u)}{\partial u} \right|_{u = 0}. \tag{14} $$
This Jacobian is computed using automatic differentiation (e.g., torch.autograd.functional.jacobian in PyTorch, under Python 3.10.4).
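A sketch of how such an estimate could be assembled is shown below; the model interface, the flat-front linearization point, and the time increment are assumptions consistent with the reconstruction in Equations (13) and (14).

```python
import numpy as np
import torch
from torch.autograd.functional import jacobian

def estimated_dispersion(model, lam, n_mesh=256, dt=0.01):
    """Estimate omega(k) by linearizing the learned one-step operator about the flat front.

    model : learned operator mapping (u, lam) -> u at t + dt    (assumed interface)
    lam   : PDE parameter tensor; dt is a placeholder time increment.
    """
    u0 = torch.zeros(n_mesh)                                     # flat-flame equilibrium
    J = jacobian(lambda u: model(u, lam), u0).detach().numpy()   # (n_mesh, n_mesh) Jacobian
    # diagonalize in Fourier space: modes are eigenfunctions of the linearized periodic operator
    F = np.fft.fft(np.eye(n_mesh), axis=0)                       # DFT matrix
    J_hat = F @ J @ np.linalg.inv(F)
    growth = np.abs(np.diag(J_hat))                              # per-mode amplification per step
    k = np.abs(np.fft.fftfreq(n_mesh, d=1.0 / n_mesh))           # integer wave numbers
    omega = np.log(growth) / dt
    return k[: n_mesh // 2], omega[: n_mesh // 2]
```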
Figure 10 compares the dispersion relations learned by all models with the reference ones. Additionally, Figure 10 shows one example of a learned operator Jacobian, which is clearly diagonally dominant.
4.4. Findings
Overall learning: Our study underscores the robust learning capabilities of the pFNO and pCNN methodologies in capturing the nuanced dynamics of flame-front evolution modulated by varying blends of the DL and DT instabilities. Both pFNO and pFNO* demonstrate good performance in learning the full two-parameter front-evolution operator, modulated by a $\rho$-varying blend of DL/DT instabilities as well as by a $k_{c}$-varying size of the largest unstable wave number. While pCNN encounters difficulty in learning the full operator, the method still performs well in learning the different instabilities when restricted to the single-parameter operators $G_{\rho,\,k_{c}=10}$ and $G_{\rho,\,k_{c}=40}$.
Short-term learning: Across the board, all learned models (pFNO, pFNO*, and pCNN10/40) demonstrate good accuracy in short-term predictions, with training/validation errors below 2 percent (Table 1) and small accumulated errors (Figure 8). This precision extends to various metrics, including front displacement, front slope, and normalized front length (Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 at early times), affirming the models' fidelity in capturing short-term dynamics. Moreover, pFNO exhibits the smallest error and is the most accurate model for learning short-term solutions.
Long-term learning: Detailed analysis of the reference solutions unveils distinct characteristics of the isolated instabilities. At $\rho = 1$, DL fronts evolve toward a single giant-cusp structure, either remaining stationary at small $k_{c}$ (Figure 3) or exhibiting noise-induced wrinkles at larger $k_{c}$ (Figure 4). Conversely, at $\rho = 0$, DT fronts adopt an overall flat shape interspersed with oscillatory wavy structures, with decreasing wavelength and amplitude as $k_{c}$ increases from 10 to 40 (Figure 3 and Figure 4). The slope plots for the DT front evolution exhibit the typical zebra-stripe pattern (Figure 5 and Figure 6). Intermediate values of $\rho$ showcase a gradual transition between these features, with the front structure blending wavy oscillations together with the cusp shape.
Long-term predictions by all learned models (pFNO*, pFNO, and pCNN10/pCNN40) accurately replicate these characteristic behaviors across diverse parametric configurations, encompassing pure DL and DT instabilities as well as blended scenarios. Quantitative comparisons through auto-correlation functions (Figure 9) and total front length (Figure 7) confirm the models' proficiency in capturing long-term solutions.
Learning challenges: However, a common challenge across both the pFNO and pCNN models lies in over-predicting the impact of noise-induced wrinkles, particularly noticeable at small $k_{c}$ (Figure 3 and Figure 5). This tendency leads to an overestimation of the total front area, especially pronounced at the lower $k_{c}$ values of 10 and 25 (Figure 7 at $\rho = 1$). When learning the hybrid DL and DT instabilities, excessive noisy wrinkles also show up in all the model predictions (at intermediate $\rho$ in Figure 5 and Figure 6); however, the issue becomes less discernible toward smaller values of $\rho$, where the DT instability plays a larger role, as is also evident from the front length in Figure 7.
Extra finding: It is particularly interesting to point out that the two models pFNO and pFNO* learn the parameter-dependent linear dispersion relations well, as seen in Figure 10. Except for a moderate level of mismatch at a few parameter conditions at $k_{c}$ = 25 and 40, pFNO and pFNO* reproduce the relations quite accurately. Such learning performance is impressive considering that the data effective for learning these linear relations (i.e., the initial near-zero solutions) constitute just a tiny portion of the total dataset. For the pCNN-based models, Figure 10 shows that pCNN10 learns the dispersion quite accurately, while the relations learned by pCNN40 show a more significant deviation than those by pFNO.
5. Summary and Conclusions
This paper delves into the potential of machine learning (ML) for understanding and predicting the behavior of flames experiencing hybrid instabilities. These instabilities arise from the interplay of two key mechanisms: the Darrieus–Landau (DL) instability, driven by density gradients across a flame, and the Diffusive–Thermal (DT) instability, caused by heat and mass diffusion disparities.
The nonlinear development of unstable flames can be modeled by a well-known partial differential equation (PDE), specifically the Sivashinsky equation. By re-expressing the Sivashinsky equation, we introduce two parameters: $\rho$ and $k_{c}$. These parameters control, respectively, the blending of the DT and DL instabilities and the cutoff wave number for unstable flame behavior.
Our learning problem focuses on the PDE solution time-advancement operator under different parameter combinations. This operator, when repeatedly applied with its input taken as the output of the previous iteration, yields a time sequence of solutions of arbitrary length. We employ two recently developed operator-learning models: parameterized Fourier Neural Operators (pFNO) and parameterized Convolutional Neural Networks (pCNN). Our findings demonstrate that both the pFNO and pCNN models effectively capture the intricate flame dynamics under varying DT/DL instabilities (arising from variations in $\rho$ and $k_{c}$). Specifically:
Short-Term Predictions: All learned models accurately predict short-term solutions and dispersion relations.
Long-Term Behavior: The models also reproduce correct statistics, quantified by autocorrelation functions and total front length.
pFNO Superiority: Notably, pFNO outperforms pCNN by allowing the learning of the full two-parameter operator, enabling variation in both $\rho$ and $k_{c}$.
Challenges: However, both pCNN and pFNO tend to overestimate noise-induced wrinkles associated with DL instability, leading to inaccurate predictions of the total flame area, especially at lower instability levels.
In conclusion, this work showcases the potential of operator-learning methodologies for analyzing complex flame dynamics arising from hybrid instabilities. While challenges persist, particularly related to noise overestimation, these methods offer assisting tools for understanding and predicting real-world flame behavior in combustion systems [36,37,38,39,40,41,42,43]. Realistic flame development can be influenced by various factors beyond the two intrinsic flame-instability mechanisms considered in this study. These additional factors include mechanisms such as thermoacoustic instabilities, Rayleigh–Taylor instabilities, and disturbances due to turbulent background flow. However, if the evolution of a realistic flame can be described by certain PDEs, it can still be viewed as a parametrized solution-advancement operator. It is crucial to emphasize the importance of obtaining high-quality training datasets on real flame evolution. Such datasets can be derived either from high-fidelity numerical simulations or from sophisticated laser-diagnostic experiments. With such data, the flame evolution could potentially be learned by the parametric operator-learning methods demonstrated in our work. Future research directions may involve incorporating additional physical mechanisms or exploring alternative learning architectures to further enhance the accuracy and robustness of these models.