1. Introduction
Mesh-based simulations are central to the modeling of complex physical systems. Because such simulations are computationally expensive, a growing body of research uses mesh-based machine learning models to accelerate the numerical simulation of physical systems such as fluid dynamics [1,2,3], structural mechanics [4,5], and quantum mechanics [6,7]. AI-based prediction of physical systems analyzes past system states and trains a simulator to predict future ones. Such prediction is challenging, however, because of complex temporal and spatial dependence. Temporal dependence refers to the evolution of system states over time, characterized by periodicity and trend. Spatial dependence refers to changes in system states governed by the mesh topology, manifested as interactions of physical quantities between neighboring mesh nodes; it is tied to the discretization method (finite difference, finite volume, or finite element) used in CFD. For the mesh shown in Figure 1, we use the explicit finite difference method to solve the one-dimensional heat transfer equation, where the state of node i at moment n + 1 is governed by the states of node i and its neighboring nodes at moment n.
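For instance, with thermal diffusivity $\alpha$, time step $\Delta t$, and mesh spacing $\Delta x$, the standard explicit (forward-time, centered-space) update for the 1D heat equation reads

$$u_i^{n+1} = u_i^{n} + \frac{\alpha \Delta t}{(\Delta x)^2}\left(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\right),$$

so the new state of node i depends only on node i and its immediate neighbors at moment n. (This is the textbook FTCS scheme, stated here for illustration rather than reproduced from the paper's own equations.)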
Graph neural networks (GNNs) are widely used for predicting physical systems on unstructured meshes. These methods encode simulation states as graphs and adaptively assign computation to spatial regions with strong gradients or where high accuracy is required [3,8,9,10]. However, the majority of GNN-based prediction models for physical systems are next-step models: they forecast the next state of a physical system from its current state alone. On long-term time series forecasting (LTSF) tasks, such models inevitably suffer from severe error accumulation, often accompanied by drift that is difficult to mitigate.
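To see why, note that a next-step model deployed over a long horizon must consume its own predictions. A minimal sketch of such an autoregressive rollout (the `model` callable and `initial_state` are generic placeholders, not the authors' code):

```python
# Minimal rollout sketch: `model` is any learned next-step predictor and
# `initial_state` the ground-truth starting state (both placeholders).
def rollout(model, initial_state, num_steps):
    states = [initial_state]
    for _ in range(num_steps):
        # Beyond the first step there is no ground truth to correct against,
        # so each prediction error is fed back into the next input and compounds.
        states.append(model(states[-1]))
    return states
```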
On the other hand, some existing autoregressive prediction models have achieved great success on LTSF tasks, including recurrent neural networks (RNNs) [11], Transformers [12,13,14,15,16,17], and simple linear models [18]. These models capture trend and periodicity from historical time series. For example, cylinder flows change periodically during the vortex-shedding phase, their system states also exhibit trends over time, and the next system state is influenced by the state at the previous moment or even further back. By exploiting this temporal information, such models greatly mitigate error accumulation and better preserve phase and conserved quantities [19,20]. However, these methods consider only temporal dependence and ignore the mesh topology, leaving the predicted dynamics unconstrained by physical space and therefore unable to predict system states accurately.
In this paper, building on GNNs and linear models, we design a graph space encoder (GSE) and a time encoder (AutoLinear). The GSE aggregates information via message passing but introduces several enhancements, such as mapping messages through multiple basis functions, combining multiple heads, and utilizing multiple aggregators. AutoLinear embeds the decomposition blocks used in Autoformer as internal operators and employs linear models to extract periodic and trend information from time series. This design allows AutoLinear to progressively decompose intermediate results throughout the forecasting process, capturing future temporal information. We propose a hybrid method, named TGN, that combines GSE and AutoLinear for spatiotemporal modeling of complex physical systems on unstructured meshes. First, GSE performs multiple rounds of message passing on the input sequence of system states, aggregating local information into node representations and yielding a graph sequence with spatial characteristics. The node representations of each graph in the sequence form a latent vector that encodes the corresponding system state in a low-dimensional space, so the graph sequence becomes a latent vector sequence. AutoLinear then captures periodicity and trend in this latent vector sequence, modeling temporal dependence and predicting the next system state (a minimal sketch of this pipeline follows the contribution list below). Experiments on two fluid dynamics datasets confirm that TGN achieves state-of-the-art performance. The contributions are summarized as follows:
(1) We propose a new GNN architecture, the graph space encoder (GSE), and demonstrate that it captures the mesh topology to model spatial dependence.
(2) For long-term prediction, we use AutoLinear as a decomposition architecture, which breaks with the convention of applying series decomposition only as a preprocessing step and instead embeds it as an internal block of a linear model. AutoLinear captures periodicity and trend in the time series to mitigate error accumulation.
(3) We model the spatiotemporal dependence of physical systems by combining GSE and AutoLinear.
(4) We evaluate our method on two fluid dynamics datasets. The results show that our method outperforms the state-of-the-art MeshGraphNets [3] baseline in accuracy, and, unlike the baseline, it exhibits no strong error accumulation or drift.
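To make the pipeline described in the introduction concrete, the following is a minimal, hypothetical PyTorch sketch of the GSE-plus-AutoLinear design. All class names, shapes, and hyperparameters are illustrative assumptions rather than the released implementation; in particular, the GSE below is reduced to a single sum aggregator, whereas the paper describes multiple basis functions, heads, and aggregators:

```python
import torch
import torch.nn as nn

class GSE(nn.Module):
    """Sketch of the graph space encoder: a few rounds of message passing
    over mesh edges (one sum aggregator here for brevity), then mean
    pooling of node representations into one latent vector per graph."""
    def __init__(self, node_dim, hidden_dim, rounds=3):
        super().__init__()
        self.encode = nn.Linear(node_dim, hidden_dim)
        self.message = nn.Linear(2 * hidden_dim, hidden_dim)
        self.rounds = rounds

    def forward(self, x, edge_index):
        # x: (num_nodes, node_dim); edge_index: (2, num_edges) src/dst rows.
        h = torch.relu(self.encode(x))
        src, dst = edge_index
        for _ in range(self.rounds):
            msg = torch.relu(self.message(torch.cat([h[src], h[dst]], dim=-1)))
            h = h + torch.zeros_like(h).index_add_(0, dst, msg)  # residual update
        return h.mean(dim=0)  # low-dimensional encoding of the whole state

class AutoLinear(nn.Module):
    """Sketch of the time encoder: a moving-average decomposition block
    (as in Autoformer) embedded inside linear heads that project the
    trend and seasonal parts to the next step."""
    def __init__(self, window, kernel=5):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel, stride=1, padding=kernel // 2,
                                count_include_pad=False)
        self.trend_head = nn.Linear(window, 1)
        self.season_head = nn.Linear(window, 1)

    def forward(self, z):
        # z: (window, latent_dim) sequence of latent vectors.
        z = z.t()                                    # (latent_dim, window)
        trend = self.avg(z.unsqueeze(0)).squeeze(0)  # moving average = trend
        season = z - trend                           # residual = seasonal part
        nxt = self.trend_head(trend) + self.season_head(season)
        return nxt.squeeze(-1)                       # predicted next latent vector

class TGN(nn.Module):
    """End-to-end sketch: GSE encodes each state in the input window and
    AutoLinear predicts the next latent state."""
    def __init__(self, node_dim, hidden_dim, window):
        super().__init__()
        self.gse = GSE(node_dim, hidden_dim)
        self.temporal = AutoLinear(window)

    def forward(self, states, edge_index):
        # states: list of (num_nodes, node_dim) tensors (length = window).
        latents = torch.stack([self.gse(s, edge_index) for s in states])
        return self.temporal(latents)
```

In the actual method, the predicted latent vector would additionally be decoded back to node-level physical quantities, and the richer multi-aggregator message passing would replace the single sum used here.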
2. Related Work
The development and execution of simulations for complex physical systems can be highly time-consuming, so modeling complex physical problems with machine learning has become an important area of research. Learnable models are useful for accelerating simulations of aerodynamics [1] or turbulence [21,22] and have achieved superior performance in tasks such as weather prediction [23] and graphical visualization [24]. Several studies have introduced physical expertise by incorporating loss terms [25] or physics-informed feature normalization [2] to enhance prediction accuracy. Most of these approaches use convolutional neural networks on regular meshes, which are widely employed for learning complex physical systems. Recently, however, there has been a surge of interest in GNN-based learnable simulators, since GNNs can simulate on meshes with irregular or adaptive resolution. For instance, Belbute-Peres et al. [26] embedded an aerodynamic solver within a graph convolution architecture [27] to achieve super-resolution predictions. Alet et al. [28] employed graph element networks to make predictions on 2D grid domains, whereas MeshGraphNets [3] extended GNN-based prediction to complex 3D systems with thousands of nodes.
These methods can be categorized as either steady-state or next-step prediction models. Next-step models, however, often suffer from drift and error accumulation on LTSF tasks because they lack information about the historical time series. In contrast, sequence models can prevent error accumulation by modeling the historical series. Recently, Transformer-based solutions for LTSF tasks have proliferated and have been successfully applied to small, simple physical systems [19]. When predicting large, complex physical systems, however, their high memory overhead necessitates graph coarsening [28,29,30], which can lead to higher errors in the first few prediction steps [20]. Zeng et al. [18] introduced a set of simple single-layer linear models (LTSF-Linear) that significantly outperform complex Transformer-based LTSF models under direct multi-step (DMS) prediction [31]. However, linear models cannot directly handle unstructured data such as meshes of varying scale, and they consider only temporal features while ignoring spatial dependence, leaving the predicted dynamics unconstrained by physical space; applying linear models to complex physical systems therefore remains a challenge.
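For reference, the LTSF-Linear family of [18] reduces, in its simplest form, to a single linear map from the input window directly to all steps of the forecast horizon (DMS prediction). A sketch of that idea (our illustration, not the authors' code):

```python
import torch
import torch.nn as nn

class LinearDMS(nn.Module):
    """Single-layer linear model in the spirit of LTSF-Linear: one linear
    map from the input window straight to the whole forecast horizon,
    applied per channel (direct multi-step prediction)."""
    def __init__(self, window, horizon):
        super().__init__()
        self.proj = nn.Linear(window, horizon)

    def forward(self, x):
        # x: (batch, window, channels) -> (batch, horizon, channels)
        return self.proj(x.transpose(1, 2)).transpose(1, 2)
```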
In the field of traffic prediction, much research [32,33,34,35,36,37] makes full use of spatiotemporal dependence to solve traffic prediction problems. Because these methods rely on CNNs to model spatial dependence, they apply only to Euclidean domains, such as images or regular meshes, and cannot be extended to traffic networks with complex topology. In recent years, with the development of GNNs, a series of studies has extended traffic prediction to graph-structured data. Li et al. [38] proposed the DCRNN model, which introduces GNNs to model spatial dependence. Zhu et al. [39] proposed the A3T-GCN model to capture global temporal dynamics and spatial features; it combines gated recurrent units (GRUs) [40] with graph convolution (GCN) to model the spatiotemporal dependence of traffic flow and introduces an attention mechanism that focuses on global temporal information to improve prediction accuracy. Although these methods account for spatiotemporal dependence and have achieved great success on traffic prediction tasks, they are inherently suited to small 2D systems based on urban road networks and are limited for complex, high-dimensional physical systems.
Against this background, this paper develops a new neural network method that models the spatiotemporal dependence of physical systems and extends spatiotemporal modeling to complex physical systems with thousands of nodes.
5. Conclusions
In this paper, we propose a method based on spatiotemporal encoding, called TGN, which can accurately perform long-term prediction of complex physical systems. On the one hand, we utilize GSE to capture the graph topology, obtaining the relative position information among nodes and a low-dimensional vector representation of the whole graph; on the other hand, we utilize AutoLinear to capture the periodicity and trend of the historical time series and thus accurately predict the dynamics of the physical system. By modeling spatiotemporal dependence, our model can be successfully applied to long-term prediction of complex physical systems. Compared with next-step models, it greatly mitigates error accumulation and achieves higher prediction accuracy. Our work is not limited to the prediction of physical systems and can also be applied to other spatiotemporal tasks, such as traffic prediction. However, our method relies on training noise to augment the training data and maintain stable rollouts; this noise is difficult to tune and may limit the accuracy of the model. In the future, we will explore predictive models that are more accurate and do not require training noise. We hope that our research will inform future spatiotemporal modeling of complex systems in engineering.
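As a point of reference for the training-noise limitation above, such schemes typically perturb training inputs with small Gaussian noise so the model learns to pull slightly corrupted states back toward the true trajectory. A minimal sketch under that assumption (the scale `sigma` is a hypothetical hyperparameter; this illustrates the generic technique, not the paper's exact recipe):

```python
import torch

def noisy_input(state, sigma=0.01):
    # Zero-mean Gaussian perturbation of a training input state. `sigma`
    # must be tuned per dataset, which is the difficulty noted above.
    return state + sigma * torch.randn_like(state)
```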