1. Introduction
Sea Surface Temperature (SST) is a crucial physical and chemical indicator of seawater. It plays a significant role in the interaction between the Earth’s surface and the atmosphere, with a major impact on global ecological environments and climate [1]. Therefore, the prediction of SST has critical guiding significance for large- and medium-scale marine physical phenomena [2,3]. It also plays a fundamental role in many application fields, such as marine weather forecasting [4,5], marine activities including fisheries [6,7,8] and mining, marine environmental protection, and military operations [9,10]. Consequently, SST prediction is a critical issue in marine science and has been widely studied in recent years.
Existing SST prediction methods can be divided into three major categories: numerical forecasting methods, traditional machine learning methods, and deep learning methods. The three categories are discussed below.
Numerical forecasting methods [11] simulate marine physical processes and combine observational data to forecast SST changes. These methods have been widely applied in practical operations, and with the rapid development of assimilation techniques and model improvements, the accuracy of global and regional numerical forecasts of SST has improved to varying degrees [12]. The Coupled Model Intercomparison Project (CMIP) [13] is an international collaborative project aimed at sharing, analyzing, and comparing simulation results from the latest global climate models. These climate prediction models are based on numerical forecasting methods and improve the accuracy of SST prediction by integrating multiple mathematical models. The project has been updated to CMIP6. Peng et al. [14] studied the seasonal prediction of SST in the nearshore areas of China based on CMIP6. Research on correction methods and assimilation techniques has led to some improvements in the prediction results of numerical methods. However, due to the complexity of the marine environment, the models’ descriptions and simulations of marine physical processes are still not accurate enough, and the issues of initial field uncertainty and numerical solution errors remain unresolved. Therefore, SST prediction through numerical methods still suffers from accuracy issues.
Traditional machine learning methods [15] directly learn the rules of SST changes from massive historical databases and use them for prediction. Common machine learning methods include linear regression [16], Support Vector Machine (SVM) models [17], and Artificial Neural Networks (ANNs) [18]. For example, Kug [16] established a linear regression model based on the lag relationship between the SST in the Indian Ocean and NINO3 SST. Lins [19] proposed a tropical Atlantic SST prediction method combining SVM models using data provided by the PIRATA project buoys. Tangang [18] used an ANN model to predict seasonal SST changes in a selected region of the tropical Pacific. Wu [20] utilized an ANN to predict the five principal components of SST in the tropical Pacific using sea level pressure and SST anomalies as inputs. Garcia-Gorriz [21] analyzed the ability of neural networks to estimate seasonal and interannual SSTs in the western Mediterranean from 1960 to 2005 using monthly averages of meteorological parameters such as mean sea level pressure, wind, temperature, and cloud cover. Aparna [22] developed an ANN model for predicting SST and SST fronts in the north-eastern Arabian Sea. Although traditional machine learning methods have achieved certain results in SST prediction, they are limited in computing power and model refinement when facing increasingly refined SST data, which makes it difficult for them to fully learn deep features from large amounts of SST data.
Deep learning methods use deep neural networks to model and predict SST, which can more effectively process massive amounts of data. Compared to traditional machine learning methods, deep learning models have powerful nonlinear fitting capabilities and can learn deep features from the data. Currently, deep learning methods such as Convolutional Neural Networks (CNN) [23,24], Long Short-Term Memory (LSTM) [25,26], and Graph Neural Networks (GNN) [27,28] have been widely applied to SST prediction, achieving excellent predictive performance. Zhang [25] first treated SST prediction as a time series regression problem, using a fully connected Long Short-Term Memory (FC-LSTM) to model the sequential relationship and predict SST. Yang [29] proposed a CFCC-LSTM model consisting of FC-LSTM layers and convolutional layers. The model takes the temporal and spatial information of SST data as a set of three-dimensional grids, with the FC-LSTM layer extracting temporal features and the convolutional layer extracting spatial features to produce the prediction. The outstanding performance of the ConvLSTM network in precipitation forecasting has attracted the attention of researchers [30]. Xiao [31] conducted a joint forecast experiment for the future 10 days of SST in the East China Sea using the ConvLSTM network. Zha [32] introduced a multi-granularity spatiotemporal network (MGSN), constructing a multi-branch network to extract SST time features at different granularities. This model has richer time granularity features, can simulate more complex SST changes, and considers the influence of other locations in the spatial domain, thereby improving the prediction accuracy to some extent. As the number of prediction steps increases, the prediction errors also increase. Qiao [33] introduced an attention mechanism to address this issue. The attention mechanism can assign different weights to different parts of the model, allowing the model to focus more on task-relevant parts, thereby improving the model’s quality. Xie [34] proposed a Gated Recurrent Unit (GRU) encoder-decoder (GED) that implements a dynamic influence link (DIL) between historical and future SST values. Xin [35] used a three-channel convolutional LSTM for real-time SST forecasting. Due to the irregular shape of the ocean, effective SST data are lacking in certain areas such as land and islands. In such cases, regular grid networks such as CNNs struggle to fully encode the spatial variations of SST.
In recent years, Graph Neural Networks (GNNs) [36] have rapidly developed and have been applied in various scenarios of deep learning. GNNs can handle graph-structured data representing features in Euclidean space as irregular networks. Zhang [26] introduced the Memory Graph Convolutional Network (MGCN) to address the inability of regular grid neural networks to fully encode SST. Liang [37] proposed an SST prediction method based on the Graph Memory Neural Network (GMNN), using a graph to adequately represent the spatial information of incomplete areas in SST data. This model uses distance thresholds and Pearson correlation coefficients to establish the graph representation of SST in order to fully express the spatial information of irregular areas. By applying edge construction methods to each node, node and edge representations of SST data are obtained, and edge information is updated through edge updates and edge aggregation functions. Although the methods based on GNNs have solved the problem of difficult encoding in irregular regions compared to traditional deep learning methods, there are still some issues that need to be addressed.
The above methods view the marine environment as a uniformly changing field and use a uniform grid to model and predict SST. However, the marine environment is a complex system, influenced by various meteorological and hydrological elements such as solar radiation, atmospheric temperature, ocean currents, and wind, resulting in a highly uneven spatial distribution of SSTs. Firstly, due to the differences in the thermal properties of land and sea, the middle of the ocean usually contains areas where SST is the same or changes very little over tens of square kilometers, whereas closer to the edge of the ocean the gradient of SST becomes larger, especially in coastal areas, where the spatial variations in SST are very evident. Secondly, influenced by the variation of solar radiation with latitude, the spatial distribution of SST is relatively uniform in the low-latitude region of 0–20°, while in the mid-to-high latitude region, especially in the 30°–50° region, the spatial variation of SST is more drastic. A uniform grid cannot adapt to this complexity in the spatial distribution of SST. Grids that are too sparse limit the prediction accuracy in regions with large spatial variations in SST, while grids that are too dense waste computation in regions with a relatively uniform spatial distribution of SST.
To address the above issues, we propose a Non-uniform Grid Graph Convolutional Network (NGGCN) for the precise prediction of SST. The NGGCN first calculates the spatial gradient of SST in the sea area and designs a threshold weight function based on the gradient. We first established a uniform grid, and then decided whether to select or discard the data points adjacent to each node in the original data field according to the value of the spatial gradient at each node. Then we subdivided and merged the grid to finally obtain a non-uniform grid based on the spatial gradient. Considering the irregularity of SST data distribution, in order to fully encode the spatial variation of SST, we used a GNN to model SST. We converted the SST data into a graph representation based on the generated non-uniform grid and obtained spatial correlations through graph convolution. Then, the output of the graph convolution was constructed into a time series input to the Gated Recurrent Unit (GRU) to obtain temporal correlation, and the final prediction result was output to achieve the effect of fine-grained prediction of SST.
In summary, we have made the following contributions in this work:
We identified the problem that the uneven spatial distribution of SST poses for prediction, and proposed the NGGCN method to achieve precise prediction of SST.
We designed a threshold weight function based on the spatial gradient of SSTs within the region, generated a non-uniform grid topology for the current region, and captured the spatial correlation of SSTs through graph convolution. We designed a time encoder GRU to capture the trend of SST over time.
We conducted extensive experiments on SST datasets in representative regions, and the results demonstrated the effectiveness and superiority of our model in predicting SSTs in representative regions.
The rest of this paper is organized as follows. Section 2 describes the technical details of our method. Section 3 presents the experimental part. Section 4 provides the results of validation and evaluation. Finally, Section 5 concludes this paper.
2. Methods and Materials
2.1. Problem Statement
For SST prediction, we typically divide the study area into grids based on longitude and latitude, with the data at grid points representing SST. Assuming there are $m$ grids along the latitude and $n$ grids along the longitude, we obtain a total of $m \times n$ grid regions. For example, Figure 1 shows the grid division of a portion of the Northwest Pacific. The target area is within [120°E–130°E, 30°N–40°N], divided into 100 grids of size 1° × 1° each. For each time interval $t$, the SST data of all grid points in the entire area form a matrix $X_t$. Equation (1) shows the principle of SST sequence prediction, where $F$ represents a prediction model. It involves using SST data from the previous $k$ days, $X_{t-k+1}$, $X_{t-k+2}$, …, $X_t$, to predict the SST of the following $q$ days, $X_{t+1}$, $X_{t+2}$, …, $X_{t+q}$.
$$[X_{t+1}, X_{t+2}, \ldots, X_{t+q}] = F\left([X_{t-k+1}, X_{t-k+2}, \ldots, X_t]\right) \quad (1)$$
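The sequence-prediction setup above can be illustrated with a sliding window over a gridded SST series. This is a minimal sketch rather than the authors' data pipeline: the function name `make_windows`, the parameter names `history` and `horizon`, and the random toy data are all assumptions made for illustration.

```python
import numpy as np

def make_windows(sst, history, horizon):
    """Split a (time, lat, lon) SST array into (input, target) pairs.

    inputs[i]  holds days i .. i + history - 1
    targets[i] holds days i + history .. i + history + horizon - 1
    """
    n_samples = sst.shape[0] - history - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    inputs = np.stack([sst[i:i + history] for i in range(n_samples)])
    targets = np.stack([sst[i + history:i + history + horizon]
                        for i in range(n_samples)])
    return inputs, targets

# Toy data (hypothetical): 30 days on a 10 x 10 grid, as in the Figure 1 example.
sst = np.random.default_rng(0).normal(20.0, 2.0, size=(30, 10, 10))
inputs, targets = make_windows(sst, history=7, horizon=3)
print(inputs.shape, targets.shape)
```

Each input sample is a stack of past SST matrices and each target is the stack of matrices that immediately follows it, which is the pairing a prediction model would be trained on.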
In most existing SST prediction studies, a single uniform grid is used to predict SST over the whole study area. In reality, even within the same area, factors such as ocean currents and differences in ocean–land thermal properties lead to an uneven spatial distribution of SST, with significant variations in SST spatial gradients within the region. Therefore, using a single uniform grid to predict SST within a study area has limitations and places high demands on grid resolution.
2.2. Framework of NGGCN
Figure 2 illustrates the framework of the NGGCN model, which consists of two spatiotemporal modules. Each spatiotemporal module comprises a GCN module, an FC module, and a GRU module. The GCN and GRU modules are employed for extracting spatial and temporal features, respectively, while the FC module is used for feature decoding and prediction result output. First, the SST grid data are converted into a graph representation and constructed into a time series input to the GCN module. After the graph convolution operation, the extracted spatial feature matrix is obtained. This matrix is then restored to the features of each node by the FC module. The graph sequence with the extracted features is subsequently input to the GRU module to extract temporal features, which are then restored again through the FC module. Following the first spatiotemporal module, the node information is updated and then input into the second spatiotemporal module for a second round of feature extraction, ultimately leading to the output of the prediction results.
2.3. Spatial-Gradient-Based Non-Uniform Grid Construction
Due to the influence of solar radiation and the thermal property differences between land and sea, the spatial distribution of SST is uneven. In low-latitude areas and the central ocean, SST typically changes more uniformly, whereas in mid-to-high latitude areas and coastal regions, the changes are more dramatic.
First, we calculated the annual average gradient of SSTs in the Northwest Pacific, as shown in Figure 3. The numbers in the figure represent the annual average gradient of SST within each 10° × 10° grid area. Figure 4 shows the visualization results of the annual average gradient of SSTs in the Northwest Pacific Ocean. It can be observed that in the low-latitude area of 0–20°, the spatial gradient of SSTs is relatively small, indicating stable SST variations in this region, which allows for the construction of a relatively sparse grid. As the latitude increases, especially in the 30°–50° region, the spatial gradient of SSTs significantly increases. The redder areas in Figure 4 represent regions with larger SST spatial gradients, indicating more dramatic changes in SSTs and therefore necessitating the construction of denser grids. Therefore, we constructed a non-uniform grid based on the calculated annual average gradient of SSTs. The core concept is to decide whether to create nodes based on the gradient size. The specific scheme is as follows.
The first step is the refinement of the uniform grid. We start from a grid with an initial resolution of 0.25°, whose points are regarded as the uniform grid points, and design a function based on the calculated gradient. Since most areas with uniform SST changes have gradients below 0.1, and areas with dramatic SST changes have gradients above 0.2, we use 0.1 and 0.2 as the boundary points of the gradient function. For gradients between 0 and 0.1, indicating smooth SST changes, only the node itself is considered. For gradients between 0.1 and 0.2, indicating relatively dramatic SST changes, the nearest 8 points with a resolution of 0.15° are added. For gradients greater than 0.2, indicating very dramatic SST changes, the nearest 8 points with a resolution of 0.05° are added on top of the 0.15° points. The second step is the merging of the uniform grid: if no fine grid points have been generated around the current uniform grid point, the uniform grid point is merged, and the current uniform grid point is retained. Together, the two steps yield the full set of non-uniform grid points.
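The refinement rule can be sketched in a few lines of NumPy. This is an illustrative reading of the scheme, not the authors' code. The helper names (`mean_gradient`, `surrounding`, `refine`) are hypothetical. The gradient is assumed to be the time-averaged magnitude of finite differences per degree. A gradient exactly equal to a boundary value is assumed to fall into the 0.1–0.2 band. NaN is assumed to mark land or missing data. The merging step is not covered.

```python
import numpy as np

LOW, HIGH = 0.1, 0.2               # gradient boundary points given in the text
MID_STEP, FINE_STEP = 0.15, 0.05   # spacings of the added points, in degrees

def mean_gradient(sst, spacing=0.25):
    """Time-averaged spatial gradient magnitude of a (time, lat, lon) array."""
    d_lat, d_lon = np.gradient(sst, spacing, axis=(1, 2))
    return np.hypot(d_lat, d_lon).mean(axis=0)

def surrounding(lat, lon, step):
    """The 8 points surrounding (lat, lon) at the given spacing."""
    return [(lat + a * step, lon + b * step)
            for a in (-1, 0, 1) for b in (-1, 0, 1) if (a, b) != (0, 0)]

def refine(lats, lons, grad):
    """Return (lat, lon, kind) nodes chosen from the gradient at each grid point."""
    nodes = []
    for i, lat in enumerate(lats):
        for j, lon in enumerate(lons):
            g = grad[i, j]
            if np.isnan(g):        # assumed marker for land or missing data
                continue
            nodes.append((lat, lon, "uniform"))
            if g >= LOW:           # relatively dramatic change: add 0.15 degree points
                nodes += [(p, q, "fine") for p, q in surrounding(lat, lon, MID_STEP)]
            if g > HIGH:           # very dramatic change: also add 0.05 degree points
                nodes += [(p, q, "fine") for p, q in surrounding(lat, lon, FINE_STEP)]
    return nodes

# Toy 2 x 2 patch (hypothetical values): smooth, moderate, steep, and land.
lats = np.array([30.0, 30.25])
lons = np.array([120.0, 120.25])
grad = np.array([[0.05, 0.15], [0.25, np.nan]])
nodes = refine(lats, lons, grad)
print(len(nodes))
```

In the toy patch, the smooth point contributes only itself, the moderate point contributes itself plus 8 neighbours, and the steep point contributes itself plus 16 neighbours, so node density follows the local gradient.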
For edge construction, we employed a distance threshold-based method. After generating non-uniform grid nodes, the effective SST data we obtained are stored in a matrix of the same size as our original data, categorized into coarse grid point data, uniform grid point data, and fine grid point data. Coarse grid point data are the merged data. Uniform grid point data are the initial 0.25° resolution data, which are also used for comparative experiments. Fine grid point data are the refined data. Our strategy is to use full connectivity within the same category, connecting each point to nearby points, while using distance threshold-based connections between different category data. Additionally, we introduce two new concepts: direct connection and indirect connection. A direct connection refers to a situation where no other data, such as SST data or land data, exist between two valid data points within the spatial range, allowing for an effective edge between two SST data points. An indirect connection refers to the presence of other data between two valid sea temperature data points within the spatial range, implying no strong correlation, and requiring consideration of the distance between them to decide whether to create an edge based on the distance threshold.
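The distance-threshold part of the edge construction can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `threshold_edges` and the threshold value are hypothetical, distances are planar distances in degrees rather than great-circle distances, and the direct/indirect distinction and the per-category connectivity rules described above are not implemented.

```python
import numpy as np

def threshold_edges(coords, threshold):
    """Adjacency matrix linking nodes whose distance is at most `threshold` (degrees)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise coordinate differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))         # pairwise planar distances
    adj = (dist <= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                       # no self-loops at this stage
    return adj

# Toy nodes (hypothetical): three nearby points and one distant point.
coords = [(30.0, 120.0), (30.0, 120.25), (30.15, 120.15), (31.0, 121.0)]
adj = threshold_edges(coords, threshold=0.3)
print(adj)
```

The three nearby nodes end up mutually connected while the distant node stays isolated, which is the behaviour a distance threshold is meant to produce for weakly related points.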
Figure 5 shows the construction of uniform and non-uniform grids within the same area. Figure 6 and Figure 7 show, respectively, the annual average gradient of SST in the study area and the non-uniform grid construction based on the gradient. It can be observed that the regions with dense node distribution in Figure 7 correspond to the high-gradient red bands in Figure 6, which supports the rationality of our non-uniform grid construction method.
2.4. Spatiotemporal Module
The spatiotemporal module consists of GCN modules, FC modules, and GRU modules. The GCN module performs graph convolution operations to extract spatial distribution features of SST, the FC module restores the features extracted by the GCN module to each node, and the GRU module extracts temporal features from the graph sequence after spatial feature extraction.
Figure 8 illustrates the structure of the GCN module. The core of the GCN module lies in the graph convolution operation, which aggregates and transforms information from each node and its neighboring nodes in the graph structure to update node information and extract features. In this study, the number of features is 1, representing the SST data. Given an undirected graph $G$ with $N$ nodes, the feature vector $X$ is of size $N \times 1$. We use spectral graph convolution to represent the spatial variation of SST, with the normalized graph Laplacian matrix $L$ defined as follows:
$$L = I - D^{-1/2} A D^{-1/2} \quad (2)$$
where $I$ is the identity matrix, $A$ is the adjacency matrix of graph $G$, and $D$ is the degree matrix, with each diagonal element $D_{ii}$ representing the degree of a node $i$, i.e., the number of nodes connected to node $i$. Considering the influence of the node’s own information, we introduce the self-loop and normalized Laplacian matrix $\tilde{L}$:
$$\tilde{L} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} \quad (3)$$
where $\tilde{A} = A + I$ is the adjacency matrix with added self-loops, and $\tilde{D}$ is the degree matrix with added self-loops, calculated from $\tilde{A}$. Since each node has an added edge to itself, the degree of each node increases by 1 accordingly, i.e., $\tilde{D}_{ii} = D_{ii} + 1$.
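The self-loop normalization can be checked numerically on a small graph. The sketch below assumes the symmetric form commonly used in the GCN literature (inverse square root of the self-loop degree matrix, times the self-loop adjacency matrix, times the same inverse square root). The function name `normalized_with_self_loops` and the three-node path graph are hypothetical.

```python
import numpy as np

def normalized_with_self_loops(adj):
    """Symmetrically normalize (A + I) by the self-loop degree matrix."""
    adj = np.asarray(adj, dtype=float)
    a_tilde = adj + np.eye(adj.shape[0])     # add a self-loop to every node
    deg = a_tilde.sum(axis=1)                # each degree grows by exactly 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Path graph on three nodes: 0 - 1 - 2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
l_tilde = normalized_with_self_loops(adj)
print(np.round(l_tilde, 3))
```

Because every node has at least its own self-loop, no degree is zero and the inverse square root is always defined, which is one practical reason for adding self-loops before normalizing.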
Given the feature matrix $H^{(l)}$ at layer $l$, a single graph convolution operation is represented as:
$$H^{(l+1)} = \sigma\left(\tilde{L} H^{(l)} W^{(l)} + b^{(l)}\right) \quad (4)$$
where $W^{(l)}$ is the weight matrix and $b^{(l)}$ is the bias, both of which are trainable parameters, and $\sigma$ is the activation function. Since we are dealing with time series data, we not only consider the spatial structure of the graph, i.e., the relationships between nodes, but also need to process the node features of each time step. Therefore, we adopt a GCN module that integrates time series data [38]. Considering the node features and the hidden state of each time step, the combined graph convolution operation is specifically expressed as follows:
$$f(X, H) = \tilde{L}\,[X, H]\,W + b \quad (5)$$
where $X$ represents the input feature matrix, $H$ represents the hidden state, and $[X, H]$ combines the two along a specific dimension. $f(X, H)$ represents the output of the graph convolution operation, encompassing the comprehensive features of graph structure information and time series dynamic information.
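The combined graph convolution over node features and hidden state can be sketched as a single matrix expression. This is a minimal illustration, not the authors' implementation: the function name `graph_conv`, the choice to concatenate along the feature axis, and the omission of an activation (left to the caller) are assumptions.

```python
import numpy as np

def graph_conv(l_tilde, x, h, weight, bias):
    """Graph convolution over the concatenation of input features and hidden state.

    l_tilde: (N, N) normalized matrix with self-loops
    x:       (N, F) node features at one time step
    h:       (N, Hd) hidden state
    weight:  (F + Hd, out) trainable weights; bias: (out,)
    """
    xh = np.concatenate([x, h], axis=1)   # combine along the feature dimension
    return l_tilde @ xh @ weight + bias   # aggregate neighbours, then transform

# Toy check (hypothetical values): 3 nodes, 1 SST feature, hidden size 2.
l_tilde = np.eye(3)
x = np.ones((3, 1))
h = np.zeros((3, 2))
weight = np.ones((3, 2))
bias = np.zeros(2)
out = graph_conv(l_tilde, x, h, weight, bias)
print(out)
```

Multiplying by the normalized matrix mixes each node with its neighbours, and the weight matrix then maps the concatenated features to the output size, so one call carries both spatial and sequence information.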
The first FC module is used to map the extracted spatial features back to each node. The restored features are then used to extract temporal features. The second FC module similarly restores features. In the first spatiotemporal module, the features are used to update nodes for input into the next spatiotemporal module. In the second spatiotemporal module, the features are used to output prediction results.
Figure 9 illustrates the structure of the GRU module. In the GRU module, we combine the graph convolution operation and the gated recurrent unit (GRU) [39]. The update mechanism of the GRU is adapted to handle graph-structured data, processing data related to each node through graph convolution operations. The operation at each time step $t$ can be summarized as follows:
$$r_t = \sigma\left(\tilde{L}\,[X_t, h_{t-1}]\,W_r + b_r\right) \quad (6)$$
$$u_t = \sigma\left(\tilde{L}\,[X_t, h_{t-1}]\,W_u + b_u\right) \quad (7)$$
$$c_t = \tanh\left(\tilde{L}\,[X_t, r_t \odot h_{t-1}]\,W_c + b_c\right) \quad (8)$$
$$h_t = u_t \odot h_{t-1} + (1 - u_t) \odot c_t \quad (9)$$
where $X_t$ represents the feature vector matrix of time step $t$, $h_{t-1}$ represents the hidden state of the previous time step $t-1$, the sigmoid activation function $\sigma$ is used to ensure that the gate signal is within the range of [0, 1], $c_t$ represents the candidate hidden state, used to update the current hidden state, $\tanh$ represents the hyperbolic tangent activation function, $h_t$ represents the final hidden state, $\odot$ represents the Hadamard product, $W$ and $b$ represent the weight matrix and bias term, with subscripts $r$, $u$, and $c$ representing the reset gate, update gate, and candidate hidden state, respectively, and $\tilde{L}$ represents the normalized Laplacian matrix with added self-loop, used for the graph convolution operation. The graph convolution operation is applied to the inputs $X_t$ and $h_{t-1}$, which allows information to propagate in the spatial dimension of the graph. In this way, the GRU module can consider the graph-structured neighborhood information of nodes while updating their state. This design allows the GRU module to simultaneously capture the spatial relationships of graph data and the temporal dependencies of time series data, providing an effective modeling method for graph-structured time series data.
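One time step of a GRU whose gates are computed through graph convolution can be sketched as below. This follows a common formulation of graph-convolutional GRUs and is an assumption about the exact gate equations, not a transcription of the authors' model. The names `gcn_gru_step` and `params`, the hidden size, and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gcn_gru_step(l_tilde, x_t, h_prev, params):
    """One GRU update in which every gate sees graph-convolved [x_t, h]."""
    def conv(h, w, b):
        return l_tilde @ np.concatenate([x_t, h], axis=1) @ w + b

    r = sigmoid(conv(h_prev, params["W_r"], params["b_r"]))      # reset gate
    u = sigmoid(conv(h_prev, params["W_u"], params["b_u"]))      # update gate
    c = np.tanh(conv(r * h_prev, params["W_c"], params["b_c"]))  # candidate state
    return u * h_prev + (1.0 - u) * c                            # new hidden state

# Toy run (hypothetical sizes): 4 fully connected nodes, 1 feature, hidden size 8.
rng = np.random.default_rng(0)
n_nodes, n_feat, n_hidden = 4, 1, 8
l_tilde = np.full((n_nodes, n_nodes), 1.0 / n_nodes)  # normalized all-ones graph with self-loops
params = {}
for gate in ("r", "u", "c"):
    params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(n_feat + n_hidden, n_hidden))
    params[f"b_{gate}"] = np.zeros(n_hidden)

h = np.zeros((n_nodes, n_hidden))
for x_t in rng.normal(20.0, 1.0, size=(5, n_nodes, n_feat)):  # 5 time steps
    h = gcn_gru_step(l_tilde, x_t, h, params)
print(h.shape)
```

Because the same normalized matrix multiplies the input of every gate, a node's hidden state at each step depends on its neighbours' features and states as well as its own history.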
We designed two spatiotemporal modules in the experiment to perform two-layer feature extraction. Finally, we output the sequence of extracted spatial and temporal features through an FC layer to obtain the final prediction result.
2.5. Output and Loss Function
In this paper, the feature vector of the data is one-dimensional, which is the SST data. Therefore, the second FC layer of the second spatiotemporal module serves as the output layer of the model. After the feature restoration through this FC layer, the updated node information from the two spatiotemporal modules is mapped back to each corresponding node, resulting in the final output result.
We use Mean Squared Error (MSE) as the loss function for this study, as shown in Equation (10):
$$\mathrm{MSE} = \frac{1}{T} \sum_{t=1}^{T} \left(y_t - \hat{y}_t\right)^2 \quad (10)$$
where $T$ represents the total prediction steps, $t$ represents each time step, $y_t$ represents the true value at time $t$, and $\hat{y}_t$ represents the predicted value at time $t$.
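As a small worked example, the loss can be computed directly with NumPy. The function name `mse_loss` and the sample values are hypothetical, and averaging over every element (all steps and all nodes) is an assumption about how the loss is reduced.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean of the squared differences between true and predicted SST."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Three prediction steps at one node (hypothetical values, in degrees Celsius).
y_true = [20.0, 21.0, 22.0]
y_pred = [20.5, 21.0, 21.0]
loss = mse_loss(y_true, y_pred)
print(loss)
```

The squared errors here are 0.25, 0 and 1, so the single 1 °C miss dominates the average, which illustrates how MSE penalizes large errors more heavily than small ones.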
5. Conclusions
In this paper, we propose an NGGCN model for sequence prediction of SST. The core idea of this model is a non-uniform grid construction method driven by spatial gradients. We first compute the annual mean spatial gradient of SST within the region and design a threshold function based on the computed gradient values. When converting SST data into a graph representation, we first create a sparse and uniform grid. We then decide whether to refine or merge this initial grid according to the spatial gradient and the designed threshold function, which completes the creation of nodes. For the construction of edges, we propose the concepts of direct connection and indirect connection. For directly connected points, we construct edges directly, while for indirectly connected points, we adopt a distance threshold-based method to construct edges. Together, the construction of nodes and edges forms a non-uniform grid construction method based on spatial gradients. We also propose a spatiotemporal module for sequence prediction, which consists of a GCN module, an FC module, and a GRU module. First, the SST data are converted into a graph representation and constructed into a sequence. The sequence is input into the GCN module to extract spatial features, which are then restored to each node through an FC layer. Finally, the temporal features are extracted through the GRU module. In the spatiotemporal module, the GCN module incorporates sequence information, while the GRU module incorporates the graph convolution process. Therefore, each module extracts both spatial and temporal features. To verify the effectiveness of the NGGCN, we compared it with three advanced baseline models: GCN, GRU, and GCN-LSTM.
We selected the representative Yellow Sea and Bohai Sea regions for the experiment. This region is located in the mid-latitude area and is far from the ocean center, where SST changes are relatively dramatic, and selecting this sea area can better verify the effectiveness of the non-uniform grid construction method based on spatial gradient. We conducted experiments on daily, weekly, and monthly scales, and the results show that our NGGCN model improves in all evaluation metrics compared to the three baseline models.
To validate the effectiveness of our proposed non-uniform grid construction method, we ran non-uniform grid data on all baseline models and compared the results with uniform grid data. The experimental results indicate that compared with uniform grid data, all evaluation indicators of non-uniform grid data have been improved, demonstrating the effectiveness of the non-uniform grid construction method.
In addition, we found that the model’s improvement in SST prediction results is more significant on the daily scale. As the time scale and the number of prediction steps increase, especially on the monthly scale, the improvement becomes less pronounced. We attribute this to the obvious seasonal and periodic changes in SST. Moreover, as the number of prediction steps increases, the model’s ability to capture the complex temporal dependencies and nonlinear relationships in the data becomes insufficient. The SST gradients also show significant spatial differences across seasons, whereas our work mainly focused on the overall spatial features of SST at different time scales. We will study the seasonal variation of SST in future work. Meanwhile, SST is influenced by various factors such as solar radiation, atmospheric temperature, and ocean current advection, while our paper only makes predictions based on SST data, which is a limitation. In the future, we will incorporate more meteorological and oceanic elements into the model as additional input features to further enhance SST prediction performance.