Article

Application of Fast MEEMD–ConvLSTM in Sea Surface Temperature Predictions

1
State Key Laboratory of Tropical Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
2
University of Chinese Academy of Sciences, Beijing 100049, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(13), 2468; https://doi.org/10.3390/rs16132468
Submission received: 16 May 2024 / Revised: 1 July 2024 / Accepted: 1 July 2024 / Published: 5 July 2024
(This article belongs to the Special Issue Artificial Intelligence and Big Data for Oceanography)

Abstract

Sea Surface Temperature (SST) is of great importance for studying several major phenomena arising from the ocean's interactions with other earth systems. Previous SST studies based on statistical inference methods were less accurate at longer prediction lengths. A considerable number of recent studies have applied machine learning to SST modeling; these models mitigate the problem to some extent by learning SST patterns and trends. Sequence analysis by decomposition has been used for SST forecasting in several studies, and Ensemble Empirical Mode Decomposition (EEMD) has proven useful for this purpose. The application of EEMD to spatiotemporal modeling has been introduced as Multidimensional EEMD (MEEMD). The aim of this study is to employ the fast MEEMD method to decompose the SST spatiotemporal dataset and to apply a Convolutional Long Short-Term Memory (ConvLSTM)-based model to model and forecast SST. The results show that the fast MEEMD method enhances spatiotemporal SST modeling compared to the Linear Inverse Model (LIM) and a ConvLSTM model without decomposition. The model was further validated by making predictions from April to May 2023 and comparing them to original SST values; there was high consistency between predicted and real SST values.

1. Introduction

Due to the dynamic nature of the oceans, it is difficult to model and forecast ocean parameters such as Sea Surface Temperature (SST). Because of its influence on oceanography, climatology, the atmosphere, and other earth systems, SST has been declared an essential climate and ocean variable [1]. SST has a major impact on the thermal equilibrium of the ocean and atmosphere, and through the heat exchange between them, SST variation can affect the climate both regionally and globally.
Using in-situ and satellite remote sensing-based ocean observation datasets of SST and other parameters, numerous studies have been conducted to model SST and make accurate forecasts and hindcasts. These models fall into two main categories: data-driven and numerical [2,3,4]. Data-driven models, as the name suggests, are driven by past SST data alone. They are trained to recognize patterns in the SST time series and adapt to them using mathematical equations, and they have achieved significant success in recent years owing to the emergence of machine learning (ML). The use of ML algorithms, ranging from Support Vector Machines (SVM) to advanced deep learning algorithms such as Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Convolutional Neural Network (CNN), and Temporal Convolutional Network (TCN), has been increasing in recent years. However, these models can typically be deployed only over a small area, because their capacity to adapt to the larger range of SST variations found across wider domains is limited.
The use of ML in SST prediction goes back to studies of tropical SST forecasts in the Pacific Ocean using Markov models [5]. Thereafter, the use of Empirical Canonical Correlation analysis for Indian Ocean SST predictions [6], linear regression applications using Indian Ocean SST and NINO3 SST [7], the use of 12 Artificial Neural Networks (ANN) to forecast SST in the Indian Ocean [8], studies on tropical Atlantic SST forecasting using SVM [9], and the application of ensemble learning for ocean subsurface temperature prediction [10] are at the forefront of usage of ML.
With the invention of the Recurrent Neural Network (RNN)-based LSTM network in 1997 [11], the prediction capabilities of time series modeling improved considerably. The use of LSTM for SST prediction was first attempted in 2017 [12]. That work predicted SST in the Bohai Sea region from daily SST data using a fully connected LSTM (FC-LSTM) network, and the results indicated an accuracy over 0.98 and a Root Mean Square Error (RMSE) below 0.58 for 1–7-days-ahead predictions.
From that point forward, the application of LSTM for SST prediction has been increasing. A combined CNN and FC-LSTM (CFFC-LSTM) model was proposed in 2018 for spatiotemporal SST forecasting and it achieved predictions with over 0.97 accuracy and less than 0.8 RMSE values [13]. An assessment of different ML and DL algorithms with different configurations was carried out for the Mediterranean Sea and the study concluded that LSTM and GRU models performed best among the considered models [4].
Numerical models, on the other hand, simulate physical parameters and their variability to model SST. These models are often difficult to derive, and their accuracy can be low compared to that of data-driven models. However, unlike data-driven models, they can be deployed over large areas, even covering the entire global ocean; their ability to represent the variations of the underlying physical parameters directly might be the reason for this. The Intergovernmental Panel on Climate Change (IPCC) Coupled Model Inter-comparison Project (CMIP) climate models, the Hybrid Coordinate Ocean Model (HYCOM), and the Regional Ocean Modeling System (ROMS) are the most noteworthy numerical ocean models.
On a special note, a 2016 study combined neural and numerical prediction models to achieve a reported prediction accuracy of 1.0 for the Arabian Sea in 1-day- and 1-week-ahead predictions based on daily and weekly SST anomaly (SSTA) data, respectively. Although some small amount of error remains, this can be considered ideal at least in terms of the accuracy metric. The study employed a wavelet neural network, which uses a wavelet transformation to decompose the input data through high- and low-pass filtering; the decomposed datasets were then used in the neural network prediction process [14].
Jia et al. (2022) investigated the possible influence of different parameters, such as input length, prediction length, and spatial characteristics, on SST prediction accuracy. The study concludes that while increasing input length can improve model capability, there is no positive correlation between them; in contrast, prediction length and prediction error are positively correlated. Predicting extreme values proved difficult for the model, and accuracy also suffered in well-known ocean current regions, displaying the difficulty of SST prediction under such chaotic conditions [15].
Convolutional LSTM (ConvLSTM) is a deep learning algorithm developed by combining CNN and LSTM algorithms. It is capable of capturing spatiotemporal patterns and making predictions. The method was first developed for precipitation forecasting over a local region for short time periods [16]. The method was then adapted by several studies for SST modeling. Taylor and Feng adopted the ConvLSTM model to predict global monthly mean SSTA. Using a U-Net configuration of ConvLSTM, they obtained successful SSTA prediction results and concluded that the method captures key features of global SST variation [17].
To fully utilize spatial relationships between neighboring regions, Ren et al. (2024) employed a spatiotemporal U-Net (ST-UNet) model based on convolutional models and ConvLSTM models to predict SST. The authors conclude that according to the results of the study the ST-UNet model outperforms all conventional models such as CFCC-LSTM, the GRU Encoder Decoder network (GED), and the conventional U-Net model [18].
Xiao et al. (2019) applied a ConvLSTM model as the basic building block in their novel model to SST spatiotemporal satellite data from the South China Sea and the East China Sea. Their results indicate a significant increase in accuracy of the ConvLSTM model for SST predictions compared to Linear Support Vector Regression (Linear SVR), and two LSTM models with different configurations [19].
Hao et al. (2023) employed ConvLSTM and spatiotemporal ConvLSTM (ST-ConvLSTM) models derived from the ConvLSTM model [20] to investigate the SST prediction capabilities depending on the prediction length, hidden size of the layers, and the input length. According to their overall results, the ConvLSTM model seems to be less error prone than the ST-ConvLSTM model, for the most part, except for single time step prediction length [21]. It can be assumed therefore that the ConvLSTM model is the better of the two models.
Sequence decomposition can be considered a highly successful approach in time series modeling. Decomposition methods such as Empirical Wavelet Transform (EWT), Empirical Fourier Decomposition (EFD), and Empirical Mode Decomposition (EMD) are among the tested methods [22]. SST modeling and forecasting based on the EMD method has been tested in previous studies, which report that decomposition-aided SST modeling is clearly more accurate than comparable modeling without decomposition [23].
The Multidimensional EEMD (MEEMD) method is an EEMD-based method for spatiotemporal data analysis. It applies EEMD along the separate axes of the spatiotemporal data to create IMFs and residual functions [24]. While successful, the method proved computationally expensive and therefore time consuming, and the fast MEEMD method was introduced in its place. Fast MEEMD employs Principal Component Analysis (PCA) and Empirical Orthogonal Functions (EOF) to eliminate data considered unnecessary or redundant; for this reason, it is regarded as a 'lossy' data compression method, in contrast to its 'lossless' counterpart [25].
The MEEMD method has not been applied for SST modeling and forecasting in any of the previous studies, to the authors’ knowledge. The aim of this research is to fill this research gap by employing a considerably more efficient fast MEEMD method for SST modeling and forecasting in the Bay of Bengal located in the northern Indian Ocean.

2. Materials and Methods

2.1. Data Source

The National Oceanic and Atmospheric Administration (NOAA) Weekly Optimally Interpolated SST (WOISST) version 2.0 dataset will be employed for this study. This is a 0.25° gridded, weekly mean dataset available from September 1981, created by combining satellite sensor data with ship and buoy in-situ data [26]. For this study, data from September 1981 to March 2023 will be employed, yielding a 2168-week-long spatiotemporal dataset of the study area. The data can be accessed at https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.highres.html (accessed on 27 March 2023).

2.2. Study Area

The selected study area is the Bay of Bengal, located in the northern Indian Ocean, which is a politically, economically, environmentally, and scientifically significant region for the surrounding countries, including Sri Lanka, India, Bangladesh, Myanmar, and the Andaman and Nicobar Islands. SST variation-induced tropical cyclones and sea level rise have a significant impact on these countries. Heavy rains, floods, droughts, landslides, freshwater salinization, and coastal erosion are some of the major events that occur in the region due to SST variation. Figure 1 shows the selected study area.

2.3. Data Extraction

The required spatiotemporal data for the study area will be extracted from the downloaded WOISST global dataset. The data within the spatial range 5°N–20°N and 80°E–95°E will be extracted in the Python 3.11 environment and saved. The dataset will be masked for the land area using a land mask that can be acquired from NOAA.
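As an illustrative sketch of this subsetting step (using a synthetic stand-in array and the WOISST 0.25° grid conventions; the variable names are assumptions, not the study's code):

```python
import numpy as np

# Hypothetical global 0.25-degree grid (WOISST convention: latitudes
# -89.875..89.875, longitudes 0.125..359.875).
lat = np.arange(-89.875, 90, 0.25)
lon = np.arange(0.125, 360, 0.25)

# Synthetic stand-in for one weekly global SST field.
sst_global = np.random.rand(lat.size, lon.size).astype("float32")

# Boolean masks selecting the Bay of Bengal window: 5N-20N, 80E-95E.
lat_sel = (lat >= 5) & (lat <= 20)
lon_sel = (lon >= 80) & (lon <= 95)

sst_bob = sst_global[np.ix_(lat_sel, lon_sel)]
print(sst_bob.shape)  # (60, 60) pixels at 0.25-degree resolution
```

The same boolean masks can be applied to the NOAA land mask to blank out land pixels.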

2.4. Differencing, Train–Test Split and Normalizing Dataset

The spatiotemporal SST data will undergo a differencing process for 1–4 week lag values. This involves calculating the difference between the current (time step t) SST value and the lagged (time step t + lag) SST value. The following equation demonstrates this process.
Differenced SST_t = SST at time step (t + lag) − SST at time step t
The differenced dataset will be used for further analysis and will then be split into training and testing sets. The train–test split ratio will be chosen from the range 0.5–0.8 such that there is neither overfitting nor underfitting in modeling. The quality of the model's fit to the data is highly important in machine learning: a substandard fit might produce more error-prone predictions, causing the model to fail. Thus, to select the best fit, a model can be trained with different train–test split ratios and the most suitable value selected for modeling [27,28].
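The differencing and its inverse can be sketched as follows; `difference` is a hypothetical helper written for illustration, not the study's code:

```python
import numpy as np

def difference(sst, lag):
    """Return (differenced, trend) so that differenced + trend == sst[t + lag]."""
    diff = sst[lag:] - sst[:-lag]   # SST at t+lag minus SST at t
    trend = sst[:-lag]              # SST at t, kept to reverse the step later
    return diff, trend

sst = np.random.rand(10, 60, 60)    # toy (time, lat, lon) cube
diff1, trend1 = difference(sst, lag=1)
print(diff1.shape)                  # (9, 60, 60)
# Adding the trend back recovers the original series exactly.
assert np.allclose(diff1 + trend1, sst[1:])
```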
The differenced data processing, modeling, prediction, and evaluation workflow after this step is illustrated in the following figure (Figure 2) for convenience.

2.5. Fast MEEMD Decomposition

The fast MEEMD method employs PCA and EOF to remove redundant or unnecessary data from the dataset; hence, it can be considered a 'lossy' data compression method. However, compared to the original MEEMD method on which it is based, the fast MEEMD method is, as the name suggests, less time consuming and less computationally expensive [25].
The fast MEEMD variant employed here uses Locally Linear Embedding (LLE), which can be viewed as a series of Principal Components (PCs) computed locally; within each local neighborhood, the method attempts to preserve existing distances. A local tangent space alignment algorithm was used as the LLE solver [29].
The normalized dataset was decomposed using the fast MEEMD method, which divides the spatiotemporal SST data into several IMFs and a residual function. The process involves two steps. The first is the application of the LLE method with the local tangent space alignment algorithm to create the PCs necessary for decomposition. The second is applying a least squares solution algorithm to the transformed dataset from the previous step, in which least squares solutions are calculated for each pixel in the dataset. The matrix dot product of the LLE-transformed dataset and the calculated least squares solutions transforms the data back to the scaled dataset; this was used in the inverse transformation of the predictions [30].
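A minimal sketch of the compression half of this procedure, assuming the (time, 60, 60) cube has been flattened to (time, 3600); the sizes and names are illustrative:

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)
sst = rng.standard_normal((80, 60 * 60))   # toy scaled SST, (time, pixels)

# Step 1: LLE with the local tangent space alignment (LTSA) solver.
lle = LocallyLinearEmbedding(n_components=5, n_neighbors=10, method="ltsa")
components = lle.fit_transform(sst)        # (80, 5) compressed series

# Step 2: per-pixel least squares solutions mapping components to pixels.
solutions, *_ = np.linalg.lstsq(components, sst, rcond=None)  # (5, 3600)

# The dot product reverses the compression (lossy: only 5 modes survive).
reconstructed = components @ solutions
print(reconstructed.shape)                 # (80, 3600)
```

EEMD (e.g., via PyEMD) would then decompose each compressed component series into IMFs and a residual before the moving window split and ConvLSTM training.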
Each training and testing set will undergo fast MEEMD decomposition independently of the other. Once the resulting IMFs and residual functions have been created, the moving window method will be employed to further split the datasets. This is a proven method in time series analysis for improving prediction accuracy.
The EEMD step of fast MEEMD will be implemented using the PyEMD library [31] and the LLE algorithm using the Scikit-learn library [32]. The split datasets will be separated into sequence and prediction windows by the moving window method, which predicts a certain number of points in time (prediction length n) from the preceding consecutive set of known data points of a predetermined length (sequence length m). The sequence length is set to four times the prediction length, following the suggestion of [12] as justified by [3]. For example, to predict the 5th point in time, the data from the 1st to 4th points are utilized, and to predict the 6th, data from the 2nd to 5th. The moving window method is widely accepted and employed in time series modeling [33]. It is efficient, handles large datasets easily, and, most importantly, uses less memory, which is advantageous in RNN modeling.
Training and testing IMFs and residuals were split using the moving window method. This process splits an IMF into several steps of sequence length and corresponding prediction lengths. For example, a single training IMF sequence will split into two parts; train_IMF_X, and train_IMF_y, and a testing IMF will split into test_IMF_X and test_IMF_y. This is the case for residual functions as well.
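The moving window split described above can be sketched as follows, with the 4:1 sequence-to-prediction ratio; `moving_window` is an illustrative helper, not the study's code:

```python
import numpy as np

def moving_window(series, n_pred):
    """Split a series into (X, y) pairs with sequence length m = 4 * n_pred."""
    m = 4 * n_pred
    X, y = [], []
    for i in range(len(series) - m - n_pred + 1):
        X.append(series[i:i + m])               # m known points
        y.append(series[i + m:i + m + n_pred])  # next n_pred points to predict
    return np.array(X), np.array(y)

imf = np.arange(12.0)                 # toy IMF time series
X, y = moving_window(imf, n_pred=1)   # m = 4
print(X.shape, y.shape)               # (8, 4) (8, 1)
# Points 1-4 predict point 5; points 2-5 predict point 6; and so on.
```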

2.6. Convolutional LSTM Modeling

For the model, the Convolutional LSTM (ConvLSTM) model, which is highly suitable for multidimensional modeling, was used [17,34]. ConvLSTM extends the Fully Connected LSTM (FC-LSTM) algorithm by adopting a convolutional structure in the input-to-state and state-to-state transitions. Originally developed for precipitation forecasting over a selected locality and short time period, ConvLSTM models have been recognized as among the finest models for spatiotemporal big data modeling and forecasting [16]. For these reasons, ConvLSTM layers will be employed as the base layers in modeling and forecasting the multidimensional SST data.
The model's configuration is given in Table 1. The model will be implemented using the Google TensorFlow 2.13 library in the Python 3.11 environment and compiled using the 'Adam' optimizer with a learning rate of 0.001 and the 'logcosh' loss function. Previous studies comparing loss functions suggest that 'logcosh' is the function least influenced by prediction errors; it is similar to MSE but less error prone and more flexible [35]. Hence, it was employed in model training. Training ran for 200 epochs with a validation split of 0.1 and a batch size of 32.
The ConvLSTM layer is created to make forecasts for a certain cell in a grid based on the inputs and past states of each local neighbor cell adjacent to the particular cell. Similar to LSTM cells, ConvLSTM cells contain an input gate, where the data are passed into the cell; an output gate, where the processed data are transferred out; and a forget gate, where the ConvLSTM cell removes past memory of the data. In addition, ConvLSTM cells contain a convolutional layer that can process grid data instead of only single cell data as normal LSTM cells do [16].
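A hedged sketch of such a model in Keras follows; the layer sizes are illustrative and not the exact configuration of Table 1, and a small grid is used so the forward pass is quick:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(seq_len=4, height=60, width=60):
    # Stacked ConvLSTM layers feeding a Conv2D head that emits one SST field.
    model = models.Sequential([
        layers.Input(shape=(seq_len, height, width, 1)),
        layers.ConvLSTM2D(16, (3, 3), padding="same", return_sequences=True),
        layers.ConvLSTM2D(8, (3, 3), padding="same", return_sequences=False),
        layers.Conv2D(1, (3, 3), padding="same"),
    ])
    # Adam with lr = 0.001 and the log-cosh loss, as described above.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss=tf.keras.losses.LogCosh())
    return model

model = build_model(height=12, width=12)     # small grid for a quick check
x = np.random.rand(2, 4, 12, 12, 1).astype("float32")
out = model.predict(x, verbose=0)
print(out.shape)                             # (2, 12, 12, 1)
```

Training would then call `model.fit(train_IMF_X, train_IMF_y, epochs=200, validation_split=0.1, batch_size=32)` for each IMF and residual function.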
The model was trained on train_IMF_X and train_IMF_y data for each training IMF and residual function. The trained model can then be used to make predictions based on test_IMF_X data for each testing IMF and residual. Each prediction made here is comparable to test_IMF_y data. The sum of these predictions will reverse the decomposition, albeit with slight reconstruction errors.
The dot product of summed predictions and the least square solutions of the test dataset calculated earlier will reverse the data compression of the predicted dataset. The current state of SST will be added to the first time step prediction and the resulting value will be added to the next prediction. For example, in 2-weeks-ahead prediction, 1st week predictions will be added to this week’s true SST value, and the resulting value will be added to the 2nd week prediction. These can be compared against the true SST values to evaluate the model capabilities.

2.7. Performance Evaluation

The performance of the fast MEEMD–ConvLSTM model will be evaluated using the Mean Absolute Error (MAE) and RMSE accuracy metrics, which are readily available in the Scikit-learn Python library employed in this study. The following equations define the metrics.
MAE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i|
RMSE = √[ (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² ]
  • n—Total number of samples
  • y i ^ —Predicted value
  • y i —True value
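These metrics can be computed with scikit-learn as below; the numbers are a toy illustration, not results from the study:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([28.1, 28.4, 28.9, 29.2])   # toy true SST values (deg C)
y_pred = np.array([28.0, 28.6, 28.8, 29.5])   # toy predicted values

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # square root of the MSE
print(round(mae, 3), round(rmse, 3))          # 0.175 0.194
```

For gridded predictions, the (time, lat, lon) arrays are flattened before scoring.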

2.8. Linear Inverse Model (LIM)

The performance of the forecasts will be evaluated further by comparing the model capabilities against LIM. This model is a statistical linear model extensively used in climate and atmospheric sciences for time series forecasting [36,37]. In particular, this model has been used for seasonal to interannual SST and other ocean surface parameter modeling [38].
For the LIM model, the data have to be 2-dimensional; that is, the dataset should have the shape (time steps × number of features). To accommodate this requirement, the training SST dataset was reshaped from (t, 60, 60) to (t, 3600).
The LIM model can be augmented using EOF and regularization techniques. An EOF algorithm will implement a reduced space transformation to the dataset while the regularization technique attempts to regularize the linear model for generalization. The capabilities of these algorithms depend on several parameters; namely, the number of principal components, and the regularization strength. The values for these parameters can be decided according to the model prediction capabilities [31,32,33]. These two algorithms have been applied to the differenced SST dataset. Then, the transformed data can be fitted to the model.
LIM models can be configured to accommodate different time lags; for this study, the default lag of 1 was employed. The dataset will be divided into two parts, X and Y: X includes all SST data except the last time step, and Y includes all SST data except the first time step. The two datasets are then used to calculate the linear operator of the LIM by the least squares method, and the calculated operator can be used to make predictions for different prediction lengths.
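A minimal LIM sketch in the reduced (EOF/PC) space, under these lag-1 conventions; the toy principal-component series stands in for the transformed SST data:

```python
import numpy as np

rng = np.random.default_rng(1)
pcs = rng.standard_normal((100, 5))        # toy PC series, (time, modes)

X, Y = pcs[:-1], pcs[1:]                   # lag-1 pairs: Y ~ X @ G
G, *_ = np.linalg.lstsq(X, Y, rcond=None)  # (5, 5) linear operator

# k-steps-ahead forecast: repeatedly apply the operator to the last state.
state = pcs[-1]
forecasts = []
for _ in range(4):                         # 1-4 steps ahead
    state = state @ G
    forecasts.append(state)
forecasts = np.array(forecasts)
print(forecasts.shape)                     # (4, 5)
```

The forecasts would then be inverse-transformed back to the (lat, lon) grid via the EOF patterns.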
The predictions made using the linear operator will be decompressed using the inverse transformation method. The resulting array will be reshaped back to a 3-dimensional dataset. This can be directly compared against the true SST values to evaluate the prediction errors made by the LIM model.

2.9. ConvLSTM Model

The ConvLSTM model implemented at this step is similar to the model implemented with the fast MEEMD method. The difference is that here the data preprocessing includes only the lagged time series differencing step and no fast MEEMD decomposition step. Hence, the prediction results of this model can be compared against the fast MEEMD-based ConvLSTM model to evaluate the fast MEEMD method’s capabilities.

3. Results

3.1. Extracted Data

The SST spatiotemporal data of the study were extracted from the global WOISST dataset and saved as an Excel workbook, with each sheet containing one week's worth of data for the entire study area. Thus, for the entire time span of 2202 weeks, the workbook contained an equal number of sheets. A saved instance of the time series is visualized in Figure 3.

3.2. Decomposition of the Normalized Training and Testing Datasets

The extracted SST dataset was differenced to create the dataset used for further analysis, and the original SST dataset was saved as a trend dataset that can be added back to the differenced data to recover the original SST. The differenced data were split into training and testing sets with a ratio of 0.6, decided according to the training and validation loss curves observed in the experiment so as to obtain an ideal fit. The validation and training loss for split ratios of 0.6 and 0.7 are illustrated in Figure 4.
Generally, the validation loss of a trained model should not exceed the training loss; otherwise, the model is considered to be overfitting. An overfitted model fits the training data too closely, including its noise, so introducing new data results in lower accuracy and unreliable predictions, with the validation loss higher than the training loss [27].
The other case is underfitting, in which the model does not fit the training data as it should and might not fully capture the trends and patterns in the time series, lowering accuracy due to the small amount of training data it has been exposed to. Thus, an ideal fit needs to be selected [27]. For this purpose, the training and validation loss curves can be compared, as in Figure 4; in this case, the ideal fit is obtained with a train–test split ratio of 0.6.
This value is not fixed for every dataset. A ratio of 0.8 is generally considered a good value for the train–test split [39], but it is accepted that the value can range anywhere between 0.5 and 0.9. The deciding factor is that the validation loss does not exceed the training loss while a high model performance is retained [27,28].
The resulting training and testing datasets were then normalized using the standard scaling method described above. Afterwards, each normalized dataset was decomposed independently using the fast MEEMD method. The algorithm was limited to creating only five IMFs per pixel so that every pixel would have the same number of IMFs (the maximum-IMF parameter of the EEMD decomposition was set to four); otherwise, pixels would yield 4–5 IMFs each and the process would be hindered by an unequal number of IMFs. The resulting IMFs and residual functions are visualized in Figure 5.
The small number of IMFs in spatiotemporal SST data decomposition compared to only temporal SST data decomposition might be due to the need to create IMFs for an entire region, instead of one small pixel. The small number of IMFs is advantageous, because the computational time for decomposition is lessened. Consequently, the model training becomes faster.
Once the IMFs and residual functions for each training and testing dataset have been created, each training and testing IMF and residual function was divided using the moving window method. This split dataset was used for model training.

3.3. ConvLSTM Modeling

The training dataset IMFs and residual functions that were split using the moving window method were used to train the model. Once trained, the model was used to make predictions using the testing dataset IMFs and residual functions that were also split through the moving window method.
The predicted results underwent inverse transformation to convert them back to the value range of the original SST. For this, the predicted IMFs and residual data were first summed; finally, the trend data were added to produce the final prediction, which is compared against the original SST test data to evaluate the performance of the model.
Predicted SST at time step (t + lag) = Predicted differenced SST_t + SST at time step t
The model results indicate highly successful predictions for 1–4-weeks-ahead forecasts, as illustrated in Figure 6, which shows the MAE and RMSE accuracy metrics of the prediction results. As illustrated, the prediction accuracy of the fast MEEMD-based ConvLSTM model lies in the range 0.24 °C < MAE < 0.94 °C and 0.31 °C < RMSE < 1.2 °C. Consideration of spatial characteristics appears advantageous according to the results.

3.4. Performance Analysis

While the model performance is comparatively high in most of the study area, certain areas show somewhat higher errors. These areas are located nearer to the coasts, where ocean dynamics are considerably more chaotic than in deep-water areas. With shallow water depths, highly unpredictable ocean waves, river freshwater discharge, ocean currents, and other factors, SST near shallow coastal areas varies more than in the deep sea, which might be one reason the model shows a higher rate of errors there. In addition, precipitation and wind have a significant impact on SST variability in the study area [41].
There exists an abnormally high error area in the area near the Andaman and Nicobar Islands (near 10°N, 90°E). According to the study on Marine Heat Waves in the Bay of Bengal [42], a major Marine Heat Wave in May–July 2016 had been discovered in this area while observing spatiotemporal SST data.
Considering the factors that may have influenced SST prediction accuracy, it was assumed that this abnormal SST variation could have affected the prediction capability of the model. To test this, we trained and tested the model on SST data from before 2016. The results are illustrated below (Figure 7 and Figure 8) and suggest a slight effect of the MHW on the SST prediction capabilities of the model. However, the data removal at this step appears to have adversely affected the model, resulting in somewhat higher errors in other regions.

3.5. LIM and ConvLSTM Predictions

The ConvLSTM predictions made for the SST dataset in this study show comparably lower performance, with higher errors. As can be seen in the figure below, accuracy increases from the plain ConvLSTM model to the new model. However, the EOF and Lasso regression-applied LIM model outperforms the fast MEEMD–ConvLSTM model, at least for short-term predictions; for longer prediction lengths (4 weeks), the fast MEEMD–ConvLSTM model surpasses both of the other models. Figure 9 presents the mean RMSE and MAE variation of the three models for 1–4-weeks-ahead predictions.
With lower RMSE and MAE values in comparison to the ConvLSTM predictions without the fast MEEMD method, the fast MEEMD–ConvLSTM model displays a higher success rate in enhancing a model’s capabilities. However, the EOF and Lasso regression-applied LIM model performs better in short-term predictions than the fast MEEMD–ConvLSTM model.
Once the model capabilities had been sufficiently determined, the created model was applied to SST data to predict future SST beyond the dataset used in the study, using the model trained in the previous step. SST data from 26 March 2023 to 23 April 2023 were used to make predictions for the period from 23 April 2023 to 14 May 2023. The SST dataset used to make the prediction (26 March–23 April) was preprocessed through the same procedure as before; the only difference is that there is no train–test split, as this is a purely predictive process.
Once the data have been preprocessed through the fast MEEMD method, each IMF and residual function is used to make predictions using the previously trained model. Once predictions are made they are inverse transformed back to the original SST value range using the same transformative techniques discussed before. The predicted SST can then be mapped and compared with the true SST values.
As illustrated in Figure 10, the predictions made by the model for the period from 23 April to 21 May 2023 differ slightly from the true SST values observed in the study area, with the difference increasing with prediction length. Given the comparably low errors between true and predicted SST, it can be concluded that the fast MEEMD method is capable of improving a model's SST predictions.
The results of the SST image series prediction indicate that MEEMD-based models can achieve significant success in multidimensional SST time series modeling. They can therefore be recommended for data-driven spatiotemporal SST modeling and forecasting tasks. For time-sensitive tasks, fast MEEMD can be recommended; for tasks that value accuracy over speed, however, the full MEEMD method should be employed, because the lossy compression used in fast MEEMD removes data that it deems unnecessary or redundant.
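The accuracy-for-speed trade-off noted above can be checked empirically: summing the IMFs and the residue and comparing the result against the original field quantifies how much signal the decomposition (including any lossy compression) discarded. A minimal numpy sketch with hypothetical names:

```python
import numpy as np

def reconstruction_error(original, imfs, residue):
    """Largest absolute gap between the original field and the sum
    of its decomposition components (IMFs + residual function)."""
    reconstructed = np.sum(imfs, axis=0) + residue
    return np.max(np.abs(reconstructed - original))
```

For the full MEEMD this error should be near machine precision; for fast MEEMD a larger value reflects the information removed by compression.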

4. Conclusions

The fast MEEMD method has proven to be a data preprocessing method capable of enhancing model performance. As shown in this study, the predictions of the fast MEEMD-applied ConvLSTM model surpassed those of the ConvLSTM model used without it. The method can be recommended for data-driven spatiotemporal SST modeling, interpolation, and short-term extrapolation tasks, particularly time-sensitive ones. The only disadvantage of the fast MEEMD method appears to be its lossy compression, which removes data it deems unnecessary or redundant. However, this seems to have worked in favor of this particular study: the removal of chaotic SST spikes from the time series appears to have made the predictions more accurate than those of the previous models.

Author Contributions

Conceptualization, R.W.W.M.U.P.W. and G.P.; Data curation, R.W.W.M.U.P.W.; Formal analysis, R.W.W.M.U.P.W.; Funding acquisition, W.W.; Investigation, R.W.W.M.U.P.W.; Methodology, R.W.W.M.U.P.W.; Project administration, W.W.; Resources, Z.Z., W.W. and Y.L.; Software, R.W.W.M.U.P.W.; Supervision, Y.L. and G.P.; Validation, Z.Z., W.W., Y.L. and G.P.; Visualization, R.W.W.M.U.P.W.; Writing—original draft, R.W.W.M.U.P.W. and G.P.; Writing—review and editing, W.W., Y.L. and G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This study was jointly supported by the National Key R&D Program of China (2022YFE0203500), the Science and Technology Planning Project of Guangdong Province, China (2022B1212050003), the Chinese Academy of Sciences (CAS) Key Technology Talent Program of 2024, the special fund of South China Sea Institute of Oceanology of the Chinese Academy of Sciences (SCSIO2023QY01, SCSIO2023HC07), and the development fund of South China Sea Institute of Oceanology of the Chinese Academy of Sciences (SCSIO202201).

Data Availability Statement

The original data presented in the study are openly available on the National Oceanic and Atmospheric Administration (NOAA) Optimum Interpolation SST v2.0 (high resolution) website at https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.highres.html (accessed on 27 March 2023).

Acknowledgments

The authors gratefully acknowledge the support of China–Sri Lanka Joint Center for Education and Research (CSL–CER), Chinese Academy of Sciences (CAS). Finally, the authors wish to express gratitude towards the University of Chinese Academy of Sciences (UCAS) and the Alliance of International Science Organizations (ANSO).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Robles-Tamayo, C.M.; Valdez-Holguín, J.E.; García-Morales, R.; Figueroa-Preciado, G.; Herrera-Cervantes, H.; López-Martínez, J.; Enríquez-Ocaña, L.F. Sea Surface Temperature (SST) Variability of the Eastern Coastal Zone of the Gulf of California. Remote Sens. 2018, 10, 1434. [Google Scholar] [CrossRef]
  2. Patil, K.; Deo, M.C. Prediction of Daily Sea Surface Temperature Using Efficient Neural Networks. Ocean Dyn. 2017, 67, 357–368. [Google Scholar] [CrossRef]
  3. Xiao, C.; Chen, N.; Hu, C.; Wang, K.; Gong, J.; Chen, Z. Short and Mid-Term Sea Surface Temperature Prediction Using Time-Series Satellite Data and LSTM-AdaBoost Combination Approach. Remote Sens. Environ. 2019, 233, 111358. [Google Scholar] [CrossRef]
  4. Kartal, S. Assessment of the Spatiotemporal Prediction Capabilities of Machine Learning Algorithms on Sea Surface Temperature Data: A Comprehensive Study. Eng. Appl. Artif. Intell. 2023, 118, 105675. [Google Scholar] [CrossRef]
  5. Xue, Y.; Leetmaa, A. Forecasts of Tropical Pacific SST and Sea Level Using a Markov Model. Geophys. Res. Lett. 2000, 27, 2701–2704. [Google Scholar] [CrossRef]
  6. Collins, D.C.; Reason, C.J.C.; Tangang, F. Predictability of Indian Ocean Sea Surface Temperature Using Canonical Correlation Analysis. Clim. Dyn. 2004, 22, 481–497. [Google Scholar] [CrossRef]
  7. Kug, J.S.; Kang, I.S.; Lee, J.Y.; Jhun, J.G. A Statistical Approach to Indian Ocean Sea Surface Temperature Prediction Using a Dynamical ENSO Prediction. Geophys. Res. Lett. 2004, 31, L09212. [Google Scholar] [CrossRef]
  8. Tripathi, K.C.; Das, M.L.; Sahai, A.K. Predictability of Sea Surface Temperature Anomalies in the Indian Ocean Using Artificial Neural Networks. Indian J. Mar. Sci. 2006, 35, 210–220. [Google Scholar]
  9. Lins, I.D.; Araujo, M.; Moura, M.D.C.; Silva, M.A.; Droguett, E.L. Prediction of Sea Surface Temperature in the Tropical Atlantic by Support Vector Machines. Comput. Stat. Data Anal. 2013, 61, 187–198. [Google Scholar] [CrossRef]
  10. Qi, J.; Liu, C.; Chi, J.; Li, D.; Gao, L.; Yin, B. An Ensemble-Based Machine Learning Model for Estimation of Subsurface Thermal Structure in the South China Sea. Remote Sens. 2022, 14, 3207. [Google Scholar] [CrossRef]
  11. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  12. Zhang, Q.; Wang, H.; Dong, J.; Zhong, G.; Sun, X. Prediction of Sea Surface Temperature Using Long Short-Term Memory. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1745–1749. [Google Scholar] [CrossRef]
  13. Yang, Y.; Dong, J.; Sun, X.; Lima, E.; Mu, Q.; Wang, X. A CFCC-LSTM Model for Sea Surface Temperature Prediction. IEEE Geosci. Remote Sens. Lett. 2018, 15, 207–211. [Google Scholar] [CrossRef]
  14. Patil, K.; Deo, M.C.; Ravichandran, M. Prediction of Sea Surface Temperature by Combining Numerical and Neural Techniques. J. Atmos. Ocean. Technol. 2016, 33, 1715–1726. [Google Scholar] [CrossRef]
  15. Jia, X.; Ji, Q.; Han, L.; Liu, Y.; Han, G.; Lin, X. Prediction of Sea Surface Temperature in the East China Sea Based on LSTM Neural Network. Remote Sens. 2022, 14, 3300. [Google Scholar] [CrossRef]
  16. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.; Woo, W. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810. [Google Scholar]
  17. Taylor, J.; Feng, M. A Deep Learning Model for Forecasting Global Monthly Mean Sea Surface Temperature Anomalies. Front. Clim. 2022, 4, 932932. [Google Scholar] [CrossRef]
  18. Ren, J.; Wang, C.; Sun, L.; Huang, B.; Zhang, D.; Mu, J.; Wu, J. Prediction of Sea Surface Temperature Using U-Net Based Model. Remote Sens. 2024, 16, 1205. [Google Scholar] [CrossRef]
  19. Xiao, C.; Chen, N.; Hu, C.; Wang, K.; Xu, Z.; Cai, Y.; Xu, L.; Chen, Z.; Gong, J. A Spatiotemporal Deep Learning Model for Sea Surface Temperature Field Prediction Using Time-Series Satellite Data. Environ. Model. Softw. 2019, 120, 104502. [Google Scholar] [CrossRef]
  20. Wang, Y.; Wu, H.; Zhang, J.; Gao, Z.; Wang, J.; Yu, P.S.; Long, M. PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 2208–2225. [Google Scholar] [CrossRef]
  21. Hao, P.; Li, S.; Song, J.; Gao, Y. Prediction of Sea Surface Temperature in the South China Sea Based on Deep Learning. Remote Sens. 2023, 15, 1656. [Google Scholar] [CrossRef]
  22. Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.-C.; Tung, C.C.; Liu, H.H. The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Non-Stationary Time Series Analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
  23. Wu, Z.; Jiang, C.; Conde, M.; Deng, B.; Chen, J. Hybrid Improved Empirical Mode Decomposition and BP Neural Network Model for the Prediction of Sea Surface Temperature. Ocean Sci. 2019, 15, 349–360. [Google Scholar] [CrossRef]
  24. Wu, Z.; Huang, N.E. Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Adv. Adapt. Data Anal. 2009, 1, 1–41. [Google Scholar] [CrossRef]
  25. Wu, Z.; Feng, J.; Qiao, F.; Tan, Z.M. Fast Multidimensional Ensemble Empirical Mode Decomposition for the Analysis of Big Spatio-Temporal Datasets. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2016, 374, 20150197. [Google Scholar] [CrossRef]
  26. Huang, B.; Liu, C.; Banzon, V.; Freeman, E.; Graham, G.; Hankins, B.; Smith, T.; Zhang, H.-M. Improvements of the Daily Optimum Interpolation Sea Surface Temperature (DOISST) Version 2.1. J. Clim. 2021, 34, 2923–2939. [Google Scholar] [CrossRef]
  27. Géron, A. Hands-On Machine Learning with Scikit-Learn and TensorFlow, 3rd ed.; O’Reilly Media: Sebastopol, CA, USA, 2022; ISBN 9781491962299. [Google Scholar]
  28. Aliferis, C.; Simon, G. Overfitting, Underfitting and General Model Overconfidence and Under-Performance Pitfalls and Best Practices in Machine Learning and AI. In Artificial Intelligence and Machine Learning in Health Care and Medical Sciences, Health Informatics; Springer: Cham, Switzerland, 2024; pp. 477–524. ISBN 9783031393556. [Google Scholar]
  29. Zhang, Z.; Zha, H. Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment. J. Shanghai Univ. (Engl. Ed.) 2004, 8, 406–424. [Google Scholar] [CrossRef]
  30. Fast-MEEMD. Available online: https://github.com/liuquartz/Fast-MEEMD.git (accessed on 14 March 2024).
  31. Laszuk, D. Python Implementation of Empirical Mode Decomposition Algorithm; Github Repository: San Francisco, CA, USA, 2017. [Google Scholar]
  32. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  33. Vishwas, B.V.; Patel, A. Hands-on Time Series Analysis with Python; Apress: Berkeley, CA, USA, 2020; ISBN 978-1-4842-5991-7. [Google Scholar]
  34. Qiao, B.; Wu, Z.; Tang, Z.; Wu, G. Sea Surface Temperature Prediction Approach Based on 3D CNN and LSTM with Attention Mechanism. In Proceedings of the International Conference on Advanced Communication Technology, ICACT, Pyeongchang, Republic of Korea, 13–16 February 2022; pp. 342–347. [Google Scholar]
  35. Jadon, A.; Patil, A.; Jadon, S. A Comprehensive Survey of Regression Based Loss Functions for Time Series Forecasting. arXiv 2022, arXiv:2211.02989. [Google Scholar]
  36. Lou, J.; O’Kane, T.J.; Holbrook, N.J. A Linear Inverse Model of Tropical and South Pacific Climate Variability: Optimal Structure and Stochastic Forcing. J. Clim. 2021, 34, 143–155. [Google Scholar] [CrossRef]
  37. Kwasniok, F. Linear Inverse Modeling of Large-Scale Atmospheric Flow Using Optimal Mode Decomposition. J. Atmos. Sci. 2022, 79, 2181–2204. [Google Scholar] [CrossRef]
  38. Alexander, M.A.; Matrosova, L.; Penland, C.; Scott, J.D.; Chang, P. Forecasting Pacific SSTs: Linear Inverse Model Predictions of the PDO. J. Clim. 2008, 21, 385–402. [Google Scholar] [CrossRef]
  39. Joseph, V.R. Optimal Ratio for Data Splitting. Stat. Anal. Data Min. ASA Data Sci. J. 2022, 15, 531–538. [Google Scholar] [CrossRef]
  40. Wu, Z.; Huang, N.E.; Chen, X. The Multi-Dimensional Ensemble Empirical Mode Decomposition Method. Adv. Adapt. Data Anal. 2009, 1, 339–372. [Google Scholar] [CrossRef]
  41. Kumar, P.K.D.; Paul, Y.S.; Muraleedharan, K.R.; Murty, V.S.N.; Preenu, P.N. Comparison of Long-Term Variability of Sea Surface Temperature in the Arabian Sea and Bay of Bengal. Reg. Stud. Mar. Sci. 2016, 3, 67–75. [Google Scholar] [CrossRef]
  42. Kumar, S.; Chakraborty, A.; Chandrakar, R.; Kumar, A.; Sadhukhan, B.; Roy Chowdhury, R. Analysis of Marine Heatwaves over the Bay of Bengal during 1982–2021. Sci. Rep. 2023, 13, 2–15. [Google Scholar] [CrossRef]
Figure 1. Study area.
Figure 2. Normalized SST data processing workflow.
Figure 3. SST Variation at a single random instance.
Figure 4. Determining the optimal train–test split ratio: the figure shows training (blue) and validation (red) errors for two experiments on the train–test split ratio. Generally, the validation error should be lower than the training error in modeling. This allows a model to avoid overfitting to the training data and underfitting to the testing data, so that it makes more reliable predictions.
Figure 5. Fast MEEMD decomposed IMFs and residual functions visualized for the training (left) and testing (right) sets, respectively. The figure represents the SST dataset after the preprocessing steps. The sum of all the IMFs and residual functions should, in theory, reproduce exactly the normalized SST time series it was generated from. In practice, however, a small difference remains, called the reconstruction error [40]; this error is mostly negligible.
Figure 6. Fast MEEMD–ConvLSTM model prediction accuracy metrics for different prediction lengths. For 1–4-weeks-ahead predictions, MAE ranges from 0.24 to 0.94 and RMSE from 0.31 to 1.2.
Figure 7. MAE of SST predictions before 2016 shows that the prediction performance near the Andaman and Nicobar Islands (near 10°N, 90°E) is slightly better than that of predictions that include MHWs near this location. However, removing the 2016–2023 data seems to have an adverse effect on overall model performance.
Figure 8. The RMSE metric for SST predictions before 2016 indicates a lower error near the Andaman and Nicobar Islands (near 10°N, 90°E) than for predictions that include the 2016 MHW near the area. As before, removing the data for the 2016–2023 period seems to have an adverse effect on model capabilities, resulting in higher errors in some other areas.
Figure 9. Comparison of mean MAE (left) and RMSE (right) of SST predictions between the different models. The fast MEEMD-based model has lower errors than the ConvLSTM model without fast MEEMD decomposition. However, the EOF and Lasso regression-based LIM model performed better for 1–3-weeks-ahead predictions; for 4-weeks-ahead predictions, the fast MEEMD–ConvLSTM model performed better than both of the other models.
Figure 10. Predicted vs. true SST for the period from 23rd April 2023 to 21st May 2023. For all predictions, the true and predicted values show slight differences that increase with the prediction length.
Table 1. ConvLSTM model parameter setting.
Layer    | Hyperparameters                           | Output Shape
ConvLSTM | Layers: 128, kernel size: 3, dropout: 0.8 | (None, m, 20, 32)
Dense    | Layers: 1                                 | (None, 20, n)
Permute  | Layers: 1                                 | (None, 1, 20)