Machine Learning-Based Forecasting of Metocean Data for Offshore Engineering Applications

Barooni, Mohammad; Ghaderpour Taleghani, Shiva; Bahrami, Masoumeh; Sedigh, Parviz; Velioglu Sogut, Deniz

doi:10.3390/atmos15060640


by Mohammad Barooni ¹, Shiva Ghaderpour Taleghani ², Masoumeh Bahrami ³, Parviz Sedigh ⁴ and Deniz Velioglu Sogut ¹,*

¹ Ocean Engineering and Marine Sciences, Florida Institute of Technology, Melbourne, FL 32901, USA

² School of Arts and Communication, Florida Institute of Technology, Melbourne, FL 32901, USA

³ Electrical and Computer Engineering, University of New Hampshire, Durham, NH 03824, USA

⁴ Mechanical Engineering, University of New Hampshire, Durham, NH 03824, USA

* Author to whom correspondence should be addressed.

Atmosphere 2024, 15(6), 640; https://doi.org/10.3390/atmos15060640

Submission received: 25 April 2024 / Revised: 22 May 2024 / Accepted: 23 May 2024 / Published: 26 May 2024

(This article belongs to the Special Issue High-Performance Computing for Atmospheric Modeling)


Abstract:

The advancement towards utilizing renewable energy sources is crucial for mitigating environmental issues such as air pollution and climate change. Offshore wind turbines, particularly floating offshore wind turbines (FOWTs), are developed to harness the stronger, steadier winds available over deep waters. Accurate metocean data forecasts, encompassing wind speed and wave height, are crucial for the optimal placement, operation, and maintenance of offshore wind farms and contribute significantly to the efficiency, safety, and lifespan of FOWTs. This study examines the application of three machine learning (ML) models, namely Facebook Prophet, Seasonal Autoregressive Integrated Moving Average with Exogenous Factors (SARIMAX), and long short-term memory (LSTM), to forecast wind speeds and significant wave heights, using data from a buoy situated in the Pacific Ocean. The models are evaluated based on their ability to predict 1-, 3-, and 30-day future wind speed and wave height values, with performances assessed through Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics. Among the models, LSTM displayed superior performance, effectively capturing the complex temporal dependencies in the data. Incorporating exogenous variables, such as atmospheric conditions and gust speed, further refined the predictions. The study’s findings highlight the potential of ML models to enhance the integration and reliability of renewable energy sources through accurate metocean forecasting.

Keywords:

offshore wind turbine; deep learning; SARIMAX; metocean data forecast

1. Introduction

The transition to renewable energy sources is increasingly acknowledged as a critical strategy for mitigating air pollution and combating global warming [1,2]. Offshore wind energy, which is enabled by advances in floating turbine technologies, presents a sustainable solution for reducing our reliance on fossil fuels, which contribute significantly to CO₂ emissions and climate change [3]. Renewable energies, such as wind power, play a pivotal role in decarbonizing the energy-production sector and mitigating the environmental and health impacts of air pollution. Diseases linked to air pollution, such as chronic respiratory conditions, cardiovascular diseases, and lung cancer, pose substantial public health challenges, underscoring the pressing need for cleaner energy alternatives [4,5]. Moreover, incorporating renewable energies into the electricity sector can yield positive social, environmental, and economic outcomes, aligning with global initiatives to achieve a sustainable and low-carbon future [6].

Floating offshore wind turbines (FOWTs) mark a significant advancement in harnessing wind energy, providing access to deeper waters where winds are consistently stronger. Unlike their fixed-bottom counterparts, FOWTs can be deployed in deep waters, thus exploiting vast untapped wind energy potential far from the shore. This adaptability reduces visual impact and navigational risks while maximizing energy capture [7]. The importance of accurately forecasting metocean data, such as wind speed and significant wave height, cannot be overstated in the context of FOWTs. These forecasts are crucial for the design, operation, and maintenance phases of offshore wind farm development. Precise metocean forecasts facilitate optimal turbine placement, enhancing energy production and efficiency [8,9]. They also improve safety and reduce operational costs by identifying ideal timeframes for installation and maintenance tasks, thus minimizing downtime and extending the lifespan of wind farms [10,11]. Moreover, incorporating metocean data into the planning and operational frameworks of offshore wind farms plays a vital role in enhancing the reliability and stability of the power grid. This integration aids in mitigating environmental impacts by ensuring that the dynamic marine environment is considered in FOWT designs, safeguarding marine ecosystems, and fostering sustainable development [12].

Integrating renewable energy sources into the power grid requires advanced forecasting techniques to handle the inherent variability and uncertainty of natural resources like wind and ocean waves. Recent advancements in machine learning (ML) have demonstrated promising results in forecasting wind speed and wave height, which are essential factors for optimizing the performance of renewable energy systems [13]. This literature review explores various ML approaches used to predict wind speed and wave height, highlighting their contribution to enhancing renewable energy integration and efficiency.

ML algorithms, such as support vector machines and random forest models, have been used to develop predictive models for runoff, demonstrating their effectiveness in forecasting and highlighting their potential in environmental applications [14]. A recent study explored using machine learning (ML) methods to forecast offshore wind speed, wave height, and alignment to optimize the operational performance of floating offshore wind turbines [15]. Through the application of nonlinear autoregressive with exogenous input (NARX) neural networks and Gaussian process regression (GPR), their study demonstrates the capability of ML to improve the precision of metocean predictions. This advancement contributes significantly to the efficiency and safety of renewable energy sources within marine settings. Another study presented an innovative approach to forecasting chaotic and random wind speed patterns by combining the Volterra series with machine learning (ML) techniques [16]. This study focuses on predicting Volterra kernels up to the third order, employing a forward–backward propagation neural network trained on 12-month wind speed data from the Fujairah site in the United Arab Emirates. Their methodology demonstrates the potential of ML in accurately forecasting wind speeds with complex patterns, offering valuable insights for wind energy management and planning. A different study introduced a hybrid machine learning (ML) framework designed specifically for short-term wind speed and power forecasting within smart city power grids [17]. Their model, which is called EMD-KM-SXL, integrates empirical mode decomposition, K-means clustering, and various ML techniques such as support vector regressor, XGBoost regressor, and Lasso regressor to predict wind speed. The demonstrated performance of this model highlights the effectiveness of ML in improving the accuracy of wind power forecasting, which is vital for the efficient scheduling of smart power generation.

Another study explored the application of Artificial Neural Network (ANN) models for wind speed forecasting at different potential locations in Pakistan [18]. By analyzing wind speed data at four distinct heights across 12 stations, the researchers demonstrated the capability of ANN models in capturing the variability of wind speeds, which is essential for assessing the wind energy potential of a region. An ultra-short-term forecasting approach for wind speed using lightweight features and ML models was examined in another study [19]. Their two-step method employs support vector regressor, random forests, and multi-layer perceptron models, indicating the superiority of ML models in predicting wind speed accurately over short intervals. Their study contributes to developing efficient wind energy management strategies by providing reliable wind-speed forecasts. Another study investigated the potential of ML techniques for forecasting wave height over the Ocean of Things, focusing on its relevance to ocean renewable energy generation [20]. Their study highlights the adaptability of ML methods in predicting oceanic conditions, facilitating the exploitation of wave energy as a renewable resource. A novel hybrid framework that enhances the accuracy of predicting karst spring discharge using historical data tailored for regions with sparse meteorological data was presented in another study [21]. This framework leverages LSTM models optimized with advanced algorithms and variable screening techniques like partial correlation and mutual information, improving data input quality. Additionally, it employs time series decomposition methods such as LOESS and empirical mode decomposition to simplify the input data, making the model more effective. This approach outperforms conventional models that are dependent on meteorological inputs and offers a robust solution for water resource management in karst areas. 
Another study proposed a novel approach to daily runoff prediction by integrating physically based models with a long short-term memory (LSTM) network [22]. Their research addresses the challenges of the non-stationary and time-varying nature of runoff prediction, leveraging the simulation strength of physical mechanism models and the nonlinear analysis capabilities of LSTM. This combination strategy enhances the accuracy of runoff predictions and offers a comprehensive evaluation metric that considers the characteristics of multiple models, showing substantial improvements in forecasting performance across various watershed characteristics. This innovative methodology promises significant implications for water resource management and reservoir operations. In another example, particle swarm optimization (PSO) is integrated with long short-term memory (LSTM) neural networks to enhance flood forecasting [23]. This approach systematically optimizes LSTM hyperparameters, significantly improving the accuracy and reliability of rainfall-runoff predictions. This methodology is pivotal for effective flood prevention and advances the capabilities of hydrological models in environmental management and disaster mitigation.

This study utilizes three machine learning methods to analyze metocean data collected from an offshore buoy in the Pacific Ocean, northwest of California, with the aim of developing a comprehensive method for predicting wind and wave patterns at a specific offshore site. The short-term prediction of wind and wave conditions with high accuracy significantly contributes to optimizing wind turbine control systems and their efficiency. Accurate predictions for more extended periods help operators schedule maintenance during times of low activity, which reduces downtime and increases the overall availability of the turbines. This optimized scheduling directly impacts the efficiency and lifespan of turbines by ensuring they operate under ideal conditions and receive maintenance before weather-induced wear and tear can occur.

This study focuses on short-term and mid-range, or sub-seasonal, metocean forecasts. Short-term forecasts of wind and wave conditions, spanning 1–3 days, are known for their high precision. They are particularly useful for energy-maximizing control purposes, for predicting ship motions during offshore operations, and for making decisions during marine operations [24]. Sub-seasonal forecasts generally cover a period from 10 to 30 days, bridging the gap between conventional weather models and longer-term seasonal predictions, which span one to seven months [25].

Medium-term forecasts are crucial for planning maintenance operations and selecting operational modes for marine renewable energy devices, such as power production or survivability mode [26]. These forecasts enable offshore engineers to schedule construction, maintenance, and drilling activities during favorable weather conditions, minimizing downtime and enhancing operational efficiency. Accurate forecasts facilitate the optimal allocation of resources, such as deploying vessels and personnel at the right time, thereby reducing costs associated with delays and standbys. Marine operations exceeding 72 h are typically planned as weather-unrestricted, with environmental conditions estimated using long-term statistics that may vary seasonally. Improved weather forecasts, especially for significant wave height and wind speed, can extend the feasible duration of weather-restricted operations [27].

For wind energy, maintaining turbines involves tasks like working in the nacelle or using cranes, where wind speed safety limits must be observed, meaning work can only proceed when wind speeds are low. These maintenance activities often necessitate hiring cranes, contractors, and other equipment that needs to be booked well in advance, often with a wait time of several weeks [28]. Knowing expected weather conditions allows for better scheduling by allowing researchers to make decisions further in advance or providing more information than simply relying on average conditions for the season. For instance, knowing if a particular week is forecasted to be more or less windy than usual can help researchers plan the number of jobs to schedule for that week. Extended periods of unusually low wind, sometimes coupled with high demand due to cold weather, can complicate and increase the cost of power system management. Subseasonal-to-seasonal forecasts can offer early warnings about these unusual conditions, allowing for timely preparations and corrective actions [25].

Although the main focus of this study is on FOWTs, the implications of this research extend beyond the interest of offshore wind. The precise seasonal and sub-seasonal predictions of wind and waves will offer advantages in terms of coastal land management, marine vessel routing, renewable energy sectors, and oil and gas operations [29,30,31,32].

Even though short-term forecasts are crucial for shaping societal decisions, many important choices must be made several weeks or months before favorable or disruptive environmental conditions occur. For instance, relocating emergency and disaster relief supplies can take weeks or months. However, pre-staging these resources in areas likely to experience extreme weather or disease outbreaks could save lives and maximize the effectiveness of limited resources [33]. Likewise, emergency managers dealing with unexpected events such as nuclear power plant accidents or large oil spills must communicate the impacts of these events over extended timeframes. Additionally, naval and commercial shipping planners set shipping routes weeks in advance to strategically position assets, avoid hazards, and capitalize on favorable conditions [33].

Large waves can prevent ships from mooring with oil and gas platforms. Typically, wave heights up to 2 m are within standard operating conditions, while heights between 2 and 3.5 m make docking more difficult. An analysis of North Sea data revealed that over 50% of wave heights were above 3.5 m in nearly half of the winters studied, leading to difficult mooring conditions. Furthermore, in 14% of the winters, wave heights of 5 m or more were observed over 30% of the time, likely compromising platform operability. Thus, wave height forecasts weeks in advance would greatly aid in planning operations that involve oil and gas platforms [34].

The remainder of this paper is organized as follows. Section 2 details the machine learning models used in this study, including Facebook Prophet, SARIMAX, and long short-term memory (LSTM), and describes the data acquisition and preprocessing processes. In Section 3, the performance of these models is evaluated in forecasting wind speed and significant wave height, offering a critical analysis of their effectiveness and practical implications. In Section 4, findings are synthesized, their significance for offshore wind turbine applications is discussed, and potential avenues for future research are outlined.

2. Methodology

The present study uses three machine learning models—Facebook Prophet, SARIMAX, and LSTM—to predict wind speed and wave height using high- and low-resolution datasets for 1-, 3-, and 30-day periods. The subsequent sections elaborate extensively on these three ML models.

2.1. Facebook Prophet

Facebook Prophet is a forecasting tool for time series data characterized by trends and seasonality patterns. It is well suited to irregular data, which are common in business forecasts and often contain missing values or significant outliers, and it typically requires minimal data preprocessing [35]. The objective of the Prophet model is to streamline the forecasting process by automating a substantial portion of the statistical modeling procedure [36]. At the core of the Prophet model lies a decomposable time series model comprising three primary elements (trend, seasonality, and holidays), represented as follows:

$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t \tag{1}$$

where $y(t)$ is the predicted value; $g(t)$ is the trend function, which models non-periodic changes; $s(t)$ is the seasonality component, which models periodic changes; $h(t)$ represents the effects of holidays or events; and $\varepsilon_t$ is the error term accounting for any idiosyncratic changes not accommodated by the model. The components of the Prophet model, its functionality, and the underlying mechanisms that drive its operations are described here.

The trend component, denoted as $g(t)$, is modeled with either a piecewise linear or a logistic growth curve so that it can adapt to variations in the time series trend. In the linear model, the trend is a piecewise linear function. In the logistic growth model, by contrast, the trend saturates at a carrying capacity, $C(t)$, which may itself vary over time, and can be expressed as follows:

$$g(t) = \frac{C(t)}{1 + e^{-k(t - m)}} \tag{2}$$

where $k$ is the growth rate and $m$ is an offset parameter.

The seasonality component, denoted as $s(t)$, is modeled utilizing the Fourier series to accommodate complex seasonal patterns. This feature enables Prophet to capture yearly, weekly, and daily seasonal variations. The Fourier series for seasonality can be represented as follows:

$$s(t) = \sum_{n=1}^{N} \left( a_n \cos\left(\frac{2\pi n t}{P}\right) + b_n \sin\left(\frac{2\pi n t}{P}\right) \right) \tag{3}$$

Here, $N$ is the number of Fourier terms that control the model flexibility, $a_n$ and $b_n$ are the Fourier coefficients, and $P$ is the period.

The holidays component $h(t)$ models predictable irregularities on specific dates, like holidays or events, which can be manually specified. It uses an indicator function for each holiday to model its effect as follows:

$$h(t) = \sum_{i=1}^{I} I_{\mathrm{holiday}_i}(t) \cdot \delta_i \tag{4}$$

where $I_{\mathrm{holiday}_i}(t)$ is an indicator function that is 1 if time $t$ is the $i$th holiday and 0 otherwise, and $\delta_i$ represents the impact of the $i$th holiday on the forecast. Finally, the error term, which is denoted as $\varepsilon_t$, captures any random fluctuations in the data that remain unaccounted for by the model. It is presumed to follow a normal distribution, $\varepsilon_t \sim N(0, \sigma^2)$, where $\sigma^2$ represents the variance of the error term.
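To make the additive decomposition in Equations (1)–(4) concrete, the following minimal Python sketch evaluates each component directly. It is an illustration only and not the Prophet library or the implementation used in this study: the function names, the constant carrying capacity, and all coefficient values are assumptions chosen for the example.

```python
import math


def logistic_trend(t, capacity, k, m):
    """Equation (2) with a constant carrying capacity (an assumption of this sketch)."""
    return capacity / (1.0 + math.exp(-k * (t - m)))


def fourier_seasonality(t, period, a, b):
    """Equation (3): truncated Fourier series with N = len(a) terms."""
    total = 0.0
    for n, (a_n, b_n) in enumerate(zip(a, b), start=1):
        angle = 2.0 * math.pi * n * t / period
        total += a_n * math.cos(angle) + b_n * math.sin(angle)
    return total


def holiday_effect(t, holidays):
    """Equation (4): add delta_i whenever t coincides with the i-th holiday."""
    return sum(delta for day, delta in holidays if day == t)


def additive_model(t, capacity, k, m, period, a, b, holidays):
    """Equation (1) without the random error term."""
    return (
        logistic_trend(t, capacity, k, m)
        + fourier_seasonality(t, period, a, b)
        + holiday_effect(t, holidays)
    )
```

Increasing the number of coefficients passed in `a` and `b` corresponds to a larger $N$, which lets the seasonal term follow faster oscillations at the cost of a higher risk of overfitting.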

2.2. SARIMAX

The Seasonal Autoregressive Integrated Moving Average with eXogenous variables (SARIMAX) model is an extension of the Autoregressive Integrated Moving Average (ARIMA) model. The SARIMAX incorporates both seasonality and exogenous variables into the forecasting equation. It is a powerful statistical method used for forecasting time series data that can account for complex patterns, trends, seasonal effects, and the influence of external factors. These features make it capable of capturing complex data patterns beyond trends and noise [37]. The SARIMAX model can be represented by the notation SARIMAX(p, d, q)(P, D, Q)[s] with exogenous variables, where p stands for the order of the autoregressive (AR) part, d is the degree of differencing, and q denotes the order of the moving average (MA) part. P, D, and Q represent the orders of the seasonal AR, differencing, and MA parts, respectively. The exogenous variables represent the external factors that might influence the target variable but are not part of the target time series itself, and s indicates the periodicity of the seasonality. The combined equation for the SARIMAX model is as follows:

$$\left(1 - \phi(B) - \Phi(B^s)\right) (1 - B)^d (1 - B^s)^D Y_t = \left(1 + \theta(B) + \Theta(B^s)\right) \varepsilon_t + \beta X_t \tag{5}$$

This model integrates the effects of autoregression, moving average, differencing, seasonality, and exogenous variables into a single comprehensive forecasting model. The SARIMAX model offers a robust framework for forecasting complex time series data, utilizing internal dynamics and external influences and accounting for seasonality. Internal dynamics are implemented through AR, MA, and differencing, while the external influences are applied via exogenous variables. A breakdown of these key components of the SARIMAX model with insights into its functionality and underlying mechanisms is described here.

The autoregressive term (AR) models the relationship between the current value of the time series data and its past values.

$$\phi(B) Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} \tag{6}$$

Here, $\phi(B)$ represents the AR polynomial in the backshift operator $B$, and $\phi_1, \phi_2, \phi_3, \dots, \phi_p$ are the model’s parameters. Finally, applying the backshift operator to $Y_t$ results in $Y_{t-1}$.

The moving average component (MA) models the error term as a linear combination of past error terms and can be represented as follows:

$$\theta(B) \varepsilon_t = \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \tag{7}$$

As with the AR term, $\theta(B)$ is the MA polynomial in the backshift operator $B$, and $\theta_1, \theta_2, \dots, \theta_q$ are the model’s parameters.

The integration component (I) involves differencing the time series data to achieve stationarity. Here, $d$ is the order of differencing. The operator subtracts the previous value from the current value, and this is repeated $d$ times, which helps remove trends from the series. The seasonal counterpart, $(1 - B^s)^D$, plays the same role for seasonal patterns.

$$(1 - B)^d Y_t \tag{8}$$
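The differencing operator in Equation (8) can be sketched in a few lines of Python. This is an illustrative helper rather than part of any forecasting library; the function name `difference` and its `lag` and `order` arguments are assumptions made for this example. Setting `lag=1` applies $(1 - B)^d$, while setting `lag=s` applies the seasonal operator $(1 - B^s)^D$.

```python
def difference(series, lag=1, order=1):
    """Apply (1 - B^lag)^order to a sequence of numbers.

    Each pass replaces y_t with y_t - y_{t-lag}, so the result is
    shorter than the input by lag * order elements.
    """
    out = list(series)
    for _ in range(order):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


# A quadratic trend becomes constant after two ordinary differences.
quadratic = [1, 4, 9, 16, 25]
second_difference = difference(quadratic, lag=1, order=2)

# A pattern that repeats every 4 steps vanishes after one seasonal difference.
seasonal = [5, 1, 3, 7, 5, 1, 3, 7]
deseasonalized = difference(seasonal, lag=4, order=1)
```

The shortening of the series with each pass is one practical reason to keep $d$ and $D$ as small as the stationarity diagnostics allow.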

The seasonal AR and MA components are denoted as $\Phi(B^s)$ and $\Theta(B^s)$, respectively, where $s$ is the seasonality period and $P$ and $Q$ are the orders of the seasonal AR and MA parts.

Finally, the exogenous variables (X) are external factors that influence the time series data. Here, $\beta$ represents the coefficients of the exogenous variables and $X_t$ represents the external factors at time $t$.

$$Y_t = \beta X_t + \mathrm{ARIMA} \tag{9}$$

2.3. Long Short-Term Memory

Long short-term memory (LSTM) models represent a distinct category within Recurrent Neural Networks (RNNs), as they possess the ability to learn long-term dependencies and remember information for long periods within sequential data [38,39]. This is a crucial capability in many applications in which the current output is significantly influenced by context and the history of information. These models were introduced to mitigate the vanishing gradient problem that traditional RNNs encounter during training. LSTMs are widely used for sequence prediction problems like natural language processing, speech recognition, and time series forecasting.

The core component of an LSTM is a special unit called a memory cell, whose cell state is regulated by three gates, namely the forget, input, and output gates, which control the flow of information. The forget gate $(f_t)$ decides what information should be thrown away or kept, the input gate $(i_t)$ updates the cell state with new information, and the output gate $(o_t)$ determines what the next hidden state should be.

Given a sequence of inputs $\{x_1, x_2, \dots, x_T\}$, an LSTM updates its hidden state $h_t$ and cell state $c_t$ at each time step $t$. The forget gate $(f_t)$ looks at the values of $h_{t-1}$ and $x_t$ and outputs a number between 0 and 1 for each element of the cell state $c_{t-1}$. An output of 1 means “keep completely”, while 0 means “discard completely”.

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{10}$$

The input gate $(i_t)$ decides which values will be updated, and a tanh layer creates a vector of new candidate values, $\tilde{c}_t$, that could be added to the state.

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{11}$$

$$\tilde{c}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{12}$$

The cell state is then updated by scaling the previous state with the forget gate, which controls how much is forgotten, and adding the candidate values scaled by the input gate, which controls how strongly each state value is updated.

$$c_t = f_t * c_{t-1} + i_t * \tilde{c}_t \tag{13}$$

Finally, the hidden state $(h_t)$, which contains predictions and information on previous inputs, is determined by the output gate $(o_t)$.

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{14}$$

$$h_t = o_t * \tanh(c_t) \tag{15}$$

Here, $x_t$ is the input vector at time step $t$, while $h_{t-1}$ and $c_{t-1}$ are the hidden and cell states from the previous time step, respectively. $W$ and $b$ denote the weights and biases of the different parts of the system. The sigmoid function $(\sigma)$ produces gate values between 0 and 1, the hyperbolic tangent $(\tanh)$ produces values between $-1$ and 1, and $*$ denotes elementwise multiplication, through which the cell state and the hidden state are obtained.
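Equations (10)–(15) together describe a single LSTM time step, which can be sketched in plain Python as follows. This is a didactic illustration under stated assumptions, not the network trained in this study: the function names, the dictionary keys for the parameters, and the use of nested lists instead of a deep learning framework are all choices made for this example.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W . v + b for a weight matrix W (list of rows) and a vector v."""
    return [sum(w * x for w, x in zip(row, v)) + b_j for row, b_j in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance one time step; returns the new hidden and cell states."""
    concat = h_prev + x_t  # list concatenation builds [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]  # Eq. (10)
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]  # Eq. (11)
    c_tilde = [math.tanh(z) for z in affine(params["W_c"], params["b_c"], concat)]  # Eq. (12)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]  # Eq. (13)
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]  # Eq. (14)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # Eq. (15)
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate vector to zero, so the cell state is simply halved at each step. This gives a quick sanity check that the gates scale the state rather than replace it.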

2.4. Data Acquisition and Preprocessing

The data utilized in this study consist of standard meteorological and descriptive wave measurements obtained from the National Oceanic and Atmospheric Administration (NOAA) in the United States. These measurements were gathered by sensors installed on floating offshore buoys deployed across United States and international waters and managed by the National Data Buoy Center (NDBC). Specifically, data from Station 46059, positioned west of San Francisco, California, were selected for analysis. Situated in the Pacific Ocean off the northwest coast of California (38°4′9″ N 129°58′34″ W), this offshore buoy provides historical data from October 2015 to April 2021. The dataset contains various measurement parameters, including wind direction (WDIR), wind speed (WSPD), gust (GST), wave height (WVHT), dominant wave period (DPD), average wave period (APD), mean wave direction (MWD), pressure (PRES), air temperature (ATMP), water temperature (WTMP), dew point (DEWP), visibility (VIS), tide (TIDE), and a date column.

The selection of this particular station was motivated by its stable weather and sea conditions, which make it a suitable location for offshore wind turbines without significant disturbances. Moreover, the mean wind speed recorded at this station over the historical data utilized in this study is 8.66 m/s, with a maximum wind speed of 23.36 m/s at the hub height of the National Renewable Energy Laboratory (NREL) 5 MW offshore wind turbine. These wind speeds fall within the operational range of this turbine, further justifying the choice of this station for analysis.

The high-resolution meteorological data files feature a sampling frequency of 10 min for wind speed and one hour for significant wave height. These data were resampled to a daily sampling frequency, using average values, to create a low-resolution dataset for longer 30-day forecasts. This study focuses on forecasting wind speed and wave height for 1, 3, and 30 days using high- and low-resolution data. Any missing values were filled using linear interpolation to prepare the data for the machine learning models. After handling missing values, a seasonal decomposition was performed to gain insights into the seasonality of the data. This decomposition reveals three key components: trend, seasonality, and residual, as demonstrated in Figure 1 and Figure 2 for WSPD and WVHT. The analysis indicates the presence of seasonal patterns in the wind speed data, which aligns with expectations given the influence of seasonal weather patterns on wind speed. The Augmented Dickey–Fuller (ADF) test results for the wind speed data show a test statistic of $-22.245$ and a p-value of 0.0, with 1 lag and 2018 observations used. Moreover, the critical values at the 1%, 5%, and 10% significance levels are $-3.433$, $-2.863$, and $-2.567$, respectively.
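The resampling and gap-filling steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the preprocessing code used in this study: the function names are hypothetical, records are assumed to be `(timestamp, value)` pairs with `None` marking a missing reading, and only interior gaps are interpolated.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_mean(records):
    """Average (timestamp, value) records per calendar day, skipping None values."""
    buckets = defaultdict(list)
    for stamp, value in records:
        if value is not None:
            buckets[stamp.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}


def daily_series(means):
    """Lay the daily means on a gap-free calendar, marking absent days as None."""
    first, last = min(means), max(means)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return days, [means.get(day) for day in days]


def interpolate_linear(values):
    """Fill interior None gaps on an evenly spaced series by straight-line interpolation."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled
```

Averaging before interpolating means that a day with a few missing 10 min readings still receives a value from its remaining samples, and interpolation is only needed for days with no valid readings at all.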

The ADF test statistic is far less than the critical values, and the p-value is 0.0, indicating strong evidence against the null hypothesis. The null hypothesis can be rejected with confidence, and it can be concluded that the series is stationary. The results of the seasonal decomposition offer valuable insights into the underlying structure of the Wave Height (WVHT) data, as depicted in Figure 2. While WVHT displays a certain level of predictability through its seasonal patterns, any long-term changes indicated by the trend do not stem from non-stationary processes but signify a stable evolution over time. This analysis underscores the importance of considering seasonal influences and long-term trends when comprehending and forecasting wave height dynamics. Moreover, the absence of significant residuals suggests that the decomposition model effectively captures the primary dynamics of the data. This makes it a potentially valuable tool for further analysis and decision-making related to oceanic and coastal activities.

The ADF test results for the WVHT data indicate a test statistic of $-5.9337$, with 14 lags and 2004 observations used. Moreover, the critical values at the 1%, 5%, and 10% significance levels are $-3.433$, $-2.863$, and $-2.567$, respectively. As with WSPD, the null hypothesis can be rejected with a high level of confidence, as the ADF statistic for WVHT is well below the critical values at all three levels, meaning the WVHT time series is stationary. The p-values for WSPD and WVHT are also very small, supporting the conclusion that both time series are stationary and do not have a unit root. This implies that the mean and variance of the WVHT and WSPD data do not have time-dependent structures that would require differencing to make them stationary.
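The decision rule applied to the ADF results above is a one-sided comparison: the unit-root null hypothesis is rejected at a given level when the test statistic lies below the corresponding critical value. The short sketch below applies that rule to the values reported in the text. The function name is hypothetical, and the sketch does not compute the ADF statistic itself.

```python
# Critical values reported in the text for both series.
CRITICAL_VALUES = {"1%": -3.433, "5%": -2.863, "10%": -2.567}


def rejects_unit_root(test_statistic, critical_values=CRITICAL_VALUES):
    """Return, per significance level, whether the unit-root null is rejected."""
    return {level: test_statistic < value for level, value in critical_values.items()}


wspd_decision = rejects_unit_root(-22.245)  # wind speed statistic from the text
wvht_decision = rejects_unit_root(-5.9337)  # wave height statistic from the text
```

Both statistics fall below all three critical values, which is consistent with treating the two series as stationary.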

Choosing a suitable exogenous variable (EXOG) for an ML model depends on the specific context and the hypothesis about what might influence the variables that are being predicted, which in this study are WSPD and WVHT. Exogenous variables are external factors that could have a predictive relationship with the target variable. The following features might be considered as exogenous variables for wind speed forecasting.

Changes in atmospheric pressure (PRES) can influence wind patterns and speeds. Lower pressure often leads to higher wind speeds as air moves from high- to low-pressure areas. In addition, temperature differences can cause pressure differences, leading to wind. The difference between air temperature (ATMP) and water temperature (WTMP) might also be significant, especially for coastal areas or over bodies of water. Furthermore, the air’s humidity or dew point (DEWP) can affect atmospheric conditions and, consequently, wind patterns. Likewise, the wind direction (WDIR) could provide context for seasonal wind patterns or shifts that influence speed.

To evaluate the connection between potential exogenous variables and wind speed, an exploratory data analysis (EDA) was conducted on PRES, ATMP, WTMP, DEWP, and WDIR. The correlations of these variables with wind speed were examined and visualized using scatter plots to identify promising candidates for inclusion as exogenous variables in the machine learning models. The correlation coefficients between wind speed (WSPD) and atmospheric pressure (PRES) and wind direction (WDIR) indicate a very weak relationship, with values of −0.0084 and −0.0297, respectively. Similarly, the correlations between WSPD and air temperature (ATMP), water temperature (WTMP), and dew point (DEWP) are weak, with values of −0.1135, −0.1346, and −0.1134, respectively. These values suggest little to no linear relationship between wind speed and the selected exogenous variables. However, correlation coefficients only capture linear relationships, and nonlinear or more complex interactions between these variables could still affect wind speed. Further analysis and modeling may therefore be necessary to fully understand the relationship between wind speed and these exogenous variables.
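The caveat that a correlation coefficient only measures linear association can be illustrated with a minimal sketch. The data below are synthetic and chosen for the illustration; they are not buoy measurements.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data: y_quadratic is fully determined by x, but through a
# symmetric (quadratic) relationship, so the linear correlation vanishes.
x = [-3, -2, -1, 0, 1, 2, 3]
y_quadratic = [v ** 2 for v in x]
y_linear = [2 * v + 1 for v in x]

print(pearson_r(x, y_quadratic))  # zero: no linear association
print(pearson_r(x, y_linear))     # one: perfect linear association
```

A near-zero coefficient therefore does not rule out a variable as a useful input to a nonlinear model.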

The correlations are relatively weak, with the strongest (yet still weak) negative relationships observed with water temperature (WTMP), air temperature (ATMP), and dew point (DEWP). These negative correlations suggest that as the temperatures and dew point increase, wind speed slightly decreases, possibly due to temperature’s influence on atmospheric pressure and wind patterns. However, the correlations are not strong enough to indicate a direct or significant predictive relationship on their own. Given the weak correlations, Figure 3 provides a useful visualization of these relationships to look for nonlinear patterns or outliers that might influence the wind speed.

Figure 3 confirms that the considered variables are weak predictors of wind speed, which suggests that, in their raw form, they are poor candidates for exogenous variables in an ML model. Instead, new features that capture a meaningful relationship between air temperature (ATMP) and wind speed (WSPD) were created to be used as exogenous variables in the models. The choice of air temperature is based on the availability of data, domain knowledge, and its correlation with wind speed. While no formula universally links air temperature to wind speed, owing to the complex nature of atmospheric dynamics, a few conceptual ideas can guide the construction of a feature that could enhance the model’s ability to predict wind speed. One concept is the temperature gradient, which measures how temperature changes in space. In meteorology, significant temperature differences can drive wind formation through the pressure differences they create. Another approach is to create an interaction term between ATMP and other relevant features, which might reveal hidden relationships. For example, the interaction between ATMP and PRES could provide insights, as pressure differences driven by temperature changes are a fundamental cause of wind.

In this study, the correlations between wave height (WVHT) and the average wave period, gust speed, and wind speed are relatively high, with values of 0.643, 0.631, and 0.566, respectively. These findings suggest a stronger linear relationship between wave height and these variables, indicating that they may significantly affect wave height fluctuations. On the other hand, the correlation coefficients between WVHT and the dominant wave period, mean wind direction, and absolute wind direction are relatively low, with values of 0.345, 0.157, and 0.13, respectively. While these variables may still influence wave height, their impact appears less pronounced than that of the average wave period, gust speed, and wind speed. Furthermore, air pressure (PRES), dew point temperature (DEWP), water temperature (WTMP), and air temperature (ATMP) showed lower correlations with WVHT. This analysis suggests that wave period (particularly the average period), gust speed, and wind speed are the variables most strongly associated with wave height, which aligns with physical expectations.
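The feature ideas discussed above can be sketched as follows. The exact feature definitions used in the study are not stated here, so the three features below (an air-sea temperature difference, a first difference of air temperature in time as a stand-in for a gradient at a single buoy, and an ATMP × PRES interaction term) are assumptions, as are the function name and the sample values.

```python
def engineer_features(atmp, wtmp, pres):
    """Build candidate exogenous features from raw buoy channels.

    atmp, wtmp, pres: equal-length lists of air temperature, water
    temperature, and atmospheric pressure readings (hypothetical layout).
    """
    features = []
    for i in range(len(atmp)):
        features.append({
            # Air-sea temperature difference at the same timestamp.
            "air_sea_diff": atmp[i] - wtmp[i],
            # Change in air temperature since the previous sample; the
            # first sample has no predecessor, so it is left undefined.
            "atmp_change": atmp[i] - atmp[i - 1] if i > 0 else None,
            # Interaction term between temperature and pressure.
            "atmp_x_pres": atmp[i] * pres[i],
        })
    return features


# Hypothetical readings, for illustration only.
rows = engineer_features(
    atmp=[20.0, 21.5], wtmp=[18.0, 18.5], pres=[1010.0, 1000.0]
)
for row in rows:
    print(row)
```

Each candidate feature can then be screened against the target in the same way as the raw variables before it is passed to a model.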

3. Results and Discussion

This section presents the study’s findings regarding the forecasting of wind speed and significant wave height for 1-, 3-, and 30-day periods, utilizing three distinct machine learning models: Prophet, SARIMAX, and LSTM.

3.1. Prophet Model

Initially, the Prophet forecasting model was employed to forecast wind speed and significant wave height over a 30-day timeframe, using a low-resolution dataset preprocessed to meet the model’s specifications. Following data loading and cleaning, the dataset was divided into training and testing subsets, with the final 30 days reserved for model evaluation. The Prophet model, which is designed to handle seasonality and trends in time series data, was trained on the historical data excluding the last 30 days. Predictions for the following 30 days were then generated and compared with the observed values. Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) were used to gauge forecasting accuracy. MAE measures the average magnitude of the errors between predicted and actual values. RMSE squares these differences before averaging and then takes the square root, which emphasizes larger errors and makes it sensitive to outliers. The MAE of 2.37 and RMSE of 2.76 indicate reasonable accuracy for forecasting wind speed, given that the mean wind speed of the test data is 6.36 m/s. As shown in Figure 4, the Prophet model cannot capture the seasonality and residual of the data and works as a moving average model to mimic the data trend.
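The two error metrics can be written out in a few lines. The sample values below are hypothetical; only the final ratio uses numbers reported above (the wind speed MAE and the mean wind speed of the test data).

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical observations and forecasts, for illustration only.
actual = [5.0, 7.0, 6.0, 8.0]
predicted = [6.0, 7.0, 4.0, 8.0]

print(mae(actual, predicted))   # 0.75
print(rmse(actual, predicted))  # larger than the MAE because of the 2.0 error

# Reported wind speed MAE relative to the mean of the test data.
print(2.37 / 6.36)
```

RMSE is never smaller than MAE on the same data, and the gap between them grows when a few errors are much larger than the rest.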

Figure 5 shows the Prophet model predictions over the entire wind speed dataset, with the shaded region highlighting the uncertainty range.

With an MAE of 0.83 and an RMSE of 0.93 for significant wave height, which has an average value of 2.57 m, the results show a slight improvement over the wind speed forecast. However, Figure 6 and Figure 7 indicate that the Prophet model still performs poorly in capturing the complex patterns of metocean data.

High-resolution datasets were utilized for short-term forecasting. The dataset was divided into training and testing sets at 1 January 2021: data from before this date were used for training, while data from 1 January 2021 onwards were used for model evaluation. The MAE values for the 1-day and 3-day wind speed forecasts using the Prophet model were 1.29 and 1.14, respectively. The results are illustrated in Figure 8, which shows that the model’s performance is suboptimal and that it struggles to capture underlying patterns in the data.
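A chronological split of this kind, followed by selection of the 1-day and 3-day evaluation windows, can be sketched as below. The hourly sampling interval and the record layout are assumptions made for the example; the actual resolution of the buoy data is not restated here.

```python
from datetime import datetime, timedelta


def chronological_split(records, cutoff):
    """Split (timestamp, value) pairs at a cutoff without shuffling."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def forecast_window(test, cutoff, days):
    """Keep the test records that fall within `days` after the cutoff."""
    end = cutoff + timedelta(days=days)
    return [r for r in test if r[0] < end]


# Hypothetical hourly series covering five days around the cutoff.
start = datetime(2020, 12, 31)
records = [(start + timedelta(hours=h), float(h)) for h in range(120)]

cutoff = datetime(2021, 1, 1)
train, test = chronological_split(records, cutoff)
one_day = forecast_window(test, cutoff, days=1)
three_day = forecast_window(test, cutoff, days=3)
print(len(train), len(test), len(one_day), len(three_day))
```

Splitting by date rather than at random keeps every training sample strictly earlier than every test sample, so the evaluation does not leak future information into the model.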

Similarly, using high-resolution data, the Mean Absolute Error (MAE) for the wave height forecast is 0.35 for one day and 1.23 for three days. Figure 9 illustrates the model’s performance. The model successfully captures the overall trends of the data but performs poorly in capturing fluctuations.

3.2. SARIMAX

Following the Prophet model, the SARIMAX model was employed to forecast wind speed (WSPD), utilizing its capability to account for both seasonal variations and the influence of external factors on wind dynamics. The SARIMAX (1, 0, 1) × (2, 1, 0, 12) configuration was selected because it exhibited the lowest Akaike Information Criterion (AIC) among the candidates and best captured the temporal structure and seasonality of the wind speed data. The AIC serves as a key metric for SARIMAX order selection because it balances model fit against complexity, penalizing parameters that do not significantly improve the model; lower AIC values indicate a better trade-off. By integrating autoregressive (AR) and moving average (MA) components with seasonal adjustments, the SARIMAX model offers a nuanced insight into wind-speed fluctuations over time. Furthermore, incorporating exogenous variables provides a framework for quantifying the impact of atmospheric conditions, such as pressure and temperature, on wind speed predictions. The diagnostic plots in Figure 10 provide several key insights into the model’s performance and the behavior of the residuals.
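Order selection by AIC can be sketched as follows, using the standard definition AIC = 2k − 2 ln L. The log-likelihood values are hypothetical and are not results from this study; the parameter counts assume one coefficient per AR, MA, seasonal AR, and seasonal MA term plus the error variance.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the candidate label with the lowest AIC.

    candidates maps a label to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fitted log-likelihoods, for illustration only.
candidates = {
    "(1,0,1)x(2,1,0,12)": (-1200.0, 5),
    "(2,0,2)x(2,1,1,12)": (-1199.5, 8),
    "(0,0,1)x(0,1,1,12)": (-1230.0, 3),
}

for name, (loglik, k) in candidates.items():
    print(name, aic(loglik, k))
print("selected:", select_by_aic(candidates))
```

In this made-up comparison the larger model fits marginally better, but its three extra parameters cost more than the gain in likelihood, so the smaller model is preferred.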

The plot in the top left corner of Figure 10 illustrates the time series of standardized residuals. Notably, these residuals do not show any evident patterns, which is favorable, as it suggests that the residuals closely resemble white noise, which aligns with our modeling objectives. At the top right, a comparison of the distribution of the standardized residuals with a normal distribution is shown. If the residuals are normally distributed, the Kernel Density Estimate line (KDE) should ideally follow the N(0,1) line closely. There appears to be a good fit between the KDE and N(0,1) lines, suggesting that the residuals are approximately normally distributed. In the bottom left plot, the Quantile–Quantile plot is presented, illustrating the comparison between the distribution of the residuals and a normal distribution. Ideally, the points should closely follow the red line. The plot exhibits a strong alignment, particularly in the middle quantiles, indicating that the residuals are well approximated by a normal distribution. The residuals’ correlogram (ACF plot) is on the bottom right. For a competent model, the absence of significant autocorrelations in the residuals is expected. The plot reveals that all autocorrelations fall within the confidence band, indicating that the residuals exhibit the characteristics of white noise.
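The correlogram check can be sketched with a sample autocorrelation function and an approximate 95% white-noise band of ±1.96/√n. This band is a common approximation and an assumption of this sketch; plotting libraries may compute the band differently. The input series is synthetic and deliberately non-random, to show what a failed check looks like.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def white_noise_band(n, z=1.96):
    """Approximate 95% confidence band for white-noise autocorrelations."""
    return z / math.sqrt(n)


def significant_lags(series, max_lag):
    """Lags (excluding lag 0) whose autocorrelation leaves the band."""
    band = white_noise_band(len(series))
    values = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(values[k]) > band]


# Synthetic alternating residuals: strongly autocorrelated by design.
residuals = [1, -1] * 5
print(acf(residuals, 2))
print(significant_lags(residuals, 2))
```

For residuals that behave like white noise, the list of significant lags is expected to be empty or nearly so, which corresponds to all bars of the correlogram staying inside the band.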

The SARIMAX model was applied to the dataset to forecast wind speed with and without exogenous variables. However, the resulting Root Mean Squared Error (RMSE) of 3.78 for wind speed predictions showed no improvement compared to the Prophet model. Additionally, incorporating exogenous variables into the SARIMAX model did not affect the prediction results. The wind-speed values predicted by the SARIMAX model and the actual wind speed are plotted in Figure 11.

From the results shown in Figure 11, it can be concluded that, in this particular context, the SARIMAX model did not offer enhanced forecasting performance over the Prophet model, with or without exogenous variables. Further exploration and refinement of the modeling approach may be necessary for improved predictive accuracy. The SARIMAX (0, 0, 1) × (0, 1, 2, 12) configuration was selected for significant wave height prediction. Without exogenous variables, the resulting RMSE of 1.48 indicated a decrease in prediction accuracy compared to the Prophet model. Figure 12 illustrates the predicted and measured values for 30 days, highlighting the difference between the predicted and actual significant wave heights. As in the wind-speed analysis, the SARIMAX model without exogenous variables may not effectively capture the underlying dynamics of significant wave height fluctuations, and further investigation and refinement of the modeling approach may be necessary to improve prediction accuracy.

In contrast to the wind speed prediction, incorporating gust speed as an exogenous variable in the SARIMAX model for significant wave height remarkably enhances the model’s accuracy and reduces the RMSE to 1.13. This improvement enables the SARIMAX model to capture the complex patterns observed in the wave data effectively. As depicted in Figure 13, the predictions generated by the SARIMAX model can be used to forecast significant wave height once a correction factor is applied to address the model’s inherent overestimation. This approach compensates for the discrepancy and ensures that the model accurately captures the behavior of the data.

These results indicate that the SARIMAX model captures the behavior of wave height better than the Prophet model, despite having a slightly higher RMSE (1.13 versus 0.93). The SARIMAX model’s ability to incorporate exogenous variables, such as gust speed, is essential in enhancing prediction accuracy and capturing the intricate patterns in the data.

Using high-resolution data and training the SARIMAX model with data up to 1 January 2021, the model achieved an MSE of 3.97 and an RMSE of 1.99 for a one-day wind speed prediction, and an MSE of 2.82 and an RMSE of 1.68 for a three-day prediction. These results indicate the model’s lower accuracy in handling high-resolution data. Figure 14 plots the predicted values for these forecasting periods against the actual wind speed values.

Using high-resolution data and the same data splitting point for training and evaluation as those used in the short-term wind speed prediction, the SARIMAX model achieved an MSE of 0.57 and an RMSE of 0.75 for a one-day wave height prediction, and an MSE of 1.03 and an RMSE of 1.01 for a three-day prediction. Figure 15 displays the predicted values for these forecasting periods plotted against the actual values of significant wave height. The significant decrease in accuracy observed when utilizing high-resolution data stems from the necessity to adjust the model’s parameter concerning the number of periods per season. This adjustment must account for the sampling frequency. However, such modifications substantially increase the computational demands, rendering the simulation infeasible without high-performance computing resources.
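The dependence of the seasonal period on the sampling frequency can be made concrete with a short calculation. The season lengths and sampling intervals below are hypothetical examples chosen for illustration; they are not the settings used in this study.

```python
def periods_per_season(season_hours, sampling_minutes):
    """Number of samples in one seasonal cycle.

    season_hours: length of the seasonal cycle in hours.
    sampling_minutes: time between consecutive samples in minutes.
    """
    samples, remainder = divmod(season_hours * 60, sampling_minutes)
    if remainder:
        raise ValueError("season length is not a multiple of the sampling interval")
    return samples


# Hypothetical configurations, for illustration only.
print(periods_per_season(24, 60))        # daily cycle, hourly samples
print(periods_per_season(24, 10))        # daily cycle, 10-minute samples
print(periods_per_season(24 * 365, 10))  # annual cycle, 10-minute samples
```

Because the seasonal terms of a SARIMAX model reach back whole multiples of this period, moving from tens to thousands of samples per season sharply increases the size of the model being fitted, which is consistent with the computational burden noted above.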

3.3. LSTM

In the final step, an LSTM neural network was implemented to forecast wind speed and significant wave height from historical weather data. The LSTM model architecture was designed with an input layer that accommodates lagged observations of the chosen features, enabling it to capture temporal dependencies within the dataset. This design allows the model to learn from sequences of past observations, which is crucial for accurate time-series forecasting.

A sequential neural network model comprising two LSTM layers followed by a dense layer is employed in this study. The initial LSTM layer, configured with 100 units, processes the input sequences into a higher-dimensional space, outputting a sequence of vectors (of length 30 for each timestep), thereby retaining the sequential nature of the input data. This is crucial for maintaining the temporal characteristics of the input for subsequent processing. The second LSTM layer, consisting of 50 units, further refines these temporal features into a condensed form, outputting a single vector that encapsulates the learned temporal dynamics. Finally, a dense layer with a single unit produces the forecasted value from the processed features. The model was trained for ten epochs using the Adam optimizer to minimize the mean squared error loss. The model’s performance was evaluated using the root mean squared error (RMSE) between its predictions and the actual observations, providing a quantitative measure of forecasting accuracy.

Figure 16 presents the prediction results for wind speed compared with recorded wind speed data, with wind speed as the target variable and the air temperature gradient as an exogenous feature. This visual representation offers insights into the model’s predictive capabilities and its alignment with the observed data. With an RMSE value of 2.12 and the ability to closely replicate the data’s behavior, the LSTM model emerges as the top performer in wind speed prediction among all machine learning models examined in this study.
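The size of the described stack can be worked out with the standard parameter count for an LSTM layer with bias, 4 × (units × (inputs + units) + units). The sketch below assumes a 30-step input window (one possible reading of the length of 30 mentioned above) and two input features (the target plus one exogenous variable); both are assumptions, as are the function names.

```python
def lstm_params(input_dim, units):
    """Trainable weights of a standard LSTM layer with bias.

    Each of the four gates has an input kernel, a recurrent kernel,
    and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Trainable weights of a fully connected layer with bias."""
    return input_dim * units + units


def describe_stack(timesteps, n_features):
    """List (name, output shape, parameter count) for the three layers."""
    return [
        # First LSTM returns one 100-dimensional vector per timestep.
        ("lstm_1", (timesteps, 100), lstm_params(n_features, 100)),
        # Second LSTM returns only its final 50-dimensional state.
        ("lstm_2", (50,), lstm_params(100, 50)),
        # Dense layer maps the 50 features to a single forecast value.
        ("dense", (1,), dense_params(50, 1)),
    ]


layers = describe_stack(timesteps=30, n_features=2)
for name, shape, count in layers:
    print(name, shape, count)
print("total:", sum(count for _, _, count in layers))
```

Under these assumptions the model has on the order of 70,000 trainable weights, most of them in the recurrent kernels of the two LSTM layers.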

Moreover, the LSTM model exhibits exceptional performance in forecasting wave height, as demonstrated in Figure 17. When applied to wave height data as the target variable and wind gust speed as an exogenous variable, it achieves a low RMSE of 0.74. This outcome underscores the LSTM model’s effectiveness in capturing the complex dynamics inherent in wave height fluctuations and further solidifies its position as a superior choice for predicting both wind speed and significant wave height.

Indeed, the presence of a lag in predictions can be attributed to the effect of wind gust speed on the formation of ocean surface waves. The wind gust speed data were shifted backward by one day to address this lag. The lagged gust speed data were then added to the dataset and utilized in wave height prediction.
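Adding a lagged copy of a feature can be sketched as below. The sketch assumes that the shift aligns each wave height sample with the gust speed recorded one day earlier; the function names, the column labels, and the daily sampling in the usage example are hypothetical.

```python
def lag_feature(values, lag):
    """Shift a series so that position t holds the value from t - lag.

    The first `lag` positions have no earlier value and are set to None.
    """
    if lag <= 0:
        raise ValueError("lag must be positive")
    n = len(values)
    return [None] * min(lag, n) + list(values[: max(n - lag, 0)])


def build_rows(wvht, gust, samples_per_day):
    """Combine wave height, gust speed, and one-day-lagged gust speed.

    Rows without a lagged value (the first day) are dropped.
    """
    lagged = lag_feature(gust, samples_per_day)
    return [
        {"wvht": w, "gust": g, "gust_lag_1d": l}
        for w, g, l in zip(wvht, gust, lagged)
        if l is not None
    ]


# Hypothetical daily values, for illustration only.
rows = build_rows(
    wvht=[2.0, 2.5, 3.0], gust=[8.0, 10.0, 12.0], samples_per_day=1
)
print(rows)
```

The lagged column gives the model direct access to the earlier wind forcing, rather than requiring it to infer the delay from the input window alone.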

This modification led to a significant improvement in model accuracy, with the RMSE decreasing to 0.48. As illustrated in Figure 18, the predicted results exhibit an exceptional match to the historical data following this adjustment to the dataset. This underscores the importance of accounting for lag effects in wind-related phenomena when forecasting significant wave height and highlights the efficacy of incorporating lagged gust speed data to enhance the model’s predictive performance.

To forecast short-term wind speed, high-resolution data obtained prior to 1 January 2021 were used to train the LSTM model, with data from 1 January 2021 onward used for evaluation. The RMSE on the test set is 0.54 for wind speed forecasting. Figure 19 illustrates the model’s performance over 1- and 3-day prediction periods.

For further illustration, six random 1-day periods were selected from the test data, and the predicted and actual wind speed values for these periods are plotted in Figure 20 to demonstrate the accuracy of the model’s predictions. Figure 21 plots the history of the training and validation loss of the model against the number of epochs.

Using high-resolution data and the same data splitting point for training and evaluation as in the short-term wind speed prediction, the LSTM model achieved an RMSE of 0.27 for predicting the test set of data for wave height. Figure 22 showcases the model’s performance over 1- and 3-day prediction periods.

Additionally, six random 1-day periods were selected from the test data, and the predictions versus actual wave height values for these periods are plotted to further demonstrate the accuracy of the model’s predictions, as depicted in Figure 23. Figure 24 plots the history of training and validation loss of the model against the number of epochs.

4. Conclusions

This study explored the use of machine learning (ML) models for metocean data forecasting in the context of offshore wind turbine placement, employing three distinct models: Facebook Prophet, SARIMAX, and Long Short-Term Memory (LSTM). The research aims to enhance the precision of wind speed and significant wave height predictions, which are critical factors in the design, placement, operation, and maintenance of offshore wind farms. The analysis revealed that, among the three models, the LSTM model exhibited superior performance in predicting both wind speed and significant wave height. The model’s success is attributed to its architecture, which allows it to capture complex temporal dependencies and long-term patterns in erratic data such as wind and wave records. Integrating exogenous variables, such as atmospheric conditions for wind speed forecasts and gust speed for wave height forecasts, further enhanced the models’ accuracy, underscoring the value of incorporating external factors into predictive analyses for renewable energy applications.

The study’s findings contribute to the ongoing efforts to integrate renewable energy sources into the power grid. Accurate metocean data forecasts are crucial for minimizing operational costs, improving safety, and maximizing energy production efficiency. The demonstrated effectiveness of ML models suggests a promising direction for future research and application in renewable energy forecasting, providing a foundation for more reliable and efficient renewable energy systems. Moreover, this research highlights the importance of continuous innovation and adaptation of ML techniques in the renewable energy sector. As the global shift towards sustainable energy sources gains momentum, accurately forecasting environmental conditions becomes increasingly crucial.

Future work could explore integrating more diverse data sources, including real-time sensor data, applying emerging ML models such as deep reinforcement learning, and developing more sophisticated forecasting frameworks that adapt to different locations and environmental conditions. In conclusion, this study affirms the significant potential of machine learning models in improving the accuracy of metocean data forecasts for offshore wind turbine applications. The advancements in ML, particularly the application of LSTM models, pave the way for optimizing the performance and sustainability of renewable energy sources, contributing to global efforts in combating climate change and promoting environmental sustainability.

Author Contributions

Conceptualization, M.B. (Mohammad Barooni); methodology, M.B. (Masoumeh Bahrami) and D.V.S.; investigation, M.B. (Mohammad Barooni), S.G.T., M.B. (Masoumeh Bahrami) and P.S.; resources, D.V.S.; data curation, M.B. (Mohammad Barooni), S.G.T., M.B. (Masoumeh Bahrami) and P.S.; writing (original draft preparation), M.B. (Mohammad Barooni), S.G.T., M.B. (Masoumeh Bahrami) and P.S.; writing (review and editing), D.V.S.; visualization, M.B. (Mohammad Barooni) and D.V.S.; supervision, D.V.S.; funding acquisition, D.V.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the first author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Decomposition of wind speed (WSPD).

Figure 2. Decomposition of wave height (WVHT).

Figure 3. Scatter plots of wind speed (WSPD) against the potential exogenous variables.

Figure 4. Comparison of actual wind speed with Prophet model for the last 30 days of dataset.

Figure 5. Comparison of actual wind speed with Prophet model on entire dataset.

Figure 6. Comparison of actual significant wave height with Prophet model for the last 30 days of dataset.

Figure 7. Comparison of actual significant wave height with Prophet model on entire dataset.

Figure 8. Comparison of actual wind speed with Prophet model for (a) 1-day and (b) 3-day periods.

Figure 9. Comparison of actual significant wave height with Prophet model for (a) 1-day and (b) 3-day periods.

Figure 10. Diagnostic plots for assessing the quality of the SARIMAX model fit, including the analysis of standardized residuals, their distribution, normality through a Q-Q plot, and autocorrelation patterns.

Figure 11. Comparison of actual wind speed with SARIMAX model.

Figure 12. Comparison of actual significant wave height with SARIMAX model without exogenous variable.

Figure 13. Comparison of actual significant wave height with SARIMAX model with exogenous variable.

Figure 14. Comparison of actual wind speed with SARIMAX model for (a) 1-day and (b) 3-day periods.

Figure 15. Comparison of actual significant wave height with SARIMAX model with exogenous variable for (a) 1-day and (b) 3-day periods.

Figure 16. Actual wind speed vs. multi-variable LSTM model forecast.

Figure 17. Comparison of actual significant wave height with multi-variable LSTM model forecast with gust speed.

Figure 18. Comparison of actual significant wave height with multi-variable LSTM model forecast with lagged gust speed.

Figure 19. Comparison of actual wind speed with multi-variable LSTM model forecast for (a) 1-day and (b) 3-day periods.

Figure 20. Comparison of actual wind speed with multi-variable LSTM model forecast.

Figure 21. Training and validation loss for wind-speed prediction.

Figure 22. Comparison of actual significant wave height with multi-variable LSTM model forecast for (a) 1-day and (b) 3-day periods.

Figure 23. Comparison of actual significant wave height with multi-variable LSTM model forecast.

Figure 24. Training and validation loss for wave height prediction.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
