Article

Sequential Data-Driven Long-Term Weather Forecasting Models’ Performance Comparison for Improving Offshore Operation and Maintenance Operations

1 Centre for Life-Cycle Engineering and Management, Cranfield University, Bedford MK43 0AL, UK
2 Department of Engineering, University of Perugia, Via G. Duranti, 06125 Perugia, Italy
3 École des Ponts ParisTech (ENPC), Ministry for the Ecological Transition, 77420 Paris, France
4 Electronics and Electrical Engineering Department, University of Strathclyde, Glasgow, UK
* Author to whom correspondence should be addressed.
Energies 2022, 15(19), 7233; https://doi.org/10.3390/en15197233
Submission received: 25 August 2022 / Revised: 19 September 2022 / Accepted: 27 September 2022 / Published: 1 October 2022

Abstract

Offshore wind turbines (OWTs) are gaining popularity worldwide in comparison to onshore wind turbines, since they generate large amounts of electrical power and have thus become more financially viable in recent years. However, OWTs are costly, as they are vulnerable to damage from extremely high-speed winds, which affect operation and maintenance (O&M) activities (e.g., vessel access, repair, and downtime). Accurate weather forecasting therefore helps to optimise wind farm O&M, improve safety, and reduce risk for wind farm operators. Sequential data-driven models have recently found application in wind turbine problems; however, their use for improving offshore operation and maintenance through weather forecasting is still limited and needs further investigation. This paper fills this gap by proposing three sequential data-driven techniques, namely, long short-term memory (LSTM), bidirectional LSTM (BiLSTM) and gated recurrent units (GRU), for long-term weather forecasting. The proposed techniques are then compared to summarise their strengths and weaknesses with respect to long-term weather forecasting. Weather datasets (wind speed and wave height), which are intermittent over different time scales and reflect offshore weather conditions, were obtained from the FINO3 database and are used in this study for training and validation purposes. The results suggest that the proposed techniques can generate realistic and reliable long-term weather forecasts and respond well to seasonality. This is further validated by the calculated values of statistical performance metrics and uncertainty quantification.

1. Introduction

Due to eco-friendly climate change policies and significant cost reductions over the last decade, wind energy has experienced remarkable growth across the globe, driven by technological improvement and competition. For instance, according to IRENA, wind energy (combined offshore and onshore) could cover more than a third (35%) of total electricity needs, becoming the prominent clean energy source by 2050, and nearly 51 GW of new offshore wind capacity is forecast to be added worldwide by 2024 [1]. Compared with conventional fossil fuels, the levelised cost of energy (LCOE) of onshore wind turbines (WTs) is already competitive and is expected to decline further in the coming years, to USD 0.03–0.05/kWh by 2030 and USD 0.02–0.03/kWh by 2050 [1]. Offshore WTs, as the name suggests, are generally located in the high seas to capture the abundant wind resources, and reached a total installed capacity of 23 GW in 2018 [1]. According to the Wind Europe report Offshore Wind in Europe: Key Trends and Statistics 2019 [2], Europe installed a record 3.6 GW of new offshore WT capacity in 2019, with the UK recording the highest share of installed capacity (48.5%), followed by Germany (30.5%), Denmark (10%), and Belgium (10%). Compared to onshore wind farms, offshore wind attracts great interest across Europe, notably in Denmark, Germany, and the UK. However, offshore operations and reliability are becoming more challenging and costly for these countries [3]. According to current statistics [4,5,6], operation and maintenance (O&M) costs for offshore WTs are about 25–30% of the total lifecycle costs (including all expenditures associated with planned and unplanned repair tasks), and can increase further due to logistics and access requirements for routine and unplanned maintenance. Hence, many offshore WT operators and manufacturers are looking for ways to optimise maintenance strategies and reduce O&M costs in an unpredictable marine environment.

1.1. Factors Affecting Offshore O&M Activities

Offshore O&M activities are complex and involve a massive workforce and cost. Optimising O&M is therefore key to determining the most effective and efficient maintenance plan to reduce maintenance costs. However, designing an optimal maintenance strategy for offshore wind farms is greatly affected by weather conditions, which delay repair works and cause high uncertainty in power production [7]. Thus, it is crucial to develop effective models and efficient techniques to predict the weather conditions that govern access to offshore WTs for service and repairs, improve availability, and reduce O&M costs. These offshore O&M activities depend highly on wave height and hub-height wind speed; the two are therefore often considered together as weather datasets and are widely used in the literature [8]. For example, Rothkopf et al. [9] proposed a Markovian wave-height-based model which can be used in any Monte Carlo simulation for offshore operations. Anastasiou and Tsekos [10] suggested a technique for the persistence of different marine environmental conditions using Markov theory, assuming the distribution of the environmental conditions to be a stationary first-order Markov process. Their investigation suggests that the Markov chain effectively models marine environmental parameters and achieves better accuracy than the Kuwashima–Hogben algorithm [11]. Dinwoodie et al. [12] proposed a novel technique based on autoregressive (AR) models to represent met-ocean site conditions, with wave height and wind speed datasets used for training and validation purposes. Their results show the influence of weather on component reliability and on access thresholds at various existing sites, offering new insights into offshore WTs' O&M. Feuchtwang et al. [13] used a closed-form probabilistic model to quantify maintenance delays due to the sea state; access constraints and a Weibull distribution for the environment are included in their model, intended to characterise a given location's conditions. Offshore repair work is delayed either because of a lack of technicians or because turbines are forced to shut down due to significant faults. Harsh environmental conditions restrict access to the WT by service vehicles, where wave height is critical in determining whether access can be securely achieved. Dinwoodie et al. [14] present wave height limits for various vehicle types, such as helicopters and multiple vessel classes, and specify that the applicable wave height restriction depends on the length of time the service vehicle spends at sea. All these examples from the literature outline the importance of weather conditions and their impact on offshore O&M activities. Thus, many researchers seek cost-effective technologies to predict weather conditions and optimise offshore maintenance activities. These are briefly reviewed as follows.

1.2. Related Works

In recent years, data-driven techniques have found applications in optimising offshore O&M activities and analysing the impacts of weather conditions on turbine operations. Reder et al. [15] proposed a framework capable of correlating and analysing failure data and environmental conditions ahead of wind turbine component failures. They used supervised and unsupervised data-driven techniques to filter the weather and failure data; the Apriori rule-mining algorithm was then applied to understand the logical interconnections between the failure occurrences and the environmental data. Their results established the relationship between environmental parameters (e.g., relative humidity, ambient temperature, and wind speed) and the failures of five major WT components: the gearbox, generator, frequency converter, and pitch and yaw systems. Tautz-Weinert et al. [16] carried out a sensitivity analysis of a maintenance decision at a Spanish wind farm where a wind blade replacement was needed to avoid a catastrophic failure, considering the effect of environmental conditions on power performance. Their findings highlight the importance of weather seasonality, and of seasonality in the electricity market, in the O&M decision-making process. Juan et al. [17] developed an open-access O&M tool that estimates a given offshore wind farm's availability and helps optimise operational strategies; they used a discrete Markov model to assess weather conditions, with wave height and wind speed obtained from the FINO3 database used for training and validation. Hofmann et al. [18] simulated weather time series data using a Markov chain method, assuming a perfect weather prediction for the duration of the next shift. In [19], the Metocean module used re-sampling of wind and wave data and, in addition to the time series for wind speed and significant wave height, provided wind shear model parameters and operating limits for each type of equipment. Dalgic et al. [20] used an autoregressive multivariate model to produce synthetic weather data, using wind speeds at sea level for access controls and hub-height wind speeds for maintenance and development with a jack-up vessel; wave heights and wave periods were used to characterise the wave environment. Their model preserves site-specific weather stability, seasonality, and the association between wind intensity, wave height, and wavelength. Seyr et al. [21] presented a stochastic process for generating weather data that can be used as an alternative to traditional simulation-based generation models. More recently, probabilistic ARMA-GARCH approaches have been applied to wave height forecasting, which could support future offshore decision making as an alternative to other existing forecasting techniques [22].

1.3. Timeliness, Knowledge Gap and Novelty of the Proposed Work

According to World Energy Council findings, better weather forecasting could reduce operational costs by 3%, which has attracted the attention of numerous researchers and OWT operators seeking to create robust weather forecasting models to improve OWTs' O&M operations, availability, and dependability. Furthermore, a missed weather window can be extremely costly, especially during construction when specialised vessels are involved. Accurate weather forecasting can improve safety and reduce the risk for wind farm operators; such risks and costly circumstances can therefore be avoided by using reliable weather forecasting systems. Data-driven technologies such as machine learning and deep learning have started finding applications in weather forecasting. For example, Pandit et al. [23] proposed LSTM and Markov chain models for long-term weather forecasting and found that the Markov chain outperformed the LSTM; however, they did not incorporate different deep learning approaches in their model performance validation. Furthermore, [24] notes that there is little literature about data-driven models for long-term wind forecasting; in that study, several tree-based algorithms were employed. This article attempts to fill this gap by proposing three deep learning techniques (LSTM, BiLSTM and GRU) for long-term weather forecasting, to optimise the operation of OWTs with accurate weather forecasts. The rationale is to explore the use of deep learning networks, which have become widely used in different fields of wind energy application, such as short-term forecasting and sub-component fault diagnosis. The data-driven models selected in this study are then compared to suggest the most appropriate model for long-term weather forecasting in terms of accuracy and computational cost. This research also addresses the theoretical and practical limitations associated with implementing these data-driven models to help the offshore wind O&M decision-making process.
A framework for the proposed research is illustrated in Figure 1 and described as follows. The weather datasets (wind speed and wave height) extracted from the FINO database were first pre-processed together and were then split into training and validation sets. The proposed data-driven models were then trained and their effectiveness was then tested. A performance comparison is the final stage of the proposed methodology where proposed models are compared in terms of accuracy and uncertainty. Statistical performance error metrics and uncertainty assessment were undertaken to answer the following research question: which data-driven model is robust in estimating long-term weather to improve O&M activities of turbines?
The rest of this paper is organised as follows. First, the weather data and their pre-processing are described; then, the data-driven methodologies for long-term weather forecasting are proposed. Thereafter, a performance comparison of the proposed techniques is conducted to identify the most effective model and summarise the strengths and weaknesses of the models. Finally, the paper concludes with a discussion and possible future directions.

2. Datasets Preparation and Pre-Processing

This study refers to wind speed and wave height as weather data, since they are the main weather factors affecting offshore maintenance scheduling [17,21]. These weather datasets were obtained from FINO3, situated about 80 kilometres west of Sylt, in the midst of German offshore wind farms [25]. The FINO3 datasets comprise 3-hourly data points (wind speed at 106 m above sea level and wave height) covering the period from 1 January 2000 00:00 to 31 December 2010 21:00. Before being fed into the model, all samples were divided into a training set and a test set in a 70:30 ratio, giving 22,493 and 9636 data points, respectively (as illustrated in Table 1), for model training and testing purposes.
Raw data contain the univariate component of the metrics in time order, which is then converted into multivariate features. The datasets were normalised and features were created using feature engineering. The overall framework of the proposed techniques is explained by the flow chart in Figure 2, where raw weather datasets pass through the pre-processing stage (normalisation, conversion from univariate to multivariate, and data division) and the resulting datasets are used for model training and testing. The FINO3 weather datasets are pre-processed using the 'RobustScaler' class available in the scikit-learn Python machine learning library, which performs well in reducing the influence of outliers [26]. The scaling range is defined by the interquartile range (IQR), bounded by its default values. The datasets are then transformed using the scaler's 'fit' and 'transform' methods in Python and are used for model training and testing in the upcoming sections. Once the models were developed, their performance was tested in terms of performance metrics and uncertainty analysis.
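To make the pre-processing step concrete, the following is a minimal sketch (not the authors' code) of robust scaling and of converting a univariate series into supervised, multivariate samples; the file name, column name, and window length are illustrative assumptions.

```python
# Minimal pre-processing sketch: robust scaling and conversion of a univariate
# series into supervised (windowed) samples. File/column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.preprocessing import RobustScaler

def make_supervised(series: np.ndarray, n_lags: int = 8):
    """Turn a 1-D series into (samples, n_lags) inputs and 1-step-ahead targets."""
    X, y = [], []
    for i in range(len(series) - n_lags):
        X.append(series[i:i + n_lags])
        y.append(series[i + n_lags])
    return np.array(X), np.array(y)

df = pd.read_csv("fino3_weather.csv")            # hypothetical file name
scaler = RobustScaler()                          # scales by median and IQR, robust to outliers
scaled = scaler.fit_transform(df[["wind_speed"]]).ravel()

X, y = make_supervised(scaled, n_lags=8)         # 8-step window is an assumed choice
split = int(0.7 * len(X))                        # 70:30 chronological split, as in Table 1
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```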
Figure 3 and Figure 4 demonstrate the three-hourly time series of wind speed and wave height over the full measurement period, both exhibiting high variability. Examining these two parameters closely, we found that they are closely correlated with each other, as expected. This is further confirmed by the wave height versus wind speed scatter plot shown in Figure 5.
Augmented Dickey–Fuller (ADF) tests were performed on both the wind speed and wave height datasets; in both cases the null hypothesis was rejected, supporting the alternative hypothesis that both time series are stationary and have no unit root. The formulation of the test is as follows: consider the model in Equation (1):
$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_{p-1} \Delta y_{t-p+1} + \varepsilon_t \quad (1)$$
The null hypothesis corresponds to $\gamma = 0$, which means that the lagged value $y_{t-1}$ is irrelevant for predicting the change of the target at time $t$. If the null hypothesis can be rejected, the process is stationary, has no unit root and exhibits reversion to the mean; in other words, the lagged value is relevant for predicting the change of the output. After confirming that the time series are stationary, autocorrelation analysis is used to measure present values against past values and check whether they correlate. Because wave heights are highly volatile, it is essential to determine the internal correlation using a plot of the autocorrelation of the wave height time series by lag, called the autocorrelation function (ACF). The autocorrelation and partial autocorrelation functions with lag $k$ are defined in Equations (2) and (3):
$$R_k = \mathrm{E}\left[\, y_{t+k}\, y_t \,\right] \quad (2)$$
$$\varphi_k = \mathrm{E}\left[\, (y_{t+k} - \hat{y}_{t+k})(y_t - \hat{y}_t) \,\right] \quad (3)$$
where $\mathrm{E}$ stands for the expected value, and $\hat{y}_{t+k}$ and $\hat{y}_t$ are linear combinations of $y_{t+1}, y_{t+2}, \dots, y_{t+k-1}$ which minimise the mean squared error of $y_{t+k}$ and $y_t$, respectively. Figure 6 indicates that the wave height time series has substantial autocorrelation that persists amid high volatility; similar trends can be seen for wind speed, as shown in Figure 7.
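A short sketch of how the ADF test and the ACF/PACF plots can be reproduced with statsmodels, assuming the pre-processed series `scaled` from the earlier sketch:

```python
# Stationarity and autocorrelation checks, mirroring Equation (1) and
# Equations (2)-(3); 'scaled' comes from the pre-processing sketch above.
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

adf_stat, p_value, *_ = adfuller(scaled)
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Null hypothesis (unit root) rejected: series treated as stationary.")

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(scaled, lags=50, ax=ax1)    # autocorrelation function, cf. Eq. (2)
plot_pacf(scaled, lags=50, ax=ax2)   # partial autocorrelation, cf. Eq. (3)
plt.show()
```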

3. Methodology

In this section, the components and the architecture of the proposed sequential dense LSTM are introduced in detail. Here, wind speed and wave height (weather datasets) predictions are defined as predicting the future based on historical information.

3.1. Technical Challenges

Training recurrent neural networks suffers from the vanishing gradient and exploding gradient problems, which prevent information from several previous steps being connected to the present step. This problem is serious, as it creates a significant barrier to training large networks, and it was thoroughly researched by Hochreiter (1991) [27] and Bengio et al. (1994) [28]. Techniques such as reducing the number of layers, gradient clipping and careful weight initialisation are used to address exploding and vanishing gradients; however, these techniques can affect the accuracy of recurrent neural networks [29,30]. The LSTM, by contrast, can remember long-term dependencies by default using a 'memory cell', and it is used in this paper to develop robust techniques for long-term OWT weather prediction.
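As an aside, one of the mitigations mentioned above, gradient clipping, can be sketched in Keras by bounding the gradient norm on the optimiser; the threshold value here is an assumption, not the authors' setting.

```python
# Hypothetical illustration of gradient clipping: clipnorm rescales any
# gradient whose L2 norm exceeds 1.0, guarding against exploding gradients.
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=1e-2, clipnorm=1.0)  # clipnorm=1.0 is an assumed value
```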

3.2. Software Design

The experiments and model training were conducted on the Python platform, with the deep learning code implemented in Keras (using TensorFlow as the backend), where networks are represented as a series of layers via the Sequential class. The sequential model is a linear stack of layers that allows models to be created layer-by-layer for most problems; it is, however, limited to architectures in which each layer connects only to the previous and subsequent layers. The LSTM layer relies on the chosen input; thus, not all input data points undergo the training process. To train the LSTM algorithm, only selected and specified data points, which are valuable for the model-tuning process, are used, resulting in efficient data computation.
The dense layer is the standard and frequently used deeply connected neural network layer, also known as the fully connected layer. In a dense layer, each neuron is connected to the neurons of the next layer, ensuring that the model is fully connected [31]. With these inputs, the LSTM model discovers the mapping and correlation among the data and their prediction. A dropout layer [32] is used next to the LSTM layer to improve generalisation in the model. Between matrix multiplications, activation layers give a neural network the ability to model non-linear processes; long short-term memory networks use tanh and sigmoid as their internal activation functions. The activation layer applied to the model identifies which LSTM cells should be activated and whether the information received by a cell is significant, making the activation role in a deep neural network extremely important. To generalise the training carried out by the LSTM layer, a dropout rate of 0.75 was applied to it [33]. The optimiser selected was Adam (adaptive moment estimation), as it is computationally efficient and requires significantly less memory; in deep learning and machine learning, Adam is well suited to many non-convex optimisation problems. It is necessary to minimise the cost function by identifying optimised values for the weights and to ensure that the algorithm generalises well, so as to achieve accurate LSTM-based predictions. The learning rate is $10^{-2}$ for model optimisation, as this was found to be robust. The epoch count defines the number of times the learning algorithm works through the entire training dataset and was kept at 20 for training on both weather datasets; the batch size signifies the number of sequences trained together and was fixed at 16, and the hidden units were set at 150. The full hyper-parameter configuration is shown in Table 2. As deep learning models are computationally expensive, an early stopping methodology was adopted, with datasets fed to the model in batches of 32: the model runs for up to 100 epochs and stops training if performance does not improve for 10 consecutive epochs. Finally, the model output was de-normalised for analysing the results. Performance metrics such as RMSE and MAE were evaluated for each model (results are shown in the upcoming sections).
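The described architecture can be sketched as follows: a minimal reconstruction under the Table 2 hyper-parameters and the early-stopping description above, not the authors' exact code; the input window of 8 lags is an assumption carried over from the pre-processing sketch.

```python
# Sketch of the sequential dense LSTM: LSTM -> Dropout -> Dense, Adam optimiser,
# early stopping with a patience of 10 epochs as described in the text.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

model = Sequential([
    LSTM(150, input_shape=(8, 1)),   # (n_lags, n_features); 150 hidden units per Table 2
    Dropout(0.75),                   # generalisation layer noted in the text
    Dense(1),                        # fully connected output for a one-step-ahead forecast
])
model.compile(optimizer=Adam(learning_rate=1e-2), loss="mse", metrics=["mae"])

early_stop = EarlyStopping(monitor="val_loss", patience=10,   # stop after 10 stagnant epochs
                           restore_best_weights=True)
history = model.fit(X_train[..., None], y_train,              # add a feature axis
                    validation_split=0.1, epochs=100, batch_size=16,
                    callbacks=[early_stop], verbose=1)
```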

3.3. LSTM-Based Weather Forecasting Models

In this section, the components and architecture of the proposed LSTM are introduced, covering the training and validation process. A fully trained sequential dense LSTM model was used to generate future weather datasets to support OWT maintenance activities. The LSTM is a special form of recurrent neural network (RNN) that works well on sequence-based tasks with long-term dependencies. It has been applied to several sequence modelling tasks, such as natural language processing [34], speech recognition [35], and image generation [36], and has recently gained considerable attention in time-series research [37,38,39]. While a range of LSTM variants has been proposed in recent times, a large-scale study of LSTM variants showed that none of the alternatives significantly improves on the standard LSTM architecture, while adding complexity [27,40]. Therefore, for simplicity, this study implements the standard LSTM architecture as part of the proposed network structure, as described in this section.
The widely used LSTM network architecture was proposed by Sepp Hochreiter and Jürgen Schmidhuber [27] to address the gradient disappearance problem in practice. The hidden layer is the only component that differs between the standard LSTM and the RNN; its units are referred to as LSTM cells in the LSTM architecture, as shown in Figure 8 [41]. For a given time $t$, the LSTM cell has the layer input $x_t$ and output $h_t$, similar to an RNN. In the LSTM, each computational unit is linked not only to a hidden state $h_t$ but also to a cell input state $\tilde{c}_t$ and a cell output state $c_t$, together with the previous cell output $c_{t-1}$, in order to train and update the parameters. Because the cell state is carried from $c_{t-1}$ to $c_t$ with a constant gain of 1, errors can be propagated back through earlier steps without any vanishing gradient phenomena. Owing to this gated layout, the LSTM can manage long-term dependencies, allowing relevant information to pass along the LSTM network. In an LSTM there are three gates: an input gate, a forget gate and an output gate. The forget gate allows the LSTM to be an efficient and scalable model for many sequential data-related learning problems [42]. The cell's status can be changed via a gate that allows or prevents updating through the input gate; likewise, at the output gate of the LSTM unit, a gate regulates whether the state of the cell is transmitted. The most common version of the LSTM also utilises the forget gate to reset the cell status. At time $t$, the input gate, the forget gate, and the output gate are defined as $i_t$, $f_t$, and $o_t$, respectively. In Figure 8, the input gate, the forget gate, the output gate, and the input cell state are represented by coloured boxes, the pink circles are arithmetic operators, and the coloured rectangles are the gates in the LSTM cell.
The gates can be calculated using the following dynamic equations:
$$f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f) \quad (4)$$
$$i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i) \quad (5)$$
$$o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o) \quad (6)$$
$$\tilde{C}_t = \tanh(W_C x_t + U_C h_{t-1} + b_C) \quad (7)$$
where $W_f$, $W_i$, $W_o$ and $W_C$ are the weight matrices mapping the hidden layer input to the three gates and the input cell state, while $U_f$, $U_i$, $U_o$ and $U_C$ are the weight matrices connecting the previous cell output state to the three gates and the input cell state. $b_f$, $b_i$, $b_o$ and $b_C$ are four bias vectors. $\sigma_g$ is the gate activation function, which is normally the sigmoid function, and $\tanh$ is the hyperbolic tangent function. Based on the results of Equations (4)–(7), for a given time $t$, the cell output state, $C_t$, and the layer output, $h_t$, can be obtained as follows:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (8)$$
$$h_t = o_t \odot \tanh(C_t) \quad (9)$$
where $\odot$ denotes element-wise multiplication. A vector of all outputs, $Y_T = [h_{T-n}, \dots, h_{T-1}]$, is the final output of the LSTM layer. When estimating the weather datasets, we predict only the last element of the output vector, $h_{T-1}$. Thus, for example, the predicted wave height value $\hat{x}$ for the next time iteration $T$ is $\hat{x}_T = h_{T-1}$.
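For illustration, the cell Equations (4)–(9) can be implemented directly in NumPy; this is a demonstration sketch with randomly initialised weights, not a trained model.

```python
# One LSTM cell step implementing Equations (4)-(9); weight shapes are
# illustrative and initialised randomly for demonstration only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """W, U, b are dicts with keys 'f', 'i', 'o', 'c' holding weight arrays."""
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # forget gate, Eq. (4)
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # input gate, Eq. (5)
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # output gate, Eq. (6)
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # input cell state, Eq. (7)
    c_t = f_t * c_prev + i_t * c_tilde                          # cell output state, Eq. (8)
    h_t = o_t * np.tanh(c_t)                                    # layer output, Eq. (9)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 1, 150
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in)) for k in "fioc"}
U = {k: rng.normal(scale=0.1, size=(n_hid, n_hid)) for k in "fioc"}
b = {k: np.zeros(n_hid) for k in "fioc"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(np.array([0.5]), h, c, W, U, b)   # one step on a scalar input
```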
The LSTM methodology outlined above is applied to the datasets described in Section 2 and the software design described in Section 3.2 to train and test the proposed weather forecasting model using Python. In the LSTM, each time step of the test dataset is used one at a time. The results of the LSTM weather forecasting models, shown in Figure 9 and Figure 10, indicate that the LSTM-forecasted values are close to the tested values of wave height and wind speed and follow the expected pattern, despite slight differences in the case of wind speed prediction due to its stochastic behaviour. This is further confirmed by the error analysis of the LSTM-based weather forecasting models shown in Figure 11 and Figure 12.

3.4. BiLSTM-Based Weather Forecasting Models

Bidirectional LSTM (BiLSTM) is a recurrent neural network used primarily for natural language processing. Unlike the standard LSTM, the input flows in both directions, and the network is capable of using information coming from both sides. In essence, BiLSTM adds one extra LSTM layer, which reverses the direction of information flow: in the additional LSTM layer, the input sequence flows backwards. The outputs from the two LSTM layers are then combined in one of a variety of ways, including average, sum, multiplication, and concatenation, as illustrated in Figure 13. A theoretical description of the BiLSTM technique can be found in [42]. Using the pre-processed datasets, the BiLSTM model for weather forecasting was constructed; the results, shown in Figure 14 and Figure 15, respectively, suggest that the forecasts follow the measured values, and the error analyses in Figure 16 and Figure 17 further confirm this. It is worth noting that, despite combining LSTM layers from both directions, the BiLSTM accuracy is close to the LSTM results; the details are discussed in the upcoming section.
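A minimal Keras sketch of this variant (an assumption-based illustration, not the authors' exact code): wrapping the LSTM layer in the Bidirectional wrapper runs the sequence in both directions and, by default, concatenates the two outputs.

```python
# BiLSTM sketch under the same assumed settings as the LSTM model above.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dropout, Dense

bilstm = Sequential([
    Bidirectional(LSTM(150), input_shape=(8, 1)),  # forward + backward pass, outputs concatenated
    Dropout(0.75),
    Dense(1),
])
bilstm.compile(optimizer="adam", loss="mse")
```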

3.5. GRU-Based Weather Forecasting Models

The gated recurrent unit, or GRU for short, uses the same workflow as an RNN, but the operation of each GRU unit and its associated gates are different. The GRU uses an update gate and a reset gate to address the issues with the traditional RNN. Figure 18 illustrates the GRU operations, which are briefly described as follows. The update gate decides how much prior knowledge must be passed along to the next state. This is powerful, since the model can choose to copy all the prior information and thereby mitigate the risk of vanishing gradients.
The reset gate is utilised in the model to determine how much of the prior knowledge should be disregarded; in other words, it determines whether or not the previous hidden state is significant. When the reset gate activates, pertinent data from the previous time step are stored in the new memory content: the input vector and the previous hidden state are multiplied by their respective weights, the previous hidden state is multiplied element-by-element by the reset gate, and the sum of these terms is passed through a non-linear activation function to form the new sequence.
Detailed methodologies of the GRU can be found in [44]; these were used to develop weather forecasting models based on the filtered datasets described in Section 2, and the results are shown in Figure 19 and Figure 20, respectively. Furthermore, the model error analysis (shown in Figure 21 and Figure 22) suggests that the GRU is able to forecast wind speed and wave height following trends similar to the measured datasets.
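In a Keras pipeline such as the one sketched in Section 3.2, moving from LSTM to GRU amounts to swapping the recurrent layer; the following sketch keeps the other (assumed) settings unchanged.

```python
# GRU sketch: same assumed pipeline as the LSTM model, different recurrent cell.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dropout, Dense

gru = Sequential([
    GRU(150, input_shape=(8, 1)),  # two gates (update, reset): fewer parameters than LSTM
    Dropout(0.75),
    Dense(1),
])
gru.compile(optimizer="adam", loss="mse")
```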
As discussed in previous sections, both the LSTM and GRU techniques are designed to mitigate the vanishing gradient problem, which can significantly affect a time-series model's accuracy. There are, however, four main differences between these models, described as follows [44]:
The GRU has two gates, and LSTM has three gates.
The output gate seen in LSTM is absent from the GRU, which also has no separate internal memory (cell state).
In the GRU, the reset gate is applied directly to the previous hidden state, whereas in LSTM the roles played by the GRU's update gate are split between the input and forget gates.
GRU needs fewer training parameters and, as a result, uses less memory and runs faster than LSTM, whereas LSTM tends to be more accurate on larger datasets.

4. Performance Comparisons

The analyses above show that the proposed data-driven models for weather condition forecasting are successful. In this section, the performance of these techniques is compared. For simplicity and a better understanding of the comparative analysis, a sample of 500 weather data points was used to compare the performance of the LSTM, BiLSTM and GRU models, with the results shown in Figure 23 and Figure 24, respectively. All models show similar accuracy when forecasting wind speed and wave height; however, the GRU is faster but slightly less accurate than the LSTM and BiLSTM, as it uses fewer training parameters and hence consumes less memory. Therefore, for larger datasets, the LSTM and BiLSTM were found to be more suitable. This is further validated by the performance metrics described below.
Model performance was strongly influenced by the limited training time. In this study, training took only 25 min for each weather parameter of size 51 K, making the total time 50 min to tune the models for better sequence capturing. It is worth noting that, while training the models, the mini-batch gradient descent technique was used to optimise the mean squared error (MSE) with the RMSProp optimiser, and an early stopping mechanism was used to minimise over-fitting. This is reflected in the calculated RMSE values, which indicate good forecasting. It is also worth noting that, to perform effective long-term forecasting (typically over several years), parameters such as the lags, the number of hidden units, and the number of training iterations need to be tuned depending upon the size of the training datasets; otherwise, over-fitting results, which ultimately affects the forecasting accuracy of the LSTM model.

Using Performance Error Metrics

To measure the effectiveness of the proposed algorithms for offshore weather forecasting, popular performance error metrics (PEM), namely the mean absolute error (MAE), root mean square error (RMSE), percentage of coverage (%) and average width, are taken into consideration. The first two are defined mathematically as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \hat{x}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2}$$
where $n$ is the number of data points, $x_i$ is the measured value, and $\hat{x}_i$ is the predicted value.
Furthermore, an uncertainty analysis was carried out on the models. For this, two metrics are reported in Table 3: the percentage of coverage, defined as the percentage of the dataset falling within the prediction band, and the average width of the prediction band. The two metrics involve a trade-off, so the best model should have a higher percentage of coverage and, at the same time, a lower average width.
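The four reported metrics can be computed as in the following sketch; the forecast series and the prediction band used here are synthetic placeholders, since the study's own uncertainty-band construction is not detailed.

```python
# Sketch of the four performance metrics; y_test/y_pred and the band
# [lower, upper] are synthetic illustrations, not the study's data.
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def coverage_and_width(y, lower, upper):
    """Percentage of points inside the prediction band, and the band's average width."""
    covered = np.mean((y >= lower) & (y <= upper)) * 100.0
    return covered, np.mean(upper - lower)

rng = np.random.default_rng(1)
y_test = rng.normal(8.0, 2.0, 500)            # illustrative wind speed series (m/s)
y_pred = y_test + rng.normal(0.0, 1.4, 500)   # illustrative forecasts
lower, upper = y_pred - 2.8, y_pred + 2.8     # illustrative prediction band

print(f"MAE = {mae(y_test, y_pred):.2f}, RMSE = {rmse(y_test, y_pred):.2f}")
pc, aw = coverage_and_width(y_test, lower, upper)
print(f"Coverage = {pc:.2f}%, average width = {aw:.2f}")
```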
To remove outliers, all the compared models were trained and tested multiple times, and the reported results were averaged to reduce random errors. The performance error metrics for all models were calculated and are tabulated in Table 3. The numerical results suggest that the accuracy of the proposed models is similar, the only notable difference being in training the models.

5. Conclusions and Future Works

As more offshore assets are constructed in the coming years, the significance of weather for enhancing the accessibility and maintenance of offshore wind farms will only grow. In order to maximise operating lifetime and enhance offshore turbine availability, maintenance activities can be planned with the help of accurate weather forecasts; offshore operators will benefit in the long term from the consequent higher revenues.
Three models, namely, LSTM, BiLSTM and GRU, are proposed in this paper for weather forecasting and are compared against each other to suggest a robust weather forecasting model in terms of accuracy and computational cost. To train and test the models, weather datasets obtained from the FINO3 database were used. The experimental results suggest that the performance of these models is broadly the same when predicting both wind speed and wave height over the long term, as shown in Figure 23 and Figure 24 and tabulated in Table 3. The proposed techniques prove efficient at learning features from the weather datasets; however, training time (approximately 40 min) is still an issue, and it is expected to increase further depending on data size and on hyper-parameter (e.g., epoch and dense layer) optimisation and complexity. Despite this, the proposed methods are capable of making accurate weather predictions for datasets spanning several years.
This experiment used datasets with a 3 h resolution; however, the findings could differ if tested against datasets with a higher resolution, such as the 10 min resolution mostly used by the wind industry, which would make the training process much slower. Future studies will, therefore, involve putting the suggested solutions to the test with data points of varying resolution, and will also examine the impact of incorporating more layer units and of speeding up the algorithm by adjusting hyper-parameters.

Author Contributions

Conceptualization, R.P. and D.I.; methodology, R.P.; validation, R.P., D.A. and A.M.T.; formal analysis, R.P. and A.M.T.; investigation, D.A. and A.M.T.; resources, R.P.; data curation, R.P.; writing—original draft preparation, R.P.; writing—review and editing, D.I., A.M.T. and D.A.; visualization, R.P.; supervision, A.M.T. and D.I.; project administration, R.P. and D.I.; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. IRENA. Future of Wind: Deployment, Investment, Technology, Grid Integration and Socio-Economic Aspects (A Global Energy Transformation Paper); International Renewable Energy Agency: Abu Dhabi, United Arab Emirates, 2019. [Google Scholar]
  2. Ramírez, L.; Fraile, D.; Brindley, G. Offshore Wind in Europe: Key Trends and Statistics 2019; Wind Europe: Brussels, Belgium, 2020; p. 40. [Google Scholar]
  3. Carroll, J.; McDonald, A.; McMillan, D. Failure rate, repair time and unscheduled O&M cost analysis of offshore wind turbines. Wind Energy 2016, 19, 1107–1119. [Google Scholar]
  4. Tchakoua, P.; Wamkeue, R.; Ouhrouche, M.; Slaoui-Hasnaoui, F.; Tameghe, T.A.; Ekemb, G. Wind Turbine Condition Monitoring: State-of-the-Art Review, New Trends, and Future Challenges. Energies 2014, 7, 2595–2630. [Google Scholar] [CrossRef] [Green Version]
  5. Seyr, H.; Muskulus, M. Decision Support Models for Operations and Maintenance for Offshore Wind Farms: A Review. Appl. Sci. 2019, 9, 278. [Google Scholar] [CrossRef] [Green Version]
  6. Tavner, P. Offshore Wind Turbines: Reliability, Availability and Maintenance; Institution of Engineering and Technology: London, UK, 2012. [Google Scholar]
  7. Shafiee, M. Maintenance logistics organisation for offshore wind energy: Current progress and future perspectives. Renew. Energy 2015, 77, 182–193. [Google Scholar] [CrossRef]
  8. Astolfi, D.; Pandit, R.; Celesti, L.; Vedovelli, M.; Lombardi, A.; Terzi, L. Data-Driven Assessment of Wind Turbine Performance Decline with Age and Interpretation Based on Comparative Test Case Analysis. Sensors 2022, 22, 3180. [Google Scholar] [CrossRef] [PubMed]
  9. Rothkopf, M.H.; McCarron, J.K.; Fromovitz, S. A weather model for simulating offshore construction alternatives. Manag. Sci. 1974, 20, 1345–1349. [Google Scholar] [CrossRef]
  10. Anastasiou, K.; Tsekos, C. Persistence statistics of marine environmental parameters from Markov theory, Part 1: Analysis in discrete time. Appl. Ocean Res. 1996, 18, 187–199. [Google Scholar] [CrossRef]
  11. Kuwashima, S.; Hogben, N. The estimation of wave height and wind speed persistence statistics from cumulative probability distributions. Coast. Eng. 1986, 9, 563–590. [Google Scholar] [CrossRef]
  12. Dinwoodie, I.; McMillan, D.; Revie, M.; Lazakis, I.; Dalgic, Y. Development of a Combined Operational and Strategic Decision Support Model for Offshore Wind. Energy Proc. 2013, 35, 157–166. [Google Scholar] [CrossRef] [Green Version]
  13. Feuchtwang, J.; Infield, D. Offshore wind turbine maintenance access: A closed-form probabilistic method for calculating delays caused by sea-state. Wind Energy 2013, 16, 1049–1066. [Google Scholar] [CrossRef]
  14. Dinwoodie, I.A.; Catterson, V.M.; McMillan, D. Wave height forecasting to improve offshore access and maintenance scheduling. In Proceedings of the 2013 IEEE Power & Energy Society General Meeting, Vancouver, BC, Canada, 21–25 July 2013. [Google Scholar]
  15. Reder, M.; Yürüşen, N.Y.; Melero, J.J. Data-driven learning framework for associating weather conditions and wind turbine failures. Reliab. Eng. Syst. Saf. 2018, 169, 554–569. [Google Scholar] [CrossRef] [Green Version]
  16. Tautz-Weinert, J.; Yürüşen, N.Y.; Melero, J.J.; Watson, S.J. Sensitivity study of a wind farm maintenance decision—A performance and revenue analysis. Renew. Energy 2019, 132, 93–105. [Google Scholar] [CrossRef] [Green Version]
  17. Chiachio-Ruano, J.; Kolios, A.; Walgern, J.; Koukoura, S.; Pandit, R. Open O&M: Robust O&M Open Access Tool for Improving Operation and Maintenance of Offshore wind Turbines. In Proceedings of the 29th European Safety and Reliability Conference (ESREL), Hannover, Germany, 22–26 September 2019. [Google Scholar] [CrossRef]
  18. Hofmann, M.; Sperstad, I.B. NOWIcob—A tool for reducing the maintenance costs of offshore wind farms. Energy Proc. 2013, 35, 177–186. [Google Scholar] [CrossRef] [Green Version]
  19. Asgarpour, M.; van de Pieterman, R. O&M Cost Reduction of Offshore Wind Farms: A Novel Case Study; ECN: Petten, The Netherlands, 2014. [Google Scholar]
  20. Dalgic, Y.; Lazakis, I.; Dinwoodie, I.; McMillan, D.; Revie, M. Advanced logistics planning for offshore wind farm operation and maintenance activities. Ocean Eng. 2015, 101, 211–226. [Google Scholar] [CrossRef] [Green Version]
  21. Seyr, H.; Muskulus, M. Using a Langevin model for the simulation of environmental conditions in an offshore wind farm. J. Phys. Conf. Ser. 2018, 1104, 012023. [Google Scholar] [CrossRef]
  22. Taylor, J.W.; Jeon, J. Probabilistic forecasting of wave height for offshore wind turbine maintenance. Eur. J. Oper. Res. 2018, 267, 877–890. [Google Scholar] [CrossRef]
  23. Pandit, R.K.; Kolios, A.; Infield, D. Data-Driven weather forecasting models performance comparison for improving offshore wind turbine availability and maintenance. IET Renew. Power Gener. 2020, 14, 2386–2394. [Google Scholar] [CrossRef]
  24. Ahmadi, A.; Nabipour, M.; Mohammadi-Ivatloo, B.; Amani, A.M.; Rho, S.; Piran, M.J. Long-term wind power forecasting using tree-based learning algorithms. IEEE Access. 2020, 8, 151511–151522. [Google Scholar] [CrossRef]
  25. FINO3. Metrological masts datasets provided by BMU (Bundesministerium fuer Umwelt, Federal Ministry for the Environment, Nature Conservation and Nuclear Safety) and the PTJ (Projekttraeger Juelich, project executing organisation). Available online: https://www.fino3.de/en/ (accessed on 5 August 2022).
  26. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  27. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  28. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166. [Google Scholar] [CrossRef]
  29. Mangal, S.; Modak, R.; Joshi, P. LSTM Based Music Generation System. arXiv 2019, arXiv:1908.01080. [Google Scholar] [CrossRef]
  30. The Vanishing Exploding Gradient Problem in Deep Neural Networks. Available online: https://towardsdatascience.com/the-vanishing-exploding-gradient-problem-in-deep-neural-networks-191358470c11 (accessed on 21 July 2022).
  31. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. CVPR 2017, 1, 3. [Google Scholar]
  32. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
  33. Bjørndalen, O.M. Mido. Available online: https://github.com/olemb/mido (accessed on 21 August 2022).
  34. Mikolov, T.; Karafiat, M.; Burget, L.; Cernocky, J.; Khudanpur, S. Recurrent neural network based language model. Interspeech 2010, 2, 3. [Google Scholar]
  35. Graves, A.; Mohamed, A.R.; Hinton, G. Speech recognition with deep recurrent neural networks. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 6645–6649. [Google Scholar]
  36. Gregor, K.; Danihelka, I.; Graves, A.; Rezende, D.J.; Wierstra, D. DRAW: A recurrent neural network for image generation. arXiv 2015, arXiv:1502.04623v2. [Google Scholar]
  37. Salinas, D.; Flunkert, V.; Gasthaus, J. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. arXiv 2017, arXiv:1704.04110. [Google Scholar] [CrossRef]
  38. Bandara, K.; Bergmeir, C.; Smyl, S. Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach. Expert Syst. Appl. 2020, 140, 112896. [Google Scholar] [CrossRef] [Green Version]
  39. Hewamalage, H.; Bergmeir, C.; Bandara, K. Recurrent neural networks for time series forecasting: Current status and future directions. arXiv 2019, arXiv:1909.00590. [Google Scholar] [CrossRef]
  40. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. arXiv 2017. [Google Scholar] [CrossRef] [Green Version]
  41. Cui, Z.; Ke, R.; Pu, Z.; Wang, Y. Stacked bidirectional and unidirectional LSTM recurrent neural network for forecasting network-wide traffic state with missing values. Transp. Res. Part C Emerg. Technol. 2020, 118, 102674. [Google Scholar] [CrossRef]
  42. Xie, J.; Chen, B.; Gu, X.; Liang, F.; Xu, X. Self-Attention-Based BiLSTM Model for Short Text Fine-Grained Sentiment Classification. IEEE Access. 2019, 7, 180558–180570. [Google Scholar] [CrossRef]
  43. Differences Between Bidirectional and Unidirectional LSTM. Baeldung on Computer Science. Available online: https://www.baeldung.com/cs/bidirectional-vs-unidirectional-lstm (accessed on 5 August 2022).
  44. LSTM Vs. GRU in Recurrent Neural Network: A Comparative Study. Available online: analyticsindiamag.com (accessed on 5 August 2022).
Figure 1. Framework of the proposed data-driven weather forecasting models.
Figure 2. Framework of the proposed data-driven long-term weather forecasting models.
Figure 3. Raw measured wind speed datasets.
Figure 4. Raw measured wave height.
Figure 5. Scatter plot between wave height and wind speed.
Figure 6. Wave height autocorrelation and partial autocorrelation.
Figure 7. Wind speed autocorrelation and partial autocorrelation.
Figure 8. Overview of LSTM architecture [40].
Figure 9. LSTM wind speed prediction.
Figure 10. LSTM wave height prediction.
Figure 11. LSTM-based wind speed model error analysis.
Figure 12. LSTM-based wave height model error analysis.
Figure 13. BiLSTM structure overview [42].
Figure 14. BiLSTM wind speed prediction.
Figure 15. BiLSTM wave height prediction.
Figure 16. BiLSTM-based wind speed model error analysis.
Figure 17. BiLSTM-based wave height model error analysis.
Figure 18. GRU structure overview [43].
Figure 19. GRU wind speed prediction.
Figure 20. GRU wave height prediction.
Figure 21. GRU-based wind speed model error analysis.
Figure 22. GRU-based wave height model error analysis.
Figure 23. Performance comparison: wind speed.
Figure 24. Performance comparison: wave height.
Table 1. FINO3 weather datasets' descriptions.

Start Timestamp | End Timestamp | Total Measured Data | Training Data | Testing Data
1 January 2000 | 31 December 2010 | 32,145 | 22,493 | 9636

Table 2. Hyper-parameters configuration for weather datasets.

Model | Learning Rate | Batch Size | Epochs | Dense Hidden Units
Weather training datasets | $10^{-2}$ | 16 | 20 | 150

Table 3. Performance error metrics calculated values (RMSE and MAE in m/s for wind speed and in m for wave height).

Method | Metric | RMSE | MAE | Percentage of Coverage (%) | Average Width
LSTM | Wind speed | 1.43 | 1.06 | 94.40 | 5.64
LSTM | Wave height | 0.18 | 0.12 | 94.82 | 0.73
BiLSTM | Wind speed | 1.43 | 1.05 | 94.47 | 5.63
BiLSTM | Wave height | 0.19 | 0.13 | 94.71 | 0.75
GRU | Wind speed | 1.43 | 1.06 | 94.41 | 5.64
GRU | Wave height | 0.18 | 0.12 | 94.72 | 0.73
