Maximum Individual Wave Height Prediction Using Different Machine Learning Techniques with Data Collected from a Buoy Located in Bilbao (Bay of Biscay)
Abstract
1. Introduction
2. Related Works
3. Data
3.1. Location and Attributes
- Time:
  - Date (GMT), date and hour.
- Waves:
  - Significant height (m).
  - Mean period (s).
  - Peak period (s).
  - Maximum height (m).
  - Period associated with maximum height (s).
  - Mean direction (degrees).
  - Mean direction at peak energy (degrees).
  - Directional spread at peak energy (degrees).
- Meteorology (data recorded at 3 m above the surface):
  - Average wind speed (m/s).
  - Average wind direction (degrees).
  - Air temperature.
  - Atmospheric pressure, p (hPa).
- Oceanography (data recorded at 3 m below the surface):
  - Average current velocity (cm/s).
  - Average current direction (degrees).
  - Salinity (psu).
3.2. Data Preprocessing
- Significant wave height.
- Mean wave period.
- Peak wave period.
- Period associated with the maximum wave height.
- Wind speed (U).
- Mean wave period.
- Directional spread at the energy peak.
- Period associated with the maximum wave height.
- Mean wave direction.
- Peak wave period.
- Predictive Mean Matching (PMM): This method identifies the observed values closest to the predictive mean, randomly selects one, and assigns it to the missing entry.
- Classification and Regression Trees (CART):
  - Constructs decision trees via recursive partitioning.
  - For each missing value, determines the terminal node of the fitted tree.
  - Assigns the observed value through random sampling within the terminal node.
- LASSO Regression with Normalization (LASSO.NORM): This method combines LASSO linear regression with bootstrapping to impute univariate missing data that are assumed to be normally distributed.
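The matching step of PMM described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the imputation code used in the study: it uses a single predictor, a plain least-squares fit, and a hypothetical donor-pool size `k`, and it omits the random draw of regression parameters that full multiple-imputation implementations typically add. The function names and the toy values are invented for the example.

```python
import random

def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def pmm_impute(x, y, k=5, seed=0):
    """Fill None entries of y by predictive mean matching on predictor x."""
    rng = random.Random(seed)
    observed = [i for i, yi in enumerate(y) if yi is not None]
    b0, b1 = fit_simple_ols([x[i] for i in observed], [y[i] for i in observed])
    predicted = [b0 + b1 * xi for xi in x]  # predictive mean for every row
    filled = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            # Donors: the k observed rows whose predictive mean is closest.
            donors = sorted(observed, key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            # Copy the observed value of one randomly chosen donor.
            filled[i] = y[rng.choice(donors)]
    return filled

# Toy values (made up for illustration): predictor and a target with gaps.
predictor = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
target = [1.6, 2.4, None, 4.1, None, 5.5]
filled = pmm_impute(predictor, target, k=2)
```

Because every imputed entry is copied from an observed row, the filled series never contains values outside the observed range, which is the main practical appeal of PMM over plugging in the regression prediction directly.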
4. Methodology
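- Linear Regression (LM): Linear regression models the relationship between a dependent (target) variable and one or more independent (predictor) variables as a linear equation: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon$, where $y$ is the target, $x_1, \dots, x_p$ are the predictors, $\beta_0, \dots, \beta_p$ are the coefficients to be estimated, and $\varepsilon$ is the error term.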
- Support Vector Regression (SVR): SVR is an extension of the support vector machine (SVM) to regression. It fits a function, typically in a high-dimensional feature space induced by a kernel, that ignores prediction errors inside a defined tolerance margin and penalizes only those outside it. With a nonlinear kernel, it is particularly effective for complex nonlinear datasets.
- Long Short-Term Memory (LSTM): LSTM is a recurrent neural network (RNN) architecture designed to model data sequences while maintaining long-term information through a memory cell structure. It uses three gates (input, forget, and output) to regulate the flow of information, allowing it to learn long-term dependencies while mitigating the vanishing gradient problem.
- Gated Recurrent Units (GRU): GRU is another variant of the recurrent neural network architecture. Similar to LSTM, it is designed to handle long-term dependencies in data sequences. Unlike LSTM, however, GRU has a simpler structure, using only two gates (reset and update) to control the flow of information, which makes it more computationally efficient without sacrificing the ability to model temporal dependencies.
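To make the two-gate structure of the GRU concrete, the following sketch performs one recurrent update for a scalar input and a scalar hidden state. It is an illustration only, not the network trained in this work: the parameter names (`wz`, `uz`, and so on) and their values are hypothetical, real models use weight matrices and learn them from data, and the sign convention for the update gate varies between references.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h_prev, p):
    """One GRU update for a scalar input x and scalar hidden state h_prev."""
    # Update gate: how much of the new candidate replaces the old state.
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])
    # Reset gate: how much of the old state feeds into the candidate.
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])
    # Candidate state computed from the input and the reset-scaled old state.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Blend old state and candidate (one common convention).
    return (1.0 - z) * h_prev + z * h_cand

def run_gru(sequence, p, h0=0.0):
    """Feed a sequence through the cell and return the final hidden state."""
    h = h0
    for x in sequence:
        h = gru_step(x, h, p)
    return h

# Hypothetical parameter values, chosen only so the example runs.
params = {"wz": 0.5, "uz": -0.3, "bz": 0.0,
          "wr": 0.8, "ur": 0.1, "br": 0.0,
          "wh": 1.2, "uh": 0.7, "bh": 0.0}
final_state = run_gru([0.5, 1.2, 0.9], params)
```

An LSTM cell follows the same pattern but keeps a separate memory cell and adds a third gate, which is where its extra parameters and computation come from.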
5. Proposed Experiment
5.1. Hmax Predictions
- Method 1: The training set consisted of the first three months of the year, while the test set comprised the remaining nine months. Predictions were generated iteratively; the model was initially trained on the first three months, after which it predicted the following month. After each prediction, the actual values of the predicted month were added to the training set and the process was repeated until predictions for all months in the test set were completed.
- Method 2: The training set was defined as the first 1000 observations of the year, while the remaining observations were assigned to the test set. Predictions were generated for the 50 observations immediately following the training set. After obtaining these predictions, the actual values corresponding to the predictions were incorporated into the training set and iteration continued until predictions were generated for the entire test set.
- Method 3: The training set comprised the first two weeks of the year, while the remaining data were allocated to the test set. Predictions were made for three consecutive days following the initial two-week training period. After obtaining these predictions, the actual values corresponding to the predictions were incorporated into the training set and iteration continued until all test set predictions were completed.
- Method 4: This method was similar to Method 3 but introduced an additional adjustment. After generating predictions for three consecutive days, the actual values corresponding to the predictions were added to the training set, while the three oldest days were removed. This iterative process continued until all predictions for the test set were completed.
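The four methods differ only in the size of the initial training set, the length of the prediction window, and whether old observations are discarded. A small index generator makes that distinction explicit. This is a sketch with hypothetical names; the conversion of "two weeks" and "three days" into 336 and 72 observations assumes an uninterrupted hourly series, which may not hold for the actual buoy record.

```python
def walk_forward_windows(n_obs, initial_train, horizon, sliding=False):
    """Yield (train_start, train_end, test_end) index triples, ends exclusive.

    The training set is rows train_start..train_end-1 and the prediction
    window is rows train_end..test_end-1.
    """
    train_start, train_end = 0, initial_train
    while train_end < n_obs:
        test_end = min(train_end + horizon, n_obs)
        yield train_start, train_end, test_end
        if sliding:
            # Method 4 style: drop as many old rows as were just added.
            train_start += test_end - train_end
        # The predicted window's actual values join the training set.
        train_end = test_end

# Method 2 style: 1000 initial observations, 50-observation windows.
method2 = list(walk_forward_windows(8760, 1000, 50))

# Methods 3 and 4 style, assuming hourly data: 14 days = 336, 3 days = 72.
method3 = list(walk_forward_windows(8760, 336, 72))
method4 = list(walk_forward_windows(8760, 336, 72, sliding=True))
```

With `sliding=False` the training set grows at every step (Methods 1 to 3); with `sliding=True` its length stays fixed, so the model only ever sees the most recent conditions.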
Algorithm 1 General algorithm
Input: data (cleaned and normalized)
Goal: predict the target variable
data_train ← initial training set (each method has its own)
data_test ← test set (each method has its own)
n_rows ← number of rows of data_test
increment ← number of observations in the prediction window
for i in seq(1, n_rows, by = increment)
    window ← rows i to min(i + increment - 1, n_rows) of data_test
    model ← regressor fitted on data_train
    prediction ← prediction of model for window
    data_train ← rbind(data_train, window)  (actual values; Method 4 also removes the oldest rows)
end for
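Algorithm 1 can be turned into a short runnable loop. The sketch below is one possible reading of it, not the authors' implementation: a single-predictor least-squares line stands in for the regressor, the series are synthetic (a made-up proportional relation plus Gaussian noise), and the error is summarized as one MAE value per prediction window.

```python
import random
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def walk_forward_mae(x, y, initial_train, increment):
    """Refit on all data seen so far, predict the next window, then add
    that window's actual values to the training set."""
    train_x, train_y = list(x[:initial_train]), list(y[:initial_train])
    window_mae = []
    for start in range(initial_train, len(x), increment):
        test_x = x[start:start + increment]
        test_y = y[start:start + increment]
        b0, b1 = fit_line(train_x, train_y)
        preds = [b0 + b1 * v for v in test_x]
        window_mae.append(mean(abs(p - t) for p, t in zip(preds, test_y)))
        train_x += test_x  # actual observations, not the predictions
        train_y += test_y
    return window_mae

# Synthetic stand-in data: the factor 1.6 and the noise level are
# arbitrary choices for the example, not values taken from the buoy.
rng = random.Random(42)
predictor = [rng.uniform(0.5, 4.0) for _ in range(300)]
target = [1.6 * v + rng.gauss(0.0, 0.1) for v in predictor]
scores = walk_forward_mae(predictor, target, initial_train=100, increment=50)
```

Averaging `scores` and taking their standard deviation gives a mean and a spread per method, which is one way a summary with an accompanying "sd" row could be produced.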
5.2. Hmax/Hs Predictions
6. Results
- R-Squared ($R^2$): The coefficient of determination, often referred to as R-squared, is the proportion of the total variance in the dependent variable that can be explained by the estimated regression model. A result close to 1 means that the model is able to explain much of the variation in the data, while lower values mean that it lacks this ability and does not fit the data well. The R-squared value is calculated using the following formula: $R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$, where $y_i$ are the actual values, $\hat{y}_i$ the predicted values, $\bar{y}$ the mean of the actual values, and $n$ the number of observations.
- Mean Absolute Error (MAE): The MAE is used to quantify the prediction accuracy. It measures the average magnitude of the errors in a set of predictions, providing a numerical value that represents how far the predictions are from the actual values: $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$. An MAE value close to 0 indicates that the model is more accurate, as there is not much difference between the actual and predicted values.
- Mean Square Error (MSE): The MSE measures the average of the squares of the errors, i.e., the average squared difference between the estimated values and the actual values: $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.
- Root Mean Square Error (RMSE): The RMSE is another metric used to measure the accuracy of predictions. Because the errors are squared before averaging, larger errors receive more weight, and the square root returns the result to the units of the target variable. The ideal value is 0. The formula for RMSE is $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$.
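The four metrics can be computed together from a vector of actual values and a vector of predictions. The sketch below uses the standard textbook definitions; whether the reported R-squared follows the residual-based definition used here or a squared-correlation variant is an assumption, and the function name and the sample numbers are invented for the example.

```python
import math

def regression_metrics(actual, predicted):
    """Return R-squared, MAE, MSE, and RMSE for paired value lists."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }

# Made-up values, chosen so the results are easy to verify by hand.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])
# r2 = 0.9, mae = 0.25, mse = 0.125, rmse = sqrt(0.125)
```

Note that with this definition R-squared can become negative when the predictions are worse than simply predicting the mean of the actual values.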
6.1. Hmax Predictions
6.2. Hmax/Hs Predictions
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
ANFIS | Adaptive Neuro-Fuzzy Inference System |
CART | Classification and Regression Trees |
GHG | Greenhouse Gases |
LASSO.NORM | Lasso Linear Regression |
LM | Linear Regression Modeling |
ML | Machine Learning |
MAE | Mean Absolute Error |
MICE | Multiple Imputation using Chained Equations |
MRE | Marine Renewable Energy |
MSE | Mean Square Error |
OTEC | Ocean Thermal Energy Conversion |
PMM | Predictive Mean Matching |
PPCA | Probabilistic Principal Component Analysis |
R-Squared | Coefficient of Determination |
RF | Random Forest |
RMSE | Root Mean Square Error |
SVR | Support Vector Regression |
1 | https://www.appa.es/appa-marina/que-es-la-energia-marina/?cn-reloaded=1, accessed on 5 October 2024; https://www.irena.org/Publications/2021/Jul/Offshore-Renewables-An-Action-Agenda-for-Deployment, accessed on 5 October 2024 |
2 | www.puertos.es, accessed on 15 September 2024 |
Buoy | Longitude | Latitude | Depth | Cadence | Model | Years |
---|---|---|---|---|---|---|
Bilbao-Biscay | 3° W | 43° N | 870 m | 60 min | SeaWatch | 1990–2024 |

Model | Method 1 R2 | Method 1 MAE | Method 1 MSE | Method 1 RMSE | Method 2 R2 | Method 2 MAE | Method 2 MSE | Method 2 RMSE |
---|---|---|---|---|---|---|---|---|
LM | 0.8932 | 0.0967 | 0.0196 | 0.1335 | 0.6202 | 0.1293 | 0.0536 | 0.1702 |
sd | 0.0215 | 0.0338 | 0.0131 | 0.0453 | 0.2474 | 0.1156 | 0.1200 | 0.1578 |
SVR | 0.9006 | 0.0916 | 0.0185 | 0.1286 | 0.6339 | 0.1246 | 0.0534 | 0.1659 |
sd | 0.0222 | 0.0348 | 0.0133 | 0.0473 | 0.2463 | 0.1168 | 0.1239 | 0.1617 |
LSTM | 0.8162 | 0.1433 | 0.0389 | 0.1884 | 0.3858 | 0.1804 | 0.0786 | 0.2374 |
sd | 0.0790 | 0.0518 | 0.0234 | 0.0625 | 0.2764 | 0.1130 | 0.1136 | 0.1494 |
GRU | 0.8124 | 0.1435 | 0.0397 | 0.1892 | 0.3906 | 0.1841 | 0.0796 | 0.2397 |
sd | 0.0792 | 0.0553 | 0.0256 | 0.0662 | 0.2777 | 0.1138 | 0.1131 | 0.1493 |

Model | Method 3 R2 | Method 3 MAE | Method 3 MSE | Method 3 RMSE | Method 4 R2 | Method 4 MAE | Method 4 MSE | Method 4 RMSE |
---|---|---|---|---|---|---|---|---|
LM | 0.6748 | 0.1539 | 0.0728 | 0.2046 | 0.6801 | 0.1497 | 0.0647 | 0.1966 |
sd | 0.2367 | 0.1290 | 0.1431 | 0.1766 | 0.2471 | 0.1216 | 0.1182 | 0.1619 |
SVR | 0.6884 | 0.1484 | 0.0725 | 0.1997 | 0.6842 | 0.1455 | 0.0646 | 0.1947 |
sd | 0.2376 | 0.1309 | 0.1476 | 0.1813 | 0.2427 | 0.1189 | 0.1227 | 0.1642 |
LSTM | 0.7047 | 0.3078 | 0.1862 | 0.4099 | 0.6896 | 0.3547 | 0.2687 | 0.4699 |
sd | 0.1778 | 0.1066 | 0.1202 | 0.1429 | 0.1797 | 0.1929 | 0.2728 | 0.2323 |
GRU | 0.4640 | 0.2066 | 0.1028 | 0.2746 | 0.6960 | 0.3664 | 0.2956 | 0.4802 |
sd | 0.2902 | 0.1242 | 0.1366 | 0.1652 | 0.1729 | 0.2291 | 0.3569 | 0.2703 |

Method 1 | Method 2 | Method 3 | Method 4 | |
---|---|---|---|---|
R2 | 1.1 × | 2.2 × | 7.8 × | 6.8 × |
MAE | 1.5 × | 2.2 × | 8.9 × | 9.4 × |
MSE | 3.0 × | 2.2 × | 6.3 × | 3.5 × |
RMSE | 3.0 × | 2.2 × | 6.3 × | 9.07 × |

Model | Method 1 R2 | Method 1 MAE | Method 1 MSE | Method 1 RMSE | Method 2 R2 | Method 2 MAE | Method 2 MSE | Method 2 RMSE |
---|---|---|---|---|---|---|---|---|
LM | 0.0108 | 0.1170 | 0.0229 | 0.1516 | 0.0278 | 0.1188 | 0.0259 | 0.1532 |
sd | 0.0099 | 0.0029 | 0.0013 | 0.0043 | 0.0367 | 0.0262 | 0.0380 | 0.0489 |
SVR | 0.0114 | 0.1166 | 0.0235 | 0.1533 | 0.0283 | 0.1182 | 0.0264 | 0.1547 |
sd | 0.0099 | 0.0030 | 0.0013 | 0.0043 | 0.0382 | 0.0262 | 0.0383 | 0.0492 |

Model | Method 3 R2 | Method 3 MAE | Method 3 MSE | Method 3 RMSE | Method 4 R2 | Method 4 MAE | Method 4 MSE | Method 4 RMSE |
---|---|---|---|---|---|---|---|---|
LM | 0.0221 | 0.1184 | 0.0255 | 0.1535 | 0.0235 | 0.1205 | 0.0260 | 0.1552 |
sd | 0.0303 | 0.0219 | 0.0305 | 0.0444 | 0.0308 | 0.0215 | 0.0307 | 0.0440 |
SVR | 0.0221 | 0.1179 | 0.0260 | 0.1549 | 0.0229 | 0.1189 | 0.0262 | 0.1557 |
sd | 0.0306 | 0.0219 | 0.0307 | 0.0447 | 0.0219 | 0.0216 | 0.0304 | 0.0439 |

Method 1 | Method 2 | Method 3 | Method 4 | |
---|---|---|---|---|
R2 | 0.8253 | 0.8946 | 0.7515 | 0.9070 |
MAE | 0.6911 | 0.6726 | 0.6341 | 0.3464 |
MSE | 0.3099 | 0.5535 | 0.4136 | 0.7876 |
RMSE | 0.3099 | 0.5535 | 0.5391 | 0.7876 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Porlan-Ferrando, L.; Nuñez-Gonzalez, J.D.; Ulazia Manterola, A.; Martinez-Iturricastillo, N.; Ringwood, J.V. Maximum Individual Wave Height Prediction Using Different Machine Learning Techniques with Data Collected from a Buoy Located in Bilbao (Bay of Biscay). J. Mar. Sci. Eng. 2025, 13, 625. https://doi.org/10.3390/jmse13040625