1. Introduction
The EU has the largest maritime space in the world that can be exploited for wind energy. Consequently, to reduce net greenhouse gas emissions, the EU plans to expand its current offshore wind capacity of 12 GW to 60 GW by 2030 and 300 GW by 2050 [1,2]. Similarly, the United Kingdom aims to reach 40 GW of offshore wind capacity by 2030 under its Ten-Point Plan [3].
The main potential for such offshore wind farms in the EU and UK lies in the North Sea, owing to relatively steady winds and shallow sea depths that allow for ground-based installations (see, e.g., [1,4,5] for recent studies). Accordingly, the vast majority of the 29 largest European offshore wind farms displayed in Figure 1 are located in the North Sea (further details in Section 2). In this context, data from wind farms are of particular interest for quantitative analyses related to the growing offshore capacities at these and nearby locations. Such data could feed into simulations, forecasts, models, or studies of the overall energy system. However, to the best of our knowledge, no such data are freely available to the general public.
In this paper, we provide an hourly data set covering the last 40 years for the 29 largest offshore wind farms in Europe. We include wind speeds at hub height based on meteorological reanalysis data and further weather parameters, as well as specifications of currently installed and planned wind turbines with parametric forms of their power curves. Furthermore, we provide synthetic time series of produced power over the considered horizon, i.e., the power that would have been produced in the past at the respective locations with the currently installed capacity and technology. These synthetic time series consider neither interactions between the installed turbines that affect their respective output, such as wake effects, nor constraints in the connected power grids. The data set mainly focuses on the meteorological effects on electricity production. However, we include variables such as wind direction and surface roughness, which extend the potential of the data set for energy production analyses in future studies.
Studies to which data sets similar to ours could contribute can be found in the areas of economic analysis (see, e.g., [6,7,8,9,10]), grid integration ([11,12,13,14,15,16,17,18,19]), and climate change analysis [20,21,22]. Among these papers, the authors of [19] implicitly use a data set similar to ours to analyze future developments of an offshore (and energy) transmission grid in 2050. They consider 16 wind farms in the North Sea region and use a single reference wind turbine to calculate future feed-in data from meteorological reanalysis data, taking wake effects into account. A recent publication providing a short overview of calculating feed-in data from reanalysis data is [23] (see also the references therein). Further contributions methodologically related to ours include [24], which uses, among other data, high-resolution geo-spatial wind speed data to analyze renewable energy potentials in the European Union. In addition, the work of [25] synthetically calculates Swedish wind power production based on a single reference turbine for three years. With respect to modeling spatial and temporal dependency structures of wind power production, ref. [26] analyzed northern European countries and waters. For onshore locations, recent studies related to ours include, e.g., [27,28,29,30,31].
The data set provided in our paper can be of particular use and interest to researchers and industry experts as well as policymakers. It is prepared in a structured way so that a broad readership can analyze the data without extensive computational skills or effort. To better understand the data and the potential insights derived from it, we give a first analysis. We include descriptive statistics and aggregate production figures over various time horizons for each wind farm as well as for total electricity production. This includes average production numbers, full load hours, and measures of site-specific volatility and intermittency. By calculating these variables, we obtain an overview of all wind farms and can compare production characteristics at different locations. Since offshore wind energy is intended to play an essential role in the future European power system, we further analyze the dependencies of wind speed and electricity production between the considered locations. Such an analysis is relevant for reducing wind power variability by aggregating production from geographically diverse locations. While highly correlated locations lead to high volatility and intermittency in the overall supply, low correlations balance the overall output. Interestingly, the correlations of offshore wind farms over distance behave similarly to those found in onshore studies [32,33,34] but are higher for neighboring locations.
The remainder of the paper is structured as follows. The data published with this paper, including weather data, information about wind farms, and derived wind power generation, are presented in Section 2. We explain the data generation step by step so that readers can keep the data set updated and extend it according to their needs. Section 3 presents the results of the analysis of the 29 wind farms. It is divided into a descriptive analysis, both for each wind farm and for all farms combined, and a dependence analysis. We conclude in Section 4.
2. Data
We provide 40 years of hourly wind and production data for Europe’s 29 largest offshore wind farms in terms of installed capacity. The data include wind farms that are still under construction but will begin commercial operation in the next three years (by 2024). The locations of these farms are shown in Figure 1 and listed with technical details in Table 1. Our data set consists of hourly wind speeds and synthetic hourly power generation signals for each site. Wind speeds were determined by matching the wind farms’ locations to the nearest grid point in the ERA5 data set [35] and transforming the wind speeds at 100 m to the hub heights of the turbines. Afterwards, the wind speeds were converted to production signals using the power transfer function of each turbine, which we also provide in this paper. All steps are explained in detail below.
Details about the considered wind farms in Table 1 include their approximate location, hub height (Hub (m)), turbine types, resulting capacity in MW, and the start of commercial operations. The location of each wind farm is rounded to the nearest quarter degree of longitude and latitude; thus, the positions are projected onto a grid corresponding to the resolution of the weather data described below. Lastly, we assign a letter to each wind farm for improved visualization.
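The projection onto the weather grid can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used to build the data set: the function name `snap_to_grid` and the example coordinates are made up for this sketch and are not taken from Table 1.

```python
def snap_to_grid(lat, lon, step=0.25):
    """Round a (latitude, longitude) pair to the nearest multiple of `step` degrees.

    Python's round() resolves exact ties (a point lying precisely halfway
    between two grid points) towards the even multiple; such points are
    ambiguous in any case.
    """
    return (round(lat / step) * step, round(lon / step) * step)


# Hypothetical site coordinates in decimal degrees (negative = west).
grid_lat, grid_lon = snap_to_grid(53.45, -3.58)
print(grid_lat, grid_lon)
```

Two farms whose coordinates snap to the same grid point share the same weather series, which is why the number of distinguishable sites depends on the grid resolution.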
Baseline data for each wind farm were manually collected from publicly available information provided by the operator of each wind farm or from other public sources (see Table A1 in Appendix A). Note that although we speak of 29 wind farms, this number is not clear-cut; looking at Table 1, one could also speak of 28 or 32 considered parks. For example, we count Horns Rev Phase 1–3, a wind project built from three turbine types, as one wind farm, since the three parts map onto the same weather coordinates. In contrast, we count, e.g., the two projects Hollandse Kust Zuid/Noord as two wind farms. The decisive criterion for us is thus which locations we are able to distinguish on the grid of the weather data.
For the weather data, we extract ERA5 data for every wind farm location from the Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [35]. There, the weather data are provided on a grid of quarter degrees in longitude and latitude, which is why each wind farm was already matched to its nearest grid point in Table 1. For each location, we extract the lateral wind speed components $u$ (eastward) and $v$ (northward) in m/s at 100 m above ground. We neglect the vertical wind component and compute the absolute wind speed $w_{100}$ from these two orthogonal components by
$$w_{100} = \sqrt{u^2 + v^2}$$
and the wind direction $\varphi$ by
$$\varphi = \left(180 + \frac{180}{\pi}\,\operatorname{atan2}(u, v)\right) \bmod 360,$$
where we have $\varphi = 0^\circ$ for a northerly wind and the angle increases clockwise. Here, atan2 is the 2-argument arctangent function.
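As a hedged illustration of this step, the following Python sketch computes wind speed and direction from the two horizontal components. It assumes the usual ERA5 convention (u eastward, v northward) and the meteorological convention that the direction names where the wind comes from; the function names are chosen for this example only.

```python
import math


def wind_speed(u, v):
    """Absolute horizontal wind speed (m/s) from the eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def wind_direction(u, v):
    """Direction the wind blows FROM, in degrees.

    0 corresponds to a northerly wind and the angle increases clockwise.
    Note the argument order atan2(u, v): u plays the role of the "y" argument.
    """
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0


# A wind blowing from the east towards the west has u < 0 and v = 0.
print(wind_speed(-5.0, 0.0), wind_direction(-5.0, 0.0))
```

The modulo operation keeps the result in the range from 0 to 360 degrees, so a northerly wind (u = 0, v < 0) maps to 0 rather than 360, up to floating-point rounding.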
In the subsequent step, we follow [32,36] and assume a logarithmic velocity profile to scale the wind speeds $w_{100}$ at 100 m to the different hub heights $h$ (in m) of the turbines based on [37]:
$$w_h = w_{100}\,\frac{\ln(h/z_0)}{\ln(100/z_0)}.$$
Here, $z_0$ corresponds to the surface roughness depending on the actual ocean state (characteristic height of waves, depth, etc.), which is also provided in the ERA5 data set [35]. For wind farms with unknown hub height (where we could neither find reliable information nor calculate it; see Table 1), we set the hub height to 100 m.
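The height scaling can be sketched in a few lines of Python. This is a minimal example of the standard logarithmic wind profile, assuming a reference height of 100 m and a surface roughness given in metres; the function name and the roughness value in the example are illustrative assumptions, not values from the data set.

```python
import math


def scale_to_hub_height(w_ref, hub_height, roughness, ref_height=100.0):
    """Scale a wind speed from `ref_height` to `hub_height` (both in m)
    using a logarithmic velocity profile with surface roughness `roughness` (m)."""
    if roughness <= 0.0 or hub_height <= roughness or ref_height <= roughness:
        raise ValueError("heights must exceed a positive surface roughness")
    return w_ref * (math.log(hub_height / roughness) / math.log(ref_height / roughness))


# Illustrative values: 10 m/s at 100 m, roughness of 0.0002 m (assumed calm open sea).
for h in (80.0, 100.0, 120.0):
    print(h, scale_to_hub_height(10.0, h, 0.0002))
```

Because the sea surface is very smooth, the correction between 100 m and typical hub heights is small, and a hub height of 100 m leaves the wind speed unchanged.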
Figure 2 displays the wind roses of the wind at 100 m for the wind farms Gwynt y Mor and Kriegers Flak. Note that Gwynt y Mor is the westernmost wind farm in our data, and Kriegers Flak is the easternmost.
Despite the notable difference in their positions, the resulting wind roses indicate quite similar main wind directions for both wind farms and very few northerly and north-easterly winds. However, we observe a higher proportion of low wind speed hours at the Gwynt y Mor wind farm, which is located in a bay of the Irish Sea near the shore (wind farm C in Figure 1). This difference will also be visible in the descriptive statistics of generated power in Section 3, where, e.g., we observe more downtime at Gwynt y Mor due to low winds (below 4 m/s, which is typically the cut-in speed) than at the Kriegers Flak wind farm.
As a final step in data preparation, we convert the wind speed data into synthetic power output using the turbines’ power curves. However, in the datasheets of most turbine types, the power curve is only given at individual points, i.e., in the form of a table with discrete wind speeds and the corresponding nominal output power. We follow [38] and fit a combination of third-order polynomials to the nominal power at each wind speed to obtain a functional relationship. A piece-wise definition of the function is given by
$$P(w) = \begin{cases} 0, & w < w_{\mathrm{in}}, \\ a_3 w^3 + a_2 w^2 + a_1 w + a_0, & w_{\mathrm{in}} \le w < w_{\mathrm{s}}, \\ b_3 w^3 + b_2 w^2 + b_1 w + b_0, & w_{\mathrm{s}} \le w < w_{\mathrm{r}}, \\ P_{\mathrm{r}}, & w_{\mathrm{r}} \le w < w_{\mathrm{out}}, \\ 0, & w \ge w_{\mathrm{out}}, \end{cases}$$
where $w$ is the wind speed at hub height, $a_i$ and $b_i$ are the fitted coefficients and, for each turbine type, $w_{\mathrm{in}}$ is the cut-in speed, i.e., the minimum wind speed required to produce any power, and $w_{\mathrm{r}}$ is the minimum wind speed for the rated power $P_{\mathrm{r}}$. $w_{\mathrm{out}}$ is defined as the cut-out speed, i.e., the speed at which the turbine is stopped or braked, and is set to the same value (in m/s) for all turbines. In addition to these technical parameters of the turbines, $w_{\mathrm{s}}$ is the turning point within our functional representation, where we change to the second polynomial. As proposed by [38], we fitted a third-order polynomial to find this point, where the concavity of the power curve changes sign. The resulting power curves for the wind turbines Siemens SWT-3.6-107 and Siemens Gamesa SG 8.0-167 DD, installed in the wind farms Gwynt y Mor and Kriegers Flak, are shown in Figure 3.
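To illustrate how such a piece-wise power curve is evaluated, the following Python sketch implements a curve made of two cubic polynomials between cut-in and rated speed. All numbers are hypothetical: the toy coefficients, the 8 MW rating, the thresholds, and in particular the cut-out speed of 25 m/s are assumptions chosen so that the two pieces join smoothly, not fitted values from this paper. The final clamping is a safety measure added for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerCurve:
    cut_in: float        # minimum wind speed for any power (m/s)
    split: float         # turning point where the second polynomial takes over (m/s)
    rated_speed: float   # minimum wind speed for rated power (m/s)
    cut_out: float       # wind speed at which the turbine is stopped (m/s)
    rated_power: float   # rated power (MW)
    lower: tuple         # cubic coefficients (a3, a2, a1, a0) below the turning point
    upper: tuple         # cubic coefficients (b3, b2, b1, b0) above the turning point

    def power(self, w):
        """Power output (MW) at hub-height wind speed w (m/s)."""
        if w < self.cut_in or w >= self.cut_out:
            return 0.0
        if w >= self.rated_speed:
            return self.rated_power
        c3, c2, c1, c0 = self.lower if w < self.split else self.upper
        p = ((c3 * w + c2) * w + c1) * w + c0  # Horner evaluation of the cubic
        return min(max(p, 0.0), self.rated_power)  # clamp against fitting overshoot


# Hypothetical 8 MW turbine; cut-out of 25 m/s is an assumed typical value.
toy_curve = PowerCurve(
    cut_in=4.0, split=8.0, rated_speed=12.0, cut_out=25.0, rated_power=8.0,
    lower=(0.0625, -0.75, 3.0, -4.0),      # equals (w - 4)^3 / 16, convex
    upper=(0.0625, -2.25, 27.0, -100.0),   # equals 8 - (12 - w)^3 / 16, concave
)

for speed in (3.0, 6.0, 8.0, 10.0, 12.0, 25.0):
    print(speed, toy_curve.power(speed))
```

In this toy example the curve is convex below the turning point and concave above it, which mirrors the change in the sign of the concavity described above.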
Fitted polynomials and plots for the other turbine types are given in Appendix C. Note that data on nominal power at different wind speeds were not available for four turbines (Vestas V164-8.25, V164-9.0, V164-10.0, and Siemens Gamesa SG 11.0-200 DD). In these cases, we used a scaled version of the power curve of the most similar turbine, the Vestas V164-9.5, instead. To be more specific about the scaling, consider, e.g., an unknown turbine with a nominal capacity of 8 MW and an unknown power curve. Then, the unknown power curve is approximated by adopting the shape of the 9.5 MW Vestas V164-9.5 turbine, and each value is scaled by $8/9.5$.
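The scaling of a reference power curve can be sketched as follows. The tabulated reference points in this Python example are invented for illustration and are not the datasheet values of the Vestas V164-9.5; only the ratio of the nominal capacities follows the description above.

```python
def scale_power_curve(reference_points, reference_rated_mw, target_rated_mw):
    """Scale every power value of a tabulated curve by target / reference rating.

    reference_points: list of (wind speed in m/s, power in MW) pairs.
    """
    factor = target_rated_mw / reference_rated_mw
    return [(w, p * factor) for w, p in reference_points]


# Made-up points of a 9.5 MW reference curve, scaled to an 8 MW turbine.
reference = [(4.0, 0.0), (8.0, 3.8), (14.0, 9.5)]
print(scale_power_curve(reference, 9.5, 8.0))
```

The wind speed axis is left untouched, so the scaled turbine inherits the cut-in and rated wind speeds of the reference turbine.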
Having a time series of the power output of every single turbine at each wind farm, we sum over all turbines belonging to the same wind farm to model the farm’s overall power output. Clearly, this aggregation simplifies how individual wind turbines actually combine to form a wind farm. However, it suffices to provide insights into overall variations, intermittencies, and their time constants, as well as distributional characteristics of power production at certain locations and dependency patterns between locations. For studies where the absolute level of generated power of particular wind farms is needed as accurately as possible, we recommend taking the interactions of the wind turbines, such as wake effects, into account. An implicit way of doing so would be to calibrate the synthetic power data calculated here with the help of measured power data over a short period of time. Depending on the amount of measured data available, one could use different calibrations for different wind conditions, such as wind direction or light-wind and strong-wind scenarios. Alternatively, one could try to consider wake effects within the aggregation step using a theoretical model that incorporates relevant parameters of the wind farm’s layout. Various approaches to model these effects have been proposed, and detailed overviews are given, e.g., in [39,40,41].
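The aggregation step amounts to a weighted sum over turbine types. The following Python sketch shows this under simplifying assumptions: each turbine type has one hourly series of single-turbine power, all turbines of a type behave identically, and wake effects are ignored. The type names, counts, and power values are hypothetical.

```python
def farm_power(turbine_power, turbine_count):
    """Sum single-turbine power series (MW) into a farm-level series.

    turbine_power: dict mapping turbine type -> list of hourly power of ONE turbine.
    turbine_count: dict mapping turbine type -> number of installed turbines.
    """
    n_hours = len(next(iter(turbine_power.values())))
    total = [0.0] * n_hours
    for turbine_type, series in turbine_power.items():
        if len(series) != n_hours:
            raise ValueError("all series must cover the same hours")
        n = turbine_count[turbine_type]
        for t, p in enumerate(series):
            total[t] += n * p
    return total


# Hypothetical farm with two turbine types over three hours.
per_turbine = {"type_a": [0.0, 1.5, 3.5], "type_b": [0.0, 4.0, 8.0]}
installed = {"type_a": 10, "type_b": 5}
print(farm_power(per_turbine, installed))
```

A calibration against measured data, as suggested above, could then be applied to the returned farm-level series.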
The resulting total produced electricity of the Gwynt y Mor and Kriegers Flak wind farms in 2019 is displayed in Figure 4a,b for illustration. The figure shows that the power outputs at these locations are highly volatile and vary between no output at all and the maximum, i.e., the rated power. For better visualization, Figure 5 gives a more detailed view of Gwynt y Mor for January 2019. Here, flat plateaus are visible where the wind speed falls below the cut-in speed or exceeds the speed required for rated power. The high volatility of the series might not be surprising for readers familiar with offshore wind power, but it clearly shows that the idea of the wind blowing continuously at sea is not accurate. However, turning back to Figure 4, we can observe that the upper and lower bounds of the power output are often reached at different points in time; consequently, we expect that aggregating the power from multiple sites will have a flattening effect on the overall production.
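The expected flattening effect can be made concrete with a standard variance identity. For two sites whose outputs have the same standard deviation and a correlation coefficient rho, the standard deviation of their mean is the single-site value multiplied by sqrt((1 + rho) / 2). The Python sketch below evaluates this factor; the equal-variance assumption and the function name are simplifications introduced for this example.

```python
import math


def std_of_two_site_mean(sigma, rho):
    """Standard deviation of the mean of two series that share the standard
    deviation `sigma` and have correlation coefficient `rho`."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return sigma * math.sqrt((1.0 + rho) / 2.0)


# Perfectly correlated sites give no smoothing; lower correlations reduce the spread.
for rho in (1.0, 0.5, 0.0):
    print(rho, std_of_two_site_mean(1.0, rho))
```

This is why the dependence between locations matters for the variability of the aggregated output.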
We end this section with an overview of the exact format in which we provide the data set before we give a brief (descriptive) analysis in the next section. The data set may be downloaded as a zip archive under the provided DOI. It consists of 31 CSV files: one for each wind farm (29 in total), one file summarizing the wind speeds, and one for the resulting power outputs. In the first 29 files, we report detailed data for each wind farm in the columns, including the wind components u and v, the forecast surface roughness (fsr), the calculated wind speed, the wind direction, the scaled wind speed at hub height, and the estimated power for each turbine type. The last two files report the wind speed at hub height and the total power for each wind farm. In all files, each row represents one point in time. Starting from 1 January 1980, 00:00 UTC in the first row, the data set ranges up to 31 December 2019, 23:00 UTC in the last of 350,640 rows.
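The stated number of rows can be checked from the time range alone. The short Python sketch below counts the hourly time stamps between the first and the last row and maps a time stamp to its zero-based row position; the helper name `row_index` is an assumption for this example, and the sketch does not read the actual CSV files.

```python
from datetime import datetime, timedelta, timezone

FIRST = datetime(1980, 1, 1, 0, tzinfo=timezone.utc)
LAST = datetime(2019, 12, 31, 23, tzinfo=timezone.utc)
ONE_HOUR = timedelta(hours=1)

# Number of hourly time stamps, counting both end points.
n_rows = (LAST - FIRST) // ONE_HOUR + 1


def row_index(timestamp):
    """Zero-based row position of an hourly UTC time stamp in the data set."""
    if not FIRST <= timestamp <= LAST:
        raise ValueError("time stamp outside the covered period")
    return (timestamp - FIRST) // ONE_HOUR


print(n_rows)
```

The count reflects 40 years of 365 days plus the 10 leap days between 1980 and 2019, each with 24 hours.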