**3. Statistical Verification Results**

#### *3.1. Characteristics of the Winds in the Region*

Figure 2 shows the diurnal variation in the observed wind speed averaged over all wind turbine sites in the study area, as well as over each of the seven sub-areas given in Figure 1. The wind speed over the whole region exhibits an evident diurnal variation, with wind speeds gradually increasing during the daytime (from 00:00 to 09:00 UTC, i.e., 08:00–17:00 LST). The median wind speed peaks between 09:00 UTC and 10:00 UTC (~6.8 m/s) and then decreases during the nighttime. The 25th and 75th percentile wind speeds are 4.2 and 8.2 m/s, respectively; ~5% of the wind speeds are greater than 12 m/s, and ~5% are less than 2 m/s.

Although the median wind speeds in all seven sub-areas are close (~6 m/s), their diurnal variations differ considerably. Area 1 has wind speed peaks at 02:00 UTC and 15:00 UTC, and is also prone to greater wind speeds during the day–night transition. The median wind speed in Area 2 increases slowly during the daytime, with two local maxima at 03:00 UTC and 09:00 UTC. The wind speed in Area 3 is higher at night, with the median peaking at 18:00 UTC.

Wind farms in Areas 4, 5, and 6 are in complex mountainous terrain, where winds increase until 09:00–10:00 UTC during the daytime, and show a decreasing trend at night. Finally, Area 7 is in a high plain region, and the diurnal variation in its wind speed is relatively flat, with a small peak in the afternoon, a small trough in the evening, and then a gradual rebound at night.
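The hourly statistics discussed in this subsection (the median and quartiles of the observed wind speed by hour of day) can be illustrated with a minimal sketch. It is not the processing used in this study: the function names `diurnal_stats` and `utc_to_lst`, the toy records, and the choice of the inclusive quantile method are assumptions made for illustration. The UTC+8 offset follows from the correspondence 00:00 UTC = 08:00 LST given in the text.

```python
from collections import defaultdict
from statistics import quantiles


def diurnal_stats(records):
    """Group (hour_utc, wind_speed) pairs by hour and summarise each hour.

    Returns {hour_utc: (q25, median, q75)} with wind speeds in m/s.
    Each hour needs at least two observations.
    """
    by_hour = defaultdict(list)
    for hour_utc, speed in records:
        by_hour[hour_utc].append(speed)

    stats = {}
    for hour, speeds in sorted(by_hour.items()):
        # Quartile cut points; the "inclusive" method is an assumption.
        q25, q50, q75 = quantiles(speeds, n=4, method="inclusive")
        stats[hour] = (q25, q50, q75)
    return stats


def utc_to_lst(hour_utc, offset_hours=8):
    """Convert a UTC hour to local standard time (UTC+8, as in the text)."""
    return (hour_utc + offset_hours) % 24


# Hypothetical toy observations: (hour in UTC, hub-height wind speed in m/s).
toy_records = [(0, 4.0), (0, 5.0), (0, 6.0), (9, 6.0), (9, 7.0), (9, 8.0)]
toy_stats = diurnal_stats(toy_records)

# Hour (UTC) at which the median wind speed is largest in the toy data.
peak_hour = max(toy_stats, key=lambda h: toy_stats[h][1])
```

Applied to the full set of turbine observations, or to the subset of sites in one sub-area, the same grouping would yield the curves of the kind summarized in Figure 2.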

#### *3.2. Overall Performance of the Wind Forecasts*

To compare the forecasts of the ensemble members driven by the initial and boundary conditions derived from the three global model forecasts (GFS, GEOS, and GEM), we first calculated the error metrics of each ensemble member, and then averaged the errors of the 13 members within each subgroup. The average error for each sub-group is computed as follows:

$$\chi_{\rm m} = \frac{1}{13} \sum_{i=1}^{13} \chi_i \tag{1}$$

where $\chi_i$ is an error metric (in m/s for BIAS and MAE) of the forecast of the *i*th ensemble member, and $\chi_{\rm m}$ is its average over the 13 members of the sub-group. With verification performed for the 0–24 h forecasts over the 45 days for all 411 wind turbines, the total number of data samples used to compute the statistics in each cell of Table 2 was 23,081,760.
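As a minimal sketch of Equation (1), the per-member metrics and their group average could be computed as shown here. The metric definitions (BIAS as the mean of forecast minus observation, MAE as the mean absolute difference, CC as the Pearson correlation) are the standard ones and are assumed, as are all function names and the toy data. The last lines check the stated sample count under the assumption of four samples per hour, which is inferred from the total and is not stated in this section.

```python
from math import sqrt


def bias(forecast, observed):
    """Mean error (forecast minus observation), in m/s."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)


def mae(forecast, observed):
    """Mean absolute error, in m/s."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def cc(forecast, observed):
    """Pearson correlation coefficient (dimensionless)."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / sqrt(var_f * var_o)


def group_mean(metric, member_forecasts, observed):
    """Equation (1): average a per-member metric over one group's members."""
    values = [metric(fc, observed) for fc in member_forecasts]
    return sum(values) / len(values)


# Hypothetical toy data: two members instead of the 13 used in the study.
observed = [4.0, 6.0, 8.0]
members = [[5.0, 7.0, 9.0], [3.0, 5.0, 7.0]]

group_bias = group_mean(bias, members, observed)  # +1 and -1 cancel
group_mae = group_mean(mae, members, observed)    # both members are 1 m/s off
group_cc = group_mean(cc, members, observed)

# Sample count per cell of Table 2; samples_per_hour = 4 is an assumption.
samples_per_hour = 4
n_samples = 411 * 45 * 24 * samples_per_hour * 13
```

In the toy data the member biases cancel in the group average while the MAE does not, which is one reason the two metrics can rank the groups differently.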

**Table 2.** Statistical verification of all stations for the GFS, GEOS, and GEM groups (45 days).


To evaluate the overall performance of the three groups of ensemble members, the CC, BIAS, and MAE of the 0–24 h wind turbine hub-height wind forecasts were calculated for all members of the three groups; the group averages are shown in the 'mean' column of Table 2. The CC and MAE of the wind forecasts of the GFS group are better than those of the GEOS group, and both are better than those of the GEM group. In contrast, the BIAS of the GEOS group is smaller than that of the GFS group. The GEM group has the worst scores for all three metrics. The minimum, maximum, and median of the correlation coefficients, mean errors, and mean absolute errors of the 13 member forecasts (13 outcomes for each background field) versus observations are shown in the 'min', 'max', and 'median' columns of Table 2, respectively. The ensemble-average forecasts outperformed the best individual members. Overall, the GFS group performed better than the GEOS group, and the GEM group performed the worst; these differences are statistically significant (all at a confidence level above 98%).

To assess the overall performance of the members driven by the three global model forecasts, the statistical metrics of the three ensemble forecast groups were ranked from best to worst at each wind turbine site. The number of sites at which each ensemble group performed the best and the worst was counted and is shown in Table 3, along with the corresponding proportion of the total number of turbine sites. The performance of the three ensemble forecast groups varies with the geographic setting of the turbines, as well as with the local and regional weather and climate characteristics. The statistical verification metrics were therefore calculated separately for each site.
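The per-site ranking and counting procedure can be sketched as follows. This is an illustration under stated assumptions rather than the code used in this study: the function names, the site identifiers, and the toy MAE values are hypothetical; ranking BIAS by its absolute value (`use_abs=True`) and breaking ties by the order in which the groups are listed are also assumptions.

```python
from collections import Counter


def count_best_worst(site_scores, higher_is_better=False, use_abs=False):
    """Count, per group, the sites where it ranks best and worst.

    site_scores: {site_id: {group_name: metric_value}}
    higher_is_better: True for CC, False for MAE and BIAS.
    use_abs: rank by absolute value (assumed appropriate for BIAS).
    """
    best, worst = Counter(), Counter()
    for scores in site_scores.values():
        def quality(group):
            value = abs(scores[group]) if use_abs else scores[group]
            return -value if higher_is_better else value

        ordered = sorted(scores, key=quality)  # best group first
        best[ordered[0]] += 1
        worst[ordered[-1]] += 1
    return best, worst


def ratios(counts, n_sites):
    """Proportion of sites for each group with a non-zero count."""
    return {group: count / n_sites for group, count in counts.items()}


# Hypothetical MAE values (m/s) at four sites; the study uses 411 turbines.
toy_mae = {
    "T001": {"GFS": 1.8, "GEOS": 2.0, "GEM": 2.4},
    "T002": {"GFS": 2.1, "GEOS": 1.9, "GEM": 2.5},
    "T003": {"GFS": 1.7, "GEOS": 1.9, "GEM": 1.8},
    "T004": {"GFS": 2.0, "GEOS": 2.2, "GEM": 2.6},
}

nbps, nwps = count_best_worst(toy_mae)      # counts of best / worst sites
best_ratio = ratios(nbps, len(toy_mae))     # proportion of sites ranked best
worst_ratio = ratios(nwps, len(toy_mae))    # proportion of sites ranked worst
```

For CC the same function would be called with `higher_is_better=True`, since a larger correlation indicates a better forecast.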


**Table 3.** Ranking statistics of wind speed forecast errors for the three ensemble forecast groups driven by the GFS, GEOS, and GEM model forecasts.

\* NBPS: number of best performing stations; NWPS: number of worst performing stations; R: ratio with reference to the total number of stations.

Among the three forecast groups, the GFS group performed the best at ~59–64% of the total sites in terms of CC and MAE, the GEOS group at ~34–37%, and the GEM group at the remaining ~2%. Conversely, in terms of the worst performance, the GEM group underperformed at ~77–86% of sites, the GEOS group at ~11–20%, and the GFS group at only ~2–3%. Notably, the GEOS group performed the best (~77% of sites) in terms of BIAS, and had relatively more cases with larger positive and negative deviations.

Figure 3 shows the distribution of the turbine sites, colored by the ensemble group that performed best in terms of the mean CC, BIAS, and MAE among the three ensemble forecast groups driven by the GFS, GEOS, and GEM global model forecasts. In general, the sites that achieved the best CC and the best MAE coincide. For BIAS, however, the GEOS group performed the best at most turbine sites.

**Figure 3.** Distribution of the sites colored for the dominant best performing ensemble groups driven by the GFS, GEOS, and GEM forecasts: (**a**) correlation coefficients, (**b**) mean absolute errors, and (**c**) biases.

#### *3.3. Variations of Forecast Errors with Wind Regimes*

Wind power generation is proportional to the cube of the wind speed [68]. Therefore, it is important to evaluate the model performance in different ranges of wind speed. Herein, the wind speed is divided into bins of 3 m/s from 0 to 21 m/s, and the forecast errors for each wind speed bin are computed and shown in Figure 4.
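The binning described above can be sketched as follows. This is a minimal illustration, not the code used in this study. It assumes that the samples are binned by the observed wind speed, that the bins are closed on the left ([0, 3), [3, 6), ..., [18, 21) m/s), and that BIAS is the mean of forecast minus observation. The function name `binned_errors` and the toy data are hypothetical.

```python
def binned_errors(forecast, observed, width=3.0, upper=21.0):
    """BIAS, MAE, and sample count per observed-wind-speed bin.

    Returns {(lower_edge, upper_edge): (bias, mae, n)} for non-empty bins,
    with edges and errors in m/s. Samples outside [0, upper) are skipped.
    """
    n_bins = int(round(upper / width))
    # Per bin: [sum of errors, sum of absolute errors, sample count].
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]

    for f, o in zip(forecast, observed):
        if not 0.0 <= o < upper:
            continue
        idx = int(o // width)
        err = f - o
        sums[idx][0] += err
        sums[idx][1] += abs(err)
        sums[idx][2] += 1

    result = {}
    for i, (sum_err, sum_abs, n) in enumerate(sums):
        if n:
            result[(i * width, (i + 1) * width)] = (sum_err / n, sum_abs / n, n)
    return result


# Hypothetical toy data (m/s): weak winds overforecast, strong winds underforecast.
observed = [1.0, 2.0, 7.0, 8.0, 16.0]
forecast = [3.0, 4.0, 7.5, 7.5, 12.0]
binned = binned_errors(forecast, observed)
```

The per-bin sample count is returned alongside the errors because sparsely populated bins, such as those for very strong winds, give less reliable statistics.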

**Figure 4.** (**a**) BIAS and (**b**) MAE of the wind forecasts of the ensemble groups driven by the GFS, GEOS, and GEM. The line charts are the error statistics, and the histograms correspond to the number of data samples.

The winds in the region are mostly 3–12 m/s (Figure 4). The wind forecast bias of all three groups is similar. The wind forecast bias is negatively correlated with the observed wind speeds, with a nearly linear relationship. For the weak wind conditions of 0–3 m/s, the wind speed is overestimated by 2 m/s. In the bin of 3–9 m/s, the bias gradually decreases to 0, and then the negative bias gradually increases with the wind speed. For winds over 15 m/s, the negative bias reaches 4–5 m/s. The MAE of the wind forecast of the three groups is around 2 m/s in the wind speed range of 3–12 m/s. The overestimation of wind speed in the low-wind-speed range (0–3 m/s) and the underestimation of wind speed in the high-wind-speed range lead to larger MAE for the weak and strong wind ranges.

For the winds in the range of 0–6 m/s, the forecast errors of the GFS and GEOS groups are basically the same, and both are better than the GEM group. For strong winds over 12 m/s, the forecast errors of the GEOS and GEM groups are very similar, and worse than the GFS group. The overestimation of wind speeds in the low-wind-speed range and the underestimation of wind speeds in the high-wind-speed range are smaller for the GFS group than for the other two groups.
