**4. Method**

This work makes use of a diverse range of models and involves a set of tasks. Figure 2 depicts a flowchart of the approach considered, highlighting the main tasks and the relationships between them.

**Figure 2.** Overview of the methodology considered in this work, specifically the method and the main tasks related to the definition of the bids of the wind power producer (green) and the other market participants (light orange), as well as the procedure associated with the operation of the day-ahead market and the imbalance management process (light yellow).

### *4.1. Wind Power Forecast Approach*

In recent years, the development of reliable forecasting systems has become a key research topic. These systems are widely employed in different fields, such as the prediction of consumer comfort based on temperature and humidity, electricity demand, wind speed and direction for onshore wind power, metocean conditions for offshore wind power, and irradiation for solar power (see, e.g., [43–46]).

For the particular case of wind power forecast systems, several techniques have been proposed in the literature (see, e.g., [47–52]). To support the participation of wind power producers in the DAM, existing wind power forecast systems are generally based on statistical post-processing approaches coupled with mesoscale NWP outputs [23]. In this article, a K-nearest neighbour (K-NN) methodology is applied to provide the wind power forecasts. This statistical approach, also known as analogue forecasting, has been applied by several authors due to its simplicity, effectiveness and non-parametric nature [53,54].

The K-NN methodology is considered a lazy learning method, since there is no need to build a model or a function beforehand. Instead, it uses only similar situations from the historical data to produce the forecast. This characteristic is especially suitable for forecasting weather-dependent variables, such as wind power. The K-NN methodology assumes that similar atmospheric flows produce the same local impact. Consequently, since meteorological patterns tend to recur over certain regions from time to time, the wind power production for a given hour can be estimated from similar meteorological patterns found in historical events. The K-NN technique has shown high efficiency in forecasting phenomena where the synoptic variability is predominant, such as precipitation [55] and wind speed [56]. It has also been explored by several authors for forecasting wind power, showing good performance [57–59]. To estimate the degree of similarity with each historical event, the most common metric in the wind forecast sector is the Euclidean distance, computed over a trajectory matrix that describes the state of the atmosphere at the preceding time steps [59].

Due to the large number of degrees of freedom of the atmosphere, it is usual to apply a principal component analysis (PCA) to the input data. This procedure filters out atmospheric perturbations that represent only background noise [27]. To assist in this filtering process, the North criterion [60] was applied to identify the appropriate number of principal components (PCs) to be retained for each meteorological input parameter.

Figure 3 depicts the flowchart of the K-NN methodology applied in this work. In order to calibrate the forecasting methodology, sensitivity studies were performed using two years of real wind power data from a set of wind parks located in the central region of Portugal. These sensitivity tests covered the number of nearest neighbours K, the size of the trajectory matrix and the selection of the most adequate meteorological parameters. The most adequate configuration was found to be K = 10 with a trajectory matrix with a lag of 3 h, composed of the first three PCs of the longitudinal component of the wind, the wind speed, and the atmospheric pressure. Thus, the deterministic wind power forecast is the average of the observed power in the ten most similar historical events.
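The analogue search described above can be illustrated with a minimal sketch. It assumes that each hour is described by a short feature vector (in the paper these would be the retained PCs of the NWP fields; here a single toy feature is used), that a trajectory is the concatenation of the feature vectors at the target hour and the `lag` preceding hours, and that similarity is the plain Euclidean distance. The function names `trajectory` and `knn_forecast` and the toy data are hypothetical and are not taken from the original implementation.

```python
import math

def trajectory(features, t, lag):
    """Flatten the feature vectors at hours t-lag, ..., t into one vector."""
    flat = []
    for s in range(t - lag, t + 1):
        flat.extend(features[s])
    return flat

def knn_forecast(hist_features, hist_power, query, k=10, lag=3):
    """Average the observed power of the k historical hours whose
    trajectories are closest (Euclidean distance) to the query trajectory."""
    scored = []
    for t in range(lag, len(hist_features)):
        dist = math.dist(trajectory(hist_features, t, lag), query)
        scored.append((dist, hist_power[t]))
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    return sum(power for _, power in nearest) / len(nearest)

# Toy data (hypothetical): one feature per hour, repeating every 5 hours,
# with an observed power that depends only on the current feature value.
hist_features = [[float(i % 5)] for i in range(50)]
hist_power = [10.0 * (i % 5) for i in range(50)]

# Query trajectory: features at the 3 preceding hours and at the target hour.
query = [1.0, 2.0, 3.0, 4.0]
forecast = knn_forecast(hist_features, hist_power, query)
```

Because no model is fitted, all the work happens at query time, which is what makes the approach "lazy". In a realistic setting the features would need to be standardized before computing distances, a step omitted here for brevity.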

The normalized root mean square error (NRMSE) represents the quadratic difference between the forecast and the observed values, normalized by the nominal capacity of the wind park, $P^{nom}$. The correlation *r* measures the strength of the relationship between two variables, in this case the observed power $P\_t^{obs}$ and the forecast power $P\_t^{for}$. The *F* test is used to verify the statistical significance of the difference between the deviations associated with the two gate closures considered in this work. Mathematically, these parameters are defined by the following equations:

$$NRMSE\left[\%\right] = \sqrt{\frac{\sum \left(P\_t^{obs} - P\_t^{for}\right)^2}{N}} \times \frac{1}{P^{nom}}\tag{1}$$

$$r = \frac{cov\left(P\_t^{obs}, P\_t^{for}\right)}{\sqrt{var\left(P\_t^{obs}\right)var\left(P\_t^{for}\right)}}\tag{2}$$

$$F = \frac{var\left(P\_t^{obs} - P\_t^{for1}\right)^2}{var\left(P\_t^{obs} - P\_t^{for2}\right)^2} \tag{3}$$
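As a rough illustration of Equations (1)–(3), the sketch below computes the three statistics for short toy series. Two readings are assumptions rather than statements from the text: the NRMSE is multiplied by 100 to express it as a percentage, and the *F* statistic is taken as the ratio of the sample variances of the two deviation series, which is the usual form of the test. All function names and numbers are hypothetical.

```python
import math
from statistics import mean, variance

def nrmse_percent(obs, fc, p_nom):
    """RMSE between observed and forecast power, as a percentage of p_nom."""
    mse = mean((o - f) ** 2 for o, f in zip(obs, fc))
    return 100.0 * math.sqrt(mse) / p_nom

def pearson_r(obs, fc):
    """Correlation: covariance divided by the product of standard deviations."""
    mo, mf = mean(obs), mean(fc)
    cov = sum((o - mo) * (f - mf) for o, f in zip(obs, fc))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_f = sum((f - mf) ** 2 for f in fc)
    return cov / math.sqrt(var_o * var_f)

def f_ratio(obs, fc1, fc2):
    """Ratio of the sample variances of the two forecast deviation series."""
    dev1 = [o - f for o, f in zip(obs, fc1)]
    dev2 = [o - f for o, f in zip(obs, fc2)]
    return variance(dev1) / variance(dev2)

# Toy values in MW (hypothetical): fc1 mimics the earlier gate closure,
# fc2 the later one with smaller deviations.
obs = [100.0, 200.0, 300.0, 400.0]
fc1 = [120.0, 180.0, 330.0, 370.0]
fc2 = [110.0, 190.0, 310.0, 390.0]

nrmse_late = nrmse_percent(obs, fc2, p_nom=1000.0)
r_late = pearson_r(obs, fc2)
f_value = f_ratio(obs, fc1, fc2)
```

The resulting *F* value would then be compared with the critical value for the chosen significance level and degrees of freedom to decide whether the two deviation series differ significantly.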

**Figure 3.** Wind power forecasts methodology based on NWP data coupled with a K-NN approach.

### *4.2. Selection of Representative Days*

The underlying goal of choosing representative days is to statistically detect the most typical daily wind power patterns and, at the same time, to cluster together days that exhibit similar patterns. This procedure allows for: (i) feeding the MATREM simulator without resorting to extensive, time-consuming simulations, and (ii) assessing, for instance, the typical profiles that can jeopardize the revenues of wind power producers, so that measures can be adopted to mitigate their risk exposure. The identification of representative days is an approach widely employed to increase the knowledge of a given parameter and to support the creation of decision support systems (e.g., the classification of the type of profile of electricity customers [61]). A K-medoids clustering algorithm [62] is used in this work to find the representative days based on the daily observed wind power profile for the wind parks considered. This technique arranges input data with similar characteristics into clusters in order to obtain the most representative daily profiles. With this step, it is possible to identify statistically independent patterns in the data that can be related to physical processes. Clustering algorithms are unsupervised learning processes typically applied to split the data according to the similarity among the observations, so that each observation is similar to the elements of its own cluster and dissimilar to those of the remaining clusters [62,63].

The main advantages of the K-medoids algorithm, when compared to other non-hierarchical clustering algorithms (e.g., the K-means algorithm), are: (1) it is more robust to noise and outliers, and (2) it selects actual data points as cluster centres (medoids) [63]. The K-medoids technique used in this work is classified as a non-hierarchical clustering algorithm that groups the data into K clusters. The suitable K, i.e., the number of clusters, is predetermined through the Calinski–Harabasz (CH) criterion [64].

The wind power input matrix $X\_{d,t}$ for the clustering algorithm is defined as follows:

$$X\_{d,t} = \begin{bmatrix} Z\_{1,1} & Z\_{1,2} & \cdots & Z\_{1,t} \\ Z\_{2,1} & Z\_{2,2} & \cdots & Z\_{2,t} \\ \vdots & \vdots & \ddots & \vdots \\ Z\_{d,1} & Z\_{d,2} & \cdots & Z\_{d,t} \end{bmatrix} \tag{4}$$

where $Z\_{d,t}$ represents the wind power observed at hour *t* of day *d* during the period 2009–2010.
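To make the clustering step concrete, the following sketch groups the rows of a small input matrix with a simple alternating K-medoids scheme and scores each candidate number of clusters with the CH index. It is only an illustration under stated assumptions: the Euclidean distance is used as dissimilarity, the alternating update is a simplified variant that may differ from the algorithm of [62], and the toy rows have three values instead of 24 hourly values. The names `assign`, `k_medoids`, `calinski_harabasz` and `choose_k` are hypothetical.

```python
import math
import random

def assign(days, medoids):
    """Label each day with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: math.dist(day, days[medoids[c]]))
            for day in days]

def k_medoids(days, k, n_iter=100, seed=0):
    """Alternating K-medoids; returns medoid row indices and cluster labels."""
    medoids = random.Random(seed).sample(range(len(days)), k)
    for _ in range(n_iter):
        labels = assign(days, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # can only happen with duplicated rows
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to all members of its cluster.
            updated.append(min(members, key=lambda i: sum(
                math.dist(days[i], days[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(days, medoids)

def calinski_harabasz(days, labels, k):
    """CH index: between-cluster over within-cluster dispersion."""
    n, dim = len(days), len(days[0])
    overall = [sum(d[j] for d in days) / n for j in range(dim)]
    between = within = 0.0
    for c in range(k):
        members = [days[i] for i in range(n) if labels[i] == c]
        if not members:
            continue
        centroid = [sum(m[j] for m in members) / len(members) for j in range(dim)]
        between += len(members) * math.dist(centroid, overall) ** 2
        within += sum(math.dist(m, centroid) ** 2 for m in members)
    return (between / (k - 1)) / (within / (n - k))

def choose_k(days, candidates, seed=0):
    """Return the candidate number of clusters with the highest CH score."""
    def score(k):
        _, labels = k_medoids(days, k, seed=seed)
        return calinski_harabasz(days, labels, k)
    return max(candidates, key=score)

# Toy input matrix (hypothetical): five low-production and five
# high-production "days", each with three values.
days = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0],
        [10, 10, 10], [11, 10, 10], [10, 11, 10], [10, 10, 11], [11, 11, 10]]
medoids, labels = k_medoids(days, 2)
best_k = choose_k(days, range(2, 5))
```

Each medoid is an actual row of the input matrix, so the representative day of a cluster is an observed daily profile rather than an average that may never have occurred.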

### *4.3. Measures of Economic Results*

In this section, the formulation used to compare the results between the 12:00 p.m. scenario (hereafter designated as the *base case*) and 2:00 p.m. scenario (hereafter designated as the *upgraded case*) is presented. The total remuneration (per hour) of the wind power producers is as follows:

$$R\_t = P\_t^{bid} C\_t^{dayahead} + \begin{cases} \left(P\_t^{obs} - P\_t^{bid}\right) C\_t^{updeviation}, & \text{for } P\_t^{bid} < P\_t^{obs} \\ \left(P\_t^{obs} - P\_t^{bid}\right) C\_t^{downdeviation}, & \text{for } P\_t^{bid} > P\_t^{obs} \end{cases} \tag{5}$$

where:

• $C\_t^{dayahead}$ is the day-ahead price at hour *t*;

• $C\_t^{updeviation}$ and $C\_t^{downdeviation}$ are the up and down deviation costs, respectively.

It is important to note that $P\_t^{bid} C\_t^{dayahead}$ is the part of the remuneration obtained from the day-ahead market. The other part is the remuneration obtained from the deviations. Therefore, the total remuneration $R\_d$ and the average remuneration $\bar{R}\_d$ for a specific day *d* are as follows:

$$R\_d = \sum\_{t=0}^{23} R\_t \tag{6}$$

$$\bar{R}\_d = \frac{R\_d}{\sum\_{t=0}^{23} P\_t} \tag{7}$$
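A small numerical sketch may help to read Equations (5)–(7). It assumes that the hourly remuneration is the day-ahead revenue plus the signed deviation valued at the up or down deviation cost, and that the power in the denominator of Equation (7) is the observed production; the latter is an assumption, since the text does not specify it. The function names and all prices and powers are hypothetical.

```python
def hourly_remuneration(p_bid, p_obs, c_dayahead, c_up, c_down):
    """Day-ahead revenue plus the settlement of the deviation for one hour."""
    deviation = p_obs - p_bid
    if deviation > 0:        # produced more than the bid
        settlement = deviation * c_up
    elif deviation < 0:      # produced less than the bid
        settlement = deviation * c_down
    else:
        settlement = 0.0
    return p_bid * c_dayahead + settlement

def daily_remuneration(p_bid, p_obs, c_dayahead, c_up, c_down):
    """Total remuneration of the day and its average per unit of energy."""
    hourly = [hourly_remuneration(*hour)
              for hour in zip(p_bid, p_obs, c_dayahead, c_up, c_down)]
    total = sum(hourly)
    # Assumption: the average is taken over the observed production.
    average = total / sum(p_obs)
    return total, average

# Two-hour toy example (hypothetical values, MW and EUR/MWh).
total, average = daily_remuneration(
    p_bid=[100.0, 100.0], p_obs=[120.0, 90.0],
    c_dayahead=[50.0, 40.0], c_up=[30.0, 30.0], c_down=[60.0, 60.0])
```

In the first hour the surplus of 20 MWh is paid at the up deviation cost, while in the second hour the shortfall of 10 MWh is charged at the down deviation cost, which reduces the day-ahead revenue of that hour.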

The average remuneration obtained in the day-ahead market, $\bar{R}\_d^{dayahead}$, is defined as follows:

$$\bar{R}\_d^{dayahead} = \frac{\sum\_{t=0}^{23} P\_t^{bid} C\_t^{dayahead}}{\sum\_{t=0}^{23} P\_t^{bid}} \tag{8}$$

In addition, the average remuneration obtained by considering the deviations, $\bar{R}\_d^{deviation}$, is as follows:

$$\bar{R}\_d^{deviation} = \begin{cases} \frac{\sum\_{t=0}^{23} \left( P\_t^{obs} - P\_t^{bid} \right) C\_t^{updeviation}}{\sum\_{t=0}^{23} \left( P\_t^{obs} - P\_t^{bid} \right)}, & \text{for } P\_t^{bid} < P\_t^{obs} \\ \frac{\sum\_{t=0}^{23} \left( P\_t^{obs} - P\_t^{bid} \right) C\_t^{downdeviation}}{\sum\_{t=0}^{23} \left( P\_t^{obs} - P\_t^{bid} \right)}, & \text{for } P\_t^{bid} > P\_t^{obs} \end{cases} \tag{9}$$
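The two averages of Equations (8) and (9) can be sketched as weighted means. One interpretation is assumed here and is not stated explicitly in the text: the case conditions of Equation (9) are applied hour by hour, so the up average only sums the hours with production above the bid and the down average only the hours with production below it. Function names and numbers are hypothetical.

```python
def average_dayahead_price(p_bid, c_dayahead):
    """Bid-weighted average day-ahead price for one day."""
    revenue = sum(p * c for p, c in zip(p_bid, c_dayahead))
    return revenue / sum(p_bid)

def average_deviation_prices(p_bid, p_obs, c_up, c_down):
    """Deviation-weighted average prices, split by the sign of the deviation.

    Returns (up_average, down_average); an entry is None when no hour of
    the day falls in that case.
    """
    up_value = up_energy = down_value = down_energy = 0.0
    for bid, obs, cu, cd in zip(p_bid, p_obs, c_up, c_down):
        dev = obs - bid
        if dev > 0:
            up_value += dev * cu
            up_energy += dev
        elif dev < 0:
            down_value += dev * cd
            down_energy += dev
    up_avg = up_value / up_energy if up_energy else None
    down_avg = down_value / down_energy if down_energy else None
    return up_avg, down_avg

# Three-hour toy example (hypothetical values, MW and EUR/MWh).
p_bid = [100.0, 100.0, 100.0]
p_obs = [120.0, 90.0, 130.0]
dayahead_avg = average_dayahead_price(p_bid, [50.0, 40.0, 60.0])
up_avg, down_avg = average_deviation_prices(
    p_bid, p_obs, c_up=[30.0, 30.0, 20.0], c_down=[60.0, 60.0, 60.0])
```

Weighting by the deviation volume means that hours with large forecast errors dominate the average deviation price of the day.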

With these formulae, it is possible to compute the average wind power value, the energy transacted in the tertiary reserve market, and the tertiary reserve cost for both scenarios. Both the reserve cost and the electric system levelized cost are computed by taking into account the occurrence of each representative day and the traded energy. To assess the gains from the proposed change to the gate closure of the day-ahead market, several key performance indicators (KPIs) are defined (see Table 1).


**Table 1.** KPIs considered in this work.

### **5. The Case Study**

This section describes a case study to analyse the effect of wind power forecast errors on the outcomes of the DAM. The following two scenarios are considered: (i) a base scenario, where the DAM closes at 12:00 p.m. (the bids of the wind power producers are based on a wind forecast performed 18 to 42 h ahead), and (ii) an updated scenario, where the DAM closes at 2:00 p.m. (the bids of WPPs are based on an updated forecast performed 12 to 36 h ahead).

### *5.1. Software Agents and Wind Power Profiles*

This study makes use of data published by the Iberian electricity market (MIBEL) and involves the simulation of the day-ahead market prices as well as the balancing market prices. Market participants are modeled as software agents, defined with the help of the MATREM system. Since the normal operation of the daily market of MIBEL involves a number of bids on the order of thousands for a particular hour, there is a need to make some simplifications related to the number of software agents, in order to avoid a large computational complexity. Accordingly, the main agents considered in the study are as follows: a market operator (S1), a system operator (S2), twelve producers (supply-side agents) and four retailers (demand-side agents). Table 2 presents the characteristics of the supply-side agents. The wind aggregator (agent *P*1) represents the Portuguese wind farms.

**Table 2.** Producer agents (software agents) and their key characteristics.


It is important to note that the detection of violations of the interconnection constraints between Portugal and Spain leads to a process of market splitting, resulting in different price areas for Portugal and Spain. After a careful examination, we concluded that this situation applies to some hours of the days under consideration. In practice, this means two sets of simulations for each hour of operation, one for Portugal and another for Spain. However, for convenience, and in the interests of simplicity, the day-ahead market is cleared for Portugal only.

The forecast methodology is deterministic and uses the following: (i) numerical weather prediction data outputs (see Section 4), and (ii) observed data for a set of wind farms during the period 2009–2010. The wind farms have a nominal capacity of 250 MW (10% of the Portuguese installed capacity, in 2010). This value is upscaled to 2500 MW to obtain a meaningful impact on the market results. The observed wind power profiles are depicted in Figure 4a. The representability of each wind power profile is also shown in Figure 4b.

**Figure 4.** Wind energy typical profile (**a**) and representability of each wind power profile during the 2-year period of the study (**b**).

The analysis of Figure 4 supports the consideration that the Portuguese wind farms are located in a mountainous region, since several wind power profiles show the typical features of wind speed in such a region (although with different intensities): due to thermal stratification and local effects [65], the highest wind speeds occur during the night. Moreover, the most common wind power profile shows reduced production throughout the day. On the other hand, profile 2 is associated with a high level of wind power production, which occurs during the passage of severe meteorological phenomena, such as cyclone systems [66]. This profile shows the lowest number of occurrences.

### *5.2. Wind Power Forecast Deviations*

Figure 5 depicts the wind power forecast deviations (forecast minus observed production) for the seven representative days, at 12:00 p.m. (base case, left) and 2:00 p.m. (updated case, right). The figure shows that the wind power forecast deviations at 12:00 p.m. have an absolute value that is almost twice that of the deviations at 2:00 p.m. Moreover, wind power fluctuations are considerably higher in the base scenario. For instance, the uncertainty in profile 2 ranges between −200 and 1700 MW.

**Figure 5.** Wind power forecast deviations at 12:00 p.m. (base case, **left**) and at 2:00 p.m. (updated case, **right**).

Table 3 presents the forecast results in terms of the NRMSE and the correlation for the two scenarios. The most significant improvement was observed in profile 2. This profile, usually associated with extreme weather conditions, shows a strong improvement in both the correlation and the NRMSE values. The link between wind power variability and uncertainty and extreme weather conditions was described by several authors [23,66–69], who demonstrated that larger errors in the wind power forecast are expected during severe weather conditions with strong dynamics (e.g., storms and cold fronts) than during weather conditions associated with stationary systems (e.g., anticyclonic systems). For a 0.05 significance level, the critical value of the *F* test is 2.01, which means that the deviations of the two scenarios (see Figure 5) are statistically different in all profiles except profile 6 (see Table 3).

Profile 4 also shows a strong improvement in the wind power forecast for the updated scenario. Profile 5 shows the smallest NRMSE improvement, which can be explained by the ability to obtain a reliable forecast during calm wind conditions [67,69]. Consequently, the results in Table 3 highlight that the data used for the ICs strongly determine the wind power forecast errors. Thus, as expected, since the most up-to-date information on the state of the atmosphere is used, postponing the gate closure can strongly reduce the uncertainty of the bids that the wind power producers submit to the day-ahead market.


**Table 3.** Correlation and NRMSE between the observed and forecasted wind power and *F* value of the wind power deviations for each scenario and wind power profile.
