**Modeling the Price Stability and Predictability of Post Liberalized Gas Markets Using the Theory of Information**

#### **Anis Hoayek <sup>1,†</sup>, Hassan Hamie <sup>2,\*,†</sup> and Hans Auer <sup>2</sup>**


Received: 29 April 2020; Accepted: 3 June 2020; Published: 11 June 2020

**Abstract:** Energy markets in the United States and Europe are becoming increasingly liberalized. The question of whether the liberalization of the gas industry in both markets has led to stable prices and less concentrated markets has attracted great interest in the scientific community. This study aims to measure the power and efficiency of the information structure contained in the gas price time series, an assessment that is useful to the oversight duty of regulators in such markets in the post-liberalization era. First, econometric and mathematical methods based on game theory, records theory, and Shannon entropy are used to measure the following indicators for both markets: level of competition, price stability, and price uncertainty, respectively. Second, the level of information generated by these indicators is quantified using information theory. The results of this innovative two-step approach show that the functioning of the European market requires the regulator's intervention, by applying additional rules to enhance the competitive aspect of the market; this is not the case for the U.S. market. Also, the value of the information contained in both markets' wholesale gas prices, although asymmetric, is significant, and therefore proves to be an important instrument for the regulators.

**Keywords:** gas markets; game theory-Cournot model; records theory; entropy; information theory

#### **1. Introduction**

Unlike the oil market, the gas markets have witnessed regional divergence at several levels, and the degree of competitiveness varies between the different gas markets.

Following extensive infrastructure development and regulation changes, the North American market developed transparent and competitive gas pricing hubs. Additional gas hubs emerged afterward in Europe, providing physical and virtual locations for trading gas. The abundance of gas and the presence of competition between different stakeholders at different levels of the value chain led to an increase in trade in both the spot and future markets. However, the price of gas does not reflect market fundamentals and forces in all markets.

The role of the regulators is to promote competitive conduct, domestic gas production, third-party access, price trade reporting, and to ensure the presence of futures trading. Once the liberalization measures are implemented and regulated, the status of the gas hub is confirmed as liquid and stable, which results in prices being indicative of market fundamentals.

In this study, we focus on the North American and European markets, specifically the United Kingdom (UK). The choice of these markets is explained by the fact that both attempted to liberalize their gas markets and underwent intense regulatory and policy changes over the past years [1].

Wholesale buyers used to follow long-term contracts indexed to the price of oil derivatives in both of the aforementioned markets [2]. Also, the gas industry was mostly dominated by state-owned monopolies.

However, the Federal Energy Regulatory Commission (FERC) encouraged the establishment of gas markets driven by free competition in the United States [3]. As a result, the Henry Hub (HH), known as the most successful hub, was created [4]. The success of the HH is marked by a large liquid portfolio of spot and future contracts, along with hub-indexed prices which serve as a reference for the value of the gas commodity all over North America.

Consequently, the UK and the European Union (EU) started their reforms. The Office of Gas and Electricity Markets (OFGEM) took the lead and started the process of market liberalization in the 1990s. The reforms led to the establishment of the National Balancing Point (NBP), which serves as a physical platform for gas trading in the UK. Currently, the NBP is considered the most developed hub in Europe and is the longest-standing European gas pricing point [5,6]. It is worth mentioning that the UK gas market and the European gas market are used interchangeably in the remainder of this study.

In 2016, U.S. natural gas consumption was roughly 750 billion cubic meters (BCM) [7]. The majority of the demand was satisfied through indigenous production, and the remainder was imported from Canada via pipelines. Additional marginal gas is imported from Mexico (also via pipelines), and from around the world via liquefied natural gas, LNG (see Figure 1).

**Figure 1.** Indigenous production and monthly gas imports to U.S. Source: EIA (Available on EIA website, https://www.eia.gov/naturalgas/monthly/).

UK natural gas consumption in 2016 was estimated at around 73 BCM [8], of which 42 BCM were imported while the remaining volumes were produced locally (see Figure 2).

**Figure 2.** Monthly gas imports to the UK and indigenous production. Source: OFGEM –UK (Available on OFGEM website, https://www.ofgem.gov.uk/data-portal/gas-demand-and-supply-source-month-gb).

To reflect the gradual advances in supply-side competition, the functioning of a wholesale gas market should be measured quantitatively. This has attracted attention from the professional and scientific communities, and in-depth analyses of gas markets have been conducted and published [2,9–13]. All studies confirm that parameters such as the number of market participants, the monthly day-ahead trades, and the churn ratio give an indication of the state of the market. The churn ratio, calculated as the ratio of traded gas volumes to the total gas demanded, is an indicative measure of the liquidity of a gas hub and of market maturity. Additionally, it measures the confidence of traders and consumers in the market.
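
As a hypothetical illustration of the ratio: a hub where 300 BCM of gas change hands in a year against a physical demand of 20 BCM has a churn ratio of 300/20 = 15, the level above which a hub is commonly considered liquid and mature [11].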

The numbers shown in Table 1 reveal a high churn ratio for both markets (above 15), which indicates that both hubs are liquid and that the gas prices registered there reflect market conditions [11]. Therefore, clearing prices are accepted as a reference and indicator, containing reliable information for all stakeholders involved in the gas value chain (traders, customers, regulators, etc.).


**Table 1.** The United States and UK traded volumes and churn. Source: OFGEM [14] and Cornerstone Research, IEA [15].

Analysis of recent trends in the European and North American gas markets shows that gas prices are fundamentally market-driven. However, rules and policies set by gas regulators remain necessary to guarantee that the market keeps operating efficiently [16].

Information theory, introduced by [17], is a probabilistic framework that helps quantify the information generated by a random variable in an uncertain context. It can explain observations without relying on statistical assumptions regarding either the distribution of the random variables or the random noise [18]. Additional parametric assumptions, such as estimates of demand and cost functions, can also be avoided. The mathematical tool used in this work to measure the amount of information in the wholesale gas clearing prices is the statistical entropy.

Another tool to assess the information in a decision-making problem is the Blackwell approach [19]. However, this approach has several complexities that prevent a simple application of Blackwell's principle [20,21], especially at the level of cost and return function assumptions. Hence, to overcome all the mentioned difficulties, the entropy principle will be applied.

The first objective of this paper is to study whether the wholesale gas prices of two of the most liberalized gas markets carry valuable information that can serve as indicators for the relevant gas regulators. The value of these indicators will be quantified by using several econometric methods and mathematical theories. This analysis will guide and assist the decision-making process of regulators regarding the need for an intervention to stabilize the gas markets and improve the functioning of their internal markets. The second objective is to quantify and measure the accuracy and efficiency of the hidden information structure generated by these indicators.

All methods applied in this research are proven mathematical theories that have been used in previous studies [22–34]. All four theories (information, records, entropy, and game theory) have applications in mathematics and statistics applied to economics, in other words, econometrics. However, the novelty of our method lies in adopting a two-step approach that has not been applied before in the literature. This approach assesses the performance of a gas market in terms of the information generated by several indicators combined in one structure, called the information structure. The indicators give an idea of the level of competition, price volatility, and price stability on one hand, and of the level of the information structure, which measures the performance of the market, on the other. Among all gas stakeholders, this information matters most to gas regulators, who will be more confident in, and can trust, the price indicators if the performance of the market is powerful and efficient.

The level of competition changes from one market to another and, if measured correctly, defines the concentration of competing firms in the market. A limited number of firms implies a highly concentrated market in which firms, depending on their strategies, can dictate prices, i.e., act as price-setters. Moreover, the fewer the firms, the easier it is for them to abuse their position and act collusively. Such firms adjust their strategies according to an agreed-upon understanding with the competing firms, at the expense of the welfare of gas consumers and possibly of smaller firms. A typical example of such a market and behavior is the presence of cartels in commodity markets.

The level of volatility indicates how fast gas prices change in the short term. The higher the volatility, the harder it is to predict the future behavior of price changes, and thus the more uncertain the market.

On the other hand, price stability hints at the behavior of gas prices in the medium and long term. Commodity prices tend to experience abrupt and rapid shocks, i.e., sudden increases or decreases driven by sudden changes in supply and demand characteristics. The longer it takes for a commodity price to witness such a shock, the more stable the market.

The performance of the market is the measure of the power and efficiency of the information contained in the gas prices and the indicators. The more efficient the information, the more reliable and reflective of fundamentals the prices in such a market are.

In the first step of the approach, the authors identified three different indicators: market concentration, price stability, and price uncertainty. The authors then applied three appropriate mathematical and statistical theories to extract, from the gas price time series, the metric values best suited to measure the relevant indicators. The formulation of each model is explained and justified in the next section.

In the second step, the three indicators are combined to create an information structure that helps the authors evaluate the performance of the gas market in question. The indicators are assessed against the actions/states that could be executed by the regulator of such markets. Two actions are identified: to intervene or not to intervene. Intervention in the market is conducted by taking legal actions, such as issuing new directives to ensure a stable supply and demand equilibrium and to prevent abusive conduct by gas suppliers. Furthermore, the approach deals with a case where the information is neither completely absent nor perfectly known, which has rarely been dealt with in the literature.

#### **2. Methods: Data and Models Formulation**

To avoid price abuse and manipulation, a gas regulator is expected to regulate firms' behavior by ensuring that customer welfare is maximized while maintaining the attractiveness and profitability for the producers and traders.

As stated in the introduction, price dynamics of a commodity in a liberalized market are indicative of the market structure. They contain consistent information that should, if adequately analyzed, help the regulators in assessing the performance of the market, namely the wholesale gas market in this case.

The authors have identified three main metrics that can signal information in the hidden structure of the price values of both hubs. These metrics are based on econometric and mathematical methods, and are used to inform the regulator in each market about the following:


The first indicator studies the degree of concentration in the two different gas markets by using game theory, specifically the non-parametric Nash-Cournot equilibrium test. In other words, if the test shows that traders are participating in the market by trying to maximize their profit as "the only pure" strategy, then the market is considered efficient and the likelihood of anti-competitive behavior is negligible.

The second indicator employs the records theory, which relies on the analysis of the peak observations reached in a certain period. This indicator measures the degree of market stability, by calculating the probability of witnessing future peak prices. Therefore, the measure of probability is a measure of market stability and predictability. If the results point toward a tendency to score high probabilities of extreme gas prices, then the market can be characterized as unstable.

The third and final indicator studies the price predictability of both markets, by the use of Shannon entropy and the measure of volatility. This is done by analyzing the variation of prices and returns and assessing the degree of uncertainty and volatility which are present in gas prices. Simply, the higher the uncertainty in prices, the higher the volatility.

These indicators combined will inform the regulator about the functioning of the market. If the market shows signs of concentration, the likelihood of extreme prices, volatility, and uncertainty, then the regulator should intervene and use its policy enforcement power.

Since our indicators are based on econometric theory and models, it is important to assess the performance of such models. Therefore, a quantitative analysis that relies on information theory is used to compare the power and efficiency of the information generated by all indicators in the two selected markets. The market with the highest information power will give additional credibility to the indicators, so that the gas price time series speaks for itself. Regulators in such a market have higher confidence and can trust the indicators, which will guide their decision of whether or not to intervene in the market.

The following part of Section 2 will list and define the three indicator metrics and will explain the econometric and mathematical methods used in this manuscript. Section 2.5 will outline the measure of the power of information. Results will be presented in Section 3 with an analysis of their significance and impact, and an overview of how gas regulators can use them in their assessment of wholesale gas markets. The final section will conclude the study and emphasize the importance of the dynamics of gas prices and the power of the information they provide.

#### *2.1. Data*

A description of the data used in the study is presented in Table 2. The variable set consists of monthly wholesale gas prices registered between October 2009 and June 2018.


(a) Available on the OFGEM website, https://www.ofgem.gov.uk/data-portal/gas-prices-day-ahead-contracts-monthlyaverage-gb; (b) Available on the U.S. Energy Information Administration (EIA) website, https://www.eia.gov/dnav/ng/hist\_xls/RNGWHHDd.xls.

Figure 3 illustrates, in a time series plot, the evolution of natural gas prices in the two markets. There is a clear divergence that began in 2009 and continues to date, mainly due to two factors that took place in the United States. The first is the abundance and oversupply of new unconventional shale gas production in the local market. The second is that U.S. natural gas contracts in that period started to be decoupled from crude oil prices.

**Figure 3.** Monthly gas prices of both gas markets, USD/MMBtu.

The line plots of the two markets presented in Figure 3 show that there is no clear indication of a linear relationship between both variables.

#### *2.2. Indicator 1—Level of Competition*

The level of competition and market concentration is assessed with the classical Nash-Cournot equilibrium test. A Cournot equilibrium is reached when each firm maximizes its profit by choosing its output while taking the other firms' output into account. One important feature of a Cournot model is that firms are not allowed to cooperate. Therefore, as long as the players follow this strategy, the companies abide by pure market profit-oriented strategies, trying to maximize their "utility function," with no agreed-upon behavior (i.e., collusion). A market in which producers follow this behavior is considered more liberalized.

The aim is to test whether the behavior of the gas producers in the respective markets follows a Cournot model. If a set of gas producers are not following the assumptions of a Cournot game, the test will identify them.

The optimal quantities of the suppliers in a given market are obtained by numerically solving the following set of equations:

$$\max\_{q\_{i,t}\in\mathbb{R}\_+} (P\_t(Q\_t) \times q\_{i,t} - C\_{i,t}(q\_{i,t})),\tag{1}$$

At each observation *t*, supplier *i* chooses the quantity *q<sub>i,t</sub>* ∈ R<sup>+</sup> (where R<sup>+</sup> represents the set of non-negative real numbers) to maximize its profit given the output of the competing supplier *j* at its optimal choice, *Q<sub>j,t</sub>*. *Q<sub>t</sub>* is the total quantity supplied to the market, and *P<sub>t</sub>*(*Q<sub>t</sub>*) is the inverse demand function, from which the gas price is deduced; it depends on the total quantity of gas supplied to the market. Finally, production and transmission costs are represented by the cost function *C<sub>i,t</sub>*(*q<sub>i,t</sub>*). The demand function is normally represented by a decreasing straight line and is estimated using regression methods; however, this methodology has its limitations due to endogeneity.

A non-parametric method has been developed by [24–27]; it makes no prior assumptions about the cost and demand functions and therefore does not rely on an incomplete set of demand and cost data. Nonetheless, many authors have contributed to the literature using the parametric approach, solving the equilibrium with sensitivity analyses on cost and demand assumptions [22,23,35].

The marginal cost of supplier *i* at time *t* will be denoted by *MC<sub>i,t</sub>*. The first-order condition of the optimization problem defined in Equation (1), obtained by differentiating the profit with respect to *q<sub>i,t</sub>* (noting that ∂*Q<sub>t</sub>*/∂*q<sub>i,t</sub>* = 1) and evaluating at the optimum, is given by:

$$Q\_{i,t} P\_t'(Q\_t) + P\_t(Q\_t) - MC\_{i,t} = 0 \tag{2}$$

where *MC<sub>i,t</sub>* ≥ 0 and *Q<sub>i,t</sub>* is the solution of (1). Now, we consider the observations given by

$$\mathcal{C} = \left\{ P\_t(Q\_t),\, (Q\_{i,t})\_{i \in \mathcal{I}} \right\}\_{t \in \mathcal{T}} \tag{3}$$

where I = *the set of suppliers* and T = *the set of period indices*, and say that C respects a Cournot equilibrium if the following **conditions** are verified:

1. The matrix of data (*MC<sub>i,t</sub>*)<sub>(*i*,*t*) ∈ I×T</sub> must satisfy the following:

$$(P\_t(Q\_t) - MC\_{1,t})/Q\_{1,t} = (P\_t(Q\_t) - MC\_{2,t})/Q\_{2,t} = \dots = (P\_t(Q\_t) - MC\_{N,t})/Q\_{N,t} \ge 0 \quad \forall t \in \mathcal{T} \tag{4}$$

where *N* is the number of suppliers in the market.

**Condition 1** compares the marginal costs of the firms: at any time *t*, the firm with the higher marginal cost produces a lower quantity than the firm with the lower marginal cost.

2. The optimal solutions *Q<sub>i,t</sub>* must verify the following:

$$(MC\_{i,t} - MC\_{i,t'})(Q\_{i,t} - Q\_{i,t'}) \ge 0 \quad \forall t \ne t' \in \mathcal{T} \text{ and } \forall i \in \mathcal{I} \tag{5}$$

**Condition 2** allows us to compare the costs of a given firm at different times. For instance, if firm *i* increases its produced quantity from *Q<sub>i,t</sub>* to *Q<sub>i,t′</sub>*, the marginal cost at time *t* must be lower than the marginal cost at time *t*′. The same analysis for firm *j* leads to an arrangement of the marginal costs of each firm, at each time, in increasing order.

To address the problem of detecting a Cournot equilibrium, a numerical algorithm was developed, the result of which indicates whether or not the data being tested respect the Cournot equilibrium.

The algorithm starts with an assumption on the initial marginal cost of firm *i* to produce quantity *Q<sub>i,t</sub>*, namely that it is equal to the price *P<sub>t</sub>*(*Q<sub>t</sub>*), and tests for Conditions 1 and 2. This is typical of a fully competitive market, where the price of any commodity (i.e., gas in our case) should be as close as possible to the delivery cost. This procedure is repeated over several iterations, changing the marginal cost each time, until Conditions 1 and 2 are fully met. If the algorithm does not converge, then the set of observations C does not respect a Cournot equilibrium. The ratio of the number of observations that respect the Cournot equilibrium (more specifically, Conditions 1 and 2 in our algorithm) to the total number of observations is then calculated; it is called the Cournot acceptance rate, denoted δ, and expressed as a percentage (%).

The Cournot theorem states that, for any market producing and selling a certain commodity, the price converges to its production cost whenever the number of market participants *N* tends to infinity.

$$\lim\_{N \to \infty} P = \text{ marginal cost} \tag{6}$$

This means that the more companies participate in the market, the less able each is to affect the market price. The company becomes a price-taker and must accept the equilibrium price as is. So it is normal and logical to start from the highest possible marginal cost, which is equal to the gas price at that specific time, and then reduce the cost values until the whole array (*MC<sub>i,t</sub>*)<sub>(*i*,*t*) ∈ I×T</sub> has been tested. This is in line with our initial assumption.

For additional information about the algorithm, the readers are invited to check the four simple steps found in Appendix A.

In simple words, the algorithm can have the following outcomes:

1. Companies are competing based on a Cournot model, trying to maximize their profit by acting strategically. In this case, the Cournot acceptance rate is high.

2. Companies are not competing based on a Cournot model, which leaves room for other strategies, such as collusion. In this case, the Cournot acceptance rate is low.


#### *2.3. Indicator 2—Market Stability*

Records theory studies observations that are concentrated in the tail of a given distribution [30] and will be used in this context to test the stability of two different gas markets.

Previous work on records theory obtained results in the case of independent and identically distributed (*i.i.d*) underlying observations, called the classical case (see [29]). In our application, we consider the absolute value of the difference between two consecutive gas prices as the underlying observations. The most popular record model beyond the *i.i.d.* case was introduced by Yang and developed by Nevzorov [28,29,31]; it is currently called the Yang model. In this model, the observations are considered independent but not identically distributed.

Considering a time series {*X<sub>t</sub>*, 1 ≤ *t* ≤ *T*}, where *T* denotes the present time, the observation *X<sub>j</sub>* is said to be an upper record if and only if *X<sub>j</sub>* > max<sub>*t*<*j*</sub> *X<sub>t</sub>*. The record indicators are a sequence of random variables δ<sub>*t*</sub> defined by:

$$\delta\_t = \begin{cases} 1 & \text{if } X\_t \text{ is a record} \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

The total number of records in the considered time series is given by:

$$N\_T = \sum\_{t=1}^{T} \delta\_t \tag{8}$$

Additionally, the probability that the observation *Xt* represents a record in the *i*.*i*.*d* case is called record rate and is given by:

$$P\_t = \frac{1}{t} \tag{9}$$

This has been justified by [36], from which it can be deduced that lim<sub>*t*→∞</sub> *P<sub>t</sub>* = 0; this means that the chance of witnessing a record in the long term is minimal, i.e., records become rare.
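
As an illustration, summing Equation (9) over a classical *i.i.d* series gives an expected number of records of $\sum\_{t=1}^{T} 1/t \approx \ln T + 0.577$; for *T* = 100 observations, only about five records are expected, most of them early in the series.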

It is only reasonable to go beyond the classical case and test whether the Yang model fits our data set. However, before doing so, one should test whether the data come from a sequence of *i.i.d* random variables. The test is based on the statistic

$$\widetilde{N}\_T = (N\_T - \log T) / \sqrt{\log T} \tag{10}$$

which was shown by [29] to have an asymptotic normal behavior.

Moving to the Yang model, the time between two consecutive records converges asymptotically to a geometric distribution and the record rate verifies the following equation:

$$P\_t = \frac{\gamma^t(\gamma - 1)}{\gamma(\gamma^t - 1)} \tag{11}$$

where γ is a parameter that needs to be estimated.

Unlike the classical case, the Yang record rate converges to a constant value in the long term and is given by:

$$\lim\_{t \to \infty} P\_t = (\gamma - 1) / \gamma \tag{12}$$

This means that records are always expected in the long term, and not only observed among the first observations as in the classical case.
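
As a minimal sketch of these computations (Python with NumPy; function names and the synthetic price path are illustrative assumptions of this example, not the paper's data), the following computes the record indicators of Equation (7), the count of Equation (8), the normalized i.i.d. test statistic of Equation (10), and the Yang record rate of Equation (11). For instance, γ = 1.25 gives a limiting record rate of (γ − 1)/γ = 0.2, so one observation in five is expected to set a new record in the long run.

```python
import numpy as np

def record_indicators(x):
    """delta_t of Eq. (7): 1 if x_t strictly exceeds every earlier observation."""
    x = np.asarray(x, dtype=float)
    delta = np.ones(len(x), dtype=int)
    running_max = np.maximum.accumulate(x)
    delta[1:] = (x[1:] > running_max[:-1]).astype(int)
    return delta

def iid_record_statistic(x):
    """Normalized record count of Eq. (10), natural log assumed;
    approximately N(0,1) for i.i.d. data."""
    T = len(x)
    n_records = record_indicators(x).sum()          # N_T of Eq. (8)
    return (n_records - np.log(T)) / np.sqrt(np.log(T))

def yang_record_rate(t, gamma):
    """Record rate of the Yang model, Eq. (11); tends to (gamma - 1)/gamma, Eq. (12)."""
    return gamma**t * (gamma - 1) / (gamma * (gamma**t - 1))

# Illustrative usage: the underlying observations are the absolute differences
# between consecutive gas prices (Section 2.3); synthetic prices are used here.
rng = np.random.default_rng(0)
prices = 5 + np.cumsum(rng.normal(0, 0.1, 105))     # 105 monthly prices, as in the sample
obs = np.abs(np.diff(prices))
print("number of records:", record_indicators(obs).sum())
print("i.i.d. test statistic:", iid_record_statistic(obs))
print("Yang rate at t = 105 for gamma = 1.25:", yang_record_rate(105, 1.25))
```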

#### *2.4. Indicator 3—Volatility and Uncertainty of Prices*

The third quantitative method used in this study is Shannon's probabilistic entropy, applied in a time-series analysis to test the predictability power hidden in the underlying probabilistic distribution of the considered time series. A time series with high predictability power is considered to have a high level of stability, with an anticipated pattern.

The classical definition of entropy is as follows: for a given source of information represented by a finite discrete random variable *X* with *n* possible outcomes, each possible outcome *xi* having a probability *pi* to appear, the Shannon entropy *H* of the random variable *X* is defined by:

$$H(X) = -\sum\_{i=1}^{n} p\_i \log\_2 p\_i \tag{13}$$

In general, a logarithm of base 2 (log2) is used because the entropy is generally expressed in bits [18]. Several researchers have previously attempted to predict the entropy of the commodity markets (oil, more specifically Brent and West Texas Intermediate, WTI, and other commodities) and tried to measure the information from statistical observations [32–34]. Brent and WTI are two different crude oil grades (quality) and are known to be the most important oil pricing benchmark around the globe. As previously explained, the gas markets in question have been liberalized, and the influence of oil prices on gas prices is shrinking. Gas prices are becoming more influenced by gas to gas competition. To the knowledge of the authors, no previous researchers have worked on predicting the entropy of the gas markets.

To compute the Shannon entropy of a time series, which is a continuous random variable, a particular discretization method is introduced:

First, the returns of the prices are computed; this normalizes the data set:

$$r\_t = 100 \times \frac{P\_t - P\_{t-1}}{P\_{t-1}},\tag{14}$$

where *Pt* and *Pt*−<sup>1</sup> are the prices at times *t* and *t* − 1 respectively.

It is trivial that the series of observations of returns *rt* has an underlying continuous distribution. Therefore, the second step is to introduce the discrete random variable *st* defined by:

$$s\_t = \begin{cases} 1 & \text{if } r\_t \ge 0 \\ 0 & \text{if } r\_t < 0 \end{cases} \tag{15}$$

The random variable has a binary output: one when the returns are non-negative, meaning that prices are increasing, and zero when prices tend to diminish.

Based on the observed values of *st*, and by denoting the total number of observations by *n*, the corresponding probability distribution is computed:

$$p\_1 = \mathbb{P}[s\_t = 1] = \frac{\sum\_{t=1}^{n} s\_t}{n},\tag{16}$$

and,

$$p\_0 = \mathbb{P}[s\_t = 0] = 1 - p\_1 \tag{17}$$

Hence, the entropy related to the random variable *st* is given by:

$$H(s\_t) = -p\_0 \times \log\_2(p\_0) - p\_1 \times \log\_2(p\_1) \tag{18}$$

To compute the underlying entropy of each gas market with the above method, we rely on daily values instead of monthly values. The price series is divided into one-year windows of 252 observations each. The passage from one window to the next is done by removing the first observation and adding the next one from the remaining observations, and so on. By applying this procedure, we obtain a series of entropies, which is summarized by its mean as a representative parameter.

However, a major disadvantage of the mean is that it is sensitive to heavy-tailed distributions, which can be caused by the presence of outliers and extreme observations. Besides, the mean may be misleading in the case of a highly skewed distribution of the considered data. To overcome these weaknesses, a second statistical parameter is adopted: the median, which has been shown to be useful when comparing data sets. Once the median entropy of each market is computed, the values for each market are compared. Besides, a non-parametric test is conducted to check the statistical significance of the difference between both markets [37]. The Kruskal (Kruskal-Wallis) test compares two independent samples and checks whether the observations originate from two different distributions.

Finally, the volatility of each market is computed and serves to validate the results on price unpredictability. It is the degree of variation in the price series of each market and is measured by the classical standard deviation of the returns.
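
A minimal sketch of this rolling-entropy procedure follows (Python with NumPy/SciPy; the function names and synthetic daily price paths are illustrative assumptions, not the paper's data):

```python
import numpy as np
from scipy.stats import kruskal

def binary_entropy(s):
    """Shannon entropy of the up/down indicator s_t, Eqs. (16)-(18)."""
    p1 = s.mean()
    return -sum(p * np.log2(p) for p in (p1, 1 - p1) if p > 0)

def rolling_entropies(prices, window=252):
    """Entropy of the sign of daily returns over one-year sliding windows."""
    prices = np.asarray(prices, dtype=float)
    returns = 100 * np.diff(prices) / prices[:-1]   # Eq. (14)
    s = (returns >= 0).astype(int)                  # Eq. (15)
    return np.array([binary_entropy(s[k:k + window])
                     for k in range(len(s) - window + 1)])

# Illustrative usage on synthetic daily price paths (the study uses daily HH/NBP prices).
rng = np.random.default_rng(1)
hh = 3.0 * np.exp(np.cumsum(rng.normal(0, 0.01, 2200)))
nbp = 6.0 * np.exp(np.cumsum(rng.normal(0, 0.02, 2200)))
h_hh, h_nbp = rolling_entropies(hh), rolling_entropies(nbp)
print("median entropies:", np.median(h_hh), np.median(h_nbp))
print("Kruskal p-value:", kruskal(h_hh, h_nbp).pvalue)       # significance of the gap
print("volatilities:", np.std(100 * np.diff(hh) / hh[:-1]),  # std of returns
      np.std(100 * np.diff(nbp) / nbp[:-1]))
```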

Note that one can find in literature a version of the entropy adapted to the continuous random variable cases, called differential entropy [38], given by:

$$H(\mathbf{X}) = -\int\_{\mathbf{x}} f(\mathbf{x}) \times \log\_2 f(\mathbf{x}) \, d\mathbf{x} \tag{19}$$

where *f*(·) is the probability density function of the underlying distribution of the continuous random variable *X*. However, this method has several flaws: the density *f*(·) is unknown and must itself be estimated, the differential entropy can take negative values, and it is not invariant under a change of variables.


#### *2.5. Information Theory*

After defining the indicators, which can be extracted from the gas prices of each market, the regulator has to make important decisions. If the indicators point toward market concentration and price instability, then certain measures should be taken to restore stability to gas prices. Therefore, the two states defined in this study are either that the regulator takes action or keeps business as usual (BAU), denoted by *s<sub>i</sub>*, *i* = 1, 2 respectively. The previously defined indicators are denoted by *y<sub>k</sub>*, *k* = 1, 2, 3. Also, we denote by *p<sub>ik</sub>* the conditional probability that the market is in state *s<sub>i</sub>* after receiving the indicator *y<sub>k</sub>*, i.e.,

$$p\_{ik} = \mathbb{P}[S = s\_i \mid Y = y\_k] \tag{20}$$

The probabilities *p<sub>ik</sub>* are categorized into three classes: low, medium, and high, with respective probabilities 1/10, 1/2, and 9/10. The information structure is illustrated in Table 3.


**Table 3.** Information structure conditional probabilities.

The methods used in this paper reduce the subjectivity of the probability distribution. The indicators are complemented by econometric models founded on economic parameters of the relevant gas markets and by analysis of their gas price data, from which the information is extracted.

The probability of being in a given state (either *s*<sub>1</sub> or *s*<sub>2</sub>) after receiving an indicator (*y*<sub>1</sub>, *y*<sub>2</sub>, or *y*<sub>3</sub>) can take the value 0.1, 0.5, or 0.9, which constitute the possible events on the probability set. The probabilities of being in *s*<sub>1</sub> and in *s*<sub>2</sub> after receiving the same indicator *y*<sub>1</sub>, however, must sum to 1, since only two states are considered in this study.

$$\sum\_{i=1}^{2} p\_{i1} = \mathbb{P}[S = s\_1 \mid Y = y\_1] + \mathbb{P}[S = s\_2 \mid Y = y\_1] = 1 \tag{21}$$

Note that the first step is to compute the entropy, called the "*apriori*" entropy, based on the distribution of the states *S* before the reception of any additional information:

$$H(S) = -\sum\_{i=1}^{2} \pi\_i \log\_2 \pi\_i \tag{22}$$

where π<sub>*i*</sub> = P[*S* = *s<sub>i</sub>*] is the probability of being in state *s<sub>i</sub>* before receiving additional information, called the "*apriori probability*". As π<sub>*i*</sub> is defined based on no previous information, it is reasonable to consider the distribution with the highest level of uncertainty. In other words, the regulator has no information that would lead it to take an action, and the two states are equally likely. This is a uniform distribution with the "*apriori probabilities*" π<sub>1</sub> = π<sub>2</sub> = 1/2.
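
With these apriori probabilities, Equation (22) yields the maximal one-bit uncertainty:

$$H(S) = -\tfrac{1}{2}\log\_2\tfrac{1}{2} - \tfrac{1}{2}\log\_2\tfrac{1}{2} = 1 \text{ bit}$$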

Now, after receiving a specific indicator of information *yk*, the conditional entropy of the random variable *S* relative to the indicator *yk* is defined as:

$$H\_k(S \mid y\_k) = -\sum\_{i=1}^{2} p\_{ik} \log\_2 p\_{ik} \tag{23}$$

Hence, to assess the power of information generated by the whole information structure (composed of *y*<sub>1</sub>, *y*<sub>2</sub>, and *y*<sub>3</sub>), the "*posterior entropy*" is defined and compared to the "*apriori entropy*":

$$H(S \mid Y) = \sum\_{k=1}^{3} q\_k \times H\_k(S \mid y\_k) \tag{24}$$

where *q<sub>k</sub>* is the weight given to each indicator. The weights are equal for the three indicators (*q<sub>k</sub>* = 1/3), as they are equally important and each contributes to the understanding of the gas market in a different way.

Finally, the amount of reduced uncertainty, due to the additional received information, is measured by a quantity called mutual information:

$$I(S, Y) = H(S) - H(S \mid Y) \tag{25}$$

Thus, to compare the power of information generated by two information structures related to two different gas markets, one should prefer the one with the highest *I*(*S*,*Y*). Equivalently, the information structure that reduces the most uncertainty about the random variable *S* is the most efficient and powerful.
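
The following is a minimal sketch of this computation (Python; the conditional probability matrix is a hypothetical illustration drawn from the paper's three classes, not the values of Table 9):

```python
import numpy as np

def entropy_bits(probs):
    """Shannon entropy in bits, as in Eqs. (22) and (23)."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical p_ik = P[S = s_i | Y = y_k]: rows are the states s1 (intervene)
# and s2 (business as usual); columns are the indicators y1, y2, y3.
# Each entry belongs to {0.1, 0.5, 0.9} and each column sums to 1 (Eq. (21)).
p = np.array([[0.9, 0.5, 0.1],
              [0.1, 0.5, 0.9]])

q = np.full(3, 1 / 3)                               # equal indicator weights q_k
apriori = entropy_bits([0.5, 0.5])                  # uniform prior: H(S) = 1 bit
posterior = sum(q[k] * entropy_bits(p[:, k]) for k in range(3))   # Eq. (24)
print("mutual information I(S, Y):", apriori - posterior)          # Eq. (25)
```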

#### **3. Results and Discussions**

#### *3.1. Results of the Non-Parametric Cournot Test*

Considering Table 1, it is evident that both gas markets are competitive; this is particularly significant in the case of HH. Table 1 draws attention to two main numbers. The first is the big difference between the volumes of gas traded in the futures market and those traded physically, which indicates extensive participation of traders and financial players in the virtual market. The second is the large churn ratios, which indicate high liquidity and a healthy trading platform, an attractive characteristic for all stakeholders. Unlike that of the U.S. gas market, the Herfindahl index of the European gas market is relatively high [11]. The low U.S. index is a sign of healthy competition and simply means that, among the many gas suppliers in the U.S. market, none has market power on its own. However, this violates one of the main assumptions of a Cournot competition model, in which firms have market power and each firm's output decision affects gas prices. In a nutshell, there is no risk of market manipulation in such a market; market concentration is minimal and close to zero. All U.S. gas suppliers should be price-takers in this case, and the Cournot acceptance rate is no longer valid.

As explained in Section 2, the data used for the Cournot test consist of the gas prices and the gas supplies to the relevant market. Gas supplies are shown in Figures 1 and 2; the suppliers are represented by countries of origin. Results would be more indicative if the supply data were composed of the volumes of the actual suppliers (shippers, traders, and companies) rather than the country (market) where the gas was purchased. The authors acknowledge the need for suppliers' data in order to perform the Cournot test on the American market; however, with no publicly available information on the supply market shares of companies, this is not possible. We therefore encourage the publishing agencies to list such data on their websites (or provide it upon request). The FERC Form 552 provides a database of trading activity and lists the companies (Top 20) with the largest total transaction volumes from year to year; however, the list found in [15] is incomplete and contains yearly data only. Thus, additional data on suppliers' portfolios is needed to obtain valid test results. The suppliers in North Western Europe are oligopolistic [39]; therefore, using these data in the algorithm leads to conclusive and significant results.

The non-parametric test results for the European market give a Cournot acceptance rate of 51%. The results can be analyzed as follows: the behavior of the large gas suppliers in the European market can be explained by a Cournot model, where suppliers try to maximize their payoffs by competing over quantities. However, the remaining share of observations means that there are companies that are not behaving as such. This could imply that some suppliers follow other strategies, such as collusion, or strategies that are not "pure" profit maximization.

An example of possible collusive behavior was witnessed in the oil markets under the Organization of the Petroleum Exporting Countries (OPEC) back in the 1970s. These countries controlled a major share of the world's oil supplies and together formed a cartel that cooperated to increase prices and limit external competition.

Other examples that can be used to illustrate possible reasons why these suppliers are not seeking a "profit only" strategy under the Nash-Cournot umbrella are listed below:

Authors such as [40] suggest that Gazprom, a major gas supplier, maximizes its "utility function" not only by limiting itself to one strategy focused on making a profit, but also by contemplating other strategies, such as seeking to eliminate competition, even if this initially leads to some loss of profit.

Other authors, such as [41], enumerate further reasons preventing some European gas suppliers from exerting their oligopolistic power, namely old legacy contracts that are still in effect and, perhaps, new regulations. In short, the gas price mechanisms in old legacy contracts are mainly indexed to oil prices, and this type of contract does not offer gas suppliers the needed flexibility. These valid assumptions are among the many possible reasons why the Cournot acceptance rate is not that elevated in Europe.

The first indicator is therefore informative, and the analysis of the NBP wholesale gas prices is indicative for the regulator in this market.

#### *3.2. Results of the Records Theory*

The second indicator is assessed using records theory. To test whether the data belong to an *i.i.d* sequence of variables, the goodness-of-fit test is used.

The results shown in Table 4 indicate that the European market rejects the null hypothesis. The test results were computed at a significance level of 5%. Accordingly, and from an empirical perspective, the gas prices recorded in this market are characterized by price variations and sudden price shifts.


**Table 4.** Goodness of fit test results.

Based on the analysis of Table 5, the result is not surprising, as it confirms that the European market has a high number of records relative to its small number of observations. This indicates that the European gas price records are not grouped in one section of the time series but are instead spread out, while the U.S. market is rather more stable, with price shifts rarely observed along the time series.


**Table 5.** Number of records and record index.

Looking further, in an attempt to measure the probability of witnessing a record in each gas market, the Yang model is used for the European market and the classical model for the U.S. market. The probabilities were computed for each market, and Table 6 shows the probability corresponding to June 2018.


The probability of witnessing a new record is higher in the European market. The results of the above analysis can be summarized as follows: the European market is the less stable of the two, with a persistent probability of new price records, whereas in the U.S. market new records have become increasingly rare.


#### *3.3. Results of the Shannon Entropy*

By applying the procedure described in Section 2.4 dealing with Shannon entropy, the representative median entropy of each considered market, in addition to the *p*-value of the Kruskal non-parametric test, is calculated. The results of the entropy approach are presented in Table 7.


If a random variable *X* follows a discrete uniform distribution with *n* possible outcomes, the corresponding entropy is *H*(*X*) = log<sub>2</sub> *n* [42]. Hence, in our context, both entropy values are close to the uniform case log<sub>2</sub> 2 = 1, which means that both markets are far from being predictable.

As the median entropies of the two considered markets are close to each other, it is essential to test whether they are issued from two different distributions. If that is the case, the difference between the two medians is significant. The non-parametric Kruskal test is applied to verify this hypothesis.

Based on Table 7, the *p*-value of the Kruskal test is close to zero, and therefore less than 5%. Accordingly, the difference between the markets in terms of entropy is significant, i.e., the market with the higher median value, the European market in our case, has an entropy significantly higher than that of the U.S. market.

Also, the volatility in the U.S. market is very low (0.7), whereas it is significantly high in the European market (2.5).

Thus, for indicator number three, the U.S. market is significantly more predictable and has lower uncertainty than the European market.

#### *3.4. Synthesis of Indicator Results*

In an attempt to better illustrate the results of the three mathematical models used in the previous sections to measure the market indicators, Table 8 summarizes the results and lists the main findings for each market.


#### **Table 8.** Synthesis of indicator results.

#### *3.5. Results of the Information Theory*

Concentrated markets raise regulatory and antitrust concerns, as this is a clear sign of market power in the hands of suppliers. Appropriate actions need to be taken by the regulator to make sure that neither collusion, nor cooperation between companies, nor any kind of strategic decisions that do not end up in favor of consumer welfare, are permitted.

The regulator in such a case should ensure that, under no circumstances, the companies communicate and reach an agreed-upon understanding to raise prices and profit margins at the expense of consumer welfare. The barriers to entry for new companies should also be considered and reduced by regulators in such markets, to increase competition and diversify supplies. These are some examples of actions that the regulator can impose on the suppliers.

Markets that witness price volatility and uncertainty in the medium term, as well as price instability in the long term, also raise concerns for regulators. In such a case, the key determinants of gas price movements are supply and demand fundamentals [43]. A slowdown in global demand is a key downside risk for suppliers, which will eventually earn less when selling their gas. On the other hand, a sudden slowdown in supply is a key downside risk for another player in the gas value chain, the consumer, who will have to pay more to purchase the commodity.

In both cases, regulators should anticipate such outcomes by acting in favor of a continuous supply and demand equilibrium: by diversifying supply (indigenous production, imports, and storage), while ensuring that consumers have the appropriate infrastructure and financial means to buy the commodity. However, in a market characterized by gas prices that are predictable in the medium term and stable over the long term, there is no need for further action by the regulator.

Moving forward, we start by assigning the relevant conditional probabilities *p<sub>ik</sub>*, which indicate to the regulator the state of nature of the gas market. As previously mentioned, the probabilities are categorized into three classes: low, medium, and high. It is also important to remember that the sum of *p*<sub>11</sub> and *p*<sub>21</sub> is equal to 1, as we only have two possible states; the same applies to indicators 2 and 3.

After presenting the results of the three indicators in Sections 3.1–3.4, Table 9 explains the process of probability category selection and lists the results for both the U.S. and European markets.


**Table 9.** Information structure conditional probabilities for both markets.

The results listed in Table 9 give a clear indication that the market in the United States is functioning smoothly and that the regulator does not need to add further measures. In other words, the BAU case is favored.

This is not the case, however, for the European market, where the regulator is more inclined to intervene. The UK regulator OFGEM has to intervene and investigate the reasons behind the observed instability and the signs of non-competitive behavior, whereby some firms are not focused solely on profit maximization.

To compute the global power of information generated by the considered information structure, we start by assessing the level of uncertainty of each received indicator by computing its conditional entropy *H<sub>k</sub>*(*S* | *y<sub>k</sub>*); then we get:
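
With the three probability classes, each conditional entropy can take only one of two values (shown here as a worked computation; the per-indicator and per-market assignments are those reported in Table 10):

$$H\_k(S \mid y\_k) = \begin{cases} -0.1\log\_2 0.1 - 0.9\log\_2 0.9 \approx 0.469 \text{ bits} & \text{if } p\_{1k} \in \{0.1, 0.9\} \\ -0.5\log\_2 0.5 - 0.5\log\_2 0.5 = 1 \text{ bit} & \text{if } p\_{1k} = 0.5 \end{cases}$$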

The "*posterior entropy,*" previously defined by *H S Y* is then computed, and compared with the "*apriori entropy,*" which is defined in Section 2.5 as the entropy of a uniform distribution (one that has the highest level of uncertainty), and given a value of 1.

Tables 10 and 11 illustrate the entropies conditional on the relevant indicators, which are then used to compute the aggregated outcome of these indicators for each market.


**Table 10.** Conditional entropy of each indicator.

#### **Table 11.** "Posterior" and "apriori" entropies.


The difference between the "*posterior entropy*" and the "*apriori entropy*" helps assess the level and amount of information, previously defined as *I*(*S*,*Y*), gained by analyzing the gas price data in each market. In other words, the indicator analyses measured by the various econometric methods used in this study constitute additional information that regulators can use to assess the status of the market. The larger this gain in information (i.e., the difference between the "*posterior entropy*" and the "*apriori entropy*"), the more confident the regulator can be in the power of the information generated.

The amount of reduced uncertainty, due to the additional information received from the indicators, is estimated at 0.38 for the U.S. market and 0.18 for the European market, which means that the level of uncertainty has been reduced by 38% in the U.S. market and by 18% in the European market. The value of the information contained in both markets, although asymmetric, is significant and powerful, and can serve as a reliable and efficient source of information.
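
As a consistency check against Equation (25) and an apriori entropy of 1 bit, these values correspond to posterior entropies of *H*(*S* | *Y*) = 1 − 0.38 = 0.62 bits for the U.S. market and 1 − 0.18 = 0.82 bits for the European market.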

#### **4. Conclusions**

Overall, this work presents four econometric and mathematical methods that are used collectively to estimate the level of information contained in gas prices in two separate wholesale gas markets, i.e., the European and the U.S. gas markets. The theories employed are Cournot theory, records theory, Shannon entropy, and information theory.

By analyzing the efficiency of the gas market and assessing the need for additional measures and intervention, the work of gas regulators with regard to market oversight is likely to be improved. The value of the information is based on three market indicators: the possibility of non-competitive behavior by gas firms, market stability, and uncertainty in prices.

Our findings suggest that the U.S. gas market is stable. The information value contained in the wholesale gas prices gives a clear indication that there is no need for additional market oversight. However, this is not the case in the UK (the most developed European gas market), where results show signs of market instability and non-competitive behavior. In other words, some firms are not only focused on profit maximization; therefore, the wholesale prices are not solely the product of classical law of supply and demand.

Interestingly, the value of the additional information brought about by the indicator analysis and included in both markets has contributed to reducing uncertainty. This makes the information carried in the gas prices of both markets, although asymmetric, powerful and efficient. The regulators in both markets can, therefore, act accordingly by using the two-step approach to assess the level of competition, price stability, and price predictability.

The originality of the two-step approach applied in this paper can be summarized as follows: it is the first time that several multidisciplinary econometric methods have been combined to create a probabilistic structure assessing the underlying information of a gas market. Furthermore, the approach deals with a case where the information is neither completely absent nor perfectly known, which has rarely been dealt with in the literature.

It is worth mentioning that the authors have chosen three market indicators and four different econometric methods in this study. It is believed that additional mathematical/statistical analyses can be applied to this topic. For further research, one could estimate the entropy (the third indicator) using another discretization procedure; this is a growing research track and requires a large number of observations. Besides, one could also work on creating estimators for the underlying probability distribution of each indicator. A starting point is to apply goodness-of-fit techniques or a more empirical approach such as a bootstrap process.

**Author Contributions:** Formal analysis, H.H.; methodology, A.H. and H.H.; project administration, H.A.; software, A.H. and H.H.; writing—original draft, H.H.; writing—review and editing, H.A. and H.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors acknowledge TU Wien Bibliothek for financial support through its Open Access Funding Program.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

Algorithm for Non-parametric Cournot test


$$\gamma\_{i,t} = \min \{ \min\_{\{t' \neq t \colon Q\_{i,t'} > Q\_{i,t}\}} \{ \mathrm{MC}\_{i,t'}^{ub} \} , \mathrm{MC}\_{i,t}^{ub} \} $$

Note that if {*t*′ ≠ *t* : *Q<sub>i,t′</sub>* > *Q<sub>i,t</sub>*} = ∅, set γ<sub>*i,t*</sub> = *MC*<sup>ub</sup><sub>*i,t*</sub>. This step ensures that Condition 1 is respected.

• *Step 3:* Define, for each supplier *i* and at each time *t*, the variables λ<sub>*t*</sub> and γ<sup>ub</sup><sub>*i,t*</sub>:

$$\lambda\_t = \max\_j \left\{ \frac{P\_t - \gamma\_{j,t}}{Q\_{j,t}} \right\}$$

and

$$\gamma\_{i,t}^{ub} = P\_t - \lambda\_t Q\_{i,t}$$

This step ensures that Condition 2 is respected.

- i. If ∃(*i*, *t*) such that γ<sup>ub</sup><sub>*i,t*</sub> < 0, then the algorithm is stopped, and it is concluded that the Cournot equilibrium conditions are not satisfied.
- ii. If ∀(*i*, *t*), *MC*<sup>ub</sup><sub>*i,t*</sub> = γ<sup>ub</sup><sub>*i,t*</sub>, then the algorithm is stopped, and it is concluded that the Cournot equilibrium conditions are satisfied.
- iii. Otherwise, return to Step 1 for a new iteration, letting *MC*<sup>ub</sup><sub>*i,t*</sub> = γ<sup>ub</sup><sub>*i,t*</sub> for all (*i*, *t*).
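
A compact Python sketch of the above iteration follows. This is a reconstruction under assumptions: the initialization *MC*<sup>ub</sup><sub>*i,t*</sub> = *P<sub>t</sub>* follows the description in Section 2.2, since Steps 1–2 are only partially legible here, and the variable and function names are illustrative.

```python
import numpy as np

def cournot_test(P, Q, max_iter=1000, tol=1e-9):
    """Non-parametric Cournot test of Appendix A (sketch).
    P: prices P_t, shape (T,). Q: strictly positive quantities Q_{i,t}, shape (N, T).
    Returns True if marginal costs satisfying Conditions 1 and 2 exist."""
    N, T = Q.shape
    mc_ub = np.tile(P.astype(float), (N, 1))        # start from the highest cost: MC = P_t
    for _ in range(max_iter):
        # Step 2: gamma_{i,t} = min over {t' != t : Q_{i,t'} > Q_{i,t}} of MC^{ub}_{i,t'},
        # capped by MC^{ub}_{i,t}; if the set is empty, gamma_{i,t} = MC^{ub}_{i,t}.
        gamma = mc_ub.copy()
        for i in range(N):
            for t in range(T):
                larger = Q[i] > Q[i, t]
                if larger.any():
                    gamma[i, t] = min(mc_ub[i, larger].min(), mc_ub[i, t])
        # Step 3: common markup lambda_t and updated upper bounds gamma^{ub}_{i,t}.
        lam = np.max((P - gamma) / Q, axis=0)       # lambda_t = max_j (P_t - gamma_{j,t})/Q_{j,t}
        new_ub = P - lam * Q                        # gamma^{ub}_{i,t} = P_t - lambda_t * Q_{i,t}
        if (new_ub < 0).any():                      # (i): no admissible costs -> reject
            return False
        if np.allclose(new_ub, mc_ub, atol=tol):    # (ii): fixed point -> Cournot accepted
            return True
        mc_ub = new_ub                              # (iii): iterate with tightened bounds
    return False
```

The Cournot acceptance rate δ of Section 2.2 can then be obtained by running this test over subsets of observations and reporting the share that is accepted.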

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A New Methodology to Obtain a Feasible Thermal Operation in Power Systems in a Medium-Term Horizon**

#### **Luis Montero \*, Antonio Bello and Javier Reneses**

Institute for Research in Technology (IIT), ICAI School of Engineering, Comillas Pontifical University, 28015 Madrid, Spain; antonio.bello@iit.comillas.edu (A.B.); javier.reneses@iit.comillas.edu (J.R.) **\*** Correspondence: luis.montero@iit.comillas.edu

Received: 27 March 2020; Accepted: 10 June 2020; Published: 12 June 2020

**Abstract:** Nowadays, electricity market paradigms are constantly changing. On the one hand, the deployment of non-dispatchable renewable energy sources is bringing out the necessity of representing hourly dynamics in medium-term fundamental models. On the other hand, the promotion of new interconnection capacity and the integration of markets (as is the case of the European market) make it necessary to model multiple electricity systems simultaneously. Thus, the large size of power markets, together with the consideration of uncertainty in some inputs, makes it computationally intractable to work rigorously on an hourly detailed time span. Temporal aggregation, integer programming relaxation or less accurate generation modeling are usually employed to obtain reasonable computation times. However, the application of these techniques often leads to infeasible or suboptimal operational outputs. This paper proposes a new soft-linking methodology to reconcile reliable results from medium-term models, such as hourly prices or aggregated productions, with a feasible and detailed representation of the thermal generation, considering technical constraints and risk aversion. The results of a fundamental model that represents the competitive behavior between market players in a multi-area power system are used as the starting point for the methodology. Then, a post-processing method is applied to optimize and make feasible the thermal portfolio of a market agent. The final output is a feasible hourly scheduling and an ample space for optimization, where the introduction of a strategic term represents the rational behavior of a player who tries to maximize its profit.

**Keywords:** electricity markets; feasible operation; medium-term representation; optimization models; power systems; thermal generation; unit commitment

#### **1. Introduction**

The liberalization of electricity markets over the last decades has highlighted the importance for generation companies to optimize the production of their thermal units in order to maximize profits and be competitive. Power market models have traditionally been used as a supporting tool for this purpose and have proven their effectiveness. However, the accurate representation of current market trends in real size systems has become a great challenge. In particular, there are four main sources of the increasing complexity in electricity market modeling.

First, the maturity status of the onshore and offshore wind generation, together with the recent drop in the price of photovoltaic facilities, are leading to major changes in traditional power mixes. The variability of electricity supply will be accentuated in the near future with an increasing penetration of these renewable energy sources. This trend brings out the need to consider a more detailed time granularity in the horizons of energy models, as well as detailing the flexibility of thermal units [1]. This is especially crucial when trying to represent and assess the behavior of storage (pumping, batteries, etc.) facilities, which are highly influenced by time chronology.

Second, current market paradigms do not only demand greater modeling detail. The energy integration policies adopted in many regions have promoted a notable increase in interconnection capacities between countries. Additionally, the market diversification strategies embraced by some players to cover risks have also contributed to enlarging the simulation dimensions, making it necessary to represent multi-area power systems in order to analyze an electricity market [2]. In particular, the European electricity market is becoming more and more integrated and is now cleared on a multi-country basis.

Third, the aforementioned increase in intermittent renewable generation makes it necessary to properly address the technical constraints, costs, and flexibility of thermal units. The fast-ramping ability of gas-fired power plants makes them the best medium-term option to assure a reliable electricity service, covering demand peaks, plant outages and drastic variations in renewable generation. This fact points out the importance of an accurate representation of the operation of flexible thermal units like combined cycle gas turbines (CCGTs) [3].

Finally, the competitive behavior of generation companies should also be included in energy models. This is relevant both for profit maximization and for risk aversion. The continuous evolution of energy regulation, the ongoing switching of generation technologies, and the increasing deployment of intermittent energy resources all heighten the importance of considering uncertainty [4]. This means that models have to be run over a number of scenarios of the different risk variables.

These changing market conditions can be represented accurately through the application of short-term methodologies, which allow a rigorous representation of large power systems over a short and detailed time horizon. However, generators, retailers and large consumers also need medium-term tools to optimize the operation of their assets and support decision-making processes such as fuel purchases, hydro-thermal management, emissions allowance trading or financial contracting [5]. In particular, the role of natural gas as a vector in the energy transition towards a renewable mix makes it essential for generation companies to know how to obtain the maximum return on these assets [6].

Despite continuous computational improvements, the extension of accurate short-term techniques to longer horizons is computationally intractable. For this reason, modeling simplification is necessary, and some assumptions are made in the literature in order to reduce the size of the considered problems. Nevertheless, the combination of high-detail modeling, multiple areas, hourly granularity, and the consideration of uncertainty all at once is increasingly desired. These aspects are analyzed in depth in Section 2.

This paper proposes a new soft-linking methodology to fill this gap, reconciling reliable results from medium-term models, such as hourly prices or aggregated productions, with a feasible and detailed representation of the thermal generation that considers technical constraints and risk aversion.

#### **2. Literature Review**

The operation of power systems is a subject widely studied in the literature. The representation and optimization of hydro-thermal generation has been addressed in depth in order to increase its profitability. For this purpose, several modeling techniques have been developed to achieve accurate performance of the simulation tools.

A suitable example of the rigorous representation of large power systems is the unit commitment (UC) problem. It provides the optimal dispatch of thermal units according to price and demand forecasts. Technical constraints, operating costs, and profit maximization can be easily modelled with this methodology, and useful results are obtained in reasonable running times. In most of the literature, e.g., [7–13], UC considers time spans of one day to one week on an hourly basis, performing a precise simulation of thermal generation that cannot be extended to longer time horizons.

The representation of the strategic behavior between players in competitive electricity markets is also a problem widely discussed in the literature. A great variety of methodologies have been proposed to apply Nash's game theory [14] to electricity markets [15]. Diverse models, based on the mixed complementarity problem [16,17], heuristic techniques [18] or dynamic programming [19], have successfully described competitive behavior in the medium term.

In practice, market equilibrium tends to be represented as an optimization problem. With this aim, behavioral assumptions are made, such as perfect competition [20] or Cournot competition [21]. Both behaviors, as well as a wide range of intermediate situations between these two extremes, are encompassed by conjectural-variation (CV) models. Quite accurate simulations of oligopolistic markets can be performed with these models. In [22], a medium-term CV-based model is proposed, and the equivalence of this optimization problem to a market equilibrium problem is mathematically demonstrated.

Medium-term equilibrium models face a tight trade-off between modeling detail and run time. Despite the improvements in computational techniques, complete hourly detail in real size cases with uncertainty consideration is still intractable. Nevertheless, renewable penetration and market integration developments bring out the necessity of multi-area medium-term modeling on an hourly basis, in order to adequately capture the greater supply variability and the increase in interconnection capacity.

Generic formulations are frequently employed in many academic and commercial models for the medium- and long-term representation of energy systems [23–37]. These open formulations increase flexibility and bring customization options to users. Nonetheless, they are not always accompanied by a case study showing the real scope of the model. Temporal horizons, thermal unit detail or multi-area limitations are not usually delimited when a new model is presented.

Many of these methodologies offer a wide modeling catalogue to perform a meticulous simulation. High granularity in temporal representation, long-term horizons, multi-area representation, integer programming, and even the inclusion of uncertainty are often available in the same model. Nevertheless, it is computationally intractable to consider every technique at once in a medium- or long-term representation of a real power system.

As an example, Table 1 shows the particularities of each model formulated in [23–37], exposing the difficulty of considering every accurate modeling technique at once. It demonstrates that simplifications are necessary when the time span exceeds short-term horizons.

The representation of uncertainty over a detailed medium-term horizon frequently implies renouncing integer programming and its accurate modeling of power system operation [38,39]. In fact, integer variables are also relaxed if high granularity along the whole time span is desired [40–42]. Otherwise, if MIP performance is necessary, time aggregation techniques are usually implemented [30,35], either through temporal decoupling [43] or through a drastic reduction in the problem size [31]. Other representative examples of the trade-off between modeling detail and computational resolution of medium-term models are illustrated in [29,44,45].

As expected, energy market representations in the long term require the same modeling simplifications. The combination of integer programming with a complete hourly resolution can hardly be afforded even for a tiny power system [46]. Integer variable relaxation and temporal aggregation techniques [37], or a low time granularity [32], are needed if the representation of real size electricity markets is desired. In conclusion, medium- and long-term models cannot afford full modeling detail at once on an hourly basis.

These cases highlight the need for temporal aggregation techniques in the representation of large electricity markets in the medium term if high temporal granularity is desired. Load blocks [47] have traditionally been employed as a clustering technique to reduce the consumption of computational resources over hourly time spans. These clusters represent the system through demand levels.

The variability introduced by the penetration of non-dispatchable renewable technologies can be modelled with a load duration curve that characterizes the net demand [48]. This formulation was superseded by system states [49,50], which ushered in the inclusion of multiple system features and introduced cluster-transition concepts to increase the accuracy of the model.


**Table 1.** Modeling simplifications of each case study. IP: Integer Programming.

\* Integer variables are relaxed when the time span exceeds one year.

Nonetheless, the chronology between clusters was not taken into account until [51,52], where new constraints were formulated to keep the technical information in the transitions between clusters. In this way, chronological relationships are maintained using system states or enhanced representative periods. However, like every temporal aggregation method, they sacrifice some detail to gain resolution time.

The application of these techniques to optimization models often leads to suitable results in the medium-term forecasting of power systems. The methodology proposed in [53] for a medium-term fundamental model based on conjectural variations represents uncertainty in risk variables such as demand, hydro inflows, wind generation, fuel costs, CO2 prices and the unavailability of thermal units, achieving an accurate characterization of hourly prices for the Spanish electricity market as a case study. Nevertheless, it does not offer a feasible hourly scheduling.

The application of integer programming to cases of such large size and detail as those previously mentioned is computationally intractable in hourly horizon representations. The inclusion of clustering techniques in the unit commitment problem [54–56] opens the door to extending the time spans of models that work with integer variables, making it possible to represent in detail the operation of large energy markets in the medium term.

Moreover, the use of temporal aggregation techniques complicates the representation of start-ups, since their cost depends on the number of hours that the thermal unit has been offline. Representing this cost as a single step [56] does not provide great detail. The formulation of [54] seems to overcome this problem, but representation by centroids will always entail a loss of the real variability existing between the elements that make up the cluster.

In [57], a new methodology is proposed in order to preserve the variability between hours in the performance of a medium-term model that considers system states. The use of statistical techniques results in a notable improvement in the representation of the hourly prices of the electricity market. Nevertheless, their use cannot be extrapolated to the production outputs. In turn, it does not solve the problem of hourly infeasibilities either, which appear when the problem size or the consideration of uncertainty obliges the use of relaxed variables if reasonable run times are sought.

Integer programming relaxation is widespread in medium-term models. The use of continuous variables provides differentiability, which powerful current solvers exploit to solve large LP and QCP problems without consuming too much time or too many computing resources. However, this relaxation entails a loss of fidelity in the results, as it accepts fractional values for decision variables such as the commitment status or the start-ups and shut-downs of thermal units. For this reason, some outputs are far from the real behavior of the power system.

Continuous variables allow partial commitment statuses for thermal units, as well as a non-integer internalization of the costs of start-up and shut-down processes. This means that production results and hourly costs cannot always be extrapolated to reality. However, from a coarser perspective, such as a monthly view, some outputs of these models, like productions, incomes and costs, are quite close to the real values for each player, providing useful information for generation companies.
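As a toy illustration of this effect (not part of the original methodology), the following Python sketch relaxes the binary commitment variable of a single unit to the interval [0, 1] and obtains a fractional, physically meaningless commitment; all figures are hypothetical:

```python
# Toy illustration: relaxing the binary commitment variable u to [0, 1]
# yields a fractional commitment in the resulting LP.
from scipy.optimize import linprog

# One unit, one hour: minimize no-load + variable cost while serving 60 MW.
# Variables: x = [u, p]; cost: 100 * u (no-load) + 20 * p (variable).
C_NL, C_LV = 100.0, 20.0
P_MIN, P_MAX = 50.0, 150.0
DEMAND = 60.0

c = [C_NL, C_LV]
# p <= u * P_MAX  ->  -P_MAX * u + p <= 0
# p >= u * P_MIN  ->   P_MIN * u - p <= 0
A_ub = [[-P_MAX, 1.0], [P_MIN, -1.0]]
b_ub = [0.0, 0.0]
A_eq = [[0.0, 1.0]]   # p = DEMAND
b_eq = [DEMAND]
bounds = [(0.0, 1.0), (0.0, None)]  # u relaxed to the interval [0, 1]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
u, p = res.x
print(f"u = {u:.2f}, p = {p:.1f} MW")
# u = 0.40: a partial commitment that lowers the internalized no-load cost
# but has no physical meaning for the actual unit.
```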

Given the usefulness of these results for companies and their closeness to real values, a novel post-processing method is proposed in this paper in order to obtain a feasible hourly scheduling for the thermal generation portfolio of a market player. The scope of this methodology is to overcome the aforementioned gaps in the existing literature, while being flexible enough to be applied to several medium- or long-term simulation tools. The main contributions of this paper are:


#### **3. Methodology**

#### *3.1. Overview*

As previously discussed, modeling simplifications are usually assumed in medium- and long-term models. Temporal aggregation, integer programming relaxation or less accurate modeling are used to achieve computational feasibility. Furthermore, risk representation notably increases the complexity of the problem. The longer the horizon, the more difficult it is to consider uncertainty, whether through stochastic programming or Monte Carlo simulations.

This section proposes a methodology to harmonize the performance of a medium-term model, subject to modeling simplifications, with short-term techniques that bring a detailed representation of power market operation.

#### 3.1.1. Medium-Term Fundamental Model

The methodology takes as inputs the highly reliable outputs of a medium-term market model. To this end, a medium-term fundamental market model will be solved. This model seeks a detailed representation of the operation of an electricity market, considering its regulatory framework and technical constraints. The main features of this kind of model are described below:


Generally, some simplifications are needed in order to make these models computationally tractable. First, integer programming relaxation is typically needed; this relaxation also helps to obtain good price signals, since with pure integer programming the dual variables that represent prices only reflect the variable costs of the committed units. Secondly, the application of time aggregation or selection techniques is also required, either by aggregating similar hours into time blocks (and consequently losing the chronology), or by selecting prototypical days or weeks. Either simplification is able to obtain accurate aggregated results (such as weekly or monthly productions) but will fail to obtain feasible and realistic hourly operations. Finally, some simplifications are needed regarding thermal units, such as detailed start-up costs that depend on the time the unit has been offline.

The proposed methodology will be tested with a particular medium-term model, but the formulation is open to the consideration of any medium- or long-term model, whose operation results are desired to be made feasible and optimal on an hourly basis.

#### 3.1.2. Post-Processing Methodology

Once the results of the medium-term model are obtained, they are used as input data by the post-processing method, which provides a detailed, accurate, and feasible thermal schedule. The methodology is stated from the point of view of a thermal agent trying to obtain a feasible operation of its thermal portfolio. This feasibility process is formulated as an optimization problem, which responds to the rational behavior of a player that wishes to maximize its profit while considering risk aversion. The methodology uses as input data the expected hourly prices and the expected productions of the thermal portfolio over the considered time span. It is important to note that these results respond to a rational management of the infrastructure, such as hydro reservoirs, fuel storage, maximum number of start-ups, minimum annual operation hours, etc. These expected productions are the aggregated values of each thermal unit included in the portfolio. Compliance with the production goals has a certain degree of flexibility, according to a strategic term that can be adjusted by the market agent. Hence, the optimization process can either allow only some slack in the production targets to avoid infeasibilities, or it can have a more flexible character, in which a redistribution of productions between units is allowed. This depends on the strategic term, which can be easily determined by the market player according to extra operational costs, logistics, opportunity costs and any other desired consideration.

The next section describes the mathematical formulation of the methodology. Its nomenclature is included in a glossary at the end of the document. Upper-case letters are used for denoting parameters and sets, while lower-case letters denote variables and indexes. Hourly intervals are considered for unit consistency.

#### *3.2. Mathematical Formulation*

The post-processing method is presented as an optimization problem in which a player wishes to maximize its profits by adjusting a given production of its thermal units. In addition to this input, an hourly market price forecast throughout the considered time span is also taken as an input. This time horizon is flexible, being able to cover days, weeks, or even months.

The objective function of the maximization problem is shown below. It is subject to the constraints formulated throughout this section:

$$\max \sum\_{g \in G} \sum\_{t \in T} \left( \lambda\_{t}\, p\_{g,t} - c\_{g,t}^{PROD} - c\_{g,t}^{SD} - c\_{g,t}^{SU} \right) - \sum\_{g \in G} c\_{g}^{DIV} \tag{1}$$

#### 3.2.1. Production Costs

The production costs of the thermal units are modelled as a quadratic function of the power output:

$$c\_{g,t}^{PROD} = u\_{g,t} C\_{g}^{NL} + p\_{g,t} C\_{g}^{LV} + p\_{g,t}^{2} C\_{g}^{QC} \tag{2}$$

This formulation is more accurate than the simplified linear production costs used in [9,12,56]. However, it is less detailed than the piecewise approximation of [7], where the use of binary variables allows the non-convex and non-differentiable variable costs of the thermal units to be segmented and adjusted with high accuracy.

The choice of a quadratic function is based on the capability of current solvers to work with MIQCP problems, preventing the alternative MILP problem from slowing down due to a large number of binary variables, whose resolution would require a long run time. Besides, it constitutes a highly accurate approximation to the actual cost function.
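For illustration purposes only, the following Python sketch evaluates the quadratic production cost of Equation (2) with hypothetical coefficients, not data from the paper:

```python
# Minimal sketch of the quadratic production cost of Equation (2);
# the coefficients below are illustrative.
def production_cost(u, p, C_NL, C_LV, C_QC):
    """Hourly production cost: no-load + linear + quadratic terms."""
    return u * C_NL + p * C_LV + p**2 * C_QC

# Example: a committed unit (u = 1) producing 300 MW.
print(production_cost(u=1, p=300.0, C_NL=5000.0, C_LV=35.0, C_QC=0.002))
# 5000 + 10500 + 180 = 15680.0 (monetary units)
```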

#### 3.2.2. Start-Up and Shut-Down Costs

The exponential nature of the start-up cost function, whose cost increases with the number of hours that the unit has been offline, is usually represented by a stairwise approximation, as Figure 1 illustrates, or simplified to a single-step cost [9,56].

**Figure 1.** Stairwise approximation of the non-linear start-up cost function.


Given the advantages of the formulation of [12] over that of [10], the equations of [12] are chosen to represent the behavior of start-ups while consuming fewer computational resources:

$$c\_{g,t}^{SU} = \sum\_{s \in S\_{g}} \delta\_{g,s,t} C\_{g,s}^{SU} \tag{3}$$

$$\delta\_{g,s,t} \le \sum\_{i=T\_{g,s}^{SU}}^{T\_{g,s+1}^{SU}-1} w\_{g,t-i} \qquad \forall g,\ s \in [1, S\_{g}),\ t \in [T\_{g,s+1}^{SU}, T] \tag{4}$$

$$v\_{g,t} = \sum\_{s \in S\_{g}} \delta\_{g,s,t} \tag{5}$$

$$\delta\_{g,s,t} = 0 \qquad \forall g,\ s \in [1, S\_{g}),\ t \in (T\_{g,s+1}^{SU} - T\_{g}^{0},\ T\_{g,s+1}^{SU}) \tag{6}$$
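The selection logic behind Equations (3)–(6) can be sketched simply: the start-up cost grows in steps with the number of offline hours, and the step reached at start-up determines the incurred cost. The thresholds and costs below are hypothetical three-step values, not data from the paper:

```python
# Illustrative stairwise start-up cost (Figure 1, Equations (3)-(6)):
# the step is selected according to the hours the unit has been offline.
STEPS = [(1, 8000.0),    # hot start:  offline >= 1 h
         (8, 14000.0),   # warm start: offline >= 8 h
         (24, 20000.0)]  # cold start: offline >= 24 h

def startup_cost(hours_offline):
    """Return the cost of the deepest step reached while offline."""
    cost = 0.0
    for threshold, step_cost in STEPS:
        if hours_offline >= threshold:
            cost = step_cost
    return cost

print(startup_cost(5))   # 8000.0  (hot start)
print(startup_cost(30))  # 20000.0 (cold start)
```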

Regarding shut-down costs, they are widely modelled as a fixed cost:

$$c\_{g,t}^{SD} = w\_{g,t} C\_{g}^{SD} \tag{7}$$

#### 3.2.3. Diverting Target Production Costs

Diverting costs are not real costs, but a way to model the possible transfer of production targets between thermal units, either to make the problem feasible (moving production targets that are below the minimum power output) or to optimize profits (moving production targets from a unit that would start up for only a few hours to one that is already committed).

$$c\_{g}^{DIV} = (f\_{g}^{A} + f\_{g}^{B}) \frac{C^{F}}{2} \tag{8}$$

Note that the definition of the diverting strategic term *C<sup>F</sup>* will affect the optimization behavior, ranging from a high *C<sup>F</sup>*, where the production targets of each unit are respected, to a *C<sup>F</sup>* equal to zero, where only the total production target *A<sup>T</sup>* is respected. These events are detailed in Section 3.2.4 and analyzed in the case study proposed in Section 4.2.
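As a numerical illustration with a hypothetical penalty value, the diverting cost of Equation (8) can be read as charging each unit half of the penalized transfer; the 190 MWh figure anticipates Case 2 of Section 4.2:

```python
# Illustrative evaluation of the diverting cost of Equation (8):
# f_A and f_B are the MWh received/yielded by a unit, and C_F penalizes
# the transfer. The penalty value is hypothetical.
def diverting_cost(f_A, f_B, C_F):
    return (f_A + f_B) * C_F / 2.0

# Unit D yields 190 MWh to Unit A (as in Case 2 of Section 4.2):
# under this reading, each unit accounts for half of the penalized transfer.
print(diverting_cost(f_A=190.0, f_B=0.0, C_F=10.0))  # Unit A: 950.0
print(diverting_cost(f_A=0.0, f_B=190.0, C_F=10.0))  # Unit D: 950.0
# With C_F = 0 the transfer is free and only the total target A_T binds.
```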

#### 3.2.4. Production Adjustment Equations

The post-processing method implements two balance equations in order to represent the strategic management of the expected production targets:

$$A^{T} = \sum\_{g \in G} \sum\_{t \in T} p\_{g,t} \tag{9}$$

$$A\_{g} = \sum\_{t \in T} p\_{g,t} + f\_{g}^{A} - f\_{g}^{B} \tag{10}$$

$$0 \le f\_{g}^{A} \tag{11}$$

$$0 \le f\_{g}^{B} \tag{12}$$

Equation (9) always respects the total aggregated production target, *AT*, of the set of thermal units considered in the post-processing problem.

Additionally, Equation (10) makes it possible to overcome infeasible data in the production targets of each thermal unit, *Ag*, such as production targets below the minimum power output resulting from the use of relaxed variables in the medium-term fundamental model. In turn, this equation also gives versatility to the profit optimization. The considered time span will depend on the reliability of the productions obtained with the medium-term model, which frequently achieve solid values when the aggregation exceeds one week.

Variables *f<sup>A</sup><sub>g</sub>* and *f<sup>B</sup><sub>g</sub>* distribute production targets among the set of thermal units, to a greater or lesser extent depending on the value of *C<sup>F</sup>*, which penalizes transfers in the objective function. If high values are assigned to *C<sup>F</sup>*, the model will always try to respect the objective production of each unit, *Ag*, relocating productions with the sole purpose of avoiding infeasibilities. On the other hand, if *C<sup>F</sup>* is set to zero, transfers are free and the operation of the portfolio increases its flexibility. In this case, *Ag* is ignored and the greatest possible profit, considering *A<sup>T</sup>*, is obtained after the optimization.

This parameter opens the door to the analysis of different situations and strategic behaviors in the management of a thermal portfolio belonging to a generation company. The allocation of moderate values would allow anything from transferring a few MWh, so that a thermal unit does not stretch its production into hours that are not profitable, to preventing a unit from starting up in order to operate for only one hour or so.

The introduction of this strategic term refers to the internalization of some operation, logistic and opportunity costs that cannot be considered otherwise. In this way, the post-processing method will naturally avoid the inefficiencies mentioned above, as well as return a feasible thermal scheduling. All of these events are analyzed in the case studies proposed in Section 4.

#### 3.2.5. Basic Operating Constraints

These equations determine the chronological relationship between the hourly periods, defining the logic between commitments and startups/shutdowns throughout the time span:

$$v\_{g,t} - w\_{g,t} = u\_{g,t} - u\_{g,t-1} \qquad \forall g,\ t \in [2, T] \tag{13}$$

$$v\_{g,t} - w\_{g,t} = u\_{g,t} - U\_{g}^{0} \qquad \forall g,\ t \in [1, 2) \tag{14}$$

$$u\_{g,t} \in \{0, 1\} \tag{15}$$

$$v\_{g,t} \in \{0, 1\} \tag{16}$$

$$0 \le w\_{g,t} \le 1 \tag{17}$$

Note that it is not necessary to formulate *wg*,*<sup>t</sup>* as a binary variable. Its behavior is defined by differences between binaries, and the only values it can take are 0 and 1.

Finally, the operating constraints of the thermal units are included, limiting their hourly production between their minimum and maximum power outputs when they are committed:

$$u\_{g,t} P\_{g}^{MIN} \le p\_{g,t} \tag{18}$$

$$p\_{g,t} \le u\_{g,t} P\_{g}^{MAX} \tag{19}$$
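To make the formulation concrete, the following condensed Pyomo sketch assembles Equations (1), (2) and (7)–(19) for a hypothetical two-unit, one-day case, simplifying the start-up cost to a single step; all data are illustrative and a MIQP-capable solver (e.g., CPLEX or Gurobi) is assumed to be installed:

```python
# Illustrative Pyomo sketch of the post-processing problem; data are made up.
import pyomo.environ as pyo

G, T = ["A", "B"], list(range(1, 25))                # units, hours
LAMBDA = {t: 40 + 25 * (8 <= t <= 20) for t in T}    # hourly price forecast
P_MIN, P_MAX = {"A": 100, "B": 80}, {"A": 400, "B": 300}
C_NL, C_LV = {"A": 3000, "B": 2500}, {"A": 30, "B": 34}
C_QC = {"A": 0.002, "B": 0.003}
C_SU, C_SD, C_F = {"A": 9000, "B": 7000}, {"A": 500, "B": 400}, 10.0
A = {"A": 4000, "B": 2500}                           # production targets (MWh)
U0 = {"A": 1, "B": 0}                                # initial commitment

m = pyo.ConcreteModel()
m.u = pyo.Var(G, T, domain=pyo.Binary)               # commitment, Eq. (15)
m.v = pyo.Var(G, T, domain=pyo.Binary)               # start-up, Eq. (16)
m.w = pyo.Var(G, T, bounds=(0, 1))                   # shut-down, Eq. (17)
m.p = pyo.Var(G, T, domain=pyo.NonNegativeReals)     # production
m.fA = pyo.Var(G, domain=pyo.NonNegativeReals)       # target MWh received
m.fB = pyo.Var(G, domain=pyo.NonNegativeReals)       # target MWh yielded

# Objective, Eq. (1), with a single-step start-up cost v * C_SU.
m.obj = pyo.Objective(expr=sum(
    LAMBDA[t] * m.p[g, t]
    - (m.u[g, t] * C_NL[g] + m.p[g, t] * C_LV[g] + m.p[g, t] ** 2 * C_QC[g])
    - m.v[g, t] * C_SU[g] - m.w[g, t] * C_SD[g]
    for g in G for t in T)
    - sum((m.fA[g] + m.fB[g]) * C_F / 2 for g in G), sense=pyo.maximize)

# Commitment logic, Eqs. (13)-(14).
m.logic = pyo.Constraint(G, T, rule=lambda m, g, t:
    m.v[g, t] - m.w[g, t] == m.u[g, t] - (U0[g] if t == 1 else m.u[g, t - 1]))
# Total and per-unit production targets, Eqs. (9)-(10).
m.total = pyo.Constraint(expr=sum(m.p[g, t] for g in G for t in T)
                              == sum(A[g] for g in G))
m.target = pyo.Constraint(G, rule=lambda m, g:
    A[g] == sum(m.p[g, t] for t in T) + m.fA[g] - m.fB[g])
# Power output limits, Eqs. (18)-(19).
m.pmin = pyo.Constraint(G, T, rule=lambda m, g, t:
    m.u[g, t] * P_MIN[g] <= m.p[g, t])
m.pmax = pyo.Constraint(G, T, rule=lambda m, g, t:
    m.p[g, t] <= m.u[g, t] * P_MAX[g])

# pyo.SolverFactory("cplex").solve(m)  # or any MIQP-capable solver
```

Note that the concave quadratic cost term keeps the maximization a convex MIQP, which is precisely the solver-friendliness argued for in Section 3.2.1.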

#### **4. Case Study and Results**

In order to show the usefulness of the methodology presented in this paper, a real size case study is analyzed, in which an agent wishes to obtain a feasible schedule and optimize the production of four CCGTs over a medium-term horizon. Section 4.1 describes the case study, as well as the origin of the inputs needed by the post-processing method. Section 4.2 shows the results of the application of this methodology. The latter also presents three cases in which the value of the diverting penalty is analyzed, representing a post-processing that combines profit optimization and feasible scheduling.

#### *4.1. Presentation of the Case Study and Its Medium-Term Fundamental Model*


In this case study, the production of four CCGTs belonging to a generation company operating in the Iberian electricity market (MIBEL) will be made feasible and optimized. In the first phase of the methodology, a medium-term model is run. This model follows the mathematical formulation proposed in [22], representing the equilibrium between market players through conjectural variations. Its validity for determining the market equilibrium as an optimization problem is also proved in [22], where it is demonstrated that if the cost function is convex and the conjectures are non-negative, the optimization problem is equivalent to an equilibrium problem.

This formulation was summarized in [53] as follows. The competitive behavior in the market is represented as an oligopoly, where the conjectured-price response, *θi*,*a*,*p*, of each market player *i* is considered known. The function that relates the production cost of each player in area *a*, *Ci*,*a*, with its electricity generation, *qi*,*a*,*p*, during period *p* is linear or quadratic. The electricity price, *λa*,*p*, is determined as the dual variable of the power balance constraint, which matches the total energy output with the demand, *Da*,*p*. Finally, the technical constraints are condensed into a generic formulation, H:

$$\min\_{q\_{i,a,p}} \sum\_{i,a,p} \left( C\_{i,a}\, q\_{i,a,p} + \theta\_{i,a,p} \frac{q\_{i,a,p}^{2}}{2} \right) \tag{20}$$

subject to:

$$\sum\_{i} q\_{i,a,p} = D\_{a,p} \ : \ \lambda\_{a,p} \tag{21}$$

$$\mathcal{H}(q\_{i,a,p}) \ge 0 \tag{22}$$
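For intuition, the single-period, unconstrained version of Equations (20) and (21) admits a closed form: the first-order condition gives *Ci*,*a* + *θi*,*a*,*p* *qi*,*a*,*p* = *λa*,*p*, and the balance equation fixes the price. The following sketch evaluates it with made-up data for two players; this is only a toy illustration of the CV mechanism, not the full model of [22,53]:

```python
# Closed-form single-period CV equilibrium with fixed demand (toy data).
# First-order condition: C_i + theta_i * q_i = lam; balance: sum(q_i) = D.
C = {"P1": 20.0, "P2": 30.0}        # linear costs of two players
THETA = {"P1": 0.05, "P2": 0.08}    # conjectured-price responses
D = 900.0                            # demand (MW)

lam = (D + sum(C[i] / THETA[i] for i in C)) / sum(1.0 / THETA[i] for i in C)
q = {i: (lam - C[i]) / THETA[i] for i in C}
print(f"price = {lam:.2f}, quantities = {q}")
# price = 51.54; quantities sum to 900. As theta -> 0 the price tends to
# marginal cost (perfect competition); larger conjectures reproduce more
# oligopolistic (Cournot-like) outcomes.
```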

The main technical constraints applied to the thermal units are those related to the commitment status, maximum and minimum power outputs, operational costs, start-up and shut-down costs, maximum number of start-ups within a period, unplanned unavailability and maintenance schedules. Regarding river basins, the turbine and pumping capacities are modelled, as well as efficiency, storage capacity, inflows, topology and the upper and lower water bounds that guarantee a safe operation.

This model is used for an accurate representation of the Spanish, Portuguese and French electricity markets and their interconnections. Every single thermal unit is considered, as well as hydro reservoirs and the non-dispatchable renewable energy sources. The horizon comprises three years on an hourly basis. After the market clearing determination, the model also checks technical issues (such as network constraints), as the Transmission System Operator does, committing some thermal units to guarantee the stability of the grid if required.

In order to obtain reasonable run times, integer programming relaxation is applied and the time aggregation technique of system states [49] is used. This clustering process takes into account different conditions of the power system and aggregates hours according to their corresponding thermal gap. The transition between clusters is considered with this method, but the equations that keep chronology are not included. In this way, 940 time steps represent the whole time span of three years.
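The clustering step can be sketched as follows. This is only a rough illustration of the system-states idea, not the exact procedure of [49]: hours are grouped by their thermal gap (demand net of renewables) with k-means on synthetic data, assuming scikit-learn is available:

```python
# Rough sketch of the system-states idea: cluster hours by thermal gap.
# All data are synthetic; the real model clusters on richer system features.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hours = 3 * 8760                                   # three-year horizon
demand = 28000 + 6000 * np.sin(np.arange(hours) * 2 * np.pi / 24)
renewables = rng.uniform(4000, 14000, hours)
thermal_gap = (demand - renewables).reshape(-1, 1)

n_states = 940                                     # as in the case study
km = KMeans(n_clusters=n_states, n_init=3, random_state=0).fit(thermal_gap)
print(km.cluster_centers_[:3].ravel())             # representative gap levels
# Each hour is then replaced by its state; transitions between states can be
# tracked, although the chronology constraints are not kept in this model.
```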

The combination of these modeling techniques leads to very acceptable run times. On the other hand, it presents the drawback that technically infeasible results may be obtained. Besides, the representation of detailed thermal costs (as is the case of start-up costs) is simplified with respect to the proposed post-processing methodology. Table 2 shows a comparison between a three-year case of the described fundamental model and the post-processing method proposed in Section 3, applied to a four thermal-unit case over a 31-day time span on an hourly basis. The cases analyzed in this paper have been run on a computer with an Intel Xeon CPU E5-2660 v3 @ 2.60 GHz with 40 logical processors and 144 GB of installed RAM, running 64-bit Windows Server 2012 R2, and solved with the commercial solver CPLEX 12.10 [58] under GAMS [59].

Regarding the representation of uncertainty, a Monte Carlo simulation has been carried out. A total of 300 cases have been considered to represent different scenarios for the following risk factors: power demand, hydro conditions, wind generation, solar generation, coal prices, natural gas prices, CO2 emission allowance prices and the unplanned unavailability of thermal units.

Given the great variety of risk variables considered in the simulation, the spatial interpolation technique proposed in [60] has been applied, making it possible to obtain highly accurate results with only 300 evaluated cases. Furthermore, the determination of correlations between variables and the scenario creation have been carried out in collaboration with a major utility present in the Spanish electricity market.
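Purely for illustration, the scenario-generation step can be approximated by sampling correlated risk factors from a multivariate normal distribution; the factors, moments, and correlation below are assumptions, and this is not the spatial interpolation technique of [60]:

```python
# Hypothetical sketch of correlated scenario sampling for two risk factors
# (e.g., gas price and demand). All numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
mean = [25.0, 29000.0]                   # gas price (EUR/MWh), demand (MW)
std = [5.0, 2500.0]
rho = 0.4                                # assumed correlation
cov = [[std[0] ** 2, rho * std[0] * std[1]],
       [rho * std[0] * std[1], std[1] ** 2]]

scenarios = rng.multivariate_normal(mean, cov, size=300)  # 300 cases, as used
print(scenarios.mean(axis=0))            # sample means close to the targets
```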

This Monte Carlo simulation, carried out with the medium-term model described above, has been used to obtain the necessary input data for the proposed methodology. The corresponding results needed by the post-processing method are shown in Table 3 and Figure 2, which gather the expected thermal productions and the electricity prices, respectively, for the four CCGTs considered in this case study along the 31-day hourly time span. However, the real scope of this Monte Carlo simulation covers a longer horizon.


**Table 2.** Problem sizes after the performance of CPLEX presolve.

It is also important to mention that these outputs correspond to the mean value of the distribution function. Nevertheless, either the mean values or the results belonging to any of the considered percentiles (P10, P50, P90, etc.) can be easily handled with the post-processing methodology proposed in this paper.

**Table 3.** Expected production of the thermal units considered in the post-processing case study.


Technical data of the thermal units considered in the post-processing are shown in Tables 4 and 5. Start-up costs of the thermal units are modelled with three steps, as mentioned in [1,3]. This representation improves on the single start-up cost of the medium-term model. In turn, the formulation described in Section 3.2.2 easily allows the number of steps to be increased if a more accurate simulation is desired.

**Table 4.** Technical data of the thermal units and status in the first hour of the considered time span.




#### *4.2. Analysis of Feasible Schedules Obtained with the Post-Processing Methodology*

The application of the post-processing method after the performance of a medium-term model, like the one described in Section 4.1, offers many advantages. It achieves a feasible thermal scheduling while keeping reliable and quite valuable medium-term information, such as the hydraulic generation. In turn, it also improves the representation of the technical operational constraints, since the medium-term model only uses a single-step start-up cost. This phase is computationally more flexible, making it possible to approximate the start-up cost with a multi-step function, which provides much more accurate modeling.

Additionally, it is possible to introduce a strategic term to consider hidden operation preferences, allowing a more realistic management of these assets by a market player. These operational priorities can be easily quantified in economic terms and give the chance of transferring some production targets between the thermal units of the portfolio. In this section, three case studies will be considered to analyze the impact of the strategic term, *CF*, on the thermal scheduling:


It is important to keep in mind that the behavior of the diverting strategic term will depend on the hourly price forecast, since higher or lower price levels will promote or dampen transfers in the objective function. This strategic term can be easily assigned by generation companies, which know their operation, logistic, and opportunity costs in depth. In turn, a company can also vary this value to analyze different situations and risk scenarios.

Table 6 shows the outputs of the three cases. The comparison is carried out in terms of the obtained profits. As expected, the post-processing method reaches a greater profit when there is greater freedom to transfer productions between units, moving them to the most profitable hours and avoiding useless start-ups and the imposition of quantities that are far from the optimal values for this case. The optimality gap established for the three cases is 1%. This value is accurate enough, but it is important to note that there is a trade-off between the desired accuracy and the run time and computational resources. Thus, the higher the number of thermal units involved in the post-processing methodology, the longer the run time to reach an integer solution will be.


**Table 6.** Results of the evaluation of the three cases.

In Case 1, every single production target is respected, according to the high value assigned to *CF*. On the contrary, the reduced production target of Unit D is quickly transferred to other units when *C<sup>F</sup>* is relaxed in Case 2. The optimal value is reached by moving 190 MWh from Unit D to Unit A. Finally, the total optimization of Case 3 shows that the optimal solution of the problem is to use each thermal unit along its most profitable hours. In this case, Unit B and Unit C yield 79,155 MWh and 44,258 MWh, respectively, with 5870 MWh taken up by Unit A and 117,543 MWh by Unit D.

All of these cases provide a realistic picture of the detailed operation of the thermal portfolio. As expected, its production responds to the dynamics of the electricity price peaks, considering an optimal management of start-ups and shut-downs to maximize profit. Figure 2 shows an example of a feasible scheduling, where Unit A maximizes its profit according to the expected production gathered in Table 3. The thermal schedule represented in this figure corresponds to the results of Case 1, where the strategic term *C<sup>F</sup>* is high enough to avoid transfers of production targets.

**Figure 2.** Hourly matching prices of the Spanish market obtained with the fundamental model of Section 4.1 and response of Unit A to its production target with a strategic term of 500 \$/MWh as used in Case 1.

For the sake of simplicity, only one thermal unit has been included in Figure 2, allowing an easier interpretation of the operational behavior. Unit A, like the other thermal units, sets its production at its maximum power output during the most profitable hours. In addition, it usually reduces its production to the minimum output when prices drop, incurring shut-downs if price peaks are too far apart in time.

#### **5. Conclusions**

The changing reality of current electricity markets highlights the importance of a proper representation of power systems. The increasing variability of generation, due to the deployment of non-dispatchable RES, and the promotion of interconnections between areas as a result of integration policies point out the necessity of simulating multi-area power systems on an hourly basis. Nowadays, electricity market models require a high level of detail and time granularity not only in the short term, but also in the medium term, especially in order to represent a realistic management of energy storage facilities. This fact, together with the consideration of uncertainty in some input data, makes it imperative to simplify medium-term models to increase their computational tractability. However, these simplifications lead to infeasible and/or suboptimal operational outputs for thermal units.

A new soft-linking methodology to overcome these problems has been proposed in this paper. This method combines the advantages of medium-term models, selecting highly reliable results from these tools and using them in a post-processing phase. This step provides a feasible thermal scheduling for a well-detailed generation portfolio. In this way, the infeasible outputs obtained with the medium-term model as a consequence of the implemented simplifications, such as time aggregation, integer programming relaxation or less accurate modeling, are corrected.

In addition, the methodology allows the use of a strategic term, providing an alternative to jointly optimize the thermal generation portfolio of a market agent. This term goes beyond the mere representation of technical constraints, allowing the assignment of logistic and opportunity costs, as well as the inclusion of hidden flexibility possibilities in the operation of thermal units. The formulation of the post-processing phase as an optimization problem also contributes to recreating the competition in power markets, where each player tries to maximize its profit. The whole methodology has been tested with a realistic case study, showing its validity.

**Author Contributions:** Conceptualization, L.M., A.B. and J.R.; methodology, L.M., A.B. and J.R.; software, L.M.; validation, L.M., A.B. and J.R.; formal analysis, L.M.; investigation, L.M.; resources, L.M.; data curation, L.M.; writing—original draft preparation, L.M.; writing—review and editing, L.M., A.B. and J.R.; visualization, L.M.; supervision, A.B. and J.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

CCGT Combined Cycle Gas Turbine


#### **Nomenclature**

Indexes & Sets



#### Parameters


#### Variables



#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
