**2. Model**

This section first presents the benchmark model for the problem of optimally aggregating predictions from the perspective of information theory. This model uses the Kullback–Leibler distance as the criterion for determining the optimal weights of the aggregation of predictions. A second approach, based on a machine learning technique, provides the second model. Finally, the relationship between the two approaches is described.

## *2.1. Benchmark Maximum-Entropy Inference (MEI)*

Given a set of agents $I$, let $\{y_{i,t}\}_{i \in I,\, t \ge 0}$ be forecasts for an economic variable at time $t$ made at a prior time. We consider the combination of the individual forecasts weighted by a vector of parameters, with the weight of each possible forecast denoted by $\omega_i$. The weights $\omega_i$ are interpreted as the degrees of expertise of each agent $i \in I$. Assuming a non-degenerate distribution of weights, the true value at time $t$, denoted $a_t$, verifies $\sum_{i \in I} \omega_i y_{i,t} = a_t$. The first problem we tackle is to find the weights $\omega_i$ such that the aggregation of predictions fits the true value.

A parallel problem that we consider is the maximization of the entropy of the distribution $\{\omega_i\}_{i \in I}$ subject to the true value coinciding with the aggregation of predictions for every time horizon $t$. This optimization problem is expressed as follows:

$$\begin{aligned} \max_{\{\omega_i\}} \quad & \sum_{i \in I} \omega_i \log \omega_i^{-1} \\ \text{subject to} \quad & \sum_{i \in I} \omega_i = 1, \qquad \omega_i \ge 0, \\ & \sum_{i \in I} \omega_i y_{i,t} = a_t \qquad \text{for } t \ge 0. \end{aligned}$$
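
As an illustration, the following is a minimal numerical sketch of this entropy-maximization problem using `scipy.optimize`; the forecasts `y`, the hidden weights `w_true`, and the true values `a` are synthetic placeholders, not data from this paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic setup (placeholder data): 5 agents, 3 time horizons.
n_agents, n_periods = 5, 3
y = rng.normal(2.0, 0.5, size=(n_agents, n_periods))  # forecasts y_{i,t}
w_true = np.array([0.1, 0.3, 0.2, 0.25, 0.15])        # hidden feasible weights
a = w_true @ y                                         # true values a_t

def neg_entropy(w):
    # Negative Shannon entropy: -sum_i w_i * log(1 / w_i).
    w = np.clip(w, 1e-12, None)  # guard against log(0)
    return np.sum(w * np.log(w))

constraints = [
    {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},  # weights sum to one
    {"type": "eq", "fun": lambda w: y.T @ w - a},      # fit a_t for every t
]
bounds = [(0.0, 1.0)] * n_agents  # non-negativity

res = minimize(neg_entropy, x0=np.full(n_agents, 1.0 / n_agents),
               bounds=bounds, constraints=constraints, method="SLSQP")
print("max-entropy weights:", np.round(res.x, 4))
```

With more agents than prediction constraints, the feasible set typically contains many weight vectors, and the entropy criterion selects the least informative one among them.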

This methodology, known as maximum-entropy inference, is equivalent to the problem of finding a non-negative distribution of weights $\{\omega_i\}_{i \in I}$ that minimizes the Kullback–Leibler distance between such a distribution and the uniform distribution over the set of agents, that is, $\frac{1}{|I|}$. The Kullback–Leibler distance between two distributions $p$ and $q$ is defined as $K(p, q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$. Notice that the Kullback–Leibler distance is always non-negative, but it is not a proper distance, since it satisfies neither the symmetry property nor the triangle inequality. This approach based on the Kullback–Leibler distance ensures an unbiased treatment of the set of agents.
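
A small numerical check, with two distributions chosen only for illustration, makes the asymmetry explicit:

```python
import numpy as np

def kl(p, q):
    # K(p, q) = sum_x p(x) * log(p(x) / q(x))
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log(p / q))

p = np.array([0.7, 0.2, 0.1])
q = np.array([1/3, 1/3, 1/3])  # uniform over three outcomes

print(kl(p, q))  # ~0.297
print(kl(q, p))  # ~0.324  ->  K(p, q) != K(q, p)
```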

Formally,

$$\min_{\{\omega_i\}} \sum_{i \in I} \omega_i \log(\omega_i |I|) \tag{1}$$

$$\begin{aligned} \text{subject to} \quad & \sum_{i \in I} \omega_i = 1, \qquad \omega_i \ge 0, \\ & \sum_{i \in I} \omega_i y_{i,t} = a_t \qquad \text{for } t \ge 0. \end{aligned}$$
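
The equivalence with the entropy-maximization problem follows from a one-line expansion of the objective, using $\sum_{i \in I} \omega_i = 1$:

$$\sum_{i \in I} \omega_i \log(\omega_i |I|) = \sum_{i \in I} \omega_i \log \omega_i + \log |I| = \log |I| - \sum_{i \in I} \omega_i \log \omega_i^{-1},$$

so minimizing the Kullback–Leibler distance to the uniform distribution differs from the entropy only by the constant $\log |I|$ and a sign, and both problems share the same solutions over the same feasible set.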

To solve this, we start with the Lagrangian of the above problem. It should be noted that the number of variables, compared with the number of restrictions, may not be enough to guarantee a unique solution, or even the existence of one. Even though we can characterize a set of candidate weights that minimize the relative distance, those weights may fail to satisfy the true-prediction condition.
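
For concreteness, one convenient form of the Lagrangian of problem (1), with a multiplier $\lambda$ for the normalization constraint and multipliers $\mu_t$ for the prediction constraints (non-negativity being handled by the corresponding Karush–Kuhn–Tucker conditions), is

$$\mathcal{L}(\omega, \lambda, \mu) = \sum_{i \in I} \omega_i \log(\omega_i |I|) + \lambda \left( \sum_{i \in I} \omega_i - 1 \right) + \sum_{t \ge 0} \mu_t \left( \sum_{i \in I} \omega_i y_{i,t} - a_t \right).$$

Setting $\partial \mathcal{L} / \partial \omega_i = 0$ yields weights of the exponential form $\omega_i \propto \exp\left(-\sum_{t} \mu_t y_{i,t}\right)$, the familiar Gibbs-type solution of maximum-entropy problems.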

This issue is well recognized in the literature: the complexity of finding a proper solution increases with the number of parameters and conditions; hence, numerical algorithms are needed to find the set of candidate solutions, if one exists.

In order to reduce the complexity of the problem, we can consider a new parametrization of $\{\omega_i\}_{i \in I}$. In particular, one can reparameterize $\omega_i = \frac{e^{x_i}}{\sum_{j \in I} e^{x_j}}$ for $x_i \in \mathbb{R}$. This guarantees that $\sum_{i \in I} \omega_i = 1$ with $\omega_i > 0$, while simplifying the optimization problem by reducing the number of constraints from three to one: only the prediction constraints remain.
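
A minimal sketch of this reparameterization, reusing the synthetic `y`, `a`, and `n_agents` from the earlier snippet, optimizes over the unconstrained variables $x_i$ so that only the prediction constraints need to be imposed explicitly:

```python
import numpy as np
from scipy.optimize import minimize

def softmax(x):
    # Numerically stable softmax: w_i = exp(x_i) / sum_j exp(x_j).
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

def neg_entropy_x(x):
    # Entropy objective expressed in the unconstrained variables x.
    w = softmax(x)
    return np.sum(w * np.log(w))

# Only the prediction constraints remain; simplex membership is automatic.
pred_constraint = {"type": "eq", "fun": lambda x: y.T @ softmax(x) - a}

res = minimize(neg_entropy_x, x0=np.zeros(n_agents),
               constraints=[pred_constraint], method="SLSQP")
print("weights:", np.round(softmax(res.x), 4))
```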

Another way to approach the original problem is to allow a balance between the two sets of restrictions written in the Lagrangian. On the one hand, the distribution of weights should minimize the relative entropy; on the other hand, it should generate an aggregation of predictions as close as possible to the true values.
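
One standard way to formalize this balance, stated here only as an illustrative penalized variant with a tuning parameter $\gamma > 0$, is to trade the relative entropy off against the squared aggregation error:

$$\min_{\{\omega_i\}} \; \sum_{i \in I} \omega_i \log(\omega_i |I|) + \gamma \sum_{t \ge 0} \left( \sum_{i \in I} \omega_i y_{i,t} - a_t \right)^2, \qquad \sum_{i \in I} \omega_i = 1, \quad \omega_i \ge 0.$$

Large values of $\gamma$ enforce the prediction constraints almost exactly, while small values favor weights close to the uniform distribution.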
