**4. A Numerical Experiment**

The numerical simulation compares the performance of the estimation strategies explained previously when estimating a (*T* × *K*) set of latent indicators. The targets are the unknown elements *yij* (output per worker, income per capita, etc.), which measure the amount of a certain variable *zij* per unit of another auxiliary variable *lij*. The values of the latter are drawn from a normal distribution as *lij* ∼ *N*(20, 2), and they define the weights as θ*ij* = *lij*/*li*·. We also simulate an observable disaggregated indicator *xij*, drawn as *xij* ∼ *N*(10, 1), which is related to our unobservable target *yij*.
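As an illustration of this setup, the following minimal sketch draws the auxiliary variable, the implied weights and the observable indicator. The dimensions `T` and `K`, the seed and the variable names are hypothetical choices made only for this example, and the second argument of *N*(·, ·) is read here as a standard deviation, which the text does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
T, K = 10, 5                    # hypothetical dimensions of the target matrix

# Auxiliary variable l_ij ~ N(20, 2); assumption: 2 is a standard deviation.
l = rng.normal(20.0, 2.0, size=(T, K))

# Weights theta_ij = l_ij / l_i. , where l_i. is the row total of l_ij.
theta = l / l.sum(axis=1, keepdims=True)

# Observable disaggregated indicator x_ij ~ N(10, 1); same assumption.
x = rng.normal(10.0, 1.0, size=(T, K))
```

By construction, the weights in each row of `theta` add up to one.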

In the context of simulation, we assume that the indicator *yij* is generated as a convex combination from two possible schemes:

$$y_{ij} = \delta\left[\alpha_{i} + \beta_{ij} x_{ij} + \varepsilon_{ij}\right] + (1 - \delta)\left[\eta_{ij} x_{ij} + \varepsilon_{ij}\right]; \quad i = 1, \ldots, T; \ j = 1, \ldots, K. \tag{19}$$

This equation contains two sets of slope parameters, namely β*ij* and η*ij*, which relate the regressor *xij* to the target *yij*. Furthermore, a fixed area effect α*i* is also included. These parameters have been arbitrarily set as:

$$\begin{array}{l} \alpha_{i} \sim N(5, 1)\\ \beta_{ij} \sim N(0, 0.1)\\ \eta_{ij} \sim N(1, 0.1) \end{array} \tag{20}$$

and they are kept constant across the simulations. The error term ε*ij* is drawn as ε*ij* ∼ *N*(0, 0.1) and is generated anew in each trial of the experiment.
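The distinction between quantities that are fixed and quantities that are redrawn can be sketched as follows. This is only an illustrative reading of the design: the dimensions, the seed and the small number of trials are hypothetical, and the second argument of each normal distribution is again assumed to be a standard deviation.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
T, K = 10, 5                     # hypothetical dimensions
n_trials = 3                     # kept small for illustration

# Parameters of the data-generating process: drawn once and then held fixed.
alpha = rng.normal(5.0, 1.0, size=T)      # area effects alpha_i
beta = rng.normal(0.0, 0.1, size=(T, K))  # heterogeneous slopes beta_ij
eta = rng.normal(1.0, 0.1, size=(T, K))   # slopes eta_ij, centered on 1

# Error terms eps_ij: a fresh draw for every trial of the experiment.
eps_by_trial = [rng.normal(0.0, 0.1, size=(T, K)) for _ in range(n_trials)]
```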

The first part of the equation (α*i* + β*ij* *xij* + ε*ij*) shows that *yij* can be generated from a process like the one depicted in (16): a linear function of *xij* with slope heterogeneity plus a specific area effect (see 11). The second term (η*ij* *xij* + ε*ij*) does not include any area-specific indicator and assumes that *yij* is exclusively affected by *xij* (see 2). Equation (19) includes the scalar δ, bounded between 0 and 1, which weighs the two possible sources that generate the variable. As δ → 1, the first mechanism takes over, and the contrary happens as δ → 0. Note that setting δ = 1 imposes a data-generating process in line with the assumptions made in the GME program depicted in Equations (12)–(14) for the DWR estimation. By contrast, setting δ = 0 yields a scenario compatible with the assumptions of non-uniform priors for the parameters, which reflect the belief in the absence of area-specific effects and a slope parameter close to 1 (labeled as GCE when the simulation results are shown). Any value of δ strictly between these two extremes corresponds to a data-generating process that is not fully incorporated in the priors of either alternative. It is in this type of intermediate situation that the composite prior estimator (labeled as DWP in the simulation results), described in Equations (16)–(18), can be useful, because both priors are considered and we let the data speak for themselves and favor the most realistic one.
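The role of δ in Equation (19) can be made concrete with a short sketch. The function name `generate_y`, the dimensions and the seed are hypothetical, and the distribution arguments are read as standard deviations. Because the same error term enters both brackets, the convex combination leaves ε*ij* with unit weight.

```python
import numpy as np

def generate_y(x, alpha, beta, eta, eps, delta):
    """Convex combination of the two schemes in Equation (19)."""
    area_part = alpha[:, None] + beta * x  # area effect plus heterogeneous slope
    common_part = eta * x                  # no area effect, slope close to 1
    # The shared error term keeps a total weight of delta + (1 - delta) = 1.
    return delta * area_part + (1.0 - delta) * common_part + eps

rng = np.random.default_rng(1)  # arbitrary seed
T, K = 10, 5                    # hypothetical dimensions
x = rng.normal(10.0, 1.0, size=(T, K))
alpha = rng.normal(5.0, 1.0, size=T)
beta = rng.normal(0.0, 0.1, size=(T, K))
eta = rng.normal(1.0, 0.1, size=(T, K))
eps = rng.normal(0.0, 0.1, size=(T, K))

# Two extreme cases and one intermediate case.
y_by_delta = {d: generate_y(x, alpha, beta, eta, eps, d) for d in (0.0, 0.5, 1.0)}
```

With `delta=1.0` only the area-effect scheme remains, with `delta=0.0` only the common-slope scheme, and `delta=0.5` is the simple average of the two.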

The unobservable indicators generated from (19), with the parameters set in (20), will be estimated by the three estimation strategies described in the paper (the DWR, GCE and DWP estimators) with equal amounts of observable information (the aggregates *yi*· = Σ<sub>*j*=1</sub><sup>*K*</sup> *yij*θ*ij*). We have specified a common support vector for all the parameters with *M* = 3 points, *b*′ = (−10, 0, 10). Similarly, a three-point (*H* = 3) support vector with values 0, 0.5 and 1 has been set for the weighting parameters γ. For the error terms, a support with *L* = 3 values has been chosen by applying the three-sigma rule, with uniform a priori weights.
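The observable aggregates and the support vectors can be sketched as below. The helper `error_support` is a hypothetical name, and the choice of the sample standard deviation of the aggregates as the scale for the three-sigma rule is an assumption, since the text does not state which dispersion measure was used. The target values here are stand-in draws used only to make the example runnable.

```python
import numpy as np

def error_support(observed, n_sigma=3.0):
    """Symmetric three-point error support (-3s, 0, 3s).

    Assumption: s is the sample standard deviation of the observed aggregates.
    """
    s = np.std(observed, ddof=1)
    return np.array([-n_sigma * s, 0.0, n_sigma * s])

rng = np.random.default_rng(2)  # arbitrary seed
T, K = 10, 5                    # hypothetical dimensions
l = rng.normal(20.0, 2.0, size=(T, K))
theta = l / l.sum(axis=1, keepdims=True)
y = rng.normal(10.0, 1.0, size=(T, K))  # stand-in values for the latent targets

# Observable information: weighted aggregates y_i. = sum_j y_ij * theta_ij.
y_agg = (y * theta).sum(axis=1)

b = np.array([-10.0, 0.0, 10.0])  # common parameter support, M = 3
g = np.array([0.0, 0.5, 1.0])     # support for the weighting parameters, H = 3
v = error_support(y_agg)          # error support, L = 3
uniform = np.full(3, 1.0 / 3.0)   # uniform a priori weights
```

Under uniform weights the values implied by these supports are their midpoints: 0 for the parameters and 0.5 for γ.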

In the experiment, we compare the performance of the three approaches under different scenarios. Three different dimensions (*T* × *K*) of the matrix with the target indicators *yij* have been considered, and for each case we arbitrarily set six different values of the scalar δ: 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0. In each of these 18 scenarios, we have carried out 200 trials and computed the mean absolute percentage deviation between our estimates and the real *yij*. Table 1 shows the results:


**Table 1.** Results of the numerical experiment (1000 replications): deviation figures.


Values in each cell report the mean absolute deviation (in %) between the real generated target values and the estimated ones. Values in parentheses show the average bias in absolute terms (ABIAS), and the figures in brackets show the root mean squared error of the estimates (RMSE).
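One possible way of computing the three figures reported in each cell is sketched below. The exact formulas are not given in the text, so these definitions are assumptions: the percentage deviation is taken relative to the real value, ABIAS is read as the absolute value of the per-element mean error averaged over the matrix, and RMSE pools all elements and trials. The function name and array layout (trials × *T* × *K*) are hypothetical.

```python
import numpy as np

def deviation_metrics(y_true, y_hat):
    """Return (mean absolute % deviation, ABIAS, RMSE).

    Both inputs have shape (n_trials, T, K).
    """
    err = y_hat - y_true
    mapd = 100.0 * np.mean(np.abs(err) / np.abs(y_true))
    abias = np.mean(np.abs(err.mean(axis=0)))  # bias per element, then |.|, then mean
    rmse = np.sqrt(np.mean(err ** 2))
    return mapd, abias, rmse

# Toy check: two trials, errors of +1 and -1 around a true value of 10.
y_true = np.full((2, 1, 2), 10.0)
y_hat = y_true + np.array([1.0, -1.0])[:, None, None]
mapd, abias, rmse = deviation_metrics(y_true, y_hat)
```

In the toy check, errors of opposite sign cancel in the bias but not in the other two measures, which is why the three figures are reported side by side.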

Independently of the estimation approach, the numbers in Table 1 show some patterns common to all three. The deviations increase with the value of the scalar δ, given that high values of this scalar give more weight to the part of the data-generating process that includes an area-specific effect, which makes the *yij* indicators more difficult to predict. The errors appear more stable across the different sizes of the target matrices.

Regarding the comparative performance of the three approaches evaluated in the experiment, the results indicate (not surprisingly) that the GCE approach seems preferable for low values of the scalar δ, given that it does not introduce any area-specific effect and takes the regressor *xij* as the best prediction in the absence of observable information. The larger the value of this scalar, the better the relative performance of the GME-DWR approach (based on a priori uniform distributions).

The rule of thumb would consequently be to use the former when we suspect that no area-specific effect is present (if the second term in Equation (19) dominates) and to favor the latter otherwise (if the first term is more important). In empirical estimation problems, it is virtually impossible to know beforehand which of the two terms is more important. It is in these situations that the composite prior estimator can be helpful. The DWP approach generally outperforms the competing estimators for intermediate values of δ (ranging from 0.4 to 0.8). These medium values indicate some degree of uncertainty about the type of process that generates the data to be estimated. Moreover, the DWP approach can be seen as a conservative solution: even when one of the two parts of the process is clearly dominant (δ = 0 or δ = 1), the composite prior does not perform much worse than the best of the three options. The losses in terms of prediction, however, can be larger if we choose one single-prior estimator when the other is the best option (see the first and last rows of Table 1).
