Set 1: *Null hypothesis is correctly specified, and alternative hypothesis is overspecified.*

Consider the true data-generating model given by

$$y\_i = \beta\_{0,1} + \beta\_{0,2}\mathbf{x}\_{i2} + \beta\_{0,3}\mathbf{x}\_{i3} + \epsilon\_i,$$

where *ε<sub>i</sub>* ∼ *N*(0, 50), *β*<sub>0,1</sub> = 1, *β*<sub>0,2</sub> = *β*<sub>0,3</sub> = 0.5, and (*x<sub>i2</sub>*, *x<sub>i3</sub>*)<sup>*T*</sup> is sampled as indicated in (8).

For the hypothesis testing setting in Set 1, the null and alternative models are defined as

$$\begin{aligned} H\_1: y\_i &= \beta\_1 + \beta\_2 \mathbf{x}\_{i2} + \beta\_3 \mathbf{x}\_{i3}, \\ H\_2: y\_i &= \beta\_1 + \beta\_2 \mathbf{x}\_{i2} + \beta\_3 \mathbf{x}\_{i3} + \beta\_4 \mathbf{x}\_{i4} + \beta\_5 \mathbf{x}\_{i5} + \beta\_6 \mathbf{x}\_{i6} + \beta\_7 \mathbf{x}\_{i7}. \end{aligned}$$

Note that the null model is correctly specified, while the alternative model contains the true model plus four additional explanatory variables. These extra explanatory variables are generated from the distribution indicated in (8).
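To make the Set 1 setup concrete, the following is a minimal simulation sketch. It rests on two assumptions that are not taken from the text: the covariates are drawn as independent standard normals, as a stand-in for the distribution in (8), and *N*(0, 50) is read as a variance of 50. The names `simulate_set1` and `design_matrix` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def simulate_set1(n, seed=0):
    """Draw n observations from the Set 1 data-generating model.

    ASSUMPTIONS (not from the source): covariates x2..x7 are independent
    standard normals, standing in for the distribution in (8), and
    N(0, 50) is read as variance 50 (standard deviation sqrt(50)).
    """
    rng = random.Random(seed)
    noise_sd = math.sqrt(50.0)
    rows = []
    for _ in range(n):
        x = {j: rng.gauss(0.0, 1.0) for j in range(2, 8)}  # x2, ..., x7
        # Only x2 and x3 enter the true model; x4..x7 are pure noise variables.
        y = 1.0 + 0.5 * x[2] + 0.5 * x[3] + rng.gauss(0.0, noise_sd)
        rows.append((y, x))
    return rows


def design_matrix(rows, variables):
    """Rows of [1, x_j for j in variables]; the leading 1 is the intercept."""
    return [[1.0] + [x[j] for j in variables] for _, x in rows]


data = simulate_set1(n=100, seed=1)
X_null = design_matrix(data, [2, 3])             # H1: same structure as the true model
X_alt = design_matrix(data, [2, 3, 4, 5, 6, 7])  # H2: four extra variables
print(len(X_null[0]), len(X_alt[0]))  # prints "3 7"
```

The two design matrices share their first three columns, which is what makes the null model nested in the alternative.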

Set 2: *Null hypothesis is underspecified, and alternative hypothesis is correctly specified.*

Consider the true data-generating model given by

$$y\_i = \beta\_{0,1} + \beta\_{0,2}\mathbf{x}\_{i2} + \beta\_{0,3}\mathbf{x}\_{i3} + \beta\_{0,4}\mathbf{x}\_{i4} + \beta\_{0,5}\mathbf{x}\_{i5} + \epsilon\_i,$$

where *ε<sub>i</sub>* ∼ *N*(0, 45), *β*<sub>0,1</sub> = 1, *β*<sub>0,2</sub> = 0.11, *β*<sub>0,3</sub> = 0.13, *β*<sub>0,4</sub> = 0.12, *β*<sub>0,5</sub> = −0.11, and (*x<sub>i2</sub>*, *x<sub>i3</sub>*, …, *x<sub>i5</sub>*)<sup>*T*</sup> is sampled as indicated in (8).

For the hypothesis testing setting in Set 2, the null and alternative models are

$$\begin{aligned} H\_1: y\_i &= \beta\_1 + \beta\_2 \mathbf{x}\_{i2} + \beta\_3 \mathbf{x}\_{i3} + \beta\_4 \mathbf{x}\_{i4}, \\ H\_2: y\_i &= \beta\_1 + \beta\_2 \mathbf{x}\_{i2} + \beta\_3 \mathbf{x}\_{i3} + \beta\_4 \mathbf{x}\_{i4} + \beta\_5 \mathbf{x}\_{i5}. \end{aligned}$$

Here, the alternative model has the same structure as the data-generating model, but the null model is missing one of the explanatory variables in the true model, namely *x*<sub>5</sub>.
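As an illustration of what underspecification means in Set 2, the sketch below simulates one data set and compares the least-squares residual sum of squares of the two candidate models. This is not the testing procedure studied in this paper, only an ordinary least-squares fit used to make the nesting visible. It assumes independent standard normal covariates in place of (8) and reads *N*(0, 45) as a variance of 45; `simulate_set2`, `solve`, and `rss` are hypothetical helpers.

```python
import math
import random

BETA_TRUE = [1.0, 0.11, 0.13, 0.12, -0.11]  # intercept, x2, x3, x4, x5


def simulate_set2(n, seed=0):
    """ASSUMPTIONS: standard normal covariates (stand-in for (8)), variance 45."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(45.0)
    X, y = [], []
    for _ in range(n):
        row = [1.0] + [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(row)
        y.append(sum(b * v for b, v in zip(BETA_TRUE, row)) + rng.gauss(0.0, noise_sd))
    return X, y


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [A[i][:] + [b[i]] for i in range(k)]  # augmented matrix
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        z[i] = (M[i][k] - sum(M[i][j] * z[j] for j in range(i + 1, k))) / M[i][i]
    return z


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit on the given columns."""
    Z = [[row[c] for c in cols] for row in X]
    k = len(cols)
    XtX = [[sum(r[a] * r[b] for r in Z) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(Z, y)) for a in range(k)]
    coef = solve(XtX, Xty)  # normal equations
    return sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(Z, y))


X, y = simulate_set2(n=200, seed=1)
rss_null = rss(X, y, [0, 1, 2, 3])    # H1: omits x5
rss_alt = rss(X, y, [0, 1, 2, 3, 4])  # H2: same structure as the true model
print(rss_null >= rss_alt)
```

Because the null model is nested in the alternative, its residual sum of squares can never be smaller; how much larger it is depends on the sample and on the small true coefficient of *x*<sub>5</sub> relative to the noise.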

Set 3: *Both null and alternative models are underspecified, but the null is closer to the data-generating model.*

Consider the true data-generating model given by

$$y\_i = \beta\_{0,1} + \beta\_{0,2}\mathbf{x}\_{i2} + \beta\_{0,3}\mathbf{x}\_{i3} + \beta\_{0,4}\mathbf{x}\_{i4} + \beta\_{0,5}\mathbf{x}\_{i5} + \beta\_{0,6}\mathbf{x}\_{i6} + \epsilon\_i,$$

where *ε<sub>i</sub>* ∼ *N*(0, 50), *β*<sub>0,1</sub> = 1, *β*<sub>0,2</sub> = *β*<sub>0,3</sub> = 0.5, *β*<sub>0,4</sub> = *β*<sub>0,5</sub> = −0.5, *β*<sub>0,6</sub> = 0.1, and (*x<sub>i2</sub>*, *x<sub>i3</sub>*, …, *x<sub>i6</sub>*)<sup>*T*</sup> is sampled as indicated in (8).

For the hypothesis testing setting in Set 3, the null and alternative models are

$$\begin{aligned} H\_1: y\_i &= \beta\_1 + \beta\_2 \mathbf{x}\_{i2} + \beta\_3 \mathbf{x}\_{i3}, \\ H\_2: y\_i &= \beta\_1 + \beta\_4 \mathbf{x}\_{i4} + \beta\_6 \mathbf{x}\_{i6}. \end{aligned}$$

In this setting, the null and alternative candidate models have the same number of explanatory variables, and both omit the variable *x*<sub>5</sub>. They differ, however, in the effect sizes of the variables they retain. For the alternative, the effect sizes are −0.5 and 0.1 for *x*<sub>4</sub> and *x*<sub>6</sub>, respectively, whereas for the null the effect size is 0.5 for both *x*<sub>2</sub> and *x*<sub>3</sub>. The smaller effect size on *x*<sub>6</sub> therefore places the alternative further from the true model than the null.
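One way to quantify "closer to the data-generating model" is to add up the squared true coefficients of the variables each candidate leaves out. The sketch below does this for Set 3. It assumes, as a simplification not taken from the text, that the covariates are independent with unit variance, in which case this sum equals the variance of the signal the candidate cannot capture. The function name `omitted_signal` is hypothetical.

```python
# True slope coefficients for Set 3, keyed by variable index (intercept excluded).
beta_true = {2: 0.5, 3: 0.5, 4: -0.5, 5: -0.5, 6: 0.1}


def omitted_signal(included):
    """Sum of squared true coefficients of the variables a candidate omits.

    ASSUMPTION: covariates are independent with unit variance, so this sum
    is the variance of the part of the signal the candidate cannot capture.
    """
    return sum(b ** 2 for j, b in beta_true.items() if j not in included)


null_gap = omitted_signal({2, 3})  # H1 omits x4, x5, x6
alt_gap = omitted_signal({4, 6})   # H2 omits x2, x3, x5
print(round(null_gap, 2), round(alt_gap, 2))  # prints "0.51 0.75"
```

Under this assumption the null leaves out less signal than the alternative, and the difference comes entirely from the alternative retaining *x*<sub>6</sub> (squared effect 0.01) where the null retains a variable with squared effect 0.25.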
