*4.1. Settings for Simulation Sets*

For Sets 1 to 6, the true data-generating model is of the form

$$y_{i} = \mathbf{x}_{i}^{T}\boldsymbol{\beta}_{0} + \varepsilon_{i},$$

$$\text{with } \boldsymbol{\beta}_{0}^{T} = \begin{bmatrix} \beta_{0,1} & \beta_{0,2} & \cdots & \beta_{0,p} \end{bmatrix}, \quad \mathbf{x}_{i}^{T} = \begin{bmatrix} 1 & x_{i2} & \cdots & x_{ip} \end{bmatrix}, \quad \text{and}$$

$$\begin{bmatrix} x_{i2} & \cdots & x_{ip} \end{bmatrix}^{T} \sim \mathcal{N}_{p-1}(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \tag{8}$$

where the entries of $\boldsymbol{\mu}$ are chosen from $\{-1, 1\}$ with equal probability, and $\boldsymbol{\Sigma} = \mathrm{diag}_{p-1}(100)$. For Sets 1 to 4, we have $\varepsilon_{i} \sim \mathcal{N}(0, \sigma_{0}^{2})$; for Set 5, we have that $\varepsilon_{i} \sim t_{df=5}$, where $t_{df}$ denotes the Student's $t$ distribution based on $df$ degrees of freedom; and for Set 6, we have that $\varepsilon_{i} \sim Z \cdot \mathcal{N}(0, 1) + (1 - Z) \cdot \mathcal{N}(0, 50)$, where $Z \sim \mathrm{Bernoulli}(\pi)$ with $\pi = 0.85$.
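The settings above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name `generate_data` and the `error` flag are our own, and $\mathcal{N}(0, 50)$ in Set 6 is read as a normal with variance 50.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_data(n, p, beta0, sigma0=1.0, error="normal"):
    """Simulate one sample from the model in (8); names are illustrative."""
    # Entries of mu drawn from {-1, 1} with equal probability;
    # Sigma = diag_{p-1}(100), so covariates are independent with variance 100.
    mu = rng.choice([-1.0, 1.0], size=p - 1)
    covariates = rng.multivariate_normal(mu, 100.0 * np.eye(p - 1), size=n)
    X = np.column_stack([np.ones(n), covariates])   # design with intercept
    if error == "normal":                           # Sets 1-4
        eps = rng.normal(0.0, sigma0, size=n)
    elif error == "t":                              # Set 5: Student's t, 5 df
        eps = rng.standard_t(df=5, size=n)
    else:                                           # Set 6: normal mixture
        z = rng.random(n) < 0.85                    # Z ~ Bernoulli(0.85)
        eps = np.where(z,
                       rng.normal(0.0, 1.0, size=n),
                       rng.normal(0.0, np.sqrt(50.0), size=n))
    y = X @ beta0 + eps
    return X, y
```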

In the setting at hand, the true data-generating model $g$ has parameters $\theta = (\boldsymbol{\beta}_{0}^{T}, \sigma_{0}^{2})^{T}$. Hurvich and Tsai [6] showed that for the family of approximating models $y = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where $X$ is the design matrix and $\boldsymbol{\varepsilon} \sim \mathcal{N}(0, \sigma^{2} I_{n})$, with maximum likelihood estimators given by

$$\hat{\boldsymbol{\beta}} = (X^T X)^{-1} X^T y$$

and

$$
\hat{\sigma}^2 = \frac{(y - X\hat{\boldsymbol{\beta}})^T(y - X\hat{\boldsymbol{\beta}})}{n},
$$
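These two estimators are standard least squares and the biased (divide-by-$n$) variance estimator; a minimal sketch, with the helper name `ml_fit` our own:

```python
import numpy as np

def ml_fit(X, y):
    """Maximum likelihood estimators of beta and sigma^2 under the normal model."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # solves (X^T X)^{-1} X^T y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / len(y)               # ML divisor is n, not n - p
    return beta_hat, sigma2_hat
```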

the KLD measure $d(g, \hat{\theta})$ is given by

$$d(g, \hat{\theta}) = n \log(2\pi\hat{\sigma}^{2}) + \frac{n\sigma_{0}^{2}}{\hat{\sigma}^{2}} + \frac{(X\boldsymbol{\beta}_{0} - X\hat{\boldsymbol{\beta}})^{T}(X\boldsymbol{\beta}_{0} - X\hat{\boldsymbol{\beta}})}{\hat{\sigma}^{2}}.\tag{9}$$
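Equation (9) translates directly into code. The function name `kld` and its argument list are illustrative:

```python
import numpy as np

def kld(X, beta0, sigma0_sq, beta_hat, sigma2_hat):
    """Evaluate the KLD measure d(g, theta_hat) of equation (9)."""
    n = X.shape[0]
    diff = X @ (beta0 - beta_hat)                   # X beta_0 - X beta_hat
    return (n * np.log(2.0 * np.pi * sigma2_hat)
            + n * sigma0_sq / sigma2_hat
            + diff @ diff / sigma2_hat)
```

As a sanity check, when the fitted model equals the true one ($\hat{\boldsymbol{\beta}} = \boldsymbol{\beta}_0$, $\hat{\sigma}^2 = \sigma_0^2$), the expression reduces to $n\log(2\pi\sigma_0^2) + n$.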

The expected value of the KLD for the null and the alternative models was approximated by averaging the KLD over 5000 samples generated from $g$. These 5000 KLD values, computed using (9), approximate the joint distribution of $d(g, \hat{\theta}_{1})$ and $d(g, \hat{\theta}_{2})$; hence, the simulation-based estimator of the KLDCP is given by

$$\hat{P} = \frac{1}{5000} \sum_{i=1}^{5000} I\!\left[d(g, \hat{\theta}_{1}(i)) < d(g, \hat{\theta}_{2}(i))\right]. \tag{10}$$
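The estimator in (10) is simply the empirical frequency with which the first model attains the smaller KLD across replicates; a sketch, with `kldcp_hat` our own name:

```python
import numpy as np

def kldcp_hat(d1, d2):
    """Empirical KLDCP estimate (10): fraction of replicates with d1 < d2."""
    d1 = np.asarray(d1)
    d2 = np.asarray(d2)
    return np.mean(d1 < d2)
```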

This KLDCP estimate is calculated 100 times in order to estimate the KLDCP distribution and its expected value.

Finally, for each of the 5000 samples, we calculate the BD and the BDb using 200 bootstrap samples. However, to attenuate the simulation variability incurred by the mixture distribution, the number of bootstrap samples in Set 6 was increased to 500. The results displayed in the tables are based on averages over the 5000 samples.
