**1. Introduction**

Let (X, *β*<sub>X</sub>, *P<sub>θ</sub>*)<sub>*θ*∈Θ</sub> be the statistical space associated with the random variable *X*, where *β*<sub>X</sub> is the *σ*-field of Borel subsets *A* ⊂ X and {*P<sub>θ</sub>*}<sub>*θ*∈Θ</sub> is a family of probability distributions defined on the measurable space (X, *β*<sub>X</sub>), with Θ an open subset of R<sup>*p*</sup> and *p* ≥ 1. We assume that the probability measures *P<sub>θ</sub>* are described by densities *f<sub>θ</sub>*(*x*) = *dP<sub>θ</sub>*/*dμ*(*x*), where *μ* is a *σ*-finite measure on (X, *β*<sub>X</sub>). Given a random sample *X*<sub>1</sub>, ..., *X<sub>n</sub>* of the random variable *X* with density belonging to the parametric family {*P<sub>θ</sub>*}, the most popular estimator of the model parameter *θ* is the maximum likelihood estimator (MLE), which maximizes the likelihood function of the assumed model. The MLE has been widely studied in the literature for general statistical models, and it has been shown that, under certain regularity conditions, the sequence of MLEs of *θ*, *θ̂*<sub>*n*</sub>, is asymptotically normal and satisfies desirable properties such as consistency and asymptotic efficiency; that is, the MLE is the best asymptotically normal (BAN) estimator. However, in many popular statistical models, the MLE is markedly non-robust against deviations, even very small ones, from the parametric conditions.

To overcome the lack of robustness, minimum distance (or minimum divergence) estimators (MDEs) have been developed. MDEs have received growing attention in statistical inference because of their ability to reconcile efficiency and robustness. In parametric estimation, the role of divergence or distance measures is very intuitive: the estimates of the unknown parameters are obtained by minimizing a suitable divergence measure between the distribution estimated from the data and the assumed model distribution. There is a growing

**Citation:** Jaenada, M.; Miranda, P.; Pardo, L. Robust Test Statistics Based on Restricted Minimum Rényi's Pseudodistance Estimators. *Entropy* **2022**, *24*, 616. https://doi.org/ 10.3390/e24050616

Academic Editors: Karagrigoriou Alexandros and Makrides Andreas

Received: 30 March 2022 Accepted: 26 April 2022 Published: 28 April 2022


**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

body of literature that recognizes the importance of MDEs in terms of robustness, without a significant loss of efficiency, with respect to the MLE. See, for instance, the works of Beran [1], Tamura and Boes [2], Simpson [3,4], Lindsay [5], Pardo [6], and Basu et al. [7] and the references therein.

Let *G* denote the unknown distribution function, with associated density *g*, underlying the data. The minimum divergence (distance) functional evaluated at *G*, *T*(*G*), is defined as

$$d(g, f_{T(G)}) = \min_{\theta \in \Theta} d(g, f_{\theta}), \tag{1}$$

with *d*(*g*, *f<sub>θ</sub>*) being a distance or divergence measure between the densities *g* and *f<sub>θ</sub>*. As the true distribution underlying the data is unknown, given a random sample, we can estimate the model parameter *θ* by replacing the true distribution *G* in the previous expression with its empirical estimate *G<sub>n</sub>*. Therefore, the MDE of *θ* is given by

$$
\widehat{\theta}_n = T(G_n). \tag{2}
$$

When dealing with continuous models, it is convenient to consider families of divergence measures that do not require non-parametric estimators of the unknown density function. From this perspective, the density power divergence (DPD) family, leading to the minimum density power divergence estimators (MDPDEs) (see Basu et al. [7]), as well as the Rényi pseudodistance (RP), leading to the minimum Rényi pseudodistance estimators (MRPEs) (see Broniatowski et al. [8]), among others, play an important role. The results presented in Broniatowski et al. [8] in the context of independent and identically distributed random variables were extended to the case of independent but not identically distributed random variables by Castilla et al. [9].

In many situations we have additional knowledge about the true parameter value, as it must satisfy certain constraints. Then, the restricted parameter space has the form

$$\{\theta \in \Theta : g(\theta) = \mathbf{0}_r\}, \tag{3}$$

where **0**<sub>*r*</sub> denotes the null vector of dimension *r*, and *g* : R<sup>*p*</sup> → R<sup>*r*</sup> is a vector-valued function such that the *p* × *r* matrix

$$G(\theta) = \frac{\partial \mathbf{g}^{\mathsf{T}}(\theta)}{\partial \theta} \tag{4}$$

exists and is continuous in *θ*, with rank(*G*(*θ*)) = *r*. Here, the superscript *T* denotes the transpose of a matrix. In the following, the restricted parameter space given in (3) is denoted by Θ<sub>0</sub>, as in most situations it represents a composite null hypothesis.
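As a concrete illustration of (3) and (4), consider a hypothetical normal model with *θ* = (*μ*, *σ*) and the single restriction *σ* = 1, so that *p* = 2 and *r* = 1. The sketch below (all names chosen for illustration only) encodes the constraint function *g* and its matrix *G*:

```python
import numpy as np

# Hypothetical instance of (3)-(4): theta = (mu, sigma) for a normal model
# with the single restriction sigma = 1, so p = 2 and r = 1.
def g(theta):
    """Constraint function g: R^2 -> R^1, g(theta) = sigma - 1."""
    mu, sigma = theta
    return np.array([sigma - 1.0])

def G(theta):
    """p x r matrix G(theta) = d g^T(theta) / d theta; constant here."""
    return np.array([[0.0],
                     [1.0]])

theta0 = np.array([0.3, 1.0])   # a point of the restricted space Theta_0
```

Here rank(*G*(*θ*)) = 1 = *r* at every *θ*, as the theory requires.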

The most popular estimator of *θ* under the non-linear constraints given in (3) is the restricted MLE (RMLE), which maximizes the likelihood function subject to the constraint *g*(*θ*) = **0**<sub>*r*</sub> (see Silvey [10]). The RMLE suffers from the same robustness problems as the MLE. To overcome this deficiency, the restricted MDPDEs (RMDPDEs) were introduced in Basu et al. [11], and their theoretical robustness properties were later studied in Ghosh [12].

The main purpose of this paper is to extend the theory developed for the MRPE to the restricted parameter space setting, yielding the restricted MRPE (RMRPE), where the parameter space has the form (3). The rest of the paper is organized as follows: In Section 2, the MRPE is introduced. Section 3 presents the RMRPE, and its asymptotic distribution as well as its influence function are obtained. In Section 4, two different test statistics for testing composite null hypotheses, based on the RMRPE, are developed, and explicit expressions of the statistics are presented for testing in normal populations. Section 5 presents a simulation study, where the robustness of the proposed estimators and test statistics is empirically shown. Section 6 deals with real-data applications. Finally, some conclusions are presented in Section 7.

#### **2. Minimum Rényi Pseudodistance Estimators**

In this section, we introduce the MRPE. We derive the estimating equations of the MRPE and recall its asymptotic distribution.

Let *X*<sub>1</sub>, ..., *X<sub>n</sub>* be a random sample of size *n* from a population having true and unknown density function *g*, modeled by a parametric family of densities *f<sub>θ</sub>* with *θ* ∈ Θ ⊂ R<sup>*p*</sup>. The RP between the densities *f<sub>θ</sub>* and *g* is given, for *τ* > 0, by

$$\begin{split} R_{\tau}(f_{\theta}, g) &= \frac{1}{\tau+1} \log \left( \int f_{\theta}(x)^{\tau+1}\, dx \right) + \frac{1}{\tau(\tau+1)} \log \left( \int g(x)^{\tau+1}\, dx \right) \\ &\quad - \frac{1}{\tau} \log \left( \int f_{\theta}(x)^{\tau} g(x)\, dx \right). \end{split}$$

The RP can be defined for *τ* = 0 taking continuous limits, yielding the expression

$$R_0(f_{\theta}, g) = \lim_{\tau \downarrow 0} R_{\tau}(f_{\theta}, g) = \int g(x) \log \frac{g(x)}{f_{\theta}(x)}\, dx.$$

Then, at *τ* = 0, the RP coincides with the Kullback–Leibler divergence (KL) between *g* and *f<sub>θ</sub>* (see Pardo [6]).
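This limit can be checked numerically. The sketch below (a hypothetical illustration: two arbitrary normal densities and a plain Riemann sum for the integrals) evaluates *R<sub>τ</sub>* at a small *τ* and compares it with the KL divergence:

```python
import numpy as np

# Sketch: R_tau(f_theta, g) approaches KL(g, f_theta) as tau -> 0.
# g = N(0, 1) and f_theta = N(0.5, 1.2^2) are arbitrary illustrative choices.
x = np.linspace(-15.0, 15.0, 200001)
dx = x[1] - x[0]

def normal_pdf(m, s):
    return np.exp(-(x - m)**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))

g_dens, f_dens = normal_pdf(0.0, 1.0), normal_pdf(0.5, 1.2)
integrate = lambda y: y.sum() * dx      # crude quadrature on the grid

def rp(tau):
    """Renyi pseudodistance R_tau(f_theta, g) for tau > 0."""
    t1 = np.log(integrate(f_dens**(tau + 1))) / (tau + 1)
    t2 = np.log(integrate(g_dens**(tau + 1))) / (tau * (tau + 1))
    t3 = np.log(integrate(f_dens**tau * g_dens)) / tau
    return t1 + t2 - t3

kl = integrate(g_dens * np.log(g_dens / f_dens))   # the tau = 0 limit
```

For small *τ* (e.g. *τ* = 10<sup>−3</sup>), `rp(tau)` agrees with `kl` to within the order of *τ*, and both are non-negative, in line with the positivity of the RP.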

The RP was first considered by Jones et al. [13]. Later, Broniatowski et al. [8] established some useful properties of the divergence, such as the positivity of the RP for any two densities and all values of the parameter *τ*, *R<sub>τ</sub>*(*f<sub>θ</sub>*, *g*) ≥ 0, and the uniqueness of the minimum within a parametric family, that is, *R<sub>τ</sub>*(*f<sub>θ</sub>*, *g*) = 0 if and only if *f<sub>θ</sub>* = *g*. The latter property justifies defining the MRPE as the minimizer of the RP between the assumed distribution and the empirical distribution of the data. It is interesting to note that the so-called RP of Broniatowski et al. [8] had been previously considered by Fujisawa and Eguchi [14] under the name of *γ*-cross entropy; in that paper, some appealing robustness properties of estimators based on this entropy are shown.

Given a sample *X*<sub>1</sub>, ..., *X<sub>n</sub>*, it follows from Broniatowski et al. [8] that minimizing *R<sub>τ</sub>*(*f<sub>θ</sub>*, *g*) leads to the following definition.

**Definition 1.** *Let* (X, *β*<sub>X</sub>, *f<sub>θ</sub>*)<sub>*θ*∈Θ⊂R<sup>p</sup></sub> *be a statistical space. The MRPE based on the random sample X*<sub>1</sub>, ..., *X<sub>n</sub> for the unknown parameter θ is given, for τ* > 0*, by*

$$\widehat{\theta}_{\tau}(X_1, \ldots, X_n) = \arg\sup_{\theta \in \Theta} \sum_{i=1}^n \frac{f_{\theta}(X_i)^{\tau}}{C_{\tau}(\theta)}, \tag{5}$$

*where*

$$C\_{\tau}(\theta) = \left(\int f\_{\theta}(x)^{\tau+1} dx\right)^{\frac{\tau}{\tau+1}}.$$
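To make Definition 1 concrete, the sketch below maximizes the objective in (5) over a crude grid for a contaminated normal sample (an illustrative setting; sample sizes, grids, and the value *τ* = 0.5 are arbitrary choices). For the normal model, *C<sub>τ</sub>*(*θ*) is available in closed form because ∫*f<sub>θ</sub>*(*x*)<sup>*τ*+1</sup>*dx* = (*τ*+1)<sup>−1/2</sup>(2*πσ*²)<sup>−*τ*/2</sup>:

```python
import numpy as np

# Sketch of the MRPE of Definition 1 for a normal model N(mu, sigma^2),
# computed by a crude grid search (all tuning choices are illustrative).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), np.full(10, 10.0)])  # 10 outliers

tau = 0.5

def objective(mu, sigma):
    f = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    # closed form: int f^{tau+1} dx = (tau+1)^{-1/2} (2 pi sigma^2)^{-tau/2}
    C = ((tau + 1)**-0.5 * (2 * np.pi * sigma**2)**(-tau / 2))**(tau / (tau + 1))
    return np.sum(f**tau) / C

grid = [(m, s) for m in np.linspace(-1, 1, 81) for s in np.linspace(0.5, 2.0, 61)]
mu_hat, sigma_hat = max(grid, key=lambda p: objective(*p))
```

With this contamination, the sample mean is pulled toward the outliers at 10, while the MRPE with *τ* = 0.5 stays near the uncontaminated values (*μ*, *σ*) = (0, 1), illustrating the robustness that motivates the MRPE.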

Further, at *τ* = 0, *θ̂*<sub>0</sub>(*X*<sub>1</sub>, ..., *X<sub>n</sub>*) minimizes the KL divergence, and thus the MRPE coincides with the MLE for *τ* = 0. Differentiating in (5), we obtain that the estimating equations of the MRPE are given by

$$\sum_{i=1}^{n} \Psi_{\tau}(X_i; \theta) = \mathbf{0}_p, \tag{6}$$

with

$$\begin{aligned} \Psi_{\tau}(x;\theta) &= f_{\theta}(x)^{\tau}\left(u_{\theta}(x) - c_{\tau}(\theta)\right), \\ u_{\theta}(x) &= \left(u_{\theta_1}(x), \dots, u_{\theta_p}(x)\right)^T, \quad u_{\theta_i}(x) = \frac{\partial}{\partial \theta_i} \log f_{\theta}(x), \\ \frac{\partial C_{\tau}(\theta)}{\partial \theta} &= \tau\, C_{\tau}(\theta)\, c_{\tau}(\theta), \end{aligned} \tag{7}$$

being

$$c_{\tau}(\theta) = \frac{1}{\kappa_{\tau}(\theta)}\, \xi_{\tau}(\theta) = \left(c_{\tau,1}(\theta), \dots, c_{\tau,p}(\theta)\right)^T, \tag{8}$$

$$\xi_{\tau}(\theta) = \int f_{\theta}(x)^{\tau+1}\, u_{\theta}(x)\, dx, \tag{9}$$

$$\kappa\_{\tau}(\theta) = \int f\_{\theta}(\mathbf{x})^{\tau+1} d\mathbf{x}.\tag{10}$$
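The third identity in (7) can be verified numerically. The sketch below (a hypothetical one-parameter example, *θ* = *σ* for a centered normal, with crude quadrature and finite differences) compares both sides of ∂*C<sub>τ</sub>*(*θ*)/∂*θ* = *τC<sub>τ</sub>*(*θ*)*c<sub>τ</sub>*(*θ*):

```python
import numpy as np

# Numeric check of d C_tau / d theta = tau * C_tau(theta) * c_tau(theta) in (7),
# for the hypothetical one-parameter model f_theta = N(0, sigma^2), theta = sigma.
x = np.linspace(-20.0, 20.0, 200001)
dx = x[1] - x[0]
tau = 0.3

def density(sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def C(sigma):                                   # normalizing constant of (5)
    return (np.sum(density(sigma)**(tau + 1)) * dx)**(tau / (tau + 1))

sigma = 1.3
f = density(sigma)
u = (x**2 - sigma**2) / sigma**3                # u_theta(x) = d log f / d sigma
kappa = np.sum(f**(tau + 1)) * dx               # kappa_tau(theta), Equation (10)
xi = np.sum(f**(tau + 1) * u) * dx              # xi_tau(theta), Equation (9)
c = xi / kappa                                  # c_tau(theta), Equation (8)

h = 1e-5
lhs = (C(sigma + h) - C(sigma - h)) / (2 * h)   # finite-difference derivative
rhs = tau * C(sigma) * c
```

Both sides agree to numerical precision, since differentiating *κ<sub>τ</sub>*(*θ*)<sup>*τ*/(*τ*+1)</sup> with ∂*f<sub>θ</sub>*/∂*θ* = *f<sub>θ</sub>u<sub>θ</sub>* gives exactly *τC<sub>τ</sub>ξ<sub>τ</sub>*/*κ<sub>τ</sub>*.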

The MRPE is an M-estimator, and thus its asymptotic distribution and influence function (IF) can be obtained from the asymptotic theory of M-estimators. Broniatowski et al. [8] studied the asymptotic properties and robustness of the MRPEs. The next result recalls the asymptotic distribution of the MRPEs.

**Theorem 1.** *Let θ*<sup>0</sup> *be the true unknown value of θ*. *Then,*

$$\sqrt{n}\left(\widehat{\theta}_{\tau} - \theta_0\right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}\left(\mathbf{0}_p,\, V_{\tau}(\theta_0)\right), \tag{11}$$

*where*

$$V_{\tau}(\theta) = S_{\tau}(\theta)^{-1} K_{\tau}(\theta)\, S_{\tau}(\theta)^{-1}, \tag{12}$$

*with*

$$S_{\tau}(\theta) = -\mathrm{E}\left[\frac{\partial \Psi_{\tau}(X;\theta)^T}{\partial \theta}\right], \tag{13}$$

$$K_{\tau}(\theta) = \mathrm{E}\left[\Psi_{\tau}(X;\theta)\, \Psi_{\tau}(X;\theta)^T\right]. \tag{14}$$

Castilla et al. [15] introduced useful notation for the computation of *V<sub>τ</sub>*(*θ*):

$$S_{\tau}(\theta) = J_{\tau}(\theta) - \frac{1}{\kappa_{\tau}(\theta)}\, \xi_{\tau}(\theta)\, \xi_{\tau}(\theta)^T, \tag{15}$$

$$K_{\tau}(\theta) = J_{2\tau}(\theta) + \frac{1}{\kappa_{\tau}(\theta)}\left(\frac{\kappa_{2\tau}(\theta)}{\kappa_{\tau}(\theta)}\, \xi_{\tau}(\theta)\, \xi_{\tau}(\theta)^T - \xi_{\tau}(\theta)\, \xi_{2\tau}(\theta)^T - \xi_{2\tau}(\theta)\, \xi_{\tau}(\theta)^T\right), \tag{16}$$

where

$$J_{\tau}(\theta) = \int f_{\theta}(x)^{\tau+1}\, u_{\theta}(x)\, u_{\theta}(x)^T\, dx, \tag{17}$$

and *ξ<sub>τ</sub>*(*θ*) and *κ<sub>τ</sub>*(*θ*) are as in (9) and (10), respectively.
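As an illustration of (12) and (15)-(17), the sketch below evaluates *V<sub>τ</sub>*(*θ*) numerically in a hypothetical setting: *f<sub>θ</sub>* = N(*μ*, 1) with only *μ* unknown (so *p* = 1 and *u<sub>θ</sub>*(*x*) = *x* − *μ*), all integrals by crude quadrature:

```python
import numpy as np

# Numeric evaluation of V_tau(theta) via (15)-(17) for the hypothetical
# one-parameter model f_theta = N(mu, 1); here u_theta(x) = x - mu and p = 1.
x = np.linspace(-12.0, 12.0, 100001)
dx = x[1] - x[0]
mu = 0.0
f = np.exp(-(x - mu)**2 / 2) / np.sqrt(2 * np.pi)
u = x - mu

def kappa(t): return np.sum(f**(t + 1)) * dx              # Equation (10)
def xi(t):    return np.sum(f**(t + 1) * u) * dx          # Equation (9)
def J(t):     return np.sum(f**(t + 1) * u * u) * dx      # Equation (17)

def V(tau):
    S = J(tau) - xi(tau)**2 / kappa(tau)                  # Equation (15)
    K = (J(2 * tau)
         + (kappa(2 * tau) / kappa(tau)**2) * xi(tau)**2
         - 2 * xi(tau) * xi(2 * tau) / kappa(tau))        # Equation (16), p = 1
    return K / S**2                                       # Equation (12)
```

For this model *ξ<sub>τ</sub>* = 0 by symmetry, so *V<sub>τ</sub>* = *J*<sub>2*τ*</sub>/*J<sub>τ</sub>*², which grows with *τ*: V(0) = 1 recovers the MLE's asymptotic variance (full efficiency), while V(0.5) ≈ 1.19, the usual efficiency-robustness trade-off.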

Toma and Leoni-Aubin [16] defined new robust and efficient measures based on the RP. Later, Toma et al. [17] considered the MRPE for general parametric models and developed a model selection criterion for regression models. Broniatowski et al. [8] applied the method to the multiple regression model (MRM) with random covariates. Subsequently, Castilla et al. [18] developed Wald-type tests based on the MRPE for the MRM, and Castilla et al. [19] studied the MRPE for the MRM in the ultra-high-dimensional set-up. Further, Jaenada and Pardo [20,21] considered the MRPE and Wald-type test statistics for generalized linear models (GLMs). Beyond Wald-type test statistics, other relevant test statistics play an important role in the statistical literature: the likelihood-ratio and Rao (or score) tests, which are based on restricted estimators, usually the RMLE. It therefore makes sense to develop robust versions of these popular statistics based on the RMRPE.

#### **3. The Restricted Minimum Rényi Pseudodistance Estimator: Asymptotic Distribution and Influence Function of RMRPE**

In this section, we introduce the RMRPE and we derive its asymptotic distribution. Moreover, we study its robustness properties through its influence function (IF).

**Definition 2.** *The RMRPE functional T*\*<sub>*τ*</sub>(*G*) *evaluated at the distribution G is defined by*

$$R_{\tau}(g, f_{T^*_{\tau}(G)}) = \min_{\theta \in \Theta_0} R_{\tau}(g, f_{\theta}),$$

*given that such a minimum exists.*

*Accordingly, given a random sample X*<sub>1</sub>, ..., *X<sub>n</sub> from the distribution G, the RMRPE of θ is defined as*

$$\widetilde{\theta}_{\tau} = \arg\sup_{\theta \in \Theta_0} \sum_{i=1}^{n} \frac{f_{\theta}(X_i)^{\tau}}{C_{\tau}(\theta)}.$$
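Computationally, the only change from (5) is that the supremum runs over Θ<sub>0</sub> instead of Θ. In the sketch below (a hypothetical normal model under the constraint *σ* = 1, with illustrative tuning values), this simply pins *σ* in the grid search:

```python
import numpy as np

# Sketch of the RMRPE for N(mu, sigma^2) under g(theta) = sigma - 1 = 0,
# i.e., the supremum is taken over Theta_0 = {(mu, sigma) : sigma = 1} only.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 300)
tau = 0.4

def objective(mu, sigma):
    f = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    # closed form of C_tau(theta) for the normal model
    C = ((tau + 1)**-0.5 * (2 * np.pi * sigma**2)**(-tau / 2))**(tau / (tau + 1))
    return np.sum(f**tau) / C

mus = np.linspace(-1.0, 1.0, 401)
vals = [objective(m, 1.0) for m in mus]          # sigma fixed by the constraint
mu_restricted = mus[int(np.argmax(vals))]
```

For general non-linear constraints *g*(*θ*) = **0**<sub>*r*</sub>, the same objective would instead be maximized with a constrained optimizer or Lagrange multipliers.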

The next result states the asymptotic distribution of the RMRPE, *θ̃<sub>τ</sub>* = *T*\*<sub>*τ*</sub>(*G<sub>n</sub>*).

**Theorem 2.** *Suppose that the true distribution satisfies the conditions of the model, and denote by θ*<sub>0</sub> ∈ Θ<sub>0</sub> *the true parameter. Then, the RMRPE θ̃<sub>τ</sub> of θ obtained under the constraints g*(*θ*) = **0**<sub>*r*</sub> *satisfies*

$$\sqrt{n}\left(\widetilde{\theta}_{\tau} - \theta_0\right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}\left(\mathbf{0}_p,\, \Sigma_{\tau}(\theta_0)\right),$$

*where*

$$\Sigma_{\tau}(\theta_0) = P^*_{\tau}(\theta_0)\, K_{\tau}(\theta_0)\, P^*_{\tau}(\theta_0)^T,$$

$$P^*_{\tau}(\theta_0) = S_{\tau}(\theta_0)^{-1} - Q_{\tau}(\theta_0)\, G(\theta_0)^T S_{\tau}(\theta_0)^{-1}, \tag{18}$$

$$Q_{\tau}(\theta_0) = S_{\tau}(\theta_0)^{-1} G(\theta_0) \left[ G(\theta_0)^T S_{\tau}(\theta_0)^{-1} G(\theta_0) \right]^{-1}, \tag{19}$$

*and Sτ*(*θ*0) *is defined in (13), evaluated at θ* = *θ*0*.*
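The quantities in (18) and (19) are straightforward to evaluate numerically. The sketch below (a hypothetical setting: the normal model *θ* = (*μ*, *σ*) at *θ*<sub>0</sub> = (0, 1) under the constraint *σ* = 1, with quadrature-based integrals) computes Σ<sub>*τ*</sub>(*θ*<sub>0</sub>); it is run at *τ* = 0, where the result must reduce to the restricted-MLE covariance diag(1, 0):

```python
import numpy as np

# Numeric evaluation of (18)-(19) and Sigma_tau(theta_0) for the hypothetical
# model N(mu, sigma^2) at theta_0 = (0, 1), under g(theta) = sigma - 1 = 0.
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
u = np.stack([x, x**2 - 1.0])        # (d/dmu, d/dsigma) log f_theta at theta_0

def kappa(t): return np.sum(f**(t + 1)) * dx
def xi(t):    return (f**(t + 1) * u).sum(axis=1) * dx
def J(t):     return (f**(t + 1) * u[:, None, :] * u[None, :, :]).sum(axis=2) * dx

tau = 0.0                            # tau = 0 reproduces the restricted MLE
S = J(tau) - np.outer(xi(tau), xi(tau)) / kappa(tau)                    # (15)
K = (J(2 * tau) + (kappa(2 * tau) / kappa(tau)**2) * np.outer(xi(tau), xi(tau))
     - (np.outer(xi(tau), xi(2 * tau))
        + np.outer(xi(2 * tau), xi(tau))) / kappa(tau))                 # (16)
G = np.array([[0.0], [1.0]])         # G(theta_0) for g(theta) = sigma - 1
S_inv = np.linalg.inv(S)
Q = S_inv @ G @ np.linalg.inv(G.T @ S_inv @ G)   # Equation (19)
P = S_inv - Q @ G.T @ S_inv                      # Equation (18)
Sigma = P @ K @ P.T
```

The (2, 2) entry of Σ vanishes because *σ* is pinned by the constraint, while the (1, 1) entry gives the asymptotic variance of the restricted estimator of *μ*.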

*P*∗
