*2.5. Estimation and Inference*

Attractive asymptotic properties, such as consistency and efficiency, make the maximum likelihood method the most widely used method of parametric point estimation. The MLEs are the points that maximize the likelihood function over the parameter space. Since the logarithm is an increasing function, maximizing the log-likelihood function, besides being more convenient, yields the same MLEs.

Given that *ξ* = (*ξ*1, ..., *ξr*) is the *r* × 1 parametric vector of a random variable *X* that follows a Normal-*G* distribution, *G*(*x*|*ξ*) = *Gξ*(*x*) is the baseline cdf, *g*(*x*|*ξ*) = *gξ*(*x*) is its corresponding pdf and **X** = (*x*1, ..., *xm*) is a complete random sample of size *m* from *X*, then the log-likelihood function is:

$$\begin{split} \ell(\boldsymbol{\xi}|\mathbf{X}) &= \sum_{j=1}^{m} \log \phi\left( \frac{2G_{\boldsymbol{\xi}}(x_{j}) - 1}{G_{\boldsymbol{\xi}}(x_{j}) \left[1 - G_{\boldsymbol{\xi}}(x_{j})\right]} \right) + \sum_{j=1}^{m} \log \left[1 - 2G_{\boldsymbol{\xi}}(x_{j}) + 2G_{\boldsymbol{\xi}}^{2}(x_{j})\right] \\ &\quad - 2\sum_{j=1}^{m} \log G_{\boldsymbol{\xi}}(x_{j}) - 2\sum_{j=1}^{m} \log \left[1 - G_{\boldsymbol{\xi}}(x_{j})\right] + \sum_{j=1}^{m} \log g_{\boldsymbol{\xi}}(x_{j}) \ . \end{split} \tag{25}$$
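Equation (25) is simply the sum of log *f*(*xj*), where the Normal-*G* pdf factors as *f* = φ(*T*)·*T*′ with *T* = (2*Gξ* − 1)/[*Gξ*(1 − *Gξ*)] and *T*′ = *gξ*(1 − 2*Gξ* + 2*Gξ*²)/[*Gξ*²(1 − *Gξ*)²]. As a quick numerical sanity check (not part of the paper), the Python sketch below compares the two forms observation by observation, taking an exponential baseline *G*(*x*) = 1 − e^(−λ*x*) purely as an assumed example:

```python
import math

def ell_term(x, lam):
    # one summand of (25); exponential baseline G(x) = 1 - exp(-lam*x) is an assumption
    e = math.exp(-lam * x)
    G, g = 1.0 - e, lam * e
    T = (2.0 * G - 1.0) / (G * (1.0 - G))
    log_phi = -0.5 * T * T - 0.5 * math.log(2.0 * math.pi)
    return (log_phi + math.log(1.0 - 2.0 * G + 2.0 * G * G)
            - 2.0 * math.log(G) - 2.0 * math.log(1.0 - G) + math.log(g))

def log_pdf(x, lam):
    # direct log-density: log phi(T) + log T', T' = g*(1 - 2G + 2G^2)/[G^2*(1 - G)^2]
    e = math.exp(-lam * x)
    G, g = 1.0 - e, lam * e
    T = (2.0 * G - 1.0) / (G * (1.0 - G))
    log_phi = -0.5 * T * T - 0.5 * math.log(2.0 * math.pi)
    dT = g * (1.0 - 2.0 * G + 2.0 * G * G) / (G**2 * (1.0 - G)**2)
    return log_phi + math.log(dT)

# differences are at floating-point noise level
print([ell_term(x, 1.5) - log_pdf(x, 1.5) for x in (0.3, 1.0, 2.5)])
```

The two expressions agree because expanding log *T*′ into its factors produces exactly the last four sums of (25).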

Thanks to the numerical optimization routines available in software for statistical computing, it is possible to maximize (25) numerically; for this purpose, R [22] provides the function optim in the package stats.
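A minimal sketch of this numerical maximization, written in Python with scipy.optimize.minimize_scalar as a stand-in for R's optim; the exponential baseline and the true value λ = 1.5 are illustrative assumptions, not choices made in the paper:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def neg_loglik(lam, x):
    # negative of (25) under an assumed exponential baseline G(x) = 1 - exp(-lam*x)
    G = 1.0 - np.exp(-lam * x)
    g = lam * np.exp(-lam * x)
    T = (2.0 * G - 1.0) / (G * (1.0 - G))
    log_phi = -0.5 * T**2 - 0.5 * np.log(2.0 * np.pi)
    return -np.sum(log_phi + np.log(1.0 - 2.0 * G + 2.0 * G**2)
                   - 2.0 * np.log(G) - 2.0 * np.log(1.0 - G) + np.log(g))

# simulate Normal-exponential data by inverse transform: solve T(x) = z, z ~ N(0, 1),
# i.e. take the root of z*G^2 + (2 - z)*G - 1 = 0 that lies in (0, 1)
rng = np.random.default_rng(42)
z = rng.standard_normal(2000)
G = np.where(np.abs(z) < 1e-12, 0.5, (z - 2.0 + np.sqrt(z**2 + 4.0)) / (2.0 * z))
x = -np.log(1.0 - G) / 1.5          # true lambda = 1.5

res = minimize_scalar(neg_loglik, bounds=(0.05, 20.0), args=(x,), method="bounded")
print(res.x)                         # MLE of lambda; should land near 1.5
```

In R the analogous call would be optim (or optimize in the one-parameter case) applied to the same negative log-likelihood.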

The MLEs can also be obtained by solving the system of equations *U*(*ξ*|**X**) = **0***r*, where *U*(*ξ*|**X**) = ∇*ξ* ℓ(*ξ*|**X**) = (*ui*)1≤*i*≤*r* is the score vector, such that:

$$\begin{split} u_{i} &= \sum_{j=1}^{m} \frac{[G_{\boldsymbol{\xi}}(x_{j}) - 1]^{4} - G_{\boldsymbol{\xi}}^{4}(x_{j})}{G_{\boldsymbol{\xi}}^{3}(x_{j})[1 - G_{\boldsymbol{\xi}}(x_{j})]^{3}} \cdot \frac{\partial}{\partial \xi_{i}} G_{\boldsymbol{\xi}}(x_{j}) + \sum_{j=1}^{m} \frac{4G_{\boldsymbol{\xi}}(x_{j}) - 2}{1 - 2G_{\boldsymbol{\xi}}(x_{j}) + 2G_{\boldsymbol{\xi}}^{2}(x_{j})} \cdot \frac{\partial}{\partial \xi_{i}} G_{\boldsymbol{\xi}}(x_{j}) \\ &\quad - 2 \sum_{j=1}^{m} \frac{1}{G_{\boldsymbol{\xi}}(x_{j})} \cdot \frac{\partial}{\partial \xi_{i}} G_{\boldsymbol{\xi}}(x_{j}) + 2 \sum_{j=1}^{m} \frac{1}{1 - G_{\boldsymbol{\xi}}(x_{j})} \cdot \frac{\partial}{\partial \xi_{i}} G_{\boldsymbol{\xi}}(x_{j}) + \sum_{j=1}^{m} \frac{1}{g_{\boldsymbol{\xi}}(x_{j})} \cdot \frac{\partial}{\partial \xi_{i}} g_{\boldsymbol{\xi}}(x_{j}) \end{split}$$
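For a single-parameter baseline (an assumed example with *r* = 1 and *ξ* = λ, again exponential, so ∂*Gξ*/∂λ = *x*e^(−λ*x*) and ∂*gξ*/∂λ = (1 − λ*x*)e^(−λ*x*)), a hedged Python check confirms the analytic score against a central finite difference of the log-likelihood:

```python
import math

def loglik(lam, xs):
    # log-likelihood (25) under the assumed exponential baseline G(x) = 1 - exp(-lam*x)
    total = 0.0
    for x in xs:
        e = math.exp(-lam * x)
        G, g = 1.0 - e, lam * e
        T = (2.0 * G - 1.0) / (G * (1.0 - G))
        total += (-0.5 * T * T - 0.5 * math.log(2.0 * math.pi)
                  + math.log(1.0 - 2.0 * G + 2.0 * G * G)
                  - 2.0 * math.log(G) - 2.0 * math.log(1.0 - G) + math.log(g))
    return total

def score(lam, xs):
    # analytic u_1: here dG/dlam = x*exp(-lam*x) and dg/dlam = (1 - lam*x)*exp(-lam*x)
    u = 0.0
    for x in xs:
        e = math.exp(-lam * x)
        G, g = 1.0 - e, lam * e
        dG, dg = x * e, (1.0 - lam * x) * e
        # -T * dT/dG reduces to [(G - 1)^4 - G^4] / [G^3 * (1 - G)^3]
        u += ((G - 1.0)**4 - G**4) / (G**3 * (1.0 - G)**3) * dG
        u += (4.0 * G - 2.0) / (1.0 - 2.0 * G + 2.0 * G * G) * dG
        u += -2.0 / G * dG + 2.0 / (1.0 - G) * dG + dg / g
    return u

xs = [0.2, 0.7, 1.3, 2.1]
h = 1e-6
fd = (loglik(1.5 + h, xs) - loglik(1.5 - h, xs)) / (2.0 * h)
print(score(1.5, xs), fd)   # the two should agree to several decimals
```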

and **0***r* is an *r* × 1 vector of zeros.

The observed information matrix *J*(*ξ*|**X**) is essential for constructing confidence intervals and testing hypotheses on *ξ*. The expectation of *J*(*ξ*|**X**) is the expected Fisher information matrix I*ξ* and, under certain regularity conditions, √*m*(*ξ̂* − *ξ*) approximately follows a multivariate normal distribution *Nr*(**0***r*, I*ξ*<sup>−1</sup>). The expression for *J*(*ξ*|**X**) is presented in Appendix A.
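In the scalar case the normal approximation gives the usual Wald interval λ̂ ± 1.96·*J*(λ̂)^(−1/2). The sketch below (same illustrative assumptions as before: exponential baseline, simulated data, true λ = 1.5) approximates *J* with a central second difference rather than the analytic expression of Appendix A:

```python
import numpy as np

def loglik(lam, x):
    # log-likelihood (25) under the assumed exponential baseline G(x) = 1 - exp(-lam*x)
    G = 1.0 - np.exp(-lam * x)
    g = lam * np.exp(-lam * x)
    T = (2.0 * G - 1.0) / (G * (1.0 - G))
    return np.sum(-0.5 * T**2 - 0.5 * np.log(2.0 * np.pi)
                  + np.log(1.0 - 2.0 * G + 2.0 * G**2)
                  - 2.0 * np.log(G) - 2.0 * np.log(1.0 - G) + np.log(g))

def observed_info(lam, x, h=1e-4):
    # J(lam | x) = -d^2 l / d lam^2, via a central second difference
    return -(loglik(lam + h, x) - 2.0 * loglik(lam, x) + loglik(lam - h, x)) / h**2

# simulated data (inverse transform as before) and a crude grid-search MLE,
# so that the block stands alone; true lambda = 1.5
rng = np.random.default_rng(1)
z = rng.standard_normal(500)
G = np.where(np.abs(z) < 1e-12, 0.5, (z - 2.0 + np.sqrt(z**2 + 4.0)) / (2.0 * z))
x = -np.log(1.0 - G) / 1.5
grid = np.linspace(0.5, 4.0, 2001)
lam_hat = grid[np.argmax([loglik(l, x) for l in grid])]

se = 1.0 / np.sqrt(observed_info(lam_hat, x))   # Wald standard error
ci = (lam_hat - 1.96 * se, lam_hat + 1.96 * se)
print(lam_hat, ci)
```

With several parameters the same construction uses the full Hessian: invert *J*(*ξ̂*|**X**) and take the square roots of its diagonal as standard errors.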
