Article

Small Area Estimation under Poisson–Dirichlet Process Mixture Models

1
School of Science, Shanghai Institute of Technology, Shanghai 201418, China
2
Shanghai Institute of Applied Mathematics and Mechanics, School of Mechanics and Engineering Science, Shanghai University, Shanghai 200072, China
*
Author to whom correspondence should be addressed.
Axioms 2024, 13(7), 432; https://doi.org/10.3390/axioms13070432
Submission received: 18 April 2024 / Revised: 15 June 2024 / Accepted: 25 June 2024 / Published: 27 June 2024

Abstract

In this paper, we propose an improved Nested Error Regression model in which the random effects for each area are given a Poisson–Dirichlet Process prior. Based on this model, we investigate parameter estimation using the Empirical Bayesian (EB) method, and we combine the Maximum Likelihood Estimation (MLE) method with a Markov chain Monte Carlo algorithm to estimate the model parameters jointly. The viability of the model is verified using numerical simulation, and the proposed model is applied to an actual small area estimation problem. Compared to the conventional normal random effects linear model, the proposed model is more accurate for the estimation of complex real-world application data, which makes it suitable for a broader range of application contexts.

1. Introduction

The study of small area estimation has received more and more attention due to the increasing demand for reliable estimates in governmental departments, enterprises, agriculture, commerce, and socioeconomic fields. The issue of how to improve the estimation accuracy and obtain reliable estimates of subpopulation parameters is known as small area estimation. The term “small area” usually refers to a small geographic area, such as a county, city, state, or a small population group that is crossclassified by demographic characteristics, such as a certain age, gender, or ethnic group. Since a majority of sample surveys are designed to estimate certain parameters for the overall population, there are no particular requirements for sample sizes for subgroups, which can lead to the possibility that the parameter estimates for subgroups may not be as accurate as desired. Furthermore, if the sample size of a survey is determined to achieve a specified level of accuracy on a large scale, there may not be enough resources to conduct a second survey aimed at achieving a similar level of accuracy on a smaller scale.
Indirect estimation, in particular model-based estimation for small areas, is widely accepted as a way to address this issue. By recognizing that several small areas are similar in certain respects, we can construct reliable small area estimates with the help of well-defined models, thus borrowing relevant auxiliary information from other variables in the area, from neighboring small areas, or even from other sources, such as population census data. This idea has greatly contributed to the development of small area estimation, and various models and methods have been proposed to estimate small area parameters by borrowing auxiliary information. For reviews, see Rao [1] and Pfeffermann [2]. Rao and Molina [3] give a detailed description of the models and methods for small area estimation.
Fay and Herriot [4] were the first to propose an area level model, the Fay-Herriot model, to solve the problem of small area estimation of per capita income in the USA. Battese, Harter, and Fuller [5] were the first to adopt the unit level model, the Nested Error Regression Model, which combines agricultural survey data and satellite data, to estimate the average acreage of crops in twelve counties in the state of Iowa in the United States. With the growing demand for small area estimation, there has been a large amount of literature on extending these two base models to accommodate different data structures and requirements. For example, Serena et al. [6] provided two extensions of the Fay–Herriot model. The first one is a multivariate extension, which jointly models the survey estimates of two or more different but related demographic characteristics; the second extension is to build a functional measurement error model into the original Fay–Herriot model to consider the case of covariates with errors. Yang and Chen [7] improved the Nested Error Regression model by clustering small areas based on their centers to obtain a new model and estimate the model parameters based on the model.
The accuracy and precision of small area estimators depend on the validity of the model. In this context, we pay particular attention to an important underlying assumption of the model, namely the normality of random effects. This assumption is often adopted for computational convenience alone and is not necessarily reasonable. It is also difficult to check in practice because it involves unobservable quantities. Therefore, it is necessary to investigate the flexible modeling of random effects. Datta et al. [8] considered the case where random effects do not exist and proposed a bootstrap method for hypothesis testing that makes it possible to determine the presence or absence of random effects in the various areas of the model. The motivation is that including random effects increases the variability of the estimates when the actual data structure does not contain them. Sugasawa and Kubokawa [9] also proposed a Nested Error Regression model with uncertain random effects and gave estimates of the corresponding model parameters. Ferrante and Pacei [10] considered the asymmetry of the data they examined and thus relaxed the normality assumption of the Fay–Herriot model by adopting a skewed distribution. They proposed a multivariate skewed small area model and applied the model to the business statistics of firms. Fabrizi and Trivisano [11] considered two extensions of the Fay–Herriot model, both of which concerned the distribution of the random effects: an exponential power distribution and a skewed power distribution. Chakraborty et al. [12] proposed a mixture model based on the Fay–Herriot model with random effects following a two-component normal mixture. Diallo and Rao [13] addressed the skewed distribution of the response variable and suggested replacing the assumption of normality for both random effects and errors with a skew normal distribution. Tsujino and Kubokawa [14] investigated a model in which the random effects remain normal but the errors follow a skew normal distribution, and they gave an expression for predicting the random effects.
In small area estimation, parametric mixed models are usually used to obtain the estimates. However, parametric models can be mis-specified, which may produce unreliable small area estimates. The application of nonparametric and semiparametric models to small area estimation has been partially considered in the literature. Opsomer et al. [15] used a P-spline approach for the nonparametric estimation of the regression component in a Nested Error Regression model. Polettini [16] used a Dirichlet Process mixture model to construct the random effects in the Fay–Herriot model.
To further extend the Nested Error Regression model, this paper proposes an improved Nested Error Regression model in which the default normality assumption for random effects is replaced by a nonparametric specification, i.e., the Poisson–Dirichlet Process. In Nested Error Regression models, random effects are typically assumed to be normally distributed. This assumption simplifies the theoretical and computational analysis of the model. However, the normality assumption may not hold in certain cases, thus leading to inaccurate model estimates. To address this limitation, we replace the default normality assumption with a nonparametric specification. Nonparametric methods do not rely on specific parameters or distributional forms; instead, they learn their structure or characteristics directly from the data. The Poisson–Dirichlet Process is proposed as a prior distribution for the random effects. This approach differs from the conventional assumption that the random effects in each area follow a fixed distribution. Instead, the model allows the distribution of these random effects to change adaptively based on the data. The Poisson–Dirichlet Process is a two-parameter generalization of the Dirichlet Process: in addition to the concentration parameter, it has a second parameter called the discount parameter. Similar to the Dirichlet Process, samples from the Poisson–Dirichlet Process are discrete distributions that have the same support as the base distribution. The underlying distribution of the Poisson–Dirichlet Process is the Poisson–Dirichlet Distribution introduced by Pitman and Yor [17]. Due to these properties, we can use it as the prior distribution of the random effects, which are unobservable. There are many studies related to the Dirichlet Process and the Poisson–Dirichlet Process in Bayesian nonparametrics. For example, Al-labadi et al. [18], by applying Bayesian estimation of the Dirichlet Process, proposed a novel methodology for computing varentropy and varextropy. Handa [19] studied a two-parameter Poisson–Dirichlet Process based on point process theory. Favaro et al. [20] developed a two-parameter Poisson–Dirichlet Process model for the problem of predictive sampling of species. Performing exact Bayesian inference on nonparametric models is always challenging, because it is difficult to derive the posterior distribution. This motivates the use of the Markov chain Monte Carlo (MCMC) algorithm [21] for approximate inference.
The paper is organized as follows: Section 2 reviews the classical Nested Error Regression model and the Poisson–Dirichlet Process. Section 3 describes the proposed model. In Section 4, the methods corresponding to the estimation of each parameter in the model and the algorithm flow are given. In Section 5 and Section 6, the feasibility of the model and the corresponding parameter estimation methods are investigated by applying them to simulated and real data. The article is briefly summarized in the last section.

2. Theoretical Background

2.1. Nested Error Regression Models

Battese, Harter, and Fuller [5] published a seminal paper that contributed to the popularization of model-based small area estimation in government statistics. Model-based small area estimation requires the use of appropriate regression models that relate area target variables to appropriate auxiliary variables obtained from other surveys and records administered by government agencies. These authors presented the Nested Error Regression (NER) model, a unit-level model that combines satellite data and farm-level survey observations to determine the average corn and soybean acreage in twelve counties in Iowa, USA.
Suppose that the population of interest $U$ is partitioned into $m$ mutually independent small areas $U_i$, where the total size is $N = \sum_{i=1}^{m} N_i$, and the sample size is $n = \sum_{i=1}^{m} n_i$. The sequence $(y_{i1}, x_{i1}), \ldots, (y_{in_i}, x_{in_i})$ consists of the target variable and the corresponding covariates observed for the $j$th sample unit in the $i$th small area, $j = 1, \ldots, n_i$, where $i = 1, 2, \ldots, m$ indexes the small areas, and $x_{ij} = (1, x_{ij1}, \ldots, x_{ij,p-1})^{\top}$ is a known $p$-dimensional vector of covariates. Battese et al. [5] proposed the following normal NER model:
$$y_{ij} = x_{ij}^{\top}\beta + \nu_i + \varepsilon_{ij} \tag{1}$$
where $\beta = (\beta_0, \ldots, \beta_{p-1})^{\top}$ denotes the unknown $p$-dimensional vector of regression coefficients, and $\nu_i$ and $\varepsilon_{ij}$ denote the area-specific random effect and the error, respectively, for the $i$th small area. It is assumed that the random effects and errors follow normal distributions, $\nu_i \sim N(0, \sigma_\nu^2)$ and $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$, and that $\nu_i$ and $\varepsilon_{ij}$ are independent of each other. We frequently seek to estimate the mean $\rho = (\rho_1, \ldots, \rho_m)$ of the target variable for all small areas through the small area model, also known as the small area mean $\rho$. Assuming that the covariate mean $\bar{x}_i$ is available for each small area, the $i$th small area mean $\rho_i$ is defined by the following formula:
$$\rho_i = \bar{x}_i^{\top}\beta + \nu_i \tag{2}$$
In the hierarchical Bayesian approach to predicting $\rho_i$ based on the model in Equation (1), a prior distribution is assigned to the unknown model parameter $\psi = (\beta, \sigma_\nu^2, \sigma_\varepsilon^2)$.
The NER model can be described as follows:
  • Conditional on $\rho = (\rho_1, \ldots, \rho_m)$ and $\psi = (\beta, \sigma_\nu^2, \sigma_\varepsilon^2)$, the estimators satisfy $Y_i \sim N(\rho_i, \sigma_\nu^2)$, for $i = 1, 2, \ldots, m$, independently;
  • Conditional on $\psi = (\beta, \sigma_\nu^2, \sigma_\varepsilon^2)$, the small area means satisfy $\rho_i \sim N(\bar{x}_i^{\top}\beta, \sigma_\nu^2)$, for $i = 1, 2, \ldots, m$, independently;
  • The model parameters $\psi = (\beta, \sigma_\nu^2, \sigma_\varepsilon^2)$ are given a prior distribution with density $\pi(\psi)$.
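
As a minimal sketch of how data arise under the normal NER model in Equation (1), the following Python snippet simulates observations for a few small areas. The function name `simulate_ner`, the single uniform covariate, and all numerical values are illustrative assumptions and are not taken from the paper.

```python
import random

def simulate_ner(m, n_i, beta, sigma_nu, sigma_eps, seed=0):
    """Draw (area, x, y) triples from y_ij = x_ij' beta + nu_i + eps_ij."""
    rng = random.Random(seed)
    data = []
    for i in range(m):
        nu_i = rng.gauss(0.0, sigma_nu)        # one random effect per area
        for _ in range(n_i):
            x = [1.0, rng.uniform(0.0, 1.0)]   # intercept plus one covariate (assumed)
            mean = sum(b * xk for b, xk in zip(beta, x)) + nu_i
            y = mean + rng.gauss(0.0, sigma_eps)  # unit-level error
            data.append((i, x, y))
    return data

# Hypothetical settings: 5 areas with 4 sampled units each.
data = simulate_ner(m=5, n_i=4, beta=[1.0, 2.0], sigma_nu=0.5, sigma_eps=0.3)
```

All units in the same area share one draw of the random effect, which is what induces the within-area correlation that the model exploits.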

2.2. Poisson–Dirichlet Process

The Poisson–Dirichlet Process (PDP), also known as the Pitman–Yor Process, is a two-parameter extension of the Dirichlet Process. Similar to the Dirichlet Process, the Poisson–Dirichlet Process is a distribution over distributions. The distribution underlying the Poisson–Dirichlet Process is the Poisson–Dirichlet Distribution. Assume that there exists a pair of parameters $\alpha$ and $\theta$ such that $0 \le \alpha < 1$ and $\theta > -\alpha$. Here, we call $\alpha$ the discount parameter and $\theta$ the concentration parameter. Let $V_1, V_2, \ldots$ be a sequence of mutually independent random variables, and let $V_k$ follow the distribution $V_k \mid \alpha, \theta \sim \mathrm{Beta}(1 - \alpha, \theta + k\alpha)$; define $p_1, p_2, \ldots$ as follows:
$$p_1 = V_1, \qquad p_k = V_k \prod_{i=1}^{k-1} (1 - V_i) \tag{3}$$
Meanwhile, $p_1, p_2, \ldots$ satisfies $\sum_{k=1}^{\infty} p_k = 1$. If we arrange $p_1, p_2, \ldots$ in descending order to obtain $p = (\tilde{p}_1, \tilde{p}_2, \ldots)$, then $p$ follows a Poisson–Dirichlet distribution, which is denoted as $p \sim PD(\alpha, \theta)$. Having defined the Poisson–Dirichlet Distribution, we can formally define the Poisson–Dirichlet Process. Assume that the base distribution $H_0$ is a probability distribution on the measurable space $(\mathcal{X}, \mathcal{B})$. Let $X_1, X_2, \ldots$ be a sequence of independently and identically distributed random variables from the base distribution $H_0$, and assume that $p \sim PD(\alpha, \theta)$; then, we define the random probability measure $G$ on $(\mathcal{X}, \mathcal{B})$ to be the Poisson–Dirichlet Process with parameters $\alpha$ and $\theta$ and base distribution $H_0$, which is given by the following:
$$G(x \mid \alpha, \theta, H_0) = \sum_{k=1}^{\infty} p_k\, \delta_{X_k}(x) \tag{4}$$
where $\delta_{X_k}(x)$ denotes the Dirac measure degenerate at the point $X_k$: $\delta_{X_k}(x) = 1$ if $x = X_k$, and $\delta_{X_k}(x) = 0$ otherwise. This stick-breaking distribution $G$ is also denoted as $G \sim PD(\alpha, \theta; H_0)$. The parameters $\alpha$ and $\theta$ determine the power law properties of the Poisson–Dirichlet Process. In practical modeling, the Poisson–Dirichlet Process is often more appropriate than the Dirichlet Process, because it exhibits the kind of power law behavior found, for example, in natural language; $\alpha = 0$ corresponds to the Dirichlet Process.
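
The stick-breaking construction above can be sketched in a few lines of Python. This is only an illustration under a truncation assumption: the infinite sequence is cut off after `n_atoms` sticks, and the leftover mass is returned separately. The function name and the parameter values are hypothetical.

```python
import random

def stick_breaking_weights(alpha, theta, n_atoms, seed=0):
    """Truncated stick-breaking weights p_1..p_n with V_k ~ Beta(1 - alpha, theta + k*alpha)."""
    rng = random.Random(seed)
    weights = []
    remaining = 1.0                      # length of the stick not yet broken off
    for k in range(1, n_atoms + 1):
        v_k = rng.betavariate(1.0 - alpha, theta + k * alpha)
        weights.append(v_k * remaining)  # p_k = V_k * prod_{i<k} (1 - V_i)
        remaining *= 1.0 - v_k
    return weights, remaining

# Hypothetical parameters satisfying 0 <= alpha < 1 and theta > -alpha.
weights, leftover = stick_breaking_weights(alpha=0.5, theta=1.0, n_atoms=200)
```

Pairing each weight with an independent draw from the base distribution gives an approximate draw of the random measure; setting `alpha=0` reduces the weights to those of a Dirichlet Process.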

3. Small Area Model with PDP Random Effects

As in the NER model, in the proposed model, we assume that the observations $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})$, $i = 1, 2, \ldots, m$, are a set of data associated with independent and identically distributed random effects $\nu_i = (\nu_{i1}, \ldots, \nu_{in_i})$, $i = 1, 2, \ldots, m$. We replace the normal prior distribution of the random effects in the NER model with a Bayesian nonparametric prior; namely, we assume that the components of $\nu_i = (\nu_{i1}, \ldots, \nu_{in_i})$ are independently and identically distributed according to an unknown probability measure $H_i$, $i = 1, 2, \ldots, m$. The distribution $H_i$ of the random effects, being an unknown quantity, can then be given a Bayesian nonparametric prior. In this section, by assuming $0 \le \alpha < 1$ and $\theta > -\alpha$, we introduce the Poisson–Dirichlet Process with parameters $\alpha$ and $\theta$ and a base distribution $H_0$ as a prior on the distributions $H_i$, $i = 1, 2, \ldots, m$, of the random effects, and we obtain a unit-level small area model with PDP random effects:
$$y_{ij} = x_{ij}^{\top}\beta + \nu_{ij} + \varepsilon_{ij}, \qquad j = 1, 2, \ldots, n_i, \quad i = 1, 2, \ldots, m \tag{5}$$
where $x_{ij}$ is a known $p$-dimensional vector of auxiliary variables associated with the observation $y_{ij}$, $\beta$ is an unknown $p$-dimensional vector of regression coefficients, $\varepsilon_{ij}$ is a random error with a mean of 0 and a variance of $\mathrm{Var}(\varepsilon_{ij}) = \sigma_\varepsilon^2$, and $\nu_{ij}$ is a random effect whose magnitude reflects the differences between units in different areas; $\varepsilon_{ij}$ and $\nu_{ij}$ are mutually independent, and they can be expressed as follows:
$$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \qquad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m \tag{6}$$
$$\nu_{i1}, \ldots, \nu_{in_i} \mid H_i \overset{iid}{\sim} H_i, \qquad i = 1, 2, \ldots, m \tag{7}$$
$$H_i \sim PD(\alpha, \theta; H_0) \tag{8}$$
where $H_0$ is the base distribution, $\alpha$ is the discount parameter, and $\theta$ is the concentration parameter. The parameters $\alpha$ and $\theta$ determine not only the power law properties of the Poisson–Dirichlet Process, but also the probability with which a new random effect $\nu_{ij}$ is sampled given $H_i$ and the existing values $\nu_{i1}, \ldots, \nu_{i,j-1}$. In the proposed model, when a new random effect $\nu_{ij}$ is drawn, it either belongs to one of the previous classes of $\nu_{i1}, \ldots, \nu_{i,j-1}$ or forms a new class drawn from $H_i$. If $\nu_{ij}$ belongs to one of the previous classes, its probability is positively related to the number of data points contained in this class. The larger the value of the parameter $\theta$, the greater the probability that $\nu_{ij}$ will be drawn from $H_i$ in a new class. For the random effects $\nu_i = (\nu_{i1}, \ldots, \nu_{in_i})$, let $\nu_{i1}^*, \ldots, \nu_{iK_i}^*$ be the distinct values among the elements of $\nu_i$, let $K_i$ be the number of classes in $\nu_i$, and let $m_{ik}$ be the number of elements in the $k$th class; then the sampling probability of $\nu_{ij}$ is as follows:
$$\Pr(\nu_{ij} = \nu_{ik}^* \mid \nu_{i1}, \ldots, \nu_{i,j-1}) = \frac{m_{ik} - \alpha}{j - 1 + \theta}, \qquad k = 1, 2, \ldots, K_i \tag{9}$$
$$\Pr(\nu_{ij} \notin \{\nu_{i1}^*, \ldots, \nu_{iK_i}^*\} \mid \nu_{i1}, \ldots, \nu_{i,j-1}) = \frac{K_i \alpha + \theta}{j - 1 + \theta} \tag{10}$$
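
The two sampling probabilities above define a sequential scheme that can be simulated directly. The sketch below is an illustration only: the names `sample_area_effects` and `base_draw` are hypothetical, the base distribution is taken to be standard normal purely for the example, and the class counts are those among the effects drawn so far in the area.

```python
import random

def sample_area_effects(n_i, alpha, theta, base_draw, rng):
    """Sequentially draw nu_i1..nu_in_i from the two-parameter predictive rule."""
    values, counts, effects = [], [], []   # distinct values, class sizes, all draws
    for j in range(1, n_i + 1):
        k_i = len(values)
        pick = k_i                         # default: open a new class
        if k_i > 0:
            denom = (j - 1) + theta
            u, acc = rng.random(), 0.0
            for idx, m_ik in enumerate(counts):
                acc += (m_ik - alpha) / denom   # join existing class idx
                if u < acc:
                    pick = idx
                    break
            # otherwise the remaining mass (k_i*alpha + theta)/denom opens a new class
        if pick == k_i:
            values.append(base_draw(rng))  # fresh value from the base distribution
            counts.append(1)
        else:
            counts[pick] += 1
        effects.append(values[pick])
    return effects, values, counts

rng = random.Random(1)
effects, values, counts = sample_area_effects(
    n_i=50, alpha=0.3, theta=1.0, base_draw=lambda r: r.gauss(0.0, 1.0), rng=rng)
```

Larger classes attract new effects in proportion to their size minus the discount, while larger values of `theta` or `alpha` make new classes more likely.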
The hierarchical structure of our model can be represented as follows.
The NER model with PDP random effects:
  • Conditional on $\nu = (\nu_1, \ldots, \nu_m)$, $\nu_i = (\nu_{i1}, \ldots, \nu_{in_i})$, $\beta$, and $\sigma_\varepsilon^2$, the observations satisfy $y_{ij} \sim N(x_{ij}^{\top}\beta + \nu_{ij}, \sigma_\varepsilon^2)$, for $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, m$, independently;
  • Conditional on $\alpha$, $\theta$, and $H_0$, the random effects $\nu$ satisfy $\nu_{i1}, \ldots, \nu_{in_i} \mid H_i \overset{iid}{\sim} H_i$ and $H_i \sim PD(\alpha, \theta; H_0)$, for $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, m$, independently.

4. Estimation

In small area estimation, we aim to obtain good estimators of the small area mean $\rho_i$. Based on the proposed model, we mainly consider the estimation of the model parameters $\beta$, $\sigma_\varepsilon^2$, $\alpha$, $\theta$, and $H_0$, and of the conditional mean $\tilde{\rho}_i(\beta, \sigma_\varepsilon^2, \alpha, \theta, H_0, y_{ij})$ given the data $y_{11}, \ldots, y_{mn_m}$.
In the small area model with PDP random effects, for the $i$th small area, the random effects $\nu_{i1}, \ldots, \nu_{in_i}$, drawn as $\nu_{i1}, \ldots, \nu_{in_i} \mid H_i \sim H_i$, can be divided into mutually exclusive classes with distinct values $\nu_{i1}^*, \ldots, \nu_{iK_i}^*$, and we define $m_i = (m_{i1}, \ldots, m_{iK_i})$, where $m_{ij}$ denotes the number of elements contained in the $j$th class of the $i$th small area, and $K_i$ denotes the number of mutually exclusive classes in $\nu_{i1}, \ldots, \nu_{in_i}$. Assume that we assign the data points $y_{i1}, \ldots, y_{in_i}$ according to the classes of $\nu_{i1}, \ldots, \nu_{in_i}$ to obtain $K_i$ groups: $y_{i1}^{1}, \ldots, y_{im_{i1}}^{1}, \ldots, y_{i1}^{K_i}, \ldots, y_{im_{iK_i}}^{K_i}$. The likelihood function of the model parameters $\beta$, $\sigma_\varepsilon^2$, $\alpha$, $\theta$, and $H_0$ is
$$\begin{aligned} L(\beta, \sigma_\varepsilon^2, \alpha, \theta, H_0) &= E_\nu\Big[\prod_{j=1}^{n_i} f(y_{ij} \mid \nu_{ij})\Big] \\ &= E_\nu\Big[\prod_{j=1}^{m_{i1}} f(y_{ij}^{1} \mid \nu_{i1}^*) \cdots \prod_{j=1}^{m_{iK_i}} f(y_{ij}^{K_i} \mid \nu_{iK_i}^*)\Big] \\ &= E_{K,m}\Big[\int \prod_{j=1}^{m_{i1}} f(y_{ij}^{1} \mid u)\, h_0(u)\, du \cdots \int \prod_{j=1}^{m_{iK_i}} f(y_{ij}^{K_i} \mid u)\, h_0(u)\, du\Big] \end{aligned} \tag{11}$$
where $E_x$ denotes the marginal expectation with respect to $x$, $f$ denotes the normal probability density function, and $h_0$ denotes the density function of the base distribution $H_0$. This likelihood function is too complex to be maximized analytically or numerically, so we apply empirical Bayesian estimation to estimate the model parameters.
In this section, we apply empirical Bayesian nonparametric methods to estimate the regression coefficients $\beta$ and the error variance $\sigma_\varepsilon^2$ of the small area model with PDP random effects, as well as the discount parameter $\alpha$, the concentration parameter $\theta$, and the base distribution $H_0$ of the Poisson–Dirichlet Process, given the observed data $y_{11}, \ldots, y_{1n_1}, \ldots, y_{m1}, \ldots, y_{mn_m}$. We explain the algorithms used to derive these estimates in detail.

4.1. Proposed Approach

4.1.1. Estimation of Regression Coefficients and Error Variance

We first consider the estimation of the regression coefficients $\beta$ and the error variance $\sigma_\varepsilon^2$ in the model. Assuming that the random effects $\nu_{ij}$ are known, by rewriting Equation (5) and defining $y_{ij}^*$, we obtain the following:
$$y_{ij}^* = y_{ij} - \nu_{ij} = x_{ij}^{\top}\beta + \varepsilon_{ij}, \qquad j = 1, 2, \ldots, n_i, \quad i = 1, 2, \ldots, m \tag{12}$$
We write the above equation in matrix form with $Y^* = (Y_1^{*\top}, \ldots, Y_m^{*\top})^{\top}$, where $X = (X_1^{\top}, \ldots, X_m^{\top})^{\top}$ is a matrix with column rank $r(X)$, and the error is $\varepsilon = (\varepsilon_1^{\top}, \ldots, \varepsilon_m^{\top})^{\top}$. In this case, $Y_i^* = (y_{i1}^*, \ldots, y_{in_i}^*)^{\top}$, $X_i = (x_{i1}, \ldots, x_{in_i})^{\top}$, and $\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{in_i})^{\top}$; thus, we obtain the following:
$$Y^* = X\beta + \varepsilon \tag{13}$$
$$\varepsilon \sim N(0, \sigma_\varepsilon^2 I_n) \tag{14}$$
For the above linear regression model, we use the classical least squares estimation algorithm to obtain the estimates of the regression coefficients $\beta$ and the error variance $\sigma_\varepsilon^2$.
For the regression coefficients $\beta$, the objective of least squares estimation is to find an estimate $\hat{\beta}$ that minimizes the sum of the squared residuals of all sample observations, $Q(\hat{\beta}) = \sum_{i=1}^{m} \lVert Y_i^* - \hat{Y}_i^* \rVert^2$ with $\hat{Y}_i^* = X_i\hat{\beta}$. Expanding the sum of the squared residuals gives:
$$Q(\hat{\beta}) = Y^{*\top}Y^* - 2\hat{\beta}^{\top}X^{\top}Y^* + \hat{\beta}^{\top}X^{\top}X\hat{\beta} \tag{15}$$
By minimizing the sum of the squared residuals $Q(\hat{\beta})$ and assuming that $(X^{\top}X)^{-1}$ exists, an estimate of the regression coefficient $\beta$ can be obtained as follows.
$$\hat{\beta} = (X^{\top}X)^{-1}X^{\top}Y^* \tag{16}$$
For the estimation of the error variance $\sigma_\varepsilon^2$, we first note that the error vector $\varepsilon = Y^* - X\beta$ is unobservable. Suppose we replace $\beta$ with its least squares estimate $\hat{\beta}$, thus defining the residual vector $\hat{\varepsilon} = Y^* - X\hat{\beta}$. It is natural to use the residual sum of squares $RSS = \hat{\varepsilon}^{\top}\hat{\varepsilon}$ as a measure of the magnitude of $\sigma_\varepsilon^2$. Substituting Equation (16) and $\hat{\varepsilon} = Y^* - X\hat{\beta}$ into $RSS = \hat{\varepsilon}^{\top}\hat{\varepsilon}$ gives:
$$RSS = Y^{*\top}Y^* - Y^{*\top}X(X^{\top}X)^{-1}X^{\top}Y^* \tag{17}$$
Furthermore, the expectation of the $RSS$ is $E(RSS) = (n - r(X))\sigma_\varepsilon^2$; hence, we obtain an unbiased estimate of the error variance $\sigma_\varepsilon^2$:
$$\hat{\sigma}_\varepsilon^2 = \frac{1}{n - r(X)}\left[Y^{*\top}Y^* - Y^{*\top}X(X^{\top}X)^{-1}X^{\top}Y^*\right] \tag{18}$$
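
The least squares step can be illustrated with a short, dependency-free Python sketch. It assumes that $X$ has full column rank, so that $r(X) = p$, and it solves the normal equations by Gauss–Jordan elimination instead of forming the inverse explicitly. The helper names `solve` and `least_squares` and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def least_squares(x_rows, y_star):
    """Return (beta_hat, sigma2_hat) for Y* = X beta + eps, assuming r(X) = p."""
    p = len(x_rows[0])
    xtx = [[sum(r[a] * r[b] for r in x_rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(x_rows, y_star)) for a in range(p)]
    beta_hat = solve(xtx, xty)                      # normal equations X'X beta = X'Y*
    resid = [y - sum(bk * xk for bk, xk in zip(beta_hat, r))
             for r, y in zip(x_rows, y_star)]
    rss = sum(e * e for e in resid)
    sigma2_hat = rss / (len(y_star) - p)            # RSS / (n - r(X))
    return beta_hat, sigma2_hat

# Toy data lying exactly on y* = 1 + 2 x, so the residual variance is zero.
beta_hat, sigma2_hat = least_squares([[1, 0], [1, 1], [1, 2], [1, 3]], [1, 3, 5, 7])
```

In the estimation scheme described here, the vector `y_star` would hold the responses after subtracting the current values of the random effects.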

4.1.2. Estimation of the Base Distribution and Two Parameters of the Poisson–Dirichlet Process

We first discuss the estimation of the base distribution H 0 of the Poisson–Dirichlet Process for random effects. Yang and Wu [22] proposed the application of the multivariate kernel density method to estimate the base distribution H 0 under the Dirichlet Process prior; Qiu, Yuan, and Zhou [23] considered applying the multivariate kernel density method to estimate the base distribution H 0 of the Poisson–Dirichlet Process under a multigroup data structure. We can also apply the multivariate kernel density method to realize the estimation of the base distribution of our model.
Assuming that the random effects $\nu_{ij}$ are known, we equivalently know the distinct values $\nu_{ij}^* \sim H_0$, the numbers of classes $K_1, \ldots, K_m$, and the class sizes $m_{ij}$. The density function $h_0$ of the base distribution $H_0$ can then be estimated using the following equation:
$$\tilde{h}_0(\cdot) = \frac{1}{\sum_{i=1}^{m} K_i} \sum_{i=1}^{m} \sum_{j=1}^{K_i} \omega_t(\cdot, \nu_{ij}^*) \tag{19}$$
where $\omega_t(x, \nu_{ij}^*) = \frac{1}{t}\,\omega\!\left(\frac{x - \nu_{ij}^*}{t}\right)$ is a kernel function with bandwidth $t > 0$. We choose a Gaussian kernel to carry out the estimation.
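
A minimal sketch of the Gaussian kernel estimate of the base density is given below. It pools the distinct random-effect values from all areas into one list and averages Gaussian bumps of bandwidth `t`; the function name, the sample values, and the bandwidth are illustrative assumptions, and no bandwidth selection rule is implied.

```python
import math

def gaussian_kde(distinct_effects, bandwidth):
    """Return the function x -> (1/K) * sum_k (1/t) * phi((x - nu*_k) / t)."""
    pts = list(distinct_effects)
    norm = bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        total = 0.0
        for v in pts:
            z = (x - v) / bandwidth
            total += math.exp(-0.5 * z * z) / norm   # Gaussian kernel centred at v
        return total / len(pts)

    return density

# Hypothetical distinct random-effect values pooled over all areas.
h0_tilde = gaussian_kde([-1.0, 0.5, 2.0], bandwidth=0.4)
# Crude Riemann sum to check that the estimate integrates to about one.
area = sum(h0_tilde(-10.0 + 0.01 * s) * 0.01 for s in range(2001))
```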
Then, we discuss the estimation of the two parameters $\alpha$ and $\theta$ of the Poisson–Dirichlet Process for random effects. Similar to Carlton [24], who studied the estimation of the parameters of the Poisson–Dirichlet Process for a single set of data, we apply the maximum likelihood method to estimate the parameters $\alpha$ and $\theta$.
For each small area $i = 1, 2, \ldots, m$, define $A_j^i \ge 0$ to denote the number of categories containing $j$ individuals in the $i$th small area, where $\sum_{j=1}^{n_i} A_j^i = K_i$, and $\sum_{j=1}^{n_i} j A_j^i = n_i$. Denote $A^i = (A_1^i, A_2^i, \ldots, A_{n_i}^i)$, $i = 1, 2, \ldots, m$. Then, the log likelihood function for the parameters $\alpha$ and $\theta$ is given below:
$$\begin{aligned} l(\alpha, \theta) &= \sum_{i=1}^{m} \log \Pr(A^i = a^i) \\ &= \sum_{i=1}^{m} \log N(a^i) - \sum_{i=1}^{m} \sum_{l=1}^{n_i - 1} \log(\theta + l) + \sum_{i=1}^{m} \sum_{l=1}^{K_i - 1} \log(\theta + l\alpha) + \sum_{i=1}^{m} \sum_{j=2}^{n_i} a_j^i \sum_{l=1}^{j-1} \log(l - \alpha) \end{aligned} \tag{20}$$
where $A^i = a^i$ is the given observed data, and $N(a^i) = n_i! \big/ \prod_{j=1}^{n_i} (j!)^{a_j^i} a_j^i!$. The maximum likelihood estimates of the parameters $\alpha$ and $\theta$ are obtained by solving the equations:
$$l_\alpha(\alpha, \theta) = \frac{\partial l(\alpha, \theta)}{\partial \alpha} = \sum_{i=1}^{m} \sum_{l=1}^{K_i - 1} \frac{l}{\theta + l\alpha} - \sum_{i=1}^{m} \sum_{j=2}^{n_i} a_j^i \sum_{l=1}^{j-1} \frac{1}{l - \alpha} = 0 \tag{21}$$
$$l_\theta(\alpha, \theta) = \frac{\partial l(\alpha, \theta)}{\partial \theta} = -\sum_{i=1}^{m} \sum_{l=1}^{n_i - 1} \frac{1}{\theta + l} + \sum_{i=1}^{m} \sum_{l=1}^{K_i - 1} \frac{1}{\theta + l\alpha} = 0 \tag{22}$$
Here, we use Newton–Raphson iteration to obtain the above maximum likelihood estimates $\tilde{\alpha}$ and $\tilde{\theta}$ numerically.
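
To make the log likelihood concrete, the sketch below evaluates it (up to the constant term involving $N(a^i)$, which does not depend on $\alpha$ or $\theta$) from the class sizes in each area. For simplicity it maximizes over a coarse grid, which is a plain substitute for the Newton–Raphson iteration used in the text and not the authors' procedure. The function names, the grid limits, and the example class sizes are hypothetical.

```python
import math

def pdp_loglik(alpha, theta, class_sizes_by_area):
    """Log likelihood of (alpha, theta), omitting the constant sum_i log N(a^i)."""
    total = 0.0
    for sizes in class_sizes_by_area:
        n_i, k_i = sum(sizes), len(sizes)
        total -= sum(math.log(theta + l) for l in range(1, n_i))
        total += sum(math.log(theta + l * alpha) for l in range(1, k_i))
        for size in sizes:                 # a class of size j contributes l = 1..j-1
            total += sum(math.log(l - alpha) for l in range(1, size))
    return total

def grid_mle(class_sizes_by_area, steps=50, theta_max=10.0):
    """Coarse grid search over 0 <= alpha < 1 and -alpha < theta <= theta_max."""
    best = None
    for a in range(steps):
        alpha = a / steps
        for t in range(1, steps + 1):
            theta = -alpha + t * (theta_max + alpha) / steps
            ll = pdp_loglik(alpha, theta, class_sizes_by_area)
            if best is None or ll > best[0]:
                best = (ll, alpha, theta)
    return best

# Hypothetical class sizes m_ik for two small areas.
sizes = [[3, 1, 1], [2, 2, 1, 1]]
ll_hat, alpha_hat, theta_hat = grid_mle(sizes)
```

A grid estimate of this kind could also serve as a starting point for a Newton-type refinement of the two score equations.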

4.2. Algorithms

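The random effects $\nu_{ij}$ are unobservable hidden variables, and we construct the following pseudoestimates:
$$\tilde{\tilde{h}}_0(\cdot\,; h_0, \beta, \sigma_\varepsilon^2, \alpha, \theta) = E\big[\tilde{h}_0(\cdot, \nu) \mid Y\big] = E\Big[\frac{1}{\sum_{i=1}^{m} K_i} \sum_{i=1}^{m} \sum_{j=1}^{K_i} \omega_t(\cdot, \nu_{ij}^*) \,\Big|\, Y\Big] \tag{23}$$
$$\big(\tilde{\tilde{\beta}}, \tilde{\tilde{\sigma}}_\varepsilon^2, \tilde{\tilde{\alpha}}, \tilde{\tilde{\theta}}\big)(h_0, \beta, \sigma_\varepsilon^2, \alpha, \theta) = E\big[\big(\tilde{\beta}(\nu), \tilde{\sigma}_\varepsilon^2(\nu), \tilde{\alpha}(\nu), \tilde{\theta}(\nu)\big) \mid Y\big] \tag{24}$$
Given the observed data $Y$ and some initial values of the parameters $h_0$, $\beta$, $\sigma_\varepsilon^2$, $\alpha$, and $\theta$, we can introduce an algorithm that is computed according to the following iterative formulas:
$$\hat{h}_0^{(r+1)}(\cdot) = E_{\hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\alpha}^{(r)}, \hat{\theta}^{(r)}}\big[\tilde{h}_0(\cdot, \nu) \mid Y\big] \tag{25}$$
$$\big(\hat{\beta}^{(r+1)}, \hat{\sigma}_\varepsilon^{2(r+1)}, \hat{\alpha}^{(r+1)}, \hat{\theta}^{(r+1)}\big) = E_{\hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\alpha}^{(r)}, \hat{\theta}^{(r)}}\big[\big(\tilde{\beta}(\nu), \tilde{\sigma}_\varepsilon^2(\nu), \tilde{\alpha}(\nu), \tilde{\theta}(\nu)\big) \mid Y\big] \tag{26}$$
where $E_{\hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\alpha}^{(r)}, \hat{\theta}^{(r)}}$ denotes the posterior expectation; that is, $E[\tilde{h}_0(\cdot, \nu) \mid Y]$ and $E[(\tilde{\beta}(\nu), \tilde{\sigma}_\varepsilon^2(\nu), \tilde{\alpha}(\nu), \tilde{\theta}(\nu)) \mid Y]$ are calculated by using $(\hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\alpha}^{(r)}, \hat{\theta}^{(r)})$ in place of the unknown parameters $(h_0, \beta, \sigma_\varepsilon^2, \alpha, \theta)$ when estimating the parameters in the $(r+1)$th iteration. Then, we can obtain the parameter estimates:
$$\big(\hat{h}_0, \hat{\beta}, \hat{\sigma}_\varepsilon^2, \hat{\alpha}, \hat{\theta}\big) = \lim_{r \to \infty} \big(\hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\alpha}^{(r)}, \hat{\theta}^{(r)}\big) \tag{27}$$
However, the above posterior expectations are not easy to compute during the iterative process, so we apply the MCMC algorithm to obtain a numerical solution. In the following, we discuss the computation of the model estimates, which is divided into three stages: the selection of the initial values, the full conditional distribution of the MCMC algorithm, and the sampling and estimation.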

4.2.1. Selection of Initial Values

During each iteration, we need to give the initial values of the parameters to achieve the corresponding parameter estimation.
We first consider the initial value of the base distribution $\hat{h}_0^{(0)}$. Assuming that $f$ is a normal density function, we obtain the maximum likelihood estimate $\hat{\nu}_{ij} = Y_{ij}$ by solving the equation $f(Y_{ij} \mid \hat{\nu}_{ij}) = \max_u f(Y_{ij} \mid u)$. This results in a kernel estimate of $h_0$ as the initial value of the base distribution $\hat{h}_0^{(0)}$:
$$\hat{h}_0^{(0)}(\cdot) = \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \omega_t(\cdot, \hat{\nu}_{ij}) \tag{28}$$
For the two parameters $\alpha$ and $\theta$, in the absence of information, we can choose $\hat{\alpha}^{(0)}$ and $\hat{\theta}^{(0)}$ to be some random values that satisfy the requirements $0 \le \alpha < 1$ and $\theta > -\alpha$. Given $\hat{h}_0^{(0)}$, $\hat{\alpha}^{(0)}$, and $\hat{\theta}^{(0)}$, we obtain the hidden variables $\nu_{ij}$ by drawing them from $PD(\hat{\alpha}^{(0)}, \hat{\theta}^{(0)}; \hat{h}_0^{(0)})$, so that we can obtain the initial values $\hat{\beta}^{(0)}$ and $\hat{\sigma}_\varepsilon^{2(0)}$ of the regression coefficients $\beta$ and the error variance $\sigma_\varepsilon^2$ from the least squares estimation.

4.2.2. Full Conditional Distributions of the MCMC Algorithm

Given the observations $Y_{ij}$ and $\nu_{ij}$, let $\nu_i^{-j} = (\nu_{i1}, \nu_{i2}, \ldots, \nu_{i,j-1}, \nu_{i,j+1}, \ldots, \nu_{in_i})$ denote the vector that remains after removing $\nu_{ij}$ from $\nu_i = (\nu_{i1}, \ldots, \nu_{in_i})$, let $K_i^{-j}$ denote the number of distinct values in $\nu_i^{-j}$, and let $m_{ik}$ denote the number of elements taking the value $\nu_{ik}^*$ in $\nu_i^{-j}$. In order to apply the MCMC algorithm to compute the posterior expectation, we need the full conditional distribution of each $\nu_{ij}$.
Theorem 1.
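For each $j = 1, 2, \ldots, n_i$, given $\nu_i^{-j}$ and $Y_{ij}$, the conditional distribution of $\nu_{ij}$ is
$$\nu_{ij} \mid \nu_i^{-j}, Y_{ij}, \sigma_\varepsilon^2, \beta, \alpha, \theta, h_0 \sim q_0 H_\nu + \sum_{k=1}^{K_i^{-j}} q_k\, \delta(\nu_{ik}^*, \nu_{ij}) \tag{29}$$
where $q_0 \propto (\theta + \alpha K_i^{-j}) \int f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij}$ and $q_k \propto (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ik}^*, \sigma_\varepsilon^2, \beta)$, thus satisfying the condition $q_0 + \sum_{k=1}^{K_i^{-j}} q_k = 1$; $H_\nu$ denotes the posterior distribution of $\nu_{ij}$ given the observation $Y_{ij}$ under the prior $h_0$.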
Proof of Theorem 1.
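The conditional (predictive) distribution of $\nu_{ij}$ under the Poisson–Dirichlet Process is known to be
$$\Pr(\nu_{ij} \in \cdot \mid \nu_i^{-j}, \alpha, \theta, h_0) = \frac{\theta + \alpha K_i^{-j}}{\theta + n_i - 1}\, h_0(\cdot) + \frac{1}{\theta + n_i - 1} \sum_{k=1}^{K_i^{-j}} (m_{ik} - \alpha)\, \delta(\nu_{ik}^*, \cdot) \tag{30}$$
The conditional distribution of $\nu_{ij}$ given $\nu_i^{-j}$ and $Y_{ij}$ is obtained as follows, where every integral is taken over $\nu_{ij}$:
$$\begin{aligned} dH(\nu_{ij} \mid \nu_i^{-j}, Y_{ij}, \sigma_\varepsilon^2, \beta, \alpha, \theta, h_0) &= \frac{dF(\nu_{ij}, \nu_i^{-j}, Y_{ij} \mid \sigma_\varepsilon^2, \beta, \alpha, \theta, h_0)}{\int dF(\nu_{ij}, \nu_i^{-j}, Y_{ij} \mid \sigma_\varepsilon^2, \beta, \alpha, \theta, h_0)} \\ &= \frac{f(Y_{ij} \mid \nu_{ij}, \nu_i^{-j}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij}, \nu_i^{-j} \mid \alpha, \theta, h_0)}{\int f(Y_{ij} \mid \nu_{ij}, \nu_i^{-j}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij}, \nu_i^{-j} \mid \alpha, \theta, h_0)} \\ &= \frac{f(Y_{ij} \mid \nu_{ij}, \nu_i^{-j}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij} \mid \nu_i^{-j}, \alpha, \theta, h_0)\, f(\nu_i^{-j})}{\int f(Y_{ij} \mid \nu_{ij}, \nu_i^{-j}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij} \mid \nu_i^{-j}, \alpha, \theta, h_0)\, f(\nu_i^{-j})} \\ &= \frac{f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij} \mid \nu_i^{-j}, \alpha, \theta, h_0)}{\int f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, dF(\nu_{ij} \mid \nu_i^{-j}, \alpha, \theta, h_0)} \\ &= \frac{(\theta + \alpha K_i^{-j}) f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij} + \sum_{k=1}^{K_i^{-j}} (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, \delta(\nu_{ik}^*, \nu_{ij})}{(\theta + \alpha K_i^{-j}) \int f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij} + \sum_{k=1}^{K_i^{-j}} (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ik}^*, \sigma_\varepsilon^2, \beta)} \end{aligned} \tag{31}$$
We let
$$C(Y_{ij}) = \frac{1}{(\theta + \alpha K_i^{-j}) \int f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij} + \sum_{k=1}^{K_i^{-j}} (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ik}^*, \sigma_\varepsilon^2, \beta)} \tag{32}$$
denote the normalizing constant. Then, we obtain
$$dH(\nu_{ij} \mid \nu_i^{-j}, Y_{ij}) = C(Y_{ij}) (\theta + \alpha K_i^{-j}) f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij} + C(Y_{ij}) \sum_{k=1}^{K_i^{-j}} (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, \delta(\nu_{ik}^*, \nu_{ij}) \tag{33}$$
Let $q_0 = C(Y_{ij}) (\theta + \alpha K_i^{-j}) \int f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij}$, and let $q_k = C(Y_{ij}) (m_{ik} - \alpha) f(Y_{ij} \mid \nu_{ik}^*, \sigma_\varepsilon^2, \beta)$, thus satisfying the condition $q_0 + \sum_{k=1}^{K_i^{-j}} q_k = 1$, and let $dH_\nu(\nu_{ij}) = f(Y_{ij} \mid \nu_{ij}, \sigma_\varepsilon^2, \beta)\, h_0(\nu_{ij})\, d\nu_{ij} \big/ \int f(Y_{ij} \mid u, \sigma_\varepsilon^2, \beta)\, h_0(u)\, du$; thus, the conditional distribution of $\nu_{ij}$ is obtained as follows:
$$\nu_{ij} \mid \nu_i^{-j}, Y_{ij}, \sigma_\varepsilon^2, \beta, \alpha, \theta, h_0 \sim q_0 H_\nu + \sum_{k=1}^{K_i^{-j}} q_k\, \delta(\nu_{ik}^*, \nu_{ij}) \tag{34}$$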
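
One Gibbs update from this full conditional can be sketched as follows. To keep the integral in $q_0$ in closed form, the sketch assumes a normal base density $h_0 = N(\mu_0, \tau^2)$, which is an illustrative simplification and not the kernel estimate used in the paper. The argument `resid` stands for $Y_{ij} - x_{ij}^{\top}\beta$, and all function and parameter names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gibbs_update(resid, values, counts, alpha, theta, sigma2, mu0, tau2, rng):
    """Draw nu_ij given the distinct values/counts of the other effects in the area."""
    k = len(values)
    # Unnormalised q_0: (theta + alpha*K) times the marginal density of resid.
    w0 = (theta + alpha * k) * normal_pdf(resid, mu0, sigma2 + tau2)
    # Unnormalised q_k: (m_ik - alpha) times the likelihood at the existing value.
    wk = [(m - alpha) * normal_pdf(resid, v, sigma2) for v, m in zip(values, counts)]
    total = w0 + sum(wk)
    q0, qk = w0 / total, [w / total for w in wk]
    u, acc = rng.random(), q0
    if u < acc:
        # New value from H_nu, the normal posterior of nu_ij given resid.
        post_var = 1.0 / (1.0 / sigma2 + 1.0 / tau2)
        post_mean = post_var * (resid / sigma2 + mu0 / tau2)
        return rng.gauss(post_mean, math.sqrt(post_var)), q0, qk
    for v, q in zip(values, qk):
        acc += q
        if u < acc:
            return v, q0, qk
    return values[-1], q0, qk          # guard against floating-point round-off

rng = random.Random(0)
new_nu, q0, qk = gibbs_update(resid=0.2, values=[0.0, 1.0], counts=[3, 1],
                              alpha=0.25, theta=1.0, sigma2=0.5,
                              mu0=0.0, tau2=1.0, rng=rng)
```

With a non-conjugate base density, such as a kernel estimate, the integral would need numerical evaluation or an auxiliary-variable scheme of the kind discussed in the sampling stage.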

4.2.3. Sampling and Estimation

Given the initial parameter values and the full conditional distribution derived for the MCMC algorithm, we now describe the sampling stage of the iterative computation. For each $r = 0, 1, \ldots$, with the parameter estimates $(\hat{\alpha}^{(r)}, \hat{\theta}^{(r)}, \hat{h}_0^{(r)}, \hat{\beta}^{(r)}, \hat{\sigma}_\varepsilon^{2(r)})$ already obtained at the $r$th iteration, we sample from the posterior distribution of the hidden variable $\nu_{ij}$. The sampling phase consists of two steps: the first is Gibbs sampling based on the full conditional distribution, and the second is an acceleration step in which an auxiliary parameter is introduced to sample $\nu_{ij}^{*}$ at the end of each iteration. The second step is needed because sampling directly from the full conditional distribution can fail in two ways: $f$ and $h_0$ may not be conjugate, or the MCMC chain may converge slowly when $\sum_{k=1}^{K_i^{-j}} q_k$ is large relative to $q_0$. Introducing auxiliary parameters addresses both problems.
To start, we draw $\nu_{ij}^{(b)}$ from $PD(\hat{\alpha}^{(r)}, \hat{\theta}^{(r)}; \hat{h}_0^{(r)})$ for $b = 0, 1, 2, \ldots, B$, where $B$ is the number of repetitions within an iteration and is taken to be sufficiently large. We then update $\nu_{ij}^{(b)}$ in two steps using an auxiliary variable, based on Equation (34) and following the method of Neal [25].
Define the auxiliary variable $c(\nu_i^{(b)}) = (c_1, c_2, \ldots, c_{n_i})$ to denote the category to which each $\nu_{ij}^{(b)}$ belongs in $\nu_i^{*(b)} = (\nu_{i1}^{*(b)}, \nu_{i2}^{*(b)}, \ldots, \nu_{iK_i}^{*(b)}, 0, \ldots, 0)$, and use $m^{(b)} = (m_{i1}, m_{i2}, \ldots, m_{iK_i}, 0, \ldots, 0)$ to denote the number of occurrences of each distinct value in $\nu_i^{(b)}$. Assuming that $\nu_i^{(b)}$, $c(\nu_i^{(b)})$, $\nu_i^{*(b)}$, and $m^{(b)}$ have been obtained, we first update the auxiliary variables.
To update $c_{ij}$, $j = 1, 2, \ldots, n_i$, we first remove $c_{ij}$, which reduces the count of its category by one: $m_{ic_j} = m_{ic_j} - 1$. If this leaves $m_{ic_j} = 0$, then $K_i^{-j} = K_i - 1$; otherwise, $K_i^{-j} = K_i$. A new $c_{ij}$ is then drawn from the following distribution:
$$
q_c = \Pr\left(c_{ij} = c \mid Y_{ij}, c_{-ij}, \nu_{i1}^{*}, \ldots, \nu_{iK_i^{-j}}^{*}\right) =
\begin{cases}
w\, (m_{ic} - \hat{\alpha}^{(r)})\, f(Y_{ij} \mid \nu_{ic}^{*}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\beta}^{(r)}), & 1 \le c \le n_i,\ m_{ic} \ne 0 \\
w\, (\hat{\theta}^{(r)} + \hat{\alpha}^{(r)} K_i^{-j})\, f(Y_{ij} \mid \nu_{ic}^{*}, \hat{\sigma}_\varepsilon^{2(r)}, \hat{\beta}^{(r)}), & 1 \le c \le n_i,\ m_{ic} = 0
\end{cases}
$$
where $w$ is a normalizing constant chosen so that $\sum_{j=0}^{n_i} q_j = 1$.
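The label update described above can be sketched as follows. This is a minimal illustration in the spirit of Neal's auxiliary-variable scheme with a single auxiliary component, not the authors' implementation. It assumes a normal likelihood for the residual $Y_{ij} - x_{ij}^{\top}\beta$ and a standard normal base distribution, and all function and variable names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def resample_label(j, resid, labels, atoms, alpha, theta, sigma2, rng):
    """Resample the cluster label of unit j within one area (in place).

    resid  : residuals y_ij - x_ij' beta for the area
    labels : current cluster label of each unit
    atoms  : dict mapping label -> distinct random-effect value
    """
    old = labels[j]
    # Cluster sizes with unit j removed.
    counts = {}
    for idx, c in enumerate(labels):
        if idx != j:
            counts[c] = counts.get(c, 0) + 1
    if old in counts:
        # Unit j shared its cluster: the empty slot gets a fresh base draw
        # (assumption: base distribution N(0, 1)).
        candidate = rng.gauss(0.0, 1.0)
    else:
        # Unit j was a singleton: its value is reused for the empty slot.
        candidate = atoms.pop(old)
    new_label = max(atoms, default=-1) + 1
    options = list(counts) + [new_label]
    weights = [(counts[c] - alpha) * normal_pdf(resid[j], atoms[c], sigma2)
               for c in counts]
    weights.append((theta + alpha * len(counts))
                   * normal_pdf(resid[j], candidate, sigma2))
    choice = rng.choices(options, weights=weights, k=1)[0]
    if choice == new_label:
        atoms[new_label] = candidate
    labels[j] = choice
    return choice


# Toy state for one area (illustrative values only).
rng = random.Random(0)
labels = [0, 0, 1]
atoms = {0: 0.3, 1: -1.0}
resid = [0.31, 0.29, -0.98]
resample_label(2, resid, labels, atoms, alpha=0.5, theta=10.0, sigma2=0.01, rng=rng)
```

Because the weights only require evaluating $f$ at given values, the step does not rely on $f$ and $h_0$ being conjugate. In practice the weights would usually be computed on the log scale to avoid underflow.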
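Through the sampling stage, we obtain $\nu_i^{(b)}$, $b = 1, 2, \ldots, B$. Next, we estimate the parameters based on the sampled hidden variables $\nu_i^{(b)}$. We obtain the estimates $\tilde{h}_0(\nu_i^{(b)})$, $\tilde{\sigma}_\varepsilon^2(\nu_i^{(b)})$, $\tilde{\beta}(\nu_i^{(b)})$, $\tilde{\alpha}(\nu_i^{(b)})$, and $\tilde{\theta}(\nu_i^{(b)})$ according to the algorithm described earlier; the estimates for iteration $r + 1$ are then
$$
\left(\hat{h}_0^{(r+1)}(\cdot), \hat{\sigma}_\varepsilon^{2(r+1)}, \hat{\beta}^{(r+1)}, \hat{\alpha}^{(r+1)}, \hat{\theta}^{(r+1)}\right) = \frac{1}{B} \sum_{b=1}^{B} \left(\tilde{h}_0(\nu_i^{(b)}), \tilde{\sigma}_\varepsilon^2(\nu_i^{(b)}), \tilde{\beta}(\nu_i^{(b)}), \tilde{\alpha}(\nu_i^{(b)}), \tilde{\theta}(\nu_i^{(b)})\right)
$$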
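The averaging step can be illustrated with a short sketch. It covers only the scalar parameters; the function-valued estimate of $h_0$ could be averaged pointwise on a grid in the same way. The numbers below are made up for illustration and the names are hypothetical.

```python
def average_estimates(per_draw):
    """Average a list of per-draw estimate dicts, key by key."""
    b = len(per_draw)
    return {key: sum(d[key] for d in per_draw) / b for key in per_draw[0]}


# Hypothetical estimates from B = 3 sampled sets of hidden variables.
draws = [
    {"alpha": 0.48, "theta": 9.0, "beta0": 1.02, "beta1": 2.01, "sigma2": 0.011},
    {"alpha": 0.52, "theta": 10.0, "beta0": 0.98, "beta1": 1.99, "sigma2": 0.009},
    {"alpha": 0.50, "theta": 11.0, "beta0": 1.00, "beta1": 2.00, "sigma2": 0.010},
]
next_estimate = average_estimates(draws)
```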

5. Simulation

This section presents simulation results that assess the performance of the proposed parameter estimators under a small area model with PDP random effects.

5.1. Model Setup and Simulation Conditions

We first design a finite population containing $m = 20$ small areas and draw a sample from each small area. For convenience, we set the sample size of each small area to $n_i = 30$.
Then, we consider the following model:
$$
y_{ij} = \beta_0 + x_{ij1} \beta_1 + \nu_{ij} + \varepsilon_{ij}, \quad j = 1, 2, \ldots, n_i, \quad i = 1, 2, \ldots, m
$$
$$
\nu_{i1}, \ldots, \nu_{in_i} \mid H_i \overset{iid}{\sim} H_i, \quad i = 1, 2, \ldots, m
$$
$$
H_i \sim PD(\alpha, \theta; H_0)
$$
$$
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$
The basic simulation assumptions are as follows:
  • Five choices of the parameters $(\alpha, \theta)$ are considered: $(0.5, 10)$, $(0.3, 5)$, $(0.9, 3)$, $(0.4, 7)$, and $(0.5, 2)$; the base distribution is set to $h_0 \sim N(0, 1)$, $h_0 \sim t_5$, or $h_0 \sim t_2$;
  • The error $\varepsilon_{ij}$ comes from a normal distribution $N(0, 0.1^2)$;
  • The true value of the regression coefficient is set to $\beta = (\beta_0, \beta_1) = (1, 2)$;
  • The initial parameter values $(\hat{\alpha}^{(0)}, \hat{\theta}^{(0)})$ are set to $(0.2, 5)$, $(0.2, 2)$, $(0.5, 1)$, $(0.1, 5)$, or $(0.1, 1)$ when $(\alpha, \theta)$ is $(0.5, 10)$, $(0.3, 5)$, $(0.9, 3)$, $(0.4, 7)$, or $(0.5, 2)$, respectively;
  • The initial random effects are set to $\hat{\nu}_{ij}^{(0)} = y_{ij}$;
  • The number of iterations for the MCMC algorithm is set to $R = 200$.
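The data-generating setup above can be sketched in code. This is an illustrative sketch, not the authors' simulation script: the random effects in each area are drawn with the two-parameter (Pitman-Yor) urn scheme, which is one standard way to sample from a $PD(\alpha, \theta; H_0)$ prior, the base distribution is $N(0, 1)$, and the covariate $x_{ij1}$ is drawn from a uniform distribution on $(0, 1)$, which is an assumption because the paper does not state how the covariate was generated. All names are hypothetical.

```python
import random


def pitman_yor_draws(n, alpha, theta, base_draw, rng):
    """Draw n values by the two-parameter urn scheme.

    A new value is drawn from the base distribution with probability
    proportional to theta + alpha * K; an existing value k is repeated
    with probability proportional to m_k - alpha.
    """
    atoms, counts, draws = [], [], []
    for _ in range(n):
        weights = [m - alpha for m in counts] + [theta + alpha * len(atoms)]
        k = rng.choices(range(len(atoms) + 1), weights=weights, k=1)[0]
        if k == len(atoms):
            atoms.append(base_draw(rng))
            counts.append(1)
        else:
            counts[k] += 1
        draws.append(atoms[k])
    return draws


def simulate_population(m=20, n_i=30, alpha=0.5, theta=10.0,
                        beta=(1.0, 2.0), sd_eps=0.1, seed=0):
    """Return rows (area, x, y, nu) for m areas with n_i units each."""
    rng = random.Random(seed)
    rows = []
    for i in range(m):
        nu = pitman_yor_draws(n_i, alpha, theta, lambda r: r.gauss(0.0, 1.0), rng)
        for j in range(n_i):
            x = rng.random()  # assumed covariate distribution: Uniform(0, 1)
            y = beta[0] + beta[1] * x + nu[j] + rng.gauss(0.0, sd_eps)
            rows.append((i, x, y, nu[j]))
    return rows


data = simulate_population()
```

Replacing the `base_draw` argument with a sampler for a $t$ distribution would give the heavier-tailed scenarios.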

5.2. Simulation Results and Analysis

The simulation results are given in the following figures and tables. Table 1 reports the estimation results for each parameter under different true values of $\alpha$ and $\theta$ and different base distributions for the PDP prior on the random effects in the proposed model. The first column in Table 1 lists the estimated parameters, the second column gives the base distribution for each case, the third column gives the true values of the corresponding parameters, the fourth and fifth columns give the bias and the MSE, and the sixth column gives the 95% confidence interval. Figure 1 shows the density estimation curves for the parameters $\alpha$ and $\theta$ under the different simulation settings.
In Table 1, we report the bias, MSE, and confidence intervals of $\hat{\alpha}$ and $\hat{\theta}$ for different base distributions: the normal distribution $N(0, 1)$ and $t$ distributions with five and two degrees of freedom. The results obtained under these two types of base distribution were similar. The biases of $\hat{\alpha}$ lay in the interval $(-0.073, 0.091)$, and the biases of $\hat{\theta}$ lay in the interval $(-0.901, 0.486)$. The MSEs of $\hat{\alpha}$ were less than 0.04, whereas the MSEs of $\hat{\theta}$ were large, and most of the estimated confidence intervals failed to capture the true values of the parameters. We also compared the bias, MSE, and confidence intervals of $\hat{\alpha}$ and $\hat{\theta}$ across different true values of $\alpha$ and $\theta$ when the base distribution follows a normal distribution or a $t$ distribution. From Table 1, we can see that when the base distribution was normal and $(\alpha, \theta) = (0.5, 10)$, the bias and MSE of $\hat{\theta}$ were larger than in the other cases. The bias and MSE of $\hat{\theta}$ were smallest when the base distribution was $t_2$ and $(\alpha, \theta) = (0.5, 2)$. The density curves of $\hat{\alpha}$ and $\hat{\theta}$ in Figure 1 agree with the results in Table 1.
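For readers who want to reproduce summaries of this kind, the sketch below computes the bias, the MSE, and a normal-approximation 95% confidence interval for the mean of a set of estimates. The paper does not state how its confidence intervals were constructed, so the interval formula here is an assumption, and the input values are made up for illustration.

```python
import math
import statistics


def summarize(estimates, true_value):
    """Return (bias, mse, (ci_low, ci_high)) for a list of estimates."""
    n = len(estimates)
    mean = statistics.fmean(estimates)
    bias = mean - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    # Assumed interval: mean +/- 1.96 standard errors of the mean.
    se = statistics.stdev(estimates) / math.sqrt(n)
    return bias, mse, (mean - 1.96 * se, mean + 1.96 * se)


# Hypothetical estimates of a parameter whose true value is 0.5.
bias, mse, ci = summarize([0.40, 0.50, 0.45, 0.42], true_value=0.5)
```

An interval of this type is centered on the mean estimate, so it can exclude the true value whenever the bias is large relative to the standard error.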
Table 1 also lists the estimation results for the regression coefficients under different values of the parameters $\alpha$ and $\theta$ and different base distributions. The biases and MSEs of $\beta_0$ and $\beta_1$ remained at very low levels, which reflects the accuracy and reliability of the estimates. Figure 2 shows the density estimation curves of $\beta_0$ and $\beta_1$ when $(\alpha, \theta) = (0.5, 10)$ with base distribution $N(0, 1)$, which agree with the estimation results in Table 1.
Table 2 compares the estimates of the regression coefficients $\beta_0$ and $\beta_1$ from the proposed model with those from the NER model across the five sets of simulations. Table 2 shows that the estimates of $\beta_0$ and $\beta_1$ from both the proposed model and the NER model were very close to the true values. However, for the simulations where the base distribution was normal, the NER model was slightly more accurate than the proposed model. Conversely, when the base distribution was a $t$ distribution, the proposed model was more precise than the NER model.
The Total Variation Distance (TVD) is a statistical distance between probability distributions. Here, it measures the distance between the true and the estimated base distribution $h_0$ and thus serves as a basis for evaluating the estimation performance. Figure 3 shows the TVD between the two distributions at each iteration under the five scenarios. As can be seen from Figure 3, the gap between the estimated distribution and the true distribution was small, and the total variation distance was less than 0.3 in each iteration.
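As a concrete illustration, the total variation distance between two continuous distributions equals half the integral of the absolute difference of their densities. The sketch below approximates it on a grid with the trapezoidal rule; the paper does not describe how the distance was computed, so the grid approach and all names here are assumptions.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def total_variation(p, q, lo, hi, n=20001):
    """Approximate 0.5 * integral of |p - q| over [lo, hi] (trapezoidal rule)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * abs(p(x) - q(x))
    return 0.5 * h * total


# Identical densities give zero; N(0, 1) versus N(1, 1) serves as a check
# against the closed form 2 * Phi(0.5) - 1 = erf(0.5 / sqrt(2)).
tvd_same = total_variation(normal_pdf, normal_pdf, -10.0, 10.0)
tvd_shift = total_variation(normal_pdf, lambda x: normal_pdf(x, mean=1.0), -10.0, 11.0)
```

In the setting of this section, `p` would be the true base density and `q` the density estimate of $h_0$ at a given iteration.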
Table 3 shows the simulation results for the estimates of all small area means $\rho_i$ when $(\alpha, \theta) = (0.5, 10)$ and the base distribution is $N(0, 1)$. For each area, Table 3 lists the sample mean and the estimated small area mean. As can be seen from Table 3, the estimates of the small area means were reasonably accurate and close to the sample means, without large deviations.
To reflect the reliability of the small area mean estimates more directly, we used squared residuals to measure how well the estimated small area means match the true values, and we give five plots of the squared residuals, one for each of the five scenarios. As shown in Figure 4, the squared residuals for all small area means were small, with most of them lying between 0 and 0.3.

5.3. Simulated Normal Data

In this section, we apply the estimation procedure to simulated data with normally distributed random effects in order to examine the strengths and weaknesses of the model. As in the previous section, and for ease of computation, we assumed that there are $m = 20$ small areas, with the sample size of each small area set to $n_i = 30$. The simulated data $y_{ij}$ were generated from the following model:
$$
y_{ij} = \beta_0 + x_{ij1} \beta_1 + \nu_{ij} + \varepsilon_{ij}, \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$
$$
\nu_{ij} \sim N(0, \sigma_\nu^2), \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$
$$
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$
where the random effects $\nu_{ij}$ come from a normal distribution $N(0, 4)$, and the errors $\varepsilon_{ij}$ come from a normal distribution $N(0, 0.1^2)$. The true value of the regression coefficient was set to $\beta = (\beta_0, \beta_1) = (1, 2)$.
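A minimal sketch of this data-generating step is given below. It reads $N(0, 4)$ as a normal distribution with variance 4 (standard deviation 2), in line with the notation $N(0, 0.1^2)$ for the errors, and it draws the covariate from a uniform distribution on $(0, 1)$, which is an assumption because the covariate distribution is not stated. The function name is hypothetical.

```python
import random
import statistics


def simulate_normal_ner(m=20, n_i=30, beta=(1.0, 2.0), sd_nu=2.0, sd_eps=0.1, seed=1):
    """Return rows (area, x, y, nu) with unit-level normal random effects."""
    rng = random.Random(seed)
    rows = []
    for i in range(m):
        for _ in range(n_i):
            x = rng.random()            # assumed covariate: Uniform(0, 1)
            nu = rng.gauss(0.0, sd_nu)  # random effect, variance sd_nu ** 2
            eps = rng.gauss(0.0, sd_eps)
            rows.append((i, x, beta[0] + beta[1] * x + nu + eps, nu))
    return rows


rows = simulate_normal_ner()
nu_sd = statistics.stdev([row[3] for row in rows])
```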
We then fitted the proposed model with PDP random effects to these data. Table 4 shows the resulting parameter estimates. The estimates of the fixed effects in Table 4 were close to the true values, and Table 4 also reports the estimated parameters $\alpha$ and $\theta$ of the PDP prior for the random effects. Based on the estimates of $\alpha$ and $\theta$ and the estimated base distribution of the random effects, we further obtained estimates of the means of all the small areas, which are presented in Table 5. Table 5 shows that the estimation of the small area means was reasonably reliable, and the estimates were close to the sample means without major deviations. Figure 5 shows the squared residuals for all the small area means. The squared residuals of all small area means were small, which supports the validity of our model and method.

6. Application

We next applied the proposed model to a dataset of income and other sociological variables for the Spanish provinces [26], which is available in the R package sae [27]. This dataset contains 20 regions, 21 variables, and a total of 1050 observation units. From these variables we selected four, namely incomedata, age, edu, and sex, which represent total income, age, education level, and gender, respectively. Because the central aim of the survey was to understand income levels in the Spanish provinces, incomedata was used directly as the response variable in the model. The remaining three variables, age, edu, and sex, are considered to be closely related to income and therefore served as auxiliary variables. We removed any units with missing values in these variables and normalized the data.
The proposed model can be described as follows:
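$$
\mathrm{incomedata}_{ij} = \beta_0 + \beta_1\, \mathrm{edu}_{ij} + \beta_2\, \mathrm{sex}_{ij} + \beta_3\, \mathrm{age}_{ij} + \nu_{ij} + \varepsilon_{ij}, \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$
$$
\nu_{i1}, \ldots, \nu_{in_i} \mid H_i \overset{iid}{\sim} H_i, \quad i = 1, 2, \ldots, m
$$
$$
H_i \sim PD(\alpha, \theta; h_0)
$$
$$
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \quad j = 1, 2, \ldots, n_i; \quad i = 1, 2, \ldots, m
$$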
Based on the proposed model, by applying the parameter estimation method in Section 4 and the provided algorithm to run two hundred rounds, the estimation results of each parameter were obtained and are shown in Table 6. Figure 6 gives the density plot of the base distribution obtained from our estimation.

7. Conclusions

In this paper, we proposed using the Poisson–Dirichlet Process in a Nested Error Regression model as the prior distribution for the random effects in unit-level data. In the small area model, the random effects are hidden variables that cannot be observed directly, so we applied an MCMC algorithm to sample the random effects given the current parameter values and constructed estimates of the parameters of the prior; we then estimated the remaining parameters, such as the regression coefficients and the base distribution, given the sampled random effects. Through numerical simulations and an application to real data, we demonstrated the feasibility of the proposed model and the practicality of the estimation algorithm.
Our proposed model and its parameter estimation method have two main advantages. First, the Poisson–Dirichlet process prior can flexibly capture the nonparametric properties of the random effects, thus overcoming the limitations of the traditional normality assumption and improving the adaptability and accuracy of the model. Second, the MCMC algorithm supports robust and accurate parameter estimation, especially when dealing with complex data and models. Both the theoretical and simulation results support these advantages, which makes our model and method applicable to a wide range of practical problems.
Although our proposed model and method performed well in several aspects, there are still some shortcomings and directions worthy of further research. The computational complexity of the models is high, especially when dealing with large-scale datasets, and the computational cost may become a limiting factor for their application. Therefore, developing more efficient computational methods and algorithms is a future research focus. In addition, with the development of data science, combining new statistical techniques and machine learning algorithms to improve and optimize the model is also a worthy research direction.

Author Contributions

Conceptualization, X.Z.; methodology, Q.K. and X.Z.; software, Q.K. and X.Z.; validation, Q.K. and X.Z.; formal analysis, Q.K. and X.Z.; investigation, X.Z.; resources, X.Q.; writing—original draft, Q.K.; writing—review and editing, X.Z.; visualization, X.Q.; supervision, Y.L.; project administration, X.Q. and Y.L.; funding acquisition, X.Q. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (Grant Nos. 12032016 and 12372277).

Data Availability Statement

The data that support the findings of this study are openly available, which can be downloaded from https://cran.r-project.org/web/packages/sae/index.html (accessed on 15 April 2024).

Acknowledgments

We thank the associate editor and the reviewers for their useful feedback that improved the quality and clarity of this paper.

Conflicts of Interest

The authors declared no conflicts of interest.

References

  1. Rao, J.N. Small Area Estimation; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 331.
  2. Pfeffermann, D. New important developments in small area estimation. Stat. Sci. 2013, 28, 40–68.
  3. Rao, J.N.; Molina, I. Small Area Estimation; John Wiley & Sons: Hoboken, NJ, USA, 2015.
  4. Fay, R.E., III; Herriot, R.A. Estimates of income for small places: An application of James-Stein procedures to census data. J. Am. Stat. Assoc. 1979, 74, 269–277.
  5. Battese, G.E.; Harter, R.M.; Fuller, W.A. An error-components model for prediction of county crop areas using survey and satellite data. J. Am. Stat. Assoc. 1988, 83, 28–36.
  6. Arima, S.; Bell, W.R.; Datta, G.S.; Franco, C.; Liseo, B. Multivariate Fay–Herriot Bayesian estimation of small area means under functional measurement error. J. R. Stat. Soc. Ser. A Stat. Soc. 2017, 180, 1191–1209.
  7. Yang, Z.; Chen, J. Small area mean estimation after effect clustering. J. Appl. Stat. 2020, 47, 602–623.
  8. Datta, G.S.; Hall, P.; Mandal, A. Model selection by testing for the presence of small-area effects, and application to area-level data. J. Am. Stat. Assoc. 2011, 106, 362–374.
  9. Sugasawa, S.; Kubokawa, T. Bayesian estimators in uncertain nested error regression models. J. Multivar. Anal. 2017, 153, 52–63.
  10. Ferrante, M.R.; Pacei, S. Small domain estimation of business statistics by using multivariate skew normal models. J. R. Stat. Soc. Ser. A Stat. Soc. 2017, 180, 1057–1088.
  11. Fabrizi, E.; Trivisano, C. Robust linear mixed models for small area estimation. J. Stat. Plan. Inference 2010, 140, 433–443.
  12. Chakraborty, A.; Datta, G.S.; Mandal, A. A two-component normal mixture alternative to the Fay-Herriot model. Stat. Transit. New Ser. 2016, 17, 67–90.
  13. Diallo, M.S.; Rao, J. Small area estimation of complex parameters under unit-level models with skew-normal errors. Scand. J. Stat. 2018, 45, 1092–1116.
  14. Tsujino, T.; Kubokawa, T. Empirical Bayes methods in nested error regression models with skew-normal errors. Jpn. J. Stat. Data Sci. 2019, 2, 375–403.
  15. Opsomer, J.D.; Claeskens, G.; Ranalli, M.G.; Kauermann, G.; Breidt, F.J. Non-parametric small area estimation using penalized spline regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 2008, 70, 265–286.
  16. Polettini, S. A Generalised Semiparametric Bayesian Fay–Herriot Model for Small Area Estimation Shrinking Both Means and Variances. Bayesian Anal. 2016, 12, 729–752.
  17. Pitman, J.; Yor, M. The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Ann. Probab. 1997, 25, 855–900.
  18. Al-Labadi, L.; Hamlili, M.; Ly, A. Bayesian Estimation of Variance-Based Information Measures and Their Application to Testing Uniformity. Axioms 2023, 12, 887.
  19. Handa, K. The two-parameter Poisson–Dirichlet point process. Bernoulli 2009, 15, 1082–1116.
  20. Favaro, S.; Lijoi, A.; Mena, R.H.; Prünster, I. Bayesian non-parametric inference for species variety with a two-parameter Poisson–Dirichlet process prior. J. R. Stat. Soc. Ser. B Stat. Methodol. 2009, 71, 993–1008.
  21. Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis, 3rd ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2013.
  22. Yang, L.; Wu, X. Estimation of Dirichlet process priors with monotone missing data. J. Nonparametr. Stat. 2013, 25, 787–807.
  23. Qiu, X.; Yuan, L.; Zhou, X. MCMC sampling estimation of Poisson-Dirichlet process mixture models. Math. Probl. Eng. 2021, 2021, 6618548.
  24. Carlton, M.A. Applications of the Two-Parameter Poisson-Dirichlet Distribution; University of California: Los Angeles, CA, USA, 1999.
  25. Neal, R.M. Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 2000, 9, 249–265.
  26. Molina, I.; Rao, J.N. Small area estimation of poverty indicators. Can. J. Stat. 2010, 38, 369–385.
  27. Molina, I.; Marhuenda, Y. sae: An R Package for Small Area Estimation. R J. 2015, 7, 81–98.
Figure 1. Density estimation curve of $\alpha$ and $\theta$. (a) $(\alpha, \theta) = (0.5, 10)$ based on $N(0, 1)$; (b) $(\alpha, \theta) = (0.3, 5)$ based on $N(0, 1)$; (c) $(\alpha, \theta) = (0.9, 3)$ based on $N(0, 1)$; (d) $(\alpha, \theta) = (0.4, 7)$ based on $t_5$; (e) $(\alpha, \theta) = (0.5, 2)$ based on $t_2$.
Figure 2. Density estimation curve of regression coefficients when $(\alpha, \theta) = (0.5, 10)$ based on $N(0, 1)$.
Figure 3. Total variation distance between two distributions at each iteration. (a) $(\alpha, \theta) = (0.5, 10)$ based on $N(0, 1)$; (b) $(\alpha, \theta) = (0.3, 5)$ based on $N(0, 1)$; (c) $(\alpha, \theta) = (0.9, 3)$ based on $N(0, 1)$; (d) $(\alpha, \theta) = (0.4, 7)$ based on $t_5$; (e) $(\alpha, \theta) = (0.5, 2)$ based on $t_2$.
Figure 4. Squared residuals results of small area means. (a) $(\alpha, \theta) = (0.5, 10)$ based on $N(0, 1)$; (b) $(\alpha, \theta) = (0.3, 5)$ based on $N(0, 1)$; (c) $(\alpha, \theta) = (0.9, 3)$ based on $N(0, 1)$; (d) $(\alpha, \theta) = (0.4, 7)$ based on $t_5$; (e) $(\alpha, \theta) = (0.5, 2)$ based on $t_2$.
Figure 5. Squared residuals results of small area means of simulated data.
Figure 6. Density of estimated base distribution of real data.
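Table 1. Performance of parameters estimation.

| Parameter | Base Distribution | True Value | Bias | MSE | Confidence Interval |
|---|---|---|---|---|---|
| $\alpha$ | $N(0, 1)$ | 0.5 | −0.06534407 | 0.03458486 | (0.427019, 0.442293) |
| $\theta$ | $N(0, 1)$ | 10 | −0.9001747 | 22.34711 | (8.874574, 9.325077) |
| $\beta_0$ | $N(0, 1)$ | 1 | 0.01457439 | 0.005534238 | (1.011374, 1.017774) |
| $\beta_1$ | $N(0, 1)$ | 2 | 0.03045286 | 0.01246042 | (2.025742, 2.035163) |
| $\alpha$ | $N(0, 1)$ | 0.3 | 0.08081297 | 0.02533613 | (0.361643, 0.399983) |
| $\theta$ | $N(0, 1)$ | 5 | 0.4856122 | 7.561553 | (5.105321, 5.865904) |
| $\beta_0$ | $N(0, 1)$ | 1 | 0.01848704 | 0.002993466 | (1.011289, 1.025685) |
| $\beta_1$ | $N(0, 1)$ | 2 | 0.04352289 | 0.004664257 | (2.036166, 2.050880) |
| $\alpha$ | $N(0, 1)$ | 0.9 | −0.07212803 | 0.01389355 | (0.814840, 0.840904) |
| $\theta$ | $N(0, 1)$ | 3 | −0.8307739 | 3.068224 | (1.915222, 2.423231) |
| $\beta_0$ | $N(0, 1)$ | 1 | 0.08176262 | 0.01210896 | (1.071468, 1.092058) |
| $\beta_1$ | $N(0, 1)$ | 2 | −0.08813282 | 0.01125358 | (1.903614, 1.920121) |
| $\alpha$ | $t_5$ | 0.4 | 0.05213893 | 0.02911436 | (0.429428, 0.474850) |
| $\theta$ | $t_5$ | 7 | −0.5294164 | 15.58317 | (5.916658, 7.024509) |
| $\beta_0$ | $t_5$ | 1 | 0.06114231 | 0.006702937 | (1.053531, 1.068753) |
| $\beta_1$ | $t_5$ | 2 | 0.02870615 | 0.004429664 | (2.020312, 2.037100) |
| $\alpha$ | $t_2$ | 0.5 | 0.09099937 | 0.01867303 | (0.576749, 0.605250) |
| $\theta$ | $t_2$ | 2 | −0.3801979 | 2.727176 | (1.395155, 1.844450) |
| $\beta_0$ | $t_2$ | 1 | −0.1348095 | 0.02203277 | (0.856507, 0.873875) |
| $\beta_1$ | $t_2$ | 2 | 0.3587933 | 0.1333754 | (2.349268, 2.368318) |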
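Table 2. Estimated mean of regression coefficients.

| Regression Coefficient | True Value | NER Model | NER Model with PDP Random Effects |
|---|---|---|---|
| $\beta_0$ | 1 | 0.9607364 | 1.014574 |
| $\beta_0$ | 1 | 1.017708 | 1.018487 |
| $\beta_0$ | 1 | 1.001620 | 1.081763 |
| $\beta_0$ | 1 | 1.069568 | 1.061142 |
| $\beta_0$ | 1 | 0.7113994 | 0.8651905 |
| $\beta_1$ | 2 | 2.045244 | 2.030453 |
| $\beta_1$ | 2 | 2.038286 | 2.043523 |
| $\beta_1$ | 2 | 1.900959 | 1.911867 |
| $\beta_1$ | 2 | 2.044987 | 2.028706 |
| $\beta_1$ | 2 | 2.4582498 | 2.358793 |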
Table 3. Results of small areas means when $(\alpha, \theta) = (0.5, 10)$ based on $N(0, 1)$.

| Area | Sample Mean | Estimate |
|---|---|---|
| 1 | 2.089628 | 2.220492 |
| 2 | 1.793437 | 1.916604 |
| 3 | 1.692420 | 1.731447 |
| 4 | 2.428903 | 2.606485 |
| 5 | 1.550903 | 2.462545 |
| 6 | 2.068002 | 2.527393 |
| 7 | 2.412526 | 2.482989 |
| 8 | 1.717398 | 1.926363 |
| 9 | 2.042811 | 2.390698 |
| 10 | 2.094958 | 2.358032 |
| 11 | 2.218328 | 2.641435 |
| 12 | 1.865992 | 2.597315 |
| 13 | 1.725139 | 2.231473 |
| 14 | 2.020898 | 2.302906 |
| 15 | 1.630873 | 2.099237 |
| 16 | 2.129673 | 2.268741 |
| 17 | 1.834107 | 1.922384 |
| 18 | 2.570386 | 2.952974 |
| 19 | 2.006835 | 2.219573 |
| 20 | 1.733093 | 2.066331 |
Table 4. Performance of parameters estimation of simulated data.

| $\hat{\alpha}$ | $\hat{\theta}$ | $\hat{\beta}_0$ | $\hat{\beta}_1$ |
|---|---|---|---|
| 0.600326 | 15.124903 | 1.003003 | 2.411909 |
Table 5. Results of small areas means of simulated data.

| Area | Sample Mean | Estimate |
|---|---|---|
| 1 | 2.6218905 | 2.342866 |
| 2 | 2.0078687 | 1.469475 |
| 3 | 0.9612582 | 1.162803 |
| 4 | 1.5513356 | 1.384326 |
| 5 | 2.2699821 | 1.809038 |
| 6 | 2.0796827 | 1.951780 |
| 7 | 3.0264584 | 2.486590 |
| 8 | 3.1521868 | 2.758094 |
| 9 | 1.2477416 | 1.554100 |
| 10 | 1.2796926 | 1.416766 |
| 11 | 2.5728659 | 2.642539 |
| 12 | 1.8507724 | 1.963746 |
| 13 | 2.8604333 | 2.081979 |
| 14 | 3.1778784 | 2.290092 |
| 15 | 1.7505182 | 1.817446 |
| 16 | 1.8190648 | 1.810704 |
| 17 | 1.0271540 | 1.127514 |
| 18 | 2.6706406 | 2.292215 |
| 19 | 2.5922461 | 2.019817 |
| 20 | 2.7200485 | 2.626496 |
Table 6. Performance of parameters estimation of real data.

| $\hat{\alpha}$ | $\hat{\theta}$ | $\hat{\beta}_0$ | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ |
|---|---|---|---|---|---|
| 0.875775 | 3.025505 | −0.007919 | −0.000048 | −0.000173 | −0.000169 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
