Article

Accounting for Sampling Weights in the Analysis of Spatial Distributions of Disease Using Health Survey Data, with an Application to Mapping Child Health in Malawi and Mozambique

by Sheyla Rodrigues Cassy 1,2, Samuel Manda 3,4,*, Filipe Marques 2,5 and Maria do Rosário Oliveira Martins 6

1 Department of Mathematics and Informatics, Faculty of Sciences, Eduardo Mondlane University, Maputo 254, Mozambique
2 Centre for Mathematics and Applications, CMA, NOVA School of Science and Technology, NOVA University of Lisbon, 2829-516 Lisbon, Portugal
3 Department of Statistics, University of Pretoria, Pretoria 0028, South Africa
4 Biostatistics Research Unit, South Africa Medical Research Council, Pretoria 0001, South Africa
5 Department of Mathematics, NOVA School of Science and Technology, NOVA University of Lisbon, 2829-516 Lisbon, Portugal
6 Global Health and Tropical Medicine, GHTM, Instituto de Higiene e Medicina Tropical, IHMT, Universidade Nova de Lisboa, 1349-0008 Lisbon, Portugal
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2022, 19(10), 6319; https://doi.org/10.3390/ijerph19106319
Submission received: 9 March 2022 / Revised: 9 May 2022 / Accepted: 11 May 2022 / Published: 23 May 2022
(This article belongs to the Section Global Health)

Abstract

Most analyses of spatial patterns of disease risk using health survey data fail to adequately account for the complex survey design. In particular, the survey sampling weights are often ignored. As a result, the estimated spatial distribution of disease risk could be biased and may lead to erroneous policy decisions. This paper presents recent statistical advances in disease-mapping methods that incorporate survey sampling weights in the estimation of the spatial distribution of disease risk. The methods were applied to the estimation of the geographical distribution of child malnutrition in Malawi, and of child fever and diarrhoea in Mozambique. The spatial distributions of child disease risk were estimated using Bayesian methods. Accounting for sampling weights resulted in smaller standard errors for the estimated spatial disease risk, which increased confidence in the conclusions drawn from the findings. The estimated geographical distributions of child disease risk were similar across the methods. However, the fits of the models to the data, as measured by the deviance information criterion (DIC), were different.

1. Introduction

In epidemiology and public health, the methods for mapping disease have long been used to estimate spatial patterns of disease risk. Statistical advances in the methods have included spatial smoothing of disease risk to produce interpretable maps, and extensions to include temporal components as well as individual and geographical-level data. Estimation of geographical patterns of diseases in low-resource settings is increasingly important in guiding decision-making on where to allocate resources [1,2].
In sub-Saharan Africa, many disease-mapping analyses that use data from complex health surveys fail to account for features of the survey design, such as disproportionate sampling [2]. In standard survey analyses, disproportionate sampling is corrected by using the survey sampling weights, which adjust for the disproportionate contribution of each ultimate sampling unit to the whole sample. Ignoring the sampling weights in disease-mapping analyses could lead to biased estimates of the spatial distribution of disease risk, which could adversely affect policy decisions based on them. Thus, appropriate statistical methods that incorporate sampling weights in the estimation of spatial patterns of disease are critical [3,4,5,6,7].
In this paper, rather than detailing the epidemiology of child diseases in sub-Saharan Africa, a topic that has been analysed extensively in several disease-mapping studies in Africa, we present recent statistical methods for incorporating sampling weights in the estimation of the geographical distribution of disease risk using complex health survey data. Two datasets, the 2015-16 Malawi Demographic and Health Survey (2015-16 MDHS) [8] and the 2015 Mozambique Immunization, Malaria and HIV/AIDS Key Indicator Survey (IMASIDA 2015) [9], are used for illustration through the mapping of child malnutrition, fever, and diarrhoea.

2. Methods

2.1. General Notation

Suppose that a finite population $U = \{1, 2, \ldots, N\}$ is distributed into $i = 1, 2, \ldots, I$ areas, and a random probability sample survey $s$ of size $n$ is taken according to a given design. Let $N_i$ and $n_i$ be the population and sample sizes for area $i$, respectively, such that $U = \bigcup_{i=1}^{I} U_i$, $N = \sum_{i=1}^{I} N_i$, $s = \bigcup_{i=1}^{I} s_i$ and $n = \sum_{i=1}^{I} n_i$. Let $Y_{ij}$ be the binary indicator for the presence of the disease, taking a value of 1 or 0 according to whether or not the $j$th individual in area $i$ has the disease ($j = 1, \ldots, N_i$; $i = 1, \ldots, I$). It is assumed that $N_i$ is known for each area $i$. Further, we assume that individual $ij$ has a known probability $\pi_{ij}$ of being included in the sample.
Our interest is to estimate the true area-specific population prevalence $P_i$, which is defined as:
$$P_i = \frac{1}{N_i} \sum_{j=1}^{N_i} Y_{ij}, \tag{1}$$
using the accrued sample from area $i$. The area-specific unweighted estimator of the true area prevalence $P_i$ is given by $\hat{P}_i^{UW}$, which is calculated as:
$$\hat{P}_i^{UW} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}, \tag{2}$$
and its variance is obtained as:
$$\widehat{var}(\hat{P}_i^{UW}) = \frac{\hat{P}_i^{UW}\,(1 - \hat{P}_i^{UW})}{n_i}. \tag{3}$$
In the case of a simple random sampling design without replacement, the estimator (2) is unbiased. However, in complex sampling, it would be inadequate as it does not account for the sample survey design, for example, sampling weights [10].

2.2. The Horvitz–Thompson Estimator

The well-known sample-design based unbiased estimator of the population prevalence P i is the Horvitz–Thompson (HT) estimator [11], which is given by:
$$\hat{P}_i^{HT} = \frac{\sum_{j \in s_i} w_{ij}^{d}\, y_{ij}}{\sum_{j \in s_i} w_{ij}^{d}} = \frac{1}{n_i} \sum_{j \in s_i} \tilde{w}_{ij}^{d}\, y_{ij}, \tag{4}$$
where $s_i$ is the set of individuals who are sampled from area $i$, with $y_{ij}$ being the observed value for $j \in s_i$ with $|s_i| = n_i$, $w_{ij}^{d} = 1/\pi_{ij}$ the design weight (i.e., the sampling weights are the inverse probability of inclusion in the sample adjusted for non-response [12]), and $\tilde{w}_{ij}^{d}$ given by:
$$\tilde{w}_{ij}^{d} = \frac{n_i\, w_{ij}^{d}}{\sum_{j \in s_i} w_{ij}^{d}}, \tag{5}$$
is the normalized sampling weight. According to study [4], an estimator of the variance of $\hat{P}_i^{HT}$ can be expressed as follows:
$$\widehat{var}(\hat{P}_i^{HT}) = \frac{1}{n_i} \left(1 - \frac{n_i}{N_i}\right) \frac{1}{n_i - 1} \sum_{j \in s_i} \left(\tilde{w}_{ij}^{d}\right)^{2} \left(y_{ij} - \hat{P}_i^{HT}\right)^{2}. \tag{6}$$
The HT estimator belongs to the group of direct estimators, as it is based only on the sample data from the area itself [13,14].
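As an illustration of the two direct estimators above, the following minimal Python sketch computes the unweighted prevalence, the normalized weights, and the HT prevalence with its variance estimator for a single area. The function names, the toy responses, the design weights and the area population size are all hypothetical and are not taken from the surveys analysed here.

```python
def unweighted_prevalence(y):
    """Unweighted prevalence and its binomial variance for one area."""
    n = len(y)
    p = sum(y) / n
    return p, p * (1 - p) / n


def normalized_weights(w):
    """Rescale design weights so that they sum to the area sample size n_i."""
    n, total = len(w), sum(w)
    return [n * wj / total for wj in w]


def ht_prevalence(y, w, N_i):
    """Weighted (HT) prevalence and its variance estimator for one area."""
    n = len(y)
    w_tilde = normalized_weights(w)
    p = sum(wt * yj for wt, yj in zip(w_tilde, y)) / n
    fpc = 1 - n / N_i  # finite population correction (1 - n_i / N_i)
    ss = sum(wt ** 2 * (yj - p) ** 2 for wt, yj in zip(w_tilde, y))
    return p, fpc * ss / (n * (n - 1))


# Hypothetical area: 5 sampled children, made-up design weights, N_i = 100.
y = [1, 0, 0, 1, 0]
w = [10.0, 20.0, 20.0, 40.0, 10.0]
p_uw, var_uw = unweighted_prevalence(y)
p_ht, var_ht = ht_prevalence(y, w, N_i=100)
print(p_uw, p_ht)
```

With equal weights the two estimators coincide, which is a quick sanity check on any implementation of the weighted estimator.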

2.3. Bayesian Hierarchical Spatial Smoothing Models

Bayesian hierarchical spatial smoothing models have recently gained attention regarding their use in small area estimation instead of direct estimators [4,5]. These methods rely on the assumption that area-specific estimates borrow information from other areas, which makes it possible to find more accurate estimates. Furthermore, this creates the advantage that estimates can be obtained in areas with no samples. The models involve three stages: (i) the likelihood of the response, which is defined conditionally on latent variables (random effects); (ii) the latent variables themselves are given a distribution, and (iii) the specification of prior distributions of all unknown parameters.
A three-stage Bayesian hierarchical spatial smoothing model for the total number of individuals with the disease in area $i$, given by $y_i = \sum_{j \in s_i} y_{ij}$, uses a binomial distribution for stage one:
$$y_i \mid P_i \sim \text{Binomial}(n_i, P_i). \tag{7}$$
In the second stage, we model the between-area variation in $P_i$ using an area random-effects model. In recent times, this has involved incorporating both non-spatial and spatial random effects using the convolution model of Besag–York–Mollié (BYM) [15]. Besides the standard binomial spatial model, we describe a series of models based on the BYM model that have been used to perform spatial analyses of prevalence data from health surveys.
Using the binomial distribution in (7), our first model is a standard spatial modelling approach for count data that uses health survey data to estimate the spatial distribution of disease risk. It simply links the prevalence of the disease with the two types of area random effects via a logit function:
$$\begin{aligned} \text{logit}(P_i) &= \beta_0 + u_i + v_i, \\ u_i \mid u_{i'},\, i' \neq i &\sim N\!\left(\frac{1}{a_i} \sum_{i' \in ne(i)} u_{i'},\; \frac{\sigma_u^2}{a_i}\right), \\ v_i \mid \sigma_v^2 &\overset{iid}{\sim} N(0, \sigma_v^2), \quad i = 1, \ldots, I, \end{aligned} \tag{8}$$
where $\beta_0$ is the intercept; $v_i$ is the unstructured random component; $u_i$ is the structured spatial random component; $ne(i)$ indicates the set of neighbours, and $a_i$ is the number of neighbours of area $i$. Here, we adopt the common neighbourhood convention, which considers two areas as neighbours if they share a common boundary. The specification of the structured random effects is based on an intrinsic conditional autoregressive (ICAR) prior [15,16]. We refer to this Binomial Spatial Model as Model 1.
In the third stage, we require priors for $\beta_0$ and the variances of the random effects. The model in Equation (8) results in the smoothing of extreme area estimates in areas with small sample sizes. However, without the incorporation of sampling weights, the estimates of the spatial patterns of disease risk could be biased.
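To make the ICAR specification more concrete, the sketch below computes the conditional mean and variance of one structured effect given its neighbours, and builds the structure matrix (number of neighbours on the diagonal, −1 for each neighbouring pair) whose scaling by $1/\sigma_u^2$ gives the ICAR precision. The four-area neighbourhood list and the values of $u$ are invented for illustration; they are not the Malawi or Mozambique adjacency structures.

```python
def icar_conditional(u, neighbours, i, sigma2_u):
    """Conditional mean and variance of u_i given the effects of its neighbours."""
    ne = neighbours[i]
    a_i = len(ne)  # number of neighbours of area i
    mean = sum(u[k] for k in ne) / a_i
    return mean, sigma2_u / a_i


def icar_structure_matrix(neighbours):
    """Structure matrix with a_i on the diagonal and -1 for neighbouring areas."""
    size = len(neighbours)
    R = [[0.0] * size for _ in range(size)]
    for i, ne in enumerate(neighbours):
        R[i][i] = float(len(ne))
        for k in ne:
            R[i][k] = -1.0
    return R


# Hypothetical map: four areas on a line, each sharing a boundary with the next.
neighbours = [[1], [0, 2], [1, 3], [2]]
u = [0.2, -0.1, 0.4, 0.0]
cond_mean, cond_var = icar_conditional(u, neighbours, i=1, sigma2_u=0.5)
R = icar_structure_matrix(neighbours)
print(cond_mean, cond_var)
```

Each row of the structure matrix sums to zero, which reflects the fact that the ICAR prior is improper: it is informative only about differences between neighbouring areas, not about the overall level.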

2.4. Incorporating Survey Sampling Weights in Hierarchical Spatial Model Analysis

Following studies [4,5], various approaches have been proposed to account for the survey sampling weights. Let us now consider Models 2 and 3, which are based on the HT estimator. As the distribution of the HT estimator could be skewed, the estimates are often transformed to approximately conform to normality. The most common transformations are based on the logit and arcsine functions. We first consider the logit transformation. Thus, Model 2 is given by:
$$\begin{aligned} \text{logit}(\hat{P}_i^{HT}) \mid P_i &\sim N\!\left(\text{logit}(P_i),\, \sigma_i^2\right), \\ \text{logit}(P_i) &= \beta_0 + u_i + v_i, \end{aligned}$$
where $\text{logit}(\hat{P}_i^{HT})$ has variance $\sigma_i^2 = \widehat{var}(\hat{P}_i^{HT}) / \left(\hat{P}_i^{HT}(1 - \hat{P}_i^{HT})\right)^2$. We call Model 2 the Logit Normal (LN) spatial model. For the arcsine square-root transformation, given as $\arcsin\!\left(\sqrt{\hat{P}_i^{HT}}\right)$ [5,17], the Arcsine (AS) spatial model leads to the following model specification:
$$\begin{aligned} \arcsin\!\left(\sqrt{\hat{P}_i^{HT}}\right) \mid P_i &\sim N\!\left(\arcsin\!\left(\sqrt{P_i}\right),\, \sigma_i^2\right), \\ \arcsin\!\left(\sqrt{P_i}\right) &= \beta_0 + u_i + v_i, \end{aligned}$$
and the variance of the arcsine transformation is $\sigma_i^2 = \frac{1}{4 n_i^E}$, where $n_i^E = \hat{P}_i^{HT}(1 - \hat{P}_i^{HT}) / \widehat{var}(\hat{P}_i^{HT})$ is the effective sample size in area $i$. Thus, our Model 3 is the Arcsine square-root (AS) spatial model.
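The inputs to Models 2 and 3 can be derived from an area's weighted prevalence and its estimated variance. The sketch below is a minimal illustration under assumed values: the function names and the example numbers are hypothetical, and the variance expressions are exactly those stated in the text.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def logit_normal_inputs(p_ht, var_ht):
    """Transformed estimate and variance used as data in the LN model."""
    return logit(p_ht), var_ht / (p_ht * (1 - p_ht)) ** 2


def effective_sample_size(p_ht, var_ht):
    """n_i^E = p(1 - p) / var(p), as defined in the text."""
    return p_ht * (1 - p_ht) / var_ht


def arcsine_inputs(p_ht, var_ht):
    """Transformed estimate and variance used as data in the AS model."""
    n_eff = effective_sample_size(p_ht, var_ht)
    return math.asin(math.sqrt(p_ht)), 1 / (4 * n_eff)


def arcsine_back(z):
    """Map a value on the arcsine square-root scale back to a prevalence."""
    return math.sin(z) ** 2


# Hypothetical area-level estimate and variance (not taken from the tables).
p_ht, var_ht = 0.35, 0.001
z_ln, s2_ln = logit_normal_inputs(p_ht, var_ht)
z_as, s2_as = arcsine_inputs(p_ht, var_ht)
print(z_ln, s2_ln, z_as, s2_as, arcsine_back(z_as))
```

Note that the logit transformation is undefined when an area's weighted prevalence is exactly 0 or 1, so such areas would need separate handling; how this was handled is not described in the text.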
For Model 4, we consider a pseudo-likelihood (PL), which uses a weighted likelihood [5], where the response values ($y_{ij}$) are weighted using the normalised design weights. Thus, rather than using the binomial outcomes used in (7) (Model 1), here we use $y_i^{PL} = \sum_{j \in s_i} \tilde{w}_{ij}^{d}\, y_{ij}$ as:
$$\begin{aligned} y_i^{PL} \mid P_i &\sim \text{Binomial}(n_i, P_i), \\ \text{logit}(P_i) &= \beta_0 + u_i + v_i. \end{aligned}$$
A drawback of this general approach is that the appropriate standard error is not recovered in the case of clustering. Rabe-Hesketh and Skrondal [18] used a pseudo-likelihood method with scaled weights and sandwich estimation to provide valid standard error estimates within a multilevel framework, but did not consider spatial smoothing. We denote the pseudo-likelihood (PL) spatial model as Model 4.
Our Models 5 and 6 are also variations of Model 1, but they depend on the effective sample size and the effective number of cases. For Model 5, the effective sample size $n_i^E$ is computed as previously shown and depends on the weighted estimator of prevalence [4]. The effective sample size is the sample size that is required to make the variance under the complex survey design equivalent to that of a simple random sample [4]. Then, the effective number of cases is found as $y_i^E = n_i^E\, \hat{P}_i^{HT}$. For Model 6, the effective sample size is obtained by using the design effect in area $i$ and is estimated as:
$$n_i^E = \frac{n_i}{deff_i},$$
where:
$$deff_i = \frac{s_i^2}{s_{ri}^2} \quad \text{for } i = 1, \ldots, I,$$
where $s_i^2$ is the unbiased direct estimate of the variance of the sample proportion based on the complex sampling design and $s_{ri}^2$ is the unbiased direct estimate of the variance of the proportion based on the simple random sampling design [19]. As before, this results in the effective number of cases in area $i$ as $y_i^E = n_i^E\, \hat{P}_i^{HT}$.
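The weighted counts that replace the raw binomial counts in Models 4 to 6 can be sketched as follows. This is a minimal illustration with invented inputs; the function names are hypothetical, and the variance estimates passed to the design-effect version are assumed to have been computed elsewhere (for example, with survey software).

```python
def pseudo_likelihood_cases(y, w_tilde):
    """Weighted number of cases: sum of normalized weights times responses."""
    return sum(wt * yj for wt, yj in zip(w_tilde, y))


def effective_counts_from_variance(p_ht, var_ht):
    """Effective sample size and cases from n_i^E = p(1 - p) / var(p)."""
    n_eff = p_ht * (1 - p_ht) / var_ht
    return n_eff, n_eff * p_ht


def effective_counts_from_deff(n_i, p_ht, var_design, var_srs):
    """Effective sample size and cases from n_i^E = n_i / deff_i."""
    deff = var_design / var_srs
    n_eff = n_i / deff
    return n_eff, n_eff * p_ht


# Hypothetical inputs for one area.
y_pl = pseudo_likelihood_cases([1, 0, 0, 1, 0], [0.5, 1.0, 1.0, 2.0, 0.5])
n5, y5 = effective_counts_from_variance(p_ht=0.3, var_ht=0.002)
n6, y6 = effective_counts_from_deff(
    n_i=210, p_ht=0.3, var_design=0.002, var_srs=0.3 * 0.7 / 210
)
print(y_pl, n5, y5, n6, y6)
```

In this toy example the two effective sample sizes agree because the simple-random-sampling variance was set to $\hat{P}_i^{HT}(1 - \hat{P}_i^{HT})/n_i$; with a separately estimated $s_{ri}^2$ they would generally differ. The resulting counts are usually not integers, so any software used to fit the binomial model must accept non-integer counts.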

2.5. Bayesian Inference, Computation, and Model Evaluation

For the Bayesian estimation of the model parameters, we assumed an improper uniform prior for $\beta_0$ and $\text{Gamma}(0.5, 0.008)$ priors for both the spatial and non-spatial precision parameters $\sigma_u^{-2}$ and $\sigma_v^{-2}$, as in [5]. The estimation was done using the Integrated Nested Laplace Approximation (INLA), which is implemented in the INLA package within the statistical software R [20,21,22]. A detailed description of INLA is provided in Appendix A.
Model comparison and selection were carried out using the deviance information criterion (DIC) [23]. The DIC is computed as
$$DIC = \bar{D} + p_D,$$
where $\bar{D}$ is the posterior mean of the deviance, which measures the goodness of fit, and $p_D$ is the effective number of parameters, which penalises the complexity of the model. The model with the smallest DIC was considered to have the best fit.
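For intuition, the DIC can be computed by hand from posterior draws of the area prevalences. The sketch below is only an illustration: it assumes the common definition $p_D = \bar{D} - D(\bar{P})$, with the deviance evaluated at the posterior mean, which the text above does not spell out, and it uses a handful of invented draws rather than output from the fitted models. The value reported by the software used in the paper may be computed differently.

```python
import math


def binomial_deviance(ys, ns, ps):
    """Deviance D = -2 * log-likelihood of independent binomial counts."""
    loglik = 0.0
    for y, n, p in zip(ys, ns, ps):
        loglik += (
            math.log(math.comb(n, y))
            + y * math.log(p)
            + (n - y) * math.log(1 - p)
        )
    return -2.0 * loglik


def dic_from_draws(ys, ns, draws):
    """Return (DIC, D_bar, p_D) from posterior draws of the prevalences."""
    deviances = [binomial_deviance(ys, ns, d) for d in draws]
    d_bar = sum(deviances) / len(deviances)  # posterior mean deviance
    p_mean = [sum(col) / len(col) for col in zip(*draws)]
    p_d = d_bar - binomial_deviance(ys, ns, p_mean)  # assumed definition of p_D
    return d_bar + p_d, d_bar, p_d


# Hypothetical data for two areas and three made-up posterior draws.
ys, ns = [3, 5], [10, 10]
draws = [[0.25, 0.45], [0.35, 0.55], [0.30, 0.50]]
dic, d_bar, p_d = dic_from_draws(ys, ns, draws)
print(dic, d_bar, p_d)
```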

3. Application

Although Malawi and Mozambique have experienced substantial improvements in child health, preventable child deaths continue to be unacceptably high to achieve the Sustainable Development Goals [24,25,26,27,28,29]. Understanding the local epidemiology of diseases in these two countries is critical for defining and prioritising interventions that can contribute to accelerating the reduction of morbidity and mortality in children under 5 years old in these countries.
We used Models 1–6, presented above, to estimate the geographical distribution of stunting, wasting, and underweight among children under 5 years old at the district level in Malawi, and childhood fever and diarrhoea at the province level in Mozambique.

3.1. Data Sources: Malawi and Mozambique

The 2015-16 MDHS was a national, population-based, cross-sectional survey conducted between December 2015 and February 2016. Briefly, it employed a two-stage sampling design intended to produce representative estimates at the national level, by residence (urban and rural), and at the district level. Stratification was made at two levels: the district level (32 districts), and the urban and rural areas. In the first stage, 850 primary sample units (PSUs), which were enumeration areas (EAs), were selected with probability proportional to size (the number of households in each EA), using a sampling frame based on the 2008 Malawi Population and Housing Census and updated with the General Agriculture Census 2009. Of these PSUs, 173 were in urban areas and 677 were in rural areas. The second stage involved a systematic selection of 30 households from each urban cluster and 33 households from each rural cluster, yielding a sample of 27,516 households. The response rate was 99%. The methodology used in the 2015-16 MDHS has been reported in detail in [8]. Figure 1a depicts the geospatial arrangement of the districts of Malawi.
The second dataset was the IMASIDA 2015, which includes information from 7169 households, with interviews of 7749 women and 5283 men aged 15 to 59 years, across 307 EAs. Data were collected between June and September 2015 through a two-stage sampling process designed to produce representative estimates at the national, provincial (11 geographic areas: Maputo Province, Maputo City, Inhambane, Gaza, Sofala, Manica, Zambezia, Nampula, Tete, Niassa, and Cabo Delgado), and regional (north, centre and south) levels, by area of residence (urban and rural), and for women and men aged 15–59 years. The methodology used has been reported in detail elsewhere [9]. Figure 1b depicts the geospatial arrangement of the provinces of Mozambique.
Both datasets are publicly available and can be downloaded at https://dhsprogram.com/ (accessed on 21 July 2021).

3.2. Outcomes

The outcomes considered for Malawi were three indicators of child nutritional status, namely stunting, wasting, and underweight. Anthropometric measurements were used to define the nutritional status of children. Children with a z-score more than two standard deviations below (below −2 SD) the median of the WHO reference population for height-for-age were categorised as stunted; for weight-for-height, as wasted; and for weight-for-age, as underweight [24]. Thus, all outcome variables were binary, taking a value of “1” if a child was malnourished (i.e., stunted, wasted, or underweight), and a value of “0” otherwise. Due to missing data on these measurements, only 5149 children were included in the stunting analyses, 5178 in the wasting analyses, and 5223 in the underweight analyses.
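The construction of the binary outcomes from z-scores can be sketched as below. The field names `haz`, `whz` and `waz` (height-for-age, weight-for-height and weight-for-age z-scores) and the three example records are hypothetical; they are not the variable names used in the survey files.

```python
def below_cutoff(z_score, cutoff=-2.0):
    """Return 1 if the z-score is below the cutoff, 0 otherwise, None if missing."""
    if z_score is None:
        return None
    return 1 if z_score < cutoff else 0


# Hypothetical child records with made-up z-scores; None marks a missing value.
children = [
    {"haz": -2.4, "whz": -0.3, "waz": -1.1},
    {"haz": -1.0, "whz": -2.1, "waz": -2.5},
    {"haz": None, "whz": 0.4, "waz": -0.2},
]

outcomes = [
    {
        "stunted": below_cutoff(c["haz"]),
        "wasted": below_cutoff(c["whz"]),
        "underweight": below_cutoff(c["waz"]),
    }
    for c in children
]

# Each outcome is analysed on the children with a non-missing value for it,
# which is why the three analyses can have different sample sizes.
stunting_sample = [o["stunted"] for o in outcomes if o["stunted"] is not None]
print(outcomes, stunting_sample)
```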
For the Mozambique data, the outcomes considered were the fever and diarrhoea statuses. Children under 5 years old who had their mother answer whether they had diarrhoea or fever within the past 2 weeks were included in the analysis. The remaining children with missing values for the outcomes were excluded from our research. Thus, our analyses included a total of 4972 children under 5 years old for fever and 4980 children for diarrhoea.

3.3. Malawi: District Variation in the Prevalence of Child Malnutrition

The observed prevalence (weighted) of stunting, wasting, and underweight in Malawi among children under 5 years old was 36.82% (95% CI 35.18–38.46), 2.79% (95% CI 2.24–3.33), and 11.58% (95% CI 10.49–12.67), respectively, with the variation across districts that ranged from 15.44% in Mzuzu City to 45.88% in Mchinji for stunting; 1% in Balaka to 9.92% in Nsanje for wasting, and 1.93% in Zomba City to 18.85% in Nsanje for underweight (Table A1 in Appendix B).
We applied Models 1–6 to estimate the district-level pattern of child growth measures in Malawi. The fit and parameter estimates are presented in Table 1, Table 2 and Table 3. For each child growth measure, the results showed similar estimates for the intercept parameters, except for Model 3 (AS), which was on a different scale. The credible intervals for the intercept parameters were generally narrower when the sampling weights were accounted for. The spatial Model 3 (AS) had the lowest DIC and thus the best fit (DIC = −86.69 for stunting; DIC = −94.76 for wasting; DIC = −89.7 for underweight).
Figure 2 presents maps of the district-level observed and spatially estimated prevalence of stunting. The spatial pattern in the stunting prevalence was smoother compared to the spatial pattern based on the observed prevalence. Generally higher stunting rates were found in the main central districts of the country. For wasting (Figure 3), districts in the southern part of the country bore the most burden. The spatial trend for wasting was similar to that of underweight prevalence (Figure 4).

3.4. Mozambique: Provincial Variations in the Prevalence of Child Fever and Diarrhoea

A summary of the province-level prevalence of fever and diarrhoea is provided in Table A2 in Appendix B. Overall, about 29.37% (95% CI 26.99–31.87) of children under 5 years old had a fever and 11.11% (95% CI 9.93–12.41) had diarrhoea, with variation across provinces in Mozambique ranging from 14.37% in Tete to 51.67% in Zambezia for fever, and from 6.8% in Tete to 17.19% in Niassa for diarrhoea. The model-fit criteria values and parameter estimates are presented in Table 4 and Table 5. The estimates of the intercepts from the models for each condition were similar, except for Model 3 (AS), which was on a different scale. Moreover, the estimates of the intercepts were slightly more precise, with narrower credible intervals, when sampling weights were accounted for. Furthermore, the spatial Model 3 (AS) was the best-fitting model.
Figure 5 and Figure 6 present prevalence maps of fever and diarrhoea, respectively, under different spatial model specifications. Child fever was more concentrated in provinces around the northeastern parts of Mozambique and less in the southern provinces (e.g., Maputo Cidade and Maputo Province). On the other hand, child diarrhoea was higher in most northern provinces, Zambezia, and Niassa provinces, and much lower in the southern parts.

4. Discussion and Conclusions

In this paper, we compared several statistical methods and their resulting estimates of the spatial distributions of child malnutrition, fever and diarrhoea in Malawi and Mozambique using health survey data, accounting for survey sampling weights. The results of the study showed that the sampling weight-adjusted methods were the best fitting, a finding similar to previous studies [4,5,6,7]. Even though the estimated spatial patterns were similar, the models that adjusted for the sampling weights produced estimates that had lower variability (narrower intervals). Thus, accounting for sampling weights produced estimates of disease risk that had an increased level of confidence.
Our study has some limitations. Firstly, although the arcsine transformation of the weighted prevalence was preferred for our application, it does not have an intuitive interpretation of the association between the binary outcomes and predictors. Secondly, for the Mozambique case, the study used the province as the unit of analysis, which is a much coarser level of geographical aggregation. This may have concealed variations at the higher spatial resolution needed for local policy decisions. We suggest that future studies perform spatial analyses at higher spatial resolutions, for example, at the district level. Thirdly, our analyses used univariate spatial methods for conditions that could be correlated at ecological levels [30,31]. We are now extending these statistical methods to the estimation of joint spatial patterns of diseases. Finally, we did not perform any simulation study to compare the performance of the studied statistical methods for the estimation of spatial disease patterns using complex health survey data. However, we considered this unnecessary for this study, as we aimed to describe the sampling weight adjustment methods and illustrate their use on typical examples. Previous research has assessed their performance using simulations [4,5,6,7].
In conclusion, we recommend spatial epidemiology researchers consider incorporating survey sampling weights in disease-mapping analyses for estimating the spatial distribution of disease risks based on complex health survey data. The estimates are more precise, thus providing reliable supporting evidence to drive public health policy on targeting resources in areas of most need.

Author Contributions

Conceptualization, S.R.C. and S.M.; methodology, S.R.C. and S.M.; software, S.R.C. and S.M.; formal analysis, S.R.C. and S.M.; data curation, S.R.C.; writing–original draft preparation, S.R.C., S.M. and F.M.; writing–review and editing, S.R.C., S.M., F.M. and M.d.R.O.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported through the project of the Centro de Matemática e Aplicações, UID/MAT/00297/2020, financed by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology). The APC was supported by the New University of Lisbon through the PhD program in Statistics and Risk Management of the FCT Nova Faculty.

Institutional Review Board Statement

Ethical review and approval were not needed for this study as secondary data were used. The respective study implementing bodies in Malawi and Mozambique sought approval from national and international Ethical and Review Boards.

Informed Consent Statement

Written informed consent to participate in this study was provided by the children’s legal guardian/next of kin.

Data Availability Statement

The datasets used in this study are publicly available and can be downloaded at https://dhsprogram.com/ (accessed on 21 July 2021).

Acknowledgments

Support from a doctoral Calouste Gulbenkian Foundation grant (135422 to S.R.C.) is acknowledged. Support from the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) (through the project UIDB/00297/2020 (Centro de Matemática e Aplicações) to S.R.C. and F.M.) is acknowledged. Support from the South Africa Medical Research Council (SAMRC) with funds from the National Treasury in terms of the SAMRC’s competitive Intramural Research Fund (SAMRC-RFA-IFF-02-2016 to S.M.) is acknowledged. We also extend thanks to DHS Measure for allowing us to use the 2015-16 MDHS and 2015 IMASIDA datasets for this study.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

| Abbreviation | Definition |
|---|---|
| AS | Arcsine square-root transformation |
| DEFF | Design effect |
| DIC | Deviance information criterion |
| EAs | Enumeration areas |
| ES | Effective sample size |
| ES-deff | Effective sample size using design effect |
| GMRF | Gaussian Markov random field |
| HT | Horvitz–Thompson |
| ICAR | Intrinsic conditional autoregressive |
| IMASIDA | Indicators of Immunization, Malaria and HIV/AIDS Survey |
| INLA | Integrated nested Laplace approximations |
| LN | Logit Normal |
| MCMC | Markov chain Monte Carlo |
| MDHS | Malawi Demographic and Health Survey |
| PL | Pseudo-likelihood |
| PSUs | Primary sample units |
| SAE | Small area estimation |
| SDGs | Sustainable Development Goals |
| SSA | Sub-Saharan Africa |
| UB | Unadjusted binomial |
| UW | Unweighted |
| WHO | World Health Organization |

Appendix A. The Integrated Nested Laplace Approximation

The basic idea behind INLA is to use a deterministic approach to approximate posteriors for GMRF models, which, in most cases, makes INLA faster and more accurate than MCMC alternatives for GMRF models.
The posterior distribution for the latent Gaussian model is:
$$\begin{aligned} \pi(x, \theta \mid y) &\propto \pi(\theta)\, \pi(x \mid \theta)\, \pi(y \mid x, \theta) \\ &\propto \pi(\theta)\, \pi(x \mid \theta) \prod_{i=1}^{n} \pi(y_i \mid x_i, \theta) \\ &\propto \pi(\theta)\, |Q(\theta)|^{1/2} \exp\!\left(-\frac{1}{2} x^{T} Q(\theta)\, x\right) \prod_{i=1}^{n} \exp\!\left(\log \pi(y_i \mid x_i, \theta)\right) \\ &\propto \pi(\theta)\, |Q(\theta)|^{1/2} \exp\!\left(-\frac{1}{2} x^{T} Q(\theta)\, x + \sum_{i=1}^{n} \log \pi(y_i \mid x_i, \theta)\right), \end{aligned}$$
where $x$ is the latent field; $\theta$ is the set of hyperparameters; $Q(\theta)$ is the precision matrix of the latent field, and $y$ is the data. The INLA approach does not estimate the whole joint posterior distribution but rather the posterior marginals of interest, for the latent field, $\pi(x_i \mid y)$, and for the hyperparameters, $\pi(\theta_j \mid y)$:
$$\pi(x_i \mid y) = \int \pi(x_i \mid \theta, y)\, \pi(\theta \mid y)\, d\theta \quad \text{and} \quad \pi(\theta_j \mid y) = \int \pi(\theta \mid y)\, d\theta_{-j},$$
by constructing the nested approximations:
$$\tilde{\pi}(x_i \mid y) = \int \tilde{\pi}(x_i \mid \theta, y)\, \tilde{\pi}(\theta \mid y)\, d\theta \quad \text{and} \quad \tilde{\pi}(\theta_j \mid y) = \int \tilde{\pi}(\theta \mid y)\, d\theta_{-j},$$
where $\theta_{-j}$ denotes all components of $\theta$ except $\theta_j$. Here, $\tilde{\pi}(\cdot \mid \cdot)$ is an approximated (conditional) density of its arguments. Approximations to $\pi(x_i \mid y)$ are calculated by approximating $\pi(\theta \mid y)$ and $\pi(x_i \mid \theta, y)$, and using numerical integration to integrate out $\theta$, based on the Laplace approximation method. Integration is possible when the dimension of $\theta$ is small. The nested approach makes Laplace approximations very slow when applied to the latent Gaussian models.
The proposed Laplace approximation for $\pi(\theta \mid y)$ is given by:
$$\tilde{\pi}(\theta \mid y) \propto \left. \frac{\pi(y \mid x, \theta)\, \pi(x \mid \theta)\, \pi(\theta)}{\tilde{\pi}_G(x \mid \theta, y)} \right|_{x = x^{*}(\theta)},$$
where $\tilde{\pi}_G(x \mid \theta, y)$ is a Gaussian approximation for the full conditional of $x$ obtained by computing the Laplace approximation, and $x^{*}(\theta)$ is the mode of the full conditional of $x$ for a given $\theta$. For the posterior marginals of the latent field, the density of $x_i \mid \theta, y$ can be approximated with the Gaussian marginal derived from $\tilde{\pi}_G(x \mid \theta, y)$. For more details on this method, see [16,22].
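The ratio form of the Laplace approximation can be illustrated on a deliberately small toy problem: a single binomial observation with $\text{logit}(P) = x$ and a Gaussian prior $x \sim N(0, 1/\tau)$ with $\tau$ held fixed. The sketch below finds the mode of the full conditional of $x$ by Newton's method, evaluates the joint density divided by the Gaussian approximation at that mode, and compares the result with brute-force numerical integration. This is an assumption-laden illustration of the idea only, not the algorithm implemented in the INLA package, and the data values are invented.

```python
import math


def log_joint(x, y, n, tau):
    """log of pi(y | x) * pi(x) for a binomial likelihood and N(0, 1/tau) prior."""
    p = 1.0 / (1.0 + math.exp(-x))
    loglik = math.log(math.comb(n, y)) + y * math.log(p) + (n - y) * math.log(1 - p)
    logprior = 0.5 * math.log(tau / (2 * math.pi)) - 0.5 * tau * x * x
    return loglik + logprior


def laplace_marginal(y, n, tau, iters=50):
    """Laplace approximation to the integral of pi(y | x) * pi(x) over x."""
    x = 0.0
    for _ in range(iters):  # Newton's method for the mode of the full conditional
        p = 1.0 / (1.0 + math.exp(-x))
        grad = y - n * p - tau * x
        hess = -n * p * (1 - p) - tau
        x -= grad / hess
    p = 1.0 / (1.0 + math.exp(-x))
    precision = n * p * (1 - p) + tau  # precision of the Gaussian approximation
    log_gauss_at_mode = 0.5 * math.log(precision / (2 * math.pi))
    return math.exp(log_joint(x, y, n, tau) - log_gauss_at_mode), x


def quadrature_marginal(y, n, tau, lo=-8.0, hi=8.0, steps=16000):
    """Brute-force trapezoidal integration of the same quantity, for comparison."""
    h = (hi - lo) / steps
    total = 0.5 * (math.exp(log_joint(lo, y, n, tau)) + math.exp(log_joint(hi, y, n, tau)))
    for k in range(1, steps):
        total += math.exp(log_joint(lo + k * h, y, n, tau))
    return total * h


# Hypothetical data: 30 cases among 100 children, prior precision tau = 1.
approx, mode = laplace_marginal(30, 100, 1.0)
exact = quadrature_marginal(30, 100, 1.0)
print(approx, exact, mode)
```

In INLA the same construction is applied with a high-dimensional latent field and is repeated over a grid of hyperparameter values, which is what makes the approximations nested.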

Appendix B

Appendix B.1. Malawi Results

Table A1. Summary of the observed (weighted) prevalence rates of stunting, wasting, and underweight per district in Malawi.
| District | Stunting: N respondents | Stunted, n (%) | Wasting: N respondents | Wasted, n (%) | Underweight: N respondents | Underweight, n (%) |
|---|---|---|---|---|---|---|
| Chitipa | 139 | 43 (33.08) | 144 | 2 (1.39) | 141 | 18 (13.89) |
| Karonga | 142 | 38 (28.00) | 143 | 2 (1.56) | 144 | 14 (9.18) |
| Nkhata Bay | 149 | 47 (31.33) | 150 | 1 (0.17) | 152 | 10 (5.89) |
| Rumphi | 147 | 44 (31.87) | 147 | 3 (1.74) | 147 | 20 (13.75) |
| Mzimba | 158 | 70 (44.79) | 159 | 5 (3.19) | 158 | 22 (13.45) |
| Likoma | 128 | 33 (26.99) | 128 | 5 (4.26) | 129 | 11 (9.13) |
| Mzuzu City | 39 | 7 (15.44) | 39 | 1 (2.72) | 39 | 1 (2.72) |
| Kasungu | 211 | 73 (35.90) | 215 | 5 (2.73) | 216 | 14 (6.56) |
| Nkhotakota | 202 | 69 (32.58) | 204 | 7 (1.82) | 202 | 32 (13.05) |
| Ntchisi | 181 | 68 (40.55) | 182 | 4 (1.80) | 186 | 20 (11.53) |
| Dowa | 199 | 74 (39.42) | 199 | 2 (1.05) | 200 | 17 (9.31) |
| Salima | 212 | 78 (36.75) | 214 | 4 (1.60) | 213 | 28 (13.81) |
| Lilongwe Rural | 147 | 63 (43.28) | 149 | 1 (0.64) | 150 | 14 (9.42) |
| Mchinji | 210 | 95 (45.88) | 214 | 8 (3.35) | 213 | 26 (12.20) |
| Dedza | 177 | 72 (41.18) | 177 | 5 (2.85) | 179 | 28 (15.41) |
| Ntcheu | 200 | 78 (40.77) | 199 | 8 (3.74) | 201 | 26 (12.99) |
| Lilongwe City | 74 | 15 (19.55) | 74 | 3 (4.14) | 74 | 6 (8.07) |
| Mangochi | 244 | 107 (44.33) | 246 | 2 (0.90) | 255 | 32 (12.08) |
| Machinga | 247 | 95 (38.50) | 247 | 9 (3.71) | 253 | 40 (15.58) |
| Zomba Rural | 182 | 67 (36.90) | 179 | 8 (4.50) | 183 | 22 (11.93) |
| Chiradzulu | 140 | 48 (33.62) | 144 | 9 (6.53) | 145 | 19 (12.81) |
| Blantyre Rural | 86 | 28 (32.84) | 87 | 5 (5.53) | 86 | 7 (8.00) |
| Mwanza | 143 | 46 (31.24) | 143 | 12 (7.03) | 150 | 23 (14.55) |
| Thyolo | 151 | 54 (34.43) | 149 | 6 (3.78) | 150 | 22 (13.29) |
| Mulanje | 172 | 66 (36.91) | 172 | 6 (3.58) | 174 | 29 (16.30) |
| Phalombe | 211 | 68 (33.09) | 216 | 4 (2.03) | 212 | 21 (10.31) |
| Chikwawa | 180 | 55 (30.18) | 181 | 8 (4.80) | 184 | 21 (11.27) |
| Nsanje | 159 | 48 (31.70) | 161 | 17 (9.92) | 162 | 27 (18.85) |
| Balaka | 213 | 69 (32.73) | 212 | 0 (0.00) | 214 | 26 (13.19) |
| Neno | 165 | 72 (45.28) | 164 | 7 (4.70) | 168 | 30 (18.44) |
| Zomba City | 49 | 8 (17.75) | 48 | 3 (5.56) | 50 | 1 (1.93) |
| Blantyre City | 92 | 28 (30.47) | 92 | 2 (2.10) | 93 | 7 (7.57) |
| Total | 5149 | 1826 (36.82) | 5178 | 164 (2.79) | 5223 | 634 (11.58) |

Appendix B.2. Mozambique Results

Table A2. Summary of the observed (weighted) prevalence rates of fever and diarrhoea per province in Mozambique.
| Province | Fever: N respondents | Fever, n (%) | Diarrhoea: N respondents | Diarrhoea, n (%) |
|---|---|---|---|---|
| Niassa | 546 | 146 (30.16) | 546 | 94 (17.19) |
| Cabo Delgado | 380 | 86 (21.94) | 383 | 37 (9.90) |
| Nampula | 595 | 228 (39.49) | 597 | 64 (11.28) |
| Zambezia | 555 | 264 (51.67) | 556 | 90 (16.95) |
| Tete | 448 | 65 (14.37) | 449 | 39 (6.80) |
| Manica | 479 | 81 (16.59) | 479 | 43 (8.90) |
| Sofala | 505 | 109 (21.58) | 506 | 46 (8.43) |
| Inhambane | 323 | 68 (18.23) | 323 | 27 (7.24) |
| Gaza | 530 | 141 (27.00) | 530 | 61 (11.79) |
| Maputo Provincia | 340 | 53 (15.86) | 340 | 23 (8.25) |
| Maputo Cidade | 271 | 70 (24.99) | 271 | 28 (9.99) |
| Total | 4972 | 1311 (29.37) | 4980 | 549 (11.11) |

References

  1. Giorgi, E.; Diggle, P.J.; Snow, R.W.; Noor, A.M. Geostatistical Methods for Disease Mapping and Visualisation Using Data from Spatio-temporally Referenced Prevalence Surveys. Int. Stat. Rev. 2018, 86, 571–597. [Google Scholar] [CrossRef] [PubMed]
  2. Manda, S.; Haushona, N.; Bergquist, R. A scoping review of spatial analysis approaches using health survey data in sub-Saharan Africa. Int. J. Environ. Res. Public Health 2020, 17, 3070. [Google Scholar] [CrossRef] [PubMed]
  3. Okango, E.; Mwambi, H.; Ngesa, O. Spatial modeling of HIV and HSV-2 among women in Kenya with spatially varying coefficients. BMC Public Health 2016, 16, 355. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Chen, C.; Wakefield, J.; Lumely, T. The use of sampling weights in Bayesian hierarchical models for small area estimation. Spat. Spatiotemporal. Epidemiol. 2014, 11, 33–43. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Mercer, L.D.; Wakefield, J.; Chen, C.; Lumley, T. A comparison of spatial smoothing methods for small area estimation with sampling weights. Spat. Stat. 2014, 8, 69–85. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Vandendijck, Y.; Faes, C.; Kirby, R.S.; Lawson, A.; Hens, N. Model-based inference for small area estimation with sampling weights. Spat. Stat. 2016, 18, 455–473. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Watjou, K.; Faes, C.; Lawson, A.; Kirby, R.S.; Aregay, M.; Carroll, R.; Vandendijck, Y. Spatial small area smoothing models for handling survey data with nonresponse. Stat. Med. 2017, 36, 3708–3745. [Google Scholar] [CrossRef]
  8. NSO Malawi Demographic and Health Survey 2015–16; National Statistical Office: Zomba, Malawi; The DHS Program ICF: Rockville, MD, USA, 2017; pp. 1–658.
  9. MISAU. INE Inquérito de Indicadores de Imunização, Malária e HIV SIDA em Moçambique (IMASIDA)-2015; INS: Maputo, Mozambique, 2018. [Google Scholar]
  10. Lumley, T. Analysis of complex survey samples. J. Stat. Softw. 2004, 9, 1–19. [Google Scholar] [CrossRef] [Green Version]
  11. Horvitz, D.G.; Thompson, D.J. A Generalization of Sampling without Replacement from a Finite Universe. Am. Stat. Assoc. 1952, 47, 663–685. [Google Scholar] [CrossRef]
  12. ICF International. Demographic and Health Survey Sampling and Household Listing Manual; MEASURE DHS: Rockville, MD, USA, 2012. [Google Scholar]
  13. Rao, J.N.; Molina, I. Small Area Estimation; Jonh Wiley & Wiley Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
  14. Pfeffermann, D. New important developments in small area estimation. Stat. Sci. 2013, 28, 40–68. [Google Scholar] [CrossRef]
  15. Besag, J.; York, J.; Mollié, A. Bayesian image restoration, with two applications in spatial statistics. Ann. Inst. Stat. Math. 1991, 43, 1–20. [Google Scholar] [CrossRef]
  16. Rue, H.; Martino, S. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B Stat. Methodol. 2009, 71, 319–392. [Google Scholar] [CrossRef]
  17. Raghunathan, T.E.; Xie, D.; Schenker, N.; Van Parsons, L.; Davis, W.W.; Dodd, K.W.; Feuer, E.J. Combining information from two surveys to estimate county-level prevalence rates of cancer risk factors and screening. J. Am. Stat. Assoc. 2007, 102, 474–486. [Google Scholar] [CrossRef] [Green Version]
  18. Rabe-Hesketh, S.; Skrondal, A. Multilevel modelling of complex survey data. J. R. Stat. Soc. Ser. A Stat. Soc. 2006, 169, 805–827. [Google Scholar] [CrossRef]
  19. Kish, L. Methods for design effects. J. Off. Stat. 1995, 11, 55–77. [Google Scholar]
  20. Schrödle, B.; Held, L. Spatio-temporal disease mapping using INLA. Environmetrics 2011, 22, 725–734. [Google Scholar] [CrossRef]
  21. Bivand, R.; Gómez-Rubio, V.; Rue, H. Spatial Data Analysis with R-INLA with Some Extensions. J. Stat. Softw. 2015, 68, 1–31. [Google Scholar]
  22. Blangiardo, M.; Cameletti, M. Spatial and Spatio-Temporal Bayesian Models with R-INLA; John Wiley & Sons: Hoboken, NJ, USA, 2015; ISBN 1118326555. [Google Scholar]
  23. Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; Van Der Linde, A. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B Stat. Methodol. 2002, 64, 583–639. [Google Scholar] [CrossRef] [Green Version]
  24. UNICEF. The State of the World’s Children 2019. Children, Food and Nutrition: Growing well in a Changing World. 2019. Available online: https://data.unicef.org/resources/state-of-the-worlds-children-2019/ (accessed on 21 July 2021).
  25. World Health Organization. Levels and Trends in Child Malnutrition. Joint Child Malnutrition Estimates. Key Findings of the 2017. Available online: https://www.unscn.org/en/resource-center/global-trends-and-emerging-issues?idnews=1709 (accessed on 21 July 2021).
  26. Institute for Health Metrics and Evaluation GBD Compare 2018. Available online: https://www.healthdata.org/gbd/publications (accessed on 21 July 2021).
  27. UNICEF. DATA Diarrhoeal Disease-UNICEF DATA 2019. Available online: https://data.unicef.org/topic/child-health/diarrhoeal-disease/ (accessed on 21 July 2021).
  28. United Nations. The Sustainable Development Goals Report 2016; UN: New York, NY, USA, 2016. [Google Scholar]
  29. World Health Organization. High risk group. Available online: https://www.who.int/ (accessed on 21 July 2021).
  30. Orunmoluyi, O.S.; Gayawan, E.; Manda, S. Spatial Co-Morbidity of Childhood Acute Respiratory Infection, Diarrhoea and Stunting in Nigeria. Int. J. Environ. Res. Public Health 2022, 19, 1838. [Google Scholar] [CrossRef] [PubMed]
  31. Kinyoki, D.K.; Kandala, N.B.; Manda, S.O.; Krainski, E.T.; Fuglstad, G.A.; Moloney, G.M.; Berkley, J.A.; Noor, A.M. Assessing comorbidity and correlates of wasting and stunting among children in Somalia using cross-sectional household surveys: 2007 to 2010. BMJ Open 2016, 6, e009854. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. Map of Malawi showing the 32 districts (a) and map of Mozambique showing the 11 provinces (b).
Figure 2. Maps of the observed (UW: unweighted; HT: Horvitz–Thompson) and spatially estimated prevalences of stunting by district in Malawi, using the 2015–16 MDHS. Estimators: UB, unadjusted binomial (Model 1); LN, logit-normal (Model 2); AN, arcsine square-root transformation (Model 3); PL, pseudo-likelihood (Model 4); ES, effective sample size (Model 5); ES-deff, effective sample size using the design effect (Model 6).
Figure 3. Maps of the observed (UW: unweighted; HT: Horvitz–Thompson) and spatially estimated prevalences of wasting by district in Malawi, using the 2015–16 MDHS. Estimators: UB, unadjusted binomial (Model 1); LN, logit-normal (Model 2); AN, arcsine square-root transformation (Model 3); PL, pseudo-likelihood (Model 4); ES, effective sample size (Model 5); ES-deff, effective sample size using the design effect (Model 6).
Figure 4. Maps of the observed (UW: unweighted; HT: Horvitz–Thompson) and spatially estimated prevalences of underweight by district in Malawi, using the 2015–16 MDHS. Estimators: UB, unadjusted binomial (Model 1); LN, logit-normal (Model 2); AN, arcsine square-root transformation (Model 3); PL, pseudo-likelihood (Model 4); ES, effective sample size (Model 5); ES-deff, effective sample size using the design effect (Model 6).
Figure 5. Maps of the observed (UW: unweighted; HT: Horvitz–Thompson) and spatially estimated prevalences of fever by province in Mozambique, using IMASIDA 2015. Estimators: UB, unadjusted binomial (Model 1); LN, logit-normal (Model 2); AN, arcsine square-root transformation (Model 3); PL, pseudo-likelihood (Model 4); ES, effective sample size (Model 5); ES-deff, effective sample size using the design effect (Model 6).
Figure 6. Maps of the observed (UW: unweighted; HT: Horvitz–Thompson) and spatially estimated prevalences of diarrhoea by province in Mozambique, using IMASIDA 2015. Estimators: UB, unadjusted binomial (Model 1); LN, logit-normal (Model 2); AN, arcsine square-root transformation (Model 3); PL, pseudo-likelihood (Model 4); ES, effective sample size (Model 5).
Table 1. A comparison of spatial models for mapping child stunting in Malawi using 2015-16 MDHS.
Parameters  Model 1           Model 2           Model 3           Model 4           Model 5           Model 6
β_0         −0.605            −0.585            0.634             −0.607            −0.594            −0.609
(CI)        (−0.706; −0.547)  (−0.67; −0.505)   (0.608; 0.659)    (−0.691; −0.526)  (−0.68; −0.512)   (−0.694; −0.528)
Sd          0.04              0.042             0.013             0.042             0.043             0.042
σ_u         0.210             0.170             0.069             0.219             0.188             0.218
σ_v         0.114             0.115             0.053             0.125             0.125             0.126
2LL         −135.88           −24.45            12.88             −137.78           −133.44           −137.21
p(D)        18.54             16.47             23.56             19.55             17.64             19.32
DIC         226.85            6.24              −86.69            229.23            222.80            228.22
Table 2. A comparison of spatial models for mapping child wasting in Malawi using 2015-16 MDHS.
Parameters  Model 1           Model 2           Model 3           Model 4           Model 5           Model 6
β_0         −3.514            −3.458            0.166             −3.577            −3.658            −3.609
(CI)        (−3.715; −3.325)  (−3.661; −3.257)  (0.143; 0.19)     (−3.78; −3.386)   (−3.87; −3.457)   (−3.814; −3.416)
Sd          0.099             0.103             0.012             0.1               0.105             0.101
σ_u^2       0.493             0.353             0.061             0.486             0.519             0.467
σ_v^2       0.148             0.121             0.0481            0.144             0.152             0.143
2LL         −95.22            −48.39            18.11             −94.04            −93.43            −92.31
p(D)        13.94             9.68              22.59             13.13             13.22             12.44
DIC         150.84            60.80             −94.76            148.73            147.06            145.98
Table 3. A comparison of spatial models for mapping child underweight in Malawi using 2015-16 MDHS.
Parameters  Model 1           Model 2           Model 3           Model 4           Model 5           Model 6
β_0         −2                −1.981            0.344             −2.003            −2.022            −2.008
(CI)        (−2.109; −1.897)  (−2.09; −1.877)   (0.32; 0.368)     (−2.112; −1.901)  (−2.133; −1.916)  (−2.116; −1.905)
Sd          0.054             0.054             0.012             0.053             0.055
σ_u         0.1390            0.1197            0.0634            0.1523            0.1400            0.1420
σ_v         0.1371            0.1139            0.0495            0.1286            0.1261            0.1321
2LL         −117.62           −29.98            15.99             −117.38           −113.23           −115.35
p(D)        13.38             10.42             22.91             13.41             12.23             12.91
DIC         197.75            24.96             −89.77            197.28            190.55            193.68
Table 4. A comparison of spatial models for mapping child fever in Mozambique using IMASIDA 2015.
Parameters  Model 1           Model 2           Model 3           Model 4           Model 5           Model 6
β_0         −1.137            −1.123            0.523             −1.129            −1.13             −1.13
(CI)        (−1.433; −0.844)  (−1.453; −1.123)  (−0.45; 0.597)    (−1.496; −0.767)  (−1.458; −0.805)  (−1.458; −0.805)
Sd          0.146             0.162             0.036             0.183             0.161
σ_u         0.1734            0.1863            0.1046            0.1854            0.1848            0.1848
σ_v         0.4660            0.5123            0.1094            0.5241            0.5172            0.5172
2LL         −62.86            −16.20            −0.834            −63.98            −61.28            −61.28
p(D)        10.52             10.28             10.66             10.60             10.50             10.50
DIC         89.60             0.135             −37.64            89.54             86.86             86.86
Table 5. A comparison of spatial models for mapping child diarrhoea in Mozambique using IMASIDA 2015.
Parameters  Model 1           Model 2           Model 3           Model 4           Model 5
β_0         −2.145            −2.14             0.329             −1.945            −2.152
(CI)        (−2.319; −1.979)  (−2.335; −1.956)  (0.283; 0.374)    (−2.189; −1.708)  (−2.34; −1.975)
Sd          0.085             0.095             0.023             0.118             0.091
σ_u^2       0.1651            0.1799            0.0748            0.2303            0.1724
σ_v^2       0.2257            0.2231            0.0640            0.3438            0.2442
2LL         −50.27            −10.40            4.37              −53.70            −48.96
p(D)        8.75              7.90              10.15             9.83              8.64
DIC         81.04             4.30              −39.23            81.71             78.83
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
