Article

A Mixture Quantitative Randomized Response Model That Improves Trust in RRT Methodology

1
Department of Mathematics and Statistics, College of Arts and Sciences, UNC Greensboro, Greensboro, NC 27412, USA
2
Department of Statistics, Lahore College for Women University, Lahore 54000, Pakistan
*
Author to whom correspondence should be addressed.
Axioms 2024, 13(1), 11; https://doi.org/10.3390/axioms13010011
Submission received: 23 November 2023 / Revised: 12 December 2023 / Accepted: 14 December 2023 / Published: 22 December 2023
(This article belongs to the Special Issue Computational Statistics and Its Applications)

Abstract

The Quantitative Randomized Response Technique (RRT) can be used by researchers to obtain honest answers to questions that, due to their sensitive (socially undesirable, dangerous, or even illegal) nature, might otherwise invoke partially or completely falsified responses. Over the years, Quantitative RRT models, sometimes called Scrambling models, have been developed to incorporate such advancements as mixture, optionality and enhanced trust, each of which has important benefits. However, no single model incorporates all of these features. In this study, we propose just such a unified model, which we call the Mixture Optional Enhanced Trust (MOET) model. After developing methodologies to assess MOET based on standard approaches and using them to explore the key characteristics of the new model, we show that MOET has superior efficiency compared to the Quantitative Optional Enhanced Trust (OET) model. We also show that use of the model’s mixture capability allows practitioners to optimally balance the model’s efficiency with its privacy, making the model adaptable to a wide variety of research scenarios.

1. Introduction

When faced with uncomfortable or sensitive quantitative questions (for example, “What is your IQ?” or “What is your personal income level?”), respondents may modify or outright falsify their answers. This untruthfulness creates a significant problem for researchers; consequently, statisticians have developed a variety of clever techniques designed to encourage truthfulness in scenarios like these. Some of these techniques, for example, the Unmatched Count Technique (Raghavarao and Federer, 1979) and Social Desirability Scale based techniques (Reynolds, 1982), are best suited for binary applications [1,2]. Others, for example, the Bogus Pipeline technique (Jones and Sigall, 1971) and some Randomized Response Techniques (Warner 1971, Greenberg et al., 1971, Gupta et al., 2022, etc.), apply well in quantitative scenarios [3,4,5,6].
This study focuses on certain Quantitative Randomized Response Techniques. The RRT was pioneered by Warner, who proposed a binary-question RRT in 1965. Six years later, Warner proposed a new technique, this one applicable to quantitative questions. According to this technique, researchers would instruct respondents to apply random noise to their quantitative responses via additive or multiplicative scrambling variables. As the researcher would only see scrambled responses, the respondents’ true answers would remain “hidden” or “confidential.” The idea was to make the respondent feel comfortable answering a sensitive question truthfully, knowing that their response would be obfuscated by the scrambling. Statistical techniques, taking into account the known distribution of the scrambling variable, could then be used across the group of responses to back-solve for the group-level mean response to the sensitive question.
Greenberg et al. (1971) developed another Quantitative RRT model based on an entirely different mechanism [5]. Rather than asking each respondent the sensitive question and then instructing them to scramble their response, Greenberg et al. suggested that each respondent should answer one of two questions (either the sensitive quantitative question or some unrelated and nonsensitive quantitative question with a known probability distribution) based on a random assignment unknown to the researcher. While the researcher would see all of the respondents’ responses, the respondents’ confidentiality would nonetheless be maintained because the researcher would not know which question each individual respondent was answering.
Several advancements to Warner’s and Greenberg’s models were made over the years. Mehta and Aggarwal (2018) proposed a means of estimating the sensitivity level of a sensitive binary question [7]. Different kinds of scrambling techniques were explored by Diana and Perri (2011), Singh et al. (2018), and Priyanka and Trisandhya (2019) [8,9,10]. Gupta et al. (2002) showed that adding optionality to RRT models (respondents may opt in or opt out of the RRT according to whether they personally find the “sensitive” question to be sensitive) significantly improved model efficiency [11]. This same concept of optionality enabled measurement of the level of a quantitative question’s sensitivity. Additionally, Gupta et al. (2022) introduced an enhanced trust feature that enabled respondents to opt for greater levels of scrambling if they felt their responses were not being sufficiently obscured by additive scrambling alone [6].
The importance of respondent privacy was formally recognized during this timeframe as a key RRT model attribute, necessary for motivating respondent truthfulness, and Gupta et al. (2018) developed a “unified measure” designed to evaluate quantitative RRT models based on a single statistic that incorporated the competing elements of privacy and efficiency [12]. Vishwakarma et al. (2023) developed a two-stage unrelated randomized response model [13]. Narjis and Shabbir (2021) proposed an RRT model for estimating rare sensitive attributes using a Poisson distribution [14]. The concept of “mixture” (combining Warner-based and Greenberg-based constructs into a single model where respondents are randomly assigned to one technique or the other) was incorporated into binary RRT models by Lovig et al. (2021), but “mixture” has not been incorporated into a quantitative model prior to this study [15].
The model we propose in this study, which we call the Mixture Optional Enhanced Trust (MOET) model, incorporates optionality, enhanced trust, and mixture, thereby consolidating many of the advantages of predecessor models into a single model.
In Section 2 of this study, we will explore the key metrics that we will use to evaluate and compare RRT models. In Section 3, we will propose the Mixture Optional Enhanced Trust (MOET) model and derive estimators for all of its key attributes. Section 4 will be devoted to computer simulations. We will observe the simulations’ output both as a way of understanding the model’s behavior and in contrast to the OET model.

2. Materials and Methods

In this section, we will discuss the methods used to measure the key characteristics of RRT models, which enable the comparisons between models shown later in the study. A measure of efficiency (Mean Squared Error, MSE), a measure of privacy ($\nabla$), and a unified measure that incorporates both efficiency and privacy ($\delta$) are provided. All three of these metrics have been commonly used by other researchers to quantify and compare the efficacy of RRT models. These are the metrics we will use to quantify key attributes of the MOET model. We will use the same metrics to compare the characteristics of the MOET model to those of the OET model.

2.1. Efficiency Metric

The efficiency of any estimator $\hat{\mu}_Y$ can be quantified by its MSE, denoted by $\operatorname{MSE}(\hat{\mu}_Y)$.
$$\operatorname{MSE}(\hat{\mu}_Y) = E[(\hat{\mu}_Y - \mu_Y)^2]. \tag{1}$$
Smaller MSE values are preferred, as high levels of efficiency are achieved when MSE is small.

2.2. Privacy Metric

A measure of privacy proposed by Yan et al. (2008) [16] and commonly used in quantitative models is given by
$$\nabla = E[(Z - Y)^2], \tag{2}$$
where $Y$ represents a respondent’s true response to a sensitive question, while $Z$ represents the respondent’s reported response (which may be scrambled). One can think of $\nabla$ as a measurement of the “hiddenness” of a respondent’s true response. Clearly, when respondents’ reported responses lie far from respondents’ true responses, this metric will be large. Higher values of $\nabla$ are preferred. Below, we consider the calculation of $E[(Z - Y)^2]$ when $Z$ is defined in various ways (we will call $Z$ “$Z_i$” when $Z$ is defined in the $i$th unique way). These calculations will facilitate the calculation of an expression for the privacy of the MOET model in Section 3 of this study.
First, consider the trivial case of no scrambling. In this case we have $Z = Z_1 = Y$. Consistent with intuition, we have
$$E[(Z_1 - Y)^2] = E[(Y - Y)^2] = 0. \tag{3}$$
In a case where scrambling is introduced into $Z$ via an additive scrambling variable $S$ (where the distribution of $S$ is known and $E(S) = 0$), we have that $Z = Z_2 = Y + S$, and
$$E[(Z_2 - Y)^2] = E[(Y + S - Y)^2] = \sigma_S^2. \tag{4}$$
When multiplicative scrambling is implemented via a scrambling variable $T$ (where the distribution of $T$ is known and $E(T) = 1$) and additive scrambling is also included as above, we have that $Z = Z_3 = TY + S$, and
$$E[(Z_3 - Y)^2] = E[(TY + S - Y)^2] = E(T^2 Y^2) - E(Y^2) + E(S^2) = (\sigma_Y^2 + \mu_Y^2)\,\sigma_T^2 + \sigma_S^2. \tag{5}$$
Finally, when an unrelated question is implemented, we have that $Z = Z_4 = R$, and
$$E[(Z_4 - Y)^2] = E[(R - Y)^2] = E(R^2) + E(Y^2) - 2E(R)E(Y) = \sigma_Y^2 + \sigma_R^2 + (\mu_Y - \mu_R)^2. \tag{6}$$
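The closed forms above can be sanity-checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the normal distributions chosen for $Y$, $S$, $T$ and $R$ are an assumption made for the check (the formulas themselves only require the stated means, variances and independence).

```python
import numpy as np

def privacy_additive(sigma_s):
    """E[(Z_2 - Y)^2] for Z_2 = Y + S with E(S) = 0."""
    return sigma_s**2

def privacy_mult_additive(mu_y, sigma_y, sigma_s, sigma_t):
    """E[(Z_3 - Y)^2] for Z_3 = T*Y + S with E(T) = 1 and E(S) = 0."""
    return (sigma_y**2 + mu_y**2) * sigma_t**2 + sigma_s**2

def privacy_unrelated(mu_y, sigma_y, mu_r, sigma_r):
    """E[(Z_4 - Y)^2] for Z_4 = R, with R independent of Y."""
    return sigma_y**2 + sigma_r**2 + (mu_y - mu_r)**2

# Monte Carlo check; normality is an illustrative assumption, not from the paper.
rng = np.random.default_rng(0)
n = 400_000
y = rng.normal(2.0, 1.0, n)   # true responses
s = rng.normal(0.0, 1.0, n)   # additive scrambling, mean 0
t = rng.normal(1.0, 1.0, n)   # multiplicative scrambling, mean 1
r = rng.normal(1.0, 1.0, n)   # unrelated-question responses

mc_additive = np.mean((y + s - y) ** 2)
mc_mult_additive = np.mean((t * y + s - y) ** 2)
mc_unrelated = np.mean((r - y) ** 2)

print("additive:      ", privacy_additive(1.0), mc_additive)
print("mult + additive:", privacy_mult_additive(2.0, 1.0, 1.0, 1.0), mc_mult_additive)
print("unrelated:     ", privacy_unrelated(2.0, 1.0, 1.0, 1.0), mc_unrelated)
```

Each simulated value should land close to its closed-form counterpart, with the gap shrinking as `n` grows.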
We now recall that Gupta et al. (2018) showed that optionality does not compromise privacy because respondents who do not consider a question to be sensitive do not value privacy [12]. Hence, the privacy of an optional model is the same as that of a model where $W$, which we define as the sensitivity level of the sensitive question, is equal to 1. Finally, we note that the privacy of a composite mixture model (MOET mixes Greenberg and Warner components) can be represented as the privacy associated with $Z$ being defined in each of $m$ unique ways within the model (denoted $Z_1, Z_2, \ldots, Z_m$). That is,
$$Z = \begin{cases} Z_1 & \text{with probability } q_1 \\ Z_2 & \text{with probability } q_2 \\ \;\vdots & \\ Z_m & \text{with probability } q_m \end{cases}$$
From this we can write:
$$\nabla^a = \sum_{j=1}^{m} q_j\, E[(Z_j - Y)^2]. \tag{7}$$
In this expression:
  • $m$ is the number of ways $Z$ is uniquely defined within the model.
  • $j$ is a particular categorical way that $Z$ is defined in the model.
  • $q_j$ is the probability that category $j$ captures the respondent’s response.
  • The superscript $a$ indicates that privacy is adjusted according to Gupta et al.’s (2018) [12] optionality adjustment ($W = 1$).
Section 3.3 of this study shows how this formula for $\nabla^a$ applies to the MOET model.

2.3. Unified Measure of Efficiency and Privacy

The following unified measure from Gupta et al. (2018) simultaneously evaluates a quantitative RRT model for its efficiency and for its privacy [12]:
$$\delta^a = \frac{\operatorname{MSE}(\hat{\mu}_Y)}{\nabla^a}. \tag{8}$$
Low values of $\delta^a$ are preferred, as both high privacy and high efficiency (low MSE) lead to a smaller $\delta^a$.

3. Proposed Mixture Optional Enhanced Trust Model (MOET)

We now propose the Mixture Optional Enhanced Trust model. This model features strong efficiency and unified measure performance. Additionally, the model’s many features—optionality, mixture, and enhanced trust scrambling—make it highly flexible and therefore useful in a wide variety of research settings characterized by different kinds of sensitive questions and demanding different levels of efficiency and privacy.

3.1. MOET Model Introduction

Let $Y$ be the respondent’s true response to the sensitive question, while $Z$ is their reported response. $S$ and $T$ are additive and multiplicative scrambling random variables, and $R$ is a random variable representing a respondent’s response to the Greenberg unrelated question. Here $Y$, $S$, $T$, and $R$ are mutually independent. $W$ represents the respondent’s choice to take part in some form of an RRT (rather than simply giving a straight answer to the sensitive question without an RRT) and therefore can be considered a measure of the sensitivity of the sensitive question. The parameter $\alpha$ represents the proportion of respondents randomly assigned to the Warner-based model. As the model below implies, $p$ is the probability that a respondent assigned to the Greenberg-based model is directed to the sensitive question rather than the unrelated question. Note that when $\alpha = 1$, the MOET model becomes the OET model (see Appendix A). $A$ represents a respondent’s trust in the RRT model in the absence of additional scrambling, while $(1 - A)$ represents the proportion of respondents who require additional scrambling in order to trust the RRT.
Below, Figure 1 presents a diagram of the MOET model:
$$Z = \begin{cases} Y + S & \text{with probability } W\alpha A \\ TY + S & \text{with probability } W(1 - A)(\alpha + p - \alpha p) \\ Y & \text{with probability } W(1 - \alpha)pA + (1 - W) \\ R & \text{with probability } W(1 - \alpha)(1 - p). \end{cases} \tag{9}$$
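As a quick consistency check on the model above, the four branch probabilities should sum to one for any admissible parameter values. The following is a minimal sketch; the function name and the example parameter values are hypothetical and chosen only for illustration.

```python
def moet_branch_probs(W, alpha, A, p):
    """Probability of each reported-response form under the MOET model."""
    return {
        "Y + S": W * alpha * A,
        "T*Y + S": W * (1 - A) * (alpha + p - alpha * p),
        "Y": W * (1 - alpha) * p * A + (1 - W),
        "R": W * (1 - alpha) * (1 - p),
    }

# Hypothetical scenario, not taken from the paper's tables.
probs = moet_branch_probs(W=0.6, alpha=0.4, A=0.9, p=0.85)
total = sum(probs.values())

for form, prob in probs.items():
    print(f"{form:8s} {prob:.4f}")
print("total   ", round(total, 12))

# With alpha = 1 the unrelated-question branch disappears, leaving only
# the three response forms of the OET model.
oet_like = moet_branch_probs(W=0.6, alpha=1.0, A=0.9, p=0.85)
```

Setting `alpha=1.0` gives probabilities $1 - W$, $WA$ and $W(1 - A)$ for $Y$, $Y + S$ and $TY + S$, which is the sense in which MOET reduces to OET.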

3.2. MOET: Mean Estimator

Choice of random variables $S$ and $T$ should be made such that $E(S) = 0$ and $E(T) = 1$. It follows from the MOET model (9) that
$$E(Z) = \mu_Z = W(1 - \alpha)(1 - p)\,\mu_R + [1 - W(1 - \alpha)(1 - p)]\,\mu_Y. \tag{10}$$
Using a split-sample approach with $p_1$, $p_2$, where $p_1 \neq p_2$ and $E(Z_i)$ is estimated by $\bar{Z}_i$, we have
$$\bar{Z}_i = W(1 - \alpha)(1 - p_i)\,\mu_R + [1 - W(1 - \alpha)(1 - p_i)]\,\hat{\mu}_Y, \qquad i = 1, 2. \tag{11}$$
Therefore, the mean of the sensitive trait $\mu_Y$ can be estimated by
$$\hat{\mu}_Y = \frac{1 - p_1}{p_2 - p_1}\,\bar{Z}_2 - \frac{1 - p_2}{p_2 - p_1}\,\bar{Z}_1. \tag{12}$$
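The estimator above can be checked by feeding it the exact expected responses of the two subsamples in place of sample means: it should then return $\mu_Y$ exactly, whatever the values of $W$ and $\alpha$. This is a minimal sketch with hypothetical function names and an arbitrary illustrative scenario.

```python
def expected_response(W, alpha, p, mu_y, mu_r):
    """E(Z) under the MOET model for a subsample that uses probability p."""
    lam = (1 - alpha) * (1 - p)
    return W * lam * mu_r + (1 - W * lam) * mu_y

def moet_mean_estimate(zbar1, zbar2, p1, p2):
    """Split-sample estimator of mu_Y from the two subsample means."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    return ((1 - p1) * zbar2 - (1 - p2) * zbar1) / (p2 - p1)

# Hypothetical scenario: exact expectations stand in for the sample means.
W, alpha, mu_y, mu_r = 0.6, 0.4, 2.0, 1.0
p1, p2 = 0.85, 0.15
zbar1 = expected_response(W, alpha, p1, mu_y, mu_r)
zbar2 = expected_response(W, alpha, p2, mu_y, mu_r)
mu_hat = moet_mean_estimate(zbar1, zbar2, p1, p2)
print(mu_hat)
```

Note that neither `W` nor `alpha` is passed to `moet_mean_estimate`; the term involving them cancels when the two subsample equations are combined.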
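The variance/MSE of our unbiased estimator, using an equal split of the sample between the two subsamples for convenience, is given by
$$\operatorname{Var}(\hat{\mu}_Y) = \left(\frac{1 - p_1}{p_2 - p_1}\right)^2 \operatorname{Var}(\bar{Z}_2) + \left(\frac{1 - p_2}{p_2 - p_1}\right)^2 \operatorname{Var}(\bar{Z}_1), \tag{13}$$
where
$$\operatorname{Var}(\bar{Z}_i) = \frac{2}{n}\Big[ W(1 - \lambda_i - A\phi_i)\,\sigma_S^2 + \big[W(1 - \lambda_i)\big((1 - A)\sigma_T^2 + 1\big) + 1 - W\big](\sigma_Y^2 + \mu_Y^2) + W\lambda_i(\sigma_R^2 + \mu_R^2) - \big(\mu_Y + (\mu_R - \mu_Y)W\lambda_i\big)^2 \Big],$$
$$\lambda_i = (1 - \alpha)(1 - p_i), \qquad \phi_i = p_i(1 - \alpha), \qquad i = 1, 2, \qquad p_1 \neq p_2.$$
Neither $A$ nor $W$ values are needed to estimate $\mu_Y$. $\operatorname{Var}(\hat{\mu}_Y)$ may be estimated by the sample variance of $\hat{\mu}_Y$ values.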
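As a worked check of the variance expression, take the parameter values listed in the Table 1 caption ($\mu_Y = \mu_R = 2$, all standard deviations equal to 1, $n = 500$, and $p_1$, $p_2$ equal to 0.85 and 0.15) with $A = W = \alpha = 1$, so that $\lambda_i = \phi_i = 0$. The arithmetic below is a sketch of that single case only.

```latex
\[
\operatorname{Var}(\bar{Z}_i)
  = \frac{2}{500}\Big[\,1\cdot(1-0-0)\cdot 1 \;+\; \big[1\cdot 1\cdot(0+1)+0\big](1+4) \;+\; 0 \;-\; 2^2\Big]
  = \frac{2}{500}\,(1+5-4) = 0.008 ,
\]
\[
\operatorname{Var}(\hat{\mu}_Y)
  = \frac{(0.15)^2 + (0.85)^2}{(0.70)^2}\times 0.008
  = \frac{0.745}{0.49}\times 0.008 \approx 0.0122 .
\]
```

This agrees with the theoretical MSE reported in the first row of Table 1.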

3.3. MOET: Privacy Measure

Per Equation (7), the privacy of the MOET model is given by
$$\nabla^a = q_1\nabla_1 + q_2\nabla_2 + q_3\nabla_3 + q_4\nabla_4. \tag{14}$$
Analysis of the diagram of the MOET model in Section 3.1 above shows that $q_1 = W(1 - \alpha)pA + 1 - W$, $q_2 = W\alpha A$, $q_3 = W(1 - A)(\alpha + p - \alpha p)$, $q_4 = W(1 - \alpha)(1 - p)$ and $\nabla_1 = 0$, $\nabla_2 = E(S^2)$, $\nabla_3 = E[(TY + S - Y)^2]$, $\nabla_4 = E[(R - Y)^2]$. Estimating $E(Z)$ by $\bar{Z}$ and setting $W = 1$ per the optionality adjustment described in Section 2.2, we can write
$$\nabla^a = \alpha A\,E(S^2) + (1 - A)(\alpha + p - \alpha p)\,E[(TY + S - Y)^2] + (1 - \alpha)(1 - p)\,E[(R - Y)^2]. \tag{15}$$
Reducing further, we have the following expression, which represents privacy for the MOET model. The assumption of equal-split sampling underlies this formula, but a similar formula could easily be developed to represent unequal splitting.
$$\nabla^a = \frac{1}{2}\Big\{ 2\alpha A\,\sigma_S^2 + (2 - \lambda_1 - \lambda_2)(1 - A)\big[(\sigma_Y^2 + \mu_Y^2)\sigma_T^2 + \sigma_S^2\big] + (\lambda_1 + \lambda_2)\big[\sigma_Y^2 + \sigma_R^2 + (\mu_Y - \mu_R)^2\big] \Big\}, \tag{16}$$
$$\lambda_1 = (1 - \alpha)(1 - p_1), \qquad \lambda_2 = (1 - \alpha)(1 - p_2), \qquad p_1 \neq p_2.$$
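To illustrate the privacy formula, the following worked evaluation uses the Table 1 parameter values ($\mu_Y = \mu_R = 2$, all standard deviations equal to 1, and $p_1$, $p_2$ equal to 0.85 and 0.15) with $A = 0.9$, at the two extremes $\alpha = 0$ and $\alpha = 1$. It is a sketch of these two cases only.

```latex
\[
\alpha = 0:\quad \lambda_1 + \lambda_2 = 0.15 + 0.85 = 1, \qquad
\nabla^a = \tfrac{1}{2}\Big\{\,0 + (2-1)(0.1)\big[(1+4)(1)+1\big] + (1)\big[1+1+0\big]\Big\}
         = \tfrac{1}{2}\,(0.6 + 2) = 1.3 ,
\]
\[
\alpha = 1:\quad \lambda_1 = \lambda_2 = 0, \qquad
\nabla^a = \tfrac{1}{2}\Big\{\,2(1)(0.9)(1) + (2)(0.1)\big[(1+4)(1)+1\big] + 0\Big\}
         = \tfrac{1}{2}\,(1.8 + 1.2) = 1.5 .
\]
```

These are the theoretical privacy values of 1.3 and 1.5 quoted for $A = 0.9$ in Section 4.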

3.4. MOET: Sensitivity Estimator

Recall that the two samples used to estimate $\mu_Y$ yield the equations
$$E(Z_i) - \mu_Y = W\lambda_i(\mu_R - \mu_Y), \qquad i = 1, 2. \tag{17}$$
Estimating $E(Z_i)$ by $\bar{Z}_i$ and $\mu_Y$ by $\hat{\mu}_Y$, we have:
$$\bar{Z}_i - \hat{\mu}_Y = \hat{W}\lambda_i(\mu_R - \hat{\mu}_Y), \qquad i = 1, 2. \tag{18}$$
Solving the above expression for $W$ in samples 1 and 2 and then combining estimates, we have:
$$\hat{W} = \frac{\lambda_2(\bar{Z}_1 - \hat{\mu}_Y) + \lambda_1(\bar{Z}_2 - \hat{\mu}_Y)}{2\lambda_1\lambda_2(\mu_R - \hat{\mu}_Y)}, \qquad \lambda_i = (1 - \alpha)(1 - p_i), \qquad p_1 \neq p_2. \tag{19}$$
Inserting our estimator $\hat{\mu}_Y$, this expression reduces to
$$\hat{W} = \frac{\bar{Z}_1 - \bar{Z}_2}{\lambda_1(\mu_R - \bar{Z}_2) - \lambda_2(\mu_R - \bar{Z}_1)}, \qquad \lambda_i = (1 - \alpha)(1 - p_i), \qquad p_1 \neq p_2. \tag{20}$$
From Equation (19) one can see that this estimator becomes unstable when $\mu_R$ is close to $\mu_Y$.
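The reduced form of the sensitivity estimator can be checked in the same way as the mean estimator: supplying the exact expected subsample responses should return the true $W$. The sketch below uses hypothetical function names and an arbitrary scenario with $\mu_R \neq \mu_Y$; it also guards against a zero denominator, which arises, for example, when $\alpha = 1$ (so that $\lambda_1 = \lambda_2 = 0$) or when $\mu_R = \mu_Y$.

```python
def expected_response(W, alpha, p, mu_y, mu_r):
    """E(Z) under the MOET model for a subsample that uses probability p."""
    lam = (1 - alpha) * (1 - p)
    return mu_y + W * lam * (mu_r - mu_y)

def moet_sensitivity_estimate(zbar1, zbar2, alpha, p1, p2, mu_r):
    """Estimate W from the two subsample means (reduced form of the estimator)."""
    lam1 = (1 - alpha) * (1 - p1)
    lam2 = (1 - alpha) * (1 - p2)
    denom = lam1 * (mu_r - zbar2) - lam2 * (mu_r - zbar1)
    if denom == 0:
        raise ValueError("W is not identifiable: denominator is zero")
    return (zbar1 - zbar2) / denom

# Hypothetical scenario: exact expectations stand in for the sample means.
W_true, alpha, mu_y, mu_r = 0.6, 0.4, 2.0, 1.0
p1, p2 = 0.85, 0.15
zbar1 = expected_response(W_true, alpha, p1, mu_y, mu_r)
zbar2 = expected_response(W_true, alpha, p2, mu_y, mu_r)
w_hat = moet_sensitivity_estimate(zbar1, zbar2, alpha, p1, p2, mu_r)
print(w_hat)
```

With real sample means, the denominator is small whenever $\mu_R$ is near $\mu_Y$, which is the instability noted above.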

4. Discussion

In this section of the study, we provide two tables that represent theoretical and simulated values and illustrate important characteristics of the MOET model. Each row of each table represents the output from a particular scenario (combination of A, W, and α values). Each scenario represents a sampling of n = 500 respondents within an RRT sampling scenario, and each simulation sampling is conducted N = 10,000 times in R software. Table 1 shows that the estimators developed in this study perform in line with theoretical expectations. Table 1 also establishes the value of the mixture model by showing that the MOET model—which is fundamentally a mixture of a Greenberg-based and a Warner-based model—yields performance preferable to either model on its own. Table 2 compares MOET model statistics to published OET model statistics in Gupta et al. (2022), and thereby demonstrates the MSE and unified measure superiority of the MOET model to the OET model [6].
Table 1 shows the results from the MOET model run according to the parameter values listed. The close fit between the empirical and theoretical values speaks to the veracity of the formulae proposed in this study, as well as the accuracy of the empirical simulations.
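The simulations in this study were run in R; the Python sketch below is not the authors' code but a minimal re-creation of one Table 1 scenario ($A = 1$, $W = 1$, $\alpha = 0.6$) under stated assumptions: normal distributions for $Y$, $S$, $T$ and $R$ (the distributions are not specified in this section), an equal split of $n = 500$ into two subsamples, and only 2000 replications to keep the run short. Function names are hypothetical.

```python
import numpy as np

def simulate_zbar(rng, reps, m, W, alpha, A, p, mu_y, mu_r):
    """Return `reps` subsample means of m MOET responses (all sigmas = 1)."""
    y = rng.normal(mu_y, 1.0, (reps, m))
    s = rng.normal(0.0, 1.0, (reps, m))
    t = rng.normal(1.0, 1.0, (reps, m))
    r = rng.normal(mu_r, 1.0, (reps, m))
    probs = [
        W * alpha * A,                              # Y + S
        W * (1 - A) * (alpha + p - alpha * p),      # T*Y + S
        W * (1 - alpha) * p * A + (1 - W),          # Y
        W * (1 - alpha) * (1 - p),                  # R
    ]
    branch = rng.choice(4, size=(reps, m), p=probs)
    z = np.select(
        [branch == 0, branch == 1, branch == 2, branch == 3],
        [y + s, t * y + s, y, r],
    )
    return z.mean(axis=1)

def empirical_mse(reps, n, W, alpha, A, p1, p2, mu_y=2.0, mu_r=2.0, seed=0):
    """Empirical mean and MSE of the split-sample estimator of mu_Y."""
    rng = np.random.default_rng(seed)
    m = n // 2  # equal split between the two subsamples
    zbar1 = simulate_zbar(rng, reps, m, W, alpha, A, p1, mu_y, mu_r)
    zbar2 = simulate_zbar(rng, reps, m, W, alpha, A, p2, mu_y, mu_r)
    mu_hat = ((1 - p1) * zbar2 - (1 - p2) * zbar1) / (p2 - p1)
    return mu_hat.mean(), np.mean((mu_hat - mu_y) ** 2)

mean_hat, mse_hat = empirical_mse(
    reps=2000, n=500, W=1.0, alpha=0.6, A=1.0, p1=0.85, p2=0.15
)
print(mean_hat, mse_hat)
```

The printed MSE can be compared with the theoretical value of 0.0097 that Table 1 reports for this scenario; with far fewer replications than the 10,000 used in the study, agreement will be looser.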
We draw attention to two particularly important aspects of the scenario presented in Table 1. First, observe that in the shown scenario, maximum efficiency (minimum MSE) occurs when $\alpha = 0$ for any pairing of $A$ and $W$ values. Mathematically, this does not have to be true in every scenario. For example, in the scenario underlying Table 1, the extreme choice of $\sigma_R^2 = 7$ with all other assumptions left unchanged will lead to a circumstance where maximum efficiency is found at $\alpha = 1$ rather than $\alpha = 0$. To study this important relationship closely, we let $f(\alpha) = \operatorname{Var}(\hat{\mu}_Y)$, per Equation (13), and then take the derivative of this expression with respect to $\alpha$. Expressing the result in the form below, we can see that $f'(\alpha)$ will always be positive under certain conditions.
$$f'(\alpha) = \left(\frac{1 - p_1}{p_2 - p_1}\right)^2 \frac{2}{n}\,(\psi_2 + X_2 + \Upsilon_2 + \zeta_2) + \left(\frac{1 - p_2}{p_2 - p_1}\right)^2 \frac{2}{n}\,(\psi_1 + X_1 + \Upsilon_1 + \zeta_1), \tag{21}$$
$$\psi_i = A W p_i\,\sigma_S^2,$$
$$X_i = (1 - p_i)\,W \Big( \big[(1 - A)\sigma_T^2 + 1\big](\sigma_Y^2 + \mu_Y^2) + \sigma_S^2 - \sigma_R^2 - \mu_R^2 \Big),$$
$$\Upsilon_i = 2(\mu_R - \mu_Y)\,W(1 - p_i)\,\mu_Y,$$
$$\zeta_i = 2(1 - \alpha)(\mu_R - \mu_Y)^2\,W^2 (1 - p_i)^2,$$
$$i = 1, 2.$$
Specifically, under the following two sufficient but not necessary conditions, $f'(\alpha)$ will be positive across all $\alpha$ values, so $\operatorname{Var}(\hat{\mu}_Y)$ will be an increasing function of $\alpha$. That is, maximum efficiency (minimum MSE) will always occur when $\alpha = 0$ if
$$\mu_R = \mu_Y, \tag{22a}$$
$$\big[(1 - A)\sigma_T^2 + 1\big](\sigma_Y^2 + \mu_Y^2) + \sigma_S^2 > \sigma_R^2 + \mu_R^2. \tag{22b}$$
The first condition sets $\mu_R$ equal to $\mu_Y$, which seems both reasonable and intuitively appealing, as the response to the unrelated question should be designed to have a similar size to that of the sensitive question under study. Because the first condition fixes $\mu_R$, the second condition puts an upper limit on the size of $\sigma_R^2$. This is not a very restrictive condition. Note that with $A = 1$, condition (22b) for the scenario presented in Table 1 reduces to $\sigma_R^2 < 2$. Our choice of $\sigma_R^2 = 1$ meets the condition comfortably.
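The reduction of condition (22b) to a bound on $\sigma_R^2$ can be written out explicitly. The arithmetic below is a sketch for the Table 1 parameter values ($\mu_Y = \mu_R = 2$, $\sigma_Y = \sigma_S = \sigma_T = 1$) with $A = 1$.

```latex
\[
\big[(1-A)\sigma_T^2 + 1\big](\sigma_Y^2 + \mu_Y^2) + \sigma_S^2
  = \big[(0)(1) + 1\big](1 + 4) + 1 = 6 ,
\qquad
\sigma_R^2 + \mu_R^2 = \sigma_R^2 + 4 ,
\]
\[
6 > \sigma_R^2 + 4 \iff \sigma_R^2 < 2 .
\]
```

For $A < 1$ the left-hand side grows by $(1 - A)\sigma_T^2(\sigma_Y^2 + \mu_Y^2)$, so the bound on $\sigma_R^2$ becomes looser.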
The second important aspect of Table 1 is that it illustrates the importance of the mixture feature of the model. The mixture capability of MOET is necessary to strike an optimal balance between the model’s efficiency and its privacy. This is clear because maximum efficiency is typically achieved when $\alpha$ is low (mixture leans toward the Greenberg model), while maximum privacy is achieved when $\alpha$ is high (mixture leans toward the Warner model), provided $A < 1$. For example, when $A = 0.9$ and $W = 1$, theoretical privacy improves from 1.3 to 1.5 as $\alpha$ values increase from 0 to 1, but at the same time that privacy improves, efficiency declines (MSE increases from 0.0091 to 0.0152). Since RRT models must achieve strong levels of both privacy and efficiency, an $\alpha$ value of 0 or 1 (full Greenberg or full Warner) would be a poor choice; it would either sacrifice the model’s privacy to maximize its efficiency or vice versa. A choice of $\alpha$ between 0 and 1 (some mixture) will better balance the competing concerns.
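The tradeoff described above can be traced across a grid of $\alpha$ values using the theoretical variance and privacy expressions from Sections 3.2 and 3.3. The sketch below is illustrative: the function names are hypothetical, and the parameter values are those of the Table 1 caption with $A = 0.9$ and $W = 1$.

```python
def moet_mse(alpha, A, W, p1, p2, n, mu_y, sigma_y, sigma_s, sigma_t, mu_r, sigma_r):
    """Theoretical Var(mu_hat_Y) under an equal split of n respondents."""
    def var_zbar(p):
        lam = (1 - alpha) * (1 - p)
        phi = p * (1 - alpha)
        ey2 = sigma_y**2 + mu_y**2
        inner = (
            W * (1 - lam - A * phi) * sigma_s**2
            + (W * (1 - lam) * ((1 - A) * sigma_t**2 + 1) + 1 - W) * ey2
            + W * lam * (sigma_r**2 + mu_r**2)
            - (mu_y + (mu_r - mu_y) * W * lam) ** 2
        )
        return 2.0 / n * inner
    c2 = ((1 - p1) / (p2 - p1)) ** 2
    c1 = ((1 - p2) / (p2 - p1)) ** 2
    return c2 * var_zbar(p2) + c1 * var_zbar(p1)

def moet_privacy(alpha, A, p1, p2, mu_y, sigma_y, sigma_s, sigma_t, mu_r, sigma_r):
    """Theoretical adjusted privacy (W = 1) under an equal split."""
    lam1 = (1 - alpha) * (1 - p1)
    lam2 = (1 - alpha) * (1 - p2)
    scrambled = (sigma_y**2 + mu_y**2) * sigma_t**2 + sigma_s**2
    unrelated = sigma_y**2 + sigma_r**2 + (mu_y - mu_r) ** 2
    return 0.5 * (
        2 * alpha * A * sigma_s**2
        + (2 - lam1 - lam2) * (1 - A) * scrambled
        + (lam1 + lam2) * unrelated
    )

params = dict(mu_y=2.0, sigma_y=1.0, sigma_s=1.0, sigma_t=1.0, mu_r=2.0, sigma_r=1.0)
rows = []
for alpha in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    mse = moet_mse(alpha, A=0.9, W=1.0, p1=0.85, p2=0.15, n=500, **params)
    priv = moet_privacy(alpha, A=0.9, p1=0.85, p2=0.15, **params)
    rows.append((alpha, mse, priv))
    print(f"alpha={alpha:.1f}  MSE={mse:.4f}  privacy={priv:.2f}")
```

In this scenario both columns rise with $\alpha$, so any gain in privacy is paid for with a higher MSE; a researcher would pick the $\alpha$ whose pair of values best matches the goals of the study.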
Beyond the two key observations made above, Table 1 also confirms that the model behaves in many expected ways. When question sensitivity ($W$) is low, the model is more efficient. For example, when $A = 1$ and $\alpha = 0.6$, efficiency is higher (MSE lower) for $W = 0.2$ (0.0069) than for $W = 1$ (0.0098). This relationship is intuitive because more respondents submit their responses without any RRT scrambling when $W = 0.2$, leading to greater efficiency.
When respondents do not require additional scrambling to enhance their trust (when $A$ is large), the model is more efficient. For example, when $W = 0.6$ and $\alpha = 0.6$, efficiency is higher (MSE lower) for $A = 1$ (0.0083) than for $A = 0.9$ (0.0101). Again, this result seems intuitively reasonable. Note that Table 1 was run with $p_1 = 0.15$ and $p_2 = 0.85$. These values were chosen because widely disparate probability values lead to higher model efficiency (this is a known characteristic of the split sample approach).
The exact nature of the tradeoff between efficiency and privacy across $\alpha$ values, and in fact the existence of any tradeoff at all, is influenced by the sensitivity of the particular quantitative research question ($W$) and the respondent’s need for enhanced trust ($A$), as well as by the standard deviations of the chosen scrambling variables, $S$ and $T$, and the mean and variance of the unrelated question response $R$. We make this observation as a way of drawing attention to the high level of flexibility in the model. For a given quantitative research scenario in which a researcher has specific efficiency and privacy goals, the researcher can strategically choose the MOET model’s $S$, $T$ and $R$ variables and the mixture split $\alpha$ that will most closely achieve those goals.
Table 2 is provided principally to facilitate comparisons between the MOET model and its predecessor, the OET model (see Appendix A). The OET model, in fact, can be thought of as a special case of the MOET model with α = 1. Comparison of Equation (A1) to Equation (9) makes this relationship clear.
Recall that we have established, based on Table 1, that the empirical values of each of these statistics fall close to their theoretical values, so for purposes of legibility, the empirical values of these quantities are not shown here. The scenarios run for the OET and MOET models represented in Table 2 are the same, and the values of all parameters equal the values used in the published Gupta et al. (2022) OET model paper [6]. That is, the values of all parameters that the two models have in common ($\sigma_S$, $\mu_T$, $\sigma_T$), the assumed characteristics of the sensitive question ($\mu_Y$, $\sigma_Y$, $W$), the inclination of the respondent to respond without additional scrambling ($A$), the sample size, and number of iterations run ($n$, $N$), are the same in both models.
For Table 2, we have set $\mu_R = 1$ rather than $\mu_R = \mu_Y = 2$. This choice was made because the estimator for $W$ (see Equation (19)) becomes volatile when $\mu_R = \mu_Y$. The choice of $\mu_R = 1$ caused privacy to improve but caused efficiency to deteriorate slightly. In a real-life scenario, a researcher could choose $\mu_R$ to accommodate their research needs, choosing $\mu_R$ far from $\mu_Y$ if an estimate of $W$ is needed.
The Table 2 scenario shows the MOET model’s efficiency to be superior to that of the OET model. For example, in the circled row, when $A = 0.95$ and $W = 0.9$, we can see that (MSE = 0.0089) < (MSE = 0.0454), implying that the MOET has superior efficiency; this efficiency superiority in fact holds true for all $A$ and $W$ values in the table. It follows that in scenarios where efficiency is significantly more important than privacy, the MOET model will often be the better choice of model. However, the OET model outperforms the MOET in terms of privacy. Figure 2 and Figure 3 illustrate the data in Table 2.
While in the illustrated scenario MOET outperforms OET in terms of efficiency, OET outperforms MOET in terms of privacy. When we combine the two measures, we see that the MOET model outperforms the OET model according to the unified measure. For example, when $A = 1$ and $W = 1$, we can see in Table 2 that ($\delta^a = 0.0056$) < ($\delta^a = 0.0121$); the superiority of the MOET model in terms of unified measure holds true for all $A$ and $W$ values.

5. Conclusions

The Mixture Optional Enhanced Trust model proposed in this study has important advantages over the OET model, which in turn was shown to be superior to the basic Warner model (Gupta et al. 2022) [6]. Specifically, the MOET model can yield lower MSE (therefore higher efficiency) than the OET model when MOET parameters are selected that favor efficiency over privacy. The OET model generally achieves superior privacy, so it will usually be the better model in cases where the need for privacy is the overriding concern. In the scenarios studied, however, the lower MSE of the MOET model more than offsets the higher privacy offered by the OET model, as indicated by superior unified measure values. In this study, we have shown, furthermore, that the MOET model’s mixture capability (captured by model parameter $\alpha$) causes MOET to be superior to either a full Warner-based or a full Greenberg-based model. This mixture model will balance the competing concerns of privacy and efficiency, which will always be preferable to fully sacrificing one of these important characteristics for the other.

Author Contributions

Conceptualization, S.G.; methodology, M.P., S.G. and S.K.; software, S.K., and M.P.; validation, S.K. and M.P.; formal analysis, M.P., S.K. and S.G.; writing—original draft preparation, M.P.; writing—review and editing, S.G., S.K. and M.P.; supervision, S.G. and S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

This study is based on simulation and does not involve a dataset. The R code used to run simulations will be provided upon query.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following is a list of acronyms used in the paper:
MOET model: Mixture Optional Enhanced Trust model
OET model: Optional Enhanced Trust model
MSE: Mean Squared Error
RRT: Randomized Response Technique
SDB: Social Desirability Bias

Appendix A. The Optional Enhanced Trust Model

Gupta et al. (2022) proposed an Optional Enhanced Trust (OET) model [6]. We review the OET model here because the OET is a key predecessor to the Mixture Optional Enhanced Trust (MOET) model proposed in Section 3 of this study. Comparisons between the OET and the MOET model help illustrate the unique value of the new MOET model.
The OET is a Warner-based RRT model that allows respondents to elect RRT or not (optionality) and allows RRT respondents to elect additional multiplicative or linear combination scrambling if they feel that doing so will enhance their trust. The parameter $W$ captures a respondent’s desire to use some form of scrambling to hide their response, so $W$ can be thought of as a measure of the sensitive question’s sensitivity. The parameter $A$ captures a respondent’s trust in additive scrambling alone, so that $(1 - A)$ captures a respondent’s desire for additive and multiplicative scrambling (rather than just additive scrambling) to enhance their trust in the RRT. Here, $S$ and $T$ are additive and multiplicative scrambling variables, respectively, with means $\mu_S = 0$ and $\mu_T = 1$ and with known variances $\sigma_S^2$ and $\sigma_T^2$. The resulting OET model is shown below. Gupta et al. (2022) showed that the OET model achieved superior efficiency and better mitigated lack of trust as compared to Warner’s model [6].
Below, Figure A1 presents a diagram of the OET model:
Figure A1. OET model.
$$Z = \begin{cases} Y & \text{with probability } 1 - W \\ Y + S & \text{with probability } WA \\ TY + S & \text{with probability } W(1 - A) \end{cases} \tag{A1}$$
Gupta et al. (2022) provided estimators for the OET’s mean, variance of mean estimator (efficiency), and privacy, as given by [6]:
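Mean Estimator:
$$\hat{\mu}_Y = \bar{Z}. \tag{A2}$$
Variance of Mean Estimator (Efficiency):
$$\operatorname{Var}(\hat{\mu}_Y) = \frac{1}{n}\Big[ W(1 - A)\big(\sigma_T^2\sigma_Y^2 + \sigma_T^2\mu_Y^2\big) + \sigma_Y^2 + W\sigma_S^2 \Big]. \tag{A3}$$
If $W$ and/or $A$ are unknown, the variance of the mean estimator can be estimated by
$$\widehat{\operatorname{Var}}(\hat{\mu}_Y) = \frac{s_Z^2}{n}, \tag{A4}$$
where $s_Z^2$ is the variance in sample responses.
Privacy:
$$\nabla^a = (1 - A)\big(\sigma_T^2\sigma_Y^2 + \sigma_T^2\mu_Y^2\big) + \sigma_S^2. \tag{A5}$$
Unified Measure:
$$\delta^a = \frac{\operatorname{Var}(\hat{\mu}_Y)}{\nabla^a} \quad \text{or} \quad \frac{\widehat{\operatorname{Var}}(\hat{\mu}_Y)}{\nabla^a}. \tag{A6}$$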
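The OET expressions above translate directly into code. The sketch below uses hypothetical function names and, purely for illustration, the Table 1 parameter values with $A = 0.9$ and $W = 1$; the numbers it produces are not the published Table 2 values.

```python
def oet_variance(n, W, A, mu_y, sigma_y, sigma_s, sigma_t):
    """Theoretical variance of the OET mean estimator (the sample mean of Z)."""
    extra = sigma_t**2 * sigma_y**2 + sigma_t**2 * mu_y**2
    return (W * (1 - A) * extra + sigma_y**2 + W * sigma_s**2) / n

def oet_privacy(A, mu_y, sigma_y, sigma_s, sigma_t):
    """Adjusted privacy (W = 1) of the OET model."""
    extra = sigma_t**2 * sigma_y**2 + sigma_t**2 * mu_y**2
    return (1 - A) * extra + sigma_s**2

def unified_measure(variance, privacy):
    """Ratio of variance (MSE) to privacy; lower values are preferred."""
    if privacy <= 0:
        raise ValueError("privacy must be positive")
    return variance / privacy

# Illustrative values only (Table 1 parameters, A = 0.9, W = 1).
var_oet = oet_variance(n=500, W=1.0, A=0.9, mu_y=2.0, sigma_y=1.0, sigma_s=1.0, sigma_t=1.0)
priv_oet = oet_privacy(A=0.9, mu_y=2.0, sigma_y=1.0, sigma_s=1.0, sigma_t=1.0)
delta_oet = unified_measure(var_oet, priv_oet)
print(var_oet, priv_oet, delta_oet)
```

The privacy value here coincides with the MOET privacy at $\alpha = 1$ for the same parameters, consistent with OET being the $\alpha = 1$ special case of the MOET response model.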

References

  1. Raghavarao, D.; Federer, W.T. Block total response as an alternative to the randomized response method in surveys. J. R. Stat. Soc. Ser. B (Methodol.) 1979, 41, 40–45.
  2. Reynolds, W. Development of a reliable and valid short form of the Marlowe-Crowne SDB scale. J. Clin. Psychol. 1982, 38, 119–125.
  3. Jones, E.; Sigall, H. The Bogus Pipeline: A new paradigm for measuring affect and attitude. Psychol. Bull. 1971, 76, 349–364.
  4. Warner, S. The Linear Randomized Response Model. J. Am. Stat. Assoc. 1971, 66, 884–888.
  5. Greenberg, B.G.; Abul-Ela, A.L.A.; Simmons, W.R.; Horvitz, D.G. The unrelated question randomized response model: Theoretical framework. J. Am. Stat. Assoc. 1971, 66, 243–250.
  6. Gupta, S.; Zhang, J.; Khalil, S.; Sapra, P. Mitigating Lack of Trust in Quantitative Randomized Response Techniques Models. Commun. Stat.-Simul. Comput. 2022, 51, 1–9.
  7. Mehta, S.; Aggarwal, P. Bayesian estimation of sensitivity level and population proportion of a sensitive characteristic in a binary optional unrelated question RRT model. Commun. Stat.-Theory Methods 2018, 47, 4021–4028.
  8. Diana, G.; Perri, P.F. A class of estimators for quantitative sensitive data. Stat. Pap. 2011, 52, 633–650.
  9. Singh, G.N.; Kumar, A.; Vishwakarma, G.K. Some alternative additive randomized response models for estimation of population mean of quantitative sensitive variable in the presence of scramble variable. Commun. Stat.-Simul. Comput. 2018, 49, 2785–2807.
  10. Priyanka, K.; Trisandhya, P. A composite class of estimators using scrambled response mechanism for sensitive population mean in successive sampling. Commun. Stat.-Theory Methods 2019, 48, 1009–1032.
  11. Gupta, S.; Gupta, B.; Singh, S. Estimation of the sensitivity level of personal interview survey questions. J. Stat. Plan. Inference 2002, 100, 239–247.
  12. Gupta, S.; Mehta, S.; Shabbir, J.; Khalil, S. A unified measure of respondent privacy and model efficiency in quantitative RRT models. J. Stat. Theory Pract. 2018, 12, 506–511.
  13. Vishwakarma, G.K.; Kumar, A.; Kumar, N. Two-stage unrelated randomized response model to estimate the prevalence of a sensitive attribute. Comput. Stat. 2023, 52, 1–26.
  14. Narjis, G.; Shabbir, J. An efficient partial randomized response model for estimating a rare sensitive attribute using Poisson distribution. Commun. Stat.-Theory Methods 2021, 50, 1–17.
  15. Lovig, M.; Khalil, S.; Rahman, S.; Sapra, P. A mixture binary RRT model with a unified measure of privacy and efficiency. Commun. Stat.-Simul. Comput. 2021, 52, 2727–2737.
  16. Yan, Z.; Wang, J.; Lai, J. An Efficiency and Protection Degree-Based Comparison Among the Quantitative Randomized Response Strategies. Commun. Stat.-Theory Methods 2008, 38, 400–408.
Figure 1. MOET model.
Figure 2. MOET vs. OET comparison across trust levels. (a) Efficiency results for MOET and OET models across trust levels (A). (b) Privacy results for MOET and OET models across trust levels (A). Parameter values underlying these exhibits are those cited in Table 2.
Figure 3. MOET vs. OET comparison across sensitivity levels. (a) Efficiency results for MOET and OET models across sensitivity levels (W). (b) Privacy results for MOET and OET models across sensitivity levels (W). Parameter values underlying these exhibits are those cited in Table 2.
Table 1. MOET Results (p_1 = 0.85, p_2 = 0.15, μ_Y = 2, μ_S = 0, μ_T = 1, μ_R = 2, σ_Y = 1, σ_S = 1, σ_T = 1, σ_R = 1, n = 500, N = 10,000).
A     W     α     μ̂_Y     MSE(μ̂_Y)_T*  MSE(μ̂_Y)_E*  (∇_a)_T*  (∇_a)_E*  (δ_a)_T*  (δ_a)_E*
1     1     1.0   2.0005  0.0122       0.0124       1.0000    1.0002    0.0122    0.0124
1     1     0.8   2.0017  0.0109       0.0111       1.0000    1.0003    0.0109    0.0111
1     1     0.6   2.0001  0.0097       0.0098       1.0000    0.9991    0.0097    0.0098
1     1     0.4   2.0021  0.0085       0.0083       1.0000    1.0012    0.0085    0.0083
1     1     0.2   1.9996  0.0073       0.0073       1.0000    1.0007    0.0073    0.0073
1     1     0     1.9999  0.0061       0.0061       1.0000    1.0016    0.0061    0.0061
1     0.6   1.0   2.0002  0.0097       0.0098       1.0000    1.0002    0.0097    0.0098
1     0.6   0.8   2.0005  0.0090       0.0092       1.0000    1.0003    0.0090    0.0092
1     0.6   0.6   2.0002  0.0083       0.0083       1.0000    0.9991    0.0083    0.0083
1     0.6   0.4   1.9999  0.0075       0.0075       1.0000    1.0012    0.0075    0.0075
1     0.6   0.2   1.9995  0.0068       0.0069       1.0000    1.0007    0.0068    0.0069
1     0.6   0     2.0006  0.0061       0.0062       1.0000    1.0016    0.0061    0.0062
1     0.2   1.0   1.9988  0.0073       0.0074       1.0000    1.0002    0.0073    0.0074
1     0.2   0.8   1.9988  0.0071       0.0069       1.0000    1.0003    0.0071    0.0069
1     0.2   0.6   2.0001  0.0068       0.0069       1.0000    0.9991    0.0068    0.0069
1     0.2   0.4   2.0006  0.0066       0.0066       1.0000    1.0012    0.0066    0.0066
1     0.2   0.2   2.0000  0.0063       0.0062       1.0000    1.0007    0.0063    0.0062
1     0.2   0     1.9996  0.0061       0.0062       1.0000    1.0016    0.0061    0.0062
0.9   1     1.0   2.0004  0.0152       0.0154       1.5000    1.5004    0.0101    0.0103
0.9   1     0.8   2.0002  0.0140       0.0137       1.4600    1.4585    0.0096    0.0094
0.9   1     0.6   1.9992  0.0128       0.0125       1.4200    1.4207    0.0090    0.0088
0.9   1     0.4   2.0016  0.0115       0.0116       1.3800    1.3798    0.0083    0.0084
0.9   1     0.2   1.9999  0.0103       0.0104       1.3400    1.3393    0.0077    0.0078
0.9   1     0     1.9992  0.0091       0.0089       1.3000    1.2982    0.0070    0.0069
0.9   0.6   1.0   2.0002  0.0116       0.0116       1.5000    1.5004    0.0077    0.0077
0.9   0.6   0.8   1.9992  0.0108       0.0109       1.4600    1.4585    0.0074    0.0075
0.9   0.6   0.6   1.9996  0.0101       0.0101       1.4200    1.4207    0.0071    0.0071
0.9   0.6   0.4   1.9988  0.0094       0.0092       1.3800    1.3798    0.0068    0.0067
0.9   0.6   0.2   2.0000  0.0086       0.0085       1.3400    1.3393    0.0064    0.0063
0.9   0.6   0     1.9995  0.0079       0.0080       1.3000    1.2982    0.0061    0.0062
0.9   0.2   1.0   2.0005  0.0079       0.0078       1.5000    1.5004    0.0053    0.0052
0.9   0.2   0.8   1.9989  0.0077       0.0076       1.4600    1.4585    0.0053    0.0052
0.9   0.2   0.6   2.0004  0.0074       0.0075       1.4200    1.4207    0.0052    0.0053
0.9   0.2   0.4   2.0001  0.0072       0.0072       1.3800    1.3798    0.0052    0.0052
0.9   0.2   0.2   2.0022  0.0069       0.0070       1.3400    1.3393    0.0051    0.0052
0.9   0.2   0     2.0007  0.0067       0.0069       1.3000    1.2982    0.0052    0.0053
* MSE(μ̂_Y)_T and MSE(μ̂_Y)_E represent the theoretical and empirical mean squared error of the estimator μ̂_Y. (∇_a)_T and (∇_a)_E represent the theoretical and empirical model privacy. (δ_a)_T and (δ_a)_E represent the theoretical and empirical unified measure.
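The theoretical columns of Table 1 appear consistent with the unified measure being the ratio of mean squared error to privacy, δ_a = MSE(μ̂_Y) / ∇_a. That ratio form is an assumption inferred from the tabulated values rather than a definition quoted here, and the helper name `unified_measure` is hypothetical. A minimal Python sketch that checks the relationship on a few rows copied from the table (A = 0.9, W = 1):

```python
# Selected theoretical rows from Table 1 (A = 0.9, W = 1), copied as printed:
# (alpha, MSE_T, privacy_T, delta_T)
rows = [
    (1.0, 0.0152, 1.50, 0.0101),
    (0.8, 0.0140, 1.46, 0.0096),
    (0.6, 0.0128, 1.42, 0.0090),
    (0.0, 0.0091, 1.30, 0.0070),
]


def unified_measure(mse, privacy):
    """Assumed form inferred from the table: delta = MSE / privacy."""
    return mse / privacy


# Gap between the recomputed ratio and the tabulated unified measure.
# The inputs are rounded to four decimals, so only approximate agreement
# (within about 1e-4) can be expected.
gaps = [abs(unified_measure(mse, priv) - delta) for _, mse, priv, delta in rows]
max_gap = max(gaps)
print(f"largest |MSE/privacy - delta| over the sampled rows: {max_gap:.6f}")
```

Under this reading, a smaller δ_a is preferable because it corresponds to a lower error per unit of privacy, which matches the way the table reports δ_a falling as α decreases.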
Table 2. MOET Results compared to OET Results (α = 0.15, p_1 = 0.85, p_2 = 0.15, μ_Y = 2, μ_S = 0, μ_{S_1} = 2, μ_{S_2} = 1, μ_T = 1, μ_R = 1, σ_Y = 1, σ_S = 1, σ_T = 1, σ_R = 1, n = 500, N = 10,000).
            OET Model                                              MOET Model
A     W     μ̂_Y      MSE(μ̂_Y)_T*  (∇_a)_T*  (δ_a)_T*  Ŵ        μ̂_Y      MSE(μ̂_Y)_T*  (∇_a)_T*  (δ_a)_T*  Ŵ
1     1     1.99899  0.0400       3.500     0.0114    1.0001   2.00122  0.0077       1.4250    0.0054    0.9956
1     0.9   2.0017   0.0409       3.500     0.0117    0.8989   1.9996   0.0075       1.4250    0.0053    0.8938
1     0.7   1.9975   0.0407       3.500     0.0116    0.7023   1.9993   0.0072       1.4250    0.0051    0.6938
1     0.5   1.9985   0.0380       3.500     0.0109    0.5013   2.0010   0.0069       1.4250    0.0048    0.4941
1     0.3   1.9999   0.0327       3.500     0.0093    0.3001   1.9997   0.0066       1.4250    0.0046    0.2914
0.95  1     2.0014   0.0450       3.750     0.0120    0.9989   1.9994   0.0092       1.5900    0.0058    0.9953
0.95  0.9   1.9991   0.0454       3.750     0.0121    0.9014   1.9990   0.0089       1.5900    0.0056    0.8933
0.95  0.7   2.0024   0.0442       3.750     0.0118    0.6980   2.0010   0.0083       1.5900    0.0052    0.6933
0.95  0.5   2.0006   0.0405       3.750     0.0108    0.4996   2.0005   0.0077       1.5900    0.0048    0.4925
0.95  0.3   1.9966   0.0342       3.750     0.0091    0.3030   1.9999   0.0071       1.5900    0.0045    0.2901
0.9   1     2.0010   0.0500       4.000     0.0125    0.9993   2.0007   0.0107       1.7550    0.0061    0.9960
0.9   0.9   1.9997   0.0499       4.000     0.0125    0.9000   1.9995   0.0103       1.7550    0.0059    0.8925
0.9   0.7   2.0010   0.0477       4.000     0.0119    0.6997   1.9996   0.0094       1.7550    0.0054    0.6916
0.9   0.5   1.9988   0.0430       4.000     0.0108    0.5013   1.9975   0.0084       1.7550    0.0048    0.4883
0.9   0.3   2.0000   0.0357       4.000     0.0089    0.3012   2.0002   0.0075       1.7550    0.0043    0.2889
0.85  1     2.0038   0.0550       4.250     0.0129    0.9980   2.0017   0.0122       1.9200    0.0064    0.9936
0.85  0.9   2.0024   0.0544       4.250     0.0128    0.8984   1.9996   0.0116       1.9200    0.0060    0.8910
0.85  0.7   2.0032   0.0512       4.250     0.0120    0.6988   1.9995   0.0104       1.9200    0.0054    0.6915
0.85  0.5   2.0002   0.0455       4.250     0.0107    0.4997   2.0000   0.0092       1.9200    0.0048    0.4882
0.85  0.3   1.9996   0.0372       4.250     0.0088    0.3000   2.0001   0.0080       1.9200    0.0042    0.2908
0.8   1     2.0028   0.0600       4.500     0.0133    0.9987   1.9974   0.0137       2.0850    0.0066    0.9917
0.8   0.9   2.0036   0.0589       4.500     0.0131    0.8982   2.0005   0.0130       2.0850    0.0062    0.8915
0.8   0.7   1.9997   0.0547       4.500     0.0122    0.7004   1.9996   0.0115       2.0850    0.0055    0.6895
0.8   0.5   2.0041   0.0480       4.500     0.0107    0.4971   2.0010   0.0100       2.0850    0.0048    0.4906
0.8   0.3   1.9990   0.0387       4.500     0.0086    0.3008   2.0000   0.0084       2.0850    0.0040    0.28888
* MSE(μ̂_Y)_T, (∇_a)_T, and (δ_a)_T represent the theoretical mean squared error, privacy and unified measure for the respective models. Green indicates model superiority.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Parker, M.; Gupta, S.; Khalil, S. A Mixture Quantitative Randomized Response Model That Improves Trust in RRT Methodology. Axioms 2024, 13, 11. https://doi.org/10.3390/axioms13010011

