A Mixture Quantitative Randomized Response Model That Improves Trust in RRT Methodology

Parker, Michael; Gupta, Sat; Khalil, Sadia

doi:10.3390/axioms13010011

by Michael Parker ¹,*, Sat Gupta ¹ and Sadia Khalil ¹,²

¹ Department of Mathematics and Statistics, College of Arts and Sciences, UNC Greensboro, Greensboro, NC 27412, USA
² Department of Statistics, Lahore College for Women University, Lahore 54000, Pakistan
* Author to whom correspondence should be addressed.

Axioms 2024, 13(1), 11; https://doi.org/10.3390/axioms13010011

Submission received: 23 November 2023 / Revised: 12 December 2023 / Accepted: 14 December 2023 / Published: 22 December 2023

(This article belongs to the Special Issue Computational Statistics and Its Applications)


Abstract

The Quantitative Randomized Response Technique (RRT) can be used by researchers to obtain honest answers to questions that, due to their sensitive (socially undesirable, dangerous, or even illegal) nature, might otherwise invoke partially or completely falsified responses. Over the years, Quantitative RRT models, sometimes called Scrambling models, have been developed to incorporate such advancements as mixture, optionality and enhanced trust, each of which has important benefits. However, no single model incorporates all of these features. In this study, we propose just such a unified model, which we call the Mixture Optional Enhanced Trust (MOET) model. After developing methodologies to assess MOET based on standard approaches and using them to explore the key characteristics of the new model, we show that MOET has superior efficiency compared to the Quantitative Optional Enhanced Trust (OET) model. We also show that use of the model’s mixture capability allows practitioners to optimally balance the model’s efficiency with its privacy, making the model adaptable to a wide variety of research scenarios.

Keywords:

Randomized Response Technique (RRT); respondent privacy; simulations; Social Desirability Bias (SDB); unified measure of privacy and efficiency

MSC:

62D05

1. Introduction

When faced with uncomfortable or sensitive quantitative questions (for example, “What is your IQ?” or “What is your personal income level?”), respondents may modify or outright falsify their answers. This untruthfulness creates a significant problem for researchers; consequently, statisticians have developed a variety of clever techniques designed to encourage truthfulness in scenarios like these. Some of these techniques, for example, the Unmatched Count Technique (Raghavarao and Federer, 1979) and Social Desirability Scale based techniques (Reynolds, 1982), are best suited for binary applications [1,2]. Others, for example, the Bogus Pipeline technique (Jones and Sigall, 1971) and some Randomized Response Techniques (Warner 1971, Greenberg et al., 1971, Gupta et al., 2022, etc.), apply well in quantitative scenarios [3,4,5,6].

This study focuses on certain Quantitative Randomized Response Techniques. The RRT was pioneered by Warner, who proposed a binary-question RRT in 1965. Six years later, Warner proposed a new technique, this one applicable to quantitative questions. Under this technique, researchers instruct respondents to apply random noise to their quantitative responses via additive or multiplicative scrambling variables. Because the researcher sees only scrambled responses, the respondents’ true answers remain “hidden” or “confidential.” The idea is to make the respondent feel comfortable answering a sensitive question truthfully, knowing that their response will be obfuscated by the scrambling. Statistical techniques that take into account the known distribution of the scrambling variable can then be applied across the group of responses to back out the group-level mean response to the sensitive question.

Greenberg et al. (1971) developed another Quantitative RRT model based on an entirely different mechanism [5]. Rather than asking each respondent the sensitive question and then instructing them to scramble their response, Greenberg et al. suggested that each respondent should answer one of two questions (either the sensitive quantitative question or some unrelated and nonsensitive quantitative question with a known probability distribution) based on a random assignment unknown to the researcher. While the researcher would see all of the respondents’ responses, the respondents’ confidentiality would nonetheless be maintained because the researcher would not know which question each individual respondent was answering.

Several advancements to Warner’s and Greenberg’s models were made over the years. Mehta and Aggarwal (2018) proposed a means of estimating the sensitivity level of a sensitive binary question [7]. Different kinds of scrambling techniques were explored by Diana and Perri (2011), Singh et al. (2018), and Priyanka and Trisandhya (2019) [8,9,10]. Gupta et al. (2002) showed that adding optionality to RRT models (respondents may opt in or opt out of the RRT according to whether they personally find the “sensitive” question to be sensitive) significantly improved model efficiency [11]. This same concept of optionality enabled measurement of the level of a quantitative question’s sensitivity. Additionally, Gupta et al. (2022) introduced an enhanced trust feature that enabled respondents to opt for greater levels of scrambling if they felt their responses were not being sufficiently obscured by additive scrambling alone [6].

The importance of respondent privacy was formally recognized during this timeframe as a key RRT model attribute, necessary for motivating respondent truthfulness, and Gupta et al. (2018) developed a “unified measure” designed to evaluate quantitative RRT models based on a single statistic that incorporated the competing elements of privacy and efficiency [12]. Vishwakarma et al. (2023) developed a two-stage unrelated randomized response model [13]. Narjis and Shabbir (2021) proposed an RRT model for estimating rare sensitive attributes using a Poisson distribution [14]. The concept of “mixture” (combining Warner-based and Greenberg-based constructs into a single model where respondents are randomly assigned to one technique or the other) was incorporated into binary RRT models by Lovig et al. (2021), but “mixture” has not been incorporated into a quantitative model prior to this study [15].

The model we propose in this study, which we call the Mixture Optional Enhanced Trust (MOET) model, incorporates optionality, enhanced trust, and mixture, thereby consolidating many of the advantages of predecessor models into a single model.

In Section 2 of this study, we will explore the key metrics that we will use to evaluate and compare RRT models. In Section 3, we will propose the Mixture Optional Enhanced Trust (MOET) model and derive estimators for all of its key attributes. Section 4 will be devoted to computer simulations. We will observe the simulations’ output both as a way of understanding the model’s behavior and in contrast to the OET model.

2. Materials and Methods

In this section, we will discuss the methods used to measure the key characteristics of RRT models, which enable the comparisons between models shown later in the study. A measure of efficiency (Mean Squared Error, MSE), a measure of privacy ($\nabla$), and a unified measure that incorporates both efficiency and privacy ($\delta$) are provided. All three of these metrics have been commonly used by other researchers to quantify and compare the efficacy of RRT models. These are the metrics we will use to quantify key attributes of the MOET model. We will use the same metrics to compare the characteristics of the MOET model to those of the OET model.

2.1. Efficiency Metric

The efficiency of any estimator $\hat{\mu}_Y$ can be quantified by its MSE, denoted by $MSE(\hat{\mu}_Y)$:

$$MSE(\hat{\mu}_Y) = E\left[(\hat{\mu}_Y - \mu_Y)^2\right]. \tag{1}$$

Smaller MSE values are preferred, as high levels of efficiency are achieved when MSE is small.

2.2. Privacy Metric

A measure of privacy proposed by Yan et al. (2008) [16] and commonly used in quantitative models is given by

$$\nabla = E\left[(Z - Y)^2\right], \tag{2}$$

where $Y$ represents a respondent’s true response to a sensitive question, while $Z$ represents the respondent’s reported response (which may be scrambled) [16]. One can think of $\nabla$ as a measurement of the “hiddenness” of a respondent’s true response. Clearly, when respondents’ reported responses lie far from respondents’ true responses, this metric will be large. Higher values of $\nabla$ are preferred. Below, we consider the calculation of $E[(Z - Y)^2]$ when $Z$ is defined in various ways (we will call $Z$ “$Z_i$” when $Z$ is defined in the $i$th unique way). These calculations will facilitate the calculation of an expression for the privacy of the MOET model in Section 3 of this study.

First, consider the trivial case of no scrambling. In this case we have $Z = Z_1 = Y$. Consistent with intuition, we have

$$E\left[(Z_1 - Y)^2\right] = E\left[(Y - Y)^2\right] = 0. \tag{3}$$

In a case where scrambling is introduced into $Z$ via an additive scrambling variable $S$ (where the distribution of $S$ is known and $E(S) = 0$), we have that $Z = Z_2 = Y + S$, and

$$E\left[(Z_2 - Y)^2\right] = E\left[(Y + S - Y)^2\right] = \sigma_S^2. \tag{4}$$

When multiplicative scrambling is implemented via a scrambling variable $T$ (where the distribution of $T$ is known and $E(T) = 1$) and additive scrambling is also included as above, we have that $Z = Z_3 = TY + S$, and

$$E\left[(Z_3 - Y)^2\right] = E\left[(TY + S - Y)^2\right] = E(T^2 Y^2) - E(Y^2) + E(S^2) = (\sigma_Y^2 + \mu_Y^2)\,\sigma_T^2 + \sigma_S^2. \tag{5}$$

Finally, when an unrelated question is implemented, we have that $Z = Z_4 = R$, and

$$E\left[(Z_4 - Y)^2\right] = E\left[(R - Y)^2\right] = E(R^2) + E(Y^2) - 2E(R)E(Y) = \sigma_Y^2 + \sigma_R^2 + (\mu_Y - \mu_R)^2. \tag{6}$$
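As an illustration, the four closed forms in Equations (3) to (6) can be written as small Python functions and one of them checked by simulation. This is a minimal sketch rather than part of the original study: the function names are hypothetical, the parameter values are chosen only for illustration, and the normal distributions used in the Monte Carlo check are an assumption (the formulas themselves only require the stated means and independence).

```python
import random

def privacy_none():
    """Eq. (3): Z = Y, so nothing is hidden."""
    return 0.0

def privacy_additive(var_s):
    """Eq. (4): Z = Y + S with E(S) = 0."""
    return var_s

def privacy_full(mu_y, var_y, var_s, var_t):
    """Eq. (5): Z = T*Y + S with E(T) = 1 and E(S) = 0."""
    return (var_y + mu_y ** 2) * var_t + var_s

def privacy_unrelated(mu_y, var_y, mu_r, var_r):
    """Eq. (6): Z = R, the response to the unrelated question."""
    return var_y + var_r + (mu_y - mu_r) ** 2

def monte_carlo_full(mu_y, var_y, var_s, var_t, draws=100_000, seed=1):
    """Empirical E[(T*Y + S - Y)^2], assuming independent normal Y, S, T."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        y = rng.gauss(mu_y, var_y ** 0.5)
        s = rng.gauss(0.0, var_s ** 0.5)
        t = rng.gauss(1.0, var_t ** 0.5)
        total += (t * y + s - y) ** 2
    return total / draws

# Illustrative (assumed) values: mu_Y = 2 and unit variances throughout.
closed_form = privacy_full(mu_y=2.0, var_y=1.0, var_s=1.0, var_t=1.0)
simulated = monte_carlo_full(mu_y=2.0, var_y=1.0, var_s=1.0, var_t=1.0)
print(closed_form, simulated)
```

The simulated value should land close to the closed form, with the gap shrinking as `draws` grows.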

We now recall that Gupta et al. (2018) showed that optionality does not compromise privacy because respondents who do not consider a question to be sensitive do not value privacy [12]. Hence, the privacy of an optional model is the same as that of a model where $W$, which we define as the sensitivity level of the sensitive question, is equal to $1$. Finally, we note that the privacy of a composite mixture model (MOET mixes Greenberg and Warner components) can be represented as the privacy associated with $Z$ being defined in each of $m$ unique ways within the model (denoted $Z_1, Z_2, \dots, Z_m$). That is,

$$Z = \begin{cases} Z_1 & \text{with probability } q_1 \\ Z_2 & \text{with probability } q_2 \\ \;\vdots & \\ Z_m & \text{with probability } q_m \end{cases}$$

From this we can write:

$$\nabla^a = \sum_{j=1}^{m} q_j \, E\left[(Z_j - Y)^2\right]. \tag{7}$$

$m$ is the number of ways $Z$ is uniquely defined within the model.
$j$ is a particular categorical way that $Z$ is defined in the model.
$q_{j}$ is the probability that category ‘j’ captures the respondent’s response.
The superscript $a$ indicates that privacy is adjusted according to Gupta et al.’s (2018) [12] optionality adjustment ( $W = 1$ ).

Section 3.3 of this study shows how this formula for $\nabla^a$ applies to the MOET model.

2.3. Unified Measure of Efficiency and Privacy

The following unified measure from Gupta et al. (2018) simultaneously evaluates a quantitative RRT model for its efficiency and for its privacy [12]:

$$\delta^a = \frac{MSE(\hat{\mu}_Y)}{\nabla^a}. \tag{8}$$

Low values of $\delta^a$ are preferred, as both high privacy and high efficiency (low MSE) lead to a smaller $\delta^a$.
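To make the trade-off in Equation (8) concrete, the short Python sketch below compares two hypothetical designs. The function name and both (MSE, privacy) pairs are made up for illustration and are not results from this study.

```python
def unified_measure(mse, privacy):
    """Eq. (8): delta^a = MSE / privacy; smaller values are preferred."""
    if privacy <= 0:
        # With no scrambling at all the privacy measure is 0 and the ratio is undefined.
        raise ValueError("privacy must be positive for the ratio to be defined")
    return mse / privacy

# Hypothetical designs: B is less efficient than A but hides responses better.
design_a = unified_measure(mse=0.010, privacy=1.0)
design_b = unified_measure(mse=0.012, privacy=1.5)
print(design_a, design_b)
```

Here design B has the higher MSE yet the lower unified measure, because its gain in privacy outweighs its loss in efficiency.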

3. Proposed Mixture Optional Enhanced Trust Model (MOET)

We now propose the Mixture Optional Enhanced Trust model. This model features strong efficiency and unified measure performance. Additionally, the model’s many features—optionality, mixture, and enhanced trust scrambling—make it highly flexible and therefore useful in a wide variety of research settings characterized by different kinds of sensitive questions and demanding different levels of efficiency and privacy.

3.1. MOET Model Introduction

Let $Y$ be the respondent’s true response to the sensitive question, while $Z$ is their reported response. $S$ and $T$ are additive and multiplicative scrambling random variables, respectively, and $R$ is a random variable representing a respondent’s response to the Greenberg unrelated question. Here $Y$, $S$, $T$, and $R$ are mutually independent. $W$ represents the respondent’s choice to take part in some form of an RRT (rather than simply giving a straight answer to the sensitive question without an RRT) and therefore can be considered a measure of the sensitivity of the sensitive question. The parameter $\alpha$ represents the proportion of respondents randomly assigned to the Warner-based model. Within the Greenberg-based arm, $p$ is the probability that a respondent is directed to the sensitive question rather than the unrelated question, as can be read off from the branch probabilities of the model given in Equation (9). Note that when $\alpha = 1$, the MOET model becomes the OET model (see Appendix A). $A$ represents a respondent’s trust in the RRT model in the absence of additional scrambling, while $(1 - A)$ represents the proportion of respondents who require additional scrambling in order to trust the RRT.

Below, Figure 1 presents a diagram of the MOET model:

$$Z = \begin{cases} Y + S & \text{with probability } W \alpha A \\ TY + S & \text{with probability } W (1 - A)(\alpha + p - \alpha p) \\ Y & \text{with probability } W (1 - \alpha)\, p A + (1 - W) \\ R & \text{with probability } W (1 - \alpha)(1 - p) \end{cases} \tag{9}$$
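The four branches of the model in Equation (9) can be sketched in Python as follows. This is an illustrative sketch only: the function names and the numeric inputs are hypothetical, and the scrambling draws are passed in as plain numbers so that no distributional assumption is needed here.

```python
import random

def moet_branch_probs(W, alpha, A, p):
    """Branch probabilities of Eq. (9), keyed by the form of the reported response."""
    return {
        "Y+S": W * alpha * A,
        "TY+S": W * (1 - A) * (alpha + p - alpha * p),
        "Y": W * (1 - alpha) * p * A + (1 - W),
        "R": W * (1 - alpha) * (1 - p),
    }

def draw_response(rng, probs, y, s, t, r):
    """Pick one branch at random and return the reported response Z."""
    u = rng.random()
    if u < probs["Y+S"]:
        return y + s
    u -= probs["Y+S"]
    if u < probs["TY+S"]:
        return t * y + s
    u -= probs["TY+S"]
    if u < probs["Y"]:
        return y
    return r

rng = random.Random(0)
probs = moet_branch_probs(W=0.6, alpha=0.6, A=0.9, p=0.15)
z = draw_response(rng, probs, y=2.3, s=-0.4, t=1.1, r=1.8)
print(probs, sum(probs.values()), z)
```

The probabilities sum to one for any admissible inputs, and setting `alpha=1` removes the unrelated-question branch, leaving the three OET branches of Appendix A.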

3.2. MOET: Mean Estimator

Choice of random variables $S$ and $T$ should be made such that $E(S) = 0$ and $E(T) = 1$. It follows from the MOET model (9) that

$$E[Z] = \mu_Z = W (1 - \alpha)(1 - p)\,\mu_R + \left[1 - W (1 - \alpha)(1 - p)\right]\mu_Y. \tag{10}$$
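For readers who want the intermediate step, Equation (10) can be obtained by weighting the mean of each branch of (9) by its probability. The worked lines below are a sketch of that step. They use only $E(S) = 0$, $E(T) = 1$, the independence of $T$ and $Y$, and the fact that the first three branch probabilities sum to $1 - W(1-\alpha)(1-p)$.

```latex
\begin{align*}
E[Z] &= W\alpha A\,E[Y+S] + W(1-A)(\alpha+p-\alpha p)\,E[TY+S] \\
     &\quad + \left[W(1-\alpha)pA + (1-W)\right]E[Y] + W(1-\alpha)(1-p)\,E[R] \\
     &= \left[W\alpha A + W(1-A)(\alpha+p-\alpha p) + W(1-\alpha)pA + (1-W)\right]\mu_Y
        + W(1-\alpha)(1-p)\,\mu_R \\
     &= \left[1 - W(1-\alpha)(1-p)\right]\mu_Y + W(1-\alpha)(1-p)\,\mu_R .
\end{align*}
```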

Using a Split Sample approach with $p_1$, $p_2$, where $p_1 \neq p_2$ and $E(Z_i)$ is estimated by $\bar{Z}_i$, we have

$$\bar{Z}_i = W (1 - \alpha)(1 - p_i)\,\mu_R + \left[1 - W (1 - \alpha)(1 - p_i)\right]\hat{\mu}_Y, \quad i = 1, 2. \tag{11}$$

Therefore, the mean of the sensitive trait $\mu_Y$ can be estimated by

$$\hat{\mu}_Y = \frac{1 - p_1}{p_2 - p_1}\,\bar{Z}_2 - \frac{1 - p_2}{p_2 - p_1}\,\bar{Z}_1. \tag{12}$$
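A quick way to see why the estimator in Equation (12) is unbiased is to feed the exact expectations from Equation (10) into it. The Python sketch below does this for a few parameter combinations. The function names are hypothetical and the $W$, $\alpha$ and $\mu_R$ values are arbitrary; $p_1 = 0.15$, $p_2 = 0.85$ and $\mu_Y = 2$ echo values mentioned in Section 4.

```python
def expected_z(mu_y, mu_r, W, alpha, p):
    """Eq. (10): mean of the reported response for a given p."""
    lam = (1 - alpha) * (1 - p)
    return W * lam * mu_r + (1 - W * lam) * mu_y

def estimate_mu_y(zbar1, zbar2, p1, p2):
    """Eq. (12): split-sample estimator of the sensitive mean."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    return ((1 - p1) * zbar2 - (1 - p2) * zbar1) / (p2 - p1)

p1, p2, mu_y = 0.15, 0.85, 2.0
recovered = []
for W, alpha, mu_r in [(0.2, 0.0, 1.0), (0.6, 0.6, 2.0), (1.0, 0.3, 5.0)]:
    z1 = expected_z(mu_y, mu_r, W, alpha, p1)
    z2 = expected_z(mu_y, mu_r, W, alpha, p2)
    recovered.append(estimate_mu_y(z1, z2, p1, p2))
print(recovered)
```

In every case the estimator returns $\mu_Y$ (up to floating-point rounding), because the $\mu_R$ terms cancel in the weighted difference. This is why neither $W$, $A$, nor $\mu_R$ has to be known to estimate the mean.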

The variance/MSE of our unbiased estimator, using an equally-split, split sample approach for convenience, is given by

$$Var(\hat{\mu}_Y) = \left(\frac{1 - p_1}{p_2 - p_1}\right)^2 Var(\bar{Z}_2) + \left(\frac{1 - p_2}{p_2 - p_1}\right)^2 Var(\bar{Z}_1), \tag{13}$$

where

$$Var(\bar{Z}_i) = \frac{2}{n}\Big[ W\left[1 - \lambda_i - A\phi_i\right]\sigma_S^2 + \left[W (1 - \lambda_i)\left((1 - A)\sigma_T^2 + 1\right) + 1 - W\right]\left(\sigma_Y^2 + \mu_Y^2\right) + W\lambda_i\left(\sigma_R^2 + \mu_R^2\right) - \left[\mu_Y + (\mu_R - \mu_Y) W \lambda_i\right]^2 \Big],$$

$$\lambda_i = (1 - \alpha)(1 - p_i), \quad \phi_i = p_i (1 - \alpha), \quad i = 1, 2, \quad p_1 \neq p_2.$$
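Equation (13) is easy to mistype, so a direct transcription into Python may be useful. The sketch below is illustrative: the function names are hypothetical, and the scenario values ($\mu_Y = \mu_R = 2$, unit variances, $n = 500$, $p_1 = 0.15$, $p_2 = 0.85$) are assumptions made here, not values quoted from a table.

```python
def var_zbar(p, n, mu_y, var_y, mu_r, var_r, var_s, var_t, W, A, alpha):
    """Variance of a half-sample mean (n/2 respondents) for a given p."""
    lam = (1 - alpha) * (1 - p)
    phi = p * (1 - alpha)
    ey2 = var_y + mu_y ** 2          # E(Y^2)
    er2 = var_r + mu_r ** 2          # E(R^2)
    second_moment = (
        W * (1 - lam - A * phi) * var_s
        + (W * (1 - lam) * ((1 - A) * var_t + 1) + 1 - W) * ey2
        + W * lam * er2
    )
    mean = mu_y + (mu_r - mu_y) * W * lam
    return (2.0 / n) * (second_moment - mean ** 2)

def var_mu_hat(p1, p2, **kw):
    """Eq. (13): variance (equal to the MSE) of the split-sample mean estimator."""
    c2 = ((1 - p1) / (p2 - p1)) ** 2
    c1 = ((1 - p2) / (p2 - p1)) ** 2
    return c2 * var_zbar(p2, **kw) + c1 * var_zbar(p1, **kw)

# Assumed illustrative scenario.
scenario = dict(n=500, mu_y=2.0, var_y=1.0, mu_r=2.0, var_r=1.0,
                var_s=1.0, var_t=1.0, W=1.0, A=0.9)
for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(var_mu_hat(0.15, 0.85, alpha=alpha, **scenario), 4))
```

Under these assumed values the variance rises with $\alpha$, from roughly 0.0091 at $\alpha = 0$ to roughly 0.0152 at $\alpha = 1$, which appears to agree with the MSE values quoted for $A = 0.9$ and $W = 1$ in Section 4.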

Neither $A$ nor $W$ values are needed to estimate $\mu_Y$. $Var(\hat{\mu}_Y)$ may be estimated by the sample variance of $\hat{\mu}_Y$ values.

3.3. MOET: Privacy Measure

Per Equation (7), the privacy of the MOET model is given by

$$\nabla^a = q_1 \nabla_1 + q_2 \nabla_2 + q_3 \nabla_3 + q_4 \nabla_4. \tag{14}$$

Analysis of the diagram of the MOET model in Section 3.1 above shows that $q_1 = W (1 - \alpha) p A + (1 - W)$, $q_2 = W \alpha A$, $q_3 = W (1 - A)(\alpha + p - \alpha p)$, $q_4 = W (1 - \alpha)(1 - p)$ and $\nabla_1 = 0$, $\nabla_2 = E(S^2)$, $\nabla_3 = E[(TY + S - Y)^2]$, $\nabla_4 = E[(R - Y)^2]$. Applying the optionality adjustment ($W = 1$) described in Section 2.2, we can write

$$\nabla^a = \alpha A\, E(S^2) + (1 - A)(\alpha + p - \alpha p)\, E\left[(TY + S - Y)^2\right] + (1 - \alpha)(1 - p)\, E\left[(R - Y)^2\right]. \tag{15}$$

Substituting Equations (4) to (6) and averaging over the two subsamples (with $p = p_1$ and $p = p_2$), we have the following expression, which represents privacy for the MOET model. The assumption of equal-split sampling underlies this formula, but a similar formula could easily be developed to represent unequal splitting.

$$\nabla^a = \frac{1}{2}\Big\{ 2\alpha A \sigma_S^2 + (2 - \lambda_1 - \lambda_2)(1 - A)\left[(\sigma_Y^2 + \mu_Y^2)\sigma_T^2 + \sigma_S^2\right] + (\lambda_1 + \lambda_2)\left[\sigma_Y^2 + \sigma_R^2 + (\mu_Y - \mu_R)^2\right] \Big\}, \tag{16}$$

$$\lambda_1 = (1 - \alpha)(1 - p_1), \quad \lambda_2 = (1 - \alpha)(1 - p_2), \quad p_1 \neq p_2.$$
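The step from the single-sample form (15) to the equal-split form (16) can be checked numerically. The Python sketch below codes both and confirms that (16) is the average of (15) over the two values of $p$. Function names are hypothetical, and the scenario values ($\mu_Y = \mu_R = 2$, unit variances, $A = 0.9$) are assumptions chosen for illustration.

```python
def privacy_single(p, mu_y, var_y, mu_r, var_r, var_s, var_t, A, alpha):
    """Eq. (15): adjusted privacy (W = 1) for one value of p."""
    full = (var_y + mu_y ** 2) * var_t + var_s        # Eq. (5)
    unrelated = var_y + var_r + (mu_y - mu_r) ** 2    # Eq. (6)
    return (alpha * A * var_s
            + (1 - A) * (alpha + p - alpha * p) * full
            + (1 - alpha) * (1 - p) * unrelated)

def moet_privacy(p1, p2, mu_y, var_y, mu_r, var_r, var_s, var_t, A, alpha):
    """Eq. (16): adjusted privacy under an equal split of the sample."""
    lam1 = (1 - alpha) * (1 - p1)
    lam2 = (1 - alpha) * (1 - p2)
    full = (var_y + mu_y ** 2) * var_t + var_s
    unrelated = var_y + var_r + (mu_y - mu_r) ** 2
    return 0.5 * (2 * alpha * A * var_s
                  + (2 - lam1 - lam2) * (1 - A) * full
                  + (lam1 + lam2) * unrelated)

# Assumed illustrative scenario.
kw = dict(mu_y=2.0, var_y=1.0, mu_r=2.0, var_r=1.0, var_s=1.0, var_t=1.0, A=0.9)
for alpha in (0.0, 0.5, 1.0):
    averaged = 0.5 * (privacy_single(0.15, alpha=alpha, **kw)
                      + privacy_single(0.85, alpha=alpha, **kw))
    print(alpha, round(moet_privacy(0.15, 0.85, alpha=alpha, **kw), 4), round(averaged, 4))
```

With these assumed inputs, privacy moves from about 1.3 at $\alpha = 0$ to about 1.5 at $\alpha = 1$, in line with the privacy values quoted for $A = 0.9$ in Section 4.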

3.4. MOET: Sensitivity Estimator

Recall that the two samples used to estimate $\mu_Y$ yield the equations

$$E(Z_i) - \mu_Y = W \lambda_i (\mu_R - \mu_Y), \quad i = 1, 2. \tag{17}$$

Estimating $E(Z_i)$ by $\bar{Z}_i$ and $\mu_Y$ by $\hat{\mu}_Y$, we have:

$$\bar{Z}_i - \hat{\mu}_Y = \hat{W} \lambda_i (\mu_R - \hat{\mu}_Y), \quad i = 1, 2. \tag{18}$$

Solving the above expression for $\hat{W}$ in samples 1 and 2 and then combining estimates, we have:

$$\hat{W} = \frac{\lambda_2 (\bar{Z}_1 - \hat{\mu}_Y) + \lambda_1 (\bar{Z}_2 - \hat{\mu}_Y)}{2 \lambda_1 \lambda_2 (\mu_R - \hat{\mu}_Y)}, \quad \lambda_i = (1 - \alpha)(1 - p_i), \quad p_1 \neq p_2. \tag{19}$$

Inserting our estimator $\hat{\mu}_Y$ from Equation (12), this expression reduces to

$$\hat{W} = \frac{\bar{Z}_1 - \bar{Z}_2}{\lambda_1 (\mu_R - \bar{Z}_2) - \lambda_2 (\mu_R - \bar{Z}_1)}, \quad \lambda_i = (1 - \alpha)(1 - p_i), \quad p_1 \neq p_2. \tag{20}$$

From Equation (19) one can see that this estimator becomes unstable when $\mu_R$ is close to $\mu_Y$.
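The instability noted above can be illustrated with a short Python sketch of the estimator in Equation (20). Function names and all numeric inputs are hypothetical. The sketch feeds in exact expected means, then nudges one of them by a small amount to mimic sampling error.

```python
def expected_z(mu_y, mu_r, W, alpha, p):
    """Mean of the reported response for a given p, per Eq. (17)."""
    lam = (1 - alpha) * (1 - p)
    return mu_y + W * lam * (mu_r - mu_y)

def estimate_w(zbar1, zbar2, p1, p2, alpha, mu_r):
    """Eq. (20): sensitivity estimator from the two half-sample means."""
    lam1 = (1 - alpha) * (1 - p1)
    lam2 = (1 - alpha) * (1 - p2)
    denom = lam1 * (mu_r - zbar2) - lam2 * (mu_r - zbar1)
    if abs(denom) < 1e-12:
        # Happens when mu_R coincides with the estimated mu_Y, or when alpha = 1.
        raise ZeroDivisionError("denominator of Eq. (20) vanishes")
    return (zbar1 - zbar2) / denom

def error_after_nudge(mu_r, nudge=0.01, W=0.6, alpha=0.6, mu_y=2.0, p1=0.15, p2=0.85):
    """Absolute error in W-hat when the first sample mean is off by `nudge`."""
    z1 = expected_z(mu_y, mu_r, W, alpha, p1) + nudge
    z2 = expected_z(mu_y, mu_r, W, alpha, p2)
    return abs(estimate_w(z1, z2, p1, p2, alpha, mu_r) - W)

exact = estimate_w(expected_z(2.0, 1.0, 0.6, 0.6, 0.15),
                   expected_z(2.0, 1.0, 0.6, 0.6, 0.85),
                   0.15, 0.85, 0.6, 1.0)
print(exact, error_after_nudge(mu_r=1.0), error_after_nudge(mu_r=1.9))
```

With exact means the estimator returns $W$. The same small perturbation produces a much larger error when $\mu_R = 1.9$ than when $\mu_R = 1$, because the denominator is proportional to $\mu_R - \hat{\mu}_Y$.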

4. Discussion

In this section of the study, we provide two tables that represent theoretical and simulated values and illustrate important characteristics of the MOET model. Each row of each table represents the output from a particular scenario (combination of A, W, and α values). Each scenario represents a sampling of n = 500 respondents within an RRT sampling scenario, and each simulation sampling is conducted N = 10,000 times in R software. Table 1 shows that the estimators developed in this study perform in line with theoretical expectations. Table 1 also establishes the value of the mixture model by showing that the MOET model—which is fundamentally a mixture of a Greenberg-based and a Warner-based model—yields performance preferable to either model on its own. Table 2 compares MOET model statistics to published OET model statistics in Gupta et al. (2022), and thereby demonstrates the MSE and unified measure superiority of the MOET model to the OET model [6].

Table 1 shows the results from the MOET model run according to the parameter values listed. The close fit between the empirical and theoretical values speaks to the veracity of the formulae proposed in this study, as well as the accuracy of the empirical simulations.
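The simulations in this study were run in R and the code is not reproduced here. Purely as an illustration, a scaled-down version of such a simulation might look like the Python sketch below. Everything in it is an assumption: the normal distributions for $Y$, $S$, $T$ and $R$, the parameter values, the reduced number of replications, and the way the branch probabilities of Equation (9) are generated from separate random choices.

```python
import random

def simulate_mu_hat(rng, n, p1, p2, W, A, alpha, mu_y, sd_y, mu_r, sd_r, sd_s, sd_t):
    """One replication: split n respondents in half and apply Eq. (12)."""
    def half_sample_mean(p, size):
        total = 0.0
        for _ in range(size):
            y = rng.gauss(mu_y, sd_y)
            if rng.random() >= W:                # question not felt to be sensitive
                total += y
                continue
            needs_more = rng.random() >= A       # wants additional scrambling
            warner = rng.random() < alpha        # assigned to the Warner-based arm
            sensitive_q = warner or rng.random() < p
            if not sensitive_q:                  # Greenberg arm, unrelated question
                total += rng.gauss(mu_r, sd_r)
            elif needs_more:                     # additive plus multiplicative scrambling
                total += rng.gauss(1.0, sd_t) * y + rng.gauss(0.0, sd_s)
            elif warner:                         # additive scrambling only
                total += y + rng.gauss(0.0, sd_s)
            else:                                # Greenberg arm, sensitive question, trusting
                total += y
        return total / size

    z1 = half_sample_mean(p1, n // 2)
    z2 = half_sample_mean(p2, n // 2)
    return ((1 - p1) * z2 - (1 - p2) * z1) / (p2 - p1)

rng = random.Random(2024)
reps = 300  # far fewer than the N = 10,000 used in the study, to keep the sketch fast
params = dict(n=500, p1=0.15, p2=0.85, W=1.0, A=0.9, alpha=0.0,
              mu_y=2.0, sd_y=1.0, mu_r=2.0, sd_r=1.0, sd_s=1.0, sd_t=1.0)
estimates = [simulate_mu_hat(rng, **params) for _ in range(reps)]
mean_est = sum(estimates) / reps
mse = sum((e - params["mu_y"]) ** 2 for e in estimates) / reps
print(mean_est, mse)
```

The empirical mean should sit near $\mu_Y = 2$, and the empirical MSE should be of the same order as the theoretical variance from Equation (13) for the same inputs, with noticeable Monte Carlo noise at this small number of replications.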

We draw attention to three particularly important aspects of the scenario presented in Table 1. First, observe that in the shown scenario, maximum efficiency (minimum MSE) occurs when $\alpha = 0$ for any pairing of $A$ and $W$ values. Mathematically, this does not have to be true in every scenario. For example, in the scenario underlying Table 1, the extreme choice of $\sigma_R^2 = 7$ with all other assumptions left unchanged will lead to a circumstance where maximum efficiency is found at $\alpha = 1$ rather than $\alpha = 0$. To study this important relationship closely, we let $f(\alpha) = Var(\hat{\mu}_Y)$, per Equation (13), and then take the derivative of this expression with respect to $\alpha$. Expressing the result in the form below, we can see that $f'(\alpha)$ will always be positive under certain conditions.

$$f'(\alpha) = \left(\frac{1 - p_1}{p_2 - p_1}\right)^2 \frac{2}{n}\left[\psi_2 + \mathrm{X}_2 + \Upsilon_2 + \zeta_2\right] + \left(\frac{1 - p_2}{p_2 - p_1}\right)^2 \frac{2}{n}\left[\psi_1 + \mathrm{X}_1 + \Upsilon_1 + \zeta_1\right], \tag{21}$$

$$\psi_i = A W p_i \sigma_S^2,$$

$$\mathrm{X}_i = (1 - p_i) W \left(\left((1 - A)\sigma_T^2 + 1\right)\left(\sigma_Y^2 + \mu_Y^2\right) + \sigma_S^2 - \sigma_R^2 - \mu_R^2\right),$$

$$\Upsilon_i = 2 (\mu_R - \mu_Y) W (1 - p_i)\,\mu_Y,$$

$$\zeta_i = 2 (1 - \alpha)(\mu_R - \mu_Y)^2 W^2 (1 - p_i)^2,$$

$$i = 1, 2.$$

Specifically, under the following two sufficient but not necessary conditions, $f'(\alpha)$ will be positive across all $\alpha$ values, so $Var(\hat{\mu}_Y)$ will be an increasing function of $\alpha$. That is, maximum efficiency (minimum MSE) will always occur when $\alpha = 0$ if

$$\mu_R = \mu_Y, \tag{22a}$$

$$\left((1 - A)\sigma_T^2 + 1\right)\left(\sigma_Y^2 + \mu_Y^2\right) + \sigma_S^2 > \sigma_R^2 + \mu_R^2. \tag{22b}$$

To see why, note that under (22a) the terms $\Upsilon_i$ and $\zeta_i$ vanish, $\psi_i$ is nonnegative, and (22b) makes $\mathrm{X}_i$ positive whenever $W > 0$ and $p_i < 1$.
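The sign behaviour of the derivative can be explored numerically. The Python sketch below codes the derivative of $Var(\hat{\mu}_Y)$ with respect to the mixture parameter, term by term, together with the two sufficient conditions. Function names are hypothetical and the scenario values ($\mu_Y = \mu_R = 2$, unit variances, $n = 500$, $p_1 = 0.15$, $p_2 = 0.85$, $A = W = 1$) are assumptions chosen for illustration.

```python
def f_prime(alpha, p1, p2, n, mu_y, var_y, mu_r, var_r, var_s, var_t, W, A):
    """Eq. (21): derivative of Var(mu_hat) with respect to the mixture parameter."""
    def bracket(p):
        psi = A * W * p * var_s
        chi = (1 - p) * W * (((1 - A) * var_t + 1) * (var_y + mu_y ** 2)
                             + var_s - var_r - mu_r ** 2)
        ups = 2 * (mu_r - mu_y) * W * (1 - p) * mu_y
        zeta = 2 * (1 - alpha) * (mu_r - mu_y) ** 2 * W ** 2 * (1 - p) ** 2
        return psi + chi + ups + zeta
    c2 = ((1 - p1) / (p2 - p1)) ** 2
    c1 = ((1 - p2) / (p2 - p1)) ** 2
    return (2.0 / n) * (c2 * bracket(p2) + c1 * bracket(p1))

def conditions_hold(mu_y, var_y, mu_r, var_r, var_s, var_t, A):
    """Sufficient conditions (22a) and (22b)."""
    lhs = ((1 - A) * var_t + 1) * (var_y + mu_y ** 2) + var_s
    return mu_r == mu_y and lhs > var_r + mu_r ** 2

# Assumed illustrative scenario; only var_r differs between the two runs.
base = dict(p1=0.15, p2=0.85, n=500, mu_y=2.0, var_y=1.0, mu_r=2.0,
            var_s=1.0, var_t=1.0, W=1.0, A=1.0)
for var_r in (1.0, 7.0):
    ok = conditions_hold(2.0, 1.0, 2.0, var_r, 1.0, 1.0, 1.0)
    slopes = [f_prime(a, var_r=var_r, **base) for a in (0.0, 0.5, 1.0)]
    print(var_r, ok, slopes)
```

Under these assumed inputs the slope is positive for $\sigma_R^2 = 1$ (conditions met, so the variance is smallest at $\alpha = 0$) and negative for $\sigma_R^2 = 7$ (condition (22b) fails, and the variance is smallest at $\alpha = 1$), which mirrors the behaviour described in the text.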

The first condition sets $\mu_R$ equal to $\mu_Y$, which seems both reasonable and intuitively appealing, as the response to the unrelated question should be designed to have a similar size to that of the sensitive question under study. Because the first condition fixes $\mu_R$, the second condition puts an upper limit on the size of $\sigma_R^2$. This is not a very restrictive condition. Note that with $A = 1$, condition (22b) for the scenario presented in Table 1 reduces to $\sigma_R^2 < 2$. Our choice of $\sigma_R^2 = 1$ meets the condition comfortably.

The second important aspect of Table 1 is that it illustrates the importance of the mixture feature of the model. The mixture capability of MOET is necessary to strike an optimal balance between the model’s efficiency and its privacy. This is clear because maximum efficiency is typically achieved when $\alpha$ is low (mixture leans toward the Greenberg model), while maximum privacy is achieved when $\alpha$ is high (mixture leans toward the Warner model), provided that $A < 1$. For example, when $A = 0.9$ and $W = 1$, theoretical privacy improves from 1.3 to 1.5 as $\alpha$ increases from 0 to 1, but as privacy improves, efficiency declines (MSE increases from 0.0091 to 0.0152). Since RRT models must achieve strong levels of both privacy and efficiency, an $\alpha$ value of 0 or 1 (full Greenberg or full Warner) would be a poor choice; it would either sacrifice the model’s privacy to maximize its efficiency or vice versa. A choice of $\alpha$ between 0 and 1 (some mixture) will better balance the competing concerns.

Beyond the two key observations made above, Table 1 also confirms that the model behaves in many expected ways. When question sensitivity ($W$) is low, the model is more efficient. For example, when A = 1 and α = 0.6, efficiency is higher (MSE lower) for W = 0.2 (0.0069) than for W = 1 (0.0098). This relationship is intuitive because more respondents submit their responses without any RRT scrambling when W = 0.2, leading to greater efficiency.

When respondents do not require additional scrambling to enhance their trust (when $A$ is large), the model is more efficient. For example, when W = 0.6 and α = 0.6, efficiency is higher (MSE lower) for A = 1 (0.0083) than for A = 0.9 (0.0101). Again, this result seems intuitively reasonable. Note that Table 1 was run with $p_1 = 0.15$ and $p_2 = 0.85$. These values were chosen because widely disparate probability values lead to higher model efficiency (this is a known characteristic of the split sample approach).

The exact nature of the tradeoff between efficiency and privacy across α values, and in fact the existence of any tradeoff at all, is influenced by the sensitivity of the particular quantitative research question ($W$) and the respondent’s need for enhanced trust ($A$), as well as by the standard deviations of the chosen scrambling variables, $S$ and $T$, and the mean and variance of the unrelated question response $R$. We make this observation as a way of drawing attention to the high level of flexibility in the model. For a given quantitative research scenario in which a researcher has specific efficiency and privacy goals, the researcher can strategically choose the MOET model’s $S$, $T$ and $R$ variables and the mixture split α that will most closely achieve those goals.

Table 2 is provided principally to facilitate comparisons between the MOET model and its predecessor, the OET model (see Appendix A). The OET model, in fact, can be thought of as a special case of the MOET model with α = 1. Comparison of Equation (A1) to Equation (9) makes this relationship clear.

Recall that we have established, based on Table 1, that the empirical values of each of these statistics fall close to their theoretical values, so for purposes of legibility, the empirical values of these quantities are not shown here. The scenarios run for the OET and MOET models represented in Table 2 are the same, and the values of all parameters equal the values used in the published Gupta et al. (2022) OET model paper [6]. That is, the values of all parameters that the two models have in common ($\sigma_S$, $\mu_T$, $\sigma_T$), the assumed characteristics of the sensitive question ($\mu_Y$, $\sigma_Y$, $W$), the inclination of the respondent to respond without additional scrambling ($A$), the sample size, and number of iterations run (n, N), are the same in both models.

For Table 2, we have set $\mu_R = 1$ rather than $\mu_R = \mu_Y = 2$. This choice was made because the estimator for $W$ (see Equation (19)) becomes volatile when $\mu_R = \mu_Y$. The choice of $\mu_R = 1$ caused privacy to improve but caused efficiency to deteriorate slightly. In a real-life scenario, a researcher could choose $\mu_R$ to accommodate their research needs, choosing $\mu_R$ far from $\mu_Y$ if an estimate of W is needed.

The Table 2 scenario shows the MOET model’s efficiency to be superior to that of the OET model. For example, in the circled row, when A = 0.95 and W = 0.9, the MSE of the MOET model (0.0089) is lower than the MSE of the OET model (0.0454), implying that the MOET has superior efficiency; this efficiency superiority in fact holds true for all A and W values in the table. It follows that in scenarios where efficiency is significantly more important than privacy, the MOET model will often be the better choice of model. However, the OET model outperforms the MOET in terms of privacy. Figure 2 and Figure 3 illustrate these tabular results.

While in the illustrated scenario MOET outperforms OET in terms of efficiency, OET outperforms MOET in terms of privacy. When we combine the two measures, we see that the MOET model outperforms the OET model according to the unified measure. For example, when A = 1 and W = 1, we can see in Table 2 that $\delta^a = 0.0056$ for the MOET model is lower than $\delta^a = 0.0121$ for the OET model; the superiority of the MOET model in terms of unified measure holds true for all A and W values.

5. Conclusions

The Mixture Optional Enhanced Trust model proposed in this study has important advantages over the OET model, which in turn was shown to be superior to the basic Warner model (Gupta et al. 2022) [6]. Specifically, the MOET model can yield lower MSE (and therefore higher efficiency) than the OET model when MOET parameters are selected that favor efficiency over privacy. The OET model generally achieves superior privacy, so it will usually be the better model in cases where the need for privacy is the overriding concern. In the scenarios studied, however, the lower MSE of the MOET model more than offsets the higher privacy offered by the OET model, as indicated by superior unified measure values. We have shown, furthermore, that the MOET model’s mixture capability (captured by model parameter α) causes MOET to be superior to either a full Warner-based or a full Greenberg-based model. This mixture model will balance the competing concerns of privacy and efficiency, which will always be preferable to fully sacrificing one of these important characteristics for the other.

Author Contributions

Conceptualization, S.G.; methodology, M.P., S.G. and S.K.; software, S.K., and M.P.; validation, S.K. and M.P.; formal analysis, M.P., S.K. and S.G.; writing—original draft preparation, M.P.; writing—review and editing, S.G., S.K. and M.P.; supervision, S.G. and S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

This study is based on simulation and does not involve a dataset. The R code used to run simulations will be provided upon query.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following is a list of acronyms used in the paper:

MOET model	Mixture Optional Enhanced Trust model
OET model	Optional Enhanced Trust model
MSE	Mean Squared Error
RRT	Randomized Response Technique
SDB	Social Desirability Bias

Appendix A. The Optional Enhanced Trust Model

Gupta et al. (2022) proposed an Optional Enhanced Trust (OET) model [6]. We review the OET model here because it is a key predecessor to the Mixture Optional Enhanced Trust (MOET) model proposed in this study. Comparisons between the OET and the MOET model help illustrate the unique value of the new MOET model.

The OET is a Warner-based RRT model that allows respondents to elect RRT or not (optionality) and allows RRT respondents to elect additional multiplicative or linear combination scrambling if they feel that doing so will enhance their trust. The parameter $W$ captures a respondent’s desire to use some form of scrambling to hide their response, so $W$ can be thought of as a measure of the sensitive question’s sensitivity. The parameter $A$ captures a respondent’s trust in additive scrambling alone, so that $(1 - A)$ captures a respondent’s desire for additive and multiplicative scrambling (rather than just additive scrambling) to enhance their trust in the RRT. Here, $S$ and $T$ are additive and multiplicative scrambling variables, respectively, with means $\mu_S = 0$ and $\mu_T = 1$ and known variances $\sigma_S^2$ and $\sigma_T^2$. The resulting OET model is shown below. Gupta et al. (2022) showed that the OET model achieved superior efficiency and better mitigated lack of trust as compared to Warner’s model [6].

Below, Figure A1 presents a diagram of the OET model:

Figure A1. OET model.

$$Z = \begin{cases} Y & \text{with probability } 1 - W \\ Y + S & \text{with probability } W A \\ TY + S & \text{with probability } W (1 - A) \end{cases} \tag{A1}$$

Gupta et al. (2022) provided estimators for the OET’s mean, variance of mean estimator (efficiency), and privacy, as given by [6]:

Mean Estimator:

$$\hat{\mu}_Y = \bar{Z}. \tag{A2}$$

Variance of Mean Estimator (Efficiency):

$$Var(\hat{\mu}_Y) = \frac{1}{n}\left[W (1 - A)\left(\sigma_T^2 \sigma_Y^2 + \sigma_T^2 \mu_Y^2\right) + \sigma_Y^2 + W \sigma_S^2\right]. \tag{A3}$$

If $W$ and/or $A$ are unknown, the variance of the mean estimator can be estimated by

$$\widehat{Var}(\hat{\mu}_Y) = \frac{s_Z^2}{n}, \tag{A4}$$

where $s_Z^2$ is the variance in sample responses.

Privacy:

$$\nabla^a = (1 - A)\left(\sigma_T^2 \sigma_Y^2 + \sigma_T^2 \mu_Y^2\right) + \sigma_S^2. \tag{A5}$$

Unified Measure:

$$\delta^a = \frac{Var(\hat{\mu}_Y)}{\nabla^a} \quad \text{or} \quad \frac{\widehat{Var}(\hat{\mu}_Y)}{\nabla^a}. \tag{A6}$$
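The OET measures can be sketched in Python and the variance checked against a direct simulation of the model in (A1). This is an illustrative sketch, not code from Gupta et al. (2022): the function names and parameter values are hypothetical, normal distributions are assumed, and the additive-scrambling term in the variance is taken to enter as $W\sigma_S^2$, which is what the three branches of (A1) imply.

```python
import random

def oet_variance(n, mu_y, var_y, var_s, var_t, W, A):
    """Eq. (A3), with the additive term taken as W * var_s (assumption noted above)."""
    return (W * (1 - A) * var_t * (var_y + mu_y ** 2) + var_y + W * var_s) / n

def oet_privacy(mu_y, var_y, var_s, var_t, A):
    """Eq. (A5): adjusted privacy (W = 1)."""
    return (1 - A) * var_t * (var_y + mu_y ** 2) + var_s

def oet_unified(n, mu_y, var_y, var_s, var_t, W, A):
    """Eq. (A6), using the theoretical variance."""
    return (oet_variance(n, mu_y, var_y, var_s, var_t, W, A)
            / oet_privacy(mu_y, var_y, var_s, var_t, A))

def simulate_response_variance(mu_y, var_y, var_s, var_t, W, A, draws=100_000, seed=7):
    """Sample variance of Z drawn from the three branches of (A1), assuming normals."""
    rng = random.Random(seed)
    zs = []
    for _ in range(draws):
        y = rng.gauss(mu_y, var_y ** 0.5)
        if rng.random() >= W:            # no RRT: report Y
            z = y
        elif rng.random() < A:           # additive scrambling only
            z = y + rng.gauss(0.0, var_s ** 0.5)
        else:                            # additive plus multiplicative scrambling
            z = rng.gauss(1.0, var_t ** 0.5) * y + rng.gauss(0.0, var_s ** 0.5)
        zs.append(z)
    mean = sum(zs) / draws
    return sum((z - mean) ** 2 for z in zs) / (draws - 1)

# Assumed illustrative values; n = 1 gives the per-response variance.
theory = oet_variance(n=1, mu_y=2.0, var_y=1.0, var_s=1.0, var_t=1.0, W=0.5, A=0.9)
empirical = simulate_response_variance(2.0, 1.0, 1.0, 1.0, W=0.5, A=0.9)
print(theory, empirical, oet_privacy(2.0, 1.0, 1.0, 1.0, A=0.9))
```

With $W = 0.5$ the two readings of the additive term differ by a full unit of variance, so the simulation gives a clear check on which one matches the model.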

References

1. Raghavarao, D.; Federer, W.T. Block total response as an alternative to the randomized response method in surveys. J. R. Stat. Soc. Ser. B (Methodol.) 1979, 41, 40–45.
2. Reynolds, W. Development of a reliable and valid short form of the Marlowe-Crowne SDB scale. J. Clin. Psychol. 1982, 38, 119–125.
3. Jones, E.; Sigall, H. The Bogus Pipeline: A new paradigm for measuring affect and attitude. Psychol. Bull. 1971, 76, 349–364.
4. Warner, S. The Linear Randomized Response Model. J. Am. Stat. Assoc. 1971, 66, 884–888.
5. Greenberg, B.G.; Abul-Ela, A.L.A.; Simmons, W.R.; Horvitz, D.G. The unrelated question randomized response model: Theoretical framework. J. Am. Stat. Assoc. 1971, 66, 243–250.
6. Gupta, S.; Zhang, J.; Khalil, S.; Sapra, P. Mitigating Lack of Trust in Quantitative Randomized Response Techniques Models. Commun. Stat. Part B Simul. Comput. 2022, 51, 1–9.
7. Mehta, S.; Aggarwal, P. Bayesian estimation of sensitivity level and population proportion of a sensitive characteristic in a binary optional unrelated question RRT model. Commun. Stat.-Theory Methods 2018, 47, 4021–4028.
8. Diana, G.; Perri, P.F. A class of estimators for quantitative sensitive data. Stat. Pap. 2011, 52, 633–650.
9. Singh, G.N.; Kumar, A.; Vishwakarma, G.K. Some alternative additive randomized response models for estimation of population mean of quantitative sensitive variable in the presence of scramble variable. Commun. Stat.-Simul. Comput. 2018, 49, 2785–2807.
10. Priyanka, K.; Trisandhya, P. A composite class of estimators using scrambled response mechanism for sensitive population mean in successive sampling. Commun. Stat.-Theory Methods 2019, 48, 1009–1032.
11. Gupta, S.; Gupta, B.; Singh, S. Estimation of the sensitivity level of personal interview survey questions. J. Stat. Plan. Inference 2002, 100, 239–247.
12. Gupta, S.; Mehta, S.; Shabbir, J.; Khalil, S. A unified measure of respondent privacy and model efficiency in quantitative RRT models. J. Stat. Theory Pract. 2018, 12, 506–511.
13. Vishwakarma, G.K.; Kumar, A.; Kumar, N. Two-stage unrelated randomized response model to estimate the prevalence of a sensitive attribute. Comput. Stat. 2023, 52, 1–26.
14. Narjis, G.; Shabbir, J. An efficient partial randomized response model for estimating a rare sensitive attribute using Poisson distribution. Commun. Stat.-Theory Methods 2021, 50, 1–17.
15. Lovig, M.; Khalil, S.; Rahman, S.; Sapra, P. A mixture binary RRT model with a unified measure of privacy and efficiency. Commun. Stat.-Simul. Comput. 2021, 52, 2727–2737.
16. Yan, Z.; Wang, J.; Lai, J. An Efficiency and Protection Degree-Based Comparison Among the Quantitative Randomized Response Strategies. Commun. Stat.-Theory Methods 2008, 38, 400–408.

Figure 1. MOET model.

Figure 2. MOET vs. OET comparison across trust levels. (a) Efficiency results for MOET and OET models across trust levels (A). (b) Privacy results for MOET and OET models across trust levels (A). Parameter values underlying these exhibits are those cited in Table 2.

Figure 3. MOET vs. OET comparison across sensitivity levels. (a) Efficiency results for MOET and OET models across sensitivity levels (W). (b) Privacy results for MOET and OET models across sensitivity levels (W). Parameter values underlying these exhibits are those cited in Table 2.

Table 1. MOET Results (p_{1} = 0.85, p_{2} = 0.15, μ_{Y} = 2, μ_{S} = 0, μ_{T} = 1, μ_{R} = 2, σ_{Y} = 1, σ_{S} = 1, σ_{T} = 1, σ_{R} = 1, n = 500, N = 10,000).

A	W	α	$\hat{µ}_{Y}$	$MSE(\hat{µ}_{Y})_{T}$ *	$MSE(\hat{µ}_{Y})_{E}$ *	$(𝛻^{a})_{T}$ *	$(𝛻^{a})_{E}$ *	$(δ^{a})_{T}$ *	$(δ^{a})_{E}$ *
1	1	1.0	2.0005	0.0122	0.0124	1.0000	1.0002	0.0122	0.0124
1	1	0.8	2.0017	0.0109	0.0111	1.0000	1.0003	0.0109	0.0111
1	1	0.6	2.0001	0.0097	0.0098	1.0000	0.9991	0.0097	0.0098
1	1	0.4	2.0021	0.0085	0.0083	1.0000	1.0012	0.0085	0.0083
1	1	0.2	1.9996	0.0073	0.0073	1.0000	1.0007	0.0073	0.0073
1	1	0	1.9999	0.0061	0.0061	1.0000	1.0016	0.0061	0.0061
1	0.6	1.0	2.0002	0.0097	0.0098	1.0000	1.0002	0.0097	0.0098
1	0.6	0.8	2.0005	0.0090	0.0092	1.0000	1.0003	0.0090	0.0092
1	0.6	0.6	2.0002	0.0083	0.0083	1.0000	0.9991	0.0083	0.0083
1	0.6	0.4	1.9999	0.0075	0.0075	1.0000	1.0012	0.0075	0.0075
1	0.6	0.2	1.9995	0.0068	0.0069	1.0000	1.0007	0.0068	0.0069
1	0.6	0	2.0006	0.0061	0.0062	1.0000	1.0016	0.0061	0.0062
1	0.2	1.0	1.9988	0.0073	0.0074	1.0000	1.0002	0.0073	0.0074
1	0.2	0.8	1.9988	0.0071	0.0069	1.0000	1.0003	0.0071	0.0069
1	0.2	0.6	2.0001	0.0068	0.0069	1.0000	0.9991	0.0068	0.0069
1	0.2	0.4	2.0006	0.0066	0.0066	1.0000	1.0012	0.0066	0.0066
1	0.2	0.2	2.0000	0.0063	0.0062	1.0000	1.0007	0.0063	0.0062
1	0.2	0	1.9996	0.0061	0.0062	1.0000	1.0016	0.0061	0.0062
0.9	1	1.0	2.0004	0.0152	0.0154	1.5000	1.5004	0.0101	0.0103
0.9	1	0.8	2.0002	0.0140	0.0137	1.4600	1.4585	0.0096	0.0094
0.9	1	0.6	1.9992	0.0128	0.0125	1.4200	1.4207	0.0090	0.0088
0.9	1	0.4	2.0016	0.0115	0.0116	1.3800	1.3798	0.0083	0.0084
0.9	1	0.2	1.9999	0.0103	0.0104	1.3400	1.3393	0.0077	0.0078
0.9	1	0	1.9992	0.0091	0.0089	1.3000	1.2982	0.0070	0.0069
0.9	0.6	1.0	2.0002	0.0116	0.0116	1.5000	1.5004	0.0077	0.0077
0.9	0.6	0.8	1.9992	0.0108	0.0109	1.4600	1.4585	0.0074	0.0075
0.9	0.6	0.6	1.9996	0.0101	0.0101	1.4200	1.4207	0.0071	0.0071
0.9	0.6	0.4	1.9988	0.0094	0.0092	1.3800	1.3798	0.0068	0.0067
0.9	0.6	0.2	2.0000	0.0086	0.0085	1.3400	1.3393	0.0064	0.0063
0.9	0.6	0	1.9995	0.0079	0.0080	1.3000	1.2982	0.0061	0.0062
0.9	0.2	1.0	2.0005	0.0079	0.0078	1.5000	1.5004	0.0053	0.0052
0.9	0.2	0.8	1.9989	0.0077	0.0076	1.4600	1.4585	0.0053	0.0052
0.9	0.2	0.6	2.0004	0.0074	0.0075	1.4200	1.4207	0.0052	0.0053
0.9	0.2	0.4	2.0001	0.0072	0.0072	1.3800	1.3798	0.0052	0.0052
0.9	0.2	0.2	2.0022	0.0069	0.0070	1.3400	1.3393	0.0051	0.0052
0.9	0.2	0	2.0007	0.0067	0.0069	1.3000	1.2982	0.0052	0.0053

* $MSE(\hat{µ}_{Y})_{T}$ and $MSE(\hat{µ}_{Y})_{E}$ represent the theoretical and empirical mean squared error of the estimator of the mean of Y. $(𝛻^{a})_{T}$ and $(𝛻^{a})_{E}$ represent the theoretical and empirical model privacy. $(δ^{a})_{T}$ and $(δ^{a})_{E}$ represent the theoretical and empirical unified measure.

Table 2. MOET Results compared to OET Results (α = 0.15, p_{1} = 0.85, p_{2} = 0.15, μ_{Y} = 2, μ_{S} = 0, μ_{S 1} = 2, μ_{S 2} = 1, μ_{T} = 1, μ_{R} = 1, σ_{Y} = 1, σ_{S} = 1, σ_{T} = 1, σ_{R} = 1, n = 500, N = 10,000).

		OET Model					MOET Model
A	W	$\hat{µ}_{Y}$	$MSE(\hat{µ}_{Y})_{T}$	$(𝛻^{a})_{T}$ *	$(δ^{a})_{T}$	$\hat{W}$	$\hat{µ}_{Y}$	$MSE(\hat{µ}_{Y})_{T}$	$(𝛻^{a})_{T}$	$(δ^{a})_{T}$	$\hat{W}$
1	1	1.99899	0.0400	3.500	0.0114	1.0001	2.00122	0.0077	1.4250	0.0054	0.9956
1	0.9	2.0017	0.0409	3.500	0.0117	0.8989	1.9996	0.0075	1.4250	0.0053	0.8938
1	0.7	1.9975	0.0407	3.500	0.0116	0.7023	1.9993	0.0072	1.4250	0.0051	0.6938
1	0.5	1.9985	0.0380	3.500	0.0109	0.5013	2.0010	0.0069	1.4250	0.0048	0.4941
1	0.3	1.9999	0.0327	3.500	0.0093	0.3001	1.9997	0.0066	1.4250	0.0046	0.2914
0.95	1	2.0014	0.0450	3.750	0.0120	0.9989	1.9994	0.0092	1.5900	0.0058	0.9953
0.95	0.9	1.9991	0.0454	3.750	0.0121	0.9014	1.9990	0.0089	1.5900	0.0056	0.8933
0.95	0.7	2.0024	0.0442	3.750	0.0118	0.6980	2.0010	0.0083	1.5900	0.0052	0.6933
0.95	0.5	2.0006	0.0405	3.750	0.0108	0.4996	2.0005	0.0077	1.5900	0.0048	0.4925
0.95	0.3	1.9966	0.0342	3.750	0.0091	0.3030	1.9999	0.0071	1.5900	0.0045	0.2901
0.9	1	2.0010	0.0500	4.000	0.0125	0.9993	2.0007	0.0107	1.7550	0.0061	0.9960
0.9	0.9	1.9997	0.0499	4.000	0.0125	0.9000	1.9995	0.0103	1.7550	0.0059	0.8925
0.9	0.7	2.0010	0.0477	4.000	0.0119	0.6997	1.9996	0.0094	1.7550	0.0054	0.6916
0.9	0.5	1.9988	0.0430	4.000	0.0108	0.5013	1.9975	0.0084	1.7550	0.0048	0.4883
0.9	0.3	2.0000	0.0357	4.000	0.0089	0.3012	2.0002	0.0075	1.7550	0.0043	0.2889
0.85	1	2.0038	0.0550	4.250	0.0129	0.9980	2.0017	0.0122	1.9200	0.0064	0.9936
0.85	0.9	2.0024	0.0544	4.250	0.0128	0.8984	1.9996	0.0116	1.9200	0.0060	0.8910
0.85	0.7	2.0032	0.0512	4.250	0.0120	0.6988	1.9995	0.0104	1.9200	0.0054	0.6915
0.85	0.5	2.0002	0.0455	4.250	0.0107	0.4997	2.0000	0.0092	1.9200	0.0048	0.4882
0.85	0.3	1.9996	0.0372	4.250	0.0088	0.3000	2.0001	0.0080	1.9200	0.0042	0.2908
0.8	1	2.0028	0.0600	4.500	0.0133	0.9987	1.9974	0.0137	2.0850	0.0066	0.9917
0.8	0.9	2.0036	0.0589	4.500	0.0131	0.8982	2.0005	0.0130	2.0850	0.0062	0.8915
0.8	0.7	1.9997	0.0547	4.500	0.0122	0.7004	1.9996	0.0115	2.0850	0.0055	0.6895
0.8	0.5	2.0041	0.0480	4.500	0.0107	0.4971	2.0010	0.0100	2.0850	0.0048	0.4906
0.8	0.3	1.9990	0.0387	4.500	0.0086	0.3008	2.0000	0.0084	2.0850	0.0040	0.28888

* $MSE(\hat{µ}_{Y})_{T}$, $(𝛻^{a})_{T}$, and $(δ^{a})_{T}$ represent the theoretical mean squared error, privacy and unified measure for the respective models. Green indicates model superiority.
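The comparison in Table 2 can be spot-checked with a short Python sketch. It transcribes three rows of theoretical values from the table, recomputes the unified measure as MSE divided by privacy, and reports how many times larger the OET MSE is than the MOET MSE. The row selection, the `compare` helper, and the dictionary keys are hypothetical choices made for this illustration only.

```python
# Theoretical values transcribed from Table 2:
# (A, W, OET MSE, OET privacy, MOET MSE, MOET privacy)
rows = [
    (1.0, 1.0, 0.0400, 3.500, 0.0077, 1.4250),
    (0.9, 1.0, 0.0500, 4.000, 0.0107, 1.7550),
    (0.8, 0.3, 0.0387, 4.500, 0.0084, 2.0850),
]


def compare(row):
    """Recompute the unified measure for both models and the OET/MOET MSE ratio."""
    A, W, mse_oet, priv_oet, mse_moet, priv_moet = row
    return {
        "A": A,
        "W": W,
        "mse_ratio": mse_oet / mse_moet,      # > 1 means MOET has the smaller MSE
        "delta_oet": mse_oet / priv_oet,      # unified measure, OET
        "delta_moet": mse_moet / priv_moet,   # unified measure, MOET
    }


for row in rows:
    result = compare(row)
    print(
        f"A={result['A']}, W={result['W']}: "
        f"MSE ratio={result['mse_ratio']:.2f}, "
        f"delta OET={result['delta_oet']:.4f}, "
        f"delta MOET={result['delta_moet']:.4f}"
    )
```

For these rows the recomputed unified measures agree with the tabulated values to four decimal places, and the MSE ratio stays above 1, which matches the efficiency advantage of MOET that the table reports. The same rows also show the lower theoretical privacy value for MOET.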

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Parker, M.; Gupta, S.; Khalil, S. A Mixture Quantitative Randomized Response Model That Improves Trust in RRT Methodology. Axioms 2024, 13, 11. https://doi.org/10.3390/axioms13010011

