1. Introduction
In recent years, data-driven decision-making has been routinely employed in various contexts, including evaluating the efficacy of medical treatments, vaccinations, training programs, marketing campaigns, and other interventions. The crux of effective decision-making lies in the ability to manage uncertainty and make informed choices based on available information. The concept of entropy, which quantifies uncertainty in a probability distribution, plays a crucial role in understanding how information is generated and utilized in these contexts.
Suppose an experimenter has collected data on n units, along with baseline covariate information. The question arises: how should treatments be allocated based on these pre-treatment covariates to provide precise causal estimates of treatment effects on potential outcomes? Here, the balance of information—essentially the reduction of uncertainty—across treatment groups becomes vital for drawing valid conclusions.
Randomized experiments, including completely randomized experiments and re-randomized experiments [1], are commonly employed as control methods. However, ref. [2] argued that a deterministic experiment can strictly dominate all randomized experiments in minimizing estimation error when continuous covariates are present. The ability of a deterministic design to minimize covariate imbalance reflects its potential to reduce uncertainty in treatment effect estimates, thereby leading to more reliable causal inferences. This deterministic experiment seeks to balance the covariate distribution across treatments, resulting in fairer experiments relative to random assignments. Consequently, deterministic experiments often appear more appealing in practice.
Ref. [2] developed a Bayesian optimal deterministic experiment (BODE) by minimizing the expected mean squared error (MSE) risk of the estimated treatment effect across assumed stochastic processes for the potential outcomes. However, identifying a BODE requires specifying approximate prior distributions for the potential outcomes, which introduces additional uncertainty in the modeling process.
In this paper, we introduce a novel type of optimal deterministic experiment, termed the minimax optimal deterministic experiment (MODE). The design criterion to be minimized is the maximum MSE risk associated with treatment assignment over a class of mean and variance functions of potential outcomes. This model-independent approach inherently reduces uncertainty by avoiding reliance on specific parametric assumptions. The bias component of the maximum risk is proportional to the generalized discrepancy, which quantifies covariate imbalance by measuring the difference between the empirical distributions of covariates in treatment and control groups. This generalized discrepancy serves as a critical measure of information loss, providing insight into the uncertainty surrounding treatment assignments.
Quasi-Monte Carlo methods are employed to generate the MODE. From a covariate balance perspective, the covariates in a randomized experiment can be viewed as a Monte Carlo sample from the covariate population. As a result, existing randomized experiments are sub-optimal, and the proposed MODE exhibits reduced covariate imbalance and lower estimation MSE. In particular, when potential outcomes are fixed, the MODE improves the convergence rate of the MSE of the difference-in-means estimator relative to the completely randomized experiment (CRE). Thus, for most online control experiments prioritizing estimation accuracy and the effective management of uncertainty, the proposed MODE is preferred. Additionally, we provide a re-randomization relaxation of the MODE for situations where randomization is necessary due to practical constraints.
The remainder of this paper is organized as follows. In Section 1.1, we review related experimental designs for causal inference. Our main contributions are summarized in Section 1.2. In Section 2, we formulate the problem of finding a MODE. Section 3 derives the large sample properties of the MODE. Section 4 theoretically compares the MODE with existing randomized experiments. Section 5 compares the finite sample performance of the MODE with that of existing experiments across different potential outcomes models. Finally, we conclude this paper and discuss future work in Section 6. For clarity, all proofs are provided in Appendix B.
1.1. Related Works
Completely randomized experiments are considered the gold standard for estimating causal effects, as randomization balances all potential confounding variables, both observed and unobserved, on average, thereby providing reliable estimates of treatment effects. However, as highlighted by [3], in a particular randomized experiment, covariates between treatment groups may become significantly unbalanced, leading to increased estimation error and greater uncertainty about the treatment effects. As pointed out by [1], with 10 independent covariates, the probability of at least one covariate showing a significant difference at the 5% significance level between treatment and control groups is about 40%. Similarly, ref. [4] emphasized that achieving covariate balance (e.g., through matching) enhances robustness to varying parametric assumptions.
Numerous methods exist in the design stage of an experiment to address covariate imbalance, including blocked randomization [5] and re-randomization [1,6]. Blocking is a classical approach that is effective for balancing a limited number of discrete covariates. In contrast, re-randomization, as proposed by [1], offers a comprehensive framework for reducing covariate imbalance, thereby enhancing the reliability of causal estimates. In this method, the randomized experiment is repeated until the covariates between treatment groups meet a pre-specified balance criterion. Thus, re-randomization generalizes blocking to accommodate multiple covariates with various values. For an in-depth review of the design and analysis of randomized experiments, please refer to [7].
However, from the perspective of statistical decision-making, randomized experiments are dominated by deterministic experiments [2]. Recent works by [8,9] have introduced optimal deterministic experiments based on assumed parametric potential outcome models. The BODE proposed by [2] depends on the chosen prior for potential outcomes, which adds another layer of uncertainty. All of these optimal designs are sensitive to the specified potential outcome models. In contrast, the proposed MODE is model-independent, thereby mitigating the uncertainty associated with model specification.
The generalized discrepancy utilized in the MODE is analogous to the optimization objective of kernel optimal matching [10], which is employed for covariate balancing in the analysis stage of an experiment. However, the generalized discrepancy is applied during the design stage, while the optimization variables in kernel optimal matching are the weights for the estimator. Unlike post hoc methods, the MODE is implemented entirely in the design phase, ensuring it remains unaffected by outcome data. Further arguments for doing as much as possible in the design phase of an experiment can be found in [11].
1.2. Main Contributions
Firstly, we introduce a novel type of model-independent deterministic experiment, termed the MODE, which minimizes uncertainty in treatment effect estimation. Secondly, we provide a theoretical proof demonstrating that, when potential outcomes are fixed, the proposed MODE reduces the covariate imbalance and the estimation MSE of the completely randomized experiment and of the re-randomization experiment using the Mahalanobis distance, improving the convergence rate of the MSE. The third contribution of this paper is establishing the relationship between the maximum risk of causal estimators and the generalized discrepancy, emphasizing how reducing this discrepancy enhances the overall information quality in causal inference.
2. Problem Setups
Let denote the observable data for a population of n units, where is a binary treatment for the i-th unit, i.e., the i-th unit is treated if and only if . and represent the p-dimensional pre-treatment covariates and the real-valued outcome for the i-th unit, respectively. Let represent all the p-dimensional covariates. We define and as the numbers of treated and untreated units, respectively. The set includes all possible treatment assignments with treated units.
Under the finite population potential outcomes framework and the stable unit-treatment-value assumption [12], we have , where and are the potential outcomes for the i-th unit under the treatment and control, respectively. This paper considers the randomness arising from both treatment assignment and potential outcomes. For and , we assume that the conditional mean and covariance of potential outcomes given all covariates are as follows:
where and are unknown mean and variance functions, respectively, and is the indicator function. The conditions in (1) impose restrictions on the correlations between potential outcomes and covariates, and they hold if are mutually independent [10].
Typically, the target of interest is to effectively estimate the average treatment effect using the difference-in-means estimator. In this paper, we aim to design a deterministic treatment assignment with to improve the estimation precision of the difference-in-means estimator. For simplicity, let represent the vector of mean and variance functions. The conditional MSE of given can be decomposed as follows:
where the conditional variance and squared bias are given by the following:
From the above expression, the MSE risk depends on the treatment assignment , the set of covariates , and the vector of unknown functions . Therefore, to minimize the MSE, we can carefully design by fully utilizing the information contained in and . Different assumptions regarding lead to distinct optimal treatment assignments [8,9]. However, these assumptions are unverifiable since the potential outcomes are unobservable. In this paper, we adopt a minimax framework to identify the most robust treatment assignment with respect to a class of unknown mean and variance functions. We do not impose specific functional forms on these unknown functions; instead, we restrict them to belong to a function class , as rigorously defined in (7). The maximum risk of a treatment is quantified by the worst-case MSE for that treatment over all possible true mean and variance functions in , i.e.,
The most robust treatment assignment with respect to is the one for which the maximum risk achieves the minimum over all possible treatments in . More precisely, we define the MODE as follows.
Definition 1. A treatment is called a MODE if
Therefore, the task of obtaining a MODE can be viewed as a game between the experimenter, who chooses a treatment assignment from , and nature, which chooses the worst-case potential outcome structures from . Additionally, the MODE is model-independent, as the maximum risk defined in (6) accounts for the uncertainties of all possible potential outcomes models. If multiple treatments attain the minimum maximum risk, the MODE in Definition 1 is not unique, but all such treatments are equivalent in terms of minimizing the maximum risk. Therefore, we can choose any one of them to conduct the experiment.
3. Minimax Optimal Deterministic Experiments
In this section, we first derive two equivalent expressions for the maximum risk in (6). We then consider the MODE, as outlined in Definition 1, under both fixed and divergent sample sizes. Additionally, we provide an algorithm for generating the MODE.
3.1. Expressions for the Maximum Risk
For any with , let and denote the covariate information of the treated and untreated units, respectively. To ensure that the maximum risk in (6) is well defined, we assume that
where and are pre-specified parameters, and is a reproducing kernel Hilbert space, equipped with the inner product and the reproducing kernel . A crucial feature of this kernel is its reproducibility, meaning that for any and ; see [13,14].
After embedding the mean functions into the reproducing kernel Hilbert space , we provide an analytical expression for the maximum risk by bounding the bias term using a generalized Koksma–Hlawka inequality.
Theorem 1. For any with , the maximum risk is expressed as follows:
where and are constants specified in Δ, and is the squared generalized discrepancy between and , defined by the following:

Theorem 1 shows that the variance component of the maximum risk is minimized by any balanced experiment in , i.e., . The bias component of the maximum risk is determined by the sum of the generalized discrepancies of the treated and untreated covariates. The generalized discrepancy defined in (8) can alternatively be expressed as the squared distance between the empirical distribution functions of the point sets and , denoted as and , respectively. Specifically, we have
where is the distance metric induced by the inner product . A point set with a small discrepancy indicates that the empirical distributions of and are close in terms of the distance metric .
Different choices of the reproducing kernel Hilbert space correspond to different generalized discrepancies, resulting in varying expressions for the maximum risk. We provide two widely used reproducing kernel Hilbert spaces, as described in [15,16].

Example 1. Let , which is the set of all real-valued functions on that have square-integrable mixed first derivatives [15]. In this context, can represent various types of discrepancies, such as the generalized -discrepancy and the centered -discrepancy. For instance, the kernel function for the centered -discrepancy is .

Example 2. Let . For the explicit form of this space, refer to [16]. It has been established that the space contains the Sobolev space when p is even; otherwise, , where is the set of functions whose s-th order derivatives are square-integrable. In this context, is the energy distance between and , with the kernel function defined as .

According to the property of the generalized discrepancy defined in (8), we provide another expression for the maximum risk.
Corollary 1. The maximum risk expression in Theorem 1 is equivalent to the following:

Corollary 1 indicates that the bias component of the maximum risk can be represented as the squared generalized discrepancy between the treated covariates and the untreated covariates . Consequently, the closer the empirical distributions of and are, the smaller the maximum risk becomes. The generalized discrepancy defined in (8) can be equivalently expressed as a maximum mean discrepancy [17], given by the following:

Thus, serves as a measure of covariate imbalance that incorporates a range of mean differences rather than focusing solely on a specific mean difference, as the Mahalanobis distance defined in (14) does. A zero generalized discrepancy implies that the covariate distributions in the treatment and control groups are identical. This distributional consistency cannot be achieved through mean matching alone. Additionally, using the equivalent expression for the maximum risk in Corollary 1 enhances computational efficiency, as evaluating the maximum risk of a treatment requires calculating the value of the discrepancy function only once.
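To make the discrepancy in Corollary 1 concrete, the energy-distance form of the generalized discrepancy (Example 2) can be computed directly from pairwise Euclidean distances between the treated and untreated covariates. A minimal sketch in Python; the function name `energy_distance` and the sample data are our own illustration, not code from the paper:

```python
import numpy as np

def energy_distance(A, B):
    """Energy-distance form of the generalized discrepancy between two
    point sets A (n1 x p) and B (n2 x p): zero exactly when the two
    empirical distributions coincide, larger under covariate imbalance."""
    def mean_pdist(X, Y):
        # average Euclidean distance over all pairs (x, y)
        diffs = X[:, None, :] - Y[None, :, :]
        return np.sqrt((diffs ** 2).sum(-1)).mean()

    return 2 * mean_pdist(A, B) - mean_pdist(A, A) - mean_pdist(B, B)

rng = np.random.default_rng(0)
treated = rng.uniform(size=(50, 2))
control = rng.uniform(size=(50, 2))
shifted = control + 0.5  # a clearly imbalanced "control" group

print(energy_distance(treated, control))  # similar distributions: small value
print(energy_distance(treated, shifted))  # shifted distribution: larger value
```

Because the statistic depends only on the empirical distributions, it detects distributional mismatch (e.g., a location shift) that mean matching alone would miss.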
3.2. MODE and Asymptotic MODE
For any fixed sample size n and discrepancy function in (10), we identify the MODE in Definition 1 by solving the following:

where . The treatment corresponding to the above is a MODE. When n is small, we can obtain the MODE by enumerating all elements in . However, this solution depends on the pre-specified parameters and , and it becomes almost infeasible for large n values since the cardinality of grows exponentially with n. Consequently, we explore the concept of the asymptotic MODE as n approaches infinity. To make progress, we require that the considered discrepancy function belongs to the following set:
Notably, a Monte Carlo point set of size with will typically have a discrepancy of order . In contrast, a quasi-Monte Carlo point set, also referred to as representative points and defined as the minima of a discrepancy, possesses a discrepancy of order . This low-discrepancy property supports the theoretical assertion that quasi-Monte Carlo points outperform Monte Carlo points and is fulfilled by commonly used discrepancies. For example, ref. [16] demonstrated that the n-point support points converge at a rate of for any , measured by the energy distance presented in Example 2. Additionally, Theorem 1 of [18] shows that if the covariates are jointly independent, the n-point data-driven subsamples converge at a rate of for any , measured by the generalized -discrepancy presented in Example 1. Thus, all discrepancies in Examples 1 and 2 belong to .
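For intuition on the small-n case, where the MODE can be found by enumerating all elements of the balanced assignment set, the following sketch enumerates every balanced split of a toy dataset and selects the one minimizing the energy distance between treated and untreated covariates, in the spirit of Corollary 1. The helper names (`exact_mode`, `energy_distance`) and the toy data are ours, for illustration only:

```python
import itertools
import numpy as np

def energy_distance(A, B):
    # energy-distance discrepancy between two point sets (see Example 2)
    d = lambda X, Y: np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)).mean()
    return 2 * d(A, B) - d(A, A) - d(B, B)

def exact_mode(X):
    """Brute-force MODE for small even n: enumerate all balanced
    assignments and return the treated-unit mask minimizing the
    discrepancy between treated and untreated covariates."""
    n = len(X)
    best, best_val = None, np.inf
    for treated_idx in itertools.combinations(range(n), n // 2):
        mask = np.zeros(n, dtype=bool)
        mask[list(treated_idx)] = True
        val = energy_distance(X[mask], X[~mask])
        if val < best_val:
            best, best_val = mask.astype(int), val
    return best, best_val

rng = np.random.default_rng(1)
X = rng.uniform(size=(10, 2))   # n = 10 units, p = 2 covariates
w, disc = exact_mode(X)         # 0/1 assignment vector and its discrepancy
```

Enumeration visits all C(n, n/2) balanced assignments, which is why the exact search becomes infeasible for large n and motivates the asymptotic MODE above.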
For any given discrepancy function in , we identify an asymptotic MODE as follows.

Theorem 2. For any discrepancy in (10), the balanced treatment assignment with and , is an asymptotic MODE, with its maximum risk given by , as .

Theorem 2 ensures that the variance component of the maximum risk of the asymptotic MODE is minimized, while its bias component becomes negligible compared to its variance component by minimizing the covariate imbalance, defined by the discrepancy between the treated and untreated covariates. Consequently, experimental units with non-homogeneous covariates perform comparably to those with homogeneous covariates when utilizing the asymptotic MODE.
3.3. Algorithm for Generating Asymptotic MODE
Based on Theorem 2, we can implement the asymptotic MODE in the following steps. Firstly, we search for points from that have the lowest discrepancy relative to the entire covariate dataset . This subset is denoted as , and the remaining covariates are denoted as , i.e., . Then, we can assign the treatment to the units with (or ) and the control to the units with (or ). The above processes are summarized in Algorithm 1.
Algorithm 1: MODE: minimax optimal deterministic experiment
Input: : the covariates of n experimental units; : a valid kernel function defined on ; M: the maximum allowable sampling times.
Output: The MODE with treatment units and control units .
1. If the energy-distance kernel is used, set to the points obtained by the twinning method [19].
2. Else, if the empirical F-discrepancy kernel is used, set to the points obtained by the data-driven subsampling method [18].
3. Else, for m = 1:M, randomly sample a candidate subset and set to the sampled subset with the minimum discrepancy.
4. Return the treatment group and the control group .
This algorithm systematically selects treatment and control groups to minimize covariate imbalance, thereby enhancing the robustness of treatment assignment. Any valid discrepancy on can be used as the input for Algorithm 1. The optimality of the MODE is guaranteed by Theorems 2 and 3, provided the selected discrepancy belongs to .
We tailor different optimization methods to various discrepancy functions. For continuous covariates, we recommend using the energy distance [16] or the empirical F-discrepancy [18]. When employing the energy distance as the discrepancy, characterized by the kernel , we recommend utilizing the twinning method proposed by [19] to identify . For the empirical F-discrepancy using the kernel , where , is the empirical distribution function of the j-th component of , and is any valid kernel on , we recommend the data-driven subsampling method proposed by [18] to find . For the details of the twinning and the data-driven subsampling methods in Algorithm 1, please refer to [19] and [18], respectively. For discrete covariates, the Lee discrepancy [20] is preferred. For other valid discrepancies, a Monte Carlo approximation method is applied. This involves randomly sampling M subsets from and selecting the one, , with the minimum discrepancy value. Ref. [2] demonstrated that after times sampling, the probability that the output is better than 99% of all possible candidates already exceeds 99%, and significantly larger values of M are also feasible in practice.
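The Monte Carlo approximation branch described above can be sketched as follows: draw M random balanced assignments and keep the one with the smallest discrepancy. We use the energy distance for concreteness; the function names and demo data are our own illustration:

```python
import numpy as np

def energy_distance(A, B):
    # energy-distance discrepancy between two point sets (see Example 2)
    d = lambda X, Y: np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)).mean()
    return 2 * d(A, B) - d(A, A) - d(B, B)

def monte_carlo_mode(X, n1, M=1000, seed=0):
    """Monte Carlo branch of the MODE search: draw M random size-n1
    treatment groups and keep the split with the smallest discrepancy
    between treated and untreated covariates."""
    rng = np.random.default_rng(seed)
    n = len(X)
    best_w, best_val = None, np.inf
    for _ in range(M):
        idx = rng.choice(n, size=n1, replace=False)
        w = np.zeros(n, dtype=int)
        w[idx] = 1
        val = energy_distance(X[w == 1], X[w == 0])
        if val < best_val:
            best_w, best_val = w, val
    return best_w, best_val

X_demo = np.random.default_rng(2).normal(size=(40, 3))
w_best, disc_best = monte_carlo_mode(X_demo, n1=20, M=500)
```

The best-of-M split is guaranteed to be at least as balanced, in discrepancy terms, as a single random draw, which is the sense in which moderate values of M already yield near-optimal candidates.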
Example 3. This example demonstrates the usefulness of the MODE procedure using the red wine quality dataset from the UCI Machine Learning Repository. We evaluate the impact of treatments, such as a new storage method, on red wine quality. Two key physicochemical properties, "pH" and "alcohol," which may influence the final outcome, are treated as covariates. We assume that all these covariates of the red wines have been collected. To minimize the influence of covariates on the estimation of treatment effects, it is essential to ensure that the covariates of the treatment and control groups are as similar as possible.
For clarity, we focus on the first 100 samples in the dataset, treating them as experimental units, and scale their covariates into the range . Given that the covariates are continuous, we adopt the energy distance with the kernel function in Algorithm 1. Figure 1 illustrates the covariates of equally sized treatment and control groups divided by the CRE and MODE methods. Intuitively, the covariates of the treatment and control groups created by MODE appear more balanced than those by CRE, particularly near the point. Quantitatively, the Mahalanobis distance between the covariates of the treatment and control groups under MODE is significantly smaller than that observed with CRE. This demonstrates that MODE provides a more effective approach for achieving covariate balance, which is crucial for reducing bias in treatment effect estimation.

4. Theoretical Comparison with Randomized Experiments
In this section, we theoretically compare the proposed MODE with completely randomized experiments and re-randomized experiments. When the potential outcomes are fixed, the MODE improves the convergence rate of the MSE relative to the CRE and to the re-randomization using the Mahalanobis distance. Thus, the proposed MODE serves as a highly effective covariate balance technique.
4.1. Comparison with Completely Randomized Experiments
For simplicity, we denote the difference-in-means estimators based on the CRE and the MODE as and , respectively. Under the CRE, where is randomly sampled from with , serves as an unbiased estimator for , i.e., , with a conditional variance given by the following:

The expression in (11) indicates that this randomization variance under the CRE equals the average of the conditional MSEs across treatments in . In contrast, the MODE in (1) minimizes the maximum MSE over a class of mean and variance functions.
We define for , and , where for , and for . The above randomization variance can be expressed more precisely as follows (see Appendix A for details):

The term aligns with the variance expression when the potential outcomes are fixed [21], while the term quantifies the randomness stemming from the potential outcomes.
Following the percent reduction in variance proposed by [1], we define the percent reduction in MSE as the percentage by which the MODE reduces the randomization MSE of the difference-in-means estimator:

where is the asymptotic MODE provided in Theorem 2. It is important to note that this percent reduction in MSE depends on the covariates and the unknown mean and variance functions in . Under certain mild conditions, we provide a lower bound for this percent reduction in MSE.
Assumption 1. (i) ; (ii) , for .
The first condition is necessary for the existence of the difference-in-means estimator. The second condition ensures that certain moments of the covariates are bounded in probability, which is easily satisfied if the mean and variance functions are bounded.
Theorem 3. Under the mild conditions specified in Assumption 1, the percent reduction in MSE defined in (13) satisfies the following:

Theorem 3 establishes that the proposed MODE effectively reduces the MSE of the difference-in-means estimator compared to the CRE. This reduction becomes more pronounced when the term defined in (12) is small, or when the term defined in (12) is large. An intuitive explanation of Theorem 3 is that while the CRE balances covariates on average, the actual distributions of covariates in the treatment and control groups may remain unbalanced in a specific experiment, resulting in larger estimation variance. In contrast, the MODE carefully arranges the treated and untreated units to ensure that their covariate distributions align as closely as possible, thereby achieving a smaller MSE.
It is evident that under Assumption 1, . If the potential outcomes are fixed, i.e., and , Theorem 2 implies that . Thus, the MODE demonstrates superiority over the CRE.
Corollary 2. Under the mild conditions outlined in Assumption 1, if and , then the MODE reduces the MSE of the CRE from to , i.e., with .
Corollary 2 indicates that the convergence rate of the MSE is improved by employing the MODE compared to the CRE when the potential outcomes are fixed. This further establishes the superiority of the proposed MODE.
4.2. Comparison with Re-Randomized Experiments
In this subsection, we demonstrate that the proposed MODE outperforms the re-randomized experiment using the Mahalanobis distance [1], referred to as ReM, in terms of minimizing the MSE. Furthermore, the proposed MODE can also be identified as a specific case of a re-randomized experiment.

The re-randomization technique enhances the performance of a randomized experiment by excluding assignments that yield unbalanced covariate distributions prior to the experiment's initiation. Specifically, a re-randomization criterion is defined as a binary function such that a treatment assignment belongs to the restricted randomization set if and only if . The restricted set for ReM is defined as follows:

where a is a pre-specified constant, and is the Mahalanobis distance between the treatment and control groups defined by the following:

where , , and is the sample covariance matrix. The ReM is considered balanced when .
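For concreteness, the ReM acceptance rule can be sketched as below, using one common form of the Mahalanobis imbalance measure, M = (n1 n0 / n) (xbar1 - xbar0)' S^{-1} (xbar1 - xbar0); the helper names are ours, and this is an illustration of the mechanism rather than the paper's code. An assignment is redrawn until its imbalance falls below the threshold a:

```python
import numpy as np

def mahalanobis_imbalance(X, w):
    """Mahalanobis distance between treatment and control covariate means,
    M = (n1*n0/n) * (xbar1 - xbar0)' S^{-1} (xbar1 - xbar0),
    where S is the sample covariance matrix of all covariates."""
    X1, X0 = X[w == 1], X[w == 0]
    n1, n0 = len(X1), len(X0)
    diff = X1.mean(0) - X0.mean(0)
    S = np.cov(X.T)  # p x p sample covariance of the full covariate set
    return (n1 * n0 / (n1 + n0)) * diff @ np.linalg.solve(S, diff)

def rem_assignment(X, n1, a, seed=0, max_draws=10000):
    """Re-randomization: redraw a completely randomized assignment until
    the Mahalanobis imbalance falls below the pre-specified constant a."""
    rng = np.random.default_rng(seed)
    n = len(X)
    for _ in range(max_draws):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n1, replace=False)] = 1
        if mahalanobis_imbalance(X, w) <= a:
            return w
    raise RuntimeError("no acceptable assignment found; increase a or max_draws")
```

Under the CRE, this M statistic is approximately chi-square with p degrees of freedom, so a can be chosen as a chi-square quantile to control the acceptance probability.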
Consider the scenario where the potential outcomes are fixed and the causal effect is additive, i.e., and for . In this case, the balanced ReM reduces the variance of the CRE to (Theorem 3.2 of [1]), where , is the Chi-square distribution with p degrees of freedom, and represents the squared multiple correlation between and . Thus, the balanced ReM enjoys the same convergence rate as the CRE. We summarize this conclusion as the following corollary.
Corollary 3. If , for , and for , then the variance of the balanced ReM is of order .
Theorem 2 implies that if and . Thus, as a result of Corollary 3, the proposed MODE improves the convergence rate of the estimated treatment effect compared to the balanced ReM from to . This further reinforces the superiority of using the proposed MODE.
Next, we establish a connection between the MODE and re-randomization. In our setting, the restricted set for the MODE, , can be defined as follows:

where . The unknown minimum value of the discrepancy in this re-randomization criterion can be estimated via Monte Carlo sampling. Specifically, we can randomly sample from a feasible number of times and take the minimum discrepancy among those samples as its estimate.
To leverage the strengths of randomization-based inference, we observe that the restricted set typically comprises either singletons or sets of small cardinality. Therefore, we relax the re-randomization criterion to the following:

where is the empirical -quantile of the distribution of all discrepancy values over . It is evident that the above re-randomization, , degenerates into the MODE as and into the CRE as . Thus, the re-randomized relaxation in (15) with effectively combines the randomness of the CRE with the efficiency of the MODE, making it useful when the experimenter is constrained to use randomization due to practical considerations.
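The relaxed criterion can be sketched as follows: estimate the empirical alpha-quantile of the discrepancy over M random balanced draws, then randomize among the assignments falling below it. Function names are ours, and the energy distance stands in for a generic discrepancy:

```python
import numpy as np

def energy_distance(A, B):
    # energy-distance discrepancy between two point sets (see Example 2)
    d = lambda X, Y: np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)).mean()
    return 2 * d(A, B) - d(A, A) - d(B, B)

def relaxed_rerandomization(X, n1, alpha=0.05, M=2000, seed=0):
    """Re-randomized relaxation of the MODE: draw M random balanced
    assignments, estimate the empirical alpha-quantile of their
    discrepancies, and randomize uniformly among those below it."""
    rng = np.random.default_rng(seed)
    n = len(X)
    draws, vals = [], []
    for _ in range(M):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n1, replace=False)] = 1
        draws.append(w)
        vals.append(energy_distance(X[w == 1], X[w == 0]))
    vals = np.asarray(vals)
    cutoff = np.quantile(vals, alpha)          # empirical alpha-quantile
    accepted = [w for w, v in zip(draws, vals) if v <= cutoff]
    return accepted[rng.integers(len(accepted))]
```

Taking alpha toward 0 recovers (an estimate of) the MODE, while alpha = 1 accepts every draw and recovers the CRE, mirroring the two limits discussed above.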
5. Simulations
In this section, we compare the performance of several experimental approaches, including the balanced CRE, the balanced ReM utilizing the critical value [1], the BODE with default priors as outlined by [2], and the proposed MODE employing the energy distance with the kernel function . Each experiment is repeated 1000 times across all settings. The simulation results indicate that the proposed MODE demonstrates reduced covariate imbalance and lower estimation uncertainty compared to existing methods.
5.1. Covariate Imbalance
We examine two sample size and dimensionality scenarios: (i) and ; (ii) and . In each case, the covariates are generated as and for . We then apply the four experimental designs to the covariate set . The Mahalanobis distance , as defined in (14), and the energy distance , which represents the discrepancy function in (8) with the kernel function , are employed as two measures of covariate imbalance.
Figure 2 illustrates the covariate imbalances across the various experimental methods. The CRE method balances covariates on average; however, many CRE instances exhibit substantial covariate imbalance. In contrast, the ReM method maintains both the Mahalanobis distance and energy distance of covariates between treatment and control groups within a smaller range. The BODE further enhances covariate balance through Monte Carlo optimization. Notably, the proposed MODE demonstrates the smallest covariate imbalance across different settings. These findings suggest that the experimental units in the treatment and control groups based on MODE are more homogeneous, resulting in more comparable experimental outcomes.
5.2. Estimation Precision
To evaluate the precision of difference-in-means estimators derived from various experiments, we assume that the responses are generated from the following potential outcomes model:

where if and only if the i-th unit is treated in the corresponding experiment. We consider four different specifications for mean and variance functions:
- C1. ;
- C2. ;
- C3. ;
- C4. ,
where . In C1 and C3, we assume that the potential outcomes are fixed.
Table 1 presents the empirical MSEs of difference-in-means estimators across the various experiments. As the sample sizes increase, the empirical MSEs decrease, as expected; however, the percentage reductions in MSE of the MODE compared to the CRE become more pronounced, indicating that the MSEs associated with the MODE converge more rapidly. Under each configuration of dimension and sample size, the ReM reduces the randomization variance of the CRE by constraining the Mahalanobis distances of the CREs. The BODE generally yields a smaller MSE than the ReM. Notably, the proposed MODE achieves the smallest MSE across various scenarios. Furthermore, the percentage reduction in MSE of the MODE over the CRE is particularly significant when the potential outcomes are fixed (C1 and C3), consistent with the optimality outlined in Corollary 2.
We also compare the differences between the true distribution of potential outcomes and the empirical distributions of the observed outcomes derived from various experiments. Table 2 presents the Kolmogorov–Smirnov values for the different experiments, where and represent the empirical distribution functions of the potential and observed outcomes from the treatment group, respectively.
As anticipated, all Kolmogorov–Smirnov values decrease, indicating that the observed outcomes converge to the true potential outcomes in distribution as the sample sizes increase. Across all settings, the MODE yields the lowest Kolmogorov–Smirnov values, suggesting that the observed outcomes are closer to the true potential outcomes in distribution. This property of matching distributions, which benefits from quasi-Monte Carlo techniques, also enhances the estimation of treatment effects beyond the average treatment effect.
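The Kolmogorov–Smirnov value used here is the sup-distance between two empirical distribution functions. A minimal sketch of computing it for two samples; the helper name `ks_statistic` is ours, not the paper's code:

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the supremum distance
    between the empirical distribution functions of x and y, evaluated
    on the pooled sample points."""
    grid = np.sort(np.concatenate([x, y]))
    Fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    Fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return np.abs(Fx - Fy).max()

rng = np.random.default_rng(5)
potential = rng.normal(size=500)       # stand-in for potential outcomes
observed = rng.normal(size=250)        # stand-in for observed outcomes
print(ks_statistic(potential, observed))
```

A smaller value indicates that the observed outcomes track the distribution of the potential outcomes more closely, which is the comparison reported in Table 2.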
6. Conclusions and Discussion
This paper introduces a novel measure of covariate imbalance, termed the generalized discrepancy, and proposes a MODE designed to minimize this discrepancy. Both theoretical analysis and simulations demonstrate that the proposed MODE outperforms existing methods, such as the CRE and ReM, in terms of minimizing MSE. Thus, the MODE serves as a highly effective framework for controlled experimentation in the presence of covariates.
Importantly, the covariate imbalance, defined as the generalized discrepancy, quantifies the differences in covariate distributions between treatment and control groups. This measure can also be utilized in designing experiments aimed at improving the accuracy of estimating various causal effects, such as quantile treatment effects and average treatment effects on the treated group. Such applications enable researchers to evaluate the impacts of interventions more comprehensively.
For most control experiments where estimation accuracy is the primary objective, such as online A/B tests, the proposed MODE is the preferred method. Since the MODE is deterministic, randomization inference based on the MODE is limited [22]; however, asymptotic inference remains feasible. In situations where exact inference is the primary goal or randomization is necessary due to practical constraints, such as policy requirements, ethical considerations, or the presence of potential unobserved confounding variables, we recommend using re-randomization based on the MODE described in (15). This approach randomizes over a set of treatments that are near the minima of the maximum risk, thus enhancing covariate balance.
The MODE can be directly extended to multi-level experiments by sequentially minimizing generalized discrepancies across treatment groups. Thus, it offers a potential alternative to the OSAT tool [23] for sample-to-batch allocation in genomics experiments. The MODE proposed in this paper assumes that all covariates are collected in advance of conducting the experiment, including scenarios such as online A/B tests, classrooms with student data, or companies with employee records. Another potential future research direction involves adaptive allocation, particularly in biomedical experiments with human subjects, where treatments must be assigned either individually or in batches as participants are enrolled.