Article

Generalized Recentered Influence Function Regressions

by Javier Alejo 1,†, Antonio Galvao 2,†, Julián Martínez-Iriarte 3,† and Gabriel Montes-Rojas 4,*,†

1 Instituto de Economía, Universidad de la República, Montevideo 11200, Uruguay
2 Department of Economics, Michigan State University, East Lansing, MI 48824, USA
3 Department of Economics, University of California-Santa Cruz, Santa Cruz, CA 95064, USA
4 Instituto Interdisciplinario de Economía Política-CONICET, Universidad de Buenos Aires, Ciudad Autónoma de Buenos Aires C1122, Argentina
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Econometrics 2025, 13(2), 19; https://doi.org/10.3390/econometrics13020019
Submission received: 21 January 2025 / Revised: 13 March 2025 / Accepted: 7 April 2025 / Published: 18 April 2025

Abstract

This paper suggests a generalization of covariate shifts to study distributional impacts on inequality and distributional measures. It builds on the recentered influence function (RIF) regression method, originally designed for location shifts in covariates, and extends it to general policy interventions, such as location–scale or asymmetric interventions. Numerical simulations for the Gini, Theil, and Atkinson indexes demonstrate strong performance across a myriad of cases and distributional measures. An empirical application examining changes in Mincerian equations is presented to illustrate the method.

1. Introduction

The recentered influence function (RIF) regression has emerged as a powerful tool for examining how changes in covariates influence the distributional properties of an outcome variable. This approach is particularly valuable in the analysis of income distribution, inequality, and poverty. First introduced in the seminal work of Firpo et al. (2009), RIF regression is based on a transformation of the outcome variable, leveraging the influence function of a specific distributional statistic of interest. The influence function, a fundamental concept in robust statistics, measures the sensitivity of an unconditional statistic to individual observations and is widely documented in the statistical literature (see Note 1). This accessibility makes RIF regression both convenient and straightforward to implement. By regressing the RIF-transformed outcome on explanatory variables, researchers can estimate the marginal effects of these covariates on various distributional measures, offering a flexible and insightful framework for studying economic and social disparities.
RIF regression has been extensively applied in empirical research, particularly in the study of unconditional quantiles. In this context, the primary parameter of interest is the unconditional quantile effect (UQE), and the corresponding RIF regression is referred to as unconditional quantile regression (UQR). Beyond quantiles, RIF regression can be applied to a broad set of distributional statistics, including the Gini coefficient, the Theil index, and measures of polarization, provided that the influence function for the statistic is known. This flexibility has contributed to its widespread adoption across multiple disciplines, including labor economics, health economics, and public policy, where understanding the distributional effects of explanatory variables is crucial.
Building on the foundational work of Firpo et al. (2009), subsequent research has further developed RIF-based methodologies. Essama-Nssah and Lambert (2012), for instance, provide a comprehensive derivation of influence functions for a range of distributional statistics, which serve as key components in RIF models. Moreover, Fortin et al. (2011) extend RIF regression to the Oaxaca–Blinder decomposition framework (see, e.g., Oaxaca, 1973; Oaxaca & Ransom, 1994), allowing for a decomposition of differences in distributional statistics across groups. Additional refinements have been proposed by Firpo and Pinto (2016) and Firpo et al. (2018), further enhancing the applicability and robustness of the method.
Despite its extensive use, the conventional RIF regression framework primarily captures location-shift effects, where a marginal change in a covariate affects the unconditional distribution of the outcome variable in a uniform manner. Specifically, the model of Firpo et al. (2009) considers shifts in the mean of a single covariate or marginal changes in the probability mass of a binary regressor. However, in many practical settings, policy interventions do not merely induce uniform changes in covariates but instead lead to broader transformations that may include shifts in both location and scale, as well as asymmetric changes. Addressing this limitation, we propose a more general and intuitive parametric representation of covariate shifts within the RIF framework.
Our contribution lies in extending RIF regression to accommodate general parametric covariate shifts, allowing for a more flexible evaluation of policy interventions. By formulating a parametric approach, we provide a simple yet effective method applicable to any distributional functional. This generalization enables researchers to assess policy effects beyond location shifts, capturing more complex transformations in covariates. Our proposed framework enhances the analytical power of RIF regression, making it a more versatile tool for studying the impact of structural changes in economic and social environments.
Recent research has explored various approaches to extending the location-only effects framework, particularly in the context of unconditional quantile regression (UQR). Chernozhukov et al. (2013) analyze this problem as a means of transforming the status quo population; however, their work does not explicitly address the limiting or marginal cases considered in Firpo et al. (2009) (see Note 2). While their study provides valuable insights, it does not propose a methodology for capturing general shifts in covariates or for extending the analysis to a broader set of distributional functionals.
A related contribution by Martínez-Iriarte et al. (2024) introduces a framework for modeling location and scale shift effects in UQR estimation, making it particularly useful for analyzing general policy interventions. However, their approach relies on maximum likelihood estimation rather than RIF regression (see Note 3). Other notable extensions of UQR include Inoue et al. (2021), which examines the two-sample problem, Sasaki et al. (2022), which addresses high-dimensional settings, and Alejo et al. (2024), which investigates the relationship between conditional and unconditional quantiles.
To illustrate the effectiveness of our proposed method, we conduct Monte Carlo simulations using the Gini coefficient as the target distributional functional. The simulation results demonstrate that the method performs well across a range of parametric shift functions. Finally, we present an empirical application that examines the impact of different policy interventions on inequality under various shift scenarios. Specifically, we analyze how changes in education and work experience influence inequality within a Mincerian earnings equation framework.
The remainder of this paper is organized as follows. Section 2 reviews the RIF regression method and introduces our proposed estimands for capturing different types of covariate shifts. This section also discusses potential empirical applications in the existing literature. Section 3 presents the corresponding estimators and explores their asymptotic properties. Section 4 evaluates the finite sample performance of the estimators using Monte Carlo experiments. Section 5 applies the proposed estimators to analyze shifts in education and age within the Mincer equation framework. Finally, Section 6 concludes with a discussion of the key findings and potential avenues for future research.

2. RIF Regression

2.1. RIF Regression Framework for Pure Location-Shifts

Consider a functional $v(\cdot)$ defined on the distribution function $F_Y$ of a random variable $Y$. Changes in $v$ that measure the influence or impact of a given observation through $F_Y$ are studied using the influence function (IF). More formally, the IF is the directional derivative of $v(F_Y)$ at $F_Y$ and measures the effect of a small perturbation in $F_Y$. The IF can be calculated at each data point in the domain of $Y$. Let $y$ be such a point and let $\epsilon_y$ denote the Dirac probability mass at $y$; we then define
$$\mathrm{IF}(y; v; F_Y) = \lim_{t \downarrow 0} \frac{v\big(t\,\epsilon_y + (1-t)\,F_Y\big) - v(F_Y)}{t}.$$
When $\mathrm{IF}(\cdot\,; v; F_Y)$ is evaluated at $Y$, it becomes a random variable. Importantly, it has zero mean by construction, that is, $E_Y[\mathrm{IF}(Y, v, F_Y)] = 0$. For example, if $v(F_Y) = F_Y^{-1}(\tau)$, the $\tau$-quantile of $F_Y$, then
$$\mathrm{IF}(y; v; F_Y) = \frac{\tau - 1\{y \le F_Y^{-1}(\tau)\}}{f_Y\big(F_Y^{-1}(\tau)\big)}.$$
Note that
$$E[\mathrm{IF}(Y; v; F_Y)] = \frac{\tau - E\big[1\{Y \le F_Y^{-1}(\tau)\}\big]}{f_Y\big(F_Y^{-1}(\tau)\big)} = 0,$$
because $E\big[1\{Y \le F_Y^{-1}(\tau)\}\big] = F_Y\big(F_Y^{-1}(\tau)\big) = \tau$. Appendix A contains the influence functions for the Gini, Atkinson, and Theil indices.
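The zero-mean property can be checked numerically. The following is a minimal sketch, assuming $Y \sim N(0,1)$ so that the quantile and the density are known exactly; the distribution, the value $\tau = 0.25$, the sample size, and the function name `quantile_influence` are illustrative choices, not part of the paper.

```python
import numpy as np
from statistics import NormalDist


def quantile_influence(y, tau, q_tau, f_at_q):
    """IF of the tau-quantile: (tau - 1{y <= q_tau}) / f_Y(q_tau)."""
    indicator = (y <= q_tau).astype(float)
    return (tau - indicator) / f_at_q


rng = np.random.default_rng(0)
y = rng.standard_normal(200_000)   # illustrative assumption: Y ~ N(0, 1)

tau = 0.25
std_normal = NormalDist()          # F_Y is known here, so no density estimation is needed
q_tau = std_normal.inv_cdf(tau)    # F_Y^{-1}(tau)
f_at_q = std_normal.pdf(q_tau)     # f_Y(F_Y^{-1}(tau))

if_values = quantile_influence(y, tau, q_tau, f_at_q)
mean_if = if_values.mean()
print(f"sample mean of the IF: {mean_if:.4f}")
```

The sample mean of the IF should be close to zero up to simulation noise. In applications $F_Y$ is unknown, so both the quantile and the density would have to be estimated.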
The recentered influence function (RIF) is simply defined as
$$\mathrm{RIF}(Y, v, F_Y) = v(F_Y) + \mathrm{IF}(Y, v, F_Y),$$
and by the previous result it satisfies $v(F_Y) = E_Y[\mathrm{RIF}(Y, v, F_Y)]$. This is a key property, as it allows us to apply the law of iterated expectations and to work with conditional models. In particular, if we consider a set of covariates of interest $X$, then we have that
$$v(F_Y) = E\big[E[\mathrm{RIF}(Y, v, F_Y) \mid X]\big].$$
The key result of Firpo et al. (2009) is based on modeling the conditional expectation $m_v(x) \equiv E[\mathrm{RIF}(Y, v, F_Y) \mid X = x]$ using standard regression tools; this is referred to as RIF regression. Then, by the properties above, integrating over $X$ recovers the unconditional statistic of interest, i.e., $v(F_Y) = E_X[m_v(X)]$.
Firpo et al. (2009) focus on marginal effects of covariates $X$, modeled as a location shift $X + \delta$, where $\delta$ is a small real-valued perturbation. Changes in $X$ have a corresponding effect on $Y$, the outcome of interest. To study this, consider a statistical tool built upon the IF. Define $F_{Y_\delta}$ as the distribution of $Y$ after the perturbation in $X$. Then we consider a functional derivative given by
$$\lim_{\delta \downarrow 0} \frac{v(F_{Y_\delta}) - v(F_Y)}{\delta} = \lim_{\delta \downarrow 0} \frac{E[m_v(X+\delta)] - E[m_v(X)]}{\delta} = E\left[\lim_{\delta \downarrow 0} \frac{m_v(X+\delta) - m_v(X)}{\delta}\right].$$
The last equality is valid if limit and expectation are interchangeable, which we assume throughout the paper. In fact, if we assume that $m_v(x)$ is differentiable at every point $x$, then we obtain the main result of RIF regression models, that is,
$$\Pi_v \equiv \lim_{\delta \downarrow 0} \frac{v(F_{Y_\delta}) - v(F_Y)}{\delta} = E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\right] = E\left[\frac{\partial E[\mathrm{RIF}(Y, v, F_Y) \mid X = x]}{\partial x}\Big|_{x=X}\right].$$
$\Pi_v$ measures the impact of a given location shift in covariates on the unconditional functional statistic $v$ of $Y$. Statistically, the expression for $\Pi_v$ is called an average derivative, an object whose estimation has been thoroughly studied. An excellent textbook reference is Chapter 4 of Pagan and Ullah (1999).
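As a simple worked case, not taken from the paper, consider the mean functional $v(F_Y) = \int y\, dF_Y(y) = \mu$. The linear conditional mean in the last line is an illustrative assumption with hypothetical coefficients $\beta_0, \beta_1$.

```latex
\begin{align*}
v\big(t\,\epsilon_y + (1-t)F_Y\big) &= t\,y + (1-t)\,\mu, \\
\mathrm{IF}(y; v; F_Y) &= \lim_{t \downarrow 0} \frac{t\,y + (1-t)\,\mu - \mu}{t} = y - \mu, \\
\mathrm{RIF}(y, v, F_Y) &= \mu + (y - \mu) = y, \\
m_v(x) &= E[Y \mid X = x], \\
\Pi_v &= E\left[\frac{\partial E[Y \mid X = x]}{\partial x}\Big|_{x=X}\right]
       = \beta_1 \quad \text{if } E[Y \mid X = x] = \beta_0 + \beta_1 x .
\end{align*}
```

For the mean, RIF regression therefore reduces to an ordinary regression of $Y$ on $X$, and $\Pi_v$ is the usual average partial effect.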

2.2. General Unconditional Effects

While Firpo et al. (2009) focus mainly on a location shift, one can consider a general shift in a given covariate,
$$X_\delta = h(X, \delta),$$
where $\delta \ge 0$, $h(x, 0) = x$, and $h(X, \delta)$ is continuously differentiable as a function of $\delta$. We then define general unconditional effects as
$$\Pi_{v,h} = \lim_{\delta \downarrow 0} \frac{v(F_{Y_\delta}) - v(F_Y)}{\delta}$$
for general $h$ functions.
Here we index the unconditional effect by $h$, which denotes the type of counterfactual policy analyzed. Naturally, depending on the particular form of $h(X, \delta)$ we obtain different effects. Here are a few examples:
  • Location shift. This is the case developed above, taken from the analysis of Firpo et al. (2009), which considers a location shift in one covariate of the form
    $$X_\delta = X + \delta.$$
  • Location–scale shift
    $$X_\delta = \frac{X - \mu_X}{s(\delta)} + \mu_X + \ell(\delta),$$
    where $\delta \ge 0$, and $s(\delta) > 0$ and $\ell(\delta)$ are continuously differentiable functions with $s(0) = 1$ and $\ell(0) = 0$, respectively. Here, $s(\delta)$ acts as a shrinking parameter such that an increment in this parameter reduces the overall impact of the $X$ variable. The pure location shift in Firpo et al. (2009) can be obtained by setting $s(\delta) = 1$ and $\ell(\delta) = \delta$. Martínez-Iriarte et al. (2024) have a similar shift written in a different manner. In fact, different alternatives can be developed based on how the location and scale shifts are combined.
  • Asymmetric shift
    $$X_\delta = X + a(\delta)\,(X_{max} - X)^{\lambda},$$
    where $\delta \ge 0$, and the map $\delta \mapsto a(\delta)$ is continuously differentiable and satisfies $a(\delta) > 0$ for $\delta > 0$ and $a(0) = 0$. The factor $X_{max}$ is the maximum value in the support of $X$ or an upper bound. The parameter $\lambda$ determines the asymmetry of the shift effects: if $\lambda < 0$, the shift is biased towards upper values; if $\lambda > 0$, the shift is biased towards lower values; and if $\lambda = 0$, this is a pure location shift. Although this type of shift was applied as a numerical simulation exercise in Battiston et al. (2014), the novelty of our proposal is to include it analytically within the RIF regression strategy.
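The shift functions listed above can be written down directly. The sketch below is illustrative only: the function names, the choices $s(\delta) = 1 + \delta$, $\ell(\delta) = \delta$, $a(\delta) = \delta$, and the numerical values of `mu_x`, `x_max`, and `lam` are assumptions made for the example. It checks that $h(x, 0) = x$ and approximates the derivative of $h$ with respect to $\delta$ at $\delta = 0$ by a forward difference.

```python
import numpy as np


def location_scale_shift(x, delta, mu_x, s=lambda d: 1.0 + d, ell=lambda d: d):
    """h(x, delta) = (x - mu_x) / s(delta) + mu_x + ell(delta)."""
    return (x - mu_x) / s(delta) + mu_x + ell(delta)


def asymmetric_shift(x, delta, x_max, lam, a=lambda d: d):
    """h(x, delta) = x + a(delta) * (x_max - x) ** lam."""
    return x + a(delta) * (x_max - x) ** lam


x = np.linspace(3.0, 9.0, 7)                    # illustrative grid of covariate values
mu_x, x_max, lam, step = 6.0, 11.31, 0.5, 1e-6  # illustrative parameter values

# Forward-difference approximation of dh/d(delta) at delta = 0.
d_location_scale = (location_scale_shift(x, step, mu_x) - x) / step
d_asymmetric = (asymmetric_shift(x, step, x_max, lam) - x) / step

print(np.round(d_location_scale, 3))
print(np.round(d_asymmetric, 3))
```

Under these choices the location–scale derivative is decreasing in $x$ (values above the mean are pulled down, values below are pushed up, on top of the unit location shift), while the asymmetric derivative with $\lambda > 0$ is largest for low values of $x$.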
From the definition of the RIF, one can compute the parameter of interest $\Pi_{v,h}$ for the three cases described previously. Let $m_v(x) = E[\mathrm{RIF}(Y, v) \mid X = x]$ for a given functional $v = v(F_Y)$. Using the same derivation strategy as in Section 2.1 above, it is straightforward to show that the estimand of interest is
$$\Pi_{v,h} = E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\, \frac{\partial h(X, \delta)}{\partial \delta}\Big|_{\delta=0}\right]. \tag{7}$$
The estimand is the expectation of the product of two derivatives: first, that of the $m_v(x)$ function, which depends on the functional of interest, and second, that of $h(x, \delta)$, which depends on the type of counterfactual policy. The formula in (7) can then be applied to each particular effect that the user is trying to analyze.
  • Location shift
    $$\Pi_{v,h} = E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\right].$$
    This is indeed the estimand of Firpo et al. (2009) and the most popular among empirical applications of RIF regression.
  • Location–scale shift
    $$\Pi_{v,h} = E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X} \left(-(X - \mu_X)\,\frac{s'(0)}{s^2(0)} + \ell'(0)\right)\right].$$
    As in Martínez-Iriarte et al. (2024), we can define the location shift effect as
    $$\Pi_v^L := E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\right] \ell'(0),$$
    which corresponds to the Firpo et al. (2009) RIF regression coefficient representation, and the scale effect as
    $$\Pi_v^S := -E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\, (X - \mu_X)\right] \frac{s'(0)}{s^2(0)},$$
    such that $\Pi_{v,h} = \Pi_v^L + \Pi_v^S$.
  • Asymmetric shift
    $$\Pi_{v,h} = E\left[\frac{\partial m_v(x)}{\partial x}\Big|_{x=X}\, a'(0)\,(X_{max} - X)^{\lambda}\right].$$
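For reference, the derivative of each shift function with respect to $\delta$, evaluated at $\delta = 0$, can be worked out as follows. This is a sketch of the intermediate calculus using only the conditions $s(0) = 1$, $\ell(0) = 0$, and $a(0) = 0$ stated in the text.

```latex
\begin{align*}
\text{Location:} \quad
  & h(x,\delta) = x + \delta
  && \Rightarrow \quad \frac{\partial h(x,\delta)}{\partial \delta}\Big|_{\delta=0} = 1, \\
\text{Location--scale:} \quad
  & h(x,\delta) = \frac{x - \mu_X}{s(\delta)} + \mu_X + \ell(\delta)
  && \Rightarrow \quad \frac{\partial h(x,\delta)}{\partial \delta}\Big|_{\delta=0}
     = -(x - \mu_X)\,\frac{s'(0)}{s^2(0)} + \ell'(0), \\
\text{Asymmetric:} \quad
  & h(x,\delta) = x + a(\delta)\,(X_{max} - x)^{\lambda}
  && \Rightarrow \quad \frac{\partial h(x,\delta)}{\partial \delta}\Big|_{\delta=0}
     = a'(0)\,(X_{max} - x)^{\lambda}.
\end{align*}
```

Since $s(0) = 1$, the location–scale derivative simplifies to $-(x - \mu_X)\,s'(0) + \ell'(0)$; for example, $s(\delta) = 1 + \delta$ and $\ell(\delta) = 0$ give $-(x - \mu_X)$.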

2.3. Empirical Examples to Motivate the Estimands

Here we discuss some empirical examples where these models could be used.
Effect of increasing education on wage inequality. In a Mincer equation, log wages are modeled as a function of certain observable covariates such as years of education. Changes in education levels have different effects on income inequality and a positive shift may result in augmenting it, the so-called ‘paradox of progress’ (see Bourguignon et al., 2024). A study of the effect of a shift in education on wage inequality could be implemented using our proposed framework. We can accommodate a counterfactual policy experiment where there may be not only a general increase in the education level but also a change in its dispersion or an asymmetric shift towards the highest possible level of education.
Smoking and birth weight. Consider a cigarette consumption tax. Under some assumptions, the consumption X will be reduced to X / ( 1 + δ ) for some δ that depends on the elasticities of the supply and demand. One might wonder about the final impact of this tax policy on the distribution of birth weights. This is analyzed in Martínez-Iriarte et al. (2024).
Wage controls and earnings distribution. Vickers and Ziebarth (2022) study the effect of a more uniform (less dispersed) distribution of wage control brackets on the distribution of earnings, in the context of a policy that was implemented during World War II.

3. Generalized RIF Estimator

The proposed estimands can be estimated using the RIF regression method. This is implemented in three steps.
First, we need to specify a model for the $m_v(x)$ function, that is, an ad hoc model of how the RIF of the functional $v$ relates to the covariates, and hence a model for the derivative $\partial m_v(x)/\partial x$. Let $\widehat{\partial m_v(x)/\partial x}$ be an estimator of this derivative. Assume a parametric model
$$\frac{\partial m_v(x)}{\partial x} \equiv g_1(x, \eta_0)$$
for some parameter $\eta$ with true value $\eta_0$. The estimator is then $g_1(x_i, \hat\eta)$, where $\hat\eta$ is a consistent estimator of $\eta_0$. Firpo et al. (2009) propose several alternatives, among which RIF OLS and RIF Logit are the most commonly used. Similar procedures can be applied to any functional, as is typically done in the applied literature. In turn, this requires that the proposed model is an appropriate representation of the true model. $\eta$ can then be interpreted as the parameters of the RIF regression model and $\hat\eta$ as the corresponding estimator.
For example, a third-degree polynomial is a popular choice:
$$m_v(x) = \eta_0 + \eta_1 x + \eta_2 x^2 + \eta_3 x^3. \tag{9}$$
Second, depending on the proposed $h(X, \delta)$ model, different alternatives arise, as shown in Section 2.2. This depends on the policy intervention of interest and the associated shift in covariates. Then, we can compute $h_\delta(x, 0) = \partial h(x, \delta)/\partial \delta\,|_{\delta=0}$ for each case. This derivative may also include estimated parameters. Let $\widehat{h_\delta(x, 0)}$ be an estimator of it. Assume a parametric model
$$h_\delta(x, 0) \equiv g_2(x, \gamma_0)$$
for some parameter $\gamma$ with true value $\gamma_0$, and that $\hat\gamma$ is a consistent estimator of $\gamma_0$. Thus, the estimator is $g_2(x_i, \hat\gamma)$.
Third, the unconditional effects can then be obtained as sample averages of the product of the two elements described above:
$$\hat\Pi_{v,h} = \frac{1}{n}\sum_{i=1}^{n} \widehat{\frac{\partial m_v(x_i)}{\partial x}}\; \widehat{h_\delta(x_i, 0)} = \frac{1}{n}\sum_{i=1}^{n} g_1(x_i, \hat\eta)\, g_2(x_i, \hat\gamma).$$
Now define $g(x, \alpha) \equiv g_1(x, \eta)\, g_2(x, \gamma)$ for $\alpha = (\eta, \gamma)$. Consistency of $\hat\Pi_{v,h}$ will follow from a uniform law of large numbers over $\alpha$ and the correct specification of the parametric model for $m_v(x)$. For a uniform law of large numbers, sufficient conditions can be found in Lemma 4.3 of Newey and McFadden (1994).
Assumption 1. 
Assume that $\frac{\partial m_v(x)}{\partial x}\big|_{x=X}\, h_\delta(X, 0) = g(X, \alpha_0)$ is continuous at $\alpha_0$ with probability one, and that there is a neighborhood $\tilde{A}_0$ of $\alpha_0$ such that $E\big[\sup_{\alpha \in \tilde{A}_0} \lVert g(X, \alpha) \rVert\big] < \infty$.
The correct specification of the model for $m_v(x)$ is quite challenging. Indeed, it is unlikely that a parametric specification is the correct one. However, one can view a specification like the one in (9) as a series approximation to the true model. A rigorous treatment is provided in Firpo et al. (2009) for the case $h_\delta(X, 0) \equiv 1$.
Deriving the asymptotic normality of our estimators is more complex and requires a more intricate analysis. If the influence function were observed, the estimator would exhibit the standard parametric convergence rate, i.e., $\sqrt{n}$, as this follows from a standard OLS (or Logit) analysis. However, in most cases, the IF must be estimated and often involves nonparametric components, such as densities. As noted by Firpo et al. (2009, p. 962), “because the density is nonparametrically estimated by kernel methods, the rate of convergence of the three estimators will be dominated by this slower term”. Thus, a case-by-case asymptotic analysis is needed for different estimators. Firpo and Pinto (2016) derive the asymptotic distribution for the functionals used here (Gini, Theil, Atkinson), and the asymptotic distribution of our proposed estimators can, in principle, be obtained via the delta method. For the UQR case, the Supplemental Appendix in Firpo et al. (2009) provides a detailed example for the location-shift effect, which can be extended to any shift function of $X$. A specific extension is developed by Martínez-Iriarte et al. (2024), albeit without relying on RIF methods.
Thus, while the asymptotic normality of our proposed estimators can be derived, it involves additional technical challenges, and we leave this as an avenue for further research. In our implementation, we conduct inference using the wild bootstrap, a common approach in RIF regression methods.
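The three steps of this section can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the mean functional, for which $\mathrm{RIF}(Y) = Y$, and a hypothetical quadratic DGP so that the answers are known in advance (location effect $2 + E[X] = 8$, scale effect $-\mathrm{Var}(X) = -1$ for $s(\delta) = 1 + \delta$). The function names and the DGP are illustrative, and no inference step (such as the wild bootstrap) is included.

```python
import numpy as np


def cubic_rif_derivative(x, rif):
    """Step 1: regress the RIF on (1, x, x^2, x^3) by OLS; return dm_v/dx at each x_i."""
    design = np.column_stack([np.ones_like(x), x, x**2, x**3])
    eta, *_ = np.linalg.lstsq(design, rif, rcond=None)
    return eta[1] + 2.0 * eta[2] * x + 3.0 * eta[3] * x**2


def unconditional_effect(x, rif, shift_derivative):
    """Steps 2 and 3: average the product of dm_v/dx and dh/d(delta) at delta = 0."""
    return np.mean(cubic_rif_derivative(x, rif) * shift_derivative(x))


rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(6.0, 1.0, n)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.standard_normal(n)  # hypothetical DGP
rif = y  # for the mean functional, RIF(Y) = Y

# Pure location shift: dh/d(delta) = 1.
location = unconditional_effect(x, rif, lambda z: np.ones_like(z))
# Pure scale shift with s(delta) = 1 + delta: dh/d(delta) = -(x - mean of x).
scale = unconditional_effect(x, rif, lambda z: -(z - z.mean()))

print(f"location effect: {location:.3f}, scale effect: {scale:.3f}")
```

For other functionals, `rif` would be replaced by the plug-in RIF of the statistic of interest, and `shift_derivative` by the derivative of the chosen $h$ at $\delta = 0$.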

4. Monte Carlo Experiments

Consider the Gini coefficient as our distributional target of interest $v$ and the following data generating process (DGP):
$$Y_i = 3 + 2X_i + (1 + \theta X_i)\,u_i \tag{11}$$
with $i = 1, \ldots, n$, where $X_i \sim N(6, 1)$ and $u_i$ is a random variable that is independent of $X_i$. The parameter values are chosen to exclude $y < 0$. See Appendix A for details on the computation of the Gini index and its RIF.
For each DGP, we compute the target effect $\Pi_v$ by simulation with a sample of size $n = 10$ million. The parameter $\theta$ allows for the presence of scale shifts in the effect of the covariate $X$ in the linear model (11). In all cases, we replace $X$ with $X_\delta$ in Equation (11), thus obtaining $Y_\delta$, and compute numerically $\Pi_v = [v(F_{Y_\delta}) - v(F_Y)]/\delta$ for $\delta = 0.0001$.
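The numerical computation of the target effect described above can be sketched as follows. This is an illustration under stated assumptions: it uses $u_i \sim N(0,1)$, a much smaller sample than the 10 million draws used in the paper, and the same random draws for the baseline and shifted outcomes; the function names and the seed are hypothetical.

```python
import numpy as np


def gini(y):
    """Sample Gini index from ordered observations."""
    y = np.sort(y)
    n = y.size
    weights = n + 1 - np.arange(1, n + 1)
    return 1.0 + 1.0 / n - 2.0 * np.sum(y * weights) / (n**2 * y.mean())


def outcome(x, u, theta):
    """DGP: Y = 3 + 2 X + (1 + theta X) u."""
    return 3.0 + 2.0 * x + (1.0 + theta * x) * u


def numerical_effect(shift, theta, n=200_000, delta=1e-4, seed=0):
    """Finite-difference effect [v(F_{Y_delta}) - v(F_Y)] / delta for the Gini index."""
    rng = np.random.default_rng(seed)
    x = rng.normal(6.0, 1.0, n)
    u = rng.standard_normal(n)       # assumption: standard normal errors
    base = gini(outcome(x, u, theta))
    shifted = gini(outcome(shift(x, delta), u, theta))
    return (shifted - base) / delta


# Pure location shift, and pure scale shift with s(delta) = 1 + delta around mu_X = 6.
location_effect = numerical_effect(lambda x, d: x + d, theta=0.0)
scale_effect = numerical_effect(lambda x, d: (x - 6.0) / (1.0 + d) + 6.0, theta=0.3)

print(location_effect, scale_effect)
```

With $\theta = 0$, a pure location shift adds $2\delta$ to every outcome, which leaves absolute differences unchanged and raises the mean, so the Gini index falls; this provides a simple sanity check on the sign of `location_effect`.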
We consider the following cases:
  • Pure location shift: $s(\delta) = 1$ and $\ell(\delta) = \delta$.
  • Pure scale shift: $s(\delta) = 1 + \delta$ and $\ell(\delta) = 0$.
  • Location–scale shift: $s(\delta) = 1 + \delta$ and $\ell(\delta) = \delta$.
  • Asymmetric shift: $a(\delta) = \delta$ and $\lambda \in \{-0.5, 0, 0.5\}$. Here we set $X_{max} = 11.31$, such that no values in the simulations exceed it.
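For the RIF regression, we use a third-degree polynomial such that
$$m_v(x) = \eta_0 + \eta_1 x + \eta_2 x^2 + \eta_3 x^3,$$
thus,
$$\frac{\partial m_v}{\partial x}(x_i) = \eta_1 + 2\eta_2 x_i + 3\eta_3 x_i^2$$
for $i = 1, \ldots, n$.
Then, for the location and scale shifts we have
$$\hat\Pi_v^L = n^{-1}\sum_{i=1}^{n} \widehat{\frac{\partial m_v}{\partial x}(x_i)} = \hat\eta_1 + 2\hat\eta_2\,\bar{x} + 3\hat\eta_3\,\overline{x^2},$$
and
$$\hat\Pi_v^S = -\,n^{-1}\sum_{i=1}^{n} (x_i - \bar{x})\,\widehat{\frac{\partial m_v}{\partial x}(x_i)},$$
with $\bar{x} = n^{-1}\sum_{i=1}^{n} x_i$ and $\overline{x^2} = n^{-1}\sum_{i=1}^{n} x_i^2$, where the $\eta$ parameters are estimated by OLS as in the RIF regression method. The minus sign in the scale effect follows from $s(\delta) = 1 + \delta$, for which the derivative of the shift at $\delta = 0$ is $-(x - \mu_X)$.
Then, for the asymmetric shift we use
$$\hat\Pi_v = n^{-1}\sum_{i=1}^{n} (X_{max} - x_i)^{\lambda}\,\widehat{\frac{\partial m_v}{\partial x}(x_i)}.$$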
For the simulations, we consider the cases of $u_i \sim N(0, 1)$ (Table 1) and $u_i \sim (\chi_1^2 - 1)/2$ (Table 2), using $\theta \in \{0, 0.3\}$, thus allowing for location-only and location–scale effects in (11).
Across all DGPs and shift models, the simulation exercises show that the proposed implementation works: bias and variance decrease monotonically as the sample size increases. Note that, as expected, the asymmetric shift with $\lambda = 0$ coincides with the location-shift effect.

5. Empirical Application

This section presents an empirical application. We use an extract from the Merged Outgoing Rotation Group of the Current Population Survey from 1983, 1984, and 1985, restricted to male respondents only. More details about the data can be found in Lemieux (2006). The variable of interest is Y, the hourly wage, and the covariates X are an indicator of whether the individual is unionized, years of education, marital status, race (non-white), and work experience. We use a cubic specification in all continuous covariates to give some flexibility to the RIF model.
The studied effects correspond to education (Table 3) and experience (Table 4), in order to simulate different interventions and their potential effects on measures of inequality. We consider the estimators of the location, scale, and asymmetric effects discussed above for the Gini, Theil, and Atkinson indices (see Appendix A for more details on each index). All index values are multiplied by 100, and all effects correspond to a change in education/experience under the corresponding shift. Figure 1 and Figure 2 show the modeled shift for each year of education/experience depending on the value chosen for $\lambda$.
Consider first the effect of education, as reported in Table 3. The location effect is positive, while the scale effect is negative in all cases. This means that while a positive shift in education levels increases inequality (this is also called the ‘paradox of progress’ by different authors), a reduction in the overall impact of education decreases inequality. In fact, the joint location–scale effect is negative, thus reducing inequality. In the case of the asymmetric shift, note that the hourly wage distribution is more egalitarian when the shift is biased towards lower educational levels ($\lambda > 0$), as long as inequality is measured using the Gini or Theil coefficient. However, if the index used is more inequality-averse (Atkinson(1) and Atkinson(2)), then all effects are unequalizing.
Consider now the effect of experience in Table 4. In this case, both the location and scale effects are clearly negative. In fact, in all cases there is a reducing effect on inequality as measured by all indices. In the case of the asymmetric shift, the equalizing effect is greater when the increase in experience is biased towards lower levels ($\lambda > 0$).

6. Concluding Remarks

This paper introduces a generalized framework for analyzing the distributional impacts of covariate shifts on inequality and other distributional measures. Building on the RIF regression approach originally proposed by Firpo et al. (2009) for location shifts, our method extends its applicability by incorporating a broader class of shifts, following a framework similar to Martínez-Iriarte et al. (2024). Our Monte Carlo simulations demonstrate that the proposed approach performs well across a wide range of scenarios, highlighting its robustness and versatility.
Several avenues for future research emerge from this work. First, more precise asymptotic approximations could be developed by explicitly deriving the asymptotic distribution of functional estimators of influence functions. Additionally, alternative numerical and approximation methods could be explored for cases where closed-form solutions are unavailable. Second, the framework could be extended to accommodate calibrated covariate shifts that align with empirical policy interventions. For example, any targeted policy intervention could be translated into a counterfactual covariate distribution, to which our method could be applied. Finally, Oaxaca–Blinder-type decompositions could be formulated for specific types of shifts, allowing for a broader range of decomposition analyses and further enhancing the interpretability of distributional changes.

Author Contributions

Conceptualization, J.A., A.G., J.M.-I. and G.M.-R.; methodology, J.A., A.G., J.M.-I. and G.M.-R.; software, J.A., A.G., J.M.-I. and G.M.-R.; formal analysis, J.A., A.G., J.M.-I. and G.M.-R.; investigation, J.A., A.G., J.M.-I. and G.M.-R.; resources, J.A., A.G., J.M.-I. and G.M.-R.; data curation, J.A., A.G., J.M.-I. and G.M.-R.; writing—original draft preparation, J.A., A.G., J.M.-I. and G.M.-R.; writing—review and editing, J.A., A.G., J.M.-I. and G.M.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Functionals and Their RIF

Appendix A.1. Gini Index

The Gini coefficient is defined as
$$v^{Gini} = 1 - \frac{2\,R(F_Y)}{\mu},$$
where $\mu = E(Y)$ and $R(F_Y) = \int \int_{-\infty}^{y} z\, dF_Y(z)\, dF_Y(y)$. Note that if we define $p = F_Y(y)$, then $R(F_Y) = \int_0^1 GL(p, F_Y)\, dp$; that is, the term $R(F_Y)/\mu$ acquires the classical interpretation of the area under the Lorenz curve, with the generalized Lorenz ordinate given by the expression $GL(p, F_Y) = \int_{-\infty}^{F_Y^{-1}(p)} z\, dF_Y(z)$.
Following Firpo et al. (2018), and after some algebra, the RIF for the Gini index $v^{Gini}$ can be written as:
$$\mathrm{RIF}(y, v^{Gini}, F_Y) = 1 + \frac{1 - v^{Gini}}{\mu}\, y + \frac{2}{\mu}\Big[\, y\,\big(F_Y(y) - 1\big) - GL\big(F_Y(y), F_Y\big) \Big].$$

Sample Estimator

Given a random sample $y_1, \ldots, y_n$, we use a plug-in estimator of the previous equation. First, following Lambert (2001), consider the observations ordered from smallest to largest, $y_1 \le y_2 \le \cdots \le y_n$; then
$$\hat{v}^{Gini} = 1 + \frac{1}{n} - \frac{2}{\hat\mu\, n^2}\sum_{i=1}^{n} y_i\,(n + 1 - i).$$
Finally, the rest of the components are estimated as follows:
$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} y_i,$$
$$\hat{F}_Y(y) = \frac{1}{n}\sum_{i=1}^{n} 1(y_i \le y),$$
$$\widehat{GL}_Y(y) = \frac{1}{n}\sum_{i=1}^{n} y_i\, 1(y_i \le y).$$
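The plug-in formulas for the Gini index translate into a short routine. The sketch below is illustrative: the function name `gini_rif` and the lognormal test data are assumptions, and the code presumes there are no ties in the sample, so that the empirical CDF at the $i$-th ordered observation equals $i/n$.

```python
import numpy as np


def gini_rif(y):
    """Return the sample Gini index and the plug-in RIF at each observation."""
    y = np.asarray(y, dtype=float)
    n = y.size
    order = np.argsort(y, kind="stable")
    y_sorted = y[order]
    mu = y_sorted.mean()
    ranks = np.arange(1, n + 1)

    gini = 1.0 + 1.0 / n - 2.0 * np.sum(y_sorted * (n + 1 - ranks)) / (mu * n**2)
    ecdf = ranks / n                      # F_hat at each ordered observation (no ties)
    gen_lorenz = np.cumsum(y_sorted) / n  # GL_hat at each ordered observation

    rif_sorted = (
        1.0
        + (1.0 - gini) / mu * y_sorted
        + 2.0 / mu * (y_sorted * (ecdf - 1.0) - gen_lorenz)
    )
    rif = np.empty(n)
    rif[order] = rif_sorted               # restore the original ordering of the data
    return gini, rif


rng = np.random.default_rng(2)
y = rng.lognormal(mean=0.0, sigma=0.5, size=5_000)  # illustrative positive outcomes
gini_hat, rif_values = gini_rif(y)
print(gini_hat, rif_values.mean())
```

With these plug-in components, the sample average of the RIF reproduces the sample Gini index, which is a convenient check before running the RIF regression.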

Appendix A.2. Theil Index

The Theil index is defined as
$$v^{Theil} = \frac{\eta}{\mu} - \ln(\mu),$$
where $\eta = \int y \ln(y)\, dF_Y(y)$ and $\mu = E(Y)$.
Using this notation, the RIF of the Theil index is:
$$\mathrm{RIF}(y, v^{Theil}, F_Y) = v^{Theil} + \frac{y \ln(y) - \eta}{\mu} - \frac{\eta + \mu}{\mu^2}\,(y - \mu).$$

Sample Estimator

Given a random sample $y_1, \ldots, y_n$, the statistics needed to implement the RIF as a plug-in estimator are:
$$\hat\eta = \frac{1}{n}\sum_{i=1}^{n} y_i \ln(y_i),$$
$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} y_i.$$
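A minimal sketch of the plug-in Theil RIF follows, assuming strictly positive outcomes; the function name `theil_rif` and the lognormal test data are illustrative choices.

```python
import numpy as np


def theil_rif(y):
    """Return the sample Theil index and the plug-in RIF at each observation."""
    y = np.asarray(y, dtype=float)   # assumes y > 0 so that ln(y) is defined
    mu = y.mean()
    eta = np.mean(y * np.log(y))
    theil = eta / mu - np.log(mu)
    rif = theil + (y * np.log(y) - eta) / mu - (eta + mu) / mu**2 * (y - mu)
    return theil, rif


rng = np.random.default_rng(3)
y = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)  # illustrative positive outcomes
theil_hat, rif_values = theil_rif(y)
print(theil_hat, rif_values.mean())
```

Both correction terms average to zero in the sample, so the mean of the RIF equals the sample Theil index.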

Appendix A.3. Atkinson Index

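The Atkinson index is defined as
$$v^{Atk}(\epsilon) = \begin{cases} 1 - \dfrac{\kappa(\epsilon)^{\frac{1}{1-\epsilon}}}{\mu} & \text{if } \epsilon \neq 1; \\[2ex] 1 - \dfrac{e^{\kappa(\epsilon)}}{\mu} & \text{if } \epsilon = 1, \end{cases}$$
where $\mu = E(Y)$, $\kappa(\epsilon) = \int y^{(1-\epsilon)\omega} \ln(y)^{(1-\omega)}\, dF_Y(y)$, $\omega = 1(\epsilon \neq 1)$, and $\epsilon$ is an inequality aversion parameter. The parameter $\epsilon$ defines the type of social preferences $W = W(y_1, \ldots, y_n)$ and is a value set by the researcher. The two extreme cases are $\epsilon = 0$, when the inequality index reflects a utilitarian welfare function ($W = \mu$), and $\epsilon \to \infty$, when it reflects a Rawlsian function ($W$ is a Leontief-type function).
The RIF for the Atkinson index is:
$$\mathrm{RIF}(y, v^{Atk}(\epsilon), F_Y) = v^{Atk}(\epsilon) + A(\epsilon, y) + B(\epsilon)\,(y - \mu),$$
where
$$A(\epsilon, y) = \begin{cases} \dfrac{\kappa^{\frac{\epsilon}{1-\epsilon}}}{(\epsilon - 1)\,\mu}\,\big(y^{1-\epsilon} - \kappa\big) & \text{if } \epsilon \neq 1; \\[2ex] -\dfrac{e^{\kappa}}{\mu}\,\big(\ln(y) - \kappa\big) & \text{if } \epsilon = 1, \end{cases}$$
and
$$B(\epsilon) = \begin{cases} \dfrac{\kappa^{\frac{1}{1-\epsilon}}}{\mu^2} & \text{if } \epsilon \neq 1; \\[2ex] \dfrac{e^{\kappa}}{\mu^2} & \text{if } \epsilon = 1. \end{cases}$$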

Sample Estimator

Given a random sample $y_1, \ldots, y_n$, the statistics needed to implement the RIF as a plug-in estimator are:
$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n} y_i,$$
$$\hat\kappa(\epsilon) = \frac{1}{n}\sum_{i=1}^{n} y_i^{(1-\epsilon)\omega} \ln(y_i)^{(1-\omega)},$$
where $\omega = 1(\epsilon \neq 1)$.
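The two branches of the Atkinson index can be handled in one routine. The sketch below is illustrative and assumes strictly positive outcomes; the function name `atkinson_rif`, the test data, and the values of the inequality aversion parameter are hypothetical choices. The terms $A$ and $B$ are coded as the derivatives of the index with respect to $\kappa$ and $\mu$, respectively.

```python
import numpy as np


def atkinson_rif(y, eps):
    """Return the sample Atkinson(eps) index and the plug-in RIF at each observation."""
    y = np.asarray(y, dtype=float)   # assumes y > 0
    mu = y.mean()
    if eps == 1:
        kappa = np.mean(np.log(y))
        index = 1.0 - np.exp(kappa) / mu
        a_term = -np.exp(kappa) / mu * (np.log(y) - kappa)
        b_coef = np.exp(kappa) / mu**2
    else:
        kappa = np.mean(y ** (1.0 - eps))
        index = 1.0 - kappa ** (1.0 / (1.0 - eps)) / mu
        a_term = (
            kappa ** (eps / (1.0 - eps)) / ((eps - 1.0) * mu) * (y ** (1.0 - eps) - kappa)
        )
        b_coef = kappa ** (1.0 / (1.0 - eps)) / mu**2
    rif = index + a_term + b_coef * (y - mu)
    return index, rif


rng = np.random.default_rng(4)
y = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)  # illustrative positive outcomes
results = {eps: atkinson_rif(y, eps) for eps in (0.5, 1, 2)}
for eps, (index, rif) in results.items():
    print(eps, index, rif.mean())
```

As with the other indices, the sample mean of the plug-in RIF equals the sample index for each value of the aversion parameter.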

Notes

1. For an introduction to influence functions, see van der Vaart (1998).
2. Martinez-Iriarte (2024) develops a sensitivity analysis procedure that accounts for both marginal and non-marginal (global) effects on unconditional quantiles, specifically when covariates are discrete.
3. Various methods exist for estimating unconditional quantile effects (UQEs). Indeed, Firpo et al. (2009) rigorously derive three distinct estimation methods.

References

  1. Alejo, J., Galvao, A. F., Martinez-Iriarte, J., & Montes-Rojas, G. (2024). Unconditional quantile partial effects via conditional quantile regression. Journal of Econometrics, 105678.
  2. Battiston, D., Garcia-Domench, C., & Gasparini, L. (2014). Could an increase in education raise income inequality? Evidence for Latin America. Latin American Journal of Economics, 51(1), 1–39.
  3. Bourguignon, F., Lustig, N., & Ferreira, F. (2024). The microeconomics of income distribution dynamics. Oxford University Press.
  4. Chernozhukov, V., Fernández-Val, I., & Melly, B. (2013). Inference on counterfactual distributions. Econometrica, 81(6), 2205–2268.
  5. Essama-Nssah, B., & Lambert, P. (2012). Influence functions for policy impact analysis. In J. Bishop, & R. Salas (Eds.), Inequality, mobility and segregation: Essays in honor of Jacques Silber. Research on economic inequality (Vol. 20, pp. 135–159). Emerald Group Publishing.
  6. Firpo, S., Fortin, N., & Lemieux, T. (2009). Unconditional quantile regression. Econometrica, 77(3), 953–973.
  7. Firpo, S., Fortin, N., & Lemieux, T. (2018). Decomposing wage distributions using recentered influence function regressions. Econometrics, 6(2), 28.
  8. Firpo, S., & Pinto, C. (2016). Identification and estimation of distributional impacts of interventions using changes in inequality measures. Journal of Applied Econometrics, 31(3), 457–486.
  9. Fortin, N., Lemieux, T., & Firpo, S. (2011). Decomposition methods in economics. In O. Ashenfelter, & D. Card (Eds.), Handbook of labor economics (Vol. 4, pp. 1–12). Elsevier.
  10. Inoue, A., Li, T., & Xu, Q. (2021). Two sample unconditional quantile effect. arXiv, arXiv:2105.09445.
  11. Lambert, P. (2001). The distribution and redistribution of income. Manchester University Press.
  12. Lemieux, T. (2006). Increasing residual wage inequality: Composition effects, noisy data, or rising demand for skill? American Economic Review, 96(3), 461–498.
  13. Martinez-Iriarte, J. (2024). Sensitivity analysis in unconditional quantile effects. arXiv, arXiv:2105.09445.
  14. Martínez-Iriarte, J., Montes-Rojas, G., & Sun, Y. (2024). Unconditional effects of general policy interventions. Journal of Econometrics, 238(2), 105570.
  15. Newey, W. K., & McFadden, D. (1994). Large sample estimation and hypothesis testing. In R. F. Engle, & D. L. McFadden (Eds.), Handbook of econometrics (Vol. 4, pp. 2111–2245). Elsevier.
  16. Oaxaca, R. L. (1973). Male-female wage differentials in urban labor markets. International Economic Review, 14(3), 693–709.
  17. Oaxaca, R. L., & Ransom, M. R. (1994). On discrimination and the decomposition of the wage differentials. Journal of Econometrics, 61, 5–21.
  18. Pagan, A., & Ullah, A. (1999). Nonparametric econometrics. Cambridge University Press.
  19. Sasaki, Y., Ura, T., & Zhang, Y. (2022). Unconditional quantile regression with high-dimensional data. Quantitative Economics, 13(3), 955–978.
  20. van der Vaart, A. (1998). Asymptotic statistics. Cambridge University Press.
  21. Vickers, C., & Ziebarth, N. L. (2022). The effects of the national war labor board on labor income inequality. Forthcoming American Economic Review. Available online: https://www.aeaweb.org/conference/2022/preliminary/paper/9B7GT7BY (accessed on 1 April 2025).

Figure 1. Asymmetric shift, education.

Figure 2. Asymmetric shift, experience.

Table 1. Estimators’ finite sample performance ($v$ = Gini index; $u_i \sim N(0, 1)$).

| Effect | n | Bias (θ = 0) | Var (θ = 0) | MSE (θ = 0) | Bias (θ = 0.3) | Var (θ = 0.3) | MSE (θ = 0.3) |
|---|---|---|---|---|---|---|---|
| Location | 50 | −0.0044 | 0.7563 | 0.7564 | −0.1031 | 2.9639 | 2.9745 |
| | 100 | 0.0081 | 0.3482 | 0.3483 | −0.0416 | 1.3842 | 1.3860 |
| | 500 | −0.0086 | 0.0547 | 0.0548 | −0.0260 | 0.2176 | 0.2183 |
| | 1000 | −0.0066 | 0.0299 | 0.0299 | −0.0184 | 0.1153 | 0.1157 |
| | 5000 | −0.0005 | 0.0062 | 0.0062 | 0.0020 | 0.0252 | 0.0252 |
| Scale | 50 | −0.1141 | 2.2196 | 2.2326 | 0.0156 | 6.9623 | 6.9625 |
| | 100 | −0.1114 | 1.0436 | 1.0560 | 0.0144 | 3.3553 | 3.3555 |
| | 500 | −0.0601 | 0.2052 | 0.2088 | −0.0307 | 0.6320 | 0.6329 |
| | 1000 | −0.0418 | 0.1078 | 0.1096 | −0.0195 | 0.3407 | 0.3411 |
| | 5000 | −0.0088 | 0.0205 | 0.0206 | −0.0039 | 0.0606 | 0.0606 |
| Both | 50 | −0.1185 | 3.5751 | 3.5891 | −0.0876 | 10.4689 | 10.4766 |
| | 100 | −0.1033 | 1.7181 | 1.7288 | −0.0273 | 5.2379 | 5.2387 |
| | 500 | −0.0688 | 0.2923 | 0.2970 | −0.0568 | 0.8887 | 0.8920 |
| | 1000 | −0.0485 | 0.1612 | 0.1635 | −0.0379 | 0.4704 | 0.4718 |
| | 5000 | −0.0094 | 0.0324 | 0.0325 | −0.0019 | 0.0882 | 0.0883 |
| Asymmetric (λ = 0.5) | 50 | 0.0048 | 0.1453 | 0.1453 | −0.0455 | 0.5998 | 0.6019 |
| | 100 | 0.0112 | 0.0658 | 0.0659 | −0.0179 | 0.2760 | 0.2763 |
| | 500 | 0.0016 | 0.0107 | 0.0107 | −0.0098 | 0.0439 | 0.0440 |
| | 1000 | 0.0019 | 0.0057 | 0.0057 | −0.0066 | 0.0235 | 0.0236 |
| | 5000 | 0.0033 | 0.0012 | 0.0012 | 0.0018 | 0.0052 | 0.0052 |
| Asymmetric (λ = 0.0) | 50 | −0.0044 | 0.7563 | 0.7564 | −0.1031 | 2.9639 | 2.9745 |
| | 100 | 0.0081 | 0.3482 | 0.3483 | −0.0416 | 1.3842 | 1.3860 |
| | 500 | −0.0086 | 0.0547 | 0.0548 | −0.0260 | 0.2176 | 0.2183 |
| | 1000 | −0.0066 | 0.0299 | 0.0299 | −0.0184 | 0.1153 | 0.1157 |
| | 5000 | −0.0005 | 0.0062 | 0.0062 | 0.0020 | 0.0252 | 0.0252 |
| Asymmetric (λ = 0.5) | 50 | −0.0365 | 4.3397 | 4.3410 | −0.2329 | 16.0205 | 16.0748 |
| | 100 | −0.0086 | 2.0279 | 2.0280 | −0.0939 | 7.6135 | 7.6223 |
| | 500 | −0.0350 | 0.3114 | 0.3127 | −0.0665 | 1.1873 | 1.1917 |
| | 1000 | −0.0270 | 0.1729 | 0.1737 | −0.0473 | 0.6259 | 0.6281 |
| | 5000 | −0.0059 | 0.0362 | 0.0363 | 0.0031 | 0.1357 | 0.1357 |

Note: calculations are based on 1000 Monte Carlo experiments; the Gini index is multiplied by 100.

Table 2. Estimators’ finite sample performance ($v$ = Gini index; $u_i \sim (\chi^2_1 - 1)/2$).

| Effect | n | Bias (θ = 0) | Var (θ = 0) | MSE (θ = 0) | Bias (θ = 0.3) | Var (θ = 0.3) | MSE (θ = 0.3) |
|---|---|---|---|---|---|---|---|
| Location | 50 | 0.0126 | 0.7075 | 0.7077 | 0.0170 | 3.8324 | 3.8327 |
| | 100 | 0.0100 | 0.2943 | 0.2944 | 0.0197 | 1.6137 | 1.6140 |
| | 500 | 0.0077 | 0.0566 | 0.0567 | 0.0074 | 0.3119 | 0.3119 |
| | 1000 | 0.0036 | 0.0266 | 0.0266 | 0.0085 | 0.1438 | 0.1439 |
| | 5000 | 0.0002 | 0.0057 | 0.0057 | −0.0035 | 0.0298 | 0.0299 |
| Scale | 50 | −0.1601 | 1.8447 | 1.8704 | −0.0454 | 7.8484 | 7.8505 |
| | 100 | −0.1201 | 0.9508 | 0.9652 | −0.0340 | 3.9788 | 3.9800 |
| | 500 | −0.0499 | 0.2079 | 0.2104 | −0.0449 | 0.8343 | 0.8364 |
| | 1000 | −0.0381 | 0.0933 | 0.0947 | 0.0000 | 0.3934 | 0.3934 |
| | 5000 | −0.0040 | 0.0209 | 0.0209 | 0.0034 | 0.0845 | 0.0845 |
| Both | 50 | −0.1477 | 2.2522 | 2.2741 | −0.0286 | 5.7489 | 5.7497 |
| | 100 | −0.1104 | 1.2147 | 1.2268 | −0.0144 | 3.1012 | 3.1014 |
| | 500 | −0.0424 | 0.2608 | 0.2626 | −0.0375 | 0.6949 | 0.6963 |
| | 1000 | −0.0347 | 0.1159 | 0.1171 | 0.0084 | 0.3021 | 0.3022 |
| | 5000 | −0.0040 | 0.0265 | 0.0265 | −0.0002 | 0.0645 | 0.0645 |
| Asymmetric (λ = 0.5) | 50 | 0.0137 | 0.1493 | 0.1495 | 0.0089 | 0.8879 | 0.8880 |
| | 100 | 0.0118 | 0.0617 | 0.0618 | 0.0109 | 0.3816 | 0.3817 |
| | 500 | 0.0084 | 0.0118 | 0.0119 | 0.0065 | 0.0728 | 0.0729 |
| | 1000 | 0.0063 | 0.0056 | 0.0056 | 0.0048 | 0.0342 | 0.0343 |
| | 5000 | 0.0036 | 0.0012 | 0.0012 | 0.0000 | 0.0071 | 0.0071 |
| Asymmetric (λ = 0.0) | 50 | 0.0126 | 0.7075 | 0.7077 | 0.0170 | 3.8324 | 3.8327 |
| | 100 | 0.0100 | 0.2943 | 0.2944 | 0.0197 | 1.6137 | 1.6140 |
| | 500 | 0.0077 | 0.0566 | 0.0567 | 0.0074 | 0.3119 | 0.3119 |
| | 1000 | 0.0036 | 0.0266 | 0.0266 | 0.0085 | 0.1438 | 0.1439 |
| | 5000 | 0.0002 | 0.0057 | 0.0057 | −0.0035 | 0.0298 | 0.0299 |
| Asymmetric (λ = 0.5) | 50 | −0.0075 | 3.6517 | 3.6517 | 0.0299 | 17.5340 | 17.5348 |
| | 100 | −0.0053 | 1.5628 | 1.5629 | 0.0385 | 7.3357 | 7.3372 |
| | 500 | 0.0035 | 0.3033 | 0.3033 | 0.0060 | 1.4437 | 1.4437 |
| | 1000 | −0.0031 | 0.1410 | 0.1410 | 0.0189 | 0.6523 | 0.6526 |
| | 5000 | −0.0038 | 0.0307 | 0.0307 | −0.0088 | 0.1349 | 0.1349 |

Note: calculations are based on 1000 Monte Carlo experiments; the Gini index is multiplied by 100.

Table 3. Education effects.

| Effect | Gini | Theil | Atkinson(1) | Atkinson(2) |
|---|---|---|---|---|
| Location | 0.6086 * | 0.5699 * | 0.6282 * | 1.2625 * |
| | (0.0173) | (0.0218) | (0.0151) | (0.0233) |
| Scale | −4.8003 * | −4.7931 * | −4.1004 * | −6.4685 * |
| | (0.0824) | (0.1074) | (0.0714) | (0.1047) |
| Both | −4.1918 * | −4.2232 * | −3.4722 * | −5.2060 * |
| | (0.0777) | (0.1019) | (0.0681) | (0.1031) |
| Asymmetric (λ = 0.5) | 0.3303 * | 0.3111 * | 0.3347 * | 0.6600 * |
| | (0.0086) | (0.0109) | (0.0075) | (0.0115) |
| Asymmetric (λ = 0) | 0.6086 * | 0.5699 * | 0.6282 * | 1.2625 * |
| | (0.0173) | (0.0218) | (0.0151) | (0.0233) |
| Asymmetric (λ = 0.5) | −0.1681 * | −0.2530 * | 0.0941 * | 0.7458 * |
| | (0.0383) | (0.0494) | (0.0343) | (0.0563) |

Notes: the sample size is 266,956 observations; bootstrap standard errors (500 replications) are in parentheses; * denotes p < 0.01.

Table 4. Experience effects.

| Effect | Gini | Theil | Atkinson(1) | Atkinson(2) |
|---|---|---|---|---|
| Location | −0.4025 * | −0.3963 * | −0.3291 * | −0.4686 * |
| | (0.0059) | (0.0068) | (0.0053) | (0.0092) |
| Scale | −4.5253 * | −4.5701 * | −3.7068 * | −5.2172 * |
| | (0.0929) | (0.1259) | (0.0826) | (0.1267) |
| Both | −4.9278 * | −4.9664 * | −4.0358 * | −5.6858 * |
| | (0.0959) | (0.1292) | (0.0853) | (0.1314) |
| Asymmetric (λ = 0.5) | −0.0620 * | −0.0607 * | −0.0512 * | −0.0744 * |
| | (0.0010) | (0.0011) | (0.0009) | (0.0015) |
| Asymmetric (λ = 0) | −0.4025 * | −0.3963 * | −0.3291 * | −0.4686 * |
| | (0.0059) | (0.0068) | (0.0053) | (0.0092) |
| Asymmetric (λ = 0.5) | −2.8433 * | −2.8099 * | −2.3207 * | −3.2877 * |
| | (0.0410) | (0.0482) | (0.0370) | (0.0626) |

Notes: the sample size is 266,956 observations; bootstrap standard errors (500 replications) are in parentheses; * denotes p < 0.01.