*2.3. Empirical Strategy*

How can maternal depression undermine children's cognitive outcomes? We framed our analysis following Frank and Meara's model (FMM) [20] of the effects of maternal depression on the formation of children's skills, which was inspired by Cunha and Heckman's intergenerational model of human capability formation [13,21]. FMM assumes that a skill *S* is formed in period *t* through a production function *f* whose determinants are realized in the previous period (*t* − 1). The model can be represented as follows:

$$S\_t = f(S\_{t-1}, I\_{t-1}, PS; M\_{t-1}) \tag{1}$$

where *S* is the level of skill formation, *PS* represents parental skill attributes (education, cognitive abilities, etc.), *I*<sub>*t*−1</sub> indicates monetary and non-monetary investments in child capabilities, and *M*<sub>*t*−1</sub> is maternal mental health status at time *t* − 1. Mental health problems that interfere with mother-child interactions or undermine maternal behavior during *t* − 1 can undercut the effectiveness of parental skills and/or reduce the productivity of investments, resulting in lower cognitive ability for children later in life.

To estimate this theoretical model empirically, we exploited information on maternal mental health from the first round and data on cognitive outcomes for our sample of 1095 children for whom PPVT Z-scores are available from the second and third rounds of data collection. A naïve estimation of the effects of exposure to lagged maternal stress on cognitive development regresses the PPVT Z-scores in 2006/2007 and 2009/2010 on a measure of maternal stress in 2002, using the following specification:

$$PPVT\_{i,t} = \alpha\_0 + \alpha\_1 MH\_{i,t-1} + \alpha\_2 C\_{i,t} + \alpha\_3 M\_{i,t} + \alpha\_4 H\_{i,t} + \varepsilon\_{i,t} \tag{2}$$

where *PPVT*<sub>*i*,*t*</sub> represents the PPVT Z-score for child *i* in period *t* (i.e., 2006/2007 or 2009/2010). *MH*<sub>*i*,*t*−1</sub> captures the value of any of the three maternal mental health indexes we estimated using data from 2002. *C*<sub>*i*,*t*</sub>, *M*<sub>*i*,*t*</sub>, and *H*<sub>*i*,*t*</sub> are vectors of observable, time-varying child, mother, and household/community characteristics that can lead to differences in cognitive ability across children and influence their parents' investments in them. These vectors include all the variables presented in Table 1, all of which have been documented to affect children's cognition (for a review, see [6]). *ε*<sub>*i*,*t*</sub> represents a random, idiosyncratic error term.

Under the assumption that *MH*<sub>*i*,*t*−1</sub> is fully exogenous, the parameter of interest, α̂<sub>1</sub>, captures the difference in PPVT performance in each period *t* associated with maternal depression in 2002. Because the specification uses measures of maternal depression and child's vocabulary taken at different points in time, it addresses, to a large extent, the possibility of reverse causality. However, we cannot entirely rule out that unobserved factors (such as pollution, access to services, or changes that affected the household between rounds) influenced both maternal mental health and children's outcomes. Consequently, we used an instrumental variable (IV) approach to address the possibility of omitted variable bias.

In addition, the IV estimation helped to remedy the problem of measurement error in the main explanatory variable, which could be a relevant factor in the context of this paper. In particular, our main explanatory variable captured symptoms of mental health issues that affected mothers in the 30 days prior to the survey in 2002. We used those symptoms to estimate indexes of mental health, which constitute proxies for the unobserved, latent variable *MH*<sup>∗</sup><sub>*i*,*t*−1</sub>. Thus, estimations of Equation (2) that incorporate the proxy for maternal depression can produce inconsistent estimators of α<sub>1</sub> and lead to attenuation bias in these coefficients if *MH*<sub>*i*,*t*−1</sub> and the error term *ε*<sub>*i*,*t*</sub> are negatively correlated [22,23].
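To see where the attenuation comes from, consider the textbook case of classical measurement error. This is an assumption made here for illustration and is not stated in the text: the observed index equals the latent variable plus noise *u* that is uncorrelated with both the latent variable and the outcome error, and there are no other regressors. Under those assumptions, the OLS estimator converges to a shrunken version of α<sub>1</sub>:

$$MH\_{i,t-1} = MH^{\ast}\_{i,t-1} + u\_{i}, \qquad \operatorname{plim}\, \hat{\alpha}\_1^{OLS} = \alpha\_1 \cdot \frac{\sigma^2\_{MH^{\ast}}}{\sigma^2\_{MH^{\ast}} + \sigma^2\_{u}}$$

The ratio is below one whenever the noise variance is positive, so the OLS coefficient is pulled toward zero. An instrument that is correlated with the latent variable but uncorrelated with *u* removes this source of bias.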

The IV approach hinges on finding observable covariates that are correlated with maternal mental health but that do not directly affect child cognitive status and are unrelated to possible omitted variables. With this in mind, we defined our instrument by relying on existing evidence of the negative effect of exposure to exogenous shocks during pregnancy or during the first months after birth on children's cognitive outcomes [3,24–29]. Some of these papers find that the main mechanism driving this relationship is maternal stress induced by the shock. Therefore, because the first round of YL asked caregivers about exposure to shocks, we used these shocks to instrument maternal mental health. We excluded natural disasters and decreases in food availability due to lack of variation (less than 0.18% of households reported any of these shocks), and job or income loss because it can be highly correlated with the fact that the woman had just given birth. Hence, we restricted our analysis to the remaining three shocks (loss of crop or livestock, death or severe illness, and changes in household composition) as potential instruments for maternal mental health. In this sense, Equation (2) corresponds to our second-stage estimation, and our first stage is given by the following:

$$MH\_{i,t-1} = \beta\_0 + \beta\_1 S\_{i,t-1}^{j} + X\_i + \varepsilon\_i \tag{3}$$

where *S*<sup>*j*</sup><sub>*i*,*t*−1</sub> indicates whether the mother of child *i* was affected by shock *j*, and *X*<sub>*i*</sub> represents the vectors of child, mother, and household characteristics described in Equation (2).
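The two-stage logic of Equations (2) and (3) can be sketched on simulated data. This is a minimal illustration, not the estimation used in the paper: it has one binary shock as the only instrument, no control vectors, and a made-up data-generating process in which the observed index is a noisy proxy of latent mental health and an unobserved confounder affects both mental health and the test score. All names (`simulate`, `two_stage_least_squares`, `mh_obs`, `ppvt`) and all coefficients are hypothetical.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z, one endogenous regressor x, outcome y."""
    # First stage: project the endogenous regressor on the instrument.
    b1 = ols_slope(z, x)
    b0 = mean(x) - b1 * mean(z)
    x_hat = [b0 + b1 * zi for zi in z]
    # Second stage: regress the outcome on the first-stage fitted values.
    return ols_slope(x_hat, y)


def simulate(n=20000, true_effect=-0.5, seed=0):
    """Hypothetical data-generating process; every coefficient is assumed."""
    rng = random.Random(seed)
    shock, mh_obs, ppvt = [], [], []
    for _ in range(n):
        s = 1.0 if rng.random() < 0.3 else 0.0          # exogenous shock
        c = rng.gauss(0, 1)                             # unobserved confounder
        mh_star = 0.8 * s + 0.5 * c + rng.gauss(0, 1)   # latent mental health
        shock.append(s)
        mh_obs.append(mh_star + rng.gauss(0, 1))        # noisy observed proxy
        ppvt.append(true_effect * mh_star - 0.3 * c + rng.gauss(0, 1))
    return shock, mh_obs, ppvt


shock, mh_obs, ppvt = simulate()
ols_estimate = ols_slope(mh_obs, ppvt)
iv_estimate = two_stage_least_squares(shock, mh_obs, ppvt)
print(f"OLS: {ols_estimate:.3f}  2SLS: {iv_estimate:.3f}  (true effect: -0.5)")
```

In this particular simulation the measurement error dominates, so the OLS estimate ends up closer to zero than the true effect, while the 2SLS estimate recovers it up to sampling noise. With a single instrument and no controls, the 2SLS slope equals the ratio cov(shock, outcome) / cov(shock, regressor).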

The instrument had to meet two conditions to be valid. First, it had to be relevant: the correlation between the shock and maternal mental health had to be high and statistically different from zero. To test this condition, we present statistics of the shocks and measures of maternal mental health in Table 2, panels A and B. Panel C summarizes the correlations between each measure of maternal mental health and the three shocks under analysis. All correlations were statistically significant. In particular, the correlation between the loss of crop or livestock and the different indexes of maternal mental health ranges between 0.34 and 0.70.
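A common complement to reporting raw correlations is the first-stage F-statistic. The sketch below is an illustration rather than a check reported in the paper: it assumes a single instrument and omits the control vectors of Equation (3), in which case the F-statistic follows directly from the correlation *r* as *r*²(*n* − 2)/(1 − *r*²). The function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_stage_f(instrument, endogenous):
    """F-statistic of a first stage with one instrument and no controls.

    With a single regressor, F equals the squared t-statistic of the slope,
    which can be written as r^2 * (n - 2) / (1 - r^2).
    """
    n = len(instrument)
    r2 = pearson_r(instrument, endogenous) ** 2
    return r2 * (n - 2) / (1.0 - r2)


# Toy data: shock indicator and a mental health index for four mothers.
toy_shock = [0, 0, 1, 1]
toy_index = [1, 2, 3, 5]
print(first_stage_f(toy_shock, toy_index))
```

A frequently cited rule of thumb treats a first-stage F-statistic below roughly 10 as a warning sign of a weak instrument; with controls in the first stage, the statistic would need to be computed from the partial correlation instead.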

The second condition for the instrument to be valid was exogeneity. In other words, suffering a shock during pregnancy or during the first months after birth should not have an impact on children's vocabulary at the age of 5 other than through its impact on maternal mental health in the period when the shock occurred. Three potential concerns might threaten this assumption, and we aimed to address them with our specification. The first concern was the nutritional effect of an income shock. A past shock can affect children's nutritional status in *t* − 1, which can then translate into worse cognitive development later in life. To address this concern, we controlled for several child anthropometric measures. The second concern was learning resources: the shock could limit the child's exposure to enriching opportunities or materials that might help her to improve her vocabulary during childhood. To control for this potential channel, we included in our specification measures of household wealth and consumption in *t* − 1. Finally, the third concern was that the shock limited the additional stimulation that household members other than the mother and her partner might have provided to the child. For example, in extended households, non-working relatives tended to contribute to childcare duties. The shock may have forced these household members to find a job, which could, in turn, limit opportunities for child stimulation and consequent development. Since extended households were larger than non-extended ones, we controlled for that characteristic by including household size in our model. In addition, we tested the exogeneity assumption by estimating the correlation between the measure of vocabulary and the shock, conditional on the variables that captured differences in availability of learning resources, child's nutritional status, and the rest of the control variables. These results are presented in Appendix B.


**Table 2.** Correlations between Shocks and Maternal Mental Health Indexes (MHI) in 2002.

Table 2 presents summary statistics (mean and standard deviation) of the maternal mental health indexes and of the shocks experienced by the mothers in our analysis sample. These variables are available in the first round of the Peruvian Young Lives Survey (2002). The sample is restricted to children with available information on maternal mental health in 2002 and PPVT scores in 2006 and 2009. Panel A presents statistics of the mental health indexes. Mental health index 1 is the standardized average of the SRQ-20 items. Mental health indexes 2 and 3 are standardized indexes estimated using principal components and factor analysis, respectively. Panel B presents the percentage of mothers reporting exposure to any of the four shocks. Panel C shows correlations between the mental health indexes and the shocks. \*\*\* and \*\* indicate statistical significance at 1% and 5%, respectively.

Finally, having at least three instruments and a large set of potential control variables posed the challenge of selecting the "right" set. On the one hand, using too few controls or the wrong ones may lead to omitted variable bias. On the other hand, using too many may lead to overfitting. To address this issue, we estimated the parameters of interest using the Instrumental Variables Least Absolute Shrinkage and Selection Operator (IV-LASSO), a routine for estimating structural parameters in linear models with many controls and/or instruments. In particular, we used the post-double selection (PDS) methodology [30,31], as implemented for Stata by Ahrens et al. [32].
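The idea behind post-double selection can be sketched in a few lines. The example below is a simplified illustration and not the routine used in the paper: it treats the regressor of interest as exogenous (no instruments), uses a hand-picked fixed penalty rather than a data-driven one, and relies on a small pure-Python coordinate-descent lasso written for this sketch. The data-generating process and all names (`lasso_select`, `ols_first_coef`, `post_double_selection`) are hypothetical.

```python
import random


def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def soft_threshold(value, penalty):
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_select(columns, target, penalty, sweeps=50):
    """Indices of columns with non-zero lasso coefficients.

    Coordinate descent on (1/2n)*||target - X b||^2 + penalty*||b||_1,
    using centred data.
    """
    n = len(target)
    cols = [center(c) for c in columns]
    resid = center(target)
    coefs = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            scale = sum(a * a for a in col) / n
            # Covariance of column j with the residual that excludes column j.
            rho = sum(a * r for a, r in zip(col, resid)) / n + scale * coefs[j]
            new = soft_threshold(rho, penalty) / scale
            if new != coefs[j]:
                delta = new - coefs[j]
                resid = [r - delta * a for r, a in zip(resid, col)]
                coefs[j] = new
    return {j for j, b in enumerate(coefs) if b != 0.0}


def ols_first_coef(regressors, target):
    """Coefficient on regressors[0] from OLS of target on all regressors."""
    cols = [center(c) for c in regressors]
    y = center(target)
    k = len(cols)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(cols[i], y))] for i in range(k)]
    for i in range(k):
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return a[0][k]


def post_double_selection(treatment, controls, outcome, penalty):
    # Step 1: controls that predict the outcome.
    picked_for_outcome = lasso_select(controls, outcome, penalty)
    # Step 2: controls that predict the regressor of interest.
    picked_for_treatment = lasso_select(controls, treatment, penalty)
    # Step 3: plain OLS on the regressor plus the union of selected controls.
    selected = sorted(picked_for_outcome | picked_for_treatment)
    regressors = [treatment] + [controls[j] for j in selected]
    return ols_first_coef(regressors, outcome), selected


# Simulated example: only the first of eight controls is a confounder.
rng = random.Random(1)
n, p = 1000, 8
controls = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
treatment = [1.5 * controls[0][i] + rng.gauss(0, 1) for i in range(n)]
outcome = [1.0 * treatment[i] + 2.0 * controls[0][i] + rng.gauss(0, 1)
           for i in range(n)]

naive = ols_first_coef([treatment], outcome)
pds_estimate, selected = post_double_selection(treatment, controls, outcome, 0.3)
print(f"naive: {naive:.3f}  PDS: {pds_estimate:.3f}  selected controls: {selected}")
```

Taking the union of the two selected sets is what protects against dropping a control that matters for the regressor of interest but is only weakly related to the outcome. The IV version extends the same selection step to the instruments.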
