**2. Methodology**

First, the health function was estimated to investigate the relationship between IHEGs and health. The instrumental variable method was then used to address the endogeneity problem and to identify the causal relationship between couples' education gaps and health. The ordinary least squares (OLS) model is presented in Equation (1).

$$H\_{\rm iC} = \alpha + \theta IHEG\_{\rm iC} + X\_{\rm iC}' \beta + D\_{\rm C}' \delta + \varepsilon\_{\rm iC} \tag{1}$$

Here, *iC* refers to individual *i* in country *C*; *H* is the individual's health status (SRH, the mental health index, or objective health); *IHEG* denotes the intrahousehold education gap, that is, the education gap between spouses; *X* is a vector of control variables; *D* represents the country dummy variables; α is a constant; θ, β, and δ are the coefficients to be estimated; and ε is an error term. A negative and statistically significant θ indicates that a larger IHEG may worsen an individual's health.
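
As an illustration of how Equation (1) could be estimated, the following is a minimal sketch on synthetic data. The sample size, the number of countries, the two placeholder controls, and all coefficient values are assumptions made for the example and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_countries = 2000, 4

# Synthetic data: every value below is illustrative, not from the study.
country = rng.integers(0, n_countries, size=n)
iheg = rng.normal(size=n)              # intrahousehold education gap
x = rng.normal(size=(n, 2))            # two placeholder control variables
true_theta = -0.3                      # assumed effect of IHEG on health
country_effect = np.array([0.0, 0.5, -0.2, 0.1])
h = (1.0 + true_theta * iheg + x @ np.array([0.4, -0.1])
     + country_effect[country] + rng.normal(size=n))


def country_dummies(codes, k):
    """One-hot country indicators; the first country is the reference group."""
    return np.eye(k)[codes][:, 1:]


def ols(y, regressors):
    """Least-squares coefficients for y regressed on [1, regressors]."""
    design = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


regressors = np.column_stack([iheg, x, country_dummies(country, n_countries)])
coef = ols(h, regressors)
theta_hat = coef[1]                    # coefficient on IHEG (theta in Eq. 1)
print(f"theta_hat = {theta_hat:.3f}")
```

In this sketch, a negative `theta_hat` corresponds to the case in which a larger gap is associated with worse health.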

The OLS model may suffer from an endogeneity problem. For example, if individuals with poor health prefer to marry highly educated partners for financial benefits, the main independent variable of interest, *IHEG*, is correlated with the error term. To address this problem, the instrumental variable (IV) method was used in this study [43,44]. The first-stage and second-stage estimation equations are given in Equations (2) and (3), respectively.

$$IHEG\_{\rm iC} = b\_0 + Z\_{\rm iC}' b\_1 + X\_{\rm iC}' b\_2 + D\_{\rm C}' b\_3 + \mu\_{\rm iC} \tag{2}$$

$$H\_{\rm iC} = \alpha + \theta \widehat{IHEG}\_{\rm iC} + X\_{\rm iC}' \beta + D\_{\rm C}' \delta + \varepsilon\_{\rm iC} \tag{3}$$

In Equations (2) and (3), *b*0 is a constant; *b*1, *b*2, and *b*3 are the coefficients to be estimated; μ is an error term; *Z* is the set of instrumental variables (e.g., parent's highest education level); and $\widehat{IHEG}$ is the fitted value of *IHEG* obtained from the first-stage regression in Equation (2). The weak instrument test and the Sargan test were used to test for the endogeneity problem and to judge the statistical validity of the instruments [45].
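
The two-stage procedure in Equations (2) and (3) can be sketched as follows. This is a hand-rolled illustration on synthetic data, not the study's estimation code: the two instruments, the single control, the unobserved `confounder`, and all coefficient values are assumptions. The first-stage F statistic for the excluded instruments stands in for a weak instrument check, and the Sargan statistic is computed as *n* times the R² from regressing the 2SLS residuals on all instruments. Country dummies are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
true_theta = -0.3                      # assumed causal effect of IHEG

# Synthetic data: `confounder` is unobserved and affects both IHEG and
# health, which makes IHEG endogenous in a naive OLS regression.
confounder = rng.normal(size=n)
z = rng.normal(size=(n, 2))            # two hypothetical instruments
x = rng.normal(size=(n, 1))            # one placeholder control
iheg = z @ np.array([0.8, 0.5]) + 0.3 * x[:, 0] + confounder + rng.normal(size=n)
h = 1.0 + true_theta * iheg + 0.4 * x[:, 0] - confounder + rng.normal(size=n)


def fit(y, design):
    """Least-squares coefficients for y regressed on the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


exog = np.column_stack([np.ones(n), x])        # constant and controls
instruments = np.column_stack([exog, z])       # exogenous regressors plus Z

# First stage (Eq. 2): regress IHEG on instruments and controls.
iheg_hat = instruments @ fit(iheg, instruments)

# First-stage F statistic for the excluded instruments (weak instrument check).
rss_full = np.sum((iheg - iheg_hat) ** 2)
rss_restricted = np.sum((iheg - exog @ fit(iheg, exog)) ** 2)
q = z.shape[1]
first_stage_f = ((rss_restricted - rss_full) / q) / (rss_full / (n - instruments.shape[1]))

# Second stage (Eq. 3): replace IHEG with its first-stage fitted value.
# Note: standard errors from this naive second stage would not be valid.
coef_2sls = fit(h, np.column_stack([exog, iheg_hat]))
theta_2sls = coef_2sls[-1]

# Naive OLS for comparison (biased here because of the confounder).
theta_ols = fit(h, np.column_stack([exog, iheg]))[-1]

# Sargan statistic: residuals use the actual IHEG with the 2SLS coefficients.
resid = h - np.column_stack([exog, iheg]) @ coef_2sls
resid_fit = instruments @ fit(resid, instruments)
sargan = n * np.sum(resid_fit ** 2) / np.sum(resid ** 2)

print(f"OLS theta = {theta_ols:.3f}, 2SLS theta = {theta_2sls:.3f}")
print(f"first-stage F = {first_stage_f:.1f}, Sargan = {sargan:.3f}")
```

With two instruments and one endogenous regressor, the Sargan statistic has one degree of freedom under the null hypothesis that the instruments are valid, so it would be compared with a chi-squared(1) critical value (3.84 at the 5% level).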

Second, to investigate the possible channels through which IHEGs affect health, multiple regression models were used, as shown in Equation (4):

$$Y\_{\rm iC} = \alpha + \theta IHEG\_{\rm iC} + X\_{\rm iC}' \beta + D\_{\rm C}' \delta + \varepsilon\_{\rm iC} \tag{4}$$

In Equation (4), *Y* represents each channel variable in turn: income satisfaction, weekly working days, overcoming difficulties, satisfaction with health or medical care, and attending environmental activities as a volunteer.
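
Because Equation (4) uses the same regressors for every channel variable, the separate regressions can be estimated together by stacking the outcomes as columns. The sketch below illustrates this on synthetic data; the three channel names, the controls, and the coefficient values are hypothetical placeholders, and each outcome is treated as continuous for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_countries = 3000, 3

# Hypothetical channel names; the data below are synthetic.
channels = ["income_satisfaction", "weekly_working_days", "overcoming_difficulties"]
true_thetas = np.array([-0.2, 0.1, 0.0])   # assumed IHEG effect per channel

country = rng.integers(0, n_countries, size=n)
dummies = np.eye(n_countries)[country][:, 1:]   # first country is the reference
iheg = rng.normal(size=n)
x = rng.normal(size=(n, 2))                     # two placeholder controls

# Each column of `y` is one channel outcome Y in Eq. 4.
y = (0.5 + np.outer(iheg, true_thetas)
     + (x @ np.array([0.3, -0.3]))[:, None]
     + rng.normal(size=(n, len(channels))))

# One design matrix serves all outcomes; lstsq solves every column at once.
design = np.column_stack([np.ones(n), iheg, x, dummies])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # shape: (6, 3)

theta_by_channel = dict(zip(channels, coef[1]))     # row 1 holds theta
for name, theta in theta_by_channel.items():
    print(f"{name}: theta = {theta:.3f}")
```

Each column of `coef` equals what a separate OLS regression for that channel would return, so the sign and size of θ can be compared across channels.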
