#### *3.1. Propensity Score-Matching Model*

Traditional regression methods face two limitations when used to test the impact of CSR information disclosure on innovation input. First, they can only show that information disclosure is correlated with innovation input; they do not provide sufficient grounds to conclude that the former causes changes in the intensity of the latter. Second, and more importantly, an enterprise's decision to voluntarily disclose social responsibility information is affected by many factors, such as enterprise size, the asset–liability ratio, and organizational redundancy, which means that voluntary disclosure is not a random event. Under such circumstances, traditional methods may yield biased estimates owing to sample selection bias and may even confound the evaluation of the disclosure effect.

Propensity score matching (PSM), proposed by Rosenbaum and Rubin (1985) [43], is a classic counterfactual research model that is often used to measure the consequences of a given policy or event. The PSM model collapses the multiple variables that affect CSR information disclosure and innovation input into a one-dimensional conditional probability of being treated; in short, it combines multiple dimensions into a single score. Each voluntary-disclosure enterprise (treatment group) is then matched with the non-disclosure enterprise (control group) whose score is closest. As a result, apart from the difference in disclosure, the two groups have similar observed characteristics, so the difference in innovation input between them can be attributed to the disclosure of social responsibility information. PSM is therefore a causal inference method that can mitigate the effect of non-random sample selection and is suitable for measuring the impact of voluntary CSR announcements on innovation input. Following the relevant studies [44,45], the PSM procedure is divided into the following main steps:

(1) The main factors affecting the voluntary release of social responsibility reports are selected as the covariates for matching samples between the voluntary-disclosure group and the non-disclosure group.

(2) A Logit model is used to reduce the selected multidimensional covariates to one dimension, namely the probability of the "voluntary release of a social responsibility report" (*volun*), expressed as:

$$\text{Pscore}(Z) = P(Z) = \Pr[\text{volun} = 1 \mid Z] = E[\text{volun} \mid Z]. \tag{1}$$

In Formula (1), *Z* represents the covariates affecting whether the enterprise releases its social responsibility report voluntarily, and *Pscore*(*Z*) represents the propensity score, that is, the probability that the enterprise voluntarily releases a social responsibility report given *Z*.

(3) The probability that each enterprise voluntarily releases a social responsibility report is calculated, an appropriate matching method is selected for the treatment-group samples, and a new control group is formed from the successfully matched samples of the original control group.

(4) After the common support test and the balance test are passed, the average treatment effect on the treated (ATT) of releasing a social responsibility report on innovation input is calculated:

$$\begin{array}{l} ATT = E[RD_{1i} - RD_{0i} \mid volun_i = 1] \\ = E\{E[RD_{1i} - RD_{0i} \mid volun_i = 1, P(Z_i)]\} \\ = E\{E[RD_{1i} \mid volun_i = 1, P(Z_i)] - E[RD_{0i} \mid volun_i = 0, P(Z_i)] \mid volun_i = 1\} \end{array} \tag{2}$$

where *RD*<sub>1*i*</sub> and *RD*<sub>0*i*</sub> denote the innovation input level of enterprise *i* when it voluntarily releases a social responsibility announcement and when it does not, respectively.
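Steps (1) to (4) can be sketched in Python on simulated data. This is a minimal illustration, not the estimation procedure used in this study: the covariates `size` and `lev`, the sample size, the assumed true effect of 0.5, the gradient-ascent Logit fit, and one-to-one nearest-neighbour matching with replacement are all assumptions made for the example, and the balance test is omitted.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(Z, volun, iters=500, lr=0.5):
    """Fit Pr[volun = 1 | Z] by gradient ascent on the mean log-likelihood."""
    n, k = len(Z), len(Z[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(iters):
        grad = [0.0] * (k + 1)
        for z, d in zip(Z, volun):
            err = d - sigmoid(w[0] + sum(wj * zj for wj, zj in zip(w[1:], z)))
            grad[0] += err
            for j, zj in enumerate(z):
                grad[j + 1] += err * zj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def pscore(w, z):
    # Propensity score P(Z) from Formula (1) for one enterprise.
    return sigmoid(w[0] + sum(wj * zj for wj, zj in zip(w[1:], z)))


def att_nearest_neighbor(scores, volun, rd):
    """One-to-one nearest-neighbour matching with replacement on the score."""
    controls = [(s, y) for s, d, y in zip(scores, volun, rd) if d == 0]
    lo = min(s for s, _ in controls)
    hi = max(s for s, _ in controls)
    diffs = []
    for s, d, y in zip(scores, volun, rd):
        if d == 1 and lo <= s <= hi:  # crude common-support trim
            _, y0 = min(controls, key=lambda c: abs(c[0] - s))
            diffs.append(y - y0)  # RD_1i minus the matched stand-in for RD_0i
    return sum(diffs) / len(diffs), len(diffs)


# Simulated data (hypothetical): selection into disclosure depends on Z.
rng = random.Random(0)
TRUE_EFFECT = 0.5
Z, volun, rd = [], [], []
for _ in range(600):
    size, lev = rng.gauss(0, 1), rng.gauss(0, 1)
    d = 1 if rng.random() < sigmoid(size + lev) else 0
    Z.append((size, lev))
    volun.append(d)
    rd.append(size + lev + TRUE_EFFECT * d + rng.gauss(0, 0.5))

w = fit_logit(Z, volun)
scores = [pscore(w, z) for z in Z]
att, n_matched = att_nearest_neighbor(scores, volun, rd)

treated = [y for y, d in zip(rd, volun) if d == 1]
untreated = [y for y, d in zip(rd, volun) if d == 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)

print(f"naive difference in means: {naive:.3f}")
print(f"matched ATT: {att:.3f} ({n_matched} treated firms on common support)")
```

Because the simulated covariates raise both the probability of disclosure and the outcome, the naive difference in means overstates the effect, whereas the matched estimate should lie closer to the assumed value of 0.5.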

#### *3.2. Quantile Regression Model*

To explore the impact of CSR performance on innovation input, and to examine how the strength of this relationship changes across levels of the dependent variable, quantile regression is necessary, because the traditional ordinary least squares (OLS) method analyzes only the influence of the independent variables on the conditional expectation of the dependent variable. For more complex relationships, mean regression is clearly inadequate. In addition, OLS estimates can be unreliable when the distribution of the dependent variable has thick tails or the errors are heteroscedastic, because these features violate the basic OLS assumptions. In contrast, quantile regression (QR) extends traditional regression to conditional quantiles: it selects quantile levels in the interval (0, 1) and fits a separate linear relationship with the explanatory variables at each one. Because it measures the marginal effect of an explanatory variable at a particular quantile of the dependent variable, quantile regression describes the relationship between the two variables more fully. Moreover, unlike OLS regression, quantile regression does not require the residuals to be normally distributed. Quantile regression uses local information to explore the entire conditional distribution of the dependent variable [46,47], thereby allowing us to observe the varying effects of corporate social responsibility on innovation input at different quantiles.

The basic model of quantile regression is as follows:

$$\gamma_p(Y \mid X) = X^\prime \beta(p), \tag{3}$$

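where *Y* and *X* represent the dependent variable and the vector of explanatory variables, respectively; the left-hand side is the *p*-th conditional quantile of *Y* given *X*; *p* ∈ (0, 1) is the quantile level; and *β*(*p*) is the vector of regression coefficients at that quantile. The coefficients at each quantile are estimated by minimizing the asymmetrically weighted sum of absolute deviations, with weight *p* on positive residuals and 1 − *p* on negative residuals [48].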
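The estimation principle behind Formula (3) can be illustrated with a small Python sketch for a single explanatory variable. The data are simulated with heteroscedastic noise, and the brute-force search is an assumption chosen for transparency rather than the estimator used in this study: with an intercept and one regressor, some minimizer of the weighted absolute deviations passes through two observations, so for a small sample every such line can be tried. Applied work would normally rely on a linear-programming routine instead.

```python
import random
from itertools import combinations


def check_loss(u, p):
    """Asymmetric absolute loss: weight p above the line, 1 - p below it."""
    return p * u if u >= 0 else (p - 1) * u


def fit_quantile_line(x, y, p):
    """Exact fit of y = a + b * x at quantile level p for a small sample.

    Tries every line through two observations and keeps the one with the
    smallest summed check loss.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(check_loss(yk - a - b * xk, p) for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


# Simulated data (hypothetical): the noise scale grows with x, so the
# slope of the conditional quantile differs across quantile levels.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(101)]
y = [1 + 0.5 * xi + (0.2 + 0.3 * xi) * rng.gauss(0, 1) for xi in x]

fits = {p: fit_quantile_line(x, y, p) for p in (0.1, 0.5, 0.9)}
for p, (a, b) in fits.items():
    print(f"p = {p}: intercept = {a:.3f}, slope = {b:.3f}")
```

In this simulated setting the fitted slope is expected to rise with *p*, which is the kind of quantile-specific effect that a single OLS slope cannot show.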
