*2.2. Model*

We used panel data regression to analyze EV uptake because it controls for time-invariant heterogeneity arising from unobservable variables that affect the EV market share in different countries. The multiple linear regression model for the panel data is as follows:

$$\begin{array}{l} \ln(y\_{it}) = \alpha + \beta\_1 (\text{Subsidies}\_{it}) + \beta\_2 (\text{Waiver}\_{it}) + \beta\_3 (\text{Mandate}\_{it}) + \beta\_4 (\text{Fuel Standard}\_{it}) \\ \qquad + \beta\_5 \ln(\text{Fast Charge Density}\_{it}) + \beta\_6 \ln(\text{Slow Charge Density}\_{it}) \\ \qquad + \beta\_7 \ln(\text{GDP per Capita}\_{it}) + \beta\_8 \ln(\text{Population}\_{it}) \\ \qquad + \beta\_9 \ln(\text{Purchasing Restriction}\_{it}) + u\_j + \mu\_t + \varepsilon\_{it} \end{array} \tag{1}$$

where subscripts *i* and *t* denote the *i*-th country and the *t*-th year, respectively. The dependent variable, ln(*y<sub>it</sub>*), is the natural-log transformation of *y* for country *i* in year *t*, where *y* is either the EV share or EV sales, estimated in two separate regressions. *u<sub>j</sub>* denotes the country fixed effects and *μ<sub>t</sub>* the time fixed effects. Using this model, we evaluated the effectiveness of policies and charging infrastructure while controlling for macroeconomic factors. To reduce heteroscedasticity, several variables, along with the EV shares, are expressed in natural logarithms [23].
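As an illustration of how country and year effects of the kind in Equation (1) can be removed, the following minimal Python sketch applies the two-way within transformation to a toy balanced panel with a single regressor. The data, the function names `two_way_demean` and `within_slope`, and the slope value 2.0 are hypothetical and chosen only for illustration; the toy data contain no error term, and this is not the estimation code used in the study.

```python
# Toy balanced panel: 3 countries x 4 years, one regressor (all values hypothetical).
countries = range(3)
years = range(4)
u = [5.0, -1.0, 2.0]          # country fixed effects
mu = [0.0, 0.5, 1.5, 3.0]     # year fixed effects
x = {(i, t): (i + 1) * (t + 1) ** 0.5 for i in countries for t in years}
y = {(i, t): 2.0 * x[i, t] + u[i] + mu[t] for i in countries for t in years}


def two_way_demean(v):
    """Subtract country and year means and add back the grand mean."""
    n, k = len(countries), len(years)
    c_mean = {i: sum(v[i, t] for t in years) / k for i in countries}
    t_mean = {t: sum(v[i, t] for i in countries) / n for t in years}
    grand = sum(v.values()) / (n * k)
    return {(i, t): v[i, t] - c_mean[i] - t_mean[t] + grand for (i, t) in v}


def within_slope(x, y):
    """OLS slope of demeaned y on demeaned x (balanced panel, one regressor)."""
    xd, yd = two_way_demean(x), two_way_demean(y)
    sxy = sum(xd[key] * yd[key] for key in xd)
    sxx = sum(val * val for val in xd.values())
    return sxy / sxx


beta_hat = within_slope(x, y)
print(round(beta_hat, 6))
```

Because the demeaning removes anything that is constant within a country or within a year, the estimated slope recovers the hypothetical coefficient regardless of the values chosen for `u` and `mu`.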

Moreover, a simplified version of this model is used in the basic results section for a pooled regression that examines the effectiveness of the different policies and other influential factors, as shown in Equation (2):

$$\begin{array}{l} \ln(y\_i) = \alpha + \beta\_1 (\text{Subsidies}\_i) + \beta\_2 (\text{Waiver}\_i) + \beta\_3 (\text{Mandate}\_i) + \beta\_4 (\text{Fuel Standard}\_i) \\ \qquad + \beta\_5 \ln(\text{Fast Charge Density}\_i) + \beta\_6 \ln(\text{Slow Charge Density}\_i) \\ \qquad + \beta\_7 \ln(\text{GDP per Capita}\_i) + \beta\_8 \ln(\text{Population}\_i) \\ \qquad + \beta\_9 \ln(\text{Purchasing Restriction}\_i) + \varepsilon\_i \end{array} \tag{2}$$

The pooled regression model has constant coefficients, for both the intercept and the slopes. For this model, we can pool all of the data and run an ordinary least squares (OLS) regression without considering differences across countries and years.
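To make the pooling assumption concrete, the sketch below stacks every country-year observation of a toy panel and fits a single OLS line, ignoring which country or year each row came from. All numbers and the helper name `pooled_ols` are hypothetical. The toy country effects are deliberately built to be correlated with the regressor, so the pooled slope absorbs them; this is the kind of situation that country fixed effects are meant to handle.

```python
# Toy panel (hypothetical values): y = 2 * x + 10 * country index, no noise.
rows = []
for i in range(3):            # country index
    for t in range(3):        # year index
        x = i + 0.5 * t
        y = 2.0 * x + 10.0 * i
        rows.append((i, t, x, y))


def pooled_ols(rows):
    """Fit y = a + b * x on all rows at once, ignoring country and year."""
    n = len(rows)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n
    sxy = sum((r[2] - x_bar) * (r[3] - y_bar) for r in rows)
    sxx = sum((r[2] - x_bar) ** 2 for r in rows)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = pooled_ols(rows)
print(intercept, slope)
```

In this toy case the pooled slope comes out at 10 although the slope within each country is 2, which shows how ignoring cross-country differences can distort the estimate when those differences are correlated with a regressor.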

Two types of models are usually considered for panel data regressions: the fixed effects model and the random effects model, which differ in how they deal with endogeneity. Before choosing a regression method, we first had to determine whether our predictor variables are endogenous. We used the Hausman specification test to detect endogenous regressors in the regression model [24,25]. Details of the tests are provided in the results section.
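For reference, the textbook form of the Hausman statistic compares the fixed effects (FE) and random effects (RE) estimates of the slope coefficients. The hatted symbols below are notation introduced here for illustration and do not appear in the equations above.

$$H = (\hat{\beta}\_{FE} - \hat{\beta}\_{RE})^{\top} \left[ \widehat{\mathrm{Var}}(\hat{\beta}\_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}\_{RE}) \right]^{-1} (\hat{\beta}\_{FE} - \hat{\beta}\_{RE})$$

Under the null hypothesis that the country effects are uncorrelated with the regressors, *H* is asymptotically chi-squared distributed with degrees of freedom equal to the number of coefficients compared. A large value of *H* rejects the null and favors the fixed effects model.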

In addition, we applied two statistical tests to our data and model in Section 3: the White test [26], which verifies that the data conform to the OLS homoscedasticity assumption, and the variance inflation factor (VIF) test [27], which checks that the pooled model has no multicollinearity problem. The tests showed that our data meet the assumptions of OLS regression.
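The variance inflation factor of a predictor is 1 / (1 − *R*²), where *R*² comes from regressing that predictor on all the others. The following minimal sketch computes it in plain Python; the helper names (`solve`, `r_squared`, `vif`) and the example columns are hypothetical and are not the data or code used in the study.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(target, others):
    """R^2 from an OLS regression of `target` on an intercept plus `others`."""
    n = len(target)
    design = [[1.0] + [col[i] for col in others] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * target[i] for i, row in enumerate(design)) for p in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    mean = sum(target) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(target, fitted))
    ss_tot = sum((yi - mean) ** 2 for yi in target)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor 1 / (1 - R_j^2) for each predictor column."""
    result = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1.0 / (1.0 - r_squared(col, others)))
    return result


# Hypothetical predictors: one uncorrelated pair and one nearly collinear pair.
uncorrelated = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
collinear = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]]
print(vif(uncorrelated))
print(vif(collinear))
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others, whereas values above roughly 10 are a commonly used rule of thumb for problematic multicollinearity; that threshold is a general convention and not one stated in this paper.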

For further analysis, we implemented the vector autoregression (VAR) model as an alternative method to take time lags into account. Ordinarily, regressions reflect "mere" correlations, but [28] argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. The VAR model proposed by [29] captures the linear interdependencies among multiple time series, using the lagged values of all endogenous variables to estimate their reverse impacts [30]. Moreover, it allows us to consider both long-run and short-run restrictions justified by economic considerations [31]. Consequently, the VAR model can be used to capture the dynamic impacts of the influencing factors on EV sales. The general VAR model is given as follows:

$$y\_t = c + A\_1 y\_{t-1} + A\_2 y\_{t-2} + \dots + A\_p y\_{t-p} + e\_t \tag{3}$$

where the observation *y<sub>t−i</sub>* (*i* periods back) is called the *i*-th lag of *y*, *c* is a vector of constants (intercepts), *A<sub>i</sub>* is a time-invariant matrix and *e<sub>t</sub>* is a vector of error terms. In this paper, we used a one-period lag for the VAR, that is, only the first lag (*i* = 1, so *p* = 1 in Equation (3)), and we explored the relationships among three variables: subsidies, infrastructure and EV sales. We report the results of the Granger causality test, a statistical hypothesis test for determining whether one time series is useful in forecasting another; it was first proposed in 1969 and is widely used for interpreting the results of VAR models [28].
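Written out for three variables with a single lag, Equation (3) takes the form below. The symbols *S*, *I* and *E* (standing for subsidies, infrastructure and EV sales) and the coefficient labels *a<sub>jk</sub>* are notation introduced here for illustration only.

$$\begin{pmatrix} S\_t \\ I\_t \\ E\_t \end{pmatrix} = \begin{pmatrix} c\_1 \\ c\_2 \\ c\_3 \end{pmatrix} + \begin{pmatrix} a\_{11} & a\_{12} & a\_{13} \\ a\_{21} & a\_{22} & a\_{23} \\ a\_{31} & a\_{32} & a\_{33} \end{pmatrix} \begin{pmatrix} S\_{t-1} \\ I\_{t-1} \\ E\_{t-1} \end{pmatrix} + \begin{pmatrix} e\_{1t} \\ e\_{2t} \\ e\_{3t} \end{pmatrix}$$

In this notation, testing whether subsidies Granger-cause EV sales amounts to testing the null hypothesis *a*<sub>31</sub> = 0 in the third equation. Rejecting it indicates that lagged subsidies carry predictive information for EV sales beyond that contained in the other lagged variables.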
