**Proof.** See Stentoft (2004b).

The result in Lemma 1 can now be combined with Proposition 1 of Stentoft (2004b) to demonstrate that when all the simulated paths are started at the current values of the state variable, i.e., *X*(0) = *x*, the estimate in (8) converges to the true price, which establishes the convergence of the LSM method in a general multi-period setting.<sup>5</sup> Moreover, this type of algorithm has attractive properties, and the work in Stentoft (2014) documented that it is the most efficient method when compared to, e.g., the value function iteration method of Carriere (1996) or Tsitsiklis and Van Roy (2001). When paths are started at initially dispersed values, the LSM method still allows one to estimate the optimal early exercise boundary, and Lemma 1 continues to hold. However, pricing the option with initially dispersed paths is more complicated, and an initial regression is needed at time *t* = 0; see, e.g., Létourneau and Stentoft (2019) for a proof that this method converges.

<sup>5</sup> One of the important assumptions in Stentoft (2004b) is that the support is bounded. In Glasserman and Yu (2002), convergence was studied in the unbounded case. This complicates the analysis, and therefore, they limited their attention to the normal and lognormal cases. See also Gerhold (2011) for generalizations of the results to other processes and Belomestny (2011) for a proof using nonparametric local polynomial regressions.

### *2.2. Regression and Optimal Early Exercise*

It is explicit in Equation (3) that the optimal early exercise region is determined by the comparison of $Z(t_j)$ and $F(X(t_j))$ or, in the case when the conditional expectations are approximated, by the comparison of $Z(t_j)$ and $\hat{F}_M^N(X(t_j))$. In particular, we can define the true early exercise region implicitly as the value $B(t_j)$ that solves:

$$Z\left(t_j\right) > F\left(B\left(t_j\right)\right). \tag{9}$$

Similarly, the estimated early exercise region is defined as the value $\hat{B}_M^N(t_j)$ that solves:

$$Z\left(t_j\right) > \hat{F}_M^N\left(\hat{B}_M^N\left(t_j\right)\right). \tag{10}$$

In the case where there is only one state variable, the exercise region at an exercise date is determined by a single point of intersection between the payoff function and the conditional expectation. The work in Rasmussen (2005) proposed a Newton–Raphson procedure to find the exercise boundary. In our application, we find this exercise boundary by subtracting the exercise value from the approximated conditional expectation and determining the roots of the resulting polynomial. Note that with an approximated conditional expectation, multiple roots may exist. If this happens, the largest root below the strike is kept as the exercise boundary.
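The root-finding step described above can be sketched as follows for the single-state-variable put case. This is a minimal illustration, not the paper's implementation: it assumes the conditional expectation has been approximated by a polynomial in the stock price (the example coefficients below are hypothetical), subtracts the put's exercise value *K* − *S*, and keeps the largest real root below the strike.

```python
import numpy as np

def exercise_boundary(coeffs, strike):
    """Back out the early exercise boundary of a put from a polynomial
    approximation of the continuation value.

    coeffs: polynomial coefficients (lowest order first) of the fitted
    conditional expectation F_hat(S).  Subtracting the exercise value
    (K - S) from F_hat(S) gives another polynomial; the largest real
    root below the strike is kept as the boundary."""
    f_hat = np.polynomial.Polynomial(coeffs)
    exercise_value = np.polynomial.Polynomial([strike, -1.0])  # K - S
    roots = (f_hat - exercise_value).roots()
    real = roots[np.isreal(roots)].real
    below = real[real < strike]
    return below.max() if below.size else np.nan

# Hypothetical fitted continuation value: F_hat(S) = 5 - 0.2 S + 0.002 S^2
b = exercise_boundary([5.0, -0.2, 0.002], strike=40.0)
```

At the returned boundary, the fitted continuation value equals the exercise value by construction, so paths below it are exercised.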

While it is simple to find and represent the optimal early exercise boundary in the plain vanilla single option case, for more complex options, this might not be the case. For example, when there are two state variables, the conditional expectations and payoff functions are surfaces in a three-dimensional space. Hence, the intersection is a curve in a three-dimensional space, and it becomes much more difficult to represent.<sup>6</sup> Moreover, while finding polynomial roots is easy in one dimension, this approach does not generalize easily to more complex situations. On the other hand, comparing the payoff function to the approximated conditional expectation is straightforward in multiple dimensions. Since the conditional expectations are completely described by the estimated coefficients $\hat{c}_m^N(t_j)$ from the cross-sectional regression, in practical implementations, it is therefore much easier to store these than it is to attempt to characterize and store a representation of the exercise region.
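The coefficient-based exercise decision can be sketched as follows in two dimensions: rather than representing the intersection curve, one evaluates the fitted conditional expectation at the current state and compares it to the immediate payoff. The basis functions, coefficients, and payoff below are hypothetical placeholders, not those used in the paper.

```python
import numpy as np

def basis(s1, s2):
    """Hypothetical polynomial basis in two state variables."""
    return np.array([1.0, s1, s2, s1 * s2, s1**2, s2**2])

def should_exercise(s1, s2, coeffs, payoff):
    """Exercise decision from stored regression coefficients: compare the
    immediate payoff to the fitted conditional expectation at (s1, s2)."""
    continuation = basis(s1, s2) @ coeffs
    return payoff(s1, s2) > continuation
```

Only the coefficient vector per exercise date needs to be stored; the same comparison works unchanged for any number of state variables once the basis is defined.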

It follows from Lemma 1 above that as *M* and *N* tend to infinity, the estimated early exercise boundary will converge to the true early exercise boundary. For finite choices, however, $\hat{B}_M^N(t_j)$ is only an estimate of the true early exercise boundary, and the "quality" of this estimate will depend on the choice of *M* and *N* in a particular application. The fact that the early exercise boundary is estimated leads to two types of biases. First, when using a given estimated frontier, the option may be exercised at times when it should not have been or not exercised at times when it should have been. In both cases, this suboptimal early exercise will result in a low biased price estimate.<sup>7</sup> Second, when using the same set of simulated paths both for the approximation of the exercise boundary and for pricing the option, what we call in-sample pricing, the method suffers from a high bias from over-fitting the continuation value to the current sample of simulated paths. As explained first by Broadie and Glasserman (1997), the practice of using the same paths for the exercise decision and the payoff calculation introduces a positive correlation between the exercise decision and the future payoffs, essentially resulting in a foresight bias. The standard LSM estimator has both the low and the high bias, and therefore, the overall bias is difficult to sign.

<sup>6</sup> For exotic options like stair-step options, there can be multiple intersections and multiple exercise regions.

<sup>7</sup> Exercising when you should and not exercising when you should not, on the other hand, have no effect.

### *2.3. Bootstrapping the Early Exercise Boundary*

Although simulation and regression methods like the LSM method outlined above sequentially determine a number of conditional expectations, as can be seen from Equation (6), these are used only to determine whether the option at a given time should be exercised along a given path. As such, the conditional expectations are used as a convenient way to summarize, by subtracting the exercise value, the early exercise region and hence the path-wise optimal stopping time, which uniquely determines the option value. In Figure 1a, we plot in light gray *I* = 100 such estimated early exercise boundaries for a put option. As the figure shows, these are close to the true optimal early exercise boundary, shown with the dashed black line, although quite noisy, particularly at the first steps in the simulation. However, Figure 1a also shows that taking averages of the *I* = 100 conditional expectation approximations at any time and using these to back out the optimal early exercise boundary leads to a much smoother frontier, shown with the dotted blue line, and, presumably, a better behaved price estimate.

**Figure 1.** Individual, average, and bootstrapped early exercise boundaries. This figure shows *I* = 100 individually estimated early exercise boundaries using *N* = 100,000 paths of simulated data with *M* = 9 regressors and a constant term in the cross-sectional regressions. The right hand plot uses the proposed Initial State Dispersion (ISD) method from Rasmussen (2005). The option has a strike price of *K* = 40, a maturity of *T* = 1 year, and *J* = 50 early exercise points per year. The initial stock price is fixed at *S*(0) = 40; the volatility is *σ* = 20%; and the interest rate is *r* = 6%. The dotted blue line shows the early exercise boundary estimated from the average of the *I* = 100 continuation value approximations at each time and the red line the frontier backed out from our bootstrapped recursively averaged continuation values. The dashed black line shows the true early exercise boundary estimated with a binomial model with 50,000 steps.

Our proposed bootstrapping technique averages at each step of the backward induction instead of only after having gone through all the steps. For example, at time *t* = *T* − 1, we estimate *I* = 100 independent conditional expectation approximations, i.e., polynomials of order *M* = 9 and a constant term in the cross-sectional regressions. Before proceeding backwards, we take the average of these approximations and use this same average across the *I* = 100 independent simulations to determine whether the option along a given path in a given simulation at time *t* = *T* − 1 should be exercised. We then simply proceed, in a similar manner, backwards in time, always averaging at each early exercise point, to time *t* = 0. In Figure 1a, we show in red the early exercise boundary backed out from these average conditional expectations at each early exercise time. This estimated optimal early exercise boundary is for the most part extremely close to the true optimal early exercise boundary, and estimated option prices obtained with this estimate are therefore expected to be very close to the true option price.
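The recursive averaging described above can be sketched as follows. This is a simplified, scaled-down illustration under assumed geometric Brownian motion dynamics for a plain vanilla put (smaller *N*, *I*, *M*, and *J* than in the paper, regression on in-the-money paths as in standard LSM, and in-sample pricing); it is not the authors' exact implementation.

```python
import numpy as np

def bootstrap_lsm_put(S0=40.0, K=40.0, r=0.06, sigma=0.2, T=1.0,
                      J=10, N=5000, I=5, M=3, seed=0):
    """Sketch of the recursive-averaging (bootstrapped) LSM method.

    At each exercise date, a cross-sectional regression is run in each of
    the I independent simulations, the I coefficient vectors are averaged,
    and the *averaged* fit drives the exercise decision in every simulation."""
    rng = np.random.default_rng(seed)
    dt = T / J
    disc = np.exp(-r * dt)
    # Simulate I independent sets of N GBM paths, shape (I, N, J + 1).
    z = rng.standard_normal((I, N, J))
    S = np.empty((I, N, J + 1))
    S[..., 0] = S0
    S[..., 1:] = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                                       + sigma * np.sqrt(dt) * z, axis=-1))
    payoff = lambda s: np.maximum(K - s, 0.0)
    V = payoff(S[..., -1])                 # path-wise value, shape (I, N)
    for j in range(J - 1, 0, -1):
        V = V * disc                       # discount back one step
        Sj = S[..., j]
        X = np.stack([Sj**m for m in range(M + 1)], axis=-1)  # monomial basis
        # One regression per simulation, on in-the-money paths only.
        coefs = []
        for i in range(I):
            itm = payoff(Sj[i]) > 0
            c, *_ = np.linalg.lstsq(X[i][itm], V[i][itm], rcond=None)
            coefs.append(c)
        c_bar = np.mean(coefs, axis=0)     # average before stepping back
        cont = X @ c_bar                   # averaged continuation value
        ex = (payoff(Sj) > cont) & (payoff(Sj) > 0)
        V = np.where(ex, payoff(Sj), V)
    return float(np.mean(V) * disc)
```

With the parameters above, the estimate should land near the binomial benchmark of 2.31 reported in the text, up to Monte Carlo error and the coarser exercise grid assumed here.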

In Table 1, we report the corresponding price estimates using both In-Sample (IS) pricing, i.e., when the same paths are used to determine the optimal early exercise boundary and to price the option, and Out-of-Sample (OS) pricing, where new simulated stock prices are used for pricing. A benefit of using OS pricing in the LSM method is that the price estimate is guaranteed to be biased low in expectation because of the sub-optimality of the estimated early exercise boundary. The first line in the table reports the relevant benchmark values for both the IS and OS methods, generated by applying the true early exercise boundary estimated from the binomial model with 50,000 steps to the relevant simulated paths. In this case, IS and OS simply represent two different sets of paths. The table shows that, as expected, the regular LSM method provides a high biased estimate when using IS and a low biased estimate when using OS, and these differences are in fact significant at standard levels.<sup>8</sup> The difference between the two estimates is about half a cent, though slightly less when factoring in the Monte Carlo error, as evidenced by the difference of 0.0013 between the benchmark prices using the IS and OS samples. For the methods based on averages, the table shows that the estimated prices, both IS and OS, are much closer to the benchmark values and insignificantly different. This is particularly so for the recursive averages based on our bootstrapping method, which are spot on for the IS method and off by only a hundredth of a cent for the OS method. This indicates that for this setup, we have essentially obtained the true optimal early exercise strategy with our proposed bootstrapping method.
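Out-of-sample pricing can be sketched as follows for a put: a previously estimated exercise boundary is applied to a fresh set of paths, and each path is stopped the first time the stock price falls to or at the boundary. The array layout (one critical stock price per exercise date) is an illustrative assumption.

```python
import numpy as np

def price_out_of_sample(S, boundary, K, r, T):
    """Out-of-sample put pricing: apply a previously estimated early
    exercise boundary to fresh simulated paths.

    S: paths of shape (N, J + 1), column 0 being the initial price.
    boundary: length-J array of critical stock prices for dates t_1..t_J."""
    N, J1 = S.shape
    J = J1 - 1
    dt = T / J
    exercised = S[:, 1:] <= boundary           # put: exercise at/below boundary
    first = np.argmax(exercised, axis=1)       # index of first True per path
    ever = exercised.any(axis=1)
    tau = np.where(ever, first + 1, J)         # stopping step in 1..J
    payoff = np.maximum(K - S[np.arange(N), tau], 0.0)
    return float(np.mean(np.exp(-r * tau * dt) * payoff))
```

Because the boundary is fixed before these paths are drawn, no foresight bias arises, and sub-optimality of the boundary can only bias the estimate low.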

**Table 1.** Option prices using In-Sample (IS) and Out-of-Sample (OS) methods. LSM, Least-Squares Monte Carlo.


This table shows the estimated prices and standard deviations using *I* = 100 individually estimated early exercise boundaries using *N* = 100,000 paths of simulated data with *M* = 9 regressors and a constant term in the cross-sectional regressions. The option has a strike price of *K* = 40, a maturity of *T* = 1 year, and *J* = 50 early exercise points per year. The initial stock price is fixed at *S*(0) = 40; the volatility is *σ* = 20%; and the interest rate is *r* = 6%. The binomial model estimate obtained with 50,000 steps is 2.3141. The benchmark boundary denotes the results from a method in which the true early exercise boundary estimated from the binomial model is used in the Monte Carlo simulation. Individual LSM denotes the regular method where the average of the *I* = 100 individual simulations is used. Regular average denotes the method where the average of the individually estimated optimal early exercise strategies is used, and recursive average denotes the method where the bootstrapped estimated optimal early exercise strategy is used. By comparing these results to the values from the benchmark boundary, the error coming from the Monte Carlo simulation is eliminated.

The very jagged paths early on in the simulations in Figure 1a are due in part to there being very few paths that are deep in the money, which makes it difficult to estimate the value for which exercise is optimal. This can be corrected, or improved upon, by using Initial State Dispersion (ISD) as suggested by, e.g., Rasmussen (2005). The resulting estimated early exercise boundaries are shown in Figure 1b. Compared to Figure 1a, we see that using ISD helps quite a bit for the individual early exercise boundaries, which are noticeably less volatile up to roughly the 25th early exercise point. However, while using ISD may help when estimating the individual optimal early exercise boundaries, comparing the two figures clearly shows that bootstrapping helps much more.
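A minimal sketch of path simulation with ISD, assuming geometric Brownian motion dynamics: instead of starting every path at *S*(0), the starting values are dispersed around it so that early exercise dates also see deep in-the-money paths. The uniform dispersion law and the width of the interval below are illustrative assumptions; Rasmussen (2005) discusses specific choices.

```python
import numpy as np

def simulate_isd_paths(S0=40.0, r=0.06, sigma=0.2, T=1.0, J=50,
                       N=100_000, spread=0.5, seed=0):
    """Simulate GBM paths with Initial State Dispersion: starting values
    are drawn uniformly on [S0*(1-spread), S0*(1+spread)] instead of being
    fixed at S0.  (Uniform law and spread are illustrative choices.)"""
    rng = np.random.default_rng(seed)
    dt = T / J
    start = rng.uniform(S0 * (1 - spread), S0 * (1 + spread), size=N)
    z = rng.standard_normal((N, J))
    growth = np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    S = np.empty((N, J + 1))
    S[:, 0] = start
    S[:, 1:] = start[:, None] * np.cumprod(growth, axis=1)
    return S
```

As noted in the text, pricing with dispersed paths then requires an extra regression at *t* = 0 (Létourneau and Stentoft 2019); only the simulation step is sketched here.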

<sup>8</sup> The one-sided *p*-value for a test of zero bias when compared to the estimate obtained with the benchmark boundary is 4.9% and 5.5% for the IS and OS prices, respectively.

The method we propose is simple to implement and yields very good results with finite choices of the number of regressors used in the simulation, the number of simulated paths, and the number of repeated independent simulations. A natural next step is to ask about the asymptotic properties of this algorithm. It is straightforward to show that our method provides asymptotically unbiased price estimates, which we state in the following corollary.

**Corollary 1.** *The bootstrapping method provides an asymptotically unbiased estimate of the option price for any choice of I, i.e., the number of repeated independent simulations used, under the assumptions outlined in Lemma 1.*

**Proof.** This follows by applying Lemma 1 to each of the independent simulations and noting that since the individual $\hat{F}_M^N(X(t_j))$ converge to $F(X(t_j))$ in probability for $j = 1, \ldots, J$, so does the average of *I* such simulations.
