Abstract
In this paper, we focus on two-factor lattices for general diffusion processes with state-dependent volatilities. Although it is common knowledge that branching probabilities must be between zero and one in a lattice, few methods can guarantee lattice feasibility, referring to the property that all branching probabilities at all nodes in all stages of a lattice are legitimate. Some practitioners have argued that negative probabilities are not necessarily ‘bad’ and may be further exploited. A theoretical framework of lattice feasibility is developed in this paper, which is used to investigate how negative probabilities may impact option pricing in a lattice approach. It is shown in this paper that lattice feasibility can be achieved by adjusting a lattice’s configuration (e.g., grid sizes and jump patterns). Using this framework as a benchmark, we find that the values of out-of-the-money options are most affected by negative probabilities, followed by in-the-money options and at-the-money options. Since legitimate branching probabilities may not be unique, we use an optimization approach to find branching probabilities that are not only legitimate but also can best fit the probability distribution of the underlying variables. Extensive numerical tests show that this optimized lattice model is robust for financial option valuations.
1. Introduction
The lattice (or tree) approach is popular for valuing derivative securities, as it is normally simple to implement and has intuitive appeal. The lattice approach involves a discrete approximation to the diffusion processes followed by the underlying variables. It is especially useful for valuing American options, where early exercise is possible. Since its introduction by Cox et al. (1979), the lattice approach has undergone several extensions in the past few decades to accommodate increasingly complex derivative valuations. Significant models include, to name a few, Rendleman and Bartter (1979), Boyle (1986, 1988), Hull and White (H&W) (1988, 1990, 1993, 1994), Chung and Shih (2007), Beliaeva and Nawalkha (2010), and Akyildirim et al. (2014).
In a lattice, each link (branch) connecting two lattice nodes at two consecutive time periods is associated with a branching probability. A legitimate branching probability must be between zero and one. Researchers know and follow this basic rule in developing lattice-based methods. However, to the best of our knowledge, few methods can guarantee lattice feasibility, referring to the property that all branching probabilities at all nodes in all stages of a lattice are legitimate. With lattice feasibility, a lattice constructs a discrete time financial market that is arbitrage free. It is well known that lattice feasibility is easier to achieve when there is only one underlying variable, while two-factor lattice feasibility is harder to meet, especially when the correlation between two underlying uncertainties is high.
The term lattice feasibility was coined by Tseng and Lin (2007). The authors employed the trinomial lattice proposed by H&W (Hull and White 1990) to value real options involving two underlying correlated uncertainties, each with a constant volatility. They found that each lattice configuration implies a maximum correlation of the two underlying variables that the lattice can approximate without incurring negative probabilities, and this maximum correlation may be enhanced by varying the size of its lattice grids. After optimizing the lattice configuration, Tseng and Lin (2007) also showed that the trinomial lattice proposed by H&W (Hull and White 1990) cannot accommodate a correlation beyond without incurring negative probabilities. The authors further showed that the popular two-factor interest rate tree proposed by H&W (Hull and White 1994) for valuing interest rate derivatives can only guarantee lattice feasibility when the correlation is no greater than 0.2. This means that negative probabilities may occur far more often than we know in using lattices to value derivatives in real practice.
Since it is not unusual to encounter negative probabilities when using lattices for option pricing, some practitioners have argued that negative probabilities may not necessarily be ‘bad’ and may be further exploited (e.g., Burgin and Meissner 2012; Haug 2007). Consider a price node in a trinomial tree, where the sum of three branching probabilities must be equal to one. Given an abnormality where one branching probability becomes negative or exceeds unity, the other two probabilities must be adjusted accordingly to offset the abnormality. However, the expected payoff at this price node may reveal no sign of abnormality. From this perspective, it is not surprising that some practitioners have reported that some finite difference/finite element models can still produce stable and consistent outputs even with negative probabilities (Zvan et al. 2001). How much does it really matter if one allows negative probabilities in a lattice for option pricing?
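The trinomial example above can be made concrete with a minimal sketch. It assumes a symmetric grid with moves of +dx, 0, and -dx and solves for the three probabilities that match a given one-step mean and variance. The names `trinomial_probs`, `moments`, and all parameter names are illustrative assumptions, not notation from this paper. When the grid spacing is too small relative to the one-step standard deviation, the middle probability turns negative, yet the three probabilities still sum to one and the first two moments are still matched.

```python
def trinomial_probs(mean, var, dx):
    """Probabilities (p_u, p_m, p_d) on the moves (+dx, 0, -dx) that match
    the one-step mean and variance. Illustrative helper, not from the paper."""
    second_moment = var + mean ** 2          # E[X^2] = Var + mean^2
    p_u = 0.5 * (second_moment / dx ** 2 + mean / dx)
    p_d = 0.5 * (second_moment / dx ** 2 - mean / dx)
    p_m = 1.0 - p_u - p_d
    return p_u, p_m, p_d


def moments(probs, dx):
    """Mean and variance implied by the three branches."""
    p_u, p_m, p_d = probs
    mean = dx * (p_u - p_d)
    var = dx ** 2 * (p_u + p_d) - mean ** 2
    return mean, var


# Grid wide enough for the variance: all three probabilities are legitimate.
ok = trinomial_probs(mean=0.0, var=0.04, dx=0.3)
# Grid too narrow for the same variance: the middle probability is negative,
# while the sum and the first two moments show no sign of abnormality.
bad = trinomial_probs(mean=0.0, var=0.04, dx=0.15)
```

In this sketch, `moments(bad, 0.15)` still returns the targeted mean and variance, which is why a node-level check of the first two moments alone does not reveal the abnormality.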
To investigate the impact of negative probabilities in option valuations, we focus on using a two-factor lattice to represent general diffusion processes such as the Heston stochastic volatility (SV) model (Heston 1993). In the Heston model, the dynamics of the volatility process is assumed to follow the CIR process (Cox et al. 1985) used to describe the interest rate dynamics. The analytical tractability of the CIR process leads to explicit solutions for some bond pricing problems (e.g., Kouritzin 2000; Maghsoodi 1996; among others). When the CIR process is incorporated as the second dimension of the Heston model, the resultant two-factor lattice is more general and is far from trivial. To observe the impact that comes from negative probabilities, we need to develop a lattice model that can guarantee lattice feasibility and can be used as a benchmark. Under the same lattice framework, with every parameter fixed but branching probabilities, we can then observe how negative probabilities influence valuations.
An important alternative to the lattice method for option pricing under the Heston model is the Monte Carlo (MC) method. Since the MC method generates stochastic paths, not lattices, to model the evolution of the underlying uncertainties, there is no issue of lattice feasibility or negative probability. However, generating stochastic paths under the Heston model is not straightforward, and this issue has been discussed extensively. Examples along this line include Broadie and Kaya (2006) and Andersen (2007). When the MC method is applied to the pricing of American options, the Least-Squares Monte Carlo (LSMC) method proposed by Longstaff and Schwartz (2001) is a standard approach. Although much progress has been made in using the MC method to price American options, one also needs to resort to its variations, such as different techniques for resampling or branching (e.g., see a recent paper by Kouritzin and Mackay 2020).
Researchers have proposed lattices for stochastic volatility models. Recent papers include Akyildirim et al. (2014), Beliaeva and Nawalkha (2010), Costabile et al. (2012), and Ruckdeschel et al. (2013). All of these lattice approaches match two conditional marginal moments for each underlying variable at all nodes, and the correlation is dealt with either by matching the cross moment of the variables or by using a variable transformation to decorrelate them. Special attention must be paid to avoiding negative branching probabilities, which are more likely to occur when the correlation is high. One popular approach is to truncate branching probabilities that are negative or exceed unity so that they lie between zero and one. While truncated branching probabilities may not exactly match the moments, Akyildirim et al. (2014) show that in their approach the matching error may be negligible and prove the convergence of their approach to the underlying processes. In this paper, we take the standard approach by matching the two marginal moments and the cross moment of the two underlying variables. We show that branching probabilities can be guaranteed to be between zero and one by adjusting the configuration of the lattice for a given (fixed) time step, and no probability truncation is needed.
To manage stochastic volatilities, we extend the lattice parameters from the grid size to include a jump size. With this change, the lattice configuration can be optimized to guarantee lattice feasibility even if both state variables are highly correlated with the correlation close to one. As will be shown later, this newly introduced parameter, the jump size, has the effect of refining the grid size. As opposed to traditional lattice approaches which perform lattice refinement on time space, our method can also perform refinement on the state space of the underlying variables even when the time step is not especially small. The consequence is better fitting of the underlying processes and faster convergence.
Numerical tests show that lattice feasibility has a direct impact on option pricing. We find that the values of out-of-the-money (OTM) options are most affected by negative probabilities, followed by in-the-money (ITM) options and at-the-money (ATM) options. Although negative probabilities matter less in some situations, the resulting distortion of the underlying probability distribution is in general hard to predict and exploit. Despite the importance of lattice feasibility, our numerical tests also show that lattice feasibility alone may not be sufficient to guarantee accurate valuation, especially when the time step of the lattice is not especially small. Since legitimate branching probabilities may not be unique, we propose an optimization approach to find branching probabilities that are not only legitimate but also best fit the probability distribution of the underlying variables.
The rest of this paper is structured as follows. To lay the foundation of lattice feasibility for approximating two general, correlated diffusion processes, a general one-factor lattice model is first considered in Section 2, with the CIR model used as an example to show in detail how the lattice can be constructed. Section 3 considers the lattice for two general and correlated diffusion processes, including the Heston SV model (Heston 1993), and derives the lattice feasibility conditions. We analyze the impact of lattice feasibility on option valuation in Section 4 and conduct extensive numerical tests, including the pricing of European and American options, in Section 5. This paper concludes in Section 6. All proofs of propositions and theorems are given in Appendix A.
2. General One-Factor Trinomial Lattice
We consider the following general Ito process:
where is a Wiener process and the volatility is a state-dependent function. In this paper, we assume that is lower bounded by a constant , which is a small number close to zero. That is,
For any process whose volatility function does not satisfy (2), denoted as
where (e.g., in the CIR model), one can work with the following counterpart:
whose volatility function satisfies (2). In (3b), the volatility is assumed to be bounded below by for the convenience of subsequent treatments. For a sufficiently small , the effect brought about by this lower bound is insignificant and the difference between the two models is negligible, as long as the drift is nonzero when the volatility is close to zero with the process on the brink of being absorbed. How to determine the lower bound in the setting of (3b) will be addressed later.
We will propose a trinomial lattice to approximate (1) based on H&W (Hull and White 1993). In their model, the time horizon is divided into intervals of equal length , and the process can only take on values that are multiples of . At time t, a typical lattice node branches to nodes , , and at the next stage with respective branching probabilities , , and , where the (text) subscripts represent upward, middle, and downward branches, respectively. As will be shown later, the jump size h depends on y and, therefore, the lattice may not recombine at all nodes.
The lattice size is predetermined using the same method as when volatility is constant, where c and are constant. Since is a function, is a surrogate constant volatility, which is set to be a small number no greater than :
The branching factor k is chosen such that (middle branch) approximates the expected price deviation . That is, k is the nearest integer of :
where is the floor function that maps to the nearest integer less than or equal to the operand. Let the mismatch between and in (5) be denoted by , defined as follows:
It follows from (5) that
The interpretation of (7) is that the middle branch may miss the expected price deviation by no more than . Note that both and k are functions of y and t, and may vary from node to node.
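The nearest-integer rule and the resulting bound on the mismatch can be sketched as follows. This is a minimal illustration under assumed names: `expected_move` stands for the expected one-step deviation, `dy` for the grid size, and the sign convention of `mismatch` is a choice made here and may differ from the one used in (6).

```python
import math


def branching_factor(expected_move, dy):
    """Nearest-integer number of grid steps to the expected move, written
    with the floor function as floor(x + 1/2). Hypothetical helper."""
    ratio = expected_move / dy
    k = math.floor(ratio + 0.5)
    mismatch = ratio - k            # lies in [-1/2, 1/2) by construction
    return k, mismatch


# The middle branch never misses the expected move by more than half a grid step.
samples = [branching_factor(0.01 * i, 0.1) for i in range(-50, 51)]
worst = max(abs(mismatch) for _, mismatch in samples)
```

Because `k` is recomputed from the local drift, it can differ from node to node, which is what allows the middle branch to track a state-dependent expected move.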
The jump size h is determined such that approximates . Likewise, let h be the nearest integer of :
Given , one can either choose a to determine the integer , as in (10); or vice versa. In our design, one determines first, followed by using the following formula:
As increases, decreases, thereby indicating the presence of more refined lattice grids.
There may exist a mismatch between and in (9). This time, we measure their difference by their ratio and see how far it is away from 1. The ratio is defined as follows:
It is clear when h is high that is close to one. Therefore, the range of is determined by the lower bound of , which is . It can be shown that the following relations hold:
and
Given any arbitrary lattice node y, use (5) and (9) to determine the value of k and h, respectively. One can then solve the branching probabilities , , and at y, such that these three branches match the mean and variance of the price deviation. They can be expressed in terms of c, , , and :
Proposition 1.
Proposition 1 states how the values of the two key parameters c and should coordinate to achieve lattice feasibility. This proposition states that, if a lattice is configured such that its values for c and meet (16), the lattice will not have negative probabilities in any branches using (15a)–(15c). When , the only feasible c is , which is an interesting number for c. When , each typical trinomial branch approximates the underlying normal distribution well (matching up to the 5th moment when ). However, when , c becomes flexible and its feasible value spreads over a bigger range.
2.1. Effects of Grid Refinement
As pointed out in the previous section, introducing has an effect of grid refinement. Basically, if has a smaller , a higher will be required, which leads to smaller and . Since , a smaller means that the mean of the price change is better approximated. Traditionally, a discrete lattice approximates the underlying continuous processes better and better as the time step decreases, which may be viewed as lattice refinement on the time space. By increasing the value of , we introduce a refinement on the state space of the underlying variable, even when is not especially small. Therefore, grid refinement improves convergence.
2.2. Weak Convergence of the One-Factor Lattice
In the proposed trinomial lattice, the jump size varies from node to node due to the state-dependent volatility. As a result, the branches may not recombine in an easily predictable way as in the constant volatility case. Therefore, it is not immediately clear that such a lattice would converge to the underlying process. Next, we show that our proposed lattice indeed converges weakly to the underlying diffusion process in (1).
2.3. Estimating for CIR Model and Feller Condition
In this section, we use the Cox et al. (1985) model as an example to illustrate the applicability of the proposed framework. Consider
where , m, and are positive constants. The CIR model (18) does not meet (1) and (2), but (3a). This is because the lower bound of the square root volatility function is zero. To meet (3b), assume that is infinitesimal. In this section, we show how to find an approximate lower bound for the volatility function. Equivalently, we would find a such that , i.e., .
To do this, we must exploit the mean-reverting (MR) property of the drift function of (18). Our approach is to find a , hopefully much greater than , such that is at a level where would revert upward so that the whole lattice remains capped from below by it. Since the lattice is capped from below by , no other volatility value in the lattice is smaller than .
Note that this reverting level depends on and decreases to 0 as . This should not be a problem because, in practice, lattices yield satisfactory valuation results at some finite where hopefully is still much greater than .
As mentioned above, we would like the entire lattice to be well contained in the domain. With and using (11), we have
We want to find a such that
holds. In (20), may be viewed as the index of the lattice node of y (with the index equal to 0 at ), and from this node, corresponds to the price change of the downward branch. Therefore, the left-hand side of (20) represents the index of this lattice node mapped onto from y at the next time period, which should be bounded below by the node corresponding to . Note in (20) that is a continuous variable, which may not match the value of some discrete lattice node. If , we are done; otherwise, we check if
in order to ensure that the lattice will revert upward at .
An important fact is that a may not necessarily exist for arbitrary process parameters. For example, it has been shown that must hold for y in (18) to be bounded below by zero, which is known as the Feller condition (Feller 1951). Next, we shall show that, if the Feller condition is met, then exists and the entire lattice can be well contained in .
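As a quick illustration, the Feller condition can be checked before any lattice is built. The sketch below assumes the common CIR parameterization dy = kappa (m - y) dt + sigma sqrt(y) dW, for which the condition reads 2 kappa m >= sigma^2. The function name and the parameter names `kappa`, `m`, and `sigma` are assumptions made for this example.

```python
def feller_condition_holds(kappa, m, sigma):
    """True if 2 * kappa * m >= sigma**2 under the assumed parameterization
    dy = kappa * (m - y) dt + sigma * sqrt(y) dW. Hypothetical helper."""
    if kappa <= 0 or m <= 0 or sigma <= 0:
        raise ValueError("kappa, m, and sigma must be positive")
    return 2.0 * kappa * m >= sigma ** 2


# Strong mean reversion relative to the volatility of variance: condition met.
strong = feller_condition_holds(kappa=2.0, m=0.04, sigma=0.3)   # 0.16 >= 0.09
# Weak mean reversion with the same volatility of variance: condition violated.
weak = feller_condition_holds(kappa=0.5, m=0.04, sigma=0.3)     # 0.04 <  0.09
```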
Proposition 3.
If (i) , and (ii) both hold, then exists and the entire lattice can be well contained in .
It is clear that condition (i) of Proposition 3 can be easily satisfied by reducing , but condition (ii) may not. Compared with the Feller condition , condition (ii) has a similar form and may be viewed as the discrete version of the Feller condition in our proposed lattice approach.
For a finite , if condition (ii) is violated, to rectify it, it is probably more effective to reduce the value of c (to its lower bound by increasing ) than to reduce . This demonstrates another application of grid refinement. As a limit, our approach requires to hold (by setting and ) so that exists. This condition is slightly looser than the Feller condition due to the fact that it is easier to bound the process of y from below in the (approximate) discrete domain than in the continuous one. This also means that, whenever the Feller condition is met, a Cox et al. (1985) process can always be approximated by the proposed lattice.
Proposition 4.
Given a Cox et al. (1985) process (18) that satisfies the Feller condition, there exists a feasible lattice configuration such that and the lattice can be well contained in .
3. Two-Factor Trinomial Lattice
In this section, we consider a more general two-factor model:
where and are Wiener processes with an instantaneous correlation . Following the treatment described in (3a)–(3b) in Section 2, we assume that there exist , such that
When both individual one-factor lattices are integrated, we shall apply the same convention to all relevant notations, such as , k, h, , and c, by adding a subscript to reflect the index of the process that each notation represents.
Our idea of using two-factor trinomial lattices to approximate (22a)–(22b) follows the one proposed by Hull and White (1994): obtain a one-factor trinomial lattice for each state variable first, then integrate both lattices into one such that nine branches () emanate from each lattice node. Let , , a definition extended directly from that in the one-factor case in Section 2. Assume the one-factor discrete price increment, branching factor for , and branch jump size are , , and , , respectively, as defined in Section 2. An example is given in Figure 1, where a node at time t is shown in the left panel. One first calculates and using (5) and (9), respectively, for each factor l to determine the three nodes , , and at time that each factor l will map into. The nine nodes of their combinations at are shown in the right panel.
Figure 1.
An example of node branching of our proposed two-factor lattice.
The main task is to solve the nine branching probabilities while matching the first two moments of each factor and their correlation. We further define , , and , . At each node, to determine the nine branching probabilities , where , one solves the following linear system, denoted by :
At each node of the trinomial lattice, it is required to solve to determine the branching probabilities, which has nine variables, six linear equations, and nine nonnegativity inequalities. In optimization terminology, is said to be feasible if a (feasible) solution exists that meets all constraints. Otherwise, is said to be infeasible, which means that some probabilities must be negative or greater than 1. Lattice feasibility refers to the condition where is feasible for all possible nodes in a lattice. In the next proposition, we will show that lattice feasibility is a necessary condition for weak convergence of the proposed two-factor lattice.
Proposition 5.
3.1. Feasibility of the General Lattice
Consider the nine branches that emanate from a fixed node, where the one-factor branching probabilities are assumed to be , . These probabilities are denoted with a tilde to indicate that they are not variables. In addition, assume at this node that the corresponding error factors of the branching and the jump sizes are and , . The branching probabilities can be obtained as follows (cf. (15a)–(15c)):
The key step is to rewrite to the following equivalent form , in terms of , , and , :
To maintain legibility, we denote , , where
In (28), covers all possible nodes in , . is denoted with to emphasize that it is a ‘local’ problem associated with some specific lattice node (as and are functions of , ). Equations (27a)–(27d) intend to match the means and variances of the price deviations of the two factors, which have been met by the marginal probabilities and , . Equation (27e) is derived from (24e) with some algebra.
Consider an initial solution , which satisfies (27a)–(27f) except (27e), unless . To determine the range of on the right-hand side (RHS) of (27e) that is feasible, consider the following two linear programs:
For a given set of , it is clear that is feasible, if, and only if, the RHS of (27e) is between and , i.e.,
Therefore, for , , we have
Given the values of and , using (31), the range of the correlation between the two factors for which the lattice can guarantee feasible branching probabilities at all nodes and all stages can be identified. Next, we will show that (31) is symmetric such that its upper bound is the negative of its lower bound.
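The feasible range of the cross moment at a single node can also be explored numerically. The sketch below is not the pair of linear programs above nor the closed form of Tseng and Lin (2007); it swaps in a standard fact from probability: when the two marginal distributions are fixed, the cross moment E[XY] over all joint distributions is minimized by the countermonotone coupling and maximized by the comonotone coupling. It assumes that matching the mean and variance of each factor pins down its three marginal probabilities, so that fixing the marginals is equivalent to the moment constraints. All function names are hypothetical.

```python
def monotone_coupling(p, q, reverse=False):
    """North-west-corner coupling of two discrete marginals p and q.
    Row sums of the result equal p and column sums equal q. With
    reverse=True, q is traversed backwards (countermonotone coupling)."""
    n, m = len(p), len(q)
    order = list(range(m - 1, -1, -1)) if reverse else list(range(m))
    joint = [[0.0] * m for _ in range(n)]
    p_left = list(p)
    q_left = [q[j] for j in order]
    i = j = 0
    while i < n and j < m:
        mass = min(p_left[i], q_left[j])
        joint[i][order[j]] += mass
        p_left[i] -= mass
        q_left[j] -= mass
        if p_left[i] <= 1e-15:      # row exhausted: move to the next row
            i += 1
        else:                       # otherwise the column is exhausted
            j += 1
    return joint


def cross_moment(joint, x, y):
    """E[XY] under the joint probabilities, for move values x and y."""
    return sum(joint[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def cross_moment_range(p, q, x, y):
    """Smallest and largest E[XY] attainable with nonnegative probabilities.
    Assumes x and y are both listed in increasing order."""
    low = cross_moment(monotone_coupling(p, q, reverse=True), x, y)
    high = cross_moment(monotone_coupling(p, q), x, y)
    return low, high


moves = (-1, 0, 1)                          # down, middle, up (in grid units)
standard = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)
low, high = cross_moment_range(standard, standard, moves, moves)
```

A target cross moment outside `[low, high]` cannot be matched at that node without a negative probability. Repeating the computation over all admissible nodes and taking the tightest bounds mirrors the role of (31) in this simplified setting.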
Proposition 6.
For any arbitrary , , the following equality holds:
Since (31) is symmetric, we can focus on solving . The closed-form expression for has been derived in Tseng and Lin (2007), which is duplicated in the following proposition for the sake of clarity.
Note that , are also functions of and . It can be seen that , , and reduce to , , and , and vice versa, respectively, by exchanging the factor indices 1 and 2.
Using the result from Proposition 6 to find the upper bound of (31), one needs to solve
for , . It would be easier to solve if one could switch the two minimization operators in (33) as follows:
Proposition 8.
Let be n continuous functions and , where is finite. Then,
Next, we state the main theorem of this section.
Theorem 1.
(Two-Factor Lattice Feasibility) Given a lattice configuration , is feasible for all , , if, and only if, , where
and
Theorem 1 is general, but, unfortunately, no explicit functional expressions for are available because each of them requires solving a two-dimensional global minimum of a discontinuous function. However, given a set of lattice parameters , using numerical methods, such as exhaustive search, one can easily obtain the numerical values of .
From the perspective of minimizing computational requirements, one would prefer smaller values of so that the grid size may not become too small. To achieve that, next, we try to maximize over and for a given pair:
Using numerical methods, Table 1 shows the values of for all 100 pairs of for . Note that, since is symmetric, Table 1 only displays half of the pairs with . From Table 1, it is clear that is an increasing function in and . If one considers increasing by 1 to be as difficult in terms of computational effort as increasing by 1, then the diagonal elements where () seem to be the most efficient choices.
Table 1.
The 100 values of for .
When , to obtain the optimal solutions for achieving in (38), it turns out that in Theorem 1 are the smallest elements in the minimum operator in (36), so a symmetrical optimal solution is obtained. Using numerical methods, the values of and the corresponding ( are summarized in Table 2.
Table 2.
The values of and the optimal when .
As an example, if , using the optimized from Table 2, one can use by using obtained from Appendix A. Note that the results presented in this section are for general two-factor lattices. If the two underlying diffusion processes have some special structures, e.g., the class of square root volatility models, then may be further increased so that the required that guarantees lattice feasibility may be lowered. In the next section, we focus on the Heston SV model, for which explicit functional expressions for are available.
3.2. Lattice for the Heston SV Model
Consider the Heston SV model (Heston 1993) as follows:
where is the logarithm of the stock price; , , m, and are positive constants.
Since both volatilities in (39a) and (39b) are functions of , we focus on the MR process of . As mentioned in Section 2.3, if the Feller condition is satisfied, exists for (39b). Based on , we can identify positive and . Once is obtained, we set
Proposition 9.
Using Proposition 9, we will show that Theorem 1 has an explicit functional form for if we set . Since is a constant, is a fixed constant for all nodes. We denote it as . Without loss of generality, we assume . To make , from (5), we need to have
This can be achieved by adjusting the value of , , or . Adjusting and/or is more straightforward than adjusting . However, since is a parameter for lattice configuration, we recommend adjusting the value of slightly to eliminate the remainder of (41) in order to make .
With and , using (31), the condition for lattice feasibility can be significantly simplified as follows:
for , .
Theorem 2.
Next, we try to maximize by adjusting the lattice configuration while maintaining lattice feasibility. Let
Using numerical methods, the solution of (45) is obtained as follows. When , is achieved at the lower bounds of and , where , and is determined by . When , is achieved at , the lower bound, and at the point where . That is,
where
It can be seen that approaches 1 as increases. The value of for is given in Table 3, along with the corresponding . Note that is symmetric in and . Therefore, another solution for is to switch and .
Table 3.
The values of (with ).
Returning to the previous example, if , it now requires only with (1.1832, 1.1866) or (1.1866, 1.1832) to achieve lattice feasibility, a reduction from of the optimized general model.
4. Impact of Lattice Infeasibility on Option Valuation
In this section, we investigate how much lattice infeasibility could impact option valuation. Consider the following SV model of Heston (1993) where the stock price () and the instantaneous variance () under the risk neutral measure are defined as follows:
where r is the constant risk-free rate, q is the dividend yield, and and are Wiener processes such that .
4.1. An Optimization Perspective
The problem contains six linear equations with nine variables and a nonnegativity constraint. Since has more variables than linear equations, these linear equations have infinitely many solutions (sets of branching probabilities). Therefore, the key is the nonnegativity constraint, which requires a solution to be a set of legitimate probabilities. When the term ‘lattice feasibility’ was coined by Tseng and Lin (2007), there was an implication of using optimization to determine branching probabilities. The idea was to add an objective function to be optimized subject to . Since not all feasible solutions contribute the same to the objective function, using optimization would find not only a feasible solution, but an optimal solution. The objective function in this context refers to the quality of fitting the probability distribution of the underlying uncertainties. Tseng and Lin (2007) used the following objective function:
where , are subject to ; and and are marginal probabilities obtained from (26a)–(26c). Tseng and Lin (2007) showed that doing so best fits the probability distribution of the underlying variables in some measure.
The optimization problem in (50) is a standard quadratic programming (QP) problem with linear constraints. Clearly, the optimal solution of (50) is , when . When , this optimization problem finds the solution that has the least deviation from the solution of the uncorrelated case. At each node, their approach requires solving a simple optimization problem to determine branching probabilities for . We adopt this approach in this paper and refer to it as Best-Fit. Although solving a QP at each node may seem cumbersome, we have developed an iterative method to identify the binding constraints at optimality. Once the binding constraints are identified, the corresponding feasible solution is the optimal one due to the convexity of the QP. This approach is very efficient, as it usually takes only a few trials to correctly identify the binding constraints.
On the other hand, we need a counterpart method for determining branching probabilities that works well in practice but may not guarantee lattice feasibility. We consider the popular lattice approach proposed by Hull and White (1994) (denoted by H&W) as this counterpart method. Note that the comparison is conducted in the same lattice framework such that the lattice structure is exactly the same except for their respective methods of determining branching probabilities at each node: the first is by Best-Fit and the second by H&W.
When , Hull and White (1994) suggest with :
and similarly when ,
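A minimal sketch of this kind of correlation adjustment is given next. The adjustment pattern (entries 5, -4, -1, 8 in units of eps, with eps equal to the absolute correlation divided by 36) is the one commonly attributed to Hull and White (1994) and is reproduced here as an assumption; it should be checked against the expressions given above. The names `POSITIVE_PATTERN`, `hw_joint_probs`, and `count_illegitimate` are hypothetical.

```python
# Adjustment in units of eps for positive correlation, indexed
# [move of factor 1][move of factor 2], with moves ordered (up, middle, down).
# Assumed pattern, attributed to Hull and White (1994).
POSITIVE_PATTERN = [
    [5, -4, -1],
    [-4, 8, -4],
    [-1, -4, 5],
]


def hw_joint_probs(p, q, rho):
    """Product of the marginals p and q plus the correlation adjustment.
    Every row and column of the pattern sums to zero, so the marginals,
    and hence the means and variances, are unchanged."""
    eps = abs(rho) / 36.0
    if rho >= 0:
        pattern = POSITIVE_PATTERN
    else:
        pattern = [row[::-1] for row in POSITIVE_PATTERN]  # mirror for rho < 0
    return [[p[i] * q[j] + eps * pattern[i][j] for j in range(3)]
            for i in range(3)]


def count_illegitimate(joint, tol=1e-12):
    """Number of branches whose probability is negative or exceeds one."""
    return sum(1 for row in joint for v in row if v < -tol or v > 1.0 + tol)


standard = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)   # (up, middle, down)
skewed = (0.05, 0.60, 0.35)                    # marginal shifted by a drift

feasible = hw_joint_probs(standard, standard, rho=0.8)
infeasible = hw_joint_probs(skewed, standard, rho=0.8)
```

With the standard marginals the adjusted probabilities stay legitimate, whereas with the skewed marginal two of the nine branches become negative at the same correlation. This illustrates how a fixed adjustment that ignores the local marginals can lose feasibility at some nodes.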
4.2. Numerical Comparisons: Best-Fit vs. H&W
It is well known that the feasibility of becomes harder to meet when the value of the instantaneous correlation is high. If one allows some branching probabilities to be negative, the corresponding probability distribution(s) of the price and/or the volatility is distorted. Depending on the degree of the distortion, there may be some impact on the option valuation. We use both approaches (Best-Fit and H&W) to value a European call option under the Heston SV model. The parameters are taken from Table 1 of Ball and Roma (1994) with , , and .
With , three cases, corresponding to three different strike prices, $80, $100, and $120, are tested with the correlation ranging between −0.8 and 0.8. The result is summarized in Table 4. For the lattice, we consider (with T = 6 months) to make sure that both methods (Best-Fit and H&W) can best fit the probability distribution of the underlying variables.
Table 4.
Impact of lattice infeasibility on options pricing.
The last two columns of Table 4 show the percentage of lattice nodes (from to T) that have at least one outgoing branch whose probability is either negative or exceeds unity. Note that this should not be confused with the situation where a price node at some stage has a negative probability of being reached. A lattice node may receive many incoming branches from other nodes in the previous stage. While some of the incoming branches may have negative probabilities, it is unlikely that the resultant probability of reaching this node is negative.
Basically, the option prices using Best-Fit are very close to the exact values, within one cent. It can be seen that Best-Fit maintains lattice feasibility for all values tested, while H&W meets the feasibility condition only when . When , there is only a very small number of nodes containing branches with negative probabilities. However, when , almost all the nodes (99.79%) violate the feasibility condition. This justifies our selection of H&W as the counterpart, as this method indeed provides very good approximations of branching probabilities that solve . Figure 2 displays the percentage deviation of the obtained option prices from the exact ones by changing using Best-Fit and H&W.
Figure 2.
Deviation (%) of the values of European call options from the exact ones using Best-Fit and H&W for branching probabilities.
It can be seen from Figure 2 that H&W obtains a precise value only when there is no correlation. As increases, the option value obtained by H&W starts to deviate from the exact values. The deviation is roughly a piecewise linear function of , whose slope doubles when . On the other hand, the deviations of Best-Fit are largely confined within 0.6 cents (less than 0.1%). Comparing both methods, we make the following three observations for H&W’s method:
- Consider the OTM option and the ITM option when . Though lattice feasibility is fully maintained, the error persists for . Therefore, lattice feasibility alone cannot guarantee good valuation, especially with a finite time step . The valuation error would converge to zero only when is sufficiently small. Lattice feasibility is thus only a necessary condition for accurate valuations, not a sufficient one. From this perspective, the optimization approach for finding a feasible and optimal set of branching probabilities makes sense.
- Consider the ATM option. Its pricing errors are relatively small for all values tested. Even when , where infeasibility occurs at almost all nodes, the error remains modest (0.3% when ; 1.3% when ). This shows that negative probabilities sometimes matter less.
- Lattice infeasibility means that there are some distortions of the probability distribution of the underlying uncertainties. The effect can be overvaluation (e.g., OTM and ATM options when ) or undervaluation (e.g., ITM option when ) of the options.
In Figure 3, we plot the exact probability density functions (PDF) of the (logarithm of the) stock price (dotted curve) at different correlation levels, and the PDF approximated by the lattice using H&W (solid curve) and Best-Fit for branching probabilities. The exact PDF is obtained from a standard integration scheme of characteristic functions (e.g., see Rough 2013). The probability distributions are taken when , , and . At , both Best-Fit and H&W achieve lattice feasibility (with the first two moments of price and volatility deviations matched). There is no visible distortion in the PDF from Figure 3, yet its OTM option price still has an error of more than 3% (12 cents). This indicates that the optimization of (50) subtly improves the approximation of the tail distribution.
Figure 3.
Illustration of the underlying distributions of .
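The exact PDF mentioned above is obtained by integrating a characteristic function. A minimal Python sketch of such an inversion is given below, using a plain trapezoidal rule on a truncated frequency range. The truncation point, the grid size, and the function names are assumptions; the Heston characteristic function itself is not implemented here, and a normal characteristic function is used only as a stand-in whose density is known in closed form.

```python
import numpy as np


def pdf_from_cf(cf, x, u_max=50.0, n=4001):
    """Density at x from a characteristic function cf via Fourier inversion.

    Uses f(x) = (1/pi) * integral_0^inf Re[exp(-i*u*x) * cf(u)] du,
    truncated at u_max and evaluated with the trapezoidal rule.
    """
    u = np.linspace(0.0, u_max, n)
    du = u[1] - u[0]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    integrand = np.real(np.exp(-1j * np.outer(x, u)) * cf(u))
    # Trapezoidal rule: full sum minus half of the two end points.
    total = integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1])
    return total * du / np.pi


# Stand-in example: characteristic function of a normal distribution.
mu, sigma = 0.1, 0.3


def normal_cf(u):
    return np.exp(1j * mu * u - 0.5 * (sigma * u) ** 2)


grid = np.linspace(-1.0, 1.2, 5)
print(pdf_from_cf(normal_cf, grid))
```

For a heavier-tailed or more oscillatory characteristic function, the truncation point and grid size would need to be checked for convergence before the resulting density is compared with a lattice approximation.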
When , the discrepancy between the exact PDF and the PDF approximated by H&W becomes visible. Using in Figure 3 as an example, where the discrepancy is most distinct, the exact PDF and the one approximated by Best-Fit are still visually indistinguishable. In general, when (or ), it can be seen that, using H&W, the price distribution is distorted such that it is slightly less skewed to the right (or left) with a bigger (or smaller) mode. In Figure 3, the three strike prices tested corresponding to OTM, ATM, and ITM options are also identified. When (or ), it can be seen that using this distorted price distribution to value a European call option undervalues (or overvalues) the OTM and ATM ones, but overvalues (or undervalues) the ITM option. We make the following three additional observations:
- The price of an OTM option is directly impacted by lattice infeasibility as its value is determined by the tail distribution, which is the part of the probability distribution that is most affected by negative probabilities.
- For an ITM option, a wider part of the probability distribution becomes relevant, which tends to involve both tails and the central part, making the overall effects hard to predict.
- For an ATM option, the distorted tail part seems less important because the less distorted central part dominates the valuation.
To sum up, because the way lattice infeasibility distorts the underlying distribution is unknown a priori, it seems unlikely that one could exploit negative probabilities. If one nevertheless hopes that negative probabilities will work to one's advantage, this is less likely to happen with OTM options and more likely with ATM options.
5. Performance of the Best-Fit Lattice
In this section, we provide more numerical results for a lattice equipped with lattice feasibility and the Best-Fit method for branching probabilities. We continue to focus on the Heston SV model in (49a)–(49b). Since this approach guarantees feasible branching probabilities that also fit the underlying probability distribution well, it is unsurprising that all results indicate that the approach is very reliable for accurate option valuations.
5.1. European Options Valuation
To make a comprehensive analysis of pricing errors, we compare the option prices obtained by the proposed lattice model with the exact solutions of European call options of various specifications. These specifications are drawn from combinations of the following factors: , (dividend yield), years (time to maturity), , , , (strike price), , , and . Those combinations that do not meet the Feller condition (i.e., ) are excluded. After that, 540 sets remain for the testing. The accuracy measure used in this paper is the root mean squared error (RMSE), defined as follows:
where the error is the difference between the exact value of the i-th European call option and the estimated option value using the proposed lattice model.
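The screening and accuracy measure described above can be sketched as follows. The Feller condition is assumed here in its usual form 2κθ ≥ σ², and the names `kappa`, `theta`, and `sigma_v` are assumptions because the expression is not reproduced in the text; the parameter sets and prices are hypothetical.

```python
import math


def satisfies_feller(kappa, theta, sigma_v):
    """Assumed form of the Feller condition: 2 * kappa * theta >= sigma_v**2."""
    return 2.0 * kappa * theta >= sigma_v ** 2


def rmse(exact_values, estimated_values):
    """Root mean squared error between exact and estimated option values."""
    if len(exact_values) != len(estimated_values):
        raise ValueError("inputs must have the same length")
    squared = [(e - a) ** 2 for e, a in zip(exact_values, estimated_values)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical parameter sets (kappa, theta, sigma_v); only feasible ones are kept.
candidates = [(2.0, 0.04, 0.3), (1.0, 0.04, 0.5), (3.0, 0.09, 0.5)]
kept = [c for c in candidates if satisfies_feller(*c)]
print(kept)

# Hypothetical exact and lattice prices for three retained cases.
print(rmse([10.00, 5.25, 1.10], [10.01, 5.24, 1.10]))
```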
From Table 5, it can be seen that the pricing errors of the proposed lattice model are generally small compared with the exact option prices even if the number of time steps is small. For instance, when , the RMSE of the proposed trinomial method for pricing European call options is $0.0061, which is far smaller than the bid-ask spreads observed in the market. Moreover, Table 5 indicates that the rate of convergence of the proposed method is of order .
Table 5.
The accuracy of the proposed Best-Fit model for pricing European call options under the Heston SV model.
5.2. Convergence and Complexity
Consider another test case taken from Table 1 of Ball and Roma (1994) with , , , and . In addition, assume months, . Using this example, we also investigate the convergence pattern of the option prices obtained by the proposed lattice approach and the exact value ($8.3595) when the number of time steps increases, along with its computational requirement. The results are presented in Figure 4. As seen in the upper portion of Figure 4, the option prices obtained by the proposed lattice model do converge to the exact price. We have investigated the convergence pattern for more than twenty cases, and the results are similar to those shown in Figure 4. In the lower portion, we show the CPU times required for obtaining the option values. The CPU times (in seconds) are measured on a personal computer with Intel Core 2 Duo processor E8400 of 3 GHz. We also record the number of lattice nodes in the final stage. Both the CPU times and the node numbers are displayed in logarithmic scales on both the horizontal and vertical axes. The purpose is to check whether the computational complexity is exponential. A linear behavior in a log-log graph, such as the lower portion of Figure 4, indicates that the complexity is polynomial. It is estimated for this particular instance that the CPU time is approximately of order and the number of nodes in the final stage is of order , where N is the number of steps. The result indicates that, as N increases, most branches do recombine, which prevents the number of lattice nodes from growing exponentially with N. In this example, when , the option value is already within 0.2% of the exact value using about three seconds of CPU time and about 40,000 nodes in the final stage. When , the pricing error of the proposed approach is within 0.1%, using about 21 s of CPU time and 160,000 lattice nodes in the final stage. 
The number of lattice nodes involved in the proposed approach is indeed much greater than that of traditional recombining lattices. However, the computation using the proposed lattice model can still be carried out quite efficiently.
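The log-log check described above amounts to fitting a straight line to the logarithms of the measurements and reading the polynomial order off the slope. A minimal sketch, assuming NumPy and entirely hypothetical timing data (not the measurements behind Figure 4), is:

```python
import numpy as np


def loglog_order(n_steps, values):
    """Estimate p in values ~ c * N**p from the slope of a log-log fit."""
    log_n = np.log(np.asarray(n_steps, dtype=float))
    log_v = np.log(np.asarray(values, dtype=float))
    slope, _intercept = np.polyfit(log_n, log_v, 1)
    return float(slope)


# Hypothetical CPU times (seconds) for increasing numbers of time steps.
steps = [25, 50, 100, 200]
cpu_seconds = [0.05, 0.38, 3.1, 24.5]
print(f"estimated order: {loglog_order(steps, cpu_seconds):.2f}")
```

A slope that stays roughly constant as more points are added is consistent with polynomial growth, whereas exponential growth would show up as a curve that keeps bending upward on the same axes.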
Figure 4.
Convergence pattern of the option prices and the computational requirement.
5.3. American Options Valuation
One unique benefit of the lattice approach is its ability to value American options. We use the proposed lattice model (Best-Fit) to value an American put option under the Heston SV model with the following parameters: strike $100, , , , , and . The results are summarized in Table 6. We compare the results with those reported in Beliaeva and Nawalkha (2010), denoted by ‘B&N Tree’ in the same table. We also apply the control variate (CV) technique to value the American put as follows:
American Put (CV) = American Put (Best-Fit) + (European Put (Closed-Form) - European Put (Best-Fit))
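The control variate adjustment above is a one-line correction: the lattice's known error on the European put is added back to the lattice's American put value. A minimal sketch, with a hypothetical function name and hypothetical prices, is:

```python
def american_put_cv(american_lattice, european_closed_form, european_lattice):
    """Control variate estimate of an American put.

    The lattice error on the European put (closed form minus lattice value)
    is used to correct the American put obtained from the same lattice.
    """
    return american_lattice + (european_closed_form - european_lattice)


# Hypothetical prices: the lattice undervalues the European put by 2 cents,
# so the American put from the same lattice is raised by the same amount.
print(american_put_cv(american_lattice=5.10,
                      european_closed_form=4.80,
                      european_lattice=4.78))
```

The correction is only as good as the assumption that the lattice error on the European and American options is similar, which is why both values should come from the same lattice configuration.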
Table 6.
American put price computed using Best-Fit model and control variate (CV) technique, compared with the result reported in Beliaeva and Nawalkha (2010).
Each of the last two columns of Table 6 reports the difference, labeled as ‘error’, between the American put option price obtained by either the proposed lattice model or the B&N Tree and the price obtained by the CV technique. It can be seen that most errors are smaller than one cent, and, in most cases, the differences obtained by the Best-Fit model are smaller than the corresponding ones reported in Beliaeva and Nawalkha (2010). The approach that achieves the smaller error is highlighted in bold in each comparison.
A more detailed comparison is summarized in Table 7. There are 36 cases tested in Table 6 (corresponding to 36 rows), which can be divided into the categories given in Table 7. The number of wins (in terms of a lower error) and the percentage of wins are recorded, along with the corresponding RMSE of each category. Overall, the Best-Fit model achieves smaller errors in 61% of all 36 test cases; our RMSE (0.0081) is only 44% of that (0.0181) of the B&N Tree (see Table 6). In every category summarized in Table 7, the Best-Fit model outperforms the B&N Tree approach by either the number of wins or the RMSE. In terms of the correlation , Table 7 shows that the Best-Fit model performs especially well when the correlation is high (): the Best-Fit model does better in 67% of cases, and its RMSE is only one third of that of the B&N Tree approach. When is high, lattice feasibility becomes harder to meet. Since the proposed lattice model maintains lattice feasibility and can best fit the underlying probability distribution for all , the results in Table 7 show that it indeed performs relatively well when is high.
Table 7.
Comparison summary of the results in Table 6.
To summarize Table 7, it is fair to say that the Best-Fit model performs better especially when is high, T is greater than one month, and/or for the options that are ITM or OTM.
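The win counts and RMSE ratios used in this comparison can be computed as in the following sketch. The function name, the dictionary keys, and the error values are hypothetical; they are not the entries of Table 6 or Table 7.

```python
import numpy as np


def compare_errors(errors_a, errors_b):
    """Summarize two methods' errors measured against the same benchmark."""
    a = np.abs(np.asarray(errors_a, dtype=float))
    b = np.abs(np.asarray(errors_b, dtype=float))
    wins_a = int((a < b).sum())  # cases where method A has the smaller error
    rmse_a = float(np.sqrt(np.mean(a ** 2)))
    rmse_b = float(np.sqrt(np.mean(b ** 2)))
    return {
        "wins_a": wins_a,
        "win_rate_a": wins_a / len(a),
        "rmse_a": rmse_a,
        "rmse_b": rmse_b,
        "rmse_ratio": rmse_a / rmse_b,
    }


# Hypothetical errors (in dollars) of two methods on four test cases.
summary = compare_errors([0.010, -0.020, 0.005, 0.000],
                         [0.020, 0.010, -0.010, 0.030])
print(summary)
```

Restricting both error arrays to one category (for example, only the high-correlation cases) before calling the function gives the per-category figures.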
6. Conclusions
In this paper, we focus on two-factor lattices for general diffusion processes where volatilities can be state-dependent, including stochastic volatility models. For a lattice approach, although it is common knowledge that branching probabilities must be between zero and one, few methods can guarantee all branching probabilities of all nodes in all stages are always legitimate. We refer to this property as lattice feasibility. Since it is not unusual to encounter negative probabilities, some practitioners have argued that negative probabilities are not necessarily ‘bad’ and may be further exploited. We have developed a theoretical framework of lattice feasibility to investigate how negative probabilities may impact option pricing in a lattice approach. We have shown that lattice feasibility can be achieved by adjusting a lattice’s configuration (e.g., grid sizes).
Failing to meet lattice feasibility means that some branching probabilities in a lattice are negative or exceed unity, which implies distortions of the probability distribution of the underlying variables. Depending on the distortion, the accuracy of options pricing may be affected. We have found that out-of-the-money options are most affected, followed by in-the-money options and at-the-money options. It has also been observed that negative probabilities indeed matter less in some situations. Because the way an underlying probability distribution is distorted by lattice infeasibility is unknown a priori, it seems unlikely that one could exploit negative probabilities consistently, as some practitioners may hope.
Although lattice feasibility is a necessary condition for weak convergence of approximating the underlying diffusion processes, our numerical tests also show that lattice feasibility alone may not be sufficient to guarantee accurate valuation, especially when the time step of the lattice is not especially small. Since legitimate branching probabilities may not be unique, we use an optimization approach to find branching probabilities that are not only legitimate but also can best fit the probability distribution of the underlying variables. Extensive numerical tests show that this optimized lattice model is a reliable and robust approach for financial option valuations.
Author Contributions
Conceptualization, C.-L.T.; methodology, C.-L.T. and D.W.-C.M.; software, C.-L.T.; analysis, C.-L.T., D.W.-C.M., S.-L.C. and P.-T.S.; writing—original draft preparation, C.-L.T.; writing—review and editing, C.-L.T., D.W.-C.M., S.-L.C. and P.-T.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Appendix A.1. Proof of Proposition 1
For , , and to be legitimate for all , we consider
The upper bound of (A2) is achieved at . The lower bound is achieved at . Thus,
Thus,
Plugging to (A6) gives the following ranges of :
When , apparently must be 3. When , the upper bound of increases with h, while the lower bound decreases. Therefore, for , sets the bounds. That is, for .
Appendix A.2. Proof of Proposition 2
The proof is based on Durrett (1996) where the conditions for weak convergence from Markov chains to diffusion processes are given. To present our proof, we first give the following lemma which is taken from Lemma 8.2 of (Durrett 1996, p. 306) and adapted to our one-factor case. For convenience, we introduce the following definitions for the one-factor lattice .
They are concerned with the first, second, and higher (absolute) moments of the change in lattice across .
Lemma A1.
Suppose that . If there exists a and for all , we have
then the one-factor lattice converges weakly to as .
To prove Proposition 2, we need to check the above three conditions. Note that conditions (i) and (ii) are true because (17a)–(17c) hold for all y. Then, we check condition (iii) with and have
Note that
Fix an arbitrarily large . For any y such that (thus and are finite) and for a sufficiently small , we have
Putting these together, we have
Thus, condition (iii) is satisfied.
Appendix A.3. Proof of Proposition 3
Before we prove Proposition 3, some preparation needs to be done. Equation (20) is equivalent to:
Likewise, (21) is equivalent to:
Since dealing with the floor functions is cumbersome, the following lemma enables us to consider the sufficient conditions without the floor functions that imply (A7a) and (A7b).
Lemma A2.
Suppose . The following two statements are true:
- (a)
- If , then .
- (b)
- If , then .
Proof.
Let , , , and . This implies that , , , and .
- (a)
- We have . If , i.e., , then . This means that implies .
- (b)
- Similarly, . If , i.e., , then . This means that, if , then .
□
We want to show that exists such that (A9) holds for all . Then, using Lemma A2(a), this implies that (20) holds for all .
Lemma A3.
The following two statements are true:
- (a)
- Given , if , , , and , then for .
- (b)
- If and , then has a solution .
Proof.
- (a)
- Given , , , , and , consider . Note that is a quadratic convex function. Consider two cases: (I) If , then ; is increasing and is positive for . (II) If , f has a local minimum at with objective value . Both cases imply that , . Since and , implies that .
- (b)
- When , the left-hand-side of . Therefore, by continuity, there exists some to satisfy the inequality .
□
Now, we are ready to prove Proposition 3 using Lemmas A2 and A3. Let , , , where is from (19). From Lemma A3(a), if , which is condition (i) of Proposition 3, and , then (A9) holds for , which further implies that (20) holds for . Now, consider the following equivalent statements:
Using Lemma A3(b), exists if the right-hand side of (A14) is strictly positive. That is,
which is given by condition (ii) of Proposition 3. Suppose a feasible is found that satisfies (A14). Using Lemma A2(a) and Lemma A3(a), we conclude exists such that (20) holds for .
Using to evaluate from (19), if , we are done. Otherwise, is imposed as given in (21), which is reduced to (A11). Using Lemma A3(b), it is clear that exists to ensure (A11). However, to ensure that , from (19), the following upper bound of must also be imposed:
Using Lemma A3(b) again, it is clear that a exists that also meets (A16). Since is a function of from (19), we conclude that a exists, say , such that (21) holds. Since , it is clear that also meets (A14). Therefore, both (A7a) and (A7b) hold with .
To sum up, given conditions (i) and (ii) of Proposition 3, exists (e.g., equal to ). This means that the entire lattice can be well contained in .
Appendix A.4. Proof of Proposition 4
Given Feller’s condition , we want to show there exists a lattice configuration such that . To do so, we choose a small such that , i.e., . We also choose a big such that is very close to 1 while still being greater than 1 based on (16), and furthermore . This would imply , which is condition (i) of Proposition 3; and
which is condition (ii) of Proposition 3. The result follows from Proposition 3.
Appendix A.5. Proof of Proposition 5
The proof for the two-factor case is in the same spirit as the one-factor case treated in Section A.2. For the two-factor lattice , define
With the above definitions, we may restate Lemma 8.2 of Durrett (1996) for the two-factor case as follows.
Lemma A4.
Suppose that . If there exists a and for all , we have
then converges weakly to as .
The proof of Proposition 5 requires the above three conditions. Conditions (i) and (ii) are true because (25a)–(25d) hold. To check condition (iii) with , we observe that
where . Using an argument similar to Section A.2, we conclude that, for a given , there exists a constant such that for all lattice nodes with . This shows that condition (iii) is satisfied.
Appendix A.6. Proof of Proposition 6
- (i)
- First, observe from (15a)–(15c) that replacing by is equivalent to switching and . Consider the optimization problem of with a feasible solution of , as shown in Figure A1. The objective function is , which is the sum of the northwest and southeast corner elements minus the sum of the other two corner elements. It can be seen that switching the first and the third columns (i.e., replacing and ), and the first and the third rows (i.e., replacing and ), yields the same objective value. This implies that .
Figure A1. Interpretation of .
- (ii)
- Continuing the argument in (i), if one only switches the first and the third columns or the first and the third rows, the objective value will still be the same but with a different sign. Therefore, .
- (iii)
- From (ii), it is clear that
  In (A18d), we use the fact that ; thus, replacing by will not affect the result of the optimization.
Appendix A.7. Proof of Proposition 7
See the proof of Lemma 2 in Tseng and Lin (2007).
Appendix A.8. Proof of Proposition 8
Let , and , . For simplicity, we assume , . We want to show .
By the definition of , , we have . Thus, . Furthermore, we have , . Thus, . This means that . Thus, we conclude that .
Appendix A.9. Proof of Theorem 1
We consider the six cases of , , given in Proposition 7.
- (i)
- (ii)
- This is a symmetric case for case (i) by exchanging the factor indices, 1 and 2.
- (iii)
- , we have
  Again, in (A20d), we optimize over first, and the optimal solution is achieved at with .
- (iv)
- . Repeat the same process and we have
  where in (A21a) we used the fact that the optimal solution for is .
- (v)
- For cases associated with and , they are symmetric counterparts of (iii) and (iv) by exchanging the factor indices 1 and 2. This obtains the same lower bounds as in (iii) and (iv).
Summarizing all four possible cases above, we conclude that
The proof is completed.
Appendix A.10. Proof of Proposition 9
Thus,
Appendix A.11. Proof of Theorem 2
This proof is similar to that of Theorem 1. Six cases corresponding to to will be discussed.
- (i)
- . Consider
  This corresponds to .
- (ii)
- . Consider
  where the minimum is achieved at . This corresponds to .
- (iii)
- , we have
  where in the minimization of (A26) over , the minimum is achieved at . To solve the optimization in (A27), treat the objective function as a one-dimensional function of , which is a continuous variable, with h plugged in as from (9). It can be seen that the objective function is discontinuous. Using computing tools to display the one-dimensional function, subject to from (14) and , the global minimum is achieved at , where .
- (iv)
- .
  where, in the minimization of (A29) over , the minimum is achieved at . Using computing tools to solve the one-dimensional minimization in (A30), the global minimum is achieved at and where is infinitesimal. The existence of an infinitesimal is to enforce the relation . However, for obtaining the minimum value, which will just serve as a bound for our purpose, one essentially can just plug into (A29) to find the value of the minimum.
- (v)
- where, in the minimization of (A31) over , the minimum is achieved at . Following the steps described in (iii) to minimize (A32) as a one-dimensional problem using computing tools, the global minimum is achieved at , where . This term corresponds to . It can be seen that is smaller than the term in (A28). Thus, does not contribute to .
- (vi)
- .
  where, in the minimization in (A33) over , the minimum is achieved at . Using computing tools to solve the one-dimensional minimization in (A34), the global minimum is achieved at and where is infinitesimal. As in (iv), one can simply plug into (A34). This corresponds to , which is always smaller than the term in (A30). Thus, does not contribute to .
References
- Akyildirim, Erdinç, Yan Dolinsky, and H. Mete Soner. 2014. Approximating stochastic volatility by recombinant trees. The Annals of Applied Probability 24: 2176–205.
- Andersen, Leif B. G. 2007. Efficient Simulation of the Heston Stochastic Volatility Model. Available online: http://ssrn.com/abstract=946405 (accessed on 21 May 2021).
- Ball, Clifford, and Antonio Roma. 1994. Stochastic volatility option pricing. Journal of Financial and Quantitative Analysis 29: 589–607.
- Beliaeva, Natalia A., and Sanjay K. Nawalkha. 2010. A simple approach to price American options under the Heston stochastic volatility model. Journal of Derivatives 17: 25–43.
- Boyle, Phelim P. 1986. Option valuation using a three-jump process. International Options Journal 3: 7–12.
- Boyle, Phelim P. 1988. A lattice framework for option pricing with two state variables. Journal of Financial and Quantitative Analysis 23: 1–12.
- Broadie, Mark, and Özgür Kaya. 2006. Exact simulation of stochastic volatility and other affine jump diffusion processes. Operations Research 54: 217–31.
- Burgin, Mark, and Gunter Meissner. 2012. Negative Probabilities in Financial Modeling. Wilmott Magazine 58: 60–65.
- Chung, San-Lin, and Pai-Ta Shih. 2007. Generalized Cox-Ross-Rubinstein binomial models. Management Science 53: 508–20.
- Costabile, Massimo, Ivar Massabò, and Emilio Russo. 2012. A forward shooting grid method for option pricing with stochastic volatility. Journal of Derivatives 20: 67–78.
- Cox, John C., Jonathan E. Ingersoll, and Stephen A. Ross. 1985. A theory of the term structure of interest rates. Econometrica 53: 385–408.
- Cox, John C., Stephen Ross, and Mark Rubinstein. 1979. Option pricing: A simplified approach. Journal of Financial Economics 7: 229–64.
- Durrett, Richard. 1996. Stochastic Calculus: A Practical Introduction. Boca Raton: CRC Press Inc.
- Feller, William. 1951. Two singular diffusion problems. Annals of Mathematics 54: 173–82.
- Haug, Espen Gaarder. 2007. Why so negative to negative probabilities. In Derivatives Models on Models. New York: John Wiley & Sons, Chapter 14.
- Heston, Steven L. 1993. A closed-form solution for options with stochastic volatility with applications to bond and currency options. The Review of Financial Studies 6: 327–43.
- Hull, John C., and Alan White. 1988. The use of control variate technique in option-pricing. Journal of Financial and Quantitative Analysis 23: 237–51.
- Hull, John C., and Alan White. 1990. Valuing derivative securities using the explicit finite difference method. Journal of Financial and Quantitative Analysis 25: 87–100.
- Hull, John C., and Alan White. 1993. One-factor interest-rate models and the valuation of interest-rate derivative securities. Journal of Financial and Quantitative Analysis 28: 235–54.
- Hull, John C., and Alan White. 1994. Numerical procedures for implementing term structure models II: Two-factor models. Journal of Derivatives 2: 37–49.
- Kouritzin, Michael A. 2000. Exact infinite dimensional filters and explicit solutions. In Stochastic Models. Edited by Luis G. Gorostiza and B. Gail Ivanoff. Providence: American Mathematical Society, pp. 265–82.
- Kouritzin, Michael A., and Anne Mackay. 2020. Branching particle pricers with Heston examples. International Journal of Theoretical and Applied Finance 23: 2050003.
- Longstaff, Francis A., and Eduardo S. Schwartz. 2001. Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies 14: 113–47.
- Maghsoodi, Yoosef. 1996. Solutions of the extended CIR term structure and bond option valuation. Mathematical Finance 6: 89–109.
- Rendleman, Richard J., and Brit J. Bartter. 1979. Two-state option pricing. Journal of Finance 34: 1093–110.
- Rough, Fabrice D. 2013. The Heston Model and Its Extension in Matlab and C#. Hoboken: Wiley.
- Ruckdeschel, Peter, Tilman Sayer, and Alexander Szimayer. 2013. Pricing American options in the Heston model: A close look on incorporating correlation. Journal of Derivatives 20: 9–29.
- Tseng, Chung-Li, and Kyle Lin. 2007. A framework using two-factor price lattices for generation asset valuation. Operations Research 55: 234–51.
- Zvan, R., Peter A. Forsyth, and K. R. Vetzal. 2001. Negative Coefficients in Two Factor Option Pricing Models. Working Paper. Available online: https://cs.uwaterloo.ca/~paforsyt/posmesh3.pdf (accessed on 21 May 2021).
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).



