*5.3. Example*

We analyse *n* = 1043 daily log-returns for the Bitcoin price series for the period 2016–2019; values are multiplied by 100. We first apply the semi-parametric approach of Genest et al. (1995) using the log-likelihood (23), which yields the results in Table 1. Different models are referred to by VT(*n*)-ARMA(*p*, *q*), where (*p*, *q*) refers to the ARMA model and *n* indexes the v-transform: 1 is the linear v-transform $\mathcal{V}_{\delta}$ in (8); 3 is the three-parameter transform $\mathcal{V}_{\delta,\kappa,\xi}$ in (7); 2 is the two-parameter v-transform given by $\mathcal{V}_{\delta,\kappa} := \mathcal{V}_{\delta,\kappa,1}$. In unreported analyses, we also tried the three-parameter family based on the beta distribution, but this had a negligible effect on the results.
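To make the stepwise semi-parametric recipe concrete, the following is a minimal Python sketch (the paper's own code is in R) that simulates a VT(1)-AR(1) copula process and recovers the AR coefficient from the scaled ranks. It is a simplification of the actual method: the true δ is assumed known rather than estimated, and a lag-1 Yule–Walker estimate stands in for the Kalman-filter maximum likelihood behind log-likelihood (23).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
n, a, delta = 4000, 0.7, 0.45

# Gaussian AR(1) with unit marginal variance: the model for the
# normal scores of the volatility PIT process (V_t)
z = np.empty(n)
z[0] = rng.standard_normal()
eps = np.sqrt(1.0 - a**2) * rng.standard_normal(n)
for t in range(1, n):
    z[t] = a * z[t - 1] + eps[t]

# V_t = Phi(Z_t); the stochastic inverse of the linear v-transform
# turns (V_t) into the PIT process (U_t) of the observed series
v = stats.norm.cdf(z)
w = rng.uniform(size=n)
u = np.where(w <= delta, delta * (1.0 - v), delta + (1.0 - delta) * v)

# Semi-parametric fit: scaled ranks -> v-transform -> normal scores
u_hat = stats.rankdata(u) / (n + 1)
v_hat = np.where(u_hat <= delta, 1.0 - u_hat / delta,
                 (u_hat - delta) / (1.0 - delta))
z_hat = stats.norm.ppf(v_hat)
a_hat = float(np.corrcoef(z_hat[1:], z_hat[:-1])[0, 1])
print(f"true a = {a}, recovered a_hat = {a_hat:.3f}")
```

The full estimation routine, including profiling over the v-transform parameters, is implemented in the tscopula package referenced in the Data Availability Statement.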

**Table 1.** Analysis of daily Bitcoin return data 2016–2019. Parameter estimates, standard errors (below estimates) and information about the fit: SW denotes the Shapiro–Wilk *p*-value; *L* is the maximized value of the log-likelihood and AIC is the Akaike information criterion.


The column marked *L* gives the value of the maximized log-likelihood. All values are large and positive, showing strong evidence of stochastic volatility in all cases. The model VT(1)-ARMA(1,0) is a first-order Markov model with linear v-transform. The fit of this model is noticeably poorer than the others, suggesting that Markov models are insufficient to capture the persistence of stochastic volatility in the data. The column marked SW contains the *p*-value for a Shapiro–Wilk test of normality applied to the residuals from the VT-ARMA copula model; the result is non-significant in all cases.

According to the AIC values, the VT(2)-ARMA(1,1) is the best model. We experimented with higher-order ARMA processes, but this did not lead to further significant improvements. Figure 5 provides a visual summary of the fit of this model. The panels show the QQplot of the residuals against normal, acf plots of the residuals and absolute residuals, and the estimated conditional mean process ($\hat{\mu}_t$), which can be taken as an indicator of high- and low-volatility periods. The residuals and absolute residuals show very little evidence of serial correlation, and the QQplot is relatively linear, suggesting that the ARMA filter has been successful in explaining much of the serial dependence structure of the normalized volatility proxy process.

We now add various marginal distributions to the VT(2)-ARMA(1,1) copula model and estimate all parameters of the model jointly. We have experimented with a number of location-scale families, including Student-t, Laplace (double exponential), and a double-Weibull family which generalizes the Laplace distribution and is constructed by taking back-to-back Weibull distributions. Estimation results are presented for these three distributions in Table 2. All three marginal distributions are symmetric around their location parameters μ, and no improvement is obtained by adding skewness using the construction of Fernández and Steel (1998) described in Section 3.1; in fact, the Bitcoin returns in this time period show a remarkable degree of symmetry. In the table, the shape and scale parameters of the distributions are denoted η and σ, respectively. In the case of the Student-t, an infinite-variance distribution with degree-of-freedom parameter η = 1.94 is fitted, but this model is inferior to the models with Laplace and double-Weibull margins; the latter is the favoured model on the basis of the AIC values.
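One way to realize the back-to-back construction is to place half a Weibull density with shape η and scale σ on each side of the location μ. The sketch below follows that idea; the exact parameterization fitted in the paper may differ in detail, but this version reduces to the Laplace distribution when η = 1.

```python
import numpy as np
from scipy import stats

def double_weibull_pdf(x, mu=0.0, sigma=1.0, eta=1.0):
    # Half a Weibull(shape eta, scale sigma) density on each side of mu
    z = np.abs(np.asarray(x, dtype=float) - mu)
    return 0.5 * stats.weibull_min.pdf(z, c=eta, scale=sigma)

def double_weibull_cdf(x, mu=0.0, sigma=1.0, eta=1.0):
    # Fold the Weibull CDF back-to-back around the location mu
    z = np.asarray(x, dtype=float) - mu
    w = stats.weibull_min.cdf(np.abs(z), c=eta, scale=sigma)
    return np.where(z < 0, 0.5 * (1.0 - w), 0.5 * (1.0 + w))

# eta = 1 recovers the Laplace (double exponential) distribution
print(float(double_weibull_cdf(1.3, eta=1.0)), stats.laplace.cdf(1.3))
```

With η < 1, each half-density has a heavier-than-exponential tail, consistent with the sub-exponential tail behaviour discussed below for the fitted margin.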

**Figure 5.** Plots for a VT(2)-ARMA(1,1) model fitted to the Bitcoin return data: QQplot of the residuals against normal (**upper left**); acf of the residuals (**upper right**); acf of the absolute residuals (**lower left**); estimated conditional mean process ($\hat{\mu}_t$) (**lower right**).

**Table 2.** VT(2)-ARMA(1,1) model with three different margins: Student-t, Laplace, double Weibull. Parameter estimates, standard errors (alongside estimates) and information about the fit: SW denotes the Shapiro–Wilk *p*-value; *L* is the maximized value of the log-likelihood and AIC is the Akaike information criterion.


Figure 6 shows some aspects of the joint fit for the fully parametric VT(2)-ARMA(1,1) model with double-Weibull margin. A QQplot of the data against the fitted marginal distribution confirms that the double-Weibull is a good marginal model for these data. Although this distribution is sub-exponential (heavier-tailed than exponential), its tails do not follow a power law and it is in the maximum domain of attraction of the Gumbel distribution (see, for example, McNeil et al. 2015, Chapter 5).

**Figure 6.** Plots for a VT(2)-ARMA(1,1) model combined with a double Weibull marginal distribution fitted to the Bitcoin return data: QQplot of the data against fitted double Weibull model (**upper left**); estimated volatility proxy profile function *gT* (**upper right**); estimated v-transform (**lower left**); implied relationship between data and volatility proxy variable (**lower right**).

Using (26), the implied volatility proxy profile function $\hat{g}_T$ can be constructed and is found to lie just below the line *y* = *x*, as shown in the upper-right panel. The change point is estimated to be $\hat{\mu}_T = 0.06$. We can also estimate an implied volatility proxy transformation in the equivalence class defined by $\hat{g}_T$ and $\hat{\mu}_T$. We estimate the transformation *T* in (5) by taking $\hat{T}(x) = \Phi^{-1}\big(\mathcal{V}_{\hat{\theta}^{(V)}}\big(F_X\big(x; \hat{\theta}^{(M)}\big)\big)\big)$. In the lower-left panel of Figure 6, we show the empirical v-transform formed from the data $(x_t, \hat{T}(x_t))$ together with the fitted parametric v-transform $\mathcal{V}_{\hat{\theta}^{(V)}}$. We recall from Section 1 that the empirical v-transform is the plot of the points $(u_t, v_t)$, where $u_t = F_n^{(X)}(x_t)$ and $v_t = F_n^{(\hat{T}(X))}(\hat{T}(x_t))$ are values of the empirical distribution functions of the data and the transformed data. The empirical v-transform and the fitted parametric v-transform show a good degree of correspondence. The lower-right panel of Figure 6 shows the volatility proxy transformation $\hat{T}(x)$ as a function of *x*, superimposed on the points $(x_t, \Phi^{-1}(v_t))$. Using the curve, we can compare the effects of, for example, a log-return (×100) of −10 and a log-return of 10. For the fitted model, these are 1.55 and 1.66, showing that the up movement is associated with slightly higher volatility.
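The empirical v-transform is simply a pair of rank transforms. A minimal sketch (the helper name `empirical_v_transform` is ours, and the absolute value stands in for a generic volatility proxy transformation):

```python
import numpy as np
from scipy import stats

def empirical_v_transform(x, proxy):
    # Points (u_t, v_t) with u_t = F_n(x_t) and v_t = F_n(T(x_t)),
    # using scaled ranks k/(n + 1) as the empirical df values
    x = np.asarray(x, dtype=float)
    n = len(x)
    u = stats.rankdata(x) / (n + 1)
    v = stats.rankdata(proxy(x)) / (n + 1)
    return u, v

# Illustration with simulated data and |x| as the proxy transformation
rng = np.random.default_rng(5)
x = rng.standard_normal(600)
u, v = empirical_v_transform(x, proxy=np.abs)
```

Plotting `v` against `u` for real returns produces the characteristic v-shaped cloud shown in the lower-left panel of Figure 6.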

As a comparison to the VT-ARMA model, we fit standard GARCH(1,1) models using Student-t and generalized error distributions for the innovations; these are standard choices available in the popular rugarch package in R. The generalized error distribution (GED) contains the normal and Laplace as special cases, as well as a model with tail behaviour similar to that of the Weibull; note, however, that by the theory of Mikosch and Stărică (2000), the tails of the marginal distribution of the GARCH decay according to a power law in both cases. The results in Table 3 show that the VT(2)-ARMA(1,1) models with Laplace and double-Weibull marginal distributions outperform both GARCH models in terms of AIC values.

Figure 7 shows the in-sample 95% conditional value-at-risk (VaR) estimate based on the VT(2)-ARMA(1,1) model which has been calculated using (22). For comparison, a dashed line shows the corresponding estimate for the GARCH(1,1) model with GED innovations.


**Figure 7.** Plot of estimated 95% value-at-risk (VaR) for Bitcoin return data superimposed on log returns. Solid line shows VaR estimated using the VT(2)-ARMA(1,1) model combined with a double-Weibull marginal distribution; the dashed line shows VaR estimated using a GARCH(1,1) model with GED innovation distribution.

Finally, we carry out an out-of-sample comparison of conditional VaR estimates using the same two models. In this analysis, the models are estimated daily throughout the 2016–2019 period using a 1000-day moving data window, and one-step-ahead VaR forecasts are calculated. The VT-ARMA model gives 47 exceptions of the 95% VaR and 11 exceptions of the 99% VaR, compared with expected numbers of 52 and 10 for a 1043-day sample, while the GARCH model leads to 57 and 12 exceptions; both models pass binomial tests for these exception counts. In a follow-up paper (Bladt and McNeil 2020), we conduct more extensive out-of-sample backtests for models using v-transforms and copula processes and show that they rival and often outperform forecast models from the extended GARCH family.
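The binomial exception tests can be reproduced from the counts quoted above; a minimal sketch (the helper name `var_exception_pvalue` is ours; under a correctly calibrated model, the number of exceptions of the level-α VaR in *n* days is Binomial(*n*, 1 − α)):

```python
from scipy import stats

def var_exception_pvalue(n_exceptions, n_days, level):
    # Two-sided binomial test of the observed VaR exception count
    return stats.binomtest(n_exceptions, n_days, 1.0 - level).pvalue

for model, (k95, k99) in [("VT-ARMA", (47, 11)), ("GARCH", (57, 12))]:
    p95 = var_exception_pvalue(k95, 1043, 0.95)
    p99 = var_exception_pvalue(k99, 1043, 0.99)
    print(f"{model}: p = {p95:.2f} (95% VaR), p = {p99:.2f} (99% VaR)")
```

All four *p*-values are clearly non-significant, consistent with the statement that both models pass.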

## **6. Conclusions**

This paper has proposed a new approach to volatile financial time series in which v-transforms are used to describe the relationship between quantiles of the return distribution and quantiles of the distribution of a predictable volatility proxy variable. We have characterized v-transforms mathematically and shown that the stochastic inverse of a v-transform may be used to construct stationary models for return series where arbitrary marginal distributions may be coupled with dynamic copula models for the serial dependence in the volatility proxy.

The construction was illustrated using the serial dependence model implied by a Gaussian ARMA process. The resulting class of VT-ARMA processes is able to capture the important features of financial return series, including near-zero serial correlation (white noise behaviour) and volatility clustering. Moreover, the models are relatively straightforward to estimate, building on the classical maximum-likelihood estimation of an ARMA model using the Kalman filter. This can be accomplished in the stepwise manner that is typical in copula modelling or through joint modelling of the marginal and copula process. The resulting models yield insights into the way that volatility responds to returns of different magnitude and sign and can give estimates of unconditional and conditional quantiles (VaR) for practical risk measurement purposes.

There are many possible uses for VT-ARMA copula processes. Because we have complete control over the marginal distribution, they are very natural candidates for the innovation distribution in other time series models. For example, they could be applied to the innovations of an ARMA model to obtain ARMA models with VT-ARMA errors; this might be particularly appropriate for longer interval returns, such as weekly or monthly returns, where some serial dependence is likely to be present in the raw return data.

Clearly, we could use other copula processes for the volatility PIT process (*Vt*). The VT-ARMA copula process has some limitations: the radial symmetry of the underlying Gaussian copula means that the serial dependence between large values of the volatility proxy must mirror the serial dependence between small values; moreover, this copula does not admit tail dependence in either tail and it seems plausible that very large values of the volatility proxy might have a tendency to occur in succession.

To extend the class of models based on v-transforms, we can look for models for the volatility PIT process (*Vt*) with higher dimensional marginal distributions given by asymmetric copulas with upper tail dependence. First-order Markov copula models as developed in Chen and Fan (2006) can give asymmetry and tail dependence, but they cannot model the dependencies at longer lags that we find in empirical data. D-vine copula models can model higher-order Markov dependencies and Bladt and McNeil (2020) show that this is a promising alternative specification for the volatility PIT process.

**Funding:** This research received no external funding.

**Data Availability Statement:** The analyses were carried out using R 4.0.2 (R Core Team, 2020) and the tscopula package (Alexander J. McNeil and Martin Bladt, 2020) available at https://github.com/ajmcneil/tscopula. The full reproducible code and the data are available at https://github.com/ajmcneil/vtarma.

**Acknowledgments:** The author is grateful for valuable input from a number of researchers including Hansjoerg Albrecher, Martin Bladt, Valérie Chavez-Demoulin, Alexandra Dias, Christian Genest, Michael Gordy, Yen Hsiao Lok, Johanna Nešlehová, Andrew Patton, and Ruodu Wang. Particular thanks are due to Martin Bladt for providing the Bitcoin data and advice on the data analysis. The paper was completed while the author was a guest at the Forschungsinstitut für Mathematik (FIM) at ETH Zurich.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **Appendix A. Proofs**

*Appendix A.1. Proof of Proposition 1*

We observe that, for *x* ≥ 0,

$$F_{T(X)}(x) = P\big(\mu_T - T_1^{-1}(x) \le X_t \le \mu_T + T_2^{-1}(x)\big) = F_X\big(\mu_T + T_2^{-1}(x)\big) - F_X\big(\mu_T - T_1^{-1}(x)\big).$$

Moreover, $X_t \le \mu_T \Leftrightarrow U \le F_X(\mu_T)$, and in this case

$$\begin{aligned} V = F_{T(X)}(T(X_t)) = F_{T(X)}\big(T_1(\mu_T - X_t)\big) &= F_X\big(\mu_T + T_2^{-1}\big(T_1(\mu_T - X_t)\big)\big) - F_X(X_t) \\ &= F_X\big(\mu_T + g_T\big(\mu_T - F_X^{-1}(U)\big)\big) - U. \end{aligned}$$

Similarly, $X_t > \mu_T \Leftrightarrow U > F_X(\mu_T)$, and in this case

$$\begin{aligned} V = F_{T(X)}(T(X_t)) &= F_{T(X)}\big(T_2(X_t - \mu_T)\big) = F_X(X_t) - F_X\big(\mu_T - T_1^{-1}\big(T_2(X_t - \mu_T)\big)\big) \\ &= U - F_X\big(\mu_T - g_T^{-1}\big(F_X^{-1}(U) - \mu_T\big)\big). \end{aligned}$$

#### *Appendix A.2. Proof of Proposition 2*

The cumulative distribution function $F_0(x)$ of the double exponential distribution is equal to $0.5e^{x}$ for $x \le 0$ and $1 - 0.5e^{-x}$ for $x > 0$. It is straightforward to verify that

$$F_X(x; \gamma) = \begin{cases} \delta e^{\gamma x} & x \le 0 \\ 1 - (1 - \delta)e^{-x/\gamma} & x > 0 \end{cases} \qquad \text{and} \qquad F_X^{-1}(u; \gamma) = \begin{cases} \frac{1}{\gamma}\ln\left(\frac{u}{\delta}\right) & u \le \delta \\ -\gamma \ln\left(\frac{1-u}{1-\delta}\right) & u > \delta. \end{cases}$$

When $g_T(x) = kx^{\xi}$, we obtain for $u \le \delta$ that

$$\mathcal{V}_{\delta,\kappa,\xi}(u) = F_X\left(\frac{k}{\gamma^{\xi}}\left(\ln\left(\frac{\delta}{u}\right)\right)^{\xi}; \gamma\right) - u = 1 - u - (1 - \delta)\exp\left(-\frac{k}{\gamma^{\xi+1}}\left(-\ln\left(\frac{u}{\delta}\right)\right)^{\xi}\right).$$

For *u* > δ, we make a similar calculation.
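As a sanity check on the displayed formulas, the following snippet verifies numerically that the stated quantile function inverts the distribution function, under the parameterization $F_X(x;\gamma) = \delta e^{\gamma x}$ for $x \le 0$ and $1 - (1-\delta)e^{-x/\gamma}$ for $x > 0$, so that $F_X(0) = \delta$ marks the change point:

```python
import math

def F_X(x, gamma, delta):
    # Skewed double-exponential CDF with change point at 0
    if x <= 0:
        return delta * math.exp(gamma * x)
    return 1.0 - (1.0 - delta) * math.exp(-x / gamma)

def F_X_inv(u, gamma, delta):
    # Piecewise inverse: logarithms undo the two exponential branches
    if u <= delta:
        return math.log(u / delta) / gamma
    return -gamma * math.log((1.0 - u) / (1.0 - delta))

gamma, delta = 1.4, 0.45
print([round(F_X(F_X_inv(q, gamma, delta), gamma, delta), 10)
       for q in (0.05, 0.45, 0.99)])
```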

#### *Appendix A.3. Proof of Theorem 1*

It is easy to check that Equation (10) fulfills the list of properties in Lemma 2. We concentrate on showing that a function that has these properties must be of the form (10). It helps to consider the picture of a v-transform in Figure 3. Consider the lines *v* = 1 − *u* and *v* = δ − *u* for *u* ∈ [0, δ]. The areas above the former and below the latter are shaded gray.

The left branch of the v-transform must start at (0, 1), end at (δ, 0), and lie strictly between these lines on (0, δ). Suppose, on the contrary, that *v* = V(*u*) ≤ δ − *u* for some *u* ∈ (0, δ). This would imply that the dual point *u*<sup>∗</sup> = *u* + *v* satisfies *u*<sup>∗</sup> ≤ δ, which contradicts the requirement that *u*<sup>∗</sup> lie on the opposite side of the fulcrum. Similarly, if *v* = V(*u*) ≥ 1 − *u* for some *u* ∈ (0, δ), then *u*<sup>∗</sup> ≥ 1, which is also not possible: if *u*<sup>∗</sup> = 1, then *u* = 0, a contradiction.

Thus, the curve that links (0, 1) and (δ, 0) must take the form

$$\mathcal{V}(u) = (\delta - u)\Psi\Big(\frac{u}{\delta}\Big) + (1 - u)\Big(1 - \Psi\Big(\frac{u}{\delta}\Big)\Big) = (1 - u) - (1 - \delta)\Psi\Big(\frac{u}{\delta}\Big)$$

where Ψ(0) = 0, Ψ(1) = 1 and 0 < Ψ(*x*) < 1 for *x* ∈ (0, 1). Clearly, Ψ must be continuous to satisfy the conditions of the v-transform. It must also be strictly increasing. If it were not, then the derivative would satisfy V′(*u*) ≥ −1 somewhere, which is not possible: if at any point *u* ∈ (0, δ) we have V′(*u*) = −1, then the opposite branch of the v-transform would have to jump vertically at the dual point *u*<sup>∗</sup>, contradicting continuity; if V′(*u*) > −1, then V would have to be decreasing at *u*<sup>∗</sup>, which is also a contradiction.

Thus, Ψ fulfills the conditions of a continuous, strictly increasing distribution function on [0, 1], and we have established the necessary form for the left branch equation. To find the value of the right branch equation at *u* > δ, we invoke the square property. Since V(*u*) = V(*u*<sup>∗</sup>) = V(*u* − V(*u*)), we need to solve the equation *x* = V(*u* − *x*) for *x* ∈ [0, 1] using the formula for the left branch equation of V. Thus, we solve $x = 1 - (u - x) - (1 - \delta)\Psi\left(\frac{u - x}{\delta}\right)$ for *x*, and this yields the right branch equation as asserted.
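The two-branch structure derived above can be verified numerically: take any continuous, strictly increasing distribution function Ψ on [0, 1] (here Ψ(x) = x², an arbitrary illustrative choice), build the left branch, and recover the right branch by solving x = V(u − x); the square property V(u) = V(u − V(u)) then holds by construction.

```python
from scipy.optimize import brentq

DELTA = 0.4

def psi(x):
    # Any continuous, strictly increasing df on [0, 1]; x^2 is illustrative
    return x * x

def v_left(u):
    # Left branch: V(u) = (1 - u) - (1 - delta) * Psi(u / delta)
    return (1.0 - u) - (1.0 - DELTA) * psi(u / DELTA)

def v_transform(u):
    if u <= DELTA:
        return v_left(u)
    # Right branch: solve x = v_left(u - x); the root lies in
    # [u - delta, u] so that u - x stays inside [0, delta]
    return brentq(lambda x: x - v_left(u - x), u - DELTA, u)

# Square property: V(u) = V(u - V(u)) on the right branch
vv = v_transform(0.7)
print(vv, v_left(0.7 - vv))
```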

*Risks* **2021**, *9*, 14

#### *Appendix A.4. Proof of Proposition 3*

Let $g_T(x)$ be as given in (11) and let $u(x) = F_X(\mu_T - x)$. For $x \in \mathbb{R}^{+}$, $u(x)$ is a continuous, strictly decreasing function of $x$ starting at $u(0) = \delta$ and decreasing to 0. Since Ψ is a cumulative distribution function, it follows that

$$u^{*}(x) = u(x) + \mathcal{V}(u(x)) = 1 - (1 - \delta)\Psi\left(\frac{u(x)}{\delta}\right)$$

is a continuous, strictly increasing function starting at $u^{*}(0) = \delta$ and increasing to 1. Hence, $g_T(x) = F_X^{-1}(u^{*}(x)) - \mu_T$ is continuous and strictly increasing on $\mathbb{R}^{+}$ with $g_T(0) = 0$, as required of the profile function of a volatility proxy transformation. It remains to check that, if we insert (11) in (4), we recover $\mathcal{V}(u)$, which is straightforward.

#### *Appendix A.5. Proof of Theorem 2*

1. For any 0 ≤ *v* ≤ 1, the event {*U* ≤ *u*, *V* ≤ *v*} has zero probability for $u < \mathcal{V}^{-1}(v)$. For $u \ge \mathcal{V}^{-1}(v)$, we have

$$\{U \le u, V \le v\} = \left\{\mathcal{V}^{-1}(v) \le U \le \min\left(u, \mathcal{V}^{-1}(v) + v\right)\right\}$$

and hence $P(U \le u, V \le v) = \min\left(u, \mathcal{V}^{-1}(v) + v\right) - \mathcal{V}^{-1}(v)$, from which (12) follows.

2. We can write $P(U \le u, V \le v) = C(u, v)$, where *C* is the copula given by (12). It follows from the basic properties of a copula that

$$P(U \le u \mid V = v) = \frac{\partial}{\partial v} C(u, v) = \begin{cases} 0 & u < \mathcal{V}^{-1}(v) \\ -\frac{\mathrm{d}}{\mathrm{d}v}\mathcal{V}^{-1}(v) & \mathcal{V}^{-1}(v) \le u < \mathcal{V}^{-1}(v) + v \\ 1 & u \ge \mathcal{V}^{-1}(v) + v. \end{cases}$$

This is the distribution function of a binomial distribution, and it must be the case that $\Delta(v) = -\frac{\mathrm{d}}{\mathrm{d}v}\mathcal{V}^{-1}(v)$. Equation (14) follows by differentiating the inverse.

3. Finally, $E(\Delta(V)) = \delta$ is easily verified by making the substitution $x = \mathcal{V}^{-1}(v)$ in the integral $E(\Delta(V)) = -\int_0^1 \frac{\mathrm{d}v}{\mathcal{V}'(\mathcal{V}^{-1}(v))}$.

#### *Appendix A.6. Proof of Proposition 4*

It is obviously true that $\mathcal{V}\left(\mathcal{V}^{-1}(v, W)\right) = v$ for any *W*. Hence, $\mathcal{V}(U) = \mathcal{V}\left(\mathcal{V}^{-1}(V, W)\right) = V$. The uniformity of *U* follows from the fact that

$$P\left(\mathcal{V}^{-1}(V, W) = \mathcal{V}^{-1}(v) \mid V = v\right) = P(W \le \Delta(v) \mid V = v) = P(W \le \Delta(v)) = \Delta(v).$$

Hence, the pair of random variables (*U*, *V*) has the conditional distribution (13) and is distributed according to the copula *C* in (12).
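Proposition 4 is easy to check by simulation for the linear v-transform, where $\mathcal{V}^{-1}(v) = \delta(1 - v)$ and $\Delta(v) = \delta$; a minimal sketch:

```python
import numpy as np
from scipy import stats

delta, n = 0.4, 50_000
rng = np.random.default_rng(2)
V = rng.uniform(size=n)
W = rng.uniform(size=n)

# Stochastic inverse of the linear v-transform: left-branch inverse
# delta*(1 - v) with probability Delta(v) = delta, right-branch
# inverse delta + (1 - delta)*v otherwise
U = np.where(W <= delta, delta * (1.0 - V), delta + (1.0 - delta) * V)

# V(U) recovers V on both branches ...
V_of_U = np.where(U <= delta, 1.0 - U / delta, (U - delta) / (1.0 - delta))
# ... and U is uniform (Kolmogorov-Smirnov check)
ks = stats.kstest(U, "uniform")
print(f"max |V(U) - V| = {np.max(np.abs(V_of_U - V)):.2e}, "
      f"KS statistic = {ks.statistic:.4f}")
```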

#### *Appendix A.7. Proof of Theorem 3*

1. Since the event $\{V_i \le v_i\}$ is equal to the event $\{\mathcal{V}^{-1}(v_i) \le U_i \le \mathcal{V}^{-1}(v_i) + v_i\}$, we first compute the probability of a box $[a_1, b_1] \times \cdots \times [a_d, b_d]$ where $a_i = \mathcal{V}^{-1}(v_i)$ and $b_i = \mathcal{V}^{-1}(v_i) + v_i$. The standard formula for such probabilities implies that the copulas $C_V$ and $C_U$ are related by

$$C_V(v_1, \ldots, v_d) = \sum_{j_1=1}^{2} \cdots \sum_{j_d=1}^{2} (-1)^{j_1 + \cdots + j_d}\, C_U(u_{1j_1}, \ldots, u_{dj_d});$$

see, for example, McNeil et al. (2015), p. 221. Thus, the copula densities are related by

$$c_V(v_1, \ldots, v_d) = \sum_{j_1=1}^{2} \cdots \sum_{j_d=1}^{2} c_U(u_{1j_1}, \ldots, u_{dj_d}) \prod_{i=1}^{d} \frac{\mathrm{d}}{\mathrm{d}v_i}\left((-1)^{j_i} u_{ij_i}\right)$$

and the result follows if we use (14) to calculate that

$$\frac{\mathrm{d}}{\mathrm{d}v_i}\left((-1)^{j} u_{ij}\right) = \begin{cases} \frac{\mathrm{d}}{\mathrm{d}v_i}\left(-\mathcal{V}^{-1}(v_i)\right) = \Delta(v_i) & \text{if } j = 1, \\ \frac{\mathrm{d}}{\mathrm{d}v_i}\left(v_i + \mathcal{V}^{-1}(v_i)\right) = 1 - \Delta(v_i) & \text{if } j = 2. \end{cases}$$

2. For the point $(u_1, \ldots, u_d) \in [0, 1]^d$, we consider the set of events $A_i(u_i)$ defined by

$$A_i(u_i) = \begin{cases} \{U_i \le u_i\} & \text{if } u_i \le \delta \\ \{U_i > u_i\} & \text{if } u_i > \delta. \end{cases}$$

The probability $P(A_1(u_1) \cap \cdots \cap A_d(u_d))$ is the probability of an orthant defined by the point $(u_1, \ldots, u_d)$, and the copula density at this point is given by

$$c_U(u_1, \ldots, u_d) = (-1)^{\sum_{i=1}^{d} I_{\{u_i > \delta\}}}\, \frac{\partial^d}{\partial u_1 \cdots \partial u_d} P\left(\bigcap_{i=1}^{d} A_i(u_i)\right).$$

The event $A_i(u_i)$ can be written

$$A_i(u_i) = \begin{cases} \{V_i \ge \mathcal{V}(u_i), W_i \le \Delta(V_i)\} & \text{if } u_i \le \delta \\ \{V_i > \mathcal{V}(u_i), W_i > \Delta(V_i)\} & \text{if } u_i > \delta \end{cases}$$

and hence we can use Theorem 2 to write

$$P\left(\bigcap_{i=1}^{d} A_i(u_i)\right) = \int_{\mathcal{V}(u_1)}^{1} \cdots \int_{\mathcal{V}(u_d)}^{1} c_V(v_1, \ldots, v_d) \prod_{i=1}^{d} \Delta(v_i)^{I_{\{u_i \le \delta\}}}\, (1 - \Delta(v_i))^{I_{\{u_i > \delta\}}}\, \mathrm{d}v_1 \cdots \mathrm{d}v_d\,.$$

The derivative is given by

$$\frac{\partial^d}{\partial u_1 \cdots \partial u_d} P\left(\bigcap_{i=1}^{d} A_i(u_i)\right) = (-1)^d\, c_V(\mathcal{V}(u_1), \ldots, \mathcal{V}(u_d)) \prod_{i=1}^{d} p(u_i)^{I_{\{u_i \le \delta\}}}\, (1 - p(u_i))^{I_{\{u_i > \delta\}}}\, \mathcal{V}'(u_i)$$

where $p(u_i) = \Delta(\mathcal{V}(u_i))$, and hence we obtain

$$c_U(u_1, \ldots, u_d) = c_V(\mathcal{V}(u_1), \ldots, \mathcal{V}(u_d)) \prod_{i=1}^{d} (-p(u_i))^{I_{\{u_i \le \delta\}}}\, (1 - p(u_i))^{I_{\{u_i > \delta\}}}\, \mathcal{V}'(u_i).$$

It remains to verify that each of the terms in the product is identically equal to 1. For $u_i \le \delta$, this follows easily from (14), since $-p(u_i) = -\Delta(\mathcal{V}(u_i)) = 1/\mathcal{V}'(u_i)$. For $u_i > \delta$, we need an expression for the derivative of the right branch equation. Since $\mathcal{V}(u_i) = \mathcal{V}(u_i - \mathcal{V}(u_i))$, we obtain

$$\mathcal{V}'(u_i) = \mathcal{V}'(u_i - \mathcal{V}(u_i))\big(1 - \mathcal{V}'(u_i)\big) = \mathcal{V}'(u_i^{*})\big(1 - \mathcal{V}'(u_i)\big) \implies \mathcal{V}'(u_i) = \frac{\mathcal{V}'(u_i^{*})}{1 + \mathcal{V}'(u_i^{*})}$$

implying that

$$1 - p(u_i) = 1 - \Delta(\mathcal{V}(u_i)) = 1 - \Delta(\mathcal{V}(u_i^{*})) = 1 + \frac{1}{\mathcal{V}'(u_i^{*})} = \frac{1 + \mathcal{V}'(u_i^{*})}{\mathcal{V}'(u_i^{*})} = \frac{1}{\mathcal{V}'(u_i)}.$$

#### *Appendix A.8. Proof of Proposition 5*

Let $V_t = \mathcal{V}(U_t)$ and $Z_t = \Phi^{-1}(V_t)$ as usual. The process $(Z_t)$ is an ARMA process with acf ρ(*k*), and hence $(Z_{t_1}, \ldots, Z_{t_k})$ are jointly standard normally distributed with correlation matrix $P(t_1, \ldots, t_k)$. This implies that the joint distribution function of $(V_{t_1}, \ldots, V_{t_k})$ is the Gaussian copula with density $c^{\mathrm{Ga}}_{P(t_1, \ldots, t_k)}$, and hence, by Part 2 of Theorem 3, the joint distribution function of $(U_{t_1}, \ldots, U_{t_k})$ is the copula with density $c^{\mathrm{Ga}}_{P(t_1, \ldots, t_k)}(\mathcal{V}(u_1), \ldots, \mathcal{V}(u_k))$.

#### *Appendix A.9. Proof of Proposition 6*

We split the integral in (18) into four parts. First, observe that, by making the substitutions $v_1 = \mathcal{V}(u_1) = 1 - u_1/\delta$ and $v_2 = \mathcal{V}(u_2) = 1 - u_2/\delta$ on [0, δ] × [0, δ], we get

$$\begin{aligned} \int_0^{\delta}\int_0^{\delta} u_1 u_2\, c^{\mathrm{Ga}}_{\rho(k)}(\mathcal{V}(u_1), \mathcal{V}(u_2))\, \mathrm{d}u_1 \mathrm{d}u_2 &= \delta^4 \int_0^1\int_0^1 (1 - v_1)(1 - v_2)\, c^{\mathrm{Ga}}_{\rho(k)}(v_1, v_2)\, \mathrm{d}v_1 \mathrm{d}v_2 \\ &= \delta^4 E\big((1 - V_t)(1 - V_{t+k})\big) \\ &= \delta^4\big(1 - E(V_t) - E(V_{t+k}) + E(V_t V_{t+k})\big) = \delta^4 E(V_t V_{t+k}) \end{aligned}$$

where $(V_t, V_{t+k})$ has joint distribution given by the Gaussian copula $C^{\mathrm{Ga}}_{\rho(k)}$. Similarly, by making the substitutions $v_1 = \mathcal{V}(u_1) = 1 - u_1/\delta$ and $v_2 = \mathcal{V}(u_2) = (u_2 - \delta)/(1 - \delta)$ on [0, δ] × [δ, 1], we get

$$\begin{aligned} &\int_0^{\delta}\int_{\delta}^{1} u_1 u_2\, c^{\mathrm{Ga}}_{\rho(k)}(\mathcal{V}(u_1), \mathcal{V}(u_2))\, \mathrm{d}u_1 \mathrm{d}u_2 \\ &\quad= \int_0^1\int_0^1 \delta^2(1 - \delta)(1 - v_1)\big(\delta + (1 - \delta)v_2\big)\, c^{\mathrm{Ga}}_{\rho(k)}(v_1, v_2)\, \mathrm{d}v_1 \mathrm{d}v_2 \\ &\quad= \delta^3(1 - \delta)E(1 - V_t) + \delta^2(1 - \delta)^2 E\big((1 - V_t)V_{t+k}\big) = \frac{\delta^2(1 - \delta)}{2} - \delta^2(1 - \delta)^2 E(V_t V_{t+k}) \end{aligned}$$

and the same value is obtained on the quadrant [δ, 1] × [0, δ]. Finally, making the substitutions $v_1 = \mathcal{V}(u_1) = (u_1 - \delta)/(1 - \delta)$ and $v_2 = \mathcal{V}(u_2) = (u_2 - \delta)/(1 - \delta)$ on [δ, 1] × [δ, 1], we get

$$\begin{aligned} &\int_{\delta}^{1}\int_{\delta}^{1} u_1 u_2\, c^{\mathrm{Ga}}_{\rho(k)}(\mathcal{V}(u_1), \mathcal{V}(u_2))\, \mathrm{d}u_1 \mathrm{d}u_2 \\ &\quad= \int_0^1\int_0^1 (1 - \delta)^2\big(\delta + (1 - \delta)v_1\big)\big(\delta + (1 - \delta)v_2\big)\, c^{\mathrm{Ga}}_{\rho(k)}(v_1, v_2)\, \mathrm{d}v_1 \mathrm{d}v_2 \\ &\quad= \int_0^1\int_0^1 (1 - \delta)^2\Big(\delta^2 + \delta(1 - \delta)v_1 + \delta(1 - \delta)v_2 + (1 - \delta)^2 v_1 v_2\Big)\, c^{\mathrm{Ga}}_{\rho(k)}(v_1, v_2)\, \mathrm{d}v_1 \mathrm{d}v_2 \\ &\quad= \delta^2(1 - \delta)^2 + \delta(1 - \delta)^3 E(V_t) + \delta(1 - \delta)^3 E(V_{t+k}) + (1 - \delta)^4 E(V_t V_{t+k}) \\ &\quad= \delta(1 - \delta)^2 + (1 - \delta)^4 E(V_t V_{t+k}). \end{aligned}$$

Collecting all of these terms together yields

$$\int_0^1\int_0^1 u_1 u_2\, c^{\mathrm{Ga}}_{\rho(k)}(\mathcal{V}(u_1), \mathcal{V}(u_2))\, \mathrm{d}u_1 \mathrm{d}u_2 = \delta(1 - \delta) + (2\delta - 1)^2 E(V_t V_{t+k})$$

and, since $\rho_S(Z_t, Z_{t+k}) = 12E(V_t V_{t+k}) - 3$, it follows that

$$\begin{aligned} \rho_S(U_t, U_{t+k}) &= 12E(U_t U_{t+k}) - 3 \\ &= 12 \int_0^1\int_0^1 u_1 u_2\, c^{\mathrm{Ga}}_{\rho(k)}(\mathcal{V}(u_1), \mathcal{V}(u_2))\, \mathrm{d}u_1 \mathrm{d}u_2 - 3 \\ &= 12\delta(1 - \delta) + 12(2\delta - 1)^2 E(V_t V_{t+k}) - 3 \\ &= 12\delta(1 - \delta) + (2\delta - 1)^2\big(\rho_S(Z_t, Z_{t+k}) + 3\big) - 3 \\ &= (2\delta - 1)^2\, \rho_S(Z_t, Z_{t+k})\,. \end{aligned}$$

The value of Spearman's rho $\rho_S(Z_t, Z_{t+k})$ for the bivariate Gaussian distribution is well known, namely $\rho_S(Z_t, Z_{t+k}) = \frac{6}{\pi}\arcsin\left(\frac{\rho(k)}{2}\right)$; see, for example, McNeil et al. (2015).
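The relation $\rho_S(U_t, U_{t+k}) = (2\delta - 1)^2 \rho_S(Z_t, Z_{t+k})$ can be confirmed by Monte Carlo, generating the pair $(U_t, U_{t+k})$ through the stochastic inverse of the linear v-transform and using the Gaussian identity $\rho_S = \frac{6}{\pi}\arcsin(\rho/2)$; a sketch:

```python
import numpy as np
from scipy import stats

delta, rho, n = 0.2, 0.6, 200_000
rng = np.random.default_rng(8)

# Bivariate Gaussian pair (Z_t, Z_{t+k}) with correlation rho(k)
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
v1, v2 = stats.norm.cdf(z1), stats.norm.cdf(z2)

def stochastic_inverse(v, w, d):
    # Linear v-transform: left-branch inverse with probability Delta(v) = d
    return np.where(w <= d, d * (1.0 - v), d + (1.0 - d) * v)

u1 = stochastic_inverse(v1, rng.uniform(size=n), delta)
u2 = stochastic_inverse(v2, rng.uniform(size=n), delta)

rho_emp, _ = stats.spearmanr(u1, u2)
rho_theory = (2 * delta - 1) ** 2 * (6 / np.pi) * np.arcsin(rho / 2)
print(f"empirical {rho_emp:.4f} vs theoretical {rho_theory:.4f}")
```

Note how the factor $(2\delta - 1)^2$ shrinks the serial rank correlation of $(U_t)$ towards zero as δ approaches 1/2, which is the white-noise effect exploited by the model.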

#### *Appendix A.10. Proof of Proposition 7*

The conditional density satisfies

$$f_{U_t \mid U_{t-1}, \ldots, U_1}(u \mid u_{t-1}, \ldots, u_1) = \frac{c_{U_t}(u_1, \ldots, u_{t-1}, u)}{c_{U_{t-1}}(u_1, \ldots, u_{t-1})} = \frac{c^{\mathrm{Ga}}_{P_t}(\mathcal{V}(u_1), \ldots, \mathcal{V}(u_{t-1}), \mathcal{V}(u))}{c^{\mathrm{Ga}}_{P_{t-1}}(\mathcal{V}(u_1), \ldots, \mathcal{V}(u_{t-1}))}\,.$$

The Gaussian copula density is given in general by

$$c_P^{\mathrm{Ga}}(v_1, \ldots, v_d) = \frac{f_{\mathbf{Z}}(\Phi^{-1}(v_1), \ldots, \Phi^{-1}(v_d))}{\prod_{i=1}^{d} \phi(\Phi^{-1}(v_i))}$$

where $\mathbf{Z}$ is a multivariate Gaussian vector with standard normal margins and correlation matrix *P*. Hence, it follows that we can write

$$\begin{aligned} f_{U_t \mid U_{t-1}, \ldots, U_1}(u \mid u_{t-1}, \ldots, u_1) &= \frac{f_{\mathbf{Z}_t}\big(\Phi^{-1}(\mathcal{V}(u_1)), \ldots, \Phi^{-1}(\mathcal{V}(u_{t-1})), \Phi^{-1}(\mathcal{V}(u))\big)}{f_{\mathbf{Z}_{t-1}}\big(\Phi^{-1}(\mathcal{V}(u_1)), \ldots, \Phi^{-1}(\mathcal{V}(u_{t-1}))\big)\, \phi\big(\Phi^{-1}(\mathcal{V}(u))\big)} \\ &= \frac{f_{Z_t \mid Z_{t-1}, \ldots, Z_1}\big(\Phi^{-1}(\mathcal{V}(u)) \mid \Phi^{-1}(\mathcal{V}(u_{t-1})), \ldots, \Phi^{-1}(\mathcal{V}(u_1))\big)}{\phi\big(\Phi^{-1}(\mathcal{V}(u))\big)} \end{aligned}$$

where $f_{Z_t \mid Z_{t-1}, \ldots, Z_1}$ is the conditional density of the ARMA process, from which (20) follows easily.
