2.2.3. Copula Function

Copula functions describe nonlinear relationships among multivariate data and can be used to model nonlinearly interrelated multivariate samples [53]. They couple a joint distribution to its marginal distributions. Copula methods have recently been applied extensively in finance, the natural sciences, and engineering. To analyze the spatial combination of MSI and SAI, the joint distribution of the two indices is calculated with copula functions to obtain the suitability degree of winter tourism regions.

Copulas provide a method for measuring the dependence between variables. Sklar [54] described the functional relationship between a copula *C* and a joint cumulative distribution function. Consider the bivariate case, in which *F<sub>X</sub>*(*x*) = *P*[*X* ≤ *x*] and *F<sub>Y</sub>*(*y*) = *P*[*Y* ≤ *y*] are the cumulative distribution functions of the random variables *X* and *Y*, and their joint distribution is *F<sub>XY</sub>*(*x*, *y*) = *P*[*X* ≤ *x*, *Y* ≤ *y*]. Then there exists a function *C*, unique when the marginals are continuous, such that *F<sub>XY</sub>*(*x*, *y*) = *C*(*F<sub>X</sub>*(*x*), *F<sub>Y</sub>*(*y*)); *C* is called a copula function. Thus, once *C* is known, the joint distribution *F<sub>XY</sub>*(*x*, *y*) can be derived from the marginal distributions *F<sub>X</sub>*(*x*) and *F<sub>Y</sub>*(*y*). Furthermore, writing the probabilities as *u* = *F<sub>X</sub>*(*x*) and *v* = *F<sub>Y</sub>*(*y*), we have *x* = *F<sub>X</sub>*<sup>−1</sup>(*u*) and *y* = *F<sub>Y</sub>*<sup>−1</sup>(*v*). Then *F<sub>XY</sub>*(*x*, *y*) = *F<sub>XY</sub>*(*F<sub>X</sub>*<sup>−1</sup>(*u*), *F<sub>Y</sub>*<sup>−1</sup>(*v*)) = *C*(*u*, *v*) is a copula (*Sklar's theorem*) if the two-dimensional function *C* : [0, 1] × [0, 1] → [0, 1] meets three conditions:
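In the standard textbook definition of a bivariate copula (see, e.g., Nelsen [57]), the three conditions are usually stated as below. This is the general formulation, given as a reference sketch; the symbols *u*<sub>1</sub>, *u*<sub>2</sub>, *v*<sub>1</sub>, and *v*<sub>2</sub> are introduced here for illustration only.

$$
\begin{aligned}
&\text{(i)}\quad C(u, 0) = C(0, v) = 0, \\
&\text{(ii)}\quad C(u, 1) = u, \qquad C(1, v) = v, \\
&\text{(iii)}\quad C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0 \quad \text{for all } u_1 \le u_2,\ v_1 \le v_2 .
\end{aligned}
$$

Condition (i) states that *C* is grounded, condition (ii) that its margins are uniform, and condition (iii) that every rectangle in the unit square receives non-negative probability.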


There are two widely used categories of copulas: elliptical and Archimedean [55]. Elliptical copulas, such as the Gaussian and *t*-copulas, are derived from known multivariate distributions. Archimedean copulas are constructed from different generators (*ϕ*) according to the definition of the Archimedean copula [56]. At present, Archimedean copulas are widely used in practical applications because they can model dependence in arbitrarily high dimensions with only one parameter, which governs the strength of dependence. The one-parameter Archimedean copula is expressed as:

$$C(u, v; \theta) = \varphi^{-1}[\varphi(u; \theta) + \varphi(v; \theta); \theta] \tag{7}$$

where *C* is a copula function, *ϕ* is the generator function, and *ϕ*<sup>−1</sup> is its pseudoinverse. The generator is a mapping *ϕ* : [0, 1] × Θ → [0, ∞) with *ϕ*(1) = 0, and *θ* is the generator parameter within the parameter space Θ. Further important copula features and the theoretical background can be found in Nelsen [57], who provides a detailed introduction to copula functions.
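As an illustration of Eq. (7), the following minimal sketch builds a copula from a generator and its inverse. It assumes the Clayton family (generator (*t*<sup>−θ</sup> − 1)/*θ* with *θ* > 0) purely as an example; the function names are hypothetical and the choice of family is an assumption, not necessarily one used in this study.

```python
def clayton_generator(t, theta):
    """Clayton generator phi(t; theta) = (t**(-theta) - 1) / theta, so phi(1) = 0.

    Assumes 0 < t <= 1 and theta > 0 (t = 0 would divide by zero).
    """
    return (t ** (-theta) - 1.0) / theta


def clayton_generator_inverse(s, theta):
    """Inverse generator phi^{-1}(s; theta) = (1 + theta * s)**(-1 / theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)


def archimedean_copula(u, v, theta, phi, phi_inv):
    """Eq. (7): C(u, v; theta) = phi^{-1}[phi(u; theta) + phi(v; theta); theta]."""
    return phi_inv(phi(u, theta) + phi(v, theta), theta)


def clayton_closed_form(u, v, theta):
    """Closed-form Clayton copula, used here only as a cross-check."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


# Joint non-exceedance probability for marginal probabilities u = 0.3, v = 0.7.
c_value = archimedean_copula(0.3, 0.7, 2.0,
                             clayton_generator, clayton_generator_inverse)
```

Passing the generator pair as arguments keeps `archimedean_copula` independent of the family, so another one-parameter family can be substituted by supplying its own generator and inverse.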

Kendall's *τ* is a rank correlation coefficient. It can be calculated from the available observations as:

$$\tau = \binom{N}{2}^{-1} \sum_{j=1}^{N} \sum_{i=1}^{j} \operatorname{sign}[(x_i - x_j)(y_i - y_j)] \tag{8}$$

where sign = 0 if (*x<sub>i</sub>* − *x<sub>j</sub>*)(*y<sub>i</sub>* − *y<sub>j</sub>*) = 0, sign = 1 if (*x<sub>i</sub>* − *x<sub>j</sub>*)(*y<sub>i</sub>* − *y<sub>j</sub>*) > 0, and sign = −1 if (*x<sub>i</sub>* − *x<sub>j</sub>*)(*y<sub>i</sub>* − *y<sub>j</sub>*) < 0, with *i*, *j* = 1, 2, ..., *N* [58]. The corresponding copula parameter *θ* can be calculated using the functions in the right column of Table 3.
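The sample estimate in Eq. (8) can be sketched directly in a few lines. This is a plain, unoptimized illustration with hypothetical function names; it applies no tie correction, so it may differ from library routines that report a tie-adjusted variant of Kendall's *τ*.

```python
def sign(z):
    """Return 1, 0, or -1 according to the sign of z."""
    return (z > 0) - (z < 0)


def kendall_tau(x, y):
    """Kendall's tau as in Eq. (8), without any correction for ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    total = 0
    for j in range(n):
        # The i == j term of Eq. (8) is always zero, so only i < j is visited.
        for i in range(j):
            total += sign((x[i] - x[j]) * (y[i] - y[j]))
    # Divide by the number of distinct pairs, N choose 2.
    return total / (n * (n - 1) / 2)


# Small made-up sample: five concordant pairs and one discordant pair.
tau_example = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```

The double loop costs O(*N*²) operations, which is usually acceptable for samples of moderate size.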


**Table 3.** Archimedean Copula functions, their generators, and connections to Kendall's *τ* considered for this study.
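As a hedged illustration of how *θ* follows from Kendall's *τ*, the sketch below uses the well-known closed-form relations for two common one-parameter Archimedean families, Clayton (*τ* = *θ*/(*θ* + 2)) and Gumbel-Hougaard (*τ* = 1 − 1/*θ*). Whether these particular families are among those listed in Table 3 is an assumption made here for the example, and the function names are hypothetical.

```python
def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula (0 < tau < 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("this Clayton inversion assumes 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta for the Gumbel-Hougaard copula (0 <= tau < 1)."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("the Gumbel-Hougaard inversion assumes 0 <= tau < 1")
    return 1.0 / (1.0 - tau)


# For tau = 0.5 both relations happen to give theta = 2.
theta_clayton = clayton_theta_from_tau(0.5)
theta_gumbel = gumbel_theta_from_tau(0.5)
```

Families whose relation to *τ* has no closed-form inverse, such as the Frank copula, would need a numerical root finder instead.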

In this study, MSI and SAI are used to describe the suitability degree of winter tourism. The joint probability of the two indices in space is analyzed with copula functions. The root mean square error (RMSE) between the theoretical and the empirical joint non-exceedance probabilities is used to identify the most appropriate copula function [59]. The RMSE is determined using all of the observed samples:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ P_c(i) - P_o(i) \right]^2} \tag{9}$$

where *P<sub>c</sub>*(*i*) is the joint probability of the *i*th joint observation calculated by the copula function, *N* is the number of observations, and *P<sub>o</sub>*(*i*) is the empirical joint non-exceedance probability, calculated as follows [58,60]:

$$F_{XY}(x_i, y_i) = P(X \le x_i, Y \le y_i) = \frac{\sum_{m=1}^{i} \sum_{j=1}^{i} N_{mj} - 0.44}{N + 0.12} \tag{10}$$

where *N<sub>mj</sub>* represents the number of occurrences of (*x<sub>m</sub>*, *y<sub>j</sub>*) with *x<sub>m</sub>* < *x<sub>i</sub>* and *y<sub>j</sub>* < *y<sub>i</sub>*, with *i* = 1, ..., *N* and *m*, *j* ∈ [1, *i*]. *N* is the sample size. In the calculation process, the pairs (*x<sub>m</sub>*, *y<sub>j</sub>*) are arranged in ascending order with respect to *x<sub>m</sub>*.
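Eqs. (9) and (10) can be sketched together as follows. The function names are hypothetical. The count is implemented here with non-strict inequalities (≤), following the left-hand side of Eq. (10), so that each observation counts itself and the estimate stays positive; this is an assumption of the sketch. Pairs are counted directly rather than sorted first, which gives the same count.

```python
import math


def empirical_joint_probability(x, y):
    """Empirical joint non-exceedance probability of each observed pair, Eq. (10).

    Assumption: a pair (x[m], y[m]) is counted when x[m] <= x[i] and
    y[m] <= y[i], so the i-th observation is included in its own count.
    """
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    probabilities = []
    for i in range(n):
        count = sum(1 for m in range(n) if x[m] <= x[i] and y[m] <= y[i])
        probabilities.append((count - 0.44) / (n + 0.12))
    return probabilities


def rmse(p_copula, p_empirical):
    """Root mean square error between theoretical and empirical values, Eq. (9)."""
    n = len(p_copula)
    if n != len(p_empirical) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((c - o) ** 2 for c, o in zip(p_copula, p_empirical))
    return math.sqrt(squared / n)


# Made-up sample of three paired observations.
p_obs = empirical_joint_probability([1.0, 2.0, 3.0], [2.0, 1.0, 3.0])
```

To compare candidate copulas, `rmse` would be evaluated once per fitted family against the same `p_obs`, and the family with the smallest value retained.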
