*Article* **An Application of the SRA Copulas Approach to Price-Volume Research**

#### **Pedro Antonio Martín Cervantes †, Salvador Cruz Rambaud \*,† and María del Carmen Valls Martínez †**

Department of Economics and Business, University of Almería, La Cañada de San Urbano, 04120 Almería, Spain; pmc552@ual.es (P.A.M.C.); mcvalls@ual.es (M.d.C.V.M.)

**\*** Correspondence: scruz@ual.es; Tel.: +34-950-01-51-84

† These authors contributed equally to this work.

Received: 18 September 2020; Accepted: 19 October 2020; Published: 26 October 2020

**Abstract:** The objective of this study was to apply the Sadegh, Ragno, and AghaKouchak (SRA) approach to the field of quantitative finance by analyzing, for the first time, the relationship between price and trading volume of securities using four stock market indices: DJIA, FTSE100, NIKKEI225, and IBEX35. This procedure is a completely new methodology in finance that consists of the application of a Bayesian framework and the development of a hybrid evolution algorithm of the Markov Chain Monte Carlo (MCMC) method to analyze a large number (26) of parametric copulas. With respect to the DJIA, Joe's copula is the one that most efficiently models its succinct dependence structures. One of the copulas included in the SRA approach, the Tawn copula, is jointly adjusted to the FTSE100, NIKKEI225, and IBEX35 indices to analyze the asymmetric relationship between price and trading volume. This adjustment can be considered almost perfect for the NIKKEI225, and a relatively different characterization for the IBEX35 seems to indicate the existence of endogenous patterns in the price and volume.

**Keywords:** copulas; Markov Chain Monte Carlo simulation; local optima vs. local minima; financial markets; SRA approach

**MSC:** 62H05; 62F15; 60J22; 62P05

#### **1. Introduction**

Current trends in quantitative finance reveal that econophysics has become an economic analysis discipline characterized not by its multidisciplinary but by its transdisciplinary nature [1], contributing to the formation of a common framework in the research of financial phenomena [2]. Traditionally, the link has been strong between the stochastic analysis of hydrological phenomena and the study of time series, especially in the field of quantitative finance. The best-known example is likely represented by the Hurst exponent, a procedure inspired by the floods of the Nile River [3], which is of unquestionable efficiency when estimating the long-term memory of time series. Hydrological phenomena are completely different to financial ones but, in general, they present certain common patterns of analysis. Thus, several works have transferred the applicability of the theory of copulas from the field of hydrology to finance [4–7]. Recently, Sadegh et al. [8] developed a specific methodology based on the joint use of 26 multivariate copulas applied in hydrology (hereinafter, SRA), which, in our opinion, offers huge potential for the analysis of the price–volume relationship. Therefore, our aim was to introduce this methodological approach within quantitative finance, summarizing its fundamental aspects as a step prior to its practical implementation.

The analysis of the joint dependence between economic and financial variables has found important support in Sklar's theorem, through which it has been possible to specify, define, and contrast the latent or redundant dependence structures present in bivariate and multivariate time series. Notably, Sklar's theorem, the starting point of this theory, has been subject to continuous extensions that have improved the analysis of the structures of dependence between random variables or, in other words, of their succinct relationships when these are schematized in their minimal mathematical expression.

The emergent interest in copulas, detailed by [9], which increased in the field of finance after the paper by [10], does not correspond in reality to the use of several of the numerous types of pre-existing copulas, but to the systematic implementation of certain copula types, either in economics and quantitative finance or in any other field. According to the compendium of copulas by [11], nonparametric and semiparametric models represent a minority that is largely surpassed by parametric models, amongst which almost 100 different types could be distinguished. Some of them have not yet been widely disseminated in the literature or, at least, are not sufficiently well known, since most empirical studies opt for the application of a small number of copulas that could be classified as classic copulas.

Conversely, the analysis of the price–volume relationship (hereinafter, PVR) remains a specific area of the financial literature that has not yet received a conclusive solution. In our opinion, the relationship between prices and trading volume can be derived by dissecting the dependence structure of both variables through Sklar's theorem, that is, through the implementation of copulas. To accomplish this task, we followed the suggestion of [11] by implementing as many parametric copulas as possible to jointly analyze the same relationship, prices vs. trading or transaction volume, from different points of view (or dependence structures). Therefore, through this empirical work, we aimed to provide a new approach to the application of copulas in the context of the PVR, implementing a large number of copulas that, to the best of our knowledge, have not been previously applied in the area of quantitative finance. We hope that these types of transdisciplinary approaches will extend from the study of the PVR to other areas of financial research in the future. This study was mainly based on [8], whose 26 parametric copulas, estimated according to a Bayesian uncertainty framework, were replicated in the price–volume variables of the DJIA, FTSE100, NIKKEI225, and IBEX35 indices.

The SRA was implemented in accordance with two different guidelines focused on two respective scenarios: first, this procedure was applied per se to price–volume data of the DJIA index over the period 1928–2009. Second, one of the 26 copulas included in this methodology, the Tawn copula [12], was used to jointly compare the dependence structures derived from the PVR in the FTSE100, NIKKEI225, and IBEX35 indices using the period 2000–2018 as the time horizon (also in per se values). This copula was expressly used as it can be considered one of the new-generation copulas whose knowledge is not yet broadly applied in the literature and whose contribution to the analysis of the PVR may be crucial given its exhaustiveness in the estimation of parameters.

The rest of this article is organized as follows: first, Section 2 describes the current state of this research by outlining a literature review concerning the theory of copulas and the analysis of the PVR, detailing the works that expressly employed copulas in the determination of the relationship between prices and trading volume. In our opinion, with few exceptions such as [13,14], most of the works usually offer an excessively summarized and, in some cases, incomplete literature review of the PVR. For this reason, an extensive review of the literature was conducted by listing the four explanatory hypotheses that were mostly addressed in its study. Similarly, this section summarizes the plausible shortcomings derived from the utilization of copulas, pointing out a series of sociological weaknesses. In Section 3, the different databases used as well as a brief review of the theoretical bases presented in the SRA are described: its Bayesian perspective, later developed in Appendix A, and the Markov Chain Monte Carlo simulation used by this methodology. In Section 4, the results obtained are contextualized, and Section 5 is dedicated to the discussion of the results. The paper finishes with Section 6, which reflects our conclusions, supplemented with a proposal for future lines of investigation, congruent with the methodological scheme implemented in this manuscript, emphasizing the practical usefulness of the PVR analysis, both for investors and practitioners, from the perspective of the scheme proposed by Karpoff [13]. To ensure the maximum possible exhaustiveness, Appendix B provides an introductory summary of the main basis of the theory of copulas.

#### **2. State of the Art**

#### *2.1. Related to the Theory of Copulas*

From Sklar [15] until now, the theory of copulas has not stopped being an area under continuous development, to the point that copulas, as a concept, as well as their proven ability to determine parametric and nonparametric dependence measures, have been discovered and rediscovered during the last 50 years [16]. In this sense, Genest et al. [9] applied bibliometric methods to fix the end of the 1990s as the starting point of a growing interest, practically exponential, which, according to [17–19], was due to the seminal repercussion of several works of singular importance for its popularization. This would mean the rediscovery of Sklar's works. In the opinion of [20], this would include its involvement in quantitative finance areas and the opening of new lines of research in this field, which would serve as a trigger for its gradual generalization toward numerous multidisciplinary areas such as the insurance sector, actuarial science, meteorology, hydrology, and many other disciplines [5].

Daníelsson [21] highlighted three stylized findings commonly detected when implementing copulas: volatility clustering, the phenomenon of fat tails [22], and the analysis of a nonlinear dependence between a given dataset of variables [23–26]. More generically, the application of copulas in economic-financial fields can be structured around a series of predominant research lines such as the valuation of collateralized debt obligations (CDOs) [10], the analysis of financial time series [27–29] (reinforced by the time-varying copulas approach [30,31]), the interpretation of the implicit asymmetries in exchange rates [32], the successive contributions to portfolio management, either from the construction of a simplified portfolio based on the theory of copulas [33] or from the application of the value-at-risk (VaR) methodology [30,34], and the study of contingent claims, especially the valuation of financial options in turbulent environments, characterized by risk [35–37]. In addition to these research lines, the theory of copulas has been employed to address all kinds of specific aspects, such as the methodology proposed by [38] to obtain new copulas based on a given one or the creation of a new class of semiparametric copula-based multivariate dynamic models (SCOMDY), introduced by [39]. Analogously, García et al. [40] focused on building copulas in contexts of marked uncertainty; the elaborate goodness-of-fit testing procedure for copulas suggested by [41] is also notable, as is the development of vine copulas to model dependence structures [42–44], in which the copulas are directly linked with the decision processes.

#### Limitations of the Copulæ Approach

Strictly, a complete literature review of the theory of copulas would not be objective enough if some of its perceptible limitations were not highlighted; these often stem from an erroneous conception and misuse of its theoretical basis and, to a lesser extent, from sociological factors. Embrechts et al. [45] listed three conceptual fallacies linked to the relative understanding and abuse when implementing copulas. However, although these have gradually been solved, the main limitation of copulas is the breach of the continuity condition [46], which a priori establishes a univocal relationship between any continuous multivariate distribution and a single resulting copula *C* [45]. So, in any case, Equation (A7) (see Appendix A) must be satisfied if all distribution functions $F_1(x_1), F_2(x_2), \ldots, F_n(x_n)$ are continuous. Schweizer and Sklar [47] showed that if there is at least one discrete $F_i$, the joint distribution function can still be expressed as a function, as shown in Equation (A7); however, this would not define a copula per se, but a possible (or feasible) copula *C*. Several works have furthered the mitigation of this inconvenience; for example, Genest and Nešlehová [48] related copulas with discrete distribution functions, demonstrating how such links can invalidate some basic precepts of the theory of copulas (evidently, in the continuous case), and Mayor et al. [49] performed a discrete extension of Sklar's theorem in terms of operators similar to copulas, defined on a finite chain, which they called "discrete copulas".
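The role of continuous marginals in the construction above can be illustrated with a minimal sketch. It builds a joint distribution function as a copula evaluated at two continuous marginals, which is the composition that Sklar's theorem describes. The Clayton copula, the exponential marginals, and all function and parameter names (`clayton_copula`, `joint_cdf`, `theta`, `rate_x`, `rate_y`) are assumptions chosen only for illustration; they are not taken from the paper.

```python
import math


def exponential_cdf(x, rate):
    """CDF of an exponential distribution, used here as a continuous marginal."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (illustrative choice of copula)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_cdf(x, y, theta=2.0, rate_x=1.0, rate_y=0.5):
    """Joint distribution function built as C(F1(x), F2(y))."""
    u = exponential_cdf(x, rate_x)
    v = exponential_cdf(y, rate_y)
    return clayton_copula(u, v, theta)


# Letting y grow large recovers the marginal of x, as a valid joint CDF must.
print(joint_cdf(1.0, 1e6), exponential_cdf(1.0, 1.0))
```

With continuous marginals, each pair (*x*, *y*) maps to a unique point (*u*, *v*) in the unit square, so the copula is pinned down everywhere; with a discrete marginal, only a subset of the unit square is reached, which is the source of the non-uniqueness discussed above.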

Similarly, others [50,51] emphasized that the justification of modeling the relationship of dependence between variables via copulas does not always have to be obvious or completely necessary as, in many cases, it may be more convenient to directly adjust the variables to a given multivariate distribution function (i.e., Gaussian or lognormal) to delimit the predictable stylized findings relative to their dependence structures. Another impediment, according to [52], is that copulas do not entirely correspond with the pre-existing stochastic framework because they are static models and, therefore, they are not completely adequate for modeling dependence structures over time.

The misuse of the Gaussian copula as a general indicator of credit risk during the most recent period of economic boom, called "irrational exuberance" by Shiller [53], should also be considered. In that period, the procedure introduced by [10] practically became a standardized measure of the risk level of certain assets with high levels of volatility, being one of the indirect triggers in the expansion of the subprime mortgage crisis. Donnelly and Embrechts [54] metaphorically stated that "the devil is in the tails" when describing the main limitation of the models based on Gaussian copulas in fitting extreme data values or outliers, compared with others like the Gumbel copula [55]. According to [45,56,57], many voices warned well in advance about the inconsistencies of these models, which ignored the fact that the application of Gaussian copulas could be more or less viable in relatively stable financial environments but would be completely inefficient in detecting joint extreme events. This conclusion was personally confirmed by P. Embrechts to one of the coauthors of this work (November 2017):

"[. . . ] I insisted from the beginning, back in 1998, that credit risk models based on Gaussian copula are not capable of capturing joint credit defaults in a sufficiently realistic way. The mathematical result underlying this statement of mine dates back to the late fifties [. . . ]"

Mikosch [52], Daníelsson [58], and Zimmer [59] also criticized the widespread application of this procedure and even Salmon [60] deduced that the interests, aims, and objectives of the banking industry overlapped with those of mathematics, pointing out a sociological limitation born from considering the mathematical methodology implicit in the theory of copulas as a *factotum* in the determination of the risk of financial assets. In this sense, Rogers [61] stated:

"The problem is not that mathematics was used by the banking industry, the problem was that it was abused by the banking industry. Quants were instructed to build models which fitted the market prices. Now if the market prices were way out of line, the calibrated models would just faithfully reproduce those wacky values, and the bad prices get reinforced by an overlay of scientific respectability".

Daníelsson [21] considered that the a priori use of copulas can arbitrarily determine any structure of dependence, so that an "optimal" adjustment of a copula does not necessarily lead to an optimal fit of the original distribution of the data. As no economic theory is explicitly linked to copulas, it is difficult to specify in advance what type of copulas are the most appropriate for each specific analysis given the total freedom in the choice of the underlying structures of dependence, which, in no case, are subordinated in a preliminary way to any economic theory.
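The remark above on the tails of the Gaussian and Gumbel copulas can be made concrete with a small numerical sketch. It is a known result that the Gaussian copula with correlation below one has an asymptotic upper tail dependence coefficient of zero, whereas for the Gumbel copula with parameter *θ* this coefficient equals 2 − 2^(1/*θ*). The code below evaluates a finite-level version of the coefficient for the Gumbel copula and for the independence case (*θ* = 1); the function names and the value *θ* = 2 are illustrative assumptions.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel copula for theta >= 1 (theta = 1 gives independence)."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def upper_tail_dependence(copula, q):
    """Finite-level proxy P(U > q | V > q) = (1 - 2q + C(q, q)) / (1 - q)."""
    return (1.0 - 2.0 * q + copula(q, q)) / (1.0 - q)


def gumbel_limit(theta):
    """Closed-form upper tail dependence coefficient of the Gumbel copula."""
    return 2.0 - 2.0 ** (1.0 / theta)


theta = 2.0  # illustrative value, not an estimate from the paper
for q in (0.9, 0.99, 0.999):
    dependent = upper_tail_dependence(lambda u, v: gumbel_copula(u, v, theta), q)
    independent = upper_tail_dependence(lambda u, v: gumbel_copula(u, v, 1.0), q)
    print(q, dependent, independent)
```

As *q* approaches 1, the Gumbel value stabilizes near its closed-form limit while the independence value vanishes, which is the kind of contrast behind the "devil is in the tails" argument.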

#### *2.2. Related to the Analysis of the Price-Volume Relationship*

Osborne [62] was the first to address the concurrent relationship between prices and trading volume from a strictly quantitative perspective, estimating that the logarithm of the price of financial assets follows a diffusion process with a trend whose variance depends on the trading volume. Samuelson [63] was inspired by this research to infer that the prices of financial assets describe a specific random trajectory based on geometric Brownian motion. Thus, the primary roots of modern quantitative finance lie in the preliminary studies of the analysis of the PVR. Others [62,64,65] applied spectral analysis to determine that, in principle, there is no significant relationship between prices and volumes (or it is too meager to be taken into consideration).

These initial works provided the background to justify and empirically test the reconsidered theory of demand [66], a new conceptualization of the theory of supply and demand, openly contrary to classical postulates, which would anticipate the empirical basis of the Granger causality test [67]. Based on Godfrey et al. [65], Ying [68] presented a complete disagreement with the theory of conventional demand, performing a series of statistical tests whose results defined five empirical patterns that characterize the joint evolution of the price and volume variables. Clark [69] used a mixture of probabilistic distributions to describe what would be considered the first explanatory hypothesis of the PVR, the MDH (Mixture of Distribution Hypothesis), proposing that the number of operations that occur per unit of time is a random variable and the variation in prices per unit of time is the sum of the increments of the intraday price equilibrium. Thus, the mixed variable is hypothesized according to the information rate periodically reached by the markets, inferring that, in principle, price and volume must be positively correlated, varying on a contemporaneous basis, just before the arrival of new information. Others [70–74] used the basis of this approach, which was further expanded [75,76] by inputting the information rate into the primary GARCH (Generalized Autoregressive Conditional Heteroskedasticity) specification of Bollerslev [77], hypothesizing that the daily trading volume behaves like a representative proxy variable when explaining the evolution of price growth depending on the GARCH effects, or on the persistence of transitory volatility shocks. Practically as a counterpart to the MDH, the SIAH (Sequential Information Arrival Hypothesis) [78,79] arose as a probabilistic model based on a binomial distribution, according to which the information arrives at the markets generating a noncontinuous or fragmented flow. Per Darrat et al. [80], this hypothesis should be only contrastable in those periods in which the information is public and whose empirical evidence is ascertained by all market participants. Copeland [78] argued that, more than an effective explanatory hypothesis of the PVR, it should be reconsidered as "a new technique for the analysis of demand".

The DBH (Dispersion of Beliefs Hypothesis) and the NTH (Noise Trader Hypothesis) would complete, together with the MDH and the SIAH, the four major explanatory hypotheses of the PVR, whose common denominator is the information that reaches the markets, although analyzed from opposite and finally convergent points of view [81]. The NTH [82] states that prices and volumes are the result of positive and negative feedback strategies that degenerate into noise in the sense stated by Black [83], to which passive, rational, and speculative investors react positively with a feedback strategy. In other words, according to this hypothesis, all information of interest that arrives at the markets, or that is relevant in any investment process, would be equivalent to the paradigmatic noise of [83]. In contrast, the DBH [84,85] defines an antagonistic theoretical scenario in which investors, who interact exclusively for speculative reasons and are risk-neutral, collectively receive public information, which, in principle, is common and perceived in the form of market signals. Consequently, the consecutive changes in prices exhibit a negative serial correlation and trading volume is positively correlated [84].

#### Use of Copulas in the Price-Volume Research

Amongst the works that explicitly opted for the implementation of copulas in the study of the PVR are those by Gurgul, who focused on the Polish and central European stock markets (Austria and Germany). Gurgul and Syrek [86] implemented the family of Archimedean copulas to demonstrate that the volatility of (daily) returns of the companies listed on the DAX was positively related to the trading volume. Gurgul et al. [87] introduced a measure of dependences based on copulas to quantify the relationship between performance and volume, volatility and volume, and yield and performance of the benchmark Polish stock market (WIG) compared to three indices corresponding to other international financial markets (ATX, DAX, and DJIA). They concluded that each one of the proposed relationships is significant except for the volume traded in the Polish market vs. the volatility of DJIA returns. Gurgul et al. [88] used a Granger's nonlinear causality model based on the Bernstein's copula by applying the nonparametric test of conditional independence between two vector processes [89] in five selected ordinary shares of the ATX index, confirming the existence of several well-defined causal guidelines between the performance of shares, the volatility, and the trading volume (both expected and unexpected). This same copula, in conjunction with Hellinger's distance, was implemented [90] to study the high-frequency data of 10 central European companies (Austria and Poland), detecting a high degree of unidirectional causality, both linear and nonlinear, of the returns to the expected volume, which was not appreciable in the opposite direction. They also observed the existence of a linear causality from the volatility realized to the expected trading volume that, once again, was negligible in the opposite direction.

Gurgul and Syrek [91] studied the dependence structures of ordinary stock returns, volatility, and transaction volumes of several companies listed in the CAC40 and FTSE100 indices to verify the long-term memory of the MDH through the fractional cointegration of these series according to the procedure previously described [92]. In most cases, there is no structure of common dependence, whereby the analyzed series would not be caused by a common information arrival process with long-term memory. Gurgul et al. [93] investigated the high-frequency data of 13 German companies included in the DAX index for a period of 33 days by selecting the *t* and Gumbel copulas to analyze their different underlying dependence structures according to the inference function for margins (IFM) method [94]. These scholars inferred that the contemporaneous relationship between the price duration and its associated trading volume depends on the distribution tails, as unusually high volume accumulations tend to coincide with long durations and, conversely, dependence is minimal when any of the variables are lagged.

The Asian Financial Crisis of 1997 provided an empirical scenario from which Ning and Wirjanto [95] analyzed the structure of dependence between prices and volumes in a context of extreme volatility by examining the evolution of the most representative stock indices of the six countries in southern Asia that were most seriously affected by the crisis. Gallant et al. [96] implemented several mixtures of copulas (Clayton, survival Clayton, and Frank) expressly focused on both tails. They obtained two conclusions: (1) In general terms, volume positively depends on the return exclusively in the upper tail of the distribution but not in the lower, which can be interpreted as volume being a key piece able to explain the periodical booms of the market, not its eventual collapses. (2) A marked asymmetric dependence exists between return and volume in the extremes of the distribution, evidenced by extremely high returns tending to be attached to extremely large volumes, but extremely low returns tending not to be associated with disproportionate trading volumes, whether high or low.

Naeem et al. [97] focused on the study of the PVR from the analysis of the asymmetric relationship between returns and trading volumes based on four stock indices also in Asia, developing an alternative measure of dependence by combining several copulas (Clayton, Survival Clayton, and Gumbel) with the univariate GARCH and FIGARCH (Fractionally Integrated GARCH models) in which the marginal distributions of the respective series of returns and volumes are adjusted, proving that the FIGARCH specification substantially improves the estimation of the parameters of each of the proposed copulas. As in [95], we remark that extraordinarily high trading volumes are often related to significant returns, which is due to sudden and sharp declines in the value of financial assets and, more specifically, within financial crisis environments.

#### **3. Materials and Methods**

Our objective was to present a multi-perspective design of Larkin's research [98] that enables the analysis of the PVR from different standpoints, depending on the use of different datasets, time horizons, and analytical tools (copulas). The SRA was applied to two different scenarios to provide a generic and a specific image of this methodology. Instead of using a representative hydrological or meteorological index as an empirical basis (e.g., the standardized precipitation index (SPI) [99]), per se values of four stock market indices commonly employed in the literature in the study of the PVR were selected: DJIA, FTSE100, NIKKEI225, and IBEX35.

In the first case, or generic scenario, all available copulas (26) were applied to a single index (DJIA). Later, in the specific scenario, a single copula was adjusted to three indices (FTSE100, NIKKEI225, and IBEX35). The copula chosen in the second case was the Tawn copula, a family of new-generation copulas derived from Khoudraji's device [100]. In this way, we contribute to the analysis of the PVR with the inclusion of new copulas never or rarely implemented in this research, such as some of those included in the SRA approach. In relation to the construction of the generic scenario, we decided to use a wide database consisting of 20,219 stock trading sessions of the DJIA index, covering the period from 10 January 1928 to 4 August 2009, which were consecutively subdivided into quarterly periods until obtaining 490 observations representing the adjusted closing values of the DJIA at the end of each corresponding session and the final volume of the shares traded at each date.
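As a minimal sketch of how an asymmetric copula of the Tawn type can arise from Khoudraji's device, the code below combines the independence copula with a Gumbel copula through the general form C(*u*, *v*) = C1(*u*^(1−*a*), *v*^(1−*b*)) · C2(*u*^*a*, *v*^*b*). This is the standard textbook construction; the parameter names `theta`, `a`, and `b` and the numerical values are assumptions for illustration, and the exact parameterization used in the SRA toolbox [8] may differ.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel copula for theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def independence_copula(u, v):
    """Independence copula C(u, v) = u * v."""
    return u * v


def khoudraji(c1, c2, a, b):
    """Khoudraji's device: C(u, v) = C1(u^(1-a), v^(1-b)) * C2(u^a, v^b)."""
    def copula(u, v):
        return c1(u ** (1.0 - a), v ** (1.0 - b)) * c2(u ** a, v ** b)
    return copula


def tawn_copula(theta, a, b):
    """Asymmetric (Tawn-type) copula: independence combined with Gumbel."""
    return khoudraji(
        independence_copula,
        lambda u, v: gumbel_copula(u, v, theta),
        a,
        b,
    )


# Hypothetical parameter values, chosen only to make the asymmetry visible.
asymmetric = tawn_copula(theta=3.0, a=0.9, b=0.4)
print(asymmetric(0.3, 0.7), asymmetric(0.7, 0.3))
```

When *a* ≠ *b*, swapping the arguments changes the value of the copula, so the dependence between the two variables is not exchangeable. This is what makes the family a candidate for asymmetric price-volume relationships.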

This temporal accrual as well as the use of data per se allowed us to adapt the original datasets to the methodology proposed by [8]. The analysis of the specific scenario corresponding to the FTSE100, NIKKEI225, and IBEX35 indices involved monthly data of per se price and volume collected during the period from 31 October 2000 to 30 November 2018, which included 218 monthly observations for each stock index. The most representative descriptive statistics of the generic scenario, shown in Table 1, reveal a fundamental aspect: the huge level of variability of variables "price" and "trading volume" when both are measured in per se terms (especially in the latter case).


**Table 1.** Descriptive statistics and dependence evaluation of DJIA price and volume per se (1928–2009).

Subtable (A): (\*) The analyzed data contain at least five mode values. Only the smallest four have been selected. T. Count: Total Count; SEM: Standard Error of the Mean; T. Mean: Trimmed Mean; CV: Coefficient of Variation; SS: Sum of Squares; IQR: Interquartile Range; MSSD: Mean of the Squared Successive Differences. Subtable (B): Source: Own elaboration.

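| Correlation coefficient | Value | *p*-Value | Significant |
|---|---|---|---|
| Spearman's rank-order | 0.9559 | 0 | Yes |
| Pearson product-moment | 0.7365 | 0 | Yes |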

In the same way, the per se values of the variables "price" and "trading volume" denote a relatively high degree of correlation in terms of the Pearson, Kendall, and Spearman correlation coefficients (0.7365, 0.8279, and 0.9559, respectively), which a priori could be considered significant measures of dependence. However, as underlined by Frey et al. [101], a high degree of correlation does not necessarily imply real dependence between the involved variables.
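To illustrate why the three coefficients can differ for the same pair of series, the following sketch computes Pearson, Spearman, and Kendall correlations on a toy sample in which one variable is a monotone but nonlinear function of the other. The data are invented for illustration and are unrelated to the DJIA figures; the rank function assumes no ties, and Kendall's coefficient is computed in its tau-a form.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Ranks 1..n of the values (simplification: assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rank-order correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2.0 * score / (n * (n - 1))


# Toy data: y grows exponentially with x (monotone, strongly nonlinear).
toy_x = [float(k) for k in range(1, 11)]
toy_y = [math.exp(k) for k in toy_x]
print(pearson(toy_x, toy_y), spearman(toy_x, toy_y), kendall(toy_x, toy_y))
```

The rank-based coefficients are invariant to monotone transformations of either variable, whereas the Pearson coefficient measures only linear association. This is one reason why copula-based analyses, which also work on ranks, are preferred to a single correlation figure.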

Figure 1 shows the huge level of dispersion and variability of both variables. The first two subfigures, elaborated according to Patton [29], exhibit a normalized time series plot of price (DJIA)–volume (DJIA) as well as a scatter plot of log-increments, both series normalized in base 100, according to the equality $100 \times \exp\left[\sum_{i=1}^{n}\left(\ln X_i - \ln X_{i-1}\right)\right]$. The third subfigure represents the Pearson correlation coefficient of per se prices and volumes of the DJIA over the analyzed time horizon, showing a quasicyclical relationship between prices and transactional volume within this index, which a priori does not appear to be connected with the evolution of the economic cycle. Several phases or trends can be distinguished: relative decline (1934–1957, 1979–1984, and 2000 onwards), stabilization (1967–1977), and increase (1929–1933, 1958–1966, and 1985–1999) in the relationship between the variables in terms of Pearson's linear correlation coefficient (*ρ*).
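The base-100 normalization mentioned above can be sketched as follows, under the assumption that it cumulates the log-increments of the series and rescales the result so that the first observation equals 100. The function name and the sample values are hypothetical.

```python
import math


def normalize_base_100(series):
    """Return 100 * exp(cumulative sum of log-increments) for a positive series."""
    normalized = [100.0]
    cumulative = 0.0
    for previous, current in zip(series, series[1:]):
        cumulative += math.log(current) - math.log(previous)
        normalized.append(100.0 * math.exp(cumulative))
    return normalized


# Hypothetical price path rising by 10% per period.
print(normalize_base_100([50.0, 55.0, 60.5]))
```

Under this reading, the sum of log-increments telescopes, so each normalized value equals 100 times the ratio of the current observation to the first one. This places price and volume on a common scale despite their very different magnitudes.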

**Figure 1.** Three different representations of DJIA price-volume evolution and variability during the period 1928–2009. Source: Own elaboration.

Considering per se magnitudes, Figure 2 presents a three-dimensional scatter plot of the DJIA index that links variables *X* (volume) and *Y* (price) to the Pearson linear correlation coefficient (*Z* = *ρ*). Simply, it can be observed that this chart mostly associates the highest correlation levels of *P* and *V* to high per se values of *P*. Low trading volume per se usually fluctuates within a range from $5.00 \times 10^{9}$ to $15.00 \times 10^{9}$, although sometimes a relatively high degree of correlation between price and low trading volume can be detected (close to $5.00 \times 10^{9}$).

**Figure 2.** Scatter plot of *ρ* vs. Price (DJIA)-Volume (DJIA). Source: Own elaboration.

The aim of this paper is to highlight the key aspects of the SRA as an optimal methodological approach for the analysis of the PVR from an empirical perspective that is completely different from the rest of the predominant lines of research. In summary, this methodology can be characterized by: (1) the use of a high number of bivariate copulas (26, see Table 2), especially recommended to simultaneously represent different dependence structures and to conduct prospective inferences based on the chosen variables (not necessarily related to hydrology), such as the variables price and trading volume of a given financial asset or stock index. Notably, to the best of our knowledge, the large number of copulas jointly implemented in the SRA was employed for the first time in the investigation of the PVR. (2) A unitary reference framework (Bayesian analysis, see Appendix A) in which the hybrid evolution algorithm of the Markov Chain Monte Carlo simulation (MCMCS) is introduced, focusing on the numerical estimation of the posterior distribution of copula parameters within a context of uncertainty that is relatively similar to the uncertainty observable in financial markets, especially when the different volatility ranges can be conveniently delimited.





As stated by Johannes and Polson [114], the key aspect of the MCMCS is its ability to easily characterize the complete conditional distributions, *p*(*θ*|*X*,*Y*) and *p*(*X*|*θ*,*Y*), instead of analyzing the higher-dimensional joint distribution *p*(*θ*, *X*|*Y*). The SRA belongs to the class of econometric methods usually applied to the sampling of high-dimensional complex distributions, which implement a hybrid-evolution MCMCS algorithm to infer posterior parameter regions within a Bayesian context. This algorithm is considered a hybrid since it includes a combination of Gibbs steps and Metropolis–Hastings steps [114].
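The idea of updating one block of parameters at a time with a Metropolis acceptance rule can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the SRA implementation: the target `log_target`, the step size, the seed, and the chain length are hypothetical choices, and the adaptive Metropolis, differential evolution, and snooker updates of the hybrid algorithm are deliberately omitted.

```python
import math
import random

def log_target(theta):
    """Hypothetical log-posterior (illustrative only): two independent
    Gaussians with means 1 and -2 and standard deviations 1 and 0.5."""
    return -0.5 * ((theta[0] - 1.0) ** 2 + ((theta[1] + 2.0) / 0.5) ** 2)

def metropolis_within_gibbs(log_post, start, n_iter, step, rng):
    """Sweep over the coordinates one at a time (Gibbs-style) and accept each
    one-dimensional random-walk proposal with the Metropolis rule."""
    theta = list(start)
    current = log_post(theta)
    chain = []
    for _ in range(n_iter):
        for j in range(len(theta)):
            proposal = list(theta)
            proposal[j] += rng.gauss(0.0, step)
            candidate = log_post(proposal)
            # Symmetric proposal: accept with probability min(1, ratio).
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < candidate - current:
                theta, current = proposal, candidate
        chain.append(tuple(theta))
    return chain

rng = random.Random(0)
chain = metropolis_within_gibbs(log_target, [0.0, 0.0], 20000, 0.8, rng)
kept = chain[len(chain) // 2:]  # discard the first half as burn-in
mean_0 = sum(t[0] for t in kept) / len(kept)
mean_1 = sum(t[1] for t in kept) / len(kept)
```

After burn-in, `mean_0` and `mean_1` should lie close to the means of the hypothetical target, which is the sense in which the chain characterizes the posterior without evaluating the joint distribution analytically.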

The hybrid-evolution MCMCS algorithm starts with an intelligent starting point selection, structured according to the use of adaptive metropolis (AM), differential evolution (DE), and snooker update. Table 3 summarizes, in descending order, the working schema implemented in the algorithm developed by Sadegh et al. [8]. For the sake of brevity, intermediate iterative conditions (i.e., end do, end if, etc.) have been omitted from the table.

**Table 3.** Description of the basis scheme of the hybrid MCMCS algorithm implemented in the SRA approach.


Checking for the Gelman–Rubin *R̂* convergence diagnostic.

*LN* = number of samples drawn from the prior distribution [*p*(*θ*)] using Latin Hypercube Sampling (LHS), and *N* = number of Markov chains (*CH*). *D* = the dimension of the entire parameter space, *d* = the dimension of the subspace of the parameters randomly selected for update (Metropolis within Gibbs sampling), *T* = the total number of iterations, and *NAM* = the number of chains selected for the Adaptive Metropolis algorithm. *γ*<sub>1</sub>–*γ*<sub>4</sub> = "jump factors", where *γ*<sub>1</sub> is randomly selected, *γ*<sub>2</sub> = 2.38/√*d*, *γ*<sub>3</sub> = 0.1/√*d*, and *γ*<sub>4</sub> = 2.38/√(2*d*). Σ<sub>*d*</sub> = adaptive covariance matrix, based on the last 50% of the samples of the Markov chains. Source: Specifically readapted to this study from [8].
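Two ingredients of the scheme, the deterministic jump factors and the Gelman–Rubin convergence check, can be illustrated with a short Python sketch. The function names, the simulated chains, and the seed are assumptions made for this example; the *R̂* formula is the standard potential scale reduction factor for a single scalar parameter and is not claimed to reproduce the exact variant coded in the SRA toolbox.

```python
import math
import random
import statistics

def jump_factors(d):
    """Deterministic jump factors from the notes to Table 3 for a
    d-dimensional update subspace (gamma_1 is random and not included)."""
    return {
        "gamma2": 2.38 / math.sqrt(d),
        "gamma3": 0.1 / math.sqrt(d),
        "gamma4": 2.38 / math.sqrt(2 * d),
    }

def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter.
    `chains` is a list of equally long lists of draws."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(means)                             # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(1)
# Four hypothetical chains sampling the same distribution: R-hat close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Two hypothetical chains stuck around different values: R-hat well above 1.
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 5.0)]
r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

Values of *R̂* near one suggest that the chains explore the same region, whereas clearly larger values signal that sampling should continue.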

#### **4. Results**

Although the SRA employs a good number of new-generation copulas, some of them mathematically complex (e.g., Plackett or Shih-Louis), Table 4 shows that two copulas of modest analytical complexity (see Li et al. [102] and Frees and Valdez [50]) best fit the price–volume time series of the DJIA during the considered period (1928–2009). In all cases, the specified selection criteria coincide except for three copulas: Galambos, BB1, and BB5.

Complementarily, Table 5 provides estimations of the parameters of each copula (Par), together with their 95% uncertainty range (Unc-Range), obtained through local optimization and the MCMCS. The copulas with the best performance (Rank) are defined in terms of the root mean square error (RMSE) and the Nash–Sutcliffe efficiency (NSE) criteria. The existing literature usually employs local optimization algorithms when estimating the parameters of copulas, with the consequent risk of being trapped in local optima, thus often obtaining biased and nonsignificant results [8]. Conversely, the hybrid-evolution MCMCS algorithm used in the SRA overcomes this limitation by determining an efficient estimator of the global optimum as well as an accurate approximation of uncertainties in the context of a Bayesian conceptual framework in the form of isolines, which is another of the improvements provided by this methodology to PVR analysis.
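For reference, the two ranking criteria can be written compactly. The following Python sketch uses the usual textbook definitions of RMSE and NSE; the two short lists of copula probabilities are hypothetical numbers introduced only to exercise the functions and are not taken from Tables 5 or 6.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the residual sum of
    squares to the sum of squares of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

# Hypothetical empirical and fitted joint probabilities (illustrative only).
empirical = [0.10, 0.25, 0.40, 0.60, 0.85]
fitted = [0.12, 0.24, 0.43, 0.58, 0.86]
fit_rmse = rmse(empirical, fitted)
fit_nse = nse(empirical, fitted)
```

An NSE equal to one corresponds to a perfect fit, so values close to unity, together with a small RMSE, indicate that the fitted copula reproduces the empirical joint probabilities well.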


**Table 4.** Selection of copulas fitted to the DJIA index (1928–2009) based on three different criteria. Performance-criterion ranking amongst the implemented copulas.

Source: Own elaboration.

The analysis of the SRA applied to the NIKKEI225, FTSE100, and IBEX35 indices using the Tawn's copula is summarized in Table 6, similarly to Table 5. The price–volume dependence structure of the per se NIKKEI225 index is optimal in accordance with the NSE criterion, as it is very close to unity (0.9914), indicating an almost perfect model fitting. The per se IBEX35 adjustment is relatively optimal (0.9737), being lower for the FTSE100 (0.8235). The range of uncertainty of the parameters defining the Tawn's copula (*θ*1, *θ*2, and *θ*3, Table 2) is considerably lower in the Nippon index than in the other two stock market indices.

Figure 3 shows that each stock exchange index corresponds to a certain typology of its probability isolines. Rows 1 to 3 refer to the analyzed indices, whereas columns correspond to the following specifications: (A) fitted empirical copulas probabilities, (B) fitted empirical copulas, and (C) return period copulas, calculated according to [115] by considering the joint return 1/(1 − *C*(*u*, *v*)) as a measure of the dependence structure between the observed price peaks and trading volumes.
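The joint return 1/(1 − *C*(*u*, *v*)) is straightforward to evaluate once a copula has been chosen. The Python sketch below uses Joe's copula only because its distribution function has a simple closed form; the parameter value `theta = 2.0` and the evaluation point are hypothetical and do not correspond to any estimate reported in this study.

```python
def joe_copula(u, v, theta):
    """Joe copula distribution function for theta >= 1
    (theta = 1 reduces to the independence copula u * v)."""
    a = (1.0 - u) ** theta
    b = (1.0 - v) ** theta
    return 1.0 - (a + b - a * b) ** (1.0 / theta)

def joint_return(c_uv):
    """Joint return 1 / (1 - C(u, v)) associated with a copula probability."""
    return 1.0 / (1.0 - c_uv)

theta = 2.0  # hypothetical dependence parameter, for illustration only
c_value = joe_copula(0.9, 0.9, theta)
t_value = joint_return(c_value)
```

Because *C*(*u*, *v*) approaches one as both margins approach one, the joint return grows without bound in the upper right corner of the unit square, which is why the isolines of column (C) are informative about joint extremes.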

**Figure 3.** Probability isolines of Tawn's copula for FOOTSIE100, NIKKEI225, and IBEX35 indices. Source: Own elaboration.


The isolines derived from the application of Tawn's copula are ostensibly biased toward the upper left corner, which seems to indicate a low probability of occurrence of the price (*U*<sub>1</sub>) synchronously linked to a high probability of occurrence of the trading volume (*U*<sub>2</sub>) (both measured in magnitudes per se). Likewise, given the joint representation of the probability isolines and the empirical estimates of the joint probability distributions, the trends of the FOOTSIE100 and NIKKEI225 indices are fairly similar, although in the former index, high prices tend to be related to trading volumes lower than those shown in the Japanese stock market. The *P*–*V* relationship in the IBEX35, although following a similar pattern, differs to some extent from that of the other two indices, as low prices seem to be more related to high trading volumes. This type of asymmetric and skewed dependence structure can be considered a common pattern of the three indices analyzed, and it extends equally to the fitted empirical copulas and the return period copulas.


A. Root Mean Square Error. B. Nash–Sutcliffe Efficiency. Source: Own elaboration.


**Table 5.**

*Mathematics* **2020** , *8*, 1864


**Table 6.** Tawn copula parameters estimation: NIKKEI225, IBEX35, and FTSE100 (2000–2018).

Figure 4 shows the degree of uncertainty associated with the three parameters defining Tawn's copula, as specified by the MCMCS within a Bayesian framework. Blue bins represent the MCMC-obtained parameters, blue crosses (bottom of each plot) denote the maximum likelihood estimation parameters, and red asterisks (top of each plot) indicate the copula parameter value obtained by local optimization.

**Figure 4.** Posterior distribution of fitted Tawn's copula on FOOTSIE100, NIKKEI225, and IBEX35 indices obtained by the MCMCS. Source: Own elaboration.

In a context characterized by minimal uncertainty when specifying the parameters of the copula in each market, the parameters obtained by the local optimization algorithm should coincide with the mode of the distribution calculated through the MCMCS. However, this was only observed in the NIKKEI225 (parameters 1 and 2) and IBEX35 (parameter 1), and could not be verified for any of the three parameters of Tawn's copula fitted to the FOOTSIE100 index. These results are consistent with the previously calculated uncertainty ranges and with the goodness of fit measured by the NSE criterion (Table 6), according to which the NIKKEI225 index showed a quasiperfect fit to this copula, followed by the IBEX35 and, to a lesser extent, the FOOTSIE100. This can be justified by the different range of variation of the parameters obtained for each index, where the FOOTSIE100 index is associated with a higher level of uncertainty compared with the NIKKEI225 and IBEX35.

#### **5. Discussion**

The application of the SRA provides an alternative and innovative approach to PVR based on the simultaneous application of 26 copulas, which facilitated the analysis of their dependence structures and implicit morphology according to their probability isolines. Many of these copulas are dissimilar in form, although quite similar in performance. This also allowed us to model the relationship between prices and trading volumes from different points of view, quantifying the uncertainty underlying the specification of the parameters defining each copula. The PVR, usually characterized by a markedly asymmetric relationship [13], is reinforced by the application of the SRA, since several of the copulas used in this methodology (e.g., Galambos, Bernstein, Tawn, etc.) are especially effective in the study of phenomena with underlying asymmetric skewed dependence structures.

From an empirical point of view, the joint implementation of the 26 copulas in the DJIA (generic scenario) confirmed that Joe's copula is able to more efficiently model the dependence structures of this index. Framing our findings within the existing literature, the use of Tawn's copula in the FOOTSIE100, NIKKEI225, and IBEX35 indices (specific scenario) confirms the findings of Ying [68] in their seminal analysis of the S&P 500 index. Similarly, our results confirm the analysis of the NIKKEI225 completed by Bremer and Kato [116], according to which an asymmetric relationship could be observed (negative correlation between past prices and current trading volume). This relationship was explained in the FOOTSIE100 by Huang and Masulis [117] based on the existence of a minority of informed-trading investors who simply sought immediate liquidity. The asymmetric PVR detected in the IBEX35 aligns with that already reported in the literature (see, for example, [118]). Its differentiated nature with respect to the other two indices is probably due to a strong linear causal relationship from returns to trading volume, reported in [119] for the Spanish financial markets (Mercado Continuo). Specifically, periods with high returns are usually followed by periods with particularly high trading volume. Such patterns are comparable to those detected in other works [95,97], which explicitly used copulas in the study of PVR in the Asian financial markets, repeatedly verifying the existence of an inverse relationship between prices and volumes traded, in both cases foreseeably increased by the effects of the 1997 Asian financial crisis.

One of the improvements associated with the application of the SRA approach to the PVR is that it facilitates the analysis of both variables from a large number of copulas by defining the relationship based on ranges of uncertainty, applying the RMSE and NSE criteria (see Tables 5 and 6), independently of the degree or sign of the linear correlation exhibited by the Pearson correlation coefficient (*ρ*). To the best of our knowledge, this is an entirely new application in the field of quantitative finance, of future utility for researchers and investors.

#### **6. Conclusions**

The main contribution of this research is the analysis of the existing relationship between prices and trading volume from multiple copulas, which allowed us to comparatively abstract the underlying dependency structures of both variables to establish possible analogies or differences. This addresses one of the most important limitations in the empirical application of copulas, namely the use of a limited number of standard copulas when, in reality, there are multiple copulas not yet widespread in the literature [11]. Through the empirical methodology introduced in this article, the versatility of copulas increases when they are simultaneously combined with the polyvalence of the Bayesian analysis and with the hybrid-evolution MCMCS algorithm proposed by Sadegh et al. [8]. We are the first to implement the SRA, not only in PVR analysis but in the field of quantitative finance. More specifically, for practical purposes, PVR analysis is decisive for both academics and practitioners, since, following the scheme constructed by Karpoff [13], it has the following implications: (1) it generates additional information regarding the structure of financial markets; (2) from an empirical point of view, it is fundamental in the generation of case studies that jointly use prices and trading volume, facilitating the implementation of analyses and inferences; (3) it is a crucial element in the study of the empirical distribution of speculative prices; and (4) its research would be particularly indicated in the futures markets where, a priori, the variability of prices tends to affect the trading volume.

Additionally, we tried to answer and reconcile three questions linked to the theory of copulas: which copula is the "right one" [120], which copula should be used [20], and why copulas have been successful in many practical applications [44]? Versatility is the key term that best defines a copula; therefore, the most appropriate copula for analyzing a particular issue is the one that best summarizes its implicit dependence structures. Hence, copulas have been so successful in different fields of study.

A first conclusion to be drawn from this work is that the potential of the theory of copulas could be significantly reduced if only certain types of copulas are systematically used in the analysis of bivariate time series. Precisely, this was the factor that caused a certain reluctance toward the use of copulas when the Gaussian copula [10] was employed massively in almost any scientific field, without considering either the intrinsic nature of the phenomena analyzed or that the use of a large number of copulas can substantially improve the knowledge of the different relations of dependence observable in a given dataset [50]. Thus, the SRA is not simply limited to the task of choosing and fitting the copula [121]; following the transdisciplinary perspective of econophysics, it constitutes a new framework for the analysis of the PVR, extrapolated from the field of hydrology, which is directly applicable to many other areas such as quantitative finance.

Since Karpoff [13], the PVR has been practically subsumed to the generalization of the significance of Pearson's linear correlation coefficient of the price and trading volume variables. However, this methodology provides an alternative, since the classical optimization methods applied to copulas often get trapped in local minima. The SRA is able to conveniently overcome this limitation by accurately describing the dependence structure of variables *P* and *V* and, importantly, by allowing the analysis of uncertainties given a determined time horizon (or length of record, see [8]). Another contribution of this work that may be important for future lines of research is the incorporation of copula probability isolines in the analysis of PVR, which brings this research closer to the multifractal models of Mandelbrot [22].

In our opinion, other future lines of research related to this work include, for example, the analysis of the role played by floating capital (outstanding shares vs. restricted shares) in the context of PVR, an aspect which has often been overlooked in the literature, or the rigorous enunciation and detailed compilation of those empirical stylized facts defining the price–volume time series, as well as the definitive consolidation of the works that have analyzed PVR from the perspective of the market microstructure of Garman [122]. This research could be gradually applied to the area of behavioral finance following the path of works such as Gomes [123], in which the analysis of PVR is directly connected to the prospect theory of Kahneman and Tversky [124].

**Author Contributions:** Conceptualization, methodology, software, writing—original draft preparation, P.A.M.C.; resources, writing—review and editing, funding acquisition, S.C.R.; investigation, data curation, supervision, M.d.C.V.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This paper was partially supported by the project "La sostenibilidad del sistema nacional de salud: reformas, estrategias y propuestas" (Ministry of Economy and Competitiveness, DER2016-76053-R).

**Acknowledgments:** The authors would like to thank P. Embrechts (ETH, Zürich) for making us aware of some limitations derived from the misuse of copulas in certain specific assets, as well as E. Ragno (University of California at Irvine) and M. Sadegh (Boise State University, Idaho), two coauthors of the SRA, who advised and encouraged us in the use of this methodology given its first implementation in the area of quantitative finance.

**Conflicts of Interest:** The authors declare no conflict of interest.


#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A. The Bayesian Perspective of the SRA Approach**

The Bayesian methodology constitutes one of the most recurrent approaches in economic and financial research, considered as an alternative third way to traditional perspectives. Previous studies [125,126] offer an extensive introduction to Bayesian methods applied to finance, which have found an important field of implementation in contexts characterized by high uncertainty, such as stress testing [121,127], risk management optimization, or the estimation of GARCH models in environments of extreme volatility [128,129]. Shemyakin [130] is essential for studying the Bayesian estimation of copulas based on the simplicity of Bayes' theorem (A1), that is, assigning to each model parameter its corresponding uncertainty and estimating its posterior distribution, the starting point of the SRA:

$$p(\theta|\tilde{Y}) = \frac{p(\theta)p(\tilde{Y}|\theta)}{p(\tilde{Y})},\tag{A1}$$

where *p*(*θ*), *p*(*θ*|*Ỹ*), *p*(*Ỹ*|*θ*) ≅ ℒ(*θ*|*Ỹ*), and *p*(*Ỹ*) = ∫<sub>*θ*</sub> *p*(*θ*) *p*(*Ỹ*|*θ*) d*θ* denote the prior distribution and the posterior distribution (of parameters), the likelihood function, and the so-called evidence, respectively. Under the hypothesis that error residuals are Gaussian-distributed with mean zero, uncorrelated, and homoscedastic, the likelihood function can be reformulated as:

$$\mathcal{L}(\theta|\tilde{Y}) = \prod\_{i=1}^{n} \frac{1}{\sqrt{2\pi\tilde{\sigma}^2}} \exp\left\{-\frac{1}{2}\tilde{\sigma}^{-2}[\tilde{y}\_i - y\_i(\theta)]^2\right\},\tag{A2}$$

and logarithmically transformed into the formula [8]:

$$\ell(\theta|\tilde{Y}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\tilde{\sigma}^2 - \frac{1}{2}\tilde{\sigma}^{-2}\sum\_{i=1}^{n} \left[\tilde{y}\_i - y\_i(\theta)\right]^2,\tag{A3}$$

where *σ̃* is an estimate of the standard deviation of the measurement error given by:

$$\tilde{\sigma}^2 = \frac{\sum\_{i=1}^n [\tilde{y}\_i - y\_i(\theta)]^2}{n},\tag{A4}$$

which allows us to simplify (A3) into:

$$\ell(\theta|\tilde{Y}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2} - \frac{n}{2}\ln\frac{\sum\_{i=1}^{n}[\tilde{y}\_i - y\_i(\theta)]^2}{n}.\tag{A5}$$

Finally, eliminating the constant terms of (A5), we would obtain a simplified equivalent log-likelihood function as:

$$\ell(\theta|\tilde{Y}) \approx -\frac{n}{2} \ln \frac{\sum\_{i=1}^{n} [\tilde{y}\_i - y\_i(\theta)]^2}{n}.\tag{A6}$$
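The step from (A3) to (A6) can be checked numerically. In the Python sketch below, the observed and simulated values are hypothetical, and the function names are introduced only for this illustration; the point is that the full and the simplified log-likelihoods differ by a constant that depends on *n* alone, so both rank candidate parameter sets identically.

```python
import math

def residual_variance(observed, simulated):
    """Estimate of the error variance as in (A4)."""
    n = len(observed)
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n

def log_likelihood_full(observed, simulated):
    """Gaussian log-likelihood (A3) evaluated with the variance from (A4)."""
    n = len(observed)
    sigma2 = residual_variance(observed, simulated)
    sse = sigma2 * n
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - 0.5 * sse / sigma2)

def log_likelihood_simplified(observed, simulated):
    """Simplified log-likelihood (A6), with the constant terms dropped."""
    n = len(observed)
    return -0.5 * n * math.log(residual_variance(observed, simulated))

# Hypothetical observations and two hypothetical model simulations.
obs = [0.10, 0.25, 0.40, 0.60, 0.85]
sim_a = [0.12, 0.24, 0.43, 0.58, 0.86]
sim_b = [0.20, 0.20, 0.50, 0.50, 0.70]
constant = -0.5 * len(obs) * math.log(2.0 * math.pi) - 0.5 * len(obs)
gap_a = log_likelihood_full(obs, sim_a) - log_likelihood_simplified(obs, sim_a)
gap_b = log_likelihood_full(obs, sim_b) - log_likelihood_simplified(obs, sim_b)
```

Both `gap_a` and `gap_b` equal `constant` up to rounding, which confirms that dropping the constant terms of (A5) does not alter the comparison between simulations.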

Once the data have been modeled according to any of the available copulas (Table 2), the SRA evaluates the goodness of fit using three different criteria: max-likelihood [131], Akaike information criterion (AIC) [132,133], and Bayesian information criterion (BIC) [134], taking the primary error residuals function as a reference under the assumption that, given a set of parameters, its maximum likelihood level completely minimizes the residuals between the model simulations and their linked observations. Notably, these assumptions are explicitly referred to the distribution of residual error that is applied to construct the likelihood function that summarizes the distance between the given observations and the prospective model simulations.
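The two information criteria can be illustrated with their standard definitions, AIC = 2*k* − 2ℓ and BIC = *k* ln *n* − 2ℓ, where *k* is the number of copula parameters and ℓ the maximized log-likelihood. The log-likelihood values, the sample size, and the candidate labels in this Python sketch are hypothetical and serve only to show how the penalty for additional parameters operates.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion for k parameters (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "one-parameter copula": (-512.4, 1),
    "three-parameter copula": (-511.9, 3),
}
n_obs = 1000  # hypothetical sample size
ranking_aic = sorted(candidates, key=lambda name: aic(*candidates[name]))
ranking_bic = sorted(candidates, key=lambda name: bic(*candidates[name], n_obs))
```

In this toy case the three-parameter candidate attains a marginally higher likelihood, yet both criteria place the one-parameter candidate first, since the gain in fit does not compensate for the two additional parameters.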

#### **Appendix B. Brief Insight into the Theory of Copulas**

The background of the theory of copulas can be traced to the works of Fréchet, Hoeffding, Menger, Féron, Gumbel, and Dell'Aglio, most of them analyzing the relationships between bivariate and trivariate distributions and their corresponding univariate marginal distributions. According to Sempi [135], the basis of the theory of copulas was established by Fréchet [136] and can be synthesized schematically according to the bounds of Fréchet [136] and Hoeffding [137].

An *n*-dimensional *copula C* is a multivariate distribution function on the *n*-dimensional hypercube [0, 1]<sup>*n*</sup> with uniformly distributed marginals.

Sklar's theorem [15] is the starting point for the construction, development, and modeling of a new class of functions (or dependence functions, according to Galambos [138]), which have been generically called copulas since Sklar [15], who named them using the Latin term *copulæ* ("couples") [4].

In short, this theorem [139] states that, given an *n*-dimensional random vector *X* = (*X*<sub>1</sub>, *X*<sub>2</sub>, …, *X<sub>n</sub>*) with joint distribution function *F* and marginal distribution functions *F*<sub>1</sub>, *F*<sub>2</sub>, …, *F<sub>n</sub>*, there exists an *n*-dimensional copula *C* such that *F*(*x*<sub>1</sub>, …, *x<sub>n</sub>*) = *C*(*F*<sub>1</sub>(*x*<sub>1</sub>), …, *F<sub>n</sub>*(*x<sub>n</sub>*)) for every (*x*<sub>1</sub>, *x*<sub>2</sub>, …, *x<sub>n</sub>*) ∈ ℝ<sup>*n*</sup>; Equation (A7) expresses the same relation in inverse form. For absolutely continuous distributions, the copula *C* is unique.

Conversely, if *C* is the *n*-dimensional copula corresponding to a multivariate distribution function *F* with marginal distribution functions *F*1, *F*2,..., *Fn*, then *C* can be expressed as:

$$C(u\_1, \ldots, u\_n) = F(F\_1^{-1}(u\_1), \ldots, F\_n^{-1}(u\_n)) \tag{A7}$$

and its copula density or probability function is given by:

$$c(u\_1, \ldots, u\_n) = \frac{f(F\_1^{-1}(u\_1), \ldots, F\_n^{-1}(u\_n))}{f\_1(F\_1^{-1}(u\_1)) \cdots f\_n(F\_n^{-1}(u\_n))}.\tag{A8}$$

If the joint distribution function is *n* times differentiable, the partial derivatives of order *n* can be calculated in (A7), by obtaining:

$$\begin{aligned} f(\mathbf{x}) &\equiv \frac{\partial^n}{\partial x\_1 \partial x\_2 \cdots \partial x\_n} F(\mathbf{x}) = \prod\_{i=1}^n f\_i(x\_i) \times \frac{\partial^n}{\partial u\_1 \partial u\_2 \cdots \partial u\_n} C(F\_1(x\_1), F\_2(x\_2), \dots, F\_n(x\_n)) \\ &\equiv \prod\_{i=1}^n f\_i(x\_i) \times c(F\_1(x\_1), F\_2(x\_2), \dots, F\_n(x\_n)), \end{aligned}\tag{A9}$$

from where:

$$\log f(\mathbf{x}) = \sum\_{i=1}^{n} \log f\_i(x\_i) + \log c(F\_1(x\_1), F\_2(x\_2), \dots, F\_n(x\_n)). \tag{A10}$$

That is, the joint density function is equal to the product of the marginal densities and the density of the copula (represented by *c* [27]), from which it follows that the joint log-likelihood is equal to the sum of the univariate log-likelihoods and the copula log-likelihood, a feature of extreme utility for the parametric estimation of multivariate models. Therefore, according to Sklar's theorem (A7) and considering the equivalence relation (A9), given any pair of variables *X* and *Y* with respective marginal distributions *u* = *F*(*x<sub>t</sub>*) and *v* = *G*(*y<sub>t</sub>*) and joint distribution function *J*(*x<sub>t</sub>*, *y<sub>t</sub>*), there is a copula *C* for all (*x<sub>t</sub>*, *y<sub>t</sub>*) in ℝ<sup>2</sup>, which relates them according to the equation:

$$J(x\_t, y\_t) = C(F(x\_t), G(y\_t)).\tag{A11}$$

Again, calculating the partial derivatives in both terms of Equation (A11), we obtain:

$$\frac{\partial^2 J(x\_t, y\_t)}{\partial x\_t \partial y\_t} = \frac{\partial^2 C(F(x\_t), G(y\_t))}{\partial F \partial G} f(x\_t) g(y\_t),\tag{A12}$$

which allows us to model the marginal distributions and the dependence structure between the variables separately from a certain copula [95].
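As a worked illustration of (A12), consider the Farlie–Gumbel–Morgenstern copula, chosen here only because its density has a simple closed form (the choice of copula and the parameter *θ* are assumptions made for this example, not results of this study):

$$C(u, v) = uv\,[1 + \theta(1-u)(1-v)], \qquad |\theta| \le 1,$$

$$c(u, v) = \frac{\partial^2 C(u, v)}{\partial u\,\partial v} = 1 + \theta(1-2u)(1-2v).$$

Substituting *u* = *F*(*x<sub>t</sub>*) and *v* = *G*(*y<sub>t</sub>*) gives the joint density

$$\frac{\partial^2 J(x\_t, y\_t)}{\partial x\_t\,\partial y\_t} = \left[1 + \theta\,(1 - 2F(x\_t))(1 - 2G(y\_t))\right] f(x\_t)\, g(y\_t),$$

so that the marginal densities *f* and *g* enter only as multiplicative factors, while the dependence is carried entirely by the bracketed copula term; for *θ* = 0 the two variables are independent.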

Thus, Sklar's theorem implies that the dependence relation between different variables can be completely reduced to the construction of a copula, a process that can be summarized in two consecutive steps [51,139]: (1) identification of the associated marginal distributions, and (2) selection of a copula that appropriately represents the interrelations between the variables, so that the dependence between *n* random variables *X*<sub>1</sub>, *X*<sub>2</sub>, …, *X<sub>n</sub>* is theoretically explained in its entirety from its joint distribution function *F*(*x*<sub>1</sub>, …, *x<sub>n</sub>*) = ℙ[*X*<sub>1</sub> ≤ *x*<sub>1</sub>, …, *X<sub>n</sub>* ≤ *x<sub>n</sub>*] [56].
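A nonparametric version of these two steps can be sketched in Python. The rank-based marginals (pseudo-observations) and the empirical copula are standard devices, but their use here, the function names, and the six price and volume values are assumptions made for illustration; ties are ignored for simplicity, and no claim is made that the SRA toolbox proceeds in exactly this way.

```python
def pseudo_observations(values):
    """Step (1): map a sample to (0, 1) through its ranks, u_i = rank_i / (n + 1).
    Ties are not treated specially in this sketch."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

def empirical_copula(u, v, a, b):
    """Step (2), nonparametric benchmark: share of observations with
    u_i <= a and v_i <= b, against which parametric copulas can be compared."""
    n = len(u)
    return sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / n

# Hypothetical price and volume observations (illustrative only).
price = [101.2, 99.8, 103.5, 104.1, 102.0, 98.7]
volume = [5.1, 6.3, 4.8, 4.2, 5.0, 7.9]
u = pseudo_observations(price)
v = pseudo_observations(volume)
c_low_low = empirical_copula(u, v, 0.5, 0.5)
```

In this toy sample, low prices never coincide with low volumes, so `c_low_low` is zero, whereas under independence the value would be close to 0.25; a parametric copula is then selected so that its probabilities track such empirical values as closely as possible.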

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
