Next Article in Journal
Grain Size Distribution of Bedload Transport in a Glaciated Catchment (Baranowski Glacier, King George Island, Western Antarctica)
Previous Article in Journal
Flow Velocity Effects on Fe(III) Clogging during Managed Aquifer Recharge Using Urban Storm Water
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Copula-Based Stochastic Simulation for Regional Drought Risk Assessment in South Korea

1
Department of Agricultural and Rural Engineering, Chungbuk National University, 28644 Cheongju, Korea
2
Postdoctoral Research Associate, K-Water Research Institute, Daejeon 34045, Korea
*
Authors to whom correspondence should be addressed.
Water 2018, 10(4), 359; https://doi.org/10.3390/w10040359
Submission received: 21 February 2018 / Revised: 16 March 2018 / Accepted: 19 March 2018 / Published: 22 March 2018

Abstract

:
In South Korea, meteorological droughts are becoming frequently-occurring phenomena in different parts of the country, because precipitation varies significantly in both space and time. In this study, the quantiles of four identified homogeneous regions were estimated by incorporating major drought variables (e.g., duration and severity) based on the Standardized Precipitation Index (SPI). The regional frequency analysis of drought was performed by evaluating a variety of probability distributions and copulas, using graphical comparisons and goodness-of-fit test statistics. Results indicate that the Pearson type III (PE3) and Kappa marginal distributions, as well as Gaussian and Frank copulas, are better able to simulate the drought variables across the region. Bivariate stochastic simulation of selected copulas showed that the behavior of simulated data may change when the degree of association (e.g., Kendall’s τ) between the drought variables was considered. Results showed that the south-west coast and east coastal areas are under high drought risk, and inland mid-latitude areas (surrounding areas of Yeongju station) and northwest parts are under low drought risk. The joint distributions were used to compute conditional probabilities, as well as primary, secondary, and conditional return periods, which can be useful for designing and managing water demand and the supply system on a regional scale.

1. Introduction

The United Nations classified South Korea as water-deficit country, and recently, South Korea has faced serious droughts and water deficit problems [1]. Historical climate records confirm the presence of severe droughts across South Korea [2,3]. In addition, it is reported that the temperature of South Korea increased throughout the 20th century [4]. Since anthropogenic climate change is expected to increase in future, South Korea will be vulnerable to extreme drought events. It is reported that changes in climate extremes have an adverse effect on water resources of South Korea [5].
The stochastic nature of drought should be expressed by different correlated random variables, such as drought duration and severity. A bivariate distribution is needed to express those correlated drought variables. In the case of flood studies, there are a variety of bivariate distributions applied to obtained joint distribution, such as bivariate normal distribution [6], bivariate exponential distributions [7], and bivariate gamma distributions [8]. However, these bivariate distributions are unable to characterize the individual behavior of random variables using the same parametric family of univariate distribution [9]. In the case of drought studies, drought characteristics are assumed to be independent, identically random variables [10], or assumed to be derived from the same univariate distribution and have explicit bivariate forms (e.g., bivariate forms of normal, exponential, or gamma distributions) [6,7,8]. However, usually, none of the above-stated assumptions are satisfied, because of the presence of high correlation between drought variables, and the fact that those variables may belong to different univariate distributions. The copula theory proposed by Cancelliere [11] can solve the above-stated problems. The copula can preserve the dependence structure, and is able to jointly simulate different univariate distribution characteristics of drought variables.
When the record length of available meteorological data is much shorter than the return period of interest, the local frequency analysis is difficult to estimate. The regional frequency analysis is used to cope with this lack of data. It is based on the assumption that the available data is transferred from the different meteorological stations within the same homogeneous region into a site with little or no data available. The regional frequency of drought at various levels of duration and severity is critical to understanding the nature of drought risk, and helping us to establish comprehensive and integrated drought management strategies. However, univariate frequency analysis does not provide an accurate probability of extreme droughts if the underlying drought event is described by more than one random variable (duration and severity, in this study). Therefore, the frequency analysis may lead to an incorrect estimation of drought risk [12]. Consequently, the multivariate statistical approach is becoming more popular in recent studies [13,14,15].
Most of the studies having the application of copulas in hydrology are particularly focused on the bivariate modelling, to simulate the dependence between the variables. Shiau [16] used bivariate copulas to determine the joint distribution between drought duration and severity. In a later study, Shiau [17] used the Clayton copula for a bivariate frequency analysis of drought severity and duration at the Abadan and Anzali gauge stations in Iran. The results showed that drought severity is expected to increase in the region if the high fluctuation in the rainfall continues. Kao [18] used copulas to perform a spatial and temporal drought analysis for the Midwestern U.S., and adopted a copula-based joint deficit index (JDI) for describing overall drought status, comparing it to the Palmer drought severity index. Furthermore, Kao later [19] indicated that the JDI index has the tendency to take into account the seasonality of precipitation and streamflow margins.
The precipitation patterns in South Korea have complex spatial and temporal variations, because of the coincident typhoon season in the western North Pacific [20,21,22]. Due to complicated mountainous terrain and climatic features, analyzing the spatial characteristics of drought are the main obstacles in drought analysis. Identification of drought-prone areas using drought risk analysis has direct relevance to both the water industry and the environmental demands of the river system. In this study, regional drought risk analysis is performed using univariate probability distributions and bivariate elliptical and Archimedean copulas, to identify the areas vulnerable to drought hazard. In addition, various types of drought risks were compared, to provide useful information for decision makers and to help them to reduce the impact of drought at the local level.

2. Materials and Methods

2.1. Study Area and Data

The Korean Peninsula is located between a northeastern part of the Asian continent, between 33–43° N and 124–131° E (Figure 1). South Korea has more than 18,797 reservoirs, which are regularly monitored for a proper supply of agricultural water and flood control. In some dry regions, precipitation is less than 1000 mm, due to topographical effects, and many parts of South Korea are characterized by a precipitation range of 1200 to 1400 mm, which is about 30% greater than the worldwide average of 973 mm [23].
The Korean Meteorological Administration (KMA) database was used to extract the monthly precipitation data for 70 rainfall stations across South Korea. The randomness of the monthly precipitation data was tested using homogeneity, the absence of the artificial trends, and spurious auto-correlation tests. Three non-parametric tests, as explained in [24], including the Mann-Whitney homogeneity test, Mann-Kendall trend test, and Kendall’s autocorrelation test [25], were tested and evaluated for all stations. The test results indicated that more than 15 stations needed to be eliminated, because of low-quality data and more than 5% missing values. The remaining 55 rainfall stations across South Korea, covering more than 35 years (1980–2015) of data, were used for further regionalization analysis. The Maintenance of Variance Extension (Move4) technique, as explained in [26], was used to fulfill the gap of missing values. This method is selected because extended records are generated while maintaining the variance of the data series.

2.2. Drought Identification

The Standardized Precipitation Index (SPI) proposed by Mckee [27] was used in this study for identifying the duration of drought events and to evaluate their severity. The SPI was computed based on fitting long-term rainfall data to the probability distribution on any desired timescale, such as 1, 3, 6, 12, and 24 months. It was found that the gamma distribution fit more closely to the precipitation data of candidate stations (55 total) across South Korea [28]. Since there are a number of zero-bounded continuous variables in climatology, it is important to give a distribution which may be used for such variables. The gamma distribution, which has a zero lower bound, has been found to fit several such variables well [29]. Therefore, gamma distribution has successfully been applied for drought studies by many researchers in South Korea [1,30].
In this study, the SPI timescale was estimated using a timescale of 6 months, because it matches well with the dry and wet alternations in South Korea, and has successfully been applied for drought monitoring in South Korea [1]. The duration of any drought event is defined as the time when the SPI values I continuously remain below the predefined truncation level [31]. The severity of any drought event indicates the cumulative deficiency below the truncation level, defined as S =   i = 1 D | I i | (Figure 2), as described in [31]. In this study, the theory of run analysis was performed using the predefined truncation level of −0.99.
Figure 3 shows the spatial variability of mean drought duration (Figure 3a) and mean drought severity (Figure 3b) from 1980 to 2015, using SPI-6 across South Korea. Droughts of the longest mean duration (3.1 to 3.4 months) were recorded at the southwest coast (areas surrounding Jangheung, Goheung, Haenam, and Wando stations) and mid-latitude inland areas (areas surrounding Jecheon and Gumi stations). Figure 3a shows that droughts of highest severities were recorded at the southwest coast (areas surrounding Jangheung, Goheung, Haenam, and Mokpo stations) and the areas surrounding Sancheong and Jecheon stations (Figure 3b). It can be seen that the droughts of longest mean duration mostly correspond to the droughts of highest mean severities, and thus drought duration and severity are correlated variables.

2.3. Cluster Analysis and Testing of Regional Homogeneity

A more robust and statistic based clustering algorithm, the hierarchical classification on principal components (HCPC), was applied for the delineation of homogeneous regions. The regionalization of drought variables was performed, considering the topographical variables such as latitude, longitude, and elevation above sea level (m), as well as climatological variables, such as mean annual precipitation (mm), mean daily maximum temperature (°C), mean daily minimum temperature (°C), annual evaporation (mm/year), and mean relative humidity (%) (refer to [32]). The HCPC clustering algorithm proposed in this study dealt with the above-stated eight topographical and climatological variables, which are likely to affect the drought mechanism in the region. Clusters formed by HCPC algorithm were validated using distance-based validating indices (e.g., connectivity, silhouette width, Dunne index, and Calinski and Harabasz index). A detailed explanation about the bivariate discordancy and homogeneity tests applied for regionalization of drought across South Korea is provided in [32].

2.4. Selection of Regional Distribution Models

After making sure that identified regions are homogeneous, it is necessary to identify the probability model for each of the drought variables; in our case, the drought variables are duration and severity. To accomplish this task, five candidate probability distributions, such as generalized logistic (GLO, three parameters), generalized Pareto (GPA, three parameters), generalized extreme value (GEV, three parameters), generalized normal (GNO, three parameters), and Pearson type III (PE3, three parameters) were considered in this study.
In this study, the L-moment ratio diagrams and a goodness-of-fit measure (the Z statistic) were used to identify the best-fitted marginal distribution, as recommended by many researchers [33,34,35,36]. The Z statistic for a particular distribution [37] can be expressed as follows.
Z =   ( t ¯ 4 R   t 4 dist ) σ t 4
where t ¯ 4 R indicates regional average L-kurtosis, σ t 4 indicates the standard deviation of t ¯ 4 R , and t 4 dist indicates the L-kurtosis of the fitted distribution. The best-fitted probability distribution is selected based on the smallest value of | Z | . The probability distribution function, with the | Z | 1.64 at a significant level of α = 10 % , is considered acceptable for representing the sample data [35].

2.5. Dependence Measures

Dependence among drought variables was analyzed using both qualitative and quantitative approaches. Qualitative dependence was assessed using graphical diagnostic tools, such as Chi and Kendall plots, whereas quantitative dependence was assessed using Pearson’s linear correlation r, Spearman’s rank correlation ρ, and Kendall’s τ. The detailed description of the Chi and Kendall plot are provided in [14].

2.6. Copula Function

Copula helps us to construct a joint distribution function of univariate random variates and be able to capture the dependence between two variables. Let X = (X1, X2, ..., XN) indicate a random vector with a continuous marginal distribution function (CDFs) F1, F2, …, FN. Sklar [11] describes the relationship between CDF H(X) copula function C, which can be written as described in [38]:
H ( X ) =   C ( F 1 ( X 1 ) ,   F 2 ( X 2 ) , F n ( X n ) )   X R n
where the unique function C : [ 0 , 1 ] d [ 0 , 1 ] is called a d-dimensional copula. The multivariate joint distribution model for H can be constructed in two parts: (1) computation of the marginal CDFs (F1, F2, …, FN) and (2) the computation of the copula model (C).
In this study, the maximum pseudo-likelihood (MPL) approach was used to compute the parameters of copula family [38]. Candidate copula families used for the analysis were the elliptical and Archimedean copula, as shown in Table 1. Elliptical copulas include normal (Gaussian) and Student’s t, and Archimedean copulas include Clayton, Gumbel, Joe, and Frank. The best-fitted copula was chosen using parametric the bootstrap-based Cramér–von Mises test (Sn) [39] and Akaike information criterion (AIC) [40] goodness-of-fit. Bivariate joint distributions were estimated using the Copula package in R programming [41].

2.7. Conditional Probability

Copula-based joint distribution is necessary to derive the conditional probability distributions of drought duration and severity. The conditional probability of drought severity with drought duration exceeding various threshold values is indicated by d , and the conditional probability of drought duration with drought severity exceeding various threshold values is indicated by s . The corresponding conditional probabilities can be expressed as follows [42].
P ( S s | D d ) = P ( D d ,   S s ) P ( D d ) = F ( s ) + F ( d ,   s ) 1 F ( d ) = F ( s ) C ( F ( d ) ,   F ( s ) ) 1 F ( d )
Similarly
P ( D d | S s ) = P ( D d ,   S s ) P ( S s ) = F ( d ) F ( d ,   s ) 1 F ( s ) = F ( d ) C ( F ( d ) ,   F ( s ) ) 1 F ( s )

2.8. Return Period in a Bivariate Framework

2.8.1. Primary Return Periods

Estimation of the return period has special importance in the planning and management of water resources. Let us suppose that D and S denote the drought duration and severity, respectively—then, the univariate return period can be computed using the procedure described in [15,43]:
T D =   μ t 1 F ( d )
T S =   μ t 1 F ( s )
T D and T S indicate the return period for drought duration and drought severity, respectively. F ( d ) and F ( s ) indicate the cumulative distribution functions of drought duration and severity, respectively. In this study, μ t   can be calculated using the theory of run and the Markov theorem [42]:
μ t =   1 P DW +   1 P WD
where P DW = p ( SPI t 0.99 | SPI t 1 0.99 and P WD =   p ( SPI t > 0.99 | SPI t 1 0.99 ) . Here, the unit of μ t is months. There are two possible cases for the derivation of return periods as described by [42]: (1) the return period for D ≥ d and S ≥ s and (2) the return period for D ≥ d or S ≥ s. The copula-based joint return period in the above stated cases can be derived as follows:
T ^ D ,   S =   μ t P ( D     d ,   S     s ) = μ t 1 F ( d )   F ( s ) +   F   ( d ,   s ) =   μ t 1 F ( d )   F ( s ) +   C ( F   ( d ) ,   F ( s ) )
T ˇ D ,   S =   μ t P ( D     d   or   S     s ) =   μ t 1 F   ( d ,   s ) =   μ t 1   C ( F   ( d ) ,   F   ( s ) )
where C ( F   ( d ) ,   F   ( s ) ) indicates the copula-based joint distribution function of the drought duration and severity, respectively. In this study, sign ∧ indicates “and”, and sign ∨ indicates “or”. Additionally, “or” is also known as standard return periods.

2.8.2. Secondary Return Period

The secondary return period provided a precise indication for performing risk analysis, and may also yield useful hints for doing numerical simulations [13,44,45]. In this study, the secondary return period was adopted for the evaluation of drought risk within South Korea, and was compared with the primary return periods computed using Equations (5) and (6). The occurrence probability should be estimated for the calculation of secondary or Kendall’s return period. K C , denoting Kendall’s distribution, can be computed following K C ( t ) = P ( C ( F   ( d ,   s ) t ) ) . The related secondary return period can be derived as follows [32]:
T D ,   S * =   μ t 1 K C ( t ) =   μ t 1   P ( C ( F   ( d ,   s ) t ) )
where t indicates the critical probability level and K C indicates the Kendall distribution function. The detailed description of the K C function for other Archimedean and elliptical copula families are available in [15,38,45].

2.8.3. Conditional Return Period

The concept of a conditional return period has special importance, especially for particular conditional drought events that are of interest in our applications. For instance, the evaluation of the risk for malfunction of a specific water resources system needs to consider the drought event, at the given drought duration D, when drought severity equals or exceeds a certain design value s, and vice versa.
The return period of drought duration and severity is derived conditionally, following the expression proposed by [42]. For example, the conditional return period for D given S ≥ s, and the conditional return period for S given D ≥ d:
T D | S     s =   T S P ( D     d ,   S     s ) = μ t 1   F ( s ) × 1 1 F ( d )   F ( s ) +   F   ( d ,   s ) =   μ t [ 1   F ( s ) ] [ 1 F ( d )   F ( s ) +   C ( F   ( d ) ,   F ( s ) ) ]
T S | D     d =   T D P ( D     d ,   S     s ) = μ t 1   F ( d ) × 1 1 F ( d )   F ( s ) +   F   ( d ,   s ) =   μ t [ 1   F ( d ) ] [ 1 F ( d )   F ( s ) +   C ( F   ( d ) ,   F ( s ) ) ]
T D | S     s indicates the conditional return period of drought duration when drought severity exceeds a certain level, and T S | D     d indicates the conditional return period of drought severity when drought duration exceeds a certain level.

3. Results and Discussion

3.1. Bivariate Regionalization of Drought in South Korea

The HCPC algorithm was applied for the formation of initial regions, using topography and climate-based site characteristics of each station. However, regions obtained by the HCPC algorithm are not statistically homogeneous. Univariate and bivariate discordancy and homogeneity tests results showed that Tongyeong station in Region I, as well as the Ganghwa, Jeonju, and Boeun stations in Region III, were identified as discordant sites. Following methodology proposed by [37], the stations shifted from one region to another, in order to improve the homogeneity of the regions. The spatial distribution of the final identified homogeneous regions is presented in Figure 4. Region II, located at the mid-latitude inland of South Korea, is the smallest region compared to the others, as it contains only five stations. Moderate drought attributes observed in Region II may be because of its location away from the coastal areas, making it thus less affected by summer typhoons, which occur mostly at the coastal areas. Table 2 showed that all the regions fulfill the criteria of homogeneity. The detailed results of the bivariate homogeneity tests are provided in [32]. Since regional drought frequency analysis is based on the extreme values of drought variables (duration and severity), basic statistics such as mean, maximum, minimum, standard deviation, skewness, and kurtosis were computed for four homogeneous regions (Table 3). The longest drought duration of 13 months was recorded at Jecheon station in Region I, and the highest severity of 22.30 was recorded at Yeongju station in Region II.

3.2. Identification of Regional Marginal Distribution

The L-moment ratio diagram, in the case of drought duration for the four identified homogeneous regions, is shown in Figure 5. Among the five candidate probability distributions, the PE3 distribution line passed through the small portion of the observed data, in case of Regions I and III, and passed away from the observed data in the case of Regions II and IV. However, it is noted that the regional average point is always located away from the PE3 distribution line. The Z statistic value showed that none of the candidate probability distributions satisfied the criteria that | Z | 1.64 (Table 4). This may be due to a large number of ties, especially with the short drought duration. Therefore, for drought duration, a more robust Kappa distribution was selected, as recommended by [37,46,47,48]. In the case of drought severity, the L-moment ratio diagram (Figure 6) and Z statistics (Table 4) showed that except for Region I, all other regions can be modeled with PE3 distribution. For Region I, a four-parameter Kappa distribution was selected.
The visual comparison between empirical and theoretical probability distribution functions (PDFs) and their corresponding cumulative distribution functions (CDFs) for the drought duration and severity of each region is presented in Figure 7. CDFs and PDFs were computed using Kappa distribution for Regions I, II, III, and IV, in the case of drought duration, and for Region I in the case of drought severity. The drought severity of Regions II, III, and IV were computed using PE3 distribution. Here, the empirical cumulative probability was calculated by using plotting position formula p ( X x i ) =   i 0.35 n , proposed by Hosking [49]. Here, i denotes the rank of the observations in ascending order, and n denotes sample size. The derived theoretical CDFs and PDFs show good agreement with the empirical ones, which indicates that these probability distributions perform well with the observed data. The CDFs of the final best-fitted probability distributions and their corresponding parameters are given in Table 5.
The parameters of each distribution were estimated using the L-moment method. Before fitting the copula model, site-specific scaling factors were used to standardize the observations from the pooled sites [50]. In this study μ i ,   S and μ i ,   D denote the site-specific scaling factor for drought duration and severity, respectively. This approach helped us to increase statistical reliability of the calculations, especially at the sites having small a length of records. The site-specific scaling factor for drought duration and severity of each region and the number of drought events recorded at each station are shown in Table 6. The highest number of drought events (34) were recorded at Tongyeong station. This can be correlated with unusual precipitation patterns in southern coastal areas. This is because of the major contribution of typhoons to the seasonal (particularly summer) precipitation patterns and convective systems within the air mass in southern coastal areas [51]. A study based on spatial patterns of trends in summer precipitation showed a significant increasing trend in amount and intensity of precipitation at southeast coastal areas [20].

3.3. Application of Copulas

Prior to fitting the bivariate copulas, it is important to examine the dependence structure between the drought duration and severity. In this study, quantitative dependence was assessed using three correlation coefficient measures, such as Pearson’s linear correlation r, Spearman’s rank correlation ρ, and Kendall’s τ. The Student’s t-test at the significance level of 0.05 was used to test the statistical significance of the correlation values. Since Pearson correlation coefficient can only indicate the linear dependence between the variables, it may not be useful for heavy-tailed variables and may be affected by the outliers. Therefore, Kendall’s τ was used to describe the wider class of dependencies and to cope with the outliers [52]. Moreover, rank-based evaluation of the correlation between drought duration and severity was also presented in Table 7. The three correlation coefficient test results showed that there is a statistically significant positive dependence between the drought variables for all regions.
The qualitative dependence between drought variables were accomplished using graphical diagnostic tools. The pair-wise dependence pattern of drought variables, using Chi and Kendall plots for each region, is presented in Figure 8. According to [53,54], the two variables could be considered as independent if the majority of the events lay within the confidence band of the Chi plots. In the case of the Chi plot, a strong deviation from the control point was observed for all regions. All the points fell near or away from the confidence band of the Chi plots. Furthermore, most of the drought event values were higher than the median value (positive lambda values). In the case of the Kendall plots, two variables could be considered as independent if the majority of the event lay on the diagonal line. Events above the diagonal line indicate positive dependence, and event below the diagonal line indicate the negative dependence. The results of the Kendall plots showed a strong deviation from the diagonal line for all regions. This indicates the presence of strong positive association between drought variables.
Since the qualitative and quantitative dependence measures indicated significant positive correlation between the drought variables, and goodness-of-fit tests indicated the best-fitted probability distributions, a copula can be employed to simulate the drought variables using joint distribution. Elliptical and Archimedean copula families, described in Section 2.6, were fitted and compared to find the best-fitted copula. The Ali-Maikhail-Haq copula is also the part of Archimedean copula family, but was not considered in this study, because the application of this copula needs Kendall’s τ value to be within the range of [−0.1817, 0.333] [38]. Furthermore, the six copula models were fitted using the MPL method, where the parameters were estimated using either Kappa or PE3 distribution. The performance of the copula families was compared using the Cramér–von Mises test (Sn). The test statistic Sn and its associated p-value can be estimated using either parametric bootstrap [39,55] or by means of a multiplier approach [56,57].
The drawbacks of the parametric bootstrap are that it has very high computational cost, as each iteration requires both random number generation from the fitted copula and the estimation of copula parameters [58]. Additionally, the parametric bootstrap method becomes prohibitive with an increase in the sample size. Therefore, a fast large-sample testing procedure based on the multiplier approach was preferred in this study. The Sn and p-value were computed using the multiplier iteration of 10,000, with the sample length as the historical data (Table 8). The Sn statistic was computed on the basis of the probability integral transformation of Rosenblatt [59]. In addition to this, the AIC criteria as described in [40] was also used to choose the appropriate copula model. Neither the Joe copula in Regions I, III, and IV nor the Gumbel copula in Region III was able to pass the Sn test at 90% significance level. Since the difference in the Sn statistic between the six copula values was very small, additional AIC statistics were also considered. Therefore, if the Sn statistics for the two copulas were the same, then the copula having a lower value of AIC was preferred. The Gaussian copula was identified as the best copula for Regions I and IV, and the Frank copula for Regions II and III are shown in Table 8 (bold). Estimated parameters using the MPL approach for each copula family are also presented in Table 8.
Overall, the AIC and Sn statistics indicate the different aspects of the model under study. The Sn statistic was used as the formal hypothesis test, as a criterion to rank the copulas according to a predefined significance level, whereas AIC is a relatively simple likelihood criteria-based method, where selection is only a matter of choosing the models with the lowest likelihood value. Results did not show any significant correlation between the two statistical tests; this may be because the numerical tests tend to narrowly focus on a particular aspect of the relationship between the empirical and theoretical copulas, and often try to compress the information into a single descriptive number (see e.g., [60]). The second reason is that the test has been successfully applied to at-site frequency analysis, where the total sample size is very small compared to regional drought frequency analysis. Large sample sizes can affect the performance of an Sn goodness-of-fit test statistic, as described in [39]. Comparing the method of computation, the p-value in the Sn test statistic is complicated to implement and computationally intensive; however, the AIC has a relatively low computational cost.
Even though the AIC and Sn goodness-of-fit test statistics accepted the commonly used copulas in drought studies, the graphical comparison of the empirical and theoretical copulas may not show good agreement. Therefore, joint CDFs and PDFs for drought duration and severity were estimated using the two best-fitted copulas (Gaussian and Frank) mentioned in Table 8, and are shown in Figure 9. Joint CDFs in Figure 9 show the visual comparison between empirical and theoretical copula functions for drought duration and severity. The red dashed contour line shows the empirical copula obtained through C n ( u ,   v ) =   1 / n i = 1 n 1 ( R i ( n + 1 )   u ,   S i ( n + 1 ) v ) ,   where u ,   v   [ 0 ,   1 ] ,   R i and S i denote the ranks of the ordered sample. The solid contour line represents the theoretical copula. Results indicate that the Gaussian and Frank empirical copulas from the Regions I and II, respectively, matched well with the theoretical copulas.
However, the Frank and Gaussian empirical copulas from Regions III and IV, respectively, showed some deviations between the theoretical copulas, especially at low probability levels. This may be due to discrete characteristics of drought duration and the relative low accuracy of marginal distribution. Furthermore, the estimation of the probabilities of interest is affected by the majority of “ties” (i.e., identical pairs of drought duration and severity) at the lower probability level. It can be observed from Figure 9 that at high probability levels, all regions showed good agreement between empirical and theoretical copula values. The upper and lower tail dependencies in the case of both the Gaussian and Frank copulas is 0.

3.4. Conditional Probability

The copula-based joint distribution of drought duration and severity can provide very useful information for regional drought management. For illustration, the probability when drought duration and severity exceeds a threshold value, which can act as a special condition for a specific water demand and supply system, and may help to propose drought mitigation plan in the region. This probability can be easily derived with the help of copula-based joint distribution.
P ( D d ,   S s ) =   1 F ( d )   F ( s ) +   F   ( d ,   s ) =   1 F ( d )   F ( s ) + C ( F   ( d ) ,   F ( s )
For example, if d = 3 months and s = 4 , as the threshold condition for water demand and the supply system of Region I, then the Kappa distribution would be F ( d ) =   0.67 and F ( s ) =   0.75 and the Gaussian copula would yield F   ( d ,   s ) =   C ( F   ( d ) ,   F ( s ) =   0.62 . Therefore, the probability that drought duration and severity exceeding three months and 2, respectively, will be equal to 0.20.
Water resource managers are interested to know about the conditional probability P ( S s | D d ) of drought severity, given that drought durations exceeding the threshold values of d can be computed using Equation (4), and the conditional probability P ( D d | S s ) of drought duration, given drought severity exceeding threshold values of s , can be computed using Equation (5). In this study, d and s took the threshold levels of the 25th, 50th, 75th, and 95th percentile, shown in Figure 10 and Figure 11. Both cases of conditional probability evaluated for the four regions ( P ( S s | D d )   and   P ( D d | S s ) ) showed an increasing trend in the skewness of the conditional probability curves. Based on the computed conditional probability curves, water resource managers can extract useful information; for example, in the case of Region IV, the probabilities for a drought severity of less than 2 or 4, given a drought duration exceeding two months (50th percentile), will be equal to 0.235 and 0.785, respectively (Figure 10). Similarly, the probabilities for a drought duration less than two or four months, given drought severity exceeding 1.97 (50th percentile), will be equal to 0.331 and 0. 853, respectively (Figure 11).
Conditional probability curves also help us to evaluate and compare the drought risk between the regions. For example, for a drought severity of less than 3, given a drought duration exceeding three months (75th percentile), corresponding conditional probability ( P ( S s | D d ) ) values estimated for Regions I, II III, and IV are 0.532, 0.641, 0.612 and 0.596, respectively. This information can serve as basis to evaluate the capability of water demand and the supply system when considering different drought events, and can be used to trigger a drought contingency plan on regional scale.

3.5. Stochastic Simulation of Copulas and Bivariate Return Periods

The simulation of the copulas was performed using 5000 random samples generated from the Gaussian copula for Regions I and IV, and from the Frank copula for Regions II and III, as presented in Figure 12. It can be observed that the simulated data (grey dots) for all regions fitted well with the observed data (red dots) of drought variables. A large number of “ties” (i.e., identical pairs of drought duration and severity) were observed at drought durations of one and two months, and the concentration of observed data decreased with an increase in drought duration. Simulated drought data also showed a similar trend, as it closely resembled that of the observed sample. Pearson’s r, Kendall’s τ, and Spearman’s rank ρ correlation coefficients of simulated drought data for Region I were 0.913, 0.748 and 0.916, respectively; for Region II they were 0.910, 0.839, and 0.967, respectively; for Region III, they were 0.925, 0.846, and 0.970, respectively; and for Region IV they were 0.914, 0.749, and 0.916, respectively. Among the three correlation coefficients, Kendall’s τ showed more variation from one region to another. Figure 12 shows that the simulated data tend to concentrate along the main diagonal with the higher Kendall’s τ value (Regions II and III, with Kendall’s τ values of 0.839 and 0.846, respectively), and tend to disperse along the main diagonal with the lower Kendall’s τ value (Regions I and IV with Kendall’s τ values of 0.749 and 0.748 and, respectively). This shows that the behavior of a bivariate drought sample may change when the degree of association (e.g., Kendall’s τ) between the variables is considered. These results are in accordance with the findings of [61] by using Frank and Gumbel copulas. The contour lines of joint return (standard) periods for 5, 10, 20, 50, 100, 200, and 500 years, computed using Equation (9), are shown in Figure 12. Since the different combination of the correlated drought variables (duration and severity) can occur at the same time, the return periods are shown using contour lines, together with observed and simulated samples.
In drought frequency analysis, the droughts of longer duration and highest severities were always more important, because they can affect water resource planning and may pose a high risk to agriculture. Therefore, historical long-lasting drought events were identified for each region. In the case of Region I, the longest drought event recorded among 55 stations lasted for 13 months (Table 3), at Jecheon station. It lasted from April 2008 to April 2009, with the corresponding severity of 16.04; the joint return period of this event is more than 1000 years. The second longest drought event was recorded at Suncheon and Miryang stations, with the duration of 10 months, from April 1988 to January 1989; their corresponding severities were 16.98 and 16.46, respectively, and the joint return period was closer to 500 years. Simultaneous occurrence of drought events was observed at both stations, and surrounding stations may have similar precipitation patterns.
In case of Region II, the two longest drought events had a duration of 12 months and 10 months, recorded at Yeongju station and Geochang station, respectively. The first lasted from March 1982 to February 1983, and the second from July 2008 to April 2009, and their corresponding severities were 22.30 and 21.50, respectively. In the case of Region III, the three longest drought events had an identical duration of 10 months, recorded at both Seoul and Incheon stations from March 2014 to December 2014, and at Buan station from April 1988 to January 1989; their corresponding severities were 15.44, 12.94, and 16.00, respectively, and their joint return periods were less than 500 years. Drought events at Seoul and Incheon stations occur simultaneously because of the possibility of having similar precipitation patterns at adjacent stations (Figure 1). In the case of Region IV, the two longest drought events were recorded at Ganghwa and Mungyeong stations, from March 2014 to January 2015 and from March 1982 to December 1982, respectively, with the severities of 20.01 and 16.15, respectively; their joint return periods were nearly 200 and 500 years, respectively.
Since extreme drought events are more of a concern in drought frequency analysis, only droughts of the longest duration and highest severities were selected for spatial representation of drought risk maps. In this study, the drought events with D 6 months and S 6.5 were considered, being the droughts of longest duration and highest severities. This criterion was chosen because agriculture and surface water supplies are likely to be affected after six months. Additionally, this criteria was also recommend in previous studies [62]. On the basis of the copula-based bivariate drought return period, the associated drought risk (R) can be computed using the equation R = 1 ( 1 1 / T ˇ D ,   S ) N [63], to impose the probability of extreme drought with a T ˇ D ,   S year return period for N years of life span . Figure 13 indicates spatial patterns of the regional drought risks, using the T ˇ D ,   S observed at each site for the life spans of 10 and 50 years. In Figure 13, green, yellow, and red colors indicate the areas of low, moderate, and high drought risks, respectively, and their values vary from 0 to 1. The drought risk map for N = 10 years shows low or nearly moderate drought risk, except on the southwestern coast (areas surrounding Jangheung, Goheung, Wando, and Mokpo stations) and the east coast (Uljin and Yeongdeok stations) of South Korea (Figure 13a). In case when the N = 50 years, most of the areas fall under the category of moderate and high drought risk, except for the area surrounding Yeongju, Mungyeong, Ganghwa, and Chuncheon stations (Figure 13b). Consequently, those regions which are located at the southern (Region I) and eastern coast (Region IV) of South Korea are under high drought risk, compared to other regions. This may be due to changes in precipitation patterns caused by the “Changma front”; thus, the rainy season becomes short, but the amount of the rainfall and the number of heavy rainfall days have increased at the southern and eastern coasts of South Korea [20,64]. Precipitation trend analysis showed the decline in annual mean precipitation on the southwest coast of South Korea [21], which is therefore more vulnerable to drought risk. Kim et al. [3] showed that extreme historical drought events mostly occurred in the central and southern regions of South Korea.

3.6. Comparison of the Bivariate Return Periods

Comparison of multivariate drought frequency analysis was performed using the best-fitted copulas for each region (Table 9). T ^ D ,   S (“and”), T ˇ D ,   S (“or”), T D ,   S * (Kendall’s), and T D | S     s ,   T S | D     d (conditional) joint return periods were computed for the duration of 2, 5, 10, 20, 50, 100, 200, 500, and 1000 years for the four regions, accounting for the bivariate drought properties (Table 9). T ^ D ,   S , T ˇ D ,   S , T D ,   S * , T D | S     s , and T S | D     d were computed using Equations (8)–(12), respectively. Since the primary, secondary, and conditional return periods explain the different situations, the preference of the return period may change, based on what type of drought risk needs to be evaluated for the area under study. The different levels of drought risks can be used to assess the malfunction of a specific water demand and supply system. For example, a particular water demand and supply system in Region I may be unable to supply water under a condition of drought severity exceeding 5.67, given a drought duration exceeding 4.41 months. The return period in this situation will be equal to 30.16 years, computed using Equation (12).
Table 9 shows the difference in the return period, according to the considered type of risk. For example, for Region I, at a univariate return period of 100 years, the values of drought duration and severities are 8.50 months and 9.65, respectively. However, joint bivariate return periods of drought variables show the return period being 9.95 years for T ^ D ,   S (D ≥ 8.50 and S ≥ 9.65) and 8.28 years for T ˇ D ,   S (D ≥ 8.50 or S ≥ 9.65). Similarly, conditional and secondary return periods for T D | S     s , T S | D     d , and T D ,   S * are 96.04, 84.56, and 9.14 years, respectively. Furthermore, Table 9 shows that Kendall return periods T D ,   S * are always greater than the T ˇ D ,   S return periods and smaller than the T ^ D ,   S return periods. T ˇ D ,   S is always smaller than T ^ D ,   S because the probability of occurrence of two cases simultaneously is always smaller than when only one of the two cases occur. The results match well with the finding of [61]. Moreover, it is noticed that the difference between T D ,   S * and T ˇ D ,   S increases with an increase in critical probability level t. Results showed that conditional return periods T D | S     s and T S | D     d are always greater than the primary and secondary return periods T ^ D ,   S , T ˇ D ,   S , and T D ,   S * , and the conditional return period T D | S     s was greater than T S | D     d . However, direct comparison of different return periods is a difficult task, because each type of return period has a different physical meaning in drought risk analysis.

4. Conclusions

In the presence of complex topographical and climatic features of South Korea, precipitation varies both in space and time. Therefore, drought is becoming a frequently-occurring phenomena in different parts of the country. Droughts are recognized as the main obstacle to effective water resource management and planning. Regionalization of drought events (extracted using SPI-6) is performed using the HCPC algorithm (a blend of Ward’s classification and PCA approach) and bivariate homogeneity tests. Results of bivariate regionalization of drought showed that the whole of South Korea can be divided into four homogeneous regions, which have been tested to be robust. In drought frequency analysis, the lack of lengthy records reduces the reliability of quantile estimation. To overcome this problem, regional frequency analysis of drought on a multivariate framework was performed for the four identified homogeneous regions. Various probability distributions were tested and evaluated, using both quantitative (Z-statistic) and qualitative (visual comparison) goodness-of-fit tests, in order to fit drought duration and severity for all regions. Dependence structures between drought variables were assessed, using various graphical diagnostic tools and correlation coefficients. For the construction of joint distributions between drought variables, best-fitted copulas were identified on the basis of goodness-of-fit test statistics (Sn and AIC), and the visual comparison between empirical vs theoretical values. Potential drought risks for all regions across South Korea were evaluated and compared through primary, secondary, and conditional types of joint return periods. The primary conclusions determined from this study are as follows:
  • The spatial distribution of mean drought duration and severity from 1980 to 2015 indicates that the droughts of longest duration and highest severities were recorded at southwestern coastal areas of South Korea, which is due to the unusual precipitation patterns along the coast.
  • The Z statistics and L-moment ratio diagram suggested that among the five candidate distributions evaluated for both drought duration and severity, only the PE3 distribution fits better for the drought severity of Regions II, III, and IV. All regions of drought duration showed poor fit because of a large number of ties, especially at short drought durations. Therefore, a more general and robust Kappa distribution model was used instead for the regions having higher Z statistic values (>1.64). The further evaluation of the selected probability distributions, using a visual comparison between empirical and theoretical probabilities, showed that PE3 and Kappa distributions are better able to simulate the drought variables across the region.
  • Results of Chi plot, Kendall plots, Pearson’s r, Spearman’s ρ and Kendall’s τ correlation coefficients showed the significant positive dependence between the drought variables, and thus indicate the suitability of drought variables for joint modeling.
  • Goodness-of-fit statistics (Sn and AIC), based on Rosenblatt’s transformation and visual comparison between empirical and theoretical copula functions, indicate that among six copulas, the bivariate Gaussian copula is better able to simulate the drought variables for Regions I and IV, and the bivariate Frank copula for Regions II and III.
  • Evaluation of best-fitted copulas by comparing observed and simulated random samples (5000 samples), showed fairly good agreement. It is noticed that Kendall’s τ is more sensitive to drought variables as compared to Pearson’s r and Spearman’s ρ correlation coefficients, as Kendall’s τ varies significantly from one region to another. Furthermore, the structure of the simulated data changes, according to the degree of association (e.g., Kendall’s τ) between drought variables.
  • The drought risk maps derived for 10 and 50 year life spans, using the droughts of longest duration and highest severities, showed that the southwestern coast and eastern coastal areas are under high drought risk, and that inland mid-latitude areas (areas surrounding Yeongju station) and northwestern parts are under low drought risk. This is due to the effect of monsoons to the summer precipitation patterns and convective system within the air mass on the southern and eastern coast of South Korea.
  • Comparison between the different types of return periods showed that secondary return periods T D ,   S * are always greater than the T ˇ D ,   S return periods and smaller than the T ^ D ,   S return periods, and T ˇ D ,   S return periods are always smaller than T ^ D ,   S return periods. Moreover, conditional return periods T D | S     s and T S | D     d are always greater than the primary return periods T ^ D ,   S and T ˇ D ,   S , and the secondary return period T D ,   S * . Since each type of return period explains a different situation, the preference of the return period may change based on type of drought risk that needs to be evaluated for the area under study.

Acknowledgments

This research was supported by a grant (18RDRP-B134213-06) from Regional Development Research Program funded by Ministry of Land, Infrastructure and Transport of Korean government.

Author Contributions

Muhammad Azam designed, carried out the analysis and wrote the paper. Seung Jin Maeng reviewed and edited the manuscript. Hyung San Kim and Ardasher Murtazaev provided assistance in the calculations.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kim, C.J.; Park, M.J.; Lee, J.H. Analysis of climate change impacts on the spatial and frequency patterns of drought using a potential drought hazard mapping approach. Int. J. Climatol. 2014, 34, 61–80. [Google Scholar] [CrossRef]
  2. Min, S.K.; Kwon, W.T.; Park, E.H.; Choi, Y. Spatial and temporal comparisons of droughts over Korea with East Asia. Int. J. Climatol. 2003, 23, 223–233. [Google Scholar] [CrossRef]
  3. Kim, D.W.; Byun, H.R.; Choi, K.S.; Oh, S.B. A spatiotemporal analysis of historical droughts in Korea. J. Appl. Meteorol. Climatol. 2011, 50, 1895–1912. [Google Scholar] [CrossRef]
  4. Im, E.S.; Jung, I.W.; Bae, D.H. The temporal and spatial structures of recent and future trends in extreme indices over Korea from a regional climate projection. Int. J. Climatol. 2011, 31, 72–86. [Google Scholar] [CrossRef]
  5. Nam, W.H.; Hong, E.M.; Choi, J.Y. Has climate change already affected the spatial distribution and temporal trends of reference evapotranspiration in South Korea? Agric. Water Manag. 2015, 150, 129–138. [Google Scholar] [CrossRef]
  6. Yue, S. Applying Bivariate Normal Distribution to Flood Frequency Analysis. Water Int. 1999, 24, 248–254. [Google Scholar] [CrossRef]
  7. Bacchi, B.; Becciu, G.; Kottegoda, N.T. Bivariate exponential model applied to intensities and durations of extreme rainfall. J. Hydrol. 1994, 155, 225–236. [Google Scholar] [CrossRef]
  8. Yue, S.; Ouarda, T.B.M.J.; Bobée, B. A review of bivariate gamma distributions for hydrological application. J. Hydrol. 2001, 246, 1–18. [Google Scholar] [CrossRef]
  9. Ganguli, P.; Reddy, M.J. Evaluation of trends and multivariate frequency analysis of droughts in three meteorological subdivisions of western India. Int. J. Climatol. 2014, 34, 911–928. [Google Scholar] [CrossRef]
  10. Cancelliere, A.; Salas, J.D. Drought length properties for periodic-stochastic hydrologic data. Water Resour. Res. 2004, 40. [Google Scholar] [CrossRef]
  11. Sklar, A. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 1959, 8, 229–231. [Google Scholar] [CrossRef]
  12. González, J.; Valdés, J.B. Bivariate Drought Recurrence Analysis Using Tree Ring Reconstructions. J. Hydrol. Eng. 2003, 8, 247–258. [Google Scholar] [CrossRef]
  13. Salvadori, G.; De Michele, C. Frequency analysis via copulas: Theoretical aspects and applications to hydrological events. Water Resour. Res. 2004, 40. [Google Scholar] [CrossRef]
  14. Genest, C.; Favre, A.-C. Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask. J. Hydrol. Eng. 2007, 12, 347–368. [Google Scholar] [CrossRef]
  15. Salvadori, G.; De Michele, C.; Durante, F. On the return period and design in a multivariate framework. Hydrol. Earth Syst. Sci. 2011, 15, 3293–3305. [Google Scholar] [CrossRef] [Green Version]
  16. Shiau, J.T. Fitting Drought Duration and Severity with Two-Dimensional Copulas. Water Resour. Manag. 2006, 20, 795–815. [Google Scholar] [CrossRef]
  17. Shiau, J.T.; Modarres, R. Copula-based drought severity-duration-frequency analysis in Iran. Meteorol. Appl. 2009, 16, 481–489. [Google Scholar] [CrossRef]
  18. Kao, S.-C.; Govindaraju, R.S.; Niyogi, D. A Spatio-temporal Drought Analysis for the Midwestern US. In World Environmental and Water Resources Congress 2009: Great Rivers © 2009 ASCE 4654; ASCE Publications: Reston, VA, USA, 2009; pp. 4654–4663. [Google Scholar]
  19. Kao, S.C.; Govindaraju, R.S. A copula-based joint deficit index for droughts. J. Hydrol. 2010, 380, 121–134. [Google Scholar] [CrossRef]
  20. Chang, H.; Kwon, W.-T. Spatial variations of summer precipitation trends in South Korea, 1973–2005. Environ. Res. Lett. 2007, 2, 45012. [Google Scholar] [CrossRef]
  21. Bae, D.H.; Jung, I.W.; Chang, H. Long-term trend of precipitation and runoff in Korean river basins. Hydrol. Process. 2008, 22, 2644–2656. [Google Scholar] [CrossRef]
  22. Kim, J.S.; Jain, S. Precipitation trends over the Korean peninsula: Typhoon-induced changes and a typology for characterizing climate-related risk. Environ. Res. Lett. 2011, 6. [Google Scholar] [CrossRef]
  23. Jung, I.W.; Bae, D.H.; Kim, G. Recent trends of mean and extreme precipitation in Korea. Int. J. Climatol. 2011, 31, 359–370. [Google Scholar] [CrossRef]
  24. Helsel, D.R.; Hirsch, R.M. Statistical Methods in Water Resources; Elsevier: Amsterdam, The Netherlands, 1992; Volume 49. [Google Scholar]
  25. Hipel, K.W.; McLeod, A.I. Time Series Modelling of Water Resources and Environmental Systems; Nonparametric Tests; Elsevier: Amsterdam, The Netherlands, 2005. [Google Scholar]
  26. Vogel, R.M.; Stedinger, J.R. Minimum variance streamflow record augmentation procedures. Water Resour. Res. 1985, 21, 715–723. [Google Scholar] [CrossRef]
  27. Mckee, T.B.; Doesken, N.J.; Kleist, J. The relationship of drought frequency and duration to time scales. In Proceedings of the AMS 8th Conference on Applied Climatology, Anaheim, CA, USA, 17–22 January 1993; pp. 179–184. [Google Scholar]
  28. Lee, J.-H.; Seo, J.-W.; Kim, C.-J. Analysis on Trends, Periodicities and Frequencies of Korean Drought Using Drought Indices. J. Korea Water Resour. Assoc. 2012, 45, 75–89. [Google Scholar] [CrossRef]
  29. Thorn, H.C.S. Some Methods of Climatological Analysis; WMO Technics Note No. 81; World Meteorological Organization: Geneva, Switzerland, 1966; pp. 16–22. [Google Scholar]
  30. Lee, J.H.; Kim, C.J. A multimodel assessment of the climate change effect on the drought severity-duration-frequency relationship. Hydrol. Process. 2013, 27, 2800–2813. [Google Scholar] [CrossRef]
  31. Yevjevich, V. An Objective Approach to Definitions and Investigations of Continental Hydrologic Droughts; Hydrology Papers; Colorado State University: Fort Collins, CO, USA, 1967. [Google Scholar]
  32. Azam, M.; Park, H.; Maeng, S.J.; Kim, H.S. Regionalization of Drought across South Korea Using Multivariate Methods. Water 2017, 10, 24. [Google Scholar] [CrossRef]
  33. Schaefer, M.G. Regional analyses of precipitation annual maxima in Washington State. Water Resour. Res. 1990, 26, 119–131. [Google Scholar] [CrossRef]
  34. Vogel, R.M.; Fennessey, N.M. L moment diagrams should replace product moment diagrams. Water Resour. Res. 1993, 29, 1745–1752. [Google Scholar] [CrossRef]
  35. Vicente-Serrano, S. Differences in Spatial Patterns of Drought on Different Time Scales: An Analysis of the Iberian Peninsula. Water Resour. Manag. 2006, 20, 37–60. [Google Scholar] [CrossRef]
  36. Norbiato, D.; Borga, M.; Sangati, M.; Zanon, F. Regional frequency analysis of extreme precipitation in the eastern Italian Alps and the August 29, 2003 flash flood. J. Hydrol. 2007, 345, 149–166. [Google Scholar] [CrossRef]
  37. Hosking, J.R.M.; Wallis, J.R. Regional Frequency Analysis: An Approach Based on L-Moments; Cambridge University Press, United States of America: New York, NY, USA, 1997; ISBN 9780521019408. [Google Scholar]
  38. Nelsen, R.B. An Introduction to Copulas; Springer: New York, NY, USA, 2013; Volume 53, ISBN 9788578110796. [Google Scholar]
  39. Genest, C.; Rémillard, B.; Beaudoin, D. Goodness-of-fit tests for copulas: A review and a power study. Insur. Math. Econ. 2009, 44, 199–213. [Google Scholar] [CrossRef]
  40. Requena, A.I.; Chebana, F.; Mediero, L. A complete procedure for multivariate index-flood model application. J. Hydrol. 2016, 535, 559–580. [Google Scholar] [CrossRef]
  41. Hofert, M.; Mächler, M. Nested archimedean copulas meet R: The nacopula package. J. Stat. Softw. 2011, 39, 1–20. [Google Scholar] [CrossRef]
  42. Shiau, J.T. Return period of bivariate distributed extreme hydrological events. Stoch. Environ. Res. Risk Assess. 2003, 17, 42–57. [Google Scholar] [CrossRef]
  43. Shiau, J.T.; Feng, S.; Nadarajah, S. Assessment of hydrological droughts for the Yellow River, China, using copulas. Hydrol. Process. 2007, 21, 2157–2163. [Google Scholar] [CrossRef]
  44. Salvadori, G.; De Michele, C. Multivariate multiparameter extreme value models and return periods: A copula approach. Water Resour. Res. 2010, 46. [Google Scholar] [CrossRef]
  45. Vandenberghe, S.; Verhoest, N.E.C.; Onof, C.; De Baets, B. A comparative copula-based bivariate frequency analysis of observed and simulated storm events: A case study on Bartlett-Lewis modeled rainfall. Water Resour. Res. 2011, 47. [Google Scholar] [CrossRef]
  46. Santos, J.F.; Portela, M.M.; Pulido-Calvo, I. Regional Frequency Analysis of Droughts in Portugal. Water Resour. Manag. 2011, 25, 3537–3558. [Google Scholar] [CrossRef]
  47. Werick, W.J.; Willeke, G.E.; Guttman, N.B.; Hosking, J.R.M.; Wallis, J.R. National drought atlas developed. Eos Trans. Am. Geophys. Union 1994, 75, 89. [Google Scholar] [CrossRef]
  48. Wallis, J.R.; Schaefer, M.G.; Barker, B.L.; Taylor, G.H. Regional precipitation-frequency analysis and spatial mapping for 24-hour and 2-hour durations for Washington State. Hydrol. Earth Syst. Sci. 2007, 11, 415–442. [Google Scholar] [CrossRef]
  49. Hosking, J.R.M. L-Moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics. J. R. Stat. Soc. Ser. B 1990, 52, 105–124. [Google Scholar]
  50. Hosking, J.R.M.; Wallis, J.R. Some statistics useful in regional frequency analysis. Water Resour. Res. 1993, 29, 271–281. [Google Scholar] [CrossRef]
  51. Lee, S.; Kwon, W.T. A variation of summer rainfall in Korea. J. Korean Geogr. Soc. 2004, 39, 819–832, (In Korean with English abstract). [Google Scholar]
  52. Klein, B.; Schumann, A.H.; Pahlow, M. Copulas—New Risk Assessment Methodology for Dam Safety. In Flood Risk Assessment and Management; Springer: Dordrecht, The Netherlands, 2011; pp. 149–185. ISBN 978-90-481-9917-4. [Google Scholar]
  53. Fisher, N.I.; Switzer, P. Chi-plots for assessing dependence. Biometrika 1985, 72, 253–265. [Google Scholar] [CrossRef]
  54. Fisher, N.I.; Switzer, P. Graphical Assessment of Dependence. Am. Stat. 2001, 55, 233–239. [Google Scholar] [CrossRef]
  55. Genest, C.; Remillard, B. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 2008, 44, 1096–1127. [Google Scholar] [CrossRef]
  56. Kojadinovic, I.; Yan, J. A goodness-of-fit test for multivariate multiparameter copulas based on multiplier central limit theorems. Stat. Comput. 2011, 21, 17–30. [Google Scholar] [CrossRef]
  57. Kojadinovic, I.; Yan, J.; Holmes, M. Fast large-sample goodness-of-fit tests for copulas. Stat. Sin. 2011, 21, 841. [Google Scholar] [CrossRef]
  58. Kojadinovic, I.; Yan, J. Modeling Multivariate Distributions with Continuous Margins Using the copula R Package. J. Stat. Softw. 2010, 34, 1–20. [Google Scholar] [CrossRef]
  59. Rosenblatt, M. Remarks on a Multivariate Transformation. Ann. Math. Stat. 1952, 23, 470–472. [Google Scholar] [CrossRef]
  60. Natrella, M. NIST/SEMATECH e-Handbook of Statistical Methods; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2012.
  61. Salvadori, G.; De Michele, C.; Kottegoda, N.T.; Rosso, R. Extremes in Nature: An Approach Using Copulas; Springer: Amsterdam, The Netherlands, 2005; ISBN 1402044143. [Google Scholar]
  62. Zhang, Q.; Xiao, M.; Singh, V.P.; Li, J. Regionalization and spatial changing properties of droughts across the Pearl River basin, China. J. Hydrol. 2012, 472–473, 355–366. [Google Scholar] [CrossRef]
  63. Chow, V.T.; Maidment, D.R.; Mays, L.W.; Chow, V.T.; Maidment, D.R.; Mays, L.W. Applied Hydrology; McGraw-Hill: New York, NY, USA, 1988; ISBN 0070108102. [Google Scholar]
  64. Chung, Y.-S.; Yoon, M.-B.; Kim, H.-S. On Climate Variations and Changes Observed in South Korea. Clim. Chang. 2004, 66, 151–161. [Google Scholar] [CrossRef]
Figure 1. Location of study area and spatial distribution of selected stations.
Figure 1. Location of study area and spatial distribution of selected stations.
Water 10 00359 g001
Figure 2. Drought characteristics using run theory.
Figure 2. Drought characteristics using run theory.
Water 10 00359 g002
Figure 3. Spatial distribution of (a) mean duration (months); (b) mean severity, using the inverse distance weighted (IDW) method. IDW assumes substantially that the rate of correlations and similarities between neighbors is proportional to the distance between them. The IDW method is recommended when there are enough sample points with a suitable dispersion at local scale. The accuracy of IDW is affected by the size of the neighborhood and the number of neighbors.
Figure 3. Spatial distribution of (a) mean duration (months); (b) mean severity, using the inverse distance weighted (IDW) method. IDW assumes substantially that the rate of correlations and similarities between neighbors is proportional to the distance between them. The IDW method is recommended when there are enough sample points with a suitable dispersion at local scale. The accuracy of IDW is affected by the size of the neighborhood and the number of neighbors.
Water 10 00359 g003aWater 10 00359 g003b
Figure 4. Spatial distribution of final delineated homogeneous regions [32].
Figure 4. Spatial distribution of final delineated homogeneous regions [32].
Water 10 00359 g004
Figure 5. L-moment ratio plot for drought duration of (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Figure 5. L-moment ratio plot for drought duration of (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Water 10 00359 g005
Figure 6. L-moment ratio plot for drought severity of (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Figure 6. L-moment ratio plot for drought severity of (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Water 10 00359 g006
Figure 7. Visual comparison between empirical and theoretical probabilities for both drought variables across all regions.
Figure 7. Visual comparison between empirical and theoretical probabilities for both drought variables across all regions.
Water 10 00359 g007
Figure 8. Pair-wise dependence pattern of drought variables using Chi plots (first row) and Kendall plots (second row) for Regions I, II, III, and IV.
Figure 8. Pair-wise dependence pattern of drought variables using Chi plots (first row) and Kendall plots (second row) for Regions I, II, III, and IV.
Water 10 00359 g008
Figure 9. The joint PDF (left) and corresponding CDF contour (right) for each region. The red dotted contour lines indicate the empirical copula, and solid contour lines indicate the theoretical copula.
Figure 9. The joint PDF (left) and corresponding CDF contour (right) for each region. The red dotted contour lines indicate the empirical copula, and solid contour lines indicate the theoretical copula.
Water 10 00359 g009aWater 10 00359 g009b
Figure 10. Conditional probability curves P ( S s | D d ) of drought severity, given drought durations exceeding certain thresholds d ( d indicate the threshold levels of duration at the 25th, 50th, 75th, and 95th percentile) for Regions I, II, III, and IV.
Figure 10. Conditional probability curves P ( S s | D d ) of drought severity, given drought durations exceeding certain thresholds d ( d indicate the threshold levels of duration at the 25th, 50th, 75th, and 95th percentile) for Regions I, II, III, and IV.
Water 10 00359 g010
Figure 11. Conditional probability curves P ( D d | S s ) of drought duration given drought severity exceeding certain thresholds s ( s indicate the threshold levels of severity at the 25th, 50th, 75th, and 95th percentile) for Regions I, II, III, and IV.
Figure 11. Conditional probability curves P ( D d | S s ) of drought duration given drought severity exceeding certain thresholds s ( s indicate the threshold levels of severity at the 25th, 50th, 75th, and 95th percentile) for Regions I, II, III, and IV.
Water 10 00359 g011
Figure 12. Comparison of observed and simulated random samples from the best-fitted copula functions (Gaussian copula for Regions I and III, Frank copula for Regions II and III). The joint return period values use the standard return period (“or”) for drought duration and severity, (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Figure 12. Comparison of observed and simulated random samples from the best-fitted copula functions (Gaussian copula for Regions I and III, Frank copula for Regions II and III). The joint return period values use the standard return period (“or”) for drought duration and severity, (a) Region I, (b) Region II, (c) Region III and (d) Region IV.
Water 10 00359 g012aWater 10 00359 g012b
Figure 13. Drought risk map for (a) N = 10 years; (b) N = 50 years.
Figure 13. Drought risk map for (a) N = 10 years; (b) N = 50 years.
Water 10 00359 g013aWater 10 00359 g013b
Table 1. List of copulas applied in this study.
Table 1. List of copulas applied in this study.
CopulasBivariate Copula C θ ( u ,   v ) Parameters θ
Archimedean copulas
Clayton ( u θ +   ν θ 1 ) 1 / θ θ [ 1 ,   ) \ { 0 }
Frank 1 θ log [ 1 + ( e θ u 1 ) ( e θ ν 1 ) e θ 1 ] θ [ ,   ) \ { 0 }
Gumbel exp [ ( u θ +   ν θ ) 1 / θ ] θ [ 1 ,   )
Joe 1 [ ( 1 u ) θ +   ( 1 v ) θ     ( 1 u ) θ ( 1 v ) θ ] 1 / θ θ [ 1 ,   )
Elliptical copulas
Student’s t t ϑ 1 ( u ) t ϑ 1 ( v ) 1 2 π ( 1 r 2 ) { 1 +   x 2   2 rxy +   y 2 ϑ ( 1 r 2 ) } ϑ + 2 2 dxdy
t ϑ ( x ) =   x Г ( ( ϑ + 1 ) / 2 ) π ϑ Г ( ϑ 2 ) ( 1 + y 2 / ϑ ) ( ϑ + 1 ) / 2 dy ,   ϑ 0
ϑ > 2 , r ( 0 ,   1 ]
Gaussian Φ 2   ( Φ 1 ( u ) ,   Φ 1 ( v ) ,   ρ ) 1   ρ 1
Table 2. Heterogeneity measure for the final adjusted homogeneous regions [32].
Table 2. Heterogeneity measure for the final adjusted homogeneous regions [32].
RegionDrought EventsStationsDSDS
I51620−0.70 (H 1)−0.82 (H 1)−0.11 (H 1)
II13950.85 (H 1)−0.12 (H 1)−0.34 (H 1)
III49218−2.21 (H 1)−2.26 (H 1)−1.45 (H 1)
IV308121.47 (A.H 2)0.74 (H 1)0.56 (H 1)
1 H = Homogeneous region, 2 A.H = Acceptably homogeneous region.
Table 3. Some basic statistics of drought duration and severity for each identified homogeneous region.
Table 3. Some basic statistics of drought duration and severity for each identified homogeneous region.
RegionVariablesMeanMaxMinSD 1SkewnessKurtosis
IDuration2.6513.001.002.241.444.33
Severity4.0317.611.003.931.554.52
IIDuration2.6312.001.002.231.615.24
Severity4.0122.301.003.932.078.11
IIIDuration2.5310.001.002.121.504.21
Severity3.8820.551.003.781.705.07
IVDuration2.4811.001.002.161.554.58
Severity3.7220.011.003.781.896.33
1 SD indicates standard deviation.
Table 4. Z statistics of the drought variables, using five candidate probability distribution for the four sub-regions.
Table 4. Z statistics of the drought variables, using five candidate probability distribution for the four sub-regions.
VariablesRegionGLOGEVGNOPE3GPA
DurationI9.418.827.24.416.46
II4.514.193.361.922.97
III7.927.425.853.165.31
IV6.336.064.923.004.75
SeverityI7.276.745.162.464.56
II2.862.581.780.421.45
III5.925.443.80.993.35
IV4.994.683.411.263.22
Note: The bold values indicate the best distribution for a significance level of 10%.
Table 5. Parameters of final selected probability distributions for each region.
Table 5. Parameters of final selected probability distributions for each region.
VariablesRegionDistributionCDFParameters
DurationIKappa F ( x ) =   ( 1 h ( 1 κ ( x ξ ) α ) 1 / κ ) 1 / h ξ = 6.854 ; α = 8.872
k = 0.832 ; h = 3.908
IIKappa F ( x ) =   ( 1 h ( 1 κ ( x ξ ) α ) 1 / κ ) 1 / h ξ = 4.182 ; α = 5.239 ;
k = 0.624 ; h = 3.492
IIIKappa F ( x ) =   ( 1 h ( 1 κ ( x ξ ) α ) 1 / κ ) 1 / h ξ = 4.022 ;   α = 4.852 ;
k = 0.574 ; h = 3.614
IVKappa F ( x ) =   ( 1 h ( 1 κ ( x ξ ) α ) 1 / κ ) 1 / h ξ = 17.771 ;   α = 34.673 ; ablesy   occur   at   central   and   souther   region   of   south   Korea .   restxceeding   varirtain   erved   data . ness   and   L kurtosis   and   theorat
k = 1.147 ; h = 5.079
SeverityIKappa F ( x ) =   ( 1 h ( 1 κ ( x ξ ) α ) 1 / κ ) 1 / h ξ = 2.739 ;   α = 3.278 ;
k = 0.369 ; h = 3.074
IIPE3 F ( x ) =   0 x ξ β t α 1 exp ( t ) dt Г ( α ) β : 1.232 ;   α : 0.557 ; ξ : 0.313
IIIPE3 F ( x ) =   0 x ξ β t α 1 exp ( t ) dt Г ( α ) β : 1.274 ;   α : 0.510 ; ξ : 0.350
IVPE3 F ( x ) =   0 x ξ β t α 1 exp ( t ) dt Г ( α ) β : 1.44 ;   α : 0.444 ; ξ : 0.362
Table 6. µi,D and µi,S indicate the site-specific scaling factors for drought duration and severity, respectively, and ni indicates the number of drought events recorded at each station.
Table 6. µi,D and µi,S indicate the site-specific scaling factors for drought duration and severity, respectively, and ni indicates the number of drought events recorded at each station.
Region I
Siteµi,Dµi,SniSiteµi,Dµi,Sni
Jangheung3.355.0323Jinju2.614.0128
Haenam2.583.8426Geoje2.673.8924
Yeongcheon2.584.0724Buan2.313.4329
Miryang2.934.3527Namhae2.634.1924
Sancheong3.004.9721Jeongeup2.633.8330
Ulsan2.523.8027Goheung3.295.1421
Gwangju2.584.0626Yeosu2.533.7330
Busan2.002.9531Wando3.054.6921
Tongyeong1.942.7834Suncheon2.563.9227
Mokpo2.804.6520Jeonju3.435.1223
Region II
Chupungnyeong2.573.9028Boeun2.613.8228
Geumsan2.774.1130Yeongju2.383.6829
Geochang2.884.6524
Region III
Chuncheon2.293.3531Imsil2.854.5426
Chungju2.593.9027Boryeong2.694.1229
Daejeon2.263.4031Namwon2.804.5725
Jecheon2.644.2128Seoul2.794.1824
Yangpyeong2.463.6528Incheon2.193.1632
Icheon2.904.6621Wonju2.504.0824
Cheonan2.323.5028Cheongju2.523.9927
Buyeo2.373.4630Gunsan2.684.1425
Suwon2.383.6326Seosan2.703.9830
Region IV
Sokcho2.313.3729Uiseong2.003.3329
Daegwallyeong2.363.3128Gumi3.174.7424
Gangneung2.453.5529Mungyeong2.443.7927
Uljin2.924.1625Yeongdeok2.834.2024
Pohang2.062.8516Inje2.964.3324
Daegu2.173.5229Ganghwa2.173.5124
Table 7. Correlation coefficients for drought variables.
Table 7. Correlation coefficients for drought variables.
Dependence MeasureRegion IRegion IIRegion IIIRegion IV
Pearson’s r0.9410.9120.9200.928
Kendall’s τ0.8340.8320.8260.799
Spearman’s ρ0.9340.9390.9380.914
Table 8. Goodness-of-fit test and parameters of the bivariate copula for both duration and severity. The fitted copula is indicated in bold.
Table 8. Goodness-of-fit test and parameters of the bivariate copula for both duration and severity. The fitted copula is indicated in bold.
RegionCopulaSnp-valueAICParameter ( θ )
IStudent’s t0.0150.2031962.9100.904
Gaussian0.0060.9101498.7390.905
Clayton0.0130.4701651.63018.376
Gumbel0.0240.1241607.4151.952
Frank0.0060.8471705.82725.236
Joe0.0490.0051963.7571.957
IIStudent’s t0.0150.282517.1690.895
Gaussian0.0110.619516.4460.896
Clayton0.0060.629420.52816.138
Gumbel0.0140.718424.3311.901
Frank0.0060.827396.85621.978
Joe0.0300.104431.8891.914
IIIStudent’s t0.0150.2131783.2900.907
Gaussian0.0130.3711783.9630.908
Clayton0.0060.4411525.58417.839
Gumbel0.0250.0941462.5392.029
Frank0.0060.8561359.50824.169
Joe0.0450.0051567.1732.061
IVStudent’s t0.0140.2331102.4630.923
Gaussian0.0050.928834.4230.924
Clayton0.0120.460986.05219.175
Gumbel0.0210.134903.4982.217
Frank0.0060.8471010.15525.908
Joe0.0420.0151102.4322.268
Table 9. Comparison of univariate and bivariate return periods of drought characteristics for all Regions.
Table 9. Comparison of univariate and bivariate return periods of drought characteristics for all Regions.
RegionTDS T ^ D ,   S T ˇ D ,   S T S | D     d T D | S     s T D ,   S *
I21.512.202.241.493.384.921.79
54.415.676.843.8930.1638.784.91
105.466.156.465.2435.2539.715.75
206.517.518.046.1652.3160.377.43
507.688.989.587.2973.5686.078.52
1008.509.659.958.2884.5696.049.14
2009.1710.7211.598.62106.18124.2410.01
5009.9711.9014.068.83140.14167.3212.23
100010.3012.6015.768.85162.31198.5813.91
II21.542.492.491.543.856.221.96
54.234.945.433.9222.9926.834.12
105.526.426.925.2038.2244.485.91
206.737.587.716.6351.8858.417.05
507.358.448.857.0665.0374.678.01
1008.499.6510.198.1186.4898.379.25
2009.2210.9811.179.09103.03122.6810.21
50010.0712.0113.049.45131.32156.5810.32
100010.2813.1214.009.80143.95183.7312.52
III21.522.362.361.523.585.561.83
53.975.075.313.8421.1026.924.52
105.886.917.005.8241.1548.336.55
206.867.628.216.4556.3262.497.14
507.688.648.857.5367.9576.468.01
1008.3110.5510.988.0791.26115.809.52
2009.4411.3111.979.02112.99135.4210.85
50010.0911.9913.239.35133.40158.6011.88
100010.2513.0215.139.23155.01196.9113.45
IV21.292.362.351.293.035.541.62
55.085.776.084.8730.8835.055.01
106.297.178.465.5553.2460.656.72
208.218.029.587.0378.6176.778.21
508.369.2411.217.2293.72103.599.85
1009.1810.7513.907.69127.55149.4311.01
2009.8611.5114.688.32144.81169.0012.91
50010.9112.9517.049.07185.90220.7414.75
100011.7213.8019.249.45225.48265.5016.37

Share and Cite

MDPI and ACS Style

Azam, M.; Maeng, S.J.; Kim, H.S.; Murtazaev, A. Copula-Based Stochastic Simulation for Regional Drought Risk Assessment in South Korea. Water 2018, 10, 359. https://doi.org/10.3390/w10040359

AMA Style

Azam M, Maeng SJ, Kim HS, Murtazaev A. Copula-Based Stochastic Simulation for Regional Drought Risk Assessment in South Korea. Water. 2018; 10(4):359. https://doi.org/10.3390/w10040359

Chicago/Turabian Style

Azam, Muhammad, Seung Jin Maeng, Hyung San Kim, and Ardasher Murtazaev. 2018. "Copula-Based Stochastic Simulation for Regional Drought Risk Assessment in South Korea" Water 10, no. 4: 359. https://doi.org/10.3390/w10040359

APA Style

Azam, M., Maeng, S. J., Kim, H. S., & Murtazaev, A. (2018). Copula-Based Stochastic Simulation for Regional Drought Risk Assessment in South Korea. Water, 10(4), 359. https://doi.org/10.3390/w10040359

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop