Article

An Investigation on the Effect of Outliers for Flood Frequency Analysis: The Case of the Eastern Mediterranean Basin, Turkey

Department of Civil Engineering, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey
Sustainability 2022, 14(24), 16558; https://doi.org/10.3390/su142416558
Submission received: 1 November 2022 / Revised: 2 December 2022 / Accepted: 8 December 2022 / Published: 9 December 2022
(This article belongs to the Special Issue Sustainable Planning, Management and Utilization of Water Resources)

Abstract

Flood frequency analysis is one of the most important applications of water resource engineering. Hydrological data sets based on long observation periods can contain unusually high or low measurements, known as outliers, that extend the overall range of the data. This study used 50 and 25 years of annual maximum flow data, from 1962 to 2011 and from 1987 to 2011, respectively, from the Stream Gauging Stations (SGSs) numbered 1712, 1717, and 1721, located within the borders of the Eastern Mediterranean Basin. The flood discharges were estimated using the Normal, Gumbel, and Pearson Type III probability distributions. The Kolmogorov–Smirnov (K-S) and Chi-squared goodness-of-fit tests were adopted to investigate the suitability of the probability distribution functions. With the raw data, the maximum flow rates in the 2-year and 5-year return periods were obtained with the Normal distribution; after the modification of the outliers, however, the maximum flood discharges were estimated with the Pearson Type III function. While the maximum discharges for the 1717 SGS were determined using the Gumbel distribution, the Pearson Type III distribution function gave the maximum discharges for the 1712 and 1721 SGSs. The K-S and Chi-squared tests showed that adjustment of the outliers led to favorable goodness-of-fit results with the Pearson Type III function.

1. Introduction

Floods induced by natural and anthropogenic factors are considered to be among the most damaging disasters on Earth [1]. Due to the increasing human population and changes in the global climate, it is predicted that considerable changes in many meteorological and hydrological variables, such as mean precipitation, discharge, and temperature, could increase the destructive effect of floods [2,3]. Structural solutions are generally sufficient to prevent floods; however, the design of hydraulic structures can often vary according to the streamflow conditions [4]. Hence, frequency analysis, together with the rational method, the unit hydrograph method, and the rainfall–runoff method, is used to represent the relationship between a flood’s severity, magnitude, and return period [5,6,7,8]. This analysis is defined as fitting several probability distribution functions to the maximum flow data recorded at gauging stations along a river or water resource [9,10,11,12,13].
In the published literature, many studies have carried out flood frequency analyses for various return periods in different drainage basins [1,14,15]. Moreover, applying goodness-of-fit tests to assess the agreement between observed streamflow data and a probability distribution produces more reliable results [16,17,18]. During long observation periods, records greater or smaller than the determined limit values may exist. Such outliers are defined as inconsistent observation values within datasets [19,20,21]. They may arise from data entry errors, such as mistakes in decimal notation and incorrect scaling, which introduce uncertainty into the measurements [19]. Outliers are hydrological values that differ from the general distribution of the available dataset [22]. The values obtained by adjusting the outliers help to avoid excessive costs in the design and construction of water structures. Hydrological design is one of the main subjects of water resource engineering applications. Any data obtained from the hydrological stages should be as consistent as possible and reflect optimum engineering conditions. Consequently, for an accurate and effective planning phase, it is essential to examine the meteorological and hydrological records in detail during the feasibility work [23]. Many studies in the literature have addressed the adjustment of the effects of outliers observed in hydrological data [24,25,26,27,28].
Flood frequency analysis provides an estimate of the discharges used in the design of water structures to be built on rivers. The Eastern Mediterranean Basin has a dense river system. Most of its rivers have a non-uniform regime, and the basin includes areas with a high flood risk. In this study, the outliers were determined, and an attempt was made to modify their effects with graphical and statistical methods using the data from the Stream Gauging Stations (SGSs) numbered 1712, 1717, and 1721, located in the Eastern Mediterranean Basin, which is one of the important basins in Turkey and where flood events are frequent. To this end, the Normal, Gumbel, and Pearson Type III probability distribution functions were used, and the effect of outliers on the estimated flow rates was examined. Furthermore, the Kolmogorov–Smirnov (K-S) and Chi-squared goodness-of-fit tests, which are widely used in the literature, were applied to determine the most appropriate of these functions. By accounting for the effect of the outliers, more realistic results can be produced, mainly for optimum cost calculations.

2. Materials and Methods

The Eastern Mediterranean Basin is located in the south of Turkey between latitudes 36° and 37° N and longitudes 32° and 35° E. The basin contains approximately 3% of Turkey’s population and has a drainage area of 21,676 km². The basins adjacent to the Eastern Mediterranean Basin are the Konya Closed Basin, the Seyhan Basin, and the Antalya Basin [29]. Significant water transfers occur between the sub-basins of the Eastern Mediterranean Basin. Currently, the total water transfer is 220.46 hm³, of which the largest share, 113.59 hm³, is for agricultural purposes. A total of 106.87 hm³ of drinking and utility water is transferred from the Tarsus sub-basin to the Kızıldere sub-basin, which has the highest population density in the basin area [30].
The present study used 50 and 25 years of annual maximum flow data, from 1962 to 2011 and from 1987 to 2011, respectively, from the SGSs numbered 1712, 1717, and 1721, which belong to the Electrical Works Survey Administration (known as EIEI) and the General Directorate of State Hydraulic Works (known as DSI) and are located within the borders of the Eastern Mediterranean Basin (Table 1). The locations of the specified SGSs are shown in Figure 1 [30,31]. Streamflow data for these stations are available only until 2011, because the stations were closed after that year. Although studies in the literature have typically evaluated at least 30 years of historical hydrometeorological data, it is stated that records of 22 years or more can be used in hydrological studies [32]. In light of this information, historical flood frequency analyses were conducted for the Eastern Mediterranean Basin.
The stations are located on the Göksu River, Lamas Creek, and Anamur Creek. In 2004, a major flood occurred in Silifke and its surroundings after the Göksu River overflowed. The flood was attributed to increased discharge due to snowmelt rather than precipitation. This disaster greatly damaged agricultural lands [33]. Sudden floods have reportedly occurred over the years in the regions where the stations used in the study are located, and the flood potential of this region is high [34]. The basin is therefore prone to flash floods. Within the scope of this study, frequency analysis is expected to contribute to the literature on effective flood management.
The Eastern Mediterranean Basin, where a Mediterranean climate prevails, has hot and dry summers and mild and rainy winters. Continental climate features dominate the northern and upper parts of the basin, where winters are cold and generally snowy. From a climate change perspective, the basin can be considered vulnerable: significant temperature increases and precipitation irregularities can produce negative impacts [35].

Outliers Term and Probability Distribution Functions

The term outlier usually refers to data that fall outside specific limit values within the available observed data [36]. Flood analysis can employ many different methods to detect and modify these outliers, which can be grouped into graphical and statistical methods [37]. The present study compared the results obtained by the graphical method with those of the Grubbs–Beck test (1972) [38], which is frequently used to detect outliers. The formulas for this test are detailed in Equations (1)–(3) [36]:
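$$X_{\min} = \hat{\mu} - K_n \hat{\sigma} \qquad (1)$$
$$X_{\max} = \hat{\mu} + K_n \hat{\sigma} \qquad (2)$$
$$K_n = -0.9043 + 3.345\sqrt{\log_{10}(n)} - 0.4046\,\log_{10}(n) \qquad (3)$$
  • $n$ is the number of samples;
  • $\hat{\mu}$ is the mean value;
  • $\hat{\sigma}$ is the standard deviation.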
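The Grubbs–Beck limits can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes the commonly quoted approximation K_n = −0.9043 + 3.345·sqrt(log10 n) − 0.4046·log10 n, uses the sample standard deviation, and runs on made-up discharges rather than the station records.

```python
import math
import statistics


def grubbs_beck_kn(n: int) -> float:
    """Approximate Grubbs-Beck factor K_n for a sample of size n."""
    log_n = math.log10(n)
    return -0.9043 + 3.345 * math.sqrt(log_n) - 0.4046 * log_n


def outlier_limits(flows):
    """Return (X_min, X_max) as mean -/+ K_n times the standard deviation."""
    mean = statistics.mean(flows)
    std = statistics.stdev(flows)  # sample standard deviation (assumption)
    kn = grubbs_beck_kn(len(flows))
    return mean - kn * std, mean + kn * std


# Hypothetical annual maximum flows in m3/s, NOT data from the study.
sample = [120.0, 95.0, 210.0, 150.0, 80.0, 640.0, 130.0, 110.0, 175.0, 140.0]
x_min, x_max = outlier_limits(sample)
high_outliers = [q for q in sample if q > x_max]
low_outliers = [q for q in sample if q < x_min]
print(f"K_n = {grubbs_beck_kn(len(sample)):.3f}")
print(f"X_min = {x_min:.1f}, X_max = {x_max:.1f}")
print("high outliers:", high_outliers, "low outliers:", low_outliers)
```

For a short, strongly skewed record such as this one, the lower limit can come out negative, in which case it flags no low outliers because discharges cannot be negative.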
It is difficult to determine which probability distribution function is appropriate for flood frequency analysis. The Normal, Gumbel, and Pearson Type III distributions are frequently used in the literature. Applying the probability distribution functions requires various statistical parameters of the data series, such as the mean ($\hat{\mu}$), standard deviation ($\hat{\sigma}$), and skewness coefficient (γ) [39]. The Xmin and Xmax parameters represent the lower and upper outlier boundary values, respectively. The parameters of the selected probability distributions and the parameter estimation methods are given in Table 2. Flood flow rates for the Normal distribution are calculated from two parameters, the mean and standard deviation, whereas the Gumbel and Pearson Type III distributions additionally require the skewness coefficient.
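Because Table 2 is not reproduced here, the following Python sketch illustrates one common way to turn these parameters into flood quantiles, the frequency-factor form x_T = mean + K_T · std. The specific formulas are assumptions, not necessarily those used in the study: a method-of-moments Gumbel factor, and the Wilson–Hilferty approximation for the Pearson Type III factor. All function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def normal_quantile(mean, std, T):
    """Discharge with return period T (years) under a Normal distribution."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    return mean + z * std


def gumbel_quantile(mean, std, T):
    """Gumbel quantile via the method-of-moments frequency factor."""
    k = -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(T / (T - 1.0)))
    )
    return mean + k * std


def pearson3_quantile(mean, std, skew, T):
    """Pearson Type III quantile via the Wilson-Hilferty approximation."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    if abs(skew) < 1e-9:  # zero skew reduces to the Normal case
        return mean + z * std
    g = skew / 6.0
    k = (2.0 / skew) * ((1.0 + g * z - g * g) ** 3 - 1.0)
    return mean + k * std


# Hypothetical parameters in m3/s, NOT values from Table 3.
mean, std, skew = 100.0, 20.0, 1.0
for T in (2, 5, 10, 25, 50, 100, 200, 500, 1000):
    print(
        T,
        round(normal_quantile(mean, std, T), 1),
        round(gumbel_quantile(mean, std, T), 1),
        round(pearson3_quantile(mean, std, skew, T), 1),
    )
```

The Wilson–Hilferty form loses accuracy for large skewness, so an exact Pearson Type III quantile function would be preferable for a strongly skewed series.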

3. Results

The main objective of this study was to examine the effect of the modification of the outliers on the flood flow rates obtained with the Normal, Gumbel, and Pearson Type III distributions for the data from the 1712, 1717, and 1721 SGSs in the Eastern Mediterranean Basin. First, the series must be checked for homogeneity to confirm that there is no significant difference in the causative hydrological processes. The Buishand homogeneity test, frequently used in the literature, was applied to verify that there is no significant change in the data series [40,41,42,43]. For the data of these three stations, no significant difference was found in the limit values between the 99% and 95% confidence intervals, and the series were accepted as suitable for frequency analysis. Figure 2 shows the graphs generated as a result of the adjustment of the outliers in the 50-year maximum flow values for 1712 SGS in the 1962 to 2011 period and the 25-year maximum flow values measured between 1987 and 2011 for 1717 SGS and 1721 SGS. Considering the graphical method and the Grubbs–Beck test results, deviations of ±1.5σ from the mean value, calculated for each station, were taken as the acceptable level for determining these outliers. The statistical parameters calculated with the raw data and with the data after the outliers were modified are shown in Table 3.
In Figure 2, the green lines represent average discharge values, and the red lines show Xmax and Xmin. As can be seen in Figure 2, the SGS data between the red lines were concluded to be at an acceptable level, while values falling outside the ±1.5σ limit were treated as outliers. However, since the lower boundary line (Xmin) in Figure 2b has a negative value, it was not considered necessary to include it in this figure. On this basis, six values for 1712 SGS, one value for 1717 SGS, and three values for 1721 SGS were detected and modified. Since the modified outliers were predominantly above the +1.5σ limit, the statistical parameters for the modified values declined compared to the parameters for the raw values (Figure 3). Although a few values below the −1.5σ level were set to the limit values, the parameter values decreased because more values were drawn back to the +1.5σ level.
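The modification step described above, in which values beyond the ±1.5σ band are set to the limit values, can be sketched as a simple clipping operation. This is a minimal sketch under stated assumptions: the limits are computed once from the raw sample, the function name `modify_outliers` is hypothetical, and the discharges are made up.

```python
import statistics


def modify_outliers(flows, k=1.5):
    """Clip each value to the band mean +/- k * std of the raw sample."""
    mean = statistics.mean(flows)
    std = statistics.stdev(flows)  # sample standard deviation (assumption)
    lower, upper = mean - k * std, mean + k * std
    return [min(max(q, lower), upper) for q in flows]


# Hypothetical annual maximum flows in m3/s, NOT data from the study.
raw = [120.0, 95.0, 210.0, 150.0, 80.0, 640.0, 130.0, 110.0, 175.0, 140.0]
modified = modify_outliers(raw)

print("raw mean/std:     ", statistics.mean(raw), statistics.stdev(raw))
print("modified mean/std:", statistics.mean(modified), statistics.stdev(modified))
```

When the clipped values lie mostly on the high side, as in this example, both the mean and the standard deviation of the modified series fall below their raw counterparts, which matches the direction of change reported for the stations.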
The outliers for all stations were modified, and a decrease was observed in all parameters. Still, the skewness coefficient and kurtosis parameters showed more significant variation in 1717 SGS compared to the other stations. In the 1717 SGS, the skewness coefficient for the raw values shows substantial deviations from the Normal distribution with a value of 2.762. After the adjustment, the difference compared to the Normal distribution decreased considerably.
Flood discharges for return periods of 2, 5, 10, 25, 50, 100, 200, 500, and 1000 years were calculated using the Normal, Gumbel, and Pearson Type III probability distributions applied to the raw and outlier-modified flow data. As seen in Table 4, while the maximum flow values for 1712 SGS and 1721 SGS were obtained with the Gumbel distribution at high return periods, the minimum flow values were estimated with the Normal distribution. The results seem to be compatible with various other studies [8,15,44]. Additionally, for 1712 SGS, the maximum flood discharges in the 2-year and 5-year return periods were calculated with the Pearson Type III distribution. Table 4 and Table 5 show the raw discharge values and the discharges after the outliers were modified for all three stations according to the different return periods.

4. Discussion

The changes in the flow rates when the effect of the outliers is modified for the determined probability functions are shown in detail in Figure 4a–c. For these distributions, the modified flow rates were lower than those calculated with the raw values [45]. Nonetheless, the exact condition occurred in the Pearson Type III distribution but with a 2-year return period (Figure 4c).
However, it is remarkable that the Normal distribution gives the maximum flow in a 2-year return period. Lower flood discharges were obtained for the 1717 SGS compared to the other two stations (Figure 4a). Similarly, while the maximum flood value for the 1717 SGS was calculated with the Gumbel distribution, the maximum values at the 1712 and 1721 SGSs were estimated with the Pearson Type III distribution (Figure 4b,c) [46,47].
The Kolmogorov–Smirnov (K-S) and Chi-squared tests were applied to investigate which probability distribution gives more compatible results in the estimation of flood discharges. In the goodness-of-fit tests, the probability distribution function with the minimum error value was determined to be the most compatible (Figure 5). In Figure 5a–c, the modified, expected, and raw values are shown with different scatter plots. As seen in Figure 5, the probability distributions in both tests showed compliance at the 90% confidence level, regardless of the adjustment of the outliers. Nonetheless, the Chi-squared results decreased significantly after the modification, and the goodness of fit improved markedly (Figure 5a–c).
The K-S and Chi-squared expected values in Figure 5 were selected from relevant tables in the literature according to the number of samples and the significance level. According to the Chi-squared test, the most appropriate probability distribution for the raw data of the 1712 SGS was the Normal distribution, similar to many studies in the literature [48]. In some other studies, however, the Normal distribution gave the lowest performance [49,50]. After adjusting the outliers, the Pearson Type III distribution gave compatible results, in agreement with similar studies [18]. For the 1717 SGS, the Normal distribution was found to be the most appropriate one, in line with many studies [10,51]. However, the adjustment of the 1721 SGS outliers did not cause any difference in the probability distribution functions.
Although all probability functions satisfy the 90% confidence level for the K-S test, the modification of outliers did not seem to improve the results of the Normal and Gumbel distributions. The Dmax values of these probability distributions were examined for each SGS. Dmax refers to the maximum of the absolute differences between the expected and calculated values. These two distributions produced the same Dmax results for the modified discharges as for the raw ones. Modification of the outliers significantly increased the compatibility of the Pearson Type III distribution for each station. Similar studies in the published literature report parallel results [51,52,53,54].
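The Chi-squared statistic can be illustrated with a short Python sketch. The article does not state how the classes were formed, so the sketch assumes equal-probability classes under a Normal distribution fitted by the sample mean and standard deviation; the function name, class count, and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def chi_squared_statistic(flows, n_classes=5):
    """Chi-squared statistic against a fitted Normal distribution.

    Classes are chosen to be equally probable under the fitted
    distribution, so every class has the same expected count.
    """
    dist = NormalDist(mean(flows), stdev(flows))
    expected = len(flows) / n_classes
    # Interior class boundaries at probabilities 1/k, 2/k, ..., (k-1)/k.
    edges = [dist.inv_cdf(i / n_classes) for i in range(1, n_classes)]
    observed = [0] * n_classes
    for q in flows:
        observed[sum(q > e for e in edges)] += 1
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical annual maximum flows in m3/s, NOT data from the study.
raw = [120.0, 95.0, 210.0, 150.0, 80.0, 640.0, 130.0, 110.0, 175.0, 140.0]
print("Chi-squared (raw):", chi_squared_statistic(raw))
```

The computed statistic is then compared with a tabulated critical value for the chosen significance level and degrees of freedom; a smaller statistic indicates a closer fit.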
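The Dmax statistic can be computed as in the following Python sketch. It uses the standard two-sided comparison between the empirical and fitted cumulative distribution functions, which is an assumption about how the study evaluated Dmax; the fitted Normal distribution and the data are illustrative only.

```python
from statistics import NormalDist, mean, stdev


def ks_dmax(flows, cdf):
    """Largest absolute gap between the empirical CDF and a fitted CDF."""
    ordered = sorted(flows)
    n = len(ordered)
    d_max = 0.0
    for i, q in enumerate(ordered, start=1):
        f = cdf(q)
        # The empirical CDF jumps from (i-1)/n to i/n at q; check both sides.
        d_max = max(d_max, abs(i / n - f), abs(f - (i - 1) / n))
    return d_max


# Hypothetical annual maximum flows in m3/s, NOT data from the study.
raw = [120.0, 95.0, 210.0, 150.0, 80.0, 640.0, 130.0, 110.0, 175.0, 140.0]
fitted = NormalDist(mean(raw), stdev(raw))
print("Dmax (Normal, raw):", round(ks_dmax(raw, fitted.cdf), 3))
```

The distribution is accepted when Dmax is below the tabulated critical value for the sample size and significance level, so a lower Dmax after modification indicates a better fit.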

5. Conclusions

In this study, the 1712, 1717, and 1721 SGSs in the Eastern Mediterranean Basin were chosen as the study area, and the effect of the outliers on the flood discharges was investigated using various probability distributions. Both the Kolmogorov–Smirnov (K-S) and Chi-squared goodness-of-fit tests were applied to determine the reliability of the results. In the short 2-year and 5-year return periods, while the maximum raw flood values were calculated with the Normal distribution function, the maximum modified flow rates were estimated with the Pearson Type III distribution function. Furthermore, the adjustment of the outliers led to a decrease in flood discharges, which is attributed to drawing the outliers that lay beyond the limit values back to those limits.
In high return periods, the probability distribution giving the largest discharge differed according to the SGS. While the maximum flow rates for 1717 SGS were obtained with the Gumbel distribution, the Pearson Type III distribution gave the largest flood discharge values for the 1712 and 1721 SGSs. For the K-S test, the modification of the outliers did not improve the goodness of fit of the Gumbel and Normal distributions, but it had a positive effect on the fit of the Pearson Type III distribution. According to the Chi-squared test, the most compatible distribution for the raw values was the Normal distribution, while this changed to the Pearson Type III distribution after the modification of the outliers. Considering both tests, the most suitable probability distribution calculated with the raw values used the Pearson Type III function.
As a result, modifying the outliers in the observed streamflow data yields more reliable results in the design of hydraulic structures, especially in optimum cost calculations. In future studies, more detailed analyses can be performed for the estimation of discharges by using different statistical distribution functions and methods for the overall basin. Structural and non-structural measures should be assessed by preparing emergency action plans for areas where flood events are intense.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The author sincerely appreciates the Electrical Works Survey Administration (known as EIEI, location: Ankara, Turkey) for sharing the specified streamflow gauge station data. In addition, the author would like to thank Serin DEĞERLİ ŞİMŞEK, research assistant, for her valuable support while preparing this article.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Bhat, M.S.; Alam, A.; Ahmad, B.; Kotlia, B.S.; Farooq, H.; Taloor, A.K.; Ahmad, S. Flood frequency analysis of River Jhelum in Kashmir Basin. Quat. Int. 2019, 507, 288–294.
  2. Aghayev, A.T. Determining of different inundated land use in Salyan Plain during 2010 the Kura River flood through GIS and remote sensing tools. Int. J. Eng. Geosci. 2018, 3, 80–86.
  3. Baig, M.A.; Zaman, Q.; Baig, S.A.; Qasim, M.; Khalil, U.; Khan, S.A.; Ali, S. Regression analysis of hydro-meteorological variables for climate change prediction: A case study of Chitral Basin, Hindukush region. Sci. Total Environ. 2021, 793, 148595.
  4. Lotfirad, M.; Salehpoor, J.; Ashrafzadeh, A. Using the IHACRES Model to investigate the impacts of changing climate on streamflow in a semi-arid basin in North-Central Iran. Shahid Chamran Univ. Ahvaz J. Hydraul. Struct. 2019, 5, 27–41.
  5. Ahn, J.; Cho, W.; Kim, T.; Shin, H.; Heo, J.H. Flood frequency analysis for the annual peak flows simulated by an event-based rainfall-runoff model in an urban drainage basin. Water 2014, 6, 3841–3863.
  6. Mengistu, T.D.; Feyissa, T.A.; Chung, I.M.; Chang, S.W.; Yesuf, M.B.; Alemayehu, E. Regional flood frequency analysis for sustainable water resources management of Genale–Dawa River Basin, Ethiopia. Water 2022, 14, 637.
  7. Metzger, A.; Marra, F.; Smith, J.A.; Morin, E. Flood frequency estimation and uncertainty in arid/semi-arid regions. J. Hydrol. 2020, 590, 125254.
  8. Sahoo, A.; Ghose, D.K. Flood frequency analysis for Menace Gauging Station of Mahanadi River, India. J. Inst. Eng. Ser. A 2021, 102, 737–748.
  9. Ahmad, U.N.; Shabri, A.; Zakaria, Z.A. Flood frequency analysis of annual maximum streamflows using L-Moments and TL-Moments approach. Appl. Math. Sci. 2011, 5, 243–253.
  10. Farooq, M.; Shafique, M.; Khattak, M.S. Flood frequency analysis of River Swat using Log Pearson Type 3, Generalized Extreme Value, Normal, and Gumbel Max distribution methods. Arab. J. Geosci. 2018, 11, 216.
  11. Guru, N.; Jha, R. Flood frequency analysis of Tel Basin of Mahanadi river system, India using annual maximum and POT flood data. Aquat. Procedia 2015, 4, 427–434.
  12. Pekarova, P.; Halmova, D.; Mitkova, V.B.; Miklanek, P.; Pekar, J.; Skoda, P. Historic flood marks and flood frequency analysis of the Danube River at Bratislava, Slovakia. J. Hydrol. Hydromech. 2013, 61, 326–333.
  13. Tegegne, G.; Melesse, A.M.; Asfaw, D.H.; Worqlul, A.W. Flood Frequency Analyses over Different Basin Scales in the Blue Nile River Basin, Ethiopia. Hydrology 2020, 7, 44.
  14. Ahuchaogu, U.E.; Ojinnaka, O.C.; Njoku, R.N.; Baywood, C.N. Flood frequency analysis for River Niger at Lojoka, Kogi State using Log-Pearson Type III distribution. Int. J. Water Resour. Environ. Eng. 2021, 13, 30–36.
  15. Turhan, E.; Değerli, S.; Duyan Çulha, B. Comparison of Different Probability Distributions for Estimating Flood Discharges for Various Recurrence Intervals: The Case of Ceyhan River. Black Sea J. Sci. 2021, 11, 731–742.
  16. Ismail, A.Z.; Yusop, Z.; Tusof, Z. Comparison of flood distribution models for Johor River Basin. J. Teknol. 2015, 74, 123–128.
  17. Majeed, A.R.; Nile, B.K.; Al-Baidhani, J.H. Selection of suitable pdf model and build of idf curves for rainfall in Najaf City, Iraq. J. Phys. Conf. Ser. 2021, 1973, 012184.
  18. Sarfaraz, Q.; Masood, M.; Shakir, A.S.; Sarwar, M.K.; Khan, N.M.; Azhar, A.H. Flood Frequency Analysis Of River Swat Using Easyfit Model & Statistical Approach. Pak. J. Eng. Appl. Sci. 2021, 29, 8–21.
  19. Aydin, M. Investigation of the effect of outliers in hydrological data on flood frequency analysis. Yuz. Yil Univ. J. Inst. Nat. Appl. Sci. 2015, 20, 47–55.
  20. Lamontagne, J.R.; Stedinger, J.R.; Yu, X.; Whealton, C.A.; Xu, Z. Robust flood frequency analysis: Performance of EMA with Multiple Grubbs-Beck Outlier Tests. Water Resour. Res. 2016, 52, 3068–3084.
  21. Saeed, S.F.; Mustafa, A.S.; Al Aukidy, M. Assessment of flood frequency using maximum flow records for the Euphrates River, Iraq. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1076, 012111.
  22. Wu, Y.B.; Xue, L.Q.; Liu, Y.H. Local and Regional Flood Frequency Analysis based on Hierarchical Bayesian model in Dongting Lake Basin, China. Water Sci. Eng. 2019, 12, 253–262.
  23. Hlavcova, K.; Kohnova, S.; Borga, M.; Horvat, O.; St’astny, P.; Pekarova, P.; Majercakova, O.; Danacova, Z. Post-event analysis and flash flood hydrology in Slovakia. J. Hydrol. Hydromech. 2016, 64, 304–315.
  24. Besha, K.Z.; Demissie, T.A.; Feyessa, F.F. Spatiotemporal trends and detection of changes in hydrological and climatic variables of Modjo River Watershed, Ethiopia. Theor. Appl. Climatol. 2021; preprints.
  25. Chebana, F.; Dabo-Niang, S.; Ouarda, T.B.M.J. Exploratory functional flood frequency analysis and outlier detection. Water Resour. Res. 2012, 48, 1–20.
  26. Heidarpour, B.; Panjalizadeh, M.B.; Ekramirad, A.; Hosseinzhad, A.; Ghasemian, L.A. Detection of outlier in flood observations: A case study of Tamer Watershed. Res. J. Recent Stud. 2015, 4, 150–153.
  27. Ng, W.W.; Panu, U.S.; Lennox, W.C. Chaos based analytical techniques for daily extreme hydrological observations. J. Hydrol. 2007, 342, 17–41.
  28. Okafor, G.C.; Jimoh, O.D.; Larbi, K.I. Detecting changes in hydro-climatic variables during the last four decades (1975–2014) on Downstream Kaduna River Catchment, Nigeria. Atmos. Clim. Sci. 2017, 07, 161–175.
  29. Koçyiğit, M.B.; Akay, H.; Babaiban, E. Evaluation of morphometric analysis of flash flood potential of Eastern Mediterranean Basin using principle component analysis. J. Fac. Eng. Archit. Gazi Univ. 2021, 36, 1669–1685.
  30. Eastern Mediterranean Basin Flood Action Plan, Turkish Republic Ministry of Agriculture and Forestry. 2019. Available online: https://L24.İm/5GOD (accessed on 11 February 2022).
  31. Electrical Works Survey Administration-EIEI. Streamflow Observation Annuals between 1962 and 2011. Ankara, Turkey. Available online: https://www.dsi.gov.tr/ (accessed on 15 February 2022).
  32. Özdemir, H. Applied Flood Hydrology; The General Directorate of State Hydraulic Works (DSI): Ankara, Türkiye, 1977.
  33. Buldur, A.D.; Pınar, A.; Başaran, A. Göksu River flood dated 05-07 March 2004 and its effect on Silifke. J. Selcuk Univ. Soc. Sci. Institue 2007, 17, 139–160.
  34. Babaiban, E. Assessment of Potential Flash Flood Risk Using Morphometric Parameters: The Eastern Mediterranean Basin Case Study; Gazi University Graduate School of Natural and Applied Sciences: Ankara, Turkey, 2020.
  35. Oğuz, K.; Akın, B.S. Evaluation of temperature, precipitation, aerosol variation in Eastern Mediterranean Basin. J. Eng. Sci. Des. 2019, 7, 244–253.
  36. Heidarpour, B.; Saghafian, B.; Yazdi, J.; Azamathulla, H.M. Effect of extraordinary large floods on at-site flood frequency. Water Resour. Manag. 2017, 31, 4187–4205.
  37. Alberta Transportation. Civil Projects Branch. Guidelines on Flood Frequency Analysis; Alberta Transportation: Edmonton, AB, Canada, 2001.
  38. Grubbs, F.E.; Beck, G. Extension of sample sizes and percentage points for significance tests of outlying observations. Technometrics 1972, 14, 847–854.
  39. Pawar, U.; Hire, P. Flood frequency analysis of the Mahi Basin by using Log Pearson Type III probability distribution. Hydrospatial Anal. 2019, 2, 102–112.
  40. Buishand, T.A.; Brandsma, T. Comparison of circulation classification schemes for predicting temperature and precipitation in The Netherlands. Int. J. Clim. 1997, 17, 875–889.
  41. Orke, Y.A.; Li, M.-H. Hydroclimatic Variability in the Bilate Watershed, Ethiopia. Climate 2021, 9, 98.
  42. Yang, R.; Xing, B. Spatio-Temporal Variability in Hydroclimate over the Upper Yangtze River Basin, China. Atmosphere 2022, 13, 317.
  43. Arrieta-Pastrana, A.; Saba, M.; Alcázar, A.P. Analysis of Climate Variability in a Time Series of Precipitation and Temperature Data: A Case Study in Cartagena de Indias, Colombia. Water 2022, 14, 1378.
  44. Onen, F.; Bagatur, T. Prediction of flood frequency factor for Gumbel distribution using regression and GEP Model. Arab. J. Sci. Eng. 2017, 42, 3895–3906.
  45. Brodie, I. Regional analysis of PROQ transforms for flood frequency estimation based on GRADEX principles. Australas. J. Water Resour. 2020, 24, 183–198.
  46. Cong, S.; Xu, Y. The effect of discharge measurement error in flood frequency analysis. J. Hydrol. 1987, 96, 237–254.
  47. Lalitha, S.; Tripathi, P. Detection of a pair of outliers in a sample from a Gumbel distribution with known scale parameter. J. Appl. Stat. 2018, 45, 243–254.
  48. Amin, M.T.; Rizwan, M.; Alazba, A.A. A best-fit probability distribution for the estimation of rainfall in Northern Regions of Pakistan. Open Life Sci. J. 2016, 11, 432–440.
  49. Feyissa, T.A.; Tukura, N.G. Evaluation of the best-fit probability of distribution and return periods of river discharge peaks. Case study: Awetu River, Jimma, Ethiopia. J. Sediment. Environ. 2019, 4, 360–368.
  50. Langat, P.K.; Kumar, L.; Koech, R. Identification of the most suitable probability distribution models for maximum, minimum, and mean streamflow. Water 2019, 11, 734.
  51. Mujiburrehman, K. Frequency analysis of flood flow at Garudeshwar Station in Narmada River, Gujarat, India. Univers. J. Environ. Res. Technol. 2013, 3, 677–684.
  52. Cerneaga, C.; Maftei, C. Flood frequency analysis of Casimcea River. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1138, 012014.
  53. Malik, M.A.; Bhatti, A.Z. Characterizing snowmelt regime of the River Swat—A case study. Tech. J. Univ. Eng. Technol. Taxila 2015, 20, 95–103.
  54. Bogdanowicz, E.; Kochanek, K.; Strupczewski, W. The weighted function method: A handy tool for flood frequency analysis or just a curiosity. J. Hydrol. 2018, 559, 209–221.
Figure 1. Locations of gauging stations [30].
Figure 2. Graph of the selected outliers in the streamflow data set for (a) 1712 SGS, (b) 1717 SGS, and (c) 1721 SGS.
Figure 3. Modification in statistical parameters: mean (m³/s) and standard deviation (m³/s) (M.: Modified).
Figure 4. Flood discharges at several return periods, plotted logarithmically (M.: Modified): (a) Normal distribution; (b) Gumbel distribution; (c) Pearson Type III distribution.
Figure 5. (a) Kolmogorov–Smirnov test results. (b) Chi-squared results: outliers extracted. (c) Chi-squared results: raw discharge values.
Table 1. Flow observation station information [31].

| Station | Station No | Stream | Observed Period (Year) | Average Rainfall (mm) | Elevation (m) | Average Temperature (°C) | Drainage Area (km²) |
|---|---|---|---|---|---|---|---|
| Bucakkışla | 1712 | Göksu River | 50 | 38.6 | 393 | 18.2 | 2702 |
| Kızılgeçit | 1717 | Lamas Creek | 25 | 44.4 | 975 | 12.7 | 1005.2 |
| Alaköprü | 1721 | Anamur Creek | 25 | 96.3 | 37 | 19.1 | 313.2 |
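Table 2. Probability distribution calculation stages.

Notation: x is the discharge value; μ, σ, and γ are the mean, standard deviation, and skewness coefficient of x; T is the return period; K_t = (x − μ)/σ is the frequency factor; z is the standard normal deviate with exceedance probability 1/T.

Normal
Parameters: Scale (σ), Location (μ)
Estimation technique: Method of Moments
Probability function:
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right], \quad -\infty < x < +\infty$$
Frequency factor: $K_t = z$

Gumbel (GEV Type II)
Parameters: Shape (k), Scale (σ), Location (μ)
Estimation technique: Method of L-moments
Probability function:
$$f_X(x) = \frac{1}{\sigma}\, e^{-\left(1 + 2\left(\frac{x-\mu}{\sigma}\right)\right)} \left(1 + 2\left(\frac{x-\mu}{\sigma}\right)\right)^{-\frac{3}{2}}$$
Frequency factor:
$$K_t = -\frac{\sqrt{6}}{\pi}\left[0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right]$$
$$T = \frac{1}{1 - e^{-e^{-\left(0.5772 + \frac{\pi K_t}{\sqrt{6}}\right)}}}$$

Pearson Type III
Parameters: Shape (α), Scale (β), Location (γ)
Estimation technique: Maximum likelihood, with
$$\alpha = \frac{\sigma}{\sqrt{\beta}}, \quad \beta = \left(\frac{2}{\gamma}\right)^{2}, \quad \epsilon = \mu - \sigma\sqrt{\beta}$$
Probability function:
$$f_X(x) = \frac{1}{\alpha\,\Gamma(\beta)} \left(\frac{x-\epsilon}{\alpha}\right)^{\beta-1} e^{-(x-\epsilon)/\alpha}$$
Frequency factor:
$$K_t = z + (z^{2}-1)\frac{\gamma}{6} + \frac{1}{3}(z^{3}-6z)\left(\frac{\gamma}{6}\right)^{2} - (z^{2}-1)\left(\frac{\gamma}{6}\right)^{3} + z\left(\frac{\gamma}{6}\right)^{4} + \frac{1}{3}\left(\frac{\gamma}{6}\right)^{5}$$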
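The frequency factors in Table 2 can be turned into discharge estimates through x_T = μ + K_t σ. The following is a minimal Python sketch under stated assumptions, not the author's implementation: the function names are illustrative, the Pearson Type III factor uses the standard six-term series in γ/6, and the inputs are the raw-data mean, standard deviation, and skewness reported for the 1712 SGS in Table 3. With these rounded inputs the results for the 1712 SGS fall within about 1 m³/s of the corresponding Table 4 entries; no claim is made that the same closed-form Gumbel factor reproduces the values listed for the other stations.

```python
import math
from statistics import NormalDist


def z_for_return_period(T):
    """Standard normal deviate z with exceedance probability 1/T."""
    return NormalDist().inv_cdf(1.0 - 1.0 / T)


def kt_normal(T):
    """Normal distribution: the frequency factor is z itself."""
    return z_for_return_period(T)


def kt_gumbel(T):
    """Gumbel frequency factor from the closed-form expression in Table 2."""
    return -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(T / (T - 1.0)))
    )


def kt_pearson3(T, skew):
    """Pearson Type III frequency factor as a series in k = skew / 6."""
    z = z_for_return_period(T)
    k = skew / 6.0
    return (
        z
        + (z**2 - 1.0) * k
        + (z**3 - 6.0 * z) * k**2 / 3.0
        - (z**2 - 1.0) * k**3
        + z * k**4
        + k**5 / 3.0
    )


def discharge(mean, std, kt):
    """Flood discharge for a given frequency factor: x_T = mean + K_t * std."""
    return mean + kt * std


# Raw-data statistics of the 1712 SGS, transcribed from Table 3.
MEAN, STD, SKEW = 224.0, 92.60, 1.011

if __name__ == "__main__":
    for T in (2, 5, 10, 25, 50, 100):
        q_n = discharge(MEAN, STD, kt_normal(T))
        q_g = discharge(MEAN, STD, kt_gumbel(T))
        q_p = discharge(MEAN, STD, kt_pearson3(T, SKEW))
        print(f"T={T:>4}  Normal={q_n:6.1f}  Gumbel={q_g:6.1f}  PearsonIII={q_p:6.1f}")
```

Keeping the frequency factor separate from the site statistics mirrors the layout of Table 2: the same three `kt_*` functions can be reused for any station by swapping in its mean, standard deviation, and skewness.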
Table 3. Statistical parameters of the data used in the calculations (M.: Modified).

| Statistical Parameters | 1712 SGS: Value | 1712 SGS: M. Value | 1717 SGS: Value | 1717 SGS: M. Value | 1721 SGS: Value | 1721 SGS: M. Value |
|---|---|---|---|---|---|---|
| Mean (m³/s) | 224 | 218 | 22 | 20 | 243 | 239 |
| Variance (m⁶/s²) | 8574 | 5862 | 295 | 129 | 9340 | 7228 |
| Standard Deviation (m³/s) | 92.60 | 76.56 | 17.19 | 11.34 | 96.64 | 85.02 |
| Skewness | 1.011 | 0.376 | 2.762 | 1.010 | 0.347 | −0.005 |
| Kurtosis | 1.381 | −0.293 | 9.742 | 0.247 | −0.075 | −0.914 |
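As a quick consistency check on Table 3, each variance should equal the square of the corresponding standard deviation up to rounding. The short Python sketch below uses values transcribed from the table; the dictionary layout and the helper name `relative_gap` are illustrative assumptions, not part of the original study.

```python
# (variance in m^6/s^2, standard deviation in m^3/s), transcribed from Table 3.
TABLE3 = {
    "1712 raw": (8574, 92.60),
    "1712 modified": (5862, 76.56),
    "1717 raw": (295, 17.19),
    "1717 modified": (129, 11.34),
    "1721 raw": (9340, 96.64),
    "1721 modified": (7228, 85.02),
}


def relative_gap(variance, std):
    """Relative difference between the tabulated variance and std squared."""
    return abs(std**2 - variance) / variance


if __name__ == "__main__":
    for label, (variance, std) in TABLE3.items():
        gap = relative_gap(variance, std)
        print(f"{label:<14} std^2={std**2:9.1f}  variance={variance:6d}  gap={gap:.2%}")
```

All six pairs agree to well under one percent, which is what rounding the variance to whole numbers would produce.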
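Table 4. Raw discharge values (m³/s).

| Return Period (years) | 1712 SGS: Normal | 1712 SGS: Gumbel | 1712 SGS: Pearson Type III | 1717 SGS: Normal | 1717 SGS: Gumbel | 1717 SGS: Pearson Type III | 1721 SGS: Normal | 1721 SGS: Gumbel | 1721 SGS: Pearson Type III |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 224 | 209 | 209 | 22 | 19 | 15 | 243 | 229 | 237 |
| 5 | 303 | 291 | 295 | 36 | 38 | 30 | 325 | 333 | 322 |
| 10 | 343 | 345 | 349 | 44 | 50 | 43 | 367 | 401 | 370 |
| 25 | 387 | 414 | 414 | 52 | 65 | 61 | 412 | 488 | 423 |
| 50 | 415 | 465 | 460 | 57 | 77 | 75 | 442 | 552 | 459 |
| 100 | 440 | 515 | 505 | 62 | 88 | 90 | 468 | 616 | 492 |
| 200 | 463 | 565 | 548 | 66 | 99 | 105 | 492 | 679 | 523 |
| 500 | 491 | 631 | 604 | 71 | 114 | 125 | 521 | 763 | 562 |
| 1000 | 510 | 682 | 645 | 75 | 126 | 140 | 541 | 826 | 590 |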
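Table 5. Modified discharge values (m³/s).

| Return Period (years) | 1712 SGS: Normal | 1712 SGS: Gumbel | 1712 SGS: Pearson Type III | 1717 SGS: Normal | 1717 SGS: Gumbel | 1717 SGS: Pearson Type III | 1721 SGS: Normal | 1721 SGS: Gumbel | 1721 SGS: Pearson Type III |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 218 | 206 | 214 | 20 | 19 | 18 | 239 | 227 | 239 |
| 5 | 283 | 274 | 281 | 30 | 31 | 29 | 311 | 318 | 311 |
| 10 | 317 | 318 | 319 | 35 | 39 | 35 | 348 | 378 | 348 |
| 25 | 352 | 375 | 352 | 40 | 49 | 43 | 388 | 455 | 388 |
| 50 | 376 | 417 | 391 | 43 | 56 | 49 | 414 | 511 | 414 |
| 100 | 397 | 459 | 417 | 47 | 64 | 54 | 437 | 567 | 437 |
| 200 | 416 | 500 | 442 | 49 | 71 | 60 | 458 | 623 | 458 |
| 500 | 439 | 555 | 474 | 53 | 81 | 67 | 484 | 697 | 484 |
| 1000 | 454 | 596 | 496 | 55 | 89 | 72 | 501 | 752 | 515 |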
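The effect of the outlier modification can be read off by comparing Table 4 with Table 5 at a fixed return period. The Python sketch below does this for the 100-year estimates only; the values are transcribed from the two tables, while the data layout and the helper name `percent_change` are illustrative assumptions.

```python
DISTRIBUTIONS = ("Normal", "Gumbel", "Pearson Type III")

# 100-year flood discharges (m^3/s), transcribed from Table 4 (raw data).
RAW_100 = {"1712": (440, 515, 505), "1717": (62, 88, 90), "1721": (468, 616, 492)}

# 100-year flood discharges (m^3/s), transcribed from Table 5 (modified data).
MODIFIED_100 = {"1712": (397, 459, 417), "1717": (47, 64, 54), "1721": (437, 567, 437)}


def percent_change(raw, modified):
    """Signed change from the raw to the modified estimate, in percent."""
    return 100.0 * (modified - raw) / raw


def changes_for_station(station):
    """Map each distribution to its percent change at the given station."""
    return {
        name: percent_change(raw, mod)
        for name, raw, mod in zip(DISTRIBUTIONS, RAW_100[station], MODIFIED_100[station])
    }


if __name__ == "__main__":
    for station in RAW_100:
        summary = ", ".join(
            f"{name}: {change:+.1f}%"
            for name, change in changes_for_station(station).items()
        )
        print(f"{station} SGS  {summary}")
```

Every 100-year estimate decreases after the modification, and the decrease is largest for the Pearson Type III estimate at the 1717 SGS, the station with the highest raw skewness in Table 3.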
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Turhan, E. An Investigation on the Effect of Outliers for Flood Frequency Analysis: The Case of the Eastern Mediterranean Basin, Turkey. Sustainability 2022, 14, 16558. https://doi.org/10.3390/su142416558

