1. Introduction
A tidal river is affected by the tides, which are driven by the sun and the moon, and its water level is characterized by periodic rising and falling [1]. Thus, flood damage can be greater at high tide due to the backwater effect. However, it is very difficult to measure the flood discharge quantitatively because reverse discharge (i.e., flow from downstream to upstream; negative discharge) can occur depending on the tide. Researchers have therefore conducted studies on protecting human life and property from natural disasters such as flooding and inundation caused by sea-level fluctuations in coastal areas [2,3,4].
Hidayat et al. constructed artificial neural network (ANN) models using upstream water level data and tide level data to predict the discharge of a tidal river. They obtained good performance for predictions for up to two days in advance [5]. Fan et al. used the Qual2K and HEC-RAS models to assess the water quality of a tidal river in northern Taiwan and found that water quality varies according to the tidal effect [6]. Kim and Kim (2013) studied the spatiotemporal variability characteristics of daily mean tidal residuals on the coasts of Korea. They reported that the variability characteristics on Korea’s Yellow Sea coastline are dominated by the influence of wind along the north-south axis rather than changes in atmospheric pressure [7].
To determine tidal residuals accurately, it is important to isolate the tide level time series and determine its characteristics. Liu showed that the energy distribution corresponding to the frequencies shown in the Fourier spectrum is not always constant over a 10-min period; instead, it shows intermittent increases and decreases according to sea-surface fluctuations [8]. The study noted that this intermittent nature of wave groupings, which characterizes the fluctuations appearing in wind wave recordings, is not shown in the frequency spectrum.
Kang and Lin (2007) conducted a wavelet analysis on three kinds of hydrological data: precipitation, stream flow and well water levels. Although there was no temporal pattern in the case of precipitation, such patterns were present in the stream flow and well water levels [9]. Kang et al. (2013) used Fourier interpretation and wavelet analysis to shed light on tidal residual characteristics in coastal waters. As a result, tidal residual components were sorted into those with short periods of 24 h or less, mid-periods of 1 to 16 days and long periods of 1 month or more. They found that the mid-period components were largely affected by seasonal winds [10].
Research has mainly been conducted on tide levels and waves in coastal areas and there has been a lack of interest in how the water level of a tidal river is influenced by various components [11,12,13]. Therefore, in this study, we propose a methodology for dividing the water level data of a tidal river into four components: tide, wave, rainfall runoff and noise. The influence of these components was analyzed.
2. Methodology
A tidal river has complex time-series characteristics due to the influence of various factors. We employed a methodology to decompose the water level of a tidal river into distinct components. We judged that the water level data of a tidal river are mainly influenced by the tides, waves, rainfall runoff and noise and we sorted the data in terms of these components. The approach uses the water-level time series of the river and the sea-level time series of the nearby sea, which have similar statistical characteristics because the two locations are close to each other. First, we performed a periodicity analysis to separate the periodic tide component and the non-periodic wave component. Next, we removed these components from the water level of the river and then used filtering analysis to separate the rainfall-induced runoff component and the noise component. The study framework is shown in Figure 1.
2.1. Wavelet Analysis
The wavelet transform was first proposed by Haar in the early 1900s [5,14,15]. It is an orthogonal transformation method like the Fourier transform, which has been the most widely used method in the field of signal processing [16]. The wavelet transform can decompose data or a given function into various frequency components and is used to interpret each of these components according to the resolution corresponding to its scale [17,18,19]. Unlike frequency spectrum analysis, wavelet analysis includes time information. Thus, it can be used to analyze a spectrum’s temporal characteristics and facilitates the analysis of transient and irregular wave states. In the case of Fourier transforms, a new basis space is constituted by basis functions such as sine, cosine, or exponential functions. In the case of wavelet transforms, however, it is constituted by new basis functions called mother wavelets [20,21]. Therefore, Fourier transforms are localized in only the frequency domain, whereas wavelet transforms are localized in both the frequency and time domains [22]. This is a key advantage when analyzing non-stationary phenomena.
The mother wavelet $\psi$ is mathematically expressed as follows:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right) \qquad (1)$$

This equation represents the scaling and translation of the basis function. Here, $a$ is a value that determines the scaling and $b$ is a value that determines how much to shift the function. Equation (1) uses a wavelet function that shifts the mother wavelet by $b$ and changes its scale by $a$ [23]. The width of the wavelet function becomes narrower at higher frequencies and wider at lower frequencies. The wavelet transform expresses an arbitrary function through the superposition of wavelet basis functions and each function has a different scale level with a resolution corresponding to that level.
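The scaling and translation described above can be illustrated with a minimal sketch. It assumes a Mexican-hat mother wavelet, a scale written as `a`, a shift written as `b` and a synthetic hourly series with a 12-h period; none of these choices are taken from the study, and the function names are hypothetical.

```python
import math

def mexican_hat(u):
    """Real-valued Mexican-hat mother wavelet (unnormalized; an assumed choice)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)

def daughter(t, a, b):
    """Mother wavelet shifted by b and scaled by a, with 1/sqrt(a) normalization."""
    return mexican_hat((t - b) / a) / math.sqrt(a)

def wavelet_coefficient(signal, a, b, dt=1.0):
    """Discrete approximation of the integral of x(t) * psi_{a,b}(t) dt."""
    return sum(x * daughter(i * dt, a, b) for i, x in enumerate(signal)) * dt

def mean_power(signal, a, dt=1.0, margin=64):
    """Average squared coefficient over shifts that stay away from the edges."""
    shifts = range(margin, len(signal) - margin)
    total = sum(wavelet_coefficient(signal, a, b * dt, dt) ** 2 for b in shifts)
    return total / len(shifts)

# Synthetic hourly series with a 12-h period (illustrative, not observed data).
tide_like = [math.sin(2.0 * math.pi * t / 12.0) for t in range(256)]

scales = [1, 2, 3, 4, 6, 8]
power = {a: mean_power(tide_like, a) for a in scales}
best_scale = max(power, key=power.get)  # scale with the largest mean power
```

Small scales produce narrow wavelets that respond to fast oscillations, and large scales produce wide wavelets that respond to slow ones, so the mean power over the tested scales peaks at the scale that best matches the 12-h oscillation.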
2.2. Curve Fitting
Given a set of non-stationary data that is realistically obtainable, curve fitting refers to the method of finding a mathematical straight line or curve that can best represent the data points in that set. Curve fitting is used as a core technique for the analysis of experimental data in a wide range of fields, such as science and engineering, statistics and various automation technologies. Its value has risen in tandem with the rapid advances in computer technology since the 1980s.
There are two main types of curve fitting: least-squares regression and interpolation. Least-squares regression is used when the data contain a significant degree of error or scattered data points but the method has a drawback in that the derived curve cannot fit all the data points. Interpolation is used when there is very precise knowledge of the data. Interpolation methods are divided into linear interpolation and polynomial interpolation. We isolated tide-level components that undergo periodic oscillations that are describable by sine and cosine functions by using a polynomial interpolation that consists of sine functions:
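As a hedged illustration of sine-based fitting, the sketch below fits a single sine term with a known period by linear least squares. The single-term model, the 12.42-h period and all function names are assumptions made for this example; it is not a reproduction of the fitting function used in the study.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_sine(times, values, period):
    """Fit y = offset + amplitude * sin(omega * t + phase) for a known period."""
    omega = 2.0 * math.pi / period
    basis = [[1.0, math.sin(omega * t), math.cos(omega * t)] for t in times]
    # Normal equations (B^T B) c = B^T y for the three linear coefficients.
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    bty = [sum(row[i] * y for row, y in zip(basis, values)) for i in range(3)]
    offset, s, c = solve_linear(btb, bty)
    # amplitude * sin(wt + phase) = s * sin(wt) + c * cos(wt)
    return offset, math.hypot(s, c), math.atan2(c, s)

# Synthetic tide-like series (illustrative values only).
times = list(range(96))
values = [1.5 + 0.8 * math.sin(2.0 * math.pi * t / 12.42 + 0.3) for t in times]
offset, amplitude, phase = fit_sine(times, values, period=12.42)
```

Because the period is fixed in advance, the problem is linear in the remaining coefficients and no iterative optimizer is needed. Fitting an unknown period, or several sine terms at once, would require a nonlinear solver.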
2.3. Filters
A filter limits noise within the frequency range of a desired signal by removing unwanted portions of the frequency spectrum [24]. Filters are generally divided into four types: low-pass, high-pass, band-pass and band-reject filters. As shown in Figure 2, components with a high frequency can be extracted with a high-pass filter and components with a low frequency can be extracted with a low-pass filter. The aim in this study was partly to extract the long-period component of the tide level and we used a high-pass filter that passes signals in a band ranging from a certain cutoff frequency to infinity.
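The split between low-frequency and high-frequency content can be sketched as follows. The example assumes a centered moving average as the low-pass filter and takes the high-pass output as the remainder; the filter design and cutoff used in the study are not given here, so the window length and function names are hypothetical.

```python
import math

def moving_average(signal, window):
    """Centered moving average, a simple low-pass filter (window should be odd)."""
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)  # shrink window at edges
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def high_pass(signal, window):
    """High-frequency remainder: the signal minus its low-pass version."""
    low = moving_average(signal, window)
    return [x - l for x, l in zip(signal, low)]

# Synthetic series: slow linear rise plus a fast ripple (illustrative values).
series = [0.05 * t + 0.1 * math.sin(2.0 * math.pi * t / 4.0) for t in range(48)]
slow_part = moving_average(series, window=9)
fast_part = high_pass(series, window=9)
```

A longer window lowers the effective cutoff frequency, so more of the signal is assigned to the high-pass output. By construction, the low-pass and high-pass outputs sum back to the original series.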
2.4. Entropy Theory for Information Measurement
Shannon and Weaver [25,26] defined the marginal entropy, as shown in Equation (3):

$$H(X) = -\sum_{i} p(x_i)\,\log p(x_i) \qquad (3)$$

where $p(x_i)$ is the occurrence probability of $x_i$ and $H(X)$ is the marginal entropy that represents the amount of information of $X$.

If a variable $Y$ has information related to a variable $X$, the uncertainty of $X$ may be reduced. Based on the theory, the conditional entropy $H(X \mid Y)$ in $X$ with the given $Y$ can be estimated [27], as shown in Equation (4):

$$H(X \mid Y) = -\sum_{i}\sum_{j} p(x_i, y_j)\,\log p(x_i \mid y_j) \qquad (4)$$

where $p(x_i, y_j)$ is the joint probability of $x_i$ and $y_j$ and $p(x_i \mid y_j)$ is the conditional probability of $x_i$ with the given $y_j$. The information transferred between $X$ and $Y$ is defined in Equation (5):

$$T(X, Y) = H(X) - H(X \mid Y) \qquad (5)$$

The information sent from $X$ to $Y$ is defined as $S(X, Y)$, where $H(X)$ is the marginal entropy of a single variable $X$ and $T(X, Y)$ is the trans-information between $X$ and $Y$.
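The entropy measures above can be estimated from binned data, as in the minimal sketch below. It assumes the standard Shannon definitions with natural logarithms and relative frequencies as probability estimates: marginal entropy H(X), conditional entropy H(X|Y) and trans-information T(X, Y) = H(X) - H(X|Y). The binning width and function names are hypothetical.

```python
import math
from collections import Counter

def discretize(values, width):
    """Assign each value to a bin of the given width (an assumed binning rule)."""
    return [math.floor(v / width) for v in values]

def marginal_entropy(xs):
    """H(X) = -sum p(x) log p(x), with p estimated by relative frequency."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())

def conditional_entropy(xs, ys):
    """H(X|Y) = -sum p(x, y) log p(x|y), where p(x|y) = count(x, y) / count(y)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    y_counts = Counter(ys)
    return -sum((c / n) * math.log(c / y_counts[y]) for (_, y), c in joint.items())

def transinformation(xs, ys):
    """T(X, Y) = H(X) - H(X|Y): the reduction in uncertainty of X given Y."""
    return marginal_entropy(xs) - conditional_entropy(xs, ys)

# Illustrative values only: a water-level series and one of its components.
level = discretize([0.42, 0.55, 1.31, 1.48, 0.47, 1.52, 0.58, 1.36], width=0.5)
component = discretize([0.10, 0.12, 0.61, 0.70, 0.11, 0.74, 0.14, 0.66], width=0.5)
shared = transinformation(level, component)
```

When one series fully determines the other, T(X, Y) equals H(X). When the two are independent, T(X, Y) is zero. The estimate depends on the chosen bin width, so results from different binnings are not directly comparable.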
4. Results and Discussion
In this study, we applied wavelet analysis and a sine fitting function to show that the tide and wave components can be separated from the stream water level data and that the rainfall-runoff and noise components can be separated by using a high-pass filter. We applied each analytical method to the water-level time series obtained from stations in Ulsan and Guyeong and separated four components (tide, wave, rainfall-runoff and noise). The main results are summarized in Figure 9 and the rate at which each component influences the fluctuations in the water-level time series is shown in Table 1. The ratios in Table 1 are the average of the calculated ratios of each component divided by the stream water level in the entire time series, as in Equation (9).
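The averaging described above can be sketched as follows, assuming the ratio is taken time step by time step and then averaged over the series. The handling of zero water levels and the function name are assumptions made for this example.

```python
def influence_ratio(component, water_level):
    """Mean of component(t) / water_level(t) over the series, in percent.

    Time steps with a zero water level are skipped (an assumed convention).
    """
    ratios = [c / h for c, h in zip(component, water_level) if h != 0]
    return 100.0 * sum(ratios) / len(ratios)

# Illustrative values only, not data from the study.
water_level = [1.0, 1.0, 2.0, 2.0]
tide = [0.6, 0.5, 0.8, 1.0]
runoff = [0.4, 0.5, 1.2, 1.0]

tide_share = influence_ratio(tide, water_level)
runoff_share = influence_ratio(runoff, water_level)
```

In this toy example the two components sum to the water level at every time step, so their shares add up to 100%.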
In the water-level time series (Figure 9a), fluctuations at low water levels were largely influenced by tide fluctuations and fluctuations at high water levels were influenced by the rainfall-runoff component. The rate at which each component influences the water-level time series is approximately 57.40% for the tide component and 27.62% for the rainfall-runoff component, which together account for more than 85% of the influence exerted by all the components. The wave component is the next most influential at approximately 12.52% and the noise component was found to have very little influence at less than 3%. To confirm that the time series of the stream water level is properly decomposed, the time series for the 2012 typhoon ‘Sanba’ is shown in Figure 10. The tide component is periodic and shows no variation. The wave and rainfall-runoff components rose when rainfall occurred. When typhoon ‘Sanba’ occurred, the stream water level exceeded 2 m, which indicates that the stream water level time series was properly decomposed. If the Ulsan stream water level station had been situated a little further upstream from its actual location in the estuary, then the influence of the rainfall-runoff component would have been greater than that of the tide component.
Entropy theory was applied to estimate how much information about the stream water level series can be obtained from each component [19]. Markus et al. and Kim et al. used the mutual information theory of entropy to evaluate the stream water level at a station [19,26]. This theory was also used in the present study [26].
The entropy analysis showed that the amounts of information sent to the stream water level by the tide, wave, rainfall-runoff and noise components were about 2.42, 2.39, 1.43 and 0.89, respectively. These results differ slightly from the rates of influence obtained with the statistical method. The rainfall-runoff component had the second highest influence rate but it was ranked third in terms of the amount of information. The reason is judged to be the characteristics of the data series in each component. The rainfall-runoff component has a large effect on the water-level series during occasional rainfall events, but it carries less information about the stream water level series than the wave component because its uncertainty is larger. For the noise component, which is considered to have the greatest uncertainty, the information on the stream water level series was the smallest at 0.89 (12.5%).
To assess the results, we compared the rainfall-runoff component of the stream water level data from the Ulsan station with the stream water level data from the Guyeong station located upstream, as shown in Figure 11 and Figure 12. The sensitivity of the stream water level to rain events is different at the Guyeong station given its characteristics but the data patterns are quite similar. As can be seen from Figure 3, Guyeong station is located upstream of the Ulsan station, which means that the stream water level of Ulsan station is influenced by the stream water level of Guyeong station. That is, there is a high degree of correlation and similarity when we standardize and compare the two sets of time series data, as shown in Figure 12. The correlation was 0.754, the mean absolute error (MAE) was 0.068 m and the root mean square error (RMSE) was 0.114. Thus, the components that influence the stream water level of a tidal river can be decomposed effectively through the proposed methodology.
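The comparison statistics named above can be computed as in the following sketch. It assumes standardization to zero mean and unit (population) standard deviation and the usual definitions of the Pearson correlation, MAE and RMSE. The series values and function names are illustrative and are not taken from the two stations.

```python
import math

def standardize(xs):
    """Rescale a series to zero mean and unit population standard deviation."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / std for x in xs]

def correlation(xs, ys):
    """Pearson correlation, computed as the mean product of standardized values."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

def mae(xs, ys):
    """Mean absolute error between two equally long series."""
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

def rmse(xs, ys):
    """Root mean square error between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

# Illustrative values only.
downstream_runoff = [0.10, 0.12, 0.45, 0.80, 0.52, 0.20]
upstream_level = [0.30, 0.33, 0.90, 1.60, 1.05, 0.50]

z_down = standardize(downstream_runoff)
z_up = standardize(upstream_level)
r = correlation(downstream_runoff, upstream_level)
error_mae = mae(z_down, z_up)
error_rmse = rmse(z_down, z_up)
```

Standardizing both series first removes differences in mean level and amplitude between stations, so the error measures reflect differences in pattern rather than in magnitude.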
Nevertheless, the present study has some limitations and room for improvement. First, if a sufficient amount of time series data is obtained, it will be possible to overcome the procedural complexity of decomposing the series into short-, mid- and long-period components and integrating them. Second, the methodology can be applied only if the variability characteristics of the tidal river’s water-level time series and sea-level time series are close enough to coincide with one another. Third, Figure 5 shows a pronounced 24-h component, but this component could not be extracted because it did not satisfy the 95% significance level. However, since the 24-h component is closely related to the tide, it is necessary to extract this component using another methodology in the future. Finally, if there is an upstream station that shows a diurnal cycle, this effect should be considered when the approach is applied.
5. Conclusions
This study divided the components that influence the stream water level of a tidal river into tide, wave, rainfall-induced runoff and noise components and proposed a methodology for isolating each one. To this end, we used the data available from hydrological stations in the vicinity of a tidal river and effectively separated the tide and wave components through wavelet analysis. The rainfall-runoff and noise components were separated through filtering analysis. Thus, this study could contribute to the understanding of the behavior of water-level time series in a tidal river, which has a complex structure due to the interaction of various components. Follow-up studies are expected to predict the stream water level of a tidal river while considering the variability characteristics of each component in connection with climate factors. Furthermore, the results of this study and predicted water levels could be used to manage tidal river water levels during the flood season, which could make it possible to minimize the damage caused by water disasters such as typhoons.