Article

Spatio-Temporal Model of a Product–Sum Simulation on Stream Network Based on Hydrologic Distance

by Achmad Bachrudin 1,*, Budi Nurani Ruchjana 1, Atje Setiawan Abdullah 1 and Rahmat Budiarto 2
1 Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Jatinangor 45363, Indonesia
2 College of Computer Science and Information Technology, Al-Baha University, Alaqiq 65779-7738, Saudi Arabia
* Author to whom correspondence should be addressed.
Water 2023, 15(11), 2039; https://doi.org/10.3390/w15112039
Submission received: 20 April 2023 / Revised: 22 May 2023 / Accepted: 26 May 2023 / Published: 27 May 2023
(This article belongs to the Special Issue Advanced Technology for Smart Environment and Water Treatment)

Abstract

The modeling of spatio-temporal processes is crucial in many fields such as environmental science, hydrology, and water storage engineering. A basic concept of the spatio-temporal model is the auto-covariance function. A stream network model that uses the Euclidean distance in the auto-covariance function to investigate river pollution can face a serious accuracy problem, because water pollutants move only within the river network, whereas the Euclidean distance does not follow the path of the river network. The use of the hydrologic distance may overcome the problem because, statistically, the Euclidean distance-based auto-covariance function is not valid if it is negative-definite. Therefore, this study develops a spatio-temporal model of the generalized product–sum based on the hydrologic distance with a positive-definite property. The proposed model is built and then verified mathematically to be positive-definite. Furthermore, the relationships between the hydrologic distance, the Euclidean distance, and spatial configurations are also investigated. The simulation study shows that the proposed spatio-temporal model is positive-definite for some semivariogram models. The relationships between spatial configurations in the stream network follow an exponential model.

1. Introduction

In the past two decades, the development of spatio-temporal models has grown rapidly in environmental science, metrology, hydrology, water storage engineering, and even social science. There have been plenty of research works on the development of spatio-temporal models, such as those on auto-covariance models [1,2,3,4,5,6,7,8,9,10,11,12], on spatio-temporal product–sum models [3,7,10], and on stream networks [13,14,15,16,17]. Matheron et al. [18] developed geostatistical models based on the Euclidean distance for analyzing pollutants in a river network using a standard semivariogram model and the kriging method. The Euclidean distance was deployed because the covariance function cannot capture some characteristics of stream networks, especially in a river network, i.e., (1) a spatial configuration, (2) a longitudinal connectivity, and (3) a flow direction or discharge [19,20,21]. Therefore, applying the Euclidean distance does not make sense from an ecological standpoint since it does not consider those materials (pollutants) [17]. Moreover, the standard semivariogram models based on the hydrologic distance are potentially negative-definite, and not valid [22,23,24]. In the spatial field, the auto-covariance function, through the variogram concept used to measure the degree of dissimilarity [25], is related to the Euclidean distance or spatial distance and the variables of interest, while in the field of time series data, it aims to identify the model that fits the characteristics of the time series data.
The hydrologic distance is considered more appropriate than the Euclidean distance on stream networks, especially in a river network [26]. This study focuses on the tail-up model based on the hydrologic distance, which allows for the connected flow that can carry many kinds of materials from one location to other locations on the river network [27]. Some studies have addressed the behavior of the relationships between the two distance models in stream networks, such as the study by Maki and Okabe [28], in which the application of both distance models is related to the scale of the stream networks. The spatial configuration is relatively cumbersome for a larger scale stream network and depends on the hydrologic distance: the spatial configuration decreases drastically as the hydrologic distance increases. This issue is also addressed in this study, in addition to developing the spatio-temporal model.
The water quality data are time series data composed of water quality parameters collected at different times, used to describe the change in water quality status with time. They reflect the state and characteristic that water quality parameters change periodically with time. Current studies focus on explaining and calculating pollution parameters but do not focus on the relationship between pollution parameters and water and sediment quality.
A basic concept of the spatio-temporal model is an auto-covariance function. The stream network model based on the Euclidean distance as the auto-covariance function for investigating river pollution will potentially face a serious accuracy problem, due to the material or water pollutant movement on the water only occurring within the river network and the distance travelled not following the path of the river network. The use of the hydrologic distance may overcome the problem because, statistically, the Euclidean distance-based auto-covariance function is not valid if it is a negative-definite [17].
The spatial model, theoretically, can be applied on the non-stationary spatial model with weights. O’Donnel [11] developed a spatio-temporal model for the hydrologic distance based on a river network with a non-parametric approach, namely penalized splines, as well as the hydrologic distance-based auto-covariance function in the tail-up model, while Boergens et al. [3] used the spatio-temporal model. Tang and Zimmerman [12] developed a spatio-temporal mixture model based on the hydrologic distance, with the tail-up auto-covariance function model and tail-down where, if the Euclidean distance model is used, then the model is a product–sum. Fernandez et al. [14] developed a spatio-temporal model for a river network using a Bayesian framework. The proposed model applies a product–sum model with the components of the covariance function which are the Euclidean and hydrologic distances in the tail-up and tail-down models. A bibliometric analysis on the fundamental concepts related to this study, i.e., the tail-up model, hydrologic distance, and spatio-temporal model of the product–sum has been carried out in [29]. The analysis revealed that there were no truly similar publications on these fundamental concepts until the year 2022. Table 1 summarizes the latest works closely related to the present work.
Thus, this study aims to develop a spatio-temporal model of the generalized product–sum based on the hydrologic distance with a positive-definite property. The proposed model is built by modifying the spatial component of the Euclidean distance-based spatio-temporal model with the hydrologic distance, which is then subsequently verified mathematically to be positive-definite. The relationships between the hydrologic distance, the Euclidean distance, and spatial configurations are further investigated.
This study contributes toward the body of knowledge of mathematics modeling for spatio-temporal systems by considering the product–sum and hydrologic distance. The model is useful for solving the problem of the inaccurate prediction of pollutant distribution in watersheds. Furthermore, we also indicate the necessary condition for the validity of the developed model in terms of lemmas and theorems.
The rest of this paper is structured as follows. Section 2 discusses the construction of the proposed model, starting with the derivation of the spatial model for the stream network in Section 2.1, followed by the consideration of the general product–sum and the hydrologic distance in Section 2.2. Section 2.3 describes the simulation method, including the simulation scenario, the software, and the procedure. Section 3 presents the simulation results and analysis. Section 4 presents a discussion on the overall simulation results, and lastly, Section 5 provides the conclusion and future work.

2. The Proposed Model

The overall process of deriving the proposed model is illustrated in Figure 1.

2.1. Spatial Model for a Stream Network

Ver Hoef et al. [17] developed two spatial models, namely tail-up and tail-down, and proposed the moving average construction method which incorporates both the stream distance and flow direction. The flow direction can be reached by choosing moving average functions whose tails go along with or against the flow direction, which are called tail-down or tail-up moving average construction, respectively [27]. Subsequently, the flow direction allows for making the moving average construction method suitable for different types of variables. Some types of variables, such as a water pollutant or other stream chemistry variables, will move along with the flow direction, especially for the connected flow; thus, this model allows for developing an auto-covariance function which can just be enabled for the connected flow, namely the tail-up model. Other variables, such as the occurrence of fish species, may move upstream, and it is necessary, in this case, to consider an auto-covariance function for the unconnected flow, namely the tail-down model. This model can be applied for both flow directions. The Euclidean distance is defined as the distance between two points, i.e., the length of the line segment between two points. The Euclidean distance can be computed by using the coordinate points and the Pythagoras theorem. The Euclidean distance may not accurately describe proximity relationships among spatial data. However, the hydrologic distance, which is a non-Euclidean distance, must be used with caution in geostatistical applications. There are no guarantees that existing covariance and variogram functions remain valid (i.e., positive-definite or conditionally negative-definite) when used with a non-Euclidean distance measure. There are certain distance measures that when used with existing covariance and variogram functions remain valid, an issue that is explored. 
The concept of isometric embedding is introduced and linked to the concepts of positive- and conditionally negative-definiteness to demonstrate classes of valid norm-dependent isotropic covariance and variogram functions, and many of the results have yet to appear in the mainstream geostatistical literature or applications. Thus, applying the Euclidean distance to the river network does not capture the water pollutants because this distance does not measure the length of the river segments. This paper is devoted to the tail-up model, as described briefly in the following paragraph.
Barry and Ver Hoef [26] showed that a large class of auto-covariance functions can be developed by creating random variables as integrals of a moving average function over a white-noise random process. The covariance of this integral form is given in (1) [22,26]:
$$C_s(h_r \mid \theta) = \begin{cases} \int_{-\infty}^{\infty} g(x \mid \theta)^2 \, dx, & h_r = 0 \\ \int_{-\infty}^{\infty} g(x \mid \theta)\, g(x - s \mid \theta) \, dx, & h_r > 0 \end{cases} \tag{1}$$
where $g(\cdot \mid \theta)$ denotes the moving average function. Equation (1) can be written in the form of the covariance function in (2):
$$C_s(h_r \mid \theta) = \begin{cases} C_1(0), & h_r = 0 \\ d(h_r)\, C_1(h_r \mid \theta), & h_r > 0 \end{cases} \tag{2}$$
Here $\theta = (\theta_1, \theta_2)$ is the vector of parameters whose elements are the sill and range, $s$ and $h_r$ are the location distance and hydrologic distance, respectively, $C_1(h_r)$ denotes an auto-covariance model, $\omega_k$ is a river weight, and
$$d(h_r) = \prod_{k} \omega_k$$
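As a minimal sketch of how the covariance in (2) could be evaluated, the following Python snippet multiplies the river weights along a path and scales an ordinary covariance model by the result. The exponential form `sill * exp(-h / range)` and all function names are assumptions made for illustration; the parameterization in Table 2 may differ.

```python
import math


def tail_up_weight(segment_weights):
    """d(h_r): product of the river weights omega_k along the path."""
    d = 1.0
    for w in segment_weights:
        if not 0.0 < w <= 1.0:
            raise ValueError("each river weight must lie in (0, 1]")
        d *= w
    return d


def exponential_cov(h, sill, range_):
    """Assumed covariance model C_1(h | theta) = sill * exp(-h / range)."""
    return sill * math.exp(-h / range_)


def tail_up_cov(h_r, segment_weights, sill, range_):
    """Equation (2): C_1(0) at zero distance, d(h_r) * C_1(h_r) otherwise."""
    if h_r == 0:
        return exponential_cov(0.0, sill, range_)
    return tail_up_weight(segment_weights) * exponential_cov(h_r, sill, range_)
```

Because every weight lies in (0, 1], the product can only shrink the covariance relative to the unweighted model.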
The semivariogram in geostatistics has around 20 models. In general, from a practical perspective, we determined the semivariogram using four initial points based on linear, spherical, exponential, or Gaussian models. Thus, we chose to consider the three models, i.e., spherical, exponential, and linear models.
Table 2 shows the standard semivariogram models for the temporal aspect [11] and the hydrologic distance [24]. It can be seen in the table that both types of semivariogram models are almost identical, except for $d(h_r)$, which reflects the structure of the river network with specified weighting methods that are applied specifically for the tail-up method. Some known methods include (1) using the segment’s proportional influence on the segment directly downstream, calculated by dividing the segment’s cumulative watershed area by the total incoming area at its downstream node, (2) incorporating the flow volume (if it is unavailable, then each split is simply weighted by $\tfrac{1}{2}$), or (3) applying a stream order [30], which is used in this study.

2.2. Spatio-Temporal Model of the General Product–Sum

There are two questions addressed in modeling the spatio-temporal model: (1) how to ensure one has a valid model and (2) how to fit the data to the model [31]. The product model used in [32] was extended to a product–sum model [3]. De Iaco et al. [16] subsequently developed the product–sum, which has a smaller number of parameters, to become a general product–sum model. The generalized product–sum model serves a large class of models which were not attainable by Cressie and Huang [1] and which are easily modeled using techniques similar to those used for modeling a spatial variogram [33]. In the last decade, the spatio-temporal model has quickly become used as a different approach. A covariance model is a basic concept in the spatio-temporal model, especially in the field of geostatistics because it is related to the concepts of the semivariogram model and the kriging method.
Figure 1. Proposed model derivation diagram [25,34].
Space-time data are assumed to be the realization of a stochastic process. Let $Z = \{Z(s,t) : (s,t) \in D \times T\}$ be a second-order stationary spatio-temporal random field, where $D \subseteq \mathbb{R}^d$ and $T \subseteq \mathbb{R}_+$. Its expected value, covariance, and semivariogram satisfy the following three assumptions:
$$E[Z(s,t)] = \mu, \quad \forall (s,t), \tag{3}$$
$$C_{s,t}(h_s, h_t) = \operatorname{Cov}\big(Z(s + h_s, t + h_t),\, Z(s,t)\big), \tag{4}$$
$$\gamma_{s,t}(h_s, h_t) = \frac{\operatorname{Var}\big(Z(s + h_s, t + h_t) - Z(s,t)\big)}{2}, \tag{5}$$
where $(s, s + h_s) \in D^2$ and $(t, t + h_t) \in T^2$.
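The semivariogram definition in (5) can be turned into a simple estimator. The sketch below uses the classical method-of-moments estimator (half the mean squared difference between values a given lag apart) for a regularly spaced time series at one site. The function name is hypothetical, and this is not necessarily the estimator implemented in the software used later in the paper.

```python
def empirical_temporal_semivariogram(series, lag):
    """Half the mean squared difference between values `lag` steps apart."""
    if lag <= 0 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    squared_diffs = [
        (series[i + lag] - series[i]) ** 2 for i in range(len(series) - lag)
    ]
    return sum(squared_diffs) / (2 * len(squared_diffs))
```

For a constant series the estimate is zero at every lag, which matches the intuition that the semivariogram measures dissimilarity.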
In the stationary case, we can deduce from the foregoing results that a real-valued function $C(h_i, h_j)$ defined on $\mathbb{R}^d \times \mathbb{R}$ is a stationary covariance function if and only if it is an even function, $C(h_i, h_j) = C(-h_i, -h_j)$, and positive-definite; that is,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j C(h_i, h_j) \ge 0 \tag{6}$$
for any $n \in \mathbb{N}$, and for any $(h_i, h_j) \in \mathbb{R}^d \times \mathbb{R}$ and $a_i \in \mathbb{R}$, $i = 1, \ldots, n$.
In the same way, a non-negative real-valued function $\gamma(h_i, h_j)$ defined on $\mathbb{R}^d \times \mathbb{R}$ is an intrinsically stationary semivariogram if and only if it is an even function, $\gamma(h_i, h_j) = \gamma(-h_i, -h_j)$, and conditionally negative-definite; that is,
$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \gamma(h_i, h_j) \le 0 \tag{7}$$
for any $n \in \mathbb{N}$, and for any $(h_i, h_j) \in \mathbb{R}^d \times \mathbb{R}$ and $a_i \in \mathbb{R}$, $i = 1, \ldots, n$, with $\sum_{i=1}^{n} a_i = 0$.
De Cesare et al. [3] first introduced the spatio-temporal product–sum model as follows:
$$C_{s,t}(h_s, h_t) = k_1 C_s(h_s)\, C_t(h_t) + k_2 C_s(h_s) + k_3 C_t(h_t) \tag{8}$$
where $C_s$ and $C_t$ are the valid spatial and temporal covariance models, respectively. In the form of a semivariogram, (8) can be written as:
$$\gamma_{s,t}(h_s, h_t) = \big[k_2 + k_1 C_t(0)\big]\gamma_s(h_s) + \big[k_3 + k_1 C_s(0)\big]\gamma_t(h_t) - k_1 \gamma_s(h_s)\, \gamma_t(h_t) \tag{9}$$
where $\gamma_{s,t}(h_s, h_t)$ is the spatio-temporal variogram, and $\gamma_s(h_s)$ and $\gamma_t(h_t)$ are the spatial and temporal variograms, respectively. The spatio-temporal model in (8) is said to be valid, i.e., positive-definite, if $k_1 > 0$, $k_2 \ge 0$, and $k_3 \ge 0$. From (9), we obtain (10).
$$k_2 + k_1 C_t(0) = k_s, \qquad k_3 + k_1 C_s(0) = k_t \tag{10}$$
In the product–sum model, the constraints are $k_s = 1$ and $k_t = 1$.
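The step from (8) to (9) can be checked numerically. The sketch below, with hypothetical function names and caller-supplied marginal covariances, evaluates both forms and relies on the identity $\gamma_{s,t}(h_s, h_t) = C_{s,t}(0,0) - C_{s,t}(h_s, h_t)$, which holds under second-order stationarity.

```python
import math


def product_sum_cov(h_s, h_t, k1, k2, k3, cov_s, cov_t):
    """Equation (8): k1*Cs*Ct + k2*Cs + k3*Ct."""
    return k1 * cov_s(h_s) * cov_t(h_t) + k2 * cov_s(h_s) + k3 * cov_t(h_t)


def product_sum_semivariogram(h_s, h_t, k1, k2, k3, cov_s, cov_t):
    """Equation (9), built from the marginal semivariograms."""
    gamma_s = cov_s(0.0) - cov_s(h_s)  # spatial semivariogram
    gamma_t = cov_t(0.0) - cov_t(h_t)  # temporal semivariogram
    return (
        (k2 + k1 * cov_t(0.0)) * gamma_s
        + (k3 + k1 * cov_s(0.0)) * gamma_t
        - k1 * gamma_s * gamma_t
    )


# Illustrative marginal covariances (assumed exponential models).
def cov_space(h):
    return 2.0 * math.exp(-abs(h) / 3.0)


def cov_time(h):
    return 1.5 * math.exp(-abs(h) / 2.0)
```

With any valid pair of marginals, `product_sum_semivariogram` should agree with `product_sum_cov(0, 0, ...) - product_sum_cov(h_s, h_t, ...)` up to rounding.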
De Iaco et al. [10] removed these constraints and developed a more general model, called the generalized product–sum. The semivariogram of the spatio-temporal model becomes:
$$\gamma_{s,t}(h_s, h_t) = \gamma_{s,t}(h_s, 0) + \gamma_{s,t}(0, h_t) - k\, \gamma_{s,t}(h_s, 0)\, \gamma_{s,t}(0, h_t) \tag{11}$$
where $\gamma_{s,t}(h_s, 0)$ and $\gamma_{s,t}(0, h_t)$ are the marginal spatial and temporal semivariogram models, respectively, and the constant $k$ can be formulated as in (12)–(15) [7].
$$k = \frac{k_1}{k_s k_t} = \frac{k_s C_s(0) + k_t C_t(0) - C_{s,t}(0,0)}{k_s C_s(0)\, k_t C_t(0)} \tag{12}$$
$$k_1 = \frac{k_s C_s(0) + k_t C_t(0) - C_{s,t}(0,0)}{C_s(0)\, C_t(0)} \tag{13}$$
$$k_2 = \frac{C_{s,t}(0,0) - k_t C_t(0)}{C_s(0)} \tag{14}$$
$$k_3 = \frac{C_{s,t}(0,0) - k_s C_s(0)}{C_t(0)} \tag{15}$$
and the constant $k$ is said to be positive-definite if its value fulfills (16).
$$k = \frac{\operatorname{sill}\gamma_{s,t}(h_s, 0) + \operatorname{sill}\gamma_{s,t}(0, h_t) - \operatorname{sill}\gamma_{s,t}(h_s, h_t)}{\operatorname{sill}\gamma_{s,t}(h_s, 0)\; \operatorname{sill}\gamma_{s,t}(0, h_t)} \tag{16}$$
or, in terms of the parameters of the variogram models in Table 2,
$$k = \frac{\theta_{1s} + \theta_{1t} - \theta}{\theta_{1s}\, \theta_{1t}} \tag{17}$$
where $\theta_{1s}$, $\theta_{1t}$, and $\theta$ denote the sills of the spatial, temporal, and global variogram models, respectively.
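To make the role of $k$ concrete, the following sketch computes $k$ from three sills as in (17) and evaluates the generalized product–sum semivariogram in (11) from two marginal semivariogram values. The function names and the numeric sills in the usage note are illustrative assumptions, not values from this study.

```python
def generalized_product_sum_k(sill_s, sill_t, sill_global):
    """Equation (17): k = (sill_s + sill_t - sill_global) / (sill_s * sill_t)."""
    if sill_s <= 0 or sill_t <= 0:
        raise ValueError("marginal sills must be positive")
    return (sill_s + sill_t - sill_global) / (sill_s * sill_t)


def generalized_product_sum_semivariogram(gamma_s, gamma_t, k):
    """Equation (11): gamma_s + gamma_t - k * gamma_s * gamma_t."""
    return gamma_s + gamma_t - k * gamma_s * gamma_t
```

For example, hypothetical sills of 2, 3, and 4 for the spatial, temporal, and global variograms give $k = (2 + 3 - 4)/(2 \cdot 3) = 1/6$.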
The above spatio-temporal product–sum and generalized product–sum models are based on the Euclidean distance. Next, the development of the spatio-temporal models based on the hydrologic distance is discussed using two lemmas and two theorems. Afterward, the developed models are verified as to whether they are positive-definite or not. The models are derived by replacing the Euclidean distance-based spatial semivariogram component of the product–sum in (8) and (9) [33], and of the generalized product–sum in (11) [7], with the spatial semivariogram based on the hydrologic distance model in (2) [17,26].
Lemma 1.
(Montero, 2015) [25] If $C_s(h_r)$ is a stationary spatial covariance function defined on $\mathbb{R}^d$, and $d > 0$, then $d\, C_s(h_r)$ is also a stationary spatial auto-covariance function on $\mathbb{R}^d$.
Lemma 2.
(Magnus and Neudecker, 1999) [34] Let $A$ be a symmetric matrix of order $n \times n$. $A$ is said to be positive-definite if its quadratic form satisfies $x^{T} A x > 0$ for any non-zero vector $x$ in $\mathbb{R}^n$.
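Lemma 2 can be checked numerically for a given covariance matrix. One standard way, sketched below, is to attempt a Cholesky factorization: a symmetric matrix is positive-definite exactly when every pivot of the factorization is strictly positive. The function name and tolerance are illustrative assumptions, and the input is assumed to be symmetric.

```python
import math


def is_positive_definite(matrix, tol=1e-12):
    """Return True if the symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:  # non-positive pivot: not positive-definite
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True
```

Applied to a covariance matrix built from pairwise distances, this gives a quick empirical check for one particular set of locations; it does not replace the analytical proof that a model is valid for all configurations.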
Theorem 1.
Spatio-temporal model using product–sum model based on the hydrologic distance.
Let $Z = \{Z(s,t) : (s,t) \in D \times T\}$ be a second-order stationary spatio-temporal random field, where $D \subseteq \mathbb{R}^d$ and $T \subseteq \mathbb{R}$. Let $\gamma_{s,t}(h_r, h_t)$ be a spatio-temporal model based on the hydrologic distance, $h_r$. Then,
$$C_{s,t}(h_r, h_t) = k_1 C_s(h_r)\, C_t(h_t) + k_2 C_s(h_r) + k_3 C_t(h_t)$$
The spatio-temporal model is said to be valid, i.e., positive-definite, if $k_1 > 0$, $k_2 \ge 0$, and $k_3 \ge 0$.
Theorem 2.
Spatio-temporal model using the general product–sum model based on the hydrologic distance.
Let $Z = \{Z(s,t) : (s,t) \in D \times T\}$ be a second-order stationary spatio-temporal random field, where $D \subseteq \mathbb{R}^d$ and $T \subseteq \mathbb{R}$. Let $\gamma_{s,t}(h_r, h_t)$ be a spatio-temporal semivariogram model based on the hydrologic distance, $h_r$. Then,
$$\gamma_{s,t}(h_r, h_t) = \gamma_{s,t}(h_r, 0) + \gamma_{s,t}(0, h_t) - k\, \gamma_{s,t}(h_r, 0)\, \gamma_{s,t}(0, h_t)$$
is said to be positive-definite if $k$ satisfies the following inequality:
$$0 < k \le \frac{1}{\max\{\operatorname{sill}\gamma_{s,t}(h_r, 0);\ \operatorname{sill}\gamma_{s,t}(0, h_t)\}}$$
Proof of Theorem 1.
Recall the spatial covariance model in (2):
$$C_s(h_r) = \prod_{k} \omega_k\, C_1(h_r), \quad h_r > 0.$$
Let $d(h_r) = \prod_{k} \omega_k$ be a positive real number, $0 < d(h_r) \le 1$. Using Lemma 1,
$$C_s(h_r) = d(h_r)\, C_1(h_r)$$
is also an auto-covariance function based on the hydrologic distance. Applying Lemma 1 and taking the variance gives
$$\operatorname{Var}\big(Z(s_i, t_i),\, Z(s_j, t_j)\big) = \sum_{i,j} a_i\, a_j\, C_{s,t}\big(Z(s_i, t_i),\, Z(s_j, t_j)\big)$$
Using the stationary assumption, (19) can be written as:
$$C_{s,t}(h_r, h_t) = \sum_{r,t} a_r a_t \big(k_1 C_s(h_r)\, C_t(h_t) + k_2 C_s(h_r) + k_3 C_t(h_t)\big)$$
Employing Lemma 2 and (6), and because the constants $a$ and $C(\cdot, \cdot)$ are positive, (20) must be positive-definite if $k_1 > 0$, $k_2 \ge 0$, and $k_3 \ge 0$. □
Proof of Theorem 2.
Taking Lemma 2, (6), and the stationary assumption yields
$$C_{s,t}(h_r, h_t) = \sum_{r,t} a_r a_t \big(C_{s,t}(0,0) - \gamma_{s,t}(h_r, h_t)\big) = \sum_{r,t} a_r a_t \Big(C_{s,t}(0,0) - \big(\gamma_{s,t}(h_r, 0) + \gamma_{s,t}(0, h_t) - k\, \gamma_{s,t}(h_r, 0)\, \gamma_{s,t}(0, h_t)\big)\Big)$$
Since $\gamma_{s,t}(\cdot)$, $C_{s,t}(0,0)$, and the constant $a$ are all positive numbers, and (22) is positive-definite, then
$$\gamma(h_r, 0) + \gamma(0, h_t) - k\, \gamma(h_r, 0)\, \gamma(0, h_t) \ge 0 \tag{23}$$
with the condition $\sum_i a_i = 0$. However, this still depends on the value of the constant $k$. Based on (13)–(15), and $k_1 > 0$, $k_2 \ge 0$, $k_3 \ge 0$, the inequalities (24)–(26) are obtained.
$$C_{s,t}(0,0) < k_s C_s(0) + k_t C_t(0) \tag{24}$$
$$C_{s,t}(0,0) \ge k_t C_t(0) \tag{25}$$
$$C_{s,t}(0,0) \ge k_s C_s(0) \tag{26}$$
and from (25) and (26) we arrive at:
$$C_{s,t}(0,0) \ge \max\{k_s C_s(0),\ k_t C_t(0)\} \tag{27}$$
Employing (24), (27), and (12), the constant $k$ satisfies the inequality (28):
$$0 < k \le \frac{1}{\max\{k_s C_s(0),\ k_t C_t(0)\}} \tag{28}$$
Applying Theorem 1 of De Iaco et al. (2001) [7] to (28) yields
$$0 < k \le \frac{1}{\max\{\operatorname{sill}\gamma_{s,t}(h_r, 0),\ \operatorname{sill}\gamma_{s,t}(0, h_t)\}} \tag{29}$$
and (29) can be written in terms of the parameters of the variogram models in Table 2 as follows:
$$0 < k \le \frac{1}{\max\{\theta_{1r},\ \theta_{1t}\}}$$
where $\theta_{1r}$ and $\theta_{1t}$ are the sill parameters of the spatial and temporal variogram models, respectively. The constant $k$ in (29) also holds for the variogram in (23) that meets (7). □
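The bound derived in the proof translates directly into a validity check. The sketch below tests whether a given $k$ lies in the interval $0 < k \le 1/\max\{\theta_{1r}, \theta_{1t}\}$. The function names and the sills used in the usage note are hypothetical.

```python
def k_upper_bound(sill_r, sill_t):
    """Upper limit 1 / max(sill_r, sill_t) from the validity condition."""
    if sill_r <= 0 or sill_t <= 0:
        raise ValueError("sills must be positive")
    return 1.0 / max(sill_r, sill_t)


def is_valid_k(k, sill_r, sill_t):
    """True when 0 < k <= 1 / max(sill_r, sill_t)."""
    return 0.0 < k <= k_upper_bound(sill_r, sill_t)
```

With hypothetical sills of 2 and 4, the upper limit is 0.25, so `is_valid_k(0.1, 2.0, 4.0)` holds while `is_valid_k(0.3, 2.0, 4.0)` does not.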

2.3. Simulation Method

For simulation purposes, a virtual river network with the hydrologic distance and a weighted stream order was created using the SSN R package [24], as illustrated in Figure 2. The stream network is assumed to have a dendritic pattern with 50 stream branches and 20 observation points $s_1, \ldots, s_{20}$, represented by black points with a stream order weight $\omega$, as shown in Figure 2.
The simulation procedure is as follows:
(1) Generate a set of data from the AR(1) model at times $t_1, \ldots, t_{10}$ using $\varepsilon_t \sim N\!\left(0, \frac{1}{1 - \rho^2}\right)$ for each observation point, with the correlations 0.99, 0.8, 0.5, 0.3, and 0.01.
(2) Compute the experimental semivariograms $\gamma_{s,t}(h_r, 0)$ and $\gamma_{s,t}(0, h_t)$ [3] in 2D and 3D form.
(3) Estimate the sill and range parameters of the semivariogram models, namely exponential, spherical, and linear (see Table 3), and the components of the spatio-temporal model $\gamma_{s,t}(h_r, h_t)$, namely the spatial variogram and time variogram models.
(4) Compute the experimental semivariogram $\gamma_{s,t}(h_r, 0)$ [3].
(5) Estimate a global sill.
(6) Compute the constants $k_1$, $k_2$, $k_3$ using (14), (18), and (19). Finally, go to step (1), with the number of replications being two.
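Step (1) of the procedure can be sketched in a few lines. The version below is not the SSNbayes-based generator used in the study: it draws independent AR(1) series per observation point with standard normal innovations and starts each series from the stationary distribution, whose variance is $1/(1 - \rho^2)$. That reading of the noise specification, and all names, are assumptions.

```python
import math
import random


def simulate_ar1(rho, n_times, rng, innovation_sd=1.0):
    """One AR(1) series z_t = rho * z_{t-1} + e_t, started in stationarity."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    stationary_sd = innovation_sd / math.sqrt(1.0 - rho ** 2)
    series = [rng.gauss(0.0, stationary_sd)]
    for _ in range(n_times - 1):
        series.append(rho * series[-1] + rng.gauss(0.0, innovation_sd))
    return series


def simulate_sites(rho, n_sites=20, n_times=10, seed=0):
    """Independent AR(1) series for each observation point (no spatial link)."""
    rng = random.Random(seed)
    return [simulate_ar1(rho, n_times, rng) for _ in range(n_sites)]
```

Calling `simulate_sites(rho)` for each of the correlations 0.99, 0.8, 0.5, 0.3, and 0.01 gives one 20 by 10 data set per scenario, which can then be passed to the variogram estimation steps.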
Computation of the hydrologic distance and the Euclidean distance applies the glmssn function in the SSN R Packages [24], while the SSNbayes R Package [14] is applied for generating the data. The empirical variogram sample is obtained from the variogram function, while the weighting process and model fitting of the spatio-temporal aspect are obtained from the fit.StVariogram function in the gstat R packages [35]. The Spacetime R Package [36] is used in the data forming of the space-time, and finally, the weighting process makes use of a stream order [31]. Analysis of the relationships between the hydrologic distance, the Euclidean distance, and the spatial configuration was performed on Minitab trials [37], and this analysis is statistically descriptive. The simulation was run on a PC with an Intel Core i7 processor, 2.70 GHz and 8 GB RAM running Windows 10 Operating System.

3. Simulation Results and Analysis

3.1. Statistical Descriptive

On average, the hydrologic distance is longer than the Euclidean distance, since the hydrologic distance is computed along the river segments between two locations, which tend to be non-linear. The t-test result shown in Figure 3 is significant. From the test results shown in Figure 4a,b, it can be concluded that the Euclidean distance can be applied in the range of 0–5 length units, while the hydrologic distance can be applied in the range of 6–10 length units. There is also a relationship between the hydrologic distance and the weighting process, as seen in Figure 5, which is discussed next.
The relationship can be formulated as the exponential regression in Equation (30), because Figure 5a shows that the weights decrease exponentially as the hydrologic distance increases. The coefficient of determination is 72%, which corresponds to a correlation coefficient of 0.85. Figure 5b plots the fitted regression against the hydrologic distance; the correlation between the predicted weights and the empirical weights is 0.83. The regression equation is given in Equation (30).
$$d(h_r) = e^{-(0.1760 + 0.1448\, h_r)}$$
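An exponential regression of this kind can be fitted by ordinary least squares on the log scale, since $\ln d = a + b\, h_r$ is linear in $h_r$. The sketch below shows that approach with hypothetical names; the source does not state which fitting procedure was used in Minitab, so this is only one plausible route.

```python
import math


def fit_exponential(distances, weights):
    """Fit weights = exp(a + b * distance) by least squares on log(weights)."""
    if len(distances) != len(weights) or len(distances) < 2:
        raise ValueError("need at least two paired observations")
    n = len(distances)
    log_w = [math.log(w) for w in weights]  # weights must be positive
    mean_h = sum(distances) / n
    mean_y = sum(log_w) / n
    sxx = sum((h - mean_h) ** 2 for h in distances)
    sxy = sum((h - mean_h) * (y - mean_y) for h, y in zip(distances, log_w))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_h
    return intercept, slope
```

A negative fitted slope corresponds to weights that decay as the hydrologic distance grows.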
The results of this study can be considered as a strong supporter of the results of Maki and Okabe [28] that the Euclidean distance is presumably applied for a smaller size of stream networks, while the hydrologic distance can be used for a larger size of the network. Moreover, this study has also tried to analyze the relationship between the hydrologic distance and weighted values for the stream networks of a larger size. The analysis result shows a strong enough relationship that the weights are smaller exponentially as the hydrologic distance increases. However, the result still needs further experiments because there are still some limitations such as the use of a virtual stream network that has a small scale size, and the generator function has nothing to do with the locations (independent).

3.2. Results Analysis

Table 3 shows the estimated spatial and temporal sill values for each data structure, for the different correlation values and the various variogram models, namely exponential, spherical, and linear. The notations Sill.s, Sill.t, and Sill.g represent the sill values of the space semivariogram, the time semivariogram, and the global sill, respectively, i.e., $\theta_{1s}$, $\theta_{1t}$, and $\theta$ in this study. The sill values are relatively stable, i.e., they do not differ much, for the correlation value of 0.3. The largest root-mean-squared error (RMSE) occurs for the spherical and linear models, which are therefore the worst-fitting, while the best fit is obtained for the correlation value of 0.99. The information in Table 3 is used to analyze the validity of the models presented in Table 4.
Figure 6 are more colorful than the others, indicating that the factors of space and time contribute to the values of the empirical variogram; contrarily. Figure 6 also are dominated by a single color, exhibiting that the weakness of the space and time contributed to the empirical variogram.
All values of $k_1$, $k_2$, $k_3$, and $k$ in Table 4 are positive, indicating that the product–sum model based on the hydrologic distance calculated by Equation (18) is positive-definite for all of the models (exponential, spherical, and linear). The validity criterion in Equation (30) indicates that the exponential and linear models with the correlation value of 0.01 are not valid, and that the spherical and linear models with the correlation value of 0.3 are not valid. Next, the implementations of Equation (18) or (19) are examined. For example, for the exponential model with the correlation value of 0.3, the global sill is 62.69, and applying Equation (30), the value of $k$ should fulfill $0 < k < 0.03$. From Table 4, the value of $k$ for this model is 0.006; therefore, it can be concluded that the exponential model using the hydrologic distance is valid. The other models with different correlation values can be checked in a similar way, and the results are presented in Table 4.
Figure 7a,c,d show that the values of the empirical variogram become larger as the time lag and hydrologic distance become larger, whereas Figure 7b shows that the values of the empirical variogram become lower as the time lag and hydrologic distance become larger. In Figure 7e, for the hydrologic distance equal to 2, the values of the empirical variogram decrease as the time lag increases, but when the hydrologic distance is more than 2, neither the time lag nor the hydrologic distance is correlated with the values of the empirical variogram. The fit between the sample and model spatio-temporal variograms depends on the correlation of the data structure: Figure 7 shows that the higher the correlation, the better the fit between the empirical variogram and the variogram model.

4. Discussion

The results of this study can be considered as a strong supporter of the results of Maki and Okabe [28] that the Euclidean distance is presumably applied for a smaller size of stream networks, whereas the hydrologic distance can be used for a larger size. In addition, this study has tried to analyze the relationship between the hydrologic distance and weighted values for the stream networks of a larger size, and the result shows a strong enough relation, that the weights are smaller exponentially as the hydrologic distance increases. Nevertheless, the result needs further investigation because there are still some limitations such as the use of virtual stream networks still having a small scale size, and the generator function has nothing to do with the locations (independent).
The validity measures of the spatio-temporal model of the product–sum and the generalized product–sum are different. The validity measure by De Iaco et al. [7], who developed the generalized model, is a narrower range than the product–sum model because the generalized product–sum model put aside the constraint developed by the product–sum. Therefore, there are a few models that are not valid in this study. If the generalized product–sum is valid, then it is automatically valid for the product–sum model, but otherwise it is not [7].

5. Conclusions and Future Work

A spatio-temporal model of the generalized product–sum based on the hydrologic distance along with the positive-definite property on the stream networks has been developed. Simulation computation results bring us to the following conclusions:
(1)
The hydrologic distance is significantly larger than the Euclidean distance;
(2)
Applying the Euclidean distance is strongly recommended for smaller scale stream networks, while the hydrologic distance is used for wider scale;
(3)
The weights in the proposed model decrease exponentially as the hydrologic distance increases; therefore, the weighting process depends only on the hydrologic distance without computing the spatial weights.
The experimental results also show a few models that are not valid for the generalized product–sum model because these models put aside the constraints of the product–sum model; thus, the validity measure of the generalized model has a narrower range. In addition, the experimental results show that a data structure of the time series provides a result of the empirical semivariogram and model fitting between the empirical semivariogram and semivariogram model.
There are two main research questions addressed in the modeling of the spatio-temporal: (1) how to ensure that a model is valid and (2) how to fit data to the model. This paper focuses on the development of the mathematical model for a stream network that considers the hydrologic distance and provides the necessary condition for the validity of the model. We plan to implement the model on the real dataset obtained from the observation of pollution spreading in the Citarum river, the longest river on West Java island, and which was named as the dirtiest river in the world in 2018. We intend to conduct experiments on the models using several scenarios. We also plan to incorporate the model with artificial intelligence techniques and compare it with existing models.

Author Contributions

Conceptualization, A.B. and B.N.R.; methodology, A.S.A.; software, A.B. and R.B.; validation, A.B., B.N.R. and R.B.; formal analysis, A.B. and B.N.R.; investigation, A.B. and R.B.; resources, B.N.R.; data curation, A.B. and A.S.A.; writing (original draft preparation), A.B.; writing (review and editing), B.N.R. and R.B.; visualization, A.B., A.S.A. and R.B.; supervision, B.N.R.; project administration, A.B. and B.N.R.; funding acquisition, B.N.R. All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universitas Padjadjaran, under the Doctoral Dissertation Grant (RDDU) Scheme, contract number 2203/UN6.3.1/PT.00/2022 and Academic Leadership Grant with contract number: 1549/UN6.3.1/PT.00/2023.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank the Rector of Universitas Padjadjaran for the research funding. The authors are also grateful to the editor and anonymous reviewers for valuable suggestions that greatly improved the manuscript. The authors also thank the collaborators in the RISE_SMA project 2023.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cressie, N.; Huang, H. Classes of nonseparable spatio-temporal stationary covariance models. J. Am. Stat. Assoc. 1999, 94, 13–30.
  2. Brown, P.E.; Karesen, K.F.; Roberts, G.O.; Tonellato, S. Blur-generated non-separable space-time models. J. R. Stat. Soc. Ser. B-Stat. Methodol. 2000, 62, 847–860.
  3. De Cesare, L.; Myers, D.E.; Posa, D. Estimating and modeling space-time correlation structures. Stat. Probab. Lett. 2001, 51, 9–14.
  4. Dumelle, M.; Ver Hoef, J.; Fuentes, C.; Gitelman, A. A linear mixed model formulation for spatio-temporal random processes with computational advances for the product, sum, and product–sum covariance functions. Spat. Stat. 2021, 43, 100510.
  5. De Cesare, L.; Myers, D.E.; Posa, D. Product-sum covariance for space-time modeling: An environmental application. Environmetrics 2001, 12, 11–23.
  6. De Iaco, S.; Posa, D. Positive and negative non-separability for space-time covariance models. J. Stat. Plan. Inference 2013, 143, 376–391.
  7. De Iaco, S.; Myers, D.; Posa, D. Space-time analysis using a general product-sum model. Stat. Probab. Lett. 2001, 52, 21–28.
  8. De Iaco, S.; Myers, D.E.; Posa, D. Nonseparable space-time covariance models: Some parametric families. Math. Geol. 2002, 34, 23–42.
  9. De Iaco, S.; Myers, D.E.; Posa, D. The linear coregionalization model and product-sum space-time variogram. Math. Geol. 2003, 35, 25–38.
  10. De Iaco, S.; Myers, D.; Posa, D. On strict positive definiteness of product and product-sum covariance models. J. Stat. Plan. Inference 2011, 141, 1132–1140.
  11. O’Donnel, D. Spatial Prediction and Spatio-Temporal Modelling on River Networks. Ph.D. Thesis, University of Glasgow, Glasgow, UK, 2012.
  12. Tang, J.; Zimmerman, D. Space-Time Covariance Models on Networks with An Application on Streams. arXiv 2020, arXiv:2009.14745v1.
  13. Boergens, E.; Dettmering, D.; Schwatke, C.; Seitz, F. Water level, areal extent and volume change of Lake Turkana: Multi-year time series from satellite altimetry and remote sensing. PANGEA 2017.
  14. Fernandez, E.S.; Ver Hoef, J.M.; Peterson, E.E.; McGree, J.; Isaak, D.J.; Mengersen, K. Bayesian spatio-temporal models for stream networks. Comput. Stat. Data Anal. 2022, 170, 107446.
  15. Liu, X.; Fu, X.; Li, Y.; Shen, J.L.; Wang, Y.; Zou, G.H.; Wu, Y.Z.; Ma, Q.M.; Chen, D.; Wang, C.; et al. Spatio-temporal variability in N2O emission from a tea-planted soil in subtropical China. World Congr. Soil Sci. 2016, 251, 161–164.
  16. Okabe, A.; Sugihara, K. Spatial Analysis along Networks; John Wiley & Sons, Ltd.: London, UK, 2012.
  17. Ver Hoef, J.; Peterson, E.; Theobald, D. Spatial statistical models that use flow and stream distance. Environ. Ecol. Stat. 2006, 13, 449–464.
  18. Matheron, G.; Pawlowsky-Glahn, V.; Serra, J. Matheron’s Theory of Regionalised Variables; Oxford Academic: New York, NY, USA, 2019.
  19. Dent, C.L.; Grimm, N.B. Spatial heterogeneity of stream water nutrient concentrations over successional time. Ecology 1999, 80, 2283–2298.
  20. Ganio, L.M.; Torgersen, C.E.; Gresswell, R.E. A Geostatistical Approach for Describing Spatial Pattern in Stream Networks. Front. Ecol. Environ. 2005, 3, 138–144.
  21. Porcu, E.; Gregori, P.; Mateu, J. Nonseparable stationary anisotropic space-time covariance functions. Stoch. Environ. Res. Risk Assess. 2006, 21, 113–122.
  22. Cressie, N.; Frey, B.; Frey, B.; Smith, M. Spatial prediction on river network. J. Agric. Biol. Environ. Stat. 2006, 11, 127–150.
  23. Curriero, F. On the Use of Non-Euclidean Isotropy in Geostatistics; Working Paper 94; The Berkeley Electronic Press: Berkeley, CA, USA, 2005.
  24. Ver Hoef, J.; Peterson, E.; Clifford, D.; Shah, R. SSN: An R Package for spatial statistical modeling on stream networks. J. Stat. Softw. 2014, 56, 1–46.
  25. Montero, J.-M.; Fernandez-Aviles, G.; Mateu, J. Spatial and Spatio-Temporal Geostatistical Modeling and Kriging; John Wiley & Sons, Ltd.: London, UK, 2015.
  26. Ver Hoef, J.; Peterson, E. A moving average approach for spatial statistical models of stream networks. J. Am. Stat. Assoc. 2010, 105, 6–18.
  27. Li, J. Spatial Multivariate Design in the Plane and Stream Networks. Ph.D. Thesis, University of Iowa, Iowa City, IA, USA, 2009.
  28. Maki, N.; Okabe, A. Spatio-temporal analysis of aged members of a fitness club in suburbs. Proc. Geogr. Inf. Syst. Assoc. 2005, 14, 29–34.
  29. Bachrudin, A.; Ruchjana, B.N.; Abdullah, A.S.; Budiarto, R. Bibliometric analysis of spatio-temporal model using a general product-sum based on a hydrologic distance. J. Front. Appl. Math. Stat. 2022, 8, 994287.
  30. Shreve, R.L. Infinite topographically random channel networks. J. Geol. 1967, 75, 178–186.
  31. Christakos, G. On the Problem of Permissible Covariance and Variogram Models. Water Resour. Res. 1984, 20, 251–265.
  32. De Cesare, L.; Myers, D.; Posa, D. Spatial-Temporal Modelling of SO2 in Milan District; Geostatistics Wollongong ’96; Kluwer Academic Publisher: Dordrecht, The Netherlands, 1997; Volume 2, pp. 1031–1042.
  33. De Iaco, S.; Myers, D.E.; Posa, D. Space-time variograms and a functional form for total air pollution measurements. Comput. Stat. Data Anal. 2002, 41, 311–328.
  34. Magnus, J.R.; Neudecker, N. Matrix Differential Calculus with Applications in Statistics and Economics; Wiley: New York, NY, USA, 1999.
  35. Gräler, B.; Pebesma, E.; Heuvelink, G. Spatio-temporal interpolation using gstat. R J. 2016, 8, 204–218.
  36. Pebesma, E. Spacetime: Spatio-Temporal Data in R. J. Stat. Softw. 2012, 51, 1–30. Available online: https://www.jstatsoft.org/v51/i07/ (accessed on 1 February 2020).
  37. Riyan, B.S. Minitab Handbook, 6th ed.; Cengage Learning: Boston, MA, USA, 2012.
Figure 2. Simulated virtual river network model.
Figure 3. The sample variations and the averages of the Hydrologic and Euclidean distances.
Figure 4. The relationships between Hydrologic distance (the vertical axis) and their corresponding Euclidean distances (JE) (the horizontal axis): (a) the ratio of the Hydrologic distance to its corresponding Euclidean distances, (b) the scatter plot of Hydrologic distances and their corresponding Euclidean distances (c17).
Figure 5. The exponential model and the predicted weights: (a) the relationship between the exponential model (c19) and the Hydrologic distances (c18), (b) the relationship between the empirical weights (B2) and the predicted weights (B1).
Figure 6. Empirical variogram over time lag and Hydrologic distance for different values of correlation.
Figure 7. The 3D wireframe of the spatio-temporal product–sum model with the correlations: (a) 0.01, (b) 0.3, (c) 0.5, (d) 0.8, and (e) 0.99.
Table 1. Summary of closely related works.
No. | Authors | Tail-Up | Tail-Down | Product–Sum | Hydrologic Distance | Euclidean Distance
1O’Donnel (2014) [11]xxx
2Boergens et al. (2016) [13]xx
3Tang and Zimmerman (2020) [12]x
4Fernandez et al. (2022) [14]xx
5The present work (2023)
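Table 2. Semivariogram models for the spatio-temporal model.
Semivariogram model | $\gamma_{s,t}(0, h_t)$ (Euclidean distance or time lag) | $\gamma_{s,t}(h_r, 0)$ (hydrologic distance)
Exponential | $\theta_{1t}\left[1 - \exp\left(-\frac{h_t}{\theta_{2t}}\right)\right],\ h_t > 0$ | $\theta_{1r}\left[1 - d(h_r)\exp\left(-\frac{h_r}{\theta_{2r}}\right)\right],\ h_r > 0$
Spherical | $\begin{cases}\theta_{1t}\left[\frac{3}{2}\frac{h_t}{\theta_{2t}} - \frac{1}{2}\left(\frac{h_t}{\theta_{2t}}\right)^{3}\right], & 0 \le h_t \le \theta_{2t} \\ \theta_{1t}, & h_t > \theta_{2t}\end{cases}$ | $\begin{cases}\theta_{1r}\left[1 - d(h_r)\left(1 - \frac{3}{2}\frac{h_r}{\theta_{2r}} + \frac{1}{2}\left(\frac{h_r}{\theta_{2r}}\right)^{3}\right)\right], & 0 \le h_r \le \theta_{2r} \\ \theta_{1r}, & h_r > \theta_{2r}\end{cases}$
Linear | $\theta_{1t}\left(1 - \frac{h_t}{\theta_{2t}}\right),\ 0 \le h_t \le \theta_{2t}$ | $\theta_{1r}\left[1 - d(h_r)\left(1 - \frac{h_r}{\theta_{2r}}\right)\right],\ 0 < h_r < \theta_{2r}$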
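As a hedged illustration of the time-lag column of Table 2, the sketch below evaluates the exponential and spherical semivariograms in their standard textbook forms, with `sill` playing the role of θ1t and `range_` the role of θ2t. The function names and the parameter values in the usage lines are hypothetical; the hydrologic-distance column is not implemented here because it additionally depends on the weight term d(h_r).

```python
import math


def exponential(h: float, sill: float, range_: float) -> float:
    """Exponential semivariogram: sill * (1 - exp(-h / range_)) for h > 0."""
    if h <= 0:
        return 0.0
    return sill * (1.0 - math.exp(-h / range_))


def spherical(h: float, sill: float, range_: float) -> float:
    """Spherical semivariogram: rises to the sill at h = range_, flat beyond."""
    if h <= 0:
        return 0.0
    if h >= range_:
        return sill
    ratio = h / range_
    return sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Hypothetical parameters, chosen only to show the shape of each model.
for lag in (0.0, 2.5, 5.0, 10.0, 20.0):
    print(f"h={lag:5.1f}  exp={exponential(lag, 2.0, 10.0):.4f}  "
          f"sph={spherical(lag, 2.0, 10.0):.4f}")
```

The spherical model reaches its sill exactly at the range, whereas the exponential model only approaches it asymptotically, which is why the two can fit the same empirical semivariogram with different sill estimates.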
Table 3. Spatio-temporal components of Sill parameters.
Correlation | Model | Sill.s | Sill.t | Sill.g | RMSE
0.01 | Exp | 6.21 | 0.0001 | 6.21 | 12.25
0.01 | Sph | 6.44 | 1.19 | 8.83 | 12.21
0.01 | Lin | 6.49 | 0.42 | 8.08 | 12.22
0.3 | Exp | 32.43 | 25.17 | 62.69 | 35.37
0.3 | Sph | 6.62 | 7.03 | 20.44 | 58.69
0.3 | Lin | 6.05 | 5.48 | 18.80 | 58.04
0.5 | Exp | 7.37 | 4.60 | 15.18 | 38.73
0.5 | Sph | 7.90 | 8.56 | 20.01 | 36.60
0.5 | Lin | 6.20 | 12.60 | 21.55 | 32.78
0.8 | Exp | 11.97 | 13.06 | 25.88 | 6.12
0.8 | Sph | 6.73 | 14.87 | 21.61 | 5.31
0.8 | Lin | 6.68 | 9.24 | 17.31 | 11.30
0.99 | Exp | 7.15 | 0.46 | 7.61 | 0.09
0.99 | Sph | 6.78 | 0.39 | 7.16 | 0.10
0.99 | Lin | 6.61 | 0.39 | 7.00 | 0.10
Notes: the original numbers are in long decimal.
Table 4. The validities of the product–sum and the generalized product–sum models based on Hydrologic distance.
Correlation | Model | k1 | k2 | k3 | k | Validity Criteria | Note
0.01 | Exponential | 1.00 | 1.00 | 2.86 | 0.29 | 0.16 | *
0.01 | Spherical | 6.83 | 1.18 | 2 | 0.155 | 0.155 |
0.01 | Linear | 4.64 | 1.17 | 3.77 | 0.42 | 0.15 | *
0.3 | Exponential | 57.49 | 1.15 | 1.2 | 0.006 | 0.03 |
0.3 | Spherical | 12.31 | 2.02 | 1.96 | 0.145 | 0.142 | *
0.3 | Linear | 9.64 | 2.19 | 2.32 | 0.21 | 0.16 | *
0.5 | Exponential | 11.52 | 1.43 | 1.69 | 0.09 | 0.13 |
0.5 | Spherical | 16.24 | 1.44 | 1.41 | 0.05 | 0.11 |
0.5 | Linear | 18.6 | 1.44 | 1.21 | 0.03 | 0.07 |
0.8 | Exponential | 24.86 | 1.07 | 1.06 | 0.00 | 0.07 |
0.8 | Spherical | 21.34 | 1.00 | 1.00 | 0.00 | 0.06 |
0.8 | Linear | 15.5 | 1.20 | 1.15 | 0.02 | 0.1 |
0.99 | Exponential | 5.32 | 1.00 | 1.00 | 0.00 | 0.13 |
0.99 | Spherical | 4.27 | 1.00 | 1.00 | 0.00 | 0.14 |
0.99 | Linear | 4.07 | 1.00 | 1.00 | 0.00 | 0.15 |
Notes: * indicates that the model is not valid; the original numbers are in long decimal.