Abstract
Process-based models are widely used to investigate nutrient dynamics for water management purposes. Simulating nutrient transport and transformation processes from agricultural land into water bodies at the catchment scale is a particularly relevant and challenging task for water authorities. However, few practical methods guide inexperienced modelers in selecting an appropriate model. In particular, data availability is a key aspect of a model selection protocol: a large number of models contain the functionalities to predict nutrient fate and transport, yet only a smaller number are applicable to a specific dataset. In this work, we aim to provide a model selection protocol fit for practical application, with particular emphasis on data availability, cost-benefit analysis and the user’s objectives. For illustrative purposes, we select five process-based models of different complexity as “candidate” models: SWAT (Soil and Water Assessment Tool), SWIM (Soil and Water Integrated Model), GWLF (Generalized Watershed Loading Function), AnnAGNPS (Annualized Agricultural Non-Point Source Pollution model) and HSPF (Hydrological Simulation Program-FORTRAN). The models are described in terms of hydrological and chemical output and input requirements. The model selection protocol considers data availability, model characteristics and the user’s objectives, and it is applied to hypothetical scenarios. Although this selection method is formulated specifically to choose process-based models for nutrient modeling, it can be generalized to other applications characterized by a similar degree of complexity.
1. Introduction
Nutrients, such as nitrogen and phosphorus, are a serious problem threatening water quality [1,2,3]. Excessive nutrient loads cause water quality deterioration, including toxic algal blooms, oxygen deficiency, fish death and eutrophication of the river network and lakes [4,5,6,7]. Arable land is a major source of nitrogen (N) and phosphorus (P) due to intensive agricultural activities like fertilization [8,9]. Considerable amounts of nutrients are discharged from these areas into natural water bodies [10,11,12]. Thus, N and P transport from agricultural land into water bodies is a significant issue for environmental managers and policy makers worldwide.
Unlike point-source pollution, nutrient fluxes from agricultural land are difficult to measure and control because they are heterogeneously distributed, derive from a variety of diffuse sources, and may occur randomly and intermittently [13]. Furthermore, transport processes are complex, since they are controlled by a variety of natural and anthropogenic driving forces, such as hydrology, climate, geology, soil characteristics, and land use [14]. Despite the uncertainty related to model predictions, a process-based modeling approach is necessary to simulate nutrient fate and transport at the catchment scale and to support water managers and decision makers. Process-based models provide a rather complete representation of the environmental system, combining hydrological, soil and nutrient processes. These models are able to calculate long time series of relevant physical quantities (e.g., nutrient fluxes) with variable spatial distributions [9].
Numerous process-based models are available to predict nutrient dynamics in agricultural catchments, such as AnnAGNPS [6,9], GWLF [4,15], HSPF [9,10,16], SWAT [9,12,13], SWIM [2,3,7], MIKE-SHE (Système Hydrologique Européen) [17] and ANSWERS (Areal Non-point Source Watershed Environmental Response Simulation) [9]. However, data availability is generally a matter of concern and a crucial factor in evaluating model results and their uncertainty [9,18,19]. Indeed, in numerous situations, models are difficult or even impossible to implement because only a limited amount of data is available [12]. Therefore, weighing data availability (understood here as the number of variables required as input to the model) against model complexity (understood here as the number of processes represented in the model) can lead to the selection of different models. Inexperienced users therefore require a simple model selection protocol to handle the necessary compromise between data availability and model complexity.
Generally, choosing a model is more or less a “horses for courses” issue [18,20]. Models are normally designed for a specific purpose and have different strengths and weaknesses. It is easy to distinguish between models that satisfy the user’s objective and those that do not. However, it becomes more complex, and even confusing, to select the most suitable model for a specific case study among models that all contain the required functionalities, e.g., a nitrogen module. In fact, the differences between many of these models are often subtle [18], especially when dealing with specialized modeling aspects, such as numerical methods and the simplification of physical and chemical processes. These differences are often difficult for inexperienced users to grasp. Accordingly, many studies have tried to offer references to support model selection. Most of them focus on model intercomparison, either by applying several models in a specific study area [4,16,19] or by relying on uncertainty analysis [21]. Some model selection protocols have been described in previous studies [22,23,24], but they mainly based their selection criteria on the model codes, or on some general and simple guidelines, without offering a detailed strategy to approach the problem. In particular, Saloranta et al. [22] and Boorman et al. [23] aimed at guiding water managers through the selection of a model suitable for applications complying with the European Water Framework Directive (WFD). These methods are mainly driven by the necessity for a modeler to communicate with a wide non-specialist public, and by the need to guide the decisions of inexperienced users with a simple multi-criteria analysis. Such simplified schemes are particularly relevant for policy makers and water authorities called to comply with regulations such as the WFD. The task of models in this case is to support the formulation of management plans and the mitigation of the threat that nutrients pose to water quality.
The concept of “cost and benefit” assessment is also a valuable criterion in water management [25,26], which can be utilized to select a model fitting the user’s application.
In this work, a novel model selection protocol, specifically tailored to choosing models that predict nutrient dynamics, was formulated and explained by considering five widely used process-based models (SWAT, SWIM, GWLF, AnnAGNPS and HSPF). These five models were selected for illustrative purposes among all available open source continuous models that describe nutrient transport, so as to cover a range of variation in (1) the degree of simplification in representing physical, chemical, and bio-chemical processes; and (2) the input data requirements for implementing the models. We do not aim to compare the models’ performance per se or to evaluate the quality of their results. The selection protocol is based on a multi-criteria analysis. We first list, for all models, the hydrological processes, input data requirements and outputs related to nutrient dynamics, to show what the models provide as output in terms of nutrient prediction and what data are needed for the calculations. Then, we suggest performing a “cost and benefit” analysis to evaluate a model, considering the cost of performing additional field campaigns to collect the missing data required by a specific model in combination with the potential benefit that the model may bring in terms of future applications. We exemplify the procedure considering different user objectives. The protocol is used as an integrated assessment tool that takes into consideration the potential objectives of the user, model characteristics, data availability and a “cost and benefit” analysis.
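The screening idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the protocol itself: the `Candidate` class, the `screen` function, the flat cost per missing input and the two toy candidates are all hypothetical and do not reproduce the input lists of any of the five models.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate model: the inputs it needs and the outputs it offers."""
    name: str
    required_inputs: frozenset
    outputs: frozenset


def screen(candidates, available_inputs, wanted_outputs,
           cost_per_missing_input, budget):
    """Return (name, missing inputs, collection cost) for feasible candidates.

    A candidate is feasible when it offers every wanted output and the
    (assumed flat) cost of collecting its missing inputs fits the budget.
    Feasible candidates are sorted by increasing collection cost.
    """
    feasible = []
    for c in candidates:
        if not wanted_outputs <= c.outputs:
            continue  # the model lacks a required functionality
        missing = c.required_inputs - available_inputs
        cost = cost_per_missing_input * len(missing)
        if cost <= budget:
            feasible.append((c.name, sorted(missing), cost))
    return sorted(feasible, key=lambda item: item[2])


# Hypothetical toy candidates (placeholders, not real model input lists).
simple = Candidate("simple",
                   frozenset({"rainfall", "land use"}),
                   frozenset({"N load"}))
detailed = Candidate("detailed",
                     frozenset({"rainfall", "land use", "soil texture",
                                "water table height"}),
                     frozenset({"N load", "N transformation"}))

result = screen([simple, detailed],
                available_inputs={"rainfall", "land use", "soil texture"},
                wanted_outputs={"N load"},
                cost_per_missing_input=10.0,
                budget=15.0)
print(result)
```

In this toy case both candidates pass, and the detailed one would require collecting one additional input. A real application would replace the flat cost with campaign-specific estimates and weigh it against the expected benefit of the richer model.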
2. Model Description
2.1. SWAT
The SWAT model [27] is a complex semi-distributed process-based model. It was developed by the Agricultural Research Service of the United States Department of Agriculture [28,29] and can model changes in hydrological processes, vegetation, erosion, and nutrient loadings at the catchment scale. It divides the catchment into subcatchments and subsequently into Hydrologic Response Units (HRUs). The HRUs represent the different combinations of land use, soil type and slope in each subcatchment. The processes related to water, sediment and nutrient transport are modeled at the HRU scale. The hydrological processes are distributed over five compartments: the stream, the soil surface, the soil layers, the shallow unconfined aquifer, and the deep confined aquifer. The soil profile can be divided into up to ten layers. Surface runoff can be calculated either by the Soil Conservation Service-Curve Number (SCS-CN) method or by the Green & Ampt method. Erosion is estimated with the Modified Universal Soil Loss Equation (MUSLE). At a daily time step, SWAT simulates nutrient transport and transformation in soil profiles, the river network and various water bodies (e.g., ponds, lakes, and wetlands), as well as the interactions between these systems. It can also differentiate between nutrient fluxes from different sources, e.g., urban areas. More information about the model is provided in [12,21,27].
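To make the SCS-CN method mentioned above more concrete, the following sketch implements the standard textbook form of the curve number equation in metric units. It is an illustration only: the function name and the default initial abstraction ratio of 0.2 are assumptions of this sketch, and the adjustments that individual models apply (for example, varying the curve number with soil moisture) are not shown.

```python
def scs_cn_runoff(precip_mm: float, curve_number: float,
                  ia_ratio: float = 0.2) -> float:
    """Daily surface runoff depth (mm) from the SCS curve number equation.

    S  = 25400 / CN - 254          potential maximum retention (mm)
    Ia = ia_ratio * S              initial abstraction (mm)
    Q  = (P - Ia)^2 / (P - Ia + S) if P > Ia, otherwise 0
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in the interval (0, 100]")
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if precip_mm <= initial_abstraction:
        return 0.0  # all rainfall is abstracted before runoff starts
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Example: 50 mm of rain on a surface with CN = 80.
print(round(scs_cn_runoff(50.0, 80.0), 2))
```

A curve number of 100 corresponds to zero retention, so all precipitation becomes runoff; lower curve numbers shift more water towards infiltration and storage.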
2.2. SWIM
The SWIM model [30] is an integrated, semi-distributed model based on SWAT and MATSALU [31]. It simulates hydrological processes, vegetation, erosion and nutrient cycles at the catchment scale. The catchment is divided into subcatchments. A subcatchment is composed of hydrotopes, which are sets of elementary units with homogeneous soil and land use types. In each hydrotope, nutrient transport and transformation are simulated to model processes from the hydrological system to the river network. The time step, the hydrological and soil components, and the methods used to calculate water flow and erosion in SWIM are identical to those of SWAT. Unlike SWAT, SWIM does not consider water bodies like lakes or ponds. More detailed information is provided in [3,7,30].
2.3. GWLF
GWLF [32] is a combined distributed/lumped parameter, continuous process-based model, which is able to simulate runoff, erosion, and nutrient loads from various source areas. Each source area is considered uniform with respect to soil and land cover. Surface runoff is calculated with the SCS-CN method. Erosion is simulated with the Universal Soil Loss Equation (USLE). Nutrient loads are calculated at the monthly scale from monthly water balance values, which are aggregated from daily water balance values. The total nutrient load within the catchment is calculated as the sum of the nutrient loads from each source area. Note that the spatial location of the source areas is not considered in GWLF. Within each area, multiple land uses can be defined, while other parameters (e.g., water table height) are assumed to be uniform. Urban sourced nutrients are assumed to occur only in the solid phase. Nutrient transformation is not considered in this model. Additional details about the model are provided in [4,15,32].
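The aggregation logic described above (daily water balance values summed to months, loads summed over source areas without regard to their location) can be sketched as follows. This is a simplified illustration for dissolved loads only, not the GWLF code: the function names, the dictionary layout and the constant concentration per source area are assumptions of this sketch. The factor 0.01 converts mg/L × mm × ha to kg, since 1 mm of runoff over 1 ha equals 10,000 L.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_runoff_mm):
    """Sum daily runoff depths (mm) into (year, month) totals."""
    totals = defaultdict(float)
    for day, runoff_mm in daily_runoff_mm.items():
        totals[(day.year, day.month)] += runoff_mm
    return dict(totals)


def catchment_monthly_load(source_areas):
    """Monthly dissolved load (kg) summed over all source areas.

    Each source area is a dict with hypothetical keys:
    'area_ha', 'conc_mg_per_l' and 'daily_runoff_mm' (date -> mm).
    The spatial position of the areas plays no role in the sum.
    """
    loads = defaultdict(float)
    for area in source_areas:
        for month, runoff_mm in monthly_totals(area["daily_runoff_mm"]).items():
            loads[month] += 0.01 * area["conc_mg_per_l"] * runoff_mm * area["area_ha"]
    return dict(loads)


# Made-up example values for two source areas.
cropland = {"area_ha": 100.0, "conc_mg_per_l": 2.0,
            "daily_runoff_mm": {date(2020, 1, 1): 5.0,
                                date(2020, 1, 2): 3.0,
                                date(2020, 2, 1): 10.0}}
forest = {"area_ha": 50.0, "conc_mg_per_l": 0.5,
          "daily_runoff_mm": {date(2020, 1, 1): 2.0}}

loads = catchment_monthly_load([cropland, forest])
print(loads)
```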
2.4. AnnAGNPS
AnnAGNPS [33] is a continuous version developed from the event-based Agricultural Nonpoint Source model (AGNPS) [34]. AnnAGNPS is a distributed, continuous-simulation, watershed-scale nonpoint source (NPS) pollution model developed especially for agricultural catchments. The catchment is represented in the model by homogeneous square cells. Nutrient fate and the hydrological cycle, e.g., runoff and erosion, are simulated at a daily time step, with the irrigation system included as a main hydrological component. AnnAGNPS uses the SCS-CN approach to model runoff and the Revised Universal Soil Loss Equation (RUSLE) to compute erosion. A groundwater module is not available. A detailed model description can be found in previous publications, such as [6,33].
2.5. HSPF
HSPF [35] is a continuous, semi-distributed, watershed-scale model. It was developed to simulate the hydrological system and the associated nutrient states of pervious and impervious land, streams and reservoirs. The model disaggregates the catchment into land segments of uniform characteristics on the basis of land use. It models the changes in water, sediment, and nutrient amounts with a series of vertical storages. The soil profile is divided into a surface layer, an upper layer and a lower layer. Surface runoff is simulated with the Chezy-Manning equation and an empirical expression that links outflow depth with detention storage. Erosion is computed according to Negev’s equations. Nutrients are predicted at a sub-daily time step. The model has been extensively described and applied in [16,35,36].
3. Model Components Analysis
Physical and chemical processes can hardly be decoupled in process-based models, which makes hydrogeochemical modeling at the catchment scale a challenging task. Five natural compartments are involved: atmosphere, soil, surface water bodies, groundwater and vegetation. Different models approach the simplification of such a complex system in different ways. In this section, transport and nutrient transformation processes are described.
3.1. Surface and Subsurface Hydrological Components
Modeling surface and subsurface hydrological processes is the core component of process-based transport models. Figure 1 shows thirteen common processes that can be considered in process-based models: precipitation, interception, infiltration, percolation, seepage flow, revap flow (capillary rise), evapotranspiration, surface runoff, interflow, groundwater flow, drainage flow, irrigation, and pumping. Among these processes, precipitation and irrigation are responsible for water input. The former is essential in all process-based models; the latter exists only in models that consider irrigation systems. Liquid and solid precipitation (i.e., rain and snow) is the atmospheric water source for the soil layer. Before reaching the soil surface layer, part of it is intercepted by vegetation. This effect is defined as canopy interception or canopy storage in most models.
Water in the soil surface layer is subdivided into three groups: one forms surface runoff, which causes erosion and finally contributes to the stream flow; one moves deeper into the unsaturated soil zone by infiltration; and the last returns to the atmosphere by evapotranspiration. Besides the soil compartment, canopy storage and surface water bodies also contribute to evapotranspiration. In the unsaturated zone, water may contribute to stream flow by interflow, or by drainage flow in agricultural land. Soil and groundwater are connected by water moving downwards as percolation flow and upwards as revap flow (capillary rise). In groundwater, seepage carries water from the shallow saturated zone to a deeper saturated zone. Groundwater is connected with the other parts of the system by two processes: it may contribute to stream flow as return flow, and it may be removed by pumping. The different hydrological processes considered by the five models are listed in Table 1.
Figure 1.
Hydrology/hydrogeology processes considered by common process-based models.

Table 1.
Hydrological processes considered in the five models.
| SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
|---|---|---|---|---|
| Surface runoff | Surface runoff | Surface runoff | Surface runoff | Surface runoff |
| Infiltration | Infiltration | Infiltration | Infiltration | Infiltration |
| Evapotranspiration | Evapotranspiration | Evapotranspiration | Evapotranspiration | Evapotranspiration |
| Interflow | Interflow | Percolation flow | Interflow | Interflow |
| Percolation flow | Percolation flow | Base flow | Percolation flow | Percolation flow |
| Base flow | Base flow | Seepage flow | Drainage flow | Base flow |
| Revap flow | Revap flow | - | - | Interception |
| Pumping flow | Seepage flow | - | - | - |
| Interception | - | - | - | - |
| Drainage flow | - | - | - | - |
| Seepage flow | - | - | - | - |
3.2. Reactions
Nutrient transformation is based on a series of chemical, biochemical and physical reactions. Because N and P differ in their characteristics, their forms are affected by different reaction pathways. For N, the major reactions simulated in models are nitrogen fixation, decomposition, immobilization/mineralization, nitrification, denitrification, volatilization, ammonia ionization, ammonium adsorption/desorption, settling/sinking, and plant uptake. They are represented by various mathematical equations that couple the different N forms. For P, the key reactions involved in models are decomposition, mineralization/immobilization, sorption, adsorption/desorption, settling/sinking, and plant uptake. In this case too, the reactive term couples the transport equations of the different P species.
3.3. Input Requirements
3.3.1. Basic Input Data
In process-based models, nutrient transport and transformation are calculated with numerous equations representing the processes and reactions mentioned in Sections 3.1 and 3.2. Such equations depend on numerous parameters, which are required as input data. In this work, input data refer mainly to data that concern nutrient dynamics and can be obtained by measurements. Among them, those related to climate, soil, land use, vegetation, topography, hydrology, and hydrogeology are defined as basic input data. Most of them are used to model the hydrological cycle represented in the model, although they are also used to compute nutrient chemical reactions.
In Table 2, we list the basic input requirements of the five models under consideration and classify them into six categories: climate, soil, hydrology/hydrogeology, land use and vegetation, topography, and separated system. The number of inputs in each category is shown in Figure 2. As can be observed in Table 2, significant differences are present among the five models. The number of basic inputs required is 52 for SWAT, 40 for SWIM, 16 for GWLF, 35 for AnnAGNPS and 27 for HSPF. Of these, 39 for SWAT, 34 for SWIM, 15 for GWLF, 27 for AnnAGNPS, and 24 for HSPF are used for the description of hydrological processes. SWAT requires the largest amount of input data, whereas GWLF needs the least in all six categories (see Figure 2).
Among the six categories, climate presents the smallest differences between the models (Table 2 and Figure 2). There are ten climate inputs in total, three of which are common to all models: rainfall, snow, and air temperature. GWLF requires only two more inputs (i.e., evapotranspiration and daylight hours), while all other models require the same climate input data. Some special inputs are characteristic of a single model: carbon dioxide concentration is only required by SWAT, daylight hours are only needed by GWLF, and vapor pressure is only required by HSPF.
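As a small worked calculation, the counts quoted above can be turned into the share of basic inputs that each model uses to describe hydrological processes. The variable names below are illustrative; the numbers are the totals and hydrology-related counts stated in the text.

```python
# Basic input counts quoted in the text (total, and used for hydrology).
total_inputs = {"SWAT": 52, "SWIM": 40, "GWLF": 16, "AnnAGNPS": 35, "HSPF": 27}
hydro_inputs = {"SWAT": 39, "SWIM": 34, "GWLF": 15, "AnnAGNPS": 27, "HSPF": 24}

# Share of each model's basic inputs that serve the hydrological description.
share = {model: hydro_inputs[model] / total_inputs[model] for model in total_inputs}

for model, value in sorted(share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {hydro_inputs[model]}/{total_inputs[model]} = {value:.0%}")
```

In every model at least three quarters of the basic inputs serve the hydrological description, which is consistent with the statement that most basic inputs are used to model the hydrological cycle.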
As shown in Table 2 and Figure 2, soil is characterized by 14 input parameters. GWLF and HSPF show the lowest data requirement. SWAT, SWIM and AnnAGNPS require similar amounts and types of input data, and among them AnnAGNPS needs the most. Special soil inputs, which are only required by a single model, are temperature, pH, CaCO3 content, and organic matter.
Table 2.
Basic input requirements of the five models (“(√)” means this input can be either supplied by measurements, or calculated by the model).
| Climate | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Rainfall | √ | √ | √ | √ | √ |
| Snow | √ | √ | √ | √ | √ |
| Air temperature | √ | √ | √ | √ | √ |
| Solar radiation | √ | √ | √ | √ | |
| Humidity/dew point | √ | (√) | √ | √ | |
| Wind speed | √ | (√) | √ | √ | |
| Carbon dioxide concentration | √ | ||||
| Evapotranspiration | √ | √ | √ | √ | √ |
| Daily daylight hours | √ | ||||
| Vapor pressure | √ | ||||
| Soil | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Depth/thickness | √ | √ | √ | √ | |
| Texture | √ | √ | √ | √ | |
| Temperature | √ | ||||
| Bulk density | √ | √ | √ | √ | |
| Initial soil water content/moisture | √ | √ | √ | √ | √ |
| Field capacity | (√) | √ | √ | ||
| Wilting point | (√) | √ | √ | ||
| Hydraulic conductivity | √ | √ | √ | ||
| Porosity | √ | √ | √ | ||
| Available water capacity | √ | √ | |||
| Organic matter | √ | ||||
| pH | √ | ||||
| Organic carbon content % | √ | √ | √ | ||
| CaCO3 content % | √ | ||||
| Hydrology/Hydrogeology | |||||
| Water table height | √ | √ | √ | ||
| Hydraulic conductivity | √ | √ | √ | ||
| Specific yield of shallow aquifer | √ | √ | |||
| Groundwater extraction | √ | ||||
| Snow water content/snow melt | √ | √ | √ | √ | |
| Initial shallow aquifer storage | √ | √ | |||
| Revap storage | √ | √ | |||
| Recharge water | √ | √ | |||
| Drain spacing | √ | √ | |||
| Irrigation | √ | √ | |||
| Drainable volume of water stored in the saturated zone | √ | ||||
| Saturated depth from the impervious layer | √ | ||||
| Surface water storage | √ | ||||
| Active groundwater storage | √ | ||||
| Interflow storage | √ | ||||
| Lower zone storage | √ | ||||
| pH (water) | √ | ||||
| Land Use and Vegetation | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Land use | √ | √ | √ | √ | √ |
| Land cover | √ | √ | √ | √ | √ |
| Vegetation type | √ | √ | √ | √ | √ |
| Vegetation height | √ | ||||
| Leaf area index | √ | √ | |||
| Plant canopy height | √ | ||||
| Residue | √ | √ | √ | ||
| Total biomass | √ | √ | |||
| Base temperature for plant growth | √ | √ | |||
| Root depth | √ | √ | |||
| Fertilizing rate/amount | √ | √ | √ | ||
| Crop management/tillage operation | √ | √ | |||
| Topography | |||||
| Area | √ | √ | √ | √ | √ |
| Elevation | √ | √ | √ | √ | |
| Hillslope length | √ | √ | |||
| Hillslope steepness | √ | √ | |||
| Hillslope width | √ | ||||
| Land surface slope length | √ | √ | √ | √ | √ |
| Land surface slope steepness | √ | √ | √ | √ | √ |
| Separated System | |||||
| Channel system/in-stream system | √ | √ | √ | √ | |
| Tile drainage system | √ | √ | |||
| Pond system | √ | ||||
| Wetland system | √ | ||||
| Reservoir system | √ | √ | |||
| Pothole system | √ |

Figure 2.
Amounts of inputs for each category: (a) climate; (b) soil; (c) hydrology/hydrogeology; (d) land use and vegetation; (e) topography; and (f) separated system.


The hydrology/hydrogeology category shows the most significant differences between the five models. As shown in Figure 2, SWAT is the most data-demanding model (ten inputs), in strong contrast to GWLF, which requires only one input: snow water content/snow melt. Most inputs for SWIM are the same as for SWAT, but groundwater extraction and irrigation are not involved in SWIM. HSPF requires six inputs, five of which are unique to this model: surface water storage, active groundwater storage, interflow storage, lower zone storage, and the pH value of water. AnnAGNPS requires four inputs, three of which are among the inputs of SWAT, while the remaining one is unique: saturated depth from the impervious layer.
Concerning land use and vegetation data (Table 2 and Figure 2), SWAT and SWIM require many of the same inputs, with SWAT requiring the most. AnnAGNPS requires six inputs: land use, land cover, vegetation type, residue, fertilizing rate/amount, and crop management/tillage operation. GWLF and HSPF each use only three inputs: land use, land cover and vegetation type.
Topography comprises seven inputs. Among the five models (Table 2), SWIM requires all seven (Figure 2), making it the most data-demanding model in this category. SWAT requires the same inputs as SWIM except hillslope width. AnnAGNPS and HSPF each need four inputs. GWLF requires the fewest, with three inputs. Furthermore, three inputs are shared by all five models: area, land surface slope length and land surface slope steepness.
Separated system is the last category of basic inputs. In total, six separated systems are considered by these five models in relation to nutrient transport or transformation. SWAT can model all six separated systems; in contrast, GWLF does not consider any of them. AnnAGNPS can simulate two of them: the in-stream system and the tile drainage system. HSPF is also capable of modeling two: the in-stream system and the reservoir system. SWIM only considers one: the in-stream system. The inputs of a separated system are required only when such a system (e.g., a wetland) exists in the study catchment and is involved in its nutrient cycles.
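The coverage of separated systems described above lends itself to a simple lookup. The sketch below encodes the model capabilities as stated in the text and returns the models that can represent every separated system present in a study catchment; the dictionary and function names are illustrative.

```python
# Separated systems each model can represent, as stated in the text.
SEPARATED_SYSTEMS = {
    "SWAT": {"in-stream", "tile drainage", "pond", "wetland", "reservoir", "pothole"},
    "SWIM": {"in-stream"},
    "GWLF": set(),
    "AnnAGNPS": {"in-stream", "tile drainage"},
    "HSPF": {"in-stream", "reservoir"},
}


def models_covering(systems_in_catchment):
    """Return the models that represent all separated systems in the catchment."""
    needed = set(systems_in_catchment)
    return sorted(model for model, systems in SEPARATED_SYSTEMS.items()
                  if needed <= systems)


# Example: a catchment with a river network and a reservoir.
print(models_covering({"in-stream", "reservoir"}))
```

If no separated system is relevant for the nutrient cycles of the catchment, the set of required systems is empty and none of the five models is excluded by this criterion.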
3.3.2. Nutrient Input Data
For the simulation of nutrient fate, additional input data are required to characterize the system. Nutrient input data are divided into two groups: nitrogen inputs and phosphorus inputs. The inputs of each group are further classified in terms of nutrient forms and nutrient sources. The major forms of N inputs described in models are organic N, fresh organic N, active organic N, dissolved N, NO₃⁻, NO₂⁻, NH₄⁺, NH₃, and total N. The sources of N inputs are mainly soil, runoff, sediment, groundwater, surface water bodies, plant uptake, urban sources, point sources, fertilizer, septic systems, and the atmosphere. The common P forms involved in model inputs are organic P, fresh organic P, active organic P, inorganic P, labile inorganic P, active inorganic P, stable inorganic P, soluble P, dissolved P, PO₄³⁻, and total P. The sources of P inputs are almost the same as those of N inputs, except that the models currently do not consider an atmospheric P source.
Table 3 shows the nitrogen input requirements of the five models. N inputs of various forms from 15 sources are listed. Among them, soil nitrogen inputs are common requirements for all models except GWLF. Soil nitrogen inputs are used to simulate nitrogen transport and transformation from the soil to the reach. The other input parameters are utilized either to predict nitrogen transport from specific sources, e.g., urban sources, or to model the nitrogen status of specific systems, e.g., a reservoir system. Some input parameters, including the normal fraction of N in plant biomass and fertilizer N, are crop-specific. Therefore, they should be provided for each crop category, such as corn, wheat and potato. Overall, the input-data demand of SWAT is high, since it considers 33 N inputs from 11 sources. Compared with SWAT, the other four models are much simpler. HSPF requires ten N inputs from four sources, GWLF requires eight N inputs from seven sources, SWIM requires seven N inputs from four sources, and AnnAGNPS only needs four N inputs from three sources. The numbers of N input sources of the models are presented in Figure 3.
The phosphorus input requirements of the five models (Table 4 and Figure 3) are similar to those for nitrogen. Various P inputs are presented for 14 specific sources. Soil P inputs are common requirements for all models but GWLF. The main functions of the P inputs from different sources are the same as those of the N inputs. Among the five models, SWAT needs the most P inputs, with 18 inputs for ten sources; the other four models require less input data: seven P inputs for three sources in SWIM, eight for seven sources in GWLF, six for two sources in AnnAGNPS, and five for four sources in HSPF.
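The N and P input counts quoted above can be combined into a single nutrient data demand per model. The snippet below is a worked calculation on the numbers stated in the text; the variable names and the choice to simply add N and P inputs are assumptions of this sketch.

```python
# (number of inputs, number of sources) per model, as quoted in the text.
N_INPUTS = {"SWAT": (33, 11), "SWIM": (7, 4), "GWLF": (8, 7),
            "AnnAGNPS": (4, 3), "HSPF": (10, 4)}
P_INPUTS = {"SWAT": (18, 10), "SWIM": (7, 3), "GWLF": (8, 7),
            "AnnAGNPS": (6, 2), "HSPF": (5, 4)}

# Total nutrient inputs (N + P) per model.
total_nutrient_inputs = {model: N_INPUTS[model][0] + P_INPUTS[model][0]
                         for model in N_INPUTS}

# Models ordered from least to most nutrient input data required.
ranking = sorted(total_nutrient_inputs, key=total_nutrient_inputs.get)

for model in ranking:
    print(f"{model}: {total_nutrient_inputs[model]} nutrient inputs")
```

Note that GWLF, which has the smallest basic input demand, does not rank lowest here, because it asks for nutrient inputs from a comparatively large number of sources.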
Table 3.
Nitrogen input (“(√)” means this input can be either supplied by measurements, or calculated by the model using equations).
| Initial Soil Nitrogen | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Organic N | (√) | √ | √ | √ | |
| NO3 | (√) | √ | √ | √ | |
| NH4 | √ | √ | |||
| Fresh organic N | √ | ||||
| Normal fraction of N in plant biomass (crop-specific) | √ | √ | |||
| Nitrogen in runoff | |||||
| Dissolved N | √ | ||||
| Nitrogen in groundwater | |||||
| Dissolved N | √ | ||||
| Nitrogen in sediment | |||||
| Total N | √ | ||||
| Plant uptake nitrogen | |||||
| Total N | √ | ||||
| Urban sources | |||||
| Total N | √ | √ | √ | ||
| NO3 | √ | ||||
| Point sources | |||||
| Organic N | √ | ||||
| NO3 | √ | ||||
| NO2 | √ | ||||
| NH4 | √ | ||||
| Dissolved N | √ | ||||
| Fertilizer nitrogen (crop-specific) | |||||
| Organic N | √ | √ | |||
| Active organic N | √ | ||||
| Inorganic N | √ | √ | |||
| NH4 | √ | ||||
| Septic system | |||||
| Dissolved N in outflow | √ | ||||
| Dissolved N from ponded system | √ | ||||
| Total N | √ | ||||
| NO3 | √ | ||||
| NO2 | √ | ||||
| Organic N | √ | ||||
| NH4 | √ | ||||
| Initial nitrogen in pond | |||||
| Organic N | √ | ||||
| NO3 | √ | ||||
| Initial nitrogen in wetland | |||||
| Organic N | √ | ||||
| NO3 | √ | ||||
| Initial nitrogen in reservoir | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Organic N | √ | ||||
| NO3 | √ | √ | |||
| NO2 | √ | √ | |||
| NH4 | √ | ||||
| NH4 + NH3 | √ | ||||
| In-stream nitrogen | |||||
| Organic N | √ | ||||
| NO3 | √ | √ | |||
| NO2 | √ | √ | |||
| NH4 | √ | ||||
| NH4 + NH3 | √ | ||||
| Atmospheric deposition | |||||
| NO3 in rain | √ | √ | |||
| NH4 in rain | √ | ||||
| NO3 in dry deposition | √ | ||||
| NH4 in dry deposition | √ |
Figure 3.
Numbers of nutrient inputs sources of the models: (a) N input sources; (b) P input sources.

Comparing Table 3 and Table 4, we notice that, among the five models considered, only GWLF demands nutrient inputs from runoff, groundwater, sediment, and plant uptake, and that only SWAT needs nutrient inputs for ponds and wetlands. From the chemical perspective, the nutrient forms are not independent. In fact, some of them are interconnected: for example, organic N is composed of fresh organic N and active organic N; dissolved N can contain NO₃⁻, NO₂⁻ and NH₄⁺; organic P consists of fresh organic P and active organic P; and labile inorganic P, active inorganic P and stable inorganic P together constitute inorganic P. That the models choose different but related nutrient forms for the same source can be ascribed to the distinct emphases they had when they were developed: each model requires the specific nutrient forms needed for the processes it focuses on. Even when modeling a similar, or the same, process or reaction, the mathematical equations of different models can differ.
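Because some nutrient forms are sums of others, measurements of the finer forms can sometimes supply the aggregated form that a model asks for. The sketch below encodes only the three compositions stated explicitly above; the function name and the example values are hypothetical. Dissolved N is deliberately left out, since the text only says that it can contain NO₃⁻, NO₂⁻ and NH₄⁺.

```python
# Aggregated forms and their components, as stated in the text.
COMPOSITION = {
    "organic N": ("fresh organic N", "active organic N"),
    "organic P": ("fresh organic P", "active organic P"),
    "inorganic P": ("labile inorganic P", "active inorganic P",
                    "stable inorganic P"),
}


def aggregate(form, measured):
    """Return the value of a nutrient form, summing components if needed.

    `measured` maps form names to values in a common unit (e.g., kg/ha).
    Raises KeyError when the form is neither measured nor derivable.
    """
    if form in measured:
        return measured[form]
    parts = COMPOSITION.get(form)
    if parts is None or any(part not in measured for part in parts):
        raise KeyError(f"cannot derive '{form}' from the measured forms")
    return sum(measured[part] for part in parts)


# Made-up example values.
measured = {"labile inorganic P": 1.0, "active inorganic P": 2.0,
            "stable inorganic P": 3.0}
print(aggregate("inorganic P", measured))
```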
Table 4.
Phosphorus input (“(√)” means this input can be either supplied by measurements, or calculated by the model using equations).
| Initial Soil Phosphorus | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Organic P | (√) | √ | √ | √ | |
| PO4 | √ | ||||
| Fresh organic P | √ | ||||
| Soluble P | (√) | ||||
| Labile inorganic P | √ | √ | |||
| Active inorganic P | √ | √ | |||
| Stable inorganic P | √ | √ | |||
| Normal fraction of P in plant biomass (crop-specific) | √ | √ | |||
| Phosphorus in runoff | |||||
| Dissolved P | √ | ||||
| Phosphorus in groundwater | |||||
| Dissolved P | √ | ||||
| Phosphorus in sediment | |||||
| Total P | √ | ||||
| Plant uptake Phosphorus | |||||
| Total P | √ | ||||
| Urban Sources | |||||
| Total P | √ | √ | √ | ||
| Point sources | |||||
| Organic P | √ | ||||
| Soluble P | √ | ||||
| Dissolved P | √ | ||||
| Fertilizer phosphorus (crop-specific) | |||||
| Organic P | √ | √ | |||
| Active organic P | √ | ||||
| Inorganic P | √ | √ | |||
| Septic system | |||||
| Dissolved P in outflow | √ | ||||
| Dissolved P from ponded system | √ | ||||
| Total P | √ | ||||
| PO4 | √ | ||||
| Organic P | √ | ||||
| Initial phosphorus in pond | |||||
| Organic P | √ | ||||
| Soluble P | √ | ||||
| Initial phosphorus in wetland | |||||
| Organic P | √ | ||||
| Soluble P | √ | ||||
| Initial phosphorus in reservoir | |||||
| Organic P | √ | ||||
| Soluble P | √ | ||||
| PO4 | √ | ||||
| In-stream phosphorus | |||||
| Organic P | √ | ||||
| PO4 | √ |
3.4. Nutrient Output
Given basic and nutrient input data, process-based models predict nutrient fate in a catchment system in the form of output data. Because of their distinct emphases, different models generally produce different nutrient outputs. Nutrient outputs can be roughly divided into two groups: nutrient transport outputs, which describe nutrient fate associated with transport processes in the hydrological system, and nutrient transformation outputs, which describe nutrient fate related to reactions.
In our work, the nitrogen and phosphorus outputs of the five models are presented in two separate tables. Nitrogen outputs are subdivided into four groups (Table 5): soil nitrogen, external nitrogen added to the catchment system, transport, and transformation. They comprise a variety of N forms, including organic N, inorganic N, dissolved N, NO₃⁻, NO₂⁻, NH₄⁺, NH₃, and total N. The five models differ significantly in their nitrogen outputs. In the first group, HSPF is the only model that provides soil N as output, in three forms. In the second group, external nitrogen inputs from plant residue and fertilizer are calculated as model outputs only by SWAT and AnnAGNPS, and nitrate from rain is provided as output by SWAT and SWIM. Transport and transformation are the two dominant output groups. As shown in Figure 4, SWAT and HSPF provide the largest numbers of N transport outputs: SWAT provides the user with 28 N outputs covering eight transport processes, and HSPF simulates 23 N outputs related to six processes. The other three models produce fewer outputs: four N outputs for three transport processes in SWIM, and six N outputs for five processes in both GWLF and AnnAGNPS. Although the chemical forms differ, all five models predict N outputs in surface runoff. All models except GWLF simulate N outputs in interflow and percolation flow. SWAT, GWLF and HSPF provide N outputs associated with urban source flow and groundwater flow. SWAT and GWLF model N outputs from septic systems. SWAT and HSPF simulate N outputs in surface water body systems. N outputs for point sources, infiltration and evaporation are each specific to a single model: GWLF models N from point sources, AnnAGNPS simulates N in infiltration, and SWAT predicts N in evaporation.
Regarding N transformation (Figure 4), GWLF has no outputs; that is, GWLF cannot predict N transformation. SWAT calculates 14 N outputs covering nine reactions, and HSPF calculates 17 N outputs related to ten reactions. SWIM and AnnAGNPS cover fewer transformations: the former models six N outputs for four reactions and the latter predicts five outputs for four reactions. Among the reactions, denitrification, decomposition, mineralization/immobilization, and plant uptake are modeled by all models except GWLF. Ammonia volatilization, nitrification, in-stream reactions and settling are simulated by SWAT and HSPF. Fixation, ionization, adsorption/desorption and reservoir system reactions are each considered by a single model: fixation by SWAT; ionization, adsorption/desorption and reservoir system reactions by HSPF.
Table 5.
Nitrogen output.
| Soil Nitrogen | SWAT | SWIM | GWLF | AnnAGNPS | HSPF | |
| Organic N | √ | |||||
| NO3 | √ | |||||
| NH4 | √ | |||||
| Add in | ||||||
| Organic N from residue | √ | √ | ||||
| Nitrogen applied in fertilizer | √ | √ | ||||
| NO3 added to soil profile by rain | √ | √ | ||||
| Transport | ||||||
| Surface Runoff | ||||||
| Total N in sediment | √ | √ | ||||
| Organic N in sediment | √ | √ | √ | √ | ||
| NH4 in sediment | √ | |||||
| NO3 in water | √ | √ | ||||
| NH4 in water | √ | |||||
| Inorganic N in water | √ | |||||
| Dissolved N in water | √ | |||||
| Nitrogen from Urban area by wash off | ||||||
| Total N | √ | √ | √ | |||
| Nitrogen from septic system | ||||||
| Dissolved N | √ | |||||
| Total N | √ | |||||
| Nitrogen from point source | ||||||
| Dissolved N | √ | |||||
| Interflow | ||||||
| NO3 | √ | √ | √ | |||
| NH4 | √ | |||||
| Inorganic N | √ | |||||
| Subsurface drainage flow | ||||||
| Inorganic N | √ | |||||
| Leaching by percolation | ||||||
| NO3 | √ | √ | √ | |||
| NH4 | √ | |||||
| Inorganic N | √ | |||||
| Groundwater Flow | ||||||
| NO3 | √ | √ | ||||
| NH4 | √ | |||||
| Dissolved N | √ | |||||
| Infiltration | ||||||
| Inorganic N | √ | |||||
| From first soil layer to surface by evaporation | ||||||
| NO3 | √ | |||||
| Surface Water Body Systems | ||||||
| Organic N transported with water into reach | √ | |||||
| Organic N transported with water out of reach | √ | |||||
| NO3 transported with water into reach | √ | √ | ||||
| NO3 transported with water out of reach | √ | √ | ||||
| NH4 transported with water into reach | √ | √ | ||||
| NH4 transported with water out of reach | √ | √ | ||||
| NO2 transported with water into reach | √ | √ | ||||
| NO2 transported with water out of reach | √ | √ | ||||
| Concentration of organic N in pond | √ | |||||
| Concentration of NO3 in pond | √ | |||||
| Concentration of organic N in wetland | √ | |||||
| Concentration of NO3 in wetland | √ | |||||
| Organic N transported into reservoir | √ | |||||
| Organic N transported out of reservoir | √ | |||||
| NO3 transported into reservoir | √ | √ | ||||
| NO3 transported out of reservoir | √ | √ | ||||
| NO2 transported into reservoir | √ | √ | ||||
| NO2 transported out of reservoir | √ | √ | ||||
| NH4 transported into reservoir | √ | √ | ||||
| NH4 transported out of reservoir | √ | √ | ||||
| Transformation | ||||||
| Fixation | √ | |||||
| Nitrification | √ | √ | ||||
| Ammonia volatilization | √ | √ | ||||
| Denitrification | √ | √ | √ | √ | ||
| Ionization | √ | |||||
| Mineralization/immobilization | ||||||
| Fresh organic N to mineral N | √ | √ | √ | |||
| Active organic N to mineral N | √ | √ | √ | |||
| N transferred between active organic N and stable organic N | √ | √ | ||||
| NO3 to Organic N | √ | |||||
| NH4 to Organic N | √ | |||||
| Organic N to NH4 | √ | |||||
| Decomposition | √ | √ | √ | √ | ||
| Plant uptake | ||||||
| Inorganic N | √ | √ | √ | |||
| NH4 | √ | |||||
| NO3 | √ | |||||
| In-stream reaction (related to Algae or plankton) | ||||||
| Organic N | √ | √ | ||||
| NH4 | √ | |||||
| NH4 + NH3 | √ | |||||
| NO3 | √ | √ | ||||
| NO2 | √ | √ | ||||
| Adsorption/Desorption | ||||||
| NH4 | √ | |||||
| Nitrogen settling/sinking in ponds, wetland and reservoir | ||||||
| Total N | √ | √ | ||||
| Reservoir system chemical and biochemical transformation | √ | |||||
Figure 4.
Numbers of transport and transformation processes involved in the nutrient outputs: (a) N transport processes; (b) N transformation processes; (c) P transport processes; (d) P transformation processes.

Phosphorus outputs are also classified into four groups (Table 6): soil phosphorus, external phosphorus added to the catchment system, transport, and transformation. They likewise comprise diverse P forms, including organic P, inorganic P, mineral P, labile inorganic P, active inorganic P, stable inorganic P, soluble P, dissolved P, PO₄³⁻, and total P. The P outputs vary considerably among the five models. For the soil group, AnnAGNPS and HSPF have equations that yield outputs in different forms of P. For external phosphorus, SWAT and AnnAGNPS use several equations to output P from fertilizer and residue. As with N outputs, transport and transformation are the major P output groups. Considering transport (Figure 4), SWAT is the most comprehensive model in terms of the number of P outputs, with 19 P outputs for six transport processes. The P outputs of the other four models are as follows: SWIM simulates P in one transport process with outputs in two forms; GWLF calculates P for five transport processes with six outputs; AnnAGNPS predicts P in two transport processes with four outputs; and HSPF models P in six transport processes with 12 outputs. Among the processes, surface runoff is considered by all five models for P prediction. The five models differ in the P forms of their outputs. SWAT, GWLF and HSPF can compute P outputs from urban areas and groundwater flow. SWAT and GWLF provide P outputs from septic systems. SWAT and HSPF simulate P transport due to percolation and in surface water body systems. P from point sources, interflow and infiltration is each predicted by a single model: GWLF considers P from point sources, HSPF predicts P in interflow, and AnnAGNPS models P in infiltration. For transformation (Figure 4), phosphorus transformations are not considered by GWLF.
Among the other four models, SWAT provides nine P outputs for six reactions; SWIM has five P outputs for three reactions; AnnAGNPS models four reactions with five P outputs; and HSPF calculates seven reactions with nine P outputs. All four consider decomposition and mineralization/immobilization. Most of them simulate sorption and plant uptake: SWAT, SWIM and AnnAGNPS simulate sorption; SWAT, AnnAGNPS and HSPF simulate plant uptake of P. SWAT and HSPF predict in-stream reactions and settling. Only HSPF takes adsorption/desorption and reservoir system reactions into consideration.
Comparing Table 5 and Table 6, we can conclude that the models differ not only in the processes and reactions they consider, but also in the nutrient forms they use as output for the same process or reaction.
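The output counts quoted in this section can be collected in a small data structure to compare the models at a glance. The sketch below is only an illustration: the variable names and the idea of summing transport and transformation outputs into a single total are assumptions made here, and the soil and "add in" groups are not included because no counts are quoted for them.

```python
# Counts quoted in Section 3.4 as (number of outputs, number of processes).
# Soil and "add in" outputs are not part of these counts.
N_TRANSPORT = {"SWAT": (28, 8), "SWIM": (4, 3), "GWLF": (6, 5),
               "AnnAGNPS": (6, 5), "HSPF": (23, 6)}
N_TRANSFORMATION = {"SWAT": (14, 9), "SWIM": (6, 4), "GWLF": (0, 0),
                    "AnnAGNPS": (5, 4), "HSPF": (17, 10)}
P_TRANSPORT = {"SWAT": (19, 6), "SWIM": (2, 1), "GWLF": (6, 5),
               "AnnAGNPS": (4, 2), "HSPF": (12, 6)}
P_TRANSFORMATION = {"SWAT": (9, 6), "SWIM": (5, 3), "GWLF": (0, 0),
                    "AnnAGNPS": (5, 4), "HSPF": (9, 7)}

GROUPS = [N_TRANSPORT, N_TRANSFORMATION, P_TRANSPORT, P_TRANSFORMATION]


def total_outputs(model):
    """Total number of N and P transport and transformation outputs."""
    return sum(group[model][0] for group in GROUPS)


totals = {model: total_outputs(model) for model in N_TRANSPORT}
ranking = sorted(totals, key=totals.get, reverse=True)

print(totals)
# {'SWAT': 70, 'SWIM': 17, 'GWLF': 12, 'AnnAGNPS': 20, 'HSPF': 61}
print(ranking)
# ['SWAT', 'HSPF', 'AnnAGNPS', 'SWIM', 'GWLF']
```

The totals are a coarse indicator only; two models with similar totals may still cover different processes and nutrient forms.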
Table 6.
Phosphorus output.
| Soil Phosphorus | SWAT | SWIM | GWLF | AnnAGNPS | HSPF |
| Organic P | √ | ||||
| PO4 | √ | ||||
| Soluble P | √ | ||||
| Labile inorganic P | √ | ||||
| Active inorganic P | √ | ||||
| Stable inorganic P | √ | ||||
| Add in | |||||
| Phosphorus (mineral and organic) applied in fertilizer | √ | √ | |||
| Organic P from residue | √ | √ | |||
| Transport | |||||
| Surface Runoff | |||||
| Total P in sediment | √ | √ | √ | ||
| Organic P in sediment | √ | √ | √ | ||
| Mineral P in sediment | √ | ||||
| Inorganic P in sediment | √ | ||||
| PO4 in sediment | √ | ||||
| Soluble P in water | √ | √ | |||
| Inorganic P in water | √ | ||||
| PO4 in water | √ | ||||
| Dissolved P in water | √ | ||||
| Phosphorus from Urban area by wash off | |||||
| Total P | √ | √ | √ | ||
| Phosphorus from septic system | |||||
| Dissolved P | √ | ||||
| Total P | √ | ||||
| Phosphorus from point source | |||||
| Dissolved P | √ | ||||
| Interflow | |||||
| PO4 | √ | ||||
| Leaching by percolation | |||||
| Soluble P | √ | ||||
| PO4 | √ | ||||
| Groundwater flow | |||||
| Soluble P | √ | ||||
| PO4 | √ | ||||
| Dissolved P | √ | ||||
| Infiltration | |||||
| Inorganic P | √ | ||||
| Surface Water Body Systems | |||||
| Organic P transported with water into reach | √ | ||||
| Organic P transported with water out of reach | √ | ||||
| Mineral P transported with water into reach | √ | ||||
| Mineral P transported with water out of reach | √ | ||||
| Inflow PO4 to reach | √ | ||||
| Outflow PO4 from reach | √ | ||||
| Concentration of organic P in pond | √ | ||||
| Concentration of mineral P in pond | √ | ||||
| Concentration of organic P in wetland | √ | ||||
| Concentration of mineral P in wetland | √ | ||||
| Organic P transported into reservoir | √ | ||||
| Organic P transported out of reservoir | √ | ||||
| Mineral P transported into reservoir | √ | ||||
| Mineral P transported out of reservoir | √ | ||||
| Inflow PO4 to reservoir | √ | ||||
| Outflow PO4 from reservoir | √ | ||||
| Transformation | |||||
| Decomposition | √ | √ | √ | √ | |
| Mineralization/Immobilization | |||||
| Fresh organic to mineral P | √ | √ | √ | ||
| Organic to labile mineral P | √ | √ | √ | ||
| PO4 to Organic P | √ | ||||
| Organic P to PO4 | √ | ||||
| Sorption | |||||
| P transferred between Labile and active mineral P | √ | √ | √ | ||
| P transferred between Active and stable mineral P | √ | √ | |||
| Plant uptake | |||||
| Inorganic P | √ | √ | |||
| PO4 | √ | ||||
| In-stream reaction (related to Algae or plankton) | |||||
| Organic P | √ | √ | |||
| Soluble P | √ | ||||
| PO4 | √ | ||||
| Adsorption/desorption | |||||
| PO4 | √ | ||||
| Phosphorus settling/sinking in ponds, wetland and reservoir | |||||
| Total P | √ | √ | |||
| Reservoir system chemical and biochemical transformation | √ |
3.5. Model Complexity
If model complexity is understood as the capability of a model to describe a given number of physical and chemical processes, the five process-based models considered have different complexities. In the previous sections, the major processes related to nutrient prediction were discussed in terms of basic inputs, nutrient inputs and nutrient outputs. For the representation of processes in the hydrological cycle, the sequence from the most complex to the simplest model is SWAT > SWIM > AnnAGNPS > HSPF > GWLF. Considering nutrient outputs, SWAT is the most complex model, HSPF ranks second, SWIM and AnnAGNPS are simpler models with practically the same complexity, and GWLF is the simplest model. This ranking is justified as follows. Firstly, the models divide the soil column differently. SWAT and SWIM can subdivide the soil column into 10 layers, GWLF and AnnAGNPS divide the soil into only two layers, and HSPF divides it into three layers: surface, upper and lower soil. A series of soil input data is required for each layer; if all layers are implemented, SWAT and SWIM become the most expensive models in terms of input data required. Secondly, as all five models are based on the water balance, they are almost the same in most hydrological/hydrogeological components, yet differences in complexity remain. Precipitation, surface runoff, evapotranspiration, percolation, and infiltration are simulated in all five models. Groundwater flow/base flow is computed in four models (all except AnnAGNPS). Interflow is modeled in four models (all except GWLF). Interception is simulated only in SWAT and HSPF; thus, the contribution of evaporation from canopy storage to total evapotranspiration is not considered in AnnAGNPS, SWIM and GWLF. Revap flow is implemented by SWAT and SWIM. Tile drainage flow and irrigation are considered in SWAT and AnnAGNPS.
As AnnAGNPS was developed specifically for agricultural catchments, tile drainage and irrigation are essential components of its hydrological system, whereas in SWAT they are computed optionally as management modules. Similarly, pumping and bypass flow can be simulated only in SWAT, if desired. In addition, SWAT can simulate the hydrological cycle of various surface water bodies, including in-stream, pond, wetland, reservoir, and pothole systems. HSPF can model in-stream systems and reservoirs. AnnAGNPS and SWIM consider only in-stream systems, while GWLF has a very elementary representation of water dynamics in the catchment. Thirdly, the computation of reactive transport is closely related to the hydrological and transformation processes considered. According to the nutrient outputs of the five models provided in Table 5 and Table 6, we can identify four cases:
- (1) Not all nutrient sources are considered by all five models; e.g., nutrients from septic systems can be simulated by SWAT and GWLF, but are not considered by the others.
- (2) Although several models contain the same hydrological process, only some of them predict reactive transport in a given hydrological compartment; e.g., groundwater flow/base flow is simulated by SWAT, SWIM, GWLF and HSPF, but only SWAT and HSPF predict nutrient transport in this compartment.
- (3) Not all possible transformation reactions are modeled by all the models; e.g., nitrification is simulated only by SWAT and HSPF.
- (4) Some models contain the same process or reaction and all can simulate the nutrient fate of that process or reaction, but the nutrient forms may differ; e.g., four models compute N in interflow, but the form is NO3 for SWAT and SWIM, inorganic N for AnnAGNPS, and NO3 and NH4 for HSPF.
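The hydrological components listed in this section can be encoded as a presence matrix, which makes it easy to ask which models implement a given combination of processes. The following is a minimal sketch based only on the components named in the text; the names `COMPONENTS`, `models_with` and `components_of` are hypothetical. The component count alone is not meant to reproduce the complexity ranking, which also reflects soil discretization and the surface water bodies represented.

```python
MODELS = ["SWAT", "SWIM", "GWLF", "AnnAGNPS", "HSPF"]
ALL = set(MODELS)

# Hydrological components and the models that simulate them (Section 3.5).
COMPONENTS = {
    "precipitation": ALL,
    "surface runoff": ALL,
    "evapotranspiration": ALL,
    "percolation": ALL,
    "infiltration": ALL,
    "groundwater flow": ALL - {"AnnAGNPS"},
    "interflow": ALL - {"GWLF"},
    "interception": {"SWAT", "HSPF"},
    "revap flow": {"SWAT", "SWIM"},
    "tile drainage": {"SWAT", "AnnAGNPS"},
    "irrigation": {"SWAT", "AnnAGNPS"},
    "pumping": {"SWAT"},
    "bypass flow": {"SWAT"},
}


def models_with(*components):
    """Models that simulate every one of the requested components."""
    result = set(MODELS)
    for component in components:
        result &= COMPONENTS[component]
    return sorted(result)


def components_of(model):
    """Components simulated by a given model."""
    return sorted(c for c, models in COMPONENTS.items() if model in models)


print(models_with("groundwater flow", "interflow"))
# ['HSPF', 'SWAT', 'SWIM']
print(models_with("interception", "tile drainage"))
# ['SWAT']
print(len(components_of("GWLF")))
# 6
```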
4. Set up a Model Selection Protocol
4.1. A Model Selection Protocol
As detailed below, we propose a model selection protocol (Figure 5) that considers all major factors, including data availability, user objectives and cost/benefit analysis. The strategy can be briefly summarized as follows:
- (1) Define the objectives of the user.
- (2) List the models at hand and the models that can be made available.
- (3) Select the potential models that are capable of addressing the user’s general objectives.
- (4) List the corresponding outputs and input requirements of the potential models.
- (5) Investigate data availability.
- (6) Specify the objectives in more detail, considering the potential outputs.
- (7) Identify the missing data by comparing the input requirements with the available data.
- (8) Analyze the time and financial cost of additional data measurements.
- (9) Analyze the potential benefits, in terms of model predictions and outputs, associated with the use of additional data and possibly a more complex model.
- (10) Select the model by evaluating the benefits and costs.
Figure 5.
Scheme of the model selection protocol.
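The core of the protocol (steps 3 and 7 to 10) can be sketched in code. The example below is a simplified illustration under strong assumptions and not the authors' implementation: the `Candidate` class, the function `select_model`, the two fictitious candidates, the equal cost per missing data item, and the expression of benefit and cost in the same arbitrary units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    outputs: frozenset  # outputs the model can produce
    inputs: frozenset   # inputs the model requires
    benefit: float      # user-assigned benefit score (step 9)


def select_model(candidates, objectives, available, cost_per_item=1.0):
    """Return (name of best candidate or None, per-model report)."""
    # Step 3: keep models whose outputs cover the user's objectives.
    potential = [c for c in candidates if objectives <= c.outputs]
    report = {}
    for c in potential:
        missing = c.inputs - available        # step 7: missing data
        cost = cost_per_item * len(missing)   # step 8: equal cost per item
        report[c.name] = {
            "missing": sorted(missing),
            "cost": cost,
            "net": c.benefit - cost,          # steps 9 and 10
        }
    if not report:
        return None, report
    best = max(report, key=lambda name: report[name]["net"])
    return best, report


# Two fictitious candidates, used only to exercise the function.
candidates = [
    Candidate("simple", frozenset({"NO3 in runoff"}),
              frozenset({"rainfall", "land use"}), benefit=2.0),
    Candidate("complex", frozenset({"NO3 in runoff", "NO3 leaching"}),
              frozenset({"rainfall", "land use", "soil layers",
                         "crop management"}), benefit=5.0),
]
objectives = {"NO3 in runoff"}
available = {"rainfall", "land use"}

best, report = select_model(candidates, objectives, available)
print(best)                          # complex
print(report["complex"]["missing"])  # ['crop management', 'soil layers']

# With more expensive measurements, the simpler model is preferred.
print(select_model(candidates, objectives, available, cost_per_item=2.0)[0])
# simple
```

In practice, steps 6, 8 and 9 involve expert judgment that a single score cannot capture; the sketch only shows how the comparison of required and available data feeds the final decision.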

4.2. Objectives of the User
As the name implies, the objectives of the user are what the user wants to achieve by using the model. They are a crucial factor in model selection, since they determine the major functionalities the model must contain. For modeling nutrient fate and transport, the core objective is clear: the set of models to be considered is composed of those with nutrient simulation functionalities. However, the specific objectives of different users can differ in four respects: (1) nutrient transport and nutrient transformation are two independent parts, and some users may consider only the first while others may want to simulate both; (2) nutrient transport involves numerous processes, and users may be interested in focusing on a specific one; (3) nutrient transformation consists of many reactions, and users may wish to model only selected pathways; (4) there are various forms of nutrients, and users choose among them based on their objectives. These differences in specific objectives strongly affect the level of model complexity chosen. Different model complexity, in turn, leads to distinct input requirements, which draws attention to the core problem in model selection: data availability.
4.3. Input Data Availability
Data availability in a study catchment is often the principal model selection criterion and the main concern of model users at an early stage, before any subsequent modeling operations are carried out. Characterization of the study area is costly and time consuming, and it is often performed independently of the modeling activities, which are generally foreseen at a later stage. Available data may derive from heterogeneous sources, such as local gauging stations, existing databases, governmental environmental or water authorities, or the scientific literature. For nutrient modeling, input data limitations are generally specific to each case study. Usually, missing data relate to deep soil data, crop and management information, groundwater system data, and nutrient data.
Because of their geological features, deep soil and groundwater systems are not easy to access, and their nutrient load is often difficult to quantify properly. Although nutrient data are recorded in some catchments, they are limited in both nutrient forms and nutrient sources. Without instruments or measurement records, which require substantial investment in a monitoring network of sensors, time series with high spatio-temporal resolution are seldom available for these data. In ungauged catchments, data scarcity is even more severe. However, model functionalities can operate optimally only when all required field-specific input data are provided. When only limited input data are available or some essential data are missing, extra measures should be taken to find surrogate data. This can be achieved by collecting additional measurements, which increases both the time required to obtain model outputs and the costs of the investigation. Thus, data availability is a dominant criterion and an important precondition for model selection. When selecting a model, the required input data (Table 2, Table 3 and Table 4) should be checked carefully to make sure that they are available. With a clear comparison between the input requirements and the available data, the user can make a first judgment on which model is the easiest to set up and which data should be measured if a more complex and data-demanding model has to be selected.
To provide a general approach towards the issue of model selection, in the present study, each input parameter has the same importance in the protocol. Depending on a specific case study and on the experience of the user, different weights could be assigned to different parameters.
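The comparison between required and available inputs, with optional weights, can be expressed as a simple score. The sketch below is one possible formulation and not a formula from this study: the function `data_gap`, the input names and the weight values are hypothetical. With no weights supplied, every input counts equally, as assumed in the protocol.

```python
def data_gap(required, available, weights=None):
    """Weighted fraction of required inputs that are not available.

    Returns 0.0 when nothing is missing and 1.0 when everything is.
    Inputs without an explicit weight get the default weight 1.0.
    """
    weights = weights or {}
    total = sum(weights.get(item, 1.0) for item in required)
    missing = sum(weights.get(item, 1.0)
                  for item in required if item not in available)
    return missing / total if total else 0.0


# Hypothetical input requirements and data inventory.
required = ["rainfall", "temperature", "soil nitrate", "groundwater nitrate"]
available = {"rainfall", "temperature", "soil nitrate"}

# Equal importance for every input, as in the present protocol.
print(data_gap(required, available))  # 0.25

# A user who considers groundwater nitrate three times as important.
print(data_gap(required, available, {"groundwater nitrate": 3.0}))  # 0.5
```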
4.4. Model Complexity
The complexity of a model is frequently associated with its functionality. Model functionality is reflected in the quality and quantity of model outputs and in which and how many processes are implemented in the model. Complex models often appear to be an optimal choice because of the detailed process description they entail. As a trade-off, however, they are extremely data-demanding if reliable model results and predictions are to be achieved. Complex models can simulate diverse processes with various outputs, but each computation requires a given amount of input data. In general, the more functionalities a model implements, the more input data are needed. In other words, full-featured and powerful functionalities can be realized only when abundant input data are available. Under limited data availability, complex models are difficult or impossible to operate as expected and tend to generate higher uncertainty. In such cases, a simpler model could be a better choice. A simple model may implement the same functionalities as a complex model, although it may neglect some processes that are of secondary importance in a specific case study. Increasing model complexity requires that all the corresponding input requirements be fulfilled.
In model selection, we suggest considering model complexity as a decision criterion that depends on the users’ objectives, on data availability and on a cost-benefit analysis [37]. Irrelevant functionalities should never be a reason for choosing a model. Moreover, model complexity is an essential model attribute that cannot be changed by the user; the user can, however, choose a model with an adequate complexity level.
4.5. Cost and Benefit Analysis
When data availability is limited, it is worth considering additional measurement campaigns, in view of the aforementioned time and financial constraints. A cost and benefit analysis is advisable [25,26], since it may also influence the users’ objectives and the choice of a model of higher complexity. For all users, cost is evaluated as the investment of time and money needed to close the gap between available data and required inputs. The rule for evaluating benefit varies from user to user according to the objective. Because these rules differ, model users are classified into three categories: scientists, stakeholders, and water authorities. Cost and benefit are discussed in detail for each category in the following examples.
4.6. Model Selection Protocol: Some Applications
In each project, scientists have fixed and explicit research objects, e.g., specific nutrient forms, particular transport processes or certain reactions. For them, benefit is evaluated according to the model’s ability to support the investigation of particular research questions. The model selection protocol suggests first grouping the models that can address the research objects and then estimating the cost of acquiring the missing data for each model, considering the limits of research funding. Let us illustrate this situation with the five models presented above. A researcher aims to study nitrogen transport from different sources to a reach, in order to perform further eco-hydrological analyses. Of the five models, only GWLF and SWAT have the corresponding functions. GWLF requires 15 basic input data to set up the hydrological model and eight N inputs for nutrient transport simulations. SWAT requires 39 basic input data and 13 N input data. The corresponding N outputs of the two models are similar. SWAT is clearly more data demanding than GWLF. However, from a researcher’s point of view, the processes described by GWLF could be too simplified. GWLF simply predicts the amount of nitrogen transported from each source and does not describe other processes; for example, it ignores the nitrogen flux interactions inside the soil profile between surface water and groundwater. Owing to its more complete representation of the hydrological cycle, SWAT can describe a larger number of processes in more detail, which benefits the investigation of the mechanisms of N transport from different sources. Thus, despite its higher cost due to the additional data required, SWAT provides benefits for a researcher that can justify the investment in collecting the missing information. If affordable within the available funding, SWAT is chosen by the researcher.
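The arithmetic behind this example can be written out explicitly. In the sketch below, the input counts are those quoted in the text, while the function names and the budget expressed as a number of affordable input items are hypothetical simplifications; a real budget would be expressed in time and money for the data actually missing.

```python
# Input requirements quoted in the text: (basic inputs, N inputs).
REQUIRED = {"GWLF": (15, 8), "SWAT": (39, 13)}


def total_inputs(model):
    """Total number of input items a model requires in this example."""
    basic, nitrogen = REQUIRED[model]
    return basic + nitrogen


def researcher_choice(affordable_inputs):
    """Prefer SWAT when affordable, otherwise GWLF, otherwise no model.

    `affordable_inputs` is a hypothetical budget expressed as the number
    of input items the research funding can cover.
    """
    if total_inputs("SWAT") <= affordable_inputs:
        return "SWAT"
    if total_inputs("GWLF") <= affordable_inputs:
        return "GWLF"
    return None


print(total_inputs("GWLF"), total_inputs("SWAT"))  # 23 52
print(total_inputs("SWAT") - total_inputs("GWLF"))  # 29
print(researcher_choice(60))  # SWAT
print(researcher_choice(30))  # GWLF
```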
For stakeholders, the first priority is usually profit. Among multiple models, the one that costs the least to obtain all the required input data is ordinarily chosen for a local application. A more data-demanding model is justified only if its outputs can provide a significant increase in the stakeholder’s profits. For example, let us assume that the goal of a stakeholder is to control nitrate leaching through soil nitrate remediation. A modeling approach is required to predict the local nitrate leaching by percolation, which can provide insights into the remediation strategy to be adopted. SWAT, SWIM, AnnAGNPS, and HSPF are potential models. To run this functionality, SWAT, SWIM, AnnAGNPS, and HSPF require 39, 34, 27 and 24 basic inputs, respectively, to represent the hydrological processes, and they need the same nitrogen input: soil nitrate. Intuitively, HSPF is the best choice because it has the lowest input requirements. However, taking into account the final goal of remediation, HSPF can be replaced by SWIM or SWAT. Indeed, HSPF represents the soil with three layers, while SWIM and SWAT can divide the soil column into ten layers. A more detailed investigation of nitrate leaching in each layer can help to locate the crucial layers for nitrate remediation, and the workload and investment of the remediation work can be greatly reduced by focusing on a specific soil layer. Compared with HSPF, SWIM and SWAT cost more to fulfill the input requirements, but in the long run they may help to reduce the remediation costs. In this way, SWIM and SWAT could be more suitable choices. Since SWIM requires fewer input data than SWAT, it is preferred in the final selection.
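The stakeholder's reasoning can be reproduced as a two-stage choice: first restrict the candidates by the soil resolution needed, then take the least data-demanding model. The sketch below uses the input counts from this example and the soil layer numbers from Section 3.5; the function `cheapest` and the `min_layers` threshold are hypothetical.

```python
# Basic inputs for the leaching functionality and maximum number of
# soil layers, as quoted in the text.
CANDIDATES = {
    "SWAT": {"basic_inputs": 39, "soil_layers": 10},
    "SWIM": {"basic_inputs": 34, "soil_layers": 10},
    "AnnAGNPS": {"basic_inputs": 27, "soil_layers": 2},
    "HSPF": {"basic_inputs": 24, "soil_layers": 3},
}


def cheapest(min_layers=1):
    """Least data-demanding model with at least `min_layers` soil layers."""
    eligible = {name: info for name, info in CANDIDATES.items()
                if info["soil_layers"] >= min_layers}
    return min(eligible, key=lambda name: eligible[name]["basic_inputs"])


# Input requirements alone favor HSPF.
print(cheapest())               # HSPF

# Requiring a ten-layer soil profile for targeted remediation favors SWIM.
print(cheapest(min_layers=10))  # SWIM
```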
Water authorities frequently work on large projects, which involve numerous modelers from different departments of multiple districts and require close cooperation with, and guidance from, policy makers. Aiming at overall societal, environmental and economic planning, policy makers take numerous aspects into account in their requirements. Regarding benefit, they evaluate the whole project from a long-term perspective. For instance, a policy maker wants to carry out a comprehensive investigation of nitrogen pollution in a large-scale rural catchment, in order to plan future economic activities in the region of interest. In this catchment, there is no reservoir, wetland, pond, or pothole. A modeling approach is applied to simulate nitrogen dynamics. As it is a comprehensive study, both nitrogen transport and transformation are considered. The catchment is divided into several sub-catchments. Modelers from multiple departments participate in this work, and each is responsible for modeling one sub-catchment. A model should be selected to fit this practical work. According to the nitrogen outputs (Table 5), GWLF is excluded since it does not model N transformation processes. SWAT, SWIM, AnnAGNPS, and HSPF are capable of predicting both N transport and N transformation. SWIM and AnnAGNPS are not selected, because they simulate fewer N transformation reactions (four each) than the other two models. Moreover, they are not able to predict N transport in groundwater flow, which may affect the accuracy of prediction, particularly if groundwater is abundant. Therefore, SWAT and HSPF are the two options considered. Both have competitive features for predicting N transport and N transformation. Concerning N transport modeling, SWAT and HSPF can simulate N transport with the same four hydrological processes.
Additionally, SWAT is able to model N transport from the first soil layer to the surface by evaporation, while HSPF does not have this feature. With respect to N transformation, SWAT and HSPF can both simulate seven reactions: nitrification, denitrification, ammonia volatilization, mineralization/immobilization, decomposition, plant uptake and in-stream reactions. As distinctive functions, HSPF can simulate ionization and ammonium adsorption/desorption, and SWAT can model N fixation. Considering N outputs, SWAT and HSPF are comparable. There is no significant difference between the two models in terms of nutrient inputs. Considering the basic inputs required by the relevant functionalities, SWAT needs 44 input data while HSPF needs 25, which means that SWAT is more data demanding than HSPF. Based on cost, it seems better to choose HSPF. In terms of the complexity of the hydrological processes represented, HSPF represents most nutrient dynamics in a relatively simple way because of its simple hydrological basis. SWAT simulates much more complex hydrological systems than HSPF; given the heterogeneous community of modelers involved, with potentially different backgrounds and experience, this could be a problem. A more complex representation of the processes can, in this case, delay the achievement of the model results. In view of the whole project, HSPF is a better choice than SWAT for three reasons:
- (1) HSPF is complex enough to solve and further explain the problems of this practical case;
- (2) HSPF’s simpler representation of hydrological processes is easier for the modelers to handle;
- (3) An easier understanding of the model will lead to a faster achievement of the project objective.
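The screening logic of this scenario can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the protocol itself. The `Candidate` fields, the function names and the threshold `min_reactions=5` are hypothetical choices made for the example. The numeric values are those stated in the scenario text, and `None` marks values that the text does not state. The qualitative argument about the modelers' heterogeneous backgrounds is not captured; the number of basic inputs is used here as a rough proxy for cost.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    n_reactions: int               # N transformation reactions (shared count for SWAT/HSPF)
    groundwater_n: Optional[bool]  # N transport in groundwater flow; None = not stated
    basic_inputs: Optional[int]    # basic input data items; None = not stated


# Values as given in the scenario text. SWAT and HSPF each offer further
# distinctive functions beyond the seven shared reactions counted here.
CANDIDATES = [
    Candidate("GWLF", 0, None, None),
    Candidate("SWIM", 4, False, None),
    Candidate("AnnAGNPS", 4, False, None),
    Candidate("SWAT", 7, True, 44),
    Candidate("HSPF", 7, True, 25),
]


def screen(candidates: List[Candidate], min_reactions: int,
           need_groundwater: bool) -> List[Candidate]:
    """Keep only candidates that meet the capability requirements."""
    kept = []
    for c in candidates:
        if c.n_reactions < min_reactions:
            continue  # too few N transformation reactions
        if need_groundwater and c.groundwater_n is not True:
            continue  # groundwater N transport missing or not stated
        kept.append(c)
    return kept


def rank_by_input_demand(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from least to most data demanding (unknowns last)."""
    return sorted(candidates,
                  key=lambda c: (c.basic_inputs is None, c.basic_inputs or 0))


# Hypothetical threshold: any value above four excludes SWIM and AnnAGNPS.
shortlist = rank_by_input_demand(
    screen(CANDIDATES, min_reactions=5, need_groundwater=True))
print([c.name for c in shortlist])
```

With these assumptions the shortlist contains HSPF followed by SWAT, which mirrors the order reached in the text. The final decision in the scenario additionally rests on the three reasons listed above.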
5. Conclusions
This model selection protocol is designed especially to help inexperienced users choose among process-based models with nutrient simulation functionalities. It is formulated with an emphasis on practical application and is based on an objective method that involves neither model comparison nor “good/bad” judgments. Model candidates are presented in terms of their inputs, outputs and complexity. More experienced users may also consider how a single physical process is described in the specific models as an additional selection criterion. However, such analysis goes beyond the scope of the present work. Unlike previous works [22,23], our model selection protocol particularly emphasizes the compromise that has to be reached between model complexity and input requirements under limited data availability. Furthermore, we underline that the selection of the most suitable model is also driven by other factors, such as the user’s objectives and an accurate cost-benefit analysis, which could justify integrative sampling campaigns for the characterization of the domain.
For a specific modeling issue, data availability and the user’s objective are both crucial criteria, which significantly affect model selection. Model input requirements rely on data availability. Model outputs are closely related to the user’s objectives, and some potential objectives are derived according to the model outputs. A cost and benefit analysis connects the model, field data and user together to build up a network of multi-criteria, which provides new insight into model selection work by considering practical data availability in combination with potential objectives. This protocol will be an advisable method for model selection in advance of practical application. It is an applicable multi-criteria method that weighs a model through an overall balance between model characteristics, data availability and user’s potential goals.
Although this multi-criteria idea of model selection is particularly designed for nutrient simulation work on the basis of five process-based models, it can be applied to other model selection issues under the following conditions: (1) limited data availability; (2) practical modeling application work; (3) there are many potential models with functionality that is fit for the objectives; and (4) these model “candidates” have a similar mechanism basis and are of different complexities.
Acknowledgments
This work is supported by the German Research Foundation (DFG) and the Technische Universität München within the funding programme Open Access Publishing. The first author gratefully acknowledges the China Scholarship Council (CSC) for its financial support.
Author Contributions
Ye Tuo was mainly responsible for the literature work, model survey and analysis, and paper writing, and he is the principal author who contributed the most to the manuscript. Gabriele Chiogna and Markus Disse conceived and designed the overall concept and work plan for the research; they were also responsible for consultations on model analysis, paper writing and manuscript modification.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Lindim, C.; Pinho, J.L.; Vieira, J.M.P. Analysis of spatial and temporal patterns in a large reservoir using water quality and hydrodynamic modeling. Ecol. Model. 2011, 222, 2485–2494.
- Krysanova, V.; Haberlandt, U. Assessment of nitrogen leaching from arable land in large river basins: Part I. Simulation experiments using a process-based model. Ecol. Model. 2002, 150, 255–275.
- Huang, S.; Hesse, C.; Krysanova, V.; Hattermann, F. From meso- to macro-scale dynamic water quality modelling for the assessment of land use change scenarios. Ecol. Model. 2009, 220, 2543–2558.
- Niraula, R.; Kalin, L.; Srivastava, P.; Anderson, C.J. Identifying critical source areas of nonpoint source pollution with SWAT and GWLF. Ecol. Model. 2013, 268, 123–133.
- Xu, Y. Transport and retention of nitrogen, phosphorus and carbon in North America’s largest river swamp basin, the Atchafalaya river basin. Water 2013, 5, 379–393.
- Pease, L.M.; Oduor, P.; Padmanabhan, G. Estimating sediment, nitrogen, and phosphorous loads from the Pipestem Creek watershed, North Dakota, using AnnAGNPS. Comput. Geosci. 2010, 36, 282–291.
- Hesse, C.; Krysanova, V.; Päzolt, J.; Hattermann, F.F. Eco-Hydrological modelling in a highly regulated lowland catchment to find measures for improving water quality. Ecol. Model. 2008, 218, 135–148.
- Mayzelle, M.; Viers, J.; Medellín-Azuara, J.; Harter, T. Economic feasibility of irrigated agricultural land use buffers to reduce groundwater nitrate in rural drinking water sources. Water 2014, 7, 12–37.
- Shen, Z.; Liao, Q.; Hong, Q.; Gong, Y. An overview of research on agricultural non-point source pollution modelling in China. Sep. Purif. Technol. 2012, 84, 104–111.
- Hunter, H.M.; Walton, R.S. Land-Use effects on fluxes of suspended sediment, nitrogen and phosphorus from a river catchment of the Great Barrier Reef, Australia. J. Hydrol. 2008, 356, 131–146.
- Küstermann, B.; Christen, O.; Hülsbergen, K.J. Modelling nitrogen cycles of farming systems as basis of site- and farm-specific nitrogen management. Agric. Ecosyst. Environ. 2010, 135, 70–80.
- Panagopoulos, Y.; Makropoulos, C.; Baltas, E.; Mimikou, M. SWAT parameterization for the identification of critical diffuse pollution source areas under data limitations. Ecol. Model. 2011, 222, 3500–3512.
- Chen, Y.; Shuai, J.; Zhang, Z.; Shi, P.; Tao, F. Simulating the impact of watershed management for surface water quality protection: A case study on reducing inorganic nitrogen load at a watershed scale. Ecol. Eng. 2014, 62, 61–70.
- Shen, Z.; Chen, L.; Hong, Q.; Qiu, J.; Xie, H.; Liu, R. Assessment of nitrogen and phosphorus loads and causal factors from different land use and soil types in the Three Gorges Reservoir Area. Sci. Total Environ. 2013, 454–455, 383–392.
- Volf, G.; Atanasova, N.; Kompare, B.; Ožanić, N. Modeling nutrient loads to the northern Adriatic. J. Hydrol. 2013, 504, 182–193.
- Nasr, A.; Bruen, M.; Jordan, P.; Moles, R.; Kiely, G.; Byrne, P. A comparison of SWAT, HSPF and SHETRAN/GOPC for modelling phosphorus export from three catchments in Ireland. Water Res. 2007, 41, 1065–1073.
- Thorsen, M.; Refsgaard, J.C.; Hansen, S.; Pebesma, E.; Jensen, J.B.; Kleeschulte, S. Assessment of uncertainty in simulation of nitrate leaching to aquifers at catchment scale. J. Hydrol. 2001, 242, 210–227.
- Ranatunga, K.; Nation, E.R.; Barratt, D.G. Review of soil water models and their applications in Australia. Environ. Model. Softw. 2008, 23, 1182–1206.
- Robson, B.J. State of the art in modelling of phosphorus in aquatic systems: Review, criticisms and commentary. Environ. Model. Softw. 2014, 61, 339–359.
- CRC Catchment Hydrology. General Approach to Modelling and Practical Issues to Model Choice; CRC Catchment Hydrology: Melbourne, Australia, 2000.
- Xie, H.; Lian, Y. Uncertainty-Based evaluation and comparison of SWAT and HSPF applications to the Illinois River Basin. J. Hydrol. 2013, 481, 119–131.
- Saloranta, T.M.; Kämäri, J.; Rekolainen, S.; Malve, O. Benchmark criteria: A tool for selecting appropriate models in the field of water management. Environ. Manag. 2003, 32, 322–333.
- Boorman, D.B.; Williams, R.J.; Hutchins, M.G.; Penning, E.; Groot, S.; Icke, J. A model selection protocol to support the use of models for water management. Hydrol. Earth Syst. Sci. 2007, 11, 634–646.
- Beven, K.J. Rainfall-Runoff Modelling: The Primer, 2nd ed.; John Wiley & Sons: Chichester, UK, 2012; pp. 16–18.
- Galioto, F.; Marconi, V.; Raggi, M.; Viaggi, D. An assessment of disproportionate costs in WFD: The experience of Emilia-Romagna. Water 2013, 5, 1967–1995.
- Rahman, M.; Rusteberg, B.; Uddin, M.; Saada, M.; Rabi, A.; Sauter, M. Impact assessment and multicriteria decision analysis of alternative managed aquifer recharge strategies based on treated wastewater in northern Gaza. Water 2014, 6, 3807–3827.
- Neitsch, S.L.; Arnold, J.G.; Kiniry, J.R.; Williams, J.R. Soil and Water Assessment Tool Theoretical Documentation Version 2009; Texas Water Resources Institute: College Station, TX, USA, 2011; pp. 1–618.
- Arnold, J.G.; Srinivasan, R.; Muttiah, R.S.; Williams, J.R. Large area hydrologic modeling and assessment part I: model development. J. Am. Water Resour. Assoc. 1998, 34, 73–89.
- Shen, Z.; Zhong, Y.; Huang, Q.; Chen, L. Identifying non-point source priority management areas in watersheds with multiple functional zones. Water Res. 2015, 68, 563–571.
- Krysanova, V.; Wechsung, F.; Arnold, J.; Srinivasan, R.; Williams, J. PIK Report No. 69 “SWIM (Soil and Water Integrated Model), User Manual”; Potsdam Institute for Climate Impact Research: Potsdam, Germany, 2000; pp. 1–239.
- Krysanova, V.; Meiner, A.; Roosaare, J.; Vasilyev, A. Simulation modelling of the coastal waters pollution from agricultural watershed. Ecol. Model. 1989, 49, 7–29.
- Haith, D.A.; Mandel, R.; Wu, R.S. GWLF—Generalized Watershed Loading Functions. Version 2.0. User’s Manual; Department of Agricultural & Biological Engineering, Cornell University: Ithaca, NY, USA, 1992; pp. 1–63.
- Bingner, R.L.; Theurer, F.D.; Yuan, Y.P. AnnAGNPS Technical Processes Documentation, Version 5.2; U.S. Department of Agriculture: Washington, DC, USA, 2011; pp. 1–143.
- Young, R.A.; Onstad, C.A.; Bosch, D.D.; Anderson, W.P. AGNPS: A non-point source pollution model for evaluating agricultural watersheds. J. Soil Water Conserv. 1989, 44, 168–173.
- Bicknell, B.R.; Imhoff, J.C.; Kittle, J.L., Jr.; Donigian, A.S., Jr.; Johanson, R.C. Hydrological Simulation Program—FORTRAN: User’s Manual for version 11; U.S. Environmental Protection Agency, National Exposure Research Laboratory: Athens, GA, USA, 1997; pp. 1–755.
- Jeon, J.H.; Yoon, C.G.; Donigian, A.S., Jr.; Jung, K.W. Development of the HSPF-Paddy model to estimate watershed pollutant loads in paddy farming regions. Agric. Water Manag. 2007, 90, 75–86.
- Wrede, S.; Seibert, J.; Uhlenbrook, S. Distributed conceptual modelling in a Swedish lowland catchment: A multi-criteria model assessment. Hydrol. Res. 2013, 44, 318–333.
© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).